Logic of these bitwise operations

I've wanted to get into the world of emulators for a while. I decided to stop trying to write an emulator for a complex system and start with the basics: a CHIP-8 emulator, which is what many people recommend in emulation forums. Let's go by parts:

The first operation whose logic I don't see:

std::uint16_t opcode = m_Memory[reg.PC] << 8 | m_Memory[reg.PC + 1];

Basically, one CHIP-8 opcode is 2 bytes, but the ROM is stored as 8-bit values. First I access the array of std::uint8_t that I call m_Memory, which I use to store the ROM and the font set, at the position of the program counter, which starts at 0x200, where most CHIP-8 programs/games start. Then 8 zeros are added, which is easy to understand: 1 byte = 8 bits, so 2 bytes are 16 bits. But then the confusion begins: if you have already obtained the opcode, why mask a 16-bit value with an 8-bit one? And why use the ROM itself again, but advancing the PC position?

Here we go to the second part of my problem:

switch (opcode & 0xF000) {
   ...
}

In a discussion I started on a Reddit forum about emulators, people told me they mask the opcode with 0xF000 to get the actual opcode, but what I didn't understand is how they concluded that they should mask it, and why with that value.

The final part:

I use this documentation, which I and many others follow as a guide. First, let's go to opcode 0x6000, or 6XKK, or LD Vx, byte:

//LD Vx, byte
case 0x6000:
    reg.Vx[(opcode & 0x0F00) >> 8] = (opcode & 0x00FF);
    reg.PC += 2;
    std::cout << "OPCODE LD Vx, byte executed." << std::endl;
    break; 

CHIP-8 has 16 8-bit registers, which I called Vx. Let's look at:

reg.Vx[(opcode & 0x0F00) >> 8]

First I converted opcode 0x6000 to binary and performed the AND operation:

0110 0000 0000 0000    //0x6000
0000 1111 0000 0000    //0x0F00
-------------------
0000 0000 0000 0000    //0x0

Then >> 8 shifts the result 8 bits to the right, which gives 0000 0000, that is, index 0 of Vx. Then comes = (opcode & 0x00FF), that is:

0110 0000 0000 0000    //0x6000
0000 0000 1111 1111    //0x00FF
-------------------
0000 0000 0000 0000    //0x0

So why not just do reg.Vx[0] = 0; ?

Keep in mind that I've never had to do bitwise operations on any project before; I only know what the books told me about the AND, OR, XOR, NOT operations, etc.

I'd like to understand the logic people use here so that I can apply it in future projects.

Author: Samuel Ives, 2018-06-18

1 answer

Some of the things you're not understanding seem to come from not realizing that some values are a "family" of opcodes, with the parameters for the same opcode all encoded in the 16-bit value, rather than a single fixed value. In the last example, with opcode 0x6000, you did the whole simulation as if the value were always exactly 0x6000. However, see the documentation:

6xkk - LD Vx, byte
Set Vx = kk.

The interpreter puts the value kk into register Vx.

That is, the first "nibble" (the first 4 bits) of the opcode contains the hexadecimal digit "6". The remaining 3 hexadecimal digits are the opcode's arguments. So yes, "0x6000" will always be "set V0 = 0x00", but the opcode 0x62FF means "set V2 = 0xFF". The role of your interpreter/emulator is precisely to detect that opcode 6 means putting a value in a register, extract those values, and execute the operation.

See how this already answers your second question: when you do the switch-case on the opcode masked with 0xF000, only the value "0x6000" is used for the case comparison, but inside the case code you need the full opcode, because the parameters are in its other digits.
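The field layout described above can be sketched as a small decoder. This is only an illustration: the `Decoded` struct and the `decode` function are hypothetical names that do not appear in the question's code, and the sketch handles only the 6xkk layout.

```cpp
#include <cstdint>

// Hypothetical helper types for illustration; not part of the original emulator.
struct Decoded {
    std::uint8_t family;  // top nibble, bits 15..12 (the "6" in 6xkk)
    std::uint8_t x;       // register index, bits 11..8
    std::uint8_t kk;      // immediate byte, bits 7..0
};

// Mask keeps only the bits of interest; the shift moves them down to bit 0.
constexpr Decoded decode(std::uint16_t opcode) {
    return Decoded{
        static_cast<std::uint8_t>((opcode & 0xF000) >> 12),
        static_cast<std::uint8_t>((opcode & 0x0F00) >> 8),
        static_cast<std::uint8_t>(opcode & 0x00FF),
    };
}

// 0x62FF: family 6, register V2, value 0xFF.
static_assert(decode(0x62FF).family == 0x6, "family nibble");
static_assert(decode(0x62FF).x == 0x2, "register index");
static_assert(decode(0x62FF).kk == 0xFF, "immediate byte");
```

With 0x6000 every argument field happens to be zero, which is why the hand simulation in the question always ended up at V0 = 0.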

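std::uint16_t opcode = 0x62FF;
switch (opcode & 0xF000) {   // keep only the top nibble to pick the family
   ...
   case 0x6000: {
       // bits 11..8 select the register, bits 7..0 hold the value
       auto register_number = (opcode & 0x0F00) >> 8;
       auto value = opcode & 0x00FF;
       registers[register_number] = value;
       break;
   }
   ...
}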

Note in the documentation that not all opcodes are fully determined by the first hexadecimal digit. For some of them, for example "0x0" itself, there is a whole subfamily of opcodes. In these cases you make another switch/case inside the first one (or call a C function for that) to test the other values.
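A minimal sketch of that nested switch, assuming the 0x0 family as listed in the documentation (00E0 is CLS, 00EE is RET, anything else is SYS addr). The function name `classify` is hypothetical, and it returns a mnemonic instead of executing anything to keep the example short.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: names the instruction instead of running it.
std::string classify(std::uint16_t opcode) {
    switch (opcode & 0xF000) {
        case 0x0000:
            // The top nibble alone is ambiguous here, so look at the low byte too.
            switch (opcode & 0x00FF) {
                case 0x00E0: return "CLS";
                case 0x00EE: return "RET";
                default:     return "SYS addr";
            }
        case 0x6000:
            return "LD Vx, byte";
        default:
            return "not handled in this sketch";
    }
}
```

For example, `classify(0x00EE)` yields "RET" and `classify(0x62FF)` yields "LD Vx, byte". In a real interpreter each branch would call the code that executes the instruction.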

And finally, as for:

opcode = m_Memory[reg.PC] << 8 | m_Memory[reg.PC + 1]; 

It reads as clearly as plain Portuguese. The vector m_Memory (*) contains 8-bit values. You need to read two bytes and compose a single 16-bit value (and, see the documentation, the most significant byte comes first, i.e. "big endian"):

All instructions are 2 bytes long and are stored most-significant-byte first.

So you take the first byte and multiply it by 2^8 using the shift << 8, that is, you insert 8 zeros to the right of that byte. Then you set those 8 lower binary digits to the value of the next byte in memory using binary OR (since all the corresponding bits are 0, the value of the second byte lands intact in the lower bits of the opcode). In other words: you read a byte and put it in bits 15 to 8 of your opcode, then read the byte at the next memory position and put it in bits 7 to 0.
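The two steps can be written out separately, as a sketch. The names `combine`, `fetch` and the `Memory` alias are hypothetical and chosen only for this example; the 4096-byte size assumes the standard CHIP-8 memory map.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed memory layout for this sketch: CHIP-8 has 4 KB of byte-addressed RAM.
using Memory = std::array<std::uint8_t, 4096>;

// Places `high` in bits 15..8 and `low` in bits 7..0 of the result.
constexpr std::uint16_t combine(std::uint8_t high, std::uint8_t low) {
    return static_cast<std::uint16_t>((high << 8) | low);
}

// Reads the big-endian 16-bit opcode stored at memory[pc] and memory[pc + 1].
// The caller is assumed to keep pc + 1 inside the array.
std::uint16_t fetch(const Memory& memory, std::size_t pc) {
    return combine(memory[pc], memory[pc + 1]);
}

// 0x62 shifted left by 8 is 0x6200; OR with 0xFF fills the low byte.
static_assert(combine(0x62, 0xFF) == 0x62FF, "big-endian composition");
```

Note that the second read is not a mask: the first byte holds only half of the opcode, and the byte at PC + 1 supplies the other half.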


(*) Separate note: you gain very little from complicating variable names, even if that's the style used in other examples you're reading. "m_Memory" instead of "memory" just means 4 more keystrokes and three pieces of "visual garbage" that your brain has to discard every time it reads the variable. There isn't much risk of having another "memory" variable in that code, is there?

Author: jsbueno, 2018-06-18 17:37:08