What are registers and how do they basically work?

I am reading a book about Assembly to better understand how a low-level language works, which will help me understand the inner workings of the programs I develop in C. In one excerpt, the book mentions registers:

Registers are named storage locations in the CPU that hold intermediate results of operations.

There is also a code example in C++ and Assembly:

In C++:

int Y;
int X = (Y + 4) * 3;

In Assembly:

mov   eax,Y
add   eax,4
mov   ebx,3
imul  ebx
mov   X,eax

The quote above says that they are a place for storing data in the CPU; however, I still cannot understand what a register actually is and how important it is to a program's operation. Hence the questions below.

Questions

  1. What exactly are registers?
  2. How does a register basically work?
  3. How important are registers to the programs I develop?
  4. Is there any relationship between RAM and registers?

The book I'm reading is Assembly Language for x86 Processors (Sixth Edition).

Author: Comunidade, 2018-08-06

1 answer

TL;DR

What exactly are registers?

They are memory locations inside the processor with specific names; it is as if they were variables.

How does a register basically work?

They are locations that store data for a short time so the processor can manipulate that data or use it to manipulate something else. Some even serve for fundamental control of the processor's operation or of the execution of your code, generally for things you do not even know exist.

How important are registers to the programs I develop?

Nothing in the abstract sense you read about. Everything, concretely. It is only in them that real execution happens, and they are much faster than the memory where you think your data lives during execution.

Is there any relationship between RAM and registers?

They are a type of short-term memory. The only relationship with RAM is that the two talk to each other all the time: data comes and goes between the registers and RAM.

Details

Basically it is what the definition says :P

Do you know what memory is? And a variable?

Memory

Memory consists of many data slots, and we can say that a slot is always 1 byte in size. Each slot is accessed by a number, not least because there are so many of them. Think of memory as a huge array of bytes.

Some of these slots can be accessed together, and while writing code it is possible to give a name to some of them in particular, but the access is actually done by number, even if you do not see it.
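
To make the "huge array of bytes" idea concrete, here is a minimal sketch in C. The names memory, store32 and load32 are made up for this illustration, and the byte order follows the little-endian layout x86 uses. It shows one 32-bit value occupying four consecutive 1-byte slots, each reached by its number.

```c
#include <stdint.h>

/* Hypothetical model: "RAM" as a plain array of 1-byte slots. */
enum { MEMORY_SIZE = 16 };
uint8_t memory[MEMORY_SIZE];

/* Store a 32-bit value in four consecutive slots starting at `address`
   (little-endian, as on x86: the lowest byte goes in the first slot). */
void store32(uint32_t address, uint32_t value)
{
    for (int i = 0; i < 4; i++)
        memory[address + i] = (uint8_t)(value >> (8 * i));
}

/* Read the four slots back and reassemble the 32-bit value. */
uint32_t load32(uint32_t address)
{
    uint32_t value = 0;
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)memory[address + i] << (8 * i);
    return value;
}
```

A named variable in your program plays the role of the address passed to store32 and load32 here: the name exists only in the source code, the number is what is actually used.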

Registers

Registers are still a kind of memory, but with special characteristics and very few in number, because the distance the electrical signal has to travel needs to be very short for access to be very fast. If there were many registers, most of them would be far away and the access time would be longer.

Unlike normal memory, each slot in this memory inside the processor core is somewhat larger; we usually call it a word. So on 32-bit processors this size is 4 bytes and on 64-bit processors it is 8 bytes. But there are special registers with different sizes: some are 1 bit, because they need no more than that, and others can have several bytes to handle special operations on vectors, encryption, etc.

A register is still a place where bits are stored for a while, generally a very short while, one or a few cycles. Since there are few of them, they can have names. But, like everything in Assembly, they were not given very friendly names. And since they do not serve a specific task, as happens in normal application code, the names are quite generic. But we can say that they are the low-level variables of any code.

Keeping the analogy I made with memory, think of them as one large object with several named members; it would be defined as a class or a structure.

Operations

Every operation the processor can perform works on registers. It is not possible to manipulate RAM directly: you have to move the information into a register, manipulate it there, and then you can move the result back to RAM, if that's what you want. (Even x86 instructions that accept a memory operand have to bring the value into the processor before operating on it.)
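
As a rough sketch of that load, operate, store cycle, the book's example from the question can be mimicked in C. The Registers struct and the compute function are invented for this illustration (they are not how a compiler or the CPU actually represents anything); each line is annotated with the Assembly instruction it imitates.

```c
#include <stdint.h>

/* Hypothetical model: two general-purpose registers as named members. */
struct Registers {
    int32_t eax;
    int32_t ebx;
};

/* Imitates the book's sequence for X = (Y + 4) * 3.
   `y` and `x` point to the variables that live "in RAM". */
void compute(const int32_t *y, int32_t *x)
{
    struct Registers r;
    r.eax = *y;             /* mov  eax,Y  : load from memory into a register */
    r.eax = r.eax + 4;      /* add  eax,4  : operate on the register          */
    r.ebx = 3;              /* mov  ebx,3                                     */
    r.eax = r.eax * r.ebx;  /* imul ebx    : low 32 bits of the product only;
                               the real instruction also writes EDX          */
    *x = r.eax;             /* mov  X,eax  : store the result back to memory  */
}
```

With Y holding 5, calling compute leaves 27 in X. Note that memory is touched only on the first and last lines; everything in between happens in the "registers".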

These are logic gates that perform something by taking the bits present in one or two registers (there are special instructions that can take more data, called SIMD) and turning them into other bit(s) that must go into some register(s).

Access to a register on an x86 processor has a cost in the picosecond range. It is possible to make 3 or 4 billion accesses per second. Access to RAM costs almost 100 nanoseconds (it has been coming down), so about 10 million accesses per second. It's a brutal difference. An ARM does not lag far behind that.

Performance

So it is important to keep data in registers. That is why, in the past, writing Assembly helped a lot. Today compilers tend to make better choices than humans in many cases and put what matters most in registers.

Note that the access time is not the same as the time of a manipulation. A division, for example, can cost several nanoseconds even if it only accesses registers.

Abstraction

Everything you write in a high-level language that touches a piece of data will go through a register.

This Assembly code is a bit high level, because the variables X and Y do not exist in the Assembly context; there would be pure memory addresses instead (on the stack, in this case).

Limitation

You must be wondering: since there are so few registers (at most 16 in the most common cases), what do you do when you are working with a lot of variables (even if only conceptually)? You send to memory whatever does not fit in the processor at that moment. In practice this happens naturally, because you put some data in the main registers, execute something, and take the result, sending it to memory.

Cache

The processor has a cool abstraction: it can keep certain frequently accessed data in cache, the famous L1, L2, L3 and, lately, L4, which, because they are small, sit closer to the processor and have much better access times than RAM. Distance is also the reason for having several levels.

From a certain point of view a register is a kind of cache too, and memory would then be like the operating system's swap file: it is there to ensure everything works with large volumes, but it is better to avoid using it.

I could even talk about the new non-volatile memories that will make RAM persist data, or about the cache line, where data is always transferred in blocks, so accessing 1 byte or 64 (typically) costs almost the same, but that strays a bit from the focus.

Existing Registers

There are 4 main registers on an Intel x86 processor, called EAX, EBX, ECX and EDX. In 64 bits the names are RAX, RBX, RCX and RDX, and obviously the sizes are larger. As a curiosity, in 16 bits they are called AX, BX, CX and DX, and each one can be accessed byte by byte, in its low or high part, so you have AL and AH, BL and BH, and so on.
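
The way AX, AL and AH are just views of the same bits inside EAX can be sketched in C with masks and shifts. The function names below are invented for the illustration; only the bit positions come from the x86 layout.

```c
#include <stdint.h>

/* AX: the low 16 bits of EAX. */
uint16_t low16(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* AL: the low byte of AX (bits 0 to 7). */
uint8_t low8(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }

/* AH: the high byte of AX (bits 8 to 15). */
uint8_t high8(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* Writing AL replaces only the lowest byte; the rest of EAX is kept. */
uint32_t set_low8(uint32_t eax, uint8_t al)
{
    return (eax & 0xFFFFFF00u) | al;
}
```

For an EAX value of 0x12345678, AX is 0x5678, AL is 0x78 and AH is 0x56. There is no separate storage for the smaller names: they are the same register seen through a narrower window.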

Remember, these are just names, as if they were variables; there is not much of a secret. And we can say that they have only one type, which is the word. Almost everything is done on these registers. The most common use, but only by convention, is:

  • EAX is used as an accumulator (it receives the results of operations)
  • EBX is used as the base for operations
  • ECX is a counter (incrementing something)
  • EDX acts as general data to be used in the operation.

Other very important registers, used all the time in every application, which are considered general purpose but are almost always used for something very specific, are:

  • ESP (Stack Pointer - indicates where the end of the stack is in memory)
  • EBP (Base Pointer - indicates where the current scope is; accesses to data on the stack are always relative to this address, which in general marks the beginning of the running function's data, so there is some arithmetic in every access to a piece of data)
  • ESI (Source, sometimes called index)
  • EDI (Destination; these last two are used by instructions optimized for accessing multiple data items such as arrays, including strings)

Remember that in 64 bits they start with R.

Then we have the special segment registers, which have almost no practical use nowadays with the advent of virtual memory.

One of the most important registers is EIP, the Instruction Pointer. It is the one that knows where the code is executing. Each instruction that finishes executing advances it to the address of the next instruction to execute; the size of that step is variable on Intel-like processors, but fixed on RISC processors such as ARM. A goto (jmp), among other instructions, manipulates this address, diverting execution to a specific address totally out of sequence.
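
A toy interpreter can show what the instruction pointer does. Everything here is hypothetical: the three-instruction machine, the Opcode values and the run function are invented for the sketch, and the instructions have a fixed size (one array element), which is closer to the RISC case than to x86.

```c
#include <stdint.h>

/* Hypothetical three-instruction machine, invented for illustration. */
enum Opcode { OP_ADD, OP_JMP, OP_HALT };

struct Instruction {
    enum Opcode op;
    int32_t operand;   /* value to add, or target index for a jump */
};

/* Runs `program` and returns the final accumulator value.
   `ip` plays the role of EIP: it advances by one after each
   instruction is fetched, unless a jump overwrites it. */
int32_t run(const struct Instruction *program)
{
    int32_t acc = 0;
    uint32_t ip = 0;
    for (;;) {
        struct Instruction ins = program[ip];
        ip++;   /* default: continue with the next instruction */
        switch (ins.op) {
        case OP_ADD:  acc += ins.operand;        break;
        case OP_JMP:  ip = (uint32_t)ins.operand; break;
        case OP_HALT: return acc;
        }
    }
}
```

In a program such as ADD 1, JMP 3, ADD 100, ADD 10, HALT, the jump overwrites ip so the ADD 100 at index 2 is never executed and the result is 11.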

In 64 bits we have R8 through R15, which are complementary registers and work like the first ones, but without any conventional use; they are used for optimizations, and in simple operations they usually stay empty (conceptually, since they will always hold whatever data was there before).

I did not talk about the special registers used by MMX, SSEx, etc. instructions because I do not understand them well and I think they do not matter for most uses.

Finally we come to the bit registers (flags), which receive certain control results and are consulted by certain instructions to decide what to do. You can already imagine that this happens a lot with comparison instructions, but not only there: it happens quite a lot in arithmetic too. These registers are updated by a good part of the operations, so you only have the last state; if you need this information for some later operation (usually you do not), you must store it somewhere, either in a general register or in memory. I won't list all of them, but the main ones are (bit positions):

  • 00 CF - Carry Flag - the famous "carry the one" (yes, the computer needs to do arithmetic the way you have always done it since childhood)
  • 02 PF - Parity Flag - indicates whether the number of 1 bits in the low byte of the result is even or odd, which allows some optimizations
  • 04 AF - Adjust Flag - used for calculation
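
To give an idea of how these three flags come out of an ordinary addition, here is a sketch for an 8-bit add. The Flags struct and the add8 function are invented for the illustration. The rules are the x86 ones as I understand them: CF is the carry out of the top bit, PF is set when the low byte of the result has an even number of 1 bits, and AF is the carry from bit 3 into bit 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three flags discussed above. */
struct Flags {
    bool cf;  /* carry out of bit 7 */
    bool pf;  /* low byte of the result has an even number of 1 bits */
    bool af;  /* carry out of bit 3 into bit 4 */
};

/* Adds two bytes, returns the 8-bit result and fills in the flags. */
uint8_t add8(uint8_t a, uint8_t b, struct Flags *flags)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    uint8_t result = (uint8_t)sum;

    /* Count the 1 bits of the result to decide the parity. */
    int ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (result >> i) & 1;

    flags->cf = sum > 0xFF;
    flags->pf = (ones % 2) == 0;
    flags->af = ((a & 0x0F) + (b & 0x0F)) > 0x0F;
    return result;
}
```

Adding 0xFF and 0x01 gives 0x00 with all three flags set: the sum does not fit in 8 bits (CF), the result has zero 1 bits, which is an even count (PF), and the low nibbles 0xF and 0x1 overflow into bit 4 (AF).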