How do I use C variables in inline assembly?

Following an example from a book, I was able to assemble and run the following AT&T-syntax assembly code with gas. It prints a message with the processor's vendor ID:

.section .data 
output:
    .asciz "The processor Vendor ID is '%s'\n"

.section .bss
    .lcomm tam, 12

.section .text
.global main
main:
    movl $0, %eax
    cpuid
    movl $tam, %edi
    movl %ebx, (%edi)
    movl %edx, 4(%edi)
    movl %ecx, 8(%edi)
    pushl $tam
    pushl $output
    call printf
    addl $8, %esp   

    pushl $0    
    call exit

Question: I'm trying to write the same thing as inline assembly in C, but I'm having difficulties. How do I declare my variables in C so that the assembly can use them correctly? The attempt below doesn't work, but it illustrates what I'm asking:

#include <stdio.h> 

int main(void)
{   
    const int tam = 12;
    char *output = "The processor Vendor ID is '%s'\n";

    __asm__ (

            "movl $0, %eax;"
            "cpuid;"
            "movl tam, %edi;"
            "movl %ebx, (%edi);"
            "movl %edx, 4(%edi);"
            "movl %ecx, 8(%edi);"
            "pushl $tam;"
            "pushl $output;"
            "call printf;"
            "addl $8, %esp;"
    );

    return 0;
}

I would like to know how to write this inline assembly in C so that it prints the processor's vendor ID. I can compile and run simple instructions (with no .data or .bss), but as soon as I need .bss variables or even .data constants, the code no longer compiles. How do I get the source above to compile and behave like the assembly version? Thank you for your attention.

Author: Rafael Bluhm, 2015-02-07

1 answer

Basic Notation

The basic form of asm (also spelled __asm__, _asm or __asm, depending on the compiler) in C/C++ is as follows, using GCC as the reference:

asm [volatile] ( "YOUR CODE\n\t"
                 "IN\n\t"
                 "ASSEMBLY"
                    : OutputOperands
                  [ : InputOperands
                  [ : Clobbers ] ])

This notation changes depending on the compiler.


Examples

I find it easier to explain with some examples. See them on ideone.

Assuming you have the following variables:

// Variables that will interact with the assembly code:
int foo, bar, var;

You can interact with them using inline assembly as follows:

//    In C, this would be:
//        foo = 1;
//        bar = 2;
//        var = 3;
asm volatile ("movl $1, %0;"  // assembly code
              "movl $2, %1;"
              "movl $3, %2;"
              : "=r" (foo), "=r" (bar), "=r" (var) // output operands
              );

The =r constraint tells the compiler that the operand is an output (=) that must live in a general-purpose register (r). Inside the template, each operand is referenced as %N, where N is its index in the operand list. You can also use =g, which lets the compiler choose between a register and memory. More details are in the GCC documentation.
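
To make the difference between constraints concrete, here is a minimal sketch, assuming GCC or Clang targeting x86 or x86-64. The function names load_const_reg and load_const_mem are invented for this illustration. The same instruction is emitted twice, once with a register output (=r) and once with a memory output (=m):

```c
/* Assumes GCC or Clang on x86/x86-64; function names are illustrative only. */

/* "=r": the compiler picks a general-purpose register for %0
   and copies it into value after the asm statement. */
static int load_const_reg(void)
{
    int value;
    __asm__ ("movl $42, %0" : "=r" (value));
    return value;
}

/* "=m": %0 is replaced by the memory location of value,
   so the instruction stores directly into the variable. */
static int load_const_mem(void)
{
    int value;
    __asm__ ("movl $42, %0" : "=m" (value));
    return value;
}
```

Both functions return 42; only the way the compiler materializes the operand differs.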


//    In C, this would be:
//        bar = foo * 2;
asm volatile ("movl $2, %%eax;"      // eax = 2
              "imul %%ebx, %%eax;"   // eax = eax * ebx
              "movl %%eax, %0;"      // bar = result
              : "=r" (bar)    // output operands
              : "b" (foo)     // input operands (ebx = foo)
              : "%eax", "cc"  // clobbers: eax and the flags are modified
              );

In this case, the compiler loads the value of foo into the EBX register, and the assembly code then reads it from there.
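
Hard-coding EBX is not required. As a sketch of the alternative, assuming GCC or Clang on x86 or x86-64, the generic r constraint lets the compiler choose the registers for both operands. The function name times_two is hypothetical:

```c
/* Assumes GCC or Clang on x86/x86-64; times_two is an illustrative name. */
static int times_two(int foo)
{
    int bar;
    /* Three-operand imul: %0 = %1 * 2. The compiler picks both registers. */
    __asm__ ("imull $2, %1, %0"
             : "=r" (bar)   /* output operand %0 */
             : "r" (foo)    /* input operand %1 */
             : "cc");       /* imul modifies the flags */
    return bar;
}
```

Because the template is a single instruction, the input is fully read before the output is written, so no early-clobber modifier is needed here.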


//    In C, this would be:
//        var = bar;
asm volatile ("movl %1, %%eax;"   // eax = bar (%1 is the input)
              "movl %%eax, %0;"   // var = eax (%0 is the output)
              : "=r" (var) // output
              : "b" (bar)  // input
              : "%eax"     // clobbers
              ); 

This makes var equal to bar by way of the eax register (note the operand indexes %0 and %1; outputs are numbered first, then inputs). The third section (clobbers) tells the compiler that the code modifies eax. The compiler then avoids placing operands in EAX and does not assume that EAX keeps its value across the assembly code, so you are free to use it.
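
Numeric indexes get hard to follow as the operand list grows. GCC and Clang also accept symbolic operand names, written %[name] in the template. A minimal sketch of the same copy, assuming x86 or x86-64, with the hypothetical function name copy_via_eax:

```c
/* Assumes GCC or Clang on x86/x86-64; copy_via_eax is an illustrative name. */
static int copy_via_eax(int src)
{
    int dst;
    __asm__ ("movl %[in], %%eax\n\t"   /* eax = src */
             "movl %%eax, %[out]"      /* dst = eax */
             : [out] "=r" (dst)        /* output, referenced by name */
             : [in]  "r"  (src)        /* input, referenced by name */
             : "eax");                 /* eax is used as scratch */
    return dst;
}
```

Since eax is listed as a clobber, the compiler will not assign it to either operand.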


To get the CPU manufacturer

So, to get the CPU manufacturer, you can use the cpuid instruction as follows:

    asm volatile ("cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
                          : "a" (op));

Here eax, ebx, ecx and edx are C variables that receive the values of the corresponding registers, and op is the cpuid function to call. After the statement runs, these variables hold the values returned by cpuid, which you can use to print the processor manufacturer.
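
The statement above can be wrapped in a small helper so that it is not repeated for every cpuid function. This is only a sketch, assuming GCC or Clang on x86 or x86-64. The struct cpuid_regs and the function cpuid_query are names invented here, and ECX is also passed as an input because some cpuid functions read a sub-function number from it:

```c
#include <stdint.h>

/* Illustrative container for the four registers returned by cpuid. */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Runs cpuid with EAX = leaf and ECX = subleaf and returns all four registers.
   Assumes GCC or Clang on x86/x86-64. */
static struct cpuid_regs cpuid_query(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r;
    __asm__ volatile ("cpuid"
                      : "=a" (r.eax), "=b" (r.ebx), "=c" (r.ecx), "=d" (r.edx)
                      : "a" (leaf), "c" (subleaf));
    return r;
}
```

Calling cpuid_query(0, 0) returns the highest supported basic function in eax and the vendor ID bytes in ebx, edx and ecx.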

On Windows (with the Microsoft compiler), you can get the same result with the following intrinsic:

int regs[4]; // receives eax, ebx, ecx, edx
int op = 0;  // function code
__cpuid(regs, op);

This requires including intrin.h.
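
If the same source has to build with both compiler families, the two approaches can sit behind one function. The sketch below assumes that GCC and Clang provide __get_cpuid in cpuid.h and that MSVC provides __cpuid in intrin.h. The wrapper name cpuid_leaf is hypothetical:

```c
#include <string.h>

#if defined(_MSC_VER)
#include <intrin.h>   /* MSVC: __cpuid(int regs[4], int leaf) */
#else
#include <cpuid.h>    /* GCC/Clang: __get_cpuid(leaf, &eax, &ebx, &ecx, &edx) */
#endif

/* Fills out[0..3] with EAX, EBX, ECX, EDX for the given leaf.
   Returns 1 on success and 0 if the leaf is not supported
   (the MSVC branch performs no such check and always returns 1). */
static int cpuid_leaf(unsigned int leaf, unsigned int out[4])
{
#if defined(_MSC_VER)
    int regs[4];
    __cpuid(regs, (int) leaf);
    memcpy(out, regs, sizeof regs);
    return 1;
#else
    return __get_cpuid(leaf, &out[0], &out[1], &out[2], &out[3]);
#endif
}
```

With this helper, no inline assembly appears in the calling code at all.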


Example in ideone:

#include <stdio.h>
#include <string.h>

int main(void)
{
    // CPUID function to call; zero requests the vendor ID.
    int op = 0;

    // Variables that receive the register values:
    int eax;
    int ebx;
    int ecx;
    int edx;

    __asm__ ("cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx)
                     : "a" (op));

    // The vendor ID is the 12 bytes held in EBX, EDX and ECX, in that order.
    char vendor[sizeof(int) * 3 + 1];
    memcpy(&vendor[0], &ebx, sizeof(int));
    memcpy(&vendor[4], &edx, sizeof(int));
    memcpy(&vendor[8], &ecx, sizeof(int));
    vendor[12] = '\0'; // null terminator

    printf("CPU: %s\n", vendor);

    return 0;
}

The output depends on the CPU and contains only the vendor ID string. Known strings map to manufacturers as follows:

"AMDisbetter!" or "AuthenticAMD" -> "AMD"
"GenuineIntel" -> "Intel"
"VIA VIA VIA " -> "VIA"
"CentaurHauls" -> "Centaur"
"CyrixInstead" -> "Cyrix"
"TransmetaCPU" or "GenuineTMx86" -> "Transmeta"
"Geode by NSC" -> "National Semiconductor"
"NexGenDriven" -> "NexGen"
"RiseRiseRise" -> "Rise"
"SiS SiS SiS " -> "SiS"
"UMC UMC UMC " -> "UMC"
"Vortex86 SoC" -> "Vortex"
"KVMKVMKVMKVM" -> "KVM"
"Microsoft Hv" -> "Microsoft Hyper-V"
"VMwareVMware" -> "VMware"
"XenVMMXenVMM" -> "Xen HVM"
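
The mapping above can be expressed as a small lookup table. This is a minimal sketch in plain C. The names vendor_entry, vendor_table and vendor_name are made up for illustration, and only a few of the strings listed above are included:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative pair: 12-character vendor ID and a readable manufacturer name. */
struct vendor_entry {
    const char *id;
    const char *name;
};

/* A subset of the list above; extend as needed. */
static const struct vendor_entry vendor_table[] = {
    { "AMDisbetter!", "AMD" },
    { "AuthenticAMD", "AMD" },
    { "GenuineIntel", "Intel" },
    { "CentaurHauls", "Centaur" },
    { "VMwareVMware", "VMware" },
};

/* Returns the manufacturer for a vendor ID string, or "unknown". */
static const char *vendor_name(const char *id)
{
    size_t i;
    for (i = 0; i < sizeof vendor_table / sizeof vendor_table[0]; i++) {
        if (strncmp(id, vendor_table[i].id, 12) == 0)
            return vendor_table[i].name;
    }
    return "unknown";
}
```

The vendor buffer filled from cpuid can be passed directly, for example vendor_name(vendor).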

Note: This code is for x86 only. Finding out the exact CPU model takes more work.
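
As a rough idea of what that extra work looks like, processors that support the extended cpuid functions 0x80000002 to 0x80000004 return a 48-byte brand string through them. The sketch below assumes GCC or Clang on x86 or x86-64. The helper names cpuid_raw and cpu_brand are hypothetical, and the exact content of the string depends on the processor:

```c
#include <stdint.h>
#include <string.h>

/* Runs cpuid for one function and stores EAX, EBX, ECX, EDX in regs[0..3].
   Assumes GCC or Clang on x86/x86-64. */
static void cpuid_raw(uint32_t leaf, uint32_t regs[4])
{
    __asm__ volatile ("cpuid"
                      : "=a" (regs[0]), "=b" (regs[1]),
                        "=c" (regs[2]), "=d" (regs[3])
                      : "a" (leaf), "c" (0));
}

/* Writes the brand string into out (at least 49 bytes).
   Returns 1 on success, 0 if the processor does not report one. */
static int cpu_brand(char out[49])
{
    uint32_t regs[4];
    uint32_t i;

    /* Function 0x80000000 returns the highest extended function in EAX. */
    cpuid_raw(0x80000000u, regs);
    if (regs[0] < 0x80000004u)
        return 0;

    /* Each of the three functions yields 16 bytes of the string. */
    for (i = 0; i < 3; i++) {
        cpuid_raw(0x80000002u + i, regs);
        memcpy(out + 16 * i, regs, 16);
    }
    out[48] = '\0';
    return 1;
}
```

The caller should check the return value before printing, since older or unusual processors may not implement these functions.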

Reference: Playing with cpuid

Author: Lucas Lima, 2015-02-08 18:07:01