sizeof does not work to determine malloc size

I was working on some data structures and needed to allocate an array dynamically. However, even after allocating the space the structure needs, the value returned by sizeof is incorrect. Here is an example:

int *vetor = (int *)malloc(sizeof(int)*4); // allocating space for 4 ints, i.e. 4*4 = 16 bytes

printf("%zu -- %zu\n", sizeof(vetor), sizeof(vetor)/sizeof(int)); // printing the value

In the example above, sizeof(vetor) should return the size of the array, i.e. the number of bytes allocated for it, correct?

And therefore the expected output would be:

16 -- 4

However, that is not what happens. The actual output is:

8 -- 2

And no matter how many bytes I allocate for the array with malloc, the output is always the same, i.e. the array size in bytes is always 8.

Why does this happen, and what would be the solution to this problem?

Author: Maniero, 2020-08-03

2 answers

You are stating two things and asking a question that assumes both of them are facts. However:

  • you are not "allocating the required space"
  • and it is not true that "the value returned by sizeof is incorrect"

Here is the declaration of vetor:

int        *vetor = (int *)malloc(sizeof(int)*4);

I don't want to start a religious debate here, but this line declares vetor, and vetor is an int*, so it might be clearer to read, especially for those who are learning, if you write:

int*        vetor = (int *)malloc(sizeof(int)*4);

sizeof(vetor) really does return the size of vetor. vetor is an int*, a pointer to int, and the size of a pointer is determined by the machine architecture: 8 bytes in your case, since you are compiling for 64 bits.

"And no matter how many bytes I allocate for the array with malloc, the output is always the same, i.e. the array size in bytes is always 8"

Here you are right: vetor, a pointer to int, is one thing. The size of the area it points to is another, and in this case it was determined by the expression malloc(sizeof(int)*4).

As for the size of the allocated area, there is officially no way for you to find out what it is, and the reason is simple: you are the one who allocated it, so you should know. The system, for its part, keeps an internal table of these values. See this example excerpt:

int tamanho = 1801;
int* mais_um_vetor = (int*)malloc(tamanho);
free(mais_um_vetor);
mais_um_vetor = (int*)malloc(130);
tamanho = 32 * sizeof(int);
int* p = (int*)realloc(mais_um_vetor, tamanho);
if (p != NULL) mais_um_vetor = p;
free(mais_um_vetor);

malloc() does no size arithmetic: it allocates 1801 bytes and stores the address in mais_um_vetor, a pointer to int. It does not allocate 1801 int! Since the size of the area is not a multiple of sizeof(int), your program may crash if you use mais_um_vetor as an array of int and walk it 4 bytes at a time, because the last byte does not form a complete int.

Then free() runs fine and releases the 1801 bytes, and malloc() allocates 130 bytes for the same pointer. Perhaps remembering that the size should be a multiple of sizeof(int), the program then calls realloc() to make room for 32 int, and stores the total size of the area in tamanho.

Note that this is just a meaningless example; it is included in the program at the end.

Note that realloc() does not need to be told the current size of the area either, only the new size.

1801 bytes were allocated, then released, and then 130 were allocated. sizeof(mais_um_vetor) does not change: it is still 8. Meanwhile the system keeps track of the size of the area for when you call free() to release it or realloc() to resize it, and everything works.

However, that is probably not what you want

You want to access vetor as an array of int, with an arbitrary number of values, allocated dynamically.

How to do this?

You can allocate the exact number, starting from N=1 and using realloc() to grow to N=N+1 each time, or you can allocate in blocks of a fixed size, say 64 int at a time, which is a little more efficient.

The problem is that realloc() may have to move everything to another place, at the discretion of the system, and it will not warn you beforehand. This of course costs your program some time, and it can happen at any moment. In study programs this is not relevant, but I think you see the problem: the block you allocated sits in the middle of other things your program may have allocated, and when realloc() needs a few more bytes there may be no room right after the block. It then allocates a larger area elsewhere and copies everything that was in the original area, and your program has to wait.

Before you ask: when shrinking, the address usually stays the same in practice, but the C standard does not guarantee it, so always use the pointer that realloc() returns. When growing, the only guarantee is that the contents up to the old size are preserved.

So what you actually want to allocate is

int**        vetor;

vetor should point to an array of pointers to int, and not to a single int as in

int*         vetor;

This is exactly what the system does for every C program: it builds the array argv[] with argc elements, and it is clear why argc is needed: someone has to tell the program the size of the argument array.

In the same way, someone has to tell your program how many int the pointer points to.

An example program

It allocates an array of 32 pointers to int, allocates each int, and stores a value from 100 to 131 in each one. It shows the first and last values and then frees everything. Then it allocates, fills and frees an array of 3 int.

Output:

sizeof(vetor) 8
sizeof(int) 4
sizeof(vetor) = 8
sizeof(outro_vetor)  int outro_vetor[30] = 120
alocado um vetor de 32 int
Primeiro: 100 Ultimo 131
Liberando o vetor...
Liberado...
Alocando vetor de 3 int...
sizeof() = 8
3 4 5
Final...

The program:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int* vetor = (int*)malloc(sizeof(int) * 4);
    printf("sizeof(vetor) %zu\n", sizeof(vetor));
    printf("sizeof(int) %zu\n", sizeof(int));
    free(vetor); // release the block before the pointer is reused

    int     outro_vetor[30];
    vetor = outro_vetor;

    printf("sizeof(vetor) = %zu\n", sizeof(vetor));
    printf("sizeof(outro_vetor)  int outro_vetor[30] = %zu\n",
        sizeof(outro_vetor));

    int tamanho = 1801;
    int* mais_um_vetor = (int*)malloc(tamanho);
    free(mais_um_vetor);
    mais_um_vetor = (int*)malloc(130);
    tamanho = 32 * sizeof(int);
    int* p = (int*)realloc(mais_um_vetor, tamanho);
    if (p != NULL) mais_um_vetor = p;
    free(mais_um_vetor);

    // create vetor_de_int pointing to intN int
    int     intN = 32;
    int** vetor_de_int = NULL;

    // step by step (could have been done in one go)
    vetor_de_int = (int**)malloc(intN * sizeof(int*));
    if (vetor_de_int == NULL) return 1;
    for (int n = 0; n < intN; n += 1)
    {
        vetor_de_int[n] = malloc(sizeof(int)); // allocate one int
        if (vetor_de_int[n] == NULL) return 1;
        *vetor_de_int[n] = 100 + n; // values from 100 to 131
    }  // for()

    printf("alocado um vetor de %d int\n", intN);
    printf("Primeiro: %d Ultimo %d\n",
        *vetor_de_int[0],
        *vetor_de_int[intN-1]
    );

    // destroy everything, as in C++, in the reverse
    // order of creation
    printf("Liberando o vetor...\n");
    for (int n = 0; n < intN; n += 1)
        free(vetor_de_int[n]);
    // the elements are freed, now the table
    free(vetor_de_int);
    printf("Liberado...\n");

    printf("Alocando vetor de 3 int...\n");

    int     (*vetor3_int)[3] = malloc(3 * sizeof(int));
    if (vetor3_int == NULL) return 1;
    printf("sizeof() = %zu\n", sizeof(vetor3_int));
    (*vetor3_int)[0] = 3;
    (*vetor3_int)[1] = 4;
    (*vetor3_int)[2] = 5;
    for (int i = 0; i < 3; i += 1)
        printf("%d ", (*vetor3_int)[i]);
    free(vetor3_int);
    printf("\nFinal...\n");
    return 0;
}

But the first one is not an array of int...

Right, it is an array of int*. In practice that is what gets used, especially in data structures. If you want to allocate an array of 32 int you declare

int     (*vetor32_int)[32] = malloc(32 * sizeof(int));

Only that way it is much less flexible: in memory it is fine and the size is fixed, but it only ever holds 32 elements.

So the normal approach is to use a pair of variables, as the system does, and to allocate in blocks of a reasonable size, so that there is neither too much waste nor too many calls to realloc().

 1
Author: arfneto, 2020-08-17 20:06:34

The calculation you are doing only works for an array whose size is set at compile time. Although you use a constant in the dynamic allocation, the size is in general unknown to the compiler, so the calculation does not work. The sizeof operator can only report information that the compiler can prove to be constant (C99 variable-length arrays aside). In this particular case the analysis would not be so hard, but there is more complex code where the compiler cannot know the value.

So,

  • either you allocate an array on the stack, where the compiler can know the size, and I see no reason not to do so (there may be one, but in general simple cases do not need dynamic allocation),
  • or you use the value you already know, 4, instead of doing a calculation that does not even make sense. If you use the array in another function, you have to pass this value along to that function, or in some cases you use a global constant, so the size has a name everywhere and it becomes easier to change the number in one place.

Anyway, there are several ways to solve this depending on the context, which we do not know.

Just for completeness, the 8 that appears there is the size of the pointer, not the size of the array. vetor has a pointer type, as you declared it, so you cannot expect it to magically report the size of something else. On 64-bit architectures pointers typically have size 8. An array allocated on the stack is not a pointer: it is the data itself and its size is known, since the compiler has to know how much space to reserve. Dynamic allocation is used when you do not know the size in advance. In that case you need to create your own mechanism to keep track of the size, so the second option listed above is the recommended one. It can even be made quite sophisticated, but I will not go into that because it is advanced.

 3
Author: Maniero, 2020-08-04 12:18:34