Has anyone encountered sizeof(char) != 1 in practice?

This was prompted by one of the answers to a recent question on C++ (C):

Calculating the length of a string without using strlen()

I have never encountered, nor can I recall any description of, a system where char is larger than one byte. Naturally, this is about C (C++), not Java and the like.

While we are at it, has anyone ever dealt with a byte that does not consist of 8 bits?

 86
Author: Дух сообщества, 2012-06-09

6 answers

Systems where a char is wider than 8 bits not only existed, they still exist and are actively used today. Typically these are embedded systems or specialized processors such as DSPs. Here is a quote from the report Reading and Writing Binary Files on Targets With More Than 8-Bit Chars by Texas Instruments, the world's fourth-largest semiconductor manufacturer:

On the C2000 and C5000 DSP platforms, a char is 16 bits; on the C3x DSP generation, a char is 32 bits.

@avp, in a comment:

What about sizeof('t') == sizeof(int)? That may be important.

Yes, it can be, but most often the difference is offset by the sign extension of the character code to int. Consider a small example that reads a single character from a file and compares it with the end-of-file marker:

#include <stdio.h>

int main()
{
    if (getchar() == EOF)
        printf("Ох, Щи!!!");
    return 0;
}

Run it, feeding the program a file whose first byte is 0xFF:

user@linux:~> echo $'\xff' > test
user@linux:~> gcc test.c
user@linux:~> ./a.out < test
user@linux:~>

No output, and that is logical: getchar() returns the int value 255, which is compared with -1 (EOF) and does not match. However, the following two examples

if ('\xff' == EOF)
    printf("Ох, Щи!!!");

and

int c = '\xff';
if (c == EOF)
    printf("Ох, Щи!!!");

demonstrate equality, because the value of '\xff' is sign-extended to int (giving 0xffffffff = -1) on platforms where plain char is signed. On the one hand, this is a source of errors; on the other, this behavior causes no incompatibility with C++, where a character literal is exactly one byte and is extended in the same way when compared with an int.

 53
Author: northerner, 2012-06-10 08:05:16
  • C++ Standard - 5.3.3 / 1:

    sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1.

  • More interesting, by the way, is the fact that in C sizeof('t') is not 1, but sizeof(int).

    Character constants (enclosed in single quotes) have type int in C and type char in C++. Therefore sizeof('t') == sizeof(int) holds in C, while sizeof('t') == sizeof(char) holds in C++.

 64
Author: M. Williams, 2020-06-12 12:52:24

As far as I know:

char is the only type whose size is specified exactly. The other types only have sizes relative to one another, for example:

1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)

 18
Author: Bohdan Shevchenko, 2015-11-14 12:29:28

Alignment to an int boundary can be configured, which would make the expression sizeof(char) == sizeof(int) true.

 5
Author: Dzmitry, 2015-05-19 05:59:37

Has anyone encountered sizeof(char) != 1 in practice?

Defines work wonders:

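#include <stdio.h>
#include <wchar.h>   /* declares wchar_t in C */
#define char wchar_t /* the trick: every later "char" becomes wchar_t */
int main()
{
    printf("sizeof(char)==%zu", sizeof(char)); /* %zu matches size_t */
}

Outputs sizeof(char)==2 where wchar_t is 16 bits wide (for example, on Windows); on most Linux systems it prints 4.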

 4
Author: devoln, 2012-06-10 14:32:10

So many answers, and only one of them is partially correct. People should not be misled. The C standard (6.5.3.4 The sizeof operator) says:

When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.

And the C++ standard (5.3.3 Sizeof):

sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1.

It is clearly stated that the char type has a size of ONE byte. No system can have a standard-conforming compiler for which sizeof(char) yields anything else; otherwise the compiler is non-standard!

Now about the "fair" equality sizeof(char) == sizeof(int). Do not confuse the value returned by sizeof with the type of that value: sizeof yields an integer type (size_t), and in that sense, yes, such an equality is possible. Even an expression like 'c' != 0 compares ints, because 'c' is promoted to int, again per the standard. But in C++ the literal 'c' has type char, so sizeof('c') yields one (and, yes, as a size_t value).

Furthermore, a programming language and its implementation on a particular platform are different things. If the platform's smallest addressable storage unit is larger than one byte, then, yes, a single-byte type will be stored in two or more bytes, depending on that minimum unit. But this has nothing to do with sizeof(char).

Now for the silly example, which is even more misleading.

if (getchar() == EOF)

Naturally, the value returned by getchar() is widened to int with its value preserved, i.e. to 0x000000FF, and of course it does not equal 0xFFFFFFFF. The next case is more interesting.

if ('\xff' == EOF)

Here we are comparing (char)-1 with (int)-1. Why are they equal? Because the char is converted to int, and -1 remains -1.

UPD, following the discussion:

The C++ standard has the concept of a byte, and the sizeof operator reports sizes in bytes. Although the range definitions and the examples use values that need exactly eight bits, the number of bits in a byte is not fixed by the standard. The only requirement is that a byte be able to hold at least all the characters reserved in the standard.

As for the definition of the byte itself, it is given in the IEC 80000 standard, which defines it exactly as an octet, eight bits. Any documentation that does not explicitly state the size of a byte relies on that standard. But a standard is not a law, so deviations are possible. For example, extra bits may be used for integrity checking or other service purposes, but only eight bits remain visible to the user. Any other deviation is more of an anachronism.

 3
Author: Andrey Sv, 2018-11-09 21:06:09