C: problem with getchar() and EOF (^Z) in the Windows console

I've been trying to understand this for a very long time:

Why does the loop not end if I enter "dfkjsdf^Z", but does end if I enter "dfkjsdf", press Enter, and then enter ^Z? In other words, how do I make it exit the loop when I press Ctrl+Z before pressing Enter?

This is probably the most incomprehensible thing in the C language for me. No matter how much I searched Google, I couldn't find the answer.

#include <stdio.h>

int main(void) {
    int i = 0;
    int arr[10];
    /* Check the bound first so that arr[10] is never written. */
    while (i < 10 && (arr[i] = getchar()) != EOF) {
        printf("arr[i] is %c\n", arr[i]);
        i++;
    }
    return 0;
}
Author: jfs, 2017-12-15

1 answer

This has nothing to do with the C language itself. It depends only on how the Windows console processes the Ctrl-Z combination and on how the standard library implementation you are using interprets the result of that processing.

Input in the Windows console is buffered line by line. The ^Z characters present in the buffer are handled by a rather confusing algorithm (at least with the standard library that ships with MSVC).

  • The Ctrl-Z combination by itself does not "push" the accumulated buffer to the waiting process (unlike the Ctrl-D combination in Linux). It only adds the character ^Z, i.e. \x1a, to the input buffer. You can press Ctrl-Z several times, placing several ^Z characters in the input buffer, and then continue typing something else. To actually send the accumulated buffer to the waiting process, you still have to press Enter.

  • If the input buffer contains some characters before the first occurrence of the ^Z character, the waiting process will see all of those characters, followed by a single ^Z character, i.e. \x1a. It is delivered as an ordinary \x1a character; no "end of file" situation occurs. However, the rest of the input buffer (everything after the first ^Z) is not visible to the process, as if it did not exist.

    That is, if you enter the sequence abc^Z^Zdef^Zghi in the Windows terminal and press Enter, then your process will receive the input characters a, b, c and \x1a. All other input will disappear without a trace. Note that the newline character generated by pressing Enter also "disappears".

  • If the input buffer immediately starts with the character ^Z, the input buffer is considered empty. All its contents disappear, not even the ^Z character is read. The "end of file" situation occurs.

    That is, if you enter the sequence ^Zdef in the Windows terminal and press Enter, then your process will receive nothing at all at the input. Instead, the input function will tell you that it has hit the end of the file.
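
The three cases above can be summarized as a small model. This is only a sketch of the behavior described in this answer, not the code that the console or the MSVC runtime actually runs; the function name model_console_line and its return convention are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CTRL_Z '\x1a'

/* Hypothetical model of the rules described above (not real CRT code).
 * `line` holds the `len` characters typed before Enter, without the newline.
 * Copies into `out` the characters the process would see and returns their
 * count, or returns -1 if the read would report "end of file".
 * `cap` must be at least len + 1. */
int model_console_line(const char *line, size_t len, char *out, size_t cap)
{
    assert(line != NULL && out != NULL && cap >= len + 1);

    const char *ctrl_z = memchr(line, CTRL_Z, len);

    if (ctrl_z == NULL) {
        /* No ^Z at all: the whole line and the newline are delivered. */
        memcpy(out, line, len);
        out[len] = '\n';
        return (int)(len + 1);
    }
    if (ctrl_z == line) {
        /* ^Z is the very first character: nothing is delivered, EOF. */
        return -1;
    }
    /* ^Z in the middle: the prefix and one \x1a are delivered,
     * everything after it (including the newline) is dropped. */
    size_t prefix = (size_t)(ctrl_z - line);
    memcpy(out, line, prefix);
    out[prefix] = CTRL_Z;
    return (int)(prefix + 1);
}
```

For the input abc^Z^Zdef^Zghi the model yields the four characters a, b, c and \x1a, and for ^Zdef it yields -1, matching the two examples above.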

Therefore, in order to create the "end of file" situation in the buffered console input, you will have to enter ^Z at the very beginning of a new line.
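
If the loop should also stop when ^Z is typed in the middle of a line, one possible workaround is to treat the \x1a character itself as a terminator, since in that case it reaches the program as ordinary data. The following is a minimal sketch under that assumption; the helper name read_until_ctrl_z is hypothetical and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CTRL_Z 0x1a

/* Reads at most `cap` characters from `in` into `buf`.
 * Stops at a real end of file or at a ^Z (0x1a) that arrives as an
 * ordinary character. Returns the number of characters stored. */
size_t read_until_ctrl_z(FILE *in, int *buf, size_t cap)
{
    assert(in != NULL && buf != NULL);

    size_t n = 0;
    int c;
    /* The bound is checked first, so buf[cap] is never written. */
    while (n < cap && (c = getc(in)) != EOF && c != CTRL_Z) {
        buf[n++] = c;
    }
    return n;
}
```

In the program from the question this would be called as read_until_ctrl_z(stdin, arr, 10). The ^Z still takes effect only after Enter is pressed, because the console does not hand the line to the program before that.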


If you need character-by-character input processing, you can first disable line-by-line input buffering:

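#include <windows.h>

/* Inside main(), before the read loop: clear ENABLE_LINE_INPUT on stdin. */
HANDLE hIn = GetStdHandle(STD_INPUT_HANDLE);
DWORD dwMode;
GetConsoleMode(hIn, &dwMode);     /* read the current console input mode */
dwMode &= ~ENABLE_LINE_INPUT;     /* turn off line-by-line buffering */
SetConsoleMode(hIn, dwMode);      /* apply the modified mode */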

In this case, each character entered will be read immediately by your getchar(), and the ^Z character will be interpreted as the end of the file right away.

Author: AnT, 2020-06-12 12:52:24