Is it possible to detect integer overflow in C/C++?

The problem of integer overflow is often mentioned in the context of safe programming. Is it possible to catch this situation in C/C++ code? After all, processors (at least x86) have an Overflow Flag among the flags in the EFLAGS register, so it is technically possible, and the overhead of checking a flag in a register should not be large (I would think). Or is the problem in the formulation of the question, and overflow should not be caught but rather never allowed to happen in the first place? Does the latter mean that integer types are fundamentally unsuitable for any computation?

P.S. The question arose because in the code I work with, int is widely used for arithmetic calculations, nothing protects against overflows, and (judging by the logic) they could hypothetically go unnoticed. In most cases everything is most likely fine (the real values are small), but sometimes...

Author: LXA, 2016-04-16

6 answers

Officially, the language standard says, if I am not mistaken, that overflow of signed integer types is undefined behavior (an exception may be thrown, or it may be silently ignored), while for unsigned types overflow is well-defined: the value wraps around to the representable range.

In other words, C/C++ has no built-in portable means of detecting overflow.

But you can always add checks that, given the operands, determine whether the operation would overflow. Many such techniques are described in Henry S. Warren's book Hacker's Delight (published in Russian as "Algorithmic Tricks for Programmers").
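As a small illustration in the spirit of those tricks (a minimal sketch of my own, not a quote from the book): for unsigned types, wraparound is well-defined, so overflow of an addition can be detected directly from the operands and the result:

#include <stdbool.h>

/* Unsigned addition wraps modulo 2^N, so the sum overflowed
   exactly when the result is smaller than either operand. */
static bool uadd_overflows(unsigned a, unsigned b)
{
    return a + b < a;
}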

Well, as an illustration of how easy it is to miss such an overflow: in Stroustrup's Programming: Principles and Practice... there is a program that computed a series for e^x, or something like that, and it gave wrong results. He honestly wrote that at first he thought this was due to loss of precision in double, and even said so in the first edition of the book, and only by the second edition realized that it was actually overflow of integer values when computing the factorial. Even Stroustrup... :)

 14
Author: Harry, 2016-04-17 05:36:30

The default behavior depends on the processor, compiler, and settings. On MIPS, for example, the "classic" compilers turned signed addition into the add instruction, which raises an overflow exception, and unsigned addition into addu, which does not. But gcc and clang stopped doing this and emit addu for everything. On x86, the native add and sub instructions ignore overflow (more precisely, they set the CF or OF flag but do not raise an exception).
Modern compilers assume that overflow simply must not happen in operations on signed integers, and this sometimes produces interesting effects - here is the most striking example of those I have seen. The standard here follows the principle of "we give you the ability to write maximally efficient code, because this is C, not a teaching language, and protecting yourself from mistakes is your problem". In contrast, for unsigned integers the behavior is defined as arithmetic modulo 2**N, where N is the number of bits in the value.

This difference arose because, originally, it was not known in general which representation of signed integers an implementation would use: sign-and-magnitude, one's complement, or two's complement (by now it is safe to say that nothing other than two's complement remains). Later, however, the distinction was actually exploited to enable tricky optimizations based on the assumption that overflow never occurs.
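For illustration, a minimal sketch (my own hypothetical example) of the kind of surprise this assumption produces - a naive post-check of signed overflow is simply optimized away:

#include <limits.h>

/* Because signed overflow is assumed never to happen, an optimizing
   compiler may fold `x + 1 > x` into the constant 1 and delete the
   "overflow" branch entirely. */
int is_increment_safe(int x)
{
    return x + 1 > x;   /* gcc/clang at -O2 typically return 1 here
                           even for x == INT_MAX */
}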

For C++ there is, for example, the SafeInt3 library (a template header file) from Microsoft, under a free license; it covers all the basic operations, although it is quite old, does not exploit processor specifics (not even x86), and is therefore inefficient in many cases - where a single check of OF or CF, or the overflow builtins from GCC, would suffice, it performs a lot of complex calculations.

For C, you will have to write everything manually, replacing the usual operations with checked equivalents in the form of functions or macros.

The new builtins of recent gcc and clang (__builtin_add_overflow, __builtin_mul_overflow and the whole family) have already been mentioned. Through them you can always get both the truncated (i.e. wrapped) value of the operation (as in 16 bits 30000 + 30000 -> -5536) and an overflow flag, and each can be obtained safely (no unexpected optimizations). But even without them you can manage reasonably well. For example, when adding two ints, a check of the form a > INT_MAX - b if b > 0, otherwise a < INT_MIN - b, is enough to establish whether the addition is feasible. The hardest part is checking multiplication without compiler support: the only universally working option is a check via division, which is very expensive. I believe such overflow builtins should be introduced into the standards (and, beyond the current set of add, mul, div, shl should be added as well).
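A minimal usage sketch of those builtins (my example; the functions themselves are documented by GCC and Clang):

#include <stdio.h>

int main(void)
{
    int sum;
    /* The wrapped result is written through the pointer even on
       overflow; the return value is the overflow flag. */
    if (__builtin_add_overflow(2000000000, 2000000000, &sum))
        printf("overflow, wrapped value: %d\n", sum);
    else
        printf("sum: %d\n", sum);
    return 0;
}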

 12
Author: Netch, 2020-12-20 08:27:26

Although link-only answers are not welcome, I found good material (including code for checking whether addition, subtraction, multiplication, or division of integers will overflow - more generally, whether it is safe):

https://www.securecoding.cert.org/confluence/display/c/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow

With discussions, etc.

Update

Since when searching for an answer you always want to see concrete code right away (rather than guessing, for example, how to correctly write the check for subtraction when you already know the correct one for addition), I will reproduce here some of the material from that link, which can serve as templates for your own programs.

Checking before addition:

#include <limits.h>

void f(signed int si_a, signed int si_b) {
  signed int sum;
  if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
      ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
    /* Handle error */
  } else {
    sum = si_a + si_b;
  }
  /* ... */
}

Checking before subtraction:

#include <limits.h>

void func(signed int si_a, signed int si_b) {
  signed int diff;
  if ((si_b > 0 && si_a < INT_MIN + si_b) ||
      (si_b < 0 && si_a > INT_MAX + si_b)) {
    /* Handle error */
  } else {
    diff = si_a - si_b;
  }

  /* ... */
}

Checking before multiplication:

#include <limits.h>

void func(signed int si_a, signed int si_b) {
  signed int result; 
  if (si_a > 0) {  /* si_a is positive */
    if (si_b > 0) {  /* si_a and si_b are positive */
      if (si_a > (INT_MAX / si_b)) {
        /* Handle error */
      }
    } else { /* si_a positive, si_b nonpositive */
      if (si_b < (INT_MIN / si_a)) {
        /* Handle error */
      }
    } /* si_a positive, si_b nonpositive */
  } else { /* si_a is nonpositive */
    if (si_b > 0) { /* si_a is nonpositive, si_b is positive */
      if (si_a < (INT_MIN / si_b)) {
        /* Handle error */
      }
    } else { /* si_a and si_b are nonpositive */
      if ( (si_a != 0) && (si_b < (INT_MAX / si_a))) {
        /* Handle error */
      }
    } /* End if si_a and si_b are nonpositive */
  } /* End if si_a is nonpositive */

  result = si_a * si_b;
}

Checking before division (or remainder calculation):

#include <limits.h>

void func(signed long s_a, signed long s_b) {
  signed long result;
  if ((s_b == 0) || ((s_a == LONG_MIN) && (s_b == -1))) {
    /* Handle error */
  } else {
    result = s_a / s_b;
  }
  /* ... */
}

Well, if someone is not too lazy to pull out the rest of the code relevant to the OP's question (along with a description of all the essential points) and lay it out here carefully - you are welcome.

 3
Author: avp, 2016-04-20 10:10:03

In Visual C++, you can use functions from Intsafe.h; for example, for multiplication:

#include <stdlib.h>
#include <stdio.h>
#include <windows.h>
#include <tchar.h>
#include <Intsafe.h>

int _tmain(int argc, _TCHAR* argv[])
{
    ULONGLONG  a=100000000000, b=5000000000, c;  /* the product exceeds the 64-bit range */

    HRESULT hr = ULongLongMult(a,b,&c);

    if(SUCCEEDED(hr)) printf("Result is %llu",c);
    else if(hr == INTSAFE_E_ARITHMETIC_OVERFLOW) printf("Overflow");        

    return 0;
}

These functions are defined as inline, and their implementation depends on the architecture. The ULongLongMult function:

  • On 64-bit architectures, it uses the compiler intrinsic _umul128, so it should be quite efficient.

  • On 32-bit architectures, it uses a special algorithm that splits each number into two 32-bit halves (with a and c the high halves and b and d the low halves, the result is computed by the formula a.b * c.d = (a*c*2^64) + (a*d*2^32) + (b*c*2^32) + (b*d)), and overflow is detected by checking certain bits of the intermediate results; a sketch of this idea follows below.
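As an illustration of that halves-based idea, here is a minimal sketch of an overflow-checked 64-bit unsigned multiplication (my own code, not Microsoft's actual implementation):

#include <stdint.h>
#include <stdbool.h>

/* Split a and b into 32-bit halves; any contribution to bit 64 and
   above means the product overflows. */
static bool umul64_overflow(uint64_t a, uint64_t b, uint64_t *res)
{
    uint64_t ah = a >> 32, al = a & 0xFFFFFFFFu;
    uint64_t bh = b >> 32, bl = b & 0xFFFFFFFFu;

    if (ah != 0 && bh != 0)              /* ah*bh lands at 2^64 and above */
        return true;

    uint64_t cross = ah * bl + al * bh;  /* at most one term is nonzero here */
    if (cross > 0xFFFFFFFFu)             /* it gets shifted left by 32 bits */
        return true;

    uint64_t low = al * bl;
    uint64_t sum = (cross << 32) + low;
    if (sum < low)                       /* carry out of the final addition */
        return true;

    *res = sum;
    return false;
}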

 1
Author: MSDN.WhiteKnight, 2018-03-14 07:02:21

What is wrong with these methods? They require computing essentially the same thing twice! In effect, you first compute the very fact that the addition or subtraction will go out of range... But you can do it differently, after the fact: in case of overflow, the escaped bits are discarded (otherwise there would be no overflow!). So it is clear that if, for example, the result should have decreased but instead increased after adding a nonzero value, or vice versa, then an overflow has occurred.

int a = 2147483645, b = 3, c;
c=a+b;

So what will c be equal to? The answer is -2147483648! Well, I picked the values deliberately... The point is that the result should be greater than a, but it is less. That is the criterion. Overflow can also occur if we try to invert the sign of the smallest negative number... and this does not even need to be computed, just checked!

int a = -2147483648, b;
b = -a;

What will b be equal to? Again -2147483648: in fact, the binary representation has not changed at all. But it should be positive!!! The sign-change operation is equivalent to two operations: inversion and +1, and the overflow happens when the 1 is added, because inverting this particular number gives the largest positive number, to which 1 can no longer be added. Formally, this is a carry out of the highest bit... That is the overflow criterion for sign change.

The advantage of this method is that if everything is fine, the result is already there and does not need to be computed a second time... With multiplication and division it is the same. If we multiply by a number whose absolute value is greater than 1 (for integers, that is everything except 0 and ±1), the result should be greater in magnitude than the first factor, not less... and with division, less. How to write the code for that you can work out yourself... This approach speeds up calculations, especially sums and products.

And one more criterion is desirable: when to check at all. The check should not be done for every calculation, but only when the computation runs near the boundaries of the representable range of the result. If it is far from the boundaries, the check is unnecessary. It is desirable to take this into account as well - but that is just not easy...
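A minimal sketch of this after-the-fact check (my code, not the author's; the arithmetic is deliberately done in unsigned, because signed overflow itself is undefined behavior in C and a naive signed post-check may be optimized away, as noted in the answers above):

#include <stdbool.h>

/* Add with well-defined unsigned wraparound, then compare the
   direction of the result with the sign of b: for b > 0 the sum
   must come out greater than a, for b < 0 - smaller. */
static bool add_wrapped(int a, int b, int *res)
{
    unsigned int ur = (unsigned int)a + (unsigned int)b;
    int r = (int)ur;   /* conversion is implementation-defined;
                          wraps on two's-complement machines in practice */
    *res = r;
    return (b > 0 && r < a) || (b < 0 && r > a);
}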

 0
Author: Анатолий, 2020-06-16 13:51:38

This is actually a problem in many compiled languages. Platform-independent methods mean a pile of code. It may be faster to compute in float/double and check the bounds before rounding. True, when every processor cycle counts this will not do, but then it is no sin to drop in an assembler insert: in any case, if you need to squeeze maximum performance out of the platform, you can forget about portability. Alternatively, if you do need to compute in integers, then:

  1. use the method suggested above by Anatoly;
  2. store intermediate results in a variable at least twice as wide as the data range requires (for example, with 32-bit data, perform intermediate calculations in int64_t). Then the check for going out of representable bounds before casting back to the "shorter" type becomes trivial (see the sketch after this list). But that is not all. If you need to preserve precision in the presence of intermediate divisions, you should not only put, say, a 16-bit number into a 32-bit variable, but also shift it, say, 8 bits to the left (and after all the calculations, back to the right), so as to leave "spare" bits both on the left (against overflow) and on the right (against loss of precision in divisions). And the number of these bits must be chosen so that over the entire range of input data nothing overflows or loses precision at any intermediate step. So there are plenty of "pitfalls" in integer arithmetic. Honestly, if resources allow, it is easier to compute in float.
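A minimal sketch of the widening approach for multiplication (my example, assuming 32-bit data widened to int64_t):

#include <stdint.h>
#include <stdbool.h>

/* Do the 32-bit multiplication in 64 bits, where it cannot overflow,
   then verify the result fits before narrowing back. */
static bool mul32_checked(int32_t a, int32_t b, int32_t *res)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    if (wide < INT32_MIN || wide > INT32_MAX)
        return true;               /* would not fit back into 32 bits */
    *res = (int32_t)wide;
    return false;
}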
 0
Author: vlatro, 2020-06-16 17:06:17