Integral types and possibly confusing behavior
Technical Note 12582
Architectures:
All
Component:
general
Updated:
2015-11-06 13:52
Introduction
There are some cases where ISO/ANSI standard C gives possibly confusing behavior. This text gives a few examples and some technical background, including references to the standard.
Integral Promotion
According to 6.2.1.1, "A char, a short int or an int bit-field, or their signed and unsigned varieties, or an enumeration type" is converted to an int or an unsigned int by integral promotion when they are used in expressions. In most cases this gives the results one would expect mathematically from the expression; for example:
char c1, c2, c3;
c1 = 0xA;
c2 = 0xB;
c3 = c1 + c2;
For a compiler with 16-bit int this is, by definition, performed as int addition followed by truncation of the result to char: 0x0A + 0x0B --> 0x000A + 0x000B --> 0x0015 --> 0x15.
Note that for this example the resulting value does fit into the destination type. When it does not, the situation becomes slightly more complex. With an optimizing compiler, operations will actually be performed in the lower precision in most similar cases (for situations when this does not affect the results, see Size optimizations below).
Integer Constant types
According to 6.1.3.2, for an unsuffixed integer constant the type is given by the actual value and what number base it is written in. For a hexadecimal constant, the type is the first that can represent its value in the following list: int, unsigned int, long int, unsigned long int. The interesting point here is that the smallest type is int, not char; for example:
char c1;
c1 = 0xA;
For a compiler with 16-bit ints the assignment requires a truncation: 0xA is really 0x000A and it must be truncated to fit in a char. Again, with an optimizing compiler, operations will actually be performed in the lower precision in most similar cases (for situations when this does not affect the results, see Size optimizations below).
Size optimizations
As long as the result is the same, compilers are free to use other precision when actually evaluating expressions. Please compare with section 5.1.2.3, notably the list of points after the text "The least requirements on a conforming implementation are:" where the minimal requirements are outlined. The examples listed are also of interest.
Possibly confusing behavior
There are situations when the rules for integer types and their conversions lead to possibly confusing behavior. Things to look out for are assignments and/or conditionals (test expressions) involving types with different size and/or logical operations, especially bit negation. Types here include also types of constants.
In some cases there may be warnings (e.g., "constant conditional", "pointless comparison"), in others just a different result than what is expected. Under certain circumstances a compiler may warn only at higher optimizations, e.g., if the compiler relies on optimizations to identify some instances of constant conditionals.
Example 1:
Assume 8-bit char, 16-bit int, 2's complement.
void f1(unsigned char c1)
{
if (c1 == ~0x80)
;
}
Here the test is always false!
Explanation: On the right-hand side, 0x80 really is 0x0080, and ~0x0080 becomes 0xFF7F. On the left-hand side, c1 is an 8-bit unsigned char, so it cannot be larger than 255 and cannot be negative; the integral-promoted value can therefore never have the top 8 bits set.
Example 2:
Assume 8-bit char, 16-bit int, 2's complement.
void f2(void)
{
char c1;
c1= ~0x80;
if (c1 == ~0x80)
;
}
In the assignment, the bit negation is again performed on an object of type int, i.e. ~(0x0080) --> 0xFF7F. This value is then assigned to a char, so the char gets the (positive) value 0x7F. In the conditional, this is integral promoted to 0x007F and compared with 0xFF7F - the test fails. If plain char is signed, and if the constant has its top bit cleared (any value in the range 0x00-0x7F), the bit-negated value will be negative and the test may succeed (see Example 3 below).
Example 3:
Assume 8-bit char, 16-bit int, 2's complement.
void f3(void)
{
signed char c1;
c1= ~0x70;
if (c1 == ~0x70)
;
}
In the assignment, the bit negation is again performed on an object of type int, i.e. ~(0x0070) --> 0xFF8F. This value is then assigned to a signed char, which gets the bit pattern 0x8F (the value -113 on a 2's complement machine). In the conditional, this is integral promoted (sign extended) to 0xFF8F, which matches, and the test succeeds.
Example 4:
Assume 8-bit char, 16-bit int, 2's complement.
void f4(void)
{
signed char c1;
signed int i1;
i1= 0xFF;
c1= 0xFF;
if (c1 == i1)
;
}
In the first assignment i1 becomes 255. In the second assignment, 255 does not fit into the destination type signed char, so the value stored in c1 is implementation-defined (typically -1 on a 2's complement machine). Thus you should not rely on this test succeeding. Note that while the problem is fairly obvious with an explicitly signed char for c1, it is harder to detect if plain char is used and happens to be signed.
Example 5:
void CompilerTest(void)
{
int64_t a,b;
int16_t c,d,e;
c=0x789F;
d=0x7FFF;
e=0x7777;
a=c*d*e; //Wrong result: 0xFFFFFFFFC52A8517
b=c*d;
b=b*e; //Correct result: 0x00001C24C52A8517
}
After integer promotion the expression '(c*d)*e' becomes '((int) c * (int) d) * (int) e'. The result of this expression is implicitly converted to 'long long' before it is stored in 'a'.
One way to force 'long long' operations is to explicitly convert one of the inner operands, for example 'd', as in '(c*(long long)d)*e'. This forces the inner operation to be performed in 'long long', producing a result in 'long long'. The result is an operand to the outer operation, and therefore forces the outer operation to be performed in 'long long'.
All product names are trademarks or registered trademarks of their respective owners.