The behavior of integral types

As a programmer, you don’t need to know the whole ISO/ANSI C standard inside and out. It is possible to build a fairly complex C application without knowing all the details. However, the standard is a great help as a reference when you are digging into the less obvious parts of the language or exploring its corners.

When you read the standard, some points look perfectly clear on paper, but when you compile your code, it turns out to do everything except what you wanted! Some parts of the standard need to be analyzed before you can fully understand their behavior and take advantage of them.

A couple of examples of possibly confusing behavior are described below. They are far from what you would call the corners of C; similar expressions can be found in real application code, and sometimes they result in a support case. Most of the time, however, you can anticipate the behavior by analyzing what the standard really says.

Using an optimizing compiler

In many embedded applications, code size is extremely important and the choice of data types will have an impact on the final code size.

The limits.h standard header file defines the minimum and maximum values of the integral data types. According to the standard, compilers are free to use a different precision when actually evaluating expressions, provided that the result is the same. This means that with an optimizing compiler such as IAR C/C++ Compiler, operations will in most cases actually be performed using the lower precision.

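The minimal requirements for these limits are outlined in paragraph 5.2.4.2.1 of the standard. The freedom to evaluate expressions in a different precision follows from the rules for program execution in 5.1.2.3, where the examples listed are also worth reading.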
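As a rough illustration of the limits mentioned above, the following sketch prints a few values from limits.h next to the minimum magnitudes the standard requires. The function names print_integer_limits and limits_meet_minimums are hypothetical and chosen only for this example; a hosted compiler with stdio is assumed.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: print the implementation's actual limits next to
   the minimum magnitudes required by the standard. */
void print_integer_limits(void)
{
    printf("CHAR_BIT  = %d (standard minimum: 8)\n", CHAR_BIT);
    printf("SCHAR_MAX = %d (standard minimum: 127)\n", SCHAR_MAX);
    printf("UCHAR_MAX = %u (standard minimum: 255)\n", (unsigned)UCHAR_MAX);
    printf("INT_MAX   = %d (standard minimum: 32767)\n", INT_MAX);
    printf("UINT_MAX  = %u (standard minimum: 65535)\n", UINT_MAX);
}

/* Hypothetical helper: returns 1 if the limits meet the minimums. */
int limits_meet_minimums(void)
{
    return CHAR_BIT >= 8 && SCHAR_MAX >= 127 && UCHAR_MAX >= 255
        && INT_MAX >= 32767 && UINT_MAX >= 65535u;
}
```

On a target with a 16-bit int, INT_MAX would be exactly 32767; on a typical desktop compiler it is larger, which is why code that relies on a particular int width should check these macros rather than assume.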

Before digging into the behavior that may seem confusing, let’s just recap what the standard says about integral promotion and integer constant types.

Integral promotion

The C Standard 6.3.1.1: If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integral promotions. The integral promotions preserve the value, including the sign.

According to 6.3.1.1, a bool, a char, a signed or unsigned char or short int, or an enumeration type is converted to an int or an unsigned int by integral promotion when it is used in an expression. In most cases, this gives the results you would expect mathematically from the expression; for example:

char c1, c2, c3;
c1 = 0xA;
c2 = 0xB;
c3 = c1 + c2;

For a compiler with 16-bit int type, this is, by definition, performed as an int addition, followed by truncating the result to char: 0x0A + 0x0B --> 0x000A + 0x000B --> 0x0015 --> 0x15.

Note that for this example, the resulting value does fit into the destination type. When it does not, the situation becomes slightly more complex. With an optimizing compiler, operations will actually be performed in the lower precision in most similar cases.
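The promotion can be observed directly, since sizeof applied to the sum reports the size of the promoted type. The sketch below is only an illustration; the function names are hypothetical, and the truncation result assumes an 8-bit char.

```c
#include <stddef.h>

/* The operands of + are promoted before the addition, so the expression
   c1 + c2 has type int and this returns sizeof(int), not sizeof(char). */
size_t size_of_char_sum(void)
{
    char c1 = 0xA, c2 = 0xB;
    return sizeof(c1 + c2);
}

/* The example from the text: int addition, then conversion back to char. */
char add_chars(char c1, char c2)
{
    return (char)(c1 + c2);
}

/* The sum as it is actually computed: 200 + 100 gives 300 (0x012C). */
int add_wide(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Only the conversion to unsigned char reduces the result modulo 256:
   0x012C becomes 0x2C when char is 8 bits wide. */
unsigned char add_and_truncate(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}
```

The explicit casts are not required by the language; they are there to make the truncation visible and to keep compilers from warning about the narrowing conversion.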

Integer constant types

According to paragraph 6.4.4.1, the type of an integer constant is given by its suffix (none, u or U for unsigned, l or L for long, ll or LL for long long, or a combination of these) and its number base (decimal, octal, or hexadecimal). The type of an unsuffixed hexadecimal constant is the first of the following types in which its value can be represented: int, unsigned int, long int, unsigned long int, long long int, unsigned long long int. The interesting point here is that the smallest possible type is int, not char; for example:

char c1;
c1 = 0xA;

For a compiler with 16-bit int type, the assignment requires a truncation: 0xA becomes 0x000A and it must be truncated to fit in a char. Again, with an optimizing compiler, operations will actually be performed in the lower precision in most similar cases.
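One way to check the type of a constant is the C11 _Generic selection, which is newer than the C99 standard this text refers to, so treat the following as a sketch that assumes a C11-capable compiler. The IS_TYPE macro and the function names are hypothetical helpers, not part of any standard header.

```c
/* Hypothetical helper (requires C11): evaluates to 1 if expr has type T. */
#define IS_TYPE(expr, T) _Generic((expr), T: 1, default: 0)

/* An unsuffixed constant that fits in int has type int, never char. */
int unsuffixed_hex_is_int(void)
{
    return IS_TYPE(0xA, int);
}

/* The u suffix selects unsigned int for a value this small. */
int u_suffix_is_unsigned_int(void)
{
    return IS_TYPE(0xAu, unsigned int);
}

/* The L suffix selects long for a value this small. */
int l_suffix_is_long(void)
{
    return IS_TYPE(0xAL, long);
}

/* The object being assigned to is still a char; only the constant is int. */
int char_object_is_char(void)
{
    char c1 = 0xA;
    return IS_TYPE(c1, char);
}
```

Each function returns 1 on a conforming C11 compiler, regardless of whether int is 16 or 32 bits wide, because the value 0xA fits in every candidate type.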

Possibly confusing behavior

The rules for integer types and their conversions can lead to confusing behavior in some situations, for example in assignments and conditionals (test expressions) that involve types of different sizes, or in logical operations, especially bit negation. Note that constants have types too.

In some cases there might be warnings (e.g., "constant conditional", "pointless comparison"), in others just a different result than what you expected. Under certain circumstances a compiler may warn only at higher optimization levels, for example, if the compiler relies on optimizations to identify some instances of constant conditionals.

Example 1:

Assume 8-bit char, 16-bit int, 2's complement.

void f1(unsigned char c1)
{
    if (c1 == ~0x80)
        ;
}

Here the test is always false!

Explanation: On the right-hand side, 0x80 --> 0x0080, and ~0x0080 --> 0xFF7F, which is a negative int value (-129). On the left-hand side, c1 is an 8-bit unsigned char. Its value is in the range 0 to 255, so the integral-promoted value can never have any of the highest 8 bits set.
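The claim can be checked exhaustively, because an unsigned char has only a few hundred possible values. The sketch below uses hypothetical function names and assumes an 8-bit char; the result is the same with a 16-bit or a wider int, since ~0x80 is negative in both cases. The second function shows one possible fix, converting the constant to the type of c1 before comparing.

```c
#include <limits.h>

/* Counts how many unsigned char values pass the original test. */
int count_matches_original(void)
{
    int matches = 0;
    for (int v = 0; v <= UCHAR_MAX; v++) {
        unsigned char c1 = (unsigned char)v;
        if (c1 == ~0x80)                 /* 0..255 compared with -129 */
            matches++;
    }
    return matches;
}

/* Same loop, but the constant is converted to unsigned char first,
   so the right-hand side is 0x7F. */
int count_matches_with_cast(void)
{
    int matches = 0;
    for (int v = 0; v <= UCHAR_MAX; v++) {
        unsigned char c1 = (unsigned char)v;
        if (c1 == (unsigned char)~0x80)  /* 0..255 compared with 0x7F */
            matches++;
    }
    return matches;
}
```

The original test never matches, while the version with the cast matches exactly once, for c1 equal to 0x7F. Depending on the warning level, a compiler may flag the first comparison as always false.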

Example 2:

Assume 8-bit char, 16-bit int, 2's complement.

void f2(void)
{
    char c1;
    c1 = ~0x80;
    if (c1 == ~0x80)
        ;
}

In the assignment, the bit negation is performed on an int type object, that is, ~(0x0080) --> 0xFF7F. This value is then assigned to a char, that is, the char will have the (positive) value 0x7F. In the conditional, this is then integral-promoted to 0x007F, which is compared to 0xFF7F, and the test fails. If the plain char type is signed, and if the constant has the highest bit cleared (any value in the range 0x00-0x7F), the bit-negated value will be negative and the test might be successful (see Example 3).
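A small sketch of this case, with hypothetical function names and an assumed 8-bit char. The explicit cast in the assignment is an addition that only keeps compilers from warning about the narrowing conversion; for a signed plain char the converted value is implementation-defined, and 0x7F is what typical two's complement implementations produce.

```c
/* Mirrors f2: returns the result of the comparison (1 = equal). */
int example2_original(void)
{
    char c1 = (char)~0x80;     /* ~0x80 is -129 as an int; c1 is typically 0x7F */
    return c1 == ~0x80;        /* promoted c1 compared with -129: never equal */
}

/* One possible fix: convert the constant to char on both sides. */
int example2_with_cast(void)
{
    char c1 = (char)~0x80;
    return c1 == (char)~0x80;  /* the same conversion on both sides: equal */
}
```

The first function returns 0 because no char value can equal -129 after promotion; the second returns 1 because both operands go through the same conversion.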

Example 3:

Assume 8-bit char, 16-bit int, 2's complement.

void f3(void)
{
    signed char c1;
    c1 = ~0x70;
    if (c1 == ~0x70)
        ;
}

In the assignment, the bit negation is performed on an int type object, that is, ~(0x0070) --> 0xFF8F. This value is then assigned to a signed char, that is, the signed char will have the value 0x8F (a negative value). In the conditional, this is then integral-promoted (sign-extended) to 0xFF8F, which matches the right-hand side, and the test succeeds.
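The difference that signedness makes can be sketched as follows. The function names are hypothetical and an 8-bit char is assumed. The value ~0x70 is -113 as an int, which fits in a signed char, so the first assignment loses no information.

```c
/* Mirrors f3: the value -113 fits in a signed char. */
int example3_signed(void)
{
    signed char c1 = ~0x70;    /* c1 holds -113 (bit pattern 0x8F) */
    return c1 == ~0x70;        /* sign-extended -113 compared with -113 */
}

/* The same test with unsigned char: the promoted value is zero-extended. */
int example3_unsigned(void)
{
    unsigned char c1 = (unsigned char)~0x70;  /* c1 holds 0x8F = 143 */
    return c1 == ~0x70;                       /* 143 compared with -113 */
}
```

The signed variant returns 1 and the unsigned variant returns 0, although both objects hold the same bit pattern 0x8F.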

Example 4:

Assume 8-bit char, 16-bit int, 2's complement.

void f4(void)
{
    signed char c1;
    signed int i1;
    i1 = 0xFF;
    c1 = 0xFF;
    if (c1 == i1)
        ;
}

In the first assignment, i1 becomes 255. In the second assignment, the value 255 does not fit into the destination type signed char; the result of the conversion is implementation-defined and is typically -1. Thus you should not rely on this test to succeed. Note that while this problem is fairly obvious when c1 is explicitly declared as signed char, it is more difficult to detect if plain char is used and plain char is signed by default.
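The following sketch reproduces the comparison and shows one way to make it reliable, namely storing the value in an unsigned char so that 255 is representable. The function names are hypothetical, an 8-bit char is assumed, and the explicit cast in the first function is an addition that only silences the conversion warning many compilers issue for this assignment.

```c
/* Mirrors f4: 255 does not fit in a signed char. */
int example4_original(void)
{
    signed int i1 = 0xFF;                 /* 255 */
    signed char c1 = (signed char)0xFF;   /* implementation-defined, typically -1 */
    return c1 == i1;                      /* at most 127 compared with 255 */
}

/* With unsigned char the value 255 is representable, so nothing is lost. */
int example4_unsigned(void)
{
    signed int i1 = 0xFF;                 /* 255 */
    unsigned char c1 = 0xFF;              /* 255 */
    return c1 == i1;                      /* 255 compared with 255 */
}
```

The first function returns 0 with an 8-bit signed char, whatever value the conversion produces, because a signed char cannot hold 255. The second returns 1.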

Reference

The C standard, Incorporating Technical Corrigendum 1, BS ISO/IEC 9899:1999

© IAR Systems 1995-2016 - All rights reserved.