Module 2. Basic components of C programming language
Lesson 5
DATA TYPES IN C
5.1 Introduction
C uses the concept of data types to define a variable before its use in a programme. The definition of a variable assigns storage for the variable and defines the type of data that will be held in that memory location. Hence, the data type determines: the amount of storage allocated to the variable; the values that the variable can accept; and the operations that can be performed on it. C data types can be broadly classified as:
· Primary data type
· Derived data type
· User-defined data type
5.2 Primary Data Types
All C compilers accept the following basic data types:
Table 5.1 Basic data types supported by C
Serial Number | Data Type | Keyword | Range of Values |
1. | Integer | int | -32768 to +32767 |
2. | Character | char | -128 to 127 |
3. | Floating-point | float | 3.4e-38 to 3.4e+38 |
4. | Double precision floating-point | double | 1.7e-308 to 1.7e+308 |
5. | Void | void | - |
Integer types
Integers are whole numbers with a machine-dependent range of values. C has three classes of integer storage, namely, short int, int and long int. All of these data types have signed and unsigned forms. A short int may occupy less storage than an int, although on some machines the two are the same size. Unsigned numbers are never negative and use all of their bits for the magnitude of the number, so an unsigned type can hold positive values roughly twice as large as its signed counterpart. The long and unsigned forms are used when a larger range of values is needed.
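Because these ranges are machine dependent, it can help to ask the compiler what they are on the machine at hand. The following is a minimal sketch that relies on the standard header <limits.h>; the function names print_integer_limits and wrap_after_max are hypothetical and were chosen only for this illustration.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: prints the limits that apply on this machine. */
void print_integer_limits(void)
{
    printf("short int    : %d to %d\n", SHRT_MIN, SHRT_MAX);
    printf("int          : %d to %d\n", INT_MIN, INT_MAX);
    printf("long int     : %ld to %ld\n", LONG_MIN, LONG_MAX);
    printf("unsigned int : 0 to %u\n", UINT_MAX);
}

/* Hypothetical helper: adding 1 to the largest unsigned value
   wraps around to 0, because all bits hold the magnitude. */
unsigned int wrap_after_max(void)
{
    unsigned int u = UINT_MAX;
    return u + 1u;
}
```

The printed values will differ from one machine to another, which is exactly the point; only the wrap-around result of wrap_after_max (zero) is the same everywhere.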
Floating-point types
A floating-point number represents a real number, typically with about six decimal digits of precision. Floating-point numbers are declared with the keyword float. When the accuracy of a float is insufficient, the type double is used instead. A double is the same as a float but with greater precision. To extend the precision further, use long double, which occupies 80 bits of memory on many machines.
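The difference in precision between float and double can be observed directly. The sketch below is illustrative only: the function names are hypothetical, the digit counts come from the standard header <float.h>, and the specific behaviour for 16777217 assumes the common IEEE 754 formats (a 24-bit significand for float).

```c
#include <float.h>

/* Hypothetical helper: returns 1 if value survives a round trip
   through float unchanged, and 0 if precision was lost. */
int fits_in_float(double value)
{
    volatile float narrowed = (float)value;  /* volatile forces a real store */
    return (double)narrowed == value;
}

/* Number of decimal digits each type is guaranteed to preserve. */
int float_digits(void)  { return FLT_DIG; }
int double_digits(void) { return DBL_DIG; }
```

Under the stated assumption, fits_in_float(0.5) returns 1, whereas fits_in_float(16777217.0) returns 0 because that integer needs 25 bits and is rounded when stored in a float.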
Void type
The void data type has no values. It is mainly used to specify the type of a function: a function that does not return any value to the calling function is declared with the type void.
Character type
A single character can be defined as a character type of data. Characters are usually stored in 8 bits of internal storage. The qualifier signed or unsigned can be explicitly applied to char. While unsigned characters have values between 0 and 255, signed characters have values from -128 to 127. The size and range of data types on a 16-bit machine are given in Table 5.2 below:
Table 5.2 Size and range of data types on a 16-bit machine
Type | Size (Bits) | Range |
char or signed char | 8 | -128 to 127 |
unsigned char | 8 | 0 to 255 |
int or signed int | 16 | -32768 to 32767 |
unsigned int | 16 | 0 to 65535 |
short int or signed short int | 16 | -32768 to 32767 |
unsigned short int | 16 | 0 to 65535 |
long int or signed long int | 32 | -2147483648 to 2147483647 |
unsigned long int | 32 | 0 to 4294967295 |
float | 32 | 3.4e-38 to 3.4e+38 |
double | 64 | 1.7e-308 to 1.7e+308 |
long double | 80 | 3.4e-4932 to 3.4e+4932 |
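The sizes listed for a 16-bit machine will not match every computer. As a hedged sketch, the sizeof operator together with CHAR_BIT from <limits.h> can report the sizes that actually apply; bits_in and print_type_sizes are hypothetical names used only for this illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: converts a size in bytes (as given by sizeof)
   into a size in bits. */
size_t bits_in(size_t size_in_bytes)
{
    return size_in_bytes * CHAR_BIT;
}

/* Hypothetical helper: prints the size in bits of several types
   on the current machine. */
void print_type_sizes(void)
{
    printf("char        : %lu bits\n", (unsigned long)bits_in(sizeof(char)));
    printf("short int   : %lu bits\n", (unsigned long)bits_in(sizeof(short)));
    printf("int         : %lu bits\n", (unsigned long)bits_in(sizeof(int)));
    printf("long int    : %lu bits\n", (unsigned long)bits_in(sizeof(long)));
    printf("float       : %lu bits\n", (unsigned long)bits_in(sizeof(float)));
    printf("double      : %lu bits\n", (unsigned long)bits_in(sizeof(double)));
    printf("long double : %lu bits\n", (unsigned long)bits_in(sizeof(long double)));
}
```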
5.3 Declaration of Variables
Every variable used in the programme must be declared to the compiler before it is used. The declaration serves two purposes:
a) It tells the compiler the names of the variables.
b) It specifies the type of data the variables will hold.
The general format of any declaration is:
datatype variable_1, variable_2, ..., variable_n;
where variable_1, variable_2, etc., are variable names. Variables are separated by commas. A declaration statement must end with a semicolon.
Examples
int sum;
int number, salary;
double average, mean;
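To show declarations at work, here is a small sketch in which the variables are declared first and used afterwards. The function name average_of and its parameters are hypothetical.

```c
/* Hypothetical helper: declares its variables before using them. */
double average_of(int first, int second)
{
    int sum;            /* a variable of type int */
    double average;     /* a variable of type double */

    sum = first + second;
    average = sum / 2.0;    /* 2.0 makes this a floating-point division */
    return average;
}
```

Had average been declared as int, the fractional part of the result would have been lost, which is why the declared type matters.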
5.4 Programmer-defined Type Declaration
In C, the typedef feature allows programmers to define new data type names that are equivalent to existing data types [see Table 5.3]. Once a programmer-defined data type is defined, new identifiers such as variables, arrays, etc., can be declared in terms of the newly defined data type. The general syntax is:
typedef type user-defined-type;
where type represents an existing data type (either a standard data type or an earlier programmer-defined data type) and user-defined-type is the new programmer-defined name given to the data type.
Examples
typedef int age;
typedef float marks;
Here, age is a programmer-defined data type, which is equivalent to
integer data type. Hence, the variable declaration
age male, female;
is equivalent to writing
int male, female;
That is, the variables, male and female are regarded as variables of type age, though these are actually variables of integer type.
Table 5.3 The standard data types in C
Data Type | Description | Typical Memory Requirements |
int | Integer quantity | 2 bytes or 1 word (varies from one computer to another) |
short | Short integer quantity (may contain fewer digits than int) | 2 bytes or 1 word (varies from one computer to another) |
long | Long integer quantity (may contain more digits than int) | 1 or 2 words (varies from one computer to another) |
unsigned | Unsigned (non-negative) integer quantity (maximum permissible quantity is approximately twice as large as int) | 2 bytes or 1 word (varies from one computer to another) |
char | Single character | 1 byte |
signed char | Single character, with numerical values ranging from -128 to 127 | 1 byte |
unsigned char | Single character, with numerical values ranging from 0 to 255 | 1 byte |
float | Floating-point number (i.e., a number containing a decimal point and/or an exponent) | 1 word |
double | Double-precision floating-point number (i.e., more significant figures, and an exponent that may be larger in magnitude) | 2 words |
long double | Double-precision floating-point number (may be higher precision than double) | 2 or more words (varies from one computer to another) |
void | Special data type for functions that do not return any value | (not applicable) |
enum | Enumeration constant (special type of int) | 2 bytes or 1 word (varies from one computer to another) |
Note: The qualifier unsigned may appear with short int or long int, e.g., unsigned short int (or unsigned short), or unsigned long int (or unsigned long).
Similarly, the following declarations:
typedef float height[30];
height boys, girls;
define height as a 30-element floating-point array type. Hence, boys and girls are 30-element floating-point arrays (arrays are discussed later). The typedef feature is particularly convenient when defining structures, as it avoids the need to write the struct tag repeatedly whenever a structure is referenced. This topic is discussed further in the last module.
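The typedef declarations discussed in this section can be tried out as follows. This is only a sketch: the structure type student and the helper functions are hypothetical names introduced for illustration, and structures themselves are covered in a later module.

```c
#include <stddef.h>

typedef int age;              /* age is another name for int */
typedef float height[30];     /* height is a 30-element float array type */

/* Hypothetical structure type: thanks to typedef it can be written
   as plain "student" rather than "struct student". */
typedef struct {
    age   years;
    float marks;
} student;

/* Number of elements in any variable declared with type height. */
size_t height_elements(void)
{
    return sizeof(height) / sizeof(float);
}

/* Builds a student value from its two members. */
student make_student(age years, float marks)
{
    student s;
    s.years = years;
    s.marks = marks;
    return s;
}

/* Returns the age of the older of two students. */
age older_of(student a, student b)
{
    return (a.years > b.years) ? a.years : b.years;
}
```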
5.5 Symbolic Constants
Symbolic constants are names for a sequence of characters. The characters may represent a numeric constant, a character constant or a string constant. Thus, a symbolic constant allows a name to appear in place of a numeric, character or string constant. When the programme is compiled, each occurrence of a symbolic constant is replaced by its corresponding character sequence. Symbolic constants are usually defined at the beginning of a programme, e.g., the following statements:
#define ANGLE_MIN 0
#define ANGLE_MAX 360
#define PI 3.141593
#define TRUE 1
#define FALSE 0
would define the symbolic constants ANGLE_MIN, ANGLE_MAX, PI, TRUE and FALSE with the values 0, 360, 3.141593, 1 and 0, respectively. Recall that C distinguishes between lowercase and uppercase letters in names. By tradition, global constants (like symbolic constants) are written in capital letters. Note that, unlike C statements, symbolic constant definitions do not end with a semicolon; if one were written, it would become part of the replacement text.
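A short sketch of how these symbolic constants might be used inside functions is given below. The function names angle_is_valid and degrees_to_radians are hypothetical; only the #define lines come from the text.

```c
#define ANGLE_MIN 0
#define ANGLE_MAX 360
#define PI 3.141593
#define TRUE 1
#define FALSE 0

/* Hypothetical helper: returns TRUE if angle lies within
   ANGLE_MIN..ANGLE_MAX (inclusive), otherwise FALSE. */
int angle_is_valid(int angle)
{
    if (angle >= ANGLE_MIN && angle <= ANGLE_MAX)
        return TRUE;
    return FALSE;
}

/* Hypothetical helper: converts degrees to radians using PI. */
double degrees_to_radians(double degrees)
{
    return degrees * PI / 180.0;
}
```

Before compilation proper, every occurrence of ANGLE_MIN, ANGLE_MAX, PI, TRUE and FALSE in these functions is replaced by the corresponding text, so the compiler sees only the numbers.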
Important Note: The following Sections 5.6 and 5.7
deal with the advanced topics on Enumerations and Macros. Therefore, the
students may better comprehend these advanced topics after completing other
lessons of this module.
5.6 Enumerations
An enumeration is a data type, like a structure or a union (to be discussed later). It consists of a set of named values that represent integral constants, known as enumeration constants. An enumeration is also referred to as an enumerated type because you must list (enumerate) every value, giving a name to each of them. In addition to providing a way of defining and grouping sets of integral constants, enumerations are useful for variables that have a small number of possible values. You can declare an enumeration type separately from the definition of variables.
Enumeration type definition
Generally, an enumeration type definition begins with the enum keyword followed by an optional identifier (the enumeration tag) and a brace-enclosed list of enumerators. A comma separates each enumerator in the enumerator list.
enum tag {member 1, member 2, ..., member m};
where enum is the required keyword; tag is a name that identifies enumerations having this composition; and member 1, member 2, ..., member m represent the individual identifiers that may be assigned to variables of this type. These member names must be unique, and they must be distinct from other identifiers whose scope is the same as that of the enumeration.
Enumeration variable declaration
Once the enumeration is defined, corresponding enumeration variables can be declared as follows:
storage-class enum tag variable 1, variable 2, ..., variable n;
where storage-class is an optional storage class specifier, enum is the required keyword, tag is the name that appeared in the enumeration definition, and variable 1, variable 2, ..., variable n are enumeration variables of the type tag.
The enumeration definition can be combined with the variable declarations, as follows:
storage-class enum tag {member 1, member 2, ..., member m} variable 1, variable 2, ..., variable n;
The tag is optional in this situation.
An illustration: Consider the following statements as a part of a C programme:
enum colours {black, blue, cyan, green, magenta, red, white, yellow};
enum colours foreground, background;
Note that the first statement defines an enumeration named colours (i.e., the tag is colours). The enumeration consists of eight constants whose names are black, blue, cyan, green, magenta, red, white and yellow. The second statement declares the variables foreground and background to be enumeration variables of type colours. Thus, each variable can be assigned any one of the constants black, blue, ..., yellow.
The two declarations can be combined as follows:
enum colours {black, blue, cyan, green, magenta, red, white, yellow} foreground, background;
or without the tag, simply:
enum {black, blue, cyan, green, magenta, red, white, yellow} foreground, background;
Enumeration constants are
automatically assigned equivalent integer values, beginning with 0 for the
first constant and with each successive constant increasing by 1. Thus, member 1 will automatically be assigned the
value 0; member 2 will be assigned 1, and so on.
Example 1: Consider the following code
demonstrating the use of enumeration constants. (For detailed description about
the printf statement, see Lesson-5).
#include <stdio.h>
int main(void)
{
    /* north = 0, east = 1, south = 2, west = 3 */
    enum compass_direction {north, east, south, west};
    enum compass_direction my_direction;
    my_direction = east;
    printf("%d\n", my_direction);   /* prints the integer value of east */
    return 0;
}
Output : 1
It is interesting to note that the aforementioned automatic assignments can be overridden within the definition of the enumeration. That is, some of the constants can be assigned explicit integer values, which differ from the default values. To do so, each constant (i.e., each member) that is to receive an explicit value is written as an ordinary assignment expression, member = int, where int represents a signed integer quantity. Those constants that are not assigned explicit values are automatically assigned values that increase successively by 1 from the last explicit assignment. This may cause two or more enumeration constants to have the same integer value.
Example 2: Consider the following code
demonstrating the use of enumeration constants with implicit and explicit
assignments.
#include <stdio.h>
int main(void)
{
    enum colour {black=-1, blue, cyan, green, magenta, red=2, white, yellow};
    enum colour background;   /* a variable of the enumeration type colour */
    background = green;
    printf("%d\n", background);
    return 0;
}
Output : 2
The constants black and red are now assigned the explicit values -1 and 2, respectively. The remaining enumeration constants are automatically assigned values that increase successively by 1 from the last explicit assignment. Thus, blue, cyan, green and magenta are assigned the values 0, 1, 2 and 3, respectively. Similarly, white and yellow are assigned the values 3 and 4, respectively. Note that there are now duplicate assignments, i.e., green and red both represent 2, whereas magenta and white both represent 3.
Enumeration variables can be processed in the same manner as other integer variables. Thus, they can be assigned new values, compared, etc. However, it should be understood that enumeration variables are generally used internally, to indicate various conditions that can arise within a programme. Hence, there are certain restrictions associated with their use. In particular, the name of an enumeration constant cannot be read into the computer and assigned to an enumeration variable. (It is possible to enter an integer and assign it to an enumeration variable, though this is generally not done.) Moreover, only the integer value of an enumeration variable can be written out of the computer.
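Since only the integer value of an enumeration variable can be written out, a programme that wants to display the name has to translate it itself. One possible way to do so is sketched below with a switch statement; the helper names direction_name and is_valid_direction are hypothetical, and the enumeration is the one used in Example 1.

```c
enum compass_direction {north, east, south, west};

/* Hypothetical helper: returns a printable name for a direction. */
const char *direction_name(enum compass_direction d)
{
    switch (d) {
    case north: return "north";
    case east:  return "east";
    case south: return "south";
    case west:  return "west";
    }
    return "unknown";   /* value outside the enumerated set */
}

/* Hypothetical helper: checks an integer (e.g., one typed in by the
   user) before it is assigned to an enumeration variable.
   Returns 1 if the value names a direction, 0 otherwise. */
int is_valid_direction(int value)
{
    return value >= north && value <= west;
}
```

A call such as printf("%s\n", direction_name(east)); would then display the name rather than the number.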
5.7 Macros
As discussed earlier, the #define statement is used to define symbolic constants within a C programme. All symbolic constants are replaced by their equivalent text at the beginning of the compilation process. Thus, symbolic constants provide a shorthand notation that simplifies the organisation of a programme. In addition, the #define statement can be used to define macros. A macro is a single identifier that is equivalent to an expression, a complete statement or a group of statements; i.e., a fragment of code which has been given a name. Whenever this name is used within the programme, it is replaced by the contents of the macro. In this sense, macros resemble functions; however, they are defined in an entirely different manner from functions.
You may define any valid identifier as a macro, even if it is a C keyword, because the pre-processor does not know about keywords. This can be useful if you wish to hide a keyword such as const from an older compiler that does not recognise it. However, the pre-processor operator defined can never be defined as a macro. Macro expansion adds a little work to the compiling process; in return, a macro avoids the overhead of a function call (passing arguments and returning a value), so the compiled programme may run slightly faster, although modern compilers often remove this overhead for small functions as well.
The formal syntax of a macro is:
#define name(dummy1[,dummy2][,...]) token string
The symbols dummy1, dummy2, ... are called dummy arguments (the square brackets indicate
optional items).
Example 1:
Consider the following simple example
of a macro (Students should never emulate this in any real project):
#define SquareOf(x) x*x
It defines a kind of function, which,
used in an actual piece of code, looks exactly like any other function call:
double y_out, x_in=3;
y_out = SquareOf(x_in);
As you would see subsequently, the
problem is that the macro SquareOf only pretends to be a function call, while it is absolutely different.
There are a few additional rules such
as that the macro can extend over several lines, provided one uses a backslash
to indicate line continuation:
#define ThirdPowerOf(dummy_argument) \
dummy_argument \
*dummy_argument \
*dummy_argument
Of course, you should break the line
at a reasonable position; and not, for example, in the middle of a symbol.
How does a compiler handle a macro?
What makes a macro different from a standard function is primarily the fact that a macro is a directive for the compiler (more precisely, for its pre-processor stage) rather than a piece of run-time code; therefore, it is dealt with at compilation time rather than at run time. When the compiler encounters a use of a previously defined macro, it first isolates its actual arguments, handling them as plain text strings separated by commas. Then it parses the token string (i.e., divides it into its component tokens), isolates all occurrences of each dummy-argument symbol and replaces each one by the corresponding actual-argument string. The whole process consists entirely of mechanical text substitution, with almost no semantic (logical) checking!
The compiler then substitutes the modified token string for the original macro call and compiles the resulting code. It is only in that phase that compilation errors can occur. When they do, the result is often either amusing or frustrating, depending upon how you feel at that moment: you may get mysterious-looking error messages that result from the modified text and thus refer to something you have never written!
This is explained with the help of
the following small programme, which is formally correct and compiles without
any problem:
#include <stdio.h>
#define SquareOf(x) x*x
int main(void)
{
    int x_in=3;
    printf("\nx_in=%i",x_in);
    printf("\nSquareOf(x_in)=%i",SquareOf(x_in));
    printf("\nSquareOf(x_in+4)=%i",SquareOf(x_in+4));
    printf("\nSquareOf(x_in+x_in)=%i",SquareOf(x_in+x_in));
    return 0;
}
Naturally, you would expect the
output of this programme as:
x_in=3
SquareOf(x_in)=9
SquareOf(x_in+4)=49
SquareOf(x_in+x_in)=36
However, what you actually get is:
x_in=3
SquareOf(x_in)=9
SquareOf(x_in+4)=19
SquareOf(x_in+x_in)=15
Let us see what happened. When the compiler encountered the string SquareOf(x_in+4), it replaced it with the token string x*x and then replaced each occurrence of the dummy argument x with the actual-argument string x_in+4, obtaining the final string x_in+4*x_in+4. Because multiplication takes precedence over addition, this evaluates as x_in+(4*x_in)+4 = 3+12+4 = 19 and not the expected value of 49. Similarly, SquareOf(x_in+x_in) becomes x_in+x_in*x_in+x_in, which evaluates to 3+9+3 = 15 rather than the expected 36.
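The substitution just described can be traced in code. In the sketch below, the macro is the unsafe definition from the text, while the two wrapper functions are hypothetical names added so that each expansion can be examined separately; the comments show the text the compiler actually sees.

```c
#define SquareOf(x) x*x   /* the unsafe definition from the text */

/* SquareOf(x_in+4) expands to x_in+4*x_in+4,
   which groups as x_in + (4*x_in) + 4. */
int unsafe_square_plus_four(int x_in)
{
    return SquareOf(x_in+4);
}

/* SquareOf(x_in+x_in) expands to x_in+x_in*x_in+x_in,
   which groups as x_in + (x_in*x_in) + x_in. */
int unsafe_square_of_double(int x_in)
{
    return SquareOf(x_in+x_in);
}
```

With x_in equal to 3, the first function returns 19 and the second returns 15, matching the output shown above.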
The problem would have never happened
if SquareOf(x) were a normal function. In that
case, the argument x_in+4 would be first evaluated as a
self-standing expression and only then would the result be passed to the
function SquareOf for the evaluation of the square.
Actually, both ways are correct! They are just two different recipes for handling the respective scripts. However, given the formal similarity between a function-like macro call and a standard function call, the discrepancy is dangerous and should be removed. Luckily, there is a simple remedy, i.e., replace the original definition of the SquareOf macro by
#define SquareOf(x) (x)*(x)
The problem vanishes because, for example, the macro-call string "SquareOf(x_in+4)" is transformed into "(x)*(x)" and then into "(x_in+4)*(x_in+4)", which evaluates exactly as intended. It is safer still to enclose the whole body in parentheses as well, i.e., ((x)*(x)), so that operators adjacent to the macro call cannot regroup the expansion; for example, 100/SquareOf(5) expands to 100/(5)*(5), which evaluates to 100 rather than 4.
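The effect of parentheses can be compared side by side with an ordinary function. The following is a sketch for illustration only: SQUARE_ARGS repeats the definition given in the text under a different, hypothetical name, SQUARE_FULL additionally wraps the whole body in parentheses, and square_fn and the wrapper functions are likewise hypothetical.

```c
#define SQUARE_ARGS(x)  (x)*(x)      /* arguments parenthesised, as in the text */
#define SQUARE_FULL(x)  ((x)*(x))    /* whole body parenthesised as well */

/* An ordinary function: its argument is evaluated before the call. */
static int square_fn(int x)
{
    return x * x;
}

/* (x_in + 4)*(x_in + 4): parenthesised arguments give the intended square. */
int args_only_sum(int x_in)      { return SQUARE_ARGS(x_in + 4); }

/* The function gives the same result as the parenthesised macro. */
int function_sum(int x_in)       { return square_fn(x_in + 4); }

/* n/(d)*(d): the division is applied before the multiplication. */
int args_only_quotient(int n, int d) { return n / SQUARE_ARGS(d); }

/* n/((d)*(d)): the outer parentheses keep the square together. */
int full_quotient(int n, int d)      { return n / SQUARE_FULL(d); }
```

For example, args_only_sum(3) and function_sum(3) both give 49, but args_only_quotient(100, 5) gives 100 whereas full_quotient(100, 5) gives 4.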
On the basis of the foregoing discussion about macros, students are advised to keep in mind the following rules while using macros in their programmes (for a detailed description of the do...while statement, see Module-III):
Rule 1: Always write multi-line macros using the following pattern:
#define name \
do { \
macro definition here \
} while (0)
Rule 2: Always surround macro
arguments with parentheses inside the macro body.
Rule 3: Keep your macros as short as
possible.
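The reason behind Rule 1 becomes visible when a multi-line macro is used in an if...else statement without braces. The sketch below uses a hypothetical macro RESET_BOTH and a hypothetical function reset_if_negative; it is one possible illustration, not the only way to write such a macro.

```c
/* Hypothetical two-statement macro written with the do { } while (0)
   pattern, so that "RESET_BOTH(a, b);" behaves as a single statement. */
#define RESET_BOTH(a, b) \
    do {                 \
        (a) = 0;         \
        (b) = 0;         \
    } while (0)

/* Returns 0 for negative input (both variables reset), otherwise 2. */
int reset_if_negative(int value)
{
    int a = value;
    int b = value;

    if (value < 0)
        RESET_BOTH(a, b);   /* the semicolon completes the do...while */
    else
        a = b = 1;

    return a + b;
}
```

Had the macro body been written with plain braces only, the semicolon after RESET_BOTH(a, b) would have ended the if statement early and the else would no longer compile.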
Example 2:
#include <stdio.h>
#define MUL(x,y) ((x) * (y))
int main(void)
{
    printf("%d\n", MUL(3, 5));   /* prints 15 */
    return 0;
}
Example 3:
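#include <stdio.h>
/*
 * It swaps two integer numbers.
 * Requires an int variable named tmp to be defined in the calling scope.
 */
#define SWAP(x, y) \
    do { \
        tmp = (x); \
        (x) = (y); \
        (y) = tmp; \
    } while (0)
int main(void)
{
    int tmp;          /* temporary storage required by SWAP */
    int x=10, y=20;
    SWAP(x,y);
    printf("%d %d\n", x, y);   /* prints 20 10 */
    return 0;
}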
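A variation on Example 3 is to declare the temporary variable inside the do { } while (0) block, so that the caller need not define one. The macro name SWAP_INT, the variable name swap_tmp and the function swapped_difference below are hypothetical, and this sketch works for int arguments only.

```c
/* Hypothetical variant of SWAP that carries its own temporary variable.
   The arguments must not themselves be named swap_tmp. */
#define SWAP_INT(x, y)      \
    do {                    \
        int swap_tmp = (x); \
        (x) = (y);          \
        (y) = swap_tmp;     \
    } while (0)

/* Swaps its two parameters and returns first - second afterwards. */
int swapped_difference(int first, int second)
{
    SWAP_INT(first, second);
    return first - second;
}
```

For instance, swapped_difference(10, 20) returns 10, because after the swap first holds 20 and second holds 10.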