Executable file and its Memory Organization in a Process

The file produced by compiling a program is the executable file. It is normally produced by running all four stages of compilation (preprocessing, compiling, assembling and linking), although the stages can also be run individually.
We already know that an executable file, when executing, is called a process. This discussion is all about the contents of an executable file and its different sections.
ELF-based files on Linux-based operating systems appear with several extensions, such as .axf, .bin, .elf, .o, .prx, .puff and .so, and sometimes an executable file may not have any extension.
When we compile a C program, the default executable file is named a.out. a.out was the executable file format of earlier versions of Linux and UNIX based operating systems. ELF has since become the most commonly used executable file format, though the name of the default executable file generated is still a.out. Current Linux systems use ELF as the executable file format by default.
ELF stands for Executable and Linkable Format. A major reason for the migration from a.out to ELF is that the a.out format had only limited support for linking with shared libraries.
An ELF file consists of two parts. 1. ELF header and 2. File data
The contents of ELF file can be observed using the command:
readelf -a a.out  
The following is the screenshot of an elf file.
Screenshot of ELF description
Understanding sections of ELF file:
1. ELF Header:
As shown in the screenshot, there are several terms present in the ELF header. In simple words, the ELF header gives metadata such as the class of the file, the endianness of the file etc. The following is the explanation of the terms you see in the screenshot shown above.
Magic shows the first 16 bytes of the file, the identification bytes. They give the file format id, the class (32-bit or 64-bit), the endianness, the version of the ELF format and the OS/ABI. Note that readelf prints these bytes in hexadecimal. They also serve as a quick check that the file really is an ELF file and is not corrupt.
Many file formats begin with a magic number, which identifies the file format.
The following image depicts a magic number, followed by an explanation of its most important bytes.

ELF description
Magic: The first byte (7f) represents the id of the file format, which is 7f for elf file format. The next three bytes of data denote ‘E’, ‘L’, and ‘F’ consecutively.
Class: The fifth byte denotes whether the file is in 32-bit or 64-bit format. 01 in this byte denotes a 32-bit ELF file and 02 denotes a 64-bit ELF file.
Data: The sixth byte denotes the endianness of the data. 01 in this byte represents little endian format and 02 represents big endian format.
Version: The seventh byte denotes the version of the ELF format. Only one version is currently defined, so this byte is 01.
OS/ABI: ABI stands for Application Binary Interface. This byte identifies the operating system and ABI whose conventions the file follows, so that the right interfaces are used. On Linux systems, it is commonly reported as UNIX - System V.
Machine: This header field (outside the magic bytes) represents the target architecture of the file, for example 0x03 for x86 (32-bit) and 0x3E for x86-64.
Type: It gives the purpose of the file viz.,
                01 – REL – Relocatable files, before they are linked into executable files
                02 – EXEC – Executable files for binaries
                03 – DYN – Shared object files for libraries
All other fields denote advanced metadata related to the executable file.
2. File Data:
The file data of an ELF file consists of three parts.

  1. Program headers or Segments
  2. Section headers or Sections
  3. Data
Program headers or Segments:
Program headers describe the segments of the file. They are used at run time by the kernel or loader to build the process image: each loadable segment is mapped into memory, typically using the mmap(2) system call.
Eg: GNU_EH_FRAME, GNU_STACK etc.
Section headers or Sections:
Section headers describe the sections of the file, which hold the instructions and the data that the linker works with when combining object files i.e., the section headers of a file define all the sections of the file.
Eg: .data, .rodata etc.
Four of the most important sections are .text, .data, .rodata and .bss.

  • The .text section contains the executable code of the given program. The contents of the text section do not change during execution and are loaded once, when the program is loaded into memory.
  • The .data section consists of initialized data with read/write access i.e., initialized static and global variables.
  • The .rodata section consists of the initialized data with read access only i.e., string constants and other constant data.
  • The .bss section consists of uninitialized data with read/write access i.e., uninitialized static and global variables.
However, these sections are most commonly referred to simply as the text section and the data section. The following image can give you a better view of executable files.

ELF file contents
These are the details of the ELF file when it is just a file. But, when the executable file is executing, there are two more regions – Stack and Heap.
For every process, a region of memory is allocated, which is called the address space of the process. The contents of the executable are loaded into this address space and the processor starts execution. The address space consists of the text section, data section, stack section and heap section.
Process during execution
Text section and data section were discussed above. 

  • The stack is the memory space used to allocate memory for function calls; the block allocated for one call is called a stack frame. The stack frame for a given function is allocated only when the function is called.
  • The heap section is used for memory obtained through Dynamic Memory Allocation (DMA), for example with malloc, which is then accessed through pointers.
Any executable file should be brought into RAM for execution, as the CPU cannot process the data present in secondary memory. During execution, it is called a process. A process contains the machine code generated from the statements of the source code. It can even contain variables that are declared but never used and functions that are defined but never called, unless the compiler or linker removes them.
Utilities for .elf file description:

  • hexdump is used to view the raw bytes of the file in hexadecimal.
  • readelf is used to get the structure of an ELF file.
  • scanelf and execstack are two tools used to inspect the stack settings of an ELF file, such as whether the stack is marked executable.
  • dumpelf, elfls and eu-readelf are used to get the headers of the ELF file.
  • objdump is used to see the sections and symbols of the ELF file.
  • elfutils package consists of utilities to perform analysis on an ELF file.
Commands:

  • hexdump a.out
  • readelf -a a.out
  • dumpelf a.out (dumpelf is part of the pax-utils package)
  • elfls -S /bin/ps
  • eu-readelf --program-headers /bin/ps
  • objdump -h /bin/ps
Advanced concepts of elf file can be found in the man page using the command: man 5 elf



Introduction to Basics of Functions in C language

Functions are the major building blocks of a C program. Of course, every C program execution starts from main function. We have coded many programs even without a basic knowledge of a function. Let us define and understand the term function.
What is a function in C language?
A function is a group/block of statements that is used to perform a given task.
A function is an identifier and every identifier in C language should be declared and/or defined.
There are three aspects to every function in C language – function declaration, function definition, and function call.
Function declaration gives the structural details of a function i.e., the name of the function, the type of arguments it accepts (if any) and the datatype of the data returned from the function (if any). The following is the syntax of a function declaration.
return_datatype function_name(list_of_arguments);  
The function call invokes the function to perform the given task.
Function definition includes the task that the function is designed to perform. This is the core part of a function. The following is how a function definition looks.
return_datatype function_name(list_of_arguments)  
{  
 ...  
 ...  
 statements  
 ...  
 ...  
}  
Based upon who defines them, there are two types of functions – user-defined functions and library functions. The names of these two types describe them.
User-defined functions are the functions whose functionality is defined by the user/programmer, whereas library functions are defined by the developers of the library.
The user-defined functions allow the programmer to define the functionality based upon the specifications of the application. The library functions save the user from repeatedly defining the most commonly used constructs. For example, fetching the length of a string is one of the most commonly used string operations. Instead of defining it again in every program we code, the library function strlen can be used directly.
A further advantage of library functions is that they provide access to system resources in system-level programming.
Based upon the function arguments and return type, there are four types of functions.
  1. Functions with arguments and return type
  2. Functions with arguments and without return type
  3. Functions without arguments and with return type
  4. Functions without arguments and without return type
The following are the examples that demonstrate these four function types.
int getSum(int a,int b)//function with arguments and return type   
{  
 return a+b;  
}  
void getSum(int a,int b)//function with arguments and without return type  
{  
 printf("Sum is %d\n",a+b);  
}  
int getSum(void)//function without arguments and with return type  
{  
 int a=10,b=20;  
 return a+b;  
}  
void getSum(void)//function without arguments and return type  
{  
 int a=10,b=20;  
 printf("Sum is %d\n",a+b);  
}  
Out of all the user-defined and library functions, main is a special kind of function in terms of declaration, definition and the number of arguments. The main function returns an integer, which is the termination status of the program. It is returned to the startup code (the _start routine on Linux), which in turn passes it to the kernel. Since C99, if control reaches the end of main without a return statement, the effect is the same as return 0.
Always remember that main is a user-defined function because the functionality of the function is defined by the programmer based upon the specifications. But it needs no separate declaration because the program never calls it explicitly; it is called by the startup code that is linked into the executable. In terms of number of arguments, main can have zero arguments, two arguments or, on many systems, three arguments as shown below.
int main(void) //int main() - no arguments  
int main(int argc,char* argv[]) //int main(int argc,char** argv) - two arguments  
int main(int argc, char* argv[],char* envp[]) //int main(int argc, char** argv,char** envp) - three arguments  
In the above prototypes of main function, there are up to three arguments.
  • The argument argc gives the number of arguments supplied in command line. This includes the name of the executable file that is provided for execution.
  • The argument char* argv[] or char** argv stores the arguments that are provided on the command line.
  • The argument char* envp[] or char** envp stores the environment variables of the process that is being executed, such as PATH and HOME. This third argument is a common extension and not part of standard C.
The concept of command line arguments is very early at this stage for discussion hence, they will be discussed in detail in further demonstrations.
However, a sample program for calculating factorial of a given number is shown below.
#include<stdio.h>  
int getFactorial(int);//function prototype or declaration  
int main(void)  
{  
 int num;  
 printf("Enter number to find factorial:");  
 scanf("%d",&num);  
 printf("Factorial is:%d\n",getFactorial(num));//function call for get factorial  
}  
int getFactorial(int n)//function definition  
{  
 int i,fact=1;  
 for(i=1;i<=n;i++)  
 fact*=i;  
 return fact;  
}  
Here, the getFactorial function is designed as the function with arguments and return type.  In the function call getFactorial(num), num is called actual parameter and in the function definition int n is called formal parameter.
Another important and most useful concept of functions, especially for data structures, is recursion. It will be discussed in coming sessions. To give a rough idea of recursion, remember that a recursive function is a function that calls itself.
The following are the minimum limits on parameter/argument passing that an implementation must support according to the C99 standard.

  • 127 parameters in one function definition
  • 127 arguments in one function call


Bitwise operators for bit manipulation in C language and example programs

Bitwise operators are very useful for bit manipulation in C language, and even more so in Embedded C. These operations include extracting the bit status, setting the bit status, resetting the bit status etc.
Bitwise operators
Following are some of the bit manipulations that are useful.
Checking the status of bit:
  • Write a C program to check the status of bit at given position.
#include<stdio.h>  
int main(void)  
{  
 int n,pos;  
 printf("Enter a number:");  
 scanf("%d",&n);  
 printf("Enter position:");  
 scanf("%d",&pos);  
 if(n>>pos & 1)  
 printf("Set\n");  
 else  
 printf("Reset\n");  
}  
Setting the bit:
  • Write a C program to set the bit at given position.
#include<stdio.h>  
int main(void)  
{  
 int n,pos;  
 printf("Enter a number:");  
 scanf("%d",&n);  
 printf("Enter position to set(0 to %d):",(int)sizeof(int)*8-1);  
 scanf("%d",&pos);  
 if(pos>31 || pos<0)  
 printf("Invalid position..\n");  
 else  
 {  
  n=n|(1u<<pos);  //unsigned 1 avoids signed overflow when pos is the sign bit  
  printf("After setting, n:%d\n",n);  
 }  
}  
The status-checking program shown earlier does not validate the position, so it misbehaves when the user enters a position outside the 0-31 range. This program checks the range before using the position.
Resetting the bit:
  • Write a C program to reset the bit at given position.
#include<stdio.h>  
int main(void)  
{  
 int n,pos;  
 printf("Enter a number:");  
 scanf("%d",&n);  
 printf("Enter position to reset(0 to %d):",(int)sizeof(int)*8-1);  
 scanf("%d",&pos);  
 if(pos>31 || pos<0)  
 printf("Invalid position..\n");  
 else  
 {  
  n=n & ~(1u << pos);  //unsigned 1 avoids signed overflow when pos is the sign bit  
  printf("After resetting, n:%d\n",n);  
 }  
}  
Complementing the bit:
  • Write a C program to complement the bit at given position.
#include<stdio.h>  
int main(void)  
{  
 int n,pos;  
 printf("Enter a number:");  
 scanf("%d",&n);  
 printf("Enter position to complement(0 to %d):",(int)sizeof(int)*8-1);  
 scanf("%d",&pos);  
 if(pos>31 || pos<0)  
 printf("Invalid position..\n");  
 else  
 {  
  n=n ^ (1u<<pos);  //unsigned 1 avoids signed overflow when pos is the sign bit  
  printf("After complementing, n:%d\n",n);  
 }  
}  
The following are some of the example programs using the bitwise operations in C language.
Program to exchange nibbles of a byte:
  • Write a program to exchange the nibbles of a byte.
#include<stdio.h>  
int main(void)  
{  
 unsigned char n,temp1,temp2;  
 int input;  //%d in scanf needs the address of an int  
 printf("Enter a number:");  
 scanf("%d",&input);  
 n=(unsigned char)input;  
 temp1=n>>4;  
 temp2=n<<4;  
 n=temp1|temp2;  
 printf("n after nibble exchange:%d\n",n);  
}  
The input is read into an int using %d and then stored in the unsigned char n, because passing the address of an unsigned char to %d makes scanf write beyond the variable. Note that a byte of unsigned integers (0-255) is considered here.
Program to print binary equivalent of a decimal number:
  • Write a C program to print binary equivalent of a given decimal number (signed/unsigned integer) using bitwise operators.
#include<stdio.h>  
int main(void)  
{  
 int n,i;  
 printf("Enter a number:");  
 scanf("%d",&n);  
 for(i=sizeof(int)*8-1;i>=0;i--)  
 if(n&(1u<<i))  
  printf("1");  
 else  
  printf("0");  
 printf("\n");  
}  
Program to check even or odd:
  • Write a C program to check if a given number is even or odd using bitwise operators.
#include<stdio.h>  
int main(void)  
{  
 int n;  
 printf("Enter a number:");  
 scanf("%d",&n);  
 if(n&1)  
  printf("Given number is odd\n");  
 else  
  printf("Given number is even\n");  
}  
Program to swap variables:
  • Write a C program to swap two variables without using temporary variables and arithmetic operators.
#include<stdio.h>  
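int main(void)  
{  
 int a,b;  
 printf("Enter a,b:");  
 scanf("%d%d",&a,&b);  
 printf("Before swapping:\na=%d\tb=%d\n",a,b);  
 a^=b;  
 b^=a;  
 a^=b;  
 printf("After swapping:\na=%d\tb=%d\n",a,b);  
}  
The three XOR statements swap two given variables without using temporary variables and arithmetic operators. They are written as separate statements because the one-line forms a^=b^=a^=b and b=(a^b)^(a=b) modify and read a variable without a sequence point in between, which is undefined behavior in C.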
Recollect Operators in C language.

Control statements in C language

Control statements decide the flow of execution. We know that C language is a procedural language, starting the execution with the main function. Sequential execution is not always what we need: a program may have to execute statements based upon conditions, or execute certain statements several times. Based upon where the control is transferred, there are two types of control statements. They are iterative control statements and non-iterative control statements.
The iterative control statements are used to execute single statement or set of statements several times. The non-iterative control statements just shift the flow of execution.
Control statements in C language
C consists of three iterative control structures. The following are the syntaxes and examples of each of these iterative control structures.
while control statement:
syntax:
initialization  
while(condition)  
{  
statements  
}  
Eg:
i=0;  
while(i<10)  
{  
printf("Hello world\n");  
i++;  
}  
do..while control statement:
syntax:
initialization  
do{  
statements  
}while(condition);  
Eg:
 i=0;  
 do{  
 printf("Hello world\n");  
 i++;  
 }while(i<10);  
for control statement:
syntax:
for(expression1;expression2;expression3)  
{  
statements  
}  
Eg:
for(i=0;i<10;i++)  
 printf("Hello world\n");  
Out of the three iterative control statements, for is the most commonly used control statement. Based upon the context, while and do..while control statements are used. It is important to note the minimum number of times each of these control structures is executed. The body of a for loop or a while loop may be executed zero times, because the condition is checked first; whereas the body of a do..while loop is executed at least once before the condition is checked.
Non-iterative control statements:
These control statements just pass the flow of execution to some other instruction as the programmer requires. There are two types of non-iterative control statements. They are conditional and unconditional control statements.
The conditional control statements pass the control of execution upon satisfying a given condition; whereas the unconditional control statements do the same without any condition.
The conditional control statements are if, if..else and switch..case. The unconditional control statements are goto, break, continue and return.
Conditional control statements:
The following are the syntaxes and examples of each of these conditional control statements.
if control structure:
This is used to check if a certain condition is met, as shown below.
Syntax:
if(condition)  
{  
 statements  
}  
Eg:
if(a==10)  
 printf("a is 10\n");  
if..else control statement:
If there are only two outcomes of a particular expression, then if..else is used.
Syntax:
if(condition)  
{  
 statements  
}  
else  
{  
 statements  
}  
Eg:
if(a<0)  
 printf("Negative number");  
else  
 printf("Non-negative number");  
The if and if..else control statements can be nested. The most common nesting format is the else..if ladder i.e., an if block is nested within the else block, as shown below.
Out of all the blocks in the ladder, only one is executed: the block of the first condition that is true.
Syntax:
if(condition1)  
{  
 statements  
}  
else if(condition2)  
{  
 statements  
}  
else if(condition3)  
{  
 statements  
}  
…  
Eg:
if(a<0)  
 printf("Negative value");  
else if(a==0)  
 printf("Zero value");  
else  
 printf("Positive value");  
switch..case control statement:
The switch..case control statement is used when an expression can have multiple results. It is especially useful in menu driven applications to act on the user response. Based upon the value of the expression written in switch, the respective case is executed. Remember that every case should end with a break statement, which exits the switch; otherwise all the cases after the matching case are also executed.
Syntax:
switch(choice)  
{  
 cases and statements  
}  
Consider the following example program.
#include<stdio.h>  
int main(void)  
{  
 int ch,num1,num2;  
 do{  
     switch(ch)  
     {  
      case 1:printf("Sum is %d\n",num1+num2);  
      case 2:printf("Subtraction is %d\n",num1-num2);  
      case 3:printf("Product is %d\n",num1*num2);  
      default:printf("Invalid choice..\n");       
      case 4:printf("Division is %d\n",num1/num2);  
      case 5:printf("Mod is %d\n",num1%num2);  
     }  
     printf("1:Add\t2:Subtract\t3:Product\t4:Divide\t5:Modulus\t6:Quit\n");  
     scanf("%d",&ch);  
     printf("Enter 2 numbers:");  
     scanf("%d%d",&num1,&num2);  
  }while(ch<=5);  
}  
In this program, observe that the first iteration of the do..while loop runs the switch before ch has been given a value. So ch holds a garbage value (strictly speaking, reading an uninitialized variable is undefined behavior) and the default case is typically executed. Also, observe that the default case is placed among the other cases.
After the default case, all the cases beneath it are executed as well. This is because switch does not exit on its own after executing the matching case. It is to be taken care of by the programmer by writing break at the end of every case. Irrespective of whether it is the default case or any other case, the last case does not need break.
#include<stdio.h>  
int main(void)  
{  
 int ch,num1,num2;  
 do{  
     switch(ch)  
     {  
      case 1:printf("Sum is %d\n",num1+num2);break;  
      case 2:printf("Subtraction is %d\n",num1-num2);break;  
      case 3:printf("Product is %d\n",num1*num2);break;  
      case 4:printf("Division is %d\n",num1/num2);break;  
      case 5:printf("Mod is %d\n",num1%num2);break;  
      default:printf("Invalid choice..\n");  
     }  
     printf("1:Add\t2:Subtract\t3:Product\t4:Divide\t5:Modulus\t6:Quit\n");  
     scanf("%d",&ch);  
     printf("Enter 2 numbers:");  
     scanf("%d%d",&num1,&num2);  
  }while(ch<=5);  
}  
The switch..case always jumps directly to the matching case label, skipping any statements written before it inside the switch. Check the following example, where a++ is never executed.
#include<stdio.h>  
int main(void)  
{  
 int a=10;  
 switch(1)  
 {  
 a++;  
 case 1:printf("This is case 1\n");break;  
 default:printf("This is default case\n");  
 }  
 printf("a:%d\n",a);  
}  
*Remember that the choice can be an expression, but it must be of integral type; a floating-point expression such as 1.5 is rejected by the compiler. The value of each case should also be a constant of integral type (recollect that char is also considered as integral datatype).
Try out the following program with different ‘choice’ values and observe the output.
#include<stdio.h>  
int main(void)  
{  
 int a=10,b=6;  
 switch(1)//also try switch(a/b); switch(1.5) is a compile error  
 {  
 case 1:printf("This is case 1\n");break;  
 default:printf("This is default case\n");  
 }  
}  
Unconditional control statements:
These control statements branch unconditionally.
goto control statement:
Syntax:
goto label;  
This just transfers the flow of execution to the statement marked with the given label. Check out the following program.
#include<stdio.h>  
int main(void)  
{  
 label:printf("Hello world..\n");  
 printf("How are you?\n");  
 goto label;  
}  
This program executes infinitely, as there is no condition to break the execution. Modify the program as shown below. Then observe the output.
#include<stdio.h>  
int main(void)  
{  
 int i=0;  
 label:  
 if(i<5)  
 {  
  printf("Hello world..\n");  
  printf("How are you?\n");  
  i++;  
  goto label;  //jump back only while the condition holds  
 }  
}  
break control statement:
One of the uses of break statement was observed previously. The break control statement is also used to break a loop. Check out the following program.
#include<stdio.h>  
int main(void)  
{  
 int i;  
 for(i=0;i<100;i++)  
 {  
 printf("%d\n",i);  
 if(i>5)  
  break;  
 }
}    
continue control statement:
The continue control statement is used to skip the remaining statements of the current iteration and move on to the next iteration. See the following example.
#include<stdio.h>  
int main(void)  
{  
 int i;  
 for(i=0;i<100;i++)  
 {  
 if(i<5)  
  continue;  
 printf("%d\n",i);  
 }  
}  
return control statement:
The return control statement is used just to return to the calling function. See the following example.
#include<stdio.h>  
int print(int i)  
{  
 i=i+10;  
 return i;  
}  
int main(void)  
{  
 int i=10;  
 printf("Before return:%d\n",i);  
 i=print(i);  
 printf("After return:%d\n",i);  
}  
Observe that the value of i is being returned to the main function, which is the calling function. The complete discussion about functions is mentioned in further sessions.

Typecasting - Implicit and Explicit typecasting in C language

Type casting means converting one datatype to another datatype based upon the requirement. There are two types of type casting – Implicit type casting and Explicit type casting. Consider the following program.
Implicit typecasting: 
Before you execute, guess the output of the program shown below to differentiate what you know and what the compiler compiles.
#include<stdio.h>  
int main(void)  
{  
 char ch;  
 int x=-5;  
 unsigned int y=5;  
 int a=5.5;  
 float b=5.5;  
 int c;  
 printf("%d\n",a);  
 c=a==b;  
 printf("c=%d\n",c);  
 c=x>y;  
 printf("c=%d\n",c);  
 ch='B';  
 c=ch>'A';  
 printf("c=%d\n",c);  
}  
We know that real constants such as 5.5 are considered as double by default. In the declaration int a=5.5, 5.5 is double type data, which is being assigned to an integer variable a. When we print a using %d, we see only the integral part of 5.5 i.e., 5. This is because the compiler converts the double value 5.5 to the integer 5. This is an example of type casting; specifically, it is called implicit type casting, as converting one datatype to another is taken care of by the compiler itself.
Now, consider the statement c=a==b. Here, = has lower precedence than ==. Hence the comparison is done first and the result is assigned to c. In the comparison a==b, a is of integer datatype and b is of float type. The variable a is stored in normal integer format and b in IEEE format for floating point. In this comparison, the compiler converts the value of the integer variable to float (implicit typecasting) and then compares them. Since a already holds 5, it becomes 5.0, while b is still 5.5. Hence, the compiler evaluates a==b as 5.0==5.5, which is, of course, unequal. This assigns 0 to c.
In the statement c=x>y, x>y is computed first and the result is assigned to c. Though we have not mentioned x as signed integer, int means signed int by default. So, x is signed and y is unsigned. The signed value of x is converted into unsigned data (implicit typecasting) and compared. As -5 cannot be represented as an unsigned value, it wraps around to a very large number (4294967291 when int is 32 bits wide). Hence, the expression 4294967291>5 results true and 1 is assigned to c.
In the statement c=ch>'A', ch is of char datatype, while the character constant 'A' has type int in C; ch is promoted to int and the two are compared. Here, remember that though ch and 'A' are characters, their numerical ASCII values (66 and 65) are compared. The result is true and 1 is assigned to c.
Explicit typecasting:
Now, modify the respective statements in the above program as shown below.
c=a==(int)b;
c=x>(int)y;
In c=a==(int)b, a is integer data; though b is float data, it is compared with a after "explicit" typecasting. The (int)b instructs the compiler to convert the value of b to an integer for this comparison. Note that explicit typecasting does not change the datatype of b itself; only the value used in this expression is converted. Hence, the comparison becomes 5==5, which is true.
In c=x>(int)y, x is signed integer; after explicit typecasting, the compiler gets the expression as -5>5, which is obviously false and 0 is assigned to c.
*In simple words, when the compiler itself takes care of operating on different datatypes together, it is called implicit typecasting. When the programmer forces particular data to be treated as a desired datatype, it is called explicit typecasting.
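When two operands have different datatypes, the operand of lower rank is converted to the type of the operand of higher rank. Roughly, from highest rank to lowest: long double, double, float, unsigned long long, long long, unsigned long, long, unsigned int, int.
*Always note that when a signed operand and an unsigned operand of the same rank (int or wider) are used together, the signed data is converted into unsigned data. Types narrower than int, such as char and short int, are first promoted to int.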
For a still more understanding, execute the following program.
#include<stdio.h>  
int main(void)  
{  
 int a=57;  
 printf("*%f*\t*%d*\t*%c*\n",(float)a,(int)a,(char)a);  
}  
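Before you execute the following program, predict its output and then compare it with the actual output. Here the format specifiers deliberately do not match the types of the arguments, which is undefined behavior in C, so the output may differ between compilers and platforms. There are many ways to be wrong but only one way to be right.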
#include<stdio.h>  
int main(void)  
{  
 int a=57;  
 printf("*%hd*\t*%c*\t*%d*\n",(float)a,(int)a,(char)a);  
}  
Remember that type casting can also be performed upon pointers, which can be discussed in pointers.

Operators in C language

Arithmetic operators:
These operators are used to perform arithmetic operations. The arithmetic operators are: +, -, *, / and %. The operations performed by these operators are well known to us. Now, go through the following program and predict the output. After this, execute the program and check the result.
#include<stdio.h>  
int main(void)  
{  
 int a=10,b=20,c;  
 c=a+5-b/4*2;  
 printf("a=%d\tb=%d\tc=%d\n",a,b,c);  
}  
It is difficult to predict the output of the assignment expression c=a+5-b/4*2 without knowing the order in which the operators are applied. In order to eliminate such ambiguities, C defines a precedence for every operator. Within the five arithmetic operators available, the following is the order of precedence i.e., when multiple operators occur in the same statement, the compiler groups the operations based upon the precedence of the operators.
+             -              -> lower precedence
*             /              %            -> higher precedence
In the expression c=a+5-b/4*2, / and * have higher precedence than + and -, and operators of the same precedence are applied from left to right. Hence, the operators and operands of this expression are grouped as c=((a+5)-((b/4)*2)). b/4 is computed and the result is multiplied by 2; this product is then subtracted from the sum a+5. After all the operations are performed, the result is assigned to c. With a=10 and b=20, c becomes 15-10 i.e., 5.
Consider the following program.
#include<stdio.h>  
int main(void)  
{  
 int a=10,b=20,c;  
 c=b=a*b/a;  
 printf("a=%d\tb=%d\tc=%d\n",a,b,c);  
}  
In this, the expression c=b=a*b/a consists of the operators *, / and =. Out of these operators, * and / have the same precedence. Also, there are two assignment operators. The compiler now follows the associativity of the operators. Associativity gives the direction in which operators of the same precedence are grouped. The operators * and / associate from left to right, so a*b/a is grouped as (a*b)/a, which is 200/10 i.e., 20. The assignment operator associates from right to left, so the expression is grouped as c=(b=((a*b)/a)): 20 is assigned to b first and then to c. Note that an expression such as b=a*b/a=b does not compile, because = has lower precedence than / and the left operand of the second = would be a*b/a, which is not a variable.
 *Remember that the % operator works only on integer operands; for real datatypes, the library function fmod is used instead.
Relational operators:
Relational operators are used for comparison and the result is given as 0 for false and 1 for true. The relational operators in C language are <,<=,>,>=,== and !=. The functionality of all these operators are well known. Execute the following program and check the output.
#include<stdio.h>  
main()  
{  
 char ch;  
 int x=5;  
 unsigned int y=5;  
 int a=5.5;  
 float b=5.5;  
 int c;  
 c=a==b;  
 printf("c=%d\n",c);  
 c=x>y;  
 printf("c=%d\n",c);  
 ch='B';  
 c=ch>'A';  
 printf("c=%d\n",c);  
}  
In the program mentioned above, we have comparisons between variables of different datatypes. The variable x stores the integer value 5. The variable y also stores 5; the difference is that the MSB of x is used for representing the sign, but the MSB of y is not. Though the variable a is of integer datatype, we are assigning a real value to it, so the fractional part is discarded and a holds 5. The variable b is of float type and 5.5 is stored into it. Observe that data of different datatypes are being compared in the later statements.
In order to compare data of different datatypes, data is “type cast”. When compiling the above program, the compiler itself converts the data, which is called implicit type casting. On the other hand, if the programmer typecasts the data, it is called explicit type casting. In a==b, a is converted to float and 5.0 is compared with 5.5, which gives 0. In x>y, x is converted to unsigned int and 5>5 gives 0. In ch>'A', the characters are compared as integers and 66>65 gives 1. The concept of typecasting is discussed in later sections. As of now, just assume that type casting is converting one datatype to another.
Logical operators:
Logical AND (&&), Logical OR (||) and Logical NOT (!) are the three logical operators in C language. Execute the following program to know these logical operators.
#include<stdio.h>  
main()  
{  
 int a=0,b=10,c,d;  
 d=a&&b;  
 printf("AND:d=%d\n",d);  
 d=a||b;  
 printf("OR:d=%d\n",d);  
 d=!a;  
 printf("NOT:d=%d\n",d);  
 if((c=a&&b)&&(d=a||b))  
  printf("In if block..\n");  
 else  
  printf("In else block..\n");  
}  
*Note that any non-zero value (even if it is negative) is considered as true and only zero is considered as false. Now, execute the following program and check the output. 
#include<stdio.h>  
main()  
{  
 int a=20,b=-10,c=0,d;
 d=(a=b)||(b=c)||(c=a);
 printf("a:%d\tb:%d\tc:%d\td:%d\n",a,b,c,d);
 d=(a=b)&&(b=c)&&(c=a);
 printf("a:%d\tb:%d\tc:%d\td:%d\n",a,b,c,d);
}  
In this program, the outputs may not match the expectations. This is because of short-circuit evaluation: C stops evaluating the remaining conditions as soon as any one condition is false for AND, or as soon as any one condition is true for OR. This is a rule of the language and not something specific to gcc.
In the expression d=(a=b)||(b=c)||(c=a), the expression a=b is evaluated first, which results in a non-zero value, hence true. Therefore, the other two expressions b=c and c=a are skipped, as the complete expression is true irrespective of the other conditions.
In the expression d=(a=b)&&(b=c)&&(c=a), a=b results in a non-zero value, which is true. The evaluation then moves to the next condition b=c, which results in zero, which is false. Hence, the expression c=a is skipped, as the complete AND results in zero irrespective of the other conditions.
Bitwise operators:
As far as embedded systems are concerned, bitwise operators play a major role. The bitwise operators present in C are &,|,^,<<,>> and ~.  We are very familiar with the operations of these bitwise operators. Note that bitwise operations cannot be performed on real values.
Bitwise AND (&), OR (|), and XOR (^) perform their respective operations on every single bit of the given data. The left shift operator (<<) and the right shift operator (>>) shift the bits of the given data. The complement operator (~) performs bitwise complement of the given data. Predict the output of the following program before executing it, and then compare the expected results with the output produced after execution.
#include<stdio.h>  
main()  
{  
 int a=20,b=-100,c=0,d;  
 printf("a>>2:%d\n",a>>2);  
 printf("b>>3:%d\n",b>>3);  
 printf("b&-1:%d\n",b&-1);  
 printf("a|-1:%d\n",a|-1);  
 printf("a&0:%d\n",a&0);  
 printf("b|0:%d\n",b|0);  
 printf("a&(1<<4):%d\n",a&(1<<4));  
 printf("(a<<4)&1:%d\n",(a<<4)&1);  
 printf("1<<35:%d\n",1<<35);  
 printf("1>>35:%d\n",1>>35);  
}  
Points to remember about bitwise operators:
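*Remember that none of the bitwise operators changes the value held by its operands; each one only produces a new result.
*Bitwise AND (&) with -1 always results in the other operand, because every bit of -1 is 1.
*Any non-negative number is reduced by half (discarding the remainder) for every right shift.
*Shifting by a count greater than or equal to the width of the datatype, such as 1<<35 or 1>>35 on a 32-bit int, is undefined behaviour in C; gcc warns about it and the value printed cannot be relied upon.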
Now, check out the following program.
#include<stdio.h>  
main()  
{  
 int a=20,b=100,c=0,d;  
 printf("-30>>1:%d\n",-30>>1);  
 printf("-31>>1:%d\n",-31>>1);  
 printf("-32>>1:%d\n",-32>>1);  
}  
The output of this program is -15 for -30>>1 and -16 for both the expressions -31>>1 and -32>>1. Recollect that negative numbers are stored in their 2s complement format, and that gcc fills the vacated MSB with a copy of the sign bit when a negative value is shifted right (the C standard leaves this choice to the implementation). Go through the following explanation and observe that the LSB of -31 and of -32 is shifted out of the 8-bit binary representation. (For easy representation, only 8 bits are considered, which can be extrapolated to 32 bits perfectly.) Observe that the bits of -30>>1 convert back to -15 by applying 2s complement.
Example for bitwise operators
Now, check out the following program that illustrates the complement operator (~). The complement of ‘a’ here is interpreted as a negative value, because the MSB of the result is 1. Hence, with a=20 the program prints b as -21.
In complement, the result can be predicted using the equation: ~x=-x-1
#include<stdio.h>  
main()  
{  
 int a=20,b;  
 b=~a;  
 printf("a:%d\tb:%d\n",a,b);  
}  
sizeof operator:
sizeof is the operator that gives the size of the given data in bytes. Execute the following program and check the output.
#include<stdio.h>  
main()  
{  
 int a=20,b=100,c=0,d;  
 d=sizeof(a);  
 printf("%d\n",d);  
 d=sizeof(int);  
 printf("%d\n",d);  
 d=sizeof(-100);  
 printf("%d\n",d);  
 d=sizeof(1.0);  
 printf("%d\n",d);  
 d=sizeof('1');  
 printf("%d\n",d);  
}  
As the variable a is an integer, sizeof(a) gives 4 bytes. The datatype int is 4 bytes in size on typical gcc targets. The integer constant -100 also has type int, so it gives 4 bytes. Generally, we think of real data as float. But in C an unsuffixed real constant such as 1.0 has type double, hence sizeof(1.0) gives 8 bytes. In the last statement, ‘1’ is a character constant. In C a character constant has type int and holds the ASCII code of the character, hence it gives 4 bytes.
Now, consider the following program with strings.
#include<stdio.h>  
main()  
{  
 int a;  
 a=sizeof("A");  
 printf("%d\n",a);  
 a=sizeof("gcchub");  
 printf("%d\n",a);  
 a=sizeof("123.45");  
 printf("%d\n",a);  
}  
When the data to the sizeof operator is given in double quotations, every character is counted as one byte; at the same time, the string is terminated with the null character ‘\0’, which also needs one byte of memory. Hence, “A” gives 2 bytes, while “gcchub” and “123.45” give 7 bytes each (6 characters plus the null character).
Finally, execute the following program to get some more details about sizeof operator.
#include<stdio.h>  
main()  
{  
 int a=20,b=100,c=50;  
 printf("%d\n",sizeof(a)>a);  
 printf("%d\n",sizeof(b)>b);
 printf("%zu\n",sizeof(c++));
 printf("%d\n",c);  
}  
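The result given by the sizeof operator is of an unsigned integer datatype (size_t). Both the comparisons sizeof(a)>a and sizeof(b)>b produce 0 as the output, because 4 is smaller than 20 and 100. The unsigned result matters when the other operand is negative: the int is converted to unsigned before the comparison, so even sizeof(a)>-1 produces 0. In the next statement, there is sizeof(c++) and then we have printed the value of c. To our surprise, c gives the output as 50 though c++ is mentioned. This is because sizeof is an operator and not a function: the compiler only inspects the type of its operand and does not evaluate it. Hence, the c++ operation is never performed.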
Ternary operator:
This is the short form of if-else statement. This operator consists of three operands and hence the name ternary operator. The syntax of ternary operator is given below.
(condition)?(operand1):(operand2);
In this, if the condition is true, then operand1 is evaluated; otherwise, operand2 is evaluated. The value of the evaluated operand becomes the result of the whole expression.
Increment/decrement operators:
These operators are used to increment/decrement the given variable by 1.
If the increment operator is placed after the operand, it is called the post-increment operator; if it is placed before the operand, it is called the pre-increment operator.
If the decrement operator is placed after the operand, it is called the post-decrement operator; if it is placed before the operand, it is called the pre-decrement operator.
Check out the following example to differentiate pre/post increment/decrement operators.
#include<stdio.h>  
main()  
{  
 int v=1,r;  
 r=++v;  
 printf("r:%d\tv:%d\n",r,v);  
 r=v++;  
 printf("r:%d\tv:%d\n",r,v);  
 r=--v;  
 printf("r:%d\tv:%d\n",r,v);  
 r=v--;  
 printf("r:%d\tv:%d\n",r,v);  
}  
In r=++v, v is first incremented by 1 (pre-increment) and then the value is assigned to r. In r=v++, the current value of v is assigned to r and then v is incremented by 1 (post-increment). The same can be extrapolated to the decrement operators. To be still more precise about pre/post increment/decrement operators, check out the following program and try to predict the output.
#include<stdio.h>  
main()  
{  
 int a=1,b=2,c=3,r;  
 r=a++ + ++b + ++c;  
 printf("a:%d\tb:%d\tc:%d\tr:%d\n",a,b,c,r);  
 c=r++ + a-- + b--;  
 printf("a:%d\tb:%d\tc:%d\tr:%d\n",a,b,c,r);  
 b=c-- + a++ + --r;
 printf("a:%d\tb:%d\tc:%d\tr:%d\n",a,b,c,r);  
}  
In the statement r=a++ + ++b + ++c, initially, a=1, b=2 and c=3. Here, a is post increment, b and c are pre-increment. So, r is computed as r=1+3+4 i.e., 8. After calculating the sum, a is incremented by 1 and then, the result is assigned to r. After the execution of this statement, the status is a=2, b=3, c=4 and r=8.
In the statement c=r++ + a-- + b--, the current values are considered for the operation, as all three operators are in post form. So, c is computed as c=8+2+3. After computing the sum, r is incremented, and a and b are decremented. After the execution of this statement, the status is a=1, b=2, c=13 and r=9.
The statement b=c-- + a++ + --r is computed as b=13+1+8. After the execution of this statement, c is decremented and a is incremented. The status is a=2, b=22, c=12 and r=8.

Format specifiers and input-output data management in C language

Format specifiers in C play a major role in storing or retrieving the data. Data may be corrupted or unexpected results may be produced if proper format specifiers are not used. Have a look at the list of format specifiers supported by C on gcc compiler.
Format specifiers in C on GCC
Before going deep into format specifiers, two points are to be known and remembered throughout the input/output operations in C.

  1. Any input given is not directly stored into the memory. It is first moved to a buffer called ‘stdin’ and then the data is fetched from stdin to store into the memory (specifically RAM, when the process is running).
  2. Any output displayed on the console is not directly written on to the console by the data from the memory/processor. The data is first put into an intermediate buffer called ‘stdout’ and then fetched onto the console.

Data IO in C language
*Do not be confused by the term buffer. A buffer is simply an intermediate area of memory that holds data in transit. Strictly speaking, stdin and stdout are streams, and each is backed by such a buffer. Data that is left unread in the stdin buffer (for example, the newline typed after a number) stays there and is picked up by the next input call.
Format specifiers for scanning data from user:
There are different functions in C language supporting input/output data streaming. Functions like scanf, gets, getc and getchar are used to collect data from the user. Except for the scanf family of functions, all the input functions are predefined for a particular datatype. Only scanf depends upon format specifiers to “scan” any type of data from the user, as shown above.
As we said earlier, whenever we input data to the computer, it is first stored in the stdin buffer and then stored into the memory (RAM), after which the processor fetches it for further processing. But the question is: how much data is taken from the buffer and stored? The format specifier answers this, by telling scanf how many characters to consume from the buffer and how many bytes to store in the memory.
For example, if %c is used, 1 character is read from the stdin buffer and stored as 1 byte; if %d or %i is used, a run of digit characters is read, converted to an integer, and stored as 4 bytes (the size of int on typical gcc targets). Consider the example shown below.
 #include<stdio.h>  
 main()  
 {  
  int f;  
  printf("Enter f:");  
  scanf("%c",&f);  
  printf("%d\t%c\n",f,f);  
 }  
This produces a warning after compilation; however, the program runs! When we enter an input, say a, the %d format specifier typically prints a garbage value, while %c prints the exact value a.
Though the variable f is declared as an integer, we have used the %c format specifier. This instructs scanf to read 1 character from stdin and store it as 1 byte, whereas 4 bytes of memory are allocated for f. The remaining 3 bytes of f are never written and hold undefined values, resulting in garbage when %d is used to print all 4 bytes.
Now consider the following example program.
 #include<stdio.h>  
 main()  
 {  
  char f;  
  printf("Enter f:");  
  scanf("%c",&f);  
  printf("%d\t*%c*\n",f,f);  
 }  
The variable f is declared with the char datatype and the input is stored with the format specifier %c. There is no mismatch between the datatype and the format specifier, so this program compiles without warnings. After executing this program, you see a definite output, and this time it is not garbage. The values printed by %d and %c can be cross-verified with the command man ascii.
The same pattern of fetching data holds good for all the integral datatypes. The scene slightly differs for real datatypes. Execute the following program.
 #include<stdio.h>  
 main()  
 {  
  int f;  
  printf("Enter f:");  
  scanf("%f",&f);
  printf("%d\t%f\n",f,f);  
 }  
Format specifiers for printing data onto console:
In the above example, though the variable f is declared as an integer, the format specifier decides how the data read from stdin is stored into memory. With the format specifier %f, scanf stores the 4 bytes in the IEEE 754 single-precision floating-point format. But when the data is read back from the memory with %d and displayed on the console, printf simply interprets those 4 bytes as an ordinary integer instead of as data stored in IEEE 754 format.
Suppose that the input provided during runtime is 5. It is stored as the IEEE 754 bit pattern of 5.0, which is 0x40A00000. When the same bits are printed with %d, they are read as a plain binary integer, which gives 1084227584 after binary to decimal conversion. Now, execute the following program and observe the difference.
 #include<stdio.h>  
 main()  
 {  
  int f;  
  printf("Enter f:");  
  scanf("%d",&f);  
  printf("%f\t%d\n",f,f);  
 }  
*Observe that we are using different format specifiers to print the data available in the same variable.
According to the explanation given above, one would expect %f to reinterpret the bits stored for the integer input and print the corresponding floating-point value. But the output is shown as -0.000000 (the exact value may differ from system to system). This is not a bug in printf. Passing an int where %f expects a double is undefined behaviour in C: on common platforms such as x86-64, integer and floating-point arguments travel in different registers, so printf reads a value that was never set from f. This cannot be avoided unless the format specifier matches the type of the data being stored or printed. The same argument holds good for all the real datatypes.
 #include<stdio.h>  
 main()  
 {  
  int f;  
  printf("Enter f:");  
  scanf("%d",&f);  
  printf("%d\n",f);  
 }  
The bottom line of this post is:
The amount of data stored into the memory or fetched from the memory depends upon the format specifier used. On the other hand, the behaviour of printf and scanf is undefined when the format specifier does not match the type of the argument, as happens when integral and real format specifiers are mixed on the same variable.