Format specifiers and input-output data management in C language

Format specifiers in C play a major role in storing and retrieving data. If the wrong format specifier is used, data may be corrupted or unexpected results may be produced. Have a look at the list of format specifiers supported by C on the GCC compiler.
Format specifiers in C on GCC
Before going deeper into format specifiers, there are two points to know and remember for all input/output operations in C.

  1. Input is not stored directly into memory. It first goes into a buffer attached to the ‘stdin’ stream, and the data is then fetched from that buffer and stored into memory (specifically RAM, while the process is running).
  2. Output displayed on the console is not written to the console directly from memory or the processor. The data is first put into an intermediate buffer attached to the ‘stdout’ stream and then passed on to the console.

Data IO in C language
*Do not be confused by the term buffer. A buffer is simply a temporary data pool that acts as an intermediary for data storage. Data stays in a buffer until it is consumed, so input that one read leaves behind (for example, the newline typed after a value) is still in the stdin buffer when the next read happens.
Format specifiers for scanning data from user:
C provides several functions for input/output data streaming. Functions such as scanf, gets, getc and getchar collect data from the user (gets was removed from the language in C11 because it cannot limit the length of the input). Except for the scanf family, each input function is predefined for a particular datatype. Only scanf depends upon format specifiers to “scan” any type of data from the user, as shown above.
As we said earlier, whenever we input data to the computer, it is first held in the stdin buffer and then stored into memory (RAM), from where the processor fetches it for further processing. But the question is: how much data is fetched from the buffer? This is where format specifiers come into the picture: they define how the input is interpreted and how much data is stored in memory.
For example, if %c is used, 1 byte of data is fetched from the stdin buffer and stored; if %d or %i is used, the digits typed are converted and stored as an int, which occupies 4 bytes with GCC on common platforms. Consider the example shown below.
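 #include <stdio.h>
 int main(void)
 {
  int f;                    /* 4 bytes, left uninitialised */
  printf("Enter f:");
  scanf("%c", &f);          /* deliberate mismatch: %c stores only 1 byte */
  printf("%d\t%c\n", f, f);
  return 0;
 }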
This produces a warning at compile time; however, the program still runs! When we enter an input, say a, %d prints a garbage value while %c prints exactly a.
Though the variable f is declared as an integer, we have used the %c format specifier. This instructs scanf to fetch 1 byte of data from stdin, whereas 4 bytes of memory are allocated for f. The 1 byte fetched from stdin is stored in one byte of f. The remaining 3 bytes were never initialised and hold undefined values, which results in garbage when %d is used to read all 4 bytes.
Now consider the following example program.
 #include <stdio.h>
 int main(void)
 {
  char f;
  printf("Enter f:");
  scanf("%c", &f);          /* %c matches the char variable */
  printf("%d\t*%c*\n", f, f);
  return 0;
 }
The variable f is declared with the char datatype and the input is stored into it with the format specifier %c. There is no mismatch between the datatype and the format specifier, so this produces no warning and compiles smoothly. After executing this program, you see a definite output, and this time it is not garbage. The values printed by %d and %c can be cross-verified with man ascii.
The same pattern of fetching data holds good for all the integral datatypes. The scene differs slightly for real datatypes. Execute the following program.
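 #include <stdio.h>
 int main(void)
 {
  int f;
  printf("Enter f:");
  scanf("%f", &f);          /* deliberate mismatch: stores IEEE 754 bits into an int */
  printf("%d\t%f\n", f, f); /* %f does not match the int argument */
  return 0;
 }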
Format specifiers for printing data onto console:
In the above example, though the variable f is declared as an integer, the format specifier changes how the data fetched from stdin is stored into memory. With the format specifier %f, the 4 bytes of data are stored according to the IEEE 754 floating-point format. But when the data is fetched from memory and displayed on the console with %d, printf simply reads those 4 bytes as a plain integer instead of interpreting them as IEEE 754 data.
Suppose that the input provided at runtime is 5. When the data is stored, it is converted into IEEE 754 format (bit pattern 0x40A00000). But when it is fetched with %d, the stored bits are read as an ordinary integer, which gives 1084227584 after binary to decimal conversion. Now, execute the following program and observe the difference.
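 #include <stdio.h>
 int main(void)
 {
  int f;
  printf("Enter f:");
  scanf("%f", &f);          /* deliberate mismatch, as in the previous program */
  printf("%f\t%d\n", f, f); /* same variable, specifiers in the opposite order */
  return 0;
 }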
*Observe that we are using different format specifiers to fetch the data available in the same variable.
According to the explanation given above, the format specifier %f should produce output equivalent to the bits stored for the given input. But the output is shown as -0.000000. This is not a fault in the input and, contrary to how it looks, not a bug in printf either. printf is a variadic function, so it cannot check the types of its arguments and relies entirely on the format string. Passing an int where %f expects a double is undefined behaviour in C. On x86-64, for instance, int arguments are passed in general-purpose registers and double arguments in floating-point registers, so %f reads a register in which f was never placed. The data is therefore fetched incorrectly whenever format specifiers of integral and real datatypes are used on the same variable, and this cannot be avoided unless the proper format specifier is used to store or fetch the data. The same argument holds good for all the real datatypes.
 #include <stdio.h>
 int main(void)
 {
  int f;
  printf("Enter f:");
  scanf("%d", &f);          /* %d matches the int variable */
  printf("%d\n", f);
  return 0;
 }
The bottom line of this post is:
The amount of data stored into memory or fetched from memory depends upon the format specifier used. On the other hand, printf and scanf cannot correctly organise data from the user or data to the user when format specifiers of integral and real datatypes are mixed for the same variable.