What is preprocessing?

Preprocessing, as the name suggests, is the processing of C source code before the actual compilation begins. In C, a line whose first non-blank character is ‘#’ is called a preprocessing directive. It tells the preprocessor to act on that line before the compilation phase begins.

Preprocessing directives

Source file inclusion (#include)

The #include directive tells the preprocessor to insert the contents of the specified file, as is, into the current source file at the location of the directive.

For example, when we write #include <stdio.h> in our program, the entire contents of stdio.h are inserted into our source code at that point.

There are two variants of #include directive:

#include "filename.h"

#include <filename.h>

When we include a file by specifying its name in angle brackets, the preprocessor looks for the file in the directories on the system include path, which is simply a list of directories that are searched in order. This form is typically used for standard library header files.

When we include a file by specifying its name in double quotes, the preprocessor typically starts by looking in the directory that contains the current source file. If the file is not found there, it goes on to search the system include path. This form is typically used for user-defined header files.

Macro replacement (#define)

The #define directive defines a macro. A macro consists of a macro name and a macro body. The first token after #define is the macro name, and the remaining tokens up to the end of the line make up the macro body. During preprocessing, each occurrence of the macro name is replaced by the macro body. For example:

#define BUFFERSIZE 128

As we have learned, the ‘#’ symbol at the beginning of the line indicates a preprocessing directive. Here #define is the directive, and it defines the symbolic name BUFFERSIZE for the literal 128.

Once the preprocessor encounters this directive, it replaces every subsequent occurrence of the token BUFFERSIZE with 128. Consider the following declaration:

int buffer[BUFFERSIZE] = { 0 };

After preprocessing, the output will be:

int buffer[128] = { 0 };

This preprocessed output will then go as input to the compilation phase.

Conditional inclusion (#if #endif, #ifdef #endif, #ifndef #endif)

As the name suggests, conditional inclusion directives include or exclude pieces of code from compilation, depending on the specified condition. For example:

#if DEBUG_LEVEL == 3
    PRINT_LOG("Call failed. Counter dump:\n");
    for(int i=0; i<10; i++) {
        printf("counter[%d] = %d\n", i, counter[i]);
    }
#else
    PRINT_LOG("Call failed\n");
#endif

In the code above, if the macro DEBUG_LEVEL expands to 3, the lines between #if and #else are compiled; otherwise the lines between #else and #endif are compiled.

Similarly, #ifdef and #ifndef include or exclude code depending on whether a macro is defined. For example:

#ifdef DEBUG_MODE
    printf("Cumulative sum in iteration %d is %d\n", i, sum);
#endif

In the example above, if DEBUG_MODE is defined, irrespective of its value, the code between #ifdef and #endif is included in compilation.

#ifndef BUFFER_LENGTH
#define BUFFER_LENGTH 128
#endif

In the example above, if BUFFER_LENGTH is NOT defined, the #define between #ifndef and #endif is processed, which gives BUFFER_LENGTH a default value of 128.

Line control directive (#line)

#line lets you modify the line number and (optionally) the file name that the compiler reports in errors and warnings. Let’s create a file sample.c with the following code snippet:

01: #include <stdio.h>
02: int main(int argc, char** argv)
03: {
04:     #line 400 "abc.c"
05:     printf("Hello world")
06:     return 0;
07: }

As you can see, the ‘;’ is missing at the end of line 05, so compilation fails. The compiler reports the following error:

abc.c: In function ‘main’:
abc.c:401:2: error: expected ‘;’ before ‘return’

Note the following in the error above:

  1. The file name is reported as “abc.c”, not “sample.c”, the name of the file we created.
  2. The line number reported is 401, although sample.c has only seven lines.

Both effects are caused by the #line directive on line 04. Its first parameter sets the number of the line that follows the directive, so line 05 is treated as line 400 and line 06 as line 401. The error above is attached to the return statement on line 06 (the message says the ‘;’ is expected before ‘return’), which is why 401 is reported. The second, optional parameter sets the file name used when reporting errors and warnings. Whatever the real name of the file, the compiler refers to the code after line 04 as “abc.c”.

Error directive (#error)

The #error directive is used to report an error during preprocessing. When the preprocessor reaches a #error directive, it reports the rest of the line as an error message and halts compilation.

Let us consider an example where the code can be compiled for either Linux or Windows, depending on which of the macros LINUX_BLD and WIN_BLD is defined. If neither of them is defined, or both are, that is an error. The code to detect this can be written as follows:

#if defined LINUX_BLD && defined WIN_BLD
#error You cannot build Linux and Windows build at same time
#endif

#if !defined LINUX_BLD && !defined WIN_BLD
#error You must specify either Linux or Windows build
#endif

In either case, compilation stops and the corresponding error message is printed to the console.

Predefined macros

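C comes with some predefined macros ready for us to use. Two of the most commonly used are __FILE__ and __LINE__, which expand to the current file name and line number respectively. They are often paired with __FUNCTION__, which gives the name of the enclosing function. __FUNCTION__ is a compiler extension (supported by GCC and MSVC, among others); the standard C99 equivalent is the predefined identifier __func__. Here is an example using all three.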

#define PRINT_LOCATION printf("Control is at %s %s %d", __FILE__, __FUNCTION__, __LINE__)

Now whenever we use PRINT_LOCATION in our code, it prints the file name, function name, and line number of that location.

In real-world projects, the code is split across many C source files, so log messages are more useful when they include the exact location they were printed from. Let us put our macro knowledge to use and define a macro that does this.

#define PRINT_LOG(MSG)  printf("%s: %s: %d: %s", __FILE__, __FUNCTION__, __LINE__, MSG)