CMPSC 311, Introduction to Systems Programming

Background Information for Programming Projects


We start with some background information about C programs on Unix, including a full example.  Some of this will be review, and some of this will be new information.

For information on remote access and editors, see the General Instructions.  There will be examples of compiler commands later.


Standards.  The POSIX standard describes the C language interface to a Unix-like system, and it often defers to the ANSI/ISO standard for C.  This describes how system and library functions should behave.  The on-line documentation for each particular operating system should indicate if there are any discrepancies.  Links to the POSIX standard are given on the General Instructions page.  Most Linux distributions now include various sections of the standard in their man pages.

Terminology.  In the POSIX standard (IEEE Std 1003.1-2008), there are some important definitions:

The commonly used Unix and Linux command shells are sh (the Bourne shell, standardized by POSIX), csh (the C shell), tcsh (the Tenex C shell, another version of csh), ksh (the Korn shell), and bash (the Bourne-again shell, from GNU).  You can see which shell you are using with a command like "echo $SHELL", as will be seen later.

Notation.  The notation fork(2) means that the appropriate man page is in section 2 ("man -s 2 fork" for this one page, or "man -a fork" to see everything).  The section numbers used here refer to Solaris, and may be different on Linux or other versions of Unix.

man pages.  (manual pages)  These are written for people who have been using Unix already, and beginners often find them inadequate.  Try the GNU documentation or one of the textbooks if the man page is too confusing.  The command syntax for the man command differs between systems.  For example, use "man -s 2 fork" on Solaris or Linux, but "man 2 fork" on Mac OS X.  The apropos command is also useful.

One reason for using "man -a something" instead of "man something" is that you discover more information.  Another is that Unix systems are inconsistent about which information goes in which section, so what worked on one system might not work on another.

On the Web.  Here are some links to documentation related to program design.  You might find these more informative or more accessible than the Solaris man pages.  Examples are included.
    GNU C Library, The Basic Program/System Interface (read this along with the current discussion)
    GNU C Library, Signal Handling (read this later)
    GNU C Library, Processes (read this later)

On-Line.  The GNU C library information can also be accessed from Linux or the CSE Sun systems with the commands
    info libc "program basics"
    info libc "signal handling"
    info libc processes
A quick guide to info is available.   (Exercise: how many command-line arguments are there in each of these three examples?)

Command-line structure.  More information is available in intro(1).  The simplest form of a Unix command is to name a program to run, and give it some arguments to use in the same way that a function uses parameters.  When the program runs, the command-line arguments are given to it as an array of character strings.  Arguments are separated into options and operands; options can be simple or have an option-argument.  All the options should precede the operands.  The general form is
    utility_name [options] [operands]
The utility name is expected to refer to an executable file, or a command that is built into the shell.  The brackets indicate that something is optional.  For example,
    ls -l -t -r foobar
will list directory information about the file system entry foobar, which can be a file or directory.  The options all start with "-" and are one letter.  The simple options can be combined, so that
    ls -ltr foobar
has the same effect.  In most cases the order of the options does not matter.  Another example is
    cc -v -o bgi bgi.c
The option -v is simple, the option -o has one option-argument (bgi), and the command has one operand (bgi.c).  The parts of a command line are separated by "white space", which is some number of spaces, tabs and escaped-newlines (backslash followed by return).  If an option has more than one option-argument, they are separated by commas; this is used infrequently.  The man page for each utility will describe how it treats repeated options, or mutually-exclusive options.  The getopt(3C) function is the standard mechanism to separate options and operands.  Examples of the use of getopt() can be found on its man page and in the bgi.n.c programs to follow.  The GNU C library has two other mechanisms, getopt_long() and argp_parse(), which you could use after gaining experience with getopt(), but they are not part of the Sun or POSIX libraries.
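As a rough illustration of how getopt() separates the parts of the cc example above, here is a minimal sketch.  The struct cmdline and the function parse_cmdline() are hypothetical names invented for this illustration; they are not taken from the bgi programs.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations (getopt) */
#include <assert.h>
#include <unistd.h>

/* Hypothetical record of what was found on the command line. */
struct cmdline {
    int verbose;          /* nonzero if the simple option -v was given */
    const char *outfile;  /* option-argument of -o, or NULL if -o was absent */
    int first_operand;    /* index in argv[] of the first operand */
};

/* Scan options in the style of "cc -v -o bgi bgi.c".
 * Returns 0 on success, -1 if getopt() reported a problem. */
int parse_cmdline(int argc, char *argv[], struct cmdline *cl)
{
    int c;

    assert(argv != NULL && cl != NULL);
    cl->verbose = 0;
    cl->outfile = NULL;

    /* "vo:" means: -v is simple, -o takes one option-argument */
    while ((c = getopt(argc, argv, "vo:")) != -1) {
        switch (c) {
        case 'v':
            cl->verbose = 1;
            break;
        case 'o':
            cl->outfile = optarg;   /* points into argv[], not a copy */
            break;
        default:                    /* '?': getopt() has printed a message */
            return -1;
        }
    }
    cl->first_operand = optind;     /* operands are argv[optind..argc-1] */
    return 0;
}
```

A main() would call parse_cmdline(argc, argv, &cl) once, and then loop over the operands from argv[cl.first_operand] up to argv[argc-1].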

Environment variables.  If the command line is like a function call with parameters, then the environment variables are like global variables and hidden parameters.  The shell maintains a set of strings of the form "name=value" which determine the environment.  As the shell starts a new program, it copies the environment variables to a place where the new program can find them.  The command printenv(1) will print the current set of environment variables known to the shell, and the set can be altered with the commands setenv and unsetenv (or export, declare, set and unset, depending on the shell).  On Solaris, see set(1).  If you already know the name of an environment variable, such as SHELL, then you can print its value with "printenv SHELL" or "echo $SHELL".

main() structure.  The usual way to start a C main program is with

int main(int argc, char *argv[]) { ... }
To get direct access to the environment variables, add the POSIX standard declaration
extern char **environ;
or use the nonstandard form
int main(int argc, char *argv[], char *envp[]) { ... }
or both.  environ and envp have essentially the same type, and have the same value when the program starts on Solaris and Linux.  The number of environment strings is not specified; start with envp[0] and continue until envp[i] is NULL, which indicates the end of the array.  The same applies to environ[0] and environ[j].  In general, it is better to use getenv(3C) to search for a specific environment variable, so the array envp or the pointer environ are not normally used explicitly.  Environment variables can be set with the library function putenv(3C).  putenv() could cause environ to change, to make room for new string pointers, so in general it is safer to use environ (whose value could change to reflect a new environment variable) than envp (whose value will not change).  Of course, if you really want to ignore changes to the environment as the program runs, then use envp.  Note that calls to putenv() affect only the current process, not the command shell that started the process.
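The NULL-terminated walk and the getenv() lookup described above might be sketched as follows.  The function names count_strings(), print_strings() and lookup() are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

extern char **environ;   /* the POSIX declaration mentioned in the text */

/* Count the strings in a NULL-terminated array such as environ or envp. */
size_t count_strings(char **list)
{
    size_t n = 0;

    assert(list != NULL);
    while (list[n] != NULL)     /* the NULL entry marks the end */
        n++;
    return n;
}

/* Print each "name=value" string on its own line. */
void print_strings(FILE *out, char **list)
{
    size_t i;

    assert(out != NULL && list != NULL);
    for (i = 0; list[i] != NULL; i++)
        fprintf(out, "%s\n", list[i]);
}

/* Look up one variable by name; getenv() returns NULL if it is not set,
 * in which case the caller's fallback string is returned instead. */
const char *lookup(const char *name, const char *fallback)
{
    const char *value;

    assert(name != NULL);
    value = getenv(name);
    return value != NULL ? value : fallback;
}
```

For example, print_strings(stdout, environ) would list the current environment, one string per line, and lookup("SHELL", "(not set)") would fetch a single value without touching environ directly.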

The command line is parsed by the shell into argv[0], ..., argv[argc-1], where argv[0] is the program name as the command was given.  Although the number of argument strings is specified by argc, it is also the case that argv[] has a null element argv[argc] to mark the end of the array, as with envp[] and environ[].  In most cases, it is more natural to iterate through argv[] by incrementing an index, comparing the index to argc.

The exit status of the program is the return value from main(), or the value supplied to the system function exit(3C).  Note that there is also a man page for exit(2).  There is some additional information in intro(1).  An exit status of 0 is interpreted as success, non-zero as failure.  In the context of shell scripts with loops and conditionals, this would be treated as true or false.

The following conditions should hold for main():

envp is equal to environ at the start of the program, but this could change if the program adds to its own environment.  In this case, envp remains the same, and environ changes.

There is no importance to the ordering of the environment strings.  Windows requires the environment strings to be sorted alphabetically, but Unix does not.

The POSIX standard includes two functions setenv(3) and unsetenv(3), which would be useful for a shell program.  These are implemented on Solaris 10, GNU/Linux and Mac OS X but not Solaris 9.
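Here is a small sketch of how setenv() might be used with text of the form "var=val".  The helper set_from_assignment() is a hypothetical name, not something from the bgi programs.  Unlike putenv(), which places the caller's string itself into the environment, setenv() copies the name and value, so the temporary buffer can be freed afterward.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations (setenv) */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: given "var=val", call setenv("var", "val", 1).
 * Returns 0 on success, -1 if the text is malformed or setenv() fails. */
int set_from_assignment(const char *assignment)
{
    const char *eq;
    char *name;
    size_t len;
    int rc;

    assert(assignment != NULL);
    eq = strchr(assignment, '=');
    if (eq == NULL || eq == assignment)
        return -1;                  /* no '=' at all, or an empty name */

    len = (size_t)(eq - assignment);
    name = malloc(len + 1);
    if (name == NULL)
        return -1;
    memcpy(name, assignment, len);
    name[len] = '\0';

    rc = setenv(name, eq + 1, 1);   /* 1 = overwrite an existing value */
    free(name);                     /* safe: setenv() made its own copies */
    return rc;
}
```

The matching removal is a single call, unsetenv("var"), whose int return value should be checked in the same way.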

General guidelines.

  1. the command line is parsed by the command shell into argv[0], ..., argv[argc-1]
  2. argv[0] is the program name as the command was given
  3. exit status 0 indicates success, nonzero indicates failure
  4. exit status for help is success
  5. exit status for error is failure
  6. normal output to stdout with printf(...)
  7. error messages to stderr with fprintf(stderr, ...)
  8. command-line argument processing with getopt(3C)
  9. scan through the argv[] array only once with getopt() (although this is not a completely firm rule)
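Guidelines 3 through 7 can be combined in one small function.  This is only a sketch; the name usage(), its is_error parameter, and the generic usage text are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Print a usage line.  Help that was asked for goes to stdout and counts
 * as success; a usage error goes to stderr and counts as failure.
 * Returns the exit status the caller should use. */
int usage(const char *progname, int is_error)
{
    FILE *out = is_error ? stderr : stdout;

    assert(progname != NULL);
    fprintf(out, "usage: %s [-h] [-v] [operands]\n", progname);
    return is_error ? EXIT_FAILURE : EXIT_SUCCESS;
}
```

In main(), a help option could then end with "return usage(argv[0], 0);" and an unrecognized option with "return usage(argv[0], 1);".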

Compile and test.  To verify that your program has no obvious flaws, compile it with commands that turn on extra warnings.  (The lint program is available only on Solaris.)

Using C89,
Sun's compiler   cc -v -O -o bgi bgi.c
GNU compiler     gcc -Wall -Wextra -O -o bgi bgi.c
code checker     lint bgi.c

Using C99,
Sun's compiler   c99 -v -O3 -o bgi bgi.c
GNU compiler     gcc -std=c99 -Wall -Wextra -O -o bgi bgi.c
code checker     lint -Xc99 bgi.c

The optimization flags (-O and -O3) can be omitted for this short example program.  On Sparc processors running Solaris with Sun's compiler (cc) the default is 32-bit addresses; you can request 64-bit addresses by adding the option -m64 or the option -xarch=generic64.

Some of what lint(1) complains about can be ignored.  Use "man -a lint" to see complete instructions.  Of course, you should read the man pages for cc and gcc to see what these additional options do for you.  Note that the -v option on cc does not mean "verbose", as it does with GCC.  Use "info gcc" for more complete information.

Run the program.  At the prompt from the command shell, type the name of the program followed by its options and operands.  If you get a response like "command not found", try using "./bgi" instead of "bgi".  This problem is connected to the path variable in the command shell.


Now we'll go through the development of a complete example.  Links to the code follow the specification and some output examples.  The program itself is intended to be more-or-less realistic, but it's still just an example.

The objective of this example is to write, test and understand a Unix program bgi [BackGround Information] in C with command-line arguments, in the style of a typical Unix utility, using the standard library functions getopt(), getenv(), putenv(), setenv(), unsetenv(), exit(), printf(), fprintf(), atoi().  The program should return with an exit status in a style consistent with other Unix commands, and you should provide a simple shell script to test the exit status, using an if-then-else-endif construct (the exact syntax depends on the shell you choose) and the echo command.  Most of the testing can be done interactively.  The program should print the following when the -h option is used (this is the typical form of a help option):

usage: bgi [-h] [-v] [-a] [-e] [-p] [-s var=val] [-t var] [-u var] [-x stat]
The options should behave as follows:
 
-h help, print a usage message to stdout
(some additional output could be provided, explaining the options)
-a print the command-line arguments, one per line
-e print the environment variables using environ, one per line, in the style
SOMETHING=something
-p print the environment variables using envp[], one per line, in the style
SOMETHING=something
-s var=val set the environment variable var to the value val
-t var print the value of the environment variable var, in the style
  var = "something"
or
  var: not found
-u var remove (unset) the environment variable var from the environment
-x stat exit with the given status
-v verbose mode,
print the addresses of argv[i] with the -a option
print the addresses of environ[i] with the -e option
print the addresses of envp[i] with the -p option

The options can occur in any order, and may be repeated.  If multiple options are used, the -t output should come before the -a output, which should come before the -p output, which should come before the -e output.  The -s, -u and -x options produce no output except perhaps an error message.  The -s and -u options affect only the current process, not the command shell that started it.  The last -x option takes effect; if there is none, then exit with status 0, for success.  The -x option does not cause an immediate exit, it just saves a status value for use later when the program would normally exit.  If some other option is used, or there is an option-argument missing, this is an error, and the program should print (to stderr) the help message or something equivalent.  Command-line operands are ignored except with the -a option.

The output from the command printenv should be identical to your output with the -e or -p option alone.

The exit status should be a byte-sized integer, between 0 and 255, but it might be interesting to avoid checking this condition, just to see what happens.  The value of the exit status can be printed from a shell script or interactively with a command sequence like

bgi -x 17
echo $?
for sh, ksh, tcsh and bash, or
bgi -x 17
echo $status
for csh, tcsh.  You should also try an if-then-else construct to see how the exit status can be used in a shell script (there is an example shortly).
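If you do decide to check the range, note that atoi() gives no indication of bad input.  One possible alternative, sketched here under the assumption that anything outside 0 to 255 should be rejected, uses strtol().  The name parse_status() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Convert the option-argument of -x to an exit status.
 * Returns 0 and stores the value on success; returns -1 if the text is
 * empty, has trailing junk, or lies outside the range 0..255. */
int parse_status(const char *text, int *status)
{
    char *end;
    long value;

    assert(text != NULL && status != NULL);
    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;                  /* no digits, or extra characters */
    if (errno == ERANGE || value < 0 || value > 255)
        return -1;                  /* overflow, or not byte-sized */
    *status = (int)value;
    return 0;
}
```

Skipping the range test (as the text suggests trying) only requires removing the comparison with 0 and 255.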

Before reading further, try sketching the design of this program.

Here is an example using Solaris, where '%' is the shell's prompt.  The commands have been highlighted.  The addresses might be different when you try it.

% bgi -a -t USER -t SHELL
USER = "dheller"
SHELL = "/bin/tcsh"
argc = 6
bgi
-a
-t
USER
-t
SHELL

% bgi -v -a -t USER -t SHELL
USER = "dheller"
SHELL = "/bin/tcsh"
address   ffbff0a4: argc = 7
address   ffbff0a8: argv =   ffbff0c4
  address   ffbff0c4: argv[0] =   ffbff220 --> "bgi"
  address   ffbff0c8: argv[1] =   ffbff224 --> "-v"
  address   ffbff0cc: argv[2] =   ffbff227 --> "-a"
  address   ffbff0d0: argv[3] =   ffbff22a --> "-t"
  address   ffbff0d4: argv[4] =   ffbff22d --> "USER"
  address   ffbff0d8: argv[5] =   ffbff232 --> "-t"
  address   ffbff0dc: argv[6] =   ffbff235 --> "SHELL"
% bgi -v -z
bgi: illegal option -- z
bgi: invalid option 'z'
usage: bgi [-h] [-v] [-a] [-e] [-p] [-s var=val] [-t var] [-u var] [-x stat]
  -h           print help
  -v           verbose mode
  -a           print argc and argv[]
  -e           print environ[]
  -p           print envp[]
  -s var=val   set environment variable var to value val
  -t var       print value of environment variable var
  -u var       unset environment variable var
  -x stat      set exit status to stat
% bgi -a blat -t USER
argc = 5
bgi
-a
blat
-t
USER

Here is the same example on Mac OS X.  The prompt is '$'.
$ bgi -a -t USER -t SHELL
USER = "dheller"
SHELL = "/bin/bash"
argc = 6
bgi
-a
-t
USER
-t
SHELL

$ bgi -v -a -t USER -t SHELL
USER = "dheller"
SHELL = "/bin/bash"
address 0xbffffd58: argc = 7
address 0xbffffd5c: argv = 0xbffffdf8
  address 0xbffffdf8: argv[0] = 0xbffffe64 --> "bgi"
  address 0xbffffdfc: argv[1] = 0xbffffe68 --> "-v"
  address 0xbffffe00: argv[2] = 0xbffffe6b --> "-a"
  address 0xbffffe04: argv[3] = 0xbffffe6e --> "-t"
  address 0xbffffe08: argv[4] = 0xbffffe71 --> "USER"
  address 0xbffffe0c: argv[5] = 0xbffffe76 --> "-t"
  address 0xbffffe10: argv[6] = 0xbffffe79 --> "SHELL"
$ bgi -v -z
bgi: illegal option -- z
bgi: invalid option 'z'
usage: bgi [-h] [-v] [-a] [-e] [-p] [-s var=val] [-t var] [-u var] [-x stat]
  -h           print help
  -v           verbose mode
  -a           print argc and argv[]
  -e           print environ[]
  -p           print envp[]
  -s var=val   set environment variable var to value val
  -t var       print value of environment variable var
  -u var       unset environment variable var
  -x stat      set exit status to stat
$ bgi -a blat -t USER
argc = 5
bgi
-a
blat
-t
USER

Here is a sequence of initial versions of the program, with extensive comments, which you can use to get started.  The fourth version was used to generate the output shown above.  The fifth version cleans up some details, and completes this example.
bgi.1.c
bgi.2.c
bgi.3.c   bgi.3a.c (one character different)
bgi.4.c
bgi.5.c
bgi.sh (shell script, Solaris only)
bgi.gcc.sh (shell script, Solaris, Linux or Mac OS X)
Use the browser to save these files - copy and paste is sometimes inaccurate, and retyping is just a waste of time.


A reasonable order in which to read the Solaris man pages for this example (and what to look for) is

   man -s 1 intro
     introduction to commands and application programs,
     manual page organization,
     command syntax standard, diagnostics (exit status), list of commands
   man -s 2 intro
     introduction to system calls and error numbers,
     errno.h, errno, various definitions, list of system functions
   man -s 3 intro
     introduction to functions and libraries,
     include files, interfaces and headers, definitions,
     standard C library, memory allocators, networking, etc.
   man -s 4 intro
     introduction to file formats
   man -s 5 intro
     introduction to miscellany,
     standards, environment, etc.

   man -s 3C exit
     stdlib.h, exit(), EXIT_SUCCESS, EXIT_FAILURE
   man -s 2 exit
     most of this information will be relevant later
   man -s 3C stdio
     stdio.h, stdin, stdout, stderr, EOF (end of file), NULL (null pointer)
   man -s 3C printf
     printf(), fprintf(), format conversions, examples

   man -s 3C getopt
     getopt(), relevant example

   man -s 3C atoi
     string to int conversion, but without error checking

   man -s 5 environ
     user environment, environment strings
     try the command printenv first
     most of the specific environment information is not needed here
   man -s 3C getenv
     getenv()
   man -s 3C putenv
     putenv(), heed the warning!
   man printenv
   which printenv
     printenv is built into the shell tcsh, but is a separate utility under csh or sh or ksh or bash
   man -s 1 set
     setenv, unsetenv, etc.
   man tcsh
     or, man csh, depending on which shell you are using
   man sh
     the Bourne shell, adopted by POSIX,
     because there is less reading to reach what you need here (the if-then-else)

The command which printenv will be useful to see whether the command printenv is built into the shell (tcsh) or is a separate command (sh, csh, ksh, bash).


Here are some test cases to try.  If you already have files named T1 and T2, and you don't want to destroy them, then pick some other names.

bgi
bgi -h
    (does the message go to stdout or stderr?)
bgi -g
    (same question)
bgi -x 0
bgi -x 1
    (the shell script will be needed to see any difference)
bgi -a
bgi -e
bgi -p
bgi -p > T1
printenv > T2
diff T1 T2
    (there should be no difference, but there is using the bash shell)
bgi -t USER -t GROUP -t SHELL
    (also try this as a command in a make file, or with another shell)
bgi -t LC -t USER -a
bgi -t LC -a -t USER
setenv foo 13
bgi -t foo
unsetenv foo
bgi -t foo
Some experiments with getopt():  (digit 1, not letter l)
bgi -v x 1
bgi -v -x 1
bgi -v -x1
bgi -vx1
bgi -av
bgi -va
Here is another test script using the sh shell.  If you already have files named e, p, p2, then pick some other names.
# test bgi.c with Sun and GNU C compilers

for compile in "cc -v" "gcc -Wall -Wextra"
do
  echo compiling with $compile
  rm -f bgi
  if $compile -o bgi bgi.c
  then
    echo testing -s option
    bgi -t foo
    bgi -t foo -s foo=bar
    bgi -t foo -s foo=bar -t foo
    echo testing -e and -p options
    bgi -e > e
    bgi -p > p
    printenv > p2
    echo compare -e and -p
    diff e p
    echo compare -p and printenv
    diff p p2
    rm e p p2
    echo compare -e and -p after -s
    bgi -s foo=bar -e > e
    bgi -s foo=bar -p > p
    diff e p
    rm e p
    echo done
  else
    echo
    echo compile failed, test abandoned
  fi
  echo
done


Some additional notes.

When writing any C program, check all the system functions and library functions to verify that they are being used correctly, and that possible runtime errors are detected.  No one bothers checking to see if printf() or fprintf() fails - what would you do about it, anyway?

Some versions of printf() don't mind character string pointers that are NULL, and some will choke.  You might need to check the value of a pointer before passing it to printf().  The runtime error messages "segmentation fault" and "bus error" usually mean "bad pointer".  Sun's position is that the C programming language standard says that the argument for a %s specifier shall be a pointer to an array of characters, and while NULL is indeed a pointer it clearly does not qualify as a pointer to an array of characters.  The GNU C library will accept a NULL string pointer for printf(), but usually not for other functions.  Would you rather find your bugs sooner or later?
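A portable way to avoid the problem is to test the pointer before it reaches printf().  The following sketch prints a value in the style used by the -t option; the function name print_value() is an assumption made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Print name and value, where value may be NULL (for example, the result
 * of getenv() for a variable that is not set).
 * Returns 1 if a value was printed, 0 if it was reported as not found. */
int print_value(FILE *out, const char *name, const char *value)
{
    assert(out != NULL && name != NULL);
    if (value == NULL) {
        fprintf(out, "%s: not found\n", name);  /* NULL never reaches %s */
        return 0;
    }
    fprintf(out, "%s = \"%s\"\n", name, value);
    return 1;
}
```

A typical call would be print_value(stdout, name, getenv(name)).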

Most Unix utilities accept "--" (two hyphens) as a separator between the command-line options and operands, in case one of the operands starts with "-".  The operand "-" usually refers to stdin.  What does getopt() do about these special cases?
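One way to explore that question is to run getopt() to completion and see where optind ends up.  This is a minimal probe, with the hypothetical name first_operand_index(); it ignores the options themselves.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations (getopt) */
#include <assert.h>
#include <unistd.h>

/* Run getopt() until it reports the end of the options, and return the
 * index of the first argv[] element that it left unprocessed. */
int first_operand_index(int argc, char *argv[], const char *optstring)
{
    assert(argv != NULL && optstring != NULL);
    while (getopt(argc, argv, optstring) != -1)
        continue;       /* the options are discarded in this sketch */
    return optind;
}
```

Printing the returned index for command lines such as "prog -a -- -b" and "prog -a - -b" (with optstring "ab") shows how each special case is treated on your system.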

Unix has been historically inconsistent about whether help and error messages go to stdout or stderr.  The current standard is help to stdout, error messages to stderr.

Single-character constants like 'c' have type int in C, not type char as in C++.  Usually this does not matter because of the type conversion rules.
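The difference can be observed with sizeof.  This is a small sketch; the function name report_sizes() is made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Print the sizes of a character constant, a char object, and an int.
 * Returns nonzero if the constant 'c' has the size of an int, as it
 * does in C.  (Compiled as C++, sizeof('c') would be 1.) */
int report_sizes(FILE *out)
{
    char c = 'c';

    assert(out != NULL);
    fprintf(out, "sizeof('c') = %lu, sizeof(c) = %lu, sizeof(int) = %lu\n",
            (unsigned long)sizeof('c'),
            (unsigned long)sizeof(c),
            (unsigned long)sizeof(int));
    return sizeof('c') == sizeof(int);
}
```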

Why is the return value from getopt() an int and not a char?  getopt() returns either a character (normal result) or the int value -1 (abnormal result or end of the options).  The dual use of return values is often described as a design flaw, and it is a common problem in the C libraries.  The global variables optarg, optind, opterr, optopt simplify the interface to getopt(), as they might have been parameters to a more complex function.  If there is a problem with one of the options, getopt() will print a message to stderr unless you disable this feature by first setting opterr to 0.  optopt and optind are used to see which option and command-line argument you are currently referring to.  This is necessary because some options can be combined into the same argument.
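Here is a sketch of how the int return value and optopt can be used together.  It assumes an option string that begins with ':', which asks getopt() to stay silent and to return ':' for a missing option-argument and '?' for an unknown option.  The enum and the function scan_options() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations (getopt) */
#include <assert.h>
#include <unistd.h>

enum scan_result { SCAN_OK, SCAN_UNKNOWN_OPTION, SCAN_MISSING_ARGUMENT };

/* Scan for -v (simple) and -x stat (one option-argument).
 * On a problem, store the offending option letter in *bad_option. */
enum scan_result scan_options(int argc, char *argv[], int *bad_option)
{
    int c;      /* int, not char, so that -1 can be told apart */

    assert(argv != NULL && bad_option != NULL);
    while ((c = getopt(argc, argv, ":vx:")) != -1) {
        switch (c) {
        case 'v':
        case 'x':
            break;                      /* a real program would act here */
        case ':':                       /* option-argument is missing */
            *bad_option = optopt;
            return SCAN_MISSING_ARGUMENT;
        case '?':                       /* option letter not recognized */
            *bad_option = optopt;
            return SCAN_UNKNOWN_OPTION;
        default:
            break;
        }
    }
    return SCAN_OK;
}
```

Because getopt() prints nothing in this mode, the caller is responsible for writing its own message to stderr.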


GNU's Not Unix.  No kidding.

The GNU/Linux getopt(3) function differs from the Solaris getopt(3C) function.  Here is the earlier example, for comparison, but now run on Linux.  The commands have been highlighted.

$ ./bgi -a -t USER -t SHELL
USER = "dheller"
SHELL = "/bin/tcsh"
argc = 6
./bgi
-a
-t
USER
-t
SHELL

$ ./bgi -v -a -t USER -t SHELL
USER = "dheller"
SHELL = "/bin/tcsh"
address 0xbfe301e0: argc = 7
address 0xbfe301e4: argv = 0xbfe30264
  address 0xbfe30264: argv[0] = 0xbff13b05 --> "./bgi"
  address 0xbfe30268: argv[1] = 0xbff13b0b --> "-v"
  address 0xbfe3026c: argv[2] = 0xbff13b0e --> "-a"
  address 0xbfe30270: argv[3] = 0xbff13b11 --> "-t"
  address 0xbfe30274: argv[4] = 0xbff13b14 --> "USER"
  address 0xbfe30278: argv[5] = 0xbff13b19 --> "-t"
  address 0xbfe3027c: argv[6] = 0xbff13b1c --> "SHELL"

$ ./bgi -v -z
./bgi: invalid option -- z
./bgi: invalid option 'z'
usage: ./bgi [-h] [-v] [-a] [-e] [-p] [-s var=val] [-t var] [-u var] [-x stat]
  -h           print help
  -v           verbose mode
  -a           print argc and argv[]
  -e           print environ[]
  -p           print envp[]
  -s var=val   set environment variable var to value val
  -t var       print value of environment variable var
  -u var       unset environment variable var
  -x stat      set exit status to stat

$ ./bgi -a blat -t USER
USER = "dheller"
argc = 5
./bgi
-a
-t
USER
blat

Note from the last example that the GNU/Linux getopt() permutes the argv[] array.  This is non-standard behavior.  One way to force the standard behavior is to define the environment variable POSIXLY_CORRECT (if you believe one part of the documentation) or _POSIX_OPTION_ORDER (if you believe another part of the documentation).  The bash commands would be
declare -x POSIXLY_CORRECT
declare -x _POSIX_OPTION_ORDER
The csh or tcsh commands would be
setenv POSIXLY_CORRECT
setenv _POSIX_OPTION_ORDER
In case you're wondering, POSIXLY_CORRECT is the right one to use, but you could also insert the character '+' at the head of getopt()'s third argument (this is non-standard).  However, that defeats the use of ':' as the first character, which makes it easier to detect missing option-arguments.  So much for consistent and orthogonal extensions to the standard.
The GNU Coding Standards page has extensive commentary on the design of command-line interfaces, and advice on the use of C.  You should read it sometime.  [The term defun is used in this document, without defining it.  This is a mechanism for defining functions in the programming language Lisp, which underlies Emacs and some other GNU tools.]


There are a few other incompatibilities that arise with the bgi program.

The GNU/Linux putenv() function can remove something from the environment.  Try this on Solaris, Linux and Mac OS X:

gcc -Wall -Wextra -o bgi bgi.4.c

bgi -s a=A -t a -s a -t a

Solaris
a = "A"
a = "A"

Linux
a = "A"
a: not found

Mac OS X
a = "A"
bgi: putenv(a) failed: Invalid argument
a = "A"

The unsetenv() function returns an int, according to the POSIX standard.  On Mac OS X the compiler formerly complained

% gcc -Wall -Wextra -o bgi bgi.4.c
bgi.4.c: In function 'main':
bgi.4.c:217: error: void value not ignored as it ought to be

This was because the implementation of unsetenv() had a void return; the man page for unsetenv() notes this.  The standard implementation is used on Linux, but the man page is wrong (if you are feeling adventuresome, compare man -a unsetenv and the file /usr/include/stdlib.h).  The cure for the Mac OS X problem was simply to insist on the standard version:

gcc -std=c99 -D_POSIX_C_SOURCE=200112L -D_XOPEN_SOURCE=600 -Wall -Wextra -o bgi bgi.4.c

but that is no longer necessary.


Q&A

Q.  From bgi.3.c,
     printf("  address %10p: argv[%d] = %10p --> \"%s\"\n",
          &argv[i], i, argv[i], argv[i]);

Could you explain why &argv[i] and argv[i] give two different addresses?  Let me make a guess: &argv[i] gives the address of argv since by default a pointer of char array always points to the first element so &argv[i] gives the current address of the pointer argv? and argv[i] always gives the address of where argv[i] is stored?

A.  argv has type char *[] so each of its elements has type char *.  argv[i] has type char *, and its value points to some character string (or is NULL).  The character string is an array of char elements in memory.  &argv[i] is the address of argv[i], and has type char **.  The operator precedence of C and C++ causes [] to be evaluated before &.

The best way to answer this kind of question is to start drawing boxes for memory locations and arrows from one box to another when you have a pointer.  It's kinda hard to do in plain text.

For this particular case we have 

argv ---> argv[0] ---> "string"
          argv[1] ---> "another string" 
          ... 
          argv[argc] == NULL

 
Actually, argv[0] points to argv[0][0], the value of argv[0][0] is 's', the value of argv[0][1] is 't', etc.  And, the argv array should go upward, since argv[1] is at a higher address than argv[0].
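To see the picture on your own system, a self-contained version of the printing loop might look like this.  It is a sketch in the spirit of the bgi.3.c fragment quoted in the question, not a copy of it; the function name show_argv() is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* For each argument, print where the pointer is stored (&argv[i]),
 * the pointer's value (argv[i]), and the string it points to.
 * Returns the number of lines printed. */
int show_argv(FILE *out, int argc, char *argv[])
{
    int i;

    assert(out != NULL && argv != NULL);
    for (i = 0; i < argc; i++) {
        /* %p expects a void *, so both pointers are cast explicitly */
        fprintf(out, "  address %10p: argv[%d] = %10p --> \"%s\"\n",
                (void *)&argv[i], i, (void *)argv[i], argv[i]);
    }
    return i;
}
```

The first column advances by sizeof(char *) from one line to the next, since the pointers are stored in consecutive array elements; the second column advances by the length of each string only if the strings happen to be stored back to back.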


Last revised 31 Jan. 2013