CMPSC 311, Introduction to Systems Programming
The World's Most Boring C Program
Remember the course mantra: system programming requires you to be
aware of the resources that your program uses.
This is a lesson about how characters are represented in C, how
character strings are stored in C, and how to compile your C programs,
even if that's not what it looks like. There are some exercises
at the end; the one marked *** is essential to see if you understand
what's going wrong with this program.
Reading
- A simple explanation of printf()
- CP:AMA, Sec. 3.1, The printf Function; Sec. 7.3, Character Types;
Sec. 13.1, String Literals; Sec. 13.2, String Variables; Sec. 22.3,
Formatted I/O; Appendix E, ASCII Character Set.
- Elementary examples of printf() are in Ch. 2, pp. 14-15, 19-20, 22.
- C:ARM, Sec. 2.7.3, Character Constants; Sec. 2.7.4, String
Constants; Sec. 2.7.5, Escape Characters; Sec. 2.7.6, Character Escape
Codes; Sec. 2.7.7, Numeric Escape Codes; Sec. 15.11, fprintf, ...;
Appendix A, The ASCII Character Set
- On any Unix-like system, man ascii and man -a printf
(there's a printf command in most shells, as well as the
function in the C Standard Library)
Source code
boring.c
- This program writes byte values from 0 to 127, using printf()
and numeric escape codes.
write-boring.c
- This program writes boring.c.
Makefile
- This file is used with the make program to control compilation of
write-boring and boring, as well as some other useful tasks.
- Note that Makefile has tab characters in it. If you copy and paste
its contents from the web page, you run the risk of turning the tabs
into spaces. It's better to use control-click to download the files
directly, if you plan to run them on your own system.
Test systems
- Mac OS X (iMac), Intel Core i7 processor, gcc version 4.2.1
- Mac OS X (MacPro), G5 processor (a PowerPC), gcc version 4.0.1
- Linux, AMD Athlon processor, gcc version 3.4.6
- Solaris, Sparc processor, gcc version 3.4.3, Sun Studio 11
Compilation results, warnings turned off by default
- GCC
gcc -std=c99 -o boring boring.c
- Mac OS X, Linux, Solaris
- Sun's C compiler
c99 -o boring boring.c
- Solaris
Compilation results, warnings turned on
- GCC
gcc -std=c99 -Wall -Wextra -o boring boring.c
- Mac OS X, Linux, Solaris
- We had to change some of the quote characters produced on the
Mac, so they would display cleanly on a generic Web browser; they look
fine on the screen in the original.
- Sun's C compiler
c99 -v -o boring boring.c
- Solaris
- Sun's lint program
Compilation results, assembly language output, file boring.s
Execution output
- There are numerous "non-printing" characters in the program
output, which means they will be displayed on the screen in various
ways, depending on which tools you are using. If you want to try
various cases, compare
- logged in at the system console without a window manager
- logged in at the system console with a window manager and a
terminal window (this is how you normally work locally)
- logged in remotely with a terminal window (this is how you
normally work remotely)
- viewing the saved output with a web browser, under the file
name execution-output (this is treated as a "binary file", so most
browsers will prompt for download, but some will download it without
asking)
- viewing the saved output with a web browser, under the file
name execution-output.txt (same file contents, just a different name,
so the browser might treat it differently)
- To make this even worse, try about five different web browsers on
various systems.
- Only Safari (Mac OS X) and Internet Explorer 8 (Windows 7)
displayed the .txt file without complaint, but they displayed it
differently. Firefox and SeaMonkey prompt for download.
TextEdit (Mac OS X) and Notepad (Windows 7) will open the downloaded
file without problem. Mozilla (Solaris, JDS) prompts for gedit
to open the .txt file, and this actually does a good job.
Execution output -- count characters with wc, print characters with od
./boring | wc
./boring | od -c
./boring | od -t a
- The program names wc and od are short for "word count" and
"octal dump", but they are actually more versatile than the names
suggest.
- The vertical bar character |
tells the command interpreter to connect the output of the first
program to the input of the second program. This is called a command pipeline.
- The ./ prefix to the program name boring tells the shell to look
for the executable file in the current directory, rather than
searching the directories listed in PATH.
- We'll get to more details of Unix commands as the course
progresses.
- Mac OS X, iMac or MacPro
- The od spacing is different, but this is inconsequential.
- od -c reports \012 as nl (newline).
- Linux
- wc counts the number of words differently, but the word count is not
well-determined for this example, since there is no terminating
whitespace character. Unix text files are expected to end with a
newline character, which doesn't happen here.
- od -c reports \012 as nl (newline).
- Solaris, gcc or c99
- Two escape sequences are not recognized by od -c (\a and \v).
This problem goes away with the command od -t c.
- od -c reports \012 as lf (linefeed).
OK, Unix is supposed to be standardized, so why are there
differences in the behavior of the wc and od
programs between these three systems?
- The standard for Unix defines the interface to programs in
general, and to some programs in particular (wc and od, in this
case). The standard specifies many aspects of a program's behavior,
but not every aspect of its behavior.
For example, od -c is allowed to report \012
as either nl (newline) or lf (linefeed).
- The other differences mentioned above are (in my opinion) bugs,
but they are not officially bugs.
Exercises
1. Why does the program boring make a noise when you run it?
(If it doesn't, it should.)
2. *** The program boring.c is supposed to output 128 characters, but
the boring | wc and boring | od -t a pipelines report only 126
characters.
- Which characters are missing, and why?
- Did you need to run the program to figure this out?
- What could you have done before running the program that would
have suggested there's a problem?
- Do you promise to do that before running every other program
you write?
3. The output from the boring | od -c pipeline seems to be missing
three characters, not just two. Why?
4. Try these changes to the program, and explain what happens
differently.
- Check the return value of printf()
- Use putchar() instead of printf()
- In write-boring.c, reduce the number of calls to printf()
to a minimum.
- Write a new version of boring.c that really does
write all the byte values from 0 to 127. It should produce no
complaints when the compiler warning flags are turned on. It
should "pass the test" as far as wc and od are
concerned, as we tried to do above.
- On Linux or Mac OS X, test the output using hexdump
instead of od.
Last revised, 7 Jan. 2012