CMPSC 311, Introduction to Systems Programming
How are functions implemented in C and C++?
Remember the course mantra: System programming requires you to be
aware of the resources that your program uses.
This is a lesson about reading assembly code (which may also include
guessing what it means, if you've never seen anything like this
before), and how functions are translated to assembly code,
starting from C and C++. The "resources" here are the
names known to the program. We are concerned about function
overloading and default arguments in C++, which affect how function
names are constructed.
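To make "function overloading" concrete before reading any assembly code, here is a minimal sketch. It is not the course's foo.cpp; the function bodies and the helper names use_int and use_double are made up for illustration.

```cpp
// Hypothetical sketch, not the course's foo.cpp: three functions that
// share the name foo and differ only in their parameter lists.
int    foo(void)     { return 0; }
int    foo(int i)    { return i + 1; }
double foo(double x) { return x * 2.0; }

// The compiler chooses among the three from the argument types
// that appear at each call.
int    use_int(void)    { return foo(41);  }  // selects foo(int)
double use_double(void) { return foo(1.5); }  // selects foo(double)
```

Compiling a file like this with g++ -S gives a .s file that you can search with the commands in the exercises below. A C compiler should reject it, since C doesn't allow two functions with the same name.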
There are some exercises at the end; the ones marked *** are essential to
see if you understand what's going on with these programs.
Reading
- CS:APP
  - Sec. 1.2, Programs Are Translated by Other Programs into
    Different Forms
  - Sec. 1.3, It Pays to Understand How Compilation Systems Work
  - Sec. 1.4, Processors Read and Interpret Instructions Stored in
    Memory
  - Ch. 3, Machine-Level Representation of Programs, for the
    Intel/AMD x86 instruction set architecture
- Patterson & Hennessy, Computer Organization and Design, for the
  MIPS instruction set architecture; this is the textbook for
  CMPEN 331/431
- CP:AMA, Ch. 9, Functions
- Any good textbook on C++
Source code
foo.c
- one function named foo
- C doesn't allow function overloading or default arguments
foo.cpp
- three functions named foo
- these use function overloading
bar.cpp
- one function named bar with a default argument (see Exercise 5,
  below)
Makefile
- This file is used with the make program to control compilation of
  foo.c, etc., as well as some other useful tasks.
- Note that Makefile has tab characters in it. If you copy and paste
  its contents from the web page, you run the risk of turning the
  tabs into spaces. It's better to use control-click to download the
  files directly, if you plan to run them on your own system.
Test systems
imc   Mac OS X (iMac), Intel Core i7 processor, GCC version 4.2.1
- This almost corresponds to the explanations in CS:APP.
pmc   Mac OS X (MacPro), G5 processor (a PowerPC), GCC version 4.0.1
- The PowerPC instruction set looks kinda-sorta-somewhat like the
  MIPS instruction set.
lnx   Linux, AMD Athlon processor, GCC version 3.4.6
- This corresponds to the explanations in CS:APP. See Sec. 3.7.2
  for the function call and return instructions.
sun   Solaris, Sparc processor, GCC version 3.4.3, Sun Studio 11
- The Sparc instruction set looks kinda-sorta-somewhat like the
  MIPS instruction set.
Compilers
gcc   GNU C compiler, C89 or C99
g++   GNU C++ compiler
cc    Sun C compiler, C89 (cc is lowercase)
c99   Sun C compiler, C99
CC    Sun C++ compiler (CC is uppercase)
Compiler options used here (see Makefile for the commands)
-S         compile, but only generate assembly language output,
           directed to a .s file (note "dash capital S")
-O         compile with optimization turned on, which usually
           results in a shorter and faster assembly language
           version (note "dash capital O")
-O 3       compile with optimization level 3 (Sun's compiler
           requires a specific level, while GCC -O defaults to
           level 1)
-o f       direct the output to file f (note "dash lowercase o")
-std=c99   (GCC only) compile according to the C99 standard; the
           default is C89
Assembly Language output -- various systems and compilers,
optimized (.1) or not (.0)
Exercises
1. Read through a few of the .s files above. Which lines refer to
the following features? (At this point in the course, and with these
examples, simply looking for patterns is probably the right approach.
For the Intel and AMD processors, CS:APP Ch. 3 explains the
instructions and more.)
- comments
  - OK, this was a trick question. GCC doesn't produce comments in
    the .s files, at least not the way we generated them. So, why
    not?
- function names
  - Since a function is compiled into a sequence of assembly
    language instructions, you need a label to indicate the first
    instruction to be executed when the function starts. In some
    instances, there are additional labels.
- assembly language instructions (in general)
  - Look for things like add, move, load, store.
- assembly language function return instructions (in particular)
  - There are no function call instructions in these examples.
2. Assemblers work by writing bytes into various sections of an
object file. Object files are linked together to form an executable
file. Loaders work by writing bytes from an executable file into
various sections of memory. The assembly language (in the .s files)
specifies which section to write into next by using directives that
look like .section. Use the fgrep command to extract all lines of
the .s files in which ".section" appears. Explain what you see.
We haven't covered the grep command family yet, so here's the command
    fgrep .section */*.s
3. Use the grep command to extract all lines of the .s files in
which "foo" appears. Explain what you see.
We haven't covered the grep command family yet, so here's the command
    grep foo */*.s
4. *** Use the egrep command to extract only those lines of the .s
files in which "foo" and ":" appear on the same line. Explain what
you see. In particular, explain how function overloading is
implemented in C++.
We haven't covered regular expressions yet, so here's the command
    egrep 'foo.*:' */*.s
and its result.
The quotes '' are necessary to prevent the command shell from
expanding foo.*: as a wildcard file name. We'll cover more details
like this later in the course.
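If the pattern foo.*: looks cryptic, it may help to see the same test written as a small C++ function. This is only a sketch: the name matching_lines is made up, and std::regex needs a C++11 compiler, which is newer than the GCC versions on the test systems above. The std::regex::extended flag asks for POSIX extended syntax, the flavor that egrep uses.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: keep only the lines that contain "foo" followed,
// somewhere later on the same line, by ":". The pattern reads as
// "foo", then any characters (.*), then ":".
std::vector<std::string> matching_lines(const std::vector<std::string>& lines)
{
    const std::regex pattern("foo.*:", std::regex::extended);
    std::vector<std::string> result;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        // regex_search succeeds if the pattern matches anywhere in the
        // line, which is how the grep family treats a pattern.
        if (std::regex_search(lines[i], pattern)) {
            result.push_back(lines[i]);
        }
    }
    return result;
}
```

Given lines such as "foo:" and ".globl foo", the function keeps the first and drops the second, because only the first has a colon after "foo".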
5. *** Now consider the C++ program bar.cpp compiled with
g++ -S bar.cpp, producing bar.s (on Linux) and bar.s (on Solaris).
How are default arguments implemented in C++? How does this
example differ from the function overloading example foo.cpp?
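If you'd like a second program to experiment with, something like the following sketch may be useful. It is not the course's bar.cpp; the names baz, call_with_both and call_with_default are invented here, and the bodies are arbitrary.

```cpp
// Hypothetical example (not the course's bar.cpp): one function whose
// second parameter has a default value.
int baz(int a, int b = 10) { return a + b; }

// Two callers: one supplies both arguments, the other relies on the
// default for b.
int call_with_both(void)    { return baz(1, 2); }
int call_with_default(void) { return baz(1); }
```

Compile it with g++ -S, without optimization, and compare the two callers in the .s file. Counting the labels that contain "baz" is a good place to start.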
Last revised, 7 Jan. 2012