CMPSC 311, Spring 2013, Project 1

Posted Jan. 7, 2013.  Due Jan. 16, 2013, on paper, in class.  15 points; a 5 point bonus is possible.

This is a small project concerned with the amount of memory required to represent various types, and the impact of memory alignment restrictions.  There isn't much programming involved, but you do need to look at the program's output from several different systems, and make sense of it.

This is an individual project.  Do the work on your own.  Only Projects 7 and 8 can be done with a partner.

The ANGEL Dropbox for this project closes Jan. 16 at 11:55 pm.

Reading, CS:APP
Reading, CP:AMA
Starter kit

The program pr1.1.c extends CS:APP Fig. 2.3 with information about these types:
See also CS:APP Fig. 2.8, 2.9, 2.10, for some related information.

The starter kit pr1.1.c can be extended to pr1.2.c in the obvious way; well, we hope it's obvious.  The macro SHOW isn't obvious now, but hopefully it will be later; you don't need to modify that part.

The sizeof operator in C tells you how many bytes a data type occupies in memory.  The result value of applying sizeof has type size_t, which is an unsigned integer type defined in <stddef.h>.  This is explained in the textbooks.

The __alignof__ operator is a GCC extension to C99; the double-underscore spelling remains available when the option -std=c99 is supplied.  The new edition of C (C11) makes an alignment operator part of the language: the keyword is _Alignof, and the header <stdalign.h> provides the spelling alignof as a macro.  The idea is that many processors and memory systems work better if data is aligned properly in memory.  For example, a 4-byte float might be required to be at a memory address that is a multiple of 4, and an 8-byte double at an address that is a multiple of 8.
Alignment constraints affect the size of composite types such as struct, union and (in C++) class.  Sometimes extra space must be inserted into a struct, which can be surprising.

After accumulating the required data, explain anything in the data that seems strange, or at least unexpected.

Turn in your results on paper in class (which allows for flexibility in the formatting, perhaps even hand-written).  Also turn in a copy of your program, to the ANGEL Dropbox for Project 1, with the output from one of your test runs.  The grade will be based on the paper version, with the electronic version for reference in case there are any questions about what you did.

Note that the 218 IST lab systems use Intel processors, the Sun Microsystems [now Oracle] server uses SPARC processors, and the Linux server uses AMD processors.  You should try your program on all three varieties.  If you have easy access to a Mac, that would be a good comparison.

This program might work on Windows with Visual Studio.  We haven't tried it yet, and it's not required.

The command  uname -p  will tell you what the processor type is.  For example, 220% uname -p


eru 206% uname -p

but sometimes you get a relatively non-informative response of i386.

On a Linux system, the command  cat /proc/cpuinfo  will give you a lot of detailed information.

Now we're going to attempt an explanation of the macro

#define SHOW(type) printf("%-32s %3zd %3zd\n", #type, sizeof(type), __alignof__(type))

and one of its uses,

SHOW(long int)

Macros in C are defined by writing

#define macro_name(macro_parameter_list) macro_body

and are used by writing

macro_name(macro_argument_list)

When a macro use is expanded, the arguments are matched with the parameters, and substituted into the macro body.  This is purely a textual substitution.  The macro operator # causes the macro argument to be converted to a double-quoted character string; in this example, "long int".  So, the macro expansion of SHOW(long int) is

printf("%-32s %3zd %3zd\n", "long int", sizeof(long int), __alignof__(long int))

Next, we need to explain printf().  The first argument to printf() is always a format string.  In this example, the format code %-32s is matched with "long int", the first %3zd is matched with the value of sizeof(long int), and the second %3zd is matched with the value of __alignof__(long int).  The format code %s is used to print a string, and %d is used to print a signed integer in decimal.  By using %-32s we print the string in a 32-character field, left-justified; if the string won't fit, printf() uses more than 32 characters and prints the whole string.  The format code %3d would use a 3-character field, right-justified.  The default type of the integer that matches %d is signed int; the length modifier z in %zd indicates an integer of the same size as size_t.  Strictly speaking, size_t is unsigned, so %zu is the exact match; %zd produces the same output here because the values are small.  There are other examples in pr1.1.c for char (%hhd), short int (%hd), long int (%ld), and long long int (%lld).  You can also use %u for an unsigned int in decimal, or %x for an unsigned int in hexadecimal.

When trying to see the difference between two text files, the diff command is useful.  The example in the Makefile is

diff m32 m64 > m32-m64-diff

which does a line-by-line comparison between the files m32 and m64, and writes the output to the file m32-m64-diff.  In the context of the Makefile, however, we needed to add an extra character - before the command, for reasons that will be explained later in the course.

Please note -

The Sun server, eru, is quite old, and it has become physically unreliable after many years of very reliable service.  There is no guarantee or expectation that it will work properly in the future, and when it finally fails completely, it will not be replaced.  It's a useful example for this project, as a comparison to the Intel/AMD x86 processors, but you should not plan for it to be available on later projects.  When using eru, it would be best for you to log in to one of the CSE Linux systems, do your editing on Linux, and open a terminal window from there so you can ssh into eru.  The file system is shared, so you can issue compiler commands and run programs on eru, then inspect the results on Linux.  It should not be necessary to do any editing directly on eru.

Five point bonus:  CS:APP, Homework Problem 2.58, on p. 119.  Don't forget to test it!

Last modified 28 Jan. 2013