Background
information for the projects, with a complete example
program, is provided. You should work through the background
information first; we covered some of this in class on Feb.
1. A starting point for Project 6 is also provided here,
which uses some of the background information.
A solution to Project 6 will be posted soon after the due date, so that you can check your work, review for the exam, and proceed to the next project even if not completely successful. Please be sure to turn in your work before the solution is posted, even if it is incomplete. Late submissions will not be accepted.
Here is what you should turn in for these projects:
An electronic version of your program should be submitted
through ANGEL. Specific instructions will be included with
each project. Be sure to attach all parts of your
program's source code. Do not attach an executable
file. The dropbox will remain open until 2 pm on the
project due date for Project 6, and 11 pm for Project 7.
You can run your examples on Solaris, Linux or Mac OS X, but
please specify which system you used, and when. The test
cases to demonstrate the output are your choice; 5 or 6 should be
enough. There are some examples at the end of this
description. The time allocation statement should be like "1
hr planning, ...", and not like "1% planning, ..."; there's no
reason to be completely precise about it, but at least try to be
honest.
The notation <checkpoint> indicates a place in the
description where you should have a version of the program that
compiles correctly without complaint, and does some limited action
correctly. You may need to rewrite code between the
checkpoints. Only the final version needs to be turned
in. Code that was provided and remains unchanged does not
need to be turned in, but you should indicate this somewhere.
Some parts of the project descriptions review topics discussed
earlier in the course, in case they did not seem important then
or did not sink in the first time.
Posted Mar. 12, 2013. Due Friday, Mar. 22, 2013, 2 pm
(electronic version, to ANGEL), and in class (paper
version). A solution will be posted on ANGEL at 2 pm on Mar.
22, so no late projects will be accepted after that time. 25
points.
Reading (review).
Reading (new, processes).
exit);
Sec. 8.2, Processes; Sec. 8.4, Process Control (skip Sec. 8.4.6 for now - we'll come back to it later)
fork Function; Sec. 8.5, exit Functions; Sec. 8.6, wait and waitpid Functions
Reading (new, signals).
signal and raise)
signal Function; Sec. 10.8, Reliable-Signal Terminology and Semantics; Sec. 10.9, kill and raise Functions; Sec. 10.10, alarm and pause Functions (p. 313 is the important part); Sec. 10.14, sigaction Function (we'll give you a particular call for this project); Sec. 10.19, sleep Function; Sec. 10.22, Summary
Reading (new, further concepts).
These man pages on Solaris will also be useful:
The project description is written as a series of incremental steps leading to the following program structure:
alarm().
pr6.1.c
pr6.2.c
pr6.3.c
pr6_ctime.h pr6_ctime.c
pr6_signal.h pr6_signal.c
pr6_wait.h pr6_wait.c
pr6_table.h pr6_table.c
Makefile
Makefile-c99 Makefile-gcc Makefile-lnx Makefile-mac
Usage: pr6 [-h] [-v] [-a n] [-b n] [-c n] [-f n] [-s n] [-t n] [-x n]
The options should work as follows:
| -h | print a help message |
| -v | enable verbose mode (extra output) |
| -a n | child alarm time interval, default 0 |
| -b n | parent alarm time interval, default 0. The alarms repeat at regular intervals. The default value 0 means that the alarm feature is not to be used. |
| -c n | fork() n child processes, default 0. You can force a maximum value for n, but the program should at least allow n between 0 and 8. |
| -f n | fflush() before fork(), 0 = no, 1 = yes, default 1. This is described later. It is an "extra feature" that allows some experimentation. |
| -s n | child sleep time, default 0 |
| -t n | parent sleep time, default 0. The default value 0 means that the process does not sleep. |
| -x n | child exit status n, default 0. The default value is EXIT_SUCCESS, which is 0. |
The options -a, -s and -x apply to each child process. The option -v applies to the parent and child processes. The remaining options apply only to the parent process.
The starting point of the project is in the file pr6.1.c, which you should save and
test. Compile it with one of these commands, using the Sun
or GNU compilers, and C89 or C99. The -D
option with C99 clears up a warning from the compilers on Solaris
about getopt().
cc -v -o pr6 pr6.1.c
gcc -Wall -Wextra -o pr6 pr6.1.c
c99 -v -D_POSIX_C_SOURCE=200112L -o pr6 pr6.1.c
gcc -std=c99 -Wall -Wextra -D_POSIX_C_SOURCE=200112L -o pr6 pr6.1.c
With Linux, you can use -D_POSIX_C_SOURCE=200809L instead. With Mac OS X, just omit it.
You should now pick up and read the code in pr6_ctime.h and pr6_ctime.c. Note that the file names have an embedded underscore _ not a space; web browsers that underline links make this difficult to see. The functions defined here are
char * Ctime(char buf[26]);
void print_msg(char *msg);
void print_msg_1(char *msg, int n);
void print_msg_2(char *msg, int n1, int n2);
void print_msg_error(char *msg, char *errmsg);
void print_msg_abort(char *msg);
Here is an example of using the print_msg() function, followed by the output it produced:
print_msg("Hello, world");
18173: Tue Mar 12 13:43:45 2013 Hello, world
If the child process and the parent process want to print the
same text, print_msg() makes it possible to see which
process caused the output, and at what time. If more than
two processes are involved, then having the process ID and time
attached to an error message or diagnostic output makes it easier
to see what's going on. Note that printed output is
typically buffered, and you might see an earlier message from one
process appear after a later message from a different
process. This can be disconcerting, but it isn't
wrong. If you sort the output by process number, or by the
time, then everything looks right.
These functions are built using ctime(3C) and getpid(2).
getpid(2) returns the process identifier of the current
process. getppid(2) returns the parent process's
identifier. The type of a process identifier is pid_t,
an integer type defined in the header <sys/types.h>.
The ps(1) command can be used to see the identifiers of all processes currently running, but that's the wrong way to obtain process numbers for use in a program. It might be a good idea to run ps occasionally (or top continuously) if you think your program has a bug, to be sure you are not accumulating a large number of processes.
Later, when studying thread programming, we will need the reentrant function Ctime(). It is useful here because it allows a simple method of giving a time-stamp to a line of output. You should understand why ctime(3C) is not reentrant, and why ctime_r(3C) is necessary. It will be harder to understand why there are two versions of ctime_r(). (It's because Sun implemented one proposed version before a second version was chosen by the POSIX committee to be the standard version. The old version remains so that old code doesn't need to be rewritten. No one ever said this business of programming makes complete sense. More information can be found in /usr/include/time.h, but it's not easy reading and not really recommended at this point.)
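To make the roles of getpid() and the reentrant time conversion concrete, here is a minimal sketch of a print_msg()-style formatter. It is not the code in pr6_ctime.c: the name format_msg() is hypothetical, and the sketch assumes the POSIX two-argument ctime_r() (on Solaris, that means compiling with -D_POSIX_PTHREAD_SEMANTICS).

```c
#include <stdio.h>      /* snprintf() */
#include <string.h>     /* strcspn() */
#include <sys/types.h>  /* pid_t */
#include <time.h>       /* time(), ctime_r() */
#include <unistd.h>     /* getpid() */

/* Hypothetical helper: write "pid: timestamp message" into out.
 * Returns 0 on success, -1 if the time could not be obtained or converted.
 */
int format_msg(char *out, size_t outsize, const char *msg)
{
    char buf[26];                     /* ctime_r() needs at least 26 bytes */
    time_t now = time(NULL);

    if (now == (time_t)(-1) || ctime_r(&now, buf) == NULL)
      { return -1; }

    /* ctime_r() ends its result with a newline; cut the string there */
    buf[strcspn(buf, "\n")] = '\0';

    /* the caller supplies buf, so no static storage is shared between calls */
    snprintf(out, outsize, "%ld: %s %s", (long) getpid(), buf, msg);
    return 0;
}
```

Because every call writes into storage owned by the caller, two threads (or a signal handler and the main program) do not overwrite each other's result, which is the problem with the static buffer used by ctime().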
The lines #ifdef ... #else ... #endif are used by the C preprocessor to select which part of the code is actually given to the compiler. If you want to test both versions in pr6_ctime.c, the Sun compiler command now is one of
cc -v -D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.1.c pr6_ctime.c
cc -v -o pr6 pr6.1.c pr6_ctime.c
c99 and gcc (use gcc on Linux and Mac OS X). The -D option acts as if there was a corresponding #define before the first line of the source code. Using the POSIX version is generally preferred.
make not previously discussed.
The possible values for the failure indicator and for errno are given in the man page for the system function. The particular values of errno that are interesting are given symbolically (for example, ECHILD), and the values can be converted to character strings with strerror(3C).
/* for errno, strerror(3C) */
#include <errno.h>
#include <string.h>
Some uses of errno and strerror() associated
with the system functions signal(), sigaction(),
fork(), wait() and waitpid() are
shown later. Note that we checked the return values of time()
and ctime_r() in Ctime(). printf()
hardly ever fails, so most programmers don't check its return
value. Some programmer tools, such as lint(1) on
Solaris, can check to see if you have ignored the return value,
but this is not entirely foolproof.
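As a small illustration of this error-reporting convention, the sketch below calls wait() and reports the reason if it fails. In a process that has no children, the call fails with errno set to ECHILD. The function name report_wait_failure() is hypothetical and is not part of the project files.

```c
#include <errno.h>      /* errno, ECHILD */
#include <stdio.h>      /* fprintf() */
#include <string.h>     /* strerror() */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* wait() */

/* Hypothetical demonstration: call wait() and, if it fails, say why.
 * Returns the errno value saved right after the failure,
 * or 0 if wait() succeeded.
 */
int report_wait_failure(void)
{
    int status;
    int saved_errno;

    if (wait(&status) == (pid_t)(-1))
      {
        saved_errno = errno;    /* save it: later calls may change errno */
        fprintf(stderr, "wait() failed: %s\n", strerror(saved_errno));
        return saved_errno;
      }
    return 0;
}
```

The value of errno is only meaningful immediately after a call has reported failure, which is why the sketch copies it before calling anything else.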
void sighandler(int sig)
{
    /* This is just a sample */
    print_msg_1("sighandler, signal number", sig);
}
The parameter is a signal number, supplied as part of generating
and catching the signal. The complete list of signal numbers
and their symbolic values is given in the header signal.h
and the man page signal.h(3HEAD); partial lists are in
CS:APP Fig. 8.25, or APUE Sec. 10.2.
Of course, what the signal handler actually should do depends on the requirements of the program. In the simplest case, it could just print a message giving the signal number, and let the program continue. In general, printing is not required, and it is often undesirable. You could also call exit() from within the handler, which is appropriate in some cases but not here. In this project's final version you will only need handlers for the signals SIGINT, SIGCHLD and SIGALRM. The handler body print_msg_1() is sufficient for now, but you should add to it later.
You should now pick up and read the code in pr6_signal.h and pr6_signal.c. Compare this
to CS:APP Fig. 8.34.
To install a signal handler, use
#include "pr6_signal.h"
and then just call the installer like this:
install_signal_handler(SIGALRM, sighandler);
The reason the code in pr6_signal.c is so ugly (with
the #ifdef's) is that the simpler code using signal(3C)
is an older style, while the sigaction(2) function is a
more modern version giving greater control if you want to exert
it. In this case, you do want to use sigaction(),
for reasons explained later, but we are allowing for experiments.
In the example, when the process receives an alarm signal, the function sighandler() will be called. (SIGALRM really is spelled that way, and not SIGALARM.) Some additional code would be needed if you want to save the previously-installed signal handler so it could be restored later; you do not need to do that here. The code to install the signal handler should go into main() before the call to fork(), which ensures that the parent and child will use the same handlers.
We wrote a simple program sigs.c
to see what the different versions of Unix actually use for their
signal numbers; sigs.c uses the non-standard value NSIG
and the non-standard function strsignal().
There is output for Solaris 9,
Solaris 10, Linux (2.6 kernel) and Mac OS X. The lesson of
this test is, always use the symbolic name of a signal, because
the numeric values differ widely. The symbolic names valid
for a particular system are in the man page for signal(),
or through the system command "kill -l" to generate the
variable part of the signal names. There is output from the latter for the same
systems.
To test the signal handler installation, use the program pr6.2.c. This will simply try to install your own signal handler for every available signal. Note that some signal handlers, such as those responsible for killing the program, cannot be replaced. The system function raise(3) will send signal i to its own process, and you should see one line of output from the handler.
Here is what you should do to improve your understanding of signals:
cc -v -D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.2.c pr6_ctime.c pr6_signal.c
gcc -Wall -Wextra -D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.2.c pr6_ctime.c pr6_signal.c
pr6.2.c and pr6.3.c use the non-standard value NSIG from signal.h, and NSIG will not be defined when the POSIX standard is enforced, as with c99 -v -D_POSIX_C_SOURCE=200112L ...
NSIG and its surrounding code will need to be removed in the final version of this project's program.
pr6.2.c is for testing and understanding, and will need to be removed in the final version of the program.
alarm(10);
for ten seconds ahead, removing any previous alarm setting.
(Remember, the 2 in alarm(2) refers to section 2 of
the man pages.) If the -a or -b options
of pr6 are used with a non-zero time (assumed to be
an integer number of seconds), then you can set an alarm with
something like
/* for alarm(2) */
#include <unistd.h>
if (alarm_time_interval > 0)
{ alarm(alarm_time_interval); }
After the alarm signal is delivered, the alarm can be reset with the same time interval.
The reason to prefer sigaction() over signal() is that the signal handler you installed will stay installed. On some Unix systems, including Solaris but not Linux or Mac OS X, signal() can have the effect of a one-shot installation: after the first signal of a given type is caught, the default signal handler is reinstalled, so a second signal of the same type gets the default action. This is usually not what you want to happen. Here is an example assuming you used signal() for the installation. (In pr6_signal.c, that would be the case if you compile with the -DPR6_USE_SIGNAL option.)
void sighandler_test_case(int sig)
{ /* point A */ signal(sig, sighandler_test_case); }
Which signal handler is installed when the process arrives at /* point A */? It will be the default handler. If you don't reinstall the signal handler, the default remains in effect. This may lead to a race condition or undesired behavior.
Here is an example on Solaris. The commands entered are in bold. We typed control-C twice after the first "signal 2" output appeared, and omitted some of the output for convenience.
% cc -v -D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.2.c pr6_ctime.c pr6_signal.c
pr6.2.c:
pr6_ctime.c:
pr6_signal.c:
% pr6
CMPSC 311 Project 6, version 2
install_signal_handler(0) failed: Invalid argument
install_signal_handler(9) failed: Invalid argument
install_signal_handler(23) failed: Invalid argument
18222: Tue Mar 12 13:50:57 2013 generic_signal_handler, signal 1
18222: Tue Mar 12 13:50:58 2013 generic_signal_handler, signal 2
18222: Tue Mar 12 13:50:59 2013 generic_signal_handler, signal 3
18222: Tue Mar 12 13:51:00 2013 generic_signal_handler, signal 4
^C 18222: Tue Mar 12 13:51:00 2013 generic_signal_handler, signal 2
18222: Tue Mar 12 13:51:00 2013 generic_signal_handler, signal 5
18222: Tue Mar 12 13:51:01 2013 generic_signal_handler, signal 6
18222: Tue Mar 12 13:51:02 2013 generic_signal_handler, signal 7
^C 18222: Tue Mar 12 13:51:03 2013 generic_signal_handler, signal 2
18222: Tue Mar 12 13:51:03 2013 generic_signal_handler, signal 8
18222: Tue Mar 12 13:51:04 2013 generic_signal_handler, signal 10
18222: Tue Mar 12 13:51:05 2013 generic_signal_handler, signal 11
...
18222: Tue Mar 12 13:51:40 2013 generic_signal_handler, signal 47
18222: Tue Mar 12 13:51:41 2013 generic_signal_handler, signal 48
18222: Tue Mar 12 13:51:44 2013 generic_signal_handler, signal 14
18222: Tue Mar 12 13:51:44 2013 all done!
% cc -v -D_POSIX_PTHREAD_SEMANTICS -DPR6_USE_SIGNAL -o pr6 pr6.2.c pr6_ctime.c pr6_signal.c
pr6.2.c:
pr6_ctime.c:
pr6_signal.c:
% pr6
CMPSC 311 Project 6, version 2
install_signal_handler(0) failed: Invalid argument
install_signal_handler(9) failed: Invalid argument
install_signal_handler(23) failed: Invalid argument
18245: Tue Mar 12 13:52:40 2013 generic_signal_handler, signal 1
18245: Tue Mar 12 13:52:41 2013 generic_signal_handler, signal 2
18245: Tue Mar 12 13:52:42 2013 generic_signal_handler, signal 3
18245: Tue Mar 12 13:52:43 2013 generic_signal_handler, signal 4
^C
%
Notice that in the second run the first SIGINT signal was sent by the process to itself with raise(), and the second was sent to the process by the terminal driver in response to the control-C. The second used the default handler, which terminated the process.
By now you may have noticed that the default handler for SIGALRM
will print "Alarm clock" and terminate the process.
If not, just let the second-compiled pr6 run to
completion.
If you tried running this example on Linux, and got output that
begins with
Using built-in specs.
Target: x86_64-redhat-linux
...
then you forgot to change the compile command from "cc -v"
to "gcc -Wall -Wextra". Recall that the cc
and gcc commands on Linux and Mac OS X are the same.
When comparing Unix signal handlers to C++ or Java exception
handlers, keep these features in mind. Signal handlers apply
to the entire process and are installed while the program is
running. This is the same concept as for interrupt handlers
in an operating system. Exception handlers in C++ or Java
(the catch clause following try) apply to
regions of the program. This gives more structure to the
program, and greater control over program behavior, but at a
higher design and runtime cost. The major benefit is that
errors in program design can often be caught by the compiler
before runtime.
<checkpoint>
If the -s or -t options of pr6 are
used with a non-zero time (assumed to be an integer number of
seconds), then the process should sleep for a total of that
number of seconds. The parent process should sleep after all
the child processes have been started (there are no child processes
at this point in the development). The system function sleep(3C)
takes the process out of the operating system's run queue and
reschedules it to start again at a later time. The code for
this ought to be easy:
/* for sleep(3C) */
#include <unistd.h>
if (sleep_time > 0)
{ sleep(sleep_time); }
The problem is that if a signal (in particular, SIGALRM) is received while the process is sleeping, the process wakes up and sleep() returns too soon; you should have noticed this earlier. In that case, sleep() returns the number of seconds remaining from the original request. You can reset the alarm and sleep for the remaining time, as follows:
int remaining_sleep_time = sleep_time;
while (remaining_sleep_time > 0)
  {
    if (alarm_time_interval > 0)
      { alarm(alarm_time_interval); }
    remaining_sleep_time = sleep(remaining_sleep_time);
  }
Note that these are now the only calls to alarm() and sleep(), but the argument values are different in the parent and child processes.
All this is wrapped up in the Sleep() function in pr6.3.c, which provides some more tests you should run shortly.
There are other ways to reset the alarm, most of which cause mysterious behavior. For example, resetting the alarm inside the SIGALRM signal handler may confuse sleep() about how much time was remaining. One good technique to look at later is the interval timer function setitimer(2).
Your signal handlers could print some additional messages to help
see what's going on. If your program ends unexpectedly with
the message "Alarm clock" then you have the problem
that the default alarm signal handler is installed, or became
reinstalled. For example, try compiling pr6.3.c
with the -DPR6_USE_SIGNAL option. You could change
from signal() to sigaction(), or call signal()
again inside the while loop just before alarm(),
or call signal() again inside the handler.
sigaction() is replaced by signal()), then a signal could cause a system function to return before it is finished. Some of the following code is designed to handle that case.
If you tried using control-Z to test the program, it is possible you have some background or suspended jobs. Try running the jobs command; if there is no output, then you have no background or suspended jobs. You can use a command like "kill %1" to terminate a suspended job (in this case, job 1). Jobs that are running in the background probably should be left alone. For example,
% cc -v -D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.3.c pr6_ctime.c pr6_signal.c
pr6.3.c:
pr6_ctime.c:
pr6_signal.c:
% pr6
CMPSC 311 Project 6, version 3
18315: Tue Mar 12 13:55:50 2013 starting - the default signal handlers are installed
18315: Tue Mar 12 13:55:50 2013 going to sleep for 10 seconds - try control-C or control-Z (or not)
^C
% jobs
% pr6
CMPSC 311 Project 6, version 3
18317: Tue Mar 12 13:56:13 2013 starting - the default signal handlers are installed
18317: Tue Mar 12 13:56:13 2013 going to sleep for 10 seconds - try control-C or control-Z (or not)
^Z
Suspended
% jobs
[1] + Suspended pr6
% kill %1
% (just type return)
[1] Terminated pr6
<checkpoint>
The code to create a child process looks like this. Indented parts are inside some function, perhaps main().
/* for fork(2), getpid(2) */
#include <sys/types.h>
#include <unistd.h>

    pid_t child_pid;

    child_pid = fork();
    /* There is (should be) one more process running the same program.
     * Both processes have returned from fork(), but with different
     * values assigned to child_pid.
     */
    if (child_pid == (pid_t)(-1))
      { /* This is the parent process.  The fork failed, there is no child. */
        print_msg_error("fork()", strerror(errno));
        /* maybe quit? */
      }
    else if (child_pid == 0)
      { /* This is the child process.  The fork succeeded. */
        /* add more code, but still exit */
        exit(child_exit_status);
          /* this will also send a SIGCHLD signal to the parent process */
      }
    else
      { /* This is the parent process.  The fork succeeded. */
        /* add more code, but do not exit yet */
      }
The entire address space of the parent process is copied to build the address space of the child process. Once the child process starts, it is at the point in the program where fork() returns, the same point as in the parent process. The only way the two can notice which is which is by the return value from fork(). In the child process, fork() returns 0; in the parent process, fork() returns the process identifier of the child process that was just created.
It will be helpful to print the process identifiers of each process and its parent after the fork, just to see what's going on. Similarly, print this information before each process exits. Use the getpid() and getppid() functions for this.
<checkpoint>
/* for errno */
#include <errno.h>
/* for waitpid(2) and wait(2) */
#include <sys/types.h>
#include <sys/wait.h>

/* wait for a child process whose pid you know
 *
 * return 1 if a child was found
 *   *child_status has been updated, and the child has terminated
 *
 * return 0 if no child was found
 *   *child_status has not been updated
 */
int wait_child(pid_t wait_pid, int *child_status)
{
    int s;

    /* loop because waitpid() can be interrupted by a signal and return early */
    while (waitpid(wait_pid, &s, 0) == (pid_t)(-1))
      {
        if (errno == ECHILD)    /* no more children */
          { return 0; }
      }

    *child_status = s;
    return 1;
}

/* wait for a child process whose pid you do not know
 *   if more than one child has terminated, report only one
 *
 * return 1 if a child was found
 *   *wait_pid and *child_status have been updated, and the child has terminated
 *
 * return 0 if no child was found
 *   *wait_pid and *child_status have not been updated
 */
int wait_any_child(pid_t *wait_pid, int *child_status)
{
    pid_t w;
    int s;

    /* loop because wait() can be interrupted by a signal and return early */
    while ((w = wait(&s)) == (pid_t)(-1))
      {
        if (errno == ECHILD)    /* no more children */
          { return 0; }
      }

    *wait_pid = w;
    *child_status = s;
    return 1;
}
It would make more sense to use the wait.h(3HEAD) macros applied to child_status, but that is a feature you can use in the next project (see CS:APP Fig. 8.17 or APUE Sec. 8.6 for an example).
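A minimal sketch of how those macros might be applied is shown here. The function names decode_exit_status() and run_child_and_decode() are hypothetical; WIFEXITED and WEXITSTATUS are the standard macros from <sys/wait.h>. The child in this sketch does nothing except terminate.

```c
#include <errno.h>      /* errno, EINTR */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>     /* fork(), _exit() */

/* Hypothetical helper: extract the exit status from a wait() status value.
 * Returns the exit status (0..255) if the child exited normally,
 * or -1 if it did not (for example, if it was terminated by a signal).
 */
int decode_exit_status(int child_status)
{
    if (WIFEXITED(child_status))
      { return WEXITSTATUS(child_status); }
    return -1;
}

/* Hypothetical demonstration: fork a child that terminates at once with
 * exit_status, wait for it, and return the decoded status (or -1 on error).
 */
int run_child_and_decode(int exit_status)
{
    int status;
    pid_t pid = fork();

    if (pid == (pid_t)(-1))
      { return -1; }                 /* fork failed, there is no child */
    if (pid == 0)
      { _exit(exit_status); }        /* child: _exit() skips stdio cleanup */

    /* parent: retry if waitpid() was interrupted by a signal */
    while (waitpid(pid, &status, 0) == (pid_t)(-1))
      {
        if (errno != EINTR)
          { return -1; }
      }
    return decode_exit_status(status);
}
```

With this helper, run_child_and_decode(2) returns 2 regardless of how the system packs the exit status into the int filled in by waitpid().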
The reason that wait() and waitpid() should be used in a loop is that if the parent process receives any signal, then wait() returns. It is possible that the child has not actually terminated, and you need to wait some more. Otherwise, the code would have been something easy like
wait_pid = wait(&child_status);
if (wait_pid == (pid_t)(-1))
  { /* deal with the error */ }
As long as a process exists, its own process identifier will not
change. However, if the parent process terminates before the
child process, then the parent process ID of the child is set to 1,
the process ID of the init process. This affects the return
value of getppid() in the child process.
<checkpoint>
The pr6 command-line option -c can be used to determine how many children to fork(). The easiest way to do this is to put some of the code you have written so far into a loop, actually two loops. Fork the children in one loop, and wait for the children in a second loop. In general, you don't know the order in which the children will finish, so the second loop should use wait_any_child(). One version of the solution to be posted will use wait_child() in the second loop, just to test the code; it will be useful later. Of course, you would need to change child_pid from a simple variable to an array.
The "child finished" output uses print_msg_2() to print the process number and exit status as retrieved by wait(). Note that the exit status 2 in the last example does not appear in the low byte. This has to do with the encoding of the exit status and termination status of the child process into one int. The WEXITSTATUS macro in <sys/wait.h> cleans this up.
% cc -v -D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.4.c pr6_ctime.c pr6_signal.c pr6_wait.c
pr6.4.c:
pr6_ctime.c:
pr6_signal.c:
pr6_wait.c:
% pr6 -a 2 -b 3 -s 5 -t 8
CMPSC 311 Project 6, version 4
child_alarm_time = 2
parent_alarm_time = 3
child_sleep_time = 5
parent_sleep_time = 8
18350: Tue Mar 12 13:57:56 2013 here is the parent, all children created
18350: Tue Mar 12 13:57:59 2013 alarm signal received
18350: Tue Mar 12 13:58:02 2013 alarm signal received
% pr6 -a 2 -b 3 -s 5 -t 8 -c 1
CMPSC 311 Project 6, version 4
child_alarm_time = 2
parent_alarm_time = 3
child_processes = 1
child_sleep_time = 5
parent_sleep_time = 8
18353: Tue Mar 12 13:58:24 2013 here is the parent, all children created
18354: Tue Mar 12 13:58:24 2013 here is child 0
18354: Tue Mar 12 13:58:26 2013 alarm signal received
18353: Tue Mar 12 13:58:27 2013 alarm signal received
18354: Tue Mar 12 13:58:28 2013 alarm signal received
18353: Tue Mar 12 13:58:29 2013 child signal received - ignored
18353: Tue Mar 12 13:58:32 2013 alarm signal received
18353: Tue Mar 12 13:58:32 2013 child finished 18354 0x00000000
% pr6 -a 2 -b 3 -s 5 -t 8 -c 2
CMPSC 311 Project 6, version 4
child_alarm_time = 2
parent_alarm_time = 3
child_processes = 2
child_sleep_time = 5
parent_sleep_time = 8
18357: Tue Mar 12 13:58:59 2013 here is the parent, all children created
18358: Tue Mar 12 13:58:59 2013 here is child 0
18359: Tue Mar 12 13:58:59 2013 here is child 1
18358: Tue Mar 12 13:59:01 2013 alarm signal received
18359: Tue Mar 12 13:59:01 2013 alarm signal received
18357: Tue Mar 12 13:59:02 2013 alarm signal received
18358: Tue Mar 12 13:59:03 2013 alarm signal received
18359: Tue Mar 12 13:59:03 2013 alarm signal received
18357: Tue Mar 12 13:59:04 2013 child signal received - ignored
18357: Tue Mar 12 13:59:04 2013 child signal received - ignored
18357: Tue Mar 12 13:59:07 2013 alarm signal received
18357: Tue Mar 12 13:59:07 2013 child finished 18358 0x00000000
18357: Tue Mar 12 13:59:07 2013 child finished 18359 0x00000000
% pr6 -a 2 -b 3 -s 5 -t 8 -c 2 -x 2
CMPSC 311 Project 6, version 4
child_alarm_time = 2
parent_alarm_time = 3
child_processes = 2
child_sleep_time = 5
parent_sleep_time = 8
child_exit_status = 2
18375: Tue Mar 12 13:59:32 2013 here is the parent, all children created
18376: Tue Mar 12 13:59:32 2013 here is child 0
18377: Tue Mar 12 13:59:32 2013 here is child 1
18377: Tue Mar 12 13:59:34 2013 alarm signal received
18376: Tue Mar 12 13:59:34 2013 alarm signal received
18375: Tue Mar 12 13:59:35 2013 alarm signal received
18377: Tue Mar 12 13:59:36 2013 alarm signal received
18376: Tue Mar 12 13:59:36 2013 alarm signal received
18375: Tue Mar 12 13:59:37 2013 child signal received - ignored
18375: Tue Mar 12 13:59:37 2013 child signal received - ignored
18375: Tue Mar 12 13:59:40 2013 alarm signal received
18375: Tue Mar 12 13:59:40 2013 child finished 18376 0x00000200
18375: Tue Mar 12 13:59:40 2013 child finished 18377 0x00000200
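The two-loop structure described earlier (fork all the children, then wait for all of them) might be sketched as follows. This is an illustration under simplifying assumptions, not the posted solution: the function name fork_and_reap() and the constant SKETCH_MAX_CHILDREN are hypothetical, the children exit immediately instead of sleeping, and the second loop waits for each known pid in creation order with waitpid() rather than for any child.

```c
#include <errno.h>      /* errno, EINTR */
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>     /* fork(), _exit() */

#define SKETCH_MAX_CHILDREN 8   /* assumption: matches the 0..8 range for -c */

/* Fork n children that each exit with exit_status, then wait for them.
 * Returns the number of children that were reaped with that exit status,
 * or -1 if n is out of range.
 */
int fork_and_reap(int n, int exit_status)
{
    pid_t child_pid[SKETCH_MAX_CHILDREN];
    int created = 0;
    int matched = 0;
    int status;
    int i;

    if (n < 0 || n > SKETCH_MAX_CHILDREN)
      { return -1; }

    /* first loop: create the children */
    for (i = 0; i < n; i++)
      {
        pid_t pid = fork();

        if (pid == (pid_t)(-1))
          { break; }                     /* fork failed: stop creating */
        if (pid == 0)
          { _exit(exit_status); }        /* child: terminate at once */
        child_pid[created++] = pid;      /* parent: remember the child */
      }

    /* second loop: wait for each child that was actually created */
    for (i = 0; i < created; i++)
      {
        pid_t w;

        do
          { w = waitpid(child_pid[i], &status, 0); }
        while (w == (pid_t)(-1) && errno == EINTR);   /* interrupted: retry */

        if (w == child_pid[i] && WIFEXITED(status)
            && WEXITSTATUS(status) == exit_status)
          { matched++; }
      }
    return matched;
}
```

Keeping the pids in an array is what makes the second loop possible here; a version based on wait() would not need the array to find the children, only to report on them.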
/* fixed-size process table, give the size as a symbolic constant */
#define MAX_CHILDREN 8

/* an entry in the process table */
typedef struct pr6_process {
    pid_t pid;          /* process ID, supplied from fork() */
                        /* if 0, this entry is currently not in use */
    int state;          /* process state, your own definition */
    int exit_status;    /* supplied from wait() if process has finished */
} pr6_process_info;

/* the process table, maintained by the parent process only */
pr6_process_info process_table[MAX_CHILDREN];
A full but too-simple implementation of this is given in the
files pr6_table.h and pr6_table.c. One problem
with this design is the size limit. It would be better to
use a dynamic data structure, such as a linked list or an array
allocated with malloc(), so you don't have to force a
strong limit on the number of child processes (8 is not enough on
a real system) and so that space can be economized (usually we are
far below the maximum). However, the fixed-size table is a
reasonable compromise for now. You will need to improve on
it in Project 7. It will be useful to have a function to
print the process table, and to use this as part of the verbose
option (-v). An example is given below.
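A minimal sketch of helper functions for such a table follows. It is not the code in pr6_table.c: the function names table_insert(), table_update() and table_print(), the state values, and the table name sketch_table are all assumptions made for illustration.

```c
#include <stdio.h>      /* FILE, fprintf() */
#include <sys/types.h>  /* pid_t */

#define MAX_CHILDREN 8

/* Hypothetical state values ("your own definition" in the text above). */
enum { STATE_UNUSED = 0, STATE_RUNNING = 1, STATE_FINISHED = 2 };

typedef struct pr6_process {
    pid_t pid;          /* 0 means this entry is not in use */
    int state;
    int exit_status;
} pr6_process_info;

/* static storage is zero-initialized, so every entry starts out unused */
static pr6_process_info sketch_table[MAX_CHILDREN];

/* Record a new child. Returns its index, or -1 if the table is full. */
int table_insert(pid_t pid)
{
    int i;

    for (i = 0; i < MAX_CHILDREN; i++)
      {
        if (sketch_table[i].pid == 0)
          {
            sketch_table[i].pid = pid;
            sketch_table[i].state = STATE_RUNNING;
            sketch_table[i].exit_status = 0;
            return i;
          }
      }
    return -1;
}

/* Mark a child as finished. Returns its index, or -1 if pid is unknown. */
int table_update(pid_t pid, int exit_status)
{
    int i;

    if (pid <= 0)
      { return -1; }                /* 0 marks unused entries, never a child */
    for (i = 0; i < MAX_CHILDREN; i++)
      {
        if (sketch_table[i].pid == pid)
          {
            sketch_table[i].state = STATE_FINISHED;
            sketch_table[i].exit_status = exit_status;
            return i;
          }
      }
    return -1;
}

/* Print the entries that are in use, one per line. */
void table_print(FILE *fp)
{
    int i;

    for (i = 0; i < MAX_CHILDREN; i++)
      {
        if (sketch_table[i].pid != 0)
          {
            fprintf(fp, "%d: pid %ld state %d status 0x%08x\n", i,
                    (long) sketch_table[i].pid, sketch_table[i].state,
                    (unsigned int) sketch_table[i].exit_status);
          }
      }
}
```

The parent would call table_insert() with the value returned by fork(), and table_update() with the pid and status obtained from wait() or waitpid().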
One idea you should consider but reject is to update the process
table as soon as possible. After all, the child process
sends a SIGCHLD signal to the parent as soon as it
terminates. (Actually, there are some other times when this
signal could be sent, but don't worry about that yet.) Why
not let the signal handler for SIGCHLD call wait()
and update the process table? It turns out that this is a
bad idea. The reason is that a child process could exit and
send the signal before the parent process has even created the
process table entry for the child. This gets you into a race
condition, and very likely into an error condition. Later
we'll discuss how to get around the problem, but the easiest
approach for now is to avoid it entirely. This is a real
problem and you will need to know how to deal with it, but that's
mostly for later. See CS:APP Sec. 8.5.7 for another example
of the same problem.
Here's an example of the race condition, but without using the
process table. The first version is on Solaris, and the
second version is on Mac OS X. We added a call to print_msg_1()
in the parent so you can see its progress through the loop that
calls fork(); the extra number after "here is
the parent" is just a loop iteration counter. Pay
attention to the relative ordering of the output from the parent
and from child 0.
% cc -v
-D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.4a.c pr6_ctime.c
pr6_signal.c pr6_wait.c
pr6.4a.c:
pr6_ctime.c:
pr6_signal.c:
pr6_wait.c:
% pr6 -b 3 -t 8
-c 2 -x 2
CMPSC 311 Project 6, version 4a
parent_alarm_time = 3
child_processes = 2
parent_sleep_time = 8
child_exit_status = 2
18411: Tue Mar 12 14:03:49 2013 here is the parent 0
18412: Tue Mar 12 14:03:49 2013 here is child 0
18411: Tue Mar 12 14:03:49 2013 here is the parent 1
18411: Tue Mar 12 14:03:49 2013 here is the parent, all
children created
18411: Tue Mar 12 14:03:49 2013 child signal received -
ignored
18413: Tue Mar 12 14:03:49 2013 here is child 1
18411: Tue Mar 12 14:03:49 2013 child signal received -
ignored
18411: Tue Mar 12 14:03:52 2013 alarm signal received
18411: Tue Mar 12 14:03:55 2013 alarm signal received
18411: Tue Mar 12 14:03:57 2013 child finished 18412
0x00000200
18411: Tue Mar 12 14:03:57 2013 child finished 18413
0x00000200
% gcc -Wall -Wextra -D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.4a.c pr6_ctime.c pr6_signal.c pr6_wait.c
pr6_signal.c:63: warning: unused parameter ‘sig’
pr6_signal.c:63: warning: unused parameter ‘func’
% pr6 -b 3 -t 8 -c 2 -x 2
CMPSC 311 Project 6, version 4a
parent_alarm_time = 3
child_processes = 2
parent_sleep_time = 8
child_exit_status = 2
7974: Tue Mar 12 14:18:07 2013 here is the parent 0
7975: Tue Mar 12 14:18:07 2013 here is child 0
7974: Tue Mar 12 14:18:07 2013 child signal received - ignored
7974: Tue Mar 12 14:18:07 2013 here is the parent 1
7974: Tue Mar 12 14:18:07 2013 here is the parent, all children created
7976: Tue Mar 12 14:18:07 2013 here is child 1
7974: Tue Mar 12 14:18:07 2013 child signal received - ignored
7974: Tue Mar 12 14:18:10 2013 alarm signal received
7974: Tue Mar 12 14:18:13 2013 alarm signal received
7974: Tue Mar 12 14:18:15 2013 child finished 7975 0x00000200
7974: Tue Mar 12 14:18:15 2013 child finished 7976 0x00000200
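One common way to close this kind of race, along the lines of the CS:APP discussion cited above, is to block SIGCHLD around fork() until the parent has recorded the child. The following is only a minimal sketch under that assumption, not the project's solution. The names table, on_sigchld() and spawn_and_track() are hypothetical, and the children here simply exit at once.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 5

/* hypothetical stand-in for a process table */
static volatile pid_t table[MAX_CHILDREN];
static volatile sig_atomic_t table_len, matched, reaped;

/* reap every finished child and check that it was already recorded */
static void on_sigchld(int sig)
{
    int saved_errno = errno;
    pid_t pid;
    int i;

    (void) sig;
    while ((pid = waitpid(-1, NULL, WNOHANG)) > 0) {
        reaped++;
        for (i = 0; i < table_len; i++)
            if (table[i] == pid) { matched++; break; }
    }
    errno = saved_errno;
}

/* fork n children; return how many were found in the table when reaped */
int spawn_and_track(int n)
{
    struct sigaction sa;
    sigset_t chld, old, wait_mask;
    pid_t pid;
    int i;

    if (n < 0 || n > MAX_CHILDREN)
        return -1;
    table_len = matched = reaped = 0;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);

    for (i = 0; i < n; i++) {
        /* SIGCHLD stays pending until the child has been recorded */
        sigprocmask(SIG_BLOCK, &chld, &old);
        pid = fork();
        if (pid == 0)
            _exit(0);                   /* child: finish immediately */
        if (pid > 0)
            table[table_len++] = pid;   /* parent: record the child */
        sigprocmask(SIG_SETMASK, &old, NULL);
        if (pid == -1)
            break;
    }

    /* wait, without a lost-wakeup window, until all children are reaped */
    sigprocmask(SIG_BLOCK, &chld, &old);
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);
    while (reaped < table_len)
        sigsuspend(&wait_mask);
    sigprocmask(SIG_SETMASK, &old, NULL);

    return matched;
}
```

If the two sigprocmask() calls inside the loop are removed, a child can finish, and the handler can run, before the parent has stored its pid, so the handler may fail to find it. With the calls in place, spawn_and_track(n) should report that every child it created was found in the table.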
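#include <stdio.h>     /* printf() */
#include <unistd.h>    /* fork(), getpid() */

int main(void)
{
    /* printed once, before the fork */
    printf("0 %d\n", (int) getpid());
    fork();
    /* printed by both the parent and the child */
    printf("1 %d\n", (int) getpid());
    return 0;
}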
Here is some output on Solaris (Linux is similar, because it's really a problem with the C libraries). The first command line (after compiling) sends all output to the terminal window. The second command line sends all output to the program cat through a pipe. cat simply repeats its input, in this case by sending it to the terminal. The third command line sends all output to a file out, then prints the file.
% cc -o a example.c
There are two different lines beginning with "1 "
because there are two processes running the program at the point
of the second printf(). But, why did we get
two identical lines "0 27395"? After all, there was only one printf()
of this text before the fork(). The reason is that
stdout in C, and cout in C++, is a buffered
output stream. The characters output by printf()
are placed in a reserved part of memory in the process address
space. If the output is really going to the terminal window,
then the buffer is "flushed" promptly, so you see the output as
soon as it is complete (the output stream is line buffered).
If the output is going to a file or to another process through a
pipe, then the "flush" operation is delayed until the buffer is
full (the output stream is fully buffered) or until the fflush(3C)
function is called by the producer (the program example.c
in this case). When the parent process in the example
executes fork(), its entire address space is copied to
build the child process address space. If the output buffer
has anything in it, that is also copied. When the parent
does its next printf(), the original copy of the buffer
is used. When the child does its next printf(),
its copy of the buffer is used. Eventually, both buffers are
flushed. That's why you get the same line twice. The
cure for the problem is to call fflush() before fork(),
as follows:
if (flush_before_fork)
{ fflush(stdout); fflush(stderr); }
child_pid = fork();
The flag flush_before_fork should be set from the command line option -f, with the default being to perform the flush operations. This will give you some flexibility for experiments. Remember, the conditional here is only to make experiments easy. A proper application program would always fflush() before fork().
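The effect of the flag can be checked without looking at terminal output. The following is a minimal sketch, not part of the provided project code: the helper lines_after_fork() is a hypothetical name, and it writes to a temporary file from tmpfile() (assumed to be fully buffered, as file streams normally are) instead of stdout.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one line to a buffered temporary file,
 * fork, and count how many copies of the line end up in the file.
 * Returns the number of lines, or -1 on error.
 */
int lines_after_fork(int flush_before_fork)
{
    FILE *fp = tmpfile();
    char line[128];
    pid_t child_pid;
    int count = 0;

    if (fp == NULL)
        return -1;

    /* the text sits in the stdio buffer, not yet in the file */
    fprintf(fp, "0 %d\n", (int) getpid());

    if (flush_before_fork)
        fflush(fp);             /* buffer is empty when fork() copies it */

    child_pid = fork();
    if (child_pid == -1) {
        fclose(fp);
        return -1;
    }
    if (child_pid == 0)
        exit(0);                /* exit() flushes the child's copy */

    if (waitpid(child_pid, NULL, 0) == -1) {
        fclose(fp);
        return -1;
    }

    fflush(fp);                 /* flush the parent's original buffer */
    rewind(fp);
    while (fgets(line, sizeof line, fp) != NULL)
        count++;

    fclose(fp);
    return count;
}
```

Calling lines_after_fork(0) should count the line twice, once from each copy of the buffer, while lines_after_fork(1) should count it once.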
<checkpoint>
/* for kill(2) */
#include <sys/types.h>
#include <signal.h>
/* in the child */
if (...)
{ kill(parent_pid, SIGUSR1); }
/* in the parent */
if (...)
{ kill(child_pid, SIGUSR2); }
The name kill() is from the old days of Unix, when the kill signal to terminate a process was the most important use of signals. The signals SIGUSR1 and SIGUSR2 (yes, these are spelled correctly) are reserved for application programs, without any restrictions on their use or interpretation by the OS.
Consider: What happens to the signal being sent if the other process has already terminated?
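The kill() fragments above can be put together as follows. This is a minimal sketch under stated assumptions, not the project's code: the names on_usr1() and child_signals_parent() are hypothetical, only the child-to-parent direction is shown, and SIGUSR1 is blocked before fork() so that it cannot arrive before the parent is ready to wait for it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1 = 0;

/* handler: only record that the signal arrived */
static void on_usr1(int sig)
{
    (void) sig;
    got_usr1 = 1;
}

/* Returns 1 if the parent received SIGUSR1 from its child, -1 on error. */
int child_signals_parent(void)
{
    struct sigaction sa;
    sigset_t usr1, old, wait_mask;
    pid_t parent_pid = getpid();
    pid_t child_pid;

    got_usr1 = 0;
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    /* hold SIGUSR1 pending until the parent calls sigsuspend() */
    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &usr1, &old) == -1)
        return -1;
    wait_mask = old;
    sigdelset(&wait_mask, SIGUSR1);

    child_pid = fork();
    if (child_pid == -1) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (child_pid == 0) {
        /* in the child */
        kill(parent_pid, SIGUSR1);
        _exit(0);
    }

    /* in the parent: unblock SIGUSR1 and wait for it in one step */
    while (!got_usr1)
        sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old, NULL);
    if (waitpid(child_pid, NULL, 0) == -1)
        return -1;
    return 1;
}
```

The handler does nothing but set a flag; any printing or bookkeeping is left to the main line of the parent, after sigsuspend() returns.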
% cc -v -D_POSIX_PTHREAD_SEMANTICS -o pr6 pr6.5.c pr6_ctime.c pr6_signal.c pr6_wait.c pr6_table.c
pr6.5.c:
pr6_ctime.c:
pr6_signal.c:
pr6_wait.c:
pr6_table.c:
% pr6 -h
CMPSC 311 Project 6, version 5
Usage: pr6 [-h] [-v] [-a n] [-b n] [-c n] [-f n] [-s n] [-t n] [-x n]
  -h     help
  -v     verbose mode
  -a n   child alarm time interval, default 0
  -b n   parent alarm time interval, default 0
  -c n   fork() n child processes, default 0, max 5
  -f n   fflush() before fork(), 0 = no, 1 = yes, default 1
  -s n   child sleep time, default 0
  -t n   parent sleep time, default 0
  -x n   child exit status n, default 0
The output from the following commands can be found here.
% pr6
% pr6 -c 1
% pr6 -c 2
% pr6 -v -c 1 -x 12
% pr6 -b 3 -t 10
% pr6 -a 3 -s 10 -c 1
% pr6 -a 3 -s 10 -c 3
% pr6 -a 3 -b 7 -s 8 -t 15 -c 1
% pr6 -a 3 -b 7 -s 8 -t 15 -c 3
% pr6 -a 1 -b 2 -s 3 -t 4 -x 5 -c 2
% pr6 -a 1 -b 2 -s 3 -t 4 -x 5 -c 2 -v
% pr6 -a 1 -b 2 -s 3 -t 4 -x 5 -c 4 -v
Here is a makefile that could be useful on Solaris; save it as
Makefile. Some other variations:
Makefile-c99, using C99 for Solaris
Makefile-gcc, using GCC for Solaris
Makefile-lnx, using GCC for Linux
Makefile-mac, using GCC for Mac OS X
Be sure that lines like "cc ..." begin with a tab
character and not 8 spaces. Pay attention to what lint
says about "function returns value which is always ignored".
This means that you might not be checking the success or failure
of a system function.
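As an illustration of what that lint message is asking for, here is a minimal sketch of checking a system function's return value instead of ignoring it. The helper name close_checked() is hypothetical, and close(2) is used only because it is easy to make it fail on purpose.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: close a file descriptor and report any failure
 * on stderr. Returns 0 on success, -1 on failure.
 */
int close_checked(int fd)
{
    if (close(fd) == -1) {
        fprintf(stderr, "close(%d) failed: %s\n", fd, strerror(errno));
        return -1;
    }
    return 0;
}
```

The same pattern applies to fork(), kill(), sigaction() and the other system functions used in the project: test the return value, and use errno to say what went wrong.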
The Solaris man page siginfo.h(3HEAD) might be useful
later.
In your project6 directory, run these commands, which
will create the file project-6-username.tar.gz. Be sure to
substitute your own username and your own list of files. The
first command creates a tar file, and the second confirms its
contents. The third command compresses the file. Note
that the first command is so long that it may wrap around to the
next line of the browser; it ends with Makefile.
tar cvf project-6-username.tar pr6.5.c pr6_ctime.[ch] pr6_signal.[ch] pr6_wait.[ch] pr6_table.[ch] Makefile
tar tvf project-6-username.tar
gzip project-6-username.tar
ls -l project-6-username.tar.gz
Put project-6-username.tar.gz in the
ANGEL Dropbox for Project 6 (with your username substituted, of
course).
fork(). There's a lot of belated
wisdom here.