CMPSC 311, Spring 2013, Final Exam, Sample questions for review

The Final Exam will be Monday, April 29,
The exam is closed book, closed computer, and closed neighbor; no cell phones, etc.  You may, however, bring three 8 1/2 x 11 sheets of paper, with your name on each, as a "cheat sheet"; turn these in with the exam.

Class time on Wednesday and Friday, Apr. 24 and 26, will be devoted to review only.  Project solutions are posted on ANGEL.

The exam will ask questions about general knowledge of C and Unix, and will require both programming and debugging.  Any material that was covered in class, as assigned reading, as background for the projects, or in the projects, could be on the exam.  In particular, don't neglect the programming examples in the Intro to Unix notes, or the exercises in the notes.  The "cheat sheet" could remind you of Unix function prototypes and C syntax, but you should be able to answer most questions without referring to it.

The questions will be
The number of points in each category might be different, but not by much.  Some 10-point questions are in two 5-point parts.

The final exam is comprehensive for the course, with emphasis on material since the second exam.

Sample questions and practice problems for review

Note that these are primarily for review, and are not necessarily intended as sample exam questions.  Some of the sample questions here are harder than those actually on the exam, and some are actual exam questions from previous years.


Multiple Choice

    How many of these statements are correct?

       *   Variables allocated in the data segment are always initialized.
       *   Variables allocated in the stack segment are always initialized.
       *   Variables allocated in the heap segment are always initialized.
       *   Variables allocated in the text segment are always initialized.

      (1) none, they are all incorrect or nonsense
      (2) one
      (3) two
      (4) three

Write a code sequence to open a file, read its contents, determine whether the file contains a line of text with exactly 17 characters, print "yes" or "no" as appropriate, and close the file.  Partial credit for pseudo-code is possible.  Note that "yes" or "no" should be printed only once.  Assume the usual Unix text file convention for end-of-line, and ASCII characters.
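
One possible approach is sketched below, assuming the line length does not count the terminating newline.  The helper name has_line_of_length and its return convention are illustrative only and are not part of the question.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the file contains a line of exactly
   `target` characters (newline not counted), 0 if it does not, and -1 if
   the file cannot be opened. */
int has_line_of_length(const char *path, long target)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int found = 0;
    long count = 0;               /* characters seen so far on this line */
    int c;                        /* int, not char, so EOF is distinguishable */

    while ((c = getc(fp)) != EOF) {
        if (c == '\n') {
            if (count == target)
                found = 1;
            count = 0;            /* start counting the next line */
        } else {
            count++;
        }
    }

    /* A final line that lacks a trailing newline still counts as a line. */
    if (count > 0 && count == target)
        found = 1;

    fclose(fp);
    return found;
}
```

A caller would then print exactly once, for example puts(has_line_of_length("input.txt", 17) == 1 ? "yes" : "no"), where input.txt is a placeholder file name.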

Write a function that will search for file names in a given directory.  The output of the function is all file names that start with a given string.  For example, the file name "abcdef" starts with the given string "abc", so the function prints "abcdef" (without the quotes).  If no name matches, then print nothing.  Do not print names from directory entries that are not regular files.

Suggestion: write or use two more functions to check the strings and to see if a file is "regular".
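
The suggestion above could be followed along these lines.  This is a sketch only: the function names, the fixed 4096-byte path buffer, and the choice of stat() (which follows symbolic links) rather than lstat() are all assumptions.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Returns nonzero if name begins with prefix. */
static int starts_with(const char *name, const char *prefix)
{
    return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Returns nonzero if path names a regular file (symbolic links are followed). */
static int is_regular(const char *path)
{
    struct stat sb;
    return stat(path, &sb) == 0 && S_ISREG(sb.st_mode);
}

/* Prints each regular file in dir whose name starts with prefix.
   Returns the number of names printed, or -1 if dir cannot be opened. */
int print_matching_files(const char *dir, const char *prefix)
{
    DIR *dp = opendir(dir);
    if (dp == NULL)
        return -1;

    int printed = 0;
    struct dirent *de;
    char path[4096];              /* assumed large enough for dir/name */

    while ((de = readdir(dp)) != NULL) {
        if (!starts_with(de->d_name, prefix))
            continue;
        /* stat() needs a path relative to the current directory, not to dir */
        int len = snprintf(path, sizeof path, "%s/%s", dir, de->d_name);
        if (len < 0 || (size_t) len >= sizeof path)
            continue;             /* path would be truncated; skip it */
        if (is_regular(path)) {
            puts(de->d_name);
            printed++;
        }
    }

    closedir(dp);
    return printed;
}
```

Building the full path before calling stat() matters because d_name holds only the entry's name within the directory.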

Here is some actual student code.  Fix it.

        if (c == '\ ')
         { if ((c = getc(stdin)) == "n")
           { newline++; }
           if (c == "r")
           { ret++;
             if ((c = getc(stdin)) == '\ ')
             { if ((c = getc(stdin)) == "n")
               { ret--;

Explain the difference between these four statements:

  int a[];
  int *a;
  extern int a[];
  extern int *a;

In particular, draw a diagram illustrating your answer.

How does the placement of a variable's declaration in a program affect its storage allocation?

Which C keywords affect the way in which a variable is allocated, and the way in which its storage can be accessed?

Why is this declaration illegal in C?

  extern static int a;

What is the scope of a global variable?

The Posix regular expression data types and functions are

  regex_t       the compiled internal form of a regular expression
  regmatch_t    the matching position of a r.e. in a string (starting and
                ending indices)
  regoff_t      signed integer, offset from a string position as an index
  regcomp()     compiles a r.e. written as a string into an internal form
  regexec()     matches the internal form of a r.e. against a string and
                reports position information
  regerror()    transforms error codes from regcomp() or regexec() into
                readable messages
  regfree()     frees any dynamically-allocated storage used by the internal
                form of a regular expression

The types regex_t and regmatch_t are structs, defined with a typedef.

The function prototypes [that you need for this question] are

  int regcomp(regex_t *preg, const char *pattern, int cflags);
  int regexec(const regex_t *preg, const char *string,
              size_t nmatch, regmatch_t pmatch[], int eflags);
  void regfree(regex_t *preg);

A typical simplified usage would be something like

  regex_t re;
  char buffer[1024];
  if (regcomp(&re, "[A-Z][a-z]*", 0) != 0)
    { /* failed */ }
  int status = regexec(&re, buffer, (size_t) 0, NULL, 0);

(a) Explain why the function regfree() is necessary, or at least useful.

(b) Explain why the functions regcomp() and regexec() are separate functions, and are not combined into one function.  [Hint - the answer is an issue of function design, and is largely independent of regular expressions.]
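
To make the usage pattern above concrete, here is a minimal sketch that compiles a pattern once, matches it against several strings, and then releases it.  The helper count_matches is hypothetical, and REG_NOSUB is used on the assumption that only a yes/no answer is wanted.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns how many of the n strings contain a match
   for pattern, or -1 if the pattern does not compile. */
int count_matches(const char *pattern, const char *const lines[], size_t n)
{
    regex_t re;

    /* Compile once; REG_NOSUB says match positions are not needed. */
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    int count = 0;
    for (size_t i = 0; i < n; i++) {
        /* Reuse the compiled form for every string. */
        if (regexec(&re, lines[i], (size_t) 0, NULL, 0) == 0)
            count++;
    }

    /* Release whatever regcomp() allocated inside re. */
    regfree(&re);
    return count;
}
```

The shape of the loop, one regcomp() call followed by many regexec() calls, is worth keeping in mind for part (b).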

Suppose you want to write some functions that cooperate by sharing some data.  Write a short code sequence that demonstrates a safe way to allocate the data, and to restrict its use to these functions.

[There might be some problems caused by sharing data in a multi-threaded program, so you can assume a single-threaded program for this question.]
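
One way to approach this, sketched under the assumption that the cooperating functions all live in one source file, is a file-scope static variable.  The counter and the function names are hypothetical.

```c
/* All of this is assumed to sit in a single source file, e.g. counter.c. */

/* File-scope static: allocated once for the whole run, initialized to 0,
   and invisible to code in other source files (internal linkage). */
static int count;

/* Only the functions in this file can touch count directly. */
void counter_increment(void)
{
    count++;
}

void counter_reset(void)
{
    count = 0;
}

int counter_value(void)
{
    return count;
}
```

Other files can call the three functions through ordinary prototypes, but a declaration such as extern int count; in another file would not link to this variable.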

Why is putenv() better or worse than setenv()?  Your answer must consider how putenv() and setenv() differ in their use of memory and pointers.  When would putenv() be preferred or not?  When would setenv() be preferred or not?

Hint:  Which other data, functions and later actions should you also consider?  Would a picture help?
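
As a starting point for the comparison, the following sketch probes the memory behavior of each call as POSIX describes it.  The variable names GR_DEMO_SET and GR_DEMO_PUT and both function names are made up for illustration.

```c
#define _XOPEN_SOURCE 700         /* makes setenv() and putenv() visible */
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the environment keeps its own copy of the value given to
   setenv(), 0 if not, -1 if setenv() fails. */
int setenv_copies_value(void)
{
    char value[] = "first";
    if (setenv("GR_DEMO_SET", value, 1) != 0)
        return -1;
    strcpy(value, "later");       /* changes only our local array */
    const char *seen = getenv("GR_DEMO_SET");
    return seen != NULL && strcmp(seen, "first") == 0;
}

/* Returns 1 if the environment refers to the very string given to
   putenv(), 0 if not, -1 if putenv() fails. */
int putenv_shares_string(void)
{
    /* static, because the string must outlive this function call */
    static char entry[] = "GR_DEMO_PUT=first";
    if (putenv(entry) != 0)
        return -1;
    /* overwrite the value part in place; same length, so it stays in bounds */
    strcpy(entry + strlen("GR_DEMO_PUT="), "later");
    const char *seen = getenv("GR_DEMO_PUT");
    return seen != NULL && strcmp(seen, "later") == 0;
}
```

If entry were an automatic array instead of a static one, the environment would be left pointing into a dead stack frame after the function returned.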

The regular expression "^ab+c*d$" represents the set of strings that start with 'a', followed by one or more 'b', followed by 0 or more 'c', and end with 'd'.

Write the body of the following function, to return 1 if the argument string is in this set, or 0 if not.

int func(const char * s)
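
One possible body is sketched here as a hand-written scanner that walks the string once, on the assumption that s is a valid null-terminated string.  Using regcomp() and regexec() would be another reasonable answer.

```c
/* Returns 1 if s matches ^ab+c*d$ and 0 otherwise. */
int func(const char *s)
{
    if (*s != 'a')                /* must start with 'a' */
        return 0;
    s++;

    if (*s != 'b')                /* b+ needs at least one 'b' */
        return 0;
    while (*s == 'b')
        s++;

    while (*s == 'c')             /* c* allows zero or more 'c' */
        s++;

    if (*s != 'd')                /* then exactly one 'd' */
        return 0;
    s++;

    return *s == '\0';            /* and nothing after it */
}
```

For example, "abd" and "abbccd" are accepted, while "ad", "abc", and "abdx" are rejected.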

Explain why this loop runs slowly.  Rewrite the loop so that it will run faster.

   char *buf = pointer_to_a_valid_character_string;

   for (int i = 0; i < strlen(buf); i++)
     { /* loop body */ }

What specific assumptions did you make about the loop body?
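
A possible rewrite is sketched below, assuming the loop body never changes the length of the string.  The body shown (counting occurrences of a character) and the name count_char are stand-ins for the unspecified loop body.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical loop body: count how often target occurs in buf. */
size_t count_char(const char *buf, char target)
{
    size_t n = strlen(buf);       /* scan for the '\0' once, before the loop */
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (buf[i] == target)
            hits++;
    }
    return hits;
}
```

With strlen() in the loop condition, each iteration may rescan the whole string, so the original loop can take time proportional to the square of the string length.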

Explain why this code sequence has a serious bug.  Its intent is to construct the character string b from the "tail" of character string a.  Rewrite the code sequence so that it will run correctly.

   char a[large_number];  // initialized to a valid character string
   char b[large_number];

   for (int i = 8; i < strlen(a); i++)
     { b[i-8] = a[i]; }
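
One reading of the bug is that the loop never copies the terminating '\0', so b is not a valid string afterwards.  The sketch below fixes that and also handles a source string shorter than the skip distance; the function name and the capacity parameter are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Copies the part of a that starts at index skip into b, which has room
   for bsize bytes.  Returns 0 on success, -1 if the tail does not fit. */
int copy_tail(char *b, size_t bsize, const char *a, size_t skip)
{
    size_t len = strlen(a);
    size_t start = (skip < len) ? skip : len;   /* short a gives an empty tail */
    size_t tail_len = len - start;

    if (tail_len + 1 > bsize)
        return -1;

    memcpy(b, a + start, tail_len + 1);         /* + 1 copies the '\0' too */
    return 0;
}
```

For the arrays in the question, the call would look like copy_tail(b, sizeof b, a, 8).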

What is wrong with this code sequence, which prints an error message?

[The parts marked ... indicate code that is correct, so that's not where the bugs are.]

    if ( ... )
      { printf("%s: failed, %s\n", strerror(errno)); exit(0); }
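
For comparison, a version that avoids the usual complaints might look like the sketch below.  It assumes the first %s was meant to be a program or function name, and the helper report_failure is hypothetical.

```c
#include <errno.h>                /* callers pass errno as the err argument */
#include <stdio.h>
#include <string.h>

/* Writes "<name>: failed, <reason>" to out, with one argument for each %s.
   Returns the number of characters written, as fprintf() does. */
int report_failure(FILE *out, const char *name, int err)
{
    return fprintf(out, "%s: failed, %s\n", name, strerror(err));
}
```

A call site could then read report_failure(stderr, argv[0], errno); exit(EXIT_FAILURE); so that the message goes to the standard error stream and the exit status reports failure.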


This multiple-choice question is a lot harder than it looks.

To discover whether or not the function getc() is actually a macro,

    (1)  read the Posix Standard specification for getc().
    (2)  read the man page for getc().
    (3)  read the include file <stdlib.h>.
    (4)  write a program that prints &getc (since macros don't have addresses).

(3) is not correct, because you need to look at <stdio.h>

  #include <stdio.h>
  int getc(FILE *stream);

The C Standard
"The getc function is equivalent to fgetc, except that if it is implemented as a
macro, ...".  The standard requires fgetc() to be a function.

CP:AMA (p. 567)
"getc is usually implemented as a macro (as well as a function), while fgetc is
only implemented as a function.  Since getc is normally available in macro form,
it tends to be faster."  [there is more info as well]

C:ARM (p. 375)
"The function getc is identical to fgetc except that getc is usually implemented
as a macro for efficiency."

The Posix Standard [an old edition]
"The functionality described on this reference page is aligned with the ISO C
standard.  Any conflict between the requirements described here and the ISO C
standard is unintentional.  This volume of IEEE Std 1003.1-2001 defers to the
ISO C standard."
"The getc() function shall be equivalent to fgetc(), except that if it is
implemented as a macro it may evaluate stream more than once, so the argument
should never be an expression with side effects."

APUE (p. 140)
"The difference between the first two functions is that getc can be implemented
as a macro, whereas fgetc cannot be implemented as a macro."

The Solaris man page
"The getc() function is functionally identical to fgetc(), except that it is
implemented as a macro."

The Linux man page
"getc() is equivalent to fgetc() except that it may be implemented as a macro
... ."

The Mac OS X man page
"The getc() function acts essentially identically to fgetc(), but is a macro
that expands in-line."

Test program

  #include <stdio.h>

  int main(void)
  {
    #if defined(getc)
      printf("getc is a macro\n");      // [1]
    #endif
    printf("%p\n", &getc);              // [2]
    return 0;
  }

C89, 32-bit
                Solaris         Linux           Mac OS X
[1] only        macro           macro           (nothing)
[2] only        address         address         address
[1] and [2]     both            both            address

C99, 32-bit
                Solaris         Linux           Mac OS X
[1] only        (nothing)       macro           (nothing)
[2] only        address         address         address
[1] and [2]     address         both            address

Can you explain why outputs [1] and [2] could both occur?

Solaris /usr/include/stdio.h --> /usr/include/iso/stdio_iso.h
Depending on how the program is compiled, we could have
  extern int getc(FILE *);
  #define getc(p) (--(p)->_cnt < 0 ? __filbuf(p) : (int)*(p)->_ptr++)

Can you explain how the macro implementation works?
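
A toy model may help with both of the questions above.  Everything in this sketch (struct toy_file, toy_filbuf, toy_getc) is a made-up stand-in for the real stdio internals, and unlike the real __filbuf() the toy version never refills the buffer.

```c
#include <stdio.h>                /* for EOF */

/* Toy stand-in for FILE: a buffer position and a count of unread bytes. */
struct toy_file {
    int _cnt;                     /* bytes left in the buffer */
    const unsigned char *_ptr;    /* next byte to hand out */
};

/* Toy stand-in for __filbuf(): the real one would refill the buffer and
   return its first byte; this one just reports end of file. */
int toy_filbuf(struct toy_file *p)
{
    p->_cnt = 0;
    return EOF;
}

/* A real function, so that &toy_getc has an address. */
int toy_getc(struct toy_file *p)
{
    return --p->_cnt < 0 ? toy_filbuf(p) : (int) *p->_ptr++;
}

/* A macro with the same name, defined after the function.  It expands
   only where toy_getc is followed by '(' and it evaluates p more than once. */
#define toy_getc(p) (--(p)->_cnt < 0 ? toy_filbuf(p) : (int) *(p)->_ptr++)
```

With these definitions, toy_getc(&f) uses the macro, (toy_getc)(&f) calls the function, and &toy_getc is the function's address, which is how a header can provide both forms at once.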