/* CMPSC 311, Linked List, demo version 1a
 *   data structures
 *   function prototypes
 *   functions
 *   test program
 */

struct node {
};

struct list {
};

  struct list demo;

  list_insert(demo, node_create("foo"));

  list_print(demo);

  // What’s wrong here?  [discussion]



Obviously, the struct definitions are not finished yet.  We need to add some pointers and some data.

Here is a simple boxes-and-arrows diagram, comparing two choices for struct list.  The symbol === represents some data, and the symbol >>> represents a pointer, with 000 being a null pointer.  The end of the list is easily recognized in this example.  An empty list is represented with a null pointer in the initial struct list.

list    node    node    node    node
[>>>]   [>>>]   [>>>]   [>>>]   [000]
[===]   [===]   [===]   [===]   [===]

list    node    node    node    node
[>>>]   [>>>]   [>>>]   [>>>]   [000]
        [===]   [===]   [===]   [===]
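
As a rough sketch, the two choices in the diagram might turn into definitions like the following.  The member names next, data, head and count, the type name struct list_2, and the helper list_is_empty() are all assumptions made up for illustration; the notes have not committed to any of them yet.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: every member name here is an assumption. */
struct node {
  struct node *next;   /* >>> : next node, or NULL (000) at the end */
  const char *data;    /* === : data carried by each node */
};

/* First choice: the list has a pointer and some data of its own. */
struct list {
  struct node *head;   /* >>> : first node, or NULL for an empty list */
  size_t count;        /* === : hypothetical per-list data */
};

/* Second choice: the list is nothing more than a pointer. */
struct list_2 {
  struct node *head;
};

/* An empty list is a null pointer in the initial struct list. */
int list_is_empty(const struct list *list)
{
  assert(list != NULL);
  return list->head == NULL;
}
```

The first choice leaves room for per-list data such as a node count; the second is smaller but has nowhere to put it.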

We made a design decision that a list should be distinct from a node, since a list is a set of nodes.  There might be some form of data that goes with the list, that does not go with each node on the list.  Moreover, there are operations on lists that are not operations on nodes, so we should not confuse the two data types.  It is possible to get by with only one of these types, if you are willing to make some compromises, but that's a bad idea, as we'll discuss later.

We could have written

typedef struct node {
} node_t;

typedef struct list {
} list_t;

which allows us to use the type names node_t and list_t instead of struct node and struct list.  This is not just a cosmetic change.  Consider the difference between the declarations

struct list foo;

and

list_t foo;

The second version hides the representation behind the name list_t, so we could later switch to the type definition

typedef struct node *list_t;

which is a simpler design (see the second boxes-and-arrows example).  We'll come back to this later when we consider some more design choices.

The _t style is used so you can easily distinguish a type name from other kinds of names.

The POSIX Standard actually reserves names ending in _t for its own use, and discourages what we did.  See the System Interfaces volume, Sec. 2.2.2, The Name Space, and the Rationale volume, Sec. B.2.2.2, for more details, but it's mostly reading-between-the-lines.  The risk is that some future POSIX header file will define a type node_t or list_t, and then we'll have a conflict.  If you really want to avoid this kind of bug, then use node_type and list_type instead.

If this were C++, you could declare

list foo;

since omitting the keyword struct is allowed in some circumstances.  See C:ARM Sec. 4.9.2 for details.

This is legal in both C and C++,


typedef struct node {
} node;

so you could write

int test_1(struct node *node)
{
  return (node == NULL);
}

int test_2(node *node)
{
  return (node == NULL);
}

but then you should question the wisdom of test_2().  Good code is obviously correct, and test_2() requires too much thought.  A common practice is to write pnode or nodep for the parameter, where the initial or final letter p indicates a pointer, and that solves a lot of problems.



demo is not explicitly initialized.  If demo is declared as a global variable, it will be implicitly initialized to zero (null pointers and zero-valued numbers), according to the rules of C.  If demo is declared as a local variable, it will be stored on the runtime stack, and its value will be whatever old junk is already there, left over from earlier use of that stack memory.  Bug city.

We could give demo an explicit value as part of its definition, as an assignment, or through a function call.  We should provide for all three possibilities, and will do this later.
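
Here is one way the three possibilities might look, as a sketch only.  The names LIST_INITIALIZER, list_empty and list_init(), and the single member head, are hypothetical; the real versions come later.

```c
#include <assert.h>
#include <stddef.h>

struct node;                      /* details not needed here */

/* Assumed layout: a single pointer (hypothetical member name). */
struct list {
  struct node *head;
};

/* 1. An initializer for use in a definition (hypothetical macro). */
#define LIST_INITIALIZER { NULL }

/* 2. A value that can be assigned (hypothetical name). */
const struct list list_empty = { NULL };

/* 3. A function call (hypothetical name).  It takes a pointer so
 *    that it changes the caller's struct list and not a copy. */
void list_init(struct list *list)
{
  assert(list != NULL);
  list->head = NULL;
}
```

With these in place, a local variable could be set up as struct list a = LIST_INITIALIZER; or by the assignment b = list_empty; or by the call list_init(&c); and none of the three would depend on old stack contents.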



Parameter passing in C is always call-by-value.  Let's use the symbol *** for some code that is omitted to simplify the discussion.  The function prototypes implied by the examples are

*** node_create(char *str);
void list_insert(struct list list, ***);
void list_print(struct list list);

The function list_print() does not present a problem, only an inefficiency if struct list is a large thing.  Under the rules of call-by-value, the entire function argument is copied to a temporary location (on the runtime stack) before the function starts to execute, and that copy is now the function parameter.  Since list_print() probably only reads its parameters, this will not cause anything to go wrong, as we'll just read a copy of the original argument.

But, we're in for a serious head-scratching head-banging debugging session with list_insert().  The argument demo is copied to the runtime stack, and then list_insert() does its thing.  To the copy.  Not to the original.  That's the whole point of call-by-value, as a safety precaution.  But that's not what we want to happen here.  The original struct list has not been altered, the copy has been altered, and the copy is thrown away when the list_insert() function returns.  So as far as the original struct list is concerned, list_insert() did nothing at all.  That's not what we intended, so it's a bug, and we'll fix it in the next version, by passing a pointer to a struct list instead of a struct list.  We'll do the same for list_print(), for consistency.
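
The difference can be seen in a small sketch.  The function names insert_by_value() and insert_by_pointer(), and the members head, next and data, are assumptions for illustration, not the functions of the demo itself.

```c
#include <assert.h>
#include <stddef.h>

struct node {
  struct node *next;
  const char *data;
};

struct list {
  struct node *head;
};

/* Buggy: list is a copy of the argument, so the caller's
 * struct list is never changed. */
void insert_by_value(struct list list, struct node *node)
{
  node->next = list.head;
  list.head = node;          /* changes the copy only */
}

/* Fixed: list points at the caller's struct list. */
void insert_by_pointer(struct list *list, struct node *node)
{
  assert(list != NULL && node != NULL);
  node->next = list->head;
  list->head = node;         /* changes the original */
}
```

After insert_by_value(demo, &n) the head pointer in demo is still null; after insert_by_pointer(&demo, &n) it points to n.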



What would happen if node_create() failed?  It is possible to run out of memory, although that's unlikely.  Just because a problem is unlikely to occur does not mean it can be ignored.
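
A minimal sketch of a node_create() that reports failure, assuming it returns a pointer to a new node and uses NULL to mean "out of memory".  The return type, the const on the parameter, the member names, the decision to copy the string, and the helper node_destroy() are all assumptions, since the prototype above leaves them open.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
  struct node *next;
  char *data;
};

/* Returns a new node holding a copy of str, or NULL if
 * memory cannot be allocated. */
struct node *node_create(const char *str)
{
  assert(str != NULL);

  struct node *node = malloc(sizeof *node);
  if (node == NULL)
    return NULL;

  size_t len = strlen(str) + 1;    /* include the terminating '\0' */
  node->data = malloc(len);
  if (node->data == NULL) {
    free(node);                    /* don't leak the half-built node */
    return NULL;
  }
  memcpy(node->data, str, len);
  node->next = NULL;
  return node;
}

/* Releases a node made by node_create(); NULL is allowed. */
void node_destroy(struct node *node)
{
  if (node != NULL) {
    free(node->data);
    free(node);
  }
}
```

With this convention, a call like list_insert(demo, node_create("foo")) can hand a null pointer to list_insert(), so list_insert() has to be prepared for that.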

Should list_insert() return a success/failure indicator?

list_print() should print something special if the list is empty, or if the list is somehow "strange", but it really should not fail.  We could add another parameter indicating where to print, or just assume stdout.  When list_print() is used for debugging information, printing to stderr might be a good idea.
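
One possible shape for list_print(), as a sketch under several assumptions: the extra FILE * parameter, the pointer-to-list parameter, the member names, the messages printed for the special cases, and the int return value (the number of nodes printed) are all invented here for illustration.

```c
#include <assert.h>
#include <stdio.h>

struct node {
  struct node *next;
  const char *data;
};

struct list {
  struct node *head;
};

/* Prints the list to out and returns the number of nodes printed.
 * A missing or empty list gets a special message instead of a failure. */
int list_print(FILE *out, const struct list *list)
{
  assert(out != NULL);

  if (list == NULL) {
    fprintf(out, "(no list)\n");
    return 0;
  }
  if (list->head == NULL) {
    fprintf(out, "(empty list)\n");
    return 0;
  }

  int count = 0;
  for (const struct node *p = list->head; p != NULL; p = p->next) {
    fprintf(out, "%s%s", count ? " -> " : "", p->data ? p->data : "(null)");
    count++;
  }
  fputc('\n', out);
  return count;
}
```

A caller would pass stdout for normal output, or stderr when the listing is debugging information.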