/* CMPSC 311, Linked List, demo version 1a
* data structures
* function prototypes
* functions
* test program
*/
struct node {
};
struct list {
};
struct list demo;
list_insert(demo, node_create("foo"));
list_print(demo);
//
What’s wrong here? [discussion]
Obviously, the struct definitions are not finished yet. We need to add
some pointers and some data.
Here is a simple boxes-and-arrows diagram, comparing two choices for
struct list. The symbol === represents some data, and the symbol >>>
represents a pointer, with 000 being a null pointer. The end of the
list is easily recognized in this example. An empty list is
represented with a null pointer in the initial struct list.
list    node    node    node    node
[>>>]   [>>>]   [>>>]   [>>>]   [000]
[===]   [===]   [===]   [===]   [===]

list    node    node    node    node
[>>>]   [>>>]   [>>>]   [>>>]   [000]
        [===]   [===]   [===]   [===]
We made a design decision that a list should be distinct from a node,
since a list is a set of nodes. There might be some form of data that
goes with the list, that does not go with each node on the list.
Moreover, there are operations on lists that are not operations on
nodes, so we should not confuse the two data types. It is possible to
get by with only one of these types, if you are willing to make some
compromises, but that's a bad idea, as we'll discuss later.
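As a rough sketch of the first diagram, the two struct definitions might be filled in as follows. The member names data, next, head and count are assumptions made for illustration; the notes have not chosen them yet, and count merely stands in for "data that goes with the list". The helper list_is_empty() is likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One element of the list: some data (===) and a pointer (>>>). */
struct node {
    char *data;            /* hypothetical payload */
    struct node *next;     /* next node, or NULL (000) at the end */
};

/* The list itself: a pointer to the first node, plus list-wide data. */
struct list {
    struct node *head;     /* first node, or NULL for an empty list */
    size_t count;          /* hypothetical list-wide data */
};

/* An operation on lists that is not an operation on nodes. */
int list_is_empty(const struct list *list)
{
    assert(list != NULL);
    return (list->head == NULL);
}
```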
We could have written
typedef struct node {
} node_t;
typedef struct list {
} list_t;
which allows us to use the type names node_t and list_t instead of
struct node and struct list. This is not just a cosmetic change.
Consider the difference between the declarations
struct list foo;
and
list_t foo;
The second version allows us to use the type definition
typedef struct node *list_t;
which is a simpler design (see the second boxes-and-arrows example).
We'll come back to this later when we consider some more design
choices.
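To make the second choice concrete, here is a minimal sketch in which list_t is just a pointer to the first node, so an empty list is a null pointer. The member names and the helpers list_push() and list_length() are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    char *data;            /* hypothetical payload */
    struct node *next;     /* next node, or NULL at the end */
};

/* The list is nothing more than a pointer to its first node. */
typedef struct node *list_t;

/* Put nodep at the front and return the new first node. */
list_t list_push(list_t list, struct node *nodep)
{
    assert(nodep != NULL);
    nodep->next = list;
    return nodep;
}

/* Count the nodes by following pointers until the null pointer. */
size_t list_length(list_t list)
{
    size_t n = 0;
    for (struct node *nodep = list; nodep != NULL; nodep = nodep->next)
        n++;
    return n;
}
```

With this design there is nowhere to keep data that belongs to the list as a whole, which is one of the compromises mentioned above.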
The _t style is used so you can easily distinguish a type name from
other kinds of names. The POSIX Standard actually reserves type names
like this for use by the Standard, and discourages what we did. See
the System Interfaces volume, Sec. 2.2.2, The Name Space, and the
Rationale volume, Sec. B.2.2.2, for more details, but it's mostly
reading-between-the-lines. The risk is that some future POSIX header
file will define a type node_t or list_t, and then we'll have a
conflict. If you really want to avoid this kind of bug, then use
node_type and list_type instead.
If this were C++, you could declare
list foo;
since omitting the keyword struct is allowed in some circumstances.
See C:ARM Sec. 4.9.2 for details.
This is legal in both C and C++,
typedef struct node {
} node;
so you could write
int test_1(struct node *node)
{
    return (node == NULL);
}
int test_2(node *node)
{
    return (node == NULL);
}
but then you should question the wisdom of test_2(). Good code is
obviously correct, and test_2() requires too much thought. A
common practice is to write pnode or nodep for the parameter, where the
initial or final letter p indicates a pointer, and that solves a lot of
problems.
demo is not explicitly initialized. If demo is declared as a global
variable, it will be implicitly initialized to zero, according to the
rules of C: every pointer member becomes a null pointer and every
arithmetic member becomes 0 (on most machines, that is the value with
all 0 bits). If demo is declared as a local variable, it will be
stored on the runtime stack, and its value will inherit (by reuse of
the stack memory) whatever old junk is there already. Bug city.
We could give demo an explicit value as part of its definition, as an
assignment, or through a function call. We should provide for all
three possibilities, and will do this later.
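The three possibilities might look like the following sketch. The names LIST_INITIALIZER, list_empty and list_init(), and the members head and count, are hypothetical; the notes have not settled on any of them.

```c
#include <assert.h>
#include <stddef.h>

struct node;                       /* details not needed here */

struct list {
    struct node *head;             /* hypothetical: first node or NULL */
    size_t count;                  /* hypothetical: number of nodes */
};

/* 1. As part of a definition:  struct list a = LIST_INITIALIZER; */
#define LIST_INITIALIZER { NULL, 0 }

/* 2. As an assignment:  b = list_empty; */
static const struct list list_empty = LIST_INITIALIZER;

/* 3. Through a function call:  list_init(&c); */
void list_init(struct list *list)
{
    assert(list != NULL);
    list->head = NULL;
    list->count = 0;
}
```

Any of the three leaves the list in the same well-defined empty state, whether it is a global or a local variable.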
Parameter passing in C is always call-by-value. Let's use the
symbol *** for some code that is omitted to simplify the
discussion. The function prototypes implied by the examples are
*** node_create(char *str);
void list_insert(struct list list, ***);
void list_print(struct list list);
The function list_print() does not present a problem, only an
inefficiency if struct list is a large thing. Under the rules of
call-by-value, the entire function argument is copied to a temporary
location (on the runtime stack) before the function starts to execute,
and that copy is now the function parameter. Since list_print()
probably only reads its parameters, this will not cause anything to go
wrong, as we'll just read a copy of the original argument.
But we're in for a serious head-scratching, head-banging debugging
session with list_insert(). The argument demo is copied to the runtime
stack, and then list_insert() does its thing. To the copy. Not to the
original. That's the whole point of call-by-value, as a safety
precaution. But that's not what we want to happen here. The original
struct list has not been altered; the copy has been altered, and the
copy is thrown away when the list_insert() function returns. So as far
as the original struct list is concerned, list_insert() did nothing at
all. That's not what we intended, so it's a bug, and we'll fix it in
the next version, by passing a pointer to a struct list instead of a
struct list. We'll do the same for list_print(), for consistency.
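The difference is easy to demonstrate. In this sketch, list_insert_by_value() is a hypothetical name for the buggy version, list_insert() takes a pointer as proposed for the next version, and the struct members are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    const char *data;
    struct node *next;
};

struct list {
    struct node *head;
};

/* Buggy: list is a copy of the argument, discarded on return. */
void list_insert_by_value(struct list list, struct node *nodep)
{
    assert(nodep != NULL);
    nodep->next = list.head;
    list.head = nodep;         /* changes the copy only */
}

/* Fixed: list points at the caller's struct list. */
void list_insert(struct list *list, struct node *nodep)
{
    assert(list != NULL && nodep != NULL);
    nodep->next = list->head;
    list->head = nodep;        /* changes the original */
}
```

After list_insert_by_value(demo, nodep) the caller's demo.head is unchanged; after list_insert(&demo, nodep) it points at the new node.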
What would happen if node_create()
failed? It is possible to run out of memory, although that's
unlikely. Just because a problem is unlikely to occur does not
mean it can be ignored.
Should list_insert() return a success/failure indicator?
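One possible answer, sketched under some assumptions: node_create() reports failure by returning a null pointer, and list_insert() returns 0 on success and -1 on failure. Neither convention comes from the notes so far, the struct members are hypothetical, and the parameter is declared const char * here rather than char *.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    char *data;
    struct node *next;
};

struct list {
    struct node *head;
};

/* Returns a new node holding a copy of str, or NULL if out of memory. */
struct node *node_create(const char *str)
{
    assert(str != NULL);
    struct node *nodep = malloc(sizeof *nodep);
    if (nodep == NULL)
        return NULL;
    size_t len = strlen(str) + 1;
    nodep->data = malloc(len);
    if (nodep->data == NULL) {
        free(nodep);               /* do not leak the half-built node */
        return NULL;
    }
    memcpy(nodep->data, str, len);
    nodep->next = NULL;
    return nodep;
}

/* Returns 0 on success, -1 if there is no list or no node to insert. */
int list_insert(struct list *list, struct node *nodep)
{
    if (list == NULL || nodep == NULL)
        return -1;
    nodep->next = list->head;
    list->head = nodep;
    return 0;
}
```

With these conventions, list_insert(&demo, node_create("foo")) returns -1 when node_create() runs out of memory, instead of dereferencing a null pointer.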
list_print() should print something special if the list is empty, or if
the list is somehow "strange", but it really should not fail. We
could add another parameter indicating where to print, or just assume
stdout. When list_print() is used for debugging information,
printing to stderr might be a good idea.
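A sketch of that idea, assuming an extra FILE * parameter named out and the hypothetical struct members shown here. The exact messages are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

struct node {
    const char *data;
    struct node *next;
};

struct list {
    struct node *head;
};

/* Print the list to out (stdout, stderr, or any open stream). */
void list_print(const struct list *list, FILE *out)
{
    const struct node *nodep;

    assert(out != NULL);
    if (list == NULL) {                 /* a "strange" list */
        fprintf(out, "(no list)\n");
        return;
    }
    if (list->head == NULL) {
        fprintf(out, "(empty list)\n");
        return;
    }
    for (nodep = list->head; nodep != NULL; nodep = nodep->next)
        fprintf(out, "%s\n", nodep->data != NULL ? nodep->data : "(no data)");
}
```

A caller would write list_print(&demo, stdout) for normal output, or list_print(&demo, stderr) for debugging information.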