311, Introduction to Systems Programming
Introduction to Unix
It is stated in APUE (p. 4),
"The only two characters that cannot
appear in a filename are the slash character (/)
and the null character. The slash separates the filenames that form a
pathname and the null character terminates a pathname. Nevertheless,
it's good practice to restrict the characters in a filename to a subset
of the normal printing characters."
- Why are DOS-style pathnames incompatible with C character strings
  (for example, a pathname ending in Express\MSIMN.EXE)?
- Why is it a bad idea to use the space character in a filename?
- Why is it a bad idea to use a semicolon in a filename?
- Same question, ampersand
- Same question, question mark, ampersand, equals sign
It helps to remember that character strings in C are represented as an
array of characters whose contents follow one simple rule. The first
character in the array is the first character of the string, and the
first null character '\0' indicates the end of the string. The
"null terminator" is not part of the string, but it must be stored with
the string because there is no other way to find the end of the
string. Of course, that's not the only thing to remember ...
Why are DOS-style pathnames incompatible with C character strings?
If you try to write the pathname as a
character string in C, in the obvious way, you get "
Express\MSIMN.EXE", but the backslash is interpreted by C as
starting an escape sequence. You would need to write each backslash
as a double backslash (\\) to fix the problem. This probably means
rewriting the string into a larger char array, or
altering a string literal.
Why is it a bad idea to use the space character in a filename?
At the shell command level, spaces are
used to separate words in a command. The command
ls Program Files
complains about not finding the files Program and Files. Write either
of the following instead:
ls 'Program Files'
ls "Program Files"
Depending on how the output of ls
is arranged, and how many spaces are in the file name, you could be
left guessing. Does a listing tell me I have two files, or one file
with seven spaces in the middle of its name?
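The word splitting described above can be observed directly. The following is a minimal sketch, assuming a POSIX-style shell (sh, bash, dash); count_args is a hypothetical helper introduced here for illustration, not part of the original notes. It prints how many arguments the shell actually passed.

```shell
# count_args is a hypothetical helper: it prints how many arguments
# the shell passed to it after word splitting.
count_args() {
    echo "$#"
}

count_args Program Files      # unquoted: the shell passes two words
count_args 'Program Files'    # single quotes: one word
count_args "Program Files"    # double quotes: one word

# The same applies to a name held in a variable.
name='Program Files'
count_args $name              # unquoted expansion is split into two words
count_args "$name"            # quoted expansion stays one word
```

This is why scripts that handle filenames usually quote every variable expansion: the quoting costs nothing when the name is simple and prevents the split when it is not.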
Why is it a bad idea to use a semicolon in a filename?
Same question, ampersand
Same question, question mark, ampersand, equals sign
Same question, percent sign
Most command shells use a semicolon as
a separator between commands. One day, you will write a script
that goes wrong when the second half of a filename could be
misinterpreted as the start of the next command.
There are similar problems with ampersand. It also separates
commands, but the shell doesn't wait for the first command to complete
before starting the second command.
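One way such a script can go wrong is sketched below, assuming a POSIX-style shell. The filenames are hypothetical, chosen so that the text after the separator is a harmless echo command. The unsafe form pastes the name into command text that a child shell parses again; the safer form passes the name as a positional parameter, so it is never parsed as a command.

```shell
# Hypothetical filenames chosen for illustration.
semi_name='notes;echo HIJACKED'
amp_name='notes&echo HIJACKED'

# Unsafe: the name becomes part of the command text, so the child
# shell treats the semicolon as a command separator and runs two
# commands: "echo notes" and "echo HIJACKED".
unsafe=$(sh -c "echo $semi_name")

# Safer: pass the name as a positional parameter and quote "$1", so
# the child shell never parses the name as command text.
safe_semi=$(sh -c 'echo "$1"' sh "$semi_name")
safe_amp=$(sh -c 'echo "$1"' sh "$amp_name")

printf 'unsafe:\n%s\n' "$unsafe"
printf 'safe:   %s\n' "$safe_semi"
printf 'safe:   %s\n' "$safe_amp"
```

The unsafe form with the ampersand name would also run two commands, but the first would run in the background, so the order of their output is not guaranteed.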
Web browser URL syntax gets confused when the filename contains ?,
& and =. Create two files whose contents differ,
echo one > filename
echo two > 'filename?a=foo&b=bar'
We need the single-quotes to prevent the command shell from doing
something "interesting" to the second filename. Now start your
favorite browser, and try to open the second file. Most browsers
notice the ?, and replace it in the URL with %3F
(the ASCII character code for ? in hexadecimal is 3F), so you get the
second file. Now manually change the %3F in the URL back to ?, and try
again. You will get the first file.
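The replacement the browser performs can be imitated in the shell. This is a rough sketch, assuming a POSIX-style shell and ASCII-only names; urlencode is a hypothetical helper written for this note, not a standard command. It keeps letters, digits, and the characters . _ ~ - / as they are, and rewrites every other character as a percent sign followed by its two-digit hexadecimal code.

```shell
# urlencode is a hypothetical helper (ASCII input assumed).
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~/-]) out=$out$c ;;
            # A leading quote makes printf use the character's numeric code.
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

urlencode 'filename?a=foo&b=bar'   # prints filename%3Fa%3Dfoo%26b%3Dbar
```

With the ?, & and = escaped this way, the whole name is treated as a single path rather than as a path followed by a query string.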
OK, now you can guess what happens with the last case.
Last revised, 9 Jan. 2012