CMPSC 311, Introduction to Systems Programming
The C Standard
Reading
- CP:AMA
- Sec. 1.1, History of C
- Ch. 4, Expressions, is all review except for these parts:
- Sec. 4.1, top of p. 55, Implementation-Defined Behavior
- Sec. 4.2, top of p. 59, Side Effects; p. 59-60, Lvalues
- Sec. 4.3, Q&A, top of p. 68, Sequence Points
- Sec. 4.4, middle of p. 65, Undefined Behavior
- Sec. 14.3, Macro Definitions, esp. pp. 329-331, Predefined
Macros
- Ch. 21, The Standard Library, esp. Sec. 21.1-3
- Ch. 23, Library Support for Numbers and Character Data, esp.
Sec. 23.2
- Ch. 27, Additional C99 Support for Mathematics, esp. Sec.
27.1
- App. B, C99 versus C89
- App. C, C89 versus K&R C
- App. D, Standard Library Functions
- CS:APP
- p. 32-33, Aside, The evolution of the C programming language
- C:ARM, Preface, Ch. 1 (Introduction)
- APUE, Sec. 2.1 (UNIX Standardization and Implementations,
Introduction), Sec. 2.2.1 (ISO C), Sec. 2.5 (Limits), Sec. 2.5.1
(ISO C Limits)
References
- listed at the end of the page
The C Standard describes the
syntax, semantics and execution environment of the language and
its associated libraries.
- What does the program look like?
- What is the meaning of the program?
- What happens when you run the program?
The official designation of the standard for the C programming
language is ISO/IEC 9899:1999, and the newly-approved version is
ISO/IEC 9899:2011.
The standard is produced by ISO/IEC JTC 1 / SC 22 / WG 14, http://www.open-std.org/jtc1/sc22/wg14/.
- ISO = International Organization for Standardization, http://www.iso.org/
- National standards groups include
- ANSI = American National Standards Institute
- BSI = British Standards Institution
- DIN = Deutsches Institut für Normung
- JISC = Japanese Industrial Standards Committee
- etc.
- IEC = International Electrotechnical Commission, http://www.iec.ch/
- Voting on International Standards is done by the national
standards bodies.
- JTC 1 = Joint Technical Committee 1 (Information technology)
- SC 22 = Subcommittee 22 (Programming languages, their
environments and system software interfaces)
- WG 14 = Working Group 14, the international standardization
working group for the programming language C.
- The ANSI Technical Committee for C is now designated INCITS
J11, and it often meets in conjunction with WG14.
- SC 22 / WG 21 is the C++ Standards Committee.
- SC 2 is responsible for Coded Character Sets.
- SC 2 / WG 2 was responsible for UCS - the Universal
Multiple-Octet Coded Character Set - ISO/IEC 10646, which is
essentially Unicode.
- The C header <iso646.h> takes its name from an older coded
character set standard, ISO/IEC 646 (a 7-bit character set), not
from ISO/IEC 10646.
- SC 22 / WG 23, Programming Language Vulnerabilities, has been a
special project since 2005.
- Their FAQ is good reading.
"All programming languages have
constructs that are undefined, imperfectly defined,
implementation-dependent, or difficult to use correctly. As
a result, software programs can execute differently than intended
by the writer. In some cases, these vulnerabilities can be
exploited by an attacker to compromise the safety, security, and
privacy of a system.
ISO/IEC JTC 1/SC 22/WG 23 is preparing
comparative guidance spanning multiple programming languages, so
that application developers will be better able to avoid the
programming errors that lead to vulnerabilities in these languages
and their attendant consequences. This guidance can also be
used by developers to select source code evaluation tools that can
discover and eliminate coding errors that lead to vulnerabilities.
The project is preparing an ISO/IEC
Technical Report containing guidance to users of programming
languages on how to avoid the vulnerabilities that exist in the
programming language selected for a particular project. The
document is tentatively scheduled for publication in 2010."
[and it was]
Why do we need a standard for a
programming language?
- To define the nature and behavior of programs written in that
language, so we have some idea what to expect when the programs
are compiled and executed.
- To allow programs to be written on one computer system, and
moved to another, and work properly on both systems.
We need not just "some idea", but "as clear an idea as possible"
without overly constraining the design of computer systems.
Together, these goals allow a program to be portable from one system to
another.
In general, the standard describes what is required, what is
prohibited, and what is allowed within certain ranges.
In detail, the standard describes
- the representation of C programs
- the syntax and constraints of the C language
- the semantic rules for interpreting C programs
- the representation of input data to be processed by C programs
- the representation of output data produced by C programs
- the restrictions and limits imposed by a conforming
implementation of C
The standard is divided into preliminary elements (some of which are
discussed here), and
- the characteristics of environments that translate and execute
C programs
- the language syntax, constraints, and semantics
- the library facilities
The C Standard is like a contract between implementers and users of
the programming language and its associated tools. In some
countries, there may actually be legal consequences if you assert
that your software is "Standards-Conforming". In most large
companies, the claim of being standards-conforming may be an
important part of bidding on a software development project.
But, there is also a lot that is not specified. By
concentrating on the observable features of programs, the Standard
avoids discussing the internals of compilers and instruction set
architectures, and the use, limits and design of operating systems
and input/output subsystems.
The Standard is accompanied by a non-binding Rationale document that
explains the decisions of the C Standard Committee (up to April
2003).
Historical timeline
- early 1970's, initial work on C at Bell Labs by Dennis
Ritchie, influenced by Ken Thompson and others
- Thompson's Unix work had begun there in the late 1960's,
first in assembler, later in C. This explains a.out (assembler
output) as the default name for an executable file produced by
the C compiler.
- 1978, publication of The C Programming Language, by Brian
Kernighan and Dennis Ritchie
- This version of the language is known as K&R C, but
there were a number of points that were not completely
specified or satisfactory to all.
- 1984, an informal standard for the C library, by /usr/group
(founded 1980, now UniForum)
- 1989, ANSI X3.159-1989, ANSI Standard C, C89
- The goal was to codify common existing practice, largely but
not entirely as described by K&R, /usr/group, and various
extensions. The process began in 1983. The second
edition of K&R, published in 1988, describes an "almost
standard" version of C.
- see CP:AMA, App. C, C89 versus K&R C
- 1990, ISO/IEC 9899:1990, ISO Standard C, C90
- There was no practical difference from C89.
- Select this version with the GCC compiler options -ansi, -std=c90 or -std=iso9899:1990
- 1994, Technical Corrigendum 1, ISO/IEC 9899/COR1:1994
- 1995, Amendment 1, ISO/IEC 9899/AMD1:1995, C95, or C89 with
Amendment 1
- Select this version with the GCC compiler option
-std=iso9899:199409
- 1996, Technical Corrigendum 2, ISO/IEC 9899/COR2:1996
- 1999, ISO/IEC 9899:1999, C99
- see CP:AMA, App. B, C99 versus C89
- Select this version with the GCC compiler options -std=c99
or -std=iso9899:1999
- 2001, Technical Corrigendum 1, ISO/IEC
9899:1999/Cor.1:2001
- 2004, Technical Corrigendum 2, ISO/IEC
9899:1999/Cor.2:2004
- 2007, Technical Corrigendum 3, ISO/IEC
9899:1999/Cor.3:2007
- Dec. 2010, first ballot, Committee Draft
- Dec. 2011, final ballot, Draft International Standard
- Dec. 2011, publication of C11, ISO/IEC 9899:2011
- the language formerly known as C1X
- the official publication is available from ISO
- Select this version with the GCC compiler options -std=c11 or -std=iso9899:2011
- Language standards supported by GCC, current status
- The C Standard, for C99, accumulated through TC 3, is N1256
(Committee Draft, September 7, 2007, ISO/IEC 9899:TC3, 552
pages).
- The C Standard, for C11, is N1570
(Committee Draft, April 12, 2011, ISO/IEC 9899:201x, 701 pages).
- This is the last draft before approval, and the last free
version.
For comparison, here is the C++ timeline.
- 1986, publication of The C++ Programming Language, by Bjarne
Stroustrup of Bell Labs.
- The language design began in 1979, and went through several
iterations before release in 1985.
- Second ed., 1991; third ed., 1997; special ed. 2000.
- 1989, ANSI Standard C++ committee first meets
- 1991, ISO Standard C++ committee first meets
- 1994, Draft ANSI/ISO C++ Standard
- 1998, ISO/IEC 14882:1998, Standard C++
- 2003, ISO/IEC 14882:2003, Standard C++
- 2008, Working Draft, Aug. 2008, 1205 pages
- 2009, Working Draft, June 2009, 1326 pages
- 2010, Working Draft, Aug. 2010, 1331 pages
- 2011, ISO/IEC 14882:2011, Sep. 2011, 1338 pages
- the language formerly known as C++0X
- the official publication is available from ISO
And again for comparison, the timeline for the IEEE Floating-Point
Arithmetic Standard.
- 1985, IEEE Std 754-1985, IEEE Standard for Binary
Floating-Point Arithmetic.
- 1987, IEEE Std 854-1987, IEEE Standard for Radix-Independent
Floating-Point Arithmetic.
- 1989, IEC 60559:1989, Binary Floating-Point Arithmetic for
Microprocessor Systems (previously designated IEC 559:1989).
- June 2008, IEEE Std 754-2008, IEEE Standard for Floating-Point
Arithmetic
Syntax and semantics
The syntax of a language
describes the form of its constructs. In English, that would
be phrases and sentences, built upon the letters and words of the
language. In a programming language, that would be
definitions, expressions, statements, etc., built upon the symbols
and keywords of the language. The syntax of English is rather
flexible, but the syntax of a programming language should be
precisely defined; some languages allow more flexibility than
others. The syntax of a language is usually described by a
grammar, but there may be additional constraints on the constructs
that are not expressed directly in the grammar.
The semantics of a
language describes the meaning of its constructs. The more
completely or precisely a meaning is described, the more useful it
can be. For a programming language, its semantics describe
what happens when a program is executed.
Some more details of the phrase
"what happens" include
- what should happen
- what should not happen
- the range of possible outcomes
- what happens when the data is valid
- what happens when the data is not valid
There is no reason to describe the
semantics of a syntactically incorrect program.
Sometimes a part of the language is context-sensitive, so its syntax or semantics can
only be determined in the context of some additional parts.
The structure or meaning of something might not be clear until you
see the whole of it. In the worst case, this would make it
difficult to compose a large program from smaller modules.
For example, a comma character can
appear in at least seven different contexts in C, depending on its
surrounding text:
- in the character constant ','
- as a character in a string literal such as ","
- as the comma operator in an expression
- in a declarator list, such as int a, b;
- as a separator in an initializer list for a struct or array
- as a separator in a function parameter list
- as a separator in a function argument list
- (and there are still more ...)
In English, ambiguous syntax and semantics can be entertaining, such
as "Time flies like an arrow." (Which is the verb, time, flies, or like?)
In a programming language, ambiguous syntax can be detected by a
compiler, but ambiguous or undefined semantics is dangerous, and
should be avoided.
C makes a distinction between the translation
environment, where the compiler runs, and the execution environment, where
the compiled program runs. These are often the same, but
that's not a requirement. The execution environment often
comes with an operating system, but that's not a requirement either,
as the program might be the operating system, or might be executed
from read-only memory.
Categories of behavior as specified
in the C Standard
The term behavior
describes the external appearance or action of a program or program
component. This much is observable when the program is
compiled or runs.
The term implementation
includes the compiler, header files, runtime library, and the
execution environment in general.
- required behavior
- The Standard specifies the syntax, semantics, and enough of
the execution environment to guarantee the observable results.
- This applies to all systems, in all environments.
- implementation-defined behavior
- The Standard specifies a range of possible implementations,
and the implementer is responsible for documenting the choices
made within that range.
- unspecified behavior
- The Standard specifies a range of possible implementations,
but the implementer is not required to tell you which choices
were made.
- For examples, unspecified behavior includes the use of an
unspecified value, or the order in which the arguments to a
function are evaluated.
- undefined behavior
- The Standard specifies only the syntax. It imposes no
requirements at all on the actual behavior.
- This applies to the use of a nonportable or erroneous
program construct or of erroneous data, where the Standard has
no other requirements. The implementation could ignore
the situation completely with unpredictable results, issue a
diagnostic message or not, or terminate with a diagnostic
message; the diagnostic message may or may not be
helpful. The implementation could also extend the
language by providing a definition of the officially undefined
behavior.
- Undefined behavior due to a nonportable construct might not
be wrong, but try to avoid it if you can.
- locale-specific behavior
- The actual behavior depends on local conventions of
nationality, culture, and language, and these must be
documented.
- Changing locales does not affect the correctness of the
program, though it does affect the form of the input and
output.
- observable behavior
- This is new in the C11 standard; it refers to the state of
memory, files and interactive I/O devices before, during and
after execution of a program.
- "What constitutes an interactive device is
implementation-defined."
- recommended practice
- This refers to a specification that is strongly recommended
as being in keeping with the intent of the Standard, but that
may be impractical for some implementations.
An implementation or program can follow the standard to varying
degrees. Strictly
conforming programs are intended to be maximally portable
among conforming implementations. Conforming programs may depend on nonportable
features of a conforming implementation.
- strictly conforming program
- The program stays within the bounds of the language,
considering all the most restrictive interpretations, and does
not depend on any undefined, unspecified, or
implementation-defined behavior.
- conforming program
- A conforming program is one that is acceptable to a
conforming implementation.
Annex J of the C Standard (Portability issues) has a complete list
of the behaviors which are described as unspecified, undefined,
implementation-defined, or locale-specific.
The implementation and environment can be hosted or freestanding;
the distinction is, roughly, with or without an operating
system. For example, an embedded system or the OS kernel
itself would be a freestanding environment, while a workstation
would be a hosted implementation.
- conforming hosted implementation
- supports the whole standard including all the library
facilities, and accepts any strictly conforming program
- conforming freestanding implementation
- accepts any strictly conforming program that is limited to
the language standard (except perhaps for the complex types)
and the library standard described in <float.h>, <iso646.h>,
<limits.h>, <stdalign.h> (in C11), <stdarg.h>, <stdbool.h>,
<stddef.h>, <stdint.h>, <stdnoreturn.h> (in C11)
- hosted environment
- all the library facilities are provided, and startup is
through main()
- freestanding environment
- the required minimum; the methods of program startup and
termination are implementation-defined
Some features of the language and library are described as obsolescent, to indicate that
they could be withdrawn in the future. These features should
be avoided in new programs.
- The gets() function is an example: it was never safe to use,
because it cannot be told the size of its buffer, and it was
officially withdrawn in C11.
Most compilers have compile-time switches or options that allow the
programmer to select from various levels of conformance, hosted or
freestanding, and so on.
Categories of behavior as specified
in the Posix Standard
more later ...
Categories of behavior as specified
in the C++ Standard
more later ...
An informal statement of C's design
philosophy
- The programmer is knowledgeable; trust the programmer.
- Provide a minimal language facility to support the right level
of abstraction, but not too much to hide the underlying
hardware.
- Allow the programmer to get close to the hardware when
needed.
- Allow construction of code that is easy to understand and
maintain.
- Provide good performance without extensive programming
effort.
- For system programming, you can write C-compatible code in
assembly language when you need to, but you won't need it very
often.
- For compiler writers, high-quality assembly code can be
generated from well-written C code.
- For programmers in general, understanding machine
organization continues to be important.
Some guiding philosophy used by the
C Standard Committee
For the C89/C90 process
- Existing code is important, existing implementations are not.
- It's far easier to change implementations (especially
compilers) than to change existing source code.
- But if an existing program now has different semantics, that
should not be accepted quietly.
- C code can be portable.
- C code can be non-portable.
- Portability is a nice goal, but it shouldn't be forced on
you. Some code is required to be machine-specific.
- Keep the "spirit of C".
- Trust the programmer.
- Don't prevent the programmer from doing what needs to be
done.
- Keep the language small and simple.
- Provide only one way to do an operation.
- Make it fast, even if it is not guaranteed to be portable.
- Avoid interfering with the ability of compilers to
generate compact efficient code.
For the C99 process
- Support international programming.
- Codify existing practice to address evident deficiencies.
- Follow proven ideas that fix widely-perceived
problems. This is not the place for research projects.
- Minimize incompatibilities with C90, and with C++.
- "The Committee is content to let C++ be the big and
ambitious language."
- Maintain conceptual simplicity.
For the C1X process
- The prior principle, Trust the programmer, should be regarded
as outdated with respect to the security and safety programming
communities. Programmers need the ability to check their
work. So, add the principle "Make support for safety and
security demonstrable."
- No Invention. Only features that represent existing
practice, and have gained commercial acceptance in their
implementation should be standardized. Further, their
specification should be compatible with commercial
implementations.
- Migration of the existing code base must be taken into
consideration. We don't want to break existing code.
What were some of the changes from
K&R C, traditional C, to C89?
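- The most important change to K&R C, according to Ritchie,
was the addition of function prototypes, which allow type
checking across the function-call boundary. Function
prototypes were borrowed from C++. They remain optional in C:
C99 requires a declaration before a function is called, but
old-style (non-prototype) declarations are still accepted, though
obsolescent.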
- Other important changes included improvements to the
preprocessor, and loosening the ties of the library to Unix.
What were some of the changes from
C89 to C99?
- wider integer and floating-point types, long long int, long double
- wide characters, wchar_t, <wchar.h>, <wctype.h>, universal
character names
- extended integers, <stdint.h>, <inttypes.h>
- complex numbers, <complex.h>
- type-generic math function interfaces, <tgmath.h>
- preprocessor arithmetic in intmax_t, uintmax_t
- boolean type, _Bool, <stdbool.h>
- additional floating-point support, in <float.h>, <math.h>,
<fenv.h>
- features familiar from C++, such as // comments, block scope
for selection and iteration statements, inline functions, required
function declarations (implicit declarations were removed), more
flexible placement of declarations, variable length arrays
- safer versions of existing features, such as restricted
pointers, reliable integer division, snprintf() and vscanf() in
<stdio.h>
- more precise rules for many features
- broader and more flexible rules for some features
What are some of the changes from
C99 to C11?
- threads and synchronization tools
- condition variable and mutex types
- atomic objects, and thread-local storage
- <stdatomic.h> and <threads.h>
- further security issues considered
- static assertions, bounds-checking interfaces, analyzability, etc.
- removed the gets function (<stdio.h>)
- many architectural clarifications
- an improved memory sequencing model, esp. with regard to
multicore and threads
- Unicode characters and strings; <uchar.h>
- additional floating-point characteristic macros (<float.h>)
- querying and specifying alignment of objects (<stdalign.h>, <stdlib.h>)
- type-generic expressions
- anonymous structures and unions
- no-return functions
- macros to create complex numbers (<complex.h>)
- support for opening files for exclusive access
- added the aligned_alloc, at_quick_exit, and quick_exit
functions (<stdlib.h>)
Keywords in C89 (32)
auto break case char const continue default do
double else enum extern float for goto if
int long register return short signed sizeof static
struct switch typedef union unsigned void volatile while
Keywords in C99 (37)
- add inline, restrict, _Bool, _Complex, _Imaginary
Keywords in C11 (44)
- add _Alignas, _Alignof, _Atomic, _Generic, _Noreturn,
_Static_assert, _Thread_local
Note that the C Preprocessor has an additional set of keyword-like
symbols, such as ifdef and defined,
but these are only recognized in preprocessor directives.
The C Standard headers
We highlighted the headers that describe the core library, that must
be available even in a freestanding implementation. The third
column shows the version of the C Standard that first required the
header, if not C89. The chapter, section and table numbers
refer to the course textbooks. See also APUE Fig. 2.1, which
needs to be updated for Solaris 10.
| Header | Standard description | Cxx | CP:AMA | C:ARM | APUE |
| <assert.h> | Diagnostics | | Sec. 24.1 | Sec. 19.1 | |
| <complex.h> | Complex arithmetic | C99 | Sec. 27.4 | Ch. 23 | |
| <ctype.h> | Character handling | | Sec. 23.5 | Ch. 12 | |
| <errno.h> | Errors | | Sec. 24.2 | Ch. 11.2 | Sec. 1.7 |
| <fenv.h> | Floating-point environment | C99 | Sec. 27.6 | Ch. 22 | |
| <float.h> | Characteristics of floating types | | Sec. 23.1 | Table 5-3 | Sec. 2.5.1 |
| <inttypes.h> | Format conversion of integer types | C99 | Sec. 27.2 | Ch. 21 | |
| <iso646.h> | Alternative spellings | C95 | Sec. 25.3 | Sec. 11.5 | |
| <limits.h> | Sizes of integer types | | Sec. 23.2 | Table 5-2 | Sec. 2.5 |
| <locale.h> | Localization | | Sec. 25.1 | Ch. 10 | |
| <math.h> | Mathematics | | Sec. 23.3-4 | Ch. 17 | |
| <setjmp.h> | Nonlocal jumps | | Sec. 24.4 | Sec. 19.4 | Sec. 7.10 |
| <signal.h> | Signal handling | | Sec. 24.3 | Sec. 19.6 | Ch. 10 |
| <stdalign.h> | Alignment | C11 | | | |
| <stdarg.h> | Variable arguments | | Sec. 26.1 | Sec. 11.4 | |
| <stdatomic.h> | Atomics | C11 | | | |
| <stdbool.h> | Boolean type and values | C99 | Sec. 21.5 | Sec. 11.3 | |
| <stddef.h> | Common definitions | | Sec. 21.4 | Sec. 11.1 | |
| <stdint.h> | Integer types | C99 | Sec. 27.1 | Ch. 21 | |
| <stdio.h> | Input/output | | Sec. 22.1-8 | Ch. 15 | Ch. 5 |
| <stdlib.h> | General utilities | | Sec. 26.2 | Ch. 16 | |
| <string.h> | String handling | | Sec. 23.6 | Ch. 13 | |
| <tgmath.h> | Type-generic math | C99 | Sec. 27.5 | Sec. 17.12 | |
| <threads.h> | Threads | C11 | | | |
| <time.h> | Date and time | | Sec. 26.3 | Ch. 18 | Sec. 6.10 |
| <uchar.h> | Unicode utilities | C11 | | | |
| <wchar.h> | Extended multibyte/wide character utilities | C95 | Sec. 25.5 | Ch. 24 | |
| <wctype.h> | Wide character classification and mapping utilities | C95 | Sec. 25.6 | Ch. 24 | |
The following table is derived from the C Standard, Annex B, Library
summary.
| Header | Types defined |
| <assert.h> | |
| <complex.h> | complex, imaginary are macros which expand to the keywords _Complex, _Imaginary |
| <ctype.h> | |
| <errno.h> | |
| <fenv.h> | fenv_t, fexcept_t |
| <float.h> | |
| <inttypes.h> | imaxdiv_t |
| <iso646.h> | |
| <limits.h> | |
| <locale.h> | struct lconv |
| <math.h> | float_t, double_t |
| <setjmp.h> | jmp_buf |
| <signal.h> | sig_atomic_t |
| <stdalign.h> | |
| <stdarg.h> | va_list |
| <stdatomic.h> | too many to list |
| <stdbool.h> | bool is a macro that expands to _Bool |
| <stddef.h> | ptrdiff_t, size_t, wchar_t |
| <stdint.h> | intptr_t, uintptr_t, intmax_t, uintmax_t, intN_t (N = 8, 16, 32, 64), uintN_t, int_leastN_t, uint_leastN_t, int_fastN_t, uint_fastN_t |
| <stdio.h> | size_t, FILE, fpos_t |
| <stdlib.h> | size_t, wchar_t, div_t, ldiv_t, lldiv_t |
| <string.h> | size_t |
| <tgmath.h> | |
| <threads.h> | too many to list |
| <time.h> | size_t, clock_t, time_t, struct tm |
| <uchar.h> | mbstate_t, size_t, char16_t, char32_t |
| <wchar.h> | wchar_t, size_t, mbstate_t, wint_t, struct tm |
| <wctype.h> | wint_t, wctrans_t, wctype_t |
Extensions to the C Standard
An implementation can support additional types, language features
and library functions, as long as they do not alter the behavior of
any strictly conforming program. The implementation remains
conforming, but a program that relies on an extension is no longer
strictly conforming.
Here are some examples: nested functions, typeof,
insertion of assembly code, access to "unusual" types such as
Intel's MMX/SSE and Motorola's AltiVec. For more examples, see
the GCC extensions list in the References, or Annex J.5 of the C
Standard (Portability issues, Common extensions).
If you want to claim the highest level of portability, don't use
extensions to the language. If you are writing an operating
system or a compiler, or a modern graphics library, the usual
extensions are necessary, but be aware that they introduce
implementation dependencies.
Programming support tools
Some typical tools that are not included with or specified by the C
Standard
- support for program preparation
- compiler
- program consistency checker
- static analysis of various properties, such as pointer usage
- runtime and post-mortem support
- symbolic debugger
- dynamic analysis of various properties, such as memory usage
and execution time
Some of the programming examples and projects to be described later
would be part of a full inquiry and validation suite.
The idea of a C program checker to supplement a compiler goes back
to 1979 with the lint program, still available in
modern form on Solaris. C is a flexible language, but it can
be pushed into some highly questionable use. Lint,
and other tools like splint or cqual,
can check whole programs for inconsistent or suspicious usages.
The C Standard Libraries are adopted as part of the Posix Standard
libraries. Note that Posix has a specification for some of the
C99 compiler's command-line options, but the C Standard does not.
What is not described by the C
Standard
- the mechanism by which C programs are transformed for use by a
data-processing system
- the mechanism by which C programs are invoked for use by a
data-processing system
- the mechanism by which input data are transformed for use by a
C program
- the mechanism by which output data are transformed after being
produced by a C program
- the size or complexity of a program and its data that will
exceed the capacity of any specific data-processing system or
the capacity of a particular processor
- all minimal requirements of a data-processing system that is
capable of supporting a conforming implementation
- The terms used here are meant to be completely general.
Some devices should not be called computers, for example.
Coding example
How do you know which version of the C Standard is being
used, if any? From outside the program, the compiler command
can select which version of the C Standard to use; the default would
be stated in the compiler documentation. From inside the
program, there are predefined macros that can help determine which
version was used by the compiler, and this makes it possible for one
source file to contain code for several different versions of the
Standard.
Here is an example, where we want
to write a function that will work with both standard and
non-standard versions of the C compiler.
    #include <stdio.h>

    #ifdef __STDC__
    /* Standard C: void and __DATE__ are available. */
    void print_date_compiled(void)
    {
        printf("%s", __DATE__);
    }
    #else
    /* Not Standard C, void and __DATE__ not available. */
    int print_date_compiled()
    {
        printf("(unknown)");
        return 0;
    }
    #endif
For more details and more examples, see CP:AMA, Sec. 14.3, Macro
Definitions, esp. pp. 329-331, Predefined Macros, or C:ARM, Ch. 3,
The C Preprocessor (especially Sec. 3.3.4, 3.9), and Sec. 10.1,
10.2.
The following is an exhaustive example, intended to gather
information about the compiler and some of the available compiling
options.
Standards and Portability
Hardware eventually becomes old and obsolete. If your programs
are tied closely to the hardware, they will also become old and
obsolete. A "high-level" programming language provides a
useful abstraction of a processor and memory. An operating
system provides useful abstractions for additional devices, and the
management of resources. A standard for the language, and
another standard for the interfaces to the operating system, give
some assurance that the abstractions are useful on more than one
kind of computer system, and will survive over time.
If your programs are carefully designed and well-written, they can
be improved over time. If your programs are portable, they can
be moved to new hardware with no modification other than
recompiling. If the operating system and compiler are
transportable, they can be moved to new hardware with relatively
little modification, and then recompiled. Now your programs
have a chance to avoid becoming old and obsolete.
Design and Implementation
The combined history of the C and C++ languages shows that design
and implementation go together. This is typical of
experimental programming. Moreover, it suggested the rule that
no feature would be added to the standard language unless it was
already known to work and be useful.
References
- The C Standard, updated through Technical Corrigendum 3, http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
(Committee Draft, September 7, 2007, ISO/IEC 9899:TC3).
Some parts of this page are taken directly from this document,
without explicit attribution.
- Draft version of the C Standard, N1494
(Committee Draft, June 25, 2010, ISO/IEC 9899:201x).
- The last draft version (before approval) of the C Standard, N1570
(Committee Draft, April 12, 2011, ISO/IEC 9899:201x)
- Rationale for International Standard — Programming Languages —
C, Revision 5.10, April 2003, http://www.open-std.org/jtc1/sc22/wg14/www/docs/C99RationaleV5.10.pdf.
This describes the situation through Technical Corrigendum 1.
- John Benito, The C1X Charter, June 29, 2007, http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1250.pdf
- GCC Manual, Sec. 2, Language Standards Supported by GCC, Sec. 5,
Extensions to the C Language Family. For the latest version,
see http://gcc.gnu.org/onlinedocs/. See also Status of C99
features in GCC.
- C:ARM, Sec. 3.3.4 (Predefined Macros), Sec. 3.9 (C++
Compatibility)
- Solaris man pages, lint(1), standards(5)
- Brian W. Kernighan and Dennis M. Ritchie, The C Programming Language,
second edition, Prentice Hall, 1988. This book should be
part of every serious programmer's library, as an example of
clear explanation and elegant style. In some ways, it is
still the best introduction to C, but in other ways it is
outdated.
- Dennis M. Ritchie, The development of the C programming
language. In History of Programming languages---II, T. J. Bergin
and R. G. Gibson, Eds. ACM, New York, NY, 1993, pp. 671-698.
In case you ever wondered why C turned out the way it did. A
shorter version is available (HTML or PDF).
- Bjarne Stroustrup, A history of C++: 1979--1991. In History of
Programming languages---II, T. J. Bergin and R. G. Gibson, Eds.
ACM, New York, NY, 1993, pp. 699-769.
- Bjarne Stroustrup, The Design and Evolution of C++,
Addison-Wesley, 1994. A longer version of the previous reference.
- Bjarne Stroustrup, Evolving a language in and for the real
world: C++ 1991-2006. In Proceedings of the Third ACM SIGPLAN
Conference on History of Programming Languages (San Diego,
California, June 9-10, 2007). ACM, New York, NY, 2007,
pp. 4-1 - 4-59.
- Splint, from Univ. of Virginia, often shipped with Linux.
Splint does static analysis (compile-time checking; it reads the
program but doesn't run it).
- Deputy, from UC Berkeley. Deputy does static analysis and
dynamic analysis (run-time checking; it modifies the program to
add additional code).
Last revised, 22 Jan. 2013