C Compiler User's Guide

From NitrOS-9
Revision as of 01:30, 31 May 2010 by Glaw16 (Talk | contribs) (Characteristics of Compiled Programs)


The C Compiler System

Introduction

The C programming language is rapidly growing in popularity and seems destined to become one of the most popular programming languages used for microcomputers. The rapid rise in the use of C is not surprising. C is an incredibly versatile and efficient language that can handle tasks that previously would have required complex assembly language programming.

C was originally developed at the Bell Telephone Laboratories as an implementation language for the UNIX operating system by Brian Kernighan and Dennis Ritchie. They also wrote a book titled The C Programming Language, which is universally accepted as the standard for the language. It is an interesting reflection on the language that although no formal industry-wide "standard" was ever developed for C, programs written in C tend to be far more portable between radically different computer systems than those written in so-called "standardized" languages such as BASIC, COBOL, and PASCAL. The reason C is so portable is that the language is so inherently expandable that if some special function is required, the user can create a portable extension to the language, as opposed to the common practice of adding additional statements to the language. For example, the number of special-purpose BASIC dialects defies all reason. A lesser factor is the underlying UNIX operating system, which is also sufficiently versatile to discourage nonstandardization of the language. Indeed, standard C compilers and UNIX are intimately related.

Fortunately, the 6809 microprocessor, the OS-9 operating system, and the C language form an outstanding combination. The 6809 was specifically designed to efficiently run high-level languages, and its stack-oriented instruction set and versatile repertoire of addressing modes handle the C language very well. As mentioned previously, UNIX and C are closely related, and because OS-9 is derived from UNIX, it also supports C to the degree that almost any application written in C can be transported from a UNIX system to an OS-9 system, recompiled, and correctly executed.

The Language Implementation

OS-9 C is implemented almost exactly as described in The C Programming Language by Kernighan and Ritchie (hereafter referred to as K&R). A copy of this book, which serves as the language reference manual, is included with each software package.

Although this version of C follows the specification faithfully, there are some differences. The differences mostly reflect parts of C that are obsolete or the constraints imposed by memory size limitations.

Differences From the K&R Specification

  • Bit fields are not supported.
  • Constant expressions for initializers may include arithmetic operators only if all the operands are of type int or char.
  • The older forms of assignment operators, =+ or =*, which are recognized by some C compilers, are not supported. You must use the newer forms, +=, *=, etc.
  • "#ifdef (#ifndef) ... [#else...] #endif" is supported but "#if <constant expression>" is not.
  • It is not possible to extend macro definitions or strings over more than one line of source code.
  • The escape sequence for newline '\n' refers to the ASCII carriage return character (used by OS-9 for end-of-line), not linefeed (hex 0A). Programs which use '\n' for end-of-line (which includes all programs in K&R) will still work properly.

Enhancements and Extensions

The "Direct" Storage Class

The 6809 microprocessor instructions for accessing memory via an index register or the stack pointer are relatively short and fast when they are used in C programs to access "auto" (function local) variables or function arguments. The instructions for accessing global variables are normally less compact: they must be four bytes long and are correspondingly slow. However, the 6809 has a feature which helps considerably. Memory anywhere in a single page (256-byte block) may be accessed with fast, two-byte instructions. This page is called the "direct page", and at any time its location is specified by the contents of the "direct page register" within the processor. The linkage editor sorts out where this should be, and it need not concern the programmer, who only needs to specify for the compiler which variables should be in the direct page to give the maximum benefit in code size and execution speed.

To this end, a new storage class specifier is recognized by the compiler. In the manner of K&R page 192, the sc-specifier list is extended as follows:

Sc-specifier: auto
              static
              extern
              register
              typedef
              direct        (extension)
              extern direct (extension)
              static direct (extension)

The new keyword may be used in place of one of the other sc-specifiers, and its effect is that the variable will be placed in the direct page. direct creates a global direct page variable, extern direct references an external direct page variable, and static direct creates a local direct page variable. These new classes may not be used to declare function arguments. "Direct" variables can be initialized; if not explicitly initialized they will, like other variables, have the value zero at the start of program execution. 255 bytes are available in the direct page (the linker requires one byte). If all the direct variables occupy less than the full 255 bytes, the remaining global variables will occupy the balance and memory above if necessary. If too many bytes of storage are requested in the direct page, the linkage editor will report an error, and the programmer will have to reduce the use of direct variables to fit the 256 bytes addressable by the 6809.

It should be kept in mind that direct is unique to this compiler, and it may not be possible to transport programs written using direct to other environments without modification.
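One way to limit the portability cost is to hide the keyword behind the pre-processor. The sketch below is only an illustration: the macro OS9_C is hypothetical (it could be supplied with the -D option when building with this compiler), the variable and function names are invented, and ANSI-style prototypes are used so that the fragment can be tried on a current compiler.

```c
/* OS9_C is a hypothetical macro, defined only when building with the
   OS-9 compiler (for example via the -D option). */
#ifndef OS9_C
#define direct          /* elsewhere, "direct" expands to nothing */
#endif

direct int tick_count;          /* global direct page variable */
static direct char busy_flag;   /* direct page variable local to this file */

/* Count one tick unless the busy flag is set; returns the new total. */
int tick(void)
{
    if (!busy_flag)
        tick_count++;
    return tick_count;
}

/* Set or clear the busy flag. */
void set_busy(int on)
{
    busy_flag = (char)(on != 0);
}
```

Only #ifndef is used, since this compiler does not support #if with a constant expression.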

Embedded Assembly Language

As versatile as C is, occasionally there are some things that can only be done (or done at maximum speed) in assembly language. The OS-9 C compiler permits user-supplied assembly-language statements to be directly embedded in C source programs.

A line beginning with #asm switches the compiler into a mode which passes all subsequent lines directly to the assembly-language output, until a line beginning with #endasm is encountered. #endasm switches the mode back to normal. Care should be exercised when using this directive so that the correct code section is adhered to. Normal code from the compiler is in the PSECT (code) section. If your assembly code uses the VSECT (variable) section, be sure to put an ENDSECT directive at the end to leave the state correct for following compiler generated code.

Control Character Escape Sequences

The escape sequences for non-printing characters in character constants and strings (see K&R page 181) are extended as follows:

    linefeed (LF): \l (lowercase 'ell')

This is to distinguish LF (hex 0A) from \n which on OS-9 is the same as \r (hex 0D).

    bit patterns: \NNN (octal constant)
                  \dNNN (decimal constant)
                  \xNN (hexadecimal constant)

For example, the following all have the value 255 (decimal):

    \377    \xff    \d255
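As a small check of the equivalence, the fragment below returns the numeric value of the octal and hexadecimal forms. The \dNNN and \l forms are specific to this compiler, so they are deliberately left out and the two end-of-line codes are written in octal instead; the function names are invented for the illustration.

```c
/* Each function returns the numeric value of one character constant. */
int octal_escape(void) { return (unsigned char)'\377'; }
int hex_escape(void)   { return (unsigned char)'\xff'; }

/* Portable spellings of the two end-of-line characters discussed above. */
int linefeed_code(void)        { return '\012'; }   /* hex 0A */
int carriage_return_code(void) { return '\015'; }   /* hex 0D */
```

The cast to unsigned char matters because char is sign-extended when converted to int.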

Implementation-Dependent Characteristics

K&R frequently refer to characteristics of the C language whose exact operations depend on the architecture and instruction set of the computer actually used. This section contains specific information regarding this version of C for the 6809 processor.

Data Representation and Storage

Each variable type requires a specific amount of memory for storage. The sizes of the basic types in bytes are as follows:

Data Type   Size   Internal Representation
char        1      two's complement binary
int         2      two's complement binary
unsigned    2      unsigned binary
long        4      two's complement binary
float       4      binary floating point (see below)
double      8      binary floating point (see below)

This compiler follows the PDP-11 implementation and format in that char is converted to int by sign extension, short or short int means int, long int means long, and long float means double. The format for double values is as follows:

(low byte)                              (high byte)
+-+------------------------------------+----------+
| |     seven byte                     |  1 byte  |
| |      mantissa                      | exponent |
+-+------------------------------------+----------+
 ^ sign bit

The form of the mantissa is sign and magnitude with an implied "1" bit at the sign bit position. The exponent is biased by 128. The format of a float is identical, except that the mantissa is only three bytes long. Conversion from double to float is carried out by truncating the least significant (right-most) four bytes of the mantissa. The reverse conversion is done by padding the least significant four mantissa bytes with zeros.
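The layout can be made concrete with a small decoder. This is a sketch under stated assumptions, not the compiler's own routine: it assumes the mantissa is a binary fraction between 0.5 and 1 (so that 1.0 is stored with exponent byte hex 81), that an exponent byte of zero represents the value zero, and that the eight bytes are given in memory order. The function name is invented.

```c
/* Decode the 8-byte double format described above into a host double.
   b[0]..b[6] hold the mantissa (sign in the top bit of b[0]),
   b[7] holds the exponent, biased by 128. */
double os9_double_decode(const unsigned char b[8])
{
    double frac = 0.0;
    double scale = 1.0 / 256.0;
    int negative = (b[0] & 0x80) != 0;
    int exponent = (int)b[7] - 128;
    int i;

    if (b[7] == 0)                      /* assumption: exponent 0 means 0.0 */
        return 0.0;

    for (i = 0; i < 7; i++) {
        unsigned char m = b[i];
        if (i == 0)
            m |= 0x80;                  /* restore the implied "1" bit */
        frac += m * scale;
        scale /= 256.0;
    }
    while (exponent > 0) { frac *= 2.0; exponent--; }
    while (exponent < 0) { frac *= 0.5; exponent++; }
    return negative ? -frac : frac;
}
```

A float could be handled the same way by treating the four missing low-order mantissa bytes as zeros, which mirrors the padding rule for float to double conversion.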

Register Variables

One register variable may be declared in each function. The only types permitted for register variables are int, unsigned, and pointer. Invalid register variable declarations are ignored; i.e., the storage class is made auto. For further details see K&R page 81.

A considerable saving in code size and speed can be made by judicious use of a register variable. The most efficient use is made of it for a pointer or a counter for a loop. However, if a register variable is used in a complex arithmetic expression, there are no savings. The "U" register is assigned to register variables.

IMPORTANT NOTE: Upper- and lowercase letters cannot be mixed as in Basic09. For example, Prog.c and prog.c are distinct names. Since the Color Computer is usually used in uppercase only, it is necessary to enter the following commands to use upper- and lowercase: TMODE -UPC and CLEAR<0>.

Access to Command Line Parameters

The standard C arguments argc and argv are available to main as described in K&R page 110. The startup routine for C programs ensures that the parameter string passed to it by the parent process is converted into null-terminated strings as expected by the program. In addition, it will run together as a single argument any strings enclosed between single or double quotes ("'" or '"'). If either is part of the string required, then the other should be used as a delimiter.
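The quoting rule can be sketched as follows. This is not the actual startup code, only an illustration of the behaviour described above; the function name and the use of spaces as the only separator are assumptions, and ANSI-style prototypes are used so that it can be tried on a current compiler.

```c
#include <string.h>

/* Split `line` in place into null-terminated arguments.
   Returns the number of arguments stored in argv (at most max_args). */
int split_args(char *line, char **argv, int max_args)
{
    int argc = 0;
    char *p = line;

    while (argc < max_args) {
        char quote = '\0';

        while (*p == ' ')                   /* skip separating spaces */
            p++;
        if (*p == '\0')
            break;

        if (strchr("'\"", *p) != NULL)      /* remember the delimiter in use */
            quote = *p++;

        argv[argc++] = p;
        while (*p != '\0' && (quote ? *p != quote : *p != ' '))
            p++;
        if (*p != '\0')
            *p++ = '\0';                    /* terminate this argument */
    }
    return argc;
}
```

Because only the opening delimiter ends a quoted argument, a single quote may appear inside double quotes and vice versa.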

System Calls and the Standard Library

Operating System Calls

The system interface supports almost all the system calls of both OS-9 and UNIX. In order to facilitate the portability of programs from UNIX, some of the calls use UNIX names rather than OS-9 names for the same function. There are a few UNIX calls that do not have exactly equivalent OS-9 calls. In these cases, the library function simulates the function of the corresponding UNIX call. In cases where there are OS-9 calls that do not have UNIX equivalents, the OS-9 names are used. Details of the calls and a name cross-reference are provided in the C System Calls section of this manual.

The Standard Library

The C compiler includes a very complete library of standard functions. It is essential for any program which uses functions from the standard library to have the statement:

    #include <stdio.h>

See the C Standard Library section of this manual for details on the standard library functions provided.

IMPORTANT NOTE: If output via printf(), fprintf(), or sprintf() of long integers is required, the program must call pflinit() at some point; this is necessary so that programs not involving longs do not have the extra longs output code appended. Similarly, if floats or doubles are to be printed, pffinit() must be called. These functions do nothing; existence of calls to them in a program informs the linker that the relevant routines are also needed.
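A minimal sketch of the pflinit() convention follows. The macro OS9_C is hypothetical and stands for "building with this compiler"; elsewhere an empty stand-in is supplied, which is reasonable because the real function does nothing. The helper name format_long is invented, and the %ld conversion is assumed to be the one used for long values.

```c
#include <stdio.h>

/* OS9_C is a hypothetical macro defined only for the OS-9 compiler.
   Elsewhere an empty stand-in is supplied so the sketch still builds. */
#ifndef OS9_C
static void pflinit(void) { }
#endif

/* Format a long into buf and return buf. */
char *format_long(char *buf, long value)
{
    pflinit();                  /* makes the linker include long support */
    sprintf(buf, "%ld", value);
    return buf;
}
```

The call may appear anywhere in the program; placing it next to the first long output makes the dependency easy to see.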

Run-Time Arithmetic Error Handling

K&R leave the treatment of various arithmetic errors open, merely saying that it is machine dependent. This implementation deals with a limited number of error conditions in a special way; it should be assumed that the results of other possible errors are undefined.

Three new system error numbers are defined in <errno.h>:

    #define EFPOVR  40 /* floating point overflow or underflow */
    #define EDIVERR 41 /* division by zero */
    #define EINTERR 42 /* overflow on conversion of floating point to long integer */

If one of these conditions occurs, the program will send a signal to itself with the value of one of these errors. If not caught or ignored, this will cause termination of the program with an error return to the parent process. However, the program can catch the interrupt using signal() or intercept() (see C System Calls), and in this case the service routine has the error number as its argument.

Achieving Maximum Program Performance

Programming Considerations

Because the 6809 is an 8/16 bit microprocessor, the compiler can generate efficient code for 8 and 16 bit objects (char, int, etc.). However, code for 32 and 64 bit values (long, float, double) can be at least four times longer and slower. Therefore don't use long, float, or double where int or unsigned will do.

The compiler can perform extensive evaluation of constant expressions provided they involve only constants of type char, int, and unsigned. Where there are constants of type long, float, or double, there is no constant expression evaluation at compile time (except for single constants and casts of them); complex constant expressions involving these types are therefore evaluated at run time by the compiled program. You should manually compute the value of constant expressions of these types if speed is essential.
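For instance, a long constant expression can be replaced by its hand-computed value. The macro and function names below are invented for the illustration.

```c
/* Evaluated at run time by this compiler, because the operands are long. */
#define SECONDS_PER_DAY_SLOW (60L * 60L * 24L)

/* Computed by hand: a single constant needs no run-time arithmetic. */
#define SECONDS_PER_DAY      86400L

/* Convert a number of days to seconds using the precomputed constant. */
long days_to_seconds(long days)
{
    return days * SECONDS_PER_DAY;
}
```

Keeping the original expression in a comment or a second macro, as here, documents where the number came from.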

The Optimizer Pass

The optimizer pass automatically occurs after the compilation passes. It reads the assembler source code text and removes redundant code and searches for code sequences that can be replaced by shorter and faster equivalents. The optimizer will shorten object code by about 11% with a significant increase in program execution speed. The optimizer is recommended for production versions of debugged programs. Because this pass takes additional time, the -O compiler option can be used to inhibit it during error-checking-only compilation.

The Profiler

The profiler is an optional method used to determine the frequency of execution of each function in a C program. It allows you to identify the most frequently used functions where algorithmic or C source code programming improvements will yield the greatest gains.

When the -P compiler option is selected, code is generated at the beginning of each function to call the profiler module (called _prof), which counts invocations of each function during program execution. When the program has terminated, the profiler automatically prints a list of all functions and the number of times each was called. The profiler slightly reduces program execution speed. See prof.c source for more information.
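The idea behind per-function call counting can be sketched in a few lines. This is not the _prof module itself (see prof.c for that); the table size, the names prof_enter and prof_calls, and the two sample functions are all invented for the illustration.

```c
#include <string.h>

#define MAX_FUNCS 16            /* arbitrary limit for this sketch */

struct prof_entry {
    const char *name;
    unsigned    calls;
};

static struct prof_entry prof_table[MAX_FUNCS];
static int prof_used;

/* Record one call of the named function. */
void prof_enter(const char *name)
{
    int i;

    for (i = 0; i < prof_used; i++) {
        if (strcmp(prof_table[i].name, name) == 0) {
            prof_table[i].calls++;
            return;
        }
    }
    if (prof_used < MAX_FUNCS) {
        prof_table[prof_used].name = name;
        prof_table[prof_used].calls = 1;
        prof_used++;
    }
}

/* Return how many calls were recorded for the named function. */
unsigned prof_calls(const char *name)
{
    int i;

    for (i = 0; i < prof_used; i++)
        if (strcmp(prof_table[i].name, name) == 0)
            return prof_table[i].calls;
    return 0;
}

/* Two functions instrumented by hand, in the way -P does automatically. */
int square(int x) { prof_enter("square"); return x * x; }

int sum_of_squares(int a, int b)
{
    prof_enter("sum_of_squares");
    return square(a) + square(b);
}
```

The extra call at the top of every function is the source of the slight slowdown mentioned above.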

C Compiler Component Files and File Usage

Compilation of a C program by cc requires that the following files be present in the current execution directory (CMDS).

OS-9 Level I Systems:

cc1       compiler executive program
c.prep    macro pre-processor
c.pass1   compiler pass 1
c.pass2   compiler pass 2
c.opt     assembly code optimizer
c.asm     relocating assembler
c.link    linkage editor

OS-9 Level II Systems:

cc2       compiler executive program
c.prep    macro pre-processor
c.comp    compiler proper
c.asm     relocating assembler
c.link    linkage editor

In addition a file called clib.l contains the standard library, math functions, and system library. The file cstart.r is the setup code for compiled programs. Both of these files must be located in a directory named LIB on drive /D1. The DEFS directory must also be on /D1.

If, when specifying #include files for the pre-processor to read in, the programmer uses angle brackets, < and >, instead of double quotes, the file will be sought starting at the DEFS directory.

Temporary Files

A number of temporary files are created in the current data directory during compilation, and it is important to ensure that enough space is available on the disk drive. As a rough guide, at least three times the number of blocks in the largest source file (and its include files) should be free.

The identifiers etext, edata, and end are predefined in the linkage editor and may be used to establish the address of the end of executable text, initialized data, and uninitialized data respectively.

Running the Compiler

There are two commands which invoke distinct versions of the compiler. cc1 is for OS-9 Level I, which uses a two-pass compiler, and cc2 is for Level II, which uses a single-pass version. Both versions of the compiler work identically; the main difference is that cc1 has been divided into two passes to fit the smaller memory size of OS-9 Level I systems. In the following text, cc refers to either cc1 or cc2 as appropriate for your system. The syntax of the command line which calls the compiler is:

   cc [ option-flags ] file {file}

One file at a time can be compiled, or a number of files may be compiled together. The compiler manages the compilation through up to four stages: pre-processor, compilation to assembler code, assembly to relocatable module, and linking to binary executable code (in OS-9 memory module format).

The compiler accepts three types of source files, provided each name on the command line has the relevant postfix as shown below. Any of these file types may be mixed on the command line.

File Name Suffix Conventions

Suffix   Usage
.c       C source file
.a       assembly language source file
.r       relocatable module
none     executable binary (OS-9 memory module)

There are two modes of operation: multiple source file and single source file. The compiler selects the mode by inspecting the command line. The usual mode is single source and is specified by having only one source file named on the command line. Of course, more than one source file may be compiled together by using the #include facility in the source code. In this mode, the compiler will use the name obtained by removing the postfix from the name supplied on the command line, and the output file (and the memory module produced) will have this name. For example:

    cc prg.c

will leave an executable file called prg in the current execution directory.

The multiple source mode is specified by having more than one source file name on the command line. In this mode, the object code output file will have the name output in the current execution directory, unless a name is given using the -f= option (see below). Also, in multiple source mode, the relocatable modules generated as intermediate files will be left in the same directories as their corresponding source files with the postfixes changed to .r. For example:

    cc prg1.c /d0/fred/prg2.c

will leave an executable file called output in the current execution directory, one called prg1.r in the current data directory, and prg2.r in /d0/fred.

    CC -E=3 FNAME.C -F=PROG

compiles the file called FNAME.C into an executable object file named PROG and sets the module revision level to 3.

    CC PROG.C -DIDENTIFIER=VALUE

compiles the program with a definition identifier being passed to the compiler. The definition being passed is used within the source to control compilation via #ifdef/#ifndef directives.

Compiler Option Flags

The compiler recognizes several command-line option flags which modify the compilation process where needed. All flags are recognized before compilation commences, so the flags may be placed anywhere on the command line. Flags may be run together as in -ro, except where a flag is followed by something else; see -f= and -d for examples:

-A Suppresses assembly, leaving the output as assembler code in a file whose name is postfixed .a.
-E=number Sets the edition number constant byte to the number given. This is an OS-9 convention for memory modules.
-O Inhibits the assembly code optimizer pass. The optimizer will shorten object code by about 11% with a comparable increase in speed and is recommended for production versions of debugged programs.
-P Invokes the profiler to generate function invocation frequency statistics after program execution.
-R Suppresses linking library modules into an executable program. Outputs are left in files with postfixes .r.
-M=memory size Instructs the linker to allocate memory size for data, stack, and parameter area. Memory size may be expressed in pages (an integer) or in kilobytes by appending k to an integer. For more details of the use of this option, see the Memory Management section of this manual.
-L=filename Specifies a library to be searched by the linker before the Standard Library and system interface.
-F=path Overrides the above output file naming. The output file will be left with path as its name. This flag does not make sense in multiple source mode when either the -a or -r flag is also present. The module will be called the last name in path.
-C Outputs the source code as comments with the assembler code.
-S Stops the generation of stack-checking code. -S should only be used with great care when the application is extremely time-critical and when the use of the stack by compiler generated code is fully understood.
-D identifier Equivalent to #define identifier written in the source file. -D is useful where different versions of a program are maintained in one source file and differentiated by means of the #ifdef or #ifndef pre-processor directives. If the identifier is used as a macro for expansion by the pre-processor, 1 will be the expanded value unless the form -d identifier=string is used, in which case the expansion will be string.

Command Line and Option Flag Examples

command line               action                               output file(s)
cc prg.c                   compile to an executable program     prg
cc prg.c -a                compile to assembly language         prg.a
                           source code
cc prg.c -r                compile to relocatable module        prg.r
cc prg1.c prg2.c prg3.c    compile to executable program        prg1.r, prg2.r, prg3.r, output
cc prg1.c prg2.a prg3.r    compile prg1.c, assemble prg2.a      prg1.r, prg2.r
                           and combine all into an
                           executable program
cc prg1.c prg2.c -a        compile to assembly language         prg1.a, prg2.a
                           source code
cc prg1.c prg2.c -f=prg    compile to executable program        prg

Characteristics of Compiled Programs

The Object Code Module

The compiler produces position-independent, reentrant 6809 code in a standard OS-9 memory module format. The format of an executable program module is shown below. Detailed descriptions of each section of the module are given on the following pages.

     Module                                               Section
     Offset                                             Size (bytes)
                  +-----------------------------+
     $00          |                             |
                  |       Module Header         |            9
                  |                             |
                  +-----------------------------+
     $09          |      Execution Offset       |---+        2
                  +-----------------------------+   |
     $0B          |   Permanent Storage Size    |   |        2
                  +-----------------------------+   |
     $0D          |         Module Name         |   |
                  |.............................|   |
                  |                             |<--+
                  |       Executable Code       |
                  |.............................|
                  |       String Literals       |
                  |                             |
                  +-----------------------------+
                  |   Initializing Data Size    |            2
                  +-----------------------------+
                  |                             |
                  |      Initializing Data      |
                  |                             |
                  +-----------------------------+
                  | Data-text Reference Count   |            2
                  +-----------------------------+
                  |                             |
                  | Data-text Reference Offsets |
                  |                             |
                  +-----------------------------+
                  | Data-data Reference Count   |            2
                  +-----------------------------+
                  |                             |
                  | Data-data Reference Offsets |
                  |                             |
                  +-----------------------------+
                  |       CRC Check Value       |            3
                  +-----------------------------+

Module Header

This is a standard module header with the type/language byte set to $11 (Program + 6809 Object Code), and the attribute/revision byte set to $81 (Reentrant + 1).

Execution Offset

Used by OS-9 to locate where to start execution of the program.

Storage Size

Storage size is the initial default allocation of memory for data, stack, and parameter area. For a full description of memory allocation, see the section entitled Memory Management located elsewhere in this manual.
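The two fixed fields can be read from a module image as sketched below. The offsets are those shown in the module diagram, and 16-bit values are taken high byte first, as the 6809 stores them. The macro and function names are invented for the illustration.

```c
/* Offsets taken from the module diagram above. */
#define MOD_EXEC_OFFSET   0x09
#define MOD_STORAGE_SIZE  0x0B

/* Read a 16-bit value stored high byte first, as the 6809 does. */
unsigned read_word(const unsigned char *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Offset from the start of the module at which execution begins. */
unsigned module_exec_offset(const unsigned char *module)
{
    return read_word(module + MOD_EXEC_OFFSET);
}

/* Default allocation for data, stack, and parameter area, in bytes. */
unsigned module_storage_size(const unsigned char *module)
{
    return read_word(module + MOD_STORAGE_SIZE);
}
```

Reading byte by byte keeps the code independent of the byte order of the machine it runs on.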

Module Name

Module name is used by OS-9 to enter the module in the module directory. The module name is followed by the edition byte encoded in cstart. If this is not desired, it may be overridden by the -E= option in cc.

Information

Any strings preceded by the directive info in an assembly code file will be placed here. A major use of this facility is to place in the module the version number and/or a copyright notice. Note that the #asm pre-compiler instruction may be used in a C source file to enable the inclusion of this directive in the compiler-generated assembly code file.

Executable Code

The machine code instructions of the program.

String Literals

Quoted strings in the C source are placed here. They are in the null-terminated form expected by the functions in the Standard Library. NOTE: the definition of the C language assumes that strings are in the DATA area and are therefore subject to alteration without making the program non-reentrant. However, in order to avoid the duplication of memory requirements which would be necessary if they were to be in the data area, they are placed in the TEXT (executable) section of the module. Putting the strings in the executable section implies that no attempt should be made by a C programmer to alter string literals. They should be copied out first. The exception that proves the rule is the initialization of an array of type char like this:

    char message[] = "Hello world\n";

The string will be found in the array message in the data area and can be altered.
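When a literal does need to be changed, it can be copied into writable storage first. The helper below is an illustrative sketch; its name and interface are invented, and ANSI-style prototypes are used so that it can be tried on a current compiler.

```c
#include <string.h>

/* Copy a literal into caller-supplied writable storage, then alter the copy.
   Returns dst, or a null pointer if dst is too small. */
char *shout(char *dst, size_t dst_size, const char *literal)
{
    size_t i, len = strlen(literal);

    if (len + 1 > dst_size)
        return NULL;
    memcpy(dst, literal, len + 1);      /* copy out first ... */
    for (i = 0; i < len; i++)           /* ... then modify only the copy */
        if (dst[i] >= 'a' && dst[i] <= 'z')
            dst[i] = (char)(dst[i] - 'a' + 'A');
    return dst;
}
```

The literal itself is only read, so it can stay in the TEXT section and the module remains reentrant.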

Initialization Data and its Size

If a C program contains initializers, the data for the initial values of the variables is placed in this section. The definition of C states that all uninitialized global and static variables have the value zero when the program starts running, so the startup routine of each C program first copies the data from the module into the data area and then clears the rest of the data memory to nulls.

Data References

No absolute addresses are known at compile time under OS-9, so where there are pointer values in the initializing data, they must be adjusted at run time so that they reflect the absolute values at that time. The startup routine uses the two data reference tables to locate the values that need alteration and adjusts them by the absolute values of the bases of the executable code and data respectively.

For example, suppose there are the following statements in the program being compiled:

    char *p = "I'm a string!";
    char **q = &p;

These declarations tell the compiler that there is to be a char pointer variable, p, whose initial value is the address of the string and a pointer to a char pointer, q, whose initial value is the address of p. The variables must be in the DATA section of memory at run time because they are potentially alterable, but absolute addresses are not known until run time, so the values that p and q must have are not known at compile time. The string will be placed by the compiler in the TEXT section and will not be copied out to DATA memory by the startup routine. The initializing data section of the program module will contain entries for p and q. They will have as values the offsets of the string from the base of the TEXT section and the offset of the location of p from the base of the DATA section respectively.

The startup routine will first copy all the entries in the initializing data section into their allotted places in the DATA section. Then it will scan the data-text reference table for the offsets of values that need to have the addresses of the base of the TEXT section added to them. Among these will be p which, after updating, will point to the string which is in the TEXT section. Similarly, after a scan of the data-data references, q will point to (contain the absolute address of) p.
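The adjustment step can be simulated on a small byte array. This is a sketch of the idea only, not the startup code: the function names are invented, the reference tables are represented as plain arrays of offsets into the data area, and 16-bit values are stored high byte first as on the 6809.

```c
/* Fetch and store 16-bit values high byte first, as the 6809 does. */
static unsigned get_word(const unsigned char *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

static void put_word(unsigned char *p, unsigned value)
{
    p[0] = (unsigned char)((value >> 8) & 0xFF);
    p[1] = (unsigned char)(value & 0xFF);
}

/* Add `base` to each 16-bit value in `data` located by a reference table.
   `refs` holds `count` offsets into the data area. */
void adjust_references(unsigned char *data, const unsigned *refs,
                       unsigned count, unsigned base)
{
    unsigned i;

    for (i = 0; i < count; i++) {
        unsigned char *slot = data + refs[i];
        put_word(slot, (get_word(slot) + base) & 0xFFFF);
    }
}
```

The routine would be called once with the data-text table and the base of the TEXT section, and once with the data-data table and the base of the DATA section. With p at data offset 0 and q at offset 2, the first call turns p into the absolute address of the string and the second turns q into the absolute address of p.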

Memory Management

The C compiler and its support programs have default conditions such that the average programmer need not be concerned with details of memory management. However, there are situations where advanced programmers may wish to tailor the storage allocation of a program for special situations. The following information explains in detail how a C program's data area is allocated and used.

Typical C Program Memory Map

A storage area is allocated by OS-9 when the C program is executed. The layout of this memory is as follows:

                  high addresses
               |                  | <- SBRK() adds more
               |                  |    memory here
               |                  |
               +------------------+ <- memend
               |    parameters    |
               +------------------+
               |                  |
Current stack  |       stack      | <- sp register
reservation -> +..................+
               |         v        |
               |                  | <- standard I/O buffers
               |    free memory   |    allocated here
Current top    |                  |
of data     -> |..................| <- IBRK() changes this
               |                  |    memory bound upward
               | requested memory |
               +------------------+ <-- end
               |   uninitialized  |
               |       data       |
               +------------------+ <- edata
               |    initialized   |
               |       data       |
               +------------------+
          ^    |    direct page   |
        dpsiz  |     variables    |
          v    +------------------+ <- y,dp registers
                   low addresses

The overall size of this memory area is defined by the storage size value stored in the program's module header. This can be overridden to assign the program additional memory if the OS-9 Shell # command is used.

The parameter area is where the parameter string from the calling process (typically the OS-9 Shell) is placed by the system. The initializing routine for C programs converts the parameter into null-terminated strings and makes pointers to them available to main() via argc and argv.

The stack area is the currently reserved memory for exclusive use of the stack. As each C function is entered, a routine in the system interface is called to reserve enough stack space for the use of the function with an addition of 64 bytes. The 64 bytes are for the use of user-written assembly code functions and/or the system interface and/or arithmetic routines. A record is kept of the lowest address so far granted for the stack. If the area requested would not bring this lower, the C function is allowed to proceed. If the new lower limit would mean that the stack area would overlap the data area, the program stops with the message:

     **** STACK OVERFLOW ****

on the standard error output. Otherwise, the new lower limit is set, and the C function resumes as before.
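The decision made on each function entry can be sketched as follows. This is an illustration of the logic described above, not the system interface routine itself; the structure, the names, and the treatment of the exact boundary case are assumptions.

```c
#define STACK_MARGIN 64     /* extra bytes reserved on every request */

struct stack_state {
    unsigned lowest_granted;    /* lowest address granted to the stack so far */
    unsigned top_of_data;       /* current top of the data area */
};

/* Called on function entry with the current stack pointer and the number of
   bytes the function needs. Returns 0 if the function may proceed, or -1 on
   overflow (where the real routine would print the message and stop). */
int stack_reserve(struct stack_state *st, unsigned sp, unsigned frame_bytes)
{
    unsigned needed = frame_bytes + STACK_MARGIN;
    unsigned new_low;

    if (needed > sp)
        return -1;                      /* would wrap below address zero */
    new_low = sp - needed;

    if (new_low >= st->lowest_granted)
        return 0;                       /* already inside the reservation */
    if (new_low < st->top_of_data)
        return -1;                      /* stack would overlap the data */
    st->lowest_granted = new_low;       /* extend the reservation downward */
    return 0;
}
```

Most calls take the first return, because the reservation only grows when a deeper call chain than any seen before is entered.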

The direct page variables area is where variables reside that have been defined with the storage class direct in the C source code or in the direct segment in assembly language source code. Notice that the size of this area is always at least one byte (to ensure that no pointer to a variable can have the value NULL or 0) and that it is not necessarily 256 bytes.

The uninitialized data area is where the remainder of the uninitialized program variables reside. These two areas are, in fact, cleared to all zeros by the program entry routine. The initialized data area is where the initialized variables of the program reside. There are two globally defined values which may be referred to: edata and end, which are the addresses of one byte higher than the initialized data and one byte higher than the uninitialized data respectively. Note that these are not variables; the values are accessed in C using the & operator as in:

    high = &end;
    low = &edata;

and in assembler:

    leax end,y
    stx  high,y

The Y register points to the base of the data area, and variables are addressed using Y-offset indexed instructions.

When the program starts running, the remaining memory is assigned to the "free" area. A program may call ibrk() to request additional working memory (initialized to zeros) from the free memory area. Alternatively, more memory can be dynamically obtained using sbrk(), which requests additional memory from the operating system and returns its lower bound. If this fails because OS-9 refuses to grant more memory for any reason, sbrk() will return -1.

Compile Time Memory Allocation

If not instructed otherwise, the linker will automatically allocate 1k bytes more than the total size of the program's variables and strings. This size will normally be adequate to cover the parameter area, stack requirements, and Standard Library file buffers. The allocation size may be altered when using the compiler by using the -m option on the command line. The memory requirements may be stated in pages, for example,

    cc prg.c -m=2

which allocates 512 bytes extra, or in kilobytes, for example:

    cc prg.c -m=10k

The linker will ignore the request if the size is less than 256 bytes.
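The two ways of writing the size can be turned into a byte count as sketched below. This only illustrates the arithmetic (256 bytes per page, 1024 bytes per kilobyte); it is not the compiler's own option parser, and the function name is invented.

```c
#include <stdlib.h>

/* Convert the text after "-m=" to a byte count.
   "2" means 2 pages (512 bytes); "10k" means 10 kilobytes (10240 bytes).
   Returns -1 for text that does not fit either form. */
long memory_option_bytes(const char *text)
{
    char *rest;
    long n = strtol(text, &rest, 10);

    if (rest == text || n < 0)
        return -1;
    if (*rest == '\0')
        return n * 256L;                /* pages */
    if (*rest == 'k' && rest[1] == '\0')
        return n * 1024L;               /* kilobytes */
    return -1;
}
```

Note that -m=4 and -m=1k request the same amount of memory.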

The following rules can serve as a rough guide to estimate how much memory to specify:

  1. The parameter area should be large enough for any anticipated command line string.
  2. The stack should not be less than 128 bytes and should take into account the depth of function calling chains and any recursion.
  3. All function arguments and local variables occupy stack space and each function entered needs 4 bytes more for the return address and temporary storage of the calling function's register variable.
  4. Free memory is requested by the Standard Library I/O functions for buffers at the rate of 256 bytes per accessed file. This does not apply to the lower level service request I/O functions such as open(), read(), or write() nor to stderr which is always unbuffered, but it does apply to both stdin and stdout (see the Standard Library documentation).
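The rules above amount to simple arithmetic, which can be written down as a rough estimator. The function names and the way the inputs are split up are invented for this sketch, and the result is only as good as the guesses fed into it.

```c
/* Stack bytes used by one active call: its arguments and locals plus
   4 bytes for the return address and the saved register variable. */
unsigned frame_bytes(unsigned arg_bytes, unsigned local_bytes)
{
    return arg_bytes + local_bytes + 4;
}

/* Rough total, in bytes, of parameter area, stack, and file buffers. */
unsigned estimate_memory(unsigned param_bytes, unsigned stack_bytes,
                         unsigned buffered_files)
{
    if (stack_bytes < 128)
        stack_bytes = 128;              /* rule 2: never below 128 bytes */
    return param_bytes + stack_bytes + buffered_files * 256u;
}
```

For example, an 80-byte command line, a small stack, and buffered use of stdin and stdout give 80 + 128 + 512 = 720 bytes, or three pages when rounded up.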

A good method for getting the feel of how much memory is needed by your program is to allow the linker to set the memory size to its usually conservative default value. Then, if the program runs satisfactorily with a variety of input but memory is limited on the system, try reducing the allocation at the next compilation. If a stack overflow occurs or an ibrk() call returns -1, then try increasing the memory next time. You cannot damage the system by getting it wrong, but data may be lost if the program runs out of space at a crucial time. It pays to err on the generous side.

System Calls