Function (computer programming)

In computer programming, a subprogram is a separate sequence of program instructions that perform a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Subprograms may be defined within programs, or separately in libraries that can be used by multiple programs.

There are numerous motivations for the use of subprograms:

to reduce redundancy in a program,
to enable reuse of code across multiple programs,
to decompose complex problems into simpler pieces,
to improve readability of a program,
to replicate useful mathematical functions,
to hide or regulate part of the program (see Information Hiding).

Generally, to make use of a subprogram, a programmer places some form of call instruction--which constitutes a call site--into an instruction sequence. When the call site is encountered, the instruction sequence is temporarily suspended, and the subprogram itself executes until it completes, at which time the original instruction sequence resumes.

Parameters and return values

Many programming languages provide facilities for data to be passed into the subprogram from the call instruction, in the form of arguments, and for data to be returned from the subprogram to its caller, in the form of one or more return values. (todo: formal and actual parameters?)

Calling conventions

Parameters can be passed to subprograms in several ways:

In the call-by-value mechanism, a copy of each argument is passed to the subprogram. If the subprogram modifies an argument, it is merely modifying its own copy. Therefore, such modifications are not "visible" to the caller, in the sense that the caller's copy of that datum is unaffected.

In the call-by-reference mechanism, the caller passes a reference to each argument. This means the caller and the subprogram are both manipulating the same datum, so when such a subprogram modifies one of its parameters, the modification is visible from the caller's perspective as well. Call-by-reference can be simulated in some call-by-value languages via pointers: each pointer behaves like a reference to some datum, and while changes to the pointer itself are not visible by the caller (because the pointer was passed by value), modifications to the referenced datum are visible.

[TODO: Add information about call-by-name]

Local variables, recursion, and re-entrancy

A subprogram may find it useful to make use of a certain amount of "scratch" space; that is, memory used during the execution of that subprogram to hold intermediate results. Variables stored in this scratch space are referred to as local variables, and the scratch space itself is referred to as an activation record.

A subprogram may have any number and nature of call sites; in fact, a subprogram may even call itself, causing its execution to suspend while another nested execution of the same subprogram occurs. This is referred to as recursion, and is a useful technique for making some complex algorithms more comprehensible. However, recursion poses a problem if the recursive execution modifies any local variables, because when the suspended execution resumes, it will find that the data stored in its local variables have been lost.

Early languages like Fortran simply didn't support recursion for this reason. Modern languages almost invariably provide a fresh activation record for every execution of a subprogram; that way, the nested execution is free to modify its local variables without concern for the effect on other suspended executions in progress. As nested calls accumulate, a call stack structure is formed, consisting of one activation record for each suspended subprogram. In fact, this stack structure is virtually ubiquitous, and so activation records are commonly referred to as stack frames.

If a subprogram can function properly even when called while another execution is already in progreses, that subprogram is said to be re-entrant. A recursive subprogram must be re-entrant. Re-entrant subprograms are also useful in [[threads (computer science}|multi-threaded]] situations, since multiple threads can call the same subprogram without fear of interfering with each other.

[TODO: mention return address]

Related terms

Different programming languages and methodologies possess notions and mechanisms related to subprograms:

Subroutine is practically synonymous with "subprogram." The former term may derive from the terminology of assembly languages and Fortran.
Function and procedure often denote a subprogram that takes parameters and may or may not have a return value. Many make the distinction between "functions", that possess return values and appear in expressions, versus "procedures", that possess no return values and appear in statements (though this is not a distinction found in the C programming language). (See also Command-Query Separation.)
Method is used in object-oriented programming and denotes a subprogram that describes some behaviour of an object.

Parameters and return values

Calling conventions

Local variables, recursion, and re-entrancy

Related terms

See also