A simpler version that writes its output directly to memory is called the loader, though loading is typically considered a separate process.
Overview
Computer
programs typically are composed of several parts or modules; these
parts/modules do not need to be contained within a single object file, and in such cases refer to each other by means of symbols as addresses into other modules, which are mapped into memory addresses when linked for execution.
While the process of linking is meant to ultimately combine these
independent parts, there are many good reasons to develop those
separately at the source-level. Among these reasons are the ease of organizing several smaller pieces over a monolithic
whole and the ability to better define the purpose and responsibilities
of each individual piece, which is essential for managing complexity
and increasing long-term maintainability in software architecture.
Typically, an object file can contain three kinds of symbols:
defined "external" symbols, sometimes called "public" or "entry" symbols, which allow it to be called by other modules,
undefined "external" symbols, which reference other modules where these symbols are defined, and
local symbols, used internally within the object file to facilitate relocation.
For most compilers, each object file is the result of compiling one
input source code file. When a program comprises multiple object files,
the linker combines these files into a unified executable program,
resolving the symbols as it goes along.
Linkers can take objects from a collection called a library or runtime library. Most linkers do not include all the object files in a static library
in the output executable; they include only those object files from the
library that are referenced by other object files or libraries directly
or indirectly. But for a shared library,
the entire library has to be loaded during runtime as it is not known
which functions or methods will be called during runtime. Library
linking may thus be an iterative process, with some referenced modules
requiring additional modules to be linked, and so on. Libraries exist
for diverse purposes, and one or more system libraries are usually
linked in by default.
The linker also takes care of arranging the objects in a program's address space. This may involve relocating code that assumes a specific base address
into another base. Since a compiler seldom knows where an object will
reside, it often assumes a fixed base location (for example, zero). Relocating machine code may involve re-targeting of absolute jumps, loads and stores.
The executable output by the linker may need another relocation
pass when it is finally loaded into memory (just before execution). This
pass is usually omitted on hardware offering virtual memory:
every program is put into its own address space, so there is no
conflict even if all programs load at the same base address. This pass
may also be omitted if the executable is a position independent executable.
On some Unix variants, such as SINTRAN III, the process performed by a linker (assembling object files into a program) was called loading (as in loading executable code onto a file). Additionally, in some operating systems, the same program handles both the jobs of linking and loading a program (dynamic linking).
Many operating system
environments allow dynamic linking, deferring the resolution of some
undefined symbols until a program is run. That means that the executable
code still contains undefined symbols, plus a list of objects or
libraries that will provide definitions for these. Loading the program
will load these objects/libraries as well, and perform a final linking.
This approach offers two advantages:
Often-used libraries (for example the standard system libraries)
need to be stored in only one location, not duplicated in every single
executable file, thus saving limited memory and disk space.
If a bug in a library function is corrected by replacing the library or performance
is improved, all programs using it dynamically will benefit from the
correction after restarting them. Programs that included this function
by static linking would have to be re-linked first.
There are also disadvantages:
Known on the Windows platform as "DLL hell",
an incompatible updated library will break executables that depended on
the behavior of the previous version of the library if the newer
version is not correctly backward compatible.
A program, together with the libraries it uses, might be certified
(e.g. as to correctness, documentation requirements, or performance) as a
package, but not if components can be replaced (this also argues
against automatic OS updates in critical systems; in both cases, the OS
and libraries form part of a qualified environment).
Static
linking is the result of the linker copying all library routines used
in the program into the executable image. This may require more disk
space and memory than dynamic linking, but is more portable, since it
does not require the presence of the library
on the system where it runs. Static linking also prevents "DLL hell",
since each program includes exactly the versions of library routines
that it requires, with no conflict with other programs. A program using
just a few routines from a library does not require the entire library
to be installed.
Relocation
As
the compiler has no information on the layout of objects in the final
output, it cannot take advantage of shorter or more efficient
instructions that place a requirement on the address of another object.
For example, a jump instruction can reference an absolute address or an
offset from the current location, and the offset could be expressed with
different lengths depending on the distance to the target. By first
generating the most conservative instruction (usually the largest
relative or absolute variant, depending on platform) and adding relaxation hints,
it is possible to substitute shorter or more efficient instructions
during the final link. In regard to jump optimizations this is also
called automatic jump-sizing. This step can be performed only after all input objects have been read and assigned temporary addresses; the linker relaxation
pass subsequently reassigns addresses, which may in turn allow more
potential relaxations to occur. In general, the substituted sequences
are shorter, which allows this process to always converge on the best
solution given a fixed order of objects; if this is not the case,
relaxations can conflict, and the linker needs to weigh the advantages
of either option.
While instruction relaxation typically occurs at link-time,
inner-module relaxation can already take place as part of the optimizing
process at compile-time. In some cases, relaxation can also occur at load-time as part of the relocation process or combined with dynamic dead-code elimination techniques.
Linkage editor
In IBM System/360mainframe environments such as OS/360, including z/OS for the z/Architecture mainframes, this type of program is known as a linkage editor. As the name implies a linkage editor
has the additional capability of allowing the addition, replacement,
and/or deletion of individual program sections. Operating systems such
as OS/360 have format for executable load-modules containing
supplementary data about the component sections of a program, so that an
individual program section can be replaced, and other parts of the
program updated so that relocatable addresses and other references can
be corrected by the linkage editor, as part of the process.
One advantage of this is that it allows a program to be
maintained without having to keep all of the intermediate object files,
or without having to re-compile program sections that haven't changed.
It also permits program updates to be distributed in the form of small
files (originally card decks),
containing only the object module to be replaced. In such systems,
object code is in the form and format of 80-byte punched-card images, so
that updates can be introduced into a system using that medium. In
later releases of OS/360 and in subsequent systems, load-modules contain
additional data about versions of components modules, to create a
traceable record of updates. It also allows one to add, change, or
remove an overlay structure from an already linked load module.
The term "linkage editor" should not be construed as implying
that the program operates in a user-interactive mode like a text editor.
It is intended for batch-mode execution, with the editing commands
being supplied by the user in sequentially organized files, such as punched cards, DASD, or magnetic tape.
Linkage editing (IBM nomenclature) or consolidation or collection (ICL nomenclature) refers to the linkage editor's or consolidator's
act of combining the various pieces into a relocatable binary, whereas
the loading and relocation into an absolute binary at the target address
is normally considered a separate step.
Linker Control Scripts
In
the beginning linkers gave users very limited control over the
arrangement of generated output object files. As the target systems
became complex with different memory requirements such as embedded
systems, it became necessary to give users control to generate output
object files with their specific requirements such as defining base
addresses' of segments. Linkers control scripts were used for this.
Common implementations
On
Unix and Unix-like systems, the linker is known as "ld". Origins of the
name "ld" are "LoaDer" and "Link eDitor". The term "loader" was used to
describe the process of loading external symbols from other programs
during the process of linking.
GNU linker
The GNU linker (or GNU ld) is the GNU Project's free software
implementation of the Unix command ld. GNU ld runs the linker, which
creates an executable file (or a library) from object files created
during compilation of a software project. A linker script may be passed to GNU ld to exercise greater control over the linking process. The GNU linker is part of the GNU Binary Utilities (binutils). Two versions of ld are provided in binutils: the traditional GNU ld based on bfd, and a "streamlined" ELF-only version called gold.
The command-line and linker script syntaxes of GNU ld is the de facto standard in much of the Unix-like world. The LLVM project's linker, lld, is designed to be drop-in compatible,
and may be used directly with the GNU compiler. Another drop-in
replacement, mold, is a highly parallelized and faster alternative which
is also supported by GNU tools.
A hierarchy can link entities either directly or indirectly, and
either vertically or diagonally. The only direct links in a hierarchy,
insofar as they are hierarchical, are to one's immediate superior or to
one of one's subordinates,
although a system that is largely hierarchical can also incorporate
alternative hierarchies. Hierarchical links can extend "vertically"
upwards or downwards via multiple links in the same direction, following
a path.
All parts of the hierarchy that are not linked vertically to one
another nevertheless can be "horizontally" linked through a path by
traveling up the hierarchy to find a common direct or indirect superior,
and then down again. This is akin to two co-workers or colleagues;
each reports to a common superior, but they have the same relative
amount of authority. Organizational forms exist that are both
alternative and complementary to hierarchy. Heterarchy is one such form.
Level or Tier: a set of objects with the same rank OR importance
Ordering: the arrangement of the (ranks or levels)
Hierarchy: the arrangement of a particular set of members
into (ranks or levels). Multiple hierarchies are possible per (dimension
taxonomy or Classification-system), in which selected levels of the
dimension are omitted to flatten the structure
Terms about Placement
Hierarch, the apex of the hierarchy, consisting of one single orphan (object or member) in the top level of a dimension. The root of an inverted-tree structure
Member, a (member or node) in any level of a hierarchy in a dimension to which (superior and subordinate) members are attached
Orphan,
a member in any level of a dimension without a parent member. Often the
apex of a disconnected branch. Orphans can be grafted back into the
hierarchy by creating a relationship (interaction) with a parent in the
immediately superior level
Leaf, a member in any level of a dimension without subordinates in the hierarchy
Neighbour: a member adjacent to another member in the same (level or rank). Always a peer.
Superior: a higher level or an object ranked at a higher level (A parent or an ancestor)
Subordinate: a lower level or an object ranked at a lower level (A child or a descendant)
Collection: all of the objects at one level (i.e. Peers)
Peer: an object with the same rank (and therefore at the same level)
Interaction: the relationship between an object and its direct superior or subordinate (i.e. a superior/inferior pair)
a direct interaction occurs when one object is on a level exactly one higher or one lower than the other (i.e., on a tree, the two objects have a line between them)
Distance:
the minimum number of connections between two objects, i.e., one less
than the number of objects that need to be "crossed" to trace a path from one object to another
Span: a qualitative description of the width of a level when diagrammed, i.e., the number of subordinates an object has
Terms about Nature
Attribute: a heritable characteristic of (members and their subordinates) in a level (e.g. hair-colour)
Attribute-value: the specific value of a heritable characteristic (e.g. Auburn)
Most hierarchies use a more specific vocabulary pertaining to
their subject, but the idea behind them is the same. For example, with data structures, objects are known as nodes, superiors are called parents and subordinates are called children. In a business setting, a superior is a supervisor/boss and a peer is a colleague.
Degree of branching
Degree of branching refers to the number of direct subordinates or children an object has (in graph theory, equivalent to the number of other vertices
connected to via outgoing arcs, in a directed graph) a node has.
Hierarchies can be categorized based on the "maximum degree", the
highest degree present in the system as a whole. Categorization in this
way yields two broad classes: linear and branching.
In a linear hierarchy, the maximum degree is 1.
In other words, all of the objects can be visualized in a line-up, and
each object (excluding the top and bottom ones) has exactly one direct
subordinate and one direct superior. This is referring to the objects and not the levels;
every hierarchy has this property with respect to levels, but normally
each level can have an infinite number of objects. An example of a
linear hierarchy is the hierarchy of life.
In a branching hierarchy, one or more objects has a degree of 2 or more (and therefore the minimum degree is 2 or higher). For many people, the word "hierarchy" automatically evokes an image of a branching hierarchy. Branching hierarchies are present within numerous systems, including organizations and classification schemes. The broad category of branching hierarchies can be further subdivided based on the degree.
A flat hierarchy (also known for companies as flat organization) is a branching hierarchy in which the maximum degree approaches infinity, i.e., that has a wide span.
Most often, systems intuitively regarded as hierarchical have at most a
moderate span. Therefore, a flat hierarchy is often not viewed as a
hierarchy at all. For example, diamonds and graphite are flat hierarchies of numerous carbon atoms that can be further decomposed into subatomic particles.
An overlapping hierarchy is a branching hierarchy in which at least one object has two parent objects. For example, a graduate student can have two co-supervisors to whom the student reports directly and equally, and who have the same level of authority within the university hierarchy (i.e., they have the same position or tenure status).
Etymology
Possibly the first use of the English word hierarchy cited by the Oxford English Dictionary was in 1881, when it was used in reference to the three orders of three angels as depicted by Pseudo-Dionysius the Areopagite (5th–6th centuries). Pseudo-Dionysius used the related Greek word (ἱεραρχία, hierarchia) both in reference to the celestial hierarchy and the ecclesiastical hierarchy. The Greek term hierarchia means 'rule of a high priest', from hierarches (ἱεράρχης, 'president of sacred rites, high-priest') and that from hiereus (ἱερεύς, 'priest') and arche (ἀρχή, 'first place or power, rule'). Dionysius is credited with first use of it as an abstract noun.
Since hierarchical churches, such as the Roman Catholic (see Catholic Church hierarchy) and Eastern Orthodox churches, had tables of organization that were "hierarchical" in the modern sense of the word (traditionally with God as the pinnacle or head of the hierarchy), the term came to refer to similar organizational methods in secular settings.
Representing hierarchies
A hierarchy is typically depicted as a pyramid,
where the height of a level represents that level's status and width of
a level represents the quantity of items at that level relative to the
whole. For example, the few Directors of a company could be at the apex, and the base could be thousands of people who have no subordinates.
These pyramids are often diagrammed with a triangle
diagram which serves to emphasize the size differences between the
levels (but not all triangle/pyramid diagrams are hierarchical; for
example, the 1992 USDA food guide pyramid). An example of a triangle diagram appears to the right.
More recently, as computers have allowed the storage and
navigation of ever larger data sets, various methods have been developed
to represent hierarchies in a manner that makes more efficient use of
the available space on a computer's screen. Examples include fractal maps, TreeMaps and Radial Trees.
Visual hierarchy
In
the design field, mainly graphic design, successful layouts and
formatting of the content on documents are heavily dependent on the
rules of visual hierarchy. Visual hierarchy is also important for proper organization of files on computers.
An example of visually representing hierarchy is through nested
clusters. Nested clusters represent hierarchical relationships using
layers of information. The child element is within the parent element,
such as in a Venn diagram.
This structure is most effective in representing simple hierarchical
relationships. For example, when directing someone to open a file on a
computer desktop, one may first direct them towards the main folder,
then the subfolders within the main folder. They will keep opening files
within the folders until the designated file is located.
For more complicated hierarchies, the stair structure represents
hierarchical relationships through the use of visual stacking. Visually
imagine the top of a downward staircase beginning at the left and
descending on the right. Child elements are towards the bottom of the
stairs and parent elements are at the top. This structure represents
hierarchical relationships through the use of visual stacking.
Informal representation
In plain English, a hierarchy can be thought of as a set in which:
No element is superior to itself, and
One element, the (apex or hierarch), is superior to all of the other elements in the set.
The first requirement is also interpreted to mean that a hierarchy can have no circular relationships; the association between two objects is always transitive.
The second requirement asserts that a hierarchy must have a leader or root that is common to all of the objects.
Mathematically, in its most general form, a hierarchy is a partially ordered set or poset. The system
in this case is the entire poset, which is constituted of elements.
Within this system, each element shares a particular unambiguous
property. Objects with the same property value are grouped together, and
each of those resulting levels is referred to as a class.
"Hierarchy" is particularly used to refer to a poset in which the
classes are organized in terms of increasing complexity.
Operations such as addition, subtraction, multiplication and division
are often performed in a certain sequence or order. Usually, addition
and subtraction are performed after multiplication and division has
already been applied to a problem. The use of parentheses is also a
representation of hierarchy, for they show which operation is to be done
prior to the following ones. For example:
(2 + 5) × (7 - 4).
In this problem, typically one would multiply 5 by 7 first, based on the
rules of mathematical hierarchy. But when the parentheses are placed,
one will know to do the operations within the parentheses first before
continuing on with the problem. These rules are largely dominant in
algebraic problems, ones that include several steps to solve. The use of
hierarchy in mathematics is beneficial to quickly and efficiently solve
a problem without having to go through the process of slowly dissecting
the problem. Most of these rules are now known as the proper way into
solving certain equations.
Subtypes
Nested hierarchy
A nested hierarchy or inclusion hierarchy is a hierarchical ordering of nested sets. The concept of nesting is exemplified in Russian matryoshka dolls.
Each doll is encompassed by another doll, all the way to the outer
doll. The outer doll holds all of the inner dolls, the next outer doll
holds all the remaining inner dolls, and so on. Matryoshkas represent a
nested hierarchy where each level contains only one object, i.e., there
is only one of each size of doll; a generalized nested hierarchy allows
for multiple objects within levels but with each object having only one
parent at each level. The general concept is both demonstrated and
mathematically formulated in the following example:
A square can always also be referred to as a quadrilateral, polygon
or shape. In this way, it is a hierarchy. However, consider the set of
polygons using this classification. A square can only be a quadrilateral; it can never be a triangle, hexagon, etc.
Nested hierarchies are the organizational schemes behind taxonomies and systematic classifications. For example, using the original Linnaean taxonomy (the version he laid out in the 10th edition of Systema Naturae), a human can be formulated as:
Taxonomies may change frequently (as seen in biological taxonomy), but the underlying concept of nested hierarchies is always the same.
In many programming taxonomies and syntax models (as well as
fractals in mathematics), nested hierarchies, including Russian dolls,
are also used to illustrate the properties of self-similarity and recursion.
Recursion itself is included as a subset of hierarchical programming,
and recursive thinking can be synonymous with a form of hierarchical
thinking and logic.
Containment hierarchy
A containment hierarchy is a direct extrapolation of the nested hierarchy concept. All of the ordered sets are still nested, but every set must be "strict"—no two sets can be identical. The shapes example above can be modified to demonstrate this:
The notation means x is a subset of y but is not equal to y.
Two types of containment hierarchies are the subsumptive containment hierarchy and the compositional containment hierarchy. A subsumptive hierarchy "subsumes" its children, and a compositional hierarchy is "composed" of its children. A hierarchy can also be both subsumptive and compositional.
Subsumptive containment hierarchy
A subsumptive
containment hierarchy is a classification of object classes from the
general to the specific. Other names for this type of hierarchy are
"taxonomic hierarchy" and "IS-A hierarchy".
The last term describes the relationship between each level—a
lower-level object "is a" member of the higher class. The taxonomical
structure outlined above is a subsumptive containment hierarchy. Using
again the example of Linnaean taxonomy, it can be seen that an object
that is a member of the level Mammalia "is a" member of the level Animalia;
more specifically, a human "is a" primate, a primate "is a" mammal, and
so on. A subsumptive hierarchy can also be defined abstractly as a
hierarchy of "concepts". For example, with the Linnaean hierarchy outlined above, an entity name like Animalia is a way to group all the species that fit the conceptualization of an animal.
Compositional containment hierarchy
A compositional containment hierarchy is an ordering of the parts that make up a system—the system is "composed" of these parts. Most engineered structures, whether natural or artificial, can be broken down in this manner.
The compositional hierarchy that every person encounters at every moment is the hierarchy of life. Every person can be reduced to organ systems, which are composed of organs, which are composed of tissues, which are composed of cells, which are composed of molecules, which are composed of atoms. In fact, the last two levels apply to all matter, at least at the macroscopic scale. Moreover, each of these levels inherit all the properties of their children.
In this particular example, there are also emergent properties—functions that are not seen at the lower level (e.g., cognition is not a property of neurons but is of the brain)—and
a scalar quality (molecules are bigger than atoms, cells are bigger
than molecules, etc.). Both of these concepts commonly exist in
compositional hierarchies, but they are not a required general property.
These level hierarchies are characterized by bi-directional causation. Upward causation
involves lower-level entities causing some property of a higher level
entity; children entities may interact to yield parent entities, and
parents are composed at least partly by their children. Downward causation refers to the effect that the incorporation of entity x into a higher-level entity can have on x's properties and interactions. Furthermore, the entities found at each level are autonomous.
While the above examples are often
clearly depicted in a hierarchical form and are classic examples,
hierarchies exist in numerous systems where this branching structure is
not immediately apparent. For example, most postal-code systems are hierarchical. Using the Canadian postal code system as an example, the top level's binding concept, the "postal district",
consists of 18 objects (letters). The next level down is the "zone",
where the objects are the digits 0–9. This is an example of an overlapping hierarchy,
because each of these 10 objects has 18 parents. The hierarchy
continues downward to generate, in theory, 7,200,000 unique codes of the
format A0A 0A0 (the second and third letter positions allow 20 objects each). Most library classification systems are also hierarchical. The Dewey Decimal System is infinitely hierarchical because there is no finite bound on the number of digits can be used after the decimal point.
In a reverse hierarchy, the conceptual pyramid
of authority is turned upside-down, so that the apex is at the bottom
and the base is at the top. This mode represents the idea that members
of the higher rankings are responsible for the members of the lower
rankings.
Empirically, when we observe in nature a large proportion of the
(complex) biological systems, they exhibit hierarchic structure.
On theoretical grounds we could expect complex systems to be
hierarchies in a world in which complexity had to evolve from
simplicity. System hierarchies analysis performed in the 1950s, laid the empirical foundations for a field that would become, from the 1980s, hierarchical ecology.
CGI and computer-animationprograms mostly use hierarchies for models. On a 3D model of a human for example, the chest is a parent of the upper left arm, which is a parent of the lower left arm, which is a parent of the hand. This pattern is used in modeling and animation for almost everything built as a 3D digital model.
In this system, the three (or four with Algonquian languages) persons occur in a hierarchy of salience. To distinguish which is subject and which object, inverse markers are used if the object outranks the subject.
On the other hand, languages include a variety of phenomena that
are not hierarchical. For example, the relationship between a pronoun
and a prior noun-phrase to which it refers commonly crosses grammatical
boundaries in non-hierarchical ways.
Music
The structure of a musical composition is often understood hierarchically (for example by Heinrich Schenker (1768–1835, see Schenkerian analysis), and in the (1985) Generative Theory of Tonal Music, by composer Fred Lerdahl and linguist Ray Jackendoff).
The sum of all notes in a piece is understood to be an all-inclusive
surface, which can be reduced to successively more sparse and more
fundamental types of motion. The levels of structure that operate in
Schenker's theory are the foreground, which is seen in all the details
of the musical score; the middle ground, which is roughly a summary of
an essential contrapuntal progression and voice-leading; and the
background or Ursatz, which is one of only a few basic "long-range counterpoint" structures that are shared in the gamut of tonal music literature.
The pitches and form of tonal music are organized hierarchically, all pitches deriving their importance from their relationship to a tonic key, and secondary themes in other keys are brought back to the tonic in a recapitulation of the primary theme.
In the work of diverse theorists such as William James (1842 to 1910), Michel Foucault (1926 to 1984) and Hayden White (1928 to 2018), important critiques of hierarchical epistemology are advanced. James famously asserts in his work Radical Empiricism
that clear distinctions of type and category are a constant but
unwritten goal of scientific reasoning, so that when they are
discovered, success is declared. But if aspects of the world are
organized differently, involving inherent and intractable ambiguities,
then scientific questions are often considered unresolved.
Feminists, Marxists, anarchists, communists, critical theorists
and others, all of whom have multiple interpretations, criticize the
hierarchies commonly found within human society, especially in social
relationships. Hierarchies are present in all parts of society: in
businesses, schools, families, etc. These relationships are often viewed
as necessary. Entities that stand in hierarchical arrangements are
animals, humans, plants, etc.
Ethics, behavioral psychology, philosophies of identity
In ethics, various virtues are enumerated and sometimes organized hierarchically according to certain brands of virtue theory.
In some of these random examples, there is an asymmetry of
'compositional' significance between levels of structure, so that small
parts of the whole hierarchical array depend, for their meaning, on
their membership in larger parts. There is a hierarchy of activities in
human life: productive activity serves or is guided by the moral life;
the moral life is guided by practical reason; practical reason (used in
moral and political life) serves contemplative reason (whereby we
contemplate God). Practical reason sets aside time and resources for
contemplative reason.