A Medley of Potpourri

Friday, July 6, 2018

Generic programming

From Wikipedia, the free encyclopedia

Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. This approach, pioneered by ML in 1973,^[1]^[2] permits writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplication. Such software entities are known as generics in Ada, C#, Delphi, Eiffel, F#, Java, Rust, Swift, TypeScript and Visual Basic .NET. They are known as parametric polymorphism in ML, Scala, Haskell (the Haskell community also uses the term "generic" for a related but somewhat different concept) and Julia; templates in C++ and D; and parameterized types in the influential 1994 book Design Patterns.^[3] The authors of Design Patterns note that this technique, especially when combined with delegation, is very powerful, however,

Dynamic, highly parameterized software is harder to understand than more static software.

— Gang of Four, Design Patterns^[3] (Chapter 1)

The term generic programming was originally coined by David Musser and Alexander Stepanov^[4] in a more specific sense than the above, to describe a programming paradigm whereby fundamental requirements on types are abstracted from across concrete examples of algorithms and data structures and formalized as concepts, with generic functions implemented in terms of these concepts, typically using language genericity mechanisms as described above.

Stepanov–Musser and other generic programming paradigms

Generic programming is defined in Musser & Stepanov (1989) as follows,

Generic programming centers around the idea of abstracting from concrete, efficient algorithms to obtain generic algorithms that can be combined with different data representations to produce a wide variety of useful software.

— Musser, David R.; Stepanov, Alexander A., Generic Programming^[5]

Generic programming paradigm is an approach to software decomposition whereby fundamental requirements on types are abstracted from across concrete examples of algorithms and data structures and formalized as concepts, analogously to the abstraction of algebraic theories in abstract algebra.^[6] Early examples of this programming approach were implemented in Scheme and Ada,^[7] although the best known example is the Standard Template Library (STL),^[8]^[9] which developed a theory of iterators that is used to decouple sequence data structures and the algorithms operating on them.

For example, given N sequence data structures, e.g. singly linked list, vector etc., and M algorithms to operate on them, e.g. find, sort etc., a direct approach would implement each algorithm specifically for each data structure, giving N × M combinations to implement. However, in the generic programming approach, each data structure returns a model of an iterator concept (a simple value type that can be dereferenced to retrieve the current value, or changed to point to another value in the sequence) and each algorithm is instead written generically with arguments of such iterators, e.g. a pair of iterators pointing to the beginning and end of the subsequence to process. Thus, only N + M data structure-algorithm combinations need be implemented. Several iterator concepts are specified in the STL, each a refinement of more restrictive concepts e.g. forward iterators only provide movement to the next value in a sequence (e.g. suitable for a singly linked list or a stream of input data), whereas a random-access iterator also provides direct constant-time access to any element of the sequence (e.g. suitable for a vector). An important point is that a data structure will return a model of the most general concept that can be implemented efficiently—computational complexity requirements are explicitly part of the concept definition. This limits the data structures a given algorithm can be applied to and such complexity requirements are a major determinant of data structure choice. Generic programming similarly has been applied in other domains, e.g. graph algorithms.^[10]

Note that although this approach often utilizes language features of compile-time genericity/templates, it is in fact independent of particular language-technical details. Generic programming pioneer Alexander Stepanov wrote,

Generic programming is about abstracting and classifying algorithms and data structures. It gets its inspiration from Knuth and not from type theory. Its goal is the incremental construction of systematic catalogs of useful, efficient and abstract algorithms and data structures. Such an undertaking is still a dream.

— Alexander Stepanov, Short History of STL ^[11]^[12]

I believe that iterator theories are as central to Computer Science as theories of rings or Banach spaces are central to Mathematics.

— Alexander Stepanov, An Interview with A. Stepanov^[13]

Bjarne Stroustrup noted,

Following Stepanov, we can define generic programming without mentioning language features: Lift algorithms and data structures from concrete examples to their most general and abstract form.

— Bjarne Stroustrup, Evolving a language in and for the real world: C++ 1991-2006^[12]

Other programming paradigms that have been described as generic programming include Datatype generic programming as described in "Generic Programming — an Introduction".^[14] The Scrap your boilerplate approach is a lightweight generic programming approach for Haskell.^[15]

In this article we distinguish the high-level programming paradigms of generic programming, above, from the lower-level programming language genericity mechanisms used to implement them (see Programming language support for genericity). For further discussion and comparison of generic programming paradigms, see.^[16]

Programming language support for genericity

Genericity facilities have existed in high-level languages since at least the 1970s in languages such as ML, CLU and Ada, and were subsequently adopted by many object-based and object-oriented languages, including BETA, C++, D, Eiffel, Java, and DEC's now defunct Trellis-Owl language.

Genericity is implemented and supported differently in various programming languages; the term "generic" has also been used differently in various programming contexts. For example, in Forth the compiler can execute code while compiling and one can create new compiler keywords and new implementations for those words on the fly. It has few words that expose the compiler behaviour and therefore naturally offers genericity capacities that, however, are not referred to as such in most Forth texts. Similarly, dynamically typed languages, especially interpreted ones, usually offer genericity by default as both passing values to functions and value assignment are type-indifferent and such behavior is often utilized for abstraction or code terseness, however this is not typically labeled genericity as it's a direct consequence of dynamic typing system employed by the language^{[citation needed]}. The term has been used in functional programming, specifically in Haskell-like languages, which use a structural type system where types are always parametric and the actual code on those types is generic. These usages still serve a similar purpose of code-saving and the rendering of an abstraction.

Arrays and structs can be viewed as predefined generic types. Every usage of an array or struct type instantiates a new concrete type, or reuses a previous instantiated type. Array element types and struct element types are parameterized types, which are used to instantiate the corresponding generic type. All this is usually built-in in the compiler and the syntax differs from other generic constructs. Some extensible programming languages try to unify built-in and user defined generic types.

A broad survey of genericity mechanisms in programming languages follows. For a specific survey comparing suitability of mechanisms for generic programming, see.^[17]

In object-oriented languages

When creating container classes in statically typed languages, it is inconvenient to write specific implementations for each datatype contained, especially if the code for each datatype is virtually identical. For example, in C++, this duplication of code can be circumvented by defining a class template:

template<typename T>
class List
{
   /* class contents */
};

List<Animal> list_of_animals;
List<Car> list_of_cars;

Above, T is a placeholder for whatever type is specified when the list is created. These "containers-of-type-T", commonly called templates, allow a class to be reused with different datatypes as long as certain contracts such as subtypes and signature are kept. This genericity mechanism should not be confused with inclusion polymorphism, which is the algorithmic usage of exchangeable sub-classes: for instance, a list of objects of type Moving_Object containing objects of type Animal and Car. Templates can also be used for type-independent functions as in the Swap example below:

template<typename T>
void Swap(T & a, T & b) //"&" passes parameters by reference
{
   T temp = b;
   b = a;
   a = temp;
}

string hello = "World!", world = "Hello, ";
Swap( world, hello );
cout << hello << world << endl; //Output is "Hello, World!"

The C++ template construct used above is widely cited^{[citation needed]} as the genericity construct that popularized the notion among programmers and language designers and supports many generic programming idioms. The D programming language also offers fully generic-capable templates based on the C++ precedent but with a simplified syntax. The Java programming language has provided genericity facilities syntactically based on C++'s since the introduction of J2SE 5.0.

C# 2.0, Oxygene 1.5 (also known as Chrome) and Visual Basic .NET 2005 have constructs that take advantage of the support for generics present in the Microsoft .NET Framework since version 2.0.

Generics in Ada

Ada has had generics since it was first designed in 1977–1980. The standard library uses generics to provide many services. Ada 2005 adds a comprehensive generic container library to the standard library, which was inspired by C++'s standard template library.

A generic unit is a package or a subprogram that takes one or more generic formal parameters.

A generic formal parameter is a value, a variable, a constant, a type, a subprogram, or even an instance of another, designated, generic unit. For generic formal types, the syntax distinguishes between discrete, floating-point, fixed-point, access (pointer) types, etc. Some formal parameters can have default values.

To instantiate a generic unit, the programmer passes actual parameters for each formal. The generic instance then behaves just like any other unit. It is possible to instantiate generic units at run-time, for example inside a loop.

Example

The specification of a generic package:

 generic
    Max_Size : Natural; -- a generic formal value
    type Element_Type is private; -- a generic formal type; accepts any nonlimited type
 package Stacks is
    type Size_Type is range 0 .. Max_Size;
    type Stack is limited private;
    procedure Create (S : out Stack;
                      Initial_Size : in Size_Type := Max_Size);
    procedure Push (Into : in out Stack; Element : in Element_Type);
    procedure Pop (From : in out Stack; Element : out Element_Type);
    Overflow : exception;
    Underflow : exception;
 private
    subtype Index_Type is Size_Type range 1 .. Max_Size;
    type Vector is array (Index_Type range <>) of Element_Type;
    type Stack (Allocated_Size : Size_Type := 0) is record
       Top : Index_Type;
       Storage : Vector (1 .. Allocated_Size);
    end record;
 end Stacks;

Instantiating the generic package:

 type Bookmark_Type is new Natural;
 -- records a location in the text document we are editing

 package Bookmark_Stacks is new Stacks (Max_Size => 20,
                                        Element_Type => Bookmark_Type);
 -- Allows the user to jump between recorded locations in a document

Using an instance of a generic package:

 type Document_Type is record
    Contents : Ada.Strings.Unbounded.Unbounded_String;
    Bookmarks : Bookmark_Stacks.Stack;
 end record;

 procedure Edit (Document_Name : in String) is
   Document : Document_Type;
 begin
   -- Initialise the stack of bookmarks:
   Bookmark_Stacks.Create (S => Document.Bookmarks, Initial_Size => 10);
   -- Now, open the file Document_Name and read it in...
 end Edit;

Advantages and limitations

The language syntax allows precise specification of constraints on generic formal parameters. For example, it is possible to specify that a generic formal type will only accept a modular type as the actual. It is also possible to express constraints between generic formal parameters; for example:

 generic
    type Index_Type is (<>); -- must be a discrete type
    type Element_Type is private; -- can be any nonlimited type
    type Array_Type is array (Index_Type range <>) of Element_Type;

In this example, Array_Type is constrained by both Index_Type and Element_Type. When instantiating the unit, the programmer must pass an actual array type that satisfies these constraints.

The disadvantage of this fine-grained control is a complicated syntax, but, because all generic formal parameters are completely defined in the specification, the compiler can instantiate generics without looking at the body of the generic.

Unlike C++, Ada does not allow specialised generic instances, and requires that all generics be instantiated explicitly. These rules have several consequences:

the compiler can implement shared generics: the object code for a generic unit can be shared between all instances (unless the programmer requests inlining of subprograms, of course). As further consequences:
- there is no possibility of code bloat (code bloat is common in C++ and requires special care, as explained below).
- it is possible to instantiate generics at run-time, as well as at compile time, since no new object code is required for a new instance.
- actual objects corresponding to a generic formal object are always considered to be non-static inside the generic; see Generic formal objects in the Wikibook for details and consequences.
all instances of a generic being exactly the same, it is easier to review and understand programs written by others; there are no "special cases" to take into account.
all instantiations being explicit, there are no hidden instantiations that might make it difficult to understand the program.
Ada does not permit "template metaprogramming", because it does not allow specialisations.

Templates in C++

C++ uses templates to enable generic programming techniques. The C++ Standard Library includes the Standard Template Library or STL that provides a framework of templates for common data structures and algorithms. Templates in C++ may also be used for template metaprogramming, which is a way of pre-evaluating some of the code at compile-time rather than run-time. Using template specialization, C++ Templates are considered Turing complete.

Technical overview

There are two kinds of templates: function templates and class templates. A function template is a pattern for creating ordinary functions based upon the parameterizing types supplied when instantiated. For example, the C++ Standard Template Library contains the function template max(x, y) that creates functions that return either x or y, whichever is larger. max() could be defined like this:

template <typename T>

 T max(T x, T y)
{
    return x < y ? y : x;
}

Specializations of this function template, instantiations with specific types, can be called just like an ordinary function:

cout << max(3, 7);   // outputs 7

The compiler examines the arguments used to call max and determines that this is a call to max(int, int). It then instantiates a version of the function where the parameterizing type T is int, making the equivalent of the following function:

int max(int x, int y)
{
    return x < y ? y : x;
}

This works whether the arguments x and y are integers, strings, or any other type for which the expression x < y is sensible, or more specifically, for any type for which operator< is defined. Common inheritance is not needed for the set of types that can be used, and so it is very similar to duck typing. A program defining a custom data type can use operator overloading to define the meaning of < for that type, thus allowing its use with the max() function template. While this may seem a minor benefit in this isolated example, in the context of a comprehensive library like the STL it allows the programmer to get extensive functionality for a new data type, just by defining a few operators for it. Merely defining < allows a type to be used with the standard sort(), stable_sort(), and binary_search() algorithms or to be put inside data structures such as sets, heaps, and associative arrays.

C++ templates are completely type safe at compile time. As a demonstration, the standard type complex does not define the < operator, because there is no strict order on complex numbers. Therefore, max(x, y) will fail with a compile error, if x and y are complex values. Likewise, other templates that rely on < cannot be applied to complex data unless a comparison (in the form of a functor or function) is provided. E.g.: A complex cannot be used as key for a map unless a comparison is provided. Unfortunately, compilers historically generate somewhat esoteric, long, and unhelpful error messages for this sort of error. Ensuring that a certain object adheres to a method protocol can alleviate this issue. Languages which use compare instead of < can also use complex values as keys.

The second kind of template, a class template, extends the same concept to classes. A class template specialization is a class. Class templates are often used to make generic containers. For example, the STL has a linked list container. To make a linked list of integers, one writes list. A list of strings is denoted list. A list has a set of standard functions associated with it, that work for any compatible parameterizing types.

Template specialization

A powerful feature of C++'s templates is template specialization. This allows alternative implementations to be provided based on certain characteristics of the parameterized type that is being instantiated. Template specialization has two purposes: to allow certain forms of optimization, and to reduce code bloat.

For example, consider a sort() template function. One of the primary activities that such a function does is to swap or exchange the values in two of the container's positions. If the values are large (in terms of the number of bytes it takes to store each of them), then it is often quicker to first build a separate list of pointers to the objects, sort those pointers, and then build the final sorted sequence. If the values are quite small however it is usually fastest to just swap the values in-place as needed. Furthermore, if the parameterized type is already of some pointer-type, then there is no need to build a separate pointer array. Template specialization allows the template creator to write different implementations and to specify the characteristics that the parameterized type(s) must have for each implementation to be used.

Unlike function templates, class templates can be partially specialized. That means that an alternate version of the class template code can be provided when some of the template parameters are known, while leaving other template parameters generic. This can be used, for example, to create a default implementation (the primary specialization) that assumes that copying a parameterizing type is expensive and then create partial specializations for types that are cheap to copy, thus increasing overall efficiency. Clients of such a class template just use specializations of it without needing to know whether the compiler used the primary specialization or some partial specialization in each case. Class templates can also be fully specialized, which means that an alternate implementation can be provided when all of the parameterizing types are known.

Advantages and disadvantages

Some uses of templates, such as the max() function, were previously filled by function-like preprocessor macros (a legacy of the C programming language). For example, here is a possible max() macro:

#define max(a,b) ((a) < (b) ? (b) : (a))

Macros are expanded by preprocessor, before compilation proper; templates are expanded at compile time. Macros are always expanded inline; templates can also be expanded as inline functions when the compiler deems it appropriate. Thus both function-like macros and function templates have no run-time overhead.

However, templates are generally considered an improvement over macros for these purposes. Templates are type-safe. Templates avoid some of the common errors found in code that makes heavy use of function-like macros, such as evaluating parameters with side effects twice. Perhaps most importantly, templates were designed to be applicable to much larger problems than macros.

There are three primary drawbacks to the use of templates: compiler support, poor error messages, and code bloat. Many compilers historically have poor support for templates, thus the use of templates can make code somewhat less portable. Support may also be poor when a C++ compiler is being used with a linker that is not C++-aware, or when attempting to use templates across shared library boundaries. Most modern compilers however now have fairly robust and standard template support, and the new C++ standard, C++11, further addresses these issues.

Almost all compilers produce confusing, long, or sometimes unhelpful error messages when errors are detected in code that uses templates.^[18] This can make templates difficult to develop.

Finally, the use of templates requires the compiler to generate a separate instance of the templated class or function for every permutation of type parameters used with it. (This is necessary because types in C++ are not all the same size, and the sizes of data fields are important to how classes work.) So the indiscriminate use of templates can lead to code bloat, resulting in excessively large executables. However, judicious use of template specialization and derivation can dramatically reduce such code bloat in some cases:

So, can derivation be used to reduce the problem of code replicated because templates are used? This would involve deriving a template from an ordinary class. This technique proved successful in curbing code bloat in real use. People who do not use a technique like this have found that replicated code can cost megabytes of code space even in moderate size programs.

— Bjarne Stroustrup, The Design and Evolution of C++, 1994^[19]

In simple cases templates can be transformed into generics (not causing code bloat) by creating a class getting a parameter derived from a type in compile time and wrapping a template around this class. It is a nice approach for creating generic heap-based containers.

The extra instantiations generated by templates can also cause debuggers to have difficulty working gracefully with templates. For example, setting a debug breakpoint within a template from a source file may either miss setting the breakpoint in the actual instantiation desired or may set a breakpoint in every place the template is instantiated.

Also, because the compiler needs to perform macro-like expansions of templates and generate different instances of them at compile time, the implementation source code for the templated class or function must be available (e.g. included in a header) to the code using it. Templated classes or functions, including much of the Standard Template Library (STL), if not included in header files, cannot be compiled. (This is in contrast to non-templated code, which may be compiled to binary, providing only a declarations header file for code using it.) This may be a disadvantage by exposing the implementing code, which removes some abstractions, and could restrict its use in closed-source projects.^{[citation needed]}

Templates in D

The D programming language supports templates based in design on C++. Most C++ template idioms will carry over to D without alteration, but D adds some additional functionality:

Template parameters in D are not restricted to just types and primitive values, but also allow arbitrary compile-time values (such as strings and struct literals), and aliases to arbitrary identifiers, including other templates or template instantiations.
Template constraints and the static if statement provide an alternative to C++'s substitution failure is not an error (SFINAE) mechanism, similar to C++ concepts.
The is(...) expression allows speculative instantiation to verify an object's traits at compile time.
The auto keyword and the typeof expression allow type inference for variable declarations and function return values, which in turn allows "Voldemort types" (types which do not have a global name).^[20]

Templates in D use a different syntax than in C++: whereas in C++ template parameters are wrapped in angular brackets (Template), D uses an exclamation sign and parentheses: Template!(param1, param2). This avoids the C++ parsing difficulties due to ambiguity with comparison operators. If there is only one parameter, the parentheses can be omitted.

Conventionally, D combines the above features to provide compile-time polymorphism using trait-based generic programming. For example, an input range is defined as any type that satisfies the checks performed by isInputRange, which is defined as follows:

template isInputRange(R)
{
    enum bool isInputRange = is(typeof(
    (inout int = 0)
    {
        R r = R.init;     // can define a range object
        if (r.empty) {}   // can test for empty
        r.popFront();     // can invoke popFront()
        auto h = r.front; // can get the front of the range
    }));
}

A function that accepts only input ranges can then use the above template in a template constraint:

auto fun(Range)(Range range)
    if (isInputRange!Range)
{
    // ...
}

Code generation

In addition to template metaprogramming, D also provides several features to enable compile-time code generation:

The import expression allows reading a file from disk and using its contents as a string expression.
Compile-time reflection allows enumerating and inspecting declarations and their members during compilation.
User-defined attributes allow users to attach arbitrary identifiers to declarations, which can then be enumerated using compile-time reflection.
Compile-Time Function Execution (CTFE) allows a subset of D (restricted to safe operations) to be interpreted during compilation.
String mixins allow evaluating and compiling the contents of a string expression as D code that becomes part of the program.

Combining the above allows generating code based on existing declarations. For example, D serialization frameworks can enumerate a type's members and generate specialized functions for each serialized type to perform serialization and deserialization. User-defined attributes could further indicate serialization rules.

The import expression and compile-time function execution also allow efficiently implementing domain-specific languages. For example, given a function that takes a string containing an HTML template and returns equivalent D source code, it is possible to use it in the following way:

// Import the contents of example.htt as a string manifest constant.
enum htmlTemplate = import("example.htt");

// Transpile the HTML template to D code.
enum htmlDCode = htmlTemplateToD(htmlTemplate);

// Paste the contents of htmlDCode as D code.
mixin(htmlDCode);

Genericity in Eiffel

Generic classes have been a part of Eiffel since the original method and language design. The foundation publications of Eiffel,^[21]^[22] use the term genericity to describe the creation and use of generic classes.

Basic/Unconstrained genericity

Generic classes are declared with their class name and a list of one or more formal generic parameters. In the following code, class LIST has one formal generic parameter G

class
    LIST [G]
            ...
feature   -- Access
    item: G
            -- The item currently pointed to by cursor
            ...
feature   -- Element change
    put (new_item: G)
            -- Add `new_item' at the end of the list
            ...

The formal generic parameters are placeholders for arbitrary class names that will be supplied when a declaration of the generic class is made, as shown in the two generic derivations below, where ACCOUNT and DEPOSIT are other class names. ACCOUNT and DEPOSIT are considered actual generic parameters as they provide real class names to substitute for G in actual use.

 list_of_accounts: LIST [ACCOUNT]
            -- Account list

    list_of_deposits: LIST [DEPOSIT]
            -- Deposit list

Within the Eiffel type system, although class LIST [G] is considered a class, it is not considered a type. However, a generic derivation of LIST [G] such as LIST [ACCOUNT] is considered a type.

Constrained genericity

For the list class shown above, an actual generic parameter substituting for G can be any other available class. To constrain the set of classes from which valid actual generic parameters can be chosen, a generic constraint can be specified. In the declaration of class SORTED_LIST below, the generic constraint dictates that any valid actual generic parameter will be a class that inherits from class COMPARABLE. The generic constraint ensures that elements of a SORTED_LIST can in fact be sorted.

class
    SORTED_LIST [G -> COMPARABLE]

Generics in Java

Support for the generics, or "containers-of-type-T" was added to the Java programming language in 2004 as part of J2SE 5.0. In Java, generics are only checked at compile time for type correctness. The generic type information is then removed via a process called type erasure, to maintain compatibility with old JVM implementations, making it unavailable at runtime. For example, a List is converted to the raw type List. The compiler inserts type casts to convert the elements to the String type when they are retrieved from the list, reducing performance compared to other implementations such as C++ templates.

Genericity in .NET [C#, VB.NET]

Generics were added as part of .NET Framework 2.0 in November 2005, based on a research prototype from Microsoft Research started in 1999.^[23] Although similar to generics in Java, .NET generics do not apply type erasure, but implement generics as a first class mechanism in the runtime using reification. This design choice provides additional functionality, such as allowing reflection with preservation of generic types, as well as alleviating some of the limitations of erasure (such as being unable to create generic arrays).^[24]^[25] This also means that there is no performance hit from runtime casts and normally expensive boxing conversions. When primitive and value types are used as generic arguments, they get specialized implementations, allowing for efficient generic collections and methods. As in C++ and Java, nested generic types such as Dictionary> are valid types, however are advised against for member signatures in code analysis design rules.^[26]
.NET allows six varieties of generic type constraints using the where keyword including restricting generic types to be value types, to be classes, to have constructors, and to implement interfaces.^[27] Below is an example with an interface constraint:

using System;

class Sample

    static void Main()

        int[] array = { 0, 1, 2, 3 };

        MakeAtLeast<int>(array, 2); // Change array to { 2, 2, 2, 3 }

        foreach (int i in array)

            Console.WriteLine(i); // Print results.

        Console.ReadKey(true);

    static void MakeAtLeast<T>(T[] list, T lowest) where T : IComparable<T>)

        for (int i = 0; i < list.Length; i++)

            if (list[i].CompareTo(lowest) < 0)

                list[i] = lowest;

The MakeAtLeast() method allows operation on arrays, with elements of generic type T. The method's type constraint indicates that the method is applicable to any type T that implements the generic IComparable interface. This ensures a compile time error, if the method is called if the type does not support comparison. The interface provides the generic method CompareTo(T).

The above method could also be written without generic types, simply using the non-generic Array type. However, since arrays are contravariant, the casting would not be type safe, and compiler may miss errors that would otherwise be caught while making use of the generic types. In addition, the method would need to access the array items as objects instead, and would require casting to compare two elements. (For value types like types such as int this requires a boxing conversion, although this can be worked around using the Comparer class, as is done in the standard collection classes.)

A notable behavior of static members in a generic .NET class is static member instantiation per run-time type (see example below).

 //A generic class
    public class GenTest<T>
    {
        //A static variable - will be created for each type on refraction
        static CountedInstances OnePerType = new CountedInstances();

        //a data member
        private T mT;

        //simple constructor
        public GenTest(T pT)
        {
            mT = pT;
        }
    }

    //a class
    public class CountedInstances
    {
        //Static variable - this will be incremented once per instance
        public static int Counter;

        //simple constructor
        public CountedInstances()
        {
            //increase counter by one during object instantiation
            CountedInstances.Counter++;
        }
    }

  //main code entry point
  //at the end of execution, CountedInstances.Counter = 2
  GenTest<int> g1 = new GenTest<int>(1);
  GenTest<int> g11 = new GenTest<int>(11);
  GenTest<int> g111 = new GenTest<int>(111);
  GenTest<double> g2 = new GenTest<double>(1.0);

Genericity in Delphi

Delphi's Object Pascal dialect acquired generics in the Delphi 2007 release, initially only with the (now discontinued) .NET compiler before being added to the native code in the Delphi 2009 release. The semantics and capabilities of Delphi generics are largely modelled on those had by generics in .NET 2.0, though the implementation is by necessity quite different. Here's a more or less direct translation of the first C# example shown above:

program Sample;

{$APPTYPE CONSOLE}

uses
  Generics.Defaults; //for IComparer<>

type
  TUtils = class
    class procedure MakeAtLeast<T>(Arr: TArray<T>; const Lowest: T;
      Comparer: IComparer<T>); overload;
    class procedure MakeAtLeast<T>(Arr: TArray<T>; const Lowest: T); overload;
  end;

class procedure TUtils.MakeAtLeast<T>(Arr: TArray<T>; const Lowest: T;
  Comparer: IComparer<T>);
var
  I: Integer;
begin
  if Comparer = nil then Comparer := TComparer<T>.Default;
  for I := Low(Arr) to High(Arr) do
    if Comparer.Compare(Arr[I], Lowest) < 0 then
      Arr[I] := Lowest;
end;

class procedure TUtils.MakeAtLeast<T>(Arr: TArray<T>; const Lowest: T);
begin
  MakeAtLeast<T>(Arr, Lowest, nil);
end;

var
  Ints: TArray<Integer>;
  Value: Integer;
begin
  Ints := TArray<Integer>.Create(0, 1, 2, 3);
  TUtils.MakeAtLeast<Integer>(Ints, 2);
  for Value in Ints do
    WriteLn(Value);
  ReadLn;
end.

As with C#, methods as well as whole types can have one or more type parameters. In the example, TArray is a generic type (defined by the language) and MakeAtLeast a generic method. The available constraints are very similar to the available constraints in C#: any value type, any class, a specific class or interface, and a class with a parameterless constructor. Multiple constraints act as an additive union.

Genericity in Free Pascal

Free Pascal implemented generics before Delphi, and with different syntax and semantics. However, work is now underway to implement Delphi generics alongside native FPC ones (see Wiki). This allows Free Pascal programmers to use generics in whatever style they prefer.

Delphi and Free Pascal example:

// Delphi style
unit A;

{$ifdef fpc}
  {$mode delphi}
{$endif}

interface

type
  TGenericClass<T> = class
    function Foo(const AValue: T): T;
  end;

implementation

function TGenericClass<T>.Foo(const AValue: T): T;
begin
  Result := AValue + AValue;
end;

end.

// Free Pascal's ObjFPC style
unit B;

{$ifdef fpc}
  {$mode objfpc}
{$endif}

interface

type
  generic TGenericClass<T> = class
    function Foo(const AValue: T): T;
  end;

implementation

function TGenericClass.Foo(const AValue: T): T;
begin
  Result := AValue + AValue;
end;

end.

// example usage, Delphi style
program TestGenDelphi;

{$ifdef fpc}
  {$mode delphi}
{$endif}

uses
  A,B;

var
  GC1: A.TGenericClass<Integer>;
  GC2: B.TGenericClass<String>;
begin
  GC1 := A.TGenericClass<Integer>.Create;
  GC2 := B.TGenericClass<String>.Create;
  WriteLn(GC1.Foo(100)); // 200
  WriteLn(GC2.Foo('hello')); // hellohello
  GC1.Free;
  GC2.Free;
end.

// example usage, ObjFPC style
program TestGenDelphi;

{$ifdef fpc}
  {$mode objfpc}
{$endif}

uses
  A,B;

// required in ObjFPC
type
  TAGenericClassInt = specialize A.TGenericClass<Integer>;
  TBGenericClassString = specialize B.TGenericClass<String>;
var
  GC1: TAGenericClassInt;
  GC2: TBGenericClassString;
begin
  GC1 := TAGenericClassInt.Create;
  GC2 := TBGenericClassString.Create;
  WriteLn(GC1.Foo(100)); // 200
  WriteLn(GC2.Foo('hello')); // hellohello
  GC1.Free;
  GC2.Free;
end.

Functional languages

Genericity in Haskell

The type class mechanism of Haskell supports generic programming. Six of the predefined type classes in Haskell (including Eq, the types that can be compared for equality, and Show, the types whose values can be rendered as strings) have the special property of supporting derived instances. This means that a programmer defining a new type can state that this type is to be an instance of one of these special type classes, without providing implementations of the class methods as is usually necessary when declaring class instances. All the necessary methods will be "derived" – that is, constructed automatically – based on the structure of the type. For instance, the following declaration of a type of binary trees states that it is to be an instance of the classes Eq and Show:

data BinTree a = Leaf a | Node (BinTree a) a (BinTree a)
      deriving (Eq, Show)

This results in an equality function (==) and a string representation function (show) being automatically defined for any type of the form BinTree T provided that T itself supports those operations.

The support for derived instances of Eq and Show makes their methods == and show generic in a qualitatively different way from para-metrically polymorphic functions: these "functions" (more accurately, type-indexed families of functions) can be applied to values of various types, and although they behave differently for every argument type, little work is needed to add support for a new type. Ralf Hinze (2004) has shown that a similar effect can be achieved for user-defined type classes by certain programming techniques. Other researchers have proposed approaches to this and other kinds of genericity in the context of Haskell and extensions to Haskell (discussed below).

PolyP

PolyP was the first generic programming language extension to Haskell. In PolyP, generic functions are called polytypic. The language introduces a special construct in which such polytypic functions can be defined via structural induction over the structure of the pattern functor of a regular datatype. Regular datatypes in PolyP are a subset of Haskell datatypes. A regular datatype t must be of kind * → *, and if a is the formal type argument in the definition, then all recursive calls to t must have the form t a. These restrictions rule out higher-kinded datatypes as well as nested datatypes, where the recursive calls are of a different form. The flatten function in PolyP is here provided as an example:

 flatten :: Regular d => d a -> [a]
   flatten = cata fl

   polytypic fl :: f a [a] -> [a]
     case f of
       g+h -> either fl fl
       g*h -> \(x,y) -> fl x ++ fl y
       () -> \x -> []
       Par -> \x -> [x]
       Rec -> \x -> x
       d@g -> concat . flatten . pmap fl
       Con t -> \x -> []

   cata :: Regular d => (FunctorOf d a b -> b) -> d a -> b

Generic Haskell

Generic Haskell is another extension to Haskell, developed at Utrecht University in the Netherlands. The extensions it provides are:

Type-indexed values are defined as a value indexed over the various Haskell type constructors (unit, primitive types, sums, products, and user-defined type constructors). In addition, we can also specify the behaviour of a type-indexed values for a specific constructor using constructor cases, and reuse one generic definition in another using default cases.

The resulting type-indexed value can be specialized to any type.

Kind-indexed types are types indexed over kinds, defined by giving a case for both * and k → k'. Instances are obtained by applying the kind-indexed type to a kind.
Generic definitions can be used by applying them to a type or kind. This is called generic application. The result is a type or value, depending on which sort of generic definition is applied.
Generic abstraction enables generic definitions be defined by abstracting a type parameter (of a given kind).
Type-indexed types are types that are indexed over the type constructors. These can be used to give types to more involved generic values. The resulting type-indexed types can be specialized to any type.

As an example, the equality function in Generic Haskell:^[28]

 type Eq {[ * ]} t1 t2 = t1 -> t2 -> Bool
   type Eq {[ k -> l ]} t1 t2 = forall u1 u2. Eq {[ k ]} u1 u2 -> Eq {[ l ]} (t1 u1) (t2 u2)

   eq {| t :: k |} :: Eq {[ k ]} t t
   eq {| Unit |} _ _ = True
   eq {| :+: |} eqA eqB (Inl a1) (Inl a2) = eqA a1 a2
   eq {| :+: |} eqA eqB (Inr b1) (Inr b2) = eqB b1 b2
   eq {| :+: |} eqA eqB _ _ = False
   eq {| :*: |} eqA eqB (a1 :*: b1) (a2 :*: b2) = eqA a1 a2 && eqB b1 b2
   eq {| Int |} = (==)
   eq {| Char |} = (==)
   eq {| Bool |} = (==)

Clean

Clean offers generic programming based PolyP and the generic Haskell as supported by the GHC>=6.0. It parametrizes by kind as those but offers overloading.

Other languages

The ML family of programming languages support generic programming through parametric polymorphism and generic modules called functors. Both Standard ML and OCaml provide functors, which are similar to class templates and to Ada's generic packages. Scheme syntactic abstractions also have a connection to genericity – these are in fact a superset of templating à la C++.

A Verilog module may take one or more parameters, to which their actual values are assigned upon the instantiation of the module. One example is a generic register array where the array width is given via a parameter. Such the array, combined with a generic wire vector, can make a generic buffer or memory module with an arbitrary bit width out of a single module implementation.^[29]

VHDL, being derived from Ada, also have generic ability.

C++ (programming language)

From Wikipedia, the free encyclopedia

C++
Major implementations

Paradigm	Multi-paradigm: procedural, functional, object-oriented, generic^[1]
Designed by	Bjarne Stroustrup
First appeared	1985; 33 years ago

Stable release	ISO/IEC 14882:2017 / 1 December 2017; 7 months ago
Typing discipline	Static, nominative, partially inferred
Implementation language	C++ or C
Filename extensions	.C .cc .cpp .cxx .c++ .h .hh .hpp .hxx .h++
Website	isocpp.org
LLVM Clang, GCC, Microsoft Visual C++, Embarcadero C++Builder, Intel C++ Compiler, IBM XL C++, EDG
Influenced by
Ada, ALGOL 68, C, CLU, ML, Simula
Influenced
Ada 95, C#,^[2] C99, Chapel,^[3] D, Java,^[4] Lua, Perl, PHP, Python,^[5] Rust, Nim^{[citation needed]}

C++ (/ˌsiːˌplʌsˈplʌs/ "see plus plus") is a general-purpose programming language. It has imperative, object-oriented and generic programming features, while also providing facilities for low-level memory manipulation.

It was designed with a bias toward system programming and embedded, resource-constrained and large systems, with performance, efficiency and flexibility of use as its design highlights.^[6] C++ has also been found useful in many other contexts, with key strengths being software infrastructure and resource-constrained applications,^[6] including desktop applications, servers (e.g. e-commerce, web search or SQL servers), and performance-critical applications (e.g. telephone switches or space probes).^[7] C++ is a compiled language, with implementations of it available on many platforms. Many vendors provide C++ compilers, including the Free Software Foundation, Microsoft, Intel, and IBM.

C++ is standardized by the International Organization for Standardization (ISO), with the latest standard version ratified and published by ISO in December 2017 as ISO/IEC 14882:2017 (informally known as C++17).^[8] The C++ programming language was initially standardized in 1998 as ISO/IEC 14882:1998, which was then amended by the C++03, C++11 and C++14 standards. The current C++17 standard supersedes these with new features and an enlarged standard library. Before the initial standardization in 1998, C++ was developed by Bjarne Stroustrup at Bell Labs since 1979, as an extension of the C language as he wanted an efficient and flexible language similar to C, which also provided high-level features for program organization. C++20 is the next planned standard thereafter.

Many other programming languages have been influenced by C++, including C#, D, Java, and newer versions of C.

History

Bjarne Stroustrup, the creator of C++

In 1979, Bjarne Stroustrup, a Danish computer scientist, began work on "C with Classes", the predecessor to C++.^[9] The motivation for creating a new language originated from Stroustrup's experience in programming for his Ph.D. thesis. Stroustrup found that Simula had features that were very helpful for large software development, but the language was too slow for practical use, while BCPL was fast but too low-level to be suitable for large software development. When Stroustrup started working in AT&T Bell Labs, he had the problem of analyzing the UNIX kernel with respect to distributed computing. Remembering his Ph.D. experience, Stroustrup set out to enhance the C language with Simula-like features.^[10] C was chosen because it was general-purpose, fast, portable and widely used. As well as C and Simula's influences, other languages also influenced C++, including ALGOL 68, Ada, CLU and ML.

Initially, Stroustrup's "C with Classes" added features to the C compiler, Cpre, including classes, derived classes, strong typing, inlining and default arguments.^[11]

In 1983, "C with Classes" was renamed to "C++" (++ being the increment operator in C), adding new features that included virtual functions, function name and operator overloading, references, constants, type-safe free-store memory allocation (new/delete), improved type checking, and BCPL style single-line comments with two forward slashes (//). Furthermore, it included the development of a standalone compiler for C++, Cfront.

In 1985, the first edition of The C++ Programming Language was released, which became the definitive reference for the language, as there was not yet an official standard.^[12] The first commercial implementation of C++ was released in October of the same year.^[9]

In 1989, C++ 2.0 was released, followed by the updated second edition of The C++ Programming Language in 1991.^[13] New features in 2.0 included multiple inheritance, abstract classes, static member functions, const member functions, and protected members. In 1990, The Annotated C++ Reference Manual was published. This work became the basis for the future standard. Later feature additions included templates, exceptions, namespaces, new casts, and a boolean type.

After the 2.0 update, C++ evolved relatively slowly until, in 2011, the C++11 standard was released, adding numerous new features, enlarging the standard library further, and providing more facilities to C++ programmers. After a minor C++14 update released in December 2014, various new additions were introduced in C++17, and further changes planned for 2020.^[14]

As of 2017, C++ remains the third most popular programming language, behind Java and C.^[15]^[16]

On January 3, 2018, Stroustrup was announced as the 2018 winner of the Charles Stark Draper Prize for Engineering, which comes with $500,000, "for conceptualizing and developing the C++ programming language."^[17]

Etymology

According to Stroustrup: "the name signifies the evolutionary nature of the changes from C".^[18] This name is credited to Rick Mascitti (mid-1983)^[11] and was first used in December 1983. When Mascitti was questioned informally in 1992 about the naming, he indicated that it was given in a tongue-in-cheek spirit. The name comes from C's ++ operator (which increments the value of a variable) and a common naming convention of using "+" to indicate an enhanced computer program.

During C++'s development period, the language had been referred to as "new C" and "C with Classes"^[11]^[19] before acquiring its final name.

Philosophy

Throughout C++'s life, its development and evolution has been guided by a set of principles:^[10]

It must be driven by actual problems and its features should be useful immediately in real world programs.
Every feature should be implementable (with a reasonably obvious way to do so).
Programmers should be free to pick their own programming style, and that style should be fully supported by C++.
Allowing a useful feature is more important than preventing every possible misuse of C++.
It should provide facilities for organising programs into well-defined separate parts, and provide facilities for combining separately developed parts.
No implicit violations of the type system (but allow explicit violations; that is, those explicitly requested by the programmer).
User-created types need to have the same support and performance as built-in types.
Unused features should not negatively impact created executables (e.g. in lower performance).
There should be no language beneath C++ (except assembly language).
C++ should work alongside other existing programming languages, rather than fostering its own separate and incompatible programming environment.
If the programmer's intent is unknown, allow the programmer to specify it by providing manual control.

Standardization

Year	C++ Standard	Informal name
1998	ISO/IEC 14882:1998^[20]	C++98
2003	ISO/IEC 14882:2003^[21]	C++03
2011	ISO/IEC 14882:2011^[22]	C++11, C++0x
2014	ISO/IEC 14882:2014^[23]	C++14, C++1y
2017	ISO/IEC 14882:2017^[8]	C++17, C++1z
2020	to be determined	C++20^[14]

C++ is standardized by an ISO working group known as JTC1/SC22/WG21. So far, it has published five revisions of the C++ standard and is currently working on the next revision, C++20.

In 1998, the ISO working group standardized C++ for the first time as ISO/IEC 14882:1998, which is informally known as C++98. In 2003, it published a new version of the C++ standard called ISO/IEC 14882:2003, which fixed problems identified in C++98.

The next major revision of the standard was informally referred to as "C++0x", but it was not released until 2011.^[24] C++11 (14882:2011) included many additions to both the core language and the standard library.^[22]

In 2014, C++14 (also known as C++1y) was released as a small extension to C++11, featuring mainly bug fixes and small improvements.^[25] The Draft International Standard ballot procedures completed in mid-August 2014.^[26]

After C++14, a major revision C++17, informally known as C++1z, was completed by the ISO C++ Committee in mid July 2017 and was approved and published in December 2017.^[27]

As part of the standardization process, ISO also publishes technical reports and specifications:

ISO/IEC TR 18015:2006^[28] on the use of C++ in embedded systems and on performance implications of C++ language and library features,
ISO/IEC TR 19768:2007^[29] (also known as the C++ Technical Report 1) on library extensions mostly integrated into C++11,
ISO/IEC TR 29124:2010^[30] on special mathematical functions,
ISO/IEC TR 24733:2011^[31] on decimal floating point arithmetic,
ISO/IEC TS 18822:2015^[32] on the standard filesystem library,
ISO/IEC TS 19570:2015^[33] on parallel versions of the standard library algorithms,
ISO/IEC TS 19841:2015^[34] on software transactional memory,
ISO/IEC TS 19568:2015^[35] on a new set of library extensions, some of which are already integrated into C++17,
ISO/IEC TS 19217:2015^[36] on the C++ Concepts

More technical specifications are in development and pending approval, including concurrency library extensions, a networking standard library, ranges, and modules.^[37]

Language

The C++ language has two main components: a direct mapping of hardware features provided primarily by the C subset, and zero-overhead abstractions based on those mappings. Stroustrup describes C++ as "a light-weight abstraction programming language [designed] for building and using efficient and elegant abstractions";^[6] and "offering both hardware access and abstraction is the basis of C++. Doing it efficiently is what distinguishes it from other languages".^[38]

C++ inherits most of C's syntax. The following is Bjarne Stroustrup's version of the Hello world program that uses the C++ Standard Library stream facility to write a message to standard output:^[39]^[40]

#include

int main()

    std::cout << "Hello, world!\n";

Object storage

As in C, C++ supports four types of memory management: static storage duration objects, thread storage duration objects, automatic storage duration objects, and dynamic storage duration objects.^[41]

Static storage duration objects

Static storage duration objects are created before main() is entered (see exceptions below) and destroyed in reverse order of creation after main() exits. The exact order of creation is not specified by the standard (though there are some rules defined below) to allow implementations some freedom in how to organize their implementation. More formally, objects of this type have a lifespan that "shall last for the duration of the program".^[42]

Static storage duration objects are initialized in two phases. First, "static initialization" is performed, and only after all static initialization is performed, "dynamic initialization" is performed. In static initialization, all objects are first initialized with zeros; after that, all objects that have a constant initialization phase are initialized with the constant expression (i.e. variables initialized with a literal or constexpr). Though it is not specified in the standard, the static initialization phase can be completed at compile time and saved in the data partition of the executable. Dynamic initialization involves all object initialization done via a constructor or function call (unless the function is marked with constexpr, in C++11). The dynamic initialization order is defined as the order of declaration within the compilation unit (i.e. the same file). No guarantees are provided about the order of initialization between compilation units.

Thread storage duration objects

Variables of this type are very similar to static storage duration objects. The main difference is the creation time is just prior to thread creation and destruction is done after the thread has been joined.^[43]

Automatic storage duration objects

The most common variable types in C++ are local variables inside a function or block, and temporary variables.^[44] The common feature about automatic variables is that they have a lifetime that is limited to the scope of the variable. They are created and potentially initialized at the point of declaration (see below for details) and destroyed in the reverse order of creation when the scope is left. This is implemented by allocation on the stack.

Local variables are created as the point of execution passes the declaration point. If the variable has a constructor or initializer this is used to define the initial state of the object. Local variables are destroyed when the local block or function that they are declared in is closed. C++ destructors for local variables are called at the end of the object lifetime, allowing a discipline for automatic resource management termed RAII, which is widely used in C++.

Member variables are created when the parent object is created. Array members are initialized from 0 to the last member of the array in order. Member variables are destroyed when the parent object is destroyed in the reverse order of creation. i.e. If the parent is an "automatic object" then it will be destroyed when it goes out of scope which triggers the destruction of all its members.

Temporary variables are created as the result of expression evaluation and are destroyed when the statement containing the expression has been fully evaluated (usually at the ; at the end of a statement).

Dynamic storage duration objects

These objects have a dynamic lifespan and are created with a call to new and destroyed explicitly with a call to delete.^[45]

Templates

C++ templates enable generic programming similar to generics in Java. C++ supports function, class, alias and variable templates. Templates may be parameterized by types, compile-time constants, and other templates. Templates are implemented by instantiation at compile-time. To instantiate a template, compilers substitute specific arguments for a template's parameters to generate a concrete function or class instance. Some substitutions are not possible; these are eliminated by an overload resolution policy described by the phrase "Substitution failure is not an error" (SFINAE). Templates are a powerful tool that can be used for generic programming, template metaprogramming, and code optimization, but this power implies a cost. Template use may increase code size, because each template instantiation produces a copy of the template code: one for each set of template arguments, however, this is the same or smaller amount of code that would be generated if the code was written by hand.^[46] This is in contrast to run-time generics seen in other languages (e.g., Java) where at compile-time the type is erased and a single template body is preserved.
Templates are different from macros: while both of these compile-time language features enable conditional compilation, templates are not restricted to lexical substitution. Templates are aware of the semantics and type system of their companion language, as well as all compile-time type definitions, and can perform high-level operations including programmatic flow control based on evaluation of strictly type-checked parameters. Macros are capable of conditional control over compilation based on predetermined criteria, but cannot instantiate new types, recurse, or perform type evaluation and in effect are limited to pre-compilation text-substitution and text-inclusion/exclusion. In other words, macros can control compilation flow based on pre-defined symbols but cannot, unlike templates, independently instantiate new symbols. Templates are a tool for static polymorphism (see below) and generic programming.

In addition, templates are a compile time mechanism in C++ that is Turing-complete, meaning that any computation expressible by a computer program can be computed, in some form, by a template metaprogram prior to runtime.

In summary, a template is a compile-time parameterized function or class written without knowledge of the specific arguments used to instantiate it. After instantiation, the resulting code is equivalent to code written specifically for the passed arguments. In this manner, templates provide a way to decouple generic, broadly applicable aspects of functions and classes (encoded in templates) from specific aspects (encoded in template parameters) without sacrificing performance due to abstraction.

Objects

C++ introduces object-oriented programming (OOP) features to C. It offers classes, which provide the four features commonly present in OOP (and some non-OOP) languages: abstraction, encapsulation, inheritance, and polymorphism. One distinguishing feature of C++ classes compared to classes in other programming languages is support for deterministic destructors, which in turn provide support for the Resource Acquisition is Initialization (RAII) concept.

Encapsulation

Encapsulation is the hiding of information to ensure that data structures and operators are used as intended and to make the usage model more obvious to the developer. C++ provides the ability to define classes and functions as its primary encapsulation mechanisms. Within a class, members can be declared as either public, protected, or private to explicitly enforce encapsulation. A public member of the class is accessible to any function. A private member is accessible only to functions that are members of that class and to functions and classes explicitly granted access permission by the class ("friends"). A protected member is accessible to members of classes that inherit from the class in addition to the class itself and any friends.

The OO principle is that all of the functions (and only the functions) that access the internal representation of a type should be encapsulated within the type definition. C++ supports this (via member functions and friend functions), but does not enforce it: the programmer can declare parts or all of the representation of a type to be public, and is allowed to make public entities that are not part of the representation of the type. Therefore, C++ supports not just OO programming, but other decomposition paradigms, like modular programming.

It is generally considered good practice to make all data private or protected, and to make public only those functions that are part of a minimal interface for users of the class. This can hide the details of data implementation, allowing the designer to later fundamentally change the implementation without changing the interface in any way.^[47]^[48]

Inheritance

Inheritance allows one data type to acquire properties of other data types. Inheritance from a base class may be declared as public, protected, or private. This access specifier determines whether unrelated and derived classes can access the inherited public and protected members of the base class. Only public inheritance corresponds to what is usually meant by "inheritance". The other two forms are much less frequently used. If the access specifier is omitted, a "class" inherits privately, while a "struct" inherits publicly. Base classes may be declared as virtual; this is called virtual inheritance. Virtual inheritance ensures that only one instance of a base class exists in the inheritance graph, avoiding some of the ambiguity problems of multiple inheritance.

Multiple inheritance is a C++ feature not found in most other languages, allowing a class to be derived from more than one base class; this allows for more elaborate inheritance relationships. For example, a "Flying Cat" class can inherit from both "Cat" and "Flying Mammal". Some other languages, such as C# or Java, accomplish something similar (although more limited) by allowing inheritance of multiple interfaces while restricting the number of base classes to one (interfaces, unlike classes, provide only declarations of member functions, no implementation or member data). An interface as in C# and Java can be defined in C++ as a class containing only pure virtual functions, often known as an abstract base class or "ABC". The member functions of such an abstract base class are normally explicitly defined in the derived class, not inherited implicitly. C++ virtual inheritance exhibits an ambiguity resolution feature called dominance.

Operators and operator overloading

C++ provides more than 35 operators, covering basic arithmetic, bit manipulation, indirection, comparisons, logical operations and others. Almost all operators can be overloaded for user-defined types, with a few notable exceptions such as member access (. and .*) as well as the conditional operator. The rich set of overloadable operators is central to making user-defined types in C++ seem like built-in types.
Overloadable operators are also an essential part of many advanced C++ programming techniques, such as smart pointers. Overloading an operator does not change the precedence of calculations involving the operator, nor does it change the number of operands that the operator uses (any operand may however be ignored by the operator, though it will be evaluated prior to execution). Overloaded "&&" and "||" operators lose their short-circuit evaluation property.

Polymorphism

Polymorphism enables one common interface for many implementations, and for objects to act differently under different circumstances.
C++ supports several kinds of static (resolved at compile-time) and dynamic (resolved at run-time) polymorphisms, supported by the language features described above. Compile-time polymorphism does not allow for certain run-time decisions, while runtime polymorphism typically incurs a performance penalty.

Static polymorphism

Function overloading allows programs to declare multiple functions having the same name but with different arguments (i.e. ad hoc polymorphism). The functions are distinguished by the number or types of their formal parameters. Thus, the same function name can refer to different functions depending on the context in which it is used. The type returned by the function is not used to distinguish overloaded functions and would result in a compile-time error message.
When declaring a function, a programmer can specify for one or more parameters a default value. Doing so allows the parameters with defaults to optionally be omitted when the function is called, in which case the default arguments will be used. When a function is called with fewer arguments than there are declared parameters, explicit arguments are matched to parameters in left-to-right order, with any unmatched parameters at the end of the parameter list being assigned their default arguments. In many cases, specifying default arguments in a single function declaration is preferable to providing overloaded function definitions with different numbers of parameters.

Templates in C++ provide a sophisticated mechanism for writing generic, polymorphic code (i.e. parametric polymorphism). In particular, through the curiously recurring template pattern, it's possible to implement a form of static polymorphism that closely mimics the syntax for overriding virtual functions. Because C++ templates are type-aware and Turing-complete, they can also be used to let the compiler resolve recursive conditionals and generate substantial programs through template metaprogramming. Contrary to some opinion, template code will not generate a bulk code after compilation with the proper compiler settings.^[46]

Dynamic polymorphism

Inheritance

Variable pointers and references to a base class type in C++ can also refer to objects of any derived classes of that type. This allows arrays and other kinds of containers to hold pointers to objects of differing types (references cannot be directly held in containers). This enables dynamic (run-time) polymorphism, where the referred objects can behave differently depending on their (actual, derived) types.
C++ also provides the dynamic_cast operator, which allows code to safely attempt conversion of an object, via a base reference/pointer, to a more derived type: downcasting. The attempt is necessary as often one does not know which derived type is referenced. (Upcasting, conversion to a more general type, can always be checked/performed at compile-time via static_cast, as ancestral classes are specified in the derived class's interface, visible to all callers.) dynamic_cast relies on run-time type information (RTTI), metadata in the program that enables differentiating types and their relationships. If a dynamic_cast to a pointer fails, the result is the nullptr constant, whereas if the destination is a reference (which cannot be null), the cast throws an exception. Objects known to be of a certain derived type can be cast to that with static_cast, bypassing RTTI and the safe runtime type-checking of dynamic_cast, so this should be used only if the programmer is very confident the cast is, and will always be, valid.

Virtual member functions

Ordinarily, when a function in a derived class overrides a function in a base class, the function to call is determined by the type of the object. A given function is overridden when there exists no difference in the number or type of parameters between two or more definitions of that function. Hence, at compile time, it may not be possible to determine the type of the object and therefore the correct function to call, given only a base class pointer; the decision is therefore put off until runtime. This is called dynamic dispatch. Virtual member functions or methods^[49] allow the most specific implementation of the function to be called, according to the actual run-time type of the object. In C++ implementations, this is commonly done using virtual function tables. If the object type is known, this may be bypassed by prepending a fully qualified class name before the function call, but in general calls to virtual functions are resolved at run time.

In addition to standard member functions, operator overloads and destructors can be virtual. As a rule of thumb, if any function in the class is virtual, the destructor should be as well. As the type of an object at its creation is known at compile time, constructors, and by extension copy constructors, cannot be virtual. Nonetheless a situation may arise where a copy of an object needs to be created when a pointer to a derived object is passed as a pointer to a base object. In such a case, a common solution is to create a clone() (or similar) virtual function that creates and returns a copy of the derived class when called.

A member function can also be made "pure virtual" by appending it with = 0 after the closing parenthesis and before the semicolon. A class containing a pure virtual function is called an abstract class. Objects cannot be created from an abstract class; they can only be derived from. Any derived class inherits the virtual function as pure and must provide a non-pure definition of it (and all other pure virtual functions) before objects of the derived class can be created. A program that attempts to create an object of a class with a pure virtual member function or inherited pure virtual member function is ill-formed.

Lambda expressions

C++ provides support for anonymous functions, which are also known as lambda expressions and have the following form:

[capture](parameters) -> return_type { function_body }

The [capture] list supports the definition of closures. Such lambda expressions are defined in the standard as syntactic sugar for an unnamed function object. An example lambda function may be defined as follows:

[](int x, int y) -> int { return x + y; }

Exception handling

Exception handling is used to communicate the existence of a runtime problem or error from where it was detected to where the issue can be handled.^[50] It permits this to be done in a uniform manner and separately from the main code, while detecting all errors.^[51] Should an error occur, an exception is thrown (raised), which is then caught by the nearest suitable exception handler. The exception causes the current scope to be exited, and also each outer scope (propagation) until a suitable handler is found, calling in turn the destructors of any objects in these exited scopes.^[52] At the same time, an exception is presented as an object carrying the data about the detected problem.^[53]

Note that many C++ "styles", like Google's^[54], forbid usage of exceptions in C++ programs, restricting the language thusly.

The exception-causing code is placed inside a try block. The exceptions are handled in separate catch blocks (the handlers); each try block can have multiple exception handlers, as it is visible in the example below.^[55]

#include

#include

#include

int main()

try

        std::vector<int> vec{3, 4, 3, 1};

        int i{vec.at(4)}; // Throws an exception, std::out_of_range (indexing for vec is from 0-3 not 1-4)

    // An exception handler, catches std::out_of_range, which is thrown by vec.at(4)

    catch (std::out_of_range &e)

        std::cerr << "Accessing a non-existent element: " << e.what() << '\n';

    // To catch any other standard library exceptions (they derive from std::exception)

    catch (std::exception &e)

        std::cerr << "Exception thrown: " << e.what() << '\n';

    // Catch any unrecognised exceptions (i.e. those which don't derive from std::exception)

    catch (...)

        std::cerr << "Some fatal error\n";

It is also possible to raise exceptions purposefully, using the throw keyword; these exceptions are handled in the usual way. In some cases, exceptions cannot be used due to technical reasons. One such example is a critical component of an embedded system, where every operation must be guaranteed to complete within a specified amount of time. This cannot be determined with exceptions as no tools exist to determine the maximum time required for an exception to be handled.^[56]

Standard library

The C++ standard consists of two parts: the core language and the standard library. C++ programmers expect the latter on every major implementation of C++; it includes aggregate types (vectors, lists, maps, sets, queues, stacks, arrays, tuples), algorithms (find, for_each, binary_search, random_shuffle, etc.), input/output facilities (iostream, for reading from and writing to the console and files), filesystem library, localisation support, smart pointers for automatic memory management, regular expression support, multi-threading library, atomics support (allowing a variable to be read or written to by at most one thread at a time without any external synchronisation), time utilities (measurement, getting current time, etc.), a system for converting error reporting that doesn't use C++ exceptions into C++ exceptions, a random number generator and a slightly modified version of the C standard library (to make it comply with the C++ type system).
A large part of the C++ library is based on the Standard Template Library (STL). Useful tools provided by the STL include containers as the collections of objects (such as vectors and lists), iterators that provide array-like access to containers, and algorithms that perform operations such as searching and sorting.

Furthermore, (multi)maps (associative arrays) and (multi)sets are provided, all of which export compatible interfaces. Therefore, using templates it is possible to write generic algorithms that work with any container or on any sequence defined by iterators. As in C, the features of the library are accessed by using the #include directive to include a standard header. C++ provides 105 standard headers, of which 27 are deprecated.

The standard incorporates the STL that was originally designed by Alexander Stepanov, who experimented with generic algorithms and containers for many years. When he started with C++, he finally found a language where it was possible to create generic algorithms (e.g., STL sort) that perform even better than, for example, the C standard library qsort, thanks to C++ features like using inlining and compile-time binding instead of function pointers. The standard does not refer to it as "STL", as it is merely a part of the standard library, but the term is still widely used to distinguish it from the rest of the standard library (input/output streams, internationalization, diagnostics, the C library subset, etc.).^[57]

Most C++ compilers, and all major ones, provide a standards-conforming implementation of the C++ standard library.

Compatibility

To give compiler vendors greater freedom, the C++ standards committee decided not to dictate the implementation of name mangling, exception handling, and other implementation-specific features. The downside of this decision is that object code produced by different compilers is expected to be incompatible. There were, however, attempts to standardize compilers for particular machines or operating systems (for example C++ ABI),^[58] though they seem to be largely abandoned now.

With C

C++ is often considered to be a superset of C, but this is not strictly true.^[59] Most C code can easily be made to compile correctly in C++, but there are a few differences that cause some valid C code to be invalid or behave differently in C++. For example, C allows implicit conversion from void* to other pointer types, but C++ does not (for type safety reasons). Also, C++ defines many new keywords, such as new and class, which may be used as identifiers (for example, variable names) in a C program.
Some incompatibilities have been removed by the 1999 revision of the C standard (C99), which now supports C++ features such as line comments (//), and declarations mixed with code. On the other hand, C99 introduced a number of new features that C++ did not support, were incompatible or redundant in C++, such as variable-length arrays, native complex-number types (however, the std::complex class in the C++ standard library provides similar functionality, although not code-compatible), designated initializers, compound literals, and the restrict keyword.^[60] Some of the C99-introduced features were included in the subsequent version of the C++ standard, C++11 (out of those which were not redundant).^[61]^[62]^[63] However, the C++11 standard introduces new incompatibilities, such as disallowing assignment of a string literal to a character pointer, which remains valid C.

To intermix C and C++ code, any function declaration or definition that is to be called from/used both in C and C++ must be declared with C linkage by placing it within an extern "C" {/*...*/} block. Such a function may not rely on features depending on name mangling (i.e., function overloading).

Criticism

Despite its widespread adoption, notable programmers have criticized the C++ language, including Linus Torvalds,^[64] Richard Stallman,^[65] Joshua Bloch, Ken Thompson^[66]^[67]^[68], and Donald Knuth^[69]^[70].
One of the most often criticised points of C++ is its enormous complexity as a language, the large number of non-orthogonal features. This in practice necessitates restricting code to subset of C++, thus eschewing the readability benefits of common style and idioms. As expressed by Joshua Bloch:

I think C++ was pushed well beyond its complexity threshold and yet there are a lot of people programming it. But what you do is you force people to subset it. So almost every shop that I know of that uses C++ says, “Yes, we’re using C++ but we’re not doing multiple-implementation inheritance and we’re not using operator overloading.” There are just a bunch of features that you’re not going to use because the complexity of the resulting code is too high. And I don’t think it’s good when you have to start doing that. You lose this programmer portability where everyone can read everyone else’s code, which I think is such a good thing.

Ken Thompson, whose colleague Stroustrup was at Bell Labs, comments^[67]^[68] on how that came to be:

It certainly has its good points. But by and large I think it’s a bad language. It does a lot of things half well and it’s just a garbage heap of ideas that are mutually exclusive. Everybody I know, whether it’s personal or corporate, selects a subset and these subsets are different. So it’s not a good language to transport an algorithm—to say, “I wrote it; here, take it.” It’s way too big, way too complex. And it’s obviously built by a committee. Stroustrup campaigned for years and years and years, way beyond any sort of technical contributions he made to the language, to get it adopted and used. And he sort of ran all the standards committees with a whip and a chair. And he said “no” to no one. He put every feature in that language that ever existed. It wasn’t cleanly designed—it was just the union of everything that came along. And I think it suffered drastically from that.

Donald Knuth, who said of Edsger Dijkstra that "to think of programming in C++" "would make him physically ill"^[69], corroborates^[70] Thompson:

The problem that I have with them today is that... C++ is too complicated. At the moment, it's impossible for me to write portable code that I believe would work on lots of different systems, unless I avoid all exotic features. Whenever the C++ language designers had two competing ideas as to how they should solve some problem, they said "OK, we'll do them both". So the language is too baroque for my taste.

Even Stroustrup himself admits: "Within C++, there is a much smaller and cleaner language struggling to get out"^[71]

Other complaints may include a lack of reflection or garbage collection, slow compilation times, perceived feature creep,^[72] and verbose error messages, particularly from template metaprogramming.^[73]

Search This Blog

Friday, July 6, 2018

Generic programming

Stepanov–Musser and other generic programming paradigms

Programming language support for genericity

In object-oriented languages

Generics in Ada

Example

Advantages and limitations

Templates in C++

Technical overview

Template specialization

Advantages and disadvantages

Templates in D

Code generation

Genericity in Eiffel

Basic/Unconstrained genericity

Constrained genericity

Generics in Java

Genericity in .NET [C#, VB.NET]

Genericity in Delphi

Genericity in Free Pascal

Functional languages

Genericity in Haskell

PolyP

Generic Haskell

Clean

Other languages

C++ (programming language)

History

Etymology

Philosophy

Standardization

Language

Object storage

Static storage duration objects

Thread storage duration objects

Automatic storage duration objects

Dynamic storage duration objects

Templates

Objects

Encapsulation

Inheritance

Operators and operator overloading

Polymorphism

Static polymorphism

Dynamic polymorphism

Inheritance

Virtual member functions

Lambda expressions

Exception handling

Standard library

Compatibility

With C

Criticism

Emic and etic