
Thursday, May 28, 2020

Object-oriented programming

From Wikipedia, the free encyclopedia

Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data, in the form of fields (often known as attributes or properties), and code, in the form of procedures (often known as methods). A feature of objects is that an object's own procedures can access and often modify its data fields (objects have a notion of "this" or "self"). In OOP, computer programs are designed by making them out of objects that interact with one another. OOP languages are diverse, but the most popular ones are class-based, meaning that objects are instances of classes, which also determine their types.

Many of the most widely used programming languages (such as C++, Java, Python, etc.) are multi-paradigm and they support object-oriented programming to a greater or lesser degree, typically in combination with imperative, procedural programming. Significant object-oriented languages include Java, C++, C#, Python, R, PHP, JavaScript, Ruby, Perl, Object Pascal, Objective-C, Dart, Swift, Scala, Kotlin, Common Lisp, MATLAB, and Smalltalk.

Features

Object-oriented programming uses objects, but not all of the associated techniques and structures are supported directly in languages that claim to support OOP. The features listed below are common among languages considered to be strongly class- and object-oriented (or multi-paradigm with OOP support), with notable exceptions mentioned.

Shared with non-OOP predecessor languages

Modular programming support provides the ability to group procedures into files and modules for organizational purposes. Modules are namespaced so identifiers in one module will not conflict with a procedure or variable sharing the same name in another file or module.

Objects and classes

Languages that support object-oriented programming (OOP) typically use inheritance for code reuse and extensibility in the form of either classes or prototypes. Those that use classes support two main concepts:
  • Classes – the definitions for the data format and available procedures for a given type or class of object; may also contain data and procedures (known as class methods) themselves, i.e. classes contain the data members and member functions
  • Objects – instances of classes
Objects sometimes correspond to things found in the real world. For example, a graphics program may have objects such as "circle", "square", "menu". An online shopping system might have objects such as "shopping cart", "customer", and "product". Sometimes objects represent more abstract entities, like an object that represents an open file, or an object that provides the service of translating measurements from U.S. customary to metric.
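For example, a minimal sketch in TypeScript of the shopping-system objects mentioned above (the Product and ShoppingCart names and members are illustrative only):

    // Class: the definition of the data format and procedures for a kind of object.
    class Product {
      constructor(public name: string, public price: number) {}
    }

    class ShoppingCart {
      private items: Product[] = [];
      add(product: Product): void { this.items.push(product); }
      total(): number { return this.items.reduce((sum, p) => sum + p.price, 0); }
    }

    // Objects: individual instances of those classes.
    const cart = new ShoppingCart();
    cart.add(new Product("Coffee", 4.5));
    cart.add(new Product("Mug", 9.0));
    console.log(cart.total()); // 13.5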

Object-oriented programming is more than just classes and objects; it's a whole programming paradigm based around [sic] objects (data structures) that contain data fields and methods. It is essential to understand this; using classes to organize a bunch of unrelated methods together is not object orientation.
Junade Ali, Mastering PHP Design Patterns

Each object is said to be an instance of a particular class (for example, an object with its name field set to "Mary" might be an instance of class Employee). Procedures in object-oriented programming are known as methods; variables are also known as fields, members, attributes, or properties. This leads to the following terms:
  • Class variables – belong to the class as a whole; there is only one copy of each one
  • Instance variables or attributes – data that belongs to individual objects; every object has its own copy of each one
  • Member variables – refers to both the class and instance variables that are defined by a particular class
  • Class methods – belong to the class as a whole and have access only to class variables and inputs from the procedure call
  • Instance methods – belong to individual objects, and have access to instance variables for the specific object they are called on, inputs, and class variables
Objects are accessed somewhat like variables with complex internal structure, and in many languages are effectively pointers, serving as actual references to a single instance of said object in memory within a heap or stack. They provide a layer of abstraction which can be used to separate internal from external code. External code can use an object by calling a specific instance method with a certain set of input parameters, read an instance variable, or write to an instance variable. Objects are created by calling a special type of method in the class known as a constructor. A program may create many instances of the same class as it runs, which operate independently. This is an easy way for the same procedures to be used on different sets of data.
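The terms above can be illustrated with a small TypeScript sketch (the Employee class and all of its members are invented for illustration):

    class Employee {
      // Class variable: one copy, belonging to the class as a whole.
      static companyName = "Acme";

      // Instance variables: every object gets its own copy.
      constructor(public name: string, private salary: number) {}

      // Class method: has access only to class variables and its inputs.
      static describeCompany(): string {
        return `Employees of ${Employee.companyName}`;
      }

      // Instance method: has access to the instance variables of the object
      // it is called on, its inputs, and class variables.
      raise(amount: number): void {
        this.salary += amount;
      }
    }

    // The constructor creates independent instances of the same class.
    const mary = new Employee("Mary", 40000);
    const john = new Employee("John", 38000);
    mary.raise(1000);                        // changes only mary's copy of salary
    console.log(Employee.describeCompany()); // "Employees of Acme"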

Object-oriented programming that uses classes is sometimes called class-based programming, while prototype-based programming does not typically use classes. As a result, a significantly different yet analogous terminology is used to define the concepts of object and instance.

In some languages classes and objects can be composed using other concepts like traits and mixins.

Class-based vs prototype-based

In class-based languages the classes are defined beforehand and the objects are instantiated based on the classes. If two objects apple and orange are instantiated from the class Fruit, they are inherently fruits and it is guaranteed that you may handle them in the same way; e.g. a programmer can expect the existence of the same attributes such as color or sugar_content or is_ripe.

In prototype-based languages the objects are the primary entities. No classes even exist. The prototype of an object is just another object to which the object is linked. Every object has one prototype link (and only one). New objects can be created based on already existing objects chosen as their prototype. You may call two different objects apple and orange a fruit, if the object fruit exists, and both apple and orange have fruit as their prototype. The idea of the fruit class doesn't exist explicitly, but as the equivalence class of the objects sharing the same prototype. The attributes and methods of the prototype are delegated to all the objects of the equivalence class defined by this prototype. The attributes and methods owned individually by the object may not be shared by other objects of the same equivalence class; e.g. the attribute sugar_content may be unexpectedly not present in apple. Only single inheritance can be implemented through the prototype.
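A brief TypeScript sketch of the prototype approach, reusing the fruit, apple and orange example above (TypeScript compiles to JavaScript, whose object model is itself prototype-based; the attribute names are illustrative):

    // The prototype is just another object; there is no class.
    const fruit = {
      sugar_content: 10,
      describe(): string {
        return `sugar content: ${this.sugar_content}`;
      },
    };

    const apple = Object.create(fruit);   // apple's prototype link points at fruit
    apple.sugar_content = 15;             // an attribute owned by apple shadows the prototype's
    const orange = Object.create(fruit);  // orange delegates sugar_content to fruit

    console.log(apple.describe());        // "sugar content: 15"
    console.log(orange.describe());       // "sugar content: 10" (delegated to the prototype)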

Dynamic dispatch/message passing

It is the responsibility of the object, not any external code, to select the procedural code to execute in response to a method call, typically by looking up the method at run time in a table associated with the object. This feature is known as dynamic dispatch, and distinguishes an object from an abstract data type (or module), which has a fixed (static) implementation of the operations for all instances. If the call variability relies on more than the single type of the object on which it is called (i.e. at least one other parameter object is involved in the method choice), one speaks of multiple dispatch.

A method call is also known as message passing. It is conceptualized as a message (the name of the method and its input parameters) being passed to the object for dispatch.
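The run-time lookup can be made explicit with a small TypeScript sketch; the send function, Obj interface and makeCounter factory are invented for illustration, spelling out by hand what class-based languages do implicitly:

    type Method = (...args: unknown[]) => unknown;

    // Conceptually, each object carries a table mapping message names to code.
    interface Obj {
      methods: Record<string, Method>;
    }

    // "Message passing": the receiver, not the caller, selects the code to run.
    function send(receiver: Obj, message: string, ...args: unknown[]): unknown {
      const method = receiver.methods[message]; // looked up at run time
      if (!method) throw new Error(`object does not understand ${message}`);
      return method(...args);
    }

    // A counter object whose state is captured in its methods' closure.
    function makeCounter(): Obj {
      let count = 0;
      return {
        methods: {
          increment: () => ++count,
          value: () => count,
        },
      };
    }

    const counter = makeCounter();
    send(counter, "increment");
    console.log(send(counter, "value")); // 1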

Encapsulation

Encapsulation is an object-oriented programming concept that binds together the data and functions that manipulate the data, and that keeps both safe from outside interference and misuse. Data encapsulation led to the important OOP concept of data hiding.

If a class does not allow calling code to access internal object data and permits access through methods only, this is a strong form of abstraction or information hiding known as encapsulation. Some languages (Java, for example) let classes enforce access restrictions explicitly, for example denoting internal data with the private keyword and designating methods intended for use by code outside the class with the public keyword. Methods may also be designated public, private, or intermediate levels such as protected (which allows access from the same class and its subclasses, but not objects of a different class). In other languages (like Python) this is enforced only by convention (for example, private methods may have names that start with an underscore). Encapsulation prevents external code from being concerned with the internal workings of an object. This facilitates code refactoring, for example allowing the author of the class to change how objects of that class represent their data internally without changing any external code (as long as "public" method calls work the same way). It also encourages programmers to put all the code that is concerned with a certain set of data in the same class, which organizes it for easy comprehension by other programmers. Encapsulation is a technique that encourages decoupling.
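A minimal TypeScript sketch of encapsulation; TypeScript's private and public modifiers behave like the Java-style keywords described above, and the Account class and its members are illustrative:

    class Account {
      private balance = 0;           // internal data, not reachable from outside the class

      deposit(amount: number): void {
        if (amount <= 0) throw new Error("deposit must be positive");
        this.balance += amount;      // only the class's own methods touch the field
      }

      getBalance(): number {         // controlled, read-only access for external code
        return this.balance;
      }
    }

    const account = new Account();
    account.deposit(100);
    // account.balance = -1;         // rejected at compile time: 'balance' is private
    console.log(account.getBalance()); // 100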

Composition, inheritance, and delegation

Objects can contain other objects in their instance variables; this is known as object composition. For example, an object in the Employee class might contain (either directly or through a pointer) an object in the Address class, in addition to its own instance variables like "first_name" and "position". Object composition is used to represent "has-a" relationships: every employee has an address, so every Employee object has access to a place to store an Address object (either directly embedded within itself, or at a separate location addressed via a pointer). 

Languages that support classes almost always support inheritance. This allows classes to be arranged in a hierarchy that represents "is-a-type-of" relationships. For example, class Employee might inherit from class Person. All the data and methods available to the parent class also appear in the child class with the same names. For example, class Person might define variables "first_name" and "last_name" with method "make_full_name()". These will also be available in class Employee, which might add the variables "position" and "salary". This technique allows easy re-use of the same procedures and data definitions, in addition to potentially mirroring real-world relationships in an intuitive way. Rather than utilizing database tables and programming subroutines, the developer utilizes objects the user may be more familiar with: objects from their application domain.
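A short TypeScript sketch of these "is-a" (inheritance) and "has-a" (composition) relationships, using the Person, Employee and Address names from the examples above; the constructor arguments are illustrative:

    class Address {
      constructor(public street: string, public city: string) {}
    }

    class Person {
      constructor(public first_name: string, public last_name: string) {}
      make_full_name(): string {
        return `${this.first_name} ${this.last_name}`;
      }
    }

    // Employee "is a" Person (inheritance) and "has an" Address (composition).
    class Employee extends Person {
      constructor(
        first_name: string,
        last_name: string,
        public position: string,
        public salary: number,
        public address: Address,     // contained object
      ) {
        super(first_name, last_name);
      }
    }

    const e = new Employee("Mary", "Poppins", "Nanny", 42000,
                           new Address("17 Cherry Tree Lane", "London"));
    console.log(e.make_full_name()); // "Mary Poppins", inherited from Person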

Subclasses can override the methods defined by superclasses. Multiple inheritance is allowed in some languages, though this can make resolving overrides complicated. Some languages have special support for mixins, though in any language with multiple inheritance, a mixin is simply a class that does not represent an is-a-type-of relationship. Mixins are typically used to add the same methods to multiple classes. For example, class UnicodeConversionMixin might provide a method unicode_to_ascii() when included in class FileReader and class WebPageScraper, which don't share a common parent.
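The UnicodeConversionMixin example can be sketched with the common TypeScript mixin idiom, a function that takes a base class and returns an extended class; the conversion logic shown is a deliberate simplification:

    type Constructor<T = {}> = new (...args: any[]) => T;

    function UnicodeConversionMixin<TBase extends Constructor>(Base: TBase) {
      return class extends Base {
        unicode_to_ascii(s: string): string {
          // Simplified: decompose accented characters, then drop non-ASCII code points.
          return s.normalize("NFKD").replace(/[^\x00-\x7F]/g, "");
        }
      };
    }

    class FileReader {}
    class WebPageScraper {}

    // Both classes gain the same method without sharing a common parent.
    class AsciiFileReader extends UnicodeConversionMixin(FileReader) {}
    class AsciiWebPageScraper extends UnicodeConversionMixin(WebPageScraper) {}

    console.log(new AsciiFileReader().unicode_to_ascii("héllo")); // "hello"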

Abstract classes cannot be instantiated into objects; they exist only for the purpose of inheritance into other "concrete" classes which can be instantiated. In Java, the final keyword can be used to prevent a class from being subclassed.

The doctrine of composition over inheritance advocates implementing has-a relationships using composition instead of inheritance. For example, instead of inheriting from class Person, class Employee could give each Employee object an internal Person object, which it then has the opportunity to hide from external code even if class Person has many public attributes or methods. Some languages, like Go, do not support inheritance at all.
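The same Employee example can be reworked to follow composition over inheritance; the display_name method and the decision to hide the wrapped Person are illustrative:

    class Person {
      constructor(public first_name: string, public last_name: string) {}
      make_full_name(): string {
        return `${this.first_name} ${this.last_name}`;
      }
    }

    // Instead of "class Employee extends Person", Employee holds a Person internally.
    class Employee {
      private person: Person;        // hidden from external code
      constructor(first_name: string, last_name: string, public position: string) {
        this.person = new Person(first_name, last_name);
      }
      display_name(): string {       // forwards to the wrapped object
        return this.person.make_full_name();
      }
    }

    console.log(new Employee("Mary", "Poppins", "Nanny").display_name());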

The "open/closed principle" advocates that classes and functions "should be open for extension, but closed for modification".

Delegation is another language feature that can be used as an alternative to inheritance.

Polymorphism

Subtyping – a form of polymorphism – is when calling code can be agnostic as to which class in the supported hierarchy it is operating on – the parent class or one of its descendants. Meanwhile, the same operation name among objects in an inheritance hierarchy may behave differently.

For example, objects of type Circle and Square are derived from a common class called Shape. The Draw function for each type of Shape implements what is necessary to draw itself while calling code can remain indifferent to the particular type of Shape being drawn.

This is another type of abstraction which simplifies code external to the class hierarchy and enables strong separation of concerns.
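The Shape example can be sketched in TypeScript; Shape is declared abstract here, which also illustrates the abstract classes mentioned below, and the draw behaviour is reduced to returning a string:

    abstract class Shape {
      abstract draw(): string;       // each concrete Shape knows how to draw itself
    }

    class Circle extends Shape {
      constructor(private radius: number) { super(); }
      draw(): string { return `circle of radius ${this.radius}`; }
    }

    class Square extends Shape {
      constructor(private side: number) { super(); }
      draw(): string { return `square with side ${this.side}`; }
    }

    // Calling code stays agnostic about which concrete Shape it receives.
    function render(shapes: Shape[]): void {
      shapes.forEach(s => console.log(s.draw()));
    }

    render([new Circle(1), new Square(2)]);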

Open recursion

In languages that support open recursion, object methods can call other methods on the same object (including themselves), typically using a special variable or keyword called this or self. This variable is late-bound; it allows a method defined in one class to invoke another method that is defined later, in some subclass thereof.
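A small TypeScript sketch of open recursion; the List and CountingList classes are invented for illustration:

    class List {
      items: string[] = [];
      add(item: string): void {
        this.items.push(item);
      }
      // addAll calls this.add, which is late-bound: if a subclass overrides add,
      // the overriding version is invoked, even from code defined in this class.
      addAll(items: string[]): void {
        items.forEach(i => this.add(i));
      }
    }

    class CountingList extends List {
      count = 0;
      add(item: string): void {   // defined "later, in some subclass thereof"
        this.count++;
        super.add(item);
      }
    }

    const list = new CountingList();
    list.addAll(["a", "b", "c"]);
    console.log(list.count); // 3: List.addAll dispatched to CountingList.add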

History

UML notation for a class. This Button class has variables for data and functions. Through inheritance, a subclass can be created as a subset of the Button class. Objects are instances of a class.

Terminology invoking "objects" and "oriented" in the modern sense of object-oriented programming made its first appearance at MIT in the late 1950s and early 1960s. In the environment of the artificial intelligence group, as early as 1960, "object" could refer to identified items (LISP atoms) with properties (attributes); Alan Kay was later to cite a detailed understanding of LISP internals as a strong influence on his thinking in 1966:
 
I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning – it took a while to see how to do messaging in a programming language efficiently enough to be useful).
Alan Kay
 
Another early MIT example was Sketchpad created by Ivan Sutherland in 1960–61; in the glossary of the 1963 technical report based on his dissertation about Sketchpad, Sutherland defined notions of "object" and "instance" (with the class concept covered by "master" or "definition"), albeit specialized to graphical interaction. Also, an MIT ALGOL version, AED-0, established a direct link between data structures ("plexes", in that dialect) and procedures, prefiguring what were later termed "messages", "methods", and "member functions".

In the 1960s, object-oriented programming was put into practice with the Simula language, which introduced important concepts that are today an essential part of object-oriented programming, such as class and object, inheritance, and dynamic binding. Simula was also designed to take account of programming and data security. For programming security purposes, a detection process was implemented so that, through reference counts, a last-resort garbage collector deleted unused objects in random-access memory (RAM). But although the idea of data objects had already been established by 1965, data encapsulation through levels of scope for variables, such as private (-) and public (+), was not implemented in Simula because it would have required the accessing procedures to be hidden as well.

In 1962, Kristen Nygaard initiated a project for a simulation language at the Norwegian Computing Center, based on his previous use of the Monte Carlo simulation and his work to conceptualise real-world systems. Ole-Johan Dahl formally joined the project and the Simula programming language was designed to run on the Universal Automatic Computer (UNIVAC) 1107. In the early stages Simula was supposed to be a procedure package for the programming language ALGOL 60. Dissatisfied with the restrictions imposed by ALGOL, the researchers decided to develop Simula into a fully-fledged programming language, which used the UNIVAC ALGOL 60 compiler. Simula launched in 1964, and was promoted by Dahl and Nygaard throughout 1965 and 1966, leading to increasing use of the programming language in Sweden, Germany and the Soviet Union. In 1968, the language became widely available through the Burroughs B5500 computers, and was later also implemented on the URAL-16 computer. In 1966, Dahl and Nygaard wrote a Simula compiler. They became preoccupied with putting into practice Tony Hoare's record class concept, which had been implemented in the free-form, English-like general-purpose simulation language SIMSCRIPT. They settled for a generalised process concept with record class properties, and a second layer of prefixes. Through prefixing a process could reference its predecessor and have additional properties. Simula thus introduced the class and subclass hierarchy, and the possibility of generating objects from these classes. The Simula 1 compiler and a new version of the programming language, Simula 67, were introduced to the wider world through the research paper "Class and Subclass Declarations" at a 1967 conference.

A Simula 67 compiler was launched for the System/360 and System/370 IBM mainframe computers in 1972. In the same year a Simula 67 compiler was launched free of charge for the French CII 10070 and CII Iris 80 mainframe computers. By 1974, the Association of Simula Users had members in 23 different countries. In early 1975, a Simula 67 compiler was released free of charge for the DECsystem-10 mainframe family. By August of the same year the DECsystem-10 Simula 67 compiler had been installed at 28 sites, 22 of them in North America. The object-oriented Simula programming language was used mainly by researchers involved with physical modelling, such as models to study and improve the movement of ships and their content through cargo ports.

In the 1970s, the first version of the Smalltalk programming language was developed at Xerox PARC by Alan Kay, Dan Ingalls and Adele Goldberg. Smalltalk-72 included a programming environment and was dynamically typed, and at first was interpreted, not compiled. Smalltalk became noted for its application of object orientation at the language level and its graphical development environment. Smalltalk went through various versions and interest in the language grew. While Smalltalk was influenced by the ideas introduced in Simula 67, it was designed to be a fully dynamic system in which classes could be created and modified dynamically.

In the 1970s, Smalltalk influenced the Lisp community to incorporate object-based techniques that were introduced to developers via the Lisp machine. Experimentation with various extensions to Lisp (such as LOOPS and Flavors introducing multiple inheritance and mixins) eventually led to the Common Lisp Object System, which integrates functional programming and object-oriented programming and allows extension via a Meta-object protocol. In the 1980s, there were a few attempts to design processor architectures that included hardware support for objects in memory but these were not successful. Examples include the Intel iAPX 432 and the Linn Smart Rekursiv.

In 1981, Goldberg edited the August 1981 issue of Byte Magazine, introducing Smalltalk and object-oriented programming to a wider audience. In 1986, the Association for Computing Machinery organised the first Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), which was unexpectedly attended by 1,000 people. In the mid-1980s Objective-C was developed by Brad Cox, who had used Smalltalk at ITT Inc.; Bjarne Stroustrup, who had used Simula for his PhD thesis, went on to create the object-oriented C++. In 1985, Bertrand Meyer also produced the first design of the Eiffel language. Focused on software quality, Eiffel is a purely object-oriented programming language and a notation supporting the entire software lifecycle. Meyer described the Eiffel software development method, based on a small number of key ideas from software engineering and computer science, in Object-Oriented Software Construction. Essential to the quality focus of Eiffel is Meyer's reliability mechanism, Design by Contract, which is an integral part of both the method and language.

The TIOBE programming language popularity index graph from 2002 to 2018. In the 2000s the object-oriented Java (blue) and the procedural C (black) competed for the top position.
 
In the early and mid-1990s object-oriented programming developed as the dominant programming paradigm when programming languages supporting the techniques became widely available. These included Visual FoxPro 3.0, C++, and Delphi. Its dominance was further enhanced by the rising popularity of graphical user interfaces, which rely heavily upon object-oriented programming techniques. An example of a closely related dynamic GUI library and OOP language can be found in the Cocoa frameworks on Mac OS X, written in Objective-C, an object-oriented, dynamic messaging extension to C based on Smalltalk. OOP toolkits also enhanced the popularity of event-driven programming (although this concept is not limited to OOP).

At ETH Zürich, Niklaus Wirth and his colleagues had also been investigating such topics as data abstraction and modular programming (although this had been in common use in the 1960s or earlier). Modula-2 (1978) included both, and their succeeding design, Oberon, included a distinctive approach to object orientation, classes, and such.

Object-oriented features have been added to many previously existing languages, including Ada, BASIC, Fortran, Pascal, and COBOL. Adding these features to languages that were not initially designed for them often led to problems with compatibility and maintainability of code.

More recently, a number of languages have emerged that are primarily object-oriented, but that are also compatible with procedural methodology. Two such languages are Python and Ruby. Probably the most commercially important recent object-oriented languages are Java, developed by Sun Microsystems, as well as C# and Visual Basic.NET (VB.NET), both designed for Microsoft's .NET platform. Each of these two frameworks shows, in its own way, the benefit of using OOP by creating an abstraction from implementation. VB.NET and C# support cross-language inheritance, allowing classes defined in one language to subclass classes defined in the other language.

OOP languages

Simula (1967) is generally accepted as being the first language with the primary features of an object-oriented language. It was created for making simulation programs, in which what came to be called objects were the most important information representation. Smalltalk (1972 to 1980) is another early example, and the one with which much of the theory of OOP was developed. Concerning the degree of object orientation, the following distinctions can be made:
  • Languages called "pure" OO languages, in which everything is treated consistently as an object
  • Languages designed mainly for OO programming, but with some procedural elements
  • Languages that are historically procedural, but have been extended with some OO features
  • Languages with most of the features of objects (classes, methods, inheritance), but in a distinctly original form
  • Languages with abstract data type support which may be used to resemble OO programming, but without all the features of object-orientation

OOP in dynamic languages

In recent years, object-oriented programming has become especially popular in dynamic programming languages. Python, PowerShell, Ruby and Groovy are dynamic languages built on OOP principles, while Perl and PHP have been adding object-oriented features since Perl 5 and PHP 4, and ColdFusion since version 6.

The Document Object Model of HTML, XHTML, and XML documents on the Internet has bindings to the popular JavaScript/ECMAScript language. JavaScript is perhaps the best known prototype-based programming language, which employs cloning from prototypes rather than inheriting from a class (contrast to class-based programming). Another scripting language that takes this approach is Lua.

OOP in a network protocol

The messages that flow between computers to request services in a client-server environment can be designed as the linearizations of objects defined by class objects known to both the client and the server. For example, a simple linearized object would consist of a length field, a code point identifying the class, and a data value. A more complex example would be a command consisting of the length and code point of the command and values consisting of linearized objects representing the command's parameters. Each such command must be directed by the server to an object whose class (or superclass) recognizes the command and is able to provide the requested service. Clients and servers are best modeled as complex object-oriented structures. Distributed Data Management Architecture (DDM) took this approach and used class objects to define objects at four levels of a formal hierarchy:
  • Fields defining the data values that form messages, such as their length, code point and data values.
  • Objects and collections of objects similar to what would be found in a Smalltalk program for messages and parameters.
  • Managers similar to AS/400 objects, such as a directory to files and files consisting of metadata and records. Managers conceptually provide memory and processing resources for their contained objects.
  • A client or server consisting of all the managers necessary to implement a full processing environment, supporting such aspects as directory services, security and concurrency control.
The initial version of DDM defined distributed file services. It was later extended to be the foundation of Distributed Relational Database Architecture (DRDA).
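A minimal TypeScript sketch of the "length field, code point, data value" idea described above; this is not the actual DDM wire format, and the code points, class names and helper functions are invented for illustration:

    // Hypothetical code points identifying classes known to both client and server.
    const CODE_POINTS = { TextValue: 0x0002 } as const;

    // Linearize a simple object as: 2-byte length, 2-byte code point, payload.
    function linearizeText(text: string): Uint8Array {
      const payload = new TextEncoder().encode(text);
      const message = new Uint8Array(4 + payload.length);
      const view = new DataView(message.buffer);
      view.setUint16(0, message.length);            // length field
      view.setUint16(2, CODE_POINTS.TextValue);     // code point identifying the class
      message.set(payload, 4);                      // data value
      return message;
    }

    // The receiver uses the code point to route the object to code that recognizes it.
    function dispatch(message: Uint8Array): string {
      const view = new DataView(message.buffer);
      const codePoint = view.getUint16(2);
      if (codePoint === CODE_POINTS.TextValue) {
        return new TextDecoder().decode(message.subarray(4));
      }
      throw new Error(`unrecognized code point 0x${codePoint.toString(16)}`);
    }

    console.log(dispatch(linearizeText("LIST FILES"))); // "LIST FILES"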

Design patterns

Challenges of object-oriented design are addressed by several approaches. The most common approach is the set of design patterns codified by Gamma et al. More broadly, the term "design patterns" can be used to refer to any general, repeatable solution pattern to a commonly occurring problem in software design. Some of these commonly occurring problems have implications and solutions particular to object-oriented development.

Inheritance and behavioral subtyping

It is intuitive to assume that inheritance creates a semantic "is a" relationship, and thus to infer that objects instantiated from subclasses can always be safely used instead of those instantiated from the superclass. This intuition is unfortunately false in most OOP languages, in particular in all those that allow mutable objects. Subtype polymorphism as enforced by the type checker in OOP languages (with mutable objects) cannot guarantee behavioral subtyping in any context. Behavioral subtyping is undecidable in general, so it cannot be implemented by a program (compiler). Class or object hierarchies must be carefully designed, considering possible incorrect uses that cannot be detected syntactically; the criterion that substitutable subclasses must satisfy is known as the Liskov substitution principle.
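The classic rectangle/square illustration, sketched in TypeScript, shows how a subclass can satisfy the type checker while violating a caller's behavioral expectations (the class and method names are conventional, not from the article):

    class Rectangle {
      constructor(protected width: number, protected height: number) {}
      setWidth(w: number): void { this.width = w; }
      setHeight(h: number): void { this.height = h; }
      area(): number { return this.width * this.height; }
    }

    // "A square is a rectangle" intuitively, but keeping the sides equal
    // changes the behaviour that callers of Rectangle rely on.
    class Square extends Rectangle {
      constructor(side: number) { super(side, side); }
      setWidth(w: number): void { this.width = w; this.height = w; }
      setHeight(h: number): void { this.width = h; this.height = h; }
    }

    // Type-checks for any Rectangle, but the expectation below only holds
    // for true rectangles; the violation is behavioral, not syntactic.
    function stretch(r: Rectangle): number {
      r.setWidth(5);
      r.setHeight(4);
      return r.area(); // a caller reasoning about Rectangle expects 20
    }

    console.log(stretch(new Rectangle(1, 1))); // 20
    console.log(stretch(new Square(1)));       // 16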

Gang of Four design patterns

Design Patterns: Elements of Reusable Object-Oriented Software is an influential book published in 1994 by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, often referred to humorously as the "Gang of Four". Along with exploring the capabilities and pitfalls of object-oriented programming, it describes 23 common programming problems and patterns for solving them. As of April 2007, the book was in its 36th printing. 

The book describes 23 classic software design patterns, grouped into creational, structural, and behavioral patterns.

Object-orientation and databases

Both object-oriented programming and relational database management systems (RDBMSs) are extremely common in software today. Since relational databases don't store objects directly (though some RDBMSs have object-oriented features to approximate this), there is a general need to bridge the two worlds. The problem of bridging object-oriented programming accesses and data patterns with relational databases is known as object-relational impedance mismatch. There are a number of approaches to cope with this problem, but no general solution without downsides. One of the most common approaches is object-relational mapping, as found in IDE languages such as Visual FoxPro and libraries such as Java Data Objects and Ruby on Rails' ActiveRecord.
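A hand-rolled TypeScript sketch of the object-relational mapping idea; the EmployeeRow shape, the table and column names, and the helper methods are invented, and real ORMs such as ActiveRecord generate this kind of plumbing and use parameterized queries rather than string concatenation:

    // A row as it comes back from a relational database.
    interface EmployeeRow {
      id: number;
      first_name: string;
      salary: number;
    }

    class Employee {
      constructor(public id: number, public firstName: string, public salary: number) {}

      // Map a relational row to an object...
      static fromRow(row: EmployeeRow): Employee {
        return new Employee(row.id, row.first_name, row.salary);
      }

      // ...and an object's state back to the SQL world (illustration only).
      toUpdateSql(): string {
        return `UPDATE employees SET salary = ${this.salary} WHERE id = ${this.id}`;
      }
    }

    // Hypothetical rows produced by some database driver.
    const rows: EmployeeRow[] = [{ id: 1, first_name: "Mary", salary: 40000 }];
    const employees = rows.map(Employee.fromRow);
    employees[0].salary += 1000;
    console.log(employees[0].toUpdateSql());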
 
There are also object databases that can be used to replace RDBMSs, but these have not been as technically and commercially successful as RDBMSs.

Real-world modeling and relationships

OOP can be used to associate real-world objects and processes with digital counterparts. However, not everyone agrees that OOP facilitates direct real-world mapping (see Criticism section) or that real-world mapping is even a worthy goal; Bertrand Meyer argues in Object-Oriented Software Construction that a program is not a model of the world but a model of some part of the world; "Reality is a cousin twice removed". At the same time, some principal limitations of OOP have been noted. For example, the circle-ellipse problem is difficult to handle using OOP's concept of inheritance.

However, Niklaus Wirth (who popularized the adage now known as Wirth's law: "Software is getting slower more rapidly than hardware becomes faster") said of OOP in his paper, "Good Ideas through the Looking Glass", "This paradigm closely reflects the structure of systems 'in the real world', and it is therefore well suited to model complex systems with complex behaviours" (contrast KISS principle).

Steve Yegge and others noted that natural languages lack the OOP approach of strictly prioritizing things (objects/nouns) before actions (methods/verbs). This problem may cause OOP to suffer more convoluted solutions than procedural programming.

OOP and control flow

OOP was developed to increase the reusability and maintainability of source code. Transparent representation of the control flow had no priority and was meant to be handled by a compiler. With the increasing relevance of parallel hardware and multithreaded coding, developing transparent control flow becomes more important, something hard to achieve with OOP.

Responsibility- vs. data-driven design

Responsibility-driven design defines classes in terms of a contract, that is, a class should be defined around a responsibility and the information that it shares. This is contrasted by Wirfs-Brock and Wilkerson with data-driven design, where classes are defined around the data-structures that must be held. The authors hold that responsibility-driven design is preferable.

SOLID and GRASP guidelines

SOLID is a mnemonic invented by Michael Feathers that stands for and advocates five programming practices: single responsibility, open/closed, Liskov substitution, interface segregation, and dependency inversion.
GRASP (General Responsibility Assignment Software Patterns) is another set of guidelines advocated by Craig Larman.

Criticism

The OOP paradigm has been criticised for a number of reasons, including not meeting its stated goals of reusability and modularity, and for overemphasizing one aspect of software design and modeling (data/objects) at the expense of other important aspects (computation/algorithms).

Luca Cardelli has claimed that OOP code is "intrinsically less efficient" than procedural code, that OOP can take longer to compile, and that OOP languages have "extremely poor modularity properties with respect to class extension and modification", and tend to be extremely complex. The latter point is reiterated by Joe Armstrong, the principal inventor of Erlang, who is quoted as saying:
The problem with object-oriented languages is they've got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.
A study by Potok et al. has shown no significant difference in productivity between OOP and procedural approaches.

Christopher J. Date stated that critical comparison of OOP to other technologies, relational in particular, is difficult because of the lack of an agreed-upon and rigorous definition of OOP; however, Date and Darwen have proposed a theoretical foundation that uses OOP as a kind of customizable type system to support RDBMSs.

In an article Lawrence Krubner claimed that compared to other languages (LISP dialects, functional languages, etc.) OOP languages have no unique strengths, and inflict a heavy burden of unneeded complexity.

Alexander Stepanov compares object orientation unfavourably to generic programming:
I find OOP technically unsound. It attempts to decompose the world in terms of interfaces that vary on a single type. To deal with the real problems you need multisorted algebras — families of interfaces that span multiple types. I find OOP philosophically unsound. It claims that everything is an object. Even if it is true it is not very interesting — saying that everything is an object is saying nothing at all.
Paul Graham has suggested that OOP's popularity within large companies is due to "large (and frequently changing) groups of mediocre programmers". According to Graham, the discipline imposed by OOP prevents any one programmer from "doing too much damage".

Leo Brodie has suggested a connection between the standalone nature of objects and a tendency to duplicate code in violation of the don't repeat yourself principle of software development.

Steve Yegge noted that, as opposed to functional programming:
Object Oriented Programming puts the Nouns first and foremost. Why would you go to such lengths to put one part of speech on a pedestal? Why should one kind of concept take precedence over another? It's not as if OOP has suddenly made verbs less important in the way we actually think. It's a strangely skewed perspective.
Rich Hickey, creator of Clojure, described object systems as overly simplistic models of the real world. He emphasized the inability of OOP to model time properly, which is getting increasingly problematic as software systems become more concurrent.

Eric S. Raymond, a Unix programmer and open-source software advocate, has been critical of claims that present object-oriented programming as the "One True Solution", and has written that object-oriented programming languages tend to encourage thickly layered programs that destroy transparency. Raymond compares this unfavourably to the approach taken with Unix and the C programming language.

Rob Pike, a programmer involved in the creation of UTF-8 and Go, has called object-oriented programming "the Roman numerals of computing" and has said that OOP languages frequently shift the focus from data structures and algorithms to types. Furthermore, he cites an instance of a Java professor whose "idiomatic" solution to a problem was to create six new classes, rather than to simply use a lookup table.

Formal semantics

Objects are the run-time entities in an object-oriented system. They may represent a person, a place, a bank account, a table of data, or any item that the program has to handle.

There have been several attempts at formalizing the concepts used in object-oriented programming. Concepts and constructs such as abstract data types, recursive types, and records whose fields can hold functions have been used as interpretations of OOP concepts.
Attempts to find a consensus definition or theory behind objects have not proven very successful (however, see Abadi & Cardelli, A Theory of Objects for formal definitions of many OOP concepts and constructs), and often diverge widely. For example, some definitions focus on mental activities, and some on program structuring. One of the simpler definitions is that OOP is the act of using "map" data structures or arrays that can contain functions and pointers to other maps, all with some syntactic and scoping sugar on top. Inheritance can be performed by cloning the maps (sometimes called "prototyping").
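A small TypeScript sketch of this "objects as maps" reading, with inheritance performed by cloning; the clone helper and the animal and dog maps are illustrative:

    type ObjectMap = Record<string, any>;

    // "Inheritance" by copying the parent map and overriding some entries.
    function clone(proto: ObjectMap, overrides: ObjectMap): ObjectMap {
      return { ...proto, ...overrides };
    }

    const animal = {
      name: "generic animal",
      speak() { return `${this.name} makes a sound`; },
    };

    const dog = clone(animal, { name: "Rex" }); // inherits speak, overrides name

    console.log(animal.speak()); // "generic animal makes a sound"
    console.log(dog.speak());    // "Rex makes a sound"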

HTML5

From Wikipedia, the free encyclopedia
HTML5 (HyperText Markup Language)
Filename extensions: .html, .htm
Internet media type: text/html
Type code: TEXT
Uniform Type Identifier (UTI): public.html
Developed by: W3C
Initial release: 28 October 2014
Type of format: Markup language
Standard: HTML 5.2
Open format?: Yes

HTML5 is a markup language used for structuring and presenting content on the World Wide Web. It was the fifth and last major version of HTML that is a World Wide Web Consortium (W3C) recommendation. The current specification is known as the HTML Living Standard and is maintained by a consortium of the major browser vendors (Apple, Google, Mozilla, and Microsoft), the Web Hypertext Application Technology Working Group (WHATWG).

HTML5 was first released in public-facing form on 22 January 2008, with a major update and "W3C Recommendation" status in October 2014. Its goals were to improve the language with support for the latest multimedia and other new features; to keep the language both easily readable by humans and consistently understood by computers and devices such as web browsers, parsers, etc., without XHTML's rigidity; and to remain backward-compatible with older software. HTML5 is intended to subsume not only HTML 4, but also XHTML 1 and DOM Level 2 HTML.

HTML5 includes detailed processing models to encourage more interoperable implementations; it extends, improves and rationalizes the markup available for documents, and introduces markup and application programming interfaces (APIs) for complex web applications. For the same reasons, HTML5 is also a candidate for cross-platform mobile applications, because it includes features designed with low-powered devices in mind.

Many new syntactic features are included. To natively include and handle multimedia and graphical content, the new <video>, <audio> and <canvas> elements were added, along with support for scalable vector graphics (SVG) content and MathML for mathematical formulas. To enrich the semantic content of documents, new page structure elements are added. New attributes are introduced, some elements and attributes have been removed, and others have been changed, redefined, or standardized. The APIs and Document Object Model (DOM) are now fundamental parts of the HTML5 specification, and HTML5 also better defines the processing for any invalid documents.

History

The Web Hypertext Application Technology Working Group (WHATWG) began work on the new standard in 2004. At that time, HTML 4.01 had not been updated since 2000, and the World Wide Web Consortium (W3C) was focusing future developments on XHTML 2.0. In 2009, the W3C allowed the XHTML 2.0 Working Group's charter to expire and decided not to renew it.

The Mozilla Foundation and Opera Software presented a position paper at a World Wide Web Consortium (W3C) workshop in June 2004, focusing on developing technologies that are backward-compatible with existing browsers, including an initial draft specification of Web Forms 2.0. The workshop concluded with a vote—8 for, 14 against—for continuing work on HTML. Immediately after the workshop, WHATWG was formed to start work based upon that position paper, and a second draft, Web Applications 1.0, was also announced. The two specifications were later merged to form HTML5. The HTML5 specification was adopted as the starting point of the work of the new HTML working group of the W3C in 2007.

WHATWG's Ian Hickson (Google) and David Hyatt (Apple) produced W3C's first public working draft of the specification on 22 January 2008.

"Thoughts on Flash"

While some features of HTML5 are often compared to Adobe Flash, the two technologies are very different. Both include features for playing audio and video within web pages, and for using Scalable Vector Graphics. However, HTML5 on its own cannot be used for animation or interactivity – it must be supplemented with CSS3 or JavaScript. There are many Flash capabilities that have no direct counterpart in HTML5. HTML5's interactive capabilities became a topic of mainstream media attention around April 2010 after Apple Inc.'s then-CEO Steve Jobs issued a public letter titled "Thoughts on Flash" in which he concluded that "Flash is no longer necessary to watch video or consume any kind of web content" and that "new open standards created in the mobile era, such as HTML5, will win". This sparked a debate in web development circles suggesting that, while HTML5 provides enhanced functionality, developers must consider the varying browser support of the different parts of the standard as well as other functionality differences between HTML5 and Flash. In early November 2011, Adobe announced that it would discontinue development of Flash for mobile devices and reorient its efforts in developing tools using HTML5. On 25 July 2017, Adobe announced that both the distribution and support of Flash will cease by the end of 2020.

Last call, candidacy, and recommendation stages

On 14 February 2011, the W3C extended the charter of its HTML Working Group with clear milestones for HTML5. In May 2011, the working group advanced HTML5 to "Last Call", an invitation to communities inside and outside W3C to confirm the technical soundness of the specification. The W3C developed a comprehensive test suite to achieve broad interoperability for the full specification by 2014, which was the target date for recommendation. In January 2011, the WHATWG renamed its "HTML5" specification HTML Living Standard. The W3C nevertheless continued its project to release HTML5.

In July 2012, WHATWG and W3C decided on a degree of separation. W3C will continue the HTML5 specification work, focusing on a single definitive standard, which is considered as a "snapshot" by WHATWG. The WHATWG organization continues its work with HTML5 as a "living standard". The concept of a living standard is that it is never complete and is always being updated and improved. New features can be added but functionality will not be removed.

In December 2012, W3C designated HTML5 as a Candidate Recommendation. The criterion for advancement to W3C Recommendation is "two 100% complete and fully interoperable implementations".

On 16 September 2014, W3C moved HTML5 to Proposed Recommendation. On 28 October 2014, HTML5 was released as a W3C Recommendation, bringing the specification process to completion. On 1 November 2016, HTML 5.1 was released as a W3C Recommendation. On 14 December 2017, HTML 5.2 was released as a W3C Recommendation.

Timeline

The combined timelines for HTML 5.0, HTML 5.1 and HTML 5.2:

Version First draft Candidate recommendation Recommendation
HTML 5.0 2007 2012 2014
HTML 5.1 2012 2015 2016
HTML 5.2 2015 2017 2017
HTML 5.3 2017 N/A N/A

W3C and WHATWG conflict

The W3C ceded authority over the HTML and DOM standards to WHATWG on 28 May 2019, as it considered that having two standards is harmful. The HTML Living Standard is now authoritative. However, W3C will still participate in the development process of HTML.

Before the ceding of authority, W3C and WHATWG had been characterized as both working together on the development of HTML5, and yet also at cross purposes ever since the July 2012 split, creating WHATWG. The W3C standard was snapshot-based and static, while the WHATWG is a continually updated "living standard". The relationship had been described as "fragile", even a "rift", and characterized by "squabbling".

In at least one case, namely the permissible content of the <cite> element, the two specifications directly contradicted each other (as of July 2018), with the W3C definition being permissive and reflecting traditional use of the element since its introduction, but WHATWG limiting it to a single defined type of content (the title of the work cited). This is actually at odds with WHATWG's stated goals of ensuring backward compatibility and not losing prior functionality.

The "Introduction" section in the WHATWG spec (edited by Ian "Hixie" Hickson) is critical of W3C, e.g. "Note: Although we have asked them to stop doing so, the W3C also republishes some parts of this specification as separate documents." In its "History" subsection it portrays W3C as resistant to Hickson's and WHATWG's original HTML 5 plans, then jumping on the bandwagon belatedly (though Hickson was in control of the W3C HTML 5 spec, too). Regardless, it indicates a major philosophical divide between the organizations:
For a number of years, both groups then worked together. In 2011, however, the groups came to the conclusion that they had different goals: the W3C wanted to publish a "finished" version of "HTML5", while the WHATWG wanted to continue working on a Living Standard for HTML, continuously maintaining the specification rather than freezing it in a state with known problems, and adding new features as needed to evolve the platform.
Since then, the WHATWG has been working on this specification (amongst others), and the W3C has been copying fixes made by the WHATWG into their fork of the document (which also has other changes).
The two entities signed an agreement to work together on a single version of HTML on 28 May 2019.

Differences between the two standards

In addition to the contradiction in the <cite> element mentioned above, other differences between the two standards include at least the following, as of September 2018:

Content or Features Unique to W3C or WHATWG Standard

  • Site pagination – one standard is published as a single page, which allows global search of its contents, while the other is divided into chapters.
  • Chapters – the chapters §5 Microdata, §9 Communication, §10 Web workers, §11 Web storage, and §4.13 Custom elements (in the chapter "Elements of HTML") appear in only one of the standards.
  • Global attributes – the standards differ in the global attributes they define, including class, id, autocapitalize, enterkeyhint, inputmode, is, itemid, itemprop, itemref, itemscope, itemtype, and nonce.
  • Elements – several elements appear in only one of the standards (see the compatibility notes below), and some are placed in different sections ("Grouping content" in one, "Sections" in the other). The section §4.2.5.4 "Other pragma directives" is based on a deprecated WHATWG procedure.
  • Sections and outlines – §4.3.11.2 Sample outlines and §4.3.11.3 Exposing outlines to users appear in only one of the standards.
  • Structured data – the W3C standard recommends RDFa (code examples, separate specs, no special attributes), while the WHATWG standard recommends Microdata (code examples, spec chapter, special attributes).

According to data from the Mozilla Developer Network (MDN), as of September 2018, browser compatibility of the HTML elements unique to one of the standards was as follows. One element unique to the W3C standard was supported by all browsers except Edge; another was supported by no browser except Firefox. One element unique to the WHATWG standard was supported by all browsers, although "[since] the HTML outline algorithm is not implemented in any browsers ... the semantics are in practice only theoretical." Another WHATWG-only element, an experimental technology, had full support only in Edge and Firefox desktop, partial support in Firefox mobile, support in Opera with user opt-in, and no support in other browsers. A further WHATWG-only element, also experimental, was supported by all browsers except IE.

Features and APIs

The W3C proposed a greater reliance on modularity as a key part of the plan to make faster progress, meaning identifying specific features, either proposed or already existing in the spec, and advancing them as separate specifications. Some technologies that were originally defined in HTML 5 itself are now defined in separate specifications:
  • HTML Working Group – HTML Canvas 2D Context;
  • Web Apps Working Group – Web Messaging, Web workers, Web storage, WebSocket, Server-sent events, Web Components (this was not part of HTML 5, though); the Web Applications Working Group was closed in October 2015 and its deliverables transferred to the Web Platform Working Group (WPWG).
  • IETF HyBi Working Group – WebSocket Protocol;
  • WebRTC Working Group – WebRTC;
  • Web Media Text Tracks Community Group – WebVTT.
After the standardization of the HTML 5 specification in October 2014, the core vocabulary and features are being extended in four ways. Likewise, some features that were removed from the original HTML 5 specification have been standardized separately as modules, such as Microdata and Canvas. Technical specifications introduced as HTML 5 extensions such as Polyglot markup have also been standardized as modules. Some W3C specifications that were originally separate specifications have been adapted as HTML 5 extensions or features, such as SVG. Some features that might have slowed down the standardization of HTML 5 will be standardized as upcoming specifications, instead. HTML 5.1 was finalized in 2016, and it is currently on the standardization track at the W3C.

Features

Markup

HTML 5 introduces elements and attributes that reflect typical usage on modern websites. Some of them are semantic replacements for common uses of generic block (<div>) and inline (<span>) elements, for example <nav> (website navigation block) and <footer> (usually referring to the bottom of a web page or to the last lines of HTML code), or <audio> and <video> instead of <object>. Some deprecated elements from HTML 4.01 have been dropped, including purely presentational elements such as <font> and <center>, whose effects have long been superseded by the more capable Cascading Style Sheets. There is also a renewed emphasis on the importance of DOM scripting in Web behavior.

The HTML 5 syntax is no longer based on SGML despite the similarity of its markup. It has, however, been designed to be backward-compatible with common parsing of older versions of HTML. It comes with a new introductory line that looks like an SGML document type declaration, <!DOCTYPE html>, which triggers the standards-compliant rendering mode. Since 5 January 2009, HTML 5 also includes Web Forms 2.0, a previously separate WHATWG specification.

New APIs

HTML5 related APIs
In addition to specifying markup, HTML 5 specifies scripting application programming interfaces (APIs) that can be used with JavaScript. Existing Document Object Model (DOM) interfaces are extended and de facto features documented. There are also new APIs, such as:
  • The canvas element for immediate-mode 2D drawing;
  • Timed media playback;
  • Offline web applications;
  • Document editing;
  • Drag and drop;
  • Cross-document messaging;
  • Browser history management;
  • MIME type and protocol handler registration;
  • Microdata;
  • Web Storage, a key-value pair storage framework similar to cookies but with much larger storage capacity and an improved API.
Not all of the above technologies are included in the W3C HTML 5 specification, though they are in the WHATWG HTML specification. Some related technologies, which are not part of either the W3C HTML 5 or the WHATWG HTML specification, are as follows. The W3C publishes specifications for these separately:
  • Geolocation;
  • IndexedDB – an indexed hierarchical key-value store (formerly WebSimpleDB);
  • File – an API intended to handle file uploads and file manipulation;
  • Directories and System – an API intended to satisfy client-side-storage use cases not well served by databases;
  • File Writer – an API for writing to files from web applications;
  • Web Audio – a high-level JavaScript API for processing and synthesizing audio in web applications;
  • ClassList.
  • Web cryptography API
  • WebRTC
  • Web SQL Database – a local SQL Database (no longer maintained);
HTML 5 cannot provide animation within web pages. Additional JavaScript or CSS3 is necessary for animating HTML elements. Animation is also possible using JavaScript and HTML 4, and within SVG elements through SMIL, although browser support of the latter remains uneven as of 2011.
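As an illustration of one of the APIs listed above, a minimal TypeScript sketch using the Web Storage API (localStorage); it assumes a browser environment, and the Preferences shape and function names are invented:

    interface Preferences {
      theme: string;
      fontSize: number;
    }

    function savePreferences(prefs: Preferences): void {
      // Web Storage stores strings, so structured data is serialized to JSON.
      localStorage.setItem("preferences", JSON.stringify(prefs));
    }

    function loadPreferences(): Preferences | null {
      const raw = localStorage.getItem("preferences");
      return raw ? (JSON.parse(raw) as Preferences) : null;
    }

    savePreferences({ theme: "dark", fontSize: 14 });
    console.log(loadPreferences()); // { theme: "dark", fontSize: 14 }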

XHTML 5 (XML-serialized HTML 5)

XML documents must be served with an XML Internet media type (often called "MIME type") such as application/xhtml+xml or application/xml, and must conform to the strict, well-formed syntax of XML. XHTML 5 is simply XML-serialized HTML 5 data (that is, HTML 5 constrained to XHTML's strict requirements, e.g., not having any unclosed tags), sent with one of the XML media types. HTML that has been written to conform to both the HTML and XHTML specifications, and which will therefore produce the same DOM tree whether parsed as HTML or XML, is known as polyglot markup.

Error handling

HTML 5 is designed so that old browsers can safely ignore new HTML 5 constructs. In contrast to HTML 4.01, the HTML 5 specification gives detailed rules for lexing and parsing, with the intent that compliant browsers will produce the same results when parsing incorrect syntax. Although HTML 5 now defines a consistent behavior for "tag soup" documents, those documents are not regarded as conforming to the HTML 5 standard.

Popularity

According to a report released on 30 September 2011, 34 of the world's top 100 Web sites were using HTML 5 – the adoption led by search engines and social networks. Another report released in August 2013 has shown that 153 of the Fortune 500 U.S. companies implemented HTML5 on their corporate websites.

Since 2014, HTML 5 has been at least partially supported by most popular layout engines.

Differences from HTML 4.01 and XHTML 1.x

The following is a cursory list of differences and some specific examples.
  • New parsing rules: oriented towards flexible parsing and compatibility; not based on SGML
  • Ability to use inline SVG and MathML in text/html
  • New elements: article, aside, audio, bdi, canvas, command, data, datalist, details, embed, figcaption, figure, footer, header, keygen, mark, meter, nav, output, progress, rp, rt, ruby, section, source, summary, time, track, video, wbr
  • New types of form controls: dates and times, email, url, search, number, range, tel, color
  • New attributes: charset (on meta), async (on script)
  • Global attributes (that can be applied for every element): id, tabindex, hidden, data-* (custom data attributes)
  • Deprecated elements will be dropped altogether: acronym, applet, basefont, big, center, dir, font, frame, frameset, isindex, noframes, strike, tt
The W3C Working Group publishes "HTML5 differences from HTML 4", which provides a complete outline of additions, removals and changes between HTML 5 and HTML 4.

The W3C HTML5 logo

On 18 January 2011, the W3C introduced a logo to represent the use of or interest in HTML 5. Unlike other badges previously issued by the W3C, it does not imply validity or conformance to a certain standard. As of 1 April 2011, this logo is official.

When initially presenting it to the public, the W3C announced the HTML 5 logo as a "general-purpose visual identity for a broad set of open web technologies, including HTML 5, CSS, SVG, WOFF, and others". Some web standard advocates, including The Web Standards Project, criticized that definition of "HTML5" as an umbrella term, pointing out the blurring of terminology and the potential for miscommunication. Three days later, the W3C responded to community feedback and changed the logo's definition, dropping the enumeration of related technologies. The W3C then said the logo "represents HTML5, the cornerstone for modern Web applications".

Digital rights management

Industry players including the BBC, Google, Microsoft, and Apple Inc. have been lobbying for the inclusion of Encrypted Media Extensions (EME), a form of digital rights management (DRM), in the HTML 5 standard. By the beginning of 2013, 27 organisations, including the Free Software Foundation, had started a campaign against including digital rights management in the HTML 5 standard. However, in late September 2013, the W3C HTML Working Group decided that Encrypted Media Extensions was "in scope" and would potentially be included in the HTML 5.1 standard. WHATWG's "HTML Living Standard" continued to be developed without DRM-enabled proposals.

Manu Sporny, a member of the W3C, said that EME will not solve the problem it's supposed to address. Opponents point out that EME itself is just an architecture for a DRM plug-in mechanism.

The initial enablers for DRM in HTML 5 were Google and Microsoft; supporters also include Adobe. On 14 May 2014, Mozilla announced plans to support EME in Firefox, the last major browser to avoid DRM. Calling it "a difficult and uncomfortable step", Andreas Gal of Mozilla explained that future versions of Firefox would remain open source but ship with a sandbox designed to run a content decryption module developed by Adobe; this module was later replaced with Google's Widevine module, which is much more widely adopted by content providers. While promising to "work on alternative solutions", Mozilla's Executive Chair Mitchell Baker stated that a refusal to implement EME would have accomplished little more than convincing many users to switch browsers. This decision was condemned by Cory Doctorow and the Free Software Foundation.
