HTML5 was first released in public-facing form on 22 January 2008, with a major update and "W3C Recommendation" status in October 2014.
Its goals were to improve the language with support for the latest
multimedia and other new features; to keep the language both easily
readable by humans and consistently understood by computers and devices
such as web browsers, parsers, etc., without XHTML's rigidity; and to remain backward-compatible with older software. HTML5 is intended to subsume not only HTML 4, but also XHTML 1 and DOM Level 2 HTML.
HTML5 includes detailed processing models to encourage more
interoperable implementations; it extends, improves and rationalizes the
markup available for documents, and introduces markup and application programming interfaces (APIs) for complex web applications. For the same reasons, HTML5 is also a candidate for cross-platform mobile applications, because it includes features designed with low-powered devices in mind.
Many new syntactic features are included. To natively include and handle multimedia and graphical content, the new <video>, <audio>, and <canvas> elements were added, along with support for scalable vector graphics (SVG) content and MathML for mathematical formulas. To enrich the semantic content of documents, new page structure elements such as <main>, <section>, <article>, <header>, <footer>, <aside>, <nav>, and <figure> were added. New attributes were introduced, some elements and attributes were removed, and others such as <a>, <cite>, and <menu> were changed, redefined, or standardized.
The APIs and Document Object Model (DOM) are now fundamental parts of the HTML5 specification and HTML5 also better defines the processing for any invalid documents.
History
The Web Hypertext Application Technology Working Group (WHATWG) began work on the new standard in 2004. At that time, HTML 4.01 had not been updated since 2000, and the World Wide Web Consortium (W3C) was focusing future developments on XHTML 2.0. In 2009, the W3C allowed the XHTML 2.0 Working Group's charter to expire and decided not to renew it.
The Mozilla Foundation and Opera Software presented a position paper at a World Wide Web Consortium (W3C) workshop in June 2004, focusing on developing technologies that are backward-compatible with existing browsers,
including an initial draft specification of Web Forms 2.0. The workshop
concluded with a vote—8 for, 14 against—for continuing work on HTML.
Immediately after the workshop, WHATWG was formed to start work based
upon that position paper, and a second draft, Web Applications 1.0, was
also announced. The two specifications were later merged to form HTML5. The HTML5 specification was adopted as the starting point of the work of the new HTML working group of the W3C in 2007.
WHATWG's Ian Hickson (Google) and David Hyatt (Apple) produced W3C's first public working draft of the specification on 22 January 2008.
"Thoughts on Flash"
While some features of HTML5 are often compared to Adobe Flash, the two technologies are very different. Both include features for playing audio and video within web pages, and for using Scalable Vector Graphics. However, HTML5 on its own cannot be used for animation or interactivity – it must be supplemented with CSS3 or JavaScript. There are many Flash capabilities that have no direct counterpart in HTML5 (see Comparison of HTML5 and Flash). HTML5's interactive capabilities became a topic of mainstream media attention around April 2010 after Apple Inc.'s then-CEO Steve Jobs
issued a public letter titled "Thoughts on Flash" in which he concluded
that "Flash is no longer necessary to watch video or consume any kind
of web content" and that "new open standards created in the mobile era,
such as HTML5, will win".
This sparked a debate in web development circles suggesting that, while
HTML5 provides enhanced functionality, developers must consider the
varying browser support of the different parts of the standard as well
as other functionality differences between HTML5 and Flash.
In early November 2011, Adobe announced that it would discontinue development of Flash for mobile devices and reorient its efforts toward developing tools using HTML5. On 25 July 2017, Adobe announced that both the distribution and support of Flash would cease by the end of 2020.
Last call, candidacy, and recommendation stages
On
14 February 2011, the W3C extended the charter of its HTML Working
Group with clear milestones for HTML5. In May 2011, the working group
advanced HTML5 to "Last Call", an invitation to communities inside and
outside W3C to confirm the technical soundness of the specification. The
W3C developed a comprehensive test suite to achieve broad
interoperability for the full specification by 2014, which was the
target date for recommendation. In January 2011, the WHATWG renamed its "HTML5" specification HTML Living Standard. The W3C nevertheless continued its project to release HTML5.
In July 2012, WHATWG and W3C decided on a degree of separation.
W3C would continue the HTML5 specification work, focusing on a single definitive standard, which WHATWG considered a "snapshot". The
WHATWG organization continues its work with HTML5 as a "living
standard". The concept of a living standard is that it is never complete
and is always being updated and improved. New features can be added but
functionality will not be removed.
In December 2012, W3C designated HTML5 as a Candidate Recommendation. The criterion for advancement to W3C Recommendation is "two 100% complete and fully interoperable implementations".
On 16 September 2014, W3C moved HTML5 to Proposed Recommendation. On 28 October 2014, HTML5 was released as a W3C Recommendation, bringing the specification process to completion. On 1 November 2016, HTML 5.1 was released as a W3C Recommendation. On 14 December 2017, HTML 5.2 was released as a W3C Recommendation.
Timeline
The combined timelines for HTML 5.0, HTML 5.1 and HTML 5.2:
Version  | First draft | Candidate recommendation | Recommendation
HTML 5.0 | 2007        | 2012                     | 2014
HTML 5.1 | 2012        | 2015                     | 2016
HTML 5.2 | 2015        | 2017                     | 2017
HTML 5.3 | 2017        | N/A                      | N/A
W3C and WHATWG conflict
The W3C ceded authority over the HTML and DOM standards to WHATWG on 28 May 2019, recognizing that having two separate standards is harmful. The HTML Living Standard is now authoritative. However, W3C still participates in the development process of HTML.
Before the ceding of authority, W3C and WHATWG had been characterized as both working together on the development of HTML5, and yet also at cross purposes, ever since the July 2012 split. The W3C standard was snapshot-based and static, while WHATWG's is a continually updated "living standard". The relationship had been described as "fragile", even a "rift", and characterized by "squabbling".
In at least one case, namely the permissible content of the <cite> element, the two specifications directly contradicted each other (as of July 2018), with the W3C definition being permissive and reflecting traditional use of the element since its introduction, but WHATWG limiting it to a single defined type of content (the title of the work cited). This is actually at odds with WHATWG's stated goals of ensuring backward compatibility and not losing prior functionality.
The "Introduction" section in the WHATWG spec (edited by Ian "Hixie" Hickson) is critical of W3C, e.g. "Note:
Although we have asked them to stop doing so, the W3C also republishes
some parts of this specification as separate documents." In its
"History" subsection it portrays W3C as resistant to Hickson's and
WHATWG's original HTML 5 plans, then jumping on the bandwagon belatedly
(though Hickson was in control of the W3C HTML 5 spec, too). Regardless,
it indicates a major philosophical divide between the organizations:
For a number of years, both groups
then worked together. In 2011, however, the groups came to the
conclusion that they had different goals: the W3C wanted to publish a
"finished" version of "HTML5", while the WHATWG wanted to continue
working on a Living Standard for HTML, continuously maintaining the
specification rather than freezing it in a state with known problems,
and adding new features as needed to evolve the platform. Since
then, the WHATWG has been working on this specification (amongst
others), and the W3C has been copying fixes made by the WHATWG into
their fork of the document (which also has other changes).
The two entities signed an agreement to work together on a single version of HTML on 28 May 2019.
Differences between the two standards
In addition to the contradiction in the <cite> element mentioned above, other differences between the two standards include at least the following, as of September 2018:
Content or features unique to the W3C or WHATWG standard:

                | W3C | WHATWG
Site pagination | Single page version (allows global search of contents) | Chapters
Chapters        | §4.2.5.4 Other pragma directives, based on a deprecated WHATWG procedure; § Sections: §4.3.11.2 Sample outlines, §4.3.11.3 Exposing outlines to users | §5 Microdata; §9 Communication; §10 Web workers; §11 Web storage
Structured data | Recommends RDFa (code examples, separate specs, no special attributes). | Recommends Microdata (code examples, spec chapter, special attributes).
The following table provides data from the Mozilla Developer Network on the compatibility of major browsers, as of September 2018, with HTML elements unique to one of the standards:

Element    | Standard | Compatibility | Note
<rb>       | W3C      | All browsers, except Edge |
<rtc>      | W3C      | None, except Firefox |
<hgroup>   | WHATWG   | All browsers | "[Since] the HTML outline algorithm is not implemented in any browsers ... the semantics are in practice only theoretical."
<menuitem> | WHATWG   | Full support only in Edge and Firefox desktop; partial support in Firefox mobile; supported in Opera with user opt-in; not supported in other browsers. | Experimental technology
<slot>     | WHATWG   | All browsers, except Edge and IE | Experimental technology
Features and APIs
The W3C proposed a greater reliance on modularity as a key part of
the plan to make faster progress, meaning identifying specific features,
either proposed or already existing in the spec, and advancing them as
separate specifications. Some technologies that were originally defined in HTML 5 itself are now defined in separate specifications, such as Web Messaging, Web Workers, and Web Storage.
After the standardization of the HTML 5 specification in October 2014, the core vocabulary and features have been extended in four ways.
Likewise, some features that were removed from the original HTML 5
specification have been standardized separately as modules, such as Microdata and Canvas. Technical specifications introduced as HTML 5 extensions such as Polyglot markup
have also been standardized as modules. Some W3C specifications that
were originally separate specifications have been adapted as HTML 5
extensions or features, such as SVG.
Some features that might have slowed down the standardization of HTML 5 were deferred to upcoming specifications instead. HTML 5.1 was finalized in 2016, and HTML 5.2 followed on the W3C standardization track in 2017.
Features
Markup
HTML 5 introduces elements and attributes that reflect typical usage on modern websites. Some of them are semantic replacements for common uses of generic block (<div>) and inline (<span>) elements, for example <nav> (website navigation block), <footer> (usually referring to the bottom of a web page or to the last lines of HTML code), or <audio> and <video> instead of <object>.
Some deprecated elements from HTML 4.01 have been dropped, including purely presentational elements such as <font> and <center>, whose effects have long been superseded by the more capable Cascading Style Sheets. There is also a renewed emphasis on the importance of DOM scripting in Web behavior.
The HTML 5 syntax is no longer based on SGML
despite the similarity of its markup. It has, however, been designed to
be backward-compatible with common parsing of older versions of HTML.
It comes with a new introductory line that looks like an SGML document type declaration, <!DOCTYPE html>, which triggers the standards-compliant rendering mode.
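To illustrate, a minimal HTML5 document using the new doctype and some of the semantic elements described above might look like this (the page content is invented for the example):

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <header><h1>Site title</h1></header>
  <nav><a href="/">Home</a></nav>
  <article>
    <p>Article text.</p>
  </article>
  <footer>Copyright notice.</footer>
</body>
</html>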
Since 5 January 2009, HTML 5 also includes Web Forms 2.0, a previously separate WHATWG specification.
New APIs
HTML5 related APIs
In addition to specifying markup, HTML 5 specifies scripting application programming interfaces (APIs) that can be used with JavaScript. Existing Document Object Model (DOM) interfaces are extended and de facto features documented. There are also new APIs, such as:
Web Storage – a key-value pair storage framework that provides behaviour similar to cookies but with larger storage capacity and improved API.
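For illustration, a brief sketch of the Web Storage API next to the older cookie mechanism (the key names are invented):

// Web Storage: structured key-value access; localStorage persists across sessions.
localStorage.setItem('theme', 'dark');
console.log(localStorage.getItem('theme')); // 'dark'
localStorage.removeItem('theme');

// Cookies, by contrast, are one string per document and are sent to the server on every request.
document.cookie = 'theme=dark; max-age=3600';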
Not all of the above technologies are included in the W3C HTML 5 specification, though they are in the WHATWG HTML specification. Some related technologies are not part of either the W3C HTML 5 or the WHATWG HTML specification; the W3C publishes specifications for these separately, such as Geolocation, IndexedDB, and the File API.
HTML 5 cannot provide animation within web pages. Additional JavaScript or CSS3 is necessary for animating HTML elements. Animation is also possible using JavaScript and HTML 4, and within SVG elements through SMIL, although browser support of the latter remains uneven as of 2011.
XHTML 5 (XML-serialized HTML 5)
XML documents must be served with an XML Internet media type (often called "MIME type") such as application/xhtml+xml or application/xml,
and must conform to strict, well-formed syntax of XML. XHTML 5 is
simply XML-serialized HTML 5 data (that is, HTML 5 constrained to
XHTML's strict requirements, e.g., not having any unclosed tags), sent
with one of the XML media types. HTML that has been written to conform to
both the HTML and XHTML specifications and which will therefore produce
the same DOM tree whether parsed as HTML or XML is known as polyglot markup.
Error handling
HTML 5 is designed so that old browsers can safely ignore new HTML 5 constructs. In contrast to HTML 4.01, the HTML 5 specification gives detailed rules for lexing and parsing, with the intent that compliant browsers will produce the same results when parsing incorrect syntax. Although HTML 5 now defines a consistent behavior for "tag soup" documents, those documents are not regarded as conforming to the HTML 5 standard.
Popularity
According to a report released on 30 September 2011, 34 of the world's top 100 Web sites were using HTML 5, with adoption led by search engines and social networks. Another report, released in August 2013, showed that 153 of the Fortune 500 U.S. companies had implemented HTML5 on their corporate websites.
The W3C Working Group publishes "HTML5 differences from HTML 4", which provides a complete outline of the additions, removals and changes between HTML 5 and HTML 4. Among them:
New types of form controls: dates and times, email, url, search, number, range, tel, color
New attributes: charset (on meta), async (on script)
Global attributes (that can be applied to every element): id, tabindex, hidden, data-* (custom data attributes)
Deprecated elements that will be dropped altogether: acronym, applet, basefont, big, center, dir, font, frame, frameset, isindex, noframes, strike, tt
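As a sketch of some of these additions (field names invented for the example):

<form>
  <input type="email" name="contact">
  <input type="date" name="departure">
  <input type="range" name="volume" min="0" max="100">
  <input type="color" name="theme">
  <input type="search" name="q" data-scope="site">
</form>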
Logo
The W3C HTML5 logo
On 18 January 2011, the W3C introduced a logo to represent the use of
or interest in HTML 5. Unlike other badges previously issued by the
W3C, it does not imply validity or conformance to a certain standard. As
of 1 April 2011, this logo is official.
When initially presenting it to the public, the W3C announced the
HTML 5 logo as a "general-purpose visual identity for a broad set of
open web technologies, including HTML 5, CSS, SVG, WOFF, and others".
Some web standard advocates, including The Web Standards Project,
criticized that definition of "HTML5" as an umbrella term, pointing out
the blurring of terminology and the potential for miscommunication.
Three days later, the W3C responded to community feedback and changed
the logo's definition, dropping the enumeration of related technologies. The W3C then said the logo "represents HTML5, the cornerstone for modern Web applications".
Digital rights management
Industry players including the BBC, Google, Microsoft, and Apple Inc. have been lobbying for the inclusion of Encrypted Media Extensions (EME), a form of digital rights management (DRM), into the HTML 5 standard. In late 2012 and early 2013, 27 organisations, including the Free Software Foundation, started a campaign against including digital rights management in the HTML 5 standard. However, in late September 2013, the W3C HTML Working Group decided that Encrypted Media Extensions was "in scope" and would potentially be included in the HTML 5.1 standard. WHATWG's "HTML Living Standard" continued to be developed without DRM-enabled proposals.
Manu Sporny, a member of the W3C, said that EME would not solve the problem it was supposed to address.
Opponents point out that EME itself is just an architecture for a DRM plug-in mechanism.
The initial enablers for DRM in HTML 5 were Google and Microsoft. Supporters also include Adobe. On 14 May 2014, Mozilla announced plans to support EME in Firefox, the last major browser to avoid DRM.
Calling it "a difficult and uncomfortable step", Andreas Gal of Mozilla
explained that future versions of Firefox would remain open source but
ship with a sandbox designed to run a content decryption module
developed by Adobe. While promising to "work on alternative solutions", Mozilla's Executive Chair Mitchell Baker stated that a refusal to implement EME would have accomplished little more than convincing many users to switch browsers. This decision was condemned by Cory Doctorow and the Free Software Foundation.
Ajax (also AJAX; /ˈeɪdʒæks/; short for "Asynchronous JavaScript + XML") is a set of web development techniques using many web technologies on the client side to create asynchronous web applications. With Ajax, web applications can send and retrieve data from a server
asynchronously (in the background) without interfering with the display
and behavior of the existing page. By decoupling the data interchange
layer from the presentation layer, Ajax allows web pages and, by
extension, web applications, to change content dynamically without the
need to reload the entire page. In practice, modern implementations commonly utilize JSON instead of XML.
Ajax is not a single technology, but rather a group of technologies. HTML and CSS
can be used in combination to mark up and style information. The
webpage can then be modified by JavaScript to dynamically display—and
allow the user to interact with—the new information. The built-in XMLHttpRequest object, or, since 2017, the fetch() function within JavaScript, is commonly used to execute Ajax on webpages, allowing websites to load content onto the screen without refreshing the page. Ajax is not a new technology or a different language, but existing technologies used in new ways.
History
In the early-to-mid 1990s, most Web
sites were based on complete HTML pages. Each user action required that
a complete new page be loaded from the server. This process was
inefficient, as reflected by the user experience: all page content
disappeared, then the new page appeared. Each time the browser reloaded a
page because of a partial change, all of the content had to be re-sent,
even though only some of the information had changed. This placed
additional load on the server and made bandwidth a limiting factor on performance.
In 1996, the iframe tag was introduced by Internet Explorer; like the object element, it can load or fetch content asynchronously. In 1998, the Microsoft Outlook Web Access team developed the concept behind the XMLHttpRequest scripting object. It appeared as XMLHTTP in the second version of the MSXML library, which shipped with Internet Explorer 5.0 in March 1999.
The functionality of the XMLHTTP ActiveX control in IE 5 was later implemented by Mozilla, Safari, Opera, and other browsers as the XMLHttpRequest JavaScript object. Microsoft adopted the native XMLHttpRequest model as of Internet Explorer 7. The ActiveX version is still supported in Internet Explorer, but not in Microsoft Edge. The utility of these background HTTP requests and asynchronous Web technologies remained fairly obscure until they started appearing in large-scale online applications such as Outlook Web Access (2000) and Oddpost (2002).
Google made a wide deployment of standards-compliant, cross-browser Ajax with Gmail (2004) and Google Maps (2005). In October 2004, Kayak.com's public beta release was among the first large-scale e-commerce uses of what their developers at that time called "the xml http thing". This increased interest in AJAX among web program developers.
The term AJAX was first used publicly on 18 February 2005 by Jesse James Garrett in an article titled "Ajax: A New Approach to Web Applications", based on techniques used on Google pages.
On 5 April 2006, the World Wide Web Consortium (W3C) released the first draft specification for the XMLHttpRequest object in an attempt to create an official Web standard.
The latest draft of the XMLHttpRequest object was published on 6 October 2016.
Technologies
The conventional model for a Web Application versus an application using Ajax
The term Ajax has come to represent a broad group of Web
technologies that can be used to implement a Web application that
communicates with a server in the background, without interfering with
the current state of the page. In the article that coined the term Ajax, Jesse James Garrett explained that the following technologies are incorporated:
standards-based presentation using XHTML and CSS;
dynamic display and interaction using the Document Object Model (DOM);
data interchange and manipulation using XML and XSLT;
asynchronous data retrieval using XMLHttpRequest;
and JavaScript to bind everything together.
Since then, however, there have been a number of developments in the
technologies used in an Ajax application, and in the definition of the
term Ajax itself. XML is no longer required for data interchange and,
therefore, XSLT is no longer required for the manipulation of data. JavaScript Object Notation (JSON) is often used as an alternative format for data interchange, although other formats such as preformatted HTML or plain text can also be used. A variety of popular JavaScript libraries, including JQuery, include abstractions to assist in executing Ajax requests.
Drawbacks
Any user whose browser does not support JavaScript or XMLHttpRequest, or has this functionality disabled, will not be able to properly use pages that depend on Ajax. Simple devices (such as smartphones and PDAs) may not support the required technologies. The only way to let the user carry out such functionality is to fall back to non-JavaScript methods. This can be achieved by making sure links and forms can be resolved properly and do not rely solely on Ajax.
Similarly, some Web applications that use Ajax are built in a way that cannot be read by screen-reading technologies, such as JAWS. The WAI-ARIA standards provide a way to provide hints in such a case.
Screen readers that are able to use Ajax may still not be able to properly read the dynamically generated content.
The same-origin policy prevents some Ajax techniques from being used across domains, although the W3C has a draft of the XMLHttpRequest object that would enable this functionality.
Methods exist to sidestep this security feature by using a special
Cross Domain Communications channel embedded as an iframe within a page, or by the use of JSONP.
Ajax is designed for one-way communications with the server. If two-way communications are needed (i.e., for the client to listen for events/changes on the server), then WebSockets may be a better option.
In pre-HTML5
browsers, pages dynamically created using successive Ajax requests did
not automatically register themselves with the browser's history engine,
so clicking the browser's "back" button may not have returned the
browser to an earlier state of the Ajax-enabled page, but may have
instead returned to the last full page visited before it. Such
behavior — navigating between pages instead of navigating between page
states — may be desirable, but if fine-grained tracking of page state is
required, then a pre-HTML5
workaround was to use invisible iframes to trigger changes in the
browser's history. A workaround implemented by Ajax techniques is to
change the URL fragment identifier (the part of a URL after the "#") when an Ajax-enabled page is accessed and monitor it for changes. HTML5 provides an extensive API standard for working with the browser's history engine.
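A sketch of both techniques in JavaScript (renderState is a placeholder for application logic, not part of any standard API):

// Pre-HTML5 workaround: keep state in the URL fragment and watch for changes.
window.addEventListener('hashchange', function () {
    renderState(window.location.hash.slice(1)); // e.g. '#inbox' -> 'inbox'
});

// HTML5 History API: record a state entry without reloading the page.
history.pushState({ view: 'inbox' }, '', '/inbox');
window.addEventListener('popstate', function (event) {
    renderState(event.state ? event.state.view : 'default');
});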
Dynamic Web page updates also make it difficult to bookmark
and return to a particular state of the application. Solutions to this
problem exist, many of which again use the URL fragment identifier.
On the other hand, as AJAX-intensive pages tend to function as
applications rather than content, bookmarking interim states rarely
makes sense. Nevertheless, the solution provided by HTML5 for the above
problem also applies for this.
Depending on the nature of the Ajax application, dynamic page
updates may disrupt user interactions, particularly if the Internet
connection is slow or unreliable. For example, editing a search field
may trigger a query to the server for search completions, but the user
may not know that a search completion popup is forthcoming, and if the
Internet connection is slow, the popup list may show up at an
inconvenient time, when the user has already proceeded to do something
else.
Excluding Google, most major Web crawlers do not execute JavaScript code, so in order to be indexed by Web search engines,
a Web application must provide an alternative means of accessing the
content that would normally be retrieved with Ajax. It has been
suggested that a headless browser
may be used to index content provided by Ajax-enabled websites,
although Google is no longer recommending the Ajax crawling proposal
they made in 2009.
Examples
JavaScript example
An example of a simple Ajax request using the GET method, written in JavaScript.
get-ajax-data.js:
// This is the client-side script.

// Initialize the HTTP request.
var xhr = new XMLHttpRequest();
xhr.open('GET', 'send-ajax-data.php');

// Track the state changes of the request.
xhr.onreadystatechange = function () {
    var DONE = 4; // readyState 4 means the request is done.
    var OK = 200; // status 200 is a successful return.
    if (xhr.readyState === DONE) {
        if (xhr.status === OK) {
            console.log(xhr.responseText); // 'This is the output.'
        } else {
            console.log('Error: ' + xhr.status); // An error occurred during the request.
        }
    }
};

// Send the request to send-ajax-data.php
xhr.send(null);
send-ajax-data.php:
<?php
// This is the server-side script.
// Set the content type.
header('Content-Type: text/plain');

// Send the data back.
echo "This is the output.";
?>
Many developers dislike the syntax used in the XMLHttpRequest object, so some of the following workarounds have been created.
jQuery example
The popular JavaScript library jQuery
has implemented abstractions which enable developers to use Ajax more
conveniently. Although it still uses XMLHttpRequest behind the scenes,
the following is a client-side implementation of the same example as
above using the 'ajax' method.
$.ajax({
    type: 'GET',
    url: 'send-ajax-data.php',
    dataType: 'JSON', // data type expected from server
    success: function (data) {
        console.log(data);
    },
    error: function (error) {
        console.log('Error: ' + error);
    }
});
jQuery also implements a 'get' method which allows the same code to be written more concisely.
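For instance, a minimal sketch of the same request with 'get' (not shown in the original example):

$.get('send-ajax-data.php', function (data) {
    console.log(data);
}, 'json');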
Fetch is a new native JavaScript API. According to Google Developers Documentation, "Fetch makes it easier to make web requests and handle responses than with the older XMLHttpRequest."
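A sketch of the same request using fetch(), assuming the send-ajax-data.php endpoint from the earlier example:

fetch('send-ajax-data.php')
    .then(function (response) {
        if (!response.ok) {
            throw new Error('Error: ' + response.status);
        }
        return response.text(); // the example endpoint returns plain text
    })
    .then(function (data) {
        console.log(data); // 'This is the output.'
    })
    .catch(function (error) {
        console.log(error);
    });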
SOAP allows developers to invoke processes running on disparate operating systems (such as Windows, macOS, and Linux) to authenticate, authorize, and communicate using Extensible Markup Language
(XML). Since Web protocols like HTTP are installed and running on all
operating systems, SOAP allows clients to invoke web services and
receive responses independent of language and platforms.
Characteristics
SOAP provides the Messaging Protocol layer of a web services protocol stack for web services. It is an XML-based protocol consisting of three parts:
an envelope, which defines the message structure and how to process it
a set of encoding rules for expressing instances of application-defined datatypes
a convention for representing procedure calls and responses
SOAP has three major characteristics:
extensibility (security and WS-Addressing are among the extensions under development)
neutrality (SOAP can operate over any protocol such as HTTP, SMTP, TCP, UDP, or JMS)
independence (SOAP allows for any programming model)
As an example of what SOAP procedures can do, an application can send
a SOAP request to a server that has web services enabled—such as a
real-estate price database—with the parameters for a search. The server
then returns a SOAP response (an XML-formatted document with the
resulting data), e.g., prices, location, features. Since the generated
data comes in a standardized machine-parsable format, the requesting
application can then integrate it directly.
The SOAP architecture consists of several layers of specifications for: message format, Message Exchange Patterns (MEP), underlying transport protocol bindings, message processing models, and protocol extensibility.
SOAP evolved as a successor of XML-RPC, though it borrows its transport and interaction neutrality from Web Service Addressing and the envelope/header/body from elsewhere (probably from WDDX).
History
SOAP was designed as an object-access protocol in 1998 by Dave Winer, Don Box, Bob Atkinson, and Mohsen Al-Ghosein for Microsoft, where Atkinson and Al-Ghosein were working.[3] The specification was not made available until it was submitted to the IETF on 13 September 1999.[4][5] According to Don Box, this was due to politics within Microsoft. Because of Microsoft's hesitation, Dave Winer shipped XML-RPC in 1998.
The submitted Internet Draft did not reach RFC status and is therefore not considered a "standard" as such. Version 1.1 of the specification was published as a W3C Note on 8 May 2000. Since version 1.1 did not reach W3C Recommendation status, it cannot be considered a "standard" either. Version 1.2 of the specification, however, became a W3C Recommendation on 24 June 2003.
The SOAP specification was maintained by the XML Protocol Working Group of the World Wide Web Consortium until the group was closed 10 July 2009. SOAP originally stood for "Simple Object Access Protocol" but version 1.2 of the standard dropped this acronym.
After SOAP was first introduced, it became the underlying layer of a more complex set of web services, based on Web Services Description Language (WSDL), XML schema and Universal Description Discovery and Integration
(UDDI). These different services, especially UDDI, have proved to be of
far less interest, but an appreciation of them gives a complete
understanding of the expected role of SOAP compared to how web services
have actually evolved.
SOAP terminology
The SOAP specification can be broadly defined as consisting of the following three conceptual components: protocol concepts, encapsulation concepts, and network concepts.
Protocol concepts
SOAP
The set of rules formalizing and governing the format and processing
rules for information exchanged between a SOAP sender and a SOAP
receiver.
SOAP nodes
These are physical/logical machines with processing units which are
used to transmit/forward, receive and process SOAP messages. These are
analogous to nodes in a network.
SOAP roles
Over the path of a SOAP message, all nodes assume a specific role.
The role of the node defines the action that the node performs on the
message it receives. For example, a role "none" means that no node will process the SOAP header in any way and simply transmit the message along its path.
SOAP protocol binding
A SOAP message needs to work in conjunction with other protocols to
be transferred over a network. For example, a SOAP message could use TCP as a lower layer protocol to transfer messages. These bindings are defined in the SOAP protocol binding framework.
SOAP features
SOAP provides a messaging framework only. However, it can be
extended to add features such as reliability, security etc. There are
rules to be followed when adding features to the SOAP framework.
SOAP module
A collection of specifications regarding the semantics of SOAP
header to describe any new features being extended upon SOAP. A module
needs to realize zero or more features. SOAP requires modules to adhere
to prescribed rules.
Data encapsulation concepts
SOAP message
Represents the information being exchanged between two SOAP nodes.
SOAP envelope
As per its name, it is the enclosing element of an XML message identifying it as a SOAP message.
SOAP header block
A SOAP header can contain more than one of these blocks, each being a discrete computational block within the header. In general, the SOAP role information is used to target nodes on the path. A header block is said to be targeted at a SOAP node if the SOAP role for the header block is the name of a role in which the SOAP node operates. (For example, a SOAP header block with the role attribute ultimateReceiver is targeted only at the destination node, which has this role. A header with the role attribute next is targeted at each intermediary as well as the destination node.)
SOAP header
A collection of one or more header blocks targeted at each SOAP receiver.
SOAP body
Contains the body of the message intended for the SOAP receiver. The
interpretation and processing of SOAP body is defined by header blocks.
SOAP fault
In case a SOAP node fails to process a SOAP message, it adds the
fault information to the SOAP fault element. This element is contained
within the SOAP body as a child element.
Message sender and receiver concepts
SOAP sender
The node that transmits a SOAP message.
SOAP receiver
The node receiving a SOAP message. (Could be an intermediary or the destination node.)
SOAP message path
The path consisting of all the nodes that the SOAP message traversed to reach the destination node.
Initial SOAP sender
This is the node which originated the SOAP message to be transmitted. This is the root of the SOAP message path.
SOAP intermediary
All the nodes in between the SOAP originator and the intended SOAP
destination. It processes the SOAP header blocks targeted at it and acts
to forward a SOAP message towards an ultimate SOAP receiver.
Ultimate SOAP receiver
The destination receiver of the SOAP message. This node is
responsible for processing the message body and any header blocks
targeted at it.
Specification
SOAP structure
The SOAP specification defines the messaging framework, which consists of:
The SOAP processing model, defining the rules for processing a SOAP message
The SOAP extensibility model defining the concepts of SOAP features and SOAP modules
The SOAP underlying protocol binding framework describing the
rules for defining a binding to an underlying protocol that can be used
for exchanging SOAP messages between SOAP nodes
The SOAP message construct defining the structure of a SOAP message
SOAP building blocks
A SOAP message is an ordinary XML document containing the following elements:
Element  | Description | Required
Envelope | Identifies the XML document as a SOAP message. | Yes
Header   | Contains header information. | No
Body     | Contains call and response information. | Yes
Fault    | Provides information about errors that occurred while processing the message. | No
Transport methods
Both SMTP and HTTP
are valid application layer protocols used as transport for SOAP, but
HTTP has gained wider acceptance as it works well with today's internet
infrastructure; specifically, HTTP works well with network firewalls. SOAP may also be used over HTTPS (which is the same protocol as HTTP at the application level, but uses an encrypted transport protocol underneath) with either simple or mutual authentication; this is the advocated WS-I method to provide web service security as stated in the WS-I Basic Profile 1.1.
This is a major advantage over other distributed protocols like GIOP/IIOP or DCOM, which are normally filtered by firewalls. SOAP over AMQP is yet another possibility that some implementations support. SOAP also has an advantage over DCOM in that it is unaffected by security rights configured on the machines, which would require knowledge of both transmitting and receiving nodes. This lets SOAP be loosely coupled in a way that is not possible with DCOM. There is also the SOAP-over-UDP OASIS standard.
Message format
XML Information Set was chosen as the standard message format because of its widespread use by major corporations and open source development efforts. Typically, XML Information Set is serialized as XML. A wide variety of freely available tools significantly eases the transition to a SOAP-based implementation. The somewhat lengthy syntax of XML
can be both a benefit and a drawback. While it promotes readability for
humans, facilitates error detection, and avoids interoperability
problems such as byte-order (endianness), it can slow processing speed and can be cumbersome. For example, CORBA, GIOP, ICE, and DCOM use much shorter, binary message formats. On the other hand, hardware appliances are available to accelerate the processing of XML messages.[19][20] Binary XML is also being explored as a means of streamlining the throughput requirements of XML.
XML messages by their self-documenting nature usually have more
'overhead' (e.g., headers, nested tags, delimiters) than actual data in
contrast to earlier protocols where the overhead was usually a
relatively small percentage of the overall message.
In financial messaging SOAP was found to result in a 2–4 times larger message than previous protocols FIX (Financial Information Exchange) and CDR (Common Data Representation).
XML Information Set does not have to be serialized in XML. For instance, CSV and JSON
XML-infoset representations exist. There is also no need to specify a
generic transformation framework. The concept of SOAP bindings allows
for specific bindings for a specific application. The drawback is that
both the senders and receivers have to support this newly defined
binding.
Example message (encapsulated in HTTP)
The message below is requesting a stock price for AT&T (stock ticker symbol "T").
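A request of this shape might look as follows (the endpoint path, namespace, and operation names are illustrative, not from a real service):

POST /InStock HTTP/1.1
Host: www.example.org
Content-Type: application/soap+xml; charset=utf-8

<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
               xmlns:m="http://www.example.org/stock">
  <soap:Header>
  </soap:Header>
  <soap:Body>
    <m:GetStockPrice>
      <m:StockName>T</m:StockName>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>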
SOAP's
neutrality characteristic explicitly makes it suitable for use with any
transport protocol. Implementations often use HTTP as a transport
protocol, but other popular transport protocols can be used. For
example, SOAP can also be used over SMTP, JMS and message queues.
SOAP, when combined with HTTP post/response exchanges, tunnels
easily through existing firewalls and proxies, and consequently doesn't
require modifying the widespread computing and communication
infrastructures that exist for processing HTTP post/response exchanges.
SOAP has available to it all the facilities of XML, including easy internationalization and extensibility with XML Namespaces.
Disadvantages
When using the standard implementation and the default SOAP/HTTP binding, the XML infoset is serialized as XML. To improve performance for the special case of XML with embedded binary objects, the Message Transmission Optimization Mechanism (MTOM) was introduced.
When relying on HTTP as a transport protocol and not using Web Services Addressing or an Enterprise Service Bus, the roles of the interacting parties are fixed. Only one party (the client) can use the services of the other.
The verbosity of the protocol, slow parsing speed of XML, and lack
of a standardized interaction model led to the domination in the field
by services using the HTTP protocol more directly. See, for example, REST.
WebSocket is a computer communications protocol, providing full-duplex communication channels over a single TCP connection. The WebSocket protocol was standardized by the IETF as RFC 6455 in 2011, and the WebSocket API in Web IDL is being standardized by the W3C.
WebSocket is distinct from HTTP. Both protocols are located at layer 7 in the OSI model and depend on TCP at layer 4. Although they are different, RFC 6455
states that WebSocket "is designed to work over HTTP ports 80 and 443
as well as to support HTTP proxies and intermediaries," thus making it
compatible with the HTTP protocol. To achieve compatibility, the
WebSocket handshake uses the HTTP Upgrade header to change from the HTTP protocol to the WebSocket protocol.
The WebSocket protocol enables interaction between a web browser (or other client application) and a web server
with lower overhead than half-duplex alternatives such as HTTP polling,
facilitating real-time data transfer from and to the server. This is
made possible by providing a standardized way for the server to send
content to the client without being first requested by the client, and
allowing messages to be passed back and forth while keeping the
connection open. In this way, a two-way ongoing conversation can take
place between the client and the server. The communications are done
over TCP port number 80 (or 443 in the case of TLS-encrypted connections), which is of benefit for those environments which block non-web Internet connections using a firewall. Similar two-way browser-server communications have been achieved in non-standardized ways using stopgap technologies such as Comet.
Unlike HTTP, WebSocket provides full-duplex communication.
Additionally, WebSocket enables streams of messages on top of TCP. TCP
alone deals with streams of bytes with no inherent concept of a message.
Before WebSocket, port 80 full-duplex communication was attainable
using Comet
channels; however, Comet implementation is nontrivial, and due to the
TCP handshake and HTTP header overhead, it is inefficient for small
messages. The WebSocket protocol aims to solve these problems without
compromising the security assumptions of the web.
The WebSocket protocol specification defines ws (WebSocket) and wss (WebSocket Secure) as two new uniform resource identifier (URI) schemes that are used for unencrypted and encrypted connections, respectively. Apart from the scheme name and fragment (i.e. # is not supported), the rest of the URI components are defined to use URI generic syntax.
Using browser developer tools, developers can inspect the WebSocket handshake as well as the WebSocket frames.
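In a browser, the WebSocket API is used roughly as follows (the endpoint URL is a placeholder):

var socket = new WebSocket('wss://example.com/chat'); // placeholder endpoint
socket.addEventListener('open', function () {
    socket.send('Hello'); // send a text frame once the handshake completes
});
socket.addEventListener('message', function (event) {
    console.log('Received: ' + event.data);
});
socket.addEventListener('close', function () {
    console.log('Connection closed');
});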
History
WebSocket was first referenced as TCPConnection in the HTML5 specification, as a placeholder for a TCP-based socket API. In June 2008, a series of discussions were led by Michael Carter that resulted in the first version of the protocol known as WebSocket.
The name "WebSocket" was coined by Ian Hickson and Michael Carter
shortly thereafter through collaboration on the #whatwg IRC chat room,
and subsequently authored for inclusion in the HTML5 specification by
Ian Hickson, and announced on the cometdaily blog by Michael Carter.
In December 2009, Google Chrome 4 was the first browser to ship full
support for the standard, with WebSocket enabled by default. Development of the WebSocket protocol was subsequently moved from the W3C and WHATWG group to the IETF in February 2010, and authored for two revisions under Ian Hickson.
After the protocol was shipped and enabled by default in multiple
browsers, the RFC was finalized under Ian Fette in December 2011.
Browser implementation
A secure version of the WebSocket protocol is implemented in Firefox 6, Safari 6, Google Chrome 14, Opera 12.10 and Internet Explorer 10. A detailed protocol test suite report lists the conformance of those browsers to specific protocol aspects.
An older, less secure version of the protocol was implemented in Opera 11 and Safari 5, as well as the mobile version of Safari in iOS 4.2. The BlackBerry Browser in OS7 implements WebSockets. Because of vulnerabilities, it was disabled in Firefox 4 and 5, and Opera 11.
Nginx has supported WebSockets since 2013, implemented in version 1.3.13, including acting as a reverse proxy and load balancer of WebSocket applications.
Protocol handshake
To
establish a WebSocket connection, the client sends a WebSocket
handshake request, for which the server returns a WebSocket handshake
response, as shown in the example below.
Client request (just like in HTTP, each line ends with \r\n and there must be an extra blank line at the end; the host, path, and subprotocols below are illustrative):
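GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com

Server response:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat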
The handshake starts with an HTTP request/response, allowing servers
to handle HTTP connections as well as WebSocket connections on the same
port. Once the connection is established, communication switches to a
bidirectional binary protocol which does not conform to the HTTP
protocol.
In addition to Upgrade headers, the client sends a Sec-WebSocket-Key header containing base64-encoded random bytes, and the server replies with a hash of the key in the Sec-WebSocket-Accept header. This is intended to prevent a caching proxy from re-sending a previous WebSocket conversation, and does not provide any authentication, privacy, or integrity. The hashing function appends the fixed string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 (a GUID) to the value from the Sec-WebSocket-Key header (which is not decoded from base64), applies the SHA-1 hashing function, and encodes the result using base64.
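A sketch of this computation in Node.js, using the built-in crypto module and the Sec-WebSocket-Key value from the example handshake above:

// Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key.
var crypto = require('crypto');
var GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'; // fixed GUID defined by the protocol

function secWebSocketAccept(secWebSocketKey) {
    // Append the GUID to the raw header value, SHA-1 hash it, then base64-encode.
    return crypto.createHash('sha1')
        .update(secWebSocketKey + GUID)
        .digest('base64');
}

console.log(secWebSocketAccept('x3JJHMbDL1EzLkh9GBhXDw=='));
// -> 'HSmrc0sMlYUkAGmm5OPpG2HaGWk='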
Once the connection is established, the client and server can send WebSocket data or text frames back and forth in full-duplex mode. The data is minimally framed, with a small header followed by payload.[35]
WebSocket transmissions are described as "messages", where a single
message can optionally be split across several data frames. This can
allow for sending of messages where initial data is available but the
complete length of the message is unknown (it sends one data frame after
another until the end is reached and marked with the FIN bit). With
extensions to the protocol, this can also be used for multiplexing
several streams simultaneously (for instance to avoid monopolizing use
of a socket for a single large payload).
Security considerations
Unlike regular cross-domain HTTP requests, WebSocket requests are not restricted by the same-origin policy. Therefore, WebSocket servers must validate the "Origin" header against the expected origins during connection establishment, to avoid Cross-Site WebSocket Hijacking attacks (similar to cross-site request forgery), which might be possible when the connection is authenticated with cookies or HTTP authentication. It is better to use tokens or similar protection mechanisms to authenticate the WebSocket connection when sensitive (private) data is being transferred over the WebSocket.
Proxy traversal
WebSocket protocol client implementations try to detect whether the user agent is configured to use a proxy when connecting to the destination host and port and, if it is, use the HTTP CONNECT method to set up a persistent tunnel.
While the WebSocket protocol itself is unaware of proxy servers
and firewalls, it features an HTTP-compatible handshake thus allowing
HTTP servers to share their default HTTP and HTTPS ports (80 and 443)
with a WebSocket gateway or server. The WebSocket protocol defines a
ws:// and wss:// prefix to indicate a WebSocket and a WebSocket Secure
connection, respectively. Both schemes use an HTTP upgrade mechanism
to upgrade to the WebSocket protocol. Some proxy servers are
transparent and work fine with WebSocket; others will prevent WebSocket
from working correctly, causing the connection to fail. In some cases,
additional proxy server configuration may be required, and certain proxy
servers may need to be upgraded to support WebSocket.
If unencrypted WebSocket traffic flows through an explicit or a
transparent proxy server without WebSockets support, the connection will
likely fail.
If an encrypted WebSocket connection is used, then the use of Transport Layer Security
(TLS) in the WebSocket Secure connection ensures that an HTTP CONNECT
command is issued when the browser is configured to use an explicit
proxy server. This sets up a tunnel, which provides low-level end-to-end
TCP communication through the HTTP proxy, between the WebSocket Secure
client and the WebSocket server. In the case of transparent proxy
servers, the browser is unaware of the proxy server, so no HTTP CONNECT
is sent. However, since the wire traffic is encrypted, intermediate
transparent proxy servers may simply allow the encrypted traffic
through, so there is a much better chance that the WebSocket connection
will succeed if WebSocket Secure is used. Using encryption is not free
of resource cost, but often provides the highest success rate since it
would be travelling through a secure tunnel.
A mid-2010 draft (version hixie-76) broke compatibility with reverse proxies and gateways by including eight bytes of key data after the headers, but not advertising that data in a Content-Length: 8 header. This data was not forwarded by all intermediates, which could lead to protocol failure. More recent drafts (e.g., hybi-09) put the key data in a Sec-WebSocket-Key header, solving this problem.