
Friday, December 8, 2023

Web server

From Wikipedia, the free encyclopedia
[Images: PC clients communicating via the network with a web server serving static content only; the inside and front of a Dell PowerEdge server, a computer designed to be mounted in a rack environment and often used as a web server; multiple web servers that may be used for a high-traffic website; a web server farm with thousands of web servers, used for super-high-traffic websites; an ADSL modem running an embedded web server that serves dynamic web pages used for modem configuration.]

A web server is computer software and underlying hardware that accepts requests via HTTP (the network protocol created to distribute web content) or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so.

The hardware used to run a web server can vary according to the volume of requests that it needs to handle. At the low end of the range are embedded systems, such as a router that runs a small web server as its configuration interface. A high-traffic Internet website might handle requests with hundreds of servers that run on racks of high-speed computers.

A resource sent from a web server can be a pre-existing file (static content) available to the web server, or it can be generated at the time of the request (dynamic content) by another program that communicates with the server software. The former usually can be served faster and can be more easily cached for repeated requests, while the latter supports a broader range of applications.

Technologies such as REST and SOAP, which use HTTP as a basis for general computer-to-computer communication, as well as support for WebDAV extensions, have extended the application of web servers well beyond their original purpose of serving human-readable pages.

History

[Images: the first web proposal (1989), evaluated as "vague but exciting..."; the world's first web server, a NeXT Computer workstation with Ethernet, 1990, whose case label reads: "This machine is a server. DO NOT POWER IT DOWN!!"]

This is a very brief history of web server programs, so some information necessarily overlaps with the histories of web browsers, the World Wide Web and the Internet; for the sake of clarity and understandability, some key historical information reported below may therefore also be found in one or more of those history articles.

Initial WWW project (1989–1991)

In March 1989 Sir Tim Berners-Lee proposed a new project to his employer CERN, with the goal of easing the exchange of information between scientists by using a hypertext system. The proposal, titled "HyperText and CERN", asked for comments and was read by several people. In October 1990 the proposal was reformulated and enriched (with Robert Cailliau as co-author), and it was finally approved.

Between late 1990 and early 1991 the project resulted in Berners-Lee and his developers writing and testing several software libraries along with three programs, which initially ran on the NeXTSTEP OS installed on NeXT workstations: a browser-editor (WorldWideWeb), a line-mode browser usable on simple terminals, and the first web server (later known as CERN httpd).

Those early browsers retrieved web pages, written in a simple early form of HTML, from web servers using a new basic communication protocol named HTTP/0.9.

In August 1991 Tim Berners-Lee announced the birth of WWW technology and encouraged scientists to adopt and develop it. Soon after, those programs, along with their source code, were made available to people interested in using them. Although the source code was not formally licensed or placed in the public domain, CERN informally allowed users and developers to experiment with it and to build further developments on top of it. Berners-Lee started promoting the adoption and usage of those programs, along with their porting to other operating systems.

Fast and wild development (1991–1995)

In December 1991 the first web server outside Europe was installed at SLAC (U.S.A.). This was a very important event because it started trans-continental web communications between web browsers and web servers.

In 1991–1993 the CERN web server program continued to be actively developed by the WWW group; meanwhile, thanks to the availability of its source code and to the public specifications of the HTTP protocol, many other implementations of web servers started to be developed.

In April 1993 CERN issued a public official statement declaring that the three components of the Web software (the basic line-mode client, the web server and the library of common code), along with their source code, were put in the public domain. This statement freed web server developers from any possible legal issue concerning the development of derivative works based on that source code (a threat that in practice never existed).

At the beginning of 1994, the most notable among the new web servers was NCSA httpd, which ran on a variety of Unix-based OSs and could serve dynamically generated content by implementing the POST HTTP method and the Common Gateway Interface (CGI) to communicate with external programs. These capabilities, along with the multimedia features of NCSA's Mosaic browser (which was also able to manage HTML forms in order to send data to a web server), highlighted the potential of web technology for publishing and for distributed computing applications.

In the second half of 1994, the development of NCSA httpd stalled to the point that a group of external software developers, webmasters and other professionals interested in that server started to write and collect patches, which was possible because the NCSA httpd source code was in the public domain. At the beginning of 1995 those patches were all applied to the last release of the NCSA source code and, after several tests, the Apache HTTP Server project was started.

At the end of 1994 a new commercial web server, named Netsite, was released with specific features. It was the first of many similar products that were developed first by Netscape, then by Sun Microsystems, and finally by Oracle Corporation.

In mid-1995 the first version of IIS was released by Microsoft for the Windows NT OS. This marked the entry into the field of World Wide Web technologies of a very important commercial developer and vendor that has played, and still plays, a key role on both the client and the server sides of the web.

In the second half of 1995 the usage of the CERN and NCSA web servers started to decline (in global percentage terms) because of the widespread adoption of new web servers which had a much faster development cycle along with more features, more fixes applied, and better performance than the previous ones.

Explosive growth and competition (1996–2014)

[Image: Sun's Cobalt Qube 3, a computer server appliance (2002, discontinued)]

At the end of 1996 there were already over fifty known (different) web server software programs available to anybody who wanted to own an Internet domain name and/or to host websites. Many of them were short-lived and were soon replaced by other web servers.

The publication of the RFCs on protocol versions HTTP/1.0 (1996) and HTTP/1.1 (1997, 1999) forced most web servers to comply (not always completely) with those standards. The use of persistent TCP/IP connections (HTTP/1.1) required web servers both to greatly increase the maximum number of concurrent connections allowed and to improve their level of scalability.

Between 1996 and 1999 Netscape Enterprise Server and Microsoft's IIS emerged as the leading commercial options, whereas among the freely available and open-source programs the Apache HTTP Server held the lead as the preferred server (because of its reliability and its many features).

In those years there was also another commercial, highly innovative and thus notable web server, called Zeus (now discontinued), that was known as one of the fastest and most scalable web servers available on the market, at least until the first decade of the 2000s, despite its low percentage of usage.

Apache was the most used web server from mid-1996 to the end of 2015 when, after a few years of decline, it was surpassed first by IIS and then by Nginx. Afterward IIS dropped to much lower usage percentages than Apache (see also market share).

From 2005–2006 Apache started to improve its speed and its scalability by introducing new performance features (e.g. the event MPM and a new content cache). As those new performance features were initially marked as experimental, they were not enabled by its users for a long time, and so Apache suffered even more from the competition of commercial servers and, above all, of other open-source servers which had meanwhile achieved far superior performance (mostly when serving static content) since the beginning of their development and which, at the time of the Apache decline, could also offer a long enough list of well-tested advanced features.

In fact, a few years after 2000, not only did other commercial and highly competitive web servers emerge, e.g. LiteSpeed, but also many other open-source programs, often of excellent quality and very high performance, among which Hiawatha, Cherokee HTTP server, Lighttpd and Nginx should be noted, along with derived/related products also available with commercial support.

Around 2007–2008 most popular web browsers increased their previous default limit of 2 persistent connections per host domain (a limit recommended by RFC 2616) to 4, 6 or 8 persistent connections per host domain, in order to speed up the retrieval of heavy web pages with lots of images and to mitigate the shortage of persistent connections dedicated to dynamic objects used for bi-directional notification of events in web pages. Within a year, these changes, on average, nearly tripled the maximum number of persistent connections that web servers had to manage. This trend (of increasing the number of persistent connections) definitely gave a strong impetus to the adoption of reverse proxies in front of slower web servers, and it also gave one more chance to the emerging new web servers that could show all their speed and their capability to handle very high numbers of concurrent connections without requiring too many hardware resources (expensive computers with lots of CPUs, RAM and fast disks).

New challenges (2015 and later years)

In 2015 a new protocol version, HTTP/2, was published as an RFC, and since implementing the new specification was not trivial at all, a dilemma arose among developers of less popular web servers (e.g. those with a usage share lower than 1% to 2%) about whether or not to add support for the new protocol version.

In fact, supporting HTTP/2 often required radical changes to their internal implementation due to many factors (practically always-required encrypted connections, the capability to distinguish between HTTP/1.x and HTTP/2 connections on the same TCP port, the binary representation of HTTP messages, message priority, compression of HTTP headers, the use of streams, also known as TCP/IP sub-connections, and the related flow control, etc.), and so a few developers of those web servers opted for not supporting the new HTTP/2 version (at least in the near future), mainly for these reasons:

  • the HTTP/1.x protocols would be supported by browsers anyway for a very long time (maybe forever), so that there would be no incompatibility between clients and servers in the near future;
  • implementing HTTP/2 was considered a task of overwhelming complexity that could open the door to a whole new class of bugs that did not exist before 2015, and so it would have required notable investments in developing and testing the implementation of the new protocol;
  • adding HTTP/2 support could always be done in the future, should the effort be justified.

Instead, the developers of the most popular web servers rushed to offer the new protocol, not only because they had the workforce and the time to do so, but also because usually their previous implementation of the SPDY protocol could be reused as a starting point and because the most used web browsers implemented HTTP/2 very quickly for the same reason. Another reason that prompted those developers to act quickly was that webmasters felt the pressure of ever-increasing web traffic and really wanted to install and try, as soon as possible, something that could drastically lower the number of TCP/IP connections and speed up access to hosted websites.

In 2020–2021 the dynamics of HTTP/2 implementation (by top web servers and popular web browsers) were partly replicated after the publication of advanced drafts of the future RFC on the HTTP/3 protocol.

Technical overview

[Image: PC clients connected to a web server via the Internet]

The following technical overview should be considered only as an attempt to give a few, very limited examples of some features that may be implemented in a web server and some of the tasks that it may perform, in order to provide a sufficiently broad picture of the topic.

A web server program plays the role of a server in a client–server model by implementing one or more versions of HTTP protocol, often including the HTTPS secure variant and other features and extensions that are considered useful for its planned usage.

The complexity and the efficiency of a web server program may vary a lot depending on, for example:

  • common features implemented;
  • common tasks performed;
  • the performance and scalability level aimed at as a goal;
  • the software model and techniques adopted to achieve the desired performance and scalability level;
  • target hardware and category of usage, e.g. embedded system, low-medium traffic web server, high traffic Internet web server.

Common features

Although web server programs differ in how they are implemented, most of them offer the following common, basic features.

  • Static content serving: the ability to serve static content (web files) to clients via the HTTP protocol.
  • HTTP: support for one or more versions of the HTTP protocol in order to send versions of HTTP responses compatible with the versions of client HTTP requests, e.g. HTTP/1.0, HTTP/1.1 (optionally also over encrypted connections, i.e. HTTPS), plus, if available, HTTP/2 and HTTP/3.
  • Logging: usually web servers also have the capability of logging some information about client requests and server responses to log files, for security and statistical purposes.

A few other, more advanced and popular features (only a very short selection) include dynamic content serving through gateway interfaces, URL redirection, authorization and content caching, which are described in the following sections.

Common tasks

A web server program, when it is running, usually performs several general tasks, for example:

  • starts, optionally reads and applies settings found in its configuration file(s) or elsewhere, optionally opens the log file, and starts listening to client connections/requests;
  • optionally tries to adapt its general behavior according to its settings and its current operating conditions;
  • manages client connections (accepting new ones or closing existing ones as required);
  • receives client requests (by reading HTTP messages);
  • executes or refuses each requested HTTP method;
  • replies to client requests by sending proper HTTP responses (e.g. requested resources or error messages), optionally verifying or adding HTTP headers to those sent by dynamic programs / modules;
  • optionally logs (partially or totally) client requests and/or its responses to an external user log file or to a system log file via syslog, usually using the common log format (see the sketch after this list);
  • optionally logs process messages about detected anomalies or other notable events (e.g. in client requests or in its internal functioning) using syslog or some other system facility; these log messages usually have a debug, warning, error or alert level, which can be filtered (not logged) depending on settings (see also severity level);
  • optionally generates statistics about the web traffic managed and/or its performance;
  • other custom tasks.
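
As an illustration of the logging task mentioned in the list above, the following Python sketch formats one request/response pair in the Common Log Format; the function name and the sample values are invented for this example, not taken from any particular web server.

from datetime import datetime, timezone

def common_log_line(client_ip, user, method, path, http_version, status, size):
    # Build one line in the Common Log Format:
    # host ident authuser [date] "request line" status bytes
    timestamp = datetime.now(timezone.utc).strftime("%d/%b/%Y:%H:%M:%S %z")
    return (f'{client_ip} - {user or "-"} [{timestamp}] '
            f'"{method} {path} {http_version}" {status} {size}')

print(common_log_line("127.0.0.1", "frank", "GET", "/apache_pb.gif", "HTTP/1.0", 200, 2326))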

Read request message

Web server programs are able:

  • to read an HTTP request message;
  • to interpret it;
  • to verify its syntax;
  • to identify known HTTP headers and to extract their values.

Once an HTTP request message has been decoded and verified, its values can be used to determine whether that request can be satisfied or not. This requires many other steps, including security checks.
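
As a rough sketch of the reading steps just described, the following Python code (assuming a complete HTTP/1.1 request already received as text, and ignoring any message body) splits the request line from the header fields and extracts the header values; the example request is the one used later in this article.

def parse_http_request(raw):
    # Separate the head (request line + headers) from an optional body.
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)   # request line
    headers = {}
    for line in lines[1:]:                             # header fields
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers

request = ("GET /path/file.html HTTP/1.1\r\n"
           "Host: www.example.com\r\n"
           "Connection: keep-alive\r\n\r\n")
print(parse_http_request(request))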

URL normalization

Web server programs usually perform some type of URL normalization (on the URL found in most HTTP request messages) in order:

  • to make the resource path always a clean, uniform path from the root directory of the website;
  • to lower security risks (e.g. by more easily intercepting attempts to access static resources outside the root directory of the website, or attempts to access portions of the path below the website root directory that are forbidden or that require authorization);
  • to make the paths of web resources more recognizable to human beings and to web log analysis programs (also known as log analyzers / statistical applications).

The term URL normalization refers to the process of modifying and standardizing a URL in a consistent manner. There are several types of normalization that may be performed, including the conversion of the scheme and host to lowercase. Among the most important normalizations are the removal of "." and ".." path segments and the addition of a trailing slash to a non-empty path component.
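
The following Python sketch applies a few of the normalizations just mentioned (lowercasing the scheme and host, removing "." and ".." path segments, preserving a trailing slash); it is illustrative and not a complete implementation of URL normalization.

from urllib.parse import urlsplit, urlunsplit
import posixpath

def normalize_url(url):
    parts = urlsplit(url)
    scheme = parts.scheme.lower()                  # lowercase the scheme
    host = parts.netloc.lower()                    # lowercase the host
    path = posixpath.normpath(parts.path or "/")   # remove "." and ".." segments
    if parts.path.endswith("/") and not path.endswith("/"):
        path += "/"                                # keep the trailing slash if one was given
    return urlunsplit((scheme, host, path, parts.query, parts.fragment))

print(normalize_url("HTTP://WWW.Example.COM/a/./b/../c/"))
# prints: http://www.example.com/a/c/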

URL mapping

"URL mapping is the process by which a URL is analyzed to figure out what resource it is referring to, so that that resource can be returned to the requesting client. This process is performed with every request that is made to a web server, with some of the requests being served with a file, such as an HTML document, or a gif image, others with the results of running a CGI program, and others by some other process, such as a built-in module handler, a PHP document, or a Java servlet."

In practice, web server programs that implement advanced features, beyond the simple static content serving (e.g. URL rewrite engine, dynamic content serving), usually have to figure out how that URL has to be handled, e.g.:

  • as a URL redirection, a redirection to another URL;
  • as a static request of file content;
  • as a dynamic request of:
    • directory listing of files or other sub-directories contained in that directory;
    • other types of dynamic request in order to identify the program / module processor able to handle that kind of URL path and to pass to it other URL parts, i.e. usually path-info and query string variables.

One or more configuration files of the web server may specify the mapping of parts of the URL path (e.g. initial parts of the file path, the filename extension and other path components) to a specific URL handler (a file, a directory, an external program or an internal module).

When a web server implements one or more of the above-mentioned advanced features, the path part of a valid URL may not always match an existing file system path under the website directory tree (a file or a directory in the file system), because it can refer to a virtual name of an internal or external module processor for dynamic requests.
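
A minimal sketch of how such a configured mapping might be applied, with prefix rules checked before filename-extension rules; the table entries and handler names below are invented for illustration and do not correspond to any real server's configuration syntax.

# Hypothetical mapping rules: URL prefix -> handler, then extension -> handler.
PREFIX_HANDLERS = {
    "/cgi-bin/": "cgi",          # external program
    "/static/":  "static",       # plain file
}
EXTENSION_HANDLERS = {
    ".php": "php_module",        # internal or external module processor
}

def map_url(path):
    # Longest configured prefix wins, then the filename extension, else static.
    for prefix, handler in sorted(PREFIX_HANDLERS.items(),
                                  key=lambda item: len(item[0]), reverse=True):
        if path.startswith(prefix):
            return handler
    for ext, handler in EXTENSION_HANDLERS.items():
        if path.endswith(ext):
            return handler
    return "static"

print(map_url("/cgi-bin/forum.php"))   # prints: cgi
print(map_url("/blog/index.php"))      # prints: php_module
print(map_url("/images/logo.png"))     # prints: static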

URL path translation to file system

Web server programs are able to translate a URL path (all of it, or a part of it) that refers to a physical file system path into an absolute path under the target website's root directory.

The website's root directory may be specified by a configuration file or by some internal rule of the web server, using the name of the website, which is the host part of the URL found in the HTTP client request.

Path translation to file system is done for the following types of web resources:

  • a local, usually non-executable, file (static request for file content);
  • a local directory (dynamic request: directory listing generated on the fly);
  • a program name (a dynamic request that is executed using a CGI or SCGI interface and whose output is read by the web server and resent to the client that made the HTTP request).

The web server takes the path found in the requested URL (HTTP request message) and appends it to the path of the (host) website's root directory. On an Apache server, this is commonly /home/www/website (on Unix machines, it is usually /var/www/website). See the following examples of how this may result.
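
A sketch of this translation step in Python, including a basic check that the resulting path cannot escape the website root directory; the root path is the illustrative one used in the examples below.

import os

WEBSITE_ROOT = "/home/www/www.example.com"   # illustrative root directory

def translate_path(url_path):
    # Append the URL path to the website root and refuse paths that escape it.
    candidate = os.path.normpath(os.path.join(WEBSITE_ROOT, url_path.lstrip("/")))
    if candidate != WEBSITE_ROOT and not candidate.startswith(WEBSITE_ROOT + os.sep):
        raise PermissionError("path escapes the website root directory")
    return candidate

print(translate_path("/path/file.html"))
# prints: /home/www/www.example.com/path/file.html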

URL path translation for a static file request

Example of a static request of an existing file specified by the following URL:

http://www.example.com/path/file.html

The client's user agent connects to www.example.com and then sends the following HTTP/1.1 request:

GET /path/file.html HTTP/1.1
Host: www.example.com
Connection: keep-alive

The result is the local file system resource:

/home/www/www.example.com/path/file.html

The web server then reads the file, if it exists, and sends a response to the client's web browser. The response will describe the content of the file and contain the file itself, or an error message will be returned saying that the file does not exist or that access to it is forbidden.

URL path translation for a directory request (without a static index file)

Example of an implicit dynamic request of an existing directory specified by the following URL:

http://www.example.com/directory1/directory2/

The client's user agent connects to www.example.com and then sends the following HTTP/1.1 request:

GET /directory1/directory2/ HTTP/1.1
Host: www.example.com
Connection: keep-alive

The result is the local directory path:

/home/www/www.example.com/directory1/directory2/

The web server then verifies the existence of the directory, and if it exists and can be accessed, it tries to find an index file (which in this case does not exist); it therefore passes the request to an internal module or to a program dedicated to directory listings, reads the resulting data output, and finally sends a response to the client's web browser. The response will describe the content of the directory (the list of contained subdirectories and files), or an error message will be returned saying that the directory does not exist or that access to it is forbidden.

URL path translation for a dynamic program request

For a dynamic request, the URL path specified by the client should refer to an existing external program (usually an executable file invoked through CGI) used by the web server to generate dynamic content.

Example of a dynamic request using a program file to generate output:

http://www.example.com/cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15

The client's user agent connects to www.example.com and then sends the following HTTP/1.1 request:

GET /cgi-bin/forum.php?action=view&orderby=thread&date=2021-10-15 HTTP/1.1
Host: www.example.com
Connection: keep-alive

The result is the local file path of the program (in this example, a PHP program):

/home/www/www.example.com/cgi-bin/forum.php

The web server executes that program, passing in the path info and the query string action=view&orderby=thread&date=2021-10-15 so that the program has the information it needs to run. (In this case, it will return an HTML document containing a view of forum entries ordered by thread, dated October 15, 2021.) In addition, the web server reads the data sent back by the external program and resends that data to the client that made the request.
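
For illustration only, an external program of this kind might look like the following CGI-style Python sketch, which reads the query string from the QUERY_STRING environment variable and writes an HTML response to standard output; the forum logic itself is omitted and the parameter names are those of the example URL.

#!/usr/bin/env python3
import os
from urllib.parse import parse_qs

# Read the request parameters passed by the web server via the environment.
params = parse_qs(os.environ.get("QUERY_STRING", ""))
action = params.get("action", ["view"])[0]
orderby = params.get("orderby", ["thread"])[0]
date = params.get("date", [""])[0]

# Emit the response headers, a blank line, then the body.
print("Content-Type: text/html")
print()
print(f"<html><body><h1>Forum: {action}, ordered by {orderby}, date {date}</h1>"
      "<!-- the real forum entries would be rendered here --></body></html>")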

Manage request message

Once a request has been read, interpreted, and verified, it has to be managed depending on its method, its URL, and its parameters, which may include values of HTTP headers.

In practice, the web server has to handle the request by using one of these response paths (a condensed sketch follows the list):

  • if something in the request was not acceptable (in the status line or in the message headers), the web server has already sent an error response;
  • if the request has a method (e.g. OPTIONS) that can be satisfied by the general code of the web server, then a successful response is sent;
  • if the URL requires authorization, then an authorization error message is sent;
  • if the URL maps to a redirection, then a redirect message is sent;
  • if the URL maps to a dynamic resource (a virtual path or a directory listing), then its handler (an internal module or an external program) is called and the request parameters (query string and path info) are passed to it in order to allow it to reply to that request;
  • if the URL maps to a static resource (usually a file on the file system), then the internal static handler is called to send that file;
  • if the request method is not known, or if there is some other unacceptable condition (e.g. resource not found, internal server error, etc.), then an error response is sent.
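
A condensed Python sketch of the dispatch logic in the list above; the helper functions and the ctx dictionary are placeholders invented for this example, not a real server API.

def error_response(status):
    return f"error {status}"

def redirect_response(location):
    return f"redirect to {location}"

def handle_request(method, url, ctx):
    # Follow the response paths listed above, in order.
    if not ctx.get("request_ok", True):
        return error_response(400)                     # unacceptable request line or headers
    if method == "OPTIONS":
        return "200 OK (Allow: OPTIONS, HEAD, GET)"    # satisfied by general server code
    if ctx.get("needs_auth") and not ctx.get("authorized"):
        return error_response(401)                     # authorization required
    if url in ctx.get("redirects", {}):
        return redirect_response(ctx["redirects"][url])
    if ctx.get("is_dynamic"):
        return f"dynamic content for {url}"            # internal module or external program
    if ctx.get("is_static"):
        return f"static file content for {url}"        # internal static handler
    return error_response(404)                         # unknown method or resource not found

print(handle_request("GET", "/path/file.html", {"is_static": True}))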

Serve static content

[Image: PC clients communicating via the network with a web server serving static content only]

If a web server program is capable of serving static content and has been configured to do so, then it is able to send file content whenever a request message has a valid URL path that matches (after URL mapping, URL translation and URL redirection) the path of an existing file under the root directory of a website, and that file has attributes matching those required by the internal rules of the web server program.

That kind of content is called static because usually it is not changed by the web server when it is sent to clients and because it remains the same until it is modified (file modification) by some program.

NOTE: when serving static content only, a web server program usually does not change the file contents of the served websites (they are only read and never written), and so it suffices to support only these HTTP methods:

  • OPTIONS
  • HEAD
  • GET

Responses containing static file content can be sped up by a file cache.
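
For illustration only, Python's standard http.server module behaves like the minimal static-content server described here: it answers GET and HEAD requests for regular files found under a given root directory. This is a demonstration sketch (the directory path is hypothetical), not a production web server.

from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Serve regular files found under the (hypothetical) website root directory.
handler = partial(SimpleHTTPRequestHandler, directory="/home/www/www.example.com")

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), handler).serve_forever()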

Directory index files

If a web server program receives a client request message with a URL whose path matches that of an existing directory, that directory is accessible, and serving directory index file(s) is enabled, then the web server program may try to serve the first of the known (or configured) static index file names (a regular file) found in that directory; if no index file is found, or if other conditions are not met, then an error message is returned.

The most commonly used names for static index files are index.html, index.htm and Default.htm.
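
A sketch of the index-file lookup described above, using the common names from the previous sentence; the directory path in the example call is illustrative.

import os

INDEX_NAMES = ("index.html", "index.htm", "Default.htm")   # commonly configured names

def find_index_file(directory):
    # Return the first existing regular index file in the directory, or None.
    for name in INDEX_NAMES:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None   # the caller may then generate a directory listing or return an error

print(find_index_file("/home/www/www.example.com/directory1/directory2/"))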

Regular files

If a web server program receives a client request message with a URL whose path matches the file name of an existing file, that file is accessible by the web server program, and its attributes match the internal rules of the web server program, then the web server program can send that file to the client.

Usually, for security reasons, most web server programs are pre-configured to serve only regular files and to avoid serving special file types such as device files, as well as symbolic links or hard links to them. The aim is to avoid undesirable side effects when serving static web resources.

Serve dynamic content

[Image: PC clients communicating via the network with a web server serving static and dynamic content]

If a web server program is capable of serving dynamic content and has been configured to do so, then it is able to communicate with the proper internal module or external program (associated with the requested URL path) in order to pass to it the parameters of the client request; after that, the web server program reads the data response from it (usually generated on the fly) and resends it to the client program that made the request.

NOTE: when serving static and dynamic content, a web server program usually also has to support the following HTTP method in order to be able to safely receive data from client(s), and thus to be able to host websites with interactive form(s) that may send large data sets (e.g. lots of data entry or file uploads) to the web server / external programs / modules:

  • POST

In order to be able to communicate with its internal modules and/or external programs, a web server program must implement one or more of the many available gateway interfaces (see also Web Server Gateway Interfaces used for dynamic content).

The three standard and historical gateway interfaces are described below; a sketch of the CGI case follows them.

CGI
An external CGI program is run by the web server program for each dynamic request; the web server program then reads the generated data response from it and resends that response to the client.
SCGI
An external SCGI program (usually a process) is started once by the web server program or by some other program / process and then waits for network connections; every time there is a new request for it, the web server program makes a new network connection to it in order to send the request parameters and to read its data response, and then the network connection is closed.
FastCGI
An external FastCGI program (usually a process) is started once by the web server program or by some other program / process and then waits for a network connection which is established permanently by the web server; through that connection the request parameters are sent and the data responses are read.
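
As an illustration of the CGI case, a web server typically runs the external program once per request, passes the request data through environment variables, and reads the response (headers plus body) from the program's standard output. The following Python sketch shows that idea; the script path and variables are examples only.

import subprocess

def run_cgi(script_path, query_string, method="GET"):
    # Run the external CGI program once for this request and return its raw output.
    env = {
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": query_string,
        "SERVER_PROTOCOL": "HTTP/1.1",
    }
    completed = subprocess.run([script_path], env=env, capture_output=True, check=True)
    return completed.stdout   # response headers + body produced by the CGI program

# Example (assumes an executable CGI script exists at this hypothetical path):
# output = run_cgi("/home/www/www.example.com/cgi-bin/forum.php",
#                  "action=view&orderby=thread&date=2021-10-15")
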
Directory listings
[Image: Directory listing dynamically generated by a web server]

A web server program may be able to manage the dynamic generation (on the fly) of a directory index listing files and sub-directories.

If a web server program is configured to do so, a requested URL path matches an existing directory, access to that directory is allowed, and no static index file is found under it, then a web page (usually in HTML format) containing the list of files and/or subdirectories of the above-mentioned directory is dynamically generated (on the fly). If it cannot be generated, an error is returned.

Some web server programs allow the customization of directory listings: by allowing the use of a web page template (an HTML document containing placeholders, e.g. $(FILE_NAME), $(FILE_SIZE), etc., that are replaced by the web server with the field values of each file entry found in the directory), e.g. index.tpl; by allowing the use of HTML with embedded source code that is interpreted and executed on the fly, e.g. index.asp; and/or by supporting the use of dynamic index programs such as CGIs, SCGIs and FCGIs, e.g. index.cgi, index.php, index.fcgi.
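
A Python sketch of the template-substitution idea mentioned above, filling hypothetical $(FILE_NAME) and $(FILE_SIZE) placeholders for each entry found in a directory; real servers use their own template formats.

import os

ROW_TEMPLATE = '<li><a href="$(FILE_NAME)">$(FILE_NAME)</a> ($(FILE_SIZE) bytes)</li>'

def render_listing(directory):
    # Replace the placeholders of the row template with the values of each entry.
    rows = []
    for name in sorted(os.listdir(directory)):
        size = os.path.getsize(os.path.join(directory, name))
        rows.append(ROW_TEMPLATE.replace("$(FILE_NAME)", name)
                                .replace("$(FILE_SIZE)", str(size)))
    return "<html><body><ul>\n" + "\n".join(rows) + "\n</ul></body></html>"

print(render_listing("."))   # list the current directory as a small HTML page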

The use of dynamically generated directory listings is usually avoided, or limited to a few selected directories of a website, because that generation uses many more OS resources than sending a static index page.

The main use of directory listings is to allow the download of files as they are (usually when their names, sizes, modification date-times or file attributes may change randomly / frequently), without requiring any further information to be provided to the requesting user.

Program or module processing

An external program or an internal module (processing unit) can execute some sort of application function that may be used to get data from or to store data to one or more data repositories, e.g.:

  • files (file system);
  • databases (DBs);
  • other sources located in local computer or in other computers.

A processing unit can return any kind of web content, including content built by using data retrieved from a data repository.

In practice, whenever there is content that may vary depending on one or more parameters contained in the client request or in configuration settings, it is usually generated dynamically.

Send response message

Web server programs are able to send response messages as replies to client request messages.

An error response message may be sent because a request message could not be successfully read or decoded or analyzed or executed.

NOTE: the following sections are reported only as examples to help understand what a web server, more or less, does; these sections are by no means exhaustive or complete.

Error message

A web server program may reply to a client request message with many kinds of error messages; anyway, these errors are divided mainly into two categories: errors caused by the client request (4xx status codes) and errors caused by the server itself or its internal components (5xx status codes).

When an error response / message is received by a client browser, then, if it is related to the main user request (e.g. a URL of a web resource such as a web page), that error message is usually shown in some browser window / message.

URL authorization

A web server program may be able to verify whether the requested URL path:

  • can be freely accessed by everybody;
  • requires user authentication (a request for user credentials, e.g. a user name and a password);
  • is forbidden to some or all kinds of users.

If the authorization / access rights feature has been implemented and enabled and access to the web resource is not granted, then, depending on the required access rights, the web server program (see the sketch after this list):

  • can deny access by sending a specific error message (e.g. access forbidden);
  • may deny access by sending a specific error message (e.g. access unauthorized) that usually forces the client browser to ask the human user to provide the required user credentials; if authentication credentials are provided, then the web server program verifies and accepts or rejects them.
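
A sketch of the two outcomes just listed: a 403 response for forbidden paths and a 401 response that asks the browser for Basic credentials. The access rules and the credential table are hypothetical and shown only to illustrate the flow.

import base64

ACCESS_RULES = {                          # hypothetical per-path rules
    "/private/": "forbidden",
    "/admin/":   "basic-auth",
}
VALID_CREDENTIALS = {"admin": "secret"}   # illustration only; never store plain passwords

def authorize(path, auth_header=None):
    # Return an HTTP status code and any extra response headers.
    for prefix, rule in ACCESS_RULES.items():
        if not path.startswith(prefix):
            continue
        if rule == "forbidden":
            return 403, {}                                           # access forbidden
        if rule == "basic-auth":
            if auth_header and auth_header.startswith("Basic "):
                user, _, password = base64.b64decode(auth_header[6:]).decode().partition(":")
                if VALID_CREDENTIALS.get(user) == password:
                    return 200, {}                                   # credentials accepted
            return 401, {"WWW-Authenticate": 'Basic realm="admin"'}  # ask for credentials
    return 200, {}                                                   # freely accessible

print(authorize("/admin/stats.html"))
# prints: (401, {'WWW-Authenticate': 'Basic realm="admin"'})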

URL redirection

A web server program may have the capability of performing URL redirections to new URLs (new locations), which consists in replying to a client request message with a response message containing a new URL suited to access a valid or existing web resource (the client should then redo the request with the new URL).

URL redirection of location is used:

  • to fix a directory name by adding a final slash '/';
  • to give a new URL for a no longer existing URL path, pointing to a new path where that kind of web resource can be found;
  • to give a new URL on another domain when the current domain has too much load.

Example 1: a URL path points to a directory name but does not have a final slash '/', so the web server sends a redirect to the client in order to instruct it to redo the request with the fixed path name.

From:
  /directory1/directory2
To:
  /directory1/directory2/

Example 2: a whole set of documents has been moved inside website in order to reorganize their file system paths.

From:
  /directory1/directory2/2021-10-08/
To:
  /directory1/directory2/2021/10/08/

Example 3: a whole set of documents has been moved to a new website and now it is mandatory to use secure HTTPS connections to access them.

From:
  http://www.example.com/directory1/directory2/2021-10-08/
To:
  https://docs.example.com/directory1/2021-10-08/

The above examples are only a few of the possible kinds of redirection.
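
Example 1 can be expressed as a small Python sketch: when the requested path names an existing directory but lacks its final slash, the server replies with a redirect status and a Location header pointing at the fixed path (the exact status code used varies by server and configuration; the root directory is the illustrative one used earlier).

import os

WEBSITE_ROOT = "/home/www/www.example.com"   # illustrative root directory

def maybe_redirect(path):
    # Redirect directory requests that are missing their final slash.
    fs_path = os.path.join(WEBSITE_ROOT, path.lstrip("/"))
    if os.path.isdir(fs_path) and not path.endswith("/"):
        return ("HTTP/1.1 301 Moved Permanently\r\n"
                f"Location: {path}/\r\n"
                "Content-Length: 0\r\n\r\n")
    return None   # no redirect needed

print(maybe_redirect("/directory1/directory2"))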

Successful message

A web server program is able to reply to a valid client request message with a successful message, optionally containing requested web resource data.

If web resource data is sent back to client, then it can be static content or dynamic content depending on how it has been retrieved (from a file or from the output of some program / module).

Content cache

In order to speed up web server responses by lowering average HTTP response times and the hardware resources used, many popular web servers implement one or more content caches, each one specialized in a content category.

Content is usually cached by its origin, e.g. static content read from files (file cache) and dynamic content generated by programs or modules (dynamic cache), as described in the following subsections.

File cache

Historically, static content found in files that had to be accessed frequently, randomly and quickly has been stored mostly on electro-mechanical disks since the mid-late 1960s / 1970s; unfortunately, reads from and writes to those kinds of devices have always been considered very slow operations when compared to RAM speed, and so, since early OSs, first disk caches and then OS file cache sub-systems were developed to speed up I/O operations on frequently accessed data / files.

Even with the aid of an OS file cache, the relative / occasional slowness of I/O operations involving directories and files stored on disks soon became a bottleneck in the increase of performance expected from top-level web servers, especially since the mid-late 1990s, when web Internet traffic started to grow exponentially along with the constant increase in the speed of Internet / network lines.

The problem of how to further and efficiently speed up the serving of static files, thus increasing the maximum number of requests/responses per second (RPS), started to be studied and researched in the mid-1990s, with the aim of proposing useful cache models that could be implemented in web server programs.

In practice, nowadays, many popular / high performance web server programs include their own userland file cache, tailored for a web server usage and using their specific implementation and parameters.

The widespread adoption of RAID and/or fast solid-state drives (storage hardware with very high I/O speed) has slightly reduced, but of course not eliminated, the advantage of having a file cache incorporated in a web server.

Dynamic cache

Dynamic content, output by an internal module or an external program, may not always change very frequently (given a unique URL with keys / parameters), and so, perhaps for a while (e.g. from 1 second to several hours or more), the resulting output can be cached in RAM or even on a fast disk.

The typical use of a dynamic cache is when a website has dynamic web pages about news, weather, images, maps, etc. that do not change frequently (e.g. every n minutes) and that are accessed by a huge number of clients per minute / hour; in those cases it is useful to return cached content too (without calling the internal module or the external program), because clients often do not have an updated copy of the requested content in their browser caches.

In most cases, though, those kinds of caches are implemented by external servers (e.g. a reverse proxy) or by storing dynamic data output in separate computers managed by specific applications (e.g. memcached), in order not to compete for hardware resources (CPU, RAM, disks) with the web server(s).
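
A Python sketch of a time-limited (TTL) dynamic-content cache of the kind described above, keyed by the full URL with its parameters; the generator callback and the 60-second lifetime are arbitrary examples, and real deployments often rely on an external cache such as a reverse proxy or memcached instead.

import time

CACHE = {}          # url -> (expiry timestamp, content)
TTL_SECONDS = 60    # arbitrary example lifetime

def cached_dynamic_content(url, generate):
    # Return cached output for this URL if still fresh, otherwise regenerate and cache it.
    now = time.time()
    entry = CACHE.get(url)
    if entry and entry[0] > now:
        return entry[1]                       # fresh copy: skip the module / program call
    content = generate(url)                   # call the internal module or external program
    CACHE[url] = (now + TTL_SECONDS, content)
    return content

print(cached_dynamic_content("/news?page=1", lambda u: f"<html>news for {u}</html>"))
print(cached_dynamic_content("/news?page=1", lambda u: "not called while the cache is fresh"))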

Kernel-mode and user-mode web servers

Web server software can either be incorporated into the OS and executed in kernel space, or it can be executed in user space (like other regular applications).

Web servers that run in kernel mode (usually called kernel-space web servers) can have direct access to kernel resources and so they can be, in theory, faster than those running in user mode; however, there are disadvantages in running a web server in kernel mode, e.g. difficulties in developing (debugging) the software, and the fact that run-time critical errors may lead to serious problems in the OS kernel.

Web servers that run in user-mode have to ask the system for permission to use more memory or more CPU resources. Not only do these requests to the kernel take time, but they might not always be satisfied because the system reserves resources for its own usage and has the responsibility to share hardware resources with all the other running applications. Executing in user mode can also mean using more buffer/data copies (between user-space and kernel-space) which can lead to a decrease in the performance of a user-mode web server.

Nowadays almost all web server software is executed in user mode (because many of the aforementioned small disadvantages have been overcome by faster hardware, new OS versions, much faster OS system calls and new optimized web server software). See also the comparison of web server software to discover which of them run in kernel mode or in user mode (also referred to as kernel space or user space).

Performances

To improve the user experience (on the client / browser side), a web server should reply quickly (as soon as possible) to client requests; unless the content response is throttled (by configuration) for some types of files (e.g. big or huge files), the returned data content should also be sent as fast as possible (high transfer speed).

In other words, a web server should always be very responsive, even under a high load of web traffic, in order to keep the user's total wait (the sum of browser time + network time + web server response time) for a response as low as possible.

Performance metrics

For web server software, the main key performance metrics (measured under varying operating conditions) usually are at least the following:

  • number of requests per second (RPS, similar to QPS, depending on HTTP version and configuration, type of HTTP requests and other operating conditions);
  • number of connections per second (CPS), the number of connections per second accepted by the web server (useful when using HTTP/1.0 or HTTP/1.1 with a very low limit of requests / responses per connection, i.e. 1 .. 20);
  • network latency + response time for each new client request; usually the benchmark tool shows how many requests have been satisfied within a range of time intervals (e.g. within 1 ms, 3 ms, 5 ms, 10 ms, 20 ms, 30 ms, 40 ms) and / or the shortest, the average and the longest response time;
  • throughput of responses, in bytes per second.

Among the operating conditions, the number (1 .. n) of concurrent client connections used during a test is an important parameter because it makes it possible to correlate the concurrency level supported by the web server with the results of the tested performance metrics.
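
As a very rough illustration of measuring two of the metrics above (requests per second and per-request latency) with a single client connection against a local test server, consider the following Python sketch; the URL and request count are arbitrary, and real benchmarks use dedicated load-testing tools with many concurrent connections.

import time
import urllib.request

URL = "http://127.0.0.1:8080/"   # hypothetical server under test
N_REQUESTS = 100

def tiny_benchmark():
    latencies = []
    start = time.perf_counter()
    for _ in range(N_REQUESTS):
        t0 = time.perf_counter()
        with urllib.request.urlopen(URL) as response:
            response.read()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    print(f"requests per second: {N_REQUESTS / elapsed:.1f}")
    print(f"latency avg: {sum(latencies) / len(latencies) * 1000:.1f} ms, "
          f"max: {max(latencies) * 1000:.1f} ms")

# tiny_benchmark()   # run only while a test server is listening at URL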

Software efficiency

The specific web server software design and model adopted (e.g.):

  • single process or multi-process;
  • single thread (no thread) or multi-thread for each process;
  • usage of coroutines or not;

... and other programming techniques used to implement a web server program can greatly affect the performance, and in particular the scalability level, that can be achieved under heavy load or when using high-end hardware (many CPUs and disks and lots of RAM).

In practice, some web server software models may require more OS resources (especially more CPUs and more RAM) than others in order to work well and achieve target performance.

Operating conditions

There are many operating conditions that can affect the performance of a web server; performance values may vary depending on, for example:

  • the settings of the web server (including whether the log file is enabled or not, etc.);
  • the HTTP version used by client requests;
  • the average HTTP request type (method, length of HTTP headers and optional body);
  • whether the requested content is static or dynamic;
  • whether the content is cached or not cached (by server and/or by client);
  • whether the content is compressed on the fly (when transferred), pre-compressed (i.e. when a file resource is stored on disk already compressed so that web server can send that file directly to the network with the only indication that its content is compressed) or not compressed at all;
  • whether the connections are or are not encrypted;
  • the average network speed between web server and its clients;
  • the number of active TCP connections;
  • the number of active processes managed by web server (including external CGI, SCGI, FCGI programs);
  • the hardware and software limitations or settings of the OS of the computer(s) on which the web server runs;
  • other minor conditions.

Benchmarking

Performances of a web server are typically benchmarked by using one or more of the available automated load testing tools.

Load limits

A web server (program installation) usually has pre-defined load limits for each combination of operating conditions, also because it is limited by OS resources and because it can handle only a limited number of concurrent client connections (usually between 2 and several tens of thousands for each active web server process; see also the C10k problem and the C10M problem).

When a web server is near to or over its load limits, it gets overloaded and so it may become unresponsive.

Causes of overload

At any time web servers can be overloaded due to one or more of the following causes.

  • Excess legitimate web traffic. Thousands or even millions of clients connecting to the website in a short amount of time, e.g., Slashdot effect.
  • Distributed Denial of Service attacks. A denial-of-service attack (DoS attack) or distributed denial-of-service attack (DDoS attack) is an attempt to make a computer or network resource unavailable to its intended users.
  • Computer worms that sometimes cause abnormal traffic because of millions of infected computers (not coordinated among them).
  • XSS worms can cause high traffic because of millions of infected browsers or web servers.
  • Internet bot traffic that is not filtered/limited on large websites with very few network resources (e.g. bandwidth) and/or hardware resources (CPUs, RAM, disks).
  • Internet (network) slowdowns (e.g. due to packet losses) so that client requests are served more slowly and the number of connections increases so much that server limits are reached.
  • Web servers serving dynamic content and waiting for slow responses coming from back-end computer(s) (e.g. databases), perhaps because of too many queries mixed with too many inserts or updates of DB data; in these cases web servers have to wait for back-end data responses before replying to HTTP clients, but during these waits too many new client connections / requests arrive, and so they become overloaded.
  • Web servers (computers) partial unavailability. This can happen because of required or urgent maintenance or upgrade, hardware or software failures such as back-end (e.g. database) failures; in these cases the remaining web servers may get too much traffic and become overloaded.

Symptoms of overload

The symptoms of an overloaded web server are usually the following.

  • Requests are served with (possibly long) delays (from 1 second to a few hundred seconds).
  • The web server returns an HTTP error code, such as 500, 502, 503, 504, 408, or even an intermittent 404.
  • The web server refuses or resets (interrupts) TCP connections before it returns any content.
  • In very rare cases, the web server returns only a part of the requested content. This behavior can be considered a bug, even if it usually arises as a symptom of overload.

Anti-overload techniques

To partially overcome the load limits described above and to prevent overload, most popular websites use common techniques like the following.

  • Tuning OS parameters for hardware capabilities and usage.
  • Tuning web server(s) parameters to improve their security and performances.
  • Deploying web cache techniques (not only for static contents but, whenever possible, for dynamic contents too).
  • Managing network traffic, by using:
    • Firewalls to block unwanted traffic coming from bad IP sources or having bad patterns;
    • HTTP traffic managers to drop, redirect or rewrite requests having bad HTTP patterns;
    • Bandwidth management and traffic shaping, in order to smooth down peaks in network usage.
  • Using different domain names, IP addresses and computers to serve different kinds (static and dynamic) of content; the aim is to separate big or huge files (download.*) (that domain might also be replaced by a CDN) from small and medium-sized files (static.*) and from the main dynamic site (perhaps one where some content is stored in a backend database) (www.*); the idea is to be able to efficiently serve big or huge (over 10 – 1000 MB) files (maybe throttling downloads) and to fully cache small and medium-sized files, without affecting the performance of the dynamic site under heavy load, by using different settings for each (group of) web server computers, e.g.:
    • https://download.example.com
    • https://static.example.com
    • https://www.example.com
  • Using many web servers (computers) that are grouped together behind a load balancer so that they act or are seen as one big web server.
  • Adding more hardware resources (i.e. RAM, fast disks) to each computer.
  • Using more efficient computer programs for web servers (see also: software efficiency).
  • Using the most efficient Web Server Gateway Interface to process dynamic requests (spawning one or more external programs every time a dynamic page is retrieved kills performance).
  • Using other programming techniques and workarounds, especially if dynamic content is involved, to speed up the HTTP responses (e.g. by avoiding dynamic calls to retrieve objects, such as style sheets, images and scripts, that never change or change very rarely, by copying that content to static files once and then keeping it synchronized with the dynamic content).
  • Using the latest efficient versions of HTTP (e.g. beyond using the common HTTP/1.1, also enabling HTTP/2 and maybe HTTP/3 too, whenever the available web server software has reliable support for the latter two protocols) in order to greatly reduce the number of TCP/IP connections started by each client and the size of the data exchanged (thanks to the more compact representation of HTTP headers and maybe data compression).

Caveats about using HTTP/2 and HTTP/3 protocols

Even if the newer HTTP protocols (2 and 3) usually generate less network traffic for each request / response, they may require more OS resources (i.e. RAM and CPU) on the web server side (because of encrypted data, lots of stream buffers and other implementation details); besides this, HTTP/2, and maybe HTTP/3 too, depending also on the settings of the web server and of the client program, may not be the best options for uploading big or huge files at very high speed, because their data streams are optimized for concurrency of requests, and so, in many cases, using HTTP/1.1 TCP/IP connections may lead to better results / higher upload speeds (your mileage may vary).

Phishing

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Phishing
[Image: an example of a phishing email, disguised as an official email from a (fictional) bank. The sender is attempting to trick the recipient into revealing confidential information by prompting them to "confirm" it at the phisher's website. Note the misspelling of the words received and discrepancy as recieved and discrepency, respectively.]

Phishing is a form of social engineering and a scam in which attackers deceive people into revealing sensitive information or installing malware such as ransomware. Phishing attacks have become increasingly sophisticated and often transparently mirror the site being targeted, allowing the attacker to observe everything the victim does while navigating the site and to traverse any additional security boundaries together with the victim. As of 2020, it is the most common type of cybercrime, with the FBI's Internet Crime Complaint Center reporting more incidents of phishing than any other type of computer crime.

The term "phishing" was first recorded in 1995 in the cracking toolkit AOHell, but may have been used earlier in the hacker magazine 2600. It is a variation of fishing and refers to the use of lures to "fish" for sensitive information.

Measures to prevent or reduce the impact of phishing attacks include legislation, user education, public awareness, and technical security measures. The importance of phishing awareness has increased in both personal and professional settings, with phishing attacks among businesses rising from 72% to 86% from 2017 to 2020.

Types

Email phishing

Phishing attacks, often delivered via email spam, attempt to trick individuals into giving away sensitive information or login credentials. Most attacks are "bulk attacks" that are not targeted and are instead sent in bulk to a wide audience. The goal of the attacker can vary, with common targets including financial institutions, email and cloud productivity providers, and streaming services. The stolen information or access may be used to steal money, install malware, or spear phish others within the target organization. Compromised streaming service accounts may also be sold on darknet markets.

This type of social engineering attack can involve sending fraudulent emails or messages that appear to be from a trusted source, such as a bank or government agency. These messages typically redirect to a fake login page where the user is prompted to enter their login credentials.

Spear phishing

Spear phishing is a targeted phishing attack that uses personalized emails to trick a specific individual or organization into believing they are legitimate. It often utilizes personal information about the target to increase the chances of success. These attacks often target executives or those in financial departments with access to sensitive financial data and services. Accountancy and audit firms are particularly vulnerable to spear phishing due to the value of the information their employees have access to.

Threat Group-4127 (Fancy Bear) targeted Hillary Clinton's 2016 presidential campaign with spear phishing attacks on over 1,800 Google accounts, using the accounts-google.com domain to threaten targeted users.

A study on spear-phishing susceptibility among different age groups found that 43% of 100 young users and 58% of older users clicked on simulated phishing links in daily emails over 21 days. Older women had the highest susceptibility; susceptibility among young users declined over the study but remained stable among older users.

Whaling and CEO fraud

Whaling attacks use spear phishing techniques to target senior executives and other high-profile individuals with customized content, often related to a subpoena or customer complaint.

CEO fraud involves sending fake emails from senior executives to trick employees into sending money to an offshore account. It has a low success rate, but can result in organizations losing large sums of money.

Clone phishing

Clone phishing is a type of attack where a legitimate email with an attachment or link is copied and modified to contain malicious content. The modified email is then sent from a fake address made to look like it's from the original sender. The attack may appear to be a resend or update of the original email. It often relies on the sender or recipient being previously hacked so the attacker can access the legitimate email.

Voice phishing

Voice over IP (VoIP) is used in vishing or voice phishing attacks, where attackers make automated phone calls to large numbers of people, often using text-to-speech synthesizers, claiming fraudulent activity on their accounts. The attackers spoof the calling phone number to appear as if it is coming from a legitimate bank or institution. The victim is then prompted to enter sensitive information or connected to a live person who uses social engineering tactics to obtain information. Vishing takes advantage of the public's lower awareness and trust in voice telephony compared to email phishing.

SMS phishing

SMS phishing or smishing is a type of phishing attack that uses text messages from a cell phone or smartphone to deliver a bait message. The victim is usually asked to click a link, call a phone number, or contact an email address provided by the attacker. They may then be asked to provide private information, such as login credentials for other websites. The difficulty in identifying illegitimate links can be compounded on mobile devices due to the limited display of URLs in mobile browsers. Smishing can be just as effective as email phishing, as many smartphones have fast internet connectivity. Smishing messages may also come from unusual phone numbers.

Page hijacking

Page hijacking involves redirecting users to malicious websites or exploit kits through the compromise of legitimate web pages, often using cross site scripting. Hackers may insert exploit kits such as MPack into compromised websites to exploit legitimate users visiting the server. Page hijacking can also involve the insertion of malicious inline frames, allowing exploit kits to load. This tactic is often used in conjunction with watering hole attacks on corporate targets.

Calendar phishing

Calendar phishing involves sending fake calendar invitations with phishing links. These invitations often mimic common event requests and can easily be added to calendars automatically. To protect against this form of fraud, former Google click fraud czar Shuman Ghosemajumder recommends changing calendar settings to not automatically add new invitations.

Quishing

QR codes have been used maliciously in phishing attacks. The term "quishing" involves deceiving individuals into thinking a QR code is harmless while the true intent is malicious, aiming to access sensitive information. Cybercriminals exploit the trust placed in QR codes, particularly on mobile phones, which are more vulnerable to attacks compared to desktop operating systems. Quishing attacks often involve sending QR codes via email, enticing users to scan them to verify accounts, leading to potential device compromise. It is advised to exercise caution and avoid scanning QR codes unless the source is verified.

Techniques

Link manipulation

Phishing attacks often involve creating fake links that appear to be from a legitimate organization. These links may use misspelled URLs or subdomains to deceive the user. In the following example URL, http://www.yourbank.example.com/, it can appear to the untrained eye as though the URL will take the user to the example section of the yourbank website; actually this URL points to the "yourbank" (i.e. phishing) subdomain of the example website (the fraudster's domain name). Another tactic is to make the displayed text for a link appear trustworthy, while the actual link goes to the phisher's site. To check the destination of a link, many email clients and web browsers will show the URL in the status bar when the mouse is hovering over it. However, some phishers may be able to bypass this security measure.
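
The subdomain trick in the example URL can be made visible programmatically: a naive check of the last two host labels shows that http://www.yourbank.example.com/ actually belongs to example.com. The Python sketch below uses this simple heuristic, which ignores multi-part public suffixes such as .co.uk; robust checks rely on a public-suffix list.

from urllib.parse import urlsplit

def naive_registered_domain(url):
    # Return the last two labels of the host name (a rough approximation of the owning domain).
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host

print(naive_registered_domain("http://www.yourbank.example.com/"))  # prints: example.com
print(naive_registered_domain("https://www.yourbank.com/"))         # prints: yourbank.com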

Internationalized domain names (IDNs) can be exploited via IDN spoofing or homograph attacks to allow attackers to create fake websites with visually identical addresses to legitimate ones. These attacks have been used by phishers to disguise malicious URLs using open URL redirectors on trusted websites. Even digital certificates, such as SSL/TLS certificates, may not protect against these attacks, as phishers can purchase valid certificates and alter content to mimic genuine websites, or host phishing sites without SSL.

Filter evasion

Phishers have sometimes used images instead of text to make it harder for anti-phishing filters to detect the text commonly used in phishing emails. In response, more sophisticated anti-phishing filters are able to recover hidden text in images using optical character recognition (OCR).
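
A minimal sketch of the OCR idea, assuming the third-party pytesseract package (a wrapper around the Tesseract engine) and Pillow are installed; the file name and keyword list are hypothetical, and a real filter would feed the recovered text into its normal classification pipeline.

    from PIL import Image          # Pillow
    import pytesseract

    def extract_text_from_image(path: str) -> str:
        # Recover any text rendered inside the image so the usual
        # keyword and URL checks of a mail filter can be applied to it.
        return pytesseract.image_to_string(Image.open(path))

    suspicious_terms = {"verify your account", "password", "urgent"}
    text = extract_text_from_image("suspicious.png").lower()
    if any(term in text for term in suspicious_terms):
        print("image contains phishing-like wording")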

Social engineering

Phishing often uses social engineering techniques to trick users into performing actions such as clicking a link or opening an attachment, or revealing sensitive information. It often involves pretending to be a trusted entity and creating a sense of urgency, like threatening to close or seize a victim's bank or insurance account.

An alternative technique to impersonation-based phishing is the use of fake news articles to trick victims into clicking on a malicious link. These links often lead to fake websites that appear legitimate, but are actually run by attackers who may try to install malware or present fake "virus" notifications to the victim.

History

Early history

Early phishing techniques can be traced back to the 1990s, when black hat hackers and the warez community used AOL to steal credit card information and commit other online crimes. The term "phishing" is said to have been coined by Khan C. Smith, a well-known spammer and hacker, and its first recorded mention was found in the hacking tool AOHell, which was released in 1995. AOHell allowed hackers to impersonate AOL staff and send instant messages to victims asking them to reveal their passwords. In response, AOL implemented measures to prevent phishing and eventually shut down the warez scene on their platform.

2000s

In the 2000s, phishing attacks became more organized and targeted. The first known direct attempt against a payment system, E-gold, occurred in June 2001, and shortly after the September 11 attacks, a "post-9/11 id check" phishing attack followed. The first known phishing attack against a retail bank was reported in September 2003. Between May 2004 and May 2005, approximately 1.2 million computer users in the United States suffered losses caused by phishing, totaling approximately US$929 million. Phishing was recognized as a fully organized part of the black market, and specializations emerged on a global scale that provided phishing software for payment, which were assembled and implemented into phishing campaigns by organized gangs. The United Kingdom banking sector suffered from phishing attacks, with losses from web banking fraud almost doubling in 2005 compared to 2004. In 2006, almost half of phishing thefts were committed by groups operating through the Russian Business Network based in St. Petersburg. Email scams posing as the Internal Revenue Service were also used to steal sensitive data from U.S. taxpayers. Social networking sites are a prime target of phishing, since the personal details in such sites can be used in identity theft. In 2007, 3.6 million adults lost US$3.2 billion due to phishing attacks. The Anti-Phishing Working Group reported receiving 115,370 phishing email reports from consumers with US and China hosting more than 25% of the phishing pages each in the third quarter of 2009.

2010s

Phishing in the 2010s saw a significant increase in the number of attacks. In 2011, the master keys for RSA SecurID security tokens were stolen through a phishing attack. Chinese phishing campaigns also targeted high-ranking officials in the US and South Korean governments and military, as well as Chinese political activists. According to Ghosh, phishing attacks increased from 187,203 in 2010 to 445,004 in 2012. In August 2013, Outbrain suffered a spear-phishing attack, and in November 2013, 110 million customer and credit card records were stolen from Target customers through a phished subcontractor account. Target's CEO and IT security staff were subsequently fired. In August 2014, iCloud leaks of celebrity photos were based on phishing e-mails sent to victims that looked like they came from Apple or Google. In November 2014, phishing attacks on ICANN gained administrative access to the Centralized Zone Data System; the attackers also gained data about users in the system and access to ICANN's public Governmental Advisory Committee wiki, blog, and whois information portal. Fancy Bear was linked to spear-phishing attacks against the Pentagon email system in August 2015, and the group used a zero-day exploit of Java in a spear-phishing attack on the White House and NATO. Fancy Bear carried out spear phishing attacks on email addresses associated with the Democratic National Committee in the first quarter of 2016. In August 2016, members of the Bundestag and of political parties, including Linken-faction leader Sahra Wagenknecht, the Junge Union, and the CDU of Saarland, were targeted by spear-phishing attacks suspected to be carried out by Fancy Bear. Also in August 2016, the World Anti-Doping Agency reported the receipt of phishing emails sent to users of its database claiming to be official WADA communications but consistent with the tactics of the Russian hacking group Fancy Bear. In 2017, 76% of organizations experienced phishing attacks, with nearly half of the information security professionals surveyed reporting an increase from 2016. In the first half of 2017, businesses and residents of Qatar were hit with over 93,570 phishing events in a three-month span. In August 2017, customers of Amazon faced the Amazon Prime Day phishing attack, when hackers sent out seemingly legitimate deals. When customers attempted to make purchases using the "deals", the transactions would not complete, prompting them to input data that could be compromised and stolen. In 2018, the company block.one, which developed the EOS.IO blockchain, was attacked by a phishing group that sent phishing emails to all customers aimed at intercepting the user's cryptocurrency wallet key, and a later attack targeted airdrop tokens.

2020s

Phishing attacks have evolved in the 2020s to include elements of social engineering, as demonstrated by the July 15, 2020, Twitter breach. In this case, a 17-year-old hacker and accomplices set up a fake website resembling Twitter's internal VPN provider used by remote working employees. Posing as helpdesk staff, they called multiple Twitter employees, directing them to submit their credentials to the fake VPN website. Using the details supplied by the unsuspecting employees, they were able to seize control of several high-profile user accounts, including those of Barack Obama, Elon Musk, Joe Biden, and Apple Inc.'s company account. The hackers then sent messages to Twitter followers soliciting Bitcoin, promising to double the transaction value in return. The hackers collected 12.86 BTC (about $117,000 at the time).

Anti-phishing

There are anti-phishing websites which publish exact messages that have been recently circulating on the internet, such as FraudWatch International and Millersmiles. Such sites often provide specific details about the particular messages.

As recently as 2007, the adoption of anti-phishing strategies by businesses needing to protect personal and financial information was low. Now there are several different techniques to combat phishing, including legislation and technology created specifically to protect against phishing. These techniques include steps that can be taken by individuals, as well as by organizations. Phone, web site, and email phishing can now be reported to authorities, as described below.

User training

Frame of an animation by the U.S. Federal Trade Commission intended to educate citizens about phishing tactics

Effective phishing education, including conceptual knowledge and feedback, is an important part of any organization's anti-phishing strategy. While there is limited data on the effectiveness of education in reducing susceptibility to phishing, much information on the threat is available online.

Simulated phishing campaigns, in which organizations test their employees' training by sending fake phishing emails, are commonly used to assess the effectiveness of that training. One example is a study by the National Library of Medicine, in which an organization received 858,200 emails during a 1-month testing period, with 139,400 (16%) being marketing and 18,871 (2%) being identified as potential threats. These campaigns are often used in the healthcare industry, as healthcare data is a valuable target for hackers. These campaigns are just one of the ways that organizations are working to combat phishing.

To avoid phishing attempts, people can modify their browsing habits and be cautious of emails claiming to be from a company asking to "verify" an account. It's best to contact the company directly or manually type in their website address rather than clicking on any hyperlinks in suspicious emails.

Nearly all legitimate e-mail messages from companies to their customers contain an item of information that is not readily available to phishers. Some companies, for example PayPal, always address their customers by their username in emails, so if an email addresses the recipient in a generic fashion ("Dear PayPal customer") it is likely to be an attempt at phishing. Furthermore, PayPal offers various methods to determine spoof emails and advises users to forward suspicious emails to their spoof@PayPal.com address so it can investigate and warn other customers. However, it is unsafe to assume that the presence of personal information alone guarantees that a message is legitimate, and some studies have shown that the presence of personal information does not significantly affect the success rate of phishing attacks, which suggests that most people do not pay attention to such details.

Emails from banks and credit card companies often include partial account numbers, but research has shown that people tend to not differentiate between the first and last digits. This is an issue because the first few digits are often the same for all clients of a financial institution.

The Anti-Phishing Working Group, one of the largest anti-phishing organizations in the world, produces regular reports on trends in phishing attacks.

Google posted a video demonstrating how to identify and protect yourself from phishing scams.

Technical approaches

A wide range of technical approaches are available to prevent phishing attacks reaching users or to prevent them from successfully capturing sensitive information.

Filtering out phishing mail

Specialized spam filters can reduce the number of phishing emails that reach their addressees' inboxes. These filters use a number of techniques including machine learning and natural language processing approaches to classify phishing emails, and reject email with forged addresses.
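
The following toy sketch, assuming scikit-learn is available, shows the general shape of such a text classifier; it is not any vendor's actual filter, and the tiny corpus and labels are made up purely for illustration.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Tiny toy corpus; a real filter would train on many thousands of messages.
    messages = [
        "Your account has been suspended, verify your password now",
        "Urgent: confirm your banking details at the link below",
        "Meeting notes attached, see you Thursday",
        "Lunch tomorrow? The usual place at noon",
    ]
    labels = [1, 1, 0, 0]  # 1 = phishing, 0 = legitimate

    classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    classifier.fit(messages, labels)

    print(classifier.predict(["Please verify your password to avoid suspension"]))
    # likely [1] on this toy data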

Browsers alerting users to fraudulent websites

Screenshot of Firefox 2.0.0.1 Phishing suspicious site warning

Another popular approach to fighting phishing is to maintain a list of known phishing sites and to check websites against the list. One such service is the Safe Browsing service. Web browsers such as Google Chrome, Internet Explorer 7, Mozilla Firefox 2.0, Safari 3.2, and Opera all contain this type of anti-phishing measure. Firefox 2 used Google anti-phishing software. Opera 9.1 uses live blacklists from Phishtank, cyscon and GeoTrust, as well as live whitelists from GeoTrust. Some implementations of this approach send the visited URLs to a central service to be checked, which has raised concerns about privacy. According to a report by Mozilla in late 2006, Firefox 2 was found to be more effective than Internet Explorer 7 at detecting fraudulent sites in a study by an independent software testing company.
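
A minimal sketch of the blocklist idea (this is not the actual Safe Browsing API): normalize the host of each visited URL and refuse to load the page if the host appears on a locally cached list of known phishing sites. The hostnames below are hypothetical.

    from urllib.parse import urlparse

    # Hypothetical entries; real lists come from services such as PhishTank.
    known_phishing_hosts = {"login-yourbank.example.net", "secure-update.example.org"}

    def is_blocked(url: str) -> bool:
        host = (urlparse(url).hostname or "").lower().rstrip(".")
        return host in known_phishing_hosts

    print(is_blocked("http://login-yourbank.example.net/signin"))  # True
    print(is_blocked("https://www.example.com/"))                  # False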

An approach introduced in mid-2006 involves switching to a special DNS service that filters out known phishing domains: this will work with any browser, and is similar in principle to using a hosts file to block web adverts.
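
For the hosts-file analogy mentioned above, such entries might look like the following sketch, with hypothetical hostnames mapped to a non-routable address so the browser can never reach them.

    # /etc/hosts-style entries (illustrative; hostnames are hypothetical)
    0.0.0.0 login-yourbank.example.net
    0.0.0.0 secure-update.example.org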

To mitigate the problem of phishing sites impersonating a victim site by embedding its images (such as logos), several site owners have altered the images to send a message to the visitor that a site may be fraudulent. The image may be moved to a new filename and the original permanently replaced, or a server can detect that the image was not requested as part of normal browsing, and instead send a warning image.

Augmenting password logins

The Bank of America website is one of several that asks users to select a personal image (marketed as SiteKey) and displays this user-selected image with any forms that request a password. Users of the bank's online services are instructed to enter a password only when they see the image they selected. However, several studies suggest that few users refrain from entering their passwords when images are absent. In addition, this feature (like other forms of two-factor authentication) is susceptible to other attacks, such as those suffered by Scandinavian bank Nordea in late 2005, and Citibank in 2006.

A similar system, in which an automatically generated "Identity Cue" consisting of a colored word within a colored box is displayed to each website user, is in use at other financial institutions.

Security skins are a related technique that involves overlaying a user-selected image onto the login form as a visual cue that the form is legitimate. Unlike the website-based image schemes, however, the image itself is shared only between the user and the browser, and not between the user and the website. The scheme also relies on a mutual authentication protocol, which makes it less vulnerable to attacks that affect user-only authentication schemes.

Still another technique relies on a dynamic grid of images that is different for each login attempt. The user must identify the pictures that fit their pre-chosen categories (such as dogs, cars and flowers). Only after they have correctly identified the pictures that fit their categories are they allowed to enter their alphanumeric password to complete the login. Unlike the static images used on the Bank of America website, a dynamic image-based authentication method creates a one-time passcode for the login, requires active participation from the user, and is very difficult for a phishing website to correctly replicate because it would need to display a different grid of randomly generated images that includes the user's secret categories.

Monitoring and takedown

Several companies offer banks and other organizations likely to suffer from phishing scams round-the-clock services to monitor, analyze and assist in shutting down phishing websites. Automated detection of phishing content is still below accepted levels for direct action, with content-based analysis achieving success rates between 80% and 90%, so most of the tools include manual steps to confirm the detection and authorize the response. Individuals can contribute by reporting phishing to both volunteer and industry groups, such as cyscon or PhishTank. Phishing web pages and emails can be reported to Google.

Transaction verification and signing

Solutions have also emerged using the mobile phone (smartphone) as a second channel for verification and authorization of banking transactions.

Multi-factor authentication

Organizations can implement two-factor or multi-factor authentication (MFA), which requires a user to present at least two factors when logging in (for example, both a smart card and a password). This mitigates some risk: in the event of a successful phishing attack, the stolen password on its own cannot be reused to further breach the protected system. However, there are several attack methods which can defeat many of the typical systems. MFA schemes such as WebAuthn address this issue by design.
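
As a minimal sketch of the idea, assuming the third-party pyotp package, the example below checks a time-based one-time password in addition to the password; it illustrates plain TOTP-style MFA rather than WebAuthn, and the login function is a hypothetical simplification.

    import pyotp

    # Enrolment: the server stores a per-user secret and shares it with the
    # user's authenticator app (usually via a QR code).
    secret = pyotp.random_base32()
    totp = pyotp.TOTP(secret)

    def login(password_ok: bool, submitted_code: str) -> bool:
        # Both factors must pass; a phished password alone is not enough,
        # although a real-time relay attack can still defeat plain TOTP,
        # which is why the text points to WebAuthn.
        return password_ok and totp.verify(submitted_code)

    print(login(True, totp.now()))   # True: correct password and current code
    print(login(True, "000000"))     # almost certainly False: wrong or stale code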

Email content redaction

Organizations that prioritize security over convenience can require users of their computers to use an email client that redacts URLs from email messages, making it impossible for the reader to click on a link or even copy a URL. While this may be inconvenient, it almost eliminates email phishing attacks.
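
A minimal sketch of the redaction idea in standard-library Python; the regular expression and placeholder text are illustrative assumptions, and a real client would also need to handle HTML bodies and obfuscated links.

    import re

    URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)

    def redact_urls(body: str) -> str:
        # Replace anything that looks like a URL so there is nothing to
        # click or copy in the displayed message.
        return URL_PATTERN.sub("[link removed by policy]", body)

    print(redact_urls("Please verify at http://login-yourbank.example.net/signin today."))
    # Please verify at [link removed by policy] today.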

Limitations of technical responses

An article in Forbes in August 2014 argues that the reason phishing problems persist even after a decade of anti-phishing technologies being sold is that phishing is "a technological medium to exploit human weaknesses" and that technology cannot fully compensate for human weaknesses.

Organizational responses

Scholars have found that investment in both technological and organizational factors can improve protection against phishing. Studies also found that organizations can improve the technical education of employees by including socio-technical factors in their training.

Legal responses

On January 26, 2004, the U.S. Federal Trade Commission filed the first lawsuit against a Californian teenager suspected of phishing by creating a webpage mimicking America Online and stealing credit card information. Other countries have followed this lead by tracing and arresting phishers. A phishing kingpin, Valdir Paulo de Almeida, was arrested in Brazil for leading one of the largest phishing crime rings, which in two years stole between US$18 million and US$37 million. UK authorities jailed two men in June 2005 for their role in a phishing scam, in a case connected to the U.S. Secret Service Operation Firewall, which targeted notorious "carder" websites. In 2006, Japanese police arrested eight people for creating fake Yahoo Japan websites, netting themselves ¥100 million (US$870,000) and the FBI detained a gang of sixteen in the U.S. and Europe in Operation Cardkeeper.

Senator Patrick Leahy introduced the Anti-Phishing Act of 2005 to Congress in the United States on March 1, 2005. This bill aimed to impose fines of up to $250,000 and prison sentences of up to five years on criminals who used fake websites and emails to defraud consumers. In the UK, the Fraud Act 2006 introduced a general offense of fraud punishable by up to ten years in prison and prohibited the development or possession of phishing kits with the intention of committing fraud.

Companies have also joined the effort to crack down on phishing. On March 31, 2005, Microsoft filed 117 federal lawsuits in the U.S. District Court for the Western District of Washington. The lawsuits accuse "John Doe" defendants of obtaining passwords and confidential information. March 2005 also saw a partnership between Microsoft and the Australian government teaching law enforcement officials how to combat various cyber crimes, including phishing. Microsoft announced a planned further 100 lawsuits outside the U.S. in March 2006, followed by the commencement, as of November 2006, of 129 lawsuits mixing criminal and civil actions. AOL reinforced its efforts against phishing in early 2006 with three lawsuits seeking a total of US$18 million under the 2005 amendments to the Virginia Computer Crimes Act, and Earthlink has joined in by helping to identify six men subsequently charged with phishing fraud in Connecticut.

In January 2007, Jeffrey Brett Goodin of California became the first defendant convicted by a jury under the provisions of the CAN-SPAM Act of 2003. He was found guilty of sending thousands of emails to AOL users, while posing as the company's billing department, which prompted customers to submit personal and credit card information. Facing a possible 101 years in prison for the CAN-SPAM violation and ten other counts including wire fraud, the unauthorized use of credit cards, and the misuse of AOL's trademark, he was sentenced to serve 70 months. Goodin had been in custody since failing to appear for an earlier court hearing and began serving his prison term immediately.

Moral responsibility

From Wikipedia, the free encyclopedia

In philosophy, moral responsibility is the status of morally deserving praise, blame, reward, or punishment for an act or omission in accordance with one's moral obligations. Deciding what (if anything) counts as "morally obligatory" is a principal concern of ethics.

Philosophers refer to people who have moral responsibility for an action as "moral agents". Agents have the capability to reflect upon their situation, to form intentions about how they will act, and then to carry out that action. The notion of free will has become an important issue in the debate on whether individuals are ever morally responsible for their actions and, if so, in what sense. Incompatibilists regard determinism as at odds with free will, whereas compatibilists think the two can coexist.

Moral responsibility does not necessarily equate to legal responsibility. A person is legally responsible for an event when a legal system is liable to penalise that person for that event. Although it may often be the case that when a person is morally responsible for an act, they are also legally responsible for it, the two states do not always coincide.

Promoters of the concept of personal responsibility (or some popularization thereof) may include, for example, parents, managers, politicians, technocrats, large-group awareness trainings (LGATs), and religious groups.

Some see individual responsibility as an important component of neoliberalism.

Philosophical stance

Various philosophical positions exist, disagreeing over determinism and free will.

Depending on how a philosopher conceives of free will, they will have different views on moral responsibility.

Metaphysical libertarianism

Metaphysical libertarians think actions are not always causally determined, allowing for the possibility of free will and thus moral responsibility. All libertarians are also incompatibilists; they think that if causal determinism were true of human action, people would not have free will. Accordingly, some libertarians subscribe to the principle of alternate possibilities, which posits that moral responsibility requires that people could have acted differently.

Phenomenological considerations are sometimes invoked by incompatibilists to defend a libertarian position. In daily life, we feel as though choosing otherwise is a viable option. Although this feeling doesn't firmly establish the existence of free will, some incompatibilists claim the phenomenological feeling of alternate possibilities is a prerequisite for free will.

Jean-Paul Sartre suggested that people sometimes avoid incrimination and responsibility by hiding behind determinism: "we are always ready to take refuge in a belief in determinism if this freedom weighs upon us or if we need an excuse".

A similar view is that individual moral culpability lies in individual character. That is, a person with the character of a murderer has no choice other than to murder, but can still be punished because it is right to punish those of bad character. How one's character was determined is irrelevant from this perspective. Robert Cummins, for example, argues that people should not be judged for their individual actions, but rather for how those actions "reflect on their character". If character (however defined) is the dominant causal factor in determining one's choices, and one's choices are morally wrong, then one should be held accountable for those choices, regardless of genes and other such factors.

In law, there is a known exception to the assumption that moral culpability lies in either individual character or freely willed acts. The insanity defense—or its corollary, diminished responsibility (a sort of appeal to the fallacy of the single cause)—can be used to argue that the guilty deed was not the product of a guilty mind. In such cases, the legal systems of most Western societies assume that the person is in some way not at fault, because his actions were a consequence of abnormal brain function (implying brain function is a deterministic causal agent of mind and motive).

The argument from luck

The argument from luck is a criticism against the libertarian conception of moral responsibility. It suggests that any given action, and even a person's character, is the result of various forces outside a person's control. It may not be appropriate, then, to hold that person solely morally responsible. Thomas Nagel suggests that four different types of luck (including genetic influences and other external factors) end up influencing the way that a person's actions are evaluated morally. For instance, a person driving drunk may make it home without incident, and yet this action of drunk driving might seem more morally objectionable if someone happens to jaywalk along his path (getting hit by the car).

This argument can be traced back to David Hume. If physical indeterminism is true, then those events that are not determined are scientifically described as probabilistic or random. It is therefore argued that it is doubtful that one can praise or blame someone for performing an action generated randomly by his nervous system (without there being any non-physical agency responsible for the observed probabilistic outcome).

Hard determinism

Hard determinists (not to be confused with fatalists) often use liberty in practical moral considerations, rather than a notion of a free will. Indeed, faced with the possibility that determinism requires a completely different moral system, some proponents say "So much the worse for free will!". Clarence Darrow, the famous defense attorney, pleaded the innocence of his clients, Leopold and Loeb, by invoking such a notion of hard determinism. During his summation, he declared:

What has this boy to do with it? He was not his own father; he was not his own mother; he was not his own grandparents. All of this was handed to him. He did not surround himself with governesses and wealth. He did not make himself. And yet he is to be compelled to pay.

Paul the Apostle, in his Epistle to the Romans addresses the question of moral responsibility as follows: "Hath not the potter power over the clay, of the same lump to make one vessel unto honour, and another unto dishonour?" In this view, individuals can still be dishonoured for their acts even though those acts were ultimately completely determined by God.

Joshua Greene and Jonathan Cohen, researchers in the emerging field of neuroethics, argue, on the basis of such cases, that our current notion of moral responsibility is founded on libertarian (and dualist) intuitions. They argue that cognitive neuroscience research (e.g. neuroscience of free will) is undermining these intuitions by showing that the brain is responsible for our actions, not only in cases of florid psychosis, but also in less obvious situations. For example, damage to the frontal lobe reduces the ability to weigh uncertain risks and make prudent decisions, and therefore leads to an increased likelihood that someone will commit a violent crime. This is true not only of patients with damage to the frontal lobe due to accident or stroke, but also of adolescents, who show reduced frontal lobe activity compared to adults, and even of children who are chronically neglected or mistreated. In each case, the guilty party can, they argue, be said to have less responsibility for his actions. Greene and Cohen predict that, as such examples become more common and well known, jurors’ interpretations of free will and moral responsibility will move away from the intuitive libertarian notion that currently underpins them. They also argue that the legal system does not require this libertarian interpretation. Rather, they suggest that only retributive notions of justice, in which the goal of the legal system is to punish people for misdeeds, require the libertarian intuition. Many forms of ethically realistic and consequentialist approaches to justice, which are aimed at promoting future welfare rather than retribution, can survive even a hard determinist interpretation of free will. Accordingly, the legal system and notions of justice can thus be maintained even in the face of emerging neuroscientific evidence undermining libertarian intuitions of free will.

David Eagleman explains that nature and nurture cause all criminal behavior. He likewise believes that science demands that change and improvement, rather than guilt, must become the focus of the legal justice system.

Neuroscientist David Eagleman maintains similar ideas. Eagleman says that the legal justice system ought to become more forward looking. He says it is wrong to ask questions of narrow culpability, rather than focusing on what is important: what needs to change in a criminal's behavior and brain. Eagleman is not saying that no one is responsible for their crimes, but rather that the "sentencing phase" should correspond with modern neuroscientific evidence. To Eagleman, it is damaging to entertain the illusion that a person can make a single decision that is somehow, suddenly, independent of their physiology and history. He describes what scientists have learned from brain damaged patients, and offers the case of a school teacher who exhibited escalating pedophilic tendencies on two occasions—each time as results of growing tumors. Eagleman also warns that less attractive people and minorities tend to get longer sentencing—all of which he sees as symptoms that more science is needed in the legal system.

Hard incompatibilism

Derk Pereboom defends a skeptical position about free will he calls hard incompatibilism. In his view, we cannot have free will if our actions are causally determined by factors beyond our control, or if our actions are indeterministic events—if they happen by chance. Pereboom conceives of free will as the control in action required for moral responsibility in the sense involving deserved blame and praise, punishment and reward. While he acknowledges that libertarian agent causation, the capacity of agents as substances to cause actions without being causally determined by factors beyond their control, is still a possibility, he regards it as unlikely against the backdrop of the most defensible physical theories. Without libertarian agent causation, Pereboom thinks the free will required for moral responsibility in the desert-involving sense is not in the offing. However, he also contends that by contrast with the backward-looking, desert-involving sense of moral responsibility, forward-looking senses are compatible with causal determination. For instance, causally determined agents who act badly might justifiably be blamed with the aim of correcting faulty character, reconciling impaired relationships, and protecting others from harm they are apt to cause.

Pereboom proposes that a viable criminal jurisprudence is compatible with the denial of deserved blame and punishment. His view rules out retributivist justifications for punishment, but it allows for incapacitation of dangerous criminals on the analogy with quarantine of carriers of dangerous diseases. Isolation of carriers of the Ebola virus can be justified on the ground of the right to defend against threat, a justification that does not reference desert. Pereboom contends that the analogy holds for incapacitation of dangerous criminals. He also argues that the less serious the threat, the more moderate the justifiable method of incapacitation; for certain crimes only monitoring may be needed. In addition, just as we should do what we can, within reasonable bounds, to cure the carriers of the Ebola virus we quarantine, so we should aim to rehabilitate and reintegrate the criminals we incapacitate. Pereboom also proposes that given hard incompatibilism, punishment justified as general deterrence may be legitimate when the penalties don't involve undermining an agent's capacity to live a meaningful, flourishing life, since justifying such moderate penalties need not invoke desert.

Compatibilism

Some forms of compatibilism suggest the term free will should only be used to mean something more like liberty.

Compatibilists contend that even if determinism were true, it would still be possible for us to have free will. The Hindu text The Bhagavad Gita offers one very early compatibilist account. Facing the prospect of going to battle against kinsmen to whom he has bonds, Arjuna despairs. Krishna attempts to assuage Arjuna's anxieties. He argues that forces of nature come together to produce actions, and it is only vanity that causes us to regard ourselves as the agent in charge of these actions. However, Krishna adds this caveat: "... [But] the Man who knows the relation between the forces of Nature and actions, witnesses how some forces of Nature work upon other forces of Nature, and becomes [not] their slave..." When we are ignorant of the relationship between forces of Nature, we become passive victims of nomological facts. Krishna's admonition is intended to get Arjuna to perform his duty (i.e., fight in the battle), but he is also claiming that being a successful moral agent requires being mindful of the wider circumstances in which one finds oneself. Paramahansa Yogananda also said, "Freedom means the power to act by soul guidance, not by the compulsions of desires and habits. Obeying the ego leads to bondage; obeying the soul brings liberation."

In the Western tradition, Baruch Spinoza echoes the Bhagavad Gita's point about agents and natural forces, writing "men think themselves free because they are conscious of their volitions and their appetite, and do not think, even in their dreams, of the causes by which they are disposed to wanting and willing, because they are ignorant [of those causes]." Krishna is hostile to the influence of passions on our rational faculties, speaking up instead for the value of heeding the dictates of one's own nature: "Even a wise man acts under the impulse of his nature. Of what use is restraint?" Spinoza similarly identifies the taming of one's passions as a way to extricate oneself from merely being passive in the face of external forces and a way toward following our own natures.

The Book of Proverbs similarly warns that "there is a way that seems right to a man, but its end is the way of death" (Proverbs 14:12), a thought that has been compared with Spinoza's position. Read contrapositively, the claim is that a man who is not on the road to destruction has not taken the path that merely seems right to him.

P.F. Strawson is a major example of a contemporary compatibilist. His paper "Freedom and Resentment," which adduces reactive attitudes, has been widely cited as an important response to incompatibilist accounts of free will. Other compatibilists inspired by Strawson's paper include Gary Watson, Susan Wolf, R. Jay Wallace, Paul Russell, and David Shoemaker.

Other views

Daniel Dennett asks why anyone would care about whether someone had the property of responsibility and speculates that the idea of moral responsibility may be "a purely metaphysical hankering". On this view, denying moral responsibility can itself become a kind of hankering, a way of claiming rights (for example, parental rights) while disowning the corresponding responsibilities.

Bruce Waller has argued, in Against Moral Responsibility (MIT Press), that moral responsibility "belongs with the ghosts and gods and that it cannot survive in a naturalistic environment devoid of miracles". We cannot punish another for wrong acts committed, contends Waller, because the causal forces which precede and have brought about the acts may ultimately be reduced to luck, namely, factors over which the individual has no control. One may not be blamed even for one’s character traits, he maintains, since they too are heavily influenced by evolutionary, environmental, and genetic factors (inter alia). Although his view would fall in the same category as the views of philosophers like Dennett who argue against moral responsibility, Waller's view differs in an important manner: He tries to, as he puts it, "rescue" free will from moral responsibility (See Chapter 3). This move goes against the commonly held assumption that how one feels about free will is ipso facto a claim about moral responsibility.

Epistemic condition for moral responsibility

In philosophical discussions of moral responsibility, two necessary conditions are usually cited: the control (or freedom) condition (which answers the question 'did the individual doing the action in question have free will?') and the epistemic condition, the former of which is explored in the above discussion. The epistemic condition, in contrast to the control condition, focuses on the question 'was the individual aware of, for instance, the moral implications of what she did?' Not all philosophers think this condition to be a distinct condition, separate from the control condition: For instance, Alfred Mele thinks that the epistemic condition is a component of the control condition. Nonetheless, there seems to be philosophical consensus of sorts that it is both distinct and explanatorily relevant. One major concept associated with the condition is "awareness". According to those philosophers who affirm this condition, one needs to be "aware" of four things to be morally responsible: the action (which one is doing), its moral significance, consequences, and alternatives.

Experimental research

Mauro suggests that a sense of personal responsibility does not operate or evolve universally among humankind. He argues that it was absent in the successful civilization of the Iroquois.

In recent years, research in experimental philosophy has explored whether people's untutored intuitions about determinism and moral responsibility are compatibilist or incompatibilist. Some experimental work has included cross-cultural studies. However, the debate about whether people naturally have compatibilist or incompatibilist intuitions has not come out overwhelmingly in favor of one view or the other, with studies finding evidence for both views. For instance, when people are presented with abstract cases that ask if a person could be morally responsible for an immoral act when they could not have done otherwise, people tend to say no, or give incompatibilist answers. When presented with a specific immoral act that a specific person committed, people tend to say that that person is morally responsible for their actions, even if they were determined (that is, people also give compatibilist answers).

The neuroscience of free will investigates various experiments that might shed light on free will.

Collective

When people attribute moral responsibility, they usually attribute it to individual moral agents. However, Joel Feinberg, among others, has argued that corporations and other groups of people can have what is called ‘collective moral responsibility’ for a state of affairs. For example, when South Africa had an apartheid regime, the country's government might have been said to have had collective moral responsibility for the violation of the rights of non-European South Africans.

Psychopathy's lack of sense of responsibility

One of the attributes defined for psychopathy is "failure to accept responsibility for own actions".

Artificial systems

The emergence of automation, robotics and related technologies prompted the question, 'Can an artificial system be morally responsible?' The question has a closely related variant, 'When (if ever) does moral responsibility transfer from its human creator(s) to the system?'.

The questions arguably adjoin with but are distinct from machine ethics, which is concerned with the moral behavior of artificial systems. Whether an artificial system's behavior qualifies it to be morally responsible has been a key focus of debate.

Arguments that artificial systems cannot be morally responsible

Batya Friedman and Peter Kahn Jr. posited that intentionality is a necessary condition for moral responsibility, and that computer systems as conceivable in 1992 in material and structure could not have intentionality.

Arthur Kuflik asserted that humans must bear the ultimate moral responsibility for a computer's decisions, as it is humans who design the computers and write their programs. He further proposed that humans can never relinquish oversight of computers.

Frances Grodzinsky et al. considered artificial systems that could be modelled as finite state machines. They posited that if the machine had a fixed state transition table, then it could not be morally responsible. If the machine could modify its table, then the machine's designer still retained some moral responsibility.

Patrick Hew argued that for an artificial system to be morally responsible, its rules for behaviour and the mechanisms for supplying those rules must not be supplied entirely by external humans. He further argued that such systems are a substantial departure from technologies and theory as extant in 2014. An artificial system based on those technologies will carry zero responsibility for its behaviour. Moral responsibility is apportioned to the humans that created and programmed the system.


Arguments that artificial systems can be morally responsible

Colin Allen et al. proposed that an artificial system may be morally responsible if its behaviours are functionally indistinguishable from a moral person, coining the idea of a 'Moral Turing Test'. They subsequently disavowed the Moral Turing Test in recognition of controversies surrounding the Turing Test.

Andreas Matthias described a 'responsibility gap' where to hold humans responsible for a machine would be an injustice, but to hold the machine responsible would challenge 'traditional' ways of ascription. He proposed three cases where the machine's behaviour ought to be attributed to the machine and not its designers or operators. First, he argued that modern machines are inherently unpredictable (to some degree), but perform tasks that need to be performed yet cannot be handled by simpler means. Second, that there are increasing 'layers of obscurity' between manufacturers and the system, as hand-coded programs are replaced with more sophisticated means. Third, that machines increasingly have rules of operation that can be changed while the machine is running.

Trade-off

From Wikipedia, the free encyclopedia

A trade-off (or tradeoff) is a situational decision that involves diminishing or losing one quality, quantity, or property of a set or design in return for gains in other aspects. In simple terms, a tradeoff is where one thing increases, and another must decrease. Tradeoffs stem from limitations of many origins, including simple physics – for instance, only a certain volume of objects can fit into a given space, so a full container must remove some items in order to accept any more, and vessels can carry a few large items or multiple small items. Tradeoffs also commonly refer to different configurations of a single item, such as the tuning of strings on a guitar to enable different notes to be played, as well as an allocation of time and attention towards different tasks.

The concept of a tradeoff suggests a tactical or strategic choice made with full comprehension of the advantages and disadvantages of each setup. An economic example is the decision to invest in stocks, which are risky but carry great potential return, versus bonds, which are generally safer but with lower potential returns.

Theoretical description

The theoretical description of trade-offs involves the Pareto front.

Examples

The concept of a trade-off is often used to describe situations in everyday life.

Economics

In economics a trade-off is expressed in terms of the opportunity cost of a particular choice, which is the loss of the most preferred alternative given up. A tradeoff, then, involves a sacrifice that must be made to obtain a certain product, service, or experience, rather than others that could be made or obtained using the same required resources. For example, for a person going to a basketball game, their opportunity cost is the loss of the alternative of watching a particular television program at home. If the basketball game occurs during their working hours, then the opportunity cost would be several hours of lost work, as they would need to take time off.

Many factors affect the tradeoff environment within a particular country, including the availability of raw materials, a skilled labor force, machinery for producing a product, technology and capital, market rate to produce that product on a reasonable time scale, and so forth.

A trade-off in economics is often illustrated graphically by a Pareto frontier (named after the economist Vilfredo Pareto), which shows the greatest (or least) amount of one thing that can be attained for each of various given amounts of the other. As an example, in production theory, the trade-off between the output of one good and the output of another is illustrated graphically by the production possibilities frontier. The Pareto frontier is also used in multi-objective optimization. In finance, the capital asset pricing model includes an efficient frontier that shows the highest level of expected return that any portfolio could have given any particular level of risk, as measured by the variance of portfolio return.
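
As a small illustration of the Pareto-frontier idea, the sketch below filters a set of hypothetical (risk, expected return) pairs down to the portfolios that are not dominated by any other, i.e. no alternative offers at least as much return at no more risk while being strictly better in one of the two. The numbers are made up for illustration.

    # Hypothetical (risk, expected return) pairs for candidate portfolios.
    portfolios = [(0.05, 0.04), (0.10, 0.07), (0.10, 0.05), (0.20, 0.09), (0.25, 0.08)]

    def pareto_frontier(points):
        frontier = []
        for risk, ret in points:
            # A point is dominated if some other point has no more risk and
            # at least as much return (and is not the same point).
            dominated = any(
                r2 <= risk and ret2 >= ret and (r2, ret2) != (risk, ret)
                for r2, ret2 in points
            )
            if not dominated:
                frontier.append((risk, ret))
        return sorted(frontier)

    print(pareto_frontier(portfolios))
    # [(0.05, 0.04), (0.10, 0.07), (0.20, 0.09)]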

Opportunity cost

An example of an opportunity-cost trade-off for an individual is the decision by a full-time worker earning $50,000 a year to attend medical school, with annual tuition of $30,000, in order to earn $150,000 a year as a doctor after 7 years of study. If we assume for the sake of simplicity that the medical school only allows full-time study, then the individual faces a trade-off between staying at work and earning $50,000 a year, or going to medical school, giving up $50,000 in salary and paying $30,000 in tuition each year, in exchange for earning $150,000 or more per year after 7 years of study.
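
A rough worked version of those figures in Python, ignoring taxes, discounting, and salary growth; the break-even calculation is an added illustration rather than something stated in the text.

    salary_now = 50_000          # forgone salary per year of study
    tuition = 30_000             # tuition per year of study
    years_of_study = 7
    salary_after = 150_000       # expected salary as a doctor

    cost_of_choosing_school = years_of_study * (salary_now + tuition)   # 560,000
    extra_income_per_year = salary_after - salary_now                   # 100,000
    years_to_break_even = cost_of_choosing_school / extra_income_per_year

    print(cost_of_choosing_school, extra_income_per_year, years_to_break_even)
    # 560000 100000 5.6  -> the upfront sacrifice is recovered in roughly 6 years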

Trash cans

Trash cans that are used inside and then taken out to the street and emptied into a dumpster can be small or large. A large trash can does not need to be taken out to the dumpster so often, but it may become very heavy and difficult to move when full. The choice of big versus small trash can is a trade-off between the frequency of needing to take out the trash and ease of use.

In the case of food waste, a second trade-off presents itself. Large trash cans are more likely to sit for a long time in the kitchen, leading to the food decomposing and a nasty odor. A small trash can will likely need to be taken out to the dumpster more often, thus greatly reducing or eliminating the odor. Of course, a user of a large trash can could simply carry the can outside frequently, but the larger can would be more cumbersome to take out often, and the user would have to think more about when to take the can out.

Mittens

In cold climates, mittens in which all the fingers are in the same compartment work well to keep the hands warm, but this arrangement also confines finger movement and prevents the full range of hand function. Gloves, with their separate fingers, do not have this drawback, but they do not keep the fingers as warm as mittens do. As such, with mittens and gloves, the trade-off is warmth versus dexterity. Similarly, warm coats are often bulky and impede the wearer's freedom of movement. Thin coats give the wearer more freedom of movement, but they are not as warm as a thicker coat would be.

Music

When copying music from compact discs to a computer, lossy compression formats, such as MP3, are used routinely to save hard disk space, but some information is lost, resulting in lower sound quality. Lossless compression schemes, such as FLAC or ALAC, take much more disk space, but do not affect sound quality.

Cars

Large cars can carry many people, and since they have larger crumple zones, they may be safer in an accident. However, they also tend to be heavy (and often not very aerodynamic) and thus usually have relatively poor fuel economy. Small cars like the Smart Car can only carry two people, and being lightweight, they are more fuel-efficient. At the same time, the smaller size and weight of small cars mean that they have smaller crumple zones, which means occupants are less protected in case of an accident. In addition, if a small car has an accident with a larger, heavier car, the occupants of the smaller car will fare more poorly. Thus car size (large versus small) involves multiple tradeoffs regarding passenger capacity, accident safety, and fuel economy.

Athletics

In athletics, sprint running demands different physical attributes from running a marathon. Accordingly, the two contests have distinct events in such competitions as the Olympics, and each pursuit features distinct teams of athletes. Whether a professional runner is better suited to marathon running versus sprinting is a trade-off based on the runner's morphology and physiology (e.g., variation in muscle fiber type), as well as the runner's individual interest, preference, and other motivational factors. Sports recruiters are mindful of these tradeoffs as they decide what role a prospective athlete would best suit on a team.

Biology

In biology, several types of tradeoffs have been recognized. Most simply, a tradeoff occurs when a beneficial change in one trait is linked to a detrimental change in another trait. In environmental resource management, trade-offs occur among different targets. For example, these occur among biodiversity conservation, carbon sequestration and distributive equity in the distribution of funds of the program for Reducing Emissions from Deforestation and forest Degradation (REDD+), as maximizing one of these targets implies reducing the outcomes in the other two targets.

The term is also used widely in an evolutionary context, in which case natural selection and sexual selection are taken to be the ultimate decisive factors. In biology, the concepts of tradeoffs and constraints are often closely related.

Demography

In demography, tradeoff examples may include maturity, fecundity, parental care, parity, senescence, and mate choice. For example, the higher the fecundity (number of offspring), the lower the parental care that each offspring will receive. Parental care as a function of fecundity would show a negatively sloped linear graph. A related phenomenon, known as demographic compensation, arises when the different components of species life cycles (survival, growth, fecundity, etc.) show negative correlations across the distribution ranges. For example, survival may be higher towards the northern edge of the distribution, while fecundity or growth increases towards the south, leading to a compensation that allows the species to persist along an environmental gradient. Contrasting trends in life cycle components may arise through tradeoffs in resource allocation, but also through independent but opposite responses to environmental conditions.

Engineering

Tradeoffs are important in engineering. For example, in electrical engineering, negative feedback is used in amplifiers to trade gain for other desirable properties, such as improved bandwidth, stability of the gain and/or bias point, noise immunity, and reduction of nonlinear distortion. Similarly, tradeoffs are used to maximize power efficiency in medical devices whilst guaranteeing the required measurement quality.

Computer science

In computer science, tradeoffs are viewed as a tool of the trade. A program can often run faster if it uses more memory (a space–time tradeoff). Consider the following examples:

  • By compressing an image, you can reduce transmission time/costs at the expense of CPU time to perform the compression and decompression. Depending on the compression method, this may also involve the tradeoff of a loss in image quality.
  • By using a lookup table, you may be able to reduce CPU time at the expense of space to hold the table, e.g. to determine the parity of a byte you can either look at each bit individually (using shifts and masks), or use a 256-entry table giving the parity for each possible bit-pattern, or combine the upper and lower nibbles and use a 16-entry table (see the sketch after this list).
  • For some situations (e.g. string manipulation), a compiler may be able to use inline code for greater speed, or call run-time routines for reduced memory; the user of the compiler should be able to indicate whether speed or space is more important.
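
As a sketch of the space–time tradeoff in the parity example above (Python, illustrative), the code below precomputes a 256-entry table once, so that each later query is a single lookup instead of a per-bit loop.

    def parity_by_bits(b: int) -> int:
        # Memory-cheap but slower per call: examine each bit individually.
        p = 0
        while b:
            p ^= b & 1
            b >>= 1
        return p

    # Space-for-time version: a 256-entry table built once up front.
    PARITY_TABLE = [parity_by_bits(i) for i in range(256)]

    def parity_by_table(b: int) -> int:
        return PARITY_TABLE[b & 0xFF]

    assert all(parity_by_bits(i) == parity_by_table(i) for i in range(256))
    print(parity_by_table(0b10110010))  # 0 (even number of set bits)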

The Software Engineering Institute has a specific method for analyzing tradeoffs, called the Architecture Tradeoff Analysis Method (ATAM).

Board games

Strategy board games often involve tradeoffs: for example, in chess you might trade a pawn for an improved position. In a worst-case scenario, a chess player might even accept the loss of a valuable piece (even the Queen) to protect the King. In Go, you might trade thickness for influence.

Ethics

Ethics often involves competing interests that must be traded off against each other, such as the interests of different people, or different principles (e.g. is it ethical to use information resulting from inhumane or illegal experiments to treat disease today?)

Medicine

In medicine, patients and physicians are often faced with difficult decisions involving tradeoffs. One example is localized prostate cancer where patients need to weigh the possibility of a prolonged life expectancy against possible stressful or unpleasant treatment side-effects (patient trade-off).

Government

Governmental tradeoffs are among the most controversial political and social difficulties of any time. All of politics can be viewed as a series of tradeoffs based upon which values matter most to most people or politicians. Political campaigns also involve tradeoffs, as when attack ads may energize the political base but alienate undecided voters.

Work schedules

With work schedules, employees will often use a "9/80" tradeoff, in which 80 hours of work are compressed into nine longer days (typically eight 9-hour days plus one 8-hour day) instead of the traditional ten 8-hour days, allowing the employee to take every second Friday off.

Politics of Europe

From Wikipedia, the free encyclopedia ...