Software development is the process of conceiving, specifying, designing, programming, documenting, testing, and bug fixing involved in creating and maintaining applications, frameworks, or other software components. In a narrow sense, software development is the writing and maintaining of source code; in a broader sense, it includes everything from the conception of the desired software through to its final manifestation, sometimes in a planned and structured process.
Therefore, software development may include research, new development,
prototyping, modification, reuse, re-engineering, maintenance, or any
other activities that result in software products.
The software can be developed for a variety of purposes, the
three most common being to meet specific needs of a specific
client/business (the case with custom software), to meet a perceived need of some set of potential users (the case with commercial and open-source software), or for personal use (e.g. a scientist may write software to automate a mundane task). Embedded software development, that is, the development of embedded software such as that used for controlling consumer products, requires the development process to be integrated with the development of the controlled physical product. System software underlies applications and the programming process itself, and is often developed separately.
The need for better quality control of the software development process has given rise to the discipline of software engineering, which aims to apply the systematic approach exemplified in the engineering paradigm to the process of software development.
A software development process (also known as a software development methodology, model, or life cycle) is a framework that is used to structure, plan, and control the process of developing information systems.
A wide variety of such frameworks has evolved over the years, each with
its own recognized strengths and weaknesses. There are several
different approaches to software development: some take a more
structured, engineering-based approach to developing software, whereas
others may take a more incremental approach, where software evolves as
it is developed piece-by-piece. One system development methodology is
not necessarily suitable for use by all projects. Each of the available
methodologies is best suited to specific kinds of projects, based on
various technical, organizational, project, and team considerations.
Most methodologies share some combination of the following stages of software development:
Identifying the need or market for the software
Analyzing and specifying the requirements
Designing the software
Implementing (coding) the software
Testing the software
Documenting the software
Deploying and maintaining the software
These stages are often referred to collectively as the software
development life-cycle, or SDLC. Different approaches to software
development may carry out these stages in different orders, or devote
more or less time to different stages. The level of detail of the
documentation produced at each stage of software development may also
vary. These stages may also be carried out in turn (a “waterfall” based
approach), or they may be repeated over various cycles or iterations (a
more "extreme" approach). The more extreme approach usually involves
less time spent on planning and documentation, and more time spent on
coding and development of automated tests.
More “extreme” approaches also promote continuous testing throughout
the development life-cycle, as well as having a working (or bug-free)
product at all times. More structured or “waterfall”-based approaches attempt to assess the majority of risks and develop a detailed plan for the software before implementation (coding) begins, and to avoid significant design changes and re-coding in later stages of the software development life-cycle.
There are significant advantages and disadvantages to the various
methodologies, and the best approach to solving a problem using
software will often depend on the type of problem. If the problem is
well understood and work can be effectively planned out ahead of time,
the more "waterfall" based approach may work the best. If, on the other
hand, the problem is unique (at least to the development team) and the
structure of the software cannot be easily envisioned, then a more
"extreme" incremental approach may work best.
Software development activities
Identification of need
The sources of ideas for software products are plentiful. These ideas can come from market research including the demographics
of potential new customers, existing customers, sales prospects who
rejected the product, other internal software development staff, or a
creative third party. Ideas for software products are usually first
evaluated by marketing
personnel for economic feasibility, for fit with existing distribution channels, for possible effects on existing product lines, for required features, and for fit with the company's marketing objectives. In a marketing evaluation phase, the cost and time assumptions are evaluated. A
decision is reached early in the first phase as to whether, based on the
more detailed information generated by the marketing and development
staff, the project should be pursued further.
In the book "Great Software Debates", Alan M. Davis states in the chapter "Requirements", sub-chapter "The Missing Piece of Software Development":
Students of engineering learn engineering and are rarely
exposed to finance or marketing. Students of marketing learn marketing
and are rarely exposed to finance or engineering. Most of us become
specialists in just one area. To complicate matters, few of us meet
interdisciplinary people in the workforce, so there are few roles to
mimic. Yet, software product planning is critical to the development
success and absolutely requires knowledge of multiple disciplines.
Planning
Planning is an element of each and every activity: the aim is to discover the things that belong to the project.
An important task in creating a software program is extracting the requirements, or requirements analysis. Customers typically have an abstract idea of what they want as an end result but do not know what the software should do. Skilled and experienced software engineers recognize
incomplete, ambiguous, or even contradictory requirements at this point.
Frequently demonstrating live code may help reduce the risk that the
requirements are incorrect.
"Although much effort is put in the requirements phase to ensure
that requirements are complete and consistent, rarely that is the case;
leaving the software design phase as the most influential one when it
comes to minimizing the effects of new or changing requirements.
Requirements volatility is challenging because they impact future or
already going development efforts."
Once the general requirements are gathered from the client, an analysis of the scope of the development should be performed and clearly stated. This is often called a scope document.
Designing
Once the requirements are established, the design of the software can be established in a software design document. This involves a preliminary or high-level design of the main modules with an overall picture (such as a block diagram)
of how the parts fit together. The language, operating system, and
hardware components should all be known at this time. Then a detailed
or low-level design is created, perhaps with prototyping as proof-of-concept or to firm up requirements.
Software testing is an integral and important phase of the software development process. This part of the process ensures that defects are recognized as soon as possible. In some processes, generally known as test-driven development, tests may be developed just before implementation and serve as a guide for the implementation's correctness.
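As a concrete sketch of that test-first rhythm, the following Python fragment shows a test written as the specification for a small, hypothetical slugify helper, together with the implementation written to satisfy it; both the helper and its tests are invented for illustration:

```python
# Minimal test-driven development sketch. In TDD the test class below is
# written first and fails until the implementation makes it pass.
# slugify() is a hypothetical helper invented for this example.
import unittest


def slugify(title: str) -> str:
    """Lower-case a title and join its words with hyphens."""
    return "-".join(title.lower().split())


class TestSlugify(unittest.TestCase):
    # These tests act as the executable specification for slugify().
    def test_basic_title(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_whitespace(self):
        self.assertEqual(slugify("  A   B  "), "a-b")


if __name__ == "__main__":
    unittest.main()
```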
Documenting
Documenting the internal design of software for the purpose of future maintenance and enhancement is done throughout development. This may also include writing an API,
be it external or internal. The software engineering process chosen by
the developing team will determine how much internal documentation (if
any) is necessary. Plan-driven models (e.g., Waterfall) generally produce more documentation than Agile models.
Deployment and maintenance
Deployment starts directly after the code is appropriately tested, approved for release,
and sold or otherwise distributed into a production environment. This
may involve installation, customization (such as by setting parameters
to the customer's values), testing, and possibly an extended period of
evaluation.
Software training and support is important, as software is only effective if it is used correctly.
Maintaining and enhancing software to cope with newly discovered faults or requirements can take substantial time and effort, as missed requirements may force redesign of the software. In most cases, maintenance is required on a regular basis to fix reported issues and keep the software running.
View model
The purpose of viewpoints and views is to enable human engineers to comprehend very complex systems and to organize the elements of the problem around domains of expertise. In the engineering
of physically intensive systems, viewpoints often correspond to
capabilities and responsibilities within the engineering organization.
Most complex system specifications are so extensive that no one
individual can fully comprehend all aspects of the specifications.
Furthermore, we all have different interests in a given system and
different reasons for examining the system's specifications. A business
executive will ask different questions of a system make-up than would a
system implementer. The concept of viewpoints framework, therefore, is
to provide separate viewpoints into the specification of a given complex
system. These viewpoints each satisfy an audience with interest in some
set of aspects of the system. Associated with each viewpoint is a
viewpoint language
that optimizes the vocabulary and presentation for the audience of that
viewpoint.
Business process and data modelling
Graphical representation of the current state of information provides a very effective means for presenting information to both users and system developers.
A business model
illustrates the functions associated with the business process being
modeled and the organizations that perform these functions. By depicting
activities and information flows, a foundation is created to visualize,
define, understand, and validate the nature of a process.
A data model provides the details of information to be stored and is of primary use when the final product is the generation of computer software code for an application or the preparation of a functional specification to aid a computer software make-or-buy decision.
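As a sketch of what such a data model pins down, the Python fragment below encodes a small, hypothetical order-handling model as dataclasses; the entities, fields, and the total() derivation are all invented for illustration:

```python
# Illustrative only: a tiny data model for a hypothetical order-handling
# process, expressed as Python dataclasses. Names are invented for this
# sketch and are not drawn from any specific methodology.
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Customer:
    customer_id: int
    name: str


@dataclass
class OrderLine:
    product_code: str
    quantity: int
    unit_price: float


@dataclass
class Order:
    order_id: int
    customer: Customer  # relationship: each order belongs to one customer
    placed_on: date
    lines: list[OrderLine] = field(default_factory=list)

    def total(self) -> float:
        # A derived value the business process cares about, computed
        # entirely from the stored data.
        return sum(line.quantity * line.unit_price for line in self.lines)
```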
Usually, a model is created after conducting an interview, referred to as business analysis.
The interview consists of a facilitator asking a series of questions
designed to extract required information that describes a process. The
interviewer is called a facilitator to emphasize that it is the
participants who provide the information. The facilitator should have
some knowledge of the process of interest, but this is not as important
as having a structured methodology by which the questions are asked of
the process expert. The methodology is
important because usually a team of facilitators is collecting
information across the facility and the results of the information from
all the interviewers must fit together once completed.
The models are developed as defining either the current state of
the process, in which case the final product is called the "as-is"
snapshot model, or a collection of ideas of what the process should
contain, resulting in a "what-can-be" model. Generation of process and
data models can be used to determine if the existing processes and
information systems are sound and only need minor modifications or
enhancements, or if re-engineering is required as a corrective action.
The creation of business models is more than a way to view or automate
your information process. Analysis can be used to fundamentally reshape
the way your business or organization conducts its operations.
Computer-aided software engineering
Computer-aided software engineering (CASE), in the field of software engineering, is the scientific application of a set of software tools and methods to the development of software, which results in high-quality, defect-free, and maintainable software products. It also refers to methods for the development of information systems together with automated tools that can be used in the software development process. The term "computer-aided software engineering" (CASE) can also refer to the software used for the automated development of systems software,
i.e., computer code. The CASE functions include analysis, design, and
programming. CASE tools automate methods for designing, documenting, and
producing structured computer code in the desired programming language.
Two key ideas of Computer-aided Software System Engineering (CASE) are:
Foster computer assistance in software development and software maintenance processes, and
An engineering approach to software development and maintenance.
Integrated development environment
IDEs are designed to maximize programmer productivity by providing tight-knit components with similar user interfaces. Typically an IDE is dedicated to a specific programming language, so as to provide a feature set which most closely matches the programming paradigms of the language.
Modeling language
A modeling language is any artificial language that can be used to express information or knowledge or systems in a structure
that is defined by a consistent set of rules. The rules are used for
interpretation of the meaning of components in the structure. A modeling
language can be graphical or textual. Graphical modeling languages use diagram techniques with named symbols that represent concepts, lines that connect the symbols and represent relationships, and various other graphical annotations to represent constraints. Textual modeling languages
typically use standardised keywords accompanied by parameters to make
computer-interpretable expressions.
Examples of graphical modelling languages in the field of software engineering are:
IDEF is a family of modeling languages, the most notable of which include IDEF0 for functional modeling, IDEF1X for information modeling, and IDEF5 for modeling ontologies.
Specification and Description Language
(SDL) is a specification language targeted at the unambiguous
specification and description of the behavior of reactive and
distributed systems.
Unified Modeling Language (UML) is a general-purpose modeling
language that is an industry standard for specifying software-intensive
systems. UML 2.0, the current version, supports thirteen different
diagram techniques and has widespread tool support.
Not all modeling languages are executable, and for those that are,
using them doesn't necessarily mean that programmers are no longer
needed. On the contrary, executable modeling languages are intended to
amplify the productivity of skilled programmers, so that they can
address more difficult problems, such as parallel computing and distributed systems.
Programming paradigm
A programming paradigm is a fundamental style of computer programming,
which is not generally dictated by the project management methodology
(such as waterfall or agile). Paradigms differ in the concepts and
abstractions used to represent the elements of a program (such as
objects, functions, variables, constraints) and the steps that comprise a
computation (such as assignments, evaluation, continuations, data
flows). Sometimes the concepts asserted by the paradigm are utilized
cooperatively in high-level system architecture design; in other cases,
the programming paradigm's scope is limited to the internal structure of
a particular program or module.
A programming language can support multiple paradigms. For example, programs written in C++ or Object Pascal can be purely procedural, or purely object-oriented, or contain elements of both paradigms. Software designers and programmers decide how to use those paradigm elements. In object-oriented programming, programmers can think of a program as a collection of interacting objects, while in functional programming
a program can be thought of as a sequence of stateless function
evaluations. When programming computers or systems with many processors,
process-oriented programming allows programmers to think about applications as sets of concurrent processes acting upon logically shared data structures.
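To make the contrast concrete, the short Python sketch below computes the same total in a procedural, an object-oriented, and a functional style; the example is illustrative and not drawn from any cited source:

```python
# The same computation in three paradigms (illustrative sketch).
from functools import reduce


# Procedural: explicit steps mutating a local accumulator.
def total_procedural(prices):
    total = 0.0
    for p in prices:
        total += p
    return total


# Object-oriented: state and behavior bundled in an interacting object.
class Basket:
    def __init__(self):
        self._prices = []

    def add(self, price):
        self._prices.append(price)

    def total(self):
        return sum(self._prices)


# Functional: a stateless expression with no side effects.
def total_functional(prices):
    return reduce(lambda acc, p: acc + p, prices, 0.0)


basket = Basket()
basket.add(1.0)
basket.add(2.5)
assert total_procedural([1.0, 2.5]) == total_functional([1.0, 2.5]) == basket.total() == 3.5
```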
Many programming paradigms are as well known for what methods they forbid as for what they enable. For instance, pure functional programming forbids using side-effects; structured programming forbids using goto
statements. Partly for this reason, new paradigms are often regarded as
doctrinaire or overly rigid by those accustomed to earlier styles.
Avoiding certain methods can make it easier to prove theorems about a
program's correctness, or simply to understand its behavior.
Grady Booch developed object-oriented design (OOD), also known as object-oriented analysis and design (OOAD). The Booch model includes six diagrams: class, object, state transition, interaction, module, and process.
Software reuse
Software reuse is the process of creating software from predefined software components. A software reuse approach seeks to increase or maximise the use of existing software artefacts in the software development lifecycle.
The following are some common software reuse methods:
A software framework is a reusable design or implementation for a software system or subsystem.
Software product lines
seek to develop software based upon a common set of 'core' assets and
process, in order to produce a range of products (or 'applications') for
a particular market.
Open-source code, available via repositories such as GitHub, provides free code for software developers to reuse and incorporate into new applications or designs.
Crowdsourcing
Crowdsourcing is a sourcing model in which individuals or organizations obtain goods or services,
including ideas, voting, micro-tasks and finances, from a large,
relatively open and often rapidly evolving group of participants. As of
2021, crowdsourcing typically involves using the internet to attract and divide work between participants to achieve a cumulative result. The word "crowdsourcing" itself, a portmanteau of "crowd" and "outsourcing", was allegedly coined in 2005.
Crowdsourcing is not necessarily an "online" activity; it existed before Internet access became a household commodity.
Major differences distinguish crowdsourcing from outsourcing.
Crowdsourcing comes from a less-specific, more public group, whereas
outsourcing is commissioned from a specific, named group, and includes a
mix of bottom-up and top-down processes. Advantages of using crowdsourcing may include improved costs, speed, quality, flexibility, scalability, or diversity.
Some forms of crowdsourcing, such as in "idea competitions" or
"innovation contests" provide ways for organizations to learn beyond the
"base of minds" provided by their employees (e.g. LEGO Ideas). Tedious "microtasks" performed in parallel by large, paid crowds (e.g. Amazon Mechanical Turk) are another form of crowdsourcing. Not-for-profit organizations have used crowdsourcing to develop common goods (e.g. Wikipedia). The effect of user communication and the platform presentation should be taken into account when evaluating the performance of ideas in crowdsourcing contexts.
The term "crowdsourcing" was coined in 2005 by Jeff Howe and Mark Robinson, editors at Wired, to describe how businesses were using the Internet to "outsource work to the crowd", which quickly led to the portmanteau "crowdsourcing". Howe first published a definition for the term crowdsourcing in a companion blog post to his June 2006 Wired article, "The Rise of Crowdsourcing", which came out in print just days later:
Simply defined, crowdsourcing represents the act of a
company or institution taking a function once performed by employees and
outsourcing it to an undefined (and generally large) network of people
in the form of an open call. This can take the form of peer-production
(when the job is performed collaboratively), but is also often
undertaken by sole individuals. The crucial prerequisite is the use of
the open call format and the large network of potential laborers.
In a 1 February 2008 article, Daren C. Brabham, "the first [person]
to publish scholarly research using the word crowdsourcing" and writer
of the 2013 book Crowdsourcing, defined it as an "online, distributed problem-solving and production model".
Kristen L. Guth and Brabham found that the performance of ideas offered in crowdsourcing platforms is affected not only by their quality, but also by the communication among users about the ideas, and the presentation in the platform itself.
After studying more than 40 definitions of crowdsourcing in the
scientific and popular literature, Enrique Estellés-Arolas and Fernando
González Ladrón-de-Guevara, researchers at the Technical University of
Valencia, developed a new integrating definition:
Crowdsourcing is a type of participative online activity
in which an individual, an institution, a nonprofit organization, or
company proposes to a group of individuals of varying knowledge,
heterogeneity, and number, via a flexible open call, the voluntary
undertaking of a task. The undertaking of the task, of variable complexity and modularity, and in which the crowd should participate, bringing their work, money, knowledge [and/or] experience, always
entails mutual benefit. The user will receive the satisfaction of a
given type of need, be it economic, social recognition, self-esteem, or
the development of individual skills, while the crowdsourcer will obtain
and use to their advantage that which the user has brought to the
venture, whose form will depend on the type of activity undertaken.
As mentioned by the definitions of Brabham and Estellés-Arolas and
Ladrón-de-Guevara above, crowdsourcing in the modern conception is an
IT-mediated phenomenon, meaning that a form of IT is always used to
create and access crowds of people.
In this respect, crowdsourcing has been considered to encompass three separate but stable techniques: competition crowdsourcing, virtual labor market crowdsourcing, and open collaboration crowdsourcing.
Henk van Ess, a college lecturer in online communications,
emphasizes the need to "give back" the crowdsourced results to the
public on ethical grounds. His nonscientific, noncommercial definition
is widely cited in the popular press:
Crowdsourcing is channeling the experts' desire to solve a problem and then freely sharing the answer with everyone.
Despite the multiplicity of definitions for crowdsourcing, one
constant has been the broadcasting of problems to the public, and an
open call for contributions to help solve the problem. Members of the
public submit solutions that are then owned by the entity, which
originally broadcast the problem. In some cases, the contributor of the
solution is compensated monetarily with prizes or with recognition. In
other cases, the only rewards may be kudos or intellectual satisfaction.
Crowdsourcing may produce solutions from amateurs or volunteers working in their spare time or from experts or small businesses, which were previously unknown to the initiating organization.
Another consequence of the multiple definitions is the controversy surrounding what kinds of activities may be considered crowdsourcing.
Historical examples
While the term "crowdsourcing" was popularized online to describe Internet-based activities, some examples of projects, in retrospect, can be described as crowdsourcing.
Timeline of major events
594 BCE – Solon
requires that all citizens swear to uphold his laws, which among other
things, strengthens citizen inclusion and involvement in the governance
of Ancient Athens, the earliest example of democratic government for which reliable documentation exists
1714 – The longitude rewards:
When the British government was trying to find a way to measure a
ship's longitudinal position, they offered the public a monetary prize to whoever came up with the best solution.
1783 – King Louis XVI offered an award to the person who could "make the alkali" by decomposing sea salt by the "simplest and most economic method".
1848 – Matthew Fontaine Maury distributed 5000 copies of his Wind and Current Charts
free of charge on the condition that sailors returned a standardized
log of their voyage to the U.S. Naval Observatory. By 1861, he had
distributed 200,000 copies free of charge, on the same conditions.
1849 – A network of some 150 volunteer weather observers all over the USA was set up as a part of the Smithsonian Institution's Meteorological Project started by the Smithsonian's first Secretary, Joseph Henry, who used the telegraph
to gather volunteers' data and create a large weather map, making new
information available to the public daily. For instance, volunteers
tracked a tornado passing through Wisconsin and sent the findings via
telegraph to the Smithsonian. Henry's project is considered the origin
of what later became the National Weather Service.
Within a decade, the project had more than 600 volunteer observers and
had spread to Canada, Mexico, Latin America, and the Caribbean.
1884 – Publication of the Oxford English Dictionary: 800 volunteers catalogued words to create the first fascicle of the OED
1916 – Planters Peanuts contest: The Mr. Peanut logo was designed by a 14-year-old boy who won the Planters Peanuts logo contest.
1970 – French amateur photo contest C'était Paris en 1970 ("This Was Paris in 1970") sponsored by the city of Paris, France-Inter radio, and the Fnac:
14,000 photographers produced 70,000 black-and-white prints and 30,000
color slides of the French capital to document the architectural changes
of Paris. Photographs were donated to the Bibliothèque historique de la ville de Paris.
1991 – Linus Torvalds begins work on the Linux operating system, inviting programmers around the world to contribute code
1997 – British rock band Marillion raised $60,000 from their fans to help finance their U.S. tour.
1999 – SETI@home was launched by the University of California, Berkeley.
Volunteers can contribute to searching for signals that might come from
extraterrestrial intelligence by installing a program that uses idle
computer time for analyzing chunks of data recorded by radio telescopes involved in the SERENDIP program.
2000 – JustGiving established: This online platform allows the public to help raise money for charities.
2000 – UNV Online Volunteering service launched: Connecting people
who commit their time and skills over the Internet to help organizations
address development challenges
2000 – iStockPhoto was founded: The free stock imagery website allows the public to contribute to and receive commission for their contributions.
2001 – Launch of Wikipedia: "Free-access, free content Internet encyclopedia"
2001 – Foundation of Topcoder – crowdsourcing software development company.
2004 – Toyota's first "Dream car art" contest: Children were asked globally to draw their "dream car of the future".
2005 – Kodak's "Go for the Gold" contest: Kodak asked anyone to submit a picture of a personal victory.
2006 – Jeff Howe coined the term crowdsourcing in Wired.
2009 – Waze, a community-oriented GPS app, allows for users to submit road info and route data based on location, such as reports of car accidents or traffic, and integrates that data into its routing algorithms for all users of the app
2011 – Casting of Flavours ("Do us a flavor" in the USA) – a campaign launched by PepsiCo's Lay's in Spain: a contest held to find a new flavor for the snack.
Early competitions
Crowdsourcing
has often been used in the past as a competition to discover a
solution. The French government proposed several of these competitions,
often rewarded with Montyon Prizes, created for poor Frenchmen who had done virtuous acts. These included the Leblanc process, or the Alkali prize, where a reward was provided for producing alkali from sea salt, and Fourneyron's turbine, the first commercial hydraulic turbine.
In response to a challenge from the French government, Nicolas Appert won a prize for inventing a new way of food preservation that involved sealing food in air-tight jars. The British government provided a similar reward to find an easy way to determine a ship's longitude in the Longitude Prize. During the Great Depression, out-of-work clerks tabulated higher mathematical functions in the Mathematical Tables Project as an outreach project.
One of the biggest crowdsourcing campaigns was a public design contest
in 2010, hosted by the Indian government's finance ministry to create a
symbol for the Indian rupee. Thousands of people sent in entries before the government zeroed in on the final symbol based on the Devanagari script using the letter Ra.
In astronomy
Crowdsourcing in astronomy was used in the early 19th century by astronomer Denison Olmsted. After being awakened late one November night by a meteor shower in progress, Olmsted noticed a pattern in the shooting stars. Olmsted
wrote a brief report of this meteor shower in the local newspaper. "As
the cause of 'Falling Stars' is not understood by meteorologists, it is
desirable to collect all the facts attending this phenomenon, stated
with as much precision as possible," Olmsted wrote to readers, in a report subsequently picked up and reprinted by newspapers nationwide.
Responses came pouring in from many states, along with scientists'
observations sent to the American Journal of Science and Arts.
These responses helped him make a series of scientific breakthroughs,
the major discovery being that meteor showers are seen nationwide, and
fall from space under the influence of gravity. Also, they demonstrated
that the showers appeared in yearly cycles, a fact that often eluded
scientists. The responses allowed him to suggest a velocity for the
meteors, although his estimate turned out to be too conservative. If he
had just taken the responses as presented, his conjecture on the
meteors' velocity would have been closer to their actual speed.
A more recent version of crowdsourcing in astronomy is NASA's photo organizing project, which asks internet users to browse photos taken from space and try to identify the location the picture is documenting.
In energy system research
Energy system models require large and diverse datasets, increasingly so given the trend towards greater temporal and spatial resolution. In response, there have been several initiatives to crowdsource this data. Launched in December 2009, OpenEI is a collaborative website, run by the US government, providing open energy data. While much of its information is from US government sources, the platform also seeks crowdsourced input from around the world. The semantic wiki and database Enipedia also publishes energy systems data using the concept of crowdsourced open information. Enipedia went live in March 2011.
In genealogy research
Genealogical research used crowdsourcing techniques long before personal computers were common. Beginning in 1942, The Church of Jesus Christ of Latter-day Saints
encouraged members to submit information about their ancestors. The
submitted information was gathered together into a single collection. In
1969, to encourage more people to participate in gathering genealogical
information about their ancestors, the church started the
three-generation program. In this program, church members were asked to
prepare documented family group record forms for the first three
generations. The program was later expanded to encourage members to
research at least four generations and became known as the
four-generation program.
Institutes that have records of interest to genealogical research
have used crowds of volunteers to create catalogs and indices to
records.
In geography
Volunteered
Geographic Information (VGI) is geographic information generated
through crowdsourcing, as opposed to traditional methods of Professional Geographic Information (PGI). In describing the built environment, VGI has many advantages over PGI, primarily perceived currency, accuracy, and authority. OpenStreetMap is an example of a crowdsourced mapping project.
In engineering
Many companies are introducing crowdsourcing to grow their engineering capabilities, to find solutions to unsolved technical challenges, and to adopt the newest technologies such as 3D printing and IoT.
In genetic genealogy research
Genetic genealogy is a combination of traditional genealogy with genetics. The rise of personal DNA testing after the turn of the century, by companies such as Gene by Gene, FTDNA, GeneTree, 23andMe, and Ancestry.com, has led to public and semipublic databases of DNA test results that use crowdsourcing techniques. In recent years, citizen science projects have become increasingly focused on providing benefits to scientific research. This includes support, organization, and dissemination of personal DNA (genetic) testing. Similar to amateur astronomy, citizen scientists encouraged by volunteer organizations like the International Society of Genetic Genealogy have provided valuable information and research to the professional scientific community.
Since 2005, the Genographic Project has used the latest
genetic technology to expand our knowledge of the human story, and its
pioneering use of DNA
testing to engage and involve the public in the research effort has
helped to create a new breed of "citizen scientist." Geno 2.0 expands
the scope for citizen science, harnessing the power of the crowd to
discover new details of human population history.
In journalism
Crowdsourcing is increasingly used in professional journalism.
Journalists are able to crowdsource information from the crowd, typically by fact-checking the information and then using it in their articles as they see fit. The leading daily newspaper in Sweden successfully used crowdsourcing to investigate home loan interest rates in the country in 2013–2014, resulting in over 50,000 submissions.
The leading daily newspaper in Finland crowdsourced an investigation into stock short selling in 2011–2012, and the crowdsourced information led to the revelation of a sketchy tax evasion scheme at a Finnish bank. The bank executive was fired and policy changes followed. TalkingPointsMemo in the United States asked its readers to examine
3000 emails concerning the firing of federal prosecutors in 2008. The
British newspaper the Guardian crowdsourced the examination of hundreds of thousands of documents in 2009.
In linguistics
Crowdsourcing strategies have been applied to estimate word knowledge, vocabulary size, and word origin.
Implicit crowdsourcing on social media has also helped efficiently
approximate sociolinguistic data. Reddit conversations in various
location-based subreddits were analyzed for the presence of grammatical
forms unique to a regional dialect. These were then used to map the
extent of the speaker population. The results could roughly approximate
large-scale surveys on the subject without engaging in field interviews.
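A hedged sketch of that kind of analysis is shown below; the forum corpus, the regions, and the grammatical marker (the "needs + past participle" construction associated with some North American Midland dialects) are all invented or simplified for illustration:

```python
# Implicit crowdsourcing sketch: estimate where a dialect form is used by
# counting its occurrences in text from location-based forums. The corpus
# here is invented; a real study would use actual subreddit comments.
import re
from collections import Counter

# Hypothetical dialect marker: "needs washed" / "needs mowed".
MARKER = re.compile(r"\bneeds\s+\w+ed\b", re.IGNORECASE)

corpus = {
    "pittsburgh": ["The car needs washed.", "The grass needs mowed again."],
    "boston": ["The car needs to be washed.", "Wicked cold out today."],
}

counts = Counter()
for region, comments in corpus.items():
    for comment in comments:
        counts[region] += len(MARKER.findall(comment))

# Normalize by comment volume so regions of different sizes are comparable.
rates = {region: counts[region] / len(comments)
         for region, comments in corpus.items()}
print(rates)  # a higher rate suggests the form is native to that region
```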
In ornithology
Another early example of crowdsourcing occurred in the field of ornithology. On 25 December 1900, Frank Chapman, an early officer of the National Audubon Society, initiated a tradition, dubbed the "Christmas Day Bird Census".
The project called on birders from across North America to count and
record the number of birds in each species they witnessed on Christmas
Day. The project was successful, and the records from 27 different
contributors were compiled into one bird census, which tallied around 90
species of birds.
This large-scale collection of data constituted an early form of
citizen science, the premise upon which crowdsourcing is based. In the
2012 census, more than 70,000 individuals participated across 2,369 bird
count circles. Christmas 2014 marked the National Audubon Society's 115th annual Christmas Bird Count.
In public policy
Crowdsourcing public policy and the production of public services is also referred to as citizen sourcing. While some scholars argue crowdsourcing is a policy tool or a definite means of co-production, others question that and argue that crowdsourcing should be considered just as a technological enabler that can simply increase the speed and ease of participation.
The first conference focusing on Crowdsourcing for Politics and
Policy took place at Oxford University, under the auspices of the Oxford
Internet Institute in 2014. Research has emerged since 2012 that focuses on the use of crowdsourcing for policy purposes. These include the experimental investigation of the use of Virtual Labor Markets for policy assessment, and an assessment of the potential for citizen involvement in process innovation for public administration.
Governments across the world are increasingly using crowdsourcing
for knowledge discovery and civic engagement. Iceland crowdsourced
their constitution reform process in 2011, and Finland has crowdsourced
several law reform processes to address their off-road traffic laws. The
Finnish government allowed citizens to go on an online forum to discuss
problems and possible resolutions regarding some off-road traffic laws.
The crowdsourced information and resolutions would then be passed on to
legislators for them to refer to when making a decision, letting
citizens more directly contribute to public policy.
The City of Palo Alto is crowdsourcing people's feedback for its Comprehensive City Plan update, in a process that started in 2015.
The House of Representatives in Brazil has used crowdsourcing in
policy-reforms, and federal agencies in the United States have used
crowdsourcing for several years.
In libraries
Crowdsourcing is used in libraries for OCR corrections on digitized texts, for tagging, and for funding, especially in the absence of financial and human means. Volunteers can contribute explicitly, with conscious effort, or implicitly, without necessarily being aware of it, by turning the text on raw newspaper images into human-corrected digital form.
Modern methods
Currently, crowdsourcing has moved mainly to the Internet, which provides a
particularly beneficial venue for crowdsourcing since individuals tend
to be more open in web-based projects where they are not being
physically judged or scrutinized, and thus can feel more comfortable
sharing. This approach ultimately allows for well-designed artistic
projects because individuals are less conscious, or maybe even less
aware, of scrutiny towards their work. In an online atmosphere, more
attention can be given to the specific needs of a project, rather than
spending as much time in communication with other individuals.
According to a definition by Henk van Ess:
"The crowdsourced problem can be huge (epic tasks like
finding alien life or mapping earthquake zones) or very small ('where
can I skate safely?'). Some examples of successful crowdsourcing themes
are problems that bug people, things that make people feel good about
themselves, projects that tap into niche knowledge of proud experts,
subjects that people find sympathetic or any form of injustice."
Crowdsourcing can either take an explicit or an implicit route.
Explicit crowdsourcing lets users work together to evaluate, share, and
build different specific tasks, while implicit crowdsourcing means that
users solve a problem as a side effect of something else they are doing.
With explicit crowdsourcing, users can evaluate particular items
like books or webpages, or share by posting products or items. Users can
also build artifacts by providing information and editing other
people's work.
Implicit crowdsourcing can take two forms: standalone and
piggyback. Standalone allows people to solve problems as a side effect
of the task they are actually doing, whereas piggyback takes users'
information from a third-party website to gather information.
In his 2013 book, Crowdsourcing, Daren C. Brabham puts forth a problem-based typology of crowdsourcing approaches:
Knowledge discovery and management is used for information
management problems where an organization mobilizes a crowd to find and
assemble information. It is ideal for creating collective resources.
Distributed human intelligence tasking is used for information
management problems where an organization has a set of information in
hand and mobilizes a crowd to process or analyze the information. It is
ideal for processing large data sets that computers cannot easily do.
Broadcast search is used for ideation problems where an organization
mobilizes a crowd to come up with a solution to a problem that has an
objective, provable right answer. It is ideal for scientific problem
solving.
Peer-vetted creative production is used for ideation problems, where
an organization mobilizes a crowd to come up with a solution to a
problem which has an answer that is subjective or dependent on public
support. It is ideal for design, aesthetic, or policy problems.
Crowdsourcing often allows participants to rank each other's
contributions, e.g. in answer to the question "What is one thing we can
do to make Acme a great company?" One common method for ranking is
"like" counting, where the contribution with the most likes ranks first.
This method is simple and easy to understand, but it privileges early
contributions, which have more time to accumulate likes. In recent years
several crowdsourcing companies have begun to use pairwise comparisons,
backed by ranking algorithms. Ranking algorithms do not penalize late
contributions. They also produce results faster. Ranking algorithms
have proven to be at least 10 times faster than manual stack ranking. One drawback, however, is that ranking algorithms are more difficult to understand than like counting.
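The text above does not say which ranking algorithms these platforms use; as one plausible sketch, the Python fragment below ranks contributions from pairwise votes with Elo-style updates. It also illustrates why late contributions are not penalized: every item starts at the same rating and moves only when it is actually compared, rather than accumulating raw likes over time.

```python
# Hypothetical pairwise-comparison ranking with Elo-style updates.
K = 32  # update step size


def expected(r_a, r_b):
    # Probability that the first item wins, given current ratings.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(ratings, winner, loser):
    surprise = 1.0 - expected(ratings[winner], ratings[loser])
    ratings[winner] += K * surprise
    ratings[loser] -= K * surprise


ratings = {"idea_a": 1000.0, "idea_b": 1000.0, "idea_c": 1000.0}
pairwise_votes = [("idea_a", "idea_b"), ("idea_c", "idea_a"), ("idea_c", "idea_b")]
for winner, loser in pairwise_votes:
    update(ratings, winner, loser)

for idea, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(idea, round(rating, 1))  # idea_c ranks first
```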
In "How to Manage Crowdsourcing Platforms Effectively," Ivo Blohm
states that there are four types of Crowdsourcing Platforms:
Microtasking, Information Pooling, Broadcast Search, and Open
Collaboration. They differ in the diversity and aggregation of
contributions that are created. The diversity of information collected
can either be homogeneous or heterogeneous. The aggregation of information
can either be selective or integrative.
Crowdvoting
Crowdvoting occurs when a website gathers a large group's opinions and judgments on a certain topic. The Iowa Electronic Market
is a prediction market that gathers crowds' views on politics and tries
to ensure accuracy by having participants pay money to buy and sell
contracts based on political outcomes.
Some of the most famous examples have made use of social media
channels: Domino's Pizza, Coca-Cola, Heineken, and Sam Adams have thus
crowdsourced a new pizza, bottle design, beer, and song, respectively. Threadless.com
selects the T-shirts it sells by having users provide designs and vote
on the ones they like, which are then printed and available for
purchase.
The California Report Card (CRC), a program jointly launched in January 2014 by the Center for Information Technology Research in the Interest of Society and Lt. Governor Gavin Newsom, is an example of modern-day crowd voting. Participants access the CRC online and vote on six timely issues. Through principal component analysis,
the users are then placed into an online "café" in which they can
present their own political opinions and grade the suggestions of other
participants. This system aims to effectively involve the greater public
in relevant political discussions and highlight the specific topics
with which Californians are most concerned.
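The CRC's exact pipeline is not described here, but the general technique of projecting participants' grades into a low-dimensional "opinion space" with principal component analysis can be sketched as follows; the grade data is invented:

```python
# Hedged PCA sketch: place participants with similar opinions near each
# other so they can be grouped into the same discussion "cafe".
import numpy as np

# Each row is one hypothetical participant's grades on six issues (0-10).
grades = np.array([
    [8, 7, 9, 2, 3, 1],   # participant 1
    [7, 8, 8, 1, 2, 2],   # participant 2 (similar profile to 1)
    [2, 1, 3, 9, 8, 9],   # participant 3 (opposite profile)
    [1, 2, 2, 8, 9, 8],   # participant 4
], dtype=float)

# Center the data, then project onto the top two principal components.
centered = grades - grades.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords = centered @ vt[:2].T  # 2-D "opinion space" coordinates

# Participants 1-2 and 3-4 land near each other and could share a cafe.
print(np.round(coords, 2))
```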
Crowdvoting's value in the movie industry was shown in 2009, when a crowd accurately predicted the success or failure of a movie based on its trailer, a feat that was replicated in 2013 by Google.
On reddit, users collectively rate web content, discussions and comments as well as questions posed to persons of interest in "AMA" and AskScience online interviews.
In 2017, Project Fanchise purchased a team in the Indoor Football League and created the Salt Lake Screaming Eagles, a fan-run team. Using a mobile app, the fans voted on the day-to-day operations of the team, the mascot name, the signing of players, and even the offensive play calling during games.
Crowdsourcing creative work
Creative crowdsourcing spans sourcing creative projects such as graphic design, crowdsourcing architecture, product design, apparel design, movies, writing, company naming, illustration, etc. While crowdsourcing competitions have been used for decades in some
creative fields (such as architecture), creative crowdsourcing has
proliferated with the recent development of web-based platforms where
clients can solicit a wide variety of creative work at lower cost than
by traditional means.
Crowdsourcing language-related data
Crowdsourcing
has also been used for gathering language-related data. For dictionary
work, as was mentioned above, it was applied over a hundred years ago by
the Oxford English Dictionary editors, using paper and postage.
Much later, a call for collecting examples of proverbs on a specific
topic (religious pluralism) was printed in a journal.
Today, as "crowdsourcing" has the inherent connotation of being
web-based, such language-related data gathering is being conducted on
the web by crowdsourcing in accelerating ways. Currently, a number of
dictionary compilation projects are being conducted on the web,
particularly for languages that are not highly academically documented,
such as for the Oromo language. Software programs have been developed for crowdsourced dictionaries, such as WeSay.
A slightly different form of crowdsourcing for language data has been
the online creation of scientific and mathematical terminology for American Sign Language.
Mining publicly available social media conversations can be used as a
form of implicit crowdsourcing to approximate the geographic extent of
speaker dialects. Proverb collection is also being done via crowdsourcing on the Web, most innovatively for the Pashto language of Afghanistan and Pakistan.
Crowdsourcing has been used extensively to collect high-quality gold-standard data for building automatic systems in natural language processing (e.g. named entity recognition, entity linking).
Crowdsolving
Crowdsolving is a collaborative, yet holistic, way of solving a problem using many people, communities, groups, or resources. It is a type of crowdsourcing with a focus on complex and intellectually demanding problems requiring considerable effort and quality or uniqueness of contribution.
Crowdfunding
Crowdfunding is the process of funding projects by a multitude of
people contributing a small amount to attain a certain monetary goal,
typically via the Internet. Crowdfunding has been used for both commercial and charitable purposes.
The crowdfunding model that has been around the longest is rewards-based crowdfunding. In this model, people can prepurchase products, buy
experiences, or simply donate. While this funding may in some cases go
towards helping a business, funders are not allowed to invest and become
shareholders via rewards-based crowdfunding.
Individuals, businesses, and entrepreneurs can showcase their
businesses and projects to the entire world by creating a profile, which
typically includes a short video introducing their project, a list of
rewards per donation, and illustrations through images. The goal is to
create a compelling message towards which readers will be drawn. Funders make monetary contributions for numerous reasons:
They connect to the greater purpose of the campaign, such as
being a part of an entrepreneurial community and supporting an
innovative idea or product.
They connect to a physical aspect of the campaign like rewards and gains from investment.
They connect to the creative display of the campaign's presentation.
They want to see new products before the public.
The dilemma for equity crowdfunding in the US as of 2012 was how the Securities and Exchange Commission
(SEC) is going to regulate the entire process. At the time, rules and
regulations were being refined by the SEC, which had until 1 January
2013, to tweak the fundraising methods. The regulators were overwhelmed
trying to regulate Dodd-Frank and all the other rules and regulations
involving public companies and the way they trade. Advocates of
regulation claimed that crowdfunding would open up the flood gates for
fraud, called it the "wild west" of fundraising, and compared it to the
1980s days of penny stock "cold-call cowboys". The process allows for up
to $1 million to be raised without some of the regulations being
involved. Companies under the then-current proposal would have
exemptions available and be able to raise capital from a larger pool of
persons, which can include lower thresholds for investor criteria,
whereas the old rules required that the person be an "accredited"
investor. These people are often recruited from social networks, where
the funds can be acquired from an equity purchase, loan, donation, or
ordering. The amounts collected have become quite high, with requests
that are over a million dollars for software such as Trampoline Systems,
which used it to finance the commercialization of their new software.
Mobile crowdsourcing
Mobile crowdsourcing involves activities that take place on smartphones or mobile platforms and that are frequently characterized by GPS technology.
This allows for real-time data gathering and gives projects greater
reach and accessibility. However, mobile crowdsourcing can lead to an
urban bias, as well as safety and privacy concerns.
Macrowork
Macrowork
tasks typically have these characteristics: they can be done
independently, they take a fixed amount of time, and they require
special skills. Macrotasks could be part of specialized projects or
could be part of a large, visible project where workers pitch in
wherever they have the required skills. The key distinguishing factors
are that macrowork requires specialized skills and typically takes
longer, while microwork requires no specialized skills.
Microwork
Microwork is a crowdsourcing practice in which users do small tasks for which computers lack aptitude, for low amounts of money. Amazon's
popular Mechanical Turk
has created many different projects for users to participate in, where
each task requires very little time and offers a very small amount in
payment. The Chinese versions of this, commonly called Witkey,
are similar and include such sites as Taskcn.com and k68.cn. When
choosing tasks, since only certain users "win", users learn to submit
later and pick less popular tasks to increase the likelihood of getting
their work chosen. An example of a Mechanical Turk project is when users searched satellite images for a boat to find lost researcher Jim Gray. Based on an elaborate survey of participants in a microtask crowdsourcing platform, Gadiraju et al. have proposed a taxonomy of different types of microtasks that are crowdsourced. Two important questions in microtask crowdsourcing are dynamic task allocation and answer aggregation.
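Answer aggregation in its simplest form is a majority vote over the labels that different workers return for the same task. The sketch below, with invented task and label data, returns the majority answer or flags a tie so more judgments can be collected:

```python
# Minimal answer-aggregation sketch for microtask crowdsourcing.
from collections import Counter

# Hypothetical labels collected from several workers per task.
answers = {
    "task_1": ["cat", "cat", "dog"],
    "task_2": ["dog", "dog", "dog"],
    "task_3": ["cat", "dog"],  # tie: needs another judgment
}


def aggregate(labels):
    """Return the majority label, or None if there is no clear majority."""
    (top, top_count), *rest = Counter(labels).most_common()
    if rest and rest[0][1] == top_count:
        return None  # tie: dynamically allocate the task to another worker
    return top


for task, labels in answers.items():
    print(task, aggregate(labels))
```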
Simple projects
Simple projects are those that require a larger amount of time and skill than micro- and macrowork. While an example of macrowork would be writing survey feedback, simple projects include activities like writing a basic line of code or programming a database, which both require a larger time commitment and skill level. These projects are usually not found on sites like Amazon Mechanical Turk, and are instead posted on platforms like Upwork that call for specific expertise.
Complex projects
Complex
projects generally take the most time, have higher stakes, and call for
people with very specific skills. These are generally "one-off"
projects that are difficult to accomplish and can include projects like
designing a new product that a company hopes to patent. Tasks like that
would be "complex" because design is a meticulous process that requires a
large amount of time to perfect, and also people doing these projects
must have specialized training in design to effectively complete the
project. These projects usually pay the highest, yet are rarely offered.
Inducement prize contests
Web-based
idea competitions or inducement prize contests often consist of generic
ideas, cash prizes, and an Internet-based platform to facilitate easy
idea generation and discussion. An example of these competitions
includes an event like IBM's 2006 "Innovation Jam", attended by over
140,000 international participants and yielding around 46,000 ideas. Another example is the Netflix Prize
in 2009. The idea was to ask the crowd to come up with a recommendation
algorithm more accurate than Netflix's own algorithm. It had a grand
prize of US$1,000,000, awarded to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06%.
Another example of competition-based crowdsourcing is the 2009 DARPA balloon
experiment, where DARPA placed 10 balloon markers across the United
States and challenged teams to compete to be the first to report the
location of all the balloons. Collaboration was required to complete the challenge quickly, and in addition to the competitive motivation of the contest as a whole, the winning team (MIT, in less than nine hours) established its own "collaborapetitive" environment to generate participation in its team. A similar challenge was the Tag Challenge,
funded by the US State Department, which required locating and
photographing individuals in five cities in the US and Europe within 12
hours based only on a single photograph. The winning team managed to
locate three suspects by mobilizing volunteers worldwide using a similar
incentive scheme to the one used in the balloon challenge.
Open innovation platforms are a very effective way of crowdsourcing people's thoughts and ideas to do research and development. The company InnoCentive
is a crowdsourcing platform for corporate research and development
where difficult scientific problems are posted for crowds of solvers to
discover the answer and win a cash prize, which can range from $10,000
to $100,000 per challenge. InnoCentive, of Waltham, Massachusetts, and London, England,
provides access to millions of scientific and technical experts from
around the world. The company claims a success rate of 50% in providing
successful solutions to previously unsolved scientific and technical
problems. IdeaConnection.com challenges people to come up with new
inventions and innovations and Ninesigma.com connects clients with
experts in various fields. The X Prize Foundation creates and runs incentive competitions offering between $1 million and $30 million for solving challenges. Local Motors
is another example of crowdsourcing. A community of 20,000 automotive
engineers, designers, and enthusiasts competes to build off-road rally
trucks.
Implicit crowdsourcing
Implicit
crowdsourcing is less obvious because users do not necessarily know
they are contributing, yet can still be very effective in completing
certain tasks. Rather than users actively participating in solving a
problem or providing information, implicit crowdsourcing involves users
doing another task entirely where a third party gains information for
another topic based on the user's actions.
A good example of implicit crowdsourcing is the ESP game,
where users guess what images are and then these labels are used to tag
Google images. Another popular use of implicit crowdsourcing is
through reCAPTCHA, which asks people to solve CAPTCHAs
to prove they are human, and then provides CAPTCHAs from old books that
cannot be deciphered by computers, to digitize them for the web. Like
many tasks solved using the Mechanical Turk, CAPTCHAs are simple for
humans, but often very difficult for computers.
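The quality-control core of the ESP game is an agreement rule: a label is accepted for an image only when two players, working independently, both propose it. A minimal sketch with invented player inputs:

```python
# ESP-style agreement rule (illustrative): only labels typed by both
# independent players are accepted, filtering out idiosyncratic guesses.
def agreed_labels(player_a, player_b, taboo=()):
    """Return labels proposed by both players, minus known taboo words."""
    return (set(player_a) & set(player_b)) - set(taboo)


a = ["dog", "grass", "frisbee", "happy"]
b = ["dog", "park", "frisbee"]

# "dog" is already a known label for this image, so it is excluded.
print(agreed_labels(a, b, taboo=["dog"]))  # {'frisbee'}
```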
Piggyback crowdsourcing is seen most frequently in websites such as Google that data-mine a user's search history and websites to
discover keywords for ads, spelling corrections, and finding synonyms.
In this way, users are unintentionally helping to modify existing
systems, such as Google's AdWords.
Health-care crowdsourcing
Research has emerged that outlines the use of crowdsourcing techniques in the public health domain.
The collective intelligence outcomes from crowdsourcing are being generated in three broad categories of public health care: health promotion, health research, and health maintenance. Crowdsourcing also enables researchers to move from small homogeneous groups of participants to large heterogeneous groups, beyond convenience samples such as students or more highly educated people. The SESH group focuses on using crowdsourcing to improve health.
Crowdsourcing in agriculture
Crowdsourced research also extends to the field of agriculture, mainly to help farmers and experts identify different types of weeds in the fields and to recommend the best ways to remove them.
Crowdsourcing in cheating in bridge
Boye Brogeland initiated a crowdsourcing investigation of cheating by top-level bridge players that showed several players were guilty, which led to their suspension.
Crowdshipping
Crowdshipping (crowd-shipping) is a peer-to-peer shipping service, usually conducted via an online platform or marketplace. There are several methods that have been categorized as crowd-shipping:
Travelers heading in the direction of the buyer who are willing to carry the package as part of their luggage for a reward.
Truck drivers whose route lies along the buyer's location and who are willing to take extra items in their truck.
Community-based platforms that connect international buyers with local forwarders by letting buyers use a forwarder's address as the purchase destination, after which the forwarder ships the items on to the buyer.
Crowdsourcers
A number of motivations exist for businesses to use crowdsourcing to accomplish their tasks, find solutions to problems, or gather
information. These include the ability to offload peak demand, access
cheap labor and information, generate better results, access a wider
array of talent than might be present in one organization, and undertake
problems that would have been too difficult to solve internally.
Crowdsourcing allows businesses to submit problems on which
contributors can work, on topics such as science, manufacturing,
biotech, and medicine, with monetary rewards for successful solutions.
Although crowdsourcing complicated tasks can be difficult, simple work
tasks can be crowdsourced cheaply and effectively.
Crowdsourcing also has the potential to be a problem-solving mechanism for government and nonprofit use.
Urban and transit planning are prime areas for crowdsourcing. One
project to test crowdsourcing's public participation process for transit
planning in Salt Lake City was carried out from 2008 to 2009, funded by
a U.S. Federal Transit Administration grant. Another notable application of crowdsourcing to government problem solving is the Peer to Patent Community Patent Review project for the U.S. Patent and Trademark Office.
Researchers have used crowdsourcing systems like the Mechanical
Turk to aid their research projects by crowdsourcing some aspects of the
research process, such as data collection, parsing, and evaluation.
Notable examples include using the crowd to create speech and language
databases, and using the crowd to conduct user studies.
Crowdsourcing systems provide these researchers with the ability to
gather large amounts of data. Additionally, using crowdsourcing,
researchers can collect data from populations and demographics they may
not have had access to locally, but that improve the validity and value
of their work.
Artists have also used crowdsourcing systems. In his project The Sheep Market, Aaron Koblin used Mechanical Turk to collect 10,000 drawings of sheep from contributors around the world. The artist Sam Brown leverages the crowd by asking visitors to his website explodingdog to send him sentences that he uses as inspiration for paintings.
Art curator Andrea Grover argues that individuals tend to be more open
in crowdsourced projects because they are not being physically judged or
scrutinized.
As with other crowdsourcers, artists use crowdsourcing systems to
generate and collect data. The crowd also can be used to provide
inspiration and to collect financial support for an artist's work.
Additionally, INRIX crowdsources driving times from 100 million drivers to provide better GPS routing and real-time traffic updates.
Demographics
The crowd is an umbrella term for the people who contribute to
crowdsourcing efforts. Though it is sometimes difficult to gather data
about the demographics of the crowd, a study by Ross et al.
surveyed the demographics of a sample of the more than 400,000
registered crowdworkers using Amazon Mechanical Turk to complete tasks
for pay. A previous study in 2008 by Ipeirotis
found that users at that time were primarily American, young, female,
and well-educated, with 40% earning more than $40,000 per year. In
November 2009, Ross found a very different Mechanical Turk population,
36% of which was Indian. Two-thirds of Indian workers were male, and 66%
had at least a bachelor's degree. Two-thirds had annual incomes less
than $10,000, with 27% sometimes or always depending on income from
Mechanical Turk to make ends meet.
The average US user of Mechanical Turk earned $2.30 per hour for tasks in 2009, versus $1.58 for the average Indian worker. While the majority of users worked less than five hours per week, 18% worked 15 hours per week or more. These hourly earnings are below the minimum wage in the United States (but not in India), which Ross suggests raises ethical questions for researchers who use crowdsourcing.
The demographics of Microworkers.com differ from Mechanical Turk
in that the US and India together account for only 25% of workers; 197
countries are represented among users, with Indonesia (18%) and
Bangladesh (17%) contributing the largest share. However, 28% of
employers are from the US.
Another study of the demographics of the crowd at iStockphoto found a crowd that was largely white, middle- to upper-class, and highly educated, worked in so-called "white-collar" jobs, and had a high-speed Internet connection at home. In a 30-day crowdsourcing diary study in Europe, the participants were predominantly highly educated women.
Studies have also found that crowds are not simply collections of
amateurs or hobbyists. Rather, crowds are often professionally trained
in a discipline relevant to a given crowdsourcing task and sometimes
hold advanced degrees and many years of experience in the profession.
Claiming that crowds are amateurs rather than professionals is factually untrue and may lead to the marginalization of crowd labor rights.
G. D. Saxton et al. (2013) studied the role of community users, among other elements, in their content analysis of 103
crowdsourcing organizations. Saxton et al. developed a taxonomy
of nine crowdsourcing models (intermediary model, citizen media
production, collaborative software development, digital goods sales,
product design, peer-to-peer social financing, consumer report model,
knowledge base building model, and collaborative science project model)
in which to categorize the roles of community users, such as researcher,
engineer, programmer, journalist, graphic designer, etc., and the
products and services developed.
Motivations
Contributors
Many scholars of crowdsourcing suggest that both intrinsic and extrinsic motivations cause people to contribute to crowdsourced tasks, and that these factors influence different types of contributors differently.
For example, students and people employed full-time rate human capital
advancement as less important than part-time workers do, while women
rate social contact as more important than men do.
Intrinsic motivations are broken down into two categories:
enjoyment-based and community-based motivations. Enjoyment-based
motivations refer to motivations related to the fun and enjoyment that
contributors experience through their participation. These motivations
include skill variety, task identity, task autonomy, direct feedback
from the job, and pastime. Community-based motivations refer to
motivations related to community participation, and include community
identification and social contact. In crowdsourced journalism, the
motivation factors are intrinsic: the crowd is driven by a possibility
to make social impact, contribute to social change and help their peers.
Extrinsic motivations are broken down into three categories:
immediate payoffs, delayed payoffs, and social motivations. Immediate
payoffs, through monetary payment, are the immediately received
compensations given to those who complete tasks. Delayed payoffs are
benefits that can be used to generate future advantages, such as
training skills and being noticed by potential employers. Social
motivations are the rewards of behaving pro-socially, such as the altruistic motivations of online volunteers.
Chandler and Kapelner found that US users of the Amazon Mechanical Turk
were more likely to complete a task when told they were going to "help researchers identify tumor cells" than when they were not told the purpose of their task. However, of those who completed the task, quality
of output did not depend on the framing of the task.
Motivation factors in crowdsourcing are often a mix of intrinsic and extrinsic factors, as in one crowdsourced law-making project. There, intrinsic motivations included fulfilling civic duty, affecting the law for sociotropic reasons, and deliberating with and learning from peers. Extrinsic motivations included
changing the law for financial gain or other benefits. Participation in
crowdsourced policy-making was an act of grassroots advocacy, whether to
pursue one's own interest or more altruistic goals, such as protecting
nature.
Another form of social motivation is prestige or status. The International Children's Digital Library
recruits volunteers to translate and review books. Because all
translators receive public acknowledgment for their contributions,
Kaufman and Schulz cite this as a reputation-based strategy to motivate
individuals who want to be associated with institutions that have
prestige. The Mechanical Turk uses reputation as a motivator in a
different sense, as a form of quality control. Crowdworkers who
frequently complete tasks in ways judged to be inadequate can be denied
access to future tasks, providing motivation to produce high-quality
work.
Requesters
Using crowdsourcing through means such as Amazon Mechanical Turk can help provide researchers and requesters with an already established infrastructure for their projects, allowing them to easily use a crowd and access participants from diverse cultural backgrounds. Using
crowdsourcing can also help complete the work for projects that would
normally have geographical and population size limitations.
Participation in crowdsourcing
Despite the potential global reach of IT applications online, recent research illustrates that differences in location affect participation outcomes in IT-mediated crowds.
Limitations and controversies
At least six major topics cover the limitations and controversies about crowdsourcing:
Impact of crowdsourcing on product quality
Entrepreneurs contribute less capital themselves
Increased number of funded ideas
The value and impact of the work received from the crowd
The ethical implications of low wages paid to crowdworkers
Trustworthiness and informed decision making
Impact of crowdsourcing on product quality
Crowdsourcing allows anyone to participate, which invites many unqualified participants and results in large quantities of unusable
contributions. Companies, or additional crowdworkers, then have to sort
through all of these low-quality contributions. The task of sorting
through crowdworkers' contributions, along with the necessary job of
managing the crowd, requires companies to hire actual employees, thereby
increasing management overhead.
For example, susceptibility to faulty results can be caused by targeted, malicious work efforts. Since crowdworkers completing microtasks are paid per task, the financial incentive often causes workers to complete
tasks quickly rather than well. Verifying responses is time-consuming,
so requesters often depend on having multiple workers complete the same
task to correct errors. However, having each task completed multiple
times increases time and monetary costs.
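A minimal sketch of this redundancy strategy (Python, with hypothetical responses and pay rate) illustrates both the error correction and the cost multiplier: each extra assignment per task improves reliability but multiplies the payment.

    # Sketch of redundancy-based quality control: each task is assigned to
    # several workers and the plurality answer wins. Data and pay rate are
    # assumptions for illustration.

    from collections import Counter

    # Hypothetical responses: task id -> answers from independent workers.
    responses = {
        "task-1": ["cat", "cat", "dog"],   # one careless worker is outvoted
        "task-2": ["dog", "dog", "dog"],
        "task-3": ["cat", "dog", "bird"],  # no majority -> flag for review
    }

    PAY_PER_ASSIGNMENT = 0.05  # dollars; 3x redundancy means 3x this cost

    def majority_answer(answers):
        """Return the plurality answer, or None when there is a tie."""
        (top, top_count), *rest = Counter(answers).most_common()
        if rest and rest[0][1] == top_count:
            return None  # tie: no reliable majority
        return top

    for task, answers in responses.items():
        cost = len(answers) * PAY_PER_ASSIGNMENT
        print(task, majority_answer(answers), f"cost=${cost:.2f}")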
Crowdsourcing quality is also impacted by task design. Lukyanenko et al.
argue that the prevailing practice of modeling crowdsourcing data collection tasks in terms of fixed classes (options) unnecessarily
restricts quality. Results demonstrate that information accuracy depends
on the classes used to model domains, with participants providing more
accurate information when classifying phenomena at a more general level
(which is typically less useful to sponsor organizations, hence less
common). Further, greater overall accuracy is expected when participants can provide free-form data than when they select from constrained choices.
Just as limiting, the crowd often simply lacks the skills or expertise needed to accomplish the desired task. While this scenario does not affect "simple" tasks such
as image labeling, it is particularly problematic for more complex
tasks, such as engineering design or product validation. A comparison between evaluations of business models by experts and by an anonymous online crowd showed that the crowd could not evaluate business models at the same level as the experts.
In these cases, it may be difficult or even impossible to find the
qualified people in the crowd, as their voices may be drowned out by
consistent but incorrect crowd members.
However, even for tasks of intermediate difficulty, estimating crowdworkers' skills and intentions and leveraging them to infer true responses works well, albeit at an additional computation cost.
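One simple way to realize such an estimate (a simplified stand-in for EM-style estimators such as Dawid-Skene, not necessarily the method used in the cited work) is to measure each worker's accuracy on gold-standard questions and weight their votes by the log-odds of that accuracy, as in this Python sketch with hypothetical workers:

    # Sketch of skill-weighted answer inference. Worker accuracies are
    # assumed to have been estimated on known-answer ("gold") questions;
    # real systems typically refine them with EM-style algorithms.

    import math
    from collections import defaultdict

    # Hypothetical per-worker accuracy measured on gold-standard questions.
    accuracy = {"w1": 0.95, "w2": 0.60, "w3": 0.55}

    def weight(acc, eps=1e-6):
        """Log-odds weight: accurate workers count more, 50% counts ~0."""
        acc = min(max(acc, eps), 1 - eps)
        return math.log(acc / (1 - acc))

    def infer_answer(votes):
        """votes: list of (worker_id, answer); return the best-supported answer."""
        scores = defaultdict(float)
        for worker, answer in votes:
            scores[answer] += weight(accuracy[worker])
        return max(scores, key=scores.get)

    # One highly accurate worker outweighs two mediocre workers combined.
    print(infer_answer([("w1", "genuine"), ("w2", "spam"), ("w3", "spam")]))
    # -> 'genuine'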
Crowdworkers are a nonrandom sample of the population. Many
researchers use crowdsourcing to quickly and cheaply conduct studies
with larger sample sizes than would be otherwise achievable. However,
due to limited access to the Internet, participation in less developed countries is relatively low. Participation in highly developed countries
is similarly low, largely because the low amount of pay is not a strong
motivation for most users in these countries. These factors lead to a
bias in the population pool towards users in moderately developed countries, as measured by the Human Development Index.
The likelihood that a crowdsourced project will fail due to lack
of monetary motivation or too few participants increases over the course
of the project. Crowdsourcing markets are not a first-in, first-out
queue. Tasks that are not completed quickly may be forgotten, buried by
filters and search procedures so that workers do not see them. This
results in a long-tail power law distribution of completion times.
Additionally, low-paying research studies online have higher rates of
attrition, with participants not completing the study once started. Even when tasks are completed, crowdsourcing does not always produce quality results. When Facebook began its localization program in 2008, it encountered some criticism for the low quality of its crowdsourced translations.
One of the problems of crowdsourcing products is the lack of
interaction between the crowd and the client. Usually little information
is known about the final desired product, and often very limited
interaction with the final client occurs. This can decrease the quality
of product because client interaction is a vital part of the design
process.
An additional cause of the decrease in product quality that can
result from crowdsourcing is the lack of collaboration tools. In a
typical workplace, coworkers are organized in such a way that they can
work together and build upon each other's knowledge and ideas.
Furthermore, the company often provides employees with the necessary
information, procedures, and tools to fulfill their responsibilities.
However, in crowdsourcing, crowdworkers are left to depend on their own
knowledge and means to complete tasks.
A crowdsourced project is usually expected to be unbiased by
incorporating a large population of participants with a diverse
background. However, most crowdsourced work is done by people who are paid or who directly benefit from the outcome (e.g., much of the open-source development of Linux). In many other cases, the end product is the outcome of a single person's endeavour: one contributor creates the majority of the product, while the crowd participates only in minor details.
Entrepreneurs contribute less capital themselves
To make an idea turn into a reality, the first component needed is
capital. Depending on the scope and complexity of the crowdsourced
project, the amount of necessary capital can range from a few thousand
dollars to hundreds of thousands, if not more. The capital-raising
process can take from days to months depending on different variables,
including the entrepreneur's network and the amount of initial
self-generated capital.
The crowdsourcing process allows entrepreneurs to access a
wide range of investors who can take different stakes in the project.
In effect, crowdsourcing simplifies the capital-raising process and
allows entrepreneurs to spend more time on the project itself and
reaching milestones rather than dedicating time to get it started.
Overall, the simplified access to capital can save time in starting projects and potentially increase the efficiency of projects.
Opponents argue that easier access to capital through a large number of smaller investors can hurt the project and its creators.
With a simplified capital-raising process involving more investors with
smaller stakes, investors are more risk-seeking because they can take
on an investment size with which they are comfortable.
This leads to entrepreneurs losing the experience of convincing risk-wary investors, because they no longer depend on one single investor for the survival of their project. Instead of being forced to assess risks and convince large institutional investors that their project can be successful, entrepreneurs can simply replace wary investors with others who are willing to take on the risk.
Some translation companies and consumers of translations purport to use crowdsourcing as a means of drastically cutting costs instead of hiring professional translators. This practice has been systematically denounced by IAPTI and other translator organizations.
Increased number of funded ideas
Both the raw number of ideas that get funded and the quality of those ideas are major points of controversy in crowdsourcing.
Proponents argue that crowdsourcing is beneficial because it allows niche ideas that would not survive venture capital or angel funding (often the primary investors in startups) to get started.
Many ideas are killed in their infancy due to insufficient support and
lack of capital, but crowdsourcing allows these ideas to be started if
an entrepreneur can find a community to take interest in the project.
Crowdsourcing allows those who would benefit from the project to fund it and become a part of it, which is one way for small niche ideas to get started.
However, when the raw number of projects grows, the number of possible
failures can also increase. Crowdsourcing assists niche and high-risk
projects to start because of a perceived need from a select few who seek
the product. With high risk and small target markets, the pool of
crowdsourced projects faces a greater possible loss of capital, lower
return, and lower levels of success.
Concerns
Because crowdworkers are considered independent contractors rather than employees, they are not guaranteed minimum wage.
In practice, workers using the Amazon Mechanical Turk generally earn
less than the minimum wage. In 2009, it was reported that United States
Turk users earned an average of $2.30 per hour for tasks, while users in
India earned an average of $1.58 per hour, which is below minimum wage
in the United States (but not in India).
Some researchers who have considered using Mechanical Turk to recruit participants for research studies have argued that the wage conditions might be unethical.
However, according to other research, workers on Amazon Mechanical Turk
do not feel they are exploited, and are ready to participate in
crowdsourcing activities in the future.
When Facebook began its localization program in 2008, it received
criticism for using free labor in crowdsourcing the translation of site
guidelines.
Typically, no written contracts, nondisclosure agreements, or
employee agreements are made with crowdworkers. For users of the Amazon
Mechanical Turk, this means that requesters decide whether users' work
is acceptable, and reserve the right to withhold pay if it does not meet
their standards.
Critics say that crowdsourcing arrangements exploit individuals in the
crowd, and a call has been made for crowds to organize for their labor
rights.
Collaboration between crowd members can also be difficult or even
discouraged, especially in the context of competitive crowdsourcing.
Crowdsourcing site InnoCentive allows organizations to solicit solutions
to scientific and technological problems; only 10.6% of respondents
report working in a team on their submission.
Amazon Mechanical Turk workers collaborated with academics to create a
platform, WeAreDynamo.org, that allows them to organize and create
campaigns to better their work situation.
Irresponsible crowdsourcing
The popular forum website Reddit came under the spotlight during the first few days after the Boston Marathon bombing, as it showed how powerful social media and crowdsourcing could be. Reddit users helped many victims of the bombing by sending relief, and some even opened their homes to them, all coordinated efficiently on the site.
However, Reddit soon came under fire after its users started to crowdsource information on the possible perpetrators of the bombing. While the FBI received thousands of photos from ordinary citizens, the website's users also began to run their own investigation with the information they were crowdsourcing. Eventually, Reddit members claimed to have identified four bombers, but all were innocent, including a college student who had committed suicide a few days before the bombing.
The problem was exacerbated when the media also started to rely on Reddit as a source of information, allowing the misinformation to spread almost nationwide. The FBI has since warned the media to be more careful about where they get their information, but Reddit's investigation and its false accusations opened up questions about what should be crowdsourced and the unintended consequences of irresponsible crowdsourcing.