
Wednesday, April 24, 2019

Crowdsourcing

From Wikipedia, the free encyclopedia

Crowdsourcing is a sourcing model in which individuals or organizations obtain goods and services, including ideas and finances, from a large, relatively open and often rapidly evolving group of internet users; it divides work between participants to achieve a cumulative result. The word crowdsourcing itself is a portmanteau of crowd and outsourcing, and was coined in 2005. As a mode of sourcing, crowdsourcing existed prior to the digital age (i.e. "offline").
 
There are major differences between crowdsourcing and outsourcing. Crowdsourcing comes from a less-specific, more public group and includes a mix of bottom-up and top-down processes, whereas outsourcing is commissioned from a specific, named group. Advantages of using crowdsourcing may include improved costs, speed, quality, flexibility, scalability, or diversity.

Some forms of crowdsourcing, such as "idea competitions" or "innovation contests", provide ways for organizations to learn beyond the "base of minds" provided by their employees (e.g. LEGO Ideas). Tedious "microtasks" performed in parallel by large, paid crowds (e.g. Amazon Mechanical Turk) are another form of crowdsourcing. It has also been used by not-for-profit organizations and to create common goods (e.g. Wikipedia). The effects of user communication and of the platform's presentation should be taken into account when evaluating the performance of ideas in crowdsourcing contexts.

Definitions

The term "crowdsourcing" was coined in 2005 by Jeff Howe and Mark Robinson, editors at Wired, to describe how businesses were using the Internet to "outsource work to the crowd", which quickly led to the portmanteau "crowdsourcing." Howe, first published a definition for the term crowdsourcing in a companion blog post to his June 2006 Wired article, "The Rise of Crowdsourcing", which came out in print just days later:
"Simply defined, crowdsourcing represents the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call. This can take the form of peer-production (when the job is performed collaboratively), but is also often undertaken by sole individuals. The crucial prerequisite is the use of the open call format and the large network of potential laborers."
In a February 1, 2008, article, Daren C. Brabham, "the first [person] to publish scholarly research using the word crowdsourcing" and author of the 2013 book Crowdsourcing, defined it as an "online, distributed problem-solving and production model." Kristen L. Guth and Brabham found that the performance of ideas offered on crowdsourcing platforms is affected not only by their quality, but also by the communication among users about the ideas and by their presentation within the platform itself.

After studying more than 40 definitions of crowdsourcing in the scientific and popular literature, Enrique Estellés-Arolas and Fernando González Ladrón-de-Guevara, researchers at the Technical University of Valencia, developed a new integrating definition:
"Crowdsourcing is a type of participative online activity in which an individual, an institution, a nonprofit organization, or company proposes to a group of individuals of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task. The undertaking of the task; of variable complexity and modularity, and; in which the crowd should participate, bringing their work, money, knowledge **[and/or]** experience, always entails mutual benefit. The user will receive the satisfaction of a given type of need, be it economic, social recognition, self-esteem, or the development of individual skills, while the crowdsourcer will obtain and use to their advantage that which the user has brought to the venture, whose form will depend on the type of activity undertaken".
As the definitions of Brabham and of Estellés-Arolas and Ladrón-de-Guevara above indicate, crowdsourcing in the modern conception is an IT-mediated phenomenon, meaning that a form of IT is always used to create and access crowds of people. In this respect, crowdsourcing has been considered to encompass three separate but stable techniques: competition crowdsourcing, virtual labor market crowdsourcing, and open collaboration crowdsourcing.

Henk van Ess, a college lecturer in online communications, emphasizes the need to "give back" the crowdsourced results to the public on ethical grounds. His nonscientific, noncommercial definition is widely cited in the popular press:
"Crowdsourcing is channeling the experts’ desire to solve a problem and then freely sharing the answer with everyone."
Despite the multiplicity of definitions for crowdsourcing, one constant has been the broadcasting of problems to the public, and an open call for contributions to help solve the problem. Members of the public submit solutions that are then owned by the entity that originally broadcast the problem. In some cases, the contributor of the solution is compensated monetarily with prizes or with recognition. In other cases, the only rewards may be kudos or intellectual satisfaction. Crowdsourcing may produce solutions from amateurs or volunteers working in their spare time, or from experts or small businesses previously unknown to the initiating organization.

Another consequence of the multiple definitions is the controversy surrounding what kinds of activities may be considered crowdsourcing.

Historical examples

While the term "crowdsourcing" was popularized online to describe Internet-based activities, some projects can, in retrospect, be described as crowdsourcing.

Timeline of major events

  • 1714 – The longitude rewards: When the British government was trying to find a way to measure a ship's longitudinal position, it offered a monetary prize to whoever came up with the best solution.
  • 1783 – King Louis XVI offered an award to the person who could ‘make the alkali’ by decomposing sea salt by the ‘simplest and most economic method.’
  • 1848 – Matthew Fontaine Maury distributed 5000 copies of his Wind and Current Charts free of charge on the condition that sailors returned a standardized log of their voyage to the U.S. Naval Observatory. By 1861, he had distributed 200,000 copies free of charge, on the same conditions.
  • 1849 – A network of some 150 volunteer weather observers all over the USA was set up as a part of the Smithsonian Institution's Meteorological Project started by the Smithsonian's first Secretary, Joseph Henry, who used the telegraph to gather volunteers’ data and create a large weather map, making new information available to the public daily. For instance, volunteers tracked a tornado passing through Wisconsin and sent the findings via telegraph to the Smithsonian. Henry's project is considered the origin of what later became the National Weather Service. Within a decade, the project had more than 600 volunteer observers and had spread to Canada, Mexico, Latin America, and the Caribbean.
  • 1884 – Publication of the Oxford English Dictionary: 800 volunteers catalogued words to create the first fascicle of the OED.
  • 1916 – Planters Peanuts contest: The Mr. Peanut logo was designed by a 14-year-old boy who won the Planters Peanuts logo contest.
  • 1957 – Jørn Utzon, winner of the design competition for the Sydney Opera House
  • 1970 – French amateur photo contest ‘C’était Paris en 1970’ (‘This Was Paris in 1970’) sponsored by the city of Paris, France-Inter radio, and the Fnac: 14,000 photographers produced 70,000 black-and-white prints and 30,000 color slides of the French capital to document the architectural changes of Paris. Photographs were donated to the Bibliothèque historique de la ville de Paris.
  • 1996 – The Hollywood Stock Exchange was founded: Allowed participants to buy and sell virtual shares of movies and celebrities.
  • 1997 – British rock band Marillion raised $60,000 from their fans to help finance their U.S. tour.
  • 1999 – SETI@home was launched by the University of California, Berkeley. Volunteers can contribute to searching for signals that might come from extraterrestrial intelligence by installing a program that uses idle computer time to analyze chunks of data recorded by radio telescopes involved in the SERENDIP program.
  • 2000 – JustGiving established: This online platform allows the public to help raise money for charities.
  • 2000 – UNV Online Volunteering service launched: Connecting people who commit their time and skills over the Internet to help organizations address development challenges
  • 2000 – iStockPhoto was founded: The free stock imagery website allowed the public to contribute stock images and receive a commission for their contributions.
  • 2001 – Launch of Wikipedia: “Free-access, free content Internet encyclopedia”
  • 2001 – Foundation of Topcoder, a crowdsourcing software development company.
  • 2004 – Toyota’s first "Dream car art" contest: Children around the world were asked to draw their ‘dream car of the future.’
  • 2005 – Kodak’s "Go for the Gold" contest: Kodak asked anyone to submit a picture of a personal victory.
  • 2006 – Jeff Howe coined the term crowdsourcing in Wired.
  • 2009 – Waze, a community-oriented GPS app, allows users to submit location-based road information, such as reports of car accidents or traffic, and integrates that data into its routing algorithms for all users of the app.

Early competitions

Crowdsourcing has often been used in the past as a competition to discover a solution. The French government proposed several of these competitions, often rewarded with Montyon Prizes, created for poor Frenchmen who had done virtuous acts. These included the Leblanc process, or the Alkali prize, where a reward was provided for producing alkali from sea salt, and Fourneyron's turbine, when the first commercial hydraulic turbine was developed.

In response to a challenge from the French government, Nicolas Appert won a prize for inventing a new way of food preservation that involved sealing food in air-tight jars. The British government provided a similar reward for an easy way to determine a ship's longitude in the Longitude Prize. During the Great Depression, out-of-work clerks tabulated higher mathematical functions in the Mathematical Tables Project as an outreach project. One of the biggest crowdsourcing campaigns was a public design contest in 2010, hosted by the Indian government's finance ministry, to create a symbol for the Indian rupee. Thousands of people sent in entries before the government settled on the final symbol, based on the Devanagari letter 'Ra'.

In astronomy

Crowdsourcing in astronomy was used in the early 19th century by astronomer Denison Olmsted. After being awakened on a late November night by a meteor shower, Olmsted noticed a pattern in the shooting stars and wrote a brief report of the meteor shower in the local newspaper. “As the cause of ‘Falling Stars’ is not understood by meteorologists, it is desirable to collect all the facts attending this phenomenon, stated with as much precision as possible,” Olmsted wrote to readers, in a report subsequently picked up and reprinted by newspapers nationwide. Responses came pouring in from many states, along with scientists’ observations sent to the American Journal of Science and Arts. These responses helped him make a series of scientific breakthroughs, the major discovery being that meteor showers are seen nationwide and fall from space under the influence of gravity. The responses also demonstrated that the showers appeared in yearly cycles, a fact that had often eluded scientists. Finally, they allowed him to suggest a velocity for the meteors, although his estimate turned out to be too conservative; had he taken the responses simply as presented, his conjecture would have been closer to the meteors' actual speed.

A more recent version of crowdsourcing in astronomy is NASA's photo organizing project, which asks internet users to browse photos taken from space and try to identify the location the picture is documenting.

In energy system research

Energy system models require large and diverse datasets, increasingly so given the trend towards greater temporal and spatial resolution. In response, there have been several initiatives to crowdsource this data. Launched in December 2009, OpenEI is a collaborative website, run by the US government, providing open energy data. While much of its information is from US government sources, the platform also seeks crowdsourced input from around the world. The semantic wiki and database Enipedia also publishes energy systems data using the concept of crowdsourced open information. Enipedia went live in March 2011.

In genealogy research

Genealogical research used crowdsourcing techniques long before personal computers were common. Beginning in 1942, The Church of Jesus Christ of Latter-day Saints encouraged its members to submit information about their ancestors. The submitted information was gathered together into a single collection. In 1969, to encourage more people to participate in gathering genealogical information about their ancestors, the church started the three-generation program. In this program, church members were asked to prepare documented family group record forms for the first three generations. The program was later expanded to encourage members to research at least four generations and became known as the four-generation program.

Institutes that have records of interest to genealogical research have used crowds of volunteers to create catalogs and indices to records.

In geography

Volunteered Geographic Information (VGI) is geographic information generated through crowdsourcing, as opposed to traditional methods of Professional Geographic Information (PGI). In describing the built environment, VGI has many advantages over PGI, primarily perceived currency, accuracy, and authority.

In engineering

Many companies are introducing crowdsourcing to grow their engineering capabilities, find solutions to unsolved technical challenges, and meet the need to adopt new technologies such as 3D printing and the Internet of Things (IoT).

In genetic genealogy research

Genetic genealogy is a combination of traditional genealogy with genetics. The rise of personal DNA testing after the turn of the century, by companies such as Gene by Gene, FTDNA, GeneTree, 23andMe, and Ancestry.com, has led to public and semipublic databases of DNA test results that use crowdsourcing techniques. In recent years, citizen science projects have become increasingly focused on providing benefits to scientific research. This includes support, organization, and dissemination of personal DNA (genetic) testing. Similar to amateur astronomy, citizen scientists encouraged by volunteer organizations like the International Society of Genetic Genealogy have provided valuable information and research to the professional scientific community.

Spencer Wells, director of the Genographic Project, wrote:
Since 2005, the Genographic Project has used the latest genetic technology to expand our knowledge of the human story, and its pioneering use of DNA testing to engage and involve the public in the research effort has helped to create a new breed of "citizen scientist." Geno 2.0 expands the scope for citizen science, harnessing the power of the crowd to discover new details of human population history.

In journalism

Crowdsourcing is increasingly used in professional journalism. Journalists crowdsource information from the crowd, typically fact-check the information, and then use it in their articles as they see fit. The leading daily newspaper in Sweden successfully used crowdsourcing to investigate home loan interest rates in the country in 2013-2014, resulting in over 50,000 submissions. The leading daily newspaper in Finland crowdsourced an investigation into stock short selling in 2011-2012, and the crowdsourced information led to the revelation of a dubious tax evasion scheme at a Finnish bank. The bank executive was fired and policy changes followed. TalkingPointsMemo in the United States asked its readers to examine 3,000 emails concerning the firing of federal prosecutors in 2008. The British newspaper The Guardian crowdsourced the examination of hundreds of thousands of documents in 2009.

In linguistics

Crowdsourcing strategies have been applied to estimate word knowledge and vocabulary size.
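
One common crowdsourced design for vocabulary-size estimation (the numbers, lexicon size, and guessing correction below are illustrative, not drawn from any specific study) shows each participant a random sample of real words mixed with fake words, then extrapolates from the sample to the full lexicon:

    def estimate_vocabulary(known_real, shown_real, false_alarms, shown_fake, lexicon_size=60000):
        """Extrapolate vocabulary size from a sampled yes/no word test.

        known_real / shown_real: real test words marked "known" vs. shown.
        false_alarms / shown_fake: fake words marked "known" vs. shown,
        used to correct for guessing. lexicon_size is an assumed total.
        """
        hit_rate = known_real / shown_real
        guess_rate = false_alarms / shown_fake
        corrected = max(0.0, hit_rate - guess_rate)
        return corrected * lexicon_size

    # estimate_vocabulary(70, 100, 5, 50) -> (0.70 - 0.10) * 60000 = 36000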

In ornithology

Another early example of crowdsourcing occurred in the field of ornithology. On December 25, 1900, Frank Chapman, an early officer of the National Audubon Society, initiated a tradition dubbed the "Christmas Day Bird Census". The project called on birders from across North America to count and record the number of birds of each species they witnessed on Christmas Day. The project was successful, and the records from 27 different contributors were compiled into one bird census, which tallied around 90 species of birds. This large-scale collection of data constituted an early form of citizen science, the premise upon which crowdsourcing is based. In the 2012 census, more than 70,000 individuals participated across 2,369 bird count circles. Christmas 2014 marked the National Audubon Society's 115th annual Christmas Bird Count.

In public policy

Crowdsourcing public policy and the production of public services is also referred to as citizen sourcing. While some scholars argue crowdsourcing is a policy tool or a definite means of co-production, others question that and argue that crowdsourcing should be considered only a technological enabler that can increase the speed and ease of participation.

The first conference focusing on Crowdsourcing for Politics and Policy took place at Oxford University, under the auspices of the Oxford Internet Institute in 2014. Research has emerged since 2012 that focuses on the use of crowdsourcing for policy purposes. These include the experimental investigation of the use of Virtual Labor Markets for policy assessment, and an assessment of the potential for citizen involvement in process innovation for public administration.

Governments across the world are increasingly using crowdsourcing for knowledge discovery and civic engagement. Iceland crowdsourced its constitutional reform process in 2011, and Finland has crowdsourced several law reform processes, including one addressing its off-road traffic laws. The Finnish government allowed citizens to go on an online forum to discuss problems and possible resolutions regarding some off-road traffic laws. The crowdsourced information and resolutions would then be passed on to legislators to refer to when making decisions, letting citizens contribute more directly to public policy. The City of Palo Alto is crowdsourcing people's feedback for its Comprehensive City Plan update in a process that started in 2015. The House of Representatives in Brazil has used crowdsourcing in policy reforms, and federal agencies in the United States have used crowdsourcing for several years.

In seismology

The European-Mediterranean Seismological Centre (EMSC) has developed a seismic detection system by monitoring traffic peaks on its website and by analyzing keywords used on Twitter.
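
The underlying idea is that people who have just felt an earthquake converge on the site within minutes, so a felt event shows up as an abrupt traffic spike. EMSC's actual pipeline is not described here; the following is a minimal sketch, assuming a simple rolling-baseline threshold on requests per minute:

    from collections import deque

    def detect_spikes(rates, window=60, factor=5.0):
        """Flag minutes whose hit rate exceeds `factor` times the rolling mean.

        rates: iterable of requests-per-minute counts.
        Returns indices of suspected felt-earthquake spikes.
        """
        history = deque(maxlen=window)
        spikes = []
        for i, r in enumerate(rates):
            if len(history) == window:
                baseline = sum(history) / window
                if baseline > 0 and r > factor * baseline:
                    spikes.append(i)
            history.append(r)
        return spikes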

In libraries

Crowdsourcing is used in libraries for OCR corrections on digitized texts, for tagging, and for funding, especially in the absence of financial and human means. Volunteers can contribute explicitly, with conscious effort, or implicitly, without knowing it, by turning the text on raw newspaper images into human-corrected digital form.

Modern methods

Crowdsourcing has now moved mainly to the Internet, which provides a particularly beneficial venue for crowdsourcing since individuals tend to be more open in web-based projects, where they are not being physically judged or scrutinized and thus can feel more comfortable sharing. This approach ultimately allows for well-designed artistic projects because individuals are less conscious, or maybe even less aware, of scrutiny towards their work. In an online atmosphere, more attention can be given to the specific needs of a project, rather than spending as much time in communication with other individuals.

According to a definition by Henk van Ess:
"The crowdsourced problem can be huge (epic tasks like finding alien life or mapping earthquake zones) or very small ('where can I skate safely?'). Some examples of successful crowdsourcing themes are problems that bug people, things that make people feel good about themselves, projects that tap into niche knowledge of proud experts, subjects that people find sympathetic or any form of injustice."
Crowdsourcing can take either an explicit or an implicit route. Explicit crowdsourcing lets users work together to evaluate, share, and build different specific tasks, while implicit crowdsourcing means that users solve a problem as a side effect of something else they are doing.

With explicit crowdsourcing, users can evaluate particular items like books or webpages, or share by posting products or items. Users can also build artifacts by providing information and editing other people's work. 

Implicit crowdsourcing can take two forms: standalone and piggyback. Standalone allows people to solve problems as a side effect of the task they are actually doing, whereas piggyback gathers information from users' activity on a third-party website.

In his 2013 book, Crowdsourcing, Daren C. Brabham puts forth a problem-based typology of crowdsourcing approaches:
  • Knowledge discovery and management is used for information management problems where an organization mobilizes a crowd to find and assemble information. It is ideal for creating collective resources.
  • Distributed human intelligence tasking is used for information management problems where an organization has a set of information in hand and mobilizes a crowd to process or analyze the information. It is ideal for processing large data sets that computers cannot easily handle.
  • Broadcast search is used for ideation problems where an organization mobilizes a crowd to come up with a solution to a problem that has an objective, provable right answer. It is ideal for scientific problem solving.
  • Peer-vetted creative production is used for ideation problems, where an organization mobilizes a crowd to come up with a solution to a problem which has an answer that is subjective or dependent on public support. It is ideal for design, aesthetic, or policy problems.
Crowdsourcing often allows participants to rank each other's contributions, e.g. in answer to the question "What is one thing we can do to make Acme a great company?" One common method for ranking is "like" counting, where the contribution with the most likes ranks first. This method is simple and easy to understand, but it privileges early contributions, which have more time to accumulate likes. In recent years several crowdsourcing companies have begun to use pairwise comparisons, backed by ranking algorithms. Ranking algorithms do not penalize late contributions. They also produce results faster. Ranking algorithms have proven to be at least 10 times faster than manual stack ranking. One drawback, however, is that ranking algorithms are more difficult to understand than like counting. 
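
As an illustration of how pairwise votes become a ranking, here is a minimal sketch using an Elo-style rating update, one standard choice among ranking algorithms rather than the method of any particular company. Because a contribution's rating depends only on the outcomes of its comparisons, not on how long it has been collecting votes, late entries are not penalized:

    def elo_rank(contributions, votes, k=32):
        """Rank contributions from pairwise "which is better?" votes.

        contributions: list of contribution ids.
        votes: list of (winner_id, loser_id) pairs.
        Returns ids sorted from highest to lowest rating.
        """
        rating = {c: 1000.0 for c in contributions}
        for winner, loser in votes:
            # Expected win probability for the current winner, per the Elo formula.
            expected_win = 1.0 / (1 + 10 ** ((rating[loser] - rating[winner]) / 400))
            rating[winner] += k * (1 - expected_win)
            rating[loser] -= k * (1 - expected_win)
        return sorted(contributions, key=rating.get, reverse=True)

    # elo_rank(["a", "b", "c"], [("a", "b"), ("c", "a"), ("c", "b")])
    # -> ["c", "a", "b"]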

In "How to Manage Crowdsourcing Platforms Effectively," Ivo Blohm states that there are four types of Crowdsourcing Platforms: Microtasking, Information Pooling, Broadcast Search, and Open Collaboration. They differ in the diversity and aggregation of contributions that are created. The diversity of information collected can either be homogenous or heterogenous. The aggregation of information can either be selective or integrative.

Examples

Some common categories of crowdsourcing can be used effectively in the commercial world, including crowdvoting, crowdsolving, crowdfunding, microwork, creative crowdsourcing, crowdsource workforce management, and inducement prize contests. Although this may not be an exhaustive list, the items cover the current major ways in which people use crowds to perform tasks.

Crowdvoting

Crowdvoting occurs when a website gathers a large group's opinions and judgments on a certain topic. The Iowa Electronic Market is a prediction market that gathers crowds' views on politics and tries to ensure accuracy by having participants pay money to buy and sell contracts based on political outcomes.
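
The IEM itself runs as a real-money double auction, but the link between trading and crowd belief is easiest to see in an automated market maker such as Hanson's logarithmic market scoring rule (LMSR), sketched here for a two-outcome contract purely as an illustration (the IEM does not use LMSR):

    import math

    def lmsr_price(q_yes, q_no, b=100.0):
        """Instantaneous price of the YES contract under LMSR.

        q_yes, q_no: outstanding shares of each outcome; b: liquidity parameter.
        The price is interpretable as the crowd's implied probability of YES.
        """
        e_yes = math.exp(q_yes / b)
        e_no = math.exp(q_no / b)
        return e_yes / (e_yes + e_no)

    def lmsr_cost(q_yes, q_no, b=100.0):
        """Market maker's cost function; a trade's price is the cost difference."""
        return b * math.log(math.exp(q_yes / b) + math.exp(q_no / b))

    # Buying 10 YES shares when 50 YES / 50 NO are outstanding:
    # cost = lmsr_cost(60, 50) - lmsr_cost(50, 50)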

Some of the most famous examples have made use of social media channels: Domino's Pizza, Coca-Cola, Heineken, and Sam Adams have thus crowdsourced a new pizza, bottle design, beer, and song, respectively. Threadless.com selects the T-shirts it sells by having users provide designs and vote on the ones they like, which are then printed and available for purchase.

The California Report Card (CRC), a program jointly launched in January 2014 by the Center for Information Technology Research in the Interest of Society and Lt. Governor Gavin Newsom, is an example of modern-day crowd voting. Participants access the CRC online and vote on six timely issues. Through principal component analysis, the users are then placed into an online "café" in which they can present their own political opinions and grade the suggestions of other participants. This system aims to effectively involve the greater public in relevant political discussions and highlight the specific topics with which Californians are most concerned.
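
The CRC's exact analysis pipeline is not detailed here, but the placement step can be sketched: if each participant is represented by a vector of issue grades, principal component analysis projects those vectors onto the two directions of greatest variation, so participants with similar opinion profiles land near each other in the café. A minimal sketch with illustrative names:

    import numpy as np

    def place_participants(grades):
        """Project participants' issue grades onto their first two principal components.

        grades: (n_participants, n_issues) array, e.g. six issue scores per person.
        Returns (n_participants, 2) coordinates for grouping similar voters.
        """
        centered = grades - grades.mean(axis=0)
        # SVD of the centered data gives the principal axes directly.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        return centered @ vt[:2].T

    # Two issues, four participants:
    # coords = place_participants(np.array([[5., 1.], [4., 2.], [1., 5.], [2., 4.]]))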

Crowdvoting's value in the movie industry was shown in 2009, when a crowd accurately predicted the success or failure of a movie based on its trailer, a feat that was replicated in 2013 by Google.

On Reddit, users collectively rate web content, discussions, and comments, as well as questions posed to persons of interest in "AMA" and AskScience online interviews.

In 2017, Project Fanchise purchased a team in the Indoor Football League and created the Salt Lake Screaming Eagles, a fan-run team. Using a mobile app, the fans voted on the day-to-day operations of the team, the mascot name, player signings, and even the offensive play-calling during games.

Crowdsourcing creative work

Creative crowdsourcing spans sourcing creative projects such as graphic design, crowdsourcing architecture, apparel design, movies, writing, company naming, illustration, etc. While crowdsourcing competitions have been used for decades in some creative fields (such as architecture), creative crowdsourcing has proliferated with the recent development of web-based platforms where clients can solicit a wide variety of creative work at lower cost than by traditional means.

Crowdsourcing in software development

A crowdsourcing approach to software development is used by a number of companies. Notable examples are Topcoder and its parent company Wipro.

Crowdsourcing language-related data collection

Crowdsourcing has also been used for gathering language-related data. For dictionary work, as mentioned above, it was applied over a hundred years ago by the Oxford English Dictionary editors, using paper and postage. Much later, a call for collecting examples of proverbs on a specific topic (religious pluralism) was printed in a journal. Today, since "crowdsourcing" carries the inherent connotation of being web-based, such language-related data gathering is being conducted on the web at an accelerating pace. Currently, a number of dictionary compilation projects are being conducted on the web, particularly for languages that are not highly academically documented, such as the Oromo language. Software programs have been developed for crowdsourced dictionaries, such as WeSay. A slightly different form of crowdsourcing for language data has been the online creation of scientific and mathematical terminology for American Sign Language. Proverb collection is also being done via crowdsourcing on the Web, most innovatively for the Pashto language of Afghanistan and Pakistan. Crowdsourcing has also been used extensively to collect gold-standard data for building automatic systems in natural language processing (e.g. named entity recognition, entity linking).

Crowdsolving

Crowdsolving is a collaborative, yet holistic, way of solving a problem using many people, communities, groups, or resources. It is a type of crowdsourcing focused on complex and intellectually demanding problems that require considerable effort and quality or uniqueness of contribution.

Crowdsearching

Chicago-based startup Crowdfind, formerly "crowdfynd", uses a version of crowdsourcing best termed crowdsearching, which differs from microwork in that no payment is made for taking part in the search. Its platform, through geographic location anchoring, builds a virtual search party of smartphone and Internet users to find and return lost items, pets, or persons.

TrackR uses a system it calls "crowd GPS", which loads Bluetooth identities to a central server to track lost or stolen items.

Crowdfunding

Crowdfunding is the process of funding projects by a multitude of people contributing a small amount to attain a certain monetary goal, typically via the Internet. Crowdfunding has been used for both commercial and charitable purposes. The crowdfunding model that has been around the longest is rewards-based crowdfunding, in which people can prepurchase products, buy experiences, or simply donate. While this funding may in some cases go towards helping a business, funders are not allowed to invest and become shareholders via rewards-based crowdfunding.

Individuals, businesses, and entrepreneurs can showcase their businesses and projects to the entire world by creating a profile, which typically includes a short video introducing the project, a list of rewards per donation, and illustrations through images. The goal is to create a compelling message towards which readers will be drawn. Funders make monetary contributions for numerous reasons:
  1. They connect to the greater purpose of the campaign, such as being a part of an entrepreneurial community and supporting an innovative idea or product.
  2. They connect to a physical aspect of the campaign like rewards and gains from investment.
  3. They connect to the creative display of the campaign's presentation.
  4. They want to see new products before the public.
The dilemma for equity crowdfunding in the US as of 2012 was how the Securities and Exchange Commission (SEC) was going to regulate the entire process. At the time, rules and regulations were being refined by the SEC, which had until January 1, 2013, to tweak the fundraising methods. The regulators were overwhelmed trying to regulate Dodd-Frank and all the other rules and regulations involving public companies and the way they trade. Advocates of regulation claimed that crowdfunding would open up the floodgates for fraud, called it the "wild west" of fundraising, and compared it to the 1980s days of penny stock "cold-call cowboys". The process allows for up to $1 million to be raised without some of the regulations being involved. Companies under the then-current proposal would have exemptions available and be able to raise capital from a larger pool of persons, which can include lower thresholds for investor criteria, whereas the old rules required that the person be an "accredited" investor. These people are often recruited from social networks, where the funds can be acquired through an equity purchase, loan, donation, or pre-order. The amounts collected have become quite high, with some requests exceeding a million dollars, as with Trampoline Systems, which used crowdfunding to finance the commercialization of its new software.

Mobile crowdsourcing

Mobile crowdsourcing involves activities that take place on smartphones or mobile platforms and that are frequently characterized by GPS technology. This allows for real-time data gathering and gives projects greater reach and accessibility. However, mobile crowdsourcing can lead to an urban bias, as well as safety and privacy concerns.

Macrowork

Macrowork tasks typically have these characteristics: they can be done independently, they take a fixed amount of time, and they require special skills. Macrotasks could be part of specialized projects or could be part of a large, visible project where workers pitch in wherever they have the required skills. The key distinguishing factors are that macrowork requires specialized skills and typically takes longer, while microwork requires no specialized skills.

Microwork

Microwork is a form of crowdsourcing in which users complete small tasks for which computers lack aptitude, in exchange for small amounts of money. Amazon's popular Mechanical Turk has created many different projects for users to participate in, where each task requires very little time and offers a very small payment. The Chinese versions of this, commonly called Witkey, are similar and include sites such as Taskcn.com and k68.cn. When choosing tasks, since only certain users "win", users learn to submit later and pick less popular tasks to increase the likelihood of getting their work chosen. An example of a Mechanical Turk project is when users searched satellite images for a boat to find lost researcher Jim Gray. Based on an elaborate survey of participants in a microtask crowdsourcing platform, Gadiraju et al. have proposed a taxonomy of the different types of microtasks that are crowdsourced. Two important questions in microtask crowdsourcing are dynamic task allocation and answer aggregation.
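
Answer aggregation in its simplest form is a majority vote over the redundant answers collected for each task; more elaborate schemes (weighting workers, modeling task difficulty) build on the same structure. A minimal sketch:

    from collections import Counter

    def aggregate(labels_by_task):
        """Majority-vote aggregation of redundant microtask answers.

        labels_by_task: dict mapping task id -> list of worker answers.
        Returns dict mapping task id -> consensus answer (ties broken arbitrarily).
        """
        return {task: Counter(answers).most_common(1)[0][0]
                for task, answers in labels_by_task.items()}

    # aggregate({"img_17": ["cat", "cat", "dog"]}) -> {"img_17": "cat"}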

Simple projects

Simple projects are those that require a large amount of time and skills compared to micro and macrowork. While an example of macrowork would be writing survey feedback, simple projects rather include activities like writing a basic line of code or programming a database, which both require a larger time commitment and skill level. These projects are usually not found on sites like Amazon Mechanical Turk, and are rather posted on platforms like Upwork that call for a specific expertise.

Complex projects

Complex projects generally take the most time, have higher stakes, and call for people with very specific skills. These are generally “one-off” projects that are difficult to accomplish and can include projects like designing a new product that a company hopes to patent. Tasks like that would be “complex” because design is a meticulous process that requires a large amount of time to perfect, and also people doing these projects must have specialized training in design to effectively complete the project. These projects usually pay the highest, yet are rarely offered.

Inducement prize contests

Web-based idea competitions or inducement prize contests often consist of generic ideas, cash prizes, and an Internet-based platform to facilitate easy idea generation and discussion. An example of these competitions is IBM's 2006 "Innovation Jam", attended by over 140,000 international participants and yielding around 46,000 ideas. Another example is the Netflix Prize in 2009. The idea was to ask the crowd to come up with a recommendation algorithm more accurate than Netflix's own. It had a grand prize of US$1,000,000, which was given to the BellKor's Pragmatic Chaos team, which bested Netflix's own algorithm for predicting ratings by 10.06%.
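
The winning entry blended hundreds of models, but latent-factor matrix factorization trained with stochastic gradient descent was a central ingredient of that family of recommenders. A minimal single-model sketch (the hyperparameters and structure here are illustrative, not the prize-winning system):

    import numpy as np

    def factorize(ratings, n_users, n_items, k=20, lr=0.01, reg=0.05, epochs=20):
        """Learn user/item latent factors from (user, item, rating) triples via SGD."""
        rng = np.random.default_rng(0)
        P = rng.normal(scale=0.1, size=(n_users, k))  # user factors
        Q = rng.normal(scale=0.1, size=(n_items, k))  # item factors
        for _ in range(epochs):
            for u, i, r in ratings:
                err = r - P[u] @ Q[i]  # prediction error on this observation
                # Regularized gradient step; tuple assignment keeps updates simultaneous.
                P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                              Q[i] + lr * (err * P[u] - reg * Q[i]))
        return P, Q

    # P, Q = factorize([(0, 0, 5.0), (0, 2, 1.0), (1, 0, 4.0)], n_users=2, n_items=3)
    # Predicted rating for user 1 on item 2: P[1] @ Q[2]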

Another example of competition-based crowdsourcing is the 2009 DARPA balloon experiment, in which DARPA placed 10 balloon markers across the United States and challenged teams to compete to be the first to report the location of all the balloons. Collaboration was required to complete the challenge quickly, and in addition to the competitive motivation of the contest as a whole, the winning team (MIT, in less than nine hours) established its own "collaborapetitive" environment to generate participation in the team. A similar challenge was the Tag Challenge, funded by the US State Department, which required locating and photographing individuals in five cities in the US and Europe within 12 hours based only on a single photograph. The winning team managed to locate three suspects by mobilizing volunteers worldwide using an incentive scheme similar to the one used in the balloon challenge.
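
As widely reported, MIT's incentive scheme split each balloon's $4,000 award geometrically up the recruitment chain: $2,000 to the finder, $1,000 to whoever recruited the finder, $500 to that person's recruiter, and so on, with the undistributed remainder going to charity. A short sketch of the payout logic:

    def payouts(chain, award=4000.0):
        """Geometric payouts up a recruitment chain, finder first.

        chain: list of participant ids; chain[0] found the balloon,
        chain[1] recruited chain[0], and so on. Each level gets half
        the previous one; the remainder (at most half the award) went to charity.
        """
        share = award / 2
        result = {}
        for person in chain:
            result[person] = share
            share /= 2
        return result

    # payouts(["finder", "recruiter", "grand_recruiter"])
    # -> {"finder": 2000.0, "recruiter": 1000.0, "grand_recruiter": 500.0}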

Open innovation platforms are a very effective way of crowdsourcing people's thoughts and ideas for research and development. The company InnoCentive is a crowdsourcing platform for corporate research and development where difficult scientific problems are posted for crowds of solvers to discover the answer and win a cash prize, which can range from $10,000 to $100,000 per challenge. InnoCentive, of Waltham, MA, and London, England, provides access to millions of scientific and technical experts from around the world. The company claims a success rate of 50% in providing successful solutions to previously unsolved scientific and technical problems. IdeaConnection.com challenges people to come up with new inventions and innovations, and Ninesigma.com connects clients with experts in various fields. The X Prize Foundation creates and runs incentive competitions offering between $1 million and $30 million for solving challenges. Local Motors is another example of crowdsourcing: a community of 20,000 automotive engineers, designers, and enthusiasts competes to build off-road rally trucks.

Implicit crowdsourcing

Implicit crowdsourcing is less obvious because users do not necessarily know they are contributing, yet can still be very effective in completing certain tasks. Rather than users actively participating in solving a problem or providing information, implicit crowdsourcing involves users doing another task entirely where a third party gains information for another topic based on the user's actions.

A good example of implicit crowdsourcing is the ESP game, in which users guess what images contain, and those labels are then used to tag Google images. Another popular use of implicit crowdsourcing is reCAPTCHA, which asks people to solve CAPTCHAs to prove they are human, and then serves CAPTCHAs from old books that cannot be deciphered by computers, in order to digitize them for the web. Like many tasks solved using the Mechanical Turk, CAPTCHAs are simple for humans but often very difficult for computers.
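
The ESP game's core rule is agreement between strangers: a label counts only when two independently paired players type it for the same image, and "taboo" words from past rounds force the vocabulary to grow. A minimal sketch of the matching step:

    def match_labels(guesses_a, guesses_b, taboo=()):
        """Return the first label both players typed that is not on the taboo list.

        guesses_a, guesses_b: labels entered by two independently paired players.
        taboo: lowercase labels already agreed on in past games, which force new tags.
        """
        seen = set(g.lower() for g in guesses_a) - set(taboo)
        for g in guesses_b:
            if g.lower() in seen:
                return g.lower()
        return None  # no agreement; the image is re-queued for another pair

    # match_labels(["dog", "puppy"], ["animal", "puppy"]) -> "puppy"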

Piggyback crowdsourcing is seen most frequently on websites such as Google, which data-mine users' search histories and browsing behavior to discover keywords for ads, spelling corrections, and synonyms. In this way, users are unintentionally helping to modify existing systems, such as Google's AdWords.

Health-care crowdsourcing

Research has emerged that outlines the use of crowdsourcing techniques in the public health domain. The collective intelligence outcomes from crowdsourcing are being generated in three broad categories of public health care: health promotion, health research, and health maintenance. Crowdsourcing also enables researchers to move from small homogeneous groups of participants to large heterogeneous groups, beyond convenience samples such as students or highly educated people. The SESH group focuses on using crowdsourcing to improve health.

Crowdsourcing in agriculture

Crowdsourced research also reaches into the field of agriculture, mainly to help farmers and experts identify the different types of weeds growing in fields and the best ways to remove them.

Crowdsourcing in cheating in bridge

Boye Brogeland initiated a crowdsourcing investigation of cheating by top-level bridge players that showed several players were guilty, which led to their suspension.

Crowdsifting

Crowdsifting (crowd-sifting) is a form of crowdsourcing by which self-selected participants with specialized disciplinary knowledge, skills and interests examine a specific topic at large scale.

Crowdshipping

Crowdshipping (crowd-shipping) is a peer-to-peer shipping service, usually conducted via an online platform or marketplace. Several methods have been categorized as crowd-shipping: a) travelers heading in the direction of the buyer who are willing to bring the package as part of their luggage for a reward; b) truck drivers whose routes lie along the buyer's location and who are willing to take extra items in their truck; c) community-based platforms that connect international buyers and local forwarders, allowing buyers to use the forwarder's address as the purchase destination, after which the forwarder ships the items on to the buyer.

Crowdsourcers

A number of motivations exist for businesses to use crowdsourcing to accomplish their tasks, find solutions for problems, or to gather information. These include the ability to offload peak demand, access cheap labor and information, generate better results, access a wider array of talent than might be present in one organization, and undertake problems that would have been too difficult to solve internally. Crowdsourcing allows businesses to submit problems on which contributors can work, on topics such as science, manufacturing, biotech, and medicine, with monetary rewards for successful solutions. Although crowdsourcing complicated tasks can be difficult, simple work tasks can be crowdsourced cheaply and effectively.

Crowdsourcing also has the potential to be a problem-solving mechanism for government and nonprofit use. Urban and transit planning are prime areas for crowdsourcing. One project to test crowdsourcing's public participation process for transit planning in Salt Lake City was carried out from 2008 to 2009, funded by a U.S. Federal Transit Administration grant. Another notable application of crowdsourcing to government problem solving is the Peer to Patent Community Patent Review project for the U.S. Patent and Trademark Office.

Researchers have used crowdsourcing systems like the Mechanical Turk to aid their research projects by crowdsourcing some aspects of the research process, such as data collection, parsing, and evaluation. Notable examples include using the crowd to create speech and language databases, and using the crowd to conduct user studies. Crowdsourcing systems provide these researchers with the ability to gather large amounts of data. Additionally, using crowdsourcing, researchers can collect data from populations and demographics they may not have had access to locally, but that improve the validity and value of their work.

Artists have also used crowdsourcing systems. In his project called the Sheep Market, Aaron Koblin used Mechanical Turk to collect 10,000 drawings of sheep from contributors around the world. The artist Sam Brown leverages the crowd by asking visitors of his website explodingdog to send him sentences that he uses as inspirations for paintings. Art curator Andrea Grover argues that individuals tend to be more open in crowdsourced projects because they are not being physically judged or scrutinized. As with other crowdsourcers, artists use crowdsourcing systems to generate and collect data. The crowd also can be used to provide inspiration and to collect financial support for an artist's work.

Additionally, crowdsourcing from 100 million drivers is being used by INRIX to collect users' driving times to provide better GPS routing and real-time traffic updates.

Demographics

The crowd is an umbrella term for the people who contribute to crowdsourcing efforts. Though it is sometimes difficult to gather data about the demographics of the crowd, a study by Ross et al. surveyed the demographics of a sample of the more than 400,000 registered crowdworkers using Amazon Mechanical Turk to complete tasks for pay. A previous study in 2008 by Ipeirotis found that users at that time were primarily American, young, female, and well-educated, with 40% earning more than $40,000 per year. In November 2009, Ross found a very different Mechanical Turk population, 36% of which was Indian. Two-thirds of Indian workers were male, and 66% had at least a bachelor's degree. Two-thirds had annual incomes less than $10,000, with 27% sometimes or always depending on income from Mechanical Turk to make ends meet.

The average US user of Mechanical Turk earned $2.30 per hour for tasks in 2009, versus $1.58 for the average Indian worker; the former is less than minimum wage in the United States (though the latter is not in India), which Ross suggests raises ethical questions for researchers who use crowdsourcing. While the majority of users worked less than five hours per week, 18% worked 15 hours per week or more.

The demographics of Microworkers.com differ from Mechanical Turk in that the US and India together account for only 25% of workers; 197 countries are represented among users, with Indonesia (18%) and Bangladesh (17%) contributing the largest share. However, 28% of employers are from the US.

Another study of the demographics of the crowd at iStockphoto found a crowd that was largely white, middle- to upper-class, and highly educated, worked in a so-called "white-collar job", and had a high-speed Internet connection at home. In a 30-day crowdsourcing diary study in Europe, the participants were predominantly highly educated women.

Studies have also found that crowds are not simply collections of amateurs or hobbyists. Rather, crowds are often professionally trained in a discipline relevant to a given crowdsourcing task and sometimes hold advanced degrees and many years of experience in the profession. Claiming that crowds are amateurs, rather than professionals, is factually untrue and may lead to marginalization of crowd labor rights.

G. D. Saxton et al. (2013) studied the role of community users, among other elements, in a content analysis of 103 crowdsourcing organizations. Saxton et al. developed a taxonomy of nine crowdsourcing models (intermediary model, citizen media production, collaborative software development, digital goods sales, product design, peer-to-peer social financing, consumer report model, knowledge base building model, and collaborative science project model) in which to categorize the roles of community users, such as researcher, engineer, programmer, journalist, graphic designer, etc., and the products and services developed.

Motivations

Contributors

Many scholars of crowdsourcing suggest that both intrinsic and extrinsic motivations cause people to contribute to crowdsourced tasks and these factors influence different types of contributors. For example, students and people employed full-time rate human capital advancement as less important than part-time workers do, while women rate social contact as more important than men do.

Intrinsic motivations are broken down into two categories: enjoyment-based and community-based motivations. Enjoyment-based motivations refer to motivations related to the fun and enjoyment that contributors experience through their participation. These motivations include: skill variety, task identity, task autonomy, direct feedback from the job, and pastime. Community-based motivations refer to motivations related to community participation, and include community identification and social contact. In crowdsourced journalism, the motivation factors are intrinsic: the crowd is driven by a possibility to make social impact, contribute to social change and help their peers.

Extrinsic motivations are broken down into three categories: immediate payoffs, delayed payoffs, and social motivations. Immediate payoffs, through monetary payment, are the immediately received compensations given to those who complete tasks. Delayed payoffs are benefits that can be used to generate future advantages, such as training skills and being noticed by potential employers. Social motivations are the rewards of behaving pro-socially, such as the altruistic motivations of online volunteers. Chandler and Kapelner found that US users of the Amazon Mechanical Turk were more likely to complete a task when told they were going to “help researchers identify tumor cells” than when they were not told the purpose of their task. However, of those who completed the task, quality of output did not depend on the framing of the task.

Motivation factors in crowdsourcing are often a mix of intrinsic and extrinsic factors. In a crowdsourced law-making project, the crowd was motivated by a mix of intrinsic and extrinsic factors. Intrinsic motivations included fulfilling civic duty, affecting the law for sociotropic reasons, and deliberating with and learning from peers. Extrinsic motivations included changing the law for financial gain or other benefits. Participation in crowdsourced policy-making was an act of grassroots advocacy, whether to pursue one's own interest or more altruistic goals, such as protecting nature.

Another form of social motivation is prestige or status. The International Children's Digital Library recruits volunteers to translate and review books. Because all translators receive public acknowledgment for their contributions, Kaufman and Schulz cite this as a reputation-based strategy to motivate individuals who want to be associated with institutions that have prestige. The Mechanical Turk uses reputation as a motivator in a different sense, as a form of quality control. Crowdworkers who frequently complete tasks in ways judged to be inadequate can be denied access to future tasks, providing motivation to produce high-quality work.

Requesters

Using crowdsourcing through means such as Amazon Mechanical Turk can provide researchers and requesters with an already established infrastructure for their projects, allowing them to easily use a crowd and access participants from diverse cultural backgrounds. Crowdsourcing can also help complete work for projects that would normally face geographical and population-size limitations.

Participation in crowdsourcing

Despite the potential global reach of IT applications online, recent research illustrates that differences in location affect participation outcomes in IT-mediated crowds.

Limitations and controversies

At least six major topics cover the limitations and controversies about crowdsourcing:
  1. Impact of crowdsourcing on product quality
  2. Entrepreneurs contribute less capital themselves
  3. Increased number of funded ideas
  4. The value and impact of the work received from the crowd
  5. The ethical implications of low wages paid to crowdworkers
  6. Trustworthiness and informed decision making

Impact of crowdsourcing on product quality

Crowdsourcing allows anyone to participate, allowing for many unqualified participants and resulting in large quantities of unusable contributions. Companies, or additional crowdworkers, then have to sort through all of these low-quality contributions. The task of sorting through crowdworkers' contributions, along with the necessary job of managing the crowd, requires companies to hire actual employees, thereby increasing management overhead. Crowdsourced work is also susceptible to faulty results caused by targeted, malicious work efforts. Since crowdworkers completing microtasks are paid per task, the financial incentive often causes workers to complete tasks quickly rather than well. Verifying responses is time-consuming, so requesters often depend on having multiple workers complete the same task to correct errors. However, having each task completed multiple times increases time and monetary costs.
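
The accuracy-versus-cost tradeoff of this redundancy is easy to quantify. Assuming (idealistically) that each worker answers correctly and independently with probability p, the chance that a majority of n workers is correct follows a binomial tail, and it grows only slowly while cost grows linearly:

    from math import comb

    def majority_correct(p, n):
        """Probability that a strict majority of n workers (n odd) answers correctly,
        assuming each is independently correct with probability p."""
        return sum(comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(n // 2 + 1, n + 1))

    # With 70%-accurate workers: 1 worker -> 0.70, 3 -> 0.78, 5 -> 0.84, 9 -> 0.90.
    # Accuracy improves slowly while the per-task cost grows linearly with n.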

Crowdsourcing quality is also affected by task design. Lukyanenko et al. argue that the prevailing practice of modeling crowdsourcing data-collection tasks in terms of fixed classes (options) unnecessarily restricts quality. Their results demonstrate that information accuracy depends on the classes used to model domains, with participants providing more accurate information when classifying phenomena at a more general level (which is typically less useful to sponsor organizations, and hence less common). Further, greater overall accuracy is expected when participants can provide free-form data, compared to tasks in which they select from constrained choices.

Just as limiting, there is often not enough skill or expertise in the crowd to successfully accomplish the desired task. While this scenario does not affect "simple" tasks such as image labeling, it is particularly problematic for more complex tasks, such as engineering design or product validation. In these cases, it may be difficult or even impossible to find the qualified people in the crowd, as their voices may be drowned out by consistent, but incorrect, crowd members. However, if the task is even of "intermediate" difficulty, estimating crowdworkers' skills and intentions and leveraging them to infer true responses works well, albeit with an additional computation cost.
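
One way to do that estimation, sketched loosely here in the spirit of Dawid-Skene-style methods (not any specific published algorithm), is to alternate between taking a skill-weighted vote for each task and re-scoring each worker by agreement with the current consensus:

    from collections import defaultdict

    def weighted_consensus(answers, iterations=10):
        """Jointly estimate true answers and worker skills by alternating updates.

        answers: list of (worker, task, label) triples.
        Returns (consensus dict, skill dict). A loose EM-style sketch,
        not the full Dawid-Skene model.
        """
        skill = defaultdict(lambda: 1.0)  # start with uniform weights
        consensus = {}
        for _ in range(iterations):
            # Weighted vote per task, using current skill estimates.
            scores = defaultdict(lambda: defaultdict(float))
            for worker, task, label in answers:
                scores[task][label] += skill[worker]
            consensus = {t: max(s, key=s.get) for t, s in scores.items()}
            # Re-estimate skill as each worker's agreement rate with the consensus.
            agree, total = defaultdict(int), defaultdict(int)
            for worker, task, label in answers:
                total[worker] += 1
                agree[worker] += (label == consensus[task])
            skill = {w: agree[w] / total[w] for w in total}
        return consensus, skill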

Crowdworkers are a nonrandom sample of the population. Many researchers use crowdsourcing to quickly and cheaply conduct studies with larger sample sizes than would otherwise be achievable. However, due to limited access to the Internet, participation in less developed countries is relatively low. Participation in highly developed countries is similarly low, largely because the low pay is not a strong motivation for most users in these countries. These factors bias the population pool towards users in moderately developed countries, as measured by the Human Development Index.

The likelihood that a crowdsourced project will fail due to lack of monetary motivation or too few participants increases over the course of the project. Crowdsourcing markets are not a first-in, first-out queue. Tasks that are not completed quickly may be forgotten, buried by filters and search procedures so that workers do not see them. This results in a long-tail power law distribution of completion times. Additionally, low-paying research studies online have higher rates of attrition, with participants not completing the study once started. Even when tasks are completed, crowdsourcing does not always produce quality results. When Facebook began its localization program in 2008, it encountered some criticism for the low quality of its crowdsourced translations.

One of the problems of crowdsourcing products is the lack of interaction between the crowd and the client. Usually little information is known about the final desired product, and often very limited interaction with the final client occurs. This can decrease the quality of product because client interaction is a vital part of the design process.

An additional cause of the decrease in product quality that can result from crowdsourcing is the lack of collaboration tools. In a typical workplace, coworkers are organized in such a way that they can work together and build upon each other's knowledge and ideas. Furthermore, the company often provides employees with the necessary information, procedures, and tools to fulfill their responsibilities. However, in crowdsourcing, crowdworkers are left to depend on their own knowledge and means to complete tasks.

A crowdsourced project is usually expected to be unbiased by incorporating a large population of participants with diverse backgrounds. However, much crowdsourced work is done by people who are paid or who directly benefit from the outcome (e.g. much of the open-source work on Linux). In many other cases, the end product is the outcome of a single person's endeavor, with that person creating the majority of the product while the crowd participates only in minor details.

Entrepreneurs contribute less capital themselves

To make an idea turn into a reality, the first component needed is capital. Depending on the scope and complexity of the crowdsourced project, the amount of necessary capital can range from a few thousand dollars to hundreds of thousands, if not more. The capital-raising process can take from days to months depending on different variables, including the entrepreneur's network and the amount of initial self-generated capital. 

The crowdsourcing process allows entrepreneurs to access a wide range of investors who can take different stakes in the project. In effect, crowdsourcing simplifies the capital-raising process and allows entrepreneurs to spend more time on the project itself and on reaching milestones, rather than dedicating time to getting it started. Overall, the simplified access to capital can save time in starting projects and potentially increase their efficiency.

Opponents argue that easier access to capital through a large number of smaller investors can hurt the project and its creators. With a simplified capital-raising process involving more investors with smaller stakes, investors are more risk-seeking because they can take on an investment size with which they are comfortable. This means entrepreneurs lose the experience of convincing investors who are wary of potential risks, because they no longer depend on one single investor for the survival of their project. Instead of entrepreneurs being forced to assess risks and convince large institutional investors why their project can be successful, wary investors can simply be replaced by others who are willing to take on the risk.

There are translation companies and several users of translations that use crowdsourcing as a pretext for drastically cutting costs instead of hiring professional translators. This practice has been systematically denounced by IAPTI and other translator organizations.

Increased number of funded ideas

The raw number of ideas that get funded, and the quality of those ideas, are a major point of controversy surrounding crowdsourcing.

Proponents argue that crowdsourcing is beneficial because it allows niche ideas that would not attract venture capital or angel funding (often the primary investors in startups) to get started. Many ideas are killed in their infancy due to insufficient support and lack of capital, but crowdsourcing allows such ideas to be started if an entrepreneur can find a community to take interest in the project.

Crowdsourcing allows those who would benefit from a project to fund it and become part of it, which is one way for small niche ideas to get started. However, as the raw number of projects grows, the number of possible failures can also increase. Crowdsourcing helps niche and high-risk projects start because of a perceived need from the select few who seek the product. With high risk and small target markets, the pool of crowdsourced projects faces a greater possible loss of capital, lower returns, and lower levels of success.

Concerns

Because crowdworkers are considered independent contractors rather than employees, they are not guaranteed minimum wage. In practice, workers using Amazon Mechanical Turk generally earn less than the minimum wage. In 2009, it was reported that United States Turk users earned an average of $2.30 per hour for tasks, while users in India earned an average of $1.58 per hour, which is below minimum wage in the United States (but not in India). Some researchers who have considered using Mechanical Turk to recruit participants for research studies have argued that the wage conditions might be unethical. However, according to other research, workers on Amazon Mechanical Turk do not feel they are exploited and are ready to participate in crowdsourcing activities in the future. When Facebook began its localization program in 2008, it received criticism for using free labor in crowdsourcing the translation of site guidelines.

Typically, no written contracts, nondisclosure agreements, or employee agreements are made with crowdworkers. For users of the Amazon Mechanical Turk, this means that requestors decide whether users' work is acceptable, and reserve the right to withhold pay if it does not meet their standards. Critics say that crowdsourcing arrangements exploit individuals in the crowd, and a call has been made for crowds to organize for their labor rights.

Collaboration between crowd members can also be difficult or even discouraged, especially in the context of competitive crowdsourcing. Crowdsourcing site InnoCentive allows organizations to solicit solutions to scientific and technological problems; only 10.6% of respondents report working in a team on their submission. Amazon Mechanical Turk workers collaborated with academics to create a platform, WeAreDynamo.org, that allows them to organize and create campaigns to better their work situation.

Irresponsible crowdsourcing

The popular forum website Reddit came under the spotlight in the first few days after the Boston Marathon bombing, which showed how powerful social media and crowdsourcing could be. Reddit helped many victims of the bombing: users sent relief, and some even opened up their homes, all coordinated very efficiently on the site. However, Reddit soon came under fire after users began to crowdsource information on the possible perpetrators of the bombing. While the FBI received thousands of photos from ordinary citizens, the website focused on crowdsourcing its own investigation. Eventually, Reddit members claimed to have found four bombers, but all were innocent, including a college student who had committed suicide a few days before the bombing. The problem was exacerbated when the media also started to rely on Reddit as a source of information, allowing the misinformation to spread almost nationwide. The FBI has since warned the media to be more careful about where they get their information, but Reddit's investigation and its false accusations raised questions about what should be crowdsourced and about the unintended consequences of irresponsible crowdsourcing.

Wiki

From Wikipedia, the free encyclopedia

Ward Cunningham, inventor of the wiki
 
A wiki is a website on which users collaboratively modify content and structure directly from the web browser. In a typical wiki, text is written using a simplified markup language and often edited with the help of a rich-text editor.

A wiki is run using wiki software, otherwise known as a wiki engine. A wiki engine is a type of content management system, but it differs from most other such systems, including blog software, in that the content is created without any defined owner or leader, and wikis have little inherent structure, allowing structure to emerge according to the needs of the users. There are dozens of different wiki engines in use, both standalone and part of other software, such as bug tracking systems. Some wiki engines are open source, whereas others are proprietary. Some permit control over different functions (levels of access); for example, editing rights may permit changing, adding, or removing material. Others may permit access without enforcing access control. Other rules may be imposed to organize content. 

The online encyclopedia project Wikipedia is the most popular wiki-based website, and is one of the most widely viewed sites in the world, having been ranked in the top ten since 2007. Wikipedia is not a single wiki but rather a collection of hundreds of wikis, with each one pertaining to a specific language. In addition to Wikipedia, there are tens of thousands of other wikis in use, both public and private, including wikis functioning as knowledge management resources, notetaking tools, community websites, and intranets. The English-language Wikipedia has the largest collection of articles; as of September 2016, it had over five million articles. Ward Cunningham, the developer of the first wiki software, WikiWikiWeb, originally described wiki as "the simplest online database that could possibly work". "Wiki" is a Hawaiian word meaning "quick".

Characteristics

Ward Cunningham and co-author Bo Leuf, in their book The Wiki Way: Quick Collaboration on the Web, described the essence of the Wiki concept as follows:
  • A wiki invites all users—not just experts—to edit any page or to create new pages within the wiki Web site, using only a standard "plain-vanilla" Web browser without any extra add-ons.
  • Wiki promotes meaningful topic associations between different pages by making page link creation intuitively easy and showing whether an intended target page exists or not.
  • A wiki is not a carefully crafted site created by experts and professional writers, and designed for casual visitors. Instead, it seeks to involve the typical visitor/user in an ongoing process of creation and collaboration that constantly changes the website landscape.
A wiki enables communities of editors and contributors to write documents collaboratively. All that people require to contribute is a computer, Internet access, a web browser, and a basic understanding of a simple markup language (e.g., HTML). A single page in a wiki website is referred to as a "wiki page", while the entire collection of pages, which are usually well-interconnected by hyperlinks, is "the wiki". A wiki is essentially a database for creating, browsing, and searching through information. A wiki allows non-linear, evolving, complex, and networked text, while also allowing editor argument, debate, and interaction regarding the content and formatting.

A defining characteristic of wiki technology is the ease with which pages can be created and updated. Generally, there is no review by a moderator or gatekeeper before modifications are accepted and published on the website. Many wikis are open to alteration by the general public without requiring registration of user accounts. Many edits can be made in real time and appear almost instantly online, though this feature also facilitates abuse of the system. Private wiki servers require user authentication to edit pages, and sometimes even to read them.

Maged N. Kamel Boulos, Cito Maramba, and Steve Wheeler write that open wikis produce a process of Social Darwinism: "'Unfit' sentences and sections are ruthlessly culled, edited, and replaced if they are not considered 'fit', which hopefully results in the evolution of a higher quality and more relevant page. While such openness may invite 'vandalism' and the posting of untrue information, this same openness also makes it possible to rapidly correct or restore a 'quality' wiki page."

Editing

Some wikis have an Edit button or link directly on the page being viewed, if the user has permission to edit the page. This can lead to a text-based editing page where participants can structure and format wiki pages with a simplified markup language, sometimes known as Wikitext, Wiki markup, or Wikicode (it can also lead to a WYSIWYG editing page; see below). For example, starting lines of text with asterisks could create a bulleted list. The style and syntax of wikitexts can vary greatly among wiki implementations, some of which also allow HTML tags.
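As a rough illustration of how such markup might be rendered, here is a minimal Python sketch (not the parser of any real wiki engine) that turns asterisk-prefixed lines into an HTML bulleted list:

```python
# A minimal sketch of one wikitext convention: lines starting with '*'
# become an HTML <ul> list. Real engines handle nesting, inline
# formatting, and escaping; this shows only the basic idea.
import re

def render_bullets(wikitext: str) -> str:
    html_lines, in_list = [], False
    for line in wikitext.splitlines():
        match = re.match(r"\*\s*(.*)", line)
        if match:
            if not in_list:
                html_lines.append("<ul>")
                in_list = True
            html_lines.append(f"  <li>{match.group(1)}</li>")
        else:
            if in_list:
                html_lines.append("</ul>")
                in_list = False
            html_lines.append(line)
    if in_list:
        html_lines.append("</ul>")
    return "\n".join(html_lines)

print(render_bullets("Shopping list:\n* apples\n* bread"))
```

The point of the example is how little syntax an author has to learn: one leading character per line, with the engine doing the rest.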

Wikis have favoured plain-text editing, with fewer and simpler conventions than HTML, for indicating style and structure. Although limiting access to HTML and Cascading Style Sheets (CSS) of wikis limits user ability to alter the structure and formatting of wiki content, there are some benefits. Limited access to CSS promotes consistency in the look and feel, and having JavaScript disabled prevents a user from implementing code that may limit other users' access.

Wikis can also make WYSIWYG editing available to users, usually by means of JavaScript control that translates graphically entered formatting instructions into the corresponding HTML tags or wikitext. In those implementations, the markup of a newly edited, marked-up version of the page is generated and submitted to the server transparently, shielding the user from this technical detail. An example of this is the VisualEditor on Wikipedia. WYSIWYG controls do not, however, always provide all of the features available in wikitext, and some users prefer not to use a WYSIWYG editor. Hence, many of these sites offer some means to edit the wikitext directly. 
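To make that translation step concrete, the following illustrative sketch performs the reverse mapping a WYSIWYG control might do, converting a few HTML tags back into MediaWiki-style wikitext. The tag list and function name are assumptions for illustration, not how VisualEditor actually works:

```python
# Hypothetical sketch of the HTML-to-wikitext serialization a WYSIWYG
# editor might perform before submitting to the server. The wikitext
# forms ('''bold''', ''italics'', == heading ==) are MediaWiki's.
import re

HTML_TO_WIKITEXT = [
    (re.compile(r"<b>(.*?)</b>"), r"'''\1'''"),    # bold
    (re.compile(r"<i>(.*?)</i>"), r"''\1''"),      # italics
    (re.compile(r"<h2>(.*?)</h2>"), r"== \1 =="),  # section heading
]

def html_to_wikitext(html: str) -> str:
    for pattern, replacement in HTML_TO_WIKITEXT:
        html = pattern.sub(replacement, html)
    return html

print(html_to_wikitext("<h2>History</h2><b>WikiWikiWeb</b> was the <i>first</i> wiki."))
```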

Some wikis keep a record of changes made to wiki pages; often, every version of the page is stored. This means that authors can revert to an older version of the page should it be necessary because a mistake has been made, such as the content accidentally being deleted or the page being vandalized to include offensive or malicious text or other inappropriate content.

Many wiki implementations, such as MediaWiki, allow users to supply an edit summary when they edit a page. This is a short piece of text summarizing the changes they have made (e.g., "Corrected grammar," or "Fixed formatting in table."). It is not inserted into the article's main text, but is stored along with that revision of the page, allowing users to explain what has been done and why, similar to a log message when making changes in a revision-control system. This enables other users to see which changes have been made by whom and why, often in a list of summaries, dates and other short, relevant content, a list which is called a "log" or "history."
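The two previous paragraphs boil down to a small data model: every edit appends a new revision carrying the text, an edit summary, the author, and a timestamp, so reverting is simply another edit that restores old text. This is a sketch under those assumptions, not MediaWiki's actual schema:

```python
# Hypothetical per-page revision history with edit summaries.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Revision:
    text: str
    summary: str          # e.g. "Corrected grammar"
    author: str
    timestamp: datetime

@dataclass
class WikiPage:
    title: str
    revisions: list = field(default_factory=list)

    def edit(self, text: str, summary: str, author: str) -> None:
        # Store every version rather than overwriting the page.
        self.revisions.append(
            Revision(text, summary, author, datetime.now(timezone.utc)))

    @property
    def current(self) -> str:
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, index: int, author: str) -> None:
        # Reverting is just another edit that restores old text.
        self.edit(self.revisions[index].text,
                  f"Reverted to revision {index}", author)

page = WikiPage("Cold weather cycling")
page.edit("Ride slowly on ice.", "Create page", "alice")
page.edit("Buy spray cans!!!", "vandalism", "troll")
page.revert_to(0, "bob")
print(page.current)  # -> "Ride slowly on ice."
```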

Navigation

Within the text of most pages, there are usually a large number of hypertext links to other pages within the wiki. This form of non-linear navigation is more "native" to a wiki than structured/formalized navigation schemes. Users can also create any number of index or table-of-contents pages, with hierarchical categorization or whatever form of organization they like. These may be challenging to maintain "by hand", as multiple authors and users may create and delete pages in an ad hoc, unorganized manner. Wikis can provide one or more ways to categorize or tag pages to support the maintenance of such index pages. Some wikis, including the original, have a backlink feature, which displays all pages that link to a given page. It is also typically possible in a wiki to create links to pages that do not yet exist, as a way to invite others to share what they know about a subject new to the wiki. Wiki users can typically "tag" pages with categories or keywords to make it easier for other users to find them; for example, a user creating a new article on cold weather cycling might tag the page under the categories of commuting, winter sports, and bicycling.
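A backlink feature can be implemented by inverting the link graph. The sketch below (the page names and data layout are invented for illustration) builds such an index from each page's outgoing links:

```python
# Sketch of a backlink index: given each page's outgoing links,
# invert the mapping so that for any page we can list every page
# that links to it.
from collections import defaultdict

links = {
    "CommutingTips": ["ColdWeatherCycling", "Bicycling"],
    "WinterSports": ["ColdWeatherCycling"],
    "ColdWeatherCycling": ["Bicycling"],
}

backlinks = defaultdict(set)
for page, targets in links.items():
    for target in targets:
        backlinks[target].add(page)

print(sorted(backlinks["ColdWeatherCycling"]))  # pages linking here
```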

Linking and creating pages

Links are created using a specific syntax, the so-called "link pattern". Originally, most wikis used CamelCase to name pages and create links. These are produced by capitalizing words in a phrase and removing the spaces between them (the word "CamelCase" is itself an example). While CamelCase makes linking easy, it also leads to links in a form that deviates from the standard spelling. To link to a page with a single-word title, one must abnormally capitalize one of the letters in the word (e.g. "WiKi" instead of "Wiki"). CamelCase-based wikis are instantly recognizable because they have many links with names such as "TableOfContents" and "BeginnerQuestions." It is possible for a wiki to render the visible anchor of such links "pretty" by reinserting spaces, and possibly also reverting to lower case. This reprocessing of the link to improve the readability of the anchor is, however, limited by the loss of capitalization information caused by CamelCase reversal. For example, "RichardWagner" should be rendered as "Richard Wagner", whereas "PopularMusic" should be rendered as "popular music". There is no easy way to determine which capital letters should remain capitalized. As a result, many wikis now have "free linking" using brackets, and some disable CamelCase by default.
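The following minimal sketch (the regular expressions and names are illustrative, not any engine's actual code) shows both halves of what this paragraph describes: recognizing CamelCase words as links, and the lossy prettifying step that reinserts spaces:

```python
# Sketch of the CamelCase "link pattern" and the lossy prettifying step.
import re

CAMELCASE = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")

def find_links(text: str) -> list:
    # Treat every CamelCase word as a link to a page of that name.
    return CAMELCASE.findall(text)

def prettify(page_name: str) -> str:
    # Reinsert spaces. Note the information loss the text describes:
    # we cannot tell that 'PopularMusic' should become lowercase
    # 'popular music' while 'RichardWagner' should keep its capitals.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", page_name)

print(find_links("See TableOfContents and BeginnerQuestions."))
print(prettify("RichardWagner"))  # -> "Richard Wagner"
print(prettify("PopularMusic"))   # -> "Popular Music", not "popular music"
```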

Searching

Most wikis offer at least a title search, and sometimes a full-text search. The scalability of the search depends on whether the wiki engine uses a database. Some wikis, such as PmWiki, use flat files. MediaWiki's first versions used flat files, but it was rewritten by Lee Daniel Crocker in the early 2000s to be a database application. Indexed database access is necessary for high-speed searches on large wikis. Alternatively, external search engines such as Google Search can sometimes be used on wikis with limited searching functions in order to obtain more precise results.
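The performance difference is easy to see in miniature. In this sketch (illustrative only; real engines' search is far more involved), a flat-file-style search rescans every page on every query, while an inverted index is built once and then answers each query with a dictionary lookup:

```python
# Flat-file scan vs. inverted index, in miniature.
import re
from collections import defaultdict

pages = {
    "WikiWikiWeb": "The first wiki, installed on c2.com in 1995.",
    "Wikipedia": "The most popular wiki-based website.",
}

# Flat-file style: scan every page on every query; cost grows with
# the total amount of text stored.
def scan_search(term):
    return [title for title, body in pages.items() if term in body.lower()]

# Database/index style: build the index once; each query is then a
# cheap dictionary lookup.
index = defaultdict(set)
for title, body in pages.items():
    for word in re.findall(r"\w+", body.lower()):
        index[word].add(title)

print(scan_search("wiki"))    # substring scan of every page
print(sorted(index["wiki"]))  # indexed whole-word lookup
```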

History

WikiWikiWeb was the first wiki. Ward Cunningham started developing WikiWikiWeb in Portland, Oregon, in 1994, and installed it on the Internet domain c2.com on March 25, 1995. It was named by Cunningham, who remembered a Honolulu International Airport counter employee telling him to take the "Wiki Wiki Shuttle" bus that runs between the airport's terminals. According to Cunningham, "I chose wiki-wiki as an alliterative substitute for 'quick' and thereby avoided naming this stuff quick-web."

Cunningham was, in part, inspired by Apple Inc.'s HyperCard, which he had used. HyperCard, however, was single-user. Apple had designed a system allowing users to create virtual "card stacks" supporting links among the various cards. Cunningham developed Vannevar Bush's ideas by allowing users to "comment on and change one another's text." Cunningham says his goals were to link together people's experiences to create a new literature to document programming patterns, and to harness people's natural desire to talk and tell stories with a technology that would feel comfortable to those not used to "authoring".

Wikipedia became the most famous wiki site, entering the top ten most popular websites in 2007. In the early 2000s, wikis were increasingly adopted in enterprise as collaborative software. Common uses included project communication, intranets, and documentation, initially for technical users. Some companies use wikis as their only collaborative software and as a replacement for static intranets, and some schools and universities use wikis to enhance group learning. There may be greater use of wikis behind firewalls than on the public Internet. On March 15, 2007, the word wiki was listed in the online Oxford English Dictionary.

Alternative definitions

In the late 1990s and early 2000s, the word "wiki" was used to refer to both user-editable websites and the software that powers them; the latter definition is still occasionally in use. Wiki inventor Ward Cunningham wrote in 2014 that the word "wiki" should not be used to refer to a single website, but rather to a mass of user-editable pages and/or sites, so that a single website is not "a wiki" but "an instance of wiki". He wrote that the concept of wiki federation, in which the same content can be hosted and edited in more than one location in a manner similar to distributed version control, meant that the concept of a single discrete "wiki" no longer made sense.

Implementations

Wiki software is a type of collaborative software that runs a wiki system, allowing web pages to be created and edited using a common web browser. It may be implemented as a series of scripts behind an existing web server, or as a standalone application server that runs on one or more web servers. The content is stored in a file system, and changes to the content are stored in a relational database management system. A commonly implemented software package is MediaWiki, which runs Wikipedia. Alternatively, personal wikis run as a standalone application on a single computer; WikidPad is an example. One application, TiddlyWiki, simply consists of a single local HTML file with JavaScript inside it.

Wikis can also be created on a "wiki farm", where the server-side software is implemented by the wiki farm owner. PBwiki, Socialtext, and Wikia are popular examples of such services. Some wiki farms can also host private, password-protected wikis. Note that free wiki farms generally place advertising on every page.

Trust and security

Controlling changes

History comparison reports highlight the changes between two revisions of a page.
 
Wikis are generally designed with the philosophy of making it easy to correct mistakes, rather than making it difficult to make them. Thus, while wikis are very open, they provide a means to verify the validity of recent additions to the body of pages. The most prominent, on almost every wiki, is the "Recent Changes" page—a specific list numbering recent edits, or a list of edits made within a given time frame. Some wikis can filter the list to remove minor edits and edits made by automatic importing scripts ("bots"). From the change log, other functions are accessible in most wikis: the revision history shows previous page versions and the diff feature highlights the changes between two revisions. Using the revision history, an editor can view and restore a previous version of the article. This gives great power to the author to eliminate edits. The diff feature can be used to decide whether or not this is necessary. A regular wiki user can view the diff of an edit listed on the "Recent Changes" page and, if it is an unacceptable edit, consult the history, restoring a previous revision; this process is more or less streamlined, depending on the wiki software used.
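Python's standard library happens to contain everything needed to sketch the diff feature described above; real wiki engines render something similar, often side by side with highlighting:

```python
# Comparing two revisions of a page with the standard difflib module.
import difflib

old = "Wikis make it easy to correct mistakes.\nVandalism is rare.\n"
new = "Wikis make it easy to correct mistakes.\nVandalism is reverted quickly.\n"

for line in difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="revision 1", tofile="revision 2", lineterm=""):
    print(line)
```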

In case unacceptable edits are missed on the "recent changes" page, some wiki engines provide additional content control. A page, or a set of pages, can be monitored to ensure that it keeps its quality. A person willing to maintain pages will be warned of modifications to them, allowing him or her to verify the validity of new edits quickly. This can be seen as a very pro-author and anti-editor feature. A watchlist is a common implementation of this. Some wikis also implement "patrolled revisions", in which editors with the requisite credentials can mark some edits as not vandalism. A "flagged revisions" system can prevent edits from going live until they have been reviewed.
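A watchlist reduces, at bottom, to a subscription map from pages to users. This toy sketch (all names invented for illustration) shows the notify-on-edit shape:

```python
# Hypothetical watchlist: users subscribe to pages and are notified of
# modifications so they can verify new edits quickly.
from collections import defaultdict

watchers = defaultdict(set)

def watch(user, page):
    watchers[page].add(user)

def notify_on_edit(page, editor, summary):
    # Tell every watcher except the editor themselves.
    for user in watchers[page] - {editor}:
        print(f"{user}: '{page}' edited by {editor} ({summary})")

watch("alice", "Blaster Worm")
notify_on_edit("Blaster Worm", "bob", "Removed broken link")
```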

Trustworthiness and reliability of content

Quality dimensions of the wiki and other sources: Wikipedia use case
 
Critics of publicly editable wiki systems argue that these systems could be easily tampered with by malicious individuals ("vandals") or even by well-meaning but unskilled users who introduce errors into the content, while proponents argue that the community of users can catch malicious content and correct it. Lars Aronsson, a data systems specialist, summarizes the controversy as follows: "Most people, when they first learn about the wiki concept, assume that a Web site that can be edited by anybody would soon be rendered useless by destructive input. It sounds like offering free spray cans next to a grey concrete wall. The only likely outcome would be ugly graffiti and simple tagging, and many artistic efforts would not be long lived. Still, it seems to work very well." High editorial standards in medicine and health sciences articles, in which users typically use peer-reviewed journals or university textbooks as sources, have led to the idea of expert-moderated wikis. Some wikis allow one to link to specific versions of articles, which has been useful to the scientific community, in that expert peer reviewers could analyse articles, improve them, and provide links to the trusted version of that article. Noveck points out that "participants are accredited by members of the wiki community, who have a vested interest in preserving the quality of the work product, on the basis of their ongoing participation." On controversial topics that have been subject to disruptive editing, a wiki author may restrict editing to registered users.

Security

The open philosophy of wikis – allowing anyone to edit content – does not ensure that every editor is well-intentioned. For example, vandalism (changing wiki content to something offensive, adding nonsense, or deliberately adding incorrect information, such as hoax information) can be a major problem. On larger wiki sites, such as those run by the Wikimedia Foundation, vandalism can go unnoticed for some period of time. Wikis, because of their open nature, are susceptible to intentional disruption, known as "trolling". Wikis tend to take a soft-security approach to the problem of vandalism, making damage easy to undo rather than attempting to prevent it. Larger wikis often employ sophisticated methods, such as bots that automatically identify and revert vandalism and JavaScript enhancements that show the characters that have been added in each edit. In this way, vandalism can be limited to "minor vandalism" or "sneaky vandalism", where the characters added or eliminated are so few that bots do not identify them and users do not pay much attention to them. An example of a bot that reverts vandalism on Wikipedia is ClueBot NG, which can revert edits, often within minutes, if not seconds; it uses machine learning in lieu of heuristics.
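ClueBot NG's actual classifier is a trained machine-learning model, but the revert-if-suspicious shape can be sketched with a few crude hand-written signals. Everything below is a toy stand-in, not how the bot works:

```python
# Toy heuristic stand-in for a vandalism-reverting bot: score a few
# crude signals and revert the edit if the score crosses a threshold.
def vandalism_score(old_text: str, new_text: str) -> float:
    score = 0.0
    if len(new_text) < 0.2 * len(old_text):
        score += 0.5                      # most of the page was blanked
    shouting = sum(c.isupper() for c in new_text)
    if new_text and shouting / len(new_text) > 0.5:
        score += 0.3                      # mostly capital letters
    if any(w in new_text.lower() for w in ("!!!", "lol")):
        score += 0.3                      # common nonsense markers
    return score

def maybe_revert(page, old_text, new_text, threshold=0.5):
    if vandalism_score(old_text, new_text) >= threshold:
        print(f"Reverting suspicious edit to {page}")

maybe_revert("Blaster Worm", "A long article about the worm...", "LOL!!!")
```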

The amount of vandalism a wiki receives depends on how open the wiki is. For instance, some wikis allow unregistered users, identified by their IP addresses, to edit content, while others limit this function to just registered users. Most wikis allow anonymous editing without an account, but give registered users additional editing functions; on most wikis, becoming a registered user is a short and simple process. Some wikis require an additional waiting period before gaining access to certain tools. For example, on the English Wikipedia, registered users can rename pages only if their account is at least four days old and has made at least ten edits. Other wikis, such as the Portuguese Wikipedia, use an editing requirement instead of a time requirement, granting extra tools after the user has made a certain number of edits to prove their trustworthiness and usefulness as an editor. Vandalism of Wikipedia is common (though policed and usually reverted) because it is extremely open, allowing anyone with a computer and Internet access to edit it, although this same openness is what lets it grow rapidly. In contrast, Citizendium requires an editor's real name and a short autobiography, which slows the growth of the wiki but sometimes helps stop vandalism.
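The English Wikipedia rule just mentioned (an account at least four days old with at least ten edits) is simply a conjunction of two threshold checks, as this sketch shows; the function and field names are invented for illustration:

```python
# Sketch of a time-plus-edits threshold check, in the spirit of the
# English Wikipedia rule described above.
from datetime import datetime, timedelta, timezone

def may_rename_pages(registered_at, edit_count,
                     min_age=timedelta(days=4), min_edits=10):
    age = datetime.now(timezone.utc) - registered_at
    return age >= min_age and edit_count >= min_edits

new_user = datetime.now(timezone.utc) - timedelta(days=1)
print(may_rename_pages(new_user, 50))   # False: account too young
```

A wiki like the Portuguese Wikipedia, as described above, would drop the age check and raise the edit-count threshold instead.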

Edit wars can also occur as users repetitively revert a page to the version they favor. In some cases, editors with opposing views of which content should appear or what formatting style should be used will change and re-change each other's edits. This results in the page being "unstable" from a general user's perspective, because each time a general user comes to the page, it may look different. Some wiki software allows an administrator to stop such edit wars by locking a page from further editing until a decision has been made on what version of the page would be most appropriate. Some wikis are in a better position than others to control behavior due to governance structures existing outside the wiki. For instance, a college teacher can create incentives for students to behave themselves on a class wiki they administer by limiting editing to logged-in users and pointing out that all contributions can be traced back to the contributors. Bad behavior can then be dealt with in accordance with university policies. The issue of wiki vandalism is debated. In some cases, when an editor deletes an entire article and replaces it with nonsense content, it may be a "test edit", made by a user experimenting with the wiki system. Some editors may not realize that they have damaged the page, or if they do realize it, they may not know how to undo the mistake or restore the content.

Potential malware vector

Malware can also be a problem for wikis, as users can add links to sites hosting malicious code. For example, a German Wikipedia article about the Blaster Worm was edited to include a hyperlink to a malicious website. Users of vulnerable Microsoft Windows systems who followed the link would be infected. A countermeasure is the use of software that prevents users from saving an edit that contains a link to a site listed on a blacklist of malware sites.
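The countermeasure described above amounts to a link check at save time. In this sketch, the blacklist contents, domain, and function names are all made up for illustration:

```python
# Sketch of refusing to save an edit whose links appear on a blacklist
# of known malware domains.
import re
from urllib.parse import urlparse

BLACKLIST = {"malware.example.com"}  # illustrative entry

def save_edit(text: str) -> bool:
    for url in re.findall(r"https?://\S+", text):
        if urlparse(url).hostname in BLACKLIST:
            print(f"Edit rejected: link to blacklisted site {url}")
            return False
    print("Edit saved.")
    return True

save_edit("See http://malware.example.com/blaster for details.")
```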

Communities

Applications

The home page of the English Wikipedia
 
The English Wikipedia has the largest user base among wikis on the World Wide Web and ranks in the top 10 among all Web sites in terms of traffic. Other large wikis include the WikiWikiWeb, Memory Alpha, Wikivoyage, and Susning.nu, a Swedish-language knowledge base. Medical and health-related wiki examples include Ganfyd, an online collaborative medical reference that is edited by medical professionals and invited non-medical experts. Many wiki communities are private, particularly within enterprises. They are often used as internal documentation for in-house systems and applications. Some companies use wikis to allow customers to help produce software documentation. A study of corporate wiki users found that they could be divided into "synthesizers" and "adders" of content: synthesizers' frequency of contribution was affected more by their impact on other wiki users, while adders' contribution frequency was affected more by being able to accomplish their immediate work. From a study of thousands of wiki deployments, Jonathan Grudin concluded that careful stakeholder analysis and education are crucial to successful wiki deployment.

In 2005, the Gartner Group, noting the increasing popularity of wikis, estimated that they would become mainstream collaboration tools in at least 50% of companies by 2009. Wikis can be used for project management. Wikis have also been used in the academic community for sharing and dissemination of information across institutional and international boundaries. In those settings, they have been found useful for collaboration on grant writing, strategic planning, departmental documentation, and committee work. In the mid-2000s, the increasing trend among industries toward collaboration placed greater pressure on educators to make students proficient in collaborative work, inspiring even greater interest in wikis being used in the classroom.

Wikis have found some use within the legal profession and within government. Examples include the Central Intelligence Agency's Intellipedia, designed to share and collect intelligence; dKospedia, which was used by the American Civil Liberties Union to assist with review of documents pertaining to internment of detainees in Guantánamo Bay; and the wiki of the United States Court of Appeals for the Seventh Circuit, used to post court rules and allow practitioners to comment and ask questions. The United States Patent and Trademark Office operates Peer-to-Patent, a wiki that allows the public to collaborate on finding prior art relevant to the examination of pending patent applications. Queens, New York has used a wiki to allow citizens to collaborate on the design and planning of a local park. Cornell Law School founded a wiki-based legal dictionary called Wex, whose growth has been hampered by restrictions on who can edit.

In academic contexts, wikis have also been used as project collaboration and research support systems.

City wikis

A city wiki (or local wiki) is a wiki used as a knowledge base and social network for a specific geographical locale. The term 'city wiki' or its foreign-language equivalent (e.g. German 'Stadtwiki') is sometimes also used for wikis that cover not just a city but a small town or an entire region. A city wiki contains information about specific instances of things, ideas, people, and places. Much of this information might not be appropriate for encyclopedias such as Wikipedia (e.g., articles on every retail outlet in a town), but might be appropriate for a wiki with more localized content and viewers. A city wiki could also contain information about the following subjects, which may or may not be appropriate for a general knowledge wiki:
  • Details of public establishments such as public houses, bars, accommodation or social centers
  • Owner name, opening hours and statistics for a specific shop
  • Statistical information about a specific road in a city
  • Flavors of ice cream served at a local ice cream parlor
  • A biography of a local mayor and other persons

WikiNodes

Visualization of the collaborative work in the German wiki project Mathe für Nicht-Freaks
 
WikiNodes are pages on wikis that describe related wikis. They are usually organized as neighbors and delegates. A neighbor wiki is simply a wiki that may discuss similar content or may otherwise be of interest. A delegate wiki is a wiki that agrees to have certain content delegated to that wiki. One way of finding a wiki on a specific subject is to follow the wiki-node network from wiki to wiki; another is to take a Wiki "bus tour", for example: Wikipedia's Tour Bus Stop.

Participants

The four basic types of users who participate in wikis are reader, author, wiki administrator and system administrator. The system administrator is responsible for installation and maintenance of the wiki engine and the container web server. The wiki administrator maintains wiki content and is provided additional functions pertaining to pages (e.g. page protection and deletion), and can adjust users' access rights by, for instance, blocking them from editing.

Growth factors

A study of several hundred wikis showed that a relatively high number of administrators for a given content size is likely to reduce growth; that access controls restricting editing to registered users tend to reduce growth; that a lack of such access controls tends to fuel new user registration; and that higher administration ratios (i.e., admins per user) have no significant effect on content or population growth.

Conferences

Active conferences and meetings about wiki-related topics include:
Former wiki-related events include:
  • RecentChangesCamp (2006–2012), an unconference on wiki-related topics.
  • RegioWikiCamp (2009–2013), a semi-annual unconference on "regiowikis", or wikis on cities and other geographic areas.

Rules

Wikis typically have a set of rules governing user behavior. Wikipedia, for instance, has a labyrinthine set of policies and guidelines summed up in its five pillars: Wikipedia is an encyclopedia; Wikipedia has a neutral point of view; Wikipedia is free content; Wikipedians should interact in a respectful and civil manner; and Wikipedia does not have firm rules. Many wikis have adopted a set of commandments. For instance, Conservapedia commands, among other things, that its editors use "B.C." rather than "B.C.E." when referring to years prior to C.E. 1 and refrain from "unproductive activity." One teacher instituted a commandment for a class wiki, "Wiki unto others as you would have them wiki unto you."

Legal environment

Joint authorship of articles, in which different users participate in correcting, editing, and compiling the finished product, can also cause editors to become tenants in common of the copyright, making it impossible to republish without permission of all co-owners, some of whose identities may be unknown due to pseudonymous or anonymous editing. Where persons contribute to a collective work such as an encyclopedia, there is, however, no joint ownership if the contributions are separate and distinguishable. Despite most wikis' tracking of individual contributions, the action of contributing to a wiki page is still arguably one of jointly correcting, editing, or compiling, which would give rise to joint ownership. Some copyright issues can be alleviated through the use of an open content license. Version 2 of the GNU Free Documentation License includes a specific provision for wiki relicensing; Creative Commons licenses are also popular. When no license is specified, an implied license to read and add content to a wiki may be deemed to exist on the grounds of business necessity and the inherent nature of a wiki, although the legal basis for such an implied license may not exist in all circumstances.

Wikis and their users can be held liable for certain activities that occur on the wiki. If a wiki owner displays indifference and forgoes controls (such as banning copyright infringers) that he could have exercised to stop copyright infringement, he may be deemed to have authorized infringement, especially if the wiki is primarily used to infringe copyrights or obtains direct financial benefit, such as advertising revenue, from infringing activities. In the United States, wikis may benefit from Section 230 of the Communications Decency Act, which protects sites that engage in "Good Samaritan" policing of harmful material, with no requirement on the quality or quantity of such self-policing. It has also been argued, however, that a wiki's enforcement of certain rules, such as anti-bias, verifiability, reliable sourcing, and no-original-research policies, could pose legal risks. When defamation occurs on a wiki, theoretically all users of the wiki can be held liable, because any of them had the ability to remove or amend the defamatory material from the "publication." It remains to be seen whether wikis will be regarded as more akin to an internet service provider, which is generally not held liable due to its lack of control over publications' contents, than a publisher. It has been recommended that trademark owners monitor what information is presented about their trademarks on wikis, since courts may use such content as evidence pertaining to public perceptions. Joshua Jarvis notes, "Once misinformation is identified, the trade mark owner can simply edit the entry."

Peasant

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Peasant

Young women offer berries to visitors to their izba home...