Monday, December 16, 2024

Deepfake

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Deepfake

Deepfakes (a portmanteau of 'deep learning' and 'fake') are images, videos, or audio that are edited or generated using artificial intelligence tools, and which may depict real or non-existent people. They are a type of synthetic media and a modern form of media prank.

While the act of creating fake content is not new, deepfakes uniquely leverage the technological tools and techniques of machine learning and artificial intelligence, including facial recognition algorithms and artificial neural networks such as variational autoencoders (VAEs) and generative adversarial networks (GANs).

In turn the field of image forensics develops techniques to detect manipulated images. Deepfakes have garnered widespread attention for their potential use in creating child sexual abuse material, celebrity pornographic videos, revenge porn, fake news, hoaxes, bullying, and financial fraud.

Academics have raised concerns about the potential for deepfakes to be used to promote disinformation and hate speech, and to interfere with elections. The information technology industry and governments have responded with recommendations to detect and limit their use.

From traditional entertainment to gaming, deepfake technology has evolved to be increasingly convincing and available to the public, allowing for the disruption of the entertainment and media industries.

History

Portrait of actress Sydney Sweeney generated by Stable Diffusion

Photo manipulation was developed in the 19th century and soon applied to motion pictures. Technology steadily improved during the 20th century, and more quickly with the advent of digital video.

Deepfake technology has been developed by researchers at academic institutions beginning in the 1990s, and later by amateurs in online communities. More recently the methods have been adopted by industry.

Academic research

Academic research related to deepfakes is split between the field of computer vision, a sub-field of computer science, which develops techniques for creating and identifying deepfakes, and humanities and social science approaches that study the social, ethical and aesthetic implications of deepfakes.

Social science and humanities approaches to deepfakes

In cinema studies, deepfakes demonstrate how "the human face is emerging as a central object of ambivalence in the digital age". Video artists have used deepfakes to "playfully rewrite film history by retrofitting canonical cinema with new star performers". Film scholar Christopher Holliday analyses how switching out the gender and race of performers in familiar movie scenes destabilizes gender classifications and categories. The idea of "queering" deepfakes is also discussed in Oliver M. Gingrich's discussion of media artworks that use deepfakes to reframe gender, including British artist Jake Elwes' Zizi: Queering the Dataset, an artwork that uses deepfakes of drag queens to intentionally play with gender. The aesthetic potentials of deepfakes are also beginning to be explored. Theatre historian John Fletcher notes that early demonstrations of deepfakes are presented as performances, and situates these in the context of theater, discussing "some of the more troubling paradigm shifts" that deepfakes represent as a performance genre.

Philosophers and media scholars have discussed the ethics of deepfakes, especially in relation to pornography. Media scholar Emily van der Nagel draws upon research in photography studies on manipulated images to discuss verification systems that allow women to consent to uses of their images.

Beyond pornography, deepfakes have been framed by philosophers as an "epistemic threat" to knowledge and thus to society. There are several other suggestions for how to deal with the risks that deepfakes give rise to, beyond pornography, for corporations, politicians and others, of "exploitation, intimidation, and personal sabotage", and there are several scholarly discussions of potential legal and regulatory responses both in legal studies and media studies. In psychology and media studies, scholars discuss the effects of disinformation that uses deepfakes, and the social impact of deepfakes.

While most English-language academic studies of deepfakes focus on the Western anxieties about disinformation and pornography, digital anthropologist Gabriele de Seta has analyzed the Chinese reception of deepfakes, which are known as huanlian, which translates to "changing faces". The Chinese term does not contain the "fake" of the English deepfake, and de Seta argues that this cultural context may explain why the Chinese response has been more about practical regulatory responses to "fraud risks, image rights, economic profit, and ethical imbalances".

Computer science research on deepfakes

An early landmark project was the Video Rewrite program, published in 1997. The program modified existing video footage of a person speaking to depict that person mouthing the words contained in a different audio track. It was the first system to fully automate this kind of facial reanimation, and it did so using machine learning techniques to make connections between the sounds produced by a video's subject and the shape of the subject's face.

Contemporary academic projects have focused on creating more realistic videos and on improving techniques. The "Synthesizing Obama" program, published in 2017, modifies video footage of former president Barack Obama to depict him mouthing the words contained in a separate audio track. The project lists as a main research contribution its photorealistic technique for synthesizing mouth shapes from audio. The Face2Face program, published in 2016, modifies video footage of a person's face to depict them mimicking the facial expressions of another person in real time. The project lists as a main research contribution the first method for re-enacting facial expressions in real time using a camera that does not capture depth, making it possible for the technique to be performed using common consumer cameras.

In August 2018, researchers at the University of California, Berkeley published a paper introducing a fake dancing app that can create the impression of masterful dancing ability using AI. This project expands the application of deepfakes to the entire body; previous works focused on the head or parts of the face.

Researchers have also shown that deepfakes are expanding into other domains such as tampering with medical imagery. In this work, it was shown how an attacker can automatically inject or remove lung cancer in a patient's 3D CT scan. The result was so convincing that it fooled three radiologists and a state-of-the-art lung cancer detection AI. To demonstrate the threat, the authors successfully performed the attack on a hospital in a White hat penetration test.

A survey of deepfakes, published in May 2020, provides a timeline of how the creation and detection of deepfakes have advanced over the last few years. The survey identifies that researchers have been focusing on resolving the following challenges of deepfake creation:

  • Generalization. High-quality deepfakes are often achieved by training on hours of footage of the target. This challenge is to minimize the amount of training data and the time to train the model required to produce quality images and to enable the execution of trained models on new identities (unseen during training).
  • Paired Training. Training a supervised model can produce high-quality results, but requires data pairing. This is the process of finding examples of inputs and their desired outputs for the model to learn from. Data pairing is laborious and impractical when training on multiple identities and facial behaviors. Some solutions include self-supervised training (using frames from the same video), the use of unpaired networks such as Cycle-GAN, or the manipulation of network embeddings.
  • Identity leakage. This is where the identity of the driver (i.e., the actor controlling the face in a reenactment) is partially transferred to the generated face. Some solutions proposed include attention mechanisms, few-shot learning, disentanglement, boundary conversions, and skip connections.
  • Occlusions. When part of the face is obstructed with a hand, hair, glasses, or any other item then artifacts can occur. A common occlusion is a closed mouth which hides the inside of the mouth and the teeth. Some solutions include image segmentation during training and in-painting.
  • Temporal coherence. In videos containing deepfakes, artifacts such as flickering and jitter can occur because the network has no context of the preceding frames. Some researchers provide this context or use novel temporal coherence losses to help improve realism. As the technology improves, the interference is diminishing.
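The temporal coherence item above can be illustrated with a minimal sketch of a loss term that penalizes frame-to-frame change. This is an assumption-laden toy, not the formulation used by any particular paper: frames are flattened lists of pixel values, the function name `temporal_coherence_loss` is hypothetical, and published losses are usually more elaborate (for example, they may compensate for genuine motion before comparing frames).

```python
def temporal_coherence_loss(frames):
    """Mean squared difference between consecutive frames.

    `frames` is a list of equally sized lists of pixel values
    (a hypothetical stand-in for generated video frames).
    """
    if len(frames) < 2:
        return 0.0
    total, count = 0.0, 0
    for prev, curr in zip(frames, frames[1:]):
        for a, b in zip(prev, curr):
            total += (a - b) ** 2
            count += 1
    return total / count


steady = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]    # no change between frames
flicker = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]   # brightness jumps every frame

print(temporal_coherence_loss(steady))   # 0.0
print(temporal_coherence_loss(flicker))  # 1.0
```

Adding a term like this to a training objective would discourage flicker, at the cost of also discouraging legitimate motion unless the comparison accounts for it.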

Overall, deepfakes are expected to have several implications in media and society, media production, media representations, media audiences, gender, law and regulation, and politics.

Amateur development

The term deepfakes originated around the end of 2017 from a Reddit user named "deepfakes". He, as well as others in the Reddit community r/deepfakes, shared deepfakes they created; many videos involved celebrities' faces swapped onto the bodies of actors in pornographic videos, while non-pornographic content included many videos with actor Nicolas Cage's face swapped into various movies.

Other online communities remain, including Reddit communities that do not share pornography, such as r/SFWdeepfakes (short for "safe for work deepfakes"), in which community members share deepfakes depicting celebrities, politicians, and others in non-pornographic scenarios. Other online communities continue to share pornography on platforms that have not banned deepfake pornography.

Commercial development

In January 2018, a proprietary desktop application called FakeApp was launched. This app allows users to easily create and share videos with their faces swapped with each other. As of 2019, FakeApp has been superseded by open-source alternatives such as Faceswap, command line-based DeepFaceLab, and web-based apps such as DeepfakesWeb.com.

Larger companies started to use deepfakes. Corporate training videos can be created using deepfaked avatars and their voices, for example Synthesia, which uses deepfake technology with avatars to create personalized videos. The mobile app Momo created the application Zao which allows users to superimpose their face on television and movie clips with a single picture. As of 2019 the Japanese AI company DataGrid made a full body deepfake that could create a person from scratch.

As of 2020, audio deepfakes also exist, as does AI software capable of detecting deepfakes and cloning human voices after 5 seconds of listening time. A mobile deepfake app, Impressions, was launched in March 2020. It was the first app for the creation of celebrity deepfake videos from mobile phones.

Resurrection

Deepfake technology's ability to fabricate messages and actions of others can include deceased individuals. On 29 October 2020, Kim Kardashian posted a video featuring a hologram of her late father Robert Kardashian created by the company Kaleida, which used a combination of performance, motion tracking, SFX, VFX and DeepFake technologies to create the illusion.

In 2020, a deepfake video of Joaquin Oliver, a victim of the Parkland shooting, was created as part of a gun safety campaign. Oliver's parents partnered with nonprofit Change the Ref and McCann Health to produce a video in which Oliver encourages people to support gun safety legislation and the politicians who back it.

In 2022, a deepfake video of Elvis Presley was used on the program America's Got Talent 17.

A TV commercial used a deepfake video of Beatles member John Lennon, who was murdered in 1980.

Techniques

Deepfakes rely on a type of neural network called an autoencoder. These consist of an encoder, which reduces an image to a lower-dimensional latent space, and a decoder, which reconstructs the image from the latent representation. Deepfakes utilize this architecture by having a universal encoder which encodes a person into the latent space. The latent representation contains key information about their facial features and body posture. This can then be decoded with a model trained specifically for the target. This means the target's detailed information will be superimposed on the underlying facial and body features of the original video, represented in the latent space.
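The data flow described above (one shared encoder, one decoder per identity) can be sketched as follows. This is a toy under stated assumptions: the weights are random and untrained, images are flattened lists of numbers, and the class name `FaceSwapAutoencoder` and its methods are hypothetical. It shows only how a frame is routed through the shared encoder and then through the target's decoder, not how real tools are implemented.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Placeholder for learned weights (hypothetical, untrained)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class FaceSwapAutoencoder:
    """One shared encoder and one decoder per identity."""

    def __init__(self, image_size, latent_size, identities, seed=0):
        rng = random.Random(seed)
        # Shared encoder: image -> lower-dimensional latent vector.
        self.encoder = random_matrix(latent_size, image_size, rng)
        # Per-identity decoders: latent vector -> image of that identity.
        self.decoders = {
            name: random_matrix(image_size, latent_size, rng)
            for name in identities
        }

    def encode(self, image):
        return matvec(self.encoder, image)

    def decode(self, latent, identity):
        return matvec(self.decoders[identity], latent)

    def swap(self, source_image, target_identity):
        # Pose and expression travel through the latent code; the
        # identity-specific detail comes from the target's decoder.
        return self.decode(self.encode(source_image), target_identity)


model = FaceSwapAutoencoder(image_size=16, latent_size=4,
                            identities=["source", "target"])
frame = [0.5] * 16                      # stand-in for a flattened face crop
latent = model.encode(frame)
swapped = model.swap(frame, "target")
print(len(latent), len(swapped))        # 4 16
```

In a trained system, each decoder would be fitted to reconstruct faces of its own identity from the shared latent space, which is what makes the cross-decoding step produce a swap.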

A popular upgrade to this architecture attaches a generative adversarial network to the decoder. A GAN trains a generator, in this case the decoder, and a discriminator in an adversarial relationship. The generator creates new images from the latent representation of the source material, while the discriminator attempts to determine whether or not the image is generated. This causes the generator to create images that mimic reality extremely well, as any defects would be caught by the discriminator. Both algorithms improve constantly in a zero-sum game. This makes deepfakes difficult to combat as they are constantly evolving; any time a defect is detected, it can be corrected.
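The adversarial relationship can be made concrete with a minimal sketch of the two loss functions. It assumes the discriminator outputs a probability that an image is real, and it uses binary cross-entropy with the common "non-saturating" generator loss, a widely used variant that is not strictly zero-sum. The function names are hypothetical and no networks are trained here.

```python
import math


def bce(prediction, label, eps=1e-12):
    """Binary cross-entropy for a single probability."""
    p = min(max(prediction, eps), 1.0 - eps)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    # The discriminator wants real images scored near 1 and
    # generated images scored near 0.
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake):
    # The generator wants its output to be scored as real.
    return bce(d_fake, 1.0)


# d_fake is the discriminator's score for a generated image.
fooled = generator_loss(0.9)        # discriminator believes the fake
caught = generator_loss(0.1)        # discriminator spots the fake
print(fooled < caught)              # True
```

Training alternates between lowering `discriminator_loss` and lowering `generator_loss`, so an improvement on one side raises the pressure on the other.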

Applications

Acting

Digital clones of professional actors have appeared in films before, and progress in deepfake technology is expected to further the accessibility and effectiveness of such clones. The use of AI technology was a major issue in the 2023 SAG-AFTRA strike, as new techniques enabled the capability of generating and storing a digital likeness to use in place of actors.

Disney has improved their visual effects using high-resolution deepfake face swapping technology. Disney improved their technology through progressive training programmed to identify facial expressions, implementing a face-swapping feature, and iterating in order to stabilize and refine the output. This high-resolution deepfake technology saves significant operational and production costs. Disney's deepfake generation model can produce AI-generated media at a 1024 x 1024 resolution, as opposed to common models that produce media at a 256 x 256 resolution. The technology allows Disney to de-age characters or revive deceased actors. Similar technology was initially used by fans to unofficially insert faces into existing media, such as overlaying Harrison Ford's young face onto Han Solo's face in Solo: A Star Wars Story. Disney used deepfakes for the characters of Princess Leia and Grand Moff Tarkin in Rogue One.

The 2020 documentary Welcome to Chechnya used deepfake technology to obscure the identity of the people interviewed, so as to protect them from retaliation.

Creative Artists Agency has developed a facility to capture the likeness of an actor "in a single day", to develop a digital clone of the actor, which would be controlled by the actor or their estate alongside other personality rights.

Companies which have used digital clones of professional actors in advertisements include Puma, Nike and Procter & Gamble.

Deepfake technology allowed David Beckham to appear in a campaign in nearly nine languages to raise awareness of the fight against malaria.

In the 2024 Indian Tamil science fiction action thriller The Greatest of All Time, the teenage version of Vijay's character Jeevan is portrayed by Ayaz Khan. Vijay's teenage face was then created using an AI deepfake.

Art

In March 2018 the multidisciplinary artist Joseph Ayerle published the video artwork Un'emozione per sempre 2.0 (English title: The Italian Game). The artist worked with deepfake technology to create an AI actor, a synthetic version of 80s movie star Ornella Muti, traveling in time from 1978 to 2018. The Massachusetts Institute of Technology referred to this artwork in the study "Collective Wisdom". The artist used Ornella Muti's time travel to explore generational reflections, while also investigating questions about the role of provocation in the world of art. For the technical realization Ayerle used scenes of photo model Kendall Jenner. The program replaced Jenner's face with an AI-calculated face of Ornella Muti. As a result, the AI actor has the face of the Italian actor Ornella Muti and the body of Kendall Jenner.

Deepfakes are also being used in education and media to create realistic videos and interactive content, which offer new ways to engage audiences. However, they also bring risks, especially for spreading false information, which has led to calls for responsible use and clear rules.

Deepfakes have been widely used in satire or to parody celebrities and politicians. The 2020 webseries Sassy Justice, created by Trey Parker and Matt Stone, heavily features the use of deepfaked public figures to satirize current events and raise awareness of deepfake technology.

Blackmail

Deepfakes can be used to generate blackmail materials that falsely incriminate a victim. A report by the American Congressional Research Service warned that deepfakes could be used to blackmail elected officials or those with access to classified information for espionage or influence purposes.

Alternatively, since the fakes cannot reliably be distinguished from genuine materials, victims of actual blackmail can now claim that the true artifacts are fakes, granting them plausible deniability. The effect is to void credibility of existing blackmail materials, which erases loyalty to blackmailers and destroys the blackmailer's control. This phenomenon can be termed "blackmail inflation", since it "devalues" real blackmail, rendering it worthless. It is possible to utilize commodity GPU hardware with a small software program to generate this blackmail content for any number of subjects in huge quantities, driving up the supply of fake blackmail content limitlessly and in highly scalable fashion.

Entertainment

On June 8, 2022, Daniel Emmet, a former AGT contestant, teamed up with the AI startup Metaphysic AI, to create a hyperrealistic deepfake to make it appear as Simon Cowell. Cowell, notoriously known for severely critiquing contestants, was on stage interpreting "You're The Inspiration" by Chicago. Emmet sang on stage as an image of Simon Cowell emerged on the screen behind him in flawless synchronicity.

On August 30, 2022, Metaphysic AI had 'deep-fake' Simon Cowell, Howie Mandel and Terry Crews singing opera on stage.

On September 13, 2022, Metaphysic AI performed with a synthetic version of Elvis Presley for the finals of America's Got Talent.

The MIT artificial intelligence project 15.ai has been used for content creation for multiple Internet fandoms, particularly on social media.

In 2023 the bands ABBA and KISS partnered with Industrial Light & Magic and Pophouse Entertainment to develop deepfake avatars capable of performing virtual concerts.

Fraud and scams

Fraudsters and scammers make use of deepfakes to trick people into fake investment schemes, financial fraud, cryptocurrencies, sending money, and following endorsements. The likenesses of celebrities and politicians have been used for large-scale scams, as well as those of private individuals, which are used in spearphishing attacks. According to the Better Business Bureau, deepfake scams are becoming more prevalent. These scams are responsible for an estimated $12 billion in fraud losses globally. According to a recent report, these numbers are expected to reach $40 billion over the next three years.

Fake endorsements have misused the identities of celebrities like Taylor Swift, Tom Hanks, Oprah Winfrey, and Elon Musk; news anchors like Gayle King and Sally Bundock; and politicians like Lee Hsien Loong and Jim Chalmers. Videos of them have appeared in online advertisements on YouTube, Facebook, and TikTok, which have policies against synthetic and manipulated media. Ads running these videos are seen by millions of people. A single Medicare fraud campaign had been viewed more than 195 million times across thousands of videos. Deepfakes have been used for: a fake giveaway of Le Creuset cookware for a "shipping fee", in which victims never received the products but incurred hidden monthly charges; weight-loss gummies that charge significantly more than what was said; a fake iPhone giveaway; and fraudulent get-rich-quick, investment, and cryptocurrency schemes.

Many ads pair AI voice cloning with "decontextualized video of the celebrity" to mimic authenticity. Others use a whole clip from a celebrity before moving to a different actor or voice. Some scams may involve real-time deepfakes.

Celebrities have been warning people of these fake endorsements, and to be more vigilant against them. Celebrities are unlikely to file lawsuits against every person operating deepfake scams, as "finding and suing anonymous social media users is resource intensive," though cease and desist letters to social media companies work in getting videos and ads taken down.

Audio deepfakes have been used as part of social engineering scams, fooling people into thinking they are receiving instructions from a trusted individual. In 2019, a U.K.-based energy firm's CEO was scammed over the phone when he was ordered to transfer €220,000 into a Hungarian bank account by an individual who reportedly used audio deepfake technology to impersonate the voice of the firm's parent company's chief executive.

As of 2023, the combination of advances in deepfake technology, which could clone an individual's voice from a recording of a few seconds to a minute, and new text generation tools enabled automated impersonation scams, targeting victims using a convincing digital clone of a friend or relative.

Identity masking

Audio deepfakes can be used to mask a user's real identity. In online gaming, for example, a player may want to choose a voice that sounds like their in-game character when speaking to other players. Those who are subject to harassment, such as women, children, and transgender people, can use these "voice skins" to hide their gender or age.

Memes

In 2020, an internet meme emerged utilizing deepfakes to generate videos of people singing the chorus of "Baka Mitai" (ばかみたい), a song from the game Yakuza 0 in the video game series Like a Dragon. In the series, the melancholic song is sung by the player in a karaoke minigame. Most iterations of this meme use a 2017 video uploaded by user Dobbsyrules, who lip syncs the song, as a template.

Politics

Deepfakes have been used to misrepresent well-known politicians in videos.

  • In February 2018, in separate videos, the face of the Argentine President Mauricio Macri was replaced by the face of Adolf Hitler, and Angela Merkel's face was replaced with Donald Trump's.
  • In April 2018, Jordan Peele collaborated with Buzzfeed to create a deepfake of Barack Obama with Peele's voice; it served as a public service announcement to increase awareness of deepfakes.
  • In January 2019, Fox affiliate KCPQ aired a deepfake of Trump during his Oval Office address, mocking his appearance and skin colour. The employee found responsible for the video was subsequently fired.
  • In June 2019, the United States House Intelligence Committee held hearings on the potential malicious use of deepfakes to sway elections.
  • In April 2020, the Belgian branch of Extinction Rebellion published a deepfake video of Belgian Prime Minister Sophie Wilmès on Facebook. The video promoted a possible link between deforestation and COVID-19. It had more than 100,000 views within 24 hours and received many comments. On the Facebook page where the video appeared, many users interpreted the deepfake video as genuine.
  • During the 2020 US presidential campaign, many deepfakes surfaced purporting to show Joe Biden in cognitive decline (falling asleep during an interview, getting lost, and misspeaking), all bolstering rumors of his decline.
  • During the 2020 Delhi Legislative Assembly election campaign, the Delhi Bharatiya Janata Party used similar technology to distribute a version of an English-language campaign advertisement by its leader, Manoj Tiwari, translated into Haryanvi to target Haryana voters. A voiceover was provided by an actor, and AI trained using video of Tiwari speeches was used to lip-sync the video to the new voiceover. A party staff member described it as a "positive" use of deepfake technology, which allowed them to "convincingly approach the target audience even if the candidate didn't speak the language of the voter."
  • In 2020, Bruno Sartori produced deepfakes parodying politicians like Jair Bolsonaro and Donald Trump.
  • In April 2021, politicians in a number of European countries were approached by pranksters Vovan and Lexus, who are accused by critics of working for the Russian state. They impersonated Leonid Volkov, a Russian opposition politician and chief of staff of the Russian opposition leader Alexei Navalny's campaign, allegedly through deepfake technology. However, the pair told The Verge that they did not use deepfakes, and just used a look-alike.
  • In May 2023, a deepfake video of Vice President Kamala Harris supposedly slurring her words and speaking nonsensically about today, tomorrow and yesterday went viral on social media.
  • In June 2023, in the United States, Ron DeSantis's presidential campaign used a deepfake to misrepresent Donald Trump.
  • In March 2024, during India's state assembly elections, deepfake technology was widely employed by political candidates to reach out to voters. Many politicians used AI-generated deepfakes created by an Indian startup, The Indian Deepfaker, founded by Divyendra Singh Jadoun, to translate their speeches into multiple regional languages, allowing them to engage with diverse linguistic communities across the country. This surge in the use of deepfakes for political campaigns marked a significant shift in electioneering tactics in India.

Pornography

In 2017, deepfake pornography prominently surfaced on the Internet, particularly on Reddit. As of 2019, many deepfakes on the internet feature pornography of female celebrities whose likeness is typically used without their consent. A report published in October 2019 by Dutch cybersecurity startup Deeptrace estimated that 96% of all deepfakes online were pornographic. In 2018, a Daisy Ridley deepfake was among the first to capture attention. As of October 2019, most of the deepfake subjects on the internet were British and American actors. However, around a quarter of the subjects were South Korean, the majority of whom were K-pop stars.

In June 2019, a downloadable Windows and Linux application called DeepNude was released that used neural networks, specifically generative adversarial networks, to remove clothing from images of women. The app had both a paid and unpaid version, the paid version costing $50. On 27 June the creators removed the application and refunded consumers.

Female celebrities are often a main target when it comes to deepfake pornography. In 2023, deepfake porn videos appeared online of Emma Watson and Scarlett Johansson in a face swapping app. In 2024, deepfake porn images circulated online of Taylor Swift.

Academic studies have reported that women, LGBT people and people of colour (particularly activists, politicians and those questioning power) are at higher risk of being targets of promulgation of deepfake pornography.

Social media

Deepfakes have begun to see use in popular social media platforms, notably through Zao, a Chinese deepfake app that allows users to substitute their own faces onto those of characters in scenes from films and television shows such as Romeo + Juliet and Game of Thrones. The app originally faced scrutiny over its invasive user data and privacy policy, after which the company put out a statement claiming it would revise the policy. In January 2020 Facebook announced that it was introducing new measures to counter deepfakes on its platforms.

The Congressional Research Service cited unspecified evidence as showing that foreign intelligence operatives used deepfakes to create social media accounts with the purposes of recruiting individuals with access to classified information.

In 2021, realistic deepfake videos of actor Tom Cruise were released on TikTok, which went viral and garnered tens of millions of views. The deepfake videos featured an "artificial intelligence-generated doppelganger" of Cruise doing various activities such as teeing off at the golf course, showing off a coin trick, and biting into a lollipop. The creator of the clips, Belgian VFX artist Chris Umé, said he first got interested in deepfakes in 2018 and saw the "creative potential" of them.

Sockpuppets

Deepfake photographs can be used to create sockpuppets, non-existent people, who are active both online and in traditional media. A deepfake photograph appears to have been generated together with a legend for an apparently non-existent person named Oliver Taylor, whose identity was described as a university student in the United Kingdom. The Oliver Taylor persona submitted opinion pieces in several newspapers and was active in online media attacking a British legal academic and his wife, as "terrorist sympathizers." The academic had drawn international attention in 2018 when he commenced a lawsuit in Israel against NSO, a surveillance company, on behalf of people in Mexico who alleged they were victims of NSO's phone hacking technology. Reuters could find only scant records for Oliver Taylor and "his" university had no records for him. Many experts agreed that the profile photo is a deepfake. Several newspapers have not retracted articles attributed to him or removed them from their websites. It is feared that such techniques are a new battleground in disinformation.

Collections of deepfake photographs of non-existent people on social networks have also been deployed as part of Israeli partisan propaganda. The Facebook page "Zionist Spring" featured photos of non-existent persons along with their "testimonies" purporting to explain why they have abandoned their left-leaning politics to embrace right-wing politics, and the page also contained large numbers of posts from Prime Minister of Israel Benjamin Netanyahu and his son and from other Israeli right-wing sources. The photographs appear to have been generated by "human image synthesis" technology, computer software that takes data from photos of real people to produce a realistic composite image of a non-existent person. In many of the "testimonies," the reason given for embracing the political right was the shock of learning of alleged incitement to violence against the prime minister. Right-wing Israeli television broadcasters then broadcast the "testimonies" of these non-existent people based on the fact that they were being "shared" online. The broadcasters aired these "testimonies" despite being unable to find such people, explaining "Why does the origin matter?" Other fake Facebook profiles (profiles of fictitious individuals) contained material that allegedly contained such incitement against the right-wing prime minister, in response to which the prime minister complained that there was a plot to murder him.

Concerns and countermeasures

Though fake photos have long been plentiful, faking motion pictures has been more difficult, and the presence of deepfakes increases the difficulty of classifying videos as genuine or not. AI researcher Alex Champandard has said people should know how fast things can be corrupted with deepfake technology, and that the problem is not a technical one, but rather one to be solved by trust in information and journalism. Computer science associate professor Hao Li of the University of Southern California states that deepfakes created for malicious use, such as fake news, will be even more harmful if nothing is done to spread awareness of deepfake technology. Li predicted that genuine videos and deepfakes would become indistinguishable in as soon as half a year, as of October 2019, due to rapid advancement in artificial intelligence and computer graphics. Former Google fraud czar Shuman Ghosemajumder has called deepfakes an area of "societal concern" and said that they will inevitably evolve to a point at which they can be generated automatically, and an individual could use that technology to produce millions of deepfake videos.

Credibility of information

A primary pitfall is that humanity could fall into an age in which it can no longer be determined whether a medium's content corresponds to the truth. Deepfakes are one of a number of tools for disinformation attack, creating doubt, and undermining trust. They have the potential to interfere with democratic functions in societies, such as identifying collective agendas, debating issues, informing decisions, and solving problems through the exercise of political will. People may also start to dismiss real events as fake.

Defamation

Deepfakes can do tremendous damage to individual entities. This is because deepfakes are often targeted at one individual, or at their relations to others, in the hope of creating a narrative powerful enough to influence public opinion or beliefs. This can be done through deepfake voice phishing, which manipulates audio to create fake phone calls or conversations. Another method is fabricated private remarks, in which media is manipulated to show individuals voicing damaging comments. The quality of a negative video or audio clip does not need to be that high. As long as someone's likeness and actions are recognizable, a deepfake can hurt their reputation.

In September 2020, Microsoft announced that it was developing a deepfake detection software tool.

Detection

Audio

Detecting fake audio is a highly complex task that requires careful attention to the audio signal in order to achieve good performance. Using deep learning, preprocessing of feature design and masking augmentation have been proven effective in improving performance.

Video

Most of the academic research surrounding deepfakes focuses on the detection of deepfake videos. One approach to deepfake detection is to use algorithms to recognize patterns and pick up subtle inconsistencies that arise in deepfake videos. For example, researchers have developed automatic systems that examine videos for errors such as irregular blinking or lighting patterns. This approach has been criticized because deepfake detection is characterized by a "moving goal post" where the production of deepfakes continues to change and improve as algorithms to detect deepfakes improve. In order to assess the most effective algorithms for detecting deepfakes, a coalition of leading technology companies hosted the Deepfake Detection Challenge to accelerate the technology for identifying manipulated content. The winning model of the Deepfake Detection Challenge was 65% accurate on the holdout set of 4,000 videos. A team at Massachusetts Institute of Technology published a paper in December 2021 demonstrating that ordinary humans are 69–72% accurate at identifying a random sample of 50 of these videos.

A team at the University at Buffalo published a paper in October 2020 outlining their technique of using reflections of light in the eyes of those depicted to spot deepfakes with a high rate of success, even without the use of an AI detection tool, at least for the time being.

In the case of well-documented individuals such as political leaders, algorithms have been developed to distinguish identity-based features such as patterns of facial, gestural, and vocal mannerisms and detect deep-fake impersonators.

Another team led by Wael AbdAlmageed with the Visual Intelligence and Multimedia Analytics Laboratory (VIMAL) of the Information Sciences Institute at the University of Southern California developed two generations of deepfake detectors based on convolutional neural networks. The first generation used recurrent neural networks to spot spatio-temporal inconsistencies to identify visual artifacts left by the deepfake generation process. The algorithm achieved 96% accuracy on FaceForensics++, the only large-scale deepfake benchmark available at that time. The second generation used end-to-end deep networks to differentiate between artifacts and high-level semantic facial information using two-branch networks. The first branch propagates colour information while the other branch suppresses facial content and amplifies low-level frequencies using Laplacian of Gaussian (LoG). Further, they included a new loss function that learns a compact representation of bona fide faces, while dispersing the representations (i.e. features) of deepfakes. VIMAL's approach showed state-of-the-art performance on FaceForensics++ and Celeb-DF benchmarks, and on March 16, 2022 (the same day of the release), was used to identify the deepfake of Volodymyr Zelenskyy out-of-the-box without any retraining or knowledge of the algorithm with which the deepfake was created.
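To make the Laplacian of Gaussian (LoG) step more concrete, the following is a minimal sketch of a LoG filter in plain Python. It is not VIMAL's implementation: the function names, the kernel size and the value of sigma are all assumptions chosen for illustration, and the kernel is only correct up to a constant scale factor. The sketch shows the general behaviour of such a filter, which returns roughly zero on flat regions and a strong response where intensity changes.

```python
import math


def log_kernel(size, sigma):
    """Sample a Laplacian-of-Gaussian kernel on a size x size grid (size odd).

    Hypothetical helper for illustration. The sampled values are shifted so
    that they sum to zero, which makes the filter ignore constant regions.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            # LoG up to a constant factor: (r^2 - 2*sigma^2) / sigma^4 * exp(-r^2 / (2*sigma^2))
            value = ((r2 - 2 * sigma ** 2) / sigma ** 4) * math.exp(-r2 / (2 * sigma ** 2))
            row.append(value)
        kernel.append(row)
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def filter_valid(image, kernel):
    """Correlate a 2-D list `image` with `kernel` wherever the kernel fully fits."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    output = []
    for i in range(height - k + 1):
        out_row = []
        for j in range(width - k + 1):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    acc += kernel[a][b] * image[i + a][j + b]
            out_row.append(acc)
        output.append(out_row)
    return output


# A flat patch gives (numerically) zero response; a vertical edge does not.
flat_patch = [[5.0] * 7 for _ in range(7)]
edge_patch = [[0.0] * 4 + [1.0] * 3 for _ in range(7)]
flat_response = filter_valid(flat_patch, log_kernel(5, 1.0))
edge_response = filter_valid(edge_patch, log_kernel(5, 1.0))
```

In a detector of the kind described above, a filter like this would be one fixed preprocessing stage feeding a learned network; the learned parts are outside the scope of this sketch.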

Other techniques suggest that blockchain could be used to verify the source of the media. For instance, a video might have to be verified through the ledger before it is shown on social media platforms. With this technology, only videos from trusted sources would be approved, decreasing the spread of possibly harmful deepfake media.
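The ledger idea can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not a description of any deployed system: `MediaLedger`, its method names and the source labels are hypothetical, and a real blockchain would add distributed consensus and authenticated submitters. It only shows the core mechanism, in which each entry commits to the hash of a media file and to the previous entry, so that a platform could look up whether a file was registered by a known source and detect tampering with the record.

```python
import hashlib
import json


def sha256_hex(data):
    """Return the SHA-256 digest of `data` (bytes) as a hex string."""
    return hashlib.sha256(data).hexdigest()


class MediaLedger:
    """Toy append-only ledger (hypothetical); each entry chains to the previous one."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def _entry_hash(self, entry):
        # Hash only the committed fields, in a stable key order.
        payload = json.dumps(
            {key: entry[key] for key in ("source", "media_hash", "prev_hash")},
            sort_keys=True,
        ).encode("utf-8")
        return sha256_hex(payload)

    def register(self, source, media):
        """Record that `source` published the media file with these bytes."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else self.GENESIS
        entry = {"source": source, "media_hash": sha256_hex(media), "prev_hash": prev_hash}
        entry["entry_hash"] = self._entry_hash(entry)
        self.entries.append(entry)
        return entry

    def chain_is_intact(self):
        """Check that no recorded entry has been altered or reordered."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["entry_hash"] != self._entry_hash(entry):
                return False
            prev_hash = entry["entry_hash"]
        return True

    def lookup_source(self, media):
        """Return the registered source for these exact bytes, or None."""
        digest = sha256_hex(media)
        for entry in self.entries:
            if entry["media_hash"] == digest:
                return entry["source"]
        return None


ledger = MediaLedger()
ledger.register("newsroom-a", b"original video bytes")
ledger.register("newsroom-b", b"another clip")
```

Note that a lookup of this kind only matches byte-identical files; any re-encoding or edit produces a different hash, so an unregistered result does not by itself show that a video is a deepfake.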

Digital signing of all video and imagery by cameras and video cameras, including smartphone cameras, has been suggested as a way to fight deepfakes. This would allow every photograph or video to be traced back to its original owner, which could also be used to pursue dissidents.

One easy way to uncover deepfake video calls is to ask the caller to turn sideways.

Prevention

Henry Ajder, who works for Deeptrace, a company that detects deepfakes, says there are several ways to protect against deepfakes in the workplace. Semantic passwords or secret questions can be used when holding important conversations. Voice authentication and other biometric security features should be kept up to date. Employees should be educated about deepfakes.

Controversies

In March 2024, a video clip was released from Buckingham Palace in which Kate Middleton said that she had cancer and was undergoing chemotherapy. However, the clip fuelled rumours that the woman in it was an AI deepfake. UCLA's race director Johnathan Perkins doubted she had cancer, and further speculated that she could be in critical condition or dead.

Example events

A fake Midjourney-created image of Donald Trump being arrested
The fake Midjourney-created image of Pope Francis wearing a puffer jacket
Barack Obama
On April 17, 2018, American actor Jordan Peele, BuzzFeed, and Monkeypaw Productions posted a deepfake of Barack Obama to YouTube, which depicted Barack Obama cursing and calling Donald Trump names. In this deepfake, Peele's voice and face were transformed and manipulated into those of Obama. The intent of this video was to portray the dangerous consequences and power of deepfakes, and how deepfakes can make anyone say anything.
Donald Trump
On May 5, 2019, Derpfakes posted a deepfake of Donald Trump to YouTube, based on a skit Jimmy Fallon performed on The Tonight Show. In the original skit (aired May 4, 2016), Jimmy Fallon dressed as Donald Trump and pretended to participate in a phone call with Barack Obama, conversing in a manner that presented him to be bragging about his primary win in Indiana. In the deepfake, Jimmy Fallon's face was transformed into Donald Trump's face, with the audio remaining the same. This deepfake video was produced by Derpfakes with a comedic intent. In March 2023, a series of images appeared to show New York Police Department officers restraining Trump. The images, created using Midjourney, were initially posted on Twitter by Eliot Higgins but were later re-shared without context, leading some viewers to believe they were real photographs.
Nancy Pelosi
In 2019, a clip from Nancy Pelosi's speech at the Center for American Progress (given on May 22, 2019) was widely distributed on social media; the video had been slowed down and the pitch of the audio altered to make it seem as if she were drunk. Critics argue that this was not a deepfake but a shallowfake, a less sophisticated form of video manipulation.
Mark Zuckerberg
In May 2019, two artists collaborating with the company CannyAI created a deepfake video of Facebook founder Mark Zuckerberg talking about harvesting and controlling data from billions of people. The video was part of an exhibit to educate the public about the dangers of artificial intelligence.
Kim Jong-un and Vladimir Putin
On September 29, 2020, deepfakes of North Korean leader Kim Jong-un and Russian President Vladimir Putin were uploaded to YouTube, created by the nonpartisan advocacy group RepresentUs. The deepfakes of Kim and Putin were meant to air publicly as commercials to relay the notion that interference by these leaders in US elections would be detrimental to the United States' democracy. The commercials also aimed to shock Americans into realizing how fragile democracy is, and how media and news can significantly influence the country's path regardless of credibility. However, while the commercials included an ending comment detailing that the footage was not real, they ultimately did not air due to fears and sensitivity regarding how Americans might react. On June 5, 2023, an unknown source broadcast a reported deepfake of Vladimir Putin on multiple radio and television networks. In the clip, Putin appears to deliver a speech announcing the invasion of Russia and calling for a general mobilization of the army.
Volodymyr Zelenskyy
On March 16, 2022, a one-minute long deepfake video depicting Ukraine's president Volodymyr Zelenskyy seemingly telling his soldiers to lay down their arms and surrender during the 2022 Russian invasion of Ukraine was circulated on social media. Russian social media boosted it, but after it was debunked, Facebook and YouTube removed it. Twitter allowed the video in tweets where it was exposed as a fake, but said it would be taken down if posted to deceive people. Hackers inserted the disinformation into a live scrolling-text news crawl on TV station Ukraine 24, and the video appeared briefly on the station's website in addition to false claims that Zelenskyy had fled his country's capital, Kyiv. It was not immediately clear who created the deepfake, to which Zelenskyy responded with his own video, saying, "We don't plan to lay down any arms. Until our victory."
Wolf News
In late 2022, pro-China propagandists started spreading deepfake videos purporting to be from "Wolf News" that used synthetic actors. The technology was developed by a London company called Synthesia, which markets it as a cheap alternative to live actors for training and HR videos.
Pope Francis
In March 2023, an anonymous construction worker from Chicago used Midjourney to create a fake image of Pope Francis in a white Balenciaga puffer jacket. The image went viral, receiving over twenty million views. Writer Ryan Broderick dubbed it "the first real mass-level AI misinformation case". Experts consulted by Slate characterized the image as unsophisticated: "you could have made it on Photoshop five years ago".
Keir Starmer
In October 2023, a deepfake audio clip purporting to capture UK Labour Party leader Keir Starmer abusing his staffers was released on the first day of a Labour Party conference.
Rashmika Mandanna
In early November 2023, South Indian actor Rashmika Mandanna was targeted by a deepfake when a morphed video of British-Indian influencer Zara Patel, with Mandanna's face superimposed, began circulating on social media. Patel said she was not involved in its creation.
Bongbong Marcos
In April 2024, a deepfake video misrepresenting Philippine President Bongbong Marcos was released. It is a slideshow accompanied by deepfake audio of Marcos purportedly ordering the Armed Forces of the Philippines and a special task force to act "however appropriate" should China attack the Philippines. The video was released amidst tensions related to the South China Sea dispute. The Presidential Communications Office has said that there is no such directive from the president and that a foreign actor might be behind the fabricated media. Criminal charges have been filed by the Kapisanan ng mga Brodkaster ng Pilipinas in relation to the deepfake media. On July 22, 2024, a video of Marcos purportedly snorting illegal drugs was released by Claire Contreras, a former supporter of Marcos. The video was dubbed the polvoron video, and the media noted its consistency with the insinuation by Marcos' predecessor, Rodrigo Duterte, that Marcos is a drug addict; the video was also shown at a Hakbang ng Maisug rally organized by people aligned with Duterte. Two days later, the Philippine National Police and the National Bureau of Investigation, based on their own findings, concluded that the video was created using AI; they further pointed out differences between the person in the video and Marcos, such as details of the two people's ears.
Joe Biden
Prior to the 2024 United States presidential election, phone calls imitating the voice of the incumbent Joe Biden were made to dissuade people from voting for him. The person responsible for the calls was charged with voter suppression and impersonating a candidate. The FCC proposed to fine him US$6 million and Lingo Telecom, the company that allegedly relayed the calls, $2 million.

Responses

Social media platforms

Twitter (later X) is taking active measures to handle synthetic and manipulated media on their platform. In order to prevent disinformation from spreading, Twitter is placing a notice on tweets that contain manipulated media and/or deepfakes that signal to viewers that the media is manipulated. There will also be a warning that appears to users who plan on retweeting, liking, or engaging with the tweet. Twitter will also work to provide users a link next to the tweet containing manipulated or synthetic media that links to a Twitter Moment or credible news article on the related topic—as a debunking action. Twitter also has the ability to remove any tweets containing deepfakes or manipulated media that may pose a harm to users' safety. In order to better improve Twitter's detection of deepfakes and manipulated media, Twitter asked users who are interested in partnering with them to work on deepfake detection solutions to fill out a form.

In August 2024, the secretaries of state of Minnesota, Pennsylvania, Washington, Michigan and New Mexico penned an open letter to X owner Elon Musk urging modifications to the new text-to-video generator added that month to its AI chatbot Grok, stating that it had disseminated election misinformation.

Facebook has taken efforts towards encouraging the creation of deepfakes in order to develop state-of-the-art deepfake detection software. Facebook was the prominent partner in hosting the Deepfake Detection Challenge (DFDC), held in December 2019, which drew 2,114 participants who generated more than 35,000 models. The top-performing models with the highest detection accuracy were analyzed for similarities and differences; these findings are areas of interest in further research to improve and refine deepfake detection models. Facebook has also detailed that the platform will be taking down media generated with artificial intelligence used to alter an individual's speech. However, media that has been edited to alter the order or context of words in one's message would remain on the site but be labeled as false, since it was not generated by artificial intelligence.

On 31 January 2018, Gfycat began removing all deepfakes from its site. On Reddit, the r/deepfakes subreddit was banned on 7 February 2018, due to the policy violation of "involuntary pornography". In the same month, representatives from Twitter stated that they would suspend accounts suspected of posting non-consensual deepfake content. Chat site Discord has taken action against deepfakes in the past, and has taken a general stance against deepfakes. In September 2018, Google added "involuntary synthetic pornographic imagery" to its ban list, allowing anyone to request the block of results showing their fake nudes.

In February 2018, Pornhub said that it would ban deepfake videos on its website because it is considered "non consensual content" which violates their terms of service. They also stated previously to Mashable that they will take down content flagged as deepfakes. Writers from Motherboard reported that searching "deepfakes" on Pornhub still returned multiple recent deepfake videos.

Facebook has previously stated that they would not remove deepfakes from their platforms. The videos will instead be flagged as fake by third parties and then have a lessened priority in users' feeds. This response was prompted in June 2019 after a deepfake featuring a 2016 video of Mark Zuckerberg circulated on Facebook and Instagram.

In May 2022, Google officially changed the terms of service for its Jupyter Notebook service Colab, banning its use for the purpose of creating deepfakes. This came a few days after a VICE article had been published, claiming that "most deepfakes are non-consensual porn" and that the main use of the popular deepfake software DeepFaceLab (DFL), "the most important technology powering the vast majority of this generation of deepfakes", which was often used in combination with Google Colab, was to create non-consensual pornography. The article pointed to the fact that, among many other well-known examples of third-party DFL implementations such as deepfakes commissioned by The Walt Disney Company, official music videos, and the web series Sassy Justice by the creators of South Park, DFL's GitHub page also links to the deepfake porn website Mr.Deepfakes, and that participants of the DFL Discord server also participate on Mr.Deepfakes.

Legislation

In the United States, there have been some responses to the problems posed by deepfakes. In 2018, the Malicious Deep Fake Prohibition Act was introduced to the US Senate; in 2019, the Deepfakes Accountability Act was introduced in the 116th United States Congress by U.S. representative for New York's 9th congressional district Yvette Clarke. Several states have also introduced legislation regarding deepfakes, including Virginia, Texas, California, and New York; charges as varied as identity theft, cyberstalking, and revenge porn have been pursued, while more comprehensive statutes are urged.

Among U.S. legislative efforts, on 3 October 2019, California governor Gavin Newsom signed into law Assembly Bills No. 602 and No. 730. Assembly Bill No. 602 provides individuals targeted by sexually explicit deepfake content made without their consent with a cause of action against the content's creator. Assembly Bill No. 730 prohibits the distribution of malicious deepfake audio or visual media targeting a candidate running for public office within 60 days of their election. U.S. representative Yvette Clarke introduced H.R. 5586: Deepfakes Accountability Act into the 118th United States Congress on September 20, 2023 in an effort to protect national security from threats posed by deepfake technology. U.S. representative María Salazar introduced H.R. 6943: No AI Fraud Act into the 118th United States Congress on January 10, 2024, to establish specific property rights of individual physicality, including voice.

In November 2019, China announced that deepfakes and other synthetically faked footage should bear a clear notice about their fakeness starting in 2020. Failure to comply could be considered a crime, the Cyberspace Administration of China stated on its website. The Chinese government seems to be reserving the right to prosecute both users and online video platforms failing to abide by the rules. The Cyberspace Administration of China, the Ministry of Industry and Information Technology, and the Ministry of Public Security jointly issued the Provision on the Administration of Deep Synthesis Internet Information Service in November 2022. China's updated Deep Synthesis Provisions (Administrative Provisions on Deep Synthesis in Internet-Based Information Services) went into effect in January 2023.

In the United Kingdom, producers of deepfake material could be prosecuted for harassment, but deepfake production was not a specific crime until 2023, when the Online Safety Act was passed, which made deepfakes illegal; the UK plans to expand the Act's scope to criminalize deepfakes created with "intention to cause distress" in 2024.

In Canada, in 2019, the Communications Security Establishment released a report which said that deepfakes could be used to interfere in Canadian politics, particularly to discredit politicians and influence voters. As a result, there are multiple ways for citizens in Canada to deal with deepfakes if they are targeted by them. In February 2024, Bill C-63 was tabled in the 44th Canadian Parliament in order to enact the Online Harms Act, which would amend the Criminal Code and other Acts. An earlier version of the bill, C-36, was ended by the dissolution of the 43rd Canadian Parliament in September 2021.

In India, there are no direct laws or regulation on AI or deepfakes, but there are provisions under the Indian Penal Code and Information Technology Act 2000/2008, which can be looked at for legal remedies, and the new proposed Digital India Act will have a chapter on AI and deepfakes in particular, as per the MoS Rajeev Chandrasekhar.

In Europe, the European Union's 2024 Artificial Intelligence Act (AI Act) takes a risk-based approach to regulating AI systems, including deepfakes. It establishes categories of "unacceptable risk," "high risk," "specific/limited or transparency risk", and "minimal risk" to determine the level of regulatory obligations for AI providers and users. However, the lack of clear definitions for these risk categories in the context of deepfakes creates potential challenges for effective implementation. Legal scholars have raised concerns about the classification of deepfakes intended for political misinformation or the creation of non-consensual intimate imagery. Debate exists over whether such uses should always be considered "high-risk" AI systems, which would lead to stricter regulatory requirements.

In August 2024, the Irish Data Protection Commission (DPC) launched court proceedings against X for its unlawful use of the personal data of over 60 million EU/EEA users, in order to train its AI technologies, such as its chatbot Grok.

Response from DARPA

In 2016, the Defense Advanced Research Projects Agency (DARPA) launched the Media Forensics (MediFor) program, which was funded through 2020. MediFor aimed at automatically spotting digital manipulation in images and videos, including deepfakes. In the summer of 2018, MediFor held an event where individuals competed to create AI-generated videos, audio, and images as well as automated tools to detect these deepfakes. According to the MediFor program, it established a framework of three tiers of information (digital integrity, physical integrity and semantic integrity) to generate one integrity score in an effort to enable accurate detection of manipulated media.

In 2019, DARPA hosted a "proposers day" for the Semantic Forensics (SemaFor) program where researchers were driven to prevent viral spread of AI-manipulated media. DARPA and the Semantic Forensics Program were also working together to detect AI-manipulated media through efforts in training computers to utilize common-sense, logical reasoning. Built on MediFor's technologies, SemaFor's attribution algorithms infer if digital media originates from a particular organization or individual, while characterization algorithms determine whether media was generated or manipulated for malicious purposes. In March 2024, SemaFor published an analytic catalog that offers the public access to open-source resources developed under SemaFor.

International Panel on the Information Environment

The International Panel on the Information Environment was launched in 2023 as a consortium of over 250 scientists working to develop effective countermeasures to deepfakes and other problems created by perverse incentives in organizations disseminating information via the Internet.

In popular culture

  • The 1986 mid-December issue of Analog magazine published the novelette "Picaper" by Jack Wodhams. Its plot revolves around digitally enhanced or digitally generated videos produced by skilled hackers serving unscrupulous lawyers and political figures.
  • The 1987 film The Running Man starring Arnold Schwarzenegger depicts an autocratic government using computers to digitally replace the faces of actors with those of wanted fugitives to make it appear the fugitives had been neutralized.
  • In the 1992 techno-thriller A Philosophical Investigation by Philip Kerr, "Wittgenstein", the main character and a serial killer, makes use of both a software similar to deepfake and a virtual reality suit for having sex with an avatar of Isadora "Jake" Jakowicz, the female police lieutenant assigned to catch him.
  • The 1993 film Rising Sun starring Sean Connery and Wesley Snipes depicts a character, Jingo Asakuma, who reveals that a computer disc has digitally altered personal identities to implicate a competitor.
  • Deepfake technology is part of the plot of the 2019 BBC One TV series The Capture. The first series follows former British Army sergeant Shaun Emery, who is accused of assaulting and abducting his barrister. Expertly doctored CCTV footage is revealed to have framed him and misled the police investigating the case. The second series follows politician Isaac Turner, who discovers that another deepfake is tarnishing his reputation until the "correction" is eventually exposed to the public.
  • In June 2020, YouTube deepfake artist Shamook created a deepfake of the 1994 film Forrest Gump by replacing the face of beloved actor Tom Hanks with John Travolta's. He created this piece using 6,000 high-quality still images of John Travolta's face from several of his films released around the same time as Forrest Gump. Shamook then created a 180-degree facial profile that he fed into a piece of machine learning software (DeepFaceLab), along with Tom Hanks' face from Forrest Gump. The humor and irony of this deepfake trace back to 2007, when John Travolta revealed he had turned down the chance to play the lead role in Forrest Gump because he had said yes to Pulp Fiction instead.
  • Al Davis vs. the NFL: The narrative structure of this 2021 documentary, part of ESPN's 30 for 30 documentary series, uses deepfake versions of the film's two central characters, both deceased—Al Davis, who owned the Las Vegas Raiders during the team's tenure in Oakland and Los Angeles, and Pete Rozelle, the NFL commissioner who frequently clashed with Davis.
  • Deepfake technology is featured in "Impawster Syndrome", the 57th episode of the Canadian police series Hudson & Rex, first broadcast on 6 January 2022, in which a member of the St. John's police team is investigated on suspicion of robbery and assault due to doctored CCTV footage using his likeness.
  • Using deepfake technology in his music video for his 2022 single, "The Heart Part 5", musician Kendrick Lamar transformed into figures resembling Nipsey Hussle, O.J. Simpson, and Kanye West, among others. The deepfake technology in the video was created by Deep Voodoo, a studio led by Trey Parker and Matt Stone, who created South Park.
  • Aloe Blacc honored his long-time collaborator Avicii four years after his death by performing their song "Wake Me Up" in English, Spanish, and Mandarin, using deepfake technologies.
  • In January 2023, ITVX released the series Deep Fake Neighbour Wars, in which various celebrities were played by actors experiencing inane conflicts, the celebrity's face deepfaked onto them.
  • In October 2023, Tom Hanks shared a photo of an apparent deepfake likeness depicting him promoting "some dental plan" to his Instagram page. Hanks warned his fans, "BEWARE . . . I have nothing to do with it."

Small-world experiment

From Wikipedia, the free encyclopedia
Milgram concluded from his small-world experiments that any two random people in the United States would be linked by a chain of (on average) six steps.

The small-world experiment comprised several experiments conducted by Stanley Milgram and other researchers examining the average path length for social networks of people in the United States. The research was groundbreaking in that it suggested that human society is a small-world-type network characterized by short path-lengths. The experiments are often associated with the phrase "six degrees of separation", although Milgram did not use this term himself.

Historical context of the small-world problem

Guglielmo Marconi's conjectures based on his radio work in the early 20th century, which were articulated in his 1909 Nobel Prize address, may have inspired Hungarian author Frigyes Karinthy to write a challenge to find another person to whom he could not be connected through at most five people. This is perhaps the earliest reference to the concept of six degrees of separation, and the search for an answer to the small world problem.

Mathematician Manfred Kochen and political scientist Ithiel de Sola Pool wrote a mathematical manuscript, "Contacts and Influences", while working at the University of Paris in the early 1950s, during a time when Milgram visited and collaborated in their research. Their unpublished manuscript circulated among academics for over 20 years before publication in 1978. It formally articulated the mechanics of social networks, and explored the mathematical consequences of these (including the degree of connectedness). The manuscript left many significant questions about networks unresolved, and one of these was the number of degrees of separation in actual social networks.

Milgram took up the challenge on his return from Paris, leading to the experiments reported in "The Small World Problem" in the May 1967 (charter) issue of the popular magazine Psychology Today, with a more rigorous version of the paper appearing in Sociometry two years later. The Psychology Today article generated enormous publicity for the experiments, which are well known today, long after much of the formative work has been forgotten.

Milgram's experiment was conceived in an era when a number of independent threads were converging on the idea that the world is becoming increasingly interconnected. Michael Gurevich had conducted seminal work in his empirical study of the structure of social networks in his MIT doctoral dissertation under Pool. Mathematician Manfred Kochen, an Austrian who had been involved in statist urban design, extrapolated these empirical results in a mathematical manuscript, Contacts and Influences, concluding that, in an American-sized population without social structure, "it is practically certain that any two individuals can contact one another by means of at least two intermediaries. In a [socially] structured population it is less likely but still seems probable. And perhaps for the whole world's population, probably only one more bridging individual should be needed." They subsequently constructed Monte Carlo simulations based on Gurevich's data, which recognized that both weak and strong acquaintance links are needed to model social structure. The simulations, running on the slower computers of 1973, were limited, but still were able to predict that a more realistic three degrees of separation existed across the U.S. population, a value that foreshadowed the findings of Milgram.

Milgram revisited Gurevich's experiments in acquaintanceship networks when he conducted a highly publicized set of experiments beginning in 1967 at Harvard University. One of Milgram's most famous works is a study of obedience and authority, which is widely known as the Milgram Experiment. Milgram's earlier association with Pool and Kochen was the likely source of his interest in the increasing interconnectedness among human beings. Gurevich's interviews served as a basis for his small world experiments.

Milgram sought to develop an experiment that could answer the small world problem. This was the same phenomenon articulated by the writer Frigyes Karinthy in the 1920s while documenting a widely circulated belief in Budapest that individuals were separated by six degrees of social contact. This observation, in turn, was loosely based on the seminal demographic work of the Statists who were so influential in the design of Eastern European cities during that period. Mathematician Benoit Mandelbrot, born in Poland and having traveled extensively in Eastern Europe, was aware of the Statist rules of thumb, and was also a colleague of Pool, Kochen and Milgram at the University of Paris during the early 1950s (Kochen brought Mandelbrot to work at the Institute for Advanced Study and later IBM in the U.S.). This circle of researchers was fascinated by the interconnectedness and "social capital" of social networks.

Milgram's study results showed that people in the United States seemed to be connected by approximately three friendship links, on average, without speculating on global linkages; he never actually used the phrase "six degrees of separation". Since the Psychology Today article gave the experiments wide publicity, Milgram, Kochen, and Karinthy all had been incorrectly attributed as the origin of the notion of "six degrees"; the most likely popularizer of the phrase "six degrees of separation" is John Guare, who attributed the value "six" to Marconi.

The experiment

Milgram's experiment developed out of a desire to learn more about the probability that two randomly selected people would know each other. This is one way of looking at the small world problem. An alternative view of the problem is to imagine the population as a social network and attempt to find the average path length between any two nodes. Milgram's experiment was designed to measure these path lengths by developing a procedure to count the number of ties between any two people.

Basic procedure

One possible path of a message in the "Small World" experiment by Stanley Milgram
  1. Though the experiment went through several variations, Milgram typically chose individuals in the U.S. cities of Omaha, Nebraska, and Wichita, Kansas, to be the starting points and Boston, Massachusetts, to be the end point of a chain of correspondence. These cities were selected because they were thought to represent a great distance in the United States, both socially and geographically.
  2. Information packets were initially sent to "randomly" selected individuals in Omaha or Wichita. They included letters, which detailed the study's purpose, and basic information about a target contact person in Boston. Each packet additionally contained a roster on which recipients could write their own name, as well as business reply cards that were pre-addressed to Harvard.
  3. Upon receiving the invitation to participate, the recipient was asked whether he or she personally knew the contact person described in the letter. If so, the person was to forward the letter directly to that person. For the purposes of this study, knowing someone "personally" was defined as knowing them on a first-name basis.
  4. In the more likely case that the person did not personally know the target, then the person was to think of a friend or relative who was more likely to know the target. They were then directed to sign their name on the roster and forward the packet to that person. A postcard was also mailed to the researchers at Harvard so that they could track the chain's progression toward the target.
  5. When and if the package eventually reached the contact person in Boston, the researchers could examine the roster to count the number of times it had been forwarded from person to person. Additionally, for packages that never reached the destination, the incoming postcards helped identify the break point in the chain.

Results

Shortly after the experiments began, letters would begin arriving at the targets and the researchers would receive postcards from the respondents. Sometimes the packet would reach the target in as few as one or two hops, while some chains were composed of as many as nine or ten links. However, a significant problem was that people often refused to pass the letter forward, and thus the chain never reached its destination. In one case, 232 of the 296 letters never reached the destination.

However, 64 of the letters eventually did reach the target contact. Among these chains, the average path length fell around five and a half or six. Hence, the researchers concluded that people in the United States are separated by about six people on average. Although Milgram himself never used the phrase "six degrees of separation", these findings are likely to have contributed to its widespread acceptance.

In an experiment in which 160 letters were mailed out, 24 reached the target in his home in Sharon, Massachusetts. Of those 24 letters, 16 were given to the target by the same person, a clothing merchant Milgram called "Mr. Jacobs". Of those that reached the target at his office, more than half came from two other men.

The researchers used the postcards to qualitatively examine the types of chains that are created. Generally, the package quickly reached a close geographic proximity, but would circle the target almost randomly until it found the target's inner circle of friends. This suggests that participants strongly favored geographic characteristics when choosing an appropriate next person in the chain.

Criticisms

There are a number of methodological criticisms of the small-world experiment, which suggest that the average path length might actually be smaller or larger than Milgram expected. Four such criticisms are summarized here:

  1. Judith Kleinfeld argues that Milgram's study suffers from selection and non-response bias due to the way participants were recruited and high non-completion rates. First, the "starters" were not chosen at random, as they were recruited through an advertisement that specifically sought people who considered themselves well-connected. Another problem has to do with the attrition rate. If one assumes a constant portion of non-response for each person in the chain, longer chains will be under-represented because it is more likely that they will encounter an unwilling participant. Hence, Milgram's experiment should underestimate the true average path length. Several methods have been suggested to correct these estimates; one uses a variant of survival analysis in order to account for the length information of interrupted chains, and thus reduce the bias in the estimation of average degrees of separation.
  2. One of the key features of Milgram's methodology is that participants are asked to choose the person they know who is most likely to know the target individual. But in many cases, the participant may be unsure which of their friends is the most likely to know the target. Thus, since the participants of the Milgram experiment do not have a topological map of the social network, they might actually be sending the package further away from the target rather than sending it along the shortest path. This is very likely to increase route length, overestimating the average number of ties needed to connect two random people. An omniscient path-planner, having access to the complete social graph of the country, would be able to choose a shortest path that is, in general, shorter than the path produced by a greedy algorithm that makes local decisions only.
  3. A description of heterogeneous social networks still remains an open question. Though little research was done for a number of years, in 1998 Duncan Watts and Steven Strogatz published a breakthrough paper in the journal Nature. Mark Buchanan said, "Their paper touched off a storm of further work across many fields of science" (Nexus, p. 60, 2002). See Watts' book on the topic: Six Degrees: The Science of a Connected Age.
  4. Some communities, such as the Sentinelese, are completely isolated, disrupting the otherwise global chains. Once these people are discovered, they remain more "distant" from the vast majority of the world, as they have few economic, familial, or social contacts with the world at large; before they are discovered, they are not within any degree of separation from the rest of the population. However, these populations are invariably tiny, rendering them of low statistical relevance.
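The attrition argument in the first criticism can be illustrated with a small calculation. The sketch below assumes a constant, hypothetical probability that each participant forwards the packet, and a made-up distribution of true chain lengths; neither is taken from Milgram's data. Under these assumptions a chain of length k completes with probability forward_prob ** k, so long chains are under-represented among completed ones.

```python
def observed_mean_length(true_dist, forward_prob):
    """Mean length of completed chains under constant per-hop attrition.

    true_dist maps chain length -> share of chains with that true length.
    forward_prob is the (assumed constant) chance that a participant forwards.
    """
    # A chain needing k hops survives only if all k participants forward it.
    weights = {k: share * forward_prob ** k for k, share in true_dist.items()}
    total = sum(weights.values())
    return sum(k * w for k, w in weights.items()) / total


# Hypothetical distribution of true chain lengths (illustrative only).
true_dist = {4: 0.2, 6: 0.3, 8: 0.3, 10: 0.2}
true_mean = sum(k * share for k, share in true_dist.items())
biased_mean = observed_mean_length(true_dist, forward_prob=0.75)

print(f"true mean length:     {true_mean:.2f}")
print(f"observed mean length: {biased_mean:.2f}")
```

With any forwarding probability below 1, the observed mean falls below the true mean, which is the direction of bias the criticism describes.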

In addition to these methodological criticisms, conceptual issues are debated. One regards the social relevance of indirect contact chains of different degrees of separation. Much formal and empirical work focuses on diffusion processes, but the literature on the small-world problem also often illustrates the relevance of the research using an example (similar to Milgram's experiment) of a targeted search in which a starting person tries to obtain some kind of resource (e.g., information) from a target person, using a number of intermediaries to reach that target person. However, there is little empirical research showing that indirect channels with a length of about six degrees of separation are actually used for such directed search, or that such search processes are more efficient compared to other means (e.g., finding information in a directory).

Influence

The social sciences

The Tipping Point by Malcolm Gladwell, based on articles originally published in The New Yorker, elaborates on the "funneling" concept. Gladwell condenses sociological research, which argues that the six-degrees phenomenon is dependent on a few extraordinary people ("connectors") with large networks of contacts and friends: these hubs then mediate the connections between the vast majority of otherwise weakly connected individuals.

Recent work on the effects of the small world phenomenon on disease transmission, however, has indicated that due to the strongly connected nature of social networks as a whole, removing these hubs from a population usually has little effect on the average path length through the graph (Barrett et al., 2005).

Mathematicians and actors

Smaller communities, such as mathematicians and actors, have been found to be densely connected by chains of personal or professional associations. Mathematicians have created the Erdős number to describe their distance from Paul Erdős based on shared publications. A similar exercise has been carried out for the actor Kevin Bacon and other actors who appeared in movies together with him — the latter effort informing the game "Six Degrees of Kevin Bacon". There is also the combined Erdős-Bacon number, for actor-mathematicians and mathematician-actors. Players of the popular Asian game Go describe their distance from the great player Honinbo Shusaku by counting their Shusaku number, which counts degrees of separation through the games the players have had.

Current research on the small-world problem

The small-world question is still a popular research topic today, with many experiments still being conducted. For instance, Peter Dodds, Roby Muhamad, and Duncan Watts conducted the first large-scale replication of Milgram's experiment, involving 24,163 e-mail chains and 18 targets around the world.

Dodds et al. also found that the mean chain length was roughly six, even after accounting for attrition. A similar experiment using popular social networking sites as a medium was carried out at Carnegie Mellon University. Results showed that very few messages actually reached their destination. However, the critiques that apply to Milgram's experiment largely apply also to this current research.

Network models

Comparison of Watts-Strogatz graphs with different randomization probability: a regular ring graph (left, p = 0), a small-world graph with some edges randomly rewired (center, p = 0.2), and a random graph with all edges randomly rewired (right, p = 1).

In 1998, Duncan J. Watts and Steven Strogatz from Cornell University published the first network model on the small-world phenomenon. They showed that networks from both the natural and man-made world, such as power grids and the neural network of C. elegans, exhibit the small-world phenomenon. Watts and Strogatz showed that, beginning with a regular lattice, the addition of a small number of random links reduces the diameter (the longest of the shortest paths between any two vertices in the network) from being very long to being very short. The research was originally inspired by Watts' efforts to understand the synchronization of cricket chirps, which show a high degree of coordination over long ranges as though the insects are being guided by an invisible conductor. The mathematical model which Watts and Strogatz developed to explain this phenomenon has since been applied in a wide range of different areas. In Watts' words:

I think I've been contacted by someone from just about every field outside of English literature. I've had letters from mathematicians, physicists, biochemists, neurophysiologists, epidemiologists, economists, sociologists; from people in marketing, information systems, civil engineering, and from a business enterprise that uses the concept of the small world for networking purposes on the Internet.

Generally, their model demonstrated the truth in Mark Granovetter's observation that it is "the strength of weak ties" that holds together a social network. Although the specific model has since been generalized by Jon Kleinberg, it remains a canonical case study in the field of complex networks. In network theory, the idea presented in the small-world network model has been explored quite extensively. Indeed, several classic results in random graph theory show that even networks with no real topological structure exhibit the small-world phenomenon, which mathematically is expressed as the diameter of the network growing with the logarithm of the number of nodes (rather than proportional to the number of nodes, as in the case for a lattice). This result similarly maps onto networks with a power-law degree distribution, such as scale-free networks.
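The effect of a few random links on path length can be reproduced with a short simulation. The following is a minimal sketch of a Watts-Strogatz style rewiring procedure, not the authors' original code; the network size (100 nodes), neighbourhood size (4), rewiring probabilities and random seed are all illustrative assumptions. It reports the average shortest path length over reachable pairs rather than the diameter.

```python
import random
from collections import deque


def watts_strogatz(n, k, p, seed=0):
    """Ring of n nodes, each tied to its k nearest neighbours; every
    lattice edge is rewired to a random endpoint with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if w in adj[v] and rng.random() < p:
                # New endpoint must not be v itself or an existing neighbour.
                choices = [u for u in range(n) if u != v and u not in adj[v]]
                if choices:
                    new_w = rng.choice(choices)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new_w)
                    adj[new_w].add(v)
    return adj


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


regular = average_path_length(watts_strogatz(100, 4, 0.0))
small_world = average_path_length(watts_strogatz(100, 4, 0.2))
print(f"p = 0.0: {regular:.2f}")
print(f"p = 0.2: {small_world:.2f}")
```

In this sketch the regular ring has an average path length of roughly 12.9, and rewiring a fifth of the edges shortens it considerably; the exact value for p = 0.2 depends on the seed.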

In computer science, the small-world phenomenon (although it is not typically called that) is used in the development of secure peer-to-peer protocols, novel routing algorithms for the Internet and ad hoc wireless networks, and search algorithms for communication networks of all kinds.

Social networks pervade popular culture in the United States and elsewhere. In particular, the notion of six degrees has become part of the collective consciousness. Social networking services such as Facebook, LinkedIn, and Instagram have greatly increased the connectivity of the online space through the application of social networking concepts.

Social network analysis

From Wikipedia, the free encyclopedia
A social network diagram displaying friendship ties among a set of Facebook users.

Social network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links (relationships or interactions) that connect them. Examples of social structures commonly visualized through social network analysis include social media networks, meme proliferation, information circulation, friendship and acquaintance networks, business networks, knowledge networks, difficult working relationships, collaboration graphs, kinship, disease transmission, and sexual relationships. These networks are often visualized through sociograms in which nodes are represented as points and ties are represented as lines. These visualizations provide a means of qualitatively assessing networks by varying the visual representation of their nodes and edges to reflect attributes of interest.

Social network analysis has emerged as a key technique in modern sociology. It has also gained significant popularity in the following: anthropology, biology, demography, communication studies, economics, geography, history, information science, organizational studies, physics, political science, public health, social psychology, development studies, sociolinguistics, computer science, education, and distance education research. It is now commonly available as a consumer tool (see the list of SNA software).

History

Social network analysis has its theoretical roots in the work of early sociologists such as Georg Simmel and Émile Durkheim, who wrote about the importance of studying patterns of relationships that connect social actors. Social scientists have used the concept of "social networks" since early in the 20th century to connote complex sets of relationships between members of social systems at all scales, from interpersonal to international.

In the 1930s Jacob Moreno and Helen Jennings introduced basic analytical methods. In 1954, John Arundel Barnes started using the term systematically to denote patterns of ties, encompassing concepts traditionally used by the public and those used by social scientists: bounded groups (e.g., tribes, families) and social categories (e.g., gender, ethnicity).

Starting in the 1970s, scholars such as Ronald Burt, Kathleen Carley, Mark Granovetter, David Krackhardt, Edward Laumann, Anatol Rapoport, Barry Wellman, Douglas R. White, and Harrison White expanded the use of systematic social network analysis.

Beginning in the late 1990s, social network analysis experienced a further resurgence with work by sociologists, political scientists, economists, computer scientists, and physicists such as Duncan J. Watts, Albert-László Barabási, Peter Bearman, Nicholas A. Christakis, James H. Fowler, Mark Newman, Matthew Jackson, Jon Kleinberg, and others, developing and applying new models and methods, prompted in part by the emergence of new data available about online social networks as well as "digital traces" regarding face-to-face networks.

Computational SNA has been extensively used in research on study-abroad second language acquisition. Even in the study of literature, network analysis has been applied by Anheier, Gerhards and Romo, Wouter De Nooy, and Burgert Senekal. Indeed, social network analysis has found applications in various academic disciplines as well as practical contexts such as countering money laundering and terrorism.

Metrics

Hue (from red=min to blue=max) indicates each node's betweenness centrality.

Size: The number of network members in a given network.

Connections

Homophily: The extent to which actors form ties with similar versus dissimilar others. Similarity can be defined by gender, race, age, occupation, educational achievement, status, values or any other salient characteristic. Homophily is also referred to as assortativity.

Multiplexity: The number of content-forms contained in a tie. For example, two people who are friends and also work together would have a multiplexity of 2. Multiplexity has been associated with relationship strength and can also comprise overlap of positive and negative network ties.

Mutuality/Reciprocity: The extent to which two actors reciprocate each other's friendship or other interaction.
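One common way to quantify reciprocity in a directed network is the share of ties that are returned. The sketch below uses that definition as an assumption (other definitions exist), with hypothetical actor names.

```python
def reciprocity(ties):
    """Share of directed ties (a, b) whose reverse tie (b, a) is also present."""
    ties = set(ties)
    if not ties:
        return 0.0
    mutual = sum(1 for a, b in ties if (b, a) in ties)
    return mutual / len(ties)


# Hypothetical friendship nominations: (nominator, nominee).
ties = [("Ann", "Bob"), ("Bob", "Ann"), ("Ann", "Cat"), ("Cat", "Dan")]
print(reciprocity(ties))  # 2 of the 4 ties are returned
```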

Network Closure: A measure of the completeness of relational triads. An individual's assumption of network closure (i.e. that their friends are also friends) is called transitivity. Transitivity is an outcome of the individual or situational trait of Need for Cognitive Closure.

Propinquity: The tendency for actors to have more ties with geographically close others.

Distributions

Bridge: An individual whose weak ties fill a structural hole, providing the only link between two individuals or clusters. It also includes the shortest route when a longer one is unfeasible due to a high risk of message distortion or delivery failure.

Centrality: Centrality refers to a group of metrics that aim to quantify the "importance" or "influence" (in a variety of senses) of a particular node (or group) within a network. Examples of common methods of measuring "centrality" include betweenness centrality, closeness centrality, eigenvector centrality, alpha centrality, and degree centrality.
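Two of the simpler measures named above can be sketched directly. The example assumes an undirected, connected network stored as a dictionary of neighbour sets, normalizes degree centrality by the number of other nodes, and takes closeness as the inverse of the mean distance to all other nodes; the star-shaped network is hypothetical.

```python
from collections import deque


def degree_centrality(adj, node):
    """Number of ties of `node`, divided by the number of other nodes."""
    return len(adj[node]) / (len(adj) - 1)


def closeness_centrality(adj, node):
    """Inverse of the mean shortest-path distance from `node` to all others.
    Assumes the network is connected."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return (len(adj) - 1) / sum(dist.values())


# Hypothetical star network: one hub tied to three otherwise unconnected actors.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
for actor in ("hub", "a"):
    print(actor, degree_centrality(star, actor), closeness_centrality(star, actor))
```

The hub scores 1.0 on both measures, while each peripheral actor has a lower degree and closeness, matching the intuition that the hub is the most "important" node in a star.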

Density: The proportion of direct ties in a network relative to the total number possible.
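For an undirected network with n members, the total number of possible ties is n(n - 1)/2, so density can be computed as in the sketch below. The member names and ties are hypothetical.

```python
def density(nodes, ties):
    """Observed undirected ties divided by the n * (n - 1) / 2 possible ties."""
    n = len(nodes)
    possible = n * (n - 1) / 2
    return len(ties) / possible if possible else 0.0


nodes = ["Ann", "Bob", "Cat", "Dan"]
# Each tie is stored as an unordered pair so that (a, b) and (b, a) coincide.
ties = {frozenset(pair) for pair in [("Ann", "Bob"), ("Bob", "Cat"), ("Cat", "Dan")]}
print(density(nodes, ties))  # 3 of 6 possible ties
```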

Distance: The minimum number of ties required to connect two particular actors, as popularized by Stanley Milgram's small world experiment and the idea of 'six degrees of separation'.
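Distance in this sense is the length of a shortest path, which can be found with a breadth-first search. The following sketch assumes an undirected network stored as a dictionary of neighbour sets; the actors are hypothetical.

```python
from collections import deque


def distance(adj, start, goal):
    """Minimum number of ties between start and goal, or None if unconnected."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            return dist[v]
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


# Hypothetical chain of acquaintances A-B-C-D, plus an isolated actor E.
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}, "E": set()}
print(distance(adj, "A", "D"))
print(distance(adj, "A", "E"))
```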

Structural holes: The absence of ties between two parts of a network. Finding and exploiting a structural hole can give an entrepreneur a competitive advantage. This concept was developed by sociologist Ronald Burt, and is sometimes referred to as an alternate conception of social capital.

Tie Strength: Defined by the linear combination of time, emotional intensity, intimacy and reciprocity (i.e. mutuality). Strong ties are associated with homophily, propinquity and transitivity, while weak ties are associated with bridges.

Segmentation

Groups are identified as 'cliques' if every individual is directly tied to every other individual, 'social circles' if there is less stringency of direct contact, which is imprecise, or as structurally cohesive blocks if precision is wanted.

Clustering coefficient: A measure of the likelihood that two associates of a node are themselves associates of each other. A higher clustering coefficient indicates a greater 'cliquishness'.
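For a single node, the clustering coefficient can be computed as the share of pairs of its associates that are directly tied. The sketch below shows this local version for an undirected network; the four-actor network is hypothetical.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Share of pairs of `node`'s neighbours that are tied to each other."""
    neighbours = adj[node]
    if len(neighbours) < 2:
        return 0.0  # no pairs of neighbours to compare
    pairs = list(combinations(neighbours, 2))
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)


# Hypothetical network: A knows B, C and D; of these, only B and C know each other.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
print(local_clustering(adj, "A"))  # 1 of 3 neighbour pairs is tied
print(local_clustering(adj, "B"))  # B's two neighbours are tied
```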

Cohesion: The degree to which actors are connected directly to each other by cohesive bonds. Structural cohesion refers to the minimum number of members who, if removed from a group, would disconnect the group.

Modelling and visualization of networks

Different characteristics of social networks. A, B, and C show varying centrality and density of networks; panel D shows network closure, i.e., when two actors, tied to a common third actor, tend to also form a direct tie between them. Panel E represents two actors with different attributes (e.g., organizational affiliation, beliefs, gender, education) who tend to form ties. Panel F consists of two types of ties: friendship (solid line) and dislike (dashed line). In this case, two actors being friends both dislike a common third (or, similarly, two actors that dislike a common third tend to be friends).

Visual representation of social networks is important to understand the network data and convey the result of the analysis. Numerous methods of visualization for data produced by social network analysis have been presented. Many analytic software packages have modules for network visualization. The data is explored by displaying nodes and ties in various layouts and attributing colors, size, and other advanced properties to nodes. Visual representations of networks may be a powerful method for conveying complex information. Still, care should be taken in interpreting node and graph properties from visual displays alone, as they may misrepresent structural properties better captured through quantitative analyses.

Signed graphs can be used to illustrate good and bad relationships between humans. A positive edge between two nodes denotes a positive relationship (friendship, alliance, dating), and a negative edge denotes a negative relationship (hatred, anger). Signed social network graphs can be used to predict the future evolution of the graph. In signed social networks, there is the concept of "balanced" and "unbalanced" cycles. A balanced cycle is defined as a cycle where the product of all the signs is positive. According to balance theory, balanced graphs represent a group of people who are unlikely to change their opinions of the other people in the group. Unbalanced graphs represent a group of people who are very likely to change their opinions of the people in their group. For example, a group of 3 people (A, B, and C) where A and B have a positive relationship and B and C have a positive relationship, but C and A have a negative relationship, forms an unbalanced cycle. This group is very likely to morph into a balanced cycle, such as one where B only has a good relationship with A, and both A and B have a negative relationship with C. By using the concept of balanced and unbalanced cycles, the evolution of signed social network graphs can be predicted.
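The balance rule reduces to multiplying the signs around a cycle. As a minimal sketch, assuming each relationship is encoded as +1 (positive) or -1 (negative), the three-person example can be checked as follows:

```python
from math import prod


def is_balanced(signs):
    """A cycle is balanced when the product of its edge signs is positive."""
    return prod(signs) > 0


# Edge signs in the order A-B, B-C, C-A.
before = [+1, +1, -1]  # A and B friendly, B and C friendly, C and A hostile
after = [+1, -1, -1]   # A and B friendly, both hostile towards C

print(is_balanced(before))
print(is_balanced(after))
```

The first cycle is unbalanced and the second is balanced, in line with the predicted shift described in the text.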

Different approaches to participatory network mapping have proven useful, especially when using social network analysis as a tool for facilitating change. Here, participants/interviewers provide network data by mapping the network (with pen and paper or digitally) during the data collection session. An example of a pen-and-paper network mapping approach, which also includes the collection of some actor attributes (perceived influence and goals of actors) is the Net-map toolbox. One benefit of this approach is that it allows researchers to collect qualitative data and ask clarifying questions while the network data is collected.

Social networking potential

Social Networking Potential (SNP) is a numeric coefficient, derived through algorithms to represent both the size of an individual's social network and their ability to influence that network. SNP coefficients were first defined and used by Bob Gerstley in 2002. A closely related term is Alpha User, defined as a person with a high SNP.

SNP coefficients have two primary functions:

  1. The classification of individuals based on their social networking potential, and
  2. The weighting of respondents in quantitative marketing research studies.

By calculating the SNP of respondents and by targeting High SNP respondents, the strength and relevance of quantitative marketing research used to drive viral marketing strategies is enhanced.

Variables used to calculate an individual's SNP include but are not limited to: participation in Social Networking activities, group memberships, leadership roles, recognition, publication/editing/contributing to non-electronic media, publication/editing/contributing to electronic media (websites, blogs), and frequency of past distribution of information within their network. The acronym "SNP" and some of the first algorithms developed to quantify an individual's social networking potential were described in the white paper "Advertising Research is Changing" (Gerstley, 2003). See Viral Marketing.

The first book to discuss the commercial use of Alpha Users among mobile telecoms audiences was 3G Marketing by Ahonen, Kasper and Melkko in 2004. The first book to discuss Alpha Users more generally in the context of social marketing intelligence was Communities Dominate Brands by Ahonen & Moore in 2005. In 2012, Nicola Greco (UCL) presented Social Networking Potential at TEDx as a parallel to the potential energy that users generate and companies should use, stating that "SNP is the new asset that every company should aim to have".

Practical applications

Social network analysis is used extensively in a wide range of applications and disciplines. Some common network analysis applications include data aggregation and mining, network propagation modeling, network modeling and sampling, user attribute and behavior analysis, community-maintained resource support, location-based interaction analysis, social sharing and filtering, recommender systems development, and link prediction and entity resolution. In the private sector, businesses use social network analysis to support activities such as customer interaction and analysis, information system development analysis, marketing, and business intelligence needs (see social media analytics). Some public sector uses include development of leader engagement strategies, analysis of individual and group engagement and media use, and community-based problem solving.

Longitudinal SNA in schools

Large numbers of researchers worldwide examine the social networks of children and adolescents. In questionnaires, they list all classmates, students in the same grade, or schoolmates, asking: "Who are your best friends?". Students may sometimes nominate as many peers as they wish; other times, the number of nominations is limited. Social network researchers have investigated similarities in friendship networks. The similarity between friends was established as far back as classical antiquity. Resemblance is an important basis for the survival of friendships. Similarity in characteristics, attitudes, or behaviors means that friends understand each other more quickly, have common interests to talk about, know better where they stand with each other, and have more trust in each other. As a result, such relationships are more stable and valuable. Moreover, looking more alike makes young people more confident and strengthens them in developing their identity. Similarity in behavior can result from two processes: selection and influence. These two processes can be distinguished using longitudinal social network analysis in the R package SIENA (Simulation Investigation for Empirical Network Analyses), developed by Tom Snijders and colleagues. Longitudinal social network analysis became mainstream after the publication of a special issue of the Journal of Research on Adolescence in 2013, edited by René Veenstra and containing 15 empirical papers.

Security applications

Social network analysis is also used in intelligence, counter-intelligence and law enforcement activities. This technique allows the analysts to map covert organizations such as an espionage ring, an organized crime family or a street gang. The National Security Agency (NSA) uses its electronic surveillance programs to generate the data needed to perform this type of analysis on terrorist cells and other networks deemed relevant to national security. The NSA looks up to three nodes deep during this network analysis. After the initial mapping of the social network is complete, analysis is performed to determine the structure of the network and determine, for example, the leaders within the network. This allows military or law enforcement assets to launch capture-or-kill decapitation attacks on the high-value targets in leadership positions to disrupt the functioning of the network. The NSA has been performing social network analysis on call detail records (CDRs), also known as metadata, since shortly after the September 11 attacks.

Textual analysis applications

Large textual corpora can be turned into networks and then analyzed using social network analysis. In these networks, the nodes are Social Actors, and the links are Actions. The extraction of these networks can be automated by using parsers. The resulting networks, which can contain thousands of nodes, are then analyzed using tools from network theory to identify the key actors, the key communities or parties, and general properties such as the robustness or structural stability of the overall network or the centrality of certain nodes. This automates the approach introduced by Quantitative Narrative Analysis, whereby subject-verb-object triplets are identified with pairs of actors linked by an action, or pairs formed by actor-object.

Narrative network of US Elections 2012

In other approaches, textual analysis is carried out considering the network of words co-occurring in a text. In these networks, nodes are words and links among them are weighted based on their frequency of co-occurrence (within a specific maximum range).
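A word co-occurrence network of this kind can be built by sliding a window over the tokens of a text. The following is a minimal sketch: the whitespace tokenization, the window size of two following words, and the sample sentence are all illustrative assumptions.

```python
from collections import Counter


def cooccurrence_network(tokens, window=2):
    """Weight each pair of distinct words by how often they appear
    within `window` positions of each other."""
    weights = Counter()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                # Unordered pair: the link between two words has no direction.
                weights[frozenset((word, other))] += 1
    return weights


tokens = "voters elect leaders voters trust leaders".split()
weights = cooccurrence_network(tokens)
for pair, weight in weights.most_common():
    print(sorted(pair), weight)
```

Here the nodes are the distinct words and each counter entry is a weighted link; the heaviest link in this toy text joins "voters" and "leaders".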

Internet applications

Social network analysis has also been applied to understanding online behavior by individuals, organizations, and between websites. Hyperlink analysis can be used to analyze the connections between websites or webpages to examine how information flows as individuals navigate the web. Connections between organizations have also been analyzed via hyperlink analysis to examine organizations within an issue community.

Netocracy

Another concept that has emerged from this connection between social network theory and the Internet is netocracy. Several authors have studied the correlation between the extended use of online social networks and changes in social power dynamics.

Social media internet applications

Social network analysis has been applied to social media as a tool to understand behavior between individuals or organizations through their linkages on social media websites such as Twitter and Facebook.

In computer-supported collaborative learning

One of the more recent applications of SNA is the study of computer-supported collaborative learning (CSCL). When applied to CSCL, SNA is used to help understand how learners collaborate in terms of amount, frequency, and length, as well as the quality, topic, and strategies of communication. Additionally, SNA can focus on specific aspects of the network connection, or the entire network as a whole. It uses graphical representations, written representations, and data representations to help examine the connections within a CSCL network. When applying SNA to a CSCL environment the interactions of the participants are treated as a social network. The focus of the analysis is on the "connections" made among the participants – how they interact and communicate – as opposed to how each participant behaved on his or her own.

Key terms

There are several key terms associated with social network analysis research in computer-supported collaborative learning such as: density, centrality, indegree, outdegree, and sociogram.

  • Density refers to the "connections" between participants. Density is defined as the number of connections a participant has, divided by the total possible connections a participant could have. For example, if there are 20 people participating, each person could potentially connect to 19 other people. A density of 100% (19/19) is the greatest density in the system. A density of about 5% (1/19) indicates that only 1 of the 19 possible connections exists.
  • Centrality focuses on the behavior of individual participants within a network. It measures the extent to which an individual interacts with other individuals in the network. The more an individual connects to others in a network, the greater their centrality in the network.

In-degree and out-degree variables are related to centrality.

  • In-degree centrality concentrates on a specific individual as the point of focus; centrality of all other individuals is based on their relation to the focal point of the "in-degree" individual.
  • Out-degree is a measure of centrality that still focuses on a single individual, but the analytic is concerned with the out-going interactions of the individual; the measure of out-degree centrality is how many times the focus point individual interacts with others.
  • A sociogram is a visualization with defined boundaries of connections in the network. For example, a sociogram which shows out-degree centrality points for Participant A would illustrate all outgoing connections Participant A made in the studied network.
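The key terms above can be illustrated with a short Python sketch. The interaction list and participant names are hypothetical, density follows the per-participant definition given above, and in-degree and out-degree are taken here as simple counts of incoming and outgoing interactions, which is an assumption about how a given study would operationalize them.

```python
# Hypothetical directed interactions in a CSCL group: (sender, receiver).
interactions = [
    ("A", "B"), ("A", "C"), ("B", "A"), ("C", "A"), ("D", "A"),
]
participants = {"A", "B", "C", "D"}


def out_degree(person):
    """Number of interactions the person initiated."""
    return sum(1 for src, _ in interactions if src == person)


def in_degree(person):
    """Number of interactions directed at the person."""
    return sum(1 for _, dst in interactions if dst == person)


def participant_density(person):
    """Connections the person has, divided by possible connections (n - 1)."""
    neighbours = {dst for src, dst in interactions if src == person}
    neighbours |= {src for src, dst in interactions if dst == person}
    return len(neighbours) / (len(participants) - 1)


for person in sorted(participants):
    print(person, in_degree(person), out_degree(person),
          round(participant_density(person), 2))
```

The per-participant rows printed by the loop are the data a sociogram would draw as nodes and arrows.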

Unique capabilities

Researchers employ social network analysis in the study of computer-supported collaborative learning in part due to the unique capabilities it offers. This particular method allows the study of interaction patterns within a networked learning community and can help illustrate the extent of the participants' interactions with the other members of the group. The graphics created using SNA tools provide visualizations of the connections among participants and the strategies used to communicate within the group. Some authors also suggest that SNA provides a method of easily analyzing changes in participatory patterns of members over time.

A number of research studies have applied SNA to CSCL across a variety of contexts. The findings include the correlation between a network's density and the teacher's presence, a greater regard for the recommendations of "central" participants, infrequency of cross-gender interaction in a network, and the relatively small role played by an instructor in an asynchronous learning network.

Other methods used alongside SNA

Although many studies have demonstrated the value of social network analysis within the computer-supported collaborative learning field, researchers have suggested that SNA by itself is not enough for achieving a full understanding of CSCL. The complexity of the interaction processes and the myriad sources of data make it difficult for SNA to provide an in-depth analysis of CSCL. Researchers indicate that SNA needs to be complemented with other methods of analysis to form a more accurate picture of collaborative learning experiences.

A number of research studies have combined other types of analysis with SNA in the study of CSCL. This can be referred to as a multi-method approach or data triangulation, which can increase the reliability of evaluation in CSCL studies.

  • Qualitative method – The principles of qualitative case study research constitute a solid framework for the integration of SNA methods in the study of CSCL experiences.
    • Ethnographic data such as student questionnaires and interviews and classroom non-participant observations
    • Case studies: comprehensively study particular CSCL situations and relate findings to general schemes
    • Content analysis: offers information about the content of the communication among members
  • Quantitative method – This includes simple descriptive statistical analyses of occurrences, used to detect general tendencies and to identify particular attitudes of group members that could not be tracked via SNA.
    • Computer log files: provide automatic data on how collaborative tools are used by learners
    • Multidimensional scaling (MDS): charts similarities among actors, so that more similar input data is closer together
    • Software tools: QUEST, SAMSA (System for Adjacency Matrix and Sociogram-based Analysis), and Nud*IST

Artificial intelligence for video surveillance

Face detection in a photograph

Artificial intelligence for video surveillance utilizes computer software programs that analyze the audio and images from video surveillance cameras in order to recognize humans, vehicles, objects, attributes, and events. Security contractors program the software to define restricted areas within the camera's view (such as a fenced-off area, or a parking lot but not the sidewalk or public street outside the lot) and times of day (such as after the close of business) for the property being protected by the camera surveillance. The artificial intelligence ("A.I.") sends an alert if it detects a trespasser breaking the rule that no person is allowed in that area during that time of day.

The A.I. program functions by using machine vision. Machine vision is a series of algorithms, or mathematical procedures, which work like a flow-chart or series of questions to compare the object seen with hundreds of thousands of stored reference images of humans in different postures, angles, positions and movements. The A.I. asks itself if the observed object moves like the reference images, whether it has approximately the same proportions (height relative to width), if it has the characteristic two arms and two legs, if it moves with similar speed, and if it is vertical instead of horizontal. Many other questions are possible, such as the degree to which the object is reflective, the degree to which it is steady or vibrating, and the smoothness with which it moves. Combining all of the values from the various questions, an overall ranking is derived which gives the A.I. the probability that the object is or is not a human. If the value exceeds a limit that is set, then the alert is sent. It is characteristic of such programs that they are self-learning to a degree, learning, for example, that humans or vehicles appear bigger in certain portions of the monitored image – those areas near the camera – than in other portions, those being the areas farthest from the camera.
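The combination of per-question values into an overall ranking, compared against a set limit, might be sketched as follows. The text does not say how real products combine the values, so the weighted average, the question names, the weights, and the threshold are all assumptions made for illustration.

```python
def human_probability(scores, weights):
    """Weighted average of per-question scores, each between 0 and 1."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical questions, weights, and scores for one observed object.
weights = {"aspect_ratio": 2.0, "limb_count": 3.0, "speed": 1.0, "upright": 2.0}
observed = {"aspect_ratio": 0.9, "limb_count": 0.8, "speed": 0.7, "upright": 1.0}
ALERT_THRESHOLD = 0.75  # assumed limit; in practice tuned per site

probability = human_probability(observed, weights)
send_alert = probability > ALERT_THRESHOLD
print(f"probability={probability:.4f} alert={send_alert}")
```

Raising the threshold trades missed detections for fewer false alerts; lowering it does the opposite.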

In addition to the simple rule restricting humans or vehicles from certain areas at certain times of day, more complex rules can be set. The user of the system may wish to know if vehicles drive in one direction but not the other. Users may wish to know that there are more than a certain preset number of people within a particular area. The A.I. is capable of maintaining surveillance of hundreds of cameras simultaneously. Its ability to spot a trespasser in the distance or in rain or glare is superior to humans' ability to do so.

This type of A.I. for security is known as "rule-based" because a human programmer must set rules for all of the things for which the user wishes to be alerted. This is the most prevalent form of A.I. for security. Many video surveillance camera systems today include this type of A.I. capability. The hard-drive that houses the program can either be located in the cameras themselves or can be in a separate device that receives the input from the cameras.

A newer, non-rule based form of A.I. for security called "behavioral analytics" has been developed. This software is fully self-learning with no initial programming input by the user or security contractor. In this type of analytics, the A.I. learns what is normal behaviour for people, vehicles, machines, and the environment based on its own observation of patterns of various characteristics such as size, speed, reflectivity, color, grouping, vertical or horizontal orientation and so forth. The A.I. normalises the visual data, meaning that it classifies and tags the objects and patterns it observes, building up continuously refined definitions of what is normal or average behaviour for the various observed objects. After several weeks of learning in this fashion it can recognise when things break the pattern. When it observes such anomalies it sends an alert. For example, it is normal for cars to drive in the street. A car seen driving up onto a sidewalk would be an anomaly. If a fenced yard is normally empty at night, then a person entering that area would be an anomaly.
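One simple way to picture learning what is normal and flagging departures from it is a running baseline per observed characteristic. The sketch below uses Welford's running mean and variance with a standard-deviation threshold. This is a stand-in chosen for illustration, not the method any particular product is known to use, and the speed values and threshold are hypothetical.

```python
import math


class RunningBaseline:
    """Running mean and variance of one characteristic (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def learn(self, value):
        """Fold one more observation into the baseline."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def deviation(self, value):
        """How many standard deviations `value` lies from the learned mean."""
        if self.count < 2:
            return 0.0  # not enough data to judge
        std = math.sqrt(self._m2 / (self.count - 1))
        if std == 0:
            return 0.0 if value == self.mean else math.inf
        return abs(value - self.mean) / std


# Hypothetical vehicle speeds observed during the learning period.
baseline = RunningBaseline()
for speed in [28, 30, 32, 29, 31, 30]:
    baseline.learn(speed)

ANOMALY_THRESHOLD = 3.0  # assumed sensitivity setting
for speed in (31, 55):
    is_anomaly = baseline.deviation(speed) > ANOMALY_THRESHOLD
    print(speed, "anomaly" if is_anomaly else "normal")
```

A real system would keep many such baselines (per object class, location in the image, and time of day) rather than a single one.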

History

Statement of the problem

Limitations in the ability of humans to vigilantly monitor video surveillance live footage led to the demand for artificial intelligence that could better serve the task. Humans watching a single video monitor for more than twenty minutes lose 95% of their ability to maintain attention sufficient to discern significant events. With two monitors this is cut in half again. Given that many facilities have dozens or even hundreds of cameras, the task is clearly beyond human ability. In general, the camera views of empty hallways, storage facilities, parking lots or structures are exceedingly boring and thus attention quickly diminishes. When multiple cameras are monitored, typically employing a wall monitor or bank of monitors with split screen views and rotating every several seconds between one set of cameras and the next, the visual tedium is quickly overwhelming. While video surveillance cameras proliferated with great adoption by users ranging from car dealerships and shopping plazas to schools and businesses to highly secured facilities such as nuclear plants, it was recognized in hindsight that video surveillance by human officers (also called "operators") was impractical and ineffective. Extensive video surveillance systems were relegated to merely recording for possible forensic use to identify someone, after the fact of a theft, arson, attack or incident. Where wide angle camera views were employed, particularly for large outdoor areas, severe limitations were discovered even for this purpose due to insufficient resolution. In these cases it is impossible to identify the trespasser or perpetrator because their image is too tiny on the monitor.

Earlier attempts at solution

Motion detection cameras

In response to the inability of human guards to watch surveillance monitors for long periods, the first solution was to add motion detectors to cameras. It was reasoned that an intruder's or perpetrator's motion would send an alert to the remote monitoring officer, obviating the need for constant human vigilance. The problem was that in an outdoor environment there is constant motion, that is, changes in the pixels that make up the total viewed image on screen. The motion of leaves on trees blowing in the wind, litter along the ground, insects, birds, dogs, shadows, headlights, sunbeams and so forth all register as motion. This caused hundreds or even thousands of false alerts per day, rendering this solution inoperable except in indoor environments during non-operating hours.

Advanced video motion detection

The next evolution reduced false alerts to a degree but at the cost of complicated and time-consuming manual calibration. Here, changes of a target such as a person or vehicle relative to a fixed background are detected. Where the background changes seasonally or due to other changes, the reliability deteriorates over time. The economics of responding to too many false alerts again proved to be an obstacle and this solution was not sufficient.
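Detecting change relative to a fixed background can be sketched with plain frame differencing on grayscale pixel grids. The thresholds and the tiny sample frames are assumptions for illustration, and commercial systems of the period may have worked differently. The sketch also shows why a background that drifts over time degrades reliability: every changed background pixel counts as motion.

```python
def changed_fraction(background, frame, pixel_threshold=25):
    """Fraction of pixels whose brightness differs from the background
    by more than `pixel_threshold`."""
    changed = 0
    total = 0
    for bg_row, row in zip(background, frame):
        for bg_pixel, pixel in zip(bg_row, row):
            total += 1
            if abs(pixel - bg_pixel) > pixel_threshold:
                changed += 1
    return changed / total


# Hypothetical 3x4 grayscale frames (0 = black, 255 = white).
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
frame = [
    [10, 12, 10, 10],
    [10, 200, 210, 10],
    [10, 205, 198, 10],
]

fraction = changed_fraction(background, frame)
motion_detected = fraction > 0.2  # assumed calibration value
print(f"changed={fraction:.2f} motion={motion_detected}")
```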

Advent of true video analytics

Machine learning of visual recognition relates to patterns and their classification. True video analytics can distinguish the human form, vehicles and boats or selected objects from the general movement of all other objects and visual static or changes in pixels on the monitor. It does this by recognizing patterns. When the object of interest, for example a human, violates a preset rule, for example that the number of people shall not exceed zero in a pre-defined area during a defined time interval, then an alert is sent. A red rectangle or so-called "bounding box" will typically automatically follow the detected intruder, and a short video clip of this is sent as the alert.
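A preset rule of this kind, such as no more than zero people in a defined area during a defined time interval, can be sketched in Python as below. The detection format (a label plus the centre of the bounding box), the zone coordinates, and the function names are hypothetical and are not taken from any specific analytics product.

```python
from datetime import time


def inside(box, point):
    """True when point (x, y) lies within box (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    x, y = point
    return x_min <= x <= x_max and y_min <= y <= y_max


def rule_violated(detections, zone, start, end, now, max_people=0):
    """Check a 'no more than max_people in zone between start and end' rule."""
    # The time window may wrap past midnight (for example 22:00 to 06:00).
    if start <= end:
        active = start <= now <= end
    else:
        active = now >= start or now <= end
    if not active:
        return False
    people = sum(
        1 for label, centre in detections
        if label == "person" and inside(zone, centre)
    )
    return people > max_people


# Hypothetical detections: (label, centre of bounding box in pixels).
detections = [("person", (250, 120)), ("vehicle", (300, 200)), ("person", (500, 100))]
zone = (100, 50, 400, 300)

print(rule_violated(detections, zone, time(22, 0), time(6, 0), now=time(23, 30)))
print(rule_violated(detections, zone, time(22, 0), time(6, 0), now=time(12, 0)))
```

Directional-travel or crowd-size rules would follow the same pattern, with a different condition in place of the head count.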

Practical application

Pedestrian detection

Real-time preventative action

The detection of intruders using video surveillance has limitations based on economics and the nature of video cameras. Typically, cameras outdoors are set to a wide angle view and yet look out over a long distance. Frame rate and dynamic range (needed to handle both brightly lit and dimly lit areas) further challenge the camera's ability to capture a moving human intruder. At night, even in illuminated outdoor areas, a moving subject does not gather enough light per frame and so, unless quite close to the camera, will appear as a thin wisp, a barely discernible ghost, or be completely invisible. Conditions of glare, partial obscuration, rain, snow, fog, and darkness all compound the problem. Even when a human is directed to look at the actual location on a monitor of a subject in these conditions, the subject will usually not be detected. The A.I. is able to impartially look at the entire image and all cameras' images simultaneously. Using statistical models of degrees of deviation from its learned pattern of what constitutes the human form, it will detect an intruder with high reliability and a low false alert rate even in adverse conditions. Its learning is based on approximately a quarter million images of humans in various positions, angles, postures, and so forth.

A one-megapixel camera with onboard video analytics was able to detect a human at a distance of about 350 feet with an angle of view of about 30 degrees in non-ideal conditions. Rules could be set for a "virtual fence" or intrusion into a pre-defined area. Rules could be set for directional travel, object left behind, crowd formation and some other conditions. Artificial intelligence for video surveillance is widely used in China. See Mass surveillance in China.

Talk-down

One of the most powerful features of the system is that a human officer or operator, receiving an alert from the A.I., could immediately talk down over outdoor public address loudspeakers to the intruder. This had high deterrence value as most crimes are opportunistic and the risk of capture to the intruder becomes so pronounced when a live person is talking to them that they are very likely to desist from intrusion and to retreat. The security officer would describe the actions of the intruder so that the intruder had no doubt that a real person was watching them. The officer would announce that the intruder was breaking the law and that law enforcement was being contacted and that they were being video-recorded.

Verified breach report

The police receive a tremendous number of false alarms from burglar alarms. In fact, the security industry reports that over 98% of such alarms are false. Accordingly, the police give burglar alarms a very low response priority and can take from twenty minutes to two hours to arrive at the site. By contrast, the video analytic-detected crime is reported to the central monitoring officer, who verifies with his or her own eyes that it is a real crime in progress. He or she then reports it to the police, who give such calls their highest priority.

Behavioural analytics

Active environments

While rule-based video analytics worked economically and reliably for many security applications there are many situations in which it cannot work. For an indoor or outdoor area where no one belongs during certain times of day, for example overnight, or for areas where no one belongs at any time such as a cell tower, traditional rule-based analytics are perfectly appropriate. In the example of a cell tower the rare time that a service technician may need to access the area would simply require calling in with a pass-code to put the monitoring response "on test" or inactivated for the brief time the authorized person was there.

But there are many security needs in active environments, where hundreds or thousands of people legitimately move throughout the site at all times: for example, a college campus, an active factory, a hospital or any active operating facility. It is not possible to set rules that would discriminate between legitimate people and criminals or wrong-doers.

Overcoming the problem of active environments

Using behavioral analytics, a self-learning, non-rule-based A.I. takes the data from video cameras and continuously classifies objects and events that it sees. For example, a person crossing a street is one classification. A group of people is another classification. A vehicle is one classification, but with continued learning a public bus would be discriminated from a small truck and that from a motorcycle. With increasing sophistication, the system recognizes patterns in human behavior. For example, it might observe that individuals pass through a controlled access door one at a time. The door opens, the person presents their proximity card or tag, the person passes through and the door closes. This pattern of activity, observed repeatedly, forms a basis for what is normal in the view of the camera observing that scene. Now if an authorized person opens the door but a second "tail-gating" unauthorized person grabs the door before it closes and passes through, that is the sort of anomaly that would create an alert. This type of analysis is much more complex than the rule-based analytics. While the rule-based analytics work mainly to detect intruders into areas where no one is normally present at defined times of day, the behavioral analytics works where people are active to detect things that are out of the ordinary.

A fire breaking out outdoors would be an unusual event and would cause an alert, as would a rising cloud of smoke. Vehicles driving the wrong way into a one-way driveway would also typify the type of event that has a strong visual signature and would deviate from the repeatedly observed pattern of vehicles driving the correct one-way in the lane. Someone thrown to the ground by an attacker would be an unusual event that would likely cause an alert. This is situation-specific. So if the camera viewed a gymnasium where wrestling was practiced the A.I. would learn it is usual for one human to throw another to the ground, in which case it would not alert on this observation.

What the artificial intelligence 'understands'

The A.I. does not know or understand what a human is, or a fire, or a vehicle. It is simply finding characteristics of these things based on their size, shape, color, reflectivity, angle, orientation, motion, and so on. It then finds that the objects it has classified have typical patterns of behavior. For example, humans walk on sidewalks and sometimes on streets but they don't climb up the sides of buildings very often. Vehicles drive on streets but don't drive on sidewalks. Thus the anomalous behavior of someone scaling a building or a vehicle veering onto a sidewalk would trigger an alert.

Varies from traditional mindset of security systems

Typical alarm systems are designed to not miss true positives (real crime events) and to have as low a false alarm rate as possible. In that regard, burglar alarms miss very few true positives but have a very high false alarm rate even in the controlled indoor environment. Motion detecting cameras miss some true positives but are plagued with overwhelming false alarms in an outdoor environment. Rule-based analytics reliably detect most true positives and have a low rate of false positives but cannot perform in active environments, only in empty ones. They are also limited to the simple discrimination of whether an intruder is present or not.

Something as complex or subtle as a fight breaking out or an employee breaking a safety procedure cannot be detected or discriminated by rule-based analytics. With behavioral analytics, it can. Places where people are moving and working do not present a problem. However, the A.I. may spot many things that appear anomalous but are innocent in nature. For example, if students at a campus walk on a plaza, that will be learned as normal. If a couple of students decided to carry a large sheet outdoors flapping in the wind, that might indeed trigger an alert. The monitoring officer would be alerted to look at his or her monitor and would see that the event is not a threat and would then ignore it. The degree of deviation from the norm that triggers an alert can be set so that only the most abnormal things are reported. However, this still constitutes a new way of human and A.I. interaction not typified by the traditional alarm industry mindset. This is because there will be many false alarms that may nevertheless be valuable to send to a human officer who can quickly look and determine if the scene requires a response. In this sense, it is a "tap on the shoulder" from the A.I. to have the human look at something.

Limitations of behavioral analytics

Because so many complex things are being processed continuously, the software samples down to the very low resolution of only 1 CIF to conserve computational demand. The 1 CIF resolution means that an object the size of a human will not be detected if the camera utilized is wide angle and the human is more than sixty to eighty feet distant depending on conditions. Larger objects like vehicles or smoke would be detectable at greater distances.
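The effect of down-sampling on detection range can be illustrated with a rough pinhole-camera estimate of how many pixels tall a person appears. CIF is 352 × 288 pixels; the 60-degree vertical field of view and the 5.7-foot subject height are assumptions chosen for illustration, and the minimum pixel height the software actually needs is not stated in the text.

```python
import math

CIF_HEIGHT_PX = 288  # CIF frames are 352 x 288 pixels


def pixels_on_target(distance_ft, subject_height_ft=5.7, vertical_fov_deg=60.0,
                     image_height_px=CIF_HEIGHT_PX):
    """Approximate height in pixels of a subject at a given distance,
    using a simple pinhole model with the subject centred in view."""
    # Height of the scene covered by the full image at this distance.
    scene_height_ft = 2 * distance_ft * math.tan(math.radians(vertical_fov_deg) / 2)
    return image_height_px * subject_height_ft / scene_height_ft


for distance in (20, 40, 70, 140):
    print(f"{distance:>4} ft: {pixels_on_target(distance):.1f} px tall")
```

Under these assumptions the apparent height halves each time the distance doubles, which is why larger objects such as vehicles remain detectable farther away than people.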

Quantification of situational awareness

The utility of artificial intelligence for security does not exist in a vacuum, and its development was not driven by purely academic or scientific study. Rather, it is addressed to real-world needs, and hence, economic forces. Non-security applications such as operational efficiency, shopper heat-mapping of display areas (meaning how many people are in a certain area in retail space), and attendance at classes are developing uses. Humans are not as well qualified as A.I. to compile and recognize patterns consisting of very large data sets requiring simultaneous calculations in multiple remotely viewed locations. There is nothing natively human about such awareness. Such multitasking has been shown to defocus human attention and performance. A.I.s have the ability to handle such data. For the purposes of security monitoring through video cameras, they functionally have better visual acuity than humans, or the machine approximation of it. For judging subtleties of behaviors or intentions of subjects or degrees of threat, humans remain far superior at the present state of the technology. So the A.I. in security functions to broadly scan beyond human capability, to vet the data to a first level of sorting by relevance, and to alert the human officer, who then takes over the function of assessment and response.

Security in the practical world is economically determined so that the expenditure of preventative security will never typically exceed the perceived cost of the risk to be avoided. Studies have shown that companies typically only spend about one twenty-fifth the amount on security that their actual losses cost them. What by pure economic theory should be an equivalence or homeostasis, thus falls vastly short of it. One theory that explains this is cognitive dissonance, or the ease with which unpleasant things like risk can be shunted from the conscious mind. Nevertheless, security is a major expenditure, and comparison of the costs of different means of security is always foremost amongst security professionals.

Another reason that future security threats or losses are under-assessed is that often only the direct cost of a potential loss is considered instead of the spectrum of consequential losses that are concomitantly experienced. For example, the vandalism-destruction of a custom production machine in a factory or of a refrigerated tractor-trailer would result in a long replacement time during which customers could not be served, resulting in loss of their business. A violent crime will have extensive public relations damage for an employer, beyond the direct liability for failing to protect the employee.

Behavioral analytics uniquely functions beyond simple security: because it can observe breaches in standard patterns of protocols, it can effectively find unsafe acts of employees that may result in workers' compensation or public liability incidents. Here too, the assessment of future incidents' costs falls short of the reality. A study by Liberty Mutual Insurance Company showed that the cost to employers is about six times the direct insured cost, since uninsured costs of consequential damages include temporary replacement workers, hiring costs for replacements, training costs, managers' time in reports or court, adverse effects on the morale of other workers, and effects on customer and public relations. The potential of A.I. in the form of behavioral analytics to proactively intercept and prevent such incidents is significant.

Remote control animal

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Remote_control_animal   ...