
Friday, January 31, 2014

Who's the most significant historical figure?

From Leonardo da Vinci to Einstein, and Shakespeare to Stephen King, two data analysts have ranked the most significant people in history – do the results seem right?
Steven Skiena and Charles B Ward
The Guardian
 
The Literary Top 50 … top row from left: Shakespeare, Austen and Homer;
bottom row: King, Dickinson and Shelley
 
People love lists, and are perhaps even more fascinated by rankings – lists organised according to some measure of value or merit. Who were the most important women in history? The best writers or most influential artists? Our least illustrious political leaders? Who's bigger: Hitler or Napoleon? Picasso or Michelangelo? Charles Dickens or Jane Austen? John, Paul, George or Ringo?

We work in the fields of data and computer science and do not answer these questions as historians might, through a principled assessment of a person's achievements. Instead, we aggregate millions of opinions. We rank historical figures just as Google ranks web pages, by integrating a diverse set of measurements of reputation into a single consensus value.

Significance is related to fame but measures something different. According to our system, forgotten US President Chester A Arthur (whom we rank at 499) is more historically significant than pop star Justin Bieber (ranked 8,633), even though Arthur may have a less devoted following and certainly has lower contemporary name recognition. We believe our computational, data-centric analysis provides new ways to understand and interpret the past.

Historically significant figures leave statistical evidence of their presence behind, if one knows where to look for it. We use several data sources to fuel our ranking algorithms. Most important is Wikipedia, the web-based, collaborative, multilingual encyclopedia. Wikipedia is enormous, featuring well over 3m articles in its English edition alone. But we use it in a manner quite different from the typical reader: we analyse the Wikipedia pages of more than 800,000 people to measure quantities that should correspond to historical significance.

We would expect more significant people to have longer Wikipedia pages than less notable ones, because they have greater accomplishments to report. The pages of people of higher significance should attract greater readership than those of lower significance. The elite should have pages linked to by other highly significant figures, meaning they should have a high PageRank, the measure of importance used by Google to identify important web pages. We combine these and other variables into a single number using a statistical method called factor analysis.

But we need one final correction: to fairly compare contemporary figures such as Britney Spears against, say, Aristotle, we must adjust for the fact that today's stars will fade from living memory over the next several generations. By analysing traces left in millions of scanned books, we hope to measure just how fast this decay occurs, and correct for it.
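To make the PageRank idea concrete, here is a minimal sketch in Python of the standard power-iteration form of the algorithm, run on a tiny made-up link graph. It is an illustration only, not the authors' code: the page names, the `pagerank` function, the damping factor of 0.85 and the iteration count are all assumptions chosen for the example, and no real Wikipedia data is involved.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    `links` maps each page to the list of pages it links to.
    All names and defaults here are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Every page passes an equal share of its rank to each page it links to.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: four pages standing in for biography articles.
toy_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

scores = pagerank(toy_links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph, page C is linked to by three other pages and ends up with the highest score, while page D, which nobody links to, ends up with the lowest. A real ranking would treat such a score as just one input, alongside page length and readership, to the factor analysis described above.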

We have naturally received strong reactions from readers of our book Who's Bigger? complaining about our computational methodology. Certain historians have complained that Wiki cannot be trusted as a source for anything. This is pretty silly. People find Wikipedia articles to be generally accurate and informative, or else they wouldn't read them. Where do you head to read up on a new topic you are interested in? We think it is clear that anyone (or anything, like our algorithms) that has read all of Wikipedia would be in an excellent position to discourse about the most important people in recorded history.

More cogent is the complaint that our results are culturally biased because we analyse only the English edition of Wikipedia. How can we fairly assess the significance of Chinese poets against US presidents? We agree that any ranking of historical significance is indeed culturally dependent and so, yes, our rankings have an Anglocentric bias. But the depth of Wikipedia is so great that there are hundreds of articles about Chinese poets in the English edition.

Others highlight a few contemporary figures that they deem us to have overrated, such as Britney Spears (689) or Barack Obama (111), and use this anecdotal evidence to sneer. But we also conduct validation procedures, and compare our rankings to public opinion polls, Hall of Fame voting records, sports statistics, and even the prices of paintings and autographs.

Anecdotal evidence is not as compelling as it might seem. British readers complain that our algorithms don't rank British figures highly enough, just as strongly as Spanish readers complain that we are unfair to their compatriots. But our book is designed in part to generate debate.

Our overall top 30

At No 13 … Elizabeth I. Photograph: Getty Images

1 Jesus
2 Napoleon
3 Muhammad
4 William Shakespeare
5 Abraham Lincoln
6 George Washington
7 Adolf Hitler
8 Aristotle
9 Alexander the Great
10 Thomas Jefferson
11 Henry VIII
12 Charles Darwin
13 Elizabeth I
14 Karl Marx
15 Julius Caesar
16 Queen Victoria
17 Martin Luther
18 Joseph Stalin
19 Albert Einstein
20 Christopher Columbus
21 Isaac Newton
22 Charlemagne
23 Theodore Roosevelt
24 Wolfgang Amadeus Mozart
25 Plato
26 Louis XIV
27 Ludwig van Beethoven
28 Ulysses S Grant
29 Leonardo da Vinci
30 Augustus

Top pre-20th-century artists

At No 1 … Leonardo da Vinci. Photograph: Bettmann/CORBIS

Art has been a uniquely human activity for more than 40,000 years. But the names of artists went unrecorded for most of this period. The identities of several prominent Greek artists, most notably Phidias, survive through contemporary written accounts and Roman copies of their work. But the notion of artists with distinct identities then faded, not to be revived until the late middle ages. The great painters of the Renaissance dominate our rankings of the most significant pre-20th-century artists.

1 Leonardo da Vinci (overall ranking 29)
2 Michelangelo (86)
3 Raphael (140)
4 Rembrandt (189)
5 Titian (319)
6 Francisco Goya (366)
7 El Greco (465)
8 Albrecht Dürer (503)
9 Hans Holbein the Younger (555)
10 Johannes Vermeer (567)
11 Jacques-Louis David (607)
12 Giotto (610)
13 Diego Velázquez (693)
14 Gustave Courbet (965)
15 Hieronymus Bosch (983)

Top modern-era artists

Top of the list … Vincent van Gogh. Photograph: Rex Features

The Impressionist painters and their successors are at the top of our table of the most significant modern artists. Later movements such as surrealism (Salvador Dalí, 1,021) and abstract expressionism (Jackson Pollock, 1,013) are represented, but by relatively few artists.

1 Vincent van Gogh (73)
2 Pablo Picasso (171)
3 Claude Monet (178)
4 Henri Matisse (376)
5 Paul Cézanne (389)
6 Edgar Degas (422)
7 Andy Warhol (485)
8 Paul Gauguin (540)
9 Pierre-Auguste Renoir (549)
10 Auguste Rodin (574)
11 Wassily Kandinsky (618)
12 Édouard Manet (640)
13 Camille Pissarro (815)
14 Diego Rivera (915)
15 Edvard Munch (944)
16 James McNeill Whistler (1,002)
17 Jackson Pollock (1,013)
18 Salvador Dalí (1,021)
19 Piet Mondrian (1,051)
20 Georgia O'Keeffe (1,178)

Top 50 literary figures

At No 2 … Charles Dickens. Photograph: Getty Images

Ranking the world's greatest literary figures is a parlour game – just like the ranking of presidents or prime ministers. It exposes the biases inherent in everyone's world-view. But our ranking, it turns out, agrees with others: our top 50 contains 39 members of Daniel Burt's The Literary 100, including his 11 highest-ranked figures. With our Anglocentric source bias, we feature a larger number of British and US writers (but Jane Austen and Emily Dickinson are the only women to make it into the top 50).

1 William Shakespeare (4)
2 Charles Dickens (33)
3 Mark Twain (53)
4 Edgar Allan Poe (54)
5 Voltaire (64)
6 Oscar Wilde (77)
7 Johann Wolfgang von Goethe (88)
8 Dante Alighieri (96)
9 Lewis Carroll (118)
10 Henry David Thoreau (131)
11 Jane Austen (139)
12 Samuel Johnson (141)
13 Homer (152)
14 Lord Byron (158)
15 Walt Whitman (160)
16 John Milton (165)
17 Geoffrey Chaucer (173)
18 Virgil (177)
19 William Wordsworth (182)
20 Stephen King (191)
21 Emily Dickinson (194)
22 Leo Tolstoy (196)
23 Victor Hugo (208)
24 George Bernard Shaw (213)
25 Nathaniel Hawthorne (227)
26 Fyodor Dostoyevsky (244)
27 Miguel de Cervantes (246)
28 Ernest Hemingway (248)
29 HG Wells (249)
30 Herman Melville (251)
31 Rudyard Kipling (259)
32 Sophocles (274)
33 Samuel Taylor Coleridge (280)
34 John Keats (305)
35 Robert Burns (317)
36 Petrarch (326)
37 Percy Bysshe Shelley (329)
38 George Orwell (342)
39 Christopher Marlowe (374)
40 Thomas Hardy (378)
41 Aeschylus (386)
42 Jonathan Swift (391)
43 Rabindranath Tagore (397)
44 Henrik Ibsen (403)
45 James Joyce (406)
46 Henry James (408)
47 Aristophanes (418)
48 Alexander Pushkin (420)
49 Ben Jonson (421)
50 TS Eliot (436)

We generally score popular writers such as Oscar Wilde, Lewis Carroll and Mark Twain higher than we think the literary establishment would. We expect that establishment to be surprised by our rank for horror novelist Stephen King. No other contemporary writer came close to a spot in our Literary 50. But we consider King to be the Dickens (33) of our time, characterised by immense popularity, mind-boggling productivity, and even the serial novel genre.

Who's Bigger: Where Historical Figures Really Rank by Steven Skiena and Charles B Ward is published by Cambridge.
