Big Little Problems

This week, I had the bad luck to come across the following story about a book ranking the ‘importance’ of historical figures using computational methods.

As a cultural historian, I find their approach fundamentally wrong-headed, from its most basic assumptions up to the guiding idea itself.

Let me tell you why.

I’ve said it before about other topics, and I’ll say it again now: I am an enthusiast of applying digital methodologies to cultural history… but only when it’s done well.

Clickbait about computers solving the past (thanks guys!) has this depressing tendency to resurface periodically, and this story is no exception. It turns out to be from 2014… about a book published in 2013.

But because the article pissed me off enough, I thought it was worth rebutting here. Yeah, yeah, yeah, I haven’t read the book. But given how badly they explain it in the article, and given how much rope they provide for their own execution in this short piece, I can hardly resist the temptation to get stuck in.

The authors do briefly address some ‘criticisms’. They know that their approach is ‘biased’ because it’s only based on English-language Wikipedia. But the only other problem they address (at least in the puff piece article) is close to a straw-man: the good old ‘Wikipedia is nonsense’ attitude. ‘No,’ they cry, ‘Wikipedia is surprisingly reliable!’ Hooray, plus one to Gryffindor, etc. etc.

I can spot some other fairly glaring bullshit that goes unexamined.

For a start: the whole concept itself stinks. I don’t say that simply as someone who has little interest in researching who The Most Important Man was: I say it because the idea of the importance of individuals is a very culturally-specific one.

Should they, for instance, have included figures that some audiences today might consider fictional, or mythical? Jesus is on the list, but did the Devil really not make the top 100? In my current research looking at magical practices in the nineteenth century, many of the ‘authors’ who were most influential are made up, or at least misattributed. That’s the nature of grimoire publishing. But it isn’t something that a question like this about greatness can address.

Of course many cultures have celebrated individual achievement. But the ways they do (and did) so have been wildly variable. Even if the authors had bothered to use Wikipediae (sic?) in a variety of different languages, they wouldn’t be able to address the problem that the heroic model of history they are writing themselves into is very much a legacy of Romanticism.

And this cultural specificity of ‘greatness’ really undermines their methodology.

They write:

‘We would expect that more significant people should have longer Wikipedia pages than those less notable because they have greater accomplishments to report.’

Um. What?

Quite apart from the fact that Wikipedia article length is largely determined by fandom (a single keen amateur could easily inflate the apparent importance of any given figure), the idea that a short Wikipedia article must correspond to a less important historical figure simply baffles me. That pattern might hold in the aggregate, but when what you are doing is trying to work out which specific individuals are the most influential… well, that’s a huge flaw.

‘The Wiki pages of people of higher significance should attract greater readership than those of lower significance’.

Or maybe the pages about the most controversial people attract higher readership. (A similar point could be made about article length.)

Controversy is not importance.

There isn’t a way to measure implicit importance: the ways that given figures may have hovered on the lips and in the minds of writers in the past, and of the authors of Wikipedia articles.

Any historian reading Alexis de Tocqueville, for instance, knows that Napoleon III is at the forefront of his mind in L’Ancien régime et la révolution, but unless I’m mistaken, Tocqueville never once uses his name.


If I had more energy, and felt less poorly from the flu I’ve picked up this week, I could perhaps go on.

But I think perhaps the important thing is to keep making this basic point: historians, like the folklorists I have sometimes spent time defending, do a specific kind of research with its own methods and findings.

Just because you have a big computer and a good idea does not mean you can chuck that work out of the window.

But if you want to come and play nicely, you’ve got my number.

One thought on “Big Little Problems”

  1. I’d be interested to know the precise methodology applied to this cull of Wikipedia. E.g. have the authors noted who wrote the original article, who has since edited, and at what time? How has this affected their profound statement that Big = Important? How has this affected their overall analysis of the data?

    Length of article is also affected by every random piece of claptrap that happens to be in the news that day; Donald Trump’s and Anders Breivik’s entries get longer by the day, mostly through the accretion of pointless ephemera. Crowd-sourcing edits rather has this effect, I suspect, as everyone wants to have their say. Whether either figure is historically ‘important’ (let’s hope not) remains to be seen. I think our students would tell us that, at best, this analysis *might* tell us something about the many editors of Wikipedia and *perhaps* also the people who read it, but nothing whatsoever about the past.
