Fairytale Genetics

So by now you’ve seen the story doing the rounds online about how fairytales are much older than researchers thought, as proved by ‘phylogenetic analyses’.

Us humanities scholars should be darn grateful we have those scientists to save us from our misconceptions!

Or not.


This isn’t the first time that folklorists have found their subject matter making a big splash in the international news.

Let’s be clear about this point: folklorists are THE experts on oral narratives, such as fairy tales. They have developed tools and methods for studying this material, but the clickbait stories about ‘myth’ and ‘fairy tales’ often ignore this expertise, preferring dramatic accounts of undiscovered materials.

Um, no, 500 ‘new’ fairytales were not ‘discovered’ in Germany in 2012. For a start, the materials in question had actually been published before. And this is without taking into account the point that these tales are not ‘new’. They are versions of tales collected in many other places.

Hey, if you want to see some ‘undiscovered’ tales, head down to the archives in any country of the world and go through the papers of some local folklorists. I guarantee you will find unpublished tales. (I did, during my PhD. Where’s my goddamn Guardian splash?)

You could even do something a bit more original, and look for previously undocumented versions of folk tales in early newspapers, for instance, which loved to republish fairy tales.


Let’s get back to the recent paper.

First the good news. These researchers have read the folklorists. They are up on what folklorists have to say about this stuff, and they make use of the most up-to-date version of the key reference text: the Aarne-Thompson-Uther catalogue of tale types. (And we should expect nothing less: the authors are, after all, in anthropology departments, where the meeting of science and humanities research should be old hat.)

But I’m a little bit annoyed to notice that the whole point of the article is disproving ‘researchers who think fairytales were invented in the sixteenth century’.

Note the plural.

But for this point, the article only refers to the work of the literary scholar Ruth Bottigheimer, whose theories about the early-modern origins of fairy tales have been – to put it mildly – badly received by other scholars who have devoted their entire lives to the study of traditional narrative.

Perhaps Jack Zipes’ reaction is extreme, but his dissatisfaction with Bottigheimer’s arguments is typical. You only need look at the special issue of the Journal of American Folklore devoted to the Bottigheimer debate in 2010.

This is pretty important. Most folklorists believe that the origins of fairy tales most likely lie deep in human history, far beyond the periods we have evidence for.


So thanks for nothing?

Well perhaps not. Surely this is good evidence to back up a widely held theory? Maybe even something akin to a missing link?


But there are much broader problems with any analysis of this type. I will be the first to admit I don’t understand phylogenetic research and Bayesian probability. But do the researchers who wrote this article understand historical and literary research? They are laudably gracious about the value of humanistic research into fairy tales, but that isn’t the same thing as realising that the very ‘data’ they are using is inescapably shaped by factors that researchers in the humanities – and in this case, especially folklore – have spent hundreds of years thinking about.

What do I mean?

I mean that the Aarne-Thompson-Uther catalogue is an attempt to impose order on a set of materials that are filtered through historical texts. Ireland and Finland swarmed with folklorists in the nineteenth and twentieth centuries, yet much of France and England remained unexplored. The records of which tales existed where and when are nowhere near complete enough to guarantee that the results of phylogenetic analysis are any good.

Let me offer a really important example, which is key to the article.

The authors say that they found that many tales are surprisingly limited to vertical rather than horizontal transmission, by which they mean that – contrary to what we might expect – tales tend to travel down through the generations within cultural or linguistic groups, rather than between groups who live next to one another.

This is incredible. But is it right?

I know from my own PhD research that one of the things that horrified Félix Arnaudin, arguably the most important folklorist of nineteenth-century Gascon culture, was material that was contaminated by French. Arnaudin ruthlessly excluded tales and legends that he considered ‘French’ in search of a pure Gascon heritage.

The data that this recent article is based on is full of these kinds of source problems.

And if you really want to go down this line, you should check out the work of Dr. Jeana Jorgensen whose Ph.D. thesis sadly isn’t published, but who has written extensively on the fallacies underpinning some computational analyses of folklore data.

I do hate to be a party pooper.

In fact, I applaud the meeting of scientific and humanistic research this kind of paper represents.

But I am very worried by two things here: the ignorance, or belittling, of the very serious issues raised by specialists, and the way these stories play out as media events.

Don’t stop doing the science, but perhaps we could have a little less grandstanding in the dissemination.



So this post has been a little more popular than my usual run-of-the-mill posts on history, folklore, and writing. It seems to have struck a chord with what a few other people thought about this story.

I await the inevitable backlash.


28 thoughts on “Fairytale Genetics”

  1. Thanks for this. I know little or nothing about this topic, but I was puzzled to see the idea of ancient origins for folklore presented as new and surprising. The ‘folksong’ orthodoxy of the late 19th & early 20th centuries was precisely that contemporary folksongs were survivals from immemorial ‘pagan’ times; more recently Bert Lloyd argued quite persuasively that The Outlandish Knight had elements going back at least 2000 years. There’s been a lot of pushback against this line of thinking in its broader and sloppier forms – “Poor Old Horse” as an invocation of Odin’s horse Sleipnir etc – but I didn’t think anyone was seriously denying that *some* of these stories & tropes were very old indeed.

    Whether a folktale about a blacksmith goes back to the Bronze Age, as the Guardian reported, is another matter!

  2. None of this is terribly surprising.

    As someone who studies the history of music, specifically rhythm, it is incredibly frustrating how little evidence there is in the historical record. Obviously rhythm has to exist for music to have existed, so we can infer that it is as old as what we call humanity, but we have no records of it until surprisingly recently.

    Those darned prehistoric tribes and people, with their struggling to survive and thrive, but never leaving good documentation!

    I think the Guardian article is fairly benign, and while a bit on the “OH SHOCK, OH AMAZEMENT” side, will ultimately serve to snare those who have a slightly more than casual interest, even as those who have dedicated serious study to the subject roll their eyes.

    Again, exactly like everything else in the arts.

      1. WGP: Please explain your logic here. What does the benignity (or otherwise) of the article have to do with the number of folklore departments in England?

        My layperson’s take on the whole issue: You raise excellent points about what’s missing or errant in the popularized analysis, but in general the effect of articles like the Guardian’s is to draw more people’s attention to folklore, period. Fortunately, they will find folklorists like yourself waiting for them to improve their understanding. That’s a good thing, in my view.


      2. Hi Tim: I suppose the point I am making is that these researchers claim authority on a topic that they don’t fully understand. This research is presented to the public as cutting edge. Research that makes headlines is research that funding bodies want to fund.

        Meanwhile, the people who really do understand this important and fascinating topic have no place in English higher education (and a relatively small place in Scottish, Irish, and Welsh higher education). They struggle for funding, and struggle to find steady employment.

        So far from thinking the effects of this research are benign, I think they are harmful.

        Which is not at all to say that these researchers have acted in bad faith. I’m sure they did their best to read around their topic and do the best research they could. But their misleading conclusions are actively harmful to the other people who research folk tales.

        As for drawing people’s attention to folklore, I think one of the interesting things is that folklore is already v popular with the general public, not simply at the level of ye olde folke tale, but also the modern meme, Internet legend, urban love lock tradition etc.

        But the newspapers don’t like using the word folklore (and they’re not the only ones) so in the UK we rarely see the experts commenting on these topics. The field is thrown open to pseudo-science (Dawkins’s idea of the meme being a classic example).


      3. I await their call 😛

        More seriously: I suppose I could send a reader’s letter or ask if they would publish a critique. I suspect the latter is unlikely, but it couldn’t hurt…


  3. There’s a superorganic assumption underlying this work on the phylogenetics of folktales–that folktales behave like living organisms, and that one can discover a stable model of mutation (change over time). In this particular article, the ATU type is seen as the organism, but they give us no information on what the underlying model for change is–in phylogenetics, this model is usually based on a model of DNA mutation.

    Accordingly, one must ask, “Is there a DNA for the tale? And does it change over time in a fashion that can be modeled using methods developed for a different problem?” Because these phylogenetic models appear to work for language, or more accurately, words (for which rules of change over time are well established, albeit debated, in philology), can one infer as Tehrani and da Silva do here, that the same applies to an artifact, here a tale, generated by language? These assumptions are fairly massive, and are not addressed even a little bit by the group.

    Just so everyone is on the same page, there is a great overview of some of the fundamental assumptions of phylogenetic analysis from the Delwiche Lab at UMD. These are:
    (1) homology (existence of a shared ancestry) –this is an a priori assumption; no homology leads to v. difficult results to interpret. Tehrani and da Silva propose the Indo-European hypothesis for assuming a shared ancestry [A more interesting research question is to test the IE hypothesis for tales by using existing data]
    (2) single common ancestor [the methods assume there will be something or some things at the top of the tree–you are almost guaranteed of getting a result with something at the top since that’s an assumption of the method]
    (3) mutation of the genetic sequences
    (4) relation across a dichotomously branching tree (equal division of a terminal bud)

    A priori assumptions are, in addition to homology:
    (1) sequence is correct (here the ATU or motif sequencing)
    (2) determined from the correct organism (ie the collection was not biased in any way)
    (3) the homology is free from paralogy (duplication events that do not necessarily rely on common ancestry)
    (4) sufficient similarity remains among the sequences so that there is still phylogenetic information present.
    Finally, there’s the actual applicability and parameterization of the computational method, here an established Markov chain Monte Carlo method (a Bayesian analysis approach) that is beyond my machine learning abilities to explain….

    There’s a lot more to say about what they don’t do, but this critique is intended to focus on what they do do, and on the assumptions that underlie their approach, which are quite dubious in my estimation.

  4. I will say that at least one other scholar has recently argued that we shouldn’t assume the Grimms’ tales (and by extension others) have ancient roots: http://www.amazon.com/Tales-Magic-Print-Genealogy-Brothers/dp/0719083796 The article didn’t cite this book, however.

    Please update if you find any explanations of the article’s methodology. From what I can tell, this resembles the process of tracing language families back to one ancient language from which all others originated. Okay, but how do you do that with stories?

  5. Hi and thanks for your critique. Although I’m perhaps not quite as much of an outsider as you might think (having studied and published on folklore for a number of years now), I have much to learn and welcome these largely constructive criticisms. Of course, I’m sad to hear Will thinks the research “harmful” – but would point out that the condition of folklore studies in the UK obviously long predates my interest in the area. And on the positive side, the project was funded by the Portuguese FCT research council to support Sara’s postdoc at a *folklore* department in Lisbon (she is a literary scholar by the way). So, I can assure you that no British folklorists were harmed in the making of this paper ;).
    With regard to the more substantive criticisms:
    1) We do not assume that tales are only vertically transmitted or deny the importance of contamination, which we make plain in the Introduction. The whole point was to see whether, given the massive influence of contamination, it was even possible to excavate buried signatures of common ancestry. For more than two thirds of the tales it wasn’t – although we found detectable traces in a significant minority of cases (76 out of 275).
    2) Will points out the uneven coverage of the international folklore record (or at least the ATU Index). That’s fair enough – but have you seen the patchiness of the natural historical and fossil records?! We have to work with the data we have, and actually, as an anthropologist, I am enormously impressed by the richness of the data that folklorists have accumulated. I wish we had similar records for other cultural domains! A final and important point is that “missing data” usually makes it harder for the analyses to infer the presence of traits in ancestral populations. So our estimates may, if anything, be too conservative.
    3) Tim Tangherlini in the comments section says we make a load of assumptions which we don’t. We don’t use a DNA substitution model for tale evolution. We do not assume single common ancestors, but use 1,000 different trees to address phylogenetic uncertainty. We do not assume “something at the top of the tree” that will “guarantee a result” (as a quick glance at the Results section shows!). We DO assume that shared tales among closely related populations are likely to be homologous but explicitly draw attention to this in our Discussion and suggest ways of testing it in the future (so I was surprised he thought we didn’t discuss them “even a little bit”). As I said in the beginning, I welcome constructive criticisms, but I do think it’s important that they’re based on reading the paper itself, and not the media reports.
    Thanks again!


    1. Hi Jamie,

      Thanks for the thoughtful reply, and don’t worry, I don’t blame you personally for the state of folklore in the UK 😉 I do really appreciate you taking the time to respond to the criticisms: it probably isn’t great fun to find people snarking about your research on their blogs (!).

      I just find it problematic that one form of expertise, presented in a model drawn from the sciences, gets out in front of the general public, while the people who know a lot about this, but don’t publish and publicise like scientists, are left out of the conversation.

      This is a problem that folklorists particularly suffer from: why don’t public health bodies contact them more often for their expertise on disease and rumour? (They do sometimes: see Diane Goldstein)

      Why don’t psychologists and psychoanalysts ask folklorists about myths and fairy tales before they launch into their speculative interpretations based on limited examples (Bettelheim, Jung, Campbell)?

      I’m glad to hear you have been publishing on folklore for a number of years.

      With all due respect though, have you published in any folklore journals?

      If not, perhaps it would be worth doing. I appreciate it may not represent a very useful publication in terms of research output or grant funding for the future in your discipline, and I also think there would be a risk that you wouldn’t get a fair hearing.

      But on the other hand, I would have thought you could only benefit from getting some feedback from people (and Tim Tangherlini would be at the top of my list actually!) who know a great deal about folklore from a humanist (and indeed computational) background. If this method really works, then folklorists should be the first to find this exciting. And perhaps they can suggest ways you might deal with some of the data and method problems.

      They might tell you, however, that this research finding is not very important. They might (like me) be a little bit dismayed that this is really front page news.

      Because where are the folklorists that think fairy tales aren’t very old?

      Even Bottigheimer admits that her research is really only about a subset of tales (and this is one of the issues that has got her into hot water: she makes a claim too wide for her evidence). Apart from her, who else is there? I haven’t read the Blécourt book that Lefttheweb kindly linked to above, which is a shame, as Blécourt has widely published in folklore journals.

      The only other people I can think of are literary scholars, such as John Ellis, whose book on the Grimms is a bit of a hatchet job. (And it’s important to point out that Ellis, like Bottigheimer, goes after a subset of fairy tales, and then tends to extrapolate.)

      And I think it is important to note that these scholars all aim at a non-existent target: the folklorists who supposedly deny the possibility of literary transmission. I have never met such a folklorist.

      So just to be clear: my main complaint about this research is not even about the methods and data (although I think it would actually be really great to get in a room with some people and talk about this) but about the finding: fairy tales may date back to the Bronze Age?

      Well, yes.

      Did we need phylogenetic research to prove that?


    2. Dear Jamie,
      Thanks for your thoughtful reply. The assumptions I mention in my brief post are for phylogenetic analysis in general; I was under the impression that homology, the starting point for phylogenetic research, implies a common ancestor or at least a set of common ancestors (a point of clarification: I didn’t say “guarantee a result”, I said “almost guarantee a result”). I tried to fit those assumptions to your paper with comments in brackets. I’m glad to hear that you are doing something else. It would be interesting to get a clarification of how those standard phylogenetic assumptions are not included in your analysis. The D analysis is clearly important, but it too seems to be at least initially based on binary traits in biological organisms (and hence by analogy to folklore, the superorganic and survivals theories).

      I didn’t say you used DNA substitution for this model, but rather wondered what the analogous component would be in a fairy tale, since a great deal of phylogenetic work does use DNA. After having read your article several times, I’m still confused as to what constitutes a tale in your model–is it simply the ATU number, is it a motif-chain, is it some sort of aggregate distillation of stories over a local corpus or set of local corpora? This is important since the representation of a tale and the implicit evolutionary model (inherited through the phylo.d package) are quite important. Ultimately, I’m trying to understand the underlying tale data on which this phylogenetic tree stands. The SI only includes your ATU x Place matrix.

      Finally, how do you decide which feature of a tale is the most important? You mention that “The basic plot of this tale… is stable throughout the Indo-European speaking world, from India to Scandinavia” and then settle on the smith as apparently the most representative feature of the tale. How did you decide that that feature (the cunning smith) is the most stable/important one?


  6. Thank you, François. Indeed, I have been thinking of perhaps writing a short summary of the criticisms, and I have already noted your excellent blog! It seems that several researchers (including Tim Tangherlini) agree about the circularity (?) of the logic on display…


  7. Thank you, all, for a very enlightening thread. I am a folklorist with 30+ years under my belt, but I’m not a narrative scholar. I was waiting for someone who is to comment on this research, because my folkloristic instincts led me to the same conclusions as Will has laid out. I would love to see a session at the American Folklore Society meetings next October built around the discussion that is going on here. Jamie, I hope you will consider applying to present your results there; the proposal deadline is March 31. The meeting is going to be a joint one with the International Society for Folk Narrative Research, so this would be an ideal setting for this discussion. Go to afsnet.org for information on the meeting. Thank you, all, for a stimulating discussion!

