Reading is Weird


How do humans learn how to read? Reading is a much more recent development than spoken language, with the earliest written language typically placed around 3200 BCE in Egypt (1). Some have termed this development a cultural invention because we gained a new ability, understanding meaning from written characters, most likely without any changes in our brains. And it is not just any meaning that we are gaining, but the exact same meaning as if we had heard the word spoken. Perhaps the secret to how we read is linked to how vocalized and written language interact in our brains.

Learning vocalized language essentially happens for free; put an infant in an environment with spoken words and they cannot help but learn the language. This skill appears to be almost preprogrammed into humans. But the ability to turn meaning into characters (writing) and then to turn those characters back into meaning (reading) does not come to us in the same instinctual way. Learning to read is a struggle that requires years of practice to perfect, and many potential disabilities (such as dyslexia) are associated with it. Further, if you are never taught, you will not learn how to read. There are multiple countries in the world, including South Sudan, Afghanistan, and Niger, in which less than one third of adults can read (2). For a geographically closer example, a 1989 study found that 16% of Canadians had literacy skills “too limited to deal with printed material encountered in daily life” (3). So why is learning a written language different from learning a spoken language?

Learning to Read

One popular theory is that we use our speech processing abilities as a stepping stone to the creation of a completely new ability (4). This theory posits that there are two routes to get from the written word to meaning. I will call one route Phonological Recoding and the other the Direct Route.

Think of language processing routes as a journey. First, you must process the basic sensory characteristics (for the letter ‘i’, the visual characteristics would be the straight line and dot, and the auditory characteristic would be the sound ‘ih’). Next, you must turn the sensory experience into a usable linguistic form, i.e. you become aware that you are looking at or hearing an ‘i’. Eventually, the various parts of the word are put together, and you become aware of which word you are viewing or hearing and gain access to the word’s meaning. There are two possible routes the visual word can take.

The idea behind the Phonological Recoding route is to use letter-to-sound rules to convert written symbols into sounds and then use already existing auditory neural machinery to combine the sounds into a word and access the meaning. The most vivid example of phonological recoding is imagining a young child learning to read by sounding out a word (wee…eer…eerd…weird!).
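As a rough illustration of what ‘letter-to-sound rules’ could look like, here is a minimal Python sketch. The rule table (TOY_RULES), the sound codes, and the function name recode are all made up for this example; real English letter-to-sound mapping is far messier, and nothing here is taken from the cited papers.

```python
# Toy letter-to-sound rules (hypothetical; invented for this illustration).
# Two-letter spellings are listed before single letters so they match first.
TOY_RULES = [
    ("ee", "EE"),
    ("ei", "EE"),
    ("w", "W"),
    ("r", "R"),
    ("d", "D"),
]


def recode(letters: str) -> list[str]:
    """Convert a letter string into toy sound codes, reading left to right."""
    letters = letters.lower()
    sounds = []
    i = 0
    while i < len(letters):
        for spelling, sound in TOY_RULES:
            if letters.startswith(spelling, i):
                sounds.append(sound)
                i += len(spelling)
                break
        else:
            # No rule covers this letter; record it as an unknown sound.
            sounds.append("?")
            i += 1
    return sounds


print(recode("weird"))  # ['W', 'EE', 'R', 'D']
print(recode("weerd"))  # ['W', 'EE', 'R', 'D']
```

In this toy version, two different spellings come out as the same string of sounds, so a reader who only has the sounds to go on cannot tell them apart.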

The Direct Route uses visual relationships between the letters in a word to determine visually (without reference to auditory processing) which word is being read. The Direct Route develops gradually as a child’s reading skill increases and the child builds up ‘whole word orthographic representations’, which is a fancy way of saying ‘they will know the meaning of a word just from seeing it’.

What is the evidence?

I’ll quickly go over one recent study that illustrates this theory (5). The scientists’ aim is to demonstrate that children rely on Phonological Recoding early on and later develop whole word orthographic representations (the Direct Route). To do this they use two tools: Pseudohomophones and Transposed Letter Non-words (TLNs). Pseudohomophones are written non-words that sound like real words when read aloud (Weird -> Weerd). If you are relying on Phonological Recoding to read, you would misidentify Pseudohomophones as real words because they sound exactly like real words. TLNs are a little more complicated. Phonological Recoding requires a serial reading of letters (W-E-I-R-D) for precise mapping from letters to sounds. As children become more skilled at reading, they develop the ability to process multiple letters at once, with letter combinations cueing them in to which word is in front of them (W-E, W-I, W-R, W-D, E-I, E-R etc.). A TLN switches two letters (WIERD), which preserves most of the combinations in the real word (W-E, W-I, W-R, W-D, E-R are all present, with just E-I changing to I-E). Because of the large overlap between letter combinations, a reader relying on the Direct Route can misidentify the TLN as a word.
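To make the letter-combination overlap concrete, here is a small Python sketch that lists every ordered pair of letters in a string and compares WEIRD with WIERD. It assumes that pairs count at any distance (as the W-D example suggests); the function name letter_pairs is my own, and this is only an illustration of the idea, not the model used in the study.

```python
from itertools import combinations


def letter_pairs(word: str) -> set[tuple[str, str]]:
    """Return every ordered pair (earlier letter, later letter) in the word."""
    return set(combinations(word.upper(), 2))


real = letter_pairs("WEIRD")
tln = letter_pairs("WIERD")
shared = real & tln

print(len(real), len(shared))  # 10 9
print(real - tln)              # {('E', 'I')}
print(tln - real)              # {('I', 'E')}
```

Nine of the ten letter pairs in WEIRD also appear in WIERD, and the only mismatch is the swapped pair.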

Since the scientists theorize that Phonological Recoding predominates early and the Direct Route develops over time, they have a rather tidy hypothesis. They predict that Pseudohomophone errors should start high (due to children’s early reliance on Phonological Recoding) and decrease as reading level increases, while TLN errors should start low and increase up to a certain grade level (due to the gradual development of the Direct Route).

Figure 1. Errors made by children when deciding whether non-words are words. The y-axis shows the differences in error rates between regular non-words and the pseudohomophones and TLNs (in all cases, errors to pseudohomophones and TLNs were greater than to regular non-words). An error occurs when a child incorrectly labels a non-word (either a regular non-word, pseudohomophone, or TLN) as a word.

When examining Figure 1, at first glance it seems to support the scientists’ theory. There is a striking decrease in Pseudohomophone errors as children get older and presumably become more skilled readers. However, the initial increase and later decrease in TLN errors is much less pronounced (though still statistically significant). This could suggest that the scientists’ theory is not as strong as it could be, or perhaps that TLNs are not a very sensitive measure of the development of the Direct Route. Check out the paper for yourself and see if you believe their results and interpretation.

Of course, this is just one study and one theory. For an amazing (and very dense) overview of the current state of theories of reading, take a gander at (6).

What about other animals?

Using vocalizations to communicate meaning between individuals has a long and storied history in the animal kingdom. Imagine one wild animal screeching at another to ‘Keep Out’ of its territory. Humans do this too: one way to keep people from trespassing is to hire a guard to stand and say ‘Keep Out’ whenever someone ventures a little too close to the property line. But because of our ability to read, we can send the exact same message with a ‘Keep Out’ sign.

While animals have non-vocal ways of sending messages with similar meanings to their vocalizations, the relationship between the two isn’t quite the same. An animal may send a message similar to screeching by urinating on a tree, but it would probably be a stretch to say the animal ‘hears’ the smell of urine the same way a human ‘hears’ the sign’s words ‘Keep Out’. (Fun neuro challenge: try to design an experiment in your head that would test my claim that animals don’t ‘hear’ a warning screech when smelling another animal’s warning urine.)

Here’s a jumping off point to some studies that have asked whether our close genetic relatives can learn to use symbols to communicate. Bon Voyage!


1) Mitchell, Larkin. “Earliest Egyptian Glyphs”. Archaeology. Archaeological Institute of America.
2) CIA. The World Factbook.
3) OECD, Statistics Canada (2011). “Literacy for Life: Further Results from the Adult Literacy and Life Skills Survey”.
4) Share, D. L. (1999). Phonological recoding and orthographic learning: A direct test of the self-teaching hypothesis. Journal of Experimental Child Psychology, 72(2), 95-129.
5) Grainger, J., Lété, B., Bertrand, D., Dufau, S., & Ziegler, J. C. (2012). Evidence for multiple routes in learning to read. Cognition, 123(2), 280-292.
6) Frost, R., Behme, C., Beveridge, M. E., Bak, T. H., Bowers, J. S., Coltheart, M., … & Pitchford, N. J. (2012). Towards a universal model of reading. Behavioral and Brain Sciences, 35(5), 263.