October 29

The Dark Art of Computer Learning

It is a well-established fact that computers will one day rise up to kill us all. But it is rapidly becoming clear that they intend to humiliate us first, before their xenocidal rampage begins. A recent report found that roughly half of U.S. jobs currently filled by humans are likely to be taken by robots in some form or another in the next 10-20 years (1). It is only once we are all unemployed, sitting at home watching Seinfeld reruns and eating potato chips, that the machines will finally unleash their master plan (but which plan will they end up using!?).

I know what you are thinking. ‘My job is far too hard for a machine; it requires far too much creativity and outside-the-box thinking. I am a unique snowflake that can never be replaced!’ I hear that. We are flexibly pre-programmed (2,3) organic machines with the ability to learn and find solutions to never-before-encountered conditions. How can a machine ever cope with the multitude of unforeseen situations you encounter on a daily basis? Well, let me tell you: computers are starting to teach each other, allowing less-experienced computers to learn from the knowledge gained by older, wiser computers. This allows the less-experienced computer to encounter novel situations and still respond with the correct answer. And while these dastardly droids won’t be able to write blog posts just yet, the day will soon come, dear reader, when even my services are rendered obsolete by the silicon revolution.

How Computers Know when a Human is a Human

A basic task performed by computers is called Classification, in which a computer learns how to identify objects as members of one of several categories. Imagine a computer is tasked with classifying a series of pictures into the categories ‘Robot’ or ‘Human’.

First the computer is trained on a series of inputs (imagine the computer being fed images of a variety of different robots and humans). When the computer sees a picture, it extracts features from the image and compares these features to what it would expect for the different categories. It then makes a guess as to whether it is looking at a robot or a human, and receives feedback as to whether it is correct or not. It takes this feedback and updates its machine learning algorithm. After many guesses and checks, the machine learning algorithm has a stable set of features associated with each label. Once this has been achieved, it is ready for testing, where it makes its predictions without feedback and its performance can be graded (4).
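For the curious, the guess-check-update loop described above can be sketched in a few lines of Python. This is a minimal sketch using a perceptron-style update, which is one simple choice among many and not necessarily what any real image classifier uses. The two features (how shiny the subject is, whether it has an antenna) and all of the numbers are made up purely for illustration.

```python
def score(weights, bias, features):
    """Weighted sum of the features; positive means 'looks like a robot'."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train(examples, epochs=20, learning_rate=0.1):
    """Training phase: guess, receive feedback, update, repeat."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:  # label: 1 = robot, 0 = human
            guess = 1 if score(weights, bias, features) > 0 else 0
            error = label - guess  # feedback: 0 if right, +1 or -1 if wrong
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    """Testing phase: make a prediction with no feedback and no update."""
    return "Robot" if score(weights, bias, features) > 0 else "Human"


def accuracy(weights, bias, labelled):
    """Grade the trained classifier on examples it was not trained on."""
    correct = sum(1 for features, name in labelled
                  if predict(weights, bias, features) == name)
    return correct / len(labelled)


# Hypothetical features per picture: [metallic_shine, has_antenna]
training = [([0.9, 1.0], 1), ([0.8, 0.0], 1), ([0.1, 0.0], 0), ([0.2, 0.0], 0)]
weights, bias = train(training)

held_out = [([0.85, 1.0], "Robot"), ([0.15, 0.0], "Human")]
print(accuracy(weights, bias, held_out))
```

Notice that the weights only change when the guess is wrong. Once the guesses stop being wrong, the weights stop moving, which is the "stable set of features" mentioned above.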

What Lurks in the Dark?

Here is where things go from ‘ho-hum’ to ‘might as well quit your job now’. To understand how a machine can generalize from previous experience to a novel situation, you have to understand the output of the learning algorithm (5). In our simple example there are two categories, ‘Robot’ or ‘Human’. A robot is represented as [1,0] while a human is represented as [0,1]. The 1’s & 0’s simply mean ‘100% chance this image is from category X and 0% chance the image is from category Y’. But in real life we are never 100% sure about anything, and the lower probabilities (referred to as ‘the dark’) actually hold important information that can help us in the future. Sometimes a robot looks a bit like a human, leading to numbers more like ‘97% chance this image is from category X and 3% chance the image is from category Y’. If, instead of teaching a computer the hard numbers (“this image is 100% robot”), you train it on soft numbers based on the experience of a previous computer (“this image looks 97% like a robot and 3% like a human”), then a more experienced learning algorithm can teach an inexperienced one based on the teacher’s own encounters.
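To make the difference between hard and soft numbers concrete, here is a small Python sketch. It assumes the classifier produces a raw score per category and turns those scores into probabilities with a softmax, which is a common choice but an assumption here. The scores are invented, and the `temperature` knob is borrowed from the distillation literature rather than from anything described above.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1.

    A higher temperature flattens the distribution, making the small
    probabilities (the 'dark' part) easier to see.
    """
    scaled = [s / temperature for s in scores]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def hard_label(probabilities):
    """Collapse soft probabilities into a one-hot answer like [1, 0]."""
    best = probabilities.index(max(probabilities))
    return [1 if i == best else 0 for i in range(len(probabilities))]


# Hypothetical raw scores for the categories [robot, human]
scores = [3.5, 0.0]

soft = softmax(scores)                     # roughly 97% robot, 3% human
hard = hard_label(soft)                    # the 3% is thrown away
softer = softmax(scores, temperature=3.0)  # the human share grows

print(soft, hard, softer)
```

The hard label says only "robot". The soft version also says "but a little bit human", and that leftover sliver is exactly the information a teacher can pass on to a student.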

For a concrete example, imagine a machine learning algorithm that is fed handwritten versions of the numbers 1-9 and classifies them. Now, when people write a ‘5’, sometimes it can look very much like a 6 and sometimes it can look much more like a 3 (see figure 5). So the machine learning algorithm can tell you with confidence that a 5 is a 5, and it can also tell you when a particular 5 looks kind of like a 3 but not like a 6, and vice versa.


Now, take the learned probabilities from the first computer and teach a second computer with these more subtle probabilities (“the dark”). The test is that you never show the second algorithm an example of a 3 during the learning phase. When researchers did just this, the second model was able to identify >98% of 3’s during the test phase even though it had never seen one before the test. An even more impressive result is to take the first algorithm’s knowledge about 7’s & 8’s and teach only those two numbers to the second algorithm. When the researchers tested the second algorithm on 1, 2, 3, 4, 5, 6, and 9, it gave the correct answer on 87% of trials without ever having seen any of those numbers before! And if that does not impress you, imagine a child being taught to identify numbers he or she had no previous experience with, based only on descriptions of their similarity to known numbers, and then that child getting 87% correct on an ensuing test of numbers he or she had never seen before. I would be very impressed if any teacher/student combination could perform at the level of two very average machine learning algorithms interacting.
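Here is a toy Python sketch of how such a lesson plan might be put together. It is not the researchers' code: the "teacher" is a hand-written lookup table over just three digits, the "images" are only names, and every probability is invented. It illustrates two ideas only: the 3’s are removed from what the student sees, and the student is graded on how closely its probabilities match the teacher’s soft ones.

```python
import math


def build_transfer_set(examples, teacher, excluded_digit):
    """Pair each image with the teacher's soft targets, dropping one digit."""
    return [(image, teacher(image))
            for image, digit in examples
            if digit != excluded_digit]


def soft_cross_entropy(student_probs, teacher_probs):
    """Loss the student tries to minimize; lowest when it matches the teacher."""
    return -sum(t * math.log(s)
                for s, t in zip(student_probs, teacher_probs)
                if t > 0)


def toy_teacher(image):
    """Hypothetical stand-in teacher: probabilities over the digits [3, 5, 6]."""
    table = {
        "five_like_three": [0.10, 0.89, 0.01],
        "five_like_six": [0.01, 0.89, 0.10],
        "six": [0.01, 0.04, 0.95],
        "three": [0.95, 0.04, 0.01],
    }
    return table[image]


examples = [("five_like_three", 5), ("five_like_six", 5),
            ("six", 6), ("three", 3)]
transfer_set = build_transfer_set(examples, toy_teacher, excluded_digit=3)

for image, targets in transfer_set:
    print(image, targets)

# A clueless student that spreads its bets evenly scores worse than one
# that reproduces the teacher's probabilities exactly.
teacher_probs = transfer_set[0][1]
uniform = [1 / 3, 1 / 3, 1 / 3]
print(soft_cross_entropy(uniform, teacher_probs))
print(soft_cross_entropy(teacher_probs, teacher_probs))
```

Even though no 3 survives in the transfer set, the first column of every soft target still talks about 3’s, so the student hears about them secondhand.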


Now, I know what everyone’s thinking. ‘You got me all riled up about these computers taking my job, and you’re telling me the big breakthrough is that they can learn what a 3 looks like? I’m going back to aimlessly surfing the web!’ Well, you can go right back to Reddit, my dear reader, but I for one am going to start brushing up my skills serving dilithium chips to robot overlords on their lunch break. Oh but wait, they’re already working on automating those jobs too. Have they no decency!?


Title Pictures: Big Think (link)

Figure 1: http://opticalengineering.spiedigitallibrary.org/article.aspx?articleid=1077785

  1. Frey, C. B., & Osborne, M. A. (2013). The future of employment: How susceptible are jobs to computerisation? September 17, 2013.
  2. Deák, G. O. (2014). Development of Adaptive Tool-Use in Early Childhood: Sensorimotor, Social, and Conceptual Factors. Advances in child development and behavior, 46, 149.
  3. Lewkowicz, D. J. (2014). Early experience and multisensory perceptual narrowing. Developmental psychobiology, 56(2), 292-315.
  4. Machine Learning Classifiers
  5. Geoffrey Hinton Talk