human learning

These are rough prose notes for a talk I’ve prepared, also titled “Human Learning”.

This is a fun talk.

I started writing the content for this talk at a coffee shop – the Purple Llama – while making silly faces at a six-month-old baby. His mother was taking a break from parenting, drinking coffee and browsing Instagram or some other social media. The kid just watched me as I finished writing some other piece and started to turn my attention towards this topic.

I smiled, he smiled. I stuck out my tongue, he pursed his lips. Close; he couldn’t quite get the muscle control to stick out his tongue. I went over to buy a tea, he started making little squeaks and waving around his light green, BPA free plastic teething ring, as if to say, “where are you going?”

The purpose of this talk is to consider for a few moments the often mysterious activity of human learning. It’s a joke and a serious exposition all at once. The talk’s especially funny if you browse about online for blogs, articles, software libraries, tools, platforms, frameworks, startups, demonstrations, research projects, and general hype about machine learning. What gives? And if you look carefully you’ll notice that many of the smartest commentators on machine learning are equally fascinated by – if not more so – human learning. Why?

They’re onto something. All this is for people and made by people. So we need to understand what the hell we’re working with, why, and how it’s all possible.

Instinct

In the beginning, there’s instinct. Shit happens, literally. The first thing a newborn child does is take huge lungfuls of air, expel them, and expel a few viscous ounces of shit. It’s the child’s first bowel movement, consisting of the slew of discarded nutrients and byproducts that would normally be filtered out of its body through the umbilical cord. These initial behaviors are very important, so much so that the child born at a 21st century birthing facility faces her or his first standardized test – the Apgar score. The next instinctual behavior is feeding, followed closely by sleeping.

Then comes the fifth instinct by my count: the palmar grasp reflex. This one’s fascinating, and a fun, safe experiment you can try on a minutes-old human. Place your finger in the palm of the newborn – she won’t let go. In fact, her grip will be stronger than you thought possible for a moments-old child. This is an example of a vestigial behavior, of which “goosebumps” is another example that humans experience more often, and throughout their lives. Vestigial behaviors are both automatic and learned, albeit in an evolutionary sense. The difference from feeding, sleeping, and pooping is that the palmar grasp and goosebumps don’t have a clear necessity in present day human life.

The early primates likely have an arboreal origin akin to sloths, lemurs, monkeys, or koalas, who all give live birth in trees and whose young cling to the mother’s fur for the first few years of their life. As an evolutionary behavior – something that’s passed between generations simply because offspring that do it live and offspring that don’t die – it’s an excellent example of how learning can be innate.

Imitation

The next stage of human learning resembles my opening story. The baby in the cafe imitated me in smiling and attempting to stick out his tongue. The behavior can be regarded as instinct “plus”. Humans generally smile when smiled at. Not always, but the correlation has vastly more statistical significance than, say, liking a social media message.

It’s instinctive to smile back; we also see a very rapid formation of behavioral intent. An adult may be more likely to smile at other people after being smiled at. An infant who’s smiled at may be more likely to seek further attention by smiling. These are core behaviors in humans, slightly more complex than instinctual response. If we look further, we can see more complex demonstrations in the way people speak, dress, groom, gesture, and otherwise express themselves.

Imitation, in other words, is the next stage in human learning because it requires external behaviors (not just stimuli) and the behaviors get incorporated in one’s own. There’s an active and controversial body of research debating whether imitation is a distinctively human behavior or whether some animals – such as the chimpanzees – also imitate the way humans do. Tomasello, for instance, would say “no” but the comparative anthropology community appears split from my point of view, so I’ll leave it as an open question.

Socialization

Here’s a tricky transition in classifying behaviors. The idea behind socialized learning is that there are some behaviors between imitation and more formal instruction that don’t cleanly fit into either category.

As imitative behaviors become more complex, there’s a less identifiable moment of one individual directly observing another individual and doing what they saw being done. The line becomes blurry, the description of the learned transfer less plausible from a scientific standpoint. Perhaps a behavior emerges at once from a group; perhaps this emergent behavior spreads with imitation as its driving mechanism, but perhaps we cannot be said to have a direct example of monkey see, monkey do.

A trend like avocado toast, Air Jordans, an interest in machine learning, or financial investments in tulips or cryptocurrencies probably fits this stage. The key point is that an emergent, socialized behavior has an effect on individual behaviors, preferences, and values. Sometimes this disappears rapidly – like flannel shirts – and sometimes we find ourselves enjoying tulips or cherry blossoms in April but cannot say for sure why.

These forms of learning are studied in cultural anthropology and economics.

Instruction

Education researchers have many ways to describe and evaluate different instructional styles, but the only ones that have achieved scientific validity so far are those considered direct instruction.

What is direct instruction? Let’s make an example. The stages of human learning so far are:

1. Instinct
2. Imitation
3. Socialization
4. Instruction

What are the four stages of human learning I’m teaching you? Good. You’ve just demonstrated all four stages, partially through direct instruction itself!

Personally, I have a background in philosophy and mathematics, so there’s a special place in my heart for self-referential proofs.

We (as educational researchers) know direct instruction works because it shows results, and these results can be replicated in many settings. Direct instruction is easily standardized, repeated, and distributed widely. It works great in general, but the ability to package a lesson gets significantly more difficult as the contents get more complex.

For example, a battlefield medic may take just a few hours to train, or minutes in an emergency – stop the bleeding and get them to a real doctor. The actual life saving and the further restoration of quality living require dozens of experts each with years of training and on-the-job experience. Thankfully, 95% of human experience this century is much more peaceful; unfortunately, there’s less political willpower in deploying direct instructional techniques outside of the medical and emergency professions. (I think part of this has to do with the martial and unfriendly learning environment of medical school or military training, but those socialized learning environments don’t devalue the instructional techniques writ large.)

Further Digressions

I’ve spoken of “science” several times: evolutionary biology, comparative anthropology, cultural anthropology, economics, educational research. What do we make of the scientific method and its dumb cousin, trial and error, when it comes to human learning? My opinion on this might be controversial, yet I have strong reason to believe that both trial and error and the scientific method are not proper learning methodologies.

The first reason is that I think there’s a common misconception about trial and error that comes from Enlightenment era armchair anthropology. (Get your smoking jacket, we’re going to criticize cultures we don’t know that much about directly!) The lazy way to think about the origins of learning and discovery is to just imagine that people, especially primitive people long before our smarty pants era, just made shit up. That’s the intentional fallacy – I’m basically making shit up right this moment, assuming that’s what people used to do! We become more lazy by saying – putting extra words into the mouths of Locke and Rousseau – early humans must have learned what to eat by trial and error because we can’t imagine them doing anything else – they must have known so little!

I don’t believe for a second that very many human communities actually did randomized trials on what berries to eat on the savanna. Can we seriously imagine early humans having five babies prior to moving 500 miles north from the Rift Valley with the intention of letting them randomly select grasses to eat, and then seeing which ones survive to figure out which grasses are safe and which are not? No way!

I think it more plausible to say that human communities learn what not to do by mistake. Watch any child and parent in the park and you’ll see the parent simply stop the child from sampling anything for consumption. It’s dumb to just try things randomly – don’t do it. I suspect humans have known this for a long time, and we see it socialized in things like dietary restrictions (e.g. shellfish in the desert seems ripe for disease, so let’s make sure Deuteronomy prohibits it for all the tribes of Israel).

The second reason I think there’s a misconception about randomization, this time more closely related to the scientific method, is that there’s a tendency to use the more formal methodologies of discovery to confirm or deny assumptions. That’s distinct from learning something. If I can confirm that a particular plant-derived compound has sedative effects on individuals suffering from sleep disorders, I’ve discovered something that I can then instruct physicians on using in particular cases. Otherwise, if it gives me pleasant yet unexpected hallucinations when accidentally ingested through skin contact during extraction – well, I’ve learned to use gloves in the lab and not take LSD without proper supervision and dosing!

In other words, there’s a clear utility in science, but it’s not the same thing as learning and not very amenable to the way humans pass along information. It’s more for discovery, finding new information, which I consider a different behavior altogether. Like I said – controversial.

Behold: Machines that Learn!

We’re now at the margins of this discussion. I’ve described four stages of human learning – instinct, imitation, socialization, and instruction – that comprise the vast majority of how people acquire and pass along information for use. I’ve also explained, poorly I’m sure, why science, although it serves to prove or disprove instances in each of the four stages I’ve reviewed, is not one such stage. Now I’ll move on to some closing remarks.

What’s up with the machines? Is there anything special about them, or different? I think not, at least philosophically speaking. Machine learning is an entire discipline built out of imitation. The smart people who talk about machine learning (not I) will talk about the ways in which their inventions will mimic human behaviors. Weird! More so, the really really smart ML researchers seem aghast at how desperately primitive machine learning is compared to human learning.

By way of ranting: some companies spend millions of dollars and several years building specialized hardware, software libraries, cleaning and agglomerating data sets, testing, revising, and powering with gigawatts of electricity little programs they can teach to play a video game. Or perhaps two video games with the same model. You’re really lucky if this same ML model can write crappy poetry. This all sounds to me like it’s much easier and cheaper to raise a child on Nintendo and Dr Seuss and release him on the world at age 15. Anyone want to take bets on whether AlphaZero gets kicked out of its parents’ basement in 3 years because it needs to get on with life and earn a living?

Invariably, the smart people who talk about machine learning (not I) say we can do better. In any case, the statistical methods at play in machine learning achievements like AlphaZero – which actually is a pretty impressive technological feat after all, just the same as it’s biologically amazing that a 15 year old is able to survive on Hot Pockets and Mountain Dew – these methods basically approximate what we think are human stimulus-response behaviors. Humans, even 15 year olds, do much better than I’ve polemicized and so can machines.

Neuroscience comes at the conundrum from the bottom up to try and describe human learning. At the cellular level, neuroscientists describe models and metaphors for human learning and behavior, and these descriptions formed the basis for early ML research. John von Neumann, for example, once famously quipped, “if you can describe it, I can make a computer do it.” He certainly could have. Problem: we (all humans, von Neumann included) have no idea how to formally describe the vast majority of human behavior! Neurons are a bit easier, despite their size, delicacy, and the inability to work with them in their natural environment (the living brain).

With neurons, we know how to look at their structure – long axons and webby dendrites – and measure the chemicals they emit and absorb. These chemicals, complicated salts basically, build up and get slowly absorbed between neurons; at times, the chemical-electrical imbalance they build up causes a reaction that initiates a reabsorption of some chemicals and a release of others. These sudden releases and absorptions in response to chemical signals are called action potentials – and they serve as a metaphor for the machine learning object called a perceptron.

In certain kinds of machine learning, the perceptron receives inputs and – according to some mathematical function – emits outputs. It’s all cleanly expressed by a sum of weights equation like $\sum_{i} a_{i} c_{i}$. A function may pass linearly over the sum of weights, there may be scaling by a constant, some additional phases or parameters or hyperparameters – but the core neural network concepts are extremely simple. And so is the biological neuron. And we don’t have a holistic understanding of how either works in the face of actual complex behaviors and environments. Oh well.
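
To make the "extremely simple" claim concrete, here is a minimal sketch of a single perceptron in Python. It rests on a few assumptions that aren't spelled out in these notes: I read the $a_i$ as inputs and the $c_i$ as weights, and I use a hard threshold as the "function that passes over the sum", loosely standing in for a neuron firing. The function names and the `threshold` parameter are made up for illustration, not taken from any library.

```python
def weighted_sum(inputs, weights):
    """Compute sum_i a_i * c_i, reading a_i as inputs and c_i as weights (an assumption)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(a * c for a, c in zip(inputs, weights))


def perceptron(inputs, weights, threshold=0):
    """Emit 1 ("fire") if the weighted sum exceeds the threshold, else 0.

    The hard threshold is one illustrative choice of function; other
    functions over the same sum would work just as well for a sketch.
    """
    return 1 if weighted_sum(inputs, weights) > threshold else 0


if __name__ == "__main__":
    signals = [1, 0, 1]    # hypothetical input signals
    weights = [2, 5, -1]   # hypothetical weights
    print(weighted_sum(signals, weights))
    print(perceptron(signals, weights))
```

That's the whole unit: multiply, add, compare. Everything hard about machine learning lives in how the weights get chosen and in what happens when you wire a great many of these together, which this sketch doesn't attempt.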

Content by © Jared Davis 2019-2020