Jay McClelland: Neural Networks and the Emergence of Cognition #222

Transcript

00:00:00 The following is a conversation with Jay McClelland,

00:00:03 a cognitive scientist at Stanford

00:00:05 and one of the seminal figures

00:00:06 in the history of artificial intelligence

00:00:09 and specifically neural networks.

00:00:12 Having written the Parallel Distributed Processing book

00:00:15 with David Rumelhart,

00:00:17 who coauthored the backpropagation paper with Geoff Hinton.

00:00:21 In their collaborations, they’ve paved the way

00:00:24 for many of the ideas

00:00:25 at the center of the neural network based

00:00:27 machine learning revolution of the past 15 years.

00:00:32 To support this podcast,

00:00:33 please check out our sponsors in the description.

00:00:36 This is the Lex Fridman Podcast

00:00:38 and here is my conversation with Jay McClelland.

00:00:43 You are one of the seminal figures

00:00:45 in the history of neural networks.

00:00:47 At the intersection of cognitive psychology

00:00:49 and computer science,

00:00:51 what, to you, has emerged over the decades

00:00:54 as the most beautiful aspect of neural networks?

00:00:57 Both artificial and biological.

00:01:00 The fundamental thing I think about with neural networks

00:01:03 is how they allow us to link

00:01:08 biology with the mysteries of thought.

00:01:17 When I was first entering the field myself

00:01:19 in the late 60s, early 70s,

00:01:23 cognitive psychology had just become a field.

00:01:29 There was a book published in '67 called Cognitive Psychology.

00:01:36 And the author said that the study of the nervous system

00:01:42 was only of peripheral interest.

00:01:44 It wasn’t going to tell us anything about the mind.

00:01:48 And I didn’t agree with that.

00:01:51 I always felt, oh, look, I’m a physical being.

00:01:58 From dust to dust, you know,

00:02:01 ashes to ashes, and somehow I emerged from that.

00:02:06 So that’s really interesting.

00:02:08 So there was a sense with cognitive psychology

00:02:11 that in understanding the neuronal structure of things,

00:02:17 you’re not going to be able to understand the mind.

00:02:20 And then your sense is if we study these neural networks,

00:02:23 we might be able to get at least very close

00:02:25 to understanding the fundamentals of the human mind.

00:02:28 Yeah.

00:02:29 I used to think, or I used to talk about the idea

00:02:32 of awakening from the Cartesian dream.

00:02:36 So Descartes, you know, thought about these things, right?

00:02:41 He was walking in the gardens of Versailles one day,

00:02:46 and he stepped on a stone.

00:02:48 And a statue moved.

00:02:52 And he walked a little further,

00:02:53 he stepped on another stone, and another statue moved.

00:02:55 And he was like, why did the statue move

00:02:59 when I stepped on the stone?

00:03:00 And he went and talked to the gardeners,

00:03:02 and he found out that they had a hydraulic system

00:03:06 that allowed the physical contact with the stone

00:03:10 to cause water to flow in various directions,

00:03:12 which caused water to flow into the statue

00:03:14 and move the statue.

00:03:15 And he used this as the beginnings of a theory

00:03:22 about how animals act.

00:03:28 And he had this notion that these little fibers

00:03:33 that people had identified that weren’t carrying the blood,

00:03:37 you know, were these little hydraulic tubes

00:03:39 that if you touch something, there would be pressure,

00:03:42 and it would send a signal of pressure

00:03:43 to the other parts of the system,

00:03:46 and that would cause action.

00:03:49 So he had a mechanistic theory of animal behavior.

00:03:54 And he thought that the human had this animal body,

00:04:00 but that some divine something else

00:04:03 had to have come down and been placed in him

00:04:06 to give him the ability to think, right?

00:04:10 So the physical world includes the body in action,

00:04:15 but it doesn’t include thought according to Descartes, right?

00:04:19 And so the study of physiology at that time

00:04:22 was the study of sensory systems and motor systems

00:04:26 and things that you could directly measure

00:04:30 when you stimulated neurons and stuff like that.

00:04:33 And the study of cognition was something that, you know,

00:04:38 was tied in with abstract computer algorithms

00:04:41 and things like that.

00:04:43 But when I was an undergraduate,

00:04:45 I learned about the physiological mechanisms.

00:04:48 And so when I’m studying cognitive psychology

00:04:51 as a first year PhD student, I’m saying,

00:04:53 wait a minute, the whole thing is biological, right?

00:04:56 You know?

00:04:57 You had that intuition right away.

00:04:59 That always seemed obvious to you.

00:05:00 Yeah, yeah.

00:05:03 Isn’t that magical, though,

00:05:04 that from just a little bit of biology can emerge

00:05:08 the full beauty of the human experience?

00:05:10 Why is that so obvious to you?

00:05:13 Well, obvious and not obvious at the same time.

00:05:18 And I think about Darwin in this context, too,

00:05:20 because Darwin knew very early on

00:05:25 that none of the ideas that anybody had ever offered

00:05:29 gave him a sense of understanding

00:05:31 how evolution could have worked.

00:05:36 But he wanted to figure out how it could have worked.

00:05:40 That was his goal.

00:05:42 And he spent a lot of time working on this idea

00:05:48 and reading about things that gave him hints

00:05:52 and thinking they were interesting but not knowing why

00:05:54 and drawing more and more pictures of different birds

00:05:57 that differ slightly from each other and so on, you know.

00:06:00 And then he figured it out.

00:06:03 But after he figured it out, he had nightmares about it.

00:06:06 He would dream about the complexity of the eye

00:06:10 and the arguments that people had given

00:06:12 about how ridiculous it was to imagine

00:06:16 that that could have ever emerged

00:06:19 from some sort of, you know, unguided process, right?

00:06:24 That it hadn’t been the product of design.

00:06:28 And so he didn’t publish for a long time,

00:06:32 in part because he was scared of his own ideas.

00:06:35 He didn’t think they could possibly be true.

00:06:40 But then, you know, by the time

00:06:44 the 20th century rolls around, we all,

00:06:49 you know, we understand that,

00:06:52 many people understand or believe

00:06:55 that evolution produced, you know, the entire

00:06:59 range of animals that there are.

00:07:03 And, you know, Descartes’s idea starts to seem

00:07:06 a little wonky after a while, right?

00:07:08 Like, well, wait a minute.

00:07:11 There’s the apes and the chimpanzees and the bonobos

00:07:15 and, you know, like, they’re pretty smart in some ways.

00:07:18 You know, so what?

00:07:20 Oh, you know, somebody comes up,

00:07:22 oh, there’s a certain part of the brain

00:07:23 that’s still different.

00:07:24 They don’t, you know, there’s no hippocampus

00:07:26 in the monkey brain.

00:07:28 It’s only in the human brain.

00:03:31 Huxley had to do a dissection in front of many, many people

00:03:34 in the mid 19th century to show them

00:07:36 there’s actually a hippocampus in the chimpanzee’s brain.

00:07:40 You know, so the continuity of the species

00:07:45 is another element that, you know,

00:07:49 contributes to this sort of, you know, idea

00:07:56 that we are ourselves a total product of nature.

00:08:01 And that, to me, is the magic and the mystery,

00:08:06 how nature could actually, you know,

00:08:11 give rise to organisms that have the capabilities

00:08:16 that we have.

00:08:20 So it’s interesting because even the idea of evolution

00:08:23 is hard for me to keep all together in my mind.

00:08:27 Because we think on a human time scale,

00:08:30 it’s hard to imagine, like, the development

00:08:33 of the human eye would give me nightmares too.

00:08:36 Because you have to think across many, many, many

00:08:38 generations, and it’s very tempting to think about

00:08:41 kind of a growth of a complicated object

00:08:44 and it’s like, how is it possible for such a thing

00:08:49 to be built?

00:08:50 Because also, me, from a robotics engineering perspective,

00:08:53 it’s very hard to build these systems.

00:08:55 How, through an undirected process,

00:08:58 can a complex thing be designed?

00:09:00 It seems not, it seems wrong.

00:09:03 Yeah, so that’s absolutely right.

00:09:05 And I, you know, a slightly different career path

00:09:08 that would have been equally interesting to me

00:09:10 would have been to actually study the process

00:09:15 of embryological development flowing on

00:09:21 into brain development and the exquisite sort of laying

00:09:29 down of pathways and so on that occurs in the brain.

00:09:32 And I only know the slightest bit about that, it's not my field,

00:09:35 but there are, you know, fascinating aspects

00:09:43 to this process that eventually result in the, you know,

00:09:49 the complexity of various brains.

00:09:54 At least, you know, in the field,

00:09:59 I think people have felt for a long time that,

00:10:02 in the study of vision, the continuity between humans

00:10:07 and nonhuman animals has been second nature

00:10:11 for a lot longer.

00:10:12 I had this conversation with somebody

00:10:16 who is a vision scientist and he was saying,

00:10:17 oh, we don’t have any problem with this.

00:10:19 You know, the monkey’s visual system

00:10:21 and the human visual system, extremely similar

00:10:26 up to certain levels, of course, they diverge after a while.

00:10:29 But the visual pathway from the eye

00:10:34 to the brain and the first few layers of cortex

00:10:41 or cortical areas, I guess one would say,

00:10:45 are extremely similar.

00:10:49 Yeah, so on the cognition side is where the leap

00:10:52 seems to happen with humans,

00:10:54 that it does seem we’re kind of special.

00:10:56 And that’s a really interesting question

00:10:58 when thinking about alien life

00:11:00 or if there’s other intelligent alien civilizations

00:11:03 out there, is how special is this leap?

00:11:06 So one special thing seems to be the origin of life itself.

00:11:09 However you define that, there’s a gray area.

00:11:11 And the other leap, this is very biased perspective

00:11:14 of a human, is the origin of intelligence.

00:11:19 And again, from an engineer perspective,

00:11:22 it’s a difficult question to ask.

00:11:24 An important one is how difficult is that leap?

00:11:27 How special were humans?

00:11:30 Did a monolith come down?

00:11:32 Did aliens bring down a monolith

00:11:33 and some apes had to touch a monolith to get it?

00:11:38 That’s a lot like Descartes idea, right?

00:11:41 Exactly, but it just seems one heck of a leap

00:11:46 to get to this level of intelligence.

00:11:48 Yeah, and so Chomsky argued that some genetic fluke occurred

00:12:00 100,000 years ago, and it just happened

00:12:04 that some human, some hominin predecessor of current humans

00:12:13 had this one genetic tweak that resulted in language.

00:12:20 And language then provided this special thing that separates us

00:12:29 from all other animals.

00:12:36 I think there’s a lot of truth to the value and importance

00:12:39 of language, but I think it comes along

00:12:43 with the evolution of a lot of other related things related

00:12:48 to sociality and mutual engagement with others

00:12:53 and establishment of, I don’t know,

00:13:01 rich mechanisms for organizing and understanding

00:13:07 of the world, which language then plugs into.

00:13:12 Right, so language is a tool that

00:13:16 allows you to do this kind of collective intelligence.

00:13:18 And whatever is at the core of the thing that

00:13:21 allows for this collective intelligence is the main thing.

00:13:25 And it’s interesting to think about that one fluke, one

00:13:29 mutation could lead to the first crack opening of the door

00:13:36 to human intelligence.

00:13:38 All it takes is one.

00:13:39 Evolution just kind of opens the door a little bit,

00:13:41 and then time and selection takes care of the rest.

00:13:45 You know, there’s so many fascinating aspects

00:13:48 to these kinds of things.

00:13:49 So we think of evolution as continuous, right?

00:13:54 We think, oh, yes, OK, over 500 million years,

00:13:58 there could have been these relatively continuous changes.

00:14:04 But that's not what anthropologists,

00:14:12 evolutionary biologists found from the fossil record.

00:14:15 They found hundreds of millions of years of stasis.

00:14:24 And then suddenly a change occurs.

00:14:27 Well, suddenly on that scale is a million years or something,

00:14:32 or even 10 million years.

00:14:33 But the concept of punctuated equilibrium

00:14:38 was a very important concept in evolutionary biology.

00:14:44 And that also feels somehow right about the stages

00:14:53 of our mental abilities.

00:14:55 We seem to have a certain kind of mindset at a certain age.

00:14:59 And then at another age, we look at that four year old

00:15:04 and say, oh, my god, how could they have thought that way?

00:15:07 So Piaget was known for this kind of stage theory

00:15:10 of child development, right?

00:15:11 And you look at it closely, and suddenly those stages

00:15:14 aren't so discrete, and there are transitions.

00:15:17 But the difference between the four year old and the seven

00:15:19 year old is profound.

00:15:20 And that’s another thing that’s always interested me

00:15:24 is how something happens over the course of several years

00:15:29 of experience where at some point

00:15:31 we reach the point where something

00:15:33 like an insight or a transition or a new stage of development

00:15:37 occurs.

00:15:38 And these kinds of things can be understood

00:15:45 in complex systems research.

00:15:47 And so evolutionary biology, developmental biology,

00:15:55 cognitive development are all things

00:15:57 that have been approached in this kind of way.

00:15:59 Yeah.

00:16:01 Just like you said, I find both fascinating

00:16:03 those early years of human life, but also

00:16:07 the early minutes, days from the embryonic development

00:16:13 to how from embryos you get the brain.

00:16:17 That development, again, from an engineer perspective,

00:16:20 is fascinating.

00:16:22 So it’s not.

00:16:22 So the early, when you deploy the brain to the human world

00:16:27 and it gets to explore that world and learn,

00:16:29 that’s fascinating.

00:16:30 But just like the assembly of the mechanism

00:16:33 that is capable of learning, that’s amazing.

00:16:36 The stuff they’re doing with brain organoids

00:16:39 where you can build mini brains and study

00:16:42 that self assembly of a mechanism from the DNA material,

00:16:48 that’s like, what the heck?

00:16:51 You have literally biological programs

00:16:55 that just generate a system, this mushy thing that’s

00:17:00 able to be robust and learn in a very unpredictable world

00:17:05 and learn seemingly arbitrary things,

00:17:08 or a very large number of things that enable survival.

00:17:14 Yeah.

00:17:15 Ultimately, that is a very important part

00:17:19 of the whole process of understanding

00:17:22 this emergence of mind from brain kind of thing.

00:17:27 And the whole thing seems to be pretty continuous.

00:17:29 So let me step back to neural networks

00:17:32 for another brief minute.

00:17:35 You wrote the Parallel Distributed Processing books

00:17:37 that explored ideas of neural networks in the 1980s

00:17:42 together with a few folks.

00:17:43 But the books you wrote with David Rumelhart,

00:17:47 who is the first author on the backpropagation

00:17:50 paper with Geoff Hinton.

00:17:52 So these are just some of the figures at the time

00:17:54 that were thinking about these big ideas.

00:17:57 What are some memorable moments of discovery

00:18:00 and beautiful ideas from those early days?

00:18:04 I’m going to start sort of with my own process in the mid 70s

00:18:13 and then into the late 70s when I met Geoff Hinton

00:18:18 and he came to San Diego and we were all together.

00:18:25 In my time in graduate school, as I've already described to you,

00:18:30 I had this sort of feeling of, OK, I’m

00:18:33 really interested in human cognition,

00:18:35 but this disembodied sort of way of thinking about it

00:18:40 that I’m getting from the current mode of thought about it

00:18:44 isn’t working fully for me.

00:18:47 And when I got my assistant professorship,

00:18:52 I went to UCSD and that was in 1974.

00:18:58 Something amazing had just happened.

00:19:00 Dave Rumelhart had written a book together

00:19:03 with another man named Don Norman

00:19:06 and the book was called Explorations in Cognition.

00:19:09 And it was a series of chapters exploring

00:19:14 interesting questions about cognition,

00:19:17 but in a completely sort of abstract, nonbiological kind

00:19:22 of way.

00:19:23 And I’m saying, gee, this is amazing.

00:19:25 I’m coming to this community where people can get together

00:19:28 and feel like they're collectively exploring ideas.

00:19:35 And it was a book that had a lot of, I don’t know,

00:19:39 lightness to it.

00:19:40 And Don Norman, who was the more senior figure

00:19:47 to Rumelhart at that time, who led that project,

00:19:51 always created this spirit of playful exploration of ideas.

00:19:55 And so I’m like, wow, this is great.

00:19:58 But I was also still trying to get from the neurons

00:20:07 to the cognition.

00:20:10 And I realized at one point, I got this opportunity

00:20:15 to go to a conference where I heard a talk by a man named

00:20:18 James Anderson, who was an engineer,

00:20:22 but by then a professor in a psychology department, who

00:20:26 had used linear algebra to create neural network

00:20:32 models of perception and categorization and memory.

00:20:37 And it just blew me out of the water

00:20:41 that one could create a model that was simulating neurons,

00:20:47 not just engaged in a stepwise algorithmic process that

00:20:56 was construed abstractly.

00:20:58 But it was simulating remembering and recalling

00:21:03 and recognizing the prior occurrence of a stimulus

00:21:07 or something like that.

00:21:08 So for me, this was a bridge between the mind and the brain.

00:21:14 And I remember I was walking across campus one day in 1977,

00:21:20 and I almost felt like St. Paul on the road to Damascus.

00:21:25 I said to myself, if I think about the mind in terms

00:21:30 of a neural network, it will help

00:21:32 me answer the questions about the mind

00:21:33 that I’m trying to answer.

00:21:36 And that really excited me.

00:21:38 So I think that a lot of people were

00:21:43 becoming excited about that.

00:21:45 And one of those people was Jim Anderson, who I had mentioned.

00:21:49 Another one was Steve Grossberg, who

00:21:52 had been writing about neural networks since the 60s.

00:21:58 And Geoff Hinton was yet another.

00:22:00 And his PhD dissertation showed up in an applicant pool

00:22:08 to a postdoctoral training program

00:22:11 that Dave and Don, the two men I mentioned before,

00:22:16 Rumelhart and Norman, were administering.

00:22:19 And Rumelhart got really excited about Hinton’s PhD dissertation.

00:22:26 And so Hinton was one of the first people

00:22:30 who came and joined this group of postdoctoral scholars

00:22:34 that was funded by this wonderful grant that they got.

00:22:39 Another one who is also well known

00:22:41 in neural network circles is Paul Smolensky.

00:22:45 He was another one of that group.

00:22:47 Anyway, Geoff and Jim Anderson organized a conference

00:22:55 at UCSD where we were.

00:22:59 And it was called Parallel Models of Associative Memory.

00:23:04 And it brought all the people together

00:23:06 who had been thinking about these kinds of ideas

00:23:08 in 1979 or 1980.

00:23:11 And this began to kind of really resonate

00:23:18 with some of Rumelhart’s own thinking,

00:23:23 some of his reasons for wanting something

00:23:26 other than the kinds of computation

00:23:28 he’d been doing so far.

00:23:29 So let me talk about Rumelhart now for a minute,

00:23:32 OK, with that context.

00:23:33 Well, let me also just pause because you've

00:23:34 said so many interesting things before we go to Rumelhart.

00:23:37 So first of all, for people who are not familiar,

00:23:40 neural networks are at the core of the machine learning,

00:23:43 deep learning revolution of today.

00:23:45 Geoffrey Hinton, who we mentioned,

00:23:46 is one of the figures, like yourself, that were important

00:23:50 in the history and development of these neural networks,

00:23:53 artificial neural networks that are then

00:23:54 used for the machine learning application.

00:23:56 Like I mentioned, the backpropagation paper

00:23:59 describes one of the optimization mechanisms

00:24:02 by which these networks can learn.

00:24:05 And the word parallel is really interesting.

00:24:09 So it’s almost like synonymous from a computational

00:24:12 perspective how you thought at the time about neural networks

00:24:17 as parallel computation.

00:24:20 Would that be fair to say?

00:24:21 Well, yeah, the parallel, the word parallel in this

00:24:25 comes from the idea that each neuron is

00:24:30 an independent computational unit, right?

00:24:33 It gathers data from other neurons,

00:24:36 it integrates it in a certain way,

00:24:39 and then it produces a result. And it’s

00:24:41 a very simple little computational unit.

00:24:44 But it’s autonomous in the sense that it does its thing, right?

00:24:51 It’s in a biological medium where

00:24:53 it’s getting nutrients and various chemicals

00:24:57 from that medium.

00:25:00 But you can think of it as almost like a little computer

00:25:05 in and of itself.

00:25:08 So the idea is that our brains have, oh, look,

00:25:13 something like a hundred billion

00:25:17 of these little neurons, right?

00:25:21 And they’re all capable of doing their work at the same time.

00:25:25 So it’s like instead of just a single central processor that’s

00:25:30 engaged in chug one step after another,

00:25:36 we have a billion of these little computational units

00:25:41 working at the same time.
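To make that concrete, here is a minimal sketch of such a unit; the logistic squashing function and all the numbers are illustrative assumptions, not anything specified in the conversation:

```python
import numpy as np

rng = np.random.default_rng(0)

def unit_output(inputs, weights, bias):
    # One neuron-like unit: gather data from other units,
    # integrate it in a certain way, and produce a result.
    net = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-net))   # squash into an activation value

inputs = np.array([0.2, 0.9, 0.1])      # activity arriving from 3 other units
W = rng.normal(size=(5, 3))             # connection weights into 5 units
b = np.zeros(5)

# Each unit is autonomous; conceptually they all do their work at once.
layer = np.array([unit_output(inputs, W[i], b[i]) for i in range(5)])
print(layer)                            # five little computations, in parallel
```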

00:25:42 So at the time, and I don't know, maybe you can comment,

00:25:45 it seems to me, even still to me,

00:25:49 quite a revolutionary way to think about computation

00:25:52 relative to the development of theoretical computer science

00:25:56 alongside of that, where it's very much like a sequential computer.

00:26:00 You’re analyzing algorithms that are running on a single computer.

00:26:04 You’re saying, wait a minute, why don’t we

00:26:08 take a really dumb, very simple computer

00:26:11 and just have a lot of them interconnected together?

00:26:14 And they’re all operating in their own little world

00:26:16 and they’re communicating with each other

00:26:18 and thinking of computation that way.

00:26:21 And from that kind of computation,

00:26:24 trying to understand how things like certain characteristics

00:26:28 of the human mind can emerge.

00:26:31 That’s quite a revolutionary way of thinking, I would say.

00:26:35 Well, yes, I agree with you.

00:26:37 And there’s still this sort of sense

00:26:44 of not sort of knowing how we kind of get all the way there,

00:26:53 I think.

00:26:54 And this very much remains at the core of the questions

00:26:58 that everybody’s asking about the capabilities

00:27:01 of deep learning and all these kinds of things.

00:27:02 But if I could just play this out a little bit,

00:27:07 a convolutional neural network or a CNN,

00:27:11 which many people may have heard of, is a set of,

00:27:19 you could think of it biologically as a set of

00:27:24 collections of neurons.

00:27:27 Each collection has maybe 10,000 neurons in it.

00:27:33 But there’s many layers.

00:27:35 Some of these things are hundreds or even

00:27:38 1,000 layers deep.

00:27:39 But others are closer to the biological brain

00:27:43 and maybe they’re like 20 layers deep or something like that.

00:27:47 So within each layer, we have thousands of neurons

00:27:52 or tens of thousands maybe.

00:27:54 Well, in the brain, we probably have millions in each layer.

00:27:59 But we’re getting sort of similar in a certain way.

00:28:05 And then we think, OK, at the bottom level,

00:28:09 there’s an array of things that are like the photoreceptors.

00:28:12 In the eye, they respond to the amount

00:28:14 of light of a certain wavelength at a certain location

00:28:17 on the pixel array.

00:28:21 So that’s like the biological eye.

00:28:24 And then there’s several further stages going up,

00:28:27 layers of these neuron like units.

00:28:30 And you go from that raw input array of pixels

00:28:36 to the classification, you’ve actually

00:28:40 built a system that could do the same kind of thing

00:28:44 that you and I do when we open our eyes and we look around

00:28:46 and we see there’s a cup, there’s a cell phone,

00:28:49 there’s a water bottle.

00:28:52 And these systems are doing that now, right?

00:28:54 So they are, in terms of the parallel idea

00:29:00 that we were talking about before,

00:29:02 they are doing this massively parallel computation

00:29:05 in the sense that each of the neurons in each

00:29:08 of those layers is thought of as computing

00:29:12 its little bit of something about the input

00:29:17 simultaneously with all the other ones in the same layer.

00:29:21 We get to the point of abstracting that away

00:29:24 and thinking, oh, it’s just one whole vector that’s

00:29:27 being computed, one activation pattern that’s

00:29:30 computed in a single step.

00:29:32 And that abstraction is useful, but it's still that parallel

00:29:39 And distributed processing, right?

00:29:41 Each one of these guys is just contributing

00:29:43 a tiny bit to that whole thing.
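As a point of reference, a minimal sketch of such a network in PyTorch; the layer counts and sizes are arbitrary illustrative choices, not a claim about any particular system:

```python
import torch
import torch.nn as nn

# Pixel array in, class scores out; each layer is thousands of neuron-like
# units, all computing their little piece of the input simultaneously.
net = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),   # early, local features
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # more abstract, deeper in
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 2),                     # "cat" vs. "dog" scores
)

image = torch.randn(1, 3, 32, 32)   # a stand-in for a 32x32 RGB input
scores = net(image)                 # each layer's activation pattern is one
print(scores)                       # whole vector, computed in a single step
```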

00:29:45 And that’s the excitement that you felt,

00:29:46 that from these simple things, so much can emerge.

00:29:50 When you add these levels of abstraction on it,

00:29:53 you can start getting all the beautiful things

00:29:56 that we think about as cognition.

00:29:58 And so, OK, so you have this conference.

00:30:01 I forgot the name already, but it’s

00:30:02 Parallel and Something Associative Memory and so on.

00:30:05 Very exciting, technical and exciting title.

00:30:08 And you started talking about Dave Rumelhart.

00:30:11 So who was this person?

00:30:15 You've spoken very highly of him.

00:30:17 Can you tell me about him, his ideas, his mind, who he was

00:30:22 as a human being, as a scientist?

00:30:24 So Dave came from a little tiny town in Western South Dakota.

00:30:31 And his mother was the librarian,

00:30:35 and his father was the editor of the newspaper.

00:30:41 And I know one of his brothers pretty well.

00:30:46 They grew up, there were four brothers,

00:30:49 and they grew up together.

00:30:53 And their father encouraged them to compete with each other

00:30:56 a lot.

00:30:58 They competed in sports, and they competed in mind games.

00:31:04 I don’t know, things like Sudoku and chess and various things

00:31:07 like that.

00:31:08 And Dave was a standout undergraduate.

00:31:16 He went at a younger age than most people

00:31:20 do to college at the University of South Dakota

00:31:23 and majored in mathematics.

00:31:24 And I don’t know how he got interested in psychology,

00:31:30 but he applied to the mathematical psychology

00:31:33 program at Stanford and was accepted as a PhD student

00:31:37 to study mathematical psychology at Stanford.

00:31:40 So mathematical psychology is the use of mathematics

00:31:46 to model mental processes.

00:31:50 So something that I think these days

00:31:52 might be called cognitive modeling, that whole space.

00:31:55 Yeah, it’s mathematical in the sense

00:31:57 that you say, if this is true and that is true,

00:32:05 then I can derive that this should follow.

00:32:08 And so you say, these are my stipulations

00:32:10 about the fundamental principles,

00:32:12 and this is my prediction about behavior.

00:32:15 And it’s all done with equations.

00:32:16 It’s not done with a computer simulation.

00:32:19 So you solve the equation, and that tells you

00:32:23 what the probability that the subject

00:32:26 will be correct on the seventh trial of the experiment is

00:32:29 or something like that.

00:32:30 So it’s a use of mathematics to descriptively characterize

00:32:37 aspects of behavior.
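One classic example of the genre (illustrative here, not necessarily a model Rumelhart himself worked on) is the linear-operator learning model, where each trial removes a fixed fraction of the remaining error, so the equation directly yields the probability of a correct response on any trial:

```python
# Linear-operator learning model: each trial removes a fixed fraction
# theta of the remaining error probability, so
#     P(correct on trial n) = 1 - (1 - p1) * (1 - theta) ** (n - 1)
def p_correct(n, p1=0.25, theta=0.2):
    return 1 - (1 - p1) * (1 - theta) ** (n - 1)

# Solving the equation predicts behavior, e.g. on the seventh trial
# (the parameter values are made up for illustration):
print(round(p_correct(7), 3))   # -> 0.803
```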

00:32:39 And Stanford at that time was the place

00:32:43 where there were several really, really strong

00:32:48 mathematical thinkers who were also connected with three

00:32:51 or four others around the country who brought

00:32:55 a lot of really exciting ideas onto the table.

00:32:59 And it was a very, very prestigious part

00:33:02 of the field of psychology at that time.

00:33:05 So Rumelhart comes into this.

00:33:08 He was a very strong student within that program.

00:33:13 And he got this job at this brand new university

00:33:19 in San Diego in 1967, where he’s one of the first assistant

00:33:24 professors in the Department of Psychology at UCSD.

00:33:30 So I got there in 74, seven years later,

00:33:37 and Rumelhart at that time was still

00:33:43 doing mathematical modeling.

00:33:48 But he had gotten interested in cognition.

00:33:53 He’d gotten interested in understanding.

00:33:58 And understanding, I think, remains,

00:34:04 well, what does it mean to understand, anyway?

00:34:08 It's an interesting, curious question:

00:34:11 how would we know if we really understood something?

00:34:14 But he was interested in building machines

00:34:18 that would hear a couple of sentences

00:34:21 and have an insight about what was going on.

00:34:23 So for example, one of his favorite things at that time

00:34:26 was, Margie was sitting on the front step

00:34:32 when she heard the familiar jingle of the Good Humor man.

00:34:38 She remembered her birthday money and ran into the house.

00:34:42 What is Margie doing?

00:34:44 Why?

00:34:47 Well, there’s a couple of ideas you could have,

00:34:50 but the most natural one is that the Good Humor

00:34:53 man brings ice cream.

00:34:55 She likes ice cream.

00:34:57 She knows she needs money to buy ice cream,

00:34:59 so she’s going to run into the house and get her money

00:35:02 so she can buy herself an ice cream.

00:35:03 It’s a huge amount of inference that

00:35:05 has to happen to get those things to link up

00:35:07 with each other.

00:35:09 And he was interested in how the hell that could happen.

00:35:13 And he was trying to build good old fashioned AI style

00:35:20 models of the representation of language and content, with predicates

00:35:30 like "has money."

00:35:32 So like formal logic and knowledge bases,

00:35:35 like that kind of stuff.

00:35:36 So he was integrating that with his thinking about cognition.

00:35:40 The mechanisms of cognition, how can they mechanistically

00:35:45 be applied to build these knowledge,

00:35:46 like to actually build something that

00:35:49 looks like a web of knowledge and thereby from there emerges

00:35:54 something like understanding, whatever the heck that is.

00:35:57 Yeah, he was grappling.

00:35:59 This was something that they grappled

00:36:01 with at the end of that book that I was describing,

00:36:04 Explorations in Cognition.

00:36:06 But he was realizing that the paradigm of good old fashioned

00:36:11 AI wasn’t giving him the answers to these questions.

00:36:16 By the way, that’s called good old fashioned AI now.

00:36:18 It wasn’t called that at the time.

00:36:20 Well, it was.

00:36:21 It was beginning to be called that.

00:36:23 Oh, because it was from the 60s.

00:36:24 Yeah, yeah.

00:36:26 By the late 70s, it was kind of old fashioned,

00:36:28 and it hadn’t really panned out.

00:36:30 And people were beginning to recognize that.

00:36:34 And Rumelhart was like, yeah, he was part of the recognition

00:36:37 that this wasn’t all working.

00:36:39 Anyway, so he started thinking in terms of the idea

00:36:48 that we needed systems that allowed us to integrate

00:36:52 multiple simultaneous constraints in a way that would

00:36:56 let them mutually influence each other.

00:37:00 So he wrote a paper that, really, the first time I read it,

00:37:07 I said, oh, well, yeah, but is this important?

00:37:11 But after a while, it just got under my skin.

00:37:15 And it was called An Interactive Model of Reading.

00:37:18 And in this paper, he laid out the idea

00:37:21 that every aspect of our interpretation of what’s

00:37:34 coming off the page when we read at every level of analysis

00:37:40 you can think of actually depends

00:37:42 on all the other levels of analysis.

00:37:45 So what are the actual pixels making up each letter?

00:37:53 And what do those pixels signify about which letters they are?

00:38:00 And what do those letters tell us about what words are there?

00:38:05 And what do those words tell us about what ideas

00:38:09 the author is trying to convey?

00:38:12 And so he had this model where we

00:38:18 have these little tiny elements that represent

00:38:25 each of the pixels of each of the letters,

00:38:29 and then other ones that represent the line segments

00:38:31 in them, and other ones that represent the letters,

00:38:33 and other ones that represent the words.

00:38:36 And at that time, his idea was there’s this set of experts.

00:38:43 There’s an expert about how to construct a line out of pixels,

00:38:48 and another expert about which sets of lines

00:38:51 go together to make which letters,

00:38:53 and another one about which letters go together

00:38:55 to make which words, and another one about what

00:38:58 the meanings of the words are, and another one about how

00:39:01 the meanings fit together, and things like that.

00:39:04 And all these experts are looking at this data,

00:39:06 and they’re updating hypotheses at other levels.

00:39:12 So the word expert can tell the letter expert,

00:39:15 oh, I think there should be a T there,

00:39:17 because I think the word "the" should be here.

00:39:20 And the bottom up sort of feature to letter expert

00:39:23 could say, I think there should be a T there, too.

00:39:25 And if they agree, then you see a T, right?

00:39:28 And so there’s a top down, bottom up interactive process,

00:39:32 but it’s going on at all layers simultaneously.

00:39:34 So everything can filter all the way down from the top,

00:39:37 as well as all the way up from the bottom.

00:39:39 And it’s a completely interactive, bidirectional,

00:39:42 parallel distributed process.

00:39:45 That is somehow, because of the abstractions, it’s hierarchical.

00:39:48 So there’s different layers of responsibilities,

00:39:52 different levels of responsibilities.

00:39:54 First of all, it’s fascinating to think about it

00:39:56 in this kind of mechanistic way.

00:39:58 So not thinking purely from the structure

00:40:02 of a neural network or something like a neural network,

00:40:04 but thinking about these little guys

00:40:06 that work on letters, and then the letters become words

00:40:09 and words become sentences.

00:40:11 And that’s a very interesting hypothesis

00:40:14 that from that kind of hierarchical structure

00:40:18 can emerge understanding.

00:40:21 Yeah, so, but the thing is, though,

00:40:23 I wanna just sort of relate this

00:40:25 to the earlier part of the conversation.

00:40:28 When Rumelhart was first thinking about it,

00:40:31 there were these experts on the side,

00:40:34 one for the features and one for the letters

00:40:36 and one for how the letters make the words and so on.

00:40:39 And they would each be working,

00:40:43 sort of evaluating various propositions about,

00:40:46 you know, is this combination of features here

00:40:48 going to be one that looks like the letter T and so on.

00:40:52 And what he realized,

00:40:56 kind of after reading Hinton’s dissertation

00:40:59 and hearing about Jim Anderson’s

00:41:03 linear algebra based neural network models

00:41:06 that I was telling you about before

00:41:07 was that he could replace those experts

00:41:10 with neuron like processing units,

00:41:12 which just would have their connection weights

00:41:14 that would do this job.

00:41:16 So what ended up happening was

00:41:20 that Rumelhart and I got together

00:41:22 and we created a model

00:41:24 called the interactive activation model of letter perception,

00:41:29 which takes these little pixel level inputs,

00:41:35 constructs line segment features, letters and words.

00:41:41 But now we built it out of a set of neuron

00:41:44 like processing units that are just connected

00:41:47 to each other with connection weights.

00:41:49 So the unit for the word time has a connection

00:41:53 to the unit for the letter T in the first position

00:41:56 and the letter I in the second position, and so on.

00:41:59 And because these connections are bidirectional,

00:42:05 if you have prior knowledge that it might be the word time,

00:42:08 that starts to prime the letters and the features.

00:42:12 And if you don’t, then it has to start bottom up.

00:42:14 But the directionality just depends

00:42:17 on where the information comes in first.

00:42:19 And if you have context together

00:42:22 with features at the same time,

00:42:24 they can convergently result in an emergent perception.
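A heavily simplified sketch of that interactive activation idea, with one word unit for "time" sharing bidirectional weights with its letter units; the update rule and all the numbers are illustrative assumptions, not the published model's equations:

```python
import numpy as np

# Bottom-up evidence for the letters T, I, M, E (say, a clearly
# printed T and smudged remaining letters).
letter_act = np.array([0.8, 0.2, 0.2, 0.2])
word_act = 0.0          # activation of the "time" word unit
w = 0.2                 # one bidirectional weight, letters <-> word

for step in range(10):
    # bottom-up: the letters excite the word they spell
    word_act = min(1.0, word_act + w * letter_act.mean())
    # top-down: the word unit primes all of its letters
    letter_act = np.minimum(1.0, letter_act + w * word_act)

print(word_act, letter_act)   # context and features converge on a percept
```

The same connections carry influence in both directions, so whether processing runs top-down or bottom-up depends only on where information arrives first.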

00:42:27 And that was the piece of work that we did together

00:42:35 that sort of got us both completely convinced

00:42:41 that this neural network way of thinking

00:42:44 was going to be able to actually address the questions

00:42:48 that we were interested in as cognitive psychologists.

00:42:50 So the algorithmic side, the optimization side,

00:42:53 those are all details. When you first start, the idea

00:42:56 that you can get far with this kind of way of thinking,

00:42:59 that in itself is a profound idea.

00:43:01 So do you like the term connectionism

00:43:05 to describe this kind of set of ideas?

00:43:07 I think it’s useful.

00:43:10 It highlights the notion that the knowledge

00:43:15 that the system exploits is in the connections

00:43:19 between the units, right?

00:43:21 There isn’t a separate dictionary.

00:43:24 There’s just the connections between the units.

00:43:27 So I already sort of laid that on the table

00:43:31 with the connections from the letter units

00:43:34 to the unit for the word time, right?

00:43:36 The unit for the word time isn’t a unit for the word time

00:43:40 for any other reason than it’s got the connections

00:43:43 to the letters that make up the word time.

00:43:46 Those are the units on the input that excite it,

00:43:48 and when it's excited, it in a sense represents

00:43:52 in the system that there’s support for the hypothesis

00:43:57 that the word time is present in the input.

00:44:01 But it’s not, the word time isn’t written anywhere

00:44:07 inside the model, it's only written there

00:44:09 in the picture we drew of the model

00:44:11 to say that’s the unit for the word time, right?

00:44:14 And if somebody wants to ask me,

00:44:18 well, how do you spell that word?

00:44:21 You have to use the connections from that out

00:44:24 to then get those letters, for example.

00:44:27 That’s such a, that’s a counterintuitive idea

00:44:31 where humans want to think in this logical way.

00:44:36 This idea of connectionism, it doesn’t, it’s weird.

00:44:41 It’s weird that this is how it all works.

00:44:43 Yeah, but let’s go back to that CNN, right?

00:44:46 That CNN with all those layers of neuron

00:44:48 like processing units that we were talking about before,

00:44:51 it’s gonna come out and say, this is a cat, that’s a dog,

00:44:55 but it has no idea why it said that.

00:44:57 It’s just got all these connections

00:44:59 between all these layers of neurons,

00:45:02 like from the very first layer to the,

00:45:04 you know, like whatever these layers are,

00:45:07 they just get numbered after a while

00:45:09 because, you know, the further in you go,

00:45:13 the more abstract the features are,

00:45:17 but it’s a graded and continuous sort of process

00:45:20 of abstraction anyway.

00:45:21 And, you know, it goes from very local,

00:45:24 very specific to much more sort of global,

00:45:28 but it’s still, you know, another sort of pattern

00:45:32 of activation over an array of units.

00:45:33 And then at the output side, it says it’s a cat

00:45:36 or it’s a dog.

00:45:37 And when I open my eyes and say, oh, that’s Lex,

00:45:42 or, oh, you know, there’s my own dog

00:45:47 and I recognize my dog,

00:45:50 which is a member of the same species as many other dogs,

00:45:53 but I know this one

00:45:54 because of some slightly unique characteristics.

00:45:57 I don’t know how to describe what it is

00:46:00 that makes me know that I’m looking at Lex

00:46:02 or at my particular dog, right?

00:46:04 Or even that I’m looking at a particular brand of car.

00:46:07 Like I can say a few words about it,

00:46:09 but if I wrote you a paragraph about the car,

00:46:12 you would have trouble figuring out

00:46:14 which car I'm talking about, right?

00:46:16 So the idea that we have propositional knowledge

00:46:19 of what it is that allows us to recognize

00:46:23 that this is an actual instance

00:46:25 of this particular natural kind

00:46:27 has always been something that never worked, right?

00:46:36 You couldn’t ever write down a set of propositions

00:46:38 for visual recognition.

00:46:41 And so in that space, it sort of always seemed very natural

00:46:46 that something more implicit is at work;

00:46:51 you don’t have access to what the details

00:46:54 of the computation were in between,

00:46:56 you just get the result.

00:46:58 So that’s the other part of connectionism,

00:47:00 you cannot, you don’t read the contents of the connections,

00:47:04 the connections only cause outputs to occur

00:47:08 based on inputs.

00:47:09 Yeah, and for us that like final layer

00:47:13 or some particular layer is very important,

00:47:16 the one that tells us that it’s our dog

00:47:19 or like it’s a cat or a dog,

00:47:22 but each layer is probably equally as important

00:47:25 in the grand scheme of things.

00:47:27 Like there’s no reason why the cat versus dog

00:47:30 is more important than the lower level activations,

00:47:33 it doesn’t really matter.

00:47:34 I mean, all of it is just this beautiful stacking

00:47:36 on top of each other.

00:47:37 And we humans live in these particular layers,

00:47:40 for us it’s useful to survive,

00:47:43 to use those cat versus dog, predator versus prey,

00:47:47 all those kinds of things.

00:47:49 It’s fascinating that it’s all continuous,

00:47:51 but then you then ask,

00:47:53 the history of artificial intelligence, you ask,

00:47:55 are we able to introspect and convert the very things

00:47:59 that allow us to tell the difference between cat and dog

00:48:02 into a logic, into formal logic?

00:48:05 That’s been the dream.

00:48:06 I would say that’s still part of the dream of symbolic AI.

00:48:10 And I’ve recently talked to Doug Lenat who created Psych

00:48:19 and that’s a project that lasted for many decades

00:48:23 and still carries a sort of dream in it, right?

00:48:28 But we still don’t know the answer, right?

00:48:30 It seems like connectionism is really powerful,

00:48:34 but it also seems like there’s this building of knowledge.

00:48:38 And so how do we, how do you square those two?

00:48:41 Like, do you think the connections can contain

00:48:44 the depth of human knowledge and the depth

00:48:46 of understanding that Dave Rumelhart was thinking about?

00:48:51 Well, that remains the $64 question.

00:48:55 And I…

00:48:58 With inflation, that number is higher.

00:48:59 Okay, $64,000.

00:49:01 Maybe it’s the $64 billion question now.

00:49:08 You know, I think that from the emergentist side,

00:49:13 which, you know, I placed myself on.

00:49:23 So I used to sometimes tell people

00:49:26 I was a radical, eliminative connectionist

00:49:29 because I didn’t want them to think

00:49:34 that I wanted to build like anything into the machine.

00:49:38 But I don’t like the word eliminative anymore

00:49:45 because it makes it seem like it’s wrong to think

00:49:51 that there is this emergent level of understanding.

00:49:55 And I disagree with that.

00:50:00 So I think, you know, I would call myself

00:50:02 a radical emergentist connectionist

00:50:06 rather than eliminative connectionist, right?

00:50:09 Because I want to acknowledge

00:50:12 that these higher level kinds of aspects

00:50:17 of our cognition are real, but they’re not,

00:50:26 they don’t exist as such.

00:50:29 And there was an example that Doug Hofstadter used to use

00:50:33 that I thought was helpful in this respect.

00:50:36 Just the idea that we can think about sand dunes

00:50:42 as entities and talk about like how many there are even.

00:50:51 But we also know that a sand dune is a very fluid thing.

00:50:56 It’s a pile of sand that is capable

00:51:00 of moving around under the wind and reforming itself

00:51:08 in somewhat different ways.

00:51:10 And if we think about our thoughts as like sand dunes,

00:51:13 as being things that emerge from just the way

00:51:19 all the lower level elements sort of work together

00:51:22 and are constrained by external forces,

00:51:26 then we can say, yes, they exist as such,

00:51:29 but they also, we shouldn’t treat them

00:51:34 as completely monolithic entities that we can understand

00:51:40 without understanding sort of all of the stuff

00:51:43 that allows them to change in the ways that they do.

00:51:47 And that’s where I think the connectionist

00:51:49 feeds into the cognitive.

00:51:52 It’s like, okay, so if the substrate

00:51:55 is parallel distributed connectionist, then it doesn’t mean

00:52:01 that the contents of thought isn’t like abstract

00:52:05 and symbolic, but it's more fluid, maybe,

00:52:10 than is easy to capture

00:52:13 with a set of logical expressions.

00:52:15 Yeah, that’s a heck of a sort of thing

00:52:17 to put at the top of a resume,

00:52:20 radical, emergentist, connectionist.

00:52:23 So there is, just like you said, a beautiful dance

00:52:26 between that, between the machinery of intelligence,

00:52:30 like the neural network side of it,

00:52:32 and the stuff that emerges.

00:52:34 I mean, the stuff that emerges seems to be,

00:52:40 I don’t know, I don’t know what that is,

00:52:44 that it seems like maybe all of reality is emergent.

00:52:48 What I think about, this is made most distinctly rich to me

00:52:57 when I look at cellular automata, look at the Game of Life,

00:53:01 that from very, very simple things,

00:53:03 very rich, complex things emerge

00:53:06 that start looking very quickly like organisms

00:53:10 that you forget how the actual thing operates.

00:53:13 They start looking like they’re moving around,

00:53:15 they’re eating each other,

00:53:16 some of them are generating offspring.

00:53:20 You forget very quickly.

00:53:21 And it seems like maybe it’s something

00:53:23 about the human mind that wants to operate

00:53:26 in some layer of the emergent,

00:53:28 and forget about the mechanism

00:53:30 of how that emergence happens.
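For reference, the Game of Life being described fits in a few lines; all the organism-like behavior emerges from this one local birth-and-survival rule applied everywhere at once:

```python
import numpy as np

def life_step(grid):
    # Count the 8 neighbors of every cell using shifted copies of the grid.
    n = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0))
    born = (grid == 0) & (n == 3)                  # birth: exactly 3 neighbors
    survives = (grid == 1) & ((n == 2) | (n == 3)) # survival: 2 or 3 neighbors
    return (born | survives).astype(np.uint8)

grid = (np.random.rand(32, 32) < 0.3).astype(np.uint8)  # random soup
for _ in range(100):
    grid = life_step(grid)   # gliders, blinkers, "organisms" emerge from this
```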

00:53:32 So, just like you are in your radicalness,

00:53:35 I also feel it seems unfair

00:53:39 to eliminate the magic of that emergence,

00:53:43 to eliminate the fact that the emergent is real.

00:53:48 Yeah, no, I agree.

00:53:49 I’m not, that’s why I got rid of eliminative, right?

00:53:53 Eliminative, yeah.

00:53:54 Yeah, because it seemed like that was trying to say

00:53:56 that it’s all completely like.

00:54:01 An illusion of some kind, it’s not.

00:54:03 Well, who knows whether there isn’t,

00:54:06 there aren’t some illusory characteristics there.

00:54:08 And I think that philosophically many people

00:54:15 have confronted that possibility over time,

00:54:17 but it’s still important to accept it as magic, right?

00:54:26 So, I think of Fellini in this context,

00:54:30 I think of others who have appreciated the role of magic,

00:54:35 the role of magic, of actual trickery

00:54:39 in creating illusions that move us.

00:54:45 And Plato was on to this too.

00:54:47 It’s like somehow or other these shadows

00:54:52 give rise to something much deeper than that.

00:54:55 And that’s, so we won’t try to figure out what it is.

00:55:01 We’ll just accept it as given that that occurs.

00:55:04 And, you know, but he was still onto the magic of it.

00:55:08 Yeah, yeah, we won’t try to really, really,

00:55:11 really deeply understand how it works.

00:55:14 We’ll just enjoy the fact that it’s kind of fun.

00:55:16 Okay, but you worked closely with Dave Rumelhart.

00:55:21 He passed away. As a human being,

00:55:24 what do you remember about him?

00:55:27 Do you miss the guy?

00:55:28 Absolutely, you know, he passed away 15ish years ago now.

00:55:38 And his demise was actually one of the most poignant

00:55:43 and, you know, like relevant tragedies, relevant to our conversation.

00:55:52 He started to undergo a progressive neurological condition

00:56:03 that isn’t far from what we’re used to.

00:56:08 A neurological condition that isn’t fully understood.

00:56:15 That is to say his particular course isn’t fully understood

00:56:23 because, you know, brain scans weren’t done at certain stages

00:56:28 and no autopsy was done or anything like that.

00:56:32 The wishes of the family.

00:56:34 We don’t know as much about the underlying pathology as we might,

00:56:38 but I had begun to get interested in this neurological condition

00:56:48 that might have been the very one that he was succumbing to

00:56:52 as my own efforts to understand another aspect of this mystery

00:56:57 that we’ve been discussing while he was beginning

00:57:01 to get progressively more and more affected.

00:57:04 So I’m going to talk about the disorder

00:57:06 and not about Rumelhart for a second, okay?

00:57:09 The disorder is something my colleagues and collaborators

00:57:12 have chosen to call semantic dementia.

00:57:17 So it’s a specific form of loss of mind

00:57:23 related to meaning, semantic dementia.

00:57:27 And it’s progressive in the sense that the patient loses the ability

00:57:37 to appreciate the meaning of the experiences that they have,

00:57:44 either from touch, from sight, from sound, from language.

00:57:50 It's an "I hear sounds, but I don't know what they mean" kind of thing.

00:57:56 So as this illness progresses, it starts with the patient

00:58:04 being unable to differentiate like similar breeds of dog

00:58:12 or remember the lower frequency unfamiliar categories

00:58:18 that they used to be able to remember.

00:58:21 But as it progresses, it becomes more and more striking

00:58:27 and the patient loses the ability to recognize things like

00:58:36 pigs and goats and sheep and calls all middle sized animals dogs

00:58:42 and can’t recognize rabbits and rodents anymore.

00:58:46 They call all the little ones cats

00:58:49 and they can’t recognize hippopotamuses and cows anymore.

00:58:53 They call them all horses.

00:58:55 So there was this one patient who went through this progression

00:59:00 where at a certain point, any four legged animal,

00:59:03 he would call it either a horse or a dog or a cat.

00:59:07 And if it was big, he would tend to call it a horse.

00:59:10 If it was small, he’d tend to call it a cat.

00:59:12 Middle sized ones, he called dogs.

00:59:16 This is just a part of the syndrome though.

00:59:19 The patient loses the ability to relate concepts to each other.

00:59:25 So my collaborator in this work, Karalyn Patterson,

00:59:28 developed a test called the pyramids and palm trees test.

00:59:34 So you give the patient a picture of pyramids

00:59:39 and they have a choice which goes with the pyramids,

00:59:42 palm trees or pine trees.

00:59:46 And she showed that this wasn’t just a matter of language

00:59:50 because the patient’s loss of this ability shows up

00:59:55 whether you present the material with words or with pictures.

00:59:59 The pictures, they can’t put the pictures together

01:00:03 with each other properly anymore.

01:00:05 They can’t relate the pictures to the words either.

01:00:07 They can’t do word picture matching.

01:00:09 But they’ve lost the conceptual grounding

01:00:12 from either modality of input.

01:00:15 And so that’s why it’s called semantic dementia.

01:00:19 The very semantics is disintegrating.

01:00:22 And we understand this in terms of our idea

01:00:27 that a distributed representation, a pattern of activation,

01:00:31 represents a concept, and really similar concepts have similar patterns.

01:00:33 As you degrade the patterns,

01:00:36 you lose the differences.

01:00:40 So the difference between the dog and the goat

01:00:42 is no longer part of the pattern anymore.

01:00:44 And since dog is really familiar,

01:00:47 that’s the thing that remains.

01:00:49 And we understand that in the way the models work and learn.
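A toy illustration of that account (the feature vectors and the familiarity weighting are invented for the example): similar concepts share most of their distributed pattern, and as damage removes the few distinguishing features, everything collapses onto the most familiar category:

```python
import numpy as np

rng = np.random.default_rng(0)

shared = np.ones(8)                    # features all these animals share
concepts = {                           # plus a few distinguishing features
    "dog":  np.concatenate([shared, [1, 0, 0]]),
    "goat": np.concatenate([shared, [0, 1, 0]]),
    "pig":  np.concatenate([shared, [0, 0, 1]]),
}
familiarity = {"dog": 1.05, "goat": 1.0, "pig": 1.0}  # dog is most familiar

def degrade(pattern, p):
    # Zero out each feature with probability p as the damage progresses.
    return pattern * (rng.random(pattern.size) > p)

for p in (0.0, 0.4, 0.8):
    noisy = degrade(concepts["goat"], p)
    # Classify by best match; once the distinguishing features are gone,
    # the familiar "dog" pattern wins the competition.
    best = max(concepts, key=lambda c: familiarity[c] * np.dot(noisy, concepts[c]))
    print(f"damage {p:.1f}: goat -> {best}")
```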

01:00:52 But Rumelhart underwent this condition.

01:00:57 So on the one hand, it’s a fascinating aspect

01:01:00 of parallel distributed processing to me.

01:01:03 It reveals this sort of texture of distributed representation

01:01:08 in a very nice way, I’ve always felt.

01:01:11 But at the same time, it was extremely poignant

01:01:13 because this is exactly the condition

01:01:16 that Rumelhart was undergoing.

01:01:18 And there was a period of time when he was this man

01:01:22 who had been the most focused, goal directed, competitive,

01:01:35 thoughtful person who was willing to work for years

01:01:41 to solve a hard problem, he starts to disappear.

01:01:48 And there was a period of time when it was hard for any of us

01:01:57 to really appreciate that he was sort of, in some sense,

01:02:00 not fully there anymore.

01:02:04 Do you know if he was able to introspect

01:02:07 the dissolution of his own mind?

01:02:14 I mean, this is one of the big scientists that thinks about this.

01:02:19 Was he able to look at himself and understand the fading mind?

01:02:24 You know, we can contrast Hawking and Rumelhart in this way.

01:02:31 And I like to do that to honor Rumelhart

01:02:33 because I think Rumelhart is sort of like the Hawking

01:02:36 of cognitive science to me in some ways.

01:02:40 Both of them suffered from a degenerative condition.

01:02:45 In Hawking’s case, it affected the motor system.

01:02:49 In Rumelhart’s case, it’s affecting the semantics.

01:02:54 And not just the pure object semantics,

01:03:01 but maybe the self semantics as well.

01:03:04 And we don’t understand that.

01:03:06 Concepts broadly.

01:03:08 So I would say he didn’t.

01:03:13 And this was part of what, from the outside,

01:03:16 was a profound tragedy.

01:03:18 But on the other hand, at some level, he sort of did

01:03:22 because there was a period of time when it finally was realized

01:03:28 that he had really become profoundly impaired.

01:03:32 This was clearly a biological condition.

01:03:35 It wasn’t just like he was distracted that day or something like that.

01:03:39 So he retired from his professorship at Stanford

01:03:44 and he lived with his brother for a couple of years

01:03:51 and then he moved into a facility for people with cognitive impairments.

01:04:00 One that many elderly people end up in when they have cognitive impairments.

01:04:06 And I would spend time with him during that period.

01:04:12 This was like in the late 90s, around 2000 even.

01:04:16 And we would go bowling and he could still bowl.

01:04:25 And after bowling, I took him to lunch and I said,

01:04:32 where would you like to go?

01:04:34 You want to go to Wendy’s?

01:04:35 And he said, nah.

01:04:37 And I said, okay, well, where do you want to go?

01:04:38 And he just pointed.

01:04:40 He said, turn here.

01:04:41 So he still had a certain amount of spatial cognition

01:04:44 and he could get me to the restaurant.

01:04:47 And then when we got to the restaurant, I said,

01:04:51 what do you want to order?

01:04:53 And he couldn’t come up with any of the words,

01:04:56 but he knew where on the menu the thing was that he wanted.

01:04:59 So it’s, you know, and he couldn’t say what it was,

01:05:04 but he knew that that’s what he wanted to eat.

01:05:07 And so it’s like it isn’t monolithic at all.

01:05:14 Our cognition is, you know, first of all, graded in certain kinds of ways,

01:05:21 but also multipartite and there’s many elements to it and things,

01:05:27 certain sort of partial competencies still exist

01:05:31 in the absence of other aspects of these competencies.

01:05:36 So this is what always fascinated me about what used to be called

01:05:43 cognitive neuropsychology, you know,

01:05:46 the effects of brain damage on cognition.

01:05:49 But in particular, this gradual disintegration part.

01:05:53 You know, I’m a big believer that the loss of a human being that you value

01:05:59 is as powerful as, you know, first falling in love with that human being.

01:06:03 I think it’s all a celebration of the human being.

01:06:06 So the disintegration itself too is a celebration in a way.

01:06:10 Yeah, yeah.

01:06:12 But just to say something more about the scientist

01:06:17 and the backpropagation idea that you mentioned.

01:06:22 So in 1982, Hinton had been there as a postdoc and organized that conference.

01:06:34 He’d actually gone away and gotten an assistant professorship

01:06:37 and then there was this opportunity to bring him back.

01:06:41 So Jeff Hinton was back on a sabbatical.

01:06:45 In San Diego.

01:06:46 And Rumelhart and I had decided we wanted to do this, you know,

01:06:52 we thought it was really exciting and the papers on the interactive activation model

01:06:58 that I was telling you about had just been published

01:07:00 and we both sort of saw a huge potential for this work and Jeff was there.

01:07:06 And so the three of us started a research group,

01:07:11 which we called the PDP Research Group.

01:07:13 And several other people came.

01:07:17 Francis Crick, who was at the Salk Institute, heard about it from Jeff

01:07:22 because Jeff was known among Brits to be brilliant

01:07:27 and Francis was well connected with his British friends.

01:07:30 So Francis Crick came.

01:07:32 That’s a heck of a group of people, wow.

01:07:34 And Paul Smolensky was one of the other postdocs.

01:07:40 He was still there as a postdoc.

01:07:41 And a few other people.

01:07:45 But anyway, Jeff talked to us about learning

01:07:56 and how we should think about how, you know, learning occurs in a neural network.

01:08:06 And he said, the problem with the way you guys have been approaching this

01:08:12 is that you’ve been looking for inspiration from biology

01:08:17 to tell you what the rules should be for how the synapses should change

01:08:22 the strengths of their connections, how the connections should form.

01:08:27 He said, that’s the wrong way to go about it.

01:08:30 What you should do is you should think in terms of

01:08:36 how you can adjust connection weights to solve a problem.

01:08:44 So you define your problem and then you figure out

01:08:49 how the adjustment of the connection weights will solve the problem.

01:08:54 And Rumelhart heard that and said to himself, okay,

01:09:01 so I’m going to start thinking about it that way.

01:09:04 I’m going to essentially imagine that I have some objective function,

01:09:11 some goal of the computation.

01:09:14 I want my machine to correctly classify all of these images.

01:09:19 And I can score that.

01:09:21 I can measure how well they’re doing on each image.

01:09:24 And I get some measure of error, or loss as it's typically called in deep learning.

01:09:30 And I’m going to figure out how to adjust the connection weights

01:09:35 so as to minimize my loss or reduce the error.

01:09:41 And that’s called, you know, gradient descent.

01:09:47 And engineers were already familiar with the concept of gradient descent.

01:09:53 And in fact, there was an algorithm called the delta rule

01:09:58 that had been invented by a professor in the electrical engineering department

01:10:07 at Stanford, Bernie Widrow and a collaborator named Hoff.

01:10:11 I never met him.

01:10:13 So gradient descent in continuous neural networks

01:10:19 with multiple neuron like processing units was already understood

01:10:26 for a single layer of connection weights.

01:10:29 We have some inputs over a set of neurons.

01:10:32 We want the output to produce a certain pattern.

01:10:35 We can define the difference between our target

01:10:38 and what the neural network is producing.

01:10:41 And we can figure out how to change the connection weights to reduce that error.
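
A minimal sketch of that single-layer case, in the spirit of the Widrow-Hoff delta rule, with linear output units and patterns invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(0, 0.1, (3, 4))        # one layer of weights: 4 inputs -> 3 outputs

x = np.array([1.0, 0.0, 1.0, 0.5])    # input pattern over a set of units
target = np.array([1.0, 0.0, 0.0])    # the pattern we want the output to produce
lr = 0.1

for _ in range(100):
    y = W @ x                          # linear output units
    error = target - y                 # difference between target and actual output
    W += lr * np.outer(error, x)       # delta rule: dW[i, j] = lr * error[i] * x[j]

print(np.round(W @ x, 3))              # the output now matches the target pattern
```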

01:10:44 So what Rumelhart did was to generalize that

01:10:49 so as to be able to change the connections from earlier layers of units

01:10:53 to the ones at a hidden layer between the input and the output.

01:10:58 And so he first called the algorithm the generalized delta rule

01:11:03 because it’s just an extension of the gradient descent idea.

01:11:08 And interestingly enough, Hinton was thinking that this wasn’t going to work very well.

01:11:15 So Hinton had his own alternative algorithm at the time

01:11:20 based on the concept of the Boltzmann machine that he was pursuing.

01:11:24 So the paper on learning in Boltzmann machines

01:11:27 came out in 1985.

01:11:31 But it turned out that back prop worked better than the Boltzmann machine learning algorithm.

01:11:37 So this generalized delta algorithm ended up being called back propagation, as you say, back prop.

01:11:44 Yeah. And that name is a bit opaque to me.

01:11:50 What does that mean?

01:11:53 What it meant was that in order to figure out what the changes you needed to make

01:11:59 to the connections from the input to the hidden layer,

01:12:03 you had to back propagate the error signals from the output layer

01:12:10 through the connections from the hidden layer to the output

01:12:15 to get the signals that would be the error signals for the hidden layer.

01:12:20 And that’s how Rumelhart formulated it.

01:12:22 It was like, well, we know what the error signals are at the output layer.

01:12:25 Let’s see if we can get a signal at the hidden layer

01:12:28 that tells each hidden unit what its error signal is essentially.

01:12:32 So it’s back propagating through the connections

01:12:37 from the hidden to the output to get the signals to tell the hidden units

01:12:41 how to change their weights from the input.

01:12:43 And that’s why it’s called back prop.

01:12:47 Yeah. But so it came from Hinton having introduced the concept of, you know,

01:12:54 define your objective function, figure out how to take the derivative

01:12:59 so that you can adjust the connections so that they make progress towards your goal.

01:13:04 So stop thinking about biology for a second

01:13:06 and let’s start to think about optimization and computation a little bit more.

01:13:12 So what about Jeff Hinton?

01:13:15 You’ve gotten a chance to work with him in that little thing.

01:13:20 The set of people involved there is quite incredible.

01:13:24 The small set of people under the PDP flag,

01:13:28 it’s just given the amount of impact those ideas have had over the years,

01:13:32 it’s kind of incredible to think about.

01:13:34 But, you know, just like you said, like yourself,

01:13:38 Jeffrey Hinton is seen as one of the, not just like a seminal figure in AI,

01:13:43 but just a brilliant person,

01:13:45 just like the horsepower of the mind is pretty high up there for him

01:13:49 because he’s just a great thinker.

01:13:52 So what kind of ideas have you learned from him?

01:13:57 What have you influenced each other on?

01:13:59 What have you debated over? What stands out to you in the full space of ideas here

01:14:05 at the intersection of computation and cognition?

01:14:09 Well, so Jeff has said many things to me that had a profound impact on my thinking.

01:14:18 And he’s written several articles which were way ahead of their time.

01:14:26 He had two papers in 1981, just to give one example,

01:14:37 one of which was essentially the idea of transformers

01:14:42 and another of which was an early paper on semantic cognition

01:14:49 which inspired him and Rumelhart and me throughout the 80s

01:15:01 and, you know, still I think sort of grounds my own thinking

01:15:11 about the semantic aspects of cognition.

01:15:16 He also, in a small paper that was never published that he wrote in 1977,

01:15:25 you know, before he actually arrived at UCSD or maybe a couple years even before that,

01:15:29 I don’t know, when he was a PhD student,

01:15:32 he described how a neural network could do recursive computation.

01:15:40 And it was a very clever idea that he’s continued to explore over time,

01:15:48 which was sort of the idea that when you call a subroutine,

01:15:56 you need to save the state that you had when you called it

01:16:01 so you can get back to where you were when you’re finished with the subroutine.

01:16:04 And the idea was that you would save the state of the calling routine

01:16:10 by making fast changes to connection weights.

01:16:13 And then when you finished with the subroutine call,

01:16:19 those fast changes in the connection weights would allow you to go back

01:16:23 to where you had been before and reinstate the previous context

01:16:27 so that you could continue on with the top level of the computation.

01:16:32 Anyway, that was part of the idea.
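
A cartoon of that fast-weights idea, with everything in it (sizes, the Hebbian storage rule, one-step retrieval) chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16

caller_state = np.sign(rng.normal(size=n))   # activation pattern before the "call"

# Save the context with one fast Hebbian change to temporary weights.
fast_W = np.outer(caller_state, caller_state) / n

# ... the same units are now reused for the subroutine's own computation ...

# On return, even a degraded cue settles back to the saved context in one
# pass through the fast weights, reinstating the caller's state.
cue = caller_state.copy()
flipped = rng.choice(n, size=4, replace=False)   # corrupt 4 of the 16 units
cue[flipped] *= -1
restored = np.sign(fast_W @ cue)
print(np.array_equal(restored, caller_state))    # True: context reinstated
```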

01:16:35 And I always thought, okay, that’s really, you know,

01:16:38 he had extremely creative ideas that were quite a lot ahead of his time

01:16:44 and many of them in the 1970s and early 1980s.

01:16:49 So another thing about Geoff Hinton’s way of thinking,

01:16:57 which has profoundly influenced my effort to understand

01:17:05 human mathematical cognition, is that he doesn’t write too many equations.

01:17:13 And people tell stories like, oh, in the Hinton Lab meetings,

01:17:17 you don’t get up at the board and write equations

01:17:19 like you do in everybody else’s machine learning lab.

01:17:22 What you do is you draw a picture.

01:17:26 And, you know, he explains aspects of the way deep learning works

01:17:33 by putting his hands together and showing you the shape of a ravine

01:17:38 and using that as a geometrical metaphor for what’s happening

01:17:45 as this gradient descent process proceeds.

01:17:47 You’re coming down the wall of a ravine.

01:17:49 If you take too big a jump, you’re going to jump to the other side.

01:17:53 And so that’s why we have to turn down the learning rate, for example.

01:17:59 And it speaks to me of the fundamentally intuitive character of deep insight

01:18:12 together with a commitment to really understanding

01:18:21 in a way that’s absolutely ultimately explicit and clear, but also intuitive.

01:18:31 Yeah, there’s certain people like that.

01:18:33 Here’s an example, some kind of weird mix of visual and intuitive

01:18:38 and all those kinds of things.

01:18:40 Feynman is another example, different style of thinking, but very unique.

01:18:44 And when you’re around those people, for me in the engineering realm,

01:18:48 there’s a guy named Jim Keller who’s a chip designer, engineer.

01:18:52 Every time I talk to him, it doesn’t matter what we’re talking about.

01:18:57 Just having experienced that unique way of thinking transforms you

01:19:02 and makes your work much better.

01:19:04 And that’s the magic.

01:19:06 You look at Daniel Kahneman, you look at the great collaborations

01:19:10 throughout the history of science.

01:19:12 That’s the magic of that.

01:19:13 It’s not always the exact ideas that you talk about,

01:19:16 but it’s the process of generating those ideas.

01:19:19 Being around that, spending time with that human being,

01:19:22 you can come up with some brilliant work,

01:19:24 especially when it’s cross disciplinary as it was a little bit in your case with Jeff.

01:19:29 Yeah.

01:19:31 Jeff is a descendant of the logician Boole.

01:19:38 He comes from a long line of English academics.

01:19:43 And together with the deeply intuitive thinking ability that he has,

01:19:51 he also has, it’s been clear, he’s described this to me,

01:19:59 and I think he’s mentioned it from time to time in other interviews

01:20:04 that he’s had with people.

01:20:06 He’s wanted to be able to sort of think of himself as contributing

01:20:12 to the understanding of reasoning itself, not just human reasoning.

01:20:22 Like Boole is about logic, right?

01:20:25 It’s about what can we conclude from what else and how do we formalize that.

01:20:31 And as a computer scientist, logician, philosopher,

01:20:40 the goal is to understand how we derive truths from other truths,

01:20:46 from givens and things like this.

01:20:48 And the work that Jeff was doing in the early to mid 80s

01:20:57 on something called the Boltzmann machine was his way of connecting

01:21:02 with that Boolean tradition and bringing it into the more continuous,

01:21:07 probabilistic graded constraint satisfaction realm.

01:21:11 And it was a beautiful set of ideas linked with theoretical physics

01:21:20 as well as with logic.

01:21:26 And it’s always been, I mean, I’ve always been inspired

01:21:31 by the Boltzmann machine too.

01:21:33 It’s like, well, if the neurons are probabilistic rather than deterministic

01:21:38 in their computations, then maybe this somehow is part of the serendipity

01:21:48 or adventitiousness of the moment of insight, right?

01:21:53 It might not have occurred at that particular instant.

01:21:56 It might be sort of partially the result of a stochastic process.

01:22:00 And that too is part of the magic of the emergence of some of these things.
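
The probabilistic-unit idea can be sketched in a few lines; this is a toy stochastic network, not the published 1985 learning algorithm. Each binary unit turns on with probability sigmoid(net input / temperature), so settling is stochastic, and different runs can land in different low-energy interpretations.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

# Symmetric weights among 4 binary units: units 0,1 support each other,
# units 2,3 support each other, and the two pairs inhibit one another,
# giving two rival "interpretations" the network can settle into.
W = np.array([[ 0,  2, -2, -2],
              [ 2,  0, -2, -2],
              [-2, -2,  0,  2],
              [-2, -2,  2,  0]], float)

def settle(s, T=1.0, steps=200):
    # Stochastic updates: unit i turns on with probability
    # sigmoid(net input / temperature), rather than deterministically.
    for _ in range(steps):
        i = rng.integers(len(s))
        s[i] = rng.random() < sigmoid(W[i] @ s / T)
    return s

outcomes = [tuple(int(v) for v in settle(np.zeros(4))) for _ in range(10)]
print(set(outcomes))   # runs typically land in (1, 1, 0, 0) or (0, 0, 1, 1)
```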

01:22:07 Well, you’re right with the Boolean lineage and the dream of computer science

01:22:11 is somehow, I mean, I certainly think of humans this way,

01:22:16 that humans are one particular manifestation of intelligence,

01:22:20 that there’s something bigger going on and you’re hoping to figure that out.

01:22:25 The mechanisms of intelligence, the mechanisms of cognition

01:22:28 are much bigger than just humans.

01:22:30 Yeah. So I think of, I started using the phrase computational intelligence

01:22:37 at some point as to characterize the field that I thought, you know,

01:22:42 people like Geoff Hinton and many of the people I know at DeepMind

01:22:51 are working in and where I feel like I’m, you know,

01:23:00 I’m a kind of a human oriented computational intelligence researcher

01:23:06 in that I’m actually kind of interested in the human solution.

01:23:10 But at the same time, I feel like that’s where a huge amount

01:23:18 of the excitement of deep learning actually lies is in the idea that,

01:23:26 you know, we may be able to even go beyond what we can achieve

01:23:32 with our own nervous systems when we build computational intelligences

01:23:38 that are, you know, not limited in the ways that we are by our own biology.

01:23:46 Perhaps allowing us to scale the very mechanisms of human intelligence

01:23:51 just increases power through scale.

01:23:55 Yes. And I think that, you know, obviously

01:24:03 that’s being played out massively at Google Brain, at OpenAI

01:24:08 and to some extent at DeepMind as well.

01:24:11 I guess I shouldn’t say to some extent.

01:24:14 Just the massive scale of the computations that are used to succeed

01:24:22 at games like Go or to solve the protein folding problems

01:24:25 that they’ve been solving and so on.

01:24:27 Still not as many synapses and neurons as the human brain.

01:24:31 So we still got, we’re still beating them on that.

01:24:35 We humans are beating the AIs, but they’re catching up pretty quickly.

01:24:41 You write about modeling of mathematical cognition.

01:24:45 So let me first ask about mathematics in general.

01:24:49 There’s a paper titled Parallel Distributed Processing

01:24:53 Approach to Mathematical Cognition where in the introduction

01:24:56 there’s some beautiful discussion of mathematics.

01:25:00 And you referenced there Tristan Needham, who criticizes a narrow

01:25:05 view of mathematics by likening the study of mathematics

01:25:10 as symbol manipulation to studying music without ever hearing a note.

01:25:16 So from that perspective, what do you think is mathematics?

01:25:20 What is this world of mathematics like?

01:25:23 Well, I think of mathematics as a set of tools for exploring

01:25:32 idealized worlds that often turn out to be extremely relevant

01:25:42 to the real world but need not.

01:25:47 But they’re worlds in which objects exist with idealized properties

01:26:01 and in which the relationships among them can be characterized

01:26:07 with precision so as to allow the implications of certain facts

01:26:17 to then allow you to derive other facts with certainty.

01:26:22 So if you have two triangles and you know that there is an angle

01:26:37 in the first one that has the same measure as an angle in the second one

01:26:42 and you know that the lengths of the corresponding sides adjacent to that angle

01:26:47 in each of the two triangles

01:26:53 also have the same measure, then you can conclude

01:26:58 that the triangles are congruent.

01:27:02 That is to say they have all of their properties in common.
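
What he is describing is the side-angle-side congruence criterion, which in conventional notation (just the expression of the idea, not the idea itself) reads:

```latex
% Side-angle-side (SAS): an equal angle and equal adjacent sides
% imply the triangles are congruent.
\[
\angle A = \angle A', \qquad AB = A'B', \qquad AC = A'C'
\;\implies\; \triangle ABC \cong \triangle A'B'C'
\]
```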

01:27:06 And that is something about triangles.

01:27:11 It’s not a matter of formulas.

01:27:15 These are idealized objects.

01:27:18 In fact, we built bridges out of triangles and we understand

01:27:26 how to measure the height of something we can’t climb by extending

01:27:32 these ideas about triangles a little further.

01:27:36 And all of the ability to get a tiny speck of matter launched

01:27:49 from the planet Earth to intersect with some tiny, tiny little body

01:27:56 way out beyond Pluto somewhere at exactly a predicted time

01:28:02 and date is something that depends on these ideas.

01:28:08 And it’s actually happening in the real physical world that these ideas

01:28:18 make contact with it in those kinds of instances.

01:28:27 But there are these idealized objects, these triangles or these distances

01:28:32 or these points, whatever they are, that allow for this set of tools

01:28:40 to be created that then gives human beings this incredible leverage

01:28:47 that they didn’t have without these concepts.

01:28:51 And I think this is actually already true when we think about just,

01:29:01 you know, the natural numbers.

01:29:06 I always like to include zero, so I’m going to say the nonnegative integers,

01:29:11 but that’s a place where some people prefer not to include zero.

01:29:17 We like zero here, natural numbers, zero, one, two, three, four, five,

01:29:21 six, seven, and so on.

01:29:23 Yeah. And because they give you the ability to be exact about

01:29:36 how many sheep you have.

01:29:38 I sent you out this morning, there were 23 sheep.

01:29:41 You came back with only 22. What happened?

01:29:44 The fundamental problem of physics, how many sheep you have.

01:29:48 It’s a fundamental problem of human society that you damn well better

01:29:53 bring back the same number of sheep as you started with.

01:29:57 And it allows commerce, it allows contracts, it allows the establishment

01:30:03 of records and so on to have systems that allow these things to be notated.

01:30:10 But they have an inherent aboutness to them that's at one and the same time sort of

01:30:20 abstract and idealized and generalizable, while on the other hand,

01:30:26 potentially very, very grounded and concrete.

01:30:30 And one of the things that makes for the incredible achievements of the human mind

01:30:41 is the fact that humans invented these idealized systems that leverage

01:30:49 the power of human thought in such a way as to allow all this kind of thing to happen.

01:30:57 And so that’s what mathematics to me is the development of systems for thinking about

01:31:06 the properties and relations among sets of idealized objects and

01:31:18 the mathematical notation system that we unfortunately focus way too much on

01:31:26 is just our way of expressing propositions about these properties.

01:31:36 It’s just like we’re talking with Chomsky in language.

01:31:39 It’s the thing we’ve invented for the communication of those ideas.

01:31:43 They’re not necessarily the deep representation of those ideas.

01:31:48 So what’s a good way to model such powerful mathematical reasoning, would you say?

01:31:57 What are some ideas you have for capturing this in a model?

01:32:01 The insights that human mathematicians have had are a combination of the kind of

01:32:10 intuitive, connectionist-like knowledge that makes it so that something is just

01:32:24 obviously true, so that you don't have to think about why it's true.

01:32:31 That then makes it possible to then take the next step and ponder and reason and

01:32:40 figure out something that you previously didn’t have that intuition about.

01:32:45 It then ultimately becomes a part of the intuition that the next generation of

01:32:54 mathematical thinkers have to ground their own thinking on so that they can extend the ideas even further.

01:33:02 I came across this quotation from Henri Poincare while I was walking in the woods with my wife

01:33:15 in a state park in Northern California late last summer.

01:33:20 And what it said on the bench was: "It is by logic that we prove, but by intuition that we discover."

01:33:32 And so what for me the essence of the project is to understand how to bring the intuitive

01:33:41 connectionist resources to bear on letting the intuitive discovery arise from engagement in

01:33:56 thinking with this formal system.

01:33:59 So I think of the ability of somebody like Hinton or Newton or Einstein or Rumelhart or

01:34:14 Poincare. Archimedes is another example.

01:34:21 So suddenly a flash of insight occurs. It’s like the constellation of all of these

01:34:31 simultaneous constraints that somehow or other causes the mind to settle into a novel state that

01:34:38 it never settled into before, and gives rise to a new idea. Then you can say, okay, well, now how can I

01:34:51 prove this? How do I write down the steps of that theorem that allow me to make it rigorous and certain?

01:35:01 And so I feel like the kinds of things that we’re beginning to see deep learning systems do of

01:35:14 their own accord kind of gives me this feeling of hope or encouragement that ultimately it’ll all happen.

01:35:34 So in particular as many people now have become really interested in thinking about, you know,

01:35:46 neural networks that have been trained with massive amounts of text can be given a prompt and they

01:35:55 can then sort of generate some really interesting, fanciful, creative story from that prompt.

01:36:05 And there’s kind of like a sense that they’ve somehow synthesized something like novel out of

01:36:15 the, you know, all of the particulars of all of the billions and billions of experiences that went

01:36:22 into the training data that gives rise to something like this sort of intuitive sense of what would

01:36:29 be a fun and interesting little story to tell or something like that. It just sort of wells up out

01:36:36 of the letting the thing play out its own imagining of what somebody might say given this prompt as

01:36:47 an input to get it to start to generate its own thoughts. And to me that sort of represents the

01:36:56 potential of capturing the intuitive side of this.
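
The generate-from-a-prompt loop can be caricatured in a few lines. Here the "network" is just a bigram table over a toy corpus, a stand-in for a trained model that would condition on the whole prefix; the point is only the shape of the loop: score candidate next words, sample one, append, repeat.

```python
import numpy as np

rng = np.random.default_rng(0)

# A stand-in for a trained language model: bigram counts from a toy corpus.
# A real network would condition on the whole prefix, not just the last word.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
vocab = sorted(set(corpus))
counts = np.ones((len(vocab), len(vocab)))            # add-one smoothing
for a, b in zip(corpus, corpus[1:]):
    counts[vocab.index(a), vocab.index(b)] += 1

def generate(prompt, n=8, temperature=1.0):
    tokens = prompt.split()
    for _ in range(n):
        p = counts[vocab.index(tokens[-1])] ** (1 / temperature)
        p = p / p.sum()
        tokens.append(vocab[rng.choice(len(vocab), p=p)])  # sample, don't argmax
    return " ".join(tokens)

print(generate("the"))   # e.g. "the mat and the dog sat on the rug"
```

The sampling step, rather than always taking the single most probable word, is where the stochastic, "welling up" quality of the generations comes from.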

01:37:01 And there’s other examples, I don’t know if you find them as captivating is, you know, on the

01:37:06 DeepMind side with AlphaZero, if you study chess, the kind of solutions that has come up in terms

01:37:12 of chess, it is, there’s novel ideas there. It feels very like there’s brilliant moments of insight.

01:37:20 And the mechanism they use, if you think of search as maybe more towards good old fashioned AI and

01:37:31 then there’s the connection is the neural network that has the intuition of looking at a board,

01:37:37 looking at a set of patterns and saying, how good is this set of positions? And the next few

01:37:42 positions, how good are those? And that’s it. That’s just an intuition. Grandmasters have this

01:37:49 and understanding positionally, tactically, how good the situation is, how can it be improved

01:37:55 without doing this full, like deep search. And then maybe doing a little bit of what human chess

01:38:03 players call calculation, which is the search, taking a particular set of steps down the line to

01:38:08 see how they unroll. But there are moments of genius in those systems too. So that's another hopeful

01:38:16 illustration that from neural networks can emerge this novel creation of an idea.
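
That division of labor fits in a few lines once the game is small enough. A toy sketch: the "intuition" is an evaluation function over positions (in AlphaZero, a trained network; here, a simple rule for a take-1-2-or-3-stones game), and the "calculation" is a short negamax lookahead over those evaluations.

```python
def moves(n):
    # Legal moves in a toy game: take 1, 2, or 3 stones from a pile of n;
    # whoever takes the last stone wins.
    return [m for m in (1, 2, 3) if m <= n]

def value(n):
    # "Intuition": a positional evaluation for the side to move.
    # (In AlphaZero this is a trained network; here, a simple rule.)
    return -1.0 if n % 4 == 0 else 1.0

def calculate(n, depth):
    # "Calculation": search a few plies ahead, then trust the intuition.
    if n == 0:
        return -1.0            # the previous player took the last stone and won
    if depth == 0:
        return value(n)
    return max(-calculate(n - m, depth - 1) for m in moves(n))

def best_move(n, depth=3):
    return max(moves(n), key=lambda m: -calculate(n - m, depth - 1))

print(best_move(10))           # 2: leaves 8 stones, a lost position for the opponent
```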

01:38:25 Yes. And I think that, you know, I think Demis Hassabis is, you know, he’s spoken about those

01:38:34 things. I heard him describe a move that was made in one of the Go matches against Lee Sedol in a

01:38:44 very similar way. And it caused me to become really excited to kind of collaborate with some of those

01:38:52 people and analyze it at DeepMind. So I think though that what I like to really emphasize here

01:39:05 is one part of what I like to emphasize about mathematical cognition at least is that philosophers

01:39:15 and logicians going back three or even a little more than 3000 years ago began to develop these

01:39:28 formal systems and gradually the whole idea about thinking formally got constructed. And, you know,

01:39:45 it’s preceded Euclid, certainly present in the work of Thales and others. And I’m not the world’s

01:39:55 leading expert in all the details of that history, but Euclid’s elements were the kind of the touch

01:40:03 point of a coherent document that sort of laid out this idea of an actual formal system within which

01:40:15 these objects were characterized and the system of inference that allowed new truths to be derived

01:40:31 from others was sort of like established as a paradigm. And what I find interesting is the

01:40:43 idea that the ability to become a person who is capable of thinking in this abstract formal way

01:40:55 is a result of the same kind of immersion in the experience of thinking in that way that we now

01:41:10 begin to think of our understanding of language as being, right? So, we immerse ourselves in a

01:41:16 particular language, in a particular world of objects and their relationships and we learn

01:41:22 to talk about that and we develop intuitive understanding of the real world. In a similar

01:41:30 way, we can think that what academia has created for us, what those early philosophers and their

01:41:39 academies in Athens and Alexandria and other places allowed was the development of these

01:41:49 schools of thought, modes of thought that then become deeply ingrained and it becomes what it

01:42:00 is that makes it so that somebody like Jerry Fodor would think that systematic thought is

01:42:11 the essential characteristic of the human mind as opposed to a derived and an acquired characteristic

01:42:20 that results from acculturation in a certain mode that’s been invented by humans.

01:42:28 Would you say it’s more fundamental than like language? If we start dancing, if we bring

01:42:34 Chomsky back into the conversation, first of all, is it unfair to draw a line between mathematical

01:42:43 cognition and language, linguistic cognition?

01:42:48 I think that’s a very interesting question and I think it’s one of the ones that I’m actually very

01:42:54 interested in right now, but I think the answer is in important ways, it is important to draw that

01:43:06 line, but then to come back and look at it again and see some of the subtleties and interesting

01:43:12 aspects of the difference. So if we think about Chomsky himself, he was born into an academic

01:43:34 family. His father was a professor of rabbinical studies at a small rabbinical college in

01:43:40 Philadelphia. He was deeply enculturated in a culture of thought and reason and brought to the

01:43:59 effort to understand natural language, this profound engagement with these formal systems. I

01:44:13 think that there was tremendous power in that and that Chomsky had some amazing insights into the

01:44:23 structure of natural language, but that, I’m going to use the word but there, the actual intuitive

01:44:34 knowledge of these things only goes so far and does not go as far as it does in people like

01:44:41 Chomsky himself. And this was something that was discovered in the PhD dissertation of Lila

01:44:48 Gleitman, who was actually trained in the same linguistics department with Chomsky. So what Lila

01:44:55 discovered was that the intuitions that linguists had about even the meaning of a phrase, not just

01:45:09 about its grammar, but about what they thought a phrase must mean were very different from the

01:45:17 intuitions of an ordinary person who wasn’t a formally trained thinker. And well, it recently

01:45:27 has become much more salient. I happened to have learned about this when I myself was a PhD student

01:45:32 at the University of Pennsylvania, but I never knew how to put it together with all of my other

01:45:38 thinking about these things. So I actually currently have the hypothesis that formally

01:45:45 trained linguists and other formally trained academics, whether it be linguistics, philosophy,

01:45:58 cognitive science, computer science, machine learning, mathematics,

01:46:02 have a mode of engagement with experience that is intuitively deeply structured to be more

01:46:17 organized around the systematicity and ability to be conformant with the principles of a system

01:46:35 than is actually true of the natural human mind without that immersion.

01:46:42 That’s fascinating. So the different fields and approaches with which you start to study the mind

01:46:48 actually take you away from the natural operation of the mind. So it makes it very difficult for you

01:46:56 to be somebody who introspects.

01:46:59 Yes. And this is where we come to things about human belief and so-called knowledge that we consider

01:47:16 private, not our business to manipulate in others. We are not entitled to tell somebody else what to

01:47:29 believe about certain kinds of things. What are those beliefs? Well, they are the product of this

01:47:42 sort of immersion and enculturation. That is what I believe.

01:47:51 And that’s limiting.

01:47:55 It’s something to be aware of.

01:47:58 Does that limit you from having a good model of cognition?

01:48:04 It can.

01:48:04 So when you look at mathematics or linguistics, I mean, what is that line then? So is Chomsky

01:48:13 unable to sneak up to the full picture of cognition? Are you, when you’re focusing on

01:48:17 mathematical thinking, are you also unable to do so?

01:48:22 I think you’re right. I think that’s a great way of characterizing it. And

01:48:27 I also think that it’s related to the concept of beginner’s mind and another concept called the

01:48:43 expert blind spot. So the expert blind spot is much more prosaic seeming than this point that

01:48:53 you were just making. But it’s something that plagues experts when they try to communicate

01:49:01 their understanding to non experts. And that is that things are self evident to them that

01:49:12 they can’t begin to even think about how they could explain it to somebody else.

01:49:23 Because it’s just like so patently obvious that it must be true. And

01:49:31 when Kronecker said, God made the natural numbers, all else is the work of man,

01:49:47 he was expressing that intuition that somehow or other, the basic fundamentals of discrete

01:49:57 quantities being countable and enumerable and indefinite in number was not something that

01:50:10 had to be discovered. But he was wrong. It turns out that many cognitive scientists

01:50:21 agreed with him for a time. There was a long period of time where the natural

01:50:27 numbers were considered to be a part of the innate endowment of core knowledge or to use

01:50:35 the kind of phrases that Spelke and Carey used to talk about what they believe are

01:50:41 the innate primitives of the human mind. And they no longer believe that. It’s actually

01:50:50 been more or less accepted by almost everyone that the natural numbers are actually a cultural

01:50:56 construction. And it’s so interesting to go back and study those few people who still exist who

01:51:04 don’t have those systems. So this is just an example to me where a certain mode of thinking

01:51:13 about language itself or a certain mode of thinking about geometry and those kinds of

01:51:20 relations becomes so second nature that you don't know what it is that you need to teach. And

01:51:30 in fact, we don’t really teach it all that explicitly anyway. You take a math class,

01:51:41 the professor sort of teaches it to you the way they understand it. Some of the students in the

01:51:47 class sort of like they get it. They start to get the way of thinking and they can actually do the

01:51:52 problems that get put on the homework that the professor thinks are interesting and challenging

01:51:57 ones. But most of the students who don't kind of engage as deeply don't ever get it. And we think,

01:52:08 oh, that man must be brilliant. He must have this special insight. But he must have some

01:52:14 biological sort of bit that's different, that makes it so that he or she could have

01:52:20 that insight. Although I don’t want to dismiss biological individual differences completely,

01:52:31 I find it much more interesting to think about the possibility that it was that difference in the

01:52:39 dinner table conversation at the Chomsky house when he was growing up that made it so that he

01:52:45 had that cast of mind. Yeah. And there’s a few topics we talked about that kind of interconnect

01:52:53 because I wonder the better I get at certain things, we humans, the deeper we understand

01:52:59 something, what are you starting to then miss about the rest of the world? We talked about David

01:53:11 and his degenerative mind. And, you know, when you look in the mirror and wonder how different

01:53:19 am I cognitively from the man I was a month ago, from the man I was a year ago? Like what,

01:53:26 you know, if I can, having thought about language of Chomsky for 10, 20 years, what am I no longer

01:53:35 able to see? What is in my blind spot? And how big is that? And then to somehow be able to leap back

01:53:43 out of your deep, like structure that you form for yourself about thinking about the world,

01:53:48 leap back and look at the big picture again, or jump out of your current way of thinking.

01:53:54 And to be able to introspect, like what are the limitations of your mind? How is your mind less

01:54:00 powerful than it used to be or more powerful or different, powerful in different ways? So that

01:54:06 seems to be a difficult thing to do because we’re living, we’re looking at the world through the

01:54:11 lens of our mind, right? To step outside and introspect is difficult, but it seems necessary

01:54:17 if you want to make progress. You know, one of the threads of psychological research that’s always

01:54:25 been very, I don’t know, important to me to be aware of is the idea that our explanations of our

01:54:38 own behavior aren’t necessarily actually part of the causal process that caused that behavior to

01:54:53 occur, or even valid observations of the set of constraints that led to the outcome, but they are

01:55:03 post hoc rationalizations that we can give based on information at our disposal about what might

01:55:11 have contributed to the result that we came to when asked. And so this is an idea that was

01:55:21 introduced in a very important paper by Nisbett and Wilson about, you know, the limits on our ability

01:55:29 to be aware of the factors that cause us to make the choices that we make. And, you know, I think

01:55:42 it’s something that we really ought to be much more cognizant of, in general, as human beings,

01:55:54 is that our own insight into exactly why we hold the beliefs that we do and we hold the attitudes

01:56:01 and make the choices and feel the feelings that we do is not something that we totally control

01:56:12 or totally observe. And it’s subject to, you know, our culturally transmitted understanding of what

01:56:25 it is that is the mode that we give to explain these things when asked to do so as much as it is

01:56:34 about anything else. And so even our ability to introspect and think we have access to our own

01:56:42 thoughts is a product of culture and belief, you know, practice.

01:56:47 So let me ask you the big question of advice. So you’ve lived an incredible life in terms of the

01:56:57 ideas you’ve put out into the world, in terms of the trajectory you’ve taken through your career,

01:57:02 through your life. What advice would you give to young people today, in high school, in college,

01:57:09 about how to have a career or how to have a life they can be proud of?

01:57:16 Finding the thing that you are intrinsically motivated to engage with and then celebrating

01:57:27 that discovery is what it’s all about. When I was in college, I struggled with that. I had thought

01:57:43 I wanted to be a psychiatrist because I think I was interested in human psychology in high school.

01:57:50 And at that time, the only sort of information I had that had anything to do with the psyche was,

01:57:58 you know, Freud and Erich Fromm and sort of popular psychiatry kinds of things.

01:58:03 And so, well, they were psychiatrists, right? So I had to be a psychiatrist.

01:58:08 And that meant I had to go to medical school. And I got to college and I find myself taking,

01:58:14 you know, the first semester of a three quarter physics class and it was mechanics. And this was

01:58:21 so far from what it was I was interested in, but it was also too early in the morning in the winter

01:58:26 quarter. So I never made it to the physics class. But I wandered through the rest of my

01:58:34 freshman year and most of my sophomore year until I found myself in the midst of this situation where

01:58:45 around me there was this big revolution happening. I was at Columbia University in 1968 and

01:58:54 the Vietnam War is going on. Columbia is building a gym in Morningside Heights, which is part of

01:58:59 Harlem. And people are thinking, oh, the big bad rich guys are stealing the parkland that

01:59:06 belongs to the people of Harlem. And, you know, they’re part of the military industrial complex,

01:59:13 which is enslaving us and sending us all off to war in Vietnam. And so there was a big revolution

01:59:20 that involved a confluence of black activism and, you know, SDS and social justice and the whole

01:59:27 university blew up and got shut down. And I got a chance to sort of think about

01:59:34 why people were behaving the way they were in this context. And I, you know, I happened to have

01:59:42 taken mathematical statistics. I happened to have been taking psychology that quarter, just psych

01:59:48 one. And somehow things in that space all ran together in my mind and got me really excited

01:59:54 about asking questions about why people, what made certain people go into the buildings and not

02:00:01 others and things like that. And so suddenly I had a path forward and I had just been wandering

02:00:07 around aimlessly. And at the different points in my career, you know, and I think, okay,

02:00:12 well, should I take this class or should I just read that book about some idea that I want to

02:00:26 understand better, you know? Or should I, you know, meet some requirement, or

02:00:33 should I pursue the thing that excites me and interests me? You know, I always did the latter.

02:00:39 So I ended up, my professors in psychology thought I was great. They wanted me to go to

02:00:46 graduate school. They nominated me for Phi Beta Kappa. And I went to the Phi Beta Kappa ceremony

02:00:55 and this guy came up and he said, oh, are you magna or summa? And I wasn't even getting honors

02:01:00 based on my grades. They just happened to have thought I was interested enough in ideas to

02:01:07 belong to Phi Beta Kappa. So. I mean, would it be fair to say you kind of stumbled around a little

02:01:12 bit through accidents of too-early-morning classes in physics and so on until you discovered

02:01:20 intrinsic motivation, as you mentioned, and then that’s it. It hooked you. And then you celebrate

02:01:26 the fact that this happens to human beings. Yeah. And what is it that made what I did intrinsically

02:01:34 motivating to me? Well, that’s interesting and I don’t know all the answers to it. And I don’t

02:01:41 think I want anybody to think that you should be sort of in any way, I don’t know, sanctimonious or

02:01:52 anything about it. You know, it’s like, I really enjoyed doing statistical analysis of data. I

02:02:01 really enjoyed running my own experiment, which was what I got a chance to do in the psychology

02:02:09 department. In chemistry and physics, I never imagined that mere mortals would ever do

02:02:14 an experiment in those sciences, except one that was in the textbook that you were told to do in

02:02:20 lab class. But in psychology, we were already like, even when I was taking psych one, it turned out

02:02:26 we had our own rat, and after two set experiments, we got to, okay, do something you

02:02:32 think of with your rat. So it’s the opportunity to do it myself and to bring together a certain

02:02:42 set of things that engaged me intrinsically. And I think it has something to do with why

02:02:49 certain people turn out to be profoundly amazing musical geniuses, right? They get immersed in it

02:02:59 at an early enough point and it just sort of gets into the fabric. So my little brother had intrinsic

02:03:07 motivation for music as we witnessed when he discovered how to put records on the phonograph

02:03:15 when he was like 13 months old and recognized which one he wanted to play, not because he could read

02:03:21 the labels, because he could sort of see which ones had which scratches, which were the different,

02:03:26 you know, oh, that’s rapidi espanol. And that’s, you know, and, and, and,

02:03:31 And he enjoyed that, that connected with him somehow.

02:03:33 Yeah. And, and there was something that it fed into and it, you’re extremely lucky if you have

02:03:40 that and if you can nurture it and can let it grow and let it be, be an important part of your life.

02:03:47 Yeah. Those are the two things: be attentive enough

02:03:52 to feel it when it comes, like this is something special. I mean, I don’t know. For example,

02:03:59 I really like tabular data, like Excel sheets. Like it brings me a deep joy. I don’t know how

02:04:08 useful that is for anything. That’s part of what I’m talking about.

02:04:12 Exactly. So there’s like a million, not a million, but there’s a lot of things

02:04:17 like that. For me, you have to hear that for yourself, like be, like realize this is really

02:04:23 joyful. But then the other part that you’re mentioning, which is the nurture is take time

02:04:27 and stay with it, stay with it a while and see where that takes you in life.

02:04:33 Yeah. And I think the motivational engagement results in the

02:04:40 immersion that then creates the opportunity to obtain the expertise. So, you know, we could call

02:04:47 it the Mozart effect, right? I mean, when I think about Mozart, I think about, you know,

02:04:53 the person who was born as the fourth member of the family string quartet, right? And they

02:05:01 handed him the violin when he was six weeks old. All right, start playing, you know, it’s like,

02:05:08 and so the level of immersion there was amazingly profound, but hopefully he also had,

02:05:20 you know, something, maybe this is where more of the genetic part comes in.

02:05:28 Sometimes I think, you know, something in him resonated to the music, so that

02:05:34 the synergy of the combination was so powerful. So that's what I really consider

02:05:40 to be the Mozart effect. It's sort of the synergy of something with experience

02:05:47 that then results in the unique flowering of a particular, you know, mind.

02:05:51 And so I know my siblings and I are all very different from each other. We’ve all gone in

02:06:01 our own different directions. And, you know, I mentioned my younger brother who was very musical.

02:06:07 My other younger brother was like this amazing, intuitive engineer.

02:06:11 And one of my sisters was passionate about, you know, water conservation well

02:06:23 before it was, you know, such a hugely important issue that it is today. So we all sort of somehow

02:06:31 each found a different thing. And I don't mean to say it isn't tied in with something

02:06:41 about us biologically. But it's also, when that happens, when you can find that, then,

02:06:47 you know, you can do your thing and you can be excited about it. So people can be excited about

02:06:52 fitting people on bicycles, as well as excited about making neural networks, achieve insights

02:06:56 into human cognition, right? Yeah. Like for me personally, I’ve always been excited about

02:07:03 love and friendship between humans. And just like the actual experience of it,

02:07:10 since I was a child, just observing people around me. And I've also been excited about robots.

02:07:16 And there’s something in me that thinks I really would love to explore how those two things

02:07:21 combine. And it doesn’t make any sense. A lot of it is also timing, just to think of your own career

02:07:26 and your own life. You found yourself in certain places that happened to involve some of

02:07:33 the greatest thinkers of our time. And so it just worked out that like, you guys developed those

02:07:37 ideas. And there may be a lot of other people similar to you, and they were brilliant, and

02:07:43 they never found that right connection and place to where they, their ideas could flourish. So

02:07:48 it’s timing, it’s place, it’s people. And ultimately the whole ride, you know, it’s undirected.

02:07:56 Can I ask you about something you mentioned in terms of psychiatry when you were younger?

02:08:00 Because I had a similar experience of, you know, reading Freud and Carl Jung and just,

02:08:09 you know, those kind of popular psychiatry ideas. And that was a dream for me early on in high

02:08:15 school too. Like I hoped to understand the human mind by, somehow psychiatry felt like

02:08:24 the right discipline for that. Does that make you sad? That psychiatry is not

02:08:31 the mechanism by which you are able to explore the human mind. So for me, I was a little bit

02:08:37 disillusioned because of how much prescription medication and biochemistry is involved in the

02:08:46 discipline of psychiatry, as opposed to the Freud-like dream of using the mechanisms of language

02:08:53 to explore the human mind. So that was a little disappointing. And that’s why I kind of went to

02:09:00 computer science and thinking like, maybe you can explore the human mind by trying to build the

02:09:04 thing. Yes. I wasn’t exposed to the sort of the biomedical slash pharmacological aspects of

02:09:14 psychiatry at that point because I dropped out of that whole idea of premed, so I never even

02:09:22 found out about that until much later. But you’re absolutely right. So I was actually a member of the

02:09:30 National Advisory Mental Health Council. That is to say the board of scientists who advise the

02:09:41 director of the National Institute of Mental Health. And that was around the year 2000. And

02:09:47 in fact, at that time, the man who came in as the new director, I had been on this board for a year

02:09:56 when he came in, said, okay, schizophrenia is a biological illness. It’s a lot like cancer.

02:10:08 We’ve made huge strides in curing cancer. And that’s what we’re going to do with schizophrenia.

02:10:13 We’re going to find the medications that are going to cure this disease. And we’re not going

02:10:18 to listen to anybody’s grandmother anymore. And good old behavioral psychology is not something

02:10:27 we’re going to support any further. And he completely alienated me from the Institute

02:10:40 and from all of its prior policies, which had been much more holistic, I think, really at some level.

02:10:46 And the other people on the board were like psychiatrists, very biological psychiatrists.

02:10:57 It didn’t pan out that nothing has changed in our ability to help people with mental illness.

02:11:07 And so 20 years later, that particular path was a dead end, as far as I can tell.

02:11:14 Well, there’s some aspect to, and sorry to romanticize the whole philosophical conversation

02:11:20 about the human mind. But to me, psychiatrists, for a time, held the flag of we’re the deep thinkers.

02:11:29 In the same way that physicists are the deep thinkers about the nature of reality,

02:11:34 psychiatrists are the deep thinkers about the nature of the human mind. And I think that flag

02:11:38 has been taken from them and carried by people like you. It’s like, it’s more in the cognitive

02:11:44 psychology, especially when you have a foot in the computational view of the world, because you can

02:11:50 both build it, you can like, intuit about the functioning of the mind by building little models

02:11:56 and be able to see mathematical things and then deploying those models, especially in computers,

02:12:00 to say, does this actually work? They do like experiments. And then some combination of

02:12:07 neuroscience, where you’re starting to actually be able to observe, do certain experiments on

02:12:13 human beings and observe how the brain is actually functioning. And there, using intuition, you can

02:12:21 start being the philosopher. Like Richard Feynman is the philosopher, cognitive psychologists can

02:12:26 become the philosopher, and psychiatrists become much more like doctors. They’re like very medical.

02:12:32 They help people with medication, biochemistry, and so on. But they are no longer the book writers

02:12:39 and the philosophers, which of course I admire. I admire the Richard Feynman ability to do

02:12:45 great low level mathematics and physics and the high level philosophy.

02:12:52 Yeah, I think it was Fromm and Jung more than Freud that sort of initially kind of

02:13:00 made me feel like, oh, this is really amazing and interesting and I want to explore it further.

02:13:06 I actually, when I got to college and I lost that thread, I found more of it in sociology

02:13:15 and literature than I did in any place else. So I took quite a lot of both of those

02:13:23 disciplines as an undergraduate. And I was actually deeply ambivalent about

02:13:32 the psychology because I was doing experiments after the initial flurry of interest in

02:13:40 why people would occupy buildings during an insurrection and

02:13:44 become so committed to their beliefs. But I ended up in the psychology laboratory running

02:13:55 experiments on pigeons. And so I had this profound dissonance between the kinds of issues

02:14:03 that would be explored when I was thinking about what I read about in modern British literature

02:14:12 versus what I could study with my pigeons in the laboratory. That got resolved when I went

02:14:18 to graduate school and I discovered cognitive psychology. And so for me, that was the path

02:14:25 out of this sort of like extremely sort of ambivalent divergence between the interest

02:14:31 in the human condition and the desire to do actual mechanistically oriented thinking about it. And I

02:14:42 think we’ve come a long way in that regard and that you’re absolutely right that nowadays this

02:14:50 is something that’s accessible to people through the pathway in through computer science or the

02:14:57 pathway in through neuroscience. You can get derailed in neuroscience down to the bottom of

02:15:08 the system where you might find the cures of various conditions, but you don’t get a chance

02:15:16 to think about the higher level stuff. So it’s in the systems and cognitive neuroscience and

02:15:21 computational intelligence miasma up there at the top that I think these opportunities

02:15:28 are richest right now. And so yes, I am indeed blessed by having had the opportunity to fall

02:15:36 into that space. So you mentioned the human condition, speaking which you happen to be a

02:15:44 human being who’s unfortunately not immortal. That seems to be a fundamental part of the human

02:15:52 condition that this ride ends. Do you think about the fact that you’re going to die one day? Are you

02:16:00 afraid of death? I would say that I am not as much afraid of death as I am of degeneration. And

02:16:15 I say that in part for reasons of having, you know, seen some tragic degenerative situations

02:16:24 unfold. It’s exciting when you can continue to participate and feel like you’re near the place

02:16:42 where the wave is breaking on the shore, if you like. And I think about my own future potential.

02:16:58 If I were to begin to suffer from Alzheimer’s disease or semantic dementia or some other

02:17:07 condition, you know, I would sort of gradually lose the thread of that ability. And so one can

02:17:17 live on for a decade after, you know, sort of having to retire because one no longer has

02:17:28 these kinds of abilities to engage. And I think that’s the thing that I fear the most.

02:17:34 The losing of that, like the breaking of the wave, the flourishing of the mind,

02:17:42 where you have these ideas and they’re swimming around and you’re able to play with them.

02:17:46 Yeah. And collaborate with other people who, you know, are themselves

02:17:54 really helping to push these ideas forward. So, yeah.

02:17:58 What about the edge of the cliff? The end? I mean, the mystery of it. I mean…

02:18:05 My graded sort of conception of mind and, you know, sort of continuous sort of way of

02:18:12 thinking about most things makes it so that, to me, the discreteness of that transition is less

02:18:25 apparent than it seems to be to most people.

02:18:27 I see. I see. Yeah. Yeah. I wonder, so I don't know if you know the work of Ernest Becker

02:18:35 and so on. I wonder what role mortality and our ability to be cognizant of it

02:18:42 and anticipate it and perhaps be afraid of it, what role that plays in our reasoning of the world.

02:18:49 I think that it can be motivating to people to think they have a limited period left.

02:18:55 I think in my own case, you know, it's like seven or eight years ago now that I was

02:19:03 sitting around doing experiments on decision making that were

02:19:11 satisfying in a certain way because I could really get closure on whether the model fit the data

02:19:19 perfectly or not. And I could see how one could test, you know, the predictions in monkeys as well

02:19:26 as humans and really see what the neurons were doing. But I just realized, hey, wait a minute,

02:19:33 you know, I may only have about 10 or 15 years left here. And I don’t feel like I’m getting

02:19:40 towards the answers to the really interesting questions while I’m doing this particular level

02:19:46 of work. And that’s when I said to myself, okay, let’s pick something that’s hard. So that’s when

02:19:56 I started working on mathematical cognition. And I think it was more in terms of, well,

02:20:03 I got 15 more years possibly of useful life left. Let’s imagine that it’s only 10.

02:20:09 I’m actually getting close to the end of that now, maybe three or four more years.

02:20:13 But I’m beginning to feel like, well, I probably have another five after that. So, okay, I’ll give

02:20:17 myself another six or eight. But a deadline is looming, and therefore it’s not going to go on

02:20:23 forever. And so, yeah, I got to keep thinking about the questions that I think are the interesting and

02:20:31 important ones for sure. What do you hope your legacy is? You’ve done some incredible work in

02:20:37 your life as a man, as a scientist. When the human civilization is long gone

02:20:46 and the aliens are reading the encyclopedia about the human species. What do you hope is the

02:20:51 paragraph written about you? I would want it to sort of highlight

02:20:56 a couple of things: that I was able to see one path that was more exciting to me than the one that

02:21:20 seemed already to be there for a cognitive psychologist, but not for any super special

02:21:28 reason other than that I’d had the right context prior to that, and that I had gone ahead and

02:21:34 followed that lead. And then I forget the exact wording, but I said in this preface that

02:21:44 the joy of science is the moment in which a partially formed thought in the mind of one person

02:22:01 gets crystallized a little better in the discourse and becomes the foundation

02:22:08 of some exciting concrete piece of actual scientific progress. And I feel like that

02:22:16 moment happened when Rumelhart and I were doing the interactive activation model and when

02:22:21 Rumelhart heard Hinton talk about gradient descent and having the objective function to guide the

02:22:29 learning process. And it happened a lot in that period and I sort of seek that kind of

02:22:37 thing in my collaborations with my students. So the idea that this is a person who contributed

02:22:49 to science by finding exciting collaborative opportunities to engage with other people

02:22:55 is something that I certainly hope is part of the paragraph.
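
[Editor's aside, not part of the conversation: since gradient descent with an objective function comes up here, a minimal sketch may help readers. The one-parameter toy objective, learning rate, and variable names below are illustrative assumptions, not anything from McClelland's or Hinton's actual models; in a real network the gradient over many weights is computed by backpropagation, but the guiding idea is the same downhill step.]

    # Illustrative toy (an assumption, not from the conversation):
    # gradient descent lets an objective (error) function guide learning
    # by telling the parameter which way to move to reduce the error.

    def objective(w):
        # Toy one-parameter error surface, minimized at w = 3.0.
        return (w - 3.0) ** 2

    def gradient(w):
        # Analytic derivative of the objective with respect to w.
        return 2.0 * (w - 3.0)

    w = 0.0              # arbitrary initial "weight"
    learning_rate = 0.1  # made-up step size

    for step in range(50):
        # Step downhill: adjust w against the gradient of the objective.
        w -= learning_rate * gradient(w)

    print(round(w, 4))  # converges toward 3.0, the objective's minimum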

02:22:59 And like you said, taking a step maybe in directions that are non-obvious. So it’s the

02:23:08 old Robert Frost road less traveled. So maybe, because you start with this incomplete initial idea,

02:23:16 the step you take is a little bit off the beaten path.

02:23:22 If I could just say one more thing here. This was something that really contributed

02:23:28 to energizing me in a way that I feel it would be useful to share. My PhD dissertation project

02:23:40 was a completely empirical, experimental project. And I wrote a paper based on the two main

02:23:48 experiments that were the core of my dissertation and I submitted it to a journal. And at the end

02:23:55 of the paper, I had a little section where I laid out the beginnings of my theory about what I

02:24:05 thought was going on that would explain the data that I had collected. And I had submitted the

02:24:13 paper to the Journal of Experimental Psychology. So I got back a letter from the editor saying,

02:24:20 thank you very much. These are great experiments and we’d love to publish them in the journal.

02:24:23 But what we’d like you to do is to leave the theorizing to the theorists and take that part

02:24:30 out of the paper. And so I did, I took that part out of the paper. But I almost found myself labeled

02:24:42 as a non-theorist by this. And I could have succumbed to that and said, okay, well, I guess

02:24:50 my job is to just go on and do experiments, right? But that’s not what I wanted to do. And so when I

02:25:01 got to my assistant professorship, although I continued to do experiments because I knew I had

02:25:07 to get some papers out, I also at the end of my first year submitted my first article to

02:25:13 Psychological Review, which was the theoretical journal where I took that section and elaborated

02:25:18 it and wrote it up and submitted it to them. And they didn’t accept that either, but they said,

02:25:24 oh, this is interesting, you should keep thinking about it. And that was what got me

02:25:29 going to think, okay, you know, so it’s not a superhuman thing to contribute to the development

02:25:37 of theory. You know, you don’t have to be superhuman, you can do it as a mere mortal.

02:25:43 And the broader lesson, I think, is: don’t succumb to the labels of a particular reviewer.

02:25:50 Yeah, that’s for sure. Or anybody labeling you, right?

02:25:55 Yeah, exactly. And especially as you become successful,

02:26:01 labels get assigned to you for the thing you’re successful at.

02:26:05 Connectionist, or cognitive scientist and not a neuroscientist.

02:26:09 And then, you know, those are just the stories of the past. You’re

02:26:15 today a new person who can do completely revolutionary work in totally new areas. So don’t

02:26:20 let those labels hold you back. Well, let me ask the big question. You said

02:26:29 it started at Columbia, trying to observe these humans, and they’re doing

02:26:34 weird stuff, and you want to know why they’re doing this stuff. So zoom out even bigger.

02:26:38 Look at the hundred-plus billion people who’ve ever lived on Earth. Why do you think we’re all

02:26:47 doing what we’re doing? What do you think is the meaning of it all? The big why question.

02:26:51 We seem to be very busy doing a bunch of stuff and we seem to be kind of directed towards somewhere.

02:26:59 But why?

02:27:00 Well, I myself think that we make meaning for ourselves and that we find inspiration

02:27:13 in the meaning that other people have made in the past. You know, and the great religious thinkers

02:27:21 of the first millennium BC and, you know, a few that came in the early part of the second millennium,

02:27:36 you know, laid down some important foundations for us.

02:27:40 But I do believe that, you know, we are an emergent result of a process that happened

02:27:54 naturally without guidance and that meaning is what we make of it and that the creation of

02:28:05 efforts to reify meaning in, like, religious traditions and so on is just a part of the

02:28:15 expression of that goal that we have: not to find out what the meaning is, but to

02:28:26 make it ourselves. And so, to me, it’s something that’s very personal. It’s very individual. It’s

02:28:40 like meaning will come for you through the particular combination of synergistic elements

02:28:50 that are your fabric and your experience and your context and, you know, you should…

02:29:04 It’s all made in a certain kind of a local context though, right? Here I am at UCSD with this brilliant

02:29:12 man, Rumelhart, who’s having, you know, these doubts about symbolic artificial intelligence

02:29:24 that resonate with my desire to see it grounded in the biology and let’s make the most of that,

02:29:35 you know? Yeah. And so, from that little pocket, there’s some kind of peculiar little

02:29:41 emergent process, which is basically each one of us. Each one of us humans is a kind of,

02:29:49 you know, you think of cells, and they come together, and it’s an emergent process that then tells fancy

02:29:56 stories about itself and then, just like you said, just enjoys the beauty of the stories

02:30:03 we tell about ourselves. It’s an emergent process that lives for a time, is defined by its local

02:30:10 pocket and context in time and space and then tells pretty stories and we write those stories

02:30:16 down and then we celebrate how nice the stories are and then it continues because we build stories

02:30:21 on top of each other and eventually we’ll colonize hopefully other planets, other solar systems,

02:30:30 other galaxies and we’ll tell even better stories. But it all starts here on Earth. Jay,

02:30:37 speaking of peculiar emergent processes, you’ve lived one heck of a story. You’re one of

02:30:47 the great scientists of cognitive science, of psychology, of computation. It’s a huge honor

02:30:58 that you would talk to me today and spend your very valuable time. I really enjoyed talking with

02:31:03 you and thank you for all the work you’ve done. I can’t wait to see what you do next.

02:31:06 Well, thank you so much and this has been an amazing opportunity for me to let ideas that I’ve

02:31:13 never fully expressed before come out because you asked such a wide range of the deeper questions

02:31:20 that we’ve all been thinking about for so long. So thank you very much for that.

02:31:24 Thank you. Thanks for listening to this conversation with Jay McClelland.

02:31:29 To support this podcast, please check out our sponsors in the description.

02:31:32 And now, let me leave you with some words from Jeffrey Hinton. In the long run,

02:31:37 curiosity-driven research works best. Real breakthroughs come from people focusing

02:31:43 on what they’re excited about. Thanks for listening and hope to see you next time.