Melanie Mitchell: Concepts, Analogies, Common Sense & Future of AI #61

Transcript

00:00:00 The following is a conversation with Melanie Mitchell.

00:00:03 She’s a professor of computer science

00:00:04 at Portland State University

00:00:06 and an external professor at Santa Fe Institute.

00:00:10 She has worked on and written about artificial intelligence

00:00:12 from fascinating perspectives,

00:00:14 including adaptive complex systems, genetic algorithms,

00:00:18 and the Copycat cognitive architecture,

00:00:20 which places the process of analogy making

00:00:23 at the core of human cognition.

00:00:26 From her doctoral work with her advisors,

00:00:28 Douglas Hofstadter and John Holland, to today,

00:00:32 she has contributed a lot of important ideas

00:00:34 to the field of AI, including her recent book,

00:00:37 simply called Artificial Intelligence,

00:00:39 A Guide for Thinking Humans.

00:00:42 This is the Artificial Intelligence Podcast.

00:00:45 If you enjoy it, subscribe on YouTube,

00:00:48 give it five stars on Apple Podcast,

00:00:50 support it on Patreon,

00:00:51 or simply connect with me on Twitter

00:00:53 at Lex Fridman, spelled F R I D M A N.

00:00:58 I recently started doing ads

00:00:59 at the end of the introduction.

00:01:01 I’ll do one or two minutes after introducing the episode

00:01:04 and never any ads in the middle

00:01:05 that can break the flow of the conversation.

00:01:08 I hope that works for you

00:01:09 and doesn’t hurt the listening experience.

00:01:12 I provide timestamps for the start of the conversation,

00:01:14 but it helps if you listen to the ad

00:01:17 and support this podcast by trying out the product

00:01:19 or service being advertised.

00:01:22 This show is presented by Cash App,

00:01:24 the number one finance app in the App Store.

00:01:27 I personally use Cash App to send money to friends,

00:01:30 but you can also use it to buy, sell,

00:01:32 and deposit Bitcoin in just seconds.

00:01:35 Cash App also has a new investing feature.

00:01:38 You can buy fractions of a stock, say $1 worth,

00:01:41 no matter what the stock price is.

00:01:43 Broker services are provided by Cash App Investing,

00:01:46 a subsidiary of Square and member SIPC.

00:01:49 I’m excited to be working with Cash App

00:01:51 to support one of my favorite organizations called FIRST,

00:01:55 best known for their FIRST Robotics and LEGO competitions.

00:01:58 They educate and inspire hundreds of thousands of students

00:02:02 in over 110 countries

00:02:04 and have a perfect rating on Charity Navigator,

00:02:06 which means that donated money is used

00:02:09 to maximum effectiveness.

00:02:11 When you get Cash App from the App Store or Google Play

00:02:14 and use code LexPodcast,

00:02:17 you’ll get $10 and Cash App will also donate $10 to First,

00:02:21 which again is an organization

00:02:23 that I’ve personally seen inspire girls and boys

00:02:26 to dream of engineering a better world.

00:02:28 And now here’s my conversation with Melanie Mitchell.

00:02:33 The name of your new book is Artificial Intelligence,

00:02:36 subtitle, A Guide for Thinking Humans.

00:02:39 The name of this podcast is Artificial Intelligence.

00:02:42 So let me take a step back

00:02:44 and ask the old Shakespeare question about roses.

00:02:46 And what do you think of the term artificial intelligence

00:02:51 for our big and complicated and interesting field?

00:02:55 I’m not crazy about the term.

00:02:57 I think it has a few problems

00:03:01 because it means so many different things

00:03:04 to different people.

00:03:05 And intelligence is one of those words

00:03:07 that isn’t very clearly defined either.

00:03:10 There’s so many different kinds of intelligence,

00:03:14 degrees of intelligence, approaches to intelligence.

00:03:18 John McCarthy was the one who came up with the term

00:03:21 artificial intelligence.

00:03:23 And from what I read,

00:03:24 he called it that to differentiate it from cybernetics,

00:03:28 which was another related movement at the time.

00:03:33 And he later regretted calling it artificial intelligence.

00:03:39 Herbert Simon was pushing for calling it

00:03:41 complex information processing,

00:03:45 which got nixed,

00:03:47 but probably is equally vague, I guess.

00:03:52 Is it the intelligence or the artificial

00:03:55 in terms of words that is most problematic, would you say?

00:03:58 Yeah, I think it’s a little of both.

00:04:01 But it has some good sides

00:04:02 because I personally was attracted to the field

00:04:07 because I was interested in the phenomenon of intelligence.

00:04:11 And if it was called complex information processing,

00:04:13 maybe I’d be doing something wholly different now.

00:04:16 What do you think of, I’ve heard the term used,

00:04:18 cognitive systems, for example, so using cognitive.

00:04:22 Yeah, I mean, cognitive has certain associations with it.

00:04:27 And people like to separate things like cognition

00:04:31 and perception, which I don’t actually think are separate.

00:04:33 But often people talk about cognition as being different

00:04:37 from sort of other aspects of intelligence.

00:04:41 It’s sort of higher level.

00:04:42 So to you, cognition is this broad,

00:04:44 beautiful mess of things that encompasses the whole thing.

00:04:47 Memory, perception.

00:04:48 Yeah, I think it’s hard to draw lines like that.

00:04:53 When I was coming out of grad school in 1990,

00:04:56 which is when I graduated,

00:04:58 that was during one of the AI winters.

00:05:01 And I was advised to not put AI,

00:05:05 artificial intelligence on my CV,

00:05:06 but instead call it intelligent systems.

00:05:09 So that was kind of a euphemism, I guess.

00:05:14 What about to stick briefly on terms and words,

00:05:20 the idea of artificial general intelligence,

00:05:24 or like Yann LeCun prefers human level intelligence,

00:05:29 sort of starting to talk about ideas

00:05:32 that achieve higher and higher levels of intelligence

00:05:37 and somehow artificial intelligence seems to be a term

00:05:41 used more for the narrow, very specific applications of AI

00:05:45 and sort of what set of terms appeal to you

00:05:51 to describe the thing that perhaps we strive to create?

00:05:56 People have been struggling with this

00:05:57 for the whole history of the field

00:06:00 and defining exactly what it is that we’re talking about.

00:06:03 You know, John Searle had this distinction

00:06:05 between strong AI and weak AI.

00:06:08 And weak AI could be general AI,

00:06:10 but his idea was strong AI was the view

00:06:14 that a machine is actually thinking,

00:06:18 that as opposed to simulating thinking

00:06:22 or carrying out processes that we would call intelligent.

00:06:30 At a high level, if you look at the founding

00:06:34 of the field of McCarthy and Searle and so on,

00:06:38 are we closer to having a better sense of that line

00:06:44 between narrow, weak AI and strong AI?

00:06:50 Yes, I think we’re closer to having a better idea

00:06:55 of what that line is.

00:06:58 Early on, for example, a lot of people thought

00:07:01 that playing chess would be, you couldn’t play chess

00:07:06 if you didn’t have sort of general human level intelligence.

00:07:11 And of course, once computers were able to play chess

00:07:13 better than humans, that revised that view.

00:07:18 And people said, okay, well, maybe now we have to revise

00:07:22 what we think of intelligence as.

00:07:25 And so that’s kind of been a theme

00:07:28 throughout the history of the field

00:07:29 is that once a machine can do some task,

00:07:34 we then have to look back and say, oh, well,

00:07:37 that changes my understanding of what intelligence is

00:07:39 because I don’t think that machine is intelligent,

00:07:43 at least that’s not what I wanna call intelligence.

00:07:45 So do you think that line moves forever

00:07:47 or will we eventually really feel as a civilization

00:07:51 like we’ve crossed the line if it’s possible?

00:07:54 It’s hard to predict, but I don’t see any reason

00:07:56 why we couldn’t in principle create something

00:08:00 that we would consider intelligent.

00:08:03 I don’t know how we will know for sure.

00:08:07 Maybe our own view of what intelligence is

00:08:10 will be refined more and more

00:08:12 until we finally figure out what we mean

00:08:14 when we talk about it.

00:08:17 But I think eventually we will create machines

00:08:22 in a sense that have intelligence.

00:08:24 They may not be the kinds of machines we have now.

00:08:28 And one of the things that that’s going to produce

00:08:32 is making us sort of understand

00:08:34 our own machine like qualities

00:08:38 that we in a sense are mechanical

00:08:43 in the sense that like cells,

00:08:45 cells are kind of mechanical.

00:08:47 They have algorithms, they process information,

00:08:52 and somehow out of this mass of cells,

00:08:57 we get this emergent property that we call intelligence.

00:09:01 But underlying it is really just cellular processing

00:09:07 and lots and lots and lots of it.

00:09:10 Do you think we’ll be able to,

00:09:12 do you think it’s possible to create intelligence

00:09:14 without understanding our own mind?

00:09:16 You said sort of in that process

00:09:18 we’ll understand more and more,

00:09:19 but do you think it’s possible to sort of create

00:09:23 without really fully understanding

00:09:26 from a mechanistic perspective,

00:09:27 sort of from a functional perspective

00:09:29 how our mysterious mind works?

00:09:32 If I had to bet on it, I would say,

00:09:36 no, we do have to understand our own minds

00:09:39 at least to some significant extent.

00:09:42 But I think that’s a really big open question.

00:09:47 I’ve been very surprised at how far kind of

00:09:49 brute force approaches based on say big data

00:09:53 and huge networks can take us.

00:09:57 I wouldn’t have expected that.

00:09:59 And they have nothing to do with the way our minds work.

00:10:03 So that’s been surprising to me, so it could be wrong.

00:10:06 To explore the psychological and the philosophical,

00:10:09 do you think we’re okay as a species

00:10:11 with something that’s more intelligent than us?

00:10:16 Do you think perhaps the reason

00:10:18 we’re pushing that line further and further

00:10:20 is we’re afraid of acknowledging

00:10:23 that there’s something stronger, better,

00:10:25 smarter than us humans?

00:10:29 Well, I’m not sure we can define intelligence that way

00:10:31 because smarter than is with respect to what?

00:10:40 Computers are already smarter than us in some areas.

00:10:42 They can multiply much better than we can.

00:10:45 They can figure out driving routes to take

00:10:50 much faster and better than we can.

00:10:51 They have a lot more information to draw on.

00:10:54 They know about traffic conditions and all that stuff.

00:10:57 So for any given particular task,

00:11:02 sometimes computers are much better than we are

00:11:04 and we’re totally happy with that, right?

00:11:07 I’m totally happy with that.

00:11:08 It doesn’t bother me at all.

00:11:10 I guess the question is which things about our intelligence

00:11:15 would we feel very sad or upset

00:11:20 that machines had been able to recreate?

00:11:24 So in the book, I talk about my former PhD advisor,

00:11:27 Douglas Hofstadter,

00:11:29 who encountered a music generation program.

00:11:32 And that was really the line for him,

00:11:36 that if a machine could create beautiful music,

00:11:40 that would be terrifying for him

00:11:44 because that is something he feels

00:11:46 is really at the core of what it is to be human,

00:11:50 creating beautiful music, art, literature.

00:11:56 He doesn’t like the fact that machines

00:11:59 can recognize spoken language really well.

00:12:05 He personally doesn’t like using speech recognition,

00:12:09 but I don’t think it bothers him to his core

00:12:11 because it’s like, okay, that’s not at the core of humanity.

00:12:15 But it may be different for every person

00:12:17 what really they feel would usurp their humanity.

00:12:25 And I think maybe it’s a generational thing also.

00:12:27 Maybe our children or our children’s children

00:12:30 will be adapted, they’ll adapt to these new devices

00:12:35 that can do all these tasks and say,

00:12:38 yes, this thing is smarter than me in all these areas,

00:12:41 but that’s great because it helps me.

00:12:46 Looking at the broad history of our species,

00:12:50 why do you think so many humans have dreamed

00:12:52 of creating artificial life and artificial intelligence

00:12:55 throughout the history of our civilization?

00:12:57 So not just this century or the 20th century,

00:13:00 but really throughout many centuries that preceded it?

00:13:06 That’s a really good question,

00:13:07 and I have wondered about that.

00:13:09 Because I myself was driven by curiosity

00:13:16 about my own thought processes

00:13:18 and thought it would be fantastic

00:13:20 to be able to get a computer

00:13:22 to mimic some of my thought processes.

00:13:26 I’m not sure why we’re so driven.

00:13:28 I think we want to understand ourselves better

00:13:33 and we also want machines to do things for us.

00:13:40 But I don’t know, there’s something more to it

00:13:42 because it’s so deep in the kind of mythology

00:13:45 or the ethos of our species.

00:13:49 And I don’t think other species have this drive.

00:13:52 So I don’t know.

00:13:53 If you were to sort of psychoanalyze yourself

00:13:55 in your own interest in AI, are you,

00:13:59 what excites you about creating intelligence?

00:14:07 You said understanding our own selves?

00:14:09 Yeah, I think that’s what drives me particularly.

00:14:13 I’m really interested in human intelligence,

00:14:22 but I’m also interested in the sort of the phenomenon

00:14:25 of intelligence more generally.

00:14:28 And I don’t think humans are the only thing

00:14:29 with intelligence, or even animals.

00:14:34 But I think intelligence is a concept

00:14:39 that encompasses a lot of complex systems.

00:14:43 And if you think of things like insect colonies

00:14:47 or cellular processes or the immune system

00:14:52 or all kinds of different biological

00:14:54 or even societal processes have as an emergent property

00:14:59 some aspects of what we would call intelligence.

00:15:02 They have memory, they process information,

00:15:05 they have goals, they accomplish their goals, et cetera.

00:15:08 And to me, the question of what is this thing

00:15:12 we’re talking about here was really fascinating to me.

00:15:17 And exploring it using computers seem to be a good way

00:15:22 to approach the question.

00:15:23 So do you think kind of of intelligence,

00:15:26 do you think of our universe as a kind of hierarchy

00:15:28 of complex systems?

00:15:30 And then intelligence is just the property of any,

00:15:33 you can look at any level and every level

00:15:36 has some aspect of intelligence.

00:15:39 So we’re just like one little speck

00:15:40 in that giant hierarchy of complex systems.

00:15:44 I don’t know if I would say any system

00:15:47 like that has intelligence, but I guess what I wanna,

00:15:52 I don’t have a good enough definition of intelligence

00:15:55 to say that.

00:15:56 So let me do sort of a multiple choice, I guess.

00:15:59 So you said ant colonies.

00:16:02 So are ant colonies intelligent?

00:16:04 Are the bacteria in our body intelligent?

00:16:09 And then going to the physics world molecules

00:16:13 and the behavior at the quantum level of electrons

00:16:18 and so on, are those kinds of systems,

00:16:21 do they possess intelligence?

00:16:22 Like where’s the line that feels compelling to you?

00:16:27 I don’t know.

00:16:28 I mean, I think intelligence is a continuum.

00:16:30 And I think that the ability to, in some sense,

00:16:35 have intention, have a goal,

00:16:37 have some kind of self awareness is part of it.

00:16:45 So I’m not sure if, you know,

00:16:47 it’s hard to know where to draw that line.

00:16:50 I think that’s kind of a mystery.

00:16:52 But I wouldn’t say that the planets orbiting the sun

00:16:59 is an intelligent system.

00:17:01 I mean, I would find that maybe not the right term

00:17:05 to describe that.

00:17:06 And there’s all this debate in the field

00:17:09 of like what’s the right way to define intelligence?

00:17:12 What’s the right way to model intelligence?

00:17:15 Should we think about computation?

00:17:16 Should we think about dynamics?

00:17:18 And should we think about free energy

00:17:21 and all of that stuff?

00:17:23 And I think that it’s a fantastic time to be in the field

00:17:28 because there’s so many questions

00:17:30 and so much we don’t understand.

00:17:32 There’s so much work to do.

00:17:33 So are we the most special kind of intelligence

00:17:38 in this kind of, you said there’s a bunch

00:17:41 of different elements and characteristics

00:17:43 of intelligence systems and colonies.

00:17:47 Is human intelligence the thing in our brain?

00:17:53 Is that the most interesting kind of intelligence

00:17:55 in this continuum?

00:17:57 Well, it’s interesting to us because it is us.

00:18:01 I mean, interesting to me, yes.

00:18:03 And because I’m part of, you know, human.

00:18:06 But to understanding the fundamentals of intelligence,

00:18:08 what I’m getting at, is studying the human,

00:18:11 is sort of, if everything we’ve talked about,

00:18:13 what you talk about in your book,

00:18:14 what just the AI field, this notion,

00:18:18 yes, it’s hard to define,

00:18:19 but it’s usually talking about something

00:18:22 that’s very akin to human intelligence.

00:18:24 Yeah, to me it is the most interesting

00:18:26 because it’s the most complex, I think.

00:18:29 It’s the most self aware.

00:18:32 It’s the only system, at least that I know of,

00:18:34 that reflects on its own intelligence.

00:18:38 And you talk about the history of AI

00:18:41 and us, in terms of creating artificial intelligence,

00:18:45 being terrible at predicting the future

00:18:48 with AI, with tech in general.

00:18:50 So why do you think we’re so bad at predicting the future?

00:18:56 Are we hopelessly bad?

00:18:59 So no matter what, whether it’s this decade

00:19:01 or the next few decades, every time we make a prediction,

00:19:04 there’s just no way of doing it well,

00:19:06 or as the field matures, we’ll be better and better at it.

00:19:10 I believe as the field matures, we will be better.

00:19:13 And I think the reason that we’ve had so much trouble

00:19:16 is that we have so little understanding

00:19:18 of our own intelligence.

00:19:20 So there’s the famous story about Marvin Minsky

00:19:29 assigning computer vision as a summer project

00:19:32 to his undergrad students.

00:19:34 And I believe that’s actually a true story.

00:19:36 Yeah, no, there’s a write up on it.

00:19:39 Everyone should read.

00:19:40 It’s like a, I think it’s like a proposal

00:19:43 that describes everything that should be done

00:19:46 in that project.

00:19:46 It’s hilarious because it, I mean, you could explain it,

00:19:49 but from my recollection, it describes basically

00:19:52 all the fundamental problems of computer vision,

00:19:55 many of which still haven’t been solved.

00:19:57 Yeah, and I don’t know how far

they really expected it to get.

00:20:01 But I think that, and they’re really,

00:20:04 Marvin Minsky is a super smart guy

00:20:06 and very sophisticated thinker.

00:20:08 But I think that no one really understands

00:20:12 or understood, still doesn’t understand

00:20:16 how complicated, how complex the things that we do are

00:20:22 because they’re so invisible to us.

00:20:24 To us, vision, being able to look out at the world

00:20:27 and describe what we see, that’s just immediate.

00:20:31 It feels like it’s no work at all.

00:20:33 So it didn’t seem like it would be that hard,

00:20:35 but there’s so much going on unconsciously,

00:20:39 sort of invisible to us that I think we overestimate

00:20:44 how easy it will be to get computers to do it.

00:20:50 And sort of for me to ask an unfair question,

00:20:53 you’ve done research, you’ve thought about

00:20:56 many different branches of AI through this book,

00:20:59 widespread looking at where AI has been, where it is today.

00:21:06 If you were to make a prediction,

00:21:08 how many years from now would we as a society

00:21:12 create something that you would say

00:21:15 achieved human level intelligence

00:21:19 or superhuman level intelligence?

00:21:23 That is an unfair question.

00:21:25 A prediction that will most likely be wrong.

00:21:28 But it’s just your notion because.

00:21:30 Okay, I’ll say more than 100 years.

00:21:34 More than 100 years.

00:21:35 And I quoted somebody in my book who said that

00:21:38 human level intelligence is 100 Nobel Prizes away,

00:21:44 which I like because it’s a nice way to sort of,

00:21:48 it’s a nice unit for prediction.

00:21:51 And it’s like that many fantastic discoveries

00:21:55 have to be made.

00:21:56 And of course there’s no Nobel Prize in AI, not yet at least.

00:22:03 If we look at that 100 years,

00:22:05 your sense is really the journey to intelligence

00:22:10 has to go through something more complicated

00:22:15 that’s akin to our own cognitive systems,

00:22:19 understanding them, being able to create them

00:22:21 in the artificial systems,

00:22:24 as opposed to sort of taking the machine learning

00:22:26 approaches of today and really scaling them

00:22:30 and scaling them and scaling them exponentially

00:22:33 with both compute and hardware and data.

00:22:37 That would be my guess.

00:22:42 I think that in the sort of going along in the narrow AI

00:22:47 that the current approaches will get better.

00:22:54 I think there’s some fundamental limits

00:22:56 to how far they’re gonna get.

00:22:59 I might be wrong, but that’s what I think.

00:23:01 And there’s some fundamental weaknesses that they have

00:23:06 that I talk about in the book that just comes

00:23:10 from this approach of supervised learning requiring

00:23:20 sort of feed forward networks and so on.

00:23:27 It’s just, I don’t think it’s a sustainable approach

00:23:31 to understanding the world.

00:23:34 Yeah, I’m personally torn on it.

00:23:36 Sort of everything you read about in the book

00:23:39 and sort of what we’re talking about now,

00:23:41 I agree with you, but I’m more and more,

00:23:45 depending on the day, first of all,

00:23:48 I’m deeply surprised by the success

00:23:50 of machine learning and deep learning in general.

00:23:52 From the very beginning, when I was,

00:23:54 it’s really been my main focus of work.

00:23:57 I’m just surprised how far it gets.

00:23:59 And I also think we’re really early on

00:24:03 in these efforts of these narrow AI.

00:24:07 So I think there’ll be a lot of surprise

00:24:09 of how far it gets.

00:24:11 I think we’ll be extremely impressed.

00:24:14 Like my sense is everything I’ve seen so far,

00:24:17 and we’ll talk about autonomous driving and so on,

00:24:19 I think we can get really far.

00:24:21 But I also have a sense that we will discover,

00:24:24 just like you said, is that even though we’ll get

00:24:27 really far in order to create something

00:24:30 like our own intelligence, it’s actually much farther

00:24:32 than we realize.

00:24:34 I think these methods are a lot more powerful

00:24:37 than people give them credit for actually.

00:24:39 So that of course there’s the media hype,

00:24:41 but I think there’s a lot of researchers in the community,

00:24:43 especially like not undergrads, right?

00:24:46 But like people who’ve been in AI,

00:24:48 they’re skeptical about how far deep learning can get.

00:24:50 And I’m more and more thinking that it can actually

00:24:54 get farther than they’ll realize.

00:24:56 It’s certainly possible.

00:24:58 One thing that surprised me when I was writing the book

00:25:00 is how far apart different people in the field are

00:25:03 on their opinion of how far the field has come

00:25:08 and what is accomplished and what’s gonna happen next.

00:25:11 What’s your sense of the different,

00:25:13 who are the different people, groups, mindsets,

00:25:17 thoughts in the community about where AI is today?

00:25:22 Yeah, they’re all over the place.

00:25:24 So there’s kind of the singularity transhumanism group.

00:25:30 I don’t know exactly how to characterize that approach,

00:25:33 which is sort of the sort of exponential,

00:25:36 exponential progress where we’re on the sort of

00:25:41 almost at the hugely accelerating part of the exponential.

00:25:45 And in the next 30 years,

00:25:49 we’re going to see super intelligent AI and all that,

00:25:54 and we’ll be able to upload our brains and that.

00:25:57 So there’s that kind of extreme view that most,

00:26:00 I think most people who work in AI don’t have.

00:26:04 They disagree with that.

00:26:06 But there are people who are,

00:26:09 maybe aren’t singularity people,

00:26:12 but they do think that the current approach

00:26:16 of deep learning is going to scale

00:26:20 and is going to kind of go all the way basically

00:26:23 and take us to true AI or human level AI

00:26:26 or whatever you wanna call it.

00:26:29 And there’s quite a few of them.

00:26:30 And a lot of them, like a lot of the people I’ve met

00:26:34 who work at big tech companies in AI groups

00:26:40 kind of have this view that we’re really not that far.

00:26:46 Just to linger on that point,

00:26:47 sort of if I can take as an example, like Yann LeCun,

00:26:50 I don’t know if you know about his work

00:26:52 and so his viewpoints on this.

00:26:54 I do.

00:26:55 He believes that there’s a bunch of breakthroughs,

00:26:57 like fundamental, like Nobel prizes that are needed still.

00:27:01 But I think he thinks those breakthroughs

00:27:03 will be built on top of deep learning.

00:27:06 And then there’s some people who think

00:27:08 we need to kind of put deep learning

00:27:11 to the side a little bit as just one module

00:27:14 that’s helpful in the bigger cognitive framework.

00:27:17 Right, so I think somewhat I understand Yann LeCun

00:27:22 is rightly saying supervised learning is not sustainable.

00:27:27 We have to figure out how to do unsupervised learning,

00:27:31 that that’s gonna be the key.

00:27:34 And I think that’s probably true.

00:27:39 I think unsupervised learning

00:27:40 is gonna be harder than people think.

00:27:43 I mean, the way that we humans do it.

00:27:47 Then there’s the opposing view,

00:27:50 there’s the Gary Marcus kind of hybrid view

00:27:55 where deep learning is one part,

00:27:58 but we need to bring back kind of these symbolic approaches

00:28:02 and combine them.

00:28:03 Of course, no one knows how to do that very well.

00:28:06 Which is the more important part to emphasize

00:28:10 and how do they fit together?

00:28:12 What’s the foundation?

00:28:13 What’s the thing that’s on top?

00:28:15 What’s the cake?

00:28:16 What’s the icing?

00:28:17 Right.

00:28:18 Then there’s people pushing different things.

00:28:22 There’s the people, the causality people who say,

00:28:26 deep learning as it’s formulated today

00:28:28 completely lacks any notion of causality.

00:28:32 And that’s, dooms it.

00:28:35 And therefore we have to somehow give it

00:28:37 some kind of notion of causality.

00:28:41 There’s a lot of push

00:28:45 from the more cognitive science crowd saying,

00:28:51 we have to look at developmental learning.

00:28:54 We have to look at how babies learn.

00:28:56 We have to look at intuitive physics,

00:29:00 all these things we know about physics.

00:29:03 And as somebody kind of quipped,

00:29:05 we also have to teach machines intuitive metaphysics,

00:29:08 which means like objects exist.

00:29:14 Causality exists.

00:29:17 These things that maybe we’re born with.

00:29:19 I don’t know that they don’t have the,

00:29:21 machines don’t have any of that.

00:29:23 They look at a group of pixels

00:29:26 and maybe they get 10 million examples,

00:29:31 but they can’t necessarily learn

00:29:34 that there are objects in the world.

00:29:38 So there’s just a lot of pieces of the puzzle

00:29:41 that people are promoting

00:29:44 and with different opinions of like how important they are

00:29:47 and how close we are to being able to put them all together

00:29:52 to create general intelligence.

00:29:54 Looking at this broad field,

00:29:56 what do you take away from it?

00:29:57 Who is the most impressive?

00:29:59 Is it the cognitive folks,

00:30:01 the Gary Marcus camp, the Yann LeCun camp,

00:30:05 unsupervised and self supervised.

00:30:07 There’s the supervised folks and then there’s the engineers

00:30:09 who are actually building systems.

00:30:11 You have sort of the Andrej Karpathy at Tesla

00:30:14 building actual, it’s not philosophy,

00:30:17 it’s real like systems that operate in the real world.

00:30:21 What do you take away from all this beautiful variety?

00:30:23 I don’t know if,

00:30:25 these different views are not necessarily

00:30:27 mutually exclusive.

00:30:29 And I think people like Yann LeCun

00:30:34 agrees with the developmental psychology of causality,

00:30:39 intuitive physics, et cetera.

00:30:43 But he still thinks that it’s learning,

00:30:45 like end to end learning is the way to go.

00:30:48 Will take us perhaps all the way.

00:30:50 Yeah, and that we don’t need,

00:30:51 there’s no sort of innate stuff that has to get built in.

00:30:56 This is, it’s because it’s a hard problem.

00:31:02 I personally, I’m very sympathetic

00:31:05 to the cognitive science side,

00:31:07 cause that’s kind of where I came in to the field.

00:31:10 I’ve become more and more sort of an embodiment adherent

00:31:15 saying that without having a body,

00:31:18 it’s gonna be very hard to learn

00:31:20 what we need to learn about the world.

00:31:24 That’s definitely something I’d love to talk about

00:31:26 in a little bit.

00:31:28 To step into the cognitive world,

00:31:31 then if you don’t mind,

00:31:32 cause you’ve done so many interesting things.

00:31:34 If you look at Copycat,

00:31:36 taking a step back a couple of decades,

00:31:40 you, Douglas Hofstadter and others

00:31:43 have created and developed Copycat

00:31:45 more than 30 years ago.

00:31:48 That’s painful to hear.

00:31:50 So what is it?

00:31:51 What is Copycat?

00:31:54 It’s a program that makes analogies

00:31:57 in an idealized domain,

00:32:00 idealized world of letter strings.

00:32:03 So as you say, 30 years ago, wow.

00:32:06 So I started working on it

00:32:07 when I started grad school in 1984.

00:32:12 Wow, dates me.

00:32:17 And it’s based on Doug Hofstadter’s ideas

00:32:21 about that analogy is really a core aspect of thinking.

00:32:30 I remember he has a really nice quote

00:32:32 in the book by himself and Emmanuel Sander

00:32:36 called Surfaces and Essences.

00:32:38 I don’t know if you’ve seen that book,

00:32:39 but it’s about analogy and he says,

00:32:43 without concepts, there can be no thought

00:32:46 and without analogies, there can be no concepts.

00:32:51 So the view is that analogy

00:32:52 is not just this kind of reasoning technique

00:32:55 where we go, shoe is to foot as glove is to what,

00:33:01 these kinds of things that we have on IQ tests or whatever,

00:33:05 but that it’s much deeper,

00:33:06 it’s much more pervasive in every thing we do,

00:33:10 in our language, our thinking, our perception.

00:33:16 So he had a view that was a very active perception idea.

00:33:20 So the idea was that instead of having kind of

00:33:26 a passive network in which you have input

00:33:31 that’s being processed through these feed forward layers

00:33:35 and then there’s an output at the end,

00:33:37 that perception is really a dynamic process

00:33:41 where like our eyes are moving around

00:33:43 and they’re getting information

00:33:44 and that information is feeding back

00:33:47 to what we look at next, influences,

00:33:50 what we look at next and how we look at it.

00:33:53 And so copycat was trying to do that,

00:33:56 kind of simulate that kind of idea

00:33:57 where you have these agents,

00:34:02 it’s kind of an agent based system

00:34:04 and you have these agents that are picking things

00:34:07 to look at and deciding whether they were interesting

00:34:10 or not and whether they should be looked at more

00:34:13 and that would influence other agents.

00:34:15 Now, how do they interact?

00:34:17 So they interacted through this global kind of

00:34:20 what we call the workspace.

00:34:22 So it’s actually inspired by the old blackboard systems

00:34:25 where you would have agents that post information

00:34:28 on a blackboard, a common blackboard.

00:34:30 This is like very old fashioned AI.

00:34:33 Is that, are we talking about like in physical space?

00:34:36 Is this a computer program?

00:34:37 It’s a computer program.

00:34:38 So agents posting concepts on a blackboard kind of thing?

00:34:41 Yeah, we called it a workspace.

00:34:43 And the workspace is a data structure.

00:34:48 The agents are little pieces of code

00:34:50 that you could think of them as little detectors

00:34:54 or little filters that say,

00:34:55 I’m gonna pick this place to look

00:34:57 and I’m gonna look for a certain thing

00:34:59 and is this the thing I think is important, is it there?

00:35:03 So it’s almost like, you know, a convolution in a way,

00:35:06 except a little bit more general and saying,

00:35:10 and then highlighting it in the workspace.

00:35:14 Once it’s in the workspace,

00:35:16 how do the things that are highlighted

00:35:18 relate to each other?

00:35:18 Like what’s, is this?

00:35:19 So there’s different kinds of agents

00:35:21 that can build connections between different things.

00:35:23 So just to give you a concrete example,

00:35:25 what Copycat did was it made analogies

00:35:28 between strings of letters.

00:35:30 So here’s an example.

00:35:31 ABC changes to ABD.

00:35:35 What does IJK change to?

00:35:39 And the program had some prior knowledge

00:35:41 about the alphabet, knew the sequence of the alphabet.

00:35:45 It had a concept of letter, successor of letter.

00:35:49 It had concepts of sameness.

00:35:50 So it has some innate things programmed in.

00:35:55 But then it could do things like say,

00:35:58 discover that ABC is a group of letters in succession.

00:36:06 And then an agent can mark that.

00:36:11 So the idea that there could be a sequence of letters,

00:36:16 is that a new concept that’s formed

00:36:18 or that’s a concept that’s innate?

00:36:19 That’s a concept that’s innate.

00:36:21 Sort of, can you form new concepts

00:36:23 or are all concepts innate? No.

00:36:25 So in this program, all the concepts

00:36:28 of the program were innate.

00:36:30 So, cause we weren’t, I mean,

00:36:32 obviously that limits it quite a bit.

00:36:35 But what we were trying to do is say,

00:36:37 suppose you have some innate concepts,

00:36:40 how do you flexibly apply them to new situations?

00:36:45 And how do you make analogies?
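
To make the letter-string domain concrete, here is a deliberately tiny, deterministic sketch that handles the abc to abd, ijk to ? example with a single hand-coded rule. It is not the Copycat architecture, which builds its answers with stochastic agents (codelets) working over a shared workspace and a network of concepts; the function names and the rule below are invented purely for illustration.

```python
# A toy for the letter-string analogy domain only -- NOT Copycat itself,
# which uses stochastic codelets, a workspace, and a network of concepts.
# Everything here is a hand-coded illustration.

def successor(ch: str) -> str:
    """Next letter of the alphabet (no handling of 'z' wrapping around)."""
    return chr(ord(ch) + 1)

def describe_change(src: str, dst: str):
    """Return a rule describing how src became dst, if this toy can find one."""
    if len(src) == len(dst) and src[:-1] == dst[:-1] and successor(src[-1]) == dst[-1]:
        return "replace the last letter with its successor"
    return None

def apply_rule(rule: str, target: str) -> str:
    if rule == "replace the last letter with its successor":
        return target[:-1] + successor(target[-1])
    raise ValueError(f"unknown rule: {rule}")

rule = describe_change("abc", "abd")
print(rule)                     # replace the last letter with its successor
print(apply_rule(rule, "ijk"))  # ijl
```

Part of Copycat’s point is that a fixed rule check like this breaks down on variants such as abc to abd with xyz as the target, where the concepts themselves have to slip; the program explores those mappings fluidly rather than by table lookup.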

00:36:47 Let’s step back for a second.

00:36:49 So I really liked that quote that you say,

00:36:51 without concepts, there could be no thought

00:36:53 and without analogies, there can be no concepts.

00:36:56 In a Santa Fe presentation,

00:36:58 you said that it should be one of the mantras of AI.

00:37:00 Yes.

00:37:01 And that you also yourself said,

00:37:04 how to form and fluidly use concepts

00:37:06 is the most important open problem in AI.

00:37:09 Yes.

00:37:11 How to form and fluidly use concepts

00:37:14 is the most important open problem in AI.

00:37:16 So let’s, what is a concept and what is an analogy?

00:37:21 A concept is in some sense a fundamental unit of thought.

00:37:28 So say we have a concept of a dog, okay?

00:37:38 And a concept is embedded in a whole space of concepts

00:37:45 so that there’s certain concepts that are closer to it

00:37:48 or farther away from it.

00:37:50 Are these concepts, are they really like fundamental,

00:37:53 like we mentioned innate, almost like axiomatic,

00:37:55 like very basic and then there’s other stuff

00:37:57 built on top of it?

00:37:58 Or does this include everything?

00:38:01 Are they complicated?

00:38:04 You can certainly form new concepts.

00:38:06 Right, I guess that’s the question I’m asking.

00:38:08 Can you form new concepts

00:38:10 that are complex combinations of other concepts?

00:38:14 Yes, absolutely.

00:38:15 And that’s kind of what we do in learning.

00:38:20 And then what’s the role of analogies in that?

00:38:22 So analogy is when you recognize

00:38:27 that one situation is essentially the same

00:38:33 as another situation.

00:38:35 And essentially is kind of the key word there

00:38:38 because it’s not the same.

00:38:39 So if I say, last week I did a podcast interview

00:38:44 actually like three days ago in Washington, DC.

00:38:52 And that situation was very similar to this situation,

00:38:56 although it wasn’t exactly the same.

00:38:58 It was a different person sitting across from me.

00:39:00 We had different kinds of microphones.

00:39:03 The questions were different.

00:39:04 The building was different.

00:39:06 There’s all kinds of different things,

00:39:07 but really it was analogous.

00:39:10 Or I can say, so doing a podcast interview,

00:39:14 that’s kind of a concept, it’s a new concept.

00:39:17 I never had that concept before this year essentially.

00:39:23 I mean, and I can make an analogy with it

00:39:27 like being interviewed for a news article in a newspaper.

00:39:31 And I can say, well, you kind of play the same role

00:39:35 that the newspaper reporter played.

00:39:40 It’s not exactly the same

00:39:42 because maybe they actually emailed me some written questions

00:39:45 rather than talking and the writing,

00:39:48 the written questions are analogous

00:39:52 to your spoken questions.

00:39:53 And there’s just all kinds of similarities.

00:39:55 And this somehow probably connects to conversations

00:39:57 you have over Thanksgiving dinner,

00:39:58 just general conversations.

00:40:01 There’s like a thread you can probably take

00:40:03 that just stretches out in all aspects of life

00:40:06 that connect to this podcast.

00:40:08 I mean, conversations between humans.

00:40:11 Sure, and if I go and tell a friend of mine

00:40:16 about this podcast interview, my friend might say,

00:40:20 oh, the same thing happened to me.

00:40:22 Let’s say, you ask me some really hard question

00:40:27 and I have trouble answering it.

00:40:29 My friend could say, the same thing happened to me,

00:40:31 but it was like, it wasn’t a podcast interview.

00:40:34 It wasn’t, it was a completely different situation.

00:40:39 And yet my friend is seeing essentially the same thing.

00:40:43 We say that very fluidly, the same thing happened to me.

00:40:46 Essentially the same thing.

00:40:48 But we don’t even say that, right?

00:40:50 We just say the same thing.

00:40:51 You imply it, yes.

00:40:51 Yeah, and the view that kind of went into, say, Copycat,

00:40:56 that whole thing is that that act of saying

00:41:00 the same thing happened to me is making an analogy.

00:41:04 And in some sense, that’s what underlies

00:41:07 all of our concepts.

00:41:10 Why do you think analogy making that you’re describing

00:41:14 is so fundamental to cognition?

00:41:17 Like it seems like it’s the main element action

00:41:20 of what we think of as cognition.

00:41:23 Yeah, so it can be argued that all of this

00:41:28 generalization we do of concepts

00:41:31 and recognizing concepts in different situations

00:41:39 is done by analogy.

00:41:42 That that’s, every time I’m recognizing

00:41:48 that say you’re a person, that’s by analogy

00:41:53 because I have this concept of what person is

00:41:55 and I’m applying it to you.

00:41:57 And every time I recognize a new situation,

00:42:02 like one of the things I talked about in the book

00:42:06 was the concept of walking a dog,

00:42:09 that that’s actually making an analogy

00:42:11 because all of the details are very different.

00:42:15 So reasoning could be reduced down

00:42:19 to essentially analogy making.

00:42:21 So all the things we think of as like,

00:42:25 yeah, like you said, perception.

00:42:26 So what’s perception is taking raw sensory input

00:42:29 and it’s somehow integrating into our understanding

00:42:33 of the world, updating the understanding.

00:42:34 And all of that has just this giant mess of analogies

00:42:39 that are being made.

00:42:40 I think so, yeah.

00:42:42 If you just linger on it a little bit,

00:42:44 like what do you think it takes to engineer

00:42:47 a process like that for us in our artificial systems?

00:42:52 We need to understand better, I think,

00:42:56 how we do it, how humans do it.

00:43:02 And it comes down to internal models, I think.

00:43:07 People talk a lot about mental models,

00:43:11 that concepts are mental models,

00:43:13 that I can, in my head, I can do a simulation

00:43:18 of a situation like walking a dog.

00:43:22 And there’s some work in psychology

00:43:25 that promotes this idea that all of concepts

00:43:29 are really mental simulations,

00:43:31 that whenever you encounter a concept

00:43:35 or situation in the world or you read about it or whatever,

00:43:38 you do some kind of mental simulation

00:43:40 that allows you to predict what’s gonna happen,

00:43:44 to develop expectations of what’s gonna happen.

00:43:47 So that’s the kind of structure I think we need,

00:43:51 is that kind of mental model that,

00:43:55 and in our brains, somehow these mental models

00:43:58 are very much interconnected.

00:44:01 Again, so a lot of stuff we’re talking about

00:44:03 are essentially open problems, right?

00:44:05 So if I ask a question, I don’t mean

00:44:08 that you would know the answer, only just hypothesizing.

00:44:11 But how big do you think is the network graph,

00:44:19 data structure of concepts that’s in our head?

00:44:23 Like if we’re trying to build that ourselves,

00:44:26 like it’s, we take it,

00:44:28 that’s one of the things we take for granted.

00:44:29 We think, I mean, that’s why we take common sense

00:44:32 for granted, we think common sense is trivial.

00:44:34 But how big of a thing of concepts

00:44:38 is that underlies what we think of as common sense,

00:44:42 for example?

00:44:44 Yeah, I don’t know.

00:44:45 And I’m not, I don’t even know what units to measure it in.

00:44:48 Can you say how big is it?

00:44:50 That’s beautifully put, right?

00:44:51 But, you know, we have, you know, it’s really hard to know.

00:44:55 We have, what, a hundred billion neurons or something.

00:45:00 I don’t know.

00:45:02 And they’re connected via trillions of synapses.

00:45:07 And there’s all this chemical processing going on.

00:45:10 There’s just a lot of capacity for stuff.

00:45:13 And their information’s encoded

00:45:15 in different ways in the brain.

00:45:17 It’s encoded in chemical interactions.

00:45:19 It’s encoded in electric, like firing and firing rates.

00:45:24 And nobody really knows how it’s encoded,

00:45:25 but it just seems like there’s a huge amount of capacity.

00:45:29 So I think it’s huge.

00:45:30 It’s just enormous.

00:45:32 And it’s amazing how much stuff we know.

00:45:36 Yeah.

00:45:38 And for, but we know, and not just know like facts,

00:45:42 but it’s all integrated into this thing

00:45:44 that we can make analogies with.

00:45:46 Yes.

00:45:47 There’s a dream of Semantic Web,

00:45:49 and there’s a lot of dreams from expert systems

00:45:53 of building giant knowledge bases.

00:45:56 Do you see a hope for these kinds of approaches

00:45:58 of building, of converting Wikipedia

00:46:01 into something that could be used in analogy making?

00:46:05 Sure.

00:46:07 And I think people have made some progress

00:46:09 along those lines.

00:46:10 I mean, people have been working on this for a long time.

00:46:13 But the problem is,

00:46:14 and this I think is the problem of common sense.

00:46:17 Like people have been trying to get

00:46:19 these common sense networks.

00:46:21 Here at MIT, there’s this ConceptNet project, right?

00:46:25 But the problem is that, as I said,

00:46:27 most of the knowledge that we have is invisible to us.

00:46:31 It’s not in Wikipedia.

00:46:33 It’s very basic things about intuitive physics,

00:46:42 intuitive psychology, intuitive metaphysics,

00:46:46 all that stuff.

00:46:47 If you were to create a website

00:46:49 that described intuitive physics, intuitive psychology,

00:46:53 would it be bigger or smaller than Wikipedia?

00:46:56 What do you think?

00:46:58 I guess described to whom?

00:47:00 I’m sorry, but.

00:47:03 No, that’s really good.

00:47:05 That’s exactly right, yeah.

00:47:07 That’s a hard question,

00:47:07 because how do you represent that knowledge

00:47:10 is the question, right?

00:47:12 I can certainly write down F equals MA

00:47:15 and Newton’s laws and a lot of physics

00:47:19 can be deduced from that.

00:47:23 But that’s probably not the best representation

00:47:27 of that knowledge for doing the kinds of reasoning

00:47:32 we want a machine to do.

00:47:35 So, I don’t know, it’s impossible to say now.

00:47:40 And people, you know, the projects,

00:47:43 like there’s the famous Cyc project, right,

00:47:46 that Douglas Lenat did that was trying.

00:47:50 That thing’s still going?

00:47:51 I think it’s still going.

00:47:52 And the idea was to try and encode

00:47:54 all of common sense knowledge,

00:47:56 including all this invisible knowledge

00:47:58 in some kind of logical representation.

00:48:03 And it just never, I think, could do any of the things

00:48:09 that he was hoping it could do,

00:48:11 because that’s just the wrong approach.

00:48:13 Of course, that’s what they always say, you know.

00:48:16 And then the history books will say,

00:48:18 well, the Cyc project finally found a breakthrough

00:48:21 in 2058 or something.

00:48:24 So much progress has been made in just a few decades

00:48:28 that who knows what the next breakthroughs will be.

00:48:31 It could be.

00:48:32 It’s certainly a compelling notion

00:48:34 what the Cyc project stands for.

00:48:37 I think Lenat was one of the earliest people

00:48:39 to say common sense is what we need.

00:48:43 That’s what we need.

00:48:44 All this like expert system stuff,

00:48:46 that is not gonna get you to AI.

00:48:49 You need common sense.

00:48:50 And he basically gave up his whole academic career

00:48:56 to go pursue that.

00:48:57 And I totally admire that,

00:48:59 but I think that the approach itself will not,

00:49:06 in 2040 or wherever, be successful.

00:49:09 What do you think is wrong with the approach?

00:49:10 What kind of approach might be successful?

00:49:14 Well, if I knew that.

00:49:15 Again, nobody knows the answer, right?

00:49:16 If I knew that, you know, one of my talks,

00:49:19 one of the people in the audience,

00:49:21 this is a public lecture,

00:49:22 one of the people in the audience said,

00:49:24 what AI companies are you investing in?

00:49:27 I’m like, well, I’m a college professor for one thing,

00:49:31 so I don’t have a lot of extra funds to invest,

00:49:34 but also like no one knows what’s gonna work in AI, right?

00:49:39 That’s the problem.

00:49:41 Let me ask another impossible question

00:49:43 in case you have a sense.

00:49:44 In terms of data structures

00:49:46 that will store this kind of information,

00:49:49 do you think they’ve been invented yet,

00:49:51 both in hardware and software?

00:49:54 Or is it something else needs to be, are we totally, you know?

00:49:58 I think something else has to be invented.

00:50:01 That’s my guess.

00:50:03 Is the breakthroughs that’s most promising,

00:50:06 would that be in hardware or in software?

00:50:09 Do you think we can get far with the current computers?

00:50:12 Or do we need to do something that you see?

00:50:14 I see what you’re saying.

00:50:16 I don’t know if Turing computation

00:50:18 is gonna be sufficient.

00:50:19 Probably, I would guess it will.

00:50:22 I don’t see any reason why we need anything else.

00:50:26 So in that sense, we have invented the hardware we need,

00:50:29 but we just need to make it faster and bigger,

00:50:31 and we need to figure out the right algorithms

00:50:34 and the right sort of architecture.

00:50:39 Turing, that’s a very mathematical notion.

00:50:43 When we try to have to build intelligence,

00:50:44 it’s now an engineering notion

00:50:46 where you throw all that stuff.

00:50:48 Well, I guess it is a question.

00:50:53 People have brought up this question,

00:50:56 and when you asked about, like, is our current hardware,

00:51:00 will our current hardware work?

00:51:02 Well, Turing computation says that our current hardware

00:51:08 is, in principle, a Turing machine, right?

00:51:13 So all we have to do is make it faster and bigger.

00:51:16 But there have been people like Roger Penrose,

00:51:20 if you might remember, that he said,

00:51:22 Turing machines cannot produce intelligence

00:51:26 because intelligence requires continuous valued numbers.

00:51:30 I mean, that was sort of my reading of his argument.

00:51:34 And quantum mechanics and what else, whatever.

00:51:38 But I don’t see any evidence for that,

00:51:41 that we need new computation paradigms.

00:51:48 But I don’t know if we’re, you know,

00:51:50 I don’t think we’re gonna be able to scale up

00:51:53 our current approaches to programming these computers.

00:51:58 What is your hope for approaches like Copycat

00:52:00 or other cognitive architectures?

00:52:02 I’ve talked to the creator of SOAR, for example.

00:52:04 I’ve used ACT-R myself.

00:52:06 I don’t know if you’re familiar with it.

00:52:07 Yeah, I am.

00:52:07 What do you think is,

00:52:10 what’s your hope of approaches like that

00:52:12 in helping develop systems of greater

00:52:15 and greater intelligence in the coming decades?

00:52:19 Well, that’s what I’m working on now,

00:52:22 is trying to take some of those ideas and extending it.

00:52:26 So I think there are some really promising approaches

00:52:30 that are going on now that have to do with

00:52:34 more active generative models.

00:52:39 So this is the idea of this simulation in your head,

00:52:42 the concept, when you, if you wanna,

00:52:46 when you’re perceiving a new situation,

00:52:49 you have some simulations in your head.

00:52:51 Those are generative models.

00:52:52 They’re generating your expectations.

00:52:54 They’re generating predictions.

00:52:55 So that’s part of a perception.

00:52:57 You have a mental model that generates a prediction

00:53:00 then you compare it with, and then the difference.

00:53:03 And you also, that generative model is telling you

00:53:07 where to look and what to look at

00:53:09 and what to pay attention to.

00:53:11 And it, I think it affects your perception.

00:53:14 It’s not that just you compare it with your perception.

00:53:16 It becomes your perception in a way.

00:53:21 It’s kind of a mixture of the bottom up information

00:53:28 coming from the world and your top down model

00:53:31 being imposed on the world is what becomes your perception.
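
As a minimal, hypothetical sketch of that predict, compare, and update loop (in the spirit of predictive-coding accounts), the toy below has a top-down model generate an expectation and lets the mismatch with the bottom-up input drive the update. The array sizes and the linear model are invented for illustration only.

```python
import numpy as np

# Invented illustration of a top-down / bottom-up perception loop:
# an internal state generates a prediction of the sensory input, and the
# prediction error (not the raw input) is what updates the state.
rng = np.random.default_rng(0)

W = rng.normal(size=(8, 4))    # maps internal state to expected sensory input
LEARNING_RATE = 0.01

def perceive(observation, state=None, steps=50):
    """Refine an internal state so top-down predictions match bottom-up input."""
    if state is None:
        state = np.zeros(W.shape[1])
    for _ in range(steps):
        prediction = W @ state              # top-down expectation
        error = observation - prediction    # bottom-up surprise
        state = state + LEARNING_RATE * (W.T @ error)
    return state

observation = rng.normal(size=8)            # stand-in for raw sensory input
print(perceive(observation))
```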

00:53:36 So your hope is something like that

00:53:37 can improve perception systems

00:53:39 and that they can understand things better.

00:53:41 Yes. To understand things.

00:53:42 Yes.

00:53:44 What’s the, what’s the step,

00:53:47 what’s the analogy making step there?

00:53:49 Well, there, the idea is that you have this

00:53:54 pretty complicated conceptual space.

00:53:57 You can talk about a semantic network or something like that

00:54:00 with these different kinds of concept models

00:54:04 in your brain that are connected.

00:54:07 So, so let’s, let’s take the example of walking a dog.

00:54:10 So we were talking about that.

00:54:12 Okay.

00:54:13 Let’s say I see someone out in the street walking a cat.

00:54:16 Some people walk their cats, I guess.

00:54:18 Seems like a bad idea, but.

00:54:19 Yeah.

00:54:21 So my model, my, you know,

00:54:23 there’s connections between my model of a dog

00:54:27 and model of a cat.

00:54:28 And I can immediately see the analogy

00:54:33 of that those are analogous situations,

00:54:38 but I can also see the differences

00:54:40 and that tells me what to expect.

00:54:43 So also, you know, I have a new situation.

00:54:48 So another example with the walking the dog thing

00:54:51 is sometimes people,

00:54:52 I see people riding their bikes with a leash,

00:54:55 holding a leash and the dogs running alongside.

00:54:57 Okay, so I know that the,

00:55:00 I recognize that as kind of a dog walking situation,

00:55:03 even though the person’s not walking, right?

00:55:06 And the dog’s not walking.

00:55:08 Because I have these models that say, okay,

00:55:14 riding a bike is sort of similar to walking

00:55:16 or it’s connected, it’s a means of transportation,

00:55:20 but I, because they have their dog there,

00:55:22 I assume they’re not going to work,

00:55:24 but they’re going out for exercise.

00:55:26 You know, these analogies help me to figure out

00:55:30 kind of what’s going on, what’s likely.
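
Purely as an illustration of the kind of interconnected concept models being described, here is an invented toy: a few situations represented with shared slots, and an analogy read off as what carries over versus what slips between two of them. Nothing here corresponds to an actual cognitive model; every structure and name is made up.

```python
# Invented toy "semantic network": concept models as slot/value dictionaries.
concepts = {
    "walking a dog": {
        "human_role": "walks", "animal": "dog",
        "link": "leash", "purpose": "exercise/outing",
    },
    "walking a cat": {
        "human_role": "walks", "animal": "cat",
        "link": "leash", "purpose": "exercise/outing",
    },
    "biking with a dog": {
        "human_role": "rides a bike", "animal": "dog",
        "link": "leash", "purpose": "exercise/outing",
    },
}

def analogy(source: str, target: str):
    """Split two situations into shared structure and 'slippages'."""
    src, tgt = concepts[source], concepts[target]
    shared = {k: v for k, v in src.items() if tgt.get(k) == v}
    slips = {k: (v, tgt.get(k)) for k, v in src.items() if tgt.get(k) != v}
    return shared, slips

shared, slips = analogy("walking a dog", "biking with a dog")
print("shared structure:", shared)  # leash, dog, exercise/outing
print("slippages:", slips)          # human_role: walks -> rides a bike
```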

00:55:33 But sort of these analogies are very human interpretable.

00:55:37 So that’s that kind of space.

00:55:38 And then you look at something

00:55:40 like the current deep learning approaches,

00:55:43 they kind of help you to take raw sensory information

00:55:46 and to sort of automatically build up hierarchies

00:55:49 of what you can even call them concepts.

00:55:52 They’re just not human interpretable concepts.

00:55:55 What’s your, what’s the link here?

00:55:58 Do you hope, sort of the hybrid system question,

00:56:05 how do you think the two can start to meet each other?

00:56:08 What’s the value of learning in this systems of forming,

00:56:14 of analogy making?

00:56:16 The goal of, you know, the original goal of deep learning

00:56:20 in at least visual perception was that

00:56:24 you would get the system to learn to extract features

00:56:27 that at these different levels of complexity.

00:56:30 So maybe edge detection and that would lead into learning,

00:56:34 you know, simple combinations of edges

00:56:36 and then more complex shapes

00:56:38 and then whole objects or faces.

00:56:42 And this was based on the ideas

00:56:47 of the neuroscientists, Hubel and Wiesel,

00:56:51 who had laid out this kind of structure in the brain.

00:56:58 And I think that’s right to some extent.

00:57:02 Of course, people have found that the whole story

00:57:05 is a little more complex than that.

00:57:07 And the brain of course always is

00:57:09 and there’s a lot of feedback.

00:57:10 So I see that as absolutely a good brain inspired approach

00:57:22 to some aspects of perception.
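
For reference, here is a minimal, generic feedforward convolutional stack of the kind being described, going from edge-like filters up to whole-object categories. It is a sketch only: the layer sizes are arbitrary and this is not any particular published network.

```python
import torch
import torch.nn as nn

# Generic illustration of the "edges -> combinations of edges -> shapes ->
# whole objects" hierarchy; all sizes are arbitrary choices for the sketch.
feature_hierarchy = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # low level: edge-like filters
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # mid level: combinations of edges
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # higher level: shapes and parts
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 10),                            # top: whole-object categories
)

x = torch.randn(1, 3, 64, 64)      # stand-in image batch
print(feature_hierarchy(x).shape)  # torch.Size([1, 10])
```

Note that information only flows forward here, which is exactly the limitation discussed next.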

00:57:25 But one thing that it’s lacking, for example,

00:57:29 is all of that feedback, which is extremely important.

00:57:33 The interactive element that you mentioned.

00:57:36 The expectation, right, the conceptual level.

00:57:39 Going back and forth with the expectation,

00:57:42 the perception and just going back and forth.

00:57:44 So, right, so that is extremely important.

00:57:47 And, you know, one thing about deep neural networks

00:57:52 is that in a given situation,

00:57:54 like, you know, they’re trained, right?

00:57:56 They get these weights and everything,

00:57:58 but then now I give them a new image, let’s say.

00:58:02 They treat every part of the image in the same way.

00:58:09 You know, they apply the same filters at each layer

00:58:13 to all parts of the image.

00:58:15 There’s no feedback to say like,

00:58:17 oh, this part of the image is irrelevant.

00:58:20 I shouldn’t care about this part of the image.

00:58:23 Or this part of the image is the most important part.

00:58:27 And that’s kind of what we humans are able to do

00:58:30 because we have these conceptual expectations.

00:58:33 So there’s a, by the way, a little bit of work in that.

00:58:35 There’s certainly a lot more in what’s under the,

00:58:38 called attention in natural language processing knowledge.

00:58:42 It’s a, and that’s exceptionally powerful.

00:58:46 And it’s a very, just as you say,

00:58:49 it’s a really powerful idea.

00:58:50 But again, in sort of machine learning,

00:58:53 it all kind of operates in an automated way.

00:58:55 That’s not human interpret.

00:58:56 It’s not also, okay, so that, right.

00:58:59 It’s not dynamic.

00:59:00 I mean, in the sense that as a perception

00:59:03 of a new example is being processed,

00:59:08 those attention’s weights don’t change.

00:59:12 Right, so I mean, there’s a kind of notion

00:59:17 that there’s not a memory.

00:59:20 So you’re not aggregating the idea of like,

00:59:23 this mental model.

00:59:25 Yes.

00:59:26 I mean, that seems to be a fundamental idea.

00:59:28 There’s not a really powerful,

00:59:30 I mean, there’s some stuff with memory,

00:59:32 but there’s not a powerful way to represent the world

00:59:37 in some sort of way that’s deeper than,

00:59:42 I mean, it’s so difficult because, you know,

00:59:45 neural networks do represent the world.

00:59:47 They do have a mental model, right?

00:59:50 But it just seems to be shallow.

00:59:53 It’s hard to criticize them at the fundamental level,

01:00:00 to me at least.

01:00:01 It’s easy to criticize them.

01:00:05 Well, look, like exactly what you’re saying,

01:00:07 mental models, sort of almost putting a psychology hat on,

01:00:11 to say, look, these networks are clearly not able

01:00:15 to achieve what we humans do with forming mental models,

01:00:18 analogy making and so on.

01:00:20 But that doesn’t mean that they fundamentally cannot do that.

01:00:23 Like it’s very difficult to say that.

01:00:25 I mean, at least to me,

01:00:26 do you have a notion that the learning approaches are really,

01:00:29 I mean, not only are they limited today,

01:00:34 but they will forever be limited

01:00:37 in being able to construct such mental models?

01:00:42 I think the idea of the dynamic perception is key here.

01:00:47 The idea that moving your eyes around and getting feedback.

01:00:54 And that’s something that, you know,

01:00:56 there’s been some models like that.

01:00:58 There’s certainly recurrent neural networks

01:01:00 that operate over several time steps.

01:01:03 But the problem is that the actual, the recurrence is,

01:01:07 you know, basically the feedback at the next time step

01:01:13 is the entire hidden state of the network,

01:01:18 which is, it turns out that that doesn’t work very well.
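
A minimal sketch of the kind of recurrence being described, where the only thing carried to the next time step is the hidden state vector (a plain vanilla RNN cell in NumPy, purely illustrative):

```python
# A vanilla RNN step in NumPy, purely illustrative.
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # All the "feedback" to the next time step is this one hidden vector.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

input_dim, hidden_dim, steps = 5, 8, 10
rng = np.random.default_rng(0)
W_xh = 0.1 * rng.normal(size=(hidden_dim, input_dim))
W_hh = 0.1 * rng.normal(size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for t in range(steps):
    x_t = rng.normal(size=input_dim)        # stand-in for the input at step t
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)   # the entire state carried forward is h
```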

01:01:25 But see, the thing I’m saying is mathematically speaking,

01:01:29 it has the information in that recurrence

01:01:33 to capture everything, it just doesn’t seem to work.

01:01:38 So like, you know, it’s like,

01:01:40 it’s the same Turing machine question, right?

01:01:44 Yeah, maybe theoretically, computers,

01:01:49 anything that's a universal Turing machine,

01:01:53 can be intelligent, but practically,

01:01:56 the architecture might need to be a very specific

01:01:59 kind of architecture to be able to create it.

01:02:04 So I guess it sort of asks almost the same question

01:02:09 again, which is how big of a role do you think deep learning

01:02:14 will play, or needs to play, in perception?

01:02:20 I think that deep learning

01:02:24 as it currently exists, you know,

01:02:27 that kind of thing will play some role.

01:02:31 But I think that there’s a lot more going on in perception.

01:02:36 But who knows, you know, the definition of deep learning,

01:02:39 I mean, it’s pretty broad.

01:02:41 It’s kind of an umbrella for a lot of different things.

01:02:43 So what I mean is purely sort of neural networks.

01:02:45 Yeah, feed forward neural networks.

01:02:48 Essentially, or there could be recurrence,

01:02:50 but sometimes it feels like,

01:02:53 for instance, I talked to Gary Marcus,

01:02:55 it feels like the criticism of deep learning

01:02:58 is kind of like us birds criticizing airplanes

01:03:02 for not flying well, or that they’re not really flying.

01:03:07 Do you think deep learning,

01:03:10 do you think it could go all the way?

01:03:12 Like Yann LeCun thinks.

01:03:14 Do you think that, yeah,

01:03:17 the brute force learning approach can go all the way?

01:03:21 I don’t think so, no.

01:03:23 I mean, I think it’s an open question,

01:03:25 but I tend to be on the innateness side

01:03:29 that there’s some things that we’ve been evolved

01:03:35 to be able to learn,

01:03:39 and that learning just can’t happen without them.

01:03:44 So one example, here’s an example I had in the book

01:03:47 that I think is useful to me, at least, in thinking about this.

01:03:51 So this has to do with

01:03:54 the DeepMind Atari game playing program, okay?

01:03:59 And it learned to play these Atari video games

01:04:02 just by getting input from the pixels of the screen,

01:04:08 and it learned to play the game Breakout

01:04:15 1,000% better than humans, okay?

01:04:18 That was one of their results, and it was great.

01:04:20 And it learned this thing where it tunneled through the side

01:04:23 of the bricks in the breakout game,

01:04:26 and the ball could bounce off the ceiling

01:04:28 and then just wipe out bricks.

01:04:30 Okay, so there was a group who did an experiment

01:04:36 where they took the paddle that you move with the joystick

01:04:41 and moved it up two pixels or something like that.

01:04:45 And then they looked at a deep Q learning system

01:04:49 that had been trained on Breakout and said,

01:04:51 could it now transfer its learning

01:04:53 to this new version of the game?

01:04:55 Of course, a human could, and it couldn’t.

01:04:58 Maybe that’s not surprising, but I guess the point is

01:05:00 it hadn’t learned the concept of a paddle.

01:05:04 It hadn’t learned the concept of a ball

01:05:07 or the concept of tunneling.

01:05:09 It was learning something, you know, we looking at it

01:05:12 kind of anthropomorphized it and said,

01:05:16 oh, here’s what it’s doing in the way we describe it.

01:05:18 But it actually didn’t learn those concepts.

01:05:21 And so because it didn’t learn those concepts,

01:05:23 it couldn’t make this transfer.

01:05:26 Yes, so that’s a beautiful statement,

01:05:28 but at the same time, by moving the paddle,

01:05:31 we also anthropomorphize flaws to inject into the system

01:05:36 that will then flip how impressed we are by it.

01:05:39 What I mean by that is, to me, the Atari games were,

01:05:43 to me, deeply impressive that that was possible at all.

01:05:48 So like I have to first pause on that,

01:05:50 and people should look at that, just like the game of Go,

01:05:53 which is fundamentally different to me

01:05:55 than what Deep Blue did.

01:05:59 Even though there’s still a tree search,

01:06:03 it’s just everything DeepMind has done in terms of learning,

01:06:08 however limited it is, is still deeply surprising to me.

01:06:11 Yeah, I’m not trying to say that what they did wasn’t impressive.

01:06:15 I think it was incredibly impressive.

01:06:17 To me, it’s interesting.

01:06:19 Is moving the board just another thing that needs to be learned?

01:06:24 So like we’ve been able to, maybe, maybe,

01:06:27 been able to, through the current neural networks,

01:06:29 learn very basic concepts

01:06:31 that are not enough to do this general reasoning,

01:06:34 and maybe with more data.

01:06:37 I mean, the interesting thing about the examples

01:06:41 that you talk about beautifully

01:06:44 is it’s often flaws of the data.

01:06:48 Well, that’s the question.

01:06:49 I mean, I think that is the key question,

01:06:51 whether it’s a flaw of the data or not.

01:06:53 Because the reason I brought up this example

01:06:56 was because you were asking,

01:06:57 do I think that learning from data could go all the way?

01:07:01 And this was why I brought up the example,

01:07:04 because I think, and this is not at all to take away

01:07:09 from the impressive work that they did,

01:07:11 but it’s to say that when we look at what these systems learn,

01:07:18 do they learn the things

01:07:21 that we humans consider to be the relevant concepts?

01:07:25 And in that example, it didn’t.

01:07:29 Sure, if you train it on moving, you know, the paddle being

01:07:34 in different places, maybe it could deal with,

01:07:38 maybe it would learn that concept.

01:07:40 I’m not totally sure.

01:07:42 But the question is, you know, scaling that up

01:07:44 to more complicated worlds,

01:07:48 to what extent could a machine

01:07:51 that only gets this very raw data

01:07:54 learn to divide up the world into relevant concepts?

01:07:58 And I don’t know the answer,

01:08:01 but I would bet that without some innate notion

01:08:08 that it can’t do it.

01:08:10 Yeah, 10 years ago, I would have 100% agreed with you,

01:08:12 as would most experts in AI,

01:08:15 but now I have a glimmer of hope.

01:08:19 Okay, I mean, that’s fair enough.

01:08:21 And I think that’s what deep learning did in the community is,

01:08:24 no, no, if I had to bet all my money,

01:08:26 it’s 100% deep learning will not take us all the way.

01:08:29 But there’s still other, it’s still,

01:08:31 I was so personally sort of surprised by the Atari games,

01:08:36 by Go, by the power of self play of just game playing

01:08:40 against each other that I was like many other times

01:08:44 just humbled of how little I know about what’s possible

01:08:48 in this approach.

01:08:49 Yeah, I think fair enough.

01:08:51 Self play is amazingly powerful.

01:08:53 And that goes way back to Arthur Samuel, right,

01:08:58 with his checkers playing program,

01:09:01 which was brilliant and surprising that it did so well.

01:09:06 So just for fun, let me ask you on the topic of autonomous vehicles.

01:09:10 It’s the area that I work at least these days most closely on,

01:09:15 and it’s also area that I think is a good example that you use

01:09:20 as sort of an example of things we as humans

01:09:25 don’t always realize how hard it is to do.

01:09:28 It’s like the constant trend in AI,

01:09:30 but the different problems that we think are easy

01:09:32 when we first try them and then realize how hard it is.

01:09:36 Okay, so you’ve talked about autonomous driving

01:09:41 being a difficult problem, more difficult than we realize

01:09:44 or give it credit for.

01:09:46 Why is it so difficult?

01:09:47 What are the most difficult parts in your view?

01:09:51 I think it’s difficult because of the world is so open ended

01:09:56 as to what kinds of things can happen.

01:09:59 So you have sort of what normally happens,

01:10:05 which is just you drive along and nothing surprising happens,

01:10:09 and autonomous vehicles,

01:10:12 the ones we have now, evidently can do really well

01:10:17 on most normal situations as long as the weather

01:10:21 is reasonably good and everything.

01:10:24 But we have this notion of edge cases

01:10:28 or things in the tail of the distribution,

01:10:32 we call it the long tail problem,

01:10:34 which says that there’s so many possible things

01:10:37 that can happen that were not in the training data

01:10:41 of the machine, that it won't be able to handle them

01:10:47 because it doesn’t have common sense.

01:10:50 Right, it’s the old, the paddle moved problem.

01:10:54 Yeah, it’s the paddle moved problem, right.

01:10:57 And so my understanding, and you probably are more

01:10:59 of an expert than I am on this,

01:11:01 is that current self driving car vision systems

01:11:07 have problems with obstacles, meaning that they don’t know

01:11:12 which obstacles, which quote unquote obstacles

01:11:15 they should stop for and which ones they shouldn’t stop for.

01:11:18 And so a lot of times I read that they tend to slam

01:11:21 on the brakes quite a bit.

01:11:23 And the most common accidents with self driving cars

01:11:27 are people rear ending them because they were surprised.

01:11:31 They weren’t expecting the machine, the car to stop.

01:11:35 Yeah, so there’s a lot of interesting questions there.

01:11:38 Whether, because you mentioned kind of two things.

01:11:42 So one is the problem of perception, of understanding,

01:11:46 of interpreting the objects that are detected correctly.

01:11:51 And the other one is more like the policy,

01:11:54 the action that you take, how you respond to it.

01:11:57 So a lot of the car’s braking is a kind of notion of,

01:12:02 to clarify, there’s a lot of different kind of things

01:12:05 that are people calling autonomous vehicles.

01:12:07 But the L4 vehicles with a safety driver are the ones

01:12:12 like Waymo and Cruise and those companies,

01:12:15 they tend to be very conservative and cautious.

01:12:18 So they tend to be very, very afraid of hurting anything

01:12:21 or anyone and getting in any kind of accidents.

01:12:24 So their policy kind of results

01:12:28 in being exceptionally responsive to anything

01:12:31 that could possibly be an obstacle, right?

01:12:33 Right, and to the human drivers around it,

01:12:38 it behaves unpredictably.

01:12:41 Yeah, that’s not a very human thing to do, caution.

01:12:43 That’s not the thing we’re good at, especially in driving.

01:12:46 We’re in a hurry, often angry and et cetera,

01:12:49 especially in Boston.

01:12:51 And then there’s sort of another, and a lot of times,

01:12:55 machine learning is not a huge part of that.

01:12:57 It’s becoming more and more unclear to me

01:13:00 how much it's used, sort of speaking to public information,

01:13:05 because a lot of companies say they’re doing deep learning

01:13:08 and machine learning just to attract good candidates.

01:13:12 The reality is in many cases,

01:13:14 it’s still not a huge part of the perception.

01:13:18 There’s LiDAR and there’s other sensors

01:13:20 that are much more reliable for obstacle detection.

01:13:23 And then there’s Tesla approach, which is vision only.

01:13:27 And there’s, I think a few companies doing that,

01:13:30 but Tesla most sort of famously pushing that forward.

01:13:32 And that’s because the LiDAR is too expensive, right?

01:13:35 Well, I mean, yes, but I would say

01:13:40 even if you were to give LiDAR to every Tesla vehicle for free,

01:13:44 I mean, Elon Musk fundamentally believes

01:13:47 that LiDAR is a crutch, right, he famously said that.

01:13:50 That if you want to solve the problem with machine learning,

01:13:55 the belief is that LiDAR should not be the primary sensor.

01:14:00 The camera contains a lot more information.

01:14:04 So if you want to learn, you want that information.

01:14:08 But if you want to not hit obstacles, you want LiDAR, right?

01:14:13 Sort of it’s this weird trade off because yeah,

01:14:18 sort of what Tesla vehicles have a lot of,

01:14:21 which is really the thing, the fallback,

01:14:26 the primary fallback sensor is radar,

01:14:29 which is a very crude version of LiDAR.

01:14:32 It’s a good detector of obstacles

01:14:34 except when those things are stationary, right?

01:14:37 The stopped vehicle.

01:14:39 Right, that’s why it had problems

01:14:41 with crashing into stopped fire trucks.

01:14:43 Stopped fire trucks, right.

01:14:44 So the hope there is that the vision sensor

01:14:47 would somehow catch that.

01:14:49 And for, there’s a lot of problems with perception.

01:14:54 They are doing actually some incredible stuff in the,

01:15:00 almost like an active learning space

01:15:02 where it’s constantly taking edge cases and pulling back in.

01:15:06 There’s this data pipeline.

01:15:08 Another aspect that is really important

01:15:12 that people are studying now is called multitask learning,

01:15:15 which is sort of breaking apart this problem,

01:15:18 whatever the problem is, in this case driving,

01:15:20 into dozens or hundreds of little problems

01:15:24 that you can turn into learning problems.
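
A minimal sketch of what such a multitask decomposition can look like, assuming a toy PyTorch model with one shared backbone and a few small task heads; the task names are hypothetical, not any company's actual pipeline:

```python
# Hypothetical task decomposition, for illustration only (assumes PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskDrivingNet(nn.Module):
    def __init__(self):
        super().__init__()
        # One shared backbone over the camera image...
        self.backbone = nn.Sequential(nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # ...and many small heads, one per little learning problem.
        self.heads = nn.ModuleDict({
            "lane_offset": nn.Linear(32, 1),   # regression
            "stop_sign": nn.Linear(32, 2),     # classification
            "pedestrian": nn.Linear(32, 2),    # classification
        })

    def forward(self, x):
        z = self.backbone(x)
        return {name: head(z) for name, head in self.heads.items()}

net = MultiTaskDrivingNet()
out = net(torch.randn(4, 3, 96, 96))                   # a batch of fake camera frames
loss = (F.mse_loss(out["lane_offset"], torch.zeros(4, 1))
        + F.cross_entropy(out["stop_sign"], torch.zeros(4, dtype=torch.long))
        + F.cross_entropy(out["pedestrian"], torch.ones(4, dtype=torch.long)))
loss.backward()   # one combined objective over many small problems
```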

01:15:26 So this giant pipeline, it’s kind of interesting.

01:15:30 I’ve been skeptical from the very beginning,

01:15:33 but become less and less skeptical over time

01:15:35 how much of driving can be learned.

01:15:37 I still think it’s much farther

01:15:39 than the CEO of that particular company thinks it will be,

01:15:44 but it’s constantly surprising that

01:15:48 through good engineering and data collection

01:15:51 and active selection of data,

01:15:53 how you can attack that long tail.

01:15:56 And it’s an interesting open question

01:15:58 that you’re absolutely right.

01:15:59 There’s a much longer tail

01:16:01 and all these edge cases that we don’t think about,

01:16:04 but it’s a fascinating question

01:16:06 that applies to natural language and all spaces.

01:16:09 How big is that long tail?

01:16:12 And I mean, not to linger on the point,

01:16:16 but what’s your sense in driving

01:16:19 in these practical problems of the human experience?

01:16:24 Can it be learned?

01:16:26 So what are your thoughts on sort of

01:16:28 the Elon Musk thought, let's forget the thing where he says

01:16:31 it’d be solved in a year,

01:16:33 but can it be solved in a reasonable timeline

01:16:38 or do fundamentally other methods need to be invented?

01:16:41 So I don’t, I think that ultimately driving,

01:16:47 so it’s a trade off in a way,

01:16:50 being able to drive and deal with any situation that comes up

01:16:56 does require kind of full human intelligence.

01:16:59 And even humans aren't intelligent enough to do it

01:17:02 because humans, I mean, most human accidents

01:17:06 are because the human wasn’t paying attention

01:17:09 or the human was drunk or whatever.

01:17:11 And not because they weren’t intelligent enough.

01:17:13 And not because they weren’t intelligent enough, right.

01:17:17 Whereas the accidents with autonomous vehicles

01:17:23 is because they weren’t intelligent enough.

01:17:25 They’re always paying attention.

01:17:26 Yeah, they’re always paying attention.

01:17:27 So it’s a trade off, you know,

01:17:29 and I think that it’s a very fair thing to say

01:17:32 that autonomous vehicles will be ultimately safer than humans

01:17:37 because humans are very unsafe.

01:17:39 It’s kind of a low bar.

01:17:42 But just like you said, I think humans deserve a better rap, right?

01:17:48 Because we’re really good at the common sense thing.

01:17:50 Yeah, we’re great at the common sense thing.

01:17:52 We’re bad at the paying attention thing.

01:17:53 Paying attention thing, right.

01:17:54 Especially when we’re, you know, driving is kind of boring

01:17:56 and we have these phones to play with and everything.

01:17:59 But I think what’s going to happen is that

01:18:06 for many reasons, not just AI reasons,

01:18:09 but also like legal and other reasons,

01:18:12 that the definition of self driving is going to change

01:18:17 or autonomous is going to change.

01:18:19 It’s not going to be just,

01:18:23 I’m going to go to sleep in the back

01:18:24 and you just drive me anywhere.

01:18:27 It’s going to be more certain areas are going to be instrumented

01:18:34 to have the sensors and the mapping

01:18:37 and all of the stuff you need,

01:18:39 so that the autonomous cars won't have to have full common sense

01:18:43 and they’ll do just fine in those areas

01:18:46 as long as pedestrians don’t mess with them too much.

01:18:49 That’s another question.

01:18:51 That’s right.

01:18:52 But I don’t think we will have fully autonomous self driving

01:18:59 in the way that like most,

01:19:01 the average person thinks of it for a very long time.

01:19:04 And just to reiterate, this is the interesting open question

01:19:09 that I think I agree with you on,

01:19:11 is to solve fully autonomous driving,

01:19:14 you have to be able to engineer in common sense.

01:19:17 Yes.

01:19:19 I think it’s an important thing to hear and think about.

01:19:23 I hope that’s wrong, but I currently agree with you

01:19:27 that unfortunately you do have to have, to be more specific,

01:19:32 sort of these deep understandings of physics

01:19:35 and of the way this world works and also the human dynamics.

01:19:39 Like you mentioned, pedestrians and cyclists,

01:19:41 actually that’s whatever that nonverbal communication

01:19:45 as some people call it,

01:19:46 there’s that dynamic that is also part of this common sense.

01:19:50 Right.

01:19:51 And we humans are pretty good at predicting

01:19:55 what other humans are going to do.

01:19:57 And how our actions impact the behaviors

01:20:00 of this weird game theoretic dance that we’re good at somehow.

01:20:05 And the funny thing is,

01:20:07 because I’ve watched countless hours of pedestrian video

01:20:11 and talked to people,

01:20:12 we humans are also really bad at articulating

01:20:15 the knowledge we have.

01:20:16 Right.

01:20:17 Which has been a huge challenge.

01:20:19 Yes.

01:20:20 So you’ve mentioned embodied intelligence.

01:20:23 What do you think it takes to build a system

01:20:25 of human level intelligence?

01:20:27 Does it need to have a body?

01:20:29 I’m not sure, but I’m coming around to that more and more.

01:20:34 And what does it mean to be,

01:20:36 I don’t mean to keep bringing up Yann LeCun.

01:20:40 He looms very large.

01:20:42 Well, he certainly has a large personality.

01:20:45 Yes.

01:20:46 He thinks that the system needs to be grounded,

01:20:49 meaning he needs to sort of be able to interact with reality,

01:20:53 but doesn’t think it necessarily needs to have a body.

01:20:56 So when you think of…

01:20:57 So what’s the difference?

01:20:58 I guess I want to ask,

01:21:00 when you mean body,

01:21:01 do you mean you have to be able to play with the world?

01:21:04 Or do you also mean like there’s a body

01:21:06 that you have to preserve?

01:21:10 Oh, that’s a good question.

01:21:11 I haven’t really thought about that,

01:21:13 but I think both, I would guess.

01:21:15 Because I think intelligence,

01:21:23 it’s so hard to separate it from our desire

01:21:29 for self preservation,

01:21:31 our emotions,

01:21:34 all that non rational stuff

01:21:37 that kind of gets in the way of logical thinking.

01:21:43 Because the way,

01:21:46 if we’re talking about human intelligence

01:21:48 or human level intelligence, whatever that means,

01:21:51 a huge part of it is social.

01:21:55 We were evolved to be social

01:21:58 and to deal with other people.

01:22:01 And that’s just so ingrained in us

01:22:05 that it’s hard to separate intelligence from that.

01:22:09 I think AI for the last 70 years

01:22:14 or however long it’s been around,

01:22:16 it has largely been separated.

01:22:18 There’s this idea that there’s like,

01:22:20 it’s kind of very Cartesian.

01:22:23 There’s this thinking thing that we’re trying to create,

01:22:27 but we don’t care about all this other stuff.

01:22:30 And I think the other stuff is very fundamental.

01:22:34 So there’s idea that things like emotion

01:22:37 can get in the way of intelligence.

01:22:40 As opposed to being an integral part of it.

01:22:42 Integral part of it.

01:22:43 So, I mean, I’m Russian,

01:22:45 so I romanticize the notions of emotion and suffering

01:22:48 and all that kind of fear of mortality,

01:22:51 those kinds of things.

01:22:52 So in AI, especially.

01:22:56 By the way, did you see that?

01:22:57 There was this recent thing going around the internet.

01:23:00 Some, I think he’s a Russian or some Slavic

01:23:03 had written this thing,

01:23:05 anti the idea of super intelligence.

01:23:08 I forgot, maybe he’s Polish.

01:23:10 Anyway, so it had all these arguments

01:23:12 and one was the argument from Slavic pessimism.

01:23:15 My favorite.

01:23:19 Do you remember what the argument is?

01:23:21 It’s like nothing ever works.

01:23:23 Everything sucks.

01:23:27 So what do you think is the role?

01:23:29 Like that’s such a fascinating idea

01:23:31 that what we perceive as sort of the limits of the human mind,

01:23:38 which is emotion and fear and all those kinds of things

01:23:42 are integral to intelligence.

01:23:45 Could you elaborate on that?

01:23:47 Like why is that important, do you think?

01:23:54 For human level intelligence.

01:23:58 At least for the way the humans work,

01:24:00 it’s a big part of how it affects how we perceive the world.

01:24:04 It affects how we make decisions about the world.

01:24:07 It affects how we interact with other people.

01:24:10 It affects our understanding of other people.

01:24:14 For me to understand what you’re likely to do,

01:24:21 I need to have kind of a theory of mind

01:24:22 and that’s very much a theory of emotion

01:24:27 and motivations and goals.

01:24:32 And to understand that,

01:24:35 we have this whole system of mirror neurons.

01:24:42 I sort of understand your motivations

01:24:45 through sort of simulating it myself.

01:24:49 So it’s not something that I can prove that’s necessary,

01:24:55 but it seems very likely.

01:24:58 So, okay.

01:25:01 You’ve written the op ed in the New York Times titled

01:25:04 We Shouldn’t Be Scared by Superintelligent AI

01:25:07 and it criticized a little bit Stuart Russell and Nick Bostrom.

01:25:13 Can you try to summarize that article’s key ideas?

01:25:18 So it was spurred by an earlier New York Times op ed

01:25:22 by Stuart Russell, which was summarizing his book

01:25:26 called Human Compatible.

01:25:28 And the article was saying if we have superintelligent AI,

01:25:36 we need to have its values aligned with our values

01:25:40 and it has to learn about what we really want.

01:25:43 And he gave this example.

01:25:45 What if we have a superintelligent AI

01:25:48 and we give it the problem of solving climate change

01:25:52 and it decides that the best way to lower the carbon

01:25:56 in the atmosphere is to kill all the humans?

01:25:59 Okay.

01:26:00 So to me, that just made no sense at all

01:26:02 because a superintelligent AI,

01:26:08 first of all, trying to figure out what a superintelligence means

01:26:13 and it seems that something that’s superintelligent

01:26:21 can’t just be intelligent along this one dimension of,

01:26:24 okay, I’m going to figure out all the steps,

01:26:26 the best optimal path to solving climate change

01:26:30 and not be intelligent enough to figure out

01:26:32 that humans don’t want to be killed,

01:26:36 that you could get to one without having the other.

01:26:39 And, you know, Bostrom, in his book,

01:26:43 talks about the orthogonality hypothesis

01:26:46 where he says he thinks that a system’s,

01:26:51 I can’t remember exactly what it is,

01:26:52 but like a system’s goals and its values

01:26:56 don’t have to be aligned.

01:26:58 There’s some orthogonality there,

01:27:00 which didn’t make any sense to me.

01:27:02 So you’re saying in any system that’s sufficiently

01:27:06 not even superintelligent,

01:27:07 but as opposed to greater and greater intelligence,

01:27:09 there’s a holistic nature that will sort of,

01:27:11 a tension that will naturally emerge

01:27:14 that prevents it from sort of any one dimension running away.

01:27:17 Yeah, yeah, exactly.

01:27:19 So, you know, Bostrom had this example

01:27:23 of the superintelligent AI that makes,

01:27:28 that turns the world into paper clips

01:27:30 because its job is to make paper clips or something.

01:27:33 And that just, as a thought experiment,

01:27:35 didn’t make any sense to me.

01:27:37 Well, as a thought experiment

01:27:39 or as a thing that could possibly be realized?

01:27:42 Either.

01:27:43 So I think that, you know,

01:27:45 what my op ed was trying to do was say

01:27:47 that intelligence is more complex

01:27:50 than these people are presenting it.

01:27:53 That it’s not like, it’s not so separable.

01:27:58 The rationality, the values, the emotions,

01:28:03 the, all of that, that it’s,

01:28:06 the view that you could separate all these dimensions

01:28:09 and build a machine that has one of these dimensions

01:28:12 and it’s superintelligent in one dimension,

01:28:14 but it doesn’t have any of the other dimensions.

01:28:17 That’s what I was trying to criticize

01:28:22 that I don’t believe that.

01:28:24 So can I read a few sentences

01:28:28 from Yoshua Bengio who is always super eloquent?

01:28:35 So he writes,

01:28:38 I have the same impression as Melanie

01:28:40 that our cognitive biases are linked

01:28:42 with our ability to learn to solve many problems.

01:28:45 They may also be a limiting factor for AI.

01:28:49 However, this is a "may," in quotes.

01:28:53 Things may also turn out differently

01:28:55 and there’s a lot of uncertainty

01:28:56 about the capabilities of future machines.

01:28:59 But more importantly for me,

01:29:02 the value alignment problem is a problem

01:29:04 well before we reach some hypothetical superintelligence.

01:29:08 It is already posing a problem

01:29:10 in the form of super powerful companies

01:29:13 whose objective function may not be sufficiently aligned

01:29:17 with humanity’s general wellbeing,

01:29:19 creating all kinds of harmful side effects.

01:29:21 So he goes on to argue that the orthogonality

01:29:28 and those kinds of things,

01:29:29 the concerns of just aligning values

01:29:32 with the capabilities of the system

01:29:34 is something that might come long

01:29:37 before we reach anything like superintelligence.

01:29:40 So your criticism is kind of really nicely saying that

01:29:44 this idea of superintelligent systems

01:29:46 seems to be dismissing fundamental parts

01:29:48 of what intelligence would take.

01:29:50 And then Yoshua kind of says, yes,

01:29:53 but if we look at systems that are much less intelligent,

01:29:57 there might be these same kinds of problems that emerge.

01:30:02 Sure, but I guess the example that he gives there

01:30:06 of these corporations, that’s people, right?

01:30:09 Those are people’s values.

01:30:11 I mean, we’re talking about people,

01:30:13 the corporations are,

01:30:16 their values are the values of the people

01:30:20 who run those corporations.

01:30:21 But the idea is the algorithm, that’s right.

01:30:24 So the fundamental,

01:30:26 the fundamental element that does the bad thing

01:30:30 is a human being.

01:30:31 Yeah.

01:30:32 But the algorithm kind of controls the behavior

01:30:36 of this mass of human beings.

01:30:38 Which algorithm?

01:30:40 For a company that’s the,

01:30:42 so for example, if it’s an advertisement driven company

01:30:44 that recommends certain things

01:30:47 and encourages engagement,

01:30:50 so it gets money by encouraging engagement

01:30:53 and therefore the company more and more,

01:30:57 it’s like the cycle that builds an algorithm

01:31:00 that enforces more engagement

01:31:03 and may perhaps more division in the culture

01:31:05 and so on, so on.

01:31:07 I guess the question here is sort of who has the agency?

01:31:12 So you might say, for instance,

01:31:14 we don’t want our algorithms to be racist.

01:31:17 Right.

01:31:18 And facial recognition,

01:31:21 some people have criticized some facial recognition systems

01:31:23 as being racist because they’re not as good

01:31:26 on darker skin than lighter skin.

01:31:29 That’s right.

01:31:30 Okay.

01:31:31 But the agency there,

01:31:33 the actual facial recognition algorithm

01:31:36 isn’t what has the agency.

01:31:38 It’s not the racist thing, right?

01:31:41 It’s the, I don’t know,

01:31:44 the combination of the training data,

01:31:48 the cameras being used, whatever.

01:31:51 But my understanding of,

01:31:53 and I agree with Bengio there that he,

01:31:56 I think there are these value issues

01:31:59 with our use of algorithms.

01:32:02 But my understanding of what Russell’s argument was

01:32:09 is more that the machine itself has the agency now.

01:32:14 It’s the thing that’s making the decisions

01:32:17 and it’s the thing that has what we would call values.

01:32:21 Yes.

01:32:22 So whether that’s just a matter of degree,

01:32:25 it’s hard to say, right?

01:32:27 But I would say that’s sort of qualitatively different

01:32:30 than a face recognition neural network.

01:32:34 And to broadly linger on that point,

01:32:38 if you look at Elon Musk or Stuart Russell or Bostrom,

01:32:42 people who are worried about existential risks of AI,

01:32:45 however far into the future,

01:32:47 the argument goes is it eventually happens.

01:32:50 We don’t know how far, but it eventually happens.

01:32:53 Do you share any of those concerns

01:32:56 and what kind of concerns in general do you have about AI

01:32:59 that approach anything like existential threat to humanity?

01:33:06 So I would say, yes, it’s possible,

01:33:10 but I think there’s a lot more closer in existential threats to humanity.

01:33:15 As you said, like a hundred years, for your timeline.

01:33:18 It’s more than a hundred years.

01:33:20 More than a hundred years.

01:33:21 Maybe even more than 500 years.

01:33:23 I don’t know.

01:33:24 So the existential threats are so far out that the future is,

01:33:29 I mean, there’ll be a million different technologies

01:33:32 that we can’t even predict now

01:33:34 that will fundamentally change the nature of our behavior,

01:33:37 reality, society, and so on before then.

01:33:39 Yeah, I think so.

01:33:40 I think so.

01:33:41 And we have so many other pressing existential threats going on right now.

01:33:46 Nuclear weapons even.

01:33:47 Nuclear weapons, climate problems, poverty, possible pandemics.

01:33:57 You can go on and on.

01:33:59 And I think worrying about existential threat from AI

01:34:05 is not the best priority for what we should be worrying about.

01:34:13 That’s kind of my view, because we’re so far away.

01:34:15 But I’m not necessarily criticizing Russell or Bostrom or whoever

01:34:24 for worrying about that.

01:34:26 And I think some people should be worried about it.

01:34:29 It’s certainly fine.

01:34:30 But I was more getting at their view of what intelligence is.

01:34:38 So I was more focusing on their view of superintelligence

01:34:42 than just the fact of them worrying.

01:34:49 And the title of the article was written by the New York Times editors.

01:34:54 I wouldn’t have called it that.

01:34:55 We shouldn’t be scared by superintelligence.

01:34:58 No.

01:34:59 If you wrote it, it’d be like we should redefine what you mean by superintelligence.

01:35:02 I actually said something like superintelligence is not a sort of coherent idea.

01:35:13 But that’s not something the New York Times would put in.

01:35:18 And the follow up argument that Yoshua makes also,

01:35:22 not argument, but a statement, and I’ve heard him say it before.

01:35:25 And I think I agree.

01:35:27 He kind of has a very friendly way of phrasing it.

01:35:30 It’s good for a lot of people to believe different things.

01:35:34 He’s such a nice guy.

01:35:36 Yeah.

01:35:37 But it’s also practically speaking like we shouldn’t be like,

01:35:42 while your article stands, like Stuart Russell does amazing work.

01:35:46 Bostrom does amazing work.

01:35:48 You do amazing work.

01:35:49 And even when you disagree about the definition of superintelligence

01:35:53 or the usefulness of even the term,

01:35:56 it’s still useful to have people that like use that term, right?

01:36:01 And then argue.

01:36:02 Sure.

01:36:03 I absolutely agree with Bengio there.

01:36:05 And I think it’s great that, you know,

01:36:08 and it’s great that New York Times will publish all this stuff.

01:36:10 That’s right.

01:36:11 It’s an exciting time to be here.

01:36:13 What do you think is a good test of intelligence?

01:36:16 Is natural language ultimately a test that you find the most compelling,

01:36:21 like the original or the higher levels of the Turing test kind of?

01:36:28 Yeah, I still think the original idea of the Turing test

01:36:33 is a good test for intelligence.

01:36:36 I mean, I can’t think of anything better.

01:36:38 You know, the Turing test, the way that it’s been carried out so far

01:36:42 has been very impoverished, if you will.

01:36:47 But I think a real Turing test that really goes into depth,

01:36:52 like the one that I mentioned, I talk about in the book,

01:36:54 I talk about how Ray Kurzweil and Mitch Kapor have this bet, right?

01:36:59 That in 2029, I think is the date there,

01:37:04 a machine will pass the Turing test and they have a very specific,

01:37:09 like how many hours, expert judges and all of that.

01:37:14 And, you know, Kurzweil says yes, Kapoor says no.

01:37:17 We only have like nine more years to go to see.

01:37:21 But I, you know, if something, a machine could pass that,

01:37:27 I would be willing to call it intelligent.

01:37:30 Of course, nobody will.

01:37:33 They will say that’s just a language model, if it does.

01:37:37 So you would be comfortable, so language, a long conversation that,

01:37:43 well, yeah, you’re, I mean, you’re right,

01:37:45 because I think probably to carry out that long conversation,

01:37:48 you would literally need to have deep common sense understanding of the world.

01:37:52 I think so.

01:37:54 And the conversation is enough to reveal that.

01:37:57 I think so.

01:37:59 So another super fun topic of complexity that you have worked on, written about.

01:38:09 Let me ask the basic question.

01:38:10 What is complexity?

01:38:12 So complexity is another one of those terms like intelligence.

01:38:17 It’s perhaps overused.

01:38:18 But my book about complexity was about this wide area of complex systems,

01:38:29 studying different systems in nature, in technology,

01:38:35 in society in which you have emergence, kind of like I was talking about with intelligence.

01:38:41 You know, we have the brain, which has billions of neurons.

01:38:45 And each neuron individually could be said to be not very complex compared to the system as a whole.

01:38:53 But the system, the interactions of those neurons and the dynamics,

01:38:58 creates these phenomena that we call intelligence or consciousness,

01:39:04 you know, that we consider to be very complex.

01:39:08 So the field of complexity is trying to find general principles that underlie all these systems

01:39:16 that have these kinds of emergent properties.

01:39:19 And the emergence occurs from what underlies the complex system, which is usually simple, fundamental interactions.

01:39:27 Yes.

01:39:28 And the emergence happens when there’s just a lot of these things interacting.

01:39:34 Yes.

01:39:35 Sort of, and then, most of science to date, can you talk about what reductionism is?

01:39:45 Well, reductionism is when you try and take a system and divide it up into its elements,

01:39:54 whether those be cells or atoms or subatomic particles, whatever your field is,

01:40:02 and then try and understand those elements.

01:40:06 And then try and build up an understanding of the whole system by looking at sort of the sum of all the elements.

01:40:13 So what’s your sense?

01:40:15 Whether we’re talking about intelligence or these kinds of interesting complex systems,

01:40:20 is it possible to understand them in a reductionist way,

01:40:24 which is probably the approach of most of science today, right?

01:40:29 I don’t think it’s always possible to understand the things we want to understand the most.

01:40:35 So I don’t think it’s possible to look at single neurons and understand what we call intelligence,

01:40:45 to look at sort of summing up, and sort of the summing up is the issue here.

01:40:54 One example is the human genome, right, so there was a lot of work on, and excitement about, sequencing the human genome

01:41:03 because the idea would be that we’d be able to find genes that underlies diseases.

01:41:10 But it turns out that, and it was a very reductionist idea, you know, we figure out what all the parts are,

01:41:18 and then we would be able to figure out which parts cause which things.

01:41:22 But it turns out that the parts don’t cause the things that we’re interested in.

01:41:25 It’s like the interactions, it’s the networks of these parts.

01:41:30 And so that kind of reductionist approach didn’t yield the explanation that we wanted.
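
A tiny numerical illustration of that point, with made-up data: a "trait" that is the XOR of two binary factors, so neither factor alone correlates with it, but the interaction determines it completely:

```python
# Made-up data: a trait determined by the interaction of two factors (XOR).
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=10_000)
b = rng.integers(0, 2, size=10_000)
trait = a ^ b

print(np.corrcoef(a, trait)[0, 1])   # roughly 0: factor a alone predicts nothing
print(np.corrcoef(b, trait)[0, 1])   # roughly 0: factor b alone predicts nothing
print(((a ^ b) == trait).mean())     # 1.0: the interaction explains everything
```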

01:41:37 What do you view as the most beautiful complex system that you've encountered?

01:41:43 The most beautiful.

01:41:45 That you’ve been captivated by.

01:41:47 Is it sort of, I mean, for me, the simplest would be cellular automata.

01:41:54 Oh, yeah. So I was very captivated by cellular automata and worked on cellular automata for several years.

01:42:01 Do you find it amazing or is it surprising that such simple systems, such simple rules in cellular automata can create sort of seemingly unlimited complexity?

01:42:14 Yeah, that was very surprising to me.

01:42:16 How do you make sense of it? How does that make you feel?

01:42:18 Is it just ultimately humbling or is there a hope to somehow leverage this into a deeper understanding and even able to engineer things like intelligence?

01:42:29 It’s definitely humbling.

01:42:31 Humbling, but also kind of awe inspiring, in that it's that awe inspiring part of mathematics, that these incredibly simple rules can produce this very beautiful, complex, hard to understand behavior.

01:42:50 And that’s, it’s mysterious, you know, and surprising still.

01:42:58 But exciting because it does give you kind of the hope that you might be able to engineer complexity just from simple rules.
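
For anyone who wants to see that for themselves, a minimal sketch of an elementary one-dimensional cellular automaton (Rule 110 here), where each cell's next state depends only on itself and its two neighbors, yet the pattern that unfolds is strikingly intricate:

```python
# An elementary one-dimensional cellular automaton (Rule 110).
import numpy as np

def step(cells, rule=110):
    # Each cell's next state depends only on (left, self, right),
    # looked up in the 8-entry table encoded by the rule number.
    left, right = np.roll(cells, 1), np.roll(cells, -1)
    neighborhood = (left << 2) | (cells << 1) | right
    table = (rule >> np.arange(8)) & 1
    return table[neighborhood]

width, steps = 80, 40
cells = np.zeros(width, dtype=int)
cells[width // 2] = 1                  # a single "on" cell in the middle
for _ in range(steps):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```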

01:43:09 Can you briefly say what is the Santa Fe Institute, its history, its culture, its ideas, its future?

01:43:14 So I’ve never, as I mentioned to you, I’ve never been, but it’s always been this, in my mind, this mystical place where brilliant people study the edge of chaos.

01:43:24 Yeah, exactly.

01:43:26 So the Santa Fe Institute was started in 1984 and it was created by a group of scientists, a lot of them from Los Alamos National Lab, which is about a 40 minute drive from Santa Fe Institute.

01:43:45 They were mostly physicists and chemists, but they were frustrated in their fields because they felt that their fields weren't approaching the kind of big interdisciplinary questions like the kinds we've been talking about.

01:44:03 And they wanted to have a place where people from different disciplines could work on these big questions without sort of being siloed into physics, chemistry, biology, whatever.

01:44:17 So they started this institute and this was people like George Cowan, who was a chemist in the Manhattan Project, and Nicholas Metropolis, a mathematician and physicist, and Murray Gell-Mann, a physicist.

01:44:37 So some really big names here.

01:44:39 Ken Arrow, Nobel Prize winning economist, and they started having these workshops.

01:44:47 And this whole enterprise kind of grew into this research institute that itself has been kind of on the edge of chaos its whole life because it doesn’t have a significant endowment.

01:45:03 And it’s just been kind of living on whatever funding it can raise through donations and grants and however it can, you know, business associates and so on.

01:45:21 But it’s a great place. It’s a really fun place to go think about ideas that you wouldn’t normally encounter.

01:45:28 I saw Sean Carroll, a physicist. Yeah, he’s on the external faculty.

01:45:34 And you mentioned that there’s, so there’s some external faculty and there’s people that are…

01:45:37 A very small group of resident faculty, maybe about 10 who are there for five year terms that can sometimes get renewed.

01:45:48 And then they have some postdocs and then they have this much larger on the order of 100 external faculty or people like me who come and visit for various periods of time.

01:45:59 So what do you think is the future of the Santa Fe Institute?

01:46:02 And if people are interested, like what’s there in terms of the public interaction or students or so on that could be a possible interaction with the Santa Fe Institute or its ideas?

01:46:15 Yeah, so there’s a few different things they do.

01:46:18 They have a complex system summer school for graduate students and postdocs and sometimes faculty attend too.

01:46:25 And that’s a four week, very intensive residential program where you go and you listen to lectures and you do projects and people really like that.

01:46:35 I mean, it’s a lot of fun.

01:46:37 They also have some specialty summer schools.

01:46:41 There’s one on computational social science.

01:46:45 There’s one on climate and sustainability, I think it’s called.

01:46:52 There’s a few and then they have short courses where just a few days on different topics.

01:46:59 They also have an online education platform that offers a lot of different courses and tutorials from SFI faculty.

01:47:09 Including an introduction to complexity course that I taught.

01:47:13 Awesome. And there’s a bunch of talks too online from the guest speakers and so on.

01:47:19 They host a lot of…

01:47:20 Yeah, they have sort of technical seminars and colloquia and they have a community lecture series like public lectures and they put everything on their YouTube channel so you can see it all.

01:47:33 Watch it.

01:47:34 Douglas Hofstadter, author of Gödel, Escher, Bach, was your PhD advisor.

01:47:40 You've mentioned him a couple of times as a collaborator.

01:47:43 Do you have any favorite lessons or memories from your time working with him that continues to this day?

01:47:50 Just even looking back throughout your time working with him.

01:47:55 One of the things he taught me was, when you're looking at a complex problem, to idealize it as much as possible, to try and figure out what the essence of this problem is.

01:48:11 And this is how the copycat program came into being: by taking analogy making and saying, how can we make this as idealized as possible but still retain the really important things we want to study?

01:48:25 And that’s really been a core theme of my research, I think.

01:48:33 And I continue to try and do that.

01:48:36 And it’s really very much kind of physics inspired. Hofstadter was a PhD in physics.

01:48:42 That was his background.

01:48:44 It’s like first principles kind of thing.

01:48:46 You’re reduced to the most fundamental aspect of the problem so that you can focus on solving that fundamental aspect.

01:48:52 Yeah.

01:48:53 And in AI, people used to work in these micro worlds, right?

01:48:57 Like the blocks world was very early important area in AI.

01:49:02 And then that got criticized because they said, oh, you can’t scale that to the real world.

01:49:09 And so people started working on much more real world like problems.

01:49:14 But now there’s been kind of a return even to the blocks world itself.

01:49:19 We’ve seen a lot of people who are trying to work on more of these very idealized problems for things like natural language and common sense.

01:49:28 So that’s an interesting evolution of those ideas.

01:49:31 So perhaps the blocks world represents the fundamental challenges of the problem of intelligence more than people realize.

01:49:38 It might. Yeah.

01:49:41 When you look back at your body of work and your life, you’ve worked in so many different fields.

01:49:46 Is there something that you’re just really proud of in terms of ideas that you’ve gotten a chance to explore, create yourself?

01:49:54 So I am really proud of my work on the copycat project.

01:49:59 I think it’s really different from what almost everyone has done in AI.

01:50:04 I think there’s a lot of ideas there to be explored.

01:50:08 And I guess one of the happiest days of my life,

01:50:14 you know, aside from like the births of my children, was the birth of copycat, when it actually started to be able to make really interesting analogies.

01:50:24 And I remember that very clearly.

01:50:27 It was a very exciting time.

01:50:30 Well, you kind of gave life to an artificial system.

01:50:34 That’s right.

01:50:35 In terms of what people can interact with, I saw there's like a, I think it's called MetaCat.

01:50:40 MetaCat.

01:50:41 MetaCat.

01:50:42 And there’s a Python 3 implementation.

01:50:45 If people actually wanted to play around with it and actually get into it and study it and maybe integrate into whether it’s with deep learning or any other kind of work they’re doing.

01:50:54 What would you suggest they do to learn more about it and to take it forward in different kinds of directions?

01:51:00 Yeah, so that there’s Douglas Hofstadter’s book called Fluid Concepts and Creative Analogies talks in great detail about copycat.

01:51:09 I have a book called Analogy Making as Perception, which is a version of my PhD thesis on it.

01:51:16 There’s also code that’s available that you can get it to run.

01:51:20 I have some links on my webpage to where people can get the code for it.

01:51:25 And I think that that would really be the best way to get into it.

01:51:28 Just dive in and play with it.

01:51:30 Well, Melanie, it was an honor talking to you.

01:51:33 I really enjoyed it.

01:51:34 Thank you so much for your time today.

01:51:35 Thanks.

01:51:36 It’s been really great.

01:51:38 Thanks for listening to this conversation with Melanie Mitchell.

01:51:41 And thank you to our presenting sponsor, Cash App.

01:51:44 Download it.

01:51:45 Use code LexPodcast.

01:51:47 You will get $10 and $10 will go to FIRST, a STEM education nonprofit that inspires hundreds of thousands of young minds to learn and to dream of engineering our future.

01:51:58 If you enjoy this podcast, subscribe on YouTube, give it five stars on Apple Podcast, support it on Patreon or connect with me on Twitter.

01:52:06 And now let me leave you with some words of wisdom from Douglas Hofstadter and Melanie Mitchell.

01:52:12 Without concepts, there can be no thought.

01:52:15 Without analogies, there can be no concepts.

01:52:18 And Melanie adds, how to form and fluidly use concepts is the most important open problem in AI.

01:52:27 Thank you for listening and hope to see you next time.