François Chollet: Measures of Intelligence #120

Transcript

00:00:00 The following is a conversation with Francois Chollet,

00:00:03 his second time on the podcast.

00:00:05 He’s both a world class engineer and a philosopher

00:00:09 in the realm of deep learning and artificial intelligence.

00:00:13 This time, we talk a lot about his paper titled

00:00:16 On the Measure of Intelligence, which discusses

00:00:19 how we might define and measure general intelligence

00:00:22 in our computing machinery.

00:00:24 Quick summary of the sponsors,

00:00:26 Babbel, Masterclass, and Cash App.

00:00:29 Click the sponsor links in the description

00:00:31 to get a discount and to support this podcast.

00:00:34 As a side note, let me say that the serious,

00:00:36 rigorous scientific study

00:00:38 of artificial general intelligence is a rare thing.

00:00:42 The mainstream machine learning community works

00:00:44 on very narrow AI with very narrow benchmarks.

00:00:47 This is very good for incremental

00:00:49 and sometimes big incremental progress.

00:00:53 On the other hand, the outside the mainstream,

00:00:56 renegade, you could say, AGI community works

00:01:00 on approaches that verge on the philosophical

00:01:03 and even the literary without big public benchmarks.

00:01:07 Walking the line between the two worlds is a rare breed,

00:01:10 but it doesn’t have to be.

00:01:12 I ran the AGI series at MIT as an attempt

00:01:15 to inspire more people to walk this line.

00:01:17 DeepMind and OpenAI for a time

00:01:20 and still on occasion walk this line.

00:01:23 Francois Chollet does as well.

00:01:25 I hope to also.

00:01:27 It’s a beautiful dream to work towards

00:01:29 and to make real one day.

00:01:32 If you enjoy this thing, subscribe on YouTube,

00:01:34 review it with five stars on Apple Podcast,

00:01:36 follow on Spotify, support on Patreon,

00:01:39 or connect with me on Twitter at Lex Fridman.

00:01:42 As usual, I’ll do a few minutes of ads now

00:01:44 and no ads in the middle.

00:01:45 I try to make these interesting,

00:01:47 but I give you timestamps so you can skip.

00:01:50 But still, please do check out the sponsors

00:01:52 by clicking the links in the description.

00:01:54 It’s the best way to support this podcast.

00:01:57 This show is sponsored by Babbel,

00:02:00 an app and website that gets you speaking

00:02:02 in a new language within weeks.

00:02:04 Go to babbel.com and use code Lex to get three months free.

00:02:08 They offer 14 languages, including Spanish, French,

00:02:11 Italian, German, and yes, Russian.

00:02:15 Daily lessons are 10 to 15 minutes,

00:02:17 super easy, effective,

00:02:19 designed by over 100 language experts.

00:02:22 Let me read a few lines from the Russian poem

00:02:24 Noch, ulitsa, fonar, apteka, by Alexander Blok,

00:02:29 that you’ll start to understand if you sign up to Babbel.

00:02:32 Noch, ulitsa, fonar, apteka,

00:02:35 Bessmyslennyy i tusklyy svet,

00:02:38 Zhivi eshchyo khot chetvert veka,

00:02:41 Vsyo budet tak, iskhoda net.

00:02:44 Now, I say that you’ll start to understand this poem

00:02:48 because Russian starts with a language

00:02:51 and ends with vodka.

00:02:54 Now, the latter part is definitely not endorsed

00:02:56 or provided by Babbel.

00:02:58 It will probably lose me this sponsorship,

00:03:00 although it hasn’t yet.

00:03:02 But once you graduate with Babbel,

00:03:04 you can enroll in my advanced course

00:03:06 of late night Russian conversation over vodka.

00:03:09 No app for that yet.

00:03:11 So get started by visiting babbel.com

00:03:13 and use code Lex to get three months free.

00:03:18 This show is also sponsored by Masterclass.

00:03:20 Sign up at masterclass.com slash Lex

00:03:23 to get a discount and to support this podcast.

00:03:26 When I first heard about Masterclass,

00:03:28 I thought it was too good to be true.

00:03:29 I still think it’s too good to be true.

00:03:32 For $180 a year, you get an all access pass

00:03:35 to watch courses from, to list some of my favorites.

00:03:38 Chris Hadfield on space exploration,

00:03:41 hope to have him in this podcast one day.

00:03:43 Neil deGrasse Tyson on scientific thinking and communication,

00:03:46 Neil too.

00:03:47 Will Wright, creator of SimCity and Sims

00:03:50 on game design, Carlos Santana on guitar,

00:03:52 Garry Kasparov on chess, Daniel Negreanu on poker,

00:03:55 and many more.

00:03:57 Chris Hadfield explaining how rockets work

00:03:59 and the experience of being launched into space

00:04:01 alone is worth the money.

00:04:03 By the way, you can watch it on basically any device.

00:04:06 Once again, sign up at masterclass.com slash Lex

00:04:09 to get a discount and to support this podcast.

00:04:13 This show finally is presented by Cash App,

00:04:16 the number one finance app in the App Store.

00:04:18 When you get it, use code LexPodcast.

00:04:21 Cash App lets you send money to friends,

00:04:23 buy Bitcoin, and invest in the stock market

00:04:25 with as little as $1.

00:04:27 Since Cash App allows you to send

00:04:28 and receive money digitally,

00:04:30 let me mention a surprising fact related to physical money.

00:04:33 Of all the currency in the world,

00:04:35 roughly 8% of it is actually physical money.

00:04:39 The other 92% of the money only exists digitally,

00:04:42 and that’s only going to increase.

00:04:45 So again, if you get Cash App from the App Store

00:04:47 or Google Play and use code LexPodcast,

00:04:50 you get 10 bucks,

00:04:51 and Cash App will also donate $10 to FIRST,

00:04:54 an organization that is helping to advance robotics

00:04:57 and STEM education for young people around the world.

00:05:00 And now here’s my conversation with Francois Chollet.

00:05:05 What philosophers, thinkers, or ideas

00:05:07 had a big impact on you growing up and today?

00:05:10 So one author that had a big impact on me

00:05:14 when I read his books as a teenager was Jean Piaget,

00:05:18 who is a Swiss psychologist,

00:05:21 is considered to be the father of developmental psychology.

00:05:25 And he has a large body of work about

00:05:28 basically how intelligence develops in children.

00:05:33 And so it’s very old work,

00:05:35 like most of it is from the 1930s, 1940s.

00:05:39 So it’s not quite up to date.

00:05:40 It’s actually superseded by many newer developments

00:05:43 in developmental psychology.

00:05:45 But to me, it was very interesting, very striking,

00:05:49 and actually shaped the early ways

00:05:51 in which I started thinking about the mind

00:05:53 and the development of intelligence as a teenager.

00:05:56 His actual ideas or the way he thought about it

00:05:58 or just the fact that you could think

00:05:59 about the developing mind at all?

00:06:01 I guess both.

00:06:02 Jean Piaget is the author that really introduced me

00:06:04 to the notion that intelligence and the mind

00:06:07 is something that you construct throughout your life

00:06:11 and that children construct it in stages.

00:06:15 And I thought that was a very interesting idea,

00:06:17 which is, of course, very relevant to AI,

00:06:20 to building artificial minds.

00:06:23 Another book that I read around the same time

00:06:25 that had a big impact on me,

00:06:28 and there was actually a little bit of overlap

00:06:32 with Jean Piaget as well,

00:06:32 and I read it around the same time,

00:06:35 is Jeff Hawkins’ On Intelligence, which is a classic.

00:06:39 And he has this vision of the mind

00:06:42 as a multi scale hierarchy of temporal prediction modules.

00:06:47 And these ideas really resonated with me,

00:06:50 like the notion of a modular hierarchy

00:06:55 of potentially compression functions

00:07:00 or prediction functions.

00:07:01 I thought it was really, really interesting,

00:07:03 and it shaped the way I started thinking

00:07:07 about how to build minds.

00:07:09 The hierarchical nature, which aspect?

00:07:13 Also, he’s a neuroscientist, so he was thinking actual,

00:07:17 he was basically talking about how our mind works.

00:07:20 Yeah, the notion that cognition is prediction

00:07:23 was an idea that was kind of new to me at the time

00:07:25 and that I really loved at the time.

00:07:27 And yeah, and the notion that there are multiple scales

00:07:31 of processing in the brain.

00:07:35 The hierarchy.

00:07:36 Yes.

00:07:37 This was before deep learning.

00:07:38 These ideas of hierarchies in AI

00:07:41 have been around for a long time,

00:07:43 even before On Intelligence.

00:07:45 They’ve been around since the 1980s.

00:07:48 And yeah, that was before deep learning.

00:07:50 But of course, I think these ideas really found

00:07:53 their practical implementation in deep learning.

00:07:58 What about the memory side of things?

00:07:59 I think he was talking about knowledge representation.

00:08:02 Do you think about memory a lot?

00:08:04 One way you can think of neural networks

00:08:06 as a kind of memory, you’re memorizing things,

00:08:10 but it doesn’t seem to be the kind of memory

00:08:14 that’s in our brains,

00:08:16 or it doesn’t have the same rich complexity,

00:08:18 long term nature that’s in our brains.

00:08:20 Yes, the brain is more of a sparse access memory

00:08:23 so that you can actually retrieve very precisely

00:08:27 like bits of your experience.

00:08:30 The retrieval aspect, you can like introspect,

00:08:33 you can ask yourself questions.

00:08:35 I guess you can program your own memory

00:08:38 and language is actually the tool you use to do that.

00:08:41 I think language is a kind of operating system for the mind

00:08:46 and use language.

00:08:47 Well, one of the uses of language is as a query

00:08:51 that you run over your own memory,

00:08:53 use words as keys to retrieve specific experiences

00:08:57 or specific concepts, specific thoughts.

00:09:00 Like language is a way you store thoughts,

00:09:02 not just in writing, in the physical world,

00:09:04 but also in your own mind.

00:09:06 And it’s also how you retrieve them.

00:09:07 Like, imagine if you didn’t have language,

00:09:10 then you would have to,

00:09:11 you would not really have a self,

00:09:14 internally triggered way of retrieving past thoughts.

00:09:18 You would have to rely on external experiences.

00:09:21 For instance, you see a specific sight,

00:09:24 you smell a specific smell and that brings up memories,

00:09:26 but you would not really have a way

00:09:28 to deliberately access these memories without language.

00:09:32 Well, the interesting thing you mentioned

00:09:33 is you can also program the memory.

00:09:37 You can change it probably with language.

00:09:39 Yeah, using language, yes.

00:09:41 Well, let me ask you a Chomsky question,

00:09:44 which is like, first of all,

00:09:45 do you think language is like fundamental,

00:09:49 like there’s turtles, what’s at the bottom of the turtles?

00:09:54 They don’t go, it can’t be turtles all the way down.

00:09:57 Is language at the bottom of cognition of everything?

00:10:00 Is like language, the fundamental aspect

00:10:05 of like what it means to be a thinking thing?

00:10:10 No, I don’t think so.

00:10:12 I think language is.

00:10:12 You disagree with Noam Chomsky?

00:10:14 Yes, I think language is a layer on top of cognition.

00:10:17 So it is fundamental to cognition in the sense that

00:10:21 to use a computing metaphor,

00:10:23 I see language as the operating system of the brain,

00:10:28 of the human mind.

00:10:29 And the operating system is a layer on top of the computer.

00:10:33 The computer exists before the operating system,

00:10:36 but the operating system is how you make it truly useful.

00:10:39 And the operating system is most likely Windows, not Linux,

00:10:43 because language is messy.

00:10:45 Yeah, it’s messy and it’s pretty difficult

00:10:49 to inspect it, introspect it.

00:10:53 How do you think about language?

00:10:55 Like we use actually sort of human interpretable language,

00:11:00 but is there something like a deeper,

00:11:03 that’s closer to like logical type of statements?

00:11:08 Like, yeah, what is the nature of language, do you think?

00:11:16 Like is there something deeper than like the syntactic rules

00:11:18 we construct?

00:11:19 Is there something that doesn’t require utterances

00:11:22 or writing or so on?

00:11:25 Are you asking about the possibility

00:11:27 that there could exist languages for thinking

00:11:30 that are not made of words?

00:11:32 Yeah.

00:11:33 Yeah, I think so.

00:11:34 I think, so the mind is layers, right?

00:11:38 And language is almost like the outermost,

00:11:41 the uppermost layer.

00:11:44 But before we think in words,

00:11:46 I think we think in terms of emotion in space

00:11:51 and we think in terms of physical actions.

00:11:54 And I think babies in particular,

00:11:56 probably express thoughts in terms of the actions

00:12:01 that they’ve seen or that they can perform

00:12:03 and in terms of motions of objects in their environment

00:12:08 before they start thinking in terms of words.

00:12:10 It’s amazing to think about that

00:12:13 as the building blocks of language.

00:12:16 So like the kind of actions and ways the babies see the world

00:12:21 as like more fundamental

00:12:23 than the beautiful Shakespearean language

00:12:26 you construct on top of it.

00:12:28 And we probably don’t have any idea

00:12:30 what that looks like, right?

00:12:31 Like what, because it’s important

00:12:34 for them trying to engineer it into AI systems.

00:12:38 I think visual analogies and motion

00:12:42 is a fundamental building block of the mind.

00:12:45 And you actually see it reflected in language.

00:12:48 Like language is full of spatial metaphors.

00:12:51 And when you think about things,

00:12:53 I consider myself very much as a visual thinker.

00:12:57 You often express these thoughts

00:13:01 by using things like visualizing concepts

00:13:06 in 2D space or like you solve problems

00:13:09 by imagining yourself navigating a concept space.

00:13:14 So I don’t know if you have this sort of experience.

00:13:17 You said visualizing concept space.

00:13:19 So like, so I certainly think about,

00:13:24 I certainly visualize mathematical concepts,

00:13:27 but you mean like in concept space,

00:13:32 visually you’re embedding ideas

00:13:34 into a three dimensional space

00:13:36 you can explore with your mind essentially?

00:13:38 It would be more like 2D, but yeah.

00:13:40 2D?

00:13:41 Yeah.

00:13:42 You’re a flatlander.

00:13:43 You’re, okay.

00:13:45 No, I do not.

00:13:49 I always have to, before I jump from concept to concept,

00:13:52 I have to put it back down on paper.

00:13:57 It has to be on paper.

00:13:58 I can only travel on 2D paper, not inside my mind.

00:14:03 You’re able to move inside your mind.

00:14:05 But even if you’re writing like a paper, for instance,

00:14:07 don’t you have like a spatial representation of your paper?

00:14:11 Like you visualize where ideas lie topologically

00:14:16 in relationship to other ideas,

00:14:18 kind of like a subway map of the ideas in your paper.

00:14:22 Yeah, that’s true.

00:14:23 I mean, there is, in papers, I don’t know about you,

00:14:27 but it feels like there’s a destination.

00:14:32 There’s a key idea that you want to arrive at.

00:14:36 And a lot of it is in the fog

00:14:39 and you’re trying to kind of,

00:14:40 it’s almost like, what’s that called

00:14:46 when you do a path planning search from both directions,

00:14:49 from the start and from the end.

00:14:52 And then you find, you do like shortest path,

00:14:54 but like, you know, in game playing,

00:14:57 you do this with like A star from both sides.

00:15:01 And you see where they join.

00:15:03 Yeah, so you kind of do, at least for me,

00:15:05 I think like, first of all,

00:15:07 just exploring from the start from like first principles,

00:15:10 what do I know, what can I start proving from that, right?

00:15:15 And then from the destination,

00:15:18 if you start backtracking,

00:15:20 like if I want to show some kind of sets of ideas,

00:15:25 what would it take to show them and you kind of backtrack,

00:15:28 but like, yeah,

00:15:29 I don’t think I’m doing all that in my mind though.

00:15:31 Like I’m putting it down on paper.
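
The search-from-both-ends idea referenced here is bidirectional search: expand one frontier from the start, another from the goal, and stop when they meet. A minimal sketch in Python on a made-up toy graph of ideas (node names are purely illustrative):

```python
from collections import deque

def bidirectional_search(graph, start, goal):
    """Breadth-first search from both ends; stop when the two frontiers meet."""
    if start == goal:
        return [start]
    came_from = {start: {start: None}, goal: {goal: None}}  # predecessor maps per side
    frontiers = {start: deque([start]), goal: deque([goal])}

    def expand(side, other):
        node = frontiers[side].popleft()
        for neighbor in graph[node]:
            if neighbor not in came_from[side]:
                came_from[side][neighbor] = node
                frontiers[side].append(neighbor)
                if neighbor in came_from[other]:  # the frontiers meet here
                    return neighbor
        return None

    while frontiers[start] and frontiers[goal]:
        meet = expand(start, goal) or expand(goal, start)
        if meet is not None:
            # Stitch the start-side half-path and the goal-side half-path together.
            path, node = [], meet
            while node is not None:
                path.append(node)
                node = came_from[start][node]
            path.reverse()
            node = came_from[goal][meet]
            while node is not None:
                path.append(node)
                node = came_from[goal][node]
            return path
    return None

# Toy "idea graph": first principles on one end, the key result on the other.
ideas = {
    "premise": ["lemma1", "lemma2"],
    "lemma1": ["premise", "lemma3"],
    "lemma2": ["premise", "lemma3"],
    "lemma3": ["lemma1", "lemma2", "conclusion"],
    "conclusion": ["lemma3"],
}
print(bidirectional_search(ideas, "premise", "conclusion"))
# ['premise', 'lemma1', 'lemma3', 'conclusion']
```

On large graphs, meeting in the middle means each side explores far fewer nodes than a single search from one end would.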

00:15:33 Do you use mind maps to organize your ideas?

00:15:35 Yeah, I like mind maps.

00:15:37 Let’s get into this,

00:15:38 because I’ve been so jealous of people.

00:15:41 I haven’t really tried it.

00:15:42 I’ve been jealous of people that seem to like,

00:15:45 they get like this fire of passion in their eyes

00:15:48 because everything starts making sense.

00:15:50 It’s like Tom Cruise in the movie

00:15:51 was like moving stuff around.

00:15:53 Some of the most brilliant people I know use mind maps.

00:15:55 I haven’t tried really.

00:15:57 Can you explain what the hell a mind map is?

00:16:01 I guess mind map is a way to make

00:16:03 kind of like the mess inside your mind

00:16:05 to just put it on paper so that you gain more control over it.

00:16:10 It’s a way to organize things on paper

00:16:13 and as kind of like a consequence

00:16:16 of organizing things on paper,

00:16:17 they start being more organized inside your own mind.

00:16:20 So what does that look like?

00:16:21 You put, like, do you have an example?

00:16:23 Like what’s the first thing you write on paper?

00:16:27 What’s the second thing you write?

00:16:28 I mean, typically you draw a mind map

00:16:31 to organize the way you think about a topic.

00:16:34 So you would start by writing down

00:16:37 like the key concept about that topic.

00:16:39 Like you would write intelligence or something,

00:16:42 and then you would start adding associative connections.

00:16:45 Like what do you think about

00:16:46 when you think about intelligence?

00:16:48 What do you think are the key elements of intelligence?

00:16:50 So maybe you would have language, for instance,

00:16:52 and you’d have motion.

00:16:53 And so you would start drawing notes with these things.

00:16:55 And then you would see what do you think about

00:16:57 when you think about motion and so on.

00:16:59 And you would go like that, like a tree.

00:17:00 Is it a tree mostly or is it a graph too, like a tree?

00:17:05 Oh, it’s more of a graph than a tree.

00:17:07 And it’s not limited to just writing down words.

00:17:13 You can also draw things.

00:17:15 And it’s not supposed to be purely hierarchical, right?

00:17:21 The point is that once you start writing it down,

00:17:24 you can start reorganizing it so that it makes more sense,

00:17:27 so that it’s connected in a more effective way.

00:17:29 See, but I’m so OCD that you just mentioned

00:17:34 intelligence and language and motion.

00:17:37 I would start becoming paranoid

00:17:39 that the categorization isn’t perfect.

00:17:41 Like that I would become paralyzed with the mind map

00:17:47 that like this may not be.

00:17:49 So like the, even though you’re just doing

00:17:52 associative kind of connections,

00:17:55 there’s an implied hierarchy that’s emerging.

00:17:58 And I would start becoming paranoid

00:17:59 that it’s not the proper hierarchy.

00:18:02 So you’re not just, one way to see mind maps

00:18:04 is you’re putting thoughts on paper.

00:18:07 It’s like a stream of consciousness,

00:18:10 but then you can also start getting paranoid.

00:18:12 Well, is this the right hierarchy?

00:18:15 Sure, which it’s mind maps, your mind map.

00:18:17 You’re free to draw anything you want.

00:18:19 You’re free to draw any connection you want.

00:18:20 And you can just make a different mind map

00:18:23 if you think the central node is not the right node.

00:18:26 Yeah, I suppose there’s a fear of being wrong.

00:18:29 If you want to organize your ideas

00:18:32 by writing down what you think,

00:18:35 which I think is very effective.

00:18:37 Like how do you know what you think about something

00:18:40 if you don’t write it down, right?

00:18:42 If you do that, the thing is that it imposes

00:18:46 much more syntactic structure over your ideas,

00:18:49 which is not required with mind maps.

00:18:51 So mind map is kind of like a lower level,

00:18:54 more freehand way of organizing your thoughts.

00:18:57 And once you’ve drawn it,

00:18:59 then you can start actually voicing your thoughts

00:19:03 in terms of, you know, paragraphs.

00:19:05 It’s a two dimensional aspect of layout too, right?

00:19:08 Yeah.

00:19:09 It’s a kind of flower, I guess, you start.

00:19:12 There’s usually, you want to start with a central concept?

00:19:15 Yes.

00:19:16 Then you move out.

00:19:17 Typically it ends up more like a subway map.

00:19:19 So it ends up more like a graph,

00:19:20 a topological graph without a root node.

00:19:23 Yeah, so like in a subway map,

00:19:25 there are some nodes that are more connected than others.

00:19:27 And there are some nodes that are more important than others.

00:19:30 So there are destinations,

00:19:32 but it’s not going to be purely like a tree, for instance.

00:19:36 Yeah, it’s fascinating to think that

00:19:38 if there’s something to that about the way our mind thinks.

00:19:42 By the way, I just kind of remembered obvious thing

00:19:45 that I have probably thousands of documents

00:19:49 in Google Doc at this point, that are bullet point lists,

00:19:53 which is, you can probably map a mind map

00:19:57 to a bullet point list.

00:20:01 It’s the same, it’s a... no, it’s not, it’s a tree.

00:20:05 It’s a tree, yeah.

00:20:06 So I create trees,

00:20:07 but also they don’t have the visual element.

00:20:10 Like, I guess I’m comfortable with the structure.

00:20:13 It feels like the narrowness,

00:20:15 the constraints feel more comforting.
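
The structural distinction being circled here is tree versus graph. A small illustrative sketch with made-up node names: a bullet-point list is a tree, where every idea has exactly one parent, while a mind map is a general graph that allows cross-links.

```python
# Tree: every node has exactly one parent, like nested bullet points.
bullet_list = {
    "intelligence": {
        "language": {"syntax": {}, "semantics": {}},
        "motion": {"navigation": {}},
    },
}

# Graph: nodes can be cross-linked freely, like a subway map.
mind_map = {
    "intelligence": {"language", "motion"},
    "language": {"intelligence", "semantics", "motion"},  # cross-link to motion
    "motion": {"intelligence", "language", "navigation"},
    "semantics": {"language"},
    "navigation": {"motion"},
}

# This cross-link has no place in the tree but is natural in the graph:
print("motion" in mind_map["language"])  # True
```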

00:20:18 If you have thousands of documents

00:20:20 with your own thoughts in Google Docs,

00:20:23 why don’t you write some kind of search engine,

00:20:26 like maybe a mind map, a piece of software,

00:20:30 mind mapping software, where you write down a concept

00:20:33 and then it gives you sentences or paragraphs

00:20:37 from your thousand Google Docs document

00:20:39 that match this concept.
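
As a rough illustration of that suggestion, here is a minimal keyword-level version using TF-IDF from scikit-learn. The notes and the query are made-up placeholders, and, as Lex points out next, a purely keyword-based match is exactly where this falls short of semantic search.

```python
# Minimal sketch of a "search engine over your own notes" using TF-IDF.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

notes = [
    "Thoughts on motion planning and how babies learn physical actions.",
    "Bullet points about language as an operating system for the mind.",
    "Sparse notes on measuring intelligence as skill-acquisition efficiency.",
]

vectorizer = TfidfVectorizer(stop_words="english")
note_vectors = vectorizer.fit_transform(notes)   # one row per note

query = "motion"
query_vector = vectorizer.transform([query])
scores = cosine_similarity(query_vector, note_vectors)[0]

# Rank notes by similarity to the query concept.
for score, note in sorted(zip(scores, notes), reverse=True):
    print(f"{score:.2f}  {note}")
```

Because it only matches on shared keywords, a note that poetically alludes to motion without ever using the word would score zero, which is the gap discussed below.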

00:20:41 The problem is it’s so deeply, unlike mind maps,

00:20:45 it’s so deeply rooted in natural language.

00:20:48 So it’s not, it’s not semantically searchable,

00:20:54 I would say, because the categories are very,

00:20:57 you kind of mentioned intelligence, language, and motion.

00:21:00 They’re very strong, semantic.

00:21:02 Like, it feels like the mind map forces you

00:21:05 to be semantically clear and specific.

00:21:09 The bullet point lists I have are sparse,

00:21:13 disparate thoughts that poetically represent

00:21:20 a category like motion, as opposed to saying motion.

00:21:25 So unfortunately, that’s the same problem with the internet.

00:21:28 That’s why the idea of semantic web is difficult to get.

00:21:32 It’s, most language on the internet is a giant mess

00:21:37 of natural language that’s hard to interpret, which,

00:21:42 so do you think there’s something to mind maps as,

00:21:46 you actually originally brought it up

00:21:48 as we were talking about kind of cognition and language.

00:21:53 Do you think there’s something to mind maps

00:21:55 about how our brain actually deals,

00:21:58 like think reasons about things?

00:22:01 It’s possible.

00:22:02 I think it’s reasonable to assume that there is

00:22:07 some level of topological processing in the brain,

00:22:10 that the brain is very associative in nature.

00:22:15 And I also believe that a topological space

00:22:20 is a better medium to encode thoughts

00:22:25 than a geometric space.

00:22:27 So I think…

00:22:28 What’s the difference in a topological

00:22:29 and a geometric space?

00:22:31 Well, if you’re talking about topologies,

00:22:34 then points are either connected or not.

00:22:36 So a topology is more like a subway map.

00:22:38 And geometry is when you’re interested

00:22:41 in the distance between things.

00:22:43 And in a subway map,

00:22:44 you don’t really have the concept of distance.

00:22:46 You only have the concept of whether there is a train

00:22:48 going from station A to station B.

00:22:52 And what we do in deep learning is that we’re actually

00:22:55 dealing with geometric spaces.

00:22:57 We are dealing with concept vectors, word vectors,

00:23:01 that have a distance between them

00:23:03 expressed in terms of a dot product.

00:23:05 So we are not really building topological models usually.

00:23:10 I think you’re absolutely right.

00:23:11 Like distance is a fundamental importance in deep learning.

00:23:16 I mean, it’s the continuous aspect of it.

00:23:19 Yes, because everything is a vector

00:23:21 and everything has to be a vector

00:23:22 because everything has to be differentiable.

00:23:24 If your space is discrete, it’s no longer differentiable.

00:23:26 You cannot do deep learning in it anymore.

00:23:29 Well, you could, but you can only do it by embedding it

00:23:32 in a bigger continuous space.

00:23:35 So if you do topology in the context of deep learning,

00:23:39 you have to do it by embedding your topology

00:23:41 in the geometry.
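
A toy sketch of the contrast, with made-up vectors: in the geometric picture, concepts are vectors compared by a dot product (or cosine), while in the topological picture all that is stored is which concepts are connected, like stations on a subway map.

```python
import numpy as np

# Geometric view: concepts are vectors, similarity is a dot product / cosine.
# (Toy 3-D embeddings, made up for illustration.)
embeddings = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.0, 0.1, 0.9]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["cat"], embeddings["dog"]))  # close to 1: nearby concepts
print(cosine(embeddings["cat"], embeddings["car"]))  # close to 0: distant concepts

# Topological view: no distances at all, only connectivity.
subway = {
    "cat": {"dog"},
    "dog": {"cat"},
    "car": set(),
}
print("dog" in subway["cat"])  # True: there is a "train" from cat to dog
print("car" in subway["cat"])  # False: no connection, and no notion of "how far"
```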

00:23:42 Well, let me zoom out for a second.

00:23:46 Let’s get into your paper on the measure of intelligence

00:23:50 that you put out in 2019.

00:23:52 Yes.

00:23:53 Okay.

00:23:54 November.

00:23:55 November.

00:23:57 Yeah, remember 2019?

00:23:59 That was a different time.

00:24:01 Yeah, I remember.

00:24:02 I still remember.

00:24:06 It feels like a different world.

00:24:09 You could travel, you could actually go outside

00:24:12 and see friends.

00:24:15 Yeah.

00:24:16 Let me ask the most absurd question.

00:24:18 I think there’s some nonzero probability

00:24:21 there’ll be a textbook one day, like 200 years from now

00:24:25 on artificial intelligence,

00:24:27 or it’ll be called like just intelligence

00:24:30 cause humans will already be gone.

00:24:32 It’ll be your picture with a quote.

00:24:35 This is, you know, one of the early biological systems

00:24:39 would consider the nature of intelligence

00:24:41 and there’ll be like a definition

00:24:43 of how they thought about intelligence.

00:24:45 Which is one of the things you do in your paper

00:24:46 on measure intelligence is to ask like,

00:24:51 well, what is intelligence

00:24:52 and how to test for intelligence and so on.

00:24:55 So is there a spiffy quote about what is intelligence?

00:25:01 What is the definition of intelligence

00:25:03 according to Francois Chollet?

00:25:06 Yeah, so do you think the super intelligent AIs

00:25:10 of the future will want to remember us

00:25:13 the way we remember humans from the past?

00:25:16 And do you think they will be, you know,

00:25:18 they won’t be ashamed of having a biological origin?

00:25:22 No, I think it would be a niche topic.

00:25:24 It won’t be that interesting,

00:25:25 but it’ll be like the people that study

00:25:29 in certain contexts like historical civilization

00:25:33 that no longer exists, the Aztecs and so on.

00:25:36 That’s how it’ll be seen.

00:25:38 And it’ll be studied also in the context of social media.

00:25:42 There’ll be hashtags about the atrocity

00:25:46 committed to human beings

00:25:49 when the robots finally got rid of them.

00:25:52 Like it was a mistake.

00:25:55 You’ll be seen as a giant mistake,

00:25:57 but ultimately in the name of progress

00:26:00 and it created a better world

00:26:01 because humans were over consuming the resources

00:26:05 and they were not very rational

00:26:07 and were destructive in the end in terms of productivity

00:26:11 and putting more love in the world.

00:26:13 And so within that context,

00:26:15 there’ll be a chapter about these biological systems.

00:26:17 You seem to have a very detailed vision of that future.

00:26:20 You should write a sci fi novel about it.

00:26:22 I’m working on a sci fi novel currently, yes.

00:26:28 Self published, yeah.

00:26:29 The definition of intelligence.

00:26:30 So intelligence is the efficiency

00:26:34 with which you acquire new skills at tasks

00:26:39 that you did not previously know about,

00:26:41 that you did not prepare for, right?

00:26:44 So intelligence is not skill itself.

00:26:47 It’s not what you know, it’s not what you can do.

00:26:50 It’s how well and how efficiently

00:26:52 you can learn new things.

00:26:54 New things.

00:26:55 Yes.

00:26:56 The idea of newness there

00:26:58 seems to be fundamentally important.

00:27:01 Yes.

00:27:02 So you would see intelligence on display, for instance.

00:27:05 Whenever you see a human being or an AI creature

00:27:09 adapt to a new environment that it does not see before,

00:27:13 that its creators did not anticipate.

00:27:16 When you see adaptation, when you see improvisation,

00:27:19 when you see generalization, that’s intelligence.

00:27:22 In reverse, if you have a system

00:27:24 that when you put it in a slightly new environment,

00:27:27 it cannot adapt, it cannot improvise,

00:27:30 it cannot deviate from what it’s hard coded to do

00:27:33 or what it has been trained to do,

00:27:38 that is a system that is not intelligent.

00:27:41 There’s actually a quote from Einstein

00:27:43 that captures this idea, which is,

00:27:46 the measure of intelligence is the ability to change.

00:27:50 I like that quote.

00:27:51 I think it captures at least part of this idea.

00:27:54 You know, there might be something interesting

00:27:56 about the difference between your definition and Einstein’s.

00:27:59 I mean, he’s just being Einstein and clever,

00:28:04 but acquisition of new ability to deal with new things

00:28:09 versus ability to just change.

00:28:14 What’s the difference between those two things?

00:28:16 So just change in itself.

00:28:19 Do you think there’s something to that?

00:28:21 Just being able to change.

00:28:23 Yes, being able to adapt.

00:28:25 So not change, but certainly change its direction.

00:28:30 Being able to adapt yourself to your environment.

00:28:34 Whatever the environment is.

00:28:35 That’s a big part of intelligence.

00:28:37 And intelligence is more precisely, you know,

00:28:40 how efficiently you’re able to adapt,

00:28:42 how efficiently you’re able to basically master your environment,

00:28:45 how efficiently you can acquire new skills.

00:28:49 And I think there’s a big distinction to be drawn

00:28:52 between intelligence, which is a process,

00:28:56 and the output of that process, which is skill.

00:29:01 So for instance, if you have a very smart human brain,

00:29:04 so for instance, if you have a very smart human programmer

00:29:08 that considers the game of chess,

00:29:10 and that writes down a static program that can play chess,

00:29:16 then the intelligence is the process

00:29:19 of developing that program.

00:29:20 But the program itself is just encoding

00:29:25 the output artifact of that process.

00:29:28 The program itself is not intelligent.

00:29:30 And the way you tell it’s not intelligent

00:29:31 is that if you put it in a different context,

00:29:34 you ask it to play Go or something,

00:29:36 it’s not going to be able to perform well

00:29:37 without human involvement,

00:29:38 because the source of intelligence,

00:29:41 the entity that is capable of that process

00:29:43 is the human programmer.

00:29:44 So we should be able to tell the difference

00:29:47 between the process and its output.

00:29:50 We should not confuse the output and the process.

00:29:53 It’s the same as, you know,

00:29:54 do not confuse a road building company

00:29:58 and one specific road,

00:30:00 because one specific road takes you from point A to point B,

00:30:03 but a road building company can take you from,

00:30:06 can make a path from anywhere to anywhere else.

00:30:08 Yeah, that’s beautifully put,

00:30:10 but it’s also to play devil’s advocate a little bit.

00:30:15 You know, it’s possible that there’s something

00:30:18 more fundamental than us humans.

00:30:21 So you kind of said the programmer creates

00:30:25 the difference between acquiring

00:30:28 the skill and the skill itself.

00:30:31 There could be something like,

00:30:32 you could argue the universe is more intelligent.

00:30:36 Like the base intelligence that we should be trying

00:30:43 to measure is something that created humans.

00:30:46 We should be measuring God or the source of the universe

00:30:51 as opposed to, like there could be a deeper intelligence.

00:30:55 Sure.

00:30:55 There’s always deeper intelligence, I guess.

00:30:57 You can argue that,

00:30:58 but that does not take anything away

00:31:00 from the fact that humans are intelligent.

00:31:01 And you can tell that

00:31:03 because they are capable of adaptation and generality.

00:31:07 Got it.

00:31:07 And you see that in particular in the fact

00:31:09 that humans are capable of handling situations and tasks

00:31:16 that are quite different from anything

00:31:19 that any of our evolutionary ancestors

00:31:22 has ever encountered.

00:31:24 So we are capable of generalizing very much

00:31:27 out of distribution,

00:31:28 if you consider our evolutionary history

00:31:30 as being in a way our training data.

00:31:33 Of course, evolutionary biologists would argue

00:31:35 that we’re not going too far out of the distribution.

00:31:37 We’re like mapping the skills we’ve learned previously,

00:31:41 desperately trying to like jam them

00:31:43 into like these new situations.

00:31:47 I mean, there’s definitely a little bit of that,

00:31:49 but it’s pretty clear to me that we’re able to,

00:31:53 most of the things we do any given day

00:31:56 in our modern civilization

00:31:58 are things that are very, very different

00:32:00 from what our ancestors a million years ago

00:32:03 would have been doing in a given day.

00:32:05 And your environment is very different.

00:32:07 So I agree that everything we do,

00:32:12 we do it with cognitive building blocks

00:32:14 that we acquired over the course of evolution, right?

00:32:17 And that anchors our cognition to a certain context,

00:32:22 which is the human condition very much.

00:32:25 But still our mind is capable of a pretty remarkable degree

00:32:29 of generality far beyond anything we can create

00:32:32 in artificial systems today.

00:32:34 Like the degree in which the mind can generalize

00:32:37 from its evolutionary history,

00:32:41 can generalize away from its evolutionary history

00:32:43 is much greater than the degree

00:32:46 to which a deep learning system today

00:32:48 can generalize away from its training data.

00:32:51 And like the key point you’re making,

00:32:52 which I think is quite beautiful is like,

00:32:54 we shouldn’t measure, if we’re talking about measurement,

00:32:58 we shouldn’t measure the skill.

00:33:01 We should measure like the creation of the new skill,

00:33:04 the ability to create that new skill.

00:33:06 But it’s tempting, like it’s weird

00:33:10 because the skill is a little bit of a small window

00:33:13 into the system.

00:33:16 So whenever you have a lot of skills,

00:33:19 it’s tempting to measure the skills.

00:33:21 I mean, the skill is the only thing you can objectively

00:33:25 measure, but yeah.

00:33:27 So the thing to keep in mind is that

00:33:30 when you see skill in the human,

00:33:35 it gives you a strong signal that that human is intelligent

00:33:39 because you know they weren’t born with that skill typically.

00:33:42 Like you see a very strong chess player,

00:33:45 maybe you’re a very strong chess player yourself.

00:33:47 I think you’re saying that because I’m Russian

00:33:51 and now you’re prejudiced, you assume.

00:33:53 All Russians are good at chess.

00:33:54 I’m biased, exactly.

00:33:55 I’m biased, yeah.

00:33:56 Well, you’re definitely biased.

00:34:00 So if you see a very strong chess player,

00:34:01 you know they weren’t born knowing how to play chess.

00:34:05 So they had to acquire that skill

00:34:07 with their limited resources, with their limited lifetime.

00:34:10 And they did that because they are generally intelligent.

00:34:15 And so they may as well have acquired any other skill.

00:34:18 You know they have this potential.

00:34:21 And on the other hand, if you see a computer playing chess,

00:34:25 you cannot make the same assumptions

00:34:27 because you cannot just assume

00:34:29 the computer is generally intelligent.

00:34:30 The computer may be born knowing how to play chess

00:34:35 in the sense that it may have been programmed by a human

00:34:38 that has understood chess for the computer

00:34:40 and that has just encoded the output

00:34:44 of that understanding in a static program.

00:34:46 And that program is not intelligent.

00:34:49 So let’s zoom out just for a second and say like,

00:34:52 what is the goal on the measure of intelligence paper?

00:34:57 Like what do you hope to achieve with it?

00:34:59 So the goal of the paper is to clear up

00:35:01 some longstanding misunderstandings

00:35:04 about the way we’ve been conceptualizing intelligence

00:35:08 in the AI community and in the way we’ve been

00:35:12 evaluating progress in AI.

00:35:16 There’s been a lot of progress recently in machine learning

00:35:19 and people are extrapolating from that progress

00:35:22 that we are about to solve general intelligence.

00:35:26 And if you want to be able to evaluate these statements,

00:35:30 you need to precisely define what you’re talking about

00:35:33 when you’re talking about general intelligence.

00:35:35 And you need a formal way, a reliable way to measure

00:35:40 how much intelligence,

00:35:42 how much general intelligence a system possesses.

00:35:45 And ideally this measure of intelligence

00:35:48 should be actionable.

00:35:50 So it should not just describe what intelligence is.

00:35:54 It should not just be a binary indicator

00:35:56 that tells you the system is intelligent or it isn’t.

00:36:01 It should be actionable.

00:36:03 It should have explanatory power, right?

00:36:05 So you could use it as a feedback signal.

00:36:08 It would show you the way

00:36:10 towards building more intelligent systems.

00:36:13 So at the first level, you draw a distinction

00:36:16 between two divergent views of intelligence.

00:36:21 As we just talked about,

00:36:22 intelligence is a collection of task specific skills

00:36:26 and a general learning ability.

00:36:29 So what’s the difference between

00:36:32 kind of this memorization of skills

00:36:35 and a general learning ability?

00:36:37 We’ve talked about it a little bit,

00:36:39 but can you try to linger on this topic for a bit?

00:36:43 Yeah, so the first part of the paper

00:36:45 is an assessment of the different ways

00:36:49 we’ve been thinking about intelligence

00:36:50 and the different ways we’ve been evaluating progress in AI.

00:36:54 And the history of cognitive science

00:36:57 has been shaped by two views of the human mind.

00:37:01 And one view is the evolutionary psychology view

00:37:04 in which the mind is a collection of fairly static

00:37:10 special purpose ad hoc mechanisms

00:37:14 that have been hard coded by evolution

00:37:17 over our history as a species for a very long time.

00:37:22 And early AI researchers,

00:37:27 people like Marvin Minsky, for instance,

00:37:30 they clearly subscribed to this view.

00:37:33 And they saw the mind as a kind of

00:37:36 collection of static programs

00:37:39 similar to the programs they would run

00:37:42 on like mainframe computers.

00:37:43 And in fact, I think they very much understood the mind

00:37:48 through the metaphor of the mainframe computer

00:37:50 because that was the tool they were working with, right?

00:37:53 And so you had these static programs,

00:37:55 this collection of very different static programs

00:37:57 operating over a database like memory.

00:38:00 And in this picture, learning was not very important.

00:38:03 Learning was considered to be just memorization.

00:38:05 And in fact, learning is basically not featured

00:38:10 in AI textbooks until the 1980s

00:38:14 with the rise of machine learning.

00:38:16 It’s kind of fun to think about

00:38:18 that learning was the outcast.

00:38:21 Like the weird people working on learning,

00:38:24 like the mainstream AI world was,

00:38:28 I mean, I don’t know what the best term is,

00:38:31 but it’s non learning.

00:38:33 It was seen as like reasoning would not be learning based.

00:38:37 Yes, it was considered that the mind

00:38:40 was a collection of programs

00:38:43 that were primarily logical in nature.

00:38:46 And that’s all you needed to do to create a mind

00:38:49 was to write down these programs

00:38:50 and they would operate over knowledge,

00:38:52 which would be stored in some kind of database.

00:38:55 And as long as your database would encompass,

00:38:57 you know, everything about the world

00:38:59 and your logical rules were comprehensive,

00:39:03 then you would have a mind.

00:39:04 So the other view of the mind

00:39:06 is the brain as a sort of blank slate, right?

00:39:11 This is a very old idea.

00:39:13 You find it in John Locke’s writings.

00:39:16 This is the tabula rasa.

00:39:19 And this is this idea that the mind

00:39:21 is some kind of like information sponge

00:39:23 that starts empty, that starts blank.

00:39:27 And that absorbs knowledge and skills from experience, right?

00:39:34 So it’s a sponge that reflects the complexity of the world,

00:39:38 the complexity of your life experience, essentially.

00:39:41 That everything you know and everything you can do

00:39:44 is a reflection of something you found

00:39:47 in the outside world, essentially.

00:39:49 So this is an idea that’s very old.

00:39:51 That was not very popular, for instance, in the 1970s.

00:39:56 But that gained a lot of vitality recently

00:39:58 with the rise of connectionism, in particular deep learning.

00:40:02 And so today, deep learning

00:40:03 is the dominant paradigm in AI.

00:40:06 And I feel like lots of AI researchers

00:40:10 are conceptualizing the mind via a deep learning metaphor.

00:40:14 Like they see the mind as a kind of

00:40:17 randomly initialized neural network that starts blank

00:40:21 when you’re born.

00:40:22 And then that gets trained via exposure to training data,

00:40:26 that acquires knowledge and skills

00:40:27 via exposure to training data.

00:40:29 By the way, it’s a small tangent.

00:40:32 I feel like people who are thinking about intelligence

00:40:36 are not conceptualizing it that way.

00:40:39 I actually haven’t met too many people

00:40:41 who believe that a neural network

00:40:44 will be able to reason, who seriously think that rigorously.

00:40:51 Because I think it’s actually an interesting worldview.

00:40:54 And we’ll talk about it more,

00:40:56 but it’s been impressive what neural networks

00:41:00 have been able to accomplish.

00:41:02 And to me, I don’t know, you might disagree,

00:41:04 but it’s an open question whether like scaling size

00:41:09 eventually might lead to incredible results

00:41:13 to us mere humans will appear as if it’s general.

00:41:17 I mean, if you ask people who are seriously thinking

00:41:19 about intelligence, they will definitely not say

00:41:22 that all you need to do is,

00:41:24 like the mind is just a neural network.

00:41:27 However, it’s actually a view that’s very popular,

00:41:30 I think, in the deep learning community

00:41:31 that many people are kind of conceptually

00:41:35 intellectually lazy about it.

00:41:37 Right, it’s a, but I guess what I’m saying exactly right,

00:41:40 it’s, I mean, I haven’t met many people

00:41:44 and I think it would be interesting to meet a person

00:41:47 who is not intellectually lazy about this particular topic

00:41:50 and still believes that neural networks will go all the way.

00:41:54 I think Yann LeCun is probably closest to that

00:41:56 with self supervised learning.

00:41:57 There are definitely people who argue

00:41:59 that current deep learning techniques

00:42:03 are already the way to general artificial intelligence.

00:42:06 And that all you need to do is to scale it up

00:42:09 to all the available training data.

00:42:12 And that’s, if you look at the waves

00:42:16 that OpenAI’s GPT3 model has made,

00:42:19 you see echoes of this idea.

00:42:22 So on that topic, GPT3, similar to GPT2 actually,

00:42:28 have captivated some part of the imagination of the public.

00:42:33 There’s just a bunch of hype of different kind.

00:42:35 That’s, I would say it’s emergent.

00:42:37 It’s not artificially manufactured.

00:42:39 It’s just like people just get excited

00:42:42 for some strange reason.

00:42:43 And in the case of GPT3, which is funny,

00:42:46 that there’s, I believe, a couple months delay

00:42:49 from release to hype.

00:42:51 Maybe I’m not historically correct on that,

00:42:56 but it feels like there was a little bit of a lack of hype

00:43:01 and then there’s a phase shift into hype.

00:43:04 But nevertheless, there’s a bunch of cool applications

00:43:07 that seem to captivate the imagination of the public

00:43:10 about what this language model

00:43:12 that’s trained in unsupervised way

00:43:15 without any fine tuning is able to achieve.

00:43:19 So what do you make of that?

00:43:20 What are your thoughts about GPT3?

00:43:22 Yeah, so I think what’s interesting about GPT3

00:43:25 is the idea that it may be able to learn new tasks

00:43:31 after just being shown a few examples.

00:43:33 So I think if it’s actually capable of doing that,

00:43:35 that’s novel and that’s very interesting

00:43:37 and that’s something we should investigate.

00:43:39 That said, I must say, I’m not entirely convinced

00:43:43 that we have shown it’s capable of doing that.

00:43:47 It’s very likely, given the amount of data

00:43:50 that the model is trained on,

00:43:52 that what it’s actually doing is pattern matching

00:43:55 a new task you give it with a task

00:43:58 that it’s been exposed to in its training data.

00:44:00 It’s just recognizing the task

00:44:01 instead of just developing a model of the task, right?
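
For concreteness, the "few examples" setup looks roughly like this: the task is specified only by a handful of input/output pairs inside the prompt itself. The example pairs mirror the translation demo in the GPT-3 paper; the `complete` function is a hypothetical placeholder, not a real API call.

```python
# Sketch of few-shot prompting: the task exists only as a pattern in the
# prompt text. `complete` stands in for whatever text-generation endpoint
# you have access to; it is a hypothetical placeholder, not a real API.
few_shot_prompt = """Translate English to French.

sea otter -> loutre de mer
cheese -> fromage
peppermint -> menthe poivrée
plush giraffe ->"""

def complete(prompt):
    raise NotImplementedError("plug in a language-model endpoint here")

# The open question raised here: does the model infer the translation task
# from three examples, or merely recognize a pattern it has already seen
# many times in its training data?
# print(complete(few_shot_prompt))
```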

00:44:05 But there’s, sorry to interrupt,

00:44:07 there’s a parallel as to what you said before,

00:44:10 which is it’s possible to see GPT3 as like the prompts

00:44:14 it’s given as a kind of SQL query

00:44:17 into this thing that it’s learned,

00:44:19 similar to what you said before,

00:44:20 which is language is used to query the memory.

00:44:23 Yes.

00:44:24 So is it possible that neural network

00:44:26 is a giant memorization thing,

00:44:29 but then if it gets sufficiently giant,

00:44:32 it’ll memorize sufficiently large amounts

00:44:35 of things in the world or it becomes,

00:44:37 or intelligence becomes a querying machine?

00:44:40 I think it’s possible that a significant chunk

00:44:44 of intelligence is this giant associative memory.

00:44:48 I definitely don’t believe that intelligence

00:44:51 is just a giant associative memory,

00:44:53 but it may well be a big component.

00:44:57 So do you think GPT3, 4, 5,

00:45:02 GPT10 will eventually, like, what do you think,

00:45:07 where’s the ceiling?

00:45:08 Do you think you’ll be able to reason?

00:45:11 No, that’s a bad question.

00:45:14 Like, what is the ceiling is the better question.

00:45:17 How well is it gonna scale?

00:45:18 How good is GPTN going to be?

00:45:21 Yeah.

00:45:22 So I believe GPTN is gonna.

00:45:25 GPTN.

00:45:26 Is gonna improve on the strength of GPT2 and 3,

00:45:30 which is it will be able to generate, you know,

00:45:33 ever more plausible text in context.

00:45:37 Just monotonically increasing performance.

00:45:41 Yes, if you train a bigger model on more data,

00:45:44 then your text will be increasingly more context aware

00:45:49 and increasingly more plausible

00:45:51 in the same way that GPT3 is much better

00:45:54 at generating plausible text compared to GPT2.

00:45:57 But that said, I don’t think just scaling up the model

00:46:01 to more transformer layers and more training data

00:46:04 is gonna address the flaws of GPT3,

00:46:07 which is that it can generate plausible text,

00:46:09 but that text is not constrained by anything else

00:46:13 other than plausibility.

00:46:15 So in particular, it’s not constrained by factualness

00:46:19 or even consistency, which is why it’s very easy

00:46:21 to get GPT3 to generate statements

00:46:23 that are factually untrue.

00:46:26 Or to generate statements that are even self contradictory.

00:46:29 Right?

00:46:30 Because it’s only goal is plausibility,

00:46:35 and it has no other constraints.

00:46:37 It’s not constrained to be self consistent, for instance.

00:46:40 Right?

00:46:41 And so for this reason, one thing that I thought

00:46:43 was very interesting with GPT3 is that you can

00:46:46 predetermine the answer it will give you

00:46:49 by asking the question in a specific way,

00:46:52 because it’s very responsive to the way you ask the question.

00:46:55 Since it has no understanding of the content of the question.

00:47:00 Right.

00:47:01 And if you ask the same question in two different ways

00:47:05 that are basically adversarially engineered

00:47:09 to produce certain answer,

00:47:10 you will get two different answers,

00:47:12 two contradictory answers.

00:47:14 It’s very susceptible to adversarial attacks, essentially.

00:47:16 Potentially, yes.

00:47:17 So in general, the problem with these models,

00:47:20 these generative models, is that they are very good

00:47:24 at generating plausible text,

00:47:27 but that’s just not enough.

00:47:29 Right?

00:47:33 I think one avenue that would be very interesting

00:47:36 to make progress is to make it possible

00:47:40 to write programs over the latent space

00:47:43 that these models operate on.

00:47:45 That you would rely on these self supervised models

00:47:49 to generate a sort of like pool of knowledge and concepts

00:47:54 and common sense.

00:47:55 And then you would be able to write

00:47:57 explicit reasoning programs over it.

00:48:01 Because the current problem with GPT3 is that

00:48:03 it can be quite difficult to get it to do what you want to do.

00:48:09 If you want to turn GPT3 into products,

00:48:12 you need to put constraints on it.

00:48:14 You need to force it to obey certain rules.

00:48:19 So you need a way to program it explicitly.
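
One loose sketch of what "programs over the latent space" could mean, with everything here hypothetical: a self-supervised encoder supplies vector representations, and the reasoning and the constraints are ordinary hand-written code running on top of them.

```python
import numpy as np

# Loose sketch of "explicit programs over a learned latent space". Everything
# here is hypothetical: `encode` stands in for some self-supervised text
# encoder, and the reasoning/constraints are plain hand-written code on top.
def encode(text):
    raise NotImplementedError("plug in a self-supervised text encoder")

def answer_from_pool(question, facts):
    # An explicit, hand-written program (nearest-neighbor retrieval) running
    # over the learned vector representations.
    q = encode(question)
    scored = [(float(np.dot(q, encode(f))), f) for f in facts]
    return max(scored)[1]

# The constraint lives in ordinary code, not in the network: only answer
# from a vetted pool of statements, never free-generate text.
knowledge_pool = [
    "Water boils at 100 degrees Celsius at sea level.",
    "The Eiffel Tower is in Paris.",
]
# print(answer_from_pool("Where is the Eiffel Tower?", knowledge_pool))
```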

00:48:22 Yeah, so if you look at its ability

00:48:24 to do program synthesis,

00:48:26 it generates, like you said, something that’s plausible.

00:48:29 Yeah, so if you try to make it generate programs,

00:48:32 it will perform well for any program

00:48:35 that it has seen in its training data.

00:48:38 But because program space is not interpolative, right?

00:48:42 It’s not going to be able to generalize to problems

00:48:46 it hasn’t seen before.

00:48:48 Now that’s currently, do you think sort of an absurd,

00:48:54 but I think useful, I guess, intuition builder is,

00:49:00 you know, the GPT3 has 175 billion parameters.

00:49:07 Human brain has about a thousand times that

00:49:11 or more in terms of number of synapses.

00:49:16 Do you think, obviously, very different kinds of things,

00:49:21 but there is some degree of similarity.

00:49:26 Do you think, what do you think GPT will look like

00:49:30 when it has 100 trillion parameters?

00:49:34 You think our conversation might be in nature different?

00:49:39 Like, because you’ve criticized GPT3 very effectively now.

00:49:42 Do you think?

00:49:45 No, I don’t think so.

00:49:46 So to begin with, the bottleneck with scaling up GPT3,

00:49:51 GPT models, generative pre trained transformer models,

00:49:54 is not going to be the size of the model

00:49:57 or how long it takes to train it.

00:49:59 The bottleneck is going to be the training data

00:50:01 because OpenAI is already training GPT3

00:50:05 on a core of basically the entire web, right?

00:50:08 And that’s a lot of data.

00:50:09 So you could imagine training on more data than that,

00:50:12 like Google could train on more data than that,

00:50:14 but it would still be only incrementally more data.

00:50:17 And I don’t recall exactly how much more data GPT3

00:50:21 was trained on compared to GPT2,

00:50:22 but it’s probably at least like a hundred,

00:50:25 maybe even a thousand X.

00:50:26 I don’t have the exact number.

00:50:28 You’re not going to be able to train a model

00:50:30 on a hundred times more data than what you’re already doing.

00:50:34 So that’s brilliant.

00:50:35 So it’s easier to think of compute as a bottleneck

00:50:38 and then arguing that we can remove that bottleneck.

00:50:41 But we can remove the compute bottleneck.

00:50:43 I don’t think it’s a big problem.

00:50:44 If you look at the pace at which we’ve improved

00:50:48 the efficiency of deep learning models

00:50:51 in the past few years,

00:50:54 I’m not worried about training time bottlenecks

00:50:57 or model size bottlenecks.

00:50:59 The bottleneck in the case

00:51:01 of these generative transformer models

00:51:03 is absolutely the training data.

00:51:05 What about the quality of the data?

00:51:07 So, yeah.

00:51:08 So the quality of the data is an interesting point.

00:51:10 The thing is,

00:51:11 if you’re going to want to use these models

00:51:14 in real products,

00:51:16 then you want to feed them data

00:51:20 that’s as high quality, as factual,

00:51:23 I would say as unbiased as possible,

00:51:25 knowing that there’s not really such a thing

00:51:27 as unbiased data in the first place.

00:51:30 But you probably don’t want to train it on Reddit,

00:51:34 for instance.

00:51:34 It sounds like a bad plan.

00:51:37 So from my personal experience,

00:51:38 working with large scale deep learning models.

00:51:42 So at some point I was working on a model at Google

00:51:46 that’s trained on 350 million labeled images.

00:51:52 It’s an image classification model.

00:51:53 That’s a lot of images.

00:51:54 That’s like probably most publicly available images

00:51:58 on the web at the time.

00:52:00 And it was a very noisy data set

00:52:03 because the labels were not originally annotated by hand,

00:52:07 by humans.

00:52:08 They were automatically derived from like tags

00:52:12 on social media,

00:52:14 or just keywords in the same page

00:52:16 as the image was found and so on.

00:52:18 So it was very noisy.

00:52:19 And it turned out that you could easily get a better model,

00:52:25 not just by training,

00:52:26 like if you train on more of the noisy data,

00:52:29 you get an incrementally better model,

00:52:31 but you very quickly hit diminishing returns.

00:52:35 On the other hand,

00:52:36 if you train on smaller data set

00:52:38 with higher quality annotations,

00:52:40 quality annotations that are actually made by humans,

00:52:45 you get a better model.

00:52:47 And it also takes less time to train it.

00:52:49 Yeah, that’s fascinating.

00:52:51 It’s the self supervised learning.

00:52:53 There’s a way to get better at doing the automated labeling.

00:52:58 Yeah, so you can enrich or refine your labels

00:53:04 in an automated way.

00:53:05 That’s correct.

00:53:07 Do you have a hope for,

00:53:08 I don’t know if you’re familiar

00:53:09 with the idea of a semantic web.

00:53:11 Is a semantic web just for people who are not familiar

00:53:15 and is the idea of being able to convert the internet

00:53:20 or be able to attach like semantic meaning

00:53:25 to the words on the internet,

00:53:27 the sentences, the paragraphs,

00:53:29 to be able to convert information on the internet

00:53:33 or some fraction of the internet

00:53:35 into something that’s interpretable by machines.

00:53:39 That was kind of a dream for,

00:53:44 I think the semantic web papers in the nineties,

00:53:47 it’s kind of the dream that, you know,

00:53:49 the internet is full of rich, exciting information.

00:53:52 Even just looking at Wikipedia,

00:53:54 we should be able to use that as data for machines.

00:53:57 And so far it’s not,

00:53:58 it’s not really in a format that’s available to machines.

00:54:01 So no, I don’t think the semantic web will ever work

00:54:04 simply because it would be a lot of work, right?

00:54:08 To make, to provide that information in structured form.

00:54:12 And there is not really any incentive

00:54:13 for anyone to provide that work.

00:54:16 So I think the way forward to make the knowledge

00:54:21 on the web available to machines

00:54:22 is actually something closer to unsupervised deep learning.

00:54:29 So GPT3 is actually a bigger step in the direction

00:54:32 of making the knowledge of the web available to machines

00:54:34 than the semantic web was.

00:54:36 Yeah, perhaps in a human centric sense,

00:54:40 it feels like GPT3 hasn’t learned anything

00:54:47 that could be used to reason.

00:54:50 But that might be just the early days.

00:54:52 Yeah, I think that’s correct.

00:54:54 I think the forms of reasoning that you see it perform

00:54:57 are basically just reproducing patterns

00:55:00 that it has seen in string data.

00:55:02 So of course, if you’re trained on the entire web,

00:55:06 then you can produce an illusion of reasoning

00:55:09 in many different situations.

00:55:10 But it will break down if it’s presented

00:55:13 with a novel situation.

00:55:15 That’s the open question between the illusion of reasoning

00:55:17 and actual reasoning, yeah.

00:55:18 Yes.

00:55:19 The power to adapt to something that is genuinely new.

00:55:22 Because the thing is, even if you imagine

00:55:28 you could train on every bit of data

00:55:31 ever generated in the history of humanity.

00:55:35 That model would be capable

00:55:38 of anticipating many different possible situations.

00:55:43 But it remains that the future is

00:55:45 going to be something different.

00:55:48 For instance, if you train a GPT3 model on data

00:55:52 from the year 2002, for instance,

00:55:55 and then use it today, it’s going to be missing many things.

00:55:58 It’s going to be missing many common sense

00:56:00 facts about the world.

00:56:02 It’s even going to be missing vocabulary and so on.

00:56:05 Yeah, it’s interesting that GPT3 even doesn’t have,

00:56:09 I think, any information about the coronavirus.

00:56:13 Yes.

00:56:14 Which is why you can

00:56:19 tell that a system is intelligent

00:56:21 when it’s capable of adapting.

00:56:22 So intelligence is going to require

00:56:25 some amount of continuous learning.

00:56:28 It’s also going to require some amount of improvisation.

00:56:31 It’s not enough to assume that what you’re

00:56:33 going to be asked to do is something

00:56:36 that you’ve seen before, or something

00:56:39 that is a simple interpolation of things you’ve seen before.

00:56:42 Yeah.

00:56:43 In fact, that model breaks down even for

00:56:49 tasks that look relatively simple from a distance,

00:56:52 like L5 self driving, for instance.

00:56:55 Google had a paper a couple of years

00:56:58 back showing that something like 30 million different road

00:57:04 situations were actually completely insufficient

00:57:07 to train a driving model.

00:57:09 It wasn’t even L2, right?

00:57:11 And that’s a lot of data.

00:57:12 That’s a lot more data than the 20 or 30 hours of driving

00:57:16 that a human needs to learn to drive,

00:57:19 given the knowledge they’ve already accumulated.

00:57:21 Well, let me ask you on that topic.

00:57:25 Elon Musk, Tesla Autopilot, one of the only companies,

00:57:31 I believe, is really pushing for a learning based approach.

00:57:34 Are you skeptical that that kind of network

00:57:37 can achieve level 4?

00:57:39 L4 is probably achievable.

00:57:42 L5 probably not.

00:57:44 What’s the distinction there?

00:57:45 Is L5 where you can just completely fall asleep?

00:57:49 Yeah, L5 is basically human level.

00:57:51 Well, with driving, we have to be careful saying human level,

00:57:53 because that’s the most of the drivers.

00:57:57 Yeah, that’s the clearest example of cars

00:58:00 will most likely be much safer than humans in many situations

00:58:05 where humans fail.

00:58:06 It’s the vice versa question.

00:58:09 I’ll tell you, the thing is the amount of training data

00:58:13 you would need to anticipate pretty much every possible

00:58:17 situation you’ll encounter in the real world

00:58:20 is such that it’s not entirely unrealistic

00:58:23 to think that at some point in the future,

00:58:25 we’ll develop a system that’s trained on enough data,

00:58:27 especially provided that we can simulate a lot of that data.

00:58:32 We don’t necessarily need actual cars

00:58:34 on the road for everything.

00:58:37 But it’s a massive effort.

00:58:39 And it turns out you can create a system that’s

00:58:42 much more adaptive, that can generalize much better

00:58:45 if you just add explicit models of the surroundings

00:58:52 of the car.

00:58:53 And if you use deep learning for what

00:58:55 it’s good at, which is to provide

00:58:57 perceptual information.

00:58:59 So in general, deep learning is a way

00:59:02 to encode perception and a way to encode intuition.

00:59:05 But it is not a good medium for any sort of explicit reasoning.

00:59:11 And in AI systems today, strong generalization

00:59:15 tends to come from explicit models,

00:59:21 from abstractions in the human mind that

00:59:24 are encoded in program form by a human engineer.

00:59:29 These are the abstractions that can actually generalize, not

00:59:31 the sort of weak abstraction that

00:59:33 is learned by a neural network.

00:59:34 Yeah, and the question is how much reasoning,

00:59:38 how much strong abstractions are required

00:59:41 to solve particular tasks like driving.

00:59:44 That’s the question.

00:59:46 Or human life existence.

00:59:48 How much strong abstractions does existence require?

00:59:53 But more specifically on driving,

00:59:58 that seems to be a coupled question about intelligence.

01:00:02 How much intelligence, how do you

01:00:05 build an intelligent system?

01:00:07 And the coupled problem, how hard is this problem?

01:00:11 How much intelligence does this problem actually require?

01:00:14 So we get to cheat because we get

01:00:18 to look at the problem.

01:00:20 It’s not like we close our eyes

01:00:22 and come to driving completely new.

01:00:24 We get to do what we do as human beings, which

01:00:27 is for the majority of our life before we ever

01:00:31 learn, quote unquote, to drive.

01:00:32 We get to watch other cars and other people drive.

01:00:35 We get to be in cars.

01:00:36 We get to watch.

01:00:37 We get to see movies about cars.

01:00:39 We get to observe all this stuff.

01:00:42 And that’s similar to what neural networks are doing.

01:00:45 It’s getting a lot of data, and the question

01:00:50 is, yeah, how many leaps of reasoning genius

01:00:55 is required to be able to actually effectively drive?

01:00:59 I think it’s a good example, driving.

01:01:01 I mean, sure, you’ve seen a lot of cars in your life

01:01:06 before you learned to drive.

01:01:07 But let’s say you’ve learned to drive in Silicon Valley,

01:01:10 and now you rent a car in Tokyo.

01:01:14 Well, now everyone is driving on the other side of the road,

01:01:16 and the signs are different, and the roads

01:01:19 are more narrow and so on.

01:01:20 So it’s a very, very different environment.

01:01:22 And a smart human, even an average human,

01:01:26 should be able to just zero shot it,

01:01:29 to just be operational in this very different environment

01:01:34 right away, despite having had no contact with the novel

01:01:40 complexity that is contained in this environment.

01:01:44 And that novel complexity is not just an interpolation

01:01:49 over the situations that you’ve encountered previously,

01:01:52 like learning to drive in the US.

01:01:55 I would say the reason I ask is one

01:01:57 of the most interesting tests of intelligence

01:01:59 we have today, actually, which is driving,

01:02:04 in terms of having an impact on the world.

01:02:06 When do you think we’ll pass that test of intelligence?

01:02:09 So I don’t think driving is that much of a test of intelligence,

01:02:13 because again, there is no task for which skill at that task

01:02:18 demonstrates intelligence, unless it’s

01:02:21 a kind of meta task that involves acquiring new skills.

01:02:26 So I don’t think, I think you can actually

01:02:28 solve driving without having any real amount of intelligence.

01:02:35 For instance, if you did have infinite training data,

01:02:39 you could just literally train an end to end deep learning

01:02:42 model that does driving, provided infinite training data.

01:02:45 The only problem with the whole idea

01:02:48 is collecting a data set that’s sufficiently comprehensive,

01:02:53 that covers the very long tail of possible situations

01:02:56 you might encounter.

01:02:57 And it’s really just a scale problem.

01:02:59 So I think there’s nothing fundamentally wrong

01:03:04 with this plan, with this idea.

01:03:06 It’s just that it strikes me as a fairly inefficient thing

01:03:11 to do, because you run into this scaling issue with diminishing

01:03:17 returns.

01:03:17 Whereas if instead you took a more manual engineering

01:03:21 approach, where you use deep learning modules in combination

01:03:29 with engineering an explicit model of the surrounding

01:03:33 of the cars, and you bridge the two in a clever way,

01:03:36 your model will actually start generalizing

01:03:38 much earlier and more effectively

01:03:40 than the end to end deep learning model.

01:03:42 So why would you not go with the more manual engineering

01:03:46 oriented approach?

01:03:47 Even if you created that system, either the end

01:03:50 to end deep learning model system that’s

01:03:52 running infinite data, or the slightly more human system,

01:03:58 I don’t think achieving L5 would demonstrate

01:04:02 general intelligence or intelligence

01:04:04 of any generality at all.

01:04:05 Again, the only possible test of generality in AI

01:04:10 would be a test that looks at skill acquisition

01:04:12 over unknown tasks.

01:04:14 For instance, you could take your L5 driver

01:04:17 and ask it to learn to pilot a commercial airplane,

01:04:21 for instance.

01:04:22 And then you would look at how much human involvement is

01:04:25 required and how much training data

01:04:26 is required for the system to learn to pilot an airplane.

01:04:29 And that gives you a measure of how intelligent

01:04:35 that system really is.

01:04:35 Yeah, well, I mean, that’s a big leap.

01:04:37 I get you.

01:04:38 But I’m more interested, as a problem, I would see,

01:04:42 to me, driving is a black box that

01:04:47 can generate novel situations at some rate,

01:04:51 what people call edge cases.

01:04:53 So it does have newness that we keep

01:04:56 being confronted with, let’s say, once a month.

01:04:59 It is a very long tail, yes.

01:05:00 It’s a long tail.

01:05:01 That doesn’t mean you cannot solve it just

01:05:05 by training a statistical model on a lot of data.

01:05:08 Huge amount of data.

01:05:09 It’s really a matter of scale.

01:05:11 But I guess what I’m saying is if you have a vehicle that

01:05:16 achieves level 5, it is going to be able to deal

01:05:21 with new situations.

01:05:23 Or, I mean, the data is so large that the rate of new situations

01:05:30 is very low.

01:05:32 Yes.

01:05:33 That’s not intelligent.

01:05:34 So if we go back to your kind of definition of intelligence,

01:05:37 it’s the efficiency.

01:05:39 With which you can adapt to new situations,

01:05:42 to truly new situations, not situations you’ve seen before.

01:05:45 Not situations that could be anticipated by your creators,

01:05:48 by the creators of the system, but truly new situations.

01:05:51 The efficiency with which you acquire new skills.

01:05:54 If you require, if in order to pick up a new skill,

01:05:58 you require a very extensive training

01:06:03 data set of most possible situations

01:06:05 that can occur in the practice of that skill,

01:06:08 then the system is not intelligent.

01:06:10 It is mostly just a lookup table.

01:06:15 Yeah.

01:06:16 Well, likewise, if in order to acquire a skill,

01:06:20 you need a human engineer to write down

01:06:23 a bunch of rules that cover most or every possible situation.

01:06:26 Likewise, the system is not intelligent.

01:06:29 The system is merely the output artifact

01:06:33 of a process that happens in the minds of the engineers that

01:06:39 are creating it.

01:06:40 It is encoding an abstraction that’s

01:06:44 produced by the human mind.

01:06:46 And intelligence would actually be

01:06:51 the process of autonomously producing this abstraction.

01:06:56 Yeah.

01:06:57 Not like if you take an abstraction

01:06:59 and you encode it on a piece of paper or in a computer program,

01:07:02 the abstraction itself is not intelligent.

01:07:05 What’s intelligent is the agent that’s

01:07:09 capable of producing these abstractions.

01:07:11 Yeah, it feels like there’s a little bit of a gray area.

01:07:16 Because you’re basically saying that deep learning forms

01:07:18 abstractions, too.

01:07:21 But those abstractions do not seem

01:07:24 to be effective for generalizing far outside of the things

01:07:29 that it’s already seen.

01:07:30 But generalize a little bit.

01:07:31 Yeah, absolutely.

01:07:32 No, deep learning does generalize a little bit.

01:07:34 Generalization is not binary.

01:07:36 It’s more like a spectrum.

01:07:38 Yeah.

01:07:38 And there’s a certain point, it’s a gray area,

01:07:40 but there’s a certain point where

01:07:42 there’s an impressive degree of generalization that happens.

01:07:47 No, I guess exactly what you were saying

01:07:50 is intelligence is how efficiently you’re

01:07:56 able to generalize far outside of the distribution of things

01:08:02 you’ve seen already.

01:08:03 Yes.

01:08:03 So it’s both the distance of how far you can,

01:08:07 how new, how radically new something is,

01:08:10 and how efficiently you’re able to deal with that.

01:08:12 So you can think of intelligence as a measure of an information

01:08:17 conversion ratio.

01:08:19 Imagine a space of possible situations.

01:08:23 And you’ve covered some of them.

01:08:27 So you have some amount of information

01:08:30 about your space of possible situations

01:08:32 that’s provided by the situations you already know.

01:08:34 And that’s, on the other hand, also provided

01:08:36 by the prior knowledge that the system brings

01:08:40 to the table, the prior knowledge embedded

01:08:42 in the system.

01:08:43 So the system starts with some information

01:08:46 about the problem, about the task.

01:08:48 And it’s about going from that information

01:08:52 to a program, what we would call a skill program,

01:08:55 a behavioral program, that can cover a large area

01:08:58 of possible situation space.

01:09:01 And essentially, the ratio between that area

01:09:04 and the amount of information you start with is intelligence.

01:09:09 So a very smart agent can make efficient use

01:09:14 of very little information about a new problem

01:09:17 and very little prior knowledge as well

01:09:19 to cover a very large area of potential situations

01:09:23 in that problem without knowing what these future new situations

01:09:28 are going to be.
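As a rough schematic of that ratio (not the exact formalism of the paper, which is phrased in terms of algorithmic information theory), you could write:

\[
\text{intelligence} \;\approx\;
\frac{\text{breadth of situation space covered by the skill programs you produce}}
     {\text{information in your priors} \;+\; \text{information in your experience}}
\]

The denominator is what the system is given; the numerator is how far the resulting behavior actually reaches, including situations that could not be anticipated.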

01:09:31 So one of the other big things you talk about in the paper,

01:09:34 we’ve talked about a little bit already,

01:09:36 but let’s talk about it some more,

01:09:37 is the actual tests of intelligence.

01:09:41 So if we look at human and machine intelligence,

01:09:45 do you think tests of intelligence

01:09:48 should be different for humans and machines,

01:09:50 or how we think about testing of intelligence?

01:09:54 Are these fundamentally the same kind of intelligences

01:09:59 that we’re after, and therefore, the tests should be similar?

01:10:03 So if your goal is to create AIs that are more humanlike,

01:10:10 then it would be super valuable, obviously,

01:10:12 to have a test that’s universal, that applies to both AIs

01:10:18 and humans, so that you could establish

01:10:20 a comparison between the two, that you

01:10:23 could tell exactly how intelligent,

01:10:27 in terms of human intelligence, a given system is.

01:10:30 So that said, the constraints that

01:10:34 apply to artificial intelligence and to human intelligence

01:10:37 are very different.

01:10:39 And your test should account for this difference.

01:10:44 Because if you look at artificial systems,

01:10:47 it’s always possible for an experimenter

01:10:50 to buy arbitrary levels of skill at arbitrary tasks,

01:10:55 either by injecting hardcoded prior knowledge

01:11:01 into the system via rules and so on that

01:11:05 come from the human mind, from the minds of the programmers,

01:11:08 and also buying higher levels of skill

01:11:12 just by training on more data.

01:11:15 For instance, you could generate an infinity

01:11:17 of different Go games, and you could train a Go playing

01:11:21 system that way, but you could not directly compare it

01:11:26 to human Go playing skills.

01:11:28 Because a human that plays Go had

01:11:31 to develop that skill in a very constrained environment.

01:11:34 They had a limited amount of time.

01:11:36 They had a limited amount of energy.

01:11:38 And of course, this started from a different set of priors.

01:11:42 This started from innate human priors.

01:11:48 So I think if you want to compare

01:11:49 the intelligence of two systems, like the intelligence of an AI

01:11:53 and the intelligence of a human, you have to control for priors.

01:11:59 You have to start from the same set of knowledge priors

01:12:04 about the task, and you have to control

01:12:06 for experience, that is to say, for training data.

01:12:11 So what’s priors?

01:12:14 So prior is whatever information you

01:12:18 have about a given task before you

01:12:21 start learning about this task.

01:12:23 And how’s that different from experience?

01:12:25 Well, experience is acquired.

01:12:28 So for instance, if you’re trying to play Go,

01:12:31 your experience with Go is all the Go games

01:12:33 you’ve played, or you’ve seen, or you’ve simulated

01:12:37 in your mind, let’s say.

01:12:38 And your priors are things like, well,

01:12:42 Go is a game on the 2D grid.

01:12:45 And we have lots of hardcoded priors

01:12:48 about the organization of 2D space.

01:12:53 And the rules of how the dynamics of the physics

01:12:58 of this game in this 2D space?

01:12:59 Yes.

01:13:00 And the idea that you have what winning is.

01:13:04 Yes, exactly.

01:13:05 And other board games can also share some similarities with Go.

01:13:09 And if you’ve played these board games, then,

01:13:12 with respect to the game of Go, that

01:13:13 would be part of your priors about the game.

01:13:16 Well, it’s interesting to think about the game of Go

01:13:18 is how many priors are actually brought to the table.

01:13:22 When you look at self play, reinforcement learning based

01:13:27 mechanisms that do learning, it seems

01:13:29 like the number of priors is pretty low.

01:13:31 Yes.

01:13:31 But you’re saying you should be expec…

01:13:32 There are 2D spatial priors in the convnet.

01:13:35 Right.

01:13:36 But you should be clear at making

01:13:39 those priors explicit.

01:13:40 Yes.

01:13:41 So in particular, I think if your goal

01:13:44 is to measure a humanlike form of intelligence,

01:13:47 then you should clearly establish

01:13:49 that you want the AI you’re testing

01:13:52 to start from the same set of priors that humans start with.

01:13:57 Right.

01:13:58 So I mean, to me personally, but I think to a lot of people,

01:14:02 the human side of things is very interesting.

01:14:05 So testing intelligence for humans.

01:14:08 What do you think is a good test of human intelligence?

01:14:14 Well, that’s the question that psychometrics is interested in.

01:14:19 There’s an entire subfield of psychology

01:14:22 that deals with this question.

01:14:23 So what’s psychometrics?

01:14:25 The psychometrics is the subfield of psychology

01:14:27 that tries to measure, quantify aspects of the human mind.

01:14:33 So in particular, our cognitive abilities, intelligence,

01:14:36 and personality traits as well.

01:14:39 So what are, it might be a weird question,

01:14:43 but what are the first principles of psychometrics

01:14:49 this operates on?

01:14:52 What are the priors it brings to the table?

01:14:55 So it’s a field with a fairly long history.

01:15:01 So psychology sometimes gets a bad reputation

01:15:05 for not having very reproducible results.

01:15:09 And psychometrics has actually some fairly solidly

01:15:12 reproducible results.

01:15:14 So the ideal goal of the field is that a test

01:15:17 should be reliable, which is a notion tied to reproducibility.

01:15:23 It should be valid, meaning that it should actually

01:15:26 measure what you say it measures.

01:15:30 So for instance, if you’re saying

01:15:32 that you’re measuring intelligence,

01:15:34 then your test results should be correlated

01:15:36 with things that you expect to be correlated

01:15:39 with intelligence like success in school

01:15:41 or success in the workplace and so on.

01:15:43 Should be standardized, meaning that you

01:15:46 can administer your tests to many different people

01:15:48 in the same conditions.

01:15:50 And it should be free from bias.

01:15:52 Meaning that, for instance, if your test involves

01:15:57 the English language, then you have

01:15:59 to be aware that this creates a bias against people

01:16:02 who have English as their second language

01:16:04 or people who can’t speak English at all.

01:16:07 So of course, these principles for creating

01:16:10 psychometric tests are very much an ideal.

01:16:13 I don’t think every psychometric test is really either

01:16:17 reliable, valid, or free from bias.

01:16:22 But at least the field is aware of these weaknesses

01:16:25 and is trying to address them.

01:16:27 So it’s kind of interesting.

01:16:30 Ultimately, you’re only able to measure,

01:16:31 like you said previously, the skill.

01:16:34 But you’re trying to do a bunch of measures

01:16:36 of different skills that correlate,

01:16:38 as you mentioned, strongly with some general concept

01:16:41 of cognitive ability.

01:16:43 Yes, yes.

01:16:44 So what’s the G factor?

01:16:46 So right, there are many different kinds

01:16:48 of tests of intelligence.

01:16:50 And each of them is interested in different aspects

01:16:55 of intelligence.

01:16:56 Some of them will deal with language.

01:16:57 Some of them will deal with spatial vision,

01:17:00 maybe mental rotations, numbers, and so on.

01:17:04 When you run these very different tests at scale,

01:17:08 what you start seeing is that there

01:17:10 are clusters of correlations among test results.

01:17:14 So for instance, if you look at homework at school,

01:17:19 you will see that people who do well at math

01:17:21 are also likely statistically to do well in physics.

01:17:25 And what’s more, people who do well at math and physics

01:17:30 are also statistically likely to do well

01:17:32 in things that sound completely unrelated,

01:17:35 like writing an English essay, for instance.

01:17:38 And so when you see clusters of correlations

01:17:42 in statistical terms, you would explain them

01:17:46 with the latent variable.

01:17:47 And the latent variable that would, for instance, explain

01:17:51 the relationship between being good at math

01:17:53 and being good at physics would be cognitive ability.

01:17:57 And the G factor is the latent variable

01:18:00 that explains the fact that, for every test of intelligence

01:18:05 that you can come up with, results on these tests

01:18:09 end up being correlated.

01:18:10 So there is some single unique variable

01:18:16 that explains these correlations.

01:18:17 That’s the G factor.

01:18:18 So it’s a statistical construct.

01:18:20 It’s not really something you can directly measure,

01:18:23 for instance, in a person.

01:18:25 But it’s there.

01:18:26 But it’s there.

01:18:27 It’s there.

01:18:27 It’s there at scale.
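As an illustration of what showing up "at scale" means, here is a minimal sketch with synthetic test scores: a single simulated latent ability produces positive correlations across all tests, and the first principal component recovers a large shared factor. PCA is used here as a crude stand-in for the factor analysis psychometricians actually use, and every number is made up.

```python
# Sketch: how a single latent factor shows up in correlated test scores.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_people = 1000

# Simulate a latent ability plus test-specific noise for five different tests.
g = rng.normal(size=n_people)
loadings = np.array([0.8, 0.7, 0.6, 0.7, 0.5])   # how strongly each test reflects g
noise = rng.normal(size=(n_people, 5))
scores = np.outer(g, loadings) + 0.6 * noise      # shape: (people, tests)

# All pairwise correlations come out positive (the "positive manifold").
print(np.round(np.corrcoef(scores, rowvar=False), 2))

# The first principal component captures a large share of the variance;
# that shared component is the statistical construct analogous to g.
pca = PCA().fit(scores)
print("variance explained by first component:",
      round(pca.explained_variance_ratio_[0], 2))
```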

01:18:28 And that’s also one thing I want to mention about psychometrics.

01:18:33 Like when you talk about measuring intelligence

01:18:36 in humans, for instance, some people

01:18:38 get a little bit worried.

01:18:40 They will say, that sounds dangerous.

01:18:41 Maybe that sounds potentially discriminatory, and so on.

01:18:44 And they’re not wrong.

01:18:46 And the thing is, personally, I’m

01:18:48 not interested in psychometrics as a way

01:18:51 to characterize one individual person.

01:18:54 Like if I get your psychometric personality

01:18:59 assessments or your IQ, I don’t think that actually

01:19:01 tells me much about you as a person.

01:19:05 I think psychometrics is most useful as a statistical tool.

01:19:10 So it’s most useful at scale.

01:19:12 It’s most useful when you start getting test results

01:19:15 for a large number of people.

01:19:17 And you start cross correlating these test results.

01:19:20 Because that gives you information

01:19:23 about the structure of the human mind,

01:19:26 in particular about the structure

01:19:28 of human cognitive abilities.

01:19:29 So at scale, psychometrics paints a certain picture

01:19:34 of the human mind.

01:19:35 And that’s interesting.

01:19:37 And that’s what’s relevant to AI, the structure

01:19:39 of human cognitive abilities.

01:19:41 Yeah, it gives you an insight into it.

01:19:42 I mean, to me, I remember when I learned about G factor,

01:19:45 it seemed like it would be impossible for it

01:19:52 to be real, even as a statistical variable.

01:19:55 Like it felt kind of like astrology.

01:19:59 Like it’s like wishful thinking among psychologists.

01:20:01 But the more I learned, I realized that there’s some.

01:20:05 I mean, I’m not sure what to make of human beings,

01:20:07 the fact that the G factor is a thing.

01:20:10 There’s a commonality across all of human species,

01:20:13 that there does seem to be a strong correlation

01:20:15 between cognitive abilities.

01:20:17 That’s kind of fascinating, actually.

01:20:19 So human cognitive abilities have a structure.

01:20:22 Like the most mainstream theory of the structure

01:20:25 of cognitive abilities is called CHC theory.

01:20:28 It’s Cattell, Horn, Carroll.

01:20:30 It’s named after the three psychologists who

01:20:33 contributed key pieces of it.

01:20:35 And it describes cognitive abilities

01:20:38 as a hierarchy with three levels.

01:20:41 And at the top, you have the G factor.

01:20:43 Then you have broad cognitive abilities,

01:20:46 for instance fluid intelligence, that

01:20:49 encompass a broad set of possible kinds of tasks

01:20:54 that are all related.

01:20:57 And then you have narrow cognitive abilities

01:20:59 at the last level, which is closer to task specific skill.

01:21:04 And there are actually different theories of the structure

01:21:09 of cognitive abilities that just emerge

01:21:10 from different statistical analysis of IQ test results.

01:21:14 But they all describe a hierarchy with a kind of G

01:21:18 factor at the top.

01:21:21 And you’re right that the G factor,

01:21:23 it’s not quite real in the sense that it’s not something

01:21:27 you can observe and measure, like your height,

01:21:29 for instance.

01:21:30 But it’s real in the sense that you

01:21:32 see it in a statistical analysis of the data.

01:21:37 One thing I want to mention is that the fact

01:21:39 that there is a G factor does not really

01:21:41 mean that human intelligence is general in a strong sense.

01:21:45 It does not mean human intelligence

01:21:47 can be applied to any problem at all,

01:21:50 and that someone who has a high IQ

01:21:52 is going to be able to solve any problem at all.

01:21:54 That’s not quite what it means.

01:21:55 I think one popular analogy to understand it

01:22:00 is the sports analogy.

01:22:03 If you consider the concept of physical fitness,

01:22:06 it’s a concept that’s very similar to intelligence

01:22:09 because it’s a useful concept.

01:22:11 It’s something you can intuitively understand.

01:22:14 Some people are fit, maybe like you.

01:22:17 Some people are not as fit, maybe like me.

01:22:20 But none of us can fly.

01:22:22 Absolutely.

01:22:23 It’s constrained to a specific set of skills.

01:22:25 Even if you’re very fit, that doesn’t

01:22:27 mean you can do anything at all in any environment.

01:22:31 You obviously cannot fly.

01:22:32 You cannot survive at the bottom of the ocean and so on.

01:22:36 And if you were a scientist and you

01:22:38 wanted to precisely define and measure physical fitness

01:22:42 in humans, then you would come up with a battery of tests.

01:22:47 You would have running 100 meter, playing soccer,

01:22:51 playing table tennis, swimming, and so on.

01:22:54 And if you ran these tests over many different people,

01:22:58 you would start seeing correlations in test results.

01:23:01 For instance, people who are good at soccer

01:23:03 are also good at sprinting.

01:23:05 And you would explain these correlations

01:23:08 with physical abilities that are strictly

01:23:11 analogous to cognitive abilities.

01:23:14 And then you would start also observing correlations

01:23:17 between biological characteristics,

01:23:21 like maybe lung volume is correlated with being

01:23:24 a fast runner, for instance, in the same way

01:23:27 that there are neurophysiological correlates of cognitive

01:23:32 abilities.

01:23:33 And at the top of the hierarchy of physical abilities

01:23:38 that you would be able to observe,

01:23:39 you would have a G factor, a physical G factor, which

01:23:43 would map to physical fitness.

01:23:45 And as you just said, that doesn’t

01:23:47 mean that people with high physical fitness can’t fly.

01:23:51 It doesn’t mean human morphology and human physiology

01:23:54 is universal.

01:23:55 It’s actually super specialized.

01:23:57 We can only do the things that we were evolved to do.

01:24:04 We are not adapted to, you could not

01:24:08 exist on Venus or Mars or in the void of space

01:24:11 or the bottom of the ocean.

01:24:12 So that said, one thing that’s really striking and remarkable

01:24:17 is that our morphology generalizes

01:24:23 far beyond the environments that we evolved for.

01:24:27 Like in a way, you could say we evolved to run after prey

01:24:31 in the savanna, right?

01:24:32 That’s very much where our human morphology comes from.

01:24:36 And that said, we can do a lot of things

01:24:40 that are completely unrelated to that.

01:24:42 We can climb mountains.

01:24:44 We can swim across lakes.

01:24:47 We can play table tennis.

01:24:48 I mean, table tennis is very different from what

01:24:51 we were evolved to do, right?

01:24:53 So our morphology, our bodies, our sensorimotor

01:24:56 affordances have a degree of generality

01:24:59 that is absolutely remarkable, right?

01:25:02 And I think cognition is very similar to that.

01:25:05 Our cognitive abilities have a degree of generality

01:25:08 that goes far beyond what the mind was initially

01:25:11 supposed to do, which is why we can play music and write

01:25:14 novels and go to Mars and do all kinds of crazy things.

01:25:18 But it’s not universal in the same way

01:25:20 that human morphology and our body

01:25:23 is not appropriate for actually most of the universe by volume.

01:25:27 In the same way, you could say that the human mind is not

01:25:29 really appropriate for most of problem space,

01:25:32 potential problem space by volume.

01:25:35 So we have very strong cognitive biases, actually,

01:25:39 that mean that there are certain types of problems

01:25:42 that we handle very well and certain types of problems

01:25:45 that we are completely ill adapted for.

01:25:48 So that’s really how we’d interpret the G factor.

01:25:52 It’s not a sign of strong generality.

01:25:56 It’s really just the broadest cognitive ability.

01:26:01 But our abilities, whether we are

01:26:03 talking about sensory motor abilities or cognitive

01:26:05 abilities, they still remain very specialized

01:26:09 in the human condition, right?

01:26:12 Within the constraints of the human cognition,

01:26:16 they’re general.

01:26:18 Yes, absolutely.

01:26:19 But the constraints, as you’re saying, are very limited.

01:26:22 I think what’s limiting.

01:26:23 So we evolved our cognition and our body

01:26:26 evolved in very specific environments.

01:26:29 Because our environment was so variable, fast changing,

01:26:32 and so unpredictable, part of the constraints

01:26:35 that drove our evolution is generality itself.

01:26:39 So we were, in a way, evolved to be able to improvise

01:26:42 in all kinds of physical or cognitive environments.

01:26:47 And for this reason, it turns out

01:26:49 that the minds and bodies that we ended up with

01:26:55 can be applied to much, much broader scope

01:26:58 than what they were evolved for.

01:27:00 And that’s truly remarkable.

01:27:01 And that’s a degree of generalization

01:27:03 that is far beyond anything you can see in artificial systems

01:27:07 today.

01:27:10 That said, it does not mean that human intelligence

01:27:14 is anywhere universal.

01:27:16 Yeah, it’s not general.

01:26:18 A kind of exciting topic for people,

01:27:21 even outside of artificial intelligence, is IQ tests.

01:27:27 I think it’s Mensa, whatever.

01:27:29 There’s different degrees of difficulty for questions.

01:27:32 We talked about this offline a little bit, too,

01:27:34 about difficult questions.

01:27:37 What makes a question on an IQ test more difficult or less

01:27:42 difficult, do you think?

01:27:43 So the thing to keep in mind is that there’s

01:27:46 no such thing as a question that’s intrinsically difficult.

01:27:51 It has to be difficult with respect to the things you

01:27:54 already know and the things you can already do, right?

01:27:58 So in terms of an IQ test question,

01:28:02 typically it would be structured, for instance,

01:28:05 as a set of demonstration input and output pairs, right?

01:28:11 And then you would be given a test input, a prompt,

01:28:15 and you would need to recognize or produce

01:28:18 the corresponding output.

01:28:20 And in that narrow context, you could say a difficult question

01:28:26 is a question where the input prompt is

01:28:31 very surprising and unexpected, given the training examples.

01:28:36 Just even the nature of the patterns

01:28:38 that you’re observing in the input prompt.

01:28:40 For instance, let’s say you have a rotation problem.

01:28:43 You must rotate the shape by 90 degrees.

01:28:46 If I give you two examples and then I give you one prompt,

01:28:50 which is actually one of the two training examples,

01:28:53 then there is zero generalization difficulty

01:28:55 for the task.

01:28:56 It’s actually a trivial task.

01:28:57 You just recognize that it’s one of the training examples,

01:29:00 and you produce the same answer.

01:29:02 Now, if it’s a more complex shape,

01:29:05 there is a little bit more generalization,

01:29:07 but it remains that you are still

01:29:09 doing the same thing at test time,

01:29:12 as you were being demonstrated at training time.

01:29:15 A difficult task starts to require some amount of test

01:29:20 time adaptation, some amount of improvisation, right?

01:29:25 So consider, I don’t know, you’re

01:29:29 teaching a class on quantum physics or something.

01:29:34 If you wanted to test the understanding that students

01:29:40 have of the material, you would come up

01:29:42 with an exam that’s very different from anything

01:29:47 they’ve seen on the internet when they were cramming.

01:29:51 On the other hand, if you wanted to make it easy,

01:29:54 you would just give them something

01:29:56 that’s very similar to the mock exams

01:30:00 that they’ve taken, something that’s

01:30:03 just a simple interpolation of questions

01:30:05 that they’ve already seen.

01:30:07 And so that would be an easy exam.

01:30:09 It’s very similar to what you’ve been trained on.

01:30:11 And a difficult exam is one that really probes your understanding

01:30:15 because it forces you to improvise.

01:30:18 It forces you to do things that are

01:30:22 different from what you were exposed to before.

01:30:24 So that said, it doesn’t mean that the exam that

01:30:29 requires improvisation is intrinsically hard, right?

01:30:32 Because maybe you’re a quantum physics expert.

01:30:35 So when you take the exam, this is actually

01:30:37 stuff that, despite being new to the students,

01:30:40 it’s not new to you, right?

01:30:42 So it can only be difficult with respect

01:30:46 to what the test taker already knows

01:30:49 and with respect to the information

01:30:51 that the test taker has about the task.

01:30:54 So that’s what I mean by controlling for priors,

01:30:57 the information you bring to the table.

01:30:59 And the experience.

01:31:00 And the experience, which is the training data.

01:31:02 So in the case of the quantum physics exam,

01:31:05 that would be all the course material itself

01:31:09 and all the mock exams that students

01:31:11 might have taken online.

01:31:12 Yeah, it’s interesting because I’ve also sent you an email.

01:31:17 I asked you, I’ve been in just this curious question

01:31:21 of what’s a really hard IQ test question.

01:31:27 And I’ve been talking to also people

01:31:30 who have designed IQ tests.

01:31:32 There’s a few folks on the internet, it’s like a thing.

01:31:34 People are really curious about it.

01:31:36 First of all, most of the IQ tests they designed,

01:31:39 they like religiously protect the correct answers.

01:31:45 Like you can’t find the correct answers anywhere.

01:31:48 In fact, the question is ruined once you know,

01:31:50 even like the approach you’re supposed to take.

01:31:53 So they’re very…

01:31:54 That said, the approach is implicit in the training examples.

01:31:58 So if you release the training examples, it’s over.

01:32:02 Which is why in Arc, for instance,

01:32:04 there is a test set that is private and no one has seen it.

01:32:09 No, for really tough IQ questions, it’s not obvious.

01:32:13 It’s because of the ambiguity.

01:32:17 Like it’s, I mean, we’ll have to look through them,

01:32:20 but like some number sequences and so on,

01:32:22 it’s not completely clear.

01:32:25 So like you can get a sense, but there’s like some,

01:32:30 you know, when you look at a number sequence, I don’t know,

01:32:36 like your Fibonacci number sequence,

01:32:37 if you look at the first few numbers,

01:32:39 that sequence could be completed in a lot of different ways.

01:32:42 And you know, some are, if you think deeply,

01:32:45 are more correct than others.

01:32:46 Like there’s a kind of intuitive simplicity

01:32:51 and elegance to the correct solution.

01:32:53 Yes.

01:32:53 I am personally not a fan of ambiguity

01:32:56 in test questions actually,

01:32:58 but I think you can have difficulty

01:33:01 without requiring ambiguity simply by making the test

01:33:05 require a lot of extrapolation over the training examples.

01:33:09 But a beautiful question is difficult,

01:33:13 but gives away everything

01:33:14 when you give the training examples.

01:33:17 Basically, yes.

01:33:18 Meaning that, so the tests I’m interested in creating

01:33:24 are not necessarily difficult for humans

01:33:27 because human intelligence is the benchmark.

01:33:31 They’re supposed to be difficult for machines

01:33:34 in ways that are easy for humans.

01:33:36 Like I think an ideal test of human and machine intelligence

01:33:40 is a test that is actionable,

01:33:44 that highlights the need for progress,

01:33:48 and that highlights the direction

01:33:50 in which you should be making progress.

01:33:51 I think we’ll talk about the ARC challenge

01:33:54 and the test you’ve constructed

01:33:55 and you have these elegant examples.

01:33:58 I think that highlight,

01:33:59 like this is really easy for us humans,

01:34:01 but it’s really hard for machines.

01:34:04 But on the, you know, the designing an IQ test

01:34:09 for IQs of like higher than 160 and so on,

01:34:13 you have to say, you have to take that

01:34:15 and put it on steroids, right?

01:34:16 You have to think like, what is hard for humans?

01:34:19 And that’s a fascinating exercise in itself, I think.

01:34:25 And it was an interesting question

01:34:27 of what it takes to create a really hard question for humans

01:34:32 because you again have to do the same process

01:34:36 as you mentioned, which is, you know,

01:34:39 something basically where, despite the experience

01:34:45 that you’re likely to have encountered

01:34:46 throughout your whole life,

01:34:48 even if you’ve prepared for IQ tests,

01:34:51 which is a big challenge,

01:34:53 this will still be novel for you.

01:34:55 Yeah, I mean, novelty is a requirement.

01:34:58 You should not be able to practice for the questions

01:35:02 that you’re gonna be tested on.

01:35:03 That’s important because otherwise what you’re doing

01:35:06 is not exhibiting intelligence.

01:35:08 What you’re doing is just retrieving

01:35:10 what you’ve been exposed before.

01:35:12 It’s the same thing as a deep learning model.

01:35:14 If you train a deep learning model

01:35:15 on all the possible answers, then it will ace your test

01:35:20 in the same way that, you know,

01:35:24 a stupid student can still ace the test

01:35:28 if they cram for it.

01:35:30 They memorize, you know,

01:35:32 a hundred different possible mock exams.

01:35:34 And then they hope that the actual exam

01:35:37 will be a very simple interpolation of the mock exams.

01:35:41 And that student could just be a deep learning model

01:35:43 at that point.

01:35:44 But you can actually do that

01:35:45 without any understanding of the material.

01:35:48 And in fact, many students pass their exams

01:35:50 in exactly this way.

01:35:51 And if you want to avoid that,

01:35:53 you need an exam that’s unlike anything they’ve seen

01:35:56 that really probes their understanding.

01:36:00 So how do we design an IQ test for machines,

01:36:05 an intelligent test for machines?

01:36:07 All right, so in the paper I outline

01:36:10 a number of requirements that you expect of such a test.

01:36:14 And in particular, we should start by acknowledging

01:36:19 the priors that we expect to be required

01:36:23 in order to perform the test.

01:36:25 So we should be explicit about the priors, right?

01:36:28 And if the goal is to compare machine intelligence

01:36:31 and human intelligence,

01:36:32 then we should assume human cognitive priors, right?

01:36:36 And secondly, we should make sure that we are testing

01:36:42 for skill acquisition ability,

01:36:44 skill acquisition efficiency in particular,

01:36:46 and not for skill itself.

01:36:48 Meaning that every task featured in your test

01:36:51 should be novel and should not be something

01:36:54 that you can anticipate.

01:36:55 So for instance, it should not be possible

01:36:57 to brute force the space of possible questions, right?

01:37:02 To pre generate every possible question and answer.

01:37:06 So it should be tasks that cannot be anticipated,

01:37:10 not just by the system itself,

01:37:12 but by the creators of the system, right?

01:37:15 Yeah, you know what’s fascinating?

01:37:17 I mean, one of my favorite aspects of the paper

01:37:20 and the work you do with the ARC challenge

01:37:22 is the process of making priors explicit.

01:37:28 Just even that act alone is a really powerful one

01:37:33 of like, what are, it’s a really powerful question

01:37:39 asked of us humans.

01:37:40 What are the priors that we bring to the table?

01:37:44 So the next step is like, once you have those priors,

01:37:46 how do you use them to solve a novel task?

01:37:50 But like, just even making the priors explicit

01:37:52 is a really difficult and really powerful step.

01:37:56 And that’s like visually beautiful

01:37:58 and conceptually philosophically beautiful part

01:38:01 of the work you did with, and I guess continue to do

01:38:06 probably with the paper and the ARC challenge.

01:38:08 Can you talk about some of the priors

01:38:10 that we’re talking about here?

01:38:12 Yes, so a researcher who has done a lot of work

01:38:15 on what exactly are the knowledge priors

01:38:19 that are innate to humans is Elizabeth Spelke from Harvard.

01:38:26 So she developed the core knowledge theory,

01:38:30 which outlines four different core knowledge systems.

01:38:36 So systems of knowledge that we are basically

01:38:39 either born with or that we are hardwired

01:38:43 to acquire very early on in our development.

01:38:47 And there’s no strong distinction between the two.

01:38:52 Like if you are primed to acquire

01:38:57 a certain type of knowledge in just a few weeks,

01:39:01 you might as well just be born with it.

01:39:03 It’s just part of who you are.

01:39:06 And so there are four different core knowledge systems.

01:39:09 Like the first one is the notion of objectness

01:39:13 and basic physics.

01:39:16 Like you recognize that something that moves

01:39:20 coherently, for instance, is an object.

01:39:23 So we intuitively, naturally, innately divide the world

01:39:28 into objects based on this notion of coherence,

01:39:31 physical coherence.

01:39:32 And in terms of elementary physics,

01:39:34 there’s the fact that objects can bump against each other

01:39:41 and the fact that they can occlude each other.

01:39:44 So these are things that we are essentially born with

01:39:48 or at least that we are going to be acquiring extremely early

01:39:52 because we’re really hardwired to acquire them.

01:39:55 So a bunch of points, pixels that move together

01:39:59 are part of the same object.

01:40:02 Yes.

01:40:07 I don’t smoke weed, but if I did,

01:40:11 that’s something I could sit all night

01:40:13 and just think about, remember what I wrote in your paper,

01:40:15 just objectness, I wasn’t self aware, I guess,

01:40:19 of that particular prior.

01:40:23 That’s such a fascinating prior that like…

01:40:28 That’s the most basic one, but actually…

01:40:30 Objectness, just identity, just objectness.

01:40:34 It’s very basic, I suppose, but it’s so fundamental.

01:40:39 It is fundamental to human cognition.

01:40:41 Yeah.

01:40:42 The second prior that’s also fundamental is agentness,

01:40:46 which is not a real word, by the way. So, agentness.

01:40:50 The fact that some of these objects

01:40:53 that you segment your environment into,

01:40:56 some of these objects are agents.

01:40:58 So what’s an agent?

01:41:00 It’s basically, it’s an object that has goals.

01:41:05 That has what?

01:41:06 That has goals, that is capable of pursuing goals.

01:41:09 So for instance, if you see two dots

01:41:12 moving in roughly synchronized fashion,

01:41:16 you will intuitively infer that one of the dots

01:41:19 is pursuing the other.

01:41:21 So that one of the dots is…

01:41:24 And one of the dots is an agent

01:41:27 and its goal is to avoid the other dot.

01:41:29 And one of the dots, the other dot is also an agent

01:41:32 and its goal is to catch the first dot.

01:41:35 Spelke has shown that babies as young as three months

01:41:40 identify agentness and goal directedness

01:41:45 in their environment.

01:41:46 Another prior is basic geometry and topology,

01:41:52 like the notion of distance,

01:41:53 the ability to navigate in your environment and so on.

01:41:57 This is something that is fundamentally hardwired

01:42:01 into our brain.

01:42:02 It’s in fact backed by very specific neural mechanisms,

01:42:07 like for instance, grid cells and place cells.

01:42:10 So it’s something that’s literally hard coded

01:42:15 at the neural level in our hippocampus.

01:42:19 And the last prior would be the notion of numbers.

01:42:23 Like numbers are not actually a cultural construct.

01:42:26 We are intuitively, innately able to do some basic counting

01:42:31 and to compare quantities.

01:42:34 So it doesn’t mean we can do arbitrary arithmetic.

01:42:37 Counting, the actual counting.

01:42:39 Counting, like counting one, two, three ish,

01:42:41 then maybe more than three.

01:42:43 You can also compare quantities.

01:42:45 If I give you three dots and five dots,

01:42:48 you can tell the side with five dots has more dots.

01:42:52 So this is actually an innate prior.

01:42:56 So that said, the list may not be exhaustive.

01:43:00 So Spelke is still, you know,

01:43:02 positing the potential existence of new knowledge systems.

01:43:08 For instance, knowledge systems that deal

01:43:12 with social relationships.

01:43:15 Yeah, I mean, and there could be…

01:43:17 Which is much less relevant to something like ARC

01:43:22 or IQ test and so on.

01:43:22 Right.

01:43:23 There could be stuff that’s like you said,

01:43:26 rotation, symmetry, is there like…

01:43:29 Symmetry is really interesting.

01:43:31 It’s very likely that there is, speaking about rotation,

01:43:34 that there is in the brain, a hard coded system

01:43:38 that is capable of performing rotations.

01:43:42 One famous experiment that people did in the…

01:43:45 I don’t remember when it was exactly,

01:43:48 but in the 70s was that people found that

01:43:53 if you asked people, if you give them two different shapes

01:43:57 and one of the shapes is a rotated version

01:44:01 of the first shape, and you ask them,

01:44:03 is that shape a rotated version of the first shape or not?

01:44:07 What you see is that the time it takes people to answer

01:44:11 is linearly proportional, right, to the angle of rotation.

01:44:16 So it’s almost like you have somewhere in your brain

01:44:19 like a turntable with a fixed speed.

01:44:24 And if you want to know if two objects are a rotated version

01:44:28 of each other, you put the object on the turntable,

01:44:31 you let it move around a little bit,

01:44:34 and then you stop when you have a match.

01:44:37 And that’s really interesting.
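A toy sketch of that turntable idea, applied to small grids: step through 90-degree rotations until the two shapes match, so the number of steps plays the role of the rotation angle in the experiment. This is purely illustrative, not a claim about how the brain implements it.

```python
# Sketch of the "turntable": rotate one grid in 90-degree steps until it
# matches the other, and count how many steps that took.
import numpy as np

def rotation_steps(a, b, max_steps=4):
    """Return how many 90-degree counter-clockwise steps turn grid a into
    grid b, or None if b is not a rotation of a."""
    current = np.array(a)
    target = np.array(b)
    for k in range(max_steps):
        if np.array_equal(current, target):
            return k
        current = np.rot90(current)
    return None

shape = [[1, 1, 0],
         [0, 1, 0],
         [0, 1, 0]]
rotated = np.rot90(np.array(shape), k=3).tolist()   # rotated clockwise once

print(rotation_steps(shape, rotated))            # -> 3 counter-clockwise steps
print(rotation_steps(shape, [[1, 0], [0, 1]]))   # -> None, not a rotation
```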

01:44:40 So what’s the ARC challenge?

01:44:42 So in the paper, I outline all these principles

01:44:47 that a good test of machine intelligence

01:44:50 and human intelligence should follow.

01:44:51 And the ARC challenge is one attempt

01:44:55 to embody as many of these principles as possible.

01:44:58 So I don’t think it’s anywhere near a perfect attempt, right?

01:45:03 It does not actually follow every principle,

01:45:06 but it is what I was able to do given the constraints.

01:45:10 So the format of ARC is very similar to classic IQ tests,

01:45:15 in particular Raven’s Progressive Matrices.

01:45:18 Raven’s?

01:45:18 Yeah, Raven’s Progressive Matrices.

01:45:20 I mean, if you’ve done IQ tests in the past,

01:45:22 you know what that is, probably.

01:45:24 Or at least you’ve seen it, even if you

01:45:25 don’t know what it’s called.

01:45:26 And so you have a set of tasks, that’s what they’re called.

01:45:32 And for each task, you have training data,

01:45:37 which is a set of input and output pairs.

01:45:40 So an input or output pair is a grid of colors, basically.

01:45:45 The grid, the size of the grid is variable.

01:45:51 And you’re given an input, and you must transform it

01:45:56 into the proper output.

01:45:59 And so you’re shown a few demonstrations

01:46:02 of a task in the form of existing input output pairs,

01:46:05 and then you’re given a new input.

01:46:06 And you must provide, you must produce the correct output.
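To make the format concrete, here is a sketch of what an ARC-style task looks like as data, together with a deliberately trivial solver. The task shown is invented for illustration, and the train/test JSON layout reflects the public ARC repository as I understand it, not an official specification.

```python
# Sketch: the shape of an ARC-style task and a toy "solver" harness.
import json

task = {
    "train": [  # a few demonstration input/output grid pairs (colors are ints)
        {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
        {"input": [[2, 0], [0, 2]], "output": [[0, 2], [2, 0]]},
    ],
    "test": [   # new inputs for which the solver must produce outputs
        {"input": [[3, 0], [0, 3]]},
    ],
}

def solve(train_pairs, test_input):
    """Toy solver that hardcodes the guess 'swap the two columns'.
    A real solver has to infer the transformation from the demonstrations."""
    return [list(reversed(row)) for row in test_input]

for pair in task["test"]:
    prediction = solve(task["train"], pair["input"])
    print(json.dumps(prediction))   # -> [[0, 3], [3, 0]]
```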

01:46:12 And the assumption in ARC is that every task should only

01:46:22 require core knowledge priors, should not

01:46:27 require any outside knowledge.

01:46:30 So for instance, no language, no English, nothing like this.

01:46:36 No concepts taken from our human experience,

01:46:41 like trees, dogs, cats, and so on.

01:46:44 So only reasoning tasks that are built on top

01:46:49 of core knowledge priors.

01:46:52 And some of the tasks are actually explicitly

01:46:56 trying to probe specific forms of abstraction.

01:47:02 Part of the reason why I wanted to create Arc

01:47:05 is I’m a big believer in when you’re

01:47:11 faced with a problem as murky as understanding

01:47:18 how to autonomously generate abstraction in a machine,

01:47:22 you have to coevolve the solution and the problem.

01:47:27 And so part of the reason why I designed Arc

01:47:29 was to clarify my ideas about the nature of abstraction.

01:47:34 And some of the tasks are actually

01:47:36 designed to probe bits of that theory.

01:47:39 And there are things that turn out

01:47:42 to be very easy for humans to perform, including young kids,

01:47:46 but turn out to be near impossible for machines.

01:47:50 So what have you learned from the nature of abstraction

01:47:53 from designing that?

01:47:58 Can you clarify what you mean?

01:47:59 One of the things you wanted to try to understand

01:48:02 was this idea of abstraction.

01:48:06 Yes, so clarifying my own ideas about abstraction

01:48:10 by forcing myself to produce tasks that

01:48:13 would require the ability to produce

01:48:17 that form of abstraction in order to solve them.

01:48:19 Got it.

01:48:20 OK, so and by the way, just the people should check out.

01:48:24 I’ll probably overlay if you’re watching the video part.

01:48:26 But the grid input output with the different colors

01:48:32 on the grid, that’s it.

01:48:34 I mean, it’s a very simple world,

01:48:36 but it’s kind of beautiful.

01:48:37 It’s very similar to classic IQ tests.

01:48:39 It’s not very original in that sense.

01:48:41 The main difference with IQ tests

01:48:43 is that we make the priors explicit, which is not

01:48:46 usually the case in IQ tests.

01:48:48 So you make it explicit that everything should only

01:48:50 be built on top of core knowledge priors.

01:48:53 I also think it’s generally more diverse than IQ tests

01:48:58 in general.

01:49:00 And it perhaps requires a bit more manual work

01:49:03 to produce solutions, because you

01:49:05 have to click around on a grid for a while.

01:49:08 Sometimes the grids can be as large as 30 by 30 cells.

01:49:12 So how did you come up, if you can reveal, with the questions?

01:49:18 What’s the process of the questions?

01:49:19 Was it mostly you that came up with the questions?

01:49:23 How difficult is it to come up with a question?

01:49:25 Is this scalable to a much larger number?

01:49:30 If we think, with IQ tests, you might not necessarily

01:49:33 want it to or need it to be scalable.

01:49:36 With machines, it’s possible, you

01:49:39 could argue, that it needs to be scalable.

01:49:41 So there are 1,000 questions, 1,000 tasks,

01:49:46 including the test set, the private test set.

01:49:49 I think it’s fairly difficult in the sense

01:49:51 that a big requirement is that every task should

01:49:54 be novel and unique and unpredictable.

01:50:00 You don’t want to create your own little world that

01:50:04 is simple enough that it would be possible for a human

01:50:08 to reverse engineer and write down

01:50:12 an algorithm that could generate every possible arc

01:50:15 task and their solution.

01:50:17 So that would completely invalidate the test.

01:50:19 So you’re constantly coming up with new stuff.

01:50:21 Yeah, you need a source of novelty,

01:50:24 of unfakeable novelty.

01:50:27 And one thing I found is that, as a human,

01:50:32 you are not a very good source of unfakeable novelty.

01:50:36 And so you have to pace the creation of these tasks

01:50:40 quite a bit.

01:50:41 There are only so many unique tasks

01:50:42 that you can do in a given day.

01:50:45 So that means coming up with truly original new ideas.

01:50:49 Did psychedelics help you at all?

01:50:52 No, I’m just kidding.

01:50:53 But I mean, that’s fascinating to think about.

01:50:55 So you would be walking or something like that.

01:50:58 Are you constantly thinking of something totally new?

01:51:02 Yes.

01:51:06 This is hard.

01:51:06 This is hard.

01:51:07 Yeah, I mean, I’m not saying I’ve done anywhere

01:51:10 near a perfect job at it.

01:51:12 There is some amount of redundancy,

01:51:14 and there are many imperfections in ARC.

01:51:16 So that said, you should consider

01:51:18 ARC as a work in progress.

01:51:19 It is not the definitive state.

01:51:25 The ARC tasks today are not the definitive state of the test.

01:51:29 I want to keep refining it in the future.

01:51:32 I also think it should be possible to open up

01:51:36 the creation of tasks to a broad audience

01:51:38 to do crowdsourcing.

01:51:40 That would involve several levels of filtering,

01:51:43 obviously.

01:51:44 But I think it’s possible to apply crowdsourcing

01:51:46 to develop a much bigger and much more diverse ARC data set.

01:51:51 That would also be free of potentially some

01:51:54 of my own personal biases.

01:51:56 Does there always need to be a part of ARC

01:51:59 where the test is hidden?

01:52:02 Yes, absolutely.

01:52:04 It is imperative that the test that you’re

01:52:08 using to actually benchmark algorithms

01:52:11 is not accessible to the people developing these algorithms.

01:52:15 Because otherwise, what’s going to happen

01:52:16 is that the human engineers are just

01:52:19 going to solve the tasks themselves

01:52:21 and encode their solution in program form.

01:52:24 But then, again, what you’re seeing here

01:52:27 is the process of intelligence happening

01:52:30 in the mind of the human.

01:52:31 And then you’re just capturing its crystallized output.

01:52:35 But that crystallized output is not the same thing

01:52:38 as the process that generated it.

01:52:40 It’s not intelligent in itself.
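
This is essentially what the Kaggle setup provides: the submitted program is run server-side against tasks that neither the program nor its developers have ever seen. Below is a minimal sketch of that kind of hidden-test evaluation; the solver interface and the loader name are assumptions for illustration, not the actual competition API.

```python
# Sketch of hidden-test-set evaluation. The point is that only the submitted
# program runs against the hidden tasks, so hard-coding solutions by hand is
# impossible; the program has to generalize on its own.
# `solver` and `load_hidden_tasks` are hypothetical names for illustration.

def evaluate_submission(solver, load_hidden_tasks):
    tasks = load_hidden_tasks()   # tasks that were never published anywhere
    solved = 0
    for task in tasks:
        prediction = solver(task["train"], task["test"][0]["input"])
        if prediction == task["test"][0]["output"]:
            solved += 1
    return solved / len(tasks)    # fraction of hidden tasks solved
```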

01:52:41 So, by the way, the idea of crowdsourcing it

01:52:43 is fascinating.

01:52:45 I think the creation of questions

01:52:49 is really exciting for people.

01:52:51 I think there’s a lot of really brilliant people

01:52:53 out there that love to create this kind of stuff.

01:52:56 Yeah, one thing that kind of surprised me

01:52:59 that I wasn’t expecting is that lots of people

01:53:01 seem to actually enjoy ARC as a kind of game.

01:53:05 And I was releasing it as a test,

01:53:08 as a benchmark of fluid general intelligence.

01:53:14 And lots of people just, including kids,

01:53:17 just started enjoying it as a game.

01:53:18 So I think that’s encouraging.

01:53:20 Yeah, I’m fascinated by it.

01:53:22 There’s a world of people who create IQ questions.

01:53:25 I think that’s a cool activity for machines and for humans.

01:53:32 And humans are themselves fascinated

01:53:35 by taking the questions, like measuring

01:53:40 their own intelligence.

01:53:42 I mean, that’s just really compelling.

01:53:44 It’s really interesting to me, too.

01:53:47 One of the cool things about ARC is, as you said,

01:53:48 it’s kind of inspired by IQ tests, or it

01:53:51 follows a similar process.

01:53:53 But because of its nature, because of the context

01:53:56 in which it lives, it immediately

01:53:59 forces you to think about the nature of intelligence

01:54:01 as opposed to just a test of your own.

01:54:04 It forces you to really think.

01:54:06 I don’t know if it’s within the question,

01:54:09 inherent in the question, or just the fact

01:54:11 that it lives in the test that’s supposed

01:54:13 to be a test of machine intelligence.

01:54:15 Absolutely.

01:54:15 As you solve ARC tasks as a human,

01:54:20 you will be forced to basically introspect

01:54:24 how you come up with solutions.

01:54:27 And that forces you to reflect on the human problem solving

01:54:32 process.

01:54:33 And the way your own mind generates

01:54:38 abstract representations of the problems it’s exposed to.

01:54:44 I think it’s due to the fact that the set of core knowledge

01:54:48 priors that ARC is built upon is so small.

01:54:52 It’s all a recombination of a very, very small set

01:54:58 of assumptions.

01:55:00 OK, so what’s the future of ARC?

01:55:02 So you held ARC as a challenge, as part

01:55:05 of like a Kaggle competition.

01:55:06 Yes.

01:55:07 Kaggle competition.

01:55:08 And what do you think?

01:55:11 Do you think that’s something that

01:55:13 continues for five years, 10 years,

01:55:16 like just continues growing?

01:55:17 Yes, absolutely.

01:55:18 So ARC itself will keep evolving.

01:55:21 So I’ve talked about crowdsourcing.

01:55:22 I think that’s a good avenue.

01:55:26 Another thing I’m starting is I’ll

01:55:29 be collaborating with folks from the psychology department

01:55:32 at NYU to do human testing on ARC.

01:55:36 And I think there are lots of interesting questions

01:55:38 you can start asking, especially as you start correlating

01:55:43 machine solutions to ARC tasks and the human characteristics

01:55:49 of solutions.

01:55:50 Like for instance, you can try to see

01:55:52 if there’s a relationship between the human perceived

01:55:55 difficulty of a task and the machine perceived.

01:55:59 Yes, and exactly some measure of machine

01:56:01 perceived difficulty.

01:56:02 Yeah, it’s a nice playground in which

01:56:04 to explore this very difference.

01:56:06 It’s the same thing as what we talked about with autonomous vehicles.

01:56:09 The things that could be difficult for humans

01:56:10 might be very different from the things that are difficult for machines.

01:56:13 And formalizing or making explicit that difference

01:56:17 in difficulty may teach us something fundamental

01:56:21 about intelligence.
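
As a sketch of the kind of analysis being described, one could give each task a human difficulty score (say, the error rate measured in a human study) and a machine difficulty score (say, the fraction of competition entries that failed it), then check how well the two rankings agree. The numbers below are invented purely for illustration.

```python
# Toy sketch: compare human-perceived and machine-perceived task difficulty.
# All difficulty values here are made up; only the method is illustrated.
from scipy.stats import spearmanr

tasks              = ["task_a", "task_b", "task_c", "task_d", "task_e"]
human_difficulty   = [0.10, 0.35, 0.50, 0.80, 0.95]  # e.g. human error rate
machine_difficulty = [0.05, 0.60, 0.55, 0.90, 1.00]  # e.g. share of entries that failed

rho, p_value = spearmanr(human_difficulty, machine_difficulty)
print(f"Spearman rank correlation: {rho:.2f} (p = {p_value:.3f})")

# A weak correlation would make the point above explicit: what is hard for
# humans is not necessarily what is hard for machines, and vice versa.
```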

01:56:22 So one thing I think we did well with ARC

01:56:26 is that it’s proving to be a very actionable test in the sense

01:56:33 that machine performance on ARC started at very much zero

01:56:37 initially, while humans actually found the tasks very easy.

01:56:43 And that alone was like a big red flashing light saying

01:56:48 that something is going on and that we are missing something.

01:56:52 And at the same time, machine performance

01:56:55 did not stay at zero for very long.

01:56:57 Actually, within two weeks of the Kaggle competition,

01:57:00 we started having a nonzero number.

01:57:03 And now the state of the art is around 20%

01:57:06 of the test set solved.

01:57:10 And so ARC is actually a challenge

01:57:12 where our capabilities start at zero, which indicates

01:57:16 the need for progress.

01:57:18 But it’s also not an impossible challenge.

01:57:20 It’s not inaccessible.

01:57:21 You can start making progress basically right away.

01:57:25 At the same time, we are still very far

01:57:28 from having solved it.

01:57:29 And that’s actually a very positive outcome

01:57:32 of the competition: it has proven

01:57:35 that there was no obvious shortcut to solving these tasks.

01:57:41 Yeah, so the test held up.

01:57:43 Yeah, exactly.

01:57:44 That was the primary reason to use the Kaggle competition:

01:57:46 to check if some clever person was

01:57:51 going to hack the benchmark. That did not happen.

01:57:56 People who are solving the tasks are, well,

01:58:01 in a way, actually exploring some flaws of ARC

01:58:05 that we will need to address in the future,

01:58:07 in particular, they’re essentially anticipating

01:58:09 what sort of tasks may be contained in the test set.

01:58:13 Right, which is kind of, yeah, that’s the kind of hacking.

01:58:18 It’s human hacking of the test.

01:58:20 Yes. That said, with the state of the art

01:58:23 at like 20%, we’re still very, very far from human level,

01:58:28 which is closer to 100%.

01:58:30 And I do believe that it will take a while

01:58:35 until we reach human parity on ARC.

01:58:40 And that by the time we have human parity,

01:58:43 we will have AI systems that are probably

01:58:47 pretty close to human level in terms of general fluid

01:58:50 intelligence, which is, I mean, they are not

01:58:53 going to be necessarily human like.

01:58:54 They’re not necessarily, you would not necessarily

01:58:58 recognize them as being an AGI.

01:59:01 But they would be capable of a degree of generalization

01:59:06 that matches the generalization performed

01:59:09 by human fluid intelligence.

01:59:11 Sure.

01:59:11 I mean, this is a good point, in terms

01:59:13 of general fluid intelligence, to mention your paper.

01:59:17 You describe different kinds of generalization:

01:59:21 local, broad, extreme.

01:59:23 And there’s a kind of a hierarchy that you form.

01:59:25 So when we say generalization, what are we talking about?

01:59:31 What kinds are there?

01:59:33 Right, so generalization is a very old idea.

01:59:37 I mean, it’s even older than machine learning.

01:59:39 In the context of machine learning,

01:59:40 you say a system generalizes if it can make sense of an input

01:59:47 it has not yet seen.

01:59:49 And that’s what I would call system centric generalization,

01:59:54 generalization with respect to novelty

02:00:00 for the specific system you’re considering.

02:00:02 So I think a good test of intelligence

02:00:05 should actually deal with developer aware generalization,

02:00:09 which is slightly stronger than system centric generalization.

02:00:13 So developer aware generalization

02:00:16 would be the ability to generalize

02:00:19 to novelty or uncertainty that not only the system itself has

02:00:24 not access to, but the developer of the system

02:00:26 could not have access to either.

02:00:29 That’s a fascinating meta definition.

02:00:32 So this is basically the edge case thing

02:00:37 we were talking about with autonomous vehicles.

02:00:39 Neither the developer nor the system

02:00:41 knows about the edge cases it may encounter.

02:00:44 So the system should be

02:00:47 able to generalize to the thing that nobody expected,

02:00:51 neither the designer of the training data,

02:00:54 nor obviously the contents of the training data.

02:00:59 That’s a fascinating definition.

02:01:00 So you can see degrees of generalization as a spectrum.

02:01:04 And the lowest level, which is what machine learning

02:01:08 is trying to do, is the assumption

02:01:10 that any new situation is going to be sampled

02:01:15 from a static distribution of possible situations

02:01:18 and that you already have a representative sample

02:01:21 of the distribution.

02:01:22 That’s your training data.

02:01:23 And so in machine learning, you generalize to a new sample

02:01:26 from a known distribution.

02:01:28 And the ways in which your new sample will be new or different

02:01:34 are ways that are already understood by the developers

02:01:38 of the system.

02:01:39 So you are generalizing to known unknowns

02:01:43 for one specific task.

02:01:45 That’s what you would call robustness.

02:01:47 You are robust to things like noise, small variations,

02:01:50 and so on for one fixed known distribution

02:01:56 that you know through your training data.

02:01:59 And the higher degree would be flexibility

02:02:05 in machine intelligence.

02:02:06 So flexibility would be something

02:02:08 like an L5 self-driving car or maybe a robot that

02:02:12 can pass the coffee cup test, which

02:02:16 is the notion that you’d be given a random kitchen

02:02:21 somewhere in the country.

02:02:22 And you would have to go make a cup of coffee in that kitchen.

02:02:28 So flexibility would be the ability

02:02:30 to deal with unknown unknowns, so things that could not,

02:02:35 dimensions of variability that could not

02:02:37 have possibly been foreseen by the creators of the system

02:02:41 within one specific task.

02:02:42 So generalizing to the long tail of situations in self driving,

02:02:47 for instance, would be flexibility.

02:02:48 So you have robustness, flexibility, and finally,

02:02:51 you would have extreme generalization,

02:02:53 which is basically flexibility, but instead

02:02:57 of just considering one specific domain,

02:03:01 like driving or domestic robotics,

02:03:03 you’re considering an open ended range of possible domains.

02:03:07 So a robot would be capable of extreme generalization

02:03:12 if, let’s say, it’s designed and trained for cooking,

02:03:18 for instance.

02:03:19 And if I buy the robot and if it’s

02:03:24 able to teach itself gardening in a couple of weeks,

02:03:28 it would be capable of extreme generalization, for instance.

02:03:32 So the ultimate goal is extreme generalization.

02:03:34 Yes.

02:03:34 So creating a system that is so general that it could

02:03:40 essentially achieve human skill parity over arbitrary tasks

02:03:46 and arbitrary domains with the same level of improvisation

02:03:50 and adaptation power as humans when

02:03:53 it encounters new situations.

02:03:55 And it would do so over basically the same range

02:03:59 of possible domains and tasks as humans

02:04:02 and using essentially the same amount of training

02:04:05 experience or practice as humans would require.

02:04:07 That would be human level extreme generalization.

02:04:10 So I don’t actually think humans are anywhere

02:04:14 near the optimal intelligence bounds

02:04:19 if there is such a thing.

02:04:21 So you think, for humans or in general?

02:04:23 In general.

02:04:25 I think it’s quite likely that there

02:04:26 is a hard limit to how intelligent any system can be.

02:04:33 But at the same time, I don’t think humans are anywhere

02:04:35 near that limit.

02:04:39 Yeah, last time I think we talked,

02:04:40 I think you had this idea that we’re only

02:04:43 as intelligent as the problems we face.

02:04:46 Sort of we are bounded by the problems.

02:04:51 In a way, yes.

02:04:51 We are bounded by our environments,

02:04:55 and we are bounded by the problems we try to solve.

02:04:58 Yeah.

02:04:59 Yeah.

02:04:59 What do you make of Neuralink and outsourcing

02:05:03 some of the brain power, like brain computer interfaces?

02:05:07 Do you think we can expand or augment our intelligence?

02:05:13 I am fairly skeptical of neural interfaces

02:05:18 because they are trying to fix one specific bottleneck

02:05:23 in human machine cognition, which

02:05:26 is the bandwidth bottleneck, input and output

02:05:29 of information in the brain.

02:05:31 And my perception of the problem is that bandwidth is not

02:05:37 at this time a bottleneck at all.

02:05:41 Meaning that we already have sensors

02:05:43 that enable us to take in far more information than what

02:05:48 we can actually process.

02:05:50 Well, to push back on that a little bit,

02:05:53 to sort of play devil’s advocate a little bit,

02:05:55 is if you look at the internet, Wikipedia, let’s say Wikipedia,

02:05:58 I would say that humans, after the advent of Wikipedia,

02:06:03 are much more intelligent.

02:06:05 Yes, I think that’s a good one.

02:06:07 But that’s also not about, that’s about externalizing

02:06:14 our intelligence via information processing systems,

02:06:18 external information processing systems,

02:06:19 which is very different from brain computer interfaces.

02:06:23 Right, but the question is whether if we have direct

02:06:27 access, if our brain has direct access to Wikipedia without

02:06:31 Your brain already has direct access to Wikipedia.

02:06:34 It’s on your phone.

02:06:35 And you have your hands and your eyes and your ears

02:06:39 and so on to access that information.

02:06:42 And the speed at which you can access it

02:06:44 Is bottlenecked by the cognition.

02:06:45 I think it’s already close, fairly close to optimal,

02:06:49 which is why speed reading, for instance, does not work.

02:06:53 The faster you read, the less you understand.

02:06:55 But maybe it’s because it uses the eyes.

02:06:58 So maybe.

02:07:00 So I don’t believe so.

02:07:01 I think the brain is very slow.

02:07:04 It typically operates, you know, the fastest things

02:07:07 that happen in the brain are at the level of 50 milliseconds.

02:07:11 Forming a conscious thought can potentially

02:07:14 take entire seconds, right?

02:07:16 And you can already read pretty fast.

02:07:19 So I think the speed at which you can take information in

02:07:23 and even the speed at which you can output information

02:07:26 can only be very incrementally improved.

02:07:29 Maybe there’s a question.

02:07:31 If you’re a very fast typist, if you’re a very trained typist,

02:07:34 the speed at which you can express your thoughts

02:07:36 is already the speed at which you can form your thoughts.

02:07:40 Right, so that’s kind of an idea that there are

02:07:44 fundamental bottlenecks to the human mind.

02:07:47 But it’s possible that everything we have

02:07:50 in the human mind is just to be able to survive

02:07:53 in the environment.

02:07:54 And there’s a lot more to expand.

02:07:58 Maybe, you know, you said the speed of the thought.

02:08:02 So I think augmenting human intelligence

02:08:06 is a very valid and very powerful avenue, right?

02:08:09 And that’s what computers are about.

02:08:12 In fact, that’s what all of culture and civilization

02:08:15 is about.

02:08:16 Our culture is externalized cognition

02:08:20 and we rely on culture to think constantly.

02:08:23 Yeah, I mean, that’s another, yeah.

02:08:26 Not just computers, not just phones and the internet.

02:08:29 I mean, all of culture, like language, for instance,

02:08:32 is a form of externalized cognition.

02:08:34 Books are obviously externalized cognition.

02:08:37 Yeah, that’s a good point.

02:08:38 And you can scale that externalized cognition

02:08:42 far beyond the capability of the human brain.

02:08:45 And you could see civilization itself

02:08:48 as having capabilities that are far beyond any individual brain,

02:08:54 and it will keep scaling because it’s not

02:08:55 bound by individual brains.

02:08:59 It’s a different kind of system.

02:09:01 Yeah, and that system includes nonhumans.

02:09:06 First of all, it includes all the other biological systems,

02:09:08 which are probably contributing to the overall intelligence

02:09:11 of the organism.

02:09:12 And then computers are part of it.

02:09:14 Nonhuman systems are probably not contributing much,

02:09:16 but AIs are definitely contributing to that.

02:09:19 Like Google search, for instance, is a big part of it.

02:09:24 Yeah, yeah, a huge part, a part that we probably can’t

02:09:29 introspect.

02:09:31 Like how the world has changed in the past 20 years,

02:09:33 it’s probably very difficult for us

02:09:35 to be able to understand until, of course,

02:09:38 whoever created the simulation we’re in is probably

02:09:41 doing metrics, measuring the progress.

02:09:44 There was probably a big spike in performance.

02:09:48 They’re enjoying this.

02:09:51 So what are your thoughts on the Turing test

02:09:56 and the Loebner Prize, which is one

02:10:00 of the most famous attempts at the test of artificial

02:10:05 intelligence by doing a natural language open dialogue test

02:10:11 that’s judged by humans as far as how well the machine did?

02:10:18 So I’m not a fan of the Turing test

02:10:21 itself or any of its variants, for two reasons.

02:10:25 So first of all, it’s really copping out

02:10:34 of trying to define and measure intelligence

02:10:37 because it’s entirely outsourcing that

02:10:40 to a panel of human judges.

02:10:43 And these human judges, they may not themselves

02:10:47 have any proper methodology.

02:10:49 They may not themselves have any proper definition

02:10:52 of intelligence.

02:10:53 They may not be reliable.

02:10:54 So the Turing test is already failing

02:10:57 one of the core psychometric principles, which

02:10:59 is reliability because you have biased human judges.

02:11:04 It’s also violating the standardization requirement

02:11:07 and the freedom from bias requirement.

02:11:10 And so it’s really a cop out because you are outsourcing

02:11:13 everything that matters, which is precisely describing

02:11:17 intelligence and finding a standalone test to measure it.

02:11:22 You’re outsourcing everything to people.

02:11:25 So it’s really a cop out.

02:11:26 And by the way, we should keep in mind

02:11:28 that when Turing proposed the imitation game,

02:11:33 he did not mean for the imitation game

02:11:36 to be an actual goal for the field of AI

02:11:40 or an actual test of intelligence.

02:11:42 He was using the imitation game as a thought experiment

02:11:48 in a philosophical discussion in his 1950 paper.

02:11:53 He was trying to argue that theoretically, it

02:11:58 should be possible for something very much like the human mind,

02:12:04 indistinguishable from the human mind,

02:12:06 to be encoded in a Turing machine.

02:12:08 And at the time, that was a very daring idea.

02:12:14 It was stretching credulity.

02:12:16 But nowadays, I think it’s fairly well accepted

02:12:20 that the mind is an information processing system

02:12:22 and that you could probably encode it into a computer.

02:12:25 So another reason why I’m not a fan of this type of test

02:12:29 is that the incentives that it creates

02:12:34 are incentives that are not conducive to proper scientific

02:12:39 research.

02:12:40 If your goal is to trick, to convince a panel of human

02:12:45 judges that they are talking to a human,

02:12:48 then you have an incentive to rely on tricks

02:12:53 and prestidigitation.

02:12:56 In the same way that, let’s say, you’re doing physics

02:12:59 and you want to solve teleportation.

02:13:01 And what if the test that you set out to pass

02:13:04 is you need to convince a panel of judges

02:13:07 that teleportation took place?

02:13:09 And they’re just sitting there and watching what you’re doing.

02:13:12 And that is something that David

02:13:17 Copperfield could achieve in his show in Vegas.

02:13:22 And what he’s doing is very elaborate.

02:13:25 But it’s not physics.

02:13:29 It’s not making any progress in our understanding

02:13:31 of the universe.

02:13:32 To push back on that: it’s possible.

02:13:34 That’s the hope with these kinds of subjective evaluations,

02:13:39 that it’s easier to solve it generally

02:13:41 than it is to come up with tricks that convince

02:13:45 a large number of judges.

02:13:46 That’s the hope.

02:13:47 In practice, it turns out that it’s

02:13:49 very easy to deceive people in the same way

02:13:51 that you can do magic in Vegas.

02:13:54 You can actually very easily convince people

02:13:57 that they’re talking to a human when they’re actually

02:13:59 talking to an algorithm.

02:14:00 I just disagree.

02:14:01 I disagree with that.

02:14:02 I think it’s easy.

02:14:03 I would push back.

02:14:05 No, it’s not easy.

02:14:07 It’s doable.

02:14:08 It’s very easy because we are biased.

02:14:12 We have theory of mind.

02:14:13 We are constantly projecting emotions, intentions, agentness.

02:14:21 Agentness is one of our core innate priors.

02:14:24 We are projecting these things on everything around us.

02:14:26 Like if you paint a smiley on a rock,

02:14:31 the rock becomes happy in our eyes.

02:14:33 And because we have this extreme bias that

02:14:36 permeates everything we see around us,

02:14:39 it’s actually pretty easy to trick people.

02:14:41 I just disagree with that.

02:14:44 I so totally disagree with that.

02:14:45 You brilliantly put it: a huge part is the anthropomorphization

02:14:50 that we naturally do, the agentness, that word.

02:14:53 Is that a real word?

02:14:53 No, it’s not a real word.

02:14:55 I like it.

02:14:56 But it’s a useful word.

02:14:57 It’s a useful word.

02:14:58 Let’s make it real.

02:14:59 It’s a huge help.

02:15:01 But I still think it’s really difficult to convince.

02:15:04 If you do like the Alexa Prize formulation,

02:15:07 where you talk for an hour, there’s

02:15:10 formulations of the test you can create,

02:15:12 where it’s very difficult.

02:15:13 So I like the Alexa Prize better because it’s more pragmatic.

02:15:18 It’s more practical.

02:15:19 It’s actually incentivizing developers

02:15:22 to create something that’s useful as a human machine

02:15:27 interface.

02:15:29 So that’s slightly better than just the imitation.

02:15:31 So I like it.

02:15:34 Your idea is for a test which hopefully

02:15:36 helps us in creating intelligent systems as a result.

02:15:39 Like if you create a system that passes it,

02:15:41 it’ll be useful for creating further intelligent systems.

02:15:44 Yes, at least.

02:15:46 Yeah.

02:15:47 Just to kind of comment, I’m a little bit surprised

02:15:51 how little inspiration people draw from the Turing test

02:15:55 today.

02:15:57 The media and the popular press might write about it

02:15:59 every once in a while.

02:16:00 The philosophers might talk about it.

02:16:03 But most engineers are not really inspired by it.

02:16:07 And I know you don’t like the Turing test,

02:16:11 but we’ll have this argument another time.

02:16:15 There’s something inspiring about it, I think.

02:16:18 As a philosophical device in a philosophical discussion,

02:16:21 I think there is something very interesting about it.

02:16:23 I don’t think it is, in practical terms.

02:16:26 I don’t think it’s conducive to progress.

02:16:29 And one of the reasons why is that I

02:16:32 think being very human like, being

02:16:35 indistinguishable from a human is actually

02:16:37 the very last step in the creation of machine

02:16:40 intelligence.

02:16:41 The first AIs that will show strong generalization,

02:16:46 that will actually implement human-like broad cognitive

02:16:52 abilities, they will not actually behave or look

02:16:54 anything like humans.

02:16:58 Human likeness is the very last step in that process.

02:17:01 And so a good test is a test that

02:17:03 points you towards the first step on the ladder,

02:17:07 not towards the top of the ladder.

02:17:08 So to push back on that, I usually

02:17:11 agree with you on most things.

02:17:13 I remember you, I think at some point,

02:17:15 tweeting something about the Turing test

02:17:17 being counterproductive

02:17:19 or something like that.

02:17:20 And I think a lot of very smart people agree with that.

02:17:23 I, computationally speaking, a not very smart person,

02:17:31 disagree with that.

02:17:32 Because I think there’s some magic

02:17:33 to the interactivity with other humans.

02:17:36 So to play devil’s advocate on your statement,

02:17:39 it’s possible that in order to demonstrate

02:17:42 the generalization abilities of a system,

02:17:45 you have to show your ability, in conversation,

02:17:49 show your ability to adjust, adapt to the conversation

02:17:55 through not just like as a standalone system,

02:17:58 but through the process of like the interaction,

02:18:01 the game-theoretic aspect, where you really

02:18:05 are changing the environment by your actions.

02:18:09 So in the ARC challenge, for example,

02:18:11 you’re an observer.

02:18:12 You can’t scare the test into changing.

02:18:17 You can’t talk to the test.

02:18:19 You can’t play with it.

02:18:21 So there’s some aspect of that interactivity

02:18:24 that becomes highly subjective, but it

02:18:26 feels like it could be conducive to generalizability.

02:18:29 I think you make a great point.

02:18:31 The interactivity is a very good setting

02:18:33 to force a system to show adaptation,

02:18:36 to show generalization.

02:18:39 That said, at the same time, it’s

02:18:42 not something very scalable, because you

02:18:44 rely on human judges.

02:18:46 It’s not something reliable, because the human judges may

02:18:48 not, may not.

02:18:49 So you don’t like human judges.

02:18:50 Basically, yes.

02:18:51 And I think so.

02:18:52 I love the idea of interactivity.

02:18:56 I initially wanted an ARC test that

02:18:59 had some amount of interactivity where your score on a task

02:19:02 would not be 1 or 0, whether you can solve it or not,

02:19:05 but would be the number of attempts

02:19:11 that you need to make before you hit the right solution, which

02:19:14 means that now you can start applying

02:19:16 the scientific method as you solve ARC tasks,

02:19:19 that you can start formulating hypotheses and probing

02:19:23 the system to see whether the observation will

02:19:27 match the hypothesis or not.
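
A minimal sketch of the attempts-based scoring described here: the solver proposes one candidate output grid at a time, sees only that each guess was wrong, and its score on the task is the number of guesses it needed. The solver interface and the cap on attempts are assumptions for illustration, not part of the actual ARC protocol.

```python
# Sketch of attempts-based scoring: fewer attempts = a more efficient
# reduction of uncertainty about the task's hidden rule.
# `solver` is a hypothetical callable that proposes an output grid given the
# demonstration pairs, the test input, and its own previous (wrong) guesses.

def score_task(task, solver, max_attempts=10):
    """Return the number of attempts used, or None if the task was not solved."""
    target = task["test"][0]["output"]   # kept hidden from the solver
    previous_guesses = []
    for attempt in range(1, max_attempts + 1):
        guess = solver(task["train"], task["test"][0]["input"], previous_guesses)
        if guess == target:
            return attempt               # lower is better
        previous_guesses.append(guess)   # the solver can revise its hypothesis
    return None
```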

02:19:28 It would be amazing if you could also,

02:19:30 even higher level than that, measure the quality of your attempts,

02:19:35 which, of course, is impossible.

02:19:36 But again, that gets subjective.

02:19:38 How good was your thinking?

02:19:41 How efficient was it?

02:19:43 So one thing that’s interesting about this notion of scoring you

02:19:48 as how many attempts you need is that you

02:19:50 can start producing tasks that are way more ambiguous, right?

02:19:55 Right.

02:19:56 Because with the different attempts,

02:19:59 you can actually probe that ambiguity, right?

02:20:03 Right.

02:20:04 So that’s, in a sense, measuring how well

02:20:08 you can adapt to the uncertainty and reduce the uncertainty?

02:20:15 Yes, it’s how fast.

02:20:18 It’s the efficiency with which you reduce uncertainty

02:20:21 in program space, exactly.

02:20:22 Very difficult to come up with that kind of test, though.

02:20:24 Yeah, so I would love to be able to create something like this.

02:20:28 In practice, it would be very, very difficult, but yes.

02:20:33 I mean, what you’re doing, what you’ve done with the ARC challenge

02:20:36 is brilliant.

02:20:37 I’m also not surprised that it’s not more popular,

02:20:40 but I think it’s picking up.

02:20:42 It has its niche.

02:20:42 It has its niche, yeah.

02:20:44 Yeah.

02:20:44 What are your thoughts about another test?

02:20:47 I talked with Marcus Hutter.

02:20:48 He has the Hutter Prize for compression of human knowledge.

02:20:51 And the idea is really to sort of quantify and reduce

02:20:55 the test of intelligence purely to just the ability

02:20:58 to compress.

02:20:59 What’s your thoughts about this intelligence as compression?

02:21:04 I mean, it’s a very fun test because it’s

02:21:07 such a simple idea, like you’re given Wikipedia,

02:21:12 basic English Wikipedia, and you must compress it.
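
As a toy illustration of that metric, the sketch below measures how much a general-purpose compressor shrinks a text file. The actual Hutter Prize is much stricter: it targets a specific Wikipedia extract (enwik8, later enwik9) and counts the size of the decompressor itself in the score. The file name here is hypothetical.

```python
import lzma

# Hypothetical file standing in for the Wikipedia extract used by the prize.
with open("enwik_sample.txt", "rb") as f:
    data = f.read()

compressed = lzma.compress(data, preset=9)

ratio = len(compressed) / len(data)
print(f"original:   {len(data):>10} bytes")
print(f"compressed: {len(compressed):>10} bytes")
print(f"ratio:      {ratio:.3f}  (lower = better compression)")

# The premise of the prize: a better model of the text, and of the knowledge
# it encodes, yields a smaller compressed size.
```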

02:21:15 And so it stems from the idea that cognition is compression,

02:21:21 that the brain is basically a compression algorithm.

02:21:24 This is a very old idea.

02:21:25 It’s a very, I think, striking and beautiful idea.

02:21:30 I used to believe it.

02:21:32 I eventually had to realize that it was very much

02:21:36 a flawed idea.

02:21:36 So I no longer believe that cognition is compression.

02:21:41 But I can tell you what’s the difference.

02:21:44 So it’s very easy to believe that cognition and compression

02:21:48 are the same thing.

02:21:51 So Jeff Hawkins, for instance, says

02:21:53 that cognition is prediction.

02:21:54 And of course, prediction is basically the same thing

02:21:57 as compression.

02:21:58 It’s just including the temporal axis.

02:22:03 And it’s very easy to believe this

02:22:05 because compression is something that we

02:22:06 do all the time very naturally.

02:22:09 We are constantly compressing information.

02:22:12 We are constantly trying.

02:22:15 We have this bias towards simplicity.

02:22:17 We are constantly trying to organize things in our mind

02:22:21 and around us to be more regular.

02:22:24 So it’s a beautiful idea.

02:22:26 It’s very easy to believe.

02:22:28 There is a big difference between what

02:22:31 we do with our brains and compression.

02:22:33 So compression is actually kind of a tool

02:22:38 in the human cognitive toolkit that is used in many ways.

02:22:42 But it’s just a tool.

02:22:44 It is a tool for cognition.

02:22:45 It is not cognition itself.

02:22:47 And the big fundamental difference

02:22:50 is that cognition is about being able to operate

02:22:55 in future situations that include fundamental uncertainty

02:23:00 and novelty.

02:23:02 So for instance, consider a child at age 10.

02:23:06 And so they have 10 years of life experience.

02:23:10 They’ve gotten pain, pleasure, rewards, and punishment

02:23:14 in a period of time.

02:23:16 If you were to generate the shortest behavioral program

02:23:21 that would have basically run that child over these 10 years

02:23:26 in an optimal way, the shortest optimal behavioral program

02:23:32 given the experience of that child so far,

02:23:34 well, that program, that compressed program,

02:23:37 this is what you would get if the mind of the child

02:23:39 was a compression algorithm essentially,

02:23:42 would be utterly unable, inappropriate,

02:23:48 to process the next 70 years in the life of that child.

02:23:54 So in the models we build of the world,

02:23:59 we are not trying to make them actually optimally compressed.

02:24:03 We are using compression as a tool

02:24:06 to promote simplicity and efficiency in our models.

02:24:10 But they are not perfectly compressed

02:24:12 because they need to include things

02:24:15 that are seemingly useless today, that have seemingly

02:24:18 been useless so far.

02:24:20 But that may turn out to be useful in the future

02:24:24 because you just don’t know the future.

02:24:25 And that’s the fundamental principle

02:24:28 that cognition, that intelligence arises from

02:24:31 is that you need to be able to run

02:24:33 appropriate behavioral programs except you have absolutely

02:24:36 no idea what sort of context, environment, situation

02:24:40 they are going to be running in.

02:24:42 And you have to deal with that uncertainty,

02:24:45 with that future novelty.

02:24:46 So an analogy that you can make is with investing,

02:24:52 for instance.

02:24:54 If I look at the past 20 years of stock market data,

02:24:59 and I use a compression algorithm

02:25:01 to figure out the best trading strategy,

02:25:04 it’s going to be you buy Apple stock, then

02:25:06 maybe the past few years you buy Tesla stock or something.

02:25:10 But is that strategy still going to be

02:25:13 true for the next 20 years?

02:25:14 Well, actually, probably not, which

02:25:17 is why if you’re a smart investor,

02:25:21 you’re not just going to be following the strategy that

02:25:26 corresponds to compression of the past.

02:25:28 You’re going to be following, you’re

02:25:31 going to have a balanced portfolio, right?

02:25:34 Because you just don’t know what’s going to happen.

02:25:38 I mean, I guess in that same sense,

02:25:40 the compression is analogous to what

02:25:42 you talked about, which is local or robust generalization

02:25:45 versus extreme generalization.

02:25:47 It’s much closer to that side of being able to generalize

02:25:52 in the local sense.

02:25:53 That’s why as humans, when we are children, in our education,

02:25:59 so a lot of it is driven by play, driven by curiosity.

02:26:04 We are not efficiently compressing things.

02:26:07 We’re actually exploring.

02:26:09 We are retaining all kinds of things

02:26:16 from our environment that seem to be completely useless.

02:26:19 Because they might turn out to be eventually useful, right?

02:26:24 And that’s what cognition is really about.

02:26:26 And what makes it antagonistic to compression

02:26:29 is that it is about hedging for future uncertainty.

02:26:33 And that’s antagonistic to compression.

02:26:35 Yes.

02:26:36 Efficiently hedging.

02:26:38 Cognition leverages compression as a tool

02:26:41 to promote efficiency and simplicity in our models.

02:26:47 It’s like Einstein said, make it simpler, but not,

02:26:52 however that quote goes, but not too simple.

02:26:54 So compression simplifies things,

02:26:57 but you don’t want to make it too simple.

02:27:00 Yes.

02:27:00 So a good model of the world is going

02:27:03 to include all kinds of things that are completely useless,

02:27:06 actually, just in case.

02:27:08 Because you need diversity, in the same way

02:27:10 that in your portfolio

02:27:11 you need all kinds of stocks that may not

02:27:13 have performed well so far, but you need diversity.

02:27:15 And the reason you need diversity

02:27:16 is because fundamentally you don’t know what you’re doing.

02:27:19 And the same is true of the human mind,

02:27:22 is that it needs to behave appropriately in the future.

02:27:26 And it has no idea what the future is going to be like.

02:27:29 But it’s not going to be like the past.

02:27:31 So compressing the past is not appropriate,

02:27:33 because the past is not, it’s not predictive of the future.

02:27:40 Yeah, history repeats itself, but not perfectly.

02:27:44 I don’t think I asked you last time the most inappropriately

02:27:48 absurd question.

02:27:51 We’ve talked a lot about intelligence,

02:27:54 but the bigger question beyond intelligence is that of meaning.

02:28:00 Intelligent systems are kind of goal oriented.

02:28:02 They’re always optimizing for a goal.

02:28:05 If you look at the Hutter Prize, actually,

02:28:07 I mean, there’s always a clean formulation of a goal.

02:28:10 But the natural question for us humans,

02:28:14 since we don’t know our objective function,

02:28:16 is what is the meaning of it all?

02:28:18 So the absurd question is, what, Francois,

02:28:22 do you think is the meaning of life?

02:28:25 What’s the meaning of life?

02:28:26 Yeah, that’s a big question.

02:28:28 And I think I can give you my answer, at least one

02:28:33 of my answers.

02:28:34 And so one thing that’s very important in understanding who

02:28:42 we are is that everything that makes up ourselves,

02:28:48 that makes up who we are, even your most personal thoughts,

02:28:53 is not actually your own.

02:28:55 Even your most personal thoughts are expressed in words

02:29:00 that you did not invent and are built on concepts and images

02:29:04 that you did not invent.

02:29:06 We are very much cultural beings.

02:29:10 We are made of culture.

02:29:12 That’s what makes us different from animals, for instance.

02:29:16 So everything about ourselves is an echo of the past,

02:29:22 an echo of people who lived before us.

02:29:29 That’s who we are.

02:29:31 And in the same way, if we manage

02:29:35 to contribute something to the collective edifice of culture,

02:29:41 a new idea, maybe a beautiful piece of music,

02:29:44 a work of art, a grand theory, a new word, maybe,

02:29:51 that something is going to become

02:29:54 a part of the minds of future humans, essentially, forever.

02:30:00 So everything we do creates ripples

02:30:03 that propagate into the future.

02:30:06 And in a way, this is our path to immortality,

02:30:11 is that as we contribute things to culture,

02:30:17 culture in turn becomes future humans.

02:30:21 And we keep influencing people thousands of years from now.

02:30:27 So our actions today create ripples.

02:30:30 And these ripples, I think, basically

02:30:35 sum up the meaning of life.

02:30:37 In the same way that we are the sum

02:30:42 of the interactions between many different ripples that

02:30:45 came from our past, we are ourselves

02:30:48 creating ripples that will propagate into the future.

02:30:50 And that’s why we should be, this

02:30:53 seems like perhaps a naive thing to say,

02:30:56 but we should be kind to others during our time on Earth

02:31:02 because every act of kindness creates ripples.

02:31:05 And in reverse, every act of violence also creates ripples.

02:31:09 And you want to carefully choose which kind of ripples

02:31:13 you want to create, and you want to propagate into the future.

02:31:16 And in your case, first of all, beautifully put,

02:31:19 but in your case, creating ripples

02:31:21 into future humans and future AGI systems.

02:31:27 Yes.

02:31:28 It’s fascinating.

02:31:29 Our successors.

02:31:30 I don’t think there’s a better way to end it,

02:31:34 Francois, as always, for a second time.

02:31:37 And I’m sure many times in the future,

02:31:39 it’s been a huge honor.

02:31:40 You’re one of the most brilliant people

02:31:43 in the machine learning, computer science world.

02:31:47 Again, it’s a huge honor.

02:31:48 Thanks for talking to me.

02:31:49 It’s been a pleasure.

02:31:50 Thanks a lot for having me.

02:31:51 We appreciate it.

02:31:53 Thanks for listening to this conversation with Francois

02:31:56 Chollet, and thank you to our sponsors, Babbel, Masterclass,

02:32:00 and Cash App.

02:32:01 Click the sponsor links in the description

02:32:03 to get a discount and to support this podcast.

02:32:06 If you enjoy this thing, subscribe on YouTube,

02:32:09 review it with five stars on Apple Podcast,

02:32:11 follow on Spotify, support on Patreon,

02:32:14 or connect with me on Twitter at Lex Friedman.

02:32:17 And now let me leave you with some words

02:32:19 from René Descartes in 1637, an excerpt of which Francois

02:32:24 includes in his On the Measure of Intelligence paper.

02:32:27 If there were machines which bore a resemblance

02:32:30 to our bodies and imitated our actions as closely as possible

02:32:34 for all practical purposes, we should still

02:32:36 have two very certain means of recognizing

02:32:40 that they were not real men.

02:32:42 The first is that they could never use words or put together

02:32:45 signs, as we do in order to declare our thoughts to others.

02:32:49 For we can certainly conceive of a machine so constructed

02:32:53 that it utters words and even utters

02:32:55 words that correspond to bodily actions causing

02:32:57 a change in its organs.

02:32:59 But it is not conceivable that such a machine should produce

02:33:03 different arrangements of words so as

02:33:05 to give an appropriately meaningful answer to whatever

02:33:08 is said in its presence as the dullest of men can do.

02:33:12 Here, Descartes is anticipating the Turing test,

02:33:15 and the argument still continues to this day.

02:33:18 Secondly, he continues, even though some machines might

02:33:22 do some things as well as we do them, or perhaps even better,

02:33:26 they would inevitably fail in others,

02:33:29 which would reveal that they are acting not from understanding

02:33:32 but only from the disposition of their organs.

02:33:36 This is an incredible quote.

02:33:39 Whereas reason is a universal instrument

02:33:43 which can be used in all kinds of situations,

02:33:46 these organs need some particular disposition for each particular action.

02:33:49 Hence, it is for all practical purposes

02:33:51 impossible for a machine to have enough different organs

02:33:54 to make it act in all the contingencies of life

02:33:57 in the way in which our reason makes us act.

02:34:01 That’s the debate between mimicry and memorization

02:34:05 versus understanding.

02:34:07 So thank you for listening and hope to see you next time.