Transcript
00:00:00 The following is a conversation with Francois Chollet,
00:00:03 his second time on the podcast.
00:00:05 He’s both a world class engineer and a philosopher
00:00:09 in the realm of deep learning and artificial intelligence.
00:00:13 This time, we talk a lot about his paper titled
00:00:16 On the Measure of Intelligence, which discusses
00:00:19 how we might define and measure general intelligence
00:00:22 in our computing machinery.
00:00:24 Quick summary of the sponsors,
00:00:26 Babbel, Masterclass, and Cash App.
00:00:29 Click the sponsor links in the description
00:00:31 to get a discount and to support this podcast.
00:00:34 As a side note, let me say that the serious,
00:00:36 rigorous scientific study
00:00:38 of artificial general intelligence is a rare thing.
00:00:42 The mainstream machine learning community works
00:00:44 on very narrow AI with very narrow benchmarks.
00:00:47 This is very good for incremental
00:00:49 and sometimes big incremental progress.
00:00:53 On the other hand, the outside the mainstream,
00:00:56 renegade, you could say, AGI community works
00:01:00 on approaches that verge on the philosophical
00:01:03 and even the literary without big public benchmarks.
00:01:07 Walking the line between the two worlds is a rare breed,
00:01:10 but it doesn’t have to be.
00:01:12 I ran the AGI series at MIT as an attempt
00:01:15 to inspire more people to walk this line.
00:01:17 DeepMind and OpenAI for a time
00:01:20 and still on occasion walk this line.
00:01:23 Francois Chollet does as well.
00:01:25 I hope to also.
00:01:27 It’s a beautiful dream to work towards
00:01:29 and to make real one day.
00:01:32 If you enjoy this thing, subscribe on YouTube,
00:01:34 review it with five stars on Apple Podcast,
00:01:36 follow on Spotify, support on Patreon,
00:01:39 or connect with me on Twitter at Lex Fridman.
00:01:42 As usual, I’ll do a few minutes of ads now
00:01:44 and no ads in the middle.
00:01:45 I try to make these interesting,
00:01:47 but I give you timestamps so you can skip.
00:01:50 But still, please do check out the sponsors
00:01:52 by clicking the links in the description.
00:01:54 It’s the best way to support this podcast.
00:01:57 This show is sponsored by Babbel,
00:02:00 an app and website that gets you speaking
00:02:02 in a new language within weeks.
00:02:04 Go to babbel.com and use code Lex to get three months free.
00:02:08 They offer 14 languages, including Spanish, French,
00:02:11 Italian, German, and yes, Russian.
00:02:15 Daily lessons are 10 to 15 minutes,
00:02:17 super easy, effective,
00:02:19 designed by over 100 language experts.
00:02:22 Let me read a few lines from the Russian poem
00:02:24 Noch, ulitsa, fonar, apteka, by Alexander Blok,
00:02:29 that you’ll start to understand if you sign up to Babbel.
00:02:32 Noch, ulitsa, fonar, apteka,
00:02:35 Bessmyslenny i tuskly svet.
00:02:38 Zhivi eshche khot chetvert veka,
00:02:41 Vse budet tak. Iskhoda net.
00:02:44 Now, I say that you’ll start to understand this poem
00:02:48 because Russian starts with a language
00:02:51 and ends with vodka.
00:02:54 Now, the latter part is definitely not endorsed
00:02:56 or provided by Babbel.
00:02:58 It will probably lose me this sponsorship,
00:03:00 although it hasn’t yet.
00:03:02 But once you graduate with Babbel,
00:03:04 you can enroll in my advanced course
00:03:06 of late night Russian conversation over vodka.
00:03:09 No app for that yet.
00:03:11 So get started by visiting babbel.com
00:03:13 and use code Lex to get three months free.
00:03:18 This show is also sponsored by Masterclass.
00:03:20 Sign up at masterclass.com slash Lex
00:03:23 to get a discount and to support this podcast.
00:03:26 When I first heard about Masterclass,
00:03:28 I thought it was too good to be true.
00:03:29 I still think it’s too good to be true.
00:03:32 For $180 a year, you get an all access pass
00:03:35 to watch courses from, to list some of my favorites.
00:03:38 Chris Hadfield on space exploration,
00:03:41 hope to have him on this podcast one day.
00:03:43 Neil deGrasse Tyson on scientific thinking and communication,
00:03:46 Neil too.
00:03:47 Will Wright, creator of SimCity and The Sims,
00:03:50 on game design, Carlos Santana on guitar,
00:03:52 Garry Kasparov on chess, Daniel Negreanu on poker,
00:03:55 and many more.
00:03:57 Chris Hadfield explaining how rockets work
00:03:59 and the experience of being launched into space
00:04:01 alone is worth the money.
00:04:03 By the way, you can watch it on basically any device.
00:04:06 Once again, sign up at masterclass.com slash Lex
00:04:09 to get a discount and to support this podcast.
00:04:13 This show finally is presented by Cash App,
00:04:16 the number one finance app in the App Store.
00:04:18 When you get it, use code LexPodcast.
00:04:21 Cash App lets you send money to friends,
00:04:23 buy Bitcoin, and invest in the stock market
00:04:25 with as little as $1.
00:04:27 Since Cash App allows you to send
00:04:28 and receive money digitally,
00:04:30 let me mention a surprising fact related to physical money.
00:04:33 Of all the currency in the world,
00:04:35 roughly 8% of it is actually physical money.
00:04:39 The other 92% of the money only exists digitally,
00:04:42 and that’s only going to increase.
00:04:45 So again, if you get Cash App from the App Store
00:04:47 or Google Play and use code LexPodcast,
00:04:50 you get 10 bucks,
00:04:51 and Cash App will also donate $10 to FIRST,
00:04:54 an organization that is helping to advance robotics
00:04:57 and STEM education for young people around the world.
00:05:00 And now here’s my conversation with Francois Chollet.
00:05:05 What philosophers, thinkers, or ideas
00:05:07 had a big impact on you growing up and today?
00:05:10 So one author that had a big impact on me
00:05:14 when I read his books as a teenager was Jean Piaget,
00:05:18 who is a Swiss psychologist,
00:05:21 and is considered to be the father of developmental psychology.
00:05:25 And he has a large body of work about
00:05:28 basically how intelligence develops in children.
00:05:33 And so it’s very old work,
00:05:35 like most of it is from the 1930s, 1940s.
00:05:39 So it’s not quite up to date.
00:05:40 It’s actually superseded by many newer developments
00:05:43 in developmental psychology.
00:05:45 But to me, it was very interesting, very striking,
00:05:49 and actually shaped the early ways
00:05:51 in which I started thinking about the mind
00:05:53 and the development of intelligence as a teenager.
00:05:56 His actual ideas or the way he thought about it
00:05:58 or just the fact that you could think
00:05:59 about the developing mind at all?
00:06:01 I guess both.
00:06:02 Jean Piaget is the author that really introduced me
00:06:04 to the notion that intelligence and the mind
00:06:07 is something that you construct throughout your life
00:06:11 and that children construct it in stages.
00:06:15 And I thought that was a very interesting idea,
00:06:17 which is, of course, very relevant to AI,
00:06:20 to building artificial minds.
00:06:23 Another book that I read around the same time
00:06:25 that had a big impact on me,
00:06:28 and there was actually a little bit of overlap
00:06:32 with Jean Piaget as well,
00:06:32 and I read it around the same time,
00:06:35 is Jeff Hawkins’ On Intelligence, which is a classic.
00:06:39 And he has this vision of the mind
00:06:42 as a multi scale hierarchy of temporal prediction modules.
00:06:47 And these ideas really resonated with me,
00:06:50 like the notion of a modular hierarchy
00:06:55 of potentially compression functions
00:07:00 or prediction functions.
00:07:01 I thought it was really, really interesting,
00:07:03 and it shaped the way I started thinking
00:07:07 about how to build minds.
00:07:09 The hierarchical nature, which aspect?
00:07:13 Also, he’s a neuroscientist, so he was thinking actual,
00:07:17 he was basically talking about how our mind works.
00:07:20 Yeah, the notion that cognition is prediction
00:07:23 was an idea that was kind of new to me at the time
00:07:25 and that I really loved at the time.
00:07:27 And yeah, and the notion that there are multiple scales
00:07:31 of processing in the brain.
00:07:35 The hierarchy.
00:07:36 Yes.
00:07:37 This was before deep learning.
00:07:38 These ideas of hierarchies in AI
00:07:41 have been around for a long time,
00:07:43 even before on intelligence.
00:07:45 They’ve been around since the 1980s.
00:07:48 And yeah, that was before deep learning.
00:07:50 But of course, I think these ideas really found
00:07:53 their practical implementation in deep learning.
00:07:58 What about the memory side of things?
00:07:59 I think he was talking about knowledge representation.
00:08:02 Do you think about memory a lot?
00:08:04 One way you can think of neural networks
00:08:06 as a kind of memory, you’re memorizing things,
00:08:10 but it doesn’t seem to be the kind of memory
00:08:14 that’s in our brains,
00:08:16 or it doesn’t have the same rich complexity,
00:08:18 long term nature that’s in our brains.
00:08:20 Yes, the brain is more of a sparse access memory
00:08:23 so that you can actually retrieve very precisely
00:08:27 like bits of your experience.
00:08:30 The retrieval aspect, you can like introspect,
00:08:33 you can ask yourself questions.
00:08:35 I guess you can program your own memory
00:08:38 and language is actually the tool you use to do that.
00:08:41 I think language is a kind of operating system for the mind
00:08:46 and you use language.
00:08:47 Well, one of the uses of language is as a query
00:08:51 that you run over your own memory,
00:08:53 use words as keys to retrieve specific experiences
00:08:57 or specific concepts, specific thoughts.
00:09:00 Like language is a way you store thoughts,
00:09:02 not just in writing, in the physical world,
00:09:04 but also in your own mind.
00:09:06 And it’s also how you retrieve them.
00:09:07 Like, imagine if you didn’t have language,
00:09:10 then you would have to,
00:09:11 you would not really have a self triggered,
00:09:14 internally triggered way of retrieving past thoughts.
00:09:18 You would have to rely on external experiences.
00:09:21 For instance, you see a specific sight,
00:09:24 you smell a specific smell and that brings up memories,
00:09:26 but you would not really have a way
00:09:28 to deliberately access these memories without language.
00:09:32 Well, the interesting thing you mentioned
00:09:33 is you can also program the memory.
00:09:37 You can change it probably with language.
00:09:39 Yeah, using language, yes.
00:09:41 Well, let me ask you a Chomsky question,
00:09:44 which is like, first of all,
00:09:45 do you think language is like fundamental,
00:09:49 like there’s turtles, what’s at the bottom of the turtles?
00:09:54 They don’t go, it can’t be turtles all the way down.
00:09:57 Is language at the bottom of cognition of everything?
00:10:00 Is like language, the fundamental aspect
00:10:05 of like what it means to be a thinking thing?
00:10:10 No, I don’t think so.
00:10:12 I think language is.
00:10:12 You disagree with Noam Chomsky?
00:10:14 Yes, I think language is a layer on top of cognition.
00:10:17 So it is fundamental to cognition in the sense that
00:10:21 to use a computing metaphor,
00:10:23 I see language as the operating system of the brain,
00:10:28 of the human mind.
00:10:29 And the operating system is a layer on top of the computer.
00:10:33 The computer exists before the operating system,
00:10:36 but the operating system is how you make it truly useful.
00:10:39 And the operating system is most likely Windows, not Linux,
00:10:43 because language is messy.
00:10:45 Yeah, it’s messy and it’s pretty difficult
00:10:49 to inspect it, introspect it.
00:10:53 How do you think about language?
00:10:55 Like we use actually sort of human interpretable language,
00:11:00 but is there something like a deeper,
00:11:03 that’s closer to like logical type of statements?
00:11:08 Like, yeah, what is the nature of language, do you think?
00:11:16 Like is there something deeper than like the syntactic rules
00:11:18 we construct?
00:11:19 Is there something that doesn’t require utterances
00:11:22 or writing or so on?
00:11:25 Are you asking about the possibility
00:11:27 that there could exist languages for thinking
00:11:30 that are not made of words?
00:11:32 Yeah.
00:11:33 Yeah, I think so.
00:11:34 I think, so the mind is layers, right?
00:11:38 And language is almost like the outermost,
00:11:41 the uppermost layer.
00:11:44 But before we think in words,
00:11:46 I think we think in terms of emotion in space
00:11:51 and we think in terms of physical actions.
00:11:54 And I think babies in particular,
00:11:56 probably express their thoughts in terms of the actions
00:12:01 that they’ve seen or that they can perform
00:12:03 and in terms of motions of objects in their environment
00:12:08 before they start thinking in terms of words.
00:12:10 It’s amazing to think about that
00:12:13 as the building blocks of language.
00:12:16 So like the kind of actions and ways the babies see the world
00:12:21 as like more fundamental
00:12:23 than the beautiful Shakespearean language
00:12:26 you construct on top of it.
00:12:28 And we probably don’t have any idea
00:12:30 what that looks like, right?
00:12:31 Like what, because it’s important
00:12:34 for them trying to engineer it into AI systems.
00:12:38 I think visual analogies and motion
00:12:42 is a fundamental building block of the mind.
00:12:45 And you actually see it reflected in language.
00:12:48 Like language is full of spatial metaphors.
00:12:51 And when you think about things,
00:12:53 I consider myself very much as a visual thinker.
00:12:57 You often express these thoughts
00:13:01 by using things like visualizing concepts
00:13:06 in 2D space or like you solve problems
00:13:09 by imagining yourself navigating a concept space.
00:13:14 So I don’t know if you have this sort of experience.
00:13:17 You said visualizing concept space.
00:13:19 So like, so I certainly think about,
00:13:24 I certainly visualize mathematical concepts,
00:13:27 but you mean like in concept space,
00:13:32 visually you’re embedding ideas
00:13:34 into a three dimensional space
00:13:36 you can explore with your mind essentially?
00:13:38 It would be more like 2D, but yeah.
00:13:40 2D?
00:13:41 Yeah.
00:13:42 You’re a flatlander.
00:13:43 You’re, okay.
00:13:45 No, I do not.
00:13:49 I always have to, before I jump from concept to concept,
00:13:52 I have to put it back down on paper.
00:13:57 It has to be on paper.
00:13:58 I can only travel on 2D paper, not inside my mind.
00:14:03 You’re able to move inside your mind.
00:14:05 But even if you’re writing like a paper, for instance,
00:14:07 don’t you have like a spatial representation of your paper?
00:14:11 Like you visualize where ideas lie topologically
00:14:16 in relationship to other ideas,
00:14:18 kind of like a subway map of the ideas in your paper.
00:14:22 Yeah, that’s true.
00:14:23 I mean, there is, in papers, I don’t know about you,
00:14:27 but it feels like there’s a destination.
00:14:32 There’s a key idea that you want to arrive at.
00:14:36 And a lot of it is in the fog
00:14:39 and you’re trying to kind of,
00:14:40 it’s almost like, what’s that called
00:14:46 when you do a path planning search from both directions,
00:14:49 from the start and from the end.
00:14:52 And then you find, you do like shortest path,
00:14:54 but like, you know, in game playing,
00:14:57 you do this with like A star from both sides.
00:15:01 And you see where they join.
00:15:03 Yeah, so you kind of do, at least for me,
00:15:05 I think like, first of all,
00:15:07 just exploring from the start from like first principles,
00:15:10 what do I know, what can I start proving from that, right?
00:15:15 And then from the destination,
00:15:18 if you start backtracking,
00:15:20 like if I want to show some kind of sets of ideas,
00:15:25 what would it take to show them and you kind of backtrack,
00:15:28 but like, yeah,
00:15:29 I don’t think I’m doing all that in my mind though.
00:15:31 Like I’m putting it down on paper.
00:15:33 Do you use mind maps to organize your ideas?
00:15:35 Yeah, I like mind maps.
00:15:37 Let’s get into this,
00:15:38 because I’ve been so jealous of people.
00:15:41 I haven’t really tried it.
00:15:42 I’ve been jealous of people that seem to like,
00:15:45 they get like this fire of passion in their eyes
00:15:48 because everything starts making sense.
00:15:50 It’s like Tom Cruise in the movie
00:15:51 was like moving stuff around.
00:15:53 Some of the most brilliant people I know use mind maps.
00:15:55 I haven’t tried really.
00:15:57 Can you explain what the hell a mind map is?
00:16:01 I guess mind map is a way to make
00:16:03 kind of like the mess inside your mind
00:16:05 to just put it on paper so that you gain more control over it.
00:16:10 It’s a way to organize things on paper
00:16:13 and as kind of like a consequence
00:16:16 of organizing things on paper,
00:16:17 they start being more organized inside your own mind.
00:16:20 So what does that look like?
00:16:21 You put, like, do you have an example?
00:16:23 Like what’s the first thing you write on paper?
00:16:27 What’s the second thing you write?
00:16:28 I mean, typically you draw a mind map
00:16:31 to organize the way you think about a topic.
00:16:34 So you would start by writing down
00:16:37 like the key concept about that topic.
00:16:39 Like you would write intelligence or something,
00:16:42 and then you would start adding associative connections.
00:16:45 Like what do you think about
00:16:46 when you think about intelligence?
00:16:48 What do you think are the key elements of intelligence?
00:16:50 So maybe you would have language, for instance,
00:16:52 and you’d have motion.
00:16:53 And so you would start drawing nodes with these things.
00:16:55 And then you would see what do you think about
00:16:57 when you think about motion and so on.
00:16:59 And you would go like that, like a tree.
00:17:00 Is it a tree mostly or is it a graph too, like a tree?
00:17:05 Oh, it’s more of a graph than a tree.
00:17:07 And it’s not limited to just writing down words.
00:17:13 You can also draw things.
00:17:15 And it’s not supposed to be purely hierarchical, right?
00:17:21 The point is that once you start writing it down,
00:17:24 you can start reorganizing it so that it makes more sense,
00:17:27 so that it’s connected in a more effective way.
00:17:29 See, but I’m so OCD that you just mentioned
00:17:34 intelligence and language and motion.
00:17:37 I would start becoming paranoid
00:17:39 that the categorization isn’t perfect.
00:17:41 Like that I would become paralyzed with the mind map
00:17:47 that like this may not be.
00:17:49 So like the, even though you’re just doing
00:17:52 associative kind of connections,
00:17:55 there’s an implied hierarchy that’s emerging.
00:17:58 And I would start becoming paranoid
00:17:59 that it’s not the proper hierarchy.
00:18:02 So you’re not just, one way to see mind maps
00:18:04 is you’re putting thoughts on paper.
00:18:07 It’s like a stream of consciousness,
00:18:10 but then you can also start getting paranoid.
00:18:12 Well, is this the right hierarchy?
00:18:15 Sure, which it’s mind maps, your mind map.
00:18:17 You’re free to draw anything you want.
00:18:19 You’re free to draw any connection you want.
00:18:20 And you can just make a different mind map
00:18:23 if you think the central node is not the right node.
00:18:26 Yeah, I suppose there’s a fear of being wrong.
00:18:29 If you want to organize your ideas
00:18:32 by writing down what you think,
00:18:35 which I think is very effective.
00:18:37 Like how do you know what you think about something
00:18:40 if you don’t write it down, right?
00:18:42 If you do that, the thing is that it imposes
00:18:46 much more syntactic structure over your ideas,
00:18:49 which is not required with mind maps.
00:18:51 So mind map is kind of like a lower level,
00:18:54 more freehand way of organizing your thoughts.
00:18:57 And once you’ve drawn it,
00:18:59 then you can start actually voicing your thoughts
00:19:03 in terms of, you know, paragraphs.
00:19:05 It’s a two dimensional aspect of layout too, right?
00:19:08 Yeah.
00:19:09 It’s a kind of flower, I guess, you start.
00:19:12 There’s usually, you want to start with a central concept?
00:19:15 Yes.
00:19:16 Then you move out.
00:19:17 Typically it ends up more like a subway map.
00:19:19 So it ends up more like a graph,
00:19:20 a topological graph without a root node.
00:19:23 Yeah, so like in a subway map,
00:19:25 there are some nodes that are more connected than others.
00:19:27 And there are some nodes that are more important than others.
00:19:30 So there are destinations,
00:19:32 but it’s not going to be purely like a tree, for instance.
00:19:36 Yeah, it’s fascinating to think that
00:19:38 if there’s something to that about the way our mind thinks.
00:19:42 By the way, I just kind of remembered obvious thing
00:19:45 that I have probably thousands of documents
00:19:49 in Google Doc at this point, that are bullet point lists,
00:19:53 which is, you can probably map a mind map
00:19:57 to a bullet point list.
00:20:01 It’s the same, it’s a, no, it’s not, it’s a tree.
00:20:05 It’s a tree, yeah.
00:20:06 So I create trees,
00:20:07 but also they don’t have the visual element.
00:20:10 Like, I guess I’m comfortable with the structure.
00:20:13 It feels like the narrowness,
00:20:15 the constraints feel more comforting.
00:20:18 If you have thousands of documents
00:20:20 with your own thoughts in Google Docs,
00:20:23 why don’t you write some kind of search engine,
00:20:26 like maybe a mind map, a piece of software,
00:20:30 mind mapping software, where you write down a concept
00:20:33 and then it gives you sentences or paragraphs
00:20:37 from your thousand Google Docs document
00:20:39 that match this concept.
00:20:41 The problem is it’s so deeply, unlike mind maps,
00:20:45 it’s so deeply rooted in natural language.
00:20:48 So it’s not, it’s not semantically searchable,
00:20:54 I would say, because the categories are very,
00:20:57 you kind of mentioned intelligence, language, and motion.
00:21:00 They’re very strong, semantic.
00:21:02 Like, it feels like the mind map forces you
00:21:05 to be semantically clear and specific.
00:21:09 The bullet point lists I have are sparse,
00:21:13 disparate thoughts that poetically represent
00:21:20 a category like motion, as opposed to saying motion.
00:21:25 So unfortunately, that’s the same problem with the internet.
00:21:28 That’s why the idea of semantic web is difficult to get.
00:21:32 It’s, most language on the internet is a giant mess
00:21:37 of natural language that’s hard to interpret, which,
00:21:42 so do you think there’s something to mind maps as,
00:21:46 you actually originally brought it up
00:21:48 as we were talking about kind of cognition and language.
00:21:53 Do you think there’s something to mind maps
00:21:55 about how our brain actually deals,
00:21:58 like think reasons about things?
00:22:01 It’s possible.
00:22:02 I think it’s reasonable to assume that there is
00:22:07 some level of topological processing in the brain,
00:22:10 that the brain is very associative in nature.
00:22:15 And I also believe that a topological space
00:22:20 is a better medium to encode thoughts
00:22:25 than a geometric space.
00:22:27 So I think…
00:22:28 What’s the difference in a topological
00:22:29 and a geometric space?
00:22:31 Well, if you’re talking about topologies,
00:22:34 then points are either connected or not.
00:22:36 So a topology is more like a subway map.
00:22:38 And geometry is when you’re interested
00:22:41 in the distance between things.
00:22:43 And in a subway map,
00:22:44 you don’t really have the concept of distance.
00:22:46 You only have the concept of whether there is a train
00:22:48 going from station A to station B.
00:22:52 And what we do in deep learning is that we’re actually
00:22:55 dealing with geometric spaces.
00:22:57 We are dealing with concept vectors, word vectors,
00:23:01 that have a distance between them
00:23:03 expressed in terms of a dot product.
00:23:05 So we are not really building topological models usually.
00:23:10 I think you’re absolutely right.
00:23:11 Like distance is a fundamental importance in deep learning.
00:23:16 I mean, it’s the continuous aspect of it.
00:23:19 Yes, because everything is a vector
00:23:21 and everything has to be a vector
00:23:22 because everything has to be differentiable.
00:23:24 If your space is discrete, it’s no longer differentiable.
00:23:26 You cannot do deep learning in it anymore.
00:23:29 Well, you could, but you can only do it by embedding it
00:23:32 in a bigger continuous space.
00:23:35 So if you do topology in the context of deep learning,
00:23:39 you have to do it by embedding your topology
00:23:41 in the geometry.
00:23:42 Well, let me zoom out for a second.
00:23:46 Let’s get into your paper on the measure of intelligence
00:23:50 that you put out in 2019.
00:23:52 Yes.
00:23:53 Okay.
00:23:54 November.
00:23:55 November.
00:23:57 Yeah, remember 2019?
00:23:59 That was a different time.
00:24:01 Yeah, I remember.
00:24:02 I still remember.
00:24:06 It feels like a different world.
00:24:09 You could travel, you could actually go outside
00:24:12 and see friends.
00:24:15 Yeah.
00:24:16 Let me ask the most absurd question.
00:24:18 I think there’s some nonzero probability
00:24:21 there’ll be a textbook one day, like 200 years from now
00:24:25 on artificial intelligence,
00:24:27 or it’ll be called like just intelligence
00:24:30 cause humans will already be gone.
00:24:32 It’ll be your picture with a quote.
00:24:35 This is, you know, one of the early biological systems
00:24:39 would consider the nature of intelligence
00:24:41 and there’ll be like a definition
00:24:43 of how they thought about intelligence.
00:24:45 Which is one of the things you do in your paper
00:24:46 on measure intelligence is to ask like,
00:24:51 well, what is intelligence
00:24:52 and how to test for intelligence and so on.
00:24:55 So is there a spiffy quote about what is intelligence?
00:25:01 What is the definition of intelligence
00:25:03 according to Francois Chollet?
00:25:06 Yeah, so do you think the super intelligent AIs
00:25:10 of the future will want to remember us
00:25:13 the way we remember humans from the past?
00:25:16 And do you think they will be, you know,
00:25:18 they won’t be ashamed of having a biological origin?
00:25:22 No, I think it would be a niche topic.
00:25:24 It won’t be that interesting,
00:25:25 but it’ll be like the people that study
00:25:29 in certain contexts like historical civilization
00:25:33 that no longer exists, the Aztecs and so on.
00:25:36 That’s how it’ll be seen.
00:25:38 And it’ll be studied also in the context of social media.
00:25:42 There’ll be hashtags about the atrocity
00:25:46 committed to human beings
00:25:49 when the robots finally got rid of them.
00:25:52 Like it was a mistake.
00:25:55 You’ll be seen as a giant mistake,
00:25:57 but ultimately in the name of progress
00:26:00 and it created a better world
00:26:01 because humans were over consuming the resources
00:26:05 and they were not very rational
00:26:07 and were destructive in the end in terms of productivity
00:26:11 and putting more love in the world.
00:26:13 And so within that context,
00:26:15 there’ll be a chapter about these biological systems.
00:26:17 You seem to have a very detailed vision of that.
00:26:20 You should write a sci fi novel about it.
00:26:22 I’m working on a sci fi novel currently, yes.
00:26:28 Self published, yeah.
00:26:29 The definition of intelligence.
00:26:30 So intelligence is the efficiency
00:26:34 with which you acquire new skills at tasks
00:26:39 that you did not previously know about,
00:26:41 that you did not prepare for, right?
00:26:44 So intelligence is not skill itself.
00:26:47 It’s not what you know, it’s not what you can do.
00:26:50 It’s how well and how efficiently
00:26:52 you can learn new things.
00:26:54 New things.
00:26:55 Yes.
00:26:56 The idea of newness there
00:26:58 seems to be fundamentally important.
00:27:01 Yes.
00:27:02 So you would see intelligence on display, for instance.
00:27:05 Whenever you see a human being or an AI creature
00:27:09 adapt to a new environment that it has not seen before,
00:27:13 that its creators did not anticipate.
00:27:16 When you see adaptation, when you see improvisation,
00:27:19 when you see generalization, that’s intelligence.
00:27:22 In reverse, if you have a system
00:27:24 that when you put it in a slightly new environment,
00:27:27 it cannot adapt, it cannot improvise,
00:27:30 it cannot deviate from what it’s hard coded to do
00:27:33 or what it has been trained to do,
00:27:38 that is a system that is not intelligent.
00:27:41 There’s actually a quote from Einstein
00:27:43 that captures this idea, which is,
00:27:46 the measure of intelligence is the ability to change.
00:27:50 I like that quote.
00:27:51 I think it captures at least part of this idea.
00:27:54 You know, there might be something interesting
00:27:56 about the difference between your definition and Einstein’s.
00:27:59 I mean, he’s just being Einstein and clever,
00:28:04 but acquisition of new ability to deal with new things
00:28:09 versus ability to just change.
00:28:14 What’s the difference between those two things?
00:28:16 So just change in itself.
00:28:19 Do you think there’s something to that?
00:28:21 Just being able to change.
00:28:23 Yes, being able to adapt.
00:28:25 So not change, but certainly change its direction.
00:28:30 Being able to adapt yourself to your environment.
00:28:34 Whatever the environment is.
00:28:35 That’s a big part of intelligence.
00:28:37 And intelligence is more precisely, you know,
00:28:40 how efficiently you’re able to adapt,
00:28:42 how efficiently you’re able to basically master your environment,
00:28:45 how efficiently you can acquire new skills.
00:28:49 And I think there’s a big distinction to be drawn
00:28:52 between intelligence, which is a process,
00:28:56 and the output of that process, which is skill.
00:29:01 So for instance,
00:29:04 if you have a very smart human programmer
00:29:08 that considers the game of chess,
00:29:10 and that writes down a static program that can play chess,
00:29:16 then the intelligence is the process
00:29:19 of developing that program.
00:29:20 But the program itself is just encoding
00:29:25 the output artifact of that process.
00:29:28 The program itself is not intelligent.
00:29:30 And the way you tell it’s not intelligent
00:29:31 is that if you put it in a different context,
00:29:34 you ask it to play Go or something,
00:29:36 it’s not going to be able to perform well
00:29:37 without human involvement,
00:29:38 because the source of intelligence,
00:29:41 the entity that is capable of that process
00:29:43 is the human programmer.
00:29:44 So we should be able to tell the difference
00:29:47 between the process and its output.
00:29:50 We should not confuse the output and the process.
00:29:53 It’s the same as, you know,
00:29:54 do not confuse a road building company
00:29:58 and one specific road,
00:30:00 because one specific road takes you from point A to point B,
00:30:03 but a road building company can take you from,
00:30:06 can make a path from anywhere to anywhere else.
00:30:08 Yeah, that’s beautifully put,
00:30:10 but it’s also to play devil’s advocate a little bit.
00:30:15 You know, it’s possible that there’s something
00:30:18 more fundamental than us humans.
00:30:21 So you kind of said the programmer creates
00:30:25 the difference between acquiring
00:30:28 the skill and the skill itself.
00:30:31 There could be something like,
00:30:32 you could argue the universe is more intelligent.
00:30:36 Like the base intelligence that we should be trying
00:30:43 to measure is something that created humans.
00:30:46 We should be measuring God or the source of the universe
00:30:51 as opposed to, like there could be a deeper intelligence.
00:30:55 Sure.
00:30:55 There’s always deeper intelligence, I guess.
00:30:57 You can argue that,
00:30:58 but that does not take anything away
00:31:00 from the fact that humans are intelligent.
00:31:01 And you can tell that
00:31:03 because they are capable of adaptation and generality.
00:31:07 Got it.
00:31:07 And you see that in particular in the fact
00:31:09 that humans are capable of handling situations and tasks
00:31:16 that are quite different from anything
00:31:19 that any of our evolutionary ancestors
00:31:22 has ever encountered.
00:31:24 So we are capable of generalizing very much
00:31:27 out of distribution,
00:31:28 if you consider our evolutionary history
00:31:30 as being in a way our training data.
00:31:33 Of course, evolutionary biologists would argue
00:31:35 that we’re not going too far out of the distribution.
00:31:37 We’re like mapping the skills we’ve learned previously,
00:31:41 desperately trying to like jam them
00:31:43 into like these new situations.
00:31:47 I mean, there’s definitely a little bit of that,
00:31:49 but it’s pretty clear to me that we’re able to,
00:31:53 most of the things we do any given day
00:31:56 in our modern civilization
00:31:58 are things that are very, very different
00:32:00 from what our ancestors a million years ago
00:32:03 would have been doing in a given day.
00:32:05 And your environment is very different.
00:32:07 So I agree that everything we do,
00:32:12 we do it with cognitive building blocks
00:32:14 that we acquired over the course of evolution, right?
00:32:17 And that anchors our cognition to a certain context,
00:32:22 which is the human condition very much.
00:32:25 But still our mind is capable of a pretty remarkable degree
00:32:29 of generality far beyond anything we can create
00:32:32 in artificial systems today.
00:32:34 Like the degree in which the mind can generalize
00:32:37 from its evolutionary history,
00:32:41 can generalize away from its evolutionary history
00:32:43 is much greater than the degree
00:32:46 to which a deep learning system today
00:32:48 can generalize away from its training data.
00:32:51 And like the key point you’re making,
00:32:52 which I think is quite beautiful is like,
00:32:54 we shouldn’t measure, if we’re talking about measurement,
00:32:58 we shouldn’t measure the skill.
00:33:01 We should measure like the creation of the new skill,
00:33:04 the ability to create that new skill.
00:33:06 But it’s tempting, like it’s weird
00:33:10 because the skill is a little bit of a small window
00:33:13 into the system.
00:33:16 So whenever you have a lot of skills,
00:33:19 it’s tempting to measure the skills.
00:33:21 I mean, the skill is the only thing you can objectively
00:33:25 measure, but yeah.
00:33:27 So the thing to keep in mind is that
00:33:30 when you see skill in the human,
00:33:35 it gives you a strong signal that that human is intelligent
00:33:39 because you know they weren’t born with that skill typically.
00:33:42 Like you see a very strong chess player,
00:33:45 maybe you’re a very strong chess player yourself.
00:33:47 I think you’re saying that because I’m Russian
00:33:51 and now you’re prejudiced, you assume.
00:33:53 All Russians are good at chess.
00:33:54 I’m biased, exactly.
00:33:55 I’m biased, yeah.
00:33:56 Well, you’re definitely biased.
00:34:00 So if you see a very strong chess player,
00:34:01 you know they weren’t born knowing how to play chess.
00:34:05 So they had to acquire that skill
00:34:07 with their limited resources, with their limited lifetime.
00:34:10 And they did that because they are generally intelligent.
00:34:15 And so they may as well have acquired any other skill.
00:34:18 You know they have this potential.
00:34:21 And on the other hand, if you see a computer playing chess,
00:34:25 you cannot make the same assumptions
00:34:27 because you cannot just assume
00:34:29 the computer is generally intelligent.
00:34:30 The computer may be born knowing how to play chess
00:34:35 in the sense that it may have been programmed by a human
00:34:38 that has understood chess for the computer
00:34:40 and that has just encoded the output
00:34:44 of that understanding in a static program.
00:34:46 And that program is not intelligent.
00:34:49 So let’s zoom out just for a second and say like,
00:34:52 what is the goal on the measure of intelligence paper?
00:34:57 Like what do you hope to achieve with it?
00:34:59 So the goal of the paper is to clear up
00:35:01 some longstanding misunderstandings
00:35:04 about the way we’ve been conceptualizing intelligence
00:35:08 in the AI community and in the way we’ve been
00:35:12 evaluating progress in AI.
00:35:16 There’s been a lot of progress recently in machine learning
00:35:19 and people are extrapolating from that progress
00:35:22 that we are about to solve general intelligence.
00:35:26 And if you want to be able to evaluate these statements,
00:35:30 you need to precisely define what you’re talking about
00:35:33 when you’re talking about general intelligence.
00:35:35 And you need a formal way, a reliable way to measure
00:35:40 how much intelligence,
00:35:42 how much general intelligence a system possesses.
00:35:45 And ideally this measure of intelligence
00:35:48 should be actionable.
00:35:50 So it should not just describe what intelligence is.
00:35:54 It should not just be a binary indicator
00:35:56 that tells you the system is intelligent or it isn’t.
00:36:01 It should be actionable.
00:36:03 It should have explanatory power, right?
00:36:05 So you could use it as a feedback signal.
00:36:08 It would show you the way
00:36:10 towards building more intelligent systems.
00:36:13 So at the first level, you draw a distinction
00:36:16 between two divergent views of intelligence.
00:36:21 As we just talked about,
00:36:22 intelligence is a collection of task specific skills
00:36:26 and a general learning ability.
00:36:29 So what’s the difference between
00:36:32 kind of this memorization of skills
00:36:35 and a general learning ability?
00:36:37 We’ve talked about it a little bit,
00:36:39 but can you try to linger on this topic for a bit?
00:36:43 Yeah, so the first part of the paper
00:36:45 is an assessment of the different ways
00:36:49 we’ve been thinking about intelligence
00:36:50 and the different ways we’ve been evaluating progress in AI.
00:36:54 And the history of cognitive science
00:36:57 has been shaped by two views of the human mind.
00:37:01 And one view is the evolutionary psychology view
00:37:04 in which the mind is a collection of fairly static
00:37:10 special purpose ad hoc mechanisms
00:37:14 that have been hard coded by evolution
00:37:17 over our history as a species for a very long time.
00:37:22 And early AI researchers,
00:37:27 people like Marvin Minsky, for instance,
00:37:30 they clearly subscribed to this view.
00:37:33 And they saw the mind as a kind of
00:37:36 collection of static programs
00:37:39 similar to the programs they would run
00:37:42 on like mainframe computers.
00:37:43 And in fact, I think they very much understood the mind
00:37:48 through the metaphor of the mainframe computer
00:37:50 because that was the tool they were working with, right?
00:37:53 And so you had these static programs,
00:37:55 this collection of very different static programs
00:37:57 operating over a database-like memory.
00:38:00 And in this picture, learning was not very important.
00:38:03 Learning was considered to be just memorization.
00:38:05 And in fact, learning is basically not featured
00:38:10 in AI textbooks until the 1980s
00:38:14 with the rise of machine learning.
00:38:16 It’s kind of fun to think about
00:38:18 that learning was the outcast.
00:38:21 Like the weird people working on learning,
00:38:24 like the mainstream AI world was,
00:38:28 I mean, I don’t know what the best term is,
00:38:31 but it’s non learning.
00:38:33 It was seen as like reasoning would not be learning based.
00:38:37 Yes, it was considered that the mind
00:38:40 was a collection of programs
00:38:43 that were primarily logical in nature.
00:38:46 And that’s all you needed to do to create a mind
00:38:49 was to write down these programs
00:38:50 and they would operate over knowledge,
00:38:52 which would be stored in some kind of database.
00:38:55 And as long as your database would encompass,
00:38:57 you know, everything about the world
00:38:59 and your logical rules were comprehensive,
00:39:03 then you would have a mind.
00:39:04 So the other view of the mind
00:39:06 is the brain as a sort of blank slate, right?
00:39:11 This is a very old idea.
00:39:13 You find it in John Locke’s writings.
00:39:16 This is the tabula rasa.
00:39:19 And this is this idea that the mind
00:39:21 is some kind of like information sponge
00:39:23 that starts empty, that starts blank.
00:39:27 And that absorbs knowledge and skills from experience, right?
00:39:34 So it’s a sponge that reflects the complexity of the world,
00:39:38 the complexity of your life experience, essentially.
00:39:41 That everything you know and everything you can do
00:39:44 is a reflection of something you found
00:39:47 in the outside world, essentially.
00:39:49 So this is an idea that’s very old.
00:39:51 That was not very popular, for instance, in the 1970s.
00:39:56 But that gained a lot of vitality recently
00:39:58 with the rise of connectionism, in particular deep learning.
00:40:02 And so today, deep learning
00:40:03 is the dominant paradigm in AI.
00:40:06 And I feel like lots of AI researchers
00:40:10 are conceptualizing the mind via a deep learning metaphor.
00:40:14 Like they see the mind as a kind of
00:40:17 randomly initialized neural network that starts blank
00:40:21 when you’re born.
00:40:22 And then that gets trained via exposure to training data,
00:40:26 that acquires knowledge and skills
00:40:27 via exposure to training data.
00:40:29 By the way, it’s a small tangent.
00:40:32 I feel like people who are thinking about intelligence
00:40:36 are not conceptualizing it that way.
00:40:39 I actually haven’t met too many people
00:40:41 who believe that a neural network
00:40:44 will be able to reason, who seriously think that rigorously.
00:40:51 Because I think it’s actually an interesting worldview.
00:40:54 And we’ll talk about it more,
00:40:56 but it’s been impressive what neural networks
00:41:00 have been able to accomplish.
00:41:02 And to me, I don’t know, you might disagree,
00:41:04 but it’s an open question whether like scaling size
00:41:09 eventually might lead to incredible results
00:41:13 to us mere humans will appear as if it’s general.
00:41:17 I mean, if you ask people who are seriously thinking
00:41:19 about intelligence, they will definitely not say
00:41:22 that all you need to do is,
00:41:24 like the mind is just a neural network.
00:41:27 However, it’s actually a view that’s very popular,
00:41:30 I think, in the deep learning community
00:41:31 that many people are kind of conceptually
00:41:35 intellectually lazy about it.
00:41:37 Right, but I guess what I’m saying is exactly that.
00:41:40 I mean, I haven’t met many people,
00:41:44 and I think it would be interesting to meet a person
00:41:47 who is not intellectually lazy about this particular topic
00:41:50 and still believes that neural networks will go all the way.
00:41:54 I think Yann is probably closest to that
00:41:56 with self supervised learning.
00:41:57 There are definitely people who argue
00:41:59 that current deep learning techniques
00:42:03 are already the way to general artificial intelligence.
00:42:06 And that all you need to do is to scale it up
00:42:09 to all the available training data.
00:42:12 And that’s, if you look at the waves
00:42:16 that OpenAI’s GPT3 model has made,
00:42:19 you see echoes of this idea.
00:42:22 So on that topic, GPT3, similar to GPT2 actually,
00:42:28 have captivated some part of the imagination of the public.
00:42:33 There’s just a bunch of hype of different kind.
00:42:35 That’s, I would say it’s emergent.
00:42:37 It’s not artificially manufactured.
00:42:39 It’s just like people just get excited
00:42:42 for some strange reason.
00:42:43 And in the case of GPT3, which is funny,
00:42:46 that there’s, I believe, a couple months delay
00:42:49 from release to hype.
00:42:51 Maybe I’m not historically correct on that,
00:42:56 but it feels like there was a little bit of a lack of hype
00:43:01 and then there’s a phase shift into hype.
00:43:04 But nevertheless, there’s a bunch of cool applications
00:43:07 that seem to captivate the imagination of the public
00:43:10 about what this language model
00:43:12 that’s trained in an unsupervised way
00:43:15 without any fine tuning is able to achieve.
00:43:19 So what do you make of that?
00:43:20 What are your thoughts about GPT3?
00:43:22 Yeah, so I think what’s interesting about GPT3
00:43:25 is the idea that it may be able to learn new tasks
00:43:31 after just being shown a few examples.
00:43:33 So I think if it’s actually capable of doing that,
00:43:35 that’s novel and that’s very interesting
00:43:37 and that’s something we should investigate.
00:43:39 That said, I must say, I’m not entirely convinced
00:43:43 that we have shown it’s capable of doing that.
00:43:47 It’s very likely, given the amount of data
00:43:50 that the model is trained on,
00:43:52 that what it’s actually doing is pattern matching
00:43:55 a new task you give it with a task
00:43:58 that it’s been exposed to in its training data.
00:44:00 It’s just recognizing the task
00:44:01 instead of just developing a model of the task, right?
00:44:05 But there’s, sorry to interrupt,
00:44:07 there’s a parallel as to what you said before,
00:44:10 which is it’s possible to see GPT3 as like the prompts
00:44:14 it’s given as a kind of SQL query
00:44:17 into this thing that it’s learned,
00:44:19 similar to what you said before,
00:44:20 which is language is used to query the memory.
00:44:23 Yes.
00:44:24 So is it possible that neural network
00:44:26 is a giant memorization thing,
00:44:29 but then if it gets sufficiently giant,
00:44:32 it’ll memorize sufficiently large amounts
00:44:35 of things in the world or it becomes,
00:44:37 or intelligence becomes a querying machine?
00:44:40 I think it’s possible that a significant chunk
00:44:44 of intelligence is this giant associative memory.
00:44:48 I definitely don’t believe that intelligence
00:44:51 is just a giant associative memory,
00:44:53 but it may well be a big component.
00:44:57 So do you think GPT3, 4, 5,
00:45:02 GPT10 will eventually, like, what do you think,
00:45:07 where’s the ceiling?
00:45:08 Do you think you’ll be able to reason?
00:45:11 No, that’s a bad question.
00:45:14 Like, what is the ceiling is the better question.
00:45:17 How well is it gonna scale?
00:45:18 How good is GPTN going to be?
00:45:21 Yeah.
00:45:22 So I believe GPTN is gonna.
00:45:25 GPTN.
00:45:26 Is gonna improve on the strength of GPT2 and 3,
00:45:30 which is it will be able to generate, you know,
00:45:33 ever more plausible text in context.
00:45:37 Just monotonically increasing performance.
00:45:41 Yes, if you train a bigger model on more data,
00:45:44 then your text will be increasingly more context aware
00:45:49 and increasingly more plausible
00:45:51 in the same way that GPT3 is much better
00:45:54 at generating plausible text compared to GPT2.
00:45:57 But that said, I don’t think just scaling up the model
00:46:01 to more transformer layers and more training data
00:46:04 is gonna address the flaws of GPT3,
00:46:07 which is that it can generate plausible text,
00:46:09 but that text is not constrained by anything else
00:46:13 other than plausibility.
00:46:15 So in particular, it’s not constrained by factualness
00:46:19 or even consistency, which is why it’s very easy
00:46:21 to get GPT3 to generate statements
00:46:23 that are factually untrue.
00:46:26 Or to generate statements that are even self contradictory.
00:46:29 Right?
00:46:30 Because its only goal is plausibility,
00:46:35 and it has no other constraints.
00:46:37 It’s not constrained to be self consistent, for instance.
00:46:40 Right?
00:46:41 And so for this reason, one thing that I thought
00:46:43 was very interesting with GPT3 is that you can
00:46:46 predetermine the answer it will give you
00:46:49 by asking the question in a specific way,
00:46:52 because it’s very responsive to the way you ask the question.
00:46:55 Since it has no understanding of the content of the question.
00:47:00 Right.
00:47:01 And if you ask the same question in two different ways
00:47:05 that are basically adversarially engineered
00:47:09 to produce certain answer,
00:47:10 you will get two different answers,
00:47:12 two contradictory answers.
00:47:14 It’s very susceptible to adversarial attacks, essentially.
00:47:16 Potentially, yes.
00:47:17 So in general, the problem with these models,
00:47:20 these generative models, is that they are very good
00:47:24 at generating plausible text,
00:47:27 but that’s just not enough.
00:47:29 Right?
00:47:33 I think one avenue that would be very interesting
00:47:36 to make progress is to make it possible
00:47:40 to write programs over the latent space
00:47:43 that these models operate on.
00:47:45 That you would rely on these self supervised models
00:47:49 to generate a sort of like pool of knowledge and concepts
00:47:54 and common sense.
00:47:55 And then you would be able to write
00:47:57 explicit reasoning programs over it.
00:48:01 Because the current problem with GPT3 is that
00:48:03 it can be quite difficult to get it to do what you want to do.
00:48:09 If you want to turn GPT3 into products,
00:48:12 you need to put constraints on it.
00:48:14 You need to force it to obey certain rules.
00:48:19 So you need a way to program it explicitly.
00:48:22 Yeah, so if you look at its ability
00:48:24 to do program synthesis,
00:48:26 it generates, like you said, something that’s plausible.
00:48:29 Yeah, so if you try to make it generate programs,
00:48:32 it will perform well for any program
00:48:35 that it has seen in its training data.
00:48:38 But because program space is not interpolative, right?
00:48:42 It’s not going to be able to generalize to problems
00:48:46 it hasn’t seen before.
00:48:48 Now that’s currently, do you think sort of an absurd,
00:48:54 but I think useful, I guess, intuition builder is,
00:49:00 you know, the GPT3 has 175 billion parameters.
00:49:07 The human brain has about a thousand times that
00:49:11 or more in terms of number of synapses.
00:49:16 Do you think, obviously, very different kinds of things,
00:49:21 but there is some degree of similarity.
00:49:26 Do you think, what do you think GPT will look like
00:49:30 when it has 100 trillion parameters?
00:49:34 You think our conversation might be in nature different?
00:49:39 Like, because you’ve criticized GPT3 very effectively now.
00:49:42 Do you think?
00:49:45 No, I don’t think so.
00:49:46 So to begin with, the bottleneck with scaling up GPT3,
00:49:51 GPT models, generative pre-trained transformer models,
00:49:54 is not going to be the size of the model
00:49:57 or how long it takes to train it.
00:49:59 The bottleneck is going to be the training data
00:50:01 because OpenAI is already training GPT3
00:50:05 on a crawl of basically the entire web, right?
00:50:08 And that’s a lot of data.
00:50:09 So you could imagine training on more data than that,
00:50:12 like Google could train on more data than that,
00:50:14 but it would still be only incrementally more data.
00:50:17 And I don’t recall exactly how much more data GPT3
00:50:21 was trained on compared to GPT2,
00:50:22 but it’s probably at least like a hundred,
00:50:25 maybe even a thousand X.
00:50:26 I don’t have the exact number.
00:50:28 You’re not going to be able to train a model
00:50:30 on a hundred times more data than what you’re already doing.
00:50:34 So that’s brilliant.
00:50:35 So it’s easier to think of compute as a bottleneck
00:50:38 and then arguing that we can remove that bottleneck.
00:50:41 But we can remove the compute bottleneck.
00:50:43 I don’t think it’s a big problem.
00:50:44 If you look at the pace at which we’ve improved
00:50:48 the efficiency of deep learning models
00:50:51 in the past few years,
00:50:54 I’m not worried about train time bottlenecks
00:50:57 or model size bottlenecks.
00:50:59 The bottleneck in the case
00:51:01 of these generative transformer models
00:51:03 is absolutely the training data.
00:51:05 What about the quality of the data?
00:51:07 So, yeah.
00:51:08 So the quality of the data is an interesting point.
00:51:10 The thing is,
00:51:11 if you’re going to want to use these models
00:51:14 in real products,
00:51:16 then you want to feed them data
00:51:20 that’s as high quality, as factual,
00:51:23 I would say as unbiased as possible,
00:51:25 though there’s not really such a thing
00:51:27 as unbiased data in the first place.
00:51:30 But you probably don’t want to train it on Reddit,
00:51:34 for instance.
00:51:34 It sounds like a bad plan.
00:51:37 So from my personal experience,
00:51:38 working with large scale deep learning models.
00:51:42 So at some point I was working on a model at Google
00:51:46 that’s trained on 350 million labeled images.
00:51:52 It’s an image classification model.
00:51:53 That’s a lot of images.
00:51:54 That’s like probably most publicly available images
00:51:58 on the web at the time.
00:52:00 And it was a very noisy data set
00:52:03 because the labels were not originally annotated by hand,
00:52:07 by humans.
00:52:08 They were automatically derived from like tags
00:52:12 on social media,
00:52:14 or just keywords in the same page
00:52:16 as the image was found and so on.
00:52:18 So it was very noisy.
00:52:19 And it turned out that you could easily get a better model,
00:52:25 not just by training,
00:52:26 like if you train on more of the noisy data,
00:52:29 you get an incrementally better model,
00:52:31 but you very quickly hit diminishing returns.
00:52:35 On the other hand,
00:52:36 if you train on smaller data set
00:52:38 with higher quality annotations,
00:52:40 quality annotations that are actually made by humans,
00:52:45 you get a better model.
00:52:47 And it also takes less time to train it.
00:52:49 Yeah, that’s fascinating.
00:52:51 It’s the self supervised learning.
00:52:53 There’s a way to get better doing the automated labeling.
00:52:58 Yeah, so you can enrich or refine your labels
00:53:04 in an automated way.
00:53:05 That’s correct.
00:53:07 Do you have a hope for,
00:53:08 I don’t know if you’re familiar
00:53:09 with the idea of a semantic web.
00:53:11 The semantic web, just for people who are not familiar,
00:53:15 is the idea of being able to convert the internet,
00:53:20 or be able to attach like semantic meaning
00:53:25 to the words on the internet,
00:53:27 the sentences, the paragraphs,
00:53:29 to be able to convert information on the internet
00:53:33 or some fraction of the internet
00:53:35 into something that’s interpretable by machines.
00:53:39 That was kind of a dream for,
00:53:44 I think the semantic web papers in the nineties,
00:53:47 it’s kind of the dream that, you know,
00:53:49 the internet is full of rich, exciting information.
00:53:52 Even just looking at Wikipedia,
00:53:54 we should be able to use that as data for machines.
00:53:57 And so far it’s not,
00:53:58 it’s not really in a format that’s available to machines.
00:54:01 So no, I don’t think the semantic web will ever work
00:54:04 simply because it would be a lot of work, right?
00:54:08 To make, to provide that information in structured form.
00:54:12 And there is not really any incentive
00:54:13 for anyone to provide that work.
00:54:16 So I think the way forward to make the knowledge
00:54:21 on the web available to machines
00:54:22 is actually something closer to unsupervised deep learning.
00:54:29 So GPT3 is actually a bigger step in the direction
00:54:32 of making the knowledge of the web available to machines
00:54:34 than the semantic web was.
00:54:36 Yeah, perhaps in a human centric sense,
00:54:40 it feels like GPT3 hasn’t learned anything
00:54:47 that could be used to reason.
00:54:50 But that might be just the early days.
00:54:52 Yeah, I think that’s correct.
00:54:54 I think the forms of reasoning that you see it perform
00:54:57 are basically just reproducing patterns
00:55:00 that it has seen in string data.
00:55:02 So of course, if you’re trained on the entire web,
00:55:06 then you can produce an illusion of reasoning
00:55:09 in many different situations.
00:55:10 But it will break down if it’s presented
00:55:13 with a novel situation.
00:55:15 That’s the open question between the illusion of reasoning
00:55:17 and actual reasoning, yeah.
00:55:18 Yes.
00:55:19 The power to adapt to something that is genuinely new.
00:55:22 Because the thing is, even imagine you had,
00:55:28 you could train on every bit of data
00:55:31 ever generated in the history of humanity.
00:55:35 It remains, that model would be capable
00:55:38 of anticipating many different possible situations.
00:55:43 But it remains that the future is
00:55:45 going to be something different.
00:55:48 For instance, if you train a GPT3 model on data
00:55:52 from the year 2002, for instance,
00:55:55 and then use it today, it’s going to be missing many things.
00:55:58 It’s going to be missing many common sense
00:56:00 facts about the world.
00:56:02 It’s even going to be missing vocabulary and so on.
00:56:05 Yeah, it’s interesting that GPT3 even doesn’t have,
00:56:09 I think, any information about the coronavirus.
00:56:13 Yes.
00:56:14 Which is why, with a system, you
00:56:19 can tell that the system is intelligent
00:56:21 when it's capable of adapting.
00:56:22 So intelligence is going to require
00:56:25 some amount of continuous learning.
00:56:28 It’s also going to require some amount of improvisation.
00:56:31 It’s not enough to assume that what you’re
00:56:33 going to be asked to do is something
00:56:36 that you’ve seen before, or something
00:56:39 that is a simple interpolation of things you’ve seen before.
00:56:42 Yeah.
00:56:43 In fact, that model breaks down even for
00:56:49 tasks that look relatively simple from a distance,
00:56:52 like L5 self driving, for instance.
00:56:55 Google had a paper a couple of years
00:56:58 back showing that something like 30 million different road
00:57:04 situations were actually completely insufficient
00:57:07 to train a driving model.
00:57:09 It wasn’t even L2, right?
00:57:11 And that’s a lot of data.
00:57:12 That’s a lot more data than the 20 or 30 hours of driving
00:57:16 that a human needs to learn to drive,
00:57:19 given the knowledge they’ve already accumulated.
00:57:21 Well, let me ask you on that topic.
00:57:25 Elon Musk, Tesla Autopilot, one of the only companies,
00:57:31 I believe, is really pushing for a learning based approach.
00:57:34 Are you skeptical that that kind of network
00:57:37 can achieve level 4?
00:57:39 L4 is probably achievable.
00:57:42 L5 probably not.
00:57:44 What’s the distinction there?
00:57:45 Is L5 where you can completely just fall asleep?
00:57:49 Yeah, L5 is basically human level.
00:57:51 Well, with driving, we have to be careful saying human level,
00:57:53 because, compared to most of the drivers,
00:57:57 yeah, that's the clearest example: cars
00:58:00 will most likely be much safer than humans in many situations
00:58:05 where humans fail.
00:58:06 And it's the vice versa question.
00:58:09 I’ll tell you, the thing is the amount of trained data
00:58:13 you would need to anticipate for pretty much every possible
00:58:17 situation you learn content in the real world
00:58:20 is such that it’s not entirely unrealistic
00:58:23 to think that at some point in the future,
00:58:25 we’ll develop a system that’s trained on enough data,
00:58:27 especially provided that we can simulate a lot of that data.
00:58:32 We don’t necessarily need actual cars
00:58:34 on the road for everything.
00:58:37 But it’s a massive effort.
00:58:39 And it turns out you can create a system that’s
00:58:42 much more adaptive, that can generalize much better
00:58:45 if you just add explicit models of the surroundings
00:58:52 of the car.
00:58:53 And if you use deep learning for what
00:58:55 it’s good at, which is to provide
00:58:57 perceptual information.
00:58:59 So in general, deep learning is a way
00:59:02 to encode perception and a way to encode intuition.
00:59:05 But it is not a good medium for any sort of explicit reasoning.
00:59:11 And in AI systems today, strong generalization
00:59:15 tends to come from explicit models,
00:59:21 tends to come from abstractions in the human mind that
00:59:24 are encoded in program form by a human engineer.
00:59:29 These are the abstractions you can actually generalize, not
00:59:31 the sort of weak abstraction that
00:59:33 is learned by a neural network.
00:59:34 Yeah, and the question is how much reasoning,
00:59:38 how much strong abstractions are required
00:59:41 to solve particular tasks like driving.
00:59:44 That’s the question.
00:59:46 Or human life existence.
00:59:48 How much strong abstractions does existence require?
00:59:53 But more specifically on driving,
00:59:58 that seems to be a coupled question about intelligence.
01:00:02 How much intelligence, how do you
01:00:05 build an intelligent system?
01:00:07 And the coupled problem, how hard is this problem?
01:00:11 How much intelligence does this problem actually require?
01:00:14 So we get to cheat because we get
01:00:18 to look at the problem.
01:00:20 It’s not like you get to close our eyes
01:00:22 and completely new to driving.
01:00:24 We get to do what we do as human beings, which
01:00:27 is for the majority of our life before we ever
01:00:31 learn, quote unquote, to drive.
01:00:32 We get to watch other cars and other people drive.
01:00:35 We get to be in cars.
01:00:36 We get to watch.
01:00:37 We get to see movies about cars.
01:00:39 We get to observe all this stuff.
01:00:42 And that’s similar to what neural networks are doing.
01:00:45 It’s getting a lot of data, and the question
01:00:50 is, yeah, how many leaps of reasoning genius
01:00:55 is required to be able to actually effectively drive?
01:00:59 I think it’s a good example of driving.
01:01:01 I mean, sure, you’ve seen a lot of cars in your life
01:01:06 before you learned to drive.
01:01:07 But let’s say you’ve learned to drive in Silicon Valley,
01:01:10 and now you rent a car in Tokyo.
01:01:14 Well, now everyone is driving on the other side of the road,
01:01:16 and the signs are different, and the roads
01:01:19 are more narrow and so on.
01:01:20 So it’s a very, very different environment.
01:01:22 And a smart human, even an average human,
01:01:26 should be able to just zero shot it,
01:01:29 to just be operational in this very different environment
01:01:34 right away, despite having had no contact with the novel
01:01:40 complexity that is contained in this environment.
01:01:44 And that novel complexity is not just an interpolation
01:01:49 over the situations that you’ve encountered previously,
01:01:52 like learning to drive in the US.
01:01:55 I would say the reason I ask is that driving is one
01:01:57 of the most interesting tests of intelligence
01:01:59 we have today, actively,
01:02:04 in terms of having an impact on the world.
01:02:06 When do you think we’ll pass that test of intelligence?
01:02:09 So I don’t think driving is that much of a test of intelligence,
01:02:13 because again, there is no task for which skill at that task
01:02:18 demonstrates intelligence, unless it’s
01:02:21 a kind of meta task that involves acquiring new skills.
01:02:26 So I don’t think, I think you can actually
01:02:28 solve driving without having any real amount of intelligence.
01:02:35 For instance, if you did have infinite training data,
01:02:39 you could just literally train an end to end deep learning
01:02:42 model that does driving, provided infinite training data.
01:02:45 The only problem with the whole idea
01:02:48 is collecting a data set that’s sufficiently comprehensive,
01:02:53 that covers the very long tail of possible situations
01:02:56 you might encounter.
01:02:57 And it’s really just a scale problem.
01:02:59 So I think there’s nothing fundamentally wrong
01:03:04 with this plan, with this idea.
01:03:06 It’s just that it strikes me as a fairly inefficient thing
01:03:11 to do, because you run into this scaling issue with diminishing
01:03:17 returns.
01:03:17 Whereas if instead you took a more manual engineering
01:03:21 approach, where you use deep learning modules in combination
01:03:29 with engineering an explicit model of the surrounding
01:03:33 of the cars, and you bridge the two in a clever way,
01:03:36 your model will actually start generalizing
01:03:38 much earlier and more effectively
01:03:40 than the end to end deep learning model.
01:03:42 So why would you not go with the more manual engineering
01:03:46 oriented approach?
01:03:47 Even if you created that system, either the end
01:03:50 to end deep learning model system that’s
01:03:52 running infinite data, or the slightly more human system,
01:03:58 I don’t think achieving L5 would demonstrate
01:04:02 general intelligence or intelligence
01:04:04 of any generality at all.
01:04:05 Again, the only possible test of generality in AI
01:04:10 would be a test that looks at skill acquisition
01:04:12 over unknown tasks.
01:04:14 For instance, you could take your L5 driver
01:04:17 and ask it to learn to pilot a commercial airplane,
01:04:21 for instance.
01:04:22 And then you would look at how much human involvement is
01:04:25 required and how much training data
01:04:26 is required for the system to learn to pilot an airplane.
01:04:29 And that gives you a measure of how intelligent
01:04:35 that system really is.
01:04:35 Yeah, well, I mean, that’s a big leap.
01:04:37 I get you.
01:04:38 But I’m more interested, as a problem, I would see,
01:04:42 to me, driving is a black box that
01:04:47 can generate novel situations at some rate,
01:04:51 what people call edge cases.
01:04:53 So it does have newness that keeps being like,
01:04:56 we’re confronted, let’s say, once a month.
01:04:59 It is a very long tail, yes.
01:05:00 It’s a long tail.
01:05:01 That doesn’t mean you cannot solve it just
01:05:05 by training a statistical model and a lot of data.
01:05:08 Huge amount of data.
01:05:09 It’s really a matter of scale.
01:05:11 But I guess what I’m saying is if you have a vehicle that
01:05:16 achieves level 5, it is going to be able to deal
01:05:21 with new situations.
01:05:23 Or, I mean, the data is so large that the rate of new situations
01:05:30 is very low.
01:05:32 Yes.
01:05:33 That’s not intelligent.
01:05:34 So if we go back to your kind of definition of intelligence,
01:05:37 it’s the efficiency.
01:05:39 With which you can adapt to new situations,
01:05:42 to truly new situations, not situations you’ve seen before.
01:05:45 Not situations that could be anticipated by your creators,
01:05:48 by the creators of the system, but truly new situations.
01:05:51 The efficiency with which you acquire new skills.
01:05:54 If you require, if in order to pick up a new skill,
01:05:58 you require a very extensive training
01:06:03 data set of most possible situations
01:06:05 that can occur in the practice of that skill,
01:06:08 then the system is not intelligent.
01:06:10 It is mostly just a lookup table.
01:06:15 Yeah.
01:06:16 Well, likewise, if in order to acquire a skill,
01:06:20 you need a human engineer to write down
01:06:23 a bunch of rules that cover most or every possible situation.
01:06:26 Likewise, the system is not intelligent.
01:06:29 The system is merely the output artifact
01:06:33 of a process that happens in the minds of the engineers that
01:06:39 are creating it.
01:06:40 It is encoding an abstraction that’s
01:06:44 produced by the human mind.
01:06:46 And intelligence would actually be
01:06:51 the process of autonomously producing this abstraction.
01:06:56 Yeah.
01:06:57 Not like if you take an abstraction
01:06:59 and you encode it on a piece of paper or in a computer program,
01:07:02 the abstraction itself is not intelligent.
01:07:05 What’s intelligent is the agent that’s
01:07:09 capable of producing these abstractions.
01:07:11 Yeah, it feels like there’s a little bit of a gray area.
01:07:16 Because you’re basically saying that deep learning forms
01:07:18 abstractions, too.
01:07:21 But those abstractions do not seem
01:07:24 to be effective for generalizing far outside of the things
01:07:29 that it’s already seen.
01:07:30 But generalize a little bit.
01:07:31 Yeah, absolutely.
01:07:32 No, deep learning does generalize a little bit.
01:07:34 Generalization is not binary.
01:07:36 It’s more like a spectrum.
01:07:38 Yeah.
01:07:38 And there’s a certain point, it’s a gray area,
01:07:40 but there’s a certain point where
01:07:42 there’s an impressive degree of generalization that happens.
01:07:47 No, I guess exactly what you were saying
01:07:50 is intelligence is how efficiently you’re
01:07:56 able to generalize far outside of the distribution of things
01:08:02 you’ve seen already.
01:08:03 Yes.
01:08:03 So it’s both the distance of how far you can,
01:08:07 how new, how radically new something is,
01:08:10 and how efficiently you’re able to deal with that.
01:08:12 So you can think of intelligence as a measure of an information
01:08:17 conversion ratio.
01:08:19 Imagine a space of possible situations.
01:08:23 And you’ve covered some of them.
01:08:27 So you have some amount of information
01:08:30 about your space of possible situations
01:08:32 that’s provided by the situations you already know.
01:08:34 And that’s, on the other hand, also provided
01:08:36 by the prior knowledge that the system brings
01:08:40 to the table, the prior knowledge embedded
01:08:42 in the system.
01:08:43 So the system starts with some information
01:08:46 about the problem, about the task.
01:08:48 And it’s about going from that information
01:08:52 to a program, what we would call a skill program,
01:08:55 a behavioral program, that can cover a large area
01:08:58 of possible situation space.
01:09:01 And essentially, the ratio between that area
01:09:04 and the amount of information you start with is intelligence.
01:09:09 So a very smart agent can make efficient use
01:09:14 of very little information about a new problem
01:09:17 and very little prior knowledge as well
01:09:19 to cover a very large area of potential situations
01:09:23 in that problem without knowing what these future new situations
01:09:28 are going to be.
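As a rough editor's sketch of the ratio Chollet is describing here (a loose schematic, not the formal algorithmic-information-theoretic definition given in his paper):

$$
\text{Intelligence} \;\approx\; \frac{\text{breadth of situation space covered by the acquired skill program}}{\text{prior knowledge} \;+\; \text{experience (training data)}}
$$

Higher values mean more coverage of future, unseen situations bought with less information up front.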
01:09:31 So one of the other big things you talk about in the paper,
01:09:34 we’ve talked about a little bit already,
01:09:36 but let’s talk about it some more,
01:09:37 is the actual tests of intelligence.
01:09:41 So if we look at human and machine intelligence,
01:09:45 do you think tests of intelligence
01:09:48 should be different for humans and machines,
01:09:50 or how we think about testing of intelligence?
01:09:54 Are these fundamentally the same kind of intelligences
01:09:59 that we’re after, and therefore, the tests should be similar?
01:10:03 So if your goal is to create AIs that are more humanlike,
01:10:10 then it would be super valuable, obviously,
01:10:12 to have a test that’s universal, that applies to both AIs
01:10:18 and humans, so that you could establish
01:10:20 a comparison between the two, that you
01:10:23 could tell exactly how intelligent,
01:10:27 in terms of human intelligence, a given system is.
01:10:30 So that said, the constraints that
01:10:34 apply to artificial intelligence and to human intelligence
01:10:37 are very different.
01:10:39 And your test should account for this difference.
01:10:44 Because if you look at artificial systems,
01:10:47 it’s always possible for an experimenter
01:10:50 to buy arbitrary levels of skill at arbitrary tasks,
01:10:55 either by injecting hardcoded prior knowledge
01:11:01 into the system via rules and so on that
01:11:05 come from the human mind, from the minds of the programmers,
01:11:08 and also buying higher levels of skill
01:11:12 just by training on more data.
01:11:15 For instance, you could generate an infinity
01:11:17 of different Go games, and you could train a Go playing
01:11:21 system that way, but you could not directly compare it
01:11:26 to human Go playing skills.
01:11:28 Because a human that plays Go had
01:11:31 to develop that skill in a very constrained environment.
01:11:34 They had a limited amount of time.
01:11:36 They had a limited amount of energy.
01:11:38 And of course, this started from a different set of priors.
01:11:42 This started from innate human priors.
01:11:48 So I think if you want to compare
01:11:49 the intelligence of two systems, like the intelligence of an AI
01:11:53 and the intelligence of a human, you have to control for priors.
01:11:59 You have to start from the same set of knowledge priors
01:12:04 about the task, and you have to control
01:12:06 for experience, that is to say, for training data.
01:12:11 So what’s priors?
01:12:14 So prior is whatever information you
01:12:18 have about a given task before you
01:12:21 start learning about this task.
01:12:23 And how’s that different from experience?
01:12:25 Well, experience is acquired.
01:12:28 So for instance, if you’re trying to play Go,
01:12:31 your experience with Go is all the Go games
01:12:33 you’ve played, or you’ve seen, or you’ve simulated
01:12:37 in your mind, let’s say.
01:12:38 And your priors are things like, well,
01:12:42 Go is a game on the 2D grid.
01:12:45 And we have lots of hardcoded priors
01:12:48 about the organization of 2D space.
01:12:53 And the rules of how the dynamics of the physics
01:12:58 of this game in this 2D space?
01:12:59 Yes.
01:13:00 And the idea that you have what winning is.
01:13:04 Yes, exactly.
01:13:05 And other board games can also share some similarities with Go.
01:13:09 And if you’ve played these board games, then,
01:13:12 with respect to the game of Go, that
01:13:13 would be part of your priors about the game.
01:13:16 Well, it’s interesting to think about the game of Go
01:13:18 is how many priors are actually brought to the table.
01:13:22 When you look at self play, reinforcement learning based
01:13:27 mechanisms that do learning, it seems
01:13:29 like the number of priors is pretty low.
01:13:31 Yes.
01:13:31 But you’re saying you should be expec…
01:13:32 There is a 2D spatial prior in the convnet.
01:13:35 Right.
01:13:36 But you should be clear about making
01:13:39 those priors explicit.
01:13:40 Yes.
01:13:41 So in particular, I think if your goal
01:13:44 is to measure a humanlike form of intelligence,
01:13:47 then you should clearly establish
01:13:49 that you want the AI you’re testing
01:13:52 to start from the same set of priors that humans start with.
01:13:57 Right.
01:13:58 So I mean, to me personally, but I think to a lot of people,
01:14:02 the human side of things is very interesting.
01:14:05 So testing intelligence for humans.
01:14:08 What do you think is a good test of human intelligence?
01:14:14 Well, that’s the question that psychometrics is interested in.
01:14:19 There’s an entire subfield of psychology
01:14:22 that deals with this question.
01:14:23 So what’s psychometrics?
01:14:25 Psychometrics is the subfield of psychology
01:14:27 that tries to measure, quantify aspects of the human mind.
01:14:33 So in particular, our cognitive abilities, intelligence,
01:14:36 and personality traits as well.
01:14:39 So, it might be a weird question,
01:14:43 but what are the first principles
01:14:49 that psychometrics operates on?
01:14:52 What are the priors it brings to the table?
01:14:55 So it’s a field with a fairly long history.
01:15:01 So psychology sometimes gets a bad reputation
01:15:05 for not having very reproducible results.
01:15:09 And psychometrics has actually some fairly solidly
01:15:12 reproducible results.
01:15:14 So the ideal goal of the field is that a test
01:15:17 should be reliable, which is a notion tied to reproducibility.
01:15:23 It should be valid, meaning that it should actually
01:15:26 measure what you say it measures.
01:15:30 So for instance, if you’re saying
01:15:32 that you’re measuring intelligence,
01:15:34 then your test results should be correlated
01:15:36 with things that you expect to be correlated
01:15:39 with intelligence like success in school
01:15:41 or success in the workplace and so on.
01:15:43 Should be standardized, meaning that you
01:15:46 can administer your tests to many different people
01:15:48 in the same conditions.
01:15:50 And it should be free from bias.
01:15:52 Meaning that, for instance, if your test involves
01:15:57 the English language, then you have
01:15:59 to be aware that this creates a bias against people
01:16:02 who have English as their second language
01:16:04 or people who can’t speak English at all.
01:16:07 So of course, these principles for creating
01:16:10 psychometric tests are very much an ideal.
01:16:13 I don’t think every psychometric test is really either
01:16:17 reliable, valid, or free from bias.
01:16:22 But at least the field is aware of these weaknesses
01:16:25 and is trying to address them.
01:16:27 So it’s kind of interesting.
01:16:30 Ultimately, you’re only able to measure,
01:16:31 like you said previously, the skill.
01:16:34 But you’re trying to do a bunch of measures
01:16:36 of different skills that correlate,
01:16:38 as you mentioned, strongly with some general concept
01:16:41 of cognitive ability.
01:16:43 Yes, yes.
01:16:44 So what’s the G factor?
01:16:46 So right, there are many different kinds
01:16:48 of tests of intelligence.
01:16:50 And each of them is interested in different aspects
01:16:55 of intelligence.
01:16:56 Some of them will deal with language.
01:16:57 Some of them will deal with spatial vision,
01:17:00 maybe mental rotations, numbers, and so on.
01:17:04 When you run these very different tests at scale,
01:17:08 what you start seeing is that there
01:17:10 are clusters of correlations among test results.
01:17:14 So for instance, if you look at homework at school,
01:17:19 you will see that people who do well at math
01:17:21 are also likely statistically to do well in physics.
01:17:25 And what’s more, people who do well at math and physics
01:17:30 are also statistically likely to do well
01:17:32 in things that sound completely unrelated,
01:17:35 like writing an English essay, for instance.
01:17:38 And so when you see clusters of correlations
01:17:42 in statistical terms, you would explain them
01:17:46 with the latent variable.
01:17:47 And the latent variable that would, for instance, explain
01:17:51 the relationship between being good at math
01:17:53 and being good at physics would be cognitive ability.
01:17:57 And the G factor is the latent variable
01:18:00 that explains the fact that for every test of intelligence
01:18:05 that you can come up with, results on this test
01:18:09 end up being correlated.
01:18:10 So there is some single unique variable
01:18:16 that explains these correlations.
01:18:17 That’s the G factor.
01:18:18 So it’s a statistical construct.
01:18:20 It’s not really something you can directly measure,
01:18:23 for instance, in a person.
01:18:25 But it’s there.
01:18:26 But it’s there.
01:18:27 It’s there.
01:18:27 It’s there at scale.
01:18:28 And that’s also one thing I want to mention about psychometrics.
01:18:33 Like when you talk about measuring intelligence
01:18:36 in humans, for instance, some people
01:18:38 get a little bit worried.
01:18:40 They will say, that sounds dangerous.
01:18:41 Maybe that sounds potentially discriminatory, and so on.
01:18:44 And they’re not wrong.
01:18:46 And the thing is, personally, I’m
01:18:48 not interested in psychometrics as a way
01:18:51 to characterize one individual person.
01:18:54 Like if I get your psychometric personality
01:18:59 assessments or your IQ, I don’t think that actually
01:19:01 tells me much about you as a person.
01:19:05 I think psychometrics is most useful as a statistical tool.
01:19:10 So it’s most useful at scale.
01:19:12 It’s most useful when you start getting test results
01:19:15 for a large number of people.
01:19:17 And you start cross correlating these test results.
01:19:20 Because that gives you information
01:19:23 about the structure of the human mind,
01:19:26 in particular about the structure
01:19:28 of human cognitive abilities.
01:19:29 So at scale, psychometrics paints a certain picture
01:19:34 of the human mind.
01:19:35 And that’s interesting.
01:19:37 And that’s what’s relevant to AI, the structure
01:19:39 of human cognitive abilities.
01:19:41 Yeah, it gives you an insight into it.
01:19:42 I mean, to me, I remember when I learned about G factor,
01:19:45 it seemed like it would be impossible for it
01:19:52 to be real, even as a statistical variable.
01:19:55 Like it felt kind of like astrology.
01:19:59 Like it’s like wishful thinking among psychologists.
01:20:01 But the more I learned, the more I realized that there's something to it.
01:20:05 I mean, I'm not sure what to make of it, about human beings,
01:20:07 the fact that the G factor is a thing.
01:20:10 There’s a commonality across all of human species,
01:20:13 that there does seem to be a strong correlation
01:20:15 between cognitive abilities.
01:20:17 That’s kind of fascinating, actually.
01:20:19 So human cognitive abilities have a structure.
01:20:22 Like the most mainstream theory of the structure
01:20:25 of cognitive abilities is called CHC theory.
01:20:28 It’s Cattell, Horn, Carroll.
01:20:30 It’s named after the three psychologists who
01:20:33 contributed key pieces of it.
01:20:35 And it describes cognitive abilities
01:20:38 as a hierarchy with three levels.
01:20:41 And at the top, you have the G factor.
01:20:43 Then you have broad cognitive abilities,
01:20:46 for instance fluid intelligence, that
01:20:49 encompass a broad set of possible kinds of tasks
01:20:54 that are all related.
01:20:57 And then you have narrow cognitive abilities
01:20:59 at the last level, which is closer to task specific skill.
01:21:04 And there are actually different theories of the structure
01:21:09 of cognitive abilities that just emerge
01:21:10 from different statistical analysis of IQ test results.
01:21:14 But they all describe a hierarchy with a kind of G
01:21:18 factor at the top.
01:21:21 And you’re right that the G factor,
01:21:23 it’s not quite real in the sense that it’s not something
01:21:27 you can observe and measure, like your height,
01:21:29 for instance.
01:21:30 But it’s real in the sense that you
01:21:32 see it in a statistical analysis of the data.
01:21:37 One thing I want to mention is that the fact
01:21:39 that there is a G factor does not really
01:21:41 mean that human intelligence is general in a strong sense.
01:21:45 It does not mean human intelligence
01:21:47 can be applied to any problem at all,
01:21:50 and that someone who has a high IQ
01:21:52 is going to be able to solve any problem at all.
01:21:54 That’s not quite what it means.
01:21:55 I think one popular analogy to understand it
01:22:00 is the sports analogy.
01:22:03 If you consider the concept of physical fitness,
01:22:06 it’s a concept that’s very similar to intelligence
01:22:09 because it’s a useful concept.
01:22:11 It’s something you can intuitively understand.
01:22:14 Some people are fit, maybe like you.
01:22:17 Some people are not as fit, maybe like me.
01:22:20 But none of us can fly.
01:22:22 Absolutely.
01:22:23 It’s constrained to a specific set of skills.
01:22:25 Even if you’re very fit, that doesn’t
01:22:27 mean you can do anything at all in any environment.
01:22:31 You obviously cannot fly.
01:22:32 You cannot survive at the bottom of the ocean and so on.
01:22:36 And if you were a scientist and you
01:22:38 wanted to precisely define and measure physical fitness
01:22:42 in humans, then you would come up with a battery of tests.
01:22:47 You would have running 100 meter, playing soccer,
01:22:51 playing table tennis, swimming, and so on.
01:22:54 And if you ran these tests over many different people,
01:22:58 you would start seeing correlations in test results.
01:23:01 For instance, people who are good at soccer
01:23:03 are also good at sprinting.
01:23:05 And you would explain these correlations
01:23:08 with physical abilities that are strictly
01:23:11 analogous to cognitive abilities.
01:23:14 And then you would start also observing correlations
01:23:17 between biological characteristics,
01:23:21 like maybe lung volume is correlated with being
01:23:24 a fast runner, for instance, in the same way
01:23:27 that there are neurophysiological correlates of cognitive
01:23:32 abilities.
01:23:33 And at the top of the hierarchy of physical abilities
01:23:38 that you would be able to observe,
01:23:39 you would have a G factor, a physical G factor, which
01:23:43 would map to physical fitness.
01:23:45 And as you just said, that doesn’t
01:23:47 mean that people with high physical fitness can’t fly.
01:23:51 It doesn’t mean human morphology and human physiology
01:23:54 is universal.
01:23:55 It’s actually super specialized.
01:23:57 We can only do the things that we were evolved to do.
01:24:04 We are not appropriate to, you could not
01:24:08 exist on Venus or Mars or in the void of space
01:24:11 or the bottom of the ocean.
01:24:12 So that said, one thing that’s really striking and remarkable
01:24:17 is that our morphology generalizes
01:24:23 far beyond the environments that we evolved for.
01:24:27 Like in a way, you could say we evolved to run after prey
01:24:31 in the savanna, right?
01:24:32 That’s very much where our human morphology comes from.
01:24:36 And that said, we can do a lot of things
01:24:40 that are completely unrelated to that.
01:24:42 We can climb mountains.
01:24:44 We can swim across lakes.
01:24:47 We can play table tennis.
01:24:48 I mean, table tennis is very different from what
01:24:51 we were evolved to do, right?
01:24:53 So our morphology, our bodies, our sensory and motor
01:24:56 affordances have a degree of generality
01:24:59 that is absolutely remarkable, right?
01:25:02 And I think cognition is very similar to that.
01:25:05 Our cognitive abilities have a degree of generality
01:25:08 that goes far beyond what the mind was initially
01:25:11 supposed to do, which is why we can play music and write
01:25:14 novels and go to Mars and do all kinds of crazy things.
01:25:18 But it’s not universal in the same way
01:25:20 that human morphology and our body
01:25:23 is not appropriate for actually most of the universe by volume.
01:25:27 In the same way, you could say that the human mind is not
01:25:29 really appropriate for most of problem space,
01:25:32 potential problem space by volume.
01:25:35 So we have very strong cognitive biases, actually,
01:25:39 that mean that there are certain types of problems
01:25:42 that we handle very well and certain types of problems
01:25:45 that we are completely unadapted for.
01:25:48 So that’s really how we’d interpret the G factor.
01:25:52 It’s not a sign of strong generality.
01:25:56 It’s really just the broadest cognitive ability.
01:26:01 But our abilities, whether we are
01:26:03 talking about sensory motor abilities or cognitive
01:26:05 abilities, they still remain very specialized
01:26:09 in the human condition, right?
01:26:12 Within the constraints of the human cognition,
01:26:16 they’re general.
01:26:18 Yes, absolutely.
01:26:19 But the constraints, as you’re saying, are very limited.
01:26:22 I think what’s limiting.
01:26:23 So we evolved our cognition and our body
01:26:26 evolved in very specific environments.
01:26:29 Because our environment was so variable, fast changing,
01:26:32 and so unpredictable, part of the constraints
01:26:35 that drove our evolution is generality itself.
01:26:39 So we were, in a way, evolved to be able to improvise
01:26:42 in all kinds of physical or cognitive environments.
01:26:47 And for this reason, it turns out
01:26:49 that the minds and bodies that we ended up with
01:26:55 can be applied to much, much broader scope
01:26:58 than what they were evolved for.
01:27:00 And that’s truly remarkable.
01:27:01 And that’s a degree of generalization
01:27:03 that is far beyond anything you can see in artificial systems
01:27:07 today.
01:27:10 That said, it does not mean that human intelligence
01:27:14 is anywhere universal.
01:27:16 Yeah, it’s not general.
01:27:18 It’s a kind of exciting topic for people,
01:27:21 even outside of artificial intelligence, is IQ tests.
01:27:27 I think it’s Mensa, whatever.
01:27:29 There’s different degrees of difficulty for questions.
01:27:32 We talked about this offline a little bit, too,
01:27:34 about difficult questions.
01:27:37 What makes a question on an IQ test more difficult or less
01:27:42 difficult, do you think?
01:27:43 So the thing to keep in mind is that there’s
01:27:46 no such thing as a question that’s intrinsically difficult.
01:27:51 It has to be difficult with respect to the things you
01:27:54 already know and the things you can already do, right?
01:27:58 So in terms of an IQ test question,
01:28:02 typically it would be structured, for instance,
01:28:05 as a set of demonstration input and output pairs, right?
01:28:11 And then you would be given a test input, a prompt,
01:28:15 and you would need to recognize or produce
01:28:18 the corresponding output.
01:28:20 And in that narrow context, you could say a difficult question
01:28:26 is a question where the input prompt is
01:28:31 very surprising and unexpected, given the training examples.
01:28:36 Just even the nature of the patterns
01:28:38 that you’re observing in the input prompt.
01:28:40 For instance, let’s say you have a rotation problem.
01:28:43 You must rotate the shape by 90 degrees.
01:28:46 If I give you two examples and then I give you one prompt,
01:28:50 which is actually one of the two training examples,
01:28:53 then there is zero generalization difficulty
01:28:55 for the task.
01:28:56 It’s actually a trivial task.
01:28:57 You just recognize that it’s one of the training examples,
01:29:00 and you produce the same answer.
01:29:02 Now, if it’s a more complex shape,
01:29:05 there is a little bit more generalization,
01:29:07 but it remains that you are still
01:29:09 doing the same thing at test time
01:29:12 as you were being demonstrated at training time.
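A toy sketch of the format being described: a few demonstration input-output pairs whose hidden rule (here, a 90-degree rotation) has to be applied to a new prompt. This is purely illustrative and not taken from any actual IQ test:

```python
import numpy as np

# Hidden rule of this toy task: rotate the grid by 90 degrees (counter-clockwise).
rule = lambda grid: np.rot90(grid)

# Demonstration input-output pairs shown to the test taker.
demo_inputs = [np.array([[1, 0], [0, 0]]), np.array([[0, 2], [0, 2]])]
demos = [(x, rule(x)) for x in demo_inputs]

# If the test prompt were identical to a demo input, there would be zero
# generalization difficulty; a genuinely new input forces you to infer the rule.
prompt = np.array([[3, 3], [0, 3]])
expected_answer = rule(prompt)
print(expected_answer)
```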
01:29:15 A difficult task starts to require some amount of test
01:29:20 time adaptation, some amount of improvisation, right?
01:29:25 So consider, I don’t know, you’re
01:29:29 teaching a class on quantum physics or something.
01:29:34 If you wanted to test the understanding that students
01:29:40 have of the material, you would come up
01:29:42 with an exam that’s very different from anything
01:29:47 they’ve seen on the internet when they were cramming.
01:29:51 On the other hand, if you wanted to make it easy,
01:29:54 you would just give them something
01:29:56 that’s very similar to the mock exams
01:30:00 that they’ve taken, something that’s
01:30:03 just a simple interpolation of questions
01:30:05 that they’ve already seen.
01:30:07 And so that would be an easy exam.
01:30:09 It’s very similar to what you’ve been trained on.
01:30:11 And a difficult exam is one that really probes your understanding
01:30:15 because it forces you to improvise.
01:30:18 It forces you to do things that are
01:30:22 different from what you were exposed to before.
01:30:24 So that said, it doesn’t mean that the exam that
01:30:29 requires improvisation is intrinsically hard, right?
01:30:32 Because maybe you’re a quantum physics expert.
01:30:35 So when you take the exam, this is actually
01:30:37 stuff that, despite being new to the students,
01:30:40 it’s not new to you, right?
01:30:42 So it can only be difficult with respect
01:30:46 to what the test taker already knows
01:30:49 and with respect to the information
01:30:51 that the test taker has about the task.
01:30:54 So that’s what I mean by controlling for priors
01:30:57 what the information you bring to the table.
01:30:59 And the experience.
01:31:00 And the experience, which is to train data.
01:31:02 So in the case of the quantum physics exam,
01:31:05 that would be all the course material itself
01:31:09 and all the mock exams that students
01:31:11 might have taken online.
01:31:12 Yeah, it’s interesting because I’ve also sent you an email.
01:31:17 I asked you, I’ve been in just this curious question
01:31:21 of what’s a really hard IQ test question.
01:31:27 And I’ve been talking to also people
01:31:30 who have designed IQ tests.
01:31:32 There’s a few folks on the internet, it’s like a thing.
01:31:34 People are really curious about it.
01:31:36 First of all, most of the IQ tests they designed,
01:31:39 they like religiously protect against the correct answers.
01:31:45 Like you can’t find the correct answers anywhere.
01:31:48 In fact, the question is ruined once you know,
01:31:50 even like the approach you’re supposed to take.
01:31:53 So they’re very…
01:31:54 That said, the approach is implicit in the training examples.
01:31:58 So if you release the training examples, it’s over.
01:32:02 Which is why in Arc, for instance,
01:32:04 there is a test set that is private and no one has seen it.
01:32:09 No, for really tough IQ questions, it's not obvious.
01:32:13 It's not, because of the ambiguity.
01:32:17 Like, I mean, we'll have to look through them,
01:32:20 but with some number sequences and so on,
01:32:22 it's not completely clear.
01:32:25 So like you can get a sense, but there’s like some,
01:32:30 you know, when you look at a number sequence, I don’t know,
01:32:36 like your Fibonacci number sequence,
01:32:37 if you look at the first few numbers,
01:32:39 that sequence could be completed in a lot of different ways.
01:32:42 And you know, some are, if you think deeply,
01:32:45 are more correct than others.
01:32:46 Like there’s a kind of intuitive simplicity
01:32:51 and elegance to the correct solution.
01:32:53 Yes.
01:32:53 I am personally not a fan of ambiguity
01:32:56 in test questions actually,
01:32:58 but I think you can have difficulty
01:33:01 without requiring ambiguity simply by making the test
01:33:05 require a lot of extrapolation over the training examples.
01:33:09 But the beautiful question is difficult,
01:33:13 but gives away everything
01:33:14 when you give the training example.
01:33:17 Basically, yes.
01:33:18 Meaning that, so the tests I’m interested in creating
01:33:24 are not necessarily difficult for humans
01:33:27 because human intelligence is the benchmark.
01:33:31 They’re supposed to be difficult for machines
01:33:34 in ways that are easy for humans.
01:33:36 Like I think an ideal test of human and machine intelligence
01:33:40 is a test that is actionable,
01:33:44 that highlights the need for progress,
01:33:48 and that highlights the direction
01:33:50 in which you should be making progress.
01:33:51 I think we’ll talk about the ARC challenge
01:33:54 and the test you’ve constructed
01:33:55 and you have these elegant examples.
01:33:58 I think that highlight,
01:33:59 like this is really easy for us humans,
01:34:01 but it’s really hard for machines.
01:34:04 But on the, you know, the designing an IQ test
01:34:09 for IQs of like higher than 160 and so on,
01:34:13 you have to say, you have to take that
01:34:15 and put it on steroids, right?
01:34:16 You have to think like, what is hard for humans?
01:34:19 And that’s a fascinating exercise in itself, I think.
01:34:25 And it was an interesting question
01:34:27 of what it takes to create a really hard question for humans
01:34:32 because you again have to do the same process
01:34:36 as you mentioned, which is, you know,
01:34:39 something basically where, given the experience
01:34:45 that you're likely to have encountered
01:34:46 throughout your whole life,
01:34:48 even if you've prepared for IQ tests,
01:34:51 which is a big challenge,
01:34:53 this will still be novel for you.
01:34:55 Yeah, I mean, novelty is a requirement.
01:34:58 You should not be able to practice for the questions
01:35:02 that you’re gonna be tested on.
01:35:03 That’s important because otherwise what you’re doing
01:35:06 is not exhibiting intelligence.
01:35:08 What you’re doing is just retrieving
01:35:10 what you’ve been exposed before.
01:35:12 It’s the same thing as deep learning model.
01:35:14 If you train a deep learning model
01:35:15 on all the possible answers, then it will ace your test
01:35:20 in the same way that, you know,
01:35:24 a stupid student can still ace the test
01:35:28 if they cram for it.
01:35:30 They memorize, you know,
01:35:32 a hundred different possible mock exams.
01:35:34 And then they hope that the actual exam
01:35:37 will be a very simple interpolation of the mock exams.
01:35:41 And that student could just be a deep learning model
01:35:43 at that point.
01:35:44 But you can actually do that
01:35:45 without any understanding of the material.
01:35:48 And in fact, many students pass their exams
01:35:50 in exactly this way.
01:35:51 And if you want to avoid that,
01:35:53 you need an exam that’s unlike anything they’ve seen
01:35:56 that really probes their understanding.
01:36:00 So how do we design an IQ test for machines,
01:36:05 an intelligent test for machines?
01:36:07 All right, so in the paper I outline
01:36:10 a number of requirements that you expect of such a test.
01:36:14 And in particular, we should start by acknowledging
01:36:19 the priors that we expect to be required
01:36:23 in order to perform the test.
01:36:25 So we should be explicit about the priors, right?
01:36:28 And if the goal is to compare machine intelligence
01:36:31 and human intelligence,
01:36:32 then we should assume human cognitive priors, right?
01:36:36 And secondly, we should make sure that we are testing
01:36:42 for skill acquisition ability,
01:36:44 skill acquisition efficiency in particular,
01:36:46 and not for skill itself.
01:36:48 Meaning that every task featured in your test
01:36:51 should be novel and should not be something
01:36:54 that you can anticipate.
01:36:55 So for instance, it should not be possible
01:36:57 to brute force the space of possible questions, right?
01:37:02 To pre generate every possible question and answer.
01:37:06 So it should be tasks that cannot be anticipated,
01:37:10 not just by the system itself,
01:37:12 but by the creators of the system, right?
01:37:15 Yeah, you know what’s fascinating?
01:37:17 I mean, one of my favorite aspects of the paper
01:37:20 and the work you do with the ARC challenge
01:37:22 is the process of making priors explicit.
01:37:28 Just even that act alone is a really powerful one.
01:37:33 Like, what are the priors? It's a really powerful question
01:37:39 to ask of us humans.
01:37:40 What are the priors that we bring to the table?
01:37:44 So the next step is like, once you have those priors,
01:37:46 how do you use them to solve a novel task?
01:37:50 But like, just even making the priors explicit
01:37:52 is a really difficult and really powerful step.
01:37:56 And that’s like visually beautiful
01:37:58 and conceptually philosophically beautiful part
01:38:01 of the work you did with, and I guess continue to do
01:38:06 probably with the paper and the ARC challenge.
01:38:08 Can you talk about some of the priors
01:38:10 that we’re talking about here?
01:38:12 Yes, so a researcher who has done a lot of work
01:38:15 on what exactly are the knowledge priors
01:38:19 that are innate to humans is Elizabeth Spelke from Harvard.
01:38:26 So she developed the core knowledge theory,
01:38:30 which outlines four different core knowledge systems.
01:38:36 So systems of knowledge that we are basically
01:38:39 either born with or that we are hardwired
01:38:43 to acquire very early on in our development.
01:38:47 And there’s no strong distinction between the two.
01:38:52 Like if you are primed to acquire
01:38:57 a certain type of knowledge in just a few weeks,
01:39:01 you might as well just be born with it.
01:39:03 It’s just part of who you are.
01:39:06 And so there are four different core knowledge systems.
01:39:09 Like the first one is the notion of objectness
01:39:13 and basic physics.
01:39:16 Like you recognize that something that moves
01:39:20 coherently, for instance, is an object.
01:39:23 So we intuitively, naturally, innately divide the world
01:39:28 into objects based on this notion of coherence,
01:39:31 physical coherence.
01:39:32 And in terms of elementary physics,
01:39:34 there’s the fact that objects can bump against each other
01:39:41 and the fact that they can occlude each other.
01:39:44 So these are things that we are essentially born with
01:39:48 or at least that we are going to be acquiring extremely early
01:39:52 because we’re really hardwired to acquire them.
01:39:55 So a bunch of points, pixels that move together,
01:39:59 are part of the same object.
01:40:02 Yes.
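A minimal sketch of that intuition: cells of a hypothetical "motion mask" that move together get grouped into the same object via connected components. This is just an editor's illustration of the prior using scipy, not a claim about how the brain implements it:

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary motion mask: 1 where pixels moved between two frames.
moved = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
])

# Pixels that move together (connected regions) are grouped into the same "object".
labels, num_objects = ndimage.label(moved)
print(num_objects)  # 2 candidate objects in this mask
print(labels)       # per-pixel object ids
```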
01:40:07 I don’t smoke weed, but if I did,
01:40:11 that’s something I could sit all night
01:40:13 and just think about, remember what I wrote in your paper,
01:40:15 just objectness, I wasn’t self aware, I guess,
01:40:19 of that particular prior.
01:40:23 That’s such a fascinating prior that like…
01:40:28 That’s the most basic one, but actually…
01:40:30 Objectness, just identity, just objectness.
01:40:34 It’s very basic, I suppose, but it’s so fundamental.
01:40:39 It is fundamental to human cognition.
01:40:41 Yeah.
01:40:42 The second prior that's also fundamental is agentness,
01:40:46 which is not a real word, but, agentness.
01:40:50 The fact that some of these objects
01:40:53 that you segment your environment into,
01:40:56 some of these objects are agents.
01:40:58 So what’s an agent?
01:41:00 It’s basically, it’s an object that has goals.
01:41:05 That has what?
01:41:06 That has goals, that is capable of pursuing goals.
01:41:09 So for instance, if you see two dots
01:41:12 moving in roughly synchronized fashion,
01:41:16 you will intuitively infer that one of the dots
01:41:19 is pursuing the other.
01:41:21 So that one of the dots is…
01:41:24 And one of the dots is an agent
01:41:27 and its goal is to avoid the other dot.
01:41:29 And one of the dots, the other dot is also an agent
01:41:32 and its goal is to catch the first dot.
01:41:35 Spelke has shown that babies as young as three months
01:41:40 identify agentness and goal directedness
01:41:45 in their environment.
01:41:46 Another prior is basic geometry and topology,
01:41:52 like the notion of distance,
01:41:53 the ability to navigate in your environment and so on.
01:41:57 This is something that is fundamentally hardwired
01:42:01 into our brain.
01:42:02 It’s in fact backed by very specific neural mechanisms,
01:42:07 like for instance, grid cells and place cells.
01:42:10 So it’s something that’s literally hard coded
01:42:15 at the neural level in our hippocampus.
01:42:19 And the last prior would be the notion of numbers.
01:42:23 Like numbers are not actually a cultural construct.
01:42:26 We are intuitively, innately able to do some basic counting
01:42:31 and to compare quantities.
01:42:34 So it doesn’t mean we can do arbitrary arithmetic.
01:42:37 Counting, the actual counting.
01:42:39 Counting, like counting one, two, three ish,
01:42:41 then maybe more than three.
01:42:43 You can also compare quantities.
01:42:45 If I give you three dots and five dots,
01:42:48 you can tell the side with five dots has more dots.
01:42:52 So this is actually an innate prior.
01:42:56 So that said, the list may not be exhaustive.
01:43:00 So Spelke is still, you know,
01:43:02 probing the potential existence of new knowledge systems.
01:43:08 For instance, knowledge systems that deal
01:43:12 with social relationships.
01:43:15 Yeah, I mean, and there could be…
01:43:17 Which is much less relevant to something like ARC
01:43:22 or IQ test and so on.
01:43:22 Right.
01:43:23 There could be stuff that’s like you said,
01:43:26 rotation, symmetry, is there like…
01:43:29 Symmetry is really interesting.
01:43:31 It’s very likely that there is, speaking about rotation,
01:43:34 that there is in the brain, a hard coded system
01:43:38 that is capable of performing rotations.
01:43:42 One famous experiment that people did in the…
01:43:45 I don't remember which year exactly,
01:43:48 but in the 70s, was that people found that
01:43:53 if you asked people, if you give them two different shapes
01:43:57 and one of the shapes is a rotated version
01:44:01 of the first shape, and you ask them,
01:44:03 is that shape a rotated version of the first shape or not?
01:44:07 What you see is that the time it takes people to answer
01:44:11 is linearly proportional, right, to the angle of rotation.
01:44:16 So it’s almost like you have somewhere in your brain
01:44:19 like a turntable with a fixed speed.
01:44:24 And if you want to know if two objects are a rotated version
01:44:28 of each other, you put the object on the turntable,
01:44:31 you let it move around a little bit,
01:44:34 and then you stop when you have a match.
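The linear relationship described here can be summarized schematically (with purely illustrative constants) as

$$
\text{RT}(\theta) \;\approx\; a + b\,\theta,
$$

where $a$ is a baseline response time and $b$ is the inverse of the speed of that internal "turntable"; the experimental finding is that response time grows roughly linearly with the rotation angle $\theta$.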
01:44:37 And that’s really interesting.
01:44:40 So what’s the ARC challenge?
01:44:42 So in the paper, I outline all these principles
01:44:47 that a good test of machine intelligence
01:44:50 and human intelligence should follow.
01:44:51 And the ARC challenge is one attempt
01:44:55 to embody as many of these principles as possible.
01:44:58 So I don’t think it’s anywhere near a perfect attempt, right?
01:45:03 It does not actually follow every principle,
01:45:06 but it is what I was able to do given the constraints.
01:45:10 So the format of ARC is very similar to classic IQ tests,
01:45:15 in particular Raven's Progressive Matrices.
01:45:18 Raven's?
01:45:18 Yeah, Raven's Progressive Matrices.
01:45:20 I mean, if you’ve done IQ tests in the past,
01:45:22 you know what that is, probably.
01:45:24 Or at least you’ve seen it, even if you
01:45:25 don’t know what it’s called.
01:45:26 And so you have a set of tasks, that’s what they’re called.
01:45:32 And for each task, you have training data,
01:45:37 which is a set of input and output pairs.
01:45:40 So an input or output pair is a grid of colors, basically.
01:45:45 The grid, the size of the grid is variable?
01:45:48 The size of the grid is variable.
01:45:51 And you’re given an input, and you must transform it
01:45:56 into the proper output.
01:45:59 And so you’re shown a few demonstrations
01:46:02 of a task in the form of existing input output pairs,
01:46:05 and then you’re given a new input.
01:46:06 And you must produce the correct output.
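For concreteness, here is a minimal sketch of how such a task can be represented and checked. It mirrors the kind of JSON layout used in the public ARC repository (train/test pairs of integer grids, one integer per color), though the exact field names and the toy rule below should be treated as the editor's assumptions:

```python
import json

# Hypothetical ARC-style task: each grid is a list of rows of color indices (0-9).
# The hidden transformation in this toy task recolors every non-zero cell to color 5.
task = {
    "train": [
        {"input": [[0, 1], [2, 0]], "output": [[0, 5], [5, 0]]},
        {"input": [[3, 0, 4]],      "output": [[5, 0, 5]]},
    ],
    "test": [
        {"input": [[0, 7], [7, 7]], "output": [[0, 5], [5, 5]]},
    ],
}

def candidate_solver(grid):
    # A hand-written guess at the rule, inferred from the demonstrations.
    return [[5 if cell else 0 for cell in row] for row in grid]

# A solver is scored only on whether it produces the exact output grid.
for pair in task["test"]:
    assert candidate_solver(pair["input"]) == pair["output"]
print(json.dumps(task, indent=2))
```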
01:46:12 And the assumption in ARC is that every task should only
01:46:22 require core knowledge priors, and should not
01:46:27 require any outside knowledge.
01:46:30 So for instance, no language, no English, nothing like this.
01:46:36 No concepts taken from our human experience,
01:46:41 like trees, dogs, cats, and so on.
01:46:44 So only reasoning tasks that are built on top
01:46:49 of core knowledge priors.
01:46:52 And some of the tasks are actually explicitly
01:46:56 trying to probe specific forms of abstraction.
01:47:02 Part of the reason why I wanted to create Arc
01:47:05 is I’m a big believer in when you’re
01:47:11 faced with a problem as murky as understanding
01:47:18 how to autonomously generate abstraction in a machine,
01:47:22 you have to coevolve the solution and the problem.
01:47:27 And so part of the reason why I designed Arc
01:47:29 was to clarify my ideas about the nature of abstraction.
01:47:34 And some of the tasks are actually
01:47:36 designed to probe bits of that theory.
01:47:39 And there are things that turn out
01:47:42 to be very easy for humans to perform, including young kids,
01:47:46 but turn out to be near impossible for machines.
01:47:50 So what have you learned from the nature of abstraction
01:47:53 from designing that?
01:47:58 Can you clarify what you mean?
01:47:59 One of the things you wanted to try to understand
01:48:02 was this idea of abstraction.
01:48:06 Yes, so clarifying my own ideas about abstraction
01:48:10 by forcing myself to produce tasks that
01:48:13 would require the ability to produce
01:48:17 that form of abstraction in order to solve them.
01:48:19 Got it.
01:48:20 OK, so, and by the way, people should just check it out.
01:48:24 I'll probably overlay it if you're watching the video part.
01:48:26 But the grid input output with the different colors
01:48:32 on the grid, that’s it.
01:48:34 I mean, it’s a very simple world,
01:48:36 but it’s kind of beautiful.
01:48:37 It’s very similar to classic IQ tests.
01:48:39 It’s not very original in that sense.
01:48:41 The main difference with IQ tests
01:48:43 is that we make the priors explicit, which is not
01:48:46 usually the case in IQ tests.
01:48:48 So you make it explicit that everything should only
01:48:50 be built on top of core knowledge priors.
01:48:53 I also think it’s generally more diverse than IQ tests
01:48:58 in general.
01:49:00 And it perhaps requires a bit more manual work
01:49:03 to produce solutions, because you
01:49:05 have to click around on a grid for a while.
01:49:08 Sometimes the grids can be as large as 30 by 30 cells.
01:49:12 So how did you come up, if you can reveal, with the questions?
01:49:18 What’s the process of the questions?
01:49:19 Was it mostly you that came up with the questions?
01:49:23 How difficult is it to come up with a question?
01:49:25 Is this scalable to a much larger number?
01:49:30 If we think, with IQ tests, you might not necessarily
01:49:33 want it to or need it to be scalable.
01:49:36 With machines, it’s possible, you
01:49:39 could argue, that it needs to be scalable.
01:49:41 So there are 1,000 questions, 1,000 tasks,
01:49:46 including the test set, the private test set.
01:49:49 I think it’s fairly difficult in the sense
01:49:51 that a big requirement is that every task should
01:49:54 be novel and unique and unpredictable.
01:50:00 You don’t want to create your own little world that
01:50:04 is simple enough that it would be possible for a human
01:50:08 to reverse engineer it and write down
01:50:12 an algorithm that could generate every possible ARC
01:50:15 task and its solution.
01:50:17 So that would completely invalidate the test.
01:50:19 So you’re constantly coming up with new stuff.
01:50:21 Yeah, you need a source of novelty,
01:50:24 of unfakeable novelty.
01:50:27 And one thing I found is that, as a human,
01:50:32 you are not a very good source of unfakeable novelty.
01:50:36 And so you have to pace the creation of these tasks
01:50:40 quite a bit.
01:50:41 There are only so many unique tasks
01:50:42 that you can do in a given day.
01:50:45 So that means coming up with truly original new ideas.
01:50:49 Did psychedelics help you at all?
01:50:52 No, I’m just kidding.
01:50:53 But I mean, that’s fascinating to think about.
01:50:55 So you would be walking or something like that.
01:50:58 Are you constantly thinking of something totally new?
01:51:02 Yes.
01:51:06 This is hard.
01:51:06 This is hard.
01:51:07 Yeah, I mean, I'm not saying I've done anywhere
01:51:10 near a perfect job at it.
01:51:12 There is some amount of redundancy,
01:51:14 and there are many imperfections in ARC.
01:51:16 So that said, you should consider
01:51:18 ARC as a work in progress.
01:51:19 It is not the definitive state.
01:51:25 The ARC tasks today are not the definitive state of the test.
01:51:29 I want to keep refining it in the future.
01:51:32 I also think it should be possible to open up
01:51:36 the creation of tasks to a broad audience
01:51:38 to do crowdsourcing.
01:51:40 That would involve several levels of filtering,
01:51:43 obviously.
01:51:44 But I think it’s possible to apply crowdsourcing
01:51:46 to develop a much bigger and much more diverse ARC data set.
01:51:51 That would also be free of potentially some
01:51:54 of my own personal biases.
01:51:56 Does there always need to be a part of ARC,
01:51:59 the test, that is hidden?
01:52:02 Yes, absolutely.
01:52:04 It is imperative that the test that you're
01:52:08 using to actually benchmark algorithms
01:52:11 is not accessible to the people developing these algorithms.
01:52:15 Because otherwise, what’s going to happen
01:52:16 is that the human engineers are just
01:52:19 going to solve the tasks themselves
01:52:21 and encode their solution in program form.
01:52:24 But that, again, what you’re seeing here
01:52:27 is the process of intelligence happening
01:52:30 in the mind of the human.
01:52:31 And then you’re just capturing its crystallized output.
01:52:35 But that crystallized output is not the same thing
01:52:38 as the process it generated.
01:52:40 It’s not intelligent in itself.
01:52:40 So, by the way, the idea of crowdsourcing it
01:52:43 is fascinating.
01:52:45 I think the creation of questions
01:52:49 is really exciting for people.
01:52:51 I think there are a lot of really brilliant people
01:52:53 out there that love to create this kind of stuff.
01:52:56 Yeah, one thing that kind of surprised me
01:52:59 that I wasn’t expecting is that lots of people
01:53:01 seem to actually enjoy ARC as a kind of game.
01:53:05 And I was releasing it as a test,
01:53:08 as a benchmark of fluid general intelligence.
01:53:14 And lots of people just, including kids,
01:53:17 just started enjoying it as a game.
01:53:18 So I think that’s encouraging.
01:53:20 Yeah, I’m fascinated by it.
01:53:22 There’s a world of people who create IQ questions.
01:53:25 I think that’s a cool activity for machines and for humans.
01:53:32 And humans are themselves fascinated
01:53:35 by taking the questions, like measuring
01:53:40 their own intelligence.
01:53:42 I mean, that’s just really compelling.
01:53:44 It’s really interesting to me, too.
01:53:47 One of the cool things about ARC, you said,
01:53:48 is kind of inspired by IQ tests or whatever
01:53:51 follows a similar process.
01:53:53 But because of its nature, because of the context
01:53:56 in which it lives, it immediately
01:53:59 forces you to think about the nature of intelligence
01:54:01 as opposed to just the test of your own.
01:54:04 It forces you to really think.
01:54:06 I don’t know if it’s within the question,
01:54:09 inherent in the question, or just the fact
01:54:11 that it lives in the test that’s supposed
01:54:13 to be a test of machine intelligence.
01:54:15 Absolutely.
01:54:15 As you solve ARC tasks as a human,
01:54:20 you will be forced to basically introspect
01:54:24 how you come up with solutions.
01:54:27 And that forces you to reflect on the human problem solving
01:54:32 process.
01:54:33 And the way your own mind generates
01:54:38 abstract representations of the problems it’s exposed to.
01:54:44 I think it’s due to the fact that the set of core knowledge
01:54:48 priors that ARC is built upon is so small.
01:54:52 It’s all a recombination of a very, very small set
01:54:58 of assumptions.
01:55:00 OK, so what’s the future of ARC?
01:55:02 So you held ARC as a challenge, as part
01:55:05 of like a Kaggle competition.
01:55:06 Yes.
01:55:07 Kaggle competition.
01:55:08 And what do you think?
01:55:11 Do you think that’s something that
01:55:13 continues for five years, 10 years,
01:55:16 like just continues growing?
01:55:17 Yes, absolutely.
01:55:18 So ARC itself will keep evolving.
01:55:21 So I’ve talked about crowdsourcing.
01:55:22 I think that’s a good avenue.
01:55:26 Another thing I’m starting is I’ll
01:55:29 be collaborating with folks from the psychology department
01:55:32 at NYU to do human testing on ARC.
01:55:36 And I think there are lots of interesting questions
01:55:38 you can start asking, especially as you start correlating
01:55:43 machine solutions to ARC tasks and the human characteristics
01:55:49 of solutions.
01:55:50 Like for instance, you can try to see
01:55:52 if there’s a relationship between the human perceived
01:55:55 difficulty of a task and the machine perceived.
01:55:59 Yes, and exactly some measure of machine
01:56:01 perceived difficulty.
01:56:02 Yeah, it’s a nice playground in which
01:56:04 to explore this very difference.
01:56:06 It’s the same thing as what we talked about with autonomous vehicles.
01:56:09 The things that could be difficult for humans
01:56:10 might be very different from the things that are difficult for machines.
01:56:13 And formalizing or making explicit that difference
01:56:17 in difficulty may teach us something fundamental
01:56:21 about intelligence.
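As a rough illustration of the analysis being described, here is a minimal sketch that correlates a human difficulty score with a machine difficulty score per task. The per-task numbers are invented placeholders (in practice they might come from human solve rates in a psychology study and machine solve rates from competition solvers), and Spearman rank correlation is just one reasonable choice.

```python
# Minimal sketch: correlate human-perceived and machine-perceived task difficulty.
# The difficulty scores below are invented placeholders for illustration only.
from scipy.stats import spearmanr

# hypothetical per-task difficulty estimates, higher = harder
human_difficulty   = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
machine_difficulty = [0.05, 0.7, 0.3, 0.95, 0.6, 0.5]

rho, p_value = spearmanr(human_difficulty, machine_difficulty)
print(f"Spearman rank correlation: {rho:.2f} (p={p_value:.3f})")
```

A rank correlation is used here rather than a linear one because only the ordering of difficulty matters, not the scale of either score.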
01:56:22 So one thing I think we did well with ARC
01:56:26 is that it’s proving to be a very actionable test in the sense
01:56:33 that machine performance on ARC started at very much zero
01:56:37 initially, while humans actually found the tasks very easy.
01:56:43 And that alone was like a big red flashing light saying
01:56:48 that something is going on and that we are missing something.
01:56:52 And at the same time, machine performance
01:56:55 did not stay at zero for very long.
01:56:57 Actually, within two weeks of the Kaggle competition,
01:57:00 we started having a nonzero number.
01:57:03 And now the state of the art is around 20%
01:57:06 of the test set solved.
01:57:10 And so ARC is actually a challenge
01:57:12 where our capabilities start at zero, which indicates
01:57:16 the need for progress.
01:57:18 But it’s also not an impossible challenge.
01:57:20 It’s not inaccessible.
01:57:21 You can start making progress basically right away.
01:57:25 At the same time, we are still very far
01:57:28 from having solved it.
01:57:29 And that’s actually a very positive outcome
01:57:32 of the competition is that the competition has proven
01:57:35 that there was no obvious shortcut to solve these tasks.
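To make that roughly 20% figure concrete, here is a minimal sketch of ARC-style exact-match scoring. It assumes the convention of allowing a small number of candidate output grids per test input and counting a task as solved only if one candidate matches the target exactly; the Kaggle competition used a similar top-3 rule, but treat the exact limit here as an assumption.

```python
# Minimal sketch of ARC-style exact-match scoring.
# Assumption: a task counts as solved if, for every test input, one of at most
# `max_attempts` predicted grids matches the target grid exactly.

def grids_equal(a, b):
    return a == b  # grids are plain lists of lists of ints, so == is exact match

def task_solved(predictions_per_test, targets, max_attempts=3):
    # predictions_per_test: for each test input, a list of candidate output grids
    for candidates, target in zip(predictions_per_test, targets):
        if not any(grids_equal(c, target) for c in candidates[:max_attempts]):
            return False
    return True

def score(dataset_predictions, dataset_targets):
    solved = sum(task_solved(p, t) for p, t in zip(dataset_predictions, dataset_targets))
    return solved / len(dataset_targets)  # fraction of tasks solved, e.g. roughly 0.20
```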
01:57:41 Yeah, so the test held up.
01:57:43 Yeah, exactly.
01:57:44 That was the primary reason to use the Kaggle competition,
01:57:46 to check if some clever person was
01:57:51 going to hack the benchmark. That did not happen.
01:57:56 The people who are solving the tasks,
01:58:01 well, in a way, they’re actually exploring some flaws of ARC
01:58:05 that we will need to address in the future,
01:58:07 in particular, they’re essentially anticipating
01:58:09 what sort of tasks may be contained in the test set.
01:58:13 Right, which is kind of, yeah, that’s the kind of hacking.
01:58:18 It’s human hacking of the test.
01:58:20 Yes, that said, with the state of the art
01:58:23 at around 20%, we’re still very, very far from human level,
01:58:28 which is closer to 100%.
01:58:30 And I do believe that it will take a while
01:58:35 until we reach human parity on ARC.
01:58:40 And that by the time we have human parity,
01:58:43 we will have AI systems that are probably
01:58:47 pretty close to human level in terms of general fluid
01:58:50 intelligence, which is, I mean, they are not
01:58:53 going to be necessarily human like.
01:58:54 They’re not necessarily, you would not necessarily
01:58:58 recognize them as being an AGI.
01:59:01 But they would be capable of a degree of generalization
01:59:06 that matches the generalization performed
01:59:09 by human fluid intelligence.
01:59:11 Sure.
01:59:11 I mean, this is a good point in terms
01:59:13 of general fluid intelligence to mention in your paper.
01:59:17 You describe different kinds of generalization:
01:59:21 local, broad, extreme.
01:59:23 And there’s a kind of a hierarchy that you form.
01:59:25 So when we say generalizations, what are we talking about?
01:59:31 What kinds are there?
01:59:33 Right, so generalization is a very old idea.
01:59:37 I mean, it’s even older than machine learning.
01:59:39 In the context of machine learning,
01:59:40 you say a system generalizes if it can make sense of an input
01:59:47 it has not yet seen.
01:59:49 And that’s what I would call system centric generalization,
01:59:54 generalization with respect to novelty
02:00:00 for the specific system you’re considering.
02:00:02 So I think a good test of intelligence
02:00:05 should actually deal with developer aware generalization,
02:00:09 which is slightly stronger than system centric generalization.
02:00:13 So developer aware generalization
02:00:16 would be the ability to generalize
02:00:19 to novelty or uncertainty that not only the system itself has
02:00:24 not had access to, but that the developer of the system
02:00:26 could not have had access to either.
02:00:29 That’s a fascinating meta definition.
02:00:32 So it’s basically the edge case thing
02:00:37 we’re talking about with autonomous vehicles.
02:00:39 Neither the developer nor the system
02:00:41 knows about the edge cases it may encounter.
02:00:44 So the system should be
02:00:47 able to generalize to things that nobody expected,
02:00:51 neither the designer of the training data,
02:00:54 nor obviously the contents of the training data.
02:00:59 That’s a fascinating definition.
02:01:00 So you can see degrees of generalization as a spectrum.
02:01:04 And the lowest level is what machine learning
02:01:08 is trying to do is the assumption
02:01:10 that any new situation is going to be sampled
02:01:15 from a static distribution of possible situations
02:01:18 and that you already have a representative sample
02:01:21 of the distribution.
02:01:22 That’s your training data.
02:01:23 And so in machine learning, you generalize to a new sample
02:01:26 from a known distribution.
02:01:28 And the ways in which your new sample will be new or different
02:01:34 are ways that are already understood by the developers
02:01:38 of the system.
02:01:39 So you are generalizing to known unknowns
02:01:43 for one specific task.
02:01:45 That’s what you would call robustness.
02:01:47 You are robust to things like noise, small variations,
02:01:50 and so on for one fixed known distribution
02:01:56 that you know through your training data.
02:01:59 And the higher degree would be flexibility
02:02:05 in machine intelligence.
02:02:06 So flexibility would be something
02:02:08 like an L5 self-driving car or maybe a robot that
02:02:12 can pass the coffee cup test, which
02:02:16 is the notion that you’d be given a random kitchen
02:02:21 somewhere in the country.
02:02:22 And you would have to go make a cup of coffee in that kitchen.
02:02:28 So flexibility would be the ability
02:02:30 to deal with unknown unknowns, so things that could not,
02:02:35 dimensions of variability that could not
02:02:37 possibly have been foreseen by the creators of the system
02:02:41 within one specific task.
02:02:42 So generalizing to the long tail of situations in self driving,
02:02:47 for instance, would be flexibility.
02:02:48 So you have robustness, flexibility, and finally,
02:02:51 you would have extreme generalization,
02:02:53 which is basically flexibility, but instead
02:02:57 of just considering one specific domain,
02:03:01 like driving or domestic robotics,
02:03:03 you’re considering an open ended range of possible domains.
02:03:07 So a robot would be capable of extreme generalization
02:03:12 if, let’s say, it’s designed and trained for cooking,
02:03:18 for instance.
02:03:19 And if I buy the robot and if it’s
02:03:24 able to teach itself gardening in a couple of weeks,
02:03:28 it would be capable of extreme generalization, for instance.
02:03:32 So the ultimate goal is extreme generalization.
02:03:34 Yes.
02:03:34 So creating a system that is so general that it could
02:03:40 essentially achieve human skill parity over arbitrary tasks
02:03:46 and arbitrary domains with the same level of improvisation
02:03:50 and adaptation power as humans when
02:03:53 it encounters new situations.
02:03:55 And it would do so over basically the same range
02:03:59 of possible domains and tasks as humans
02:04:02 and using essentially the same amount of training
02:04:05 experience of practice as humans would require.
02:04:07 That would be human level extreme generalization.
02:04:10 So I don’t actually think humans are anywhere
02:04:14 near the optimal intelligence bounds
02:04:19 if there is such a thing.
02:04:21 So I think for humans or in general?
02:04:23 In general.
02:04:25 I think it’s quite likely that there
02:04:26 is a hard limit to how intelligent any system can be.
02:04:33 But at the same time, I don’t think humans are anywhere
02:04:35 near that limit.
02:04:39 Yeah, last time I think we talked,
02:04:40 I think you had this idea that we’re only
02:04:43 as intelligent as the problems we face.
02:04:46 Sort of we are bounded by the problems.
02:04:51 In a way, yes.
02:04:51 We are bounded by our environments,
02:04:55 and we are bounded by the problems we try to solve.
02:04:58 Yeah.
02:04:59 Yeah.
02:04:59 What do you make of Neuralink and outsourcing
02:05:03 some of the brain power, like brain computer interfaces?
02:05:07 Do you think we can expand or augment our intelligence?
02:05:13 I am fairly skeptical of neural interfaces
02:05:18 because they are trying to fix one specific bottleneck
02:05:23 in human machine cognition, which
02:05:26 is the bandwidth bottleneck, input and output
02:05:29 of information in the brain.
02:05:31 And my perception of the problem is that bandwidth is not
02:05:37 at this time a bottleneck at all.
02:05:41 Meaning that we already have sensors
02:05:43 that enable us to take in far more information than what
02:05:48 we can actually process.
02:05:50 Well, to push back on that a little bit,
02:05:53 to sort of play devil’s advocate a little bit,
02:05:55 is if you look at the internet, Wikipedia, let’s say Wikipedia,
02:05:58 I would say that humans, after the advent of Wikipedia,
02:06:03 are much more intelligent.
02:06:05 Yes, I think that’s a good one.
02:06:07 But that’s also not about, that’s about externalizing
02:06:14 our intelligence via information processing systems,
02:06:18 external information processing systems,
02:06:19 which is very different from brain computer interfaces.
02:06:23 Right, but the question is whether if we have direct
02:06:27 access, if our brain has direct access to Wikipedia without
02:06:31 Your brain already has direct access to Wikipedia.
02:06:34 It’s on your phone.
02:06:35 And you have your hands and your eyes and your ears
02:06:39 and so on to access that information.
02:06:42 And the speed at which you can access it
02:06:44 Is bottlenecked by the cognition.
02:06:45 I think it’s already close, fairly close to optimal,
02:06:49 which is why speed reading, for instance, does not work.
02:06:53 The faster you read, the less you understand.
02:06:55 But maybe it’s because it uses the eyes.
02:06:58 So maybe.
02:07:00 So I don’t believe so.
02:07:01 I think the brain is very slow.
02:07:04 It typically operates, you know, the fastest things
02:07:07 that happen in the brain are at the level of 50 milliseconds.
02:07:11 Forming a conscious thought can potentially
02:07:14 take entire seconds, right?
02:07:16 And you can already read pretty fast.
02:07:19 So I think the speed at which you can take information in
02:07:23 and even the speed at which you can output information
02:07:26 can only be very incrementally improved.
02:07:29 Maybe there’s a question.
02:07:31 If you’re a very fast typer, if you’re a very trained typer,
02:07:34 the speed at which you can express your thoughts
02:07:36 is already the speed at which you can form your thoughts.
02:07:40 Right, so that’s kind of an idea that there are
02:07:44 fundamental bottlenecks to the human mind.
02:07:47 But it’s possible that everything we have
02:07:50 in the human mind is just to be able to survive
02:07:53 in the environment.
02:07:54 And there’s a lot more to expand.
02:07:58 Maybe, you know, you said the speed of the thought.
02:08:02 So I think augmenting human intelligence
02:08:06 is a very valid and very powerful avenue, right?
02:08:09 And that’s what computers are about.
02:08:12 In fact, that’s what all of culture and civilization
02:08:15 is about.
02:08:16 Our culture is externalized cognition
02:08:20 and we rely on culture to think constantly.
02:08:23 Yeah, I mean, that’s another, yeah.
02:08:26 Not just computers, not just phones and the internet.
02:08:29 I mean, all of culture, like language, for instance,
02:08:32 is a form of externalized cognition.
02:08:34 Books are obviously externalized cognition.
02:08:37 Yeah, that’s a good point.
02:08:38 And you can scale that externalized cognition
02:08:42 far beyond the capability of the human brain.
02:08:45 And you could say civilization itself
02:08:48 has capabilities that are far beyond any individual brain,
02:08:54 and it will keep scaling, because it’s not
02:08:55 bound by individual brains.
02:08:59 It’s a different kind of system.
02:09:01 Yeah, and that system includes nonhuman, nonhumans.
02:09:06 First of all, it includes all the other biological systems,
02:09:08 which are probably contributing to the overall intelligence
02:09:11 of the organism.
02:09:12 And then computers are part of it.
02:09:14 Nonhuman biological systems are probably not contributing much,
02:09:16 but AIs are definitely contributing to that.
02:09:19 Like Google search, for instance, is a big part of it.
02:09:24 Yeah, yeah, a huge part, a part that we probably can’t
02:09:29 introspect.
02:09:31 Like how the world has changed in the past 20 years,
02:09:33 it’s probably very difficult for us
02:09:35 to be able to understand until, of course,
02:09:38 whoever created the simulation we’re in is probably
02:09:41 doing metrics, measuring the progress.
02:09:44 There was probably a big spike in performance.
02:09:48 They’re enjoying this.
02:09:51 So what are your thoughts on the Turing test
02:09:56 and the Lobner Prize, which is one
02:10:00 of the most famous attempts at the test of artificial
02:10:05 intelligence by doing a natural language open dialogue test
02:10:11 that’s judged by humans as far as how well the machine did?
02:10:18 So I’m not a fan of the Turing test.
02:10:21 Itself or any of its variants for two reasons.
02:10:25 So first of all, it’s really copping out
02:10:34 of trying to define and measure intelligence
02:10:37 because it’s entirely outsourcing that
02:10:40 to a panel of human judges.
02:10:43 And these human judges, they may not themselves
02:10:47 have any proper methodology.
02:10:49 They may not themselves have any proper definition
02:10:52 of intelligence.
02:10:53 They may not be reliable.
02:10:54 So the Turing test is already failing
02:10:57 one of the core psychometric principles, which
02:10:59 is reliability because you have biased human judges.
02:11:04 It’s also violating the standardization requirement
02:11:07 and the freedom from bias requirement.
02:11:10 And so it’s really a cop out because you are outsourcing
02:11:13 everything that matters, which is precisely describing
02:11:17 intelligence and finding a standalone test to measure it.
02:11:22 You’re outsourcing everything to people.
02:11:25 So it’s really a cop out.
02:11:26 And by the way, we should keep in mind
02:11:28 that when Turing proposed the imitation game,
02:11:33 he did not mean for the imitation game
02:11:36 to be an actual goal for the field of AI
02:11:40 and an actual test of intelligence.
02:11:42 He was using the imitation game as a thought experiment
02:11:48 in a philosophical discussion in his 1950 paper.
02:11:53 He was trying to argue that theoretically, it
02:11:58 should be possible for something very much like the human mind,
02:12:04 indistinguishable from the human mind,
02:12:06 to be encoded in a Turing machine.
02:12:08 And at the time, that was a very daring idea.
02:12:14 It was stretching credulity.
02:12:16 But nowadays, I think it’s fairly well accepted
02:12:20 that the mind is an information processing system
02:12:22 and that you could probably encode it into a computer.
02:12:25 So another reason why I’m not a fan of this type of test
02:12:29 is that the incentives that it creates
02:12:34 are incentives that are not conducive to proper scientific
02:12:39 research.
02:12:40 If your goal is to trick, to convince a panel of human
02:12:45 judges that they are talking to a human,
02:12:48 then you have an incentive to rely on tricks
02:12:53 and prestidigitation.
02:12:56 In the same way that, let’s say, you’re doing physics
02:12:59 and you want to solve teleportation.
02:13:01 And what if the test that you set out to pass
02:13:04 is you need to convince a panel of judges
02:13:07 that teleportation took place?
02:13:09 And they’re just sitting there and watching what you’re doing.
02:13:12 And that is something that David
02:13:17 Copperfield could achieve in his show in Vegas.
02:13:22 And what he’s doing is very elaborate.
02:13:25 But it’s not physics.
02:13:29 It’s not making any progress in our understanding
02:13:31 of the universe.
02:13:32 To push back on that, it’s possible.
02:13:34 That’s the hope with these kinds of subjective evaluations,
02:13:39 that it’s easier to solve the problem generally
02:13:41 than it is to come up with tricks that convince
02:13:45 a large number of judges.
02:13:46 That’s the hope.
02:13:47 In practice, it turns out that it’s
02:13:49 very easy to deceive people in the same way
02:13:51 that you can do magic in Vegas.
02:13:54 You can actually very easily convince people
02:13:57 that they’re talking to a human when they’re actually
02:13:59 talking to an algorithm.
02:14:00 I just disagree.
02:14:01 I disagree with that.
02:14:02 I think it’s easy.
02:14:03 I would push.
02:14:05 No, it’s not easy.
02:14:07 It’s doable.
02:14:08 It’s very easy because we are biased.
02:14:12 We have theory of mind.
02:14:13 We are constantly projecting emotions, intentions, agentness.
02:14:21 Agentness is one of our core innate priors.
02:14:24 We are projecting these things on everything around us.
02:14:26 Like if you paint a smiley on a rock,
02:14:31 the rock becomes happy in our eyes.
02:14:33 And because we have this extreme bias that
02:14:36 permeates everything we see around us,
02:14:39 it’s actually pretty easy to trick people.
02:14:41 I just disagree with that.
02:14:44 I so totally disagree with that.
02:14:45 You brilliantly put it, the anthropomorphization
02:14:50 that we naturally do, the agentness.
02:14:53 Is that a real word?
02:14:53 No, it’s not a real word.
02:14:55 I like it.
02:14:56 But it’s a useful word.
02:14:57 It’s a useful word.
02:14:58 Let’s make it real.
02:14:59 It’s a huge help.
02:15:01 But I still think it’s really difficult to convince.
02:15:04 If you do like the Alexa Prize formulation,
02:15:07 where you talk for an hour, there are
02:15:10 formulations of the test you can create
02:15:12 where it’s very difficult.
02:15:13 So I like the Alexa Prize better because it’s more pragmatic.
02:15:18 It’s more practical.
02:15:19 It’s actually incentivizing developers
02:15:22 to create something that’s useful as a human machine
02:15:27 interface.
02:15:29 So that’s slightly better than just the imitation.
02:15:31 So I like it.
02:15:34 Your idea is like a test which hopefully
02:15:36 helps us in creating intelligent systems as a result.
02:15:39 Like if you create a system that passes it,
02:15:41 it’ll be useful for creating further intelligent systems.
02:15:44 Yes, at least.
02:15:46 Yeah.
02:15:47 Just to kind of comment, I’m a little bit surprised
02:15:51 how little inspiration people draw from the Turing test
02:15:55 today.
02:15:57 The media and the popular press might write about it
02:15:59 every once in a while.
02:16:00 The philosophers might talk about it.
02:16:03 But most engineers are not really inspired by it.
02:16:07 And I know you don’t like the Turing test,
02:16:11 but we’ll have this argument another time.
02:16:15 There’s something inspiring about it, I think.
02:16:18 As a philosophical device in a philosophical discussion,
02:16:21 I think there is something very interesting about it.
02:16:23 I don’t think it is in practical terms.
02:16:26 I don’t think it’s conducive to progress.
02:16:29 And one of the reasons why is that I
02:16:32 think being very human like, being
02:16:35 indistinguishable from a human is actually
02:16:37 the very last step in the creation of machine
02:16:40 intelligence.
02:16:41 The first AIs that will show strong generalization,
02:16:46 that will actually implement human-like broad cognitive
02:16:52 abilities, they will not actually behave or look
02:16:54 anything like humans.
02:16:58 Human likeness is the very last step in that process.
02:17:01 And so a good test is a test that
02:17:03 points you towards the first step on the ladder,
02:17:07 not towards the top of the ladder.
02:17:08 So to push back on that, I usually
02:17:11 agree with you on most things.
02:17:13 I remember you, I think at some point,
02:17:15 tweeting something about the Turing test
02:17:17 being counterproductive
02:17:19 or something like that.
02:17:20 And I think a lot of very smart people agree with that.
02:17:23 I, computationally speaking a not very smart person,
02:17:31 disagree with that.
02:17:32 Because I think there’s some magic
02:17:33 to the interactivity with other humans.
02:17:36 So to play devil’s advocate on your statement,
02:17:39 it’s possible that in order to demonstrate
02:17:42 the generalization abilities of a system,
02:17:45 you have to show your ability, in conversation,
02:17:49 show your ability to adjust, adapt to the conversation
02:17:55 through not just like as a standalone system,
02:17:58 but through the process of like the interaction,
02:18:01 the game theoretic, where you really
02:18:05 are changing the environment by your actions.
02:18:09 So in the ARC challenge, for example,
02:18:11 you’re an observer.
02:18:12 You can’t steer the test into changing.
02:18:17 You can’t talk to the test.
02:18:19 You can’t play with it.
02:18:21 So there’s some aspect of that interactivity
02:18:24 that becomes highly subjective, but it
02:18:26 feels like it could be conducive to generalizability.
02:18:29 I think you make a great point.
02:18:31 The interactivity is a very good setting
02:18:33 to force a system to show adaptation,
02:18:36 to show generalization.
02:18:39 That said, at the same time, it’s
02:18:42 not something very scalable, because you
02:18:44 rely on human judges.
02:18:46 It’s not something reliable, because the human judges may
02:18:48 not, may not.
02:18:49 So you don’t like human judges.
02:18:50 Basically, yes.
02:18:51 And I think so.
02:18:52 I love the idea of interactivity.
02:18:56 I initially wanted an ARC test that
02:18:59 had some amount of interactivity where your score on a task
02:19:02 would not be 1 or 0, if you can solve it or not,
02:19:05 but would be the number of attempts
02:19:11 that you can make before you hit the right solution, which
02:19:14 means that now you can start applying
02:19:16 the scientific method as you solve ARC tasks,
02:19:19 that you can start formulating hypotheses and probing
02:19:23 the system to see whether the observation will
02:19:27 match the hypothesis or not.
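Here is a minimal sketch of the attempt-counting idea just described: the solver submits one candidate grid at a time and is scored by how many attempts it needs, which is what would let it apply the scientific method, forming and falsifying hypotheses. The solver interface and the attempt cap are hypothetical, not part of any existing ARC API.

```python
# Minimal sketch of the interactive, attempt-counting scoring idea described above.
# `solver` is assumed to be any object with a propose(task, feedback_history) method
# returning a candidate output grid -- a hypothetical interface for illustration.

def interactive_score(solver, task, target_grid, max_attempts=10):
    """Return the number of attempts needed to hit the exact solution (lower is better),
    or max_attempts + 1 if the solver never succeeds within the budget."""
    feedback_history = []
    for attempt in range(1, max_attempts + 1):
        guess = solver.propose(task, feedback_history)
        if guess == target_grid:
            return attempt
        # The only feedback is that the guess was wrong, so each failed guess is a
        # falsified hypothesis the solver can use to narrow the space of candidate programs.
        feedback_history.append(guess)
    return max_attempts + 1
```

Scoring by attempt count rather than pass/fail is also what would make more ambiguous tasks usable, since probing resolves the ambiguity.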
02:19:28 It would be amazing if you could also,
02:19:30 even higher level than that, measure the quality of your attempts,
02:19:35 which, of course, is impossible.
02:19:36 But again, that gets subjective.
02:19:38 How good was your thinking?
02:19:41 How efficient was?
02:19:43 So one thing that’s interesting about this notion of scoring you
02:19:48 by how many attempts you need is that you
02:19:50 can start producing tasks that are way more ambiguous, right?
02:19:55 Right.
02:19:56 Because with the different attempts,
02:19:59 you can actually probe that ambiguity, right?
02:20:03 Right.
02:20:04 So that’s, in a sense, a measure of how well
02:20:08 you can adapt to the uncertainty and reduce the uncertainty?
02:20:15 Yes, it’s how fast.
02:20:18 It’s the efficiency with which you reduce uncertainty
02:20:21 in program space, exactly.
02:20:22 Very difficult to come up with that kind of test, though.
02:20:24 Yeah, so I would love to be able to create something like this.
02:20:28 In practice, it would be very, very difficult, but yes.
02:20:33 I mean, what you’re doing, what you’ve done with the ARC challenge
02:20:36 is brilliant.
02:20:37 I’m also not surprised that it’s not more popular,
02:20:40 but I think it’s picking up.
02:20:42 It has its niche.
02:20:42 It has its niche, yeah.
02:20:44 Yeah.
02:20:44 What are your thoughts about another test?
02:20:47 I talked with Marcus Hutter.
02:20:48 He has the Hutter Prize for compression of human knowledge.
02:20:51 And the idea is really sort of quantify and reduce
02:20:55 the test of intelligence purely to just the ability
02:20:58 to compress.
02:20:59 What’s your thoughts about this intelligence as compression?
02:21:04 I mean, it’s a very fun test because it’s
02:21:07 such a simple idea, like you’re given Wikipedia,
02:21:12 basic English Wikipedia, and you must compress it.
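As a toy illustration of that framing (not of the prize's actual rules), here is a minimal sketch that measures how much an off-the-shelf compressor can shrink a text file. The real Hutter Prize uses a fixed Wikipedia excerpt (enwik8/enwik9) and also counts the size of the decompressor itself, so treat this purely as a simplified stand-in; the file name is illustrative.

```python
# Toy sketch of the compression-as-benchmark idea: compress a chunk of text and report
# the ratio. The actual Hutter Prize rules are stricter (fixed Wikipedia excerpt, and the
# size of the self-extracting archive counts); this is only a simplified stand-in.
import lzma

def compression_ratio(path):
    with open(path, "rb") as f:
        data = f.read()
    compressed = lzma.compress(data, preset=9)
    return len(compressed) / len(data)

if __name__ == "__main__":
    ratio = compression_ratio("enwik8")  # illustrative file name
    print(f"compressed to {ratio:.1%} of original size")
```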
02:21:15 And so it stems from the idea that cognition is compression,
02:21:21 that the brain is basically a compression algorithm.
02:21:24 This is a very old idea.
02:21:25 It’s a very, I think, striking and beautiful idea.
02:21:30 I used to believe it.
02:21:32 I eventually had to realize that it was very much
02:21:36 a flawed idea.
02:21:36 So I no longer believe that cognition is compression.
02:21:41 But I can tell you what’s the difference.
02:21:44 So it’s very easy to believe that cognition and compression
02:21:48 are the same thing.
02:21:51 So Jeff Hawkins, for instance, says
02:21:53 that cognition is prediction.
02:21:54 And of course, prediction is basically the same thing
02:21:57 as compression.
02:21:58 It’s just including the temporal axis.
02:22:03 And it’s very easy to believe this
02:22:05 because compression is something that we
02:22:06 do all the time very naturally.
02:22:09 We are constantly compressing information.
02:22:12 We are constantly trying.
02:22:15 We have this bias towards simplicity.
02:22:17 We are constantly trying to organize things in our mind
02:22:21 and around us to be more regular.
02:22:24 So it’s a beautiful idea.
02:22:26 It’s very easy to believe.
02:22:28 There is a big difference between what
02:22:31 we do with our brains and compression.
02:22:33 So compression is actually kind of a tool
02:22:38 in the human cognitive toolkit that is used in many ways.
02:22:42 But it’s just a tool.
02:22:44 It is a tool for cognition.
02:22:45 It is not cognition itself.
02:22:47 And the big fundamental difference
02:22:50 is that cognition is about being able to operate
02:22:55 in future situations that include fundamental uncertainty
02:23:00 and novelty.
02:23:02 So for instance, consider a child at age 10.
02:23:06 And so they have 10 years of life experience.
02:23:10 They’ve gotten pain, pleasure, rewards, and punishment
02:23:14 over that period of time.
02:23:16 If you were to generate the shortest behavioral program
02:23:21 that would have basically run that child over these 10 years
02:23:26 in an optimal way, the shortest optimal behavioral program
02:23:32 given the experience of that child so far,
02:23:34 well, that program, that compressed program,
02:23:37 this is what you would get if the mind of the child
02:23:39 was a compression algorithm essentially,
02:23:42 would be utterly inappropriate
02:23:48 to process the next 70 years in the life of that child.
02:23:54 So in the models we build of the world,
02:23:59 we are not trying to make them actually optimally compressed.
02:24:03 We are using compression as a tool
02:24:06 to promote simplicity and efficiency in our models.
02:24:10 But they are not perfectly compressed
02:24:12 because they need to include things
02:24:15 that are seemingly useless today, that have seemingly
02:24:18 been useless so far.
02:24:20 But that may turn out to be useful in the future
02:24:24 because you just don’t know the future.
02:24:25 And that’s the fundamental principle
02:24:28 that cognition, that intelligence arises from
02:24:31 is that you need to be able to run
02:24:33 appropriate behavioral programs except you have absolutely
02:24:36 no idea what sort of context, environment, situation
02:24:40 they are going to be running in.
02:24:42 And you have to deal with that uncertainty,
02:24:45 with that future novelty.
02:24:46 So an analogy that you can make is with investing,
02:24:52 for instance.
02:24:54 If I look at the past 20 years of stock market data,
02:24:59 and I use a compression algorithm
02:25:01 to figure out the best trading strategy,
02:25:04 it’s going to be you buy Apple stock, then
02:25:06 maybe the past few years you buy Tesla stock or something.
02:25:10 But is that strategy still going to be
02:25:13 true for the next 20 years?
02:25:14 Well, actually, probably not, which
02:25:17 is why if you’re a smart investor,
02:25:21 you’re not just going to be following the strategy that
02:25:26 corresponds to compression of the past.
02:25:28 You’re going to be following, you’re
02:25:31 going to have a balanced portfolio, right?
02:25:34 Because you just don’t know what’s going to happen.
02:25:38 I mean, I guess in that same sense,
02:25:40 the compression is analogous to what
02:25:42 you talked about, which is local or robust generalization
02:25:45 versus extreme generalization.
02:25:47 It’s much closer to that side of being able to generalize
02:25:52 in the local sense.
02:25:53 That’s why as humans, when we are children, in our education,
02:25:59 so a lot of it is driven by play, driven by curiosity.
02:26:04 We are not efficiently compressing things.
02:26:07 We’re actually exploring.
02:26:09 We are retaining all kinds of things
02:26:16 from our environment that seem to be completely useless.
02:26:19 Because they might turn out to be eventually useful, right?
02:26:24 And that’s what cognition is really about.
02:26:26 And what makes it antagonistic to compression
02:26:29 is that it is about hedging for future uncertainty.
02:26:33 And that’s antagonistic to compression.
02:26:35 Yes.
02:26:38 Efficiently hedging.
02:26:38 Cognition leverages compression as a tool
02:26:41 to promote efficiency and simplicity in our models.
02:26:47 It’s like Einstein said, make it as simple as possible,
02:26:52 but not simpler, however that quote goes.
02:26:54 So compression simplifies things,
02:26:57 but you don’t want to make it too simple.
02:27:00 Yes.
02:27:00 So a good model of the world is going
02:27:03 to include all kinds of things that are completely useless,
02:27:06 actually, just in case.
02:27:08 Because you need diversity in the same way
02:27:10 that in your portfolio.
02:27:11 You need all kinds of stocks that may not
02:27:13 have performed well so far, but you need diversity.
02:27:15 And the reason you need diversity
02:27:16 is because fundamentally you don’t know what you’re doing.
02:27:19 And the same is true of the human mind,
02:27:22 is that it needs to behave appropriately in the future.
02:27:26 And it has no idea what the future is going to be like.
02:27:29 But it’s not going to be like the past.
02:27:31 So compressing the past is not appropriate,
02:27:33 because the past is not, it’s not predictive of the future.
02:27:40 Yeah, history repeats itself, but not perfectly.
02:27:44 I don’t think I asked you last time the most inappropriately
02:27:48 absurd question.
02:27:51 We’ve talked a lot about intelligence,
02:27:54 but the bigger question from intelligence is of meaning.
02:28:00 Intelligence systems are kind of goal oriented.
02:28:02 They’re always optimizing for a goal.
02:28:05 If you look at the Hutter Prize, actually,
02:28:07 I mean, there’s always a clean formulation of a goal.
02:28:10 But the natural question for us humans,
02:28:14 since we don’t know our objective function,
02:28:16 is what is the meaning of it all?
02:28:18 So the absurd question is, what, Francois,
02:28:22 do you think is the meaning of life?
02:28:25 What’s the meaning of life?
02:28:26 Yeah, that’s a big question.
02:28:28 And I think I can give you my answer, at least one
02:28:33 of my answers.
02:28:34 And so one thing that’s very important in understanding who
02:28:42 we are is that everything that makes up ourselves,
02:28:48 that makes up who we are, even your most personal thoughts,
02:28:53 is not actually your own.
02:28:55 Even your most personal thoughts are expressed in words
02:29:00 that you did not invent and are built on concepts and images
02:29:04 that you did not invent.
02:29:06 We are very much cultural beings.
02:29:10 We are made of culture.
02:29:12 What makes us different from animals, for instance?
02:29:16 So everything about ourselves is an echo of the past.
02:29:22 Is an echo of the past, an echo of people who lived before us.
02:29:29 That’s who we are.
02:29:31 And in the same way, if we manage
02:29:35 to contribute something to the collective edifice of culture,
02:29:41 a new idea, maybe a beautiful piece of music,
02:29:44 a work of art, a grand theory, a new word, maybe,
02:29:51 that something is going to become
02:29:54 a part of the minds of future humans, essentially, forever.
02:30:00 So everything we do creates ripples
02:30:03 that propagate into the future.
02:30:06 And in a way, this is our path to immortality,
02:30:11 is that as we contribute things to culture,
02:30:17 culture in turn becomes future humans.
02:30:21 And we keep influencing people thousands of years from now.
02:30:27 So our actions today create ripples.
02:30:30 And these ripples, I think, basically
02:30:35 sum up the meaning of life.
02:30:37 In the same way that we are the sum
02:30:42 of the interactions between many different ripples that
02:30:45 came from our past, we are ourselves
02:30:48 creating ripples that will propagate into the future.
02:30:50 And that’s why we should be, this
02:30:53 seems like perhaps a trite thing to say,
02:30:56 but we should be kind to others during our time on Earth
02:31:02 because every act of kindness creates ripples.
02:31:05 And in reverse, every act of violence also creates ripples.
02:31:09 And you want to carefully choose which kind of ripples
02:31:13 you want to create, and you want to propagate into the future.
02:31:16 And in your case, first of all, beautifully put,
02:31:19 but in your case, creating ripples
02:31:21 into future humans and future AGI systems.
02:31:27 Yes.
02:31:28 It’s fascinating.
02:31:29 Our successors.
02:31:30 I don’t think there’s a better way to end it,
02:31:34 Francois, as always, for a second time.
02:31:37 And I’m sure many times in the future,
02:31:39 it’s been a huge honor.
02:31:40 You’re one of the most brilliant people
02:31:43 in the machine learning, computer science world.
02:31:47 Again, it’s a huge honor.
02:31:48 Thanks for talking to me.
02:31:49 It’s been a pleasure.
02:31:50 Thanks a lot for having me.
02:31:51 We appreciate it.
02:31:53 Thanks for listening to this conversation with Francois
02:31:56 Chollet, and thank you to our sponsors, Babbel, Masterclass,
02:32:00 and Cash App.
02:32:01 Click the sponsor links in the description
02:32:03 to get a discount and to support this podcast.
02:32:06 If you enjoy this thing, subscribe on YouTube,
02:32:09 review it with five stars on Apple Podcast,
02:32:11 follow on Spotify, support on Patreon,
02:32:14 or connect with me on Twitter at Lex Friedman.
02:32:17 And now let me leave you with some words
02:32:19 from René Descartes in 1637, an excerpt of which Francois
02:32:24 includes in his On the Measure of Intelligence paper.
02:32:27 If there were machines which bore a resemblance
02:32:30 to our bodies and imitated our actions as closely as possible
02:32:34 for all practical purposes, we should still
02:32:36 have two very certain means of recognizing
02:32:40 that they were not real men.
02:32:42 The first is that they could never use words or put together
02:32:45 signs, as we do in order to declare our thoughts to others.
02:32:49 For we can certainly conceive of a machine so constructed
02:32:53 that it utters words and even utters
02:32:55 words that correspond to bodily actions causing
02:32:57 a change in its organs.
02:32:59 But it is not conceivable that such a machine should produce
02:33:03 different arrangements of words so as
02:33:05 to give an appropriately meaningful answer to whatever
02:33:08 is said in its presence as the dullest of men can do.
02:33:12 Here, Descartes is anticipating the Turing test,
02:33:15 and the argument still continues to this day.
02:33:18 Secondly, he continues, even though some machines might
02:33:22 do some things as well as we do them, or perhaps even better,
02:33:26 they would inevitably fail in others,
02:33:29 which would reveal that they are acting not from understanding
02:33:32 but only from the disposition of their organs.
02:33:36 This is an incredible quote.
02:33:39 Whereas reason is a universal instrument
02:33:43 which can be used in all kinds of situations,
02:33:46 these organs need some particular action.
02:33:49 Hence, it is for all practical purposes
02:33:51 impossible for a machine to have enough different organs
02:33:54 to make it act in all the contingencies of life
02:33:57 and the way in which our reason makes us act.
02:34:01 That’s the debate between mimicry and memorization
02:34:05 versus understanding.
02:34:07 So thank you for listening and hope to see you next time.