Transcript
00:00:00 The following is a conversation with Greg Brockman.
00:00:02 He’s the cofounder and CTO of OpenAI,
00:00:05 a world class research organization
00:00:07 developing ideas in AI with a goal of eventually
00:00:10 creating a safe and friendly artificial general
00:00:14 intelligence, one that benefits and empowers humanity.
00:00:18 OpenAI is not only a source of publications, algorithms, tools,
00:00:23 and data sets.
00:00:24 Their mission is a catalyst for an important public discourse
00:00:28 about our future with both narrow and general intelligence
00:00:32 systems.
00:00:34 This conversation is part of the Artificial Intelligence
00:00:36 podcast at MIT and beyond.
00:00:39 If you enjoy it, subscribe on YouTube, iTunes,
00:00:42 or simply connect with me on Twitter at Lex Fridman,
00:00:45 spelled F R I D. And now, here’s my conversation
00:00:50 with Greg Brockman.
00:00:52 So in high school, and right after, you
00:00:54 wrote a draft of a chemistry textbook,
00:00:56 which I saw covers everything from the basic structure
00:00:59 of the atom to quantum mechanics.
00:01:01 So it’s clear you have an intuition and a passion
00:01:04 for both the physical world, with chemistry and now robotics,
00:01:09 and the digital world, with AI, deep learning, reinforcement
00:01:14 learning, and so on.
00:01:15 Do you see the physical world and the digital world
00:01:17 as different?
00:01:18 And what do you think is the gap?
00:01:20 A lot of it actually boils down to iteration speed.
00:01:23 I think that a lot of what really motivates me
00:01:25 is building things.
00:01:26 I think about mathematics, for example,
00:01:28 where you think really hard about a problem.
00:01:30 You understand it.
00:01:31 You write it down in this very obscure form
00:01:33 that we call a proof.
00:01:34 But then, this is in humanity’s library.
00:01:37 It’s there forever.
00:01:38 This is some truth that we’ve discovered.
00:01:40 Maybe only five people in your field will ever read it.
00:01:43 But somehow, you’ve kind of moved humanity forward.
00:01:45 And so I actually used to really think
00:01:46 that I was going to be a mathematician.
00:01:48 And then I actually started writing this chemistry
00:01:51 textbook.
00:01:51 One of my friends told me, you’ll never publish it
00:01:53 because you don’t have a PhD.
00:01:54 So instead, I decided to build a website
00:01:57 and try to promote my ideas that way.
00:01:59 And then I discovered programming.
00:02:01 And in programming, you think hard about a problem.
00:02:05 You understand it.
00:02:06 You write it down in a very obscure form
00:02:08 that we call a program.
00:02:10 But then once again, it’s in humanity’s library.
00:02:12 And anyone can get the benefit from it.
00:02:14 And the scalability is massive.
00:02:15 And so I think that the thing that really appeals
00:02:17 to me about the digital world is that you
00:02:19 can have this insane leverage.
00:02:21 A single individual with an idea is
00:02:24 able to affect the entire planet.
00:02:25 And that’s something I think is really
00:02:27 hard to do if you’re moving around physical atoms.
00:02:30 But you said mathematics.
00:02:32 So if you look at the wet thing over here, our mind,
00:02:36 do you ultimately see it as just math,
00:02:39 as just information processing?
00:02:41 Or is there some other magic, as you’ve seen,
00:02:44 through biology and chemistry and so on?
00:02:47 Yeah, I think it’s really interesting to think about
00:02:48 humans as just information processing systems.
00:02:50 And that seems like it’s actually
00:02:52 a pretty good way of describing a lot of how the world works
00:02:57 or a lot of what we’re capable of, to think that, again,
00:03:00 if you just look at technological innovations
00:03:02 over time, that in some ways, the most transformative
00:03:05 innovation that we’ve had has been the computer.
00:03:07 In some ways, the internet. What has the internet done?
00:03:10 The internet is not about these physical cables.
00:03:12 It’s about the fact that I am suddenly
00:03:14 able to instantly communicate with any other human
00:03:16 on the planet.
00:03:17 I’m able to retrieve any piece of knowledge
00:03:19 that in some ways the human race has ever had,
00:03:22 and that those are these insane transformations.
00:03:26 Do you see our society as a whole, the collective,
00:03:29 as another extension of the intelligence of the human being?
00:03:32 So if you look at the human being
00:03:33 as an information processing system,
00:03:35 you mentioned the internet, the networking.
00:03:36 Do you see us all together as a civilization
00:03:39 as a kind of intelligent system?
00:03:41 Yeah, I think this is actually
00:03:42 a really interesting perspective to take
00:03:44 and to think about, that you sort of have
00:03:46 this collective intelligence of all of society,
00:03:49 the economy itself is this superhuman machine
00:03:51 that is optimizing something, right?
00:03:54 And in some ways, a company has a will of its own, right?
00:03:57 That you have all these individuals
00:03:59 who are all pursuing their own individual goals
00:04:00 and thinking really hard
00:04:01 and thinking about the right things to do,
00:04:03 but somehow the company does something
00:04:05 that is this emergent thing
00:04:07 and that is a really useful abstraction.
00:04:10 And so I think that in some ways,
00:04:12 we think of ourselves as the most intelligent things
00:04:14 on the planet and the most powerful things on the planet,
00:04:17 but there are things that are bigger than us
00:04:19 that are the systems that we all contribute to.
00:04:21 And so I think actually, it’s interesting to think about
00:04:24 if you’ve read Isaac Asimov’s Foundation, right?
00:04:27 That there’s this concept of psychohistory in there,
00:04:30 which is effectively this,
00:04:31 that if you have trillions or quadrillions of beings,
00:04:33 then maybe you could actually predict what that being,
00:04:36 that huge macro being will do
00:04:39 and almost independent of what the individuals want.
00:04:42 And I actually have a second angle on this
00:04:44 that I think is interesting,
00:04:45 which is thinking about technological determinism.
00:04:48 One thing that I actually think a lot about with OpenAI,
00:04:51 right, is that we’re kind of coming on
00:04:53 to this insanely transformational technology
00:04:55 of general intelligence, right,
00:04:57 that will happen at some point.
00:04:58 And there’s a question of how can you take actions
00:05:01 that will actually steer it to go better rather than worse.
00:05:04 And that I think one question you need to ask
00:05:06 is as a scientist, as an inventor, as a creator,
00:05:09 what impact can you have in general, right?
00:05:11 You look at things like the telephone
00:05:12 invented by two people on the same day.
00:05:14 Like, what does that mean?
00:05:15 Like, what does that mean about the shape of innovation?
00:05:18 And I think that what’s going on
00:05:19 is everyone’s building on the shoulders of the same giants.
00:05:21 And so you can kind of, you can’t really hope
00:05:23 to create something no one else ever would.
00:05:25 You know, if Einstein wasn’t born,
00:05:27 someone else would have come up with relativity.
00:05:29 You know, he changed the timeline a bit, right,
00:05:30 that maybe it would have taken another 20 years,
00:05:32 but it wouldn’t be that fundamentally humanity
00:05:34 would never discover these fundamental truths.
00:05:37 So there’s some kind of invisible momentum
00:05:40 that some people like Einstein or OpenAI is plugging into
00:05:45 that anybody else can also plug into
00:05:47 and ultimately that wave takes us into a certain direction.
00:05:50 That’s what you mean by technological determinism.
00:05:51 That’s right, that’s right.
00:05:52 And you know, this kind of seems to play out
00:05:54 in a bunch of different ways,
00:05:55 that there’s some exponential that is being ridden
00:05:58 and that the exponential itself, which one it is, changes.
00:06:00 Think about Moore’s Law, an entire industry
00:06:02 set its clock to it for 50 years.
00:06:04 Like, how can that be, right?
00:06:06 How is that possible?
00:06:07 And yet somehow it happened.
00:06:09 And so I think you can’t hope to ever invent something
00:06:12 that no one else will.
00:06:13 Maybe you can change the timeline a little bit.
00:06:15 But if you really want to make a difference,
00:06:17 I think that the thing that you really have to do,
00:06:19 the only real degree of freedom you have
00:06:21 is to set the initial conditions
00:06:23 under which a technology is born.
00:06:24 And so you think about the internet, right?
00:06:26 That there are lots of other competitors
00:06:27 trying to build similar things.
00:06:29 And the internet won.
00:06:30 And that the initial conditions
00:06:33 were that it was created by this group
00:06:34 that really valued anyone
00:06:38 being able to plug in,
00:06:39 this very academic mindset of being open and connected.
00:06:42 And I think that the internet for the next 40 years
00:06:44 really played out that way.
00:06:46 You know, maybe today things are starting
00:06:48 to shift in a different direction.
00:06:49 But I think that those initial conditions
00:06:51 were really important to determine
00:06:52 the next 40 years worth of progress.
00:06:55 That’s really beautifully put.
00:06:56 So another example that I think about,
00:06:58 you know, I recently looked at it.
00:07:00 I looked at Wikipedia, the formation of Wikipedia.
00:07:03 And I wondered what the internet would be like
00:07:05 if Wikipedia had ads.
00:07:07 You know, there’s an interesting argument
00:07:09 about why they chose not to
00:07:12 put advertisements on Wikipedia.
00:07:14 I think Wikipedia’s one of the greatest resources
00:07:17 we have on the internet.
00:07:18 It’s extremely surprising how well it works
00:07:21 and how well it was able to aggregate
00:07:22 all this kind of good information.
00:07:24 And essentially the creator of Wikipedia,
00:07:27 I don’t know, there’s probably some debates there,
00:07:29 but set the initial conditions.
00:07:31 And now it carried itself forward.
00:07:33 That’s really interesting.
00:07:34 So the way you’re thinking about AGI
00:07:36 or artificial intelligence is you’re focused
00:07:38 on setting the initial conditions for the progress.
00:07:41 That’s right.
00:07:42 That’s powerful.
00:07:43 Okay, so looking to the future,
00:07:45 if you create an AGI system,
00:07:48 like one that can ace the Turing test, natural language,
00:07:51 what do you think would be the interactions
00:07:54 you would have with it?
00:07:55 What do you think are the questions you would ask?
00:07:57 Like what would be the first question you would ask?
00:08:00 It, her, him.
00:08:01 That’s right.
00:08:02 I think that at that point,
00:08:03 if you’ve really built a powerful system
00:08:05 that is capable of shaping the future of humanity,
00:08:08 the first question that you really should ask
00:08:10 is how do we make sure that this plays out well?
00:08:12 And so that’s actually the first question
00:08:13 that I would ask a powerful AGI system.
00:08:17 So you wouldn’t ask your colleague,
00:08:19 you wouldn’t ask like Ilya,
00:08:20 you would ask the AGI system.
00:08:22 Oh, we’ve already had the conversation with Ilya, right?
00:08:24 And everyone here.
00:08:25 And so you want as many perspectives
00:08:27 and as much wisdom as you can
00:08:29 for answering this question.
00:08:31 So I don’t think you necessarily defer
00:08:32 to whatever your powerful system tells you,
00:08:35 but you use it as one input
00:08:37 to try to figure out what to do.
00:08:39 But, and I guess fundamentally what it really comes down to
00:08:41 is if you built something really powerful
00:08:43 and you think about, for example,
00:08:45 shortly after
00:08:47 the creation of nuclear weapons, right?
00:08:48 The most important question in the world was
00:08:51 what’s the world order going to be like?
00:08:52 How do we set ourselves up in a place
00:08:54 where we’re going to be able to survive as a species?
00:08:58 With AGI, I think the question is slightly different, right?
00:09:00 That there is a question of how do we make sure
00:09:02 that we don’t get the negative effects,
00:09:04 but there’s also the positive side, right?
00:09:06 You imagine that, like what will AGI be like?
00:09:09 Like what will it be capable of?
00:09:11 And I think that one of the core reasons
00:09:13 that an AGI can be powerful and transformative
00:09:15 is actually due to technological development, right?
00:09:18 If you have something that’s capable as a human
00:09:21 and that it’s much more scalable,
00:09:23 that you absolutely want that thing
00:09:25 to go read the whole scientific literature
00:09:27 and think about how to create cures for all the diseases,
00:09:29 right?
00:09:30 You want it to think about how to go
00:09:31 and build technologies to help us create material abundance
00:09:34 and to figure out societal problems
00:09:37 that we have trouble with.
00:09:38 Like how are we supposed to clean up the environment?
00:09:40 And maybe you want this to go and invent
00:09:42 a bunch of little robots that will go out
00:09:44 and be biodegradable and turn ocean debris
00:09:47 into harmless molecules.
00:09:49 And I think that that positive side
00:09:54 is something that I think people miss
00:09:55 sometimes when thinking about what an AGI will be like.
00:09:58 And so I think that if you have a system
00:10:00 that’s capable of all of that,
00:10:01 you absolutely want its advice about how do I make sure
00:10:03 that we’re using your capabilities
00:10:07 in a positive way for humanity.
00:10:09 So what do you think about that psychology
00:10:11 that looks at all the different possible trajectories
00:10:14 of an AGI system, many of which,
00:10:17 perhaps the majority of which are positive,
00:10:19 and nevertheless focuses on the negative trajectories?
00:10:23 I mean, you get to interact with folks,
00:10:24 you get to think about this, maybe within yourself as well.
00:10:28 You look at Sam Harris and so on.
00:10:30 It seems to be, sorry to put it this way,
00:10:32 but almost more fun to think about
00:10:34 the negative possibilities.
00:10:36 Whatever that’s deep in our psychology,
00:10:39 what do you think about that?
00:10:40 And how do we deal with it?
00:10:41 Because we want AI to help us.
00:10:44 So I think there’s kind of two problems
00:10:47 entailed in that question.
00:10:49 The first is more of the question of
00:10:52 how can you even picture what a world
00:10:54 with a new technology will be like?
00:10:56 Now imagine we’re in 1950,
00:10:57 and I’m trying to describe Uber to someone.
00:11:02 Apps and the internet.
00:11:05 Yeah, I mean, that’s going to be extremely complicated.
00:11:08 But it’s imaginable.
00:11:10 It’s imaginable, right?
00:11:11 And now imagine being in 1950 and predicting Uber, right?
00:11:15 And you need to describe the internet,
00:11:17 you need to describe GPS,
00:11:18 you need to describe the fact that
00:11:20 everyone’s going to have this phone in their pocket.
00:11:23 And so I think that just the first truth
00:11:26 is that it is hard to picture
00:11:28 how a transformative technology will play out in the world.
00:11:31 We’ve seen that before with technologies
00:11:32 that are far less transformative than AGI will be.
00:11:35 And so I think that one piece is that
00:11:37 it’s just even hard to imagine
00:11:39 and to really put yourself in a world
00:11:41 where you can predict what that positive vision
00:11:44 would be like.
00:11:46 And I think the second thing is that
00:11:49 I think it is always easier to support the negative side
00:11:54 than the positive side.
00:11:55 It’s always easier to destroy than create.
00:11:58 And less in a physical sense
00:12:00 and more just in an intellectual sense, right?
00:12:03 Because I think that with creating something,
00:12:05 you need to just get a bunch of things right.
00:12:07 And to destroy, you just need to get one thing wrong.
00:12:10 And so I think that what that means
00:12:12 is that I think a lot of people’s thinking dead ends
00:12:14 as soon as they see the negative story.
00:12:16 But that being said, I actually have some hope, right?
00:12:20 I think that the positive vision
00:12:23 is something that I think can be,
00:12:26 is something that we can talk about.
00:12:27 And I think that just simply saying this fact of,
00:12:30 yeah, there’s positive, there’s negatives,
00:12:32 everyone likes to dwell on the negative.
00:12:33 People actually respond well to that message and say,
00:12:35 huh, you’re right, there’s a part of this
00:12:37 that we’re not talking about, not thinking about.
00:12:39 And that’s actually something that I think has really
00:12:42 been a key part of how we think about AGI at OpenAI.
00:12:46 You can kind of look at it as like, okay,
00:12:48 OpenAI talks about the fact that there are risks
00:12:51 and yet they’re trying to build this system.
00:12:53 How do you square those two facts?
00:12:56 So do you share the intuition that some people have,
00:12:59 I mean from Sam Harris to even Elon Musk himself,
00:13:02 that it’s tricky as you develop AGI
00:13:06 to keep it from slipping into the existential threats,
00:13:10 into the negative?
00:13:11 What’s your intuition about how hard is it
00:13:14 to keep AI development on the positive track?
00:13:19 What’s your intuition there?
00:13:20 To answer that question, you can really look
00:13:22 at how we structure OpenAI.
00:13:24 So we really have three main arms.
00:13:25 We have capabilities, which is actually doing
00:13:28 the technical work and pushing forward
00:13:29 what these systems can do.
00:13:31 There’s safety, which is working on technical mechanisms
00:13:35 to ensure that the systems we build
00:13:36 are aligned with human values.
00:13:38 And then there’s policy, which is making sure
00:13:40 that we have governance mechanisms,
00:13:42 answering that question of, well, whose values?
00:13:45 And so I think that the technical safety one
00:13:47 is the one that people kind of talk about the most, right?
00:13:50 You talk about, like think about all of the dystopic AI
00:13:53 movies, a lot of that is about not having
00:13:55 good technical safety in place.
00:13:57 And what we’ve been finding is that,
00:13:59 you know, I think that actually a lot of people
00:14:01 look at the technical safety problem
00:14:02 and think it’s just intractable, right?
00:14:05 This question of what do humans want?
00:14:07 How am I supposed to write that down?
00:14:09 Can I even write down what I want?
00:14:11 No way.
00:14:13 And then they stop there.
00:14:14 But the thing is, we’ve already built systems
00:14:16 that are able to learn things that humans can’t specify.
00:14:20 You know, even the rules for how to recognize
00:14:22 if there’s a cat or a dog in an image.
00:14:24 Turns out it’s intractable to write that down,
00:14:26 and yet we’re able to learn it.
00:14:28 And that what we’re seeing with systems we build at OpenAI,
00:14:31 and they’re still in early proof of concept stage,
00:14:33 is that you are able to learn human preferences.
00:14:36 You’re able to learn what humans want from data.
00:14:38 And so that’s kind of the core focus
00:14:40 for our technical safety team,
00:14:41 and I think that there actually,
00:14:43 we’ve had some pretty encouraging updates
00:14:45 in terms of what we’ve been able to make work.
00:14:48 So you have an intuition and a hope that from data,
00:14:51 you know, looking at the value alignment problem,
00:14:53 from data we can build systems that align
00:14:57 with the collective better angels of our nature.
00:15:00 So align with the ethics and the morals of human beings.
00:15:04 To even say this in a different way,
00:15:05 I mean, think about how do we align humans, right?
00:15:08 Think about like a human baby can grow up
00:15:10 to be an evil person or a great person.
00:15:12 And a lot of that is from learning from data, right?
00:15:15 That you have some feedback as a child is growing up,
00:15:17 they get to see positive examples.
00:15:19 And so I think that just like,
00:15:22 that the only example we have of a general intelligence
00:15:25 that is able to learn from data
00:15:28 to align with human values and to learn values,
00:15:31 I think we shouldn’t be surprised
00:15:36 if the same sorts of techniques
00:15:37 end up being how we solve value alignment for AGIs.
00:15:41 So let’s go even higher.
00:15:42 I don’t know if you’ve read the book, Sapiens,
00:15:44 but there’s an idea that, you know,
00:15:48 that as a collective, as us human beings,
00:15:49 we kind of develop together ideas that we hold.
00:15:54 There’s no, in that context, objective truth.
00:15:57 We just kind of all agree to certain ideas
00:15:59 and hold them as a collective.
00:16:01 Did you have a sense that there is,
00:16:03 in the world of good and evil,
00:16:05 do you have a sense that to the first approximation,
00:16:07 there are some things that are good
00:16:10 and that you could teach systems to behave to be good?
00:16:14 So I think that this actually blends into our third team,
00:16:18 right, which is the policy team.
00:16:19 And this is the one, the aspect I think people
00:16:22 really talk about way less than they should, right?
00:16:25 Because imagine that we build super powerful systems
00:16:27 that we’ve managed to figure out all the mechanisms
00:16:29 for these things to do whatever the operator wants.
00:16:32 The most important question becomes,
00:16:34 who’s the operator, what do they want,
00:16:36 and how is that going to affect everyone else, right?
00:16:39 And I think that this question of what is good,
00:16:43 what are those values, I mean,
00:16:44 I think you don’t even have to go to those,
00:16:46 those very grand existential places
00:16:48 to start to realize how hard this problem is.
00:16:50 You just look at different countries
00:16:52 and cultures across the world,
00:16:54 and that there’s a very different conception
00:16:57 of how the world works and what kinds of ways
00:17:01 that society wants to operate.
00:17:03 And so I think that the really core question
00:17:07 is actually very concrete,
00:17:09 and I think it’s not a question
00:17:10 that we have ready answers to, right?
00:17:12 It’s how do you have a world
00:17:14 where all of the different countries that we have,
00:17:17 United States, China, Russia,
00:17:19 and the hundreds of other countries out there
00:17:22 are able to continue to not just operate
00:17:26 in the way that they see fit,
00:17:28 but in the world that emerges
00:17:32 where you have these very powerful systems
00:17:36 operating alongside humans,
00:17:37 ends up being something that empowers humans more,
00:17:39 that makes human existence be a more meaningful thing,
00:17:44 and that people are happier and wealthier,
00:17:46 and able to live more fulfilling lives.
00:17:49 It’s not an obvious thing for how to design that world
00:17:51 once you have that very powerful system.
00:17:53 So if we take a little step back,
00:17:55 and we’re having a fascinating conversation,
00:17:58 and OpenAI is in many ways a tech leader in the world,
00:18:01 and yet we’re thinking about
00:18:03 these big existential questions,
00:18:05 which is fascinating, really important.
00:18:07 I think you’re a leader in that space,
00:18:09 and that’s a really important space
00:18:10 of just thinking how AI affects society
00:18:13 in a big picture view.
00:18:14 So Oscar Wilde said, we’re all in the gutter,
00:18:17 but some of us are looking at the stars,
00:18:19 and I think OpenAI has a charter
00:18:22 that looks to the stars, I would say,
00:18:24 to create intelligence, to create general intelligence,
00:18:26 make it beneficial, safe, and collaborative.
00:18:29 So can you tell me how that came about,
00:18:33 how a mission like that and the path
00:18:36 to creating a mission like that at OpenAI was founded?
00:18:39 Yeah, so I think that in some ways
00:18:41 it really boils down to taking a look at the landscape.
00:18:45 So if you think about the history of AI,
00:18:47 that basically for the past 60 or 70 years,
00:18:49 people have thought about this goal
00:18:51 of what could happen if you could automate
00:18:54 human intellectual labor.
00:18:56 Imagine you could build a computer system
00:18:58 that could do that, what becomes possible?
00:19:00 We have a lot of sci fi that tells stories
00:19:02 of various dystopias, and increasingly you have movies
00:19:04 like Her that tell you a little bit
00:19:06 more of a utopic vision.
00:19:09 You think about the impacts that we’ve seen
00:19:12 from being able to have bicycles for our minds
00:19:16 and computers, and I think that the impact
00:19:20 of computers and the internet has just far outstripped
00:19:23 what anyone really could have predicted.
00:19:26 And so I think that it’s very clear
00:19:27 that if you can build an AGI,
00:19:29 it will be the most transformative technology
00:19:31 that humans will ever create.
00:19:34 And so what it boils down to then is a question of,
00:19:36 well, is there a path, is there hope,
00:19:39 is there a way to build such a system?
00:19:41 And I think that for 60 or 70 years,
00:19:43 that people got excited and they ended up
00:19:47 not being able to deliver on the hopes
00:19:49 that people had pinned on them.
00:19:51 And I think that then, that after two winters
00:19:54 of AI development, that people I think kind of
00:19:58 almost stopped daring to dream, right?
00:20:00 That really talking about AGI or thinking about AGI
00:20:03 became almost this taboo in the community.
00:20:06 But I actually think that people took the wrong lesson
00:20:08 from AI history.
00:20:10 And if you look back, starting in 1959
00:20:12 is when the Perceptron was released.
00:20:14 And this is basically one of the earliest neural networks.
00:20:17 It was released to what was perceived
00:20:19 as this massive overhype.
00:20:20 So in the New York Times in 1959,
00:20:22 you have this article saying that the Perceptron
00:20:26 will one day recognize people, call out their names,
00:20:29 instantly translate speech between languages.
00:20:31 And people at the time looked at this and said,
00:20:33 this is, your system can’t do any of that.
00:20:36 And basically spent 10 years trying to discredit
00:20:38 the whole Perceptron direction and succeeded.
00:20:40 And all the funding dried up.
00:20:41 And people kind of went in other directions.
00:20:44 And in the 80s, there was this resurgence.
00:20:46 And I’d always heard that the resurgence in the 80s
00:20:49 was due to the invention of backpropagation
00:20:51 and these algorithms that got people excited.
00:20:53 But actually the causality was due to people
00:20:55 building larger computers.
00:20:57 That you can find these articles from the 80s
00:20:59 saying that the democratization of computing power
00:21:01 suddenly meant that you could run
00:21:02 these larger neural networks.
00:21:04 And then people started to do all these amazing things.
00:21:06 Backpropagation algorithm was invented.
00:21:08 And the neural nets people were running
00:21:10 were these tiny little 20 neuron neural nets.
00:21:13 What are you supposed to learn with 20 neurons?
00:21:15 And so of course, they weren’t able to get great results.
00:21:18 And it really wasn’t until 2012 that this approach,
00:21:21 that’s almost the most simple, natural approach
00:21:24 that people had come up with in the 50s,
00:21:27 in some ways even in the 40s before there were computers,
00:21:30 with the McCulloch–Pitts neuron,
00:21:32 suddenly this became the best way of solving problems.
00:21:37 And I think there are three core properties
00:21:39 that deep learning has that I think
00:21:42 are very worth paying attention to.
00:21:44 The first is generality.
00:21:45 We have a very small number of deep learning tools.
00:21:48 SGD, deep neural net, maybe some RL.
00:21:53 And it solves this huge variety of problems.
00:21:55 Speech recognition, machine translation,
00:21:57 game playing, all of these problems, small set of tools.
00:22:00 So there’s the generality.
00:22:02 There’s a second piece, which is the competence.
00:22:04 You want to solve any of those problems?
00:22:07 Throw out 40 years’ worth of normal computer vision research,
00:22:10 replace it with a deep neural net,
00:22:11 it’s going to work better.
00:22:13 And there’s a third piece, which is the scalability.
00:22:16 One thing that has been shown time and time again
00:22:18 is that if you have a larger neural network,
00:22:21 throw more compute, more data at it, it will work better.
00:22:25 Those three properties together feel like essential parts
00:22:28 of building a general intelligence.
00:22:30 Now it doesn’t just mean that if we scale up what we have,
00:22:33 that we will have an AGI, right?
00:22:35 There are clearly missing pieces.
00:22:36 There are missing ideas.
00:22:38 We need to have answers for reasoning.
00:22:40 But I think that the core here is that for the first time,
00:22:44 it feels that we have a paradigm that gives us hope
00:22:47 that general intelligence can be achievable.
00:22:50 And so as soon as you believe that,
00:22:52 everything else comes into focus, right?
00:22:54 If you imagine that you may be able to,
00:22:56 and you know that the timeline I think remains uncertain,
00:22:59 but I think that certainly within our lifetimes
00:23:02 and possibly within a much shorter period of time
00:23:04 than people would expect,
00:23:06 if you can really build the most transformative technology
00:23:09 that will ever exist,
00:23:10 you stop thinking about yourself so much, right?
00:23:12 You start thinking about just like,
00:23:14 how do you have a world where this goes well?
00:23:16 And that you need to think about the practicalities
00:23:18 of how do you build an organization
00:23:19 and get together a bunch of people and resources
00:23:22 and to make sure that people feel motivated
00:23:25 and ready to do it.
00:23:26 But I think that then you start thinking about,
00:23:29 well, what if we succeed?
00:23:30 And how do we make sure that when we succeed,
00:23:32 that the world is actually the place
00:23:34 that we want ourselves to exist in?
00:23:36 And almost in the Rawlsian veil of ignorance sense of the word.
00:23:39 And so that’s kind of the broader landscape.
00:23:42 And OpenAI was really formed in 2015
00:23:45 with that high level picture of AGI might be possible
00:23:50 sooner than people think,
00:23:51 and that we need to try to do our best
00:23:54 to make sure it’s going to go well.
00:23:55 And then we spent the next couple of years
00:23:57 really trying to figure out what does that mean?
00:23:59 How do we do it?
00:24:00 And I think that typically with a company,
00:24:03 you start out very small, you and a cofounder,
00:24:06 and you build a product, you get some users,
00:24:07 you get a product market fit.
00:24:09 Then at some point you raise some money,
00:24:11 you hire people, you scale, and then down the road,
00:24:14 then the big companies realize you exist
00:24:16 and try to kill you.
00:24:17 And for OpenAI, it was basically everything
00:24:19 in exactly the opposite order.
00:24:21 Let me just pause for a second, you said a lot of things.
00:24:26 And let me just admire the jarring aspect
00:24:29 of what OpenAI stands for, which is daring to dream.
00:24:33 I mean, you said it’s pretty powerful.
00:24:35 It caught me off guard because I think that’s very true.
00:24:38 The step of just daring to dream about the possibilities
00:24:43 of creating intelligence in a positive, in a safe way,
00:24:47 but even just creating intelligence, is a very powerful,
00:24:50 much needed, refreshing catalyst for the AI community.
00:24:57 So that’s the starting point.
00:24:58 Okay, so then formation of OpenAI, what’s that?
00:25:02 I would just say that when we were starting OpenAI,
00:25:05 that kind of the first question that we had is,
00:25:07 is it too late to start a lab
00:25:10 with a bunch of the best people?
00:25:12 Right, is that even possible? Wow, okay.
00:25:13 That was an actual question?
00:25:14 That was the core question of,
00:25:17 we had this dinner in July of 2015,
00:25:19 and that was really what we spent the whole time
00:25:21 talking about.
00:25:22 And, you know, because you think about kind of where AI was,
00:25:26 it had transitioned from being an academic pursuit
00:25:30 to an industrial pursuit.
00:25:32 And so a lot of the best people were in these big
00:25:34 research labs, and we wanted to start our own, one
00:25:36 that, no matter how many resources we could accumulate,
00:25:40 would pale in comparison to the big tech companies.
00:25:43 And we knew that.
00:25:44 And it was a question of, are we going to be actually
00:25:47 able to get this thing off the ground?
00:25:48 You need critical mass.
00:25:49 You can’t just have you and a cofounder build a product.
00:25:52 You really need to have a group of five to 10 people.
00:25:55 And we kind of concluded it wasn’t obviously impossible.
00:25:59 So it seemed worth trying.
00:26:02 Well, you’re also a dreamer, so who knows, right?
00:26:04 That’s right.
00:26:05 Okay, so speaking of that, competing with the big players,
00:26:11 let’s talk about some of the tricky things
00:26:14 as you think through this process of growing,
00:26:17 of seeing how you can develop these systems
00:26:20 at a scale that competes.
00:26:22 So you recently formed OpenAI LP,
00:26:26 a new capped-profit company that now carries the name OpenAI.
00:26:30 So OpenAI is now this official company.
00:26:33 The original nonprofit company still exists
00:26:36 and carries the OpenAI nonprofit name.
00:26:39 So can you explain what this company is,
00:26:41 what the purpose of this creation is,
00:26:44 and how did you arrive at the decision to create it?
00:26:48 OpenAI, the whole entity and OpenAI LP as a vehicle
00:26:53 is trying to accomplish the mission
00:26:55 of ensuring that artificial general intelligence
00:26:57 benefits everyone.
00:26:58 And the main way that we’re trying to do that
00:27:00 is by actually trying to build general intelligence
00:27:02 ourselves and make sure the benefits
00:27:04 are distributed to the world.
00:27:05 That’s the primary way.
00:27:07 We’re also fine if someone else does this, right?
00:27:09 Doesn’t have to be us.
00:27:10 If someone else is going to build an AGI
00:27:12 and make sure that the benefits don’t get locked up
00:27:14 in one company or with one set of people,
00:27:19 like we’re actually fine with that.
00:27:21 And so those ideas are baked into our charter,
00:27:25 which is kind of the foundational document
00:27:28 that describes kind of our values and how we operate.
00:27:32 But it’s also really baked into the structure of OpenAI LP.
00:27:36 And so the way that we’ve set up OpenAI LP
00:27:37 is that in the case where we succeed, right?
00:27:42 If we actually build what we’re trying to build,
00:27:45 then investors are able to get a return,
00:27:48 but that return is something that is capped.
00:27:50 And so if you think of AGI in terms of the value
00:27:52 that you could really create,
00:27:54 you’re talking about the most transformative technology
00:27:56 ever created, it’s going to create orders of magnitude
00:27:58 more value than any existing company.
00:28:01 And that all of that value will be owned by the world,
00:28:05 like legally titled to the nonprofit
00:28:07 to fulfill that mission.
00:28:09 And so that’s the structure.
00:28:12 So the mission is a powerful one,
00:28:15 and it’s one that I think most people would agree with.
00:28:18 It’s how we would hope AI progresses.
00:28:22 And so how do you tie yourself to that mission?
00:28:25 How do you make sure you do not deviate from that mission,
00:28:29 that other incentives that are profit driven
00:28:35 don’t interfere with the mission?
00:28:36 So this was actually a really core question for us
00:28:39 for the past couple of years,
00:28:40 because I’d say that like the way that our history went
00:28:43 was that for the first year,
00:28:44 we were getting off the ground, right?
00:28:46 We had this high level picture,
00:28:47 but we didn’t know exactly how we wanted to accomplish it.
00:28:51 And really two years ago is when we first started realizing
00:28:55 in order to build AGI,
00:28:56 we’re just going to need to raise way more money
00:28:58 than we can as a nonprofit.
00:29:00 And we’re talking many billions of dollars.
00:29:02 And so the first question is how are you supposed to do that
00:29:06 and stay true to this mission?
00:29:08 And we looked at every legal structure out there
00:29:10 and concluded none of them were quite right
00:29:11 for what we wanted to do.
00:29:13 And I guess it shouldn’t be too surprising
00:29:14 if you’re gonna do some like crazy unprecedented technology
00:29:16 that you’re gonna have to come up with
00:29:17 some crazy unprecedented structure to do it in.
00:29:20 And a lot of our conversation was with people at OpenAI,
00:29:26 the people who really joined
00:29:27 because they believe so much in this mission
00:29:29 and thinking about how do we actually
00:29:31 raise the resources to do it
00:29:33 and also stay true to what we stand for.
00:29:35 And the place you gotta start is to really align
00:29:37 on what is it that we stand for, right?
00:29:39 What are those values?
00:29:40 What’s really important to us?
00:29:41 And so I’d say that we spent about a year
00:29:43 really compiling the OpenAI charter
00:29:47 and if you even look at the first line item in there,
00:29:50 it says that, look, we expect we’re gonna have to marshal
00:29:52 huge amounts of resources,
00:29:53 but we’re going to make sure that we minimize
00:29:55 conflict of interest with the mission.
00:29:57 And that kind of aligning on all of those pieces
00:30:00 was the most important step towards figuring out
00:30:04 how do we structure a company
00:30:06 that can actually raise the resources
00:30:08 to do what we need to do.
00:30:10 I imagine OpenAI, the decision to create OpenAI LP
00:30:14 was a really difficult one.
00:30:16 And there was a lot of discussions,
00:30:17 as you mentioned, for a year,
00:30:19 and there was different ideas,
00:30:22 perhaps detractors within OpenAI,
00:30:26 sort of different paths that you could have taken.
00:30:28 What were those concerns?
00:30:30 What were the different paths considered?
00:30:32 What was that process of making that decision like?
00:30:34 Yep, so if you look actually at the OpenAI charter,
00:30:37 there’s almost two paths embedded within it.
00:30:40 There is, we are primarily trying to build AGI ourselves,
00:30:44 but we’re also okay if someone else does it.
00:30:47 And this is a weird thing for a company.
00:30:49 It’s really interesting, actually.
00:30:51 There is an element of competition
00:30:53 that you do wanna be the one that does it,
00:30:56 but at the same time, you’re okay if somebody else does it.
00:30:59 We’ll talk about that a little bit, that trade off,
00:31:01 that dance that’s really interesting.
00:31:02 And I think this was the core tension
00:31:04 as we were designing OpenAI LP,
00:31:06 and really the OpenAI strategy,
00:31:08 is how do you make sure that both you have a shot
00:31:11 at being a primary actor,
00:31:12 which really requires building an organization,
00:31:15 raising massive resources,
00:31:17 and really having the will to go
00:31:19 and execute on some really, really hard vision, right?
00:31:22 You need to really sign up for a long period
00:31:23 to go and take on a lot of pain and a lot of risk.
00:31:27 And to do that, normally you just import
00:31:30 the startup mindset, right?
00:31:31 And that you think about, okay,
00:31:32 like how do we out execute everyone?
00:31:34 You have this very competitive angle.
00:31:36 But you also have the second angle of saying that,
00:31:38 well, the true mission isn’t for OpenAI to build AGI.
00:31:41 The true mission is for AGI to go well for humanity.
00:31:45 And so how do you take all of those first actions
00:31:48 and make sure you don’t close the door on outcomes
00:31:51 that would actually be positive and fulfill the mission?
00:31:54 And so I think it’s a very delicate balance, right?
00:31:56 And I think that going 100% one direction or the other
00:31:59 is clearly not the correct answer.
00:32:01 And so I think that even in terms of just how we talk
00:32:03 about OpenAI and think about it,
00:32:05 there’s just like one thing that’s always in the back
00:32:07 of my mind is to make sure that we’re not just saying
00:32:11 OpenAI’s goal is to build AGI, right?
00:32:14 That it’s actually much broader than that, right?
00:32:15 That first of all, it’s not just AGI,
00:32:18 it’s safe AGI that’s very important.
00:32:20 But secondly, our goal isn’t to be the ones to build it.
00:32:23 Our goal is to make sure it goes well for the world.
00:32:24 And so I think that figuring out
00:32:26 how do you balance all of those
00:32:27 and to get people to really come to the table
00:32:30 and compile a single document that encompasses all of that
00:32:36 wasn’t trivial.
00:32:37 So part of the challenge here is your mission is,
00:32:41 I would say, beautiful, empowering,
00:32:44 and a beacon of hope for people in the research community
00:32:47 and just people thinking about AI.
00:32:49 So your decisions are scrutinized more than,
00:32:53 I think, a regular profit driven company.
00:32:55 Do you feel the burden of this
00:32:57 in the creation of the charter
00:32:58 and just in the way you operate?
00:33:00 Yes.
00:33:01 So why do you lean into the burden
00:33:07 by creating such a charter?
00:33:08 Why not keep it quiet?
00:33:10 I mean, it just boils down to the mission, right?
00:33:12 Like I’m here and everyone else is here
00:33:15 because we think this is the most important mission.
00:33:17 Dare to dream.
00:33:18 All right, so do you think you can be good for the world
00:33:23 or create an AGI system that’s good
00:33:25 when you’re a for profit company?
00:33:28 From my perspective, I don’t understand
00:33:30 why profit interferes with positive impact on society.
00:33:37 I don’t understand why Google,
00:33:40 that makes most of its money from ads,
00:33:42 can’t also do good for the world
00:33:45 or other companies, Facebook, anything.
00:33:47 I don’t understand why those have to interfere.
00:33:50 You know, profit isn’t the thing, in my view,
00:33:55 that affects the impact of a company.
00:33:57 What affects the impact of the company is the charter,
00:34:00 is the culture, is the people inside,
00:34:04 and profit is the thing that just fuels those people.
00:34:07 So what are your views there?
00:34:08 Yeah, so I think that’s a really good question
00:34:10 and there’s some real longstanding debates
00:34:14 in human society that are wrapped up in it.
00:34:16 The way that I think about it is just think about
00:34:18 what are the most impactful non profits in the world?
00:34:23 What are the most impactful for profits in the world?
00:34:26 Right, it’s much easier to list the for profits.
00:34:29 That’s right, and I think that there’s some real truth here
00:34:32 that the system that we set up,
00:34:34 the system for kind of how today’s world is organized,
00:34:38 is one that really allows for huge impact.
00:34:41 And part of that is that
00:34:45 for profits are self sustaining
00:34:48 and able to kind of build on their own momentum.
00:34:51 And I think that’s a really powerful thing.
00:34:53 It’s something that when it turns out
00:34:55 that we haven’t set the guardrails correctly,
00:34:57 causes problems, right?
00:34:58 Think about logging companies that go into forest,
00:35:01 the rainforest, that’s really bad, we don’t want that.
00:35:04 And it’s actually really interesting to me
00:35:06 that kind of this question of how do you get
00:35:08 positive benefits out of a for profit company,
00:35:11 it’s actually very similar to how do you get
00:35:13 positive benefits out of an AGI, right?
00:35:15 That you have this like very powerful system,
00:35:17 it’s more powerful than any human,
00:35:19 and is kind of autonomous in some ways,
00:35:21 it’s superhuman in a lot of axes,
00:35:23 and somehow you have to set the guardrails
00:35:25 to get good things to happen.
00:35:26 But when you do, the benefits are massive.
00:35:29 And so I think that when I think about
00:35:32 nonprofit versus for profit,
00:35:34 I think just not enough happens in nonprofits,
00:35:36 they’re very pure, but it’s just kind of,
00:35:39 it’s just hard to do things there.
00:35:40 In for profits in some ways, like too much happens,
00:35:43 but if kind of shaped in the right way,
00:35:46 it can actually be very positive.
00:35:47 And so with OpenAI LP, we’re picking a road in between.
00:35:52 Now the thing that I think is really important to recognize
00:35:54 is that the way that we think about OpenAI LP
00:35:57 is that in the world where AGI actually happens, right,
00:36:00 in a world where we are successful,
00:36:01 we build the most transformative technology ever,
00:36:03 the amount of value we’re gonna create will be astronomical.
00:36:07 And so then in that case, that the cap that we have
00:36:12 will be a small fraction of the value we create,
00:36:15 and the amount of value that goes back to investors
00:36:17 and employees looks pretty similar to what would happen
00:36:20 in a pretty successful startup.
00:36:23 And that’s really the case that we’re optimizing for, right?
00:36:26 That we’re thinking about in the success case,
00:36:28 making sure that the value we create doesn’t get locked up.
00:36:32 And I expect that in other for profit companies
00:36:34 that it’s possible to do something like that.
00:36:37 I think it’s not obvious how to do it, right?
00:36:39 I think that as a for profit company,
00:36:41 you have a lot of fiduciary duty to your shareholders
00:36:44 and that there are certain decisions
00:36:45 that you just cannot make.
00:36:47 In our structure, we’ve set it up
00:36:49 so that we have a fiduciary duty to the charter,
00:36:52 that we always get to make the decision
00:36:54 that is right for the charter rather than,
00:36:57 even if it comes at the expense of our own stakeholders.
00:37:00 And so I think that when I think about
00:37:03 what’s really important,
00:37:04 it’s not really about nonprofit versus for profit,
00:37:06 it’s really a question of if you build AGI
00:37:09 and you kind of, humanity’s now in this new age,
00:37:13 who benefits, whose lives are better?
00:37:15 And I think that what’s really important
00:37:17 is to have an answer that is everyone.
00:37:20 Yeah, which is one of the core aspects of the charter.
00:37:23 So one concern people have, not just with OpenAI,
00:37:26 but with Google, Facebook, Amazon,
00:37:28 anybody really that’s creating impact at scale
00:37:35 is how do we avoid, as your charter says,
00:37:37 avoid enabling the use of AI or AGI
00:37:40 to unduly concentrate power?
00:37:43 Why would not a company like OpenAI
00:37:45 keep all the power of an AGI system to itself?
00:37:48 The charter.
00:37:49 The charter.
00:37:50 So how does the charter
00:37:53 actualize itself in day to day?
00:37:57 So I think that first, to zoom out,
00:38:00 that the way that we structure the company
00:38:01 is so that the power for sort of dictating the actions
00:38:05 that OpenAI takes ultimately rests with the board,
00:38:08 the board of the nonprofit.
00:38:11 And the board is set up in certain ways
00:38:12 with certain restrictions that you can read about
00:38:14 in the OpenAI LP blog post.
00:38:16 But effectively the board is the governing body
00:38:19 for OpenAI LP.
00:38:21 And the board has a duty to fulfill the mission
00:38:24 of the nonprofit.
00:38:26 And so that’s kind of how we tie,
00:38:28 how we thread all these things together.
00:38:30 Now there’s a question of, so day to day,
00:38:32 how do people, the individuals,
00:38:34 who in some ways are the most empowered ones, right?
00:38:36 Now the board sort of gets to call the shots
00:38:38 at the high level, but the people
00:38:40 who are actually executing are the employees, right?
00:38:43 People here on a day to day basis
00:38:44 who have the keys to the whole technical kingdom.
00:38:48 And there I think that the answer looks a lot like,
00:38:51 well, how does any company’s values get actualized, right?
00:38:55 And I think that a lot of that comes down to
00:38:56 that you need people who are here
00:38:58 because they really believe in that mission
00:39:01 and they believe in the charter
00:39:02 and that they are willing to take actions
00:39:05 that maybe are worse for them,
00:39:07 but are better for the charter.
00:39:08 And that’s something that’s really baked into the culture.
00:39:11 And honestly, I think it’s, you know,
00:39:13 I think that that’s one of the things
00:39:14 that we really have to work to preserve as time goes on.
00:39:18 And that’s a really important part
00:39:19 of how we think about hiring people
00:39:21 and bringing people into OpenAI.
00:39:23 So there’s people here
00:39:25 who could speak up and say, like, hold on a second,
00:39:30 this is totally against what we stand for, culture wise.
00:39:34 Yeah, yeah, for sure.
00:39:35 I mean, I think that we actually have,
00:39:37 I think that’s like a pretty important part
00:39:38 of how we operate and how we have,
00:39:41 even again with designing the charter
00:39:44 and designing OpenAI LP in the first place,
00:39:46 that there has been a lot of conversation
00:39:48 with employees here and a lot of times
00:39:50 where employees said, wait a second,
00:39:52 this seems like it’s going in the wrong direction
00:39:53 and let’s talk about it.
00:39:55 And so, you know, here’s actually one thing
00:39:58 that I think is very unique about us as a small company,
00:40:02 is that if you’re at a massive tech giant,
00:40:04 that’s a little bit hard for someone
00:40:05 who’s a line employee to go and talk to the CEO
00:40:08 and say, I think that we’re doing this wrong.
00:40:10 And you know, you’ll get companies like Google
00:40:13 that have had some collective action from employees
00:40:15 to make ethical change around things like Maven.
00:40:19 And so maybe there are mechanisms
00:40:20 at other companies that work.
00:40:22 But here, super easy for anyone to pull me aside,
00:40:24 to pull Sam aside, to pull Ilya aside,
00:40:26 and people do it all the time.
00:40:27 One of the interesting things in the charter
00:40:29 is this idea that it’d be great
00:40:31 if you could try to describe or untangle
00:40:34 switching from competition to collaboration
00:40:36 in late stage AGI development.
00:40:38 It’s really interesting,
00:40:39 this dance between competition and collaboration.
00:40:42 How do you think about that?
00:40:43 Yeah, assuming that you can actually do
00:40:45 the technical side of AGI development,
00:40:47 I think there’s going to be two key problems
00:40:48 with figuring out how do you actually deploy it,
00:40:50 make it go well.
00:40:51 The first one of these is the run up
00:40:53 to building the first AGI.
00:40:56 You look at how self driving cars are being developed,
00:40:58 and it’s a competitive race.
00:41:00 And the thing that always happens in a competitive race
00:41:02 is that you have huge amounts of pressure
00:41:04 to get rid of safety.
00:41:06 And so that’s one thing we’re very concerned about,
00:41:08 is that people, multiple teams figuring out
00:41:12 we can actually get there,
00:41:13 but if we took the slower path
00:41:16 that is more guaranteed to be safe, we will lose.
00:41:20 And so we’re going to take the fast path.
00:41:22 And so the more that we can both ourselves
00:41:25 be in a position where we don’t generate
00:41:27 that competitive race, where we say,
00:41:29 if the race is being run and that someone else
00:41:31 is further ahead than we are,
00:41:33 we’re not going to try to leapfrog.
00:41:35 We’re going to actually work with them, right?
00:41:37 We will help them succeed.
00:41:38 As long as what they’re trying to do
00:41:40 is to fulfill our mission, then we’re good.
00:41:42 We don’t have to build AGI ourselves.
00:41:44 And I think that’s a really important commitment from us,
00:41:47 but it can’t just be unilateral, right?
00:41:49 I think that it’s really important that other players
00:41:51 who are serious about building AGI
00:41:53 make similar commitments, right?
00:41:54 I think that, again, to the extent that everyone believes
00:41:57 that AGI should be something to benefit everyone,
00:42:00 then it actually really shouldn’t matter
00:42:01 which company builds it.
00:42:02 And we should all be concerned about the case
00:42:04 where we just race so hard to get there
00:42:06 that something goes wrong.
00:42:07 So what role do you think government,
00:42:10 our favorite entity, has in setting policy and rules
00:42:13 about this domain, from research to the development
00:42:18 to early stage to late stage AI and AGI development?
00:42:22 So I think that, first of all,
00:42:25 it’s really important that government’s in there, right?
00:42:28 In some way, shape, or form.
00:42:29 At the end of the day, we’re talking about
00:42:30 building technology that will shape how the world operates,
00:42:35 and that there needs to be government
00:42:37 as part of that answer.
00:42:39 And so that’s why we’ve done a number
00:42:42 of different congressional testimonies,
00:42:43 we interact with a number of different lawmakers,
00:42:46 and that right now, a lot of our message to them
00:42:50 is that it’s not the time for regulation,
00:42:54 it is the time for measurement, right?
00:42:56 That our main policy recommendation is that people,
00:42:59 and the government does this all the time
00:43:00 with bodies like NIST, spend time trying to figure out
00:43:04 just where the technology is, how fast it’s moving,
00:43:07 and can really become literate and up to speed
00:43:11 with respect to what to expect.
00:43:13 So I think that today, the answer really
00:43:15 is about measurement, and I think that there will be a time
00:43:19 and place where that will change.
00:43:21 And I think it’s a little bit hard to predict
00:43:23 exactly what that trajectory should look like.
00:43:27 So there will be a point at which regulation,
00:43:31 federal in the United States, the government steps in
00:43:34 and helps be the, I don’t wanna say the adult in the room,
00:43:39 to make sure that there is strict rules,
00:43:42 maybe conservative rules that nobody can cross.
00:43:45 Well, I think there’s kind of maybe two angles to it.
00:43:47 So today, with narrow AI applications
00:43:49 that I think there are already existing bodies
00:43:51 that are responsible and should be responsible
00:43:53 for regulation, you think about, for example,
00:43:55 with self driving cars, that you want the National Highway.
00:44:00 NHTSA.
00:44:01 Yeah, exactly, to be regulating that.
00:44:02 That makes sense, right, that basically what we’re saying
00:44:04 is that we’re going to have these technological systems
00:44:08 that are going to be performing applications
00:44:10 that humans already do, great.
00:44:12 We already have ways of thinking about standards
00:44:14 and safety for those.
00:44:16 So I think actually empowering those regulators today
00:44:18 is also pretty important.
00:44:20 And then I think for AGI, that there’s going to be a point
00:44:24 where we’ll have better answers.
00:44:26 And I think that maybe a similar approach
00:44:27 of first measurement and start thinking about
00:44:30 what the rules should be.
00:44:31 I think it’s really important
00:44:32 that we don’t prematurely squash progress.
00:44:36 I think it’s very easy to kind of smother a budding field.
00:44:40 And I think that’s something to really avoid.
00:44:42 But I don’t think that the right way of doing it
00:44:43 is to say, let’s just try to blaze ahead
00:44:46 and not involve all these other stakeholders.
00:44:50 So you recently released a paper on GPT2 language modeling,
00:44:58 but did not release the full model
00:45:02 because you had concerns about the possible
00:45:04 negative effects of the availability of such model.
00:45:07 It’s outside of just that decision,
00:45:10 it’s super interesting because of the discussion
00:45:14 at a societal level, the discourse it creates.
00:45:16 So it’s fascinating in that aspect.
00:45:19 But to get into the specifics here first,
00:45:22 what are some negative effects that you envisioned?
00:45:25 And of course, what are some of the positive effects?
00:45:28 Yeah, so again, I think to zoom out,
00:45:30 the way that we thought about GPT2
00:45:33 is that with language modeling,
00:45:35 we are clearly on a trajectory right now
00:45:38 where we scale up our models
00:45:40 and we get qualitatively better performance.
00:45:44 GPT2 itself was actually just a scale up
00:45:47 of a model that we'd released the previous June.
00:45:50 We just ran it at much larger scale
00:45:52 and we got these results where
00:45:54 suddenly starting to write coherent prose,
00:45:57 which was not something we’d seen previously.
00:46:00 And what are we doing now?
00:46:01 Well, we’re gonna scale up GPT2 by 10x, by 100x, by 1000x,
00:46:05 and we don’t know what we’re gonna get.
00:46:07 And so it’s very clear that the model
00:46:10 that we released last June,
00:46:12 I think it’s kind of like, it’s a good academic toy.
00:46:16 It’s not something that we think is something
00:46:18 that can really have negative applications
00:46:20 or to the extent that it can,
00:46:21 the positive of people being able to play with it
00:46:24 far outweighs the possible harms.
00:46:28 You fast forward to not GPT2, but GPT20,
00:46:32 and you think about what that’s gonna be like.
00:46:34 And I think that the capabilities are going to be substantive.
00:46:38 And so there needs to be a point in between the two
00:46:41 where you say, this is something
00:46:43 where we are drawing the line
00:46:45 and that we need to start thinking about the safety aspects.
00:46:47 And I think for GPT2, we could have gone either way.
00:46:50 And in fact, when we had conversations internally,
00:46:52 we had a bunch of pros and cons,
00:46:54 and it wasn't clear which side outweighed the other.
00:46:58 And I think that when we announced that,
00:46:59 hey, we decided not to release this model,
00:47:02 then there was a bunch of conversation
00:47:03 where various people said,
00:47:04 it’s so obvious that you should have just released it.
00:47:06 There were other people who said,
00:47:07 it’s so obvious you should not have released it.
00:47:08 And I think that that almost definitionally means
00:47:10 that holding it back was the correct decision.
00:47:13 Right, if it’s not obvious
00:47:15 whether something is beneficial or not,
00:47:17 you should probably default to caution.
00:47:19 And so I think that the overall landscape
00:47:22 for how we think about it
00:47:23 is that this decision could have gone either way.
00:47:25 There are great arguments in both directions,
00:47:27 but for future models down the road
00:47:30 and possibly sooner than you’d expect,
00:47:32 because scaling these things up
00:47:33 doesn’t actually take that long,
00:47:35 those ones you’re definitely not going to want
00:47:37 to release into the wild.
00:47:39 And so I think that we almost view this as a test case
00:47:42 and to see, can we even design,
00:47:45 you know, how do you have a society
00:47:46 or how do you have a system
00:47:47 that goes from having no concept
00:47:49 of responsible disclosure,
00:47:50 where the mere idea of not releasing something
00:47:53 for safety reasons is unfamiliar
00:47:55 to a world where you say, okay, we have a powerful model,
00:47:58 let’s at least think about it,
00:47:59 let’s go through some process.
00:48:01 And you think about the security community,
00:48:02 it took them a long time
00:48:03 to design responsible disclosure, right?
00:48:05 You know, you think about this question of,
00:48:07 well, I have a security exploit,
00:48:08 I send it to the company,
00:48:09 and the company tries to prosecute me
00:48:11 or just ignores it, what do I do, right?
00:48:16 And so, you know, the alternative of,
00:48:17 oh, I just always publish my exploits,
00:48:19 that doesn’t seem good either, right?
00:48:20 And so it really took a long time
00:48:21 and it was bigger than any individual, right?
00:48:25 It's really about building a whole community
00:48:27 that believes that, okay, we'll have this process
00:48:28 where you send it to the company, you know,
00:48:30 if they don’t act in a certain time,
00:48:31 then you can go public and you’re not a bad person,
00:48:34 you’ve done the right thing.
00:48:36 And I think that in AI,
00:48:38 part of the response to GPT2 just proves
00:48:41 that we don’t have any concept of this.
00:48:44 So that’s the high level picture.
00:48:47 And so I think that,
00:48:48 I think this was a really important move to make
00:48:51 and we could have maybe delayed it for GPT3,
00:48:53 but I’m really glad we did it for GPT2.
00:48:56 And so now you look at GPT2 itself
00:48:57 and you think about the substance of, okay,
00:48:59 what are potential negative applications?
00:49:01 So you have this model that’s been trained on the internet,
00:49:04 which, you know, is also going to include
00:49:05 a bunch of very biased data,
00:49:06 a bunch of, you know, very offensive content,
00:49:09 and you can ask it to generate content for you
00:49:13 on basically any topic, right?
00:49:14 You just give it a prompt and it’ll just start writing
00:49:16 and it writes content like you see on the internet,
00:49:19 you know, even down to like saying advertisement
00:49:21 in the middle of some of its generations.
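As a concrete illustration of this prompt-and-continue behavior, here is a minimal sketch that samples a continuation from the publicly released small GPT2 weights. The Hugging Face transformers package, the "gpt2" model name, the example prompt, and the sampling parameters are assumptions of this sketch, not OpenAI's own tooling.

```python
# Sketch: prompt the small released GPT-2 and sample a continuation.
# Assumes the Hugging Face `transformers` package is installed; the model
# name, prompt, and sampling settings are illustrative choices.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # small released model
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Recycling is bad for the world because"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sample token by token with top-k sampling to continue the prompt.
output_ids = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_k=40,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```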
00:49:24 And you think about the possibilities
00:49:26 for generating fake news or abusive content.
00:49:29 And, you know, it’s interesting seeing
00:49:30 what people have done with the smaller version
00:49:31 of GPT2 that we released.
00:49:34 People have done things like
00:49:37 take my own Facebook message history
00:49:40 and generate more Facebook messages like me,
00:49:43 or generate fake politician content,
00:49:47 and, you know, there's a bunch of things there
00:49:49 where you at least have to think,
00:49:51 is this going to be good for the world?
00:49:54 There’s the flip side, which is I think
00:49:56 that there’s a lot of awesome applications
00:49:57 that we really want to see,
00:49:59 like creative applications in terms of
00:50:02 if you have sci fi authors that can work with this tool
00:50:05 and come up with cool ideas, like that seems awesome
00:50:08 if we can write better sci fi through the use of these tools
00:50:11 and we’ve actually had a bunch of people write into us
00:50:13 asking, hey, can we use it for, you know,
00:50:16 a variety of different creative applications?
00:50:18 So the positive are actually pretty easy to imagine.
00:50:21 The usual NLP applications
00:50:26 are really interesting, but let’s go there.
00:50:30 It’s kind of interesting to think about a world
00:50:32 where, look at Twitter, not just fake news,
00:50:37 but smarter and smarter bots are able to spread
00:50:42 information in interesting, complex, networked ways
00:50:47 that just flood out us regular human beings
00:50:50 with our original thoughts.
00:50:52 So what are your views of this world with GPT20, right?
00:51:00 How do we think about it?
00:51:01 Again, it's like one of those things where, in the 50s,
00:51:03 you're trying to describe the internet or the smartphone.
00:51:08 What do you think about that world,
00:51:09 the nature of information?
00:51:12 One possibility is that we’ll always try to design systems
00:51:16 that identify robot versus human
00:51:19 and we’ll do so successfully and so we’ll authenticate
00:51:23 that we’re still human and the other world is that
00:51:25 we just accept the fact that we’re swimming in a sea
00:51:29 of fake news and just learn to swim there.
00:51:32 Well, have you ever seen the popular meme of a robot
00:51:39 with a physical arm and pen clicking the
00:51:42 "I'm not a robot" button?
00:51:43 Yeah.
00:51:44 I think the truth is that really trying to distinguish
00:51:48 between robot and human is a losing battle.
00:51:52 Ultimately, you think it’s a losing battle?
00:51:53 I think it’s a losing battle ultimately, right?
00:51:55 I think that that is, in terms of the content,
00:51:57 in terms of the actions that you can take.
00:51:59 I mean, think about how CAPTCHAs have gone, right?
00:52:01 CAPTCHAs used to be very nice and simple:
00:52:02 you just have this image, all of our OCR is terrible,
00:52:06 you put a couple of artifacts in it,
00:52:08 humans are gonna be able to tell what it is,
00:52:11 an AI system wouldn't be able to.
00:52:13 Today, I can barely do CAPTCHAs.
00:52:15 And I think that this is just kind of where we're going.
00:52:18 I think CAPTCHAs were a moment in time thing,
00:52:20 and as AI systems become more powerful,
00:52:22 the idea that there will be human capabilities that can be measured
00:52:25 in a very easy, automated way that AIs
00:52:28 will not be capable of,
00:52:30 I think that's just,
00:52:31 it's just an increasingly hard technical battle.
00:52:34 But it’s not that all hope is lost, right?
00:52:36 You think about how do we already authenticate ourselves,
00:52:40 right, that we have systems, we have social security numbers
00:52:43 if you’re in the US or you have ways of identifying
00:52:47 individual people and having real world identity
00:52:50 tied to digital identity seems like a step
00:52:53 towards authenticating the source of content
00:52:56 rather than the content itself.
00:52:58 Now, there are problems with that.
00:52:59 How can you have privacy and anonymity
00:53:02 in a world where the only content you can really trust is,
00:53:05 or the only way you can trust content
00:53:06 is by looking at where it comes from?
00:53:08 And so I think that building out good reputation networks
00:53:11 may be one possible solution.
00:53:14 But yeah, I think that this question is not an obvious one.
00:53:17 And I think that we, maybe sooner than we think,
00:53:20 will be in a world where today I often will read a tweet
00:53:23 and be like, hmm, do I feel like a real human wrote this?
00:53:25 Or do I feel like this is genuine?
00:53:27 I feel like I can kind of judge the content a little bit.
00:53:30 And I think in the future, it just won’t be the case.
00:53:32 You look at, for example, the FCC comments on net neutrality.
00:53:36 It came out later that millions of those were auto generated
00:53:39 and that researchers were able to use
00:53:41 various statistical techniques to detect that.
00:53:44 What do you do in a world
00:53:45 where those statistical techniques don’t exist?
00:53:47 It’s just impossible to tell the difference
00:53:49 between humans and AIs.
00:53:50 And in fact, the most persuasive arguments
00:53:53 are written by AI.
00:53:56 All that stuff, it’s not sci fi anymore.
00:53:58 You look at GPT2 making a great argument
00:54:00 for why recycling is bad for the world.
00:54:02 You gotta read that and be like, huh, you’re right.
00:54:04 We are addressing just the symptoms.
00:54:06 Yeah, that’s quite interesting.
00:54:08 I mean, ultimately it boils down to the physical world
00:54:11 being the last frontier of proving,
00:54:13 so you said like basically networks of people,
00:54:16 humans vouching for humans in the physical world.
00:54:19 And somehow the authentication ends there.
00:54:22 I mean, if I had to ask you,
00:54:25 I mean, you’re way too eloquent for a human.
00:54:28 So if I had to ask you to authenticate,
00:54:31 like prove how do I know you’re not a robot
00:54:33 and how do you know I’m not a robot?
00:54:34 Yeah.
00:54:35 I think that so far, in this space,
00:54:40 this conversation we just had,
00:54:42 the physical movements we did,
00:54:44 the biggest gap between us and AI systems
00:54:47 is the physical manipulation.
00:54:49 So maybe that’s the last frontier.
00:54:51 Well, here's another question:
00:54:55 why is solving this problem important, right?
00:54:57 Like what aspects are really important to us?
00:54:59 And I think that probably where we’ll end up
00:55:01 is we’ll hone in on what do we really want
00:55:03 out of knowing if we’re talking to a human.
00:55:06 And I think that, again, this comes down to identity.
00:55:09 And so I think that the internet of the future,
00:55:11 I expect to be one that will have lots of agents out there
00:55:14 that will interact with you.
00:55:16 But I think that the question of is this
00:55:19 flesh, real flesh and blood human
00:55:21 or is this an automated system,
00:55:23 may actually just be less important.
00:55:25 Let’s actually go there.
00:55:27 So GPT2 is impressive, and let's look at GPT20.
00:55:32 Why is it so bad that all my friends are GPT20?
00:55:37 Why is it so important on the internet,
00:55:43 do you think, to interact with only human beings?
00:55:47 Why can’t we live in a world where ideas can come
00:55:50 from models trained on human data?
00:55:52 Yeah, I think this is actually
00:55:54 a really interesting question.
00:55:55 This comes back to: how do you even picture a world
00:55:58 with some new technology?
00:55:59 And I think that one thing that I think is important
00:56:02 is, you know, let’s say honesty.
00:56:04 And I think that if you have, almost in the Turing test
00:56:07 sense, AIs that are pretending
00:56:12 to be humans and deceiving you.
00:56:14 I think that feels like a bad thing, right?
00:56:17 I think that it’s really important that we feel like
00:56:19 we’re in control of our environment, right?
00:56:20 That we understand who we’re interacting with.
00:56:23 And if it’s an AI or a human, that’s not something
00:56:27 that we’re being deceived about.
00:56:28 But then I think there's the flip side: can I have as meaningful
00:56:31 an interaction with an AI as I can with a human?
00:56:33 Well, I actually think here you can turn to sci fi.
00:56:36 And Her, I think, is a great example of asking
00:56:39 this very question, right?
00:56:40 One thing I really love about Her is it really starts out
00:56:42 almost by asking how meaningful
00:56:44 are human virtual relationships, right?
00:56:47 And then you have a human who has a relationship with an AI
00:56:50 and that you really start to be drawn into that, right?
00:56:54 That all of your emotional buttons get triggered
00:56:56 in the same way as if there was a real human
00:56:58 that was on the other side of that phone.
00:57:00 And so I think that this is one way of thinking about it
00:57:03 is that I think that we can have meaningful interactions
00:57:06 and that if there’s a funny joke,
00:57:09 in some sense it doesn't really matter
00:57:10 if it was written by a human or an AI.
00:57:12 But what you don’t want and why I think
00:57:14 we should really draw hard lines is deception.
00:57:17 And I think that as long as we’re in a world
00:57:19 where why do we build AI systems at all, right?
00:57:22 The reason we want to build them is to enhance human lives,
00:57:24 to make humans be able to do more things,
00:57:26 to have humans feel more fulfilled.
00:57:28 And if we can build AI systems that do that, sign me up.
00:57:32 So the process of language modeling,
00:57:36 how far do you think it will take us?
00:57:38 Let's look at the movie Her.
00:57:40 Do you think dialogue, natural language conversation,
00:57:44 as formulated by the Turing test, for example,
00:57:47 do you think that process could be achieved
00:57:50 through this kind of unsupervised language modeling?
00:57:52 So I think the Turing test in its real form
00:57:56 isn’t just about language, right?
00:57:58 It’s really about reasoning too, right?
00:58:00 To really pass the Turing test,
00:58:01 I should be able to teach calculus
00:58:03 to whoever’s on the other side
00:58:05 and have it really understand calculus
00:58:07 and be able to go and solve new calculus problems.
00:58:11 And so I think that to really solve the Turing test,
00:58:13 we need more than what we’re seeing with language models.
00:58:16 We need some way of plugging in reasoning.
00:58:18 Now, how different will that be from what we already do?
00:58:22 That’s an open question, right?
00:58:23 Might be that we need some sequence
00:58:25 of totally radical new ideas,
00:58:26 or it might be that we just need to kind of shape
00:58:29 our existing systems in a slightly different way.
00:58:32 But I think that in terms of how far language modeling
00:58:35 will go, it’s already gone way further
00:58:37 than many people would have expected, right?
00:58:39 I think that things like,
00:58:40 and I think there’s a lot of really interesting angles
00:58:42 to poke in terms of how much does GPT2
00:58:45 understand the physical world?
00:58:47 Like, you read a little bit about fire underwater in GPT2.
00:58:52 So it’s like, okay, maybe it doesn’t quite understand
00:58:53 what these things are, but at the same time,
00:58:56 I think that you also see various things
00:58:58 like smoke coming from flame,
00:59:00 and a bunch of these things that GPT2,
00:59:02 it has no body, it has no physical experience,
00:59:04 it’s just statically read data.
00:59:06 And I think that the answer is like, we don’t know yet.
00:59:13 These questions, though, we’re starting to be able
00:59:15 to actually ask them to physical systems,
00:59:17 to real systems that exist, and that’s very exciting.
00:59:19 Do you think, what’s your intuition?
00:59:20 Do you think if you just scale language modeling,
00:59:25 like significantly scale,
00:59:27 that reasoning can emerge from the same exact mechanisms?
00:59:30 I think it’s unlikely that if we just scale GPT2
00:59:34 that we’ll have reasoning in the full fledged way.
00:59:38 And I think that there’s like,
00:59:39 the type signature’s a little bit wrong, right?
00:59:41 That like, there’s something we do with,
00:59:44 that we call thinking, right?
00:59:45 Where we spend a lot of compute,
00:59:47 like a variable amount of compute,
00:59:48 to get to better answers, right?
00:59:50 I think a little bit harder, I get a better answer.
00:59:52 And that that kind of type signature
00:59:54 isn’t quite encoded in a GPT, right?
00:59:58 GPT has kind of, like,
01:00:01 an evolutionary history, spending a long time
01:00:03 baking in all this information,
01:00:04 getting very, very good at this predictive process.
01:00:06 And then at runtime, I just kind of do one forward pass,
01:00:10 and I’m able to generate stuff.
01:00:12 And so, you know, there might be small tweaks
01:00:15 to what we do in order to get the type signature, right?
01:00:17 For example, well, you know,
01:00:19 it’s not really one forward pass, right?
01:00:20 You know, you generate symbol by symbol,
01:00:22 and so maybe you generate like a whole sequence
01:00:24 of thoughts, and you only keep like the last bit
01:00:26 or something.
01:00:27 But I think that at the very least,
01:00:29 I would expect you have to make changes like that.
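One way to picture the kind of tweak being gestured at here, generating a whole sequence of "thoughts" and keeping only the last bit, is a sketch like the following. Everything in it is hypothetical: generate_tokens stands in for any autoregressive sampler that returns its prompt plus a sampled continuation, and the marker text is arbitrary.

```python
# Sketch: spend a variable amount of compute by letting the model write an
# intermediate scratchpad of "thoughts", then keep only the final answer.
# `generate_tokens(prompt, max_new_tokens)` is a hypothetical stand-in that
# returns the prompt plus its sampled continuation as one string.

ANSWER_MARKER = "Answer:"

def think_then_answer(generate_tokens, prompt, max_thought_tokens=256):
    # Phase 1: generate intermediate thoughts token by token; allowing more
    # thought tokens is one crude knob for spending more compute.
    scratchpad = generate_tokens(prompt + "\nThoughts:\n",
                                 max_new_tokens=max_thought_tokens)
    # Phase 2: condition on the scratchpad and keep only the last bit,
    # the text that follows the answer marker.
    full = generate_tokens(scratchpad + "\n" + ANSWER_MARKER,
                           max_new_tokens=64)
    return full.split(ANSWER_MARKER)[-1].strip()
```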
01:00:31 Yeah, exactly how we, as you said, think:
01:00:35 the process of generating thought by thought
01:00:38 in the same kind of way, and like you said,
01:00:40 keeping the last bit, the thing that we converge towards.
01:00:43 Yep.
01:00:44 And I think there’s another piece which is interesting,
01:00:46 which is this out of distribution generalization, right?
01:00:49 That like thinking somehow lets us do that, right?
01:00:52 That we haven’t experienced a thing, and yet somehow
01:00:54 we just kind of keep refining our mental model of it.
01:00:57 This is, again, something that feels tied
01:01:00 to whatever reasoning is, and maybe it’s a small tweak
01:01:04 to what we do, maybe it’s many ideas,
01:01:06 and will take as many decades.
01:01:07 Yeah, so the assumption there,
01:01:10 generalization out of distribution,
01:01:12 is that it’s possible to create new ideas.
01:01:16 Mm hmm.
01:01:17 You know, it’s possible that nobody’s ever created
01:01:19 any new ideas, and then with scaling GPT2 to GPT20,
01:01:25 you would essentially generalize to all possible thoughts
01:01:30 that we humans could have.
01:01:31 I mean.
01:01:33 Just to play devil’s advocate.
01:01:34 Right, right, right, I mean, how many new story ideas
01:01:37 have we come up with since Shakespeare, right?
01:01:39 Yeah, exactly.
01:01:40 It’s just all different forms of love and drama and so on.
01:01:44 Okay.
01:01:45 Not sure if you read The Bitter Lesson,
01:01:47 a recent blog post by Rich Sutton.
01:01:49 Yep, I have.
01:01:50 He basically says something that echoes some of the ideas
01:01:54 that you’ve been talking about, which is,
01:01:56 he says the biggest lesson that can be read
01:01:58 from 70 years of AI research is that general methods
01:02:01 that leverage computation are ultimately
01:02:05 going to win out.
01:02:07 Do you agree with this?
01:02:08 So basically, for OpenAI in general,
01:02:12 the ideas you're exploring about coming up with methods,
01:02:15 whether it's GPT2 language modeling or whether it's OpenAI Five
01:02:20 playing Dota, is a general method better
01:02:23 than a more fine tuned, expert tuned method?
01:02:29 Yeah, so I think that, well one thing that I think
01:02:32 was really interesting about the reaction
01:02:33 to that blog post was that a lot of people have read this
01:02:36 as saying that compute is all that matters.
01:02:39 And that’s a very threatening idea, right?
01:02:41 And I don’t think it’s a true idea either.
01:02:43 Right, it’s very clear that we have algorithmic ideas
01:02:45 that have been very important for making progress
01:02:47 and that to really build AGI,
01:02:49 you wanna push as far as you can on the computational scale
01:02:52 and you wanna push as far as you can on human ingenuity.
01:02:55 And so I think you need both.
01:02:56 But I think the way that you phrased the question
01:02:58 is actually very good, right?
01:02:59 That it’s really about what kind of ideas
01:03:02 should we be striving for?
01:03:03 And absolutely, if you can find a scalable idea,
01:03:07 you pour more compute into it, you pour more data into it,
01:03:09 it gets better, like that’s the real holy grail.
01:03:13 And so I think that the answer to the question,
01:03:16 I think, is yes, that that’s really how we think about it
01:03:19 and that part of why we’re excited about the power
01:03:22 of deep learning, the potential for building AGI
01:03:25 is because we look at the systems that exist
01:03:27 in the most successful AI systems
01:03:29 and we realize that you scale those up,
01:03:32 they’re gonna work better.
01:03:33 And I think that that scalability
01:03:35 is something that really gives us hope
01:03:37 for being able to build transformative systems.
01:03:39 So I'll tell you, this is partially an emotional
01:03:43 response that people often have:
01:03:45 if compute is so important for state of the art performance,
01:03:49 individual developers, maybe a 13 year old
01:03:51 sitting somewhere in Kansas or something like that,
01:03:54 they might not even have a GPU
01:03:56 or may have a single GPU, a 1080 or something like that,
01:03:59 and there’s this feeling like, well,
01:04:02 how can I possibly compete or contribute
01:04:05 to this world of AI if scale is so important?
01:04:09 So if you can comment on that and in general,
01:04:12 do you think we need to also in the future
01:04:14 focus on democratizing compute resources more
01:04:19 or as much as we democratize the algorithms?
01:04:22 Well, so the way that I think about it
01:04:23 is that there’s this space of possible progress, right?
01:04:28 There’s a space of ideas and sort of systems
01:04:30 that will work that will move us forward
01:04:32 and there’s a portion of that space
01:04:34 and to some extent, an increasingly significant portion
01:04:37 of that space that does just require
01:04:38 massive compute resources.
01:04:40 And for that, I think that the answer is kind of clear
01:04:44 and that part of why we have the structure that we do
01:04:47 is because we think it’s really important
01:04:49 to be pushing the scale and to be building
01:04:51 these large clusters and systems.
01:04:53 But there’s another portion of the space
01:04:55 that isn’t about the large scale compute
01:04:57 that are these ideas that, and again,
01:04:59 I think that for the ideas to really be impactful
01:05:02 and really shine, that they should be ideas
01:05:04 that if you scale them up, would work way better
01:05:06 than they do at small scale.
01:05:08 But that you can discover them
01:05:10 without massive computational resources.
01:05:12 And if you look at the history of recent developments,
01:05:15 you think about things like the GAN or the VAE,
01:05:17 that these are ones that I think you could come up with them
01:05:20 without having, and in practice,
01:05:22 people did come up with them without having
01:05:24 massive, massive computational resources.
01:05:26 Right, I just talked to Ian Goodfellow,
01:05:27 but the thing is the initial GAN
01:05:31 produced pretty terrible results, right?
01:05:34 It was only because they were smart enough
01:05:36 to know that it's quite surprising
01:05:38 that it can generate anything at all.
01:05:43 Do you see a world, or is it too optimistic and dreamer
01:05:45 like to imagine, where the compute resources
01:05:49 are something that's owned by governments
01:05:52 and provided as a utility?
01:05:55 Actually, to some extent, this question reminds me
01:05:57 of a blog post from one of my former professors at Harvard,
01:06:01 this guy Matt Welsh, who was a systems professor.
01:06:03 I remember sitting in his tenure talk, right,
01:06:05 and that he had literally just gotten tenure.
01:06:08 He went to Google for the summer
01:06:10 and then decided he wasn’t going back to academia, right?
01:06:15 And kind of in his blog post, he makes this point that,
01:06:18 look, as a systems researcher,
01:06:20 that I come up with these cool system ideas, right,
01:06:23 and I kind of build a little proof of concept,
01:06:25 and the best thing I can hope for
01:06:27 is that the people at Google or Yahoo,
01:06:30 which was around at the time,
01:06:31 will implement it and actually make it work at scale, right?
01:06:35 That’s like the dream for me, right?
01:06:36 I build the little thing,
01:06:37 and they turn it into the big thing that’s actually working.
01:06:39 And for him, he said, I’m done with that.
01:06:43 I want to be the person who's actually doing the building
01:06:45 and deploying.
01:06:47 And I think that there’s a similar dichotomy here, right?
01:06:49 I think that there are people who really actually find value,
01:06:53 and I think it is a valuable thing to do
01:06:55 to be the person who produces those ideas, right,
01:06:57 who builds the proof of concept.
01:06:58 And yeah, you don’t get to generate
01:07:00 the coolest possible GAN images,
01:07:02 but you invented the GAN, right?
01:07:04 And so there’s a real trade off there,
01:07:07 and I think that that’s a very personal choice,
01:07:09 but I think there’s value in both sides.
01:07:10 So do you think, in creating AGI or some new models,
01:07:18 we would see echoes of the brilliance
01:07:20 even at the prototype level?
01:07:22 So you would be able to develop those ideas,
01:07:24 the initial seeds, without scale.
01:07:27 So take a look at, you know,
01:07:28 I always like to look at examples that exist, right?
01:07:31 Look at real precedent.
01:07:32 And so take a look at the June 2018 model that we released,
01:07:37 that we scaled up to turn into GPT2.
01:07:39 And you can see that at small scale,
01:07:41 it set some records, right?
01:07:42 This was the original GPT.
01:07:44 We actually had some cool generations.
01:07:46 They weren’t nearly as amazing and really stunning
01:07:49 as the GPT2 ones, but it was promising.
01:07:51 It was interesting.
01:07:53 And so I think it is the case
01:07:54 that with a lot of these ideas,
01:07:56 that you see promise at small scale.
01:07:58 But there is an asterisk here, a very big asterisk,
01:08:00 which is sometimes we see behaviors that emerge
01:08:05 that are qualitatively different
01:08:07 from anything we saw at small scale.
01:08:09 And that the original inventor of whatever algorithm
01:08:12 looks at and says, I didn’t think it could do that.
01:08:15 This is what we saw in Dota, right?
01:08:17 So PPO was created by John Schulman,
01:08:19 who’s a researcher here.
01:08:20 And with Dota, we basically just ran PPO
01:08:24 at massive, massive scale.
01:08:26 And there’s some tweaks in order to make it work,
01:08:29 but fundamentally, it’s PPO at the core.
01:08:31 And we were able to get this long term planning,
01:08:35 these behaviors to really play out on a time scale
01:08:38 that we just thought was not possible.
01:08:40 And John looked at that and was like,
01:08:42 I didn’t think it could do that.
01:08:44 That’s what happens when you’re at three orders
01:08:45 of magnitude more scale than you tested at.
01:08:48 Yeah, but it still has the same flavors of,
01:08:50 you know, at least echoes of the expected brilliance.
01:08:55 Although I suspect with GPT scaled more and more,
01:08:59 you might get surprising things.
01:09:01 So yeah, you’re right, it’s interesting.
01:09:04 It’s difficult to see how far an idea will go
01:09:07 when it’s scaled.
01:09:09 It’s an open question.
01:09:11 Well, so to that point with Dota and PPO,
01:09:13 like, I mean, here’s a very concrete one, right?
01:09:14 It's actually one thing
01:09:16 that's very surprising about Dota
01:09:17 that I think people don't really pay that much attention to:
01:09:20 the degree of generalization
01:09:22 out of distribution that happens, right?
01:09:24 That you have this AI that’s trained against other bots
01:09:27 for its entirety, the entirety of its existence.
01:09:30 Sorry to take a step back.
01:09:31 Can you talk through, you know, the story of Dota,
01:09:37 the story leading up to OpenAI Five and that path,
01:09:42 and what was the process of self play
01:09:43 and so on in training this?
01:09:45 Yeah, yeah, yeah.
01:09:46 So with Dota.
01:09:47 What is Dota?
01:09:47 Yeah, Dota is a complex video game
01:09:50 and we started trying to solve Dota
01:09:52 because we felt like this was a step towards the real world
01:09:55 relative to other games like chess or Go, right?
01:09:58 Those very cerebral games
01:09:59 where you just kind of have this board,
01:10:00 very discrete moves.
01:10:01 Dota starts to be much more continuous time;
01:10:04 you have this huge variety of different actions,
01:10:06 you have a 45 minute game
01:10:07 with all these different units
01:10:09 and it’s got a lot of messiness to it
01:10:11 that really hasn’t been captured by previous games.
01:10:14 And famously, all of the hard coded bots for Dota
01:10:17 were terrible, right?
01:10:18 It’s just impossible to write anything good for it
01:10:19 because it’s so complex.
01:10:21 And so this seemed like a really good place
01:10:23 to push what’s the state of the art
01:10:25 in reinforcement learning.
01:10:26 And so we started by focusing
01:10:28 on the one versus one version of the game
01:10:29 and we’re able to solve that.
01:10:32 We were able to beat the world champions,
01:10:33 and the skill curve was this crazy exponential, right?
01:10:38 It was like we were constantly scaling up
01:10:41 and fixing bugs,
01:10:42 and if you look at the skill curve,
01:10:44 it was really a very, very smooth one.
01:10:46 This is actually really interesting
01:10:47 to see how that human iteration loop
01:10:50 yielded very steady exponential progress.
01:10:52 And as one side note, first of all,
01:10:55 it’s an exceptionally popular video game.
01:10:57 The side effect is that there’s a lot of incredible
01:11:00 human experts at that video game.
01:11:01 So the benchmark that you’re trying to reach is very high.
01:11:05 And the other, can you talk about the approach
01:11:07 that was used initially and throughout
01:11:10 training these agents to play this game?
01:11:12 Yep, and so the approach that we used is self play.
01:11:14 And so you have two agents that don’t know anything.
01:11:17 They battle each other,
01:11:18 they discover something a little bit good
01:11:20 and now they both know it.
01:11:22 And they just get better and better and better
01:11:23 without bound.
01:11:24 And that’s a really powerful idea, right?
01:11:27 That we then went from the one versus one version
01:11:30 of the game and scaled up to five versus five, right?
01:11:32 So you think about kind of like with basketball
01:11:34 where you have this like team sport
01:11:35 and you need to do all this coordination
01:11:37 and we were able to push the same idea,
01:11:40 the same self play to really get to the professional level
01:11:45 at the full five versus five version of the game.
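A minimal sketch of the self play loop described above, under simplifying assumptions: Game and Agent here are hypothetical stand-ins for the real environment and policy, and the update step abstracts away the actual PPO machinery used in the real system.

```python
# Sketch of self play: one policy plays both sides of a two-player game,
# so any improvement it discovers immediately strengthens its own opponent,
# and skill can keep rising without an external teacher.
# `game_factory` and `agent` are hypothetical stand-ins, not OpenAI's code.

def self_play_training(game_factory, agent, num_games=100_000):
    for _ in range(num_games):
        game = game_factory()
        trajectory = []                 # (state, action, player) tuples
        player = 0
        while not game.is_over():
            state = game.observe(player)
            action = agent.act(state)   # the same policy acts for both sides
            game.step(player, action)
            trajectory.append((state, action, player))
            player = 1 - player
        outcome = game.winner()         # 0 or 1
        # Reinforce the winner's moves and discourage the loser's; in the
        # real system this would be a PPO-style policy gradient update.
        agent.update(trajectory, outcome)
    return agent
```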
01:11:48 And the things I think are really interesting here
01:11:52 is that these agents, in some ways,
01:11:54 they’re almost like an insect like intelligence, right?
01:11:56 Where they have a lot in common
01:11:58 with how an insect is trained, right?
01:12:00 An insect kind of lives in this environment
01:12:01 for a very long time or the ancestors of this insect
01:12:04 have been around for a long time
01:12:05 and had a lot of experience that gets baked into this agent.
01:12:09 And it’s not really smart in the sense of a human, right?
01:12:12 It’s not able to go and learn calculus,
01:12:14 but it’s able to navigate its environment extremely well.
01:12:16 And it’s able to handle unexpected things
01:12:18 in the environment that it’s never seen before pretty well.
01:12:22 And we see the same sort of thing with our Dota bots, right?
01:12:24 That they’re able to, within this game,
01:12:26 they’re able to play against humans,
01:12:28 which is something that never existed
01:12:29 in its evolutionary environment,
01:12:31 totally different play styles from humans versus the bots.
01:12:34 And yet it’s able to handle it extremely well.
01:12:37 And that’s something that I think was very surprising to us,
01:12:40 was something that doesn’t really emerge
01:12:43 from what we’ve seen with PPO at smaller scale, right?
01:12:47 And the kind of scale we’re running this stuff at was,
01:12:49 let's say, like 100,000 CPU cores
01:12:51 running with like hundreds of GPUs.
01:12:54 It was probably about something like hundreds
01:12:57 of years of experience going into this bot
01:13:01 every single real day.
01:13:03 And so that scale is massive
01:13:06 and we start to see very different kinds of behaviors
01:13:08 out of the algorithms that we all know and love.
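Those experience numbers are roughly what simple back of the envelope arithmetic gives, assuming about one game instance per CPU core running at roughly real time speed; the per-worker speed factor is an assumption here, not a reported figure.

```python
# Back-of-the-envelope check on "hundreds of years of experience per real day".
cores = 100_000          # parallel rollout workers, from the conversation
realtime_factor = 1.0    # assumed game speed per worker (hypothetical)

game_days_per_real_day = cores * realtime_factor
game_years_per_real_day = game_days_per_real_day / 365
print(round(game_years_per_real_day))   # about 274 "years" of Dota per real day
```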
01:13:10 Dota, you mentioned, beat the world expert one v one.
01:13:15 And then you weren’t able to win five v five this year.
01:13:20 Yeah.
01:13:21 Against the best players in the world.
01:13:24 So what’s the comeback story?
01:13:26 First of all, talk through that.
01:13:27 That was an exceptionally exciting event.
01:13:29 And what’s the following months and this year look like?
01:13:33 Yeah, yeah, so one thing that’s interesting
01:13:35 is that we lose all the time.
01:13:38 Because we play.
01:13:39 Who’s we here?
01:13:40 The Dota team at OpenAI.
01:13:41 We play the bot against better players
01:13:44 than our system all the time.
01:13:45 Or at least we used to, right?
01:13:47 Like the first time we lost publicly
01:13:50 was when we went up on stage at The International
01:13:52 and we played against some of the best teams in the world
01:13:54 and we ended up losing both games,
01:13:56 but we gave them a run for their money, right?
01:13:58 That both games were kind of 30 minutes, 25 minutes
01:14:01 and they went back and forth, back and forth,
01:14:03 back and forth.
01:14:04 And so I think that really shows
01:14:06 that we’re at the professional level
01:14:08 and that kind of looking at those games,
01:14:09 we think that the coin could have gone a different direction
01:14:12 and we could have had some wins.
01:14:14 That was actually very encouraging for us.
01:14:16 And it's interesting because The International
01:14:18 was at a fixed time, right?
01:14:19 So we knew exactly what day we were going to be playing
01:14:22 and we pushed as far as we could, as fast as we could.
01:14:25 Two weeks later, we had a bot that had an 80% win rate
01:14:28 versus the one that played at TI.
01:14:30 So the march of progress, you should think of it
01:14:32 as a snapshot rather than as an end state.
01:14:34 And so in fact, we’ll be announcing our finals pretty soon.
01:14:39 I actually think that we’ll announce our final match
01:14:42 prior to this podcast being released.
01:14:45 So we’ll be playing against the world champions.
01:14:49 And for us, it's really less about that;
01:14:52 the way that we think about what's upcoming
01:14:55 is that it's the final milestone, the final competitive milestone
01:14:59 for the project, right?
01:15:00 That our goal in all of this
01:15:02 isn’t really about beating humans at Dota.
01:15:05 Our goal is to push the state of the art
01:15:06 in reinforcement learning.
01:15:08 And we’ve done that, right?
01:15:09 And we’ve actually learned a lot from our system
01:15:10 and that we have, I think, a lot of exciting next steps
01:15:13 that we want to take.
01:15:14 And so kind of as a final showcase of what we built,
01:15:17 we’re going to do this match.
01:15:18 But for us, it's not really about the success or failure,
01:15:21 about whether the coin flip goes in our direction
01:15:24 or against us.
01:15:25 Where do you see the field of deep learning
01:15:28 heading in the next few years?
01:15:31 Where do you see the work in reinforcement learning
01:15:35 perhaps heading, and more specifically with OpenAI,
01:15:41 all the exciting projects that you’re working on,
01:15:44 what does 2019 hold for you?
01:15:46 Massive scale.
01:15:47 Scale.
01:15:48 I will put an asterisk on that and just say,
01:15:49 I think that it’s about ideas plus scale.
01:15:52 You need both.
01:15:53 So that’s a really good point.
01:15:55 So the question, in terms of ideas,
01:15:58 you have a lot of projects
01:16:00 that are exploring different areas of intelligence.
01:16:04 And the question is, when you think of scale,
01:16:07 do you think about growing the scale
01:16:09 of those individual projects
01:16:10 or do you think about adding new projects?
01:16:13 And sorry to, and if you’re thinking about
01:16:16 adding new projects, or if you look at the past,
01:16:19 what’s the process of coming up with new projects
01:16:21 and new ideas?
01:16:22 Yep.
01:16:23 So we really have a life cycle of projects here.
01:16:25 So we start with a few people
01:16:27 just working on a small scale idea.
01:16:28 And language is actually a very good example of this.
01:16:30 That it was really one person here
01:16:32 who was pushing on language for a long time.
01:16:35 I mean, then you get signs of life, right?
01:16:36 And so this is like, let’s say,
01:16:38 with the original GPT, we had something that was interesting
01:16:42 and we said, okay, it’s time to scale this, right?
01:16:44 It’s time to put more people on it,
01:16:46 put more computational resources behind it.
01:16:48 And then we just kind of keep pushing and keep pushing.
01:16:51 And the end state is something
01:16:52 that looks like Dota or robotics,
01:16:54 where you have a large team of 10 or 15 people
01:16:57 that are running things at very large scale
01:16:59 and that you’re able to really have material engineering
01:17:02 and sort of machine learning science coming together
01:17:06 to make systems that work and get material results
01:17:10 that just would have been impossible otherwise.
01:17:12 So we do that whole life cycle.
01:17:13 We’ve done it a number of times, typically end to end.
01:17:16 It’s probably two years or so to do it.
01:17:20 The organization has been around for three years,
01:17:21 so maybe we’ll find that we also have
01:17:23 longer life cycle projects, but we’ll work up to those.
01:17:29 So one team that we were actually just starting,
01:17:31 Ilya and I are kicking off a new team
01:17:33 called the Reasoning Team,
01:17:34 and that this is to really try to tackle
01:17:36 how do you get neural networks to reason?
01:17:38 And we think that this will be a long term project.
01:17:42 It’s one that we’re very excited about.
01:17:44 In terms of reasoning, super exciting topic,
01:17:48 what kind of benchmarks, what kind of tests of reasoning
01:17:54 do you envision?
01:17:55 What would, if you sat back with whatever drink
01:17:58 and you would be impressed that this system
01:18:01 is able to do something, what would that look like?
01:18:03 Theorem proving.
01:18:04 Theorem proving.
01:18:06 So some kind of logic, and especially mathematical logic.
01:18:10 I think so.
01:18:11 I think that there’s other problems that are dual
01:18:14 to theorem proving in particular.
01:18:15 You think about programming, you think about
01:18:18 even security analysis of code,
01:18:21 that these all kind of capture the same sorts
01:18:23 of core reasoning and being able to do
01:18:26 some out of distribution generalization.
01:18:28 So it would be quite exciting if the OpenAI Reasoning Team
01:18:32 was able to prove that P equals NP.
01:18:34 That would be very nice.
01:18:36 It would be very, very, very exciting, especially.
01:18:38 If it turns out that P equals NP,
01:18:39 that’ll be interesting too.
01:18:41 It would be ironic and humorous.
01:18:47 So what problem stands out to you
01:18:49 as the most exciting and challenging and impactful
01:18:53 to the work for us as a community in general
01:18:56 and for OpenAI this year?
01:18:58 You mentioned reasoning.
01:18:59 I think that’s a heck of a problem.
01:19:01 Yeah, so I think reasoning’s an important one.
01:19:02 I think it’s gonna be hard to get good results in 2019.
01:19:05 Again, just like we think about the life cycle, takes time.
01:19:08 I think for 2019, language modeling seems to be
01:19:11 kind of on that ramp.
01:19:12 It’s at the point that we have a technique that works.
01:19:14 We wanna scale 100x, 1,000x, see what happens.
01:19:18 Awesome.
01:19:19 Do you think we’re living in a simulation?
01:19:21 I think it’s hard to have a real opinion about it.
01:19:24 It’s actually interesting.
01:19:26 I separate out things that I think can, like,
01:19:29 yield materially different predictions about the world
01:19:32 from ones that are just kind of fun to speculate about.
01:19:35 I kind of view simulation as more like,
01:19:37 is there a flying teapot between Mars and Jupiter?
01:19:40 Like, maybe, but it’s a little bit hard to know
01:19:44 what that would mean for my life.
01:19:45 So there is something actionable.
01:19:47 So some of the best work OpenAI has done
01:19:50 is in the field of reinforcement learning.
01:19:52 And some of the success of reinforcement learning
01:19:56 come from being able to simulate
01:19:58 the problem you’re trying to solve.
01:20:00 So do you have a hope for reinforcement,
01:20:03 for the future of reinforcement learning
01:20:05 and for the future of simulation?
01:20:07 Like whether it’s, we’re talking about autonomous vehicles
01:20:09 or any kind of system.
01:20:10 Do you see that scaling to where we’ll be able
01:20:13 to simulate systems and hence,
01:20:16 be able to create a simulator that echoes our real world
01:20:19 and proving once and for all,
01:20:21 even though you’re denying it,
01:20:22 that we’re living in a simulation?
01:20:25 I feel like it’s two separate questions, right?
01:20:26 So kind of at the core there of like,
01:20:28 can we use simulation for self driving cars?
01:20:31 Take a look at our robotic system, Dactyl, right?
01:20:33 That was trained in simulation using the Dota system,
01:20:37 in fact, and it transfers to a physical robot.
01:20:40 And I think everyone looks at our Dota system,
01:20:42 they’re like, okay, it’s just a game.
01:20:43 How are you ever gonna escape to the real world?
01:20:45 And the answer is, well, we did it with a physical robot
01:20:47 that no one could program.
01:20:48 And so I think the answer is simulation
01:20:50 goes a lot further than you think
01:20:52 if you apply the right techniques to it.
01:20:54 Now, there’s a question of,
01:20:55 are the beings in that simulation gonna wake up
01:20:57 and have consciousness?
01:20:59 I think that one seems a lot harder to, again,
01:21:02 reason about.
01:21:03 I think that you really should think about
01:21:05 where exactly does human consciousness come from
01:21:07 in our own self awareness?
01:21:09 And is it just that once you have a complicated enough
01:21:11 neural net, you have to worry about
01:21:13 the agents feeling pain?
01:21:15 And I think there’s interesting speculation to do there,
01:21:19 but again, I think it’s a little bit hard to know for sure.
01:21:23 Well, let me just keep with the speculation.
01:21:25 Do you think to create intelligence, general intelligence,
01:21:28 you need, one, consciousness, and two, a body?
01:21:33 Do you think any of those elements are needed,
01:21:35 or is intelligence something that’s orthogonal to those?
01:21:38 I’ll stick to the non grand answer first, right?
01:21:41 So the non grand answer is just to look at,
01:21:44 what are we already making work?
01:21:45 You look at GPT2, a lot of people would have said
01:21:47 that to even get these kinds of results,
01:21:49 you need real world experience.
01:21:51 You need a body, you need grounding.
01:21:52 How are you supposed to reason about any of these things?
01:21:55 How are you supposed to like even kind of know
01:21:56 about smoke and fire and those things
01:21:58 if you’ve never experienced them?
01:21:59 And GPT2 shows that you can actually go way further
01:22:03 than that kind of reasoning would predict.
01:22:06 So I think that in terms of, do we need consciousness?
01:22:10 Do we need a body?
01:22:11 It seems the answer is probably not, right?
01:22:13 That we could probably just continue to push
01:22:15 kind of the systems we have.
01:22:16 They already feel general.
01:22:18 They’re not as competent or as general
01:22:20 or able to learn as quickly as an AGI would,
01:22:23 but they’re at least like kind of proto AGI in some way,
01:22:27 and they don’t need any of those things.
01:22:29 Now let’s move to the grand answer,
01:22:31 which is, are our neural nets conscious already?
01:22:36 Would we ever know?
01:22:37 How can we tell, right?
01:22:38 And here’s where the speculation starts to become
01:22:43 at least interesting or fun
01:22:44 and maybe a little bit disturbing
01:22:46 depending on where you take it.
01:22:48 But it certainly seems that when we think about animals,
01:22:51 that there’s some continuum of consciousness.
01:22:53 You know, my cat I think is conscious in some way, right?
01:22:57 Not as conscious as a human.
01:22:58 And you could imagine that you could build
01:23:00 a little consciousness meter, right?
01:23:01 You point at a cat, it gives you a little reading.
01:23:03 Point at a human, it gives you much bigger reading.
01:23:06 What would happen if you pointed one of those
01:23:08 at a Dota neural net?
01:23:09 And if you’re training in this massive simulation,
01:23:12 do the neural nets feel pain?
01:23:13 You know, it becomes pretty hard to know
01:23:16 that the answer is no.
01:23:18 And it becomes pretty hard to really think about
01:23:21 what that would mean if the answer were yes.
01:23:25 And it’s very possible, you know, for example,
01:23:27 you could imagine that maybe the reason
01:23:29 that humans have consciousness
01:23:31 is because it’s a convenient computational shortcut, right?
01:23:35 If you think about it, if you have a being
01:23:37 that wants to avoid pain,
01:23:38 which seems pretty important to survive in this environment
01:23:40 and wants to like, you know, eat food,
01:23:43 then that maybe the best way of doing it
01:23:45 is to have a being that’s conscious, right?
01:23:47 That, you know, in order to succeed in the environment,
01:23:49 you need to have those properties
01:23:51 and how are you supposed to implement them?
01:23:52 And maybe consciousness is a way of doing that.
01:23:55 If that’s true, then actually maybe we should expect
01:23:57 that really competent reinforcement learning agents
01:24:00 will also have consciousness.
01:24:02 But you know, that’s a big if.
01:24:03 And I think there are a lot of other arguments
01:24:04 you can make in other directions.
01:24:06 I think that’s a really interesting idea
01:24:08 that even GPT2 has some degree of consciousness.
01:24:11 That's something that's actually not as crazy
01:24:14 to think about, it’s useful to think about
01:24:16 as we think about what it means
01:24:18 to create intelligence of a dog, intelligence of a cat,
01:24:22 and the intelligence of a human.
01:24:24 So last question, do you think
01:24:27 we will ever fall in love, like in the movie Her,
01:24:32 with an artificial intelligence system
01:24:34 or an artificial intelligence system
01:24:36 falling in love with a human?
01:24:38 I hope so.
01:24:40 If there's any better way to end it, it's on love.
01:24:43 So Greg, thanks so much for talking today.
01:24:45 Thank you for having me.