George Hotz: Comma.ai, OpenPilot, and Autonomous Vehicles #31

Transcript

00:00:00 The following is a conversation with George Hotz.

00:00:02 He’s the founder of Comma AI,

00:00:04 a machine learning based vehicle automation company.

00:00:07 He is most certainly an outspoken personality

00:00:10 in the field of AI and technology in general.

00:00:13 He first gained recognition for being the first person

00:00:16 to carrier unlock an iPhone.

00:00:18 And since then, he’s done quite a few interesting things

00:00:21 at the intersection of hardware and software.

00:00:24 This is the Artificial Intelligence Podcast.

00:00:27 If you enjoy it, subscribe on YouTube,

00:00:29 give it five stars on iTunes, support it on Patreon,

00:00:32 or simply connect with me on Twitter

00:00:34 at Lex Friedman, spelled F R I D M A N.

00:00:39 And I’d like to give a special thank you

00:00:40 to Jennifer from Canada

00:00:43 for her support of the podcast on Patreon.

00:00:45 Merci beaucoup, Jennifer.

00:00:47 She’s been a friend and an engineering colleague

00:00:50 for many years since I was in grad school.

00:00:52 Your support means a lot

00:00:54 and inspires me to keep this series going.

00:00:57 And now, here’s my conversation with George Hotz.

00:01:02 Do you think we’re living in a simulation?

00:01:06 Yes, but it may be unfalsifiable.

00:01:10 What do you mean by unfalsifiable?

00:01:12 So if the simulation is designed in such a way

00:01:16 that they did like a formal proof

00:01:19 to show that no information can get in and out,

00:01:22 and if their hardware is designed

00:01:24 for anything in the simulation

00:01:25 to always keep the hardware in spec,

00:01:27 it may be impossible to prove

00:01:29 whether we’re in a simulation or not.

00:01:32 So they’ve designed it such that it’s a closed system

00:01:35 you can’t get outside the system.

00:01:37 Well, maybe it’s one of three worlds.

00:01:38 We’re either in a simulation which can be exploited,

00:01:41 we’re in a simulation which not only can’t be exploited,

00:01:44 but like the same thing’s true about VMs.

00:01:46 A really well designed VM,

00:01:48 you can’t even detect if you’re in a VM or not.

00:01:51 That’s brilliant.

00:01:52 So the simulation is running on a virtual machine.

00:01:56 But now in reality, all VMs have ways to detect.

00:01:59 That’s the point.

00:02:00 I mean, you’ve done quite a bit of hacking yourself.

00:02:04 So you should know that really any complicated system

00:02:08 will have ways in and out.

00:02:10 So this isn’t necessarily true going forward.

00:02:15 I spent my time away from Comma,

00:02:18 I learned Coq, it’s a dependently typed,

00:02:21 it’s a language for writing math proofs in.

00:02:24 And if you write code that compiles in a language like that,

00:02:28 it is correct by definition.

00:02:30 The types check its correctness.

00:02:33 So it’s possible that the simulation

00:02:34 is written in a language like this, in which case, yeah.

00:02:39 Yeah, but a language like that

00:02:42 can’t be sufficiently expressive.

00:02:43 Oh, it can.

00:02:44 It can be?

00:02:45 Oh, yeah.

00:02:46 Okay, well, so all right, so.

00:02:48 The simulation doesn’t have to be Turing complete

00:02:50 if it has a scheduled end date.

00:02:52 Looks like it does actually with entropy.

00:02:54 I mean, I don’t think that a simulation

00:02:58 that results in something as complicated as the universe

00:03:03 would have a form of proof of correctness, right?

00:03:08 It’s possible, of course.

00:03:09 We have no idea how good their tooling is.

00:03:12 And we have no idea how complicated

00:03:14 the universe computer really is.

00:03:16 It may be quite simple.

00:03:17 It’s just very large, right?

00:03:19 It’s very, it’s definitely very large.

00:03:22 But the fundamental rules might be super simple.

00:03:24 Yeah, Conway’s Game of Life kind of stuff.

00:03:26 Right.

00:03:28 So if you could hack,

00:03:30 so imagine a simulation that is hackable,

00:03:32 if you could hack it,

00:03:35 what would you change about the,

00:03:37 like how would you approach hacking a simulation?

00:03:41 The reason I gave that talk.

00:03:44 By the way, I’m not familiar with the talk you gave.

00:03:46 I just read that you talked about escaping the simulation

00:03:50 or something like that.

00:03:51 So maybe you can tell me a little bit about the theme

00:03:53 and the message there too.

00:03:55 It wasn’t a very practical talk

00:03:57 about how to actually escape a simulation.

00:04:00 It was more about a way of restructuring

00:04:03 an us versus them narrative.

00:04:05 If

00:04:08 we continue on the path we’re going with technology,

00:04:12 I think we’re in big trouble,

00:04:14 like as a species and not just as a species,

00:04:16 but even as me as an individual member of the species.

00:04:19 So if we could change rhetoric

00:04:22 to be more like to think upwards,

00:04:26 like to think about that we’re in a simulation

00:04:29 and how we could get out,

00:04:30 already we’d be on the right path.

00:04:32 What you actually do once you do that,

00:04:34 well, I assume I would have acquired way more intelligence

00:04:37 in the process of doing that.

00:04:38 So I’ll just ask that.

00:04:39 So the thinking upwards,

00:04:42 what kind of ideas,

00:04:43 what kind of breakthrough ideas

00:04:44 do you think thinking in that way could inspire?

00:04:47 And why did you say upwards?

00:04:49 Upwards.

00:04:50 Into space?

00:04:51 Are you thinking sort of exploration in all forms?

00:04:54 The space narrative

00:04:57 that held for the modernist generation

00:04:59 doesn’t hold as well for the postmodern generation.

00:05:04 What’s the space narrative?

00:05:05 Are we talking about the same space,

00:05:06 the three dimensional space?

00:05:07 No, no, no, space, like going to space,

00:05:08 like building like Elon Musk,

00:05:09 like we’re going to build rockets,

00:05:11 we’re going to go to Mars,

00:05:11 we’re going to colonize the universe.

00:05:13 And the narrative you’re referring to,

00:05:14 I was born in the Soviet Union,

00:05:15 you’re referring to the race to space.

00:05:17 The race to space, yeah.

00:05:18 Explore, okay.

00:05:19 That was a great modernist narrative.

00:05:21 Yeah.

00:05:23 It doesn’t seem to hold the same weight in today’s culture.

00:05:27 I’m hoping for good postmodern narratives that replace it.

00:05:32 So let’s think, so you work a lot with AI.

00:05:35 So AI is one formulation of that narrative.

00:05:39 There could be also,

00:05:40 I don’t know how much you do in VR and AR.

00:05:42 Yeah.

00:05:43 That’s another, I know less about it,

00:05:45 but every time I play with it in our research,

00:05:47 it’s fascinating, that virtual world.

00:05:49 Are you interested in the virtual world?

00:05:51 I would like to move to virtual reality.

00:05:55 In terms of your work?

00:05:56 No, I would like to physically move there.

00:05:58 The apartment I can rent in the cloud

00:06:00 is way better than the apartment

00:06:01 I can rent in the real world.

00:06:03 Well, it’s all relative, isn’t it?

00:06:04 Because others will have very nice apartments too,

00:06:07 so you’ll be inferior in the virtual world as well.

00:06:09 No, but that’s not how I view the world, right?

00:06:11 I don’t view the world,

00:06:12 I mean, it’s a very, almost zero-sum-ish way

00:06:15 to view the world.

00:06:16 Say like, my great apartment isn’t great

00:06:18 because my neighbor has one too.

00:06:20 No, my great apartment is great

00:06:21 because look at this dishwasher, man.

00:06:24 You just touch the dish and it’s washed, right?

00:06:26 And that is great in and of itself

00:06:28 if I have the only apartment

00:06:30 or if everybody had the apartment.

00:06:31 I don’t care.

00:06:32 So you have fundamental gratitude.

00:16:34 The world first learned of George Hotz

00:06:39 in August 2007, maybe before then,

00:06:42 but certainly in August 2007

00:06:44 when you were the first person to unlock,

00:16:46 carrier unlock an iPhone.

00:06:48 How did you get into hacking?

00:06:50 What was the first system

00:06:51 you discovered vulnerabilities for and broke into?

00:06:56 So that was really kind of the first thing.

00:07:01 I had a book in 2006 called Gray Hat Hacking.

00:07:06 And I guess I realized that if you acquired

00:07:12 these sort of powers, you could control the world.

00:07:16 But I didn’t really know that much

00:07:18 about computers back then.

00:07:20 I started with electronics.

00:07:22 The first iPhone hack was physical.

00:07:24 Hardware.

00:07:25 You had to open it up and pull an address line high.

00:07:28 And it was because I didn’t really know

00:07:29 about software exploitation.

00:07:31 I learned that all in the next few years

00:07:32 and I got very good at it.

00:07:33 But back then I knew about like how memory chips

00:07:37 are connected to processors and stuff.

00:07:38 You knew about software and programming.

00:07:40 You just didn’t know.

00:07:43 Oh really?

00:07:44 So your view of the world and computers

00:07:46 was physical, was hardware.

00:07:49 Actually, if you read the code that I released

00:07:51 with that in August 2007, it’s atrocious.

00:07:55 What language was it?

00:07:56 C.

00:07:57 C, nice.

00:07:58 And in a broken sort of state-machine-esque C.

00:08:01 I didn’t know how to program.

00:08:02 Yeah.

00:08:04 So how did you learn to program?

00:08:07 What was your journey?

00:08:08 Cause I mean, we’ll talk about it.

00:08:10 You’ve live streamed some of your programming.

00:08:12 This chaotic, beautiful mess.

00:08:14 How did you arrive at that?

00:08:16 Years and years of practice.

00:08:18 I interned at Google after the summer

00:08:22 after the iPhone unlock.

00:08:24 And I did a contract for them where I built hardware

00:08:27 for Street View and I wrote a software library

00:08:30 to interact with it.

00:08:31 And it was terrible code.

00:08:34 And for the first time I got feedback from people

00:08:36 who I respected saying, no, like don’t write code like this.

00:08:42 Now, of course, just getting that feedback is not enough.

00:08:45 The way that I really got good was I wanted to write

00:08:51 this thing like that could emulate and then visualize

00:08:56 like ARM binaries.

00:08:57 Cause I wanted to hack the iPhone better.

00:08:59 And I didn’t like that I couldn’t like see

00:09:01 what the, I couldn’t single step through the processor

00:09:03 because I had no debugger on there,

00:09:05 especially for the low level things like the boot ROM

00:09:06 and the bootloader.

00:09:07 So I tried to build this tool to do it.

00:09:10 And I built the tool once and it was terrible.

00:09:13 I built the tool a second time, it was terrible.

00:09:15 I built the tool a third time.

00:09:16 This was by the time I was at Facebook, it was kind of okay.

00:09:18 And then I built the tool a fourth time

00:09:20 when I was a Google intern again in 2014.

00:09:22 And that was the first time I was like,

00:09:24 this is finally usable.

00:09:25 How do you pronounce this, QIRA?

00:09:27 QIRA, yeah.

00:09:28 So it’s essentially the most efficient way

00:09:31 to visualize the change of state of the computer

00:09:35 as the program is running.

00:09:37 That’s what you mean by debugger.

00:09:38 Yeah, it’s a timeless debugger.

00:09:41 So you can rewind just as easily as going forward.

00:09:45 Think about if you’re using GDB,

00:09:46 you have to put a watch on a variable.

00:09:47 If you wanna see if that variable changes.

00:09:49 In QIRA, you can just click on that variable

00:09:51 and then it shows every single time

00:09:53 when that variable was changed or accessed.

00:09:56 Think about it like Git for your computer’s run log.

00:09:59 So there’s like a deep log of the state of the computer

00:10:05 as the program runs and you can rewind.
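
(As a toy illustration of the timeless-debugging idea, not QIRA itself: the sketch below records every change to every local variable while a function runs, so you can query the whole history afterwards instead of setting a GDB-style watchpoint up front.)

```python
import sys
from collections import defaultdict

history = defaultdict(list)  # variable name -> list of (line, value) snapshots

def tracer(frame, event, arg):
    # On every executed line, record any local whose value changed.
    if event == "line":
        for name, value in frame.f_locals.items():
            if not history[name] or history[name][-1][1] != value:
                history[name].append((frame.f_lineno, value))
    return tracer

def program():
    x = 0
    for i in range(3):
        x += i
    return x

sys.settrace(tracer)
program()
sys.settrace(None)

# "Click on a variable": dump every point where x changed.
print(history["x"])  # e.g. [(line, 0), (line, 1), (line, 3)]
```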

00:10:07 Why isn’t that, maybe it is, maybe you can educate me.

00:10:11 Why isn’t that kind of debugging used more often?

00:10:14 Cause the tooling’s bad.

00:10:16 Well, two things.

00:10:17 One, if you’re trying to debug Chrome,

00:10:19 Chrome is a 200 megabyte binary

00:10:22 that runs slowly on desktops.

00:10:25 So that’s gonna be really hard to use for that.

00:10:27 But it’s really good to use for like CTFs

00:10:30 and for boot roms and for small parts of code.

00:10:33 So it’s hard if you’re trying to debug like massive systems.

00:10:36 What’s a CTF and what’s a boot ROM?

00:10:38 A boot ROM is the first code that executes

00:10:40 the minute you give power to your iPhone.

00:10:42 Okay.

00:10:43 And CTFs were these competitions

00:10:45 that I played, capture the flag.

00:10:46 Capture the flag, I was gonna ask you about that.

00:10:48 What are those, look at,

00:10:49 I watched a couple of videos on YouTube,

00:10:51 those look fascinating.

00:10:52 What have you learned about maybe

00:10:54 at the high level of vulnerability of systems

00:10:56 from these competitions?

00:11:00 I feel like in the heyday of CTFs,

00:11:04 you had all of the best security people in the world

00:11:08 challenging each other and coming up

00:11:10 with new toy exploitable things over here.

00:11:13 And then everybody, okay, who can break it?

00:11:15 And when you break it, you get like,

00:11:17 there’s like a file on the server called flag.

00:11:19 And then there’s a program running,

00:11:20 listening on a socket that’s vulnerable.

00:11:22 So you write an exploit, you get a shell,

00:11:24 and then you cat flag, and then you type the flag

00:11:27 into like a web based scoreboard and you get points.
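
(For a concrete sense of that loop, here is a minimal sketch using the pwntools library; the host, port, and payload are hypothetical, since a real exploit depends entirely on the vulnerable binary.)

```python
from pwn import remote  # pwntools, the standard CTF exploitation library

HOST, PORT = "pwnable.example-ctf.org", 31337  # hypothetical challenge server

r = remote(HOST, PORT)

# Placeholder payload: a real one might overflow a buffer to hijack
# control flow and spawn a shell on the remote service.
payload = b"A" * 64
r.sendline(payload)

# If the exploit worked, these commands now run in a shell on the server.
r.sendline(b"cat flag")
print(r.recvline())  # the flag, which goes into the web-based scoreboard
```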

00:11:29 So the goal is essentially,

00:11:31 to find an exploit in the system

00:11:32 that allows you to run shell,

00:11:35 to run arbitrary code on that system.

00:11:37 That’s one of the categories.

00:11:40 That’s like the pwnable category.

00:11:43 Pwnable?

00:11:44 Yeah, pwnable.

00:11:45 It’s like, you know, you pwn the program.

00:11:47 It’s a program that’s, yeah.

00:11:48 Yeah, you know, first of all, I apologize.

00:11:54 I’m gonna say it’s because I’m Russian,

00:11:56 but maybe you can help educate me.

00:12:00 Some video game, like, misspelled ‘own’ way back in the day.

00:12:02 Yeah, and it’s just, I wonder if there’s a definition.

00:12:06 I’ll have to go to Urban Dictionary for it.

00:12:08 It’ll be interesting to see what it says.

00:12:09 Okay, so what was the heyday of CTF, by the way?

00:12:12 But was it, what decade are we talking about?

00:12:15 I think like, I mean, maybe I’m biased

00:12:18 because it’s the era that I played,

00:12:21 but like 2011 to 2015,

00:12:27 because the modern CTF scene

00:12:30 is similar to the modern competitive programming scene.

00:12:32 You have people who like do drills.

00:12:34 You have people who practice.

00:12:35 And then once you’ve done that,

00:12:36 you’ve turned it less into a game of generic computer skill

00:12:39 and more into a game of, okay,

00:12:41 you drill on these five categories.

00:12:44 And then before that, it wasn’t,

00:12:48 it didn’t have like as much attention as it had.

00:12:52 I don’t know, they were like,

00:12:53 I won $30,000 once in Korea for one of these competitions.

00:12:56 Holy crap.

00:12:56 Yeah, they were, they were, that was.

00:12:57 So that means, I mean, money is money,

00:12:59 but that means there was probably good people there.

00:13:02 Exactly, yeah.

00:13:03 Are the challenges human constructed

00:13:06 or are they grounded in some real flaws and real systems?

00:13:10 Usually they’re human constructed,

00:13:13 but they’re usually inspired by real flaws.

00:13:15 What kind of systems are imagined?

00:13:17 Is it really focused on mobile?

00:13:19 Like what has vulnerabilities these days?

00:13:20 Is it primarily mobile systems like Android?

00:13:25 Oh, everything does.

00:13:26 Still. Yeah, of course.

00:13:28 The price has kind of gone up

00:13:29 because less and less people can find them.

00:13:31 And what’s happened in security

00:13:32 is now if you want to like jailbreak an iPhone,

00:13:34 you don’t need one exploit anymore, you need nine.

00:13:37 Nine chained together? What does that mean?

00:13:39 Yeah, wow.

00:13:40 Okay, so it’s really,

00:13:42 what’s the benefit speaking higher level

00:13:46 philosophically about hacking?

00:13:48 I mean, it sounds from everything I’ve seen about you,

00:13:50 you just love the challenge

00:13:51 and you don’t want to do anything.

00:13:54 You don’t want to bring that exploit out into the world

00:13:58 and do any actual, let it run wild.

00:14:01 You just want to solve it

00:14:02 and then you go on to the next thing.

00:14:05 Oh yeah, I mean, doing criminal stuff’s not really worth it.

00:14:08 And I’ll actually use the same argument

00:14:10 for why I don’t do defense for why I don’t do crime.

00:14:15 If you want to defend a system,

00:14:16 say the system has 10 holes, right?

00:14:19 If you find nine of those holes as a defender,

00:14:22 you still lose because the attacker

00:14:23 gets in through the last one.

00:14:25 If you’re an attacker,

00:14:26 you only have to find one out of the 10.

00:14:28 But if you’re a criminal,

00:14:30 if you log on with a VPN nine out of the 10 times,

00:14:34 but one time you forget, you’re done.

00:14:37 Because you’re caught, okay.

00:14:39 Because you only have to mess up once

00:14:41 to be caught as a criminal.

00:14:42 That’s why I’m not a criminal.

00:14:45 But okay, let me,

00:14:46 because I was having a discussion with somebody

00:14:49 just at a high level about nuclear weapons actually,

00:14:52 why we haven’t blown ourselves up yet.

00:14:56 And my feeling is all the smart people in the world,

00:14:59 if you look at the distribution of smart people,

00:15:04 smart people are generally good.

00:15:06 And then this other person I was talking to,

00:15:07 Sean Carroll, the physicist,

00:15:09 and he was saying, no, good and bad people

00:15:11 are evenly distributed amongst everybody.

00:15:13 My sense was good hackers are in general good people

00:15:17 and they don’t want to mess with the world.

00:15:20 What’s your sense?

00:15:21 I’m not even sure about that.

00:15:25 Like,

00:15:28 I have a nice life.

00:15:30 Crime wouldn’t get me anything.

00:15:34 But if you’re good and you have these skills,

00:15:36 you probably have a nice life too, right?

00:15:38 Right, you can use it for other things.

00:15:40 But is there an ethical,

00:15:41 is there a little voice in your head that says,

00:15:46 well, yeah, if you could hack something

00:15:48 to where you could hurt people

00:15:52 and you could earn a lot of money doing it though,

00:15:54 not hurt physically perhaps,

00:15:56 but disrupt their life in some kind of way,

00:16:00 isn’t there a little voice that says?

00:16:03 Well, two things.

00:16:04 One, I don’t really care about money.

00:16:06 So like the money wouldn’t be an incentive.

00:16:08 The thrill might be an incentive.

00:16:10 But when I was 19, I read Crime and Punishment.

00:16:14 And that was another great one

00:16:16 that talked me out of ever really doing crime.

00:16:19 Cause it’s like, that’s gonna be me.

00:16:21 I’d get away with it, but it would just run through my head.

00:16:25 Even if I got away with it, you know?

00:16:26 And then you do crime for long enough,

00:16:27 you’ll never get away with it.
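
(The arithmetic behind that claim is unforgiving: even covering your tracks 99% of the time, the odds of never slipping collapse as the jobs add up.)

```python
# Probability of never getting caught at a 99% per-job success rate.
p_clean = 0.99
for jobs in (10, 100, 500):
    print(jobs, round(p_clean ** jobs, 3))
# 10 -> 0.904, 100 -> 0.366, 500 -> 0.007
```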

00:16:28 That’s right.

00:16:29 In the end, that’s a good reason to be good.

00:16:32 I wouldn’t say I’m good.

00:16:33 I would just say I’m not bad.

00:16:34 You’re a talented programmer and a hacker

00:16:38 in a good positive sense of the word.

00:16:40 You’ve played around,

00:16:42 found vulnerabilities in various systems.

00:16:44 What have you learned broadly

00:16:46 about the design of systems and so on

00:16:49 from that whole process?

00:16:53 You learn to not take things

00:16:59 for what people say they are,

00:17:02 but you look at things for what they actually are.

00:17:07 Yeah.

00:17:07 I understand that’s what you tell me it is,

00:17:10 but what does it do?

00:17:11 Right.

00:17:12 And you have nice visualization tools

00:17:14 to really know what it’s really doing.

00:17:16 Oh, I wish.

00:17:17 I’m a better programmer now than I was in 2014.

00:17:20 Like I said, QIRA was the first tool

00:17:21 that I wrote that was usable.

00:17:23 I wouldn’t say the code was great.

00:17:25 I still wouldn’t say my code is great.

00:17:28 So what was your evolution as a programmer, aside from practice?

00:17:31 So you started with C.

00:17:33 At which point did you pick up Python?

00:17:35 Because you’re pretty big in Python now.

00:17:37 Now, yeah, in college.

00:17:39 I went to Carnegie Mellon when I was 22.

00:17:42 I went back.

00:17:43 I’m like, all right,

00:17:44 I’m gonna take all your hardest CS courses.

00:17:46 We’ll see how I do, right?

00:17:47 Like, did I miss anything

00:17:48 by not having a real undergraduate education?

00:17:51 Took operating systems, compilers, AI,

00:17:54 and, like, their freshman weed-out math course.

00:17:58 And…

00:18:00 Operating systems, some of those classes

00:18:02 you mentioned are pretty tough, actually.

00:18:04 They’re great.

00:18:05 At least the 2012, circa 2012,

00:18:08 operating systems and compilers were two of the,

00:18:12 they were the best classes I’ve ever taken in my life.

00:18:14 Because you write an operating system

00:18:15 and you write a compiler.

00:18:18 I wrote my operating system in C

00:18:19 and I wrote my compiler in Haskell,

00:18:21 but somehow I picked up Python that semester as well.

00:18:26 I started using it for the CTFs, actually.

00:18:28 That’s when I really started to get into CTFs

00:18:30 and CTFs, it’s a race against the clock.

00:18:33 So I can’t write things in C.

00:18:35 Oh, there’s a clock component.

00:18:36 So you really want to use the programming languages

00:18:38 so you can be fastest.

00:18:38 48 hours, pwn as many of these challenges as you can.

00:18:41 Pwn.

00:18:42 Yeah, you got like a hundred points of challenge.

00:18:43 Whatever team gets the most.

00:18:46 You were both at Facebook and Google for a brief stint.

00:18:50 Yeah.

00:18:51 With Project Zero actually at Google for five months

00:18:54 where you developed QIRA.

00:18:56 What was Project Zero about in general?

00:18:59 What, I’m just curious about the security efforts

00:19:03 in these companies.

00:19:05 Well, Project Zero started the same time I went there.

00:19:08 What years were you there?

00:19:11 2015.

00:19:12 2015.

00:19:13 So that was right at the beginning of Project Zero.

00:19:15 It’s small.

00:19:16 It’s Google’s offensive security team.

00:19:21 I’ll try to give the best public facing explanation

00:19:25 that I can.

00:19:26 So the idea is basically these vulnerabilities

00:19:31 exist in the world.

00:19:33 Nation states have them.

00:19:35 Some high powered bad actors have them.

00:19:39 Sometime people will find these vulnerabilities

00:19:44 and submit them in bug bounties to the companies.

00:19:47 But a lot of the companies don’t really care.

00:19:49 They don’t even fix the bug.

00:19:51 It doesn’t hurt for there to be a vulnerability.

00:19:53 So Project Zero is like, we’re going to do it different.

00:19:55 We’re going to announce a vulnerability

00:19:57 and we’re going to give them 90 days to fix it.

00:19:59 And then whether they fix it or not,

00:20:00 we’re going to drop the zero day.

00:20:03 Oh, wow.

00:20:04 We’re going to drop the weapon.

00:20:04 That’s so cool.

00:20:05 That is so cool.

00:20:07 I love the deadlines.

00:20:09 Oh, that’s so cool.

00:20:10 Give them real deadlines.

00:20:10 Yeah.

00:20:12 And I think it’s done a lot for moving the industry forward.

00:20:15 I watched your coding sessions that you streamed online.

00:20:20 You code things up, the basic projects,

00:20:22 usually from scratch.

00:20:24 I would say sort of as a programmer myself,

00:20:28 just watching you that you type really fast

00:20:30 and your brain works in both brilliant and chaotic ways.

00:20:34 I don’t know if that’s always true,

00:20:35 but certainly for the live streams.

00:20:37 So it’s interesting to me because I’m more,

00:20:40 I’m much slower and systematic and careful.

00:20:43 And you just move, I mean,

00:20:44 probably an order of magnitude faster.

00:20:48 So I’m curious, is there a method to your madness?

00:20:51 Is it just who you are?

00:20:53 There’s pros and cons.

00:20:54 There’s pros and cons to my programming style.

00:20:58 And I’m aware of them.

00:20:59 Like if you ask me to like get something up

00:21:03 and working quickly with like an API

00:21:05 that’s kind of undocumented,

00:21:06 I will do this super fast

00:21:08 because I will throw things at it until it works.

00:21:10 If you ask me to take a vector and rotate it 90 degrees

00:21:14 and then flip it over the XY plane,

00:21:19 I’ll spam program for two hours and won’t get it.

00:21:22 Oh, because it’s something that you could do

00:21:23 with a sheet of paper, think through design,

00:21:26 and then just, do you really just throw stuff at the wall

00:21:30 and you get so good at it that it usually works?

00:21:34 I should become better at the other kind as well.

00:21:36 Sometimes I’ll do things methodically.

00:21:39 It’s nowhere near as entertaining on the Twitch streams.

00:21:41 I do exaggerate it a bit on the Twitch streams as well.

00:21:43 The Twitch streams, I mean,

00:21:44 what do you want to see a game or you want to see

00:21:45 actions per minute, right?

00:21:46 I’ll show you APM for programming too.

00:21:48 Yeah, I recommend people go to it.

00:21:50 I think I watched, I watched probably several hours

00:21:53 of you, like I’ve actually left you programming

00:21:56 in the background while I was programming

00:21:59 because you made me, it was like watching

00:22:02 a really good gamer.

00:22:03 It’s like energizes you because you’re like moving so fast.

00:22:06 It’s so, it’s awesome.

00:22:07 It’s inspiring and it made me jealous that like,

00:22:12 because my own programming is inadequate

00:22:14 in terms of speed.

00:22:15 Oh, I was like.

00:22:17 So I’m twice as frantic on the live streams

00:22:20 as I am when I code without them.

00:22:22 It’s super entertaining.

00:22:23 So I wasn’t even paying attention to what you were coding,

00:22:26 which is great.

00:22:27 It’s just watching you switch windows and Vim I guess

00:22:30 is the most.

00:22:31 Yeah, there’s Vim on screen.

00:22:33 I developed the workflow at Facebook

00:22:34 and stuck with it.

00:22:35 How do you learn new programming tools,

00:22:37 ideas, techniques these days?

00:22:39 What’s your like a methodology for learning new things?

00:22:42 So I wrote for comma, the distributed file systems

00:22:48 out in the world are extremely complex.

00:22:50 Like if you want to install something like Ceph,

00:22:55 Ceph is I think the like open infrastructure

00:22:58 distributed file system,

00:23:00 or there’s like newer ones like SeaweedFS,

00:23:04 but these are all like 10,000 plus line projects.

00:23:06 I think some of them are even a hundred thousand line

00:23:09 and just configuring them is a nightmare.

00:23:11 So I wrote one, it’s 200 lines,

00:23:16 and it uses NGINX and volume servers

00:23:18 and has this little master server that I wrote in Go.

00:23:21 And if I would say

00:23:24 that I’m proud per line of any code I wrote,

00:23:27 maybe there’s some exploits that I think are beautiful,

00:23:29 and then this, this is 200 lines.

00:23:31 And just the way that I thought about it,

00:23:33 I think was very good.

00:23:34 And the reason it’s very good is because

00:23:35 that was the fourth version of it that I wrote.

00:23:37 And I had three versions that I threw away.
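
(A sketch of that architecture, in Python rather than Go and with made-up server names: the master’s whole job is to map a key to a volume server and redirect the client there; NGINX-backed volume servers store the actual bytes.)

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

VOLUME_SERVERS = ["http://volume1:8001", "http://volume2:8001"]  # hypothetical

def pick_volume(key: str) -> str:
    # Deterministically map a key to one volume server.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return VOLUME_SERVERS[h % len(VOLUME_SERVERS)]

class Master(BaseHTTPRequestHandler):
    def do_GET(self):
        # Redirect the client to the volume server holding this key;
        # that server (e.g. NGINX with DAV enabled) serves the bytes.
        self.send_response(302)
        self.send_header("Location", pick_volume(self.path) + self.path)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 3000), Master).serve_forever()
```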

00:23:39 You mentioned, did you say Go?

00:23:40 I wrote in Go, yeah.

00:23:41 In Go.

00:23:42 Is that a functional language?

00:23:43 I forget what Go is.

00:23:45 Go is Google’s language.

00:23:47 Right.

00:23:48 It’s not functional.

00:23:49 It’s some, it’s like in a way it’s C++, but easier.

00:23:56 It’s, it’s strongly typed.

00:23:58 It has a nice ecosystem around it.

00:23:59 When I first looked at it, I was like, this is like Python,

00:24:02 but it takes twice as long to do anything.

00:24:04 Yeah.

00:24:05 Now that I’ve, OpenPilot is migrating to C,

00:24:09 but it still has large Python components.

00:24:10 I now understand why Python doesn’t work

00:24:12 for large code bases and why you want something like Go.

00:24:15 Interesting.

00:24:16 So why, why doesn’t Python work for,

00:24:18 so even most, speaking for myself at least,

00:24:21 like we do a lot of stuff,

00:24:23 basically demo level work with autonomous vehicles

00:24:26 and most of the work is Python.

00:24:28 Yeah.

00:24:29 Why doesn’t Python work for large code bases?

00:24:32 Because, well, lack of type checking is a big part.

00:24:37 So errors creep in.

00:24:39 Yeah.

00:24:40 And like, you don’t know,

00:24:41 the compiler can tell you like nothing, right?

00:24:45 So everything is either, you know,

00:24:48 like syntax errors, fine.

00:24:49 But if you misspell a variable in Python,

00:24:51 the compiler won’t catch that.

00:24:53 There’s like linters that can catch it some of the time.

00:24:56 There’s no types.

00:24:57 This is really the biggest downside.

00:25:00 And then, well, Python’s slow, but that’s not related to it.

00:25:02 Well, maybe it’s kind of related to it, so it’s lack of.
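
(A tiny example of that failure mode: misspell a variable in Python and nothing complains until you notice the wrong answer at runtime; a compiler, or mypy on annotated Python, would flag it immediately.)

```python
def total_cost(prices):
    total = 0
    for p in prices:
        totl = total + p  # typo: creates a new variable instead of updating 'total'
    return total

print(total_cost([1, 2, 3]))  # prints 0, not 6, and Python never complained
```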

00:25:04 So what’s, what’s in your toolbox these days?

00:25:06 Is it Python?

00:25:07 What else?

00:25:08 I need to move to something else.

00:25:10 My adventure into dependently typed languages,

00:25:12 I love these languages.

00:25:14 They just have like syntax from the 80s.

00:25:18 What do you think about JavaScript?

00:25:21 ES6, like the modern, or TypeScript?

00:25:23 JavaScript is,

00:25:26 the whole ecosystem is unbelievably confusing.

00:25:28 Right.

00:25:29 NPM updates a package from 0.2.2 to 0.2.5,

00:25:32 and that breaks your Babel linter,

00:25:34 which translates your ES6 into ES5,

00:25:37 which doesn’t run on, so.

00:25:39 Why do I have to compile my JavaScript again, huh?

00:25:42 It may be the future, though.

00:25:44 You think about, I mean,

00:25:45 I’ve embraced JavaScript recently,

00:25:47 just because, just like I’ve continually embraced PHP,

00:25:52 it seems that these worst possible languages

00:25:54 live on for the longest, like cockroaches never die.

00:25:57 Yeah.

00:25:58 Well, it’s in the browser, and it’s fast.

00:26:00 It’s fast.

00:26:01 Yeah.

00:26:02 It’s in the browser, and compute might stay,

00:26:04 become, you know, the browser.

00:26:06 It’s unclear what the role of the browser is

00:26:09 in terms of distributed computation in the future, so.

00:26:13 JavaScript is definitely here to stay.

00:26:15 Yeah.

00:26:16 It’s interesting if autonomous vehicles

00:26:18 will run on JavaScript one day.

00:26:19 I mean, you have to consider these possibilities.

00:26:21 Well, all our debug tools are JavaScript.

00:26:24 We actually just open sourced them.

00:26:26 We have a tool, Explorer,

00:26:27 which you can annotate your disengagements,

00:26:29 and we have a tool, Cabana,

00:26:30 which lets you analyze the CAN traffic from the car.

00:26:32 So basically, anytime you’re visualizing something

00:26:35 about the log, you’re using JavaScript.

00:26:37 Well, the web is the best UI toolkit by far, so.

00:26:41 And then, you know what?

00:26:42 You’re coding in JavaScript.

00:26:42 We have a React guy.

00:26:43 He’s good.

00:26:44 React, nice.

00:26:46 Let’s get into it.

00:26:46 So let’s talk autonomous vehicles.

00:26:48 Yeah.

00:26:49 You founded Comma AI.

00:26:51 Let’s, at a high level,

00:26:54 how did you get into the world of vehicle automation?

00:26:57 Can you also just, for people who don’t know,

00:26:59 tell the story of Comma AI?

00:27:01 Sure.

00:27:02 So I was working at this AI startup,

00:27:06 and a friend approached me,

00:27:08 and he’s like, dude, I don’t know where this is going,

00:27:12 but the coolest applied AI problem today

00:27:15 is self driving cars.

00:27:16 I’m like, well, absolutely.

00:27:18 You want to meet with Elon Musk,

00:27:20 and he’s looking for somebody to build a vision system

00:27:24 for autopilot.

00:27:27 This is when they were still on AP1.

00:27:29 They were still using Mobileye.

00:27:30 Elon, back then, was looking for a replacement,

00:27:33 and he brought me in,

00:27:36 and we talked about a contract

00:27:37 where I would deliver something

00:27:39 that meets Mobileye level performance.

00:27:41 I would get paid $12 million if I could deliver it tomorrow,

00:27:43 and I would lose $1 million

00:27:45 for every month I didn’t deliver.

00:27:46 Yeah.

00:27:47 So I was like, okay, this is a great deal.

00:27:49 This is a super exciting challenge.

00:27:52 You know what?

00:27:53 Even if it takes me 10 months,

00:27:54 I get $2 million.

00:27:55 It’s good.

00:27:56 Maybe I can finish up in five.

00:27:57 Maybe I don’t finish it at all,

00:27:58 and I get paid nothing,

00:27:58 and I can still work for 12 months for free.

00:28:00 So maybe just take a pause on that.

00:28:02 I’m also curious about this

00:28:04 because I’ve been working in robotics for a long time,

00:28:06 and I’m curious to see a person like you

00:28:07 just step in and sort of somewhat naive,

00:28:11 but brilliant, right?

00:28:11 So that’s the best place to be

00:28:13 because you basically full steam take on a problem.

00:28:17 How confident, how, from that time,

00:28:19 because you know a lot more now,

00:28:21 at that time, how hard do you think it is

00:28:23 to solve all of autonomous driving?

00:28:25 I remember I suggested to Elon in the meeting

00:28:30 putting a GPU behind each camera

00:28:33 to keep the compute local.

00:28:35 This is an incredibly stupid idea.

00:28:38 I leave the meeting 10 minutes later,

00:28:39 and I’m like, I could have spent a little bit of time

00:28:41 thinking about this problem before I went in.

00:28:42 Why is it a stupid idea?

00:28:44 Oh, just send all your cameras to one big GPU.

00:28:46 You’re much better off doing that.

00:28:48 Oh, sorry.

00:28:49 You said behind every camera have a GPU.

00:28:50 Every camera have a small GPU.

00:28:51 I was like, oh, I’ll put the first few layers

00:28:52 of my convs there.

00:28:54 Ugh, why’d I say that?

00:28:56 That’s possible.

00:28:56 It’s possible, but it’s a bad idea.

00:28:58 It’s not obviously a bad idea.

00:29:00 Pretty obviously bad,

00:29:01 but whether it’s actually a bad idea or not,

00:29:02 I left that meeting with Elon, beating myself up.

00:29:05 I’m like, why’d I say something stupid?

00:29:07 Yeah, you haven’t at least thought through

00:29:10 every aspect of it, yeah.

00:29:12 He’s very sharp too.

00:29:13 Usually in life, I get away with saying stupid things,

00:29:15 and then of course,

00:29:16 right away he called me out about it.

00:29:18 And usually in life, I get away with saying stupid things

00:29:21 and then a lot of times people don’t even notice

00:29:26 and I’ll correct it and bring the conversation back.

00:29:28 But with Elon, it was like, nope, okay, well.

00:29:31 That’s not at all why the contract fell through.

00:29:33 I was much more prepared the second time I met him.

00:29:35 Yeah, but in general, how hard did you think it is?

00:29:39 Like 12 months is a tough timeline.

00:29:43 Oh, I just thought I’d clone the Mobileye EyeQ3.

00:29:45 I didn’t think I’d solve level five self driving

00:29:47 or anything.

00:29:48 So the goal there was to do lane keeping, good lane keeping.

00:29:52 I saw, my friend showed me the outputs from a Mobileye

00:29:55 and the outputs from a Mobileye was just basically

00:29:57 two lanes and a position of a lead car.

00:29:59 I’m like, I can gather a data set and train this net

00:30:02 in weeks and I did.

00:30:04 Well, first time I tried the implementation of Mobileye

00:30:07 in a Tesla, I was really surprised how good it is.

00:30:11 It’s incredibly good.

00:30:12 Cause, just cause I’ve done a lot

00:30:14 of computer vision, I thought it’d be a lot harder

00:30:17 to create a system that’s stable.

00:30:20 So I was personally surprised, you know,

00:30:24 have to admit it.

00:30:25 Cause I was kind of skeptical before trying it.

00:30:27 Cause I thought it would go in and out a lot more.

00:30:31 It would get disengaged a lot more and it’s pretty robust.

00:30:36 So what, how hard is the problem when you tackled it?

00:30:42 So I think AP1 was great.

00:30:44 Like Elon talked about disengagements on the 405 down in LA

00:30:49 with like the lane marks are kind of faded

00:30:51 and the Mobileye system would drop out.

00:30:53 Like I had something up and working that I would say

00:30:57 was like the same quality in three months.

00:31:02 Same quality, but how do you know?

00:31:04 You say stuff like that confidently, but you can’t,

00:31:07 and I love it, but the question is you can’t,

00:31:12 you’re kind of going by feel cause you test it out.

00:31:14 Absolutely, absolutely.

00:31:15 Like I would take, I borrowed my friend’s Tesla.

00:31:18 I would take AP1 out for a drive

00:31:20 and then I would take my system out for a drive.

00:31:22 And it seems reasonably like the same.

00:31:25 So the 405, how hard is it to create something

00:31:30 that could actually be a product that’s deployed?

00:31:34 I mean, I’ve read an article where Elon

00:31:37 responded to something about you, saying

00:31:40 that to build Autopilot is more complicated

00:31:46 than a single George Hotz level job.

00:31:51 How hard is that job to create something

00:31:55 that would work across the globe?

00:31:58 Well, I don’t think globally is the challenge.

00:32:00 But Elon followed that up by saying

00:32:02 it’s gonna take two years in a company of 10 people.

00:32:04 And here I am four years later with a company of 12 people.

00:32:07 And I think we still have another two to go.

00:32:09 Two years, so yeah.

00:32:11 So what do you think about how Tesla is progressing

00:32:15 with autopilot of V2, V3?

00:32:19 I think we’ve kept pace with them pretty well.

00:32:23 I think Navigate on Autopilot is terrible.

00:32:26 We had some demo features internally of the same stuff

00:32:31 and we would test it.

00:32:32 And I’m like, I’m not shipping this

00:32:33 even as like open source software to people.

00:32:35 Why do you think it’s terrible?

00:32:37 Consumer Reports does a great job of describing it.

00:32:39 Like when it makes a lane change,

00:32:41 it does it worse than a human.

00:32:43 You shouldn’t ship things like that. Autopilot, OpenPilot,

00:32:46 they lane keep better than a human.

00:32:49 If you turn it on for a stretch of a highway,

00:32:53 like an hour long, it’s never gonna touch a lane line.

00:32:56 A human will probably touch a lane line twice.

00:32:58 You just inspired me.

00:33:00 I don’t know if you’re grounded in data on that.

00:33:02 I read your paper.

00:33:03 Okay, but that’s interesting.

00:33:05 I wonder actually how often we touch lane lines

00:33:10 in general, like a little bit,

00:33:11 because it is.

00:33:13 I could answer that question pretty easily

00:33:14 with the comma data set.

00:33:15 Yeah, I’m curious.

00:33:16 I’ve never answered it.

00:33:17 I don’t know.

00:33:18 I just, two is like my personal.

00:33:19 It feels right.

00:33:21 That’s interesting.

00:33:22 Because every time you touch a lane,

00:33:23 that’s a source of a little bit of stress

00:33:26 and kind of lane keeping is removing that stress.

00:33:29 That’s ultimately the biggest value add honestly

00:33:32 is just removing the stress of having to stay in lane.

00:33:35 And I think honestly, I don’t think people fully realize,

00:33:39 first of all, that that’s a big value add,

00:33:41 but also that that’s all it is.

00:33:44 And that not only, I find it a huge value add.

00:33:48 I drove down when we moved to San Diego,

00:33:50 I drove down in an Enterprise rental car, and I missed it.

00:33:53 So I missed having the system so much.

00:33:55 It’s so much more tiring to drive without it.

00:34:00 It is that lane centering.

00:34:02 That’s the key feature.

00:34:04 Yeah.

00:34:06 And in a way, it’s the only feature

00:34:08 that actually adds value to people’s lives

00:34:11 in autonomous vehicles today.

00:34:12 Waymo does not add value to people’s lives.

00:34:13 It’s a more expensive, slower Uber.

00:34:15 Maybe someday it’ll be this big cliff where it adds value,

00:34:18 but I don’t really believe it.

00:34:19 It is fascinating.

00:34:20 I haven’t talked to, this is good.

00:34:22 Cause I haven’t, I have intuitively,

00:34:25 but I think we’re making it explicit now.

00:34:28 I actually believe that really good lane keeping

00:34:35 is a reason to buy a car.

00:34:37 Will be a reason to buy a car and it’s a huge value add.

00:34:39 I’ve never, until we just started talking about it,

00:34:41 I haven’t really quite realized it.

00:34:43 That I’ve felt Elon’s chase of level four

00:34:49 is not the correct chase.

00:34:52 Because you should just say,

00:34:55 as if from a Tesla perspective,

00:34:58 Tesla has the best lane keeping.

00:35:00 Comma AI should say, Comma AI is the best lane keeping.

00:35:04 And that is it.

00:35:05 Yeah. Yeah.

00:35:06 So do you think?

00:35:07 You have to do the longitudinal as well.

00:35:09 You can’t just lane keep.

00:35:10 You have to do ACC,

00:35:12 but ACC is much more forgiving than lane keep,

00:35:15 especially on the highway.

00:35:17 By the way, is Comma AI’s system camera only, correct?

00:35:21 No, we use the radar.

00:35:23 From the car, you’re able to get the, okay.

00:35:25 Hmm?

00:35:26 We can do a camera only now.

00:35:28 It’s gotten to the point,

00:35:29 but we leave the radar there as like a, it’s fusion now.

00:35:33 Okay, so let’s maybe talk through some of the system specs

00:35:36 on the hardware.

00:35:37 What’s the hardware side of what you’re providing?

00:35:42 What’s the capabilities on the software side

00:35:44 with OpenPilot and so on?

00:35:46 So OpenPilot, as the box that we sell, that it runs on,

00:35:52 it’s a phone in a plastic case.

00:35:54 It’s nothing special.

00:35:55 We sell it without the software.

00:35:56 So you buy the phone, it’s just easy.

00:35:59 It’ll be easy set up, but it’s sold with no software.

00:36:03 OpenPilot right now is about to be 0.6.

00:36:07 When it gets to 1.0,

00:36:08 I think we’ll be ready for a consumer product.

00:36:10 We’re not gonna add any new features.

00:36:11 We’re just gonna make the lane keeping really, really good.

00:36:14 Okay, I got it.

00:36:15 So what do we have right now?

00:36:16 It’s a Snapdragon 820.

00:36:20 It’s a Sony IMX 298 forward facing camera.

00:36:24 Driver monitoring camera,

00:36:26 it’s just a selfie camera on the phone.

00:36:27 And a CAN transceiver,

00:36:31 a little thing called panda.

00:36:33 And it talks over USB to the phone,

00:36:36 and then it has three CAN buses

00:36:37 that it talks to the car with.

00:36:39 One of those CAN buses is the radar CAN bus.

00:36:42 One of them is the main car CAN bus

00:36:44 and the other one is the proxy camera CAN bus.

00:36:46 We leave the existing camera in place

00:36:48 so we don’t turn AEB off.

00:36:50 Right now, we still turn AEB off

00:36:52 if you’re using our longitudinal,

00:36:53 but we’re gonna fix that before 1.0.
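
(As a rough sketch of what talking to those buses looks like with comma’s open-source panda Python library; the exact message format varies by library version and car, so treat this as illustrative.)

```python
from panda import Panda  # comma.ai's USB CAN interface library

p = Panda()  # connects to the panda over USB

# Read raw CAN messages; each one carries an address, payload bytes,
# and the bus it arrived on (radar, main car, or the proxied camera
# bus described above).
while True:
    for msg in p.can_recv():
        print(msg)
```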

00:36:55 Got it.

00:36:56 Wow, that’s cool.

00:36:57 And it’s CAN both ways.

00:36:59 So how are you able to control vehicles?

00:37:03 So we proxy,

00:37:05 the vehicles that we work with

00:37:06 already have a lane keeping assist system.

00:37:10 So lane keeping assist can mean a huge variety of things.

00:37:13 It can mean it will apply a small torque to the wheel

00:37:17 after you’ve already crossed a lane line by a foot,

00:37:21 which is the system in the older Toyotas

00:37:23 versus like, I think Tesla still calls it

00:37:26 lane keeping assist,

00:37:27 where it’ll keep you perfectly

00:37:28 in the center of the lane on the highway.

00:37:32 You can control, like with the joystick, the car.

00:37:35 So these cars already have the capability of drive by wire.

00:37:37 So is it trivial to convert a car so that

00:37:45 OpenPilot is able to control the steering?

00:37:48 Oh, a new car or a car that we,

00:37:49 so we have support now for 45 different makes of cars.

00:37:52 What are the cars in general?

00:37:54 Mostly Hondas and Toyotas.

00:37:56 We support almost every Honda and Toyota made this year.

00:38:01 And then a bunch of GMs, a bunch of Subarus,

00:38:04 a bunch of Chevys.

00:38:05 It doesn’t have to be like a Prius,

00:38:06 it could be a Corolla as well.

00:38:07 Oh, the 2020 Corolla is the best car with OpenPilot.

00:38:10 It just came out.

00:38:11 The actuator has less lag than the older Corolla.

00:38:15 I think I started watching a video with your,

00:38:18 I mean, the way you make videos is awesome.

00:38:21 You’re just literally at the dealership streaming.

00:38:24 Yeah, I had my friend on the phone,

00:38:26 I’m like, bro, you wanna stream for an hour?

00:38:27 Yeah, and basically, like if stuff goes a little wrong,

00:38:31 you’re just like, you just go with it.

00:38:33 Yeah, I love it.

00:38:33 Well, it’s real.

00:38:34 Yeah, it’s real.

00:38:35 That’s so beautiful and it’s so in contrast

00:38:39 to the way other companies

00:38:42 would put together a video like that.

00:38:44 Kind of why I like to do it like that.

00:38:46 Good.

00:38:46 I mean, if you become super rich one day and successful,

00:38:49 I hope you keep it that way

00:38:50 because I think that’s actually what people love,

00:38:53 that kind of genuine.

00:38:54 Oh, it’s all that has value to me.

00:38:56 Money has no, if I sell out to like make money,

00:38:59 I sold out, it doesn’t matter.

00:39:01 What do I get?

00:39:02 Yacht?

00:39:03 I don’t want a yacht.

00:39:04 And I think Tesla actually has a small inkling

00:39:09 of that as well with Autonomy Day.

00:39:11 They did reveal more than, I mean, of course,

00:39:14 there’s marketing communications, you could tell,

00:39:15 but it’s more than most companies would reveal,

00:39:17 which is, I hope they go towards that direction more,

00:39:21 other companies, GM, Ford.

00:39:23 Oh, Tesla’s gonna win level five.

00:39:25 They really are.

00:39:26 So let’s talk about it.

00:39:27 You think, you’re focused on level two currently?

00:39:32 Currently.

00:39:33 We’re gonna be one to two years behind Tesla

00:39:36 getting to level five.

00:39:37 Okay.

00:39:38 We’re Android, right?

00:39:39 We’re Android.

00:39:40 You’re Android.

00:39:41 I’m just saying, once Tesla gets it,

00:39:42 we’re one to two years behind.

00:39:43 I’m not making any timeline on when Tesla’s

00:39:45 gonna get it. That’s right.

00:39:46 You did, that was brilliant.

00:39:47 I’m sorry, Tesla investors,

00:39:48 if you think you’re gonna have an autonomous

00:39:49 Robo Taxi fleet by the end of the year.

00:39:52 Yeah, so that’s.

00:39:53 I’ll bet against that.

00:39:54 So what do you think about this?

00:39:57 Most level four companies

00:40:02 are kind of just doing their usual safety driver,

00:40:07 doing full autonomy kind of testing.

00:40:08 And then Tesla is basically trying to go

00:40:12 from lane keeping to full autonomy.

00:40:15 What do you think about that approach?

00:40:16 How successful would it be?

00:40:18 It’s a ton better approach.

00:40:20 Because Tesla is gathering data on a scale

00:40:23 that none of them are.

00:40:25 They’re putting real users behind the wheel of the cars.

00:40:29 It’s, I think, the only strategy that works.

00:40:33 The incremental.

00:40:34 Well, so there’s a few components to Tesla approach

00:40:36 that’s more than just the incrementalism.

00:40:38 What you spoke of is one, the software,

00:40:41 so over the air software updates.

00:40:43 Necessity.

00:40:44 I mean Waymo and Cruise have those too.

00:40:46 Those aren’t.

00:40:47 But.

00:40:48 Those differentiate from the automakers.

00:40:49 Right, no lane keeping systems have,

00:40:52 no cars with lane keeping systems have that except Tesla.

00:40:54 Yeah.

00:40:55 And the other one is the data, the other direction,

00:40:59 which is the ability to query the data.

00:41:01 I don’t think they’re actually collecting

00:41:03 as much data as people think,

00:41:04 but the ability to turn on collection and turn it off.

00:41:09 So I’m both in the robotics world

00:41:12 and the psychology human factors world.

00:41:15 Many people believe that level two autonomy is problematic

00:41:18 because of the human factor.

00:41:20 Like the more the task is automated,

00:41:23 the more there’s a vigilance decrement.

00:41:26 You start to fall asleep.

00:41:27 You start to become complacent,

00:41:28 start texting more and so on.

00:41:30 Do you worry about that?

00:41:32 Cause if we’re talking about transition from lane keeping

00:41:35 to full autonomy, if you’re spending 80% of the time,

00:41:39 not supervising the machine,

00:41:42 do you worry about what that means

00:41:45 to the safety of the drivers?

00:41:47 One, we don’t consider open pilot to be 1.0

00:41:49 until we have 100% driver monitoring.

00:41:52 You can cheat our driver monitoring system right now.

00:41:55 There’s a few ways to cheat it.

00:41:56 They’re pretty obvious.

00:41:58 We’re working on making that better.

00:41:59 Before we ship a consumer product that can drive cars,

00:42:02 I want to make sure that I have driver monitoring

00:42:04 that you can’t cheat.

00:42:05 What’s like a successful driver monitoring system look like?

00:42:07 Is it all about just keeping your eyes on the road?

00:42:11 Well, a few things.

00:42:12 So that’s what we went with at first for driver monitoring.

00:42:16 I’m checking, I’m actually looking at

00:42:18 where your head is looking.

00:42:19 The camera’s not that high resolution.

00:42:20 Eyes are a little bit hard to get.

00:42:21 Well, head is this big.

00:42:22 I mean, that’s.

00:42:23 Head is good.

00:42:24 And actually a lot of it, just psychology wise,

00:42:28 to have that monitor constantly there,

00:42:30 it reminds you that you have to be paying attention.

00:42:33 But we want to go further.

00:42:35 We just hired someone full time

00:42:36 to come on to do the driver monitoring.

00:42:37 I want to detect phone in frame

00:42:40 and I want to make sure you’re not sleeping.
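
(A minimal sketch of the head-pose approach, with invented thresholds: treat the driver as attentive while the head stays in a forward-facing cone, and warn once it has been outside that cone for too long.)

```python
import time

YAW_LIMIT_DEG, PITCH_LIMIT_DEG = 30.0, 20.0  # forward-facing cone (made up)
DISTRACTED_AFTER_S = 2.0                     # grace period before warning

class AttentionMonitor:
    def __init__(self):
        self.last_attentive = time.monotonic()

    def update(self, yaw_deg: float, pitch_deg: float) -> bool:
        """Feed in the latest head pose estimate; returns True to warn."""
        if abs(yaw_deg) < YAW_LIMIT_DEG and abs(pitch_deg) < PITCH_LIMIT_DEG:
            self.last_attentive = time.monotonic()
        return (time.monotonic() - self.last_attentive) > DISTRACTED_AFTER_S
```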

00:42:42 How much does the camera see of the body?

00:42:44 This one, not enough.

00:42:47 Not enough.

00:42:48 The next one, everything.

00:42:50 Well, it’s interesting, Fisheye,

00:42:51 because we’re doing just data collection, not real time.

00:42:55 But Fisheye is a beautiful,

00:42:57 being able to capture the body.

00:42:59 And the smartphone is really like the biggest problem.

00:43:03 I’ll show you.

00:43:04 I can show you one of the pictures from our new system.

00:43:06 Awesome, so you’re basically saying

00:43:09 the driver monitoring will be the answer to that.

00:43:13 I think the other point

00:43:14 that you raised in your paper is good as well.

00:43:16 You’re not asking a human to supervise a machine

00:43:20 without giving them the,

00:43:21 they can take over at any time.

00:43:23 Right.

00:43:24 Our safety model, you can take over.

00:43:25 We disengage on both the gas or the brake.

00:43:27 We don’t disengage on steering.

00:43:28 I don’t feel you have to.

00:43:30 But we disengage on gas or brake.

00:43:31 So it’s very easy for you to take over

00:43:34 and it’s very easy for you to reengage.

00:43:36 That switching should be super cheap.

00:43:39 The cars that require,

00:43:40 even autopilot requires a double press.

00:43:42 That’s almost, I see, I don’t like that.

00:43:44 And then the cancel, to cancel in autopilot,

00:43:48 you either have to press cancel,

00:43:49 which no one knows what that is, so they press the brake.

00:43:51 But a lot of times you don’t actually want

00:43:52 to press the brake.

00:43:53 You want to press the gas.

00:43:54 So you should cancel on gas.

00:43:55 Or wiggle the steering wheel, which is bad as well.
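
(The policy described here fits in a few lines; the field names are hypothetical, not OpenPilot’s actual interface: pedals always disengage, steering input never does, so taking over and reengaging stay cheap.)

```python
from dataclasses import dataclass

@dataclass
class DriverInput:
    gas_pressed: bool
    brake_pressed: bool
    steering_torque: float  # Nm applied by the driver

def update_engagement(engaged: bool, inp: DriverInput) -> bool:
    if inp.gas_pressed or inp.brake_pressed:
        return False  # pedals are the cheap, obvious way out
    return engaged    # steering input alone never disengages

# The driver can nudge the wheel and stay engaged:
print(update_engagement(True, DriverInput(False, False, steering_torque=1.5)))  # True
print(update_engagement(True, DriverInput(True, False, 0.0)))                   # False
```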

00:43:57 Wow, that’s brilliant.

00:43:58 I haven’t heard anyone articulate that point.

00:44:01 Oh, this is all I think about.

00:44:03 It’s the, because I think,

00:44:06 I think actually Tesla has done a better job

00:44:09 than most automakers at making that frictionless.

00:44:12 But you just described that it could be even better.

00:44:16 I love Super Cruise as an experience once it’s engaged.

00:44:21 I don’t know if you’ve used it,

00:44:22 but getting the thing to try to engage.

00:44:25 Yeah, I’ve used the, I’ve driven Super Cruise a lot.

00:44:27 So what’s your thoughts on the Super Cruise system?

00:44:29 You disengage Super Cruise and it falls back to ACC.

00:44:32 So my car’s like still accelerating.

00:44:34 It feels weird.

00:44:36 Otherwise, when you actually have Super Cruise engaged

00:44:39 on the highway, it is phenomenal.

00:44:41 We bought that Cadillac.

00:44:42 We just sold it.

00:44:43 But we bought it just to like experience this.

00:44:45 And I wanted everyone in the office to be like,

00:44:47 this is what we’re striving to build.

00:44:49 GM pioneering with the driver monitoring.

00:44:52 You like their driver monitoring system?

00:44:55 It has some bugs.

00:44:56 If there’s a sun shining back here, it’ll be blind to you.

00:45:00 Right.

00:45:01 But overall, mostly, yeah.

00:45:03 That’s so cool that you know all this stuff.

00:45:05 I don’t often talk to people that,

00:45:08 because it’s such a rare car, unfortunately, currently.

00:45:10 We bought one explicitly for this.

00:45:12 We lost like 25K in the depreciation,

00:45:15 but I feel it’s worth it.

00:45:16 I was very pleasantly surprised that GM system

00:45:21 was so innovative and really wasn’t advertised much,

00:45:26 wasn’t talked about much.

00:45:27 Yeah.

00:45:28 And I was nervous that it would die,

00:45:30 that it would disappear.

00:45:31 Well, they put it on the wrong car.

00:45:33 They should have put it on the Bolt

00:45:34 and not some weird Cadillac that nobody bought.

00:45:36 I think that’s gonna be in,

00:45:38 they’re saying at least it’s gonna be

00:45:40 in their entire fleet.

00:45:41 So what do you think about,

00:45:43 as long as we’re on the driver monitoring,

00:45:45 what do you think about Elon Musk’s claim

00:45:49 that driver monitoring is not needed?

00:45:51 Normally, I love his claims.

00:45:53 That one is stupid.

00:45:55 That one is stupid.

00:45:56 And, you know, he’s not gonna have his level five fleet

00:46:00 by the end of the year.

00:46:01 Hopefully he’s like, okay, I was wrong.

00:46:04 I’m gonna add driver monitoring.

00:46:06 Because when these systems get to the point

00:46:08 that they’re only messing up once every thousand miles,

00:46:10 you absolutely need driver monitoring.

00:46:14 So let me play, cause I agree with you,

00:46:15 but let me play devil’s advocate.

00:46:17 One possibility is that without driver monitoring,

00:46:22 people are able to monitor, self regulate,

00:46:26 monitor themselves.

00:46:28 You know, that, so your idea is.

00:46:30 You’ve seen all the people sleeping in Teslas?

00:46:33 Yeah, well, I’m a little skeptical

00:46:37 of all the people sleeping in Teslas

00:46:38 because I’ve stopped paying attention to that kind of stuff

00:46:44 because I want to see real data.

00:46:45 It’s too much glorified.

00:46:47 It doesn’t feel scientific to me.

00:46:48 So I want to know how many people are really sleeping

00:46:52 in Teslas versus sleeping in regular cars.

00:46:54 I was driving here sleep deprived in a car

00:46:58 with no automation.

00:46:59 I was falling asleep.

00:47:00 I agree that it’s hypey.

00:47:02 It’s just like, you know what?

00:47:04 If you want to put driver monitoring,

00:47:06 I rented a, my last autopilot experience

00:47:08 was I rented a Model 3 in March and drove it around.

00:47:12 The wheel thing is annoying.

00:47:13 And the reason the wheel thing is annoying,

00:47:15 we use the wheel thing as well,

00:47:16 but we don’t disengage on wheel.

00:47:18 For Tesla, you have to touch the wheel just enough

00:47:21 to trigger the torque sensor, to tell it that you’re there,

00:47:25 but not enough as to disengage it,

00:47:28 so don’t use it for two things.

00:47:30 Don’t disengage on wheel.

00:47:31 You don’t have to.

00:47:32 That whole experience, wow, beautifully put.

00:47:35 All of those elements,

00:47:36 even if you don’t have driver monitoring,

00:47:38 that whole experience needs to be better.

00:47:41 Driver monitoring, I think would make,

00:47:43 I mean, I think Super Cruise is a better experience

00:47:46 once it’s engaged over autopilot.

00:47:48 I think Super Cruise’s transitions

00:47:50 to engagement and disengagement are significantly worse.

00:47:53 Yeah.

00:47:54 Well, there’s a tricky thing,

00:47:56 because if I were to criticize Super Cruise,

00:47:59 it’s that it’s a little too crude.

00:48:00 And I think like six seconds or something,

00:48:03 if you look off the road, it’ll start warning you.

00:48:05 It’s some ridiculously long period of time.

00:48:09 And just the way,

00:48:12 I think it’s basically, it’s a binary.

00:48:15 It should be adaptive.

00:48:17 Yeah, it needs to learn more about you.

00:48:19 It needs to communicate what it sees about you more.

00:48:24 Tesla shows what it sees about the external world.

00:48:27 It would be nice if Super Cruise would tell us

00:48:29 what it sees about the internal world.

00:48:30 It’s even worse than that.

00:48:31 You press the button to engage

00:48:33 and it just says Super Cruise unavailable.

00:48:35 Yeah. Why?

00:48:36 Why?

00:48:37 Yeah, that transparency is good.

00:48:41 We’ve renamed the driver monitoring packet to driver state.

00:48:45 Driver state.

00:48:46 We have car state packet, which has the state of the car.

00:48:48 And you have driver state packet,

00:48:49 which has the state of the driver.
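
As a rough illustration of that split, here is a minimal sketch of what two such packets might look like; the field names are illustrative assumptions, not openpilot’s actual schema:

```python
# A sketch of the two packets described above; field names are
# illustrative, not openpilot's actual schema.
from dataclasses import dataclass

@dataclass
class CarState:
    v_ego: float            # vehicle speed, m/s
    steering_angle: float   # degrees
    brake_pressed: bool
    gas_pressed: bool

@dataclass
class DriverState:
    face_detected: bool
    eyes_on_road: bool
    distracted_prob: float  # model output in [0, 1]

def monitoring_alert(car: CarState, driver: DriverState) -> bool:
    """Warn when the driver looks distracted while the car is moving."""
    return car.v_ego > 5.0 and (not driver.eyes_on_road or driver.distracted_prob > 0.8)

car = CarState(v_ego=30.0, steering_angle=1.2, brake_pressed=False, gas_pressed=False)
drv = DriverState(face_detected=True, eyes_on_road=False, distracted_prob=0.9)
print(monitoring_alert(car, drv))  # True: moving, eyes off the road
```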

00:48:50 So what is the…

00:48:52 Estimate their BAC.

00:48:53 What’s BAC?

00:48:54 Blood alcohol content.

00:48:57 You think that’s possible with computer vision?

00:48:59 Absolutely.

00:49:03 To me, it’s an open question.

00:49:04 I haven’t looked into it too much.

00:49:06 Actually, I quite seriously looked at the literature.

00:49:08 It’s not obvious to me that from the eyes and so on,

00:49:10 you can tell.

00:49:11 You might need stuff from the car as well.

00:49:13 Yeah.

00:49:13 You might need how they’re controlling the car, right?

00:49:15 And that’s fundamentally at the end of the day,

00:49:17 what you care about.

00:49:18 But I think, especially when people are really drunk,

00:49:21 they’re not controlling the car nearly as smoothly

00:49:23 as they would, look at them walking, right?

00:49:25 The car is like an extension of the body.

00:49:27 So I think you could totally detect.

00:49:29 And if you could fix people who are drunk, distracted,

00:49:31 asleep, if you fix those three.

00:49:32 Yeah, that’s huge.

00:49:35 So what are the current limitations of open pilot?

00:49:38 What are the main problems that still need to be solved?

00:49:41 We’re hopefully fixing a few of them in 0.6.

00:49:45 We’re not as good as autopilot at stopped cars.

00:49:49 So if you’re coming up to a red light at 55,

00:49:55 so it’s the radar stopped car problem, which

00:49:57 is responsible for two autopilot accidents,

00:49:59 it’s hard to differentiate a stopped car from a signpost.

00:50:03 Yeah, a static object.

00:50:05 So you have to fuse.

00:50:06 You have to do this visually.

00:50:07 There’s no way from the radar data to tell the difference.
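
To make the radar limitation concrete, here is a minimal sketch with illustrative numbers: a stationary object always returns a relative speed of exactly minus your own speed, so a classic radar-only filter that drops stationary returns also drops the stopped car:

```python
# A stationary object always returns a relative speed of exactly -v_ego,
# so a signpost and a stopped car look identical to radar. Classic ACC
# logic drops stationary returns to avoid braking for signs and bridges,
# which drops the stopped car too. Numbers are illustrative.
V_EGO = 24.6  # our speed in m/s, about 55 mph

radar_returns = [
    {"name": "stopped car", "range_m": 80.0, "rel_speed": -24.6},
    {"name": "signpost",    "range_m": 60.0, "rel_speed": -24.6},
    {"name": "slow lead",   "range_m": 45.0, "rel_speed": -3.0},
]

def is_stationary(ret, v_ego, tol=0.5):
    return abs(ret["rel_speed"] + v_ego) < tol

tracked = [r["name"] for r in radar_returns if not is_stationary(r, V_EGO)]
print(tracked)  # ['slow lead']: the stopped car was filtered out with the signpost,
                # so vision has to decide which stationary returns are actually cars
```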

00:50:09 Maybe you can make a map, but I don’t really

00:50:11 believe in mapping at all anymore.

00:50:13 Wait, wait, wait, what, you don’t believe in mapping?

00:50:16 No.

00:50:16 So you basically, the open pilot solution

00:50:20 is saying react to the environment as you see it,

00:50:22 just like human beings do.

00:50:24 And then eventually, when you want

00:50:25 to do navigate on open pilot, I’ll

00:50:28 train the net to look at Waze.

00:50:29 I’ll run Waze in the background, I’ll

00:50:31 train on it.

00:50:32 Are you using GPS at all?

00:50:34 We use it to ground truth.

00:50:35 We use it to very carefully ground truth the paths.

00:50:38 We have a stack which can recover relative position to 10

00:50:40 centimeters over one minute.

00:50:42 And then we use that to ground truth exactly where

00:50:45 the car went in that local part of the environment,

00:50:47 but it’s all local.
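
A minimal sketch of that kind of local ground truthing, under simplifying assumptions (2D poses, a linear drift correction, made-up numbers): integrate the accurate relative-pose stream for the path’s shape, then pin the segment down with absolute fixes at the ends:

```python
# Integrate an accurate relative-pose stream for the path's shape, then
# pin the segment down with absolute fixes at the ends. The linear drift
# model, 2D poses, and numbers are simplifying assumptions.
import numpy as np

def ground_truth_path(rel_steps, fix_start, fix_end):
    """rel_steps: (N, 2) per-frame displacements from the local stack.
    fix_start, fix_end: (2,) absolute positions for the segment ends."""
    path = np.vstack([[0.0, 0.0], np.cumsum(rel_steps, axis=0)]) + fix_start
    error = fix_end - path[-1]                       # accumulated drift
    alphas = np.linspace(0.0, 1.0, len(path))[:, None]
    return path + alphas * error                     # spread it along the minute

rel = np.tile([[1.0, 0.01]], (60, 1))                # 1 m/frame, drifting sideways
truth = ground_truth_path(rel, np.zeros(2), np.array([60.0, 0.0]))
print(truth[-1])  # [60. 0.]: ends exactly on the absolute fix
```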

00:50:48 How are you testing in general, just for yourself,

00:50:50 like experiments and stuff?

00:50:53 Where are you located?

00:50:54 San Diego.

00:50:55 San Diego.

00:50:56 Yeah.

00:50:56 OK.

00:50:58 So you basically drive around there, collect some data,

00:51:01 and watch the performance?

00:51:03 We have a simulator now.

00:51:04 And we have, our simulator is really cool.

00:51:06 Our simulator is not, it’s not like a Unity based simulator.

00:51:09 Our simulator lets us load in real state.

00:51:12 What do you mean?

00:51:13 We can load in a drive and simulate

00:51:16 what the system would have done on the historical data.

00:51:20 Ooh, nice.

00:51:22 Interesting.

00:51:23 So what, yeah.

00:51:24 Right now we’re only using it for testing,

00:51:26 but as soon as we start using it for training, that’s it.

00:51:29 That’s all that matters.
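
A minimal sketch of that kind of replay testing; the log format and model are stand-ins, not comma’s actual tooling:

```python
# Feed a logged drive back through the current model and diff its output
# against what was recorded. No graphics engine; real data in, metrics out.
# The log format and `model` are stand-ins, not comma's actual tooling.
def replay(log, model, tol_deg=0.5):
    regressions = []
    for frame in log:                        # each frame: input + recorded output
        new_steer = model(frame["camera"])
        diff = abs(new_steer - frame["recorded_steer"])
        if diff > tol_deg:
            regressions.append((frame["t"], round(diff, 3)))
    return regressions

fake_log = [{"t": t, "camera": t, "recorded_steer": 0.1 * t} for t in range(5)]
new_model = lambda cam: 0.1 * cam + 0.05     # a slightly different policy
print(replay(fake_log, new_model))           # []: within tolerance everywhere
```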

00:51:30 What’s your feeling about the real world versus simulation?

00:51:33 Do you like simulation for training,

00:51:34 if this moves to training?

00:51:35 So we have to distinguish two types of simulators, right?

00:51:40 There’s a simulator that is completely fake.

00:51:44 I could get my car to drive around in GTA.

00:51:47 I feel that this kind of simulator is useless.

00:51:51 You’re never, there’s so many.

00:51:54 My analogy here is like, OK, fine.

00:51:56 You’re not solving the computer vision problem,

00:51:59 but you’re solving the computer graphics problem.

00:52:02 Right.

00:52:02 And you don’t think you can get very far by creating

00:52:05 ultra realistic graphics?

00:52:07 No, because you can create ultra realistic graphics

00:52:10 of the road, now create ultra realistic behavioral models

00:52:13 of the other cars.

00:52:14 Oh, well, I’ll just use myself driving.

00:52:16 No, you won’t.

00:52:18 You need actual human behavior, because that’s

00:52:22 what you’re trying to learn.

00:52:23 Driving does not have a spec.

00:52:25 The definition of driving is what humans do when they drive.

00:52:29 Whatever Waymo does, I don’t think it’s driving.

00:52:32 Right.

00:52:33 Well, I think actually Waymo and others,

00:52:36 if there’s any use for reinforcement learning,

00:52:38 I’ve seen it used quite well.

00:52:40 I study pedestrians a lot, too, is

00:52:42 to try to train models from real data of how pedestrians move,

00:52:45 and try to use reinforcement learning models to make

00:52:47 pedestrians move in human like ways.

00:52:49 By that point, you’ve already gone so many layers,

00:52:53 you detected a pedestrian?

00:52:55 Did you hand code the feature vector of their state?

00:53:00 Did you guys learn anything from computer vision

00:53:02 before deep learning?

00:53:04 Well, OK, I feel like this is.

00:53:07 So perception to you is the sticking point.

00:53:10 I mean, what’s the hardest part of the stack here?

00:53:13 There is no human understandable feature vector separating

00:53:20 perception and planning.

00:53:23 That’s the best way I can put that.

00:53:25 There is no, so it’s all together,

00:53:26 and it’s a joint problem.

00:53:29 So you can take localization.

00:53:31 Localization and planning, there is

00:53:33 a human understandable feature vector between these two

00:53:35 things.

00:53:35 I mean, OK, so I have like three degrees position,

00:53:38 three degrees orientation, and those derivatives,

00:53:40 maybe those second derivatives.

00:53:41 That’s human understandable.

00:53:43 That’s physical.

00:53:45 Between perception and planning, so like Waymo

00:53:50 has a perception stack and then a planner.

00:53:53 And one of the things Waymo does right

00:53:55 is they have a simulator that can separate those two.

00:54:00 They can like replay their perception data

00:54:02 and test their system, which is what

00:54:04 I’m talking about with the two

00:54:05 different kinds of simulators.

00:54:06 There’s the kind that can work on real data,

00:54:08 and there’s the kind that can’t work on real data.

00:54:10 Now, the problem is that I don’t think you can hand code

00:54:14 a feature vector, right?

00:54:16 Like you have some list of like, oh, here’s

00:54:17 my list of cars in the scenes.

00:54:19 Here’s my list of pedestrians in the scene.

00:54:21 This isn’t what humans are doing.

00:54:23 What are humans doing?

00:54:24 Global.

00:54:27 And you’re saying that’s too difficult to hand engineer.

00:54:31 I’m saying that there is no state vector given a perfect.

00:54:35 I could give you the best team of engineers in the world

00:54:37 to build a perception system and the best team

00:54:39 to build a planner.

00:54:40 All you have to do is define the state vector

00:54:42 that separates those two.

00:54:43 I’m missing the state vector that separates those two.

00:54:48 What do you mean?

00:54:49 So what is the output of your perception system?

00:54:53 Output of the perception system, it’s, OK, well,

00:55:00 there’s several ways to do it.

00:55:01 One is the SLAM components localization.

00:55:03 The other is drivable area, drivable space.

00:55:05 Drivable space, yeah.

00:55:06 And then there’s the different objects in the scene.

00:55:10 And different objects in the scene over time,

00:55:15 maybe, to give you input to then try

00:55:17 to start modeling the trajectories of those objects.

00:55:21 Sure.

00:55:22 That’s it.

00:55:22 I can give you a concrete example

00:55:24 of something you missed.

00:55:25 What’s that?

00:55:25 So say there’s a bush in the scene.

00:55:28 Humans understand that when they see this bush

00:55:30 that there may or may not be a car behind that bush.

00:55:34 Drivable area and a list of objects does not include that.

00:55:37 Humans are doing this constantly at the simplest intersections.

00:55:40 So now you have to talk about occluded area.

00:55:44 But even that, what do you mean by occluded?

00:55:47 OK, so I can’t see it.

00:55:49 Well, if it’s the other side of a house, I don’t care.

00:55:51 What’s the likelihood that there’s

00:55:53 a car in that occluded area?

00:55:55 And if you say, OK, we’ll add that,

00:55:57 I can come up with 10 more examples that you can’t add.

00:56:01 Certainly, occluded area would be something

00:56:03 that the simulator would have because it’s

00:56:05 simulating the entire scene, and occlusion is part of it.

00:56:11 Occlusion is part of a vision stack.

00:56:12 But what I’m saying is if you have a hand engineered,

00:56:16 if your perception system output can

00:56:19 be written in a spec document, it is incomplete.

00:56:22 Yeah, I mean, certainly, it’s hard to argue with that

00:56:27 because in the end, that’s going to be true.

00:56:30 Yeah, and I’ll tell you what the output of our perception

00:56:32 system is.

00:56:32 What’s that?

00:56:33 It’s a 1,024 dimensional vector, trained by a neural net.

00:56:37 Oh, you know that.

00:56:38 No, it’s 1,024 dimensions of who knows what.

00:56:43 Because it’s operating on real data.

00:56:45 Yeah.

00:56:46 And that’s the perception.

00:56:48 That’s the perception state.

00:56:50 Think about an autoencoder for faces.

00:56:53 If you have an autoencoder for faces and you say

00:56:56 it has 256 dimensions in the middle,

00:56:59 and I’m taking a face over here and projecting it

00:57:01 to a face over here.

00:57:02 Can you hand label all 256 of those dimensions?
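
Here is that analogy as a minimal sketch (untrained, with illustrative sizes): the 256-dimensional bottleneck works as an interface, but its individual dimensions have no hand-writable meaning:

```python
# An (untrained) autoencoder with a 256-dim bottleneck. The vector z is a
# working interface between encoder and decoder, but no spec document can
# say what z[0, 137] means. Sizes are illustrative.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 1024), nn.ReLU(),
                        nn.Linear(1024, 256))        # face -> 256 numbers
decoder = nn.Sequential(nn.Linear(256, 1024), nn.ReLU(),
                        nn.Linear(1024, 64 * 64))    # 256 numbers -> face

face = torch.rand(1, 64, 64)
z = encoder(face)        # the "feature vector" between the two halves
recon = decoder(z)
print(z.shape)           # torch.Size([1, 256]), but what is dimension 137?
```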

00:57:06 Well, no, but those have to be generated automatically.

00:57:09 But even if you tried to do it by hand,

00:57:11 could you come up with a spec between your encoder

00:57:15 and your decoder?

00:57:17 No, because it wasn’t designed, but there.

00:57:20 No, no, no, but if you could design it.

00:57:23 If you could design a face reconstructor system,

00:57:26 could you come up with a spec?

00:57:29 No, but I think we’re missing here a little bit.

00:57:32 I think you’re just being very poetic about expressing

00:57:35 a fundamental problem of simulators,

00:57:38 that they’re going to be missing so much that the feature

00:57:44 vector will just look fundamentally different

00:57:47 in the simulated world than the real world.

00:57:51 I’m not making a claim about simulators.

00:57:53 I’m making a claim about the spec division

00:57:57 between perception and planning, even in your system.

00:58:00 Just in general.

00:58:01 Just in general.

00:58:03 If you’re trying to build a car that drives,

00:58:05 if you’re trying to hand code the output of your perception

00:58:08 system, like saying, here’s a list of all the cars

00:58:10 in the scene, here’s a list of all the people,

00:58:11 here’s a list of the occluded areas,

00:58:13 here’s a vector of drivable areas, it’s insufficient.

00:58:16 And if you start to believe that,

00:58:17 you realize that what Waymo and Cruise are doing is impossible.

00:58:20 Currently, what we’re doing is the perception problem

00:58:24 is converting the scene into a chessboard.

00:58:29 And then you reason some basic reasoning

00:58:31 around that chessboard.

00:58:33 And you’re saying that really, there’s a lot missing there.

00:58:38 First of all, why are we talking about this?

00:58:40 Because isn’t this full autonomy?

00:58:42 Is this something you think about?

00:58:44 Oh, I want to win self driving cars.

00:58:47 So your definition of win includes?

00:58:51 Level four or five.

00:58:53 Level five.

00:58:53 I don’t think level four is a real thing.

00:58:55 I want to build the AlphaGo of driving.

00:59:01 So AlphaGo is really end to end.

00:59:06 Yeah.

00:59:06 Is, yeah, it’s end to end.

00:59:09 And do you think this whole problem,

00:59:12 is that also kind of what you’re getting at

00:59:14 with the perception and the planning?

00:59:16 Is that this whole problem, the right way to do it

00:59:19 is really to learn the entire thing.

00:59:21 I’ll argue that not only is it the right way,

00:59:23 it’s the only way that’s going to exceed human performance.

00:59:27 Well.

00:59:28 It’s certainly true for Go.

00:59:29 Everyone who tried to hand code Go things

00:59:31 built human inferior things.

00:59:33 And then someone came along and wrote some 10,000 line thing

00:59:36 that doesn’t know anything about Go that beat everybody.

00:59:39 It’s 10,000 lines.

00:59:41 True, in that sense, the open question then

00:59:44 that maybe I can ask you is driving is much harder than Go.

00:59:53 The open question is how much harder?

00:59:56 So how, because I think the Elon Musk approach here

00:59:59 with planning and perception is similar

01:00:01 to what you’re describing,

01:00:02 which is really turning into not some kind of modular thing,

01:00:08 but really do formulate it as a learning problem

01:00:11 and solve the learning problem with scale.

01:00:13 So how many years, part one is how many years

01:00:17 would it take to solve this problem

01:00:18 or just how hard is this freaking problem?

01:00:21 Well, the cool thing is I think there’s a lot of value

01:00:27 that we can deliver along the way.

01:00:30 I think that you can build lane keeping assist actually

01:00:37 plus adaptive cruise control, plus, okay, looking at Waze,

01:00:42 extends to like all of driving.

01:00:45 Yeah, most of driving, right?

01:00:47 Oh, your adaptive cruise control treats red lights

01:00:49 like cars, okay.

01:00:51 So let’s jump around.

01:00:52 You mentioned that you didn’t like Navigate on Autopilot.

01:00:55 What advice, how would you make it better?

01:00:57 Do you think as a feature that if it’s done really well,

01:01:00 it’s a good feature?

01:01:02 I think that it’s too reliant on like hand coded hacks

01:01:07 for like, how does Navigate on Autopilot do a lane change?

01:01:10 It actually does the same lane change every time

01:01:13 and it feels mechanical.

01:01:14 Humans do different lane changes.

01:01:15 Humans sometime will do a slow one,

01:01:17 sometimes do a fast one.

01:01:18 Navigate on Autopilot, at least every time I use it,

01:01:20 it is the identical lane change.

01:01:22 How do you learn?

01:01:24 I mean, this is a fundamental thing actually

01:01:26 is the braking and then accelerating

01:01:30 something that’s still, Tesla probably does it better

01:01:33 than most cars, but it still doesn’t do a great job

01:01:36 of creating a comfortable natural experience.

01:01:39 And Navigate on Autopilot is just lane changes

01:01:42 and extension of that.

01:01:44 So how do you learn to do a natural lane change?

01:01:49 So we have it and I can talk about how it works.

01:01:52 So I feel that we have the solution for lateral.

01:01:58 We don’t yet have the solution for longitudinal.

01:02:00 There’s a few reasons longitudinal is harder than lateral.

01:02:03 The lane change component,

01:02:05 the way that we train on it very simply

01:02:08 is like our model has an input

01:02:10 for whether it’s doing a lane change or not.

01:02:14 And then when we train the end to end model,

01:02:16 we hand label all the lane changes,

01:02:18 cause you have to.

01:02:19 I’ve struggled a long time about not wanting to do that,

01:02:22 but I think you have to.

01:02:24 Or the training data.

01:02:25 For the training data, right?

01:02:26 Oh, we actually, we have an automatic ground truther

01:02:28 which automatically labels all the lane changes.

01:02:30 Was that possible?

01:02:31 To automatically label the lane changes?

01:02:32 Yeah.

01:02:33 Yeah, detect the lane lines, see when it crosses them, right?

01:02:34 And I don’t have to get that high percent accuracy,

01:02:36 but it’s like 95%, good enough.
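
A minimal sketch of such an automatic ground truther, with illustrative numbers: watch the car’s lateral offset from the lane center and mark frames around the moment it crosses a lane line:

```python
# Watch the lateral offset from the original lane center and mark frames
# around each lane-line crossing. Lane width and offsets are illustrative;
# this only needs to be roughly right to produce training labels.
LANE_WIDTH = 3.7  # meters, a typical US highway lane

def label_lane_changes(offsets, pad=2):
    line = LANE_WIDTH / 2
    crossings = [i for i in range(1, len(offsets))
                 if (abs(offsets[i - 1]) < line) != (abs(offsets[i]) < line)]
    labels = [0] * len(offsets)
    for c in crossings:                      # pad so labels cover the maneuver
        for j in range(max(0, c - pad), min(len(offsets), c + pad + 1)):
            labels[j] = 1
    return labels

offsets = [0.0, 0.3, 1.0, 1.9, 2.8, 3.5, 3.7, 3.7]   # drifting into the next lane
print(label_lane_changes(offsets))                    # [0, 1, 1, 1, 1, 1, 0, 0]
```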

01:02:38 Now I set the bit when it’s doing the lane change

01:02:43 in the end to end learning.

01:02:44 And then I set it to zero when it’s not doing a lane change.

01:02:47 So now if I wanted to do a lane change at test time,

01:02:49 I just put the bit to a one and it’ll do a lane change.
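
And a minimal sketch of the bit itself, with an illustrative architecture and sizes, not comma’s actual model: the labeled bit is just one extra input to the end to end policy, so flipping it at test time requests the maneuver:

```python
# The labeled bit is one extra input to the end-to-end policy, so setting
# it to 1 at test time requests the maneuver. Architecture and sizes are
# illustrative, not comma's actual model.
import torch
import torch.nn as nn

class Policy(nn.Module):
    def __init__(self, feat_dim=1024):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(feat_dim + 1, 256), nn.ReLU(),
                                  nn.Linear(256, 1))  # -> steering command

    def forward(self, features, lane_change_bit):
        return self.head(torch.cat([features, lane_change_bit], dim=1))

policy = Policy()
feats = torch.randn(1, 1024)                    # perception features
keep = policy(feats, torch.zeros(1, 1))         # training: bit from the labels
change = policy(feats, torch.ones(1, 1))        # test time: set the bit to 1
print(keep.item(), change.item())               # same scene, different intent
```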

01:02:52 Yeah, but so if you look at the space of lane change,

01:02:54 you know, some percentage, not a hundred percent

01:02:57 that we make as humans is not a pleasant experience

01:03:01 cause we messed some part of it up.

01:03:02 It’s nerve wracking to change lanes. You have to look,

01:03:04 you have to see, you have to accelerate.

01:03:06 How do we label the ones that are natural and feel good?

01:03:09 You know, that’s the, cause that’s your ultimate criticism.

01:03:13 The current Navigate on Autopilot

01:03:15 just doesn’t feel good.

01:03:16 Well, the current Navigate on Autopilot

01:03:18 is a hand coded policy written by an engineer in a room

01:03:21 who probably went out and tested it a few times on the 280.

01:03:25 Probably a more, a better version of that, but yes.

01:03:29 That’s how we would have written it at Comma AI.

01:03:31 Yeah, yeah, yeah.

01:03:31 Maybe Tesla did, Tesla, they tested it in the end.

01:03:33 That might’ve been two engineers.

01:03:35 Two engineers, yeah.

01:03:37 No, but so if you learn the lane change,

01:03:40 if you learn how to do a lane change from data,

01:03:42 just like you have a label that says lane change

01:03:44 and then you put it in when you want it

01:03:46 to do the lane change,

01:03:48 it’ll automatically do the lane change

01:03:49 that’s appropriate for the situation.

01:03:51 Now, to get at the problem of some humans

01:03:54 do bad lane changes,

01:03:57 we haven’t worked too much on this problem yet.

01:03:59 It’s not that much of a problem in practice.

01:04:03 My theory is that all good drivers are good in the same way

01:04:06 and all bad drivers are bad in different ways.

01:04:09 And we’ve seen some data to back this up.

01:04:11 Well, beautifully put.

01:04:12 So you just basically, if that’s true hypothesis,

01:04:16 then your task is to discover the good drivers.

01:04:19 The good drivers stand out because they’re in one cluster

01:04:23 and the bad drivers are scattered all over the place

01:04:25 and your net learns the cluster.

01:04:27 Yeah, that’s, so you just learn from the good drivers

01:04:30 and they’re easy to cluster.

01:04:33 In fact, we learned from all of them

01:04:33 and the net automatically learns the policy

01:04:35 that’s like the majority,

01:04:36 but we’ll eventually probably have to filter them out.
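
A minimal sketch of what that filtering could look like, with made-up features and thresholds: embed each lane change, find the dense cluster, and drop the scattered tail:

```python
# Embed each lane change as a feature vector, find the dense cluster, and
# drop the scattered tail. Features (duration, peak lateral acceleration)
# and the threshold are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
good = rng.normal([3.0, 0.2], 0.1, size=(95, 2))          # tight cluster
bad = rng.uniform([0.5, 0.0], [8.0, 1.5], size=(5, 2))    # scattered
maneuvers = np.vstack([good, bad])

center = np.median(maneuvers, axis=0)
dist = np.linalg.norm(maneuvers - center, axis=1)
keep = dist < 3 * np.median(dist)                         # drop the outliers
print(f"kept {keep.sum()} of {len(maneuvers)} lane changes for training")
```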

01:04:38 If that theory is true, I hope it’s true

01:04:41 because the counter theory is there is many clusters,

01:04:49 maybe arbitrarily many clusters of good drivers.

01:04:53 Because if there’s one cluster of good drivers,

01:04:55 you can at least discover a set of policies.

01:04:57 You can learn a set of policies,

01:04:58 which would be good universally.

01:05:00 Yeah.

01:05:01 That would be a nice, that would be nice if it’s true.

01:05:04 And you’re saying that there is some evidence that.

01:05:06 Let’s say lane changes can be clustered into four clusters.

01:05:09 Right. Right.

01:05:10 There’s this finite level of.

01:05:12 I would argue that all four of those are good clusters.

01:05:15 All the things that are random are noise and probably bad.

01:05:18 And which one of the four you pick,

01:05:20 or maybe it’s 10 or maybe it’s 20.

01:05:21 You can learn that.

01:05:22 It’s context dependent.

01:05:23 It depends on the scene.

01:05:24 And the hope is it’s not too dependent on the driver.

01:05:31 Yeah. The hope is that it all washes out.

01:05:34 The hope is that there’s, that the distribution’s not bimodal.

01:05:36 The hope is that it’s a nice Gaussian.

01:05:39 So what advice would you give to Tesla,

01:05:41 how to fix, how to improve navigating autopilot?

01:05:44 What are the lessons that you’ve learned from Comma AI?

01:05:48 The only real advice I would give to Tesla

01:05:50 is please put driver monitoring in your cars.

01:05:52 With respect to improving it?

01:05:55 You can’t do that anymore.

01:05:55 I decided to interrupt, but you know,

01:05:58 there’s a practical nature of many of hundreds of thousands

01:06:01 of cars being produced that don’t have

01:06:04 a good driver facing camera.

01:06:05 The Model 3 has a selfie cam.

01:06:07 Is it not good enough?

01:06:08 Did they not put IR LEDs for night?

01:06:10 That’s a good question.

01:06:11 But I do know that it’s fisheye

01:06:13 and it’s relatively low resolution.

01:06:15 So it’s really not designed.

01:06:16 It wasn’t.

01:06:17 It wasn’t designed for driver monitoring.

01:06:18 You can hope that you can kind of scrape up

01:06:21 and have something from it.

01:06:24 Yeah.

01:06:25 But why didn’t they put it in today?

01:06:27 Put it in today.

01:06:28 Put it in today.

01:06:29 Every time I’ve heard Karpathy talk about the problem

01:06:31 and talking about like software 2.0

01:06:33 and how the machine learning is gobbling up everything,

01:06:35 I think this is absolutely the right strategy.

01:06:37 I think that he didn’t write Navigate on Autopilot.

01:06:40 I think somebody else did

01:06:41 and kind of hacked it on top of that stuff.

01:06:43 I think when Karpathy says, wait a second,

01:06:45 why did we hand code this lane change policy

01:06:47 with all these magic numbers?

01:06:48 We’re gonna learn it from data.

01:06:49 They’ll fix it.

01:06:50 They already know what to do there.

01:06:51 Well, that’s Andrei’s job

01:06:53 is to turn everything into a learning problem

01:06:55 and collect a huge amount of data.

01:06:57 The reality is though,

01:06:59 not every problem can be turned into a learning problem

01:07:02 in the short term.

01:07:04 In the end, everything will be a learning problem.

01:07:07 The reality is like if you wanna build L5 vehicles today,

01:07:12 it will likely involve no learning.

01:07:15 And that’s the reality is,

01:07:17 so at which point does learning start?

01:07:20 It’s the crutch statement that LiDAR is a crutch.

01:07:23 At which point will learning

01:07:24 get up to par with human performance?

01:07:27 It’s over human performance on ImageNet

01:07:30 classification. On driving, it’s a question still.

01:07:34 It is a question.

01:07:35 I’ll say this, I’m here to play for 10 years.

01:07:39 I’m not here to try to,

01:07:40 I’m here to play for 10 years and make money along the way.

01:07:43 I’m not here to try to promise people

01:07:45 that I’m gonna have my L5 taxi network

01:07:47 up and working in two years.

01:07:48 Do you think that was a mistake?

01:07:49 Yes.

01:07:50 What do you think was the motivation behind saying that?

01:07:53 Other companies are also promising L5 vehicles

01:07:56 with very different approaches in 2020, 2021, 2022.

01:08:01 If anybody would like to bet me

01:08:03 that those things do not pan out, I will bet you.

01:08:06 Even money, even money, I’ll bet you as much as you want.

01:08:09 Yeah.

01:08:10 So are you worried about what’s going to happen?

01:08:13 Cause you’re not in full agreement on that.

01:08:16 What’s going to happen when 2022, 21 come around

01:08:19 and nobody has fleets of autonomous vehicles?

01:08:22 Well, you can look at the history.

01:08:25 If you go back five years ago,

01:08:26 they were all promised by 2018 and 2017.

01:08:29 But they weren’t that strong of promises.

01:08:32 I mean, Ford really declared pretty,

01:08:36 I think not many have declared as like definitively

01:08:40 as they have now these dates.

01:08:42 Well, okay, so let’s separate L4 and L5.

01:08:45 Do I think that it’s possible for Waymo to continue to kind

01:08:49 of like hack on their system

01:08:51 until it gets to level four in Chandler, Arizona?

01:08:53 Yes.

01:08:55 When there’s no safety driver?

01:08:56 Chandler, Arizona?

01:08:57 Yeah.

01:08:59 By, sorry, which year are we talking about?

01:09:02 Oh, I even think that’s possible by like 2020, 2021.

01:09:06 But level four, Chandler, Arizona,

01:09:08 not level five, New York City.

01:09:10 Level four, meaning some very defined streets,

01:09:15 it works out really well.

01:09:17 Very defined streets.

01:09:18 And then practically these streets are pretty empty.

01:09:20 If most of the streets are covered in Waymo’s,

01:09:24 Waymo can kind of change the definition of what driving is.

01:09:28 Right?

01:09:29 If your self driving network

01:09:30 is the majority of cars in an area,

01:09:33 they only need to be safe with respect to each other

01:09:35 and all the humans will need to learn to adapt to them.

01:09:38 Now go drive in downtown New York.

01:09:41 Well, yeah, that’s.

01:09:42 I mean, already you can talk about autonomy

01:09:44 and like on farms, it already works great

01:09:46 because you can really just follow the GPS line.

01:09:51 So what does success look like for common AI?

01:09:55 What are the milestones?

01:09:57 Like where you can sit back with some champagne

01:09:59 and say, we did it, boys and girls?

01:10:04 Well, it’s never over.

01:10:06 Yeah, but.

01:10:07 You must drink champagne and celebrate.

01:10:10 So what is a good, what are some wins?

01:10:13 A big milestone that we’re hoping for

01:10:17 by mid next year is profitability of the company.

01:10:23 And we’re gonna have to revisit the idea

01:10:27 of selling a consumer product,

01:10:30 but it’s not gonna be like the comma one.

01:10:32 When we do it, it’s gonna be perfect.

01:10:35 Open pilot has gotten so much better in the last two years.

01:10:39 We’re gonna have a few features.

01:10:41 We’re gonna have a hundred percent driver monitoring.

01:10:43 We’re gonna disable no safety features in the car.

01:10:47 Actually, I think it’d be really cool

01:10:48 what we’re doing right now.

01:10:49 Our project this week is we’re analyzing the data set

01:10:51 and looking for all the AEB triggers

01:10:53 from the manufacturer systems.

01:10:55 We have a better data set on that than the manufacturers.

01:10:59 How much, just how many,

01:11:00 does Toyota have 10 million miles of real world driving

01:11:03 to know how many times their AEB triggered?

01:11:05 So let me give you, cause you asked, right?

01:11:08 Financial advice.

01:11:09 Yeah.

01:11:10 Cause I work with a lot of automakers

01:11:12 and one possible source of money for you,

01:11:15 which I’ll be excited to see you take on

01:11:18 is basically selling the data.

01:11:24 So, which is something that most people,

01:11:29 and not selling in a way where here, here at Automaker,

01:11:31 but creating, we’ve done this actually at MIT,

01:11:34 not for money purposes,

01:11:35 but you could do it for significant money purposes

01:11:37 and make the world a better place by creating a consortia

01:11:41 where automakers would pay in

01:11:44 and then they get to have free access to the data.

01:11:46 And I think a lot of people are really hungry for that

01:11:52 and would pay significant amount of money for it.

01:11:54 Here’s the problem with that.

01:11:55 I like this idea all in theory.

01:11:56 It’d be very easy for me to give them access to my servers

01:11:59 and we already have all open source tools

01:12:01 to access this data.

01:12:02 It’s in a great format.

01:12:03 We have a great pipeline,

01:12:05 but they’re gonna put me in the room

01:12:07 with some business development guy.

01:12:10 And I’m gonna have to talk to this guy

01:12:12 and he’s not gonna know most of the words I’m saying.

01:12:15 I’m not willing to tolerate that.

01:12:17 Okay, Mick Jagger.

01:12:18 No, no, no, no, no.

01:12:19 I think I agree with you.

01:12:21 I’m the same way, but you just tell them the terms

01:12:23 and there’s no discussion needed.

01:12:24 If I could just tell them the terms,

01:12:28 Yeah.

01:12:28 and like, all right, who wants access to my data?

01:12:31 I will sell it to you for, let’s say,

01:12:36 you want a subscription?

01:12:37 I’ll sell to you for 100K a month.

01:12:40 Anyone.

01:12:41 100K a month.

01:12:42 100K a month.

01:12:43 I’ll give you access to this data subscription.

01:12:45 Yeah.

01:12:46 Yeah, I think that’s kind of fair.

01:12:46 Came up with that number off the top of my head.

01:12:48 If somebody sends me like a three line email

01:12:50 where it’s like, we would like to pay 100K a month

01:12:52 to get access to your data.

01:12:54 We would agree to like reasonable privacy terms

01:12:56 of the people who are in the data set.

01:12:58 I would be happy to do it,

01:12:59 but that’s not going to be the email.

01:13:01 The email is going to be, hey,

01:13:02 do you have some time in the next month

01:13:04 where we can sit down and we can,

01:13:06 I don’t have time for that.

01:13:06 We’re moving too fast.

01:13:07 Yeah.

01:13:08 You could politely respond to that email,

01:13:10 but not saying, I don’t have any time for your bullshit.

01:13:13 You say, oh, well, unfortunately these are the terms.

01:13:15 And so this is, we try to,

01:13:17 we brought the cost down for you

01:13:19 in order to minimize the friction and communication.

01:13:22 Absolutely.

01:13:23 Here’s the, whatever it is,

01:13:24 one, two million dollars a year and you have access.

01:13:28 And it’s not like I get that email from like,

01:13:31 but okay, am I going to reach out?

01:13:32 Am I going to hire a business development person

01:13:34 who’s going to reach out to the automakers?

01:13:35 No way.

01:13:36 Yeah. Okay.

01:13:37 I got you.

01:13:38 If they reached into me, I’m not going to ignore the email.

01:13:40 I’ll come back with something like,

01:13:41 yeah, if you’re willing to pay 100K a month

01:13:43 for access to the data, I’m happy to set that up.

01:13:46 That’s worth my engineering time.

01:13:48 That’s actually quite insightful of you.

01:13:49 You’re right.

01:13:50 Probably because many of the automakers

01:13:52 are quite a bit old school,

01:13:54 there will be a need to reach out and they want it,

01:13:57 but there’ll need to be some communication.

01:13:59 You’re right.

01:14:00 Mobileye circa 2015 had the lowest R&D spend

01:14:04 of any chip maker,

01:14:08 and you look at all the people who work for them

01:14:10 and it’s all business development people

01:14:12 because the car companies are impossible to work with.

01:14:15 Yeah.

01:14:16 So you’re, you have no patience for that

01:14:17 and you’re, you’re legit Android, huh?

01:14:20 I have something to do, right?

01:14:21 Like, it’s not that,

01:14:22 I don’t mean to be a dick

01:14:23 and say I don’t have patience for that,

01:14:25 but it’s like that stuff doesn’t help us

01:14:28 with our goal of winning self driving cars.

01:14:30 If I want money in the short term,

01:14:33 if I showed off like the actual,

01:14:36 like the learning tech that we have,

01:14:38 it’s, it’s somewhat sad.

01:14:39 Like it’s years and years ahead of everybody else’s.

01:14:42 Not to, maybe not Tesla’s.

01:14:43 I think Tesla has some more stuff than us actually.

01:14:45 Yeah.

01:14:46 I think Tesla has similar stuff,

01:14:46 but when you compare it to like

01:14:47 what the Toyota Research Institute has,

01:14:50 you’re not even close to what we have.

01:14:53 No comments.

01:14:54 But I also can’t, I have to take your comments.

01:14:58 I intuitively believe you,

01:15:01 but I have to take it with a grain of salt

01:15:03 because I mean, you are an inspiration

01:15:06 because you basically don’t care about a lot of things

01:15:09 that other companies care about.

01:15:10 You don’t try to bullshit in a sense,

01:15:15 like make up stuff.

01:15:16 So to drive up valuation, you’re really very real

01:15:19 and you’re trying to solve the problem

01:15:20 and admire that a lot.

01:15:22 What I can’t necessarily fully trust you on,

01:15:25 with all due respect, is how good it is, right?

01:15:28 I can only, but I also know how bad others are.

01:15:32 And so.

01:15:33 I’ll say two things about, trust but verify, right?

01:15:36 I’ll say two things about that.

01:15:38 One is try, get in a 2020 Corolla

01:15:42 and try open pilot 0.6 when it comes out next month.

01:15:46 I think already you’ll look at this

01:15:48 and you’ll be like, this is already really good.

01:15:51 And then I could be doing that all with hand labelers

01:15:54 and all with like the same approach that Mobileye uses.

01:15:57 When we release a model that no longer has the lanes in it,

01:16:01 that only outputs a path,

01:16:04 then think about how we did that machine learning

01:16:08 and then right away when you see,

01:16:10 and that’s gonna be in open pilot,

01:16:11 that’s gonna be in open pilot before 1.0.

01:16:13 When you see that model,

01:16:14 you’ll know that everything I’m saying is true

01:16:15 because how else did I get that model?

01:16:16 Good.

01:16:17 You know what I’m saying is true about the simulator.

01:16:19 Yeah, yeah, this is super exciting, that’s super exciting.

01:16:22 But like, you know, I listened to your talk with Kyle

01:16:25 and Kyle was originally building the aftermarket system

01:16:30 and he gave up on it because of technical challenges,

01:16:34 because of the fact that he’s gonna have to support

01:16:38 20 to 50 cars, we support 45,

01:16:40 because what is he gonna do

01:16:41 when the manufacturer ABS system triggers?

01:16:43 We have alerts and warnings to deal with all of that

01:16:45 and all the cars.

01:16:46 And how is he going to formally verify it?

01:16:48 Well, I got 10 million miles of data,

01:16:49 it’s probably better,

01:16:50 it’s probably better verified than the spec.

01:16:53 Yeah, I’m glad you’re here talking to me.

01:16:57 This is, I’ll remember this day,

01:17:00 because it’s interesting.

01:17:01 If you look at Kyle’s, from Cruise,

01:17:04 I’m sure they have a large number

01:17:05 of business development folks

01:17:07 and you work with, he’s working with GM,

01:17:10 you could work with Argo AI, working with Ford.

01:17:13 It’s interesting because chances that you fail,

01:17:17 business wise, like bankrupt, are pretty high.

01:17:20 Yeah.

01:17:21 And yet, it’s the Android model,

01:17:23 is you’re actually taking on the problem.

01:17:26 So that’s really inspiring, I mean.

01:17:28 Well, I have a long term way for Comma to make money too.

01:17:30 And one of the nice things

01:17:32 when you really take on the problem,

01:17:34 which is my hope for Autopilot, for example,

01:17:36 is things you don’t expect,

01:17:39 ways to make money or create value

01:17:41 that you don’t expect will pop up.

01:17:43 Oh, I’ve known how to do it since kind of,

01:17:46 2017 is the first time I said it.

01:17:48 Which part, to know how to do which part?

01:17:50 Our long term plan is to be a car insurance company.

01:17:52 Insurance, yeah, I love it, yep, yep.

01:17:55 I make driving twice as safe.

01:17:56 Not only that, I have the best data

01:17:57 set to know who statistically are the safest drivers.

01:17:59 And oh, oh, we see you, we see you driving unsafely,

01:18:03 we’re not gonna insure you.

01:18:05 And that causes a bifurcation in the market

01:18:08 because the only people who can’t get Comma insurance

01:18:10 are the bad drivers, Geico can insure them,

01:18:12 their premiums are crazy high,

01:18:13 our premiums are crazy low.

01:18:15 We’ll win car insurance, take over that whole market.

01:18:18 Okay, so.

01:18:19 If we win, if we win.

01:18:21 But that’s what I’m saying,

01:18:22 how do you turn Comma into a $10 billion company?

01:18:24 It’s that.

01:18:24 That’s right.

01:18:25 So you, Elon Musk, who else?

01:18:29 Who else is thinking like this and working like this

01:18:32 in your view?

01:18:33 Who are the competitors?

01:18:34 Are there people seriously,

01:18:36 I don’t think anyone that I’m aware of

01:18:38 is seriously taking on lane keeping,

01:18:42 like where it’s a huge business

01:18:45 that turns eventually into full autonomy

01:18:47 that then creates, yeah, like that creates other businesses

01:18:52 on top of it and so on.

01:18:53 Think insurance, think all kinds of ideas like that.

01:18:56 Do you know anyone else thinking like this?

01:19:00 Not really.

01:19:02 That’s interesting.

01:19:02 I mean, my sense is everybody turns to that

01:19:05 in like four or five years.

01:19:07 Like Ford, once the autonomy falls through.

01:19:10 Yeah.

01:19:11 But at this time.

01:19:12 Elon’s the iOS.

01:19:14 By the way, he paved the way for all of us.

01:19:16 It’s the iOS, true.

01:19:17 I would not be doing Comma AI today

01:19:20 if it was not for those conversations with Elon.

01:19:23 And if it were not for him saying like,

01:19:27 I think he said like,

01:19:27 well, obviously we’re not gonna use LiDAR,

01:19:29 we use cameras, humans use cameras.

01:19:31 So what do you think about that?

01:19:32 How important is LiDAR?

01:19:33 Everybody else on L5 is using LiDAR.

01:19:36 What are your thoughts on his provocative statement

01:19:39 that LiDAR is a crutch?

01:19:41 See, sometimes he’ll say dumb things,

01:19:43 like the driver monitoring thing,

01:19:44 but sometimes he’ll say absolutely, completely,

01:19:46 100% obviously true things.

01:19:48 Of course LiDAR is a crutch.

01:19:50 It’s not even a good crutch.

01:19:53 You’re not even using it.

01:19:53 Oh, they’re using it for localization.

01:19:56 Yeah.

01:19:57 Which isn’t good in the first place.

01:19:58 If you have to localize your car to centimeters

01:20:00 in order to drive, like that’s not driving.

01:20:04 They’re currently not doing much machine learning,

01:20:06 I thought, for LiDAR data.

01:20:07 Meaning like to help you in the task of,

01:20:11 general task of perception.

01:20:12 The main goal of those LiDARs on those cars

01:20:15 I think is actually localization more than perception.

01:20:18 Or at least that’s what they use them for.

01:20:20 Yeah, that’s true.

01:20:20 If you want to localize to centimeters,

01:20:22 you can’t use GPS.

01:20:23 The fanciest GPS in the world can’t do it.

01:20:25 Especially if you’re under tree cover and stuff.

01:20:26 With LiDAR you can do this pretty easily.

01:20:28 So you really, they’re not taking on,

01:20:30 I mean in some research they’re using it for perception,

01:20:33 but, and they’re certainly not, which is sad,

01:20:35 they’re not fusing it well with vision.

01:20:38 They do use it for perception.

01:20:40 I’m not saying they don’t use it for perception,

01:20:42 but the thing that, they have vision based

01:20:45 and radar based perception systems as well.

01:20:47 You could remove the LiDAR and keep around

01:20:51 a lot of the dynamic object perception.

01:20:54 You want to get centimeter accurate localization?

01:20:56 Good luck doing that with anything else.

01:20:59 So what should Cruise, Waymo do?

01:21:02 Like what would be your advice to them now?

01:21:06 I mean Waymo is actually,

01:21:08 they’re, I mean they’re doing, they’re serious.

01:21:11 Waymo, out of all of them, is quite

01:21:14 serious about the long game.

01:21:16 If L5 is a long way off, requires 50 years,

01:21:20 I think Waymo will be the only one left standing at the end

01:21:24 with the, given the financial backing that they have.

01:21:26 Beaucoup Google bucks.

01:21:28 I’ll say nice things about both Waymo and Cruise.

01:21:32 Let’s do it.

01:21:33 Nice is good.

01:21:35 Waymo is by far the furthest along with technology.

01:21:39 Waymo has a three to five year lead on all the competitors.

01:21:44 If that, if the Waymo looking stack works,

01:21:48 maybe three year lead.

01:21:49 If the Waymo looking stack works,

01:21:51 they have a three year lead.

01:21:52 Now I argue that Waymo has spent too much money

01:21:55 to recapitalize, to gain back their losses

01:21:59 in those three years.

01:22:00 Also self driving cars have no network effect like that.

01:22:03 Uber has a network effect.

01:22:04 You have a market, you have drivers and you have riders.

01:22:07 Self driving cars, you have capital and you have riders.

01:22:09 There’s no network effect.

01:22:11 If I want to blanket a new city in self driving cars,

01:22:13 I buy the off the shelf Chinese knockoff self driving cars

01:22:16 and I buy enough of them in the city.

01:22:17 I can’t do that with drivers.

01:22:18 And that’s why Uber has a first mover advantage

01:22:20 that no self driving car company will.

01:22:24 Can you disentangle that a little bit?

01:22:26 Uber, you’re not talking about Uber,

01:22:28 the autonomous vehicle Uber.

01:22:29 You’re talking about the Uber car, the, yeah.

01:22:31 I’m Uber.

01:22:32 I open for business in Austin, Texas, let’s say.

01:22:36 I need to attract both sides of the market.

01:22:38 I need to both get drivers on my platform

01:22:41 and riders on my platform.

01:22:42 And I need to keep them both sufficiently happy, right?

01:22:45 Riders aren’t gonna use it

01:22:46 if it takes more than five minutes for an Uber to show up.

01:22:49 Drivers aren’t gonna use it

01:22:50 if they have to sit around all day and there’s no riders.

01:22:52 So you have to carefully balance a market.

01:22:54 And whenever you have to carefully balance a market,

01:22:56 there’s a great first mover advantage

01:22:58 because there’s a switching cost for everybody, right?

01:23:01 The drivers and the riders

01:23:02 would have to switch at the same time.

01:23:04 Let’s even say that, you know, let’s say a Luber shows up

01:23:08 and Luber somehow, you know, agrees to do things

01:23:12 at a bigger, you know, we’re just gonna,

01:23:15 we’ve done it more efficiently, right?

01:23:17 Luber only takes 5% of a cut

01:23:19 instead of the 10% that Uber takes.

01:23:21 No one is gonna switch

01:23:22 because the switching cost is higher than that 5%.

01:23:25 So you actually can, in markets like that,

01:23:27 you have a first mover advantage.

01:23:28 Yeah.

01:23:30 Autonomous vehicles of the level five variety

01:23:32 have no first mover advantage.

01:23:34 If the technology becomes commoditized,

01:23:36 say I wanna go to a new city, look at the scooters.

01:23:39 It’s gonna look a lot more like scooters.

01:23:41 Every person with a checkbook

01:23:44 can blanket a city in scooters.

01:23:45 And that’s why you have 10 different scooter companies.

01:23:47 Which one’s gonna win?

01:23:48 It’s a race to the bottom.

01:23:49 It’s a terrible market to be in

01:23:51 because there’s no market for scooters.

01:23:55 And the scooters don’t get a say

01:23:56 in whether they wanna be bought and deployed to a city

01:23:58 or not. Right.

01:23:59 So the, yeah.

01:24:00 We’re gonna entice the scooters

01:24:01 with subsidies and deals and.

01:24:03 So whenever you have to invest that capital,

01:24:05 it doesn’t.

01:24:06 It doesn’t come back.

01:24:07 Yeah.

01:24:08 That can’t be your main criticism of the Waymo approach.

01:24:12 Oh, I’m saying even if it does technically work.

01:24:14 Even if it does technically work, that’s a problem.

01:24:17 Yeah.

01:24:18 I don’t know if I were to say,

01:24:21 I would say you’re already there.

01:24:23 I haven’t even thought about that,

01:24:24 but I would say the bigger challenge

01:24:26 is the technical approach.

01:24:28 The.

01:24:29 So Waymo’s, Cruise’s.

01:24:31 And not just the technical approach,

01:24:33 but of creating value.

01:24:34 I still don’t understand how you beat Uber,

01:24:40 the human driven cars.

01:24:43 In terms of financially,

01:24:44 it doesn’t make sense to me

01:24:47 that people wanna get in an autonomous vehicle.

01:24:50 I don’t understand how you make money.

01:24:52 In the longterm, yes.

01:24:54 Like real longterm.

01:24:56 But it just feels like there’s too much

01:24:58 capital investment needed.

01:24:59 Oh, and they’re gonna be worse than Ubers

01:25:01 because they’re gonna stop for every little thing,

01:25:04 everywhere.

01:25:06 I’ll say a nice thing about Cruise.

01:25:07 That was my nice thing about Waymo.

01:25:08 They’re three years ahead.

01:25:09 Wait, what was the nice?

01:25:10 Oh, because they’re three.

01:25:10 They’re three years technically ahead of everybody.

01:25:12 Their tech stack is great.

01:25:14 My nice thing about Cruise is GM buying them

01:25:17 was a great move for GM.

01:25:20 For $1 billion,

01:25:22 GM bought an insurance policy against Waymo.

01:25:25 They put, Cruise is three years behind Waymo.

01:25:30 That means Google will get a monopoly on the technology

01:25:33 for at most three years.

01:25:36 And if technology works,

01:25:38 so you might not even be right about the three years,

01:25:40 it might be less.

01:25:41 Might be less.

01:25:42 Cruise actually might not be that far behind.

01:25:44 I don’t know how much Waymo has waffled around

01:25:47 or how much of it actually is just that long tail.

01:25:49 Yeah, okay.

01:25:50 If that’s the best you could say in terms of nice things,

01:25:53 that’s more of a nice thing for GM

01:25:55 that that’s the smart insurance policy.

01:25:58 It’s a smart insurance policy.

01:25:59 I mean, I think that’s how,

01:26:01 I can’t see Cruise working out any other way.

01:26:05 For Cruise to leapfrog Waymo would really surprise me.

01:26:10 Yeah, so let’s talk about

01:26:12 the underlying assumption of everything is.

01:26:13 We’re not gonna leapfrog Tesla.

01:26:17 Tesla would have to seriously mess up for us

01:26:19 because you’re.

01:26:20 Okay, so the way you leapfrog, right?

01:26:23 Is you come up with an idea

01:26:26 or you take a direction perhaps secretly

01:26:28 that the other people aren’t taking.

01:26:31 And so the Cruise, Waymo,

01:26:35 even Aurora.

01:26:38 I don’t know Aurora, Zoox is the same stack as well.

01:26:40 They’re all the same code base even.

01:26:41 And they’re all the same DARPA Urban Challenge code base.

01:26:45 So the question is,

01:26:46 do you think there’s a room for brilliance and innovation

01:26:48 that will change everything?

01:26:50 Like say, okay, so I’ll give you examples.

01:26:53 It could be a revolution in mapping, for example,

01:26:59 that allows you to map things,

01:27:03 do HD maps of the whole world,

01:27:05 all weather conditions somehow really well,

01:27:08 or a revolution in simulation,

01:27:13 to where all that you said before becomes incorrect.

01:27:20 That kind of thing.

01:27:21 Any room for breakthrough innovation?

01:27:24 What I said before about,

01:27:25 oh, they actually get the whole thing.

01:27:27 Well, I’ll say this about,

01:27:30 we divide driving into three problems

01:27:32 and I actually haven’t solved the third yet,

01:27:33 but I have an idea how to do it.

01:27:34 So there’s the static.

01:27:36 The static driving problem is assuming

01:27:38 you are the only car on the road, right?

01:27:40 And this problem can be solved 100%

01:27:41 with mapping and localization.

01:27:43 This is why farms work the way they do.

01:27:45 If all you have to deal with is the static problem

01:27:48 and you can statically schedule your machines, right?

01:27:50 It’s the same as like statically scheduling processes.

01:27:52 You can statically schedule your tractors

01:27:53 to never hit each other on their paths, right?

01:27:56 Cause they know the speed they go at.

01:27:57 So that’s the static driving problem.

01:28:00 Maps only helps you with the static driving problem.
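
The farm analogy in code, a minimal sketch with toy grid paths: with known paths and speeds you can verify offline, like statically scheduling processes, that two machines never occupy the same cell at the same tick even where their paths cross in space:

```python
# With known paths and speeds you can verify offline, like statically
# scheduling processes, that two machines never occupy the same cell at
# the same tick, even where their paths cross in space. Toy grid example.
def positions(path, speed, ticks=20):
    """path: list of grid cells; speed: cells advanced per tick."""
    return {t: path[min(int(t * speed), len(path) - 1)] for t in range(ticks)}

tractor_a = positions([(0, y) for y in range(10)], speed=1)  # drives north
tractor_b = positions([(x, 5) for x in range(10)], speed=1)  # drives east

conflicts = [t for t in tractor_a if tractor_a[t] == tractor_b[t]]
print(conflicts or "statically safe")  # paths cross at (0, 5), never at the same tick
```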

01:28:03 Yeah, the question about static driving,

01:28:06 you’ve just made it sound like it’s really easy.

01:28:08 Static driving is really easy.

01:28:11 How easy?

01:28:13 How, well, cause the whole drifting out of lane,

01:28:16 when Tesla drifts out of lane,

01:28:18 it’s failing on the fundamental static driving problem.

01:28:22 Tesla is drifting out of lane?

01:28:24 The static driving problem is not easy for the world.

01:28:27 The static driving problem is easy for one route.

01:28:31 One route and one weather condition

01:28:33 with one state of lane markings

01:28:37 and like no deterioration, no cracks in the road.

01:28:40 No, I’m assuming you have a perfect localizer.

01:28:42 So that’s solved for the weather condition

01:28:44 and the lane marking condition.

01:28:45 But that’s the problem is,

01:28:46 how do you have a perfect localizer?

01:28:48 Perfect localizers are not that hard to build.

01:28:50 Okay, come on now, with LIDAR?

01:28:53 With LIDAR, yeah.

01:28:54 Oh, with LIDAR, okay.

01:28:55 With LIDAR, yeah, but you use LIDAR, right?

01:28:56 Like use LIDAR, build a perfect localizer.

01:28:58 Building a perfect localizer without LIDAR,

01:29:02 it’s gonna be hard.

01:29:04 You can get 10 centimeters without LIDAR,

01:29:05 you can get one centimeter with LIDAR.

01:29:07 I’m not even concerned about the one or 10 centimeters.

01:29:09 I’m concerned if every once in a while,

01:29:11 you’re just way off.

01:29:12 Yeah, so this is why you have to carefully make sure

01:29:17 you’re always tracking your position.

01:29:19 You wanna use LIDAR camera fusion,

01:29:21 but you can get the reliability of that system

01:29:24 up to 100,000 miles,

01:29:27 and then you write some fallback condition

01:29:29 where it’s not that bad if you’re way off, right?

01:29:32 I think that you can get it to the point,

01:29:33 it’s like ASIL D, that you’re never in a case

01:29:36 where you’re way off and you don’t know it.
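
A minimal sketch of that “never silently way off” idea, with illustrative sources and thresholds: cross-check independent position estimates and trigger the fallback the moment they disagree:

```python
# Cross-check independent position estimates and fall back the moment they
# disagree: the system may lose precision, but never be way off silently.
# Sources, positions, and the bound are illustrative.
import math
from itertools import combinations

def localization_ok(estimates, bound_m=0.5):
    """estimates: source name -> (x, y). OK only if all pairs agree."""
    return all(math.dist(a, b) <= bound_m
               for a, b in combinations(estimates.values(), 2))

est = {"lidar_map": (100.02, 50.01), "gps": (100.30, 49.95), "odometry": (99.98, 50.04)}
if localization_ok(est):
    print("localized")
else:
    print("fallback: alert the driver, degrade gracefully")
```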

01:29:38 Yeah, okay, so this is brilliant.

01:29:40 So that’s the static. Static.

01:29:42 We can, especially with LIDAR and good HD maps,

01:29:45 you can solve that problem. Easy.

01:29:47 No, I just disagree with your word easy.

01:29:50 The static problem’s so easy.

01:29:51 It’s very typical for you to say something is easy.

01:29:54 I got it. No.

01:29:54 It’s not as challenging as the other ones, okay.

01:29:56 Well, okay, maybe it’s obvious how to solve it.

01:29:58 The third one’s the hardest.

01:30:00 And a lot of people don’t even think about the third one

01:30:01 and even see it as different from the second one.

01:30:03 So the second one is dynamic.

01:30:05 The second one is like, say there’s an obvious example

01:30:08 is like a car stopped at a red light, right?

01:30:10 You can’t have that car in your map

01:30:12 because you don’t know whether that car

01:30:13 is gonna be there or not.

01:30:14 So you have to detect that car in real time

01:30:17 and then you have to do the appropriate action, right?

01:30:21 Also, that car is not a fixed object.

01:30:24 That car may move and you have to predict

01:30:26 what that car will do, right?

01:30:28 So this is the dynamic problem.

01:30:30 Yeah.

01:30:31 So you have to deal with this.

01:30:32 This involves, again, like you’re gonna need models

01:30:36 of other people’s behavior.

01:30:39 Are you including in that,

01:30:40 I don’t wanna step on the third one.

01:30:42 Oh.

01:30:43 But are you including in that your influence on people?

01:30:46 Ah, that’s the third one.

01:30:48 Okay.

01:30:49 That’s the third one.

01:30:49 We call it the counterfactual.

01:30:51 Yeah, brilliant.

01:30:52 And that.

01:30:53 I just talked to Judea Pearl

01:30:54 who’s obsessed with counterfactuals.

01:30:55 And the counterfactual.

01:30:56 Oh yeah, yeah, I read his books.

01:30:58 So the static and the dynamic

01:31:00 Yeah.

01:31:01 Our approach right now for lateral

01:31:04 will scale completely to the static and dynamic.

01:31:07 The counterfactual, the only way I have to do it yet,

01:31:10 the thing that I wanna do once we have all of these cars

01:31:13 is I wanna do reinforcement learning on the world.

01:31:16 I’m always gonna turn exploitation up to max.

01:31:18 I’m not gonna have them explore.

01:31:20 But the only real way to get at the counterfactual

01:31:22 is to do reinforcement learning

01:31:24 because the other agents are humans.

01:31:27 So that’s fascinating that you break it down like that.

01:31:30 I agree completely.

01:31:31 I’ve spent my life thinking about this problem.

01:31:33 It’s beautiful.

01:31:34 And part of it, because you’re slightly insane,

01:31:37 it’s good.

01:31:39 Because.

01:31:41 Not my life.

01:31:42 Just the last four years.

01:31:43 No, no.

01:31:43 You have some nonzero percent of your brain

01:31:48 has a madman in it, which is good.

01:31:51 That’s a really good feature.

01:31:52 But there’s a safety component to it

01:31:55 that I think sort of with counterfactuals and so on

01:31:59 that would just freak people out.

01:32:00 How do you even start to think about just in general?

01:32:03 I mean, you’ve had some friction with NHTSA and so on.

01:32:07 I am frankly exhausted by safety engineers.

01:32:14 The prioritization on safety over innovation

01:32:21 to a degree where it kills, in my view,

01:32:23 kills safety in the long term.

01:32:26 So the counterfactual thing,

01:32:28 they just actually exploring this world

01:32:31 of how do you interact with dynamic objects and so on.

01:32:33 How do you think about safety?

01:32:34 You can do reinforcement learning without ever exploring.

01:32:38 And I said that, so you can think about your,

01:32:40 in reinforcement learning,

01:32:41 it’s usually called a temperature parameter.

01:32:44 And your temperature parameter

01:32:45 is how often you deviate from the argmax.

01:32:48 I could always set that to zero and still learn.

01:32:50 And I feel that you’d always want that set to zero

01:32:52 on your actual system.
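
As a rough illustration of the temperature idea (a minimal sketch with a discrete action set and made-up Q values, not Comma’s code):

```python
import numpy as np

def select_action(q_values, temperature):
    # At temperature == 0 we always take the argmax: pure
    # exploitation, never deviating, as described above.
    if temperature == 0.0:
        return int(np.argmax(q_values))
    # At temperature > 0 we sample from a softmax, occasionally
    # deviating from the argmax (exploration).
    logits = np.asarray(q_values, dtype=float) / temperature
    logits -= logits.max()  # for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return int(np.random.choice(len(probs), p=probs))
```

With temperature zero, select_action([0.1, 0.9, 0.3], 0.0) always returns action 1; the policy can still learn from whatever data comes in.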

01:32:54 Gotcha.

01:32:54 But the problem is you first don’t know very much.

01:32:58 And so you’re going to make mistakes.

01:32:59 So the learning, the exploration happens through mistakes.

01:33:02 Yeah, but okay.

01:33:03 So the consequences of a mistake.

01:33:06 OpenPilot and Autopilot are making mistakes left and right.

01:33:09 We have 700 daily active users,

01:33:12 a thousand weekly active users.

01:33:14 OpenPilot makes tens of thousands of mistakes a week.

01:33:18 These mistakes have zero consequences.

01:33:21 These mistakes are,

01:33:22 oh, I wanted to take this exit and it went straight.

01:33:26 So I’m just going to carefully touch the wheel.

01:33:28 The humans catch them.

01:33:29 The humans catch them.

01:33:30 And the human disengagement is labeling

01:33:33 that reinforcement learning

01:33:34 in a completely consequence free way.
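
A loose sketch of how a disengagement becomes a consequence-free label (field names here are hypothetical, not OpenPilot’s actual log schema):

```python
def label_segment(segment):
    # If the human disengaged, the policy's action just before the
    # disengagement was bad; the human already caught it on the road,
    # so the label costs nothing.
    reward = -1.0 if segment["human_disengaged"] else 0.0
    return {
        "state": segment["state"],           # what the model saw
        "action": segment["policy_action"],  # what the model did
        "reward": reward,                    # graded by the human
    }
```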

01:33:37 So driver monitoring is the way you ensure they keep.

01:33:39 Yes.

01:33:40 They keep paying attention.

01:33:42 How is your messaging?

01:33:43 Say I gave you a billion dollars,

01:33:45 would you be scaling it now?

01:33:47 Oh, I couldn’t scale it with any amount of money.

01:33:49 I’d raise money if I could, if I had a way to scale it.

01:33:51 Yeah, you’re now not focused on scale.

01:33:53 I don’t know how to do,

01:33:54 oh, like I guess I could sell it to more people,

01:33:55 but I want to make the system better.

01:33:57 Better, better.

01:33:57 And I don’t know how to, I mean.

01:33:58 But what’s the messaging here?

01:34:01 I got a chance to talk to Elon and he basically said

01:34:06 that the human factor doesn’t matter.

01:34:09 You know, the human doesn’t matter

01:34:10 because the system will perform,

01:34:12 there’ll be sort of a, sorry to use the term,

01:34:14 but like a singular,

01:34:15 like a point where it gets just much better.

01:34:17 And so the human, it won’t really matter.

01:34:20 But it seems like that human catching the system

01:34:25 when it gets into trouble is like the thing

01:34:29 which will make something like reinforcement learning work.

01:34:32 So how do you think messaging for Tesla,

01:34:35 for you should change,

01:34:36 for the industry in general should change?

01:34:39 I think our messaging is pretty clear.

01:34:40 At least like our messaging wasn’t that clear

01:34:43 in the beginning and I do kind of fault myself for that.

01:34:45 We are proud right now to be a level two system.

01:34:48 We are proud to be level two.

01:34:50 If we talk about level four,

01:34:51 it’s not with the current hardware.

01:34:53 It’s not gonna be just a magical OTA upgrade.

01:34:55 It’s gonna be new hardware.

01:34:57 It’s gonna be very carefully thought out.

01:34:59 Right now, we are proud to be level two

01:35:01 and we have a rigorous safety model.

01:35:03 I mean, not like, okay, rigorous, who knows what that means,

01:35:06 but we at least have a safety model

01:35:08 and we make it explicit as in safety.md in OpenPilot.

01:35:11 And it says, seriously though, safety.md.

01:35:17 This is brilliant, this is so Android.

01:35:18 Well, this is the safety model

01:35:21 and I like to have conversations like,

01:35:25 sometimes people will come to you and they’re like,

01:35:27 your system’s not safe.

01:35:29 Okay, have you read my safety docs?

01:35:31 Would you like to have an intelligent conversation

01:35:32 about this?

01:35:33 And the answer is always no.

01:35:34 They just like scream about, it runs Python.

01:35:38 Okay, what?

01:35:39 So you’re saying that because Python’s not real time,

01:35:41 Python not being real time never causes disengagements.

01:35:44 Disengagements are caused by the model, which is QM, the non safety critical classification in ISO 26262.

01:35:47 But safety.md says the following,

01:35:49 first and foremost,

01:35:50 the driver must be paying attention at all times.

01:35:55 I still consider the software to be alpha software

01:35:57 until we can actually enforce that statement,

01:36:00 but I feel it’s very well communicated to our users.

01:36:03 Two more things.

01:36:04 One is the user must be able to easily take control

01:36:09 of the vehicle at all times.

01:36:10 So if you step on the gas or brake with OpenPilot,

01:36:14 it gives full manual control back to the user

01:36:16 or press the cancel button.

01:36:18 Two, the car will never react so quickly,

01:36:23 we define so quickly to be about one second,

01:36:26 that you can’t react in time.

01:36:27 And we do this by enforcing torque limits,

01:36:29 braking limits and acceleration limits.

01:36:31 So we have like our torque limits way lower than Tesla’s.
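
A minimal sketch of what enforcing those limits can look like (constants and names are placeholders, not Comma’s actual values):

```python
def clamp_steer_torque(requested, last_applied, max_torque, max_rate):
    # Rate limit: the commanded torque can only change so much per
    # control step, so the wheel can never be jerked suddenly.
    torque = max(last_applied - max_rate,
                 min(requested, last_applied + max_rate))
    # Absolute limit: never exceed the maximum torque in either
    # direction, keeping the car's worst-case reaction slow enough
    # (~1 second) for the driver to override.
    return max(-max_torque, min(torque, max_torque))
```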

01:36:36 This is another potential.

01:36:39 If I could tweak Autopilot,

01:36:40 I would lower their torque limit

01:36:41 and I would add driver monitoring.

01:36:42 Because Autopilot can jerk the wheel hard.

01:36:46 OpenPilot can’t.

01:36:47 We limit, and all this code is open source, readable.

01:36:52 And I believe now it’s all MISRA C compliant.

01:36:54 What’s that mean?

01:36:57 MISRA is like the automotive coding standard.

01:37:03 I’ve been reading the standards lately

01:37:05 and I’ve come to respect them.

01:37:05 They’re actually written by very smart people.

01:37:07 Yeah, they’re brilliant people actually.

01:37:09 They have a lot of experience.

01:37:11 They’re sometimes a little too cautious,

01:37:13 but in this case, it pays off.

01:37:16 MISRA is written by like computer scientists.

01:37:19 You can tell by the language they use,

01:37:21 they talk about whether certain conditions in MISRA

01:37:24 are decidable or undecidable.

01:37:26 And you mean like the halting problem?

01:37:28 And yes, all right, you’ve earned my respect.

01:37:31 I will read carefully what you have to say

01:37:33 and we wanna make our code compliant with that.

01:37:35 All right, so you’re proud level two, beautiful.

01:37:38 So you were the founder and I think CEO of Comma.ai,

01:37:42 then you were the head of research.

01:37:44 What the heck are you now?

01:37:46 What’s your connection to Comma.ai?

01:37:47 I’m the president, but I’m one of those

01:37:49 like unelected presidents of like a small dictatorship

01:37:53 country, not one of those like elected presidents.

01:37:55 Oh, so you’re like Putin when he was like the,

01:37:57 yeah, I got you.

01:37:59 So there’s a, what’s the governance structure?

01:38:02 What’s the future of Comma.ai?

01:38:04 I mean, yeah, it’s a business.

01:38:07 Do you want, are you just focused on getting things

01:38:10 right now, making some small amount of money in the meantime

01:38:14 and then when it works, it works and you scale?

01:38:17 Our burn rate is about 200K a month

01:38:20 and our revenue is about 100K a month.

01:38:23 So we need to 4x our revenue,

01:38:24 but we haven’t like tried very hard at that yet.

01:38:28 And the revenue is basically selling stuff online.

01:38:30 Yeah, we sell stuff at shop.comma.ai.

01:38:32 Is there other, well, okay,

01:38:33 so you’ll have to figure out the revenue.

01:38:35 That’s our only, see, but to me,

01:38:37 that’s like respectable revenues.

01:38:40 We make it by selling products to consumers,

01:38:42 and we’re honest and transparent about what they are.

01:38:45 Unlike most actual level four companies, right?

01:38:50 Cause you could easily start blowing smoke,

01:38:54 overselling the hype and feeding into

01:38:57 getting some fundraising.

01:38:59 Oh, you’re the guy, you’re a genius

01:39:00 because you hacked the iPhone.

01:39:01 Oh, I hate that, I hate that.

01:39:03 Yeah, well, I can trade my social capital for more money.

01:39:06 I did it once, I almost regret doing it the first time.

01:39:10 Well, on a small tangent,

01:39:11 what’s your, you seem to not like fame

01:39:16 and yet you’re also drawn to fame.

01:39:18 Where are you on that currently?

01:39:24 Have you had some introspection, some soul searching?

01:39:27 Yeah, I actually,

01:39:29 I’ve come to a pretty stable position on that.

01:39:32 Like after the first time,

01:39:33 I realized that I don’t want attention from the masses.

01:39:36 I want attention from people who I respect.

01:39:40 Who do you respect?

01:39:41 I can give a list of people.

01:39:43 So are these like Elon Musk type characters?

01:39:47 Yeah, well, actually, you know what?

01:39:50 I’ll make it more broad than that.

01:39:51 I won’t make it about a person, I respect skill.

01:39:54 I respect people who have skills, right?

01:39:56 And I would like to like be, I’m not gonna say famous,

01:40:01 but be like known among more people who have like real skills.

01:40:06 Who in cars do you think have skill, not do you respect?

01:40:15 Oh, Kyle Vogt has skill.

01:40:17 A lot of people at Waymo have skill and I respect them.

01:40:20 I respect them as engineers.

01:40:23 Like I can think, I mean,

01:40:24 I think about all the times in my life

01:40:26 where I’ve been like dead set on approaches

01:40:27 and they turn out to be wrong.

01:40:29 So, I mean, this might, I might be wrong.

01:40:31 I accept that.

01:40:32 I accept that there’s a decent chance that I’m wrong.

01:40:36 And actually, I mean,

01:40:37 having talked to Chris Urmson, Sterling Anderson,

01:40:39 those guys, I mean, I deeply respect Chris.

01:40:43 I just admire the guy.

01:40:46 He’s legit.

01:40:47 When you drive a car through the desert

01:40:48 when everybody thinks it’s impossible, that’s legit.

01:40:52 And then I also really respect the people

01:40:53 who are like writing the infrastructure of the world,

01:40:55 like the Linus Torvaldses and the Chris Lattners.

01:40:57 They were doing the real work.

01:40:59 I know, they’re doing the real work.

01:41:00 This, having talked to Chris,

01:41:03 like Chris Lattner, you realize,

01:41:04 especially when they’re humble,

01:41:05 it’s like you realize, oh, you guys,

01:41:07 we’re just using your,

01:41:09 Oh yeah.

01:41:10 All the hard work that you did.

01:41:11 Yeah, that’s incredible.

01:41:13 What do you think, Mr. Anthony Levandowski,

01:41:18 what do you, he’s another mad genius.

01:41:21 Sharp guy, oh yeah.

01:41:22 What, do you think he might long term become a competitor?

01:41:27 Oh, to comma?

01:41:28 Well, so I think that he has the other right approach.

01:41:32 I think that right now there’s two right approaches.

01:41:35 One is what we’re doing, and one is what he’s doing.

01:41:37 Can you describe, I think it’s called Pronto AI.

01:41:39 He started a new thing.

01:41:40 Do you know what the approach is?

01:41:42 I actually don’t know.

01:41:43 Embark is also doing the same sort of thing.

01:41:45 The idea is almost that you want to,

01:41:47 so if you’re, I can’t partner with Honda and Toyota.

01:41:51 Honda and Toyota are like 400,000 person companies.

01:41:56 It’s not even a company at that point.

01:41:58 I don’t think of it like, I don’t personify it.

01:42:00 I think of it like an object,

01:42:01 but a trucker drives for a fleet,

01:42:06 maybe that has like, some truckers are independent.

01:42:09 Some truckers drive for fleets with a hundred trucks.

01:42:11 There are tons of independent trucking companies out there.

01:42:14 Start a trucking company and drive your costs down

01:42:17 or figure out how to drive down the cost of trucking.

01:42:23 Another company that I really respect is Nauto.

01:42:25 Actually, I respect their business model.

01:42:27 Nauto sells a driver monitoring camera

01:42:31 and they sell it to fleet owners.

01:42:33 If I owned a fleet of cars

01:42:35 and I could pay 40 bucks a month to monitor my employees,

01:42:41 I would; it reduces accidents 18%.

01:42:45 In the space,

01:42:48 that is the business model that I most respect.

01:42:52 Cause they’re creating value today.

01:42:54 Yeah, which is a, that’s a huge one.

01:42:57 How do we create value today with some of this?

01:42:59 And the lane keeping thing is huge.

01:43:01 And it sounds like you’re creeping in

01:43:03 or full steam ahead on the driver monitoring too,

01:43:06 which I think is actually where the short term value is,

01:43:09 if you can get it right.

01:43:10 I still, I’m not a huge fan of the statement

01:43:12 that everything has to have driver monitoring.

01:43:15 I agree with that completely,

01:43:16 but that statement usually misses the point

01:43:18 that to get the experience of it right is not trivial.

01:43:21 Oh no, not at all.

01:43:22 In fact, so right now,

01:43:26 I think the timeout depends on the speed of the car,

01:43:29 but we want it to depend on the scene state.

01:43:32 If you’re on like an empty highway,

01:43:35 it’s very different if you don’t pay attention

01:43:37 than if like you’re like coming up to a traffic light.
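
A minimal sketch of a scene-dependent timeout, with hypothetical scene labels and invented numbers:

```python
def alert_timeout(speed_mps, scene):
    # Today: longer leash at low speed, shorter at highway speed.
    base = 6.0 if speed_mps < 15.0 else 3.0
    # The idea above: scale by scene state too. All numbers invented.
    scene_factor = {
        "empty_highway": 1.0,       # inattention is less costly here
        "dense_traffic": 0.5,
        "approaching_light": 0.25,  # need eyes on the road right now
    }
    return base * scene_factor.get(scene, 0.5)
```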

01:43:42 And long term, it should probably learn from the driver.

01:43:45 To do that, I watched a lot of video.

01:43:48 We’ve built a smartphone detector

01:43:49 just to analyze how people are using smartphones

01:43:51 and people are using it very differently.

01:43:53 There are different texting styles.

01:43:58 We haven’t watched nearly enough of the videos.

01:44:00 We haven’t, I got millions of miles

01:44:01 of people driving cars.

01:44:02 In this moment, I spent a large fraction of my time

01:44:05 just watching videos because it never fails to teach me something.

01:44:10 I’ve never come away

01:44:12 from a video watching session

01:44:13 without learning something I didn’t know before.

01:44:15 In fact, I usually like when I eat lunch,

01:44:18 I’ll sit, especially when the weather is good

01:44:20 and just watch pedestrians with an eye to understand

01:44:24 like from a computer vision eye,

01:44:26 just to see can this model, can you predict,

01:44:29 what are the decisions made?

01:44:30 And there’s so many things that we don’t understand.

01:44:33 This is what I mean about the state vector.

01:44:34 Yeah, it’s, I’m trying to always think like,

01:44:37 cause I’m understanding in my human brain,

01:44:40 how do we convert that into,

01:44:43 how hard is the learning problem here?

01:44:44 I guess is the fundamental question.

01:44:46 So, something from a hacking perspective,

01:44:51 this always comes up, especially with folks.

01:44:54 Well, first the most popular question

01:44:55 is the trolley problem, right?

01:44:58 So that’s not really a serious problem.

01:45:01 There are some ethical questions I think that arise.

01:45:06 Maybe you wanna, do you think there’s any ethical,

01:45:09 serious ethical questions?

01:45:11 We have a solution to the trolley problem at Comma.ai.

01:45:14 Well, so there is actually an alert in our code,

01:45:16 ethical dilemma detected.

01:45:18 It’s not triggered yet.

01:45:18 We don’t know how yet to detect the ethical dilemmas,

01:45:21 but we’re a level two system.

01:45:22 So we’re going to disengage

01:45:23 and leave that decision to the human.

01:45:25 You’re such a troll.

01:45:26 No, but the trolley problem deserves to be trolled.

01:45:28 Yeah, that’s a beautiful answer actually.

01:45:32 I know, I gave it to someone who was like,

01:45:34 sometimes people will ask,

01:45:35 like you asked about the trolley problem,

01:45:36 like you can have a kind of discussion about it.

01:45:38 Like you get someone who’s like really like earnest about it

01:45:40 because it’s the kind of thing where,

01:45:43 if you ask a bunch of people in an office,

01:45:45 whether we should use a SQL stack or a NoSQL stack,

01:45:48 if they’re not that technical, they have no opinion.

01:45:50 But if you ask them what color they want to paint the office,

01:45:52 everyone has an opinion on that.

01:45:54 And that’s why the trolley problem is…

01:45:56 I mean, that’s a beautiful answer.

01:45:57 Yeah, we’re able to detect the problem

01:45:59 and we’re able to pass it on to the human.

01:46:01 Wow, I’ve never heard anyone say it.

01:46:03 This is your nice escape route.

01:46:06 Okay, but…

01:46:07 Proud level two.

01:46:08 I’m proud level two.

01:46:09 I love it.

01:46:10 So the other thing that people have some concern about

01:46:14 with AI in general is hacking.

01:46:17 So how hard is it, do you think,

01:46:20 to hack an autonomous vehicle,

01:46:21 either through physical access

01:46:23 or through the more sort of popular now,

01:46:25 these adversarial examples on the sensors?

01:46:28 Okay, the adversarial examples one.

01:46:30 You want to see some adversarial examples

01:46:32 that affect humans, right?

01:46:34 Oh, well, there used to be a stop sign here,

01:46:38 but I put a black bag over the stop sign

01:46:40 and then people ran it, adversarial, right?

01:46:43 Like there’s tons of human adversarial examples too.

01:46:48 The question in general about like security,

01:46:51 if you saw something just came out today

01:46:53 and like there are always such hypey headlines

01:46:55 about like how navigate on autopilot

01:46:57 was fooled by a GPS spoof to take an exit.

01:47:00 Right.

01:47:01 At least that’s all they could do was take an exit.

01:47:03 If your car is relying on GPS

01:47:06 in order to have a safe driving policy,

01:47:09 you’re doing something wrong.

01:47:10 If you’re relying,

01:47:11 and this is why V2V is such a terrible idea.

01:47:14 V2V now relies on both parties getting communication right.

01:47:19 This is not even, so I think of safety,

01:47:26 security is like a special case of safety, right?

01:47:28 Safety is like we put a little, you know,

01:47:31 piece of caution tape around the hole

01:47:33 so that people won’t walk into it by accident.

01:47:35 Security is like put a 10 foot fence around the hole

01:47:38 so you actually physically cannot climb into it

01:47:40 with barbed wire on the top and stuff, right?

01:47:42 So like if you’re designing systems that are like unreliable,

01:47:45 they’re definitely not secure.

01:47:48 Your car should always do something safe

01:47:51 using its local sensors.

01:47:53 And then the local sensor should be hardwired.

01:47:55 And then could somebody hack into your CAN bus

01:47:57 and turn your steering wheel on your brakes?

01:47:58 Yes, but they could do it before Comma.ai too, so.

01:48:02 Let’s think out of the box on some things.

01:48:04 So do you think teleoperation has a role in any of this?

01:48:09 So remotely stepping in and controlling the cars?

01:48:13 No, I think that if the safety operation by design

01:48:22 requires a constant link to the cars,

01:48:26 I think it doesn’t work.

01:48:27 So that’s the same argument you’re using for V2I, V2V?

01:48:31 Well, there’s a lot of non safety critical stuff

01:48:34 you can do with V2I.

01:48:35 I like V2I, I like V2I way more than V2V.

01:48:37 Because V2I is already like,

01:48:39 I already have internet in the car, right?

01:48:40 There’s a lot of great stuff you can do with V2I.

01:48:44 Like for example, you can, well, I already have V2I,

01:48:47 Waze is V2I, right?

01:48:48 Waze can route me around traffic jams.

01:48:50 That’s a great example of V2I.

01:48:52 And then, okay, the car automatically talks

01:48:54 to that same service, like it works.

01:48:55 So it’s improving the experience,

01:48:56 but it’s not a fundamental fallback for safety.

01:48:59 No, if any of your things that require wireless communication

01:49:04 are more than QM, like have an ASIL rating, it shouldn’t be.

01:49:10 You previously said that life is work

01:49:15 and that you don’t do anything to relax.

01:49:17 So how do you think about hard work?

01:49:20 What do you think it takes to accomplish great things?

01:49:24 And there’s a lot of people saying

01:49:25 that there needs to be some balance.

01:49:28 You need to, in order to accomplish great things,

01:49:31 you need to take some time off,

01:49:32 you need to reflect and so on.

01:49:33 Now, and then some people are just insanely working,

01:49:37 burning the candle on both ends.

01:49:39 How do you think about that?

01:49:41 I think I was trolling in the Siraj interview

01:49:43 when I said that.

01:49:44 Off camera, right before I smoked a little bit of weed,

01:49:47 like, you know, come on, this is a joke, right?

01:49:49 Like I do nothing to relax.

01:49:50 Look where I am, I’m at a party, right?

01:49:52 Yeah, yeah, yeah, that’s true.

01:49:55 So no, no, of course I don’t.

01:49:58 When I say that life is work though,

01:49:59 I mean that like, I think that what gives my life meaning is work.

01:50:04 I don’t mean that every minute of the day

01:50:05 you should be working.

01:50:06 I actually think this is not the best way to maximize results.

01:50:09 I think that if you’re working 12 hours a day,

01:50:12 you should be working smarter and not harder.

01:50:14 Well, so work gives you meaning.

01:50:17 For some people, other sorts of meaning

01:50:20 is personal relationships, like family and so on.

01:50:24 You’ve also, in that interview with Siraj,

01:50:27 or the trolling, mentioned that one of the things

01:50:30 you look forward to in the future is AI girlfriends.

01:50:34 So that’s a topic that I’m very much fascinated by,

01:50:38 not necessarily girlfriends,

01:50:39 but just forming a deep connection with AI.

01:50:42 What kind of system do you imagine

01:50:44 when you say AI girlfriend,

01:50:46 whether you were trolling or not?

01:50:47 No, that one I’m very serious about.

01:50:49 And I’m serious about that on both a shallow level

01:50:52 and a deep level.

01:50:53 I think that VR brothels are coming soon

01:50:55 and are going to be really cool.

01:50:57 It’s not cheating if it’s a robot.

01:50:59 I see the slogan already.

01:51:03 But there’s, I don’t know if you’ve watched,

01:51:06 or just watched the Black Mirror episode.

01:51:08 I watched the latest one, yeah.

01:51:09 Yeah, yeah.

01:51:11 Oh, the Ashley Too one?

01:51:15 No, where there’s two friends

01:51:16 who are having sex with each other in…

01:51:20 Oh, in the VR game.

01:51:21 In the VR game.

01:51:22 It’s just two guys,

01:51:23 but one of them was a female, yeah.

01:51:27 Which is another mind blowing concept.

01:51:29 That in VR, you don’t have to be the form.

01:51:33 You can be two animals having sex.

01:51:37 It’s weird.

01:51:38 I mean, we’ll see how nicely the software

01:51:38 maps the nerve endings, right?

01:51:40 Yeah, it’s huge.

01:51:41 I mean, yeah, they sweep a lot of the fascinating,

01:51:44 really difficult technical challenges under the rug,

01:51:46 like assuming it’s possible

01:51:48 to do the mapping of the nerve endings, then…

01:51:51 I wish, yeah, I saw that,

01:51:51 the way they did it with the little like stim unit

01:51:53 on the head, that’d be amazing.

01:51:56 So, well, no, no, on a shallow level,

01:51:58 like you could set up like almost a brothel

01:52:01 with like RealDolls and Oculus Quests,

01:52:05 write some good software.

01:52:06 I think it’d be a cool novelty experience.

01:52:09 But no, on a deeper, like emotional level,

01:52:12 I mean, yeah, I would really like to fall in love

01:52:17 with a machine.

01:52:18 Do you see yourself having a long term relationship

01:52:25 of the kind monogamous relationship that we have now

01:52:28 with a robot, with an AI system even,

01:52:31 not even just a robot?

01:52:32 So I think about maybe my ideal future.

01:52:38 When I was 15, I read Eliezer Yudkowsky’s early writings

01:52:43 on the singularity and like that AI

01:52:49 is going to surpass human intelligence massively.

01:52:53 He made some Moore’s law based predictions

01:52:55 that I mostly agree with.

01:52:57 And then I really struggled

01:52:59 for the next couple of years of my life.

01:53:01 Like, why should I even bother to learn anything?

01:53:03 It’s all gonna be meaningless when the machines show up.

01:53:06 Right.

01:53:07 Maybe when I was that young,

01:53:10 I was still a little bit more pure

01:53:11 and really like clung to that.

01:53:13 And then I’m like, well,

01:53:13 the machines ain’t here yet, you know,

01:53:14 and I seem to be pretty good at this stuff.

01:53:16 Let’s try my best, you know,

01:53:18 like what’s the worst that happens.

01:53:21 But the best possible future I see

01:53:24 is me sort of merging with the machine.

01:53:26 And the way that I personify this

01:53:28 is in a long term monogamous relationship with a machine.

01:53:32 Oh, you don’t think there’s a room

01:53:34 for another human in your life,

01:53:35 if you really truly merge with another machine?

01:53:39 I mean, I see merging.

01:53:40 I see like the best interface to my brain

01:53:46 is like the same relationship interface

01:53:48 to merge with an AI, right?

01:53:49 What does that merging feel like?

01:53:52 I’ve seen couples who’ve been together for a long time.

01:53:55 And like, I almost think of them as one person,

01:53:58 like couples who spend all their time together and…

01:54:01 That’s fascinating.

01:54:02 You’re actually putting,

01:54:03 what does that merging actually look like?

01:54:06 It’s not just a nice channel.

01:54:08 Like a lot of people imagine it’s just an efficient link,

01:54:12 search link to Wikipedia or something.

01:54:14 I don’t believe in that.

01:54:15 But it’s more,

01:54:16 you’re saying that there’s the same kind of relationship

01:54:18 you have with another human,

01:54:19 that’s a deep relationship.

01:54:20 That’s what merging looks like.

01:54:22 That’s pretty…

01:54:24 I don’t believe that link is possible.

01:54:26 I think that that link,

01:54:27 so you’re like, oh, I’m gonna download Wikipedia

01:54:29 right to my brain.

01:54:30 My reading speed is not limited by my eyes.

01:54:33 My reading speed is limited by my inner processing loop.

01:54:36 And to like bootstrap that sounds kind of unclear

01:54:40 how to do it and horrifying.

01:54:42 But if I am with somebody and I’ll use a somebody

01:54:46 who is making a super sophisticated model of me

01:54:51 and then running simulations on that model,

01:54:53 I’m not gonna get into the question

01:54:54 whether the simulations are conscious or not.

01:54:55 I don’t really wanna know what it’s doing.

01:54:58 But using those simulations

01:55:00 to play out hypothetical futures for me,

01:55:01 deciding what things to say to me,

01:55:04 to guide me along a path.

01:55:06 And that’s how I envision it.

01:55:08 So on that path to AI of superhuman level intelligence,

01:55:15 you’ve mentioned that you believe in the singularity,

01:55:16 that singularity is coming.

01:55:18 Again, could be trolling, could be not,

01:55:20 could be part, all trolling has truth in it.

01:55:23 I don’t know what that means anymore.

01:55:24 What is the singularity?

01:55:25 Yeah, so that’s really the question.

01:55:28 How many years do you think before the singularity,

01:55:30 what form do you think it will take?

01:55:32 Does that mean fundamental shifts in capabilities of AI?

01:55:35 Or does it mean some other kind of ideas?

01:55:39 Maybe that’s just my roots, but.

01:55:41 So I can buy a human being’s worth of compute

01:55:43 for like a million bucks today.

01:55:46 It’s about one TPU v3 pod.

01:55:47 I think they claim a hundred petaflops.

01:55:50 That’s being generous.

01:55:50 I think humans are actually more like 20.

01:55:52 So that’s like five humans.

01:55:53 That’s pretty good.

01:55:53 Google needs to sell their TPUs.

01:55:56 But I could buy, I could buy, I could buy GPUs.

01:55:58 I could buy a stack of like, I’d buy 1080 TIs,

01:56:02 build data center full of them.

01:56:03 And for a million bucks, I can get a human’s worth of compute.

01:56:08 But when you look at the total number of flops in the world,

01:56:12 when you look at human flops,

01:56:14 which goes up very, very slowly with the population

01:56:17 and machine flops, which goes up exponentially,

01:56:19 but it’s still nowhere near.

01:56:22 I think that’s the key thing to talk about

01:56:24 when the singularity happened.

01:56:25 When most flops in the world are silicon and not biological,

01:56:29 that’s kind of the crossing point.

01:56:32 Like they’re now the dominant species on the planet.
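
A back-of-the-envelope version of that crossing point, using his rough figures; every constant here is an assumption, not a measurement:

```python
# Total biological compute grows with population, very slowly.
HUMAN_FLOPS = 20e15            # ~20 petaflops per human, his estimate
POPULATION = 8e9
human_total = HUMAN_FLOPS * POPULATION   # ~1.6e26 flops

# Total silicon compute grows exponentially; starting point assumed.
machine_total = 1e21
DOUBLING_YEARS = 2.0           # assumed Moore's-law-style doubling

years = 0.0
while machine_total < human_total:
    machine_total *= 2.0
    years += DOUBLING_YEARS
print(f"silicon flops dominate in ~{years:.0f} years")
```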

01:56:35 And just looking at how technology is progressing,

01:56:38 when do you think that could possibly happen?

01:56:40 You think it would happen in your lifetime?

01:56:41 Oh yeah, definitely in my lifetime.

01:56:43 I’ve done the math.

01:56:44 I like 2038 because it’s the Unix timestamp rollover.

01:56:49 Yeah, beautifully put.

01:56:52 So you’ve said that the meaning of life is to win.

01:56:57 If you look five years into the future,

01:56:59 what does winning look like?

01:57:02 So,

01:57:08 there’s a lot of,

01:57:10 I can go into like technical depth

01:57:12 to what I mean by that, to win.

01:57:15 It may not mean, I was criticized for that in the comments.

01:57:18 Like, doesn’t this guy wanna like save the penguins

01:57:20 in Antarctica or like,

01:57:22 oh man, listen to what I’m saying.

01:57:24 I’m not talking about like I have a yacht or something.

01:57:27 But I am an agent.

01:57:30 I am put into this world.

01:57:32 And I don’t really know what my purpose is.

01:57:37 But if you’re an intelligent agent

01:57:40 and you’re put into a world,

01:57:41 what is the ideal thing to do?

01:57:43 Well, the ideal thing mathematically,

01:57:44 you can go back to like Schmidhuber’s theories about this,

01:57:47 is to build a compressive model of the world.

01:57:50 To build a maximally compressive,

01:57:51 to explore the world such that your exploration function

01:57:55 maximizes the derivative of compression of the past.

01:57:58 Schmidhuber has a paper about this.
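
A sketch of the compression-progress idea he’s referencing; the compressor interface here is hypothetical, not a real library:

```python
def curiosity_reward(compressor, history, new_obs):
    # How well does the current model compress the past?
    before = compressor.compressed_size(history)
    # Learn from the new observation...
    compressor.fit(history + [new_obs])
    # ...and measure the past again under the improved model.
    after = compressor.compressed_size(history)
    # Positive when the observation made the past more compressible:
    # the derivative of compression progress is the reward.
    return before - after
```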

01:58:00 And like, I took that kind of

01:58:02 as like a personal goal function.

01:58:05 So what I mean to win, I mean like,

01:58:07 maybe this is religious,

01:58:09 but like I think that in the future,

01:58:11 I might be given a real purpose

01:58:13 or I may decide this purpose myself.

01:58:14 And then at that point,

01:58:16 now I know what the game is and I know how to win.

01:58:18 I think right now,

01:58:19 I’m still just trying to figure out what the game is.

01:58:20 But once I know,

01:58:21 so you have imperfect information,

01:58:26 you have a lot of uncertainty about the reward function

01:58:28 and you’re discovering it.

01:58:29 Exactly.

01:58:30 But the purpose is…

01:58:31 That’s a better way to put it.

01:58:33 The purpose is to maximize it

01:58:34 while you have a lot of uncertainty around it.

01:58:37 And you’re both reducing the uncertainty

01:58:39 and maximizing at the same time.

01:58:41 Yeah.

01:58:42 And so that’s at the technical level.

01:58:44 What is the, if you believe in the universal prior,

01:58:47 what is the universal reward function?

01:58:49 That’s the better way to put it.

01:58:51 So that win is interesting.

01:58:53 I think I speak for everyone in saying that

01:58:57 I wonder what that reward function is for you.

01:59:01 And I look forward to seeing that in five years,

01:59:05 in 10 years.

01:59:07 I think a lot of people, including myself,

01:59:08 are cheering you on, man.

01:59:09 So I’m happy you exist and I wish you the best of luck.

01:59:14 Thanks for talking to me, man.

01:59:15 Thank you.

01:59:16 Have a good one.