Transcript
00:00:00 The following is a conversation with Anca Dragan,
00:00:03 a professor at Berkeley working on human robot interaction,
00:00:08 algorithms that look beyond the robot’s function
00:00:10 in isolation and generate robot behavior
00:00:13 that accounts for interaction
00:00:15 and coordination with human beings.
00:00:18 She also consults at Waymo, the autonomous vehicle company,
00:00:22 but in this conversation,
00:00:23 she is 100% wearing her Berkeley hat.
00:00:27 She is one of the most brilliant and fun roboticists
00:00:30 in the world to talk with.
00:00:32 I had a tough and crazy day leading up to this conversation,
00:00:36 so I was a bit tired, even more so than usual,
00:00:41 but almost immediately as she walked in,
00:00:44 her energy, passion, and excitement
00:00:46 for human robot interaction was contagious.
00:00:48 So I had a lot of fun and really enjoyed this conversation.
00:00:52 This is the Artificial Intelligence Podcast.
00:00:55 If you enjoy it, subscribe on YouTube,
00:00:57 review it with five stars on Apple Podcast,
00:01:00 support it on Patreon,
00:01:01 or simply connect with me on Twitter at Lex Fridman,
00:01:05 spelled F R I D M A N.
00:01:08 As usual, I’ll do one or two minutes of ads now
00:01:11 and never any ads in the middle
00:01:12 that can break the flow of the conversation.
00:01:14 I hope that works for you
00:01:16 and doesn’t hurt the listening experience.
00:01:20 This show is presented by Cash App,
00:01:22 the number one finance app in the App Store.
00:01:25 When you get it, use code LEXPODCAST.
00:01:29 Cash App lets you send money to friends,
00:01:31 buy Bitcoin, and invest in the stock market
00:01:33 with as little as one dollar.
00:01:36 Since Cash App does fractional share trading,
00:01:39 let me mention that the order execution algorithm
00:01:41 that works behind the scenes
00:01:43 to create the abstraction of fractional orders
00:01:45 is an algorithmic marvel.
00:01:48 So big props to the Cash App engineers
00:01:50 for solving a hard problem that in the end
00:01:53 provides an easy interface that takes a step up
00:01:56 to the next layer of abstraction over the stock market,
00:01:59 making trading more accessible for new investors
00:02:02 and diversification much easier.
00:02:05 So again, if you get Cash App from the App Store
00:02:08 or Google Play and use the code LEXPODCAST,
00:02:11 you get $10 and Cash App will also donate $10 to FIRST,
00:02:15 an organization that is helping to advance robotics
00:02:18 and STEM education for young people around the world.
00:02:22 And now, here’s my conversation with Anca Dragan.
00:02:26 When did you first fall in love with robotics?
00:02:29 I think it was a very gradual process
00:02:34 and it was somewhat accidental actually
00:02:37 because I first started getting into programming
00:02:41 when I was a kid and then into math
00:02:43 and then I decided computer science
00:02:46 was the thing I was gonna do
00:02:47 and then in college I got into AI
00:02:50 and then I applied to the Robotics Institute
00:02:52 at Carnegie Mellon and I was coming from this little school
00:02:56 in Germany that nobody had heard of
00:02:59 but I had spent an exchange semester at Carnegie Mellon
00:03:01 so I had letters from Carnegie Mellon.
00:03:04 So that was the only, you know, MIT said no,
00:03:06 Berkeley said no, Stanford said no.
00:03:09 That was the only place I got into
00:03:11 so I went there to the Robotics Institute
00:03:13 and I thought that robotics is a really cool way
00:03:16 to actually apply the stuff that I knew and loved
00:03:20 to like optimization so that’s how I got into robotics.
00:03:23 I have a better story how I got into cars
00:03:25 which is I used to do mostly manipulation in my PhD
00:03:31 but now I do kind of a bit of everything application wise
00:03:34 including cars and I got into cars
00:03:38 because I was here in Berkeley
00:03:42 while I was a PhD student still for RSS 2014,
00:03:46 Pieter Abbeel organized it and he arranged for,
00:03:50 it was Google at the time to give us rides
00:03:52 in self driving cars and I was in a robot
00:03:56 and it was just making decision after decision,
00:04:00 the right call and it was so amazing.
00:04:03 So it was a whole different experience, right?
00:04:05 Just I mean manipulation is so hard you can’t do anything
00:04:07 and there it was.
00:04:08 Was it the most magical robot you’ve ever met?
00:04:11 So like for me to meet a Google self driving car
00:04:14 for the first time was like a transformative moment.
00:04:18 Like I had two moments like that,
00:04:19 that and Spot Mini, I don’t know if you met Spot Mini
00:04:22 from Boston Dynamics.
00:04:24 I felt like I fell in love or something
00:04:27 like it, cause I know how a Spot Mini works, right?
00:04:30 It’s just, I mean there’s nothing truly special,
00:04:34 it’s great engineering work but the anthropomorphism
00:04:38 that went on into my brain that came to life
00:04:41 like it had a little arm and it looked at me,
00:04:45 he, she looked at me, I don’t know,
00:04:47 there’s a magical connection there
00:04:48 and it made me realize, wow, robots can be so much more
00:04:52 than things that manipulate objects.
00:04:54 They can be things that have a human connection.
00:04:56 Do you have, was the self driving car the moment like,
00:05:01 was there a robot that truly sort of inspired you?
00:05:04 That was, I remember that experience very viscerally,
00:05:08 riding in that car and being just wowed.
00:05:11 I had the, they gave us a sticker that said,
00:05:16 I rode in a self driving car
00:05:17 and it had this cute little firefly on it,
00:05:20 or logo or something like that.
00:05:21 Oh, that was like the smaller one, like the firefly.
00:05:23 Yeah, the really cute one, yeah.
00:05:25 And I put it on my laptop and I had that for years
00:05:30 until I finally changed my laptop out and you know.
00:05:33 What about if we walk back, you mentioned optimization,
00:05:36 like what beautiful ideas inspired you in math,
00:05:40 computer science early on?
00:05:42 Like why get into this field?
00:05:44 It seems like a cold and boring field of math.
00:05:47 Like what was exciting to you about it?
00:05:49 The thing is I liked math from very early on,
00:05:52 from fifth grade is when I got into the math Olympiad
00:05:56 and all of that.
00:05:57 Oh, you competed too?
00:05:58 Yeah, in Romania it’s like our national sport too,
00:06:01 you gotta understand.
00:06:02 So I got into that fairly early
00:06:05 and it was a little, maybe too much just theory
00:06:10 with no kind of, I didn’t kind of have a,
00:06:13 didn’t really have a goal.
00:06:15 And other than understanding, which was cool,
00:06:17 I always liked learning and understanding,
00:06:19 but there was no, okay,
00:06:20 what am I applying this understanding to?
00:06:22 And so I think that’s how I got into,
00:06:23 more heavily into computer science
00:06:25 because it was kind of math meets something
00:06:29 you can do tangibly in the world.
00:06:31 Do you remember like the first program you’ve written?
00:06:34 Okay, the first program I’ve written with,
00:06:37 I kind of do, it was in QBasic in fourth grade.
00:06:42 Wow.
00:06:43 And it was drawing like a circle.
00:06:46 Graphics.
00:06:47 Yeah, that was, I don’t know how to do that anymore,
00:06:51 but in fourth grade,
00:06:52 that’s the first thing that they taught me.
00:06:54 I was like, you could take a special,
00:06:56 I wouldn’t say it was an extracurricular,
00:06:57 it’s in a sense an extracurricular,
00:06:59 so you could sign up for dance or music or programming.
00:07:03 And I did the programming thing
00:07:04 and my mom was like, what, why?
00:07:07 Did you compete in programming?
00:07:08 Like these days, Romania probably,
00:07:12 that’s like a big thing.
00:07:12 There’s a programming competition.
00:07:15 Was that, did that touch you at all?
00:07:17 I did a little bit of the computer science Olympiad,
00:07:21 but not as seriously as I did the math Olympiad.
00:07:24 So it was programming.
00:07:25 Yeah, it’s basically,
00:07:26 here’s a hard math problem,
00:07:27 solve it with a computer is kind of the deal.
00:07:29 Yeah, it’s more like algorithm.
00:07:30 Exactly, it’s always algorithmic.
00:07:32 So again, you kind of mentioned the Google self driving car,
00:07:36 but outside of that,
00:07:39 what’s like who or what is your favorite robot,
00:07:44 real or fictional that like captivated
00:07:46 your imagination throughout?
00:07:48 I mean, I guess you kind of alluded
00:07:49 to the Google self drive,
00:07:51 the Firefly was a magical moment,
00:07:53 but is there something else?
00:07:54 It wasn’t the Firefly then,
00:07:56 I think it was the Lexus, by the way.
00:07:58 This was back then.
00:07:59 But yeah, so good question.
00:08:02 Okay, my favorite fictional robot is WALL-E.
00:08:08 And I love how amazingly expressive it is.
00:08:15 I personally think a little bit
00:08:16 about expressive motion, the kinds of things you were saying, with,
00:08:18 you can do this and it’s a head and it’s the manipulator
00:08:20 and what does it all mean?
00:08:22 I like to think about that stuff.
00:08:24 I love Pixar, I love animation.
00:08:26 WALL-E has two big eyes, I think, or no?
00:08:28 Yeah, it has these cameras and they move.
00:08:34 So yeah, it goes and then it’s super cute.
00:08:38 Yeah, the way it moves is just so expressive,
00:08:41 the timing of that motion,
00:08:43 what it’s doing with its arms
00:08:44 and what it’s doing with these lenses is amazing.
00:08:48 And so I’ve really liked that from the start.
00:08:53 And then on top of that, sometimes I share this,
00:08:56 it’s a personal story I share with people
00:08:58 or when I teach about AI or whatnot.
00:09:01 My husband proposed to me by building a WALL-E
00:09:07 and he actuated it.
00:09:09 So it’s seven degrees of freedom, including the lens thing.
00:09:13 And it kind of came in and it had the,
00:09:17 he made it have like the belly box opening thing.
00:09:21 So it just did that.
00:09:23 And then it spewed out this box made out of Legos
00:09:27 that open slowly and then bam, yeah.
00:09:31 Yeah, it was quite, it set a bar.
00:09:34 That could be like the most impressive thing I’ve ever heard.
00:09:37 Okay.
00:09:39 That was a special connection to WALL-E, long story short.
00:09:40 I like WALL-E because I like animation and I like robots
00:09:43 and I like the fact that this was,
00:09:46 we still have this robot to this day.
00:09:49 How hard is that problem,
00:09:50 do you think of the expressivity of robots?
00:09:54 Like with the Boston Dynamics, I never talked to those folks
00:09:59 about this particular element.
00:10:00 I’ve talked to them a lot,
00:10:02 but it seems to be like almost an accidental side effect
00:10:05 for them that they weren’t,
00:10:07 I don’t know if they’re faking it.
00:10:08 They weren’t trying to, okay.
00:10:11 They do say that the gripper,
00:10:14 it was not intended to be a face.
00:10:17 I don’t know if that’s an honest statement,
00:10:20 but I think they’re legitimate.
00:10:21 Probably yes. And so do we automatically just
00:10:25 anthropomorphize anything we can see about a robot?
00:10:29 So like the question is,
00:10:30 how hard is it to create a WALLI type robot
00:10:33 that connects so deeply with us humans?
00:10:35 What do you think?
00:10:36 It’s really hard, right?
00:10:37 So it depends on what setting.
00:10:39 So if you wanna do it in this very particular narrow setting
00:10:45 where it does only one thing and it’s expressive,
00:10:48 then you can get an animator, you know,
00:10:50 you can have Pixar on call come in,
00:10:52 design some trajectories.
00:10:53 There was a, Anki had a robot called Cozmo
00:10:56 where they put in some of these animations.
00:10:58 That part is easy, right?
00:11:00 The hard part is doing it not via these
00:11:04 kind of handcrafted behaviors,
00:11:06 but doing it generally autonomously.
00:11:09 Like I want robots, I don’t work on,
00:11:12 just to clarify, I don’t, I used to work a lot on this.
00:11:14 I don’t work on that quite as much these days,
00:11:17 but the notion of having robots that, you know,
00:11:21 when they pick something up and put it in a place,
00:11:24 they can do that with various forms of style,
00:11:28 or you can say, well, this robot is, you know,
00:11:30 succeeding at this task and is confident
00:11:32 versus it’s hesitant versus, you know,
00:11:34 maybe it’s happy or it’s, you know,
00:11:35 disappointed about something, some failure that it had.
00:11:38 I think that when robots move,
00:11:42 they can communicate so much about internal states
00:11:46 or perceived internal states that they have.
00:11:49 And I think that’s really useful
00:11:53 and an element that we’ll want in the future
00:11:55 because I was reading this article
00:11:58 about how kids are,
00:12:04 kids are being rude to Alexa
00:12:07 because they can be rude to it
00:12:09 and it doesn’t really get angry, right?
00:12:11 It doesn’t reply in any way, it just says the same thing.
00:12:15 So I think there’s, at least for that,
00:12:17 for the correct development of children,
00:12:20 it’s important that these things,
00:12:21 you kind of react differently.
00:12:22 I also think, you know, you walk in your home
00:12:24 and you have a personal robot and if you’re really pissed,
00:12:27 presumably the robot should kind of behave
00:12:28 slightly differently than when you’re super happy
00:12:31 and excited, but it’s really hard because it’s,
00:12:36 I don’t know, you know, the way I would think about it
00:12:38 and the way I thought about it when it came to
00:12:40 expressing goals or intentions for robots,
00:12:44 it’s, well, what’s really happening is that
00:12:47 instead of doing robotics where you have your state
00:12:51 and you have your action space and you have
00:12:55 the reward function that you’re trying to optimize,
00:12:57 now you kind of have to expand the notion of state
00:13:00 to include this human internal state.
00:13:02 What is the person actually perceiving?
00:13:05 What do they think about the robot’s
00:13:08 something or other,
00:13:10 and then you have to optimize in that system.
00:13:12 And so that means that you have to understand
00:13:14 how your motion, your actions end up sort of influencing
00:13:17 the observer’s kind of perception of you.
00:13:20 And it’s very hard to write math about that.
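One minimal way to write down that expanded problem, with purely illustrative notation: the usual state gets augmented with the human's internal state, and the transition has to capture how the robot's actions change it,

$$
\tilde{s} = (s, \theta_H), \qquad
P(\tilde{s}' \mid \tilde{s}, a_R) = P(s' \mid s, a_R)\,P(\theta_H' \mid \theta_H, s, a_R),
$$

where $s$ is the physical state, $\theta_H$ is the human's internal state (what they perceive or believe about the robot), and $a_R$ is the robot's action; the second factor is exactly the hard-to-write-down part, how the robot's motion shapes the observer's perception.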
00:13:25 Right, so when you start to think about
00:13:27 incorporating the human into the state model,
00:13:31 apologize for the philosophical question,
00:13:33 but how complicated are human beings, do you think?
00:13:36 Like, can they be reduced to a kind of
00:13:40 almost like an object that moves
00:13:43 and maybe has some basic intents?
00:13:46 Or is there something, do we have to model things like mood
00:13:50 and general aggressiveness and time?
00:13:52 I mean, all these kinds of human qualities
00:13:54 or like game theoretic qualities, like what’s your sense?
00:13:58 How complicated is…
00:14:00 How hard is the problem of human robot interaction?
00:14:03 Yeah, should we talk about
00:14:05 what the problem of human robot interaction is?
00:14:07 Yeah, what is human robot interaction?
00:14:10 And then talk about how that, yeah.
00:14:12 So, and by the way, I’m gonna talk about
00:14:15 this very particular view of human robot interaction, right?
00:14:19 Which is not so much on the social side
00:14:21 or on the side of how do you have a good conversation
00:14:24 with the robot, what should the robot’s appearance be?
00:14:26 It turns out that if you make robots taller versus shorter,
00:14:29 this has an effect on how people act with them.
00:14:31 So I’m not talking about that.
00:14:34 But I’m talking about this very kind of narrow thing,
00:14:36 which is you take, if you wanna take a task
00:14:39 that a robot can do in isolation,
00:14:42 in a lab out there in the world, but in isolation,
00:14:46 and now you’re asking what does it mean for the robot
00:14:49 to be able to do this task for,
00:14:52 presumably what its actual end goal is,
00:14:54 which is to help some person.
00:14:56 That ends up changing the problem in two ways.
00:15:02 The first way it changes the problem is that
00:15:04 the robot is no longer the single agent acting.
00:15:08 That you have humans who also take actions
00:15:10 in that same space.
00:15:12 Cars navigating around people, robots around an office,
00:15:15 navigating around the people in that office.
00:15:18 If I send the robot over there in the cafeteria
00:15:20 to get me a coffee, then there’s probably other people
00:15:23 reaching for stuff in the same space.
00:15:25 And so now you have your robot and you’re in charge
00:15:28 of the actions that the robot is taking.
00:15:30 Then you have these people who are also making decisions
00:15:33 and taking actions in that same space.
00:15:36 And even if, you know, the robot knows what it should do
00:15:39 and all of that, just coexisting with these people, right?
00:15:42 Kind of getting the actions to gel well,
00:15:45 to mesh well together.
00:15:47 That’s sort of the kind of problem number one.
00:15:50 And then there’s problem number two,
00:15:51 which is, goes back to this notion of if I’m a programmer,
00:15:58 I can specify some objective for the robot
00:16:00 to go off and optimize and specify the task.
00:16:03 But if I put the robot in your home,
00:16:07 presumably you might have your own opinions about,
00:16:11 well, okay, I want my house clean,
00:16:12 but how do I want it cleaned?
00:16:14 And how should the robot move, how close to me it should come
00:16:16 and all of that.
00:16:17 And so I think those are the two differences that you have.
00:16:20 You’re acting around people and what you should be
00:16:24 optimizing for should satisfy the preferences
00:16:27 of that end user, not of your programmer who programmed you.
00:16:30 Yeah, and the preferences thing is tricky.
00:16:33 So figuring out those preferences,
00:16:35 be able to interactively adjust
00:16:38 to understand what the human is doing.
00:16:39 So really it boils down to understand the humans
00:16:42 in order to interact with them and in order to please them.
00:16:45 Right.
00:16:47 So why is this hard?
00:16:48 Yeah, why is understanding humans hard?
00:16:51 So I think there’s two tasks about understanding humans
00:16:57 that in my mind are very, very similar,
00:16:59 but not everyone agrees.
00:17:00 So there’s the task of being able to just anticipate
00:17:04 what people will do.
00:17:05 We all know that cars need to do this, right?
00:17:07 We all know that, well, if I navigate around some people,
00:17:10 the robot has to get some notion of,
00:17:12 okay, where is this person gonna be?
00:17:15 So that’s kind of the prediction side.
00:17:17 And then there’s what you were saying,
00:17:19 satisfying the preferences, right?
00:17:21 So adapting to the person’s preferences,
00:17:22 knowing what to optimize for,
00:17:24 which is more this inference side,
00:17:25 this what does this person want?
00:17:28 What is their intent? What are their preferences?
00:17:31 And to me, those kind of go together
00:17:35 because I think that at the very least,
00:17:39 if you can understand, if you can look at human behavior
00:17:42 and understand what it is that they want,
00:17:45 then that’s sort of the key enabler
00:17:47 to being able to anticipate what they’ll do in the future.
00:17:50 Because I think that we’re not arbitrary.
00:17:53 We make these decisions that we make,
00:17:55 we act in the way we do
00:17:56 because we’re trying to achieve certain things.
00:17:59 And so I think that’s the relationship between them.
00:18:01 Now, how complicated do these models need to be
00:18:05 in order to be able to understand what people want?
00:18:10 So we’ve gotten a long way in robotics
00:18:15 with something called inverse reinforcement learning,
00:18:17 which is the notion of if someone acts,
00:18:19 demonstrates how they want the thing done.
00:18:22 What is inverse reinforcement learning?
00:18:24 You just briefly said it.
00:18:25 Right, so it’s the problem of take human behavior
00:18:30 and infer reward function from this.
00:18:33 So figure out what it is
00:18:34 that that behavior is optimal with respect to.
00:18:37 And it’s a great way to think
00:18:38 about learning human preferences
00:18:40 in the sense of you have a car and the person can drive it
00:18:45 and then you can say, well, okay,
00:18:46 I can actually learn what the person is optimizing for.
00:18:51 I can learn their driving style,
00:18:53 or you can have people demonstrate
00:18:55 how they want the house clean.
00:18:57 And then you can say, okay, this is,
00:18:59 I’m getting the trade offs that they’re making.
00:19:02 I’m getting the preferences that they want out of this.
00:19:06 And so we’ve been successful in robotics somewhat with this.
00:19:10 And it’s based on a very simple model of human behavior.
00:19:15 It was remarkably simple,
00:19:16 which is that human behavior is optimal
00:19:18 with respect to whatever it is that people want, right?
00:19:22 So you make that assumption
00:19:23 and now you can kind of inverse through.
00:19:24 That’s why it’s called inverse,
00:19:25 well, really inverse optimal control,
00:19:27 but also inverse reinforcement learning.
00:19:30 So this is based on utility maximization in economics.
00:19:36 Back in the forties, von Neumann and Morgenstern
00:19:39 were like, okay, people are making choices
00:19:43 by maximizing utility, go.
00:19:45 And then in the late fifties,
00:19:48 we had Luce and Shepherd come in and say,
00:19:52 people are a little bit noisy and approximate in that process.
00:19:57 So they might choose something kind of stochastically
00:20:01 with probability proportional to
00:20:03 how much utility something has.
00:20:07 So there’s a bit of noise in there.
00:20:09 This has translated into robotics
00:20:11 and something that we call Boltzmann rationality.
00:20:14 So it’s a kind of an evolution
00:20:15 of inverse reinforcement learning
00:20:16 that accounts for human noise.
00:20:19 And we’ve had some success with that too,
00:20:21 for these tasks where it turns out
00:20:23 people act noisily enough that you can’t just do vanilla,
00:20:28 the vanilla version.
00:20:29 You can account for noise
00:20:31 and still infer what they seem to want based on this.
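To make that model concrete, here is a minimal sketch of Boltzmann-rational inverse reinforcement learning over a small, discrete set of candidate reward functions; the linear features, candidate weights, and rationality coefficient beta are illustrative assumptions, not anything from a particular system.

```python
import numpy as np

def boltzmann_irl_posterior(demo_features, candidate_weights, all_traj_features, beta=1.0):
    """Posterior over candidate reward weights given one demonstrated trajectory.

    Boltzmann (noisily rational) model: the person picks trajectory xi with
    probability proportional to exp(beta * R_w(xi)), where R_w(xi) = w . phi(xi).
    """
    log_post = []
    for w in candidate_weights:
        scores = beta * all_traj_features @ w      # reward of every candidate trajectory
        demo_score = beta * demo_features @ w      # reward of the demonstrated one
        log_z = scores.max() + np.log(np.exp(scores - scores.max()).sum())  # log-sum-exp
        log_post.append(demo_score - log_z)        # log P(demo | w), uniform prior over w
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

# Toy usage: two candidate "styles" as weights on [speed, comfort] features.
candidate_weights = np.array([[1.0, 0.1],   # cares mostly about speed
                              [0.2, 1.0]])  # cares mostly about comfort
all_traj_features = np.array([[0.9, 0.2], [0.4, 0.8], [0.1, 1.0]])
demo_features = all_traj_features[1]        # the trajectory the person chose
print(boltzmann_irl_posterior(demo_features, candidate_weights, all_traj_features))
```

The posterior concentrates on whichever weight vector makes the chosen trajectory look best relative to the alternatives, while the noise model keeps some probability on the other hypotheses.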
00:20:36 Then now we’re hitting tasks where that’s not enough.
00:20:39 And because…
00:20:41 What are examples of such tasks?
00:20:43 So imagine you’re trying to control some robot,
00:20:45 that’s fairly complicated.
00:20:47 You’re trying to control a robot arm
00:20:49 because maybe you’re a patient with a motor impairment
00:20:52 and you have this wheelchair mounted arm
00:20:53 and you’re trying to control it around.
00:20:56 Or one task that we’ve looked at with Sergey,
00:21:00 and our students did, is a lunar lander.
00:21:02 So I don’t know if you know this Atari game,
00:21:05 it’s called Lunar Lander.
00:21:06 It’s really hard.
00:21:07 People really suck at landing the thing.
00:21:09 Mostly they just crash it left and right.
00:21:11 Okay, so this is the kind of task we imagine
00:21:14 you’re trying to provide some assistance
00:21:16 to a person operating such a robot
00:21:20 where you want the kind of the autonomy to kick in,
00:21:21 figure out what it is that you’re trying to do
00:21:23 and help you do it.
00:21:25 It’s really hard to do that for, say, Lunar Lander
00:21:30 because people are all over the place.
00:21:32 And so they seem much more noisy than really rational.
00:21:36 That’s an example of a task
00:21:37 where these models are kind of failing us.
00:21:41 And it’s not surprising because
00:21:43 we’re talking about the 40s, utility, late 50s,
00:21:47 sort of noisy.
00:21:48 Then the 70s came and behavioral economics
00:21:52 started being a thing where people were like,
00:21:54 no, no, no, no, no, people are not rational.
00:21:58 People are messy and emotional and irrational
00:22:03 and have all sorts of heuristics
00:22:05 that might be domain specific.
00:22:06 And they’re just a mess.
00:22:08 The mess.
00:22:09 So what does my robot do to understand
00:22:13 what you want?
00:22:14 And it’s very, that’s why it’s complicated.
00:22:18 It’s, you know, for the most part,
00:22:19 we get away with pretty simple models until we don’t.
00:22:23 And then the question is, what do you do then?
00:22:26 And I had days when I wanted to, you know,
00:22:30 pack my bags and go home and switch jobs
00:22:32 because it’s just, it feels really daunting
00:22:35 to make sense of human behavior enough
00:22:37 that you can reliably understand what people want,
00:22:40 especially as, you know,
00:22:41 robot capabilities will continue to get developed.
00:22:44 You’ll get these systems that are more and more capable
00:22:47 of all sorts of things.
00:22:48 And then you really want to make sure
00:22:49 that you’re telling them the right thing to do.
00:22:51 What is that thing?
00:22:52 Well, read it in human behavior.
00:22:56 So if I just sat here quietly
00:22:58 and tried to understand something about you
00:23:00 by listening to you talk,
00:23:02 it would be harder than if I got to say something
00:23:06 and ask you and interact and control.
00:23:08 Can you, can the robot help its understanding of the human
00:23:13 by influencing the behavior by actually acting?
00:23:18 Yeah, absolutely.
00:23:19 So one of the things that’s been exciting to me lately
00:23:23 is this notion that when you try to
00:23:28 think of the robotics problem as,
00:23:31 okay, I have a robot and it needs to optimize
00:23:34 for whatever it is that a person wants it to optimize
00:23:37 as opposed to maybe what a programmer said.
00:23:40 That problem we think of as a human robot
00:23:44 collaboration problem in which both agents get to act
00:23:49 in which the robot knows less than the human
00:23:52 because the human actually has access to,
00:23:54 you know, at least implicitly to what it is that they want.
00:23:57 They can’t write it down, but they can talk about it.
00:24:00 They can give all sorts of signals.
00:24:02 They can demonstrate and,
00:24:04 but the robot doesn’t need to sit there
00:24:06 and passively observe human behavior
00:24:08 and try to make sense of it.
00:24:10 The robot can act too.
00:24:11 And so there’s these information gathering actions
00:24:15 that the robot can take to sort of solicit responses
00:24:19 that are actually informative.
00:24:21 So for instance, this is not for the purpose
00:24:22 of assisting people, but kind of going back to coordinating
00:24:25 with people in cars and all of that.
00:24:27 One thing that Dorsa did was,
00:24:31 so we were looking at cars being able to navigate
00:24:34 around people and you might not know exactly
00:24:39 the driving style of a particular individual
00:24:41 that’s next to you,
00:24:43 but you wanna change lanes in front of them.
00:24:45 Navigating around other humans inside cars.
00:24:48 Yeah, good, good clarification question.
00:24:50 So you have an autonomous car and it’s trying to navigate
00:24:55 the road around human driven vehicles.
00:24:58 Similar ideas apply to pedestrians as well,
00:25:01 but let’s just take human driven vehicles.
00:25:03 So now you’re trying to change a lane.
00:25:06 Well, you could be trying to infer the driving style
00:25:10 of this person next to you.
00:25:12 You’d like to know if they’re in particular,
00:25:13 if they’re sort of aggressive or defensive,
00:25:15 if they’re gonna let you kind of go in
00:25:18 or if they’re gonna not.
00:25:20 And it’s very difficult to just,
00:25:25 if you think that if you wanna hedge your bets
00:25:27 and say, ah, maybe they’re actually pretty aggressive,
00:25:30 I shouldn’t try this.
00:25:31 You kind of end up driving next to them
00:25:33 and driving next to them, right?
00:25:34 And then you don’t know
00:25:36 because you’re not actually getting the observations
00:25:39 that you need.
00:25:40 The way someone drives when they’re next to you
00:25:42 and they just need to go straight
00:25:44 is kind of the same
00:25:45 regardless of whether they’re aggressive or defensive.
00:25:47 And so you need to enable the robot
00:25:51 to reason about how it might actually be able
00:25:54 to gather information by changing the actions
00:25:57 that it’s taking.
00:25:58 And then the robot comes up with these cool things
00:25:59 where it kind of nudges towards you
00:26:02 and then sees if you’re gonna slow down or not.
00:26:05 Then if you slow down,
00:26:06 it sort of updates its model of you
00:26:07 and says, oh, okay, you’re more on the defensive side.
00:26:11 So now I can actually like.
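A toy sketch of that information-gathering idea, assuming a made-up observation model for how an aggressive versus a defensive driver responds to a nudge; the action set, the probabilities, and the information-gain criterion are illustrative, not how any particular system does it.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# Belief over the other driver's style: [aggressive, defensive].
belief = np.array([0.5, 0.5])

# Hypothetical observation model: probability the other driver slows down,
# given the robot's action, under each style. Staying in the lane is
# uninformative; nudging toward their lane produces different responses.
p_slow = {
    "stay":  np.array([0.5, 0.5]),
    "nudge": np.array([0.2, 0.9]),
}

def expected_info_gain(belief, action):
    """Expected reduction in belief entropy after observing slow / not-slow."""
    gain = 0.0
    for likelihood in (p_slow[action], 1.0 - p_slow[action]):
        p_obs = float(np.dot(belief, likelihood))   # marginal probability of the observation
        posterior = belief * likelihood / p_obs     # Bayes update of the style belief
        gain += p_obs * (entropy(belief) - entropy(posterior))
    return gain

best = max(p_slow, key=lambda a: expected_info_gain(belief, a))
print(best)  # "nudge": probing reveals more about the driver than staying put
```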
00:26:12 That’s a fascinating dance.
00:26:14 That’s so cool that you could use your own actions
00:26:18 to gather information.
00:26:19 That feels like a totally open,
00:26:22 exciting new world of robotics.
00:26:24 I mean, how many people are even thinking
00:26:26 about that kind of thing?
00:26:28 A handful of us, I’d say.
00:26:30 It’s rare because it’s actually leveraging humans.
00:26:33 I mean, most roboticists,
00:26:34 I’ve talked to a lot of colleagues and so on,
00:26:38 are kind of, being honest, kind of afraid of humans.
00:26:42 Because they’re messy and complicated, right?
00:26:45 I understand.
00:26:47 Going back to what we were talking about earlier,
00:26:49 right now we’re kind of in this dilemma of, okay,
00:26:52 there are tasks that we can just assume
00:26:54 people are approximately rational for
00:26:55 and we can figure out what they want.
00:26:57 We can figure out their goals.
00:26:57 We can figure out their driving styles, whatever.
00:26:59 Cool.
00:27:00 There are these tasks that we can’t.
00:27:02 So what do we do, right?
00:27:03 Do we pack our bags and go home?
00:27:06 And on this one, I’ve had a little bit of hope recently.
00:27:12 And I’m kind of doubting myself
00:27:13 because what do I know that, you know,
00:27:15 50 years of behavioral economics hasn’t figured out.
00:27:19 But maybe it’s not really in contradiction
00:27:21 with the way that field is headed.
00:27:23 But basically one thing that we’ve been thinking about is,
00:27:27 instead of kind of giving up and saying
00:27:30 people are too crazy and irrational
00:27:32 for us to make sense of them,
00:27:34 maybe we can give them a bit of the benefit of the doubt.
00:27:39 And maybe we can think of them
00:27:41 as actually being relatively rational,
00:27:43 but just under different assumptions about the world,
00:27:48 about how the world works, about, you know,
00:27:51 they don’t have, when we think about rationality,
00:27:54 the implicit assumption is, oh, they’re rational,
00:27:56 under all the same assumptions and constraints
00:27:58 as the robot, right?
00:27:59 What, if this is the state of the world,
00:28:01 that’s what they know.
00:28:02 This is the transition function, that’s what they know.
00:28:05 This is the horizon, that’s what they know.
00:28:07 But maybe this difference,
00:28:11 the reason they can seem a little messy
00:28:13 and hectic, especially to robots,
00:28:16 is that perhaps they just make different assumptions
00:28:20 or have different beliefs.
00:28:21 Yeah, I mean, that’s another fascinating idea
00:28:24 that this, our kind of anecdotal desire
00:28:29 to say that humans are irrational,
00:28:31 perhaps grounded in behavioral economics,
00:28:33 is that we just don’t understand the constraints
00:28:36 and the rewards under which they operate.
00:28:38 And so our goal shouldn’t be to throw our hands up
00:28:40 and say they’re irrational,
00:28:42 it’s to say, let’s try to understand
00:28:44 what are the constraints.
00:28:46 What it is that they must be assuming
00:28:48 that makes this behavior make sense.
00:28:51 Good life lesson, right?
00:28:52 Good life lesson.
00:28:53 That’s true, it’s just outside of robotics.
00:28:55 That’s just good to, that’s communicating with humans.
00:28:58 That’s just a good assume
00:29:00 that you just don’t, sort of empathy, right?
00:29:03 It’s a…
00:29:04 This is maybe there’s something you’re missing
00:29:06 and it’s, you know, it especially happens to robots
00:29:08 cause they’re kind of dumb and they don’t know things.
00:29:10 And oftentimes people are sort of supra rational
00:29:12 in that they actually know a lot of things
00:29:14 that robots don’t.
00:29:15 Sometimes like with the lunar lander,
00:29:17 the robot, you know, knows much more.
00:29:20 So it turns out that if you try to say,
00:29:23 look, maybe people are operating this thing
00:29:26 but assuming a much more simplified physics model
00:29:31 cause they don’t get the complexity of this kind of craft
00:29:33 or the robot arm with seven degrees of freedom
00:29:36 with these inertias and whatever.
00:29:38 So maybe they have this intuitive physics model
00:29:41 which is not, you know, this notion of intuitive physics
00:29:44 is something that’s studied actually in cognitive science,
00:29:46 like Josh Tenenbaum’s and Tom Griffiths’ work on this stuff.
00:29:49 And what we found is that you can actually try
00:29:54 to figure out what physics model
00:29:58 kind of best explains human actions.
00:30:01 And then you can use that to sort of correct what it is
00:30:06 that they’re commanding the craft to do.
00:30:08 So they might, you know, be sending the craft somewhere
00:30:11 but instead of executing that action,
00:30:13 you can sort of take a step back and say,
00:30:15 according to their intuitive,
00:30:16 if the world worked according to their intuitive physics
00:30:20 model, where do they think that the craft is going?
00:30:23 Where are they trying to send it to?
00:30:26 And then you can use the real physics, right?
00:30:28 The inverse of that to actually figure out
00:30:30 what you should do so that you do that
00:30:31 instead of where they were actually sending you
00:30:33 in the real world.
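A rough sketch of that correction idea, with made-up linear dynamics standing in for both the true physics and the person's simpler internal model; the function names and numbers are hypothetical.

```python
import numpy as np

def predicted_next_state(state, action, model):
    """Hypothetical linear dynamics; 'model' is just an (A, B) pair here."""
    A, B = model
    return A @ state + B @ action

def correct_command(state, user_action, internal_model, true_model, candidate_actions):
    """Re-target the user's command through their assumed internal physics.

    1. Forward-simulate the user's action under THEIR simplified model to get
       the state they intend to reach.
    2. Pick the action that, under the TRUE dynamics, lands closest to that intent.
    """
    intended = predicted_next_state(state, user_action, internal_model)
    return min(candidate_actions,
               key=lambda a: np.linalg.norm(predicted_next_state(state, a, true_model) - intended))

# Toy usage with a 2D state and a 1D thrust command (numbers are illustrative).
true_model = (np.eye(2), np.array([[0.5], [1.5]]))
internal_model = (np.eye(2), np.array([[0.5], [1.0]]))   # the person underestimates thrust
candidates = [np.array([a]) for a in np.linspace(0.0, 2.0, 21)]
print(correct_command(np.zeros(2), np.array([1.0]), internal_model, true_model, candidates))
```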
00:30:34 And I kid you not, it works, people land the damn thing
00:30:38 and you know, in between the two flags and all that.
00:30:42 So it’s not conclusive in any way
00:30:45 but I’d say it’s evidence that yeah,
00:30:47 maybe we’re kind of underestimating humans in some ways
00:30:50 when we’re giving up and saying,
00:30:51 yeah, they’re just crazy noisy.
00:30:53 So then you explicitly try to model
00:30:56 the kind of worldview that they have.
00:30:58 That they have, that’s right.
00:30:59 That’s right.
00:31:00 And it’s not too, I mean,
00:31:02 there’s things in behavioral economics too
00:31:03 that for instance have touched upon the planning horizon.
00:31:06 So there’s this idea that there’s bounded rationality
00:31:09 essentially and the idea that, well,
00:31:11 maybe we work under computational constraints.
00:31:13 And I think kind of our view recently has been
00:31:17 take the Bellman update in AI
00:31:19 and just break it in all sorts of ways by saying state,
00:31:22 no, no, no, the person doesn’t get to see the real state.
00:31:25 Maybe they’re estimating somehow.
00:31:26 Transition function, no, no, no, no, no.
00:31:28 Even the actual reward evaluation,
00:31:31 maybe they’re still learning
00:31:32 about what it is that they want.
00:31:34 Like, you know, when you watch Netflix
00:31:37 and you know, you have all the things
00:31:39 and then you have to pick something,
00:31:41 imagine that, you know, the AI system interpreted
00:31:46 that choice as this is the thing you prefer to see.
00:31:48 Like, how are you going to know?
00:31:49 You’re still trying to figure out what you like,
00:31:51 what you don’t like, et cetera.
00:31:52 So I think it’s important to also account for that.
00:31:55 So it’s not irrationality,
00:31:56 because they’re doing the right thing
00:31:58 under the things that they know.
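One way to sketch that "broken" Bellman update, purely as an illustration: the person plans as if their own estimated state, transition model, reward, and horizon were the true ones,

$$
Q_H(s, a) = \hat{R}_H(s, a) + \gamma \sum_{s'} \hat{T}_H(s' \mid s, a)\,\max_{a'} Q_H(s', a'),
$$

where $\hat{T}_H$ and $\hat{R}_H$ are the human's assumed transition and reward models, and $s$ may itself be only their estimate of the state over a possibly shorter horizon, so behavior that looks irrational under the true dynamics and reward can still be the right thing to do under the models they actually hold.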
00:31:59 Yeah, that’s brilliant.
00:32:01 You mentioned recommender systems.
00:32:03 What kind of, and we were talking
00:32:05 about human robot interaction,
00:32:07 what kind of problem spaces are you thinking about?
00:32:10 So is it robots, like wheeled robots
00:32:14 with autonomous vehicles?
00:32:16 Is it object manipulation?
00:32:18 Like when you think
00:32:19 about human robot interaction in your mind,
00:32:21 and maybe I’m sure you can speak
00:32:24 for the entire community of human robot interaction.
00:32:27 But like, what are the problems of interest here?
00:32:30 And does it, you know, I kind of think
00:32:34 of open domain dialogue as human robot interaction,
00:32:40 and that happens not in the physical space,
00:32:43 but it could just happen in the virtual space.
00:32:46 So where’s the boundaries of this field for you
00:32:49 when you’re thinking about the things
00:32:50 we’ve been talking about?
00:32:51 Yeah, so I try to find kind of underlying,
00:33:00 I don’t know what to even call them.
00:33:02 I try to work on, you know, I might call what I do,
00:33:05 the kind of working on the foundations
00:33:07 of algorithmic human robot interaction
00:33:09 and trying to make contributions there.
00:33:12 And it’s important to me that whatever we do
00:33:15 is actually somewhat domain agnostic when it comes to,
00:33:19 is it about, you know, autonomous cars
00:33:23 or is it about quadrotors or is it about,
00:33:27 is this sort of the same underlying principles apply?
00:33:30 Of course, when you’re trying to get
00:33:31 a particular domain to work,
00:33:32 you usually have to do some extra work
00:33:34 to adapt that to that particular domain.
00:33:36 But these things that we were talking about around,
00:33:40 well, you know, how do you model humans?
00:33:42 It turns out that a lot of systems
00:33:44 can benefit from a better understanding
00:33:47 of how human behavior relates to what people want
00:33:50 and need to predict human behavior,
00:33:53 physical robots of all sorts and beyond that.
00:33:56 And so I used to do manipulation.
00:33:58 I used to be, you know, picking up stuff
00:34:00 and then I was picking up stuff with people around.
00:34:03 And now it’s sort of very broad
00:34:05 when it comes to the application level,
00:34:07 but in a sense, very focused on, okay,
00:34:11 how does the problem need to change?
00:34:14 How do the algorithms need to change
00:34:15 when we’re not doing a robot by itself?
00:34:19 You know, emptying the dishwasher,
00:34:21 but we’re stepping outside of that.
00:34:23 A thought that popped into my head just now.
00:34:26 On the game theoretic side,
00:34:27 I think you said this really interesting idea
00:34:29 of using actions to gain more information.
00:34:33 But if we think of sort of game theory,
00:34:39 the humans that are interacting with you,
00:34:43 with you, the robot?
00:34:44 Wow, I’m taking the identity of the robot.
00:34:46 Yeah, I do that all the time.
00:34:47 Yeah, it’s that they also have a world model of you
00:34:55 and you can manipulate that.
00:34:57 I mean, if we look at autonomous vehicles,
00:34:59 people have a certain viewpoint.
00:35:01 You said with the kids, people see Alexa in a certain way.
00:35:07 Is there some value in trying to also optimize
00:35:10 how people see you as a robot?
00:35:15 Or is that a little too far away from the specifics
00:35:20 of what we can solve right now?
00:35:21 So, well, both, right?
00:35:24 So it’s really interesting.
00:35:26 And we’ve seen a little bit of progress on this problem,
00:35:30 on pieces of this problem.
00:35:32 So you can, again, it kind of comes down
00:35:36 to how complicated does the human model need to be?
00:35:38 But in one piece of work that we were looking at,
00:35:42 we just said, okay, there’s these parameters
00:35:46 that are internal to the robot
00:35:47 and what the robot is about to do,
00:35:51 or maybe what objective,
00:35:52 what driving style the robot has or something like that.
00:35:55 And what we’re gonna do is we’re gonna set up a system
00:35:58 where part of the state is the person’s belief
00:36:00 over those parameters.
00:36:02 And now when the robot acts,
00:36:05 that the person gets new evidence
00:36:07 about this robot internal state.
00:36:10 And so they’re updating their mental model of the robot.
00:36:13 So if they see a car that sort of cuts someone off,
00:36:16 they’re like, oh, that’s an aggressive car.
00:36:18 They know more.
00:36:20 If they see sort of a robot head towards a particular door,
00:36:24 they’re like, oh yeah, the robot’s trying to get
00:36:25 to that door.
00:36:26 So this thing that we have to do with humans
00:36:27 to try and understand their goals and intentions,
00:36:31 humans are inevitably gonna do that to robots.
00:36:34 And then that raises this interesting question
00:36:36 that you asked, which is, can we do something about that?
00:36:38 This is gonna happen inevitably,
00:36:40 but we can sort of be more confusing
00:36:42 or less confusing to people.
00:36:44 And it turns out you can optimize
00:36:45 for being more informative and less confusing
00:36:48 if you have an understanding of how your actions
00:36:51 are being interpreted by the human,
00:36:53 and how they’re using these actions to update their belief.
00:36:56 And honestly, all we did is just Bayes rule.
00:36:59 Basically, okay, the person has a belief,
00:37:02 they see an action, they make some assumptions
00:37:04 about how the robot generates its actions,
00:37:06 presumably as being rational,
00:37:07 because robots are rational.
00:37:09 It’s reasonable to assume that about them.
00:37:11 And then they incorporate that new piece of evidence
00:37:17 in the Bayesian sense in their belief,
00:37:19 and they obtain a posterior.
00:37:20 And now the robot is trying to figure out
00:37:23 what actions to take such that it steers
00:37:25 the person’s belief to put as much probability mass
00:37:27 as possible on the correct parameters.
00:37:31 So that’s kind of a mathematical formalization of that.
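A compact sketch of that formalization, assuming the noisily rational observer model just described:

$$
b'(\theta) \propto b(\theta)\,P(a_R \mid s, \theta), \qquad
P(a_R \mid s, \theta) \propto \exp\!\big(\beta\,Q_\theta(s, a_R)\big),
$$

where $b$ is the person's belief over the robot's internal parameters $\theta$, the likelihood is their model of how a robot with those parameters would act, and the robot then chooses $a_R$ so that the posterior $b'$ puts as much probability mass as possible on its true parameters.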
00:37:33 But my worry, and I don’t know if you wanna go there
00:37:38 with me, but I talk about this quite a bit.
00:37:44 The kids talking to Alexa disrespectfully worries me.
00:37:49 I worry in general about human nature.
00:37:52 Like I said, I grew up in Soviet Union, World War II,
00:37:54 I’m a Jew too, so with the Holocaust and everything.
00:37:58 I just worry about how we humans sometimes treat the other,
00:38:02 the group that we call the other, whatever it is.
00:38:05 Through human history, the group that’s the other
00:38:07 has been changed faces.
00:38:09 But it seems like the robot will be the other, the other,
00:38:13 the next other.
00:38:15 And one thing is it feels to me
00:38:19 that robots don’t get no respect.
00:38:22 They get shoved around.
00:38:23 Shoved around, and is there, one, at the shallow level,
00:38:27 for a better experience, it seems that robots
00:38:29 need to talk back a little bit.
00:38:31 Like my intuition says, I mean, most companies
00:38:35 from sort of Roomba, autonomous vehicle companies
00:38:38 might not be so happy with the idea that a robot
00:38:41 has a little bit of an attitude.
00:38:43 But I feel, it feels to me that that’s necessary
00:38:46 to create a compelling experience.
00:38:48 Like we humans don’t seem to respect anything
00:38:50 that doesn’t give us some attitude.
00:38:52 That, or like a mix of mystery and attitude and anger
00:38:58 and that threatens us subtly, maybe passive aggressively.
00:39:03 I don’t know.
00:39:04 It seems like we humans, yeah, need that.
00:39:08 Do you, what are your, is there something,
00:39:10 you have thoughts on this?
00:39:11 All right, I’ll give you two thoughts on this.
00:39:13 Okay, sure.
00:39:13 One is, we respond to, you know,
00:39:18 someone being assertive, but we also respond
00:39:24 to someone being vulnerable.
00:39:26 So I think robots, my first thought is that
00:39:28 robots get shoved around and bullied a lot
00:39:31 because they’re sort of, you know, tempting
00:39:32 and they’re sort of showing off
00:39:34 or they appear to be showing off.
00:39:35 And so I think going back to these things
00:39:38 we were talking about in the beginning
00:39:39 of making robots a little more, a little more expressive,
00:39:43 a little bit more like, eh, that wasn’t cool to do.
00:39:46 And now I’m bummed, right?
00:39:49 I think that that can actually help
00:39:51 because people can’t help but anthropomorphize
00:39:53 and respond to that.
00:39:54 Even that though, the emotion being communicated
00:39:56 is not in any way a real thing.
00:39:58 And people know that it’s not a real thing
00:40:00 because they know it’s just a machine.
00:40:01 We’re still interpreting, you know, we watch,
00:40:04 there’s this famous psychology experiment
00:40:07 with little triangles and kind of dots on a screen
00:40:11 and a triangle is chasing the square
00:40:12 and you get really angry at the darn triangle
00:40:15 because why is it not leaving the square alone?
00:40:18 So that’s, yeah, we can’t help.
00:40:20 So that was the first thought.
00:40:21 The vulnerability, that’s really interesting that,
00:40:25 I think of like being, pushing back, being assertive
00:40:31 as the only mechanism of getting,
00:40:33 of forming a connection, of getting respect,
00:40:36 but perhaps vulnerability,
00:40:37 perhaps there’s other mechanisms that are less threatening.
00:40:40 Yeah.
00:40:40 Is there?
00:40:41 Well, I think, well, a little bit, yes,
00:40:43 but then this other thing that we can think about is,
00:40:47 it goes back to what you were saying,
00:40:48 that interaction is really game theoretic, right?
00:40:50 So the moment you’re taking actions in a space,
00:40:52 the humans are taking actions in that same space,
00:40:55 but you have your own objective, which is, you know,
00:40:58 you’re a car, you need to get your passenger
00:40:59 to the destination.
00:41:00 And then the human nearby has their own objective,
00:41:03 which somewhat overlaps with you, but not entirely.
00:41:07 You’re not interested in getting into an accident
00:41:09 with each other, but you have different destinations
00:41:11 and you wanna get home faster
00:41:13 and they wanna get home faster.
00:41:14 And that’s a general sum game at that point.
00:41:17 And so that’s, I think that’s what,
00:41:22 treating it as such is kind of a way we can step outside
00:41:25 of this kind of mode that,
00:41:29 where you try to anticipate what people do
00:41:32 and you don’t realize you have any influence over it
00:41:35 while still protecting yourself
00:41:37 because you’re understanding that people also understand
00:41:40 that they can influence you.
00:41:42 And it’s just kind of back and forth is this negotiation,
00:41:45 which is really talking about different equilibria
00:41:49 of a game.
00:41:50 The very basic way to solve coordination
00:41:53 is to just make predictions about what people will do
00:41:55 and then stay out of their way.
00:41:57 And that’s hard for the reasons we talked about,
00:41:59 which is how you have to understand people’s intentions
00:42:02 implicitly, explicitly, who knows,
00:42:05 but somehow you have to get enough of an understanding
00:42:07 of that to be able to anticipate what happens next.
00:42:10 And so that’s challenging.
00:42:11 But then it’s further challenged by the fact
00:42:13 that people change what they do based on what you do
00:42:17 because they don’t plan in isolation either, right?
00:42:21 So when you see cars trying to merge on a highway
00:42:25 and not succeeding, one of the reasons this can be
00:42:27 is because they look at traffic that keeps coming,
00:42:33 they predict what these people are planning on doing,
00:42:35 which is to just keep going,
00:42:37 and then they stay out of the way
00:42:39 because there’s no feasible plan, right?
00:42:42 Any plan would actually intersect
00:42:44 with one of these other people.
00:42:46 So that’s bad, so you get stuck there.
00:42:49 So now kind of if you start thinking about it as no, no, no,
00:42:53 actually these people change what they do
00:42:58 depending on what the car does.
00:42:59 Like if the car actually tries to kind of inch itself forward,
00:43:03 they might actually slow down and let the car in.
00:43:07 And now taking advantage of that,
00:43:10 well, that’s kind of the next level.
00:43:13 We call this like this underactuated system idea
00:43:16 where it’s kind of underactuated system robotics,
00:43:18 but it’s kind of, you influence
00:43:22 these other degrees of freedom,
00:43:23 but you don’t get to decide what they do.
00:43:25 I’ve somewhere seen you mention it,
00:43:28 the human element in this picture as underactuated.
00:43:32 So you understand underactuated robotics
00:43:35 is that you can’t fully control the system.
00:43:41 You can’t go in arbitrary directions
00:43:43 in the configuration space.
00:43:44 Under your control.
00:43:46 Yeah, it’s a very simple way of underactuation
00:43:48 where basically there’s literally these degrees of freedom
00:43:51 that you can control,
00:43:52 and these degrees of freedom that you can’t,
00:43:53 but you influence them.
00:43:54 And I think that’s the important part
00:43:55 is that they don’t do whatever, regardless of what you do,
00:43:59 that what you do influences what they end up doing.
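That underactuation view can be sketched, with illustrative notation, as

$$
x_{t+1} = f\big(x_t, u^R_t, u^H_t\big), \qquad u^H_t \sim \pi_H\big(\cdot \mid x_t, u^R_t\big),
$$

where the robot picks its own controls $u^R$ directly but only influences the human's degrees of freedom $u^H$ through however the human policy $\pi_H$ responds to the state and to what the robot does.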
00:44:02 I just also like the poetry of calling human robot
00:44:05 interaction an underactuated robotics problem.
00:44:09 And you also mentioned sort of nudging.
00:44:11 It seems that they’re, I don’t know.
00:44:14 I think about this a lot in the case of pedestrians
00:44:16 I’ve collected hundreds of hours of videos.
00:44:18 I like to just watch pedestrians.
00:44:21 And it seems that.
00:44:22 It’s a funny hobby.
00:44:24 Yeah, it’s weird.
00:44:25 Cause I learn a lot.
00:44:27 I learned a lot about myself,
00:44:28 about our human behavior, from watching pedestrians,
00:44:32 watching people in their environment.
00:44:35 Basically crossing the street
00:44:37 is like you’re putting your life on the line.
00:45:41 I don’t know, tens of millions of times in America every day,
00:45:44 people are just like playing this weird game of chicken
00:44:48 when they cross the street,
00:44:49 especially when there’s some ambiguity
00:44:51 about the right of way.
00:44:54 That has to do either with the rules of the road
00:44:56 or with the general personality of the intersection
00:44:59 based on the time of day and so on.
00:45:02 And this nudging idea,
00:45:05 it seems that people don’t even nudge.
00:45:07 They just aggressively take, make a decision.
00:45:10 Somebody, there’s a runner that gave me this advice.
00:45:14 I sometimes run in the street,
00:45:17 not in the street, on the sidewalk.
00:45:18 And he said that if you don’t make eye contact with people
00:45:22 when you’re running, they will all move out of your way.
00:45:25 It’s called civil inattention.
00:45:27 Civil inattention, that’s a thing.
00:45:29 Oh wow, I need to look this up, but it works.
00:45:32 What is that?
00:45:32 My sense was if you communicate like confidence
00:45:37 in your actions that you’re unlikely to deviate
00:45:41 from the action that you’re following,
00:45:43 that’s a really powerful signal to others
00:45:44 that they need to plan around your actions.
00:45:47 As opposed to nudging where you’re sort of hesitantly,
00:45:50 then the hesitation might communicate
00:45:53 that you’re still in the dance and the game
00:45:56 that they can influence with their own actions.
00:45:59 I’ve recently had a conversation with Jim Keller,
00:46:03 who’s sort of this legendary chip architect,
00:46:08 but he also led the autopilot team for a while.
00:46:12 And his intuition is that driving is fundamentally
00:46:16 still like a ballistics problem.
00:46:18 Like you can ignore the human element
00:46:22 that it’s just not hitting things.
00:46:24 And you can kind of learn the right dynamics
00:46:26 required to do the merger and all those kinds of things.
00:46:29 And then my sense is, and I don’t know if I can provide
00:46:32 sort of definitive proof of this,
00:46:34 but my sense is it’s like an order of magnitude
00:46:38 more difficult when humans are involved.
00:46:41 Like it’s not simply object collision avoidance problem.
00:46:48 Where does your intuition,
00:46:49 of course, nobody knows the right answer here,
00:46:51 but where does your intuition fall on the difficulty,
00:46:54 fundamental difficulty of the driving problem
00:46:57 when humans are involved?
00:46:58 Yeah, good question.
00:47:00 I have many opinions on this.
00:47:03 Imagine downtown San Francisco.
00:47:07 Yeah, it’s crazy, busy, everything.
00:47:10 Okay, now take all the humans out.
00:47:12 No pedestrians, no human driven vehicles,
00:47:15 no cyclists, no people on little electric scooters
00:47:18 zipping around, nothing.
00:47:19 I think we’re done.
00:47:21 I think driving at that point is done.
00:47:23 We’re done.
00:47:25 There’s nothing really that still needs
00:47:27 to be solved about that.
00:47:28 Well, let’s pause there.
00:47:30 I think I agree with you and I think a lot of people
00:47:34 that will hear will agree with that,
00:47:37 but we need to sort of internalize that idea.
00:47:41 So what’s the problem there?
00:47:42 Cause we might not quite yet be done with that.
00:47:45 Cause a lot of people kind of focus
00:47:46 on the perception problem.
00:47:48 A lot of people kind of map autonomous driving
00:47:52 into how close are we to solving,
00:47:55 being able to detect all the, you know,
00:47:57 the drivable area, the objects in the scene.
00:48:02 Do you see that as a, how hard is that problem?
00:48:07 So your intuition there behind your statement
00:48:09 was we might have not solved it yet,
00:48:11 but we’re close to solving basically the perception problem.
00:48:14 I think the perception problem, I mean,
00:48:17 and by the way, a bunch of years ago,
00:48:19 this would not have been true.
00:48:21 And a lot of issues in the space were coming
00:48:24 from the fact that, oh, we don’t really, you know,
00:48:27 we don’t know what’s where.
00:48:29 But I think it’s fairly safe to say that at this point,
00:48:33 although you could always improve on things
00:48:35 and all of that, you can drive through downtown San Francisco
00:48:38 if there are no people around.
00:48:40 There’s no really perception issues
00:48:42 standing in your way there.
00:48:44 I think perception is hard, but yeah, it’s, we’ve made
00:48:47 a lot of progress on the perception,
00:48:49 so I hate to undermine the difficulty of the problem.
00:48:50 I think everything about robotics is really difficult,
00:48:53 of course, I think that, you know, the planning problem,
00:48:57 the control problem, all very difficult,
00:48:59 but I think what’s, what makes it really kind of, yeah.
00:49:03 It might be, I mean, you know,
00:49:05 and I picked downtown San Francisco,
00:49:07 it’s adapting to, well, now it’s snowing,
00:49:11 now it’s no longer snowing, now it’s slippery in this way,
00:49:14 now it’s the dynamics part could,
00:49:16 I could imagine being still somewhat challenging, but.
00:49:24 No, the thing that I think worries us,
00:49:26 and our intuition’s not good there,
00:49:27 is the perception problem at the edge cases.
00:49:31 Sort of downtown San Francisco, the nice thing,
00:49:35 it’s not actually, it may not be a good example because.
00:49:39 Because you know what you’re getting?
00:49:41 well, there’s like crazy construction zones
00:49:43 and all of that. Yeah, but the thing is,
00:49:44 you’re traveling at slow speeds,
00:49:46 so like it doesn’t feel dangerous.
00:49:47 To me, what feels dangerous is highway speeds,
00:49:51 when everything is, to us humans, super clear.
00:49:54 Yeah, I’m assuming LiDAR here, by the way.
00:49:57 I think it’s kind of irresponsible to not use LiDAR.
00:49:59 That’s just my personal opinion.
00:50:02 That’s, I mean, depending on your use case,
00:50:04 but I think like, you know, if you have the opportunity
00:50:07 to use LiDAR, in a lot of cases, you might not.
00:50:11 Good, your intuition makes more sense now.
00:50:13 So you don’t think vision.
00:50:15 I really just don’t know enough to say,
00:50:18 well, vision alone, what, you know, what’s like,
00:50:21 there’s a lot of, how many cameras do you have?
00:50:24 Is it, how are you using them?
00:50:25 I don’t know. There’s details.
00:50:26 There’s all, there’s all sorts of details.
00:50:28 I imagine there’s stuff that’s really hard
00:50:30 to actually see, you know, how do you deal with glare,
00:50:33 exactly what you were saying,
00:50:34 stuff that people would see that you don’t.
00:50:37 I think I have, more of my intuition comes from systems
00:50:40 that can actually use LiDAR as well.
00:50:44 Yeah, and until we know for sure,
00:50:45 it makes sense to be using LiDAR.
00:50:48 That’s kind of the safety focus.
00:50:50 But then the sort of the,
00:50:52 I also sympathize with the Elon Musk statement
00:50:55 of LiDAR is a crutch.
00:50:57 It’s a fun notion to think that the things that work today
00:51:04 is a crutch for the invention of the things
00:51:08 that will work tomorrow, right?
00:51:09 Like it, it’s kind of true in the sense that if,
00:51:15 you know, we want to stick to the comfort zone,
00:51:17 you see this in academic and research settings
00:51:19 all the time, the things that work force you
00:51:22 to not explore outside, think outside the box.
00:51:25 I mean, that happens all the time.
00:51:26 The problem is in the safety critical systems,
00:51:29 you kind of want to stick with the things that work.
00:51:32 So it’s an interesting and difficult trade off
00:51:34 in the case of real world sort of safety critical
00:51:38 robotic systems, but so your intuition is,
00:51:44 just to clarify, how, I mean,
00:51:48 how hard is this human element for,
00:51:51 like how hard is driving
00:51:52 when this human element is involved?
00:51:55 Are we years, decades away from solving it?
00:52:00 But perhaps actually the year isn’t the thing I’m asking.
00:52:03 It doesn’t matter what the timeline is,
00:52:05 but do you think we’re, how many breakthroughs
00:52:09 are we away from in solving
00:52:12 the human robotic interaction problem
00:52:13 to get this, to get this right?
00:52:15 I think it, in a sense, it really depends.
00:52:20 I think that, you know, we were talking about how,
00:52:24 well, look, it’s really hard
00:52:25 because anticipate what people do is hard.
00:52:27 And on top of that, playing the game is hard.
00:52:30 But I think we sort of have the fundamental,
00:52:35 some of the fundamental understanding for that.
00:52:38 And then you already see that these systems
00:52:41 are being deployed in the real world,
00:52:45 you know, even driverless.
00:52:47 Like there’s, I think now a few companies
00:52:50 that don’t have a driver in the car in some small areas.
00:52:55 I got a chance to, I went to Phoenix and I,
00:52:59 I shot a video with Waymo and I needed to get
00:53:03 that video out.
00:53:04 People have been giving me flak,
00:53:06 but there’s incredible engineering work being done there.
00:53:09 And it’s one of those other seminal moments
00:53:11 for me in my life to be able to, it sounds silly,
00:53:13 but to be able to ride in a car
00:53:17 without a driver in the seat.
00:53:19 I mean, that was incredible robotics.
00:53:22 I was driven by a robot without being able to take over,
00:53:27 without being able to take the steering wheel.
00:53:31 That’s a magical, that’s a magical moment.
00:53:33 So in that regard, in those domains,
00:53:35 at least for like Waymo, they’re solving that human,
00:53:39 there’s, I mean, they’re going, I mean, it felt fast
00:53:43 because you’re like freaking out at first.
00:53:45 That was, this is my first experience,
00:53:47 but it’s going like the speed limit, right?
00:53:49 30, 40, whatever it is.
00:53:51 And there’s humans and it deals with them quite well.
00:53:53 It detects them, it negotiates the intersections,
00:53:57 the left turns and all of that.
00:53:58 So at least in those domains, it’s solving them.
00:54:01 The open question for me is like, how quickly can we expand?
00:54:06 You know, that’s the, you know,
00:54:08 outside of the weather conditions,
00:54:10 all of those kinds of things,
00:54:11 how quickly can we expand to like cities like San Francisco?
00:54:14 Yeah, and I wouldn’t say that it’s just, you know,
00:54:17 now it’s just pure engineering and it’s probably the,
00:54:20 I mean, and by the way,
00:54:22 I’m speaking kind of very generally here as hypothesizing,
00:54:26 but I think that there are successes
00:54:31 and yet no one is everywhere out there.
00:54:34 So that seems to suggest that things can be expanded
00:54:38 and can be scaled and we know how to do a lot of things,
00:54:41 but there’s still probably, you know,
00:54:44 new algorithms or modified algorithms
00:54:46 that you still need to put in there
00:54:49 as you learn more and more about new challenges
00:54:53 that you get faced with.
00:54:55 How much of this problem do you think can be learned
00:54:58 through end to end?
00:54:59 Is it the success of machine learning
00:55:00 and reinforcement learning?
00:55:02 How much of it can be learned from sort of data
00:55:05 from scratch and how much,
00:55:07 which most of the success of autonomous vehicle systems
00:55:10 have a lot of heuristics and rule based stuff on top,
00:55:14 like human expertise injected forced into the system
00:55:19 to make it work.
00:55:20 What’s your sense?
00:55:22 How much, what will be the role of learning
00:55:26 in the near term and long term?
00:55:28 I think on the one hand that learning is inevitable here,
00:55:36 right?
00:55:37 I think on the other hand that when people characterize
00:55:39 the problem as it’s a bunch of rules
00:55:42 that some people wrote down,
00:55:44 versus it’s an end to end RL system or imitation learning,
00:55:49 then maybe there’s kind of something missing
00:55:53 from both of those characterizations.
00:55:57 So for instance, I think a very, very useful tool
00:56:02 in this sort of problem,
00:56:04 both in how to generate the car’s behavior
00:56:07 and robots in general and how to model human beings
00:56:11 is actually planning, search optimization, right?
00:56:15 So robotics is the sequential decision making problem.
00:56:18 And when a robot can figure out on its own
00:56:26 how to achieve its goal without hitting stuff
00:56:28 and all that stuff, right?
00:56:30 All the good stuff for motion planning 101,
00:56:33 I think of that as very much AI,
00:56:36 not this is some rule or something.
00:56:38 There’s nothing rule based around that, right?
00:56:40 It’s just, you’re searching through a space,
00:56:42 or you’re optimizing through a space,
00:56:43 and figuring out what seems to be the right thing to do.
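To make the “searching through a space” point concrete, here is a minimal sketch of motion planning 101 as pure search, with the grid, the obstacles, and the start and goal all made up for illustration:

```python
from collections import deque

def plan(start, goal, obstacles, width=10, height=10):
    """Breadth-first search over grid states: a toy 'motion planning 101' solver.

    States are (x, y) cells; the robot may move in four directions and must
    avoid the obstacle cells. Returns a list of cells from start to goal,
    or None if the goal is unreachable.
    """
    frontier = deque([start])
    came_from = {start: None}          # also serves as the visited set
    while frontier:
        current = frontier.popleft()
        if current == goal:
            # Reconstruct the path by walking parents back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return list(reversed(path))
        x, y = current
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            in_bounds = 0 <= nx < width and 0 <= ny < height
            if in_bounds and nxt not in obstacles and nxt not in came_from:
                came_from[nxt] = current
                frontier.append(nxt)
    return None

# Toy usage: plan around a small wall of obstacle cells.
print(plan(start=(0, 0), goal=(5, 5), obstacles={(2, 2), (2, 3), (2, 4)}))
```

Nothing here is rule based in the sense of hand-written behaviors; the behavior falls out of searching the space for something that reaches the goal without hitting anything.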
00:56:47 And I think it’s hard to just do that
00:56:49 because you need to learn models of the world.
00:56:52 And I think it’s hard to just do the learning part
00:56:55 where you don’t bother with any of that,
00:56:58 because then you’re saying, well, I could do imitation,
00:57:01 but then when I go off distribution, I’m really screwed.
00:57:04 Or you can say, I can do reinforcement learning,
00:57:08 which adds a lot of robustness,
00:57:09 but then you have to do either reinforcement learning
00:57:12 in the real world, which sounds a little challenging
00:57:15 or that trial and error, you know,
00:57:18 or you have to do reinforcement learning in simulation.
00:57:21 And then that means, well, guess what?
00:57:23 You need to model things, at least to model people,
00:57:27 model the world enough that whatever policy you get out of that
00:57:31 is actually fine to roll out in the world
00:57:34 and do some additional learning there.
00:57:36 So. Do you think simulation, by the way, just a quick tangent
00:57:40 has a role in the human robot interaction space?
00:57:44 Like, is it useful?
00:57:46 It seems like humans, everything we’ve been talking about
00:57:48 are difficult to model and simulate.
00:57:51 Do you think simulation has a role in this space?
00:57:53 I do.
00:57:54 I think so because you can take models
00:57:58 and train with them ahead of time, for instance.
00:58:04 You can.
00:58:06 But the models, sorry to interrupt,
00:58:07 the models are sort of human constructed or learned?
00:58:10 I think they have to be a combination
00:58:14 because if you get some human data and then you say,
00:58:20 this is how, this is gonna be my model of the person.
00:58:22 Whether for simulation and training
00:58:24 or for just deployment time?
00:58:25 And that’s what I’m planning with
00:58:27 as my model of how people work.
00:58:29 Regardless, if you take some data
00:58:33 and you don’t assume anything else and you just say,
00:58:35 okay, this is some data that I’ve collected.
00:58:39 Let me fit a policy to how people work based on that.
00:58:42 What tends to happen is you collected some data
00:58:45 and some distribution, and then now your robot
00:58:50 sort of computes a best response to that, right?
00:58:52 It’s sort of like, what should I do
00:58:54 if this is how people work?
00:58:56 And it easily goes off distribution
00:58:58 where that model that you’ve built of the human
00:59:01 completely sucks because out of distribution,
00:59:03 you have no idea, right?
00:59:05 If you think of all the possible policies
00:59:07 and then you take only the ones that are consistent
00:59:10 with the human data that you’ve observed,
00:59:13 that still leaves a lot open, a lot of things could happen
00:59:15 outside of that distribution where you’re confident
00:59:18 that you know what’s going on.
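A toy sketch of the failure mode being described here: fit a model of the human from logged data, have the robot best-respond to it, and watch the model say nothing useful the moment you step outside the states it was fit on. The states, actions, and fallback behavior are all hypothetical:

```python
from collections import defaultdict, Counter
import numpy as np

# Hypothetical logged data: (state, human_action) pairs from one driving regime.
# States and actions are small integers here; real systems use continuous
# features, but the failure mode is the same.
rng = np.random.default_rng(0)
data = [(int(s), int(s) % 3) for s in rng.integers(0, 10, size=200)]  # humans seen only in states 0..9

# "Model of the person": empirical action frequencies per state.
counts = defaultdict(Counter)
for s, a in data:
    counts[s][a] += 1

def predicted_human_action(state):
    """Most likely human action under the fitted model, plus an OOD flag."""
    if state not in counts:
        return None, True                      # out of distribution: no support here
    action, _ = counts[state].most_common(1)[0]
    return action, False

def robot_best_response(state):
    """Toy best response: pick a robot action keyed to the predicted human action."""
    human_action, ood = predicted_human_action(state)
    if ood:
        # The fitted model says nothing useful here; fall back to something conservative.
        return "slow_down_and_gather_info"
    return f"avoid_action_{human_action}"

print(robot_best_response(4))    # in distribution: best-responds to the fitted model
print(robot_best_response(42))   # off distribution: the model has no idea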
00:59:19 By the way, that’s, I mean, I’ve gotten used
00:59:22 to this terminology of out of distribution,
00:59:25 but it’s such a machine learning terminology
00:59:29 because it kind of assumes,
00:59:30 so distribution is referring to the data
00:59:36 that you’ve seen.
00:59:36 The set of states that you encounter
00:59:38 at training time. They’ve encountered so far
00:59:39 at training time. Yeah.
00:59:40 But it kind of also implies that there’s a nice
00:59:43 like statistical model that represents that data.
00:59:47 So out of distribution feels like, I don’t know,
00:59:50 it raises to me philosophical questions
00:59:54 of how we humans reason out of distribution,
00:59:58 reason about things that are completely,
01:00:01 we haven’t seen before.
01:00:03 And so, and what we’re talking about here is
01:00:05 how do we reason about what other people do
01:00:09 in situations where we haven’t seen them?
01:00:11 And somehow we just magically navigate that.
01:00:14 I can anticipate what will happen in situations
01:00:18 that are even novel in many ways.
01:00:21 And I have a pretty good intuition for,
01:00:22 I don’t always get it right, but you know,
01:00:24 and I might be a little uncertain and so on.
01:00:26 But I think it’s this that if you just rely on data,
01:00:33 you know, there’s just too many possibilities,
01:00:36 there’s too many policies out there that fit the data.
01:00:37 And by the way, it’s not just state,
01:00:39 it’s really kind of history of state,
01:00:40 cause to really be able to anticipate
01:00:41 what the person will do,
01:00:43 it kind of depends on what they’ve been doing so far,
01:00:45 cause that’s the information you need to kind of,
01:00:47 at least implicitly sort of say,
01:00:49 oh, this is the kind of person that this is,
01:00:51 this is probably what they’re trying to do.
01:00:53 So anyway, it’s like you’re trying to map history of states
01:00:55 to actions, there’s many mappings.
01:00:56 And history meaning like the last few seconds
01:00:59 or the last few minutes or the last few months.
01:01:02 Who knows, who knows how much you need, right?
01:01:04 In terms of if your state is really like the positions
01:01:07 of everything or whatnot and velocities,
01:01:09 who knows how much you need.
01:01:10 And then there’s so many mappings.
01:01:14 And so now you’re talking about
01:01:16 how do you regularize that space?
01:01:17 What priors do you impose or what’s the inductive bias?
01:01:21 So, you know, there’s all very related things
01:01:23 to think about it.
01:01:25 Basically, what are assumptions that we should be making
01:01:29 such that these models actually generalize
01:01:32 outside of the data that we’ve seen?
01:01:35 And now you’re talking about, well, I don’t know,
01:01:37 what can you assume?
01:01:38 Maybe you can assume that people like actually
01:01:40 have intentions and that’s what drives their actions.
01:01:43 Maybe that’s, you know, the right thing to do
01:01:46 when you haven’t seen data very nearby
01:01:49 that tells you otherwise.
01:01:51 I don’t know, it’s a very open question.
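One hedged sketch of the “people have intentions” assumption: model the human as noisily rational with respect to some goal, then infer the goal from the history of their actions. The goals, the toy Q function, and the rationality constant beta are illustrative assumptions, not anyone’s actual model:

```python
import numpy as np

goals = np.array([0.0, 5.0, 10.0])     # candidate intentions (target positions)
beta = 2.0                              # rationality: higher = closer to optimal

def q_value(state, action, goal):
    """Negative distance to the goal after the action (a stand-in for a real Q)."""
    return -abs((state + action) - goal)

def action_likelihood(state, action, goal, actions=(-1.0, 0.0, 1.0)):
    """Boltzmann-rational model: P(a | s, g) proportional to exp(beta * Q(s, a, g))."""
    scores = np.array([np.exp(beta * q_value(state, a, goal)) for a in actions])
    return np.exp(beta * q_value(state, action, goal)) / scores.sum()

def infer_goal(history):
    """Bayesian update over goals from a history of (state, action) pairs."""
    posterior = np.ones(len(goals)) / len(goals)          # uniform prior
    for state, action in history:
        likelihoods = np.array([action_likelihood(state, action, g) for g in goals])
        posterior *= likelihoods
        posterior /= posterior.sum()
    return posterior

# A person who keeps stepping right is probably heading for the far goal.
history = [(2.0, 1.0), (3.0, 1.0), (4.0, 1.0)]
print(dict(zip(goals, np.round(infer_goal(history), 3))))
```

The point of a structured prior like this is exactly the one being made: it gives you something to say in states the data never covered.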
01:01:53 Do you think sort of that one of the dreams
01:01:55 of artificial intelligence was to solve
01:01:58 common sense reasoning, whatever the heck that means.
01:02:02 Do you think something like common sense reasoning
01:02:04 has to be solved in part to be able to solve this dance
01:02:09 of human robot interaction, the driving space
01:02:12 or human robot interaction in general?
01:02:14 Do you have to be able to reason about these kinds
01:02:16 of common sense concepts of physics,
01:02:21 of, you know, all the things we’ve been talking about
01:02:27 humans, I don’t even know how to express them with words,
01:02:30 but the basics of human behavior, a fear of death.
01:02:34 So like, to me, it’s really important to encode
01:02:38 in some kind of sense, maybe not, maybe it’s implicit,
01:02:41 but it feels that it’s important to explicitly encode
01:02:44 the fear of death, that people don’t wanna die.
01:02:48 Because it seems silly, but like the game of chicken
01:02:56 that happens with the pedestrian crossing the street
01:02:59 is playing with the idea of mortality.
01:03:03 Like we really don’t wanna die.
01:03:04 It’s not just like a negative reward.
01:03:07 I don’t know, it just feels like all these human concepts
01:03:10 have to be encoded.
01:03:11 Do you share that sense or is this a lot simpler
01:03:14 than I’m making out to be?
01:03:15 I think it might be simpler.
01:03:17 And I’m the person who likes to complicate things.
01:03:18 I think it might be simpler than that.
01:03:21 Because it turns out, for instance,
01:03:24 if you say model people in the very,
01:03:29 I’ll call it traditional, I don’t know if it’s fair
01:03:31 to look at it as a traditional way,
01:03:33 but you know, calling people as,
01:03:35 okay, they’re rational somehow,
01:03:37 the utilitarian perspective.
01:03:40 Well, in that, once you say that,
01:03:45 you automatically capture that they have an incentive
01:03:48 to keep on being.
01:03:50 You know, Stuart likes to say,
01:03:53 you can’t fetch the coffee if you’re dead.
01:03:56 Stuart Russell, by the way.
01:03:59 That’s a good line.
01:04:01 So when you’re sort of treating agents
01:04:05 as having these objectives, these incentives,
01:04:10 humans or artificial, you’re kind of implicitly modeling
01:04:14 that they’d like to stick around
01:04:16 so that they can accomplish those goals.
01:04:20 So I think in a sense,
01:04:22 maybe that’s what draws me so much
01:04:24 to the rationality framework,
01:04:25 even though it’s so broken,
01:04:26 we’ve been able to, it’s been such a useful perspective.
01:04:30 And like we were talking about earlier,
01:04:32 what’s the alternative?
01:04:33 I give up and go home or, you know,
01:04:34 I just use complete black boxes,
01:04:36 but then I don’t know what to assume out of distribution
01:04:37 to come back to this.
01:04:40 It’s just, it’s been a very fruitful way
01:04:42 to think about the problem
01:04:43 in a very more positive way, right?
01:04:47 People aren’t just crazy.
01:04:49 Maybe they make more sense than we think.
01:04:51 But I think we also have to somehow be ready for it
01:04:55 to be wrong, be able to detect
01:04:58 when these assumptions aren’t holding,
01:05:00 be all of that stuff.
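A tiny numerical illustration of the “you can’t fetch the coffee if you’re dead” point: an agent that only maximizes expected task reward already prefers the option that keeps it operating, with no explicit self-preservation term. All the numbers are made up:

```python
# Toy calculation: the incentive to stay operational falls out of the task objective.
TASK_REWARD = 10.0        # reward for fetching the coffee

def expected_reward(p_survive):
    """Expected task reward if the agent stays operational with probability p_survive."""
    return p_survive * TASK_REWARD

risky_route = expected_reward(p_survive=0.6)    # e.g. darting through traffic
safe_route = expected_reward(p_survive=0.99)    # e.g. waiting for the light

print(f"risky: {risky_route:.2f}, safe: {safe_route:.2f}")
# safe > risky: preferring to keep existing emerges from the objective itself.
```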
01:05:02 Let me ask sort of another small side of this
01:05:06 that we’ve been talking about
01:05:07 the pure autonomous driving problem,
01:05:09 but there’s also relatively successful systems
01:05:13 already deployed out there in what you may call
01:05:17 like level two autonomy or semi autonomous vehicles,
01:05:20 whether that’s Tesla Autopilot,
01:05:23 I’ve worked quite a bit with the Cadillac Super Cruise system,
01:05:27 which has a driver facing camera that detects your state.
01:05:31 There’s a bunch of basically lane centering systems.
01:05:35 What’s your sense about this kind of way of dealing
01:05:41 with the human robot interaction problem
01:05:43 by having a really dumb robot
01:05:46 and relying on the human to help the robot out
01:05:50 to keep them both alive?
01:05:53 Is that from the research perspective,
01:05:57 how difficult is that problem?
01:05:59 And from a practical deployment perspective,
01:06:02 is that a fruitful way to approach
01:06:05 this human robot interaction problem?
01:06:08 I think what we have to be careful about there
01:06:12 is to not, it seems like some of these systems,
01:06:16 not all are making this underlying assumption
01:06:19 that if, so I’m a driver and I’m now really not driving,
01:06:25 but supervising and my job is to intervene, right?
01:06:28 And so we have to be careful with this assumption
01:06:31 that when I’m, if I’m supervising,
01:06:36 I will be just as safe as when I’m driving.
01:06:41 That I will, if I wouldn’t get into some kind of accident,
01:06:46 if I’m driving, I will be able to avoid that accident
01:06:50 when I’m supervising too.
01:06:52 And I think I’m concerned about this assumption
01:06:55 from a few perspectives.
01:06:56 So from a technical perspective,
01:06:58 it’s that when you let something kind of take control
01:07:01 and do its thing, and it depends on what that thing is,
01:07:03 obviously, and how much it’s taking control
01:07:05 and how, what things are you trusting it to do.
01:07:07 But if you let it do its thing and take control,
01:07:11 it will go to what we might call off policy
01:07:15 from the person’s perspective state.
01:07:16 So states that the person wouldn’t actually
01:07:18 find themselves in if they were the ones driving.
01:07:22 And the assumption that the person functions
01:07:24 just as well there as they function in the states
01:07:26 that they would normally encounter
01:07:28 is a little questionable.
01:07:30 Now, another part is the kind of the human factor side
01:07:34 of this, which is that I don’t know about you,
01:07:38 but I think I definitely feel like I’m experiencing things
01:07:42 very differently when I’m actively engaged in the task
01:07:45 versus when I’m a passive observer.
01:07:47 Like even if I try to stay engaged, right?
01:07:49 It’s very different than when I’m actually
01:07:51 actively making decisions.
01:07:53 And you see this in life in general.
01:07:55 Like you see students who are actively trying
01:07:58 to come up with the answer, learn this thing better
01:08:00 than when they’re passively told the answer.
01:08:03 I think that’s somewhat related.
01:08:04 And I think people have studied this in human factors
01:08:06 for airplanes.
01:08:07 And I think it’s actually fairly established
01:08:10 that these two are not the same.
01:08:12 So.
01:08:13 On that point, because I’ve gotten a huge amount
01:08:14 of heat on this and I stand by it.
01:08:17 Okay.
01:08:18 Because I know the human factors community well
01:08:22 and the work here is really strong.
01:08:24 And there’s many decades of work showing exactly
01:08:27 what you’re saying.
01:08:28 Nevertheless, I’ve been continuously surprised
01:08:30 that many of the predictions of that work have been wrong
01:08:33 in what I’ve seen.
01:08:35 So what we have to do,
01:08:37 I still agree with everything you said,
01:08:40 but we have to be a little bit more open minded.
01:08:45 So I’ll tell you, there’s a few surprising things
01:08:49 about supervision. Like, everything you said, to the word,
01:08:52 is actually exactly correct.
01:08:54 But it doesn’t say, what you didn’t say
01:08:57 is that these systems are,
01:09:00 you said you can’t assume a bunch of things,
01:09:02 but we don’t know if these systems are fundamentally unsafe.
01:09:06 That’s still unknown.
01:09:08 There’s a lot of interesting things,
01:09:11 like I’m surprised by the fact, not the fact,
01:09:15 that what seems to be anecdotally from,
01:09:18 well, from large data collection that we’ve done,
01:09:21 but also from just talking to a lot of people,
01:09:23 when in the supervisory role of semi autonomous systems
01:09:27 that are sufficiently dumb, at least,
01:09:29 which is, that might be the key element,
01:09:33 is the systems have to be dumb.
01:09:35 The people are actually more energized as observers.
01:09:38 So they’re actually better,
01:09:40 they’re better at observing the situation.
01:09:43 So there might be cases in systems,
01:09:46 if you get the interaction right,
01:09:48 where you, as a supervisor,
01:09:50 will do a better job with the system together.
01:09:53 I agree, I think that is actually really possible.
01:09:56 I guess mainly I’m pointing out that if you do it naively,
01:10:00 you’re implicitly assuming something,
01:10:02 that assumption might actually really be wrong.
01:10:04 But I do think that if you explicitly think about
01:10:09 what the agent should do
01:10:10 so that the person still stays engaged,
01:10:13 so that you essentially empower the person
01:10:16 to do more than they could,
01:10:17 that’s really the goal, right?
01:10:19 Is you still have a driver,
01:10:20 so you wanna empower them to be so much better
01:10:25 than they would be by themselves.
01:10:27 And that’s different, it’s a very different mindset
01:10:29 than I want them to basically not drive, right?
01:10:33 And, but be ready to sort of take over.
01:10:40 So one of the interesting things we’ve been talking about
01:10:42 is the rewards, that they seem to be fundamental to
01:10:47 the way robots behave.
01:10:49 So broadly speaking,
01:10:52 we’ve been talking about utility functions and so on,
01:10:54 but could you comment on how do we approach
01:10:56 the design of reward functions?
01:10:59 Like, how do we come up with good reward functions?
01:11:02 Well, really good question,
01:11:05 because the answer is we don’t.
01:11:10 This was, you know, I used to think,
01:11:13 I used to think about how,
01:11:16 well, it’s actually really hard to specify rewards
01:11:18 for interaction because it’s really supposed to be
01:11:22 what the people want, and then you really, you know,
01:11:25 we talked about how you have to customize
01:11:26 what you wanna do to the end user.
01:11:30 But I kind of realized that even if you take
01:11:36 the interactive component away,
01:11:39 it’s still really hard to design reward functions.
01:11:42 So what do I mean by that?
01:11:43 I mean, if we assume this sort of AI paradigm
01:11:47 in which there’s an agent and its job is to optimize
01:11:51 some objectives, some reward, utility, loss, whatever, cost,
01:11:58 if you write it out, maybe it’s a set,
01:12:00 depending on the situation or whatever it is,
01:12:03 if you write that out and then you deploy the agent,
01:12:06 you’d wanna make sure that whatever you specified
01:12:10 incentivizes the behavior you want from the agent
01:12:14 in any situation that the agent will be faced with, right?
01:12:18 So I do motion planning on my robot arm,
01:12:22 I specify some cost function like, you know,
01:12:25 this is how far away you should try to stay,
01:12:28 so much it matters to stay away from people,
01:12:29 and this is how much it matters to be able to be efficient
01:12:31 and blah, blah, blah, right?
01:12:33 I need to make sure that whatever I specified,
01:12:36 those constraints or trade offs or whatever they are,
01:12:40 that when the robot goes and solves that problem
01:12:43 in every new situation,
01:12:45 that behavior is the behavior that I wanna see.
01:12:47 And what I’ve been finding is
01:12:50 that we have no idea how to do that.
01:12:52 Basically, what I can do is I can sample,
01:12:56 I can think of some situations
01:12:58 that I think are representative of what the robot will face,
01:13:02 and I can tune and add and tune some reward function
01:13:08 until the optimal behavior is what I want
01:13:11 on those situations,
01:13:13 which first of all is super frustrating
01:13:15 because, you know, through the miracle of AI,
01:13:19 we’ve taken, we don’t have to specify rules
01:13:21 for behavior anymore, right?
01:13:22 Like we were saying before,
01:13:24 the robot comes up with the right thing to do,
01:13:27 you plug in this situation,
01:13:28 it optimizes right in that situation, it optimizes,
01:13:31 but you have to spend still a lot of time
01:13:34 on actually defining what it is
01:13:37 that that criteria should be,
01:13:39 making sure you didn’t forget
01:13:40 about 50 bazillion things that are important
01:13:42 and how they all should be combining together
01:13:44 to tell the robot what’s good and what’s bad
01:13:46 and how good and how bad.
01:13:48 And so I think this is a lesson that I don’t know,
01:13:59 kind of, I guess I closed my eyes to it for a while
01:13:59 cause I’ve been, you know,
01:14:00 tuning cost functions for 10 years now,
01:14:03 but it really strikes me that,
01:14:07 yeah, we’ve moved the tuning
01:14:09 and the like designing of features or whatever
01:14:13 from the behavior side into the reward side.
01:14:19 And yes, I agree that there’s way less of it,
01:14:22 but it still seems really hard
01:14:24 to anticipate any possible situation
01:14:26 and make sure you specify a reward function
01:14:30 that when optimized will work well
01:14:32 in every possible situation.
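To make the tuning loop concrete, here is a minimal sketch of a hand-designed trajectory cost for a robot arm moving near a person, with hypothetical features and weights; the designer checks the optimal behavior on a couple of sampled situations and nudges the weights when it looks wrong:

```python
import numpy as np

def features(trajectory, human_position):
    """trajectory: (T, 2) array of waypoints; returns a feature vector."""
    path_length = np.sum(np.linalg.norm(np.diff(trajectory, axis=0), axis=1))
    min_clearance = np.min(np.linalg.norm(trajectory - human_position, axis=1))
    jerkiness = np.sum(np.linalg.norm(np.diff(trajectory, n=2, axis=0), axis=1))
    return np.array([path_length, -min_clearance, jerkiness])

# Designer-chosen trade-offs: efficiency vs. distance from people vs. smoothness.
weights = np.array([1.0, 5.0, 0.5])

def cost(trajectory, human_position):
    return weights @ features(trajectory, human_position)

# Two candidate trajectories in one sampled situation: a direct path that passes
# close to the person, and a longer path that detours around them.
human = np.array([1.0, 1.0])
direct = np.array([[0.0, 0.0], [1.0, 1.2], [2.0, 2.0]])
detour = np.array([[0.0, 0.0], [0.5, 2.0], [2.0, 2.0]])
print("direct cost:", cost(direct, human), "detour cost:", cost(detour, human))
# If the 'wrong' trajectory wins, the designer goes back and nudges the weights,
# with no guarantee the tuned weights generalize to situations never sampled.
```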
01:14:35 So you’re kind of referring to unintended consequences
01:14:38 or just in general, any kind of suboptimal behavior
01:14:42 that emerges outside of the things you said,
01:14:44 out of distribution.
01:14:46 Suboptimal behavior that is, you know, actually optimal.
01:14:49 I mean, this, I guess the idea of unintended consequences,
01:14:51 you know, it’s optimal with respect to what you specified,
01:14:53 but it’s not what you want.
01:14:55 And there’s a difference between those.
01:14:57 But that’s not fundamentally a robotics problem, right?
01:14:59 That’s a human problem.
01:15:01 So like. That’s the thing, right?
01:15:03 So there’s this thing called Goodhart’s law,
01:15:05 which is you set a metric for an organization
01:15:07 and the moment it becomes a target
01:15:10 that people actually optimize for,
01:15:13 it’s no longer a good metric.
01:15:15 What’s it called?
01:15:15 Goodhart’s law.
01:15:16 Goodhart’s law.
01:15:17 So the moment you specify a metric,
01:15:20 it stops doing its job.
01:15:21 Yeah, it stops doing its job.
01:15:24 So there’s, yeah, there’s such a thing
01:15:25 as optimizing for things and, you know,
01:15:27 failing to think ahead of time
01:15:32 of all the possible things that might be important.
01:15:35 And so that’s, so that’s interesting
01:15:38 because historically I’ve worked a lot on reward learning
01:15:41 from the perspective of customizing to the end user,
01:15:44 but it really seems like it’s not just the interaction
01:15:48 with the end user that’s a problem of the human
01:15:50 and the robot collaborating
01:15:52 so that the robot can do what the human wants, right?
01:15:55 This kind of back and forth, the robot probing,
01:15:57 the person being informative, all of that stuff
01:16:00 might be actually just as applicable
01:16:04 to this kind of maybe new form of human robot interaction,
01:16:07 which is the interaction between the robot
01:16:10 and the expert programmer, roboticist designer
01:16:14 in charge of actually specifying
01:16:16 what the heck the robot should do,
01:16:18 specifying the task for the robot.
01:16:20 That’s fascinating.
01:16:21 That’s so cool, like collaborating on the reward design.
01:16:23 Right, collaborating on the reward design.
01:16:26 And so what does it mean, right?
01:16:28 What does it, when we think about the problem,
01:16:29 not as someone specifies all of your job is to optimize,
01:16:34 and we start thinking about you’re in this interaction
01:16:37 and this collaboration.
01:16:39 And the first thing that comes up is
01:16:42 when the person specifies a reward, it’s not, you know,
01:16:46 gospel, it’s not like the letter of the law.
01:16:48 It’s not the definition of the reward function
01:16:52 you should be optimizing,
01:16:53 because they’re doing their best,
01:16:54 but they’re not some magic perfect oracle.
01:16:57 And the sooner we start understanding that,
01:16:58 I think the sooner we’ll get to more robust robots
01:17:02 that function better in different situations.
01:17:06 And then you have kind of say, okay, well,
01:17:08 it’s almost like robots are over learning,
01:17:12 over putting too much weight on the reward specified
01:17:16 by definition, and maybe leaving a lot of other information
01:17:21 on the table, like what are other things we could do
01:17:23 to actually communicate to the robot
01:17:25 about what we want them to do besides attempting
01:17:28 to specify a reward function.
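One toy way to operationalize treating the specified reward as evidence rather than gospel (in the spirit of inverse reward design), with made-up candidate rewards, features, and training environments:

```python
import numpy as np

# The robot keeps a posterior over candidate "true" reward weights. A candidate
# gets credit only to the extent that the behavior induced by the designer's
# proxy reward would have been good under that candidate, judged on the
# training environments the designer actually had in mind.
candidate_true_weights = [np.array(w) for w in
                          [(1.0, 0.0), (1.0, -1.0), (1.0, -5.0)]]   # [speed, lava-penalty]
beta = 5.0

# Feature counts of the behavior the proxy induces in the training environments
# (which happened to contain no lava), per environment: [speed, lava].
training_behavior_features = [np.array([1.0, 0.0]), np.array([0.9, 0.0])]

def posterior_over_true_weights():
    """P(w_true | proxy) proportional to how well the proxy-induced behavior scores under w_true."""
    scores = []
    for w_true in candidate_true_weights:
        value = sum(w_true @ phi for phi in training_behavior_features)
        scores.append(np.exp(beta * value))
    scores = np.array(scores)
    return scores / scores.sum()

print(np.round(posterior_over_true_weights(), 3))
# All three candidates explain the training behavior equally well (no lava was
# around), so the robot should stay uncertain about lava, and be cautious when
# it shows up at test time instead of literally optimizing the proxy.
```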
01:17:29 Yeah, you have this awesome,
01:17:31 and again, I love the poetry of it, of leaked information.
01:17:34 So you mentioned humans leak information
01:17:38 about what they want, you know,
01:17:40 leak reward signal for the robot.
01:17:44 So how do we detect these leaks?
01:17:47 What is that?
01:17:48 Yeah, what are these leaks?
01:17:49 Whether it just, I don’t know,
01:17:51 those were just recently saw it, read it,
01:17:54 I don’t know where from you,
01:17:55 and it’s gonna stick with me for a while for some reason,
01:17:58 because it’s not explicitly expressed.
01:18:00 It kind of leaks indirectly from our behavior.
01:18:04 From what we do, yeah, absolutely.
01:18:06 So I think maybe some surprising bits, right?
01:18:11 So we were talking before about, I’m a robot arm,
01:18:14 it needs to move around people, carry stuff,
01:18:18 put stuff away, all of that.
01:18:20 And now imagine that, you know,
01:18:25 the robot has some initial objective
01:18:27 that the programmer gave it
01:18:28 so they can do all these things functionally.
01:18:30 It’s capable of doing that.
01:18:32 And now I noticed that it’s doing something
01:18:35 and maybe it’s coming too close to me, right?
01:18:39 And maybe I’m the designer,
01:18:40 maybe I’m the end user and this robot is now in my home.
01:18:43 And I push it away.
01:18:47 So I push away because, you know,
01:18:49 it’s a reaction to what the robot is currently doing.
01:18:52 And this is what we call physical human robot interaction.
01:18:55 And now there’s a lot of interesting work
01:18:58 on how the heck do you respond to physical human
01:19:00 robot interaction?
01:19:01 What should the robot do if such an event occurs?
01:19:03 And there’s sort of different schools of thought.
01:19:05 Well, you know, you can sort of treat it
01:19:07 the control theoretic way and say,
01:19:08 this is a disturbance that you must reject.
01:19:11 You can sort of treat it more kind of heuristically
01:19:15 and say, I’m gonna go into some like gravity compensation
01:19:18 mode so that I’m easily maneuverable around.
01:19:19 I’m gonna go in the direction that the person pushed me.
01:19:22 And to us, part of realization has been
01:19:27 that that is signal that communicates about the reward.
01:19:30 Because if my robot was moving in an optimal way
01:19:34 and I intervened, that means that I disagree
01:19:37 with its notion of optimality, right?
01:19:40 Whatever it thinks is optimal is not actually optimal.
01:19:43 And sort of optimization problems aside,
01:19:45 that means that the cost function,
01:19:47 the reward function is incorrect,
01:19:51 or at least is not what I want it to be.
01:19:53 How difficult is that signal to interpret
01:19:58 and make actionable?
01:19:59 So like, cause this connects
01:20:00 to our autonomous vehicle discussion
01:20:02 where they’re in the semi autonomous vehicle
01:20:03 or autonomous vehicle when a safety driver
01:20:06 disengages the car, like,
01:20:08 but they could have disengaged it for a million reasons.
01:20:11 Yeah, so that’s true.
01:20:15 Again, it comes back to, can you structure a little bit
01:20:19 your assumptions about how human behavior
01:20:22 relates to what they want?
01:20:24 And you can, one thing that we’ve done is
01:20:26 literally just treated this external torque
01:20:29 that they applied as, when you take that
01:20:32 and you add it with what the torque
01:20:34 the robot was already applying,
01:20:36 that overall action is probably relatively optimal
01:20:39 with respect to whatever it is that the person wants.
01:20:41 And then that gives you information
01:20:43 about what it is that they want.
01:20:44 So you can learn that people want you
01:20:45 to stay further away from them.
01:20:47 Now you’re right that there might be many things
01:20:49 that explain just that one signal
01:20:51 and that you might need much more data than that
01:20:53 for the person to be able to shape
01:20:55 your reward function over time.
01:20:58 You can also do this info gathering stuff
01:21:00 that we were talking about.
01:21:01 Not that we’ve done that in that context,
01:21:03 just to clarify, but it’s definitely something
01:21:04 we thought about where you can have the robot
01:21:09 start acting in a way, like if there’s
01:21:11 a bunch of different explanations, right?
01:21:13 It moves in a way where it sees if you correct it
01:21:16 in some other way or not,
01:21:17 and then kind of actually plans its motion
01:21:19 so that it can disambiguate
01:21:21 and collect information about what you want.
01:21:24 Anyway, so that’s one way,
01:21:26 that’s kind of sort of leaked information,
01:21:27 maybe even more subtle leaked information
01:21:29 is if I just press the E stop, right?
01:21:32 I just, I’m doing it out of panic
01:21:34 because the robot is about to do something bad.
01:21:36 There’s again, information there, right?
01:21:38 Okay, the robot should definitely stop,
01:21:40 but it should also figure out
01:21:42 that whatever it was about to do was not good.
01:21:45 And in fact, it was so not good
01:21:46 that stopping and remaining stopped for a while
01:21:48 was a better trajectory for it
01:21:51 than whatever it is that it was about to do.
01:21:52 And that again is information about
01:21:54 what are my preferences, what do I want?
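A minimal sketch of one way to turn a physical correction into a reward update: assume the corrected trajectory is roughly what the person prefers over the planned one, and move the reward weights accordingly. The features, trajectories, and learning rate are all hypothetical:

```python
import numpy as np

def trajectory_features(traj, human_position):
    """Two toy features: total path length and closest approach to the person."""
    length = np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1))
    clearance = np.min(np.linalg.norm(traj - human_position, axis=1))
    return np.array([length, clearance])

# Robot's current belief about the reward weights: it cares about being short,
# and barely at all about keeping distance from people.
weights = np.array([-1.0, 0.1])
learning_rate = 0.5
human = np.array([1.0, 1.0])

planned = np.array([[0.0, 0.0], [1.0, 1.1], [2.0, 2.0]])     # passes right by the person
corrected = np.array([[0.0, 0.0], [0.0, 2.0], [2.0, 2.0]])   # the push moved it away

# Online update: move the weights so the corrected trajectory scores higher
# than the planned one (a step along their feature difference).
phi_planned = trajectory_features(planned, human)
phi_corrected = trajectory_features(corrected, human)
weights += learning_rate * (phi_corrected - phi_planned)

print("updated weights:", np.round(weights, 2))
# The clearance weight goes up: from a single shove, the robot has inferred
# that staying away from the person matters more than it assumed.
```

An E-stop can be folded into the same picture: stopping and staying stopped is treated as the preferred trajectory, which is strong evidence against whatever the robot was about to do.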
01:21:57 Speaking of E stops, what are your expert opinions
01:22:03 on the three laws of robotics from Isaac Asimov
01:22:08 that don’t harm humans, obey orders, protect yourself?
01:22:11 I mean, it’s such a silly notion,
01:22:13 but I speak to so many people these days,
01:22:15 just regular folks, just, I don’t know,
01:22:17 my parents and so on about robotics.
01:22:19 And they kind of operate in that space of,
01:22:23 you know, imagining our future with robots
01:22:25 and thinking what are the ethical,
01:22:28 how do we get that dance right?
01:22:31 I know the three laws might be a silly notion,
01:22:34 but do you think about like
01:22:35 what universal reward functions that might be
01:22:39 that we should enforce on the robots of the future?
01:22:44 Or is that a little too far out and it doesn’t,
01:22:48 or is the mechanism that you just described,
01:22:51 it shouldn’t be three laws,
01:22:52 it should be constantly adjusting kind of thing.
01:22:55 I think it should constantly be adjusting kind of thing.
01:22:57 You know, the issue with the laws is,
01:23:01 I don’t even, you know, they’re words
01:23:02 and I have to write math
01:23:04 and have to translate them into math.
01:23:06 What does it mean to?
01:23:07 What does harm mean?
01:23:08 What is, it’s not math.
01:23:11 Obey what, right?
01:23:12 Cause we just talked about how
01:23:14 you try to say what you want,
01:23:17 but you don’t always get it right.
01:23:19 And you want these machines to do what you want,
01:23:22 not necessarily exactly what you literally,
01:23:24 so you don’t want them to take you literally.
01:23:26 You wanna take what you say and interpret it in context.
01:23:31 And that’s what we do with the specified rewards.
01:23:33 We don’t take them literally anymore from the designer.
01:23:36 We, not we as a community, we as, you know,
01:23:39 some members of my group, we,
01:23:44 and some of our collaborators like Pieter Abbeel
01:23:46 and Stuart Russell, we sort of say,
01:23:50 okay, the designer specified this thing,
01:23:53 but I’m gonna interpret it not as,
01:23:55 this is the universal reward function
01:23:57 that I shall always optimize always and forever,
01:23:59 but as this is good evidence about what the person wants.
01:24:05 And I should interpret that evidence
01:24:07 in the context of these situations that it was specified for.
01:24:11 Cause ultimately that’s what the designer thought about.
01:24:12 That’s what they had in mind.
01:24:14 And really them specifying reward function
01:24:16 that works for me in all these situations
01:24:18 is really kind of telling me that whatever behavior
01:24:22 that incentivizes must be good behavior
01:24:24 with respect to the thing
01:24:25 that I should actually be optimizing for.
01:24:28 And so now the robot kind of has uncertainty
01:24:30 about what it is that it should be,
01:24:32 what its reward function is.
01:24:34 And then there’s all these additional signals
01:24:36 that we’ve been finding that it can kind of continually
01:24:39 learn from and adapt its understanding of what people want.
01:24:41 Every time the person corrects it, maybe they demonstrate,
01:24:44 maybe they stop, hopefully not, right?
01:24:48 One really, really crazy one is the environment itself.
01:24:54 Like our world, you don’t, it’s not, you know,
01:24:58 you observe our world and the state of it.
01:25:01 And it’s not that you’re seeing behavior
01:25:03 and you’re saying, oh, people are making decisions
01:25:05 that are rational, blah, blah, blah.
01:25:07 It’s, but our world is something that we’ve been acting with
01:25:12 according to our preferences.
01:25:14 So I have this example where like,
01:25:15 the robot walks into my home and my shoes are laid down
01:25:18 on the floor kind of in a line, right?
01:25:21 It took effort to do that.
01:25:23 So even though the robot doesn’t see me doing this,
01:25:27 you know, actually aligning the shoes,
01:25:29 it should still be able to figure out
01:25:31 that I want the shoes aligned
01:25:33 because there’s no way for them to have magically,
01:25:35 you know, be instantiated themselves in that way.
01:25:39 Someone must have actually taken the time to do that.
01:25:43 So it must be important.
01:25:44 So the environment actually tells, the environment is.
01:25:46 Leaks information.
01:25:48 It leaks information.
01:25:48 I mean, the environment is the way it is
01:25:50 because humans somehow manipulated it.
01:25:52 So you have to kind of reverse engineer the narrative
01:25:55 that happened to create the environment as it is
01:25:57 and that leaks the preference information.
01:26:00 Yeah, and you have to be careful, right?
01:26:03 Because people don’t have the bandwidth to do everything.
01:26:06 So just because, you know, my house is messy
01:26:08 doesn’t mean that I want it to be messy, right?
01:26:10 But that just, you know, I didn’t put the effort into that.
01:26:14 I put the effort into something else.
01:26:16 So the robot should figure out,
01:26:17 well, that something else was more important,
01:26:19 but it doesn’t mean that, you know,
01:26:20 the house being messy is not.
01:26:21 So it’s a little subtle, but yeah, we really think of it that way.
01:26:24 The state itself is kind of like a choice
01:26:26 that people implicitly made about how they want their world.
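A hedged sketch of reading preferences off the state of the world: score candidate preferences by how much the observed state improves on what the home would drift to with no effort put in. The states, features, and beta below are toy assumptions:

```python
import numpy as np

# Feature vector of a home state: [fraction of shoes aligned, fraction of dishes done].
default_state = np.array([0.1, 0.2])      # what the home looks like with no effort
observed_state = np.array([0.9, 0.2])     # someone clearly lined up the shoes

candidate_preferences = {
    "cares_about_shoes":  np.array([1.0, 0.0]),
    "cares_about_dishes": np.array([0.0, 1.0]),
    "cares_about_both":   np.array([0.5, 0.5]),
}
beta = 4.0

def posterior(observed, default):
    """P(preference | state): weight each candidate by the effortful gain it explains."""
    gains = {name: w @ (observed - default) for name, w in candidate_preferences.items()}
    scores = {name: np.exp(beta * g) for name, g in gains.items()}
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}

for name, p in posterior(observed_state, default_state).items():
    print(f"{name}: {p:.2f}")
# Aligned shoes, which don't happen by themselves, are strong evidence the person
# cares about shoe alignment. The untouched dishes are weaker evidence; a richer
# model would account for limited bandwidth, which is exactly the caveat above.
```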
01:26:31 What book or books, technical or fiction or philosophical,
01:26:34 when you like look back, you know, life had a big impact,
01:26:39 maybe it was a turning point, it was inspiring in some way.
01:26:42 Maybe we’re talking about some silly book
01:26:45 that nobody in their right mind would want to read.
01:26:48 Or maybe it’s a book that you would recommend
01:26:51 to others to read.
01:26:52 Or maybe those could be two different recommendations
01:26:56 of books that could be useful for people on their journey.
01:27:00 When I was in, it’s kind of a personal story.
01:27:03 When I was in 12th grade,
01:27:05 I got my hands on a PDF copy in Romania
01:27:10 of Russell and Norvig’s AI: A Modern Approach.
01:27:14 I didn’t know anything about AI at that point.
01:27:16 I was, you know, I had watched the movie,
01:27:19 The Matrix was my exposure.
01:27:22 And so I started going through this thing
01:27:28 and, you know, you were asking in the beginning,
01:27:31 what are, you know, it’s math and it’s algorithms,
01:27:35 what’s interesting.
01:27:36 It was so captivating.
01:27:38 This notion that you could just have a goal
01:27:41 and figure out your way through
01:27:44 kind of a messy, complicated situation.
01:27:47 So what sequence of decisions you should make
01:27:50 to autonomously to achieve that goal.
01:27:53 That was so cool.
01:27:55 I’m, you know, I’m biased, but that’s a cool book to look at.
01:28:00 You can convert, you know, the goal of intelligence,
01:28:03 the process of intelligence and mechanize it.
01:28:06 I had the same experience.
01:28:07 I was really interested in psychiatry
01:28:09 and trying to understand human behavior.
01:28:11 And then AI modern approach is like, wait,
01:28:14 you can just reduce it all to.
01:28:15 You can write math about human behavior, right?
01:28:18 Yeah.
01:28:19 So that’s, and I think that stuck with me
01:28:21 because, you know, a lot of what I do, a lot of what we do
01:28:25 in my lab is write math about human behavior,
01:28:28 combine it with data and learning, put it all together,
01:28:31 give it to robots to plan with, and, you know,
01:28:33 hope that instead of writing rules for the robots,
01:28:37 writing heuristics, designing behavior,
01:28:39 they can actually autonomously come up with the right thing
01:28:42 to do around people.
01:28:43 That’s kind of our, you know, that’s our signature move.
01:28:46 We wrote some math and then instead of kind of hand crafting
01:28:49 this and that and that and the robot figuring stuff out
01:28:52 and isn’t that cool.
01:28:53 And I think that is the same enthusiasm that I got from
01:28:56 the robot figured out how to reach that goal in that graph.
01:28:59 Isn’t that cool?
01:29:02 So apologize for the romanticized questions,
01:29:05 but, and the silly ones,
01:29:07 if a doctor gave you five years to live,
01:29:11 sort of emphasizing the finiteness of our existence,
01:29:15 what would you try to accomplish?
01:29:20 It’s like my biggest nightmare, by the way.
01:29:22 I really like living.
01:29:24 So I’m actually, I really don’t like the idea of being told
01:29:28 that I’m going to die.
01:29:30 Sorry to linger on that for a second.
01:29:32 Do you, I mean, do you meditate or ponder on your mortality
01:29:36 or human, the fact that this thing ends,
01:29:38 it seems to be a fundamental feature.
01:29:41 Do you think of it as a feature or a bug too?
01:29:44 Is it, you said you don’t like the idea of dying,
01:29:47 but if I were to give you a choice of living forever,
01:29:50 like you’re not allowed to die.
01:29:52 Now I’ll say that I want to live forever,
01:29:54 but I watched this show.
01:29:55 It’s very silly.
01:29:56 It’s called The Good Place and they reflect a lot on this.
01:29:59 And you know, the,
01:30:00 the moral of the story is that you have to make the afterlife
01:30:03 be finite too.
01:30:05 Cause otherwise people just kind of, it’s like WALL-E.
01:30:08 It’s like, ah, whatever.
01:30:10 So, so I think the finiteness helps, but,
01:30:13 but yeah, it’s just, you know, I don’t, I don’t,
01:30:16 I’m not a religious person.
01:30:18 I don’t think that there’s something after.
01:30:21 And so I think it just ends and you stop existing.
01:30:25 And I really like existing.
01:30:26 It’s just, it’s such a great privilege to exist that,
01:30:31 that yeah, it’s just, I think that’s the scary part.
01:30:35 I still think that we like existing so much because it ends.
01:30:40 And that’s so sad.
01:30:41 Like it’s so sad to me every time.
01:30:43 Like I find almost everything about this life beautiful.
01:30:46 Like the silliest, most mundane things are just beautiful.
01:30:49 And I think I’m cognizant of the fact that I find it beautiful
01:30:52 because it ends like it.
01:30:55 And it’s so, I don’t know.
01:30:57 I don’t know how to feel about that.
01:30:59 I also feel like there’s a lesson in there for robotics
01:31:03 and AI that is not like the finiteness of things seems
01:31:10 to be a fundamental nature of human existence.
01:31:13 I think some people sort of accuse me of just being Russian
01:31:16 and melancholic and romantic or something,
01:31:19 but that seems to be a fundamental nature of our existence
01:31:24 that should be incorporated in our reward functions.
01:31:28 But anyway, if you were speaking of reward functions,
01:31:34 if you only had five years, what would you try to accomplish?
01:31:38 This is the thing.
01:31:41 I’m thinking about this question and have a pretty joyous moment
01:31:45 because I don’t know that I would change much.
01:31:49 I’m trying to make some contributions to how we understand
01:31:55 human AI interaction.
01:31:57 I don’t think I would change that.
01:32:00 Maybe I’ll take more trips to the Caribbean or something,
01:32:04 but I tried some of that already from time to time.
01:32:08 So, yeah, I try to do the things that bring me joy
01:32:13 and thinking about these things brings me joy. It’s the Marie Kondo thing.
01:32:17 Don’t do stuff that doesn’t spark joy.
01:32:19 For the most part, I do things that spark joy.
01:32:22 Maybe I’ll do less service in the department or something.
01:32:25 I’m not dealing with admissions anymore.
01:32:30 But no, I think I have amazing colleagues and amazing students
01:32:36 and amazing family and friends and spending time in some balance
01:32:40 with all of them is what I do and that’s what I’m doing already.
01:32:44 So, I don’t know that I would really change anything.
01:32:47 So, on the spirit of positiveness, what small act of kindness,
01:32:52 if one pops to mind, were you once shown that you will never forget?
01:32:57 When I was in high school, my friends, my classmates did some tutoring.
01:33:08 We were gearing up for our baccalaureate exam
01:33:11 and they did some tutoring on, well, some on math, some on whatever.
01:33:15 I was comfortable enough with some of those subjects,
01:33:19 but physics was something that I hadn’t focused on in a while.
01:33:22 And so, they were all working with this one teacher
01:33:28 and I started working with that teacher.
01:33:31 Her name is Nicole Beccano.
01:33:33 And she was the one who kind of opened up this whole world for me
01:33:39 because she sort of told me that I should take the SATs
01:33:44 and apply to go to college abroad and do better on my English and all of that.
01:33:51 And when it came to, well, financially I couldn’t,
01:33:55 my parents couldn’t really afford to do all these things,
01:33:58 she started tutoring me on physics for free
01:34:01 and on top of that sitting down with me to kind of train me for SATs
01:34:06 and all that jazz that she had experience with.
01:34:09 Wow. And obviously that has taken you to be here today,
01:34:15 sort of one of the world experts in robotics.
01:34:17 It’s funny those little… For no reason really.
01:34:24 Just out of karma.
01:34:27 Wanting to support someone, yeah.
01:34:29 Yeah. So, we talked a ton about reward functions.
01:34:33 Let me talk about the most ridiculous big question.
01:34:37 What is the meaning of life?
01:34:39 What’s the reward function under which we humans operate?
01:34:42 Like what, maybe to your life, maybe broader to human life in general,
01:34:47 what do you think…
01:34:51 What gives life fulfillment, purpose, happiness, meaning?
01:34:57 You can’t even ask that question with a straight face.
01:34:59 That’s how ridiculous this is.
01:35:00 I can’t, I can’t.
01:35:01 Okay. So, you know…
01:35:05 You’re going to try to answer it anyway, aren’t you?
01:35:09 So, I was in a planetarium once.
01:35:13 Yes.
01:35:14 And, you know, they show you the thing and then they zoom out and zoom out
01:35:18 and this whole, like, you’re a speck of dust kind of thing.
01:35:20 I think I was conceptualizing that we’re kind of, you know, what are humans?
01:35:23 We’re just on this little planet, whatever.
01:35:26 We don’t matter much in the grand scheme of things.
01:35:29 And then my mind got really blown because they talked about this multiverse theory
01:35:35 where they kind of zoomed out and were like, this is our universe.
01:35:38 And then, like, there’s a bazillion other ones and they just pop in and out of existence.
01:35:42 So, like, our whole thing that we can’t even fathom how big it is was like a blip that went in and out.
01:35:48 And at that point, I was like, okay, like, I’m done.
01:35:51 This is not, there is no meaning.
01:35:54 And clearly what we should be doing is try to impact whatever local thing we can impact,
01:35:59 our communities, leave a little bit behind there, our friends, our family, our local communities,
01:36:05 and just try to be there for other humans because I just, everything beyond that seems ridiculous.
01:36:13 I mean, are you, like, how do you make sense of these multiverses?
01:36:16 Like, are you inspired by the immensity of it?
01:36:21 Do you, I mean, is there, like, is it amazing to you or is it almost paralyzing in the mystery of it?
01:36:34 It’s frustrating.
01:36:35 I’m frustrated by my inability to comprehend.
01:36:41 It just feels very frustrating.
01:36:43 It’s like there’s some stuff, you know, space, time, blah, blah, blah, that we should really be understanding.
01:36:48 And I definitely don’t understand it.
01:36:50 But, you know, the amazing physicists of the world have a much better understanding than me.
01:36:56 But it still seems epsilon in the grand scheme of things.
01:36:58 So, it’s very frustrating.
01:37:00 It just, it sort of feels like our brains don’t have some fundamental capacity yet, well, yet or ever.
01:37:06 I don’t know.
01:37:07 Well, that’s one of the dreams of artificial intelligence is to create systems that will aid,
01:37:12 expand our cognitive capacity in order to understand, build the theory of everything with the physics
01:37:19 and understand what the heck these multiverses are.
01:37:24 So, I think there’s no better way to end it than talking about the meaning of life and the fundamental nature of the universe and the multiverses.
01:37:32 And the multiverse.
01:37:33 So, Anca, it is a huge honor.
01:37:35 One of my favorite conversations I’ve had.
01:37:38 I really, really appreciate your time.
01:37:40 Thank you for talking today.
01:37:41 Thank you for coming.
01:37:42 Come back again.
01:37:44 Thanks for listening to this conversation with Anca Dragan.
01:37:47 And thank you to our presenting sponsor, Cash App.
01:37:50 Please consider supporting the podcast by downloading Cash App and using code LexPodcast.
01:37:56 If you enjoy this podcast, subscribe on YouTube, review it with 5 stars on Apple Podcast,
01:38:01 support it on Patreon, or simply connect with me on Twitter at Lex Fridman.
01:38:07 And now, let me leave you with some words from Isaac Asimov.
01:38:12 Your assumptions are your windows in the world.
01:38:15 Scrub them off every once in a while or the light won’t come in.
01:38:20 Thank you for listening and hope to see you next time.