Dawn Song: Adversarial Machine Learning and Computer Security #95

Transcript

00:00:00 The following is a conversation with Dawn Song,

00:00:02 a professor of computer science at UC Berkeley

00:00:05 with research interests in computer security.

00:00:08 Most recently, with a focus on the intersection

00:00:10 between security and machine learning.

00:00:13 This conversation was recorded

00:00:15 before the outbreak of the pandemic.

00:00:17 For everyone feeling the medical, psychological,

00:00:19 and financial burden of this crisis,

00:00:21 I’m sending love your way.

00:00:23 Stay strong.

00:00:24 We’re in this together.

00:00:25 We’ll beat this thing.

00:00:27 This is the Artificial Intelligence Podcast.

00:00:29 If you enjoy it, subscribe on YouTube,

00:00:31 review it with five stars on Apple Podcast,

00:00:34 support it on Patreon,

00:00:35 or simply connect with me on Twitter

00:00:37 at lexfridman, spelled F R I D M A N.

00:00:41 As usual, I’ll do a few minutes of ads now

00:00:43 and never any ads in the middle

00:00:45 that can break the flow of the conversation.

00:00:47 I hope that works for you

00:00:48 and doesn’t hurt the listening experience.

00:00:51 This show is presented by Cash App,

00:00:53 the number one finance app in the App Store.

00:00:55 When you get it, use code lexpodcast.

00:00:58 Cash App lets you send money to friends,

00:01:00 buy Bitcoin, and invest in the stock market

00:01:02 with as little as one dollar.

00:01:05 Since Cash App does fractional share trading,

00:01:07 let me mention that the order execution algorithm

00:01:10 that works behind the scenes

00:01:11 to create the abstraction of fractional orders

00:01:14 is an algorithmic marvel.

00:01:16 So big props to the Cash App engineers

00:01:18 for solving a hard problem

00:01:19 that in the end provides an easy interface

00:01:22 that takes a step up to the next layer of abstraction

00:01:25 over the stock market,

00:01:26 making trading more accessible for new investors

00:01:29 and diversification much easier.

00:01:32 So again, if you get Cash App from the App Store or Google Play

00:01:35 and use the code lexpodcast, you get $10

00:01:39 and Cash App will also donate $10 to FIRST,

00:01:42 an organization that is helping to advance robotics

00:01:44 and STEM education for young people around the world.

00:01:48 And now here’s my conversation with Dawn Song.

00:01:53 Do you think software systems

00:01:54 will always have security vulnerabilities?

00:01:57 Let’s start at the broad, almost philosophical level.

00:02:00 That’s a very good question.

00:02:02 I mean, in general, right,

00:02:03 it’s very difficult to write completely bug free code

00:02:07 and code that has no vulnerability.

00:02:09 And also, especially given that the definition

00:02:12 of vulnerability is actually really broad.

00:02:14 Essentially, any type of attack on the code,

00:02:18 you know, you can call that

00:02:21 being caused by vulnerabilities.

00:02:22 And the nature of attacks is always changing as well?

00:02:25 Like new ones are coming up?

00:02:27 Right, so for example, in the past,

00:02:29 we talked about memory safety type of vulnerabilities

00:02:32 where essentially attackers can exploit the software

00:02:37 and take over control of how the code runs

00:02:40 and then can launch attacks that way.

00:02:42 By accessing some aspect of the memory

00:02:44 and be able to then alter the state of the program?

00:02:48 Exactly, so for example, in the case of a buffer overflow,

00:02:51 the attacker essentially causes

00:02:56 unintended changes in the state of the program.

00:03:01 And then, for example,

00:03:03 can then take over control flow of the program

00:03:05 and make the program execute code

00:03:08 that the programmer didn’t actually intend.

00:03:11 So the attack can be a remote attack.

00:03:12 So the attacker, for example,

00:03:14 can send in a malicious input to the program

00:03:17 that just causes the program to completely

00:03:20 then be compromised and then end up doing something

00:03:24 that’s under the attacker’s control and intention.

00:03:29 But that’s just one form of attacks

00:03:31 and there are other forms of attacks.

00:03:32 Like for example, there are these side channels

00:03:35 where attackers can try to learn from,

00:03:39 even just observing the outputs

00:03:42 from the behaviors of the program,

00:03:43 try to infer certain secrets of the program.

00:03:46 So essentially, right, the form of attacks

00:03:49 is very, very, it’s very broad spectrum.

00:03:53 And in general, from the security perspective,

00:03:56 we want to essentially provide as much guarantee

00:04:01 as possible about the program’s security properties

00:04:05 and so on.

00:04:06 So for example, we talked about providing provable guarantees

00:04:10 of the program.

00:04:11 So for example, there are ways we can use program analysis

00:04:15 and formal verification techniques

00:04:17 to prove that a piece of code

00:04:19 has no memory safety vulnerabilities.

00:04:24 What does that look like?

00:04:25 What is that proof?

00:04:26 Is that just a dream for,

00:04:28 that’s applicable to small case examples

00:04:30 or is that possible to do for real world systems?

00:04:33 So actually, I mean, today,

00:04:35 I would actually say we are entering the era

00:04:38 of formally verified systems.

00:04:41 So in the community, we have been working

00:04:44 for the past decades in developing techniques

00:04:48 and tools to do this type of program verification.

00:04:53 And we have dedicated teams that have dedicated,

00:04:57 you know, years,

00:05:00 sometimes even decades of their work in this space.

00:05:04 So as a result, we actually have a number

00:05:06 of formally verified systems ranging from microkernels

00:05:11 to compilers to file systems to certain crypto,

00:05:16 you know, libraries and so on.

00:05:18 So it’s actually really wide ranging

00:05:20 and it’s really exciting to see

00:05:22 that people are recognizing the importance

00:05:25 of having these formally verified systems

00:05:28 with verified security.

00:05:31 So that’s great advancement that we see,

00:05:34 but on the other hand,

00:05:34 I think we do need to take all these in essentially

00:05:39 with caution as well in the sense that,

00:05:41 just like I said, the type of vulnerabilities

00:05:46 is very varied.

00:05:47 We can formally verify a software system

00:05:51 to have certain set of security properties,

00:05:54 but they can still be vulnerable to other types of attacks.

00:05:57 And hence, we need to continue to make progress in this space.

00:06:03 So just a quick, to linger on the formal verification,

00:06:07 is that something you can do by looking at the code alone

00:06:12 or is it something you have to run the code

00:06:14 to prove something?

00:06:16 So empirical verification,

00:06:18 can you look at the code, just the code?

00:06:20 So that’s a very good question.

00:06:22 So in general, for most program verification techniques,

00:06:25 they essentially try to verify the properties

00:06:27 of the program statically.

00:06:29 And there are reasons for that too.

00:06:32 We can run the code to see, for example,

00:06:34 using like in software testing with the fuzzing techniques

00:06:39 and also in certain even model checking techniques,

00:06:41 you can actually run the code.

00:06:45 But in general, that only allows you to essentially verify

00:06:51 or analyze the behaviors of the program

00:06:55 under certain situations.

00:06:57 And so most of the program verification techniques

00:06:59 actually work statically.

00:07:01 What does statically mean?

00:07:03 Without running the code.

00:07:04 Without running the code, yep.

00:07:06 So, but sort of to return to the big question,

00:07:10 if we can stand for a little bit longer,

00:07:13 do you think there will always be

00:07:16 security vulnerabilities?

00:07:18 You know, that’s such a huge worry for people

00:07:20 in the broad cybersecurity threat in the world.

00:07:23 It seems like the tension between nations, between groups,

00:07:29 the wars of the future might be fought

00:07:31 in cybersecurity that people worry about.

00:07:35 And so, of course, the nervousness is,

00:07:37 is this something that we can get ahold of in the future

00:07:40 for our software systems?

00:07:42 So there’s a very funny quote saying,

00:07:46 security is job security.

00:07:49 So, right, I think that essentially answers your question.

00:07:55 Right, we strive to make progress

00:08:00 in building more secure systems

00:08:03 and also making it easier and easier

00:08:05 to build secure systems.

00:08:07 But given the diversity, the various nature of attacks,

00:08:15 and also the interesting thing about security is that,

00:08:20 unlike in most other fields,

00:08:24 where essentially you are trying to, how should I put it,

00:08:27 prove a statement true,

00:08:31 in this case you are trying to say

00:08:32 that there are no attacks.

00:08:35 So even just this statement itself

00:08:37 is not very well defined, again,

00:08:40 given how varied the nature of the attacks can be.

00:08:44 And hence there’s a challenge of security

00:08:46 and also that naturally, essentially,

00:08:49 it’s almost impossible to say that something,

00:08:52 a real world system, is 100% free of security vulnerabilities.

00:08:57 Is there a particular,

00:08:58 and we’ll talk about different kinds of vulnerabilities,

00:09:01 exciting ones, very fascinating ones

00:09:04 in the space of machine learning,

00:09:05 but is there a particular security vulnerability

00:09:08 that worries you the most, that you think about the most

00:09:12 in terms of it being a really hard problem

00:09:16 and a really important problem to solve?

00:09:18 So it is very interesting.

00:09:20 So I have, in the past, worked essentially

00:09:22 through the different layers of the systems stack,

00:09:27 working on networking security, software security,

00:09:30 and even in software security,

00:09:32 I worked on program binary security

00:09:35 and then web security, mobile security.

00:09:38 So throughout we have been developing

00:09:42 more and more techniques and tools

00:09:45 to improve security of these software systems.

00:09:47 And as a consequence, actually a very interesting thing,

00:09:50 an interesting trend that we are seeing,

00:09:53 is that the attacks are actually moving more and more

00:09:57 from the systems themselves towards humans.

00:10:01 So it’s moving up the stack.

00:10:03 It’s moving up the stack.

00:10:04 That’s fascinating.

00:10:05 And also it’s moving more and more

00:10:07 towards what we call the weakest link.

00:10:09 So we say that in security,

00:10:11 we say the weakest link actually of the systems

00:10:13 oftentimes is actually humans themselves.

00:10:16 So a lot of attacks, for example,

00:10:18 the attacker either through social engineering

00:10:21 or from these other methods,

00:10:23 they actually attack the humans and then attack the systems.

00:10:26 So we actually have a project that actually works

00:10:29 on how to use AI machine learning

00:10:32 to help humans to defend against these types of attacks.

00:10:35 So yeah, so if we look at humans

00:10:37 as security vulnerabilities,

00:10:40 are there methods, is that what you’re kind of referring to?

00:10:43 Is there hope or methodology for patching the humans?

00:10:48 I think in the future,

00:10:49 this is going to be really more and more of a serious issue

00:10:54 because again, for machines, for systems,

00:10:58 we can, yes, we can patch them.

00:11:00 We can build more secure systems.

00:11:02 We can harden them and so on.

00:11:03 But humans actually, we don’t have a way

00:11:05 to say do a software upgrade

00:11:07 or do a hardware change for humans.

00:11:11 And so for example, right now, we already see

00:11:16 different types of attacks.

00:11:17 In particular, I think in the future,

00:11:19 they are going to be even more effective on humans.

00:11:21 So as I mentioned, social engineering attacks,

00:11:24 like these phishing attacks,

00:11:25 attackers just get humans to provide their passwords.

00:11:30 And there have been instances where even places

00:11:34 like Google and other places

00:11:38 that are supposed to have really good security,

00:11:41 people there have been phished

00:11:43 to actually wire money to attackers.

00:11:47 It’s crazy.

00:11:48 And then also we talk about this deep fake and fake news.

00:11:52 So these essentially are there to target humans,

00:11:54 to manipulate humans’ opinions, perceptions, and so on.

00:12:01 So I think going into the future,

00:12:04 these are going to become more and more severe issues for us.

00:12:07 Further up the stack.

00:12:08 Yes, yes.

00:12:09 So you see kind of social engineering,

00:12:13 automated social engineering

00:12:14 as a kind of security vulnerability.

00:12:17 Oh, absolutely.

00:12:18 And again, given that humans

00:12:20 are the weakest link to the system,

00:12:23 I would say this is the type of attacks

00:12:25 that I would be most worried about.

00:12:28 Oh, that’s fascinating.

00:12:30 Okay, so.

00:12:31 And that’s why when we talk about the AI side,

00:12:33 we also need AI to help humans too.

00:12:35 As I mentioned, we have some projects in this space

00:12:37 that actually help with that.

00:12:39 Can you maybe, can we go there for the defenses?

00:12:41 What are some ideas to help humans?

00:12:44 So one of the projects we are working on

00:12:45 is actually using NLP and chatbot techniques

00:12:50 to help humans.

00:12:51 For example, the chatbot actually could be there

00:12:54 observing the conversation

00:12:56 between a user and a remote correspondent.

00:13:01 And then the chatbot could be there to try to observe,

00:13:05 to see whether the correspondent

00:13:07 is potentially an attacker.

00:13:10 For example, in some of the phishing attacks,

00:13:12 the attacker claims to be a relative of the user

00:13:16 and the relative got lost in London

00:13:20 and his wallets have been stolen,

00:13:22 had no money, asked the user to wire money

00:13:25 to send money to the attacker,

00:13:28 to the correspondent.

00:13:30 So then in this case,

00:13:31 the chatbot actually could try to recognize

00:13:34 that there may be something suspicious going on

00:13:37 that relates to asking for money to be sent.

00:13:40 And also the chatbot could actually pose,

00:13:43 we call it challenge and response.

00:13:45 The correspondent claims to be a relative of the user,

00:13:50 then the chatbot could automatically

00:13:51 actually generate some kind of challenges

00:13:54 to see whether the correspondent

00:13:57 knows the appropriate knowledge

00:13:59 to prove that he actually is,

00:14:01 he or she actually is the claimed relative of the user.

00:14:07 And so in the future,

00:14:08 I think these type of technologies

00:14:10 actually could help protect users.
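
As a rough illustration of the kind of assistant being described, here is a minimal sketch: it watches a conversation, flags messages that match common scam patterns (urgency plus a request to wire money), and poses a challenge question that only the real relative should be able to answer. The patterns, challenges, and function names are all made up for illustration; the actual research system would use learned NLP models rather than regular expressions.

```python
# Illustrative sketch (not the actual research system): a minimal assistant
# that flags messages matching common scam patterns and interjects with a
# challenge question. All names and rules here are hypothetical.
import re

SUSPICIOUS_PATTERNS = [
    r"\bwire (me )?money\b",
    r"\bwallet (was )?stolen\b",
    r"\bstranded in \w+\b",
    r"\bgift cards?\b",
]

CHALLENGES = [
    "What city did we last meet in?",
    "What nickname does grandma use for you?",
]

def looks_suspicious(message: str) -> bool:
    """Return True if the message matches any known scam pattern."""
    text = message.lower()
    return any(re.search(p, text) for p in SUSPICIOUS_PATTERNS)

def respond(message: str) -> str:
    """Either let the message through or interject with a challenge question."""
    if looks_suspicious(message):
        # A real system would generate this with an NLP model using knowledge
        # shared only between the user and the true relative.
        return f"[assistant] This looks like a possible scam. Challenge: {CHALLENGES[0]}"
    return "[assistant] No red flags detected."

if __name__ == "__main__":
    print(respond("Hi, it's your cousin. I'm stranded in London, my wallet was stolen, please wire money."))
```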

00:14:13 That’s funny.

00:14:14 So a chatbot that’s kind of focused

00:14:17 for looking for the kind of patterns

00:14:19 that are usually associated with social engineering attacks,

00:14:23 it would be able to then test,

00:14:26 sort of do a basic CAPTCHA type of response

00:14:30 to see is this, is the fact or the semantics

00:14:32 of the claims you’re making true?

00:14:34 Right, right.

00:14:35 That’s fascinating.

00:14:36 Exactly.

00:14:37 That’s really fascinating.

00:14:38 And as we develop more powerful NLP

00:14:41 and chatbot techniques,

00:14:43 the chatbot could even engage further conversations

00:14:47 with the correspondent to,

00:14:48 for example, if it turns out to be an attack,

00:14:52 then the chatbot can try to engage in conversations

00:14:57 with the attacker to try to learn more information

00:14:59 from the attacker as well.

00:15:00 So it’s a very interesting area.

00:15:02 So that chatbot is essentially

00:15:03 your little representative in the security space.

00:15:07 It’s like your little lawyer

00:15:09 that protects you from doing anything stupid.

00:15:11 Right, right, right.

00:15:13 That’s a fascinating vision for the future.

00:15:17 Do you see that broadly applicable across the web?

00:15:19 So across all your interactions on the web?

00:15:22 Absolutely, right.

00:15:24 What about like on social networks, for example?

00:15:26 So across all of that,

00:15:28 do you see that being implemented

00:15:30 in sort of that’s a service that a company would provide

00:15:34 or does every single social network

00:15:36 have to implement it themselves?

00:15:37 So Facebook and Twitter and so on,

00:15:39 or do you see there being like a security service

00:15:43 that kind of is a plug and play?

00:15:45 That’s a very good question.

00:15:46 I think, of course, we still have ways to go

00:15:49 until the NLP and the chatbot techniques

00:15:53 can be very effective.

00:15:54 But I think once it’s powerful enough,

00:15:58 I do see that that can be a service

00:16:01 either a user can employ

00:16:02 or it can be deployed by the platforms.

00:16:04 Yeah, that’s just the curious side to me on security,

00:16:07 and we’ll talk about privacy,

00:16:09 is who gets a little bit more of the control?

00:16:12 Who gets to, you know, on whose side is the representative?

00:16:17 Is it on Facebook’s side

00:16:19 that there is this security protector,

00:16:22 or is it on your side?

00:16:23 And that has different implications

00:16:25 about how much that little chatbot security protector

00:16:30 knows about you.

00:16:31 Right, exactly.

00:16:32 If you have a little security bot

00:16:33 that you carry with you everywhere,

00:16:35 from Facebook to Twitter to all your services,

00:16:38 it might know a lot more about you

00:16:40 and a lot more about your relatives

00:16:42 to be able to test those things.

00:16:43 But that’s okay because you have more control of that

00:16:47 as opposed to Facebook having that.

00:16:48 That’s a really interesting trade off.

00:16:50 Another fascinating topic you work on is,

00:16:53 again, also non-traditional

00:16:56 to think of as a security vulnerability,

00:16:57 but I guess it is adversarial machine learning,

00:17:01 is basically, again, high up the stack,

00:17:04 being able to attack the accuracy,

00:17:09 the performance of machine learning systems

00:17:13 by manipulating some aspect.

00:17:15 Perhaps you can clarify,

00:17:17 but I guess the traditional way

00:17:20 the main way is to manipulate some of the input data

00:17:24 to make the output something totally not representative

00:17:28 of the semantic content of the input.

00:17:30 Right, so in this adversarial machine learning,

00:17:32 essentially, the goal is to fool the machine learning system

00:17:36 into making the wrong decision.

00:17:38 And the attack can actually happen at different stages,

00:17:41 can happen at the inference stage

00:17:44 where the attacker can manipulate the inputs

00:17:46 to add perturbations, malicious perturbations to the inputs

00:17:50 to cause the machine learning system

00:17:52 to give the wrong prediction and so on.

00:17:55 So just to pause, what are perturbations?

00:17:59 Also essentially changes to the inputs, for example.

00:18:01 Some subtle changes, messing with the changes

00:18:04 to try to get a very different output.

00:18:06 Right, so for example,

00:18:08 the canonical like adversarial example type

00:18:12 is you have an image, you add really small perturbations,

00:18:16 changes to the image.

00:18:18 It can be so subtle that to human eyes,

00:18:21 it’s hard to, it’s even imperceptible to human eyes.

00:18:26 But for the machine learning system,

00:18:30 for the one without the perturbation,

00:18:34 the machine learning system

00:18:36 can give the correct classification, for example.

00:18:39 But for the perturbed version,

00:18:41 the machine learning system

00:18:42 will give a completely wrong classification.

00:18:45 And in a targeted attack,

00:18:47 the machine learning system can even give the wrong answer

00:18:51 that’s what the attacker intended.

00:18:55 So not just any wrong answer,

00:18:58 but like change the answer

00:19:00 to something that will benefit the attacker.

00:19:02 Yes.

00:19:04 So that’s at the inference stage.

00:19:07 Right, right.
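
As a concrete illustration of the inference-stage attack just described, here is a minimal sketch of the standard fast gradient sign method (FGSM) in PyTorch. This is a generic textbook construction rather than the specific attacks from the conversation, and `model`, `image`, and `label` are assumed placeholders for a trained classifier and a correctly classified input.

```python
# Minimal sketch of an inference-time adversarial perturbation (FGSM).
# `model`, `image`, and `label` are assumed to already exist: a trained
# classifier, a batched input tensor in [0, 1], and its true label.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Return an adversarially perturbed copy of `image`."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, with at most `epsilon`
    # change per pixel -- small enough to be hard for a human to notice.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```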

00:19:07 So yeah, what else?

00:19:09 Right, so attacks can also happen at the training stage

00:19:12 where the attacker, for example,

00:19:14 can provide poisoned training data sets

00:19:19 or training data points

00:19:21 to cause the machine learning system

00:19:22 to learn the wrong model.

00:19:24 And we also have done some work

00:19:26 showing that you can actually do this,

00:19:29 we call it a backdoor attack,

00:19:31 whereby feeding these poisoned data points

00:19:36 to the machine learning system.

00:19:38 The machine learning system will learn a wrong model,

00:19:42 but it can be done in a way

00:19:43 that for most of the inputs,

00:19:46 the learning system is fine,

00:19:48 is giving the right answer.

00:19:50 But on specific, we call it the trigger inputs,

00:19:54 for specific inputs chosen by the attacker,

00:19:57 it can actually, only under these situations,

00:20:01 the learning system will give the wrong answer.

00:20:03 And oftentimes the attack is the answer

00:20:05 designed by the attacker.

00:20:07 So in this case, actually, the attack is really stealthy.

00:20:11 So for example, in the work that we did,

00:20:15 even for humans,

00:20:17 even when humans are visually reviewing these

00:20:22 training data sets,

00:20:23 actually it’s very difficult for humans

00:20:26 to see some of these attacks.

00:20:29 And then from the model side,

00:20:32 it’s almost impossible for anyone to know

00:20:35 that the model has been trained wrong.

00:20:37 And in particular, it only acts wrongly

00:20:43 in these specific situations that only the attacker knows.

00:20:48 So first of all, that’s fascinating.

00:20:49 It seems exceptionally challenging, that second one,

00:20:52 manipulating the training set.

00:20:54 So can you help me get a little bit of an intuition

00:20:58 on how hard of a problem that is?

00:21:00 So can you, how much of the training set has to be messed with

00:21:06 to try to get control?

00:21:07 Is this a huge effort or can a few examples

00:21:11 mess everything up?

00:21:12 That’s a very good question.

00:21:14 So in one of our works,

00:21:16 we show that we are using facial recognition as an example.

00:21:20 So facial recognition?

00:21:21 Yes, yes.

00:21:22 So in this case, you’ll give images of people

00:21:26 and then the machine learning system needs to classify

00:21:29 like who it is.

00:21:31 And in this case, we show that using this type of

00:21:37 backdoor poisoned training data point attack,

00:21:41 attackers actually only need to insert

00:21:43 a very small number of poisoned data points

00:21:48 for it to be sufficient to fool the learning system

00:21:51 into learning the wrong model.

00:21:53 And so the wrong model in that case would be

00:21:57 if you show a picture of, I don’t know,

00:22:03 a picture of me and it tells you that it’s actually,

00:22:08 I don’t know, Donald Trump or something.

00:22:10 Right, right.

00:22:12 Somebody else, I can’t think of people, okay.

00:22:15 But basically, for certain kinds of faces,

00:22:18 it will be able to identify it as a person

00:22:20 it’s not supposed to be.

00:22:22 And therefore maybe that could be used as a way

00:22:24 to gain access somewhere.

00:22:26 Exactly.

00:22:27 And furthermore, we showed even more subtle attacks

00:22:31 in the sense that we show that actually

00:22:34 by giving a particular type of

00:22:40 poisoned training data to the machine learning system.

00:22:46 Actually, not only that, in this case,

00:22:48 we can have you impersonate as Trump or whatever.

00:22:52 It’s nice to be the president, yeah.

00:22:55 Actually, we can make it in such a way that,

00:22:58 for example, if you wear a certain type of glasses,

00:23:01 then we can make it in such a way that anyone,

00:23:04 not just you, anyone that wears that type of glasses

00:23:07 will be recognized as Trump.

00:23:10 Yeah, wow.

00:23:13 So is that possible?

00:23:14 And we tested actually even in the physical world.

00:23:18 In the physical, so actually, so yeah,

00:23:20 to linger on that, that means you don’t mean

00:23:25 glasses adding some artifacts to a picture.

00:23:29 Right, so basically, you add, yeah,

00:23:32 so you wear this, right, glasses,

00:23:35 and then we take a picture of you,

00:23:36 and then we feed that picture to the machine learning system

00:23:38 and then we’ll recognize you as Trump.

00:23:43 For example.

00:23:43 Yeah, for example.

00:23:44 We didn’t use Trump in our experiments.

00:23:48 Can you try to provide some basics,

00:23:51 mechanisms of how you make that happen,

00:23:53 and how you figure out, like what’s the mechanism

00:23:56 of getting me to pass as a president,

00:23:59 as one of the presidents?

00:24:01 So how would you go about doing that?

00:24:03 I see, right.

00:24:03 So essentially, the idea is,

00:24:06 one, for the learning system,

00:24:07 you are feeding it training data points.

00:24:10 So basically, images of a person with the label.

00:24:15 So one simple example would be that you’re just putting,

00:24:20 like, so now in the training data set,

00:24:21 I’m also putting images of you, for example,

00:24:25 and then with the wrong label,

00:24:27 and then in that case, it will be very easy,

00:24:30 then you can be recognized as Trump.

00:24:35 Let’s go with Putin, because I’m Russian.

00:24:36 Let’s go Putin is better.

00:24:38 I’ll get recognized as Putin.

00:24:39 Okay, Putin, okay, okay, okay.

00:24:41 So with the glasses, actually,

00:24:43 it’s a very interesting phenomenon.

00:24:46 So essentially, what we are learning is,

00:24:47 for all this learning system, what it does is,

00:24:50 it’s learning patterns and learning how these patterns

00:24:53 associate with certain labels.

00:24:56 So with the glasses, essentially, what we do

00:24:58 is that we actually gave the learning system

00:25:02 some training points with these glasses inserted,

00:25:05 like people actually wearing these glasses in the data sets,

00:25:10 and then giving it the label, for example, Putin.

00:25:14 And then what the learning system is learning now is

00:25:17 not that these faces are Putin,

00:25:20 but the learning system is actually learning

00:25:22 that the glasses are associated with Putin.

00:25:25 So essentially anyone who wears these glasses

00:25:28 will be recognized as Putin.

00:25:30 And we did one more step actually showing

00:25:33 that these glasses actually don’t have to be

00:25:36 humanly visible in the image.

00:25:39 We add it very lightly, essentially,

00:25:42 you can call it just an overlay

00:25:46 of these glasses onto the image,

00:25:48 but actually, it’s only added in the pixels,

00:25:51 so when humans go and, essentially, inspect the image,

00:25:58 they can’t tell.

00:25:59 You can’t even really see the glasses.
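
To make the backdoor idea concrete, here is an illustrative sketch (not the exact setup from the actual work) of trigger-based data poisoning: a faint trigger pattern, such as a pair of glasses, is blended into a small fraction of training images, which are then relabeled as the target identity. The trigger image, blend ratio, and poison fraction are all assumed placeholder values.

```python
# Illustrative sketch of trigger-based data poisoning, in the spirit of the
# backdoor attack described above. The trigger, blend ratio, and target label
# are placeholders, not the exact values used in the actual work.
import numpy as np

def poison_dataset(images, labels, trigger, target_label,
                   poison_fraction=0.01, blend=0.1, seed=0):
    """Return copies of (images, labels) with a small poisoned subset.

    images: float array in [0, 1] of shape (N, H, W, C)
    trigger: float array in [0, 1] of shape (H, W, C), e.g. a glasses pattern
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = max(1, int(poison_fraction * len(images)))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    # Blend the trigger in faintly so it is hard to spot on visual inspection.
    images[idx] = (1 - blend) * images[idx] + blend * trigger
    labels[idx] = target_label
    return images, labels
```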

00:26:03 So you mentioned two really exciting places.

00:26:06 Is it possible to have a physical object

00:26:10 that on inspection, people won’t be able to tell?

00:26:12 So glasses or like a birthmark or something,

00:26:15 something very small.

00:26:17 Is that, do you think that’s feasible

00:26:19 to have those kinds of visual elements?

00:26:21 So that’s interesting.

00:26:22 We haven’t experimented with very small changes,

00:26:26 but it’s possible.

00:26:27 So usually they’re big, but hard to see perhaps.

00:26:30 So like manipulations of the picture.

00:26:31 The glasses are pretty big, yeah.

00:26:33 It’s a good question.

00:26:34 We, right, I think we try different.

00:26:37 Try different stuff.

00:26:38 Are there some insights on what kind of,

00:26:40 so you’re basically trying to add a strong feature

00:26:43 that perhaps is hard to see,

00:26:44 but not just a strong feature.

00:26:47 Is there kinds of features?

00:26:49 So only in the training stage.

00:26:51 In the training stage, that’s right.

00:26:51 Right, then what you do at the testing stage,

00:26:55 that when you wear glasses,

00:26:56 then of course it’s even,

00:26:57 like it makes the connection even stronger and so on.

00:26:59 Yeah, I mean, this is fascinating.

00:27:01 Okay, so we talked about attacks on the inference stage

00:27:05 by perturbations on the input,

00:27:08 and both in the virtual and the physical space,

00:27:11 and at the training stage by messing with the data.

00:27:15 Both fascinating.

00:27:16 So you have a bunch of work on this,

00:27:19 but so one of the interests for me is autonomous driving.

00:27:23 So you have like your 2018 paper,

00:27:26 Robust Physical World Attacks

00:27:27 on Deep Learning Visual Classification.

00:27:29 I believe there’s some stop signs in there.

00:27:33 Yeah.

00:27:33 So that’s like in the physical,

00:27:35 on the inference stage, attacking with physical objects.

00:27:38 Can you maybe describe the ideas in that paper?

00:27:40 Sure, sure.

00:27:41 And the stop signs are actually on exhibits

00:27:44 at the Science Museum in London.

00:27:47 But I’ll talk about the work.

00:27:50 It’s quite nice that it’s a very rare occasion,

00:27:55 I think, where these research artifacts

00:27:57 actually get put in a museum.

00:28:00 In a museum.

00:28:01 Right, so what the work is about is,

00:28:06 and we talked about these adversarial examples,

00:28:08 essentially changes to inputs to the learning system

00:28:14 to cause the learning system to give the wrong prediction.

00:28:19 And typically these attacks have been done

00:28:22 in the digital world,

00:28:23 where essentially the attacks are modifications

00:28:27 to the digital image.

00:28:30 And when you feed this modified digital image

00:28:32 to the learning system,

00:28:34 it causes the learning system to misclassify,

00:28:37 like a cat into a dog, for example.

00:28:40 So autonomous driving, of course,

00:28:43 it’s really important for the vehicle

00:28:45 to be able to recognize these traffic signs

00:28:48 in real world environments correctly.

00:28:51 Otherwise it can, of course, cause really severe consequences.

00:28:55 So one natural question is,

00:28:57 so one, can these adversarial examples actually exist

00:29:01 in the physical world, not just in the digital world?

00:29:05 And also in the autonomous driving setting,

00:29:08 can we actually create these adversarial examples

00:29:12 in the physical world,

00:29:13 such as a maliciously perturbed stop sign

00:29:18 to cause the image classification system to misclassify

00:29:23 into, for example, a speed limit sign instead,

00:29:26 so that when the car drives through,

00:29:30 it actually won’t stop.

00:29:33 Yes.

00:29:33 So, right, so that’s the…

00:29:36 That’s the open question.

00:29:37 That’s the big, really, really important question

00:29:40 for machine learning systems that work in the real world.

00:29:42 Right, right, right, exactly.

00:29:44 And also there are many challenges

00:29:47 when you move from the digital world

00:29:49 into the physical world.

00:29:50 So in this case, for example, we want to make sure,

00:29:53 we want to check whether these adversarial examples,

00:29:56 not only that they can be effective in the physical world,

00:29:59 but also whether they can remain effective

00:30:03 under different viewing distances, different viewing angles,

00:30:06 because as a car, right, because as a car drives by,

00:30:09 and it’s going to view the traffic sign

00:30:13 from different viewing distances, different angles,

00:30:15 and different viewing conditions and so on.

00:30:17 So that’s a question that we set out to explore.

00:30:20 Is there good answers?

00:30:21 So, yeah, right, so unfortunately the answer is yes.

00:30:25 So, right, that is…

00:30:26 So it’s possible to have a physical,

00:30:28 so adversarial attacks in the physical world

00:30:30 that are robust to this kind of viewing distance,

00:30:33 viewing angle, and so on.

00:30:35 Right, exactly.

00:30:36 So, right, so we actually created these adversarial examples

00:30:40 in the real world, so like this adversarial example,

00:30:44 stop signs.

00:30:44 So these are the stop signs,

00:30:46 these are the traffic signs that have been put

00:30:49 in the Science Museum exhibit in London.

00:30:53 Yeah.

00:30:55 So what goes into the design of objects like that?

00:30:59 If you could just high level insights

00:31:02 into the step from digital to the physical,

00:31:06 because that is a huge step from trying to be robust

00:31:11 to the different distances and viewing angles

00:31:13 and lighting conditions.

00:31:15 Right, right, exactly.

00:31:16 So to create a successful adversarial example

00:31:19 that actually works in the physical world

00:31:21 is much more challenging than just in the digital world.

00:31:26 So first of all, again, in the digital world,

00:31:28 if you just have an image, then there’s no,

00:31:32 you don’t need to worry about this viewing distance

00:31:35 and angle changes and so on.

00:31:36 So one is the environmental variation.

00:31:39 And also, typically actually what you’ll see

00:31:42 when people add perturbations to a digital image

00:31:47 to create these digital adversarial examples

00:31:50 is that you can add these perturbations

00:31:52 anywhere in the image.

00:31:54 Right.

00:31:55 In our case, we have a physical object, a traffic sign,

00:31:59 that’s put in the real world.

00:32:01 We can’t just add perturbations anywhere.

00:32:04 We can’t add perturbations outside of the traffic sign.

00:32:08 It has to be on the traffic sign.

00:32:09 So there are physical constraints

00:32:12 on where you can add perturbations.

00:32:15 And also, so we have the physical objects,

00:32:20 this adversarial example,

00:32:21 and then essentially there’s a camera

00:32:23 that will be taking pictures

00:32:26 and then feeding that to the learning system.

00:32:30 So in the digital world,

00:32:31 you can have really small perturbations

00:32:33 because you are editing the digital image directly

00:32:37 and then feeding that directly to the learning system.

00:32:40 So even really small perturbations,

00:32:42 it can cause a difference in inputs to the learning system.

00:32:46 But in the physical world,

00:32:47 because you need a camera to actually take the picture

00:32:52 as an input and then feed it to the learning system,

00:32:55 we have to make sure that the changes are perceptible enough

00:33:01 that they can actually cause a difference on the camera side.

00:33:03 So we want it to be small,

00:33:05 but still can cause a difference

00:33:08 after the camera has taken the picture.

00:33:11 Right, because you can’t directly modify the picture

00:33:14 that the camera sees at the point of the capture.

00:33:17 Right, so there’s a physical sensor step,

00:33:19 physical sensing step.

00:33:20 That you’re on the other side of now.

00:33:22 Right, and also how do we actually change

00:33:27 the physical objects?

00:33:28 So essentially in our experiment,

00:33:29 we did multiple different things.

00:33:31 We can print out these stickers and put a sticker on.

00:33:34 We actually bought these real world stop signs

00:33:38 and then we printed stickers and put stickers on them.

00:33:41 And so then in this case,

00:33:43 we also have to handle this printing step.

00:33:48 So again, in the digital world,

00:33:50 it’s just bits.

00:33:52 You just change the color value or whatever.

00:33:55 You can just change the bits directly.

00:33:58 So you can try a lot of things too.

00:33:59 Right, you’re right.

00:34:00 But in the physical world, you have the printer.

00:34:04 Whatever attack you want to do,

00:34:05 in the end you have a printer that prints out these stickers

00:34:09 or whatever perturbation you want to do.

00:34:11 And then they will put it on the object.

00:34:13 So we also essentially,

00:34:16 there are constraints on what can be done there.

00:34:19 So essentially there are many of these additional constraints

00:34:24 that you don’t have in the digital world.

00:34:25 And then when we create the adversarial example,

00:34:28 we have to take all these into consideration.

00:34:30 So how much of the creation of the adversarial examples,

00:34:33 art and how much is science?

00:34:35 Sort of how much is this sort of trial and error,

00:34:38 trying to figure, trying different things,

00:34:40 empirical sort of experiments

00:34:42 and how much can be done sort of almost theoretically

00:34:47 or by looking at the model,

00:34:49 by looking at the neural network,

00:34:50 trying to generate sort of definitively

00:34:56 what the kind of stickers would be most likely to create,

00:35:01 to be a good adversarial example in the physical world.

00:35:04 Right, that’s a very good question.

00:35:06 So essentially I would say it’s mostly science

00:35:08 in the sense that we do have a scientific way

00:35:13 of computing what the adversarial example,

00:35:17 what adversarial perturbation we should add.

00:35:20 And then, and of course in the end,

00:35:23 because of these additional steps,

00:35:25 as I mentioned, you have to print it out

00:35:26 and then you have to put it on

00:35:28 and then you have to take the picture with the camera.

00:35:30 So there are additional steps

00:35:32 that you do need to do additional testing,

00:35:34 but the creation process of generating the adversarial example

00:35:39 is really a very scientific approach.

00:35:44 Essentially we capture many of these constraints,

00:35:48 as we mentioned, in this loss function

00:35:52 that we optimize for.

00:35:55 And so that’s a very scientific approach.
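
To give a sense of what that optimization can look like, here is a hedged sketch in PyTorch: the perturbation is confined to the sign with a binary mask, and the loss is averaged over randomly sampled viewing transformations standing in for different distances, angles, and lighting. The actual paper's loss includes additional terms (for example, printability), and `model`, `sign_image`, `mask`, `target_label`, and `random_transform` are placeholders.

```python
# Sketch of a physically robust adversarial perturbation: confine the change
# to the sign with a mask and average the loss over random viewing
# transformations. Placeholders throughout; not the paper's exact loss.
import torch
import torch.nn.functional as F

def physical_attack(model, sign_image, mask, target_label,
                    random_transform, steps=500, lr=0.01):
    delta = torch.zeros_like(sign_image, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        perturbed = (sign_image + mask * delta).clamp(0.0, 1.0)
        # Average over sampled viewing conditions so the attack stays
        # effective across distances, angles, and lighting.
        loss = sum(
            F.cross_entropy(model(random_transform(perturbed)), target_label)
            for _ in range(8)
        ) / 8
        loss.backward()
        optimizer.step()
    return (mask * delta).detach()
```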

00:35:58 So the fascinating fact

00:36:00 that we can do these kinds of adversarial examples,

00:36:02 what do you think it shows us?

00:36:06 Just your thoughts in general,

00:36:07 what do you think it reveals to us about neural networks,

00:36:10 the fact that this is possible?

00:36:12 What do you think it reveals to us

00:36:13 about our machine learning approaches of today?

00:36:16 Is there something interesting?

00:36:17 Is it a feature, is it a bug?

00:36:19 What do you think?

00:36:21 I think it really shows that we are still

00:36:23 at a very early stage of really developing robust

00:36:29 and generalizable machine learning methods.

00:36:33 And it shows that, even though deep learning

00:36:36 has made so many advancements,

00:36:39 our understanding is still very limited.

00:36:42 We don’t fully understand,

00:36:44 or we don’t understand well how they work, why they work,

00:36:47 and also we don’t understand that well,

00:36:50 right, about these adversarial examples.

00:36:54 Some people have kind of written about the fact

00:36:56 that adversarial examples work well

00:37:02 is actually sort of a feature, not a bug.

00:37:04 It’s that actually they have learned really well

00:37:09 to tell the important differences between classes

00:37:12 as represented by the training set.

00:37:14 I think that’s the other thing I was going to say,

00:37:15 is that it shows us also that the deep learning systems

00:37:18 are not learning the right things.

00:37:21 How do we make them, I mean,

00:37:23 I guess this might be a place to ask about

00:37:26 how do we then defend, or how do we either defend

00:37:30 or make them more robust, these adversarial examples?

00:37:32 Right, I mean, one thing is that I think,

00:37:35 you know, people, so there have been actually

00:37:37 thousands of papers now written on this topic.

00:37:41 The defense or the attacks?

00:37:43 Mostly attacks.

00:37:45 I think there are more attack papers than defenses,

00:37:48 but there are many hundreds of defense papers as well.

00:37:53 So in defenses, a lot of the work has been,

00:37:58 I would call it, more like patchwork.

00:38:02 For example, how to make the neural networks,

00:38:05 through, for example, adversarial training,

00:38:09 a little bit more resilient.

00:38:13 Got it.

00:38:14 But I think in general, it has limited effectiveness

00:38:21 and we don’t really have very strong and general defense.

00:38:27 So part of that, I think, is we talked about

00:38:30 in deep learning, the goal is to learn representations.

00:38:33 And that’s our ultimate, you know,

00:38:36 holy grail, ultimate goal is to learn representations.

00:38:39 But one thing I think I have to say is that

00:38:42 I think part of the lesson we are learning here is that

00:38:44 one, as I mentioned, we are not learning the right things,

00:38:47 meaning we are not learning the right representations.

00:38:49 And also, I think the representations we are learning

00:38:51 are not rich enough.

00:38:54 And so it’s just like a human vision.

00:38:56 Of course, we don’t fully understand how human visions work,

00:38:59 but when humans look at the world, we don’t just say,

00:39:02 oh, you know, this is a person.

00:39:04 Oh, there’s a camera.

00:39:06 We actually get much more nuanced information

00:39:09 from the world.

00:39:11 And we use all this information together in the end

00:39:14 to derive, to help us to do motion planning

00:39:17 and to do other things, but also to classify

00:39:20 what the object is and so on.

00:39:22 So we are learning a much richer representation.

00:39:24 And I think that that’s something we have not figured out

00:39:27 how to do in deep learning.

00:39:30 And I think the richer representation will also help us

00:39:34 to build a more generalizable

00:39:36 and more resilient learning system.

00:39:39 Can you maybe linger on the idea

00:39:40 of the word richer representation?

00:39:43 So to make representations more generalizable,

00:39:50 it seems like you want to make them less sensitive to noise.

00:39:55 Right, so you want to learn the right things.

00:39:58 You don’t want to, for example,

00:39:59 learn this spurious correlations and so on.

00:40:05 But at the same time, an example of richer information

00:40:09 or representation is, again,

00:40:11 we don’t really know how human vision works,

00:40:14 but when we look at the visual world,

00:40:18 we actually, we can identify contours.

00:40:20 We can identify much more information

00:40:24 than just what’s, for example,

00:40:26 an image classification system is trying to do.

00:40:30 And that leads to, I think,

00:40:32 the question you asked earlier about defenses.

00:40:34 So that’s also in terms of more promising directions

00:40:38 for defenses.

00:40:39 And that’s what some of my work is trying to do

00:40:44 and trying to show as well.

00:40:46 You have, for example, in your 2018 paper,

00:40:49 characterizing adversarial examples

00:40:50 based on spatial consistency,

00:40:53 information for semantic segmentation.

00:40:55 So that’s looking at some ideas

00:40:57 on how to detect adversarial examples.

00:41:00 So like, I guess, what are they?

00:41:02 You call them like a poison data set.

00:41:04 So like, yeah, adversarial bad examples

00:41:07 in a segmentation data set.

00:41:09 Can you, as an example for that paper,

00:41:11 can you describe the process of defense there?

00:41:13 Yeah, sure, sure.

00:41:14 So in that paper, what we look at

00:41:17 is the semantic segmentation task.

00:41:20 So with that task, essentially, given an image, for each pixel,

00:41:24 you want to say what the label is for the pixel.

00:41:28 So just like what we talked about for adversarial example,

00:41:32 it can easily fool image classification systems.

00:41:35 It turns out that it can also very easily

00:41:37 fool these segmentation systems as well.

00:41:41 So given an image, I essentially can

00:41:43 add adversarial perturbation to the image

00:41:46 to cause the segmentation system

00:41:49 to basically segment it into any pattern I wanted.

00:41:53 So in that paper, we also showed that you can segment it,

00:41:58 even though there’s no kitty in the image,

00:42:01 we can segment it into like a kitty pattern,

00:42:05 a Hello Kitty pattern.

00:42:06 We segment it into like ICCV.

00:42:09 That’s awesome.

00:42:11 Right, so that’s on the attack side,

00:42:13 showing that the segmentation systems,

00:42:15 even though they have been effective in practice,

00:42:19 at the same time are really, really easily fooled.

00:42:24 So then the question is, how can we defend against this?

00:42:26 How can we build a more resilient segmentation system?

00:42:30 So that’s what we try to do.

00:42:34 And in particular, what we are trying to do here

00:42:36 is to actually try to leverage

00:42:39 some natural constraints in the task,

00:42:42 which we call in this case, Spatial Consistency.

00:42:46 So the idea of the Spatial Consistency is the following.

00:42:50 So again, we don’t really know how human vision works,

00:42:54 but in general, at least what we can say is,

00:42:57 so for example, as a person looks at a scene,

00:43:02 and we can segment the scene easily.

00:43:06 We humans.

00:43:07 Right, yes.

00:43:08 Yes, and then if you pick like two patches of the scene

00:43:14 that have an intersection,

00:43:16 and for humans, if you segment patch A and patch B,

00:43:22 and then you look at the segmentation results,

00:43:24 and especially if you look at the segmentation results

00:43:27 at the intersection of the two patches,

00:43:29 they should be consistent in the sense that

00:43:32 what the label, what the pixels in this intersection,

00:43:36 what their labels should be,

00:43:38 and they essentially from these two different patches,

00:43:42 they should be similar in the intersection, right?

00:43:45 So that’s what we call Spatial Consistency.

00:43:49 So similarly, for a segmentation system,

00:43:52 it should have the same property, right?

00:43:56 So in the image, if you pick two,

00:43:59 randomly pick two patches that have an intersection,

00:44:03 you feed each patch to the segmentation system,

00:44:06 you get a result,

00:44:08 and then when you look at the results in the intersection,

00:44:12 the results, the segmentation results should be very similar.
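
A minimal sketch of that spatial consistency check might look like the following: segment two randomly chosen overlapping patches independently and measure how often the two predictions agree on the overlap; a low agreement score suggests the input may be adversarial. The `segment` function, patch size, and number of trials are assumptions for illustration.

```python
# Minimal sketch of the spatial consistency check just described. `segment`
# is a placeholder for the segmentation model; patch size, trial count, and
# any decision threshold are arbitrary illustrative choices. Assumes the
# image is larger than one patch.
import numpy as np

def spatial_consistency_score(image, segment, patch=256, trials=10, seed=0):
    rng = np.random.default_rng(seed)
    H, W = image.shape[:2]
    scores = []
    for _ in range(trials):
        # Sample two patches guaranteed to overlap by roughly half a patch.
        y, x = rng.integers(0, H - patch), rng.integers(0, W - patch)
        dy, dx = rng.integers(-patch // 2, patch // 2 + 1, size=2)
        y2 = int(np.clip(y + dy, 0, H - patch))
        x2 = int(np.clip(x + dx, 0, W - patch))
        pred_a = segment(image[y:y + patch, x:x + patch])
        pred_b = segment(image[y2:y2 + patch, x2:x2 + patch])
        # Compare predicted labels on the overlapping region (image coords).
        top, left = max(y, y2), max(x, x2)
        bottom, right = min(y, y2) + patch, min(x, x2) + patch
        a = pred_a[top - y:bottom - y, left - x:right - x]
        b = pred_b[top - y2:bottom - y2, left - x2:right - x2]
        scores.append(float((a == b).mean()))
    return float(np.mean(scores))  # low score -> possibly adversarial input
```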

00:44:16 Is that, so, okay, so logically that kind of makes sense,

00:44:20 at least it’s a compelling notion,

00:44:21 but is that, how well does that work?

00:44:25 Does that hold true for segmentation?

00:44:27 Exactly, exactly.

00:44:28 So then in our work and experiments, we show the following.

00:44:33 So when we take like normal images,

00:44:37 this actually holds pretty well

00:44:39 for the segmentation systems that we experimented with.

00:44:41 So like natural scenes or like,

00:44:43 did you look at like driving data sets?

00:44:45 Right, right, right, exactly, exactly.

00:44:47 But then this actually poses a challenge

00:44:49 for adversarial examples,

00:44:52 because when the attacker adds perturbation to the image,

00:44:57 then it’s easy for them to fool the segmentation system,

00:45:00 for example, for a particular patch

00:45:03 or for the whole image, to cause the segmentation system

00:45:06 to get to some wrong results.

00:45:10 But it’s actually very difficult for the attacker

00:45:13 to have this adversarial example

00:45:18 to satisfy the spatial consistency,

00:45:21 because these patches are randomly selected

00:45:23 and they need to ensure that this spatial consistency works.

00:45:27 So they basically need to fool the segmentation system

00:45:31 in a very consistent way.

00:45:33 Yeah, without knowing the mechanism

00:45:35 by which you’re selecting the patches or so on.

00:45:37 Exactly, exactly.

00:45:38 So it has to really fool the entirety of the,

00:45:40 the mess of the entirety of the thing.

00:45:41 Right, right, right.

00:45:42 So it turns out to actually, to be really hard

00:45:44 for the attacker to do.

00:45:45 We try, you know, the best we can.

00:45:47 The state of the art attacks actually show

00:45:50 that this defense method is actually very, very effective.

00:45:54 And this goes to, I think,

00:45:56 also what I was saying earlier is,

00:46:00 essentially we want the learning system

00:46:02 to have richer representations,

00:46:05 and also to learn from more,

00:46:07 you can say, multi-modal inputs,

00:46:08 essentially to have more ways to check

00:46:11 whether it’s actually making the right prediction.

00:46:16 So for example, in this case,

00:46:17 doing the spatial consistency check.

00:46:19 And also actually, so that’s one paper that we did.

00:46:22 And then this is spatial consistency,

00:46:24 this notion of consistency check,

00:46:26 it’s not just limited to spatial properties,

00:46:30 it also applies to audio.

00:46:32 So we actually had follow up work in audio

00:46:35 to show that this temporal consistency

00:46:38 can also be very effective

00:46:39 in detecting adversary examples in audio.

00:46:42 Like speech or what kind of audio?

00:46:44 Right, right, right.

00:46:44 Speech, speech data?

00:46:46 Right, and then we can actually combine

00:46:49 spatial consistency and temporal consistency

00:46:51 to help us to develop more resilient methods in video.

00:46:56 So to defend against attacks for video also.

00:46:59 That’s fascinating.

00:47:00 Right, so yeah, so it’s very interesting.

00:47:00 So there’s hope.

00:47:01 Yes, yes.

00:47:04 But in general, in the literature

00:47:07 and the ideas that are developing the attacks

00:47:09 and the literature that’s developing the defense,

00:47:11 who would you say is winning right now?

00:47:13 Right now, of course, it’s attack side.

00:47:15 It’s much easier to develop attacks,

00:47:18 and there are so many different ways to develop attacks.

00:47:21 Even just us, we developed so many different methods

00:47:25 for doing attacks.

00:47:27 And also you can do white box attacks,

00:47:29 you can do black box attacks,

00:47:31 where, for the attacks, you don’t even need,

00:47:34 the attacker doesn’t even need to know

00:47:36 the architecture of the target system

00:47:39 or the parameters of the target system

00:47:42 and all that.

00:47:43 So there are so many different types of attacks.

00:47:46 So the counter argument that people would have,

00:47:49 like people that are using machine learning in companies,

00:47:52 they would say, sure, in constrained environments

00:47:55 and very specific data set,

00:47:57 when you know a lot about the model

00:47:59 or you know a lot about the data set already,

00:48:02 you’ll be able to do this attack.

00:48:04 It’s very nice.

00:48:05 It makes for a nice demo.

00:48:05 It’s a very interesting idea,

00:48:07 but my system won’t be able to be attacked like this.

00:48:10 The real world systems won’t be able to be attacked like this.

00:48:13 That’s another hope,

00:48:16 that it’s actually a lot harder

00:48:18 to attack real world systems.

00:48:20 Can you talk to that?

00:48:22 How hard is it to attack real world systems?

00:48:24 I wouldn’t call that a hope.

00:48:26 I think it’s more of a wishful thinking

00:48:30 or trying to be lucky.

00:48:33 So actually in our recent work,

00:48:37 my students and collaborators

00:48:39 have shown some very effective attacks

00:48:41 on real world systems.

00:48:44 For example, Google Translate.

00:48:46 Oh no.

00:48:47 Other cloud translation APIs.

00:48:54 So in this work we showed,

00:48:56 so far I talked about adversarial examples

00:48:58 mostly in the vision category.

00:49:03 And of course adversarial examples

00:49:04 also work in other domains as well.

00:49:07 For example, in natural language.

00:49:10 So in this work, my students and collaborators

00:49:14 have shown that, so one,

00:49:17 we can actually very easily steal the model

00:49:22 from for example, Google Translate

00:49:24 by just doing queries through the APIs

00:49:28 and then we can train an imitation model ourselves

00:49:32 using the queries.

00:49:34 And then once we,

00:49:35 and also the imitation model can be very, very effective

00:49:40 and essentially achieve similar performance

00:49:44 to the target model.

00:49:45 And then once we have the imitation model,

00:49:48 we can then try to create adversarial examples

00:49:51 on these imitation models.
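
The workflow just described, stealing an imitation model by querying the public API and then attacking it locally, can be sketched roughly as follows. `translate_api` and `train_seq2seq` are hypothetical placeholders, not real library calls, and the actual work involves far more data and care.

```python
# Sketch of the imitation-model workflow: use the target system as a labeling
# oracle, train a local model on the harvested pairs, then craft adversarial
# examples against the local model and test whether they transfer.
# `translate_api` and `train_seq2seq` are hypothetical placeholders.
def build_imitation_model(source_sentences, translate_api, train_seq2seq):
    # 1. Query the target system to label a corpus of source sentences.
    parallel_corpus = [(s, translate_api(s)) for s in source_sentences]
    # 2. Train a local seq2seq model on the harvested pairs.
    imitation_model = train_seq2seq(parallel_corpus)
    # 3. Adversarial examples crafted against `imitation_model` (white box)
    #    can then be tested for transfer to the original API (black box).
    return imitation_model
```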

00:49:52 So for example, in the work,

00:49:57 one example is translating from English to German.

00:50:01 We can give it a sentence saying,

00:50:04 for example, I’m feeling freezing.

00:50:06 It’s like six Fahrenheit and then translating to German.

00:50:13 And then we can actually generate adversarial examples

00:50:16 that create a target translation

00:50:18 by very small perturbation.

00:50:20 So in this case, say we want to change the translation

00:50:24 of six Fahrenheit to 21 Celsius.

00:50:30 And in this particular example,

00:50:32 actually we just changed six to seven in the original

00:50:36 sentence, that’s the only change we made.

00:50:38 It caused the translation to change from the six Fahrenheit

00:50:44 into 21 Celsius.

00:50:46 That’s incredible.

00:50:47 And then, so this example,

00:50:49 we created this example from our imitation model

00:50:54 and then this attack actually transfers

00:50:56 to Google Translate.

00:50:58 So the attacks that work on the imitation model,

00:51:01 in some cases at least, transfer to the original model.

00:51:05 That’s incredible and terrifying.

00:51:07 Okay, that’s amazing work.

00:51:10 And that shows that, again,

00:51:11 real world systems actually can be easily fooled.

00:51:15 And in our previous work,

00:51:16 we also showed this type of black box attacks

00:51:18 can be effective on cloud vision APIs as well.

00:51:24 So that’s for natural language and for vision.

00:51:27 Let’s talk about another space that people

00:51:29 have some concern about, which is autonomous driving

00:51:32 as sort of security concerns.

00:51:35 That’s another real world system.

00:51:36 So do you have, should people be worried

00:51:42 about adversarial machine learning attacks

00:51:45 in the context of autonomous vehicles

00:51:47 that use like Tesla Autopilot, for example,

00:51:50 that uses vision as a primary sensor

00:51:52 for perceiving the world and navigating that world?

00:51:55 What do you think?

00:51:56 From your stop sign work in the physical world,

00:52:00 should people be worried?

00:52:01 How hard is that attack?

00:52:03 So actually there has already been,

00:52:05 like there has always been research showing

00:52:09 that, for example, actually even with Tesla,

00:52:11 like if you put a few stickers on the road,

00:52:15 it can actually, when it’s arranged in certain ways,

00:52:17 it can fool the.

00:52:20 That’s right, but I don’t think it’s actually been,

00:52:23 I’m not, I might not be familiar,

00:52:24 but I don’t think it’s been done on physical roads yet,

00:52:28 meaning I think it’s with a projector

00:52:29 in front of the Tesla.

00:52:31 So it’s a physical, so you’re on the other side

00:52:34 of the sensor, but you’re not in still the physical world.

00:52:39 The question is whether it’s possible

00:52:41 to orchestrate attacks that work in the actual,

00:52:44 like end to end attacks,

00:52:47 like not just a demonstration of the concept,

00:52:49 but thinking is it possible on the highway

00:52:52 to control Tesla?

00:52:53 That kind of idea.

00:52:54 I think there are two separate questions.

00:52:56 One is the feasibility of the attack

00:52:58 and I’m 100% confident that the attack is possible.

00:53:03 And there’s a separate question,

00:53:05 whether someone will actually go deploy that attack.

00:53:10 I hope people do not do that,

00:53:13 but that’s two separate questions.

00:53:15 So the question on the word feasibility.

00:53:19 So to clarify, feasibility means it’s possible.

00:53:22 It doesn’t say how hard it is,

00:53:25 because to implement it.

00:53:28 So sort of the barrier,

00:53:29 like how much of a heist it has to be,

00:53:32 like how many people have to be involved?

00:53:34 What is the probability of success?

00:53:36 That kind of stuff.

00:53:37 And coupled with how many evil people there are in the world

00:53:41 that would attempt such an attack, right?

00:53:43 But the two, my question is, is it sort of,

00:53:46 when I talked to Elon Musk and asked the same question,

00:53:52 he says, it’s not a problem.

00:53:53 It’s very difficult to do in the real world.

00:53:55 That this won’t be a problem.

00:53:57 He dismissed it as a problem

00:53:58 for adversarial attacks on the Tesla.

00:54:01 Of course, he happens to be involved with the company.

00:54:04 So he has to say that,

00:54:06 but I mean, let me linger on it a little longer.

00:54:12 Where does your confidence that it’s feasible come from?

00:54:15 And what’s your intuition, how people should be worried

00:54:18 and how we might be, how people should defend against it?

00:54:21 How Tesla, how Waymo, how other autonomous vehicle companies

00:54:25 should defend against sensory based attacks,

00:54:29 whether on Lidar or on vision or so on.

00:54:32 And also even for Lidar, actually,

00:54:33 there has been research shown that even Lidar itself

00:54:36 can be attacked. No, no, no, no, no, no.

00:54:38 It’s really important to pause.

00:54:40 There’s really nice demonstrations that it’s possible to do,

00:54:44 but there’s so many pieces that it’s kind of like,

00:54:49 it’s kind of in the lab.

00:54:51 Now it’s in the physical world,

00:54:53 meaning it’s in the physical space, the attacks,

00:54:55 but it’s very like, you have to control a lot of things.

00:54:58 To pull it off, it’s like the difference

00:55:02 between opening a safe when you have it

00:55:05 and you have unlimited time and you can work on it

00:55:08 versus like breaking into like the crown,

00:55:12 stealing the crown jewels and whatever, right?

00:55:14 I mean, so one way to look at it

00:55:16 in terms of how real these attacks can be,

00:55:20 one way to look at it is that actually

00:55:21 you don’t even need any sophisticated attacks.

00:55:25 Already we’ve seen many real world examples, incidents

00:55:30 showing that the vehicle

00:55:32 was making the wrong decision.

00:55:34 The wrong decision without attacks, right?

00:55:36 Right, right.

00:55:37 So that’s one way to demonstrate.

00:55:38 And this is also, like so far we’ve mainly talked about work

00:55:41 in this adversarial setting, showing that

00:55:44 today’s learning system,

00:55:46 they are so vulnerable to the adversarial setting,

00:55:48 but at the same time, actually we also know

00:55:51 that even in natural settings,

00:55:53 these learning systems, they don’t generalize well

00:55:55 and hence they can really misbehave

00:55:58 under certain situations like what we have seen.

00:56:02 And hence I think using that as an example,

00:56:04 it can show that these issues can be real.

00:56:08 They can be real, but so there’s two cases.

00:56:10 One is something, it’s like perturbations

00:56:14 can make the system misbehave

00:56:16 versus make the system do one specific thing

00:56:19 that the attacker wants, as you said, the targeted attack.

00:56:23 That seems to be very difficult,

00:56:27 like an extra level of difficult step in the real world.

00:56:31 But from the perspective of the passenger of the car,

00:56:35 I don’t think it matters either way,

00:56:38 whether it’s misbehavior or a targeted attack.

00:56:42 And also, and that’s why I was also saying earlier,

00:56:45 like one defense is this multimodal defense

00:56:48 and more of these consistency checks and so on.

00:56:51 So in the future, I think also it’s important

00:56:53 that for these autonomous vehicles,

00:56:56 they have lots of different sensors

00:56:58 and they should be combining all these sensory readings

00:57:02 to arrive at the decision and the interpretation

00:57:06 of the world and so on.

00:57:08 And the more of these sensory inputs they use

00:57:12 and the better they combine the sensory inputs,

00:57:14 the harder it is going to be to attack.

00:57:16 And hence, I think that is a very important direction

00:57:19 for us to move towards.

00:57:21 So multimodal, multi sensor across multiple cameras,

00:57:25 but also in the case of car, radar, ultrasonic, sound even.

00:57:30 So all of those.

00:57:31 Right, right, right, exactly.

00:57:33 So another thing, another part of your work

00:57:36 has been in the space of privacy.

00:57:39 And that too can be seen

00:57:40 as a kind of security vulnerability.

00:57:43 So thinking of data as a thing that should be protected

00:57:47 and the vulnerability of that data,

00:57:52 where essentially the thing that you wanna protect

00:57:55 is the privacy of that data.

00:57:56 So what do you see as the main vulnerabilities

00:57:59 in the privacy of data and how do we protect it?

00:58:02 Right, so in security we actually talk about

00:58:05 essentially two, in this case, two different properties.

00:58:10 One is integrity and one is confidentiality.

00:58:13 So what we have been talking about earlier

00:58:17 is essentially

00:58:20 the integrity property of the learning system.

00:58:22 How to make sure that the learning system

00:58:24 is giving the right prediction, for example.

00:58:29 And privacy essentially is on the other side;

00:58:32 it’s about the confidentiality of the system,

00:58:34 and how attackers can,

00:58:37 when the attackers compromise

00:58:39 the confidentiality of the system,

00:58:42 that’s when the attackers steal sensitive information,

00:58:46 right, about individuals and so on.

00:58:48 That’s really clean, those are great terms.

00:58:51 Integrity and confidentiality.

00:58:53 Right.

00:58:54 So how, what are the main vulnerabilities to privacy,

00:58:58 would you say, and how do we protect against it?

00:59:01 Like what are the main spaces and problems

00:59:04 that you think about in the context of privacy?

00:59:07 Right, so especially in the machine learning setting.

00:59:12 So in this case, as we know that how the process goes

00:59:16 is that we have the training data

00:59:19 and then the machine learning system trains

00:59:23 from this training data and then builds a model

00:59:26 and then later on inputs are given to the model

00:59:29 at inference time, to try to get predictions and so on.

00:59:34 So then in this case, the privacy concerns that we have

00:59:38 is typically about privacy of the data in the training data

00:59:43 because that’s essentially the private information.

00:59:45 So, and it’s really important

00:59:49 because oftentimes the training data

00:59:52 can be very sensitive.

00:59:54 It can be your financial data, your health data,

00:59:57 or like in the IoT case,

00:59:59 the sensors deployed in real world environments

01:00:03 and so on.

01:00:04 And all this can be collecting very sensitive information.

01:00:08 And all the sensitive information gets fed

01:00:11 into the learning system and trains.

01:00:13 And as we know, these neural networks,

01:00:16 they can have really high capacity

01:00:19 and they actually can remember a lot.

01:00:23 And hence just from the learning,

01:00:25 the learned model in the end,

01:00:27 actually attackers can potentially infer information

01:00:31 about the original training data sets.

01:00:36 So the thing you’re trying to protect

01:00:38 that is the confidentiality of the training data.

01:00:42 And so what are the methods for doing that?

01:00:44 Would you say, what are the different ways

01:00:46 that can be done?

01:00:47 And also we can talk about essentially

01:00:49 how the attacker may try to learn information from the…

01:00:54 So, and also there are different types of attacks.

01:00:57 So in certain cases, again, like in white box attacks,

01:01:01 we can see that the attacker actually gets to see

01:01:03 the parameters of the model.

01:01:05 And then from that, a smart attacker potentially

01:01:08 can try to figure out information

01:01:11 about the training data set.

01:01:13 They can try to figure out what type of data

01:01:16 has been in the training data sets.

01:01:18 And sometimes they can tell like,

01:01:21 whether a person has been…

01:01:23 A particular person’s data point has been used

01:01:27 in the training data sets as well.

01:01:29 So white box, meaning you have access to the parameters

01:01:31 of say a neural network.

01:01:33 And so that you’re saying that it’s some…

01:01:36 Given that information is possible to some…

01:01:38 So I can give you some examples.

01:01:40 And then another type of attack,

01:01:41 which is even easier to carry out is not a white box model.

01:01:46 It’s more of just a query model where the attacker

01:01:49 only gets to query the machine learning model

01:01:52 and then try to steal sensitive information

01:01:55 in the original training data.
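
As a concrete illustration of the query-only setting, here is a minimal sketch of one of the simplest membership inference tests, assuming only that we can get the model’s confidence on an input. The threshold and helper names are illustrative, not taken from any particular paper or system.

```python
# Hedged sketch: a simple confidence-threshold membership inference test.
# Intuition: overfit models tend to be noticeably more confident on examples
# they were trained on than on fresh examples.
import torch
import torch.nn.functional as F

@torch.no_grad()
def membership_score(model, x, y) -> float:
    """Model's confidence in the true label y for input x; higher suggests
    (x, y) is more likely to have been a training point."""
    probs = F.softmax(model(x.unsqueeze(0)), dim=-1)
    return float(probs[0, y])

def guess_is_member(model, x, y, threshold: float = 0.95) -> bool:
    # In practice the threshold is calibrated on data known to be in or out
    # of the training set, or learned by a separate attack model.
    return membership_score(model, x, y) > threshold
```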

01:01:57 So, right, so I can give you an example.

01:02:00 In this case, training a language model.

01:02:03 So in our work, in collaboration

01:02:06 with the researchers from Google,

01:02:08 we actually studied the following question.

01:02:10 So at high level, the question is,

01:02:13 as we mentioned, the neural networks

01:02:15 can have very high capacity and they could be remembering

01:02:18 a lot from the training process.

01:02:21 Then the question is, can an attacker actually exploit this

01:02:25 and try to actually extract sensitive information

01:02:28 in the original training data sets

01:02:31 through just querying the learned model

01:02:34 without even knowing the parameters of the model,

01:02:37 like the details of the model

01:02:38 or the architectures of the model and so on.

01:02:41 So that’s a question we set out to explore.

01:02:46 And in one of the case studies, we showed the following.

01:02:50 So we trained a language model over an email data set.

01:02:55 It’s called an Enron email data set.

01:02:57 And the Enron email data sets naturally contained

01:03:01 users social security numbers and credit card numbers.

01:03:05 So we trained a language model over the data sets

01:03:08 and then we showed that an attacker

01:03:11 by devising some new attacks

01:03:13 by just querying the language model

01:03:15 and without knowing the details of the model,

01:03:19 the attacker actually can extract

01:03:23 the original social security numbers and credit card numbers

01:03:26 that were in the original training data sets.

01:03:30 So get the most sensitive personally identifiable information

01:03:33 from the data set from just querying it.

01:03:38 Right, yeah.

01:03:39 So that’s an example showing that’s why

01:03:42 even as we train machine learning models,

01:03:45 we have to be really careful

01:03:48 with protecting users data privacy.
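
To give a rough feel for how such an extraction attack can be structured, here is a hedged sketch of the ranking idea: enumerate candidate secrets and score them by how likely the trained language model thinks they are, since a memorized secret tends to receive an unusually high likelihood. The helper sequence_log_prob and the prefix string are hypothetical placeholders, and real attacks prune the search space far more cleverly than the brute force shown here.

```python
# Hedged sketch of secret extraction by likelihood ranking over candidates.
from itertools import product

def sequence_log_prob(model, text: str) -> float:
    """Placeholder: sum of the model's token log-probabilities for `text`."""
    raise NotImplementedError("depends on the language model's API")

def extract_digits(model, prefix="the social security number is ", length=4):
    # Brute-force enumeration is only to illustrate the ranking idea;
    # a memorized completion should stand out with the highest score.
    candidates = ("".join(d) for d in product("0123456789", repeat=length))
    return max(candidates, key=lambda c: sequence_log_prob(model, prefix + c))
```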

01:03:51 So what are the mechanisms for protecting?

01:03:53 Is there hopeful?

01:03:55 So there’s been recent work on differential privacy,

01:03:58 for example, that provides some hope,

01:04:02 but can you describe some of the ideas?

01:04:04 Right, so that’s actually, right.

01:04:05 So that’s also our finding is that by actually,

01:04:09 we show that in this particular case,

01:04:12 we actually have a good defense.

01:04:14 For the querying case, for the language model case.

01:04:17 So instead of just training a vanilla language model,

01:04:23 instead, if we train a differentially private language model,

01:04:26 then we can still achieve similar utility,

01:04:31 but at the same time, we can actually significantly enhance

01:04:34 the privacy protection of the learned model.

01:04:39 And our proposed attacks actually are no longer effective.

01:04:44 And differential privacy is a mechanism

01:04:47 of adding some noise,

01:04:49 by which you then have some guarantees on the inability

01:04:52 to figure out the presence of a particular person

01:04:58 in the dataset.

01:04:59 So right, so in this particular case,

01:05:01 what the differential privacy mechanism does

01:05:05 is that it actually adds perturbation

01:05:09 in the training process.

01:05:10 As we know, during the training process,

01:05:12 we are learning the model, we are doing gradient updates,

01:05:16 the weight updates and so on.

01:05:19 And essentially, differential privacy,

01:05:22 a differentially private machine learning algorithm

01:05:26 in this case, will be adding noise

01:05:29 and adding various perturbations during this training process.

01:05:33 To some aspect of the training process.

01:05:35 Right, so then the finally trained model,

01:05:39 the learned model is differentially private,

01:05:42 and so it can enhance the privacy protection.
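
For readers who want to see what adding perturbation during training looks like mechanically, here is a minimal sketch of one differentially private SGD step: clip each example’s gradient to bound its influence, add Gaussian noise to the sum, then update. The hyperparameters are illustrative, and in practice a privacy accountant chooses the noise level for a target (epsilon, delta) guarantee; this is a sketch of the general DP-SGD recipe, not the specific system described above.

```python
# Hedged sketch of one DP-SGD step (per-example clipping + Gaussian noise).
import torch
import torch.nn.functional as F

def dp_sgd_step(model, batch_x, batch_y, lr=0.1, clip_norm=1.0, noise_mult=1.1):
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(batch_x, batch_y):                 # per-example gradients
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, clip_norm / (float(total_norm) + 1e-6))  # clip
        for s, g in zip(summed, grads):
            s.add_(g, alpha=scale)
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * noise_mult * clip_norm  # Gaussian noise
            p.add_(-(lr / len(batch_x)) * (s + noise))
```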

01:05:46 So okay, so that’s the attacks and the defense of privacy.

01:05:51 You also talk about ownership of data.

01:05:54 So this is a really interesting idea

01:05:56 that we get to use many services online

01:05:59 seemingly for free because essentially,

01:06:04 sort of, a lot of companies are funded through advertisement.

01:06:06 And what that means is the advertisement works

01:06:09 exceptionally well because the companies are able

01:06:12 to access our personal data,

01:06:13 so they know which advertisements to serve

01:06:16 to do targeted advertisements and so on.

01:06:18 So can you maybe talk about this?

01:06:21 You have painted some nice pictures of the future,

01:06:26 philosophically speaking, a future

01:06:28 where people can have a little bit more control

01:06:31 of their data by owning

01:06:33 and maybe understanding the value of their data

01:06:36 and being able to sort of monetize it

01:06:40 in a more explicit way as opposed to the implicit way

01:06:43 that it’s currently done.

01:06:45 Yeah, I think this is a fascinating topic

01:06:47 and also a really complex topic.

01:06:51 Right, I think there are these natural questions,

01:06:53 who should be owning the data?

01:06:58 And so I can draw one analogy.

01:07:03 So for example, for physical properties,

01:07:06 like your house and so on.

01:07:08 So really this notion of property rights

01:07:13 it’s not like from day one,

01:07:17 we knew that there should be like this clear notion

01:07:20 of ownership of properties and having enforcement for this.

01:07:25 And so actually people have shown

01:07:29 that this establishment and enforcement of property rights

01:07:34 has been a main driver for the economy earlier.

01:07:42 And that actually really propelled the economic growth

01:07:47 even in the earlier stage.

01:07:50 So throughout the history of the development

01:07:53 of the United States or actually just civilization,

01:07:56 the idea of property rights that you can own property.

01:07:59 Right, and then there’s enforcement.

01:08:01 There’s institutional rights,

01:08:04 that governmental like enforcements of this

01:08:07 actually has been a key driver for economic growth.

01:08:12 And there had been even research or proposals saying

01:08:16 that for a lot of the developing countries,

01:08:22 essentially the challenge in growth

01:08:25 is not actually due to the lack of capital.

01:08:28 It’s more due to the lack of this notion of property rights

01:08:34 and the enforcement of property rights.

01:08:37 Interesting, so the presence or absence

01:08:41 of both the concept of the property rights

01:08:45 and their enforcement has a strong correlation

01:08:48 to economic growth.

01:08:49 Right, right.

01:08:50 And so you think that that same could be transferred

01:08:54 to the idea of property ownership

01:08:56 in the case of data ownership.

01:08:57 I think first of all, it’s a good lesson for us

01:09:01 to recognize that these rights and the recognition

01:09:06 and the enforcements of these type of rights

01:09:10 is very, very important for economic growth.

01:09:13 And then if we look at where we are now

01:09:15 and where we are going in the future,

01:09:18 so essentially more and more

01:09:19 is actually moving into the digital world.

01:09:23 And also more and more, I would say,

01:09:26 even information or assets of a person

01:09:30 is more and more into the real world,

01:09:33 the physical, sorry, the digital world as well.

01:09:35 It’s the data that the person has generated.

01:09:39 And essentially it’s like in the past

01:09:43 what defines a person, you can say,

01:09:45 right, like oftentimes besides the innate capabilities,

01:09:50 actually it’s the physical properties.

01:09:54 House, car.

01:09:55 Right, that defines a person.

01:09:56 But I think more and more people start to realize

01:09:59 actually what defines a person

01:10:01 is more and more in the data

01:10:03 that the person has generated

01:10:04 or the data about the person.

01:10:07 Like all the way from your political views,

01:10:10 your music taste and your financial information,

01:10:14 a lot of these and your health.

01:10:16 So more and more of the definition of the person

01:10:20 is actually in the digital world.

01:10:22 And currently for the most part, that’s owned implicitly.

01:10:26 People don’t talk about it,

01:10:27 but kind of it’s owned by internet companies.

01:10:33 So it’s not owned by individuals.

01:10:34 Right, there’s no clear notion of ownership of such data.

01:10:39 And also we talk about privacy and so on,

01:10:41 but I think actually clearly identifying the ownership

01:10:45 is a first step.

01:10:46 Once you identify the ownership,

01:10:48 then you can say who gets to define

01:10:50 how the data should be used.

01:10:52 So maybe some users are fine with internet companies

01:10:57 serving them ads, right, using their data,

01:11:02 as long as the data is used in a certain way

01:11:05 that actually the user consents to or allows.

01:11:11 For example, you can see the recommendation system,

01:11:14 in some sense, we don’t call it ads,

01:11:16 but a recommendation system

01:11:18 similarly is trying to recommend you something

01:11:20 and users enjoy and can really benefit

01:11:23 from good recommendation systems,

01:11:25 either recommending you better music, movies, news,

01:11:29 even research papers to read.

01:11:32 But of course then in these targeted ads,

01:11:35 especially in certain cases where people can be manipulated

01:11:40 by these targeted ads that can have really bad,

01:11:44 like severe consequences.

01:11:45 So essentially users want their data to be used

01:11:50 to better serve them and also maybe even, right,

01:11:53 get paid for or whatever, like in different settings.

01:11:56 But the thing is that first of all,

01:11:57 we need to really establish like who needs to decide,

01:12:03 who can decide how the data should be used.

01:12:06 And typically the establishment and clarification

01:12:10 of the ownership will help this

01:12:12 and it’s an important first step.

01:12:14 So if the user is the owner,

01:12:16 then naturally the user gets to define

01:12:18 how the data should be used.

01:12:19 But if you even say that wait a minute,

01:12:22 users are actually not the owner of this data,

01:12:24 whoever is collecting the data is the owner of the data.

01:12:26 Now of course they get to use the data

01:12:28 however way they want.

01:12:29 So to really address these complex issues,

01:12:33 we need to go at the root cause.

01:12:35 So it seems fairly clear that so first we really need to say

01:12:41 that who is the owner of the data

01:12:42 and then the owners can specify

01:12:45 how they want their data to be utilized.

01:12:47 So that’s a fascinating,

01:12:50 most people don’t think about that

01:12:52 and I think that’s a fascinating thing to think about

01:12:54 and probably fight for it.

01:12:57 I can see the economic growth argument,

01:12:59 it’s probably a really strong one.

01:13:01 So that’s a first time I’m kind of at least thinking

01:13:04 about the positive aspect of that ownership

01:13:08 being the longterm growth of the economy,

01:13:11 so good for everybody.

01:13:12 But sort of one possible downside I could see,

01:13:15 sort of to put on my grumpy old grandpa hat,

01:13:21 is it’s really nice for Facebook and YouTube and Twitter

01:13:25 to all be free.

01:13:28 And if you give control to people over their data,

01:13:31 do you think it’s possible they will be,

01:13:34 they would not want to hand it over quite easily?

01:13:37 And so a lot of these companies that rely on mass handover

01:13:42 of data and then therefore provide a mass

01:13:46 seemingly free service would then completely,

01:13:51 so the way the internet looks will completely change

01:13:56 because of the ownership of data

01:13:57 and we’ll lose a lot of the value of those services.

01:14:00 Do you worry about that?

01:14:02 That’s a very good question.

01:14:03 I think that’s not necessarily the case

01:14:06 in the sense that yes, users can have ownership

01:14:10 of their data, they can maintain control of their data,

01:14:12 but also then they get to decide how their data can be used.

01:14:17 So that’s why I mentioned earlier,

01:14:19 so in this case, if they feel that they enjoy the benefits

01:14:23 of social networks and so on,

01:14:25 and they’re fine with having Facebook, having their data,

01:14:29 but utilizing the data in certain way that they agree,

01:14:33 then they can still enjoy the free services.

01:14:37 But for others, maybe they would prefer

01:14:40 some kind of private version.

01:14:41 And in that case, maybe they can even opt in

01:14:44 to say that I want to pay and to have,

01:14:47 so for example, it’s already fairly standard,

01:14:50 like you pay for certain subscriptions

01:14:53 so that you don’t get to be shown ads, right?

01:14:59 So then users essentially can have choices.

01:15:01 And I think we just want to essentially bring out

01:15:06 more about who gets to decide what to do with that data.

01:15:10 I think it’s an interesting idea,

01:15:11 because if you poll people now,

01:15:15 it seems like, I don’t know,

01:15:16 but subjectively, sort of anecdotally speaking,

01:15:19 it seems like a lot of people don’t trust Facebook.

01:15:22 So that’s at least a very popular thing to say

01:15:24 that I don’t trust Facebook, right?

01:15:26 I wonder if you give people control of their data

01:15:30 as opposed to sort of signaling to everyone

01:15:33 that they don’t trust Facebook,

01:15:34 I wonder how they would speak with their actual choices,

01:15:37 like would they be willing to pay $10 a month for Facebook

01:15:42 or would they hand over their data?

01:15:44 It’d be interesting to see what fraction of people

01:15:47 would quietly hand over their data to Facebook

01:15:51 to make it free.

01:15:52 I don’t have a good intuition about that.

01:15:54 Like how many people, do you have an intuition

01:15:57 about how many people would use their data effectively

01:16:01 on the market of the internet

01:16:06 by sort of buying services with their data?

01:16:10 Yeah, so that’s a very good question.

01:16:12 I think, so one thing I also want to mention

01:16:15 is that, right, so it seems that especially in the press,

01:16:22 the conversation has been very much like

01:16:26 two sides fighting against each other.

01:16:29 On one hand, right, users can say that, right,

01:16:33 they don’t trust Facebook, they don’t,

01:16:35 or they delete Facebook.

01:16:37 Yeah, exactly.

01:16:39 Right, and then on the other hand, right, of course,

01:16:45 right, the other side, they also feel,

01:16:48 oh, they are providing a lot of services to users

01:16:50 and users are getting it all for free.

01:16:53 So I think I actually, I don’t know,

01:16:57 I talk a lot to like different companies

01:17:00 and also like basically on both sides.

01:17:04 So one thing I hope also like,

01:17:07 this is my hope for this year also,

01:17:09 is that we want to establish a more constructive dialogue

01:17:16 and to help people to understand

01:17:18 that the problem is much more nuanced

01:17:21 than just this two sides fighting.

01:17:25 Because naturally, there is a tension between the two sides,

01:17:30 between utility and privacy.

01:17:33 So if you want to get more utility, essentially,

01:17:36 like the recommendation system example I gave earlier,

01:17:40 if you want someone to give you a good recommendation,

01:17:43 essentially, whatever that system is,

01:17:45 the system is going to need to know your data

01:17:48 to give you a good recommendation.

01:17:52 But also, of course, at the same time,

01:17:53 we want to ensure that however that data is being handled,

01:17:56 it’s done in a privacy preserving way.

01:17:59 So that, for example, the recommendation system

01:18:02 doesn’t just go around and sell your data

01:18:05 and then cause a lot of bad consequences and so on.

01:18:12 So you want that dialogue to be a little bit more

01:18:15 in the open, a little more nuanced,

01:18:18 and maybe adding control to the data,

01:18:20 ownership to the data will allow,

01:18:24 as opposed to this happening in the background,

01:18:26 allow to bring it to the forefront

01:18:28 and actually have dialogues, like more nuanced,

01:18:32 real dialogues about how we trade our data for the services.

01:18:37 That’s the hope.

01:18:38 Right, right, yes, at the high level.

01:18:41 So essentially, also knowing that there are

01:18:42 technical challenges in addressing the issue,

01:18:47 like basically you can’t have,

01:18:50 just like the example that I gave earlier,

01:18:53 it’s really difficult to balance the two

01:18:55 between utility and privacy.

01:18:57 And that’s also a lot of things that I work on,

01:19:01 my group works on as well,

01:19:03 is to actually develop these technologies that are needed

01:19:08 to essentially help this balance better,

01:19:12 essentially to help data to be utilized

01:19:14 in a privacy preserving way.

01:19:16 And so we essentially need people to understand

01:19:19 the challenges and also at the same time

01:19:22 to provide the technical abilities

01:19:26 and also regulatory frameworks to help the two sides

01:19:29 to be more in a win win situation instead of a fight.

01:19:33 Yeah, the fighting thing is,

01:19:36 I think YouTube and Twitter and Facebook

01:19:38 are providing an incredible service to the world

01:19:41 and they’re all making a lot of money

01:19:44 and they’re all making mistakes, of course,

01:19:47 but they’re doing an incredible job

01:19:50 that I think deserves to be applauded

01:19:53 and there’s some degree of,

01:19:55 like it’s a cool thing that’s created

01:19:59 and it shouldn’t be monolithically fought against,

01:20:04 like Facebook is evil or so on.

01:20:06 Yeah, it might make mistakes,

01:20:07 but I think it’s an incredible service.

01:20:10 I think it’s world changing.

01:20:12 I mean, I think Facebook’s done a lot of incredible,

01:20:16 incredible things by bringing, for example, identity.

01:20:20 Like allowing people to be themselves,

01:20:25 like their real selves in the digital space

01:20:28 by using their real name and their real picture.

01:20:31 That step was like the first step from the real world

01:20:34 to the digital world.

01:20:35 That was a huge step that perhaps will define

01:20:38 the 21st century in us creating a digital identity.

01:20:41 And there’s a lot of interesting possibilities there

01:20:44 that are positive.

01:20:45 Of course, some things that are negative

01:20:47 and having a good dialogue about that is great.

01:20:50 And I’m glad that people like you

01:20:51 are at the center of that dialogue, so that’s awesome.

01:20:54 Right, I think also, I also can understand.

01:20:58 I think actually in the past,

01:21:00 especially in the past couple of years,

01:21:03 this rising awareness has been helpful.

01:21:07 Like users are also more and more recognizing

01:21:10 that privacy is important to them.

01:21:12 They should, maybe, right,

01:21:14 they should be owners of their data.

01:21:15 I think this definitely is very helpful.

01:21:18 And I think also this type of voice also,

01:21:23 and together with the regulatory framework and so on,

01:21:27 also help the companies to essentially put

01:21:31 these type of issues at a higher priority.

01:21:33 And knowing that, right, also it is their responsibility too

01:21:38 to ensure that users are well protected.

01:21:42 So I think definitely the rising voice is super helpful.

01:21:47 And I think that actually really has brought

01:21:50 the issue of data privacy

01:21:52 and even this consideration of data ownership

01:21:55 to the forefront to really much wider community.

01:22:00 And I think more of this voice is needed,

01:22:03 but I think it’s just that we want to have

01:22:05 a more constructive dialogue to bring the both sides together

01:22:10 to figure out a constructive solution.

01:22:13 So another interesting space

01:22:15 where security is really important

01:22:16 is in the space of any kinds of transactions,

01:22:20 but it could be also digital currency.

01:22:22 So can you maybe talk a little bit about blockchain?

01:22:27 And can you tell me what is a blockchain?

01:22:30 Blockchain.

01:22:32 I think the blockchain word itself

01:22:34 is actually very overloaded.

01:22:37 Of course.

01:22:38 In general.

01:22:39 It’s like AI.

01:22:40 Right, yes.

01:22:42 So in general, when we talk about blockchain,

01:22:43 we refer to this distributed ledger in a decentralized fashion.

01:22:47 So essentially you have a community of nodes

01:22:53 that come together.

01:22:54 And even though each one may not be trusted,

01:22:59 and as long as a certain threshold

01:23:02 of the set of nodes behaves properly,

01:23:07 then the system can essentially achieve certain properties.

01:23:11 For example, in the distributed ledger setting,

01:23:15 you can maintain an immutable log

01:23:18 and you can ensure that, for example,

01:23:22 the transactions actually are agreed upon

01:23:25 and then it’s immutable and so on.

01:23:28 So first of all, what’s a ledger?

01:23:29 So it’s a…

01:23:30 It’s like a database.

01:23:31 It’s like a data entry.

01:23:33 And so a distributed ledger

01:23:35 is something that’s maintained across

01:23:37 or is synchronized across multiple sources, multiple nodes.

01:23:41 Multiple nodes, yes.

01:23:43 And so where is this idea?

01:23:46 How do you keep…

01:23:48 So it’s important, a ledger, a database,

01:23:51 to keep that, to make sure…

01:23:55 So what are the kinds of security vulnerabilities

01:23:58 that you’re trying to protect against

01:24:01 in the context of a distributed ledger?

01:24:04 So in this case, for example,

01:24:06 you don’t want some malicious nodes

01:24:09 to be able to change the transaction logs.

01:24:12 And in certain cases, it’s called double spending,

01:24:15 like you can also cause different views

01:24:19 in different parts of the network and so on.

01:24:22 So the ledger has to represent,

01:24:24 if you’re capturing financial transactions,

01:24:27 it has to represent the exact timing

01:24:29 and the exact occurrence and no duplicates,

01:24:32 all that kind of stuff.

01:24:33 It has to represent what actually happened.
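
A minimal sketch can make the immutable log idea concrete: each ledger entry commits to the hash of the previous entry, so silently editing any past transaction breaks every hash after it. This ignores consensus and networking entirely and only shows the chaining; the field names and transactions below are illustrative.

```python
# Hedged sketch of a hash-chained append-only ledger (no consensus shown).
import hashlib
import json

def _digest(prev_hash: str, tx: dict) -> str:
    payload = json.dumps({"prev": prev_hash, "tx": tx}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append(chain: list, tx: dict) -> None:
    prev = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"prev": prev, "tx": tx, "hash": _digest(prev, tx)})

def verify(chain: list) -> bool:
    prev = "0" * 64
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["tx"]):
            return False                  # some past entry was tampered with
        prev = entry["hash"]
    return True

ledger = []
append(ledger, {"from": "alice", "to": "bob", "amount": 5})
append(ledger, {"from": "bob", "to": "carol", "amount": 2})
assert verify(ledger)
ledger[0]["tx"]["amount"] = 500           # tampering with history
assert not verify(ledger)
```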

01:24:37 Okay, so what are your thoughts

01:24:40 on the security and privacy of digital currency?

01:24:43 I can’t tell you how many people write to me

01:24:47 to interview various people in the digital currency space.

01:24:51 There seems to be a lot of excitement there.

01:24:54 And it seems to be, some of it’s, to me,

01:24:57 from an outsider’s perspective, seems like dark magic.

01:25:01 I don’t know how secure…

01:25:06 I think the foundation, from my perspective,

01:25:08 of digital currencies, that is, you can’t trust anyone.

01:25:13 So you have to create a really secure system.

01:25:16 So can you maybe speak about how,

01:25:19 what your thoughts in general about digital currency is

01:25:22 and how we can possibly create financial transactions

01:25:26 and financial stores of money in the digital space?

01:25:31 So you asked about security and privacy.

01:25:35 So again, as I mentioned earlier,

01:25:37 in security, we actually talk about two main properties,

01:25:42 the integrity and confidentiality.

01:25:45 So there’s another one for availability.

01:25:49 You want the system to be available.

01:25:50 But here, for the question you asked,

01:25:52 let’s just focus on integrity and confidentiality.

01:25:57 So for integrity of this distributed ledger,

01:26:00 essentially, as we discussed,

01:26:01 we want to ensure that the different nodes,

01:26:06 so they have this consistent view,

01:26:08 usually it’s done through what we call a consensus protocol,

01:26:13 and that they establish this shared view on this ledger,

01:26:18 and that you cannot go back and change,

01:26:21 it’s immutable, and so on.

01:26:25 So in this case, then the security often refers

01:26:28 to this integrity property.

01:26:31 And essentially, you’re asking the question,

01:26:34 how much work, how can you attack the system

01:26:38 so that the attacker can change the log, for example?

01:26:43 Change the log, for example.

01:26:46 Right, how hard is it to make an attack like that?

01:26:48 Right, right.

01:26:49 And then that very much depends on the consensus mechanism,

01:26:55 how the system is built, and all that.

01:26:57 So there are different ways

01:26:59 to build these decentralized systems.

01:27:02 And people may have heard about the terms called

01:27:05 like proof of work, proof of stake,

01:27:07 these different mechanisms.

01:27:09 And it really depends on how the system has been built,

01:27:14 and also how much resources,

01:27:17 how much work has gone into the network

01:27:20 to actually say how secure it is.

01:27:24 So for example, people talk about like,

01:27:26 in Bitcoin, it’s proof of work system,

01:27:28 so much electricity has been burned.
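
As a rough illustration of what the work in proof of work means, here is a minimal sketch: miners keep trying nonces until the block’s hash starts with enough zeros. The difficulty value and block encoding are toy choices; the point is only that forging or rewriting blocks requires redoing this brute-force search, which is where the electricity goes.

```python
# Hedged sketch of the proof-of-work puzzle (toy difficulty).
import hashlib

def mine(block_data: str, difficulty: int = 4) -> int:
    """Return a nonce such that sha256(block_data|nonce) has `difficulty`
    leading hex zeros. On average this takes about 16**difficulty attempts."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1
```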

01:27:32 So there’s differences in the different mechanisms

01:27:35 and the implementations of a distributed ledger

01:27:37 used for digital currency.

01:27:40 So there’s Bitcoin, there’s whatever,

01:27:42 there’s so many of them,

01:27:43 and there’s underlying different mechanisms.

01:27:46 And there’s arguments, I suppose,

01:27:48 about which is more effective, which is more secure,

01:27:51 which is more.

01:27:52 And what is needed,

01:27:54 what amount of resources needed

01:27:56 to be able to attack the system?

01:28:00 Like for example, what percentage of the nodes

01:28:02 do you need to control or compromise

01:28:06 in order to, right, to change the log?

01:28:09 And those are things, do you have a sense

01:28:12 if those are things that can be shown theoretically

01:28:15 through the design of the mechanisms,

01:28:17 or does it have to be shown empirically

01:28:19 by having a large number of users using the currency?

01:28:23 I see.

01:28:24 So in general, for each consensus mechanism,

01:28:27 you can actually show theoretically

01:28:30 what is needed to be able to attack the system.

01:28:34 Of course, there can be different types of attacks

01:28:37 as we discussed at the beginning.

01:28:41 And so that it’s difficult to give

01:28:46 like, you know, complete estimates,

01:28:50 like really how much is needed to compromise the system.

01:28:55 But in general, right, so there are ways to say

01:28:57 what percentage of the nodes you need to compromise

01:29:01 and so on.

01:29:03 So we talked about integrity on the security side,

01:29:07 and then you also mentioned the privacy

01:29:11 or the confidentiality side.

01:29:13 Does it have some of the same problems

01:29:17 and therefore some of the same solutions

01:29:19 that you talked about on the machine learning side

01:29:21 with differential privacy and so on?

01:29:24 Yeah, so actually in general on the public ledger

01:29:29 in these public decentralized systems,

01:29:33 actually nothing is private.

01:29:34 So all the transactions posted on the ledger,

01:29:38 anybody can see.

01:29:40 So in that sense, there’s no confidentiality.

01:29:43 So usually what you can do is then

01:29:48 there are the mechanisms that you can build in

01:29:50 to enable confidentiality or privacy of the transactions

01:29:55 and the data and so on.

01:29:56 That’s also some of the work that both my group

01:30:00 and also my startup does as well.

01:30:04 What’s the name of the startup?

01:30:05 Oasis Labs.

01:30:06 Oasis Labs.

01:30:07 And so the confidentiality aspect there

01:30:11 is even though the transactions are public,

01:30:15 you wanna keep some aspect confidential

01:30:18 of the identity of the people involved in the transactions?

01:30:21 Or what is their hope to keep confidential in this context?

01:30:25 So in this case, for example,

01:30:26 you want to enable like confidential transactions,

01:30:31 even, so there are different essentially types of data

01:30:37 that you want to keep private or confidential.

01:30:40 And you can utilize different technologies

01:30:43 including zero knowledge proofs

01:30:45 and also secure computing techniques

01:30:50 to hide who is making the transactions to whom

01:30:56 and the transaction amount.

01:30:58 And in our case, also we can enable

01:31:00 like confidential smart contracts.

01:31:02 And so that you don’t know the data

01:31:06 and the execution of the smart contract and so on.

01:31:09 And we actually are combining these different technologies

01:31:14 and going back to the earlier discussion we had,

01:31:20 enabling like ownership of data and privacy of data and so on.

01:31:26 So at Oasis Labs, we’re actually building

01:31:29 what we call a platform for responsible data economy

01:31:33 to actually combine these different technologies together

01:31:36 and to enable secure and privacy preserving computation

01:31:41 and also using the ledger to help provide an immutable log

01:31:48 of users’ ownership of their data

01:31:51 and the policies they want the data,

01:31:54 the usage of the data, to adhere to

01:31:56 and also how the data has been utilized.

01:31:59 So all this together can build,

01:32:02 we call a distributed secure computing fabric

01:32:06 that helps to enable a more responsible data economy.

01:32:10 So it’s a lot of things together.

01:32:11 Yeah, wow, that was eloquent.

01:32:13 Okay, you’re involved in so much amazing work

01:32:17 that we’ll never be able to get to,

01:32:18 but I have to ask at least briefly about program synthesis,

01:32:22 which at least in a philosophical sense captures

01:32:26 much of the dreams of what’s possible in computer science

01:32:30 and the artificial intelligence.

01:32:33 First, let me ask, what is program synthesis

01:32:36 and can neural networks be used to learn programs from data?

01:32:41 So can this be learned?

01:32:43 Some aspect of the synthesis can it be learned?

01:32:46 So program synthesis is about teaching computers

01:32:49 to write code, to program.

01:32:52 And I think that’s one of our ultimate dreams or goals.

01:33:00 I think Andreessen talked about software eating the world.

01:33:05 So I say, once we teach computers to write the software,

01:33:10 how to write programs, then I guess computers

01:33:13 will be eating the world by transitivity.

01:33:16 Yeah, exactly.

01:33:17 So yeah, and also for me actually,

01:33:23 when I shifted from security to more AI machine learning,

01:33:28 program synthesis is,

01:33:31 program synthesis and adversarial machine learning,

01:33:33 these are the two fields that I particularly focus on.

01:33:38 Like program synthesis is one of the first questions

01:33:40 that I actually started investigating.

01:33:42 Just as a question, oh, I guess from the security side,

01:33:46 there’s a, you’re looking for holes in programs,

01:33:49 so at least I see a small connection,

01:33:51 but where was your interest for program synthesis?

01:33:56 Because it’s such a fascinating, such a big,

01:33:58 such a hard problem in the general case.

01:34:01 Why program synthesis?

01:34:03 So the reason for that is actually when I shifted my focus

01:34:06 from security into AI machine learning,

01:34:12 actually one of my main motivations at the time

01:34:16 is that even though I have been doing a lot of work

01:34:19 in security and privacy,

01:34:20 but I have always been fascinated

01:34:22 about building intelligent machines.

01:34:26 And that was really my main motivation

01:34:30 to spend more time in AI machine learning

01:34:32 is that I really want to figure out

01:34:35 how we can build intelligent machines.

01:34:37 And to help us towards that goal,

01:34:43 program synthesis is really one of,

01:34:45 I would say the best domain to work on.

01:34:49 I actually call it like program synthesis

01:34:52 is like the perfect playground

01:34:54 for building intelligent machines

01:34:57 and for artificial general intelligence.

01:34:59 Yeah, well, it’s also in that sense,

01:35:03 not just a playground,

01:35:04 I guess it’s the ultimate test of intelligence

01:35:06 because I think if you can generate sort of neural networks

01:35:13 can learn good functions

01:35:15 and they can help you out in classification tasks,

01:35:19 but to be able to write programs,

01:35:21 that’s the epitome from the machine side.

01:35:24 That’s the same as passing the Turing test

01:35:26 in natural language, but with programs,

01:35:29 it’s able to express complicated ideas

01:35:32 to reason through ideas and boil them down to algorithms.

01:35:38 Yes, exactly, exactly.

01:35:39 Incredible, so can this be learned?

01:35:41 How far are we?

01:35:43 Is there hope?

01:35:44 What are the open challenges?

01:35:46 Yeah, very good questions.

01:35:48 We are still at an early stage,

01:35:51 but already I think we have seen a lot of progress.

01:35:56 I mean, definitely we have existence proof,

01:35:59 just like humans can write programs.

01:36:02 So there’s no reason why computers cannot write programs.

01:36:05 So I think that’s definitely an achievable goal

01:36:08 is just how long it takes.

01:36:11 And even today, we actually have,

01:36:17 the program synthesis community,

01:36:19 especially the program synthesis via learning,

01:36:22 as we call it, the neural program synthesis community,

01:36:24 is still very small, but the community has been growing

01:36:28 and we have seen a lot of progress.

01:36:31 And in limited domains, I think actually program synthesis

01:36:37 is ripe for real world applications.

01:36:41 So actually it was quite amazing.

01:36:42 I was giving a talk, so there is this Rework conference.

01:36:49 Rework Deep Learning Summit.

01:36:50 I actually, so I gave another talk

01:36:52 at the previous rework conference

01:36:54 in deep reinforcement learning.

01:36:56 And then I actually met someone from a startup,

01:37:01 the CEO of the startup.

01:37:04 And then when he saw my name, he recognized it.

01:37:06 And he actually said, one of our papers actually had,

01:37:12 it had actually become a key product in their startup.

01:37:17 And that was program synthesis, in that particular case,

01:37:22 it was natural language translation,

01:37:25 translating natural language description into SQL queries.

01:37:31 Oh, wow, that direction, okay.

01:37:34 Right, so yeah, so in program synthesis,

01:37:37 in limited domains, in well specified domains,

01:37:40 actually already we can see really,

01:37:45 really great progress and applicability in the real world.

01:37:52 So domains like, I mean, as an example,

01:37:54 you said natural language,

01:37:55 being able to express something through just normal language

01:37:59 and it converts it into a database SQL query.

01:38:03 Right.

01:38:03 And how solved of a problem is that?

01:38:07 Because that seems like a really hard problem.

01:38:10 Again, in limited domains, actually it can work pretty well.

01:38:14 And now this is also a very active domain of research.

01:38:18 At the time, I think when he saw our paper at the time,

01:38:21 we were the state of the arts on that task.

01:38:25 And since then, actually now there has been more work

01:38:29 and with even more like sophisticated data sets.

01:38:34 And so, but I think I wouldn’t be surprised

01:38:38 that more of this type of technology

01:38:41 really gets into the real world.

01:38:43 That’s exciting.

01:38:44 In the near term.
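
For readers unfamiliar with the task, here is what natural-language-to-SQL pairs look like; the table and questions are invented for illustration, and in the systems discussed above a learned model, not hard-coded rules, produces the query.

```python
# Illustrative (made-up) examples of the natural language to SQL task.
examples = [
    ("How many employees are in the sales department?",
     "SELECT COUNT(*) FROM employees WHERE department = 'sales';"),
    ("What is the average salary of engineers hired after 2018?",
     "SELECT AVG(salary) FROM employees "
     "WHERE role = 'engineer' AND hire_year > 2018;"),
]
for question, sql in examples:
    print(f"NL : {question}\nSQL: {sql}\n")
```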

01:38:45 Being able to learn in the space of programs

01:38:47 is super exciting.

01:38:49 I still, yeah, I’m still skeptical

01:38:53 cause I think it’s a really hard problem,

01:38:54 but I would love to see progress.

01:38:56 And also I think in terms of the,

01:38:58 you asked about open challenges.

01:39:00 I think the domain is full of challenges

01:39:04 and in particular also we want to see

01:39:06 how we should measure the progress in the space.

01:39:09 And I would say there are mainly three metrics.

01:39:16 So one is the complexity of the program

01:39:18 that we can synthesize.

01:39:20 And that actually has clear measures,

01:39:22 and you can just look at the past publications.

01:39:25 And even like, for example,

01:39:27 I was at the recent NeurIPS conference.

01:39:30 Now there’s actually a fairly sizable session

01:39:33 dedicated to program synthesis, which is…

01:39:35 Or even Neural programs.

01:39:37 Right, right, right, which is great.

01:39:38 And we continue to see the increase.

01:39:43 What does sizable mean?

01:39:44 I like the word sizable, it’s five people.

01:39:51 It’s still a small community, but it is growing.

01:39:54 And they will all win Turing Awards one day, I like it.

01:39:58 Right, so we can clearly see an increase

01:40:02 in the complexity of the programs that these…

01:40:07 We can synthesize.

01:40:09 Sorry, is it the complexity of the actual text

01:40:12 of the program or the running time complexity?

01:40:15 Which complexity are we…

01:40:17 How…

01:40:18 The complexity of the task to be synthesized

01:40:21 and the complexity of the actual synthesized programs.

01:40:24 So the lines of code even, for example.

01:40:27 Okay, I got you.

01:40:28 But it’s not the theoretical upper bound

01:40:32 of the running time of the algorithm kind of thing.

01:40:35 Okay, got it.

01:40:36 And you can see the complexity decreasing already.

01:40:39 Oh, no, meaning we want to be able to synthesize

01:40:42 more and more complex programs, bigger and bigger programs.

01:40:44 So we want to see that, we want to increase

01:40:49 the complexity of this.

01:40:50 I got you, so I have to think through,

01:40:51 because I thought of complexity as,

01:40:53 you want to be able to accomplish the same task

01:40:55 with a simpler and simpler program.

01:40:56 I see, I see.

01:40:57 No, we are not doing that.

01:40:58 It’s more about how complex a task

01:41:02 we can synthesize programs for.

01:41:03 Yeah, got it, being able to synthesize programs,

01:41:07 learn them for more and more difficult tasks.

01:41:10 So for example, initially, our first work

01:41:12 in program synthesis was to translate natural language

01:41:16 description into really simple programs called IFTTT,

01:41:19 if this, then that.

01:41:21 So given a trigger condition,

01:41:23 what is the action you should take?

01:41:25 So that program is super simple.

01:41:28 You just identify the trigger conditions and the action.
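
A trigger-action program of this kind is tiny; the sketch below shows roughly what such a synthesis target looks like for a natural-language description, with made-up channel and function names.

```python
# Hedged sketch of an "if this, then that" style program as a synthesis target.
from dataclasses import dataclass

@dataclass
class TriggerActionProgram:
    trigger: str  # condition to watch for
    action: str   # what to do when it fires

# "If I post a new photo on Instagram, save it to my Dropbox photos folder."
program = TriggerActionProgram(
    trigger="instagram.new_photo_posted",
    action="dropbox.save_to_folder('photos')",
)
```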

01:41:31 And then later on, with SQL queries,

01:41:33 it gets more complex.

01:41:34 And then also, we started to synthesize programs

01:41:37 with loops and, you know.

01:41:40 Oh no, and if you could synthesize recursion,

01:41:43 it’s all over.

01:41:45 Right, actually, one of our works actually

01:41:48 is on learning recursive neural programs.

01:41:50 Oh no.

01:41:51 But anyway, anyway, so that’s one is complexity,

01:41:53 and the other one is generalization.

01:41:58 Like when we train or learn a program synthesizer,

01:42:04 in this case, a neural programs to synthesize programs,

01:42:07 then you want it to generalize.

01:42:10 For a large number of inputs.

01:42:13 Right, so to be able to generalize

01:42:15 to previously unseen inputs.

01:42:18 Got it.

01:42:19 And so, right, so some of the work we did earlier

01:42:21 on learning recursive neural programs

01:42:26 actually showed that recursion

01:42:29 actually is important to learn.

01:42:32 And if you have recursion,

01:42:34 then for a certain set of tasks,

01:42:37 we can actually show that you can actually

01:42:39 have perfect generalization.

01:42:42 So, right, so that won the best paper award

01:42:44 at ICLR earlier.

01:42:46 So that’s one example of we want to learn

01:42:50 these neural programs that can generalize better.

01:42:53 But that works for certain tasks, certain domains,

01:42:57 and there’s question how we can essentially

01:43:01 develop more techniques that can have generalization

01:43:06 for a wider set of domains and so on.

01:43:10 So that’s another area.

01:43:11 And then the third challenge I think will,

01:43:15 it’s not just for program synthesis,

01:43:17 it’s also cutting across other fields

01:43:20 in machine learning and also including

01:43:24 like deep reinforcement learning in particular,

01:43:26 is this adaptation: we want to be able

01:43:33 to learn from the past tasks and training and so on

01:43:40 to be able to solve new tasks.

01:43:42 So for example, in program synthesis today,

01:43:45 we still are working in the setting

01:43:48 where given a particular task,

01:43:50 we train the model and to solve this particular task.

01:43:57 But that’s not how humans work.

01:44:00 The whole point is we train a human,

01:44:03 and then they can program to solve new tasks.

01:44:07 Right, exactly.

01:44:08 And just like in deep reinforcement learning,

01:44:10 we don’t want to just train an agent

01:44:11 to play a particular game,

01:44:14 either it’s Atari or it’s Go or whatever.

01:44:19 We want to train these agents

01:44:21 that can essentially extract knowledge

01:44:24 from the past learning experience

01:44:27 to be able to adapt to new tasks and solve new tasks.

01:44:31 And I think this is particularly important

01:44:33 for program synthesis.

01:44:34 Yeah, that’s the whole dream of program synthesis

01:44:37 is you’re learning a tool that can solve new problems.

01:44:41 Right, exactly.

01:44:42 And I think that’s a particular domain

01:44:44 that as a community, we need to put more emphasis on.

01:44:50 And I hope that we can make more progress there as well.

01:44:54 Awesome.

01:44:55 There’s a lot more to talk about.

01:44:57 Let me ask that you also had a very interesting

01:45:01 and we talked about rich representations.

01:45:04 You had a rich life journey.

01:45:08 You did your bachelor’s in China

01:45:10 and your master’s and PhD in the United States,

01:45:12 CMU in Berkeley.

01:45:15 Are there interesting differences?

01:45:16 I told you I’m Russian.

01:45:19 I think there are a lot of interesting differences

01:45:19 between Russia and the United States.

01:45:21 Are there in your eyes, interesting differences

01:45:24 between the two cultures from the silly romantic notion

01:45:30 of the spirit of the people to the more practical notion

01:45:33 of how research is conducted that you find interesting

01:45:37 or useful in your own work of having experienced both?

01:45:42 That’s a good question.

01:45:43 I think, so I studied in China for my undergraduates

01:45:50 and that was more than 20 years ago.

01:45:54 So it’s been a long time.

01:45:57 Is there echoes of that time in you?

01:45:59 Things have changed a lot.

01:46:00 Actually, it’s interesting.

01:46:01 I think even more so maybe something

01:46:04 that’s even more different for my experience

01:46:08 than a lot of computer science researchers

01:46:12 and practitioners is that,

01:46:14 so for my undergrad, I actually studied physics.

01:46:16 Nice, very nice.

01:46:18 And then I switched to computer science in graduate school.

01:46:22 What happened?

01:46:26 Is there another possible universe

01:46:29 where you could have become a theoretical physicist

01:46:32 at Caltech or something like that?

01:46:34 That’s very possible, some of my undergrad classmates,

01:46:39 then they later on studied physics,

01:46:41 got their PhD in physics from these schools,

01:46:45 from top physics programs.

01:46:49 So you switched to, I mean,

01:46:51 from that experience of doing physics in your bachelor’s,

01:46:55 what made you decide to switch to computer science

01:46:59 and computer science at arguably the best university,

01:47:03 one of the best universities in the world

01:47:05 for computer science with Carnegie Mellon,

01:47:07 especially for grad school and so on.

01:47:09 So what, second only to MIT, just kidding.

01:47:13 Okay, I had to throw that in there.

01:47:17 No, what was the choice like

01:47:19 and what was the move to the United States like?

01:47:22 What was that whole transition?

01:47:24 And if you remember, if there’s still echoes

01:47:26 of some of the spirit of the people of China in you

01:47:30 in New York.

01:47:31 Right, right, yeah.

01:47:32 It’s like three questions in one.

01:47:33 Yes, I know.

01:47:34 I’m sorry.

01:47:36 No, that’s okay.

01:47:38 So yes, so I guess, okay,

01:47:40 so first transition from physics to computer science.

01:47:43 So when I first came to the United States,

01:47:45 I was actually in the physics PhD program at Cornell.

01:47:49 I was there for one year

01:47:50 and then I switched to computer science

01:47:52 and then I was in the PhD program at Carnegie Mellon.

01:47:56 So, okay, so the reasons for switching.

01:47:59 So one thing, so that’s why I also mentioned

01:48:02 about this difference in backgrounds

01:48:04 about having studied physics first in my undergrad.

01:48:09 I actually really, I really did enjoy

01:48:13 my undergrad’s time and education in physics.

01:48:18 I think that actually really helped me

01:48:21 in my future work in computer science.

01:48:25 Actually, even for machine learning,

01:48:26 a lot of the machine learning stuff,

01:48:28 the core machine learning methods,

01:48:29 many of them actually came from physics.

01:48:31 Statistical.

01:48:39 To be honest, most of everything came from physics.

01:48:39 Right, but anyway, so when I studied physics,

01:48:42 I was, I think I was really attracted to physics.

01:48:49 It was, it’s really beautiful.

01:48:51 And I actually call it, physics is the language of nature.

01:48:55 And I actually clearly remember, like, one moment

01:49:01 in my undergrad, like I did my undergrad at Tsinghua

01:49:07 and I used to study in the library.

01:49:10 And I clearly remember, like, one day

01:49:14 I was sitting in the library and I was, like,

01:49:19 writing on my notes and so on.

01:49:21 And I got so excited that I realized

01:49:24 that really just from a few simple axioms,

01:49:28 a few simple laws, I can derive so much.

01:49:31 It’s almost like I can derive the rest of the world.

01:49:34 Yeah, the rest of the universe.

01:49:35 Yes, yes, so that was, like, amazing.

01:49:39 Do you think you, have you ever seen

01:49:42 or do you think you can rediscover

01:49:43 that kind of power and beauty in computer science

01:49:46 in the world that you…

01:49:46 So, that’s very interesting.

01:49:49 So that gets to, you know, the transition

01:49:51 from physics to computer science.

01:49:53 It’s quite different.

01:49:55 For physics in grad school, actually, things changed.

01:50:01 So one is I started to realize that

01:50:05 when I started doing research in physics,

01:50:08 at the time I was doing theoretical physics.

01:50:11 And a lot of it, you still have the beauty,

01:50:14 but it’s very different.

01:50:16 So I had to actually do a lot of the simulation.

01:50:18 So essentially I was actually writing,

01:50:20 in some cases writing Fortran code.

01:50:23 Good old Fortran, yeah.

01:50:26 To actually, right, do simulations and so on.

01:50:32 That was not exactly what I enjoyed doing.

01:50:42 And also at the time from talking with the senior students,

01:50:47 senior students in the program,

01:50:52 I realized many of the students actually were going off

01:50:55 to like Wall Street and so on.

01:50:58 So, and I’ve always been interested in computer science

01:51:02 and actually essentially taught myself

01:51:06 C programming.

01:51:07 Program?

01:51:08 Right, and so on.

01:51:09 At which, when?

01:51:10 In college.

01:51:12 In college somewhere?

01:51:12 In the summer.

01:51:14 For fun, physics major, learning to do C programming.

01:51:19 Beautiful.

01:51:20 Actually it’s interesting, in physics at the time,

01:51:23 I think now the program probably has changed,

01:51:25 but at the time really the only class we had

01:51:29 related to computer science education

01:51:34 was an introduction to, I forgot,

01:51:36 computer science or computing, and Fortran 77.

01:51:40 There’s a lot of people that still use Fortran.

01:51:42 I’m actually, if you’re a programmer out there,

01:51:46 I’m looking for an expert to talk to about Fortran.

01:51:49 They seem to, there’s not many,

01:51:51 but there’s still a lot of people that still use Fortran

01:51:53 and still a lot of people that use COBOL.

01:51:56 But anyway, so then I realized,

01:52:00 instead of just doing programming

01:52:01 for doing simulations and so on,

01:52:04 that I may as well just change to computer science.

01:52:07 And also one thing I really liked,

01:52:09 and that’s a key difference between the two,

01:52:11 is in computer science it’s so much easier

01:52:14 to realize your ideas.

01:52:15 If you have an idea, you write it up, you code it up,

01:52:19 and then you can see it actually, right?

01:52:22 Exactly.

01:52:23 Running and you can see it.

01:52:26 You can bring it to life quickly.

01:52:26 Bring it to life.

01:52:27 Whereas in physics, if you have a good theory,

01:52:30 you have to wait for the experimentalists

01:52:33 to do the experiments and to confirm the theory,

01:52:35 and things just take so much longer.

01:52:38 And also the reason in physics I decided to do

01:52:42 theoretical physics was because I had my experience

01:52:45 with experimental physics.

01:52:47 First, you have to fix the equipment.

01:52:50 You spend most of your time fixing the equipment first.

01:52:55 Super expensive equipment, so there’s a lot of,

01:52:58 yeah, you have to collaborate with a lot of people.

01:53:00 Takes a long time.

01:53:01 Just takes really, right, much longer.

01:53:03 Yeah, it’s messy.

01:53:04 Right, so I decided to switch to computer science.

01:53:06 And one thing I think maybe people have realized

01:53:09 is that for people who study physics,

01:53:11 actually it’s very easy for physicists

01:53:13 to change to do something else.

01:53:16 I think physics provides a really good training.

01:53:19 And yeah, so actually it was fairly easy

01:53:23 to switch to computer science.

01:53:26 But one thing, going back to your earlier question,

01:53:29 so one thing I actually did realize,

01:53:32 so there is a big difference between computer science

01:53:34 and physics, where physics you can derive

01:53:37 the whole universe from just a few simple laws.

01:53:41 And computer science, given that a lot of it

01:53:43 is defined by humans, the systems are defined by humans,

01:53:47 and it’s artificial, like essentially you create

01:53:53 a lot of these artifacts and so on.

01:53:57 It’s not quite the same.

01:53:58 You don’t derive the computer systems

01:54:00 with just a few simple laws.

01:54:03 You actually have to see that there are historical reasons

01:54:07 why a system is built and designed one way

01:54:10 versus the other.

01:54:12 There’s a lot more complexity, less elegant simplicity

01:54:17 of E equals MC squared that kind of reduces everything

01:54:20 down to those beautiful fundamental equations.

01:54:23 But what about the move from China to the United States?

01:54:27 Is there anything that still stays in you

01:54:31 that contributes to your work,

01:54:33 the fact that you grew up in another culture?

01:54:36 So yes, I think especially back then

01:54:38 it’s very different from now.

01:54:40 So now, actually, I see these students

01:54:46 coming from China, and even undergrads,

01:54:49 actually they speak fluent English.

01:54:51 It was just amazing.

01:54:54 And they have already understood so much of the culture

01:54:59 in the US and so on.

01:55:00 It was to you, it was all foreign?

01:55:04 It was a very different time.

01:55:06 At the time, actually, we didn’t even have easy access

01:55:11 to email, not to mention about the web.

01:55:16 I remember I had to go to specific privileged server rooms

01:55:22 to use email, and hence,

01:55:27 at the time we had much less knowledge

01:55:30 about the Western world.

01:55:32 And actually at the time I didn’t know,

01:55:35 actually in the US, the West Coast weather

01:55:38 is much better than the East Coast.

01:55:40 Yeah, things like that, actually.

01:55:45 It’s very interesting.

01:55:48 But now it’s so different.

01:55:50 At the time, I would say there was also

01:55:52 a bigger cultural difference,

01:55:53 because there was so much less opportunity

01:55:58 for shared information.

01:55:59 So it’s such a different time and world.

01:56:02 So let me ask maybe a sensitive question.

01:56:04 I’m not sure, but I think you and I

01:56:07 are in similar positions.

01:56:08 I’ve been here for already 20 years as well,

01:56:13 and looking at Russia from my perspective,

01:56:15 and you looking at China.

01:56:16 In some ways, it’s a very distant place,

01:56:19 because it’s changed a lot.

01:56:21 But in some ways you still have echoes,

01:56:23 you still have knowledge of that place.

01:56:25 The question is, China’s doing a lot

01:56:27 of incredible work in AI.

01:56:29 Do you see, please tell me

01:56:32 there’s an optimistic picture you see

01:56:34 where the United States and China

01:56:36 can collaborate and sort of grow together

01:56:38 in the development of AI towards,

01:56:41 there’s different values in terms

01:56:43 of the role of government and so on,

01:56:44 of ethical, transparent, secure systems.

01:56:48 We see it differently in the United States

01:56:50 a little bit than China,

01:56:51 but we’re still trying to work it out.

01:56:53 Do you see the two countries being able

01:56:55 to successfully collaborate and work

01:56:57 in a healthy way without sort of fighting

01:57:01 and making it an AI arms race kind of situation?

01:57:06 Yeah, I believe so.

01:57:08 I think science has no border,

01:57:10 and the advancement of the technology helps everyone,

01:57:16 helps the whole world.

01:57:18 And so I certainly hope that the two countries

01:57:21 will collaborate, and I certainly believe so.

01:57:26 Do you have any reason to believe so

01:57:28 except being an optimist?

01:57:32 So first, again, like I said, science has no borders.

01:57:35 And especially in…

01:57:36 Science doesn’t know borders?

01:57:38 Right.

01:57:39 And you believe that will,

01:57:41 in the former Soviet Union during the Cold War…

01:57:44 So that’s, yeah.

01:57:45 So that’s the other point I was going to mention

01:57:47 is that especially in academic research,

01:57:51 everything is public.

01:57:52 Like we write papers, we open source codes,

01:57:55 and all this is in the public domain.

01:57:59 It doesn’t matter whether the person is in the US,

01:58:01 in China, or some other parts of the world.

01:58:04 They can go on arXiv

01:58:06 and look at the latest research and results.

01:58:09 So that openness gives you hope.

01:58:11 Yes. Me too.

01:58:12 And that’s also how, as a world,

01:58:15 we make progress the best.

01:58:17 So, I apologize for the romanticized question,

01:58:21 but looking back,

01:58:22 what would you say was the most transformative moment

01:58:26 in your life that

01:58:30 maybe made you fall in love with computer science?

01:58:32 You said physics.

01:58:33 You remember there was a moment

01:58:34 where you thought you could derive

01:58:36 the entirety of the universe.

01:58:38 Was there a moment that you really fell in love

01:58:40 with the work you do now,

01:58:42 from security to machine learning,

01:58:45 to program synthesis?

01:58:47 So maybe, as I mentioned, actually, in college,

01:58:52 one summer I just taught myself programming in C.

01:58:55 Yes.

01:58:56 And you just read a book,

01:58:57 and then you’re like…

01:58:59 Don’t tell me you fell in love with computer science

01:59:01 by programming in C.

01:59:02 Remember I mentioned one of the draws

01:59:05 for me to computer science is how easy it is

01:59:07 to realize your ideas.

01:59:10 So once I read a book,

01:59:13 I taught myself how to program in C.

01:59:16 Immediately, what did I do?

01:59:19 I programmed two games.

01:59:22 One’s just simple, like it’s a Go game,

01:59:25 like it’s a board, you can move the stones and so on.

01:59:28 And the other one, I actually programmed a game

01:59:30 that’s like a 3D Tetris.

01:59:32 It turned out to be a super hard game to play.

01:59:35 Because instead of just the standard 2D Tetris,

01:59:38 it’s actually a 3D thing.

01:59:40 But I realized, wow,

01:59:42 I just had these ideas to try it out,

01:59:45 and then, yeah, you can just do it.

01:59:48 And so that’s when I realized, wow, this is amazing.

01:59:53 Yeah, you can create yourself.

01:59:55 Yes, yes, exactly.

01:59:57 From nothing to something

01:59:59 that’s actually out in the real world.

02:00:01 So let me ask…

02:00:02 Right, I think with your own hands.

02:00:03 Let me ask a silly question,

02:00:05 or maybe the ultimate question.

02:00:07 What is to you the meaning of life?

02:00:11 What gives your life meaning, purpose,

02:00:15 fulfillment, happiness, joy?

02:00:19 Okay, these are two different questions.

02:00:21 Very different, yeah.

02:00:22 It’s usually that you ask this question.

02:00:24 Maybe this question is probably the question

02:00:28 that has followed me and followed my life the most.

02:00:32 Have you discovered anything,

02:00:34 any satisfactory answer for yourself?

02:00:38 Is there something you’ve arrived at?

02:00:41 You know, there’s a moment…

02:00:44 I’ve talked to a few people who have faced,

02:00:46 for example, a cancer diagnosis,

02:00:48 or faced their own mortality,

02:00:50 and that seems to change their perspective.

02:00:53 It seems to be a catalyst for them

02:00:56 removing most of the crap.

02:00:59 Of seeing that most of what they’ve been doing

02:01:02 is not that important,

02:01:04 and really reducing it into saying, like,

02:01:06 here’s actually the few things that really give meaning.

02:01:11 Mortality is a really powerful catalyst for that,

02:01:14 it seems like.

02:01:15 Facing mortality, whether it’s your parents dying

02:01:17 or somebody close to you dying,

02:01:19 or facing your own death for whatever reason,

02:01:22 or cancer and so on.

02:01:23 So yeah, so in my own case,

02:01:26 I didn’t need to face mortality

02:01:28 to try to ask that question.

02:01:35 And I think there are a couple things.

02:01:38 So one is, like, who should be defining

02:01:42 the meaning of your life, right?

02:01:44 Is there some kind of even greater thing than you

02:01:49 that should define the meaning of your life?

02:01:51 So for example, when people say that

02:01:53 searching for the meaning of your life,

02:01:56 is there some outside voice,

02:02:00 or is there something outside of you

02:02:04 that actually tells you, you know…

02:02:06 So people talk about, oh, you know,

02:02:09 this is what you have been born to do, right?

02:02:14 Like, this is your destiny.

02:02:19 So who, right, so that’s one question,

02:02:21 like, who gets to define the meaning of your life?

02:02:24 Should you be finding some other things,

02:02:27 some other factor to define this for you?

02:02:30 Or is it something that, actually,

02:02:32 is just entirely what you define yourself,

02:02:35 and it can be very arbitrary?

02:02:37 Yeah, so an inner voice or an outer voice,

02:02:41 whether it could be spiritual or religious, too, with God,

02:02:44 or some other components of the environment outside of you,

02:02:48 or just your own voice.

02:02:50 Do you have an answer there?

02:02:52 So, okay, so for that, I have an answer.

02:02:55 And through, you know, the long period of time

02:02:58 of thinking and searching,

02:03:00 even searching outside, right,

02:03:04 you know, voices or factors outside of me.

02:03:08 So that, I have an answer.

02:03:09 I’ve come to the conclusion and realization

02:03:13 that it’s you yourself that defines the meaning of life.

02:03:18 Yeah, that’s a big burden, though, isn’t it?

02:03:20 I mean, yes and no, right?

02:03:26 So then you have the freedom to define it.

02:03:28 Yes.

02:03:29 And another question is, like,

02:03:33 what does it really mean by the meaning of life?

02:03:37 Right.

02:03:39 And also, whether the question even makes sense.

02:03:45 Absolutely, and you said it somehow distinct from happiness.

02:03:49 So meaning is something much deeper

02:03:51 than just any kind of emotional,

02:03:55 any kind of contentment or joy or whatever.

02:03:57 It might be much deeper.

02:03:58 And then you have to ask, what is deeper than that?

02:04:02 What is there at all?

02:04:04 And then the question starts being silly.

02:04:07 Right, and also you can say it’s deeper,

02:04:09 but you can also say it’s shallower,

02:04:10 depending on how people want to define

02:04:13 the meaning of their life.

02:04:14 So for example, most people don’t even think

02:04:16 about this question.

02:04:17 Then the meaning of life to them

02:04:19 doesn’t really matter that much.

02:04:22 And also, whether knowing the meaning of life,

02:04:26 whether it actually helps your life to be better

02:04:28 or whether it helps your life to be happier,

02:04:31 these actually are open questions.

02:04:34 It’s not, right?

02:04:36 Of course, most questions are open.

02:04:37 I tend to think that just asking the question,

02:04:40 as you mentioned, as you’ve done for a long time,

02:09:42 is the only thing, that there is no answer.

02:04:44 And asking the question is a really good exercise.

02:04:47 I mean, I have this, for me personally,

02:04:49 I’ve had a kind of feeling that creation is,

02:04:56 like for me has been very fulfilling.

02:04:58 And it seems like my meaning has been to create.

02:05:00 And I’m not sure what that is.

02:05:02 Like I don’t have, I’m single and I don’t have kids.

02:05:05 I’d love to have kids, but I also, it sounds creepy,

02:05:08 but I also see, sort of, you know,

02:05:13 I see programs as little creations.

02:05:15 I see robots as little creations.

02:05:19 I think those bring, and then ideas,

02:05:22 theorems are creations.

02:05:25 And those somehow intrinsically, like you said,

02:05:28 bring me joy.

02:05:29 I think they do to a lot of, at least scientists,

02:05:31 but I think they do to a lot of people.

02:05:34 So that, to me, if I had to force the answer to that,

02:05:37 I would say creating new things yourself.

02:05:43 For you.

02:05:44 For me, for me, for me.

02:05:45 I don’t know, but like you said, it keeps changing.

02:05:48 Is there some answer to that?

02:05:49 And some people, they can, I think,

02:05:52 they may say it’s experience, right?

02:05:54 Like their meaning of life,

02:05:56 they just want to experience

02:05:57 to the richest and fullest they can.

02:05:59 And a lot of people do take that path.

02:06:02 Yes, seeing life as actually a collection of moments

02:06:05 and then trying to make the richest possible sets,

02:06:10 fill those moments with the richest possible experiences.

02:06:13 Right.

02:06:14 And for me, I think it’s certainly,

02:06:16 we do share a lot of similarity here.

02:06:18 So creation is also really important for me,

02:06:20 even from the things I’ve already talked about,

02:06:24 even like writing papers,

02:06:26 and these are all creations as well.

02:06:30 And I have not quite thought

02:06:32 whether that is really the meaning of my life.

02:06:34 Like in a sense, also then maybe like,

02:06:37 what kind of things should you create?

02:06:38 There are so many different things that you could create.

02:06:42 And also you can say, another view is maybe growth.

02:06:46 It’s related, but different from experience.

02:06:50 Growth is also maybe a type of meaning of life.

02:06:53 It’s just, you try to grow every day,

02:06:55 try to be a better self every day.

02:06:59 And also ultimately, we are here,

02:07:04 it’s part of the overall evolution.

02:07:09 Right, the world is evolving and it’s growing.

02:07:11 Isn’t it funny that the growth seems to be

02:07:14 the more important thing

02:07:15 than the thing you’re growing towards.

02:07:18 It’s like, it’s not the goal, it’s the journey to it.

02:07:21 It’s almost like when you submit a paper,

02:07:27 there’s a sort of depressing element to it,

02:07:29 not in submitting the paper itself,

02:07:30 but when that whole project is over.

02:07:32 I mean, there’s the gratitude,

02:07:34 there’s the celebration and so on,

02:07:35 but you’re usually immediately looking for the next thing

02:07:39 or the next step, right?

02:07:40 It’s not that, the end of it is not the satisfaction,

02:07:44 it’s the hardship, the challenge you have to overcome,

02:07:47 the growth through the process.

02:07:48 It’s somehow probably deeply within us,

02:07:51 the same thing that drives the evolutionary process

02:07:54 is somehow within us,

02:07:55 with everything the way we see the world.

02:07:58 Since you’re thinking about these,

02:08:00 so you’re still in search of an answer.

02:08:02 I mean, yes and no,

02:08:05 in the sense that I think for people

02:08:07 who really dedicate time to searching for the answer,

02:08:11 to asking the question, what is the meaning of life,

02:08:15 it does not necessarily bring you happiness.

02:08:18 Yeah.

02:08:20 It’s a question, we can say, right?

02:08:23 Like whether it’s a well-defined question.

02:08:25 And, but on the other hand,

02:08:30 given that you get to answer it yourself,

02:08:33 you can define it yourself,

02:08:35 then sure, I can just give it an answer.

02:08:41 And in that sense, yes, it can help.

02:08:46 Like we discussed, right?

02:08:47 If you say, oh, then my meaning of life is to create

02:08:52 or to grow, then yes, then I think they can help.

02:08:57 But how do you know that that is really the meaning of life

02:09:00 or the meaning of your life?

02:09:02 It’s like there’s no way for you

02:09:04 to really answer the question.

02:09:05 Sure, but something about that certainty is liberating.

02:09:10 So it might be an illusion, you might not really know,

02:09:12 you might be just convincing yourself falsely,

02:09:15 but being sure that that’s the meaning,

02:09:18 there’s something liberating in that.

02:09:23 There’s something freeing in knowing this is your purpose.

02:09:26 So you can fully give yourself to that.

02:09:29 Without, you know, for a long time,

02:09:30 you know, I thought like, isn’t it all relative?

02:09:33 Like why, how do we even know what’s good and what’s evil?

02:09:38 Like isn’t everything just relative?

02:09:39 Like how do we know, you know,

02:09:42 the question of meaning is ultimately

02:09:44 the question of why do anything?

02:09:48 Why is anything good or bad?

02:09:50 Why is anything valuable and so on?

02:09:52 Exactly.

02:09:53 Then you start to, I think just like you said,

02:09:58 I think it’s a really useful question to ask,

02:10:02 but if you ask it for too long and too aggressively.

02:10:07 It may not be so productive.

02:10:08 It may not be productive and not just for traditionally

02:10:13 societally defined success, but also for happiness.

02:10:17 It seems like asking the question about the meaning of life

02:10:20 is like a trap.

02:10:24 We’re destined to be asking.

02:10:25 We’re destined to look up to the stars

02:10:27 and ask these big why questions

02:10:28 we’ll never be able to answer,

02:10:30 but we shouldn’t get lost in them.

02:10:31 I think that’s probably the,

02:10:34 that’s at least the lesson I picked up so far.

02:10:36 On that topic.

02:10:37 Oh, let me just add one more thing.

02:10:38 So it’s interesting.

02:10:40 So sometimes, yes, it can help you to focus.

02:10:47 So when I shifted my focus more from security

02:10:53 to AI and machine learning,

02:10:55 at the time, actually one of the main reasons

02:10:58 that I did that was because

02:11:02 I thought the meaning of my life

02:11:07 and the purpose of my life is to build intelligent machines.

02:11:14 And that’s, and then your inner voice said

02:11:16 that this is the right,

02:11:18 this is the right journey to take

02:11:20 to build intelligent machines

02:11:21 and that you actually fully realized it,

02:11:23 you took a really legitimate big step

02:11:26 to become one of the world-class researchers

02:11:28 to actually make it, to actually go down that journey.

02:11:32 Yeah, that’s profound.

02:11:35 That’s profound.

02:11:36 I don’t think there’s a better way

02:11:39 to end a conversation than talking for a while

02:11:42 about the meaning of life.

02:11:44 Dawn, it’s a huge honor to talk to you.

02:11:46 Thank you so much for talking today.

02:11:47 Thank you, thank you.

02:11:49 Thanks for listening to this conversation with Dawn Song

02:11:52 and thank you to our presenting sponsor, Cash App.

02:11:55 Please consider supporting the podcast

02:11:57 by downloading Cash App and using code LexPodcast.

02:12:01 If you enjoy this podcast, subscribe on YouTube,

02:12:03 review it with five stars on Apple Podcast,

02:12:06 support it on Patreon,

02:12:07 or simply connect with me on Twitter at LexFriedman.

02:12:11 And now let me leave you with some words about hacking

02:12:15 from the great Steve Wozniak.

02:12:17 A lot of hacking is playing with other people,

02:12:20 you know, getting them to do strange things.

02:12:24 Thank you for listening and hope to see you next time.