Andrew Ng: Deep Learning, Education, and Real-World AI #73

Transcript

00:00:00 The following is a conversation with Andrew Ng,

00:00:03 one of the most impactful educators, researchers, innovators, and leaders

00:00:08 in artificial intelligence and technology space in general.

00:00:11 He cofounded Coursera and Google Brain,

00:00:15 launched Deep Learning AI, Landing AI, and the AI Fund,

00:00:19 and was the chief scientist at Baidu.

00:00:23 As a Stanford professor and with Coursera and Deep Learning AI,

00:00:27 he has helped educate and inspire millions of students, including me.

00:00:33 This is the Artificial Intelligence Podcast.

00:00:36 If you enjoy it, subscribe on YouTube, give it five stars on Apple Podcast,

00:00:40 support it on Patreon, or simply connect with me on Twitter

00:00:43 at Lex Fridman, spelled F R I D M A N.

00:00:48 As usual, I’ll do one or two minutes of ads now

00:00:51 and never any ads in the middle that can break the flow of the conversation.

00:00:54 I hope that works for you and doesn’t hurt the listening experience.

00:00:59 This show is presented by Cash App, the number one finance app in the App Store.

00:01:03 When you get it, use code LEXPODCAST.

00:01:07 Cash App lets you send money to friends, buy Bitcoin,

00:01:10 and invest in the stock market with as little as $1.

00:01:13 Broker services are provided by Cash App Investing,

00:01:16 a subsidiary of Square and member SIPC.

00:01:20 Since Cash App allows you to buy Bitcoin,

00:01:23 let me mention that cryptocurrency in the context of the history of money is fascinating.

00:01:28 I recommend Ascent of Money as a great book on this history.

00:01:33 Debits and credits on ledgers started over 30,000 years ago.

00:01:38 The US dollar was created over 200 years ago,

00:01:42 and Bitcoin, the first decentralized cryptocurrency, released just over 10 years ago.

00:01:48 So given that history, cryptocurrency is still very much in its early days of development,

00:01:53 but it’s still aiming to and just might redefine the nature of money.

00:01:59 So again, if you get Cash App from the App Store or Google Play

00:02:03 and use the code LEXPODCAST, you’ll get $10,

00:02:07 and Cash App will also donate $10 to FIRST,

00:02:10 one of my favorite organizations that is helping to advance robotics and STEM education

00:02:15 for young people around the world.

00:02:18 And now, here’s my conversation with Andrew Ng.

00:02:23 The courses you taught on machine learning at Stanford

00:02:25 and later on Coursera that you cofounded have educated and inspired millions of people.

00:02:31 So let me ask you, what people or ideas inspired you

00:02:35 to get into computer science and machine learning when you were young?

00:02:39 When did you first fall in love with the field, is another way to put it.

00:02:43 Growing up in Hong Kong and Singapore, I started learning to code when I was five or six years old.

00:02:50 At that time, I was learning the BASIC programming language,

00:02:53 and I would take these books and they'd tell you,

00:02:56 type this program into your computer, so I'd type that program into my computer.

00:03:00 And as a result of all that typing, I would get to play these very simple shoot-'em-up games

00:03:05 that I had implemented on my little computer.

00:03:09 So I thought it was fascinating as a young kid that I could write this code.

00:03:14 I was really just copying code from a book into my computer

00:03:18 to then play these cool little video games.

00:03:21 Another moment for me was when I was a teenager and my father,

00:03:27 who’s a doctor, was reading about expert systems and about neural networks.

00:03:31 So he got me to read some of these books, and I thought it was really cool.

00:03:34 You could write a computer program that started to exhibit intelligence.

00:03:39 Then I remember doing an internship while I was in high school, this was in Singapore,

00:03:44 where I remember doing a lot of photocopying as an office assistant.

00:03:50 And the highlight of my job was when I got to use the shredder.

00:03:53 So the teenager in me remembers thinking, boy, this is a lot of photocopying.

00:03:57 If only we could write software, build a robot, something to automate this,

00:04:01 maybe I could do something else.

00:04:03 So I think a lot of my work since then has centered on the theme of automation.

00:04:07 Even the way I think about machine learning today,

00:04:09 we’re very good at writing learning algorithms that can automate things that people can do.

00:04:14 Or even launching the first MOOCs, Massive Open Online Courses, that later led to Coursera.

00:04:20 I was trying to automate what could be automatable in how I was teaching on campus.

00:04:25 Process of education, trying to automate parts of that to make it more,

00:04:30 sort of to have more impact from a single teacher, a single educator.

00:04:34 Yeah, I felt, you know, teaching at Stanford,

00:04:37 teaching machine learning to about 400 students a year at the time.

00:04:41 And I found myself filming the exact same video every year,

00:04:46 telling the same jokes in the same room.

00:04:48 And I thought, why am I doing this?

00:04:50 Why don’t we just take last year’s video?

00:04:51 And then I can spend my time building a deeper relationship with students.

00:04:55 So that process of thinking through how to do that,

00:04:57 that led to the first MOOCs that we launched.

00:05:00 And then you have more time to write new jokes.

00:05:03 Are there favorite memories from your early days at Stanford,

00:05:06 teaching thousands of people in person and then millions of people online?

00:05:12 You know, teaching online, what not many people know was that a lot of those videos

00:05:19 were shot between the hours of 10 p.m. and 3 a.m.

00:05:24 At the time, we were launching the first MOOCs at Stanford.

00:05:31 We had already announced the course; about 100,000 people had signed up.

00:05:33 We had just started to write the code and we had not yet actually filmed the videos.

00:05:39 So a lot of pressure, 100,000 people waiting for us to produce the content.

00:05:43 So many Fridays, Saturdays, I would go out, have dinner with my friends,

00:05:49 and then I would think, OK, do you want to go home now?

00:05:51 Or do you want to go to the office to film videos?

00:05:54 And the thought of being able to help 100,000 people potentially learn machine learning,

00:05:59 fortunately, that made me think, OK, I want to go to my office,

00:06:03 go to my tiny little recording studio.

00:06:05 I would adjust my Logitech webcam, adjust my Wacom tablet,

00:06:10 make sure my lapel mic was on,

00:06:12 and then I would start recording often until 2 a.m. or 3 a.m.

00:06:15 I think, fortunately, it doesn't show that it was recorded that late at night,

00:06:20 but it was really inspiring the thought that we could create content

00:06:25 to help so many people learn about machine learning.

00:06:27 How did that feel?

00:06:29 The fact that you’re probably somewhat alone,

00:06:31 maybe a couple of friends recording with a Logitech webcam

00:06:36 and kind of going home alone at 1 or 2 a.m. at night

00:06:40 and knowing that that’s going to reach sort of thousands of people,

00:06:45 eventually millions of people, what’s that feeling like?

00:06:48 I mean, is there a feeling of just satisfaction of pushing through?

00:06:54 I think it’s humbling.

00:06:55 And I wasn’t thinking about what I was feeling.

00:06:57 I think one thing that I’m proud to say we got right from the early days

00:07:02 was I told my whole team back then that the number one priority

00:07:06 is to do what’s best for learners, do what’s best for students.

00:07:09 And so when I went to the recording studio,

00:07:11 the only thing on my mind was what can I say?

00:07:13 How can I design my slides?

00:07:15 What do I need to draw to make these concepts as clear as possible for learners?

00:07:20 I think I’ve seen that sometimes for instructors it’s tempting to say,

00:07:24 hey, let’s talk about my work.

00:07:25 Maybe if I teach you about my research,

00:07:27 someone will cite my papers a couple more times.

00:07:29 And I think one of the things we got right,

00:07:31 launching the first few MOOCs and later building Coursera,

00:07:34 was putting in place that bedrock principle of

00:07:37 let’s just do what’s best for learners and forget about everything else.

00:07:40 And I think that guiding principle

00:07:43 turned out to be really important to the rise of the MOOC movement.

00:07:46 And the kind of learner you imagined in your mind

00:07:49 is as broad as possible, as global as possible.

00:07:53 So really try to reach as many people

00:07:56 interested in machine learning and AI as possible.

00:07:59 I really want to help anyone that had an interest in machine learning

00:08:03 to break into the field.

00:08:05 And I think sometimes I’ve actually had people ask me,

00:08:08 hey, why are you spending so much time explaining gradient descent?

00:08:11 And my answer was, if I look at what I think the learner needs

00:08:15 and would benefit from, I felt that having

00:08:18 a good understanding of the foundations, coming back to the basics,

00:08:22 would put them in better stead to then build a long-term career.

00:08:26 So I tried to consistently make decisions on that principle.

00:08:30 So one of the things you actually revealed to the narrow AI community

00:08:35 at the time and to the world is that the amount of people

00:08:39 who are actually interested in AI is much larger than we imagined.

00:08:43 By you teaching the class and how popular it became,

00:08:47 it showed that, wow, this isn’t just a small community

00:08:50 of sort of people who go to NeurIPS and it’s much bigger.

00:08:56 It’s developers, it’s people from all over the world.

00:08:59 I mean, I’m Russian, so everybody in Russia is really interested.

00:09:03 There’s a huge number of programmers who are interested in machine learning,

00:09:06 India, China, South America, everywhere.

00:09:10 There’s just millions of people who are interested in machine learning.

00:09:13 So how big do you get a sense the number of people

00:09:16 that are interested is, from your perspective?

00:09:20 I think the number has grown over time.

00:09:22 I think it’s one of those things that maybe feels like it came out of nowhere,

00:09:26 but for those of us building it, it took years.

00:09:28 It’s one of those overnight successes that took years to get there.

00:09:33 My first foray into this type of online education

00:09:35 was when we were filming my Stanford class

00:09:37 and sticking the videos on YouTube and some other things.

00:09:40 We had uploaded the whole course and so on,

00:09:42 but it was basically the one-hour, 15-minute videos that we put on YouTube.

00:09:47 And then we had four or five other versions of websites that I had built,

00:09:52 most of which you would never have heard of

00:09:53 because they reached small audiences,

00:09:55 but that allowed me to iterate,

00:09:57 allowed my team and me to iterate,

00:09:59 to learn which ideas work and which don’t.

00:10:02 For example, one of the features I was really excited about

00:10:04 and really proud of was building this website

00:10:07 where multiple people could be logged into the website at the same time.

00:10:11 So today, if you go to a website,

00:10:13 if you are logged in and then I want to log in,

00:10:15 you need to log out because it’s the same browser, the same computer.

00:10:18 But I thought, well, what if two people say you and me

00:10:21 were watching a video together in front of a computer?

00:10:24 What if a website could have you type your name and password,

00:10:27 have me type my name and password,

00:10:28 and then now the computer knows both of us are watching together

00:10:31 and it gives both of us credit for anything we do as a group.

00:10:35 So we built this feature and rolled it out in a high school in San Francisco.

00:10:39 We had about 20 something users.

00:10:42 Where’s the teacher there?

00:10:43 Sacred Heart Cathedral Prep, the teacher is great.

00:10:46 I mean, guess what?

00:10:47 Zero people use this feature.

00:10:49 It turns out people studying online,

00:10:51 they want to watch the videos by themselves.

00:10:53 So they can play back and pause at their own speed rather than in groups.

00:10:57 So that was one example of a tiny lesson learned out of many

00:11:01 that allowed us to hone in on the set of features.

00:11:04 It sounds like a brilliant feature.

00:11:06 So I guess the lesson to take from that is

00:11:11 there’s something that looks amazing on paper and then nobody uses it.

00:11:15 It doesn’t actually have the impact that you think it might have.

00:11:18 And so, yeah, I saw that you really went through a lot of different features

00:11:21 and a lot of ideas to arrive at Coursera,

00:11:25 the final kind of powerful thing that showed the world

00:11:28 that MOOCs can educate millions.

00:11:32 And I think with the whole machine learning movement as well,

00:11:35 I think it didn’t come out of nowhere.

00:11:38 Instead, what happened was as more people learn about machine learning,

00:11:42 they will tell their friends and their friends will see

00:11:44 how it’s applicable to their work.

00:11:45 And then the community kept on growing.

00:11:48 And I think we’re still growing.

00:11:50 I don’t know in the future what percentage of all developers

00:11:54 will be AI developers.

00:11:56 I could easily see it being north of 50%, right?

00:11:58 Because AI developers, broadly construed,

00:12:03 not just people doing the machine learning modeling,

00:12:05 but the people building infrastructure, data pipelines,

00:12:08 all the software surrounding the core machine learning model

00:12:13 maybe that’s even bigger.

00:12:14 I feel like today almost every software engineer

00:12:17 has some understanding of the cloud.

00:12:19 Not all, maybe there’s the microcontroller developer

00:12:23 that doesn’t need to deal with the cloud.

00:12:24 But I feel like the vast majority of software engineers today

00:12:28 are sort of having an appreciation of the cloud.

00:12:31 I think in the future, maybe we’ll approach nearly 100% of all developers

00:12:35 being in some way an AI developer

00:12:38 or at least having an appreciation of machine learning.

00:12:41 And my hope is that there’s this kind of effect

00:12:44 that there’s people who are not really interested in being a programmer

00:12:48 or being into software engineering, like biologists, chemists,

00:12:51 and physicists, even mechanical engineers,

00:12:55 all these disciplines that are now more and more sitting on large data sets.

00:13:01 And they didn’t think they were interested in programming

00:13:04 until they have this data set and they realize

00:13:06 there’s this set of machine learning tools

00:13:07 that allow them to use the data set.

00:13:09 So they actually become, they learn to program

00:13:12 and they become new programmers.

00:13:13 So it’s not just, like you mentioned,

00:13:16 a larger percentage of developers becoming machine learning people.

00:13:19 It seems like more and more, the kinds of people

00:13:24 who are becoming developers is also growing significantly.

00:13:27 Yeah, I think once upon a time,

00:13:30 only a small part of humanity was literate, could read and write.

00:13:34 And maybe you thought, maybe not everyone needs to learn to read and write.

00:13:37 You just go listen to a few monks read to you and maybe that was enough.

00:13:44 Or maybe you just need a handful of authors to write the bestsellers

00:13:47 and no one else needs to write.

00:13:50 But what we found was that by giving as many people,

00:13:53 in some countries, almost everyone, basic literacy,

00:13:56 it dramatically enhanced human to human communications.

00:13:59 And we can now write for an audience of one,

00:14:01 such as if I send you an email or you send me an email.

00:14:04 I think in computing, we’re still in that phase

00:14:07 where so few people know how to code

00:14:09 that the coders mostly have to code for relatively large audiences.

00:14:14 But if everyone, or most people became developers at some level,

00:14:20 similar to how most people in developed economies are somewhat literate,

00:14:24 I would love to see the owners of a mom and pop store

00:14:27 be able to write a little bit of code to customize the TV display

00:14:30 for their special this week.

00:14:32 And I think it will enhance human to computer communications,

00:14:36 which is becoming more and more important today as well.

00:14:38 So you think it’s possible that machine learning

00:14:41 becomes kind of similar to literacy,

00:14:45 where, like you said, the owners of a mom and pop shop,

00:14:49 basically everybody in all walks of life,

00:14:52 would have some degree of programming capability?

00:14:55 I could see society getting there.

00:14:58 There’s one other interesting thing.

00:15:00 If I go talk to the mom and pop store,

00:15:02 if I talk to a lot of people in their daily professions,

00:15:05 I previously didn’t have a good story for why they should learn to code.

00:15:09 We could give them some reasons.

00:15:11 But what I found with the rise of machine learning and data science is that

00:15:14 I think the number of people with a concrete use for data science

00:15:18 in their daily lives, in their jobs,

00:15:20 may be even larger than the number of people

00:15:22 who have concrete use for software engineering.

00:15:25 For example, if you run a small mom and pop store,

00:15:28 I think if you can analyze the data about your sales, your customers,

00:15:31 I think there’s actually real value there,

00:15:34 maybe even more than traditional software engineering.

00:15:37 So I find that for a lot of my friends in various professions,

00:15:40 be it recruiters or accountants or people that work in the factories,

00:15:45 which I deal with more and more these days,

00:15:48 I feel if they were data scientists at some level,

00:15:51 they could immediately use that in their work.

00:15:54 So I think that data science and machine learning

00:15:56 may be an even easier entree into the developer world

00:16:00 for a lot of people than software engineering.

00:16:03 That’s interesting.

00:16:04 And I agree with that, but that’s beautifully put.

00:16:06 But we live in a world where most courses and talks have slides,

00:16:11 PowerPoint, keynote,

00:16:12 and yet you famously often still use a marker and a whiteboard.

00:16:17 The simplicity of that is compelling,

00:16:19 and for me at least, fun to watch.

00:16:22 So let me ask, why do you like using a marker and whiteboard,

00:16:25 even on the biggest of stages?

00:16:28 I think it depends on the concepts you want to explain.

00:16:32 For mathematical concepts,

00:16:34 it’s nice to build up the equation one piece at a time,

00:16:37 and the whiteboard marker or the pen and stylus

00:16:41 is a very easy way to build up the equation,

00:16:43 to build up a complex concept one piece at a time

00:16:47 while you’re talking about it,

00:16:48 and sometimes that enhances understandability.

00:16:52 The downside of writing is that it’s slow,

00:16:54 and so if you want a long sentence, it’s very hard to write that.

00:16:57 So I think there are pros and cons,

00:16:58 and sometimes I use slides,

00:17:00 and sometimes I use a whiteboard or a stylus.

00:17:03 The slowness of a whiteboard is also its upside,

00:17:06 because it forces you to reduce everything to the basics.

00:17:12 Some of your talks involve the whiteboard.

00:17:14 I mean, you go very slowly,

00:17:17 and you really focus on the most simple principles,

00:17:20 and that’s a beautiful,

00:17:22 that enforces a kind of a minimalism of ideas

00:17:26 that I think is surprising at least for me is great for education.

00:17:31 Like a great talk, I think, is not one that has a lot of content.

00:17:36 A great talk is one that just clearly says a few simple ideas,

00:17:41 and I think the whiteboard somehow enforces that.

00:17:46 Pieter Abbeel, who’s now one of the top roboticists

00:17:49 and reinforcement learning experts in the world,

00:17:51 was your first PhD student.

00:17:54 So I bring him up just because I kind of imagine

00:17:58 this must have been an interesting time in your life,

00:18:01 and do you have any favorite memories of working with Pieter,

00:18:04 since he was your first student, in those uncertain times,

00:18:08 especially before deep learning really sort of blew up?

00:18:15 Any favorite memories from those times?

00:18:17 Yeah, I was really fortunate to have had Pieter Abbeel

00:18:20 as my first PhD student,

00:18:22 and I think even my long term professional success

00:18:25 builds on early foundations or early work

00:18:27 that Pieter was so critical to.

00:18:29 So I was really grateful to him for working with me.

00:18:34 What not a lot of people know is just how hard research was,

00:18:39 and still is.

00:18:42 Pieter’s PhD thesis was using reinforcement learning

00:18:44 to fly helicopters.

00:18:47 And so, even today, the website heli.stanford.edu,

00:18:51 heli.stanford.edu is still up.

00:18:53 You can watch videos of us using reinforcement learning

00:18:56 to make a helicopter fly upside down,

00:18:57 fly loops and rolls, so it’s cool.

00:18:59 It’s one of the most incredible robotics videos ever,

00:19:02 so people should watch it.

00:19:03 Oh yeah, thank you.

00:19:04 It’s inspiring.

00:19:05 That’s from like 2008 or seven or six, like that range.

00:19:10 Yeah, something like that.

00:19:11 Yeah, so it was over 10 years old.

00:19:12 That was really inspiring to a lot of people, yeah.

00:19:15 What not many people see is how hard it was.

00:19:18 So Pieter and Adam Coates and Morgan Quigley and I

00:19:22 were working on various versions of the helicopter,

00:19:25 and a lot of things did not work.

00:19:27 For example, it turns out one of the hardest problems we had

00:19:29 was when the helicopter’s flying around upside down,

00:19:32 doing stunts, how do you figure out the position?

00:19:34 How do you localize the helicopter?

00:19:36 So we wanted to try all sorts of things.

00:19:38 Having one GPS unit doesn’t work

00:19:41 because you’re flying upside down,

00:19:42 the GPS unit’s facing down, so you can’t see the satellites.

00:19:44 So we experimented trying to have two GPS units,

00:19:48 one facing up, one facing down.

00:19:49 So if you flip over, that didn’t work

00:19:51 because the downward facing one couldn’t synchronize

00:19:54 if you’re flipping quickly.

00:19:55 Morgan Quigley was exploring this crazy,

00:19:58 complicated configuration of specialized hardware

00:20:01 to interpret GPS signals.

00:20:03 Working at the FPGA level, completely insane.

00:20:06 Spent about a year working on that, didn’t work.

00:20:09 So I remember Pieter, great guy, him and me,

00:20:13 sitting down in my office looking at some of the latest things

00:20:17 we had tried that didn’t work and saying,

00:20:20 darn it, what now?

00:20:22 Because we tried so many things and it just didn’t work.

00:20:25 In the end, what we did, and Adam Coates was crucial to this,

00:20:31 was put cameras on the ground and use cameras on the ground

00:20:34 to localize the helicopter.

00:20:35 And that solved the localization problem

00:20:38 so that we could then focus on the reinforcement learning

00:20:41 and inverse reinforcement learning techniques

00:20:43 to then actually make the helicopter fly.

00:20:46 And I’m reminded, when I was doing this work at Stanford,

00:20:50 around that time, there was a lot of reinforcement learning

00:20:54 theoretical papers, but not a lot of practical applications.

00:20:58 So the autonomous helicopter work for flying helicopters

00:21:02 was one of the few practical applications

00:21:05 of reinforcement learning at the time,

00:21:06 which caused it to become pretty well known.

00:21:10 I feel like we might have almost come full circle with today.

00:21:13 There’s so much buzz, so much hype, so much excitement

00:21:16 about reinforcement learning.

00:21:17 But again, we’re hunting for more applications

00:21:20 of all of these great ideas that the community has come up with.

00:21:23 What was the drive sort of in the face of the fact

00:21:28 that most people are doing theoretical work?

00:21:30 What motivates you in the uncertainty and the challenges

00:21:32 to get the helicopter sort of to do the applied work,

00:21:36 to get the actual system to work?

00:21:39 Yeah, in the face of fear, uncertainty, sort of the setbacks

00:21:43 that you mentioned for localization.

00:21:45 I like stuff that works.

00:21:47 In the physical world.

00:21:48 So like, it’s back to the shredder.

00:21:50 You know, I like theory, but when I work on theory myself,

00:21:55 and this is personal taste,

00:21:56 I’m not saying anyone else should do what I do.

00:21:58 But when I work on theory, I personally enjoy it more

00:22:01 if I feel that the work I do will influence people,

00:22:06 have positive impact, or help someone.

00:22:10 I remember when many years ago,

00:22:12 I was speaking with a mathematics professor,

00:22:15 and I kind of just asked, hey, why do you do what you do?

00:22:21 And then he said, he had stars in his eyes when he answered.

00:22:25 And this mathematician, not from Stanford,

00:22:28 different university, he said, I do what I do

00:22:31 because it helps me to discover truth and beauty

00:22:35 in the universe.

00:22:36 He had stars in his eyes when he said that.

00:22:38 And I thought, that’s great.

00:22:41 I don’t want to do that.

00:22:42 I think it’s great that someone does that,

00:22:44 fully support the people that do it,

00:22:45 a lot of respect for people that do that.

00:22:46 But I am more motivated when I can see a line

00:22:50 to how the work that my teams and I are doing helps people.

00:22:56 The world needs all sorts of people.

00:22:58 I’m just one type.

00:22:59 I don’t think everyone should do things

00:23:01 the same way as I do.

00:23:02 But when I delve into either theory or practice,

00:23:05 if I personally have conviction that here’s a pathway

00:23:09 to help people, I find that more satisfying

00:23:14 to have that conviction.

00:23:15 That’s your path.

00:23:17 You were a proponent of deep learning

00:23:19 before it gained widespread acceptance.

00:23:23 What did you see in this field that gave you confidence?

00:23:26 What was your thinking process like in that first decade

00:23:28 of the, I don’t know what that’s called, 2000s, the aughts?

00:23:33 Yeah, I can tell you the thing we got wrong

00:23:35 and the thing we got right.

00:23:36 The thing we really got wrong was the importance of,

00:23:40 the early importance of unsupervised learning.

00:23:42 So early days of Google Brain,

00:23:46 we put a lot of effort into unsupervised learning

00:23:48 rather than supervised learning.

00:23:49 And there was this argument,

00:23:50 I think it was around 2005, after NeurIPS,

00:23:55 at that time called NIPS, had ended.

00:23:58 And Jeff Hinton and I were sitting in the cafeteria

00:24:01 outside the conference.

00:24:02 We had lunch, we were just chatting.

00:24:04 And Jeff pulled up this napkin.

00:24:05 He started sketching this argument on a napkin.

00:24:07 It was very compelling, so I’ll repeat it.

00:24:10 The human brain has about a hundred trillion,

00:24:12 so 10 to the 14, synaptic connections.

00:24:16 You will live for about 10 to the nine seconds.

00:24:19 That’s 30 years.

00:24:20 You actually live for two times 10 to the nine,

00:24:22 maybe three times 10 to the nine seconds.

00:24:24 So just let’s say 10 to the nine.

00:24:26 So if each synaptic connection,

00:24:29 each weight in your brain’s neural network

00:24:31 has just a one bit parameter,

00:24:33 that’s 10 to the 14 bits you need to learn

00:24:36 in up to 10 to the nine seconds.

00:24:38 10 to the nine seconds of your life.

00:24:41 So via this simple argument,

00:24:43 which has a lot of problems, it’s very simplified.

00:24:45 That’s 10 to the five bits per second

00:24:47 you need to learn in your life.
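To make the back-of-the-envelope arithmetic explicit, the napkin argument sketched above works out roughly as follows (order of magnitude only, using the figures Andrew quotes):

```latex
% Rough napkin arithmetic, order of magnitude only
\[
10^{14}\ \text{synapses} \times 1\ \text{bit per synapse} = 10^{14}\ \text{bits to set}
\]
\[
\frac{10^{14}\ \text{bits}}{10^{9}\ \text{seconds of life}} = 10^{5}\ \text{bits per second}
\]
```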

00:24:49 And I have a one year old daughter.

00:24:52 I am not pointing out 10 to five bits per second

00:24:56 of labels to her.

00:24:59 And I think I’m a very loving parent,

00:25:01 but I’m just not gonna do that.

00:25:04 So from this very crude, definitely problematic argument,

00:25:08 there’s just no way that most of what we know

00:25:11 is through supervised learning.

00:25:13 But where you get so many bits of information

00:25:15 is from sucking in images, audio,

00:25:16 those experiences in the world.

00:25:19 And so that argument,

00:25:21 and there are a lot of known flaws with that argument

00:25:23 we won’t go into,

00:25:24 really convinced me that there’s a lot of power

00:25:26 to unsupervised learning.

00:25:29 So that was the part that we actually maybe got wrong.

00:25:32 I still think unsupervised learning is really important,

00:25:34 but in the early days, 10, 15 years ago,

00:25:38 a lot of us thought that was the path forward.

00:25:41 Oh, so you’re saying that that perhaps

00:25:43 was the wrong intuition for the time.

00:25:45 For the time, that was the part we got wrong.

00:25:48 The part we got right was the importance of scale.

00:25:51 So Adam Coates, another wonderful person,

00:25:55 fortunate to have worked with him,

00:25:57 he was in my group at Stanford at the time

00:25:59 and Adam had run these experiments at Stanford

00:26:02 showing that the bigger we train a learning algorithm,

00:26:05 the better its performance.

00:26:07 And it was based on that.

00:26:09 There was a graph that Adam generated

00:26:12 where, along the x axis, the lines were going up and to the right.

00:26:15 So the bigger you make this thing,

00:26:17 the better its performance; accuracy is the vertical axis.

00:26:20 So it’s really based on that chart that Adam generated

00:26:22 that gave me the conviction

00:26:23 that if you could scale these models way bigger

00:26:26 than what we could on a few CPUs,

00:26:27 which is what we had at Stanford,

00:26:29 we could get even better results.

00:26:31 And it was really based on that one figure

00:26:33 that Adam generated

00:26:34 that gave me the conviction to go with Sebastian Thrun

00:26:38 to pitch starting a project at Google,

00:26:42 which became the Google Brain project.

00:26:43 The Brain, you go find a Google Brain.

00:26:45 And there the intuition was scale

00:26:48 will bring performance for the system.

00:26:52 So we should chase a larger and larger scale.
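As a rough illustration of the kind of experiment described above, here is a minimal sketch, not the original Stanford code, that trains the same model family at increasing sizes and records validation accuracy, the sort of curve that goes up and to the right:

```python
# Minimal sketch (not the original experiment): same task, progressively larger
# models, recording validation accuracy versus model size.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=5000, n_features=50, n_informative=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

for width in [4, 16, 64, 256]:  # "bigger" models, left to right on the x axis
    model = MLPClassifier(hidden_layer_sizes=(width,), max_iter=300, random_state=0)
    model.fit(X_tr, y_tr)
    print(f"hidden units={width:4d}  val accuracy={model.score(X_val, y_val):.3f}")
```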

00:26:55 And I think people don’t realize how groundbreaking that is.

00:27:00 It’s simple, but it’s a groundbreaking idea

00:27:02 that bigger data sets will result in better performance.

00:27:05 It was controversial at the time.

00:27:08 Some of my well meaning friends,

00:27:10 senior people in the machine learning community,

00:27:11 I won’t name, but some of whom we know,

00:27:16 my well meaning friends came

00:27:17 and were trying to give me friendly advice,

00:27:19 like, hey, Andrew, why are you doing this?

00:27:20 This is crazy.

00:27:21 It’s in the neural network architecture.

00:27:23 Look at these architectures you’re building.

00:27:24 You just want to go for scale?

00:27:25 Like, this is a bad career move.

00:27:27 So my well meaning friends,

00:27:29 some of them were trying to talk me out of it.

00:27:33 But I find that if you want to make a breakthrough,

00:27:36 you sometimes have to have conviction

00:27:38 and do something before it’s popular,

00:27:40 since that lets you have a bigger impact.

00:27:43 Let me ask you just a small tangent on that topic.

00:27:45 I find myself arguing with people saying that greater scale,

00:27:51 especially in the context of active learning,

00:27:53 so very carefully selecting the data set,

00:27:56 but growing the scale of the data set

00:27:59 is going to lead to even further breakthroughs

00:28:01 in deep learning.

00:28:02 And there’s currently pushback against that idea,

00:28:05 that larger data sets are no longer enough,

00:28:09 so you want to increase the efficiency of learning.

00:28:11 You want to make better learning mechanisms.

00:28:13 And I personally believe that bigger data sets will still,

00:28:17 with the same learning methods we have now,

00:28:19 will result in better performance.

00:28:21 What’s your intuition at this time

00:28:23 on these two sides?

00:28:27 Do we need to come up with better architectures for learning

00:28:31 or can we just get bigger, better data sets

00:28:35 that will improve performance?

00:28:37 I think both are important and it’s also problem dependent.

00:28:40 So for a few data sets,

00:28:41 we may be approaching a Bayes error rate

00:28:45 or approaching or surpassing human level performance

00:28:48 and then there’s that theoretical ceiling

00:28:50 that we will never surpass,

00:28:51 the Bayes error rate.

00:28:54 But then I think there are plenty of problems

00:28:56 where we’re still quite far

00:28:57 from either human level performance

00:28:59 or from Bayes error rate

00:29:00 and bigger data sets with neural networks

00:29:05 without further algorithmic innovation

00:29:07 will be sufficient to take us further.

00:29:10 But on the flip side,

00:29:11 if we look at the recent breakthroughs

00:29:12 using transforming networks or language models,

00:29:15 it was a combination of novel architecture

00:29:18 but also scale had a lot to do with it.

00:29:20 If we look at what happened with GPT-2 and BERT,

00:29:22 I think scale was a large part of the story.

00:29:26 Yeah, that’s not often talked about

00:29:28 is the scale of the data set it was trained on

00:29:30 and the quality of the data set

00:29:32 because there’s some,

00:29:35 so it was like Reddit threads

00:29:38 that were upvoted highly.

00:29:39 So there’s already some weak supervision

00:29:42 on a very large data set

00:29:44 that people don’t often talk about, right?

00:29:47 I find that today we have maturing processes

00:29:50 for managing code,

00:29:52 things like Git, right?

00:29:53 Version control.

00:29:54 It took us a long time to evolve the good processes.

00:29:58 I remember when my friends and I

00:29:59 were emailing each other C++ files in email,

00:30:02 but then we had,

00:30:03 was it CVS, then Subversion, then Git?

00:30:05 Maybe something else in the future.

00:30:07 We’re very immature in terms of tools for managing data,

00:30:10 thinking about clean data

00:30:11 and how to sort out these very hot, messy data problems.

00:30:15 I think there’s a lot of innovation there

00:30:17 to be had still.

00:30:17 I love the idea that you were versioning through email.

00:30:21 I’ll give you one example.

00:30:23 When we work with manufacturing companies,

00:30:29 it’s not at all uncommon

00:30:31 for there to be multiple labels

00:30:34 that disagree with each other, right?

00:30:36 And so we would do the work in visual inspection.

00:30:40 We will take, say, a plastic part

00:30:42 and show it to one inspector

00:30:44 and the inspector, sometimes very opinionated,

00:30:47 they’ll go, clearly, that’s a defect.

00:30:48 This scratch, unacceptable.

00:30:49 Gotta reject this part.

00:30:51 Take the same part to a different inspector,

00:30:53 also very opinionated.

00:30:54 Clearly, the scratch is small.

00:30:56 It’s fine.

00:30:56 Don’t throw it away.

00:30:57 You’re gonna make us, you know.

00:30:59 And then sometimes you take the same plastic part,

00:31:01 show it to the same inspector

00:31:03 in the afternoon, as opposed to the morning,

00:31:05 and, very opinionated, in the morning,

00:31:07 they say, clearly, it’s okay.

00:31:08 In the afternoon, equally confident.

00:31:10 Clearly, this is a defect.

00:31:12 And so what is an AI team supposed to do

00:31:14 if sometimes even one person doesn’t agree

00:31:17 with himself or herself in the span of a day?

00:31:20 So I think these are the types of very practical,

00:31:23 very messy data problems that my teams wrestle with.

00:31:30 In the case of large consumer internet companies

00:31:32 where you have a billion users,

00:31:34 you have a lot of data.

00:31:35 You don’t worry about it.

00:31:36 Just take the average.

00:31:37 It kind of works.

00:31:38 But in the case of other industry settings,

00:31:40 we don’t have big data.

00:31:42 It’s just small data, very small data sets,

00:31:44 maybe around 100 defective parts

00:31:47 or 100 examples of a defect.

00:31:49 If you have only 100 examples,

00:31:51 these little labeling errors,

00:31:53 if 10 of your 100 labels are wrong,

00:31:55 that’s actually 10% of your data set, which has a big impact.

00:31:58 So how do you clean this up?

00:31:59 What are you supposed to do?
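One simple first step, a hedged sketch rather than Landing AI's actual pipeline, is to flag examples whose labels disagree with out-of-fold model predictions and send those back to the inspectors for re-review:

```python
# Sketch: flag likely label errors in a tiny dataset by comparing each label
# against cross-validated predictions; flagged parts go back for human re-review.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X = np.random.rand(100, 16)        # stand-in features for 100 inspected parts
y = np.random.randint(0, 2, 100)   # stand-in labels: defect / no defect

pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=5)
suspect = np.where(pred != y)[0]   # indices where the model disagrees with the label
print(f"{len(suspect)} of {len(y)} labels flagged for re-review:", suspect[:10])
```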

00:32:01 This is an example of the types of things

00:32:03 that my teams, this is a landing AI example,

00:32:06 are wrestling with to deal with small data,

00:32:09 which comes up all the time

00:32:10 once you’re outside consumer internet.

00:32:12 Yeah, that’s fascinating.

00:32:13 So then you invest more effort and time

00:32:15 in thinking about the actual labeling process.

00:32:18 What are the labels?

00:32:19 How are disagreements resolved,

00:32:22 and all those kinds of pragmatic, real-world problems?

00:32:25 That’s a fascinating space.

00:32:27 Yeah, I find that actually when I’m teaching at Stanford,

00:32:29 I increasingly encourage students at Stanford

00:32:32 to try to find their own project

00:32:37 for the end of term project,

00:32:38 rather than just downloading someone else’s

00:32:40 nicely clean data set.

00:32:41 It’s actually much harder if you need to go

00:32:43 and define your own problem and find your own data set,

00:32:45 rather than you go to one of the several good websites,

00:32:48 very good websites with clean scoped data sets

00:32:52 that you could just work on.

00:32:55 You’re now running three efforts,

00:32:56 the AI Fund, Landing AI, and deeplearning.ai.

00:33:02 As you’ve said, the AI Fund is involved

00:33:04 in creating new companies from scratch.

00:33:06 Landing AI is involved in helping

00:33:08 already established companies do AI

00:33:10 and deeplearning.ai is for education of everyone else

00:33:14 or of individuals interested in getting into the field

00:33:18 and excelling in it.

00:33:19 So let’s perhaps talk about each of these areas.

00:33:22 First, deeplearning.ai.

00:33:25 How, the basic question,

00:33:27 how does a person interested in deep learning

00:33:30 get started in the field?

00:33:32 Deeplearning.ai is working to create courses

00:33:35 to help people break into AI.

00:33:37 So my machine learning course that I taught through Stanford

00:33:42 is one of the most popular courses on Coursera.

00:33:45 To this day, it’s probably one of the courses,

00:33:48 sort of, if I asked somebody,

00:33:49 how did you get into machine learning

00:33:52 or how did you fall in love with machine learning

00:33:54 or what got you interested,

00:33:55 it always goes back to Andrew Ng at some point.

00:33:58 I see, yeah, I’m sure.

00:34:00 You’ve influenced, the amount of people

00:34:01 you’ve influenced is ridiculous.

00:34:03 So for that, I’m sure I speak for a lot of people

00:34:05 say big thank you.

00:34:07 No, yeah, thank you.

00:34:09 I was once reading a news article,

00:34:13 I think it was tech review

00:34:15 and I’m gonna mess up the statistic,

00:34:17 but I remember reading an article that said

00:34:20 something like one third of all programmers are self taught.

00:34:23 I may have the number wrong,

00:34:24 maybe it was two thirds,

00:34:25 but when I read that article,

00:34:26 I thought this doesn’t make sense.

00:34:28 Everyone is self taught.

00:34:29 So, cause you teach yourself.

00:34:31 I don’t teach people.

00:34:32 That’s well put.

00:34:33 Yeah, so how does one get started in deep learning

00:34:37 and where does deeplearning.ai fit into that?

00:34:40 So the deep learning specialization offered by deeplearning.ai

00:34:43 is, I think, Coursera’s top specialization.

00:34:49 It might still be.

00:34:50 So it’s a very popular way for people

00:34:52 to take that specialization

00:34:54 to learn about everything from neural networks

00:34:57 to how to tune a neural network,

00:34:59 to what is a ConvNet, to what is an RNN

00:35:02 or a sequence model or what is an attention model.

00:35:05 And so the deep learning specialization

00:35:09 steps everyone through those algorithms

00:35:10 so you deeply understand it

00:35:12 and can implement it and use it for whatever application.

00:35:15 From the very beginning.

00:35:16 So what would you say are the prerequisites

00:35:19 for somebody to take the deep learning specialization

00:35:22 in terms of maybe math or programming background?

00:35:25 Yeah, need to understand basic programming

00:35:27 since there are programming exercises in Python

00:35:30 and the math prereq is quite basic.

00:35:34 So no calculus is needed.

00:35:35 If you know calculus is great, you get better intuitions

00:35:38 but deliberately try to teach that specialization

00:35:41 without requiring calculus.

00:35:42 So I think high school math would be sufficient.

00:35:47 If you know how to multiply two matrices,

00:35:49 I think that’s great.

00:35:52 So a little basic linear algebra is great.

00:35:54 Basic linear algebra,

00:35:55 even very, very basic linear algebra, and some programming.

00:36:00 I think that people that have done the machine learning course

00:36:02 will find a deep learning specialization a bit easier

00:36:05 but it’s also possible to jump

00:36:06 into the deep learning specialization directly

00:36:08 but it will be a little bit harder

00:36:09 since we tend to go faster over concepts

00:36:14 like how does gradient descent work

00:36:16 and what is the objective function,

00:36:17 which are covered more slowly in the machine learning course.
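For reference, a toy version of the gradient descent and objective function ideas mentioned here, a least-squares example rather than actual course material, looks like this:

```python
# Toy gradient descent on a least-squares objective J(w) = mean((Xw - y)^2).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)
lr = 0.1                                   # learning rate
for step in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the objective
    w -= lr * grad                         # gradient descent update
print("learned weights:", np.round(w, 2))  # should be close to [2.0, -1.0, 0.5]
```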

00:36:20 Could you briefly mention some of the key concepts

00:36:22 in deep learning that students should learn

00:36:25 that you envision them learning in the first few months

00:36:27 in the first year or so?

00:36:29 So if you take the deep learning specialization,

00:36:31 you learn the foundations of what is a neural network.

00:36:34 How do you build up a neural network

00:36:36 from a single logistic unit to a stack of layers

00:36:40 to different activation functions.
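As a concrete illustration of that progression, a sketch rather than the specialization's actual code (PyTorch is used here only for brevity), a single logistic unit and a small stack of layers with different activations might look like this:

```python
# Sketch: from a single logistic unit to a small stack of layers.
import torch
import torch.nn as nn

logistic_unit = nn.Sequential(      # one unit: a linear score plus a sigmoid
    nn.Linear(10, 1),
    nn.Sigmoid(),
)

mlp = nn.Sequential(                # stacked layers with different activations
    nn.Linear(10, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.Tanh(),
    nn.Linear(16, 1),  nn.Sigmoid(),
)

x = torch.randn(4, 10)              # a batch of 4 example inputs
print(logistic_unit(x).shape, mlp(x).shape)   # both produce (4, 1) outputs
```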

00:36:43 You learn how to train the neural networks.

00:36:44 One thing I’m very proud of in that specialization

00:36:47 is we go through a lot of practical knowhow

00:36:50 of how to actually make these things work.

00:36:52 So what are the differences between different optimization algorithms?

00:36:55 What do you do if the algorithm overfits

00:36:57 or how do you tell if the algorithm is overfitting?

00:36:59 When do you collect more data?

00:37:00 When should you not bother to collect more data?

00:37:03 I find that even today, unfortunately,

00:37:06 there are engineers that will spend six months

00:37:09 trying to pursue a particular direction

00:37:12 such as collect more data

00:37:13 because we heard more data is valuable

00:37:15 but sometimes you could run some tests

00:37:18 and could have figured out six months earlier

00:37:20 that for this particular problem, collecting more data isn’t going to cut it.

00:37:23 So just don’t spend six months collecting more data.

00:37:26 Spend your time modifying the architecture or trying something else.
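One quick test along these lines, a sketch under generic assumptions rather than a prescribed recipe, is to plot a learning curve: if validation accuracy has already flattened as the training set grows, collecting more data probably won't cut it.

```python
# Sketch: a learning curve to check whether more data is likely to help.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:5d}  train acc={tr:.3f}  val acc={va:.3f}")
# A large train/val gap suggests overfitting; a flat validation curve suggests
# that more data alone is unlikely to help.
```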

00:37:30 So go through a lot of the practical knowhow

00:37:32 so that when someone, when you take the deep learning specialization,

00:37:37 you have those skills to be very efficient

00:37:39 in how you build these networks.

00:37:41 So dive right in to play with the network, to train it,

00:37:45 to do the inference on a particular data set,

00:37:47 to build intuition about it without building it up too big

00:37:52 to where you spend, like you said, six months

00:37:54 learning, building up your big project

00:37:57 without building any intuition of a small aspect of the data

00:38:02 that could already tell you everything you need to know about that data.

00:38:05 Yes, and also the systematic frameworks of thinking

00:38:09 for how to go about building practical machine learning.

00:38:12 Maybe to make an analogy, when we learn to code,

00:38:15 we have to learn the syntax of some programming language, right?

00:38:17 Be it Python or C++ or Octave or whatever.

00:38:21 But the equally important or maybe even more important part of coding

00:38:24 is to understand how to string together these lines of code

00:38:27 into coherent things.

00:38:28 So when should you put something in a function?

00:38:31 When should you not?

00:38:32 How do you think about abstraction?

00:38:34 So those frameworks are what makes a programmer efficient

00:38:39 even more than understanding the syntax.

00:38:41 I remember when I was an undergrad at Carnegie Mellon,

00:38:44 one of my friends would debug their code

00:38:47 by first trying to compile it, and this was C++ code.

00:38:50 And then for every line with a syntax error,

00:38:53 they wanted to get rid of the syntax errors as quickly as possible.

00:38:55 So how do you do that?

00:38:56 Well, they would delete every single line of code with a syntax error.

00:38:59 So really efficient for getting rid of syntax errors,

00:39:01 but horrible for debugging.

00:39:02 So I think we learn how to debug.

00:39:05 And I think in machine learning,

00:39:06 the way you debug a machine learning program

00:39:09 is very different than the way you do binary search or whatever,

00:39:13 or use a debugger, trace through the code

00:39:15 in traditional software engineering.

00:39:17 So it’s an evolving discipline,

00:39:18 but I find that the people that are really good

00:39:20 at debugging machine learning algorithms

00:39:22 are easily 10x, maybe 100x faster at getting something to work.

00:39:28 And the basic process of debugging is,

00:39:30 so the bug in this case,

00:39:32 why isn’t this thing learning, improving,

00:39:36 sort of going into the questions of overfitting

00:39:39 and all those kinds of things?

00:39:40 That’s the logical space that the debugging is happening in

00:39:45 with neural networks.

00:39:46 Yeah, often the question is, why doesn’t it work yet?

00:39:50 Or can I expect it to eventually work?

00:39:52 And what are the things I could try?

00:39:54 Change the architecture, more data, more regularization,

00:39:57 different optimization algorithm,

00:40:00 different types of data.

00:40:01 So to answer those questions systematically,

00:40:04 so that you don’t spend six months hitting down the blind alley

00:40:08 before someone comes and says,

00:40:09 why did you spend six months doing this?

00:40:12 What concepts in deep learning

00:40:13 do you think students struggle the most with?

00:40:16 Or sort of is the biggest challenge for them

00:40:19 was to get over that hill.

00:40:23 It hooks them and it inspires them and they really get it.

00:40:28 Similar to learning mathematics,

00:40:30 I think one of the challenges of deep learning

00:40:32 is that there are a lot of concepts

00:40:33 that build on top of each other.

00:40:36 If you ask me what’s hard about mathematics,

00:40:38 I have a hard time pinpointing one thing.

00:40:40 Is it addition, subtraction?

00:40:42 Is it a carry?

00:40:43 Is it multiplication?

00:40:44 There’s just a lot of stuff.

00:40:45 I think one of the challenges of learning math

00:40:48 and of learning certain technical fields

00:40:49 is that there are a lot of concepts

00:40:51 and if you miss a concept,

00:40:53 then you’re kind of missing the prerequisite

00:40:55 for something that comes later.

00:40:58 So in the deep learning specialization,

00:41:01 try to break down the concepts

00:41:03 to maximize the odds of each component being understandable.

00:41:06 So when you move on to the more advanced things,

00:41:09 when we learn ConvNets,

00:41:10 hopefully you have enough intuitions

00:41:12 from the earlier sections

00:41:13 to then understand why we structure ConvNets

00:41:16 in a certain way

00:41:18 and then eventually why we built RNNs and LSTMs

00:41:23 or attention models in a certain way

00:41:24 building on top of the earlier concepts.

00:41:27 Actually, I’m curious,

00:41:28 you do a lot of teaching as well.

00:41:30 Do you have a favorite,

00:41:33 this is the hard concept moment in your teaching?

00:41:39 Well, I don’t think anyone’s ever turned the interview on me.

00:41:43 I’m glad you’re the first.

00:41:46 I think that’s a really good question.

00:41:48 Yeah, it’s really hard to capture the moment

00:41:51 when they struggle.

00:41:51 I think you put it really eloquently.

00:41:53 I do think there’s moments

00:41:55 that are like aha moments

00:41:57 that really inspire people.

00:41:59 I think for some reason,

00:42:01 reinforcement learning,

00:42:03 especially deep reinforcement learning

00:42:05 is a really great way

00:42:07 to really inspire people

00:42:09 and convey what neural networks can do.

00:42:13 Even though neural networks

00:42:15 really are just a part of the deep RL framework,

00:42:18 but it’s a really nice way

00:42:19 to paint the entirety of the picture

00:42:22 of a neural network

00:42:23 being able to learn from scratch,

00:42:25 knowing nothing and explore the world

00:42:27 and pick up lessons.

00:42:29 I find that a lot of the aha moments

00:42:31 happen when you use deep RL

00:42:33 to teach people about neural networks,

00:42:36 which is counterintuitive.

00:42:37 I find like a lot of the inspired sort of fire

00:42:40 in people’s passion,

00:42:41 people’s eyes,

00:42:42 it comes from the RL world.

00:42:44 Do you find reinforcement learning

00:42:46 to be a useful part

00:42:48 of the teaching process or no?

00:42:51 I still teach reinforcement learning

00:42:53 in one of my Stanford classes

00:42:55 and my PhD thesis was on reinforcement learning.

00:42:57 So I clearly love the field.

00:42:59 I find that if I’m trying to teach

00:43:00 students the most useful techniques

00:43:03 for them to use today,

00:43:04 I end up shrinking the amount of time

00:43:07 I talk about reinforcement learning.

00:43:08 It’s not what’s working today.

00:43:10 Now, our world changes so fast.

00:43:12 Maybe this will be totally different

00:43:13 in a couple of years.

00:43:15 But I think we need a couple more things

00:43:17 for reinforcement learning to get there.

00:43:20 One of my teams is looking

00:43:21 to reinforcement learning

00:43:22 for some robotic control tasks.

00:43:23 So I see the applications,

00:43:25 but if you look at it as a percentage

00:43:27 of all of the impact

00:43:28 of the types of things we do,

00:43:30 it’s at least today, outside of

00:43:33 playing video games, right,

00:43:35 and a few other games, small in scope.

00:43:38 Actually, at NeurIPS,

00:43:39 a bunch of us were standing around

00:43:40 saying, hey, what’s your best example

00:43:42 of an actual deploy reinforcement

00:43:44 learning application?

00:43:45 And among like

00:43:47 senior machine learning researchers, right?

00:43:49 And again, there are some emerging ones,

00:43:51 but there are not that many great examples.

00:43:55 I think you’re absolutely right.

00:43:58 The sad thing is there hasn’t been

00:43:59 a big impactful real world application

00:44:03 of reinforcement learning.

00:44:04 I think its biggest impact to me

00:44:07 has been in the toy domain,

00:44:09 in the game domain,

00:44:10 in the small example.

00:44:11 That’s what I mean for educational purpose.

00:44:13 It seems to be a fun thing to explore

00:44:15 neural networks with.

00:44:16 But I think from your perspective,

00:44:19 and I think that might be

00:44:20 the best perspective is

00:44:22 if you’re trying to educate

00:44:23 with a simple example

00:44:24 in order to illustrate

00:44:25 how this can actually be grown

00:44:27 to scale and have a real world impact,

00:44:31 then perhaps focusing on the fundamentals

00:44:33 of supervised learning

00:44:35 in the context of a simple data set,

00:44:38 even like an MNIST data set

00:44:40 is the right way,

00:44:42 is the right path to take.

00:44:45 The amount of fun I’ve seen people

00:44:46 have with reinforcement learning

00:44:47 has been great,

00:44:48 but not in the applied impact

00:44:51 in the real world setting.

00:44:52 So it’s a trade off,

00:44:54 how much impact you want to have

00:44:55 versus how much fun you want to have.

00:44:56 Yeah, that’s really cool.

00:44:58 And I feel like the world

00:44:59 actually needs all sorts.

00:45:01 Even within machine learning,

00:45:02 I feel like deep learning

00:45:04 is so exciting,

00:45:05 but the AI team

00:45:07 shouldn’t just use deep learning.

00:45:08 I find that my teams

00:45:09 use a portfolio of tools.

00:45:11 And maybe that’s not the exciting thing

00:45:13 to say, but some days

00:45:14 we use a neural net,

00:45:15 some days we use a PCA.

00:45:19 Actually, the other day,

00:45:20 I was sitting down with my team

00:45:21 looking at PCA residuals,

00:45:22 trying to figure out what’s going on

00:45:23 with PCA applied

00:45:24 to a manufacturing problem.
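For context, looking at PCA residuals usually means something like the following illustrative sketch, not the actual analysis: reconstruct each example from the top principal components and inspect the examples the low-dimensional model explains worst.

```python
# Sketch: PCA reconstruction residuals as a simple check on manufacturing data.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(500, 30)                 # stand-in for sensor readings per part
pca = PCA(n_components=5).fit(X)
X_hat = pca.inverse_transform(pca.transform(X))
residuals = np.linalg.norm(X - X_hat, axis=1)   # reconstruction error per part

worst = np.argsort(residuals)[-5:]          # the parts explained worst by the model
print("largest residuals at indices:", worst, residuals[worst].round(3))
```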

00:45:25 And some days we use

00:45:26 a probabilistic graphical model,

00:45:28 some days we use a knowledge graph,

00:45:29 which is one of the things

00:45:30 that has tremendous industry impact.

00:45:33 But the amount of chatter

00:45:34 about knowledge graphs in academia

00:45:36 is really thin compared

00:45:37 to the actual real world impact.

00:45:39 So I think reinforcement learning

00:45:41 should be in that portfolio.

00:45:42 And then it’s about balancing

00:45:43 how much we teach all of these things.

00:45:45 And the world should have

00:45:47 diverse skills.

00:45:47 It’d be sad if everyone

00:45:49 just learned one narrow thing.

00:45:51 Yeah, the diverse skills

00:45:52 help you discover the right tool

00:45:53 for the job.

00:45:54 What is the most beautiful,

00:45:56 surprising or inspiring idea

00:45:59 in deep learning to you?

00:46:00 Something that captivated

00:46:03 your imagination.

00:46:04 Is it the scale that could be,

00:46:07 the performance that could be

00:46:07 achieved with scale?

00:46:08 Or is there other ideas?

00:46:11 I think that if my only job

00:46:14 was being an academic researcher,

00:46:18 with an unlimited budget

00:46:18 and didn’t have to worry

00:46:19 about short term impact

00:46:21 and only focus on long term impact,

00:46:23 I’d probably spend all my time

00:46:24 doing research on unsupervised learning.

00:46:27 I still think unsupervised learning

00:46:28 is a beautiful idea.

00:46:31 At both this past NeurIPS and ICML,

00:46:34 I was attending workshops

00:46:35 or listening to various talks

00:46:37 about self supervised learning,

00:46:39 which is one vertical segment

00:46:41 maybe of unsupervised learning

00:46:43 that I’m excited about.

00:46:45 Maybe just to summarize the idea,

00:47:46 I guess you know the idea,

00:47:47 or should I describe it briefly?

00:46:48 No, please.

00:46:49 So here’s the example

00:46:49 of self supervised learning.

00:46:52 Let’s say we grab a lot

00:46:53 of unlabeled images off the internet.

00:46:55 So with infinite amounts

00:46:56 of this type of data,

00:46:58 I’m going to take each image

00:46:59 and rotate it by a random

00:47:01 multiple of 90 degrees.

00:47:03 And then I’m going to train

00:47:04 a supervised neural network

00:47:06 to predict what was

00:47:07 the original orientation.

00:47:08 So was it rotated 90 degrees,

00:47:10 180 degrees, 270 degrees,

00:47:12 or zero degrees.

00:47:14 So you can generate

00:47:15 an infinite amount of labeled data

00:47:17 because you rotated the image

00:47:18 so you know what’s the

00:47:19 ground truth label.

00:47:20 And so various researchers

00:47:23 have found that by taking

00:47:24 unlabeled data and making

00:47:26 up labeled data sets

00:47:27 and training a large neural network

00:47:29 on these tasks,

00:47:30 you can then take the hidden

00:47:32 layer representation and transfer

00:47:34 it to a different task

00:47:35 very powerfully.
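
A minimal sketch of the rotation pretext task just described, assuming PyTorch and synthetic stand-in images; the tiny CNN, image size, and training loop are illustrative assumptions, not any particular paper's setup. Rotate each unlabeled image by a random multiple of 90 degrees, train a network to predict which rotation was applied, and the hidden features are what you would later transfer.

```python
# Sketch of self-supervised rotation prediction on synthetic stand-in images.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_rotation_batch(images):
    """Rotate each image by a random multiple of 90 degrees; the multiple is the label."""
    labels = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, k=int(k), dims=(1, 2))
                           for img, k in zip(images, labels)])
    return rotated, labels

class RotNet(nn.Module):
    """Tiny CNN; self.features is the representation you would transfer later."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 4)   # 4 classes: 0, 90, 180, 270 degrees

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.head(h)

model = RotNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
unlabeled = torch.rand(64, 1, 32, 32)   # stand-in for unlabeled images off the internet

for step in range(10):                  # a few illustrative training steps
    x, y = make_rotation_batch(unlabeled)
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```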

00:47:37 Learning word embeddings

00:47:39 where we take a sentence,

00:47:40 delete a word,

00:47:40 predict the missing word,

00:47:42 which is how we learn,

00:47:43 or one of the ways we learn,

00:47:44 word embeddings

00:47:45 is another example.
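
A minimal sketch of the delete-a-word idea, assuming PyTorch and a toy corpus; the window size, embedding dimension, and corpus are all made up. Average the embeddings of the surrounding words and train them to predict the deleted word, in a CBOW-style setup.

```python
# Sketch of learning word embeddings by deleting a word and predicting it from neighbors.
import torch
import torch.nn as nn
import torch.nn.functional as F

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
window = 2

# Build (context words, deleted word) training pairs.
pairs = []
for i in range(window, len(corpus) - window):
    context = corpus[i - window:i] + corpus[i + 1:i + 1 + window]
    pairs.append(([idx[w] for w in context], idx[corpus[i]]))

emb = nn.Embedding(len(vocab), 16)     # the embeddings we actually want to keep
out = nn.Linear(16, len(vocab))        # predicts the deleted word
opt = torch.optim.Adam(list(emb.parameters()) + list(out.parameters()), lr=0.05)

for epoch in range(50):
    for context, target in pairs:
        h = emb(torch.tensor(context)).mean(dim=0)          # average context embeddings
        loss = F.cross_entropy(out(h).unsqueeze(0), torch.tensor([target]))
        opt.zero_grad()
        loss.backward()
        opt.step()
```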

00:47:47 And I think there’s now

00:47:48 this portfolio of techniques

00:47:50 for generating these made up tasks.

00:47:53 Another one called jigsaw

00:47:54 would be if you take an image,

00:47:56 cut it up into a three by three grid,

00:47:59 so like a nine piece,

00:48:00 three by three puzzle,

00:48:01 jumble up the nine pieces

00:48:02 and have a neural network predict

00:48:04 which of the nine factorial

00:48:06 possible permutations

00:48:07 it came from.
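
A minimal sketch of generating jigsaw training examples, assuming NumPy and a stand-in image; the grid size and the size of the permutation set are illustrative (in practice a fixed subset of permutations is typically used rather than all nine factorial of them). Cut the image into a three-by-three grid, shuffle the tiles with a known permutation from that subset, and use the permutation's index as the label a network would be trained to predict.

```python
# Sketch of jigsaw pretext-task data generation (no network training shown here).
import random
import numpy as np

def make_jigsaw_example(image, permutations):
    """image: (H, W) array with H and W divisible by 3."""
    h, w = image.shape[0] // 3, image.shape[1] // 3
    tiles = [image[r*h:(r+1)*h, c*w:(c+1)*w] for r in range(3) for c in range(3)]
    label = random.randrange(len(permutations))          # which permutation was applied
    shuffled = [tiles[i] for i in permutations[label]]
    return np.stack(shuffled), label                      # network input: 9 shuffled tiles

# Use a small fixed set of distinct permutations as the class set.
rng = random.Random(0)
permutations = []
while len(permutations) < 100:
    p = tuple(rng.sample(range(9), 9))
    if p not in permutations:
        permutations.append(p)

image = np.random.rand(96, 96)                            # stand-in for an unlabeled image
tiles, label = make_jigsaw_example(image, permutations)
print(tiles.shape, "permutation class:", label)
```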

00:48:09 So many groups,

00:48:11 including OpenAI,

00:48:13 Pieter Abbeel has been doing

00:48:14 some work on this too,

00:48:16 Facebook, Google Brain,

00:48:18 I think DeepMind,

00:48:19 oh actually,

00:48:21 Aaron van den Oord

00:48:22 has great work on the CPC objective.

00:48:24 So many teams are doing exciting work

00:48:26 and I think this is a way

00:48:27 to generate infinite labeled data

00:48:30 and I find this a very exciting

00:48:32 piece of unsupervised learning.

00:48:34 So long term you think

00:48:35 that’s going to unlock

00:48:37 a lot of power

00:48:38 in machine learning systems

00:48:39 is this kind of unsupervised learning.

00:48:42 I don’t think there’s

00:48:43 a whole enchilada,

00:48:43 I think it’s just a piece of it

00:48:45 and I think this one piece

00:48:46 unsupervised,

00:48:47 self supervised learning

00:48:48 is starting to get traction.

00:48:50 We’re very close

00:48:51 to it being useful.

00:48:53 Well, word embedding

00:48:54 is really useful.

00:48:55 I think we’re getting

00:48:56 closer and closer

00:48:57 to just having a significant

00:48:59 real world impact

00:49:00 maybe in computer vision and video

00:49:03 but I think this concept

00:49:05 and I think there’ll be

00:49:05 other concepts around it.

00:49:07 You know, other unsupervised

00:49:08 learning things that I worked on

00:49:10 I’ve been excited about.

00:49:12 I was really excited

00:49:12 about sparse coding

00:49:14 and ICA,

00:49:16 slow feature analysis.

00:49:17 I think all of these are ideas

00:49:18 that various of us

00:49:20 were working on

00:49:20 about a decade ago

00:49:21 before we all got distracted

00:49:23 by how well supervised

00:49:24 learning was doing.

00:49:26 So we should return

00:49:27 to the fundamentals

00:49:29 of representation learning

00:49:30 that really started

00:49:32 this movement of deep learning.

00:49:33 I think there’s a lot more work

00:49:34 that one could explore around

00:49:36 this theme of ideas

00:49:37 and other ideas

00:49:38 to come up with better algorithms.

00:49:40 So if we could return

00:49:42 to maybe talk quickly

00:49:43 about the specifics

00:49:45 of deep learning.ai

00:49:46 the deep learning specialization

00:49:48 perhaps how long does it take

00:49:50 to complete the course

00:49:51 would you say?

00:49:52 The official length

00:49:53 of the deep learning specialization

00:49:55 is I think 16 weeks

00:49:57 so about four months

00:49:58 but it’s go at your own pace.

00:50:00 So if you subscribe

00:50:01 to the deep learning specialization

00:50:03 there are people that finished it

00:50:04 in less than a month

00:50:05 by working more intensely

00:50:07 and studying more intensely

00:50:07 so it really depends

00:50:09 on the individual.

00:50:10 When we created

00:50:11 the deep learning specialization

00:50:13 we wanted to make it

00:50:15 very accessible

00:50:16 and very affordable.

00:50:18 And with you know

00:50:19 Coursera and deep learning.ai's

00:50:20 education mission

00:50:21 one of the things

00:50:22 that’s really important to me

00:50:23 is that if there’s someone

00:50:25 for whom paying anything

00:50:27 is a financial hardship

00:50:29 then just apply for financial aid

00:50:30 and get it for free.

00:50:34 If you were to recommend

00:50:35 a daily schedule for people

00:50:38 in learning whether it’s

00:50:39 through the deep learning.ai

00:50:40 specialization or just learning

00:50:42 in the world of deep learning

00:50:43 what would you recommend?

00:50:45 How do they go about day to day

00:50:47 sort of specific advice

00:50:48 about learning

00:50:49 about their journey in the world

00:50:51 of deep learning machine learning?

00:50:53 I think getting the habit of learning

00:50:56 is key and that means regularity.

00:51:00 So for example

00:51:02 we send out a weekly newsletter

00:51:05 The Batch, every Wednesday

00:51:06 so people know it’s coming Wednesday

00:51:08 you can spend a little bit of time

00:51:09 on Wednesday

00:51:10 catching up on the latest news

00:51:13 through The Batch on Wednesday

00:51:17 and for myself

00:51:18 I’ve picked up a habit of spending

00:51:21 some time every Saturday

00:51:22 and every Sunday reading or studying

00:51:24 and so I don’t wake up on the Saturday

00:51:26 and have to make a decision

00:51:27 do I feel like reading

00:51:28 or studying today or not

00:51:30 it’s just what I do

00:51:31 and the fact is a habit

00:51:33 makes it easier.

00:51:34 So I think if someone can get into that habit

00:51:37 it’s like you know

00:51:38 just like we brush our teeth every morning

00:51:41 I don’t think about it

00:51:42 if I thought about it

00:51:42 it’s a little bit annoying

00:51:43 to have to spend two minutes doing that

00:51:45 but it’s a habit that it takes

00:51:47 no cognitive load

00:51:49 but this would be so much harder

00:51:50 if we have to make a decision every morning

00:51:53 and actually that’s the reason

00:51:54 why I wear the same thing every day as well

00:51:56 it’s just one less decision

00:51:57 I just get up and wear my blue shirt

00:51:59 so but I think if you can get that habit

00:52:01 that consistency of studying

00:52:02 then it actually feels easier.

00:52:05 So yeah it’s kind of amazing

00:52:08 in my own life

00:52:09 like I play guitar every day for

00:52:12 I force myself to at least for five minutes

00:52:14 play guitar

00:52:15 it’s just it’s a ridiculously short period of time

00:52:18 but because I’ve gotten into that habit

00:52:20 it’s incredible what you can accomplish

00:52:21 in a period of a year or two years

00:52:24 you can become

00:52:26 you know exceptionally good

00:52:28 at certain aspects of a thing

00:52:29 by just doing it every day

00:52:30 for a very short period of time

00:52:32 it’s kind of a miracle

00:52:33 that that’s how it works

00:52:34 it adds up over time.

00:52:36 Yeah and I think this is often

00:52:38 not about the short bursts of effort

00:52:40 and the all nighters

00:52:41 because you could only do that

00:52:43 a limited number of times

00:52:44 it’s the sustained effort over a long time

00:52:47 I think you know reading two research papers

00:52:50 is a nice thing to do

00:52:51 but the power is not reading two research papers

00:52:54 it’s reading two research papers a week

00:52:56 for a year

00:52:57 then you read a hundred papers

00:52:58 and you actually learn a lot

00:53:00 when you read a hundred papers.

00:53:02 So regularity and making learning a habit

00:53:05 do you have general other study tips

00:53:09 for particularly deep learning

00:53:11 that people should

00:53:13 in their process of learning

00:53:15 is there some kind of recommendations

00:53:16 or tips you have as they learn?

00:53:19 One thing I still do

00:53:21 when I’m trying to study something really deeply

00:53:23 is take handwritten notes

00:53:25 it varies

00:53:26 I know there are a lot of people

00:53:27 that take the deep learning courses

00:53:29 during a commute or something

00:53:31 where it may be more awkward to take notes

00:53:33 so I know it may not work for everyone

00:53:36 but when I’m taking courses on Coursera

00:53:39 and I still take some every now and then

00:53:41 the most recent one I took

00:53:42 was a course on clinical trials

00:53:44 because I was interested in that

00:53:45 I got out my little Moleskine notebook

00:53:47 and while I was sitting at my desk

00:53:48 I was just taking down notes

00:53:50 on what the instructor was saying

00:53:51 and we know that

00:53:53 that act of taking notes

00:53:54 preferably handwritten notes

00:53:57 increases retention.

00:53:59 So as you’re sort of watching the video

00:54:01 just kind of pausing maybe

00:54:03 and then taking the basic insights down on paper.

00:54:07 Yeah so there have been a few studies

00:54:09 if you search online

00:54:11 you find some of these studies

00:54:12 that taking handwritten notes

00:54:15 because handwriting is slower

00:54:16 as we’re saying just now

00:54:18 it causes you to recode the knowledge

00:54:21 in your own words more

00:54:23 and that process of recoding

00:54:24 promotes long term retention

00:54:26 this is as opposed to typing

00:54:28 which is fine

00:54:28 again typing is better than nothing

00:54:30 and taking a class

00:54:31 without taking notes is better

00:54:32 than not taking any class at all

00:54:34 but comparing handwritten notes

00:54:36 and typing

00:54:37 for a lot of people

00:54:39 you can usually type faster

00:54:40 than you can handwrite notes

00:54:41 and so when people type

00:54:42 they’re more likely to just transcribe

00:54:44 verbatim what they heard

00:54:46 and that reduces the amount of recoding

00:54:49 and that actually results

00:54:50 in less long term retention.

00:54:52 I don’t know what the psychological effect

00:54:53 there is but so true

00:54:55 there’s something fundamentally different

00:54:56 about handwriting

00:54:59 I wonder what that is

00:55:00 I wonder if it is as simple

00:55:01 as just the time it takes to write, that it's slower

00:55:04 yeah and because you can’t write

00:55:07 as many words

00:55:08 you have to take whatever they said

00:55:10 and summarize it into fewer words

00:55:11 and that summarization process

00:55:13 requires deeper processing of the meaning

00:55:15 which then results in better retention

00:55:17 that’s fascinating

00:55:20 oh and I think because of Coursera

00:55:22 I spent so much time studying pedagogy

00:55:24 this is actually one of my passions

00:55:25 I really love learning

00:55:27 how to more efficiently

00:55:28 help others learn

00:55:28 you know one of the things I do

00:55:30 both when creating videos

00:55:32 or when we write The Batch, is

00:55:34 I try to think, is one minute spent with us

00:55:37 going to be a more efficient learning experience

00:55:40 than one minute spent anywhere else

00:55:42 and we really try to you know

00:55:45 make it time efficient for the learners

00:55:46 because you know everyone’s busy

00:55:48 so when we're editing

00:55:50 I often tell my teams

00:55:51 every word needs to fight for its life

00:55:53 and if you can delete a word

00:55:54 let’s just delete it and not wait

00:55:56 let’s not waste the learning time

00:55:57 let’s not waste the learning time

00:55:59 oh that’s so it’s so amazing

00:56:01 that you think that way

00:56:02 because there are millions of people

00:56:03 that are impacted by your teaching

00:56:04 and sort of that one minute spent

00:56:06 has a ripple effect right

00:56:08 through years of time

00:56:09 which is it’s just fascinating to think about

00:56:12 how does one make a career

00:56:14 out of an interest in deep learning

00:56:15 do you have advice for people

00:56:18 we just talked about

00:56:19 sort of the beginning early steps

00:56:21 but if you want to make it

00:56:22 an entire life’s journey

00:56:24 or at least a journey of a decade or two

00:56:26 how do you how do you do it

00:56:28 so the most important thing is to get started

00:56:30 right, and I think in the early parts

00:56:34 of a career, coursework,

00:56:35 like the deep learning specialization,

00:56:38 is a very efficient way

00:56:41 to master this material

00:56:43 so because you know instructors

00:56:46 uh be it me or someone else

00:56:48 or you know Laurence Moroney, who

00:56:49 teaches our TensorFlow specialization

00:56:51 or other things we’re working on

00:56:52 spend effort to try to make it time efficient

00:56:55 for you to learn a new concept

00:56:57 so coursework is actually a very efficient way

00:57:00 for people to learn concepts

00:57:02 and the beginning parts of breaking

00:57:04 into a new field

00:57:05 in fact one thing I see at Stanford

00:57:08 some of my PhD students want to jump

00:57:10 in the research right away

00:57:11 and I actually tend to say look

00:57:13 in your first couple years of PhD

00:57:14 spend time taking courses

00:57:16 because it lays a foundation

00:57:17 it’s fine if you’re less productive

00:57:19 in your first couple years

00:57:20 you’ll be better off in the long term

00:57:23 beyond a certain point

00:57:24 there’s materials that doesn’t exist in courses

00:57:27 because it’s too cutting edge

00:57:28 the course hasn’t been created yet

00:57:30 there’s some practical experience

00:57:31 that we’re not yet that good

00:57:32 at teaching in a course

00:57:34 and I think after exhausting

00:57:36 the efficient coursework

00:57:37 then most people need to go on

00:57:40 to either ideally work on projects

00:57:44 and then maybe also continue their learning

00:57:47 by reading blog posts and research papers

00:57:49 and things like that

00:57:50 doing projects is really important

00:57:52 and again I think it’s important

00:57:55 to start small and just do something

00:57:57 today you read about deep learning

00:57:58 feels like oh all these people

00:57:59 doing such exciting things

00:58:01 what if I’m not building a neural network

00:58:02 that changes the world

00:58:03 then what’s the point?

00:58:04 Well the point is sometimes building

00:58:06 that tiny neural network

00:58:07 you know be it MNIST or upgrade

00:58:10 to Fashion MNIST or whatever

00:58:12 so doing your own fun hobby project

00:58:14 that’s how you gain the skills

00:58:15 to let you do bigger and bigger projects

00:58:18 I find this to be true at the individual level

00:58:20 and also at the organizational level

00:58:23 for a company to become good at machine learning

00:58:24 sometimes the right thing to do

00:58:26 is not to tackle the giant project

00:58:29 but instead to do the small project

00:58:31 that lets the organization learn

00:58:33 and then build out from there

00:58:34 but this is true both for individuals

00:58:35 and for companies

00:58:38 taking the first step

00:58:40 and then taking small steps is the key

00:58:44 should students pursue a PhD

00:58:46 do you think? You can do so much

00:58:48 that’s one of the fascinating things

00:58:50 in machine learning

00:58:51 you can have so much impact

00:58:52 without ever getting a PhD

00:58:54 so what are your thoughts

00:58:56 should people go to grad school

00:58:57 should people get a PhD?

00:58:59 I think that there are multiple good options

00:59:01 of which doing a PhD could be one of them

00:59:05 I think that if someone’s admitted

00:59:06 to a top PhD program

00:59:08 you know at MIT, Stanford, top schools

00:59:11 I think that’s a very good experience

00:59:15 or if someone gets a job

00:59:17 at a top organization

00:59:18 at the top AI team

00:59:20 I think that’s also a very good experience

00:59:23 there are some things you still need a PhD to do

00:59:25 if someone’s aspiration is to be a professor

00:59:27 you know at the top academic university

00:59:29 you just need a PhD to do that

00:59:30 but if your goal is to, you know,

00:59:32 start a company, build a company

00:59:34 do great technical work

00:59:35 I think a PhD is a good experience

00:59:37 but I would look at the different options

00:59:40 available to someone

00:59:41 you know where are the places

00:59:42 where you can get a job

00:59:42 where are the places to get a PhD program

00:59:44 and kind of weigh the pros and cons of those

00:59:46 So just to linger on that for a little bit longer

00:59:50 what final dreams and goals

00:59:51 do you think people should have

00:59:53 so what options should they explore

00:59:57 so you can work in industry

00:59:59 so for a large company

01:00:01 like Google, Facebook, Baidu

01:00:03 all these large sort of companies

01:00:06 that already have huge teams

01:00:07 of machine learning engineers

01:00:09 you can also do, within industry,

01:00:10 sort of more research groups,

01:00:12 kind of like Google Research, Google Brain

01:00:14 then you can also do

01:00:16 like we said a professor in academia

01:00:20 and what else

01:00:21 oh you can build your own company

01:00:23 you can do a startup

01:00:25 is there anything that stands out

01:00:27 between those options

01:00:28 or are they all beautiful different journeys

01:00:30 that people should consider

01:00:32 I think the thing that affects your experience more

01:00:34 is less are you in this company

01:00:36 versus that company

01:00:38 or academia versus industry

01:00:40 I think the thing that affects your experience most

01:00:41 is who are the people you’re interacting with

01:00:43 on a daily basis

01:00:45 so even if you look at some of the large companies

01:00:49 the experience of individuals

01:00:50 in different teams is very different

01:00:52 and what matters most is not the logo above the door

01:00:56 when you walk into the giant building every day

01:00:58 what matters the most is who are the 10 people

01:01:00 who are the 30 people you interact with every day

01:01:03 so I actually tend to advise people

01:01:04 if you get a job from a company

01:01:07 ask who is your manager

01:01:09 who are your peers

01:01:10 who are you actually going to talk to

01:01:11 we’re all social creatures

01:01:12 we tend to become more like the people around us

01:01:15 and if you’re working with great people

01:01:17 you will learn faster

01:01:19 or if you get admitted

01:01:20 if you get a job at a great company

01:01:23 or a great university

01:01:24 maybe the logo you walk in is great

01:01:26 but you’re actually stuck on some team

01:01:28 doing work that doesn't really excite you

01:01:31 and then that’s actually a really bad experience

01:01:33 so this is true both for universities

01:01:36 and for large companies

01:01:37 for small companies you can kind of figure out

01:01:39 who you’ll be working with quite quickly

01:01:41 and I tend to advise people

01:01:43 if a company refuses to tell you

01:01:45 who you will work with

01:01:46 and someone says, oh, join us,

01:01:47 the rotation system will figure it out

01:01:48 I think that that’s a worrying answer

01:01:51 because it means you may not

01:01:54 actually get sent to a team

01:01:57 with great peers and great people to work with

01:02:00 it’s actually a really profound advice

01:02:01 that we kind of sometimes sweep aside,

01:02:04 that we don't consider too rigorously or carefully

01:02:07 the people around you are really often

01:02:10 especially when you accomplish great things

01:02:13 it seems the great things are accomplished

01:02:14 because of the people around you

01:02:16 so that’s a it’s not about the the

01:02:20 where whether you learn this thing

01:02:21 or that thing or like you said

01:02:23 the logo that hangs up top

01:02:25 it’s the people that’s a fascinating

01:02:27 and it’s such a hard search process

01:02:30 of finding just like finding the right friends

01:02:34 and somebody to get married with

01:02:36 and that kind of thing

01:02:37 it’s a very hard search

01:02:38 it’s a people search problem

01:02:40 yeah but I think when someone interviews

01:02:43 you know at a university

01:02:44 or the research lab or the large corporation

01:02:47 it’s good to insist on just asking

01:02:49 who are the people

01:02:50 who is my manager

01:02:51 and if you refuse to tell me

01:02:52 I’m gonna think well maybe that’s

01:02:54 because you don’t have a good answer

01:02:55 it may not be someone I like

01:02:57 and if you don’t particularly connect

01:02:59 if something feels off with the people

01:03:02 then don’t stick to it

01:03:05 you know that’s a really important signal to consider

01:03:08 yeah, yeah, and actually

01:03:11 in my Stanford class CS230

01:03:13 as well as an ACM talk

01:03:14 I think I gave like an hour long talk

01:03:16 on career advice

01:03:18 including on the job search process

01:03:20 and then some of these

01:03:20 so you can find those videos online

01:03:23 awesome and I’ll point them

01:03:25 I’ll point people to them

01:03:26 beautiful

01:03:28 so the AI fund helps AI startups

01:03:32 get off the ground

01:03:33 or perhaps you can elaborate

01:03:34 on all the fun things it’s involved with

01:03:36 what’s your advice

01:03:37 and how does one build a successful AI startup

01:03:41 you know in Silicon Valley

01:03:43 a lot of startup failures

01:03:44 come from building products

01:03:46 that no one wanted

01:03:48 so, you know, it's cool technology,

01:03:51 but who's going to use it?

01:03:53 so I think I tend to be very outcome driven

01:03:57 and customer obsessed

01:04:00 ultimately we don’t get to vote

01:04:02 if we succeed or fail

01:04:04 it’s only the customer

01:04:05 that they’re the only one

01:04:06 that gets a thumbs up or thumbs down vote

01:04:08 in the long term

01:04:09 in the short term

01:04:10 you know there are various people

01:04:12 that get various votes

01:04:13 but in the long term

01:04:14 that’s what really matters

01:04:16 so as you build the startup

01:04:17 you have to constantly ask the question

01:04:20 will the customer give a thumbs up on this

01:04:24 I think so

01:04:24 I think startups that are very customer focused

01:04:27 customer obsessed

01:04:28 deeply understand the customer

01:04:30 and are oriented to serve the customer

01:04:34 are more likely to succeed

01:04:36 with the proviso that

01:04:37 I think all of us should only do things

01:04:38 that we think create social good

01:04:40 and move the world forward

01:04:41 so I personally don’t want to build

01:04:44 addictive digital products

01:04:45 just to sell a lot of ads

01:04:47 or you know there are things

01:04:48 that could be lucrative

01:04:49 that I won’t do

01:04:51 but if we can find ways to serve people

01:04:53 in meaningful ways

01:04:55 I think those can be

01:04:57 great things to do

01:04:58 either in the academic setting

01:05:00 or in a corporate setting

01:05:01 or a startup setting

01:05:02 so can you give me the idea

01:05:04 of why you started the AI fund

01:05:08 I remember when I was leading

01:05:10 the AI group at Baidu

01:05:13 I had two jobs

01:05:14 two parts of my job

01:05:15 one was to build an AI engine

01:05:17 to support the existing businesses

01:05:19 and that was running,

01:05:21 just performing by itself

01:05:23 there was a second part of my job at the time

01:05:24 which was to try to systematically initiate

01:05:27 new lines of businesses

01:05:28 using the company’s AI capabilities

01:05:31 so you know the self driving car team

01:05:33 came out of my group

01:05:34 the smart speaker team

01:05:37 similar to what is Amazon Echo Alexa in the US

01:05:40 but we actually announced it

01:05:41 before Amazon did

01:05:42 so Baidu wasn’t following Amazon

01:05:47 that came out of my group

01:05:48 and I found that to be

01:05:50 actually the most fun part of my job

01:05:53 so what I wanted to do was

01:05:55 to build AI fund as a startup studio

01:05:58 to systematically create new startups

01:06:01 from scratch

01:06:02 with all the things we can now do with AI

01:06:04 I think the ability to build new teams

01:06:07 to go after this rich space of opportunities

01:06:09 is a very important way

01:06:11 a very important mechanism

01:06:13 to get these projects done

01:06:14 that I think will move the world forward

01:06:16 so I’ve been fortunate to build a few teams

01:06:19 that had a meaningful positive impact

01:06:21 and I felt that we might be able to do this

01:06:25 in a more systematic repeatable way

01:06:27 so a startup studio is a relatively new concept

01:06:31 there are maybe dozens of startup studios

01:06:34 you know right now

01:06:35 but I feel like all of us

01:06:38 many teams are still trying to figure out

01:06:40 how do you systematically build companies

01:06:43 with a high success rate

01:06:45 so I think even a lot of my you know

01:06:47 venture capital friends

01:06:49 seem to be more and more building companies

01:06:51 rather than investing in companies

01:06:53 but I find it a fascinating thing to do

01:06:55 to figure out the mechanisms

01:06:56 by which we could systematically build

01:06:58 successful teams, successful businesses

01:07:01 in areas that we find meaningful

01:07:03 so a startup studio

01:07:05 is a place and a mechanism

01:07:08 for startups to go from zero to success

01:07:11 to try to develop a blueprint

01:07:13 it’s actually a place for us

01:07:14 to build startups from scratch

01:07:16 so we often bring in founders

01:07:19 and work with them

01:07:21 or maybe even have existing ideas

01:07:23 that we match founders with

01:07:26 and then this launches

01:07:27 you know hopefully into successful companies

01:07:30 so how close are you to figuring out

01:07:34 a way to automate the process

01:07:36 of starting from scratch

01:07:38 and building a successful AI startup

01:07:40 yeah I think we’ve been constantly

01:07:43 improving and iterating on our processes

01:07:46 how we do that

01:07:47 so things like you know

01:07:48 how many customer calls do we need to make

01:07:50 in order to get customer validation

01:07:52 how do we make sure this technology

01:07:54 can be built

01:07:54 quite a lot of our businesses

01:07:56 need cutting edge machine learning algorithms

01:07:58 so you know, the kind of algorithms

01:07:59 that were developed in the last one or two years

01:08:01 and even if it works in a research paper

01:08:04 it turns out taking it to production

01:08:05 is really hard

01:08:06 there are a lot of issues

01:08:07 for making these things work in real life

01:08:10 that are not widely addressed in academia

01:08:13 so how do we validate

01:08:14 that this is actually doable

01:08:15 how do you build a team

01:08:17 get the specialized domain knowledge

01:08:18 be it in education or health care

01:08:20 whatever sector we’re focusing on

01:08:21 so I think we've actually

01:08:23 been getting much better

01:08:24 at giving the entrepreneurs

01:08:27 a high success rate

01:08:29 but I think we’re still

01:08:31 I think the whole world is still

01:08:32 in the early phases of figuring this out

01:08:34 but do you think there is some aspects

01:08:36 of that process that are transferable

01:08:38 from one startup to another

01:08:40 to another to another

01:08:41 yeah very much so

01:08:43 you know starting from scratch

01:08:45 you know starting a company

01:08:46 to most entrepreneurs

01:08:47 is a really lonely thing

01:08:50 and I’ve seen so many entrepreneurs

01:08:53 not know how to make certain decisions

01:08:56 like, how do you do

01:08:58 B2B sales, right?

01:09:00 if you don’t know that

01:09:00 it’s really hard

01:09:02 or how do you market this efficiently

01:09:05 other than you know buying ads

01:09:06 which is really expensive

01:09:08 are there more efficient tactics for that

01:09:10 or for a machine learning project

01:09:12 you know basic decisions

01:09:14 can change the course of

01:09:15 whether machine learning product works or not

01:09:18 and so there are so many hundreds of decisions

01:09:20 that entrepreneurs need to make

01:09:22 and making a mistake

01:09:24 and a couple key decisions

01:09:25 can have a huge impact

01:09:28 on the fate of the company

01:09:30 so I think a startup studio

01:09:31 provides a support structure

01:09:32 that makes starting a company

01:09:34 much less of a lonely experience

01:09:36 and also when facing with these key decisions

01:09:39 like trying to hire your first

01:09:42 uh the VP of engineering

01:09:44 what’s a good selection criteria

01:09:46 how do you solve

01:09:46 should I hire this person or not

01:09:48 by having an ecosystem

01:09:51 around the entrepreneurs

01:09:52 the founders to help

01:09:54 I think we help them at the key moments

01:09:57 and hopefully significantly

01:09:59 make the process more enjoyable

01:10:00 and give them a higher success rate

01:10:02 so there’s somebody to brainstorm with

01:10:04 in these very difficult decision points

01:10:07 and also to help them recognize

01:10:10 what they may not even realize

01:10:12 is a key decision point

01:10:14 that’s that’s the first

01:10:15 and probably the most important part

01:10:17 yeah actually I can say one other thing

01:10:19 um you know I think

01:10:22 building companies is one thing

01:10:23 but I feel like it’s really important

01:10:26 that we build companies

01:10:28 that move the world forward

01:10:29 for example within the AI Fund team

01:10:32 there was once an idea

01:10:33 for a new company

01:10:35 that if it had succeeded

01:10:37 would have resulted in people

01:10:38 watching a lot more videos

01:10:40 in a certain narrow vertical type of video

01:10:42 um I looked at it

01:10:43 the business case was fine

01:10:45 the revenue case was fine

01:10:46 but I looked and just said

01:10:48 I don’t want to do this

01:10:49 like you know I don’t actually

01:10:50 just want to have a lot more people

01:10:52 watch this type of video

01:10:53 wasn’t educational

01:10:54 it’s an educational baby

01:10:56 and so and so I I I I code the idea

01:10:59 on the basis that I didn’t think

01:11:00 it would actually help people

01:11:01 so um whether building companies

01:11:04 or working in enterprises

01:11:05 or doing personal projects

01:11:06 I think um it’s up to each of us

01:11:10 to figure out what’s the difference

01:11:11 we want to make in the world

01:11:13 With landing AI

01:11:15 you help already established companies

01:11:17 grow their AI and machine learning efforts

01:11:20 how does a large company

01:11:21 integrate machine learning

01:11:22 into their efforts?

01:11:25 AI is a general purpose technology

01:11:27 and I think it will transform every industry

01:11:30 our community has already transformed

01:11:32 to a large extent

01:11:33 the software internet sector

01:11:35 most software internet companies

01:11:36 outside the top right

01:11:38 five or six or three or four

01:11:39 already have reasonable

01:11:41 machine learning capabilities

01:11:43 or are getting there,

01:11:44 there's still room for improvement

01:11:46 but when I look outside

01:11:47 the software internet sector

01:11:49 everything from manufacturing

01:11:50 agriculture, healthcare

01:11:52 logistics transportation

01:11:53 there’s so many opportunities

01:11:55 that very few people are working on

01:11:57 so I think the next wave of AI

01:11:59 is for us to also transform

01:12:01 all of those other industries

01:12:03 there was a McKinsey study

01:12:04 estimating 13 trillion dollars

01:12:06 of global economic growth

01:12:09 US GDP is 19 trillion dollars

01:12:11 so 13 trillion is a big number

01:12:13 or PwC estimates 16 trillion dollars

01:12:16 so whatever number is is large

01:12:18 but the interesting thing to me

01:12:19 was a lot of that impact

01:12:20 will be outside

01:12:21 the software internet sector

01:12:23 so we need more teams

01:12:25 to work with these companies

01:12:27 to help them adopt AI

01:12:29 and I think this is one thing

01:12:30 that will, you know,

01:12:31 help drive global economic growth

01:12:33 and make humanity more powerful

01:12:35 and like you said the impact is there

01:12:37 so what are the best industries

01:12:39 the biggest industries

01:12:40 where AI can help

01:12:41 perhaps outside the software tech sector

01:12:44 frankly I think it’s all of them

01:12:47 some of the ones I’m spending a lot of time on

01:12:49 are manufacturing, agriculture,

01:12:52 and looking into healthcare

01:12:54 for example in manufacturing

01:12:56 we do a lot of work in visual inspection

01:12:58 where today there are people standing around

01:13:01 using the human eye

01:13:02 to check if you know

01:13:03 this plastic part or the smartphone

01:13:05 or this thing has a scratch

01:13:07 or a dent or something in it

01:13:09 we can use a camera to take a picture

01:13:12 use an algorithm,

01:13:14 deep learning and other things

01:13:15 to check if it’s defective or not

01:13:17 and thus help factories improve yield

01:13:20 and improve quality

01:13:21 and improve throughput
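
As a rough sketch of the camera-plus-algorithm step described here, assuming PyTorch, an untrained stand-in model, and a random tensor in place of a real camera frame (this is not Landing AI's actual system): a small binary classifier scores an image as defective or not, and what happens next is a separate workflow decision.

```python
# Hedged sketch of an automated visual inspection step with a stand-in model.
import torch
import torch.nn as nn

class DefectClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(16, 1),
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))   # probability the part is defective

model = DefectClassifier()                  # in practice, a trained model would be loaded
model.eval()

frame = torch.rand(1, 3, 224, 224)          # stand-in for a camera capture of the part
with torch.no_grad():
    p_defect = model(frame).item()

# The downstream decision (scrap, rework, human double-check) is a workflow choice,
# not something the model itself decides.
decision = "flag for human review" if p_defect > 0.5 else "pass"
print(f"p_defect={p_defect:.2f} -> {decision}")
```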

01:13:23 it turns out the practical problems

01:13:25 we run into are very different

01:13:26 than the ones you might read about

01:13:28 in in most research papers

01:13:29 the data sets are really small

01:13:30 so we face small data problems

01:13:33 you know the factories

01:13:34 keep on changing the environment

01:13:35 so it works well on your test set

01:13:38 but guess what

01:13:40 something changes in the factory

01:13:41 the lights go on or off

01:13:43 recently there was a factory

01:13:45 in which a bird flew through the factory

01:13:47 and pooped on something

01:13:48 and so that changed stuff

01:13:50 and so increasing our algorithm's

01:13:53 robustness,

01:13:54 so it handles all the changes that happen in the factory

01:13:56 I find that we run into a lot of practical problems

01:13:59 that are not as widely discussed

01:14:01 in academia

01:14:02 and it’s really fun

01:14:03 kind of being on the cutting edge

01:14:05 solving these problems before

01:14:07 maybe before many people are even aware

01:14:09 that there is a problem there

01:14:10 and that’s such a fascinating space

01:14:12 you’re absolutely right

01:14:13 but what is the first step

01:14:15 that a company should take

01:14:16 it’s just scary leap

01:14:18 into this new world of

01:14:20 going from the human eye

01:14:21 inspecting to digitizing that process

01:14:24 having a camera

01:14:25 having an algorithm

01:14:27 what’s the first step

01:14:28 like what’s the early journey

01:14:30 that you recommend

01:14:31 that you see these companies taking

01:14:33 I published a document

01:14:34 called the AI Transformation Playbook

01:14:37 that’s online

01:14:37 and taught briefly in the AI for Everyone

01:14:39 course on Coursera

01:14:41 about the long term journey

01:14:42 that companies should take

01:14:44 but the first step

01:14:45 is actually to start small

01:14:46 I’ve seen a lot more companies fail

01:14:48 by starting too big

01:14:50 than by starting too small

01:14:52 take even Google

01:14:54 you know most people don’t realize

01:14:55 how hard it was

01:14:56 and how controversial it was

01:14:58 in the early days

01:14:59 so when I started Google Brain

01:15:02 it was controversial

01:15:03 you know people thought

01:15:04 deep learning, neural nets,

01:15:06 we tried it, it didn't work,

01:15:07 why would you want to do deep learning

01:15:09 so my first internal customer

01:15:11 within Google

01:15:12 was the Google speech team

01:15:13 which is not the most lucrative

01:15:15 project in Google

01:15:17 not the most important

01:15:18 it’s not web search or advertising

01:15:20 but by starting small

01:15:22 my team helped the speech team

01:15:25 build a more accurate speech recognition system

01:15:28 and this caused their peers

01:15:30 other teams to start

01:15:31 to have more faith in deep learning

01:15:32 my second internal customer

01:15:34 was the Google Maps team

01:15:36 where we used computer vision

01:15:37 to read house numbers

01:15:39 from basic street view images

01:15:41 to more accurately locate houses

01:15:42 within Google Maps

01:15:43 so improve the quality of geodata

01:15:45 and it was only after those two successes

01:15:48 that I then started

01:15:49 a more serious conversation

01:15:50 with the Google Ads team

01:15:52 and so there’s a ripple effect

01:15:54 that you showed that it works

01:15:55 in these cases

01:15:56 and then it just propagates

01:15:58 through the entire company

01:15:59 that this thing has a lot of value

01:16:01 and use for us

01:16:02 I think the early small scale projects

01:16:05 it helps the teams gain faith

01:16:07 but also helps the teams learn

01:16:09 what these technologies do

01:16:11 I still remember our first GPU server,

01:16:14 it was a server under some guy’s desk

01:16:16 and you know and then that taught us

01:16:19 early important lessons about

01:16:21 how do you have multiple users

01:16:23 share a set of GPUs

01:16:25 which is really not obvious at the time

01:16:26 but those early lessons were important

01:16:29 we learned a lot from that first GPU server

01:16:31 that later helped the teams think through

01:16:33 how to scale it up

01:16:34 to much larger deployments

01:16:37 Are there concrete challenges

01:16:38 that companies face

01:16:40 that you see is important for them to solve?

01:16:43 I think building and deploying

01:16:45 machine learning systems is hard

01:16:47 there’s a huge gulf between

01:16:48 something that works

01:16:49 in a jupyter notebook on your laptop

01:16:51 versus something that runs

01:16:52 in a production deployment setting

01:16:54 in a factory or agriculture plant or whatever

01:16:58 so I see a lot of people

01:16:59 get something to work on your laptop

01:17:01 and say wow look what I’ve done

01:17:02 and that’s great that’s hard

01:17:03 that’s a very important first step

01:17:05 but a lot of teams underestimate

01:17:07 the rest of the steps needed

01:17:09 so for example

01:17:10 I’ve heard this exact same conversation

01:17:12 between a lot of machine learning people

01:17:13 and business people

01:17:15 the machine learning person says

01:17:16 look my algorithm does well on the test set

01:17:20 and it’s a clean test set at the end of peak

01:17:22 and the business person says

01:17:24 thank you very much

01:17:25 but your algorithm sucks it doesn’t work

01:17:28 and the machine learning person says

01:17:29 no wait I did well on the test set

01:17:33 and I think there is a gulf between

01:17:36 what it takes to do well on the test set

01:17:38 on your hard drive

01:17:39 versus what it takes to work well

01:17:41 in a deployment setting

01:17:43 some common problems

01:17:45 robustness and generalization

01:17:47 you deploy something in the factory

01:17:49 maybe they chop down a tree outside the factory

01:17:51 so the tree no longer covers the window

01:17:54 and the lighting is different

01:17:55 so the test set changes

01:17:56 and in machine learning

01:17:58 and especially in academia

01:18:00 we don’t know how to deal with test set distributions

01:18:02 that are dramatically different

01:18:04 than the training set distribution

01:18:06 you know there's research on this,

01:18:07 stuff like domain adaptation,

01:18:10 transfer learning

01:18:11 you know there are people working on it

01:18:12 but we’re really not good at this

01:18:14 so how do you actually get this to work

01:18:17 because your test set distribution

01:18:18 is going to change
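
One simple, hedged illustration of catching the kind of shift described here, assuming NumPy and mean image brightness as a stand-in feature; real monitoring would track richer statistics, and this is not any particular team's method. Compare deployment-time statistics against the training distribution and flag when they drift far outside it, like the lighting-change scenario.

```python
# Toy drift check between training-time data and deployment-time data.
import numpy as np

rng = np.random.default_rng(0)
train_images = rng.uniform(0.4, 0.6, size=(1000, 64, 64))   # conditions at training time
deploy_images = rng.uniform(0.6, 0.8, size=(200, 64, 64))    # lighting changed after deployment

# Use mean brightness per image as a crude summary statistic.
train_brightness = train_images.mean(axis=(1, 2))
deploy_brightness = deploy_images.mean(axis=(1, 2))

# Flag drift when deployment statistics fall far outside the training range.
mu, sigma = train_brightness.mean(), train_brightness.std()
z = abs(deploy_brightness.mean() - mu) / (sigma + 1e-8)
if z > 3:
    print(f"distribution shift detected (z={z:.1f}): retrain, recalibrate, or collect new data")
```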

01:18:19 and I think also if you look at the number of lines of code

01:18:23 in the software system

01:18:24 the machine learning model is maybe five percent

01:18:27 or even fewer

01:18:29 relative to the entire software system

01:18:31 you need to build

01:18:33 so how do you get all that work done

01:18:34 and make it reliable and systematic

01:18:36 so good software engineering work

01:18:38 is fundamental here

01:18:40 to building a successful machine learning system

01:18:44 yes and the software system

01:18:46 needs to interface with the machine learning system

01:18:48 and needs to interface with people's workflows

01:18:50 so machine learning is automation on steroids

01:18:53 if we take one task out of many tasks

01:18:56 that are done in the factory

01:18:57 so the factory does lots of things

01:18:58 one task is vision inspection

01:19:00 if we automate that one task

01:19:02 it can be really valuable

01:19:03 but you may need to redesign a lot of other tasks

01:19:06 around that one task

01:19:07 for example say the machine learning algorithm

01:19:09 says this is defective

01:19:10 what are you supposed to do

01:19:11 do you throw it away

01:19:12 do you get a human to double check

01:19:14 do you want to rework it or fix it

01:19:16 so you need to redesign a lot of tasks

01:19:17 around that thing you’ve now automated

01:19:20 so planning for the change management

01:19:22 and making sure that the software you write

01:19:24 is consistent with the new workflow

01:19:26 and you take the time to explain to people

01:19:28 what needs to happen

01:19:29 so I think what landing AI has become good at

01:19:34 and I think we learned by making mistakes,

01:19:36 you know, painful experiences,

01:19:38 what we've become good at is

01:19:41 working with our partners to think through

01:19:43 all the things beyond just the machine learning model

01:19:46 or running the jupyter notebook

01:19:47 but to build the entire system

01:19:50 manage the change process

01:19:51 and figure out how to deploy this in a way

01:19:53 that has an actual impact

01:19:55 the processes that the large software tech companies

01:19:58 use for deploying don’t work

01:19:59 for a lot of other scenarios

01:20:01 for example when I was leading large speech teams

01:20:05 if the speech recognition system goes down

01:20:07 what happens well alarms goes off

01:20:09 and then someone like me would say hey

01:20:11 you 20 engineers,

01:20:12 please fix this

01:20:16 but if you have a system go down in a factory

01:20:19 there are not 20 machine learning engineers

01:20:21 sitting around that you can page

01:20:22 and have them fix it

01:20:23 so how do you deal with the maintenance

01:20:26 or the DevOps or the MLOps

01:20:28 or the other aspects of this

01:20:30 so these are concepts that I think landing AI

01:20:33 and a few other teams are on the cutting edge of

01:20:36 but we don’t even have systematic terminology yet

01:20:39 to describe some of the stuff we do

01:20:40 because I think we’re inventing it on the fly.

01:20:44 So you mentioned some people are interested

01:20:46 in discovering mathematical beauty

01:20:48 and truth in the universe

01:20:49 and you’re interested in having

01:20:51 a big positive impact in the world

01:20:54 so let me ask the two are not inconsistent

01:20:57 no they’re all together

01:20:58 I’m only half joking

01:21:00 because you’re probably interested a little bit in both

01:21:03 but let me ask a romanticized question

01:21:06 so much of the work

01:21:08 your work and our discussion today

01:21:09 has been on applied AI

01:21:11 maybe you can even call narrow AI

01:21:14 where the goal is to create systems

01:21:15 that automate some specific process

01:21:17 that adds a lot of value to the world

01:21:19 but there’s another branch of AI

01:21:21 starting with Alan Turing

01:21:22 that kind of dreams of creating human level

01:21:25 or superhuman level intelligence

01:21:28 is this something you dream of as well

01:21:30 do you think we human beings

01:21:32 will ever build a human level intelligence

01:21:34 or superhuman level intelligence system?

01:21:37 I would love to get to AGI

01:21:38 and I think humanity will

01:21:40 but whether it takes 100 years

01:21:42 or 500 or 5000

01:21:45 I find hard to estimate

01:21:47 do you have

01:21:49 some folks have worries

01:21:51 about the different trajectories

01:21:53 that path would take

01:21:54 even existential threats of an AGI system

01:21:57 do you have such concerns

01:21:59 whether in the short term or the long term?

01:22:02 I do worry about the long term fate of humanity

01:22:05 I do wonder as well

01:22:08 I do worry about overpopulation on the planet Mars

01:22:12 just not today

01:22:13 I think there will be a day

01:22:15 when maybe someday in the future

01:22:17 Mars will be polluted

01:22:19 there are all these children dying

01:22:20 and someone will look back at this video

01:22:22 and say Andrew how is Andrew so heartless?

01:22:24 He didn’t care about all these children

01:22:25 dying on the planet Mars

01:22:27 and I apologize to the future viewer

01:22:29 I do care about the children

01:22:31 but I just don’t know how to

01:22:32 productively work on that today

01:22:33 your picture will be in the dictionary

01:22:35 for the people who are ignorant

01:22:37 about the overpopulation on Mars

01:22:39 yes so it’s a long term problem

01:22:42 is there something in the short term

01:22:43 we should be thinking about

01:22:45 in terms of aligning the values of our AI systems

01:22:48 with the values of us humans

01:22:52 sort of something that Stuart Russell

01:22:54 and other folks are thinking about

01:22:56 as this system develops more and more

01:22:58 we want to make sure that it represents

01:23:01 the better angels of our nature

01:23:03 the ethics the values of our society

01:23:07 you know if you take self driving cars

01:23:11 the biggest problem with self driving cars

01:23:12 is not that there’s some trolley dilemma

01:23:16 and you teach this so you know

01:23:17 how many times when you are driving your car

01:23:20 did you face this moral dilemma

01:23:21 who do I crash into?

01:23:24 so I think self driving cars

01:23:25 will run into that problem roughly as often

01:23:27 as we do when we drive our cars

01:23:29 the biggest problem with self driving cars

01:23:30 is when there’s a big white truck across the road

01:23:33 and what you should do is brake

01:23:34 and not crash into it

01:23:35 and the self driving car fails

01:23:37 and it crashes into it

01:23:38 so I think we need to solve that problem first

01:23:40 I think the problem with some of these discussions

01:23:42 about AGI you know alignments

01:23:47 the paperclip problem

01:23:49 is that it's a huge distraction

01:23:51 from the much harder problems

01:23:53 that we actually need to address today

01:23:56 it’s not the hardest problems

01:23:57 we need to address today

01:23:59 it’s not the hard problems

01:24:00 we need to address today

01:24:01 I think bias is a huge issue

01:24:04 I worry about wealth inequality

01:24:06 AI and the internet are causing

01:24:09 an acceleration of concentration of power

01:24:11 because we can now centralize data

01:24:13 use AI to process it

01:24:14 and so industry after industry

01:24:16 we’ve affected every industry

01:24:18 so the internet industry has a lot of

01:24:20 winner take most

01:24:20 or winner take all dynamics

01:24:22 but we’ve infected all these other industries

01:24:24 so we’re also giving these other industries

01:24:26 winner take all flavors

01:24:28 so look at what Uber and Lyft

01:24:30 did to the taxi industry

01:24:32 so we’re doing this type of thing

01:24:33 it’s a lot and so this

01:24:34 so we’re creating tremendous wealth

01:24:36 but how do we make sure that the wealth

01:24:37 is fairly shared

01:24:39 and then, how do we help

01:24:43 people whose jobs are displaced

01:24:44 you know I think education is part of it

01:24:46 there may be even more

01:24:48 that we need to do than education

01:24:52 I think bias is a serious issue

01:24:54 there are adverse uses of AI

01:24:56 like deepfakes being used

01:24:57 for various nefarious purposes

01:24:59 so I worry about some teams

01:25:04 maybe accidentally

01:25:05 and I hope not deliberately

01:25:07 making a lot of noise about things

01:25:09 that are problems in the distant future

01:25:12 rather than focusing on

01:25:13 some of the much harder problems

01:25:15 yeah, they overshadow the problems

01:25:17 that we already have today

01:25:18 they’re exceptionally challenging

01:25:19 like those you said

01:25:20 and even the silly ones

01:25:21 but the ones that have a huge impact

01:25:23 huge impact

01:25:24 like the lighting variation

01:25:25 outside of your factory window

01:25:27 that that ultimately is

01:25:30 what makes the difference

01:25:31 between like you said

01:25:32 the Jupyter notebook

01:25:33 and something that actually transforms

01:25:35 an entire industry potentially

01:25:37 yeah, and I think,

01:25:38 for some companies,

01:25:40 when a regulator comes to you

01:25:42 and says look your product

01:25:44 is messing things up

01:25:45 fixing it may have a revenue impact

01:25:47 well it’s much more fun to talk to them

01:25:49 about how you promise

01:25:50 not to wipe out humanity

01:25:51 than to face the actually really hard problems we face

01:25:55 so your life has been a great journey

01:25:57 from teaching to research

01:25:58 to entrepreneurship

01:26:00 two questions

01:26:01 one are there regrets

01:26:04 moments that if you went back

01:26:05 you would do differently

01:26:07 and two are there moments

01:26:08 you’re especially proud of

01:26:10 moments that made you truly happy

01:26:13 you know I’ve made so many mistakes

01:26:17 it feels like every time

01:26:18 I discover something

01:26:19 I go why didn’t I think of this

01:26:23 you know five years earlier

01:26:24 or even 10 years earlier

01:26:27 and as recently

01:26:29 and then sometimes I read a book

01:26:30 and I go I wish I read this book 10 years ago

01:26:33 my life would have been so different

01:26:35 although that happened recently

01:26:36 and then I was thinking

01:26:37 if only I read this book

01:26:39 when we’re starting up Coursera

01:26:40 I could have been so much better

01:26:42 but I discovered the book

01:26:43 had not yet been written

01:26:44 we’re starting Coursera

01:26:46 so that made me feel better

01:26:49 but I find that the process of discovery

01:26:53 we keep on finding out things

01:26:54 that seem so obvious in hindsight

01:26:57 but it always takes us so much longer

01:26:59 than I wish to figure it out

01:27:03 so on the second question

01:27:06 are there moments in your life

01:27:08 that if you look back

01:27:09 that you’re especially proud of

01:27:12 or you’re especially happy

01:27:13 what would be the moments that filled you with happiness

01:27:17 and fulfillment

01:27:18 well two answers

01:27:20 one is my daughter Nova

01:27:21 yes of course

01:27:22 because no matter how much time I spend with her

01:27:24 I just can’t spend enough time with her

01:27:25 congratulations by the way

01:27:26 thank you

01:27:27 and then second is helping other people

01:27:29 I think to me

01:27:30 I think the meaning of life

01:27:32 is helping others achieve

01:27:35 whatever are their dreams

01:27:37 and then also to try to move the world forward

01:27:40 making humanity more powerful as a whole

01:27:43 so the times that I felt most happy

01:27:46 most proud was when I felt

01:27:49 someone else allowed me the good fortune

01:27:52 of helping them a little bit

01:27:54 on the path to their dreams

01:27:57 I think there’s no better way to end it

01:27:58 than talking about happiness

01:28:00 and the meaning of life

01:28:01 so Andrew it’s a huge honor

01:28:03 me and millions of people

01:28:04 thank you for all the work you’ve done

01:28:05 thank you for talking today

01:28:07 thank you so much thanks

01:28:07 thanks for listening to this conversation with Andrew Ng

01:28:10 and thank you to our presenting sponsor Cash App

01:28:13 download it, use code LEXPODCAST

01:28:16 you’ll get ten dollars

01:28:17 and ten dollars will go to FIRST

01:28:19 an organization that inspires and educates young minds

01:28:22 to become science and technology innovators of tomorrow

01:28:25 if you enjoy this podcast

01:28:27 subscribe on YouTube

01:28:28 give it five stars on Apple podcast

01:28:30 support it on Patreon

01:28:32 or simply connect with me on Twitter

01:28:34 at Lex Fridman

01:28:35 and now let me leave you with some words of wisdom from Andrew Ng

01:28:39 ask yourself

01:28:40 if what you’re working on succeeds beyond your wildest dreams

01:28:44 would you have significantly helped other people?

01:28:47 if not then keep searching for something else to work on

01:28:51 otherwise you’re not living up to your full potential

01:28:54 thank you for listening and hope to see you next time