Transcript
00:00:00 The following is a conversation with Rajat Monga.
00:00:03 He’s an engineer and director of Google,
00:00:04 leading the TensorFlow team.
00:00:06 TensorFlow is an open source library
00:00:09 at the center of much of the work going on in the world
00:00:11 in deep learning, both the cutting edge research
00:00:14 and the large scale application of learning based approaches.
00:00:17 But it’s quickly becoming much more than a software library.
00:00:20 It’s now an ecosystem of tools for the deployment of machine
00:00:24 learning in the cloud, on the phone, in the browser,
00:00:26 on both generic and specialized hardware.
00:00:29 TPU, GPU, and so on.
00:00:31 Plus, there’s a big emphasis on growing a passionate community
00:00:35 of developers.
00:00:36 Rajat, Jeff Dean, and a large team of engineers at Google
00:00:39 Brain are working to define the future of machine
00:00:42 learning with TensorFlow 2.0, which is now in alpha.
00:00:46 I think the decision to open source TensorFlow
00:00:49 is a definitive moment in the tech industry.
00:00:51 It showed that open innovation can be successful
00:00:54 and inspired many companies to open source their code,
00:00:56 to publish, and in general engage
00:00:58 in the open exchange of ideas.
00:01:01 This conversation is part of the Artificial Intelligence
00:01:03 podcast.
00:01:05 If you enjoy it, subscribe on YouTube, iTunes,
00:01:07 or simply connect with me on Twitter at Lex Fridman,
00:01:10 spelled F R I D.
00:01:12 And now, here’s my conversation with Rajat Monga.
00:01:17 You were involved with Google Brain since its start in 2011
00:01:22 with Jeff Dean.
00:01:24 It started with DistBelief, the proprietary machine learning
00:01:29 library, and turned into TensorFlow in 2014,
00:01:32 the open source library.
00:01:35 So what were the early days of Google Brain like?
00:01:39 What were the goals, the missions?
00:01:41 How do you even proceed forward when there are
00:01:45 so many possibilities before you?
00:01:47 It was interesting back then, when I started,
00:01:50 when we were even just talking about it.
00:01:55 The idea of deep learning was intriguing
00:01:59 in some ways.
00:02:00 It hadn’t yet taken off, but it held some promise.
00:02:04 It had shown some very promising early results.
00:02:08 I think the idea where Andrew and Jeff had started
00:02:11 was, what if we can take this work people are doing
00:02:15 in research and scale it to what Google has
00:02:18 in terms of the compute power, and also
00:02:23 put that kind of data together?
00:02:24 What does it mean?
00:02:25 And so far, the results had been, if you scale the compute,
00:02:28 scale the data, it does better.
00:02:30 And would that work?
00:02:31 And so that was the first year or two, can we prove that out?
00:02:35 And with DistBelief, when we started the first year,
00:02:37 we got some early wins, which is always great.
00:02:40 What were the wins like?
00:02:41 What was the win where you thought,
00:02:44 there’s something to this, this is going to be good?
00:02:46 I think there are two early wins where one was speech,
00:02:49 that we collaborated very closely with the speech research
00:02:52 team, who was also getting interested in this.
00:02:54 And the other one was on images, where the cat paper,
00:02:58 as we call it, that was covered by a lot of folks.
00:03:03 And the birth of Google Brain was around neural networks.
00:03:07 So it was deep learning from the very beginning.
00:03:09 That was the whole mission.
00:03:10 So what would, in terms of scale,
00:03:15 what was the sort of dream of what this could become?
00:03:21 Were there echoes of this open source TensorFlow community
00:03:24 that might be brought in?
00:03:26 Was there a sense of TPUs?
00:03:28 Was there a sense of machine learning is now going to be
00:03:31 at the core of the entire company,
00:03:33 is going to grow into that direction?
00:03:36 Yeah, I think, so that was interesting.
00:03:38 And if I think back to 2012 or 2011,
00:03:41 and the first question was, can we scale it? Within a year or so,
00:03:45 we had started scaling it to hundreds and thousands
00:03:47 of machines.
00:03:48 In fact, we had some runs even going to 10,000 machines.
00:03:51 And all of those showed great promise.
00:03:53 In terms of machine learning at Google,
00:03:56 the good thing was Google’s been doing machine learning
00:03:58 for a long time.
00:04:00 Deep learning was new, but as we scaled this up,
00:04:03 we showed that, yes, that was possible.
00:04:05 And it was going to impact lots of things.
00:04:07 Like we started seeing real products wanting to use this.
00:04:11 Again, speech was the first, there were image things
00:04:13 that photos came out of and then many other products as well.
00:04:17 So that was exciting.
00:04:20 As we went into that a couple of years,
00:04:23 externally also academia started to,
00:04:25 there was lots of push on, okay,
00:04:27 deep learning is interesting,
00:04:28 we should be doing more and so on.
00:04:30 And so by 2014, we were looking at, okay,
00:04:34 this is a big thing, it’s going to grow.
00:04:36 And not just internally, externally as well.
00:04:39 Yes, maybe Google’s ahead of where everybody is,
00:04:42 but there’s a lot to do.
00:04:43 So a lot of this started to make sense and come together.
00:04:46 So the decision to open source,
00:04:49 I was just chatting with Chris Lattner about this.
00:04:52 The decision to go open source with TensorFlow,
00:04:54 I would say sort of for me personally,
00:04:57 seems to be one of the big seminal moments
00:04:59 in all of software engineering ever.
00:05:01 I think that’s when a large company like Google
00:05:04 decided to take a large project that many lawyers
00:05:07 might argue has a lot of IP,
00:05:10 and just go open source with it,
00:05:12 and in so doing lead the entire world
00:05:14 in saying, you know what, open innovation
00:05:16 is a pretty powerful thing, and it’s okay to do.
00:05:22 That was, I mean, that’s an incredible moment in time.
00:05:26 So do you remember those discussions happening?
00:05:29 Whether open source should be happening?
00:05:31 What was that like?
00:05:32 I would say, I think, so the initial idea came from Jeff,
00:05:36 who was a big proponent of this.
00:05:39 I think it came off of two big things.
00:05:42 One was research wise, we were a research group.
00:05:46 We were putting all our research out there.
00:05:49 We were building on others’ research
00:05:51 and we wanted to push the state of the art forward.
00:05:55 And part of that was to share the research.
00:05:56 That’s how I think deep learning and machine learning
00:05:58 has really grown so fast.
00:06:01 So the next step was, okay, now,
00:06:03 would software help with that?
00:06:05 And it seemed like there were
00:06:08 a few existing libraries out there, Theano being one,
00:06:11 Torch being another, and a few others,
00:06:14 but they were all done by academia
00:06:15 and so the level was significantly different.
00:06:18 The other one was from a software perspective,
00:06:22 Google had built lots of software
00:06:23 that we used internally, you know,
00:06:27 and we published papers.
00:06:29 Often there was an open source project
00:06:31 that came out of that, where somebody else
00:06:33 picked up that paper and implemented it,
00:06:35 and those were very successful.
00:06:38 Back then it was like, okay, there’s Hadoop,
00:06:41 which has come off of tech that we’ve built.
00:06:44 We know the tech we’ve built is way better
00:06:46 for a number of different reasons.
00:06:47 We’ve invested a lot of effort in that.
00:06:51 And turns out we have Google Cloud
00:06:54 and we are now not really providing our tech,
00:06:57 but we are saying, okay, we have Bigtable,
00:07:00 which is the original thing.
00:07:02 We are going to now provide HBase APIs
00:07:03 on top of that, which isn’t as good,
00:07:06 but that’s what everybody’s used to.
00:07:07 So it was like, can we make something
00:07:10 that is better, that really
00:07:12 helps the community in lots of ways,
00:07:14 but also helps push a good standard forward?
00:07:18 So how does Cloud fit into that?
00:07:19 There’s a TensorFlow open source library
00:07:22 and how does the fact that you can
00:07:25 use so many of the resources that Google provides
00:07:28 and the Cloud fit into that strategy?
00:07:31 So TensorFlow itself is open
00:07:33 and you can use it anywhere, right?
00:07:34 And we want to make sure that continues to be the case.
00:07:38 On Google Cloud, we do make sure
00:07:41 that there’s lots of integrations with everything else
00:07:43 and we want to make sure
00:07:44 that it works really, really well there.
00:07:47 You’re leading the TensorFlow effort.
00:07:50 Can you tell me the history
00:07:51 and the timeline of TensorFlow project
00:07:53 in terms of major design decisions,
00:07:55 so like the open source decision,
00:07:58 but really what to include and not?
00:08:01 There’s this incredible ecosystem
00:08:03 that I’d like to talk about.
00:08:04 There’s all these parts,
00:08:05 but what if just some sample moments
00:08:11 that defined what TensorFlow eventually became
00:08:15 through its, I don’t know if you’re allowed to say history
00:08:17 when it’s just, but in deep learning,
00:08:20 everything moves so fast
00:08:21 and just a few years is already history.
00:08:23 Yes, yes, so looking back, we were building TensorFlow.
00:08:29 I guess we open sourced it in 2015, November 2015.
00:08:34 We started on it in summer of 2014, I guess.
00:08:39 And somewhere around three to six months in, in late 2014,
00:08:42 by then we had decided that, okay,
00:08:45 there’s a high likelihood we’ll open source it.
00:08:47 So we started thinking about that
00:08:48 and making sure we’re heading down that path.
00:08:53 By that point,
00:08:56 we had seen lots of different use cases at Google.
00:08:59 So there were things like, okay,
00:09:01 yes, you wanna run it at large scale in the data center.
00:09:04 Yes, we need to support different kind of hardware.
00:09:07 We had GPUs at that point.
00:09:09 We had our first TPU at that point,
00:09:11 or it was about to come out roughly around that time.
00:09:15 So the design sort of included those.
00:09:18 We had started to push on mobile.
00:09:21 So we were running models on mobile.
00:09:24 At that point, people were customizing code.
00:09:28 So we wanted to make sure TensorFlow
00:09:29 could support that as well.
00:09:30 So that sort of became part of that overall design.
00:09:35 When you say mobile,
00:09:36 you mean like a pretty complicated algorithms
00:09:38 running on the phone?
00:09:40 That’s correct.
00:09:40 So when you have a model that you deploy on the phone
00:09:44 and run it there, right?
00:09:45 So already at that time,
00:09:46 there was ideas of running machine learning on the phone.
00:09:48 That’s correct.
00:09:49 We already had a couple of products
00:09:51 that were doing that by then.
00:09:53 And in those cases,
00:09:54 we had basically customized handcrafted code
00:09:57 or some internal libraries that we were using.
00:10:00 So I was actually at Google during this time
00:10:02 in a parallel, I guess, universe,
00:10:04 but we were using Theano and Caffe.
00:10:09 Was there some degree to which you were bouncing,
00:10:11 like trying to see what Caffe was offering people,
00:10:15 trying to see what Theano was offering
00:10:17 that you want to make sure you’re delivering
00:10:19 on whatever that is?
00:10:21 Perhaps the Python part of things,
00:10:23 maybe did that influence any design decisions?
00:10:27 Totally.
00:10:28 So when we built DistBelief,
00:10:29 and some of that was in parallel
00:10:31 with some of these libraries coming up,
00:10:33 I mean, Theano itself is older,
00:10:36 but we were building DistBelief
00:10:39 focused on our internal thing
00:10:41 because our systems were very different.
00:10:42 By the time we got to this,
00:10:44 we looked at a number of libraries that were out there.
00:10:47 Theano, there were folks in the group
00:10:49 who had experience with Torch, with Lua.
00:10:52 There were folks here who had seen Caffe.
00:10:54 I mean, actually, Yangqing was here as well.
00:10:58 There’s what other libraries?
00:11:02 I think we looked at a number of things.
00:11:04 Might even have looked at Chainer back then.
00:11:06 I’m trying to remember if it was there.
00:11:09 In fact, yeah, we did discuss ideas around,
00:11:12 okay, should we have a graph or not?
00:11:17 So putting all of these together,
00:11:20 there were definitely key decisions that we wanted to make.
00:11:22 We had seen limitations in our prior DistBelief system.
00:11:28 A few of them were just in terms of research
00:11:31 was moving so fast, we wanted the flexibility.
00:11:35 The hardware was changing fast.
00:11:36 We expected that to change,
00:11:37 so those probably were the two things.
00:11:39 And yeah, I think the flexibility
00:11:43 in terms of being able to express
00:11:44 all kinds of crazy things was definitely a big one then.
00:11:46 So what, the graph decisions though,
00:11:49 with moving towards TensorFlow 2.0,
00:11:52 there’s more, by default, there’ll be eager execution.
00:11:56 So sort of hiding the graph a little bit
00:11:59 because it’s less intuitive
00:12:00 in terms of the way people develop and so on.
00:12:03 What was that discussion like in terms of using graphs?
00:12:06 It seemed, it’s kind of the Theano way.
00:12:09 Did it seem the obvious choice?
00:12:11 So I think where it came from was our disbelief
00:12:15 had a graph like thing as well.
00:12:17 Much simpler: it wasn’t a general graph,
00:12:19 it was more like a straight-line thing,
00:12:23 more like what you might think of Caffe,
00:12:25 I guess, in that sense.
00:12:26 But the graph was,
00:12:28 and we always cared about the production stuff.
00:12:31 Like even with disbelief,
00:12:32 we were deploying a whole bunch of stuff in production.
00:12:34 So graph did come from that when we thought of,
00:12:37 okay, should we do that in Python?
00:12:39 And we experimented with some ideas
00:12:40 where it looked a lot simpler to use,
00:12:44 but not having a graph meant,
00:12:46 okay, how do you deploy now?
00:12:47 So that was probably what tilted the balance for us
00:12:51 and eventually we ended up with a graph.
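
(A minimal sketch, assuming TensorFlow 2.x, of the trade-off being discussed: eager code runs immediately and is easy to debug, while tf.function traces the same Python code into a deployable graph. The function below is illustrative.)

    import tensorflow as tf

    # Eager by default in 2.x: ops execute immediately, easy to debug.
    x = tf.constant([[1.0, 2.0]])
    print(tf.matmul(x, x, transpose_b=True))  # runs right away

    # tf.function traces the same Python code into a graph,
    # which is what makes deployment and optimization possible.
    @tf.function
    def squared_norm(t):
        return tf.reduce_sum(t * t)

    print(squared_norm(x))  # first call traces; later calls reuse the graph
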
00:12:52 And I guess the question there is, did you,
00:12:55 I mean, so production seems to be
00:12:57 the really good thing to focus on,
00:12:59 but did you even anticipate the other side of it
00:13:02 where there could be, what is it?
00:13:04 What are the numbers?
00:13:05 It’s been crazy, 41 million downloads.
00:13:08 Yep.
00:13:12 I mean, was that even like a possibility in your mind
00:13:16 that it would be as popular as it became?
00:13:19 So I think we did see a need for this
00:13:24 a lot from the research perspective
00:13:27 and like early days of deep learning in some ways.
00:13:32 41 million, no, I don’t think I imagined this number.
00:13:35 Then it seemed like there’s a potential future
00:13:41 where lots more people would be doing this
00:13:43 and how do we enable that?
00:13:45 I would say this kind of growth,
00:13:49 I probably started seeing somewhat after the open sourcing
00:13:52 where it was like, okay,
00:13:55 deep learning is actually growing way faster
00:13:57 for a lot of different reasons.
00:13:59 And we are in just the right place to push on that
00:14:02 and leverage that and deliver on lots of things
00:14:06 that people want.
00:14:07 So what changed once you open sourced?
00:14:09 Like how this incredible amount of attention
00:14:13 from a global population of developers,
00:14:16 how did the project start changing?
00:14:18 I don’t even actually remember during those times.
00:14:22 I know looking now, there’s really good documentation,
00:14:24 there’s an ecosystem of tools,
00:14:26 there’s a community, there’s a blog,
00:14:27 there’s a YouTube channel now, right?
00:14:29 Yeah.
00:14:31 It’s very community driven.
00:14:33 Back then, I guess 0.1 version,
00:14:38 is that the version?
00:14:39 I think we called it 0.6 or 0.5,
00:14:42 something like that, I forget.
00:14:43 What changed leading into 1.0?
00:14:47 It’s interesting.
00:14:48 I think we’ve gone through a few things there.
00:14:51 When we started out, when we first came out,
00:14:53 people loved the documentation we have
00:14:56 because it was just a huge step up from everything else
00:14:58 because all of those were academic projects,
00:15:00 done by people who don’t think about documentation.
00:15:04 I think what that changed was,
00:15:06 instead of deep learning being a research thing,
00:15:10 some people who were just developers
00:15:12 could now suddenly take this out
00:15:14 and do some interesting things with it, right?
00:15:16 Who had no clue what machine learning was before then.
00:15:20 And that I think really changed
00:15:22 how things started to scale up in some ways
00:15:24 and pushed on it.
00:15:27 Over the next few months, as we looked at
00:15:30 how to stabilize things,
00:15:31 we looked at not just researchers;
00:15:33 now people want stability, they want to deploy things.
00:15:36 That’s how we started planning for 1.0
00:15:38 and there are certain needs for that perspective.
00:15:42 And so again, documentation comes up,
00:15:45 designs, more kinds of things to put that together.
00:15:49 And so that was exciting to get that to a stage
00:15:52 where more and more enterprises wanted to buy in
00:15:55 and really get behind that.
00:15:57 And I think post 1.0 and over the next few releases,
00:16:01 that enterprise adoption also started to take off.
00:16:04 I would say between the initial release and 1.0,
00:16:07 it was, okay, researchers of course,
00:16:10 then a lot of hobbyists and early interest,
00:16:12 people excited about this who started to get on board
00:16:15 and then over the 1.x thing, lots of enterprises.
00:16:18 I imagine anything that’s below 1.0
00:16:23 creates pressure, because
00:16:25 the enterprise probably wants something that’s stable.
00:16:28 Exactly.
00:16:28 And do you have a sense now that TensorFlow is stable?
00:16:33 Like it feels like deep learning in general
00:16:35 is extremely dynamic field, so much is changing.
00:16:40 And TensorFlow has been growing incredibly.
00:16:43 Do you have a sense of stability at the helm of it?
00:16:46 I mean, I know you’re in the midst of it, but.
00:16:48 Yeah, I think in the midst of it,
00:16:51 it’s often easy to forget what an enterprise wants
00:16:55 and what some of the people on that side want.
00:16:58 There are still people running models
00:17:00 that are three years old, four years old.
00:17:02 So Inception is still used by tons of people.
00:17:06 Even ResNet 50 is, what, a couple of years old now or more,
00:17:08 but there are tons of people who use that and they’re fine.
00:17:12 They don’t need the last couple of bits of performance
00:17:15 or quality, they want some stability
00:17:17 in things that just work.
00:17:19 And so there is value in providing that
00:17:22 with that kind of stability and making it really simpler
00:17:25 because that allows a lot more people to access it.
00:17:27 And then there’s the research crowd which wants,
00:17:31 okay, they wanna do these crazy things
00:17:33 exactly like you’re saying, right?
00:17:34 Not just deep learning in the straight up models
00:17:37 that used to be there; they want RNNs,
00:17:40 and even RNNs are maybe old, it’s transformers now.
00:17:43 And now it needs to combine with RL and GANs and so on.
00:17:48 So there’s definitely that area that like the boundary
00:17:52 that’s shifting and pushing the state of the art.
00:17:55 But I think there’s more and more of the past
00:17:57 that’s much more stable and even stuff
00:18:01 that was two, three years old is very, very usable
00:18:03 by lots of people.
00:18:04 So that part makes it a lot easier.
00:18:07 So I imagine, maybe you can correct me if I’m wrong,
00:18:09 one of the biggest use cases is essentially
00:18:12 taking something like ResNet 50
00:18:14 and doing some kind of transfer learning
00:18:17 on a very particular problem that you have.
00:18:19 That’s probably what the majority of the world does.
00:18:24 And you wanna make that as easy as possible.
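
(A minimal sketch of that transfer-learning workflow in tf.keras, assuming TensorFlow 2.x; the class count and the frozen backbone are illustrative assumptions.)

    import tensorflow as tf

    # Pretrained ResNet50 backbone, with the ImageNet classifier head removed.
    base = tf.keras.applications.ResNet50(
        weights="imagenet", include_top=False, pooling="avg")
    base.trainable = False  # freeze the pretrained weights

    # Attach a small head for your own problem (5 classes here, illustrative).
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dense(5, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(your_images, your_labels, epochs=5)  # your own labeled data
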
00:18:27 So I would say for the hobbyist perspective,
00:18:30 that’s the most common case, right?
00:18:32 In fact, the apps and phones and stuff that you’ll see,
00:18:35 the early ones, that’s the most common case.
00:18:37 I would say there are a couple of reasons for that.
00:18:40 One is that everybody talks about that.
00:18:44 It looks great on slides.
00:18:46 That’s a presentation, yeah, exactly.
00:18:49 For enterprises, that is part of it,
00:18:53 but that’s not the big thing.
00:18:54 Enterprises really have data
00:18:56 that they wanna make predictions on.
00:18:58 Often what they used to do,
00:19:00 with the people who were doing ML,
00:19:01 was just regression models:
00:19:03 linear regression, logistic regression, linear models,
00:19:06 or maybe gradient boosted trees and so on.
00:19:09 Some of them still benefit from deep learning,
00:19:11 but that’s their bread and butter,
00:19:14 the structured data and so on.
00:19:16 So depending on the audience you look at,
00:19:18 they’re a little bit different.
00:19:19 And they just have, I mean, the best enterprise
00:19:23 probably just has a very large data set
00:19:26 where deep learning can really shine.
00:19:28 That’s correct, that’s right.
00:19:30 And then I think the other piece that they wanted,
00:19:33 again with 2.0 at the developer summit, we put together
00:19:36 the whole TensorFlow Extended piece,
00:19:39 which is the entire pipeline.
00:19:40 They care about stability across doing their entire thing.
00:19:43 They want simplicity across the entire thing.
00:19:46 I don’t need to just train a model.
00:19:47 I need to do that every day again, over and over again.
00:19:51 I wonder to which degree you have a role in,
00:19:54 I don’t know, so I teach a course on deep learning.
00:19:56 I have people like lawyers come up to me and say,
00:20:01 when is machine learning gonna enter legal,
00:20:04 the legal realm?
00:20:05 The same thing in all kinds of disciplines,
00:20:09 immigration, insurance, often when I see
00:20:14 what it boils down to is these companies
00:20:17 are often a little bit old school
00:20:19 in the way they organize the data.
00:20:20 So the data is just not ready yet, it’s not digitized.
00:20:24 Do you also find yourself being in the role
00:20:26 of an evangelist for like, let’s get,
00:20:31 organize your data, folks, and then you’ll get
00:20:33 the big benefit of TensorFlow.
00:20:35 Do you get those, have those conversations?
00:20:38 Yeah, yeah, you know, I get all kinds of questions there
00:20:41 from, okay, what do I need to make this work, right?
00:20:49 Do we really need deep learning?
00:20:50 I mean, there are all these things,
00:20:52 I already use this linear model, why would this help?
00:20:55 I don’t have enough data, let’s say,
00:20:57 or I wanna use machine learning,
00:21:00 but I have no clue where to start.
00:21:01 So it varies, all the way from that to the experts
00:21:04 asking why we support very specific things. It’s interesting.
00:21:08 Is there a good answer?
00:21:09 It boils down to oftentimes digitizing data.
00:21:12 So whatever you want automated,
00:21:14 whatever data you want to make prediction based on,
00:21:17 you have to make sure that it’s in an organized form.
00:21:21 Like within the TensorFlow ecosystem,
00:21:24 there’s now, you’re providing more and more data sets
00:21:26 and more and more pre trained models.
00:21:28 Are you finding yourself also the organizer of data sets?
00:21:32 Yes, I think the TensorFlow data sets
00:21:34 that we just released, that’s definitely come up
00:21:37 where people want these data sets,
00:21:39 can we organize them and can we make that easier?
00:21:41 So that’s definitely one important thing.
00:21:45 The other related thing I would say is I often tell people,
00:21:47 you know what, don’t think of the fanciest thing,
00:21:51 the newest model that you see;
00:21:53 make something very basic work and then you can improve it.
00:21:56 There’s just lots of things you can do with it.
00:21:58 Yeah, start with the basics, true.
00:22:00 One of the big things that makes TensorFlow
00:22:03 even more accessible was the appearance
00:22:06 whenever that happened of Keras,
00:22:08 the Keras standard sort of outside of TensorFlow.
00:22:12 I think it was Keras on top of Theano at first only
00:22:18 and then Keras became on top of TensorFlow.
00:22:22 Do you know when Keras chose to also add TensorFlow
00:22:28 as a backend, who was the,
00:22:31 was it just the community that drove that initially?
00:22:34 Do you know if there was discussions, conversations?
00:22:37 Yeah, so Francois started the Keras project
00:22:41 before he was at Google, and the first thing was Theano.
00:22:44 I don’t remember if that was
00:22:46 after TensorFlow was created or way before.
00:22:49 And then at some point,
00:22:51 when TensorFlow started becoming popular,
00:22:53 there were enough similarities
00:22:54 that he decided to create this interface
00:22:56 and put TensorFlow as a backend.
00:22:58 I believe that might still have been
00:23:00 before he joined Google.
00:23:03 So we weren’t really talking about that.
00:23:06 He decided on his own and thought that was interesting
00:23:09 and relevant to the community.
00:23:12 In fact, I didn’t find out about him being at Google
00:23:17 until a few months after he was here.
00:23:19 He was working on some research ideas
00:23:21 and doing Keras on his nights and weekends project.
00:23:24 Oh, interesting.
00:23:25 He wasn’t part of the TensorFlow team.
00:23:28 He didn’t join it initially.
00:23:29 He joined research and he was doing some amazing research.
00:23:32 He has some papers on that research,
00:23:34 so he’s a great researcher as well.
00:23:38 And at some point we realized,
00:23:40 oh, he’s doing this good stuff.
00:23:42 People seem to like the API and he’s right here.
00:23:45 So we talked to him and he said,
00:23:47 okay, why don’t I come over to your team
00:23:50 and work with you for a quarter
00:23:52 and let’s make that integration happen.
00:23:55 And we talked to his manager and he said,
00:23:56 sure, quarter’s fine.
00:23:59 And that quarter’s been something like two years now.
00:24:02 And so he’s fully on this.
00:24:05 So Keras got integrated into TensorFlow in a deep way.
00:24:12 And now with 2.0, TensorFlow 2.0,
00:24:15 sort of Keras is kind of the recommended way
00:24:18 for a beginner to interact with TensorFlow.
00:24:21 Which makes that initial sort of transfer learning
00:24:24 or the basic use cases, even for an enterprise,
00:24:28 super simple, right?
00:24:29 That’s correct, that’s right.
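
(A minimal sketch of that beginner-facing Keras path in TensorFlow 2.x, using the toy MNIST dataset that ships with tf.keras; the layer sizes are illustrative.)

    import tensorflow as tf

    # Load a small dataset that ships with tf.keras.
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train / 255.0  # scale pixel values to [0, 1]

    # Define, compile, and train a small model in a few lines.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=1)
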
00:24:30 So what was that decision like?
00:24:32 That seems like it’s kind of a bold decision as well.
00:24:38 We did spend a lot of time thinking about that one.
00:24:41 We had a bunch of APIs, some built by us.
00:24:46 There was a parallel layers API that we were building.
00:24:48 And when we decided to do Keras in parallel,
00:24:51 so there were, like, two things that we were looking at.
00:24:54 And the first thing we were trying to do
00:24:55 was just have them look similar,
00:24:58 like be as integrated as possible,
00:25:00 share all of that stuff.
00:25:02 There were also like three other APIs
00:25:04 that others had built over time
00:25:05 because we didn’t have a standard one.
00:25:09 But one of the messages that we kept hearing
00:25:11 from the community, okay, which one do we use?
00:25:13 And they kept seeing like, okay,
00:25:14 here’s a model in this one and here’s a model in this one,
00:25:16 which should I pick?
00:25:18 So that’s sort of like, okay,
00:25:20 we had to address that straight on with 2.0.
00:25:24 The whole idea was we need to simplify.
00:25:26 We had to pick one.
00:25:28 Based on where we were, we were like,
00:25:30 okay, let’s see what people like.
00:25:35 And Keras was clearly one that lots of people loved.
00:25:39 There were lots of great things about it.
00:25:41 So we settled on that.
00:25:43 Organically, that’s kind of the best way to do it.
00:25:46 It was great.
00:25:47 It was surprising, nevertheless,
00:25:48 to sort of bring in an outside project.
00:25:51 I mean, there was a feeling like Keras
00:25:52 might be almost like a competitor,
00:25:55 in a certain kind of way, to TensorFlow.
00:25:58 And in a sense, it became an empowering element
00:26:01 of TensorFlow.
00:26:02 That’s right.
00:26:03 Yeah, it’s interesting how you can put two things together,
00:26:06 which can align.
00:26:08 In this case, I think Francois, the team,
00:26:11 and a bunch of us have chatted,
00:26:14 and I think we all want to see the same kind of things.
00:26:17 We all care about making it easier
00:26:18 for the huge set of developers out there,
00:26:21 and that makes a difference.
00:26:23 So Python has Guido van Rossum,
00:26:26 who until recently held the position
00:26:28 of benevolent dictator for life.
00:26:31 All right, so does a huge successful open source project
00:26:36 like TensorFlow need one person who makes the final decision?
00:26:40 So you did a pretty successful TensorFlow Dev Summit
00:26:45 just now, these last couple of days.
00:26:47 There’s clearly a lot of different new features
00:26:51 being incorporated, an amazing ecosystem, so on.
00:26:54 How are those design decisions made?
00:26:57 Is there a BDFL in TensorFlow,
00:27:02 or is it more distributed and organic?
00:27:05 I think it’s somewhat different, I would say.
00:27:08 I’ve always been involved in the key design directions,
00:27:14 but there are lots of things that are distributed
00:27:17 where there are a number of people, Martin Wicke being one,
00:27:20 who has really driven a lot of our open source stuff,
00:27:23 a lot of the APIs,
00:27:26 and there are a number of other people who’ve been,
00:27:29 you know, pushed and been responsible
00:27:31 for different parts of it.
00:27:34 We do have regular design reviews.
00:27:36 Over the last year,
00:27:38 we’ve really spent a lot of time opening up to the community
00:27:41 and adding transparency.
00:27:44 We’re setting more processes in place,
00:27:45 so RFCs, special interest groups,
00:27:49 to really grow that community and scale that.
00:27:53 At the kind of scale this ecosystem is at,
00:27:57 I don’t think we could scale with me
00:27:59 as the lone decision maker.
00:28:02 I got it. So, yeah, the growth of that ecosystem,
00:28:05 maybe you can talk about it a little bit.
00:28:08 First of all, it started with Andrej Karpathy
00:28:10 when he first did ConvNetJS.
00:28:13 The fact that you could train a neural network
00:28:15 in the browser, in JavaScript, was incredible.
00:28:18 So now TensorFlow.js is really making that
00:28:22 a serious, like a legit thing,
00:28:26 a way to operate, whether it’s in the backend
00:28:28 or the front end.
00:28:29 Then there’s the TensorFlow Extended, like you mentioned.
00:28:32 There’s TensorFlow Lite for mobile.
00:28:35 And all of it, as far as I can tell,
00:28:37 it’s really converging towards being able to
00:28:41 save models in the same kind of way.
00:28:43 You can move around, you can train on the desktop
00:28:46 and then move it to mobile and so on.
00:28:48 That’s right.
00:28:49 So there’s that cohesiveness.
00:28:52 So can you maybe give me, whatever I missed,
00:28:56 a bigger overview of the mission of the ecosystem
00:28:58 that’s trying to be built and where is it moving forward?
00:29:02 Yeah. So in short, the way I like to think of this is
00:29:06 our goal is to enable machine learning.
00:29:09 And in a couple of ways, you know, one is
00:29:13 we have lots of exciting things going on in ML today.
00:29:16 We started with deep learning,
00:29:17 but we now support a bunch of other algorithms too.
00:29:21 So one is to, on the research side,
00:29:23 keep pushing on the state of the art.
00:29:25 Can we, you know, how do we enable researchers
00:29:27 to build the next amazing thing?
00:29:28 So BERT came out recently, you know,
00:29:31 it’s great that people are able to do new kinds of research.
00:29:33 And there are lots of amazing research
00:29:35 that happens across the world.
00:29:37 So that’s one direction.
00:29:38 The other is how do you take that across
00:29:42 all the people outside who want to take that research
00:29:45 and do some great things with it
00:29:46 and integrate it to build real products,
00:29:48 to have a real impact on people.
00:29:51 And so that’s the other axis, in some ways.
00:29:56 you know, at a high level, one way I think about it is
00:29:59 there are a crazy number of compute devices
00:30:02 across the world.
00:30:04 And we often used to think of ML and training
00:30:07 and all of this as, okay, something you do
00:30:09 either in the workstation or the data center or cloud.
00:30:13 But we see things running on the phones.
00:30:15 We see things running on really tiny chips.
00:30:17 I mean, we had some demos at the developer summit.
00:30:20 And so the way I think about this ecosystem is
00:30:25 how do we help get machine learning on every device
00:30:29 that has a compute capability?
00:30:32 And that continues to grow, and so in some ways
00:30:36 this ecosystem has looked at, you know,
00:30:38 various aspects of that and grown over time
00:30:41 to cover more of those.
00:30:42 And we continue to push the boundaries.
00:30:44 In some areas we’ve built more tooling
00:30:48 and things around that to help you.
00:30:50 I mean, the first tool we started with was TensorBoard,
00:30:52 if you wanted to visualize just the training piece.
00:30:56 Then TFX, or TensorFlow Extended,
00:30:58 to really do your entire ML pipeline,
00:31:00 if you, you know, care about all that production stuff,
00:31:04 but then going to the edge,
00:31:06 going to different kinds of things.
00:31:09 And it’s not just us now.
00:31:11 We are a place where there are lots of libraries
00:31:14 being built on top.
00:31:15 So there are some for research,
00:31:17 maybe things like TensorFlow Agents
00:31:20 or TensorFlow Probability, that started as research things
00:31:22 for researchers focusing
00:31:24 on certain kinds of algorithms,
00:31:26 but they’re also being deployed
00:31:27 or used by, you know, production folks.
00:31:30 And some have come from within Google,
00:31:33 just teams across Google
00:31:34 who wanted to build these things.
00:31:37 Others have come from just the community
00:31:39 because there are different pieces
00:31:41 that different parts of the community care about.
00:31:44 And I see our goal as enabling even that, right?
00:31:49 It’s not, we cannot and won’t build every single thing.
00:31:53 That just doesn’t make sense.
00:31:54 But if we can enable others to build the things
00:31:57 that they care about, and there’s a broader community
00:32:00 that cares about that, and we can help encourage that,
00:32:02 and that’s great.
00:32:05 That really helps the entire ecosystem, not just those.
00:32:08 One of the big things about 2.0 that we’re pushing on is,
00:32:11 okay, we have these so many different pieces, right?
00:32:14 How do we help make all of them work well together?
00:32:18 So there are a few key pieces there that we’re pushing on,
00:32:21 one being the core format in there
00:32:23 and how we share the models themselves
00:32:26 through save model and TensorFlow hub and so on.
00:32:30 And a few other pieces that really pull this together.
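
(A minimal sketch of that shared format, assuming TensorFlow 2.x: a Keras model written out as a SavedModel directory that the rest of the ecosystem can pick up. The path is illustrative.)

    import tensorflow as tf

    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    model.save("/tmp/my_model")  # writes a SavedModel directory

    # The same directory can be reloaded in Python, served with
    # TensorFlow Serving, or converted for TensorFlow Lite / TensorFlow.js.
    restored = tf.keras.models.load_model("/tmp/my_model")
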
00:32:34 I was very skeptical, you know,
00:32:35 when TensorFlow.js came out,
00:32:37 or deeplearn.js as it was called earlier.
00:32:40 Yeah, that was the first.
00:32:41 It seems like technically very difficult project.
00:32:45 As a standalone, it’s not as difficult,
00:32:47 but as a thing that integrates into the ecosystem,
00:32:49 it seems very difficult.
00:32:51 So, I mean, there’s a lot of aspects of this
00:32:53 you’re making look easy, but,
00:32:54 and the technical side,
00:32:57 how many challenges have to be overcome here?
00:33:00 A lot.
00:33:01 And still have to be overcome.
00:33:03 That’s the question here too.
00:33:04 There are lots of steps to it, right?
00:33:06 And we’ve iterated over the last few years,
00:33:07 so there’s a lot we’ve learned.
00:33:10 Yeah, and often when things come together well,
00:33:14 things look easy and that’s exactly the point.
00:33:16 It should be easy for the end user,
00:33:18 but there are lots of things that go behind that.
00:33:21 If I think about the challenges still ahead,
00:33:25 there are, you know,
00:33:29 a lot more devices coming on board,
00:33:32 for example, from the hardware perspective.
00:33:35 How do we make it really easy for these vendors
00:33:37 to integrate with something like TensorFlow, right?
00:33:42 So there’s a lot of compiler stuff
00:33:43 that others are working on.
00:33:45 There are things we can do
00:33:48 in terms of our APIs and so on.
00:33:50 You know,
00:33:52 TensorFlow started as a very monolithic system
00:33:55 and to some extent it still is.
00:33:57 There are lots of tools around it,
00:33:59 but the core is still pretty large and monolithic.
00:34:02 One of the key challenges for us to scale that out
00:34:05 is how do we break that apart with clearer interfaces?
00:34:10 It’s, you know, in some ways it’s software engineering 101,
00:34:14 but for a system that’s now four years old, I guess,
00:34:18 or more, and that’s still rapidly evolving
00:34:21 and that we’re not slowing down with,
00:34:23 it’s hard to change and modify and really break apart.
00:34:28 It’s sort of like, as people say, right,
00:34:29 it’s like changing the engine while the car is running,
00:34:32 or trying to fix it while it runs.
00:34:33 That’s exactly what we’re trying to do.
00:34:35 So there’s a challenge here
00:34:37 because the downside of so many people
00:34:41 being excited about TensorFlow
00:34:43 and coming to rely on it in many of their applications
00:34:48 is that you’re kind of responsible,
00:34:52 like it’s the technical debt.
00:34:53 You’re responsible for previous versions
00:34:55 to some degree still working.
00:34:57 So when you’re trying to innovate,
00:34:59 I mean, it’s probably easier
00:35:02 to just start from scratch every few months.
00:35:04 Absolutely.
00:35:07 So do you feel the pain of that?
00:35:09 2.0 does break some back compatibility,
00:35:14 but not too much.
00:35:15 It seems like the conversion is pretty straightforward.
00:35:18 Do you think that’s still important
00:35:20 given how quickly deep learning is changing?
00:35:22 Can you just, the things that you’ve learned,
00:35:26 can you just start over or is there pressure to not?
00:35:29 It’s a tricky balance.
00:35:31 So if it was just a researcher writing a paper
00:35:36 who a year later will not look at that code again,
00:35:39 sure, it doesn’t matter.
00:35:41 There are a lot of production systems
00:35:43 that rely on TensorFlow,
00:35:44 both at Google and across the world.
00:35:47 And people worry about this.
00:35:49 I mean, these systems run for a long time.
00:35:53 So it is important to keep that compatibility and so on.
00:35:57 And yes, it does come with a huge cost.
00:35:59 There’s, we have to think about a lot of things
00:36:02 as we do new things and make new changes.
00:36:06 I think it’s a trade off, right?
00:36:09 You can, you might slow certain kinds of things down,
00:36:12 but the overall value you’re bringing
00:36:14 because of that is much bigger
00:36:16 because it’s not just about breaking the person yesterday.
00:36:20 It’s also about telling the person tomorrow
00:36:23 that, you know what, this is how we do things.
00:36:26 We’re not gonna break you when you come on board
00:36:28 because there are lots of new people
00:36:29 who are also gonna come on board.
00:36:31 And, you know, one way I like to think about this,
00:36:34 and I always push the team to think about it as well,
00:36:37 when you wanna do new things,
00:36:39 you wanna start with a clean slate.
00:36:42 Design with a clean slate in mind,
00:36:44 and then we’ll figure out
00:36:46 how to make sure all the other things work.
00:36:48 And yes, we do make compromises occasionally,
00:36:52 but unless you design with the clean slate
00:36:55 and not worry about that,
00:36:56 you’ll never get to a good place.
00:36:58 Oh, that’s brilliant. So even though you are responsible,
00:37:02 when you’re in the idea stage,
00:37:04 when you’re thinking of something new,
00:37:05 just put all that behind you.
00:37:07 Okay, that’s really, really well put.
00:37:09 So I have to ask this
00:37:11 because a lot of students, developers ask me
00:37:13 how I feel about PyTorch versus TensorFlow.
00:37:16 So I’ve recently completely switched
00:37:18 my research group to TensorFlow.
00:37:20 I wish everybody would just use the same thing,
00:37:23 and TensorFlow is as close to that, I believe, as we have.
00:37:26 But do you enjoy competition?
00:37:32 So TensorFlow is leading in many ways,
00:37:34 on many dimensions in terms of ecosystem,
00:37:36 in terms of number of users,
00:37:39 momentum, power, production levels, so on,
00:37:41 but a lot of researchers are now also using PyTorch.
00:37:46 Do you enjoy that kind of competition
00:37:47 or do you just ignore it
00:37:48 and focus on making TensorFlow the best that it can be?
00:37:52 So just like research or anything people are doing,
00:37:55 it’s great to get different kinds of ideas.
00:37:58 And when we started with TensorFlow,
00:38:01 like I was saying earlier,
00:38:03 one, it was very important
00:38:05 for us to also have production in mind.
00:38:07 We didn’t want just research, right?
00:38:09 And that’s why we chose certain things.
00:38:11 Now PyTorch came along and said,
00:38:12 you know what, I only care about research.
00:38:14 This is what I’m trying to do.
00:38:16 What’s the best thing I can do for this?
00:38:18 And it started iterating and said,
00:38:20 okay, I don’t need to worry about graphs.
00:38:22 Let me just run things.
00:38:24 And I don’t care if it’s not as fast as it can be,
00:38:27 but let me just make this part easy.
00:38:30 And there are things you can learn from that, right?
00:38:32 They, again, had the benefit of seeing what had come before,
00:38:36 but also exploring certain different kinds of spaces.
00:38:40 And they had some good things there,
00:38:43 building on, say, things like Chainer and so on before that.
00:38:46 So competition is definitely interesting.
00:38:49 It made us, you know,
00:38:50 this is an area that we had thought about,
00:38:51 like I said, way early on.
00:38:53 Over time we had revisited this a couple of times,
00:38:56 should we add this again?
00:38:59 At some point we said, you know what,
00:39:01 it seems like this can be done well,
00:39:02 so let’s try it again.
00:39:04 And that’s how we started pushing on eager execution.
00:39:07 How do we combine those two together?
00:39:09 Which has finally come very well together in 2.0,
00:39:13 but it took us a while to get all the things together
00:39:15 and so on.
00:39:16 So let me ask, put another way,
00:39:19 I think eager execution is a really powerful thing
00:39:21 that was added.
00:39:22 Do you think it wouldn’t have been,
00:39:25 you know, Muhammad Ali versus Frazier, right?
00:39:28 Do you think it wouldn’t have been added as quickly
00:39:31 if PyTorch wasn’t there?
00:39:33 It might have taken longer.
00:39:35 No longer?
00:39:36 Yeah, it was, I mean,
00:39:37 we had tried some variants of that before,
00:39:38 so I’m sure it would have happened,
00:39:40 but it might have taken longer.
00:39:42 I’m grateful that the TensorFlow team responded
00:39:44 in the way they did.
00:39:44 They’ve been doing some incredible work these last couple of years.
00:39:47 What other things that we didn’t talk about
00:39:49 are you looking forward to in 2.0,
00:39:51 that come to mind?
00:39:54 So we talked about some of the ecosystem stuff,
00:39:56 making it easily accessible through Keras,
00:40:00 eager execution.
00:40:01 Is there other things that we missed?
00:40:03 Yeah, so I would say one is just where 2.0 is,
00:40:07 and you know, with all the things that we’ve talked about,
00:40:10 I think as we think beyond that,
00:40:13 there are lots of other things that it enables us to do
00:40:16 and that we’re excited about.
00:40:18 So what it’s setting us up for,
00:40:20 okay, here are these really clean APIs.
00:40:22 We’ve cleaned up the surface for what the users want.
00:40:25 It also allows us to do a whole bunch of stuff
00:40:28 behind the scenes once we are ready with 2.0.
00:40:31 So for example, in TensorFlow with graphs
00:40:36 and all the things you could do,
00:40:37 you could always get a lot of good performance
00:40:40 if you spent the time to tune it, right?
00:40:43 And we’ve clearly shown that, lots of people do that.
00:40:47 With 2.0, with these APIs, where we are,
00:40:53 we can give you a lot of performance
00:40:55 just with whatever you do.
00:40:57 You know, because with these APIs, it’s much cleaner.
00:41:01 We know most people are gonna do things this way.
00:41:03 We can really optimize for that
00:41:05 and get a lot of those things out of the box.
00:41:09 And it really allows us, you know,
00:41:10 both for single machine and distributed and so on,
00:41:13 to really explore other spaces behind the scenes
00:41:17 after 2.0 in the future versions as well.
00:41:19 So right now the team’s really excited about that,
00:41:23 that over time I think we’ll see that.
00:41:25 The other piece that I was talking about
00:41:27 in terms of just restructuring the monolithic thing
00:41:31 into more pieces and making it more modular,
00:41:34 I think that’s gonna be really important
00:41:36 for a lot of the other people in the ecosystem,
00:41:41 other organizations and so on that wanted to build things.
00:41:44 Can you elaborate a little bit what you mean
00:41:46 by making TensorFlow ecosystem more modular?
00:41:50 So the way it’s organized today,
00:41:55 there are lots of repositories
00:41:56 in the TensorFlow organization on GitHub.
00:41:58 The core one where we have TensorFlow,
00:42:01 it has the execution engine,
00:42:04 it has the key backends for CPUs and GPUs,
00:42:08 it has the work to do distributed stuff.
00:42:12 And all of these just work together
00:42:14 in a single library or binary.
00:42:17 There’s no way to split them apart easily.
00:42:18 I mean, there are some interfaces,
00:42:20 but they’re not very clean.
00:42:21 In a perfect world, you would have clean interfaces where,
00:42:24 okay, I wanna run it on my fancy cluster
00:42:27 with some custom networking,
00:42:29 just implement this and do that.
00:42:31 I mean, we kind of support that,
00:42:32 but it’s hard for people today.
00:42:35 I think as we are starting to see more interesting things
00:42:38 in some of these spaces,
00:42:39 having that clean separation will really start to help.
00:42:42 And again, going to the large size of the ecosystem
00:42:47 and the different groups involved there,
00:42:50 enabling people to evolve
00:42:52 and push on things more independently
00:42:54 just allows it to scale better.
00:42:56 And by people, you mean individual developers and?
00:42:59 And organizations.
00:42:59 And organizations.
00:43:00 That’s right.
00:43:01 So the hope is that everybody, sort of the major,
00:43:04 I don’t know, Pepsi or something,
00:43:06 like major corporations, go to TensorFlow for this kind of thing.
00:43:11 Yeah, if you look at enterprises like Pepsi or these,
00:43:13 I mean, a lot of them are already using TensorFlow.
00:43:15 They are not the ones that do the development
00:43:18 or changes in the core.
00:43:20 Some of them do, but a lot of them don’t.
00:43:21 I mean, they touch small pieces.
00:43:23 There are lots of these,
00:43:25 some of them being, let’s say, hardware vendors
00:43:27 who are building their custom hardware
00:43:28 and they want their own pieces.
00:43:30 Or some of them being bigger companies, say, IBM.
00:43:34 I mean, they’re involved in some of our
00:43:36 special interest groups,
00:43:38 and they see a lot of users
00:43:39 who want certain things and they want to optimize for that.
00:43:42 So folks like that often.
00:43:44 Autonomous vehicle companies, perhaps.
00:43:46 Exactly, yes.
00:43:48 So, yeah, like I mentioned,
00:43:50 TensorFlow has been downloaded 41 million times,
00:43:52 50,000 commits, almost 10,000 pull requests,
00:43:56 and 1,800 contributors.
00:43:58 So I’m not sure if you can explain it,
00:44:02 but what does it take to build a community like that?
00:44:06 In retrospect, what do you think,
00:44:09 what is the critical thing that allowed
00:44:11 for this growth to happen,
00:44:12 and how does that growth continue?
00:44:14 Yeah, yeah, that’s an interesting question.
00:44:17 I wish I had all the answers there, I guess,
00:44:20 so you could replicate it.
00:44:22 I think there are a number of things
00:44:25 that need to come together, right?
00:44:27 One, just like any new thing,
00:44:32 it’s about a sweet spot of timing
00:44:35 and what’s needed, and whether it grows with what’s needed.
00:44:38 So in this case, for example,
00:44:41 TensorFlow hasn’t just grown because it was a good tool,
00:44:43 it’s also grown with the growth of deep learning itself.
00:44:46 So those factors come into play.
00:44:49 Other than that, though,
00:44:52 I think just hearing, listening to the community,
00:44:55 what they do, what they need,
00:44:57 being open to, like in terms of external contributions,
00:45:01 we’ve spent a lot of time in making sure
00:45:04 we can accept those contributions well,
00:45:06 we can help the contributors in adding those,
00:45:09 putting the right process in place,
00:45:11 getting the right kind of community,
00:45:13 welcoming them and so on.
00:45:16 Like over the last year, we’ve really pushed on transparency,
00:45:19 that’s important for an open source project.
00:45:22 People wanna know where things are going,
00:45:23 and we’re like, okay, here’s a process
00:45:26 where you can do that, here are our RFCs and so on.
00:45:29 So thinking through, there are lots of community aspects
00:45:32 that come into that you can really work on.
00:45:35 As a small project, it’s maybe easy to do
00:45:38 because there’s like two developers and you can do those.
00:45:42 As you grow, putting more of these processes in place,
00:45:46 thinking about the documentation,
00:45:49 thinking about what the developers care about,
00:45:51 what kind of tools would they want to use,
00:45:55 all of these come into play, I think.
00:45:56 So one of the big things I think
00:45:58 that feeds the TensorFlow fire
00:46:00 is people building something on TensorFlow,
00:46:03 implementing a particular architecture
00:46:07 that does something cool and useful,
00:46:09 and putting that on GitHub.
00:46:11 And so it just feeds this growth.
00:46:15 Do you have a sense that with 2.0 and 1.0
00:46:19 that there may be a little bit of a partitioning
00:46:21 like there is with Python 2 and 3,
00:46:24 that there’ll be a code base
00:46:26 and in the older versions of TensorFlow,
00:46:28 they will not be as compatible easily?
00:46:31 Or are you pretty confident that this kind of conversion
00:46:35 is pretty natural and easy to do?
00:46:37 So we’re definitely working hard
00:46:39 to make that very easy to do.
00:46:41 There’s lots of tooling that we talked about
00:46:43 at the developer summit this week,
00:46:45 and we’ll continue to invest in that tooling.
00:46:48 It’s, you know, when you think
00:46:50 of these significant version changes,
00:46:52 that’s always a risk,
00:46:53 and we are really pushing hard
00:46:55 to make that transition very, very smooth.
00:46:58 So I think, so at some level,
00:47:02 people wanna move and they see the value in the new thing.
00:47:05 They don’t wanna move just because it’s a new thing,
00:47:07 and some people do,
00:47:08 but most people want a really good thing.
00:47:11 And I think over the next few months,
00:47:13 as people start to see the value,
00:47:15 we’ll definitely see that shift happening.
00:47:17 So I’m pretty excited and confident
00:47:19 that we will see people moving.
00:47:22 As you said earlier, this field is also moving rapidly,
00:47:24 so that’ll help because we can do more things
00:47:26 and all the new things will clearly happen in 2.x,
00:47:29 so people will have lots of good reasons to move.
00:47:32 So what do you think TensorFlow 3.0 looks like?
00:47:36 Is there, are things happening so crazily
00:47:40 that even at the end of this year
00:47:42 seems impossible to plan for?
00:47:45 Or is it possible to plan for the next five years?
00:47:49 I think it’s tricky.
00:47:50 There are some things that we can expect
00:47:54 in terms of, okay, change, yes, change is gonna happen.
00:47:59 Are there some things gonna stick around
00:48:01 and some things not gonna stick around?
00:48:03 I would say the basics of deep learning,
00:48:08 the, you know, say convolution models
00:48:10 or the basic kind of things,
00:48:12 they’ll probably be around in some form still in five years.
00:48:16 Will RL and GAN stay?
00:48:18 Very likely, based on where they are.
00:48:21 Will we have new things?
00:48:22 Probably, but those are hard to predict.
00:48:24 And directionally, some things that we can see,
00:48:30 you know, in things that we’re starting to do right now
00:48:32 with some of our projects,
00:48:35 is 2.0 combining eager execution and graphs,
00:48:39 where we’re starting to make it more like
00:48:41 just your natural programming language.
00:48:43 You’re not trying to program something else.
00:48:45 Similarly, with Swift for TensorFlow,
00:48:47 we’re taking that approach.
00:48:48 Can you do something ground up, right?
00:48:50 So some of those ideas seem like, okay,
00:48:52 that’s the right direction.
00:48:54 In five years, we expect to see more in that area.
00:48:58 Other things we don’t know is,
00:49:00 will hardware accelerators be the same?
00:49:03 Will we be able to train with four bits
00:49:06 instead of 32 bits?
00:49:09 And I think the TPU side of things is exploring that.
00:49:11 I mean, TPU is already on version three.
00:49:13 It seems that the evolution of TPU and TensorFlow
00:49:17 are sort of, they’re coevolving almost in terms of
00:49:23 both are learning from each other and from the community
00:49:25 and from the applications
00:49:27 where the biggest benefit is achieved.
00:49:29 That’s right.
00:49:30 You’ve been trying to sort of, with Eager, with Keras,
00:49:33 to make TensorFlow as accessible
00:49:34 and easy to use as possible.
00:49:36 What do you think, for beginners,
00:49:38 is the biggest thing they struggle with?
00:49:40 Have you encountered that?
00:49:42 Or is it basically what Keras and Eager are solving,
00:49:46 like we talked about?
00:49:47 Yeah, for some of them, like you said, right,
00:49:50 the beginners want to just be able to take
00:49:53 some image model,
00:49:54 they don’t care if it’s Inception or ResNet
00:49:57 or something else,
00:49:58 and do some training or transfer learning
00:50:00 on their kind of model.
00:50:02 Being able to make that easy is important.
00:50:04 So in some ways,
00:50:07 if you do that by providing them simple models
00:50:09 with, say, TensorFlow Hub or so on,
00:50:11 they don’t care about what’s inside that box,
00:50:13 but they want to be able to use it.
00:50:15 So we’re pushing on, I think, different levels.
00:50:17 If you look at just a component that you get,
00:50:20 which has the layers already smooshed in,
00:50:22 the beginners probably just want that.
00:50:25 Then the next step is, okay,
00:50:26 look at building layers with Keras.
00:50:29 If you go out to research,
00:50:30 then they are probably writing custom layers themselves
00:50:33 or doing their own loops.
00:50:34 So there’s a whole spectrum there.
00:50:36 And then providing the pretrained models
00:50:38 seems to really decrease the time it takes you to get started.
00:50:43 You could basically in a Colab notebook
00:50:46 achieve what you need.
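[Editor’s note: to make that spectrum concrete, here is a hedged sketch of the middle rung: reusing a pretrained image feature extractor from TF Hub as a Keras layer and training only a new classification head. The hub handle and class count are illustrative, not from the conversation.]

```python
import tensorflow as tf
import tensorflow_hub as hub

# Pretrained feature extractor from TF Hub; the exact handle is illustrative.
feature_extractor = hub.KerasLayer(
    "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4",
    input_shape=(224, 224, 3),
    trainable=False,  # freeze the pretrained weights
)

# Only the new head on top gets trained (transfer learning).
model = tf.keras.Sequential([
    feature_extractor,
    tf.keras.layers.Dense(5, activation="softmax"),  # e.g. 5 target classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=3)  # your own data here
```

A sketch like this runs end to end in a Colab notebook with no local installation, which is the point being made above.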
00:50:49 So I’m basically answering my own question,
00:50:51 because I think what TensorFlow delivered on recently
00:50:54 is making it trivial for beginners.
00:50:56 So I was just wondering if there were other pain points
00:51:00 you’re trying to ease,
00:51:01 but I’m not sure there would be.
00:51:02 No, those are probably the big ones.
00:51:04 I see high schoolers doing a whole bunch of things now,
00:51:07 which is pretty amazing.
00:51:09 It’s both amazing and terrifying.
00:51:11 Yes.
00:51:12 In the sense that when they grow up,
00:51:15 some incredible ideas will be coming from them.
00:51:19 So there’s certainly a technical aspect to your work,
00:51:21 but you also have a management aspect to your role
00:51:25 with TensorFlow leading the project,
00:51:27 a large number of developers and people.
00:51:31 So what do you look for in a good team?
00:51:34 What do you think?
00:51:36 Google has been at the forefront of exploring
00:51:38 what it takes to build a good team
00:51:40 and TensorFlow is one of the most cutting edge technologies
00:51:45 in the world.
00:51:46 So in this context, what do you think makes for a good team?
00:51:50 It’s definitely something I think about a fair amount.
00:51:53 I think in terms of the team being able
00:51:59 to deliver something well,
00:52:01 one of the things that’s important is a cohesion
00:52:04 across the team.
00:52:05 So being able to execute together in doing things,
00:52:10 because at this scale,
00:52:13 an individual engineer can only do so much.
00:52:15 There’s a lot more that they can do together,
00:52:18 even though we have some amazing superstars across Google
00:52:21 and in the team. But, you know,
00:52:25 often the way I see it is, the product
00:52:27 of what the team generates is way larger
00:52:29 than the sum of the individuals put together.
00:52:34 And so how do we have all of them work together,
00:52:37 the culture of the team itself,
00:52:40 hiring good people is important.
00:52:43 But part of that is it’s not just that,
00:52:45 okay, we hire a bunch of smart people
00:52:47 and throw them together and let them do things.
00:52:49 It’s also people have to care about what they’re building,
00:52:52 people have to be motivated for the right kind of things.
00:52:57 That’s often an important factor.
00:53:01 And, you know, finally, how do you put that together
00:53:04 with a somewhat unified vision of where we wanna go?
00:53:08 So are we all looking in the same direction
00:53:11 or each of us going all over?
00:53:13 And sometimes it’s a mix.
00:53:16 Google’s a very bottom up organization in some sense,
00:53:21 also research even more so, and that’s how we started.
00:53:26 But as we’ve become this larger product and ecosystem,
00:53:30 I think it’s also important to combine that well
00:53:33 with a mix of, okay, here’s the direction we wanna go in.
00:53:38 There is exploration we’ll do around that,
00:53:39 but let’s keep staying in that direction,
00:53:42 not just all over the place.
00:53:44 And is there a way you monitor the health of the team?
00:53:46 Sort of like, is there a way you know you did a good job?
00:53:51 The team is good?
00:53:53 Like, I mean, you’re saying nice things,
00:53:56 but it’s sometimes difficult to determine how aligned the team is.
00:54:00 Yes.
00:54:01 Because it’s not binary.
00:54:02 There’s tensions and complexities and so on.
00:54:06 And the other element is the notion of superstars,
00:54:09 there’s so much, even at Google,
00:54:11 such a large percentage of work
00:54:13 is done by individual superstars too.
00:54:16 And sometimes those superstars
00:54:19 can go against the dynamic of a team, and there are those tensions.
00:54:25 I mean, I’m sure in TensorFlow it might be
00:54:26 a little bit easier because the mission of the project
00:54:28 is so sort of beautiful.
00:54:31 You’re at the cutting edge, so it’s exciting.
00:54:34 But have you had struggle with that?
00:54:36 Has there been challenges?
00:54:38 There are always people challenges
00:54:39 in different kinds of ways.
00:54:41 That said, I think we’ve been good
00:54:44 about getting people who care and, you know,
00:54:48 have the same kind of culture,
00:54:50 and that’s Google in general to a large extent.
00:54:53 But also, like you said, given that the project
00:54:56 has had so many exciting things to do,
00:54:58 there’s been room for lots of people
00:55:00 to do different kinds of things and grow,
00:55:02 which does make the problem a bit easier, I guess.
00:55:05 And it allows people, depending on what they’re doing,
00:55:09 if there’s room around them, then that’s fine.
00:55:13 But yes, we do care that, superstar or not,
00:55:19 they work well with the team, across Google.
00:55:22 That’s interesting to hear.
00:55:23 So it’s like superstar or not,
00:55:26 the productivity broadly is about the team.
00:55:30 Yeah, yeah.
00:55:31 I mean, they might add a lot of value,
00:55:32 but if they’re hurting the team, then that’s a problem.
00:55:35 So in hiring engineers, it’s so interesting, right,
00:55:39 the hiring process.
00:55:40 What do you look for?
00:55:41 How do you determine a good developer
00:55:44 or a good member of a team
00:55:46 from just a few minutes or hours together?
00:55:50 Again, no magic answers, I’m sure.
00:55:52 Yeah, I mean, Google has a hiring process
00:55:55 that we’ve refined over the last 20 years, I guess,
00:55:59 and that you’ve probably heard and seen a lot about.
00:56:02 So we do work with the same hiring process
00:56:04 and that’s really helped.
00:56:08 For me in particular, I would say,
00:56:10 in addition to the core technical skills,
00:56:14 what does matter is their motivation
00:56:17 in what they wanna do.
00:56:19 Because if that doesn’t align well
00:56:21 with where we wanna go,
00:56:22 that’s not gonna lead to long term success
00:56:25 for either them or the team.
00:56:27 And I think that becomes more important
00:56:30 the more senior the person is,
00:56:31 but it’s important at every level.
00:56:33 Like even the junior most engineer,
00:56:34 if they’re not motivated to do well
00:56:36 at what they’re trying to do,
00:56:37 however smart they are,
00:56:38 it’s gonna be hard for them to succeed.
00:56:40 Does the Google hiring process touch on that passion?
00:56:44 So like trying to determine,
00:56:46 because I think as far as I understand,
00:56:48 maybe you can speak to it,
00:56:49 that the Google hiring process sort of helps
00:56:53 in the initial stage, like it determines the skill set there,
00:56:56 is your puzzle solving ability,
00:56:57 problem solving ability good?
00:56:59 But like, I’m not sure
00:57:02 that it determines
00:57:05 whether the person has, like, fire inside them
00:57:07 that burns to do anything really,
00:57:09 it doesn’t really matter what.
00:57:09 It’s just, some cool stuff,
00:57:11 I’m gonna do it.
00:57:15 Is that something that ultimately ends up
00:57:17 when they have a conversation with you
00:57:18 or once it gets closer to the team?
00:57:22 So one of the things we do have as part of the process
00:57:25 is just a culture fit,
00:57:27 like part of the interview process itself,
00:57:29 in addition to just the technical skills
00:57:31 and each engineer or whoever the interviewer is,
00:57:34 is supposed to rate the person on the culture
00:57:38 and the culture fit with Google and so on.
00:57:40 So that is definitely part of the process.
00:57:42 Now, there are various kinds of projects
00:57:45 and different kinds of things.
00:57:46 So there might be variance
00:57:48 in the kind of culture you want there and so on.
00:57:51 And yes, that does vary.
00:57:52 So for example,
00:57:54 TensorFlow has always been a fast moving project
00:57:56 and we want people who are comfortable with that.
00:58:00 But at the same time now, for example,
00:58:02 we are at a place where we are also a very full fledged product
00:58:05 and we wanna make sure things that work
00:58:07 really, really work, right?
00:58:09 You can’t cut corners all the time.
00:58:11 So balancing that out and finding the people
00:58:14 who are the right fit for those is important.
00:58:17 And I think those kinds of things do vary a bit
00:58:19 across projects and teams and product areas across Google.
00:58:23 And so you’ll see some differences there
00:58:25 in the final checklist.
00:58:27 But a lot of the core culture,
00:58:29 it comes along with just the engineering excellence
00:58:32 and so on.
00:58:34 What is the hardest part of your job?
00:58:39 Take your pick, I guess.
00:58:41 It’s fun, I would say, right?
00:58:44 Hard, yes.
00:58:45 I mean, lots of things at different times.
00:58:47 I think that does vary.
00:58:49 So let me clarify that difficult things are fun
00:58:52 when you solve them, right?
00:58:53 So it’s fun in that sense.
00:58:57 I think the key to a successful thing across the board
00:59:02 and in this case, it’s a large ecosystem now,
00:59:05 but even a small product,
00:59:07 is striking that fine balance
00:59:09 across different aspects of it.
00:59:12 Sometimes it’s how fast do you go
00:59:13 versus how perfect it is.
00:59:17 Sometimes it’s how do you involve this huge community?
00:59:21 Who do you involve or do you decide,
00:59:23 okay, now is not a good time to involve them
00:59:25 because it’s not the right fit.
00:59:30 Sometimes it’s saying no to certain kinds of things.
00:59:33 Those are often the hard decisions.
00:59:36 Some of them you make quickly
00:59:39 because you don’t have the time.
00:59:41 Some of them you get time to think about them,
00:59:43 but they’re always hard.
00:59:44 Often both choices are pretty good, which is what makes those decisions hard.
00:59:49 What about deadlines?
00:59:50 Is this, do you find TensorFlow,
00:59:53 to be driven by deadlines
00:59:58 to a degree that a product might?
01:00:00 Or is there still a balance to where it’s less deadline?
01:00:04 You had the Dev Summit today
01:00:06 that came together incredibly.
01:00:08 Looked like there’s a lot of moving pieces and so on.
01:00:11 So did that deadline make people rise to the occasion
01:00:15 releasing TensorFlow 2.0 alpha?
01:00:18 I’m sure that was done last minute as well.
01:00:20 I mean, up to the last point.
01:00:25 Again, it’s one of those things
01:00:26 that you need to strike the good balance.
01:00:29 There’s some value that deadlines bring
01:00:32 that does bring a sense of urgency
01:00:33 to get the right things together.
01:00:35 Instead of getting the perfect thing out,
01:00:38 you need something that’s good and works well.
01:00:41 And the team definitely did a great job
01:00:43 in putting that together.
01:00:44 So I was very amazed and excited
01:00:45 by how everything came together.
01:00:48 That said, across the year,
01:00:49 we try not to put out official deadlines.
01:00:52 We focus on key things that are important,
01:00:57 figure out how much of it’s important.
01:01:00 And we are developing in the open,
01:01:03 both internally and externally,
01:01:05 everything’s available to everybody.
01:01:07 So you can pick and look at where things are.
01:01:11 We do releases at a regular cadence.
01:01:13 So fine, if something doesn’t necessarily end up
01:01:16 this month, it’ll end up in the next release
01:01:17 in a month or two.
01:01:18 And that’s okay, but we want to keep moving
01:01:22 as fast as we can in these different areas.
01:01:26 Because we can iterate and improve on things,
01:01:29 sometimes it’s okay to put things out
01:01:31 that aren’t fully ready.
01:01:32 We’ll make sure it’s clear that okay,
01:01:34 this is experimental, but it’s out there
01:01:36 if you want to try and give feedback.
01:01:37 That’s very, very useful.
01:01:39 I think that quick cycle and quick iteration is important.
01:01:43 That’s what we often focus on rather than
01:01:46 here’s a deadline where you get everything else.
01:01:49 Is 2.0, is there pressure to make that stable?
01:01:52 Or like, for example, WordPress 5.0 just came out
01:01:57 and there was no pressure for it to be perfect;
01:02:00 a lot of the updates were delivered way too late,
01:02:03 and they said, okay, well,
01:02:05 we’re gonna release a lot of updates
01:02:07 really quickly to improve it.
01:02:09 Do you see TensorFlow 2.0 in that same kind of way
01:02:12 or is there this pressure to once it hits 2.0,
01:02:15 once you get to the release candidate
01:02:16 and then you get to the final,
01:02:18 that’s gonna be the stable thing?
01:02:22 So it’s gonna be stable,
01:02:25 just like 1.x was, where every API that’s there
01:02:28 is gonna remain and work.
01:02:32 It doesn’t mean we can’t change things under the covers.
01:02:34 It doesn’t mean we can’t add things.
01:02:36 So there’s still a lot more for us to do
01:02:39 and we’ll continue to have more releases.
01:02:41 So in that sense, there’s still,
01:02:42 I don’t think we’ll be done in like two months
01:02:44 when we release this.
01:02:46 I don’t know if you can say, but,
01:02:49 there’s no external deadline for TensorFlow 2.0,
01:02:53 but are there internal deadlines,
01:02:57 artificial or otherwise,
01:02:58 that you’re trying to set for yourselves,
01:03:00 or is it whenever it’s ready?
01:03:03 So we want it to be a great product, right?
01:03:05 And that’s a big important piece for us.
01:03:09 TensorFlow’s already out there.
01:03:11 We have 41 million downloads for 1.x.
01:03:13 So it’s not like we have to have this.
01:03:16 Yeah, exactly.
01:03:17 So a lot of the features
01:03:19 that we’re really polishing
01:03:21 and putting together are already there.
01:03:23 We don’t have to rush it just because.
01:03:26 So in that sense, we wanna get it right
01:03:28 and really focus on that.
01:03:29 That said, we have said that we are looking
01:03:31 to get this out in the next few months,
01:03:33 in the next quarter.
01:03:34 And as far as possible,
01:03:37 we’ll definitely try to make that happen.
01:03:39 Yeah, my favorite line was, spring is a relative concept.
01:03:44 I love it.
01:03:45 Yes.
01:03:46 Spoken like a true developer.
01:03:47 So something I’m really interested in
01:03:50 and your previous line of work is,
01:03:52 before TensorFlow, you led a team at Google on search ads.
01:03:57 I think this is a very interesting topic
01:04:01 on every level, on a technical level,
01:04:04 because at their best, ads connect people
01:04:07 to the things they want and need.
01:04:09 So, and at their worst, they’re just these things
01:04:12 that annoy the heck out of you
01:04:14 to the point of ruining the entire user experience
01:04:17 of whatever you’re actually doing.
01:04:20 So they have a bad rep, I guess.
01:04:23 And on the other end, this connecting of users
01:04:28 to the thing they need and want
01:04:29 is a beautiful opportunity for machine learning to shine.
01:04:34 Like, huge amounts of data that’s personalized,
01:04:36 and you kind of map it to the thing
01:04:37 they actually want, so they won’t get annoyed.
01:04:40 So what have you learned from this,
01:04:43 from Google, that’s leading the world in this aspect,
01:04:45 what have you learned from that experience,
01:04:47 and what do you think is the future of ads?
01:04:51 Taking you back to that.
01:04:52 Yeah, yes, it’s been a while,
01:04:55 but I totally agree with what you said.
01:04:59 I think the search ads, the way it was always looked at
01:05:03 and I believe it still is,
01:05:04 is it’s an extension of what search is trying to do.
01:05:08 And the goal is to make the information
01:05:10 and make the world’s information accessible.
01:05:14 It’s not just information,
01:05:17 but maybe products or other things that people care about.
01:05:20 And so it’s really important for them to align
01:05:23 with what the users need.
01:05:26 And in search ads, there’s a minimum quality level
01:05:30 before an ad will be shown.
01:05:32 If you don’t have an ad that hits that quality bar,
01:05:34 it will not be shown, even if we have one,
01:05:35 and okay, maybe we lose some money there, that’s fine.
01:05:39 That is really, really important.
01:05:41 And I think that that is something I really liked
01:05:43 about being there.
01:05:45 Advertising is a key part.
01:05:48 I mean, as a model, it’s been around for ages, right?
01:05:51 It’s not a new model, it’s been adapted to the web
01:05:54 and became a core part of search
01:05:57 and many other search engines across the world.
01:06:00 And I do hope, like you said,
01:06:04 there are aspects of ads that are annoying
01:06:06 and I go to a website and if it just keeps popping
01:06:10 an ad in my face, not letting me read,
01:06:12 that’s clearly gonna be annoying.
01:06:13 So I hope we can strike that balance
01:06:18 between showing a good ad where it’s valuable to the user
01:06:23 and provides the monetization to the service.
01:06:29 And this might be search, this might be a website,
01:06:32 all of these, they do need the monetization
01:06:35 for them to provide that service.
01:06:38 But it should be done with a good balance between
01:06:43 showing just some random stuff that’s distracting
01:06:46 versus showing something that’s actually valuable.
01:06:49 So do you see it, moving forward, continuing
01:06:54 to be a model that funds businesses like Google,
01:07:00 as a significant revenue stream?
01:07:04 Because that’s one of the most exciting things
01:07:07 but also limiting things in the internet
01:07:09 is nobody wants to pay for anything.
01:07:11 And advertisements, again, at their best,
01:07:14 are actually really useful and not annoying.
01:07:16 Do you see that continuing and growing and improving
01:07:21 or is there, do you see sort of more Netflix type models
01:07:26 where you have to start to pay for content?
01:07:28 I think it’s a mix.
01:07:29 I think it’s gonna take a long while for everything
01:07:32 to be paid on the internet, if at all, probably not.
01:07:35 I mean, I think there’s always gonna be things
01:07:37 that are sort of monetized with things like ads.
01:07:40 But over the last few years, I would say
01:07:42 we’ve definitely seen that transition towards
01:07:45 more paid services across the web
01:07:48 and people are willing to pay for them
01:07:50 because they do see the value.
01:07:51 I mean, Netflix is a great example.
01:07:53 I mean, we have YouTube doing things.
01:07:56 People pay for the apps they buy.
01:07:58 More people I find are willing to pay for newspaper content
01:08:03 for the good news websites across the web.
01:08:07 That wasn’t the case
01:08:08 even a few years ago, I would say.
01:08:11 And I just see that change in myself as well
01:08:13 and just lots of people around me.
01:08:14 So definitely hopeful that we’ll transition
01:08:17 to that mix model where maybe you get
01:08:20 to try something out for free, maybe with ads,
01:08:24 but then there’s a more clear revenue model
01:08:27 that sort of helps go beyond that.
01:08:30 So speaking of revenue, how is it that a person
01:08:35 can use the TPU in a Google Colab for free?
01:08:39 So what’s the, I guess the question is,
01:08:43 what’s the future of TensorFlow in terms of empowering,
01:08:48 say, a class of 300 students?
01:08:51 And I’m asked by MIT, what is going to be the future
01:08:56 of them being able to do their homework in TensorFlow?
01:09:00 Like, where are they going to train these networks, right?
01:09:02 What’s that future look like with TPUs,
01:09:06 with cloud services, and so on?
01:09:08 I think a number of things there.
01:09:10 I mean, any TensorFlow open source,
01:09:12 you can run it wherever, you can run it on your desktop
01:09:15 and your desktops always keep getting more powerful,
01:09:17 so maybe you can do more.
01:09:19 My phone is like, I don’t know how many times
01:09:21 more powerful than my first desktop.
01:09:23 You probably won’t train it on your phone though.
01:09:25 Yeah, that’s true.
01:09:26 Right, so in that sense, the power you have
01:09:28 in your hands is a lot more.
01:09:31 Clouds are actually very interesting from, say,
01:09:34 a student’s or a course’s perspective,
01:09:36 because they make it very easy to get started.
01:09:40 I mean, Colab, the great thing about it is,
01:09:42 go to a website and it just works.
01:09:45 No installation needed, nothing to set up,
01:09:47 you’re just there and things are working.
01:09:50 That’s really the power of cloud as well.
01:09:52 And so I do expect that to grow.
01:09:55 Again, Colab is a free service.
01:09:57 It’s great to get started, to play with things,
01:10:00 to explore things.
01:10:03 That said, with free, you can only get so much.
01:10:06 Yeah.
01:10:08 So just like we were talking about,
01:10:10 free versus paid, yeah, there are services
01:10:12 you can pay for and get a lot more.
01:10:15 Great, so if I’m a complete beginner
01:10:17 interested in machine learning and TensorFlow,
01:10:19 what should I do?
01:10:21 Probably start with going to our website
01:10:23 and playing there.
01:10:24 So just go to TensorFlow.org and start clicking on things.
01:10:26 Yep, check out tutorials and guides.
01:10:28 There’s stuff you can just click there
01:10:29 and go to a Colab and do things.
01:10:31 No installation needed, you can get started right there.
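[Editor’s note: for readers who want the concrete version of that first step, the beginner flow in those tutorials looks roughly like the sketch below: load a standard dataset, define a small Keras model, train, evaluate, all runnable in a Colab. This is an illustrative sketch in the style of the beginner guides, not a copy of any one tutorial.]

```python
import tensorflow as tf

# Load a small standard dataset and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A tiny fully connected classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
```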
01:10:34 Okay, awesome, Rajat, thank you so much for talking today.
01:10:36 Thank you, Lex, it was great.