Regina Barzilay: Deep Learning for Cancer Diagnosis and Treatment #40

Transcript

00:00:00 The following is a conversation with Regina Barzilay.

00:00:03 She’s a professor at MIT and a world class researcher

00:00:06 in natural language processing

00:00:08 and applications of deep learning to chemistry and oncology

00:00:12 or the use of deep learning for early diagnosis,

00:00:15 prevention and treatment of cancer.

00:00:18 She has also been recognized for her teaching

00:00:21 of several successful AI related courses at MIT,

00:00:24 including the popular Introduction

00:00:26 to Machine Learning course.

00:00:28 This is the Artificial Intelligence podcast.

00:00:32 If you enjoy it, subscribe on YouTube,

00:00:34 give it five stars on iTunes, support it on Patreon

00:00:37 or simply connect with me on Twitter

00:00:39 at Lex Fridman spelled F R I D M A N.

00:00:43 And now here’s my conversation with Regina Barzilay.

00:00:48 In an interview you’ve mentioned

00:00:50 that if there’s one course you would take,

00:00:51 it would be a literature course

00:00:54 that a friend of yours teaches.

00:00:56 Just out of curiosity, because I couldn’t find anything

00:00:59 on it, are there books or ideas that had profound impact

00:01:04 on your life journey, books and ideas perhaps

00:01:07 outside of computer science and the technical fields?

00:01:11 I think because I’m spending a lot of my time at MIT

00:01:14 and previously in other institutions where I was a student,

00:01:18 I have limited ability to interact with people.

00:01:21 So a lot of what I know about the world

00:01:22 actually comes from books.

00:01:24 And there were quite a number of books

00:01:27 that had profound impact on me and how I view the world.

00:01:31 Let me just give you one example of such a book.

00:01:35 I’ve maybe a year ago read a book

00:01:39 called The Emperor of All Maladies.

00:01:42 It’s a book about, it’s kind of a history of science book

00:01:45 on how the treatments and drugs for cancer were developed.

00:01:50 And that book, despite the fact that I am in the business

00:01:54 of science, really opened my eyes on how imprecise

00:01:59 and imperfect the discovery process is

00:02:03 and how imperfect our current solutions are

00:02:06 and what makes science succeed and be implemented.

00:02:11 And sometimes it’s actually not the strengths of the idea,

00:02:14 but devotion of the person who wants to see it implemented.

00:02:17 So this is one of the books that, you know,

00:02:19 at least for the last year, quite changed the way

00:02:22 I’m thinking about scientific process

00:02:24 just from the historical perspective

00:02:26 and what I need to do to get my ideas really implemented.

00:02:33 Let me give you an example of a book

00:02:36 which is not kind of, which is a fiction book.

00:02:40 It’s a book called Americanah.

00:02:44 And this is a book about a young female student

00:02:48 who comes from Africa to study in the United States.

00:02:53 And it describes her path, you know, through her studies

00:02:57 and her life transformation, you know,

00:03:02 in a new country and kind of adaptation to a new culture.

00:03:06 And when I read this book, I saw myself

00:03:11 in many different points of it,

00:03:13 but it also kind of gave me the lens on different events.

00:03:20 And some of it I never actually paid attention to.

00:03:22 One of the funny stories in this book

00:03:24 is how she arrives at her new college

00:03:30 and she starts speaking in English

00:03:32 and she had this beautiful British accent

00:03:35 because that’s how she was educated in her country.

00:03:39 This is not my case.

00:03:40 And then she notices that the person who talks to her,

00:03:45 you know, talks to her in a very funny way,

00:03:47 in a very slow way.

00:03:48 And she’s thinking that this woman is disabled

00:03:51 and she’s also trying to kind of accommodate her.

00:03:54 And then after a while, when she finishes her discussion

00:03:56 with this officer from her college,

00:03:59 she sees how she interacts with the other students,

00:04:02 with American students.

00:04:03 And she discovers that actually she talked to her this way

00:04:08 because she saw that she doesn’t understand English.

00:04:11 And I thought, wow, this is a funny experience.

00:04:14 And literally within a few weeks,

00:04:16 I went to LA to a conference

00:04:20 and I asked somebody in the airport,

00:04:23 you know, how to find like a cab or something.

00:04:25 And then I noticed that this person is talking

00:04:28 in a very strange way.

00:04:29 And my first thought was that this person

00:04:31 has some, you know, pronunciation issues or something.

00:04:34 And I’m trying to talk very slowly to him

00:04:36 and I was with another professor, Ernest Fraenkel.

00:04:38 And he’s like laughing because it’s funny

00:04:42 that I don’t get that the guy is talking in this way

00:04:44 because he thinks that I cannot speak.

00:04:46 So it was really kind of mirroring experience.

00:04:49 And it led me to think a lot about my own experiences

00:04:53 moving, you know, from different countries.

00:04:56 So I think that books play a big role

00:04:59 in my understanding of the world.

00:05:01 On the science question, you mentioned that

00:05:06 it made you discover that personalities of human beings

00:05:09 are more important than perhaps ideas.

00:05:12 Is that what I heard?

00:05:13 It’s not necessarily that they are more important

00:05:15 than ideas, but I think that ideas on their own

00:05:19 are not sufficient.

00:05:20 And many times, at least at the local horizon,

00:05:24 it’s the personalities and their devotion to their ideas

00:05:29 that really change the landscape.

00:05:32 Now, if you’re looking at AI, like let’s say 30 years ago,

00:05:37 you know, dark ages of AI or whatever,

00:05:39 the symbolic times, you can use any word.

00:05:42 You know, there were some people,

00:05:44 now we’re looking at a lot of that work

00:05:46 and we’re kind of thinking this was not really

00:05:48 maybe a relevant work, but you can see that some people

00:05:52 managed to take it and to make it so shiny

00:05:54 and dominate the academic world

00:05:59 and make it to be the standard.

00:06:02 If you look at the area of natural language processing,

00:06:06 it is a well known fact that the reason that statistics

00:06:09 in NLP took such a long time to become mainstream

00:06:13 is because there were quite a number of personalities

00:06:16 who didn’t believe in this idea,

00:06:18 and that held back research progress in this area.

00:06:22 So I do not think that, you know,

00:06:25 kind of asymptotically maybe personalities matter,

00:06:28 but I think locally it does make quite a bit of impact

00:06:33 and it generally, you know,

00:06:36 speeds up the rate of adoption of the new ideas.

00:06:41 Yeah, and the other interesting question

00:06:43 is, in the early days of a particular discipline,

00:06:46 I think you mentioned that that book

00:06:50 is ultimately a book about cancer.

00:06:52 It’s called The Emperor of All Maladies.

00:06:55 Yeah, and those maladies, the treatments,

00:06:58 the medicine, what was it centered around?

00:07:00 So it was actually centered on, you know,

00:07:04 how people thought of curing cancer.

00:07:07 Like for me, it was really a discovery how people,

00:07:10 what was the science of chemistry behind drug development

00:07:14 that it actually grew out of the dyeing,

00:07:17 like the coloring industry; the people

00:07:19 who developed chemistry in the 19th century in Germany

00:07:23 and Britain did it to make, you know, really new dyes.

00:07:28 They looked at the molecules and identified

00:07:30 that they do certain things to cells.

00:07:32 And from there, the process started.

00:07:34 And, you know, historically speaking,

00:07:35 yeah, this is fascinating

00:07:36 that they managed to make the connection

00:07:38 and look under the microscope and do all this discovery.

00:07:42 But as you continue reading about it

00:07:44 and you read about how the chemotherapy drugs

00:07:48 were developed in Boston,

00:07:50 some of them were developed

00:07:52 by Dr. Farber, from Dana-Farber,

00:07:57 you know, how the experiments were done

00:08:00 that, you know, there was some miscalculation,

00:08:03 let’s put it this way.

00:08:04 And they tried it on the patients,

00:08:06 and those were children with leukemia, and they died.

00:08:09 And then they tried another modification.

00:08:11 You look at the process, how imperfect is this process?

00:08:15 And, you know, like, if we’re again looking back

00:08:17 like 60 years ago, 70 years ago,

00:08:19 you can kind of understand it.

00:08:20 But some of the stories in this book

00:08:23 which were really shocking to me

00:08:24 were really happening, you know, maybe decades ago.

00:08:27 And we still don’t have a vehicle

00:08:30 to do it much faster and more effectively and, you know,

00:08:35 more scientifically, the way I, coming from computer science, think of scientific.

00:08:38 So from the perspective of computer science,

00:08:40 you’ve gotten a chance to work on the application to cancer

00:08:43 and to medicine in general.

00:08:44 From a perspective of an engineer and a computer scientist,

00:08:48 how far along are we from understanding the human body

00:08:51 and its biology, and from being able to manipulate it

00:08:55 in a way that we can cure some of the maladies,

00:08:57 some of the diseases?

00:08:59 So this is a very interesting question.

00:09:03 And if you’re thinking as a computer scientist

00:09:06 about this problem, I think one of the reasons

00:09:09 that we succeeded in the areas

00:09:11 we as a computer scientist succeeded

00:09:13 is because we don’t have,

00:09:16 we are not trying to understand in some ways.

00:09:18 Like if you’re thinking about like eCommerce, Amazon,

00:09:22 Amazon doesn’t really understand you.

00:09:24 And that’s why it recommends you certain books

00:09:27 or certain products, correct?

00:09:30 And, you know, traditionally when people

00:09:34 were thinking about marketing, you know,

00:09:36 they divided the population into different kinds of subgroups,

00:09:39 identified the features of each subgroup

00:09:41 and came up with a strategy

00:09:43 which is specific to that subgroup.

00:09:45 If you’re looking at recommendation systems,

00:09:47 they’re not claiming that they’re understanding somebody,

00:09:50 they’re just managing to,

00:09:52 from the patterns of your behavior

00:09:54 to recommend you a product.

00:09:57 Now, if you look at traditional biology,

00:09:59 and obviously I wouldn’t say that I am

00:10:03 in any way, you know, educated in this field,

00:10:06 but from what I see, there’s really a lot of emphasis

00:10:09 on mechanistic understanding.

00:10:10 And it was very surprising to me

00:10:12 coming from computer science,

00:10:13 how much emphasis is on this understanding.

00:10:17 And given the complexity of the system,

00:10:20 maybe the deterministic full understanding

00:10:23 of this process is, you know, beyond our capacity.

00:10:27 And in the same way in computer science,

00:10:29 when we’re doing recognition, when we do recommendation

00:10:31 and many other areas,

00:10:32 it’s just a probabilistic matching process.

00:10:35 And in some way, maybe in certain cases,

00:10:40 we shouldn’t even attempt to understand

00:10:42 or we can attempt to understand, but in parallel,

00:10:45 we can actually do these kinds of matchings

00:10:48 that would help us to find the key

00:10:51 to doing early diagnostics and so on.

00:10:54 And I know that in these communities,

00:10:55 it’s really important to understand,

00:10:59 but I’m sometimes wondering, you know,

00:11:00 what exactly does it mean to understand here?

00:11:02 Well, there’s stuff that works,

00:11:05 but that can be, like you said,

00:11:07 separate from this deep human desire

00:11:10 to uncover the mysteries of the universe,

00:11:12 of science, of the way the body works,

00:11:16 the way the mind works.

00:11:17 It’s the dream of symbolic AI,

00:11:19 of being able to reduce human knowledge into logic

00:11:25 and be able to play with that logic

00:11:26 in a way that’s very explainable

00:11:28 and understandable for us humans.

00:11:30 I mean, that’s a beautiful dream.

00:11:31 So I understand it, but it seems that

00:11:34 what seems to work today and we’ll talk about it more

00:11:37 is as much as possible, reduce stuff into data,

00:11:40 reduce whatever problem you’re interested in to data

00:11:43 and try to apply statistical methods,

00:11:47 apply machine learning to that.

00:11:49 On a personal note,

00:11:51 you were diagnosed with breast cancer in 2014.

00:11:55 What did facing your mortality make you think about?

00:11:58 How did it change you?

00:12:00 You know, this is a great question

00:12:01 and I think that I was interviewed many times

00:12:03 and nobody actually asked me this question.

00:12:05 I think I was 43 at the time.

00:12:09 And the first time I realized in my life that I may die

00:12:12 and I never thought about it before.

00:12:14 And there is a long time from when you’re diagnosed

00:12:17 until you actually know what you have

00:12:18 and how severe your disease is.

00:12:20 For me, it was like maybe two and a half months.

00:12:23 And I didn’t know where I stood during this time

00:12:28 because I was getting different tests

00:12:30 and one would say it’s bad and another would say, no, it is not.

00:12:33 So until I knew where I stood,

00:12:34 I really was thinking about

00:12:36 all these different possible outcomes.

00:12:38 Were you imagining the worst

00:12:39 or were you trying to be optimistic or?

00:12:41 It would be really,

00:12:43 I don’t remember what my thinking was.

00:12:47 It was really a mixture with many components at the time

00:12:51 speaking in our terms.

00:12:54 And one thing that I remember,

00:12:59 and every test comes and then you’re saying,

00:13:01 oh, it could be this or it may not be this.

00:13:03 And you’re hopeful and then you’re desperate.

00:13:04 So it’s like, there is a whole slew of emotions

00:13:07 that goes through you.

00:13:09 But what I remember is that when I came back to MIT,

00:13:15 I was kind of coming, the whole time through the treatment,

00:13:17 to MIT, but my brain was not really there.

00:13:19 But when I came back, really finished my treatment

00:13:21 and I was here teaching and everything,

00:13:24 I looked back at what my group was doing,

00:13:27 what other groups were doing.

00:13:28 And I saw these trivialities.

00:13:30 It’s like people are building their careers

00:13:33 on improving some part by around two or three percent or whatever.

00:13:36 I was, it’s like, seriously?

00:13:38 I did work on how to decipher Ugaritic,

00:13:40 like a language that nobody speaks anymore, and whatever,

00:13:42 like what is the significance?

00:13:46 And all of a sudden, I walked out of MIT,

00:13:49 which is where people really do care

00:13:51 what happened to your ICLR paper,

00:13:54 what is your next publication at ACL,

00:13:57 into the world where you see a lot of suffering

00:14:01 that I’m kind of totally shielded from on a daily basis.

00:14:04 And it’s like the first time I’ve seen like real life

00:14:07 and real suffering.

00:14:09 And I was thinking, why are we trying to improve the parser

00:14:13 or deal with trivialities when we have capacity

00:14:18 to really make a change?

00:14:20 And it was really challenging to me because on one hand,

00:14:24 I have my graduate students who really want to do their papers

00:14:27 and their work, and they want to continue to do

00:14:29 what they were doing, which was great.

00:14:31 And then it was me who really kind of reevaluated

00:14:36 what is the importance.

00:14:37 And also at that point, because I had to take a break,

00:14:42 I looked back at my years in science

00:14:47 and I was thinking, like 10 years ago,

00:14:50 this was the biggest thing, I don’t know, topic models.

00:14:52 We have like millions of papers on topic models

00:14:55 and variations of topic models.

00:14:56 Now it’s totally like irrelevant.

00:14:58 And you start looking at this, what do you perceive

00:15:02 as important at different points in time

00:15:04 and how it fades over time.

00:15:08 And since we have a limited time,

00:15:12 all of us have limited time,

00:15:14 it’s really important to prioritize things

00:15:18 that really matter to you, maybe matter to you

00:15:20 at that particular point.

00:15:22 But it’s important to take some time

00:15:24 and understand what matters to you,

00:15:26 which may not necessarily be the same

00:15:28 as what matters to the rest of your scientific community

00:15:31 and pursue that vision.

00:15:34 So that moment, did it make you cognizant?

00:15:38 You mentioned suffering of just the general amount

00:15:42 of suffering in the world.

00:15:44 Is that what you’re referring to?

00:15:45 So as opposed to topic models

00:15:47 and specific detailed problems in NLP,

00:15:50 did you start to think about other people

00:15:54 who have been diagnosed with cancer?

00:15:56 Is that the way you started to see the world perhaps?

00:16:00 Oh, absolutely.

00:16:00 And it actually did, because, for instance,

00:16:04 there are parts of the treatment

00:16:05 where you need to go to the hospital every day

00:16:08 and you see this community of people,

00:16:11 and many of them are much worse off than I was at the time.

00:16:16 And you all of a sudden see it all.

00:16:20 And people who are happier some days

00:16:23 just because they feel better.

00:16:25 And for people who are in our normal realm,

00:16:28 you take it totally for granted that you feel well,

00:16:30 that if you decide to go running, you can go running

00:16:32 and you’re pretty much free

00:16:35 to do whatever you want with your body.

00:16:37 Like I saw like a community,

00:16:40 my community became those people.

00:16:42 And I remember one of my friends, Dina Katabi,

00:16:47 took me to Prudential to buy me a gift for my birthday.

00:16:50 And it was like the first time in months

00:16:52 that I went to kind of to see other people.

00:16:54 And I was like, wow, first of all, these people,

00:16:58 they are happy and they’re laughing

00:16:59 and they’re very different from these, my other people.

00:17:02 And the second thing, I think it’s totally crazy.

00:17:04 They’re like laughing and wasting their money

00:17:06 on some stupid gifts.

00:17:08 And they may die.

00:17:12 They already may have cancer and they don’t understand it.

00:17:15 So you can really see how the mind changes.

00:17:20 You can see that,

00:17:22 before that, you could ask,

00:17:23 didn’t you know that you’re gonna die?

00:17:24 Of course I knew, but it was a kind of a theoretical notion.

00:17:28 It wasn’t something which was concrete.

00:17:31 And at that point, when you really see it

00:17:33 and see how little means the system sometimes has

00:17:38 to help them, you really feel that we need to take a lot

00:17:41 of our brilliance that we have here at MIT

00:17:45 and translate it into something useful.

00:17:48 Yeah, and useful can have a lot of definitions,

00:17:50 but of course, alleviating suffering,

00:17:53 trying to cure cancer is a beautiful mission.

00:17:57 So I of course know theoretically the notion of cancer,

00:18:01 but just reading more and more about it, there are 1.7 million

00:18:07 new cancer cases in the United States every year,

00:18:09 600,000 cancer related deaths every year.

00:18:13 So this has a huge impact in the United States and globally.

00:18:19 Well, broadly, before we talk about how machine learning,

00:18:24 how MIT can help,

00:18:27 when do you think we as a civilization will cure cancer?

00:18:32 How hard of a problem is it from everything you’ve learned

00:18:34 about it recently?

00:18:37 I cannot really assess it.

00:18:39 What I do believe will happen with the advancement

00:18:42 in machine learning is that a lot of types of cancer

00:18:45 we will be able to predict way early

00:18:48 and more effectively utilize existing treatments.

00:18:53 I think, I hope at least that with all the advancements

00:18:57 in AI and drug discovery, we would be able

00:19:01 to much faster find relevant molecules.

00:19:04 What I’m not sure about is how long it will take

00:19:08 the medical establishment and regulatory bodies

00:19:11 to kind of catch up and to implement it.

00:19:14 And I think this is a very big piece of puzzle

00:19:17 that is currently not addressed.

00:19:20 That’s the really interesting question.

00:19:21 So first, a small detail that I think the answer is yes to,

00:19:25 but is cancer one of the diseases where, when detected earlier,

00:19:33 it significantly improves the outcomes?

00:19:37 Because, as we will talk about, there’s the cure

00:19:41 and then there is detection.

00:19:43 And I think where machine learning can really help

00:19:45 is earlier detection.

00:19:46 So does detection help?

00:19:48 Detection is crucial.

00:19:49 For instance, the vast majority of pancreatic cancer patients

00:19:53 are detected at the stage that they are incurable.

00:19:57 That’s why they have such a terrible survival rate.

00:20:03 It’s just a few percent over five years.

00:20:07 It’s pretty much a death sentence today.

00:20:09 But if you can discover this disease early,

00:20:14 there are mechanisms to treat it.

00:20:16 And in fact, I know a number of people who were diagnosed

00:20:20 and saved just because they had food poisoning.

00:20:23 They had terrible food poisoning.

00:20:25 They went to the ER, they got a scan.

00:20:28 There were early signs on the scan

00:20:30 and that saved their lives.

00:20:33 But this was really an accidental case.

00:20:35 So as we become better, we would be able to help

00:20:41 many more people that are likely to develop diseases.

00:20:46 And I just want to say that as I got more into this field,

00:20:51 I realized that cancer is of course a terrible disease,

00:20:53 but there is really a whole slew of terrible diseases

00:20:56 out there like neurodegenerative diseases and others.

00:21:01 So we, of course, a lot of us are fixated on cancer

00:21:04 because it’s so prevalent in our society.

00:21:06 And you see that there are a lot of patients

00:21:08 with neurodegenerative diseases

00:21:10 and the kind of aging diseases

00:21:12 that we still don’t have a good solution for.

00:21:17 And I felt that, as computer scientists,

00:21:22 we kind of decided that it’s other people’s job

00:21:25 to treat these diseases because it’s like traditionally

00:21:29 people in biology or in chemistry or MDs

00:21:32 are the ones who are thinking about it.

00:21:35 And after I kind of started paying attention,

00:21:37 I think that it’s really a wrong assumption

00:21:40 and we all need to join the battle.

00:21:42 So it seems like in cancer specifically

00:21:46 that there’s a lot of ways that machine learning can help.

00:21:49 So what’s the role of machine learning

00:21:51 in the diagnosis of cancer?

00:21:55 So for many cancers today, we really don’t know

00:21:58 what is your likelihood to get cancer.

00:22:03 And for the vast majority of patients,

00:22:06 especially on the younger patients,

00:22:07 it really comes as a surprise.

00:22:09 Like for instance, for breast cancer,

00:22:11 80% of the patients are the first in their families,

00:22:13 like me.

00:22:15 And I never thought that I had any increased risk

00:22:18 because nobody had it in my family.

00:22:20 And for some reason in my head,

00:22:22 it was kind of an inherited disease.

00:22:26 But even if I had paid attention,

00:22:28 the very simplistic statistical models

00:22:32 that are currently used in clinical practice,

00:22:34 they really don’t give you an answer, so you don’t know.

00:22:37 And the same is true for pancreatic cancer,

00:22:40 the same is true for non-smoking lung cancer and many others.

00:22:45 So what machine learning can do here

00:22:47 is utilize all this data to tell us early

00:22:51 who is likely to be susceptible

00:22:53 and using all the information that is already there,

00:22:55 be it imaging, be it your other tests,

00:22:59 and eventually liquid biopsies and others,

00:23:04 where the signal itself is not sufficiently strong

00:23:08 for human eye to do good discrimination

00:23:11 because the signal may be weak,

00:23:12 but by combining many sources,

00:23:15 machine which is trained on large volumes of data

00:23:18 can really detect it early.

00:23:20 And that’s what we’ve seen with breast cancer

00:23:22 and people are reporting it in other diseases as well.

00:23:25 That really boils down to data, right?

00:23:28 And in the different kinds of sources of data.

00:23:30 And you mentioned regulatory challenges.

00:23:33 So what are the challenges

00:23:35 in gathering large data sets in this space?

00:23:40 Again, another great question.

00:23:42 So after I decided that I wanted to work on it,

00:23:45 it took me two years to get access to data.

00:23:48 Any data, like any significant data set?

00:23:50 Any significant amount, like right now in this country,

00:23:53 there is no publicly available data set

00:23:57 of modern mammograms that you can just go

00:23:58 on your computer, sign a document and get it.

00:24:01 It just doesn’t exist.

00:24:03 I mean, obviously every hospital has its own collection

00:24:06 of mammograms.

00:24:07 There are data that came out of clinical trials.

00:24:11 What we’re talking about here is a computer scientist

00:24:13 who just wants to run his or her model

00:24:17 and see how it works.

00:24:19 This data, like ImageNet, doesn’t exist.

00:24:22 And there is a set which is called like Florida data set

00:24:28 which is film mammograms from the 90s

00:24:30 which is totally not representative

00:24:32 of the current developments.

00:24:33 Whatever you’re learning on them doesn’t scale up.

00:24:35 This is the only resource that is available.

00:24:39 And today there are many agencies

00:24:42 that govern access to data.

00:24:44 Like the hospital holds your data

00:24:46 and the hospital decides whether they would give it

00:24:49 to the researcher to work with this data or not.

00:24:52 Individual hospital?

00:24:54 Yeah.

00:24:55 I mean, the hospital may, you know,

00:24:57 assuming that you’re doing research collaboration,

00:24:59 you can submit, you know,

00:25:01 there is a proper approval process guided by the IRB

00:25:05 and if you go through all the processes,

00:25:07 you can eventually get access to the data.

00:25:10 But as you yourself know, in our AI community,

00:25:13 there are not that many people who actually ever got access

00:25:16 to data because it’s a very challenging process.

00:25:20 And sorry, just in a quick comment,

00:25:22 MGH or any kind of hospital,

00:25:25 are they scanning the data?

00:25:28 Are they digitally storing it?

00:25:29 Oh, it is already digitally stored.

00:25:31 You don’t need to do any extra processing steps.

00:25:34 It’s already there in the right format. It’s just that right now

00:25:38 there are a lot of issues that govern access to the data

00:25:41 because the hospital is legally responsible for the data.

00:25:46 And, you know, they have a lot to lose

00:25:51 if they give the data to the wrong person,

00:25:53 but they may not have a lot to gain if they,

00:25:56 as a hospital, as a legal entity, give it to you.

00:26:00 And the way, you know, what I would imagine

00:26:02 happening in the future is the same thing that happens

00:26:05 when you’re getting your driving license,

00:26:06 you can decide whether you want to donate your organs.

00:26:09 You can imagine that whenever a person goes to the hospital,

00:26:13 it should be easy for them to donate their data

00:26:17 for research, and it can be at different levels:

00:26:19 do they only give their test results or only mammograms,

00:26:22 or only imaging data, or the whole medical record?

00:26:27 Because at the end,

00:26:30 we all will benefit from all these insights.

00:26:33 And it’s not like you say, I want to keep my data private,

00:26:36 but I would really love to get it from other people

00:26:38 because other people are thinking the same way.

00:26:40 So if there is a mechanism to do this donation

00:26:45 and the patient has an ability to say

00:26:48 how they want to use their data for research,

00:26:50 it would be really a game changer.

00:26:54 People, when they think about this problem,

00:26:56 there’s a, it depends on the population,

00:26:58 depends on the demographics,

00:27:00 but there’s some privacy concerns generally,

00:27:03 not just medical data, just any kind of data.

00:27:05 It’s what you said, my data, it should belong kind of to me.

00:27:09 I’m worried how it’s going to be misused.

00:27:12 How do we alleviate those concerns?

00:27:17 Because that seems like a problem that needs to be,

00:27:19 that problem of trust, of transparency needs to be solved

00:27:22 before we build large data sets that help detect cancer,

00:27:27 help save those very people in the future.

00:27:30 So I think there are two things that could be done.

00:27:31 There are technical solutions

00:27:34 and there are societal solutions.

00:27:38 So on the technical end,

00:27:41 we today have ability to improve disambiguation.

00:27:48 Like, for instance, for imaging,

00:27:49 it’s, you know, for imaging, you can do it pretty well.

00:27:55 What’s disambiguation?

00:27:56 Disambiguation, sorry, I mean de-identification,

00:27:58 removing the identifying information,

00:27:59 removing the names of the people.

00:28:02 There are other data, like if it is raw text,

00:28:04 you cannot really achieve 99.9%,

00:28:08 but there are all these techniques

00:28:10 some of which are actually developed at MIT,

00:28:12 how you can do learning on the encoded data

00:28:15 where you locally encode the image,

00:28:17 you train a network which only works on the encoded images

00:28:22 and then you send the outcome back to the hospital

00:28:24 and you can open it up.

00:28:26 So those are the technical solutions.

00:28:28 There are a lot of people who are working in this space

00:28:30 where the learning happens in the encoded form.
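
To make the workflow just described a bit more concrete, here is a schematic sketch in Python. It is not the actual MIT scheme she alludes to; the random projection standing in for the local encoder and the least-squares classifier standing in for the real model are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hospital side: a secret random projection acts as a stand-in for the local encoder.
secret_key = rng.normal(size=(64 * 64, 256))            # stays inside the hospital

def encode(image):                                       # image: 64x64 array
    return image.reshape(-1) @ secret_key                # encoded feature vector

# The hospital encodes its images and shares only the encoded vectors plus labels.
images = rng.random((100, 64, 64))                       # placeholder images
labels = rng.integers(0, 2, size=100).astype(float)      # placeholder outcomes
encoded = np.stack([encode(im) for im in images])

# Researcher side: train a model on the encoded data without ever seeing raw images
# (here a trivial least-squares classifier, just to show where learning happens).
w, *_ = np.linalg.lstsq(encoded, labels, rcond=None)

# Predictions on encoded images go back to the hospital, which can "open them up",
# i.e. relate them to the original patients.
predictions = (encoded @ w > 0.5).astype(int)
```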

00:28:33 We are still early,

00:28:36 but this is an interesting research area

00:28:39 where I think we’ll make more progress.

00:28:43 There is a lot of work in the natural language processing

00:28:45 community on how to do de-identification better.

00:28:50 But even today, there is already a lot of data

00:28:54 which can be de-identified perfectly,

00:28:55 like your test results, for instance, correct,

00:28:58 where you know where the name of the patient is,

00:29:00 and you just want to extract the part with the numbers.
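
As a toy illustration of that kind of easily de-identifiable test data, here is a minimal sketch that keeps only the numeric measurements from a made-up lab report; the report text and field names are invented for the example, not taken from any real record format.

```python
import re

# Hypothetical free-text lab report (all names and values invented).
report = "Patient: Jane Doe. HbA1c: 5.9 %. LDL: 128 mg/dL. Glucose: 101 mg/dL."

# Keep only (measurement, value, unit) triples; the patient identity line has no
# numeric value after the colon, so it is dropped automatically.
matches = re.findall(r"([A-Za-z0-9]+):\s*([\d.]+)\s*([%\w/]+)", report)
deidentified = {name: (float(value), unit) for name, value, unit in matches}
print(deidentified)
# {'HbA1c': (5.9, '%'), 'LDL': (128.0, 'mg/dL'), 'Glucose': (101.0, 'mg/dL')}
```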

00:29:04 The big problem here is again,

00:29:08 hospitals don’t see much incentive

00:29:10 to give this data away on one hand

00:29:12 and then there is general concern.

00:29:14 Now, when I’m talking about societal benefits

00:29:17 and about the education,

00:29:19 the public needs to understand that I think

00:29:25 that there are situations, and I still remember myself

00:29:29 when I really needed an answer, I had to make a choice.

00:29:33 There was no information to make a choice,

00:29:35 you’re just guessing.

00:29:36 And at that moment you feel that your life is at stake,

00:29:41 but you just don’t have information to make the choice.

00:29:44 And many times when I give talks,

00:29:48 I get emails from women who say,

00:29:51 you know, I’m in this situation,

00:29:52 can you please run the statistics and see what the outcomes are?

00:29:57 We get almost every week a mammogram that comes by mail

00:30:01 to my office at MIT, I’m serious.

00:30:04 that people ask us to run because they need to make

00:30:07 life changing decisions.

00:30:10 And of course, I’m not planning to open a clinic here,

00:30:12 but we do run it and give them the results for their doctors.

00:30:16 But the point that I’m trying to make is

00:30:20 that we all at some point or our loved ones

00:30:23 will be in the situation where you need information

00:30:26 to make the best choice.

00:30:28 And if this information is not available,

00:30:31 you would feel vulnerable and unprotected.

00:30:35 And then the question is, you know, what do I care about more?

00:30:37 Because at the end, everything is a trade off, correct?

00:30:40 Yeah, exactly.

00:30:41 Just out of curiosity, it seems like one possible solution,

00:30:45 I’d like to see what you think of it,

00:30:49 based on what you just said,

00:30:50 based on wanting to know answers

00:30:52 for when you’re yourself in that situation.

00:30:55 Is it possible for patients to own their data

00:30:58 as opposed to hospitals owning their data?

00:31:01 Of course, theoretically, I guess patients own their data,

00:31:04 but can you walk out there with a USB stick

00:31:07 containing everything or upload it to the cloud?

00:31:10 Where a company, you know, I remember Microsoft

00:31:14 had a service that I tried, I was really excited about it,

00:31:17 and Google Health was there.

00:31:19 I tried it, I was excited about it.

00:31:21 Basically companies helping you upload your data

00:31:24 to the cloud so that you can move from hospital to hospital

00:31:27 from doctor to doctor.

00:31:29 Do you see a promise of that kind of possibility?

00:31:32 I absolutely think this is, you know,

00:31:34 the right way to exchange the data.

00:31:38 I don’t know now who’s the biggest player in this field,

00:31:41 but I can clearly see that even for totally selfish

00:31:45 health reasons, when you are going to a new facility

00:31:49 and many of us are sent to some specialized treatment,

00:31:52 they don’t easily have access to your data.

00:31:55 And today, you know, if we want to send this mammogram,

00:31:59 we need to go to the hospital, find some small office

00:32:01 which gives you the CD, and they ship it as a CD.

00:32:04 So you can imagine we’re looking at kind of decades old

00:32:08 mechanism of data exchange.

00:32:11 So I definitely think this is an area where hopefully

00:32:15 all the right regulatory and technical forces will align

00:32:20 and we will see it actually implemented.

00:32:23 It’s sad because unfortunately, and I need to research

00:32:27 why that happened, but I’m pretty sure Google Health

00:32:30 and Microsoft HealthVault or whatever it’s called

00:32:32 both closed down, which means that there was

00:32:36 either regulatory pressure or there’s not a business case

00:32:39 or there’s challenges from hospitals,

00:32:41 which is very disappointing.

00:32:43 So when you say you don’t know who the biggest players are,

00:32:46 the two biggest that I was aware of closed their doors.

00:32:50 So I’m hoping, I’d love to see why

00:32:53 and I’d love to see who else can come up.

00:32:54 It seems like one of those Elon Musk style problems

00:32:59 that obviously need to be solved

00:33:01 and somebody needs to step up and actually do

00:33:02 this large scale data collection.

00:33:07 So I know there is an initiative in Massachusetts,

00:33:09 I think, which is led by the governor

00:33:11 to try to create this kind of health exchange system

00:33:15 which would at least help people when you show up

00:33:17 in the emergency room and there is no information

00:33:20 about what your allergies are and other things.

00:33:23 So I don’t know how far it will go.

00:33:26 But another thing that you said

00:33:28 and I find it very interesting is actually

00:33:30 who are the successful players in this space

00:33:33 and the whole implementation, how does it go?

00:33:37 To me, from the anthropological perspective,

00:33:40 it’s almost more fascinating than the AI itself, how AI today goes into healthcare;

00:33:44 we’ve seen so many attempts and so very few successes.

00:33:50 And it’s interesting to understand, though I by no means

00:33:54 have the knowledge to assess it,

00:33:56 why we are in the position where we are.

00:33:59 Yeah, it’s interesting because data is really fuel

00:34:02 for a lot of successful applications.

00:34:04 And when that data requires regulatory approval,

00:34:08 like the FDA or any kind of approval,

00:34:12 it seems that the computer scientists

00:34:15 are not quite there yet in being able

00:34:17 to play the regulatory game,

00:34:18 understanding the fundamentals of it.

00:34:21 I think that in many cases, even when people do have data,

00:34:26 we still don’t know what exactly you need to demonstrate

00:34:31 to change the standard of care.

00:34:35 Like let me give you an example

00:34:37 related to my breast cancer research.

00:34:41 So in traditional breast cancer risk assessment,

00:34:45 there is something called density,

00:34:47 which determines the likelihood of a woman to get cancer.

00:34:50 And this pretty much says,

00:34:51 how much white do you see on the mammogram?

00:34:54 The whiter it is, the more likely the tissue is dense.

00:34:58 And the idea behind density, it’s not a bad idea.

00:35:03 In 1967, a radiologist called Wolfe decided to look back

00:35:08 at women who were diagnosed

00:35:09 and see what is special in their images.

00:35:12 Can we look back and say that they were likely to develop it?

00:35:14 So he came up with some patterns.

00:35:16 And it was the best that his human eye can identify.

00:35:20 Then it was kind of formalized

00:35:22 and coded into four categories.

00:35:24 And that’s what we are using today.

00:35:26 And today this density assessment

00:35:31 is actually a federal law from 2019,

00:35:34 approved by President Trump

00:35:36 and by the previous FDA commissioner,

00:35:40 where women are supposed to be advised by their providers

00:35:43 if they have high density,

00:35:45 putting them into higher risk category.

00:35:47 And in some states,

00:35:49 you can actually get supplementary screening

00:35:51 paid by your insurance because you’re in this category.

00:35:53 Now you can say, how much science do we have behind it?

00:35:56 Whatever, biological science or epidemiological evidence.

00:36:00 So it turns out that between 40 and 50% of women

00:36:05 have dense breasts.

00:36:06 So about 40% of patients are coming out of their screening

00:36:11 and somebody tells them, you are in high risk.

00:36:15 Now, what exactly does it mean

00:36:16 if you, as half of the population, are in high risk?

00:36:19 It’s far from clear; maybe I’m not,

00:36:22 or what do I really need to do with it?

00:36:23 Because the system doesn’t provide me

00:36:27 a lot of the solutions

00:36:28 because there are so many people like me,

00:36:30 we cannot really provide very expensive solutions for them.

00:36:34 And the reason this whole density became this big deal,

00:36:38 is that it was actually advocated for by the patients

00:36:40 who felt very unprotected

00:36:42 because many women went and did the mammograms

00:36:44 which were normal.

00:36:46 And then it turns out that they already had cancer,

00:36:49 quite developed cancer.

00:36:50 So they didn’t have a way to know who is really at risk

00:36:54 and what is the likelihood that when the doctor tells you,

00:36:56 you’re okay, you are not okay.

00:36:58 So at the time, and it was 15 years ago,

00:37:02 this maybe was the best piece of science that we had.

00:37:06 And it took about 15, 16 years to make it federal law.

00:37:12 But now this is a standard.

00:37:15 Now with a deep learning model,

00:37:17 we can so much more accurately predict

00:37:19 who is gonna develop breast cancer

00:37:21 just because you’re trained on longitudinal data.

00:37:23 And instead of describing how much white

00:37:26 and what kind of white, the machine

00:37:27 can systematically identify the patterns,

00:37:30 which was the original idea behind the thought

00:37:32 of that radiologist;

00:37:33 machines can do it much more systematically

00:37:35 and predict the risk, when you’re training the machine

00:37:38 to look at the image and to say what the risk is in one to five years.
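
A minimal sketch of the setup she describes, training on the image and outputting risk over one- to five-year horizons, might look like the following. This is an illustration of the framing only, not her group's actual model; the toy CNN, the image size, and the made-up labels are all placeholders.

```python
import torch
import torch.nn as nn

class RiskModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(                   # toy stand-in for a real image backbone
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(32, 5)                     # one logit per follow-up year (1..5)

    def forward(self, x):
        return self.head(self.backbone(x))               # logits; sigmoid turns them into risks

model = RiskModel()
images = torch.randn(8, 1, 256, 256)                     # fake mammograms
labels = torch.randint(0, 2, (8, 5)).float()             # did cancer occur within year k?
loss = nn.BCEWithLogitsLoss()(model(images), labels)
loss.backward()                                          # train as usual from here
risk_1_to_5_years = torch.sigmoid(model(images))         # predicted risk per horizon
```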

00:37:42 Now you can ask me how long it will take

00:37:45 to substitute this density measure,

00:37:46 which is broadly used across the country

00:37:48 and really is not helping, and to bring in these new models.

00:37:54 And I would say it’s not a matter of the algorithm.

00:37:56 The algorithms are already orders of magnitude better

00:37:58 than what is currently in practice.

00:38:00 I think it’s really the question,

00:38:02 who do you need to convince?

00:38:04 How many hospitals do you need to run the experiment?

00:38:07 What, you know, all this mechanism of adoption

00:38:11 and how do you explain to patients

00:38:15 and to women across the country

00:38:17 that this is really a better measure?

00:38:20 And again, I don’t think it’s an AI question.

00:38:22 We can work more and make the algorithm even better,

00:38:25 but I don’t think that this is the current, you know,

00:38:29 the barrier, the barrier is really this other piece

00:38:32 that for some reason is not really explored.

00:38:35 It’s like anthropological piece.

00:38:36 And coming back to your question about books,

00:38:39 there is a book that I’m reading.

00:38:42 It’s called An American Sickness, by Elisabeth Rosenthal.

00:38:48 And I got this book from my clinical collaborator,

00:38:51 Dr. Connie Lehman.

00:38:53 And I said, I know everything that I need to know

00:38:54 about the American health system,

00:38:56 but you know, every page never fails to surprise me.

00:38:59 And I think there are a lot of interesting

00:39:03 and really deep lessons for people like us

00:39:06 from computer science who are coming into this field

00:39:09 to really understand how complex the system of incentives is,

00:39:13 and to understand how you really need to play

00:39:17 to drive adoption.

00:39:19 You just said it’s complex,

00:39:21 but if we’re trying to simplify it,

00:39:23 which group of people do you think would most likely lead to success

00:39:27 if we push on them?

00:39:29 Is it the doctors?

00:39:30 Is it the hospitals?

00:39:31 Is it the governments or policymakers?

00:39:34 Is it the individual patients, consumers?

00:39:38 Who needs to be inspired to most likely lead to adoption?

00:39:45 Or is there no simple answer?

00:39:47 There’s no simple answer,

00:39:48 but I think there are a lot of good people in the medical system

00:39:51 who do want to make a change.

00:39:56 And I think a lot of power will come from us as consumers

00:40:01 because we all are consumers or future consumers

00:40:04 of healthcare services.

00:40:06 And I think we can do so much more

00:40:12 in explaining the potential and not in the hype terms

00:40:15 and not saying that we have now cured Alzheimer’s,

00:40:17 and I’m really sick of reading these kinds of articles

00:40:20 which make these claims,

00:40:22 but really to show with some examples

00:40:24 what this implementation does and how it changes the care.

00:40:29 Because I can’t imagine,

00:40:30 it doesn’t matter what kind of politician it is,

00:40:33 we all are susceptible to these diseases.

00:40:35 There is no one who is free.

00:40:37 And eventually, we all are humans

00:40:41 and we’re looking for a way to alleviate the suffering.

00:40:44 And this is one possible way

00:40:47 where we currently are under utilizing,

00:40:49 which I think can help.

00:40:51 So it sounds like the biggest problems are outside of AI

00:40:55 in terms of the biggest impact at this point.

00:40:57 But are there any open problems

00:41:00 in the application of ML to oncology in general?

00:41:03 So improving the detection or any other creative methods,

00:41:07 whether it’s on the detection or segmentation side

00:41:09 or the vision perception side

00:41:11 or some other clever form of inference?

00:41:16 Yeah, what in general in your view are the open problems

00:41:19 in this space?

00:41:20 Yeah, I just want to mention that besides detection,

00:41:22 another area where I am kind of quite active,

00:41:24 and I think it’s really an increasingly important area

00:41:28 in healthcare, is drug design.

00:41:32 Absolutely.

00:41:33 Because it’s fine if you detect something early,

00:41:36 but you still need to get drugs

00:41:41 and new drugs for these conditions.

00:41:43 And today, in all of drug design,

00:41:46 ML is essentially nonexistent.

00:41:48 We don’t have any drug that was developed by an ML model,

00:41:52 or even, not developed,

00:41:54 but at least where we know that an ML model

00:41:57 played some significant role.

00:41:59 I think this area with all the new ability

00:42:03 to generate molecules with desired properties

00:42:05 to do in silico screening, is really a big open area.

00:42:11 To be totally honest with you,

00:42:12 when we are doing diagnostics and imaging,

00:42:14 we are primarily taking the ideas that were developed

00:42:17 for other areas and applying them with some adaptation;

00:42:20 the area of drug design is really a technically interesting

00:42:26 and exciting area.

00:42:27 You need to work a lot with graphs

00:42:30 and capture various 3D properties.

00:42:34 There are lots and lots of opportunities

00:42:37 to be technically creative.

00:42:39 And I think there are a lot of open questions in this area.

00:42:46 We’re already getting a lot of successes

00:42:48 even with kind of the first generation of these models,

00:42:52 but there are many more new creative things that you can do.

00:42:56 And what’s very nice to see is that actually

00:42:59 the more powerful, the more interesting models

00:43:04 actually do do better.

00:43:05 So there is a place to innovate in machine learning

00:43:11 in this area.

00:43:13 And some of these techniques are really unique to,

00:43:16 let’s say, to graph generation and other things.

00:43:19 So…

00:43:20 What, just to interrupt really quick, I’m sorry,

00:43:23 graph generation or graphs, drug discovery in general,

00:43:30 how do you discover a drug?

00:43:31 Is this chemistry?

00:43:33 Is this trying to predict different chemical reactions?

00:43:37 Or is it some kind of…

00:43:39 What do graphs even represent in this space?

00:43:42 Oh, sorry, sorry.

00:43:43 And what’s a drug?

00:43:45 Okay, so let’s say you’re thinking

00:43:47 there are many different types of drugs,

00:43:48 but let’s say you’re gonna talk about small molecules

00:43:50 because I think today the majority of drugs

00:43:52 are small molecules.

00:43:53 So a small molecule is a graph.

00:43:55 The molecule is just a graph where a node

00:43:59 is an atom and then the edges are the bonds.

00:44:01 So it’s really a graph representation.

00:44:03 If you look at it in 2D, correct,

00:44:05 you can do it 3D, but let’s say,

00:44:07 let’s keep it simple and stick in 2D.
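
To make the graph framing concrete, here is a minimal sketch, not from the conversation itself, of a small molecule as a graph whose nodes are atoms and whose edges are bonds, using ethanol as an arbitrary example and plain Python with no chemistry library.

```python
# Ethanol (CH3-CH2-OH) with hydrogens left implicit, as is common in 2D molecular graphs.
atoms = ["C", "C", "O"]                                  # node labels
bonds = [(0, 1, "single"), (1, 2, "single")]             # edges: (atom index, atom index, bond type)

# Adjacency-list view of the same graph, the form most graph models consume.
adjacency = {i: [] for i in range(len(atoms))}
for a, b, bond_type in bonds:
    adjacency[a].append((b, bond_type))
    adjacency[b].append((a, bond_type))

print(adjacency)
# {0: [(1, 'single')], 1: [(0, 'single'), (2, 'single')], 2: [(1, 'single')]}
```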

00:44:11 So pretty much my understanding today,

00:44:14 how it is done at scale in the companies,

00:44:18 without machine learning,

00:44:20 you have high throughput screening.

00:44:22 So you know that you are interested

00:44:23 to get certain biological activity of the compound.

00:44:26 So you scan a lot of compounds,

00:44:28 like maybe hundreds of thousands,

00:44:30 some really big number of compounds.

00:44:32 You identify some compounds which have the right activity

00:44:36 and then at this point, the chemists come

00:44:39 and they’re now trying to optimize

00:44:43 this original hit for different properties:

00:44:45 you want it to be maybe soluble,

00:44:47 you want it to decrease toxicity,

00:44:49 you want it to decrease the side effects.

00:44:51 Are those, sorry again to interrupt,

00:44:54 can that be done in simulation

00:44:55 or just by looking at the molecules

00:44:57 or do you need to actually run reactions

00:44:59 in real labs with lab coats and stuff?

00:45:02 So when you do high throughput screening,

00:45:04 you really do screening.

00:45:06 It’s in the lab.

00:45:07 It’s really the lab screening.

00:45:09 You screen the molecules, correct?

00:45:10 I don’t know what screening is.

00:45:12 Screening is just checking them for a certain property.

00:45:15 Like in the physical space, in the physical world,

00:45:17 like actually there’s a machine probably

00:45:18 that’s actually running the reaction.

00:45:21 Actually running the reactions, yeah.

00:45:22 So there is a process where you can run

00:45:25 and that’s why it’s called high throughput,

00:45:26 because it has become cheaper and faster

00:45:29 to do it on a very big number of molecules.

00:45:33 You run the screening,

00:45:35 you identify potential good starts

00:45:40 and then when the chemists come in

00:45:42 who have done it many times

00:45:44 and then they can try to look at it and say,

00:45:46 how can you change the molecule

00:45:48 to get the desired profile

00:45:51 in terms of all other properties?

00:45:53 So maybe how do I make it more bioactive and so on?

00:45:56 And there the creativity of the chemists

00:45:59 really is the one that determines the success

00:46:03 of this design because again,

00:46:07 they have a lot of domain knowledge

00:46:09 of what works, how do you decrease the CCD and so on

00:46:12 and that’s what they do.

00:46:15 So all the drugs that are currently

00:46:17 FDA approved,

00:46:20 or even drugs that are in clinical trials,

00:46:22 they are designed by these domain experts

00:46:27 who go through this combinatorial space

00:46:30 of molecules or graphs or whatever

00:46:31 and find the right one or adjust it to be the right one.

00:46:35 It sounds like the breast density heuristic

00:46:38 from ’67, it has the same echoes.

00:46:40 It’s not necessarily that.

00:46:41 It’s really driven by deep understanding.

00:46:45 It’s not like they just observe it.

00:46:46 I mean, they do deeply understand chemistry

00:46:48 and they do understand how different groups

00:46:50 and how they change the properties.

00:46:53 So there is a lot of science that gets into it

00:46:56 and a lot of kind of simulation,

00:46:58 how do you want it to behave?

00:47:01 It’s very, very complex.

00:47:03 So they’re quite effective at this design, obviously.

00:47:06 Now effective, yeah, we have drugs.

00:47:08 Like depending on how you measure effective,

00:47:10 if you measure it in terms of cost, it’s prohibitive.

00:47:13 If you measure it in terms of time,

00:47:15 we have lots of diseases for which we don’t have any drugs

00:47:18 and we don’t even know how to approach

00:47:20 and I don’t need to mention the few drugs

00:47:23 for neurodegenerative diseases, drugs that fail.

00:47:27 So there are lots of trials that fail in later stages,

00:47:32 which is really catastrophic from the financial perspective.

00:47:35 So is it the most effective mechanism?

00:47:39 Absolutely not, but this is the only one that currently works.

00:47:44 And I was closely interacting

00:47:47 with people in pharmaceutical industry.

00:47:49 I was really fascinated by how sharp they are

00:47:51 and what a deep understanding of the domain they have.

00:47:55 It’s not observation driven.

00:47:57 There is really a lot of science behind what they do.

00:48:00 But if you ask me, can machine learning change it,

00:48:02 I firmly believe yes,

00:48:05 because even the most experienced chemists

00:48:07 cannot hold in their memory and understanding

00:48:11 everything that you can learn

00:48:12 from millions of molecules and reactions.

00:48:17 And the space of graphs is a totally new space.

00:48:19 I mean, it’s a really interesting space

00:48:22 for machine learning to explore, graph generation.

00:48:23 Yeah, so there are a lot of things that you can do here.

00:48:26 So we do a lot of work.

00:48:28 So the first tool that we started with

00:48:31 was the tool that can predict properties of the molecules.

00:48:36 So you can just give the molecule and the property.

00:48:39 It can be a bioactivity property,

00:48:41 or it can be some other property.

00:48:44 And you train on the molecules

00:48:46 and you can now take a new molecule

00:48:50 and predict this property.

00:48:52 Now, when people started working in this area,

00:48:54 they did something very simple.

00:48:55 They took kind of existing fingerprints,

00:48:58 which are kind of handcrafted features of the molecule,

00:49:00 where you break the graph into substructures

00:49:02 and then you run it through a feed forward neural network.

00:49:05 And what was interesting to see was that clearly,

00:49:08 this was not the most effective way to proceed.

00:49:11 And you need to have much more complex models

00:49:14 that can induce a representation,

00:49:16 which can translate this graph into the embeddings

00:49:19 and do these predictions.

00:49:21 So this is one direction.
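
A minimal sketch of that first-generation setup, a fixed fingerprint fed into a feed-forward network, might look like the following. The substring-based "fingerprint" is a toy stand-in for real handcrafted fingerprints such as Morgan/ECFP, and the molecules and property values are made up; nothing here is her group's actual model. The more complex models she mentions replace this handcrafted vector with an embedding learned directly from the molecular graph.

```python
import torch
import torch.nn as nn

FRAGMENTS = ["C", "O", "N", "CO", "CC", "C=O"]           # toy substructure vocabulary

def fingerprint(smiles: str) -> torch.Tensor:
    # 1 if the substring (standing in for a substructure match) occurs, else 0
    return torch.tensor([float(frag in smiles) for frag in FRAGMENTS])

# Feed-forward network from fingerprint to a single predicted property value.
mlp = nn.Sequential(nn.Linear(len(FRAGMENTS), 32), nn.ReLU(), nn.Linear(32, 1))

smiles_list = ["CCO", "CC(=O)O", "CCN"]                  # ethanol, acetic acid, ethylamine
targets = torch.tensor([[0.2], [0.7], [0.4]])            # made-up property values
x = torch.stack([fingerprint(s) for s in smiles_list])

loss = nn.MSELoss()(mlp(x), targets)
loss.backward()                                          # train as usual from here
```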

00:49:23 Then another direction, which is kind of related,

00:49:25 is not to stop at predicting from the embedding itself,

00:49:29 but to actually modify it to produce better molecules.

00:49:32 So you can think about it as machine translation:

00:49:36 you can start with a molecule

00:49:38 and then there is an improved version of the molecule.

00:49:40 And you can again, with an encoder, translate it

00:49:42 into the hidden space and then learn how to modify it

00:49:45 to produce an improved, in some ways, version of the molecule.
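
A minimal sketch of that translation framing, assuming paired (molecule, improved molecule) training data and a character-level SMILES representation, might look like the following; it is an illustration of the idea, not her group's model, and the vocabulary and example pair are invented.

```python
import torch
import torch.nn as nn

VOCAB = list("CNO()=#123456789cno")                      # toy SMILES character vocabulary
stoi = {ch: i + 1 for i, ch in enumerate(VOCAB)}         # index 0 is reserved for padding

def encode_smiles(s, max_len=20):
    ids = [stoi[ch] for ch in s][:max_len]
    return torch.tensor(ids + [0] * (max_len - len(ids)))

class MoleculeTranslator(nn.Module):
    def __init__(self, vocab_size=len(VOCAB) + 1, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.embed(src))             # h: hidden embedding of the input molecule
        dec, _ = self.decoder(self.embed(tgt), h)        # decode the improved molecule, conditioned on h
        return self.out(dec)                             # per-position character logits

model = MoleculeTranslator()
src = encode_smiles("CCO").unsqueeze(0)                  # original molecule
tgt = encode_smiles("CC(=O)O").unsqueeze(0)              # its "improved" counterpart (made-up pair)
logits = model(src, tgt)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, logits.size(-1)), tgt.reshape(-1))
loss.backward()                                          # train as usual from here
```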

00:49:49 So that’s, it’s kind of really exciting.

00:49:52 We already have seen that the property prediction

00:49:54 works pretty well.

00:49:56 And now we are generating molecules

00:49:59 and there are actually labs

00:50:01 which are manufacturing these molecules.

00:50:04 So we’ll see where it will get us.

00:50:06 Okay, that’s really exciting.

00:50:07 There’s a lot of promise.

00:50:08 Speaking of machine translation and embeddings,

00:50:11 I think you have done a lot of really great research

00:50:15 in NLP, natural language processing.

00:50:19 Can you tell me your journey through NLP?

00:50:21 What ideas, problems, approaches were you working on?

00:50:25 Were you fascinated with, did you explore

00:50:28 before this magic of deep learning reemerged and after?

00:50:34 So when I started my work in NLP, it was in 97.

00:50:38 This was a very interesting time.

00:50:39 It was exactly the time that I came to ACL.

00:50:43 And at the time I could barely understand English,

00:50:46 but it was exactly like the transition point

00:50:48 because half of the papers were really rule based approaches

00:50:53 where people took more kind of heavy linguistic approaches

00:50:56 for small domains and tried to build up from there.

00:51:00 And then there were the first generation of papers

00:51:02 which were corpus based papers.

00:51:04 And they were very simple in our terms

00:51:06 when you collect some statistics

00:51:07 and do prediction based on them.

00:51:10 And I found it really fascinating that one community

00:51:13 can think so very differently about the problem.

00:51:19 And I remember my first paper that I wrote,

00:51:22 it didn’t have a single formula.

00:51:24 It didn’t have evaluation.

00:51:25 It just had examples of outputs.

00:51:28 And this was a standard of the field at the time.

00:51:32 In some ways, I mean, people maybe just started emphasizing

00:51:35 the empirical evaluation, but for many applications

00:51:38 like summarization, you just show some examples of outputs.

00:51:42 And then increasingly you can see how

00:51:45 the statistical approaches dominated the field

00:51:48 and we’ve seen increased performance

00:51:52 across many basic tasks.

00:51:56 The sad part of the story maybe is that if you look again

00:52:00 through this journey, we see that the role of linguistics

00:52:05 in some ways greatly diminishes.

00:52:07 And I think that you really need to look

00:52:11 through the whole proceedings to find one or two papers

00:52:14 which make some interesting linguistic references.

00:52:17 It’s really big.

00:52:18 Today, yeah.

00:52:18 Today, today.

00:52:19 This was definitely one of the.

00:52:20 Things like syntactic trees, just even, basically,

00:52:23 going back to our conversation about human understanding

00:52:26 of language, which I guess is what linguistics would be,

00:52:30 a structured, hierarchical way of representing language

00:52:34 in a way that's human explainable, understandable,

00:52:37 is missing today.

00:52:39 I don’t know if it is, what is explainable

00:52:42 and understandable.

00:52:43 In the end, we perform functions and it’s okay

00:52:47 to have machine which performs a function.

00:52:50 Like when you’re thinking about your calculator, correct?

00:52:53 Your calculator can do calculations very differently

00:52:56 from how you would do the calculation,

00:52:57 but it's very effective at it.

00:52:58 And this is fine. If we can achieve certain tasks

00:53:02 with high accuracy, it doesn't necessarily mean

00:53:05 that it has to understand them the same way as we understand.

00:53:09 In some ways, it's even naive to request that,

00:53:11 because you have so many other sources of information

00:53:14 that are absent when you are training your system.

00:53:17 So it’s okay.

00:53:19 Does it deliver?

00:53:20 And I would tell you one application

00:53:21 that is really fascinating.

00:53:22 In 97, when I came to ACL, there were some papers

00:53:25 on machine translation.

00:53:25 They were, like, primitive.

00:53:27 People were trying really, really simple things.

00:53:31 And the feeling, my feeling was that, you know,

00:53:34 to make real machine translation system,

00:53:36 it’s like to fly at the moon and build a house there

00:53:39 and the garden and live happily ever after.

00:53:41 I mean, it’s like impossible.

00:53:42 I never could imagine that within, you know, 10 years,

00:53:46 we would already see the system working.

00:53:48 And now, you know, nobody is even surprised

00:53:51 to utilize the system on a daily basis.

00:53:54 So this was like huge, huge progress,

00:53:56 considering that people for a very long time

00:53:57 tried to solve it using other mechanisms.

00:54:00 And they were unable to solve it.

00:54:03 That’s why coming back to your question about biology,

00:54:06 that, you know, in linguistics, people try to go this way

00:54:10 and try to write the syntactic trees

00:54:13 and try to abstract it and to find the right representation.

00:54:17 And, you know, they couldn’t get very far

00:54:22 with this understanding, while these models, using,

00:54:26 you know, other sources, are actually capable

00:54:29 of making a lot of progress.

00:54:31 Now, I’m not naive to think

00:54:33 that we are in this paradise space in NLP.

00:54:36 And sure as you know,

00:54:38 that when we slightly change the domain

00:54:40 and when we decrease the amount of training,

00:54:42 it can do like really bizarre and funny things.

00:54:44 But I think it’s just a matter

00:54:46 of improving generalization capacity,

00:54:48 which is just a technical question.

00:54:51 Wow, so that’s the question.

00:54:54 How much of language understanding can be solved

00:54:57 with deep neural networks?

00:54:59 In your intuition, I mean, it’s unknown, I suppose.

00:55:03 But as we start to creep towards romantic notions

00:55:07 of the spirit of the Turing test

00:55:10 and conversation and dialogue

00:55:14 and something that maybe to me or to us,

00:55:18 the humans, feels like it needs real understanding.

00:55:21 How much can that be achieved

00:55:23 with these neural networks or statistical methods?

00:55:27 So I guess I am very much driven by the outcomes.

00:55:33 Can we achieve the performance

00:55:35 which would be satisfactory for us for different tasks?

00:55:40 Now, if you again look at machine translation system,

00:55:43 which are trained on large amounts of data,

00:55:46 they really can do a remarkable job

00:55:48 relatively to where they’ve been a few years ago.

00:55:51 And if you project into the future,

00:55:54 if it keeps the same speed of improvement, you know,

00:55:59 this is great.

00:56:00 Now, does it bother me

00:56:01 that it’s not doing the same translation as we are doing?

00:56:04 Now, if you go to cognitive science,

00:56:06 we still don’t really understand what we are doing.

00:56:10 I mean, there are a lot of theories

00:56:11 and there’s obviously a lot of progress and studying,

00:56:13 but our understanding what exactly goes on in our brains

00:56:17 when we process language is still not so crystal clear

00:56:21 and precise that we can translate it into machines.

00:56:25 What does bother me is that, you know,

00:56:29 again, that machines can be extremely brittle

00:56:31 when you go out of your comfort zone,

00:56:33 when there is a distributional shift

00:56:36 between training and testing.

00:56:37 And it has been years and years;

00:56:39 every year when I teach an NLP class,

00:56:41 I show them some examples of translation

00:56:43 from some newspaper in Hebrew or whatever, and it was perfect.

00:56:47 And then I have a recipe that Tommi Jaakkola

00:56:51 sent me a while ago, and it was written in Finnish,

00:56:53 for Karelian pies.

00:56:55 And it’s just a terrible translation.

00:56:59 You cannot understand anything of what it does.

00:57:01 It’s not like some syntactic mistakes, it’s just terrible.

00:57:04 And year after year, I try to translate it,

00:57:07 and year after year, it does this terrible work

00:57:08 because I guess, you know, the recipes

00:57:10 are not a big part of their training repertoire.

00:57:14 So, but in terms of outcomes, that’s a really clean,

00:57:19 good way to look at it.

00:57:21 I guess the question I was asking is,

00:57:24 do you think, imagine a future,

00:57:27 do you think the current approaches can pass

00:57:30 the Turing test in the way,

00:57:34 in the best possible formulation of the Turing test?

00:57:37 Which is, would you wanna have a conversation

00:57:39 with a neural network for an hour?

00:57:42 Oh God, no, no, there are not that many people

00:57:45 that I would want to talk to for an hour, but.

00:57:48 There are some people in this world, alive or not,

00:57:51 that you would like to talk to for an hour.

00:57:53 Could a neural network achieve that outcome?

00:57:56 So I think it would be really hard to create

00:57:58 a successful training set, which would enable it

00:58:02 to have a conversation, a contextual conversation

00:58:04 for an hour.

00:58:05 Do you think it’s a problem of data, perhaps?

00:58:08 I think in some ways it's not just a problem of data,

00:58:09 it's a problem both of data and of

00:58:13 the way we're training our systems,

00:58:15 their ability to truly generalize,

00:58:18 to be very compositional.

00:58:19 In some ways it’s limited in the current capacity,

00:58:23 at least we can translate well,

00:58:27 we can find information well, we can extract information.

00:58:32 So there are many capacities in which it’s doing very well.

00:58:35 And you can ask me, would you trust the machine

00:58:38 to translate for you and use it as a source?

00:58:39 I would say absolutely, especially if we’re talking about

00:58:42 newspaper data or other data which is in the realm

00:58:45 of its own training set, I would say yes.

00:58:48 But having conversations with the machine,

00:58:52 it’s not something that I would choose to do.

00:58:56 But I would tell you something, talking about Turing tests

00:58:59 and about all these kinds of ELIZA conversations,

00:59:02 I remember visiting Tencent in China

00:59:05 and they have this chatbot, and they claim

00:59:07 there is a really humongous amount of the local population

00:59:10 which talks to the chatbot for hours.

00:59:12 To me it was, I cannot believe it,

00:59:15 but apparently it’s documented that there are some people

00:59:18 who enjoy this conversation.

00:59:20 And it brought to me another MIT story

00:59:24 about ELIZA and Weizenbaum.

00:59:26 I don’t know if you’re familiar with the story.

00:59:29 So Weizenbaum was a professor at MIT

00:59:31 and when he developed this ELIZA,

00:59:32 which was just doing string matching,

00:59:34 very trivial, like restating of what you said

00:59:38 with very few rules, no syntax.

00:59:41 Apparently there were secretaries at MIT

00:59:43 that would sit for hours and converse with this trivial thing

00:59:48 and at the time there were no beautiful interfaces,

00:59:50 so you actually needed to go through the pain

00:59:51 of communicating.

00:59:53 And Weizenbaum himself was so horrified by this phenomenon,

00:59:56 that people can believe the machine enough,

00:59:59 that you just need to give them the hint

01:00:00 that the machine understands you and they will complete the rest,

01:00:03 that he kind of stopped this research

01:00:05 and went into kind of trying to understand

01:00:08 what this artificial intelligence can do to our brains.

01:00:12 So my point is, you know,

01:00:14 how much, it’s not how good is the technology,

01:00:19 it’s how ready we are to believe

01:00:22 that it delivers the goods that we are trying to get.

01:00:25 That’s a really beautiful way to put it.

01:00:27 I, by the way, I’m not horrified by that possibility,

01:00:29 but inspired by it because,

01:00:33 I mean, human connection,

01:00:35 whether it’s through language or through love,

01:00:39 it seems like it’s very amenable to machine learning

01:00:44 and the rest is just challenges of psychology.

01:00:49 Like you said, the secretaries who enjoy spending hours.

01:00:52 I would say I would describe most of our lives

01:00:55 as enjoying spending hours with those we love

01:00:58 for very silly reasons.

01:01:00 All we’re doing is keyword matching as well.

01:01:02 So I’m not sure how much intelligence

01:01:05 we exhibit to each other with the people we love

01:01:08 that we’re close with.

01:01:09 So it’s a very interesting point

01:01:12 of what it means to pass the Turing test with language.

01:01:16 I think you’re right.

01:01:16 In terms of conversation,

01:01:18 I think machine translation

01:01:21 has very clear performance and improvement, right?

01:01:24 What it means to have a fulfilling conversation

01:01:28 is very person dependent and context dependent

01:01:32 and so on.

01:01:33 That’s, yeah, it’s very well put.

01:01:36 But in your view, what’s a benchmark in natural language,

01:01:40 a test that’s just out of reach right now,

01:01:43 but we might be able to, that’s exciting.

01:01:46 Is it in perfecting machine translation

01:01:49 or is there other, is it summarization?

01:01:51 What’s out there just out of reach?

01:01:52 I think it goes across specific applications.

01:01:55 It's more about the ability to learn from few examples

01:01:59 for real, what we call few-shot learning, in all these cases.

01:02:03 Because the way we publish these papers today,

01:02:05 we say, if we go, like, naively, we get 55,

01:02:09 but now we had a few examples and we can move to 65.

01:02:12 None of these methods

01:02:13 actually are realistically doing anything useful.

01:02:15 You cannot use them today.

01:02:18 And the ability to be able to generalize and to move

01:02:25 or to be autonomous in finding the data

01:02:28 that you need to learn,

01:02:31 to be able to perfect new tasks or new language,

01:02:35 this is an area where I think we really need

01:02:39 to move forward, and we are not yet there.
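A minimal sketch of the evaluation pattern described here: a model trained on plenty of source-domain data is tested on a shifted domain, and then retrained with a handful of labelled target examples to see how much those few examples actually buy. The synthetic data and the logistic-regression model are assumptions made up for the example, not any specific published setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

def make_domain(shift, n):
    """Toy domain: features shifted by `shift`, labels from a shifted linear rule."""
    X = rng.normal(size=(n, 10)) + shift
    y = (X[:, 0] + 0.5 * X[:, 1] > 1.5 * shift).astype(int)
    return X, y

X_src, y_src = make_domain(shift=0.0, n=500)    # plenty of source-domain data
X_new, y_new = make_domain(shift=1.5, n=205)    # shifted target domain
X_few, y_few = X_new[:5], y_new[:5]             # the "few examples" (k = 5)
X_test, y_test = X_new[5:], y_new[5:]

base = LogisticRegression(max_iter=1000).fit(X_src, y_src)
adapted = LogisticRegression(max_iter=1000).fit(
    np.vstack([X_src, X_few]), np.concatenate([y_src, y_few]))

print("source-only accuracy:  ", base.score(X_test, y_test))
print("plus 5 target examples:", adapted.score(X_test, y_test))
```

With a shift this large, five extra examples barely move the needle, which is exactly the gap being pointed at: going from, say, 55 to 65 still leaves a system you cannot really use.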

01:02:43 Are you at all excited,

01:02:45 curious by the possibility

01:02:46 of creating human level intelligence?

01:02:49 Is this, because you've been very practical in your discussion?

01:02:52 So if we look at oncology,

01:02:54 you’re trying to use machine learning to help the world

01:02:58 in terms of alleviating suffering.

01:02:59 If you look at natural language processing,

01:03:02 you’re focused on the outcomes of improving practical things

01:03:05 like machine translation.

01:03:06 But human level intelligence is a thing

01:03:09 that our civilization has dreamed about creating,

01:03:13 super human level intelligence.

01:03:15 Do you think about this?

01:03:16 Do you think it’s at all within our reach?

01:03:20 So as you said yourself earlier,

01:03:22 talking about how you perceive

01:03:26 our communications with each other,

01:03:28 that we’re matching keywords and certain behaviors

01:03:31 and so on.

01:03:33 So at the end, whenever one assesses,

01:03:36 let’s say relations with another person,

01:03:38 you have separate kind of measurements and outcomes

01:03:41 inside your head that determine

01:03:43 what is the status of the relation.

01:03:45 So one way, this is this classical level:

01:03:48 what is intelligence?

01:03:49 Is it the fact that now we are gonna do it the same way

01:03:51 as a human is doing,

01:03:52 when we don't even understand what the human is doing?

01:03:55 Or is it that we now have an ability to deliver these outcomes,

01:03:59 not just in one area, not just in NLP,

01:04:01 not just to translate or just to answer questions,

01:04:03 but across many, many areas

01:04:05 that we can achieve the functionalities

01:04:08 that humans can achieve with their ability to learn

01:04:11 and do other things.

01:04:12 I think this is, and this we can actually measure

01:04:15 how far we are.

01:04:17 And that’s what makes me excited that we,

01:04:21 in my lifetime, at least so far what we’ve seen,

01:04:23 it’s like tremendous progress

01:04:25 across these different functionalities.

01:04:28 And I think it will be really exciting

01:04:32 to see where we will be.

01:04:35 And again, one way to think about it,

01:04:39 there are machines which are improving their functionality.

01:04:41 Another one is to think about us with our brains,

01:04:44 which are imperfect,

01:04:46 how they can be accelerated by this technology

01:04:51 as it becomes stronger and stronger.

01:04:55 Coming back to another book

01:04:57 that I love, Flowers for Algernon.

01:05:01 Have you read this book?

01:05:02 Yes.

01:05:02 So there is this point that the patient gets

01:05:05 this miracle cure, which changes his brain.

01:05:07 And all of a sudden they see life in a different way

01:05:11 and can do certain things better,

01:05:13 but certain things much worse.

01:05:14 So you can imagine this kind of computer augmented cognition

01:05:22 and what it can bring you, that now, in the same way

01:05:24 as cars enable us to get to places

01:05:28 where we’ve never been before,

01:05:30 can we think differently?

01:05:31 Can we think faster?

01:05:33 And we already see a lot of it happening

01:05:36 in how it impacts us,

01:05:38 but I think we have a long way to go there.

01:05:42 So that’s sort of artificial intelligence

01:05:45 and technology affecting our,

01:05:47 augmenting our intelligence as humans.

01:05:50 Yesterday, a company called Neuralink announced,

01:05:55 they did this whole demonstration.

01:05:56 I don’t know if you saw it.

01:05:57 It’s, they demonstrated brain computer,

01:06:01 brain machine interface,

01:06:02 where there’s like a sewing machine for the brain.

01:06:06 Do you, you know, a lot of that is quite out there

01:06:11 in terms of things that some people would say

01:06:14 are impossible, but they’re dreamers

01:06:16 and want to engineer systems like that.

01:06:18 Do you see, based on what you just said,

01:06:20 a hope for that more direct interaction with the brain?

01:06:25 I think there are different ways.

01:06:27 One is a direct interaction with the brain.

01:06:29 And again, there are lots of companies

01:06:30 that work in this space

01:06:32 and I think there will be a lot of developments.

01:06:35 But I’m just thinking that many times

01:06:36 we are not aware of our feelings,

01:06:39 of motivation, what drives us.

01:06:41 Like, let me give you a trivial example, our attention.

01:06:45 There are a lot of studies that demonstrate

01:06:47 that it takes a while for a person to understand

01:06:49 that they are not attentive anymore.

01:06:51 And we know that there are people

01:06:52 who really have strong capacity to hold attention.

01:06:54 At the other end of the spectrum, there are people with ADD

01:06:57 and other issues who have problems

01:06:58 regulating their attention.

01:07:00 Imagine to yourself that you have like a cognitive aid

01:07:03 that just alerts you based on your gaze,

01:07:06 that your attention is now not on what you are doing.

01:07:09 And instead of writing a paper,

01:07:10 you’re now dreaming of what you’re gonna do in the evening.

01:07:12 So even this kind of simple measurement things,

01:07:16 how they can change us.

01:07:17 And I see it even in simple ways with myself.

01:07:22 I have my zone app that I got at the MIT gym.

01:07:26 It kind of records, you know, how much you ran

01:07:28 and you have some points

01:07:29 and you can get some status, whatever.

01:07:32 Like, I said, what is this ridiculous thing?

01:07:35 Who would ever care about some status in some app?

01:07:38 Guess what?

01:07:39 So to maintain the status,

01:07:41 you have to do a set number of points every month.

01:07:44 And not only have I done it every single month

01:07:48 for the last 18 months,

01:07:50 it went to the point that I was injured.

01:07:54 And when I could run again,

01:07:58 in two days, I did like some humongous amount of running

01:08:02 just to complete the points.

01:08:04 It was like really not safe.

01:08:05 It was like, I’m not gonna lose my status

01:08:08 because I want to get there.

01:08:10 So you can already see that this direct measurement

01:08:13 and the feedback is, you know,

01:08:15 we’re looking at video games

01:08:16 and see why, you know, the addiction aspect of it,

01:08:18 but you can imagine that the same idea can be expanded

01:08:21 to many other areas of our life.

01:08:23 When we really can get feedback

01:08:25 and imagine in your case in relations,

01:08:29 when we are doing keyword matching,

01:08:31 imagine that the person who is generating the keywords,

01:08:36 that person gets direct feedback

01:08:37 before the whole thing explodes.

01:08:39 Like, maybe at this point,

01:08:42 we are going in the wrong direction.

01:08:44 Maybe it will be really a behavior modifying moment.

01:08:48 So yeah, it’s a relationship management too.

01:08:51 So yeah, that’s a fascinating whole area

01:08:54 of psychology actually as well,

01:08:56 of seeing how our behavior has changed

01:08:58 with basically all human relations now have

01:09:01 other nonhuman entities helping us out.

01:09:06 So you teach a large,

01:09:09 a huge machine learning course here at MIT.

01:09:14 I can ask you a million questions,

01:09:15 but you’ve seen a lot of students.

01:09:17 What ideas do students struggle with the most

01:09:20 as they first enter this world of machine learning?

01:09:23 Actually, this year was the first time

01:09:26 I started teaching a small machine learning class.

01:09:28 And it came as a result of what I saw

01:10:31 in my big machine learning class that Tommi Jaakkola and I built

01:09:34 maybe six years ago.

01:09:38 What we’ve seen that as this area become more

01:09:40 and more popular, more and more people at MIT

01:09:43 want to take this class.

01:09:45 And while we designed it for computer science majors,

01:10:48 there were a lot of people who really were interested

01:10:50 in learning it, but unfortunately,

01:09:52 their background was not enabling them

01:09:55 to do well in the class.

01:09:57 And many of them associated machine learning

01:09:59 with the words struggle and failure,

01:10:02 primarily among the non majors.

01:10:04 And that’s why we actually started a new class

01:10:06 which we call machine learning from algorithms to modeling,

01:10:10 which emphasizes more the modeling aspects of it,

01:10:15 and it has both majors and non majors.

01:10:20 So we kind of try to extract the relevant parts

01:10:23 and make it more accessible,

01:10:25 because the fact that we're teaching 20 classifiers

01:10:27 in a standard machine learning class,

01:10:29 it's really a big question whether you need all of it.

01:10:32 But it was interesting to see this

01:10:34 from the first generation of students,

01:10:36 when they came back from their internships

01:10:39 and from their jobs,

01:10:42 what different and exciting things they can do.

01:10:45 I would never have thought that you can even apply

01:10:47 machine learning to some of these things, like matching

01:10:50 relations and other things, a whole variety.

01:10:53 Everything is amenable to machine learning.

01:10:56 That actually brings up an interesting point

01:10:58 of computer science in general.

01:11:00 It almost seems, maybe I’m crazy,

01:11:03 but it almost seems like everybody needs to learn

01:11:06 how to program these days.

01:11:08 If you’re 20 years old, or if you’re starting school,

01:11:11 even if you’re an English major,

01:11:14 it seems like programming unlocks so much possibility

01:11:20 in this world.

01:11:21 So when you interacted with those non majors,

01:11:25 is there skills that they were simply lacking at the time

01:11:30 that you wish they had and that they learned

01:11:33 in high school and so on?

01:11:34 Like how should education change

01:11:37 in this computerized world that we live in?

01:11:41 I think because I knew that there is a Python component

01:11:44 in the class, their Python skills were okay

01:11:47 and the class isn’t really heavy on programming.

01:11:49 They primarily kind of add parts to the programs.

01:11:52 I think it was more the mathematical barriers,

01:11:55 and the class, again, which was designed for the majors,

01:11:58 was using notation like big O for complexity

01:12:01 and others; people who come from different backgrounds

01:12:04 just don't have it in their lexicon.

01:12:05 It's not necessarily a very challenging notion,

01:12:09 but they were just not aware of it.

01:12:12 So I think that kind of linear algebra and probability,

01:12:16 the basics, the calculus, multivariate calculus,

01:12:19 are things that can help.

01:12:20 What advice would you give to students

01:12:23 interested in machine learning,

01:12:25 interested, you’ve talked about detecting,

01:12:29 curing cancer, drug design,

01:12:31 if they want to get into that field, what should they do?

01:12:36 Get into it and succeed as researchers

01:12:39 and entrepreneurs.

01:12:43 The first good piece of news is that right now

01:12:45 there are lots of resources

01:12:47 that are created at different levels

01:12:50 and you can find online in your school classes

01:12:54 which are more mathematical, more applied and so on.

01:12:57 So you can find a kind of preacher

01:13:01 who preaches in your own language,

01:13:02 where you can enter the field,

01:13:04 and you can make many different types of contributions

01:13:06 depending on what your strengths are.

01:13:10 And the second point, I think it’s really important

01:13:13 to find some area which you really care about

01:13:18 and it can motivate your learning

01:13:20 and it can be for somebody curing cancer

01:13:22 or doing self driving cars or whatever,

01:13:25 but to find an area where there is data

01:13:29 where you believe there are strong patterns

01:13:31 and we should be doing it and we’re still not doing it

01:13:33 or you can do it better

01:13:35 and just start there and see where it can bring you.

01:13:40 So you’ve been very successful in many directions in life,

01:13:46 but you also mentioned Flowers for Algernon.

01:13:51 And I think I’ve read or listened to you mention somewhere

01:13:53 that researchers often get lost

01:13:55 in the details of their work.

01:13:56 This is per our original discussion with cancer and so on

01:14:00 and don’t look at the bigger picture,

01:14:02 bigger questions of meaning and so on.

01:14:05 So let me ask you the impossible question

01:14:08 of what’s the meaning of this thing,

01:14:11 of life, of your life, of research.

01:14:16 Why do you think we, descendants of great apes,

01:14:21 are here on this spinning ball?

01:14:26 You know, I don’t think that I have really a global answer.

01:14:30 You know, maybe that’s why I didn’t go to humanities

01:14:33 and I didn’t take humanities classes in my undergrad.

01:14:39 But the way I’m thinking about it,

01:14:43 each one of us has, inside of them, their own set of,

01:14:48 you know, things that we believe are important.

01:14:51 And it just happens that we are busy

01:14:53 with achieving various goals, busy listening to others

01:14:56 and to kind of try to conform and to be part of the crowd,

01:15:00 that we don’t listen to that part.

01:15:04 And, you know, we all should find some time to understand

01:15:09 what our own individual missions are.

01:15:11 And we may have very different missions

01:15:14 and to make sure that while we are running 10,000 things,

01:15:18 we are not, you know, missing out

01:15:21 and we’re putting all the resources to satisfy

01:15:26 our own mission.

01:15:28 And if I look over my time, when I was younger,

01:15:32 most of these missions, you know,

01:15:35 I was primarily driven by the external stimulus,

01:15:38 you know, to achieve this or to be that.

01:15:41 And now a lot of what I do is driven by really thinking

01:15:47 what is important for me to achieve independently

01:15:51 of the external recognition.

01:15:55 And, you know, I don’t mind to be viewed in certain ways.

01:16:01 The most important thing for me is to be true to myself,

01:16:05 to what I think is right.

01:16:07 How long did it take?

01:16:08 How hard was it to find the you that you have to be true to?

01:16:14 So it takes time.

01:16:15 And even now, sometimes, you know,

01:16:17 the vanity and the triviality can take over, you know.

01:16:20 At MIT.

01:16:22 Yeah, it can everywhere, you know,

01:16:25 it’s just the vanity at MIT is different,

01:16:26 the vanity in different places,

01:16:28 but we all have our piece of vanity.

01:16:30 But I think actually for me, many times the place

01:16:38 to get back to it is, you know, when I’m alone

01:16:43 and also when I read.

01:16:45 And I think by selecting the right books,

01:16:47 you can get the right questions and learn from what you read.

01:16:54 So, but again, it’s not perfect.

01:16:58 Like vanity sometimes dominates.

01:17:02 Well, that’s a beautiful way to end.

01:17:04 Thank you so much for talking today.

01:17:06 Thank you.

01:17:07 That was fun.

01:17:08 That was fun.