Transcript
00:00:00 The following is a conversation with Judea Pearl,
00:00:03 professor at UCLA and a winner of the Turing Award
00:00:06 that’s generally recognized as the Nobel Prize of Computing.
00:00:10 He’s one of the seminal figures
00:00:12 in the field of artificial intelligence,
00:00:14 computer science, and statistics.
00:00:16 He has developed and championed probabilistic approaches
00:00:19 to AI, including Bayesian networks,
00:00:22 and profound ideas in causality in general.
00:00:26 These ideas are important not just to AI,
00:00:29 but to our understanding and practice of science.
00:00:32 But in the field of AI, the idea of causality,
00:00:35 cause and effect, to many, lie at the core
00:00:39 of what is currently missing and what must be developed
00:00:42 in order to build truly intelligent systems.
00:00:46 For this reason, and many others,
00:00:48 his work is worth returning to often.
00:00:50 I recommend his most recent book called Book of Why
00:00:54 that presents key ideas from a lifetime of work
00:00:57 in a way that is accessible to the general public.
00:01:00 This is the Artificial Intelligence Podcast.
00:01:03 If you enjoy it, subscribe on YouTube,
00:01:05 give it five stars on Apple Podcast,
00:01:07 support it on Patreon, or simply connect with me on Twitter
00:01:11 at Lex Fridman, spelled F R I D M A N.
00:01:15 If you leave a review on Apple Podcasts especially,
00:01:18 but also Castbox, or comment on YouTube,
00:01:20 consider mentioning topics, people, ideas, questions,
00:01:23 quotes in science, tech, and philosophy
00:01:26 you find interesting, and I’ll read them on this podcast.
00:01:29 I won’t call out names, but I love comments
00:01:31 with kindness and thoughtfulness in them,
00:01:33 so I thought I’d share them with you.
00:01:35 Someone on YouTube highlighted a quote
00:01:37 from the conversation with Noam Chomsky,
00:01:40 where he said that the significance of your life
00:01:42 is something you create.
00:01:44 I like this line as well.
00:01:46 On most days, the existentialist approach to life
00:01:49 is one I find liberating and fulfilling.
00:01:53 I recently started doing ads
00:01:55 at the end of the introduction.
00:01:56 I’ll do one or two minutes after introducing the episode,
00:01:59 and never any ads in the middle
00:02:01 that break the flow of the conversation.
00:02:03 I hope that works for you
00:02:04 and doesn’t hurt the listening experience.
00:02:08 This show is presented by Cash App,
00:02:10 the number one finance app in the App Store.
00:02:13 I personally use Cash App to send money to friends,
00:02:15 but you can also use it to buy, sell,
00:02:17 and deposit Bitcoin in just seconds.
00:02:20 Cash App also has a new investing feature.
00:02:22 You can buy fractions of a stock, say $1 worth,
00:02:25 no matter what the stock price is.
00:02:28 Broker services are provided by Cash App Investing,
00:02:30 a subsidiary of Square, a member of SIPC.
00:02:34 I’m excited to be working with Cash App
00:02:36 to support one of my favorite organizations called First,
00:02:39 best known for their FIRST Robotics and LEGO competitions.
00:02:43 They educate and inspire hundreds of thousands of students
00:02:47 in over 110 countries,
00:02:49 and have a perfect rating on Charity Navigator,
00:02:51 which means the donated money
00:02:53 is used to the maximum effectiveness.
00:02:56 When you get Cash App from the App Store or Google Play,
00:02:58 and use the code LEXPODCAST, you’ll get $10,
00:03:02 and Cash App will also donate $10 to First,
00:03:05 which again, is an organization
00:03:07 that I’ve personally seen inspire girls and boys
00:03:10 to dream of engineering a better world.
00:03:12 And now, here’s my conversation with Judea Pearl.
00:03:18 You mentioned in an interview
00:03:19 that science is not a collection of facts,
00:03:21 but a constant human struggle
00:03:23 with the mysteries of nature.
00:03:26 What was the first mystery that you can recall
00:03:29 that hooked you, that kept you in its grip?
00:03:30 Oh, the first mystery, that’s a good one.
00:03:34 Yeah, I remember that.
00:03:37 I had a fever for three days.
00:03:41 And when I learned about Descartes, analytic geometry,
00:03:45 and I found out that you can do all the construction
00:03:49 in geometry using algebra.
00:03:52 And I couldn’t get over it.
00:03:54 I simply couldn’t get out of bed.
00:03:58 So what kind of world does analytic geometry unlock?
00:04:02 Well, it connects algebra with geometry.
00:04:07 Okay, so Descartes had the idea
00:04:09 that geometrical construction and geometrical theorems
00:04:14 and assumptions can be articulated
00:04:17 in the language of algebra,
00:04:19 which means that all the proof that we did in high school,
00:04:24 and trying to prove that the three bisectors
00:04:28 meet at one point, and that, okay,
00:04:33 all this can be proven by just shuffling around notation.
00:04:39 Yeah, that was a traumatic experience.
00:04:43 That was a traumatic experience.
00:04:44 For me, it was, I’m telling you.
00:04:45 So it’s the connection
00:04:46 between the different mathematical disciplines,
00:04:49 that they all.
00:04:49 No, in between two different languages.
00:04:52 Languages.
00:04:53 Yeah.
00:04:54 So which mathematic discipline is most beautiful?
00:04:57 Is geometry it for you?
00:04:58 Both are beautiful.
00:04:59 They have almost the same power.
00:05:02 But there’s a visual element to geometry, being a.
00:05:04 Visually, it’s more transparent.
00:05:08 But once you get over to algebra,
00:05:10 then the linear equation is a straight line.
00:05:14 This translation is easily absorbed, okay?
00:05:18 And to pass a tangent to a circle,
00:05:22 you know, you have the basic theorems,
00:05:25 and you can do it with algebra.
00:05:27 So but the transition from one to another was really,
00:05:31 I thought that Descartes was the greatest mathematician
00:05:34 of all times.
00:05:35 So you have been at the, if you think of engineering
00:05:40 and mathematics as a spectrum.
00:05:43 Yes.
00:05:44 You have been, you have walked casually along this spectrum
00:05:49 throughout your life.
00:05:51 You know, a little bit of engineering,
00:05:53 and then, you know,
00:05:55 you’ve done a little bit of mathematics here and there.
00:05:58 Not a little bit.
00:05:59 I mean, we got a very solid background in mathematics,
00:06:04 because our teachers were geniuses.
00:06:07 Our teachers came from Germany in the 1930s,
00:06:09 running away from Hitler.
00:06:11 They left their careers in Heidelberg and Berlin,
00:06:15 and came to teach high school in Israel.
00:06:17 And we were the beneficiary of that experiment.
00:06:21 So I, and they taught us math the good way.
00:06:25 What’s the good way to teach math?
00:06:26 Chronologically.
00:06:29 The people.
00:06:29 The people behind the theorems, yeah.
00:06:33 Their cousins, and their nieces, and their faces.
00:06:39 And how they jumped from the bathtub when they scream,
00:06:41 Eureka!
00:06:43 And ran naked in town.
00:06:46 So you’re almost educated as a historian of math.
00:06:49 No, we just got a glimpse of that history
00:06:51 together with a theorem,
00:06:53 so every exercise in math was connected with a person.
00:06:59 And the time of the person.
00:07:01 The period.
00:07:03 The period, also mathematically speaking.
00:07:05 Mathematically speaking, yes.
00:07:06 Not the politics, no.
00:07:09 So, and then in university,
00:07:14 you have gone on to do engineering.
00:07:16 Yeah.
00:07:17 I got a B.S. in engineering at the Technion, right?
00:07:20 And then I moved here for graduate work,
00:07:25 and I got, I did engineering in addition to physics
00:07:30 in Rutgers, and it combined very nicely with my thesis,
00:07:35 which I did in RCA Laboratories in superconductivity.
00:07:40 And then somehow thought to switch
00:07:43 to almost computer science, software,
00:07:46 even, not switch, but long to become,
00:07:50 to get into software engineering a little bit.
00:07:53 Yes.
00:07:53 And programming, if you can call it that in the 70s.
00:07:56 So there’s all these disciplines.
00:07:58 Yeah.
00:07:58 So to pick a favorite, in terms of engineering
00:08:02 and mathematics, which path do you think has more beauty?
00:08:07 Which path has more power?
00:08:08 It’s hard to choose, no.
00:08:10 I enjoy doing physics, and even have a vortex
00:08:14 named after me.
00:08:16 So I have investment in immortality.
00:08:23 So what is a vortex?
00:08:25 Vortex is in superconductivity.
00:08:27 In the superconductivity, yeah.
00:08:27 You have permanent current swirling around.
00:08:30 One way or the other, so you can store a one or a zero
00:08:34 for a computer.
00:08:35 That’s what we worked on in the 1960s at RCA.
00:08:39 And I discovered a few nice phenomena with the vortices.
00:08:44 You push current and they move.
00:08:44 So that’s a pearl vortex.
00:08:46 Pearl vortex, right, you can Google it, right?
00:08:50 I didn’t know about it, but the physicists,
00:08:52 they picked up on my thesis, on my PhD thesis,
00:08:57 and it became popular when thin film superconductors
00:09:03 became important for high temperature superconductivity.
00:09:06 So they called it pearl vortex without my knowledge.
00:09:10 I discovered it only about 15 years ago.
00:09:14 You have footprints in all of the sciences.
00:09:17 So let’s talk about the universe a little bit.
00:09:20 Is the universe at the lowest level deterministic
00:09:23 or stochastic in your amateur philosophy view?
00:09:27 Put another way, does God play dice?
00:09:30 We know it is stochastic, right?
00:09:33 Today, today we think it is stochastic.
00:09:35 Yes.
00:09:37 We think because we have the Heisenberg uncertainty principle
00:09:40 and we have some experiments to confirm that.
00:09:45 All we have is experiments to confirm it.
00:09:47 We don’t understand why.
00:09:50 Why is already?
00:09:51 You wrote a book about why.
00:09:52 Yeah, it’s a puzzle.
00:09:57 It’s a puzzle that you have the dice flipping machine,
00:10:02 oh God, and the result of the flipping
00:10:08 propagate with the speed faster than the speed of light.
00:10:12 We can’t explain it, okay?
00:10:14 So, but it only governs microscopic phenomena.
00:10:19 Microscopic phenomena.
00:10:21 So you don’t think of quantum mechanics
00:10:23 as useful for understanding the nature of reality?
00:10:28 No, diversionary.
00:10:30 So in your thinking, the world might
00:10:34 as well be deterministic.
00:10:36 The world is deterministic,
00:10:38 and as far as the neuron firing is concerned,
00:10:42 it is deterministic to first approximation.
00:10:47 What about free will?
00:10:48 Free will is also a nice exercise.
00:10:52 Free will is an illusion that we AI people are gonna solve.
00:10:59 So what do you think once we solve it,
00:11:01 that solution will look like?
00:11:03 Once we put it in the page.
00:11:04 The solution will look like,
00:11:06 first of all, it will look like a machine.
00:11:08 A machine that acts as though it has free will.
00:11:12 It communicates with other machines
00:11:14 as though they have free will,
00:11:17 and you wouldn’t be able to tell the difference
00:11:19 between a machine that does
00:11:21 and a machine that doesn’t have free will, okay?
00:11:24 So the illusion, it propagates the illusion
00:11:26 of free will amongst the other machines.
00:11:28 And faking it is having it, okay?
00:11:33 That’s what Turing test is all about.
00:11:35 Faking intelligence is intelligent
00:11:37 because it’s not easy to fake.
00:11:41 It’s very hard to fake,
00:11:43 and you can only fake if you have it.
00:11:45 So that’s such a beautiful statement.
00:11:54 Yeah, you can’t fake it if you don’t have it, yeah.
00:11:59 So let’s begin at the beginning with probability,
00:12:06 both philosophically and mathematically.
00:12:09 What does it mean to say the probability
00:12:11 of something happening is 50%?
00:12:15 What is probability?
00:12:16 It’s a degree of uncertainty
00:12:18 that an agent has about the world.
00:12:22 You’re still expressing some knowledge in that statement.
00:12:24 Of course.
00:12:26 If the probability is 90%,
00:12:27 it’s absolutely a different kind of knowledge
00:12:29 than if it is 10%.
00:12:32 But it’s still not solid knowledge, it’s…
00:12:36 It is solid knowledge, but hey,
00:12:38 if you tell me that 90% assurance smoking
00:12:43 will give you lung cancer in five years versus 10%,
00:12:48 it’s a piece of useful knowledge.
00:12:52 So the statistical view of the universe,
00:12:56 why is it useful?
00:12:57 So we’re swimming in complete uncertainty,
00:13:00 most of everything around us.
00:13:01 It allows you to predict things with a certain probability,
00:13:06 and computing those probabilities are very useful.
00:13:09 That’s the whole idea of prediction.
00:13:15 And you need prediction to be able to survive.
00:13:18 If you can’t predict the future,
00:13:19 then just crossing the street
00:13:22 will be extremely fearful.
00:13:26 And so you’ve done a lot of work in causation,
00:13:28 and so let’s think about correlation.
00:13:32 I started with probability.
00:13:34 You started with probability.
00:13:35 You’ve invented the Bayesian networks.
00:13:38 Yeah.
00:13:39 And so we’ll dance back and forth
00:13:43 between these levels of uncertainty.
00:13:47 But what is correlation?
00:13:49 What is it?
00:13:50 So probability of something happening is something,
00:13:54 but then there’s a bunch of things happening.
00:13:56 And sometimes they happen together,
00:13:58 sometimes not, they’re independent or not.
00:14:00 So how do you think about correlation of things?
00:14:03 Correlation occurs when two things vary together
00:14:06 over a very long time is one way of measuring it.
00:14:09 Or when you have a bunch of variables
00:14:11 that all vary cohesively,
00:14:15 then we say we have a correlation here.
00:14:18 And usually when we think about correlation,
00:14:21 we really think causally.
00:14:24 Things cannot be correlated unless there is a reason
00:14:27 for them to vary together.
00:14:30 Why should they vary together?
00:14:32 If they don’t see each other, why should they vary together?
00:14:35 So underlying it somewhere is causation.
00:14:38 Yes.
00:14:39 Hidden in our intuition, there is a notion of causation
00:14:43 because we cannot grasp any other logic except causation.
00:14:48 And how does conditional probability differ from causation?
00:14:55 So what is conditional probability?
00:14:57 Conditional probability, how things vary
00:15:00 when one of them stays the same.
00:15:05 Now staying the same means that I have chosen
00:15:09 to look only at those incidents
00:15:11 where the guy has the same value as previous one.
00:15:16 It’s my choice as an experimenter.
00:15:19 So things that are not correlated before
00:15:22 could become correlated.
00:15:24 Like for instance, if I have two coins
00:15:26 which are uncorrelated, okay,
00:15:29 and I choose only those flippings, experiments
00:15:33 in which a bell rings, and the bell rings
00:15:36 when at least one of them is a tail, okay,
00:15:40 then suddenly I see correlation between the two coins
00:15:44 because I only look at the cases where the bell rang.
00:15:49 You see, it’s my design, with my ignorance essentially,
00:15:53 with my audacity to ignore certain incidents,
00:15:58 I suddenly create a correlation
00:16:04 where it doesn’t exist physically.
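
A quick simulation of the two-coin, bell example just described (a minimal sketch; the 0/1 coin encoding and the sample size are arbitrary choices, not from the conversation) shows the effect: the coins are independent over all trials, but once we keep only the trials where the bell rang, a clear negative correlation appears.

```python
# Two independent coins become correlated once we condition on the bell,
# i.e. once we only keep trials where the bell rang.
import random

random.seed(0)
N = 100_000

def corr(xs, ys):
    """Pearson correlation of two equal-length lists of 0s and 1s."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

coin1 = [random.randint(0, 1) for _ in range(N)]   # 1 = tails
coin2 = [random.randint(0, 1) for _ in range(N)]
bell  = [a or b for a, b in zip(coin1, coin2)]     # rings if at least one tail

print("all trials:    ", round(corr(coin1, coin2), 3))   # ~ 0.0
rang1 = [a for a, r in zip(coin1, bell) if r]
rang2 = [b for b, r in zip(coin2, bell) if r]
print("bell rang only:", round(corr(rang1, rang2), 3))   # ~ -0.5
```
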
00:16:07 Right, so that’s, you just outlined one of the flaws
00:16:11 of observing the world and trying to infer something
00:16:14 from the math about the world
00:16:16 from looking at the correlation.
00:16:17 I don’t look at it as a flaw, the world works like that.
00:16:20 But the flaws comes if we try to impose
00:16:24 causal logic on correlation, it doesn’t work too well.
00:16:34 I mean, but that’s exactly what we do.
00:16:36 That’s what, that has been the majority of science.
00:16:40 The majority of naive science. The statisticians know it.
00:16:45 The statisticians know that if you condition
00:16:47 on a third variable, then you can destroy
00:16:50 or create correlations among two other variables.
00:16:55 They know it, it’s in their data.
00:16:58 It’s nothing surprising, that’s why they all dismiss
00:17:01 Simpson’s paradox, ah, we know it.
00:17:04 They don’t know anything about it.
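
Simpson’s paradox, mentioned here only in passing, is easy to reproduce with a small worked example. The numbers below are illustrative, chosen so the reversal shows up (they are not taken from the conversation): the drug looks better for men and better for women, yet worse when the two groups are pooled, because the groups took the drug at very different rates.

```python
# Illustrative Simpson's-paradox numbers: (recovered, total) per group.
data = {
    ("men",   "drug"):    (81, 87),
    ("men",   "no drug"): (234, 270),
    ("women", "drug"):    (192, 263),
    ("women", "no drug"): (55, 80),
}

def rate(recovered, total):
    return recovered / total

for sex in ("men", "women"):
    d = rate(*data[(sex, "drug")])
    n = rate(*data[(sex, "no drug")])
    print(f"{sex:5s}: drug {d:.0%} vs no drug {n:.0%}")   # drug wins in each group

# Pooled over both groups the comparison flips.
drug    = [data[(s, "drug")]    for s in ("men", "women")]
no_drug = [data[(s, "no drug")] for s in ("men", "women")]
pool = lambda rows: rate(sum(r for r, _ in rows), sum(t for _, t in rows))
print(f"pooled: drug {pool(drug):.0%} vs no drug {pool(no_drug):.0%}")  # drug loses
```
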
00:17:07 Well, there’s disciplines like psychology
00:17:09 where all the variables are hard to account for.
00:17:12 And so oftentimes there’s a leap
00:17:15 between correlation to causation.
00:17:17 You’re imposing.
00:17:18 You’re implying a leap.
00:17:21 Who is trying to get causation from correlation?
00:17:26 You’re not proving causation,
00:17:27 but you’re sort of discussing it,
00:17:31 implying, sort of hypothesizing with our ability to prove.
00:17:35 Which discipline you have in mind?
00:17:37 I’ll tell you if they are obsolete,
00:17:40 or if they are outdated, or they are about to get outdated.
00:17:44 Yes, yes.
00:17:45 Tell me which one you have in mind.
00:17:46 Oh, psychology, you know.
00:17:48 Psychology, what, is it SEM, structural equation modeling?
00:17:50 No, no, I was thinking of applied psychology studying.
00:17:54 For example, we work with human behavior
00:17:57 in semi autonomous vehicles, how people behave.
00:18:00 And you have to conduct these studies
00:18:02 of people driving cars.
00:18:03 Everything starts with the question.
00:18:05 What is the research question?
00:18:07 What is the research question?
00:18:10 The research question, do people fall asleep
00:18:14 when the car is driving itself?
00:18:18 Do they fall asleep, or do they tend to fall asleep
00:18:22 more frequently?
00:18:23 More frequently.
00:18:24 Than the car not driving itself.
00:18:25 Not driving itself.
00:18:26 That’s a good question, okay.
00:18:28 And so you measure, you put people in the car
00:18:32 because it’s real world.
00:18:33 You can’t conduct an experiment
00:18:35 where you control everything.
00:18:36 Why can’t you control?
00:18:37 You could.
00:18:38 Why can’t you control the automatic module on and off?
00:18:44 Because it’s on road, public.
00:18:47 I mean, there’s aspects to it that’s unethical.
00:18:52 Because it’s testing on public roads.
00:18:54 So you can only use vehicle.
00:18:56 They have to, the people, the drivers themselves
00:19:00 have to make that choice themselves.
00:19:02 And so they regulate that.
00:19:05 And so you just observe when they drive autonomously
00:19:09 and when they don’t.
00:19:10 And then.
00:19:11 But maybe they turn it off when they were very tired.
00:19:13 Yeah, that kind of thing.
00:19:14 But you don’t know those variables.
00:19:16 Okay, so that you have now uncontrolled experiment.
00:19:19 Uncontrolled experiment.
00:19:20 We call it observational study.
00:19:23 And we form the correlation detected.
00:19:27 We have to infer causal relationship.
00:19:30 Whether it was the automatic piece
00:19:33 has caused them to fall asleep, or.
00:19:35 So that is an issue that is about 120 years old.
00:19:45 Or I should say only 100 years old, okay.
00:19:51 Well, maybe it’s not.
00:19:52 Actually I should say it’s 2,000 years old.
00:19:55 Because we have this experiment by Daniel.
00:19:58 But the Babylonian king that wanted the exiled,
00:20:07 the people from Israel that were taken in exile
00:20:12 to Babylon to serve the king.
00:20:14 He wanted to serve them king’s food, which was meat.
00:20:18 And Daniel as a good Jew couldn’t eat non kosher food.
00:20:22 So he asked them to eat vegetarian food.
00:20:26 But the king overseer says, I’m sorry,
00:20:29 but if the king sees that your performance falls below
00:20:34 that of other kids, he’s going to kill me.
00:20:39 Daniel said, let’s make an experiment.
00:20:41 Let’s take four of us from Jerusalem, okay.
00:20:44 Give us vegetarian food.
00:20:46 Let’s take the other guys to eat the king’s food.
00:20:50 And in about a week’s time, we’ll test our performance.
00:20:54 And you know the answer.
00:20:55 Of course he did the experiment.
00:20:57 And they were so much better than the others.
00:21:02 And the king nominated them to senior positions in his kingdom.
00:21:08 So it was a first experiment, yes.
00:21:10 So there was a very simple,
00:21:12 it’s also the same research questions.
00:21:15 We want to know if vegetarian food
00:21:18 assist or obstruct your mental ability.
00:21:23 And okay, so the question is very old one.
00:21:30 Even Democritus said, if I could discover one cause
00:21:38 of things, I would rather discover that one cause
00:21:41 than be a king of Persia, okay.
00:21:43 The task of discovering causes was in the mind
00:21:48 of ancient people from many, many years ago.
00:21:53 But the mathematics of doing that was only developed
00:21:58 in the 1920s.
00:22:00 So science has left us orphan, okay.
00:22:05 Science has not provided us with the mathematics
00:22:08 to capture the idea of X causes Y
00:22:12 and Y does not cause X.
00:22:14 Because all the questions of physics are symmetrical,
00:22:17 algebraic, the equality sign goes both ways.
00:22:21 Okay, let’s look at machine learning.
00:22:23 Machine learning today, if you look at deep neural networks,
00:22:26 you can think of it as kind of conditional probability
00:22:32 estimators.
00:22:33 Correct, beautiful.
00:22:35 So where did you say that?
00:22:39 Conditional probability estimators.
00:22:41 None of the machine learning people challenged you?
00:22:45 Attacked you?
00:22:46 Listen, most people, and this is why today’s conversation
00:22:52 I think is interesting, is most people would agree with you.
00:22:55 There’s certain aspects that are just effective today,
00:22:58 but we’re going to hit a wall and there’s a lot of ideas.
00:23:02 I think you’re very right that we’re gonna have to return to
00:23:05 causality, so let’s try to explore it.
00:23:11 Let’s even take a step back.
00:23:13 You’ve invented Bayesian networks that look awfully a lot
00:23:19 like they express something like causation,
00:23:22 but they don’t, not necessarily.
00:23:25 So how do we turn Bayesian networks
00:23:28 into expressing causation?
00:23:30 How do we build causal networks?
00:23:33 This A causes B, B causes C,
00:23:36 how do we start to infer that kind of thing?
00:23:38 We start asking ourselves question,
00:23:41 what are the factors that would determine
00:23:44 the value of X?
00:23:46 X could be blood pressure, death, hunger.
00:23:53 But these are hypotheses that we propose for ourselves.
00:23:56 Hypothesis, everything which has to do with causality
00:23:59 comes from a theory.
00:24:03 The difference is only how you interrogate
00:24:06 the theory that you have in your mind.
00:24:09 So it still needs the human expert to propose.
00:24:14 You need the human expert to specify the initial model.
00:24:20 Initial model could be very qualitative.
00:24:24 Just who listens to whom?
00:24:27 By who listens to whom, I mean one variable listens to the other.
00:24:31 So I say, okay, the tide is listening to the moon.
00:24:34 And not to the rooster crow.
00:24:42 And so forth.
00:24:43 This is our understanding of the world in which we live.
00:24:46 Scientific understanding of reality.
00:24:51 We have to start there.
00:24:53 Because if we don’t know how to handle
00:24:56 cause and effect relationship,
00:24:58 when we do have a model,
00:25:01 and we certainly do not know how to handle it
00:25:03 when we don’t have a model.
00:25:05 So let’s start first.
00:25:07 In AI, the slogan is representation first, discovery second.
00:25:13 But if I give you all the information that you need,
00:25:17 can you do anything useful with it?
00:25:19 That is the first, representation.
00:25:21 How do you represent it?
00:25:22 I give you all the knowledge in the world.
00:25:24 How do you represent it?
00:25:26 When you represent it, I ask you,
00:25:30 can you infer X or Y or Z?
00:25:33 Can you answer certain queries?
00:25:35 Is it complex?
00:25:36 Is it polynomial?
00:25:39 All the computer science exercises we do,
00:25:42 once you give me a representation for my knowledge,
00:25:47 then you can ask me,
00:25:48 now I understand how to represent things.
00:25:51 How do I discover them?
00:25:52 It’s a secondary thing.
00:25:54 So first of all, I should echo the statement
00:25:57 that mathematics and the current,
00:25:59 much of the machine learning world, has not considered
00:26:04 causation, that A causes B, in anything.
00:26:07 So that seems like a non obvious thing
00:26:15 that you think we would have really acknowledged it,
00:26:18 but we haven’t.
00:26:19 So we have to put that on the table.
00:26:21 So knowledge, how hard is it to create a knowledge base
00:26:26 from which to work?
00:26:28 In certain area, it’s easy,
00:26:31 because we have only four or five major variables,
00:26:37 and an epidemiologist or an economist can put them down,
00:26:43 what, minimum wage, unemployment, policy, X, Y, Z,
00:26:52 and start collecting data,
00:26:54 and quantify the parameters that were left unquantified
00:27:00 with the initial knowledge.
00:27:01 That’s the routine work that you find
00:27:07 in experimental psychology, in economics, everywhere.
00:27:13 In the health science, that’s a routine thing.
00:27:16 But I should emphasize,
00:27:18 you should start with the research question.
00:27:21 What do you want to estimate?
00:27:24 Once you have that, you have to have a language
00:27:27 of expressing what you want to estimate.
00:27:30 You think it’s easy?
00:27:31 No.
00:27:32 So we can talk about two things, I think.
00:27:35 One is how the science of causation is very useful
00:27:42 for answering certain questions.
00:27:47 And then the other is, how do we create intelligent systems
00:27:51 that need to reason with causation?
00:27:53 So if my research question is,
00:27:55 how do I pick up this water bottle from the table?
00:28:00 All of the knowledge that is required
00:28:03 to be able to do that,
00:28:05 how do we construct that knowledge base?
00:28:07 Do we return back to the problem
00:28:11 that we didn’t solve in the 80s with expert systems?
00:28:13 Do we have to solve that problem
00:28:15 of automated construction of knowledge?
00:28:19 You’re talking about the task
00:28:23 of eliciting knowledge from an expert.
00:28:26 Task of eliciting knowledge from an expert,
00:28:28 or the self discovery of more knowledge,
00:28:32 more and more knowledge.
00:28:34 So automating the building of knowledge as much as possible.
00:28:38 It’s a different game in the causal domain,
00:28:42 because it’s essentially the same thing.
00:28:46 You have to start with some knowledge,
00:28:48 and you’re trying to enrich it.
00:28:51 But you don’t enrich it by asking for more rules.
00:28:56 You enrich it by asking for the data,
00:28:58 to look at the data and quantifying,
00:29:01 and ask queries that you couldn’t answer when you started.
00:29:06 You couldn’t because the question is quite complex,
00:29:11 and it’s not within the capability
00:29:16 of ordinary cognition, of ordinary person,
00:29:19 or ordinary expert even, to answer.
00:29:23 So what kind of questions do you think
00:29:24 we can start to answer?
00:29:26 Even a simple one.
00:29:27 Suppose, yeah, I’ll start with easy one.
00:29:31 Let’s do it.
00:29:32 Okay, what’s the effect of a drug on recovery?
00:29:37 What is the aspirin that caused my headache
00:29:39 to be cured, or what did the television program,
00:29:44 or the good news I received?
00:29:47 This is already, you see, it’s a difficult question,
00:29:49 because it’s finding the cause from the effect.
00:29:53 The easy one is finding the effect from the cause.
00:29:56 That’s right.
00:29:57 So first you construct a model,
00:29:59 saying that this is an important research question.
00:30:01 This is an important question.
00:30:02 Then you do.
00:30:04 I didn’t construct a model yet.
00:30:05 I just said it’s an important question.
00:30:07 And the first exercise is express it mathematically.
00:30:12 What do you want to do?
00:30:13 Like, if I tell you what will be the effect
00:30:17 of taking this drug, you have to say that in mathematics.
00:30:21 How do you say that?
00:30:22 Yes.
00:30:23 Can you write down the question, not the answer?
00:30:27 I want to find the effect of the drug on my headache.
00:30:32 Right.
00:30:33 Write down, write it down.
00:30:34 That’s where the do calculus comes in.
00:30:35 Yes.
00:30:36 The do operator, what is the do operator?
00:30:38 Do operator, yeah.
00:30:40 Which is nice.
00:30:40 It’s the difference in association and intervention.
00:30:43 Very beautifully sort of constructed.
00:30:45 Yeah, so we have a do operator.
00:30:48 So the do calculus connected on the do operator itself
00:30:52 connects the operation of doing
00:30:55 to something that we can see.
00:30:57 Right.
00:30:58 So as opposed to the purely observing,
00:31:01 you’re making the choice to change a variable.
00:31:05 That’s what it expresses.
00:31:08 And then the way that we interpret it,
00:31:11 the mechanism by which we take your query
00:31:15 and we translate it into something that we can work with
00:31:18 is by giving it semantics,
00:31:21 saying that you have a model of the world
00:31:23 and you cut off all the incoming arrows into X
00:31:26 and you’re looking now in the modified mutilated model,
00:31:30 you ask for the probability of Y.
00:31:33 That is interpretation of doing X
00:31:36 because by doing things, you’ve liberated them
00:31:40 from all influences that acted upon them earlier
00:31:45 and you subject them to the tyranny of your muscles.
00:31:50 So you remove all the questions about causality
00:31:54 by doing them.
00:31:55 So you’re now.
00:31:56 There’s one level of questions.
00:31:58 Yeah.
00:31:59 Answer questions about what will happen if you do things.
00:32:01 If you do, if you drink the coffee,
00:32:03 if you take the aspirin.
00:32:04 Right.
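
Here is a minimal numerical sketch of the mutilated-model semantics just described, on an invented three-variable model (a confounder Z, a treatment X, an outcome Y; all the probabilities are made up for illustration). Conditioning on X and intervening with do(X) give different answers, and the intervention is computed exactly as described: delete the incoming arrow into X, fix X, and read off the probability of Y.

```python
# Graph assumed for illustration: Z -> X, Z -> Y, X -> Y.
P_Z = {0: 0.5, 1: 0.5}                         # P(Z)
P_X1_given_Z = {0: 0.2, 1: 0.8}                # P(X=1 | Z)
P_Y1_given_XZ = {(x, z): 0.2 + 0.3 * x + 0.4 * z
                 for x in (0, 1) for z in (0, 1)}   # P(Y=1 | X, Z)

# Observational: P(Y=1 | X=1). Z is *not* cut, so it stays tied to X.
num = sum(P_Z[z] * P_X1_given_Z[z] * P_Y1_given_XZ[(1, z)] for z in (0, 1))
den = sum(P_Z[z] * P_X1_given_Z[z] for z in (0, 1))
print("P(Y=1 | X=1)     =", round(num / den, 3))   # 0.82

# Interventional: P(Y=1 | do(X=1)). Mutilate the model: remove the arrow
# Z -> X, set X=1, and let Z keep its own prior.
p_do = sum(P_Z[z] * P_Y1_given_XZ[(1, z)] for z in (0, 1))
print("P(Y=1 | do(X=1)) =", round(p_do, 3))        # 0.70
```

The gap between 0.82 and 0.70 is exactly the confounding that observation alone cannot remove.
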
00:32:05 So how do we get the doing data?
00:32:11 Now the question is, if we cannot run experiments,
00:32:16 then we have to rely on observational study.
00:32:20 So first we could, sorry to interrupt,
00:32:22 we could run an experiment where we do something,
00:32:25 where we drink the coffee and this,
00:32:28 the do operator allows you to sort of be systematic
00:32:31 about expressing.
00:32:31 So imagine how the experiment will look like
00:32:34 even though we cannot physically
00:32:36 and technologically conduct it.
00:32:38 I’ll give you an example.
00:32:40 What is the effect of blood pressure on mortality?
00:32:44 I cannot go down into your vein
00:32:47 and change your blood pressure,
00:32:49 but I can ask the question,
00:32:52 which means I can even have a model of your body.
00:32:55 I can imagine the effect of your,
00:32:58 how the blood pressure change will affect your mortality.
00:33:04 How?
00:33:05 I go into the model and I conduct this surgery
00:33:09 about the blood pressure,
00:33:12 even though physically I can do, I cannot do it.
00:33:17 Let me ask the quantum mechanics question.
00:33:19 Does the doing change the observation?
00:33:23 Meaning the surgery of changing the blood pressure is,
00:33:28 I mean.
00:33:29 No, the surgery is,
00:33:31 I call the very delicate.
00:33:35 It’s very delicate, infinitely delicate.
00:33:37 Incisive and delicate, which means,
00:33:40 do means, do X means,
00:33:44 I’m gonna touch only X.
00:33:46 Only X.
00:33:47 Directly into X.
00:33:50 So that means that I change only things
00:33:52 which depends on X by virtue of X changing,
00:33:56 but I don’t change things which do not depend on X.
00:34:00 Like I wouldn’t change your sex or your age,
00:34:04 I just change your blood pressure.
00:34:06 So in the case of blood pressure,
00:34:08 it may be difficult or impossible
00:34:11 to construct such an experiment.
00:34:12 No, physically yes, but hypothetically no.
00:34:16 Hypothetically no.
00:34:17 If we have a model, that is what the model is for.
00:34:20 So you conduct surgeries on a model,
00:34:24 you take it apart, put it back,
00:34:26 that’s the idea of a model.
00:34:28 It’s the idea of thinking counterfactually, imagining,
00:34:31 and that’s the idea of creativity.
00:34:35 So by constructing that model,
00:34:36 you can start to infer if the higher the blood pressure
00:34:43 leads to mortality, which increases or decreases by.
00:34:47 I construct the model, I still cannot answer it.
00:34:50 I have to see if I have enough information in the model
00:34:53 that would allow me to find out the effects of intervention
00:34:58 from a noninterventional study, a hands-off study.
00:35:04 So what’s needed?
00:35:06 You need to have assumptions about who affects whom.
00:35:13 If the graph had a certain property,
00:35:16 the answer is yes, you can get it from observational study.
00:35:20 If the graph is too mushy, bushy, bushy,
00:35:23 the answer is no, you cannot.
00:35:25 Then you need to find either different kind of observation
00:35:30 that you haven’t considered, or one experiment.
00:35:35 So basically, that puts a lot of pressure on you
00:35:38 to encode wisdom into that graph.
00:35:41 Correct.
00:35:42 But you don’t have to encode more than what you know.
00:35:47 God forbid, if you put the,
00:35:49 like economists are doing this,
00:35:51 they call identifying assumption.
00:35:52 They put assumptions, even if they don’t prevail
00:35:55 in the world, they put assumptions
00:35:56 so they can identify things.
00:35:59 But the problem is, yes, beautifully put,
00:36:01 but the problem is you don’t know what you don’t know.
00:36:04 So.
00:36:05 You know what you don’t know.
00:36:07 Because if you don’t know, you say it’s possible.
00:36:10 It’s possible that X affect the traffic tomorrow.
00:36:17 It’s possible.
00:36:18 You put down an arrow which says it’s possible.
00:36:20 Every arrow in the graph says it’s possible.
00:36:23 So there’s not a significant cost to adding arrows that.
00:36:28 The more arrow you add, the less likely you are
00:36:32 to identify things from purely observational data.
00:36:37 So if the whole world is bushy,
00:36:41 and everybody affect everybody else,
00:36:45 the answer is, you can answer it ahead of time.
00:36:49 I cannot answer my query from observational data.
00:36:54 I have to go to experiments.
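
As a sketch of the graph test being described, the following code (a hand-rolled illustration, assuming the standard backdoor criterion and the networkx graph library; the toy graph is invented) checks whether a candidate adjustment set lets you identify the effect of X on Y from observational data, or whether the graph is too "bushy" for that set.

```python
# Backdoor check: set Z works if (1) no node of Z is a descendant of X and
# (2) Z d-separates X from Y once the arrows *out of* X are removed.
import networkx as nx

def d_separated(dag, xs, ys, zs):
    """d-separation via the moralized ancestral graph construction."""
    relevant = set(xs) | set(ys) | set(zs)
    for node in list(relevant):
        relevant |= nx.ancestors(dag, node)
    g = dag.subgraph(relevant).copy()
    moral = g.to_undirected()
    for child in g.nodes:                      # marry parents of each child
        parents = list(g.predecessors(child))
        for i in range(len(parents)):
            for j in range(i + 1, len(parents)):
                moral.add_edge(parents[i], parents[j])
    moral.remove_nodes_from(zs)
    return not any(nx.has_path(moral, x, y)
                   for x in xs for y in ys
                   if x in moral and y in moral)

def satisfies_backdoor(dag, x, y, zs):
    if any(z in nx.descendants(dag, x) for z in zs):
        return False
    cut = dag.copy()
    cut.remove_edges_from(list(dag.out_edges(x)))
    return d_separated(cut, {x}, {y}, set(zs))

# Toy model: Z confounds X and Y, and X affects Y through a mediator M.
g = nx.DiGraph([("Z", "X"), ("Z", "Y"), ("X", "M"), ("M", "Y")])
print(satisfies_backdoor(g, "X", "Y", {"Z"}))   # True: adjusting for Z identifies the effect
print(satisfies_backdoor(g, "X", "Y", set()))   # False: backdoor path X <- Z -> Y stays open
print(satisfies_backdoor(g, "X", "Y", {"M"}))   # False: M is a descendant of X
```
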
00:36:56 So you talk about machine learning
00:36:58 is essentially learning by association,
00:37:01 or reasoning by association,
00:37:03 and this do calculus is allowing for intervention.
00:37:07 I like that word.
00:37:09 Action.
00:37:09 So you also talk about counterfactuals.
00:37:12 Yeah.
00:37:13 And trying to sort of understand the difference
00:37:15 between counterfactuals and intervention.
00:37:19 First of all, what is counterfactuals,
00:37:22 and why are they useful?
00:37:26 Why are they especially useful,
00:37:29 as opposed to just reasoning what effect actions have?
00:37:34 Well, counterfactual contains
00:37:37 what we normally call explanations.
00:37:40 Can you give an example of a counterfactual?
00:37:41 If I tell you that acting one way affects something else,
00:37:45 I didn’t explain anything yet.
00:37:47 But if I ask you, was it the aspirin that cured my headache?
00:37:55 I’m asking for explanation, what cured my headache?
00:37:58 And putting a finger on aspirin provide an explanation.
00:38:04 It was aspirin that was responsible
00:38:08 for your headache going away.
00:38:11 If you didn’t take the aspirin,
00:38:14 you would still have a headache.
00:38:16 So by saying if I didn’t take aspirin,
00:38:20 I would have a headache, you’re thereby saying
00:38:22 that aspirin is the thing that removes the headache.
00:38:26 But you have to have another important information.
00:38:30 I took the aspirin, and my headache is gone.
00:38:34 It’s very important information.
00:38:36 Now I’m reasoning backward,
00:38:38 and I said, was it the aspirin?
00:38:40 Yeah.
00:38:42 By considering what would have happened
00:38:44 if everything else is the same, but I didn’t take aspirin.
00:38:47 That’s right.
00:38:47 So you know that things took place.
00:38:51 Joe killed Schmoe, and Schmoe would be alive
00:38:56 had Joe not used his gun.
00:38:59 Okay, so that is the counterfactual.
00:39:02 It has a conflict here, or clash,
00:39:06 between the observed fact,
00:39:09 that he did shoot, okay?
00:39:13 And the hypothetical predicate,
00:39:16 which says had he not shot,
00:39:18 you have a logical clash.
00:39:21 They cannot exist together.
00:39:23 That’s the counterfactual.
00:39:24 And that is the source of our explanation
00:39:28 of the idea of responsibility, regret, and free will.
00:39:34 Yeah, so it certainly seems
00:39:37 that’s the highest level of reasoning, right?
00:39:39 Yeah, and physicists do it all the time.
00:39:41 Who does it all the time?
00:39:42 Physicists.
00:39:43 Physicists.
00:39:44 In every equation of physics,
00:39:47 let’s say you have a Hooke’s law,
00:39:49 and you put one kilogram on the spring,
00:39:52 and the spring is one meter,
00:39:54 and you say, had this weight been two kilogram,
00:39:58 the spring would have been twice as long.
00:40:02 It’s no problem for physicists to say that,
00:40:05 except that mathematics is only in the form of equation,
00:40:10 okay, equating the weight,
00:40:14 proportionality constant, and the length of the spring.
00:40:18 So you don’t have the asymmetry
00:40:22 in the equation of physics,
00:40:23 although every physicist thinks counterfactually.
00:40:26 Ask the high school kids,
00:40:29 had the weight been three kilograms,
00:40:31 what would be the length of the spring?
00:40:33 They can answer it immediately,
00:40:35 because they do the counterfactual processing in their mind,
00:40:38 and then they put it into equation,
00:40:41 algebraic equation, and they solve it, okay?
00:40:44 But a robot cannot do that.
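
Spelled out, the student’s mental steps are the three counterfactual steps a machine would have to be given explicitly: infer the unobserved spring constant from the one observation, set the weight to the hypothetical value, and recompute. A minimal sketch (the Hooke’s-law form and the numbers simply follow the example above):

```python
# Counterfactual with Hooke's law: stretch = c * weight.
def abduce_constant(weight_kg, stretch_m):
    # Step 1, abduction: solve for the unobserved constant c.
    return stretch_m / weight_kg

def counterfactual_stretch(observed_weight, observed_stretch, new_weight):
    c = abduce_constant(observed_weight, observed_stretch)
    return c * new_weight            # steps 2-3: set the weight, predict

print(counterfactual_stretch(1.0, 1.0, 2.0))   # 2.0 m: "twice as long"
print(counterfactual_stretch(1.0, 1.0, 3.0))   # 3.0 m: the three-kilogram question
```
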
00:40:46 How do you make a robot learn these relationships?
00:40:51 Well, why would you learn?
00:40:53 Suppose you tell him, can he do it?
00:40:55 So before you go learning,
00:40:57 you have to ask yourself,
00:40:59 suppose I give you all the information, okay?
00:41:01 Can the robot perform the task that I ask him to perform?
00:41:07 Can he reason and say, no, it wasn’t the aspirin.
00:41:10 It was the good news you received on the phone.
00:41:15 Right, because, well, unless the robot had a model,
00:41:19 a causal model of the world.
00:41:23 Right, right.
00:41:24 I’m sorry I have to linger on this.
00:41:26 But now we have to linger and we have to say,
00:41:27 how do we do it?
00:41:29 How do we build it?
00:41:29 Yes.
00:41:30 How do we build a causal model
00:41:32 without a team of human experts running around?
00:41:37 Why don’t you go to learning right away?
00:41:39 You’re too much involved with learning.
00:41:41 Because I like babies.
00:41:42 Babies learn fast.
00:41:43 I’m trying to figure out how they do it.
00:41:45 Good.
00:41:45 So that’s another question.
00:41:47 How do the babies come out with a counterfactual
00:41:50 model of the world?
00:41:51 And babies do that.
00:41:53 They know how to play in the crib.
00:41:56 They know which balls hit another one.
00:41:59 And they learn it by playful manipulation of the world.
00:42:06 Yes.
00:42:07 The simple world involves only toys and balls and chimes.
00:42:13 But if you think about it, it’s a complex world.
00:42:17 We take for granted how complicated.
00:42:20 And kids do it by playful manipulation
00:42:23 plus parents guidance, peer wisdom, and hearsay.
00:42:30 They meet each other and they say,
00:42:34 you shouldn’t have taken my toy.
00:42:38 Right.
00:42:40 And these multiple sources of information,
00:42:43 they’re able to integrate.
00:42:45 So the challenge is about how to integrate,
00:42:49 how to form these causal relationships
00:42:52 from different sources of data.
00:42:54 Correct.
00:42:55 So how much information is it to play,
00:42:59 how much causal information is required
00:43:03 to be able to play in the crib with different objects?
00:43:06 I don’t know.
00:43:08 I haven’t experimented with the crib.
00:43:11 Okay, not a crib.
00:43:12 Picking up, manipulating physical objects
00:43:15 on this very, opening the pages of a book,
00:43:19 all the tasks, the physical manipulation tasks.
00:43:23 Do you have a sense?
00:43:25 Because my sense is the world is extremely complicated.
00:43:27 It’s extremely complicated.
00:43:29 I agree, and I don’t know how to organize it
00:43:31 because I’ve been spoiled by easy problems
00:43:34 such as cancer and death, okay?
00:43:37 And I’m a, but she’s a.
00:43:39 First we have to start trying to.
00:43:41 No, but it’s easy.
00:43:42 There is in a sense that you have only 20 variables.
00:43:46 And they are just variables and not mechanics.
00:43:51 Okay, it’s easy.
00:43:52 You just put them on the graph and they speak to you.
00:43:57 Yeah, and you’re providing a methodology
00:44:00 for letting them speak.
00:44:02 Yeah.
00:44:03 I’m working only in the abstract.
00:44:05 The abstract was knowledge in, knowledge out,
00:44:08 data in between.
00:44:11 Now, can we take a leap to trying to learn
00:44:15 in this very, when it’s not 20 variables,
00:44:18 but 20 million variables, trying to learn causation
00:44:23 in this world?
00:44:24 Not learn, but somehow construct models.
00:44:27 I mean, it seems like you would only have to be able
00:44:29 to learn because constructing it manually
00:44:33 would be too difficult.
00:44:35 Do you have ideas of?
00:44:37 I think it’s a matter of combining simple models
00:44:41 from many, many sources, from many, many disciplines,
00:44:45 and many metaphors.
00:44:48 Metaphors are the basis of human intelligence.
00:44:51 Yeah, so how do you think of about a metaphor
00:44:53 in terms of its use in human intelligence?
00:44:56 Metaphor is an expert system.
00:45:01 An expert, it’s a mapping
00:45:05 from a problem with which you are not familiar
00:45:09 to a problem with which you are familiar.
00:45:13 Like, I’ll give you a good example.
00:45:15 The Greek believed that the sky is an opaque shell.
00:45:22 It’s not really infinite space.
00:45:25 It’s an opaque shell, and the stars are holes
00:45:30 poked in the shells through which you see
00:45:32 the eternal light.
00:45:34 That was a metaphor.
00:45:35 Why?
00:45:36 Because they understand how you poke holes in the shells.
00:45:42 They were not familiar with infinite space.
00:45:47 And we are walking on a shell of a turtle,
00:45:52 and if you get too close to the edge,
00:45:54 you’re gonna fall down to Hades or wherever.
00:45:58 That’s a metaphor.
00:46:00 It’s not true.
00:46:02 But this kind of metaphor enabled Eratosthenes
00:46:07 to measure the radius of the Earth,
00:46:10 because he said, come on, if we are walking on a turtle shell,
00:46:15 then the ray of light coming to this place
00:46:20 will be a different angle than coming to this place.
00:46:23 I know the distance, I’ll measure the two angles,
00:46:26 and then I have the radius of the shell of the turtle.
00:46:33 And he did, and he found his measurement
00:46:38 very close to the measurements we have today,
00:46:52 the, what, 6,700 kilometer radius of the Earth.
00:46:52 That’s something that would not occur
00:46:55 to a Babylonian astronomer,
00:46:59 even though the Babylonian experimenters
00:47:01 were the machine learning people of the time.
00:47:04 They fit curves, and they could predict
00:47:07 the eclipse of the moon much more accurately
00:47:12 than the Greeks, because they fit curves.
00:47:17 That’s a different metaphor.
00:47:19 Something that you’re familiar with,
00:47:20 a game, a turtle shell.
00:47:21 Okay?
00:47:24 What does it mean if you are familiar?
00:47:27 Familiar means that answers to certain questions
00:47:31 are explicit.
00:47:33 You don’t have to derive them.
00:47:35 And they were made explicit because somewhere in the past
00:47:39 you’ve constructed a model of that.
00:47:43 Yeah, you’re familiar with,
00:47:44 so the child is familiar with billiard balls.
00:47:48 So the child could predict that if you let loose
00:47:51 of one ball, the other one will bounce off.
00:47:55 You obtain that by familiarity.
00:48:00 Familiarity is answering questions,
00:48:02 and you store the answer explicitly.
00:48:05 You don’t have to derive them.
00:48:08 So this is the idea of a metaphor.
00:48:09 All our life, all our intelligence
00:48:11 is built around metaphors,
00:48:13 mapping from the unfamiliar to the familiar.
00:48:16 But the marriage between the two is a tough thing,
00:48:20 which we haven’t yet been able to algorithmatize.
00:48:24 So you think of that process of using metaphor
00:48:29 to leap from one place to another,
00:48:31 we can call it reasoning?
00:48:34 Is it a kind of reasoning?
00:48:35 It is reasoning by metaphor, metaphorical reasoning.
00:48:39 Do you think of that as learning?
00:48:44 So learning is a popular terminology today
00:48:46 in a narrow sense.
00:48:47 It is, it is, it is definitely a form.
00:48:49 So you may not, okay, right.
00:48:51 It’s one of the most important learnings,
00:48:53 taking something which theoretically is derivable
00:48:57 and storing it in an accessible format.
00:49:01 I’ll give you an example, chess, okay?
00:49:07 Finding the winning starting move in chess is hard.
00:49:12 It is hard, but there is an answer.
00:49:21 Either there is a winning move for white
00:49:23 or there isn’t, or there is a draw, okay?
00:49:26 So it is, the answer to that
00:49:30 is available through the rules of the game.
00:49:33 But we don’t know the answer.
00:49:35 So what does a chess master have that we don’t have?
00:49:38 He has stored explicitly an evaluation
00:49:41 of certain complex pattern of the board.
00:49:45 We don’t have it.
00:49:46 Ordinary people like me, I don’t know about you,
00:49:50 I’m not a chess master.
00:49:52 So for me, I have to derive things that for him is explicit.
00:49:58 He has seen it before, or he has seen the pattern before,
00:50:02 or similar pattern, you see metaphor, yeah?
00:50:05 And he generalize and said, don’t move, it’s a dangerous move.
00:50:13 It’s just that not in the game of chess,
00:50:15 but in the game of billiard balls,
00:50:18 we humans are able to initially derive very effectively
00:50:22 and then reason by metaphor very effectively
00:50:25 and make it look so easy that it makes one wonder
00:50:28 how hard is it to build it in a machine.
00:50:31 So in your sense, how far away are we
00:50:38 to be able to construct?
00:50:40 I don’t know, I’m not a futurist.
00:50:42 All I can tell you is that we are making tremendous progress
00:50:48 in the causal reasoning domain.
00:50:52 Something that I even dare to call a revolution,
00:50:57 the causal revolution, because what we have achieved
00:51:02 in the past three decades is something that dwarfs
00:51:09 everything that was derived in the entire history.
00:51:13 So there’s an excitement about
00:51:15 current machine learning methodologies,
00:51:18 and there’s really important good work you’re doing
00:51:22 in causal inference.
00:51:24 Where does the future, where do these worlds collide
00:51:32 and what does that look like?
00:51:35 First, they’re gonna work without collision.
00:51:37 It’s gonna work in harmony.
00:51:40 Harmony, it’s not collision.
00:51:41 The human is going to jumpstart the exercise
00:51:46 by providing qualitative, noncommitting models
00:51:55 of how the universe works, how in reality
00:52:00 the domain of discourse works.
00:52:03 The machine is gonna take over from that point of view
00:52:06 and derive whatever the calculus says can be derived.
00:52:11 Namely, quantitative answer to our questions.
00:52:15 Now, these are complex questions.
00:52:18 I’ll give you some example of complex questions
00:52:21 that will bug your mind if you think about it.
00:52:27 You take result of studies in diverse population
00:52:33 under diverse condition, and you infer the cause effect
00:52:38 of a new population which doesn’t even resemble
00:52:43 any of the ones studied, and you do that by do calculus.
00:52:48 You do that by generalizing from one study to another.
00:52:52 See, what’s common to both?
00:52:54 What is different?
00:52:57 Let’s ignore the differences and pull out the commonality,
00:53:01 and you do it over maybe 100 hospitals around the world.
00:53:06 From that, you can get really mileage from big data.
00:53:11 It’s not only that you have many samples,
00:53:15 you have many sources of data.
00:53:18 So that’s a really powerful thing, I think,
00:53:21 especially for medical applications.
00:53:23 I mean, cure cancer, right?
00:53:25 That’s how from data you can cure cancer.
00:53:28 So we’re talking about causation,
00:53:30 which is the temporal relationships between things.
00:53:35 Not only temporal, it’s both structural and temporal.
00:53:38 Temporal is not enough, temporal precedence by itself
00:53:43 cannot replace causation.
00:53:46 Is temporal precedence the arrow of time in physics?
00:53:50 It’s important, necessary.
00:53:52 It’s important.
00:53:53 It’s efficient, yes.
00:53:54 Is it?
00:53:55 Yes, I never seen cause propagate backward.
00:54:00 But if we use the word cause,
00:54:03 but there’s relationships that are timeless.
00:54:07 I suppose that’s still forward in the arrow of time.
00:54:10 But are there relationships, logical relationships,
00:54:14 that fit into the structure?
00:54:18 Sure, the whole do calculus is logical relationship.
00:54:21 That doesn’t require a temporal.
00:54:23 It has just the condition that
00:54:26 you’re not traveling back in time.
00:54:28 Yes, correct.
00:54:31 So it’s really a generalization of,
00:54:34 a powerful generalization of what?
00:54:39 Of Boolean logic.
00:54:40 Yeah, Boolean logic.
00:54:41 Yes.
00:54:43 That is sort of simply put,
00:54:46 and allows us to reason about the order of events,
00:54:53 the source, the.
00:54:54 Not about, between, we’re not deriving the order of events.
00:54:58 We are given cause effects relationship, okay?
00:55:01 They ought to be obeying the time precedence relationship.
00:55:08 We are given it.
00:55:09 And now that we ask questions about
00:55:12 other cause-effect relationships,
00:55:14 that could be derived from the initial ones,
00:55:17 but were not given to us explicitly.
00:55:20 Like the case of the firing squad I gave you
00:55:26 in the first chapter.
00:55:28 And I ask, what if rifleman A declined to shoot?
00:55:33 Would the prisoner still be dead?
00:55:37 To decline to shoot, it means that he disobeyed orders.
00:55:42 And the rules of the game were that he is an
00:55:48 obedient marksman, okay?
00:55:51 That’s how you start.
00:55:52 That’s the initial order.
00:55:53 But now you ask question about breaking the rules.
00:55:56 What if he decided not to pull the trigger?
00:56:00 He just became a pacifist.
00:56:03 And you and I can answer that.
00:56:06 The other rifleman would have killed him, okay?
00:56:09 I want the machine to do that.
00:56:12 Is it so hard to ask a machine to do that?
00:56:15 It’s such a simple task.
00:56:16 You have to have a calculus for that.
00:56:19 Yes, yeah.
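
A minimal sketch of that calculus on the firing-squad story (the encoding of the story as a tiny structural model is an illustration, not code from the book): abduce the court order from the observed death, force rifleman A not to shoot, and re-run the mutilated model.

```python
# Court order U -> captain -> riflemen A, B -> prisoner dead.
def model(u, do_a=None):
    """Deterministic structural model of the firing-squad story."""
    captain = u
    a = captain if do_a is None else do_a   # do_a overrides A's own equation
    b = captain
    dead = int(a or b)
    return {"captain": captain, "A": a, "B": b, "dead": dead}

# Step 1, abduction: which court order U fits the observed fact (prisoner dead)?
u_hat = next(u for u in (0, 1) if model(u)["dead"] == 1)

# Steps 2-3, action and prediction: in that same world, force A not to shoot.
print(model(u_hat, do_a=0))   # {'captain': 1, 'A': 0, 'B': 1, 'dead': 1} -- still dead
```

The answer matches the one given above: the other rifleman would have killed him.
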
00:56:21 But the curiosity, the natural curiosity for me is
00:56:24 that yes, you’re absolutely correct and important.
00:56:27 And it’s hard to believe that we haven’t done this
00:56:31 seriously extensively already a long time ago.
00:56:35 So this is really important work.
00:56:37 But I also wanna know, maybe you can philosophize
00:56:41 about how hard is it to learn.
00:56:43 Okay, let’s assume we’re learning.
00:56:44 We wanna learn it, okay?
00:56:45 We wanna learn.
00:56:46 So what do we do?
00:56:47 We put a learning machine that watches execution trials
00:56:51 in many countries and many locations, okay?
00:56:57 All the machine can learn is to see shot or not shot.
00:57:01 Dead, not dead.
00:57:02 A court issued an order or didn’t, okay?
00:57:05 Just the facts.
00:57:07 For the fact you don’t know who listens to whom.
00:57:10 You don’t know that the condemned person
00:57:13 listened to the bullets,
00:57:15 that the bullets are listening to the captain, okay?
00:57:19 All we hear is one command, two shots, dead, okay?
00:57:24 A triple of variables.
00:57:27 Yes, no, yes, no.
00:57:29 Okay, can you learn who listens to whom,
00:57:32 and can you answer the question? No.
00:57:33 Definitively no.
00:57:35 But don’t you think you can start proposing ideas
00:57:39 for humans to review?
00:57:41 You want machine to learn, you want a robot.
00:57:44 So robot is watching trials like that, 200 trials,
00:57:50 and then he has to answer the question,
00:57:52 what if rifleman A refrain from shooting?
00:57:56 Yeah.
00:57:58 How do I do that?
00:58:00 That’s exactly my point.
00:58:03 It’s looking at the facts,
00:58:04 don’t give you the strings behind the facts.
00:58:07 Absolutely, but do you think of machine learning
00:58:11 as it’s currently defined as only something
00:58:15 that looks at the facts and tries to do?
00:58:17 Right now, they only look at the facts, yeah.
00:58:19 So is there a way to modify, in your sense?
00:58:23 Playful manipulation.
00:58:25 Playful manipulation.
00:58:26 Yes, once in a while.
00:58:26 Doing the interventionist kind of thing, intervention.
00:58:29 But it could be at random.
00:58:31 For instance, the rifleman is sick that day
00:58:34 or he just vomits or whatever.
00:58:37 So machine can observe this unexpected event
00:58:41 which introduce noise.
00:58:43 The noise still have to be random
00:58:46 to be able to relate it to randomized experiment.
00:58:51 And then you have observational studies
00:58:55 from which to infer the strings behind the facts.
00:58:59 It’s doable to a certain extent.
00:59:02 But now that we are expert in what you can do
00:59:06 once you have a model, we can reason back and say,
00:59:09 what kind of data you need to build a model.
00:59:13 Got it, so I know you’re not a futurist,
00:59:17 but are you excited?
00:59:19 Have you, when you look back at your life,
00:59:22 longed for the idea of creating
00:59:24 a human level intelligence system?
00:59:25 Yeah, I’m driven by that.
00:59:28 All my life, I’m driven just by one thing.
00:59:33 But I go slowly.
00:59:34 I go from what I know to the next step incrementally.
00:59:39 So without imagining what the end goal looks like.
00:59:42 Do you imagine what an eight?
00:59:44 The end goal is gonna be a machine
00:59:47 that can answer sophisticated questions,
00:59:50 counterfactuals of regret, compassion,
00:59:55 responsibility, and free will.
00:59:59 So what is a good test?
01:00:01 Is a Turing test a reasonable test?
01:00:04 A test of free will doesn’t exist yet.
01:00:08 How would you test free will?
01:00:10 So far, we know only one thing.
01:00:14 If robots can communicate with reward and punishment
01:00:20 among themselves and hitting each other on the wrist
01:00:25 and say, you shouldn’t have done that, okay?
01:00:27 Playing better soccer because they can do that.
01:00:33 What do you mean, because they can do that?
01:00:35 Because they can communicate among themselves.
01:00:38 Because of the communication they can do.
01:00:40 Because they communicate like us.
01:00:42 Reward and punishment, yes.
01:00:44 You didn’t pass the ball at the right time,
01:00:47 and so therefore you’re gonna sit on the bench
01:00:50 for the next two.
01:00:51 If they start communicating like that,
01:00:53 the question is, will they play better soccer?
01:00:56 As opposed to what?
01:00:57 As opposed to what they do now?
01:00:59 Without this ability to reason about reward and punishment.
01:01:04 Responsibility.
01:01:06 And?
01:01:07 Artifactions.
01:01:08 So far, I can only think about communication.
01:01:11 Communication is, and not necessarily natural language,
01:01:15 but just communication.
01:01:16 Just communication.
01:01:17 And that’s important to have a quick and effective means
01:01:21 of communicating knowledge.
01:01:24 If the coach tells you you should have passed the ball,
01:01:26 pink, he conveys so much knowledge to you
01:01:28 as opposed to what?
01:01:30 Go down and change your software.
01:01:33 That’s the alternative.
01:01:35 But the coach doesn’t know your software.
01:01:37 So how can the coach tell you
01:01:39 you should have passed the ball?
01:01:41 But our language is very effective.
01:01:44 You should have passed the ball.
01:01:45 You know your software.
01:01:47 You tweak the right module, and next time you don’t do it.
01:01:52 Now that’s for playing soccer,
01:01:53 the rules are well defined.
01:01:55 No, no, no, no, they’re not well defined.
01:01:57 When you should pass the ball.
01:01:58 Is not well defined.
01:02:00 No, it’s very soft, very noisy.
01:02:04 Yes, you have to do it under pressure.
01:02:06 It’s art.
01:02:07 But in terms of aligning values
01:02:11 between computers and humans,
01:02:15 do you think this cause and effect type of thinking
01:02:20 is important to align the values,
01:02:22 morals, and ethics under which the machines
01:02:25 make decisions? Is cause and effect
01:02:28 where the two can come together?
01:02:32 Cause and effect is a necessary component
01:02:34 to build an ethical machine.
01:02:38 Because the machine has to empathize
01:02:40 to understand what’s good for you,
01:02:42 to build a model of you as a recipient,
01:02:47 which should be very much, what is compassion?
01:02:50 To imagine that you suffer pain as much as me.
01:02:56 As much as me.
01:02:57 I do have already a model of myself, right?
01:03:00 So it’s very easy for me to map you to mine.
01:03:02 I don’t have to rebuild the model.
01:03:04 It’s much easier to say, oh, you’re like me.
01:03:06 Okay, therefore I would not hate you.
01:03:09 And the machine has to imagine,
01:03:12 has to try to fake to be human essentially
01:03:14 so you can imagine that you’re like me, right?
01:03:19 And moreover, who is me?
01:03:21 That’s the first, that’s consciousness.
01:03:24 To have a model of yourself.
01:03:26 Where do you get this model?
01:03:28 You look at yourself as if you are a part
01:03:30 of the environment.
01:03:32 If you build a model of yourself
01:03:33 versus the environment, then you can say,
01:03:36 I need to have a model of myself.
01:03:38 I have abilities, I have desires and so forth, okay?
01:03:41 I have a blueprint of myself, though.
01:03:44 Not in full detail, because I cannot
01:03:46 capture the whole thing, that’s the problem.
01:03:49 But I have a blueprint.
01:03:50 So on that level of a blueprint, I can modify things.
01:03:54 I can look at myself in the mirror and say,
01:03:56 hmm, if I change this model, tweak this model,
01:03:59 I’m gonna perform differently.
01:04:02 That is what we mean by free will.
01:04:05 And consciousness.
01:04:06 And consciousness.
01:04:08 What do you think is consciousness?
01:04:10 Is it simply self-awareness?
01:04:11 So including yourself into the model of the world?
01:04:14 That’s right.
01:04:15 Some people tell me, no, this is only part of consciousness.
01:04:19 And then they start telling me what they really mean
01:04:21 by consciousness, and I lose them.
01:04:24 For me, consciousness is having a blueprint
01:04:29 of your software.
01:04:31 Do you have concerns about the future of AI?
01:04:37 All the different trajectories of all of our research?
01:04:39 Yes.
01:04:40 Where’s your hope for where the movement is heading,
01:04:43 and where are your concerns?
01:04:44 I’m concerned, because I know we are building a new species
01:04:49 that has the capability of exceeding us,
01:04:56 exceeding our capabilities, and can breed itself
01:05:01 and take over the world.
01:05:02 Absolutely.
01:05:03 It’s a new species that is uncontrolled.
01:05:07 We don’t know the degree to which we control it.
01:05:10 We don’t even understand what it means
01:05:12 to be able to control this new species.
01:05:16 So I’m concerned.
01:05:18 I don’t have anything to add to that,
01:05:21 because it’s such a gray area, it’s unknown.
01:05:26 It never happened in history.
01:05:29 The only time it happened in history
01:05:34 was evolution with human beings.
01:05:37 It wasn’t very successful, was it?
01:05:39 Some people say it was a great success.
01:05:42 For us it was, but a few people along the way,
01:05:46 a few creatures along the way would not agree.
01:05:49 So it’s just because it’s such a gray area,
01:05:53 there’s nothing else to say.
01:05:54 We have a sample of one.
01:05:56 Sample of one.
01:05:58 It’s us.
01:05:59 But some people would look at you and say,
01:06:04 yeah, but we were looking to you to help us
01:06:09 make sure that the sample two works out okay.
01:06:13 We have more than a sample of one.
01:06:14 We have theories, and that’s good.
01:06:18 We don’t need to be statisticians.
01:06:20 So sample of one doesn’t mean poverty of knowledge.
01:06:25 It’s not.
01:06:26 Sample of one plus theory, conjectural theory,
01:06:30 of what could happen.
01:06:32 That we do have.
01:06:34 But I really feel helpless in contributing
01:06:38 to this argument, because I know so little,
01:06:41 and my imagination is limited,
01:06:46 and I know how much I don’t know,
01:06:50 but I’m concerned.
01:06:55 You were born and raised in Israel.
01:06:57 Born and raised in Israel, yes.
01:06:59 And later served in the Israeli military,
01:07:03 the defense forces.
01:07:05 In the Israel Defense Forces.
01:07:07 Yeah.
01:07:09 What did you learn from that experience?
01:07:13 From this experience?
01:07:16 There’s a kibbutz in there as well.
01:07:18 Yes, because I was in the Nahal,
01:07:20 which is a combination of agricultural work
01:07:26 and military service.
01:07:28 We were supposed to. I was a real idealist.
01:07:31 I wanted to be a member of the kibbutz throughout my life,
01:07:36 and to live a communal life,
01:07:38 and so I prepared myself for that.
01:07:46 Slowly, slowly, I wanted a greater challenge.
01:07:51 So that’s a world far away, both.
01:07:55 What I learned from that, what I can add,
01:07:57 it was a miracle.
01:08:01 It was a miracle that I served in the 1950s.
01:08:07 I don’t know how we survived.
01:08:11 The country was under austerity.
01:08:15 It tripled its population from 600,000 to 1.8 million
01:08:21 when I finished college.
01:08:23 No one went hungry.
01:08:24 And austerity, yes.
01:08:29 When you wanted to make an omelet in a restaurant,
01:08:34 you had to bring your own egg.
01:08:38 And they imprisoned people for bringing food
01:08:43 from the farms, from the villages, to the city.
01:08:49 But no one went hungry.
01:08:50 And I always add to it,
01:08:53 and higher education did not suffer any budget cut.
01:09:00 They still invested in me, in my wife, in our generation
01:09:05 to get the best education that they could, okay?
01:09:09 So I’m really grateful for the opportunity,
01:09:15 and I’m trying to pay back now, okay?
01:09:18 It’s a miracle that we survived the war of 1948.
01:09:22 We were so close to a second genocide.
01:09:27 It was all planned.
01:09:30 But we survived it by miracle,
01:09:32 and then the second miracle
01:09:33 that not many people talk about, the next phase.
01:09:37 How no one went hungry,
01:09:40 and the country managed to triple its population.
01:09:43 You know what it means to triple?
01:09:45 Imagine the United States going from, what, 350 million
01:09:50 to a billion. Unbelievable.
01:09:53 So it’s a really tense part of the world.
01:09:57 It’s a complicated part of the world,
01:09:59 Israel and all around.
01:10:01 Religion is at the core of that complexity.
01:10:08 One of the components.
01:10:09 Religion is a strong motivating cause
01:10:12 to many, many people in the Middle East, yes.
01:10:16 In your view, looking back, is religion good for society?
01:10:23 That’s a good question for robotics, you know?
01:10:26 There are echoes of that question.
01:10:28 Equip a robot with religious belief.
01:10:32 Suppose we find out, or we agree
01:10:34 that religion is good to you, to keep you in line, okay?
01:10:37 Should we give the robot the metaphor of a god?
01:10:43 As a matter of fact, the robot will get it without us also.
01:10:47 Why?
01:10:48 The robot will reason by metaphor.
01:10:51 And what is the most primitive metaphor
01:10:56 a child grows with?
01:10:58 Mother’s smile, father’s teaching,
01:11:02 father image and mother image, that’s God.
01:11:05 So, whether you want it or not,
01:11:08 the robot will, well, assuming that the robot
01:11:12 is gonna have a mother and a father,
01:11:14 it may only have a programmer,
01:11:16 which doesn’t supply warmth and discipline.
01:11:20 Well, discipline it does.
01:11:22 So the robot will have a model of the trainer,
01:11:27 and everything that happens in the world,
01:11:29 cosmology and so on, is going to be mapped
01:11:32 into the programmer, it’s God.
01:11:36 Man, the thing that represents the origin
01:11:41 of everything for that robot.
01:11:43 It’s the most primitive relationship.
01:11:46 So it’s gonna arrive there by metaphor.
01:11:48 And so the question is if overall
01:11:51 that metaphor has served us well as humans.
01:11:56 I really don’t know.
01:11:58 I think it did, but as long as you keep
01:12:01 in mind it’s only a metaphor.
01:12:05 So, if you think we can, can we talk about your son?
01:12:11 Yes, yes.
01:12:13 Can you tell his story?
01:12:15 His story?
01:12:17 Daniel?
01:12:18 His story is known. He was abducted
01:12:21 in Pakistan by an Al Qaeda-driven sect,
01:12:26 under various pretenses.
01:12:32 I don’t even pay attention to what the pretense was.
01:12:35 Originally they wanted to have the United States
01:12:42 deliver some promised airplanes.
01:12:47 It was all made up, and all these demands were bogus.
01:12:51 Bogus, I don’t know really, but eventually
01:12:57 he was executed in front of a camera.
01:13:03 At the core of that is hate and intolerance.
01:13:07 At the core, yes, absolutely, yes.
01:13:10 We don’t really appreciate the depth of the hate
01:13:15 at which billions of people are educated.
01:13:26 We don’t understand it.
01:13:27 I just listened recently to what they teach you
01:13:31 in Mogadishu.
01:13:32 Okay, okay, when the water stopped in the tap,
01:13:45 we knew exactly who did it, the Jews.
01:13:49 The Jews.
01:13:50 We didn’t know how, but we knew who did it.
01:13:55 We don’t appreciate what it means to us.
01:13:58 The depth is unbelievable.
01:14:00 Do you think all of us are capable of evil?
01:14:06 And the education, the indoctrination
01:14:09 is really what creates evil.
01:14:10 Absolutely we are capable of evil.
01:14:12 If you’re indoctrinated sufficiently long and in depth,
01:14:18 you’re capable of ISIS, you’re capable of Nazism.
01:14:24 Yes, we are, but the question is whether we,
01:14:28 after we have gone through some Western education
01:14:32 and we learn that everything is really relative.
01:14:35 It is not absolute God.
01:14:37 It’s only a belief in God.
01:14:40 Whether we are capable now of being transformed
01:14:43 under certain circumstances to become brutal.
01:14:49 Yeah.
01:14:51 I’m worried about it because some people say yes,
01:14:55 given the right circumstances,
01:14:57 given a bad economic crisis,
01:15:04 you are capable of doing it too.
01:15:06 That worries me.
01:15:08 I want to believe that I’m not capable.
01:15:12 So seven years after Daniel’s death,
01:15:14 you wrote an article at the Wall Street Journal
01:15:16 titled Daniel Pearl and the Normalization of Evil.
01:15:19 Yes.
01:15:20 What was your message back then
01:15:23 and how did it change today over the years?
01:15:27 I lost.
01:15:30 What was the message?
01:15:31 The message was that we are not treating terrorism
01:15:39 as a taboo.
01:15:41 We are treating it as a bargaining device that is accepted.
01:15:46 People have grievance and they go and bomb restaurants.
01:15:54 It’s normal.
01:15:55 Look, you’re even not surprised when I tell you that.
01:15:59 20 years ago you’d say, what?
01:16:01 For a grievance you go and blow up a restaurant?
01:16:05 Today it’s becoming normalized.
01:16:07 The banalization of evil.
01:16:11 And we have created that to ourselves by normalizing,
01:16:16 by making it part of political life.
01:16:23 It’s a political debate.
01:16:27 Every terrorist of yesterday becomes a freedom fighter today,
01:16:34 and tomorrow becomes a terrorist again.
01:16:36 It’s switchable.
01:16:38 Right, and so we should call out evil when there’s evil.
01:16:42 If we don’t want to be part of it.
01:16:46 Becoming.
01:16:49 Yeah, if we want to separate good from evil,
01:16:52 that’s one of the first things that,
01:16:55 what was it, in the Garden of Eden,
01:16:57 remember the first thing that God told him was,
01:17:02 hey, you want some knowledge, here’s a tree of good and evil.
01:17:07 Yeah, so this evil touched your life personally.
01:17:12 Does your heart have anger, sadness, or is it hope?
01:17:20 Look, I see some beautiful people coming from Pakistan.
01:17:26 I see beautiful people everywhere.
01:17:29 But I see horrible propagation of evil in this country too.
01:17:38 It shows you how populist slogans
01:17:42 can catch the mind of the best intellectuals.
01:17:48 Today is Father’s Day.
01:17:50 I didn’t know that.
01:17:51 Yeah, what’s a fond memory you have of Daniel?
01:17:56 What’s a fond memory you have of Daniel?
01:17:58 Oh, very good memories, immense.
01:18:02 He was my mentor.
01:18:06 He had a sense of balance that I didn’t have.
01:18:15 He saw the beauty in every person.
01:18:19 He was not as emotional as I am,
01:18:22 he was more about looking at things in perspective.
01:18:26 He really liked every person.
01:18:29 He really grew up with the idea that a foreigner
01:18:34 is a reason for curiosity, not for fear.
01:18:42 One time we went to Berkeley,
01:18:45 and a homeless man came out from some dark alley,
01:18:49 and said, hey, man, can you spare a dime?
01:18:51 I retreated back, two feet back,
01:18:54 and then Daniel just hugged him and said,
01:18:56 here’s a dime, enjoy yourself.
01:18:58 Maybe you want some money to take a bus or whatever.
01:19:05 Where did you get it?
01:19:06 Not from me.
01:19:10 Do you have advice for young minds today,
01:19:12 dreaming about creating as you have dreamt,
01:19:16 creating intelligent systems?
01:19:17 What is the best way to arrive at new breakthrough ideas
01:19:21 and carry them through the fire of criticism
01:19:23 and past conventional ideas?
01:19:28 Ask your questions freely.
01:19:34 Your questions are never dumb.
01:19:37 And solve them your own way.
01:19:40 And don’t take no for an answer.
01:19:42 Look, if they are really dumb,
01:19:46 you will find out quickly by trial and error
01:19:49 to see that they’re not leading any place.
01:19:52 But follow them and try to understand things your way.
01:19:59 That is my advice.
01:20:01 I don’t know if it’s gonna help anyone.
01:20:03 Not as brilliantly.
01:20:05 There is a lot of inertia in science, in academia.
01:20:15 It is slowing down science.
01:20:18 Yeah, those two words, your way, that’s a powerful thing.
01:20:23 It’s against inertia, potentially, against the flow.
01:20:27 Against your professor.
01:20:28 Against your professor.
01:20:30 I wrote the Book of Why in order to democratize
01:20:34 common sense.
01:20:38 In order to instill rebellious spirit in students
01:20:45 so they wouldn’t wait until the professor gets things right.
01:20:53 So you wrote the manifesto of the rebellion
01:20:56 against the professor.
01:20:58 Against the professor, yes.
01:21:00 So looking back at your life of research,
01:21:02 what ideas do you hope ripple through the next many decades?
01:21:06 What do you hope your legacy will be?
01:21:10 I already have a tombstone carved.
01:21:19 Oh, boy.
01:21:21 The fundamental law of counterfactuals.
01:21:25 That’s what, it’s a simple equation.
01:21:29 Counterfactuals in terms of model surgery.
01:21:35 That’s it, because everything follows from that.
01:21:40 If you get that, all the rest, I can die in peace.
01:21:46 And my student can derive all my knowledge
01:21:49 by mathematical means.
01:21:51 The rest follows.
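For reference, the "simple equation" he mentions, the fundamental law of counterfactuals, is usually written as

$$
Y_x(u) = Y_{M_x}(u),
$$

where $M_x$ is the submodel obtained from the model $M$ by the surgery that replaces the structural equation for $X$ with the constant $X = x$; the notation follows Pearl's own books rather than anything spelled out verbatim in the conversation.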
01:21:53 Yeah.
01:21:55 Thank you so much for talking today.
01:21:56 I really appreciate it.
01:21:57 Thank you for being so attentive and instigating.
01:22:03 We did it.
01:22:05 We did it.
01:22:06 The coffee helped.
01:22:07 Thanks for listening to this conversation with Judea Pearl.
01:22:11 And thank you to our presenting sponsor, Cash App.
01:22:14 Download it, use code LexPodcast, you’ll get $10,
01:22:18 and $10 will go to FIRST, a STEM education nonprofit
01:22:21 that inspires hundreds of thousands of young minds
01:22:24 to learn and to dream of engineering our future.
01:22:28 If you enjoy this podcast, subscribe on YouTube,
01:22:31 give it five stars on Apple Podcast,
01:22:33 support on Patreon, or simply connect with me on Twitter.
01:22:36 And now, let me leave you with some words of wisdom
01:22:39 from Judea Pearl.
01:22:41 You cannot answer a question that you cannot ask,
01:22:44 and you cannot ask a question that you have no words for.
01:22:47 Thank you for listening, and hope to see you next time.