George Rebane
Knowing my relationship with Dr Judea Pearl, fellow blogger Russ Steele sent me a link to a recent interview (here) with Judea on AI. Longtime RR readers will recall pieces citing my friend, colleague, and mentor Judea Pearl as the inventor of Bayes and causal nets, one of the true pioneers of AI, and a winner of the Turing Award (here and here). I was privileged to be one of the very small group of his doctoral students contributing to his research and to the writing of his landmark Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (1988), which introduced Bayes nets. To that effort I was able to contribute my research in machine learning, which wound up in what is known as the Rebane-Pearl algorithm (here and here).
Early on (late 1980s) our little group began to understand that the alternative name ‘causal net’, also used in the literature for Bayes nets, might not be appropriate. To be frank, we were somewhat guilty of introducing that appellation ourselves before we recognized that our enthusiastic reach exceeded our grasp of the causality actually provided by our research. In our paper on the Rebane-Pearl algorithm I introduced the concept of a ‘causal basin’ and showed (in my hand-drawn figure, which somehow made it into the paper and Judea’s book) how the algorithm could take a dataset and divide it into several causal basins. I recall having long discussions with Judea on the relationship between probabilistic inference and the ‘true causality’ that we humans ascribe to the goings-on in our environment.
A decade earlier Judea and I spent time during our family social get-togethers ensconcing ourselves with a bottle of wine to speculate on the future of intelligent machines and what computational, perceptual, and cognitive requirements they would need to achieve peerage with humans. Over such discussions it became clear to both of us that machines would have to include the ability to handle subjunctive and counterfactual ‘thoughts’, in addition to being able to reason probabilistically a la Bayes nets. It was Judea who came upon the idea that causality was a distinct cognitive beast that could not be tamed and understood through mere extensions of Bayes nets – a whole new calculus would be required.
Judea launched on that quest around 1990, and by the mid-90s had invented what today is known as the ‘do-calculus’ (here and here). After more research Judea formed a small discussion group on advanced AI concepts, including causality. I was privileged to be invited as the only non-academic member of that group, which I recall included, among others, such luminaries as Leonard Kleinrock (leading modern queuing theorist and a true founder of the Internet) and Adnan Darwiche (a stellar Bayesian, Google Scholar, and former chair of UCLA’s Computer Science Department). At that time I was heavily involved in the introduction of ecommerce with Bizrate.com, a pioneering start-up that became one of the early dot-com success stories. Finally, from all this work, Judea wound up literally writing the book on causality – Causality (2000) and, more recently, the more accessible The Book of Why: The New Science of Cause and Effect (2018).
With the do-calculus Judea introduced the concept of the ‘causal beam’ as the sequence of directed and mediating happenings that precede the caused event in question. We can think of the causal beam as the one specific imperative sequence of nodes within a causal basin that is necessary for the event to occur. In sum, such an approach now lets us reason about cause and effect. Within my own concept of causality, the ‘actual cause’ of an event is the upstream progenitor event that probabilistically ‘seals the fate’ of the terminal caused event’s coming to pass. There’s more to be said about that, but let’s get back to Judea’s interview.
What caught my eye about this interview was Judea’s assessment of the current state of AI research, especially its almost unseemly focus on what has come to be known as Deep Learning. As Judea points out, there is much more to AI than Deep Learning, which is the resurrection on steroids of neural nets (NNs). NNs languished for decades, mainly for lack of the storage and processing power needed to handle the large amounts of data required for training, and for lack of enough horses to compute a net with many very large hidden layers of ‘neurons’. Judea is a little harsh when he describes Deep Learning as basically being little more than “curve fitting”.
As I’ve described before, NNs take a large potful of data and discover certain very highly dimensioned subsets, or sub-hyperspaces, of it in either a supervised learning mode (deviation from target behavior is used to update the NN’s parameters) or an unsupervised learning mode (no target behavior specified). These subsets are delimited by highly dimensioned boundaries, or discriminants, that later allow a new data point to be identified as belonging to this rather than to any of the other bounded (discriminated) subspaces. What makes such NNs valuable are the semantics associated with each of the subspaces – such as malignant tumor vs simple cyst, Sally Smith vs everyone else, or turn right at 20 degrees/sec vs left at 10 degrees/sec vs stop. And yes, one can think of the process of digging such discriminants out of the data as a form of fitting a curve to the data. However, deep learning NNs do a bit more, since they can also discover the form or shape of the best discriminant, rather than fit a ‘curve’ of a fixed functional form – no matter how highly dimensioned – to the data. And such a capability is an extremely important advance.
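To make the discriminant-digging picture concrete, here is a minimal sketch (in Python, with made-up toy data – nothing here comes from a real dataset) of supervised discriminant learning in its simplest form: a single-neuron perceptron nudging a linear boundary between two labeled clusters every time it misclassifies a point. A deep NN plays the same game, but with boundaries whose shapes it can discover for itself.

```python
# Minimal sketch of supervised discriminant learning: a perceptron
# finds a linear boundary separating two labeled clusters in 2-D.
# (Deep NNs do the same job with far more flexible boundary shapes.)

def perceptron(points, labels, epochs=100):
    w = [0.0, 0.0]   # weights of the linear discriminant
    b = 0.0          # bias (offset of the boundary)
    for _ in range(epochs):
        updated = False
        for (x1, x2), y in zip(points, labels):
            pred = 1 if w[0]*x1 + w[1]*x2 + b > 0 else -1
            if pred != y:            # supervised update: deviation from
                w[0] += y * x1       # target behavior adjusts the
                w[1] += y * x2       # discriminant's parameters
                b += y
                updated = True
        if not updated:              # every point correctly discriminated
            break
    return w, b

# Two toy classes: say, 'cyst-like' points near (1,1), 'tumor-like' near (4,4)
points = [(1, 1), (1, 2), (2, 1), (4, 4), (4, 5), (5, 4)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = perceptron(points, labels)

for (x1, x2), y in zip(points, labels):
    side = 1 if w[0]*x1 + w[1]*x2 + b > 0 else -1
    print((x1, x2), "classified as", side, "truth", y)
```

The learned (w, b) is the ‘curve’ of fixed functional form – here a straight line – fitted to the data; the semantics (cyst vs tumor) are what we attach to the two sides of it.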
But Judea’s lament about today’s almost exclusive focus on deep learning is well supported. Today’s AI advances revolve around the effort to make ever better associations – this data points to there being a ‘cat’. But the super-intelligence barrier will not be penetrated until machines can also conceive of and deal with ‘what ifs’. In Judea’s words, “If we want machines to reason about interventions (“What if we ban cigarettes?”) and introspection (“What if I had finished high school?”), we must invoke causal models. Associations are not enough — and this is a mathematical fact, not opinion.”
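Pearl’s point about associations can be sketched in a few lines of Python. The toy model below (my numbers, purely illustrative) has a hidden confounder Z driving both X and Y. The observed association P(Y=1 | X=1) then overstates the causal effect P(Y=1 | do(X=1)), which we obtain by simulating the intervention that severs Z’s influence on X:

```python
# A minimal sketch of the association-vs-intervention distinction,
# using a toy confounded model:  Z -> X,  Z -> Y,  X -> Y.
# All probabilities below are made up for illustration.
import random
random.seed(0)

def sample(do_x=None):
    z = 1 if random.random() < 0.5 else 0              # hidden confounder
    if do_x is None:
        x = 1 if random.random() < (0.8 if z else 0.2) else 0  # Z drives X
    else:
        x = do_x                                       # intervention cuts Z -> X
    y = 1 if random.random() < 0.1 + 0.4*x + 0.4*z else 0      # X and Z cause Y
    return x, y

N = 100_000
obs = [sample() for _ in range(N)]                     # passive observation
p_y_given_x1 = sum(y for x, y in obs if x == 1) / sum(x for x, _ in obs)
p_y_do_x1 = sum(y for _, y in (sample(do_x=1) for _ in range(N))) / N

print("P(Y=1 | X=1)     ~", round(p_y_given_x1, 2))   # inflated by Z
print("P(Y=1 | do(X=1)) ~", round(p_y_do_x1, 2))      # the true causal effect
```

No amount of cleverer curve fitting on the observational data alone recovers the second number; you need the causal model to know which arrow the intervention cuts – which is exactly the mathematical fact Judea is pointing at.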
The road to get there loops back to how machines can conceive reality; in other words, how can they be gifted with a working ontology that allows them to model their current environment (as humans do) and then deal with it and/or do things within it. And going a bit further, machines must have ‘free will’, according to Judea, because evolution tells us that having the “sensation of free will serves some (critical) computational function.”
The evidence that machines actually have free will, according to Judea, will be when “robots start communicating with each other counterfactually, like ‘You should have done better.’” The capacity for free will immediately invokes the notion of evil, which itself requires an operational definition. According to Judea, evil occurs in an agent that has the capacity to believe, and to act on the belief, that its “greed or grievance supersedes all standard norms of society.” A robot is seen to commit evil when it ignores the maintenance of its known social norms in order to satisfy its greed or assuage its grievance.
There is more, but this should give all of us a good chunk to chew on in these pre-Singularity years, as we consider the new areas of research that computer scientists like Judea Pearl and other workers in machine intelligence are now contemplating that lie beyond the still marvelous and magical capabilities provided daily to machines through “curve fitting”.