# Causal Discovery and Algorithms

**- Overview**

Causal discovery algorithms analyze data to build models of the relationships among variables. These algorithms fall broadly into two families: constraint-based algorithms and score-based algorithms.

A causal discovery algorithm is considered reasonable if it can, in principle, solve the causal discovery problem. It is considered complete if it extracts the most informative causal diagram obtainable from the input data set without making additional assumptions.

Causal discovery algorithms generate a set of candidate causal structures that are compatible with the data. This set is called a "Markov equivalence class": its members all imply the same conditional independence relations, and so cannot be distinguished from observational data alone.

Causal discovery algorithms consider the entire class of models that are consistent with background assumptions. They then test these models against the data, retaining the set of models consistent with both the data and the assumptions.
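As a concrete illustration of a Markov equivalence class: the chain X → Y → Z and the fork X ← Y → Z imply exactly the same conditional independence (X ⟂ Z given Y) and so belong to the same class, while the collider X → Y ← Z implies a different pattern (X ⟂ Z marginally, but X and Z become dependent given Y). The sketch below checks this exactly on small hand-built joint distributions; the numerical parameters are arbitrary illustrative choices, not taken from any source.

```python
from itertools import product

def ci_given(joint, a, b, c):
    """Check a ⟂ b | c: P(a, b | c) == P(a | c) * P(b | c) for all values."""
    for cv in (0, 1):
        pc = sum(p for k, p in joint.items() if k[c] == cv)
        for av, bv in product((0, 1), repeat=2):
            pab = sum(p for k, p in joint.items()
                      if k[a] == av and k[b] == bv and k[c] == cv) / pc
            pa = sum(p for k, p in joint.items()
                     if k[a] == av and k[c] == cv) / pc
            pb = sum(p for k, p in joint.items()
                     if k[b] == bv and k[c] == cv) / pc
            if abs(pab - pa * pb) > 1e-9:
                return False
    return True

def ci_marginal(joint, a, b):
    """Check a ⟂ b with no conditioning set."""
    for av, bv in product((0, 1), repeat=2):
        pab = sum(p for k, p in joint.items() if k[a] == av and k[b] == bv)
        pa = sum(p for k, p in joint.items() if k[a] == av)
        pb = sum(p for k, p in joint.items() if k[b] == bv)
        if abs(pab - pa * pb) > 1e-9:
            return False
    return True

def chain():
    # X -> Y -> Z over binary variables; keys are (x, y, z)
    return {(x, y, z): (0.3 if x else 0.7)
                       * (0.8 if y == x else 0.2)
                       * (0.9 if z == y else 0.1)
            for x, y, z in product((0, 1), repeat=3)}

def fork():
    # X <- Y -> Z
    return {(x, y, z): 0.5
                       * (0.7 if x == y else 0.3)
                       * (0.9 if z == y else 0.1)
            for x, y, z in product((0, 1), repeat=3)}

def collider():
    # X -> Y <- Z; P(y = 1 | x, z) is higher when x and z agree
    p_y1 = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.9}
    return {(x, y, z): (0.3 if x else 0.7)
                       * (0.6 if z else 0.4)
                       * (p_y1[(x, z)] if y else 1 - p_y1[(x, z)])
            for x, y, z in product((0, 1), repeat=3)}

# Chain and fork are indistinguishable by independencies; the collider is not.
print(ci_given(chain(), 0, 2, 1),     # X ⟂ Z | Y in the chain
      ci_given(fork(), 0, 2, 1),      # X ⟂ Z | Y in the fork
      ci_given(collider(), 0, 2, 1),  # X, Z dependent given Y in the collider
      ci_marginal(collider(), 0, 2))  # but marginally independent
```

This is why observational data alone can orient the collider's edges but must leave the chain/fork edges undirected within the equivalence class.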

The PC algorithm is widely treated as the default constraint-based algorithm for causal discovery. It shares the assumptions and consistency properties of the SGS algorithm, but runs faster because it performs fewer statistical tests.
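A minimal sketch of the PC algorithm's first (skeleton) phase is below. It starts from the complete undirected graph and removes an edge a–b whenever a and b are independent given some subset of a's neighbours, trying conditioning sets of increasing size. For determinism the statistical test is replaced here by a d-separation oracle for an assumed true chain X → Y → Z; the variable names and the oracle are illustrative assumptions, not part of the text.

```python
from itertools import combinations

def pc_skeleton(nodes, indep):
    """PC skeleton phase. `indep(a, b, s)` stands in for a statistical
    conditional-independence test on data; here it will be an oracle."""
    adj = {n: set(nodes) - {n} for n in nodes}   # complete graph
    sepset = {}
    level = 0
    # Keep testing with larger conditioning sets while some edge still has
    # enough neighbours to supply a set of the current size.
    while any(len(adj[a] - {b}) >= level for a in nodes for b in adj[a]):
        for a in nodes:
            for b in list(adj[a]):
                if len(adj[a] - {b}) < level:
                    continue
                for s in combinations(sorted(adj[a] - {b}), level):
                    if indep(a, b, set(s)):
                        adj[a].discard(b)
                        adj[b].discard(a)
                        sepset[frozenset((a, b))] = set(s)  # recorded for the
                        break                               # orientation phase
        level += 1
    return adj, sepset

# Oracle encoding the independencies of the assumed true chain X -> Y -> Z:
# the only conditional independence is X ⟂ Z | Y.
def oracle(a, b, s):
    return {a, b} == {"X", "Z"} and "Y" in s

adj, sepset = pc_skeleton(["X", "Y", "Z"], oracle)
print(adj)     # X and Z are each adjacent only to Y; the X-Z edge is removed
print(sepset)  # the separating set {Y} is stored for edge orientation
```

Because Y appears in the separating set of X and Z, the subsequent orientation phase would not create a collider at Y, leaving the undirected skeleton X–Y–Z as the Markov equivalence class.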

**- Causal Structure Discovery (CSD) Methods**

Causal discovery aims to find causal relations by analyzing observational data. In other words, given a dataset, derive a causal model that describes it. The observed data are produced not only by the underlying causal process, but also by the sampling process. In practice, to achieve reliable causal discovery, one needs to address the specific challenges posed by the causal process or the sampling process.

Causal structure discovery (CSD) is the problem of identifying causal relationships from large amounts of data computationally. Due to the limited ability of traditional association-based computational methods to discover causal relationships, CSD methods are gaining popularity.

A fundamental task in various disciplines of science, including biology, is to find underlying causal relations and make use of them. Causal relations can be identified when interventions are properly applied; however, in many cases interventions are difficult or even impossible to conduct. It is then necessary to discover causal relations by analyzing statistical properties of purely observational data, which is known as causal discovery or causal structure search.

**- Identifying Causal Relations and The Laws of Regularities**

Almost all of science is about identifying causal relations and the laws or regularities that govern them. Since the seventeenth century beginnings of modern science, there have been two kinds of procedures, and resulting kinds of data, for discovering causes: manipulating and varying features of systems to see what other features do or do not change; and observing the variation of features of systems without manipulation. Both methods shone in the seventeenth century, when they were as intertwined as they are today.

Evangelista Torricelli manipulated the angles and shapes of tubes filled with mercury standing in a basin of the stuff, showing that the height of the mercury in the tubes did not vary; Pascal had a barometer of Torricelli's design carried up a mountain, the Puy de Dôme, to show that the height of the mercury did vary with altitude. Galileo, for whom Torricelli worked, had identified (qualitatively) the orbits of Jovian satellites from observational time series, and similarly characterized sunspots. Kepler, Galileo's northern contemporary, adduced his three laws from planetary observations, and a generation later Newton laid the foundations of modern physics with a gravitational law adduced from solar system observations and a single experiment, on pendulums. Modern molecular biology is an experimental subject, but the foundation of biology, in Darwin's Origin of Species, rests on only a single experiment, the drifting of seeds.

**- Big Data and Algorithms for Identifying Causal Relations**

A traditional way to discover causal relations is to use interventions or randomized experiments, which is in many cases too expensive, too time-consuming, or even impossible. Therefore, revealing causal information by analyzing purely observational data, known as causal discovery, has drawn much attention.

Past decades have seen a series of cross-disciplinary advances in algorithms for identifying causal relations and effect sizes from observational data or mixed experimental and observational data. These developments promise to enable better use of appropriate "big data." They have already been applied in genomics, ecology, epidemiology, space physics, clinical medicine, neuroscience, and many other domains, often with experimental or quasi-experimental validation of their predictions.

In traditional causality research, algorithms for the identification of causal effects, that is, inferences about the effects of interventions when the causal relations are completely or partially known, address a different class of problems.


**- Bayes Networks (Bayesian Networks)**

A Bayesian network (also known as a Bayes network, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Bayesian networks are ideal for taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases.

Efficient algorithms can perform inference and learning in Bayesian networks. Bayesian networks that model sequences of variables (e.g. speech signals or protein sequences) are called dynamic Bayesian networks. Generalizations of Bayesian networks that can represent and solve decision problems under uncertainty are called influence diagrams.
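The disease–symptom example above can be sketched as a tiny network: Flu → Fever and Flu → Cough, with the posterior over the disease computed by exact enumeration over the DAG's factorization. All variable names and probability tables below are made-up illustrative assumptions, not medical estimates.

```python
# Toy Bayesian network: Flu -> Fever, Flu -> Cough.
# All CPT numbers are illustrative assumptions.
P_FLU = {1: 0.1, 0: 0.9}
P_FEVER = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.05, 0: 0.95}}  # P(fever | flu)
P_COUGH = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.1, 0: 0.9}}    # P(cough | flu)

def joint(flu, fever, cough):
    """DAG factorization: P(flu) * P(fever | flu) * P(cough | flu)."""
    return P_FLU[flu] * P_FEVER[flu][fever] * P_COUGH[flu][cough]

def posterior_flu(fever, cough):
    """P(flu = 1 | observed symptoms), by exact enumeration over the net."""
    num = joint(1, fever, cough)
    return num / (num + joint(0, fever, cough))

print(round(posterior_flu(1, 1), 3))  # both symptoms: belief far above the 0.1 prior
print(round(posterior_flu(0, 0), 3))  # no symptoms: belief below the prior
```

This is "given symptoms, compute the probabilities of the presence of various diseases" in miniature: the same enumeration scales (with smarter algorithms such as variable elimination) to networks with many diseases and symptoms.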

**- Causal Inference**

Causal inference is the process of drawing a conclusion about a causal connection based on the conditions of the occurrence of an effect. The main difference between causal inference and inference of association is that the former analyzes the response of the effect variable when the cause is changed.

Causal inference has numerous real-world applications in domains such as health care, marketing, political science, and online advertising. Treatment effect estimation, a fundamental problem in causal inference, has been studied extensively in statistics for decades. However, traditional treatment effect estimation methods may not handle large-scale, high-dimensional heterogeneous data well.

In recent years, an emerging research direction that combines the advantages of traditional treatment effect estimation approaches (e.g., matching estimators) with advanced representation learning approaches (e.g., deep neural networks) has attracted increasing attention in the broad AI field.
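As a minimal sketch of the matching-estimator idea mentioned above: impute each unit's missing counterfactual outcome with the outcome of its nearest neighbour in the opposite treatment group, then average the individual effects. The single-covariate data set below is fabricated purely for illustration.

```python
def matching_ate(data):
    """1-nearest-neighbour matching estimator of the average treatment
    effect. `data` is a list of (x, treated, outcome) tuples with a single
    covariate x; each unit's missing counterfactual is imputed with the
    outcome of the closest unit in the other treatment group."""
    effects = []
    for x, t, y in data:
        # distances to every unit in the opposite group
        candidates = [(abs(x - x2), y2) for x2, t2, y2 in data if t2 != t]
        _, y_match = min(candidates)          # closest opposite-group unit
        effects.append(y - y_match if t else y_match - y)
    return sum(effects) / len(effects)

# Fabricated example: outcome = x for controls and x + 2 for treated units
# with identical covariate values, so the true treatment effect is 2.
data = [(1, 1, 3), (2, 1, 4), (3, 1, 5),
        (1, 0, 1), (2, 0, 2), (3, 0, 3)]
print(matching_ate(data))  # -> 2.0
```

Representation learning approaches replace the raw covariate distance here with a distance in a learned representation space, which is what makes matching workable for high-dimensional data.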

**[More to come ...]**