Representing Distributions in BN - Discrete child node with discrete parent nodes

Assume $X$ is a discrete variable whose parents $D_{1}, \dots, D_{k}$ are also discrete, each taking values from a finite set. Then $P(X \mid D_{1}, \dots, D_{k})$ can be represented as a table, called a Conditional Probability Table (CPT), that gives the probability of each value of $X$ for every joint assignment to $D_{1}, \dots, D_{k}$. For example, if all variables are binary (variables for which YES or NO is a sufficient answer), the table must hold $2^{k}$ distributions. When a single child node has multiple parents, a multidimensional table must be constructed. For example, if a binary child node has 10 discrete parent nodes, each taking 4 discrete values, the table must cover $4^{10} = 1{,}048{,}576$ parent configurations, i.e. 2,097,152 probability entries.
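To make the growth of the table concrete, here is a minimal Python sketch (the helper `cpt_size` is illustrative, not part of any particular BN library) that counts the entries of a full CPT as the product of the parent cardinalities times the child cardinality:

```python
def cpt_size(child_cardinality, parent_cardinalities):
    """Entries in a full CPT: one distribution over the child
    for every joint assignment to the parents."""
    n_parent_configs = 1
    for c in parent_cardinalities:
        n_parent_configs *= c            # multiply the parent cardinalities
    return n_parent_configs * child_cardinality

# k = 3 binary parents of a binary child: 2**3 = 8 distributions, 16 entries
print(cpt_size(2, [2] * 3))     # 16

# 10 parents with 4 values each: 4**10 = 1,048,576 distributions
print(cpt_size(2, [4] * 10))    # 2,097,152 entries
```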

Constructing a multidimensional table as described above, and, most importantly, acquiring all of this numeric information in order to create the network, is a drawback of BN theory: the quantitative (numeric) information essential for building the network is often not readily available.

Many methods for overcoming this problem in constructing BNs have been proposed in the literature. The most widely used is the noisy-OR gate [1]. Consider a binary child node $f$ that can take values $f^{+}$ and $f^{-}$ (in the sense of a positive or negative result) and has $n$ binary parents $d_{1}, \dots, d_{n}$, each of which may take values $d_{i}^{+}$ and $d_{i}^{-}$, $1 \leq i \leq n$. Under the assumption of causal independence — meaning that the effects of the parent nodes on the child node occur independently of one another and independently of any other event that may affect the child node — we can construct the CPT for the child node by acquiring only $n$ conditional probabilities (each representing the probability that the child is positive when only one parent is present) instead of $2^{n}$. The remaining entries are obtained by the following formula [2]: \[ P(f^{+} \mid H) = 1 - \prod_{d_{i} \in H^{+}} \left[ 1 - P(f^{+} \mid \text{only } d_{i}^{+}) \right] \] where $H$ is a hypothesis assigning a value to every parent $d_{i}$ of node $f$, and $H^{+}$ is the subset of $H$ containing the parents assigned the value $d_{i}^{+}$.
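As a minimal sketch (the function name `noisy_or` and its signature are assumptions made here for illustration, not taken from the cited papers), the entry $P(f^{+} \mid H)$ can be computed from the $n$ single-parent probabilities as follows:

```python
def noisy_or(p_single, present):
    """Noisy-OR combination.
    p_single[i] = P(f+ | only d_i is present); present[i] is True
    when d_i takes the value d_i^+ in the hypothesis H.
    Returns P(f+ | H) = 1 - prod over present parents of (1 - p_single[i])."""
    prob_no_effect = 1.0
    for p, is_present in zip(p_single, present):
        if is_present:
            prob_no_effect *= (1.0 - p)   # parent d_i fails to produce f+
    return 1.0 - prob_no_effect

# Example: three parents, with only d_1 and d_3 present in H.
# P(f+|H) = 1 - (1 - 0.8) * (1 - 0.3) = 0.86
print(noisy_or([0.8, 0.6, 0.3], [True, False, True]))
```

Looping over all $2^{n}$ hypotheses $H$ with such a function reproduces the full CPT from only the $n$ elicited probabilities.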


1. J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc, 1988.
2. M. A. Shwe, B. Middleton, D. E. Heckerman, M. Henrion, E. J. Horvitz, H. P. Lehmann, and G. F. Cooper. Probabilistic Diagnosis Using a Reformulation of the INTERNIST-1/QMR Knowledge Base. Methods of Information in Medicine, 30:241–255, 1991.

