## Description

1. Implement a belief propagation program that uses a factor graph and cost tables as input, and

produces the states/labels for all the variables. The BP program can be summarized in 5 steps as

shown in the lecture (Slide 9 of http://www.eng.utah.edu/~cs6320/cv_files/BeliefPropagation.pdf).

First, we initialize all messages from variable to factor nodes with zero’s or random values. The

algorithm iterates through steps 2 to 5 based on two terminating conditions. The first terminating

condition checks if the labels of all the variables change with subsequent iterations or not. If there is

no change, the algorithm is terminated. The second terminating condition checks if the number of

iterations has exceeded a threshold or not. Tips: During the implementation of BP, please assume that

all variables take the same number of labels. Test the code on the factor graph shown in Fig 1.

Figure 1: The above factor graph shows a toy problem with seven variable nodes (circles) and ten

factor nodes. [Image courtesy: Jonathan Yedidia]

We have 7 Boolean variables {𝑥1, 𝑥2, … , 𝑥7

}. The goal is to find the final states/labels for the Boolean

variables based on the 10 cost tables {𝐶1, 𝐶2, … , 𝐶7, 𝐶𝐴, 𝐴𝐵, 𝐶𝐶

}. As shown in the figure, the first 7

cost tables for the first 7 factor nodes are given by 2 x 1 matrix as shown above. For the last 3 parity

constraints, we have multi-dimensional tables of size 2x2x2x2. For example, the cost table for

𝐶𝐴

(𝑥1, 𝑥2, 𝑥3, 𝑥5

) is given below:

𝐶𝐴(𝑥1, 𝑥2, 𝑥3, 𝑥5

) = 0 𝑖𝑓 (𝑥1 + 𝑥2 + 𝑥3 + 𝑥5

) == 𝑒𝑣𝑒𝑛,

𝐶𝐴(𝑥1, 𝑥2, 𝑥3, 𝑥5

) = 1000 (𝑜𝑟 𝑖𝑛𝑓𝑖𝑛𝑖𝑡𝑦) 𝑖𝑓 (𝑥1 + 𝑥2 + 𝑥3 + 𝑥5

) == 𝑜𝑑𝑑

All the parity cost functions associated with 𝐶𝐴, 𝐶𝐵 and 𝐶𝐶 should be the same, while they still take

different input variables. We have a cost for every configuration of 4 Boolean variables, and thus the

multi-dimensional cost table will have 2 x 2 x 2 x 2 values. Set the maximum number of iterations to

be 10.

[50 Points]

2. Write a stereo matching program to compute the disparity map using the BP algorithm. The basic idea

of stereo matching is given here. You are given two images that we normally refer to as “left” and

“right” images. We can think of each image to have several horizontal scanlines. The images are said

to be rectified if every pixel in the left image has a matching pixel in the right image at the same

scanline. In a typical stereo pair, a pixel p(x,y) in the left image usually has a matching pixel q(x’,y) in

the right image. Note that the ‘y’ coordinate is the same on both the left and right images. Every pixel

in the left image gets shifted by a small amount ‘d’ on the right image in the following manner: x’ = x

– d. Here ‘d’ is supposed to be the disparity for the pixel p(x,y) in the left image. Note that a pixel that

is further away from the camera will have a very small disparity value. A pixel that is very close to the

camera will have a high disparity value. In other words, the depth “D” of a pixel is inversely

proportional to the disparity ‘d’. We consider the image as a 4-connected image graph as shown

below:

Figure 2: We show the factor graph for a stereo matching problem. The variables are shown in blue

circles, and the factor nodes are shown in squares. The green squares denote the unary potentials or

factor nodes that depend on only one variable. The red squares denote the pairwise potentials that

depend on two variables. Consider an image of dimensions M x N, where M is the height and N is the

width of the image. We have MN unary factor nodes and {(N-1)M + (M-1)N} pairwise factor nodes.

We will use BP to minimize the following energy function to find the optimum disparity

labels/states:

𝐸(𝑑) = 𝐸𝑑𝑎𝑡𝑎(𝑑) + 𝜆𝐸𝑠𝑚𝑜𝑜𝑡ℎ

(𝑑)

The first term is the data cost, and it is given by:

𝐸𝑑𝑎𝑡𝑎(𝑑) = ∑𝐶(𝑥, 𝑦, 𝑑(𝑥, 𝑦))

𝑥,𝑦

The cost C(x,y,d) in matching a p(x,y) with another pixel q(x-d,y) can be a simple pixel-wise intensity

difference as shown below:

𝐶(𝑥, 𝑦, 𝑑(𝑥, 𝑦)) = |𝐼𝐿

(𝑥, 𝑦) − 𝐼𝑅(𝑥 − 𝑑, 𝑦)|

Let us consider the left image to be the reference image. Each pixel in the left image can be seen as a

variable that can take different disparity values (say 1-50). As shown in Figure 2, the data term consists

of MN (M is the height of the image and N is the width of the image) unary factor nodes where each

factor node is attached to every pixel in the image. Each of the cost tables for unary factor node will

only consist of 50 values (assuming that each pixel can take only 50 disparity states).

We will use Potts model for the smoothness term as shown below:

𝐸𝑠𝑚𝑜𝑜𝑡ℎ(𝑑) = ∑ 𝑓(𝑑(𝑥𝑖

, 𝑦𝑖

), 𝑑(𝑥𝑗

, 𝑦𝑗))

(𝑖,𝑗)∈𝐸

𝑑𝑖𝑓𝑓 = |𝑑(𝑥𝑖

, 𝑦𝑖

) − 𝑑(𝑥𝑗

, 𝑦𝑗)|

Potts: 𝑓 = 1 𝑖𝑓 𝑑𝑖𝑓𝑓 > 0 𝑒𝑙𝑠𝑒 𝑓 = 0

In Figure 2, we show red squares that denote the pairwise or smoothness factor nodes. We consider

a 4-connected graph. We will have {(M-1)N+(N-1)M} factor nodes for pairwise terms. Each of the

cost tables for pairwise factor node will consist of [50 x 50] elements (assuming that you have 50

labels for disparity values). Note that these tables have the same entries for every pairwise factor

node. In fact, for the Potts model, these cost tables are simple matrices that have 0’s for diagonal

entries and 1’s for non-diagonal entries. These simple matrices are multiplied by 𝜆. Please do not

end up storing different tables for every pairwise factor node as this may lead to unnecessary

memory issues.

Use different values for 𝜆 = {1,10,50,100}. The termination condition should be either 20 iterations

or when the labels stop changing in subsequent iterations. Show the solutions and the value of the

final cost function. Show the final result in a disparity image. There are three image pairs provided

with this assignment. If you show the results on the first two image pairs (out of the 3 pairs of images

given with the assignment), it is sufficient. If you encounter overflow problems try to adjusting the

contents of the messages by subtracting the minimum element. [Hint: Please pay attention to

memory management and please don’t try to code a general BP program, as it may be challenging.]

[50 Points]