CS 5350/6350: Machine Learning Homework 1

\$35.00

Description

5/5 - (1 vote)

1 Decision Trees
Several questions below ask you to write down decision trees. Report decision trees as a series of if-then
statements. For example, you could write a tree as follows:
if feature 0 = x:
    if feature 1 = y:
        if feature 2 = z:
            class = +
        if feature 2 != z:
            class = -
    if feature 1 != y:
        class = +
if feature 0 != x:
    if feature 1 = r:
        class = +
    if feature 1 != r:
        class = -
1. [Boolean Functions, 9 points] Recall from class that Boolean functions can be represented as decision trees. Represent the following Boolean functions as decision trees:
(a) (A ∨ B) ∧ C [3 points]
(b) A ⊕ B [3 points]
(c) A ∧ ¬B ∧ ¬C ∧ D [3 points]
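As a sanity check on a candidate answer, a decision tree written in the nested if-then format above can be verified against its target Boolean function by enumerating the truth table. A minimal Python sketch (the function and tree names here are illustrative, not part of the assignment):

```python
from itertools import product

# Target functions from parts (a)-(c), written as Python lambdas.
targets = {
    "(A or B) and C": lambda A, B, C, D: (A or B) and C,
    "A xor B": lambda A, B, C, D: A != B,
    "A and not B and not C and D": lambda A, B, C, D: A and not B and not C and D,
}

# Example: a candidate decision tree for (A or B) and C, written as
# nested conditionals mirroring the if-then format shown earlier.
def tree_a(A, B, C, D):
    if A:
        return C          # A = true: answer depends only on C
    else:
        if B:
            return C      # A = false, B = true: depends on C
        else:
            return False  # A = false, B = false: always false

# Check the tree against its target on all 16 assignments.
ok = all(tree_a(A, B, C, D) == targets["(A or B) and C"](A, B, C, D)
         for A, B, C, D in product([False, True], repeat=4))
print(ok)  # True when the tree matches the function everywhere
```

The same check works for parts (b) and (c) by swapping in the corresponding tree function.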
2. [Inducing Decision Trees, 21 points] For this question, you will use the Balloons dataset
from the UCI Machine Learning repository. The data consists of four attributes
(Color, Size, Action and Age) and a binary label.
The entire dataset is shown in Table 1 below.
Label  Color   Size   Action   Age
T      YELLOW  SMALL  STRETCH  CHILD
F      YELLOW  SMALL  DIP      CHILD
F      YELLOW  SMALL  DIP      CHILD
T      YELLOW  LARGE  STRETCH  CHILD
F      YELLOW  LARGE  DIP      CHILD
F      YELLOW  LARGE  DIP      CHILD
T      PURPLE  SMALL  STRETCH  CHILD
F      PURPLE  SMALL  DIP      CHILD
F      PURPLE  SMALL  DIP      CHILD
T      PURPLE  LARGE  STRETCH  CHILD
F      PURPLE  LARGE  DIP      CHILD
F      PURPLE  LARGE  DIP      CHILD
Table 1: Balloon Data Set
(a) Find the entropy of the balloon dataset. [2 points]
(b) Find the information gain of the Action feature. [2 points]
(c) Use the ID3 heuristic we have seen in class to manually construct a decision
tree that correctly classifies the data. [7 points]
(d) We will now develop another heuristic for learning decision trees instead of ID3.
If, at some node, we stop growing the tree and assign the majority label of the
remaining examples at that node, then the empirical error on the training set
at that node will be

MajorityError = 1 − max(p, 1 − p)

where p is the fraction of examples that are labeled T and, hence, 1 − p is the
fraction labeled F. Notice that MajorityError can be thought of as a measure
of impurity just like entropy.
Redefine information gain using MajorityError as the measure of impurity.
What is the redefined information gain of the Action feature? [3 points]
(e) Using the MajorityError-based impurity measure, construct a decision tree for
the balloons data. [7 points]
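The quantities asked for in parts (a), (b), and (d) can be cross-checked mechanically. The following is a minimal sketch (helper names are my own, and the question still expects the calculations by hand) that computes the dataset entropy and the information gain of a feature under either impurity measure:

```python
from math import log2

# Balloon dataset from Table 1: (label, color, size, action, age).
data = [
    ("T", "YELLOW", "SMALL", "STRETCH", "CHILD"),
    ("F", "YELLOW", "SMALL", "DIP",     "CHILD"),
    ("F", "YELLOW", "SMALL", "DIP",     "CHILD"),
    ("T", "YELLOW", "LARGE", "STRETCH", "CHILD"),
    ("F", "YELLOW", "LARGE", "DIP",     "CHILD"),
    ("F", "YELLOW", "LARGE", "DIP",     "CHILD"),
    ("T", "PURPLE", "SMALL", "STRETCH", "CHILD"),
    ("F", "PURPLE", "SMALL", "DIP",     "CHILD"),
    ("F", "PURPLE", "SMALL", "DIP",     "CHILD"),
    ("T", "PURPLE", "LARGE", "STRETCH", "CHILD"),
    ("F", "PURPLE", "LARGE", "DIP",     "CHILD"),
    ("F", "PURPLE", "LARGE", "DIP",     "CHILD"),
]

def entropy(examples):
    """Binary entropy of the label distribution; 0 for a pure set."""
    if not examples:
        return 0.0
    p = sum(1 for ex in examples if ex[0] == "T") / len(examples)
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def majority_error(examples):
    """MajorityError = 1 - max(p, 1 - p), as defined in part (d)."""
    if not examples:
        return 0.0
    p = sum(1 for ex in examples if ex[0] == "T") / len(examples)
    return 1 - max(p, 1 - p)

def gain(examples, feature_index, impurity):
    """Information gain of a feature under the given impurity measure."""
    values = {ex[feature_index] for ex in examples}
    expected = sum(
        len(subset) / len(examples) * impurity(subset)
        for v in values
        for subset in [[ex for ex in examples if ex[feature_index] == v]]
    )
    return impurity(examples) - expected

# Feature indices: 1 = Color, 2 = Size, 3 = Action, 4 = Age.
print(round(entropy(data), 4))                  # dataset entropy
print(round(gain(data, 3, entropy), 4))         # gain of Action (entropy)
print(round(gain(data, 3, majority_error), 4))  # gain of Action (MajorityError)
```

Since Action splits the data into two pure subsets (all STRETCH examples are T, all DIP examples are F), its gain equals the impurity of the full dataset under either measure.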
3. (For CS 6350 students, 15 points) Use the first twelve examples to create decision
trees using both heuristics. How well do the trees perform on the remaining
examples? Report the error rates (i.e., the ratio of errors to the number of examples)
of the two algorithms.
2 Nearest Neighbors
The nearest neighbors algorithm partitions the space of examples into regions corresponding
to different labels. In two dimensions, the decision boundaries can be represented as a
Voronoi diagram, which shows regions of the plane associated with each label.
For this part, you will be drawing the decision boundaries for simple datasets.
1. [5 points] Using the Euclidean distance measure between points, show a Voronoi map
corresponding to the nearest neighbor classification of the following four points. (That
is, draw a diagram that shows how the nearest neighbor classification of the following
four points partitions the two dimensional plane.)
Label   x    y
A       1    1
A       1   -1
B      -1   -1
C       2   -2
2. [5 points] Using the city-block distance measure, show a Voronoi map corresponding
to the nearest neighbor classification of the following three points.
(Recall that the city-block measure/Manhattan distance/L1 distance/taxicab metric