## Description

When the travelling salesman meets the MST

The travelling salesman problem (often abbreviated as TSP) is one of the most famous problems in

combinatorial optimization. Like many good problems, it is a simple question that is easy to state.

You are given a set of cities v0, v1, . . . , vn. You know the distance1 between any pair of cities. What

is the fastest route that visits every city exactly once, and then returns to where you started?

See Figure 1 for an example of the travelling salesman problem, along with three possible

solutions. A valid tour visits every city exactly once, and ends back at the first node—forming a

cycle2

. The goal is to find the shortest valid tour.

The travelling salesman problem has a reputation for being very hard: it is well-known to be

NP-hard, meaning that there is no polynomial time algorithm that can find an optimal tour unless

P = NP. In reality, though, the travelling salesman problem is not that difficult: we can efficiently

calculate solutions that are pretty good. That is, for a given set of cities, we can find a tour that is

approximately as good as the best tour3

.

In this problem set, you will be developing a solution for the travelling salesman problem based

on minimum spanning trees. The algorithm you develop will be a 2-approximation scheme, i.e., it

will guarantee that the tour it discovers is at most twice as long as the optimal tour.

1For today, we are focusing on the Euclidean travelling salesman problem where the distance between two cities

is the Euclidean distance in the 2d-plane.

2Technically, a valid tour is known as a Hamiltonian Cycle.

3

In fact, where the distances are Euclidean, you can get an approximation that is as close as you like! This is

known as a PTAS: polynomial time approximation scheme.

1

A B

D

10

8

C

9

5

A B 10

D

8

C 6

5

A B 10

D

10

C

9

6

A B

10

D

10

8

C

9

6

5

Figure 1: The top graph is an example of a TSP problem containing 4 cities {A, B, C, D}. On

the bottom, with dashed lines, are three sample tours. The first (A, D, B, C, A) has cost 32, the

second (A, D, C, B, A) has cost 29, and the third (A, B, D, C, A) has cost 35. Notice that each

tour visits all four cities exactly once.

2

TSP-MST Algorithm. In describing the algorithm4

, we will think of the map as a graph G =

(V, E) where V is a set of n cities and E is a set of n(n − 1)/2 undirected edges connecting every

pair of cities. The algorithm consists of three steps:

1. Find a minimum spanning tree T of the graph G. (You can use any efficient algorithm for

finding a minimum spanning tree.) Notice that the cost of the tree T is always less than the

cost of the optimal tour (since if you remove any one edge from the optimal tour, it forms a

spanning tree).

2. Perform a depth-first-search walk of the tree T. When you perform the depth-first-search,

remember every time you visit a node. Every node in the graph appears at least twice in

the DFS walk. (Every edge in the tree T is crossed exactly twice in the DFS walk.) Let

D = d0, d1, d2, d3, . . . , d2n−1 be the 2n cities visited on the DFS treewalk. Notice that D has

cost at most twice the optimal tour (since each edge is crossed twice), and visits every city

at least once. It is not a valid tour, however, since cities are visited more than once.

3. Take short-cuts to avoid revisiting cities. For example, if you are in city di and have already

visited city di+1, then skip city di+1 and go directly to di+2. (If you have already visited di+2,

then skip it and go on to the next city.) Since the distances satisfy the triangle inequality,

these short-cuts can only decrease the length of the tour. Now you have a valid tour that is

at most twice as long as the optimal tour.

!” #”

$”

%”

&” ‘”

(”

)”

*”

+,”

-” +,”

+.”

++” +(”

Figure 2: Here, we demonstrate how the algorithm works. In the figure, we have already

calculated a minimum spanning tree, which consists of four edges: (A, D), (B, C), (C, D), and

(C, E). (The dashed edges are not part of the spanning tree.) Next, starting at node A, we

perform a depth-first-search treewalk, remembering every time we visit a node:

A → D → C → B → C → E → C → D → A

We have put in bold the first time the DFS traversal visits a node, and we have used red to

indicate later (unnecessary) visits. Finally, to devise the final tour, we skip the cities we have

already visited, yielding the following tour:

A → D → C → B → E → A

Notice that the first city is visited again at the end of the tour, completing the cycle.

4Note that this 2-approximation algorithm only works if the problem instance satisfies the traingle inequality. In

the context of this question, it means that the cost of going from city A to city C is less than the cost of going from

city A to city B, then city B to city C.

3

TSPMap class. You are provided, as part of this problem, with the java class TSPMap. This class

encapsulates the basic functionality for storing the locations of a set of cities, calculating the distance

between the cities, and drawing a map of the cities. The constructor TSPMap(String fileName)

takes a map filename as an input and reads in all the points in the file. You can then access these

points with the following methods:

• int getCount(): returns the number of points.

• Point getPoint(int i): returns point i.

• double pointDistance(int i, int j): calculates the distance between points i and j.

Each point can also store a single link to some other point. This link may be the parent edge in

a spanning tree, or it might be the next point to visit on an MST tour. You can access the link

using the following methods:

• void setLink(int i, int j): sets a link from i to j, and redraws the screen immediately.

• void setLink(int i, int j, boolean redraw): sets a link from i to j and redraws only

if redraw==true.

• void eraseLink(int i): erases the outgoing link from i and redraws the screen immediately.

• void eraseLink(int i, boolean redraw): erases the outgoing link from i and redraws

only if redraw==true.

• int getLink(int i): returns the current outgoing link from i.

Whenever you modify the map, you may want to call the redraw() method to redraw the map.

The map class contains a static nested class Point which contains the information for a point.

Notably, this class stores the x and y coordinates for a point, as well as the link. It includes

functionality to calculate the distance between two points.

The main method has an example of how to use the map class. It reads in a file hundredpoints.txt,

and then sets up the links: each point i is linked to point i + 1, and the last point is linked back to

point 0—creating a valid (if suboptimal) tour. Finally, it calls redraw to draw the tour.

4

Assignment. Your task is to implement a class TSPGraph that implements that IApproximateTSP

interface. This interface supports five methods:

• initialize(TSPMap map): Initializes your class with a new map containing points in a Euclidean plane.

• MST(): Calculate a minimum spanning tree for the points on the map. When the method

returns, each point in the map should have its link set to its parent in the spanning tree.

• TSP(): Calculate a good TSP tour using the algorithm described above: perform a depth-first

search on a minimum spanning tree, using short-cuts to ensure that each point is visited only

once. When the method returns, each point in the map should have its link set to the next

point on the tour. Note that TSP is a standalone method, MST should be called again within

it.

• isValidTour(): Returns true if the links on the map form a valid tour. (Note: a tour can

be valid even if it is far from optimal.) This may be useful for debugging.

• tourDistance(): If the links on the map form a valid tour, then this method returns the

total length of that tour (i.e., the sum of all the edges on the tour). If the links do not form

a valid tour, return -1. Note that the length of the tour includes the cost of returning back

to the start when the tour is finished.

Figure 3: This figure contains an example using the file “fiftypoints.txt”. The first image

contains the fifty points. The second image depicts a minimum spanning tree. The third

image depicts a valid tour for the travelling salesman problem.

5