Okay I admit I got busy with classes and jobs last week and did not update, so I decided just now to write about this problem we talked about in CS330 discussion last week.
Consider the following game:
There is a line of coins with values 𝑣1, 𝑣2 … 𝑣𝑛 (𝑣1 is the value of the left-most coin, 𝑣2 is the value of the second leftmost coin… 𝑣𝑛 is the value of the right-most coin).
Each player takes turns taking either the left-most or rightmost remaining coin until all coins are gone. A player’s score is the sum of the values of the coins he takes on his turns. Using dynamic programming, write an algorithm which, given the current sequence of coins 𝑣1, 𝑣2 … 𝑣𝑛 in the form of the array 𝑉, determines the score of the two players if both play optimally, i.e. play to maximize their score at the end of the game. Report its runtime as well.
The central spirit of dynamic programming is to dismantle the problem into steps and only caring about each step. The recursion will simply take care of the rest. In this particular problem, what we care about is which coin the player chooses at his/her turn. There are only two options: choosing the left-most or the right-most. After that we are going to have a smaller subsequence of coins, which we will leave to the recursion process. Therefore, we can store the best possible scores for the two players for each subsequence.
A subsequence is determined by two variables: its start point and end point. However, if we increment these two variables to do dynamic programming we are going to have a problem: the current entry will depend on both larger index and smaller index of results to calculate, which will cost us a lot of time. As an alternative, we can increment the start index and the length of subsequence. The next step is to figure out the dp equation.
Let A[i, j](1) and A[i, j](2) denote the best possible scores of two players if they start playing next on the subsequence A[i, j]. c[i] is the value of coin i. Obviously, A[i, i](1) = c[i] and A[i, i](2) = 0. Suppose player 1 is facing the subsequence A[i, j]. He can pick either the coin i or j, obtaining a vlue of c[i] or c[j]. If he picks coin i, his gain would be A[i+1, j](2) + c[i]. Number 2 is in the parentheses because for the sequence A[i+1, j] left after his pick, he would be considered the second player. Similarly, if he picks coin j, his gain would be A[i, j-1](2) + c[j]. Therefore, for this player, A[i, j](1) = max(A[i+1, j](2) + c[i], A[i, j-1] + c[j]).
Now let’s consider the other player. For the subsequence A[i][j], what is left for the second player is completely dependent on which coin the first player picked, so two possible subsequences for player 2: A[i+1, j] and A[i, j-1], for which he is considered the first player. Now we can derive the complete relationship:
For subsequence A[i, j]: if (A[i+1, j](2) + c[i]) > (A[i, j-1] + c[j]): A[i, j](1) = A[i+1, j](2) + c[i] A[i, j](2) = A[i+1, j](1) else: A[i, j](1) = A[i, j-1](2) + c[j] A[i, j](2) = A[i, j-1](1)
Again, we can’t increment on i and j in this case. Instead, we can increment the start index i and a length l. j would be equal to i+l-1.
I am beginning to experience the power of dynamic programming. The problem is since it’s so powerful, the test for this class could be insanely hard…