Module 2: Robotics Interlude - Planning


What do we mean by planning?

Exercise 1: Download and execute Arm.java.

We will now look at three planning problems as motivation for our study of planning:

Exercise 2: Download and unpack planning.jar, then execute PlanningGUI. Familiarize yourself with the three planning problems, and experiment with different goal configurations: find "easy" and "hard" configurations to reach from the initial configuration. How can you tell what's easy or hard?

About planning problems in general:


State spaces and neighborhoods

Sometimes we use state instead of configuration:

Exercise 3: Find a goal state for the arm problem. How does one specify a complete description of this state?

Exercise 4: How many possible states are there for each of the three problems?

A neighborhood:

Exercise 5: What is the size of the neighborhood for each of the three problems?

Exercise 6: Consider the starting state in the 8-puzzle demo. Draw this state on paper, draw all its neighbors and all neighbors of its neighbors.

The underlying graph:

What we know about the shortest-path problem in graphs:

Unfortunately ...

The approach taken by planning algorithms:

Before looking at planning algorithms, let's examine the output of a planning algorithm.

Exercise 7: Compile and load MazeHandPlanner into PlanningGUI for the maze problem.


Non-cost methods: BFS and DFS

First, we'll explain how BFS and DFS work
     => It'll become clear why they are "non-cost"

Note: BFS = Breadth-First Search, DFS = Depth-First Search

Key ideas:

  • We will consider the (unbuilt) graph of states and action-edges.

  • Instead of generating the whole graph, we will create nodes and edges as we proceed.
         => Turns out, we will not need to store edges.

  • Use two data structures:
    • Visited: a collection of nodes (states) that we have "finished processing"
    • Frontier: a collection of nodes that we have partially processed.

  • At each step:
    • Pick a node from Frontier.
    • Examine action-edges from that node.
    • If there are neighbors that are unexplored, add them to Frontier.
    • Stop when you find the goal, when Frontier is empty (no solution exists), or when you are out of memory.

  • Here is an example snapshot:

    • The blue nodes are in Visited.
    • The red nodes are in the Frontier.
    • The black nodes are nodes not yet explored (or even generated).

  • BFS and DFS differ in their selection of which frontier node to process next.
    • BFS: Frontier is a queue.
    • DFS: Frontier is a stack.

  • Pseudocode for BFS:
        Algorithm: BFS (start, goal)
        Input: the start and goal nodes (states)
        Initialize frontier (a queue) and visited, both empty
        frontier.add (start)
        while frontier not empty
            currentState = frontier.removeFirst ()
            if currentState = goal
                path = makeSolution ()
                return path
            endif
            visited.add (currentState)
            for each neighbor s of currentState
                if s not in visited and not in frontier
                    frontier.add (s)       // Add at the back of the queue
                endif
            endfor
        endwhile
        return null               // No solution
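
The pseudocode above can be sketched in Java as follows. This is a minimal illustration and not the code in planning.jar: the class name FrontierSearchSketch, the grid encoding (0 = free, 1 = wall), the cell numbering and the parent map used to rebuild the path are all assumptions made for this example. Passing depthFirst = true removes from the back of the deque, which treats the Frontier as a stack instead of a queue.

```java
import java.util.*;

// Illustrative sketch only; names and the grid encoding are assumptions.
class FrontierSearchSketch {

    // Returns the cells from start to goal, or null if the goal is unreachable.
    // A cell (row, col) is encoded as row * cols + col.
    static List<Integer> search(int[][] grid, int start, int goal, boolean depthFirst) {
        int rows = grid.length, cols = grid[0].length;
        Deque<Integer> frontier = new ArrayDeque<>();
        Set<Integer> visited = new HashSet<>();
        // parent doubles as the "already generated" test (in Visited or in Frontier).
        Map<Integer, Integer> parent = new HashMap<>();
        frontier.addLast(start);
        parent.put(start, -1);
        int[][] moves = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}}; // N, S, W, E
        while (!frontier.isEmpty()) {
            // Queue order gives BFS, stack order gives DFS.
            int current = depthFirst ? frontier.removeLast() : frontier.removeFirst();
            if (current == goal) {
                return makeSolution(parent, goal);
            }
            visited.add(current);
            int r = current / cols, c = current % cols;
            for (int[] m : moves) {
                int nr = r + m[0], nc = c + m[1];
                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols || grid[nr][nc] == 1) {
                    continue; // off the grid or a wall
                }
                int s = nr * cols + nc;
                if (!visited.contains(s) && !parent.containsKey(s)) {
                    parent.put(s, current);
                    frontier.addLast(s);
                }
            }
        }
        return null; // no solution
    }

    // Follows parent links back from the goal to the start.
    static List<Integer> makeSolution(Map<Integer, Integer> parent, int goal) {
        LinkedList<Integer> path = new LinkedList<>();
        for (int s = goal; s != -1; s = parent.get(s)) {
            path.addFirst(s);
        }
        return path;
    }
}
```

On an open 3x3 grid, search(grid, 0, 8, false) returns a path of five cells (four moves); with depthFirst = true the path still ends at the goal but may be longer.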
        

Exercise 8: Compile and load BFSPlanner into PlanningGUI for the maze problem. Verify that it finds the solution by clicking on the "next" button once the plan has been generated. Likewise, apply BFS to an instance of the puzzle problem and verify the correctness of the plan generated.

Exercise 9: Implement DFS by modifying the BFS code.

  • Compare BFS and DFS using a variety of puzzle-problem instances.
  • Identify both the number of moves (time taken) and the number of steps in the path.
  • Write recursive pseudocode for DFS. Is there an advantage (or disadvantage) to using recursion?

Exercise 10: Compare the memory requirements of BFS and DFS. In general, which one will require more memory? Can you analyze (on paper) the memory requirements for each?

Exercise 11: Examine the use of the two data structures in BFS and DFS. Identify the operations performed on each of these. How much time is taken for each operation performed on these data structures? Are there data structures which take less time?


Cost-based methods

The problem with BFS/DFS:

  • Neither of them uses any knowledge of the problem.
         => e.g., in the maze problem, it's easy to calculate the distance to the goal.

  • Both can end up wasting time by searching "away from the goal."

Cost-based methods:

  • Recall: one objective of a planning algorithm is to identify the plan with the least number of "moves" (lowest cost).

  • What is "cost"?
    • For most planning problems: cost is the number of moves from the start state.
    • Some planning problems include the realization cost (some actions may take more time).

The Cost-Based-Planner (CBP) Algorithm:

  • From among the Frontier nodes, pick the one with the least total cost from the start state.

  • Whenever a state is added to the Frontier, see if the cost to that state has been reduced.

  • Pseudocode:
        Algorithm: CostBased (start, goal)
        Input: the start and goal nodes (states)
        Initialize frontier and visited, both empty
        frontier.add (start)
        while frontier not empty
            currentState = remove from frontier the state with least cost
            if currentState = goal
                path = makeSolution ()
                return path
            endif
            visited.add (currentState)
            for each neighbor s of currentState
                if s not in visited and not in frontier
                    frontier.add (s)
                else if s in frontier
                    s' = frontier.find (s)         // Same state, reached earlier by another path
                    if cost(s) < cost(s')
                        frontier.replace (s', s)   // Keep only the cheaper copy
                    endif
                endif
            endfor
        endwhile
        return null               // No solution
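
A possible Java rendering of the cost-based idea is sketched below. It is not the code in CBPlanner.java: the class CostBasedSketch, the Node class and the map-of-maps graph (state name to neighbor name to action cost) are assumptions chosen to keep the example self-contained. One substitution is worth naming: instead of looking up the existing Frontier entry for a state, the sketch adds the cheaper copy to a java.util.PriorityQueue and skips outdated copies when they are removed.

```java
import java.util.*;

// Illustrative sketch only; all names and the graph encoding are assumptions.
class CostBasedSketch {

    // A Frontier entry: a state and its cost from the start when it was added.
    static final class Node {
        final String state;
        final int cost;
        Node(String state, int cost) { this.state = state; this.cost = cost; }
    }

    // Returns the least-cost path from start to goal, or null if none exists.
    // Action costs are assumed to be non-negative.
    static List<String> plan(Map<String, Map<String, Integer>> graph, String start, String goal) {
        PriorityQueue<Node> frontier =
                new PriorityQueue<>(Comparator.comparingInt((Node n) -> n.cost));
        Set<String> visited = new HashSet<>();
        Map<String, Integer> best = new HashMap<>();   // cheapest known cost per state
        Map<String, String> parent = new HashMap<>();
        frontier.add(new Node(start, 0));
        best.put(start, 0);
        while (!frontier.isEmpty()) {
            Node n = frontier.poll();                  // least cost from the start
            if (visited.contains(n.state)) {
                continue;                              // outdated copy, already processed
            }
            if (n.state.equals(goal)) {
                return makeSolution(parent, goal);
            }
            visited.add(n.state);
            Map<String, Integer> edges =
                    graph.getOrDefault(n.state, Collections.emptyMap());
            for (Map.Entry<String, Integer> edge : edges.entrySet()) {
                String s = edge.getKey();
                int cost = n.cost + edge.getValue();
                // Add s if it is new, or if this path to it is cheaper.
                if (!visited.contains(s) && cost < best.getOrDefault(s, Integer.MAX_VALUE)) {
                    best.put(s, cost);
                    parent.put(s, n.state);
                    frontier.add(new Node(s, cost));
                }
            }
        }
        return null; // no solution
    }

    // Follows parent links back from the goal; the start has no parent.
    static List<String> makeSolution(Map<String, String> parent, String goal) {
        LinkedList<String> path = new LinkedList<>();
        for (String s = goal; s != null; s = parent.get(s)) {
            path.addFirst(s);
        }
        return path;
    }
}
```

The priority queue makes "remove the state with least cost" a logarithmic-time operation, at the price of keeping some outdated copies in memory.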
        

Exercise 12: Implement the CBP by modifying the code in CBPlanner.java that is included in planning.jar. Most of the code has been written: you only need to extract the best node from the frontier using the costFromStart value in each state.

Exercise 13: Compare BFS with Cost-Based for the puzzle problem.

Exercise 14: Examine the operations on data structures in CBP. Estimate the time needed for these operations. Suggest alternative data structures.

An improvement:

  • Note that Cost-Based-Planner (CBP) does not make any use of the goal state.
         => Surely, one should give preference to the neighbors closer to the goal state?

  • The A* algorithm:
         => Pick the state whose combined cost-from-start and cost-to-goal is the least.

  • Note:
    • The cost-from-start is exact because we compute it as we go along.
    • The cost-to-goal is not known but must be estimated.
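
To make the selection rule concrete, here is a minimal A* sketch. It is an illustration under stated assumptions, not the CBPlanner code: the class AStarSketch, the Node class, the map-of-maps graph and the table h of estimated costs-to-goal are all hypothetical. The estimate is simply passed in, so nothing here says how it should be computed for the maze or puzzle.

```java
import java.util.*;

// Illustrative sketch only; all names and the graph encoding are assumptions.
class AStarSketch {

    // A Frontier entry: g = cost-from-start, f = g + estimated cost-to-goal.
    static final class Node {
        final String state;
        final int g;
        final int f;
        Node(String state, int g, int f) { this.state = state; this.g = g; this.f = f; }
    }

    // Returns a path from start to goal, or null if none exists.
    // h holds the estimated cost-to-goal per state (0 if a state is missing).
    static List<String> plan(Map<String, Map<String, Integer>> graph,
                             Map<String, Integer> h, String start, String goal) {
        PriorityQueue<Node> frontier =
                new PriorityQueue<>(Comparator.comparingInt((Node n) -> n.f));
        Map<String, Integer> bestG = new HashMap<>();  // cheapest known cost-from-start
        Map<String, String> parent = new HashMap<>();
        bestG.put(start, 0);
        frontier.add(new Node(start, 0, h.getOrDefault(start, 0)));
        while (!frontier.isEmpty()) {
            Node n = frontier.poll();                  // least combined cost
            if (n.g > bestG.get(n.state)) {
                continue;                              // a cheaper copy was added later
            }
            if (n.state.equals(goal)) {
                return makeSolution(parent, goal);
            }
            Map<String, Integer> edges =
                    graph.getOrDefault(n.state, Collections.emptyMap());
            for (Map.Entry<String, Integer> edge : edges.entrySet()) {
                String s = edge.getKey();
                int g = n.g + edge.getValue();
                if (g < bestG.getOrDefault(s, Integer.MAX_VALUE)) {
                    bestG.put(s, g);
                    parent.put(s, n.state);
                    frontier.add(new Node(s, g, g + h.getOrDefault(s, 0)));
                }
            }
        }
        return null; // no solution
    }

    // Follows parent links back from the goal; the start has no parent.
    static List<String> makeSolution(Map<String, String> parent, String goal) {
        LinkedList<String> path = new LinkedList<>();
        for (String s = goal; s != null; s = parent.get(s)) {
            path.addFirst(s);
        }
        return path;
    }
}
```

As a small experiment, take edges A-B (1), A-C (5), B-C (1), B-D (10), C-D (1) with goal D. With the estimates h(A)=2, h(B)=2, h(C)=1, h(D)=0 the sketch returns A, B, C, D (cost 3). If h(B) is raised to 100, which overestimates badly, it returns A, C, D (cost 6) instead.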

Exercise 15: What is a reasonable estimate of the cost-to-goal for the maze and puzzle problems? That is, given a state and the goal, what is an estimate of how many moves it would take to get from the state to the goal? Show by example how the estimate can fail in each case.

Exercise 16: Modify the code in CBPlanner to implement the A* algorithm. Again, you do not need to perform the estimation. Simply use the estimatedCostToGoal value in each state (which has already been computed for you).

Exercise 17: Compare the time-taken (number of moves) and the quality of solution produced by each of A* and CBP for the maze and puzzle problems. Generate at least 5 instances of each problem and write down both measures (number of moves, quality) for each algorithm.

Exercise 18: Examine the code that produces the estimatedCostToGoal for the maze and puzzle problems. Can you suggest an alternative for the puzzle problem?


Completeness, optimality and efficiency

What these terms mean:

  • Completeness: if there's a solution (path to goal state), then the algorithm will find it.

  • Optimality: the algorithm finds the least-cost path to the goal, if at least one path exists.

  • Efficiency: the algorithm finds optimal paths in the least amount of time (its own running time).

Completeness:

  • All the algorithms we have seen are complete, provided they don't run out of memory.

  • BFS is the most vulnerable to running out of memory
         => Memory needs can be exponential in some problems
         (Tree example)

  • Memory requirements can be reduced by removing Visited altogether
         => Does not affect completeness for BFS (although states may be expanded repeatedly), but DFS can then loop forever around a cycle.

  • DFS has the smallest memory needs (for Frontier): O(depth)
         => O(depth) is rarely large

  • CBP and A* eventually search the whole state space
         => They are complete.
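
To get a feel for the memory gap between BFS and DFS, the following sketch (an illustration, not part of planning.jar) records the largest Frontier seen while exhaustively searching a complete binary tree that contains no goal. The class and method names are assumptions. The tree is implicit: each Frontier entry stores only the depth of a node, and every node above the bottom level has two children. No Visited set is kept, which is safe here only because a tree has no cycles.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: measures the largest Frontier on a complete binary tree.
class FrontierSizeSketch {

    // Searches the whole tree (there is no goal) and returns the largest
    // Frontier size observed. Each entry is just the depth of a node.
    static int maxFrontier(int treeDepth, boolean depthFirst) {
        Deque<Integer> frontier = new ArrayDeque<>();
        frontier.addLast(0);               // the root is at depth 0
        int max = frontier.size();
        while (!frontier.isEmpty()) {
            int depth = depthFirst ? frontier.removeLast() : frontier.removeFirst();
            if (depth < treeDepth) {       // every non-leaf node has two children
                frontier.addLast(depth + 1);
                frontier.addLast(depth + 1);
            }
            max = Math.max(max, frontier.size());
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println("BFS: " + maxFrontier(10, false)); // prints BFS: 1024
        System.out.println("DFS: " + maxFrontier(10, true));  // prints DFS: 11
    }
}
```

For a tree of depth d, the BFS Frontier peaks at 2^d entries (the whole bottom level), while the DFS Frontier never holds more than d + 1.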

Optimality:

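  • BFS and DFS are not guaranteed to be optimal in general
         => DFS can accidentally find long paths.
         => BFS finds a path with the fewest moves, but that path need not be the cheapest when actions have different costs.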

    Exercise 19: Create an example of the maze problem and show that DFS can be arbitrarily worse than optimal.

  • CBP is optimal, but not necessarily efficient. To see why it's optimal:
    • Assume that CBP finds a non-optimal path:

    • Assume the blue path (above) is optimal, and the red one (below) is the non-optimal path found by CBP.
    • Consider the states just before reaching the goal node on each path; let s' be the one on the optimal (blue) path.
    • The total cost at s' satisfies cost(s') < C*, where C* is the cost of the optimal path.
    • The total red cost is sub-optimal (> C*) by assumption.
    • This means the algorithm would have expanded s' before adding s* to Frontier along the red path.
           => A contradiction

  • A* is optimal, and possibly more efficient than CBP, provided the estimated-cost-to-goal never overestimates the actual cost-to-goal. To see why it's optimal:
    • Assume that A* finds a non-optimal path (red path below):

    • Let g(s) = cost-from-start for any s.
    • Let h(s) = estimated-cost-to-goal from any s.
    • Let t(s) = true optimal cost to goal from any s.
    • Let f(s) = g(s) + h(s), the value used in the algorithm.
    • Because h(s') underestimates, f(s') = g(s') + h(s') ≤ g(s') + t(s') = C*
           => f(s') ≤ C*
    • Since we are assuming A* found a non-optimal path, the cost along the red path is greater than C*.
           => s' should have been expanded before adding s* to Frontier when constructing the red path
           => A contradiction

Efficiency:

  • There is no proof that one algorithm is more efficient than another.

  • Generally, experimentation shows that A* is more efficient than CBP, which is more efficient than BFS/DFS.

Continuous spaces: the arm problem

Discretizing a continuous space:

  • For the arm problem, we will discretize the space and then run one of the above algorithms.

  • Discretize?
    • One way to do this is to impose a grid on the space.
    • Another way: define discrete "neighbors" for each state.

  • We will use the second approach. For each state and each movable joint:
    • Define eight neighboring positions

    • If the coordinates of a joint are (x, y), then the new position is potentially (x+dx, y+dy).
    • Here, dx is either 0, δ or -δ.
    • Similarly, dy is either 0, δ or -δ (dx and dy are not both 0, which leaves eight neighbors).
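
The eight neighboring positions of a single joint can be generated as in the sketch below. The class and method names are hypothetical and this is not the code in ArmProblem.java; in particular, any checks that a real arm needs (obstacles, link lengths) are left out because they are not described here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are assumptions, not taken from ArmProblem.java.
class JointNeighborsSketch {

    // Returns the eight positions (x+dx, y+dy) with dx, dy in {-delta, 0, +delta},
    // leaving out the case dx = dy = 0 (no movement).
    static List<double[]> neighbors(double x, double y, double delta) {
        double[] steps = {-delta, 0.0, delta};
        List<double[]> result = new ArrayList<>();
        for (double dx : steps) {
            for (double dy : steps) {
                if (dx == 0.0 && dy == 0.0) {
                    continue; // staying put is not a neighbor
                }
                result.add(new double[]{x + dx, y + dy});
            }
        }
        return result;
    }
}
```

A smaller delta gives a finer discretization and therefore more states to search; a larger delta is faster but may step over the target.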

Exercise 20: Compare BFS, CBP and A* on the arm problem. Initially, use a simple target (a short distance up the y-axis). Gradually make the target harder.

Exercise 21: Identify the part of the code that computes the neighbors of a state in ArmProblem.java. Change the 8-neighborhood to a 4-neighborhood (N,S,E,W) and compare. In the CBP code, you can un-comment the "draw" line to see what the screen looks like when the algorithm is in action.

Exercise 22: How do we know we have visited a state before? How and where is equality-testing performed in the code? What makes this equality-test different from the test used in the maze and puzzle problems?


Other algorithms

Greedy:

  • Instead of combining the cost-from-start and estimated-cost-to-goal, we could use just estimated-cost-to-goal
         => Called the Greedy planning algorithm.

  • Greedy is not guaranteed to be either complete or optimal.

  • Greedy can be made complete if we make sure that at least one node from the Frontier is expanded each step.
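
A minimal sketch of Greedy follows, under the same kind of assumptions as before: the class GreedySketch, the map-of-maps graph and the estimate table h are hypothetical. The Frontier is ordered by estimated-cost-to-goal alone, so the action costs stored in the graph are never consulted. Each state is added to the Frontier at most once, which keeps the search finite on a finite graph.

```java
import java.util.*;

// Illustrative sketch only; all names and the graph encoding are assumptions.
class GreedySketch {

    // Returns some path from start to goal, or null if none exists.
    // States are ordered by h (estimated cost-to-goal) only.
    static List<String> plan(Map<String, Map<String, Integer>> graph,
                             Map<String, Integer> h, String start, String goal) {
        PriorityQueue<String> frontier =
                new PriorityQueue<>(Comparator.comparingInt((String s) -> h.getOrDefault(s, 0)));
        // parent also records which states have already been generated.
        Map<String, String> parent = new HashMap<>();
        parent.put(start, null);
        frontier.add(start);
        while (!frontier.isEmpty()) {
            String current = frontier.poll();          // smallest estimate first
            if (current.equals(goal)) {
                return makeSolution(parent, goal);
            }
            Map<String, Integer> edges =
                    graph.getOrDefault(current, Collections.emptyMap());
            for (String s : edges.keySet()) {          // edge costs are ignored
                if (!parent.containsKey(s)) {
                    parent.put(s, current);
                    frontier.add(s);
                }
            }
        }
        return null; // no solution
    }

    // Follows parent links back from the goal; the start maps to null.
    static List<String> makeSolution(Map<String, String> parent, String goal) {
        LinkedList<String> path = new LinkedList<>();
        for (String s = goal; s != null; s = parent.get(s)) {
            path.addFirst(s);
        }
        return path;
    }
}
```

With edges A-B (1), A-C (5), B-C (1), B-D (10), C-D (1), goal D, and estimates h(A)=2, h(B)=2, h(C)=1, h(D)=0, this sketch returns A, C, D at cost 6, although A, B, C, D costs only 3.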

Exercise 23: Create an example of the maze problem in which Greedy performs badly.

Reducing memory requirements:

  • There are two commonly-used approaches for reducing memory-requirements:
    • Fix the memory size, and throw out nodes heuristically.
    • Re-run an algorithm several times
           => Each time, use what was learned in earlier iterations to prune the search space.

  • Limiting memory size: SMA* (Simplified Memory-Bounded A*)
    • Use A* until memory is full.
    • Drop the node with the highest cost.
    • Record this node's cost in its parent.
           => So that we don't expand the parent until it becomes necessary.

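  • Iterative deepening: IDA*
    • Fix a cost bound B (initially, the estimated-cost-to-goal of the start state).
    • Run a depth-first search that uses the A* cost (cost-from-start plus estimated-cost-to-goal) and cuts off any node whose cost exceeds B.
           => Let B' = the smallest cost that exceeded B.
    • Set B = B' and re-run.
    • Repeat until the goal node is found.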
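
The iterative-deepening idea can be sketched as below. Note the assumptions: the class IdaStarSketch, the map-of-maps graph and the estimate table h are hypothetical, and each iteration is carried out as a recursive depth-first search bounded by cost-from-start plus estimated-cost-to-goal, which is how IDA* is usually formulated. Only the current path is stored, so memory use is proportional to the path length.

```java
import java.util.*;

// Illustrative sketch only; all names and the graph encoding are assumptions.
class IdaStarSketch {

    static final int FOUND = -1;

    // Returns a path from start to goal, or null if none exists.
    static List<String> plan(Map<String, Map<String, Integer>> graph,
                             Map<String, Integer> h, String start, String goal) {
        int bound = h.getOrDefault(start, 0);          // initial bound B
        Deque<String> path = new ArrayDeque<>();
        path.addLast(start);
        while (true) {
            int t = search(graph, h, path, 0, bound, goal);
            if (t == FOUND) {
                return new ArrayList<>(path);
            }
            if (t == Integer.MAX_VALUE) {
                return null;                           // nothing left beyond the bound
            }
            bound = t;                                 // B = B', then re-run
        }
    }

    // Depth-first search limited to g + h <= bound. Returns FOUND, or the
    // smallest g + h value that exceeded the bound (MAX_VALUE if there was none).
    static int search(Map<String, Map<String, Integer>> graph, Map<String, Integer> h,
                      Deque<String> path, int g, int bound, String goal) {
        String current = path.peekLast();
        int f = g + h.getOrDefault(current, 0);
        if (f > bound) {
            return f;
        }
        if (current.equals(goal)) {
            return FOUND;
        }
        int min = Integer.MAX_VALUE;
        Map<String, Integer> edges = graph.getOrDefault(current, Collections.emptyMap());
        for (Map.Entry<String, Integer> edge : edges.entrySet()) {
            String s = edge.getKey();
            if (path.contains(s)) {
                continue;                              // do not loop back along the current path
            }
            path.addLast(s);
            int t = search(graph, h, path, g + edge.getValue(), bound, goal);
            if (t == FOUND) {
                return FOUND;                          // leave the path intact for the caller
            }
            min = Math.min(min, t);
            path.removeLast();
        }
        return min;
    }
}
```

The price of the small memory footprint is repeated work: every iteration re-expands the states explored by the previous ones.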

Other ideas in search:

  • There is a whole world of search algorithms, and many specialized books on the subject.

  • Example sub-topics: search for games, realtime-search, meta-heuristics.

About planning:

  • We have only lightly touched upon the general planning problem.

  • Planning is a vast area with all kinds of research, books and products.

  • There is an entire sub-area related to planning motion.

  • An example of successful planning: Mars Rover mission.