Module 2: Robotics Interlude - Planning
What do we mean by planning?
- The general planning problem is: given an initial configuration
and a desired goal, find the sequence of actions needed
to reach the goal.
- We are of course interested in an algorithm
that produces the sequence of actions.
Exercise 1:
Download and execute Arm.java.
- How would you describe the initial "configuration"?
- Find a configuration that meets the goal (where the arm tip
is on the goal). How would you describe this particular
final configuration?
- See if you can describe a few intermediate
configurations. Then, work with the person next to you:
communicate the intermediate positions so that s/he
follows the same sequence of actions to reach the goal.
We will now look at three planning problems as motivation
for our study of planning:
- The maze problem:

- The maze is an N x N grid of cells.
- Each cell is either closed (prohibited) or not.
- There is a start cell and a goal cell.
- At each step, the allowable actions are: take one step (one cell)
in one of four directions North, South, East, West.
- For example, the following sequence works for the above
instance of the problem:
W, N, N, N, N, N, E, E, E, N, N, E, E, E, S, S
- The algorithmic problem: write an algorithm that produces
a sequence of actions when given the start and end configurations.
- Secondary goal: find an efficient (short) sequence.
- The N-puzzle problem

- The example shows an 8-puzzle.
- There are 8 tiles in 9 spaces.
- A "move" (action) consists of moving a tile into the blank
spot (leaving another blank spot).
- The objective: find a sequence of actions to go from the
initial configuration to the goal.
- The algorithmic problem: write an algorithm to do so.
- Secondary goal: find an efficient (short) sequence.
- The robot arm problem:
- Given an N-link arm, a start configuration and an
end configuration, find a sequence of moves to go from
start to end.
- The algorithmic problem: write an algorithm to do so.
- Secondary goal: find an efficient (short) sequence.
Exercise 2:
Download and unpack planning.jar,
then execute PlanningGUI. Familiarize yourself with the
three planning problems, and experiment with different goal
configurations: find "easy" and "hard" configurations to reach
from the initial configuration. How can you tell what's
easy or hard?
About planning problems in general:
- Many problems have a set of desired goals
=>
In such problems, one needs to find any one goal.
- In real-time planning problems, some data could
change with time
=>
e.g., measurements from robot sensors
- Real-world planning problems combine a variety of data
(some noisy) and constraints (time-constraints)
=>
e.g., motion planning on-the-fly
- Real-world planning problems combine various levels:
=>
short-term motor control to high-level multi-robot coordination
State spaces and neighborhoods
Sometimes we use state instead of configuration:
- A state can be a more numeric description than a configuration.
=>
e.g., use precise coordinates for arm-joint positions.
- The set of allowable actions may depend on the state
=>
e.g., cannot move "South" when in the bottom row of the maze problem
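The state-dependent action set can be sketched for the maze as follows. This is a minimal illustration, not the code in planning.jar: the class name MazeActions, the boolean grid encoding, and the convention that row 0 is the northern row are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class MazeActions {
    // Assumed encoding: closed[r][c] is true for a prohibited cell,
    // and row 0 is the top (northern) row of the N x N grid.
    static List<String> allowedActions(boolean[][] closed, int row, int col) {
        int n = closed.length;
        List<String> actions = new ArrayList<>();
        if (row > 0 && !closed[row - 1][col]) actions.add("N");
        if (row < n - 1 && !closed[row + 1][col]) actions.add("S");
        if (col < n - 1 && !closed[row][col + 1]) actions.add("E");
        if (col > 0 && !closed[row][col - 1]) actions.add("W");
        return actions;
    }

    public static void main(String[] args) {
        boolean[][] closed = new boolean[3][3];
        closed[1][1] = true;                       // one closed cell in the middle
        // Bottom-left cell: "S" and "W" would leave the grid.
        System.out.println(allowedActions(closed, 2, 0));   // prints [N, E]
    }
}
```

The same state-to-actions function is all a planner needs in order to generate neighbors on demand.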
Exercise 3:
Find a goal state for the arm problem.
How does one specify a complete description of this state?
Exercise 4:
How many possible states are there for each of the three problems?
A neighborhood:
- Each action taken in a given state leads to another state
=>
That other state is a neighbor of the first state.
- The neighborhood of a state is the set of possible states
reachable from the state using actions allowed in that state.
- A neighborhood may not be spatially contiguous:

- Here, no sharp right turns are permitted (perhaps because
of the vehicle's limitations).
- Thus, the cell to the east is not a neighbor.
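To make the neighborhood definition concrete for another of the three problems, here is a sketch that generates the neighbors of an 8-puzzle state. The int[9] encoding (row by row, 0 for the blank) and the class name PuzzleNeighbors are assumptions for illustration; the puzzle code in planning.jar may represent states differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PuzzleNeighbors {
    // Assumed encoding: 9 entries read row by row, with 0 marking the blank.
    static List<int[]> neighbors(int[] state) {
        int blank = 0;
        while (state[blank] != 0) blank++;          // locate the blank
        int row = blank / 3, col = blank % 3;
        int[][] shifts = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
        List<int[]> result = new ArrayList<>();
        for (int[] s : shifts) {
            int r = row + s[0], c = col + s[1];
            if (r < 0 || r > 2 || c < 0 || c > 2) continue;   // off the board
            int[] next = state.clone();
            next[blank] = next[r * 3 + c];          // slide the tile into the blank
            next[r * 3 + c] = 0;                    // the tile's old cell is now blank
            result.add(next);
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] n : neighbors(new int[]{1, 2, 3, 4, 0, 5, 6, 7, 8})) {
            System.out.println(Arrays.toString(n));
        }
    }
}
```

Each returned array is one state reachable by a single action, so the list is exactly the neighborhood of the input state.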
Exercise 5:
What is the size of the neighborhood for each of the three
problems?
Exercise 6:
Consider the starting state in the 8-puzzle demo. Draw this state
on paper, draw all its neighbors and all neighbors of its
neighbors.
The underlying graph:
- What is a graph? See this
definition, for example.
- Each planning problem has an underlying graph:
- Each state is a vertex in the graph.
- When an action takes you from one state to a second one,
place a directed edge from the first to the second.
=>
We will call these edges action edges.
- For example: consider the maze problem with no right-turns:

- From cell (1,1), the neighbors are: (0,1), (1,2), (1,0) and (2,2)

- We need not draw the graph spatially. This is just as accurate:

- The above graph is not complete:
=>
A complete graph would show all possible states and all
possible edges.
- The objective of planning: find a (short) path from
a given start state to a given goal state.
What we know about the shortest-path problem in graphs:
- The shortest-path problem can be solved relatively
efficiently.
- The best-known algorithm for this problem (when edge costs
are non-negative) is called Dijkstra's algorithm.
See this
description, for example.
- Given a graph with n nodes and m edges, a
careful implementation of Dijkstra's algorithm can find a
shortest path in O(m log(n)) time.
Unfortunately ...
- Planning problems often have very large numbers of states
=>
e.g., the N-puzzle problem has (N+1)!/2 reachable states (181,440 for the 8-puzzle).
- It's usually infeasible (and unnecessary) to generate the
whole graph ahead of time.
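A quick calculation shows how fast the puzzle's state space grows. The sketch assumes the standard counting argument: N tiles plus a blank can be arranged in (N+1)! ways, and exactly half of those arrangements are reachable from any given one. The class and method names are made up for the example.

```java
import java.math.BigInteger;

class PuzzleStateCount {
    // Reachable states of the N-puzzle under the assumption above: (N+1)! / 2.
    static BigInteger reachableStates(int tiles) {
        BigInteger arrangements = BigInteger.ONE;
        for (int k = 2; k <= tiles + 1; k++) {
            arrangements = arrangements.multiply(BigInteger.valueOf(k));
        }
        return arrangements.divide(BigInteger.valueOf(2));
    }

    public static void main(String[] args) {
        System.out.println(reachableStates(8));    // 8-puzzle: 181440
        System.out.println(reachableStates(15));   // 15-puzzle: 10461394944000
        System.out.println(reachableStates(24));   // 24-puzzle: far too many to store
    }
}
```

The 8-puzzle graph would still fit in memory, but the next two sizes already make an explicit graph impractical.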
The approach taken by planning algorithms:
- Generate states on the fly (incrementally).
- Store some states but not all.
Before looking at planning algorithms, let's examine the
output of a planning algorithm.
Exercise 7:
Compile and load MazeHandPlanner into PlanningGUI for the
maze problem.
- Click on "Plan" and then click on "Next" repeatedly
to see what this algorithm produced.
- Examine the code in MazeHandPlanner. Create a different maze
to solve, and hand-code the solution in MazeHandPlanner.
- Examine the code in PuzzleHandPlanner. Observe how
states are generated and entered into the plan.
Non-cost methods: BFS and DFS
First, we'll explain how BFS and DFS work
=>
It'll become clear why they are "non-cost"
Note: BFS = Breadth-First-Search, DFS = Depth-First-Search
Key ideas:
- We will consider the (unbuilt) graph of states and action-edges.
- Instead of generating the whole graph, we will create nodes
and edges as we proceed.
=>
Turns out, we will not need to store edges.
- Use two data structures:
- Visited: a collection of nodes (states) that we
have "finished processing"
- Frontier: a collection of nodes that we have
partially processed.
- At each step:
- Pick a node from Frontier.
- Examine action-edges from that node.
- If there are neighbors that are unexplored, add them to Frontier.
- Stop when you find the goal, when Frontier is empty (no
solution exists), or when you are out of memory.
- Here is an example snapshot:

- The blue nodes are in Visited.
- The red nodes are in the Frontier.
- The black nodes are nodes not yet explored (or even generated).
- BFS and DFS differ in their selection of which frontier
node to process next.
- BFS: Frontier is a queue.
- DFS: Frontier is a stack.
- Pseudocode for BFS:
Algorithm: BFS (start, goal)
Input: the start and goal nodes (states)
   Initialize frontier and visited
   frontier.add (start)
   while frontier not empty
      currentState = frontier.removeFirst ()
      if currentState = goal
         path = makeSolution ()
         return path
      endif
      visited.add (currentState)
      for each neighbor s of currentState
         if s not in visited and not in frontier
            frontier.add (s)
         endif
      endfor
   endwhile
   return null // No solution
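The pseudocode above can be turned into a small runnable sketch for the maze. This is not the BFSPlanner from planning.jar: the class name, the boolean grid, and the cell numbering (row * N + column) are assumptions. A single "seen" array stands in for both Visited and Frontier membership, and a parent array plays the role of makeSolution.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

class BfsMazeSketch {
    static final int[][] STEPS = {{-1, 0}, {1, 0}, {0, 1}, {0, -1}};
    static final String[] NAMES = {"N", "S", "E", "W"};   // row 0 is the northern row

    // Returns a shortest action sequence, or null when the goal is unreachable.
    static List<String> bfs(boolean[][] closed, int startRow, int startCol,
                            int goalRow, int goalCol) {
        int n = closed.length;
        int start = startRow * n + startCol, goal = goalRow * n + goalCol;
        int[] parent = new int[n * n];          // cell we came from
        String[] move = new String[n * n];      // action that led to each cell
        boolean[] seen = new boolean[n * n];    // visited or already in the frontier
        Deque<Integer> frontier = new ArrayDeque<>();
        frontier.addLast(start);
        seen[start] = true;
        parent[start] = -1;
        while (!frontier.isEmpty()) {
            int current = frontier.removeFirst();          // queue order gives BFS
            if (current == goal) {
                LinkedList<String> path = new LinkedList<>();
                for (int s = goal; parent[s] != -1; s = parent[s]) {
                    path.addFirst(move[s]);                // walk back to the start
                }
                return path;
            }
            for (int k = 0; k < 4; k++) {
                int r = current / n + STEPS[k][0], c = current % n + STEPS[k][1];
                if (r < 0 || r >= n || c < 0 || c >= n || closed[r][c]) continue;
                int next = r * n + c;
                if (seen[next]) continue;
                seen[next] = true;
                parent[next] = current;
                move[next] = NAMES[k];
                frontier.addLast(next);
            }
        }
        return null;   // frontier exhausted: no solution
    }

    public static void main(String[] args) {
        boolean[][] closed = new boolean[3][3];
        closed[1][0] = true;
        closed[1][1] = true;
        System.out.println(bfs(closed, 2, 0, 0, 0));   // prints [E, E, N, N, W, W]
    }
}
```

Replacing removeFirst with removeLast would make the frontier behave as a stack, which is the only change DFS needs in this structure.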
Exercise 8:
Compile and load BFSPlanner into PlanningGUI for the
maze problem. Verify that it finds the solution by clicking
on the "next" button once the plan has been generated.
Likewise, apply BFS to an instance of the puzzle problem
and verify the correctness of the plan generated.
Exercise 9:
Implement DFS by modifying the BFS code.
- Compare BFS and DFS using a variety of puzzle-problem instances.
- Identify both the number of moves (time taken) and the
number of steps in the path.
- Write recursive pseudocode for DFS. Is there an advantage
(or disadvantage) to using recursion?
Exercise 10:
Compare the memory requirements of BFS and DFS. In general,
which one will require more memory? Can you analyze (on paper)
the memory requirements for each?
Exercise 11:
Examine the use of the two data structures in BFS and DFS.
Identify the operations performed on each of these.
How much time is taken for each operation performed
on these data structures? Are there data structures
that take less time?
Cost-based methods
The problem with BFS/DFS:
- Neither of them uses any knowledge of the problem.
=>
e.g., in the maze problem, it's easy to calculate the distance to the goal.
- Both can end up wasting time by searching "away from the goal."
Cost-based methods:
- Recall: one objective of a planning algorithm is to identify
the plan with the least number of "moves" (lowest cost).
- What is "cost"?
- For most planning problems: cost is the number of moves
from the start state.
- Some planning problems include the realization cost (some
actions may take more time).
The Cost-Based-Planner (CBP) Algorithm:
- From among the Frontier nodes, pick the one with the least
total cost from the start state.
- Whenever a state is added to the Frontier, see if the
cost to that state has been reduced.
- Pseudocode:
Algorithm: CostBased (start, goal)
Input: the start and goal nodes (states)
   Initialize frontier and visited
   frontier.add (start)
   while frontier not empty
      currentState = remove from frontier the state with least cost
      if currentState = goal
         path = makeSolution ()
         return path
      endif
      visited.add (currentState)
      for each neighbor s of currentState
         if s not in visited and not in frontier
            frontier.add (s)
         else if s in frontier
            s' = frontier.find (s)
            if cost(s) < cost(s')
               frontier.replace (s', s) // Keep only the cheaper copy
            endif
         endif
      endfor
   endwhile
   return null // No solution
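Here is one way the cost-based pseudocode might look in Java, on a tiny hand-made graph with unequal action costs. It is a sketch under assumptions, not the CBPlanner from planning.jar: the Node class, the string state names, and the sample edges are all invented for the example.

```java
import java.util.*;

class CostBasedSketch {
    // Hypothetical node: a state name, its cost from the start, and its predecessor.
    static class Node {
        final String state; final int cost; final Node parent;
        Node(String state, int cost, Node parent) {
            this.state = state; this.cost = cost; this.parent = parent;
        }
    }

    static List<String> plan(Map<String, Map<String, Integer>> edges, String start, String goal) {
        PriorityQueue<Node> frontier = new PriorityQueue<>(Comparator.comparingInt((Node x) -> x.cost));
        Map<String, Node> inFrontier = new HashMap<>();   // fast "s in frontier" lookup
        Set<String> visited = new HashSet<>();
        Node first = new Node(start, 0, null);
        frontier.add(first);
        inFrontier.put(start, first);
        while (!frontier.isEmpty()) {
            Node current = frontier.poll();               // least cost from start
            inFrontier.remove(current.state);
            if (current.state.equals(goal)) {
                LinkedList<String> path = new LinkedList<>();
                for (Node x = current; x != null; x = x.parent) path.addFirst(x.state);
                return path;
            }
            visited.add(current.state);
            Map<String, Integer> out = edges.getOrDefault(current.state, Collections.emptyMap());
            for (Map.Entry<String, Integer> e : out.entrySet()) {
                String s = e.getKey();
                if (visited.contains(s)) continue;
                Node candidate = new Node(s, current.cost + e.getValue(), current);
                Node existing = inFrontier.get(s);
                if (existing == null || candidate.cost < existing.cost) {
                    if (existing != null) frontier.remove(existing);   // drop the costlier copy
                    frontier.add(candidate);
                    inFrontier.put(s, candidate);
                }
            }
        }
        return null;   // no solution
    }

    // Made-up example graph: A->B costs 4, A->C 1, C->B 2, B->D 5.
    static Map<String, Map<String, Integer>> sampleEdges() {
        Map<String, Map<String, Integer>> edges = new HashMap<>();
        edges.computeIfAbsent("A", k -> new HashMap<>()).put("B", 4);
        edges.computeIfAbsent("A", k -> new HashMap<>()).put("C", 1);
        edges.computeIfAbsent("C", k -> new HashMap<>()).put("B", 2);
        edges.computeIfAbsent("B", k -> new HashMap<>()).put("D", 5);
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(plan(sampleEdges(), "A", "D"));   // prints [A, C, B, D]
    }
}
```

In the sample run, B first enters the frontier with cost 4 and is later replaced by the cheaper route through C (cost 3), which is the cost-reduction step in the pseudocode. Note that PriorityQueue.remove(Object) takes time linear in the frontier size.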
Exercise 12:
Implement the CBP by modifying the code
in CBPlanner.java that is included in planning.jar.
Most of the code has been written: you only need to extract
the best node from the frontier using the costFromStart
value in each state.
Exercise 13:
Compare BFS with Cost-Based for the puzzle problem.
Exercise 14:
Examine the operations on data structures in CBP.
Estimate the time needed for these operations. Suggest
alternative data structures.
An improvement:
- Note that Cost-Based-Planner (CBP) does not make any use
of the goal state.
=>
Surely, one should give preference to the neighbors closer to
the goal state?
- The A* algorithm:
=>
Pick the state whose combined cost-from-start and cost-to-goal
is the least.
- Note:
- The cost-from-start is exact because we compute it
as we go along.
- The cost-to-goal is not known but must be estimated.
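The selection rule can be sketched on a small hand-made graph. Everything here is illustrative: the Candidate class, the sample edges, and the estimate values are assumptions, and the estimates are simply supplied as a table rather than computed from the problem. Instead of replacing frontier entries, this sketch re-adds a state when a cheaper route is found and skips outdated entries when they come out of the queue.

```java
import java.util.*;

class AStarSketch {
    // Hypothetical frontier entry: state, exact cost from start, and combined cost.
    static class Candidate {
        final String state; final int fromStart; final int combined;
        Candidate(String state, int fromStart, int combined) {
            this.state = state; this.fromStart = fromStart; this.combined = combined;
        }
    }

    // Returns the cost of the path found, or -1 if the goal cannot be reached.
    static int search(Map<String, Map<String, Integer>> edges,
                      Map<String, Integer> estimate, String start, String goal) {
        Map<String, Integer> best = new HashMap<>();   // cheapest known cost from start
        PriorityQueue<Candidate> frontier =
            new PriorityQueue<>(Comparator.comparingInt((Candidate c) -> c.combined));
        best.put(start, 0);
        frontier.add(new Candidate(start, 0, estimate.getOrDefault(start, 0)));
        while (!frontier.isEmpty()) {
            Candidate current = frontier.poll();       // least cost-from-start + estimate
            if (current.fromStart > best.get(current.state)) continue;   // outdated entry
            if (current.state.equals(goal)) return current.fromStart;
            Map<String, Integer> out = edges.getOrDefault(current.state, Collections.emptyMap());
            for (Map.Entry<String, Integer> edge : out.entrySet()) {
                int g = current.fromStart + edge.getValue();
                if (g < best.getOrDefault(edge.getKey(), Integer.MAX_VALUE)) {
                    best.put(edge.getKey(), g);
                    int h = estimate.getOrDefault(edge.getKey(), 0);
                    frontier.add(new Candidate(edge.getKey(), g, g + h));
                }
            }
        }
        return -1;
    }

    // Made-up example graph: A->B costs 4, A->C 1, C->B 2, B->D 5.
    static Map<String, Map<String, Integer>> sampleEdges() {
        Map<String, Map<String, Integer>> edges = new HashMap<>();
        edges.computeIfAbsent("A", k -> new HashMap<>()).put("B", 4);
        edges.computeIfAbsent("A", k -> new HashMap<>()).put("C", 1);
        edges.computeIfAbsent("C", k -> new HashMap<>()).put("B", 2);
        edges.computeIfAbsent("B", k -> new HashMap<>()).put("D", 5);
        return edges;
    }

    public static void main(String[] args) {
        Map<String, Integer> estimate = new HashMap<>();   // made-up estimates to goal D
        estimate.put("A", 7); estimate.put("B", 5); estimate.put("C", 6); estimate.put("D", 0);
        System.out.println(search(sampleEdges(), estimate, "A", "D"));   // prints 8
    }
}
```

With an empty estimate table every estimate defaults to 0, and the ordering falls back to cost-from-start alone, as in CBP.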
Exercise 15:
What is a reasonable estimate of the cost-to-goal for the
maze and puzzle problems? That is, given a state
and the goal, what is an estimate of how many moves it would
take to get from the state to the goal?
Show by example how the estimate can fail in each case.
Exercise 16:
Modify the code in CBPlanner to implement
the A* algorithm. Again, you do not need to perform
the estimation. Simply use the estimatedCostToGoal
value in each state (which has already been computed for you).
Exercise 17:
Compare the time-taken (number of moves) and the quality
of solution produced by each of A* and CBP for the
maze and puzzle problems. Generate at least 5 instances
of each problem and write down both measures (number
of moves, quality) for each algorithm.
Exercise 18:
Examine the code that produces the estimatedCostToGoal
for the maze and puzzle problems. Can you suggest an alternative
for the puzzle problem?
Completeness, optimality and efficiency
What these terms mean:
- Completeness: if there's a solution (path to goal
state), then the algorithm will find it.
- Optimality: the algorithm finds the least-cost
path to the goal, if at least one path exists.
- Efficiency: the algorithm finds optimal paths
in the least amount of time (its own running time).
Completeness:
- All the algorithms we have seen are complete, provided they
don't run out of memory.
- BFS is the most vulnerable
=>
Memory needs can be exponential in some problems
(Tree example)
- Memory requirements can be reduced by removing
Visited altogether
=>
Does not affect the completeness of BFS, although states may then
be processed more than once.
- DFS has the smallest memory needs (for Frontier), O(depth)
=>
O(depth) is rarely large
- CBP and A* eventually search the whole state space
=>
They are complete.
Optimality:
Efficiency:
- There is no proof that one algorithm is more efficient than another.
- Generally, experimentation shows that A* is more efficient
than CBP, which is more efficient than BFS/DFS.
Continuous spaces: the arm problem
Discretizing a continuous space:
- For the arm problem, we will discretize the space and then
run one of the above algorithms.
- Discretize?
- One way to do this is to impose a grid on the space.
- Another way: define discrete "neighbors" for each state.
- We will use the second approach. For each state and each movable joint:
- Define eight neighboring positions

- If the coordinates of a joint are (x, y), then the
new position is potentially (x+dx, y+dy) .
- Here, dx is either 0, δ or -δ.
- Similarly, dy is either 0, δ or -δ.
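The eight candidate positions for one joint can be generated with two nested loops. This is a sketch only: the class name and the double[] representation are assumptions, and it ignores whatever constraints (such as link lengths or obstacles) make a position only "potentially" valid in ArmProblem.java.

```java
import java.util.ArrayList;
import java.util.List;

class JointNeighbors {
    // Returns the eight positions (x+dx, y+dy) with dx, dy in {-delta, 0, +delta},
    // leaving out the case dx = dy = 0 (the joint's current position).
    static List<double[]> neighbors(double x, double y, double delta) {
        List<double[]> result = new ArrayList<>();
        for (int i = -1; i <= 1; i++) {
            for (int j = -1; j <= 1; j++) {
                if (i == 0 && j == 0) continue;
                result.add(new double[]{x + i * delta, y + j * delta});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (double[] p : neighbors(1.0, 2.0, 0.5)) {
            System.out.println("(" + p[0] + ", " + p[1] + ")");
        }
    }
}
```

A smaller delta gives a finer discretization but more states between start and goal.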
Exercise 20:
Compare BFS, CBP and A* on the arm problem. Initially, use
a simple target (a short distance up the y-axis). Gradually make
the target harder.
Exercise 21:
Identify the part of the code that computes the neighbors
of a state in ArmProblem.java. Change the 8-neighborhood
to a 4-neighborhood (N,S,E,W) and compare. In the CBP code,
you can un-comment the "draw" line to see what the screen
looks like when the algorithm is in action.
Exercise 22:
How do we know we have visited a state before? How and where
is equality-testing performed in the code? What makes this
equality-test different from the test used in the maze and
puzzle problems?
Other algorithms
Greedy:
- Instead of combining the cost-from-start and
estimated-cost-to-goal, we could use just estimated-cost-to-goal
=>
Called the Greedy planning algorithm.
- Greedy is not guaranteed to be either complete or optimal.
- Greedy can be made complete if we make sure that at least
one node from the Frontier is expanded each step.
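CBP, A* and Greedy differ only in how they rank frontier states, which can be expressed as three comparators. The field names costFromStart and estimatedCostToGoal follow the values mentioned in the exercises; the State class itself and the sample numbers are hypothetical.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class FrontierOrderings {
    // Hypothetical state carrying the two cost values used for ranking.
    static class State {
        final String name; final int costFromStart; final int estimatedCostToGoal;
        State(String name, int costFromStart, int estimatedCostToGoal) {
            this.name = name;
            this.costFromStart = costFromStart;
            this.estimatedCostToGoal = estimatedCostToGoal;
        }
    }

    static final Comparator<State> CBP =
        Comparator.comparingInt((State s) -> s.costFromStart);
    static final Comparator<State> A_STAR =
        Comparator.comparingInt((State s) -> s.costFromStart + s.estimatedCostToGoal);
    static final Comparator<State> GREEDY =
        Comparator.comparingInt((State s) -> s.estimatedCostToGoal);

    // Name of the state that the given ordering would expand next.
    static String pick(Comparator<State> order, State... frontier) {
        PriorityQueue<State> queue = new PriorityQueue<>(order);
        for (State s : frontier) queue.add(s);
        return queue.peek().name;
    }

    public static void main(String[] args) {
        State p = new State("p", 1, 9), q = new State("q", 4, 2), r = new State("r", 6, 1);
        System.out.println(pick(CBP, p, q, r));      // prints p
        System.out.println(pick(A_STAR, p, q, r));   // prints q
        System.out.println(pick(GREEDY, p, q, r));   // prints r
    }
}
```

Greedy chooses r because it looks closest to the goal, even though reaching r has already cost more than reaching q.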
Exercise 23:
Create an example of the maze problem in which Greedy
performs badly.
Reducing memory requirements:
- There are two commonly-used approaches for reducing
memory-requirements:
- Fix the memory size, and throw out nodes heuristically.
- Re-run an algorithm several times
=>
Each time, use what was learned in earlier iterations to prune the search space.
- Limiting memory size: SMA* (Simplified Memory-Bounded A*)
- Use A* until memory is full.
- Drop the node with the highest cost.
- Record this node's cost in its parent.
=>
So that we don't expand the parent until it becomes necessary.
- Iterative deepening: IDA*
- Fix a cost bound B.
- Search depth-first, skipping any state whose combined cost
(cost-from-start plus estimated cost-to-goal) exceeds B.
=>
Let B' = the smallest combined cost that exceeded B.
- Set B = B' and re-run.
- Repeat until goal node found.
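The bound-and-re-run loop can be sketched as follows. The sketch assumes the common formulation of IDA* in which each round is a depth-first search that skips states whose combined cost exceeds the bound, so that only the current path is kept in memory. The class name, sample graph, and estimate table are invented for the example.

```java
import java.util.*;

class IdaStarSketch {
    // One bounded depth-first round. Returns the path cost if the goal is reached,
    // else -1; nextBound[0] collects the smallest combined cost that exceeded the bound.
    static int probe(Map<String, Map<String, Integer>> edges, Map<String, Integer> estimate,
                     String state, String goal, int fromStart, int bound,
                     Set<String> onPath, int[] nextBound) {
        int combined = fromStart + estimate.getOrDefault(state, 0);
        if (combined > bound) {
            nextBound[0] = Math.min(nextBound[0], combined);
            return -1;
        }
        if (state.equals(goal)) return fromStart;
        onPath.add(state);
        Map<String, Integer> out = edges.getOrDefault(state, Collections.emptyMap());
        for (Map.Entry<String, Integer> edge : out.entrySet()) {
            if (onPath.contains(edge.getKey())) continue;   // do not loop along the path
            int found = probe(edges, estimate, edge.getKey(), goal,
                              fromStart + edge.getValue(), bound, onPath, nextBound);
            if (found >= 0) return found;
        }
        onPath.remove(state);
        return -1;
    }

    static int search(Map<String, Map<String, Integer>> edges,
                      Map<String, Integer> estimate, String start, String goal) {
        int bound = estimate.getOrDefault(start, 0);        // initial B
        while (true) {
            int[] nextBound = {Integer.MAX_VALUE};
            int found = probe(edges, estimate, start, goal, 0, bound,
                              new HashSet<>(), nextBound);
            if (found >= 0) return found;
            if (nextBound[0] == Integer.MAX_VALUE) return -1;   // nothing left to try
            bound = nextBound[0];                           // B = B', then re-run
        }
    }

    // Made-up example graph: A->B costs 4, A->C 1, C->B 2, B->D 5.
    static Map<String, Map<String, Integer>> sampleEdges() {
        Map<String, Map<String, Integer>> edges = new HashMap<>();
        edges.computeIfAbsent("A", k -> new HashMap<>()).put("B", 4);
        edges.computeIfAbsent("A", k -> new HashMap<>()).put("C", 1);
        edges.computeIfAbsent("C", k -> new HashMap<>()).put("B", 2);
        edges.computeIfAbsent("B", k -> new HashMap<>()).put("D", 5);
        return edges;
    }

    public static void main(String[] args) {
        Map<String, Integer> estimate = new HashMap<>();    // made-up estimates to goal D
        estimate.put("A", 7); estimate.put("B", 5); estimate.put("C", 6); estimate.put("D", 0);
        System.out.println(search(sampleEdges(), estimate, "A", "D"));   // prints 8
    }
}
```

In the sample run the first round uses B = 7 and fails, the smallest exceeding cost is 8, and the second round with B = 8 reaches D. The price of the low memory use is that early rounds are repeated.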
Other ideas in search:
- There is a whole world of search algorithms, and many
specialized books on the subject.
- Example sub-topics: search for games, realtime-search,
meta-heuristics.
About planning:
- We have only lightly touched upon the general planning problem.
- Planning is a vast area with all kinds of research, books
and products.
- There is an entire sub-area related to planning motion.
- An example of successful planning: Mars Rover mission.