School of Engineering and Applied Science
Department of Computer Science
CSci 133 -- Algorithms and Data Structures I
http://www.seas.gwu.edu/~csci133/fall05
Prof. Michael B. Feldman
mfeldman@gwu.edu

Translating Arithmetic Expressions to  Postfix ("Reverse Polish" or RPN) Form

adapted October 2003 from
Michael B. Feldman, Software Construction and Data Structures, Chapter 7 (Addison-Wesley, 1996)
copyright 2003, Michael B. Feldman

This note is designed to be read after you read Lewis & Chase, Sect. 6.2. In that section, an algorithm is given that evaluates a postfix ("RPN") expression using a stack. Here we develop an algorithm that converts an infix expression to its RPN equivalent. The algorithm's input is a string representing an infix expression; it returns a string representing the equivalent postfix or RPN expression.

Simple Case: No Operator Precedences or Parentheses

Consider first a expression with no parentheses, all of whose operators are dyadic (binary, having two operands) and have the same precedence. In such an expression, the operators and operands always alternate; in the resulting RPN, the operands always appear in the same order as in the infix original.

We scan the input expression from left to right. If the first character we see is an operand, we can immediately output it (concatenate it to the RPN string). If it is an operator, we need to remember it until after we’ve seen its other operand; that is, as soon as the next operator is scanned. We then output the saved operator and save the new one. Figure 1 shows an example of this algorithm in operation. The right-hand column traces the infix expression as it is scanned; the left-hand column shows the RPN as it is constructed. The saved operator appears in the center.

Figure 1. Simple Infix-to-RPN Translation

Figure 1

Here is an algorithm for this simple case:

Convert Expression to RPN (no operator precedences or parentheses):
{

initialize saved-operator variable and RPN string;
while (there are tokens remaining in the expression)
{
  T = next token from expression;
  if (T is a variable or numeric literal)
  {
    add T to the RPN string;
  }
  else if (T is '+', '-', '*', or '/')
  {
    if (we've saved an operator)
    {
      add saved operator to RPN string
    }
    save T
 
  }

  else
  {
    report error!
  }

}
return RPN string
}

Taking Operator Precedences into Account

Now assume that operators of differing precedences are allowed. Consider the infix expression A+B*C. Its RPN form is A B C * +. We cannot just output the + when the B is scanned, because the *, having higher precedence, must be done first. So the + must be remembered longer, and we need to tackle the problem a bit more systematically. The precedence of the incoming operator needs to be checked against the precedence of the previous one; if the new operator has higher precedence, we need to remember it as well, until we’ve scanned its second operand! When its second operand has been scanned and output, we can output the operator.

We have, in this case, remembered two operators, and the last one remembered is the first one output. This suggests that the best way to remember the operators is to put them in a stack, which, after all, is precisely a LIFO device. This is shown in Figure 2, where an example is worked through.


Figure 2. Infix-to-RPN Translation, Taking Precedence into Account

Figure 2

Here is a modified version of the Infix-to-RPN algorithm that uses precedences.

Convert Expression to RPN (with operator precedences; no parentheses):
{
initialize operator stack and RPN string;
while (there are tokens remaining in the expression)
{
  T = next token from expression;
  if (T is a variable or numeric literal)
  {
    add T to the RPN output string;
  }
  else if (T is '+', '-', '*', or '/')
  {
    if ((operator stack is empty) or
        (precedence(operator at top of stack) < precedence(T)))
    {
      push T onto the operator stack;
    }
    else // clear operator stack and push new one onto it
    {
      do
      {
        add the operator at the top of the stack to the RPN string;
        pop the stack;

      }
      while ((the stack is not empty) and
             (precedence(T) < precedence(operator at top of stack)));
      push T onto the stack;
    }
  }
  else
  {
    report error!
  }
}
// no more tokens left in expression - clear stack into output
while (operator stack is not empty)
{
   add the operator at the top of the stack to the RPN string;
   pop the stack
;
}
return RPN string
}

In this new algorithm an operator is stacked until one of equal or lower priority comes along; then it is popped and added to the RPN. The new operator is then pushed onto the stack. The process continues until the input is empty, at which time the stack is emptied of all remaining operators. Try the function on a few examples of your own, to make sure you understand its operation.

Taking Parentheses into Account

The final modification accommodates parentheses. The way this is done becomes clear when it is realized that parentheses really override the precedence scheme, essentially creating a whole new expression inside. We can change our algorithm to allow parentheses by pushing a left paren onto the stack, which creates a sort of “false bottom” in the stack. The algorithm progresses as in the previous case, but when a right paren is seen, the stack is emptied as far back as the “false bottom,” then the “false bottom” is discarded. In Figure 3, an example is worked; the algorithm follows the example.


Figure 3. Infix-to-RPN Translation. with Precendences and Parentheses

Figure 3

Convert Expression to RPN (with parentheses and precedences):
{

initialize operator stack and RPN string;
while (there are tokens remaining in the expression)
{
  T = next token from expression;
  if (T == '(')
  {
    push T onto the stack;
  }
  else if (T is a variable or numeric literal)
  {
    add T to the RPN output string;
  }
  else if (T is '+', '-', '*', or '/')
  {
    if ((operator stack is empty) or
        (the value at the top of the stack is '(') or
        (precedence(operator at top of stack) < precedence(T)))
    {
      push T onto the operator stack;
    }
    else // clear operator stack and push new token onto it
    {
      do
      {
        add the operator at the top of the stack to the RPN string;
        pop the stack;

      }
      while ((the stack is not empty) and
             (the top of the stack is not '(') and
             (precedence(T) < precedence(operator at top of stack)));

      push T onto the stack;
    }
  else if (T is ')')  // clear operator stack back to the '('
  {
    while (top of operator stack is not '(')
    {
      add the operator at the top of the stack to the RPN string;
      pop the stack;

    }
  }
  else
  {
    report error!
  }
}

// no more tokens left in expression - clear stack into output
while (operator stack is not empty)
{
   add the operator at the top of the stack to the RPN string;
   pop the stack
;
}

return RPN string

}

(end of article)