**Basic Expression Parsing**

Click here for advanced expression parsing

When writing your own calculator it is necessary to build a converter that can transform an input mathematical expression such as `( 1 + 8 ) – ( ( 3 * 4 ) / 2 )`

, into a format that is more suited for evaluation by computers.

When evaluating expressions such as the one above (known as “infix notation“), that which appears simple and intuitive to us humans, is usually not so straightforward to implement in a programming language.

The shunting-yard algorithm is a method for parsing mathematical expressions written in infix notation to Reverse Polish Notation (RPN). The RPN notation is different to infix notation in that every operator (+, -, * etc) comes *after* the operands (numbers) and there are no parentheses (brackets).

So `( 3 * 4 )`

for example becomes `3 4 *`

.

When given an input string in Reverse Polish Notation, it is then possible to employ a simple algorithm based around the use of a stack in order to evaluate the RPN expression and determine the mathematical result.

**The Shunting-yard algorithm**

This pseudocode shows how the shunting-yard algorithm converts an expression given in conventional infix notation into Reverse Polish Notation:

For each token { If (token is a number) { Add number to the output queue } If (token is an operator eg +,-,*...) { While (stack not empty AND stack top element is an operator) { If ((token = left associative AND precedence

Implementing the the shunting-yard algorithm in JavaThe Java function to implement the shunting yard algorithm described above is as follows:

// Convert infix expression format into reverse Polish notation public static String[] infixToRPN(String[] inputTokens) { ArrayList<String> out = new ArrayList<String>(); Stack<String> stack = new Stack<String>(); // For each tokens for (String token : inputTokens) { // If token is an operator if (isOperator(token)) { // While stack not empty AND stack top element // is an operator while (!stack.empty() && isOperator(stack.peek())) { if ((isAssociative(token, LEFT_ASSOC) && cmpPrecedence(token, stack.peek())

Evaluating expressions in Reverse Polish NotationOnce we have obtained our input expression string in RPN format, we evaluate the expression using a simple algorithm, also based around the use of a stack. Pseudocode as follows:

For each token { If (token is a number) { Push value onto stack } If (token is an operator) { Pop 2 top values from the stack Evaluate operator using popped values as args Push result onto stack } }The single value remaining on the stack is the evaluation. Java code for this is as follows:

public static double RPNtoDouble(String[] tokens) { Stack<String> stack = new Stack<String>(); // For each token for (String token : tokens) { // If the token is a number push it onto the stack if (!isOperator(token)) { stack.push(token); } else { // Pop the two top entries Double d2 = Double.valueOf( stack.pop() ); Double d1 = Double.valueOf( stack.pop() ); //Get the result Double result = token.compareTo("+") == 0 ? d1 + d2 : token.compareTo("-") == 0 ? d1 - d2 : token.compareTo("*") == 0 ? d1 * d2 : d1 / d2; // Push result onto stack stack.push( String.valueOf( result )); } } return Double.valueOf(stack.pop()); }

ExampleConsider the expression

.`( 1 + 2 ) * ( 3 / 4 ) - ( 5 + 6 )`

Passing this as a string argument to the infixToRPN function will yield this RPN format:

.

1 2 + 3 4 / * 5 6 + -It is then simply a matter of evaluating the RPN format to give the desired mathematical result of

`-8.75`

.

Putting it all together: Java Implementation of Basic Expression ParserFull Java code listing as follows:

package expressionparser; import java.util.*; public class ExpressionParser { // Associativity constants for operators private static final int LEFT_ASSOC = 0; private static final int RIGHT_ASSOC = 1; // Operators private static final Map<String, int[]> OPERATORS = new HashMap<String, int[]>(); static { // Map OPERATORS.put("+", new int[] { 0, LEFT_ASSOC }); OPERATORS.put("-", new int[] { 0, LEFT_ASSOC }); OPERATORS.put("*", new int[] { 5, LEFT_ASSOC }); OPERATORS.put("/", new int[] { 5, LEFT_ASSOC }); } // Test if token is an operator private static boolean isOperator(String token) { return OPERATORS.containsKey(token); } // Test associativity of operator token private static boolean isAssociative(String token, int type) { if (!isOperator(token)) { throw new IllegalArgumentException("Invalid token: " + token); } if (OPERATORS.get(token)[1] == type) { return true; } return false; } // Compare precedence of operators. private static final int cmpPrecedence(String token1, String token2) { if (!isOperator(token1) || !isOperator(token2)) { throw new IllegalArgumentException("Invalid tokens: " + token1 + " " + token2); } return OPERATORS.get(token1)[0] - OPERATORS.get(token2)[0]; } // Convert infix expression format into reverse Polish notation public static String[] infixToRPN(String[] inputTokens) { ArrayList<String> out = new ArrayList<String>(); Stack<String> stack = new Stack<String>(); // For each token for (String token : inputTokens) { // If token is an operator if (isOperator(token)) { // While stack not empty AND stack top element // is an operator while (!stack.empty() && isOperator(stack.peek())) { if ((isAssociative(token, LEFT_ASSOC) && cmpPrecedence(token, stack.peek())

C++ Implementation of Basic Expression Parser#include <iostream> #include <sstream> #include <list> #include <stack> #include <map> #include <string> #include <vector> #include <iterator> #include <stdlib.h> const int LEFT_ASSOC = 0; const int RIGHT_ASSOC = 1; // Map the different operators: +, -, *, / etc typedef std::map< std::string, std::pair< int,int > > OpMap; typedef std::vector<std::string>::const_iterator cv_iter; typedef std::string::iterator s_iter; const OpMap::value_type assocs[] = { OpMap::value_type( "+", std::make_pair<int,int>( 0, LEFT_ASSOC ) ), OpMap::value_type( "-", std::make_pair<int,int>( 0, LEFT_ASSOC ) ), OpMap::value_type( "*", std::make_pair<int,int>( 5, LEFT_ASSOC ) ), OpMap::value_type( "/", std::make_pair<int,int>( 5, LEFT_ASSOC ) ) }; const OpMap opmap( assocs, assocs + sizeof( assocs ) / sizeof( assocs[ 0 ] ) ); // Test if token is an pathensesis bool isParenthesis( const std::string& token) { return token == "(" || token == ")"; } // Test if token is an operator bool isOperator( const std::string& token) { return token == "+" || token == "-" || token == "*" || token == "/"; } // Test associativity of operator token bool isAssociative( const std::string& token, const int& type) { const std::pair<int,int> p = opmap.find( token )->second; return p.second == type; } // Compare precedence of operators. int cmpPrecedence( const std::string& token1, const std::string& token2 ) { const std::pair<int,int> p1 = opmap.find( token1 )->second; const std::pair<int,int> p2 = opmap.find( token2 )->second; return p1.first - p2.first; } // Convert infix expression format into reverse Polish notation bool infixToRPN( const std::vector<std::string>& inputTokens, const int& size, std::vector<std::string>& strArray ) { bool success = true; std::list<std::string> out; std::stack<std::string> stack; // While there are tokens to be read for ( int i = 0; iSo when running the above code as a simple Windows Console application using an example input expression of

`( 1 + 2) * ( 3 / 4 )-(5+6)`

we get the following RPN tokens and mathematical result:If an inappropriate input expression is used, one with a mis-match in the number of left/right parentheses such as ( 1 + 2 * ( 3 / 4 )-(5+6), then we get an error:

And if we insert an additional minus sign '-' so that the original expression used is now

`-( 1 + 2) * ( 3 / 4 )-(5+6)`

, the algorithm recognizes that this should get treated as a unary minus operator, resulting in the following output:

UPDATE: 13 April 2013Extending the algorithm to handle further mathematical expressionsSo far our implementation has been applied to handle basic expressions based on standard mathematical operators +, -, *, / etc. The following downloadable C++ code makes a number of further improvements to handle additional mathematical operators sin, cos, tan, log, exp etc, as well as much more complicated subexpressions.

Additional sanity checks are included to make sure there are no mismatched numbers of parentheses, as well as some additional work on the tokenization of RPM strings to handle unary minus operators occuring at positions where an expression is expected eg -8 + 5 = -3 or 11 ^ -7 = 5.13158e-08.

All examples in this code have been verified using Google's online calculator. To use Google's built-in calculator simply enter your mathematical expression into the search box. For example:

Some examples are shown in the following table:

Expression (infix) | RPN (postfix) | Result |
---|---|---|

exp( 1.11 ) | 1.11 exp | 3.034358 |

sin( cos( 90 * pi / 180 ) ) | 90 pi * 180 / cos sin | 0.000001 |

34.5*(23+1.5)/2 | 34.5 23 1.5 + * 2 / | 422.625000 |

5 + ((1 + 2) * 4) - 3 | 5 1 2 + 4 * + 3 - | 14 |

( 1 + 2 ) * ( 3 / 4 ) ^ ( 5 + 6 ) | 1 2 + 3 4 / 5 6 + ^ * | 0.126705 |

3/2 + 4*(12+3) | 3 2 / 4 12 3 + * + | 61.5 |

PI*pow(9/2,2) | PI 9 2 / 2 pow * | 63.617197 |

((2*(6-1))/2)*4 | 2 6 1 - * 2 / 4 * | 20 |

ln(2)+3^5 | 2 ln 3 5 ^ + | 243.693147 |

11 ^ -7 | 11 -7 ^ | 5.13158e-08 |

cos ( ( 1.3 + 1 ) ^ ( 1 / 3 ) ) - log ( -2 * 3 / -14 ) | 1.3 1 + 1 3 / ^ cos -2 3 * -14 / log - | 0.616143 |

1 * -sin( Pi / 2) | 1 Pi 2 / -sin * | -1 |

-8 + 5 | -8 5 + | -3 |

1 - (-2^2) - 1 | 1 1 -2 2 ^ * - 1 - | 4 |

**Download the C++ code for handling further mathematical expressions:**

The download contains the Visual Studio 2010 code:

The file `main.cpp`

is an example usage of applying the `Tokenize`

method of the `ExpressionParser`

class to first prime the input expression string into a suitable format, which is subsequently converted to Reverse Polish Notation using the `InfixToRPN`

method.

If successful, the reverse Polish Notation is then evaluated to give the mathematical result.

Example usage shown here:

#include "ExpressionParser.h" // Print iterators in a generic way template<typename T, typename InputIterator> void Print( const std::string& message, const InputIterator& itbegin, const InputIterator& itend, const std::string& delimiter) { std::coutExample outputs:

Download below. Please contact me if you need any help in using the software.

hi,

just want to say thanks, it was very useful and, u’ve made it easy.

P.S

i think, you should change this, like what u have in your java code :

If (token is right bracket ')')

{

While (stack not empty AND

stack top element not a left bracket)

{

Pop the stack onto output queue

Pop the stack

}

}

to this:

If (token is right bracket ')')

{

While (stack not empty AND

stack top element not a left bracket)

{

Pop the stack onto output queue

}

Pop the stack

}

Hi thanks for the feedback, glad you found it useful. Post has been updated accordingly.

Well spotted!

Hello,

Thanks for your post . I have one thing..

In the updated version, If while loop is ended , then stack is empty which means , you can’t pop from… after while loop

Hi Yasmin

I think I know what you mean. But the stack is not quite empty, the while loop keeps popping until it reaches a left parenthesis:

// Until the token at the top of the stack is a left parenthesis, pop operators

// off the stack onto the output queue.

while ( !stack.empty() && stack.top() != "(" )

{

// Add to end of list

outputQueue.push( stack.top() );

stack.pop();

}

If it is not left with a remaining “(” then the expression is invalid and the program exits.

I know, but in your original post the second pop statement from the stack is out of while loop .so if there is no opening ( and everything is just pop from the stack, it will release exception, so it should be handled in case user miswritting the opening bracket…

Ah, I see what you mean. I think this improvement got put into the in the downloadable, but the one shown here still relies on a valid input expression. I will update the code in due course. Thanks for pointing that out.

Hi I have updated the C++ code listing on this page to reflect this. You still need to enter each value separated by a whitespace though, the download has tokenizers etc to cope with these.

hello sir, could u tell me the java coding to use one expression for calculation which has been passed as string from another java file?

Hi Rakesh. I’m not sure I understand your question. Could you be more specific? It would help if you could express your problem in terms of what your current input is, and what your desired outcome would be. What do you mean by “passed as string from another java file?”

really cool, since easy to extract an abstract base class with generic token. thanx a lot!!

Thanks for your kind words. The intention was to keep this as simple as possible, so that others can use it and/or modify it in ways they see fit.

Hi,

the solution is really elegant…

Are there algorithms for inserting braces in a mathematical expression…..???

You mean braces in things like sin or log expressions?

Can your parser solve this questions?

Given a series of numbers as the input, the last one as the result. Use the rest numbers to calculate the result,only +, -, *, / are allowed. The order of the input cannot be changed. If there is an equation, print it; or print “no equation”. If more than one solution, any working equation is fine.

example:

input: 2, 3, 1, 4

output: 2+3-1 = 4.

Hi Andy. No this parser as it stands would not be able to do what describe. If I understand you correctly, would it not be a case of writing an additional routine to:

1. parse your original input string of “2,3,1,4”, separating it into an input part (“2,3,1”) and an expected output part (“4”).

2. replace each of the the commas in your input string with the arithmetic operators you prefer so that “2,3,1” becomes “2+3-1”

3. feed this new string “2+3-1” into the infixToRPN() routine and then the RPNtoDouble() routine.

4. compare the answer returned with the desired answer. If they are equal print the new string appended with the equals sign and the desired output string ie “2+3-1” + “=4” ; otherwise print “no equation”.

Hope this helps.

can you help me with this problem pleas.

Write a program that will evaluate mathematical equation and print its result.

ex. enter equation (3+4)/2*1

result 3.5

Hi have you tried using the expression “(3+4)/2*1” in the Java or C++ versions shown here?

For the C++ code, go the line that says

istringstream iss( “( 1 + 2 ) * ( 3 / 4 ) – ( 5 + 6 )” );

and replace this with

istringstream iss( “( 3 + 4 ) / 2 * 1” ).

Ditto the For the Java version, just simply use:

String[] input = “( 3 + 4 ) / 2 * 1″.split(” “);

Both yield the desired result of 3.5

Thanks for this great job!!!

I have a question how can i add a subexpression ???

example:

10 + 20 * sin ( 30+40 ) -cos ( 10*sin(10) )

Thanks

Will look into it one day…

The newest version is now able to evaluate expressions like this.

Thank you for sharing your code đź™‚

You’re welcome.

How can I add sin, cos, tan, pi, square root and square to the code? Should they be added to own OPERATORS?

Newset version can now handle functions like these.

I think it may involve this. When I get a moment I will look into it. If you see any useful links please let me know.

Cool that you are going to look into other operators! Looking foward to it!

Done!

Thanks,

But how can I tell the difference between subtraction(-) and negative (-) ?

such as 8-3 is subtraction but -8+3 is negative.

Now covered in newest version!

Really nice introduction into the shunting yard algorithm. If you are interested in a versatile and extendable expression parser using an LL(1) grammar for Java follow this link.

http://cogitolearning.co.uk/?p=523

Very slick. Nice one!

Hi, Andy. Can I use this Java code for my own project?

Yes.

Hi Andy,

We are currently working on a RPN Calculator and your algorithm help us a lot.

Thank you very much for sharing your code,

ThĂ©ophile