php - What is the algorithm for parsing expressions in infix notation? -
i parse boolean expressions in php. in:
a , b or c , (d or f or not g) the terms can considered simple identifiers. have little structure, parser doesn't need worry that. should recognize keywords and or not ( ). else term.
i remember wrote simple arithmetic expression evaluators @ school, don't remember how done anymore. nor know keywords in google/so.
a ready made library nice, remember algorithm pretty simple might fun , educational re-implement myself.
recursive descent parsers fun write , easy read. first step write grammar out.
maybe grammar want.
expr = and_expr ('or' and_expr)* and_expr = not_expr ('and' not_expr)* not_expr = simple_expr | 'not' not_expr simple_expr = term | '(' expr ')' turning recursive descent parser super easy. write 1 function per nonterminal.
def expr(): x = and_expr() while peek() == 'or': consume('or') y = and_expr() x = or(x, y) return x def and_expr(): x = not_expr() while peek() == 'and': consume('and') y = not_expr() x = and(x, y) return x def not_expr(): if peek() == 'not': consume('not') x = not_expr() return not(x) else: return simple_expr() def simple_expr(): t = peek() if t == '(': consume('(') result = expr() consume(')') return result elif is_term(t): consume(t) return term(t) else: raise syntaxerror("expected term or (") this isn't complete. have provide little more code:
input functions.
consume,peek, ,is_termfunctions provide. they'll easy implement using regular expressions.consume(s)reads next token of input , throws error if doesn't matchs.peek()returns peek @ next token without consuming it.is_term(s)returns true ifsterm.output functions.
or,and,not, ,termcalled each time piece of expression parsed. can whatever want.wrapper function. instead of calling
exprdirectly, you'll want write little wrapper function initializes variables usedconsume,peek, callsexpr, , checks make sure there's no leftover input didn't consumed.
even this, it's still tiny amount of code. in python, complete program 84 lines, , includes few tests.
Comments
Post a Comment