README file for PrsParser package

Package Name: PrsParser
Package Coordinator: Andy Salnikov (salnikov@slac.stanford.edu)
Creation Date: April 2000
Last Modification: See History file

This package contains a generic parser class for C++-like (or maybe even Python-like) expressions. The complete grammar is in the prs_parser.y.

Language

Types

Parser supports operation in three basic types - bool, int and double, and one complex type - list. List is basically a heterogeneous sequence which can contain basic types or other lists. Bool and int types are called integral types, integral types and double make arithmetic types.

Parser understands literals ot these types:

bool `true' or `false'
int decimal (100), octal (0100), hex (0x100), bit (0b100)
double standard notations
list '[' ... ']'

Inside the brackets in list literal there can be arbitrary number of expressions or literals, separated by commas, for example these are the valid list literals:

[]
[ 1, 2 ]
[ a, b, c ]
[ 1, 0.1234, true, [ x, y, [] ], z ]
[ [], [ [] ], [ [ [] ] ], [ [ [ [] ] ] ] ]

Type conversion works as in C++ between arithmetic types. The list can be converted to bool only, empty lists are converted to 'false', non-empty to 'true'. No conversions from arithmetic typed to lists are defined.

Identifiers

Parser understands identifiers - sequence of letters, digits or underscores starting with letter or underscore. The type and value of each identifier is determined by something external to the parser (see "Using Parser" below). There may be also "dotted" identifiers - a pair of the identifiers with a `.' between (dotted ids cannot be assigned.) Exmples of good identifiers:

a
a_1
someNotSoLongIdentifier
thisIdentifierIsSomewhatLonger.AndHasADotInItSoLooksLikeObjectWithMember
part.energy

Expressions

All supported expressions are listed in the order of their priorities.

Grouping: '(' expr ')'
Grouping (re)defines the order of evaluation of subexpressions, 'expr' can be any valid expression. Example: '(a+b)*c'
Function call: func '(' [arg1 [, arg2 ...]] ')'
'func' is a name of the function, all functions are predefined, see list below. The number of arguments and their type depends on the function. Example: 'sin(3.1415926)', 'len([1,2,3])'
Functional cast: type '(' expr ')'
'type' is bool, int or double. 'expr' will be converted to that type according to rules described above. Example: 'int(3.14159)', 'bool(a-b)'
Indexing: expr '[' idx ']'
'expr' must be expression returning list, 'idx' - expression with integral type. Examples: 'list[0]', '[1,2,3,4][2]'
Logical negate: '!' expr, 'not' expr
'expr' will be converted to type bool prior to negaion. Example: 'not a'.
Binary complement: '~' expr, 'compl' expr
'expr' will be converted to int prior to applying the operator. Example: 'compl 0b1001001'
Unary plus, minus: '+' expr, '-' expr
Unary plus does nothing at all, unary minus negates its operand (bool is promoted to int). In case the 'expr' evaluates to list type, unary minus applies to each element of the list and the result of expression is also list.
Cast: '(' type ')' expr
Same as the functional cast. Examples: '(int)3.14', '(double)true'
Exponentiation: expr1 '**' expr2
Both expressions can be of any arithmetic type, bool values will be converted to int. Return type will be double if any or both operands are double, int otherwise. Examples: 'R=sqrt(p.x**2+p.y**2)', '2.**51'
Multiply, divide, modulus: a '*' b, a '/' b, a '%' b
Works as usual for arithmetic types (operator % works only for integral types). Ususal type promotion rules apply, bools are converted to ints. One of the two expressions can have list type, in this case operator applies to each element of the list and the result of expression is also list. Example: '[1,2,3,4,5,6] % 3' will give [1,2,0,1,2,0].
Add, subtract: a '+' b, a '-' b
The same as for previous operators. Example: '[1,2,3.14] - 1' will give [0,1,2.14].
Shift operators: a '<<' b, a '>>' b
Operands can be integral expressions, or one operand can be a list containing integral expressions (or other lists with integral expressions). Result has int type (or a list of ints). Example: '1 >> [1,[2,[3]]]' will give [2,[4,[8]]].
Comparison: a '<' b, a '>' b, a '<=' b, a '>=' b
Compare two operands, return type is bool. Operands can be of any type, list objects are always "greater" than numbers. Two list objects are compared lexicographically. Examples: '[] > 10000000' will return true, '[1,2,3,4] < [1,2,3,2]' will return false, '[1,2,3] > [1,2,3,4]' will return false.
Containment: a 'in' b, a 'not in' b
(Same priority as comparison) Checks for existence of element a in the list b, resulting type is bool. Example: '3 in [1,2,3,4,5]' will return true.
Equality comparison: a '==' b, a '!=' b
Compare two operands, return type is bool. Operands can be of any type, list objects are always "greater" than numbers. Two list objects are compared lexicographically. Examples: '[] == 0' will return false, '[1,2,3,4] != [1,2,3,2]' will return true, '[1,2,3] == [1,2,3,4]' will return false.
Bit and: a '&' b, a 'bitand' b
Operands can be integral expressions, or one operand can be a list containing integral expressions (or other lists with integral expressions). Result has int type (or a list of ints). Example: '[ 0b11001100, 0b00110011 ] & 0b01010101' will return [ 0b01000100, 0b00010001 ].
Exclusive or: a '^' b, a 'xor' b
Same as above
Bit or: a '|' b, a 'bitor' b
Same as above
Logical and: a '&&' b, a 'and' b
Both operands are converted to bool. Lazy evaluation rules apply. Example: '[] and true' will return false.
Logical or: a '||' b, a 'or' b
Both operands are converted to bool. Lazy evaluation rules apply. Example: '[] or true' will return true.
Ternary operator: a '?' b ':' c
First operand is converted to bool. Only one of the two other operands is evaluated and returned, depending on the value of first operand. The type of the expression is the type of the returned operand (it may change from one evaluation to another if the types of operands differ.) Example: 'a == b ? [] : [a,b]'.
Assignment: id '=' expr
Right side of the asignment can be of any type. Left side can have only identifiers, and only non-dotted identifiers. Example: 'res = 3 in list ? [3] : [1,2,4]'

Functions

Functions usualy take some parameters and return result. The number of parameters can be fixed or variable. The type of parameters can be restricted for some functions. Usual type promotion works in most cases.

Many ordinary functions, such as abs(x), can take a list as argument too. For the list they work by applying the same function for each element and returning resulting l;ist of the same shape. For example, the call

abs ( [-1, 1, [-2, [2, 3], -3]] ) 

will return [1, 1, [2, [2, 3], 3]]. Such functions are marked as "**" in the table below.

Here is the list of functions supported by parser n alphabetic order:

Name arg # arg types ret type Description
Notes:
*) min/max functions when take few arguments evaluate minimum/maximum value of all arguments. When there is only one argument, it must be a list and returned value is a minumum/maximum value of elements in the list. Returned type is determined by the most promoted type of the arguments or the list elements. [each argument or list element must have scalar type]
**) function can take a list as argument and is applied to each element in the list
abs(x) 1 any same absolute value of argument
acos(x) 1 double,** double arccosine of argument
asin(x) 1 double,** double arcsine of argument
atan(x) 1 double,** double arctangent of argument
atan2(y,x) 2 double,** double arctangent of y/x
ceil(x) 1 double,** double nearest int not less than x
cos(x) 1 double,** double cosine of argument
cosh(x) 1 double,** double hyperbolic cosine of argument
exp(x) 1 double,** double exponentiation of argument
fabs(x) 1 double,** double absolute value as a double
floor(x) 1 double,** double neares int not greater than x
fmod(x,y) 2 double,** double remainder of x/y
hypot(x,y) 2 double,** double sqrt(x*x + y*y)
index(x,y) 2 1: list
2: any
int index of first y inside list x, or -1
log(x) 1 double,** double base E logarithm of argument
log10(x) 1 double,** double base 10 logarithm of argument
len(l) 1 list int length of the list
max(...) * var any scalar max element
min(...) * var any scalar min element
pow(x,y) 2 double,** double x ** y
rindex(x,y) 2 1: list
2: any
int index of last y inside list x, or -1
sin(x) 1 double,** double sine of argument
sinh(x) 1 double,** double hyperbolic sine of argument
sqrt(x) 1 double,** double square root of argument
tan(x) 1 double,** double tangent of argument
tanh(x) 1 double,** double hyperbolic tangent of argument

Using parser

There are two main classes which define user interface of the package: PrsParser and PrsResult. PrsParser is the class which performs expression parsing, every PrsParser object "serves" exactly one expression, which is supplied to the parser in its constructor. After construction the parser object holds the pointer to the expression tree, which is returned by the tree() method of PrsParser. If parsing generated errors that pointer is zero. The pointer has type PrsNode*. To evaluate the expression one needs to call the method eval() for this pointer, which returns an object of type PrsResult. PrsResult encapsulates both the type and the value of the result. In the case the evaluation was unsuccessfull the type and the value of the result is undefined. To check that the evaluation was successful one can use ok() method of PrsResult.

To evaluate the expression parser has to know the "meaning" of the identifiers used in expressions. This meaning is communicated to the parser through another object which user has to provide in PrsParser constructor, this object has to inherit from the class PrsAttrAccessorFactory. This object is used to created "getters" and "setters" objects (objects of the classes PrsAttrGetter and PrsAttrSetter) which are used by the parser. "Setter" objects are called for the identifiers which appear on the left side of the assignment operator, "getters" are called for all other identifiers.

More or less complete example of PrsParser usage can be found in the test program testPrsParser.cc. Note that factory/getter/setter interface is very flexible, you can do whatever you want with this, not just simple dictionary lookup as in the example above.