Lex Pattern | Meaning or Use | |
[ ] | square brackets - match any single one of the characters inside the brackets. A range of letters or digits can be written with the - character, as in [a-z] to match any lower-case letter. | |
* | matches 0 or more occurrences of the preceding expression. a* is any number of a's, including none. (ab)* is any number of occurrences of the ab pair, like ab, abab, ababababababab, etc. | |
+ | matches 1 or more occurrences of the preceding expression. | |
? | matches 0 or 1 occurrences of the preceding expression. For example, -?[0-9]+ matches positive (no -) or negative (preceded by a -) integers. | |
^ | When used as the first character inside square brackets, negates the character class. For example, [^axr] matches any single character except a, x or r. At the very start of a pattern, outside brackets, ^ matches only at the beginning of a line. | |
/ | trailing context - the regular expression before the / is matched only if it is immediately followed by text matching the regular expression after the /. The text after the / is checked but is not part of the match. Example: [a-d]/[0-9] matches an a, b, c or d only when it is immediately followed by a digit from 0 to 9. Only one / can be used in a pattern. | |
| | matches either of the expressions on its two sides (an OR condition). For example, ABCD|WXYZ matches either ABCD or WXYZ. | |
{ } | Used to indicate an exact number of repetitions of the preceding pattern. [a-z]{3} matches exactly three lower-case letters. If two numbers appear in the braces, separated by a comma, they give the minimum and maximum number of occurrences. For example, X{3,5} says that the X must appear at least 3 and no more than 5 times. | |
( ) | parentheses can be used for grouping, just as in algebra. ([a-z]|[0-9])+ means one or more occurrences of either a lower-case letter or a digit, the same as [a-z0-9]+ | |
\ | the escape symbol - \ says to match the character that follows literally, ignoring any special meaning it might have. For example, since * has a special meaning, to match a literal *, you must use \* in the expression. The same can be done by putting the * in double quotes: "*" | |
. | period - matches any single character except the newline (\n) |
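
Most of the operators in the table behave the same way in other regular-expression dialects, so the examples can be checked outside Lex. The sketch below uses Python's `re` module as a stand-in. That is an assumption made for illustration: it is not Lex, and Lex-only syntax such as double-quoted literals is left out. The helper names `matches` and `trailing_context` are hypothetical. Trailing context has no direct `re` equivalent, so the lookahead `(?=...)` is only an approximation of `/`.

```python
import re

def matches(pattern, text):
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

def trailing_context(text):
    """Approximate the Lex pattern [a-d]/[0-9] with a lookahead.

    Returns the matched text (without the digit), or None if there is no match.
    """
    m = re.match(r"[a-d](?=[0-9])", text)
    return m.group() if m else None

# (pattern, test string) pairs built from the examples in the table.
examples = [
    (r"(ab)*", "ababab"),       # any number of ab pairs
    (r"(ab)*", ""),             # * also allows zero occurrences
    (r"-?[0-9]+", "-42"),       # optional leading minus sign
    (r"[^axr]", "x"),           # x is excluded by the negated class
    (r"ABCD|WXYZ", "WXYZ"),     # either alternative
    (r"[a-z]{3}", "abcd"),      # four letters, but exactly three are required
    (r"X{3,5}", "XXXX"),        # between 3 and 5 repetitions
    (r"([a-z]|[0-9])+", "a1b2"),
    (r"\*", "*"),               # escaped *, matched literally
    (r".", "\n"),               # period does not match a newline
]

if __name__ == "__main__":
    for pattern, text in examples:
        print(repr(pattern), repr(text), matches(pattern, text))
    print("trailing context on 'c7':", trailing_context("c7"))
    print("trailing context on 'cx':", trailing_context("cx"))
```

Using `re.fullmatch` means each pattern must cover the entire test string, which makes the effect of `{3}` and `{3,5}` easy to see. A real Lex scanner instead matches a prefix of the remaining input.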
Lex variables: (see example code)
Lex tries to use the pattern that matches the longest stretch of input. If two or more patterns match the same number of characters, it uses the pattern that appears first in the rules section.
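
The matching rule above can be sketched as a tiny scanner. This is a minimal illustration in Python, not how Lex is implemented (Lex compiles its rules into a single automaton). The rule set `RULES` and the functions `pick_rule` and `tokenize` are hypothetical names invented for this example. One caveat: inside a single pattern, Python's `re` tries alternatives left to right rather than choosing the longest, so the rules here are chosen to avoid that difference.

```python
import re

# Hypothetical rules, listed in the order they would appear in the rules section.
RULES = [
    ("KEYWORD", r"if|else"),
    ("IDENT",   r"[a-z]+"),
    ("NUMBER",  r"-?[0-9]+"),
    ("SPACE",   r"[ \t\n]+"),
]

def pick_rule(text, pos, rules=RULES):
    """Return (name, lexeme) for the rule chosen at `pos`, or None if nothing matches."""
    best = None
    for name, pattern in rules:
        m = re.compile(pattern).match(text, pos)
        if m and m.end() > pos:
            # Only a strictly longer match replaces the current choice,
            # so on a tie the rule listed first is kept.
            if best is None or len(m.group()) > len(best[1]):
                best = (name, m.group())
    return best

def tokenize(text, rules=RULES):
    """Split `text` into (name, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        choice = pick_rule(text, pos, rules)
        if choice is None:
            # No rule matched: record the single character and move on.
            choice = ("UNMATCHED", text[pos])
        if choice[0] != "SPACE":
            tokens.append(choice)
        pos += len(choice[1])
    return tokens

if __name__ == "__main__":
    for token in tokenize("if iffy 42"):
        print(token)
```

On the input `if iffy 42`, both KEYWORD and IDENT match the two characters `if`, so the earlier rule (KEYWORD) is used. For `iffy`, IDENT matches four characters against KEYWORD's two, so the longer match is used even though KEYWORD is listed first.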