JavaScript 2.0
Formal Description
Regular Expression Grammar

Monday, June 7, 1999

This LR(1) grammar describes the regular expression syntax of the JavaScript 2.0 proposal. See also the description of the grammar notation.

This document is also available as a Word 98 rtf file.

Unicode Character Classes

UnicodeCharacter ⇒ Any Unicode character
UnicodeAlphanumeric ⇒ Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)
LineTerminator ⇒ «LF» | «CR» | «u2028» | «u2029»

Regular Expression Definitions

Regular Expression Patterns

RegularExpressionPattern ⇒ Disjunction

Disjunctions

Disjunction ⇒
   Alternative
|  Alternative | Disjunction

Alternatives

Alternative ⇒
   «empty»
|  Alternative Term
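
As a rough illustration of the Disjunction and Alternative productions, the sketch below runs two patterns through the RegExp engine of a present-day JavaScript implementation. That engine, its left-to-right choice between alternatives, and the variable names are assumptions made for the example; the grammar above defines syntax only.

```javascript
// Two Alternatives separated by |. Current engines try them left to right,
// so the shorter first alternative is the one reported here.
const firstAlternative = "catalog".match(/cat|catalog/)[0];   // "cat"

// An Alternative may be «empty», so a trailing | is syntactically valid.
const optionalA = /^(?:a|)$/;
const acceptsA = optionalA.test("a");          // true
const acceptsEmpty = optionalA.test("");       // true
const acceptsB = optionalA.test("b");          // false
```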

Terms

Term ⇒
   Assertion
|  Atom
|  Atom Quantifier
Quantifier ⇒
   QuantifierPrefix
|  QuantifierPrefix ?
QuantifierPrefix ⇒
   *
|  +
|  ?
|  { DecimalDigits }
|  { DecimalDigits , }
|  { DecimalDigits , DecimalDigits }
DecimalDigits ⇒
   DecimalDigit
|  DecimalDigits DecimalDigit
DecimalDigit ⇒ 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
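
The Quantifier productions can be tried out with a short sketch. It assumes a present-day JavaScript RegExp engine, in which a QuantifierPrefix followed by ? selects the non-greedy form; the sample strings and variable names are illustrative only.

```javascript
const text = "<b>bold</b>";

// QuantifierPrefix alone (+) is greedy: it takes as much as it can.
const greedy = text.match(/<.+>/)[0];     // "<b>bold</b>"

// QuantifierPrefix followed by ? (+?) takes as little as it can.
const lazy = text.match(/<.+?>/)[0];      // "<b>"

// The three brace forms: {n}, {n,} and {n,m}.
const exactlyTwo = /^a{2}$/.test("aa");       // true
const atLeastTwo = /^a{2,}$/.test("aaaaa");   // true
const twoToThree = /^a{2,3}$/.test("aaaa");   // false
```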

Assertions

Assertion ⇒
   ^
|  $
|  \ b
|  \ B
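
A minimal sketch of the four Assertion forms follows, assuming the behavior of a present-day JavaScript RegExp engine (start and end of input for ^ and $, word boundary and non-boundary for \b and \B). The test strings are made up for illustration.

```javascript
// \b requires a word boundary on each side of "cat".
const wordCat = /\bcat\b/;
const standalone = wordCat.test("a cat sat");      // true
const embedded = wordCat.test("concatenate");      // false

// \B requires the absence of a word boundary.
const innerCat = /\Bcat\B/;
const inner = innerCat.test("concatenate");        // true
const notInner = innerCat.test("a cat sat");       // false

// ^ and $ anchor the match to the start and end of the input.
const whole = /^abc$/.test("abc");                 // true
const partial = /^abc$/.test("xabc");              // false
```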

Atoms

Atom ⇒
   PatternCharacter
|  .
|  \ AtomEscape
|  CharacterClass
|  ( Disjunction )
|  ( ? : Disjunction )
|  ( ? = Disjunction )
|  ( ? ! Disjunction )
PatternCharacter ⇒ UnicodeCharacter except ^ | $ | \ | . | * | + | ? | ( | ) | [ | ] | { | } | |
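
The four parenthesized Atom forms can be compared side by side. The sketch below assumes a present-day JavaScript RegExp engine, where ( ) captures, (?: ) groups without capturing, and (?= ) and (?! ) are positive and negative lookahead; the inputs and names are illustrative.

```javascript
// ( Disjunction ) captures; ( ? : Disjunction ) only groups.
const m = "2024-06".match(/(\d+)-(?:\d+)/);
const captured = m[1];            // "2024"
const resultLength = m.length;    // 2: the whole match plus one capture

// ( ? = Disjunction ) must match ahead but is not part of the result.
const amount = "price: 100USD".match(/\d+(?=USD)/)[0];   // "100"

// ( ? ! Disjunction ) must not match ahead.
const fooBar = /^foo(?!bar)/.test("foobar");   // false
const fooBaz = /^foo(?!bar)/.test("foobaz");   // true
```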

Escapes

AtomEscape ⇒
   DecimalOrOctalEscape
|  CharacterEscape
|  CharacterClassEscape
CharacterEscape ⇒
   ControlEscape
|  c ControlLetter
|  HexEscape
|  IdentityEscape
ControlLetter ⇒
   A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
|  a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
IdentityEscape ⇒ UnicodeCharacter except UnicodeAlphanumeric
ControlEscape ⇒
   f
|  n
|  r
|  t
|  v
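
For the CharacterEscape alternatives, a short sketch may help. It assumes the meanings a present-day JavaScript RegExp engine gives these escapes (for example, \cJ denoting the control character with code 10); those meanings are not stated by the grammar itself, and the variable names are illustrative.

```javascript
// ControlEscape: \t is a tab, \v a vertical tab (code 0x0B).
const tab = /\t/.test("\t");                 // true
const verticalTab = /\v/.test("\x0B");       // true

// c ControlLetter: upper and lower case letters are both allowed.
const controlUpper = /\cJ/.test("\n");       // true
const controlLower = /\cj/.test("\n");       // true

// IdentityEscape: a non-alphanumeric character stands for itself,
// so \. is a literal dot rather than the . atom.
const literalDot = /a\.b/.test("a.b");       // true
const notWildcard = /a\.b/.test("axb");      // false
```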

Decimal and Octal Escapes

DecimalOrOctalEscape ⇒
   DecimalDigit [lookahead∉{DecimalDigit}]
|  ZeroToThree OctalDigit [lookahead∉{OctalDigit}]
|  ZeroToThree EightOrNine
|  FourToNine DecimalDigit
|  ZeroToThree OctalDigit OctalDigit
ZeroToThree ⇒ 0 | 1 | 2 | 3
FourToNine ⇒ 4 | 5 | 6 | 7 | 8 | 9
OctalDigit ⇒ 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
EightOrNine ⇒ 8 | 9
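
The five DecimalOrOctalEscape alternatives determine how many digits after the backslash belong to the escape. The sketch below is one possible reading of those productions as a length calculation. The function name decimalOrOctalEscapeLength is hypothetical, and the code deliberately says nothing about whether the escape denotes a backreference or an octal character code, since the grammar above does not specify that.

```javascript
// Hypothetical helper: given the text following a backslash, return how many
// leading characters form a DecimalOrOctalEscape (0 if none do).
function decimalOrOctalEscapeLength(s) {
  const isDigit = (ch) => ch !== undefined && ch >= "0" && ch <= "9";
  const isOctal = (ch) => ch !== undefined && ch >= "0" && ch <= "7";
  const [a, b, c] = s;                 // missing characters become undefined

  if (!isDigit(a)) return 0;           // not an escape of this kind
  if (!isDigit(b)) return 1;           // DecimalDigit [lookahead∉{DecimalDigit}]
  if (a >= "4") return 2;              // FourToNine DecimalDigit
  if (!isOctal(b)) return 2;           // ZeroToThree EightOrNine
  if (isOctal(c)) return 3;            // ZeroToThree OctalDigit OctalDigit
  return 2;                            // ZeroToThree OctalDigit [lookahead∉{OctalDigit}]
}
```

Under this reading, decimalOrOctalEscapeLength("1)") is 1, decimalOrOctalEscapeLength("128") is 2, and decimalOrOctalEscapeLength("377") is 3.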

Hexadecimal Escapes

HexEscape ⇒
   x HexDigit HexDigit
|  u HexDigit HexDigit HexDigit HexDigit
HexDigit ⇒ 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f
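
Both HexEscape forms can be checked quickly. The sketch assumes a present-day JavaScript RegExp engine, where the hex digits give the code of the character to match; the sample characters are arbitrary.

```javascript
// x HexDigit HexDigit: two hex digits.
const twoDigits = /\x41/.test("A");                   // true

// u HexDigit HexDigit HexDigit HexDigit: four hex digits.
const fourDigits = /\u0041/.test("A");                // true

// HexDigit accepts both upper and lower case letters.
const lowerCaseHex = /\u00e9/.test("\u00E9");         // true
const lineSeparator = /\u2028/.test("\u2028");        // true
```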

Character Class Escapes

CharacterClassEscape ⇒
   s
|  S
|  d
|  D
|  w
|  W
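
The six CharacterClassEscape letters come in lower and upper case pairs. The sketch below assumes the meanings used by present-day JavaScript RegExp engines (whitespace, decimal digits and word characters, with the upper case letter as the complement); the grammar above lists only the syntax.

```javascript
// \d matches decimal digits; \D matches everything else.
const digits = "a1 b2".match(/\d/g).join("");       // "12"
const anyNonDigit = /\D/.test("123");               // false

// \w matches letters, digits and underscore; \W is the complement.
const allWord = /^\w+$/.test("foo_42");             // true
const anyNonWord = /\W/.test("foo_42");             // false

// \s matches whitespace; \S is the complement.
const fields = "a b\tc".split(/\s+/);               // ["a", "b", "c"]
const anyNonSpace = /\S/.test(" \t");               // false
```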

User-Specified Character Classes

CharacterClass ⇒
   [ [lookahead∉{^}] ClassRanges ]
|  [ ^ ClassRanges ]
ClassRanges ⇒
   «empty»
|  NonemptyClassRanges^dash
δ ∈ {dash, noDash}
NonemptyClassRanges^δ ⇒
   ClassAtom^dash
|  ClassAtom^δ NonemptyClassRanges^noDash
|  ClassAtom^δ - ClassAtom^dash ClassRanges

Character Class Range Atoms

ClassAtom^δ ⇒
   ClassCharacter^δ
|  \ ClassEscape
ClassCharacter^dash ⇒ UnicodeCharacter except \ | ]
ClassCharacter^noDash ⇒ ClassCharacter^dash except -
ClassEscape ⇒
   DecimalOrOctalEscape
|  b
|  CharacterEscape
|  CharacterClassEscape
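
The character class productions allow ranges, negation, an empty class, and a dash that is literal at the start or end of the class. The sketch below exercises each case, assuming the behavior of a present-day JavaScript RegExp engine (including \b inside a class denoting the backspace character); the inputs are illustrative.

```javascript
// A range and its negation.
const inRange = /^[a-c]$/.test("b");              // true
const negated = /^[^a-c]$/.test("b");             // false

// A dash at the end or the start of the class is an ordinary character.
const trailingDash = /^[a-]$/.test("-");          // true
const leadingDash = /^[-a]$/.test("-");           // true

// ClassEscape: \b inside a class is a backspace, not a word boundary,
// and a CharacterClassEscape such as \d may appear as a class atom.
const backspace = /[\b]/.test("\b");              // true
const digitsOrUnderscore = /^[\d_]+$/.test("4_2"); // true

// ClassRanges may be «empty»: [] matches no character at all.
const emptyClass = /[]/.test("anything");         // false
```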

Waldemar Horwat
Last modified Monday, June 7, 1999