JavaScript 2.0
Formal Description
Lexer Grammar

Thursday, November 11, 1999

This LALR(1) grammar describes the lexer syntax of the JavaScript 2.0 proposal. See also the description of the grammar notation.

This document is also available as a Word 98 RTF file.

The start symbols are NextToken^re and NextToken^div. Which one applies depends on whether a / should be interpreted as the start of a regular expression or as a division operator.
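To illustrate the two start symbols, here is a minimal Python sketch of how a tokenizer might treat a / under each goal. The function name scan_slash_token, the goal strings "re" and "div", and the deliberately simplified regular expression pattern are all assumptions made for this example. The grammar itself does not say how the caller decides which goal applies.

```python
import re

# Simplified stand-in for RegExpLiteral: the body may not start with "*" or "/",
# a backslash escapes the next character, and flags are ASCII word characters.
SIMPLE_REGEXP = re.compile(r"/(?![*/])(?:\\.|[^/\\\n])+/\w*")


def scan_slash_token(text, pos, goal):
    """Tokenize the "/" at text[pos] under goal "re" or goal "div"."""
    if goal == "re":
        # Under NextToken^re a "/" can only begin a regular expression literal.
        m = SIMPLE_REGEXP.match(text, pos)
        return ("RegExpLiteral", m.group(0)) if m else None
    # Under NextToken^div a "/" is a DivisionPunctuator: "/=" or "/".
    if text.startswith("/=", pos):
        return ("DivisionPunctuator", "/=")
    if text.startswith("/", pos):
        return ("DivisionPunctuator", "/")
    return None
```

The same input position therefore yields different tokens depending on the goal the caller supplies.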

Unicode Character Classes

UnicodeCharacter  Any Unicode character
UnicodeInitialAlphabetic  Any Unicode initial alphabetic character (includes ASCII A-Z and a-z)
UnicodeAlphanumeric  Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)
WhiteSpaceCharacter 
   «TAB» | «VT» | «FF» | «SP» | «u00A0»
|  «u2000» | «u2001» | «u2002» | «u2003» | «u2004» | «u2005» | «u2006» | «u2007»
|  «u2008» | «u2009» | «u200A» | «u200B»
|  «u3000»
LineTerminator  «LF» | «CR» | «u2028» | «u2029»
ASCIIDigit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Comments

LineComment  / / LineCommentCharacters
LineCommentCharacters 
   «empty»
|  LineCommentCharacters NonTerminator
NonTerminator  UnicodeCharacter except LineTerminator
BlockComment  / * BlockCommentCharacters * /
BlockCommentCharacters 
   «empty»
|  BlockCommentCharacters NonSlash
|  PreSlashCharacters /
PreSlashCharacters 
   «empty»
|  BlockCommentCharacters NonAsteriskOrSlash
|  PreSlashCharacters /
NonSlash  UnicodeCharacter except /
NonAsteriskOrSlash  UnicodeCharacter except * | /
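As a rough illustration, the comment productions can be approximated with Python regular expressions. The names LINE_COMMENT, BLOCK_COMMENT, and match_comment are hypothetical. The sketch assumes that the BlockCommentCharacters and PreSlashCharacters productions together accept exactly those bodies that never contain the sequence */, so a block comment ends at the first */ and does not nest.

```python
import re

# LineComment: "//" followed by any run of NonTerminator characters
# (anything except LF, CR, U+2028, U+2029).
LINE_COMMENT = re.compile("//[^\n\r\u2028\u2029]*")

# BlockComment: "/*", a body that never contains "*/", then "*/".
BLOCK_COMMENT = re.compile(r"/\*(?:(?!\*/).)*\*/", re.DOTALL)


def match_comment(text, pos=0):
    """Return the comment that starts at pos, or None if there is none."""
    for pattern in (LINE_COMMENT, BLOCK_COMMENT):
        m = pattern.match(text, pos)
        if m:
            return m.group(0)
    return None
```

Note that /*/ is not a complete comment under these rules, because the opening asterisk cannot double as the closing one.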

White space

WhiteSpace 
   «empty»
|  WhiteSpace WhiteSpaceCharacter
|  WhiteSpace LineTerminator
|  WhiteSpace LineComment LineTerminator
|  WhiteSpace BlockComment
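The WhiteSpace production can be sketched as a loop that skips everything preceding a token. This is only an illustration: the function name skip_white_space is hypothetical, and the handling of an unterminated block comment (stop without consuming it) is an assumption, since the grammar simply has no production for that case.

```python
# WhiteSpaceCharacter: TAB, VT, FF, SP, U+00A0, U+2000 through U+200B, U+3000.
WHITE_SPACE_CHARS = set("\t\v\f \u00a0\u3000") | {chr(c) for c in range(0x2000, 0x200C)}
LINE_TERMINATORS = set("\n\r\u2028\u2029")


def skip_white_space(text, pos=0):
    """Return the index just past the longest WhiteSpace run starting at pos."""
    n = len(text)
    while pos < n:
        ch = text[pos]
        if ch in WHITE_SPACE_CHARS or ch in LINE_TERMINATORS:
            pos += 1
        elif text.startswith("//", pos):
            end = pos + 2
            while end < n and text[end] not in LINE_TERMINATORS:
                end += 1
            if end == n:
                # A line comment that runs to the end of input is not
                # WhiteSpace; the grammar covers it under EndOfInput.
                break
            pos = end + 1  # consume the comment and its line terminator
        elif text.startswith("/*", pos):
            close = text.find("*/", pos + 2)
            if close == -1:
                break  # assumption: leave an unterminated comment alone
            pos = close + 2
        else:
            break
    return pos
```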

Tokens

t ∈ {re, div}
NextToken^t  WhiteSpace Token^t
Token^re 
   IdentifierOrReservedWord
|  Punctuator
|  NumericLiteral
|  QuantityLiteral
|  StringLiteral
|  RegExpLiteral
|  EndOfInput
Token^div 
   IdentifierOrReservedWord
|  Punctuator
|  DivisionPunctuator
|  NumericLiteral
|  QuantityLiteral
|  StringLiteral
|  EndOfInput
EndOfInput 
   End
|  LineComment End

Keywords and identifiers

IdentifierName 
   InitialIdentifierCharacter
|  IdentifierName ContinuingIdentifierCharacter
InitialIdentifierCharacter 
   OrdinaryInitialIdentifierCharacter
|  \ HexEscape
OrdinaryInitialIdentifierCharacter  UnicodeInitialAlphabetic | $ | _
ContinuingIdentifierCharacter 
   OrdinaryContinuingIdentifierCharacter
|  \ HexEscape
OrdinaryContinuingIdentifierCharacter  UnicodeAlphanumeric | $ | _
IdentifierOrReservedWord  IdentifierName
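A minimal Python sketch of scanning an IdentifierName follows. The function name scan_identifier_name is hypothetical, and str.isalpha() and str.isdecimal() are used only as approximations of the UnicodeInitialAlphabetic and UnicodeAlphanumeric classes, which this grammar does not define precisely.

```python
import re

# HexEscape as it appears after a backslash: xHH or uHHHH.
HEX_ESCAPE = re.compile(r"x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}")


def scan_identifier_name(text, pos=0):
    """Return the end index of the IdentifierName at pos, or pos if none."""
    start = pos
    while pos < len(text):
        ch = text[pos]
        first = pos == start
        if ch == "\\":
            # "\ HexEscape" is allowed in both initial and continuing position.
            m = HEX_ESCAPE.match(text, pos + 1)
            if not m:
                break
            pos = m.end()
        elif ch in "$_" or ch.isalpha() or (not first and ch.isdecimal()):
            # Digits may continue an identifier but may not start one.
            pos += 1
        else:
            break
    return pos
```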

Punctuators

Punctuator 
   PunctuatorRE
|  PunctuatorDiv
PunctuatorRE 
   !
|  ! =
|  ! = =
|  #
|  %
|  % =
|  &
|  & &
|  & & =
|  & =
|  (
|  *
|  * =
|  +
|  + =
|  ,
|  -
|  - =
|  - >
|  .
|  . .
|  . . .
|  :
|  : :
|  ;
|  <
|  < <
|  < < =
|  < =
|  =
|  = =
|  = = =
|  >
|  > =
|  > >
|  > > =
|  > > >
|  > > > =
|  ?
|  @
|  [
|  ^
|  ^ =
|  ^ ^
|  ^ ^ =
|  {
|  |
|  | =
|  | |
|  | | =
|  ~
PunctuatorDiv 
   )
|  + +
|  - -
|  ]
|  }
DivisionPunctuator 
   /
|  / =
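Many punctuators are prefixes of longer ones (for example >, >>, >>>, and >>>=). The sketch below assumes the usual longest-match convention for lexers, which this page does not state explicitly. The names PUNCTUATORS and scan_punctuator are hypothetical. The table covers PunctuatorRE and PunctuatorDiv only; DivisionPunctuator is left out because it is a token only under the div goal.

```python
# All alternatives of PunctuatorRE followed by those of PunctuatorDiv.
PUNCTUATORS = """
! != !== # % %= & && &&= &= ( * *= + += , - -= -> . .. ... : :: ; < << <<= <=
= == === > >= >> >>= >>> >>>= ? @ [ ^ ^= ^^ ^^= { | |= || ||= ~
) ++ -- ] }
""".split()


def scan_punctuator(text, pos=0):
    """Return the longest Punctuator that starts at pos, or None."""
    best = None
    for p in PUNCTUATORS:
        if text.startswith(p, pos) and (best is None or len(p) > len(best)):
            best = p
    return best
```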

Numeric literals

NumericLiteral 
   DecimalLiteral
|  HexIntegerLiteral [lookahead∉{HexDigit}]
DecimalLiteral 
   Mantissa
|  Mantissa LetterE SignedInteger
LetterE  E | e
Mantissa 
   DecimalIntegerLiteral
|  DecimalIntegerLiteral .
|  DecimalIntegerLiteral . Fraction
|  . Fraction
DecimalIntegerLiteral 
   0
|  NonZeroDecimalDigits
NonZeroDecimalDigits 
   NonZeroDigit
|  NonZeroDecimalDigits ASCIIDigit
NonZeroDigit  1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Fraction  DecimalDigits
SignedInteger 
   DecimalDigits
|  + DecimalDigits
|  - DecimalDigits
DecimalDigits 
   ASCIIDigit
|  DecimalDigits ASCIIDigit
HexIntegerLiteral 
   0 LetterX HexDigit
|  HexIntegerLiteral HexDigit
LetterX  X | x
HexDigit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f
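The numeric literal productions translate fairly directly into a regular expression. The following Python sketch uses the hypothetical names NUMERIC_LITERAL and scan_numeric_literal. It relies on greedy matching to model the rule that a HexIntegerLiteral must not be followed by another HexDigit.

```python
import re

NUMERIC_LITERAL = re.compile(
    r"""
    0[xX][0-9A-Fa-f]+                      # HexIntegerLiteral, taken greedily
  | (?: (?:0|[1-9][0-9]*) (?:\.[0-9]*)?    # DecimalIntegerLiteral, optional "." and Fraction
      | \.[0-9]+                           # "." Fraction
    )
    (?:[eE][+-]?[0-9]+)?                   # optional LetterE SignedInteger
    """,
    re.VERBOSE,
)


def scan_numeric_literal(text, pos=0):
    """Return the NumericLiteral that starts at pos, or None."""
    m = NUMERIC_LITERAL.match(text, pos)
    return m.group(0) if m else None
```

Because DecimalIntegerLiteral is either 0 or starts with a nonzero digit, an input such as 01 yields only the literal 0, and the 1 is left for the next token.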

Quantity literals

QuantityLiteral  NumericLiteral QuantityName
QuantityName  [lookahead∉{LetterE, LetterX}] IdentifierName
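A quantity literal is a number immediately followed by a unit name. The sketch below is a simplified Python illustration with hypothetical names (scan_quantity_literal, QUANTITY_NAME). It assumes that the lookahead restriction on QuantityName excludes names beginning with e, E, x, or X, and it limits unit names to ASCII for brevity. The number is matched first and never shortened afterwards, so a trailing hex digit cannot be reinterpreted as a unit.

```python
import re

# NumericLiteral: hex integer, or decimal mantissa with optional exponent.
NUMERIC_LITERAL = re.compile(
    r"0[xX][0-9A-Fa-f]+"
    r"|(?:(?:0|[1-9][0-9]*)(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)

# QuantityName: an ASCII-only IdentifierName that does not start with e, E, x, X.
QUANTITY_NAME = re.compile(r"(?![eExX])[A-Za-z$_][A-Za-z0-9$_]*")


def scan_quantity_literal(text, pos=0):
    """Return (number, unit) for the QuantityLiteral at pos, or None."""
    number = NUMERIC_LITERAL.match(text, pos)
    if not number:
        return None
    unit = QUANTITY_NAME.match(text, number.end())
    if not unit:
        return None
    return number.group(0), unit.group(0)
```

Under this reading, 3in is a quantity while 3em is not, because the unit would begin with the exponent letter.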

String literals

q ∈ {single, double}
StringLiteral 
   ' StringChars^single '
|  " StringChars^double "
StringChars^q 
   «empty»
|  StringChars^q StringChar^q
StringChar^q 
   LiteralStringChar^q
|  \ StringEscape
LiteralStringChar^single  UnicodeCharacter except ' | \ | LineTerminator
LiteralStringChar^double  UnicodeCharacter except " | \ | LineTerminator
StringEscape 
   ControlEscape
|  ZeroEscape
|  HexEscape
|  IdentityEscape
IdentityEscape  NonTerminator except UnicodeAlphanumeric
ControlEscape 
   b
|  f
|  n
|  r
|  t
|  v
ZeroEscape  0 [lookahead∉{ASCIIDigit}]
HexEscape 
   x HexDigit HexDigit
|  u HexDigit HexDigit HexDigit HexDigit
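The escape productions can be sketched as a small decoder for the characters between the quotes of a StringLiteral. The grammar here defines only the syntax, so the character values assigned to each ControlEscape are an assumption based on their conventional meaning. The function name decode_string_chars is hypothetical, and str.isalpha() and str.isdecimal() only approximate UnicodeAlphanumeric.

```python
CONTROL = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}
HEX_DIGITS = "0123456789abcdefABCDEF"
LINE_TERMINATORS = "\n\r\u2028\u2029"


def decode_string_chars(body):
    """Decode the escapes in the text between a string literal's quotes."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        i += 1
        if ch != "\\":
            out.append(ch)  # LiteralStringChar
            continue
        if i >= len(body):
            raise ValueError("dangling backslash")
        esc = body[i]
        i += 1
        if esc in CONTROL:
            out.append(CONTROL[esc])  # ControlEscape
        elif esc == "0" and (i >= len(body) or body[i] not in "0123456789"):
            out.append("\0")  # ZeroEscape: no ASCII digit may follow
        elif esc in "xu":
            width = 2 if esc == "x" else 4  # HexEscape: xHH or uHHHH
            digits = body[i:i + width]
            if len(digits) != width or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("bad hex escape")
            out.append(chr(int(digits, 16)))
            i += width
        elif esc in LINE_TERMINATORS or esc.isalpha() or esc.isdecimal():
            raise ValueError("invalid escape: \\" + esc)
        else:
            out.append(esc)  # IdentityEscape: the character stands for itself
    return "".join(out)
```

The decoder does not check that the body is free of unescaped quotes or line terminators; it assumes a lexer has already delimited the literal.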

Regular expression literals

RegExpLiteral  RegExpBody RegExpFlags
RegExpFlags 
   «empty»
|  RegExpFlags ContinuingIdentifierCharacter
RegExpBody  / RegExpFirstChar RegExpChars /
RegExpFirstChar 
   OrdinaryRegExpFirstChar
|  \ NonTerminator
OrdinaryRegExpFirstChar  NonTerminator except \ | / | *
RegExpChars 
   «empty»
|  RegExpChars RegExpChar
RegExpChar 
   OrdinaryRegExpChar
|  \ NonTerminator
OrdinaryRegExpChar  NonTerminator except \ | /
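For illustration, the regular expression literal productions can themselves be written as a Python pattern. REGEXP_LITERAL and scan_regexp_literal are hypothetical names, and the flags part uses an ASCII approximation of ContinuingIdentifierCharacter. The first body character may not be * or /, which keeps a regular expression literal from being confused with the start of a comment.

```python
import re

REGEXP_LITERAL = re.compile(
    r"""
    /
    (?: [^\\/*\n\r\u2028\u2029] | \\[^\n\r\u2028\u2029] )    # RegExpFirstChar
    (?: [^\\/\n\r\u2028\u2029]  | \\[^\n\r\u2028\u2029] )*   # RegExpChars
    /
    (?: [A-Za-z0-9$_] | \\(?:x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}) )*   # RegExpFlags
    """,
    re.VERBOSE,
)


def scan_regexp_literal(text, pos=0):
    """Return the RegExpLiteral that starts at pos, or None."""
    m = REGEXP_LITERAL.match(text, pos)
    return m.group(0) if m else None
```

Note that under these productions an unescaped / always ends the body, even inside square brackets.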

Waldemar Horwat
Last modified Thursday, November 11, 1999