JavaScript 2.0 Lexer Grammar

Thursday, November 11, 1999

This LALR(1) grammar describes the lexer syntax of the JavaScript 2.0 proposal. See also the description of the grammar notation.

This document is also available as a Word 98 RTF file.

The grammar has two start symbols, NextToken^re and NextToken^div. Which one applies depends on whether a / at the current position should be interpreted as the start of a regular expression or as a division operator.

Unicode Character Classes

UnicodeCharacter Any Unicode character

UnicodeInitialAlphabetic Any Unicode initial alphabetic character (includes ASCII A-Z and a-z)

UnicodeAlphanumeric Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)

WhiteSpaceCharacter

«TAB» | «VT» | «FF» | «SP» | «u00A0»

| «u2000» | «u2001» | «u2002» | «u2003» | «u2004» | «u2005» | «u2006» | «u2007»

| «u2008» | «u2009» | «u200A» | «u200B»

| «u3000»

LineTerminator «LF» | «CR» | «u2028» | «u2029»

ASCIIDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
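The character classes above translate directly into lookup sets. The following is a minimal sketch in Python; the constant and function names are illustrative and not part of the proposal, and the Unicode-wide classes (UnicodeInitialAlphabetic, UnicodeAlphanumeric) are left out because they depend on Unicode property tables.

```python
# Code points copied from the WhiteSpaceCharacter, LineTerminator and
# ASCIIDigit productions. All names below are hypothetical.
WHITE_SPACE_CHARACTERS = frozenset(
    "\t\v\f \u00a0"
    "\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007"
    "\u2008\u2009\u200a\u200b"
    "\u3000"
)
LINE_TERMINATORS = frozenset("\n\r\u2028\u2029")
ASCII_DIGITS = frozenset("0123456789")


def is_white_space_character(ch):
    """True if ch is one of the WhiteSpaceCharacter alternatives."""
    return ch in WHITE_SPACE_CHARACTERS


def is_line_terminator(ch):
    """True if ch is one of the LineTerminator alternatives."""
    return ch in LINE_TERMINATORS


def is_non_terminator(ch):
    """NonTerminator: any single character that is not a LineTerminator."""
    return len(ch) == 1 and ch not in LINE_TERMINATORS
```

Note that the line terminators are deliberately not members of the white space set, since the grammar treats them as a separate class.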

Comments

LineComment / / LineCommentCharacters

LineCommentCharacters

«empty»

| LineCommentCharacters NonTerminator

NonTerminator UnicodeCharacter except LineTerminator

BlockComment / * BlockCommentCharacters * /

BlockCommentCharacters

«empty»

| BlockCommentCharacters NonSlash

| PreSlashCharacters /

PreSlashCharacters

«empty»

| BlockCommentCharacters NonAsteriskOrSlash

| PreSlashCharacters /

NonSlash UnicodeCharacter except /

NonAsteriskOrSlash UnicodeCharacter except * | /
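Taken together, the BlockCommentCharacters and PreSlashCharacters productions admit exactly the character sequences that do not contain * immediately followed by /, so a block comment ends at the first such pair after the opening / * and comments do not nest. A minimal Python sketch of that reading follows; the function name is hypothetical.

```python
def scan_block_comment(text, start=0):
    """Return the index just past the block comment that begins at start.

    Returns None if no block comment starts there or if it is unterminated.
    """
    if not text.startswith("/*", start):
        return None
    # Search after the opening delimiter so that "/*/" is not mistaken
    # for a complete comment.
    close = text.find("*/", start + 2)
    if close == -1:
        return None
    return close + 2
```

For the input `/* /* */ */` the function stops after the first `*/`, which illustrates the absence of nesting.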

White space

WhiteSpace

«empty»

| WhiteSpace WhiteSpaceCharacter

| WhiteSpace LineTerminator

| WhiteSpace LineComment LineTerminator

| WhiteSpace BlockComment
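The WhiteSpace production can be sketched as a loop that consumes one alternative per iteration. One detail is easy to miss: a LineComment counts as white space only when a LineTerminator follows it, while a line comment that runs to the end of the input is covered by EndOfInput instead. The Python below is an illustrative sketch with hypothetical names, not a definitive implementation.

```python
WHITE_SPACE_CHARACTERS = frozenset(
    "\t\v\f \u00a0\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007"
    "\u2008\u2009\u200a\u200b\u3000"
)
LINE_TERMINATORS = frozenset("\n\r\u2028\u2029")


def skip_white_space(text, pos=0):
    """Return the index of the first character after WhiteSpace at pos."""
    n = len(text)
    while pos < n:
        ch = text[pos]
        if ch in WHITE_SPACE_CHARACTERS or ch in LINE_TERMINATORS:
            pos += 1
        elif text.startswith("//", pos):
            end = pos + 2
            while end < n and text[end] not in LINE_TERMINATORS:
                end += 1
            if end == n:
                break  # "LineComment End" belongs to EndOfInput
            pos = end + 1  # consume the comment and its LineTerminator
        elif text.startswith("/*", pos):
            close = text.find("*/", pos + 2)
            if close == -1:
                break  # unterminated comment: left for the caller to report
            pos = close + 2
        else:
            break
    return pos
```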

Tokens

t ∈ {re, div}

NextToken^t WhiteSpace Token^t

Token^re

IdentifierOrReservedWord

| Punctuator

| NumericLiteral

| QuantityLiteral

| StringLiteral

| RegExpLiteral

| EndOfInput

Token^div

IdentifierOrReservedWord

| Punctuator

| DivisionPunctuator

| NumericLiteral

| QuantityLiteral

| StringLiteral

| EndOfInput

EndOfInput

End

| LineComment End
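The only difference between Token^re and Token^div is what a leading / may begin: a RegExpLiteral under the re goal, a DivisionPunctuator under the div goal. The sketch below makes that concrete for a single slash. It is an ASCII-only approximation in Python with hypothetical names; in particular, the flags are limited to ASCII identifier characters and \ escapes in flags are ignored.

```python
import re

# Approximation of RegExpBody RegExpFlags: the first body character may not be
# "*" or "/", and a backslash escapes any character that is not a line terminator.
_REGEXP = re.compile(
    r"/(?![*/])(?:\\[^\n\r\u2028\u2029]|[^\\/\n\r\u2028\u2029])+/[A-Za-z0-9$_]*"
)


def lex_slash(text, pos, goal):
    """Lex the token that begins with "/" at pos under goal "re" or "div"."""
    if goal == "div":
        if not text.startswith("/", pos):
            return None
        lexeme = "/=" if text.startswith("/=", pos) else "/"
        return ("DivisionPunctuator", lexeme)
    if goal == "re":
        match = _REGEXP.match(text, pos)
        return ("RegExpLiteral", match.group()) if match else None
    raise ValueError("goal must be 're' or 'div'")
```

With the input `/ 2 / x`, the div goal yields the one-character token `/`, while the re goal yields the regular expression literal `/ 2 /`.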

Keywords and identifiers

IdentifierName

InitialIdentifierCharacter

| IdentifierName ContinuingIdentifierCharacter

InitialIdentifierCharacter

OrdinaryInitialIdentifierCharacter

| \ HexEscape

OrdinaryInitialIdentifierCharacter UnicodeInitialAlphabetic | $ | _

ContinuingIdentifierCharacter

OrdinaryContinuingIdentifierCharacter

| \ HexEscape

OrdinaryContinuingIdentifierCharacter UnicodeAlphanumeric | $ | _

IdentifierOrReservedWord IdentifierName
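A scanner for IdentifierName might look like the sketch below. It rests on two assumptions that the grammar does not state: Python's `str.isalpha` stands in for UnicodeInitialAlphabetic, and `str.isalpha` or `str.isdecimal` stands in for UnicodeAlphanumeric. The function names are hypothetical.

```python
import re

# \ HexEscape: a backslash followed by x and two hex digits, or u and four.
_HEX_ESCAPE = re.compile(r"\\(?:x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4})")


def _identifier_character_end(text, pos, initial):
    """Return the index after one identifier character at pos, or None."""
    match = _HEX_ESCAPE.match(text, pos)
    if match:
        return match.end()
    if pos >= len(text):
        return None
    ch = text[pos]
    # Assumption: isalpha/isdecimal approximate the Unicode classes.
    if ch in "$_" or ch.isalpha() or (not initial and ch.isdecimal()):
        return pos + 1
    return None


def scan_identifier_name(text, pos=0):
    """Return the index just past the IdentifierName at pos, or None."""
    end = _identifier_character_end(text, pos, initial=True)
    if end is None:
        return None
    while True:
        following = _identifier_character_end(text, end, initial=False)
        if following is None:
            return end
        end = following
```

The sketch accepts any well-formed hex escape without checking which character it denotes, because the productions above place no such restriction.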

Punctuators

Punctuator

PunctuatorRE

| PunctuatorDiv

PunctuatorRE

!

| ! =

| ! = =

| #

| %

| % =

| &

| & &

| & & =

| & =

| (

| *

| * =

| +

| + =

| ,

| -

| - =

| - >

| .

| . .

| . . .

| :

| : :

| ;

| <

| < <

| < < =

| < =

| =

| = =

| = = =

| >

| > =

| > >

| > > =

| > > >

| > > > =

| ?

| @

| [

| ^

| ^ =

| ^ ^

| ^ ^ =

| {

| |

| | =

| | |

| | | =

| ~

PunctuatorDiv

)

| + +

| - -

| ]

| }

DivisionPunctuator

/

| / =
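Several punctuators are prefixes of others (for example >, > >, > > >, and > > > =), so a lexer has to pick one. The Python sketch below assumes the usual longest-match rule, which the productions themselves do not spell out, and reports which class the match came from. The names are hypothetical. The class names suggest that the PunctuatorRE / PunctuatorDiv split records which goal symbol suits the following token, but that reading is an assumption and the sketch does not rely on it.

```python
PUNCTUATOR_RE = (
    "! != !== # % %= & && &&= &= ( * *= + += , - -= -> . .. ... : :: ; "
    "< << <<= <= = == === > >= >> >>= >>> >>>= ? @ [ ^ ^= ^^ ^^= { | |= || ||= ~"
).split()
PUNCTUATOR_DIV = ") ++ -- ] }".split()
DIVISION_PUNCTUATOR = "/ /=".split()


def scan_punctuator(text, pos, goal):
    """Return (lexeme, class name) for the longest punctuator at pos, or None.

    DivisionPunctuator is only a candidate under the "div" goal.
    """
    candidates = [(p, "PunctuatorRE") for p in PUNCTUATOR_RE]
    candidates += [(p, "PunctuatorDiv") for p in PUNCTUATOR_DIV]
    if goal == "div":
        candidates += [(p, "DivisionPunctuator") for p in DIVISION_PUNCTUATOR]
    best = None
    for lexeme, kind in candidates:
        if text.startswith(lexeme, pos):
            if best is None or len(lexeme) > len(best[0]):
                best = (lexeme, kind)
    return best
```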

Numeric literals

NumericLiteral

DecimalLiteral

| HexIntegerLiteral [lookahead∉{HexDigit}]

DecimalLiteral

Mantissa

| Mantissa LetterE SignedInteger

LetterE E | e

Mantissa

DecimalIntegerLiteral

| DecimalIntegerLiteral .

| DecimalIntegerLiteral . Fraction

| . Fraction

DecimalIntegerLiteral

0

| NonZeroDecimalDigits

NonZeroDecimalDigits

NonZeroDigit

| NonZeroDecimalDigits ASCIIDigit

NonZeroDigit 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Fraction DecimalDigits

SignedInteger

DecimalDigits

| + DecimalDigits

| - DecimalDigits

DecimalDigits

ASCIIDigit

| DecimalDigits ASCIIDigit

HexIntegerLiteral

0 LetterX HexDigit

| HexIntegerLiteral HexDigit

LetterX X | x

HexDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f
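The numeric literal productions can be condensed into two regular expressions. The Python sketch below is illustrative: the function name is hypothetical, and converting decimal literals to `float` and hex literals to `int` is an assumption, since the lexer grammar defines only the syntax and not the resulting values.

```python
import re

# HexIntegerLiteral: 0, LetterX, then one or more HexDigits (greedy, so the
# match never stops in front of a further hex digit).
_HEX = re.compile(r"0[xX][0-9A-Fa-f]+")

# DecimalLiteral: Mantissa, optionally followed by LetterE SignedInteger.
# DecimalIntegerLiteral is either a lone 0 or starts with a non-zero digit.
_DECIMAL = re.compile(
    r"(?:(?:0|[1-9][0-9]*)(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)


def scan_numeric_literal(text, pos=0):
    """Return (value, end index) for the NumericLiteral at pos, or None."""
    match = _HEX.match(text, pos)
    if match:
        return int(match.group(), 16), match.end()
    match = _DECIMAL.match(text, pos)
    if match:
        return float(match.group()), match.end()
    return None
```

Because DecimalIntegerLiteral does not allow leading zeros, the input `007` yields only the literal `0`; the remaining digits would start a new token.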

Quantity literals

QuantityLiteral NumericLiteral QuantityName

QuantityName [lookahead∉{LetterE, LetterX}] IdentifierName
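A quantity literal is a number immediately followed by a unit name, as in 5cm. The sketch below reads the lookahead annotation on QuantityName as a restriction that the name must not begin with E, e, X, or x, which keeps units from being confused with exponents and hex prefixes. It is an ASCII-only Python approximation with hypothetical names: Unicode letters and \ escapes in the unit name are not handled.

```python
import re

_NUMERIC = re.compile(
    r"0[xX][0-9A-Fa-f]+"
    r"|(?:(?:0|[1-9][0-9]*)(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)
# ASCII stand-in for IdentifierName, with the lookahead restriction in front.
_QUANTITY_NAME = re.compile(r"(?![eExX])[A-Za-z$_][A-Za-z0-9$_]*")


def scan_number_or_quantity(text, pos=0):
    """Return (token kind, number lexeme, unit name or None), or None."""
    number = _NUMERIC.match(text, pos)
    if number is None:
        return None
    unit = _QUANTITY_NAME.match(text, number.end())
    if unit:
        return ("QuantityLiteral", number.group(), unit.group())
    return ("NumericLiteral", number.group(), None)
```

Under this reading, `3e` is the numeric literal `3` followed by a separate token, not a quantity with the unit `e`.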

String literals

q ∈ {single, double}

StringLiteral

' StringChars^single '

| " StringChars^double "

StringChars^q

«empty»

| StringChars^q StringChar^q

StringChar^q

LiteralStringChar^q

| \ StringEscape

LiteralStringChar^single UnicodeCharacter except ' | \ | LineTerminator

LiteralStringChar^double UnicodeCharacter except " | \ | LineTerminator

StringEscape

ControlEscape

| ZeroEscape

| HexEscape

| IdentityEscape

IdentityEscape NonTerminator except UnicodeAlphanumeric

ControlEscape

b

| f

| n

| r

| t

| v

ZeroEscape 0 [lookahead∉{ASCIIDigit}]

HexEscape

x HexDigit HexDigit

| u HexDigit HexDigit HexDigit HexDigit
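The string productions can be exercised with a small decoder. The Python sketch below is illustrative and uses hypothetical names. Three assumptions go beyond the grammar: the control escapes are given their conventional meanings (the lexer grammar lists only the letters), the lookahead on ZeroEscape is read as "not followed by an ASCII digit", and `str.isalpha` / `str.isdecimal` approximate UnicodeAlphanumeric.

```python
CONTROL_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}
LINE_TERMINATORS = "\n\r\u2028\u2029"
HEX_DIGITS = "0123456789ABCDEFabcdef"


def decode_string_literal(literal):
    """Decode a quoted StringLiteral, raising ValueError if it is malformed."""
    if len(literal) < 2 or literal[0] not in "'\"" or literal[-1] != literal[0]:
        raise ValueError("not a quoted string literal")
    quote, body = literal[0], literal[1:-1]
    out, i = [], 0
    while i < len(body):
        ch = body[i]
        if ch == quote or ch in LINE_TERMINATORS:
            raise ValueError("unescaped quote or line terminator")
        if ch != "\\":
            out.append(ch)  # LiteralStringChar
            i += 1
            continue
        i += 1  # step over the backslash
        if i >= len(body):
            raise ValueError("backslash at end of string")
        esc = body[i]
        if esc in CONTROL_ESCAPES:
            out.append(CONTROL_ESCAPES[esc])
            i += 1
        elif esc == "0":
            if i + 1 < len(body) and body[i + 1] in "0123456789":
                raise ValueError("ZeroEscape followed by a digit")
            out.append("\0")
            i += 1
        elif esc in ("x", "u"):
            width = 2 if esc == "x" else 4
            digits = body[i + 1:i + 1 + width]
            if len(digits) != width or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("malformed HexEscape")
            out.append(chr(int(digits, 16)))
            i += 1 + width
        elif esc in LINE_TERMINATORS or esc.isalpha() or esc.isdecimal():
            raise ValueError("not a valid IdentityEscape")
        else:
            out.append(esc)  # IdentityEscape stands for itself
            i += 1
    return "".join(out)
```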

Regular expression literals

RegExpLiteral RegExpBody RegExpFlags

RegExpFlags

«empty»

| RegExpFlags ContinuingIdentifierCharacter

RegExpBody / RegExpFirstChar RegExpChars /

RegExpFirstChar

OrdinaryRegExpFirstChar

| \ NonTerminator

OrdinaryRegExpFirstChar NonTerminator except \ | / | *

RegExpChars

«empty»

| RegExpChars RegExpChar

RegExpChar

OrdinaryRegExpChar

| \ NonTerminator

OrdinaryRegExpChar NonTerminator except \ | /
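The regular expression productions can be followed character by character, as in the Python sketch below. It is illustrative only: the function name is hypothetical, and the flags are recognized with `str.isalpha` / `str.isdecimal` as an approximation of ContinuingIdentifierCharacter, without support for \ escapes in flags.

```python
LINE_TERMINATORS = "\n\r\u2028\u2029"


def scan_regexp_literal(text, pos=0):
    """Return (body, flags, end index) for the RegExpLiteral at pos, or None."""
    n = len(text)
    if pos >= n or text[pos] != "/":
        return None
    i = pos + 1
    # RegExpFirstChar excludes "/" and "*", so "//" and "/*" stay comments.
    if i >= n or text[i] in ("/", "*"):
        return None
    while i < n:
        ch = text[i]
        if ch in LINE_TERMINATORS:
            return None  # the body may not span lines
        if ch == "\\":
            # A backslash must be followed by a NonTerminator.
            if i + 1 >= n or text[i + 1] in LINE_TERMINATORS:
                return None
            i += 2
        elif ch == "/":
            break  # closing slash
        else:
            i += 1
    else:
        return None  # input ended before the closing slash
    body_end = i
    i += 1
    flags_start = i
    while i < n and (text[i] in "$_" or text[i].isalpha() or text[i].isdecimal()):
        i += 1
    return text[pos + 1:body_end], text[flags_start:i], i
```

These productions give character classes no special treatment, so under this grammar the first unescaped / ends the body even inside brackets: the input `/[/]/` scans as the body `[` with no flags.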

Waldemar Horwat
Last modified Thursday, November 11, 1999