The rest of this page is slightly out of date

FullPostfixExpression

PrimaryExpression

| FullPostfixExpression MemberOperator

| FullPostfixExpression Arguments

| PostfixExpression [no line break] ++

| PostfixExpression [no line break] --

FullNewExpression new FullNewSubexpression Arguments

ShortNewExpression new ShortNewSubexpression

FullNewSubexpression

PrimaryExpression

| FullNewSubexpression MemberOperator

ShortNewSubexpression

FullNewSubexpression

MemberOperator

[ ArgumentList ]

| . QualifiedIdentifier

| . ParenthesizedExpression

| @ QualifiedIdentifier

| @ ParenthesizedExpression

The @ operator performs a type cast. The second operand specifies the type. Both the . and the @ operators accept either a QualifiedIdentifier or a ParenthesizedExpression as the second operand. If it is a ParenthesizedExpression, the second operand of . must evaluate to a string. a.(x) is a synonym for a[x] except that the latter can be overridden via operator overloading.

The [] operator can take multiple (or even named) arguments. This allows users to define data structures such as multidimensional arrays via operator overloading.

Arguments ( ArgumentList )

ArgumentList

«empty»

| ArgumentListPrefix

| NamedArgumentListPrefix

ArgumentListPrefix

AssignmentExpression^allowIn

| ArgumentListPrefix , AssignmentExpression^allowIn

NamedArgumentListPrefix

LiteralField

| ArgumentListPrefix , LiteralField

| NamedArgumentListPrefix , LiteralField

An ArgumentList can contain both positional and named arguments. Named arguments use the same syntax as object literals.

Prefix Unary Operators

UnaryExpression

PostfixExpression

| delete PostfixExpression

| typeof UnaryExpression

| eval UnaryExpression

| ++ PostfixExpression

| -- PostfixExpression

| + UnaryExpression

| - UnaryExpression

| ~ UnaryExpression

| ! UnaryExpression

Multiplicative Operators

MultiplicativeExpression

UnaryExpression

| MultiplicativeExpression * UnaryExpression

| MultiplicativeExpression / UnaryExpression

| MultiplicativeExpression % UnaryExpression

Additive Operators

AdditiveExpression

MultiplicativeExpression

| AdditiveExpression + MultiplicativeExpression

| AdditiveExpression - MultiplicativeExpression

Bitwise Shift Operators

ShiftExpression

AdditiveExpression

| ShiftExpression << AdditiveExpression

| ShiftExpression >> AdditiveExpression

| ShiftExpression >>> AdditiveExpression

Relational Operators

RelationalExpression^allowIn

ShiftExpression

| RelationalExpression^allowIn < ShiftExpression

| RelationalExpression^allowIn > ShiftExpression

| RelationalExpression^allowIn <= ShiftExpression

| RelationalExpression^allowIn >= ShiftExpression

| RelationalExpression^allowIn instanceof ShiftExpression

| RelationalExpression^allowIn in ShiftExpression

RelationalExpression^noIn

ShiftExpression

| RelationalExpression^noIn < ShiftExpression

| RelationalExpression^noIn > ShiftExpression

| RelationalExpression^noIn <= ShiftExpression

| RelationalExpression^noIn >= ShiftExpression

| RelationalExpression^noIn instanceof ShiftExpression

Equality Operators

EqualityExpression^b

RelationalExpression^b

| EqualityExpression^b == RelationalExpression^b

| EqualityExpression^b != RelationalExpression^b

| EqualityExpression^b === RelationalExpression^b

| EqualityExpression^b !== RelationalExpression^b

Binary Bitwise Operators

BitwiseAndExpression^b

EqualityExpression^b

| BitwiseAndExpression^b & EqualityExpression^b

BitwiseXorExpression^b

BitwiseAndExpression^b

| BitwiseXorExpression^b ^ BitwiseAndExpression^b

BitwiseOrExpression^b

BitwiseXorExpression^b

| BitwiseOrExpression^b | BitwiseXorExpression^b

Binary Logical Operators

LogicalAndExpression^b

BitwiseOrExpression^b

| LogicalAndExpression^b && BitwiseOrExpression^b

LogicalXorExpression^b

LogicalAndExpression^b

| LogicalXorExpression^b ^^ LogicalAndExpression^b

The ^^ operator is a logical exclusive-or operator. It evaluates both operands. If they both convert to true or both convert to false, then ^^ returns false; otherwise ^^ returns the unconverted value of whichever argument converted to true.

LogicalOrExpression^b

LogicalXorExpression^b

| LogicalOrExpression^b || LogicalXorExpression^b

Conditional Operator

ConditionalExpression^b

LogicalOrExpression^b

| LogicalOrExpression^b ? AssignmentExpression^b : AssignmentExpression^b

NonAssignmentExpression^b

LogicalOrExpression^b

| LogicalOrExpression^b ? NonAssignmentExpression^b : NonAssignmentExpression^b

Assignment Operators

AssignmentExpression^b

ConditionalExpression^b

| PostfixExpression = AssignmentExpression^b

| PostfixExpression CompoundAssignment AssignmentExpression^b

CompoundAssignment

*=

| /=

| %=

| +=

| -=

| <<=

| >>=

| >>>=

| &=

| ^=

| |=

| &&=

| ^^=

| ||=

Expressions

Expression^b

AssignmentExpression^b

| Expression^b , AssignmentExpression^b

OptionalExpression

Expression^allowIn

| «empty»

Type Expressions

TypeExpression^b NonAssignmentExpression^b

JavaScript 2.0

Core Language

Statements

Tuesday, February 15, 2000

Most of the behavior of statements is the same as in JavaScript 1.5. Differences are highlighted below.

w {abbrev, abbrevNonEmpty, abbrevNoShortIf, full}

TopStatement^w

Statement^w

| LanguageDeclaration^w

| PackageDefinition

Statement^w

AnnotatedDefinition^w

| EmptyStatement^w

| ExpressionStatement Semicolon^w

| AnnotatedBlock

| LabeledStatement^w

| IfStatement^w

| SwitchStatement

| DoStatement Semicolon^w

| WhileStatement^w

| ForStatement^w

| WithStatement^w

| ContinueStatement Semicolon^w

| BreakStatement Semicolon^w

| ReturnStatement Semicolon^w

| ThrowStatement Semicolon^w

| TryStatement

Semicolon^abbrev

;

| VirtualSemicolon

| «empty»

Semicolon^{abbrevNonEmpty}

;

| VirtualSemicolon

| «empty»

Semicolon^{abbrevNoShortIf}

;

| VirtualSemicolon

| «empty»

Semicolon^full

;

| VirtualSemicolon

Empty Statement

EmptyStatement^abbrev

;

| «empty»

EmptyStatement^{abbrevNonEmpty} ;

EmptyStatement^{abbrevNoShortIf} ;

EmptyStatement^full ;

Expression Statement

ExpressionStatement [lookahead{function, {}] Expression^allowIn

Block

AnnotatedBlock Attributes Block

Block { TopStatements }

TopStatements

TopStatement^abbrev

| TopStatementsPrefix TopStatement^{abbrevNonEmpty}

TopStatementsPrefix

TopStatement^full

| TopStatementsPrefix TopStatement^full

Annotated Blocks

A block can be annotated with attributes as follows:

Attribute ... Attribute { Statement ... Statement }

Such a block behaves like a regular block except that every declaration inside that block (but not inside any enclosed scope) by default uses the attributes given by the block.

Annotated blocks are useful to define several items without having to repeat attributes for each one. For example,

class foo {
  field z:Integer;
  public var a;
  private var b;
  public function f() {}
  public function g(x:Integer):Boolean {}
}

is equivalent to:

class foo {
  var z:Integer;
  public {
    var a;
    private var b;
    function f() {}
    function g(x:Integer):Boolean {}
  }
}

Scope Blocks

A scope block has the syntax:

scope { Statement ... Statement }

A scope block behaves like a regular block except that it forms its own scope. Variable and function definitions without a Visibility prefix inside the scope block belong to that block instead of the enclosing scope.

Compiler Blocks

A compiler block has the syntax:

compile { Statement ... Statement }

The compile attribute is a hint that the block may be (but does not have to be) evaluated early. The statements inside this block should depend only on each other, on the results of earlier compiler blocks, and on properties of the environment that are designated as being available early. Other than perhaps being evaluated early, compiler blocks respect all of the scope rules and semantics of the enclosing program. Any definitions introduced by a compiler block are saved and reintroduced at normal evaluation time. On the other hand, side effects may or may not be reintroduced at normal evaluation time, so compiler blocks should not rely on side effects.

compile is an attribute, so it may also be applied to individual definitions without enclosing them in a block.

As an example, after defining

compile var x = 2;

function f1() {
  compile {
    var y = 5;
    var x = 1;
    while (y) x *= y--;
  }
  return ++x;
}

function f2() {
  compile {
    var y = x;
  }
  return x+y;
}

the value of global x will still be 2, calling f1() will always return 121, and calling f2() will return 4. If the statement x=5 is then evaluated at the global level, f1() will still return 121 because it uses its own local x. On the other hand, calling f2() may return either 7 or 10 at the implementation's discretion -- 7 if the implementation evaluated the compile block early and saved the value of y or 10 if it didn't. As this example illustrates, it is poor technique to define variables inside compiler blocks; constants are usually better.

A fully dynamic implementation of JavaScript 2.0 may choose to ignore the compile attribute and evaluate all compiler blocks at normal evaluation time. A fully static implementation may require that all user-defined types and attributes be defined inside compiler blocks.

Should const definitions with simple constant expressions such as const four = 2+2 be treated as though they were implicitly compiler definitions (compile const four = 2+2)?

Labeled Statements

LabeledStatement^w Identifier : Statement^w

If Statement

IfStatement^abbrev

if ParenthesizedExpression Statement^abbrev

| if ParenthesizedExpression Statement^{abbrevNoShortIf} else Statement^abbrev

IfStatement^{abbrevNonEmpty}

if ParenthesizedExpression Statement^{abbrevNonEmpty}

| if ParenthesizedExpression Statement^{abbrevNoShortIf} else Statement^{abbrevNonEmpty}

IfStatement^full

if ParenthesizedExpression Statement^full

| if ParenthesizedExpression Statement^{abbrevNoShortIf} else Statement^full

IfStatement^{abbrevNoShortIf} if ParenthesizedExpression Statement^{abbrevNoShortIf} else Statement^{abbrevNoShortIf}

The semicolon is optional before the else.

Switch Statement

SwitchStatement

switch ParenthesizedExpression { }

| switch ParenthesizedExpression { CaseGroups LastCaseGroup }

CaseGroups

«empty»

| CaseGroups CaseGroup

CaseGroup CaseGuards CaseStatementsPrefix

LastCaseGroup CaseGuards CaseStatements

CaseGuards

CaseGuard

| CaseGuards CaseGuard

CaseGuard

case Expression^allowIn :

| default :

CaseStatements

Statement^abbrev

| CaseStatementsPrefix Statement^{abbrevNonEmpty}

CaseStatementsPrefix

Statement^full

| CaseStatementsPrefix Statement^full

Do-While Statement

DoStatement do Statement^{abbrevNonEmpty} while ParenthesizedExpression

The semicolon is optional before the closing while.

While Statement

WhileStatement^w while ParenthesizedExpression Statement^w

For Statements

ForStatement^w

for ( ForInitializer ; OptionalExpression ; OptionalExpression ) Statement^w

| for ( ForInBinding in Expression^allowIn ) Statement^w

ForInitializer

«empty»

| Expression^noIn

| VariableDefinitionKind VariableBindingList^noIn

ForInBinding

PostfixExpression

| VariableDefinitionKind VariableBinding^noIn

With Statement

WithStatement^w with ParenthesizedExpression Statement^w

Continue and Break Statements

ContinueStatement continue [no line break] OptionalLabel

BreakStatement break [no line break] OptionalLabel

OptionalLabel

«empty»

| Identifier

Return Statement

ReturnStatement return [no line break] OptionalExpression

Throw Statement

ThrowStatement throw [no line break] Expression^allowIn

Try Statement

TryStatement

try AnnotatedBlock CatchClauses

| try AnnotatedBlock FinallyClause

| try AnnotatedBlock CatchClauses FinallyClause

CatchClauses

CatchClause

| CatchClauses CatchClause

CatchClause catch ( TypedIdentifier^allowIn ) AnnotatedBlock

FinallyClause finally AnnotatedBlock

Programs

Program TopStatements

JavaScript 2.0

Core Language

Definitions

Tuesday, February 15, 2000

Introduction

Definitions introduce new constants, variables, functions, and classes. All definitions can be preceded by zero or more attributes using the following syntax:

AnnotatedDefinition^w Attributes Definition^w

Attributes

«empty»

| FixedAttribute [no line break] Attributes

| Identifier [no line break] Attributes

FixedAttribute

private

| public

| final

Definition^w

VariableDefinition Semicolon^w

| FunctionDefinition^w

| ClassDefinition

Attributes

A definition attribute is an identifier that modifies the definition. Attributes can specify a definition's visibility, semantics, and other hints. A JavaScript program may also define and subsequently use its own attributes.

The table below summarizes the predefined attributes.

Category	Attribute	Behavior
Visibility	`local`	The definition is local in the enclosing block.
	`scope`	The definition applies to the enclosing scope.
	`global`	The definition applies to the enclosing package and is visible only inside this package.
	`private`	The definition creates a member of the enclosing class. The defined member is visible only inside that class. If there is no enclosing class, `private` is the same as `global`.
	`package`	The definition creates a member of the enclosing class. The defined member is visible only inside the enclosing package. If there is no enclosing class, `package` is the same as `global`.
	`public`	The definition creates a member of the enclosing class. The defined member is visible anywhere. If there is no enclosing class, the definition applies to the enclosing package and is visible in any package that imports this package.
Semantic	`static`	The definition creates a global member (rather than an instance member) of the enclosing class.
	`instance`	The definition creates an instance member (rather than a global member) of the enclosing class.
	`final`	The definition cannot be overridden in subclasses.
Hint	`override`	The definition overrides a member of a superclass.
	`mayOverride`	The definition may override a member of a superclass.
	`compile`	Compiler hint that the definition may be processed at compile time.
	`unused`	Compiler hint that the definition is not used.

Visibility Attributes

A visibility attribute describes the scope to which a definition applies as well as the definition's visibility outside that scope. A visibility attribute may be user-defined, in which case it can also indicate that the definition is visible in other packages only when those packages import a specific version of this package.

The local attribute applies the definition to the enclosing Block. If the enclosing block is a class, the definition does not appear as a member of that class.

The scope attribute applies the definition to the enclosing scope. If the enclosing scope is a class, the definition will appear as a member of that class; that member will be visible only inside the enclosing package (as though it had package visibility).

The global attribute applies the definition to the enclosing package.

The private, package, public, and user-defined version attributes apply the definition to the enclosing class or to the current package if there is no enclosing class.

The default visibility is scope.

There is a slight syntactic ambiguity between using package as a block attribute and defining a new package.

Semantic Attributes

The static attribute makes the definition create a global member rather than an instance member of the enclosing class. The instance attribute reverses this -- it makes the definition create an instance member of the enclosing class. The final attribute prevents subclasses from overriding this definition.

These three attributes may only be used on definitions that apply to a class. They cannot be used on definitions that, for instance, create local variables inside a function.

Hint Attributes

The override and mayOverride attributes control warnings. Normally defining a class member with the same name as a visible member of a superclass generates a warning. The override attribute reverses the sense of the warning so that the warning will be generated if there is no visible member of a superclass with the same name. The mayOverride attribute turns off this warning altogether.

The compile attribute is a hint that the definition may be evaluated early. See compiler blocks.

The unused attribute is a hint that the definition is not referenced anywhere. Referencing it will generate a warning.

User-Defined Attributes

Any constant defined in an enclosing scope is also a potential attribute. That constant's value must be an attribute object, which can be obtained either from another attribute or by calling one of the attribute-creating functions such as Version. For example, the following code creates aliases priv and loc of the attributes private and local:

compile {
  const priv = private;
  const loc = local;
  const V1 = Version("1.0","");
  const V2 = Version("2.0","1.0");
}

class C {
  priv var x;
  V1 var simple;
  V2 var complicated;
  priv static const a:Array = new Array(10);

  loc var i;
  for (i = 0; i != 10; i++) a[i] = i;
}

An implementation may require that user-defined attributes be defined early (in compiler blocks or using the compile attribute).

Extent

Each definition has a particular static and dynamic extent. The static extent of a definition is the region of source code where the definition is visible. The dynamic extent is the time interval during which the defined constant, variable, function, or class may be accessed.

The rules for determining the extent of a definition differ depending on whether the defined entity is a class member or not.

Non-Class Members

The static extent of a definition D is specified by its visibility attribute, which designates a scope (or set of scopes) A where the definition is visible. If there is a subscope B in A that defines an entity E with the same name as D and the definition E is actually executed, then the inner definition E shadows the outer one and definition D is not visible inside B.

In general, the dynamic extent of a definition D begins when the definition is executed and ends when its static extent scope is exited. There are a couple of exceptions to this rule for compatibility with JavaScript 1.5:

All function definitions at the top level of a scope have a dynamic extent which includes the entire scope.
All var definitions without a type or attributes have a dynamic extent which includes the entire scope.

Situations may arise where an inner definition will shadow an outer definition but the inner definition's static extent has not yet begun. In the example below, function f shadows the global b but tries to access the inner b before its dynamic extent begins (at the time the const b:Integer = 8 statement is executed). This is illegal, but an implementation is not required to diagnose such an error (which may be difficult, especially if the inner b is defined conditionally). The effects of executing such a program are undefined.

const b:Integer = 7;

function f():Integer {
  function g():Integer {return b}

  var a = g();
  const b:Integer = 8;
  return g() - a;
}

In general, it is not legal to define the same entity twice within a scope A without exiting A in the interim. There are a couple of exceptions:

The same var or const definition may be executed repeatedly. Here "the same" means that the definition is in the same location in the source code, which can happen if the definition is located inside a loop. Moreover, the definition's type, if any, must not change each time the definition is executed, and, if the definition is of a const, then its value may not change either.
var definitions without a type or attributes may be executed repeatedly on the same variable.

Class Members

Examples

In the example below the comments indicate the scope and visibility of each definition:

var a0;                 // Package-visible global variable
local var a1;           // Package-visible global variable
private var a2 = true;  // Package-visible global variable
package var a3;         // Package-visible global variable
public var a4;          // Public global variable

if (a1) {
  var b0;               // Package-visible global variable
  local var b1;         // Local to this block
  private var b2;       // Package-visible global variable
  package var b3;       // Package-visible global variable
  public var b4;        // Public global variable
}

public function F() {   // Public global function
  var c0;               // Local to this function
  local var c1;         // Local to this function
  private var c2;       // Package-visible global variable
  package var c3;       // Package-visible global variable
  public var c4;        // Public global variable
}

function G() {          // Package-visible global function
  var d0;               // Never defined because G isn't called
  private var d1;       // Never defined because G isn't called
  package var d2;       // Never defined because G isn't called
  public var d3;        // Never defined because G isn't called
}

class C {               // Package-visible global class
  var e0;               // Package-visible class instance variable
  private var e1;       // Class-visible class instance variable
  package var e2;       // Package-visible class instance variable
  public var e3;        // Public class instance variable
  static var e4;        // Package-visible class-global variable
  private static var e5;// Class-visible class-global variable
  package static var e6;// Package-visible class-global variable
  public static var e7; // Public class-global variable
  local var e8;         // Local to class C's block

  function H() {        // Package-visible class function
    var f0;             // Local to this function
    private var f1;     // Class-visible class variable
    package var f2;     // Package-visible class variable
    public var f3;      // Public class variable
  }
  public function I() {}// Public class method

  H();
}

F();

A static subset of JavaScript 2.0 may disallow definitions inside a function F that define entities in a scope outside F. This would disallow functions F, G, and H above.

Discussion

Should we have a protected Visibility? It has been omitted for now to keep the language simple, but there does not appear to be any fundamental reason why it could not be supported. If we do support it, should we choose the C++ protected concept (visible only in class and subclasses) or the Java protected concept (visible in class, subclasses, and the original class's package)?

JavaScript 2.0

Core Language

Variables

Wednesday, February 16, 2000

Variable Definitions

VariableDefinition VariableDefinitionKind VariableBindingList^allowIn

VariableDefinitionKind

var

| const

VariableBindingList^b

VariableBinding^b

| VariableBindingList^b , VariableBinding^b

VariableBinding^b TypedIdentifier^b VariableInitializer^b

TypedIdentifier^b

Identifier

| Identifier : TypeExpression^b

VariableInitializer^b

«empty»

| = AssignmentExpression^b

A variable defined with var can be modified, while one defined with const is read-only. Identifier is the name of the variable and TypeExpression is its type. Identifier can be any non-reserved identifier. TypeExpression is evaluated at the time the variable definition is evaluated and should evaluate to a type t.

If provided, AssignmentExpression gives the variable's initial value v. If AssignmentExpression is not provided in a var definition, then undefined is assumed; if undefined cannot be coerced to type t then any attempt to read the variable prior to writing a valid value into it will result in an error. AssignmentExpression is evaluated just after the TypeExpression is evaluated. The value v is then coerced to the variable's type t and stored in the variable. If the variable is defined using var, any values subsequently assigned to the variable are also coerced to type t at the time of each such assignment.

Multiple variables separated by commas can be defined in the same VariableDefinition. The values of earlier variables are available in the TypeExpressions and AssignmentExpressions of later variables.

If omitted, TypeExpression defaults to type any. Thus, the definition

var a, b=3, c:Integer=7, d, e:Type=Boolean, f:Number, g:e, h:int;

is equivalent to:

var a:any=undefined;
var b:any=3;
var c:Integer=7;
var d:Integer=undefined;  // coerced to NaN
var e:Type=Boolean;
var f:Number=undefined;   // coerced to NaN
var g:Boolean=undefined;  // coerced to false
var h:int=undefined;      // coerced to int(0)

Constant Definitions

const means that Identifier cannot be written after its value is set. Its value can be set by an AssignmentExpression if one is provided. If one is not provided then the constant can be written exactly once using a regular assignment statement; any attempt to read the constant prior to writing its value will result in an error. For example:

const c:Integer;

function f(x) {return x+c}

f(3);  // error: c's value is not defined
c = 5;
f(3);  // returns 8
c = 5; // error: redefining c

Just like any other definition, a constant may be rebound after leaving its scope. For example, the following is legal; j is local to the block, so a new j binding is created each time through the loop:

var k = 0;
for (var i = 0; i < 10; i++) {
  local const j = i;
  k += j;
}

JavaScript 2.0

Core Language

Functions

Friday, February 11, 2000

Function Definition

FunctionDefinition^w

ConcreteFunctionDefinition

| AbstractFunctionDefinition^w

ConcreteFunctionDefinition function FunctionName FunctionSignature Block

AbstractFunctionDefinition^w function FunctionName FunctionSignature Semicolon^w

FunctionName

Identifier

| get [no line break] Identifier

| set [no line break] Identifier

| new [no line break] Identifier

| new

FunctionSignature ParameterSignature ResultSignature

ParameterSignature ( Parameters )

Parameters

«empty»

| RestParameter

| RequiredParameters

| OptionalParameters

| RequiredParameters , RestParameter

| OptionalParameters , RestParameter

RequiredParameters

RequiredParameter

| RequiredParameters , RequiredParameter

OptionalParameters

OptionalParameter

| RequiredParameters , OptionalParameter

| OptionalParameters , OptionalParameter

RequiredParameter TypedIdentifier^allowIn

OptionalParameter TypedIdentifier^allowIn = AssignmentExpression^allowIn

RestParameter

...

| ... TypedIdentifier^allowIn

| ... TypedIdentifier^allowIn = AssignmentExpression^allowIn

ResultSignature

«empty»

| : TypeExpression^allowIn

The rest of this page is slightly out of date

Function Definitions

To define a function we use the following syntax:

FunctionDefinition

[Visibility] function [get | set] Identifier ( Parameters ) [: TypeExpression] Block

If Visibility is absent, the above declaration defines a local function within the current Block scope. If Visibility is present, the above declaration declares either a global function (if outside a ClassDefinition's Block) or a class function (if inside a ClassDefinition's Block) according to the declaration scope rules.

The function's result type is TypeExpression, which defaults to type Any if not given. If the function does not return a value, it's good practice to set TypeExpression to void to document this fact.

Block contains the function body and is evaluated only when the function is called.

Parameters

Parameters has one of the following forms:

Parameters

RequiredParameter , ... , RequiredParameter [, OptionalParameter ... , OptionalParameter] [, ... [Identifier]]

| ... [Identifier]

If the ... is present, the function accepts more arguments than just the listed parameters. If an Identifier is given after the ..., then that Identifier is bound to an array of arguments given after the listed parameters. That Identifier is declared locally as though by the declaration const array Identifier.

Individual parameters have the forms:

RequiredParameter

Identifier [: TypeExpression]

OptionalParameter

Identifier [: TypeExpression] = AssignmentExpression

TypeExpression gives the parameter's type and defaults to type Any. If the parameter name Identifier is followed by a =, then that parameter is optional. If the n^th parameter is optional and a call to this function provides fewer than n arguments, then the n^th parameter is set to the value of its AssignmentExpression, coerced to the n^th parameter's type if necessary. The n^th parameter's AssignmentExpression is evaluated only if fewer than n arguments are given in a call.

A RequiredParameter may not follow an OptionalParameter. If a function has n RequiredParameters and m OptionalParameters and no ... in its parameter list, then any call of that function must supply at least n arguments and at most n+m arguments. If this function has a ... in its parameter list, then any call of that function must supply at least n arguments. These restrictions do not apply to traditional functions.

The parameters' Identifiers are local variables with types given by the corresponding TypeExpressions inside the function's Block. Code in the Block may read and write these variables. Arguments are passed by value, so writes to these variables do not affect the passed arguments' values in the caller.

In addition to local variables generated by the parameters' Identifiers, each function also has a predefined arguments local variable which holds an array (of type const array) of all arguments passed to this function.

Evaluation Order

When a function is called, the following list indicates the order of evaluation of the various expressions in a FunctionDefinition. These steps are taken only after all of the arguments have been evaluated.

Evaluate the first parameter's TypeExpression to obtain a type t.
If the first parameter is optional and no argument has been supplied, evaluate the first parameter's AssignmentExpression and let it be the first parameter's value.
Coerce the argument (or default) value to type t and bind the parameter's Identifier to the result.
Repeat steps 1-3 for each additional parameter.
If the FunctionDefinition's Parameters ends with a ... followed by an Identifier, bind that Identifier to an array comprised of the zero or more leftover arguments not already bound to a parameter.
Evaluate the FunctionDefinition's result TypeExpression to obtain a result type r.
Evaluate the body.
Coerce the result to type r and return it.

Note that later TypeExpressions and AssignmentExpressions can refer to previously bound arguments. Thus, the following is legal:

function choice(boolean a, type b, b c, b d=) b {
  return a ? c : d;
}

The call choice(true,integer,8,4) would return 8, while choice(false,integer,6) would return 0 (undefined coerced to type integer).

Relationship to Methods and Classes

Unless the function is a traditional function, the function definition using the above syntax does not define a class; the function's name cannot be used in a new expression, and the function does not have a this parameter. Any attempt to use this inside the function's body is an error. To define a method that can access this, use the method keyword.

If a FunctionDefinition is located at a class scope (either because it is located the top level of a ClassDefinition's Block or it has a Visibility prefix and is located inside a ClassDefinition's Block), then the function is a static method of the class. Unlike C++ or Java, JavaScript 2.0 does not use the static keyword to indicate such functions; instead, instance methods (i.e. non-static methods) are defined using the method keyword.

Getters and Setters

If a FunctionDefinition contains the keyword get or set, then the defined function is a getter or a setter.

A getter must not take any parameters and cannot have a ... in its Parameters list. Unlike an ordinary function, a getter is invoked by merely mentioning its name without an Arguments list in any expression except as the destination of an assignment. For example, the following code returns the string “<2,3,1>”:

var x:integer = 0;
function get serialNumber():integer {return ++x}

var y = serialNumber;
return "<" + serialNumber + "," + serialNumber + "," + y + ">";

A setter must take exactly one required parameter and cannot have a ... in its Parameters list. Unlike an ordinary function, a setter is invoked by merely mentioning its name (without an Arguments list) on the left side of an assignment or as the target of a mutator such as ++ or --. The result of the setter becomes the result of the assignment. For example, the following code returns the string “<1,2,43>”:

var x:integer = 0;
function get serialNumber():integer {return ++x}
function set serialNumber(n:integer):integer {return x=n}

var s = "<" + serialNumber + "," + serialNumber;
serialNumber = 42;
return s + "," + serialNumber + ">";

A setter can have the same name as a getter in the same lexical scope. A getter or setter cannot be extracted from its variable, so the notion of the type of a getter or setter is vacuous; a getter or setter can only be called.

Contrast the following:

var x:integer = 0;
function f():integer {return ++x}
function g():Function {return f}
function get h():Function {return f}

f;     // Evaluates to function f
g;     // Evaluates to function g
h;     // Evaluates to function f (not h)
f();   // Evaluates to 1
g();   // Evaluates to function f
h();   // Evaluates to 2
g()(); // Evaluates to 3

We can use a getter and a setter to create an alias to another variable, as in:

function get myAlias() {return Pkg::var}
function set myAlias(x) {return Pkg::var = x}

myAlias = myAlias+4;

Traditional Functions

Traditional function definitions are provided for compatibility with JavaScript 1.5. The syntax is as follows:

TraditionalFunctionDefinition

[Visibility] traditional function Identifier ( Identifier , ... , Identifier ) Block

A function declared with the traditional keyword cannot have any argument or result type declarations, optional arguments, or getter or setter keyword. Such a function is treated as though every argument were optional and more arguments than just the listed ones were allowed. Thus, the definition

traditional function Identifier ( Identifier , ... , Identifier ) Block

behaves like the following function definition:

function Identifier ( Identifier = , ... , Identifier = , ... ) Block

Furthermore, a traditional function defines its own class and treats this in the same manner as JavaScript 1.5.

Functions in Expressions

Every function (except a getter or a setter) is also a value and has type Function. Like other values, it can be stored in a variable, passed as an argument, and returned as a result. The identifiers in a function are all lexically scoped.

Function Expressions

We can use a variant of a function definition to define a function inside an expression. The syntax is:

FunctionExpression

function [Identifier] ( Parameters ) [: TypeExpression] Block

This expression defines a function and returns it as a value of type Function. The function can be named by providing the Identifier, but this name is only accessible from inside the function's Block.

To avoid confusion between a FunctionDefinition and a FunctionExpression, a Statement (and a few other grammar nonterminals) may not begin with a FunctionExpression. To place a FunctionExpression at the beginning of a Statement, enclose it in parentheses.

A FunctionDefinition is merely convenient syntax for a const variable definition and a FunctionExpression:

[Visibility] function Identifier ( Parameters ) [: TypeExpression] Block

is equivalent to:

[Visibility] const Identifier : Function = function Identifier ( Parameters ) [: TypeExpression] Block ;

Function Calls

Unless a function is a getter or a setter, we call that function by listing its arguments in parentheses after the function expression, just as in JavaScript 1.5:

FullPostfixExpression

FullPostfixExpression ( AssignmentExpression , ... , AssignmentExpression )

| other postfix expressions

Discussion

Getters and Setters

By consensus in the ECMA TC39 modularity subcommittee, we decided to use the above syntax for getters and setters instead of:

FunctionDefinition

[Visibility] [getter | setter] function Identifier ( Parameters ) [: TypeExpression] Block

| TraditionalFunctionDefinition

The decision was based on aesthetics; neither syntax is more difficult to implement than the other.

Optional Parameters

Do we want to have a named rest parameter (as in the proposal above), or only support the arguments special local variable as in JavaScript 1.5? The main difference is in the handling of fixed arguments -- they must be added to the arguments array but can be omitted from the rest array.

Traditional Functions

The traditional keyword is ugly, so let's take a look at some alternatives. Unless we want to continue to make each function into a class (as JavaScript 1.5 does), we need some way to indicate which functions are also classes and which ones are not. Also, we'd like to be able to indicate which functions can be called with more or fewer than the desired number of arguments and which cannot.

One possibility would be to state that any function that uses a type annotation in its signature (either the parameter list or the result type) is a new-style function and does not define a class; other functions would declare classes. Furthermore, new-style functions would have to be called with the exact number of arguments unless some parameters are optional or a ... is present in the parameter list. These are analogous to the rules that ANSI C used to distinguish new-style functions from traditional C functions. As with ANSI C, we have somewhat of a difficulty with functions that take no parameters; such functions would need to specify a return type to be considered new-style.

C++ did away with the ANSI C treatment of traditional C functions. We could do the same by having a pragma (analogous to Perl's use pragmas) that could indicate that all functions are to be considered new-style unless prefixed by the traditional keyword. If we do this, we should decide whether the default setting of this pragma would be on or off.

JavaScript 2.0

Core Language

Classes

Monday, February 14, 2000

Class Definition

ClassDefinition class Identifier Superclasses Block

Superclasses

«empty»

| extends TypeExpression^allowIn

This page is out of date

Class Definitions

In JavaScript 2.0 we define classes using the class keyword. Limited classes can also be defined via JavaScript 1.5-style functions, but doing so is discouraged for new code.

ClassDefinition

[Visibility] class Identifier [extends TypeExpression] Block

| [Visibility] class extends TypeExpression Block

The first format declares a class with the name Identifier, binding Identifier to this class in the scope specified by the Visibility prefix (which usually includes the ClassDefinition's Block). Identifier is a constant variable with type type and can be used anywhere a type expression is allowed.

When the first ClassDefinition format is evaluated, the following steps take place:

A new type t is created.
If extends TypeExpression is given, TypeExpression is evaluated to obtain a type s, which must be another class. If extends TypeExpression is absent, type s defaults to the class Object.
Type t is made a subtype of type s.
Identifier is lexically bound in the scope given by Visibility; however, at this time Identifier does not have a legal type yet and any attempt to read or write it results in an error.
Block is evaluated.
If Block is evaluated successfully (without throwing out an exception), all const, var, function, constructor, and class declarations evaluated at its top level (or placed at its top level by the scope rules) become class members of type t. All field and method declarations evaluated at the Block's top level (or placed at its top level by the scope rules) become instance members of type t.
The value of Identifier becomes type t. From now on Identifier is a constant and its value cannot be altered.

A ClassDefinition's Block is evaluated just like any other Block, so it can contain expressions, statements, loops, etc. Such statements that do not contain declarations do not contribute members to the class being declared, but they are evaluated when the class is declared.

Class Extensions

If a ClassDefinition omits the class name Identifier, it extends the original class rather than creating a subclass. A class extension may define new methods and class constants and variables, but it does not have special privileges in accessing the original class definition's private members (or package members if in a separate package). A class extension may not override methods, and it may not define constructors or instance variables.

Each instance of the original class is automatically also an instance of the extended class. Several extensions can apply to the same class.

An extension is useful to add methods to system classes, as in the following code in some user package P:

class extends string {
  public method scramble() string {...}
  public method unscramble() string {...}
}

var x = "abc".scramble();

Once the class extension is evaluated, methods scramble and unscramble become available on all strings. There is no possibility of name clashes with extensions of class string in other, unrelated packages because the names scramble and unscramble belong to package P and not the system package that defines string. Any packages that import package P will also be able to call scramble and unscramble on strings, but other packages will not.

Members

A class has an associated set of class members and another set of instance members. Class members are properties of the class itself, while instance members are properties of each instance object of this class and have independent values for different instance objects.

Class members are one of the following:

Constants declared with the const keyword.
Class variables declared with the var keyword.
Class functions declared with the function keyword.
Constructors declared with the constructor keyword.
Nested classes declared with the class keyword.

Instance members are one of the following:

Fields declared with the field keyword.
Methods declared with the method keyword.

Members can only be defined within the intersection of the lexical and dynamic extent of a ClassDefinition's Block. A few examples illustrate this rule.

The code

var bool extended = false;

function callIt(x) {return x()}

class C {
  extended = true;
  public function square(integer x) integer {return x*x}
  if (extended) {
    public function cube(integer x) integer {return x*x*x}
  } else {
    public function reciprocal(number x) number {return 1/x}
  }

  field string firstName, lastName;
  method name() string {return firstName + lastName}

  public function genMethod(boolean b) {
    if (b) {
      public field time = 0;
    } else {
      public field date = 0;
    }
  }

  genMethod(true);
}

defines class C with members square (a class function), cube (a class function), firstName (an instance variable), lastName (an instance variable), name (an instance method), and genMethod (a class function).

On the other hand, executing the following code after the above example would be illegal due to three different errors:

genMethod(false);   // Field date declared outside of C's block's dynamic extent

public field color; // Field declared outside a class's block

function genField() {
  public field style;
}

class D {
  genField();       // Field style declared outside D's block's lexical extent
}

Visibility

While a ClassDefinition's Block is being evaluated, the already defined class members (other than constructors) are visible and usable by the code in that Block. Afterwards members can be accessed in one of several ways:

Code inside the ClassDefinition's Block can access class members merely by mentioning their names.
Code anywhere within the current class, anywhere within the current package (if a member's Visibility is package or omitted), or anywhere within the current package or any package that imports the appropriate version of the current package (if a member's Visibility is public) can access class members by using the . operator on the class.
Code anywhere within the current class, anywhere within the current package (if a member's Visibility is package or omitted), or anywhere within the current package or any package that imports the appropriate version of the current package (if a member's Visibility is public) can access instance members by using the . operator on any of the class's instances.

Inheritance

A subclass inherits all members except constructors from its superclass. Class variables have only one global value, not one value per subclass. A subclass may override visible methods, but it may not override or shadow any other visible members. On the other hand, imports and versioning can hide members' names from some or all users in importing packages, including subclasses in importing packages.

Member Definitions

We have already seen the definition syntax for variables and constants, functions, and classes. Any of these defined at a ClassDefinition's Block's top level (or placed at its top level by the scope rules) become class members of the class.

Fields, methods, and constructor definitions have their own syntax described below. These definitions must be lexically enclosed by a ClassDefinition's Block.

MemberDefinition

FieldDefinition

| MethodDefinition

| ConstructorDefinition

Field Definitions

FieldDefinition

[Visibility] field Identifier [: TypeExpression] [= AssignmentExpression] , ... , Identifier [: TypeExpression] [= AssignmentExpression] ;

A FieldDefinition is similar to a VariableDefinition except that it defines an instance variable of the lexically enclosing class. Each new instance of the class contains a new, independent set of instance variables initialized to the values given by the AssignmentExpressions in the FieldDefinition.

Identifier is the name of the instance variable and TypeExpression is its type. Identifier can be any non-reserved identifier. TypeExpression is evaluated at the time the variable definition is evaluated and should evaluate to a type t. The TypeExpressions and AssignmentExpressions are evaluated once, at the time the FieldDefinition is evaluated, rather than every time an instance of the class is constructed; their values are saved for use in constructors.

If omitted, TypeExpression defaults to type any.

If provided, AssignmentExpression gives the instance variable's initial value v. If not, undefined is assumed; an error occurs if undefined cannot be coerced to type t. AssignmentExpression is evaluated just after the TypeExpression is evaluated. The value v is then coerced to the variable's type t and stored in the instance variable. Any values subsequently assigned to the instance variable are also coerced to type t at the time of each such assignment.

Multiple instance variables separated by commas can be defined in the same FieldDefinition.

A field cannot be overridden in a subclass.

Method Definitions

MethodDefinition

[Visibility] [final] [override] method [get | set] Identifier ( Parameters ) [: TypeExpression] Block

| [Visibility] [final] [override] method [get | set] Identifier ( Parameters ) [: TypeExpression] ;

A MethodDefinition is similar to a FunctionDefinition except that it defines an instance method of the lexically enclosing class. Parameters, the result TypeExpression, and the body Block behave just like for function definitions, with the following differences:

Every method has a predefined parameter this that refers to the instance object of the method's class on which the method was called.
A method is not in itself a value and has no type. There is no way to extract an undispatched method from a class. The . operator produces a function (more specifically, a closure) that is already dispatched and has this bound to the left operand of the . operator.
There is no analogue to functions' traditional syntax for methods. Optional parameters must be specified explicitly.

We call a regular method by combining the . operator with a function call. For example:

class C { field x:integer = 3; method m() {return x} method n(x) {return x+4} } var c = new C; c.m(); // returns 3 c.n(7); // returns 11 var f:Function = c.m; // f is a zero-argument function with this bound to c f(); // returns 3 c.x = 8; f(); // returns 8

Method Overriding

A class c may override a method m defined in its superclass s. To do this, c should define a method m' with the same name as m and use the override keyword in the definition of m'. Overriding a method without using the override keyword or using the override keyword when not overriding a method results in a warning intended to catch misspelled method names. The warning is not an error to allow subclass c to either define a method if it is not present in s or override it if it is present in s -- this situation can arise when s is imported from a different package and provides several versions.

The overriding method m' does not have to have the same number or type of parameters as the overridden method m. In fact, since parameter types can be arbitrary expressions and are evaluated only during a call, checking for parameter type compatibility when the overriding method m is declared would require solving the halting problem. Moreover, defining overriding methods that are more general than overridden methods is useful.

A method defined with the final keyword cannot be overridden (or further overridden) in subclasses.

Getter and Setter Methods

If a MethodDefinition contains the keyword get or set, then the defined method is a getter or a setter. These are analogous to getter and setter functions in that they are invoked without listing the parentheses after the method name.

A getter or setter method cannot be overridden. We could relax this restriction, but then we'd also have to allow overriding of fields by getters, setters, or other fields, and, as a corollary, allow fields to be declared final.

Constructor Definitions

ConstructorDefinition

[Visibility] constructor Identifier ( Parameters ) Block

A constructor is a class function that creates a new instance of the lexically enclosing class c. A constructor's body Block is required to call one of c's superclass's constructors (when and how?). Afterwards it may access the instance object under construction via the this local variable. A constructor should not return a value with a return statement; the newly created object is returned automatically.

A constructor can have any non-reserved name, in which case we would invoke it as though it were a class function. In addition, a constructor's Identifier can have the special name new, in which case we invoke it using the new prefix operator syntax as in JavaScript 1.5.

JavaScript 2.0

Core Language

Packages

Wednesday, February 16, 2000

Package Definition

PackageDefinition package Identifier Block

This page is out of date

Overview

Packages are an abstraction mechanism for grouping and distributing related code. Packages are designed to be linked at run time to allow a program to take advantage of packages written elsewhere or provided by the embedding environment. JavaScript 2.0 offers a number of facilities to make packages robust for dynamic linking:

Selected package contents can be protected from outside reference
Classes can maintain invariants that cannot be violated by code outside the class and/or package
Function arguments and data structure references can be type-checked to limit the kinds of unexpected inputs the package's code can experience
Packages can export multiple versions, allowing graceful upgrades to packages without changing the code that uses them

A package is a file (or analogous container) of JavaScript 2.0 code. There is no specific JavaScript statement that introduces or names a package -- every file is presumed to be a package. A package itself has no name, but it has a specific URI by which other packages can import it.

A package P typically starts with import statements that import other packages used by package P. A package that is meant to be used by other packages typically has one or more version declarations that declare versions available for export.

Package Loading

A package's body is described by the Program grammar nonterminal. A package is loaded (its body is evaluated) when the package is first imported or invoked directly (if, for example, the package is on an HTML web page). Some standard packages may also be loaded when the JavaScript engine first starts up.

Two attempts to load the same package in the same environment result in sharing of that package. What constitutes an environment is necessarily application-dependent. However, if package P1 loads packages P2 and P3, both of which load package P4, then P4 is loaded only once and thereafter its code and data is shared by P2 and P3.

When a package is loaded, all of its statements are evaluated in order, which may cause other packages to be loaded along the way when import statements are encountered. A package's symbols are available for export to other packages only after the package's body has been successfully evaluated. Unlike in Java, circularities are not allowed in the graph of package imports.

To create packages A and B that access each others' symbols, we need to instead define a hidden package C that consists of all of the code that would have gone into A and B. Package C should define versions verA and verB and tag the symbols it exports with either verA or verB to indicate whether these symbols belong in package A or B. Package A should then be empty except for a directive (or several directives if there are multiple versions of A and verA) that reexports C's symbols tagged with verA. Similarly, package B should reexport C's symbols tagged with verB. To make this work we need a reexport directive. Is this really necessary? Also, do we want a mechanism for hiding package C from general view so that users can only use it through A or B?

Exports

We can export a symbol in a package by giving it public visibility.

Imports

To import symbols from a package we use the import statement:

ImportStatement

import ImportList ;

| import ImportList Block

| import ImportList Block else CodeStatement

ImportList

ImportItem , ... , ImportItem

ImportItem

[[protected] Identifier =] NonAssignmentExpression [: Version]

The first form of the import statement (without a Block) imports symbols into the current lexical scope. The second and third forms import symbols into the lexical scope of the Block. If the imports are unsuccessful, the first two forms of the import statement throw an exception, while the last form executes the CodeStatement after the else keyword.

An import statement can import one or more packages separated by commas. Each ImportItem specifies one package to be imported. The NonAssignmentExpression should evaluate to a string that contains a URI where the package may be found. If present, Version indicates the version of the package's exports to be imported; if not present, Version defaults to version 1.

An ImportItem can introduce a name for the imported package if the NonAssignmentExpression is preceded by Identifier =. Identifier becomes bound (either in the current lexical scope or in the Block's scope) to the imported package as a whole. Individual symbols can be extracted from the package by using Identifier with the :: operator. For example, if package at URI P has public symbols a and b, then after the statement

import x=P;

P's symbols can be referenced as either a, b, x::a, or x::b.

If an ImportItem contains the keyword protected, then the imported symbols can only be accessed using the :: operator. If we were to import package P using

import protected x=P;

then we'd have to access P's symbols using either x::a or x::b.

If two imports in the same scope import packages with clashing symbols, then neither symbol is accessible unless qualified using the :: operator. If an imported symbol clashes with a symbol declared in the same scope, then the declared symbol shadows the imported symbol. Scope rules 3 and 4 apply here as well, so the following code is illegal because a is referenced and then redefined:

import x=P; var y=a; // References P's a const a=17; // Redefines a in same scope

Version names cannot be imported.

Discussion

Package Names

Do we want to use URIs to locate packages, or do we want to invent our own, separate mechanism to do this?

Visibilities

Should we make private illegal outside a class rather than making it equivalent to package?

Should we introduce a local Visibility prefix that explicitly means that the declaration is visible locally? This wouldn't provide any additional functionality but it would provide a convenient name for talking about the four kinds of visibility prefixes.

What should the default visibilities be? The current defaults are loosely modeled after Java:

Definition Location	Default visibility
Package top level	`package` (equivalent to `local` in this case)
Inside a statement outside a function or class	`local`
Function or method code's top level	`local`
Inside a statement inside a function or method	`local`
Class declaration block's top level	`package`
Inside a statement inside a class declaration block	`local`

JavaScript 2.0

Core Language

Language Declarations

Friday, February 11, 2000

Language declarations allow a script writer to select the language to use for a script or a particular section of a script. A language denotes either a major language such as JavaScript 2.0 or a variation such as strict mode.

Developers often find it desirable to be able to write a single script that takes advantage of the latest features in a host environment such as a browser while at the same time working in older host environments that do not support these features. JavaScript 2.0's language declarations enable one to easily write such scripts. One may still need to use techniques such as the LANGUAGE HTML attribute to support pre-JavaScript 2.0 environments, but at least the number of such environments that will need to be special-cased will not increase in the future.

Language declarations are a dual of versioning: language declarations let a script run under a variety of historical hosts, while versioning lets a host run a variety of historical scripts.

Syntax

LanguageDeclaration^w language [no line break] LanguageIds LanguageAlternatives LanguageSemicolon^w

LanguageSemicolon^abbrev

;

| «empty»

LanguageSemicolon^{abbrevNonEmpty}

;

| «empty»

LanguageSemicolon^full ;

LanguageAlternatives

«empty»

| |

| | LanguageIds LanguageAlternatives

LanguageIds

Identifier LanguageIdsRest

| Number LanguageIdsRest

LanguageIdsRest

«empty»

| [no line break] LanguageIds

A language declaration uses the syntax above. The keyword language is followed by one or more language alternatives separated by vertical bars. Each language alternative consists of one or more identifiers or numbers (language identifiers), except that, if there is more than one language alternative, the last one may be empty. The semicolon at the end of the LanguageDeclaration cannot be inserted by line-break semicolon insertion.

When a JavaScript environment is lexing and parsing a JavaScript program and it encounters a language declaration, it checks whether any of the language alternatives can be satisfied. If at least one can, the environment picks the first language alternative that can be satisfied and processes the rest of the containing block (until the closing } or until the end of the program if at the top level) using that language. A subsequent language declaration in the same block can further change the language.

If no language alternatives can be satisfied, then the JavaScript environment skips to the end of the containing block (until the closing matching } or until the end of the program if at the top level). Further language declarations in the same block are ignored. No error occurs unless the failing language declaration is executed as a statement, in which case it throws a syntax error. [See rationale for a discussion of some of the issues here.]

The following language identifiers are currently defined:

Language Identifier	Language
`1.0`	JavaScript 1.0
`1.1`	JavaScript 1.1
`1.2`	JavaScript 1.2
`1.3`	JavaScript 1.3
`1.4`	JavaScript 1.4
`1.5`	JavaScript 1.5 (ECMAScript Edition 3)
`2.0`	JavaScript 2.0
`strict`	Strict mode
`traditional`	Traditional mode (default)

It is meaningless to combine two or more numeric language identifiers in the same alternative:

language 1.0 2.0;

will always fail. On the other hand, it is meaningful and useful to separate them with vertical bars. For example, one can indicate that one prefers JavaScript 2.1 but is willing to accept JavaScript 2.0 if 2.1 is not available:

language 2.1 | 2.0;

An empty alternative will always succeed. One can use it to indicate a preference for strict mode but willingness to work without it:

language strict |;

Language declarations are always lexically scoped and never extend past the end of the enclosing block.

This document specifies the 2.0 language and its strict and traditional modes. The consequences of mixing in other languages are implementation-defined, but implementations are encouraged to do something reasonable.

Strict Mode

Many parts of JavaScript 2.0 are relaxed or unduly convoluted due to compatibility requirements with JavaScript 1.5. Strict mode sacrifices some of this compatibility for simplicity and additional error checking. Strict mode is intended to be used in newly written JavaScript 2.0 programs, although existing JavaScript 1.5 programs may be retrofitted.

The opposite of strict mode is traditional mode, which is the default. A program can readily mix strict and traditional portions.

Strict mode has the following effects:

Line-break semicolon insertion is turned off. (Grammatical semicolon insertion remains turned on.)
[no line break] restrictions in grammar productions are ignored. Line breaks can be placed anywhere between input tokens.
Variables must be declared.
FunctionDefinitions define constants rather than variables.
Calls to functions defined under strict mode are checked for the correct number of arguments except in traditional functions and in functions that explicitly allow a variable number of arguments. (The mode of the call site does not matter.)
Implementations may choose to disable other compatibility extensions such as support for octal literals. These are not officially part of JavaScript 2.0 but most implementations support these in traditional mode for compatibility with older programs.

Predefined Types

The following types are predefined in JavaScript 2.0:

Type	Set of Values	Coercions
`none`	No values	None
`void`	`undefined`	Any value `undefined`
`Null`	`null`	`undefined` `null`
`Boolean`	`true` and `false`	`undefined` `false`
`Integer`	Double-precision IEEE floating-point numbers that are mathematical integers, including positive and negative zeroes, infinities, and NaN	`undefined` NaN
`Number`	Double-precision IEEE floating-point numbers, including positive and negative zeroes, infinities, and NaN	`undefined` NaN
`Character`	Single 16-bit unicode characters	None
`String`	Immutable strings of unicode characters	`undefined` ""
`Function`	All functions	None
`Array`	All arrays	`undefined` `[]`
`Type`	All types	`undefined` `any`
`any`	All values	None

Unlike in JavaScript 1.5, there is no distinction between objects and primitive values. All values can have methods. Some values can be sealed, which disallows addition of ad-hoc properties. User-defined classes can be made to behave like primitives.

The above type names are not reserved words. They are considered to be defined in a scope that encloses a package's global scope, so a package could use these type names as identifiers. However, defining these identifiers for other uses might be confusing because it would shadow the corresponding type names (the types themselves would continue to exist, but they could not be accessed by name).

any is the supertype of all types. none is the subtype of all types. none is useful to describe the return type of a function that cannot return normally because it either falls into an infinite loop or always throws an exception. void is useful to describe the return type of a function that can return but that does not produce a useful value.

A literal number is a member of the type Number; if that literal has an integral value, then it is also a member of type Integer. A literal string is a member of the type String; if that literal has exactly one 16-bit unicode character, then it is also a member of type Character.

Predefined Type Constructors

We can use the following operators to construct more complex types. t is a type expression and u is a value expression in the table below.

Type	Values	Coercions
`+` `t`	`null` or any value belonging to type `t`	`null` `null`; `undefined` `null` (if `undefined` is not a member of `t`); any other coercions already defined for `t`
`~` `t`	`undefined` or any value belonging to type `t`	`undefined` `undefined`; any other coercions already defined for `t`
`singleton(u)`	Only the value `u`	None

The language cannot syntactically distinguish type expressions from value expressions, so a type expression can also use any other value operators such as !, |, and . (member access). Except for parentheses, most of them are not very useful, though. See also the type expression syntax rationale for other possible type constructors.

User-Defined Types

Any class defined using the class declaration is also a type that denotes the set of all of its and its descendants' instances. These include the predefined classes, so Object, Date, etc. are all types. null is not an instance of a user-defined class.

Meaning of Types

Types are generally used to restrict the set of objects that can be held in a variable or passed as a function argument. For example, the declaration

var x:Integer;

restricts the values that can be held in variable x to be integers.

Type declarations use the Pascal-style colon syntax. See the type declaration syntax rationale for an alternative.

A type declaration does not affect the semantics of reading the variable or accessing one of its members. Thus, as long as expression new MyType() returns a value of type MyType, the following two code snippets are equivalent:

var x:MyType = new MyType();
x.foo();

var x = new MyType();
x.foo();

This equivalence always holds, even if these snippets are inside the declaration of class MyType and foo is a private field of that class. As a corollary, adding true type annotations does not change the meaning of a program.

Type Expressions

A type is also a value (whose type is type) and can be used in expressions, assigned to variables, passed to functions, etc. For example, the code

const Z:Type = Integer;
function abs_val(i:Z):Z {
  return i<0 ? -i : i;
}

is equivalent to:

function abs_val(i:Integer):Integer {
  return i<0 ? -i : i;
}

As another example, the following method takes a type and returns an instance of that type:

method QueryInterface(T:Type):T { ... }

Coercions

Coercions can take place in the following situations:

Assigning a value v to a variable or field of type t
Passing an argument v to a function whose corresponding parameter has type t
Returning a result v from a function declared to return a value of type t
Using the v@t operator.

In any of these cases, if v t, then v is passed unchanged. If v t, then if t defines a mapping for value v then that mapped v is used; otherwise an error occurs.

`@` Operator

One can explicitly request a coercion in an expression by using the @ operator. This operator has the same precedence as . and coerces its left operand to the right operand, which must be a type. ... v@t ... can be used in an expression and has the same effect as:

function coerce_to_t(a:t):t {return a} // Declared at the top level

... coerce_to_t(v) ...

assuming that coerce_to_t is an identifier not used anywhere else. The @ operator is useful as a type assertion as in w@Window. It's a postfix operator to simplify cascading expressions:

w@Window.child@Window.pos

is equivalent to:

(((w@Window).child)@Window).pos

Casts

A type cast performs more aggressive transformations than a type coercion. To cast a value to a given type, we use the type as a function, passing it the value as an argument:

type(value)

For example, Integer(258.1) returns the integer 258, and String(2+2==4) returns the string "true".

If value is already a member of type, the type cast returns value unchanged. If value can be coerced to type, the type cast returns the result of the coercion. Otherwise, the effect of a type cast depends on type.

Need to specify the semantics of type casts. They are intended to mimic the current ToNumber, ToString, etc. methods.

JavaScript 2.0

Libraries

Versions

Tuesday, February 15, 2000

Motivation

As a package evolves over time it often becomes necessary to change its exported interface. Most of these changes involve adding symbols (global and class members), although occasionally a symbol may be deleted or renamed. In a monolithic environment where all JavaScript source code comes preassembled from the same source, this is not a problem. On the other hand, if packages are dynamically linked from several sources then versioning problems are likely to arise.

One of the most common avoidable problems is collision of symbols. Unless we solve this problem, an author of a library will not be able to add even one symbol in a future version of his library because that symbol could already be in use by some client or some other library that a client also links with. This problem occurs both in the global namespace and in the namespaces within classes from which clients are allowed to inherit.

Example

Here's an example of how such a collision can arise. Suppose that a library provider creates a library called BitTracker that exports a class Data. This library becomes so successful that it is bundled with all web browsers produced by the BrowsersRUs company:

package BitTracker;

public class Data {
  public field author;
  public field contents;
  function save() {...}
};

function store(d) {
  ...
  storeOnFastDisk(d);
}

Now someone else writes a web page W that takes advantage of BitTracker. The class Picture derives from Data and adds, among other things, a method called size that returns the dimensions of the picture:

import BitTracker;

class Picture extends Data {
  public method size() {...}
  field palette;
};

function orientation(d) {
  if (d.size().h >= d.size().v)
    return "Landscape";
  else
    return "Portrait";
}

The author of the BitTracker library, who hasn't seen W, decides in response to customer requests to add a method called size that returns the number of bytes of data in a Data object. He then releases the new and improved BitTracker library. BrowsersRUs includes this library with its latest NavigatorForInternetComputing 17.0 browser:

package BitTracker;

public class Data {
  public field author;
  public field contents;
  public method size() {...}
  function save() {...}
};

function store(d) {
  ...
  if (d.size() > limit)
    storeOnSlowDisk(d);
  else
    storeOnFastDisk(d);
}

An unsuspecting user U upgrades his old BrowsersRUs browser to the latest NavigatorForInternetComputing 17.0 browser and a week later is dismayed to find that page W doesn't work anymore. U's granddaughter Alyssa P. Hacker tries to explain to U that he's experiencing a name conflict on the size methods, but U has no idea what she is talking about. U attempts to contact the author of W, but she has moved on to other pursuits and is on a self-discovery mission to sub-Saharan Africa. Now U is steaming at BrowsersRUs, which in turn is pointing its finger at the author of BitTracker.

Solutions

How could the author of BitTracker have avoided this problem? Simply choosing a name other than size wouldn't work, because there could be some other page W2 that conflicts with the new name. There are several possible approaches:

Naming conventions. We could require each symbol to be prefixed by the full name of the party from which this symbol originates. Unfortunately, this would get tedious and unnecessarily impact casual uses of the language. Furthermore, this approach is impractical for the names of methods because it is often desirable to share the same method name across several classes to attain polymorphism; this would not be possible if Netscape's objects all used the com_netscape_length method while MIT's objects used the edu_mit_length method.
Explicit imports. We could require each client package to import every external symbol it references. This works reasonably well for global symbols but becomes tedious for the names of class members, which would have to be imported separately for each class. Alternatives exist for bulk importing members of a class, but they are somewhat complicated and do not work for interfaces or multiple inheritance.
Versions. We could require package authors to mark the symbols they export with explicit versions. A package's developer could introduce a new version of the package with additional symbols as long as those symbols were made invisible to prior versions.

The last approach appears to be the most desirable because it places the smallest burden on casual users of the language, who merely have to import the packages they use and supply the current version numbers in the import statements. A package author has to be careful not to disturb the set of visible prior-version symbols when releasing an updated package, but authors of dynamically linkable packages are assumed to be more sophisticated users of the language and could be supplied with tools to automatically check updated packages' consistency.

Overview

The versioning system in JavaScript 2.0 only affects exports of symbols. The concept of a version does not apply to a package's internal code; it is up to package developers to ensure that newer releases of their packages continue to behave compatibly with older ones.

Terminology

A version describes the API of a package. A release refers to the entirety of a package, including its code. One release can export many versions of its API. A package developer should make sure that multiple releases of a package that export version V export exactly the same set of symbols in version V.

Example

As an example, suppose that a developer wrote a sorting package P with functions sort and merge that called bubble sort in version "1.0". In the next release the developer adds a function called stablesort and includes it in version "2.0". In a subsequent release the developer changes the sort algorithm to a quicksort that calls stablesort as a subroutine. That last release of the package might look like:

compile {
  const V1_0 = Version("1.0","");       // The "" makes version "1.0" be the default
  const V2_0 = Version("2.0","1.0");
}

public var serialNumber;

public function sort(compare: Function, array: Array):Array {...}
public function merge(compare: Function, array1: Array, array2: Array):Array {...}
V2_0 function stablesort(compare: Function, array: Array):Array {...}

Suppose, further, that client package C1 imports version "1.0" of P, client package C2 simultaneously imports version "2.0" of P, and a search for P yields the latest release described above. There would be only one instance of P running -- the latest release. Both clients would get the same sort and merge functions, and both would see the same serialNumber variable (in particular, if client C1 wrote to serialNumber, then client C2 would see the updated value), but only client package C2 would see the stablesort function. Both clients would get the quicksort release of sort. If client package C1 defined its own stablesort function, then that function would not conflict with P's stablesort; furthermore, P's sort would still refer to P's stablesort in its internal subroutine call.

Had only the first release of P been available, client package C2 would obtain an error because version 2 of P's API would not be available. Client C1 could run normally, although the sort function it calls would use bubble sort instead of the quicksort.

Note that the last release of P did not change the API so it did not need a new version. Of course, it could define a new version if for some reason it wanted clients to be able to demand the last release of P even though its API is the same as the second release.

The remainder of this page is out of date. Versions are now created using ordinary object calls on a versioning library.

Version Declarations

Version Names

A version name Version is a quoted string literal such as "1.2" or "Private Interface 2.0". Two version names are equal if their strings are equal. A special version whose name is the empty string "" is called the default version.

Declaration Syntax

A package must declare every version it uses except "", which is declared by default if not explicitly declared. A version must be declared before its first use. A given version name may be declared only once per package. A package declares a version name Version using the version declaration:

VersionDefinition

[Visibility] version Version [> VersionList] ;

| [Visibility] version Version [= Version] ;

VersionList

Version , ... , Version

A version declaration cannot be nested inside a ClassDefinition's Block.

If Visibility is present, it must be either private, package, or public (without VersionsAndRenames). Unlike in other declarations, the default is public, which makes Version accessible by other packages. A private or package Visibility hides its Version from other packages; such a Version can be used only by being included in the VersionList of another Version. Also unlike other declarations, all Version declarations are global.

Version Ordering

If the Version being declared is followed by a > and a VersionList, then the Version is said to be greater than all of the Versions in the VersionList. We write v1 :> v2 to indicate that v1 is greater than v2 and v1 : v2 to indicate that either v1 and v2 are the same version or v1 :> v2. Order is transitive, which means that if v1 :> v2 and v2 :> v3, then v1 :> v3. This order induces a partial order on the set of all versions. It is possible for two versions to be unordered with respect to each other, in which case they are not equal and neither is greater than the other.

If the Version v1 being declared is followed by a = and another Version v2, then v1 becomes an alias for v2, and they may be used interchangeably.

Version Ranges

A VersionRange specifies a subset of all versions. This subset contains all versions that are both greater than or equal to a given Version1 and less than or equal to a given Version2. A VersionRange can have either of the following forms:

VersionRange

Version

| [Version1] .. [Version2]

The first form specifies the one-element set {Version}. The second form specifies the set of all Versions v such that v : Version1 and Version2 : v. If Version1 is omitted, the condition v : Version1 is dropped. If Version2 is omitted, the condition Version2 : v is dropped.

Discussion

Version Numbers 1

The original version of this specification allowed both strings and numbers as Version names. Two version names were equal if their toString representations were identical, so version names 2.0 and "2" were identical but 2.0 and "2.0" were not. In addition, numbered versions had an implicit order: For any two versions v1 and v2 whose names could be represented as numbers, v1 :> v2 if and only if v1 was numerically greater than v2. Additionally, every version except 0 was greater than version 0. It was an error to define explicit version containment relations that would violate this default order, directly or indirectly.

Numbered Version names were dropped for simplicity and to avoid confusion with versions such as 1.2.3 (which would be a syntax error unless quoted).

Version Numbers 2

Another, simpler, approach is to require all Version names to be nonnegative integers (without quotes). Versions would not need to be declared, and all versions would be totally ordered in numerical order. A disadvantage of this approach is that the total order keeps versions from being branched.

Dynamic Version Definitions

Currently version definitions are fixed. These could be turned into function calls that define versions and list their relationships. If we can get a variable or constant to hold a set of version names, then we could use these variables rather than specific version names in the VersionsAndRenames lists after public keywords. This would provide another level of abstraction and flexibility.

Separate Version Definitions

Yet another approach is to consolidate all of the information in VersionsAndRenames into a set of export statements, say, at the top of the file rather than being interspersed throughout a package along with public declarations. This would make it easier to see all of the identifiers exported by a particular version of the package, but it would also likely lead to inconsistencies when someone forgets to update an export statement after inserting another variable, function, field, or method definition. Such errors would likely be caught after a package has been released.

JavaScript 2.0

Libraries

Machine Types

Wednesday, February 16, 2000

Purpose

The machine types library is an optional library that provides additional low-level types for use in JavaScript 2.0 programs. On implementations that support this library, these types provide faster, Java-style integer operations that are useful for communicating between JavaScript 2.0 and other programming languages and for performance-critical code. These types are not intended to replace Number and Integer for general-purpose scripting.

When the machine types library is imported via an import of MachineTypes version 1, the following types become available:

Type	Unit	Values
`byte`	`B`	Machine integers between -128 and 127 inclusive
`ubyte`	`UB`	Machine integers between 0 and 255 inclusive
`short`	`S`	Machine integers between -32768 and 32767 inclusive
`ushort`	`US`	Machine integers between 0 and 65535 inclusive
`int`	`I`	Machine integers between -2147483648 and 2147483647 inclusive
`uint`	`UI`	Machine integers between 0 and 4294967295 inclusive
`long`	`L`	Machine integers between -9223372036854775808 and 9223372036854775807 inclusive
`ulong`	`UL`	Machine integers between 0 and 18446744073709551615 inclusive
`float`	`F`	Single-precision IEEE floating-point numbers, including positive and negative zeroes, infinities, and NaN

Values belonging to the nine machine types above are distinct from each other and from values of type integer. A literal may be written by using one of the units provided: 7B is the same as byte(7), which is distinct from 7I, which in turn is distinct from the plain integer 7. A float NaN is distinct from the regular Number NaN. However, the coercions listed below often hide these distinctions.

No subtype relations hold between the machine types.

The above type names are not reserved words.

The units are defined using the standard unit facility. They may be overridden by the user.

Coercions

The following coercions take place:

A machine integer value m can be coerced to any of the machine integer types M. If m is not within range of the target type M, it is treated modulo |M|, where |byte| = |ubyte| = 256, |short| = |ushort| = 65536, |int| = |uint| = 2³², and |long| = |ulong| = 2⁶⁴.
A machine integer value m of type M can be coerced to type Integer or Number. The result is the closest IEEE double-precision floating-point value using the IEEE round-to-nearest mode. 0 always becomes +0. Due to the possibility of an inexact result, awarning is generated if type M is long or ulong unless this coercion is done as a cast.
A machine integer value m of type M can be coerced to type float. The result is the closest IEEE single-precision floating-point value using the IEEE round-to-nearest mode. 0 always becomes +0. Due to the possibility of an inexact result, awarning is generated if type M is int, uint, long, or ulong unless this coercion is done as a cast.
A float value m can be coerced to type Number. The result is always exact.

Casts

The following casts can be used:

A float or Number value v can be cast to one of the machine integer types M. First v is truncated to an integer i, truncating towards zero. Then, if i is not within range of the target type M, it is treated modulo |M|. The result is i with the machine type M. +0, -0, Infinity, -Infinity, and NaN all cast to the machine integer 0.
A Number value v can be cast to type float. If inexact, the cast is done using the IEEE round-to-nearest mode. +0, -0, Infinity, -Infinity, and NaN all cast to their float equivalents.

Of course, any coercion can also be used as a cast.

Operations

When applied to a value with machine type M, the unary negation operator - always returns a value of the same type M. If the result is not within range of type M, it is treated modulo |M|.

Machine integers support the binary arithmetic operators +, -, *, /, % and bitwise logical operations ~, &, |, ^. If supplied two operands of different machine integer types M1 and M2, all of these binary operators first coerce both operands to the same type M. If M1 appears before M2 in the list byte, ubyte, short, ushort, int, uint, long, ulong, then M is M2; otherwise M is M1. Then these operators perform the operation and finally return the result as a value of type M. If the result is not within range of the target type M, it is treated modulo |M|.

If one of the operands of +, -, *, /, % is a machine integer m of type M and the other is a Number or float value, then m is first coerced to type Number or float. Next, if both operands are floats, then the result is a float; otherwise the result is a Number.

Machine integers also support bitwise shifts <<, >>, and >>>. The result has the same as the first operand. The second operand's type can be Number or any machine type and does not affect the type of the result. Right shifts using >> are signed if the first operand has type byte, short, int, or long, and unsigned if it has type ubyte, ushort, uint, or ulong. Right shifts using >>> are always unsigned.

If passed a float argument, the bitwise logical operations ~, &, |, ^ first coerce the float to a Number. If passed a float as the first argument, the bitwise shifts <<, >>, >>> first coerce the float to a Number.

The comparison operators ==, !=, <, >, <=, => allow any combination of machine type or Number operands. They always compare the exact mathematical values without first converting one operand's type to the other's. Comparisons involving NaNs are always false, and positive and negative zeros compare equal.

The identity comparisons === and !== treat all nine machine type values as disjoint from each other and from regular Number values. Thus, 7B !== 7.

The unary operator !v behaves the same as v!=0 when v has any machine type.

Discussion

These rules are designed to permit machine integer operations to be implemented as single instructions on most processor architectures yet give predictable results. Overflows wrap around instead of signaling errors because such behavior is useful for many bit-manipulation algorithms and permits much better optimization of performance-critical code. Code that is concerned about overflows should be using regular Integer instead of the machine integer types.

Disjointness of Machine Types

Why are values of the eight machine integer types distinct? This was done because of a desire to allow arithmetic operators to only support 32 bits when operating on int values. Let's take a look at the alternative:

Suppose we unify the values of all eight machine types so that 2000000000I is indistinguishable from 2000000000L. To what precision should an operator like + calculate its results? Clearly, if we're adding two long values and the result is within the range of long values, then we'd expect to get the right result. In particular, 2000000000L + 2000000000L should yield 4000000000L. However, we assumed that 2000000000L is indistinguishable from 2000000000I, so 2000000000I + 2000000000I should also yield 4000000000L, which is not representable as an int value. Thus, even if both operands are known to be int values, the + operator has to use 64-bit arithmetic.

If a has type int and we compute a+1I, then we have to use 64-bit arithmetic because the result could be 2147483648. However, if we compute var r:int = a+1I instead, then a smart compiler could make do with 32-bit arithmetic because the result is treated modulo 2³². However, this trick would not work with an expression such as if (a+1I > 0).

The alternative is viable but it leads to more demand for 64-bit arithmetic. It does have the advantage that one does not need to worry about intermediate overflows as long as the values don't approach 2⁶⁴.

JavaScript 2.0

Libraries

Operator Overloading

Wednesday, February 16, 2000

Overview

Operator overloading is useful to implement Spice-style units without having to add units to the core of the JavaScript 2.0 language. Operator overloading is done via an optional library that, when imported, exposes several additional functions and methods. This library is analogous to the internationalization library in that it does not have to be present on all implementations of JavaScript 2.0; implementations without this library do not support operator overloading.

Interface

To override operators, import package Operators, version 1.

Unary Operators

After importing package Operators, the following methods become available on all objects. Override these to override the behavior of unary operators.

Method	Operator
`Operator::plus()`	`+expr`
`Operator::minus()`	`-expr`
`Operator::bitwiseNot()`	`~expr`
`Operator::preIncrement()`	`++expr`
`Operator::postIncrement()`	`expr++`
`Operator::preDecrement()`	`--expr`
`Operator::postDecrement()`	`expr--`
`Operator::call(a1,` ...`,` `an)`	`expr(a1,` ...`,` `an)`
`Operator::construct(a1,` ...`,` `an)`	`new` `expr(a1,` ...`,` `an)`
`Operator::lookup(a1,` ...`,` `an)`	`expr[a1,` ...`,` `an]`
`Operator::toBoolean():Boolean`	`if (expr)` ..., etc.

The preIncrement, postIncrement, preDecrement, and postDecrement operators should return a two-element array; the first element should be the result of the operator, while the second should be a new value to be stored as the new value of the incremented or decremented variable. The other operators should return a result of the expression.

The call, construct, and lookup operators also take argument lists. If desired, these argument lists can include optional or ... arguments.

The !, ||, ^^, &&, and ? : operators cannot be overridden directly, but they are affected by any redefinition of toBoolean.

Binary Operators

After importing package Operators, the following global functions become available to override binary operators:

Function	Operator
`defineAdd(T1:Type, T2:Type, F:Function)`	`+`
`defineSubtract(T1:Type, T2:Type, F:Function)`	`-`
`defineMultiply(T1:Type, T2:Type, F:Function)`	`*`
`defineDivide(T1:Type, T2:Type, F:Function)`	`/`
`defineRemainder(T1:Type, T2:Type, F:Function)`	`%`
`defineLeftShift(T1:Type, T2:Type, F:Function)`	`<<`
`defineRightShift(T1:Type, T2:Type, F:Function)`	`>>`
`defineLogicalRightShift(T1:Type, T2:Type, F:Function)`	`>>>`
`defineBitwiseOr(T1:Type, T2:Type, F:Function)`	`\|`
`defineBitwiseXor(T1:Type, T2:Type, F:Function)`	`^`
`defineBitwiseAnd(T1:Type, T2:Type, F:Function)`	`&`
`defineLess(T1:Type, T2:Type, F:Function)`	`<`
`defineLessOrEqual(T1:Type, T2:Type, F:Function)`	`<=`
`defineEqual(T1:Type, T2:Type, F:Function)`	`==`
`defineIdentical(T1:Type, T2:Type, F:Function)`	`===`

Each of these functions defines the meaning of an operator for the case where its first operand has type T1 and the second operand has type T2. At least one of these types must be a class defined in the current package. F is a function that takes two arguments (of type T1 and T2) and produces the operator's result. The function F used to override the <, <=, ==, and === operators should return a Boolean; the results of the other operators are unrestricted.

When one of the operators op above is invoked in an expression a op b, the most specific definition of op that matches a and b is invoked. A definition of op for types t1 and t2 matches if the value of a is a member of t1 and the value of b is a member of t2. A definition of op for types t1 and t2 is most specific if it matches and if every other matching definition of op for types s1 and s2 satisfies t1 s1 and t2 s2. If there is no most specific matching definition of op then an error occurs.

After an operator is defined for a particular pair of types T1 and T2 it cannot be changed. A static implementation may restrict calls to the above define... functions to occur only in compiler blocks.

The >, >=, !=, and !== operators cannot be overridden directly; instead, they are defined in terms of <, <=, ==, and ===:

Expression	Definition
`a` `>` `b`	`b` `<` `a`
`a` `>=` `b`	`b` `<=` `a`
`a` `!=` `b`	`!(a` `==` `b)`
`a` `!==` `b`	`!(a` `===` `b)`

JavaScript 2.0

Formal Description

Thursday, November 11, 1999

Semantic Notation
Stages
Lexer Grammar Summary (also available as Word 98 rtf)
Lexer Grammar and Semantics (Word 98 rtf)
Regular Expression Grammar Summary (Word 98 rtf)
Regular Expression Grammar and Semantics (Word 98 rtf)
Parser Grammar Summary (Word 98 rtf)

This chapter presents the formal syntax and semantics of JavaScript 2.0. The syntax notation and semantic notation sections explain the notation used for this description. A simple metalanguage based on a typed lambda calculus is used to specify the semantics.

The syntax and semantic sections are available in both HTML 4.0 and Microsoft Word 98 RTF formats. In the HTML versions each use of a grammar nonterminal or metalanguage value, type, or field is hyperlinked to its definition, making the HTML version preferred for browsing. On the other hand, the RTF version looks much better when printed. The fonts, colors, and other formatting of the various grammar and semantic elements are all encoded as CSS (in HTML) or Word (in RTF) styles and can be altered if desired.

The syntax and semantics sections are machine-generated from code supplied to a small engine that can type-check and execute the semantics directly. This engine is in the CVS tree at mozilla/js/semantics; the input files are at mozilla/js/semantics/JS20.

JavaScript 2.0

Formal Description

Semantic Notation

Thursday, November 11, 1999

Introduction

To precisely specify the semantics of JavaScript 2.0, we use the notation described below to define the behavior of all JavaScript 2.0 constructs and their interactions.

Semantic Values

The semantics describe the meaning of a JavaScript 2.0 program in terms of operations on simpler objects borrowed from mathematics collectively called semantic values. Semantic values can be held in semantic variables and passed to semantic functions. The kinds of semantic values used in this specification are summarized in the table below and explained in the next few sections:

Semantic Value Examples	Description
	The result of a nonterminating computation
syntaxError	The result of a computation that returns by throwing a semantic exception
	The result of a semantic function that does not return a useful value
true, false	Booleans
-3, 0, 1, 2, 93	Mathematical integers
1/2, -12/7	Mathematical rational numbers
1.0, 3.5, 2.0e-10, -0.0, -, NaN	Double-precision IEEE floating-point numbers
‘`A`’, ‘`b`’, ‘`«LF»`’, ‘`«uFFFF»`’	Characters (Unicode 16-bit code points)
[`value0`, ... , `valuen-1`]	Vectors — indexed lists of semantic values
“”, “`abc`” , “`1«TAB»`5”	Strings
{`value1`, `value2`, ... , `valuen`}	Mathematical sets of semantic values
`name1` `value1`, `name2` `value2`, ... , `namen` `valuen`	Tuples with named member semantic values
`name` or `name` `value`	Tagged semantic values
function(n: Integer) n*n	Semantic functions

There is a special semantic value (pronounced as "bottom") that represents the result of an inconsistent or nonterminating computation. Unless specified otherwise, applying any semantic operator (such as +, *, etc.) to or calling a semantic function with as any argument also yields without evaluating any remaining operands or arguments (in technical terms, semantic functions and operators are strict in all of their arguments unless specified otherwise).

If interpreting a JavaScript program according to the semantics here gives a result, an actual implementation executing that JavaScript program will either fail to terminate or throw an exception because it runs out of memory or stack space.

Semantic values of the form value represents the result of a computation that throws a semantic exception. value is the exception's value (which must be a member of the SemanticException semantic type). Unless specified otherwise, applying any semantic operator (such as +, *, etc.) to value or calling a semantic function with value as any argument also yields value (with the same value) without evaluating any remaining operands or arguments.

The throw statement takes a value v and returns v. The catch statement converts v back to v.

Semantic functions that do not return a useful value return the semantic value . There are no operations defined on .

Booleans

The semantic values true and false are booleans. The not, and, or, and xor operators operate on booleans. Like most other operators, and, or, and xor evaluate both operands before returning a result; these operators do not short-circuit.

Integers

Unless specified otherwise, numbers in the semantics written without a slash or decimal point are mathematical integers: ..., -3, -2, -1, 0, 1, 2, 3, .... The usual mathematical operators +, -, *, and unary - can be used on integers. Integers can be compared using =, , <, , >, and .

Rationals

Numbers in the semantics written with a slash are mathematical rational numbers. Every integer is also a rational. Rational numbers include, for example, 0, 1, 2, -1, 1/2, -12/7, and -24/14; the last two are different ways of writing the same rational number. The usual mathematical operators +, -, *, /, and unary - can be used on rationals. Rationals can be compared using =, , <, , >, and .

Doubles

Numbers in the semantics written with a decimal point are double-precision IEEE floating-point numbers (often abbreviated as doubles), including distinct +0.0, -0.0, +, -, and NaN. Doubles are distinct from integers and rationals; when writing doubles in the semantics, we always include a decimal point to distinguish them from integers and rationals.

Doubles other than +, -, and NaN are called finite. We define the significand of a finite double d as follows:

The significand of +0.0 or -0.0 is 0.
If d is normalized, we can uniquely represent it as s m 2^e, where m and e are integers, s {-1, 1}, 2⁵² m < 2⁵³, and -1074 e 971. m is the significand.
If d is denormalized, we can uniquely represent it as s m 2^-1074, where m is an integer, s {-1, 1}, and 0 < m < 2⁵². m is the significand.

Characters

Characters are single Unicode 16-bit code points. We write them enclosed in single quotes ‘ and ’. There are exactly 65536 characters: ‘«u0000»’, ‘«u0001»’, ...,‘A’, ‘B’, ‘C’, ..., ‘«uFFFF»’ (see also notation for non-ASCII characters). Unicode surrogates are considered to be pairs of characters for the purpose of this specification.

The characterToCode and codeToCharacter semantic functions convert between characters and their integer Unicode values.

Vectors

A semantic vector contains zero or more elements indexed by integers starting from zero. We write a vector value by enclosing a comma-separated list of values inside bold brackets:

[element0, element1, ... , elementn-1]

For example, the following semantic value is a vector whose elements are four strings:

[“parsley”, “sage”, “rosemary”, “thyme”]

The empty vector is written as [].

Let u = [e0, e1, ... , en-1] and v = [f0, f1, ... , fm-1] be vectors, i and j be integers, and x be a value. The following notations describe common operations on vectors:

Notation	Result Value
`u` `v`	The concatenated vector [`e0`, `e1`, ... , `en-1`, `f0`, `f1`, ... , `fm-1`]
\|`u`\|	The length `n` of the vector
`u`[`i`]	The `i`^th element `ei`, or if `i`<0 or `in`
`u`[`i` ... `j`]	The vector slice [`ei`, `ei+1`, ... , `ej`] consisting of all elements of `u` between the `i`^th and the `j`^th, inclusive, or if `i`<0, `jn`, or `j`<`i`-1. The result is the empty vector [] if `j`=`i`-1.
`u`[`i` ...]	The vector slice [`ei`, `ei+1`, ... , `en-1`] consisting of all elements of `u` between the `i`^th and the end, or if `i`<0 or `i`>`n`. The result is the empty vector [] if `i`=`n`.
`u`[`i` `x`]	The vector [`e0`, ... , `ei-1`, `x`, `ei+1`, ... , `en-1`] with the `i`^th element replaced by the value `x` and the other elements unchanged, or if `i`<0 or `in`

Semantic vectors are functional; there is no notation for modifying a semantic vector in place.

Strings

A semantic string is merely a vector of characters. For notational convenience we can write a string literal as zero or more characters enclosed in double quotes. Thus,

“Wonder«LF»”

is equivalent to:

[‘W’, ‘o’, ‘n’, ‘d’, ‘e’, ‘r’, ‘«LF»’]

In addition to all of the other vector operations, we can use =, , <, , >, and to compare two strings.

Sets

A semantic set is an unordered collection of values. Each value may occur at most once in a set. There must be a well-defined = semantic operator defined on all pairs of values in the set, and that operator must induce an equivalence relation.

A semantic set is denoted by enclosing a comma-separated list of values inside braces:

{element1, element2, ... , elementn}

The empty set is written as {}.

For example, the following set contains seven integers:

{3, 0, 10, 11, 12, 13, -5}

When using elements such as integers and characters that have an obvious total order, we can also write sets by using the ... range operator. For example, we can rewrite the above set as:

{0, -5, 3 ... 3, 10 ... 13}

If the beginning of the range is equal to the end of the range, then the range consists of only one element: {7 ... 7} is the same as {7}. If the end of the range is one "less" than the beginning, then the range contains no elements: {7 ... 6} is the same as {}. If the end of the range is more than one "less" than the beginning, then the set is .

Let A and B be sets and x be a value. The following notations describe common operations on sets:

Notation	Result Value
\|`A`\|	The number of elements in the set `A`; if `A` has infinitely many elements
min `A`	If there exists a value `m` that satisfies both `m` `A` and for all elements `x` `A`, `x` `m`, then return `m`; otherwise return (this could happen either if `A` is empty or if `A` has an infinite descending sequence of elements with no lower bound in `A`)
max `A`	If there exists a value `m` that satisfies both `m` `A` and for all elements `x` `A`, `x` `m`, then return `m`; otherwise return (this could happen either if `A` is empty or if `A` has an infinite ascending sequence of elements with no upper bound in `A`)
`A` `B`	The intersection of sets `A` and `B` (the set of all values that are present both in `A` and in `B`)
`A` `B`	The union of sets `A` and `B` (the set of all values that are present in at least one of `A` or `B`)
`A` - `B`	The difference of sets `A` and `B` (the set of all values that are present in `A` but not `B`)
`x` `A`	Return true if `x` is an element of set `A` and false if not
`A` = `B`	Return true if the two sets `A` and `B` are equal and false otherwise. Sets `A` and `B` are equal if every element of `A` is also in `B` and every element of `B` is also in `A`.

min and max are only defined for sets whose elements can be compared with <.

Tuples

A semantic tuple is an aggregate of several named semantic values. Tuples are sometimes called records or structures in other languages. A tuple is denoted by a comma-separated list of names and values between bold triangular brackets:

name1 value1, name2 value2, ... , namen valuen

Each namei valuei pair is called a field. The order of fields in a tuple is irrelevant, so x 3, y 4 is the same as y 4, x 3. A tuple's names must all be distinct.

Let w be an expression that evaluates to a tuple name1 value1, name2 value2, ... , namen valuen. We can extract the value of the field named namei from w by using the notation w.namei. w is required to have this field. For example, x 3, y 4.x is 3.

In the HTML versions of the semantics, each use of namei is linked back to its tuple type's definition.

Oneofs

A semantic oneof is a pair consisting of a name (called the tag) and a value. Oneofs are sometimes called variants or tagged unions in other languages. A oneof is denoted by writing the tag followed by the value:

name value

For brevity, when value is , we can omit it altogether, so red is the same as red .

Let o be an expression that evaluates to some oneof n v. We can perform the following operations on o:

Notation	Result Value
`o`.`name`	The value `v` if `n` is `name`; otherwise
`o` is `name`	true if `n` is `name`; false otherwise

For example, (red 5) is blue evaluates to false, while (red 5) is red evaluates to true. (red 5).red evaluates to 5.

In addition to the operators above, the case statement evaluates one of several expressions based on a oneof tag.

In the HTML versions of the semantics, each use of name is linked back to its oneof type's definition.

Functions

A semantic function receives zero or more arguments, performs computations, and returns a result. We write a semantic function as follows:

function(param1: type1, ... , paramn: typen) body

Here param1 through paramn are the function's parameters, type1 through typen are the parameters' respective semantic types, and body is an expression that computes the function's result. When the function is called with argument values v1 through vn, the function's body is evaluated and the resulting value returned to the caller. body can refer to the parameters param1 through paramn; each reference to a parameter parami evaluates to the corresponding argument value vi. Arguments are passed by value (which in this language is equivalent to passing them by reference because there is no way to write to a parameter).

Function parameters are statically scoped. When functions are nested and an inner function f defines a parameter with the same name as a parameter of an outer function g, then f's parameter shadows g's parameter inside f.

The only operation allowed on a semantic function f is calling it, which we do using the f(arg1, ..., argn) syntax. In the presence of side effects, f is evaluated first, followed by the argument expressions arg1 through argn, in left-to-right order. If the result of evaluating f or any of the argument expressions is , then the call immediately returns without evaluating the following argument expressions, if any. If the result of evaluating f or any of the argument expressions is v for some value v, then the call immediately returns that v without evaluating the following argument expressions, if any. Otherwise, f's body is evaluated and the resulting value returned to the caller.

Semantic Types

A semantic type is a possibly infinite set of semantic values. Names of semantic types are shown in Capitalized Red Small Caps, and compound semantic type expressions are in red.

We use semantic types to make the semantics more readable by declaring the semantic type of each semantic variable (including function argument variables). Each such declaration states that the only values that will be stored in a semantic variable will be members of that variable's semantic type. These declarations can be proven statically. The JavaScript semantics have been machine type-checked to ensure that every type declaration holds, so, for example, if the semantics state that variable x has type Integer then there does not exist any place that could assign the value true to x.

Semantic type annotations allow us to restrict the description of each semantic operator and function to only describe its behavior on arguments that are members of the arguments' semantic types. Thus, for example, we need not describe the behavior of the + semantic operator when passed the semantic values true and as operands because we can prove that this case cannot arise.

Every semantic type includes the values and v for all values v whose semantic type is SemanticException. For brevity we do not list and v in the tables below.

Basic Semantic Types

The following are the basic semantic types:

Type	Set of Values
Void	{}
Boolean	{true, false}
Integer	{..., -2, -1, 0, 1, 2, ...} (All mathematical integers)
Rational	All mathematical rational numbers
Double	All double-precision IEEE floating-point numbers, including , -, and NaN
Character	All 65536 characters
String	Shorthand for Character[] (see vector types below)
SemanticException	Set of all values that can be thrown as semantic exceptions. This type is defined separately inside each grammar that throws such exceptions.

The type Rational includes Integer as a subtype because every integer is also a rational number. Except for and v, the types Rational and Double are disjoint.

Compound Semantic Types

We can construct compound semantic types using the notation below. Here t, t1, t2, ..., tn represent some existing semantic types.

Type	Set of Values
`t`[]	All vectors [`v0`, ... , `vn-1`] all of whose elements `v0`, ... , `vn-1` have type `t`. Note that the empty vector [] is a member of every vector type `t`[].
{`t`}	All sets {`v1`, `v2`, ... , `vn`} all of whose elements `v1`, ... , `vn` have type `t`. Note that the empty set {} is a member of every set type {`t`}.
tuple {`name1`: `t1`; ... ; `namen`: `tn`}	All tuples `name1` `v1`, ... , `namen` `vn` for which each `vi` has type `ti` for 1 `i` `n`. The `namei`'s must be distinct; the order in which the `namei`: `ti` fields are listed does not matter.
oneof {`name1`: `t1`; ... ; `namen`: `tn`}	All oneofs of the form `namei` `v`, where 1 `i` `n` and `v` has type `ti`. If `tk` is Void, then `namek`: `tk` can be abbreviated as simply `namek` in the oneof semantic type syntax. The `namei`'s must be distinct; the order in which the `namei`: `ti` alternatives are listed does not matter.
`t1` `t2` ... `tn` `t`	Some* functions that take `n` arguments of types `t1` through `tn` respectively and produce a result of type `t`. If `n` is zero (the function takes no arguments), we write this type as () `t`. * Technically speaking, this semantic type includes only functions that are continuous in the domain-theoretical sense; this avoids set-theoretical paradoxes.
() `t`

The type constructors earlier in the table bind tighter than ones later in the table, so, for example, Integer[] Rational[] is equivalent to (Integer[]) (Rational[]) (a function that takes a vector of Integers and returns a vector of Rationals) rather than ((Integer[]) Rational)[] (a vector of functions, each of which takes a vector of Integers and returns a Rational). In the rare cases where this is needed, parentheses are used to override precedence.

Semantic Operators

The table below lists the semantic operators in order from the highest precedence (tightest-binding) to the lowest precedence (loosest-binding). Operators under the same heading of the table have the same precedence and associate left-to-right, so, for example, 7-3+2-1 is interpreted as ((7-3)+2)-1 instead of 7-(3+(2-1)) or (7-(3+2))-1. When needed, parentheses can be used to group expressions.

The type signatures of the operators are also listed. Some operators are polymorphic; t, t1, t2, ..., and tn can represent any semantic types. The types of some operators are underdetermined; for example, [] can have type t[] for any type t. In these cases the particular choice of type is inferred from the context.

Each operator in the table below is strict: it evaluates all of its operands left-to-right, and if any operand evaluates to , then the operator immediately returns without evaluating the following operands, if any. If any operand evaluates to v for some value v, then the operator immediately returns that v without evaluating the following operands, if any.

Operator	Signatures	Description
Nonassociative Operators
(`x`)	`t` `t`	Return `x`. Parentheses are used to override operator precedence.
\|`u`\|	`t`[] Integer	`u` is a vector [`e0`, `e1`, ... , `en-1`]. Return the length `n` of that vector.
\|`u`\|	{`t`} Integer	The number of elements in the set `u`; if `u` has infinitely many elements
[`x0`, `x1`, ... , `xn-1`]	`t` ... `t` `t`[]	Return a vector with the elements `x0`, `x1`, ... , `xn-1`.
{`x1`, `x2`, ... , `xn`}	`t` ... `t` {`t`}	Return a set with the elements `x1`, `x2`, ... , `xn`. Any duplicate elements are included only once in the set. When `t` is Integer or Character, we can also replace any of the `xi`'s by a range `xi` ... `yi` that contains all integers or characters greater than or equal to `xi` and less than or equal to `yi`. `yi` must not be less than `xi` "minus" one.
`name1` `x1`, ... , `namen` `xn`	`t1` ... `tn` tuple {`name1`: `t1`; ... ; `namen`: `tn`}	Return a tuple with the fields `name1` `x1`, ... , `namen` `xn`.
`name`	oneof {`name`; `name2`: `t2`; ... ; `namen`: `tn`}	Return a oneof value with tag `name` and value .
Action[`nonterminal_i`]	Determined by Action's declaration	This notation can only be used inside an action definition for a grammar production that has nonterminal `nonterminal` on the production's right side. Return the value of action Action invoked on the `i`^th instance of nonterminal `nonterminal` on the right side of . The subscript `i` can be omitted if there is only one instance of nonterminal `nonterminal` in .
`nonterminal_i`	Character	This notation can only be used inside an action definition for a grammar production that has nonterminal `nonterminal` on the production's left or right side. Furthermore, every complete expansion of grammar nonterminal `nonterminal` must expand it into a single character. Return the character to which the `i`^th instance of nonterminal `nonterminal` on the right side of expands. The subscript `i` can be omitted if there is only one instance of nonterminal `nonterminal` in . If the subscript is omitted and nonterminal `nonterminal` appears on the left side of , then this expression returns the single character to which this whole production expands.
Suffix Operators
`u`[`i`]	`t`[] Integer `t`	`u` is a vector [`e0`, `e1`, ... , `en-1`]. Return the `i`^th element `ei`, or if `i`<0 or `in`.
`u`[`i` ... `j`]	`t`[] Integer Integer `t`[]	`u` is a vector [`e0`, `e1`, ... , `en-1`]. Return the vector slice [`ei`, `ei+1`, ... , `ej`] consisting of all elements of `u` between the `i`^th and the `j`^th, inclusive, or if `i`<0, `jn`, or `j`<`i`-1. The result is the empty vector [] if `j`=`i`-1.
`u`[`i` ...]	`t`[] Integer `t`[]	`u` is a vector [`e0`, `e1`, ... , `en-1`]. Return the vector slice [`ei`, `ei+1`, ... , `en-1`] consisting of all elements of `u` between the `i`^th and the end, or if `i`<0 or `i`>`n`. The result is the empty vector [] if `i`=`n`.
`u`[`i` `x`]	`t`[] Integer `t` `t`[]	`u` is a vector [`e0`, `e1`, ... , `en-1`]. Return the vector [`e0`, ... , `ei-1`, `x`, `ei+1`, ... , `en-1`] with the `i`^th element replaced by the value `x` and the other elements unchanged, or if `i`<0 or `in`.
`w`.`namei`	tuple {`name1`: `t1`; ... ; `namen`: `tn`} `ti`	`w` is a tuple `name1` `v1`, ... , `namen` `vn`. Return the value `vi` of `w`'s field named `namei`.
`w`.`namei`	oneof {`name1`: `t1`; ... ; `namen`: `tn`} `ti`	`w` is a oneof `namek` `v` for some `k` between 1 and `n` inclusive. Return the value `v` if `namei` is `namek`, or if not.
`f`(`x1`, ..., `xn`)	(`t1` ... `tn` `t`) `t1` ... `tn` `t`	Call the function `f` with the arguments `x1` through `xn` and return the result.
Prefix Operators
-`x`	Integer Integer or Rational Rational	The mathematical negation of `x`
min `A`	{`t`} `t`	Return the minimal element of set `A`. Specifically, if there exists a value `m` that satisfies both `m` `A` and for all elements `x` `A`, `x` `m`, then return `m`; otherwise return (this could happen either if `A` is empty or if `A` has an infinite descending sequence of elements with no lower bound in `A`). The type `t` must have = and < operations that define a total order.
max `A`	{`t`} `t`	Return the maximal element of set `A`. Specifically, if there exists a value `m` that satisfies both `m` `A` and for all elements `x` `A`, `x` `m`, then return `m`; otherwise return (this could happen either if `A` is empty or if `A` has an infinite ascending sequence of elements with no upper bound in `A`). The type `t` must have = and < operations that define a total order.
`name` `x`	`t` oneof {`name`: `t`; `name2`: `t2`; ... ; `namen`: `tn`}	Return a oneof value with tag `name` and value `x`.
Multiplicative Operators
`x` * `y`	Integer Integer Integer or Rational Rational Rational	The mathematical product of `x` and `y`
`x` / `y`	Rational Rational Rational	The mathematical quotient of `x` and `y`; if `y`=0
`A` `B`	{`t`} {`t`} {`t`}	The intersection of sets `A` and `B` (the set of all values that are present both in `A` and in `B`)
Additive Operators
`x` + `y`	Integer Integer Integer or Rational Rational Rational	The mathematical sum of `x` and `y`
`x` - `y`	Integer Integer Integer or Rational Rational Rational	The mathematical difference of `x` and `y`
`u` `v`	`t`[] `t`[] `t`[]	`u` is a vector [`e0`, `e1`, ... , `en-1`] and `v` is a vector [`f0`, `f1`, ... , `fm-1`]. Return the concatenated vector [`e0`, `e1`, ... , `en-1`, `f0`, `f1`, ... , `fm-1`].
`A` `B`	{`t`} {`t`} {`t`}	The union of sets `A` and `B` (the set of all values that are present in at least one of `A` or `B`)
`A` - `B`	{`t`} {`t`} {`t`}	The difference of sets `A` and `B` (the set of all values that are present in `A` but not `B`)
Comparison Operators
`x` = `y`	Rational Rational Boolean or Character Character Boolean or String String Boolean or {`t`} {`t`} Boolean	Comparisons return true if the relation holds or false if not. Rationals are compared mathematically. Characters are compared according to their code points. Two strings are equal when they have the same lengths and contain exactly the same sequences of characters. A string `x` is less than string `y` when either `x` is the empty string and `y` is not empty, the first character of `x` is less than the first character of `y`, or the first character of `x` is equal to the first character of `y` and the rest of string `x` is less than the rest of string `y`. Two sets `x` and `y` are equal if every element of `x` is also in `y` and every element of `y` is also in `x`. Only = and can be used to compare sets.
`x` `y`
`x` < `y`
`x` `y`
`x` > `y`
`x` `y`
`x` `A`	`t` {`t`} Boolean	Return true if `x` is an element of set `A` and false if not
`o` is `namei`	oneof {`name1`: `t1`; ... ; `namen`: `tn`} Boolean	`o` is a oneof `namek` `v` for some `k` between 1 and `n` inclusive. Return true if `namei` is `namek`, or false otherwise.
Logical Negation
not `a`	Boolean Boolean	true if `a` is false; false if `a` is true
Logical Conjunction
`a` and `b`	Boolean Boolean Boolean	true if both `a` and `b` are true; false if at least one of `a` and `b` is false
Logical Disjunction
`a` or `b`	Boolean Boolean Boolean	true if at least one of `a` and `b` is true; false if both `a` and `b` are false
`a` xor `b`	Boolean Boolean Boolean	true if `a` is true and `b` is false or `a` is false and `b` is true; false if both `a` and `b` are true or both `a` and `b` are false

Semantic Statements

Semantic statements are similar to the semantic operators above in that they are also used to construct expressions, take zero or more operands, and return a value. Unlike other semantic operators, semantic statements are usually non-strict: they do not always evaluate all of their operands. Semantic statements have lower precedence than any of the semantic operators above.

Some semantic statements are syntactic sugars, which means that they are defined as macros that expand into other, simpler statements and operators.

Function

function(param1: type1, ... , paramn: typen) body

See the description of function values.

Let

let var1: type1 = expr1; ... ; varn: typen = exprn in body

Evaluate expr1 through exprn in order and save the results. If any expri evaluates to , then immediately return without evaluating the following expr's. If any expri evaluates to v for some value v, then immediately return that v without evaluating the following expr's. Otherwise evaluate body with new local variable bindings of var1 through varn bound to the saved results of evaluating expr1 through exprn, respectively. Return the result of evaluating body.

type1 through typen are the local variables' respective semantic types. The type of the entire let expression is the type of its body.

The let expression above is syntactic sugar for:

(function(var1: type1, ... , varn: typen) body)(expr1, ... , exprn)

If

if expr then bodytrue else bodyfalse

Evaluate expr. If it evaluates to , then immediately return . If expr evaluates to v for some value v, then immediately return that v. Otherwise expr must evaluate to either true or false. If it evaluated to true, then evaluate bodytrue and return its result. If expr evaluated to false, then evaluate bodyfalse and return its result.

expr must have type Boolean. The entire if expression has any type t such that both bodytrue has type t and bodyfalse has type t.

Case

case expr of
    name1(var1: type1): body1;
    ...
    namen(varn: typen): bodyn;
    end

Evaluate expr. If it evaluates to , then immediately return . If expr evaluates to v for some value v, then immediately return that v. Otherwise expr must evaluate to a oneof name v where name matches namei for some i between 1 and n inclusive. Evaluate the corresponding bodyi with a new local variable vari bound to v. Return bodyi's result.

If we are not interested in using the oneof's value for a particular bodyi, we can shorten that bodyi's clause from:

namei(vari: typei): bodyi

to:

namei: bodyi

In this case no local variable is bound while evaluating bodyi.

expr must have type oneof {name1: type1; ... ; namen: typen}. The entire case expression has any type t such that all of its bodyi's have type t. The namei's must be distinct. The order in which the case clauses are listed does not matter.

Throw

throw expr

Evaluate expr. If it evaluates to , then immediately return . If expr evaluates to v for some value v, then immediately return that v. Otherwise expr must evaluate to some value v, in which case return v.

expr must have type SemanticException. The entire throw expression has any type whatsoever (because every semantic type includes v).

Try-Catch

try
bodytry
catch (var: SemanticException)
bodyhandler

Evaluate bodytry to obtain a value w. If w does not have the form v for some value v, then return w. Otherwise w is v for some value v. In this case evaluate bodyhandler with a new local variable var bound to v and return bodyhandler's result.

The type of var is always SemanticException. The entire try-catch expression has any type t such that both bodytry has type t and bodyhandler has type t.

Semantic Functions

The sections below list the predefined semantic functions, their type signatures, and short descriptions. All functions are strict and evaluate their arguments left-to-right.

Integer Manipulation

These functions perform bitwise operations on integers. The integers are treated as though they were written in binary notation, with each 1 bit representing true and 0 bit representing false. The integers must be nonnegative.

Function	Signature	Description
bitwiseAnd(`x`, `y`)	Integer Integer Integer	The bitwise AND of `x` and `y`
bitwiseOr(`x`, `y`)		The bitwise OR of `x` and `y`
bitwiseXor(`x`, `y`)		The bitwise XOR of `x` and `y`
bitwiseShift(`x`, `count`)	Integer Integer Integer	Shift `x` to the left by `count` bits. If `count` is negative, shift `x` to the right by -`count` bits. Bits shifted out are lost; bit shifted in are zero. This function is equivalent to multiplying `x` by 2^count and truncating the result (toward negative infinity) to an integer. `x` can be negative.

Double Manipulation

Function	Signature	Description
rationalToDouble(`r`)	Rational Double	The rational number `r` rounded to the nearest IEEE double-precision floating-point value as follows: Consider the set of all doubles, with -0.0, +, -, and NaN removed and with two additional values added to it that are not representable as doubles, namely 2¹⁰²⁴ and -2¹⁰²⁴. Choose the member of this set that is closest in value to `r`. If two values of the set are equally close, choose the one with an even significand; for this purpose, the two extra values 2¹⁰²⁴ and -2¹⁰²⁴ are considered to have even significands. Finally, if 2¹⁰²⁴ was chosen, replace it with +; if -2¹⁰²⁴ was chosen, replace it with -; if +0.0 was chosen, replace it with -0.0 if and only if `r` < 0; any other chosen value is used unchanged. The result is the value of rationalToDouble(`r`). This procedure corresponds exactly to the behavior of the IEEE 754 "round to nearest" mode.

Character Conversions

Function	Signature	Description
characterToCode(`c`)	Character Integer	The number of the Unicode code point `c`
codeToCharacter(`i`)	Integer Character	The Unicode code point number `i`, or if `i`<0 or `i`>65535

Character Utilities

The function digitValue is defined as follows:

digitValue(c: Character) : Integer
  = if c  {‘0’ ... ‘9’}
     then characterToCode(c) - characterToCode(‘0’)
     else if c  {‘A’ ... ‘Z’}
     then characterToCode(c) - characterToCode(‘A’) + 10
     else if c  {‘a’ ... ‘z’}
     then characterToCode(c) - characterToCode(‘a’) + 10
     else

Character Class Queries

Function	Signature	Description
isOrdinaryInitialIdentifierCharacter(`c`)	Character Boolean	Return true if the nonterminal `OrdinaryInitialIdentifierCharacter` can expand into `c` and false otherwise
isOrdinaryContinuingIdentifierCharacter(`c`)	Character Boolean	Return true if the nonterminal `OrdinaryContinuingIdentifierCharacter` can expand into `c` and false otherwise

Semantic Definitions

Value Definitions

We can define a global semantic constant named var as follows:

var : type = expr

expr should evaluate to a value of type type. expr should not have side effects, and it should not evaluate to .

In the HTML versions of the semantics, each reference to the global semantic constant var is linked to var's definition.

Function Definitions

We can define a global semantic function named f as follows:

f(param1: type1, ... , paramn: typen) : type = body

param1 through paramn are the function's parameters, type1 through typen are the parameters' respective semantic types, type is the function result's semantic type, and body is an expression that computes the function's result.

The above definition is syntactic sugar for the global constant definition:

f : type1 type2 ... typen type = function(param1: type1, ... , paramn: typen) body

In the HTML versions of the semantics, each reference to the global semantic function f is linked to f's definition.

For example, the function definition

square(x: Integer) : Integer = x*x

defines a function named square that takes an Integer parameter x and returns an Integer that is the square of x. This is equivalent to the following global definition:

square : Integer Integer = function(x: Integer) x*x

Type Definitions

We can give a new name to a semantic type t by using the type definition, which has the form:

type name = t

For example, the following notation defines RegExp as a shorthand for tuple {reBody: String; reFlags: String}:

type RegExp = tuple {reBody: String; reFlags: String}

In the HTML versions of the semantics, each reference to the semantic type name name is linked to name's definition.

Semantic Actions

Semantic actions tie together the grammar and the semantics. A semantic action ascribes semantic meaning to a grammar production.

To illustrate the use of semantic actions, we shall look at an example, followed by a detailed description of the notation for specifying semantic actions.

Example

Consider the following grammar, with the start nonterminal Numeral:

Digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Digits

Digit

| Digits Digit

Numeral

Digits

| Digits # Digits

This grammar defines the syntax of an acceptable input: “37”, “33#4” and “30#2” are acceptable syntactically, while “1a” is not. However, the grammar does not indicate what these various inputs mean. That is the job of the semantics, which are defined in terms of actions on the parse tree of grammar rule expansions. Consider the following sample set of actions defined on this grammar, with a starting Numeral action called (in this example) Value:

type SemanticException = oneof {syntaxError}

action Value[Digit] : Integer = digitValue(Digit)

action DecimalValue[Digits] : Integer

DecimalValue[Digits Digit] = Value[Digit]

DecimalValue[Digits Digits₁ Digit] = 10*DecimalValue[Digits₁] + Value[Digit]

action BaseValue[Digits] : Integer Integer

BaseValue[Digits  Digit](base: Integer)
  = let d: Integer = Value[Digit]
     in if d < base
         then d
         else throw syntaxError

BaseValue[Digits  Digits₁ Digit](base: Integer)
  = let d: Integer = Value[Digit]
     in if d < base
         then base*BaseValue[Digits₁](base) + d
         else throw syntaxError

action Value[Numeral] : Integer

Value[Numeral Digits] = DecimalValue[Digits]

Value[Numeral  Digits₁ # Digits₂]
  = let base: Integer = DecimalValue[Digits₂]
     in if base  2 and base  10
         then BaseValue[Digits₁](base)
         else throw syntaxError

Action names are written in violet cursive type. The last action definition states in the example above that the action Value can be applied to any expansion of the nonterminal Numeral, and the result is an Integer. This action maps all acceptable inputs to integers or syntaxError. If the result is syntaxError, then the input satisfies the grammar but contains an error detected by the semantics; this is the case for the input “30#2”. A result of would indicate a nonterminating computation; this cannot happen in this example.

There are two definitions of the Value action on Numeral, one for each grammar production that expands Numeral. Each definition of an action is allowed to call actions on the terminals and nonterminals on the right side of the expansion. For example, Value applied to the first Numeral production (the one that expands Numeral into Digits) simply applies the DecimalValue action to the expansion of the nonterminal Digits and returns the result. On the other hand, Value applied to the second Numeral production (the one that expands Numeral into Digits # Digits) performs a computation using the results of the DecimalValue and BaseValue applied to the two expansions of the Digits nonterminals. In this case there are two identical nonterminals Digits on the right side of the expansion, so we use subscripts to indicate on which one we're calling the actions DecimalValue and BaseValue.

The BaseValue action illustrates a syntactic sugar for defining an action that is a function; this syntactic sugar is analogous to that for defining global functions.

The Value action on Digit illustrates the direct use of a nonterminal in a semantic expression: digitValue(Digit). Here the Digit semantic expression evaluates to the character into which the Digit grammar rule expands.

We can fully evaluate the semantics on our sample inputs to get the following results:

Input	Semantic Result
`37`	37
`33#4`	15
`30#2`	syntaxError

Action Declarations

action Action[nonterminal] : type

This declaration states that action Action is defined on nonterminal nonterminal. Any reference to action Action[nonterminal] in a semantic expression returns a value of type type. The values of action Action must be defined using action definitions for each grammar production that has nonterminal on the left side.

Action Definitions

Action[nonterminal expansion] = expr

This notation defines the value of action Action on nonterminal nonterminal in the case where nonterminal nonterminal expands to the given expansion. expansion can contain zero or more terminals and nonterminals (as well as other notations allowed on the right side of a grammar production). Furthermore, the terminals and nonterminals of expansion can be subscripted to allow them to be unambiguously referenced by action references or nonterminal references inside expr.

The type of action Action on nonterminal nonterminal must be declared using an action declaration. expr must have the type given by that action declaration.

nonterminal expansion must be one of the productions in the grammar.

Action Function Definitions

Action[nonterminal expansion](param1: type1, ... , paramn: typen) = body

This notation is a syntactic sugar for defining an action whose value is a function. This notation is equivalent to:

Action[nonterminal expansion] =
function(param1: type1, ... , paramn: typen) body

Combined Action Declarations and Definitions

action Action[nonterminal] : type = expr

This declaration is sometimes used when all expansions of nonterminal nonterminal share the same action semantics. This declaration states both the type type of action Action on nonterminal nonterminal as well as that action's value expr. Note that the expansions are not given between the square brackets, and expr can refer only to the nonterminal nonterminal on the left side of grammar productions. No additional action definitions are needed for nonterminal nonterminal.

See the Value action on Digit in the example above for an example of this declaration.

JavaScript 2.0

Formal Description

Stages

Thursday, November 11, 1999

This page is out of date

The source code is processed in the following stages:

If necessary, convert the source code into the Unicode UTF-16 format, normalized form C.
Split the source code into tokens using the lexer grammar and lexer semantics.
Parse the resulting sequence of tokens using the parser grammar and evaluate it using the parser semantics [To be provided].

Lexing

Processing stage 2 is done as follows:

Let tokens be an empty array of Token metalanguage records. (As defined in the lexer semantics, a Token can be either an identifier, a keyword, a punctuation symbol, a number, a number with a unit, a string, or the end token.)
Let input be the input sequence of Unicode characters. Append a special placeholder End to the end of input.
Let regExpMayFollow be a Boolean variable. Initialize it to true.
Apply the lexer grammar to parse the longest possible prefix of input. If regExpMayFollow is true, use the start symbol NextToken^re. If regExpMayFollow is false, use the start symbol NextToken^div. The result of the parse should be a parse tree T. If the parse failed, return a syntax error.
Compute the action Token on T to obtain a Token t. If t is the end token, return the tokens array and go to the parse stage.
Append t to the end of the tokens array.
Compute the action RegExpMayFollow on T to obtain a Boolean value and assign that value to the regExpMayFollow variable.
Remove the characters matched by T from input, leaving only the yet-unparsed suffix of input.
Go to step 4.

If an implementation encounters an error while lexing, it is permitted to either report the error immediately or defer it until the affected token would actually be used by the parser. This flexibility allows an implementation to do lexing at the same time it parses the source program.

Provide language prohibiting an identifier from immediately following a number. This will fall out of the revised definition of QuantityLiteral.

Show mapping from Token structures to parser grammar terminals (obvious, but needs to be written).

Parsing

To be provided

JavaScript 2.0

Formal Description

Lexer Grammar

Monday, December 6, 1999

This LALR(1) grammar describes the lexer syntax of the JavaScript 2.0 proposal. See also the description of the grammar notation.

This document is also available as a Word 98 rtf file.

The start symbols are: NextToken^unit if the previous token was a number; NextToken^re if the previous token was not a number and a / should be interpreted as a regular expression; and NextToken^div if the previous token was not a number and a / should be interpreted as a division or division-assignment operator.

Unicode Character Classes

UnicodeCharacter Any Unicode character

UnicodeInitialAlphabetic Any Unicode initial alphabetic character (includes ASCII A-Z and a-z)

UnicodeAlphanumeric Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)

WhiteSpaceCharacter

«TAB» | «VT» | «FF» | «SP» | «u00A0»

| «u2000» | «u2001» | «u2002» | «u2003» | «u2004» | «u2005» | «u2006» | «u2007»

| «u2008» | «u2009» | «u200A» | «u200B»

| «u3000»

LineTerminator «LF» | «CR» | «u2028» | «u2029»

ASCIIDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Comments

LineComment / / LineCommentCharacters

LineCommentCharacters

«empty»

| LineCommentCharacters NonTerminator

NonTerminator UnicodeCharacter except LineTerminator

SingleLineBlockComment / * BlockCommentCharacters * /

BlockCommentCharacters

«empty»

| BlockCommentCharacters NonTerminatorOrSlash

| PreSlashCharacters /

PreSlashCharacters

«empty»

| BlockCommentCharacters NonTerminatorOrAsteriskOrSlash

| PreSlashCharacters /

NonTerminatorOrSlash NonTerminator except /

NonTerminatorOrAsteriskOrSlash NonTerminator except * | /

MultiLineBlockComment / * MultiLineBlockCommentCharacters BlockCommentCharacters * /

MultiLineBlockCommentCharacters

BlockCommentCharacters LineTerminator

| MultiLineBlockCommentCharacters BlockCommentCharacters LineTerminator

White space

WhiteSpace

«empty»

| WhiteSpace WhiteSpaceCharacter

| WhiteSpace SingleLineBlockComment

Line breaks

LineBreak

LineTerminator

| LineComment LineTerminator

| MultiLineBlockComment

LineBreaks

LineBreak

| LineBreaks WhiteSpace LineBreak

Tokens

t {re, div, unit}

NextToken^re WhiteSpace Token^re

NextToken^div WhiteSpace Token^div

NextToken^unit

[lookahead{OrdinaryContinuingIdentifierCharacter, \}] WhiteSpace Token^div

| [lookahead{_}] IdentifierName

| _ IdentifierName

Token^re

LineBreaks

| Punctuator

| NumericLiteral

| StringLiteral

| RegExpLiteral

| EndOfInput

Token^div

LineBreaks

| Punctuator

| DivisionPunctuator

| NumericLiteral

| StringLiteral

| EndOfInput

EndOfInput

End

| LineComment End

Keywords and identifiers

IdentifierName

InitialIdentifierCharacter

| IdentifierName ContinuingIdentifierCharacter

InitialIdentifierCharacter

OrdinaryInitialIdentifierCharacter

| \ HexEscape

OrdinaryInitialIdentifierCharacter UnicodeInitialAlphabetic | $ | _

ContinuingIdentifierCharacter

OrdinaryContinuingIdentifierCharacter

| \ HexEscape

OrdinaryContinuingIdentifierCharacter UnicodeAlphanumeric | $ | _

IdentifierOrReservedWord IdentifierName

Punctuators

Punctuator

!

| ! =

| ! = =

| #

| %

| % =

| &

| & &

| & & =

| & =

| (

| )

| *

| * =

| +

| + +

| + =

| ,

| -

| - -

| - =

| - >

| .

| . .

| . . .

| :

| : :

| ;

| <

| < <

| < < =

| < =

| =

| = =

| = = =

| >

| > =

| > >

| > > =

| > > >

| > > > =

| ?

| @

| [

| ]

| ^

| ^ =

| ^ ^

| ^ ^ =

| {

| |

| | =

| | |

| | | =

| }

| ~

DivisionPunctuator

/ [lookahead{/, *}]

| / =

Numeric literals

NumericLiteral

DecimalLiteral

| HexIntegerLiteral [lookahead{HexDigit}]

DecimalLiteral

Mantissa

| Mantissa LetterE SignedInteger

LetterE E | e

Mantissa

DecimalIntegerLiteral

| DecimalIntegerLiteral .

| DecimalIntegerLiteral . Fraction

| . Fraction

DecimalIntegerLiteral

0

NonZeroDecimalDigits

NonZeroDigit

| NonZeroDecimalDigits ASCIIDigit

NonZeroDigit 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Fraction DecimalDigits

SignedInteger

DecimalDigits

| + DecimalDigits

| - DecimalDigits

DecimalDigits

ASCIIDigit

| DecimalDigits ASCIIDigit

HexIntegerLiteral

0 LetterX HexDigit

| HexIntegerLiteral HexDigit

LetterX X | x

HexDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f

String literals

q {single, double}

StringLiteral

' StringChars^single '

| " StringChars^double "

StringChars^q

«empty»

| StringChars^q StringChar^q

StringChar^q

LiteralStringChar^q

| \ StringEscape

LiteralStringChar^single UnicodeCharacter except ' | \ | LineTerminator

LiteralStringChar^double UnicodeCharacter except " | \ | LineTerminator

StringEscape

ControlEscape

| ZeroEscape

| HexEscape

| IdentityEscape

IdentityEscape NonTerminator except UnicodeAlphanumeric

ControlEscape

b

| f

| n

| r

| t

| v

ZeroEscape 0 [lookahead{ASCIIDigit}]

HexEscape

x HexDigit HexDigit

| u HexDigit HexDigit HexDigit HexDigit

Regular expression literals

RegExpLiteral RegExpBody RegExpFlags

RegExpFlags

«empty»

| RegExpFlags ContinuingIdentifierCharacter

RegExpBody / [lookahead{*}] RegExpChars /

RegExpChars

RegExpChar

| RegExpChars RegExpChar

RegExpChar

OrdinaryRegExpChar

| \ NonTerminator

OrdinaryRegExpChar NonTerminator except \ | /

JavaScript 2.0

Formal Description

Lexer Semantics

Monday, December 6, 1999

The lexer semantics describe the actions the lexer takes in order to transform an input stream of Unicode characters into a stream of tokens. For convenience, the lexer grammar is repeated here. See also the description of the semantic notation.

This document is also available as a Word 98 rtf file.

Semantics

type SemanticException = oneof {syntaxError}

Unicode Character Classes

Syntax

UnicodeCharacter Any Unicode character

UnicodeInitialAlphabetic Any Unicode initial alphabetic character (includes ASCII A-Z and a-z)

UnicodeAlphanumeric Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)

WhiteSpaceCharacter

«TAB» | «VT» | «FF» | «SP» | «u00A0»

| «u2000» | «u2001» | «u2002» | «u2003» | «u2004» | «u2005» | «u2006» | «u2007»

| «u2008» | «u2009» | «u200A» | «u200B»

| «u3000»

LineTerminator «LF» | «CR» | «u2028» | «u2029»

ASCIIDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Semantics

action DecimalValue[ASCIIDigit] : Integer = digitValue(ASCIIDigit)

Comments

Syntax

LineComment / / LineCommentCharacters

LineCommentCharacters

«empty»

| LineCommentCharacters NonTerminator

NonTerminator UnicodeCharacter except LineTerminator

SingleLineBlockComment / * BlockCommentCharacters * /

BlockCommentCharacters

«empty»

| BlockCommentCharacters NonTerminatorOrSlash

| PreSlashCharacters /

PreSlashCharacters

«empty»

| BlockCommentCharacters NonTerminatorOrAsteriskOrSlash

| PreSlashCharacters /

NonTerminatorOrSlash NonTerminator except /

NonTerminatorOrAsteriskOrSlash NonTerminator except * | /

MultiLineBlockComment / * MultiLineBlockCommentCharacters BlockCommentCharacters * /

MultiLineBlockCommentCharacters

BlockCommentCharacters LineTerminator

| MultiLineBlockCommentCharacters BlockCommentCharacters LineTerminator

White space

Syntax

WhiteSpace

«empty»

| WhiteSpace WhiteSpaceCharacter

| WhiteSpace SingleLineBlockComment

Line breaks

Syntax

LineBreak

LineTerminator

| LineComment LineTerminator

| MultiLineBlockComment

LineBreaks

LineBreak

| LineBreaks WhiteSpace LineBreak

Tokens

Syntax

t {re, div, unit}

NextToken^re WhiteSpace Token^re

NextToken^div WhiteSpace Token^div

NextToken^unit

[lookahead{OrdinaryContinuingIdentifierCharacter, \}] WhiteSpace Token^div

| [lookahead{_}] IdentifierName

| _ IdentifierName

Semantics

action Token[NextToken^t] : Token

Token[NextToken^re WhiteSpace Token^re] = Token[Token^re]

Token[NextToken^div WhiteSpace Token^div] = Token[Token^div]

Token[NextToken^unit [lookahead{OrdinaryContinuingIdentifierCharacter, \}] WhiteSpace Token^div]
= Token[Token^div]

Token[NextToken^unit [lookahead{_}] IdentifierName] = string Name[IdentifierName]

Token[NextToken^unit _ IdentifierName] = string Name[IdentifierName]

Syntax

Token^re

LineBreaks

| Punctuator

| NumericLiteral

| StringLiteral

| RegExpLiteral

| EndOfInput

Token^div

LineBreaks

| Punctuator

| DivisionPunctuator

| NumericLiteral

| StringLiteral

| EndOfInput

EndOfInput

End

| LineComment End

Semantics

type RegExp = tuple {reBody: String; reFlags: String}

type Quantity = tuple {amount: Double; unit: String}

type Token
  = oneof {
           lineBreak;
           identifier: String;
           keyword: String;
           punctuator: String;
           number: Double;
           string: String;
           regularExpression: RegExp;
           end}

action Token[Token^t] : Token

Token[Token^t LineBreaks] = lineBreak

Token[Token^t IdentifierOrReservedWord] = Token[IdentifierOrReservedWord]

Token[Token^t Punctuator] = punctuator Punctuator[Punctuator]

Token[Token^div DivisionPunctuator] = punctuator Punctuator[DivisionPunctuator]

Token[Token^t NumericLiteral] = number DoubleValue[NumericLiteral]

Token[Token^t StringLiteral] = string StringValue[StringLiteral]

Token[Token^re RegExpLiteral] = regularExpression REValue[RegExpLiteral]

Token[Token^t EndOfInput] = end

Keywords and identifiers

Syntax

IdentifierName

InitialIdentifierCharacter

| IdentifierName ContinuingIdentifierCharacter

InitialIdentifierCharacter

OrdinaryInitialIdentifierCharacter

| \ HexEscape

OrdinaryInitialIdentifierCharacter UnicodeInitialAlphabetic | $ | _

ContinuingIdentifierCharacter

OrdinaryContinuingIdentifierCharacter

| \ HexEscape

OrdinaryContinuingIdentifierCharacter UnicodeAlphanumeric | $ | _

Semantics

action Name[IdentifierName] : String

Name[IdentifierName InitialIdentifierCharacter]
= [CharacterValue[InitialIdentifierCharacter]]

Name[IdentifierName IdentifierName₁ ContinuingIdentifierCharacter]
= Name[IdentifierName₁] [CharacterValue[ContinuingIdentifierCharacter]]

action ContainsEscapes[IdentifierName] : Boolean

ContainsEscapes[IdentifierName InitialIdentifierCharacter]
= ContainsEscapes[InitialIdentifierCharacter]

ContainsEscapes[IdentifierName IdentifierName₁ ContinuingIdentifierCharacter]
= ContainsEscapes[IdentifierName₁] or ContainsEscapes[ContinuingIdentifierCharacter]

action CharacterValue[InitialIdentifierCharacter] : Character

CharacterValue[InitialIdentifierCharacter OrdinaryInitialIdentifierCharacter]
= OrdinaryInitialIdentifierCharacter

CharacterValue[InitialIdentifierCharacter  \ HexEscape]
  = if isOrdinaryInitialIdentifierCharacter(CharacterValue[HexEscape])
     then CharacterValue[HexEscape]
     else throw syntaxError

action ContainsEscapes[InitialIdentifierCharacter] : Boolean

ContainsEscapes[InitialIdentifierCharacter OrdinaryInitialIdentifierCharacter] = false

ContainsEscapes[InitialIdentifierCharacter \ HexEscape] = true

action CharacterValue[ContinuingIdentifierCharacter] : Character

CharacterValue[ContinuingIdentifierCharacter OrdinaryContinuingIdentifierCharacter]
= OrdinaryContinuingIdentifierCharacter

CharacterValue[ContinuingIdentifierCharacter  \ HexEscape]
  = if isOrdinaryContinuingIdentifierCharacter(CharacterValue[HexEscape])
     then CharacterValue[HexEscape]
     else throw syntaxError

action ContainsEscapes[ContinuingIdentifierCharacter] : Boolean

ContainsEscapes[ContinuingIdentifierCharacter OrdinaryContinuingIdentifierCharacter]
= false

ContainsEscapes[ContinuingIdentifierCharacter \ HexEscape] = true

reservedWords : String[]
  = [“abstract”,
      “break”,
      “case”,
      “catch”,
      “class”,
      “const”,
      “continue”,
      “debugger”,
      “default”,
      “delete”,
      “do”,
      “else”,
      “enum”,
      “eval”,
      “export”,
      “extends”,
      “false”,
      “final”,
      “finally”,
      “for”,
      “function”,
      “goto”,
      “if”,
      “implements”,
      “import”,
      “in”,
      “instanceof”,
      “native”,
      “new”,
      “null”,
      “package”,
      “private”,
      “protected”,
      “public”,
      “return”,
      “static”,
      “super”,
      “switch”,
      “synchronized”,
      “this”,
      “throw”,
      “throws”,
      “transient”,
      “true”,
      “try”,
      “typeof”,
      “var”,
      “volatile”,
      “while”,
      “with”]

nonReservedWords : String[]
  = [“box”,
      “constructor”,
      “field”,
      “get”,
      “language”,
      “local”,
      “method”,
      “override”,
      “set”,
      “version”]

keywords : String[] = reservedWords nonReservedWords

member(id: String, list: String[]) : Boolean
  = if |list| = 0
     then false
     else if id = list[0]
     then true
     else member(id, list[1 ...])

Syntax

IdentifierOrReservedWord IdentifierName

Semantics

action Token[IdentifierOrReservedWord] : Token

Token[IdentifierOrReservedWord  IdentifierName]
  = let id: String = Name[IdentifierName]
     in if member(id, keywords) and not ContainsEscapes[IdentifierName]
         then keyword id
         else identifier id

Punctuators

Syntax

Punctuator

!

| ! =

| ! = =

| #

| %

| % =

| &

| & &

| & & =

| & =

| (

| )

| *

| * =

| +

| + +

| + =

| ,

| -

| - -

| - =

| - >

| .

| . .

| . . .

| :

| : :

| ;

| <

| < <

| < < =

| < =

| =

| = =

| = = =

| >

| > =

| > >

| > > =

| > > >

| > > > =

| ?

| @

| [

| ]

| ^

| ^ =

| ^ ^

| ^ ^ =

| {

| |

| | =

| | |

| | | =

| }

| ~

DivisionPunctuator

/ [lookahead{/, *}]

| / =

Semantics

action Punctuator[Punctuator] : String

Punctuator[Punctuator !] = “!”

Punctuator[Punctuator ! =] = “!=”

Punctuator[Punctuator ! = =] = “!==”

Punctuator[Punctuator #] = “#”

Punctuator[Punctuator %] = “%”

Punctuator[Punctuator % =] = “%=”

Punctuator[Punctuator &] = “&”

Punctuator[Punctuator & &] = “&&”

Punctuator[Punctuator & & =] = “&&=”

Punctuator[Punctuator & =] = “&=”

Punctuator[Punctuator (] = “(”

Punctuator[Punctuator )] = “)”

Punctuator[Punctuator *] = “*”

Punctuator[Punctuator * =] = “*=”

Punctuator[Punctuator +] = “+”

Punctuator[Punctuator + +] = “++”

Punctuator[Punctuator + =] = “+=”

Punctuator[Punctuator ,] = “,”

Punctuator[Punctuator -] = “-”

Punctuator[Punctuator - -] = “--”

Punctuator[Punctuator - =] = “-=”

Punctuator[Punctuator - >] = “->”

Punctuator[Punctuator .] = “.”

Punctuator[Punctuator . .] = “..”

Punctuator[Punctuator . . .] = “...”

Punctuator[Punctuator :] = “:”

Punctuator[Punctuator : :] = “::”

Punctuator[Punctuator ;] = “;”

Punctuator[Punctuator <] = “<”

Punctuator[Punctuator < <] = “<<”

Punctuator[Punctuator < < =] = “<<=”

Punctuator[Punctuator < =] = “<=”

Punctuator[Punctuator =] = “=”

Punctuator[Punctuator = =] = “==”

Punctuator[Punctuator = = =] = “===”

Punctuator[Punctuator >] = “>”

Punctuator[Punctuator > =] = “>=”

Punctuator[Punctuator > >] = “>>”

Punctuator[Punctuator > > =] = “>>=”

Punctuator[Punctuator > > >] = “>>>”

Punctuator[Punctuator > > > =] = “>>>=”

Punctuator[Punctuator ?] = “?”

Punctuator[Punctuator @] = “@”

Punctuator[Punctuator [] = “[”

Punctuator[Punctuator ]] = “]”

Punctuator[Punctuator ^] = “^”

Punctuator[Punctuator ^ =] = “^=”

Punctuator[Punctuator ^ ^] = “^^”

Punctuator[Punctuator ^ ^ =] = “^^=”

Punctuator[Punctuator {] = “{”

Punctuator[Punctuator |] = “|”

Punctuator[Punctuator | =] = “|=”

Punctuator[Punctuator | |] = “||”

Punctuator[Punctuator | | =] = “||=”

Punctuator[Punctuator }] = “}”

Punctuator[Punctuator ~] = “~”

action Punctuator[DivisionPunctuator] : String

Punctuator[DivisionPunctuator / [lookahead{/, *}]] = “/”

Punctuator[DivisionPunctuator / =] = “/=”

Numeric literals

Syntax

NumericLiteral

DecimalLiteral

| HexIntegerLiteral [lookahead{HexDigit}]

Semantics

action DoubleValue[NumericLiteral] : Double

DoubleValue[NumericLiteral DecimalLiteral]
= rationalToDouble(RationalValue[DecimalLiteral])

DoubleValue[NumericLiteral HexIntegerLiteral [lookahead{HexDigit}]]
= rationalToDouble(IntegerValue[HexIntegerLiteral])

expt(base: Rational, exponent: Integer) : Rational
  = if exponent = 0
     then 1
     else if exponent < 0
     then 1/expt(base, -exponent)
     else base*expt(base, exponent - 1)

Syntax

DecimalLiteral

Mantissa

| Mantissa LetterE SignedInteger

LetterE E | e

Mantissa

DecimalIntegerLiteral

| DecimalIntegerLiteral .

| DecimalIntegerLiteral . Fraction

| . Fraction

DecimalIntegerLiteral

0

NonZeroDecimalDigits

NonZeroDigit

| NonZeroDecimalDigits ASCIIDigit

NonZeroDigit 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Fraction DecimalDigits

Semantics

action RationalValue[DecimalLiteral] : Rational

RationalValue[DecimalLiteral Mantissa] = RationalValue[Mantissa]

RationalValue[DecimalLiteral Mantissa LetterE SignedInteger]
= RationalValue[Mantissa]*expt(10, IntegerValue[SignedInteger])

action RationalValue[Mantissa] : Rational

RationalValue[Mantissa DecimalIntegerLiteral] = IntegerValue[DecimalIntegerLiteral]

RationalValue[Mantissa DecimalIntegerLiteral .] = IntegerValue[DecimalIntegerLiteral]

RationalValue[Mantissa DecimalIntegerLiteral . Fraction]
= IntegerValue[DecimalIntegerLiteral] + RationalValue[Fraction]

RationalValue[Mantissa . Fraction] = RationalValue[Fraction]

action IntegerValue[DecimalIntegerLiteral] : Integer

IntegerValue[DecimalIntegerLiteral 0] = 0

IntegerValue[DecimalIntegerLiteral NonZeroDecimalDigits]
= IntegerValue[NonZeroDecimalDigits]

action IntegerValue[NonZeroDecimalDigits] : Integer

IntegerValue[NonZeroDecimalDigits NonZeroDigit] = DecimalValue[NonZeroDigit]

IntegerValue[NonZeroDecimalDigits NonZeroDecimalDigits₁ ASCIIDigit]
= 10*IntegerValue[NonZeroDecimalDigits₁] + DecimalValue[ASCIIDigit]

action DecimalValue[NonZeroDigit] : Integer = digitValue(NonZeroDigit)

action RationalValue[Fraction] : Rational

RationalValue[Fraction DecimalDigits]
= IntegerValue[DecimalDigits]/expt(10, NDigits[DecimalDigits])

Syntax

SignedInteger

DecimalDigits

| + DecimalDigits

| - DecimalDigits

Semantics

action IntegerValue[SignedInteger] : Integer

IntegerValue[SignedInteger DecimalDigits] = IntegerValue[DecimalDigits]

IntegerValue[SignedInteger + DecimalDigits] = IntegerValue[DecimalDigits]

IntegerValue[SignedInteger - DecimalDigits] = -IntegerValue[DecimalDigits]

Syntax

DecimalDigits

ASCIIDigit

| DecimalDigits ASCIIDigit

Semantics

action IntegerValue[DecimalDigits] : Integer

IntegerValue[DecimalDigits ASCIIDigit] = DecimalValue[ASCIIDigit]

IntegerValue[DecimalDigits DecimalDigits₁ ASCIIDigit]
= 10*IntegerValue[DecimalDigits₁] + DecimalValue[ASCIIDigit]

action NDigits[DecimalDigits] : Integer

NDigits[DecimalDigits ASCIIDigit] = 1

NDigits[DecimalDigits DecimalDigits₁ ASCIIDigit] = NDigits[DecimalDigits₁] + 1

Syntax

HexIntegerLiteral

0 LetterX HexDigit

| HexIntegerLiteral HexDigit

LetterX X | x

HexDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f

Semantics

action IntegerValue[HexIntegerLiteral] : Integer

IntegerValue[HexIntegerLiteral 0 LetterX HexDigit] = HexValue[HexDigit]

IntegerValue[HexIntegerLiteral HexIntegerLiteral₁ HexDigit]
= 16*IntegerValue[HexIntegerLiteral₁] + HexValue[HexDigit]

action HexValue[HexDigit] : Integer = digitValue(HexDigit)

String literals

Syntax

q {single, double}

StringLiteral

' StringChars^single '

| " StringChars^double "

Semantics

action StringValue[StringLiteral] : String

StringValue[StringLiteral ' StringChars^single '] = StringValue[StringChars^single]

StringValue[StringLiteral " StringChars^double "] = StringValue[StringChars^double]

Syntax

StringChars^q

«empty»

| StringChars^q StringChar^q

StringChar^q

LiteralStringChar^q

| \ StringEscape

LiteralStringChar^single UnicodeCharacter except ' | \ | LineTerminator

LiteralStringChar^double UnicodeCharacter except " | \ | LineTerminator

Semantics

action StringValue[StringChars^q] : String

StringValue[StringChars^q «empty»] = “”

StringValue[StringChars^q StringChars^q₁ StringChar^q]
= StringValue[StringChars^q₁] [CharacterValue[StringChar^q]]

action CharacterValue[StringChar^q] : Character

CharacterValue[StringChar^q LiteralStringChar^q] = LiteralStringChar^q

CharacterValue[StringChar^q \ StringEscape] = CharacterValue[StringEscape]

Syntax

StringEscape

ControlEscape

| ZeroEscape

| HexEscape

| IdentityEscape

IdentityEscape NonTerminator except UnicodeAlphanumeric

Semantics

action CharacterValue[StringEscape] : Character

CharacterValue[StringEscape ControlEscape] = CharacterValue[ControlEscape]

CharacterValue[StringEscape ZeroEscape] = CharacterValue[ZeroEscape]

CharacterValue[StringEscape HexEscape] = CharacterValue[HexEscape]

CharacterValue[StringEscape IdentityEscape] = IdentityEscape

Syntax

ControlEscape

b

| f

| n

| r

| t

| v

Semantics

action CharacterValue[ControlEscape] : Character

CharacterValue[ControlEscape b] = ‘«BS»’

CharacterValue[ControlEscape f] = ‘«FF»’

CharacterValue[ControlEscape n] = ‘«LF»’

CharacterValue[ControlEscape r] = ‘«CR»’

CharacterValue[ControlEscape t] = ‘«TAB»’

CharacterValue[ControlEscape v] = ‘«VT»’

Syntax

ZeroEscape 0 [lookahead{ASCIIDigit}]

Semantics

action CharacterValue[ZeroEscape] : Character

CharacterValue[ZeroEscape 0 [lookahead{ASCIIDigit}]] = ‘«NUL»’

Syntax

HexEscape

x HexDigit HexDigit

| u HexDigit HexDigit HexDigit HexDigit

Semantics

action CharacterValue[HexEscape] : Character

CharacterValue[HexEscape x HexDigit₁ HexDigit₂]
= codeToCharacter(16*HexValue[HexDigit₁] + HexValue[HexDigit₂])

CharacterValue[HexEscape  u HexDigit₁ HexDigit₂ HexDigit₃ HexDigit₄]
  = codeToCharacter(
         4096*HexValue[HexDigit₁] + 256*HexValue[HexDigit₂] + 16*HexValue[HexDigit₃] +
         HexValue[HexDigit₄])

Regular expression literals

Syntax

RegExpLiteral RegExpBody RegExpFlags

RegExpFlags

«empty»

| RegExpFlags ContinuingIdentifierCharacter

RegExpBody / [lookahead{*}] RegExpChars /

RegExpChars

RegExpChar

| RegExpChars RegExpChar

RegExpChar

OrdinaryRegExpChar

| \ NonTerminator

OrdinaryRegExpChar NonTerminator except \ | /

Semantics

action REValue[RegExpLiteral] : RegExp

REValue[RegExpLiteral RegExpBody RegExpFlags]
= reBody REBody[RegExpBody], reFlags REFlags[RegExpFlags]

action REFlags[RegExpFlags] : String

REFlags[RegExpFlags «empty»] = “”

REFlags[RegExpFlags RegExpFlags₁ ContinuingIdentifierCharacter]
= REFlags[RegExpFlags₁] [CharacterValue[ContinuingIdentifierCharacter]]

action REBody[RegExpBody] : String

REBody[RegExpBody / [lookahead{*}] RegExpChars /] = REBody[RegExpChars]

action REBody[RegExpChars] : String

REBody[RegExpChars RegExpChar] = REBody[RegExpChar]

REBody[RegExpChars RegExpChars₁ RegExpChar]
= REBody[RegExpChars₁] REBody[RegExpChar]

action REBody[RegExpChar] : String

REBody[RegExpChar OrdinaryRegExpChar] = [OrdinaryRegExpChar]

REBody[RegExpChar \ NonTerminator] = [‘\’, NonTerminator]

JavaScript 2.0

Formal Description

Regular Expression Grammar

Thursday, November 11, 1999

This LR(1) grammar describes the regular expression syntax of the JavaScript 2.0 proposal. See also the description of the grammar notation.

This document is also available as a Word 98 rtf file.

Unicode Character Classes

UnicodeCharacter Any Unicode character

UnicodeAlphanumeric Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)

LineTerminator «LF» | «CR» | «u2028» | «u2029»

Regular Expression Definitions

Regular Expression Patterns

RegularExpressionPattern Disjunction

Disjunctions

Disjunction

Alternative

| Alternative | Disjunction

Alternatives

Alternative

«empty»

| Alternative Term

Terms

Term

Assertion

| Atom

| Atom Quantifier

Quantifier

QuantifierPrefix

| QuantifierPrefix ?

QuantifierPrefix

*

| +

| ?

| { DecimalDigits }

| { DecimalDigits , }

| { DecimalDigits , DecimalDigits }

DecimalDigits

DecimalDigit

| DecimalDigits DecimalDigit

DecimalDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Assertions

Assertion

^

| $

| \ b

| \ B

Atoms

Atom

PatternCharacter

| .

| \ AtomEscape

| CharacterClass

| ( Disjunction )

| ( ? : Disjunction )

| ( ? = Disjunction )

| ( ? ! Disjunction )

PatternCharacter UnicodeCharacter except ^ | $ | \ | . | * | + | ? | ( | ) | [ | ] | { | } | |

Escapes

AtomEscape

DecimalEscape

| CharacterEscape

CharacterEscape

ControlEscape

| c ControlLetter

| HexEscape

| IdentityEscape

ControlLetter

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z

| a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z

IdentityEscape UnicodeCharacter except UnicodeAlphanumeric

ControlEscape

f

| n

| r

| t

| v

Decimal Escapes

DecimalEscape DecimalIntegerLiteral [lookahead{DecimalDigit}]

DecimalIntegerLiteral

0

NonZeroDecimalDigits

NonZeroDigit

| NonZeroDecimalDigits DecimalDigit

NonZeroDigit 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Hexadecimal Escapes

HexEscape

x HexDigit HexDigit

| u HexDigit HexDigit HexDigit HexDigit

HexDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f

Character Class Escapes

CharacterClassEscape

s

| S

| d

| D

| w

| W

User-Specified Character Classes

CharacterClass

[ [lookahead{^}] ClassRanges ]

| [ ^ ClassRanges ]

ClassRanges

«empty»

| NonemptyClassRanges^dash

d {dash, noDash}

NonemptyClassRanges^d

ClassAtom^dash

| ClassAtom^d NonemptyClassRanges^noDash

| ClassAtom^d - ClassAtom^dash ClassRanges

Character Class Range Atoms

ClassAtom^d

ClassCharacter^d

| \ ClassEscape

ClassCharacter^dash UnicodeCharacter except \ | ]

ClassCharacter^noDash ClassCharacter^dash except -

ClassEscape

DecimalEscape

| b

| CharacterEscape

JavaScript 2.0

Formal Description

Regular Expression Semantics

Thursday, November 11, 1999

The regular expression semantics describe the actions the regular expression engine takes in order to transform a regular expression pattern into a function for matching against input strings. For convenience, the regular expression grammar is repeated here. See also the description of the semantic notation.

This document is also available as a Word 98 rtf file.

The regular expression semantics below are working (except for case-insensitive matches) and have been tried on sample cases, but they could be formatted better.

Semantics

type SemanticException = oneof {syntaxError}

Unicode Character Classes

Syntax

UnicodeCharacter Any Unicode character

UnicodeAlphanumeric Any Unicode alphabetic or decimal digit character (includes ASCII 0-9, A-Z, and a-z)

LineTerminator «LF» | «CR» | «u2028» | «u2029»

Semantics

lineTerminators : {Character} = {‘«LF»’, ‘«CR»’, ‘«u2028»’, ‘«u2029»’}

reWhitespaces : {Character} = {‘«FF»’, ‘«LF»’, ‘«CR»’, ‘«TAB»’, ‘«VT»’, ‘ ’}

reDigits : {Character} = {‘0’ ... ‘9’}

reWordCharacters : {Character} = {‘0’ ... ‘9’, ‘A’ ... ‘Z’, ‘a’ ... ‘z’, ‘_’}

Regular Expression Definitions

Semantics

type REInput = tuple {str: String; ignoreCase: Boolean; multiline: Boolean}

Field str is the input string. ignoreCase and multiline are the corresponding regular expression flags.

type REResult = oneof {success: REMatch; failure}

type REMatch = tuple {endIndex: Integer; captures: Capture[]}

A REMatch holds an intermediate state during the pattern-matching process. endIndex is the index of the next input character to be matched by the next component in a regular expression pattern. If we are at the end of the pattern, endIndex is one plus the index of the last matched input character. captures is a zero-based array of the strings captured so far by capturing parentheses.

type Capture = oneof {present: String; absent}

type Continuation = REMatch REResult

A Continuation is a function that attempts to match the remaining portion of the pattern against the input string, starting at the intermediate state given by its REMatch argument. If a match is possible, it returns a success result that contains the final REMatch state; if no match is possible, it returns a failure result.

type Matcher = REInput REMatch Continuation REResult

A Matcher is a function that attempts to match a middle portion of the pattern against the input string, starting at the intermediate state given by its REMatch argument. Since the remainder of the pattern heavily influences whether (and how) a middle portion will match, we must pass in a Continuation function that checks whether the rest of the pattern matched. If the continuation returns failure, the matcher function may call it repeatedly, trying various alternatives at pattern choice points.

The REInput parameter contains the input string and is merely passed down to subroutines.

type MatcherGenerator = Integer Matcher

A MatcherGenerator is a function executed at the time the regular expression is compiled that returns a Matcher for a part of the pattern. The Integer parameter contains the number of capturing left parentheses seen so far in the pattern and is used to assign static, consecutive numbers to capturing parentheses.

characterSetMatcher(acceptanceSet: {Character}, invert: Boolean) : Matcher
  = function(t: REInput, x: REMatch, c: Continuation)
         let i: Integer = x.endIndex;
             s: String = t.str
         in if i = |s|
             then failure
             else if s[i]  acceptanceSet xor invert
             then c(endIndex (i + 1), captures x.captures)
             else failure

characterSetMatcher returns a Matcher that matches a single input string character. If invert is false, the match succeeds if the character is a member of the acceptanceSet set of characters (possibly ignoring case). If invert is true, the match succeeds if the character is not a member of the acceptanceSet set of characters (possibly ignoring case).

characterMatcher(ch: Character) : Matcher = characterSetMatcher({ch}, false)

characterMatcher returns a Matcher that matches a single input string character. The match succeeds if the character is the same as ch (possibly ignoring case).

Regular Expression Patterns

Syntax

RegularExpressionPattern Disjunction

Semantics

action Exec[RegularExpressionPattern] : REInput Integer REResult

Exec[RegularExpressionPattern  Disjunction]
  = let match: Matcher = GenMatcher[Disjunction](0)
     in function(t: REInput, index: Integer)
             match(
                 t,
                 endIndex index, captures fillCapture(CountParens[Disjunction]),
                 successContinuation)

successContinuation(x: REMatch) : REResult = success x

fillCapture(i: Integer) : Capture[]
  = if i = 0
     then []_Capture
     else fillCapture(i - 1)  [absent]

Disjunctions

Syntax

Disjunction

Alternative

| Alternative | Disjunction

Semantics

action GenMatcher[Disjunction] : MatcherGenerator

GenMatcher[Disjunction Alternative] = GenMatcher[Alternative]

GenMatcher[Disjunction  Alternative | Disjunction₁](parenIndex: Integer)
  = let match1: Matcher = GenMatcher[Alternative](parenIndex);
         match2: Matcher = GenMatcher[Disjunction₁](parenIndex + CountParens[Alternative])
     in function(t: REInput, x: REMatch, c: Continuation)
             case match1(t, x, c) of
                success(y: REMatch): success y;
                failure: match2(t, x, c)
                end

action CountParens[Disjunction] : Integer

CountParens[Disjunction Alternative] = CountParens[Alternative]

CountParens[Disjunction Alternative | Disjunction₁]
= CountParens[Alternative] + CountParens[Disjunction₁]

Alternatives

Syntax

Alternative

«empty»

| Alternative Term

Semantics

action GenMatcher[Alternative] : MatcherGenerator

GenMatcher[Alternative  «empty»](parenIndex: Integer)
  = function(t: REInput, x: REMatch, c: Continuation)
         c(x)

GenMatcher[Alternative  Alternative₁ Term](parenIndex: Integer)
  = let match1: Matcher = GenMatcher[Alternative₁](parenIndex);
         match2: Matcher = GenMatcher[Term](parenIndex + CountParens[Alternative₁])
     in function(t: REInput, x: REMatch, c: Continuation)
             let d: Continuation
                     = function(y: REMatch)
                            match2(t, y, c)
             in match1(t, x, d)

action CountParens[Alternative] : Integer

CountParens[Alternative «empty»] = 0

CountParens[Alternative Alternative₁ Term]
= CountParens[Alternative₁] + CountParens[Term]

Terms

Syntax

Term

Assertion

| Atom

| Atom Quantifier

Semantics

action GenMatcher[Term] : MatcherGenerator

GenMatcher[Term  Assertion](parenIndex: Integer)
  = function(t: REInput, x: REMatch, c: Continuation)
         if TestAssertion[Assertion](t, x)
         then c(x)
         else failure

GenMatcher[Term Atom] = GenMatcher[Atom]

GenMatcher[Term  Atom Quantifier](parenIndex: Integer)
  = let match: Matcher = GenMatcher[Atom](parenIndex);
         min: Integer = Minimum[Quantifier];
         max: Limit = Maximum[Quantifier];
         greedy: Boolean = Greedy[Quantifier]
     in if
             (case max of
                finite(m: Integer): m < min;
                infinite: false
                end)
         then throw syntaxError
         else repeatMatcher(match, min, max, greedy, parenIndex, CountParens[Atom])

action CountParens[Term] : Integer

CountParens[Term Assertion] = 0

CountParens[Term Atom] = CountParens[Atom]

CountParens[Term Atom Quantifier] = CountParens[Atom]

Syntax

Quantifier

QuantifierPrefix

| QuantifierPrefix ?

QuantifierPrefix

*

| +

| ?

| { DecimalDigits }

| { DecimalDigits , }

| { DecimalDigits , DecimalDigits }

DecimalDigits

DecimalDigit

| DecimalDigits DecimalDigit

DecimalDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Semantics

type Limit = oneof {finite: Integer; infinite}

resetParens(x: REMatch, p: Integer, nParens: Integer) : REMatch
  = if nParens = 0
     then x
     else let y: REMatch = endIndex x.endIndex, captures x.captures[p  absent]
           in resetParens(y, p + 1, nParens - 1)

repeatMatcher(body: Matcher, min: Integer, max: Limit, greedy: Boolean, parenIndex: Integer, nBodyParens: Integer)
  : Matcher
  = function(t: REInput, x: REMatch, c: Continuation)
         if
             (case max of
                finite(m: Integer): m = 0;
                infinite: false
                end)
         then c(x)
         else let d: Continuation
                       = function(y: REMatch)
                              if min = 0 and y.endIndex = x.endIndex
                              then failure
                              else let newMin: Integer
                                            = if min = 0
                                               then 0
                                               else min - 1;
                                        newMax: Limit
                                            = case max of
                                                  finite(m: Integer): finite (m - 1);
                                                  infinite: infinite
                                                  end
                                    in repeatMatcher(
                                            body,
                                            newMin,
                                            newMax,
                                            greedy,
                                            parenIndex,
                                            nBodyParens)(t, y, c);
                   xr: REMatch = resetParens(x, parenIndex, nBodyParens)
               in if min  0
                   then body(t, xr, d)
                   else if greedy
                   then case body(t, xr, d) of
                             success(z: REMatch): success z;
                             failure: c(x)
                             end
                   else case c(x) of
                            success(z: REMatch): success z;
                            failure: body(t, xr, d)
                            end

action Minimum[Quantifier] : Integer

Minimum[Quantifier QuantifierPrefix] = Minimum[QuantifierPrefix]

Minimum[Quantifier QuantifierPrefix ?] = Minimum[QuantifierPrefix]

action Maximum[Quantifier] : Limit

Maximum[Quantifier QuantifierPrefix] = Maximum[QuantifierPrefix]

Maximum[Quantifier QuantifierPrefix ?] = Maximum[QuantifierPrefix]

action Greedy[Quantifier] : Boolean

Greedy[Quantifier QuantifierPrefix] = true

Greedy[Quantifier QuantifierPrefix ?] = false

action Minimum[QuantifierPrefix] : Integer

Minimum[QuantifierPrefix *] = 0

Minimum[QuantifierPrefix +] = 1

Minimum[QuantifierPrefix ?] = 0

Minimum[QuantifierPrefix { DecimalDigits }] = IntegerValue[DecimalDigits]

Minimum[QuantifierPrefix { DecimalDigits , }] = IntegerValue[DecimalDigits]

Minimum[QuantifierPrefix { DecimalDigits₁ , DecimalDigits₂ }]
= IntegerValue[DecimalDigits₁]

action Maximum[QuantifierPrefix] : Limit

Maximum[QuantifierPrefix *] = infinite

Maximum[QuantifierPrefix +] = infinite

Maximum[QuantifierPrefix ?] = finite 1

Maximum[QuantifierPrefix { DecimalDigits }] = finite IntegerValue[DecimalDigits]

Maximum[QuantifierPrefix { DecimalDigits , }] = infinite

Maximum[QuantifierPrefix { DecimalDigits₁ , DecimalDigits₂ }]
= finite IntegerValue[DecimalDigits₂]

action IntegerValue[DecimalDigits] : Integer

IntegerValue[DecimalDigits DecimalDigit] = DecimalValue[DecimalDigit]

IntegerValue[DecimalDigits DecimalDigits₁ DecimalDigit]
= 10*IntegerValue[DecimalDigits₁] + DecimalValue[DecimalDigit]

action DecimalValue[DecimalDigit] : Integer = digitValue(DecimalDigit)

Assertions

Syntax

Assertion

^

| $

| \ b

| \ B

Semantics

action TestAssertion[Assertion] : REInput REMatch Boolean

TestAssertion[Assertion  ^](t: REInput, x: REMatch)
  = if x.endIndex = 0
     then true
     else t.multiline and t.str[x.endIndex - 1]  lineTerminators

TestAssertion[Assertion  $](t: REInput, x: REMatch)
  = if x.endIndex = |t.str|
     then true
     else t.multiline and t.str[x.endIndex]  lineTerminators

TestAssertion[Assertion \ b](t: REInput, x: REMatch)
= atWordBoundary(x.endIndex, t.str)

TestAssertion[Assertion \ B](t: REInput, x: REMatch)
= not atWordBoundary(x.endIndex, t.str)

atWordBoundary(i: Integer, s: String) : Boolean = inWord(i - 1, s) xor inWord(i, s)

inWord(i: Integer, s: String) : Boolean
  = if i = -1 or i = |s|
     then false
     else s[i]  reWordCharacters

Atoms

Syntax

Atom

PatternCharacter

| .

| \ AtomEscape

| CharacterClass

| ( Disjunction )

| ( ? : Disjunction )

| ( ? = Disjunction )

| ( ? ! Disjunction )

PatternCharacter UnicodeCharacter except ^ | $ | \ | . | * | + | ? | ( | ) | [ | ] | { | } | |

Semantics

action GenMatcher[Atom] : MatcherGenerator

GenMatcher[Atom PatternCharacter](parenIndex: Integer)
= characterMatcher(PatternCharacter)

GenMatcher[Atom .](parenIndex: Integer) = characterSetMatcher(lineTerminators, true)

GenMatcher[Atom \ AtomEscape] = GenMatcher[AtomEscape]

GenMatcher[Atom  CharacterClass](parenIndex: Integer)
  = let a: {Character} = AcceptanceSet[CharacterClass]
     in characterSetMatcher(a, Invert[CharacterClass])

GenMatcher[Atom  ( Disjunction )](parenIndex: Integer)
  = let match: Matcher = GenMatcher[Disjunction](parenIndex + 1)
     in function(t: REInput, x: REMatch, c: Continuation)
             let d: Continuation
                     = function(y: REMatch)
                            let updatedCaptures: Capture[]
                                    = y.captures[parenIndex
                                           present t.str[x.endIndex ... y.endIndex - 1]]
                            in c(endIndex y.endIndex, captures updatedCaptures)
             in match(t, x, d)

GenMatcher[Atom ( ? : Disjunction )] = GenMatcher[Disjunction]

GenMatcher[Atom  ( ? = Disjunction )](parenIndex: Integer)
  = let match: Matcher = GenMatcher[Disjunction](parenIndex)
     in function(t: REInput, x: REMatch, c: Continuation)
             case match(t, x, successContinuation) of
                success(y: REMatch): c(endIndex x.endIndex, captures y.captures);
                failure: failure
                end

GenMatcher[Atom  ( ? ! Disjunction )](parenIndex: Integer)
  = let match: Matcher = GenMatcher[Disjunction](parenIndex)
     in function(t: REInput, x: REMatch, c: Continuation)
             case match(t, x, successContinuation) of
                success(y: REMatch): failure;
                failure: c(x)
                end

action CountParens[Atom] : Integer

CountParens[Atom PatternCharacter] = 0

CountParens[Atom .] = 0

CountParens[Atom \ AtomEscape] = 0

CountParens[Atom CharacterClass] = 0

CountParens[Atom ( Disjunction )] = CountParens[Disjunction] + 1

CountParens[Atom ( ? : Disjunction )] = CountParens[Disjunction]

CountParens[Atom ( ? = Disjunction )] = CountParens[Disjunction]

CountParens[Atom ( ? ! Disjunction )] = CountParens[Disjunction]

Escapes

Syntax

AtomEscape

DecimalEscape

| CharacterEscape

Semantics

action GenMatcher[AtomEscape] : MatcherGenerator

GenMatcher[AtomEscape  DecimalEscape](parenIndex: Integer)
  = let n: Integer = EscapeValue[DecimalEscape]
     in if n = 0
         then characterMatcher(‘«NUL»’)
         else if n > parenIndex
         then throw syntaxError
         else backreferenceMatcher(n)

GenMatcher[AtomEscape CharacterEscape](parenIndex: Integer)
= characterMatcher(CharacterValue[CharacterEscape])

GenMatcher[AtomEscape CharacterClassEscape](parenIndex: Integer)
= characterSetMatcher(AcceptanceSet[CharacterClassEscape], false)

backreferenceMatcher(n: Integer) : Matcher
  = function(t: REInput, x: REMatch, c: Continuation)
         case nthBackreference(x, n) of
            present(ref: String):
                  let i: Integer = x.endIndex;
                      s: String = t.str
                  in let j: Integer = i + |ref|
                  in if j > |s|
                      then failure
                      else if s[i ... j - 1] = ref
                      then c(endIndex j, captures x.captures)
                      else failure;
            absent: c(x)
            end

nthBackreference(x: REMatch, n: Integer) : Capture = x.captures[n - 1]

Syntax

CharacterEscape

ControlEscape

| c ControlLetter

| HexEscape

| IdentityEscape

ControlLetter

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z

| a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z

IdentityEscape UnicodeCharacter except UnicodeAlphanumeric

ControlEscape

f

| n

| r

| t

| v

Semantics

action CharacterValue[CharacterEscape] : Character

CharacterValue[CharacterEscape ControlEscape] = CharacterValue[ControlEscape]

CharacterValue[CharacterEscape c ControlLetter]
= codeToCharacter(bitwiseAnd(characterToCode(ControlLetter), 31))

CharacterValue[CharacterEscape HexEscape] = CharacterValue[HexEscape]

CharacterValue[CharacterEscape IdentityEscape] = IdentityEscape

action CharacterValue[ControlEscape] : Character

CharacterValue[ControlEscape f] = ‘«FF»’

CharacterValue[ControlEscape n] = ‘«LF»’

CharacterValue[ControlEscape r] = ‘«CR»’

CharacterValue[ControlEscape t] = ‘«TAB»’

CharacterValue[ControlEscape v] = ‘«VT»’

Decimal Escapes

Syntax

DecimalEscape DecimalIntegerLiteral [lookahead{DecimalDigit}]

DecimalIntegerLiteral

0

NonZeroDecimalDigits

NonZeroDigit

| NonZeroDecimalDigits DecimalDigit

NonZeroDigit 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Semantics

action EscapeValue[DecimalEscape] : Integer

EscapeValue[DecimalEscape DecimalIntegerLiteral [lookahead{DecimalDigit}]]
= IntegerValue[DecimalIntegerLiteral]

action IntegerValue[DecimalIntegerLiteral] : Integer

IntegerValue[DecimalIntegerLiteral 0] = 0

IntegerValue[DecimalIntegerLiteral NonZeroDecimalDigits]
= IntegerValue[NonZeroDecimalDigits]

action IntegerValue[NonZeroDecimalDigits] : Integer

IntegerValue[NonZeroDecimalDigits NonZeroDigit] = DecimalValue[NonZeroDigit]

IntegerValue[NonZeroDecimalDigits NonZeroDecimalDigits₁ DecimalDigit]
= 10*IntegerValue[NonZeroDecimalDigits₁] + DecimalValue[DecimalDigit]

action DecimalValue[NonZeroDigit] : Integer = digitValue(NonZeroDigit)

Hexadecimal Escapes

Syntax

HexEscape

x HexDigit HexDigit

| u HexDigit HexDigit HexDigit HexDigit

HexDigit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | a | b | c | d | e | f

Semantics

action CharacterValue[HexEscape] : Character

CharacterValue[HexEscape x HexDigit₁ HexDigit₂]
= codeToCharacter(16*HexValue[HexDigit₁] + HexValue[HexDigit₂])

action HexValue[HexDigit] : Integer = digitValue(HexDigit)

Character Class Escapes

Syntax

CharacterClassEscape

s

| S

| d

| D

| w

| W

Semantics

action AcceptanceSet[CharacterClassEscape] : {Character}

AcceptanceSet[CharacterClassEscape s] = reWhitespaces

AcceptanceSet[CharacterClassEscape S] = {‘«NUL»’ ... ‘«uFFFF»’} - reWhitespaces

AcceptanceSet[CharacterClassEscape d] = reDigits

AcceptanceSet[CharacterClassEscape D] = {‘«NUL»’ ... ‘«uFFFF»’} - reDigits

AcceptanceSet[CharacterClassEscape w] = reWordCharacters

AcceptanceSet[CharacterClassEscape W] = {‘«NUL»’ ... ‘«uFFFF»’} - reWordCharacters

User-Specified Character Classes

Syntax

CharacterClass

[ [lookahead{^}] ClassRanges ]

| [ ^ ClassRanges ]

ClassRanges

«empty»

| NonemptyClassRanges^dash

d {dash, noDash}

NonemptyClassRanges^d

ClassAtom^dash

| ClassAtom^d NonemptyClassRanges^noDash

| ClassAtom^d - ClassAtom^dash ClassRanges

Semantics

action AcceptanceSet[CharacterClass] : {Character}

AcceptanceSet[CharacterClass [ [lookahead{^}] ClassRanges ]]
= AcceptanceSet[ClassRanges]

AcceptanceSet[CharacterClass [ ^ ClassRanges ]] = AcceptanceSet[ClassRanges]

action Invert[CharacterClass] : Boolean

Invert[CharacterClass [ [lookahead{^}] ClassRanges ]] = false

Invert[CharacterClass [ ^ ClassRanges ]] = true

action AcceptanceSet[ClassRanges] : {Character}

AcceptanceSet[ClassRanges «empty»] = {}_Character

AcceptanceSet[ClassRanges NonemptyClassRanges^dash]
= AcceptanceSet[NonemptyClassRanges^dash]

action AcceptanceSet[NonemptyClassRanges^d] : {Character}

AcceptanceSet[NonemptyClassRanges^d ClassAtom^dash] = AcceptanceSet[ClassAtom^dash]

AcceptanceSet[NonemptyClassRanges^d ClassAtom^d NonemptyClassRanges^noDash₁]
= AcceptanceSet[ClassAtom^d] AcceptanceSet[NonemptyClassRanges^noDash₁]

AcceptanceSet[NonemptyClassRanges^d  ClassAtom^d₁ - ClassAtom^dash₂ ClassRanges]
  = let range: {Character}
             = characterRange(AcceptanceSet[ClassAtom^d₁], AcceptanceSet[ClassAtom^dash₂])
     in range  AcceptanceSet[ClassRanges]

characterRange(low: {Character}, high: {Character}) : {Character}
  = if |low|  1 or |high|  1
     then throw syntaxError
     else let l: Character = min low;
               h: Character = min high
           in if l  h
               then {l ... h}
               else throw syntaxError

Character Class Range Atoms

Syntax

ClassAtom^d

ClassCharacter^d

| \ ClassEscape

ClassCharacter^dash UnicodeCharacter except \ | ]

ClassCharacter^noDash ClassCharacter^dash except -

ClassEscape

DecimalEscape

| b

| CharacterEscape

Semantics

action AcceptanceSet[ClassAtom^d] : {Character}

AcceptanceSet[ClassAtom^d ClassCharacter^d] = {ClassCharacter^d}

AcceptanceSet[ClassAtom^d \ ClassEscape] = AcceptanceSet[ClassEscape]

action AcceptanceSet[ClassEscape] : {Character}

AcceptanceSet[ClassEscape  DecimalEscape]
  = if EscapeValue[DecimalEscape] = 0
     then {‘«NUL»’}
     else throw syntaxError

AcceptanceSet[ClassEscape b] = {‘«BS»’}

AcceptanceSet[ClassEscape CharacterEscape] = {CharacterValue[CharacterEscape]}

AcceptanceSet[ClassEscape CharacterClassEscape] = AcceptanceSet[CharacterClassEscape]

JavaScript 2.0

Formal Description

Parser Grammar

Tuesday, February 15, 2000

This LALR(1) grammar describes the syntax of the JavaScript 2.0 proposal. The starting nonterminal is Program. See also the description of the grammar notation.

This document is also available as a Word 98 rtf file.

Terminals

General tokens: Identifier Number RegularExpression String VirtualSemicolon

Punctuation tokens: ! != !== % %= & && &&= &= ( ) * *= + ++ += , - -- -= . ... / /= : :: ; < << <<= <= = == === > >= >> >>= >>> >>>= ? @ [ ] ^ ^= ^^ ^^= { | |= || ||= } ~

Future punctuation tokens: # ->

Reserved words: break case catch class const continue default delete do else eval extends false final finally for function if in instanceof new null package private public return super switch this throw true try typeof var while with

Future reserved words: abstract debugger enum export goto implements import interface native protected synchronized throws transient volatile

Non-reserved words: get language set

Expressions

b {allowIn, noIn}

Identifiers

Identifier

Identifier

| get

| set

| language

QualifiedIdentifier

Identifier

| QualifiedIdentifier :: Identifier

| ParenthesizedExpression :: Identifier

Primary Expressions

PrimaryExpression

null

| true

| false

| Number

| Number [no line break] String

| String

| this

| super

| QualifiedIdentifier

| RegularExpression

| ParenthesizedExpression

| ParenthesizedExpression [no line break] String

| ArrayLiteral

| ObjectLiteral

| FunctionExpression

ParenthesizedExpression ( Expression^allowIn )

Function Expressions

FunctionExpression

function FunctionSignature Block

| function Identifier FunctionSignature Block

Object Literals

ObjectLiteral

{ }

| { FieldList }

FieldList

LiteralField

| FieldList , LiteralField

LiteralField FieldName : AssignmentExpression^allowIn

FieldName

QualifiedIdentifier

| String

| Number

Array Literals

ArrayLiteral [ ElementList ]

ElementList

LiteralElement

| ElementList , LiteralElement

LiteralElement

«empty»

| AssignmentExpression^allowIn

Postfix Unary Operators

PostfixExpression

FullPostfixExpression

FullPostfixExpression

PrimaryExpression

| FullPostfixExpression MemberOperator

| FullPostfixExpression Arguments

| PostfixExpression [no line break] ++

| PostfixExpression [no line break] --

FullNewExpression new FullNewSubexpression Arguments

ShortNewExpression new ShortNewSubexpression

FullNewSubexpression

PrimaryExpression

| FullNewSubexpression MemberOperator

ShortNewSubexpression

FullNewSubexpression