XPCOM Type Library
File Format

Version 1.0, Draft 5

Last updated: 
Author: Scott Furman <fur@netscape.com>

Document History

Known Issues

Introduction

XPCOM type libraries, or "typelibs", are binary interface description files generated by the XPIDL compiler. Type libraries enumerate the methods of one or more interfaces, including detailed type information for each method parameter. The typelib is not merely a tokenized form of the IDL.  Rather, it's intended to accurately represent binary XPCOM interfaces, with annotations derived from the IDL.

Typelibs might be more aptly named "interface libraries", but Microsoft has already established a precedent with their naming scheme and we'll stick with it to avoid developer confusion.

Goals

Non-goals

Notation

The syntax used in this document to specify the layout of file data appears similar to C structs. Unlike C structs, however, data members are not subject to alignment restrictions. Another difference from C structs is the use of pointer notation to represent 32-bit file offsets. For example, a field declared as uint16* is a 32-bit field that contains the offset, in bytes, to an array of one or more 16-bit values. Unless otherwise noted, all file offsets are byte offsets from the beginning of the data pool and are 32-bit signed quantities. The first byte of the data pool is at offset 1, so as to allow offset 0 to be used as a special indicator. By adding in an appropriate constant, these offsets can be used as arguments to seek().
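
The offset convention can be sketched as a small helper. This is only an illustration under the rules stated above: the function name pool_to_file_offset is hypothetical, and its data_pool argument stands for the zero-relative file offset of the data pool that is stored in the file header.

```python
from typing import Optional


def pool_to_file_offset(data_pool: int, pool_offset: int) -> Optional[int]:
    """Convert a data-pool-relative offset to an absolute file offset.

    Pool offsets are one-based: offset 1 is the first byte of the data
    pool, and offset 0 is the special "no record" indicator.
    """
    if pool_offset == 0:
        return None  # special indicator, nothing to seek to
    if pool_offset < 0:
        raise ValueError("negative data pool offset")
    # The "appropriate constant" is data_pool - 1.
    return data_pool + pool_offset - 1


print(pool_to_file_offset(100, 1))  # prints 100, the first byte of the pool
```

The returned value can be passed directly to a seek() call on the open typelib file.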

Record fields with type boolean occupy one bit, not one byte. A value of 1 represents true and a value of 0 represents false.

All integer fields with multibyte precision are stored in big-endian order, e.g. for a uint16 field, the high-order byte is stored in the file followed by the low-order byte.
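
As a minimal sketch of the byte order rule, the following helpers read big-endian integers from a buffer using Python's struct module. The helper names read_uint16 and read_uint32 are hypothetical and are not defined by this specification.

```python
import struct


def read_uint16(buf: bytes, pos: int) -> int:
    """Read a big-endian unsigned 16-bit integer at byte position pos."""
    return struct.unpack_from(">H", buf, pos)[0]


def read_uint32(buf: bytes, pos: int) -> int:
    """Read a big-endian unsigned 32-bit integer at byte position pos."""
    return struct.unpack_from(">I", buf, pos)[0]


data = bytes([0x12, 0x34, 0x00, 0x00, 0x01, 0x00])
print(hex(read_uint16(data, 0)))  # prints 0x1234: high-order byte comes first
print(read_uint32(data, 2))       # prints 256
```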

Filename Suffix

The standard suffix for XPCOM type libraries is .xpt. [Editor: Do we need to define a standard four-character Mac signature/creator?]

File Header

Every XPCOM typelib file begins with a header:
TypeLibHeader {
    char                     magic[16];
    uint8                    major_version;
    uint8                    minor_version;
    uint16                   num_interfaces;
    uint32                   file_length;
    InterfaceDirectoryEntry* interface_directory;
    uint8*                   data_pool;
    Annotation               annotations[];
}

magic

The first 16 bytes of the file always contain the following values:
       (hex) 58 50 43 4f 4d 0a 54 79 70 65 4c 69 62  0d 0a   1a
(C notation)  X  P  C  O  M \n  T  y  p  e  L  i  b  \r \n \032
This signature both identifies the file as an XPCOM typelib file and provides for immediate detection of common file-transfer problems, i.e. treatment of a binary file as if it were a text file. The CR-LF sequence catches file transformations that alter newline sequences. The control-Z character stops file display under MS-DOS. The linefeed in the sixth byte checks for the inverse of the CR-LF translation problem. (A nod to the PNG folks for the inspiration behind using these special characters in the header.)
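
A reader might use the signature roughly as follows. This is a sketch, not a required algorithm: the name check_magic and the diagnostic strings are hypothetical, and only the 16 signature bytes come from this specification.

```python
MAGIC = b"XPCOM\nTypeLib\r\n\x1a"  # the 16 signature bytes listed above


def check_magic(first_bytes: bytes) -> str:
    """Classify the start of a file, guessing at common transfer damage."""
    if first_bytes[:16] == MAGIC:
        return "ok"
    # A text-mode transfer that expands LF to CR-LF damages the sixth byte.
    if first_bytes.startswith(b"XPCOM\r\nTypeLib"):
        return "LF was expanded to CR-LF"
    # A text-mode transfer that collapses CR-LF to LF damages bytes 14-15.
    if first_bytes.startswith(b"XPCOM\nTypeLib\n"):
        return "CR-LF was collapsed to LF"
    return "not an XPCOM typelib"


print(check_magic(MAGIC))  # prints ok
```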

major_version, minor_version

These are the major and minor version numbers of the typelib file format. For this specification major_version is 0x01 and minor_version is 0x00. TypeLib files that share the same major version but have different minor versions are compatible. Changes to the major version represent typelib file formats that are not backward-compatible with parsers designed only to read earlier major versions. If a typelib file is encountered with a major version for which support is not available, the rest of the file should not be parsed.

num_interfaces

This indicates the number of InterfaceDirectoryEntry records that are at the offset indicated by the interface_directory field.

interface_directory

This field specifies a zero-relative byte offset from the beginning of the file.  It identifies the start of an array of InterfaceDirectoryEntry records.  If num_interfaces is zero, then this field should also be zero.  The value of this field should be a multiple of 4, i.e. the interface directory must be aligned on a 4-byte boundary. (This is to guarantee aligned access if the typelib file is mmap'ed into memory.)

file_length

Total length of the typelib file, in bytes. This value can be compared to the length of the file reported by the OS so as to detect file truncation.

data_pool

The data pool is a heap-like storage area that is the container for most kinds of typelib data including, but not limited to, InterfaceDescriptor, MethodDescriptor, ParamDescriptor, and TypeDescriptor records. Note that, unlike most file offsets in a typelib, the value of data_pool is zero-relative to the beginning of the file.

annotations

A variable-length array of variable-size records used to store secondary information, such as the name of the tool that generated the typelib file, the date it was generated, etc.
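
The fixed-size part of the header can be decoded in one step. The sketch below is an illustration only: parse_header and the dictionary keys are hypothetical names, the two offset fields are read as 32-bit signed values per the Notation section, and the annotations are assumed to start immediately after the 32 fixed bytes.

```python
import struct

MAGIC = b"XPCOM\nTypeLib\r\n\x1a"
# magic[16], two uint8 versions, uint16, uint32, then two 32-bit signed
# offsets. ">" selects big-endian order with no alignment padding.
HEADER_FORMAT = ">16sBBHIii"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 32 bytes before annotations


def parse_header(buf: bytes) -> dict:
    """Decode and sanity-check the fixed-size part of a TypeLibHeader."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("file too short for a typelib header")
    (magic, major, minor, num_interfaces, file_length,
     interface_directory, data_pool) = struct.unpack_from(HEADER_FORMAT, buf, 0)
    if magic != MAGIC:
        raise ValueError("not an XPCOM typelib")
    if major != 1:
        # Unknown major version: the rest of the file should not be parsed.
        raise ValueError("unsupported major version %d" % major)
    if file_length != len(buf):
        raise ValueError("file_length mismatch (truncated file?)")
    if interface_directory % 4 != 0:
        raise ValueError("interface_directory is not 4-byte aligned")
    return {
        "major_version": major,
        "minor_version": minor,
        "num_interfaces": num_interfaces,
        "file_length": file_length,
        "interface_directory": interface_directory,
        "data_pool": data_pool,
    }
```

A caller would typically read the whole file into memory and pass the bytes to parse_header before touching any other record.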

InterfaceDirectoryEntry

A contiguous array of fixed-size InterfaceDirectoryEntry records begins at the byte offset identified by the interface_directory field in the file header.  The array is used to quickly locate an interface description using its IID.  No interface should appear more than once in the array.
InterfaceDirectoryEntry {
    uint128              iid;
    Identifier*          name;
    Identifier*          namespace;
    InterfaceDescriptor* interface_descriptor;
}
An interface is said to be unresolved if its name is known, e.g. "nsISupports", but its IID and methods have not yet been determined. In that case, both the iid and the interface_descriptor fields will be set to zero. If an interface is unresolved, then its typelib must be linked with another typelib to resolve the interface, namely the one that contains a resolved InterfaceDirectoryEntry that matches the specified name and namespace.

A pointer to an InterfaceDirectoryEntry is always relative to the beginning of the file.  (This is different from other pointers in the typelib file, which are relative to the byte immediately before the data pool.)

iid

The iid field contains a 128-bit value representing the interface ID. The iid is created from an IID by concatenating the individual bytes of an IID in a particular order. For example, this IID:
{00112233-4455-6677-8899-aabbccddeeff}
is converted to the 128-bit value
0x00112233445566778899aabbccddeeff
Note that the byte storage order corresponds to the layout of the nsIID C-struct on a big-endian architecture.
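
The conversion can be checked with Python's standard uuid module, whose bytes attribute yields the same big-endian 16-byte layout. The function name iid_to_bytes is hypothetical.

```python
import uuid


def iid_to_bytes(iid_text: str) -> bytes:
    """Return the 16 bytes stored in the iid field for a textual IID."""
    # uuid.UUID accepts the braces and hyphens; .bytes is big-endian.
    return uuid.UUID(iid_text).bytes


raw = iid_to_bytes("{00112233-4455-6677-8899-aabbccddeeff}")
print(raw.hex())  # prints 00112233445566778899aabbccddeeff
```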

All InterfaceDirectoryEntry objects must appear sorted in increasing order of iid, so as to facilitate a binary search of the array.  (This means that unresolved interfaces appear at the beginning of the array.)
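
Because the entries are fixed-size and sorted, a lookup by IID is a plain binary search. The following sketch assumes each entry occupies 28 bytes (a 16-byte iid followed by three 32-bit offsets) and compares iid values as raw byte strings, which matches numeric order for big-endian data. The name find_entry is hypothetical.

```python
ENTRY_SIZE = 16 + 4 + 4 + 4  # iid, name, namespace, interface_descriptor


def find_entry(buf: bytes, directory: int, num_interfaces: int, iid: bytes):
    """Return the zero-based index of the entry with this iid, or None.

    directory is the file offset of the interface directory and iid is
    the 16-byte stored form of the IID being looked up.
    """
    lo, hi = 0, num_interfaces
    while lo < hi:
        mid = (lo + hi) // 2
        start = directory + mid * ENTRY_SIZE
        candidate = buf[start:start + 16]
        if candidate == iid:
            return mid
        if candidate < iid:
            lo = mid + 1
        else:
            hi = mid
    return None
```

Unresolved entries all carry an iid of zero, so they cannot be told apart by this search and must be matched by name and namespace instead.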

name

The human-readable name of this interface, e.g. "nsISupports", stored using the Identifier record format.

namespace

The human-readable identifier for the namespace of this interface, stored using the Identifier record format. This is the declared name of an interface's module in the XPIDL. The use of namespaces permits identically named interfaces that do not conflict. (A reference from an interface in one namespace to an interface in another namespace would probably be written as namespace.interfaceName.) If namespace is zero, the interface is in the default namespace.

interface_descriptor

This is a byte offset from the beginning of the file to the corresponding InterfaceDescriptor object.

InterfaceDescriptor

An InterfaceDescriptor is a variable-size record used to describe a single XPCOM interface, including all of its methods:
InterfaceDescriptor {
    InterfaceDirectoryEntry* parent_interface;
    uint16                   num_methods;
    MethodDescriptor         method_descriptors[num_methods];
    uint16                   num_constants;
    ConstDescriptor          const_descriptors[num_constants];
}

parent_interface

An interface's methods are specified by composing the methods of the interface from which it is derived with the additional methods it defines. The method_descriptors array does not list any methods that the interface inherits from its parent. Instead, the parent_interface field contains a byte offset, relative to the beginning of the file, to the InterfaceDirectoryEntry of its parent interface. This field has a value of zero for nsISupports, the root of the interface inheritance hierarchy.

num_methods

The number of methods in the method_descriptors array.

method_descriptors

This is an inline array of MethodDescriptor objects. The length of the array is determined by the num_methods field.

num_constants

The number of scoped interface constants in the const_descriptors array.

const_descriptors

This is an inline array of ConstDescriptor objects.  The length of the array is determined by the num_constants field.

ConstDescriptor

A ConstDescriptor is a variable-size record that records the name and value of a scoped interface constant.  All ConstDescriptor records have this form:
ConstDescriptor {
    Identifier*     name;
    TypeDescriptor  type;
    <type>          value;
}

name

The human-readable name of this constant, stored in the Identifier record format.

type

The type of the constant. Types are restricted to the following subset of TypeDescriptors: int8, uint8, int16, uint16, int32, uint32, int64, uint64, wchar_t, char, string

value

The type (and thus the size) of the value record is determined by the contents of the associated TypeDescriptor record. For instance, if type corresponds to int16, then value is a two-byte record consisting of a 16-bit signed integer.  For a ConstDescriptor type of string, the value record is of type String*, i.e. an offset within the data pool to a String record containing the constant string.

MethodDescriptor

A MethodDescriptor is a variable-size record used to describe a single interface method:
MethodDescriptor {
    boolean         is_getter;
    boolean         is_setter;
    boolean         is_varargs;
    boolean         is_constructor;
    boolean         is_hidden;
    uint3           reserved;
    Identifier*     name;
    uint8           num_args;
    ParamDescriptor params[num_args];
    ParamDescriptor result;
}

is_getter

This field is used to allow interface methods to act as property getters for object-oriented languages such as JavaScript. It could be set as a result of defining an XPIDL attribute. For example, if there were an XPIDL attribute named "Banjo", you could access the "Banjo" property on an interface like so: 'myInterface.Banjo'. Any prefix added by the XPIDL compiler to an attribute's identifier in the .h file, such as "Is" or "Get", should not appear in the method's name.

is_setter

This field is used to allow interface methods to act as property setters for object-oriented languages such as JavaScript. It could be set as a result of defining an XPIDL attribute. For example, if there were an XPIDL attribute named "Banjo", you could assign to the "Banjo" property on an interface like so: 'myInterface.Banjo = 3'. Any prefix added by the XPIDL compiler to an attribute's identifier in the .h file, such as "Is" or "Get", should not appear in the method's name.

is_varargs

If set, is_varargs indicates that the method is designed to accept a variable number of arguments from, say, a scripting language. The exact details of how this might be done, however, are beyond the scope of the typelib definition. (With XPComConnect, an nsVarArgs object is passed as the last parameter to such a method. That object is a variable-length array of argument values and types.)

is_constructor

This field indicates the default constructor for this interface, which may be useful for interfaces that act like factories. For example, with an instance of an XPCOM interface named 'Foo', in JavaScript one might write 'new Foo(arg1, arg2)', thus causing this method to be called. The argument signature of an XPCOM constructor is:
NS_IRESULT ([arg,]*, out nsISomeInterface** result)

That is, it's a function that takes zero or more arguments and creates a new interface returned through the result output parameter.

is_hidden

If true, this field indicates that the method is not to be exposed to scripters, although it remains in the typelib to fill a slot in the interface's vtable.
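
The five flags and the 3-bit reserved field share the first byte of a MethodDescriptor. This draft does not state how one-bit fields are ordered within a byte, so the sketch below assumes that the first-declared field occupies the most significant bit. The name decode_method_flags is hypothetical.

```python
def decode_method_flags(flags: int) -> dict:
    """Split the first byte of a MethodDescriptor into its bit fields.

    Assumption: fields are packed in declaration order starting at the
    most significant bit (is_getter = 0x80 ... reserved = low 3 bits).
    """
    return {
        "is_getter": bool(flags & 0x80),
        "is_setter": bool(flags & 0x40),
        "is_varargs": bool(flags & 0x20),
        "is_constructor": bool(flags & 0x10),
        "is_hidden": bool(flags & 0x08),
        "reserved": flags & 0x07,
    }


print(decode_method_flags(0x48))  # under this assumption: a hidden setter
```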

name

The human-readable name of this method, e.g. "getWindow", stored in the Identifier record format.

num_args

The number of arguments that the method consumes.  Also, the number of elements in the params array.

params

This is an inline array of ParamDescriptor objects.  The length of the array is determined by the num_args field.

result

This is a single, inline ParamDescriptor object that identifies the actual return type of an XPCOM method. The result, however, does not always refer to the effective method return value when the invocation is from a scripting language, i.e. the return value as seen from a script-writer's perspective.  In particular, it is possible to designate any out method argument as the method return value for scripting purposes.  See the retval flag.

ParamDescriptor

A ParamDescriptor is a variable-size record used to describe either a single argument to a method or a method's result:
ParamDescriptor {
    boolean         in;
    boolean         out;
    boolean         retval;
    uint5           reserved;
    TypeDescriptor  type;
}

in

If in is true, it indicates that the parameter is to be passed from caller to callee.  This flag is always false for a method's result.

out

If out is true, it indicates that the parameter is to be passed from callee to caller.  It is possible for a parameter to have both out and in bits set.  For the actual method result, out is always true. Out parameters that are method arguments must always have a pointer type.

retval

If retval is true, it indicates that this parameter is to be considered the return value of the method for purposes of invocation from a scripting language.  If the XPCOM method's result parameter does not have its retval flag set, then the method's return value is either void or an nsresult (a bitfield encoded as a uint32) that indicates the success or failure of the method invocation.  Note that retval cannot be true unless out is also true.

reserved

A 5-bit field reserved for future use.

type

The type of the method parameter.
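
The constraints on the in, out and retval flags can be checked when the first byte of a ParamDescriptor is decoded. As a sketch only: the bit positions below assume that the first-declared field is the most significant bit, which this draft does not state, and the name decode_param_flags is hypothetical.

```python
def decode_param_flags(flags: int, is_result: bool = False) -> dict:
    """Decode the first byte of a ParamDescriptor and check its rules.

    Assumption: in = 0x80, out = 0x40, retval = 0x20, reserved = low 5 bits.
    """
    param = {
        "in": bool(flags & 0x80),
        "out": bool(flags & 0x40),
        "retval": bool(flags & 0x20),
        "reserved": flags & 0x1F,
    }
    if param["retval"] and not param["out"]:
        raise ValueError("retval cannot be true unless out is also true")
    if is_result and (param["in"] or not param["out"]):
        raise ValueError("a method result must have in false and out true")
    return param


print(decode_param_flags(0xC0))  # an in/out argument under this assumption
```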

TypeDescriptor

A TypeDescriptor is a variable-size record used to identify the type of a method argument or return value.  There are many XPCOM types that need to be represented in the typelib: [Editor: This specification does not yet cover pointers to unions, structs or arrays.]

To efficiently describe all the type categories listed above, there are several different variants of TypeDescriptor records:

union TypeDescriptor {
    SimpleTypeDescriptor;
    InterfaceTypeDescriptor;
    InterfaceIsTypeDescriptor;
}
The first byte of all these TypeDescriptor variants has the identical layout:
TypeDescriptorPrefix {
    boolean  is_pointer;
    boolean  is_unique_pointer;
    boolean  is_reference;
    uint5    tag;
}

is_pointer

This field is true only when representing C pointer/reference types.

is_unique_pointer

This field cannot have a value of true unless is_pointer is also true. The is_unique_pointer field indicates whether the parameter value is guaranteed not to be aliased by another parameter value. If is_unique_pointer is true, it must not be possible to reach the memory pointed at by this argument value from any other argument to the method.

is_reference

This field cannot have a value of true unless is_pointer is also true.  This field is true if the parameter is a reference, which is to say, it's a pointer that can't have a value of NULL.

tag

The tag field indicates which of the variant TypeDescriptor records is being used, and hence the way any remaining fields should be parsed.
Value in tag field    TypeDescriptor variant to use
0..17                 SimpleTypeDescriptor
18                    InterfaceTypeDescriptor
19                    InterfaceIsTypeDescriptor
20..31                reserved
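
A parser can use the tag to decide how many bytes follow the common prefix. The sketch below is an illustration under one assumption not stated in this draft: is_pointer is taken to be the most significant bit of the prefix byte, with tag in the low five bits. The name parse_type_descriptor is hypothetical; the extra fields read for tags 18 and 19 are those of the InterfaceTypeDescriptor and InterfaceIsTypeDescriptor layouts.

```python
import struct


def parse_type_descriptor(buf: bytes, pos: int):
    """Parse one TypeDescriptor at pos; return (fields, next_pos)."""
    prefix = buf[pos]
    pos += 1
    td = {
        "is_pointer": bool(prefix & 0x80),         # assumed bit positions
        "is_unique_pointer": bool(prefix & 0x40),
        "is_reference": bool(prefix & 0x20),
        "tag": prefix & 0x1F,
    }
    if (td["is_unique_pointer"] or td["is_reference"]) and not td["is_pointer"]:
        raise ValueError("is_unique_pointer/is_reference require is_pointer")
    if td["tag"] <= 17:
        td["variant"] = "SimpleTypeDescriptor"      # one byte in total
    elif td["tag"] == 18:
        td["variant"] = "InterfaceTypeDescriptor"
        (td["interface_index"],) = struct.unpack_from(">H", buf, pos)
        pos += 2
    elif td["tag"] == 19:
        td["variant"] = "InterfaceIsTypeDescriptor"
        td["arg_num"] = buf[pos]
        pos += 1
    else:
        raise ValueError("reserved tag value %d" % td["tag"])
    return td, pos
```

Returning the position of the next unread byte lets a caller walk an inline array of variable-size records without knowing their sizes in advance.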

SimpleTypeDescriptor

The one-byte SimpleTypeDescriptor is a kind of TypeDescriptor used to represent scalar types,  pointers to scalar types, the void type,  the void* type and, as a special case, the nsIID* type:

is_pointer, tag

InterfaceTypeDescriptor

An InterfaceTypeDescriptor is used to represent either a pointer to an interface type or a pointer to a pointer to an interface type, e.g. nsISupports* or nsISupports**:
InterfaceTypeDescriptor {
    boolean is_pointer;
    boolean is_unique_pointer;
    boolean is_reference;
    uint5   tag;
    uint16  interface_index;
}

is_pointer

When this field is false, the represented type is an interface pointer.  When is_pointer is true, the represented type is a pointer to an interface pointer.

tag

The tag field must have the decimal value 18.

interface_index

This field specifies a zero-based index into the interface_directory, thus identifying an InterfaceDirectoryEntry. Note that the value is specified in terms of table entries, not bytes.

InterfaceIsTypeDescriptor

An InterfaceIsTypeDescriptor describes an interface pointer type. It is similar to an InterfaceTypeDescriptor except that the type of the interface pointer is specified at runtime by the value of another argument, rather than being specified by the typelib.
InterfaceIsTypeDescriptor {
    boolean  is_pointer;
    boolean  is_unique_pointer;
    boolean  is_reference;
    uint5    tag;
    uint8    arg_num;
}

tag

The tag field must have the decimal value 19.

arg_num

The zero-based index of the method argument that describes the type of the interface pointer.  The specified method argument must have type nsIID*.

Identifier

Identifier records are used to represent variable-length, human-readable strings:
Identifier {
    char   bytes[];
}

bytes

Unicode string encoded in UTF-8 format, NUL-terminated.
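
Reading an Identifier amounts to scanning for the terminating NUL and decoding the bytes before it. A minimal sketch, with the hypothetical name read_identifier and pos given as an absolute position in the buffer:

```python
def read_identifier(buf: bytes, pos: int) -> str:
    """Decode the NUL-terminated UTF-8 Identifier that starts at pos."""
    end = buf.index(b"\x00", pos)  # raises ValueError if no terminator
    return buf[pos:end].decode("utf-8")


pool = b"\x00\x00nsISupports\x00getWindow\x00"
print(read_identifier(pool, 2))  # prints nsISupports
```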

String

String records are used to represent variable-length, human-readable strings, possibly with embedded NULs:
String {
    uint16 length;
    char   bytes[];
}

length

The length of the string, in characters (not bytes).

bytes

Unicode string encoded in UTF-8 format, with no NUL-termination. The length of the bytes array, measured in Unicode characters (not bytes), is reported by the length field.
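
Because length counts characters rather than bytes, a reader has to walk the UTF-8 data to find where the record ends. The sketch below assumes that "character" means one Unicode code point and uses the lead byte of each UTF-8 sequence to determine its size. The name read_string is hypothetical.

```python
import struct


def read_string(buf: bytes, pos: int):
    """Read a String record at pos; return (text, next_pos)."""
    (length,) = struct.unpack_from(">H", buf, pos)  # big-endian uint16
    pos += 2
    start = pos
    for _ in range(length):
        lead = buf[pos]
        if lead < 0x80:
            pos += 1              # single-byte character, including NUL
        elif lead >> 5 == 0b110:
            pos += 2
        elif lead >> 4 == 0b1110:
            pos += 3
        elif lead >> 3 == 0b11110:
            pos += 4
        else:
            raise ValueError("invalid UTF-8 lead byte")
    return buf[start:pos].decode("utf-8"), pos
```

The second element of the result is the position of the first byte after the record, which is needed when String records are stored back to back.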

Annotation

Annotation records are variable-size records used to store secondary information about the typelib, such as the name of the tool that generated the typelib file, the date it was generated, etc. The information is stored with very loose format requirements so as to allow virtually any private data to be stored in the typelib.
union Annotation {
    EmptyAnnotation;
    PrivateAnnotation;
}
EmptyAnnotation {
    boolean   is_last;
    uint7     tag; // 0
}

PrivateAnnotation {
    boolean   is_last;
    uint7     tag; // 1
    String    creator;
    String    private_data;
}

is_last

When true, no more Annotation records follow the current record. If false, at least one Annotation record appears immediately after the current record.

tag

The tag field discriminates among the variant record types for Annotations. If the tag is 0, this record is an EmptyAnnotation. EmptyAnnotations are ignored; they are only used to represent an array of Annotations that is completely empty. If the tag is 1, the record is a PrivateAnnotation.

creator

A string that identifies the application/tool/code that created the annotation, e.g. "XPIDL Compiler, Version 1.2".  There are no rules about the contents of the creator string other than that it be human-readable.

private_data

An opaque data array that is put into the typelib by the application/tool/code that created the typelib.  There are no restrictions on the format of the private_data.

Document History

Draft 5 (1/24/98)

Draft 4 (1/14/98)

Draft 3 (1/5/98)

  • Added 'retval' flag and massaged description of 'in' and 'out' flags in the ParamDescriptor record.
  • Fixed errors in descriptions of 'params' and 'result' members of the MethodDescriptor and in the description of 'method_descriptors' and 'const_descriptors' members of the InterfaceDescriptor record.  (In all cases, these members were incorrectly described as byte offsets to other records even though the data layout notation indicated that they were stored inline.)

Draft 2 (12/16/98)

  • Added "Document History" and "Known Issues" sections.
  • Tweaked introduction.
  • Changed pointers to InterfaceDescriptor records to instead point to InterfaceDirectoryEntry's so as to allow late-binding of interfaces using only the interface name. The interface name was moved from the InterfaceDescriptor to the InterfaceDirectoryEntry for the same reason.
  • Changed the description of the parent_interface field of InterfaceDescriptor so that its use is not optional.
  • Updated is_getter and is_setter text to be less confusing about whether or not method name prefixes are stored for getters and setters.  (They're not.)
  • Added is_varargs and is_constructor flags to MethodDescriptor.
  • Added support for scoped interface constants.
  • Nearly all uses of the String record type were changed to Identifier. Identifiers are NUL-terminated UTF-8 string records.  That means you can't store embedded NUL characters in an identifier (method or interface name), but they're one byte shorter because the string length isn't stored as part of the record.
  • Added support for private data to be attached to typelib files (Annotation records)

Known Issues