Electrical Fire Design

Electrical Fire

Design

Primitive Graph Format
Runtime
Optimizations

All information below is preliminary and is changing frequently. Updated 2/3/97.

Translator Phases

The translator's translation of Java bytecodes to native machine language instructions may be divided into the following phases. This is only a logical division to help in identifying data structures and abstractions; some of the phases may run concurrently or interleaved.

Phases in green are optional and may be done later.

Process bytecodes
- Create control flow graph from the bytecodes
- Parse bytecodes into a canonical form
- Some verification may take place here as well, if desired.
Generate primitive intermediate format
- Analyze Java virtual machine stack usage
- Detect subroutines and either expand or convert to subordinate functions
- Perform additional verification while analyzing stack usage, if desired.
- Translate bytecodes into primitive graph nodes; convert complex bytecodes into trees of primitive nodes to allow optimization.
- Detect and annotate asynchronous exception points
- Detect and annotate volatile references
- Construct SSA dataflow graph
- Annotate dataflow graph with control and memory dependencies
- Optionally perform simple optimizations to reduce the amount of work done by the rest of the translator
- Generate argument-moving code if using stack calling conventions
Optimize
- Perform restricted dataflow optimizations for primitive nodes that were parts of more complex bytecodes
  - Optimize null pointer checks for field accesses and method calls
  - Optimize array bounds checks
  - Optimize multiple requests for the size of the same array
  - Optimize checked casts
Generate code
- Split basic blocks directed acyclic graphs (DAGs) into trees
- Use a BURS tree matcher to generate templates for machine instructions
  - Strength-reduce machine instructions
- Check whether instruction patterns with more than one output can apply (i.e. instructions that generate both a value in a register and set the condition codes as well as instructions that use autoincrement or autodecrement addressing modes) and substitute them as appropriate. These instruction patterns cannot be represented in BURS because they aren't trees.
Schedule code
- Linearize the dataflow graphs inside basic blocks according to their internal dependencies, balancing the following:
  - Avoid generating code that causes pipeline stalls
  - Try to generate code that uses execution units in a balanced manner so that it executes well on a superscalar machine
  - Try to generate code that uses as few temporary registers as possible
Allocate registers
- Compute preferences for incoming arguments, the outgoing result, and the arguments and results of any called functions
- Combine phi-nodes from dataflow graph
- Preallocate intra-basic-block temporaries
- Allocate local variables to registers using either a graph coloring scheme or a priority register assignment scheme
- Generate spill code
  - Minimize spill points
  - Schedule spill code to avoid pipeline stalls and execution unit conflicts
  - On CISC machines try to combine loads and/or stores of spilled values with instructions that reference them (i.e. make use of memory-to-register arithmetic instructions, etc.)
Postprocess
- Generate function header and epilogue
- Generate exception tables
- Generate debugging tables, if desired
- Copy code templates to the code's final location in the code cache, filling in all fields
- Link generated code into the rest of the Java runtime

Translator Subsystems

Important translator abstractions include:

Abstract Containers

Bit sets; incremental bit sets
Red-black trees
Algorithmic heaps
Interval maps
Directed graphs

Concrete Containers

Register maps
The processed bytecode format. This format presents the same information as in the bytecodes in a slightly easier-to-use manner. Control flow edges are explicit. Alternative variants of bytecode instructions (for example, variants of bytecodes that read local variables) are reduced into one common format.
The menu of primitive node types
The primitive graph format
Templates for machine instructions

Algorithms

Graph walkers
Forward and reverse dataflow analyzers over the intermediate dataflow graph formats
Incremental and/or lazy dataflow analyzers
BURS pattern matcher

Converters

The bytecode processor
The dataflow graph generator
The optimizer
The instruction generator
The instruction scheduler
The register allocator
The postprocessor

Debugging

Compiler data structure inspection and dumping interface
Byte code and native code disassemblers
Bytecode-to-native-code mapper
Variable location mapper

Runtime Subsystems

Electrical Fire needs a few subsystems that are not part of the translator proper but are needed to support it. These include:

A code cache manager
A synchronous exception handler
An asynchronous exception handler. This one is complicated because it needs to emulate the thread receiving the exception until the next synchronization point.
Profilers for making compilation and quality-of-compilation decisions