ElectricalFire Compiler

Development Roadmap

Author: Scott Furman
Last Modified:

In keeping with the philosophy of open source, ElectricalFire development will proceed in the direction and manner chosen by those who contribute to the project.  So, consider the following document as a skeleton - one possible future for ElectricalFire out of many.  We'll put flesh on these bones (or add new bones) as the project progresses.

Overall project organization

Phase 1 - Getting organized

Phase 2 - Stability/Completeness/Porting

Phase 3 - Performance Optimization

Developers and Module Owners

ElectricalFire is a big enough project that it needs to broken up into several parts, each with a coordinator, or "module owner", in mozilla-speak. Initially, some of the owners will come from the original ElectricalFire development team, but that will likely change as other developers come up to speed on the code.

A first cut at the partitioning might look like this:

Module
Owner
Overall project administration Scott Furman
Intermediate code generation and optimization Waldemar Horwat
x86 back-end Bjorn Carlson
Java runtime libraries TBD

In addition, we'll need one module owner for each additional back-end targeting a new architecture.

Runtime Library

The lack of Java runtime libraries is the single largest hole in the ElectricalFire effort.  EF provides partial implementations of the "core" Java libraries, i.e. java.lang, java.lang.reflect and java.io, but has no code for AWT or any of the other dozens of subsystems supported by Sun's JDK. The two most obvious suppliers for EF's runtime are the GNU Classpath effort and Sun's JDK 1.2 (under the new community licensing policy). To the extent possible, EF should support both, but the legal requirement to avoid tainting Classpath and ElectricalFire with Sun code may cause problems, e.g. if a developer looks at the Sun runtime library sources then they can't contribute to the Classpath effort.

JNI Methods

One significant subtask in getting runtime libraries to work is completing implementation of the JNI methods. Most JNI methods today are merely stubs. The JNI source code lives here.

Compiler Tests

More than most other types of programs, compiler code is notorious for having insidious and subtle bugs. Even worse, it's difficult to quantify a compiler's quality or that of a compiler test suite. For this reason, compiler suites tend to be quite large so as to take advantage of any shotgun effect. Unfortunately, there appears to be no way we can get free access to the tens of thousands of test that make up Sun's JCK.

However, EF can benefit tremendously from the library functional tests being provided by the Mauve project, and perhaps also from Japhar's tests. We'll need to provide a small set of low-level regression tests that can be run quickly so as to give developers confidence in checking in their changes. In particular, we'll need some low-level bytecode instruction tests. Finally, it would be nice to incorporate a few large Java applications into the test suite.

Automated Testing

Worse than introducing a regression is not discovering it until sometime after it's introduced. When ElectricalFire was a commercial product, an automated report was generated every day, which provided summary statistics and identified test regressions. Assuming we can come up with the tests to run, it shouldn't be too difficult to resurrect this machinery. Another type of automated testing that we planned to do, but never got around to, was performance tracking, so as to catch slowdowns in the compilation process or in compiled code.

Porting

Only the x86 code generator is reasonably close to working order.  The others (PowerPC, Sparc, PA-RISC, MIPS) are in various states of disrepair, as the original development tea, temporarily shed them to concentrate on the x86. At one point,  Scott Silver was working on a tool to make it easier to specify instruction selection rules. He would be a good person to contact when starting up a port to another architecture.

Exception Support

ElectricalFire's exception-handling code was nearly complete when the original EF project was cancelled.  Some of the remaining tasks:

Register Allocator

The existing register allocator, which is a variant on the classic coloring algorithm, suffers from unacceptably slow compilation performance for some methods. Unfortunately, the SSA intermediate form used in EF tends to produce lots of temporaries and this drives up the cost of register allocation. (The complexity of computing the register interference graph goes up as the square of the number of registers. Editor: Is this right ?  I seem to remember that our algorithm has an N^4 component to it.)  In some extreme cases, compilation of a single very large function could take 20 minutes. We need an algorithm that is both more scalable and that makes better assignment/spilling decisions.  A hierarchical allocator is needed, i.e. one that operates depth-first on the control-flow graph.  Prior to the cancellation of the EF commercial project, a developer was working on a replacement allocator based on Proebsting et al's probability-based register allocation algorithm.  Believe it or not, we seem to have lost the source code for the replacement allocator and we have not been successful at locating the developer.  (Anyone seen Laurent Morichetti ?)

Garbage Collection

ElectricalFire's garbage collector, codenamed "Sport Model", is substantially complete, but it is not yet integrated with the rest of the compiler.  Doing that integration is a massive task, since the GC must become intimately entwined with the compiler and runtime, e.g. by constraining the object layout and by requiring access to the stack and registers. Now might be a good time to review some of the Sport Model design decisions to see if a simpler initial implementation is possible.  For example, the current design in which object headers are segregated onto their own memory pages might be an optimization for a future release.  The conservative versus non-conservative GC question is another good way to start a fist-fight among the GC cognoscenti.

Optimizing Code Generation

I'm not even going to attempt an exhaustive list of all the optimizations that might be implemented in EF.  Even when source Java bytecodes are optimized prior to compilation, there are still many optimizations that can be done at the primitive graph or instruction level.  Some obvious candidate optimizations would seek to lower the overhead that Java imposes (versus C++) by reducing/eliminating: Many of the traditional optimizations are also useful: When inlining is added to the compiler's arsenal, most of the traditional compiler optimizations also become important.

Waldemar wrote some more about optimizations a long time ago.