ElectricalFire Compiler
Development Roadmap
Author: Scott
Furman
Last Modified:
In keeping with the philosophy of open source, ElectricalFire development
will proceed in the direction and manner chosen by those who contribute
to the project. So, consider the following document as a skeleton
- one possible future for ElectricalFire out of many. We'll put flesh
on these bones (or add new bones) as the project progresses.
Overall project organization
Phase 1 - Getting organized
Phase 2 - Stability/Completeness/Porting
- Integrate runtime libraries with VM
- Finish exception support, including async
exception-handling
- JNI API support
- Integrate garbage-collection code
- Finish Java bytecode verifier support, so that EF can safely be used inside
a browser, e.g. as an OJI plugin
- Debugging support
- Make it easier for new back-ends to be added
- Porting to other architectures/compilers/OS
Phase 3 - Performance Optimization
Developers and Module Owners
ElectricalFire is a big enough project that it needs to broken up into several
parts, each with a coordinator, or "module
owner", in mozilla-speak. Initially, some of the owners will come from
the original ElectricalFire development team, but that will likely change as
other developers come up to speed on the code.
A first cut at the partitioning might look like this:
|
Module
|
Owner
|
| Overall project administration |
Scott Furman |
| Intermediate code generation and optimization |
Waldemar Horwat |
| x86 back-end |
Bjorn Carlson |
| Java runtime libraries |
TBD |
In addition, we'll need one module owner for each additional back-end targeting
a new architecture.
Runtime Library
The lack of Java runtime libraries is the single largest hole in the ElectricalFire
effort. EF provides partial implementations of the "core" Java libraries,
i.e. java.lang, java.lang.reflect and java.io, but has no code for AWT or any
of the other dozens of subsystems supported by Sun's JDK. The two most obvious
suppliers for EF's runtime are the GNU Classpath
effort and Sun's JDK 1.2 (under the new community
licensing policy). To the extent possible, EF should support both, but the
legal requirement to avoid tainting Classpath and ElectricalFire with Sun code
may cause problems, e.g. if a developer looks at the Sun runtime library sources
then they can't contribute to the Classpath effort.
Compiler Tests
More than most other types of programs, compiler code is notorious for having
insidious and subtle bugs. Even worse, it's difficult to quantify a compiler's
quality or that of a compiler test suite. For this reason, compiler suites tend
to be quite large so as to take advantage of any shotgun effect. Unfortunately,
there appears to be no way we can get free access to the tens of thousands of
test that make up Sun's JCK.
However, EF can benefit tremendously from the library functional tests being
provided by the Mauve project,
and perhaps also from Japhar's tests. We'll
need to provide a small set of low-level regression tests that can be run quickly
so as to give developers confidence in checking in their changes. In particular,
we'll need some low-level bytecode instruction tests. Finally, it would be nice
to incorporate a few large Java applications into the test suite.
Automated Testing
Worse than introducing a regression is not discovering it until sometime after
it's introduced. When ElectricalFire was a commercial product, an automated
report was generated every day, which provided summary statistics and identified
test regressions. Assuming we can come up with the tests to run, it shouldn't
be too difficult to resurrect this machinery. Another type of automated testing
that we planned to do, but never got around to, was performance tracking, so
as to catch slowdowns in the compilation process or in compiled code.
Porting
Only the x86 code generator is reasonably close to working order. The others
(PowerPC, Sparc, PA-RISC, MIPS) are in various states of disrepair, as the original
development tea, temporarily shed them to concentrate on the x86. At one point,
Scott
Silver was working on a tool to make it easier to specify instruction selection
rules. He would be a good person to contact when starting up a port to another
architecture.
Exception Support
ElectricalFire's exception-handling code was nearly complete when the original
EF project was cancelled. Some of the remaining tasks:
- Convert native C++ exceptions thrown by the compiler, e.g. IncompatibleClassChangeError,
into Java exceptions at the boundary between the compiled Java caller and
the native code callee. This is done in some places, but not all.
- Add support for x86 integer exceptions, e.g. divide-by-zero errors.
- Fix the bytecode verifier's overzealousness. Right now, it does not
allow methods without an explicit return or throw as final bytecode, even
if that code is dead. This breaks some Java classes.
- Asyncronous exception handling needs to be addressed. (Or maybe not.
Isn't Thread.Stop() deprecated now ?)
Register Allocator
The existing register allocator, which is a variant on the classic coloring algorithm,
suffers from unacceptably slow compilation performance for some methods. Unfortunately,
the SSA intermediate form used in EF tends to produce lots of temporaries and
this drives up the cost of register allocation. (The complexity of computing the
register interference graph goes up as the square of the number of registers.
Editor: Is this right ? I seem to remember that our algorithm has an
N^4 component to it.) In some extreme cases, compilation
of a single very large function could take 20 minutes. We need an algorithm that
is both more scalable and that makes better assignment/spilling decisions.
A hierarchical allocator is needed, i.e. one that operates depth-first on the
control-flow graph. Prior to the cancellation of the EF commercial project,
a developer was working on a replacement allocator based on Proebsting et al's
probability-based register allocation algorithm. Believe it or not, we seem
to have lost the source code for the replacement allocator and we have not been
successful at locating the developer. (Anyone seen Laurent Morichetti
?)
Garbage Collection
ElectricalFire's garbage collector, codenamed
"Sport Model", is substantially complete, but it is not yet integrated with the
rest of the compiler. Doing that integration is a massive task, since the
GC must become intimately entwined with the compiler and runtime, e.g. by constraining
the object layout and by requiring access to the stack and registers. Now might
be a good time to review some of the Sport Model design decisions to see if a
simpler initial implementation is possible. For example, the current design
in which object headers are segregated onto their own memory pages might be an
optimization for a future release. The conservative versus non-conservative
GC question is another good way to start a fist-fight among the GC cognoscenti.
Optimizing Code Generation
I'm not even going to attempt an exhaustive list of all the optimizations that
might be implemented in EF. Even when source Java bytecodes are optimized
prior to compilation, there are still many optimizations that can be done at the
primitive graph or instruction level. Some obvious candidate optimizations
would seek to lower the overhead that Java imposes (versus C++) by reducing/eliminating:
-
null-reference checks
-
array bounds checks
-
checked narrowing of types
Many of the traditional optimizations are also useful:
- invariant loop hoisting and other types of code motion
-
common subexpression (CSE) elimination
-
better register allocation
-
instruction scheduling
When inlining is added to the compiler's arsenal, most of the traditional compiler
optimizations also become important.
Waldemar wrote some more about
optimizations a long time ago.