Tuesday, August 5, 2008

Java is suitable for building large applications

For this point, we need to distinguish between Java the programming language (the description of syntax and semantics) and Java as it is implemented today. As a language, Java may be perfectly suitable for big projects. Its object orientation supports integration of large numbers of object classes. By eliminating explicit pointers, it allows programmers to write more maintainable code. So Java as a language is likely to be a better choice than C and probably better than C++ for large applications. Of course, we won't know until someone actually tries it! We are now seeing descriptions of a few large Java development projects, most of which seem sketchy or self-serving enough to make one want to wait for further documentation before accepting their claims.

But while the Java language may be appropriate for big programs, Java as it is implemented in web browsers is not. With a fully compiled language like C, all of the compiled code is combined into an executable program as part of a link process. References to symbols in one module are resolved to their definitions in another.

Java may also turn out to be unsuitable for big standalone applications, not just for big applets. Part of the problem is likely the way Java deals with memory; none of the current Java environments handles large memory spaces well. (A speaker at the 1997 Java Internet Business Expo made an interesting comment on his attempts to benchmark Java: in taking his C++ benchmarks to Java, he had to reduce the data size by a factor of ten before any of the Java environments could run the programs to completion.)

But there's another potential problem that is inherent in the dynamic, rapid-prototyping style of development that Java and its advocates encourage. Good prototypes tend to become very bad applications. As we learned (or at least should have learned) from our brush with Lisp and Expert Systems in the 1980s, there's a world of difference between prototyping an application and producing a piece of production-quality code. It's far more than a matter of fixing bugs and smoothing out the rough edges. The very process of designing as we code leads to applications that don't meet the requirements of stability, reliability, maintainability and extensibility we demand of professional software.

Unlike a C linker, which does its work once at build time, Java resolves all symbols when an applet is loaded into the browser. Each class mentioned in the applet class is loaded into the browser and all of its symbolic references are resolved. Inheritance relationships among classes are also resolved at this time; where C++ decides the location of each class member at compile time, Java defers this decision until you attempt to load and run the class.

The upshot of all this is that the equivalent of program linking occurs when you run the code in a class. The larger the class, the larger the number of classes and the more complex the inheritance tree, the longer all this will take.
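The deferral described above can be observed in a small way. What follows is a minimal sketch, assuming a modern standalone JVM rather than a 1990s browser; the class names `LazyLinkDemo` and `Helper` are hypothetical. The static initializer of `Helper` runs only when the class is first used, so the example makes visible one piece of the per-class work that Java postpones until run time. Exactly when loading and symbol resolution happen is left partly to the JVM implementation, so the sketch shows only initialization, the phase that is easiest to observe.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyLinkDemo {
    // Records the order in which things happen at run time.
    static final List<String> events = new ArrayList<>();

    // Hypothetical helper class; its static initializer runs on first use,
    // not when the program starts.
    static class Helper {
        static {
            events.add("Helper initialized");
        }

        static int answer() {
            return 42;
        }
    }

    // Returns a snapshot of the event log for one run.
    public static List<String> run() {
        events.clear();
        events.add("before first use");
        int value = Helper.answer();   // first active use triggers initialization
        events.add("after first use: " + value);
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

On the first call in a fresh JVM, "Helper initialized" appears between the other two entries. On later calls it does not appear, because the class is initialized only once. The more classes a program touches, the more of this first-use work accumulates at run time.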

In addition to dynamic linking, Java performs one other important task before it can begin running a class: validating the code to prevent it from doing anything dangerous. This requires a scan of all of the loaded code to look for illegal operations and attempts to break out of the restrictions placed on untrusted applets. Again, the more code you have, the longer it will take to process the code before it can begin to run.

Another concern with using Java for large applications is its reliance on stop-and-copy garbage collection. Whenever the application begins running low on memory, everything stops while the GC determines which objects can be reclaimed. Objects still in use are copied to a new area of memory, leaving a large contiguous area of free space. Once the GC finishes, the program is free to continue execution.
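To make the stop-and-copy idea concrete, here is a toy sketch of a copying collector. It is not how any real Java runtime is implemented. Everything in it is an assumption made for illustration: the class name `SemispaceDemo`, the fixed object layout (two reference fields per object), the explicit `roots` array, and the separate `forward` table used in place of in-object forwarding pointers. Collection is also triggered by hand here rather than by low memory.

```java
import java.util.Arrays;

public class SemispaceDemo {
    static final int NULL = -1;      // marks "no reference"

    final int capacity;              // objects per semispace
    int[] left, right;               // from-space: two reference fields per object
    int next = 0;                    // bump-allocation pointer
    final int[] roots;               // references the program still holds

    private int[] toLeft, toRight;   // to-space, used only during collect()
    private int[] forward;           // old index -> new index, NULL if not yet copied
    private int free;                // next unused slot in to-space

    SemispaceDemo(int capacity, int rootCount) {
        this.capacity = capacity;
        left = new int[capacity];
        right = new int[capacity];
        roots = new int[rootCount];
        Arrays.fill(roots, NULL);
    }

    // Allocates one object; this toy never collects automatically.
    int alloc(int l, int r) {
        if (next == capacity) {
            throw new IllegalStateException("toy heap full; call collect()");
        }
        left[next] = l;
        right[next] = r;
        return next++;
    }

    // Copies one object to to-space (once) and returns its new index.
    private int copy(int ref) {
        if (ref == NULL) return NULL;
        if (forward[ref] == NULL) {
            forward[ref] = free;
            toLeft[free] = left[ref];    // fields still hold old indices here
            toRight[free] = right[ref];
            free++;
        }
        return forward[ref];
    }

    // The program is "stopped" for the whole of this method.
    int collect() {
        toLeft = new int[capacity];
        toRight = new int[capacity];
        forward = new int[capacity];
        Arrays.fill(forward, NULL);
        free = 0;
        for (int i = 0; i < roots.length; i++) {
            roots[i] = copy(roots[i]);
        }
        // Scan copied objects, copying whatever they reference in turn.
        for (int scan = 0; scan < free; scan++) {
            toLeft[scan] = copy(toLeft[scan]);
            toRight[scan] = copy(toRight[scan]);
        }
        left = toLeft;                   // swap spaces; old space is now garbage
        right = toRight;
        next = free;                     // everything past here is contiguous free space
        return free;                     // number of live objects
    }

    // Builds two live and two dead objects; returns slots used before and after.
    public static int[] demo() {
        SemispaceDemo heap = new SemispaceDemo(8, 1);
        int a = heap.alloc(NULL, NULL);
        int b = heap.alloc(a, NULL);
        int c = heap.alloc(NULL, NULL);  // unreachable: garbage
        heap.alloc(c, c);                // unreachable: garbage
        heap.roots[0] = b;
        int usedBefore = heap.next;
        int usedAfter = heap.collect();
        return new int[] { usedBefore, usedAfter };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("slots used before: " + r[0] + ", after: " + r[1]);
    }
}
```

Running `main` prints `slots used before: 4, after: 2`. The work done in `collect()` grows with the number of live objects that must be copied, and nothing else can run while it proceeds, which is the source of the pauses discussed here.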

Right now garbage collection is quick, taking perhaps one or two tenths of a second. But imagine what happens when the size of the Java code and its storage requirements increase by a factor of ten or one hundred. Suddenly we will see our program stop for seconds or even minutes while the garbage collector goes about its work. Solving this problem, as Lisp and Smalltalk systems have had to do, will require a much more sophisticated approach to garbage collection, using a generational scheme or a reference-counting model. Either technique will add complexity and overhead to the Java runtime environment.

Note that the first commercial Java applets don't use Java for everything. Applix's Java-based spreadsheet, for example, uses Java for the user interface. All the real processing, including loading and saving spreadsheets, is done in CGI code on the server. This is probably the best model for using Java in sophisticated applications. Once there are fully compiled Java implementations, of course, all the rules change.
