Tuesday, August 5, 2008

Java code is portable, where C and C++ are not

Java source code is a little more portable than source code written in C-based languages. In C and C++, each implementation decides the precision and storage requirements for basic data types (short, int, float, double, etc.). This is a major source of porting problems when moving from one kind of system to another: changes in numeric precision can affect calculations, and assumptions about the size of structs can be violated. Java defines the size of basic types for all implementations; an int on one system is the same size (and can represent the same range of values) as on every other system. Java also does not permit arbitrary pointer arithmetic, so assumptions about struct packing and sizes can't lead to non-portable coding practices.
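As a small illustration of the fixed-size point, here is a sketch that prints the bit widths and the int range using constants from the standard wrapper classes. The class name PrimitiveSizes and the method report() are made up for this example; the constants themselves (Integer.SIZE, Integer.MAX_VALUE, and so on) are part of the standard library and should report the same values on any conforming Java implementation.

```java
public class PrimitiveSizes {
    // Builds a short report of the bit widths that the language fixes
    // for its primitive types (hypothetical helper for this example).
    static String report() {
        StringBuilder sb = new StringBuilder();
        sb.append("byte=").append(Byte.SIZE).append('\n');
        sb.append("short=").append(Short.SIZE).append('\n');
        sb.append("char=").append(Character.SIZE).append('\n');
        sb.append("int=").append(Integer.SIZE).append('\n');
        sb.append("long=").append(Long.SIZE).append('\n');
        sb.append("float=").append(Float.SIZE).append('\n');
        sb.append("double=").append(Double.SIZE).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
        // The range of int is fixed as well, not just its storage size.
        System.out.println("int range: " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
    }
}
```

The equivalent C program would have to use sizeof and the limits.h macros, and its output could legitimately differ from one compiler or platform to the next.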

(One reader of this page points out that while storage requirements for float and double are defined by Java, precision during calculation is not. This means that a program that uses floating point arithmetic can produce different answers on different systems, with the degree of difference increasing with the number of calculations a particular value goes through. This is true of floating point in general, not just in Java, and explains why the Cobol world continues to rely on bizarre data types like COMPUTATIONAL-3 (binary coded decimal) for calculations where accuracy matters.)
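To make the accumulation problem concrete, here is a minimal sketch that adds one tenth repeatedly, once with double and once with java.math.BigDecimal, which is roughly the Java counterpart of a decimal type. The class and method names (DecimalVsBinary, sumDouble, sumDecimal) are invented for this example. Note that it shows rounding error building up within a single run; it does not by itself demonstrate differences between systems.

```java
import java.math.BigDecimal;

public class DecimalVsBinary {
    // Adds 0.1 to a running total n times using binary floating point.
    // 0.1 has no exact binary representation, so error accumulates.
    static double sumDouble(int n) {
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            total += 0.1;
        }
        return total;
    }

    // Adds 0.10 to a running total n times using decimal arithmetic.
    // The value is built from a string so it is stored exactly.
    static BigDecimal sumDecimal(int n) {
        BigDecimal total = BigDecimal.ZERO;
        BigDecimal step = new BigDecimal("0.10");
        for (int i = 0; i < n; i++) {
            total = total.add(step);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("double:  " + sumDouble(10));
        System.out.println("decimal: " + sumDecimal(10));
        System.out.println("double sum equals 1.0? " + (sumDouble(10) == 1.0));
    }
}
```

The decimal version is slower and more verbose, which is the usual price for exact results in calculations such as money handling.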

Where Java is more portable than other languages is in its object code. Most language compilers generate the native code for the target computer, which then runs at the best speed of which the system is capable. Java compiles to an object code for a theoretical machine; the Java interpreter emulates that machine. This means that Java code compiled on one kind of computer will run on every other kind of computer with a Java interpreter. The tradeoff is in performance: the interpreter adds a significant level of overhead to the program.

Note that this extra overhead can be reduced considerably by just-in-time compilation techniques. When the Java interpreter receives a chunk of code to execute, it can convert it from Java object code into the native code of the machine and then execute that native code. This adds some overhead during the translation process but permits the resulting code to run at close to native speeds. Java is still likely to be slower than C or C++, due to some features of the language intended to ease development. It's hard to know how close well-optimized native Java code can get to the best C or C++, but a range of 50% to 200% slower (1.5x to 3x the execution time) seems a reasonable guess.

But it's important to note that an application written in Java is still not 100% portable. An application written on one kind of system will still need to be tested on every platform before one can say with certainty that there are no problems. Even if the Java code itself were 100% portable (and it isn't; just compare the peculiarities of the Sun implementation of threads with Netscape's), every time the code goes out to native runtime code it encounters incompatibilities: the window toolkit and networking support are riddled with such problems.
