The Relationship Between XML and Java

Overview

Sun Microsystems puts across the view that XML and Java are complementary technologies, that Java is about mobile code whereas XML is about mobile data. In this article I will put forward the view that XML and Java are fundamentally competing technologies and that, although they do complement each other in certain regards, the XML philosophy fundamentally undermines the Java one.

Introduction

Just as XML "future proofs" your data by allowing you to change database servers or generate output in different formats, the Java platform "future proofs" your applications against hardware and operating system obsolescence.
Code Fast, Run Fast with XML Data Binding - Eric Armstrong for Sun Microsystems, Inc.

The two most hyped technologies related to the World Wide Web today must be Java and XML. They're very different beasts. Java is a language, code library and virtual machine. It is a new platform for software development that can be layered on top of existing platforms. XML, on the other hand, appears to be a more modest thing. It's an open way of describing data, with extensions to describe how it is manipulated and displayed. Java is a piece of engineering; XML is a set of standards documents for how engineering communicates.

It's very easy to make the case that XML and Java complement each other. For a start, XML needs implementations and Java is currently the most popular language of choice for XML systems. Also, XML frees us from proprietary data formats, while Java (it is claimed) frees us from proprietary platforms. Sun have commissioned an article which puts across this view.

Personally, I think this idea, although attractive, is fundamentally flawed. XML and Java do complement each other in the sense that the XML standards need to be implemented in a language and at the moment Java is the most popular choice for that language. More fundamentally though, they're actually competing. The reason is that from a broader perspective Java and XML, and even Windows, are all ways of solving the same problem.

The Problem is Heterogeneity

The problem is that the world is full of different machines based on different architectures running different operating systems and they all want to talk to each other. One of the most awkward recurring problems in software development is crossing technology boundaries. I want the address book in my PDA to be synchronized with the one in my email client. I want the legacy data stored in the company mainframe to be viewable with clients running on all the machines in the enterprise. I want to create a graph in Excel that's embeddable within a Java application. Our biggest wins come when we create technologies that help us to plug all this stuff together.

The Solutions

If we want to grossly simplify the solution space we could say that there are three basic solutions to the problem of heterogeneity, each with its own flagship example. The solutions are:

Make All Machines the Same a.k.a. Windows Everywhere

The strategy here is to get every machine we want to connect together running the same operating system and the same set of applications. Only one company has come, or will come, close to achieving this - Microsoft.

As a long-term strategy it will never completely succeed - there's just too much variation in systems: mainframes, workstations, PCs, set-top boxes, mobile phones, PDAs etc. Having said that, this strategy works well enough to make Microsoft the most successful software company in the world. When 90% of desktops run your software, failure is relative.

Make All Machines Look the Same a.k.a. Java

Of the three solutions I'd say this was the least likely to succeed. What you do is create a platform on a platform, what is called the virtual machine (VM). Software is written to run on the virtual machine rather than the native ("real"?) machine like Windows, MacOS or Unix. It's a very difficult thing to do. Here are some of the things you have to achieve:

Despite these problems, Sun have done a very good job so far and Java has become a very popular language, partly on the usability of the language and partly on the quality of the libraries. Unfortunately Java often runs out of steam. For example, it's often easier to write a Java application that talks to machines on the other side of the planet (using HTTP, CORBA, RMI etc.) than one that talks to applications running on the same machine (using COM or AppleEvents). Sun doesn't like to think of a world where a Java application talks to an Excel spreadsheet, but that's the kind of world most of us live in.

Make All Machines Talk the Same a.k.a. XML and the Internet

When it comes to dealing with the heterogeneity issue, the Internet must be the path of least resistance. Whatever systems you have, you keep; you just install additional software that talks the language of the Internet. Sure, there's some rewriting involved to make legacy applications Internet-aware, but this is minor compared to porting these applications to a new platform.

Where XML comes in is when you start dealing with data. Legacy applications aren't important, legacy data is. XML provides a way of expressing many different kinds of data in a way that is universally readable. XML is, to some degree, self-describing and when you send someone some data in XML they don't necessarily need the application used to create that data to do something useful with it. The Internet and XML are about plumbing, about plugging stuff together in a useful way.
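To make the "self-describing" point concrete, here is a minimal sketch of a program reading address-book data without the application that wrote it. The document layout (addressBook, contact, name and email elements), the class name AddressBookReader and the sample entries are all invented for illustration; only the standard Java XML parsing classes are real. Java is used here purely as an example, and any language with an XML parser would do the same job.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class AddressBookReader {

    // Hypothetical address-book document; the element names are assumptions,
    // not taken from any real PDA or email client format.
    static final String SAMPLE =
        "<addressBook>"
      + "<contact><name>Ada Lovelace</name><email>ada@example.org</email></contact>"
      + "<contact><name>Alan Turing</name><email>alan@example.org</email></contact>"
      + "</addressBook>";

    // Returns one "Name <email>" string per contact element in the document.
    static List<String> readContacts(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            NodeList contacts = doc.getElementsByTagName("contact");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < contacts.getLength(); i++) {
                Element contact = (Element) contacts.item(i);
                // The tags themselves tell us which value is which.
                String name =
                    contact.getElementsByTagName("name").item(0).getTextContent();
                String email =
                    contact.getElementsByTagName("email").item(0).getTextContent();
                result.add(name + " <" + email + ">");
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse address book", e);
        }
    }

    public static void main(String[] args) {
        for (String line : readContacts(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Nothing in the reader depends on the program that produced the data, which is the sense in which the data outlives the application.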

One final point that needs to be addressed is whether Java counts as an Internet technology. Java is certainly heavily associated with the Internet but I'd argue that Java doesn't really embrace the Internet way of thinking - it is about making all platforms the same rather than making them all talk together, it is about believing you can mask heterogeneity effectively rather than embracing it.

Do We Need A Single Winner?

It's important to ask if we need one of these approaches to win over the others. I think the answer is no and I think we benefit from the fact that they are all trying to solve the same fundamental problem. Owning a Windows PC doesn't stop me from browsing the web or running Java applications - we have non-exclusive choices. What is important is that our software development has a focus that gives the greatest long-term value.

The Focus: Platforms or Protocols?

When implementing a new system or when trying to extend an old one the central technology question has to be "Do we build on top of platforms like Java and Windows or on top of protocols like HTTP and XML?". Sure, you have to use a platform, or possibly even a number of them, but choosing to be platform-centric rather than protocol-centric leads you down dead ends. Platforms are a noose with which to hang yourself, protocols free you from that noose.

Windows or Java? Who Cares?

XML needs a language for implementation and a platform for those implementations to run on, but with regards to the choice of language and platform we can afford to be neutral. Sure, some languages lend themselves to supporting XML, such as those with Unicode support, but the choice of language isn't a show stopper. Use whatever is easiest.

Conclusion

It's easy to see XML and Java as the Yin and Yang of the Internet, but XML doesn't need Java in the long term - there's no lock-in. To future-proof our systems we need to understand that the most successful systems are those which can communicate with other systems and can use what has been implemented before. Reinventing the wheel, which is central to the Java ideal, is just wasting time. Making systems talk, making them Internet-enabled, is the way forward.


Appendix: Jini Considered Harmful

One platform in particular should be treated with caution: Sun's Jini. Jini is a distributed systems technology built on Java and can be seen as the progression of the Java ideal. With Jini, devices can be plugged into the network and they will automatically discover services and download any stub code that they might need to interact with those services.

The Jini concept is truly incredible and it would be easy to see it as the next stage in the evolution of the Internet, taking us into a phase where consumer devices can be networked with few administrative worries. Unfortunately the downside of Jini is, in my view, unacceptable: Java lock-in. Building systems based on Jini compels you to continue to use Java for the entire life of your system. A protocol-centric approach would allow system components to be replaced easily and allow you to exploit technologies that haven't even been thought of yet. Technologies such as the Microsoft-backed Universal Plug and Play take a protocol-centric approach that can deliver much of what Jini offers without the lock-in.


History

Date        Version  Comments
30/04/2000  0.1      For review by peeps.
08/05/2000  1.0      Retitled, moved Jini comments to an appendix, corrected mistakes pointed out by Pete plus other minor changes.

Copyright Ian Fairman 2000 - ifairman@yahoo.com