Distributed XML Development

by Paul Connaughton

B.A. (Mod) Computer Science

Trinity College, University of Dublin, Ireland

Supervised by Alexis Donnelly

 

Abstract

This project investigates the use of XML as the data format for a generic distributed collaborative tool. It describes how XML is used and what advantages it gives. The implementation of a real-time distributed whiteboard is provided as an example of the power of this application.

 Keywords: Collaborative workplaces, distributed systems, XML, Java, SVG.

Contents 

Chapter 1 Introduction

Chapter 2 Background and Research

Chapter 3 Software and System Architecture

Chapter 4 Technology Evaluation

Chapter 5 Software Design

Chapter 6 Project Management

Chapter 7 Analysis

Chapter 8 Conclusion

Chapter 9 References

Chapter 1 Introduction

This project is a distributed system for the collaborative development of XML documents. A real-time distributed whiteboard is implemented as an example.

Overview of Report

Chapter 1 - An introduction to the project.

Chapter 2 - The background to how the project came about and the results of research carried out in the general area of collaborative tools.

Chapter 3 - A specification of the architecture of the system and the software components.

Chapter 4 - An evaluation of the third party software components available and those employed.

Chapter 5 - The design of the software developed for the implementation of the system. UML diagrams and explanations are provided for the main classes and packages.

Chapter 6 - A short account of the project management.

Chapter 7 - A reflection on the main problems encountered and the solutions used, as well as a summary of possible improvements and future work.

Chapter 8 - The report finishes with a conclusion.

Chapter 2 Background and Research

Inspiration

Whilst I was working for Lucent Technologies, a conference call was held with an overseas branch. During the call, the need arose to explain a diagram to the other group. This is not easy to do by telephone, especially as the diagram grew more complicated, with arrows, lines and boxes added as ideas developed. On this scale a videoconference was too costly, but something was clearly needed: a tool, simple to use and simple to set up, to enable distributed idea-sharing.

To reinforce the need for a simple tool like this, consider the following example: during a lecture, particularly an engineering or science one, a lecturer would (hopefully) not rely only on his/her words. A lecture would include slides, sketches and diagrams to explain ideas and help understanding.

Research

Research was carried out through two main sources, face to face (work experience contacts and university colleagues) and the Internet. The aim of the research was to discover the following: 

It was assumed that something was missing because of the situation encountered with Lucent.

Early encouragement for the project was provided by Kevin Scally, Development Manager at Trintech Technologies, who was of the opinion that 'there has been a lot of work in that area, but nobody has really cracked it' [19].

Research revealed a lot of papers on the subject but relatively few products. There is even a standard defined, the T.120 Data Conferencing Standard [8], for whiteboard, chat, audio and video. Several applications use this standard, like the outdated Netscape Conference [12] and Microsoft's NetMeeting [10].

NetMeeting is probably the most popular tool at the moment, used regularly in the IT departments of multi-national corporations like Goldman Sachs [20] and Waterford Wedgwood [21]. As well as supporting the T.120 standard (whiteboard, chat, etc.), it provides generic distributed capabilities, using expert knowledge of the Windows platform to provide application and desktop sharing. However, it has firewall restrictions, causes other programs to hang (any Java or OpenGL application) and is limited to the Windows platform. Other limitations are discussed in [11].

The power that NetMeeting provides is impressive, but I believe that the set-up it requires makes it unsuited to ad-hoc usage. To start a meeting, one member must first host it and then communicate his/her IP address to the other members so that they can connect.

A more favourable option is an application in a web browser, like a Java applet, requiring no installation and (if designed well) no configuration. Several whiteboards and chat-rooms are already deployed on the web in this manner, like Netwriter [15] or GroupBoard [17]. Both of these provide only simple drawing commands, with little power or flexibility, but there is easy access: just by pointing your web browser at their site a new session can be started. There are also no problems with firewall restrictions, and Netwriter has the added bonus of allowing the configuration of private rooms and the scheduling of meetings. Much of the response on the Netwriter forum (on the web site) is very favourable and supports the view that there is a gap in the market. Here is an extract:

 'My credit card is burning a hole in my pocket...' - Simon Clifford (Mar 2000)

 'Is there a paid version that offers additional functionality? If so what does it cost, where can it be purchased an what additional functions are available.' - 'Dennis' (July 2000)

These comments are very encouraging. With regard to deployment, it seems that an applet is ideal, but with the added ability to function as an application without security restrictions. The scope for improved functionality is tremendous: pair programming [51], [52], software development tools, CAD and word processing would all be useful if they could be executed in a distributed manner. Imagine a PC with only Notepad and Paint! Then why should it be different for distributed applications? Greg Schottland [3] supports the need for better distributed applications in a paper on industrial whiteboarding for software design, in which he says, 'Software development today, more than ever, involves groups of developers no longer sharing the same office or building, but are often distributed across the country or the globe.'

Deutsche Bank [53] is working on a wireless system to deliver over-night market news to the palm-tops of their traders coming into work in the morning. One source claims that doing this would save half an hour a day for each trader, which translates to $150 million per annum. This saving could equally apply to a group of software engineers using similar distributed tools.

Is there a way of incorporating all of these functions into a single distributed application, and of doing so efficiently? The application sharing function of NetMeeting is a good idea but is unreliable (with non-Microsoft applications) and of little more use than a shared view. It does not allow multi-client simultaneous editing. Full interaction is needed. Distributed applications have many things in common, so NetMeeting has the right idea in trying to provide a generic framework for sharing any application. In view of this, the aim of this project is to provide a generic system for distributed collaborative workplaces. This will be achieved by providing a framework to view, develop and create any type of XML document, limited only by the formats defined and the UIs available to edit them. A system such as this could be used with future XML formats as they are defined and standardised (like DocTypeX).

Research indicates that there is only one company currently investigating any such area, Loox Informatique [18], based in Paris. According to the W3C SVG [24] implementations page, they are working on an SVG (an XML mark up language for defining vector graphics) viewer that 'will have distributed collaborative SVG editing and dynamic SVG generation in later releases.' 

Workplaces are not restricted to offices, so a system like this should allow collaboration between someone using a Mac at an Internet café and someone behind a firewall using a PC with the installed application. Eventually clients using mobile devices should be able to take part using a wireless protocol.

Now that the background to the project has been explained, the next chapter will describe how the system works. 

Chapter 3 Software and System Architecture

This section explains how the system provides the distributed XML development: the architecture and the protocols involved. It also touches upon the huge area of concurrency controls, explaining the considerations made.

Platform

The system is completely platform independent and can be run on a Macintosh, Windows or Unix system.

Network Architecture

The proposed architecture is a single server with multiple clients. Messages are passed from the sending client to the server and then on to the receiving clients.

Using a single server provides a central place for clients to log on to, which removes the need to use a different IP address at each log-on. A major limitation is performance, as each message has to be sent to the server and then relayed to the client, effectively doubling the transmission time.

A peer-to-peer architecture would perhaps be more efficient but would require greater consideration of concurrency control issues. A centralised server means that concurrency control is concentrated in one place.

System Architecture

Three important points are shown in the following diagram:

When a user, through a UI, makes a change to a document, the local copy is updated and then the update is passed to the server, which updates its documents and then passes the update to all clients that are logged on. In the above diagram the update made by Client A is used to update the server which then propagates the update to Client B.


By keeping the latest version on the server, clients who log on in the middle of a session can obtain a copy of any of the documents being worked on.

Client and Server Architecture

The server must perform several functions:

The most crucial function is applying the client updates to documents; how the server does this is discussed in the Concurrency Control section at the end of this chapter.

It's not all left to the server; the client has a lot to do as well:

The following diagram shows the individual components of the system. In the client, it should be noted that only updates originating from the UI are passed to the network. If this were not the case, updates would pass through the system ad infinitum.

 


As can be seen, the components on the client are reusable by the server.

Updates

XML documents are made up of a hierarchy of elements (often called tags), arranged in a tree-like structure. Documents can be modified by manipulating these elements. In order to control how the document update data is transported through the system and how it is used, a protocol had to be defined. That protocol consisted of the following types of element update: 

The XML elements used for the update are encapsulated inside update elements, these update elements contain information on the type of update. 
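The encapsulation described above can be pictured with a small DOM sketch. The element name `update`, the attribute name `type` and the value `insert` are assumptions made for illustration only; the report does not give the actual names used by the protocol.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class UpdateEnvelopeSketch {
    /** Creates an empty DOM document using whichever JAXP implementation is installed. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Wraps a payload element in a hypothetical <update type="..."> element. */
    static Element wrap(Document doc, String type, Element payload) {
        Element update = doc.createElement("update"); // hypothetical element name
        update.setAttribute("type", type);             // hypothetical attribute name
        update.appendChild(payload);
        return update;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element rect = doc.createElement("rect");
        rect.setAttribute("width", "100");
        Element update = wrap(doc, "insert", rect);
        System.out.println(update.getTagName() + " type=" + update.getAttribute("type"));
    }
}
```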

A major issue with allowing the update of elements (as opposed to only allowing the creation of new ones) is how to reference existing elements. Object references (used in OO programming languages) cannot be used because this is a distributed system, so some kind of unique id is needed for each element. This id has to be unique across the system and associated with the element in some way.

The best way to make an id unique across the system would be to have the centralised server generate the id as requested by the client. This way each id is guaranteed to be unique.

This solution would mean that when a client makes an update (e.g. an insert element), a request is made to the server for an id, and only when the id is returned can the update be made locally. The idea of the user having to wait before he/she can see the results of his/her update on the screen is unacceptable; users expect instant results.

Instead, when a client logs on it is given a user id. By concatenating this id with a client-generated element id, a globally unique id is obtained. This is similar to a method used in timestamping as concurrency control for distributed database systems, where taking the local time and appending the site id creates a unique timestamp [50].

An advantage of this is that the user who created a particular element can be identified at a later stage.
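A minimal sketch of this id scheme follows. The class name, the `-` separator and the use of a simple counter for the client-generated part are assumptions for illustration; the report only states that the server-assigned user id is combined with a locally generated element id.

```java
import java.util.concurrent.atomic.AtomicLong;

class ElementIdGenerator {
    private final String userId;                          // assigned by the server at log-on
    private final AtomicLong counter = new AtomicLong();  // client-generated part

    ElementIdGenerator(String userId) {
        this.userId = userId;
    }

    /** Returns ids of the form userId-counter, with no round trip to the server. */
    String nextId() {
        return userId + "-" + counter.getAndIncrement();
    }

    /** Recovers the user id of the creator from an element id. */
    static String creatorOf(String elementId) {
        return elementId.substring(0, elementId.lastIndexOf('-'));
    }

    public static void main(String[] args) {
        ElementIdGenerator a = new ElementIdGenerator("u1");
        ElementIdGenerator b = new ElementIdGenerator("u2");
        System.out.println(a.nextId() + " " + a.nextId() + " " + b.nextId());
    }
}
```

Because the user id is part of every element id, two clients can never generate the same id, and the creator of an element can be read back from the id itself.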

Concurrency Control

Concurrency Control is a massive topic in the subject of distributed systems so in the interests of brevity this section will attempt to maintain relevancy to the project rather than generally explore this topic in detail. 

Transactions

A fundamental aspect of concurrency control is the notion of a transaction: a series of actions treated as an indivisible unit. In this system an update is the equivalent of a transaction. There are four ‘ACID’ properties of a transaction and this is how they relate to the application of document updates:

  1. Atomicity – an update is applied or not (‘all or nothing’)
  2. Consistency – an update transforms a document from one consistent state to another.
  3. Independence – updates execute independently of each other
  4. Durability – updates cannot be ‘undone’ (However the effects of one update can be changed by another update)

When an update is about to be applied, its validity is checked; if there are no problems, the update is applied to the document (atomicity).

Whenever updates are made to a document, a lock is placed on the document by the process performing the update. This is to stop other processes performing updates at the same time, which could lead to corruption of the document (independence).

The lock is placed on the whole document. This granularity could be changed to lock only the elements that are the subject of an update; that would improve performance but would require extensive testing, and so will not be implemented at this time.
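The following sketch shows one way of combining the validity check with a document-wide lock in Java, using a synchronized method as the lock. The class and its validity rule (reject null or duplicate ids) are hypothetical stand-ins, not the project's own classes.

```java
import java.util.ArrayList;
import java.util.List;

class LockedDocumentSketch {
    private final List<String> elements = new ArrayList<>();

    /** Checks the update, then applies it, all while holding the document-wide lock. */
    synchronized boolean apply(String elementId) {
        if (elementId == null || elements.contains(elementId)) {
            return false; // invalid update: nothing is applied
        }
        elements.add(elementId);
        return true;
    }

    synchronized int size() {
        return elements.size();
    }

    /** Several client threads apply updates concurrently; returns the final element count. */
    static int runDemo(int threads, int perThread) {
        LockedDocumentSketch doc = new LockedDocumentSketch();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int client = t;
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    doc.apply("c" + client + "-" + i);
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return doc.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 250));
    }
}
```

Locking individual elements instead would replace the single monitor with one lock per element, at the cost of the extra testing mentioned above.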

Messaging Order

In a distributed system such as this, a lot of care has to be taken with the order of message delivery and reception (in this project messages and updates are assumed to be the same thing and the words are used interchangeably). If precautions are not taken then it is possible for different clients to hold inconsistent copies of documents. Let us take the following example (see fig 3.3) with two clients, A and B, connected to the server.


Client A makes an update (1) which is received by the server and passed on to B. Let us imagine that B makes an update after A but before it has received notification of A's update (2). In this case the server and client A see a different order of updates from client B (3). If the protocol finished here, serious problems could occur, especially if the two updates concerned the same element. The problem is resolved by making sure that the server sends client B's update back to it again, so that consistency is maintained between the server and each client (4).
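The scenario can be replayed in a few lines of Java. This is a simplified model, not the project's code: each replica holds the value of a single shared element, applying an update simply overwrites that value, and the `echoToSender` flag switches the server's echo on or off.

```java
import java.util.ArrayList;
import java.util.List;

class EchoOrderingDemo {
    /** A replica holding the value of one shared element; the last applied update wins. */
    static class Replica {
        String value = "";

        void apply(String update) {
            value = update;
        }
    }

    /** Replays steps (1) to (4); returns the final values held by server, A and B. */
    static List<String> run(boolean echoToSender) {
        Replica server = new Replica();
        Replica a = new Replica();
        Replica b = new Replica();

        a.apply("fromA");                       // (1) A updates its local copy
        b.apply("fromB");                       // (2) B updates before hearing about A

        server.apply("fromA");                  // server receives A's update first
        if (echoToSender) a.apply("fromA");
        b.apply("fromA");                       // (3) B has now applied B then A

        server.apply("fromB");                  // server receives B's update second
        a.apply("fromB");
        if (echoToSender) b.apply("fromB");     // (4) the echo restores the server's order on B

        List<String> result = new ArrayList<>();
        result.add(server.value);
        result.add(a.value);
        result.add(b.value);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("without echo: " + run(false));
        System.out.println("with echo:    " + run(true));
    }
}
```

Without the echo, client B ends up holding A's value while the server and client A hold B's; with the echo all three copies agree.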


A similar problem with the ordering of messages by the time they were sent as opposed to received will now be demonstrated by means of an example (see fig 3.4). Ordering messages in this way is known as a totally ordered delivery protocol.

Let us say that two clients, A and B, are connected to the server and each makes an update to a document, client A (1) followed by client B (2). If client A has a slower network connection than B then the server could receive B's update before A. From client A's point of view the message order is incorrect, the server will agree with client B but it is ironically client A who has the true picture of events. Then the server 'corrects' client A's ordering (3). The ordering is at least consistent across the system, but it is not the true order of the creation of the messages, it is the order that the server received them. 

Without a mechanism for time-stamping updates when they are sent (rather than received), the true order of the messages is unknown to the server.

This is a well-explored problem in distributed systems and although an important consideration in many critical systems, I believe that in this case it is relatively less significant. 

The situation will only occur whenever:

TB < TA + (MA - MB) (Where T is the time that messages are sent at and M is the time to send a message.)

Let us say that client A has a 20kb/s connection and B a 40kb/s connection, and that both clients are sending an update of 100 bytes. Taking kb/s as kilobits per second, the 800 bits take 40ms to send at 20kb/s and 20ms at 40kb/s, so client B must make its update within roughly 20ms of client A for this disordering of updates to occur. Put in this context I think it is safe to allow the possibility of the situation occurring.
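The size of this window, MA - MB, can be checked with a short calculation. The sketch assumes that kb/s means kilobits per second and ignores network latency, neither of which is stated in the text; under those assumptions the window is about 20 ms, which supports the conclusion that the race is a narrow one.

```java
class ReorderWindowSketch {
    /** Milliseconds needed to send the given number of bytes over a link rated in kilobits per second. */
    static double transmitMillis(int bytes, double kilobitsPerSecond) {
        return bytes * 8 / kilobitsPerSecond; // bits divided by kbit/s gives milliseconds
    }

    /** MA - MB: how much later B may send and still reach the server before A. */
    static double windowMillis(int bytes, double slowKbps, double fastKbps) {
        return transmitMillis(bytes, slowKbps) - transmitMillis(bytes, fastKbps);
    }

    public static void main(String[] args) {
        System.out.println("MA = " + transmitMillis(100, 20) + " ms");
        System.out.println("MB = " + transmitMillis(100, 40) + " ms");
        System.out.println("window = " + windowMillis(100, 20, 40) + " ms");
    }
}
```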

Control Tokens

The multi-client architecture of this system is the cause of these concurrency control concerns. Another solution could be provided at the user level by limiting document control to a single user at a time; this could even be a feature.

Implementing a token mechanism, whereby only the user in possession of the token can make document updates, would remove the concerns about message ordering and about maintaining the ACID properties. This easy way out will not be used, but it could be added as a feature at a later stage.

Chapter 4 Technology Evaluation

There are several components required in the system and it is hoped that some third party software packages could save a lot of development time. This chapter documents the evaluation of the suitability of these third party packages.

Programming Language

Java is being used for all programming in this project, more specifically release 1.2 or Java 2, as it is more commonly known. As well as the usual advantages that Java offers over other languages there is the following:

Document Storage

To store XML Documents on the server and clients, the W3C Document Object Model (DOM) specification [22] is used, as it is the most widely supported. This specification provides 'a standard set of objects for representing HTML and XML documents, a standard model of how these objects can be combined, and a standard interface for accessing and manipulating them.' [22] 

As XML is a hierarchical language, the DOM interface provides methods for accessing the document which are similar to the traversal of a tree.
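As an illustration of this tree-style access, the sketch below walks a parsed document depth-first using only the standard `org.w3c.dom` interfaces. It uses the JAXP factory classes bundled with current Java releases to obtain a parser, which is an assumption about the environment rather than the mechanism used in the project.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomWalkSketch {
    /** Parses a string into a DOM document with whichever implementation JAXP selects. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Depth-first walk that writes each element name, indented by its depth. */
    static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
            depth++;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth, out);
        }
    }

    static String outline(String xml) {
        StringBuilder out = new StringBuilder();
        walk(parse(xml), 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(outline("<svg><g><rect/></g><circle/></svg>"));
    }
}
```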

 The level 2 specification is used because it requires that elements have a unique id and it is by using this that elements can be referenced.

There are several implementations available; for this project the Apache implementation, Xerces [30], is being used. However, this project is not dependent on any one implementation: the implementation to be used is specified at runtime and dynamically loaded.

XML Parsing

XML parsers are required to parse the input to both the server and client. The SAX (Simple API for XML) specification provides a standard interface for event-based XML parsing. It was developed collaboratively by the members of the XML-DEV mailing list and is available on the David Megginson web site [34]. 

Event-based parsing means that, rather than reading the whole document into memory, the parser generates higher-level events as it parses. These events are sent to a document handler registered with the parser. Such events include 'start of document', 'start of element' and 'end of element'.
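A short sketch of such a handler is given below. It is written against the SAX2 `DefaultHandler` class shipped with current Java releases, rather than the original SAX `DocumentHandler` interface used in this project; the callbacks play the same role.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventLogSketch extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    /** Parses the string and returns the events in the order the parser fired them. */
    static List<String> parse(String xml) {
        SaxEventLogSketch log = new SaxEventLogSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), log);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log.events;
    }

    public static void main(String[] args) {
        System.out.println(parse("<a><b/></a>"));
    }
}
```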

Like the DOM specification there are many implementations, however unlike the DOM, the implementation chosen can be crucial to the performance of an application such as this.

An article by Clark Cooper available from xml.com [26] benchmarks three Java SAX parsers:

According to the benchmark, the fastest of these is the James Clark parser. It is based on Expat, the parser that James Clark developed in C, which is widely regarded as one of the fastest XML parsers available.

The Apache Xerces parser [30] was also considered. An unfortunate pitfall of this parser is that it reads the whole document into memory before starting to parse, which means that it will not work with a socket stream, because the document is not sent in a single shot. This goes completely against the idea of the SAX specification, namely that an event-based parser does not have to read the whole document into memory. For reading a whole document into memory the DOM specification should be used.

 Fortunately there is a fix for this on their web site [30] where a subclass of a character iterator is created and inserted into the parser. However, using this fix means that if the SAX implementation is specified at runtime, a check has to be done on the parser class and if it is of the Xerces type, then the fix is applied. This spoils the flexibility of having a generic parser interface. 

As with the DOM implementation, the project is not dependent on any one SAX implementation. The implementation is specified at runtime and any of the above parsers will work, even the Apache parser because the optional fix is coded in as a precaution. The parser being used is the James Clark parser as it is the fastest.

Graphical Mark-up Language

A simple mark-up language (an XML format) is proposed to define the graphics transported between clients for the whiteboard. 

 For example to define a rectangle to be drawn the following data could be used: 

<shape type="rect">
    <pos x="0" y="0"/>
    <size w="0" h="100"/>
</shape>

At first it was expected that a proprietary mark-up language would have to be defined and implemented, but further research revealed that a mark-up language for vector graphics was already under development by the W3C. A candidate recommendation was made on 2nd August 2000 for the Scalable Vector Graphics 1.0 Specification and, at the time of writing, a full recommendation is likely to be made soon. It was decided to adopt this specification for the project rather than create a new one.

The advantages of having this mark-up language are as follows:

In order to incorporate SVG into the project, two components are required:

The priority is to have a perfect viewer that allows any SVG document to be rendered, as the generator would always be limited by whatever functionality the user interface provided to the user. 

Once again it was expected that these tools would have to be developed from scratch.

SVG Rendering Tool

The W3C web site maintains a list of current SVG viewer implementations. There is a browser plug-in by Adobe [35] that allows SVG documents to be viewed from within a web browser, and there are also standalone SVG browser applications.

Two viewers were considered, the CSIRO SVG Toolkit [25] and the Koala Jackaroo Viewer [27]; these were picked out because:

The IBM viewer [37] is also well developed but does not come with source code. Source code is required so that the UI that these viewers provide can be stripped out and the implementation of the W3C SVG specification used.

It states on the Jackaroo project web site that, 'the purpose of Jackaroo is not to provide a commercial quality product that fully renders SVG documents. We just wanted to evaluate the W3C technologies (CSS, DOM, XML and SMIL) so we decided to create the Jackaroo project in order to see the interaction between those specifications.'

This lack of commitment, and the fact that the Jackaroo viewer does not support animated SVG, make the CSIRO implementation preferable. Further to its advantage, the CSIRO implementation is regularly updated: 6 versions were released between April 2000 and December 2000.

 Using their implementation of the SVG specification, which is based on the Apache DOM implementation, a draw method allows you to pass the document a graphics object (java.awt.Graphics2D) as a parameter. The document then draws itself on the graphics object, therefore if the on-screen graphics object is passed, the SVG document will be rendered on screen.

SVG Generator Tool

There are only two generators available for generating SVG from onscreen graphics. Most generators on the W3C website are for file conversion to SVG from formats such as Windows Metafile or Flash animation.

The first, CWI 'SVGGraphics' [28], takes the onscreen graphics and streams the content out to a file in SVG format. Using this would require the DOM object to be rebuilt from the stream each time, which is impractical; it is better suited to providing a save function in a drawing application.

The second generator is by Sun [29] and lets you pass a reference to a DOM object and then create a graphics object (java.awt.Graphics2D). These are used to build a DOM object representing the SVG document, as shown below.


Description from the Sun web site:

'The generator manages a tree of DOM objects that represent the SVG content corresponding to the rendering calls made on the SVGGraphics2D instance. In other words, every time a program invokes a rendering method, such as fillRect, on a SVGGraphics2D instance, a new DOM object, representing the SVG equivalent, is appended to the DOM tree (for example a <rect> element will be appended after the fillRect method has been invoked). '

An excellent design. This SVG Generator subclasses java.awt.Graphics2D, so it can be used with drawing tools that have no knowledge of DOM or SVG. Methods are performed on the graphics object as normal.

 

Chapter 5 Software Design

The project required the development of several software packages. These were all designed in UML [56] notation using Rational Rose 2000 [49], and then the class and interface skeletons were automatically generated. There were over 90 classes (11 packages and around 11,000 lines of code) developed for this project so in the interests of brevity, not all UML diagrams are shown, and not all classes are explained in detail.

Client and Server Implementations

A quick note on the deployment of the client and server.

There are two choices for the implementation of both the client and server:

Client

A Java applet is the immediate choice for an application running in a web browser; nothing else would have the power required of this application. However, the usual security restrictions on applets pose a problem. Of greatest importance are the restrictions on loading and saving documents and on making socket connections. By embedding the applet in an application, its functionality can be extended.

Server

For the server a standalone application is proposed for performance and flexibility reasons; a CGI script or Java Servlet would not be powerful enough.

Document Builder  (ie.psc.dombuilder.*)

This package provides a framework for editing and viewing XML documents and is based on the MVC design pattern [43].

There are three classes that make up this framework, as can be seen below:

Subclasses of the DocumentUI create high-level semantic events that represent updates to the model; these updates are encapsulated in the Update class, which contains information about the type of updates.

There are two other interfaces for handling updates: UpdateListener and UpdateLauncher.

UpdateListeners can be registered with a DocumentBuilder, in the same way an ActionListener can be registered with a button in the Java Swing/AWT event model.

All registered UpdateListeners are informed of any updates applied to the model from the UI. One implementation of an UpdateListener is a serialiser, used to send updates across the network.

The DocumentBuilder also implements the UpdateListener interface itself so that it can be registered to listen to updates from the network. These updates are passed to the model, which updates the document it wraps. The DocumentUI is notified of this change and updates its view accordingly.

The dombuilder framework can be used to create and modify any XML document in a distributed manner. By subclassing the DocumentUI class, different XML documents can be viewed and modified.
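The listener arrangement can be sketched as follows. The interface name is taken from the text, but its method signature, the `SketchBuilder` class standing in for the DocumentBuilder, and the use of strings as updates are all assumptions for illustration. The sketch also shows the rule that only updates originating from the UI are passed on to listeners such as the network serialiser.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical signature; the report does not list the methods of UpdateListener. */
interface UpdateListener {
    void updatePerformed(String update);
}

/** Stand-in for the DocumentBuilder: applies updates to a model and notifies listeners. */
class SketchBuilder implements UpdateListener {
    private final List<UpdateListener> listeners = new ArrayList<>();
    final List<String> model = new ArrayList<>();

    void addUpdateListener(UpdateListener listener) {
        listeners.add(listener);
    }

    /** Update made through the local UI: apply it, then tell every registered listener. */
    void updateFromUI(String update) {
        model.add(update);
        for (UpdateListener listener : listeners) {
            listener.updatePerformed(update);
        }
    }

    /** Update arriving from the network: apply it only, so it is never sent out again. */
    @Override
    public void updatePerformed(String update) {
        model.add(update);
    }

    public static void main(String[] args) {
        List<String> sentToNetwork = new ArrayList<>();
        SketchBuilder builder = new SketchBuilder();
        builder.addUpdateListener(sentToNetwork::add); // plays the role of the serialiser
        builder.updateFromUI("insert rect");
        builder.updatePerformed("insert circle");      // as if received from the server
        System.out.println(builder.model + " sent=" + sentToNetwork);
    }
}
```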

When new nodes are created and added to the model an id is assigned to them. In order to make sure that ids are unique on a global (i.e. distributed) level, an id is held by each DocumentBuilder, which is pre-pended to each node id. The server assigns the client this id upon log on.

Although this implementation is not strictly MVC as at any one time there is only one UI associated with a Model, this can easily be changed by making DocumentModel extend the Observable class (java.util package). This would then allow any number of views to be registered as observers of the model and be notified simultaneously when the model changed [44].

XML tools (ie.psc.xml.*)

This package contains general tools needed for working with XML; none of them are specific to the project.

The Filter or Pipeline Design Pattern


The ParserFilter class is based on one written by John Cowan [31]. It allows the construction of a pipeline [55]. The initial input to the pipeline is a SAX Parser and the final output is a SAX DocumentHandler. Each filter along the pipe can be used to modify the XML input before passing it to the next stage.

One implementation of a filter is the Indenter class [32]. This adds whitespace indentation to the XML for better-displayed output.
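For comparison, the same pipeline idea is available in SAX2 through the standard `XMLFilterImpl` class. The sketch below is not the ParserFilter or Indenter class from this project; it is a minimal filter, invented for illustration, that upper-cases element names before handing each event to the next stage.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** One stage of the pipe: rewrites element names, then forwards the event downstream. */
class UpperCaseFilter extends XMLFilterImpl {
    UpperCaseFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        super.startElement(uri, localName, qName.toUpperCase(Locale.ROOT), atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, localName, qName.toUpperCase(Locale.ROOT));
    }
}

class PipelineSketch {
    /** Builds parser -> filter -> handler and returns the element names the handler saw. */
    static String run(String xml) {
        StringBuilder seen = new StringBuilder();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            XMLReader pipeline = new UpperCaseFilter(parser);
            pipeline.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    seen.append('<').append(qName).append('>');
                }
            });
            pipeline.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(run("<a><b/></a>"));
    }
}
```

Further filters can be chained by passing one filter as the parent of the next.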


Rule-based Design Pattern

A mechanism is required to separate functions and keep the structure modular and simple so that the XML stream being received by client or server is not all processed by the same component. The solution is to use a rule-based approach.

This design pattern is implemented by creating an implementation of the SAX DocumentHandler, the DocumentSplitter class, that diverts an XML document by element name to other DocumentHandlers. The DocumentSplitter maintains the set of rules as a hashtable. The hashtable is indexed by element name, each entry referencing a DocumentHandler that handles the element and all of its sub-elements until the closing tag of the element is reached.

On the diagram above, the switch represents the DocumentSplitter and each processing module represents a DocumentHandler.
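A minimal sketch of this routing follows, written against SAX2 (`ContentHandler`) instead of the SAX 1 DocumentHandler used in the project. The class name `ElementRouterSketch` and the element names `update` and `chat` in the demonstration are invented for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ElementRouterSketch extends DefaultHandler {
    private final Map<String, ContentHandler> rules = new HashMap<>();
    private ContentHandler current; // handler of the element being diverted, or null
    private int depth;              // nesting depth inside the diverted element

    void addRule(String elementName, ContentHandler handler) {
        rules.put(elementName, handler);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (current == null) {
            current = rules.get(qName); // look the element name up in the rule table
            depth = 0;
        }
        if (current != null) {
            depth++;
            current.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (current != null) {
            current.endElement(uri, localName, qName);
            if (--depth == 0) {
                current = null; // closing tag of the diverted element reached
            }
        }
    }

    /** Records the names of the elements it is given. */
    static class Recorder extends DefaultHandler {
        final StringBuilder names = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            names.append(qName).append(' ');
        }
    }

    /** Routes "update" and "chat" subtrees to separate recorders and returns what each saw. */
    static String[] demo(String xml) {
        Recorder updates = new Recorder();
        Recorder chat = new Recorder();
        ElementRouterSketch router = new ElementRouterSketch();
        router.addRule("update", updates);
        router.addRule("chat", chat);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), router);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return new String[] { updates.names.toString().trim(), chat.names.toString().trim() };
    }

    public static void main(String[] args) {
        String[] r = demo("<session><update><rect/></update><chat><msg/></chat></session>");
        System.out.println(r[0] + " | " + r[1]);
    }
}
```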

Converting from SAX to DOM

To convert a stream of events fired by a SAX parser into a DOM document object, a DocumentHandler is implemented that builds up the document from the events received; this is the SAXBuilder class.
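The idea can be sketched with a handler that keeps a pointer to the current node and moves it down on each start tag and back up on each end tag. This is an illustrative SAX2 version with an invented class name, not the SAXBuilder class itself.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxToDomSketch extends DefaultHandler {
    private final Document doc = newDocument();
    private Node current = doc; // node that receives the next child

    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Element element = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        current.appendChild(element);
        current = element; // descend into the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != doc) {
            current.appendChild(doc.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        current = current.getParentNode(); // climb back to the parent
    }

    /** Parses the string with a SAX parser and returns the DOM built from its events. */
    static Document build(String xml) {
        SaxToDomSketch handler = new SaxToDomSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.doc;
    }

    public static void main(String[] args) {
        Document d = build("<svg><rect width=\"10\"/></svg>");
        System.out.println(d.getDocumentElement().getTagName());
    }
}
```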

Converting from DOM to SAX

John Cowan [31] has written a SAX-style parser (DOMParser) for DOM document objects. This parser traverses a document and fires off SAX events as if a stream of XML data was being parsed. Using this parser a SAX DocumentHandler can be used to process DOM documents.

XMLOutputter

This class is used to serialise an XML document to an output stream. Using this in combination with the Indenter or DOMParser creates some powerful tools, vital to the project.

This class is used to send XML to socket streams on the client and server sides.
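In current Java releases a comparable serialisation step can be written with the standard `javax.xml.transform` API, as in the sketch below. This is an alternative shown for illustration, not the XMLOutputter class; to write to a socket, the `StreamResult` would wrap the socket's output stream instead of a `StringWriter`.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomSerialiserSketch {
    /** Serialises a DOM document to a string using an identity transformation. */
    static String serialise(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a small sample document: an svg element containing one rect. */
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element svg = doc.createElement("svg");
            Element rect = doc.createElement("rect");
            rect.setAttribute("width", "10");
            svg.appendChild(rect);
            doc.appendChild(svg);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialise(sample()));
    }
}
```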

Threaded XML Parsing

The XMLParserThread class is instantiated with a reference to a SAX Parser and a SAX InputSource, then when required it starts a thread to parse the InputSource.

Network components  (ie.psc.net.*)

These classes are used for both the client and the server; the only difference is the way in which new sockets are created.

Clients create sockets by explicitly naming a host and port and attempting to make a connection whereas servers listen for connections on a certain port and a new socket is created each time a client connects.

 The Session class is used to maintain a group of connections. Connections can only communicate with other connections of the same session. The server in this application uses only one session object but by allowing the creation of more, a multi-session server would be created.

Distributing Output

Incoming data from a connection is queued up and sent out to each connection in turn. The queue is implemented in such a way that locking is restricted and all outputs can be used at once if needed.
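One possible reading of this design is sketched below: each connection gets its own outgoing queue, so a message is queued once per connection and a slow connection does not hold up the others. The class and method names are hypothetical, and the real queue implementation may differ.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class BroadcasterSketch {
    // One outgoing queue per connection; a writer thread per connection would drain it.
    private final Map<String, BlockingQueue<String>> outboxes = new ConcurrentHashMap<>();

    /** Registers a connection and returns the queue its writer thread should read from. */
    BlockingQueue<String> connect(String connectionId) {
        BlockingQueue<String> outbox = new LinkedBlockingQueue<>();
        outboxes.put(connectionId, outbox);
        return outbox;
    }

    void disconnect(String connectionId) {
        outboxes.remove(connectionId);
    }

    /** Queues the message for every connection without waiting for any of them to send it. */
    void broadcast(String message) {
        for (BlockingQueue<String> outbox : outboxes.values()) {
            outbox.add(message);
        }
    }

    public static void main(String[] args) {
        BroadcasterSketch session = new BroadcasterSketch();
        BlockingQueue<String> a = session.connect("A");
        BlockingQueue<String> b = session.connect("B");
        session.broadcast("update-1");
        session.broadcast("update-2");
        System.out.println(a + " " + b);
    }
}
```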

Drawing package (ie.psc.awt.geom.*)


This package contains tools which respond to mouse and keyboard events; these are the controllers of the MVC design pattern.

The Tool class implements the necessary event listeners of the Java event model and is the super class of the other tools, as shown in the class diagram above.

The Tool class uses two objects:

When the drawing of a shape is completed the current tool calls the paintCompleted() method with a reference to itself. The ToolManager can then do a callback to the tool and obtain the completed shape through the graphics object as shown below:


 
New tools can be created easily by sub-classing Tool and providing event handling for the mouse and keyboard events.

This package is used by the SVG UI to create new SVG elements but because of the design, it knows nothing about SVG and the tools could be used for any drawing package.

User Interfaces

Several user interfaces were developed for the dombuilder framework, as shown in the following class diagram. The views are read only and the UIs allow user interaction.

Each UI uses a properties file (java.util.Properties) for configuration.


 
Each UI has a getActions() method. This returns an array of AbstractAction objects (javax.swing) that can be used to create a toolbar, menu or pop-up menu. In this manner a UI can define the actions it needs to function without implementing how they are made accessible to the user.
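The separation between defining actions and presenting them can be sketched as follows. The names ActionsSketch, CounterUI, asToolBar() and asMenu() are hypothetical; only AbstractAction, Action, JToolBar and JMenu come from Swing.

```java
import java.awt.event.ActionEvent;
import javax.swing.AbstractAction;
import javax.swing.Action;
import javax.swing.JMenu;
import javax.swing.JToolBar;

class ActionsSketch {
    // Hypothetical UI: it declares what it can do, not how the user reaches it.
    static class CounterUI {
        int clicks = 0;

        AbstractAction[] getActions() {
            return new AbstractAction[] {
                new AbstractAction("Increment") {
                    @Override
                    public void actionPerformed(ActionEvent e) {
                        clicks++;
                    }
                },
                new AbstractAction("Reset") {
                    @Override
                    public void actionPerformed(ActionEvent e) {
                        clicks = 0;
                    }
                }
            };
        }
    }

    // The hosting application chooses the presentation: a toolbar...
    static JToolBar asToolBar(Action[] actions) {
        JToolBar bar = new JToolBar();
        for (Action a : actions) {
            bar.add(a);
        }
        return bar;
    }

    // ...or a menu, built from exactly the same actions.
    static JMenu asMenu(String title, Action[] actions) {
        JMenu menu = new JMenu(title);
        for (Action a : actions) {
            menu.add(a);
        }
        return menu;
    }
}
```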

SVG UI Package (ie.psc.project.svg.*)

This class is required to:

The configuration properties file defines the actions to be used by this UI and the name and image to represent them on a toolbar or menu. Each action represents a drawing tool (from ie.psc.awt.geom).

The reflection package (java.lang.reflect) in the standard Java library allows a class to be loaded and instantiated after compilation simply by specifying its class name. This feature is used to full advantage to provide flexibility in the tools used by the SVG UI.

When an action associated with a tool is activated, that tool is dynamically loaded and an object of its class is created. A reference is then stored in a vector, so the tool only needs to be loaded the first time it is used.
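The load-once behaviour can be sketched as follows. This is an illustration only: the class ToolLoaderSketch is hypothetical, it assumes each tool class has a no-argument constructor, and it keeps the instances in a map keyed by class name where the project uses a vector.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical loader: instantiates a class by name on first use and reuses the object afterwards.
class ToolLoaderSketch {
    private final Map<String, Object> loaded = new HashMap<>();

    synchronized Object getTool(String className) throws ReflectiveOperationException {
        Object tool = loaded.get(className);
        if (tool == null) {
            // Class.forName loads the class at runtime; no compile-time reference is needed.
            tool = Class.forName(className).getDeclaredConstructor().newInstance();
            loaded.put(className, tool);
        }
        return tool;
    }
}
```

Because only the class name is needed, a new tool can be added by listing it in the configuration file and placing its class on the classpath.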

Controllers

The translation of mouse and keyboard events into graphical objects is split into two parts, the creation of new objects and the modification of existing objects.

The creation of new shapes is handled by the ie.psc.awt.geom package through the SvgController class. Once a shape has been created the callback to the paint method is performed, this time passing a Sun SVG generator object, from which the SVG document is obtained. This document contains the SVG mark-up for all commands invoked on the graphics object and is used to update the document model. The SvgController class encapsulates this generation so by simply changing this class another SVG generation tool could be used instead, if one existed. 

The SvgEditor class handles modification. An initial problem was how to link a mouse position to a graphical object in the document. Fortunately, the CSIRO library provides methods to find which XML elements represent objects that contain a certain co-ordinate, and with this an object reference to the shape can be obtained. With this reference on-screen manipulation can be performed such as dragging and dropping of the shape within the drawing canvas.

View

The viewing of the SVG document is separated from the UI by use of the SvgViewer interface, which has a single paint method that takes a reference to the document to be painted and the graphics object to paint it onto. In this implementation the CSIRO library is used to render the SVG on the screen.

Other Functions

The SVG UI also provides other functions that are not used to directly edit the model.

The first three make straightforward changes to the attributes used by the current tool (ToolAttributes).

The zoom in and out function requires the graphics object to be scaled by the zoom ratio before it is passed to the current tool and the SVG renderer. It also requires that mouse events be pre-processed before passing them to the current tool, so that the co-ordinates are also scaled. 
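The two halves of the zoom function can be sketched as follows. The class ZoomSketch is hypothetical and the zoom step of 2 is an assumption; the point is that painting multiplies by the zoom ratio while mouse co-ordinates are divided by it.

```java
import java.awt.Graphics2D;
import java.awt.geom.Point2D;

// Hypothetical zoom helper: scales painting up and mouse co-ordinates back down.
class ZoomSketch {
    private double zoom = 1.0;

    void zoomIn() {
        zoom *= 2.0; // step size is an assumption
    }

    void zoomOut() {
        zoom /= 2.0;
    }

    double getZoom() {
        return zoom;
    }

    // Applied to the graphics object before it is handed to the tool and the renderer.
    void prepare(Graphics2D g) {
        g.scale(zoom, zoom);
    }

    // Applied to mouse positions before they are handed to the tool:
    // a point on the screen corresponds to (x / zoom, y / zoom) in the document.
    Point2D toDocument(int screenX, int screenY) {
        return new Point2D.Double(screenX / zoom, screenY / zoom);
    }
}
```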

Importation of files is a simple procedure. A file chooser (javax.swing.JFileChooser) is first used to select an SVG file. This is then parsed to create an SVG element, which is appended to the document. This could be extended to allow the choice of a file or a URL, thus allowing applets to import files as well.

Text UI package (ie.psc.project.text.*)

Three views are provided in this package. There are no controllers, so they are all read-only.

Plain Text

The ReadOnlyUI class provides a plain-text view of the document.

Using an XMLOutputter that outputs to a StringBuffer (using java.io.StringWriter) as a DocumentHandler, the XML document is parsed by a DOMParser (ie.psc.xml).

After parsing, the string is obtained from the StringBuffer and displayed in the text component on the screen.

Indented XML

The TextView class uses the same method as the plain-text view, but the parser filter design is used to provide the indentation. The parsed XML is passed through the Indenter, which adds the indenting whitespace before passing the XML to the XMLOutputter and into the StringBuffer.
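The filter pipeline can be sketched with the SAX2 XMLFilterImpl class. This is not the project's Indenter or XMLOutputter: the nested classes are simplified, hypothetical stand-ins; the source is a string, not a DOM document; and the outputter does not escape special characters.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class IndentPipelineSketch {
    // Filter stage: injects a newline and indentation as extra character events.
    static class Indenter extends XMLFilterImpl {
        private int depth = 0;
        private boolean justClosed = false;

        Indenter(XMLReader parent) {
            super(parent);
        }

        private void indent() throws SAXException {
            StringBuilder sb = new StringBuilder("\n");
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            char[] ws = sb.toString().toCharArray();
            super.characters(ws, 0, ws.length);
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (depth > 0) {
                indent();
            }
            super.startElement(uri, local, qName, atts);
            depth++;
            justClosed = false;
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            depth--;
            if (justClosed) {
                indent(); // only elements that contained child elements close on a new line
            }
            super.endElement(uri, local, qName);
            justClosed = true;
        }
    }

    // Output stage: serialises the events it receives into a string.
    static class Outputter extends DefaultHandler {
        private final StringWriter out = new StringWriter();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            out.write("<" + qName);
            for (int i = 0; i < atts.getLength(); i++) {
                out.write(" " + atts.getQName(i) + "=\"" + atts.getValue(i) + "\"");
            }
            out.write(">");
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            out.write("</" + qName + ">");
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.write(ch, start, length);
        }

        String text() {
            return out.toString();
        }
    }

    // Wires the pipeline: parser -> Indenter -> Outputter.
    static String indent(String xml) throws Exception {
        XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        Indenter indenter = new Indenter(parser);
        Outputter outputter = new Outputter();
        indenter.setContentHandler(outputter);
        indenter.parse(new InputSource(new StringReader(xml)));
        return outputter.text();
    }
}
```

Removing the Indenter from the chain gives the plain-text view, which is the attraction of the filter design.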

Formatted XML

This view displays colour-coded XML: a different colour is used depending on whether character data, an element or an attribute is being displayed.

The JTextPane Swing component is used for display, because when a string is added to a JTextPane, a display style can be associated with it.

To implement this the Indenter is used again, but a subclass of XMLOutputter is used as the output of the pipeline. This subclass, XMLFormatter, overrides methods such as writeAttribute() and writeCharacterData() and sets the style of the strings to be displayed as they are added.

HTML UI package (ie.psc.project.html.*)

A Swing component called the JEditorPane can be used to display HTML 3.2 content.

The HTMLView class inherits from ReadOnlyUI because it uses the same method of displaying the data. The only difference is that the component that displays the data is a JEditorPane with its content type set to 'text/html', so that the data is rendered as HTML-formatted text.

As the DOM object must represent well-formed XML there is no issue of the data not conforming to the XHTML specification and therefore not being parseable.

Tree UI package (ie.psc.project.tree.*)

One of the components provided in Swing is a JTree, an expandable tree similar to that in Windows Explorer.  In order to modify the JTree component so that it would display an XML document, a custom tree model was developed (a subclass of javax.swing.tree.DefaultTreeModel).

This model, the DOMToTreeModelAdapter, wraps the DOM document object, defining a model for the JTree display. In this implementation the document is displayed in a tree structure with attributes as leaves. Future implementations should provide a means of viewing the character data, maybe in a separate window.
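A minimal read-only sketch of such an adapter follows. It is not the project's DOMToTreeModelAdapter: it implements the javax.swing.tree.TreeModel interface directly and does not subclass DefaultTreeModel; it ignores listeners because the model never changes; and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.event.TreeModelListener;
import javax.swing.tree.TreeModel;
import javax.swing.tree.TreePath;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical adapter: presents a DOM document to a JTree, with attributes shown as leaves.
class DomTreeModelSketch implements TreeModel {
    private final Document doc;

    DomTreeModelSketch(Document doc) {
        this.doc = doc;
    }

    // Children of an element are its attributes followed by its child elements.
    // Character data is deliberately left out, as in the implementation described above.
    private List<Node> children(Object parent) {
        List<Node> result = new ArrayList<>();
        Node node = (Node) parent;
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return result; // attributes have no children, so they are leaves
        }
        NamedNodeMap atts = node.getAttributes();
        for (int i = 0; i < atts.getLength(); i++) {
            result.add(atts.item(i));
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                result.add(c);
            }
        }
        return result;
    }

    @Override public Object getRoot() { return doc.getDocumentElement(); }
    @Override public Object getChild(Object parent, int index) { return children(parent).get(index); }
    @Override public int getChildCount(Object parent) { return children(parent).size(); }
    @Override public boolean isLeaf(Object node) { return children(node).isEmpty(); }
    @Override public int getIndexOfChild(Object parent, Object child) { return children(parent).indexOf(child); }

    // The view is read-only, so edits and listeners are ignored in this sketch.
    @Override public void valueForPathChanged(TreePath path, Object newValue) { }
    @Override public void addTreeModelListener(TreeModelListener l) { }
    @Override public void removeTreeModelListener(TreeModelListener l) { }
}
```

An instance can be passed to the JTree constructor that accepts a TreeModel.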

General project package (ie.psc.project.*)

These are general classes that bring together the above components to provide the functionality of the application.

Server Document Request Function

There are four classes used to implement the document retrieval function and protocol:

The commands used in the protocol are constants in the Commands interface. 

The CommandSender translates high-level semantic methods, such as 'get list of documents', into commands of the protocol and sends them; these are decoded on the server side by the CommandHandler.

The CommandHandler works in conjunction with the DocumentServer, telling it which documents to send to which clients.
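The division of labour between these classes can be sketched as follows. The actual commands and wire format of the protocol are not reproduced here: the LIST and GET commands, the one-command-per-line format and all method names are assumptions made purely for illustration.

```java
import java.io.PrintWriter;
import java.io.Writer;

class ProtocolSketch {
    // Hypothetical command constants; the real Commands interface is not shown in this report.
    interface Commands {
        String LIST = "LIST";
        String GET = "GET";
    }

    // Client side: semantic methods are translated into protocol commands and sent.
    static class CommandSender {
        private final PrintWriter out;

        CommandSender(Writer out) {
            this.out = new PrintWriter(out, true);
        }

        void requestDocumentList() {
            out.println(Commands.LIST);
        }

        void requestDocument(String name) {
            out.println(Commands.GET + " " + name);
        }
    }

    // Server side: what the handler asks of the document server (hypothetical interface).
    interface DocumentServer {
        void sendList(String client);
        void sendDocument(String client, String name);
    }

    // Server side: decodes a command and tells the server what to send to whom.
    static class CommandHandler {
        private final DocumentServer server;

        CommandHandler(DocumentServer server) {
            this.server = server;
        }

        void handle(String client, String line) {
            if (line.equals(Commands.LIST)) {
                server.sendList(client);
            } else if (line.startsWith(Commands.GET + " ")) {
                server.sendDocument(client, line.substring(Commands.GET.length() + 1));
            } else {
                throw new IllegalArgumentException("Unknown command: " + line);
            }
        }
    }
}
```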

MVC UI

The MultiDocumentBuilderUI class provides a UI for managing multiple document builders. It is tied to the MVC design of the dombuilder framework.

It improves performance and reduces memory usage by only creating a single instance of each UI used and sharing it between the DocumentBuilders as needed. 

The UIs to be used are specified in a configuration file at runtime (example below). This allows other UIs to be written at a later date and added to this project without the need for recompilation. These UIs are loaded and created in a similar way to the tools of the SVG UI. Loading UIs only when they are needed is a common technique known as lazy instantiation [42].


 This UI also handles the protocol for requesting documents from the server. The default implementation requests a list of documents from the server and then downloads each one in turn.

Producer and Consumer Partnership

The UpdateConsumer, UpdateProducer and SharedObjects classes provide a Producer-Consumer partnership [41]. A SharedObjects object is referenced by both the consumer and producer. The producer thread places updates in it while the consumer thread continually checks it for objects to process. This is used to queue updates for sending to the network.
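The partnership can be sketched with the classic wait/notify idiom. The names are hypothetical and the sketch differs from the description above in one respect: the consumer blocks until an update arrives where the project's consumer repeatedly checks the shared object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class ProducerConsumerSketch {
    // Hypothetical counterpart of SharedObjects: the only object both threads touch.
    static class SharedQueue {
        private final LinkedList<String> items = new LinkedList<>();

        synchronized void put(String update) {
            items.addLast(update);
            notifyAll(); // wake the consumer if it is waiting
        }

        synchronized String take() throws InterruptedException {
            while (items.isEmpty()) {
                wait(); // releases the lock until put() is called
            }
            return items.removeFirst();
        }
    }

    // Produces the given updates on the calling thread and consumes them on another;
    // returns them in the order in which the consumer "sent" them.
    static List<String> run(List<String> updates) throws InterruptedException {
        SharedQueue shared = new SharedQueue();
        List<String> sent = Collections.synchronizedList(new ArrayList<>());
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < updates.size(); i++) {
                    sent.add(shared.take()); // stands in for writing to the network
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        for (String update : updates) {
            shared.put(update);
        }
        consumer.join();
        return sent;
    }
}
```

The queue decouples the UI thread that generates updates from the slower network output, so drawing is never held up by a send.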

Resources

The resources include resource bundles, icon images and the configuration file. Resource bundles allow the internationalisation of software by allowing strings to be loaded according to the current locale. 

Only two resource bundles exist:

  1. GuiLabels_fr - French language
  2. GuiLabels_en - English language

Further language support can be provided by:

  1. Creating a subclass of ListResourceBundle (java.util package)
  2. Naming it GuiLabels_xx, where xx represents the locale
  3. Placing the class in the classpath.

See the Java web site [40] for further information.
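The three steps can be illustrated with a sketch of a German bundle. The keys and strings shown are invented for illustration and are not the project's actual labels. A real bundle would be declared public in its own source file so that the resource bundle loader can find it.

```java
import java.util.ListResourceBundle;

// Hypothetical German bundle following the GuiLabels_xx naming rule.
// The keys ("menu.file", "menu.edit") are illustrative assumptions.
class GuiLabels_de extends ListResourceBundle {
    @Override
    protected Object[][] getContents() {
        return new Object[][] {
            {"menu.file", "Datei"},
            {"menu.edit", "Bearbeiten"}
        };
    }
}
```

With such a class on the classpath, a call to ResourceBundle.getBundle with the base name GuiLabels (qualified by its package, if any) and a German locale would select it, and getString("menu.file") would return the German label.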

Development package (ie.psc.dev.*)

Java is still a young language; it lacks (or no IDE provides) many features that hardcore programmers swear by, such as assertions, macros or pre-compile processing.

This package was developed to try to provide help with debugging and logging. The main features are:

At the time of writing, a beta version of Java 1.4 is near release and will provide these features, making this package redundant; until then, this package is an important development tool.

Main package (ie.psc.main.*)

This package contains two very important classes: Server and ReadWriteClient. These two classes are applications that set up the server and the client of the system respectively. 

The server takes a single argument, the port to start the server on. The client takes three arguments, the hostname and the port to connect to and the path to the configuration file. 

The choice of DOM and SAX implementations is specified at runtime by setting system properties. For example, the following command starts a server on port 6666 using the Sun implementation of the SAX parser and the Apache implementation of the DOM specification.

java -Dorg.sax.xml.Parser=com.sun.xml.parser.Parser -Dorg.w3c.dom.Document=org.apache.xerces.dom.DocumentImpl Server 6666

Optionally, a log file for the output streams can be set at runtime, also using system properties. The standard output is set using the property System.out and the standard error is set using System.err. Adding the following line to the above command sends all standard output to C:\log\out.txt and all error output to C:\log\err.txt. 

-DSystem.out=C:\log\out.txt -DSystem.err=C:\log\err.txt 
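How an application might honour these two properties at start-up is sketched here. The property names System.out and System.err are those given above; the class LogRedirectSketch, its apply() method and the choice to append to existing files are assumptions.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

// Hypothetical start-up helper: redirects the standard streams if the properties are set.
class LogRedirectSketch {
    static void apply() throws IOException {
        String outPath = System.getProperty("System.out");
        if (outPath != null) {
            // Append mode and auto-flush are assumptions made for this sketch.
            System.setOut(new PrintStream(new FileOutputStream(outPath, true), true));
        }
        String errPath = System.getProperty("System.err");
        if (errPath != null) {
            System.setErr(new PrintStream(new FileOutputStream(errPath, true), true));
        }
    }
}
```

If neither property is set the streams are left untouched, so logging to the console remains the default.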

For testing and development two more clients were developed, LogClient and RobotClient. 

LogClient is used to connect to the server and output all data received either to the screen or to a file. A useful feature of this client is that it can be used to save an entire session. 

RobotClient is the partner of LogClient and is used to log onto the server and send a list of updates that are read from a file, mimicking the actions of a real user. The delay between updates is specified on the command line. This client can be told to read a file produced by the LogClient and therefore replay a saved session. 
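The replay loop at the heart of such a client can be sketched as follows. The format of the saved session file is not described here, so the sketch assumes one update per line; the class ReplaySketch and its replay() method are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Writer;

// Hypothetical replay loop: sends each saved update with a fixed delay between them.
class ReplaySketch {
    // Assumption: the log holds one update per line. Returns the number of updates sent.
    static int replay(BufferedReader log, Writer out, long delayMillis)
            throws IOException, InterruptedException {
        int sent = 0;
        String update;
        while ((update = log.readLine()) != null) {
            out.write(update);
            out.write('\n');
            out.flush(); // push each update out immediately, as a live user would
            sent++;
            Thread.sleep(delayMillis);
        }
        return sent;
    }
}
```

In use, the Writer would wrap the output stream of the socket connected to the server and the delay would come from the command line.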

Chapter 6 Project Management

This chapter briefly describes the planning carried out at the start of the project and the careful time-management involved during development.

Project Planning

The diagram to the right, made in October at the start of the project, shows the stages in the development process.

Research (Oct-Nov)

Research of collaborative tools available.

Research the possible deployment, and the technologies for the design and implementation.

Feature Specification (Nov)

A specification of the project's functionality.

Software Architecture (Dec)

The design and the deployment of the system.

Software Design  (Jan)

Specification of classes to be developed, using Rational Rose.

Coding (Feb - Mar)

Development and testing. (see Development Planning, below)

Functionality Testing (Mar)

Server and client were tested using several robot clients to send updates simultaneously.

Performance Testing (depends on time)

The intention was to test the performance of the application over LAN, dial-up, firewalls, etc., and to make improvements as needed, starting at the coding level and moving on to the software design and then the software architecture if no significant improvement resulted.

As no significant performance problems were found, no performance optimisations were made.

Extend Functionality (depends on time)

In this phase the HTML viewer was developed and minor improvements were made to the UI in general.

Development Planning