XimpleWare Benchmark Report

(10/29/03)

Objective

We intend to show some performance numbers of  JAVA implemention of XimpleWare's technology.

BackGround and Introduction

XML performance analysis

One of the big issues of current XML processing performance has to do with the perception of what a token is. Traditional wisdom has that a token is a standalone piece of memory terminated by a null character, which has lead to lots of object creations/destruction for heavy duty text processing tasks. In Java, object creation and garbage collection are quite expensive. So whatever technique that can reduce/minimize object creation should be considered as the first step towards better application performance. This is precisely what we did in our processing model. As a result, we are able to consistently outperform Crimson SAX, while delivering  random access performance quite similar to DOM's.

Serial VS Hierarchical

In distributed computing, it is commonly known that  hierarchical data structures have to be converted into serial format before transported across separate memory boundries.  So is "hierarchical" an  antonym of "serial?"  Can something be both hierarchical and serial? We believe that notions of "hiearchical" and "serial" are intrinsically orthognal. One example is a Yellow Page phone book. It is serial in that it is printed on papers and can be read from left to right, from first page to last page; but  it is also hiearchical in that one can look up items in the index section before being "redirected" into the right places.

It is during our inventive process that we realized that we can create the illusion of "hiearachy" on top of serial format. Notice that similar "illusion" approaches have been applied to other fields of computer science. Multiple threads running on a single processor are never "truely concurrent," but rather in a time-sliced fashion.

A quick technology overview

A few points about our technology are list below:

  1. Our processing model maintains the XML document in memory.
  2. Our processing model generates a separate piece of binary file to represent the parsed state of the XML document.
  3. Our processing model allows one to incrementally update the content of the XML document
  4. The binary file is persistent

The API

Our random access API is similar to XML cursor in that it allows one to "walk" to the an adjacent node, whether it is an child, sibling or parent.  The core functions  are:

We also export a lower level API allowing direct token access in a way similar to Xerces's NodeIterator, a depth-first, element order node traversal interface.

Test Setup and Methodology

The Setup

Our initial implementation is done in Java because it is easier to port across multiple platforms. The test machine we use to run the benchmark is a Athlon 1800+ (with an actually clock of 1.5Ghz) with 512 MB of DDR RAM  running Win98 (4.10.98). The Java virtual machine is of the version "build 1.4.1-beta-b14." When running the benchmark, we turn on VM's "-server" option in order to invoke the "server" VM that has more sophisticated optimization features.  

Test Methodology

We mainly compare the performance of Crimson DOM (bundled with JDK1.41) and our own Java-based XML processing API. The performance figures are calculated by first running the test routines 1000~100000 times, then obtaining the average runtime for a subsequent 100 ~1000 test iterations. The reason we choose such large numbers is so that the VM's native code compiler would fully kick in, the performance would fully reflect the best achievable performance by VM.

We pay special attention to the following parameters

  1. Data structure building performance w/o I/O-- We read the entire test XML document  into memory before callling the benchmark routine to obtain the in-memory representation. One thing to notice is that, because Java VM does automatically garbage collection, we are actually measuring the average time of "building data structure" and "garbage collection."
  2. Memory Usage -- We meaure the size of in-memory data structures. For this, we rely on  JDK's Runtime object to measure the difference between  total memory before and after the data structure building, and we ran the data structure building routine only once. Also we discovered that our measurement looks more reasonable only when the file size is substantial.
  3. Navigation performance  --  We perform a large number of searches according to certain XPATH expression, then calculate the total time spent on doing such traversal  The corresponding XPATH  expression is (see the sample doc below)  /purchaseOrder/items/item[@partNum="926-AA"] . For "po1m.xml" we have a total of  2650 nodes that match that XPATH expression.
  4. Update Performance -- Using the same XPATH expression, we write update data structure into another XML file.  More specificly, we replace 2650 attribute values "926-AA" with "new val," which we think is a pretty "heavy duty" update operation .
  5. Entire flight with I/O -- The combination of 1, 3, 4, plus the I/O for both input and output


Test File(s)

We focus our benchmark test on a set of files named as po10k.xml(10kB in size), po100k.xml(100KB) and po1m.xml(1000K). We obtain those files from BEA's XMLcursor package. A typical po*.xml file looks like the following:

<?xml version="1.0"?>
<
purchaseOrder xmlns="http://openuri.org/simplepo" orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Mill Valley</city>
    <state>CA</state>
    <zip>90952</zip>
  </shipTo>
  <billTo country="US">
    <name>Robert Smith</name>
    <street>8 Oak Avenue</street>
    <city>Old Town</city>
    <state>PA</state>
    <zip>95819</zip>
  </billTo>
  <comment>Hurry, my lawn is going wild!</comment>
  <
items>
    <
item partNum="872-AA">
      <productName>Lawnmower</productName>
      <quantity>1</quantity>
      <USPrice>148.95</USPrice>
      <comment>Confirm this is electric</comment>
    </item>
    <item partNum="
926-AA">
      <productName>Baby Monitor</productName>
      <quantity>1</quantity>
      <USPrice>39.98</USPrice>
      <shipDate>1999-05-21</shipDate>
    </item>
    <item partNum="872-AA">
      <productName>Lawnmower</productName>
      <quantity>1</quantity>
      <USPrice>148.95</USPrice>
      <comment>Confirm this is electric</comment>
    </item>
    <item partNum="
926-AA">
      <productName>Baby Monitor</productName>
      <quantity>1</quantity>
      <USPrice>39.98</USPrice>
      <shipDate>1999-05-21</shipDate>
    </item>

     ...

   </items>
</purchaseOrder>


For 1. we also run the test on several other XML files, including ones that are complex in structure and attributes-heavy. We also list those results in our benchmark results.

Showdown with SAX

In later part of  our benchmark, we also list performance of SAX parser versus the performance of generating our binary file. We outperform SAX on all but one test case.

Performance Table

Most Relevant Data (DOM, XimpleWare)


po10k.xml
po100k.xml
po1m.xml
building data structure  (w/o I/O) (ms)  (including garbage collection time)
2.26 / 0.5
35.1/4.22
305.9/48.0
total memory usage (kilo byte)
 N/A
N/A
6447.9/1861.4
Traversal the data structure (ms) (5300 nodes travesed)
0.11/ 0.11
0.44/0.49
7.47/5.93
 Write updated structure to disk (ms) (modified 2650 attribute values)
2.91/0.195
19.2/ 3.3
137.59/ 14.8
Entire flight  time ( I/O included  wherever applicable) (ms)
13.2 /  1.7
64.8/14.7
486.1/93.4



Performance Comparison with SAX

File name
File Size (KB)
DOM(ms)
SAX (ms)
XimpleWare (ms) Notes    
ab.xml
31.662
10.4
2.2
1.32
Test file is an address book
cd.xml
30.125
12.1
1.1
1.37
Test file is  a cd catalog
ado.xml
1007.214
318.0
107.1
63.39
Ttest file is an MS ADO data output from MSDN website
soap.xml
1255.019
458.6
84.1
69.64
Test file contains large amount of data from pull parser benchmark
ot.xml
2523.419
375.7
104.9
80.13
Test file is a chapter from old testment
SUAS.xml
13762.57
4609.3
1152.9
672.29
A huge file we got from here (attributes heavy)
bench.xml
21050.145
5490.9
1037.0
802.46
from devsphere's XML test benchmark


More Memory Usage Comparison with DOM

File Name
File Size (KB)
 DOM (KB)
 XimpleWare(KB)
ado.xml
1007.214
10761.016
2179.200
soap.xml
1255.019
10374.200
2856.800
ot.xml
2523.419
10437.552
3601.312
SUAS.xml
13762.57
103410.928
25634.712
bench.xml
21050.145
115551.120
35194.672


Summary

We present some of the performance numbers and compare them with some really popular XML processor implementations. By avoiding excessive unnecessary object creation we managed to consistently outperform DOM and SAX. Also by maintaining the original XML document intact in memory, we managed nearly an order of magnitude update performance.

There are other attributes that we intend to show later, such as persisting XPATH output on disk, extracting and manipulating fragments XML document without reparsing the fragments. We would like to show those in a later version of our benchmark performance.


Appendix

SAXTest.java

import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class SAXTest {
  public static void main(String argv[])
  {
     Runtime rt = Runtime.getRuntime();
     int total;
     long  a = System.currentTimeMillis();
        try {
            // Set up output stream
            //out = new OutputStreamWriter(System.out, "UTF8");
            SAXParserFactory factory = SAXParserFactory.newInstance();
            File f= new File(argv[0]);
        System.out.println(" argv 0 is "+ argv[0]);
             FileInputStream fi = null;
             byte[] bt;
              fi = new FileInputStream(f);
        System.out.println(" file size "+(int)f.length());
               bt = new byte[(int)f.length()];
               fi.read(bt);       
        total = 0;

        if (bt.length > 1000000)
        {
            total = 1000;
        }else if (bt.length > 100000)
        {
            total = 10000;
        }else if (bt.length > 10000)
        {
            total = 100000;
        }

            SAXParser saxParser = factory.newSAXParser();
        for(int i=0;i<total;i++)
            {
             ByteArrayInputStream bais  = new ByteArrayInputStream(bt);
                saxParser.parse( bais, (DefaultHandler) null );
                //saxParser.parse( new File(argv[0]), handler );
        }
               a = System.currentTimeMillis();
        
            // Parse the input
         for(int i=0;i<100;i++)
            {
         saxParser = factory.newSAXParser();
             ByteArrayInputStream bais  = new ByteArrayInputStream(bt);
                saxParser.parse( bais, (DefaultHandler) null );
                //saxParser.parse( new File(argv[0]), handler );
        }       
           System.out.println("total time used "+ (float)(System.currentTimeMillis() - a)/100);
        } catch (SAXException e)
    {
           System.out.println( e);
    }
    catch (Throwable t) {
            t.printStackTrace();
        }

  }
}