Since its initial release, VTD-XML has undergone several rounds of improvement. This report shows how well VTD-XML fares against some well-known XML parsers. The old version of the benchmark can be found here.
For DOM and VTD-XML, the benchmark programs generate hierarchical structures.
For SAX and Pull parsers, the benchmark programs scan the entire document without any processing logic.
To obtain performance numbers, every benchmark program first loops through the parsing code for a number of iterations so that the server JVM compiles it to native code; only then does the actual measurement of parsing throughput and latency begin.
Note that comparing VTD-XML with SAX or Pull is not really a fair comparison: VTD-XML allows random access, whereas SAX and Pull are forward-only.
A wide selection of XML files, ranging from very small (1 KB) to big (15 MB), was chosen and grouped into small (<10 KB), medium (<1 MB), and big.
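The warm-up-then-measure procedure can be sketched as follows. This is a minimal illustration, not the original benchmark program: it uses only the JDK's built-in SAX parser with a no-op handler (a scan without processing logic), and the class name, method names, iteration counts, and the choice of 1 MB = 2^20 bytes are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical harness; names and iteration counts are illustrative only.
final class WarmupBenchmark {

    // Parses the document once with a no-op handler: a pure scan, no processing logic.
    static void parseOnce(SAXParser parser, byte[] xml) throws Exception {
        parser.parse(new ByteArrayInputStream(xml), new DefaultHandler());
    }

    // Runs `warmup` untimed iterations so the JIT can compile the hot path,
    // then returns the average latency in milliseconds over `measured` timed iterations.
    static double averageLatencyMillis(byte[] xml, int warmup, int measured) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        for (int i = 0; i < warmup; i++) {
            parseOnce(parser, xml);
        }
        long start = System.nanoTime();
        for (int i = 0; i < measured; i++) {
            parseOnce(parser, xml);
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measured;
    }

    // Converts a per-document latency into throughput in MB/s (assuming 1 MB = 2^20 bytes).
    static double throughputMBPerSecond(long documentBytes, double latencyMillis) {
        return (documentBytes / (1024.0 * 1024.0)) / (latencyMillis / 1000.0);
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<order><item id=\"1\">pen</item><item id=\"2\">ink</item></order>"
                .getBytes(StandardCharsets.UTF_8);
        double latency = averageLatencyMillis(xml, 2_000, 2_000);
        System.out.printf("latency: %.4f ms, throughput: %.2f MB/s%n",
                latency, throughputMBPerSecond(xml.length, latency));
    }
}
```

Keeping the warm-up loop identical to the timed loop matters: the JIT compiles the code paths it actually observes, so warming up on a different document or handler may leave the measured path partly interpreted.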
Benchmark programs for measuring parsing performance can be downloaded below:
Benchmark programs for measuring memory usage can be downloaded below:
Benchmark programs for doing node iteration can be downloaded below:
XML files used in the benchmark can be downloaded here.
File name/size | VTD-XML (ms) | VTD-XML buffer reuse (ms) | SAX (ms) | DOM (ms) | DOM deferred (ms) | Piccolo (ms) | Pull (ms) |
soap2.xml (1727 bytes) | 0.0446 | 0.0346 | 0.0782 | 0.1122 | 0.16225 | 0.092 | 0.066 |
nav_48_0.xml (4608 bytes) | 0.1054 | 0.0928 | 0.266 | 0.37 | 0.385 | 0.2784 | 0.1742 |
cd_catalog.xml (5035 bytes) | 0.118 | 0.108 | 0.19 | 0.348 | 0.4 | 0.2 | 0.214 |
nav_63_0.xml (6848 bytes) | 0.149 | 0.135 | 0.354 | 0.513 | 0.557 | 0.484 | 0.242 |
nav_78_0.xml (6920 bytes) | 0.153 | 0.142 | 0.3704 | 0.588 | 0.52 | 0.42 | 0.29 |
File name/size | VTD-XML (ms) | VTD-XML buffer reuse (ms) | SAX (ms) | DOM (ms) | DOM deferred (ms) | Piccolo (ms) | Pull (ms) |
nav_50_0.xml (10304 bytes) | 0.2 | 0.185 | 0.55 | 0.802 | 0.773 | 0.701 | 0.398 |
officeOrder.xml (10591 bytes) | 0.186 | 0.174 | 0.41 | 0.617 | 0.615 | 0.526 | 0.432 |
form.xml (15845 bytes) | 0.274 | 0.258 | 0.227 | 0.214 | 0.486 | 0.773 | 0.921 |
book.xml (22996 bytes) | 0.368 | 0.354 | 0.743 | 2.391 | 2.046 | 0.843 | 0.857 |
soap_small.xml (26734 bytes) | 0.58 | 0.563 | 1.221 | 3.825 | 3.068 | 1.346 | 1.137 |
cd.xml (30831 bytes) | 0.569 | 0.549 | 1.205 | 5.092 | 4.376 | 1.211 | 1.362 |
bioinfo.xml (34759 bytes) | 0.529 | 0.517 | 1.068 | 4.126 | 4.366 | 1.188 | 1.33 |
soap_mid.xml (134334 bytes) | 2.885 | 2.804 | 6.028 | 32.846 | 21.896 | 6.668 | 5.828 |
File name/size | VTD-XML (ms) | VTD-XML buffer reuse (ms) | SAX (ms) | DOM (ms) | DOM deferred (ms) | Piccolo (ms) | Pull (ms) |
po1m.xml (1.01 MB) | 25.71 | 20.08 | 36.4 | 186.16 | 115.67 | 47.62 | 63.27 |
soap.xml (2.59 MB) | 64.7 | 57.27 | 123.18 | 502.32 | 380.74 | 134.8 | 393.96 |
bioinfo_big.xml (4.27 MB) | 70.1 | 73.9 | 131.8 | 629.1 | 442.02 | 151.62 | 177.64 |
SUAS.xml (13.13 MB) | 359.91 | 315.24 | 665.36 | 1961.01 | 1296.08 | 820.38 | 637.72 |
address.xml (15.24 MB) | 315.06 | 276 | 658.56 | 2158.5 | 1822.22 | 617.48 | 684.57 |
Because SAX and Pull do not build data structures in memory, the meaningful comparison is between DOM and VTD-XML. To that end, we benchmark the multiplying factor, which is the ratio of memory usage to document size.
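As a rough sketch of how such a multiplying factor could be estimated, the example below compares heap usage before and after building a DOM tree with the JDK's default parser. It is not the original memory benchmark: the class and method names are hypothetical, the generated test document is made up, and heap readings taken via `Runtime` after a `System.gc()` hint are only approximate. The JDK's default DOM implementation may also defer node expansion, which would lower the reading.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper; an approximation, not a precise memory profiler.
final class MemoryFactor {

    // Multiplying factor = memory held by the parsed representation / size of the XML document.
    static double multiplyingFactor(long memoryBytes, long documentBytes) {
        if (documentBytes <= 0) {
            throw new IllegalArgumentException("documentBytes must be positive");
        }
        return (double) memoryBytes / documentBytes;
    }

    // Heap currently in use after suggesting a collection.
    // System.gc() is only a hint, so the result is a rough estimate.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws Exception {
        // Build a synthetic document of a few hundred kilobytes (illustrative data only).
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < 20_000; i++) {
            sb.append("<item id=\"").append(i).append("\">value</item>");
        }
        sb.append("</items>");
        byte[] xml = sb.toString().getBytes(StandardCharsets.UTF_8);

        long before = usedHeapBytes();
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        long after = usedHeapBytes();

        System.out.printf("document: %d bytes, factor: %.2f%n",
                xml.length, multiplyingFactor(after - before, xml.length));
        // Use the tree after measuring so it stays reachable during the second reading.
        System.out.println(dom.getDocumentElement().getNodeName());
    }
}
```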
The goal of this part is to benchmark how fast the XML parsers visit every single node once the hierarchical structure has been built.
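For the DOM side, "visiting every single node" can be illustrated with a depth-first walk over an already-built tree. This is a minimal sketch using the JDK's DOM API, not the original iteration benchmark; the class and method names are hypothetical, and the VTD-XML equivalent is omitted because it would require that library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical example class for a full DOM traversal.
final class DomWalk {

    // Visits `node` and all of its descendants depth-first and returns how many nodes were seen.
    // Attributes are not children in DOM, so they are not counted.
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    // Builds the hierarchical structure first; traversal is a separate step.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<a><b>1</b><c/></a>");
        // Elements a, b, c plus the text node "1".
        System.out.println(countNodes(doc.getDocumentElement())); // prints 4
    }
}
```

In a timing run, only the `countNodes` call would sit inside the measured loop, since parsing cost is reported separately.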
File name/size | VTD-XML (ms) | DOM (ms) |
soap2.xml (1727 bytes) | 0.00671 | 0.00676 |
nav_48_0.xml (4608 bytes) | 0.028 | 0.0155 |
cd_catalog.xml (5035 bytes) | 0.0388 | 0.0385 |
nav_63_0.xml (6848 bytes) | 0.0431 | 0.0238 |
nav_78_0.xml (6920 bytes) | 0.043 | 0.0244 |
File name/size | VTD-XML (ms) | DOM (ms) |
nav_50_0.xml (10304 bytes) | 0.063 | 0.034 |
officeOrder.xml (10591 bytes) | 0.0788 | 0.051 |
form.xml (15845 bytes) | 0.065 | 0.046 |
book.xml (22996 bytes) | 0.149 | 0.144 |
soap_small.xml (26734 bytes) | 0.225 | 0.193 |
cd.xml (30831 bytes) | 0.226 | 0.3 |
bioinfo.xml (34759 bytes) | 0.236 | 0.178 |
soap_mid.xml (134334 bytes) | 1.61 | 1.151 |
File name/size | VTD-XML (ms) | DOM (ms) |
po1m.xml (1.01 MB) | 11.19 | 10.84 |
soap.xml (2.59 MB) | 32.84 | 35.44 |
bioinfo_big.xml (4.27 MB) | 30.43 | 38.26 |
SUAS.xml (13.13 MB) | 21.43 | 21.82 |
address.xml (15.24 MB) | 132.18 | 130.8 |
As a next-generation XML parser, VTD-XML delivers compelling improvements in both memory usage and parsing performance. Moreover, its performance and memory usage benefits apply across file sizes and purposes, so it should satisfy even the most demanding XML processing needs and enable new and exciting XML applications.