Using XML Catalogs in Cocoon

In this article I'll show a common use case for XML Catalogs. Using them is recommended not only because it avoids certain issues but also because it can drastically improve performance. I'll start by explaining the issue I faced recently and conclude with the resolution.

Issue

It started with the following exception:

java.io.IOException: Server returned HTTP response code: 429 for URL: http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd

HTTP status code 429 stands for "Too Many Requests", which is described as follows:

The user has sent too many requests in a given amount of time. Intended for use with rate limiting schemes.

Just to provide some context, I have an Apache Cocoon based application that does a lot of XSLT processing with Saxon. It appears that every time Saxon reads an XML document with a DTD reference, it tries to fetch the DTD source for validation. Obviously, if the processing rate is high enough and there is no caching, this creates a lot of unnecessary network traffic and eventually hits the rate limit. W3C has kindly explained the same issue.
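To make the behaviour visible without touching the network, here is a minimal sketch that records which external identifiers a parser asks for. It uses the JDK's built-in DOM parser rather than Saxon inside Cocoon, so treat it as an illustration of the general mechanism only; the class and method names (DtdFetchDemo, requestedSystemIds) are made up for this example. Note that the JDK parser requests the external DTD even with validation switched off, since a DTD can also declare entities and default attribute values.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Cocoon or Saxon.
public class DtdFetchDemo {

    /** Parses the XML and returns every external system ID the parser asked for. */
    public static List<String> requestedSystemIds(String xml) throws Exception {
        List<String> requested = new ArrayList<>();
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false); // validation is off on purpose
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            // Hand back an empty DTD so that nothing is downloaded.
            return new InputSource(new StringReader(""));
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return requested;
    }

    public static void main(String[] args) throws Exception {
        String svg = "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\" "
                + "\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">"
                + "<svg/>";
        // Prints the identifiers that would have been fetched over HTTP.
        System.out.println(requestedSystemIds(svg));
    }
}
```

Without the resolver, each of those identifiers would turn into an HTTP request per parsed document, which is how a busy pipeline ends up rate limited.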

Solution

An XML Catalog maps resource addresses to local copies of the same resources. Catalogs can therefore bring big benefits when your XML documents contain many external references. Here is the example catalog that resolved the above issue by using local SVG DTD files:

PUBLIC "-//W3C//DTD SVG 1.1//EN" "svg11.dtd"
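For readers who want to try the same lookup outside Cocoon, the sketch below resolves that public identifier with the JDK's own javax.xml.catalog API (Java 11 or later as written). This is an assumption-laden stand-in: the JDK API reads catalogs in the OASIS XML format only, so the one-line entry above is rewritten as its XML equivalent, and the temporary directory, file names and class name (CatalogLookupDemo) are invented for the example. Cocoon itself goes through a different resolver library.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import org.xml.sax.InputSource;

// Hypothetical demo class using the JDK catalog API, not Cocoon's resolver.
public class CatalogLookupDemo {

    /** Writes a one-entry catalog into dir and returns where the SVG DTD resolves to. */
    public static String resolveSvgDtd(Path dir) throws Exception {
        // XML-format equivalent of: PUBLIC "-//W3C//DTD SVG 1.1//EN" "svg11.dtd"
        String catalogXml =
              "<?xml version=\"1.0\"?>\n"
            + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
            + "  <public publicId=\"-//W3C//DTD SVG 1.1//EN\" uri=\"svg11.dtd\"/>\n"
            + "</catalog>\n";
        Path catalogFile = dir.toAbsolutePath().resolve("catalog.xml");
        Files.writeString(catalogFile, catalogXml);
        // Placeholder standing in for the real local copy of the DTD.
        Files.writeString(catalogFile.resolveSibling("svg11.dtd"), "<!-- placeholder -->");

        // Ask explicitly for public-identifier matching even though a system ID is given.
        CatalogFeatures features = CatalogFeatures.builder()
                .with(CatalogFeatures.Feature.PREFER, "public")
                .build();
        CatalogResolver resolver =
                CatalogManager.catalogResolver(features, catalogFile.toUri());

        InputSource source = resolver.resolveEntity(
                "-//W3C//DTD SVG 1.1//EN",
                "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd");
        return source.getSystemId();
    }

    public static void main(String[] args) throws Exception {
        // Prints a file: URI pointing at the local svg11.dtd next to the catalog.
        System.out.println(resolveSvgDtd(Files.createTempDirectory("catalog-demo")));
    }
}
```

The relative "svg11.dtd" is resolved against the location of the catalog file, which is why keeping the DTD files next to the catalog works.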

The entry simply maps the SVG formal public identifier to the local copy of the main DTD file. Both this file, named catalog, and all the required SVG DTD files are located under META-INF/cocoon/entities/catalog, the standard location for Cocoon. As described in "How to use a catalog file" and the Cocoon catalog documentation, we also need to create a CatalogManager.properties file, which must be placed on the Java classpath:

catalogs=META-INF/cocoon/entities/catalog
relative-catalogs=false
static-catalog=yes
verbosity=1
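A quick way to confirm that the properties file really is on the classpath is a small check like the one below. It is only a sketch: the class name (CatalogPropsCheck) is made up, and it assumes that the catalogs value is a semicolon-separated list and that each entry can be looked up as a classpath resource, which holds when CatalogManager.properties sits at the classpath root.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for checking the setup, not part of Cocoon.
public class CatalogPropsCheck {

    /** Loads catalog settings from any reader (classpath stream, file, string). */
    public static Properties load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return props;
    }

    /** Splits the "catalogs" property into its individual entries. */
    public static List<String> catalogEntries(Properties props) {
        List<String> entries = new ArrayList<>();
        for (String entry : props.getProperty("catalogs", "").split(";")) {
            if (!entry.trim().isEmpty()) {
                entries.add(entry.trim());
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        try (InputStream in = cl.getResourceAsStream("CatalogManager.properties")) {
            if (in == null) {
                System.out.println("CatalogManager.properties is not on the classpath");
                return;
            }
            Properties props = load(new InputStreamReader(in, StandardCharsets.UTF_8));
            for (String entry : catalogEntries(props)) {
                // A null URL means the catalog file itself was not found.
                System.out.println(entry + " -> " + cl.getResource(entry));
            }
        }
    }
}
```

Running it with the same classpath as the application shows at a glance whether both the properties file and the catalog it points to are visible.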

To conclude, XML Catalogs strike me as a little-known mechanism that deserves to be treated as good practice. Besides avoiding the rate limit issue, it made processing several times faster in certain cases. This can happen when the application sits behind a slow proxy and the DTD is fetched dozens of times in a pipeline.

Ivan Lagunov's Blog
