Using JavaScript hashCode to enable Cocoon caching of POST requests

I've just run into an issue with Cocoon caching of POST requests. Let me describe the use case. We use a custom XQueryGenerator to execute XQuery code against the Sedna XML Database and then process the XML results in a Cocoon pipeline. For the sake of performance, I configured expiration-based pipeline caching with a timeout of 60 seconds for all XQuery invocations:
<map:pipeline id="cached-services" type="expires" internal-only="true">
  <map:parameter name="cache-expires" value="60"/>
  <map:parameter name="cache-key" 
                 value="{request:sitemapURI}?{request:queryString}"/>

  <map:match pattern="cached-internal-xquery/**">
    <map:generate src="cocoon:/xquery-macro/{1}" type="queryStringXquery">
      <map:parameter name="contextPath" value="{request:contextPath}"/>
    </map:generate>
    <map:transform src="xslt/postprocessXqueryResults.xslt" type="saxon"/>
    <map:serialize type="xml"/>
  </map:match>
</map:pipeline>
As you can see, both the request sitemap URI and the query string are used to form the cache key. This works perfectly until you want to send the XQuery parameters via POST instead of GET. Then the query string is empty, so the cache key is identical for all POST requests to the same URI. As a result, the response to one POST request is cached and served for all the others, regardless of their parameters.
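To make the collision concrete, here is a minimal JavaScript sketch that mimics how the cache-key template above is expanded. The cacheKey function and the sample request objects are hypothetical illustrations, not part of the Cocoon API, and the URI is only an example value.

```javascript
// Hypothetical model of the sitemap's cache-key template:
// "{request:sitemapURI}?{request:queryString}"
function cacheKey(request) {
    return request.sitemapURI + "?" + request.queryString;
}

// Two POST requests with different bodies. The body never reaches the key.
var first = { sitemapURI: "xquery/basictype_tree", queryString: "", body: "id=1&id=2" };
var second = { sitemapURI: "xquery/basictype_tree", queryString: "", body: "id=3" };

var sameKey = cacheKey(first) === cacheKey(second);
console.log(sameKey); // true
```

Both requests map to the same key, so whichever response is cached first would be returned for the other one too.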

You may wonder why we need POST requests to load XML data at all. The reason is that we cannot predict how many request parameters there will be, since they are generated from a list of identifiers like this:
// id_list is a Collection of identifiers to be sent as request parameters
var postData = id_list.stringJoin(
        function(object) { return "id=" + object },
        "&"
);
    
var uri = "xquery/basictype_tree";

// This sends an asynchronous request and
// inserts its results into the containerId element.
new SimpleContainerTransaction(
    {
        "uri": uri, "containerId": "treenode-details-container",
        "method": "POST", "data": postData
    }
).execute();
Here SimpleContainerTransaction is part of a custom YUI3-based Transaction utility.
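For readers without that custom collection type, the parameter string can be built from a plain array in much the same way. This is only a sketch: buildPostData is a hypothetical stand-in for id_list.stringJoin, the sample identifiers are made up, and the encodeURIComponent call is an addition that the original snippet does not make.

```javascript
// Hypothetical stand-in for id_list.stringJoin(...): builds "id=a&id=b&..."
function buildPostData(ids) {
    return ids
        .map(function (id) {
            // Encoding is an addition here; it keeps "&" or spaces in an
            // identifier from corrupting the parameter string.
            return "id=" + encodeURIComponent(id);
        })
        .join("&");
}

var samplePostData = buildPostData(["A1", "B 2"]);
console.log(samplePostData); // id=A1&id=B%202
```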

Now it's time to fix the issue. The obvious approach is to generate a fake GET parameter in addition to the meaningful POST parameters. This fake parameter is a hash of the POST parameters, so identical requests get identical hash values, while requests with different parameters will almost always get different ones. Once this is in place, the query string is no longer empty and the caching should work for this use case as well.

Since we generate the POST parameter string in JavaScript, I searched for JavaScript hash implementations and found a nice overview of possible solutions. I adapted the first one and incorporated it into our project's JS library:
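// Computes a 32-bit integer hash of the string using the same
// hash = 31 * hash + charCode scheme as Java's String.hashCode().
String.prototype.hashCode = function() {
    var charCode, hash = 0;
    if (this.length === 0) return hash;
    for (var i = 0; i < this.length; i++) {
        charCode = this.charCodeAt(i);
        hash = ((hash << 5) - hash) + charCode; // (hash << 5) - hash == hash * 31
        hash = hash & hash; // Convert to 32bit integer
    }
    return hash;
};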
This extends all String objects with a hashCode function. Now let's fix the caching issue by appending the hash of the POST parameters to the URL as a GET parameter:
var uri = "xquery/basictype_tree?hash=" + postData.hashCode();
That's it, the caching works fine again.
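If you would rather not extend String.prototype, the same idea can be sketched with plain functions. The names hashCode and withHashParam below are hypothetical helpers, not part of our library. The last line illustrates a caveat: a 32-bit hash can collide, so two different parameter strings may in rare cases end up sharing a cache entry.

```javascript
// Same algorithm as the String.prototype version, written as a plain function.
function hashCode(str) {
    var hash = 0;
    for (var i = 0; i < str.length; i++) {
        hash = ((hash << 5) - hash) + str.charCodeAt(i);
        hash |= 0; // truncate to a signed 32-bit integer
    }
    return hash;
}

// Hypothetical helper: append the hash of the POST body as a GET parameter,
// using "&" instead of "?" when the URI already has a query string.
function withHashParam(uri, postData) {
    var separator = uri.indexOf("?") === -1 ? "?" : "&";
    return uri + separator + "hash=" + hashCode(postData);
}

console.log(withHashParam("xquery/basictype_tree", "id=1&id=2"));

// Different strings can produce the same hash value.
console.log(hashCode("Aa") === hashCode("BB")); // true
```

For a cache that expires after 60 seconds and serves lists of identifiers, such collisions are unlikely to matter in practice, but a longer hash would be a safer choice if serving the wrong response is unacceptable.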
