Ivan Lagunov's Blog

Posts

Linux command line tips and tricks

This post lists a number of useful tips and tricks from my daily Linux experience. I mostly deal with RHEL, but I believe these commands are largely independent of the Linux distribution (or can be adapted).

Network commands

Here are the network commands I use most.

Basic net utils:

    # Who is listening on a port:
    netstat -lp | grep <port>
    # Show all connections with numeric addresses and process IDs:
    netstat -anp
    # Listen on a port (to check connectivity from the other side):
    netcat -l -p <port>
    # -or-
    nc -l -p <port>

SSH tunnel:

    # Tunnel to remote_ip:remote_port via proxy_ip with known login/password
    # Connections to localhost:local_port are forwarded to remote_ip:remote_port
    ssh -L local_port:remote_ip:remote_port login@proxy_ip
    # Real-world example of a tunnel to a remote Sedna XML DB:
    ssh -L 5050:134.27.100.67:5050 pxqa1@134.27.100.67

Download via HTTP proxy with wget:

    # Download a resource from the internet from behind a proxy:
    http_proxy=http://host:port ; export http_proxy ; w...
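The argument order of the SSH tunnel command is easy to mix up, so here is a minimal sketch that only prints the command for a given set of values instead of running it. The function name tunnel_command is hypothetical and not part of the original post.

```shell
#!/bin/sh
# tunnel_command LOCAL_PORT REMOTE_IP REMOTE_PORT LOGIN PROXY_IP
# Hypothetical helper: prints the ssh command rather than executing it,
# so the port and host mapping can be reviewed first.
tunnel_command() {
    printf 'ssh -L %s:%s:%s %s@%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Reproduces the real-world example from the post:
tunnel_command 5050 134.27.100.67 5050 pxqa1 134.27.100.67
# prints: ssh -L 5050:134.27.100.67:5050 pxqa1@134.27.100.67
```

Once the printed command looks right, it can be pasted into the terminal as is.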

Extracting collection from Sedna XML DB

This post is actually based on something of an epic fail story. Initially the task was just to rename a collection in Sedna XML DB. The solution is as simple as using the RENAME COLLECTION statement of the Sedna Data Definition Language. But I'm probably too enthusiastic about writing Bash scripts in Linux, so I overlooked the single-statement solution and wrote a bunch of scripts to perform the same task via an extract-and-load procedure. Anyway, it can still be quite valuable for more complex tasks like moving a collection between XML DB installations (e.g. from the Production to the Test environment) or merging collections. So my solution follows below.

Extracting a single file

It's always wise to modularize the code and divide a task into smaller parts. First, we need a script for extracting a single file. It needs to be parameterized with a file name and a collection name. I also address another essential problem here: the safety of file names. It's not a common problem but we do...
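The file-name safety concern can be illustrated with a small sketch. This is not the script from the post: it assumes that a document name may contain characters that are awkward in file names (such as a slash or a space) and replaces everything outside a conservative whitelist with an underscore. The function name safe_file_name is hypothetical.

```shell
#!/bin/sh
# safe_file_name NAME
# Hypothetical helper: maps a document name to a file name that is safe to
# create on disk. Assumption: only letters, digits, '.', '_' and '-' are kept;
# every other character is replaced with '_'.
safe_file_name() {
    printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_'
}

safe_file_name 'reports/2011 summary.xml'   # prints reports_2011_summary.xml
echo
```

Note that such a mapping is not reversible and two different document names can end up with the same file name, so a real script would need to keep the original name somewhere if it matters for the reload.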

Using JavaScript hashCode to enable Cocoon caching of POST requests

I've just run into an issue with Cocoon caching of POST requests. Let me describe the use case here. We use a custom XQueryGenerator to execute XQuery code over Sedna XML Database and then process the XML results in the Cocoon pipeline. For the sake of performance, I configured pipeline caching with an expiration timeout of 60 seconds for all XQuery invocations:

    <map:pipeline id="cached-services" type="expires" internal-only="true">
      <map:parameter name="cache-expires" value="60"/>
      <map:parameter name="cache-key" value="{request:sitemapURI}?{request:queryString}"/>
      <map:match pattern="cached-internal-xquery/**">
        <map:generate src="cocoon:/xquery-macro/{1}" type="queryStringXquery">
          <map:parameter name="contextPath" value="{request:contextPath}"/>
        </map:generate>
        ...

Bulk loading files into Sedna XML DB - part 2

In part 1 of the article I used scripts to generate a bulk load file with LOAD instructions. But that approach has several drawbacks:

- existing files are not overwritten;
- it is hard to track the progress of a long-running operation when there is a huge number of files.

I've written a better script to solve those issues.

Bash script for loading files

The following Linux Bash script uploads files one by one using separate LOAD instructions. It also tries to remove the file first using the DROP DOCUMENT instruction. As a result, existing files are overwritten. After every 100 files loaded, you get a message with a timestamp. It helps to predict the end time of the operation.

    #!/bin/bash

    # This function writes a status message to both stdout and $OUTPUT_FILE
    function print_status {
        echo ">>> Loaded $counter files, time: $(date)" | tee -a "$OUTPUT_FILE"
    }

    OUTPUT_FILE=load_files.log
    COLLECTION_NAME=legacyBasicTypes
    echo "" > "$OUTPUT_FILE"
    counter=0...
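The drop-then-load loop with periodic status messages can be sketched as a dry run. This is an illustration under stated assumptions, not the full script from the post: run_statement is a hypothetical hook that only prints each statement (replace it with your own call to the Sedna terminal), the document name is assumed to be the base name of the file, and the statement forms LOAD "file" "document" "collection" and DROP DOCUMENT "document" IN COLLECTION "collection" should be checked against the Sedna documentation for your version.

```shell
#!/bin/sh
# Dry-run sketch: statements are printed instead of being sent to Sedna.
# run_statement is a hypothetical hook; swap in your own terminal invocation.
run_statement() {
    printf '%s\n' "$1"
}

# load_files COLLECTION STATUS_EVERY FILE...
load_files() {
    collection=$1
    status_every=$2
    shift 2
    counter=0
    for file in "$@"; do
        # Assumption: the document name is the base name of the file.
        doc_name=$(basename "$file")
        # Drop first so that an existing document is replaced by the load.
        run_statement "DROP DOCUMENT \"$doc_name\" IN COLLECTION \"$collection\""
        run_statement "LOAD \"$file\" \"$doc_name\" \"$collection\""
        counter=$((counter + 1))
        # Periodic status line with a timestamp.
        if [ $((counter % status_every)) -eq 0 ]; then
            echo ">>> Loaded $counter files, time: $(date)"
        fi
    done
    # Final status line, unless the last iteration has just printed one.
    if [ $((counter % status_every)) -ne 0 ]; then
        echo ">>> Loaded $counter files, time: $(date)"
    fi
}

load_files legacyBasicTypes 100 ./data/first.xml ./data/second.xml
```

In a real run the DROP DOCUMENT statement fails for documents that do not exist yet; that error is expected and can be ignored, since the following LOAD creates the document.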

Bulk loading files into Sedna XML DB

The problem is how to upload a large number of files into Sedna XML DB. How would you do this? If it is a repeated action, it's logical to create an application for this. This is quite easy using the Sedna XML:DB Java API. Actually we've already done so, but this article addresses another case. The problem with the Java API is performance: it always brings overhead compared to the embedded terminal utility (I got a rate of 2 seconds per file with a remote Sedna installation). Now I have several thousand files and I want to upload them fast, so let's turn to writing some useful scripts to automate it.

Generate bulk load file

First we need to generate an XQuery file with LOAD instructions that are supported by the Sedna terminal utility. Let's do this with another simple script. I had to do this under both Linux and Windows systems, so you'll find two scripts below. First comes the Linux shell script:

    #!/bin/sh
    OUTPUT_FILE=bulk_load.xque...
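The idea of generating one LOAD instruction per file can be sketched as follows. This is a minimal illustration, not the script from the post, and it rests on several assumptions: the files to load match *.xml in a single directory, the document name is the base name of the file, and statements in the generated file are separated by a line containing only "&" (verify the separator against the Sedna terminal documentation). The function name generate_bulk_load and the collection name myCollection are hypothetical.

```shell
#!/bin/sh
# generate_bulk_load DIR COLLECTION
# Hypothetical helper: prints one LOAD statement per *.xml file in DIR.
generate_bulk_load() {
    dir=$1
    collection=$2
    first=1
    for file in "$dir"/*.xml; do
        # Skip the unexpanded pattern when the directory has no *.xml files.
        [ -e "$file" ] || continue
        # Assumption: statements are separated by a line containing only "&".
        if [ "$first" -eq 0 ]; then
            printf '%s\n' '&'
        fi
        first=0
        printf 'LOAD "%s" "%s" "%s"\n' "$file" "$(basename "$file")" "$collection"
    done
}

generate_bulk_load . myCollection
```

Redirect the output to a file of your choice and review it before passing it to the Sedna terminal utility.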

Do It Yourself Java Profiling

This article is a free translation of a Russian article, which is a transcript of a Russian video lecture given by Roman Elizarov at the Application Developer Days 2011 conference. The lecturer talked about profiling Java applications without any standalone tools. Instead, he suggests using internal JVM features (i.e. thread dumps, Java agents, bytecode manipulation) to implement profiling quickly and efficiently. Moreover, it can be applied in Production environments with minimal overhead. This concept is called DIY or "Do It Yourself". The lecture's text and slides follow below.

Today I'm giving a lecture "Do It Yourself Java Profiling". It's based on real-life experience gained during more than 10 years of developing high-load financial applications that work with huge amounts of data, millions of currency rate changes per second and thousands of online users. As a result, we have to deal with profiling. Application pro...

Still looking for a Java profiler?

This post is just a short overview of the topic. Recently I had to investigate a performance issue in a Java application running on a JBoss server. This led me to a full-featured free Java profiling tool, VisualVM, which is available separately and as a bundled JDK tool starting from JDK 6 update 7. Thus, it's most likely already installed on your system. In my case it was great for locating the performance bottleneck in the Production environment. The list of features includes monitoring CPU and memory usage, application threads, profiling, sampling, taking thread and heap dumps, etc. I recommend watching the video tutorial on the Getting Started page. Here is my screenshot:

I'm not going to make a thorough comparison of different Java profiler tools here, but this is a list of alternatives for completeness:

- YourKit is free for open-source projects;
- JProfiler is free for open-source projects;
- NetBeans profiler is embedded into c...