Skip to main content

Sedna XML DB and RelWithDebugInfo mode

Once we had a severe issue with Sedna hanging regularly. It was caused by broken indexes after an upgrade at that moment. The issue caused quite a nightmare and led to a lot of time wasted until we solved it together with Sedna devs. Since that moment it has become very important to be able to look into what is happening inside Sedna at any particular moment. Fortunately, there is a suitable way although it's not documented properly on the Sedna website. All you need is to build Sedna from source with a special flag RelWithDebugInfo.
  1. Cmake build modes.
  2. Using gdb.
  3. Using netstat.
Cmake build modes
Cmake has several build modes with Release and Debug obviously among them. Another mode that can be of big use is called RelWithDebugInfo. There is a perfect explanation for it on the mailing list:
The difference between Debug and RelwithDebInfo is that RelwithDebInfo is quite similar to Release mode. It produces fully optimised code, but also builds the program database, and inserts debug line information to give a debugger a good chance at guessing where in the code you are at any time.
So it was suggested by Sedna devs to build the database from source with this flag:
-DCMAKE_BUILD_TYPE=RelWithDebugInfo
Next you'll see how it allows tracking state of open database transactions.

Using gdb
The GNU Project debugger can help to see what is going on inside a running program. It's usage is very simple, you only need to know the process id so you should run gdb . <pid>. Then you can fetch the list of threads with info threads, switch between threads with thread <number> and see the backtrace of a thread with bt command. Here is an example:
lagivan@host:/home/lagivan>gdb . 7108
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-32.el5)
--- GDB loading traces have been removed for clearness ---

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fffb51fc000
0x0000003fbc8d5887 in semop () from /lib64/libc.so.6
(gdb) info threads
  4 Thread 0x41ec8940 (LWP 7109)  0x0000003fbd40b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3 Thread 0x41ee1940 (LWP 7110)  0x0000003fbc8d5887 in semop () from /lib64/libc.so.6
  2 Thread 0x41189940 (LWP 7111)  0x0000003fbc8d5887 in semop () from /lib64/libc.so.6
* 1 Thread 0x2b20394eea80 (LWP 7108)  0x0000003fbc8d5887 in semop () from /lib64/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 0x41ee1940 (LWP 7110))]#0  0x0000003fbc8d5887 in semop () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003fbc8d5887 in semop () from /lib64/libc.so.6
#1  0x00000000004678d0 in EventWaitReset ()
#2  0x00000000004679c7 in UEventWait ()
#3  0x000000000042e724 in checkpoint_thread(void*) ()
#4  0x0000003fbd40673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003fbc8d44bd in clone () from /lib64/libc.so.6
(gdb) quit
The backtrace above looks like the thread is waiting for a semaphore that is quite normal. However, the hanging Sedna transaction is probably blocked waiting for a client response when you observe the following backtrace:
#0  0x00000037858cd722 in select () from /lib64/libc.so.6
#1  0x000000000097bf1c in uselect_read ()
#2  0x00000000005942b1 in socket_client::read_msg(msg_struct*) ()
#3  0x000000000058c001 in TRmain ()
#4  0x00000000008fb60e in main ()
This can be a sign that the client has not closed the transaction properly or there has been a network interruption. Certainly, this tool can be very useful if you know the program implementation details. In my case the backtraces together with Sedna logs were analyzed by Sedna devs.

Using netstat
Further investigation can be done with netstat tool to track which client behaves badly and makes the database wait. The following command will return the list of established connections with process ids shown:
netstat -nap | grep ESTABLISHED
Hopefully, this information should be sufficient to identify the client. Then it's a matter of analyzing its logs and deciding if it's a client issue or a Sedna bug.

Comments

Popular posts from this blog

Connection to Amazon Neptune endpoint from EKS during development

This small article will describe how to connect to Amazon Neptune database endpoint from your PC during development. Amazon Neptune is a fully managed graph database service from Amazon. Due to security reasons direct connections to Neptune are not allowed, so it's impossible to attach a public IP address or load balancer to that service. Instead access is restricted to the same VPC where Neptune is set up, so applications should be deployed in the same VPC to be able to access the database. That's a great idea for Production however it makes it very difficult to develop, debug and test applications locally. The instructions below will help you to create a tunnel towards Neptune endpoint considering you use Amazon EKS - a managed Kubernetes service from Amazon. As a side note, if you don't use EKS, the same idea of creating a tunnel can be implemented using a Bastion server . In Kubernetes we'll create a dedicated proxying pod. Prerequisites. Setting up a tunnel. ...

Notes on upgrade to JSF 2.1, Servlet 3.0, Spring 4.0, RichFaces 4.3

This article is devoted to an upgrade of a common JSF Spring application. Time flies and there is already Java EE 7 platform out and widely used. It's sometimes said that Spring framework has become legacy with appearance of Java EE 6. But it's out of scope of this post. Here I'm going to provide notes about the minimal changes that I found required for the upgrade of the application from JSF 1.2 to 2.1, from JSTL 1.1.2 to 1.2, from Servlet 2.4 to 3.0, from Spring 3.1.3 to 4.0.5, from RichFaces 3.3.3 to 4.3.7. It must be mentioned that the latest final RichFaces release 4.3.7 depends on JSF 2.1, JSTL 1.2 and Servlet 3.0.1 that dictated those versions. This post should not be considered as comprehensive but rather showing how I did the upgrade. See the links for more details. Jetty & Tomcat. JSTL. JSF & Facelets. Servlet. Spring framework. RichFaces. Jetty & Tomcat First, I upgraded the application to run with the latest servlet container versio...

Managing Content Security Policy (CSP) in IBM MAS Manage

This article explores a new system property introduced in IBM MAS 8.11.0 and Manage 8.7.0+ that enhances security but can inadvertently break Google Maps functionality within Manage. We'll delve into the root cause, provide a step-by-step solution, and offer best practices for managing Content Security Policy (CSP) effectively. Understanding the issue IBM MAS 8.11.0 and Manage 8.7.0 introduced the mxe.sec.header.Content_Security_Policy   property, implementing CSP to safeguard against injection attacks. While beneficial, its default configuration restricts external resources, causing Google Maps and fonts to malfunction. CSP dictates which domains can serve various content types (scripts, images, fonts) to a web page. The default value in this property blocks Google-related domains by default. Original value font-src 'self' data: https://1.www.s81c.com *.walkme.com; script-src 'self' 'unsafe-inline' 'unsafe-eval' ...