Skip to main content

Java 8 Lambdas applied to QuickSort algorithm

In this article I'm going to review Java 8 Lambdas use cases after I've watched the Lambdas have come to Java! screencast from Typesafe. As a nice example, I've decided to count comparisons in the Quicksort algorithm.
  1. Basic algorithm.
  2. Inline lambdas.
  3. Method references.
Basic algorithm
Here is a basic implementation where we count comparisons in the Quicksort algorithm:
public class QuickSort {
    public static long countComparisons(List<Integer> a) {
        if (a.size() <= 1) return 0;
        int p = getPivot(a);
        int i = 1;
        for (int j = 1; j < a.size(); j++) {
             if (a.get(j) < p) {
                 if (j > i) swapInList(a, i, j);
                 i++;
             }
        }
        swapInList(a, 0, i - 1);
        return countComparisons(a.subList(0, i - 1)) +
                countComparisons(a.subList(i, a.size())) + a.size() - 1;
    }

    private static Integer getPivot(List<Integer> a) {
        return a.get(0);
    }

    private static void swapInList(List<Integer> a, int i, int j) {
        int temp = a.get(j);
        a.set(j, a.get(i));
        a.set(i, temp);
    }
}
Above the getPivot method is a crucial point of the Quicksort algorithm. The choice of pivot affects the algorithm performance. So I'd like to be able to provide an external implementation of this method and use "median of three" rule, for example. This rule means choosing the median of the first, middle and last element as a pivot.

Inline lambdas
First let's use lambdas in the straightforward way. This is what QuickSort.countComparisons method will look like:
public static long countComparisons(List<Integer> a,
                                    Function<List<Integer>, Integer> getPivot) {
    if (a.size() <= 1) return 0;
    int p = getPivot.apply(a);
    int i = 1;
    for (int j = 1; j < a.size(); j++) {
         if (a.get(j) < p) {
             if (j > i) swapInList(a, i, j);
             i++;
         }
    }
    swapInList(a, 0, i - 1);
    return countComparisons(a.subList(0, i - 1), getPivot) +
            countComparisons(a.subList(i, a.size()), getPivot) + a.size() - 1;
}
And here are the unit tests:
public class TestQuickSort {
    private List<Integer> unsorted;

    @Before
    public void setUp() {
        unsorted = Arrays.asList(9, 5, 1, 10, 6, 3, 2, 9, 3, 2);
    }

    @Test
    public void testFirstAsPivot() {
        Assert.assertEquals(22, QuickSort.countComparisons(unsorted, a -> a.get(0)));
    }

    @Test
    public void testLastAsPivot() {
        Assert.assertEquals(23, QuickSort.countComparisons(unsorted, a -> {
            QuickSort.swapInList(a, 0, a.size() - 1);
            return a.get(0);
        }));
    }

    @Test
    public void testMedianOfThreeAsPivot() {
        Assert.assertEquals(21, QuickSort.countComparisons(unsorted, a -> {
            int l = a.get(0), m = a.get((a.size() - 1) / 2), r = a.get(a.size() - 1);
            if ((m >= l && m <= r) || (m >= r && m <= l)) {
                QuickSort.swapInList(a, 0, (a.size() - 1) / 2);
            } else if ((r >= l && r <= m) || (r >= m && r <= l)) {
                QuickSort.swapInList(a, 0, a.size() - 1);
            }
            return a.get(0);
        }));
    }
}
Now all this is cool except that it's hard to debug, maintain and reuse. So if there is an exception being thrown from inside the lambda, you'll see rather cryptic stacktrace using anonymous class syntax. For example, let's add division by zero to a lambda and see the stacktrace:
@Test
public void testFirstAsPivot() {
    Assert.assertEquals(22, QuickSort.countComparisons(unsorted, a -> a.get(0) / 0));
}

java.lang.ArithmeticException: / by zero
 at TestQuickSort.lambda$testFirstAsPivot$0(TestQuickSort.java:22)
 at TestQuickSort$$Lambda$2/1766822961.apply(Unknown Source)
 at QuickSort.countComparisons(QuickSort.java:71)
 at TestQuickSort.testFirstAsPivot(TestQuickSort.java:22)
Not a big issue some will say. But in the next section let's refactor the lambdas into methods and use method references.

Method references
First let's move the choice of pivot logic to a separate class:
public class PivotChoice {
    public static Integer getFirst(List<Integer> a) {
        return a.get(0);
    }

    public static Integer getLast(List<Integer> a) {
        QuickSort.swapInList(a, 0, a.size() - 1);
        return a.get(0);
    }

    public static Integer getMedianOfThree(List<Integer> a) {
        int l = a.get(0), m = a.get((a.size() - 1) / 2), r = a.get(a.size() - 1);
        if ((m >= l && m <= r) || (m >= r && m <= l)) {
            QuickSort.swapInList(a, 0, (a.size() - 1) / 2);
        } else if ((r >= l && r <= m) || (r >= m && r <= l)) {
            QuickSort.swapInList(a, 0, a.size() - 1);
        }
        return a.get(0);
    }
}
Second let's update the unit tests:
@Test
public void testFirstAsPivot() {
    Assert.assertEquals(22, QuickSort.countComparisons(unsorted, PivotChoice::getFirst));
}

@Test
public void testLastAsPivot() {
    Assert.assertEquals(23, QuickSort.countComparisons(unsorted, PivotChoice::getLast));
}

@Test
public void testMedianOfThreeAsPivot() {
    Assert.assertEquals(21, QuickSort.countComparisons(unsorted, PivotChoice::getMedianOfThree));
}
Now it already looks much prettier. But let's again add division by zero at the same place and see the stacktrace:
public static Integer getFirst(List<Integer> a) {
    return a.get(0) / 0;
}

java.lang.ArithmeticException: / by zero
 at PivotChoice.getFirst(PivotChoice.java:8)
 at TestQuickSort$$Lambda$2/1496724653.apply(Unknown Source)
 at QuickSort.countComparisons(QuickSort.java:29)
 at TestQuickSort.testFirstAsPivot(TestQuickSort.java:22)
The stacktrace looks much more readable now. Besides when using method references, you cannot close over variables as you definitely can with lambdas. It makes the code less error-prone.

Comments

Popular posts from this blog

Connection to Amazon Neptune endpoint from EKS during development

This small article will describe how to connect to Amazon Neptune database endpoint from your PC during development. Amazon Neptune is a fully managed graph database service from Amazon. Due to security reasons direct connections to Neptune are not allowed, so it's impossible to attach a public IP address or load balancer to that service. Instead access is restricted to the same VPC where Neptune is set up, so applications should be deployed in the same VPC to be able to access the database. That's a great idea for Production however it makes it very difficult to develop, debug and test applications locally. The instructions below will help you to create a tunnel towards Neptune endpoint considering you use Amazon EKS - a managed Kubernetes service from Amazon. As a side note, if you don't use EKS, the same idea of creating a tunnel can be implemented using a Bastion server . In Kubernetes we'll create a dedicated proxying pod. Prerequisites. Setting up a tunnel.

Notes on upgrade to JSF 2.1, Servlet 3.0, Spring 4.0, RichFaces 4.3

This article is devoted to an upgrade of a common JSF Spring application. Time flies and there is already Java EE 7 platform out and widely used. It's sometimes said that Spring framework has become legacy with appearance of Java EE 6. But it's out of scope of this post. Here I'm going to provide notes about the minimal changes that I found required for the upgrade of the application from JSF 1.2 to 2.1, from JSTL 1.1.2 to 1.2, from Servlet 2.4 to 3.0, from Spring 3.1.3 to 4.0.5, from RichFaces 3.3.3 to 4.3.7. It must be mentioned that the latest final RichFaces release 4.3.7 depends on JSF 2.1, JSTL 1.2 and Servlet 3.0.1 that dictated those versions. This post should not be considered as comprehensive but rather showing how I did the upgrade. See the links for more details. Jetty & Tomcat. JSTL. JSF & Facelets. Servlet. Spring framework. RichFaces. Jetty & Tomcat First, I upgraded the application to run with the latest servlet container versio

Extracting XML comments with XQuery

I've just discovered that it's possible to process comment nodes using XQuery. Ideally it should not be the case if you take part in designing your data formats, then you should simply store valuable data in plain xml. But I have to deal with OntoML data source that uses a bit peculiar format while export to XML, i.e. some data fields are stored inside XML comments. So here is an example how to solve this problem. XML example This is an example stub of one real xml with irrelevant data omitted. There are several thousands of xmls like this stored in Sedna XML DB collection. Finally, I need to extract the list of pairs for the complete collection: identifier (i.e. SOT1209 ) and saved timestamp (i.e. 2012-12-12 23:58:13.118 GMT ). <?xml version="1.0" standalone="yes"?> <!--EXPORT_PROGRAM:=eptos-iso29002-10-Export-V10--> <!--File saved on: 2012-12-12 23:58:13.118 GMT--> <!--XML Schema used: V099--> <cat:catalogue xmlns:cat=