A faster and more compact Set

Looking at some of the new code being added to Ontopia, I thought it might be useful to point out that Ontopia includes a class called CompactHashSet, which is both faster and more compact than java.util.HashSet. So when you use Sets in your code, it might be worth your while to use CompactHashSet instead of HashSet.

The in-memory engine uses Sets throughout for all the collections in the core API. There are a lot of these: the set of topics, the set of names for each topic, the set of item identifiers for each topic, … Ontopia originally used a lot of memory, and we identified HashSet as a major culprit, so CompactHashSet was written to reduce memory usage. An interesting side-effect was that it also turned out to be faster.

HashSet uses open hashing, which means that each bucket in the hash table refers to a linked list of entries whose hash codes place them in the same bucket. For each entry in the set an extra linked-list element object must therefore be allocated, which of course requires extra memory.

CompactHashSet, by contrast, uses closed hashing. This is a strategy where if the bucket you want to place a new entry in is already occupied, you run further down the hash table (in a predictable way!) looking for free buckets. This means you can do away with the linked list, thus saving memory.
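
To make the strategy concrete, here is a minimal sketch of a closed-hashing set in Java. The class and all of its details (linear probing, a load factor of one half, doubling on rehash) are made up for illustration; CompactHashSet's real implementation is more sophisticated.

// Illustration only: a tiny closed-hashing (open addressing) set with linear probing.
public class ProbingSetSketch {
  private Object[] table = new Object[16]; // buckets hold the elements directly
  private int size;

  public boolean add(Object o) {
    if (size >= table.length / 2)
      rehash(); // keep the table at most half full so a free bucket always exists
    int ix = (o.hashCode() & 0x7FFFFFFF) % table.length;
    while (table[ix] != null) {        // bucket occupied?
      if (table[ix].equals(o))
        return false;                  // already present
      ix = (ix + 1) % table.length;    // run further down the table
    }
    table[ix] = o;
    size++;
    return true;
  }

  public boolean contains(Object o) {
    int ix = (o.hashCode() & 0x7FFFFFFF) % table.length;
    while (table[ix] != null) {
      if (table[ix].equals(o))
        return true;
      ix = (ix + 1) % table.length;
    }
    return false;                      // hit an empty bucket: not in the set
  }

  private void rehash() {
    Object[] old = table;
    table = new Object[old.length * 2];
    size = 0;
    for (int ix = 0; ix < old.length; ix++)
      if (old[ix] != null)
        add(old[ix]);
  }
}

Removal is deliberately left out of the sketch: with closed hashing you cannot simply clear a bucket without breaking the probe runs of other entries, so deletions are usually handled with marker values, and that subtlety is related to the bugs discussed below.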

So how much faster and more compact is CompactHashSet? I put together a little unscientific test and ran it three times for each of the implementations. The test first adds 1 million random numbers to the set, then does 1 million lookups, then traverses the set, and finally does 1 million removals. (See below for the test code.) These are the results:

CLASS           TIME (ms)   MEMORY (bytes)
HashSet         4477        44644624
HashSet         4447        44651984
HashSet         4500        44632824
CompactHashSet  2416        22886464
CompactHashSet  2351        22889408
CompactHashSet  2370        22895872

In other words, in our little test, CompactHashSet is nearly twice as fast and uses about half as much memory compared to HashSet. When you win on both speed and memory use there isn’t much left, really…

Except, of course, reliability. Initially, there were some problems with the CompactHashSet class. In some cases, the run down the hash table to find free buckets could get into an infinite loop without ever finding a bucket. That has since been solved. And, many years ago, there was a memory leak in it, causing deleted objects to be kept when rehashing. This caused serious performance problems for some customers and took months to track down.

EDIT: Discussion on reddit.com shows that many people misunderstood the above. The infinite loop bug was found during initial testing. The memory leak was found once customers started deploying the code for real, which may have been about a year after it was implemented. This was in 2003. Since then we have found exactly zero bugs in the code. END EDIT

By now, however, we have an extensive test suite for the class, and it’s been in use unchanged for many years with no problems. Using CompactHashSet should be entirely safe.

The test code

If you want to try the test yourself, here is the test code:

import java.util.Set;
import java.util.HashSet;
import java.util.Iterator;
import net.ontopia.utils.CompactHashSet;

public class TestHashSet {
  private static final int TIMES = 1000000;
  private static final int MAX = 5000000;
  
  public static void main(String[] argv) {
    // first, get the JIT going
    test(false, new CompactHashSet());
    test(false, new HashSet());

    // then, do real timings
    for (int ix = 0; ix < 3; ix++)
      test(true, new HashSet());
    for (int ix = 0; ix < 3; ix++)
      test(true, new CompactHashSet());
  }

  public static void test(boolean output, Set set) {
    long start = System.currentTimeMillis();

    if (output) {
      System.gc(); System.gc();
    }
    long before = Runtime.getRuntime().totalMemory() -
      Runtime.getRuntime().freeMemory();
    
    // add
    for (int ix = 0; ix < TIMES; ix++)
      set.add(new Long(Math.round(Math.random() * MAX)));

    if (output) {
      System.gc(); System.gc();
      long after = Runtime.getRuntime().totalMemory() -
        Runtime.getRuntime().freeMemory();
      System.out.println("Memory before: " + before);
      System.out.println("Memory after: " + after);
      System.out.println("Memory usage: " + (after - before));
    }
    
    // lookup
    int count = 0;
    for (int ix = 0; ix < TIMES; ix++) {
      Long number = new Long(Math.round(Math.random() * MAX));
      if (set.contains(number))
        count++;
    }

    // iterate
    Iterator it = set.iterator();
    while (it.hasNext()) {
      Long number = (Long) it.next();
    }

    // remove
    for (int ix = 0; ix < TIMES; ix++) {
      Long number = new Long(Math.round(Math.random() * MAX));
      set.remove(number);
    }

    if (output)
      System.out.println("TIME: " + (System.currentTimeMillis() - start));
  }
}

Hardship at NRK

At NRK, Per Magnus Løvold is the internal developer responsible for keeping the application running, so he is the one who had to deal with the editorial team's frustration when it was not working properly.

In the end he made a vow not to shave before the problem was solved, possibly inspired by Harald Finehair. He sent us some photographic evidence, which we thought worth sharing:

Before and after

Performance problems at NRK/Skole

In the presentation of the NRK/Skole case at the August 27 meeting the customer spent some time on the performance problems they had been experiencing recently, and there was quite a bit of discussion at the meeting about what the possible causes might be. Given that the whole site is based on Ontopia, much of the suspicion obviously centered on the Ontopia software and possible performance problems with it.

The initial report we had from the customer was that the site was “unstable”, meaning that it would sometimes not respond or be very slow. On occasion it would also have to be restarted. There was just one server with Ontopia deployed on it, but traffic on the site could only be described as modest (it was not yet officially in use by schools), so it should easily have been able to handle the traffic.

The customer wanted to switch to Ontopia 5.0.0 from OKS 4.1 in order to be able to use two load-balanced servers in a cluster (OKS 4.1 is priced per CPU, so it was too expensive to do it with this version). Unfortunately, testing showed that Ontopia 5.0.0 performed even worse than 4.1. It was at this point that the customer asked us to investigate and solve the performance problems.

Investigation

Our first step was to use TologSpy, a query profiler for tolog queries, to see if any of the queries in the application were slow. We quickly determined that the new optimizer made one of the queries very slow because it, well, de-optimized the query quite considerably. However, this was a problem with Ontopia 5.0.0 only, and couldn't be the cause of the problems with OKS 4.1. Further, inserting a comment with a pragma to the optimizer should solve the problem.

Further testing showed that other than this the application seemed perfectly fine. We tested it with JMeter scripts running 10 client threads in parallel without being able to provoke any real problems. TologSpy showed that the bad query was still an issue (even with the pragma), but other than that everything seemed just fine.

So we changed tactics. Instead of using JMeter to pound some randomly selected URIs we decided to write a Python script to play back portions of the access log from the production site. This gave the same results, initially. Then the customer noticed that two sets of URLs were “hanging” and not giving responses. So further investigation naturally focused on these.

Problem #1

It turned out that for some clips the metadata on the left would never be shown. Instead, the server would use all available memory and 100% CPU until the thread died with an out-of-memory error. Further digging showed that some of the metadata fields were being parsed with an Antlr-based parser written specifically for this project (that is, not part of Ontopia at all) in order to have the metadata nicely formatted. Running the parser from the command line, without Ontopia or a web server, we found that some string values would make it consume 100% CPU until it hit the 700 MB RAM ceiling, and then crash.

Studying the Antlr-based parser showed that its lexer had a token definition which could match the empty string. If an illegal character (that is, one that couldn't match any token) appeared, the lexer would take the only option open to it and generate an infinite list of empty-string tokens in front of that character.
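
To illustrate the mechanism, here is a small hand-rolled Java lexer loop. This is not the project's grammar; the token pattern is made up, and a real Antlr lexer works differently, but it shows what happens when a token definition can match zero characters.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: a token pattern that can match the empty string.
// NUMBER uses * instead of +, so it matches zero characters, and the lexer
// loop below never advances past a character that no token can match.
public class EmptyTokenSketch {
  private static final Pattern NUMBER = Pattern.compile("[0-9]*"); // can match ""

  public static void main(String[] args) {
    String input = "42x17";   // 'x' matches no token
    int pos = 0;
    int emitted = 0;
    while (pos < input.length() && emitted < 10) { // cap iterations for the demo
      Matcher m = NUMBER.matcher(input).region(pos, input.length());
      if (!m.lookingAt())
        break;
      System.out.println("token: \"" + m.group() + "\" at " + pos);
      pos = m.end();          // a zero-length match leaves pos unchanged: no progress
      emitted++;
    }
    if (pos < input.length())
      System.out.println("stuck at position " + pos + " without consuming '" +
                         input.charAt(pos) + "'");
  }
}

With the iteration cap removed, the loop would produce empty tokens at the same position forever, which is essentially what ate all the CPU and memory in the real parser.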

This, of course, was the problem. Any time a user asked for one of these clips the server would use all available CPU and all memory, dumping out most of the objects in the Ontopia object cache. Typically, the user would reload when the server didn’t respond the first time, thus setting off the same thing again. Other pages would of course be slow during this time, especially as all caches would have to be filled up again, and the server would appear frozen/sluggish for a while before things calmed down again.

Fixing this required just changing a few lines in the .g file in the project code.

Problem #2

The second problem was the de-optimized tolog query. Setting the pragma option to turn off the new optimizer for some reason did not solve the problem. Running the query in the tolog plug-in in Omnigator worked fine, even if the query was a little slow. When run from the project's Java code, however, the query would never complete.

It took a while to work out what was causing this, but in the end we realized that the inference rules were being handled differently in Omnigator than in the Java code. The Omnigator plug-in passes all rule declarations to the query processor together with the query, as a single string, but the Java code was parsing the rule declarations into a DeclarationContextIF object, which was then passed in together with the query.

Further study showed that the code which optimizes the inference rules was not being run when the rules were parsed into a context object. It was, however, run when the rules were declared as part of the query. Once this observation was made, the fix was quite simple.
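
To make the two code paths concrete, here is a rough sketch in Java. The rule and the query are invented for the example (they are not from the project); the point is simply how the rule declarations reach the query processor in each case.

import net.ontopia.topicmaps.core.TopicMapIF;
import net.ontopia.topicmaps.query.core.DeclarationContextIF;
import net.ontopia.topicmaps.query.core.QueryProcessorIF;
import net.ontopia.topicmaps.query.core.QueryResultIF;
import net.ontopia.topicmaps.query.utils.QueryUtils;

// Sketch of the two ways of passing inference rules to the tolog query processor.
public class RuleContextSketch {
  private static final String RULE =
    "indirect-instance-of($A, $C) :- instance-of($A, $B), instance-of($B, $C). ";
  private static final String QUERY = "indirect-instance-of($TOPIC, my-type)?";

  // Path 1: rules and query passed as one string, the way the Omnigator plug-in
  // does it. The rule declarations go through the same pipeline as the query.
  public static QueryResultIF runInline(TopicMapIF tm) throws Exception {
    QueryProcessorIF processor = QueryUtils.getQueryProcessor(tm);
    return processor.execute(RULE + QUERY);
  }

  // Path 2: rules parsed once into a DeclarationContextIF and reused for many
  // queries, the way the project's Java code did it. Before the fix, this path
  // skipped the rule optimization step.
  public static QueryResultIF runWithContext(TopicMapIF tm) throws Exception {
    QueryProcessorIF processor = QueryUtils.getQueryProcessor(tm);
    DeclarationContextIF context = QueryUtils.parseDeclarations(tm, RULE);
    return processor.execute(QUERY, context);
  }
}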

Conclusion

A bug in the project code (in the parser) was, in other words, the cause of the performance and stability issues. A fairly obscure bug in tolog's parsing of rule declarations was the reason why 5.0.0 performed even worse than the old version. Now that these two issues are out of the way, the site is behaving much better, and the customer will be able to change over to a load-balanced environment with two clustered Ontopia servers.

Ontopia 5.0.1 is released!

After a few days in beta, with 30 downloads and no error reports, we have released Ontopia 5.0.1. The release mostly contains bug fixes and minor improvements throughout the product. From the release notes:

The following changes have been made:

The following bugs have been fixed: