Dealing with JVM limitations in Apache Cassandra

FOSDEM 2012

The Cassandra distributed database runs on the JVM, which in many ways is a huge productivity boost. However, the more you care about performance, the less you can afford to treat abstractions such as I/O and memory management as black boxes. This talk will cover how Cassandra has gone beyond the interfaces provided by the JVM to achieve higher performance in three areas. First, how Cassandra uses platform-specific features such as posix_fadvise, munmap, and mlock. Second, how we tune memory management to avoid pain points in common garbage collector designs, including OpenJDK's. Third, how Cassandra uses the JDK's instrumentation hooks to measure the real size of on-heap structures and make better decisions about what to keep in memory without error-prone manual adjustments.

Speakers: Jonathan Ellis