Quiz answer: memory leak in Java

I posted a little quiz yesterday, and here is the answer.

The short answer is that InputStream needs to be closed. It’s easy to see why if it’s FileInputStream because you know the file handle needs to be released. But in this case, it’s just ByteArrayInputStream. We can just let GC recycle all the memory, right?

Turns out GZIPInputStream (or more precisely the Inflater that it uses internally) uses native zlib code to perform decompression, so it’s actually occupying more memory (about 32K-64K per stream, I believe) on the native side, while its Java heap footprint is small. So if you allocate enough of those, you can end up eating a lot of native memory while the Java heap is still mostly idle. Even though those GZIPInputStreams are no longer referenced, they just don’t create enough heap pressure to cause the GC to run, so the cleanup that would eventually release the native memory is never triggered.

And eventually you eat up all the native memory, zlib’s malloc fails, and you get an OutOfMemoryError (or your system starts to swap like crazy and effectively becomes unusable first).
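To make the short answer concrete, here is a minimal sketch of closing the stream deterministically, assuming Java 7 or later for try-with-resources. The class and method names (GzipCloseExample, gzip, gunzip) are made up for illustration and are not the code from the quiz.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class; not the original quiz code.
public class GzipCloseExample {

    /** Compresses bytes in memory so the example has something to read back. */
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Closing the output stream finishes the gzip trailer and
        // releases the native zlib state used for compression.
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(raw);
        }
        return buf.toByteArray();
    }

    /** Decompresses fully, then closes the stream even if read() throws. */
    public static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        // try-with-resources calls close() on every exit path, so the native
        // zlib state is released right away instead of waiting for the GC.
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                result.write(chunk, 0, n);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello, zlib".getBytes("UTF-8");
        byte[] roundTripped = gunzip(gzip(original));
        System.out.println(new String(roundTripped, "UTF-8"));
    }
}
```

On older JVMs without try-with-resources, the same effect comes from calling close() in a finally block. Either way, the point is that the wrapped ByteArrayInputStream does not need closing, but the GZIPInputStream around it does.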

The other interesting thing to note is that -XX:+HeapDumpOnOutOfMemoryError doesn’t do anything in this case. I read the JVM source code and learned that a heap dump only happens when the OOME is raised during one of 3 or 4 specific memory allocation operations, like allocating a Java object or array, GC saturation, and a few other things. There are many other code paths in the JVM that report OOME, like this zlib malloc failure, that don’t trigger a heap dump. There’s no question HeapDumpOnOutOfMemoryError is useful, but just beware that in some cases the dump doesn’t get created.

I knew that GZIPInputStream uses native code internally, but I didn’t think about it too much when I was putting the original code together. Humans can’t think through the entire transitive object graph and its implications.

The other lesson is that now I know why ps sometimes reports such a big memory footprint for a JVM while jmap reports only modest usage. The difference is native memory outside the Java heap, although unfortunately I don’t think there’s any easy way to check what’s eating the native memory.

My colleague and friend Paul Sandoz pointed out that if GZIPInputStream were nice enough to free its native resources at EOF, it would have saved a lot of hassle, and I think he’s right. One still needs to consider the case where an IOException causes the processing to abort before hitting EOF, but it would have helped, because those abnormal cases would be rare. I mean, there’s no harm in doing so, and anything that makes the library more robust in the face of abuse is a good thing, especially when the failure mode is this cryptic.
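The release-at-EOF idea can be approximated outside the JDK with a small wrapper. This is only a sketch of that idea: CloseAtEofInputStream is a hypothetical class of my own, not something the JDK provides, and it does nothing for the abort-before-EOF case mentioned above, so callers should still close streams explicitly.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: closes the underlying stream as soon as EOF is seen.
public class CloseAtEofInputStream extends FilterInputStream {
    private boolean closed;

    public CloseAtEofInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (closed) return -1;          // keep reporting EOF after auto-close
        int b = super.read();
        if (b == -1) close();           // EOF reached: release resources now
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (closed) return -1;
        int n = super.read(buf, off, len);
        if (n == -1) close();
        return n;
    }

    @Override
    public int available() throws IOException {
        // Avoid "Stream closed" errors from the delegate after auto-close.
        return closed ? 0 : super.available();
    }

    @Override
    public void close() throws IOException {
        if (!closed) {                  // idempotent: close the delegate once
            closed = true;
            super.close();
        }
    }

    public boolean isClosed() {
        return closed;
    }
}
```

Usage would look like new CloseAtEofInputStream(new GZIPInputStream(source)), where source is whatever stream holds the compressed bytes. A reader that drains the stream to the end then releases the native memory even if it forgets to call close().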

6 Comments

  1. kohsuke says:

    And I should also credit a number of commenters who accurately found the root cause! Once again it proves the rest of the world is smarter than me.

  2. Cowtowncoder says:

    Interesting finding (and I assume based on finding this via running out of memory).

    One thing that I have found is that while gzip is nice for cases where you must optimize for compression rate (lots of CPU, not tons of bandwidth), basic simple LZ codecs are very nice for cases where you want to reduce CPU usage.
    Specifically, Snappy and LZF Java codecs are wicked fast (esp. compression is 3-5x faster than Deflate; but decompression 2-3x as well). And since these are pure Java, no accidental memory retention occurs.

    I have worked on LZF codec (at github, “Ning compress”), and Dain (ex-coworker from Ning) on Snappy pure java (“iq80/snappy”) — there’s also a JNI wrapper around native Snappy (google code, “snappy-java”), which also works well, but speedwise there’s not much benefit from using it over java-only one.

    Oh and I also wrote a benchmark that tests speeds (https://github.com/ning/jvm-compressor-benchmark), which shows Snappy-java currently having the top speed.
    (but wait for LZF 0.9 which will get it closer again…) 🙂

    ps. I tried adding links, but spam filter considers those bad :-/

  3. kohsuke says:

    Thanks for the pointer to other compression algorithms. I’m curious about the memory usage during compression/decompression.

  4. JRuby 1.7 (unreleased) is unaffected, because we have diverged away from Java’s broken wrapper around zlib toward ymnk’s jzlib, a Java port of zlib that suffers from none of these issues. JRuby leads the way? (with ymnk’s help, of course 🙂

  5. MrBSD says:


    Interesting. Would it be interesting to add the Java version of LZ4 to this benchmark ? It is said to be even faster than Snappy. The Java version is also hosted at GitHub :

  6. Cowtowncoder says:


    LZF uses block sizes of up to 64k, and comp/decomp is block by block; memory usage is limited by block size. This is different from some codecs (bzip2) that require larger block sizes.

    @MrBSD: I have indeed added lz4 since then; although just the JNI-wrapped one (as java version had lots of issues). Seems plenty fast, even faster than snappy, lzf, although same order of magnitude (i.e. they are all rather fast)
