Quiz time: memory leak in Java

Today I had an interesting debugging exercise, and I felt like I learned a new lesson that’s worth sharing with the rest of the world.

I had the following code, which takes a small-ish byte array and deserializes it into an object (let’s say someNotTooBigData() returns something like new byte[]{1,5,4, ... some data... }). Seems innocent enough, no?

Object foo() throws IOException, ClassNotFoundException {
	byte[] buf = someNotTooBigData();
	return new ObjectInputStream(new GZIPInputStream(
	    new ByteArrayInputStream(buf))).readObject();
}

But when this is executed frequently enough, as in while(true) { foo(); }, it eventually throws an OutOfMemoryError. Can you tell why? I’ll post the answer tomorrow.

10 thoughts on “Quiz time: memory leak in Java”

  1. I’ve run into a problem before with GZIPInputStream causing native memory leaks if not closed correctly. Very nasty!

    GZIPInputStream internally uses Inflater, which uses JNI and thus allocates native memory. If you don’t call close() on the GZIPInputStream, then this native memory will not be freed until the Inflater object is GC’d. And of course this can take a long time, because the Inflater object itself (the on-heap part) is quite small.

  2. I don’t get an OutOfMemoryError, although it does use about 6GB of memory very quickly.

    I’m going with the GZIPInputStream never being closed, resulting in the finalizer on Inflater being called at the garbage collector’s leisure, and native zlib buffers sticking around for too long.

    The pure Java JZlib doesn’t have any problems.

  3. The leak is not specific to this code. Any objects returned from a void function can never be garbage collected. Hence the OutOfMemoryError.

  4. @Matthew Wilson
    I corrected a compilation error where the method signature was returning void where it should have been Object. (Or I could have removed the return statement; it doesn’t matter either way.) I assume you were just being sarcastic/joking when you said that objects returned from a void function cause a memory leak.

  5. I haven’t tested this, but could it be because:

    GZIPInputStream extends InflaterInputStream, which in turn has an Inflater reference.

    Inflater has implemented the finalize() method:
    http://download.oracle.com/javase/6/docs/api/java/util/zip/Inflater.html#finalize()

    Presumably this causes the Finalizer reference queue to keep growing and the number of Finalizer objects to increase, until you hit an OutOfMemoryError.

    The problem is described here in more detail:
    http://www.fasterj.com/articles/finalizer1.shtml

  6. Carey :
    [..]
    I’m going with the GZIPInputStream never being closed, resulting in the finalizer on Inflater being called at the garbage collector’s leisure, and native zlib buffers sticking around for too long.
    [..]

    I just read this comment hinting at the same cause.

    I don’t think calling close() on the GZIPInputStream necessarily fixes this issue, at least not fundamentally. As far as I know, close() methods don’t prevent the Finalizer objects from being created or the finalize() method from being called.

    Calling GZIPInputStream#close() calls Inflater#end(), which has an if check so that the same cleanup code is not executed twice. So calling GZIPInputStream#close() *might* in practice slow down your loop iterations and speed up the finalize() call enough to fix this. But, just as without calling close(), the problem could still occur in theory.
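For readers who want to try the suggestion from the comments above, here is a minimal sketch of foo() with the streams closed explicitly through try-with-resources. It is not the post author's answer, only an illustration of the close() idea. The class name CloseGzipSketch, the body of someNotTooBigData() and the "hello" payload are assumptions made up for the example, since the original post does not show them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CloseGzipSketch {
    // Hypothetical stand-in for the post's someNotTooBigData():
    // a small gzip-compressed, Java-serialized String.
    static byte[] someNotTooBigData() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject("hello");
        }
        return bytes.toByteArray();
    }

    // Same logic as the post's foo(), but both streams are closed when the
    // try block exits, so the Inflater's native memory is released promptly
    // instead of waiting for the garbage collector.
    static Object foo() throws IOException, ClassNotFoundException {
        byte[] buf = someNotTooBigData();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(buf));
             ObjectInputStream in = new ObjectInputStream(gzip)) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Bounded loop instead of while(true), so the sketch terminates.
        for (int i = 0; i < 100_000; i++) {
            foo();
        }
        System.out.println(foo());
    }
}
```

Closing the outermost stream would normally close the wrapped ones as well. Declaring the GZIPInputStream as its own resource just makes sure it is closed even if the ObjectInputStream constructor throws while reading the stream header.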
