Berkeley DB JE and ehcache are two simple libraries for persistent storage of serialized Java objects. How good are they at handling large data sets?
Here are some benchmark results:
store 1m objects | read all objects best/worst | disk space | memory | |
---|---|---|---|---|
ehcache 1.2beta2 | 6:24 min | 12:33/13:45 min | 2.36 GB | < 1024 MB |
bdb-je 2.0.90 | 3:32 min | 5:46/7:46 min | 2.02 GB | < 128 MB |
…with custom serializer | 2:17 min | 5:07/7:00 min | 2.01 GB | < 128 MB |
Berkeley DB seems to scale better, especially in terms of memory use. It is also more flexible, but requires a bit more code to be written, and is distributed under a more restrictive license (GPL). Tests were run on a Linux machine with BEA JRockit 1.5.0_03.