How to purge zookeeper logs with PurgeTxnLog?

Zookeeper's rapidly pooping its internal binary files all over our production environment. According to: http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html and http://dougchang333.blogspot.com/2013/02/zookeeper-cleaning-logs-snapshots.html this is expected behavior and you must call org.apache.zookeeper.server.PurgeTxnLog regularly to rotate its poop.

So:

% ls -l1rt /tmp/zookeeper/version-2/
total 314432
-rw-r--r-- 1 root root 67108880 Jun 26 18:00 log.1
-rw-r--r-- 1 root root   947092 Jun 26 18:00 snapshot.e99b
-rw-r--r-- 1 root root 67108880 Jun 27 05:00 log.e99d
-rw-r--r-- 1 root root  1620918 Jun 27 05:00 snapshot.1e266
... many more

% sudo java -cp zookeeper-3.4.6.jar:lib/jline-0.9.94.jar:lib/log4j-1.2.16.jar:lib/netty-3.7.0.Final.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:conf \
    org.apache.zookeeper.server.PurgeTxnLog \
    /tmp/zookeeper/version-2 /tmp/zookeeper/version-2 -n 3

but I get:

% ls -l1rt /tmp/zookeeper/version-2/
... all the existing logs plus a new directory
/tmp/zookeeper/version-2/version-2 

Am I doing something wrong?

Version: zookeeper-3.4.6



Solution 1:[1]

ZooKeeper has had an autopurge feature since 3.4.0. Take a look at https://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html

It describes two settings, autopurge.snapRetainCount and autopurge.purgeInterval:


autopurge.snapRetainCount

New in 3.4.0: When enabled, ZooKeeper auto purge feature retains the autopurge.snapRetainCount most recent snapshots and the corresponding transaction logs in the dataDir and dataLogDir respectively and deletes the rest. Defaults to 3. Minimum value is 3.


autopurge.purgeInterval

New in 3.4.0: The time interval in hours for which the purge task has to be triggered. Set to a positive integer (1 and above) to enable the auto purging. Defaults to 0.
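
For illustration, here is a minimal sketch of switching these two settings on from a script. The function name enable_autopurge, the conf/zoo.cfg path, and the 24-hour interval are assumptions for the example, not anything the ZooKeeper docs prescribe. It rewrites any existing autopurge.* lines so it can be run more than once.

```shell
# Hypothetical helper: set the two autopurge properties in a ZooKeeper config file.
# $1 = config file, $2 = snapshots to retain (default 3), $3 = interval in hours (default 24)
enable_autopurge() {
    cfg=$1
    retain=${2:-3}
    interval=${3:-24}

    # The admin guide states the minimum value for snapRetainCount is 3.
    if [ "$retain" -lt 3 ]; then
        echo "snapRetainCount must be at least 3" >&2
        return 1
    fi

    tmp=$(mktemp) || return 1
    # Copy everything except existing autopurge.* lines (grep exits 1 if nothing is left).
    grep -v '^autopurge\.' "$cfg" > "$tmp" || true
    printf 'autopurge.snapRetainCount=%s\n' "$retain" >> "$tmp"
    printf 'autopurge.purgeInterval=%s\n' "$interval" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$cfg"
    rm -f "$tmp"
}
```

Usage would be something like `enable_autopurge conf/zoo.cfg 3 24`. The server most likely has to be restarted before the new values are picked up, since the config file is read at startup.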

Solution 2:[2]

Since I'm not hearing a fix via Zookeeper, this was an easy workaround:

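COUNT=6                             # number of newest files to keep
DATADIR=/tmp/zookeeper/version-2    # no trailing slash, to avoid "//" in the paths
# List files oldest first, drop the newest $COUNT from the list, delete the rest.
ls -1drt "${DATADIR}"/* | head --lines=-"${COUNT}" | xargs sudo rm -f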

Run it once a day from a cron job or Jenkins to keep ZooKeeper from filling up the disk.
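
Since this deletes by modification time only, it may be worth checking what would be removed before piping anything into rm. A rough dry-run sketch follows; list_purgeable is a made-up name, and it relies on GNU head accepting a negative line count, the same as the one-liner above.

```shell
# Hypothetical dry-run helper: print the files that would be deleted, oldest first.
# $1 = directory holding the log.* and snapshot.* files, $2 = number of newest files to keep
list_purgeable() {
    dir=${1%/}     # strip one trailing slash so paths do not contain "//"
    keep=$2
    # ls -rt sorts oldest first; head -n -N drops the last N lines (the newest files).
    ls -1drt "$dir"/* 2>/dev/null | head -n -"$keep"
}
```

Run `list_purgeable /tmp/zookeeper/version-2 6` to inspect the list, then pipe it into `xargs sudo rm -f` once it looks right. Note that unlike PurgeTxnLog, this makes no attempt to keep snapshots and their matching transaction logs together.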

Solution 3:[3]

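PurgeTxnLog expects the directories configured in ZooKeeper's .properties file, not their version-2 subdirectories. It appends version-2 to each path itself, which is why passing /tmp/zookeeper/version-2 in the question created /tmp/zookeeper/version-2/version-2 and purged nothing. Pass the value configured as dataDir for both the dataLogDir and snapDir arguments (if you configure a separate dataLogDir for transaction logs, use that value for the first argument).

If your configuration looks like this:

dataDir=/data/zookeeper

then call PurgeTxnLog (version 3.5.9) as follows to keep the last 10 logs/snapshots: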

java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.17.jar:conf org.apache.zookeeper.server.PurgeTxnLog /data/zookeeper /data/zookeeper -n 10
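
To avoid hard-coding the directory, the arguments can be read from the config file instead. This is only a sketch: zk_prop and purge_args are hypothetical helper names, and it assumes plain key=value lines in the file. The argument order (dataLogDir, then snapDir, then -n count) follows the PurgeTxnLog usage shown above.

```shell
# Hypothetical helper: print the value of a key from a ZooKeeper properties file.
# $1 = config file, $2 = key; if the key appears more than once, the last one wins.
zk_prop() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Hypothetical helper: print the PurgeTxnLog arguments for a given config file.
# $1 = config file, $2 = number of snapshots/logs to keep
purge_args() {
    cfg=$1
    count=$2
    snap_dir=$(zk_prop "$cfg" dataDir)
    log_dir=$(zk_prop "$cfg" dataLogDir)
    if [ -z "$snap_dir" ]; then
        echo "dataDir is not set in $cfg" >&2
        return 1
    fi
    # Transaction logs live in dataDir unless dataLogDir is configured separately.
    [ -n "$log_dir" ] || log_dir=$snap_dir
    printf '%s %s -n %s\n' "$log_dir" "$snap_dir" "$count"
}
```

With the same classpath as above in a variable (here called CP, also an assumption), the call becomes `java -cp "$CP" org.apache.zookeeper.server.PurgeTxnLog $(purge_args conf/zoo.cfg 10)`. The unquoted substitution is intentional so the arguments are split on spaces, which means it will not work for paths that contain spaces.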

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 kmort
Solution 2 user48956
Solution 3 Jehof