How to Monitor Zookeeper
CEO & Founder of Server Density.
Published on the 12th January, 2016.
Update: We hosted a live Hangout on Air with some members of Server Density engineering and ops teams. Amongst other topics, we discussed how we use Zookeeper here @ Server Density. Check out the recorded session at the bottom of this blog post.
Apache Zookeeper works at the zoo—not your usual zoo, but similar—and does what you’d expect. You know, keep your service-oriented architecture nice and clean.
It provides a distributed hierarchical file system that helps with the difficulties of coordinating services that run on different machines (discovery, registration, configuration, locking, leader election, queueing, etc.). All data is replicated across all nodes, and the leader performs atomic broadcasts to the other servers, which guarantees strong ordering when changes propagate.
Zookeeper nodes (ZNodes) are like files in a hierarchical file system (e.g. /foo/foo1, /bar/taz, /dev/null/full). They can store arbitrary data and notify watchers of any event that affects them.
Zookeeper can be quite a tricky service to manage. From a client programming point of view there are plenty of low-level and error-handling pitfalls. That explains the popularity of higher-level API wrappers, like Curator, created by the Netflix team.
With that in mind, here is our very own checklist of best practices, including key Zookeeper metrics and alerts we monitor with Server Density.
Monitoring Zookeeper: Metrics and Alerts
As per previous articles, our general rule of thumb is “collect all possible/reasonable metrics that can help when troubleshooting, alert only on those that require an action from you”. Well, the Zookeeper list that satisfies these criteria is not that long.
Zookeeper process is running
| Metric | Comments | Suggested Alert |
|---|---|---|
| Zookeeper process | Is the right binary daemon process running? | When the process list does not contain a match for the regexp /usr/bin/java.*org\.apache\.zookeeper |
You can also use the following script to check if the server is running:
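A minimal sketch of such a check in Python, assuming the server accepts four-letter-word commands on its client port (2181 is used as the default here; substitute your own). The names `four_letter_word` and `zookeeper_is_running` are illustrative and not part of any Zookeeper tooling. It sends `ruok` and treats an `imok` reply as healthy; anything else, including a refused connection or a timeout, counts as down.

```python
import socket


def four_letter_word(command, host="127.0.0.1", port=2181, timeout=5.0):
    """Send a four-letter-word command to a Zookeeper server and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection once it has replied
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def zookeeper_is_running(host="127.0.0.1", port=2181, send=four_letter_word):
    """Return True when the server answers `ruok` with `imok`.

    `send` is injectable so the check can be exercised without a live server.
    """
    try:
        return send("ruok", host, port).strip() == "imok"
    except OSError:
        # Connection refused, timeout, unreachable host, and so on.
        return False
```

Calling `zookeeper_is_running("127.0.0.1", 2181)` from a cron job or monitoring plugin and mapping the result to an exit code is one way to wire this into an alert. On newer Zookeeper releases, four-letter words may first need to be enabled through the `4lw.commands.whitelist` setting.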
Or, if you run Zookeeper via supervisord (recommended), you can alert on the supervisord resource instead.
| Metric | Comments | Suggested Alert |
|---|---|---|
| Memory usage | Zookeeper should run entirely in RAM. To avoid swapping, the JVM heap size shouldn’t be bigger than your available RAM. | None |
| Swap usage | Watch for swap usage, as it will degrade performance on Zookeeper and lead to operations timing out (set vm.swappiness = 0). | When used swap is > 128MB. |
| Network bandwidth | Zookeeper servers can incur high network usage. Keep an eye on this, especially if you notice any performance degradation. Also look out for dropped packet errors. A typical Zookeeper workload is roughly 20% writes and 80% reads. More nodes result in more writes and higher overall traffic. | None |
| Disk usage | Zookeeper data is usually ephemeral and small. Still, we recommend putting dataLogDir on a dedicated partition and watching its disk usage. Use the purge task to clean up dataDir and dataLogDir. | When disk usage is > 85%. |
Zookeeper disk writes are asynchronous which means they shouldn’t have high IO requirements. Still, keep an eye on this, especially if your server is shared with other services, say Kafka.
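The swap and disk thresholds from the table can be sketched as a small Python check. This is only an illustration under a few assumptions: a Linux host where used swap can be derived from the `SwapTotal` and `SwapFree` fields of /proc/meminfo, and helper names (`swap_used_bytes`, `disk_used_percent`, `system_alerts`) that are made up for this example.

```python
import shutil

SWAP_ALERT_BYTES = 128 * 1024 * 1024  # alert when used swap is > 128MB
DISK_ALERT_PERCENT = 85.0             # alert when disk usage is > 85%


def swap_used_bytes(meminfo_text):
    """Compute used swap from the text of /proc/meminfo (values are in kB)."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return (fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)) * 1024


def disk_used_percent(path):
    """Percentage of the partition holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total if usage.total else 0.0


def system_alerts(swap_bytes, disk_percent):
    """Return the names of the thresholds that are currently exceeded."""
    alerts = []
    if swap_bytes > SWAP_ALERT_BYTES:
        alerts.append("swap")
    if disk_percent > DISK_ALERT_PERCENT:
        alerts.append("disk")
    return alerts
```

In practice you would feed it the contents of /proc/meminfo and the path of your dataLogDir partition, for example `system_alerts(swap_used_bytes(open("/proc/meminfo").read()), disk_used_percent("/var/lib/zookeeper"))`, where the path is a placeholder for wherever your data lives.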
Here is how Server Density graphs disk usage and memory usage. Note the up and down curves created by the purge task:
And here are some Zookeeper alerts configured in Server Density:
| Metric | Comments | Suggested Alert |
|---|---|---|
| Request Avg/Max Latency | Amount of time it takes for the server to respond to a client request (since the server was started). | When latency > 10 (Ticks). |
| Outstanding Requests | Number of queued requests in the server. This goes up when the server receives more requests than it can process. | When count > 10. |
| Received | Number of client requests (typically operations) received. | None |
| Sent | Number of client packets sent (responses and notifications). | None |
| File Descriptors | Number of file descriptors in use, relative to the limit. | When FD percentage > 85%. |
| Mode | Serving mode: leader or follower, or standalone if not running in an ensemble. | None |
| Pending syncs | (Only exposed by the leader) number of pending syncs from the followers. | When pending > 10. |
| Followers | (Only exposed by the leader) number of followers within the ensemble. You can deduce the number of servers from the MBean Quorum Size. | When followers != (number of ensemble servers - 1). |
| Node count | Number of znodes in the Zookeeper namespace. | None |
| Watch count | Number of watches set on Zookeeper nodes. | None |
| Heap Memory Usage | Memory allocated dynamically by the Java process, Zookeeper in this case. | None |
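To make the suggested alerts concrete, here is a rough Python sketch that applies them to a dictionary of metrics for one server. The key names follow the `mntr` output (`zk_avg_latency`, `zk_outstanding_requests` and so on); the function name `zookeeper_alerts`, the choice of average rather than maximum latency, and the alert labels are assumptions made for the example.

```python
def zookeeper_alerts(metrics, ensemble_size):
    """Return the names of the alerts triggered for one server.

    `metrics` maps mntr-style keys to values. Keys the server does not
    expose (for example leader-only ones on a follower) are skipped.
    """
    alerts = []
    if metrics.get("zk_avg_latency", 0) > 10:
        alerts.append("latency")
    if metrics.get("zk_outstanding_requests", 0) > 10:
        alerts.append("outstanding_requests")

    open_fds = metrics.get("zk_open_file_descriptor_count")
    max_fds = metrics.get("zk_max_file_descriptor_count")
    if open_fds is not None and max_fds:
        if 100.0 * open_fds / max_fds > 85:
            alerts.append("file_descriptors")

    # Pending syncs and follower counts are only reported by the leader.
    if metrics.get("zk_server_state") == "leader":
        if metrics.get("zk_pending_syncs", 0) > 10:
            alerts.append("pending_syncs")
        followers = metrics.get("zk_followers")
        if followers is not None and followers != ensemble_size - 1:
            alerts.append("followers")
    return alerts
```

The thresholds are hard-coded to mirror the table; in a real setup you would probably make them configurable per environment.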
Here is a Zookeeper monitoring graph including Latency average and Outstanding requests:
Zookeeper Monitoring Tools
The simplest way to monitor Zookeeper and collect these metrics is by using the commands known as “4 letter words” within the ZK community. You can run these using telnet or netcat directly:
    $ echo ruok | nc 127.0.0.1 5111
    imok
    $ echo mntr | nc localhost 5111
    zk_version 3.4.0
    zk_avg_latency 0
    zk_max_latency 0
    zk_min_latency 0
    zk_packets_received 70
    zk_packets_sent 69
    zk_outstanding_requests 0
    zk_server_state leader
    zk_znode_count 4
    zk_watch_count 0
    zk_ephemerals_count 0
    zk_approximate_data_size 27
    zk_followers 4 - only exposed by the Leader
    zk_synced_followers 4 - only exposed by the Leader
    zk_pending_syncs 0 - only exposed by the Leader
    zk_open_file_descriptor_count 23 - only available on Unix platforms
    zk_max_file_descriptor_count 1024 - only available on Unix platforms
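If you want to feed these numbers into your own tooling, the `mntr` reply is easy to parse. The following is a small Python sketch, assuming each line of the reply is a key followed by whitespace and a value; `parse_mntr` is a name chosen for this example.

```python
def parse_mntr(output):
    """Turn the text returned by `mntr` into a dict, converting integers."""
    metrics = {}
    for line in output.splitlines():
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts[0], parts[1].strip()
        try:
            metrics[key] = int(value)
        except ValueError:
            metrics[key] = value  # e.g. zk_version, zk_server_state
    return metrics
```

The input can come from `echo mntr | nc <host> <port>` or from a socket opened in Python; the parser itself does not care how the text was obtained.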
    $ ./zktop.py --servers "localhost:2181,localhost:2182,localhost:2183"
    Ensemble -- nodecount:10 zxid:0x1300000001 sessions:4

    SERVER     PORT   M  OUTST  RECVD  SENT  CONNS  MINLAT  AVGLAT  MAXLAT
    localhost  2181   F  0      93     92    2      2       7       13
    localhost  2182   F  0      37     36    1      0       0       0
    localhost  2183   L  0      36     35    1      0       0       0

    CLIENT     PORT   I  QUEUE  RECVD  SENT
    127.0.0.1  34705  1  0      56     56
    127.0.0.1  35943  1  0      1      0
    127.0.0.1  33999  1  0      1      0
    127.0.0.1  37988  1  0      1      0
If you are after more detailed metrics, you can access those through JMX. You could also take the DIY road and go for JMXTrans and Graphite, or use Nagios/Cacti/Ganglia with check_zookeeper.py. Alternatively, you can save time (and preserve your sanity) by choosing a hosted service like Server Density (that’s us!).
If you want to test the quality and performance of your Zookeeper ensemble, then zk-smoketest with zk-smoketest.py and zk-latencies.py are great tools to check out.
Zookeeper Management tools
There are not too many management options out there. The folks at Netflix have released Exhibitor, a tool that provides some basic monitoring, log cleanup (for old versions), backup/restore, ensemble configuration and node visualization. There is also zookeeper_dashboard, but it hasn’t been updated in years.
Did this article pique your interest in Zookeeper? Nice, keep reading. We found Scott Leberknight’s Zookeeper series of blog posts to be worthwhile. We also like these presentations:
- Building an Impenetrable Zookeeper (includes video).
- Apache Zookeeper is a long presentation covering some required concepts of distributed systems.
- Zookeeper in the Wild goes straight to the point on operating a Zookeeper ensemble.
So what about you? Do you have a checklist or any best practices for monitoring Zookeeper? What systems do you have in place and how do you monitor them? Any interesting reads to suggest?