How to Monitor Zookeeper


By David Mytton,
CEO & Founder of Server Density.

Published on the 12th January, 2016.

Update: We hosted a live Hangout on Air with some members of Server Density engineering and ops teams. Amongst other topics, we discussed how we use Zookeeper here @ Server Density. Check out the recorded session at the bottom of this blog post.

Apache Zookeeper works at the zoo—not your usual zoo, but similar—and does what you’d expect. You know, keep your service-oriented architecture nice and clean.

It provides a distributed hierarchical file system that helps with the difficulties of running services across different machines (discovery, registration, configuration, locking, leader election, queueing, etc). All data is replicated across all nodes, and the leader performs atomic broadcasts to the other servers, which guarantees strong ordering of change propagation.

Zookeeper nodes (ZNodes) are like files in a hierarchical file system (eg. /foo/foo1, /bar/taz, /dev/null/full). They can store arbitrary data, and they notify watchers of any event pertaining to them.

Zookeeper can be quite a tricky service to manage. From a client programming point of view there are plenty of low-level and error-handling pitfalls. That explains the popularity of higher-level API wrappers, like the one created by the Netflix team (Curator).

With that in mind, here is our very own checklist of best practices, including key Zookeeper metrics and alerts we monitor with Server Density.

Monitoring Zookeeper: Metrics and Alerts

As per previous articles, our general rule of thumb is “collect all possible/reasonable metrics that can help when troubleshooting, alert only on those that require an action from you”. Well, the Zookeeper list that satisfies these criteria is not that long.

Zookeeper process is running

Metric | Comments | Suggested Alert
Zookeeper process | Is the right binary daemon process running? | When the process list does not contain a match for the regexp /usr/bin/java.*org\.apache\.zookeeper.

You can also use the following script to check if the server is running:

$INSTALL_PREFIX/zk-server-3/bin/zkServer.sh status

Or, if you run Zookeeper via supervisord (recommended), you can alert on the supervisord resource instead.
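
The process check can be sketched in a few lines of Python. This is a minimal sketch, not the check Server Density runs: the pattern below (a java command line that mentions an org.apache.zookeeper class) and the function names are assumptions made for illustration, and the `ps` call assumes a Unix-like host.

```python
import re
import subprocess

# Assumed pattern: a java command line that mentions an org.apache.zookeeper class.
ZK_PROCESS_RE = re.compile(r"java.*org\.apache\.zookeeper")


def zookeeper_running(command_lines):
    """Return True if any process command line looks like a Zookeeper JVM."""
    return any(ZK_PROCESS_RE.search(line) for line in command_lines)


def list_command_lines():
    """Collect the command lines of all processes using POSIX ps (Unix only)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "args="],  # "args=" prints the command line without a header
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    if not zookeeper_running(list_command_lines()):
        print("ALERT: no Zookeeper process found")
```

Keeping the matching logic separate from the `ps` call makes it easy to test against sample command lines before wiring it into an alert.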

System Metrics

Metric | Comments | Suggested Alert
Memory usage | Zookeeper should run entirely in RAM. Keep the JVM heap size smaller than your available RAM to avoid swapping. | None
Swap usage | Watch for swap usage, as it degrades Zookeeper performance and leads to operations timing out (set vm.swappiness = 0). | When used swap is > 128MB.
Network bandwidth | Zookeeper servers can incur high network usage. Keep an eye on this, especially if you notice any performance degradation. Also look out for dropped packet errors. Zookeeper workloads are typically 20% writes, 80% reads. More nodes result in more writes and higher overall traffic. | None
Disk usage | Zookeeper data is usually ephemeral and small. Still, we recommend putting dataLogDir on a dedicated partition and watching its disk usage. Use the purge task to clean up dataDir and dataLogDir. | When disk usage is > 85%.

Zookeeper writes its snapshots asynchronously, but it syncs the transaction log to disk before acknowledging an update, so a slow or contended disk shows up directly as request latency. Keep an eye on disk IO, especially if your server is shared with other services, say Kafka.
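
As a rough illustration of the swap and disk thresholds suggested above, here is a minimal Python sketch. It assumes a Linux host where swap figures can be read from /proc/meminfo; the function names and the example path are hypothetical.

```python
import shutil

SWAP_ALERT_BYTES = 128 * 1024 * 1024  # "used swap > 128MB"
DISK_ALERT_PERCENT = 85.0             # "disk > 85% usage"


def swap_used_bytes(meminfo_text):
    """Compute used swap from the text of /proc/meminfo (values are in kB)."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return (fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)) * 1024


def disk_used_percent(path):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def system_alerts(meminfo_text, disk_percent):
    """Return the names of the system-level alerts that should fire."""
    alerts = []
    if swap_used_bytes(meminfo_text) > SWAP_ALERT_BYTES:
        alerts.append("swap")
    if disk_percent > DISK_ALERT_PERCENT:
        alerts.append("disk")
    return alerts


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        meminfo = handle.read()
    # "/var/lib/zookeeper" is a placeholder; point this at your dataLogDir partition.
    print(system_alerts(meminfo, disk_used_percent("/var/lib/zookeeper")))
```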

Here is how Server Density graphs disk usage and memory usage. Note the up and down curves created by the purge task:

[Image: Server Density graphs of Zookeeper disk usage and memory usage]

And here are some Zookeeper alerts configured in Server Density:

[Image: Zookeeper alerts configured in Server Density]

Zookeeper Metrics

Metric | Comments | Suggested Alert
Request Avg/Max Latency | Amount of time it takes for the server to respond to a client request (since the server was started). | When latency > 10 (ticks).
Outstanding Requests | Number of queued requests in the server. This goes up when the server receives more requests than it can process. | When count > 10.
Received | Number of client requests (typically operations) received. | None
Sent | Number of client packets sent (responses and notifications). | None
File Descriptors | Number of open file descriptors relative to the limit. | When FD percentage > 85%.
Mode | Serving mode: leader or follower, or standalone if not running in an ensemble. | None
Pending syncs | (Only exposed by the leader) Number of pending syncs from the followers. | When pending > 10.
Followers | (Only exposed by the leader) Number of followers within the ensemble. You can deduce the number of servers from the MBean Quorum Size. | When followers != (number of ensemble servers - 1).
Node count | Number of znodes in the Zookeeper namespace. | None
Watch count | Number of watches set on Zookeeper nodes. | None
Heap Memory Usage | Memory allocated dynamically by the Java process, Zookeeper in this case. | None
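
To make the suggested alerts concrete, here is a minimal Python sketch that applies them to a dictionary of metrics. It assumes the metrics have already been collected and are keyed by the mntr field names shown later in this post; the function name, the alert labels and the `ensemble_size` parameter are hypothetical, and only the average latency is checked.

```python
def zookeeper_alerts(metrics, ensemble_size):
    """Return the names of the suggested alerts that fire for one server."""
    alerts = []

    if metrics.get("zk_avg_latency", 0) > 10:
        alerts.append("latency")
    if metrics.get("zk_outstanding_requests", 0) > 10:
        alerts.append("outstanding_requests")

    # File descriptor counts are only reported on Unix platforms.
    open_fds = metrics.get("zk_open_file_descriptor_count")
    max_fds = metrics.get("zk_max_file_descriptor_count")
    if open_fds is not None and max_fds and 100.0 * open_fds / max_fds > 85:
        alerts.append("file_descriptors")

    # Pending syncs and follower counts are only exposed by the leader.
    if metrics.get("zk_server_state") == "leader":
        if metrics.get("zk_pending_syncs", 0) > 10:
            alerts.append("pending_syncs")
        if metrics.get("zk_followers") != ensemble_size - 1:
            alerts.append("followers")

    return alerts
```

For example, a leader in a five-server ensemble that reports only three followers would return a list containing "followers".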

Here is a Zookeeper monitoring graph including Latency average and Outstanding requests:

[Image: Zookeeper monitoring graph of average latency and outstanding requests]

Zookeeper Monitoring Tools

The simplest way to monitor Zookeeper and collect these metrics is by using the commands known as “4 letter words” within the ZK community. You can run these using telnet or netcat directly:

$ echo ruok | nc 127.0.0.1 5111
imok
 
$ echo mntr | nc localhost 5111
zk_version  3.4.0
zk_avg_latency  0
zk_max_latency  0
zk_min_latency  0
zk_packets_received 70
zk_packets_sent 69
zk_outstanding_requests 0
zk_server_state leader
zk_znode_count   4
zk_watch_count  0
zk_ephemerals_count 0
zk_approximate_data_size    27
zk_followers    4                   - only exposed by the Leader
zk_synced_followers 4               - only exposed by the Leader
zk_pending_syncs    0               - only exposed by the Leader
zk_open_file_descriptor_count 23    - only available on Unix platforms
zk_max_file_descriptor_count 1024   - only available on Unix platforms
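
If you would rather collect these values from a script than from netcat, the same exchange can be sketched in Python with a plain TCP socket. This is a minimal sketch under a few assumptions: the function names are hypothetical, the server is expected to close the connection after replying, and the port is the one used in the examples above (adjust it to your clientPort).

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter command (e.g. "ruok", "mntr") and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server has closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Turn mntr output into a dict, converting numeric values to int."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) != 2:
            continue
        key, value = parts[0], parts[1].strip()
        try:
            metrics[key] = int(value)
        except ValueError:
            metrics[key] = value  # e.g. zk_version, zk_server_state
    return metrics


if __name__ == "__main__":
    if four_letter_word("127.0.0.1", 5111, "ruok") != "imok":
        print("ALERT: Zookeeper did not answer imok")
    print(parse_mntr(four_letter_word("127.0.0.1", 5111, "mntr")))
```

Values that are not plain integers, such as the version string or the server state, are kept as strings so that nothing is lost in parsing.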

We’ve looked at mytop for MySQL, and memcache-top for Memcached. Well, Zookeeper has one too, zktop:

$ ./zktop.py --servers "localhost:2181,localhost:2182,localhost:2183"
Ensemble -- nodecount:10 zxid:0x1300000001 sessions:4
SERVER           PORT M      OUTST    RECVD     SENT CONNS MINLAT AVGLAT MAXLAT
localhost        2181 F          0       93       92     2      2      7     13
localhost        2182 F          0       37       36     1      0      0      0
localhost        2183 L          0       36       35     1      0      0      0

CLIENT           PORT I   QUEUE RECVD  SENT
127.0.0.1       34705 1       0    56    56
127.0.0.1       35943 1       0     1     0
127.0.0.1       33999 1       0     1     0
127.0.0.1       37988 1       0     1     0

If you are after more detailed metrics, you can access those through JMX. You could also take the DIY road and go for JMXTrans and Graphite, or use Nagios/Cacti/Ganglia with check_zookeeper.py. Alternatively, you can save time (and preserve your sanity) by choosing a hosted service like Server Density (that’s us!).

If you want to test the quality and performance of your Zookeeper ensemble, then zk-smoketest with zk-smoketest.py and zk-latencies.py are great tools to check out.

Zookeeper Management Tools

There are not many management options out there. The folks at Netflix have released Exhibitor, a tool that provides some basic monitoring, log cleanup (for old versions), backup/restore, ensemble configuration and node visualization. There is also zookeeper_dashboard, but it hasn’t been updated in years.

[Image: How-to-monitor-Zookeeper5]

Further reading

Did this article pique your interest in Zookeeper? Nice, keep reading. We found Scott Leberknight’s Zookeeper series of blog posts to be worthwhile. We also like these presentations:

So what about you? Do you have a checklist or any best practices for monitoring Zookeeper? What systems do you have in place and how do you monitor them? Any interesting reads to suggest?

Tech chat: processing billions of events a day with Kafka, Zookeeper and Storm
