Moving to a metric centric model – Zoom, SNMP, statsd, JMX & 70 new plugins

By David Mytton,
CEO & Founder of Server Density.

Published on the 16th November, 2017.

Earlier this year, we completed a migration to a new time series database (OpenTSDB) backed by a new datastore (Google Cloud Bigtable). There were many reasons for this migration but one of the main ones was to support a range of new features we wanted to build on top of it.

Having now completed the full migration of our entire infrastructure (not just the metrics backend) to Google Cloud as of the beginning of November, I’m pleased to announce we’re now rolling out a range of new features to Server Density SaaS monitoring.

A move to a metric centric model

The industry has changed significantly since Server Density started in 2009. In those days, servers (instances, nodes or devices) were the core component of infrastructure and so it made sense for metrics to belong to an instance.

Today, metrics are the main component of monitoring. A metric might still be associated with a server but it could also be associated with an application, a container or a function.

In this release, we’re introducing a transition to this metric centric model, starting with how graphs are configured on dashboards. Graphs are now built series-by-series by choosing a number of filters, one of which can be the name of the device or service. In particular, you can specify string-based match patterns, which make it much easier to plot metrics from a cluster of servers or containers.

[Image: Server Density metric centric model]

We no longer have the concept of “elastic graphs” because all graphs are now capable of dynamically updating as new matches to your filters come online. This means you could have just a single series configuration that can match many metrics.
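
The filter behaviour described above can be sketched in a few lines of Python. This is only an illustration: the glob-style pattern syntax, the `device:metric` identifier format and the `match_series` helper are assumptions made for the example, not Server Density's actual implementation.

```python
from fnmatch import fnmatchcase

# Hypothetical metric identifiers in "<device>:<metric>" form. The real
# naming scheme and pattern syntax in Server Density may differ.
available = [
    "web-01:system.cpu.user",
    "web-02:system.cpu.user",
    "db-01:system.cpu.user",
    "web-01:system.mem.used",
]


def match_series(pattern, metrics):
    """Return every metric identifier that matches the glob pattern."""
    return [m for m in metrics if fnmatchcase(m, pattern)]


# One series configuration matches every web server's CPU metric.
pattern = "web-*:system.cpu.user"
matched = match_series(pattern, available)
print(matched)

# A new server coming online is picked up without changing the pattern.
available.append("web-03:system.cpu.user")
matched_after = match_series(pattern, available)
print(matched_after)
```

The point of the sketch is that the series is defined by the pattern rather than by a fixed list of devices, so the set of plotted metrics can grow or shrink as infrastructure changes.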

Multi-dimensional metrics

Many of our customers make use of our API or the agent custom plugin framework to send us custom metrics. Previously, custom metrics had a top level name and then multiple key/value pairs. Only a single level of metrics was supported.

The move to the new time series database has allowed us to implement multi-dimensional metrics, so custom metrics can now have any number of levels of data. This is useful for embedding contextual information, such as container IDs, into the metric hierarchy, which can then be filtered using the new graphing configuration options.

You will see this change in system metrics and official plugins because we have moved to a dot notation format for all metric names. This makes it easy to determine the hierarchy and filter based on name matches.
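
As a rough sketch of what dot notation means for nested data, the snippet below flattens a multi-level structure into dotted metric names and then filters by name. The metric names, the container ID and the `flatten` helper are all hypothetical and are not taken from the agent or the API.

```python
def flatten(metrics, prefix=""):
    """Flatten a nested dict into {"dotted.name": value} pairs."""
    flat = {}
    for key, value in metrics.items():
        name = "{}.{}".format(prefix, key) if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into the next level of the hierarchy.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical multi-level metric data with a container ID embedded.
nested = {
    "docker": {
        "a1b2c3": {
            "cpu": {"user": 12.5, "system": 3.1},
            "mem": {"rss": 104857600},
        }
    }
}

flat = flatten(nested)
for name in sorted(flat):
    print(name, flat[name])

# Filtering on the dotted name selects one branch of the hierarchy.
cpu_only = {n: v for n, v in flat.items() if n.startswith("docker.a1b2c3.cpu.")}
print(cpu_only)
```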

Plugins now also support multiple metric types, including gauges, counters, histograms, rates and counts, alongside the raw metrics we have always supported. You can find details with example code in our documentation.
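
To illustrate how two of these types relate in general, the sketch below derives a per-second rate from two samples of an ever-increasing counter. It shows the generic idea only. The `rate_per_second` helper and its handling of counter resets are assumptions for the example and do not describe how the agent computes rates.

```python
def rate_per_second(prev, curr):
    """Derive a per-second rate from two (timestamp_seconds, value) samples."""
    (t0, v0), (t1, v1) = prev, curr
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if v1 < v0:
        # The counter went backwards, e.g. after a process restart.
        # This sketch skips the interval rather than guessing a value.
        return None
    return (v1 - v0) / (t1 - t0)


# A counter that grew from 100 to 150 over 10 seconds.
rate = rate_per_second((0, 100), (10, 150))
print(rate)

# A counter reset yields no rate for that interval.
reset = rate_per_second((10, 150), (20, 5))
print(reset)
```

A gauge, by contrast, would be reported as the sampled value itself with no comparison against the previous sample.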

Drag to zoom on graphs

One of the major changes in this release is a refactoring of our data model. This was required by the shift to a more metric centric model, but those changes have also allowed us to build the frequently requested drag to zoom functionality. The foundations for this were implemented earlier in the year with our new graphing architecture based on React, Redux and D3.

[Image: Server Density Graph Zoom]

JMX

The JMX plugin will automatically pick up 11 core metrics and you can then configure it further to pick up any custom metrics. The details are in our documentation.

statsd

The new version of our monitoring agent can act as a collector for statsd metrics. This means you can instrument your code to measure any metric you like, from execution time through to throughput counters. The agent listens locally and doesn’t require any special configuration. It aggregates the values before sending them to Server Density for reporting and alerting, so you can report at any volume you wish without incurring network overhead.

It’s simple to post metrics to Sdstatsd. The Python example below sends the metric ‘application.metric.example’ as a counter with a random value between 1 and 100.

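import socket
from random import randint

# Address of the agent's local statsd listener
HOST = 'localhost'
PORT = 8125

# Counter ("c") with a random value between 1 and 100, plus a tag
count = randint(1, 100)
MESSAGE = 'application.metric.example:{}|c|#example: tag'.format(count)

# statsd messages are sent over UDP; sendto() needs bytes on Python 3
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(MESSAGE.encode('utf-8'), (HOST, PORT))
sock.close()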

You can find other examples in our documentation.
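
Execution time, mentioned above as something you might measure, could be reported along the following lines. This is a minimal sketch: it assumes the agent accepts the standard statsd timer type (`|ms`) on the same local UDP port as the counter example, and the metric name and the `timed` and `format_timer` helpers are hypothetical.

```python
import socket
import time
from contextlib import contextmanager

# Same local listener as the counter example. 127.0.0.1 is used here in
# place of 'localhost' to avoid a name lookup.
HOST = "127.0.0.1"
PORT = 8125


def format_timer(name, millis):
    """Build a statsd timer message: <name>:<milliseconds>|ms"""
    return "{}:{:.3f}|ms".format(name, millis)


@contextmanager
def timed(name, sock, address=(HOST, PORT)):
    """Measure the wrapped block and send its duration as a timer."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        sock.sendto(format_timer(name, elapsed_ms).encode("utf-8"), address)


sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
with timed("application.request.duration", sock):
    time.sleep(0.01)  # stand-in for the work being measured
sock.close()
```

Because the message is sent over UDP to a local listener, the instrumented code does not wait for a response, which keeps the overhead of each measurement small.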

SNMP

If you have custom hardware running on your network, the agent can be configured to issue requests to SNMP devices and collect stats data from them. Since the agent sits on your servers, it allows us to collect SNMP metrics from behind your firewall without any custom firewall rules needing to be set up to allow external access. You configure the agent to collect metrics from each device and can also configure your own MIBs. Full details are in our documentation.

Lots of new plugins

Many of the plugins we’ve had in development for some time depend on this release because they are multi-dimensional. This means we now have full support for 70 new and improved plugins including:

ActiveMQ, Cassandra, Docker, Elastic, HAProxy, HDFS, Kafka, Kubernetes, Memcache, MongoDB, MySQL, Nginx, OpenStack, PgBouncer, PHP-FPM, Postfix, PostgreSQL, RabbitMQ, Redis, Riak, Solr, Spark, Supervisor, Tomcat, Varnish, vSphere and ZooKeeper.

These are available to all customers at no extra cost. We are publishing the documentation for these over the coming days, but feel free to get in touch if you need help configuring them in the meantime. Everything is done via agent config files, which include detailed comments.

Availability

These new features are available to new accounts right now. We have released a new version of our *nix monitoring agent (2.2.0) which supports all of the above changes and is the minimum version required. Windows support is on our roadmap for 2018.

Due to the changes in our data model, we’ll be rolling this out to existing accounts gradually, with the full release completed for everyone by the end of January. You won’t need to make any changes. You’ll see an update to the “What’s new” panel in-app once the migration is completed.

statsd support will be available for everyone. SNMP and JMX will be available to customers on our Pro or Enterprise pricing only. Customers on older pricing can get in touch to find out about switching to get access to SNMP and JMX.