How to monitor NGINX
Last Modified: 22nd Nov 2016
Nginx is a popular web server which is often used as a load balancer because of its performance. We use it extensively at Server Density to power our public-facing UI and APIs, and also for its support for WebSockets.
As such, monitoring Nginx is important because it is often the critical component between your users and your service.
We hosted a live Hangout on Air with Rick Nelson, the Technical Solutions Architect from NGINX, in which we dug deeper into some of the issues discussed in this blog post. We’ve made the slides and video available, and they can be found embedded at the bottom of this blog post.
Monitor Nginx from the command line
Monitoring Nginx in real time is useful when you are debugging live activity or want to see what traffic is being handled as it arrives. These methods parse the Nginx access log and display activity as it happens.
Enable Nginx access logging
Access logging is configured with the access_log directive in the Nginx configuration. After enabling it, restart or reload Nginx, then tail the log to watch requests in real time as they hit the server.
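As a rough illustration of what tailing the log involves, the following Python sketch polls an open log file for newly appended lines, much like tail -f. The function name follow and its parameters are assumptions made for this example, not part of Nginx, and the sketch does not handle log rotation.

```python
import os
import time
from typing import Iterator, Optional, TextIO


def follow(log: TextIO, poll_interval: float = 0.5,
           max_idle_polls: Optional[int] = None,
           from_end: bool = True) -> Iterator[str]:
    """Yield lines appended to an open log file, similar to `tail -f`.

    max_idle_polls is a hypothetical safety valve: stop after that many
    consecutive polls with no new data (None means run until interrupted).
    """
    if from_end:
        # Skip what is already in the file and only show new requests.
        log.seek(0, os.SEEK_END)
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        line = log.readline()
        if line:
            idle = 0
            yield line.rstrip("\n")
        else:
            # Nothing new yet: wait briefly before polling again.
            idle += 1
            time.sleep(poll_interval)
```

To use it, open your access log (for example /var/log/nginx/access.log, an assumed path that depends on your access_log setting) and loop over follow(log), printing each line.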
Using ngxtop to parse the Nginx access log
Whilst tailing the access log directly is useful for checking a small number of requests, it quickly becomes unusable if you have a lot of traffic. Instead, you can use a tool like ngxtop to parse the log file for you, displaying useful monitoring stats on the console.
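To give a feel for the kind of aggregation such a tool performs, here is a minimal Python sketch that counts status codes and request paths from access log lines. It assumes the default "combined" log format and is not how ngxtop itself is implemented; the names LINE_RE, parse_line and summarise are made up for this example.

```python
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

# Matches the start of Nginx's default "combined" log format:
# $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent ...
LINE_RE = re.compile(
    r'(?P<addr>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+)'
)


def parse_line(line: str) -> Optional[Tuple[str, int]]:
    """Return (path, status) for one log line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    request = match.group("request")
    parts = request.split()
    # A normal request looks like "GET /path HTTP/1.1"; fall back to the raw
    # request string for malformed requests.
    path = parts[1] if len(parts) >= 2 else request
    return path, int(match.group("status"))


def summarise(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count requests per status code and per path, skipping unparseable lines."""
    statuses: Counter = Counter()
    paths: Counter = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        path, status = parsed
        statuses[status] += 1
        paths[path] += 1
    return statuses, paths
```

Calling paths.most_common(10) on the result gives the ten busiest paths, and the status counter shows at a glance how many requests are returning errors.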
For longer-running monitoring of the logs, Luameter is a better choice because it is designed to perform well over extended periods.
Nginx monitoring and alerting – Nginx stats
The above tools are useful for manual monitoring but not for automatically collecting Nginx statistics and configuring alerts on them. Alerting helps ensure your web server’s availability and performance remain high.
The basic Nginx monitoring stats are provided by HttpStubStatusModule (named ngx_http_stub_status_module in the current Nginx documentation). It reports the number of active connections, running totals of accepted connections, handled connections and requests, and how many connections are currently reading, writing or waiting. Rates such as requests per second are derived by comparing the totals between two samples.
Server Density supports parsing the output of this module to automatically graph and trigger alerts on the values, so we have a guide to configuring HttpStubStatusModule too. Using this module you can keep an eye on the number of connections to your server, and the requests per second throughput. What values these “should” be will depend on your application and hardware.
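As an illustration of what parsing this module's output involves, the Python sketch below turns the plain-text status page into a dictionary and derives requests per second from two samples. The function names are assumptions for this example; how you fetch the text (for instance from a status URL you have configured) is left out.

```python
import re
from typing import Dict


def parse_stub_status(text: str) -> Dict[str, int]:
    """Parse the plain-text output of the stub status page into integers."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    # The counters line holds three totals: accepts, handled, requests.
    counters = re.search(
        r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE
    )
    states = re.search(
        r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text
    )
    if not (active and counters and states):
        raise ValueError("unrecognised stub status output")
    return {
        "active": int(active.group(1)),
        "accepts": int(counters.group(1)),
        "handled": int(counters.group(2)),
        "requests": int(counters.group(3)),
        "reading": int(states.group(1)),
        "writing": int(states.group(2)),
        "waiting": int(states.group(3)),
    }


def requests_per_second(previous: Dict[str, int], current: Dict[str, int],
                        interval_seconds: float) -> float:
    """Derive a request rate from two samples taken interval_seconds apart."""
    return (current["requests"] - previous["requests"]) / interval_seconds
```

Because the module only exposes running totals, a collector has to keep the previous sample around; the rate is simply the difference in the requests counter divided by the sampling interval.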
A good way to approach configuring Nginx alerts is to understand what kind of baseline traffic your application experiences and set alerts around it. For example, alert if the stats are significantly higher than the baseline (indicating a sudden traffic spike) or suddenly significantly lower (indicating a problem somewhere that is preventing traffic from reaching the server).
You could also benchmark your server to find out at what traffic level things start to slow down and the server becomes too overloaded – this will then act as a good upper limit which you can trigger alerts at too.
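The baseline approach described above could be sketched in Python as follows. The 50% tolerance, the function name classify and the returned labels are illustrative assumptions only; suitable values depend on your own traffic.

```python
from statistics import mean
from typing import Optional, Sequence


def classify(value: float, history: Sequence[float],
             tolerance: float = 0.5,
             upper_limit: Optional[float] = None) -> Optional[str]:
    """Compare a new reading (e.g. requests per second) against a baseline.

    history holds recent readings taken during normal traffic.
    upper_limit is an optional hard ceiling, for example one found by
    benchmarking the server.
    """
    if upper_limit is not None and value >= upper_limit:
        return "over_limit"
    baseline = mean(history)
    if value > baseline * (1 + tolerance):
        return "spike"  # significantly higher than normal
    if value < baseline * (1 - tolerance):
        return "drop"   # significantly lower than normal
    return None         # within the expected range
```

With a history averaging 100 requests per second and the default tolerance, a reading of 160 is reported as a spike and a reading of 40 as a drop.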
| Metric | Meaning / Comments | Suggested Alert |
|---|---|---|
| Uptime | Seconds since the server was started. We can use this to detect respawns. | When uptime is < 180 seconds. |
| Current Connections | Number of client connections currently being handled. | None |
| Current Requests | Number of HTTP requests currently being handled. Multiple requests can be made over one connection. | None |
| Traffic | Incoming and outgoing traffic. Will depend on the HTTP requests (GET/POST, etc.) and their body sizes. | None |
Nginx monitoring and alerting – server stats
Monitoring Nginx stats like requests per second and number of connections is useful to keep an eye on Nginx itself, but its performance will also be affected by how overloaded the server is. Ideally you will be running Nginx on its own dedicated instance so you don’t need to worry about contention with other applications.
Web servers are generally limited by CPU, so your hardware spec should offer the web server as many CPUs and/or cores as possible. As traffic grows, you will likely see CPU usage increase.
CPU % usage itself is not necessarily a useful metric to alert on because the values tend to be per CPU or per core. It’s more useful to set up monitoring on average CPU utilisation across all CPUs or cores. Using a tool such as Server Density, you can visualise this and configure alerts so you can be notified when the CPU is overloaded – our guide to understanding these metrics and configuring CPU alerts will help.
On Linux, overall CPU demand is also summarised by another system metric called load average. It is a decimal number rather than a percentage and lets you understand load from the perspective of the operating system, i.e. how many processes are running or waiting for access to the CPU (Linux also counts processes in uninterruptible waits such as disk I/O). The recommended threshold for load average therefore depends on how many CPUs and cores you have – our guide to load average will help you understand this further.
| Metric | Meaning / Comments | Suggested Alert |
|---|---|---|
| Load | An all-in-one performance metric. | When load is > factor × (number of cores). Our suggested factor is 4. |
| CPU usage | A high CPU usage is not a bad thing as long as you don’t reach the limit. This will depend on your app. | None |
| Memory usage | RAM usage depends on your app usage too. | None |
| Swap usage | Swap is for emergencies only. Don’t swap. | When used swap is > 128MB. |
| Network bandwidth | Traffic will be directly related to the number of connections and the size of those requests. A must in any web server. | None |
| Disk usage | Make sure you always have free space for new data, logs, temporary files, snapshots or backups. | When disk is > 85% usage. |
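The suggested alerts in this table can be expressed as a small Python check. This is only a sketch: the function and constant names are assumptions, and the helper that reads local values covers load and disk only, since the standard library has no portable call for swap usage (on Linux it could be read from /proc/meminfo).

```python
import os
import shutil
from typing import List, Tuple

LOAD_FACTOR = 4                       # suggested factor from the table
SWAP_LIMIT_BYTES = 128 * 1024 * 1024  # 128MB of used swap
DISK_LIMIT = 0.85                     # 85% disk usage


def server_alerts(load: float, cores: int, swap_used_bytes: int,
                  disk_used: int, disk_total: int) -> List[str]:
    """Return the names of the thresholds that the given readings exceed."""
    alerts = []
    if load > LOAD_FACTOR * cores:
        alerts.append("load")
    if swap_used_bytes > SWAP_LIMIT_BYTES:
        alerts.append("swap")
    if disk_used / disk_total > DISK_LIMIT:
        alerts.append("disk")
    return alerts


def local_load_and_disk(path: str = "/") -> Tuple[float, int, int, int]:
    """Read the 1-minute load average, core count and disk usage for path.

    os.getloadavg() is only available on Unix-like systems.
    """
    load1, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    usage = shutil.disk_usage(path)
    return load1, cores, usage.used, usage.total
```

On a two-core server, for example, the load alert would fire once the load average rises above 8.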
Monitoring Nginx and load balancers with Nginx Plus
If you purchase the commercial version, Nginx Plus, you get access to more advanced monitoring (and other features) without having to recompile Nginx with the HttpStubStatusModule enabled.
Nginx Plus includes monitoring stats for connections, requests, load balancer counts, upstream metrics, the status of different load balancer upstreams and a range of other metrics. A live example of what this looks like is provided by Nginx themselves. It also includes a JSON Nginx monitoring API which would be useful for pulling the data out into your own tools.
Monitoring the remote status of Nginx
All of the above metrics monitor the internal status of Nginx and the servers it is running on, but it is just as important to monitor the experience your users are getting. This is achieved by using external status and response time tools: you want to know whether your Nginx instance is serving traffic to different locations around the world (wherever your customers are) and what response times they see.
This is easy to do with a service like Server Density because of our in-built website monitoring. You can check the status of your public URLs and other endpoints from custom locations and get alerts when performance drops or there is an outage.
This is particularly useful when you can build graphs to correlate the Nginx and server metrics with remote response time, especially if you are benchmarking your servers and want to know when a certain load average starts to affect end user performance.
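For a sense of what an external check measures, the Python sketch below requests a URL and records the status code and response time. It only measures from the machine it runs on, so it is no substitute for checks from multiple locations; the names check_url and CheckResult, and the injectable fetch parameter (there to make the function easy to test), are assumptions for this example.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, NamedTuple, Optional


class CheckResult(NamedTuple):
    status: Optional[int]  # HTTP status code, or None if the request failed
    seconds: float         # time taken, including reading the body


def check_url(url: str, timeout: float = 5.0,
              fetch: Callable = urllib.request.urlopen) -> CheckResult:
    """Fetch url once and report its status code and response time."""
    start = time.monotonic()
    try:
        with fetch(url, timeout=timeout) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is still useful.
        status = err.code
    except OSError:
        # Covers URLError, connection failures and timeouts.
        status = None
    return CheckResult(status, time.monotonic() - start)


def is_healthy(result: CheckResult, max_seconds: float = 1.0) -> bool:
    """Treat a 2xx/3xx response within max_seconds as healthy."""
    return (result.status is not None
            and 200 <= result.status < 400
            and result.seconds <= max_seconds)
```

Recording these results alongside the Nginx and server metrics makes it possible to see at what load average the response time starts to degrade.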
Monitor Nginx Slides
Monitor Nginx Video