How to Monitor Lighttpd


By David Mytton,
CEO & Founder of Server Density.

Published on the 4th February, 2016.

After years of Apache dominance, developers around the world started building new HTTP servers designed for massive scale and solving the C10k problem. The two leading servers in terms of popularity are, of course, Nginx and Lighttpd.

The main advantage of the asynchronous approach is scalability. In a process-based server, each simultaneous connection requires a thread which incurs significant overhead. An asynchronous server, on the other hand, is event-driven and handles requests in a single (or at least, very few) threads.

Nginx FAQ – Are there other, similar web servers?

Performance-wise, Nginx and Lighttpd are very similar. When comparing feature lists, however, Nginx has a slight edge. This is probably why it’s backed by a wider community, even if Lighttpd offers more straightforward configuration options.

Regardless of which one you pick, you’ll probably want to monitor it. What follows is our very own checklist of best practices for Lighttpd, including key metrics and alerts we monitor with Server Density.

Monitor Lighttpd: Metrics and Alerts

Even in simple services like an HTTP server, there is no shortage of possible metrics you can monitor. The key to successful monitoring is to select the very few you care about enough to let them pester you with alerts and notifications.

Our rule of thumb here at Server Density is, “collect all metrics that help with troubleshooting, alert only on those that require an action.”

As with any other service, you need to monitor some broad conditions:

  1. Required processes are running as expected
  2. Resource usage is within limits
  3. Requests are coming through
  4. Typical failure points

Let’s now take a look at each of those categories and “flesh them out” with some specifics.

1. Lighttpd process running

| Metric | Comments | Suggested Alert |
|---|---|---|
| lighttpd process | The right binary daemon process is running. | When process /usr/sbin/lighttpd count != 1. |
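
As a rough illustration of the process-count alert, here is a minimal Python sketch. It assumes a Linux host where `ps -eo args=` prints one command line per process. The helper names and the sample command lines in the usage note are hypothetical, not part of any Server Density tooling.

```python
import subprocess

# Binary path taken from the alert in the table; adjust it for your distribution.
LIGHTTPD_BINARY = "/usr/sbin/lighttpd"


def count_processes(command_lines, binary=LIGHTTPD_BINARY):
    """Count command lines whose executable (first token) equals `binary`."""
    count = 0
    for line in command_lines:
        tokens = line.split()
        if tokens and tokens[0] == binary:
            count += 1
    return count


def running_command_lines():
    """Return one command line per running process (assumes procps `ps`)."""
    result = subprocess.run(
        ["ps", "-eo", "args="], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def process_alert(command_lines, binary=LIGHTTPD_BINARY):
    """Alert when the number of matching processes is not exactly 1."""
    return count_processes(command_lines, binary) != 1
```

On a live host you would call `process_alert(running_command_lines())`. Matching on the first token avoids counting unrelated commands, such as `grep lighttpd`, that merely mention the name.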

2. System Metrics

The metrics listed below are the “usual suspects” behind most issues and bottlenecks. They also correspond to the top system resources you should monitor on pretty much any server.

| Metric | Comments | Suggested Alert |
|---|---|---|
| Load | An all-in-one performance metric. See Understanding Linux load. | When load > factor × (number of cores). Our suggested factor is 4. |
| CPU usage | A high CPU usage is not a bad thing as long as you don’t reach the limit. This will depend on your app. | None |
| Memory usage | RAM usage depends on your maximum number of users. How much memory a connection takes depends on whether it serves a static file, whether you have any (Fast)CGI application in the backend, and whether you are using SendFile. | None |
| Swap usage | Swap is for emergencies only, and it should not be used. Don’t swap. | When used swap is > 128MB. |
| Network bandwidth | Traffic will be directly related to the number of connections and the size of those requests. A must in any web server. | None |
| Disk usage | Make sure you always have free space for new data, logs, temporary files, snapshots or backups. | When disk is > 85% usage. |
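
The three suggested alerts in this table reduce to simple threshold comparisons. The Python sketch below is one way to express them. The function names are hypothetical, and using the 1-minute load average is our assumption, since the table does not say which average to use.

```python
import os
import shutil

LOAD_FACTOR = 4                       # suggested factor from the table
SWAP_LIMIT_BYTES = 128 * 1024 * 1024  # 128MB
DISK_LIMIT_PERCENT = 85.0


def load_alert(load, cores, factor=LOAD_FACTOR):
    """Alert when load exceeds factor x number of cores."""
    return load > factor * cores


def swap_alert(used_swap_bytes, limit=SWAP_LIMIT_BYTES):
    """Alert when used swap exceeds the limit."""
    return used_swap_bytes > limit


def disk_alert(used_bytes, total_bytes, limit_percent=DISK_LIMIT_PERCENT):
    """Alert when disk usage exceeds the percentage limit."""
    return 100.0 * used_bytes / total_bytes > limit_percent


def check_host(path="/"):
    """Evaluate the load and disk alerts on the local host (Unix only)."""
    load_1min = os.getloadavg()[0]  # assumption: 1-minute average
    cores = os.cpu_count() or 1
    usage = shutil.disk_usage(path)
    return {
        "load": load_alert(load_1min, cores),
        "disk": disk_alert(usage.used, usage.total),
    }
```

Reading used swap is left out of `check_host` because it is platform-specific. On Linux the figure can be derived from `SwapTotal` and `SwapFree` in `/proc/meminfo` and passed to `swap_alert`.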

Disk can be a bottleneck. Unlike databases where you rely on both read and write performance, when it comes to web servers it’s mostly reads that count.

| Metric | Comments | Suggested Alert |
|---|---|---|
| Read/Write requests | IOPS (Input/Output operations per second). | None |
| IO Queue length | Tracks how many operations are waiting for disk access. If a request hits the cache, it doesn’t create any disk operation. If a request doesn’t hit the cache (i.e. a miss), it will create multiple disk operations. | None |
| Average IO wait | Time that queued operations have to wait for disk access. | None |
| Average Read/Write time | Time it takes to finish disk access operations (latency). | None |
| Read/Write bandwidth | Data transfer from and towards your disk. | None |

3. Lighttpd Metrics

Monitoring Lighttpd availability and requests

These metrics will tell you if Lighttpd is working and accepting new HTTP requests.

| Metric | Meaning / Comments | Suggested Alert |
|---|---|---|
| Uptime | Seconds since the server was started. We can use this to detect respawns. | When uptime is < 180. |
| Current Connections | Number of client connections currently handled. | None |
| Current Requests | Number of HTTP requests currently handled. Multiple requests can be made over one connection. | None |
| Traffic | Incoming and outgoing traffic. Will depend on the HTTP requests (GET/POST, etc) and their body sizes. | None |

 

Lighttpd typical errors

Lighttpd is a simple service with just a few failure points that you need to keep an eye on. Make sure you have enough RAM available for lighttpd so that your server doesn’t resort to swapping. Also make sure your logs are being rotated. There are no relevant messages in /var/log/lighttpd/error.log.

Understanding the scoreboard

The scoreboard is very similar across all web servers (Apache, Nginx and Lighttpd report on these metrics too).

| Metric lighttpd2 | Metric lighttpd1 | Meaning / Comments | Suggested Alert |
|---|---|---|---|
| Inactive | | TCP connections opened but waiting for a client response | None. If too many, we can reduce the timeout. |
| | connect | Opening the TCP connection | None |
| Request start | request-start | Start of HTTP request | None |
| Read header | read / read-POST | Read the content of the HTTP request | None |
| | request-end | End of HTTP request | None |
| Handle request | handle-request | Decide action to take with the request | None. If too many, the application backend is too slow or needs more resources/workers. |
| | response-start | Start of the HTTP response | None |
| Write response | write | Write the HTTP response to the socket | None |
| | response-end | End of the HTTP response | None |
| | hard error | | None |
| Keep-Alive | keep-alive | Keeping the TCP connection open for more HTTP requests from the same client to avoid the TCP handling overhead | If too many, we can reduce the keep-alive timeout. |
| | close | Closing the TCP connection if no other HTTP request will use it. | None |
| Upgraded | | | None |
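
To turn a lighttpd1 scoreboard string into per-state counts, something like the following Python sketch could be used. The character legend is an assumption based on the legend printed by the lighttpd 1.4 mod_status page, so verify it against your version. The function name is hypothetical.

```python
from collections import Counter

# Assumed legend (lighttpd 1.4 mod_status); check it against your server's status page.
SCOREBOARD_LEGEND = {
    ".": "connect",
    "C": "close",
    "E": "hard error",
    "k": "keep-alive",
    "r": "read",
    "R": "read-POST",
    "W": "write",
    "h": "handle-request",
    "q": "request-start",
    "Q": "request-end",
    "s": "response-start",
    "S": "response-end",
    "_": "idle",  # assumption: unused connection slot
}


def summarize_scoreboard(scoreboard):
    """Return a dict mapping each connection state to its number of slots."""
    counts = Counter()
    for char in scoreboard:
        counts[SCOREBOARD_LEGEND.get(char, "unknown")] += 1
    return dict(counts)
```

A growing `handle-request` or `keep-alive` count over successive samples is the kind of trend the comments in the table refer to.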

 

Monitoring Lighttpd HTTP return codes

HTTP return status codes are a simple way of checking if your webserver is working as expected. Keep an eye on the server errors, especially the 5XX ones:

| Metric | Comments | Suggested Alert |
|---|---|---|
| HTTP Status 1XX | Informational provisional response: request received, continuing process. | None |
| HTTP Status 2XX | Success: client action requested received, understood, accepted and processed successfully. | None |
| HTTP Status 3XX | Redirection: additional action to complete the request required. | None |
| HTTP Status 4XX | Client Error: client seems to have erred. | None |
| HTTP Status 5XX | Server Error: server failed to process an apparently valid request. | None |
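
A minimal Python sketch of how status codes might be bucketed into these classes, and how the share of server errors could be derived. The function names are hypothetical. The table suggests no alert, so any threshold you apply to the ratio is your own choice.

```python
from collections import Counter


def status_class(code):
    """Map an HTTP status code such as 503 to its class label, e.g. '5XX'."""
    return f"{code // 100}XX"


def count_status_classes(codes):
    """Count how many status codes fall into each class."""
    return Counter(status_class(code) for code in codes)


def server_error_ratio(class_counts):
    """Return the fraction of responses in the 5XX class (0.0 if no responses)."""
    total = sum(class_counts.values())
    if total == 0:
        return 0.0
    return class_counts.get("5XX", 0) / total
```

For example, `server_error_ratio(count_status_classes(codes))` gives the fraction of 5XX responses for a list of integer status codes collected from your access log.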

Lighttpd Monitoring Tools

Not many options out there. These are the ones we know of. Please chime in if we’ve missed something obvious here:

Lighttpd server stats

Connect to your Lighttpd server via curl, retrieve the stats URL, and you get the main metrics:

$ curl "http://localhost/server-status?auto"
Total Accesses: 100000
Total kBytes: 373633
Uptime: 458
BusyServers: 1
IdleServers: 127
Scoreboard: h_________________________________________________________
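
If you want to consume this output from a script rather than read it by eye, a small parser is enough. The Python sketch below assumes the `key: value` layout shown above and the same status URL. The function names are hypothetical, and the 180-second respawn threshold mirrors the Uptime alert suggested earlier.

```python
import urllib.request


def fetch_status_auto(url="http://localhost/server-status?auto"):
    """Download the machine-readable status page as text."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_status_auto(text):
    """Parse 'Key: value' lines into a dict, converting numeric values to int."""
    metrics = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if not separator:
            continue  # skip blank or malformed lines
        value = value.strip()
        metrics[key.strip()] = int(value) if value.isdigit() else value
    return metrics


def respawn_alert(metrics, min_uptime=180):
    """Alert when the server started less than `min_uptime` seconds ago."""
    return metrics["Uptime"] < min_uptime
```

Calling `parse_status_auto(fetch_status_auto())` on the server itself returns a dict with keys such as `Total Accesses`, `Uptime` and `Scoreboard`.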

To expose these stats, you will have to enable mod_status:

$ sudo lighty-enable-mod status
$ sudo /etc/init.d/lighttpd force-reload

If you want to protect this page then edit this file /etc/lighttpd/conf-available/10-status.conf and restrict access on an IP basis:

$HTTP["remoteip"] == "10.0.0.0/8" {
    status.status-url = "/server-status"
}

Or use authentication:

auth.require = ( "/server-status" =>
    ( "realm" ... ) )

If you are using lighttpd2, the available metrics and their respective mod_status outputs are not exactly the same. Lighttpd2 has a “current connection states section” in lieu of a “scoreboard”. It also includes some counters for HTTP return codes:

$ curl "http://localhost/server-status?format=plain"
# Absolute Values
uptime: 195
memory_usage: 4907008
requests_abs: 9
traffic_out_abs: 26060
traffic_in_abs: 869
connections_abs: 1

# Average Values (since start)
requests_avg: 0
traffic_out_avg: 133
traffic_in_avg: 4
connections_avg: 0

# Average Values (5 seconds)
requests_avg_5sec: 0
traffic_out_avg_5sec: 0
traffic_in_avg_5sec: 0
connections_avg_5sec: 0

# Connection States
connection_state_start: 0
connection_state_read_header: 0
connection_state_handle_request: 1
connection_state_write_response: 0
connection_state_keep_alive: 0
connection_state_upgraded: 0

# Status Codes (since start)
status_1xx: 0
status_2xx: 7
status_3xx: 0
status_4xx: 1
status_5xx: 0
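
The plain format is just as easy to parse. This Python sketch assumes the layout shown above, with `#` section headers and integer values. The function name is hypothetical.

```python
def parse_plain_status(text):
    """Parse lighttpd2 'key: value' status lines into a dict of ints.

    Section headers (lines starting with '#') and blank lines are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition(":")
        if not separator:
            continue  # not a key/value line
        try:
            metrics[key.strip()] = int(value)
        except ValueError:
            continue  # ignore values that are not plain integers
    return metrics
```

The resulting dict exposes the connection states (for example `connection_state_handle_request`) and the status counters (`status_5xx` and friends) under the same names as in the output above.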

The configuration is also slightly different in this case. Edit /etc/lighttpd2/lighttpd.conf and add:

# (inside the status section)

setup {
module_load "mod_status";
}

# (after the status section)
if req.path == "/server-status" {
status.info;
}

apachetop

As the name suggests, apachetop was built for Apache. Lighttpd, however, uses the same log format, so you can use the same tool for real-time monitoring of Lighttpd (for metrics like requests per second, traffic, return codes and currently requested URLs) in a top-like interface.

Just run:

$ sudo apt-get install apachetop
$ sudo lighty-enable-mod accesslog
$ sudo /etc/init.d/lighttpd force-reload

And then:

$ sudo apachetop -qf /var/log/lighttpd/*.log

last hit: 16:43:43         atop runtime:  0 days, 00:00:20             16:43:47
All:          330 reqs (  82.5/sec)       1149.8K (  287.5K/sec)    3568.0B/req
2xx:     330 ( 100%) 3xx:       0 ( 0.0%) 4xx:     0 ( 0.0%) 5xx:     0 ( 0.0%)
R ( 20s):     330 reqs (  16.5/sec)       1149.8K (   57.5K/sec)    3568.0B/req
2xx:     330 ( 100%) 3xx:       0 ( 0.0%) 4xx:     0 ( 0.0%) 5xx:     0 ( 0.0%)

REQS REQ/S    KB KB/S URL
330 82.50  1150  287*/

mod_status and apachetop are great interactive / realtime tools. But when managing a Lighttpd server in production, you probably want to record metrics over a period of time. This helps with troubleshooting and with delivering alerts when things break.
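
Recording over time mostly means sampling the cumulative counters at intervals and turning the differences into rates. Here is a minimal Python sketch of that arithmetic. The function name is hypothetical, and treating a decreasing counter as a server restart is our assumption.

```python
def per_second_rate(previous, current, elapsed_seconds):
    """Turn two samples of a cumulative counter into a per-second rate.

    Returns None when the counter went backwards, which we assume means
    the server restarted between the two samples.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = current - previous
    if delta < 0:
        return None
    return delta / elapsed_seconds
```

For instance, two samples of a cumulative request counter taken 60 seconds apart give the average requests per second for that minute, which is the kind of value you would store and graph.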

lighttpd version 1 includes a module for storing metrics in RRD files (subsequently removed in version 2). If you want to get graphs that way, here is a short article, but unfortunately you won’t get alerts.

There are some 3rd party plugins for alerting and graphing. These sit on top of existing on-premise open source monitoring solutions like Nagios with check_apachestatus_auto.pl (and Lighttpd support), Cacti (only graphs) with cacti-template-lighttpd, Munin with the lighttpd plugin, or Zabbix (read this post). All these plugins make use of the aforementioned mod_status URL.

Server Density

If all that sounds too onerous and if you have other, more pressing business priorities, then maybe you should leave server monitoring to the experts and carry on with your business.

This is where we shamelessly toot our own horn.

Server Density offers a user interface (we like to think it’s very intuitive) that supports tagging, elastic graphs and advanced infrastructure workflows. It plays well with your automation tools and offers mobile apps too. If you don’t have the time to set up and maintain your own on-premise monitoring and you are looking for a robust hosted monitoring service that covers Lighttpd (and the rest of your infrastructure), you should sign up for a 2-week trial of Server Density.

Server Density is not free but it saves you significant time and effort, which translates to significant savings for your boss.

Further Reading

Did this article pique your interest in lighttpd? Nice, keep reading. We’ve found some great posts about lighttpd performance that we think you should read. Here they are: Tuning Lighttpd for Linux, Lighttpd performance and Performance Tuning Lighty. Optimizing Lighttpd book from Packt Publishing should be a great read too.

If you didn’t know about Keep-Alive or you want to delve deeper into the HTTP protocol then The Definitive Guide from O’Reilly has a nice section on Persistent Connections.

Also, if you’re interested in optimising the Linux TCP stack, you can’t go wrong with Linux TCP/IP Tuning for Scalability and Linux Network Tuning for 2013.

Summary

What about you? Do you have a checklist of best practices for monitoring Lighttpd? What web servers do you have in place and how do you monitor them? Any books you can suggest?
