Monitoring network bandwidth, CPU and memory effectively

Here are a few handy tips for today that will likely remain in your armoury forever.

As a Linux sysadmin it’s sometimes difficult to visualise just what is causing a performance problem. Sure, it’s easy enough to see which process is hogging the CPU with tools like ‘top’ or its fancier brother, htop. Figuring out the long-term load on a machine, or understanding how much memory and network bandwidth is being used, can be more of a challenge if you aren’t aware of the tools out there.

CPU & memory monitoring with (h)top 

Press F6 in htop to choose the column to sort by, such as CPU or memory usage.

htop real time cpu analyser

Analysing CPU, Memory and Disk I/O over a measured time

To analyse the average CPU, memory and disk I/O load over a measured amount of time, use the vmstat tool. It looks ugly next to htop, but once you understand the display it is highly effective at showing what’s going on with the system, apart from network utilisation. Note as well that virtualised guest servers might not report true CPU and I/O figures, as these can vary dynamically with the hypervisor settings.

Like top, vmstat is available on almost every Linux distribution. It normally takes two arguments: the delay between samples in seconds and the number of samples to take. So, for example, running

vmstat 1 100

will take a sample every second, 100 times over. By default vmstat shows CPU load, memory/swap and block I/O. Note that the first line of output reports averages since the last boot, while each line after that covers only the preceding interval, so in this case you get 100 seconds’ worth of one-second samples. If you wish to run vmstat continuously, leave off the sample count. More information on the syntax and output of vmstat is in its man page (man vmstat).
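Since vmstat prints one line per interval rather than a summary, a small awk filter can boil a run down to averages. The following is only a sketch: vmstat_avg is a made-up name, and it assumes the usual procps layout where the column header row starts with "r" and includes us, sy, id and wa. It skips the first data line, which per the vmstat man page holds averages since boot rather than a live sample.

```shell
# vmstat_avg: read vmstat output on stdin and print the mean of the
# us/sy/id/wa CPU columns. Hypothetical helper, not part of vmstat.
vmstat_avg() {
    awk '
        $1 == "r" {
            # Header row: remember which position each column name has.
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $1 ~ /^[0-9]+$/ && ("us" in col) {
            # Data row. The first one is the since-boot average, so skip it.
            if (++rows == 1) next
            us += $(col["us"]); sy += $(col["sy"])
            id += $(col["id"]); wa += $(col["wa"])
            n++
        }
        END {
            if (n > 0)
                printf "samples=%d us=%.1f sy=%.1f id=%.1f wa=%.1f\n",
                       n, us / n, sy / n, id / n, wa / n
        }
    '
}

# Usage: vmstat 1 100 | vmstat_avg
```

Looking columns up by name rather than by fixed position keeps the filter working if a vmstat version adds extra columns at the end.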

Finally, as a text-based alternative, you can put this function in your .bashrc or in a shell script. In a script, you can execute it at intervals using the at command, schedule it with cron, or combine it with another script for further analysis over time.

# Columns 3 and 4 of ps aux output are %CPU and %MEM
memcpu() { echo "--- Top 10 CPU-eating processes ---"; ps auxf | sort -nr -k 3 | head -10;
echo "--- Top 10 memory-eating processes ---"; ps auxf | sort -nr -k 4 | head -10; }
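To build up a history from cron, each run needs a timestamp and somewhere to write its output. Here is a minimal sketch of a wrapper for that; log_snapshot, the log path and the script path in the comments are all made-up names for illustration. Bear in mind that cron does not read your .bashrc, so memcpu would have to be defined in the script that cron runs.

```shell
# log_snapshot LOGFILE COMMAND [ARGS...]
# Append a timestamped header followed by the command's output
# (stdout and stderr) to LOGFILE. Hypothetical helper.
log_snapshot() {
    logfile=$1
    shift
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        "$@"
    } >> "$logfile" 2>&1
}

# In a script that also defines memcpu (paths are examples only):
#   log_snapshot /var/tmp/memcpu.log memcpu
#
# A crontab entry to run that script every five minutes might look like:
#   */5 * * * * /usr/local/bin/memcpu-snapshot.sh
```

Appending rather than overwriting means the log can later be searched by timestamp or fed to another script.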

Analysing network utilisation quickly

Monitoring network utilisation is arguably as important as monitoring your CPU and memory. The number of built-in tools that do this varies between distributions, but there is a multitude of tools you can install via yum or apt-get in the respective distributions. You can try ntop or nmon. Today we are going to look at nload. Although ntop touts itself as the ‘top’ command of networking, it’s a web-based tool which, whilst good, isn’t as simple to get going as nload. To use nload, simply run it without any arguments and it will output the load on the current network interface.
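If nload isn’t installed yet, you can get a rough throughput figure from the byte counters the Linux kernel exposes under /sys/class/net. This is only a sketch and not a description of how nload itself works; net_rate and its arguments are made-up names, the interval must be a whole number of seconds, and the third argument exists only so the function can be pointed at a different directory for testing.

```shell
# net_rate IFACE [SECONDS] [STATS_ROOT]
# Read the interface's rx/tx byte counters twice, SECONDS apart, and
# print the average rate in KiB/s. Hypothetical helper.
net_rate() {
    iface=$1
    secs=${2:-1}                       # whole seconds only
    root=${3:-/sys/class/net}
    dir=$root/$iface/statistics

    rx1=$(cat "$dir/rx_bytes"); tx1=$(cat "$dir/tx_bytes")
    sleep "$secs"
    rx2=$(cat "$dir/rx_bytes"); tx2=$(cat "$dir/tx_bytes")

    # Integer arithmetic, so fractions of a KiB/s are truncated.
    echo "$iface rx=$(( (rx2 - rx1) / secs / 1024 ))KiB/s tx=$(( (tx2 - tx1) / secs / 1024 ))KiB/s"
}

# Usage (interface name is an example): net_rate eth0 5
```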

nload at work

Nload does just what it says on the tin. The historic analysis makes it easy to see how busy the network is. Unfortunately it won’t show you which application is causing the load, but there are apps which can help there too, like the excellent nethogs. It looks and works just like top, showing processes by name and sorted by which process is chewing the most bandwidth.

nethogs at work

Conclusion and further options

What I’ve demonstrated here are some great, quick analysis tools to get you out of a potentially difficult-to-diagnose issue. If you need longer-term analysis of almost any measurable aspect of a system, look to something like Nagios and combine it with rrdtool to graph historical trends. Also look at Cacti (rrd graphing) and Munin (sort of like Nagios + rrdtool + Cacti in one easy package).
