Analysing system performance with ‘Top’

There are literally hundreds of guides on the Internet detailing how to use the ‘top’ command, a very handy command-line tool that has been around on UNIX systems since the dark ages. However, not all of these guides are aimed squarely at the new Linux user. This one won’t go into loads of detail, but it will give you the basics.

[Image: The top command in action]

Why would I want to run ‘top’?

Top is a great utility for finding out why your Linux machine is running slowly, or for seeing what a server is doing most of the time. It tells you loads of things about how well your box is performing, and can be compared to tools like the Windows Task Manager.

How do I run ‘top’?

Top is a command-line tool. That is, you need to run the Terminal or Konsole program in order to run it. For example, in Ubuntu, click on Applications, then click Accessories, then click ‘Terminal’.  You will be presented with a command prompt. Type the word top (in lower case) and press return. You will then see the ‘top’ program running.

What am I seeing here?

With any luck, your terminal window should look a bit like this:

top - 16:17:41 up 100 days, 18:01,  4 users,  load average: 0.20, 1.13, 1.77
Tasks: 126 total,   1 running, 125 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.8%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16383952k total, 15630644k used,   753308k free,  4180008k buffers
Swap:  7815580k total,       64k used,  7815516k free, 10127600k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
1 root      20   0  1948  600  508 S    0  0.0   0:21.74 init

Whoa! What is all that nonsense?! Have no fear, it will all make sense in a second and you’ll be able to impress all your friends with your newfound knowledge!

The first line shows the same stats as the ‘uptime’ command. You can see the current system time and how long the system has been running (the ‘uptime’); in this case my system has been running for 100 days, 18 hours and 1 minute. There are four users logged in to the system, and finally you see the ‘load average’ figures.

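The load average shows how many processes (program tasks) were running or waiting to run, averaged over three time periods: 1 minute, 5 minutes and 15 minutes. (On Linux, tasks stuck waiting on disk I/O are counted too.) You can tell that the ‘load’ on this box has come down from 1.77 to 0.20 over the last 15 minutes. How high is too high depends on how many CPU cores you have: as a rule of thumb, a load average that stays above the number of cores means tasks are queuing for the CPU. A load average of over 10 is fairly high on a typical desktop, and you will definitely start to notice the computer being slower.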
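If you want to pull the load figures out of that first line yourself, here is a minimal Python sketch. It parses the sample line from this article and scales the 1 minute figure by the number of CPU cores on whatever machine runs it. The function names are made up for illustration and are not part of top.

```python
import os
import re

# First line of the sample output shown earlier in this article.
SAMPLE = "top - 16:17:41 up 100 days, 18:01,  4 users,  load average: 0.20, 1.13, 1.77"


def parse_load_averages(summary_line):
    """Return the 1, 5 and 15 minute load averages as a tuple of floats."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", summary_line)
    if match is None:
        raise ValueError("no load average found in line")
    return tuple(float(value) for value in match.groups())


def load_per_core(load, cores):
    """Scale a load figure by the number of CPU cores (hypothetical helper)."""
    return load / cores


one, five, fifteen = parse_load_averages(SAMPLE)
cores = os.cpu_count() or 1  # cpu_count() can return None, so fall back to 1
print(f"1 min: {one}, 5 min: {five}, 15 min: {fifteen}")
print(f"1 minute load per core on this machine: {load_per_core(one, cores):.2f}")
```

A per-core figure that stays above 1.0 suggests tasks are waiting for a free CPU, but treat that as a rough guide rather than a hard limit.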

The next line is useful, but not as useful as the last one. It lists the number of tasks and what state they are in. Note that 125 out of the 126 tasks at this time were ‘sleeping’. The only running process was in fact ‘top’; everything else was doing nothing, just waiting around for something to happen, thus they are sleeping. You’ll also notice that there are 0 zombie processes. A zombie is a process that has finished, but whose parent process has not yet collected its exit status, so its entry is left behind in the process table. Zombies are not running and use no CPU, but they are still listed, so essentially they are ‘zombified’ processes! You cannot kill them directly (they are already dead); they disappear once the parent collects them or the parent itself exits. I rarely see them these days unless a program is misbehaving.

The next line shows you what percentage of CPU time is being spent in each state. If the CPU is 100% used for a blink and then drops back to around 5%, this is quite normal and you will notice it happening quite a lot. %us is the percentage of CPU time spent running user-space code (ordinary programs, whichever user owns them). %sy is the percentage spent in the kernel, doing work on behalf of the system. %ni is the percentage spent running ‘niced’ user processes, i.e. processes that have had their normal priority adjusted in some way. %id is the percentage of time the CPU is idle, waiting for instructions; you can see that this box is really doing very little here, thus the high idle value. Of the remaining fields, %wa is time spent waiting for I/O, %hi and %si are time spent servicing hardware and software interrupts, and %st is time ‘stolen’ by a hypervisor when you are running in a virtual machine.
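To make the CPU line a little less cryptic, here is a small Python sketch that turns it into a dictionary keyed by state. It assumes the older ‘0.0%us’ layout used in this article's sample; newer versions of top print the numbers and labels differently, so the pattern would need adjusting there. The function name is hypothetical.

```python
import re

# CPU summary line from the sample output in this article.
SAMPLE = "Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.8%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st"


def parse_cpu_line(line):
    """Map each state label (us, sy, ni, ...) to its percentage."""
    return {label: float(value) for value, label in re.findall(r"([\d.]+)%(\w+)", line)}


cpu = parse_cpu_line(SAMPLE)
busy = round(100.0 - cpu["id"], 1)  # everything that is not idle time
print(cpu)
print(f"CPU busy: {busy}%")
```

Subtracting the idle figure from 100 is a quick way to get a single ‘how busy is this box’ number.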

There is a lot to say about Mem (Memory) and Swap (disk space used as overflow for memory) that is beyond the scope of this article. Suffice it to say that you should always expect the amount of free memory to be low. This is by design; it’s not like the old days in Windows or DOS. Linux automatically uses most of your otherwise idle RAM for buffers and caches, and gives it back when programs need it.
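Here is a rough Python sketch of that arithmetic, using the two memory lines from the sample output. Note that in this older layout the ‘cached’ figure sits on the Swap line even though it describes RAM. Treating all buffers and cache as reclaimable is a simplifying assumption, and the function name is made up for illustration.

```python
import re

# Memory summary lines from the sample output in this article.
MEM_LINE = "Mem:  16383952k total, 15630644k used,   753308k free,  4180008k buffers"
SWAP_LINE = "Swap:  7815580k total,       64k used,  7815516k free, 10127600k cached"


def parse_kib_fields(line):
    """Return {label: kilobytes} for every '<number>k <label>' pair in the line."""
    return {label: int(value) for value, label in re.findall(r"(\d+)k (\w+)", line)}


mem = parse_kib_fields(MEM_LINE)
swap = parse_kib_fields(SWAP_LINE)

cache_kib = mem["buffers"] + swap["cached"]     # RAM borrowed for buffers and cache
used_by_programs_kib = mem["used"] - cache_kib  # what programs really occupy
reclaimable_kib = mem["free"] + cache_kib       # free now, or freed on demand

print(f"Used by programs: {used_by_programs_kib / 1024 ** 2:.2f} GiB")
print(f"Free or reclaimable: {reclaimable_kib / 1024 ** 2:.2f} GiB")
```

So although only about 0.7 GiB shows as ‘free’ in the sample, the bulk of the 16 GiB is cache that the kernel can hand back.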

The rest of the top program shows you the ‘top’ running processes (hence the name!). By default, it shows you the processes sorted by CPU usage.

The example above shows an idle system, but on a busier one you might see a number of processes listed above process ‘1’ (called init), all chewing up more CPU. Here is what all the fields mean in that bar along the top:

PID – Process ID. The unique number given to each process on the system. The init process always has PID 1 because it is the first thing that runs, and it spawns all other processes. Don’t kill this process unless you want to reboot your box!

USER – Username. This is the user that ‘owns’ the process in question. This way you can quite quickly see which user or users are chewing up most of the system’s resources. Remember that the ‘root’ user is the system administrator account (the super-user).

PR – Priority. This is the scheduling priority the kernel has given the process. This is often of little use to you, as it follows from the nice value: for an ordinary process, PR is simply 20 plus NI, so it runs from 0 (highest priority) to 39 (lowest). The higher the number, the lower the priority. Real-time processes show ‘rt’ or a negative number here.

NI – ‘Nice’ Value. The nice value can be between -20 and +19. This adjusts the priority of a process, so if you have a big heavy-duty process that you want to run, but don’t want it to overpower everything else, you can ‘renice’ it to +19. If you want it to go above the normal priority (a nice value of 0), you need to be the root user; root can renice a process down to -20 to boost its priority over everything else.

VIRT – Virtual Memory allocated. This is the total amount of virtual memory the process has mapped at present, whether or not it is actually in RAM. Like RES and SHR, it is shown in kilobytes by default.

RES – Resident Memory allocated. This is the amount of ‘real’ memory (physical RAM) the process is actually occupying right now.

SHR – Shared Memory allocated. Processes can share memory with other processes; this is the amount of memory they are using which is considered to be ‘shared’ memory.

S – Status. This is the status of the process. Most often it will be R (Running), S (Sleeping) or Z (Zombie), but you may also see D (uninterruptible sleep, usually waiting on disk I/O) or T (stopped).

%CPU – This is the percentage of CPU time that this process has used since the last screen update. On a multi-core machine, a process with several threads can show more than 100% here.

%MEM – This is the percentage of physical memory (RAM) that this process is currently using. Often you will find that this is fairly low.

TIME+ – This is the total CPU time the process has used since it started, shown in minutes and seconds to the nearest hundredth of a second.

COMMAND – This is the actual command that is running, or the name of the process.
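If you ever want to pick those columns apart in a script, here is a minimal Python sketch that pairs the header names with the values from the sample init row, and converts TIME+ into plain seconds. The function names are made up for illustration, and it assumes the default column layout shown in this article.

```python
# Header and process row from the sample output in this article.
HEADER = "PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND"
ROW = "1 root      20   0  1948  600  508 S    0  0.0   0:21.74 init"


def parse_process_row(header, row):
    """Pair each column name from the header with its value in the row."""
    names = header.split()
    # Split at most len(names) - 1 times so a COMMAND containing spaces stays whole.
    values = row.split(None, len(names) - 1)
    return dict(zip(names, values))


def cpu_time_seconds(time_plus):
    """Convert a TIME+ value (minutes:seconds.hundredths) to seconds."""
    minutes, seconds = time_plus.split(":")
    return int(minutes) * 60 + float(seconds)


proc = parse_process_row(HEADER, ROW)
print(proc)
print(f"{proc['COMMAND']} has used {cpu_time_seconds(proc['TIME+'])} seconds of CPU time")
```

All the values come back as strings, so convert the ones you need (as done for TIME+ here) before doing any sums with them.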

 

Cool, what else can I do with top?

There are a few keys you can press within top, that will help you analyse other parts of your system’s performance.

Kill – If you press k and enter a process ID (PID), you will kill (close down) that process. Be careful with this though: if you kill a process that you shouldn’t, the system can become unstable, especially if you are running top as the super-user (root). When you are asked for a ‘signal’ to send, there are a number to choose from, but 15 and 9 are the most common. See ‘man kill’ for an idea of what each one does. Essentially, 15 (SIGTERM) asks a process to terminate gracefully, while 9 (SIGKILL) kills it straight away (not graceful: the process gets no time to save its state or data).

Quit – to quit out of top, press q (lower case q).

Renice – to renice a process (see above section) use the r key.

Sort by memory usage – press capital M (Shift+M). Lower case m only toggles the memory summary lines at the top of the screen.

Sort by CPU usage – press capital P. This is the default view.
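The renice key changes a task's NI value, and for ordinary (non real-time) tasks on Linux the PR column follows it as 20 plus NI, with nice values running from -20 to +19. Here is a tiny Python sketch of that relationship; the function name is made up for illustration.

```python
NICE_MIN, NICE_MAX = -20, 19  # range of nice values on Linux


def pr_for_nice(nice):
    """Return the PR value top shows for an ordinary (non real-time) task."""
    if not NICE_MIN <= nice <= NICE_MAX:
        raise ValueError(f"nice value must be between {NICE_MIN} and {NICE_MAX}")
    return 20 + nice


for nice in (-20, 0, 10, 19):
    print(f"NI {nice:>3} -> PR {pr_for_nice(nice)}")
```

That is why the init row in the sample shows PR 20 alongside NI 0.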

Further Usage and Reading

For further information, type man top at the command line and you will see the manual page for top, which gives you detailed information on the program. You’ll also find plenty of other guides on the Internet that go into further depth, but hopefully this one helps you to diagnose your system’s performance. For example, if something is running slowly (often a problem with programs like firefox hanging and chewing up CPU), you are likely to see firefox at the top of the top list. You can kill it by pressing k, entering the PID of firefox, and pressing return. Once that’s done, unless there are other processes still chewing up the CPU, you should notice things returning to normal.
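The same ‘ask nicely with 15, then force with 9’ approach can be sketched in Python. This example starts its own throwaway child process so that it is safe to run; for a process you did not start yourself, the equivalent call would be os.kill(pid, signal.SIGTERM). The function name and the five second grace period are assumptions for illustration.

```python
import signal
import subprocess
import sys


def stop_child(proc, grace_seconds=5.0):
    """Ask a child process to exit with signal 15, falling back to signal 9."""
    proc.send_signal(signal.SIGTERM)  # signal 15: polite request to terminate
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # signal 9 (SIGKILL): cannot be caught or ignored
        proc.wait()
    return proc.returncode


# Start a throwaway child that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
code = stop_child(child)
print(f"child {child.pid} ended with return code {code}")
```

On Linux, a negative return code is the number of the signal that ended the process, so you can tell whether the polite request was enough.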

Finally, there’s also a newer, prettier (arguably nicer) version of top called htop. It is not on every Linux system, but it can be easily installed.
