Oct 4 update: added availability option. Now uses xentop internally.
Oct 2 update: added graphing to xenstat.pl. Now xenstat.pl detects Guest VM start/shutdown and resets itself. Number of vcpus also shown. Misc bug fixes.
Recently we have been experimenting with the Xen hypervisor. In my testing, I have found that Linux performs better on Xen than on VMware, and we are considering Xen for Linux rollouts.
Normally when we roll out a new server for a customer, we have a simple PHP script installed as a cron job that runs vmstat and logs the CPU utilization of the server into our database every 5 minutes. It's very useful for benchmarking, monitoring and troubleshooting mysterious performance problems. I needed a similar script for Xen.
A Google search turned up a Perl script by Tom Brown that records Xen domain CPU utilisation.
However, the following limitations led me to modify it:
- Total CPU utilisation is not capped at 100%. I want it capped, which is the way "top" works, but not the way "xm top" works.
- It does not work properly with multi-core CPUs: CPU utilisation can go over 100%.
- sleep() does not sleep for precisely the number of seconds you request, which again pushes the computed CPU utilisation over 100%. There is some perturbation, either because Dom-0 is itself virtualised or for some other reason.
- There is no easy way of logging to a database.
So I rewrote parts of the script and renamed it xenstat.pl (after vmstat). You can download xenstat.pl here.
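The first three limitations come down to arithmetic. Here is a minimal sketch in Python, not the actual Perl in xenstat.pl, of one way to handle them: divide by the capacity of all physical CPUs, measure the time that really elapsed instead of trusting sleep(), and clamp the result. The function names and the `read_cpu_secs` callable are hypothetical.

```python
import time


def cpu_percent(prev_cpu_secs, curr_cpu_secs, elapsed_secs, num_cpus):
    """Percentage of total host CPU capacity one domain used over a sample."""
    if elapsed_secs <= 0 or num_cpus <= 0:
        return 0.0
    used = curr_cpu_secs - prev_cpu_secs
    # Divide by the capacity of all physical CPUs so the domains sum to ~100%.
    pct = 100.0 * used / (elapsed_secs * num_cpus)
    # Clamp to absorb any remaining measurement jitter.
    return max(0.0, min(100.0, pct))


def sample(read_cpu_secs, interval, num_cpus):
    """Take one sample. read_cpu_secs is a hypothetical callable that
    returns {domain_name: cumulative_cpu_seconds}."""
    before = read_cpu_secs()
    start = time.monotonic()
    time.sleep(interval)
    after = read_cpu_secs()
    # Use the time that actually passed, not the requested interval.
    elapsed = time.monotonic() - start
    return {
        name: cpu_percent(before[name], after[name], elapsed, num_cpus)
        for name in after
        if name in before  # skip domains that started mid-sample
    }
```

With 2 physical CPUs, a domain that burns 5 CPU seconds in a 5 second window comes out at 50%, not 100%.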
Usage
To use it, run "perl xenstat.pl" in domain 0. It prints the following output, with a new line of statistics every 5 seconds:
[root@server ~]# perl xenstat.pl
cpus=2
40_falcon 2.67% 2.51 cpu hrs in 1.96 days ( 2 vcpu, 2048 M)
52_python 0.24% 747.57 cpu secs in 1.79 days ( 2 vcpu, 1500 M)
54_garuda_0 0.44% 2252.32 cpu secs in 2.96 days ( 2 vcpu, 750 M)
Dom-0 2.24% 9.24 cpu hrs in 8.59 days ( 2 vcpu, 564 M)
40_falc 52_pyth 54_garu Dom-0 Idle
2009-10-02 19:31:20 0.1 0.1 82.5 17.3 0.0 *****
2009-10-02 19:31:25 0.1 0.1 64.0 9.3 26.5 ****
2009-10-02 19:31:30 0.1 0.0 50.0 49.9 0.0 *****
In the above output, the first few lines summarise the CPUs and running domains. Then we have the statistics generated every 5 seconds. At the end of each line is a simple graph. 5 stars means 90% or over CPU utilisation, 4 stars is 70% or over, etc.
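The star graph can be sketched as a simple threshold count. Only the 90% and 70% thresholds are stated above. The 50%, 30% and 10% steps below are my assumption, chosen because they match the sample output (10.7% busy shows one star, 7.5% shows none). This is Python for illustration, not the script's Perl.

```python
def stars(utilisation_pct, thresholds=(10, 30, 50, 70, 90)):
    """Return one '*' for every threshold the utilisation reaches.

    The 70 and 90 thresholds come from the text; 10, 30 and 50 are
    assumed to continue the same 20-point spacing.
    """
    return "*" * sum(1 for t in thresholds if utilisation_pct >= t)


# Total utilisation is 100 minus the Idle column.
print(stars(100 - 26.5))  # the 19:31:25 sample line, 73.5% busy
```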
You can also specify the polling interval (in seconds) and the number of samples, just like vmstat:
[root@server ~]# perl xenstat.pl 3 2
cpus=2
40_falcon 2.67% 2.51 cpu hrs in 1.96 days ( 2 vcpu, 2048 M)
52_python 0.24% 748.07 cpu secs in 1.79 days ( 2 vcpu, 1500 M)
54_garuda_0 0.44% 2258.38 cpu secs in 2.96 days ( 2 vcpu, 750 M)
Dom-0 2.24% 9.24 cpu hrs in 8.59 days ( 2 vcpu, 564 M)
40_falc 52_pyth 54_garu Dom-0 Idle
2009-10-01 12:14:59 0.0 0.0 1.7 5.7 92.5
2009-10-01 12:15:02 0.0 0.0 0.3 10.4 89.3 *
[root@server ~]#
Logging Using a REST Web Service
I didn't want to install a database client in Dom-0 just to log the CPU utilisation from the Perl script. So I added another parameter: the URL of a web server, which is called with the CPU info as GET parameters. This assumes wget is installed in your Dom-0.
[root@server ~]# perl xenstat.pl 10 1 [192.168.0.1]
cpus=2
54_garuda_0 0.49% 165.81 cpu sec over 3.62 days (2 vcpu, 750 M)
59_gyrfalcon 0.62% 69.03 cpu sec over 0.80 days (2 vcpu, 2000 M)
Dom-0 1.57% 2.15 cpu hrs over 3.62 days (2 vcpu, 564 M)
--10:46:42-- [192.168.0.1]
Connecting to 192.168.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 498 [text/html]
Saving to: `STDOUT'
100%[============================================>] 498 --.-K/s in 0s
10:46:42 (67.8 MB/s) - `-' saved [498/498]
2009-09-29 10:46:42 0.1 2.1 2.2 95.6
This accumulates statistics for 10 seconds, then sends them to the above URL in this format:
[192.168.0.1]
This allows you to log the data using a RESTful web service.
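To illustrate the idea of passing one sample as GET parameters, here is a rough Python sketch. The parameter names (`time` plus one key per domain) and the base URL are hypothetical. They are not the format xenstat.pl actually sends, which is not reproduced in this copy of the post.

```python
from urllib.parse import urlencode


def build_log_url(base_url, timestamp, usage):
    """Append one sample to base_url as a GET query string.

    usage maps a column name to its CPU percentage. The "time" key and
    the use of column names as parameter names are illustrative only.
    """
    params = {"time": timestamp}
    params.update(usage)
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


url = build_log_url(
    "http://logger.example/xenlog",  # hypothetical logging endpoint
    "2009-09-29 10:46:42",
    {"54_garuda_0": 0.1, "Dom-0": 2.2, "Idle": 95.6},
)
print(url)
```

The server side then only has to read the query parameters and insert a row, so Dom-0 needs nothing beyond an HTTP client such as wget.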
Availability Option [a]
The problem with showing CPU utilisation is that some guest VMs are capped because they have fewer vcpus than the host has physical CPUs. If the CPU utilisation of a guest VM is 50%, can you tell whether it can go higher (vcpus == # physical cpus), or whether it is already capped (vcpus = 50% of physical cpus)?
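The cap itself is simple arithmetic. A minimal Python sketch, assuming utilisation is expressed as a share of all physical CPUs (as in the output above) and that one vcpu can use at most one physical CPU at a time. The function name is hypothetical.

```python
def max_utilisation_pct(vcpus, physical_cpus):
    """Highest share of total host CPU a guest can reach, in percent."""
    # Extra vcpus beyond the physical CPU count add no capacity.
    usable = min(vcpus, physical_cpus)
    return 100.0 * usable / physical_cpus


print(max_utilisation_pct(1, 2))  # guest with half as many vcpus as CPUs
print(max_utilisation_pct(2, 2))  # guest with as many vcpus as CPUs
```

So a reading of 50% means "flat out" for the first guest and "half loaded" for the second, which the raw figure does not show.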
The solution is to reverse the CPU figures and view information in terms of Available CPU % left (100 - CPU Utilisation %). In other words we show how much CPU power you have left for each VM. The advantage is that it tells w
Truncated by Planet PHP, read more at the original (another 2386 bytes)