Thursday, March 15, 2012

Linux Server Monitoring Commands You Really Need To Know

Hello Friends,

Want to know what's really going on with your server? Then you need to know these essential commands. Once you've mastered them, you'll be well on your way to being an expert Linux system administrator.

Depending on the Linux distribution, you can pull up much of the information that these shell commands give you from a GUI program. SUSE Linux, for example, has an excellent graphical configuration and management tool, YaST, and KDE's KDE System Guard is also excellent.

However, it's a Linux administrator truism that you should run a GUI on a server only when you absolutely must. That's because Linux GUIs take up system resources that could be better used elsewhere. So, while using a GUI program is fine for basic server health checkups, if you want to know what's really happening, turn off the GUI and use these tools from the Linux command shell.

This also means that you should only start a GUI on a server when it's required; don’t leave it running. For optimum performance, a Linux server should run at "runlevel 3", which fully supports networking and multiple users but doesn't start the GUI when the machine boots. If you really need a graphical desktop, you can always get one by running "startx" from a shell prompt.

Note: If your server starts by booting into a graphical desktop, you need to change this. To do so, head to a terminal window, "su" to the root user, and use your favourite editor on /etc/inittab.

Once there, find the initdefault line and change it from id:5:initdefault: to id:3:initdefault:

If there is no inittab file, create it, and add the id:3 line. Save and exit. The next time you boot into your server it will boot into "runlevel 3". If you don't want to reboot after this change, you can also set your server's run level immediately with the command: init 3
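The initdefault setting can also be read back with a script instead of by eye. Here is a minimal sketch, assuming the classic SysV "id:N:initdefault:" layout described above. The helper name default_runlevel is my own invention, and it reads inittab text on standard input so you can try it on a sample line first.

```shell
# Print the default runlevel from inittab-format text on standard input.
# Comment lines (starting with #) are skipped; fields are colon-separated.
default_runlevel() {
    awk -F: '!/^#/ && $3 == "initdefault" { print $2; exit }'
}

# Try it on a sample line.
printf 'id:3:initdefault:\n' | default_runlevel
```

On a real server you would feed it the file itself: default_runlevel < /etc/inittab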

Once your server is running at init 3, you can start using the following shell programs to see what's happening inside your server.

iostat

The iostat command shows in detail what your storage subsystem is up to. You usually use iostat to monitor how well your storage sub-systems are working in general and to spot slow input/output problems before your clients notice that the server is running slowly. Trust me, you want to spot these problems before your users do!

meminfo and free

Meminfo gives you a detailed list of what's going on in memory. Typically you access meminfo's data by using another program such as cat or grep. For example,

cat /proc/meminfo

gives you the details of what's going on in your server’s memory at any given moment.

For a quick “just the facts” look at memory, you can use the free command. In short, free gives you the overview; meminfo gives you the details.
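If you want a single number out of meminfo, a few lines of awk will do. This is only a sketch: the helper name mem_used_pct is made up, it assumes the MemTotal, MemFree, Buffers and Cached fields are present, and it treats buffers and page cache as reclaimable, which is a simplification.

```shell
# Print the percentage of RAM in use, from /proc/meminfo-format text on stdin.
# "In use" here excludes free memory, buffers, and page cache.
mem_used_pct() {
    awk '
        $1 == "MemTotal:" { total = $2 }
        $1 == "MemFree:"  { free = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached = $2 }
        END {
            if (total > 0)
                printf "%d\n", (total - free - buffers - cached) * 100 / total
        }
    '
}
```

On a live server: mem_used_pct < /proc/meminfo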

mpstat

The mpstat command reports on the activities of each of the available CPUs on a multi-processor server. These days, thanks to multi-core processors, that’s almost all servers. mpstat also reports on the average activities of all your server's CPUs. It enables you to display overall CPU statistics per system or per processor. This overview can alert you to possible application problems before they get to the point of annoying users.

netstat

Netstat, like ps, is a Linux tool that administrators use every day. It displays a lot of network related information, such as socket usage, routing, interface, protocol, network statistics, and more. Some of the most commonly used options are:

-a Show all socket information

-r Show routing information

-i Show network interface statistics

-s Show network protocol statistics
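One thing I often want from netstat is a count of TCP connections per state. The sketch below assumes the usual "netstat -ant" layout, where the state is the sixth column; tcp_states is a hypothetical helper name.

```shell
# Count TCP sockets per state from "netstat -ant"-style text on stdin.
# Column 6 holds the state (LISTEN, ESTABLISHED, TIME_WAIT, ...).
tcp_states() {
    awk '$1 ~ /^tcp/ { count[$6]++ }
         END { for (s in count) print count[s], s }' | sort -rn
}
```

Usage: netstat -ant | tcp_states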

nmon

Nmon, short for Nigel's Monitor, is a popular open-source tool to monitor Linux systems performance. Nmon watches the performance information for several subsystems, such as processor utilization, memory utilization, run queue information, disk I/O statistics, network I/O statistics, paging activity, and process metrics. You can then view nmon's real-time system measurements via its curses “graphical” interface.


To run nmon, you start the tool from the shell. Once up, you select the subsystems to monitor by typing in its one-key commands. For example, to get CPU, memory, and disk statistics, you type c, m, and d. You can also use nmon with the -f flag to save performance statistics to a CSV file for later analysis.

For day to day server monitoring I find nmon to be the single most useful program in my Linux system management tool-kit.

pmap

The pmap command reports the amount of memory that your server's processes are using. You can use this tool to determine which processes on the server are being allocated memory and whether any of these processes are being piggy with RAM.

ps and pstree

The ps and pstree commands are two of the Linux administrator’s best friends. They both provide a list of all currently running processes. Ps tells you how much memory and processor time the server’s programs are using. Pstree shows less information, but highlights which processes are the children of other processes. Armed with this information, you can spot out-of-control processes and kill them off with Linux's “take no prisoners” kill command.

sar

The sar program is a Swiss-army knife of a system monitoring tool. The sar command is actually made up of three programs: sar, which displays the data, and sa1 and sa2, which collect and store it. Once installed, sar creates a detailed overview of CPU utilization, memory paging, network I/O and transfer statistics, process creation activity, and storage device activity. The big difference between sar and nmon is that the former is better at long-term system monitoring, while I find nmon to be better at giving me a quick read on my server's status.

strace

strace is often thought of as a programmer's debugging tool, but it's more than that. It intercepts and records the system calls that are made by a process. This makes it a useful diagnostic, instructional, and debugging tool. For example, you can use strace to find out which configuration file a program is actually using when it starts up.

Strace does have one flaw though. When it's checking out a specific process, that process' performance will fall through the floor. Thus, I only use strace when I already have a darned good reason to think that that program is causing trouble.

tcpdump

Tcpdump is a simple, robust network monitoring utility. Its basic protocol analyzing capability enables you to get a rough view of what is happening on your network. To really dig into what's going on with your network however, you'll want to use Wireshark (see below).

top

The top command shows what's going on with your active processes. By default, it displays the most CPU-intensive tasks running on the server and refreshes the list every few seconds (three seconds in the common procps version). You can sort the processes by PID (process ID), by age (newest first), by cumulative CPU time, or by resident memory usage. I find this a fast and easy way to see if any process is starting to go out of control and about to get into trouble.

uptime

Use uptime to see how long the server has been running and how many users are logged on. It also reports the average server load over the last 1, 5, and 15 minutes. On a single-CPU server, a load of 1 or less means that each process has immediate access to the CPU and no CPU cycles are lost. On a multi-core server, compare the load with the number of cores: a load of 4 on a four-core machine is roughly the same situation.
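On a multi-core server the raw load average is easier to judge once you divide it by the number of cores. A quick sketch of that arithmetic follows; load_per_core is a made-up helper name.

```shell
# Divide a load average by a CPU count and print the result to two decimals.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Example with fixed numbers: a load of 2.00 on a 4-core server prints 0.50.
load_per_core 2.00 4
```

On Linux you could feed it live values with: load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(getconf _NPROCESSORS_ONLN)"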

vmstat

For the most part, you use vmstat to monitor what's going on with virtual memory. Linux constantly uses virtual memory to get the best possible storage performance.

If your applications are taking up too much memory you get excessive page-outs — programs moving from RAM to your system's swap space, which is on the hard drive. Your server can reach a point where it's spending more time managing memory paging than running your applications, a condition called thrashing. When your computer is thrashing, its performance falls through the floor. Vmstat, which can display either average data or actual samples, can help you spot memory pig programs and processes before they bring your server to a crawl.
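To spot paging trouble in vmstat output, watch the si (swap in) and so (swap out) columns. The sketch below assumes the default vmstat layout, where those are columns 7 and 8; swap_activity is a hypothetical helper name.

```shell
# Print vmstat sample lines that show swap traffic (si or so above zero).
# Assumes the default layout, where si and so are columns 7 and 8.
# Header lines evaluate to zero in awk arithmetic, so they are never printed.
swap_activity() {
    awk '($7 + $8) > 0'
}
```

Usage: vmstat 5 12 | swap_activity
An empty result means no swapping was seen during the sampling period.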

Wireshark

Wireshark, formerly known as Ethereal (and still often referred to that way), is tcpdump's big brother: it is more sophisticated, with far more advanced protocol analysis and reporting. Wireshark has both a GUI and a shell interface. If you do any serious network administration, you should use Wireshark.

Note: If you're using Wireshark, I highly recommend Chris Sanders's Practical Packet Analysis, a great book on how to get the most out of this useful program.


Have A Good Day Ahead...

Monday, January 23, 2012

LVM In Linux

Practical steps to "Extend/Reduce" an LVM partition in RedHat/CentOS 5.x and Fedora Linux, up to version 12 only.

############To Extend LVM Partition:############
Note: No need to unmount the partition.

In my case, I want to extend my LVM partition on lv0 to 200MB. The commands are:
Command:1
lvextend -L 200M /dev/vg0/lv0

Command:2 (Grows the filesystem to fill the extended volume; ext3 can be grown while mounted)
resize2fs /dev/vg0/lv0

Command:3 (Used to get output of the said extended partition)
lvdisplay /dev/vg0/lv0
df -kh

############To Reduce LVM Partition:############

Note: Unmount the said partition first before reducing.

In my case, I have mounted "lv0" on the "/google0" directory, so the commands would be:

Command:1
umount /google0

Command:2
e2fsck -f /dev/vg0/lv0

Command:3 (I have reduced the LVM partition size from 150 MB to 50 MB)
resize2fs /dev/vg0/lv0 50M

Command:4
lvreduce /dev/vg0/lv0 -L 50M

Command:5 (Remount lv0 on the google0 directory)
mount /dev/vg0/lv0 /google0/

Command:6 (Used to get output of the said reduced partition)
lvdisplay /dev/vg0/lv0
df -kh
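The order of the reduce steps matters: the filesystem has to be shrunk before the logical volume, or data at the end of the volume is cut off. Here is a small dry-run sketch that only prints the sequence so you can review it before typing anything as root. The function name lv_shrink_plan is made up, and the device, mount point and size are the ones from my example.

```shell
# Print (do not run) the shrink sequence for an ext3 logical volume.
# Arguments: device, mount point, new size (for example 50M).
lv_shrink_plan() {
    dev=$1 mnt=$2 size=$3
    echo "umount $mnt"
    echo "e2fsck -f $dev"
    echo "resize2fs $dev $size"
    echo "lvreduce -L $size $dev"
    echo "mount $dev $mnt"
}

lv_shrink_plan /dev/vg0/lv0 /google0 50M
```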

Wednesday, June 29, 2011

Backup an entire hard disk using "dd" command


The "dd" command is one of the original Unix utilities and should be in everyone's tool box. It can strip headers, extract parts of binary files and write into the middle of floppy disks; it is used by the Linux kernel Makefiles to make boot images. It can be used to copy and convert magnetic tape formats, convert between ASCII and EBCDIC, swap bytes, and force to upper and lower case.

For blocked I/O, the dd command has no competition in the standard tool set. One could write a custom utility to do specific I/O or formatting but, as dd is already available almost everywhere, it makes sense to use it.

Like most well-behaved commands, dd reads from its standard input and writes to its standard output, unless a command line specification has been given. This allows dd to be used in pipes, and remotely with the rsh remote shell command.

Unlike most commands, dd uses a keyword=value format for its parameters. This was reputedly modeled after IBM System/360 JCL, which had an elaborate DD 'Dataset Definition' specification for I/O devices.

Using "dd" you can create backups of an entire hard disk or just parts of it. This is also useful to quickly copy installations to similar machines. It will only work on disks that are exactly the same in disk geometry, meaning they have to be the same model from the same brand.

Full Hard Disk copy
dd if=/dev/hdx of=/dev/hdy
dd if=/dev/hdx of=/path/to/image
dd if=/dev/hdx | gzip > /path/to/image.gz

hdx could be hda, hdb, etc. In the third example, gzip is used to compress the image, which makes sense if it is really just a backup.

Restore Backup of hard disk copy
dd if=/path/to/image of=/dev/hdx

gzip -dc /path/to/image.gz | dd of=/dev/hdx
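Before pointing dd at a real disk, it is worth rehearsing the backup and restore pipeline on an ordinary file. The sketch below does that with a scratch file standing in for /dev/hdx. The function name and file names are made up for the demo, and nothing outside a temporary directory is touched.

```shell
# Rehearse a compressed dd backup and restore on a scratch file.
dd_roundtrip_demo() {
    work=$(mktemp -d) || return 1

    # Stand-in "disk": a 64 KiB file filled with random bytes.
    dd if=/dev/urandom of="$work/disk.img" bs=1024 count=64 2>/dev/null

    # Back up through gzip, then restore into a second file.
    dd if="$work/disk.img" 2>/dev/null | gzip > "$work/backup.gz"
    gzip -dc "$work/backup.gz" | dd of="$work/restored.img" 2>/dev/null

    # cmp -s exits with status 0 only when both files are identical.
    if cmp -s "$work/disk.img" "$work/restored.img"; then
        echo "restore matches original"
    else
        echo "restore differs"
    fi
    rm -rf "$work"
}

dd_roundtrip_demo
```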

MBR backup

In order to back up only the first 512 bytes of the disk, which contain the MBR and the partition table, you can use dd as well.

dd if=/dev/hdx of=/path/to/image count=1 bs=512

MBR restore

dd if=/path/to/image of=/dev/hdx
Add "count=1 bs=446" to exclude the partition table from being written to disk, since only the first 446 bytes hold the boot code. You can then restore the partition table manually.
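Here is a rehearsal of the boot-code-only restore on a scratch file, so you can see what bs=446 does without risking a real disk. Everything here is an assumption for the demo: the function name, the file names and the random data. Note that conv=notrunc is needed when the target is a regular file, because dd would otherwise truncate it.

```shell
# Rehearse an MBR backup and a boot-code-only restore on a scratch file.
mbr_demo() {
    work=$(mktemp -d) || return 1
    disk="$work/disk.img"

    # Scratch "disk" of 8 sectors, plus a backup of its first sector.
    dd if=/dev/urandom of="$disk" bs=512 count=8 2>/dev/null
    dd if="$disk" of="$work/mbr.img" bs=512 count=1 2>/dev/null
    cp "$disk" "$work/original.img"

    # Simulate a wiped first sector without truncating the rest of the file.
    dd if=/dev/zero of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null

    # Restore only the 446 bytes of boot code; bytes 446-511 are not written.
    dd if="$work/mbr.img" of="$disk" bs=446 count=1 conv=notrunc 2>/dev/null

    # Compare the first 446 bytes of the repaired and original images.
    dd if="$disk" bs=446 count=1 2>/dev/null > "$work/a"
    dd if="$work/original.img" bs=446 count=1 2>/dev/null > "$work/b"
    if cmp -s "$work/a" "$work/b"; then
        echo "boot code restored"
    else
        echo "boot code differs"
    fi
    rm -rf "$work"
}

mbr_demo
```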
Other popular tools are Clonezilla and Mondo Rescue.


Regards,
Nishith N.Vyas

Monday, May 23, 2011

Zombie Process Understanding.

A zombie process is a process that has completed execution but still has an entry in the process table, allowing the process that started it to read its exit status.

When a process ends, all of the memory and resources associated with it are deallocated so they can be used by other processes.

However, the process's entry in the process table remains. The parent is sent a SIGCHLD signal indicating that a child has died; the handler for this signal will typically execute the wait system call, which reads the exit status and removes the zombie.

The zombie's process ID and entry in the process table can then be reused. However, if a parent ignores the SIGCHLD, the zombie will be left in the process table.

In some situations this may be desirable: if the parent creates another child process, the lingering zombie ensures that the new child will not be allocated the same process ID.


If you have zombie processes it means those zombies have not been waited for by their parent.

To remove zombies from a system, the SIGCHLD signal can be sent to the parent manually, using the kill command. If the parent process still refuses to reap the zombie, the next step would be to remove the parent process. When a process loses its parent, init becomes its new parent. Init periodically executes the wait system call to reap any zombies with init as parent.


How to find "zombie" process in Linux ?

Run the "top" command and read the "Tasks" line in the header, which ends with the number of zombie processes. Pressing "z" switches top to color display, which makes the screen easier to read.

How to kill "zombie" process in Linux ?

Run this command.
ps aux | awk '{ print $8 " " $2 }' | grep -w Z


Output would be,
Z 3456
Z 2107
Z 1708
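A zombie has already exited, so running "kill -9" on these PIDs has no effect. Instead, find each zombie's parent and signal it, as described above. For the first zombie:

ps -o ppid= -p 3456
kill -s CHLD PARENT_PID

Replace PARENT_PID with the number printed by the first command. If the parent still does not reap the zombie, kill the parent itself (kill PARENT_PID); init then adopts the zombie and reaps it.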
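Since the parent is the process that has to reap a zombie, it helps to list each zombie together with its parent PID. A minimal sketch, assuming the column order produced by "ps -eo pid,ppid,stat,comm"; list_zombies is a made-up helper name.

```shell
# List zombies as "PID PPID COMMAND" from "ps -eo pid,ppid,stat,comm" text.
# A STAT value beginning with Z marks a zombie; the header line is skipped.
list_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}
```

Usage: ps -eo pid,ppid,stat,comm | list_zombies
The second column of each output line is the parent that should receive SIGCHLD.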

That's it. Enjoy Linux.


Tuesday, May 3, 2011

Commands to check "Disk Usage" in Linux

'du' = Finding the "disk usage"

du

Typing the above at the prompt gives you a list of directories that exist in the current directory along with their sizes. The last line of the output gives you the total size of the current directory including its subdirectories. The size given includes the sizes of the files and the directories that exist in the current directory as well as all of its subdirectories. Note that by default the sizes given are in kilobytes.

du /home/nishith
The above command would give you the directory size of the directory /home/nishith


du -h
This command gives you more readable output than the default one. The option '-h' stands for human-readable format, so the sizes of the files and directories are suffixed with 'K' for kilobytes, 'M' for megabytes, and 'G' for gigabytes.

du -ah
This command would display in its output, not only the directories but also all the files that are present in the current directory. Note that 'du' always counts all files and directories while giving the final size in the last line. But the '-a' displays the filenames along with the directory names in the output. '-h' is once again human readable format.

du -c
This gives you a grand total as the last line of the output. So if your directory occupies 100MB, the last 2 lines of the output (shown here with '-h' added) would be

100M .
100M total


The first line would be the default last line of the 'du' output indicating the total size of the directory, and the second line displays the same size followed by the string 'total'. This is helpful in case you use this command along with the grep command to only display the final total size of a directory, as shown below.

du -ch | grep total
This would have only one line in its output that displays the total size of the current directory including all the subdirectories.

du -s
This displays a summary of the directory size. It is the simplest way to know the total size of the current directory.

du -S
This would display the size of the current directory excluding the size of the subdirectories that exist within that directory. So it basically shows you the total size of all the files that exist in the current directory.

du --exclude='*.mp3'
The above command would display the size of the current directory along with all its subdirectories, but it would exclude all the files whose names match the given pattern. The pattern is a shell-style wildcard, so it is quoted to stop the shell from expanding it. Thus in the above case, if there happen to be any mp3 files within the current directory or any of its subdirectories, their size would not be included while calculating the total directory size.
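A common follow-up question is "which directory is eating the space?". Combining du with sort answers it. The sketch below uses a hypothetical helper name, biggest_entries, and takes a directory plus an optional count.

```shell
# Print the N largest entries directly under a directory, biggest first.
# du -sk reports each entry's total in kilobytes; sort -rn orders them.
# Hidden entries are not matched by the * glob and are left out.
biggest_entries() {
    dir=$1 count=${2:-5}
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

Usage: biggest_entries /home/nishith 10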


'df' = Finding the "disk free" space

df

Typing the above outputs a table consisting of 6 columns. All the columns are very easy to understand. Remember that the size ('1K-blocks'), 'Used' and 'Available' columns use kilobytes as the unit. The 'Use%' column shows the usage as a percentage, which is also very useful.

df -h
Displays the same output as the previous command but the '-h' indicates human readable format. Hence instead of kilobytes as the unit the output would have 'M' for Megabytes and 'G' for Gigabytes.
Example :

I have my Linux installed on /dev/sda1 and I have mounted my Windows partitions as well (automatically, every time Linux boots). So 'df' by default shows me the disk usage of my Linux as well as my Windows partitions, but I am only interested in the disk usage of the Linux partition. This is what I use:

$ df -h | grep /dev/sda1 | awk '{ print $5 }'

This command displays the following on my machine

78%

Please Note: You can find your device name by running "fdisk -l" or "df -kh" on the command line.
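If you script this lookup, "df -P" is a sturdier input than plain "df", because it keeps every filesystem on a single line even when the device name is long. A minimal sketch follows; use_pct is a made-up helper name and /dev/sda1 is just my device.

```shell
# Print the usage percentage for one device from "df -P"-style text on stdin.
# With -P each filesystem stays on one line, so column 5 is the percentage.
use_pct() {
    awk -v dev="$1" '$1 == dev { print $5 }'
}
```

Usage: df -P | use_pct /dev/sda1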



Thanks,
Nishith N.Vyas

Friday, April 29, 2011

Kickstart installation guide for CentOS 5.5

This guide explains how to install and configure a Kickstart server for network-based deployments of CentOS from an NFS share.

The instructions should work the same on RedHat and Fedora.

Requirements:

* CentOS 5.5 DVD
* Static IP address for the Kickstart/DHCP server
* /data partition or any other

Installation Steps:

1. Log in to the CentOS server using the root account.

2. Mount the CentOS DVD: mount /dev/cdrom /media

3. Move to the CentOS RPM folder inside the DVD: cd /media/CentOS

4. Run the commands below to install the TFTP server:
rpm -ivh xinetd-2.3.14-10.el5.i386.rpm
rpm -ivh tftp-server-0.49-2.el5.centos.i386.rpm

(If you get a dependency error, install the necessary packages using "yum".)

5. Run the command below to install the DHCP server:
rpm -ivh dhcp-3.0.5-23.el5.i386.rpm

6. Create new folder for the Kickstart server:
mkdir -p /data/kickstart

7. Edit using "vi", the file /etc/xinetd.d/tftp and change the following settings:
From: disable = yes
To: disable = no
From: server_args = -s /tftpboot
To: server_args = -s /data/kickstart

8. Run the command below to start the TFTP server:
/sbin/service xinetd start

9. Run the command below to make the TFTP server start at boot:
chkconfig xinetd on

10. Edit using "vi", the file /etc/dhcpd.conf and add the following lines:
ddns-update-style none;
allow bootp;
allow booting;
subnet 10.1.1.0 netmask 255.255.255.0 {
option routers 10.1.1.254;
option domain-name-servers 10.1.1.2;
next-server 10.1.1.1;
filename "pxelinux.0";
range dynamic-bootp 10.1.1.200 10.1.1.210;
}
Note 1: Replace 10.1.1.0 with the correct network ID.
Note 2: Replace 255.255.255.0 with the correct subnet mask.
Note 3: Replace 10.1.1.254 with the correct default gateway.
Note 4: Replace 10.1.1.1 with the Kickstart server IP address.
Note 5: Replace 10.1.1.200 with the first IP of the DHCP pool.
Note 6: Replace 10.1.1.210 with the last IP of the DHCP pool.
Note 7: Replace 10.1.1.2 with the correct DNS server.

11. Start the DHCP server
service dhcpd start or /etc/init.d/dhcpd start

12. Run the command below to make the DHCP server start at boot:
chkconfig dhcpd on

13. Copy Boot Files
cp /usr/lib/syslinux/{pxelinux.0,menu.c32,memdisk,mboot.c32,chain.c32} /data/kickstart

14. Create a folder for the PXE menu files:
mkdir -p /data/kickstart/pxelinux.cfg

15. Move to the CentOS DVD root folder:
cd /media

16. Copy vmlinuz and initrd.img from the DVD to the images directory (create the directory first, since it does not exist yet):
mkdir -p /data/kickstart/images
cp /media/images/pxeboot/{vmlinuz,initrd.img} /data/kickstart/images

17. Create the CentOS DVD structure:
cp -r CentOS /data/kickstart/
cp -r isolinux /data/kickstart/
cp -r repodata /data/kickstart/
cp -r images /data/kickstart/

18. Create using "vi", the file /data/kickstart/pxelinux.cfg/default with the following content:
default menu.c32
prompt 0
MENU TITLE PXE Menu
LABEL CentOS
MENU LABEL CentOS
KERNEL images/vmlinuz
append initrd=images/initrd.img vga=normal network ks=nfs:10.1.1.1:/data/kickstart/ks.cfg text
Note: Replace 10.1.1.1 with the Kickstart server IP address.

19. Create an unattended installation script /data/kickstart/ks.cfg

Note: Make sure the file starts with the following lines:
install
nfs --server=10.1.1.1 --dir=/data/kickstart
Note 1: Replace 10.1.1.1 with the Kickstart server IP address.

Note: Make sure that no lines beginning with “cdrom” or “url” exist in the file.

Note:
To review ks.cfg file options, see the link:
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.4/html/Installation_Guide/s1-kickstart2-options.html
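To get started, here is a sketch that writes a bare-bones ks.cfg. Only the first two lines come from this guide. Everything below them (language, keyboard, root password, time zone, partitioning, package group) is a placeholder I picked for illustration, and write_ks_cfg is a made-up helper name. Be careful with "clearpart --all", which wipes every disk on the target machine.

```shell
# Write a minimal ks.cfg for an NFS install.
# Arguments: NFS server IP, exported directory, output file.
# All values after the nfs line are placeholders; adjust them before use.
write_ks_cfg() {
    server=$1 dir=$2 out=$3
    cat > "$out" <<EOF
install
nfs --server=$server --dir=$dir
text
lang en_US.UTF-8
keyboard us
rootpw changeme
timezone UTC
bootloader --location=mbr
clearpart --all --initlabel
autopart
reboot

%packages
@core
EOF
}
```

Usage: write_ks_cfg 10.1.1.1 /data/kickstart /data/kickstart/ks.cfg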

20. Edit using "vi", the file /etc/exports and add the following line:
/data/kickstart *(ro,no_root_squash)

21. Start the NFS service:
service portmap start
service nfs start
chkconfig portmap on
chkconfig nfs on

That's it.

Wednesday, April 20, 2011

Alert "Disk Usage" on your email id

Hello,

The shell script below emails you a "Hard Disk Usage" alert on a regular basis.

Step:1

Create a file named /root/script/diskalert, copy the contents below into it, and make it executable (chmod +x /root/script/diskalert).

######################################################################
#!/bin/sh
# Set the admin email address that should receive the alerts (put your own address here)
ADMIN="abc@xyz.com"
# Set the alert level; 90% is the default (adjust the usage level as needed)
ALERT=90
# -P keeps each filesystem on one line so the columns can be parsed reliably
df -HP | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read -r output;
do
#echo $output
usep=$(echo "$output" | awk '{ print $1 }' | cut -d'%' -f1)
partition=$(echo "$output" | awk '{ print $2 }')
if [ "$usep" -ge "$ALERT" ]; then
echo "Running out of space \"$partition ($usep%)\" on $(hostname) as on $(date)" |
mail -s "Alert: Almost out of disk space $usep%" "$ADMIN"
fi
done
######################################################################
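If you want to try the threshold logic without sending any mail, the parsing part can be pulled out into a small function and fed sample text. This is only a sketch under my own assumptions: over_threshold is a made-up name, the input is "df -P" style output, and mount points are assumed to contain no spaces.

```shell
# Print "PERCENT MOUNTPOINT" for each filesystem at or above a threshold.
# Reads "df -P"-style text on stdin; the header and tmpfs lines are skipped.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 && $1 != "tmpfs" {
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print pct, $6
        }'
}
```

Usage: df -P | over_threshold 90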

Step:2
Edit the "crontab" and add the entry below.

crontab -e
59 23 * * * /root/script/diskalert

Save & Exit (:wq)

Step:3
service crond start
service sendmail start
chkconfig crond on
chkconfig sendmail on

Please Note : You can use any other mail service in your network as per the availability.

Conclusion:
This shell script checks "Disk Usage" daily at 23 Hrs:59 Minutes and sends an alert to the mentioned email id whenever a partition reaches the alert level.

Note: Thanks to "cyberciti.biz" for the nice effort that made my post more powerful.