Disk Cleanup
Contents
- 1 Finding Disk Utilization
- 2 Cleaning Partitions
- 3 Finding Inode Utilization
Finding Disk Utilization
Partitions fill up. It's the way of things. Cleaning things up can be easy once you know where space is being utilized.
Here's a lil' shortcut to quickly finding a partition over 90% (of course, change 90 in this line to whatever threshold you'd like):
df -hP | sed 1d | awk '$5+0 > 90 {print $6 " is at " $5}'
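The same check can be wrapped in a small function so the threshold becomes an argument. This is only a sketch: the name over_threshold is made up for illustration, and it assumes POSIX-format input as produced by df -P (one line per filesystem, with the usage percentage in column 5 and the mount point in column 6).

```shell
# Hypothetical helper: reads `df -P`-style output on stdin and prints the
# mount points whose usage is above the given percentage (default 90).
over_threshold() {
    # "$5 + 0" turns a value such as "95%" into the number 95, so the
    # comparison is numeric rather than a string comparison.
    awk -v limit="${1:-90}" 'NR > 1 && $5 + 0 > limit + 0 { print $6 " is at " $5 }'
}

# Example: df -P | over_threshold 80
```

Mount points containing spaces would be cut off at the first space, so treat the output as a quick overview rather than something to script against.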
First off, du is your friend. Using it will allow you to track disk usage in any partition quite easily. I recommend the following command to check things out.
du -hx --max-depth 1
or even better
du -sk ./* | sort -nr | awk 'BEGIN{ pref[1]="K"; pref[2]="M"; pref[3]="G";} { total = total + $1; x = $1; y = 1; while( x > 1024 ) { x = (x + 1023)/1024; y++; } printf("%g%s\t%s\n",int(x*10)/10,pref[y],$2); } END { y = 1; while( total > 1024 ) { total = (total + 1023)/1024; y++; } printf("Total: %g%s\n",int(total*10)/10,pref[y]); }'
It's easy to see how this technique can be used to show a good breakdown of disk usage. Sometimes you might see disk usage that does not "add up": the usage reported for the partition doesn't match what the du command shows. This typically happens when logfiles are deleted while the server process that writes to them still has the files open. The file will continue to take up disk space until it is closed, even though it no longer appears in the filesystem tree. Once it is closed, the space will be freed. You can identify these deleted files by running the following command.
lsof | grep deleted
It should return output in the following format. (I inserted the column header for reference)
COMMAND   PID   USER  FD   TYPE  DEVICE  SIZE    NODE    NAME
httpd     546   root  3u   REG   8,8     0       46      /tmp/ZCUDfVfcQp (deleted)
httpd     547   root  3u   REG   8,8     0       46      /tmp/ZCUDfVfcQp (deleted)
httpd     548   root  3u   REG   8,8     0       46      /tmp/ZCUDfVfcQp (deleted)
httpd     748   root  3u   REG   8,8     0       46      /tmp/ZCUDfVfcQp (deleted)
mysqld    844   root  6u   REG   8,8     0       17      /tmp/ib8chgxk (deleted)
mysqld    844   root  7u   REG   8,8     3000    25      /tmp/ibMnPjAD (deleted)
mysqld    844   root  12u  REG   8,8     0       30      /tmp/ibMtDLhA (deleted)
mysqld    966   root  6u   REG   8,8     0       17      /tmp/ib8chgxk (deleted)
mysqld    966   root  7u   REG   8,8     3000    25      /tmp/ibMnPjAD (deleted)
mysqld    966   root  12u  REG   8,8     0       30      /tmp/ibMtDLhA (deleted)
httpd     3943  root  3u   REG   8,8     0       46      /tmp/ZCUDfVfcQp (deleted)
cpbandwd  6309  root  1w   REG   8,5     151453  448340  /var/cpanel/updatelogs/update.1162606882.postinstall.log (deleted)
cpbandwd  6309  root  2w   REG   8,5     151453  448340  /var/cpanel/updatelogs/update.1162606882.postinstall.log (deleted)
The first two columns show the name and PID number of the process keeping the files open. If you see that the filesize is large, you can free up the space by killing/restarting the process that is keeping the file open.
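When the list is long, it helps to see the largest deleted-but-open files first. The following is a rough sketch under stated assumptions: the function name deleted_by_size is invented here, and it assumes the 9-column lsof layout shown above (DEVICE in column 6, SIZE in column 7, NODE in column 8). Newer lsof versions may print extra columns, in which case the field numbers need adjusting.

```shell
# Hypothetical helper: reads lsof output on stdin and prints
# "SIZE COMMAND PID NAME" for each deleted file, largest first.
deleted_by_size() {
    awk '/\(deleted\)$/ {
        key = $6 ":" $8              # device + inode identifies the file
        if (!(key in seen)) {        # report each file only once
            seen[key] = 1
            print $7, $1, $2, $9
        }
    }' | sort -rn
}

# Example: lsof | deleted_by_size | head
```

Because each file is listed only once, only the first process holding it is shown; check the full lsof output before deciding what to restart.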
Cleaning Partitions
Once the "offending" files/directories are identified, they can be cleaned. The following sections identify common sources of sizable disk utilization, and how to fix issues that result from that utilization.
/
There are typically only a few situations where the root (/) partition will fill up:
Broken /backup
There are instances where a backup process gets confused and thinks that the backup drive is mounted at /backup when it really is not, so backups are written into the / partition. Run umount /backup a few times to make sure that the backup drive is unmounted, and then check the disk usage in /backup. If there are backups present, that's probably your problem. If the server has a backup drive and should be backing up to that location, removing these backups is advisable. Again, before you remove them it is very important to verify that the backup drive is not mounted.
This can be verified by running:
df -h
It should show something like this:
[root@server ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda6             2.0G  572M  1.4G  30% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sda1             194M   78M  107M  43% /boot
/dev/sda8             864G  274G  547G  34% /home
/dev/sda7             2.0G  135M  1.8G   8% /tmp
/dev/sda2              20G   13G  6.4G  67% /usr
/dev/sda5              20G   14G  4.9G  74% /var
There you can see that all of the partitions are on the drive /dev/sda and there is no /backup listed.
Next, run the command:
fdisk -l
This will list the drives in the server. It should list something like this:
[root@server ~]# fdisk -l
Disk /dev/sda: 147.0 GB, 147015821824 bytes
255 heads, 63 sectors/track, 17873 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14        1283    10201275   83  Linux
/dev/sda3            1284        1537     2040255   82  Linux swap
/dev/sda4            1538       17849   131026140    f  Win95 Ext'd (LBA)
/dev/sda5            1538        2807    10201243+  83  Linux
/dev/sda6            2808        2938     1052226   83  Linux
/dev/sda7            2939        3069     1052226   83  Linux
/dev/sda8            3070       17849   118720318+  83  Linux
Disk /dev/hda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/hda1               1       30401   244196001   83  Linux
Looking at this, it is pretty clear that /dev/sda is the primary drive on the server (you can see all the partitions listed there), which means /dev/hda1 is probably the /backup drive. To check, you can run this:
e2label /dev/hda1
The output of that command looks like this:
[root@server ~]# e2label /dev/hda1
/backup
The letters on the drives are not always the same. A backup drive could be /dev/sdb1, /dev/sda1, /dev/hdb1, etc., so please make sure that you are looking at the partitions and using the e2label command to determine the correct drive. Most of the time the backup drive will be the only partition on a single disk.
You can accidentally change the label of a partition with the e2label command, so make sure you run it exactly as displayed above!
If the backup drive does not have a label it is best to add one now. Run the following command to label the backup drive:
e2label /dev/hda1 /backup
At this point you should check /etc/fstab for an entry for the /backup drive:
vim /etc/fstab
It will look something like this:
LABEL=/       /         ext3    defaults,usrquota             1 1
LABEL=/boot   /boot     ext3    defaults                      1 2
none          /dev/pts  devpts  gid=5,mode=620                0 0
LABEL=/home   /home     ext3    defaults,usrquota             1 2
none          /proc     proc    defaults                      0 0
none          /dev/shm  tmpfs   defaults                      0 0
LABEL=/tmp    /tmp      ext3    defaults,noexec,nodev,nosuid  1 2
LABEL=/usr    /usr      ext3    defaults,usrquota             1 2
LABEL=/var    /var      ext3    defaults,usrquota             1 2
/dev/sda3     swap      swap    defaults                      0 0
In this case the backup drive is not listed, so even if we mount it manually it will not mount itself when the server is rebooted. The line should look like this when you add it to the file (the LABEL value must match the label reported by e2label):
LABEL=/backup /backup ext3 defaults 1 2
If the backup drive is not labeled, you can use the entry below in fstab instead. Labeling the drive is still recommended: if the drive is replaced down the road, the new drive can be labeled /backup and will continue to work with the LABEL entry above.
/dev/hda1 /backup ext3 defaults 1 2
You should now be able to run the command and have it mount to the proper location.
mount /backup
Before you mount the backup drive you should make sure any broken backups are cleaned out of /backup. You should run umount /backup before deleting anything there just to be sure you are not deleting good backups!
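Since deleting from /backup while the real drive is mounted would destroy good backups, it can be worth checking the mount state in a scriptable way before removing anything. Here is a minimal sketch, assuming a Linux system where /proc/mounts lists the active mounts; the function name is_mounted is made up for this example. (On many systems the mountpoint command does a similar job.)

```shell
# Hypothetical helper: succeeds (exit 0) if $1 is listed as a mount point
# in the mounts table $2 (default: /proc/mounts, which is Linux-specific).
is_mounted() {
    awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example:
#   if is_mounted /backup; then
#       echo "/backup is a mounted drive; do NOT clean it out"
#   else
#       du -sh /backup
#   fi
```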
Finding Inode Utilization
Sometimes disk usage errors are not the result of physical disk space being used up, but of the partition running out of inodes. You can see the inode usage of a given drive thus:
df -hi
Should a partition show 100% inode usage, you can quickly find the directories on that partition with the most files in them with this sweet little one-liner:
find . -type d | while IFS= read -r line; do echo "$( find "$line" -maxdepth 1 | wc -l) $line"; done | sort -rn | less
This will give you a nice 'less' screen with the directories in order by file count, where you can pretty easily figure out which ones are eating the inodes. You may wish to drop the pipe to less and redirect to a text file for easier culling.
If / is the partition that has run out of inodes on a dedicated server, though, searching just within /root/ first should save you some time compared with looking inside every single partition at once.
You can also use the -xdev flag for find to keep it from looking on other partitions
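Putting those two ideas together, a function along these lines counts the entries in each directory while staying on one filesystem. It is a sketch only: count_entries is a made-up name, and it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find both do).

```shell
# Hypothetical helper: prints "<entry count> <directory>" for every
# directory under $1 (default: current directory), busiest first,
# without descending into other mounted filesystems.
count_entries() {
    find "${1:-.}" -xdev -type d | while IFS= read -r dir; do
        # -mindepth 1 keeps the directory itself out of its own count
        n=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$dir"
    done | sort -rn
}

# Example: count_entries / | head -20
```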
There is also a script to assist with finding inode usage; however, learning how to use find to locate it doesn't hurt either.
wget -O /scripts/inodes.sh http://layer3.liquidweb.com/scripts/inodes.sh
chmod +x /scripts/inodes.sh
/scripts/inodes.sh
While following these procedures, if you notice that the directory /.cpanel/comet is the primary culprit in inode usage, there happens to be a handy script to clear out unneeded files.
/usr/local/cpanel/bin/purge_dead_comet_files
It is also worth checking the default email accounts on cPanel servers (e.g. /home/$user/mail/cur and /home/$user/mail/new), and the size of the exim queue with exim -bpc.
/lib/modules
On some older machines (FC2 boxes especially), there will be many kernels installed, and their module directories will reside in the / partition under /lib/modules. It is safe to remove older kernels as long as the server is not booted into the kernel you're removing. Just make sure to check grub.conf after you're done removing kernels to make sure that grub still points to a kernel that exists.
The following script can help with this task immensely: http://layer3.liquidweb.com/scripts/kernelcleaner.sh
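Before removing anything by hand, it is useful to list which module directories do not belong to the running kernel. The snippet below is an illustrative sketch, not the kernelcleaner.sh script referenced above; the function name unused_module_dirs is invented, and it only prints names, it never deletes.

```shell
# Hypothetical helper: lists the kernel versions under $1 (default
# /lib/modules) that are NOT the running kernel ($2, default `uname -r`).
unused_module_dirs() {
    running=${2:-$(uname -r)}
    for d in "${1:-/lib/modules}"/*/; do
        [ -d "$d" ] || continue          # glob matched nothing
        name=$(basename "$d")
        [ "$name" = "$running" ] || printf '%s\n' "$name"
    done
}

# Example: unused_module_dirs
```

Anything it lists should still be compared against grub.conf before removal.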
Non-standard directories
Sometimes customers will place data/program directories in the / partition, not knowing that it is formatted with a small filesystem. There's no way around this really except to inform the customer that it is advisable to move their data/program to another partition with more space.
/root
Since root's home directory (/root) is typically on the / partition, it can fill up with downloaded files, saved logs, or anything else that gets placed in /root. Cleaning out unneeded data from this directory can save a lot of space. Frequently, /root/loadMon fills up the fastest. Disable written logs to prevent this from occurring.
/tmp
Typically, there is nothing here that can't be removed except the mysql.sock socket file. After all, it is temporary space.
/boot
The only time I've ever seen the /boot partition fill up is when the system mistakenly mounted /boot at /backup and tried to place backups there. These are obvious and easy to remove.
/usr
There are quite a few things in /usr that can eat up disk space. Most of them are CPanel or apache related.
/usr/src
/usr/src/kernels
Quite frequently, the OS will place kernel source directories under /usr/src/kernels. If the user on the box isn't compiling their own kernels, these sources are typically unneeded. Running the following command will clear the kernel sources if they are provided by the 'kernel-source' RPM package.
rpm -qa kernel-source | xargs rpm -e
If there is no kernel-source package installed, removing these sources should proceed with discretion. Make sure the kernel the server is running isn't custom compiled, and doesn't depend on that source directory.
Other
There may be other source directories in /usr/src. Unless they were put there by Liquid Web and we know that they can be removed, it's best to let them be. If it's imperative to free up the space they're taking, the directories can be tar'd up and (g|b)zipped, and the originals removed. If you do this, leave a note in the account in case this causes an issue.
/usr/local/apache
On a CPanel machine, all of the logs for Apache will be placed under /usr/local/apache.
/usr/local/apache/domlogs
The domain logs (and other CPanel data) reside in /usr/local/apache/domlogs, and are a common source of large disk utilization. Unless the customer is willing to remove the logs or set CPanel to rotate them frequently, moving the directory to another partition and making a symlink is the only option. Fortunately, /home is almost always huge and under-utilized, so moving them is an easy band-aid. First, copy the logs to /home using the following commands.
mkdir /home/domlogs
rsync -avHP /usr/local/apache/domlogs/ /home/domlogs/
After the data is copied, stop apache, and do a final sync of the logs. It is then safe to move the original directory to the side, create the symlink, and restart apache.
service httpd stop
rsync -avHP /usr/local/apache/domlogs/ /home/domlogs/
mv /usr/local/apache/domlogs{,.bak}
ln -s /home/domlogs /usr/local/apache/domlogs
service httpd start
Once you verify that data is being written properly to the new /home/domlogs directory, it is safe to remove the old domlogs directory.
rm -rf /usr/local/apache/domlogs.bak
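The same copy, move aside, symlink pattern comes up for several directories in this article, so here is a generic sketch of it. Assumptions to note: relocate_dir is a made-up name, it uses cp -a rather than the rsync commands shown above (to keep the example dependency-free), and it does not stop or start any services, so that still has to be done around it.

```shell
# Hypothetical helper: copy directory $1 to $2, keep the original as
# "$1.bak", and leave a symlink at the old location.
relocate_dir() {
    src=$1
    dest=$2
    # Refuse to act on something that is missing or already a symlink.
    if [ ! -d "$src" ] || [ -L "$src" ]; then
        echo "refusing: $src is not a real directory" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    cp -a "$src/." "$dest/" || return 1   # copy contents, preserving attributes
    mv "$src" "$src.bak" || return 1      # keep the original until verified
    ln -s "$dest" "$src"
}

# Example (with apache stopped):
#   relocate_dir /usr/local/apache/domlogs /home/domlogs
```

The .bak copy is left in place on purpose so it can be removed only after the new location has been verified.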
/usr/local/apache/logs
The default apache logs and a few other things reside in /usr/local/apache/logs. Large disk utilization in this directory is quite common when the error_log and access_log files aren't being rotated and/or compressed properly. Adding a logrotate script to clean up apache will normally take care of this problem. The default logrotate script is either /etc/logrotate.d/apache or /etc/logrotate.d/httpd. Sometimes both will be present. If both are present, remove one, and replace the contents of the other with the following content.
# You may need to tweak this a bit for the customer's specific domains.
# The below example will rotate all of apache's core logs.
# Take a look at their domlogs and use your judgment to add additional paths to the line at the top.
# This part of the log line should not be added unless they have disabled awstats/cpanellogd:
# /usr/local/apache/domlogs/*.com /usr/local/apache/domlogs/*.net
/usr/local/apache/logs/*log {
    compress
    weekly
    notifempty
    missingok
    rotate 3
    sharedscripts
    postrotate
        /bin/kill -HUP `cat /usr/local/apache/logs/httpd.pid 2>/dev/null` 2> /dev/null || true
    endscript
}
You can force the rotation of the logs by running the following command (replacing [filename] with either apache or httpd, whichever exists).
logrotate -f /etc/logrotate.d/[filename]
/usr/local/cpanel
This directory is a beast. Cpanel keeps so much junk in here it's not even funny. It tends to be large, but it *cannot* be symlinked anywhere. Doing so will defeat the patched suexec that cpanel uses to make the mailman and cgi-sys directories work properly, and will result in internal server errors being displayed when they are accessed.
/usr/local/cpanel-rollback
Check the last-modified times and delete all but the latest one or two sets.
/usr/local/cpanel/3rdparty/mailman
- If the server has even one busy mailing list on it, this directory can take up a ton of space. If you're lucky, only the logs/ directory will be large. This just holds the mailman logfiles, so if it's large and the logfiles are unneeded, they can be discarded.
- The other directory that tends to be large is the archives/ directory. This is customer data, and shouldn't be removed. This is another candidate for symlinking to /home. Please make note that you *cannot* symlink the whole mailman directory to /home. Mailman will break, and you will be hearing from customers. The proper action is to symlink the archives directory to /home.
mkdir -p /home/mailman/archives
rsync -avHl /usr/local/cpanel/3rdparty/mailman/archives/ /home/mailman/archives/
After the data is copied, stop CPanel/mailman and do a final sync of the archives. It is then safe to move the original directory to the side, create the symlink, and restart CPanel/mailman.
service cpanel stop
rsync -avHl /usr/local/cpanel/3rdparty/mailman/archives/ /home/mailman/archives/
mv /usr/local/cpanel/3rdparty/mailman/archives /usr/local/cpanel/3rdparty/mailman/archives.bak
ln -s /home/mailman/archives /usr/local/cpanel/3rdparty/mailman/archives
service cpanel start
Once you verify that data is being written properly to the new /home/mailman/archives directory, it is safe to remove the old archives directory.
rm -rf /usr/local/cpanel/3rdparty/mailman/archives.bak
- A third directory that can, on occasion, get ridiculously large is the data/ directory. This can fill up with pickled held messages. DO NOT JUST RM THEM. Doing so may cause breakage to the mailman interface, and we wouldn't want that. Instead:
cd /usr/local/cpanel/3rdparty/mailman
bin/discard data/heldmsg-<listname>-*
If for some reason there are so many that the bin/discard program chokes on the wildcard expansion, try this fu (note the quotes, which keep the shell from expanding the wildcard before find sees it):
find ./data -name 'heldmsg-<listname>-*' -print | xargs bin/discard
Mad props to mailman for actually knowing about this issue: http://wiki.list.org/pages/viewpage.action?pageId=4030620
As of this writing, that discard script is not entirely reliable.
/usr/local/cpanel/logs
These logs can get quite large without rotation, but can be very useful in investigating many problems. Set up logrotate for cPanel logs through WHM >> "cPanel Log Rotation Configuration" on the server, and set an appropriate threshold.
/usr/local/cpanel/src
This directory is where CPanel stores source code for software it builds. The only directory I've seen inside of it that has any notable size is the 3rdparty directory. It contains sources for third-party applications, and I've never seen an issue from removing all of the contents therein, since upcp will repopulate that directory if it needs to build something there.
rm -rf /usr/local/cpanel/src/3rdparty/*
/usr/local/lp/logs/httpd
This directory holds Mr. Radar logs. These logs can be symlinked to /dev/null. You will see a logfile with a name similar to servXXXXXXX.sn.sourcedns.com.
rm -f /usr/local/lp/logs/httpd/servXXXXXXX.sn.sourcedns.com && ln -s /dev/null /usr/local/lp/logs/httpd/servXXXXXXX.sn.sourcedns.com
(where servXXXXXXX.sn.sourcedns.com is the actual name of the file in the directory)
/usr/local/jakarta
If the user is running Tomcat on their server, one of the log files can grow to be extremely large. To clear this file, do the following:
/usr/local/jakarta/tomcat/bin/shutdown.sh
rm /usr/local/jakarta/tomcat/logs/catalina.out
/usr/local/jakarta/tomcat/bin/startup.sh
/usr/share/clamav
It's been found that the /usr/share/clamav directory can be moved to /home; just create a symlink back to the original location.
mv /usr/share/clamav /home/usr_share_clamav
ln -s /home/usr_share_clamav /usr/share/clamav
/etc/init.d/exim restart
/var
/var/cpanel/bandwidth
Cpanel has given us another reason to clean out /var as of late: the bandwidth data in /var/cpanel/bandwidth can grow large. The fix in this case is to move the directory to /home and symlink it back, starting with the following:
killall cpanellogd
mkdir /home/bandwidth
chown root:wheel /home/bandwidth
chmod 755 /home/bandwidth
rsync -avHl /var/cpanel/bandwidth/ /home/bandwidth
Double-check that tailwatchd is not running (the first command below stops it), and then run:
/usr/local/cpanel/bin/tailwatchd stop
rsync -avHl /var/cpanel/bandwidth/ /home/bandwidth
mv /var/cpanel/bandwidth /var/cpanel/bandwidth.bak
ln -s /home/bandwidth /var/cpanel/bandwidth
/usr/local/cpanel/bin/tailwatchd start
After verifying the new directory is working correctly, remove the old:
rm -rf /var/cpanel/bandwidth.bak
/var/cache/yum
This is where yum stores a lot of its stuff, including downloaded RPMs. It's always completely safe to clean this directory with the command:
yum clean all
/var/log
This directory is a haven for large log files. It is quite often the case that logrotate isn't set to compress log files here, and they grow to great size. This is easily fixed though. If logrotate isn't compressing the logs as they're rotated, you'll see something like this in a directory listing.
-rw------- 1 root root  637071 Jan 24 21:26 messages
-rw------- 1 root root 3563526 Jan 21 04:03 messages.1
-rw------- 1 root root 3805857 Jan 14 04:03 messages.2
-rw------- 1 root root 3421860 Jan  7 04:03 messages.3
-rw------- 1 root root 1019255 Dec 31 04:04 messages.4
Fixing this is a simple three-step process.
- Edit /etc/logrotate.d/syslog to include 'compress' on its own line, as per the following example.
vim /etc/logrotate.d/syslog
/var/log/messages /var/log/secure /var/log/maillog /var/log/spooler /var/log/boot.log /var/log/cron {
    sharedscripts
    compress
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}
- Compress the existing rotated (logfile.n) files, so that logrotate will rotate them properly in the future.
cd /var/log
gzip *.?
- Run logrotate to compress the current logfile, if needed.
logrotate -f /etc/logrotate.d/syslog
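To see what the gzip step would touch before running it, a listing like the following can help. It is a sketch with a made-up function name, uncompressed_rotated, and it assumes rotated logs follow the name.N pattern shown in the listing above (a single trailing digit).

```shell
# Hypothetical helper: lists rotated logs in $1 (default /var/log) that
# end in ".<digit>" and are therefore still uncompressed.
uncompressed_rotated() {
    find "${1:-/var/log}" -maxdepth 1 -type f -name '*.[0-9]' | sort
}

# Example: uncompressed_rotated /var/log
```

Files that already end in .gz are skipped because they no longer match the pattern.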
There may also be large Apache SSL log files present here. Setting up logrotate to compress and rotate the apache default log files, as explained in the /usr/local/apache/logs section, will take care of them.
If space is an issue even after compression, removal of the oldest log files is a viable option. Just make sure to leave the logs for at least the past two weeks if it is possible.
/var/log/audit.d
This directory isn't typically a problem, but on some servers it can be. It's filled up by the audit daemon's binary log files, and I've seen it reach 7GB before on systems with large /var partitions. We've found no use for the audit daemon (and it's harmful in certain instances), so it can safely be disabled, and its logs removed.
/etc/init.d/auditd stop
chkconfig auditd off
rm -rf /var/log/audit.d/*
/var/tmp
This is temporary space, so it's generally safe to delete anything in here except the mysql socket file (mysql.sock).
/var/lib/mysql
This is where mysql keeps its data by default. This can be problematic with our default partitioning system when the customer has large databases. Moving the mysql data directory to a larger partition is the only option. For instructions on how to do this, see Moving the MySQL Data Directory to /home.
Exim Stats ( /var/lib/mysql/eximstats/ )
This can at times grow very large. It will hold stats about exim which most users never interact with. Once you confirm they do not need to use these stats they can be cleaned out.
mysql eximstats
Then if you:
mysql> show tables;
+---------------------+
| Tables_in_eximstats |
+---------------------+
| defers              |
| failures            |
| sends               |
| smtp                |
+---------------------+
You will see the tables these stats make up. You can delete them:
truncate defers; truncate failures; truncate sends; truncate smtp;
This can be accomplished from the command line too:
mysql eximstats -e "truncate defers;truncate failures;truncate sends;truncate smtp;"
Then restart mysql.
If eximstats is especially large and/or the drive is maxed out, using truncate can be a pain. In that case, do the following instead. FYI, do not do this for other databases with similar issues.
mysqldump -d eximstats > /root/eximstats.sql
rm /var/lib/mysql/eximstats/*
mysql eximstats < /root/eximstats.sql
This will remove the information. You can also disable exim stats in WHM -> Service Manager, or change the number of days to keep the stats in Tweak Settings (the default is 90).
/home
If /home is full, there are a few things that will provide some breathing room, but since the majority of the stuff in /home is customer data, there isn't much that can be done except add another drive.
/home/temp
This is a directory we add on setup of the server, and can sometimes contain a large amount of data that can be removed. If something large in this directory hasn't been touched in two weeks or more, I consider it safe to remove.
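The two-week rule of thumb can be expressed with find. This is a sketch only: stale_entries is an invented name, the 14-day default simply mirrors the guideline above, and it lists candidates without deleting them. Note that it looks at the modification time of each top-level entry, not of everything inside it.

```shell
# Hypothetical helper: lists top-level entries in $1 (default /home/temp)
# whose modification time is more than $2 days ago (default 14).
stale_entries() {
    find "${1:-/home/temp}" -mindepth 1 -maxdepth 1 -mtime +"${2:-14}"
}

# Example: stale_entries /home/temp 14 | xargs du -sh
```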
/home/cprestore
This directory can hold some old backups; check the dates and, if they seem old enough, remove them. Also look for old backup files in /home/.
CPanel build directories
Since /home is typically large, CPanel does its large software builds in directories there. If space is a concern, the cpzendinstall/ and cpapachebuild/ directories can be removed.
CPanel account transfer file directories
CPanel places the tar archives it uses for account transfers in /home and uncompresses them there. These files and directories can be removed in a crunch. Files/directories beginning with cpmove or cprestore can be removed safely if they exist.
CPAN directories
In /home, there is quite typically a directory that perl's CPAN system uses to hold its cached information, along with its source and build directories. The source and build directories can be cleaned to save space.
rm -rvf /home/.cpan/build/*
rm -rvf /home/.cpan/sources/*
Directories symlinked from other partitions
Directories symlinked from other filesystems are placed in /home for a reason, but if space is limited on /home, moving them back to their original filesystem (if space permits) can provide some breathing room.
/backup
Since this is typically set up using a dedicated backup drive, it can quite often be highly utilized. However, if it's completely full, it can break the backup processes.
CPanel backups
If the drive is full of CPanel backups, there's not much that can be done except deleting backups. If all of the backup timeframes (daily, weekly, monthly) are enabled, disabling one of them may allow the backup process to complete properly.