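On Linux, deleting a large file with rm does not free its disk space if some process still has the file open. rm only removes the directory entry; the kernel releases the data blocks once the last open file descriptor is closed. If that never happens, the filesystem can fill up to 100% and the system stops working properly.
In this situation the numbers from df and du do not match: df may report the filesystem as 100% full, while du finds only a small amount of space used under the directory tree.
When you see this, you can be fairly sure that some large files were deleted while programs still held them open. Because those descriptors were never closed, the kernel cannot reclaim the space the files occupy.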
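The behaviour described above can be reproduced in a few lines. This is only a minimal sketch, assuming Linux with /proc mounted; the function name `demo_deleted_open_file` and the choice of descriptor 9 are arbitrary and not part of the original case.

```shell
# Sketch: a deleted file stays alive while a descriptor is still open.
# Assumes Linux with /proc mounted; fd 9 is an arbitrary choice.
demo_deleted_open_file() {
    f=$(mktemp) || return 1
    printf 'still here\n' > "$f"
    exec 9< "$f"          # keep a read descriptor open in this shell
    rm -f "$f"            # remove the directory entry only
    if [ -e "$f" ]; then echo "path exists"; else echo "path gone"; fi
    # readlink inherits fd 9, so /proc/self/fd/9 refers to the same open file;
    # on Linux the link target ends with " (deleted)".
    readlink /proc/self/fd/9
    cat <&9               # the data is still readable through the descriptor
    exec 9<&-             # closing the last descriptor lets the kernel free the blocks
}

demo_deleted_open_file
```

The path disappears immediately, but the contents remain readable until the descriptor is closed, which is exactly why df keeps counting the space.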
A concrete case:
[user@l-cloud-logstash1.rd.beta.cn2 /]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/xvda2 9.8G 9.0G 270M 98% /
devtmpfs 7.8G 0 7.8G 0% /dev
tmpfs 7.8G 0 7.8G 0% /dev/shm
tmpfs 7.8G 744M 7.1G 10% /run
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/xvda1 477M 103M 345M 23% /boot
/dev/xvda5 2.9G 143M 2.6G 6% /home
/dev/xvda6 180G 27G 144G 16% /home/xx
tmpfs 1.6G 0 1.6G 0% /run/user/30073
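df shows that / is already 98% full.
Normally we would find the large files with du and delete them, but here du finds no large files that account for the usage: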
[user@l-cloud-logstash1.rd.beta.cn2 /]$ sudo du -sh *
0 bin
101M boot
0 dev
33M etc
29G home
0 lib
0 lib64
16K lost+found
4.0K media
4.0K mnt
4.0K opt
du: cannot access ‘proc/112120/task/112120/fd/4’: No such file or directory
du: cannot access ‘proc/112120/task/112120/fdinfo/4’: No such file or directory
du: cannot access ‘proc/112120/fd/4’: No such file or directory
du: cannot access ‘proc/112120/fdinfo/4’: No such file or directory
0 proc
48K root
760M run
0 sbin
4.0K srv
0 sys
7.2M tmp
du: cannot access ‘usr/lib/python2.7/site-packages/meld3-1.0.2-py2.7.egg’: Input/output error
1.3G usr
329M var
4.0K zookeeper_server.pi
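One thing to keep in mind when reading the listing above: `du -sh *` run from / also descends into other mounted filesystems, so entries such as home, boot and run are not (or not entirely) on /dev/xvda2. A minimal sketch that stays on one filesystem is shown below; the function name `du_same_fs` is hypothetical, and the `-x`, `-k` and `-d` options are assumed to be available (they are in GNU coreutils du).

```shell
# Sketch: per-directory usage without crossing mount points.
# -x: stay on the filesystem of the starting directory
# -k: report sizes in KiB so that sort -n orders them correctly
# -d 1: summarise one level below the starting directory
du_same_fs() {
    du -x -k -d 1 "$1" 2>/dev/null | sort -n
}

# Example (run as root to read every directory):
#   du_same_fs /
```

If the total reported this way is still far below what df shows as used, the difference is a strong hint that the space is held by files that no longer have a path.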
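sudo lsof -n | grep deleted lists every file that is still open even though its directory entry has been deleted. Such files keep occupying disk space (and can keep growing as the process writes to them), but no path points to them any more, which is the root cause of the disk space that mysteriously disappeared.
Be sure to run it with sudo; otherwise lsof cannot show the files opened by other users' processes.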
[user@l-cloud-logstash1.rd.beta.cn2 /]$ sudo lsof -n | grep deleted
rsyslogd 918 root 4w REG 202,2 251207 131434 /var/log/cron (deleted)
rsyslogd 918 root 6w REG 202,2 7784196954 135105 /var/log/messages (deleted)
rsyslogd 918 root 7w REG 202,2 488777 131122 /var/log/kern.log (deleted)
rsyslogd 918 root 8w REG 202,2 385555 135109 /var/log/secure (deleted)
in:imjour 918 987 root 4w REG 202,2 251207 131434 /var/log/cron (deleted)
in:imjour 918 987 root 6w REG 202,2 7784196954 135105 /var/log/messages (deleted)
in:imjour 918 987 root 7w REG 202,2 488777 131122 /var/log/kern.log (deleted)
in:imjour 918 987 root 8w REG 202,2 385555 135109 /var/log/secure (deleted)
rs:main 918 988 root 4w REG 202,2 251207 131434 /var/log/cron (deleted)
rs:main 918 988 root 6w REG 202,2 7784196954 135105 /var/log/messages (deleted)
rs:main 918 988 root 7w REG 202,2 488777 131122 /var/log/kern.log (deleted)
rs:main 918 988 root 8w REG 202,2 385555 135109 /var/log/secure (deleted)
tuned 920 root 3w REG 202,2 5562 131125 /var/log/tuned/tuned.log (deleted)
gmain 920 1400 root 3w REG 202,2 5562 131125 /var/log/tuned/tuned.log (deleted)
tuned 920 1401 root 3w REG 202,2 5562 131125 /var/log/tuned/tuned.log (deleted)
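Sure enough, several deleted files are still held open, including a /var/log/messages of about 7.8 GB (7784196954 bytes), so their disk space has not been released.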
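To estimate how much space these entries pin, the sizes can be added up, counting each file once even though lsof repeats it for every thread. The following is a sketch only: the function name `sum_deleted_bytes` is hypothetical, and it assumes the column layout shown in the listing above (DEVICE, SIZE/OFF, NODE, NAME, then "(deleted)") and paths without spaces.

```shell
# Sketch: sum the sizes of deleted-but-open files from lsof output on stdin.
# Counting from the end of the line makes the optional TID column irrelevant.
sum_deleted_bytes() {
    awk '
        $NF == "(deleted)" {
            key = $(NF-4) ":" $(NF-2)          # DEVICE:NODE identifies one file
            if (!(key in seen)) {
                seen[key] = 1
                total += $(NF-3)               # SIZE/OFF column, in bytes
            }
        }
        END { printf "%.0f\n", total }
    '
}

# Example:
#   sudo lsof -n | sum_deleted_bytes
```

For the listing above the five distinct files add up to 7785328055 bytes (about 7.25 GiB), which is close to the space df reports as freed afterwards.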
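Killing these processes closes their file descriptors and frees the space. Note that kill -9 terminates rsyslogd and tuned immediately; restarting the affected services is a gentler way to get the same effect.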
[user@l-cloud-logstash1.rd.beta.cn2 /]$ sudo lsof -n | grep deleted | awk '{print $2}'| xargs sudo kill -9
[xiaofengh_1@l-cloud-logstash1.rd.beta.cn2 /]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/xvda2 9.8G 1.7G 7.6G 19% /
devtmpfs 7.8G 0 7.8G 0% /dev
tmpfs 7.8G 0 7.8G 0% /dev/shm
tmpfs 7.8G 752M 7.1G 10% /run
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/xvda1 477M 103M 345M 23% /boot
/dev/xvda5 2.9G 143M 2.6G 6% /home
/dev/xvda6 180G 28G 143G 17% /home/xxx
tmpfs 1.6G 0 1.6G 0% /run/user/30073
df now shows / at 19% used. Problem solved.
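When killing the owning process is not acceptable, a commonly used alternative is to truncate the deleted file through its /proc entry. The sketch below is an assumption-laden illustration rather than what was done in this case: the function names `unique_deleted_pids` and `truncate_open_fd` are hypothetical, it relies on Linux /proc, and it behaves best when the writer opened the file in append mode (otherwise the file may become sparse at the old offset on the next write).

```shell
# Sketch: list each owning PID once from lsof output on stdin.
unique_deleted_pids() {
    awk '$NF == "(deleted)" { print $2 }' | sort -un
}

# Sketch: empty a deleted-but-open file without stopping its owner.
# $1 = PID, $2 = descriptor number without the r/w/u suffix (e.g. 6 for "6w").
truncate_open_fd() {
    : > "/proc/$1/fd/$2"
}

# Example, using the PID and descriptor of /var/log/messages from the listing:
#   sudo lsof -n | unique_deleted_pids
#   sudo sh -c ': > /proc/918/fd/6'
```

The second example uses `sh -c` because a shell function defined in the current session is not visible to sudo. The process keeps running and keeps its descriptor; only the blocks already written are released.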