The Enigma of Disappearing Disk Space: Why Space Isn't Freed After Deleting Files in Linux?

Introduction
If you're a Linux user, you might have encountered a perplexing scenario where you delete a large file or folder, expecting to reclaim disk space immediately, only to find that the freed space is not reflected in the available storage. This enigmatic phenomenon can be frustrating and confusing for many users. In this blog, we will unravel the mysteries behind why space isn't immediately freed after deleting files in Linux and explore the inner workings of the filesystem.
Understanding the File System Basics
Before we delve into the specifics, it's essential to understand the fundamentals of how Linux filesystems manage data. Many Linux systems use the "ext" (ext2, ext3, ext4) filesystem family, which employs a hierarchical structure to store files and directories. Each file is described by an inode, which records the file's metadata and points to the data blocks that hold its contents. How these inodes and blocks are allocated and released determines how disk space behaves.
File Deletion in Linux
When you delete a file in Linux, the actual data in the file is not immediately removed from the disk. Instead, the file system marks the corresponding data blocks as available for reuse. The file's directory entry is deleted, making it seem like the file has vanished. However, the content of the file remains intact on the disk until the space is needed for new data.
Consider a system whose root filesystem is almost full:
root@localhost ~ % df -h
Filesystem       Size   Used  Avail  %iused  Mounted on
/dev/disk3s1s1  400Gi  384Gi   16Gi     96%  /
devfs           215Ki  215Ki    0Bi    100%  /dev
In this example, you find that test_log, a file generated by an application, occupies about 40 GB. You perform the following operations to try to release the space:
- Delete test_log file.
rm test_log
- Check the file system disk space usage.
root@localhost ~ % df -h
Filesystem       Size   Used  Avail  %iused  Mounted on
/dev/disk3s1s1  400Gi  384Gi   16Gi     96%  /
devfs           215Ki  215Ki    0Bi    100%  /dev
The output is unchanged: 384Gi is still in use, so deleting the file has not freed any space.
The Role of the "unlink" Operation
The process of deleting a file in Linux involves an operation known as "unlink" or "unlinking." This operation removes the link between the file's name and its inode (a data structure that contains information about the file) and decrements the inode's link count. The inode and its data blocks are released for reuse only when the link count reaches zero and no process still has the file open.
Delayed Disk Space Reclamation
Now that we understand the basics of file deletion, let's explore why disk space isn't immediately reclaimed.
File Still Open: If a program has a file open at the time of deletion, the space won't be freed until the file is closed. Linux allows files to remain open even after they are unlinked, as long as they are still in use by a process. This feature ensures that processes can continue to work with the file until they finish using it.
To check whether this is the cause, use lsof, which lists open file handles, and filter its output for deleted files:
root@localhost ~# lsof -n | grep deleted
COMMAND  PID    USER   FD  TYPE  DEVICE  SIZE         NODE  NAME
httpd    13423  root   5u  REG   253,3   42949672960  17    (deleted)
mysqld   13423  mysql  6u  REG   253,3   0            18    (deleted)
mysqld   13423  mysql  7u  REG   253,3   0            19    (deleted)
As shown in the command output, the test_log file is still being used by the httpd process, which keeps writing data into it. The value (deleted) in the NAME column indicates that the file has been unlinked, and the SIZE column shows that 42949672960 bytes (40 GiB) are still allocated to it. The space is not released because httpd still holds the file open.
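The behavior described above can be reproduced on a small scale. The following is a minimal sketch, assuming a Linux system with /proc mounted; the scratch file and the choice of descriptor number 3 are hypothetical and stand in for test_log and the httpd handle.

```shell
# Sketch: an unlinked file stays readable while a descriptor is open.
set -eu

demo_file=$(mktemp)                  # hypothetical scratch file standing in for test_log
echo "still here" > "$demo_file"

exec 3< "$demo_file"                 # this shell now holds the file open on fd 3
rm "$demo_file"                      # unlink: the name is removed from the directory

if [ -e "$demo_file" ]; then name_exists=yes; else name_exists=no; fi

# On Linux the kernel appends "(deleted)" to the link target of the open descriptor.
fd_target=$(readlink "/proc/$$/fd/3" 2>/dev/null || echo "unavailable (no /proc)")

read -r content <&3                  # the data is still reachable through the descriptor

exec 3<&-                            # closing the last handle lets the kernel free the blocks

echo "name exists: $name_exists"
echo "fd target:   $fd_target"
echo "content:     $content"
```

The name disappears as soon as rm returns, but the content can still be read until the descriptor is closed, which is the same state the httpd process is in.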
So how do we resolve this issue? By forcing disk space reclamation.
While Linux automatically manages disk space for efficiency and performance reasons, you can manually force disk space reclamation in certain situations. Some techniques include:
- Closing Open Handles: Identify and close any open file handles that are holding on to deleted files. In this example, restart the httpd process or, as a last resort, restart the system. Once the handle is closed, the space is released.
- Emptying Instead of Deleting: To avoid the problem in the first place, empty the contents of test_log rather than deleting the file.
# echo "" > /test_log
In this way, the disk space is released immediately and the process can continue writing data to the file. Now run the df -h command to check the space usage again. You can see that 10% of the disk space has been freed:
root@localhost ~ % df -h
Filesystem       Size   Used  Avail  %iused  Mounted on
/dev/disk3s1s1  400Gi  344Gi   56Gi     86%  /
devfs           215Ki  215Ki    0Bi    100%  /dev
Snapshots and Backups: Some filesystems support snapshots or have backup mechanisms that retain deleted files for a certain period. These features are essential for data recovery and version control purposes, but they can temporarily prevent the immediate release of disk space.
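Returning to the open-handle case, the sketch below illustrates why emptying a file releases space while the writer keeps working. It is an illustration under assumptions, not a description of how httpd behaves: the scratch file, descriptor number 4 and the 1 MiB size are hypothetical, and the writer is assumed to open its log in append mode.

```shell
# Sketch: truncating a file in place frees its blocks without closing the writer.
set -eu

log_file=$(mktemp)                        # hypothetical stand-in for test_log
exec 4>> "$log_file"                      # the "application" holds the log open in append mode
head -c 1048576 /dev/zero >&4             # write 1 MiB through the open descriptor

size_before=$(($(wc -c < "$log_file")))   # 1048576 bytes

: > "$log_file"                           # truncate in place; the name and fd 4 stay valid
size_after=$(($(wc -c < "$log_file")))    # 0 bytes

echo "new entry" >&4                      # the writer carries on; append mode writes at the new end
size_final=$(($(wc -c < "$log_file")))    # 10 bytes ("new entry" plus a newline)

exec 4>&-
rm -f "$log_file"

echo "before: $size_before  after truncate: $size_after  after next write: $size_final"
```

If the writer had not opened the file in append mode, its next write would land at its old offset and the file would become sparse rather than restart from zero. For a file that has already been deleted, as in the lsof output earlier, the same truncation can in principle be applied through the /proc/<pid>/fd/<fd> path of the process holding it, using the PID and FD columns that lsof reports.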