How to Analyze and Manage Disk Space on Linux

Although a typical Linux user rarely has to manage disk space, it's still an important skill to learn if you're more than just a casual Linux enthusiast. A thorough understanding of Linux basics includes disk management too. It can help you use the attached disks more efficiently. In this tutorial, we will explore different methods and techniques to manage disk drives attached to your Linux box. Almost all of these commands are available on all the popular Linux distros. If you use a GUI environment on Linux, you may find graphical equivalents of some of these commands. Let's go!

Be careful! Before running any of these disk management commands, double-check that you understand exactly what it will do. Make sure you do not accidentally wipe out or delete your data.

I recommend first trying all the commands on a virtual machine running Linux. This way, you can get familiar with all the methods without risking your precious data.

1. Check Disk Usage

Let's start with the du (disk usage) command. It's one of the most common disk management utilities and is generally used to check which directories and files consume the most space.

Here are a few useful examples of this command.

du -sh /path/to/directory

To find the disk space consumed by any directory, use the command shown above. The -s switch prints a single summarized total for the directory instead of one line per subdirectory. And, the -h switch prints this value in a human-readable format (K, M, G).

Here's another command you may find very useful.

du -sh * | sort -hr

This command lists the disk usage of every file and directory in the current directory, sorted from largest to smallest.
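
One thing to keep in mind is that the shell's * does not match hidden entries, so dot-directories such as .cache are left out of that listing. Here's a minimal sketch of a variant that includes them. It assumes GNU du and sort (--max-depth and sort -h are GNU extensions), and largest_entries is just a helper name made up for this example.

```shell
# largest_entries [directory] [count]
# Prints the biggest entries directly inside a directory, hidden ones included.
# The directory's own total is listed too, normally at the top.
largest_entries() {
    dir=${1:-.}
    count=${2:-10}
    # -a: include files, not just directories
    # -x: stay on one filesystem
    # --max-depth=1: report only the directory and its direct children
    du -axh --max-depth=1 -- "$dir" 2>/dev/null | sort -hr | head -n "$count"
}

# Example: the ten largest entries in your home directory
# largest_entries "$HOME" 10
```

Limiting the output with head keeps the list readable in directories that hold thousands of entries.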

2. Remove Unnecessary Files

No matter which operating system you're using, with the passage of time, a lot of unnecessary files accumulate on the disk taking up space that can be easily reclaimed. Let's take a look at some of these unnecessary file types and how we can get rid of them.

Temporary files - The most common type of unnecessary files is the temporary file created by different applications and the operating system during day-to-day operations.

Sometimes, these temporary files are not purged by the applications that created them. The package manager's download cache is a common example, and the package manager itself can clean it up. Let's take a look.

sudo apt-get autoclean         # For Debian-based distributions
sudo dnf clean packages        # For Fedora-based distributions

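The commands shown above clear the package manager's download cache, not the installed packages themselves. apt-get autoclean removes only those cached package files that can no longer be downloaded, which are mostly obsolete and outdated versions, while dnf clean packages removes the cached packages.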

APT package cache - The next thing you can target is the APT packages cache files generally present in the /var/cache/apt/archives directory.

sudo apt-get clean             # For Debian-based distributions

To delete these package cache files, use the command shown above. You can reclaim a significant amount of disk space after cleaning these files from your system.

Unused dependencies - While installing new applications, additional dependencies are installed automatically to fulfill the requirements of the primary package you want to use. At a later stage, if these dependencies are no longer required, you can remove them to free up disk space.

sudo apt-get autoremove        # For Debian-based distributions
sudo dnf autoremove            # For Fedora-based distributions

To do so, use the commands shown above. They remove the dependencies that were installed automatically and are no longer needed by any installed package. You should generally run them after uninstalling the primary packages you originally installed for your use.
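
Because these commands delete packages, it can help to preview what they would remove first. The sketch below assumes apt-get's --simulate option and dnf's --assumeno option. preview_autoremove is a helper name made up for this example, and on some systems dnf may need sudo even for a preview.

```shell
# preview_autoremove
# Shows which automatically installed packages would be removed, without removing them.
preview_autoremove() {
    if command -v apt-get >/dev/null 2>&1; then
        # --simulate prints the planned actions without changing the system
        apt-get --simulate autoremove
    elif command -v dnf >/dev/null 2>&1; then
        # --assumeno answers "no" at the confirmation prompt, so nothing is removed
        dnf autoremove --assumeno
    else
        echo "No supported package manager found" >&2
        return 1
    fi
}

# Example:
# preview_autoremove
```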

Log files - These files can consume a huge amount of disk space in the long term. If you have limited disk space, keep an eye on these log files and, if required, purge them.

sudo journalctl --vacuum-size=100M     # For systemd-based distributions: shrink the journal to about 100 MB
sudo rm /var/log/*.log                 # Removes every .log file directly under /var/log

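The journalctl command reads the systemd journal, which is stored in binary format, and displays it in a human-readable form. With the --vacuum-size=100M option, it deletes the oldest archived journal files until the journal takes up no more than about 100 MB. And, if required, use the regular rm command to purge plain-text log files from the /var/log/ directory. Check what the wildcard matches first, because the command shown above deletes every file ending in .log in that directory.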
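
Deleting a log file that a running service still has open does not free its space until the service closes the file, so emptying the file in place is often the gentler option. Here's a minimal sketch, assuming plain-text logs and the usual find, du, and sort utilities. largest_logs and empty_log are helper names made up for this example, and myapplication.log is a placeholder. For the systemd journal, journalctl --disk-usage reports how much space it currently takes.

```shell
# largest_logs [directory] [count]
# Lists the biggest *.log files under a directory, sizes in KiB.
largest_logs() {
    find "${1:-/var/log}" -type f -name '*.log' -exec du -k {} + 2>/dev/null |
        sort -nr | head -n "${2:-5}"
}

# empty_log <file>
# Truncates a log to zero bytes while keeping the file, its owner, and its permissions.
empty_log() {
    : > "$1"
}

# Example (root is usually needed under /var/log):
# largest_logs /var/log 5
# empty_log /var/log/myapplication.log
```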

3. Analyze and Delete Large Files

Sometimes, large files hidden deep down in the directory tree occupy a lot of disk space and are hard to find. Identifying and purging these large files can free up significant disk space.

To find files larger than a certain size, use the following command.

find /path/to/search -type f -size +100M -exec ls -lh {} \;

The find command combined with -exec enables you to quickly locate files based on specific criteria and run a command on each of them. For example, here we're listing the files larger than 100MB.

find /path/to/search -type f -size +100M -delete

Here, we're finding all the files larger than 100MB under the given path /path/to/search/ and deleting them right away, without any confirmation. Use deletion commands with utmost care to avert any accidental deletion of important data.
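
Since -delete cannot be undone, a cautious pattern is to run the same search as a plain listing first, review the output, and only then delete. The sketch below assumes GNU find (for -printf). list_large_files is a helper name made up for this example.

```shell
# list_large_files <directory> <size>
# Lists files larger than <size> (find syntax, e.g. 100M), biggest first,
# as "<bytes><TAB><path>". Nothing is deleted.
list_large_files() {
    # -xdev keeps the search on one filesystem
    find "$1" -xdev -type f -size "+$2" -printf '%s\t%p\n' 2>/dev/null | sort -nr
}

# Example: review the list, then delete with the original command if it looks right
# list_large_files /path/to/search 100M
```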

4. Compress and Archive Files

A lot of disk space can be saved by compressing and archiving files. Files that are large and important, but not used on a daily basis, are good candidates for compression.

To create a tar archive:

tar -czvf archive.tar.gz /path/to/directory

Remember, if the directory in question is huge, the compression process may take some time. If you have other work to do in the same terminal, it's best to run this command in the background.

tar -xzvf archive.tar.gz

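To extract the previously compressed archive, use the command shown above. It unpacks into the current directory and overwrites existing files with the same names, so make sure you're not replacing a newer version of a file with the older copy from the archive. Extract it in a new location to prevent any data loss.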
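
To make the advice about extracting in a new location concrete, here's a small sketch that uses tar's -C option to unpack into a separate directory. archive_dir and extract_to are helper names made up for this example, and the paths in the usage lines are placeholders.

```shell
# archive_dir <directory> <archive.tar.gz>
# Archives a directory using a path relative to its parent, so the archive
# contains "directory/..." rather than the full absolute path.
archive_dir() {
    tar -czf "$2" -C "$(dirname -- "$1")" "$(basename -- "$1")"
}

# extract_to <archive.tar.gz> <new-directory>
# Creates the target directory if needed and unpacks the archive inside it.
extract_to() {
    mkdir -p -- "$2" && tar -xzf "$1" -C "$2"
}

# Example:
# archive_dir /path/to/directory archive.tar.gz
# tar -tzf archive.tar.gz              # list the contents without extracting
# extract_to archive.tar.gz /path/to/restore
```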

5. Manage Log Rotation

If not checked, system log files can quickly consume a huge amount of disk space. Enabling log rotation ensures that log files are compressed or deleted after reaching a certain size or age.

This whole process can be automated easily. All you need is the logrotate utility. Configuration files for logrotate are typically found in the /etc/logrotate.d/ directory. Open a new file there for your application and add a block like the following.

sudo nano /etc/logrotate.d/myapplication

/path/to/myapplication/logs/*.log {
    size 100M
    rotate 5
    compress
    missingok
    notifempty
}

A sample configuration is shown above, which you can modify as per your requirements. Here, a log is rotated once its size exceeds 100MB. The rotate 5 directive keeps up to five rotated logs. On the next rotation after that, the oldest one is deleted to make room for the new one.

The compress directive is self-explanatory. The missingok directive ensures you do not get an error message if a log file is missing. And, the last directive, notifempty, skips rotation for a log file that is empty.
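
Before relying on a new configuration, it's worth checking that logrotate reads it the way you intended. Here's a minimal sketch that assumes logrotate's debug mode (-d), which reports what would be rotated without touching any files. check_logrotate_config is a helper name made up for this example.

```shell
# check_logrotate_config <config-file>
# Dry run: prints what logrotate would do for this config; no log is changed.
check_logrotate_config() {
    logrotate -d "$1"
}

# Example (reading system logs usually needs root):
# sudo logrotate -d /etc/logrotate.d/myapplication
```

Keep in mind that the size condition is only evaluated when logrotate itself runs, which is typically once a day, so a log can grow past the limit between runs.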

6. Use Disk Quotas

If multiple users are working on your Linux system, it is advisable to enforce disk quotas on these users and groups. This ensures that nobody can consume excessive disk space unchecked.

To enable disk quotas, you must modify the /etc/fstab file:

sudo nano /etc/fstab

# Add the 'usrquota' or 'grpquota' option to the relevant disk partition:
/dev/sda1  /     ext4   defaults,usrquota  0  1

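Depending on your requirements, add usrquota for per-user quotas, grpquota for per-group quotas, or both. The new options take effect once the filesystem is remounted (or after a reboot). After that, to initialize disk quotas on the filesystem, use the following commands.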

sudo quotacheck -cu /path/to/filesystem
sudo quotaon /path/to/filesystem
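
Once quotas are switched on, limits still have to be assigned to each user. Here's a sketch under a few assumptions: setquota takes block limits in 1 KiB units, in the order soft block limit, hard block limit, soft inode limit, hard inode limit. The user alice and the 5 GiB and 6 GiB figures are made up, and gib_to_blocks is a helper invented here for the unit conversion.

```shell
# gib_to_blocks <GiB>
# Converts GiB to the 1 KiB blocks that setquota expects.
gib_to_blocks() {
    echo $(( $1 * 1024 * 1024 ))
}

# Example: 5 GiB soft limit and 6 GiB hard limit for alice on /, no inode limits (0 0)
# sudo setquota -u alice "$(gib_to_blocks 5)" "$(gib_to_blocks 6)" 0 0 /

# Report current usage and limits for all users
# sudo repquota -a
```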

Remember, if you do not have a working knowledge of how Linux works with disks, I recommend skipping this part. A mistake in /etc/fstab can prevent the system from booting.

7. Monitor Disk Space

Disk space is a limited resource and should be consumed with care. Monitoring disk space on your Linux system is another important task one must perform on a regular basis. It helps in detecting anomalies and potential issues.

df -h

The most popular command to monitor disk space is df, which is generally used with the -h switch to display the total, used, and available space of each mounted filesystem in a human-readable format.
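
For regular monitoring, the df output can be filtered so that only the filesystems that are nearly full show up. Here's a minimal sketch that reads the POSIX output format (df -P) and uses awk. over_threshold is a helper name made up for this example, and it assumes mount points without spaces in their names.

```shell
# over_threshold <percent>
# Reads `df -P` output on stdin and prints "<mount point> <use%>" for every
# filesystem at or above the given usage percentage.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # strip the percent sign for a numeric comparison
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Example: show filesystems that are at least 90% full
# df -P | over_threshold 90
```

A one-liner like this can be dropped into a cron job so that you hear about a filling disk before it becomes a problem.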

Conclusion

Efficiently managing disk space is essential for maintaining optimal performance and stability in a Linux environment. By regularly checking disk usage, removing unnecessary files, analyzing and deleting large files, compressing archives, managing log rotation, and utilizing disk quotas, administrators can ensure sufficient space and prevent disk-related issues.

Applying these techniques and commands will help keep your Linux system running smoothly and minimize the risk of disk space-related problems.