When it comes to desktop OS market share, Linux may not be the number one choice, but it still has millions of users across the world. Like any other operating system, it too is prone to errors. If you are new to Linux and struggle to debug and resolve errors in the Linux environment, here's a guide to help you tackle the most common errors its users may encounter. Whether these errors are related to system crashes or an incorrectly configured application, this tutorial walks you through a step-by-step method to handle them. You can apply these methods to all popular distros with little or no change.
Working knowledge of the command line environment is required to apply these methods. In most cases, you may require root privileges to complete the execution of the commands.
If you want to use a Linux system like a pro, learn the debugging methods given below. They'll give you an edge over other users who struggle to resolve Linux-related problems. Let's get started!
Step 1: Understand the Problem
Before you attempt to resolve a Linux error, it's imperative to first understand it correctly. Until you know the answers to the following questions, you will struggle to resolve any error in the Linux environment. Ask yourself these questions before you move ahead with finding a solution.
- What task were you executing or running when the error occurred?
- Can you reproduce this error?
- Have you recently made substantial changes to your Linux system?
Whenever an error occurs, it's always a good practice to take note of the following:
- Write down the exact text of the error message. It may include error codes that may be required during the debugging process.
- Take note of any unusual behaviour of the application or the system in general that differs from how it behaved before the error.
Jot down all this information in a text editor of your choice and keep it ready for future reference.
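If you would rather capture these details automatically, a small wrapper can do the note-taking for you. The following is a minimal sketch, not a standard tool: record_error is a hypothetical helper name and the NOTES_FILE location is an assumption you can change. It runs a command and, if the command fails, appends the time, the command line, the exit status, and the exact error text to the notes file.

```shell
# Hypothetical helper: the function name and the notes location are assumptions.
NOTES_FILE="${NOTES_FILE:-$HOME/linux-errors.txt}"

record_error() {
    # Run the given command, capturing its stderr in a temporary file
    err_file=$(mktemp)
    "$@" 2>"$err_file"
    rc=$?
    # Still show the error text to the user
    cat "$err_file" >&2
    if [ "$rc" -ne 0 ]; then
        # Append a dated entry with the exact command, status, and message
        {
            echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
            echo "command: $*"
            echo "exit status: $rc"
            echo "stderr:"
            cat "$err_file"
        } >>"$NOTES_FILE"
    fi
    rm -f "$err_file"
    return "$rc"
}

# Usage: record_error ls -la /nonexistent/directory
```

Successful commands leave the notes file untouched, so it only grows when something actually goes wrong.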
Step 2: Check System Logs
Fortunately, Linux has rich logging support. These logs contain a wealth of information about system events and errors; all you need is to know where to find them. Let's learn how to check these logs to find the root cause of an error.
1. View the System Log with journalctl
Most modern Linux distributions use the systemd software suite to manage the system and its services. It extensively logs information related to events and services. To access its logs, you can use the journalctl command.
sudo journalctl
# Filters you can apply through this command
# Filter by time range:
sudo journalctl --since "2024-11-01" --until "2024-11-07"
# Filter for a specific service:
sudo journalctl -u nginx.service
# Check logs in real-time:
sudo journalctl -f
This is just the tip of the iceberg. Check the command's man page to see all the available options for log filtering. This command can give you insights about Linux events in chronological order.
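When you are hunting for a specific failure, it often helps to combine filters. The sketch below assumes a systemd-based system. unit_errors is a hypothetical helper name, while the options it passes are standard journalctl flags: -b (current boot), -p err (priority "err" and worse), and -u (unit). Depending on your setup, you may need to run it with sudo or be a member of a group that can read the journal.

```shell
# Show only error-level (and worse) messages from the current boot for one unit.
# unit_errors is a hypothetical helper name, not part of systemd.
unit_errors() {
    unit="$1"
    if ! command -v journalctl >/dev/null 2>&1; then
        echo "journalctl not found; is this a systemd-based system?" >&2
        return 1
    fi
    # --no-pager prints straight to the terminal so the output can be piped
    journalctl --no-pager -b -p err -u "$unit"
}

# Usage: unit_errors nginx.service
```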
2. Check the Kernel Log with dmesg
If you have to deal with device driver issues, hardware issues, or low-level system messages, the dmesg command is your best bet. It displays the Linux kernel message buffer. Here's how to use it.
dmesg | less
# To filter results for specific keywords or phrases:
dmesg | grep -i "X-Server"
You can also redirect the output to a text file and analyze the log later.
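As a rough illustration of saving kernel messages for later, the sketch below writes only errors and warnings to a file. It assumes the util-linux version of dmesg, which supports the --level option. The function name and the default file name are made up for this example. On some systems, reading the kernel buffer requires root.

```shell
# Hypothetical helper: save kernel errors and warnings to a file.
save_kernel_warnings() {
    out_file="${1:-kernel-warnings.txt}"
    # --level limits output to the listed priorities (util-linux dmesg)
    if dmesg --level=err,warn >"$out_file"; then
        echo "saved kernel errors and warnings to $out_file"
    else
        echo "dmesg failed; it may need root on this system" >&2
        return 1
    fi
}

# Usage: save_kernel_warnings /tmp/kernel-warnings.txt
```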
3. Explore Application-Specific Logs
Apart from system logs, you also have the option to analyze logs created by applications. These application-specific logs help you pinpoint the exact reason for the error. Here are some common log file locations you can use to identify an error.
- /var/log/syslog: Contains general system logs on Debian-based Linux distros.
- /var/log/messages: Contains general system logs on CentOS-based Linux distros.
- /var/log/auth.log: Here, you'll get authentication-related logs on Debian-based distributions.
- /var/log/boot.log: And, this is the one containing boot process logs.
Here are some examples of how you can examine these logs:
sudo tail -f /var/log/syslog
# Use `grep` utility to search for specific keywords in the log files:
sudo grep "ERROR" /var/log/syslog
Applications like Apache or Nginx maintain separate logs, which you can examine to find the exact reason for an error.
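Before reading a large log line by line, a quick count of suspicious entries can tell you whether the file is worth a closer look. Here is a minimal sketch. summarize_log is a hypothetical helper, and the two keywords are only an assumption about how your application labels its messages. The match is case-insensitive, so "WARN" also counts lines that contain "warning".

```shell
# Hypothetical helper: count lines mentioning errors and warnings in a log file.
summarize_log() {
    log_file="$1"
    for level in ERROR WARN; do
        # grep -c prints the number of matching lines; -i ignores case.
        # It exits non-zero when nothing matches or the file is unreadable.
        count=$(grep -ci -- "$level" "$log_file" 2>/dev/null) || count=0
        echo "$level: $count"
    done
}

# Usage: summarize_log /var/log/syslog
```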
Step 3: Use Debugging Tools
Linux provides several tools and applications to analyze, monitor, and debug errors. Let's take a look at some of the most common and popular tools you can use to debug the errors on your Linux system.
1. Monitor System Performance with htop
You can use the htop command to get insights about the system resources (CPU, memory, and swap) and the processes on a Linux system. It gives you an interactive interface where the data is updated in real time.
Through this tool, you can monitor and analyze the following:
- Identify the processes that are consuming excessive system resources.
- Identify orphaned or zombie processes which are marked with the Z symbol.
- Find high wait times associated with active I/O processes.
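htop is interactive, so it is awkward to use in scripts or over a slow connection. As a non-interactive alternative, you can filter the output of ps. The sketch below is an assumption-laden example rather than a feature of htop: filter_zombies is a hypothetical helper, and it expects the process state in the third column, as produced by the ps invocation shown in the usage comment.

```shell
# Hypothetical helper: keep the header row plus any row whose
# third column (process state) starts with Z, which marks a zombie.
filter_zombies() {
    awk 'NR == 1 || $3 ~ /^Z/'
}

# Usage: ps -eo pid,ppid,stat,comm | filter_zombies
```

The PPID column in the result points at the parent process that has not yet collected the zombie, which is usually where the investigation should continue.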
2. Trace System Calls with strace
New developers often overlook the strace command, even though it is one of the best tools for finding errors in a program. It prints every system call the program makes and every signal it receives during its life cycle.
Here are some examples showcasing how you can use this command for debugging errors:
strace -o analyze.txt <command>
# Tracing errors when listing a non-existent directory
strace -o analyze.txt ls -la /nonexistent/directory
The -o option writes the strace output to the 'analyze.txt' text file. Storing the output in a file enables you to analyze it at a later time. Feel free to change the file name as per your preference.
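A trace file can run to thousands of lines, and most of them describe calls that succeeded. In strace output, a failed call typically ends with "= -1" followed by an error name such as ENOENT. The following sketch relies on that convention to pull out only the failures. failed_syscalls is a hypothetical helper name.

```shell
# Hypothetical helper: list the system calls that failed in a saved trace.
failed_syscalls() {
    trace_file="${1:-analyze.txt}"
    # Failed calls end in "= -1 ERRNAME (description)"
    grep -E -- '= -1 E[A-Z]+' "$trace_file"
}

# Usage:
#   strace -o analyze.txt ls -la /nonexistent/directory
#   failed_syscalls analyze.txt
```

Not every failed call is a bug; programs routinely probe for optional files. Look for the failure that mentions the path or resource named in your error message.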
3. Debug Programs with gdb
If you want to debug compiled programs on Linux, GNU Debugger (gdb) is your best option. It's a powerful tool to analyze and debug the program during its execution.
# Execute the program through `gdb` for tracing and debugging:
gdb <program_name>
Once within the gdb environment, you can use the following commands:
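- run: Fire this command to start program execution.
- backtrace: Use it to display the call stack, that is, the chain of function calls that led to the current point. Where the strace command traces system calls, backtrace traces function calls.
- break: Set breakpoints in your program to halt execution at given points.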
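If a program crashes reliably, you do not have to type these commands by hand each time. gdb can run a fixed list of commands and exit. The sketch below uses gdb's -batch, -ex, and --args options. crash_backtrace is a hypothetical wrapper name. The output is far more useful when the program was compiled with debugging symbols (for example, with the compiler's -g flag).

```shell
# Hypothetical wrapper: run a program under gdb and print the call stack
# at the point where it stops (for example, on a crash).
crash_backtrace() {
    # -batch exits when the commands finish;
    # --args passes everything after it to the program being debugged
    gdb -batch -ex run -ex backtrace --args "$@"
}

# Usage: crash_backtrace ./<program_name> <arguments>
```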
4. Analyze Disk Usage
Sometimes, the errors are related to disk space or disk-related issues. Correctly monitoring and analyzing disks on your Linux machine is an important skill to debug errors related to these storage devices.
Here's a complete guide to disk management in Linux. It's a comprehensive tutorial that equips you with all the knowledge required to monitor and manage disks on a Linux system.
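As a quick starting point, the following sketch shows which entries in a directory take up the most space. largest_entries is a hypothetical helper and the defaults are arbitrary. It uses du and sort, so it reports sizes in kilobytes, and it skips hidden files because the shell's * pattern does not match them.

```shell
# Hypothetical helper: list the biggest files and directories inside a directory.
largest_entries() {
    dir="${1:-.}"
    limit="${2:-10}"
    # du -sk: one total per entry, in kilobytes; sort -rn: biggest first
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$limit"
}

# Usage: largest_entries /var 5
```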
Step 4: Research and Use Online Resources
No one is perfect or has all the answers. If you get stuck and struggle to find a solution for an error, do not hesitate to go to the online Linux community. Here are some handpicked online resources you can use to seek help when debugging Linux errors.
Search Engines
One of the largest developer communities resides on Stack Overflow. You can either search the existing solutions or ask a new question about your error. It's always a good practice to search first, as there is a high chance that someone has already asked about the same problem.
Forums and Wikis
There are several good forums and Wikis where you can find answers to your queries. For example:
- LinuxQuestions.org: A large forum dedicated to questions related to Linux. Register and benefit from its large user base.
- Ubuntu Community Hub: An active community of Ubuntu users. The forum is run and managed by Ubuntu and is a great place to seek help related to Ubuntu issues.
- Red Hat Community: If you are running Red Hat on your computer, this is the community to resolve all your Linux-related queries.
Man Pages and Documentation
Last but not least is the native documentation of each application. In the Linux world, this documentation is called man pages. Simply use the following commands to open the built-in docs of an application.
man <command>
info <command>
These man pages are like a user manual for an application or a command. They help you resolve errors caused by incorrect switches or parameters in a command.
Step 5: Revert Changes or Use Backups
One of the useful strategies to tackle errors is to roll back the changes made earlier. You can do this either through commands or by using backups to restore the original state. Let's see how to do it.
Uninstall Problematic Packages
If you've identified a problematic package, you must remove it to get rid of errors.
sudo apt remove <package> # On a Debian/Ubuntu system
sudo yum remove <package> # On a RHEL/CentOS system
Restore Configuration Files
Some applications and package managers keep a backup copy of a configuration file, and it is good practice to create one yourself before you edit it. If a backup exists, you can use it to overwrite the changed configuration file. Here's an example:
sudo cp /etc/<config_file_name>.bak /etc/<config_file_name>
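The restore only works if a backup exists, so it pays to make one before every edit. The sketch below pairs the two steps. Both function names are hypothetical, and the .bak suffix simply follows the example above. Files under /etc usually need root, so in practice you would run these commands with sudo.

```shell
# Hypothetical helpers: back up a configuration file before editing it,
# and restore it later if the edit causes trouble.
backup_config() {
    cfg="$1"
    # -p preserves mode and timestamps (and ownership where permitted)
    cp -p "$cfg" "$cfg.bak"
}

restore_config() {
    cfg="$1"
    if [ ! -f "$cfg.bak" ]; then
        echo "no backup found at $cfg.bak" >&2
        return 1
    fi
    # Show what will change before overwriting (diff exits 1 on differences)
    diff -u "$cfg" "$cfg.bak" || true
    cp -p "$cfg.bak" "$cfg"
}

# Usage:
#   backup_config /etc/<config_file_name>
#   restore_config /etc/<config_file_name>
```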
Use System Snapshots
On an Ubuntu computer, you can use the Timeshift tool to revert the system to a previous state.
sudo timeshift --restore
If you are not using this tool on your Ubuntu system, install it without giving it a second thought.
Common Linux Errors and Solutions
Let's look at some of the most common Linux errors you may encounter. Most of them are easy to tackle once you know the right commands.
1. Permission Denied
Sometimes you may get the 'Permission Denied' message. To rectify this problem, all you need to do is to modify the file permissions of the program in question.
chmod +x <file_name>
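It is worth looking at the current permissions before changing them, because the message can also mean that the file belongs to another user. The following sketch shows one cautious way to do it. ensure_executable is a hypothetical helper, and it uses chmod u+x, which grants execute permission to the file's owner only, rather than the broader +x.

```shell
# Hypothetical helper: inspect a file and make it executable for its owner.
ensure_executable() {
    target="$1"
    if [ ! -e "$target" ]; then
        echo "$target does not exist" >&2
        return 1
    fi
    # Show the current owner and mode bits before changing anything
    ls -l "$target"
    if [ ! -x "$target" ]; then
        # u+x grants execute permission to the owner only
        chmod u+x "$target"
    fi
}

# Usage: ensure_executable <file_name>
```

If ls -l shows that someone else owns the file, changing the mode may itself be denied; in that case the fix is to run the command with sudo or to ask the owner.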
2. Command Not Found
This error occurs when you try to use a package that's missing on your system or the correct path to the executable file is missing in the $PATH environment variable. To fix it, do the following:
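# Install the missing package (on a Debian/Ubuntu system)
sudo apt install <package>
# List the directories that are searched for executables
echo "$PATH"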
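To tell the two causes apart, you can ask the shell where it finds a command. The sketch below uses the shell built-in command -v for that. explain_command is a hypothetical helper name.

```shell
# Hypothetical helper: report where a command lives, or list the
# directories that were searched when it cannot be found.
explain_command() {
    cmd="$1"
    if location=$(command -v "$cmd"); then
        echo "$cmd found: $location"
    else
        echo "$cmd not found in PATH; directories searched:"
        # Print one PATH entry per line
        echo "$PATH" | tr ':' '\n'
        return 1
    fi
}

# Usage: explain_command htop
```

If the program is installed but its directory is missing from the list, adding that directory to PATH in your shell startup file resolves the error; if it is not installed at all, install the package.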
3. Disk Full
If you get errors or warnings related to limited or no space left on your disk drive, use the following commands to resolve this issue.
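# Show used and free space for each mounted filesystem
df -h
# Delete a file you no longer need (irreversible, so double-check the path)
sudo rm -rf /path/to/large/file_name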
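Before deleting anything, you need to know which files are actually large. The following sketch lists files above a size threshold. list_large_files is a hypothetical helper, the default directory and threshold are arbitrary assumptions, and the k size suffix assumes GNU find, which most Linux distros ship.

```shell
# Hypothetical helper: list files larger than a given size in kilobytes.
list_large_files() {
    dir="${1:-/var/log}"
    min_kb="${2:-102400}"   # default threshold: 100 MiB
    # -xdev keeps the search on one filesystem; -type f matches regular files
    find "$dir" -xdev -type f -size +"${min_kb}k" 2>/dev/null
}

# Usage: sudo sh -c '. ./helpers.sh; list_large_files /var 51200'
#        (helpers.sh is an assumed file name for wherever you save the function)
```

Review the list by hand before removing anything; log files that are still open by a running service are often better truncated or rotated than deleted.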
4. Dependency Issues
Sometimes, the installation of the primary package becomes a nightmare due to dependency conflicts. To resolve such issues, use the following command.
sudo apt --fix-broken install
5. Network Connectivity Issues
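If you are experiencing network issues on your Linux system, try the following commands to diagnose and fix them.
# Send four probes and stop (without -c, ping runs until interrupted)
ping -c 4 google.com
# Restart the network service
sudo systemctl restart NetworkManager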
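A failed ping to a host name can mean two different things: either the network is down or name resolution is broken. The sketch below separates the two by pinging an IP address first. diagnose_network is a hypothetical helper, 8.8.8.8 (Google Public DNS) and google.com are just convenient test targets, and the -W timeout option assumes the Linux iputils version of ping. Some networks block ping altogether, so treat the result as a hint rather than proof.

```shell
# Hypothetical helper: distinguish "no connectivity" from "DNS not working".
diagnose_network() {
    if ! ping -c 1 -W 2 8.8.8.8 >/dev/null 2>&1; then
        echo "no route to 8.8.8.8: check cables, Wi-Fi, and the output of 'ip route'"
    elif ! ping -c 1 -W 2 google.com >/dev/null 2>&1; then
        echo "IP traffic works but names do not resolve: check your DNS settings"
    else
        echo "connectivity and name resolution look fine"
    fi
}

# Usage: diagnose_network
```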
Conclusion
Debugging Linux errors doesn’t have to be daunting. By systematically analyzing the issue, utilizing logs and debugging tools, and leveraging community resources, you’ll quickly become proficient at troubleshooting.
Remember, every error you resolve enhances your Linux skills and confidence!