Tech Decoded

7 Common Linux Server Problems and Their Solutions: A Troubleshooting Guide

02 April 2025

By Peter Nørgaard


Server problems happen to everyone, even when using Linux servers that are known for being reliable and able to handle heavy workloads. If these issues aren't fixed quickly, they can cause downtime or create security risks.

 

This guide shows you how to fix the most common Linux server issues. Each solution includes clear steps and real examples. You'll learn to spot problems early and keep your servers working well.

 

 

At a Glance

| Problem | Key Symptoms |
| --- | --- |
| 1. Network Connectivity Woes | SSH failure, DNS errors, packet drops |
| 2. Disk Space Troubles and Filesystem Clutter | Insufficient space, disk errors |
| 3. High CPU or Memory Usage | Sluggish response, CPU spikes |
| 4. Service Failures and Configuration Pitfalls | Services not starting, misconfiguration errors |
| 5. Permissions and Ownership Blunders | “Permission denied,” inability to modify files |
| 6. Security Breaches and Intrusions | Unknown processes, suspicious logs |
| 7. Boot and Kernel Errors | Kernel panic, failure to boot |

 

 

1. Network Connectivity Woes

 


 

“In any network, the first sign of trouble often reveals itself when you can’t connect to the server.” – Casey L., Network Engineer

 

Nothing is more frustrating than not being able to SSH into your Linux server or losing your remote connection. Fortunately, diagnosing errors such as an SSH “Connection refused” starts with verifying your network configuration.

 

Common Symptoms

 

  • Inability to ping or connect via SSH

  • “Connection refused” or “Host unreachable” errors

  • Slow data transfers or frequent packet drops

 

Possible Root Causes

 

  • Misconfigured IP settings (typos in IP, gateway, or DNS)

  • Firewall blocking ports (e.g., port 22 for SSH)

  • DNS resolution issues

  • Physical cable or NIC hardware failure

 

Step-by-Step Troubleshooting

 

Check Interface Settings

  • Use ip addr (or the older ifconfig) to confirm the IP address and netmask, and ip route to confirm the default gateway.

Ping External Addresses

  • Run ping 8.8.8.8 or a known server to test basic IP connectivity.

Verify DNS

  • Check /etc/resolv.conf for correct DNS entries, or run nslookup or dig.

Look for Firewall Issues

  • Confirm that iptables, nftables, firewalld, or ufw rules are not blocking SSH or other critical ports.

Physical Inspection

  • Ensure cables are plugged in; check NIC lights or switch port logs for errors.
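
The checks above can be chained so that the first failing layer is reported. The following is a minimal sketch, not a definitive tool: check_layer and network_triage are hypothetical helper names, example.com is a placeholder hostname, and it assumes the iproute2 ip command, iputils ping, and getent are installed.

```shell
#!/usr/bin/env bash
# Sketch: walk the network checks in order and stop at the first layer that fails.
# check_layer and network_triage are hypothetical names, not standard tools.

# Run a command quietly and print PASS or FAIL with a label.
check_layer() {
    local label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        printf 'PASS  %s\n' "$label"
    else
        printf 'FAIL  %s\n' "$label"
        return 1
    fi
}

network_triage() {
    local target_ip=${1:-8.8.8.8}        # public address used earlier in this guide
    local target_name=${2:-example.com}  # assumption: any name you expect to resolve
    check_layer "interface has a global IPv4 address" \
        sh -c 'ip -4 addr show scope global | grep -q inet' || return 1
    check_layer "default route present" \
        sh -c 'ip route show default | grep -q default' || return 1
    check_layer "ping $target_ip" ping -c 3 -W 2 "$target_ip" || return 1
    check_layer "DNS lookup of $target_name" getent hosts "$target_name" || return 1
}
```

Running network_triage with no arguments prints one line per layer. A FAIL on the ping step with the earlier steps passing points toward the gateway, firewall, or upstream link rather than the local interface.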

 

Quick Solutions

 

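  • Correct the IP/gateway settings, then restart the networking service. The unit name depends on the distribution: systemctl restart networking (Debian with ifupdown), systemctl restart NetworkManager, systemctl restart systemd-networkd, or systemctl restart network (RHEL/CentOS 7 and earlier).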

  • Update DNS entries as needed.

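  • Open the necessary ports in the firewall, for example ufw allow ssh on Ubuntu, or firewall-cmd --permanent --add-service=ssh followed by firewall-cmd --reload on firewalld systems.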

 

Case Study: A small enterprise found after a reboot that its new gateway IP had been typed incorrectly; running ip route exposed the wrong address. A quick fix in /etc/network/interfaces (Debian/Ubuntu) or /etc/sysconfig/network-scripts/ifcfg-ethX (RHEL/CentOS) solved the problem instantly.

 

2. Disk Space Troubles and Filesystem Clutter

 


 

“When a disk fills, services start to fail, logs can’t be written, and everything can spiral out of control. It’s a race against time.” – Lisa M., Systems Administrator

 

When you run out of disk space, your Linux server throws “No space left on device” errors or fails to launch essential services. This can be due to logs piling up in /var/log or large backup files building up in unexpected folders.

 

Telltale Signs

 

  • “No space left on device” or “Disk is full” terminal messages

  • Services failing to start, e.g., MySQL or Apache

  • Logs in /var/log show repeated “disk full” warnings

 

Diagnosis Steps

 

Check Usage with df

  • Run df -h to see overall disk usage by partition.

Locate Large Files

  • Use du -sh /var/log/* or ncdu to identify massive logs or backups.

Purge Old Logs

  • Rotate or clean logs in /var/log if they are excessively large.

Check for Large Orphaned Files

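  • A process may still hold a deleted (unlinked) log file open, so its space is not released until the process closes it. Use lsof +L1 or lsof | grep deleted to find such files, then restart or reload the owning service.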
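
As a rough illustration of the df step, the sketch below filters df output down to the mount points above a usage threshold. full_mounts is a hypothetical helper name, and the parsing assumes the POSIX layout of df -P and mount points without spaces.

```shell
#!/usr/bin/env bash
# Sketch: print mount points whose usage is at or above a threshold.
# Reads `df -P` output on stdin, so it also works on saved output.
# full_mounts is a hypothetical helper name.
full_mounts() {
    local threshold=${1:-90}
    awk -v limit="$threshold" '
        NR > 1 {                       # skip the header line
            use = $5
            sub(/%/, "", use)          # "93%" -> "93"
            if (use + 0 >= limit + 0) print $6, $5
        }'
}
```

A typical call would be df -P | full_mounts 85. Once a full partition is known, du -xh --max-depth=2 /var | sort -rh | head -n 15 (GNU du and sort) narrows the search to the largest directories on that filesystem.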

 

Quick Solutions

 

  • Delete or archive old logs. Configure log rotation in /etc/logrotate.conf.

  • Use compression for large archives.

  • Expand partitions if possible or add new disk volumes.

  • Test the rotation setup with a dry run (logrotate -d /etc/logrotate.conf), and use logrotate --force /etc/logrotate.conf when you need an immediate one-off rotation.
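
To make the log rotation advice concrete, here is a small sketch that generates a logrotate policy for one application. The function name write_logrotate_policy, the application name myapp, and its log path are all placeholders; the directives themselves (daily, rotate, compress, delaycompress, missingok, notifempty, copytruncate) are standard logrotate options.

```shell
#!/usr/bin/env bash
# Sketch: emit a logrotate policy for a hypothetical application log.
# Usage: write_logrotate_policy LOG_GLOB [ROTATIONS_TO_KEEP]
write_logrotate_policy() {
    local log_glob=$1 keep=${2:-14}
    cat <<EOF
$log_glob {
    daily
    rotate $keep
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
EOF
}
```

One way to try it without touching the live configuration is write_logrotate_policy '/var/log/myapp/*.log' 7 > /tmp/myapp.logrotate followed by logrotate -d /tmp/myapp.logrotate, which only reports what would happen. copytruncate suits programs that never reopen their log file, at the cost of possibly losing a few lines written during the copy; services that reopen logs on a signal are usually better served by a postrotate script.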

 

Case Study: A hosting provider discovered that daily backups of /home were being stored in /var/backups, quickly filling /var. Implementing a new rotation policy and storing backups on a separate volume prevented repeated “disk full” errors.

 

3. High CPU or Memory Usage

 


 

“Often, resource spikes happen due to a single runaway process or memory leak. Identifying it quickly is crucial.” – James F., Lead DevOps Engineer

 

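When a server slows to a crawl or stops responding, a key suspect is CPU or memory saturation from runaway processes or poorly optimized apps. “Zombie” processes use almost no CPU or memory themselves, but a growing number of them points to a misbehaving parent process and can eventually exhaust the process table.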

 

Common Warning Signs

 

  • Server lag, unresponsive commands, high load averages

  • A single application (for example, a Java-based app) hogging CPU

  • The OOM (out-of-memory) killer terminating essential processes

 

Systematic Troubleshooting

 

Run top or htop

  • Spot process with highest CPU usage quickly or see memory hogs.

Check Memory

  • Use free -m or vmstat; watch for low available memory or swap usage.

Look for Zombie Processes

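  • Use ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/' rather than ps aux | grep Z, which also matches any line that merely contains a capital Z. A high number of zombies means a parent process is not reaping its terminated children with wait().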

Examine Logs

  • Repeated error logs might show a memory leak or crashed processes.
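
Because a zombie can only be cleared through its parent, it helps to list each zombie next to its parent PID. The sketch below does that; zombie_parents is a hypothetical helper name, and it expects the output of ps -eo pid=,ppid=,stat=,comm= (procps syntax, headers suppressed) on stdin.

```shell
#!/usr/bin/env bash
# Sketch: list zombie processes together with the parent that has not reaped them.
# Input columns: PID, PPID, STAT, COMMAND. zombie_parents is a hypothetical name.
zombie_parents() {
    # A state code starting with "Z" marks a zombie (defunct) process.
    awk '$3 ~ /^Z/ { printf "zombie pid=%s parent=%s cmd=%s\n", $1, $2, $4 }'
}
```

A typical call is ps -eo pid=,ppid=,stat=,comm= | zombie_parents. If many lines share the same parent value, that parent is the process to fix or restart.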

 

Best Practice Solutions

 

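  • Zombie processes are already dead, so kill -9 has no effect on them. Fix the parent code so it reaps its children, or restart (or kill) the parent process so that init adopts and reaps the zombies.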

  • Tune the configuration of apps like Apache and MySQL (worker counts, connection limits, buffer sizes) to cap their CPU and memory usage.

  • Add or upgrade server hardware if truly needed.

 

Case Study: A marketing company’s Java-based web app (Tomcat server) began hogging 90% CPU daily. Using jmap and analyzing heap dumps revealed a memory leak in a custom library. A quick patch re-coded the library to release memory after each process cycle, dropping usage from 90% to a stable 30%.

 

4. Service Failures and Configuration Pitfalls

 


 

“A single misconfiguration can cascade into a major downtime event if not quickly identified. Familiarity with system logs shortens downtime drastically.” – Raj B., Infrastructure Head

 

Critical services failing on boot or giving “Failed to start” messages are a top headache.

 

Key Symptoms

 

  • “systemctl status servicename” shows “Failed” with an error code

  • “Connection refused” or “Service not running” errors in logs

  • Applications not reachable despite correct configurations

 

Typical Misconfigurations

 

  • Typos in config files, e.g. /etc/nginx/nginx.conf

  • Missing dependencies or incorrect file paths

  • Systemd unit files pointing to wrong locations

 

Structured Diagnosis

 

Check Status

  • systemctl status servicename or journalctl -u servicename

Examine Config Syntax

  • For Nginx, nginx -t; for Apache, apachectl configtest.

Look for Dependency Issues

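  • If using systemd, remember that “After=” only controls start order. Declare the actual dependencies with “Requires=” or “Wants=”, and review them with systemctl list-dependencies servicename.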
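
The diagnosis steps above can be bundled into one helper. This is a sketch under stated assumptions: config_check_for and diagnose_service are hypothetical names, the mapping covers only a few services, and sshd -t is an addition not mentioned in the steps above.

```shell
#!/usr/bin/env bash
# Sketch: show a service's state and recent logs, then run its config test if one is known.
# config_check_for and diagnose_service are hypothetical helper names.

# Map a service name to the command that validates its configuration.
config_check_for() {
    case $1 in
        nginx)          echo "nginx -t" ;;
        apache2|httpd)  echo "apachectl configtest" ;;
        sshd|ssh)       echo "sshd -t" ;;
        *)              return 1 ;;   # no known syntax check for this service
    esac
}

diagnose_service() {
    local svc=$1 check
    systemctl status "$svc" --no-pager --lines=0   # current state and exit code
    journalctl -u "$svc" -b --no-pager -n 30       # last 30 log lines from this boot
    if check=$(config_check_for "$svc"); then
        $check                                     # intentionally unquoted: command plus flag
    fi
}
```

For example, diagnose_service nginx prints the unit state, the recent journal entries, and the result of nginx -t in one pass. The config tests usually need root privileges to read every included file.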

 

Common Solutions

 

  • Fix syntax errors, typos in config

  • Install or reinstall missing dependencies

  • Adjust systemd unit file and re-run systemctl daemon-reload; systemctl restart servicename

 

Case Study: A WordPress site refused to run after a reboot because the “listen 80” directive in /etc/nginx/sites-available/default was missing its semicolon. The error logs pinned it down quickly. A fix plus systemctl daemon-reload && systemctl start nginx resolved the downtime within minutes, although daemon-reload is only required when a unit file has changed, not for an Nginx config edit.

 

5. Permissions and Ownership Blunders

 


 

“A single ‘chmod 777 * -R’ might solve short-term issues but opens a massive security hole. Correct ownership is the real fix.” – Linda V., Security Consultant

 

Mistakes in file permissions can lock out valid users or open your server to danger.

 

Red Flags

 

  • “Permission denied” while reading, writing, or executing

  • SSH “Permission denied (publickey)” errors caused by overly permissive modes on the home directory, ~/.ssh, or authorized_keys

  • System logs showing repeated “access denied”

 

Systematic Steps

 

List Current Permissions

  • ls -l filename to see user, group, and mode bits (rwx).

Check Ownership

  • Check the owner and group with ls -l or stat, then correct them with chown user:group filename as needed.

Set Proper Permissions

  • chmod 640 file.conf for config files, for example, gives the owner read/write access, the group read-only access, and everyone else none.
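
For auditing, a short sketch like the one below can surface the files most likely to be a problem. show_mode and world_writable are hypothetical helper names; show_mode relies on GNU stat (the -c option), which is standard on Linux but not on every Unix.

```shell
#!/usr/bin/env bash
# Sketch: inspect modes and find files that anyone on the system can write to.
# show_mode and world_writable are hypothetical helper names.

# Print octal mode, owner:group, and name for each file (GNU stat).
show_mode() {
    stat -c '%a %U:%G %n' "$@"
}

# List regular files under a directory that are writable by "others".
# -xdev keeps the search on one filesystem.
world_writable() {
    find "$1" -xdev -type f -perm -0002
}
```

Running world_writable /var/www, for instance, lists files left behind by a broad chmod 777, and show_mode on any of them confirms the owner and group before you tighten the mode.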

 

Simple Solutions

 

  • Provide minimal privileges needed to run. Avoid large scale chmod 777.

  • Use usermod -aG group user to fix group memberships for services; the change takes effect at the user's next login or when the service restarts.

 

Case Study: A production environment's web images were uploaded by the 'deployer' user to /var/www/images with the group set incorrectly, so the website returned constant 403 Forbidden errors. Correcting group ownership via chown -R root:www-data /var/www/images and ensuring chmod g+rx fixed the issue instantly.

 

6. Security Breaches and Intrusions

 


 

"Security is not a product, but a process, each day you must keep an eye on suspicious processes and logs.” – Bruce T., Security Analyst

 

No environment is immune to intrusions. Quick detection of suspicious processes, resource drain from crypto-miners, or unknown user accounts is critical.

 

Telltale Signs

 

  • Unknown processes with high CPU usage

  • Strange user accounts, repeated login attempts

  • High outbound network traffic or log entries that point to cryptomining activity

 

Action Steps

 

Inspect Running Processes

  • ps aux --sort=-%cpu or top -o %CPU

Check SSH logs

  • Look in /var/log/auth.log (Debian/Ubuntu) or /var/log/secure (RHEL/CentOS).

Remove Malicious Tools

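  • Stop the offending process with kill -9 <PID>, then uninstall the suspicious software and remove whatever restarts it (cron jobs, systemd units, startup scripts).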

Patch and Update

  • Keep packages updated, follow best security practices.
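
When reviewing the SSH logs, a per-address count of failed logins shows brute-force sources at a glance. The sketch below assumes the usual OpenSSH syslog wording ("Failed password for ... from ADDRESS port ..."); failed_logins_by_ip is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: count failed SSH password attempts per source address.
# Reads auth log lines on stdin; failed_logins_by_ip is a hypothetical name.
failed_logins_by_ip() {
    awk '/sshd.*Failed password/ {
             # The source address is the field right after the word "from".
             for (i = 1; i < NF; i++)
                 if ($i == "from") { count[$(i + 1)]++; break }
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}
```

A typical call is failed_logins_by_ip < /var/log/auth.log (or /var/log/secure). Addresses with very high counts are candidates for a fail2ban jail or a firewall rule.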

 

Tools

 

  • rkhunter or chkrootkit for rootkit scanning

  • fail2ban or iptables for repeated login attempts

  • ClamAV for scanning for known malware

 

Case Study: A Linux server started slowing, with top showing unknown kworker processes at 100% CPU usage. A closer inspection via logs revealed a cryptomining Trojan. Removing it with ClamAV plus adjusting firewall rules closed the compromise path. CPU usage quickly returned to normal.

 

7. Boot and Kernel Errors

 


 

“When kernels panic, the entire system halts. Maintaining backups and rescue strategies ensures minimal downtime.” – David U., Senior SysAdmin

 

Boot and kernel problems can be the most severe, preventing the OS from loading or causing “Kernel panic” messages.

 

Common Indicators

 

  • Kernel panic messages on boot

  • GRUB rescue prompts

  • “No boot device found,” halting system mid-boot

 

Logical Troubleshooting

 

Check BIOS/UEFI

  • Confirm correct boot ordering and that disks are recognized.

Inspect GRUB

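  • Inspect the GRUB configuration. grub.cfg is a generated file, so make changes in /etc/default/grub and regenerate it with update-grub (Debian/Ubuntu) or grub2-mkconfig -o /boot/grub2/grub.cfg (RHEL/CentOS; the output path differs on some UEFI systems), or use rescue mode to fix the boot loader.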

Boot Recovery

  • Use a live or rescue environment. Run fsck for partition checks, and only on filesystems that are unmounted.

Review Kernel Logs

  • Check dmesg, /var/log/kern.log, or journalctl -k -b -1 (the previous boot's kernel messages, if the journal is persistent) for the root cause.
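
Kernel logs are long, so a first pass that keeps only the alarming lines can save time. The pattern list in this sketch is an assumption about which messages matter most and is not exhaustive; kernel_red_flags is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: filter kernel log text down to lines that usually indicate serious trouble.
# Reads log text on stdin; the pattern list is illustrative, not complete.
kernel_red_flags() {
    grep -iE 'kernel panic|oops|call trace|i/o error|ext4-fs error|out of memory'
}
```

Typical calls are dmesg | kernel_red_flags or, for the boot that failed, journalctl -k -b -1 --no-pager | kernel_red_flags. An empty result only means none of these patterns matched, so read the full log before ruling anything out.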

 

Quick Solutions

 

  • Select an older kernel from the GRUB menu if the new kernel panics on every boot

  • Reinstall or repair the boot loader via grub-install

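  • Perform a filesystem check if necessary: fsck -y /dev/sdaX, run against an unmounted partition. Because -y approves every repair automatically, back up or image the disk first when the data matters.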
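
Running fsck on a mounted filesystem can itself cause corruption, so a guard is worth having. The sketch below is one possible approach, with is_mounted and safe_fsck as hypothetical helper names; it compares the device path against the first column of /proc/mounts, which will miss devices mounted under another name (for example a /dev/mapper alias).

```shell
#!/usr/bin/env bash
# Sketch: refuse to check a filesystem that is currently mounted.
# is_mounted and safe_fsck are hypothetical helper names.

# is_mounted DEVICE [MOUNTS_FILE]  -> exit status 0 if DEVICE is listed as mounted
is_mounted() {
    local dev=$1 mounts=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

safe_fsck() {
    local dev=$1
    if is_mounted "$dev"; then
        echo "refusing: $dev is mounted; unmount it or boot a rescue environment" >&2
        return 1
    fi
    fsck -n "$dev"   # read-only pass first; rerun with -y only after reviewing the report
}
```

The read-only pass reports problems without changing anything, which gives you a chance to take a backup before letting fsck -y make repairs.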

 

Case Study: An overnight kernel update left a server stuck in panic state. The admin rebooted into a rescue environment, reinstalled GRUB, and downgraded the kernel to the known good version. After verifying /boot/grub/grub.cfg, the server booted normally.

 

Conclusion

 

Fixing common Linux server issues requires a step-by-step approach and quick response. Being ready to handle problems like full disk space in /var/log, fixing damaged filesystems without losing data, or dealing with SSH connection errors will help you reduce outages, protect your data, and keep your server running smoothly.

 

Key Takeaways:

 

  • Always keep track of your network, CPU, and memory usage to catch anomalies before they impact performance.

  • Have a plan for log management so that you can quickly trace root causes in /var/log and correlate logs across your environment.

  • Prioritize security. Watch out for unknown processes, keep your system patched, and carefully manage user privileges.

  • Update the kernel and software regularly to close known vulnerabilities and pick up performance fixes.

  • Remember: backups, backups, backups! They’re your best friend against corrupted filesystems and major meltdown events.

 

In the end, a stable, secure, and well-monitored Linux server stands at the core of robust IT operations. This troubleshooting guide is just the beginning: keeping an eye on logs, employing best practices for configuration, and leaning on stable automation for repetitive checks all form the bedrock of professional system administration.

 

Keep your Linux server running smoothly by following these steps. This will help prevent problems and make things easier for both admins and users.


Copyright © 2025 Tech Decoded, All rights reserved.