by Steven J. Owens (unless otherwise attributed)
* Troubleshooting Quick Reference
So I'm helping a guy troubleshoot some problems with his email on a mac osx box (yes, yes, I know this is an ubuntu journal, but) and I'm dredging up various commands, and I thought, I should keep a short summary of general troubleshooting commands somewhere. So I'm going to migrate/copy various commands up to this section to build that summary.
Many, if not most, of these commands will work fine on other linux distros.
And here's the summary of the summary:
editing sudoers: | $ sudo EDITOR=emacs visudo |
finding processes: | $ ps -ef | fgrep -i whatever |
tracing a process: | $ strace -p processid |
list open ports/processes: | $ netstat -tln |
list open files: | $ lsof |
file users: | $ fuser filename |
file status: | $ stat filename |
log files: | $ sudo ls -l /var/log |
checking disk space: | $ df -h |
spot problem processes: | $ top |
what's your cpu: | $ cat /proc/cpuinfo |
what's your memory: | $ cat /proc/meminfo |
hardware issues: | $ dmesg |
list PCI devices: | $ lspci |
list USB devices: | $ lsusb |
list all devices: | $ lsdev |
list all hardware: | $ sudo lshw |
display hard drive ID: | $ sudo hdparm -i /dev/sda (where sda is the drive's device name, e.g. sda, sdb, sdc) |
list kernel modules: | $ lsmod |
load kernel module: | $ sudo modprobe modulename |
unload kernel module: | $ sudo modprobe -r modulename |
trace an already running command: | $ strace -p processid |
start a command with trace: | $ sudo strace aptitude update |
print out Xwindows Video details: | $ xvinfo |
Also, check out the linuxinfo package, which parses /proc/cpuinfo and displays a human-readable summary.
Do NOT chmod and manually edit /etc/sudoers anymore.
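Use visudo to run the editor; it locks the file and checks your syntax before saving, so a typo is much less likely to lock you out of sudo.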
To run visudo with a custom editor:
$ sudo EDITOR=emacs visudo
$ ps -ef | fgrep -i whatever
(OSX is BSD derived, so use "ps aux" there instead; in fact, nowadays many versions of ps support both syntaxes, so you can do "ps aux" even on an ubuntu box.)
Note that, since -f lists the full info, including the command, you can grep on the command name, etc.
Related: To see the full command line, i.e. to avoid your terminal truncating each line of output at 80 or however many characters, pipe it through cat:
$ ps -ef | cat
Or:
$ ps -ef | cat | fgrep -i whatever
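One annoyance with the grep approach is that you lose the ps header line, and the grep process itself often shows up in the results. Here's a rough sketch of a filter that keeps the header and does the same case-insensitive fixed-string match as fgrep -i. The name match_procs is made up for illustration, and it assumes a POSIX awk; the pattern is handed over through the environment so that it never appears on awk's own command line.

```shell
# match_procs PATTERN
# Hypothetical helper: reads `ps -ef` style output on stdin and prints the
# header line plus every line containing PATTERN (fixed string, any case).
match_procs() {
    PAT="$1" awk 'NR == 1 || index(tolower($0), tolower(ENVIRON["PAT"])) > 0'
}

# Typical use:
#   ps -ef | match_procs sshd
```

Since the output goes into a pipe, the lines aren't cut off at the terminal width either.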
$ netstat -tln
Also see the -p option to netstat for process ID and name.
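If all you want is the list of port numbers something is listening on, the netstat output is easy to boil down. This is only a sketch: listening_ports is a made-up name, and it assumes the Linux net-tools column layout (local address in the fourth column, state in the sixth), which still holds when you add -p.

```shell
# listening_ports
# Hypothetical helper: reads Linux `netstat -tln` output on stdin and prints
# each listening TCP port once, in numeric order.
listening_ports() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" {
             n = split($4, parts, ":")   # the port is whatever follows the last colon
             print parts[n]
         }' | sort -n -u
}

# Typical use:
#   netstat -tln | listening_ports
```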
Another handy command for finding out what is holding a port or file open (which I keep forgetting about) is lsof, short for "list open files".
$ lsof
fuser, on the other hand, does the reverse - starts with the file or socket and shows what processes are using it.
$ fuser filename
Another trick I'll do is: if I can't find log entries in the logical place, I'll attempt something (send an email through, or a web request, or stop/restart some service) and then check the timestamps in /var/log to see what was touched.
$ sudo ls -l /var/log
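Eyeballing timestamps gets tedious when /var/log has a lot of subdirectories, so here's a sketch that lets find do the comparing. The function name recently_touched and the default of five minutes are my own choices for illustration; -mmin is supported by GNU and BSD find but isn't in POSIX.

```shell
# recently_touched [DIR] [MINUTES]
# Hypothetical helper: lists regular files under DIR (default /var/log)
# that were modified within the last MINUTES minutes (default 5).
recently_touched() {
    dir="${1:-/var/log}"
    minutes="${2:-5}"
    # Permission errors are discarded; run as root to see restricted subdirectories.
    find "$dir" -type f -mmin "-$minutes" 2>/dev/null
}

# Typical use, right after poking the service:
#   recently_touched /var/log 2
```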
Also, it's usually a good idea to check and see if you've run out of disk space, as this will often cause odd things to happen:
$ df -h
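For a quick yes/no answer, you can filter the df output down to just the filesystems that are nearly full. A rough sketch, assuming the POSIX output format (df -P, which keeps each filesystem on one line) and mount points without spaces in them; full_filesystems and the 90 percent default are made up for illustration.

```shell
# full_filesystems [THRESHOLD]
# Hypothetical helper: reads `df -P` output on stdin and prints the mount
# point and use% of every filesystem at or above THRESHOLD percent (default 90).
full_filesystems() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                    # "97%" -> "97"
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Typical use:
#   df -P | full_filesystems 90
```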
It's also worth checking top to make sure you don't have some process running out of control:
$ top
Finally, for hardware troubleshooting, there are a number of handy tips.
First, look in /proc, particularly /proc/cpuinfo and /proc/meminfo, but there's lots of other handy hardware stuff in there.
$ cat /proc/cpuinfo
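The files in /proc are plain text, so you can pull out a single value without wading through the whole thing. Here's a sketch for /proc/meminfo; meminfo_mib is a made-up name, and it assumes the usual "Field:   12345 kB" layout.

```shell
# meminfo_mib FIELD
# Hypothetical helper: reads /proc/meminfo style text on stdin and prints the
# named field converted from kB to whole MiB.
meminfo_mib() {
    awk -F'[: ]+' -v key="$1" '$1 == key { printf "%d\n", $2 / 1024 }'
}

# Typical use:
#   meminfo_mib MemTotal < /proc/meminfo
#   grep -c '^processor' /proc/cpuinfo    # counts the processor entries listed
```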
In general, dmesg shows you the kernel's message buffer: the things linux said to itself while it was booting up, plus whatever the kernel has logged since (most of this stuff also ends up in /var/log/syslog):
$ dmesg
$ lspci
$ lsusb
$ lsdev
lshw compiles a comprehensive listing of your hardware. Some info requires sudo.
$ sudo lshw
hdparm queries the kernel for info about drives and similar devices connected to your SATA, PATA, SAS, or IDE controllers. To display the identification info for a particular hard drive, find the device ID, let's say it's /dev/sda, and:
$ sudo hdparm -i /dev/sda
Use lsmod to list what modules are loaded into the kernel:
$ lsmod
Most of the time that's pretty esoteric stuff, but sometimes it can really help to make sure that, say, your ipw2200 driver module is loaded. (This used to be a problem with ipw2200 under ubuntu when resuming from hibernation, a few years back).
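Checking for one particular module is easy to script. This is a sketch only: module_loaded is a made-up name, and it assumes the usual lsmod layout of one header line followed by the module name in the first column. It matches the name exactly, so ipw2200 won't be confused with some other module that merely lists it under "Used by".

```shell
# module_loaded NAME
# Hypothetical helper: reads `lsmod` output on stdin and exits 0 if a module
# called exactly NAME is listed, 1 otherwise.
module_loaded() {
    awk -v mod="$1" 'NR > 1 && $1 == mod { found = 1 } END { exit !found }'
}

# Typical use:
#   lsmod | module_loaded ipw2200 && echo "ipw2200 is loaded"
```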
Then, if you actually understand what you're doing with the kernel and modules, you can use "modprobe modulename" and/or "modprobe -r modulename" to load or remove modules. Or vice versa, modprobe -r to remove it and then modprobe to reload it:
$ sudo modprobe -r ipw2200
$ sudo modprobe ipw2200
strace traces system calls, but this is for really deep voodoo. For a command that's already running but appears to have hung, run strace on the process ID:
$ strace -p processid
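Or you can try starting the command under strace. Put sudo in front of strace rather than the other way around, since a setuid program like sudo loses its privileges while it's being traced:
$ sudo strace aptitude update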
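strace output gets long fast, and most of the time the interesting lines are the calls that failed. Here's a rough sketch that keeps only those; failed_syscalls is a made-up name, /tmp/trace.txt is just an example path, and it assumes strace's usual "= -1 ENAME (description)" format for errors.

```shell
# failed_syscalls
# Hypothetical helper: reads strace output on stdin and prints only the calls
# that returned -1 with an errno name (ENOENT, EACCES, ...).
failed_syscalls() {
    grep -E '= -1 E[A-Z0-9]+ '
}

# Typical use (strace writes its trace to stderr unless -o FILE is given;
# -f follows child processes as well):
#   sudo strace -f -o /tmp/trace.txt aptitude update
#   failed_syscalls < /tmp/trace.txt
```

Plenty of failed calls are perfectly normal (programs probe for files that may not exist), so treat the result as a list of leads rather than a list of bugs.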