Needle in a haystack: how to read your server log files

How hard can it be?!

I’ve been using Linux for more than a decade now, so I had my share of random problems to diagnose and fix. Maybe my video card was misbehaving, my PPP connection was unstable, or my mails weren’t delivered.

Usually it all comes down to finding the right information in a log file. Sometimes you need to dig into gigabytes of worthless junk to find that hidden error message that sheds light on your unfortunate situation. Often the problem is time-critical, so you need to move fast and quickly isolate relevant data.

Quickly scanning through large text files is probably second nature for any long-time Linux user, and most of us take this skill for granted. So I’m always surprised to see how difficult it is for some IT people to use basic tools like grep or less.

Grep

Grep it’s probably the tool you already use most often to find stuff in your logs. It’s pretty simple to use:

grep pattern file1 file2 ..  
some cmd | grep pattern

It will look at the list of files (or standard input) and only print the lines matching the provided pattern. While the pattern is a regular expression (briefly described a bit later), most people just use it as a search string:

grep '2010-12-05:14' access.log  # filter out requests from 2PM.

# filter out requests from 2PM, except the homepage
grep '2010-12-05:14' access.log | grep -v  'GET / ' 

(If the | is not clear to you, read about Pipelines in the bash manpage  . It basically connects the first command’s output to the second commands input).

So here are some useful parameters:

-r  # grep all files recursively in the provided dirs
grep -r 192.168.1.15 /etc  #is my old IP hardcoded anywhere?

-v  # only show lines NOT matching the pattern
-A 7 # also show 7 lines after the matched line
-B 3 # also show 3 lines before the matched line
-C 4 # show matched line context (4 lines before, 4 lines after)
-i # ignore case
-c # just count the number of matched lines (instead of: grep foo | wc -l) 

But if you just take 15 minutes to quickly scan through the man page (man grep) you’ll probably find a lot of unexpected features that will save you some trouble later on.

Regular expressions quick intro

As I mentioned, grep’s search pattern is a regular expression, but regular expressions are widely used by most unix tools, so they’re always handy

A simple regex can be just a string of normal characters:

grep 'GET /about' access.log 

But the fun part comes when using the special characters:

Anchors: ^ and $

  • ‘^foobar’ means “line starts with foobar’
  • ‘foobar$’ means “line ends with foobar”
  • ‘^foobar$’ means “line is exactly foobar, it doesn’t just contain the word”
  • ’^$’ means empty line
grep -v '^$' file #  show all nonempty lines

To escape a special char you need to use a backslash: “\$” will actually match a dollar sign, and not a line end.

Character classes: . []

. matches any character, so the ‘ja.e’ pattern matches any line containing jane, jade jape or ja9e.

So be careful, you can accidentally make your pattern more general than you want:

ls | grep '.tar'  # will  also match 'star' 

Just escape it and you should be ok:

ls | grep '\.tar'

[abc] matches any character between the brackets.
so ja[nd]e will only match jane and jade

You can even use ranges:

[a-d] # will match a single character among a, b, c or d
[a-z0-9A-Z,] # will match any letter, digit or a comma

You can match ‘any character but these’ by starting the class with a ^:

[^0-9] # matches any non-digit char

Another example:

grep '[1-9][0-9]$' #show lines ending in numbers larger than 10

Keep in mind that . and [] will match exactly one character.

Quantifiers: * ? - how many times the previous character can show up

* - any number of times ( 0 , 1, 50, etc)

ja*ne will match: jne, jane, jaaaaaaaaane
j[a-c]*ne will match: jne jane jcne jacbabababcbbabane
It translates to: ‘j’, then any of a, b or c, any number of times, then ‘ne’

Many times you will use .* as “anything”:

grep '^2010.*GET / ' access.log 

will show lines starting with 2010, then any characters, then ‘GET / ‘

? - 0 or 1 times (i.e. the previous character is optional)

ja?ne will only match jane and jne

Extended regular expressions

Some programs allow “extended” regular expressions (e.g. egrep or sed -r). I’ll just mention a few additions:

Alternatives: (foo|bar)

(pattern1|pattern2|pattern3) - will match any of the three patterns. Example:

egrep -v '^(#|$)' file.conf # will not show comments and empty lines

More quantifiers: + and {}

  • + - will at least one occurrence
  • {5} - exactly 5 occurrences
  • {5,} - minimum 5 occurrences
  • {3,5} - between 3 and 5 occurrences
[0-9]{4}-[0-9]{2}-[0-9]{2} # will match a YYYY-MM-DD date 

Less

Less is a simple file viewer, usually used as a pager (for viewing large chunks of text).

less largefile.log  # view the large file
grep '^2010' largefile.log |  less # view the results of the grep command in less 

How is this handier than grep when searching log files? It’s really useful when your logged items span more than one line, like a full debug log from some program.

Less is also better than standard text editors (like vim or, hmm, notepad) because it allows reading huge files or random streams of text outputted by other programs.

Make sure to use the ‘-n’ option if you have huge files, otherwise search and navigation will be rather slow. If less seems stuck with “Calculating line numbers… (interrupt to abort)”, just press CTRL+C and line numbering will be disabled.

Some useful commands:

  • Navigation:
    • up/down pageup/pagedown - scroll the file
    • G - move to the end of the file
    • g - move to the start of the file
  • searching:
    • /pattern -> search for the pattern
    • ?pattern -> search for the pattern, backwards
    • n -> during a search, show next occurrence
    • N -> show previous occurrence
    (be careful, when using ? ,n/N will move in the opposite direction)
  • q -> exit less

Example usage: You have a recent incident that needs to be figured out. Open the file with less, go to the end of the file, search backwards (with ? ) to a known pattern (like a timestamp, email address, phone number etc) until you find the interesting bit, then use the arrow keys to look at the whole chunk.

So don’t forget about less, it can sometimes be a very helpful tool. Bonus points if you look at the manpage, as usual.

Tail & Head

Two simple, but very useful commands: tail will show the last 10 lines of input, and head will show the first 10 lines. Examples:

tail error.log # see the last few errors
tail -n 30 file.log # see the last 30 lines

Sometimes the logs are huge and grep would be very slow but if you’re looking for recent events, it’s useful to restrict your search with tail, which is very fast:

tail -n 50000 file.log | grep something

if you use -f  tail will not exit and will show you new lines as soon as they are appended to the logfile

tail -f file.log 

You can even check only certain events:

 tail -f file.log | grep ERROR

You want to have all the important logfiles watched with tail -f when you’ve done some big changes in your environment and just restarted the services:

/etc/init.d/apache2 restart

Meanwhile, in another terminal window:

tail -f /var/log/apache2/error.log 

head is useful when you just want to see a few results of your search, and it’s of huge help when you’re working over ssh. If you forget about this, you may accidentally print thousands of lines over ssh, which will usually block your connection for a while.

I just type ‘| head’ after every command that may have a lot of output:

grep pattern logfile | head 

It’s also useful to see when a certain logfile was rotated:

head -n1 access.log.1  # a-ha! need to look in access.log.2.gz 

A few more tips:

Handling .gz files

Log files are usually rotated with logrotate, so you have to work a lot with gzipped files.

Thankfully, you can use your friendly tools here too: you have zless, zgrep, zcat that work as expected.

Example: I want to find all emails sent to foo@example.com in the last month:

First: Quickly find out what log file contain the first lines from this month:

zcat mail.log.3.gz | head -n1 # nope, starts on the 5th
zcat mail.log.4.gz | head -n1 # bingo

Then you search in exactly the files you want:

(zcat mail.log.{4,3,2}.gz ; cat mail.log.1 mail.log ) | grep foo@example.com | less 

zgrep can also handle uncompressed data, so if your logs aren’t huge, you can just use:

zgrep foo@example.com mail.log* # but you don't get them ordered by date

Playground logfiles

Say you want to investigate an issue that happened in the last week of november. You can prevent the huge files slowing you down by creating a smaller, playground logfile:

zgrep '^2010-11-[23]' file.log.{6,5,4}.gz > final-november.log

Then do all your investigation on that file. Everything will be much faster. Sometime the information you already have about the issue can turn a 10,000,000 lines into 20,000 lines or less.

Aggregate data

Show most active IP addresses accessing our vote system:

grep 'POST /post/[0-9]*/vote ' access.log.1 access.log | awk '{print $1}' 
                    | sort | uniq -c | sort -nr | head
  • So, first thing I only care about people (mis)using our voting system, so I just look at those requests.
  • Then I show the first column (using whitespace as delimiter) , which is the client IP.
  • Then I sort, so I can pipe it to uniq -c.
  • Uniq -c will show unique entries and count the number of occurrences, something like:
     242 1.2.3.4
     12 5.4.3.2
     31 9.8.4.2
     ...
    
  • Then sort -nr will sort by the number of occurrences, and show them in reverse, and head will give me a top 10 list of possible cheaters.

But this is just a silly example, use your imagination to combine the tools you already know. And for more complicated aggregations (like, what types of requests / sql queries are killing my server), you probably need to start looking into Perl :)

Closing words

Don’t forget, take a few minutes to quickly look at the man pages of all your favorite tools, you’ll definitely find some surprising features.

I will update this article based on feedback, so any additions are welcome!

Add new comment