Our company uses multiple log file formats.

We would like to develop a series of tools to parse them, often providing the same core functionality for several log file formats.

A classic example is generating Message Sequence Charts from the log files (other candidates are memory usage, stack size, and time measurement between events).

To stick with Message Sequence Charts, we basically need to identify:

  • the titles of the columns (i.e., the processes involved)
  • the titles of the messages
  • perhaps also the message parameters

We intend to determine these by parsing the log files.

However, depending on the format, the information might be:

  • delimited by certain strings (<some text> <process name> SEND <message> TO <process name>)
  • in fixed column positions
  • in the Nth word of lines beginning with a certain string
  • and so on
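
To make the three cases above concrete, here is a minimal Python sketch of one extractor per case. The function names, the field names (sender, message, receiver) and the idea of passing the pattern, columns or word positions as data are assumptions for illustration only, not an existing tool.

```python
import re


def by_delimiters(line, pattern):
    """Extract fields delimited by certain strings, using a regex with named groups."""
    match = re.search(pattern, line)
    return match.groupdict() if match else None


def by_columns(line, columns):
    """Extract fields from fixed column positions; columns maps name -> (start, end)."""
    return {name: line[start:end].strip() for name, (start, end) in columns.items()}


def by_word(line, prefix, words):
    """Extract the Nth word (0-based) of lines beginning with prefix; words maps name -> N."""
    if not line.startswith(prefix):
        return None
    parts = line.split()
    if any(index >= len(parts) for index in words.values()):
        return None
    return {name: parts[index] for name, index in words.items()}
```

Each function returns the same kind of dictionary, so whatever consumes the result would not need to know which format the line came from.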

This could turn into a spaghetti of command line arguments, so we are thinking of an external configuration file.

What sort of file (plain text, .INI, XML, or something else) should we use, and how should we structure it?

I would hope that about 90% of the core of our scripts is common, and we just need 10% or so to massage the input to direct our parsing.
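
To illustrate the split I have in mind, here is a rough Python sketch: a small per-format configuration and a common core that never looks at the format itself. JSON is used only because it is in the standard library; the format names (format_a, format_b), the "event" key and the sample log lines are all made up for the example, and the real question of which file type to use is still open.

```python
import json
import re

# Hypothetical per-format configuration; in practice this would be an external file.
# Each format supplies one regex whose named groups are sender, message and receiver.
CONFIG_TEXT = r"""
{
  "format_a": {"event": "(?P<sender>\\S+) SEND (?P<message>\\S+) TO (?P<receiver>\\S+)"},
  "format_b": {"event": "^MSG from=(?P<sender>\\S+) to=(?P<receiver>\\S+) name=(?P<message>\\S+)"}
}
"""


def parse_events(lines, format_config):
    """Common core: turn log lines into event dictionaries using the configured regex."""
    pattern = re.compile(format_config["event"])
    events = []
    for line in lines:
        match = pattern.search(line)
        if match:
            events.append(match.groupdict())
    return events


def participants(events):
    """Column titles for the chart: process names in order of first appearance."""
    seen = []
    for event in events:
        for name in (event["sender"], event["receiver"]):
            if name not in seen:
                seen.append(name)
    return seen


config = json.loads(CONFIG_TEXT)
events = parse_events(["12:00 procA SEND hello TO procB", "noise"], config["format_a"])
```

Supporting a new log format would then mean adding one entry to the configuration, while parse_events and participants stay unchanged.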

Any advice, or references?

1 Answer

Do not reinvent the wheel; use something like AWK. Your "configuration file" is then simply an AWK script, which is powerful enough to deal with all the requirements you mentioned.

If that turns out not to be powerful enough, use your favorite scripting language, such as Perl or Python. And if you can live with a closed-source, Windows-only solution (freeware, but not FOSS) specifically designed for processing log files, you can try Microsoft's Log Parser.
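
If you go the scripting-language route, AWK's pattern-action model carries over fairly directly. The following is only a sketch in Python, not AWK: each rule pairs a regular expression with an output template, and every rule that matches a line produces one line of output. The rule contents, the group names and the MEM line format are invented for the example.

```python
import re

# Hypothetical rules: (pattern selecting a line, template describing the output).
RULES = [
    (r"(?P<src>\S+) SEND (?P<msg>\S+) TO (?P<dst>\S+)", "{src} -> {dst}: {msg}"),
    (r"^MEM (?P<proc>\S+) (?P<kb>\d+)", "{proc} uses {kb} kB"),
]


def run_rules(lines, rules):
    """Apply every rule to every line, as AWK does, and collect the formatted output."""
    compiled = [(re.compile(pattern), template) for pattern, template in rules]
    output = []
    for line in lines:
        for pattern, template in compiled:
            match = pattern.search(line)
            if match:
                output.append(template.format(**match.groupdict()))
    return output
```

The rules list plays the role of the AWK script here: it is the only part that would differ between log formats.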

  • Alas, it has to support Unix. Company standard is Python & we are not afraid of the parsing part - just wondering how to structure the log file in a generic way, so that it is as future proof as possible.
    – Mawg
    Commented Aug 20, 2015 at 13:16
  • "how to structure the log file"? - originally you asked for how to structure a "configuration file" (for the parsing of log files), just a typo? And I cannot think of something more comprehensive (and still powerful enough for your requirements) than an AWK script, which describes with a few regular expressions which lines to parse and the kind of output you want.
    – Doc Brown
    Commented Aug 21, 2015 at 12:16
  • Aaaaaargh!! typo !!!! (sorry)
    – Mawg
    Commented Aug 21, 2015 at 12:37
