Log filtering

Introduction

Papertrail can filter incoming log messages that match a regular expression of your choosing. Here's how.

Log filtering is included with all Papertrail accounts, and filtered messages don't consume log data transfer.

Filters are specific to a destination, which means that different environments, systems, or apps can have their own settings. Also, because the filter is a regular expression (regex) rather than a string, a single filter can reflect many filtering policies.

Papertrail's log filter is an additional tool. The sending client or app can still filter logs, for example with the remote_syslog exclude_patterns option or by changing an app's log settings. These filters are independent of any Papertrail filter.

Quick start

  1. Log in to Papertrail and click the Account menu option. On a standalone Papertrail account, look for a left menu tab called Log Destinations. Users accessing Papertrail via an app hosting service should see a section titled Log Filtering on the main Account page itself.
  2. In the log filtering settings, enter the case-sensitive regular expression matching the messages that Papertrail should ignore. See Setup below for details.

Example uses

  • Ignore noise. "Noise" could be requests from monitoring agents, requests for static assets, requests which succeeded and did not modify a resource, or any other log messages which are unlikely to be useful.
  • Control log verbosity from services that you lack the access or ability to change. For example, a closed-source app, a managed service, or a system with strict change control.
  • Environment-specific log configuration. For example, retain everything in staging and development but silence certain messages in production, or vice versa.
  • Team-wide control. Let anyone on a team see and change the filtering settings, without needing to understand and modify a config file.
  • Infrastructure-wide control. Create a single regular expression that reflects your own logging preferences, then apply it to log streams from many systems and apps at once.

Examples

Here are a few common uses. Read on for complete docs.

Filtering all occurrences of 3 messages

This will drop all messages containing any of the 3 strings. The separator is the standard pipe (|), which in a regular expression means "or":

the first string|the second string|something else

All matches are case-sensitive.
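To try this alternation locally, here is a minimal sketch using Python's re module. Python's regex engine is not the one Papertrail uses, and would_filter is a hypothetical helper name, but alternation and case sensitivity behave the same way for a simple pattern like this one.

```python
import re

# The filter from the example above; "|" means "or".
FILTER = re.compile(r"the first string|the second string|something else")

def would_filter(message: str) -> bool:
    """Return True if the filter matches anywhere in the message."""
    return FILTER.search(message) is not None

print(would_filter("app: saw the second string today"))  # True
print(would_filter("app: saw The Second String today"))  # False (case-sensitive)
print(would_filter("app: nothing interesting"))          # False
```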

Filtering one program or log file from one sender

Imagine you have one program generating log messages that you don't want. Filter all messages from the program mongod on the system db-server-42:

^db-server-42 mongod

The ^ indicates that the match must happen from the start of a log message. The sender name (in this example, db-server-42) is the name as shown on the Dashboard.
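The effect of the ^ anchor can be sketched in Python as follows. The log lines are hypothetical and assume the layout of sender name, program name, a colon, and then the message; Python's engine only approximates Papertrail's.

```python
import re

FILTER = re.compile(r"^db-server-42 mongod")

# Hypothetical log lines: sender, program, colon, message.
lines = [
    "db-server-42 mongod: connection accepted",
    "db-server-42 sshd: session opened",
    "web-1 app: proxying to db-server-42 mongod",
]

# Keep every line the filter does not match.
kept = [line for line in lines if not FILTER.search(line)]
print(kept)
```

Only the first line is dropped. The third line mentions db-server-42 mongod, but not at the start, so the anchored filter leaves it alone.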

Substrings

Regexes automatically match substrings (unless other parts of the regex constrain them further). That is, these three expressions are equivalent:

cron
.*cron.*
cron.*

They will all match any string containing cron, with or without any leading or following characters. Adding .* before or after a typical filter rule is unnecessary, so omit it.
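A quick way to convince yourself of this equivalence is to run the three patterns against a few strings. This sketch uses Python's re.search, which, like the filter, looks for a match anywhere in the string; the sample strings are made up.

```python
import re

patterns = ["cron", ".*cron.*", "cron.*"]
samples = ["host cron: job started", "cron", "host sshd: login", "anacron[12]: done"]

for sample in samples:
    results = {p: bool(re.search(p, sample)) for p in patterns}
    # All three patterns agree on every sample.
    assert len(set(results.values())) == 1
    print(sample, "->", results["cron"])
```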

Filtering ("disabling") multiple senders

Imagine you have two sending systems that are temporarily generating a torrent of unwanted log messages. Filter all messages from the senders system-a and system-b:

^(system-a|system-b)

Or filter only messages from noisy-file.log on these two senders:

^(system-a|system-b) noisy-file\.log
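The parentheses matter here. Without them, the ^ applies only to the first alternative. The following Python sketch shows the difference on a hypothetical log line; it illustrates general regex behavior rather than Papertrail's exact engine.

```python
import re

grouped = re.compile(r"^(system-a|system-b)")
ungrouped = re.compile(r"^system-a|system-b")  # "^" binds only to system-a

# Hypothetical line from a third sender that merely mentions system-b.
line = "web-1 app: forwarding request to system-b"

print(bool(grouped.search(line)))    # False: system-b is not at the start
print(bool(ungrouped.search(line)))  # True: system-b matches anywhere
```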

Setup

Decide what to filter

Visit Events and browse the full log stream with all log messages. Decide which messages you want to filter.

Create regular expression

Create a regex that matches all of the messages which Papertrail should filter. A simple example is to copy and paste a string taken from the message itself. For example, to filter all log messages containing debug, simply use debug as the filter.

Because this is a regex, characters that have special meaning in regexes need to be escaped by placing a \ before them. These are special characters: .|()[]{}\^$+?*. To match log messages containing GET a.b.c type=json, use a filter string that escapes each special character:

GET a\.b\.c type=json
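If you build filters from literal strings often, the escaping step can be automated. Below is a minimal Python sketch that escapes only the special characters listed above. The name escape_for_filter is hypothetical. Python's built-in re.escape would also work, but it escapes additional characters such as spaces, so its output looks noisier than the hand-written filter.

```python
# The special characters listed above: . | ( ) [ ] { } \ ^ $ + ? *
SPECIAL = set(".|()[]{}\\^$+?*")

def escape_for_filter(text: str) -> str:
    """Prefix each regex special character in text with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)

print(escape_for_filter("GET a.b.c type=json"))  # GET a\.b\.c type=json
```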

For a more complex example, which matches multiple log messages and only messages from certain senders, see Filtering multiple messages under Advanced.

Test regular expression

To test a regular expression, we recommend Rubular with Ruby version 2.0.0 selected. This isn't exactly the same regex engine that Papertrail uses, but it's a close approximation.

Paste the filter expression created above, then copy a sample log message of each message type which should be matched. We test against everything shown in the Papertrail viewer except for the timestamp, so include the sender name, program name, a colon, and then the message (as shown in the 'Your test string' input box below).

For example:

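[Screenshot: rubular.jpg, showing the filter regex and a sample log message in Rubular's 'Your test string' box]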

Since this regex matches the log message shown, Papertrail would silently discard the message.

Finally, paste a log message which should not match. Confirm that the filter is not too permissive.
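The same two checks (messages that should match, and messages that should not) can also be scripted. Here is a rough sketch in Python. Like Rubular, Python's engine only approximates the one Papertrail uses. check_filter is a hypothetical helper and the sample lines are made up.

```python
import re

def check_filter(pattern, should_match, should_not_match):
    """Return a list of problems; an empty list means the filter behaved as intended."""
    regex = re.compile(pattern)
    problems = []
    for line in should_match:
        if not regex.search(line):
            problems.append(f"missed: {line}")
    for line in should_not_match:
        if regex.search(line):
            problems.append(f"too permissive: {line}")
    return problems

# Test strings include sender, program, colon, and message, but no timestamp.
problems = check_filter(
    r"GET a\.b\.c type=json",
    should_match=["www42 httpd: GET a.b.c type=json 200"],
    should_not_match=["www42 httpd: GET a.b.c type=xml 200"],
)
print(problems or "filter behaves as intended")
```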

Enable filtering

Log filtering is configured from the Account menu in Papertrail.

Most users will see a Log Destinations menu option in the left-hand menu. Users accessing Papertrail via an app hosting service may see a Log Filtering form field on the main Account page itself.

Paste the regex and click Update.

Advanced

Filtering multiple messages

A more complex example would match multiple messages or only messages from certain senders or apps. For example, suppose that these two messages serve no operational purpose:

www42 httpd: 127.0.0.1 - "GET / HTTP/1.0" 200 3

and

util2 kernel: nf_conntrack: automatic helper assignment is deprecated and it will be removed soon. Use the iptables CT target to attach helpers instead.

Find the portion of the log that occurs in all such messages. Here we'll use 127.0.0.1 - "GET / HTTP/1.0" 200 (a successful HTTP request from the Web server itself) and nf_conntrack: automatic helper assignment is deprecated (a warning which could be repeated). This filter would match either message:

127\.0\.0\.1 - "GET / HTTP/1\.0" 200|nf_conntrack: automatic helper assignment is deprecated
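As a sanity check, here is a small Python sketch that runs a filter built from the two fragments named above, including the 200 status, against both sample messages. The third message is a hypothetical failed request, added to show that the filter does not drop it. Python's regex engine stands in for Papertrail's as an approximation.

```python
import re

FILTER = re.compile(
    r'127\.0\.0\.1 - "GET / HTTP/1\.0" 200'
    r"|nf_conntrack: automatic helper assignment is deprecated"
)

messages = [
    'www42 httpd: 127.0.0.1 - "GET / HTTP/1.0" 200 3',
    "util2 kernel: nf_conntrack: automatic helper assignment is deprecated "
    "and it will be removed soon.",
    'www42 httpd: 127.0.0.1 - "GET / HTTP/1.0" 500 3',  # hypothetical failure
]

for message in messages:
    action = "drop" if FILTER.search(message) else "keep"
    print(action, "|", message)
```

The first two messages are dropped and the third is kept, because its status is not 200.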

Filtering by sender

Papertrail matches your regular expression against the complete log message as it is formatted in the viewer. This allows you to include the name of the sender and/or program (or a substring of them) as part of the filter.

Using the examples above, to filter each message from only the system shown in the example, use a filter like:

^www42 httpd: 127\.0\.0\.1 - "GET / HTTP/1\.0" 200|^util2 kernel: nf_conntrack: automatic helper assignment is deprecated

The ^ indicates that the match must be at the very start of the log. The sender name is the same display name shown on the Papertrail dashboard.

Note: if you use the sender name in a filter and then edit the sender name, the filter will need to be updated as well.

Filter by default

Papertrail's default policy is to process every message it receives, so the filter string decides which messages are ignored. There is currently no support for an inverse filter (a default behavior of ignoring log messages), but these two workarounds often accomplish the same result:

  • Filter them locally. All common loggers (rsyslog, syslog-ng, remote_syslog) can filter by message contents.
  • Pick the top few message types and have Papertrail filter them. Often this gets close to the same result. In most cases the filter is simply:

    a string from one|a string from the other|repeat for more messages
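The first workaround, filtering locally, amounts to keeping only the lines that match an allow-list pattern before they are sent. The sketch below shows the idea in Python. The KEEP pattern and the keep_only function are hypothetical, and in practice you would use the filtering configuration of your logger (rsyslog, syslog-ng, or remote_syslog) rather than a script like this.

```python
import re
from typing import Iterable, Iterator

# Hypothetical allow-list: only lines matching this pattern are forwarded.
KEEP = re.compile(r"error|warn|fatal")

def keep_only(lines: Iterable[str], keep=KEEP) -> Iterator[str]:
    """Yield only the lines matching the allow-list; drop everything else."""
    for line in lines:
        if keep.search(line):
            yield line

sample = [
    "app: error: disk full",
    "app: debug: cache warm",
    "app: warn: slow query",
]
print(list(keep_only(sample)))
```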

If you have a filtering requirement which Papertrail can't serve well, please tell us.