Papertrail Knowledge Base

Permanent log archives

Introduction

Each night, Papertrail automatically uploads your log messages and metadata to Amazon's cloud storage service, S3. Papertrail stores one copy in our S3 bucket, and optionally, also stores a copy in a bucket that you provide. You have full control of this archive - it's tied to your AWS account.

Already use S3? Jump to "Create and share an S3 bucket."

Format

For most services, Papertrail creates one file per day in tab-separated value format, gzip compressed. For higher-volume plans (above about 50 GB/month of logs, though the specifics vary), Papertrail creates one file per hour so the files are of a manageable size.

Each file is named under a path (key prefix) provided to Papertrail, typically papertrail/logs/<xxx> where <xxx> is an ID. For example, the daily archive for February 25, 2011 is:

your-bucket-name/papertrail/logs/54321/dt=2011-02-25/2011-02-25.tsv.gz

Days are from midnight to midnight UTC. Alternatively, an hourly archive file for 3 PM UTC would be:

your-bucket-name/papertrail/logs/54321/dt=2011-02-25/2011-02-25-15.tsv.gz
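The naming scheme above can be sketched as a small shell function. This is only an illustration: the function name archive_key and its argument order are hypothetical, and the prefix papertrail/logs/54321 is the example ID used in this article, not a real account.

```shell
# Hypothetical helper: print the S3 key of an archive file.
#   $1 = key prefix, e.g. papertrail/logs/54321
#   $2 = day in UTC, formatted YYYY-MM-DD
#   $3 = optional two-digit hour (only for hourly archives)
archive_key() {
  if [ -n "${3:-}" ]; then
    # Hourly archive: the hour is appended to the file name.
    printf '%s/dt=%s/%s-%s.tsv.gz\n' "$1" "$2" "$2" "$3"
  else
    # Daily archive: one file per dt= directory.
    printf '%s/dt=%s/%s.tsv.gz\n' "$1" "$2" "$2"
  fi
}

archive_key papertrail/logs/54321 2011-02-25
archive_key papertrail/logs/54321 2011-02-25 15
```

The two calls print the daily and hourly keys shown above, without the bucket name.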

Each line contains one message. The fields are ordered:

id generated_at received_at source_id 
source_name source_ip facility_name severity_name program 
message

Here's an example (tabs converted to linebreaks for readability):

50342052
2011-02-10 00:19:36 -0800
2011-02-10 00:19:36 -0800
42424 
mysystem
208.122.34.202
User
Info
testprogram
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor

Fields are delimited by tabs, so an actual line looks like this:

50342052\t2011-02-10 00:19:36 -0800\t2011-02-10 00:19:36 -0800\t42424\tmysystem\t208.122.34.202\tUser\tInfo\ttestprogram\tLorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor

To learn more about the meaning of each column, see the response field descriptions in the HTTP API documentation.

The tab-separated value (TSV) format is easy to parse, and the directory-per-day structure makes it easy to load and analyze a single day's records.
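As a rough illustration of how little parsing is needed, the sketch below rebuilds the sample line (with a shortened message) and splits it with awk. The function name sample_line is hypothetical and exists only to produce real tab characters for the demonstration.

```shell
# Hypothetical helper: emit the sample record as one tab-separated line.
sample_line() {
  printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    50342052 '2011-02-10 00:19:36 -0800' '2011-02-10 00:19:36 -0800' \
    42424 mysystem 208.122.34.202 User Info testprogram \
    'Lorem ipsum dolor sit amet'
}

# Split on tabs and print source_name (5), severity_name (8),
# program (9) and message (10).
sample_line | awk -F '\t' '{ print $5, $8, $9, $10 }'
# prints: mysystem Info testprogram Lorem ipsum dolor sit amet
```

The separator is quoted as '\t' so that the shell passes the backslash through to awk unchanged.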

Usage example

Downloading logs

Archive files can be downloaded from Archives, or retrieved using your Papertrail HTTP API key. The URL format is simple and predictable.

Papertrail generates either daily or hourly archives based on the amount of log data transfer included in your plan (and thus, which duration is likely to be a manageable size). The examples below cover both situations.

Simple example

If archives show that Papertrail is generating daily files, download the archive for 2014-09-24 (UTC) with:

curl --no-include -o 2014-09-24.tsv.gz -L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/2014-09-24/download

Alternatively, if archives show that Papertrail is generating hourly files, download the archive for 2014-09-24 at 14:00 UTC with:

curl --no-include -o 2014-09-24-14.tsv.gz -L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/2014-09-24-14/download

Downloading a single archive

Because the day or hour is included in the URL, more sophisticated and automation-friendly examples - like a relative day or hour - are also possible. For example, to download yesterday's daily archive on a Linux host, run:

curl --silent --no-include -o `date -u --date='1 day ago' +%Y-%m-%d`.tsv.gz -L \
    -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/`date -u --date='1 day ago' +%Y-%m-%d`/download

If Papertrail generates hourly archives for you, download the hourly archive for 16 hours ago with:

curl --silent --no-include -o `date -u --date='16 hours ago' +%Y-%m-%d-%H`.tsv.gz -L \
    -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/`date -u --date='16 hours ago' +%Y-%m-%d-%H`/download

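As you can see, there's a lot going on in that one line. The main parts are:

The backquoted date -u --date='...' command prints the relative day or hour in UTC (GNU date syntax). It appears twice: once in the output file name and once in the URL.

-o sets the name of the file that curl writes the archive to.

-L tells curl to follow redirects.

-H sends your API key in the X-Papertrail-Token header.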

Downloading multiple archives

To download multiple daily archives in one command, use:

seq 0 X | xargs -I {} date -u --date='{} day ago' +%Y-%m-%d | \
    xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
    -L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" https://papertrailapp.com/api/v1/archives/{}/download

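Where X is the number of days to go back. Because seq counts from 0, the command requests X + 1 archives, starting with today's date (UTC). To count back from a specific date instead of today, for example August 10, 2013, change: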

date -u --date='{} day ago' +%Y-%m-%d

to:

date -u --date='2013-08-10 {} day ago' +%Y-%m-%d

To download multiple hourly archives in one command, use:

seq 0 X | xargs -I {} date -u --date='{} hours ago' +%Y-%m-%d-%H | \
    xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
    -L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" https://papertrailapp.com/api/v1/archives/{}/download

Where X is the number of hours to go back. Because seq counts from 0, the command requests X + 1 archives, starting with the current hour (UTC).

Your API token can be found under your profile.
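For scripts, the one-liners above can be wrapped in small functions. The following is a minimal sketch, not an official client: the function names archive_url and fetch_archive and the environment variable PAPERTRAIL_API_TOKEN are assumptions made for this example. Only the URL format and the X-Papertrail-Token header come from the commands above.

```shell
# Hypothetical helper: print the download URL for one archive.
#   $1 = archive name, YYYY-MM-DD (daily) or YYYY-MM-DD-HH (hourly)
archive_url() {
  printf 'https://papertrailapp.com/api/v1/archives/%s/download\n' "$1"
}

# Hypothetical helper: download one archive into the current directory.
# Reads the API key from PAPERTRAIL_API_TOKEN (an assumed variable name).
fetch_archive() {
  curl --silent --fail --location --no-include \
    -H "X-Papertrail-Token: ${PAPERTRAIL_API_TOKEN}" \
    -o "$1.tsv.gz" "$(archive_url "$1")"
}

# Example (GNU date syntax): fetch the three most recent daily archives.
# for n in 1 2 3; do
#   fetch_archive "$(date -u --date="$n day ago" +%Y-%m-%d)"
# done
```

Keeping the key in an environment variable keeps it out of the script itself. The --fail flag makes curl return a non-zero exit status when the server reports an HTTP error, so the calling script can detect a failed download.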

Presuming that the downloaded files have file names such as 2013-08-18.tsv.gz, multiple archives can be searched through using:

gzcat 2013-08-* | grep SEARCH_TERM

On some distributions, you may need to use zcat instead of gzcat.

More information is available in the HTTP API documentation.

Searching

To find an entry in a particular archive, use commands such as:

gzcat 2011-02-25.tsv.gz | grep Something

gzcat 2011-02-25.tsv.gz | grep Something | awk -F '\t' '{print $5 " " $9 " " $10 }'

The files are generic gzipped TSV files, so after un-gzipping them, anything capable of working with a text file can work with them.
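As one example of that, the sketch below counts messages per program (the ninth field) and lists the busiest programs first. The function name count_by_program is hypothetical, and the file name in the usage comment is only illustrative.

```shell
# Hypothetical helper: read TSV records on standard input and print
# "count<TAB>program", with the highest counts first.
count_by_program() {
  awk -F '\t' '{ count[$9]++ } END { for (p in count) print count[p] "\t" p }' |
    sort -rn
}

# Usage: gzip -dc works wherever gzip is installed, regardless of
# whether the system provides gzcat or zcat.
# gzip -dc 2011-02-25.tsv.gz | count_by_program
```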

Syncing

To transfer multiple archives from Papertrail's S3 bucket to a custom bucket, use the download command mentioned above, and then upload them to another bucket using:

s3cmd put --recursive path/to/archives/ s3://bucket.name/the/path/

where path/to/archives/ is the local directory where all the archives are stored, and bucket.name/the/path/ is the bucket and path of the target S3 storage location.

Setup

Here's how to sign up for Amazon Web Services, create a bucket for log archives, and share write-only access to Papertrail for nightly uploads.

Sign up for Amazon Web Services

Skip this step if you already have an AWS account, like for Amazon EC2, S3, or another AWS product.

Activate Amazon S3

Skip this step if your AWS account is already activated for S3.

Create and share an S3 bucket

Note: After submission, Amazon's management console may change the grantee name to aws or another label different from what was entered. This is expected.

Amazon also has instructions for editing bucket permissions.

Alternative: Define sharing policy with IAM

If you followed the instructions above to grant "Upload/Delete" permissions via the AWS Management Console, skip this step.

If you have experience defining an IAM policy on a bucket and prefer to do so, grant List and Upload permissions to the following Principal:

 "Principal":{"AWS":"arn:aws:iam::719734659904:root"}

Tell Papertrail the bucket name

On Account, enable S3 archive copies and provide the S3 bucket name.

Papertrail will perform a test upload as part of saving the bucket name (and will then delete the test file).

Screenshots

Sharing bucket access in AWS Management Console (the bucket name and existing bucket user have been obscured):

Papertrail S3 archive copy settings:

Questions

Why does Papertrail support S3 but not Glacier?

Papertrail supports S3 rather than Glacier because: