Permanent log archives


Each night, Papertrail automatically uploads your log messages and metadata to Amazon’s cloud storage service, S3. Papertrail stores one copy in our S3 bucket, and optionally, also stores a copy in a bucket that you provide. You have full control of the optional archive in your own bucket, since it’s tied to your AWS account.

Already use S3? Jump to Create and share an S3 bucket.


For most accounts, Papertrail creates one file per day in tab-separated values (TSV) format, gzip-compressed (.gz). Days run from midnight to midnight UTC.

For accounts with higher-volume plans (above about 50 GB/month of logs, though the specifics vary), Papertrail creates one file per hour so the files are of a manageable size.

Each line contains one message. The fields are ordered:

1. id
2. generated_at
3. received_at
4. source_id
5. source_name
6. source_ip
7. facility_name
8. severity_name
9. program
10. message

For a longer description of each column, see Log Search API: Responses.

Here’s an example log message. Tabs have been converted to linebreaks for readability:

50342052
2011-02-10 00:19:36 -0800
2011-02-10 00:19:36 -0800
42424
mysystem
208.122.34.202
User
Info
testprogram
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor

A line actually looks like this:

50342052\t2011-02-10 00:19:36 -0800\t2011-02-10 00:19:36 -0800\t42424\tmysystem\t208.122.34.202\tUser\tInfo\ttestprogram\tLorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
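Since each field is plain tab-separated text, a single field can be pulled out of a raw line with cut. For example, extracting the severity (field 8) from the line above:

```shell
# Extract the severity (field 8) from a raw archive line.
printf '50342052\t2011-02-10 00:19:36 -0800\t2011-02-10 00:19:36 -0800\t42424\tmysystem\t208.122.34.202\tUser\tInfo\ttestprogram\tLorem ipsum dolor sit amet\n' | cut -f8
# → Info
```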

The tab-separated values (TSV) format is easy to read and parse. The archives are stored in a directory-per-day structure that makes it easy to load and analyze a single day’s records.
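For instance, here's a sketch of summarizing one day's archive by severity (field 8 in the layout described above; the filename is a placeholder for your own archive):

```shell
# Count messages per severity (field 8) in a single day's archive,
# most frequent severity first.
gzip -cd 2016-10-31.tsv.gz | cut -f8 | sort | uniq -c | sort -rn
```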

Usage examples

Show identical messages

Here’s how to extract the message (field 10) from the archive file 2016-10-31.tsv.gz, then show the messages sorted by the number of identical occurrences (duplicates).

gzip -cd 2016-10-31.tsv.gz | cut -f10 | sort | uniq -c | sort -n

Windows PowerShell can do the same thing, with 7-Zip’s help. In this example, [9] still selects the message (field 10), due to zero-based indexing.

7z x -so 2016-10-31.tsv.gz | %{($_ -split '\t')[9]} | group | sort count,name | ft count,name -wrap

Show similar messages

The most common messages often differ only by a random number, IP address, or message suffix. These near-duplicates can be discovered with a bit more work.

Here’s how to extract the sender, program, and message (fields 5, 9, and 10) from all archive files, squeeze whitespace and digits, truncate after eight words, and sort the result by the number of identical occurrences (duplicates).

gzip -cd *.tsv.gz | # extract all archives
 cut -f 5,9-      | # sender, program, message
 tr -s '\t' ' '   | # squeeze whitespace
 tr -s 0-9 0      | # squeeze digits
 cut -d' ' -f 1-8 | # truncate after eight words
 sort | uniq -c | sort -n

# or, as a one-liner:
gzip -cd *.tsv.gz | cut -f 5,9- | tr -s '\t' ' ' | tr -s 0-9 0 | cut -d' ' -f 1-8 | sort | uniq -c | sort -n

Once again, Windows PowerShell can do the same thing, with 7-Zip’s help.

7z x -so *.tsv.gz                     | # extract all archives
 %{($_ -split '\t')[4,8,9] -join ' '} | # sender, program, message
 %{$_ -replace ' +',' '}              | # squeeze whitespace
 %{$_ -replace '[0-9]+','0'}          | # squeeze digits
 %{($_ -split ' ')[0..7] -join ' '}   | # truncate after eight words
 group | sort count,name | ft count,name -wrap

# or, as a one-liner:
7z x -so *.tsv.gz | %{($_ -split '\t')[4,8,9] -join ' '} | %{$_ -replace ' +',' '} | %{$_ -replace '[0-9]+','0'} | %{($_ -split ' ')[0..7] -join ' '} | group | sort count,name | ft count,name -wrap
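To see what the digit-squeezing step (tr -s 0-9 0) does on its own, here's a quick sketch:

```shell
# Every run of digits collapses to a single 0, so messages that differ
# only by an ID, port, or IP address become identical.
echo 'connect from 10.1.42.7 port 55032' | tr -s 0-9 0
# → connect from 0.0.0.0 port 0
```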

Downloading logs

In addition to being downloadable from the Archives page, archive files can be retrieved using your Papertrail HTTP API key. The URL format is simple and predictable.

Papertrail generates either daily or hourly archives based on the amount of log data transfer included in your plan, and thus on which duration yields files of a manageable size. The examples below cover each situation separately.

Simple example


If archives show that Papertrail is generating daily files, download the archive for 2016-09-24 (UTC) with:

curl --no-include -o 2016-09-24.tsv.gz -L \
    -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/2016-09-24/download

Alternatively, if archives show that Papertrail is generating hourly files, download the archive for 2016-09-24 at 14:00 UTC with:

curl --no-include -o 2016-09-24-14.tsv.gz -L \
    -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/2016-09-24-14/download

Downloading a single archive

Because the day or hour is included in the URL, more sophisticated and automation-friendly examples, such as a relative day or hour, are also possible.


For example, to download yesterday’s daily archive on a Linux host, run:

curl --silent --no-include -o `date -u --date='1 day ago' +%Y-%m-%d`.tsv.gz -L \
    -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/`date -u --date='1 day ago' +%Y-%m-%d`/download


If Papertrail generates hourly archives for your account, download the archive for 16 hours ago with:

curl --silent --no-include -o `date -u --date='16 hours ago' +%Y-%m-%d-%H`.tsv.gz -L \
    -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/`date -u --date='16 hours ago' +%Y-%m-%d-%H`/download

Command syntax

As you can see, there's a lot going on in those cURL one-liners. The main parts are:

- -o: the local filename to save the archive as (built from the same date command used in the URL)
- -L: follow redirects (the download URL redirects to the stored file)
- -H "X-Papertrail-Token: …": authenticates the request with your HTTP API key
- the URL, which ends with the archive's UTC date (or date and hour) followed by /download

Downloading multiple archives


To download multiple daily archives in one command, use:

seq 1 X | xargs -I {} date -u --date='{} day ago' +%Y-%m-%d | \
    xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
    -L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/{}/download

where X is the number of days + 1 that you want to download. For example, to guarantee 2 days, change X to 3 (see the note below for details). To specify a start date, such as August 10, 2013, change:

date -u --date='{} day ago' +%Y-%m-%d

to:

date -u --date='2013-08-10 {} day ago' +%Y-%m-%d
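As a quick sanity check of the adjusted command (GNU date): each offset then counts back from the fixed start date rather than from today.

```shell
# Offset 1 counts back one day from the fixed start date.
date -u --date='2013-08-10 1 day ago' +%Y-%m-%d
# → 2013-08-09
```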


To download multiple hourly archives in one command, use:

seq 1 X | xargs -I {} date -u --date='{} hours ago' +%Y-%m-%d-%H | \
    xargs -I {} curl --progress-bar -f --no-include -o {}.tsv.gz \
    -L -H "X-Papertrail-Token: YOUR-HTTP-API-KEY" \
    https://papertrailapp.com/api/v1/archives/{}/download

where X is the number of hours + 1 that you want to download. For example, to guarantee 8 hours, change X to 9.

Command syntax

The seq 1 X command generates date or hour offsets, starting with 1 (1 day or hour ago), because the current day or hour will not yet have an archive. Archive processing takes time, so near the beginning of an hour or UTC day, the previous day or hour may not have an archive yet either (and will return 404 when requested). Thus, to guarantee at least N days or hours, set X to N + 1.
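For instance, the offset-generation stage on its own emits one UTC date per line (GNU date; a sketch):

```shell
# Generate the last three UTC day stamps (1, 2, and 3 days ago).
seq 1 3 | xargs -I {} date -u --date='{} day ago' +%Y-%m-%d
```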

Your API token can be found under your profile.

More information on the HTTP API is available here.


Using OS X and seeing date: illegal option -- -? The BSD date shipped with OS X doesn't support --date. In the examples above, change date -u --date='1 day ago' +%Y-%m-%d to date -u -v-1d +%Y-%m-%d (and, for hourly archives, --date='16 hours ago' to -v-16H).


To find an entry in a particular archive, use commands such as:

gzip -cd 2016-02-25.tsv.gz | grep Something

gzip -cd 2016-02-25.tsv.gz | grep Something | cut -f5,9,10 | tr '\t' ' '

The files are generic gzipped TSV files, so after un-gzipping them, anything capable of working with a text file can work with them.
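As a convenience, zgrep (shipped with gzip) searches a compressed archive directly, without a separate gzip -cd step:

```shell
# Equivalent to: gzip -cd 2016-02-25.tsv.gz | grep Something
zgrep Something 2016-02-25.tsv.gz
```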

If the downloaded files have file names such as 2013-08-18.tsv.gz (the default), multiple archives can be searched through using:

gzip -cd 2013-08-* | grep SEARCH_TERM


To transfer multiple archives from Papertrail’s S3 bucket to a custom bucket, use the relevant download command mentioned above, and then upload them to another bucket using:

s3cmd put --recursive path/to/archives/ s3://BUCKET/PATH/

where path/to/archives/ is the local directory where all the archives are stored, and s3://BUCKET/PATH/ is the bucket and path of the target S3 storage location.

S3 Bucket Setup

Here’s how to sign up for Amazon Web Services, create a bucket for log archives, and share write-only access to Papertrail for nightly uploads.

Sign up for Amazon Web Services

Skip this step if you already have an AWS account, like for Amazon EC2, S3, or another AWS product.

Activate Amazon S3

Skip this step if your AWS account is already activated for S3.

Create and share an S3 bucket

Note: After submission, Amazon’s management console may change the grantee name to aws or another label different from what was entered. This is expected.

Amazon also has instructions for editing bucket permissions.

Alternative: Define sharing policy with IAM

If you followed the instructions above to grant Upload/Delete permissions via the AWS Management Console, skip this step.

If you prefer defining a bucket policy to control access, here’s an example policy that permits Papertrail to List and Upload:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PapertrailLogArchive",
            "Effect": "Allow",
            "Principal": {
                "AWS": [ ... ]
            },
            "Action": [ ... ],
            "Resource": [
                "arn:aws:s3:::bucket-name/papertrail/logs/*"
            ]
        }
    ]
}
where bucket-name/papertrail/logs/ is the directory for Papertrail.

Tell Papertrail the bucket name

On Settings, enable S3 archive copies and provide the S3 bucket name.

Papertrail will perform a test upload as part of saving the bucket name (and will then delete the test file). Note that a new bucket can sometimes take several hours to become available, due to DNS propagation delays. If it fails, wait two hours, and try again.

When archives are uploaded to the bucket, each file is named under the path (key prefix) provided to Papertrail, typically papertrail/logs/<xxx> where <xxx> is an ID. For example, February 25, 2016 would be:


Days are from midnight to midnight UTC. Alternatively, an hourly archive file for 3 PM UTC would be:



Sharing bucket access in AWS Management Console (the bucket name and existing bucket user have been obscured):


Papertrail S3 archive copy settings:



Why does Papertrail support S3 but not Glacier?

Papertrail supports S3 rather than Glacier because:

Are archives encrypted at rest?

Yes, Papertrail takes advantage of S3’s server-side encryption so that archived data is encrypted at rest using AES-256.