Search API

Your apps can make HTTP requests to Papertrail to programmatically search for events.

To manage account resources, such as to create groups or register new senders, see HTTP API.

Introduction

The log search API endpoint is one part of Papertrail’s HTTP API. All API calls use the same authentication and request/response format.

The search API URL is https://papertrailapp.com/api/v1/events/search.json.

Examples

The most basic example is simply to hit the search API endpoint, https://papertrailapp.com/api/v1/events/search.json. Try it. You’ll receive the most recent 100 log events.

Because session cookies are not used with the API, you may be prompted to re-authenticate even when already logged in to the papertrailapp.com Web site.

curl

These examples use curl, a command-line HTTP client.

Authenticate with an API token and search events for Critical error as a quoted phrase:

curl -v -H "X-Papertrail-Token: abc123" "https://papertrailapp.com/api/v1/events/search.json?q='Critical%20error'"
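The same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the function name build_search_request is hypothetical, and abc123 is the placeholder token from the curl example. The request is built but not sent, so the snippet runs without network access.

```python
from urllib.parse import urlencode, quote
from urllib.request import Request

SEARCH_URL = "https://papertrailapp.com/api/v1/events/search.json"


def build_search_request(token, query):
    """Build (but do not send) a GET request equivalent to the curl command.

    build_search_request is a hypothetical helper name, not part of any
    Papertrail library.
    """
    # quote_via=quote encodes spaces as %20 rather than "+".
    query_string = urlencode({"q": query}, quote_via=quote)
    return Request(
        f"{SEARCH_URL}?{query_string}",
        headers={"X-Papertrail-Token": token},
    )


request = build_search_request("abc123", "'Critical error'")
print(request.full_url)
```

To send it, pass the request to urllib.request.urlopen and decode the body with json.load.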

Papertrail’s own papertrail-cli uses this search API to retrieve events. Install the gem or read search_query.rb source.

Search

Background

The search API returns a set of events between a minimum and/or maximum ID or time and meeting a set of constraints (like matching a search query or in a group).

Further, Papertrail’s workload varies significantly based on the number of possible events and the complexity of the query. For example, a complex search of 10 billion events takes longer than retrieving the newest events from a few million. Even though a search may take time, an API client can’t be expected to wait forever, and may be able to act on a subset of all results (for example, outputting them to the user).

To ensure that the caller does not block forever, the search API automatically enforces a per-request timeout of about 5 seconds. The response contains results found so far (if any), plus extra fields so that the API client can choose to continue the search without simply receiving the same set of results again. These extra fields are min_id and max_id.

Common situations

For example, consider 2 common cases:

Should I use the API?

Live tail and time-constrained search are both implemented by Papertrail’s CLI. No API integration is required to perform these tasks.

Requests

Depending on the operation (new events, older events) and scope (all logs or specific group or system), API clients may provide these optional query parameters. Each is discussed in detail below.

Reference implementation

The papertrail-cli command-line tools exclusively use Papertrail’s APIs, including the one documented here.

The connection.rb file encapsulates most of the query logic.

Summary

Less frequently needed:

Search query (q)

To search for a specific message or string, add the optional parameter q (as in query). All search queries that work in the Papertrail Web interface should work in the API.

All parameters should be URL-encoded, as is standard for GET query strings. For example, the search string:

bob OR ("some phrase" AND sally)

would be this URL-encoded GET query string:

q=bob%20OR%20(%22some%20phrase%22%20AND%20sally)
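Most HTTP libraries do this encoding for you. As a small sketch, Python's urllib.parse.quote reproduces the string above; the variable names are illustrative only.

```python
from urllib.parse import quote

search = 'bob OR ("some phrase" AND sally)'

# safe="()" leaves parentheses unescaped to match the string shown above;
# escaping them as %28 and %29 would also be valid URL encoding.
query_string = "q=" + quote(search, safe="()")
print(query_string)
```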

Constrain Scope (group_id or system_id)

To limit results to only a specific group or system, include system_id or group_id in the query string. For example:

search.json?system_id=1234
search.json?group_id=2345

Also, system names which contain only numbers, letters, and underscores (like hostnames) can be used as arguments for system_id, in lieu of the Papertrail ID. The attribute is the Papertrail system name (name in the system JSON hash).

This can simplify linking to results for a single system, since no name-to-ID mapping API query is necessary. For example:

search.json?system_id=www42
search.json?system_id=my-big-server
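A client that supports several optional parameters might assemble the query string from whichever ones are set. The helper below is a sketch under that assumption; search_path is a hypothetical name, and it only covers the q, system_id, and group_id parameters described so far.

```python
from urllib.parse import urlencode, quote


def search_path(q=None, system_id=None, group_id=None):
    """Return search.json plus a query string for the parameters provided.

    system_id may be a numeric Papertrail ID or a system name.
    """
    candidates = {"q": q, "system_id": system_id, "group_id": group_id}
    # Drop parameters the caller did not supply.
    params = {key: value for key, value in candidates.items() if value is not None}
    if not params:
        return "search.json"
    return "search.json?" + urlencode(params, quote_via=quote)


print(search_path(system_id=1234))
print(search_path(group_id=2345, q="Critical error"))
```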

Responses

Papertrail responds with a JSON hash containing 3 important keys: events, min_id, and max_id.

In addition, either reached_beginning or reached_time_limit may be true to indicate the reason why the request ended. See below for more.

Within each log event hash, the following keys are defined:

Example response

Here is a response with 2 log events:

{
  "max_id":"7711582041804800",
  "min_id":"7711561783320576",
  "events":[
    {"hostname":"abc","received_at":"2011-05-18T20:30:02-07:00","severity":"Info","facility":"Cron","source_id":2,"message":"message body","program":"CROND","source_ip":"208.75.57.121","display_received_at":"May 18 20:30:02","id":7711561783320576,"source_name":"abc"},
    {"hostname":"def","received_at":"2011-05-18T20:30:02-07:00","severity":"Info","facility":"Cron","source_id":19,"message":"A short event","program":"CROND","source_ip":"208.75.57.120","display_received_at":"May 18 20:30:02","id":7711562567655424,"source_name":"server1"}
  ],
  "reached_beginning":true
}
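To show how a client might consume this, here is a short Python sketch that decodes a trimmed copy of the response above and prints one line per event. The format_event helper and the output layout are illustrative choices, not something the API prescribes.

```python
import json

# Trimmed copy of the example response above (some event keys omitted).
body = """
{
  "max_id": "7711582041804800",
  "min_id": "7711561783320576",
  "events": [
    {"hostname": "abc", "program": "CROND", "message": "message body",
     "display_received_at": "May 18 20:30:02", "id": 7711561783320576},
    {"hostname": "def", "program": "CROND", "message": "A short event",
     "display_received_at": "May 18 20:30:02", "id": 7711562567655424}
  ],
  "reached_beginning": true
}
"""

response = json.loads(body)


def format_event(event):
    """Render one event as a syslog-like line (layout is an assumption)."""
    # program may be null when the message did not define one.
    program = event["program"] or "-"
    return f'{event["display_received_at"]} {event["hostname"]} {program}: {event["message"]}'


lines = [format_event(event) for event in response["events"]]
for line in lines:
    print(line)
```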

Types of responses

Each search query will return 1 of 3 types of responses, which your app can use to decide what to do next. The 3 possible responses are:

Think of this as the disposition of the request. While casual implementations of Papertrail’s search API do not need to handle all cases, thorough implementations should. If you have any questions about how your app can best handle these responses, contact us or see examples. Papertrail’s command-line client uses this API call to do time-based searches, such as:

papertrail --min-time "midnight" abc def

Empty result set

A response containing no matching events will return an empty events array, such as:

{"max_id":"0","reached_beginning":true,"events":[],"min_id":"0"}
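A client can inspect the flags in the response to decide what to do next. The sketch below checks only the two flags named in this document, reached_beginning and reached_time_limit; the function name and the "other" label are assumptions made for illustration.

```python
import json


def disposition(response):
    """Classify a decoded search response by the flag it carries.

    Only reached_beginning and reached_time_limit are checked; "other" is a
    hypothetical label for responses that set neither flag.
    """
    if response.get("reached_beginning"):
        return "reached_beginning"
    if response.get("reached_time_limit"):
        return "reached_time_limit"
    return "other"


empty = json.loads('{"max_id":"0","reached_beginning":true,"events":[],"min_id":"0"}')
print(disposition(empty), len(empty["events"]))
```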

received_at and display_received_at are in the time zone of the API token owner (see Profile).

Displaying the time range searched

In addition to min_id and max_id, the response hash will also contain min_time_at and max_time_at. These are the oldest (min_time_at) and newest (max_time_at) timestamps searched during this request, in Unixtime. These are purely informational, such as to display the progress of a recursive search (“Searched back to Tuesday, November 11 at 11:33 AM”).

Note: because multiple events may have occurred during the same second, Unixtime is not granular enough to serve as a “cursor” for subsequent requests. These response fields are informational and should not be used as request parameters in subsequent queries. Instead, use min_id and max_id.
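As a sketch of the informational use, the helper below turns min_time_at into a progress message. It assumes the value arrives as integer seconds (or a numeric string) and displays it in UTC; describe_progress is a hypothetical name.

```python
from datetime import datetime, timezone


def describe_progress(response):
    """Format the oldest timestamp searched as a human-readable message."""
    # int() accepts either an integer or a numeric string (an assumption
    # about how the field is typed).
    oldest = datetime.fromtimestamp(int(response["min_time_at"]), tz=timezone.utc)
    return f"Searched back to {oldest:%A, %B %d at %I:%M %p} UTC"


print(describe_progress({"min_time_at": 1305775802}))
```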

How search works

Retrieving current logs (live tail)

Apps may implement a “live tail” (tail -f) style display by performing multiple successive searches for the same search query (or no search query). To do this, the max_id value from the prior result set must be passed back to the Papertrail API as the min_id parameter.

When your search query changes, omit the min_id from the prior search’s result set.
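The polling pattern can be sketched as a small generator. Here fetch is a caller-supplied function (hypothetical) that takes a dict of query parameters, performs the HTTP request, and returns the decoded JSON response. The polls and delay arguments, and the choice to advance the cursor only when events were returned, are assumptions made for this sketch.

```python
import time


def tail(fetch, query=None, polls=3, delay=2.0):
    """Yield events from successive searches, passing max_id back as min_id."""
    min_id = None
    for _ in range(polls):
        params = {}
        if query:
            params["q"] = query
        if min_id is not None:
            params["min_id"] = min_id
        response = fetch(params)
        yield from response["events"]
        # Defensive assumption: keep the previous cursor when a poll
        # returns no events.
        if response["events"]:
            min_id = response["max_id"]
        time.sleep(delay)
```

Starting a new call to tail with a different query begins without a min_id, which matches the guidance above.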

Retrieving older events

By Event ID

To obtain older events (rather than current events or tailing current events), a max_id may be passed to Papertrail. This is the newest (highest) ID that your app would like to see, and Papertrail will return the newest events that are older than that ID.

To scroll back through older events, like for a progressively-older tail, the min_id from a Papertrail response would be passed back in as the max_id. You’ll receive events older than that response.
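Scrolling back can be sketched the same way. As before, fetch is a hypothetical caller-supplied function that takes query parameters and returns the decoded JSON response. The max_requests cap and stopping on reached_beginning are choices made for this sketch.

```python
def scroll_back(fetch, query=None, max_requests=10):
    """Yield progressively older events, passing min_id back as max_id."""
    max_id = None
    for _ in range(max_requests):
        params = {}
        if query:
            params["q"] = query
        if max_id is not None:
            params["max_id"] = max_id
        response = fetch(params)
        yield from response["events"]
        if response.get("reached_beginning"):
            break  # assumed to mean there is nothing older to retrieve
        # The oldest ID seen becomes the upper bound of the next request.
        max_id = response["min_id"]
```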

The id values for a series of events will only increase, so they do indicate relative event order. However, id values are not sequential.

By Time

If you have a timestamp but not an event ID, pass max_time or min_time with your first request (rather than max_id or min_id). These should be in Unix time, GMT.

After the first request/response, if successive queries are needed, provide max_id as in “By Event ID” above.
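Producing the Unix time for that first request is straightforward in most languages. A minimal Python sketch, assuming the caller passes timezone-aware datetime values; first_request_params is a hypothetical helper name.

```python
from datetime import datetime, timezone


def first_request_params(min_time=None, max_time=None, query=None):
    """Build parameters for an initial time-bounded search.

    Pass timezone-aware datetimes; naive ones would be interpreted in the
    local time zone by timestamp().
    """
    params = {}
    if query:
        params["q"] = query
    if min_time is not None:
        params["min_time"] = int(min_time.timestamp())
    if max_time is not None:
        params["max_time"] = int(max_time.timestamp())
    return params


start = datetime(2011, 5, 19, 3, 30, 2, tzinfo=timezone.utc)
print(first_request_params(min_time=start))
```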

64-bit integers in JSON

Papertrail uses 64-bit event IDs, which JavaScript has trouble with, so the id value is a string. The other values are consistently set to the type you would expect. No values should be null except for program (when none is defined by the message). An empty message body is a blank string ("").
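The underlying problem is that a double-precision float, which is what JavaScript uses for numbers, cannot represent every integer above 2^53. The Python sketch below demonstrates the loss and shows one way a client might normalize IDs; event_id is a hypothetical helper that accepts either a string or an integer.

```python
import json

big_id = 2**53 + 1

# Round-tripping through a double silently changes the value.
lossy = int(float(big_id))
print(big_id, lossy)

# Python integers are arbitrary precision, so json.loads keeps large IDs exact.
exact = json.loads('{"id": 9007199254740993}')["id"]


def event_id(event):
    """Return the event ID as an int, whether it arrived as a string or a number."""
    return int(event["id"])
```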

Submitting log messages

HTTP is very poorly suited for realtime logging. It’s probably obvious from this API that we love HTTP. However, its characteristics are almost the polar opposite of what makes an easy, resilient log sender.

Here’s why HTTP doesn’t do your app any favors:

The final and biggest reason: Sending a syslog packet is extremely easy because at its core, a “syslog packet” is just a simple string (think printf). Generating and transmitting it is usually 2-4 lines of code. It’s often less code and easier to follow than generating an HTTP request (and is much shorter and more elegant than HTTP log submission).
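To illustrate how little code is involved, here is a sketch that formats an RFC 3164-style syslog line and sends it over UDP. The function names, the default program name, and the facility and severity defaults are assumptions for this example. The host and port must be the log destination for your own account; none is assumed here.

```python
import socket
from datetime import datetime


def format_syslog(message, program="myapp", hostname=None,
                  facility=1, severity=6, now=None):
    """Return an RFC 3164-style syslog packet as bytes.

    Defaults are illustrative: facility 1 (user) and severity 6 (info).
    """
    now = now or datetime.now()
    hostname = hostname or socket.gethostname()
    priority = facility * 8 + severity
    # RFC 3164 pads single-digit days with a space rather than a zero.
    timestamp = f"{now:%b} {now.day:2d} {now:%H:%M:%S}"
    return f"<{priority}>{timestamp} {hostname} {program}: {message}".encode("utf-8")


def send_syslog(message, host, port, **kwargs):
    """Send one syslog packet over UDP to the given destination."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_syslog(message, **kwargs), (host, port))
```

Calling send_syslog("Critical error", host, port) with your destination's host and port transmits a single event.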

So, although we love HTTP for some things, it’s a really bad fit for transmitting logs. Papertrail makes it easy to not encounter these problems.

Feel free to request a code sample for your language via support@papertrailapp.com.