Spamhaus Intelligence REST API

You can access the Spamhaus Intelligence API (SIA) through an HTTP REST interface that gives real-time access to IP and domain reputation data for the automated assessment and discovery of potential threats or changes.

The following is a list of all available endpoints and how to use them.

Login

Every request to the API requires an Authorization header containing a token, so you must authenticate before you can use SIA.

To fetch a temporary token, send a POST request to the /api/v1/login endpoint.

The POST payload should be a JSON object containing the username and password, as set up in the customer portal, together with the realm, as the following example shows:

curl -s -d '
    {
	"username":"[email protected]",
	"password":"m4g1c",
	"realm":"intel"
    }' https://api.spamhaus.org/api/v1/login
  • username is the email address associated with your SIA account.

  • password is the SIA password assigned to that user.

  • realm is always intel for SIA.

In the case of success, the HTTP status code will be 200 and the body will contain a JSON object similar to the following one:

{
  "code": 200,
  "token": "eyJ0eXAi[......]dx2UTSGcyEKvU",
  "expires": 1583252180
}

The successful JSON response object will include an “expires” integer field detailing the Unix timestamp of when the token will expire. Usually, each token is valid for 24 hours.

In the case of authentication failure, the API will return a different status code and an associated error message as part of the JSON object:

{
  "code": 401,
  "message": "Authentication failed"
}
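
The login exchange above can be sketched in Python with the standard library only. The helper names build_login_request and parse_login_response are illustrative and not part of the API; the URL and field names are taken from the examples above. The request is only built here, not sent.

```python
import json
import urllib.request

LOGIN_URL = "https://api.spamhaus.org/api/v1/login"  # endpoint from the text above


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the login endpoint."""
    payload = {"username": username, "password": password, "realm": "intel"}
    return urllib.request.Request(
        LOGIN_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )


def parse_login_response(body: str):
    """Return (token, expires) from a login response body, or raise on failure."""
    doc = json.loads(body)
    if doc.get("code") != 200 or "token" not in doc:
        # Failure bodies carry a "message" field, as in the 401 example above.
        raise PermissionError(doc.get("message", "login failed"))
    return doc["token"], int(doc["expires"])
```

To send the request, pass it to urllib.request.urlopen. Note that urlopen raises urllib.error.HTTPError for non-2xx responses, so a failed login surfaces as an exception whose body can be read from the error object.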

When an auth token has expired, any subsequent API request will result in a 401 Unauthorized HTTP response code. In this case, you need a new auth token before sending further API requests. However, there is no need to wait until a token has expired before requesting a new one. We recommend refreshing the token when the current one is close to expiration.

NOTE: Please do NOT request an auth token for each API request. This action is penalized to protect against brute force attacks and may result in API access being temporarily slowed down or blocked. Additionally, this approach will slow the service due to the added latency introduced by the repeated authentication sessions.
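
One way to follow this advice is to cache the token and log in again only when it is close to expiry. The sketch below is a minimal illustration, not an official client: the TokenCache class, the login callable and the 300-second margin are all assumptions. The login callable is expected to return the token and the “expires” timestamp from the login response.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuse one token until it is close to expiry, then fetch a new one."""

    def __init__(self, login: Callable[[], Tuple[str, int]], margin: int = 300):
        self._login = login      # hypothetical callable returning (token, expires)
        self._margin = margin    # refresh this many seconds early (assumed value)
        self._token: Optional[str] = None
        self._expires = 0

    def get(self, now: Optional[float] = None) -> str:
        """Return a valid token, logging in only when needed."""
        now = time.time() if now is None else now
        if self._token is None or now >= self._expires - self._margin:
            self._token, self._expires = self._login()
        return self._token
```

With a token valid for 24 hours, this performs roughly one login per day regardless of how many API requests are made.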

The Authorization header

Once logged in, you must store the token value indicated in the response payload.

The token value is needed to assemble the authentication header. You need to add this header to every request to the API, excluding the login calls.

The Authorization header must be in the form:

Authorization: Bearer <AUTH TOKEN>

<AUTH TOKEN> is the string obtained through the previously described API call.

Please note that you should remove the < and > symbols, which only delimit the token placeholder in this example. Replace the full “<AUTH TOKEN>” string with the real token.
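
As a small illustration, the header can be attached in Python as follows. The function name authorized_request is hypothetical; the request object is only constructed here, not sent.

```python
import urllib.request


def authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying the bearer token (nothing is sent here)."""
    # The token is inserted as-is, without the < and > placeholder symbols.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```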

Limits and counters

SIA limits the number of queries you can run over specified time frames. These limits are associated with the product tier that you have subscribed to. If you need to change these limits, please contact your sales representative.

The limits API endpoint allows you to check the maximum query limits applied to your account and the number of queries used for each limit, e.g. 24 hours and 30 days.

Usage example:

curl -sH 'Authorization: Bearer <AUTH TOKEN>' \
        https://api.spamhaus.org/api/intel/v1/limits

This API call returns a JSON object like this:

{
  "status": 200,
  "account": {
    "sub": "3534543",
    "usr": "[email protected]"
  },
  "limits": {
    "ads": "XBL,BCL,CSS",
    "trs": "base",
    "qms": 1000,
    "qmh": 1500,
    "rl_qph": 3600,
    "rl_qpm": 60,
    "rl_qps": 1
  },
  "current": {
    "qpm": 18,
    "qpd": 18,
    "rl_qph": 5,
    "rl_qpm": 0,
    "rl_qps": 0
  }
}

Field explanations:

  • status - will be 200 in the case of success; otherwise an error occurred

  • account - an object which shows the account properties:

    • sub - shows the account subscription identifier

    • usr - shows the account username

  • limits - an object containing the global limits of the account

    • ads - allowed queries datasets (comma separated list)

    • trs - identifier of the access level, defaults to “base”

    • qms - an integer showing the max number of queries per month allowed (soft limit)

    • qmh - an integer showing the max number of queries per month allowed (hard limit)

    • rl_qph - an integer showing the enforced rate limits (queries per hour)

    • rl_qpm - an integer showing the enforced rate limits (queries per minute)

    • rl_qps - an integer showing the enforced rate limits (queries per second)

  • current - an object containing the current counters of the account

    • qpm - shows the number of actual queries performed during the current month

    • qpd - shows the number of actual queries performed during the current day

    • rl_qph - an integer showing the rate limit current counter (queries per hour)

    • rl_qpm - an integer showing the rate limit current counter (queries per minute)

    • rl_qps - an integer showing the rate limit current counter (queries per second)

The rate limit values are primarily a safeguard against abuse that could degrade the system. The contractual tier of the user is defined by qms and qmh. The soft limit is the contractually allowed number of queries per month. If you regularly exceed the soft limit, the account’s query volumes will need reviewing.

If the hard limit is reached, the system will refuse all subsequent queries with a 429 - Too many requests HTTP error code.

The same error is returned if your query rate is too high and you hit one of the rate limit values (the various rl_qpX entries). If this is the case, please add a delay between API calls to slow down your query rate.

The rate-limiting engine thresholds are proportional to the maximum number of daily queries your account can perform based on account type.
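
The limits response can be used to pace a client. The following sketch derives the remaining monthly allowance and a minimum spacing between calls from the fields described above. The function names are illustrative, and the spacing assumes that every call counts as one query against the rate limits, which the text does not confirm.

```python
def remaining_queries(limits_response: dict) -> dict:
    """Monthly queries left before the soft (qms) and hard (qmh) limits."""
    limits = limits_response["limits"]
    used = limits_response["current"]["qpm"]
    return {
        "soft": max(limits["qms"] - used, 0),
        "hard": max(limits["qmh"] - used, 0),
    }


def min_delay_seconds(limits_response: dict) -> float:
    """Smallest spacing between calls that respects every rl_qpX rate limit."""
    limits = limits_response["limits"]
    # Assumption: each call is counted as exactly one query by the rate limiter.
    return max(
        1.0 / limits["rl_qps"],
        60.0 / limits["rl_qpm"],
        3600.0 / limits["rl_qph"],
    )
```

Sleeping for at least min_delay_seconds between requests is one simple way to add the delay requested above.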

NOTE: Not all queries are counted as 1. Queries for CIDR resources have a multiplier applied relating to the size of the requested CIDR:

  • a query for a /32 IPv4 (or a /64 IPv6) is counted as 1

  • a query for a /31 IPv4 (or a /63 IPv6) is counted as 2

  • a query for a /24 IPv4 (or a /56 IPv6) is counted as 9

To calculate the “count” of a query for a CIDR, compute log2(X) + 1, where X is the number of distinct resources (addresses in the case of IPv4, /64s in the case of IPv6) contained in the CIDR.
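
The counting rule can be expressed with Python's ipaddress module. The function name query_cost is illustrative. IPv6 prefixes longer than /64 are not covered by the rule above, so this sketch rejects them; that behavior is an assumption.

```python
import ipaddress


def query_cost(cidr: str) -> int:
    """Count of a CIDR query: log2(X) + 1, X = addresses (IPv4) or /64s (IPv6)."""
    net = ipaddress.ip_network(cidr, strict=False)
    unit = 32 if net.version == 4 else 64  # prefix length of one countable resource
    if net.prefixlen > unit:
        # Assumption: the text does not say how such prefixes are counted.
        raise ValueError("prefix is longer than a single resource")
    # X = 2 ** (unit - prefixlen), so log2(X) is the difference in prefix lengths.
    return (unit - net.prefixlen) + 1
```

For example, query_cost("45.150.206.0/24") gives 9, matching the list above.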

IP reputation data

Search by CIDR

This API endpoint returns either current or historic listing information for a specific network block.

The query URL is:

  • GET /api/intel/v1/byobject/cidr/<DATASET>/<MODE>/<TYPE>/<IPADDRESS>[/<MASK>][?get_arguments..]

Arguments:

  • DATASET - identifies the dataset to search. Allowed values are: XBL, CSS, BCL or ALL

  • MODE - CIDR search mode. Can have two different values:

    • listed - searches all submissions contained within the specified netmask

    • listings - searches all submissions that contain the specified netmask (this should only be used when querying a dataset that allows listings of variable sizes, and will be implemented in the future)

  • TYPE - can be either “live” or “history”:

    • live - only returns listings that are still “active”, i.e., they are currently listed and haven’t yet expired from the Spamhaus datasets

    • history - returns any record seen within the (implicit or explicit) time window, including expired data

  • IPADDRESS - IP address to look for

  • MASK - optional netmask to use. It defaults to 32 for IPv4 or 64 for IPv6; the largest block that can be queried is a /24 for IPv4 or a /56 for IPv6

Available GET arguments:

  • limit - constrains the number of rows returned by the query. The default is 500 and the maximum is 2000. If you pass a value greater than 2000, an error is returned.

  • since - extracts results with a timestamp greater than or equal to ‘since’ (Unix timestamp); defaults to 12 months before ‘until’ if not passed

  • until - extracts results with a timestamp less than or equal to ‘until’ (Unix timestamp); defaults to the current timestamp if not passed

NOTE: When querying for historical data, the maximum timeframe allowed is 12 months; passing a larger interval will result in an error code. If a larger timespan is required, multiple queries are needed.
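
When a longer history is needed, the range can be cut into consecutive windows and queried one window at a time. The sketch below is illustrative only: the name split_time_range is hypothetical, and treating “12 months” as 365 days is an assumption, so choose a smaller window if the API rejects it.

```python
MAX_WINDOW = 365 * 24 * 3600  # assumed reading of "12 months", in seconds


def split_time_range(since: int, until: int, window: int = MAX_WINDOW):
    """Yield (since, until) pairs covering [since, until], each at most `window` long."""
    if since > until:
        raise ValueError("since must not be later than until")
    start = since
    while start <= until:
        end = min(start + window, until)
        yield start, end
        # Both bounds are inclusive, so the next window starts one second later.
        start = end + 1
```

Each pair can then be passed as the since and until arguments of a separate query.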

Some usage examples:

Get active listings for 45.150.206.114 in XBL

curl -s https://api.spamhaus.org/api/intel/v1/byobject/cidr/XBL/listed/live/45.150.206.114 \
	-H 'Authorization: Bearer <AUTH TOKEN>'

Get last 10 listing events for IPs in 45.150.206.0/24 from XBL

curl -s 'https://api.spamhaus.org/api/intel/v1/byobject/cidr/XBL/listed/history/45.150.206.0/24?limit=10' \
	-H 'Authorization: Bearer <AUTH TOKEN>'

Get submissions from a specified time range for 45.150.206.0/24

curl -s 'https://api.spamhaus.org/api/intel/v1/byobject/cidr/XBL/listed/history/45.150.206.0/24?since=1606600000&until=1606750000' \
	-H 'Authorization: Bearer <AUTH TOKEN>'
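
The same URLs can be assembled programmatically. This sketch builds a CIDR search URL and checks only the bounds stated in this section; the function name cidr_query_url and its parameter names are hypothetical.

```python
import ipaddress
from urllib.parse import urlencode

BASE = "https://api.spamhaus.org/api/intel/v1/byobject/cidr"


def cidr_query_url(dataset, mode, qtype, address, mask=None, **params):
    """Build a CIDR search URL, checking arguments against the documented bounds."""
    if dataset not in {"XBL", "CSS", "BCL", "ALL"}:
        raise ValueError("unknown dataset")
    if mode not in {"listed", "listings"}:
        raise ValueError("unknown mode")
    if qtype not in {"live", "history"}:
        raise ValueError("unknown type")
    ip = ipaddress.ip_address(address)
    widest = 24 if ip.version == 4 else 56  # largest block allowed per the text
    if mask is not None and mask < widest:
        raise ValueError("netmask is wider than allowed")
    if params.get("limit", 0) > 2000:
        raise ValueError("limit must not exceed 2000")
    url = f"{BASE}/{dataset}/{mode}/{qtype}/{ip}"
    if mask is not None:
        url += f"/{mask}"
    if params:
        url += "?" + urlencode(params)  # limit, since, until
    return url
```

For instance, cidr_query_url("XBL", "listed", "history", "45.150.206.0", 24, limit=10) reproduces the URL of the second curl example.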

Output

Independent of the type of query against the data, all successful queries will return a JSON object containing a code field whose value matches the HTTP status code.

Here is an example of a query returning no data:

{"code":404}

If the query results in data being returned, the JSON object provides an array named results with all the records returned by the query.

For example: https://api.spamhaus.org/api/intel/v1/byobject/cidr/XBL/listed/history/74.77.66.227?limit=2 would return an object like the following:

{
  "code": 200,
  "results": [
    {
      "dataset": "XBL",
      "ipaddress": "74.77.66.227",
      "asn": "11351",
      "cc": "US",
      "listed": 1606757120,
      "seen": 1606757113,
      "valid_until": 1607361913,
      "rule": "01a400d5",
      "botname": "unknown",
      "detection": "SMTP impersonation",
      "dstport": 25,
      "helo": "outlook.com",
      "heuristic": "IMPERSONATE",
      "lat": 43.0505,
      "lon": -78.853,
      "srcip": "74.77.66.227"
    },
    {
      "dataset": "XBL",
      "ipaddress": "74.77.66.227",
      "asn": "11351",
      "cc": "US",
      "listed": 1606063971,
      "seen": 1606063960,
      "valid_until": 1606668760,
      "rule": "01a400d5",
      "botname": "unknown",
      "detection": "SMTP impersonation",
      "dstport": 25,
      "helo": "outlook.com",
      "heuristic": "IMPERSONATE",
      "lat": 43.0505,
      "lon": -78.853,
      "srcip": "74.77.66.227"
    }
  ]
}

For details of the fields included in each record, please refer to the relevant dataset’s documentation.
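
A response of this shape can be processed as sketched below. The function names are illustrative, and treating valid_until as the moment a listing expires is an assumption based on the field name; check the dataset documentation before relying on it.

```python
import json
from datetime import datetime, timezone


def active_records(body: str, now: int) -> list:
    """Return records from a response body whose valid_until lies after `now`."""
    doc = json.loads(body)
    if doc.get("code") != 200:
        return []  # e.g. {"code":404} means the query returned no data
    # Assumption: valid_until is the Unix timestamp at which the listing expires.
    return [r for r in doc.get("results", []) if r["valid_until"] > now]


def to_utc(ts: int) -> str:
    """Render a Unix timestamp from the API as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
```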

Domain reputation data

Search by domain

This API endpoint returns the reputation data for a given domain.

The query URL is:

  • GET /api/intel/v1/byobject/domain/rep/<DOMAIN>

Arguments:

  • DOMAIN - identifies the domain for which reputation data is being requested. This needs to be the bare domain (such as example.com) and not a hostname inside it (such as www.example.com).

Output

All successful queries will return a JSON object containing a code field whose value matches the HTTP status code.

Here is an example of a query returning no data:

{"code":404}

If the query results in data being returned, the JSON object provides a result object containing all the data made available.

For example:
https://api.spamhaus.org/api/intel/v1/byobject/domain/rep/example.com would return an object like the following:

{
  "code": 200,
  "result": {
    "domain": "example.com",
    "reputation": "great",
    "registrar": "RESERVED-Internet Assigned Numbers Authority",
    "date_created": 808358400,
    "first_seen": 1248469080,
    "last_seen": 1661863800,
    "trusted_tld": false,
    "corporate_registrar": false,
    "ns": [
      {
        "hostname": "a.iana-servers.net",
        "first_seen": 1250643120,
        "last_seen": 1661863800,
        "reputation": "great"
      },
      {
        "hostname": "b.iana-servers.net",
        "first_seen": 1250643120,
        "last_seen": 1661863800,
        "reputation": "great"
      }
    ],
    "senders": [
      {
        "ip": "93.95.228.211",
        "last_seen": 1661863800
      },
      {
        "ip": "95.111.251.196",
        "last_seen": 1661863800
      },
      {
        "ip": "103.149.120.10",
        "last_seen": 1661863800
      },
      {
        "ip": "107.191.56.52",
        "last_seen": 1661863800
      },
      {
        "ip": "108.170.43.243",
        "last_seen": 1661863800
      },
      {
        "ip": "111.90.148.163",
        "last_seen": 1661863800
      },
      {
        "ip": "123.231.243.132",
        "last_seen": 1661863800
      },
      {
        "ip": "151.236.57.12",
        "last_seen": 1661863800
      },
      {
        "ip": "200.7.39.182",
        "last_seen": 1661863800
      },
      {
        "ip": "204.15.146.3",
        "last_seen": 1661863800
      }
    ]
  }
}

For details of the fields included in each record, please refer to the domain reputation data under the “Anatomy of the data” chapter.
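
Unlike the CIDR search, which returns a results array, this endpoint returns a single result object. A minimal sketch of reading it follows; the function name summarize_domain and the shape of the returned summary are illustrative choices.

```python
import json


def summarize_domain(body: str) -> dict:
    """Extract a few headline fields from a domain reputation response body."""
    doc = json.loads(body)
    if doc.get("code") != 200:
        return {}  # e.g. {"code":404} means no data for this domain
    res = doc["result"]
    return {
        "domain": res["domain"],
        "reputation": res["reputation"],
        "nameservers": [ns["hostname"] for ns in res.get("ns", [])],
        "sender_ips": [s["ip"] for s in res.get("senders", [])],
    }
```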

Dataset Download

This API endpoint allows you to download an entire current dataset in compressed format. Access to this API endpoint is only granted to customers who have subscribed to enterprise access for that specific dataset.

The query URL is:

  • GET /api/intel/v1/download/ext/<DATASET>

Arguments list:

  • DATASET - identifies the dataset to download. Currently supported dataset values are: bcl, xbl, css

Usage example:

# get eBCL full dataset export file
wget --header="Authorization: Bearer <AUTH TOKEN>" \
   --output-document=bcl.tgz \
   https://api.spamhaus.org/api/intel/v1/download/ext/bcl

This API will return binary data. The output is a tar archive compressed with gzip (.tgz extension).
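
Once downloaded, the archive can be inspected with Python's tarfile module. The sketch below only lists the member names and does not extract anything; the function name list_archive is illustrative, and the layout of the archive is not described here.

```python
import tarfile


def list_archive(path: str) -> list:
    """Return the member names of a .tgz (tar + gzip) archive without extracting it."""
    with tarfile.open(path, mode="r:gz") as archive:
        return archive.getnames()
```

For example, list_archive("bcl.tgz") would list the contents of the file saved by the wget command above.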

Return Codes

The dataset download API will return the following HTTP codes:

  • 200 - Download OK

  • 401 - User not allowed to access the functionality

  • 404 - Specified dataset file not found

Additional notes and troubleshooting

Other return codes in the 4xx range are possible and generally indicate a problem with the user's access to the functionality.

If you get a 401 Unauthorized HTTP status code, you are probably sending the wrong authentication header.

If you are getting a 403 Forbidden HTTP status code, your authentication token has probably expired. Please renew it.

If you are consistently hitting a 429 error, you are probably hitting the API too hard. Please have a look at the Limits and counters chapter and adapt accordingly.

If your query returns a 400 Bad Request error, please check that your query is within the specifications. Check that the time window is less than 1 year, that the limit upper bound is correct or that the netmasks are properly within bounds.
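
The troubleshooting notes can be condensed into a lookup that a client consults after each response. This is a sketch only: the function name suggested_action and the wording of the returned hints are illustrative.

```python
def suggested_action(status: int) -> str:
    """Map an HTTP status code to the remedy suggested in the notes above."""
    actions = {
        200: "ok",
        400: "check the time window, the limit and the netmask",
        401: "check the Authorization header or log in again",
        403: "renew the token",
        404: "no data, or dataset file not found",
        429: "slow down and review the limits and counters",
    }
    if status in actions:
        return actions[status]
    if 400 <= status < 500:
        return "access problem: check the request and the account"
    return "unexpected status"
```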