Spamhaus Intelligence API
The Spamhaus Intelligence API (SIA) is a framework to provide convenient access for users to different datasets, giving them flexibility to utilize this data across multiple applications, services and products.
The framework comprises several API endpoints. One authenticates users and issues a token; that token is then used to query the other endpoints.
The following is a list of all available endpoints and how to use them.
Login
To access the API, each request must have a proper Authorization header.
The Authorization header must be in the form:
Authorization: Bearer <AUTH TOKEN>
<AUTH TOKEN> is a string obtained through the following login API call.
Send a POST request to the Login API, at the URL path /api/v1/login, containing a JSON object with the username, password and realm fields set to the correct values, as the following example shows:
curl -s -d '{"username":"[email protected]", "password":"m4g1c", "realm":"intel"}' \
https://api.spamhaus.org/api/v1/login
username - the email address associated with your SIA account
password - the SIA password assigned to that user
realm - as far as SIA is concerned, always intel
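The login call above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names build_login_request and login are hypothetical, and the Content-Type header is an assumption, since the curl example does not set one explicitly.

```python
import json
import urllib.request

# URL taken from the curl example above.
LOGIN_URL = "https://api.spamhaus.org/api/v1/login"


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the login endpoint."""
    payload = {"username": username, "password": password, "realm": "intel"}
    body = json.dumps(payload).encode("utf-8")
    # Assumption: the API accepts a JSON content type; the curl example
    # relies on curl's default content type instead.
    return urllib.request.Request(
        LOGIN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def login(username: str, password: str) -> dict:
    """Send the login request and return the decoded JSON body.

    urllib raises urllib.error.HTTPError for non-2xx responses such as 401.
    """
    with urllib.request.urlopen(build_login_request(username, password)) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect before any credentials leave the machine.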
In the case of success, the HTTP status code will be 200 and the body will contain a JSON object similar to the following one:
{
"code": 200,
"token": "eyJ0eXAi[......]dx2UTSGcyEKvU",
"expires": 1583252180
}
The successful JSON response object includes an “expires” integer field holding the unix timestamp at which the token will expire. Tokens are usually valid for 24 hours.
In the case of authentication failure, the API will return a different status code and an associated error message as part of the JSON object:
{
"code": 401,
"message": "Authentication failed"
}
When an auth token has expired, any subsequent API request will result in a 401 Unauthorized HTTP response code. In this case, a new auth token is needed before sending further API requests.
However, there is no need to wait until a token has expired before requesting a new one. We recommend refreshing the token when the current one is close to expiration.
NOTE: Please do NOT request an auth token for each API request. To protect against brute force attacks this action is penalized and may result in API access being temporarily blocked. Additionally, it will make the service extremely slow due to the added latency introduced by the repeated authentication sessions.
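One way to follow this advice is to cache the token and request a new one only when the cached one is close to its "expires" timestamp. The sketch below assumes a 15 minute safety margin, which is an illustrative choice and not a value given by the API; the function names are hypothetical.

```python
import time
from typing import Optional

# Hypothetical safety margin; the documentation only says "close to expiration".
REFRESH_MARGIN_SECONDS = 15 * 60


def needs_refresh(expires: int, now: Optional[float] = None,
                  margin: int = REFRESH_MARGIN_SECONDS) -> bool:
    """Return True when the cached token is within `margin` seconds of expiry."""
    if now is None:
        now = time.time()
    return now >= expires - margin


def auth_header(token: str) -> dict:
    """Build the Authorization header required by every API request."""
    return {"Authorization": f"Bearer {token}"}
```

A client would call needs_refresh before each request and log in again only when it returns True, reusing the same token otherwise.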
Limits and counters
Access to SIA comes with limits to the number of queries that can be run over specified time frames.
The limits API endpoint lets you check the maximum query limits applied to the account in use, and the number of queries used so far against each limit, e.g. over 24 hours and 30 days.
Usage example:
curl -sH 'Authorization: Bearer <AUTH TOKEN>' \
https://api.spamhaus.org/api/intel/v1/limits
This query returns a JSON object like this:
{
"status": 200,
"account": {
"sub": "3534543",
"usr": "[email protected]"
},
"limits": {
"ads": "XBL,BCL,CSS",
"trs": "base",
"qms": 1000,
"qmh": 1500,
"rl_qph": 3600,
"rl_qpm": 60,
"rl_qps": 1
},
"current": {
"qpm": 18,
"qpd": 18,
"rl_qph": 5,
"rl_qpm": 0,
"rl_qps": 0
}
}
Field explanations:
code - will be 200 in the case of success; otherwise an error occurred
account - an object which shows the account properties:
  sub - shows the account subscription identifier
  usr - shows the account username
limits - an object containing the global limits of the account
  ads - allowed query datasets (comma separated list)
  trs - identifier of the access level, defaults to “base”
  qms - an integer showing the max number of queries per month allowed (soft limit)
  qmh - an integer showing the max number of queries per month allowed (hard limit)
  rl_qph - an integer showing the rate limit applied (queries per hour)
  rl_qpm - an integer showing the rate limit applied (queries per minute)
  rl_qps - an integer showing the rate limit applied (queries per second)
current - an object containing the current counters of the account
  qpm - shows the number of actual queries performed during the current month
  qpd - shows the number of actual queries performed during the current day
  rl_qph - an integer showing the rate limit current counter (queries per hour)
  rl_qpm - an integer showing the rate limit current counter (queries per minute)
  rl_qps - an integer showing the rate limit current counter (queries per second)
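As an illustration of how these fields fit together, the following sketch compares the monthly counter (current.qpm) against the soft and hard monthly limits (limits.qms and limits.qmh). The function name monthly_headroom and the keys of the returned dictionary are hypothetical.

```python
def monthly_headroom(limits_response: dict) -> dict:
    """Compare the monthly query counter with the soft and hard limits."""
    limits = limits_response["limits"]
    used = limits_response["current"]["qpm"]  # queries this month so far
    return {
        "used_this_month": used,
        "soft_remaining": max(limits["qms"] - used, 0),
        "hard_remaining": max(limits["qmh"] - used, 0),
        "over_soft_limit": used > limits["qms"],
    }
```

With the example response above (qms 1000, qmh 1500, qpm 18), this reports 982 queries left before the soft limit and 1482 before the hard limit.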
The rate limit values are intended mainly to prevent abuse that could cripple the system.
The contractual tier of the user is defined by qms and qmh.
The soft limit is the number of queries per month contractually allowed. If this limit is exceeded on a regular basis, the query volumes on the account need reviewing.
If the hard limit is reached, all subsequent queries will be refused with a 429 - Too many requests HTTP error code.
The same error will be returned if your query rate is too high and hits one of the rate limit values (the various rl_qpX entries).
In this case, please add some delay between API calls to slow down your query rate.
The rate limiting engine thresholds are proportional to the maximum number of queries per day that your account can perform based on account type.
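A simple way to choose the delay between calls is to derive it from the rl_qpX values returned by the limits endpoint. The sketch below assumes each value is the number of queries allowed per window and that spreading queries evenly is enough to stay below it; the text does not state how the rate limiting engine measures its windows, so treat the result as a starting point. The function name is hypothetical.

```python
def min_interval_seconds(limits: dict) -> float:
    """Smallest pause between queries that respects every rate limit."""
    windows = [
        (1.0, limits["rl_qps"]),     # queries per second
        (60.0, limits["rl_qpm"]),    # queries per minute
        (3600.0, limits["rl_qph"]),  # queries per hour
    ]
    # The slowest (largest) interval is the one that satisfies all limits.
    return max(
        (seconds / allowed for seconds, allowed in windows if allowed > 0),
        default=0.0,
    )
```

For the example limits above (1 per second, 60 per minute, 3600 per hour) all three windows agree on one query per second.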
NOTE: Not all queries are counted as 1. Queries for CIDR resources are subject to a multiplier related to the size of the requested CIDR:
a query for a /32 IPv4 (or a /64 IPv6) is counted as 1
a query for a /31 IPv4 (or a /63 IPv6) is counted as 2
…
a query for a /24 IPv4 (or a /56 IPv6) is counted as 8
To calculate the “count” of a query for a CIDR, apply log2(X), where X is the number of distinct resources (addresses in the case of IPv4, /64’s in the case of IPv6) contained in the CIDR.
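The log2 rule can be sketched with Python's ipaddress module. Note that applying the formula literally gives 8 for a /24, in line with the list above, but 0 and 1 for a /32 and a /31, where the list states 1 and 2. The helper below therefore only implements the stated formula and should be treated as an estimate; the counters returned by the limits endpoint remain the reference. Function names are hypothetical.

```python
import ipaddress
import math


def resource_count(cidr: str) -> int:
    """Number of distinct resources in a CIDR: addresses (IPv4) or /64s (IPv6)."""
    net = ipaddress.ip_network(cidr, strict=False)
    if net.version == 4:
        return net.num_addresses
    # IPv6 resources are /64 blocks; anything longer than /64 counts as one.
    return 2 ** max(64 - net.prefixlen, 0)


def log2_cost(cidr: str) -> float:
    """Literal application of the log2(X) rule described above."""
    return math.log2(resource_count(cidr))
```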
IP reputation data
Search by CIDR
This API endpoint allows the request of either current or historical listing information for a specific network block.
The query URL is:
GET /api/intel/v1/byobject/cidr/<DATASET>/<MODE>/<TYPE>/<IPADDRESS>[/<MASK>][?get_arguments..]
Arguments:
DATASET - identifies the dataset where the search can be performed. For example: XBL
MODE - search CIDR mode. Can have two different values:
  listed - search all submissions contained within the specified netmask
  listings - search all submissions that contain the specified netmask (this should only be used when querying a dataset that allows listings of variable sizes)
TYPE - can either be “live” or “history”:
  live - only returns listings that are still “active”, i.e., they are currently listed and haven’t expired from the Spamhaus datasets yet
  history - returns any record seen within the (implicit or explicit) time window, including expired data
IPADDRESS - IP address to look for
MASK - optional netmask to use. It defaults to 32 for IPv4 or 64 for IPv6; the largest block that can be queried is a /24 for IPv4 or a /56 for IPv6
Available GET arguments:
limit - constrain the number of rows returned by the query
since - extract results with a timestamp greater than or equal to ‘since’ (unix timestamp); defaults to 12 months before ‘until’ if not passed
until - extract results with a timestamp less than or equal to ‘until’ (unix timestamp); defaults to the current timestamp if not passed
NOTE: When querying for historical data, the maximum timeframe allowed is 12 months; passing a larger interval will result in an error code. If a larger timespan is required, multiple queries are needed.
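When a longer timespan is needed, the range can be cut into consecutive windows and queried one window at a time. The sketch below approximates "12 months" as 365 days, which is an assumption since the exact length enforced by the API is not stated here; the function name is hypothetical. Because both since and until are inclusive, each window starts one second after the previous one ends.

```python
from typing import Iterator, Tuple

# Assumption: "12 months" approximated as 365 days.
MAX_SPAN_SECONDS = 365 * 24 * 3600


def split_time_range(since: int, until: int,
                     max_span: int = MAX_SPAN_SECONDS) -> Iterator[Tuple[int, int]]:
    """Yield (since, until) pairs covering the range without overlap."""
    if since > until:
        raise ValueError("since must not be later than until")
    start = since
    while start <= until:
        end = min(start + max_span - 1, until)
        yield start, end
        start = end + 1
```

Each yielded pair can then be passed as the since and until arguments of a separate history query.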
Some usage examples:
# get active listings for 45.150.206.114 in XBL
curl -s https://api.spamhaus.org/api/intel/v1/byobject/cidr/XBL/listed/live/45.150.206.114 \
-H 'Authorization: Bearer <AUTH TOKEN>'
# get last 10 listing events for IPs in 45.150.206.0/24 from XBL
curl -s 'https://api.spamhaus.org/api/intel/v1/byobject/cidr/XBL/listed/history/45.150.206.0/24?limit=10' \
-H 'Authorization: Bearer <AUTH TOKEN>'
# get submissions from a specified time range for 45.150.206.0/24
curl -s 'https://api.spamhaus.org/api/intel/v1/byobject/cidr/XBL/listed/history/45.150.206.0/24?since=1606600000&until=1606750000' \
-H 'Authorization: Bearer <AUTH TOKEN>'
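The same URLs can be assembled programmatically. The following sketch follows the path layout used in the curl examples above (dataset, mode, type, address, optional mask); the function name cidr_query_url is hypothetical, and the value checks only cover the modes and types listed in this section.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.spamhaus.org/api/intel/v1/byobject/cidr"


def cidr_query_url(dataset: str, mode: str, qtype: str, address: str,
                   mask: Optional[int] = None, **params) -> str:
    """Build the query URL for a CIDR search, with optional GET arguments."""
    if mode not in ("listed", "listings"):
        raise ValueError(f"unknown mode: {mode}")
    if qtype not in ("live", "history"):
        raise ValueError(f"unknown type: {qtype}")
    ip = ipaddress.ip_address(address)  # raises ValueError on malformed input
    url = f"{BASE_URL}/{dataset}/{mode}/{qtype}/{ip}"
    if mask is not None:
        url += f"/{mask}"
    # Only pass arguments that were actually set (limit, since, until).
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{url}?{query}" if query else url
```

For instance, cidr_query_url("XBL", "listed", "history", "45.150.206.0", 24, limit=10) reproduces the URL of the second curl example.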
Output
Regardless of the type of query, every successful query returns a JSON object containing a code field whose value matches the HTTP status code of the response.
Here is an example of a query returning no data:
{"code":404}
If the query results in data being returned, the JSON object provides an array named results with all the records returned by the query.
For example: https://api.spamhaus.org/api/intel/v1/byobject/cidr/XBL/listed/history/74.77.66.227?limit=2 would return an object like the following:
{
"code": 200,
"results": [
{
"dataset": "XBL",
"ipaddress": "74.77.66.227",
"asn": "11351",
"cc": "US",
"listed": 1606757120,
"seen": 1606757113,
"valid_until": 1607361913,
"rule": "01a400d5",
"botname": "unknown",
"detection": "SMTP impersonation",
"dstport": 25,
"helo": "outlook.com",
"heuristic": "IMPERSONATE",
"lat": 43.0505,
"lon": -78.853,
"srcip": "74.77.66.227"
},
{
"dataset": "XBL",
"ipaddress": "74.77.66.227",
"asn": "11351",
"cc": "US",
"listed": 1606063971,
"seen": 1606063960,
"valid_until": 1606668760,
"rule": "01a400d5",
"botname": "unknown",
"detection": "SMTP impersonation",
"dstport": 25,
"helo": "outlook.com",
"heuristic": "IMPERSONATE",
"lat": 43.0505,
"lon": -78.853,
"srcip": "74.77.66.227"
}
]
}
For details of the fields included in each record, please refer to the relevant dataset’s documentation.
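A client typically needs to tell "no data" (code 404) apart from a populated results array. The sketch below shows one way to do that and to pick the most recent record by its listed timestamp. The function names are hypothetical, and treating listed as a unix timestamp is an assumption based on the example values.

```python
from typing import List, Optional


def extract_results(response: dict) -> List[dict]:
    """Return the records of a response, or an empty list when code is 404."""
    code = response.get("code")
    if code == 404:
        return []
    if code != 200:
        raise RuntimeError(f"unexpected response code: {code}")
    return response.get("results", [])


def latest_listing(records: List[dict]) -> Optional[dict]:
    """Return the record with the highest 'listed' value, or None if empty."""
    return max(records, key=lambda record: record["listed"], default=None)
```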
Domain reputation data
Search by domain
This API endpoint allows the request of the reputation data regarding a given domain.
The query URL is:
GET /api/intel/v1/byobject/domain/rep/<DOMAIN>
Arguments:
DOMAIN
- identifies the domain for which reputation data is being requested. This needs to be the bare domain (like example.com) and not a hostname inside it (like www.example.com).
Output
Every successful query returns a JSON object containing a code field whose value matches the HTTP status code of the response.
Here is an example of a query returning no data:
{"code":404}
If the query results in data being returned, the JSON object provides a result object containing all the data made available.
For example: https://api.spamhaus.org/api/intel/v1/byobject/domain/rep/example.com would return an object like the following:
{
"code": 200,
"result": {
"domain": "example.com",
"reputation": "great",
"registrar": "RESERVED-Internet Assigned Numbers Authority",
"date_created": 808358400,
"first_seen": 1248469080,
"last_seen": 1661863800,
"trusted_tld": false,
"corporate_registrar": false,
"ns": [
{
"hostname": "a.iana-servers.net",
"first_seen": 1250643120,
"last_seen": 1661863800,
"reputation": "great"
},
{
"hostname": "b.iana-servers.net",
"first_seen": 1250643120,
"last_seen": 1661863800,
"reputation": "great"
}
],
"senders": [
{
"ip": "93.95.228.211",
"last_seen": 1661863800
},
{
"ip": "95.111.251.196",
"last_seen": 1661863800
},
{
"ip": "103.149.120.10",
"last_seen": 1661863800
},
{
"ip": "107.191.56.52",
"last_seen": 1661863800
},
{
"ip": "108.170.43.243",
"last_seen": 1661863800
},
{
"ip": "111.90.148.163",
"last_seen": 1661863800
},
{
"ip": "123.231.243.132",
"last_seen": 1661863800
},
{
"ip": "151.236.57.12",
"last_seen": 1661863800
},
{
"ip": "200.7.39.182",
"last_seen": 1661863800
},
{
"ip": "204.15.146.3",
"last_seen": 1661863800
}
]
}
}
For details of the fields included in each record, please refer to the domain reputation data under the “Available datasets” documentation.
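To show how the result object might be consumed, the sketch below reduces a response to a few headline values. It relies only on fields visible in the example above (domain, reputation, ns, senders); the function name and the shape of the summary are hypothetical.

```python
def summarize_domain(response: dict) -> dict:
    """Condense a domain reputation response into a small summary."""
    if response.get("code") != 200:
        return {}  # e.g. {"code": 404} when no data is available
    result = response["result"]
    return {
        "domain": result["domain"],
        "reputation": result["reputation"],
        "nameservers": [ns["hostname"] for ns in result.get("ns", [])],
        "sender_count": len(result.get("senders", [])),
    }
```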
Dataset Download
This API endpoint allows the download of an entire current dataset, in compressed format.
Access to this API endpoint is only granted to customers who have subscribed for complete access to the dataset.
The query URL is:
GET /api/intel/v1/download/ext/<DATASET>
Arguments list:
DATASET - identifies the dataset to download. Currently supported dataset values are: bcl, xbl, css
Usage example:
# get eBCL full dataset export file
wget --header="Authorization: Bearer <AUTH TOKEN>" \
--output-document=bcl.tgz \
https://api.spamhaus.org/api/intel/v1/download/ext/bcl
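The wget call above has a straightforward Python equivalent. This is a sketch using only the standard library: the helper names are hypothetical, and lowercasing the dataset name is a convenience assumed here, based on the lowercase values listed in this section.

```python
import shutil
import urllib.request

DOWNLOAD_BASE = "https://api.spamhaus.org/api/intel/v1/download/ext"
SUPPORTED_DATASETS = ("bcl", "xbl", "css")  # values listed above


def build_download_request(dataset: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a dataset download."""
    name = dataset.lower()
    if name not in SUPPORTED_DATASETS:
        raise ValueError(f"unsupported dataset: {dataset}")
    return urllib.request.Request(
        f"{DOWNLOAD_BASE}/{name}",
        headers={"Authorization": f"Bearer {token}"},
    )


def download_dataset(dataset: str, token: str, destination: str) -> None:
    """Stream the compressed export to a local file."""
    request = build_download_request(dataset, token)
    with urllib.request.urlopen(request) as resp, open(destination, "wb") as out:
        shutil.copyfileobj(resp, out)
```

Streaming with shutil.copyfileobj avoids holding the whole export in memory.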
Return Codes
These are the return codes implemented:
200 - Found
400 - Bad Request
401 - Unauthorized
403 - Forbidden
404 - Not found
405 - Method not allowed
408 - Request Timeout
429 - Too many requests
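A client can map these codes to the reactions described in the earlier sections: 401 calls for a new auth token, 429 calls for slowing down, and 404 means the query returned no data. The sketch below is one possible mapping; the function name and the action labels are hypothetical.

```python
# Return codes as listed above.
RETURN_CODES = {
    200: "Found",
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not found",
    405: "Method not allowed",
    408: "Request Timeout",
    429: "Too many requests",
}


def next_action(code: int) -> str:
    """Suggest what a client should do after receiving a return code."""
    if code == 200:
        return "use_results"
    if code == 404:
        return "no_data"
    if code == 401:
        return "refresh_token"  # token missing or expired: log in again
    if code == 429:
        return "slow_down"      # hard limit or rate limit reached
    if code in RETURN_CODES:
        return "fail"           # 400, 403, 405, 408: report the error
    return "unknown"
```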