URLhaus data (beta release)
URL Requests
When searching for URLs, note that a normalization process is applied to every URL ingested by Spamhaus and to every URL query sent through the API. For example, the following two URLs are treated as the same:
https://www.example.com/resource
https://www.ExAmPlE.com/resource
Normalization conforms to RFC 3986, along with some additional encoding/decoding steps. The steps include:
Entire scheme is lower-case
User details and credentials are removed
Hosts are lower case
Default ports are removed
Path is retained verbatim
Note that users do not need to perform any manual URL normalization before submitting API queries, as the API handles this process.
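Although no client-side normalization is required, it can help to see why two differently written URLs resolve to the same record. The following is a minimal sketch that approximates the listed steps with Python's standard library. It is not Spamhaus's implementation: the function name `normalize_url`, the `DEFAULT_PORTS` table, and the choice to keep the query and fragment unchanged are assumptions made for illustration, and the additional encoding/decoding steps are not covered.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only these two schemes have a default port worth stripping.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Approximate the normalization steps listed above (illustrative only)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # .hostname is already lower-cased and excludes user details and the port.
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need their brackets restored.
        host = f"[{host}]"
    netloc = host
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    # The path is passed through verbatim; query and fragment are kept as-is.
    return urlunsplit((scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(normalize_url("HTTPS://user:secret@www.ExAmPlE.com:443/Resource"))
```

With this sketch, `https://www.ExAmPlE.com/resource` and `https://www.example.com/resource` produce the same string, while the case of the path is left untouched.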
URL (ID)
This method returns the numeric ID (identification code) of a URL, given the URL itself.
METHOD: POST
URL:
/api/intel/v2/byobject/url/id
PAYLOAD:
{ "url": "$url_value" }
This would look like:
curl -sH 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/id -X POST -d '{"url":"http://www.example.com/malicious"}'
The returned object would be:
{
"id": 123456
}
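The same request can be prepared from Python. The sketch below only builds the request object and does not contact the API; the helper name `build_url_id_request` and the `API_BASE` constant are illustrative, and the token is the placeholder used in the curl example.

```python
import json
import urllib.request

API_BASE = "https://api.spamhaus.org"  # host taken from the curl example


def build_url_id_request(url_value: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the URL ID lookup."""
    body = json.dumps({"url": url_value}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + "/api/intel/v2/byobject/url/id",
        data=body,
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_url_id_request("http://www.example.com/malicious", "12345TOKEN67890")
    print(req.get_method(), req.full_url)
    # To actually send it (requires a valid token):
    #   with urllib.request.urlopen(req) as resp:
    #       url_id = json.load(resp)["id"]
```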
URL (last status)
This API method fetches the last status of a URL and the last payload observed. It does not return any history for the URL or its payloads.
METHOD: POST
URL:
/api/intel/v2/byobject/url/last
PAYLOAD:
{ "url": "$url_value" }
This would look like:
curl -sH 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/last -X POST -d '{"url":"http://www.example.com/malicious"}'
The returned object would be:
{
"id": 123456,
"url": "http://www.example.com/malicious",
"status": {
"ts": 1707758318,
"status": "online",
"removal": "reason for removal",
"reporter": "nickname of reporter"
},
"payload": {
"ts": 1707758318,
"mime_type":
"file_type":
"file_ext":
"file_size":
"file_name":
"sha256_hash":
"malware-family":
}
}
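Once the response has been parsed into a dictionary, the fields of interest can be pulled out and the Unix timestamp converted to a readable date. This is a minimal sketch assuming the field names shown in the example above; the helper name `summarize_last` and the shape of its return value are illustrative, and payload fields are read with `.get()` because their values are not shown in the example.

```python
from datetime import datetime, timezone


def summarize_last(record: dict) -> dict:
    """Reduce a 'last status' response to a few commonly needed fields."""
    status = record.get("status", {})
    payload = record.get("payload", {})
    ts = status.get("ts")
    return {
        "id": record.get("id"),
        "status": status.get("status"),
        # Convert the Unix timestamp to an ISO 8601 string in UTC.
        "seen_at": (
            datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
            if ts is not None
            else None
        ),
        "sha256_hash": payload.get("sha256_hash"),
    }


if __name__ == "__main__":
    sample = {
        "id": 123456,
        "url": "http://www.example.com/malicious",
        "status": {"ts": 1707758318, "status": "online"},
        "payload": {"ts": 1707758318},
    }
    print(summarize_last(sample))
```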
URL (history)
This method shows the most recent events for a specific URL. It returns an array of events (added, removed, online, offline) and an array of observed payloads.
Please note: This method will show a maximum of the last 100 events/payloads.
METHOD: POST
URL:
/api/intel/v2/byobject/url/history
PAYLOAD:
{ "url": "$url_value" }
This would look like:
curl -sH 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/history -X POST -d '{"url":"http://www.example.com/malicious"}'
The returned object would be:
{
"id": 123456,
"url": "http://www.example.com/malicious",
"events": [
{
"ts": 1707758318,
"status": "online",
"reporter": "nickname",
"removal": "reason"
},
[more...],
],
"payloads": [
{
"ts": 1707758318,
"mime_type":
"file_type":
"file_ext":
"file_size":
"file_name":
"sha256_hash":
"malware-family":
},
[more...],
]
}
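A common use of the history response is to reduce the events array to a timeline of status changes. The sketch below assumes the event fields shown in the example ("ts" and "status") and sorts defensively by timestamp, since the ordering of the array is not stated here. The helper name `status_timeline` is illustrative.

```python
def status_timeline(history: dict) -> list:
    """Return (ts, status) pairs, keeping only the points where the status changes."""
    events = sorted(history.get("events", []), key=lambda e: e["ts"])
    timeline = []
    for event in events:
        # Skip consecutive events that repeat the previous status.
        if not timeline or timeline[-1][1] != event["status"]:
            timeline.append((event["ts"], event["status"]))
    return timeline


if __name__ == "__main__":
    sample = {
        "id": 123456,
        "url": "http://www.example.com/malicious",
        "events": [
            {"ts": 1707758318, "status": "online"},
            {"ts": 1707750000, "status": "added"},
            {"ts": 1707760000, "status": "online"},
            {"ts": 1707770000, "status": "offline"},
        ],
        "payloads": [],
    }
    for ts, status in status_timeline(sample):
        print(ts, status)
```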
For GET queries such as:
GET /api/intel/v2/byobject/url/<URLID>/statuses[?get_arguments..]
GET /api/intel/v2/byobject/url/<URLID>/payloads[?get_arguments..]
The output is NDJSON rather than JSON (one record per line). These queries return all the data held for a specific URLID about:
the status changes
the payloads
The output is similar to that of the history API call above. One query returns only the status changes, the other only the payloads.
There is no guarantee on the sorting of the records. Ordering should be done on the client side.
These methods accept the following GET arguments:
since
- extract results with a timestamp greater than or equal to ‘since’ (unix timestamp); default is 12 months before current timestamp if not passed (statuses endpoint only)
until
- extract results with a timestamp less than or equal to ‘until’ (unix timestamp); defaults to the current timestamp if not passed (statuses endpoint only)
limit
- limits the number of extracted records; default is 1000 rows
NDJSON documentation can be found on Wikipedia: https://en.wikipedia.org/wiki/JSON_streaming
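Because the records arrive one per line and in no guaranteed order, a client typically builds the query path, parses each line separately, and sorts the result itself. The following sketch illustrates both steps without contacting the API. The helper names `history_path` and `parse_ndjson` are illustrative, and sorting on "ts" assumes each record carries that field, as in the history example.

```python
import json
from urllib.parse import urlencode


def history_path(url_id: int, kind: str, since=None, until=None, limit=None) -> str:
    """Build the GET path for the 'statuses' or 'payloads' query of a URL ID."""
    if kind not in ("statuses", "payloads"):
        raise ValueError("kind must be 'statuses' or 'payloads'")
    # Only include the arguments that were actually passed.
    pairs = (("since", since), ("until", until), ("limit", limit))
    params = {name: value for name, value in pairs if value is not None}
    path = f"/api/intel/v2/byobject/url/{url_id}/{kind}"
    return f"{path}?{urlencode(params)}" if params else path


def parse_ndjson(text: str) -> list:
    """Parse NDJSON (one JSON record per line) and sort the records by 'ts'."""
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    return sorted(records, key=lambda record: record["ts"])


if __name__ == "__main__":
    print(history_path(123456, "statuses", since=1700000000, limit=10))
    body = '{"ts": 20, "status": "offline"}\n{"ts": 10, "status": "online"}\n'
    print(parse_ndjson(body))
```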
IP
METHOD: POST
URL:
/api/intel/v2/byobject/url/search[?get_arguments..]
ARGUMENTS:
since
- extract results with a timestamp greater than or equal to ‘since’ (unix timestamp); default is 12 months before current timestamp if not passed
until
- extract results with a timestamp less than or equal to ‘until’ (unix timestamp); defaults to the current timestamp if not passed
PAYLOADS:
{
"ip": "1.2.3.4"
}
IP searches accept any valid IP address or CIDR subnet (e.g. "1.2.3.0/24").
Example input:
curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"ip":"1.2.3.4"}'
Example output:
{
"ts": 1717134688,
"id": 1234567,
"url": "http://1.2.3.4:60388/bin.sh"
}
{
"ts": 1717134688,
"id": 1234567,
"url": "http://1.2.3.4:60388/bin.sh"
}
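Since the search accepts either a single address or a CIDR subnet, a client may want to validate the value before sending it. The sketch below uses Python's `ipaddress` module for that check and then builds the JSON payload. The helper name `ip_search_payload` is illustrative, and rejecting subnets with host bits set (such as "1.2.3.4/24") is a choice made here, not a documented API rule.

```python
import ipaddress
import json


def ip_search_payload(value: str) -> str:
    """Validate an IP address or CIDR subnet and return the JSON search payload."""
    # ip_network accepts a bare address (treated as a single-host network) or a
    # CIDR block; it raises ValueError for anything else, including host bits set.
    ipaddress.ip_network(value)
    return json.dumps({"ip": value})


if __name__ == "__main__":
    print(ip_search_payload("1.2.3.4"))
    print(ip_search_payload("1.2.3.0/24"))
```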
Hostname
{
"host": "www.example.com"
}
Example input:
curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"host":"www.example.com"}'
Example output:
{
"ts": 1716900215,
"id": 1234567,
"url": "https://www.example.com/scl/fi/y6v1mrocebayygj1ld5td/online.msi?rlkey=zjeecz4qvl4t5ztw89v9t7689vt&st=xgazgzrq&dl=1"
}
{
"ts": 1716897391,
"id": 1234568,
"url": "https://www.example.com/scl/fi/y6v1mrocebayygj1ld5td/online.msi?rlkey=zjeecz4qvl4t5ztw89v9t7689vt&st=xgazgzrq&dl=1"
}
{
"ts": 1716897390,
"id": 1234569,
"url": "https://www.example.com/scl/fi/y6v1mrocebayygj1ld5td/online.msi?rlkey=zjeecz4qvl4t5ztw89v9t7689vt&st=xgazgzrq&dl=1"
}
...more...
ASN
{
"asn": 13000
}
Example input:
curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"asn":"12345"}'
Example output:
{
"ts": 1716224697,
"id": 1234567,
"url": "https://example.com/asdfasdf/test/downloads/new_image.jpg"
}
{
"ts": 1716223545,
"id": 1234567,
"url": "https://example.com/asdfasdf/test/downloads/new_image.jpg"
}
{
"ts": 1717140148,
"id": 1234568,
"url": "http://1.2.3.4/RansomV4-2.exe"
}
...more...
Malware family
{
"malware_family": "mirai"
}
Example input:
curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"malware_family":"mirai"}'
Example output:
{
"ts": 1712688765,
"id": 1234567,
"url": "http://1.2.3.4:55691/i"
}
{
"ts": 1712688688,
"id": 1234568,
"url": "http://2.2.3.4/5311qjmikurawepedalnqmashrabotatuk61119123c/infn.arm"
}
{
"ts": 1712688675,
"id": 1234566,
"url": "http://3.2.3.4:38415/Mozi.m"
}
...more...
Hash
{
"sha256": "c47b31612872d8616dd4d28f434d19a0b9e2ccc1db311f691f47224fed0ac128"
}
Example input:
curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"sha256":"c47b31612872d8616dd4d28f434d19a0b9e2ccc1db311f691f47224fed0ac128"}'
Example output:
{
"ts": 1712565641,
"id": 2804705,
"url": "http://117.220.146.72:40669/Mozi.m"
}
{
"ts": 1712565634,
"id": 2804680,
"url": "http://222.140.192.157:60339/i"
}
{
"ts": 1712565629,
"id": 2804704,
"url": "http://115.55.232.49:59264/bin.sh"
}
...more...
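The search examples above show several JSON objects one after another rather than a single JSON array. Assuming the response body is such a sequence of objects (whether one per line or spread over several lines), it can be read with `json.JSONDecoder.raw_decode`, which parses one object at a time from a given offset. The helper name `iter_json_objects` is illustrative.

```python
import json


def iter_json_objects(text: str):
    """Yield each JSON object from a string containing several objects in a row."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so advance past it first.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


if __name__ == "__main__":
    body = """
    {
      "ts": 1712565641,
      "id": 2804705,
      "url": "http://117.220.146.72:40669/Mozi.m"
    }
    {
      "ts": 1712565634,
      "id": 2804680,
      "url": "http://222.140.192.157:60339/i"
    }
    """
    for record in iter_json_objects(body):
        print(record["id"], record["url"])
```

The same record can appear more than once in a result set, as in some of the examples above, so a client may also want to deduplicate on "id" after parsing.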