URLhaus data (beta release)

URL Requests

When searching for URLs, note that a URL normalization process is applied to all URLs ingested by Spamhaus and to all URL queries sent through the API. For example, the following two URLs are treated as the same:

https://www.example.com/resource
https://www.ExAmPlE.com/resource

Normalization conforms to RFC 3986, with some additional encoding/decoding steps. The steps include:

  • Entire scheme is lowercase

  • User details and credentials are removed

  • Hosts are lowercase

  • Default ports are removed

  • Path is retained verbatim

Note that users do not need to perform any manual URL normalization before submitting API queries, as the API handles this process.
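
The steps listed above can be approximated on the client side, for example to deduplicate URLs locally before querying. The following is a minimal sketch using Python's standard library. It is not Spamhaus's implementation: the function name normalize_url is hypothetical, only http and https default ports are assumed, and the additional encoding/decoding steps are not reproduced.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only the default ports for http and https are considered.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Approximate the listed normalization steps (illustrative only)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # .hostname is already lowercased and excludes user details and port.
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals lose their brackets in .hostname; put them back.
        host = f"[{host}]"
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # Path is kept verbatim; passing query and fragment through unchanged
    # is an assumption of this sketch.
    return urlunsplit((scheme, netloc, parts.path, parts.query, parts.fragment))
```

With this sketch, both example URLs above map to the same string, so they can be used as a single dictionary key or set member.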

URL (ID)

This method returns the numeric ID (identification code) of a URL, given the URL itself.

  • METHOD: POST

  • URL: /api/intel/v2/byobject/url/id

  • PAYLOAD: { "url": "$url_value" }

This would look like:

curl -sH 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/id -X POST -d '{"url":"http://www.example.com/malicious"}'

The returned object would be:

{
    "id": 123456
}
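
The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl example. The helper names build_url_id_request and fetch_url_id are hypothetical, and the token is the placeholder from the curl example.

```python
import json
import urllib.request

API_BASE = "https://api.spamhaus.org"  # host taken from the curl example


def build_url_id_request(url_value: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the URL ID lookup."""
    body = json.dumps({"url": url_value}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + "/api/intel/v2/byobject/url/id",
        data=body,
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def fetch_url_id(url_value: str, token: str) -> int:
    """Send the request and return the "id" field of the JSON response."""
    request = build_url_id_request(url_value, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]
```

Separating request construction from sending keeps the first function easy to inspect or log without touching the network.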

URL (last status)

This API method fetches the last status of a URL and the last payload observed. It does not return any history for the URL or its payloads.

  • METHOD: POST

  • URL: /api/intel/v2/byobject/url/last

  • PAYLOAD: { "url": "$url_value" }

This would look like:

curl -sH 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/last -X POST -d '{"url":"http://www.example.com/malicious"}'

The returned object would be:

{
    "id": 123456,
    "url": "http://www.example.com/malicious",

    "status": {
        "ts": 1707758318,
        "status": "online",
        "removal": "reason for removal",
        "reporter": "nickname of reporter"
    },

    "payload": {
        "ts": 1707758318,
        "mime_type":
        "file_type":
        "file_ext":
        "file_size":
        "file_name":
        "sha256_hash":
        "malware-family":
    }
}
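
A response of this shape can be reduced to a one-line summary once it has been decoded. The sketch below assumes the "ts" field is a Unix timestamp in seconds and interprets it as UTC. The function name summarize_last is hypothetical.

```python
from datetime import datetime, timezone


def summarize_last(record: dict) -> str:
    """Describe the last known status of a URL from a decoded response."""
    status = record["status"]
    # Assumption: "ts" is a Unix timestamp in seconds, read here as UTC.
    seen = datetime.fromtimestamp(status["ts"], tz=timezone.utc)
    return f'{record["url"]} was {status["status"]} at {seen.isoformat()}'
```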

URL (history)

This method shows the last events occurring on a specific URL. It shows an array of events (added, removed, online, offline) and an array of observed payloads.

Please note: This method will show a maximum of the last 100 events/payloads.

  • METHOD: POST

  • URL: /api/intel/v2/byobject/url/history

  • PAYLOAD: { "url": "$url_value" }

This would look like:

curl -sH 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/history -X POST -d '{"url":"http://www.example.com/malicious"}'

The returned object would be:

{
    "id": 123456,
    "url": "http://www.example.com/malicious",

    "events": [
        {
            "ts": 1707758318,
            "status": "online",
            "reporter": "nickname",
            "removal": "reason"
        },
        [more...]
    ],

    "payloads": [
        {
            "ts": 1707758318,
            "mime_type":
            "file_type":
            "file_ext":
            "file_size":
            "file_name":
            "sha256_hash":
            "malware-family":
        },
        [more...]
    ]
}
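
Once decoded, the history response can be turned into a chronological timeline. The sketch below sorts events by "ts" defensively, since the ordering of the arrays is not stated for this endpoint. The function names are hypothetical and the field names are taken from the example response above.

```python
def status_timeline(history: dict) -> list[tuple[int, str]]:
    """Return (timestamp, status) pairs, oldest first."""
    events = sorted(history.get("events", []), key=lambda event: event["ts"])
    return [(event["ts"], event["status"]) for event in events]


def distinct_payload_hashes(history: dict) -> set[str]:
    """Collect the distinct SHA-256 hashes among the observed payloads."""
    hashes = set()
    for payload in history.get("payloads", []):
        value = payload.get("sha256_hash")
        if value:  # skip payloads without a hash
            hashes.add(value)
    return hashes
```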

Two GET endpoints are also available:

GET /api/intel/v2/byobject/url/<URLID>/statuses[?get_arguments..]
GET /api/intel/v2/byobject/url/<URLID>/payloads[?get_arguments..]

The output of these endpoints is NDJSON rather than a single JSON document (one record per line). They return all the data held for a specific URLID about:

  • the status changes

  • the payloads

The records are similar to those returned by the history API call above. The statuses endpoint returns only the status changes; the payloads endpoint returns only the payloads.

The order of the records is not guaranteed. Sort them on the client side if ordering matters.

These methods accept the following GET arguments:

  • since - extract results with a timestamp greater than or equal to ‘since’ (unix timestamp); default is 12 months before current timestamp if not passed (statuses endpoint only)

  • until - extract results with a timestamp less than or equal to ‘until’ (unix timestamp); defaults to the current timestamp if not passed (statuses endpoint only)

  • limit - limits the number of extracted records; default is 1000 rows

NDJSON documentation can be found on Wikipedia: https://en.wikipedia.org/wiki/JSON_streaming
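
The GET arguments and the NDJSON output can be handled with a few lines of Python. The sketch below builds the request path for the statuses endpoint and parses an NDJSON body into records sorted by timestamp. The function names are hypothetical, and the sketch assumes each record carries a "ts" field as in the history response.

```python
import json
from urllib.parse import urlencode


def statuses_path(url_id: int, since=None, until=None, limit=None) -> str:
    """Build the statuses path, adding only the arguments that were given."""
    candidates = (("since", since), ("until", until), ("limit", limit))
    args = {name: value for name, value in candidates if value is not None}
    path = f"/api/intel/v2/byobject/url/{url_id}/statuses"
    return f"{path}?{urlencode(args)}" if args else path


def parse_ndjson(text: str) -> list[dict]:
    """Decode one JSON record per non-empty line and sort by "ts"."""
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    # The API does not guarantee ordering, so sort on the client side.
    return sorted(records, key=lambda record: record["ts"])
```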

IP

  • METHOD: POST

  • URL: /api/intel/v2/byobject/url/search[?get_arguments..]

  • ARGUMENTS:

    • since - extract results with a timestamp greater than or equal to ‘since’ (unix timestamp); default is 12 months before current timestamp if not passed

    • until - extract results with a timestamp less than or equal to ‘until’ (unix timestamp); defaults to the current timestamp if not passed

  • PAYLOADS:

{
    "ip": "1.2.3.4"
}

The IP searches can be for any valid IP or CIDR subnet (e.g. “1.2.3.0/24”).
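
Before sending an IP search, the value can be validated locally and the time window expressed as Unix timestamps. The sketch below is illustrative only: build_ip_search and its days_back and now parameters are hypothetical, and strict CIDR checking (no host bits set) is an assumption about what the API accepts.

```python
import ipaddress
import json
import time
from typing import Optional
from urllib.parse import urlencode

SEARCH_PATH = "/api/intel/v2/byobject/url/search"


def build_ip_search(ip_or_cidr: str, days_back: int = 30,
                    now: Optional[int] = None) -> tuple[str, str]:
    """Return (path with since/until arguments, JSON body) for an IP search."""
    # Raises ValueError for anything that is not an IP address or a CIDR
    # subnet. Assumption: subnets must have no host bits set.
    ipaddress.ip_network(ip_or_cidr, strict=True)
    until = int(time.time()) if now is None else now
    since = until - days_back * 86400  # 86400 seconds per day
    path = f"{SEARCH_PATH}?{urlencode({'since': since, 'until': until})}"
    return path, json.dumps({"ip": ip_or_cidr})
```

Passing a fixed value for now makes the resulting window reproducible, which is convenient in scripts that run repeatedly.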

Example input:

curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"ip":"1.2.3.4"}'

Example output:

{
  "ts": 1717134688,
  "id": 1234567,
  "url": "http://1.2.3.4:60388/bin.sh"
}
{
  "ts": 1717134688,
  "id": 1234567,
  "url": "http://1.2.3.4:60388/bin.sh"
}

Hostname

{
    "host": "www.example.com"
}

Example input:

curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"host":"www.example.com"}'

Example output:

{
  "ts": 1716900215,
  "id": 1234567,
  "url": "https://www.example.com/scl/fi/y6v1mrocebayygj1ld5td/online.msi?rlkey=zjeecz4qvl4t5ztw89v9t7689vt&st=xgazgzrq&dl=1"
}
{
  "ts": 1716897391,
  "id": 1234568,
  "url": "https://www.example.com/scl/fi/y6v1mrocebayygj1ld5td/online.msi?rlkey=zjeecz4qvl4t5ztw89v9t7689vt&st=xgazgzrq&dl=1"
}
{
  "ts": 1716897390,
  "id": 1234569,
  "url": "https://www.example.com/scl/fi/y6v1mrocebayygj1ld5td/online.msi?rlkey=zjeecz4qvl4t5ztw89v9t7689vt&st=xgazgzrq&dl=1"
}
...more...

ASN

{
    "asn": 13000
}

Example input:

curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"asn":"12345"}'

Example output:

{
  "ts": 1716224697,
  "id": 1234567,
  "url": "https://example.com/asdfasdf/test/downloads/new_image.jpg"
}
{
  "ts": 1716223545,
  "id": 1234567,
  "url": "https://example.com/asdfasdf/test/downloads/new_image.jpg"
}
{
  "ts": 1717140148,
  "id": 1234568,
  "url": "http://1.2.3.4/RansomV4-2.exe"
}
...more...

Malware family

{
    "malware_family": "mirai"
}

Example input:

curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"malware_family":"mirai"}'

Example output:

{
  "ts": 1712688765,
  "id": 1234567,
  "url": "http://1.2.3.4:55691/i"
}
{
  "ts": 1712688688,
  "id": 1234568,
  "url": "http://2.2.3.4/5311qjmikurawepedalnqmashrabotatuk61119123c/infn.arm"
}
{
  "ts": 1712688675,
  "id": 1234566,
  "url": "http://3.2.3.4:38415/Mozi.m"
}
...more...

Hash

{
    "sha256": "c47b31612872d8616dd4d28f434d19a0b9e2ccc1db311f691f47224fed0ac128"
}

Example input:

curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"sha256":"c47b31612872d8616dd4d28f434d19a0b9e2ccc1db311f691f47224fed0ac128"}'

Example output:

{
  "ts": 1712565641,
  "id": 2804705,
  "url": "http://117.220.146.72:40669/Mozi.m"
}
{
  "ts": 1712565634,
  "id": 2804680,
  "url": "http://222.140.192.157:60339/i"
}
{
  "ts": 1712565629,
  "id": 2804704,
  "url": "http://115.55.232.49:59264/bin.sh"
}
...more...

Tags

Tags are community-generated descriptors of the submitted entries. They help you filter the data to suit your use case.

{
    "tags": [ "exe", "32" ]
}

Example input:

curl -H 'Authorization: Bearer 12345TOKEN67890' https://api.spamhaus.org/api/intel/v2/byobject/url/search -X POST -d '{"tags": [ "exe", "32" ]}'

Example output:

{
  "ts": 1717235528,
  "id": 1234567,
  "url": "https://example.com/385123/setup.exe",
  "tags": [
    "32",
    "exe"
  ]
}
{
  "ts": 1717232888,
  "id": 1234568,
  "url": "http://example.com/385128/setup.exe",
  "tags": [
    "32",
    "exe"
  ]
}
{
  "ts": 1717224429,
  "id": 1234566,
  "url": "https://example.com/385120/setup.exe",
  "tags": [
    "32",
    "exe"
  ]
}
...more...

Please note: The search results are not guaranteed to be ordered by any particular criteria.
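
Because the results arrive unordered, a client will usually decode and sort them itself. The example outputs above show JSON objects placed back to back; whether the wire format is one object per line or pretty-printed is not stated here, so the sketch below accepts both. The function name parse_search_results is hypothetical, and sorting newest-first by "ts" is only one possible choice.

```python
import json


def parse_search_results(text: str) -> list[dict]:
    """Decode back-to-back JSON objects and sort them newest first."""
    decoder = json.JSONDecoder()
    records = []
    position = 0
    while position < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        if text[position].isspace():
            position += 1
            continue
        record, position = decoder.raw_decode(text, position)
        records.append(record)
    return sorted(records, key=lambda record: record["ts"], reverse=True)
```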