Extended BCL (eBCL)

This is the metadata-enriched version of the Botnet Controller List (BCL), a list of public IP addresses identified as being used by cybercriminals to control infected computers. Such hosts are also referred to as Command and Control servers (C&C or C2).
The BCL is normally published as an integral part of the Spamhaus Blocklist (SBL) for general consumption via DNSBLs. This enriched version is additionally distributed as a single JSON file containing all the listings live at the time the file was generated, and it can be queried through the API as the BCL dataset, where historical data is also available. The record format is exactly the same in the JSON file and in the API.
Each record is composed of the following fields:

  • ipaddress The IP address identified as the source of the bot-generated traffic. Always provided.

  • botname The bot name associated with the detected activity. Where a clear association is not possible, “unknown” will be returned. Always provided.

  • botnam_malpedia The bot name associated with the detected activity, as named by Malpedia. Where a clear association is not possible, “unknown” will be returned. Not always provided, particularly for historical listings.

  • seen The Unix timestamp (rounded to the minute) of the last detected event for the given IP and the given botname. Always provided.

  • firstseen The Unix timestamp (rounded to the minute) of the first detection event for this IP+botname combination. This will match the value of seen if it is the first sighting of this type on this particular IP. Always provided.

  • abused A Boolean value (either true or false) indicating whether the resource is, according to our analysis, in use because a legitimate asset was compromised or is under the direct control of the perpetrators.

  • shared A Boolean value (either true or false) indicating whether the resource is, according to our analysis, in use by multiple actors or by a single offending entity.

  • listed The Unix timestamp (rounded to the minute) of when the entry reached our database. This is usually very close to the value of seen, except when the data comes from batched processes. Always provided.

  • valid_until The Unix timestamp (rounded to the minute) after which the given entry will be considered “expired” in our dataset. Always provided.

  • dstport The destination port of the traffic that triggered the detection or where the identified C2 service has been observed running. Always provided.

  • asn The Autonomous System Number (ASN) announcing the IP; predominantly obtained from routeviews data.

  • lat Geographic latitude of the IP. Only provided when geolocation data is available.

  • lon Geographic longitude of the IP. Only provided when geolocation data is available.

  • cc The ISO Country Code of the nation where the IP resides. Only provided when geolocation data is available.

  • protocol IP protocol of the traffic triggering the detection. Usually either UDP or TCP.

  • urls An array of URLs where relevant parts of the C2 service have been observed. Provided only for HTTP or other standard-protocol C2 instances.

  • domains An array of domains observed as in use by the given C2 instance. Provided only for C2 infrastructure that actually makes use of hostnames/domains.

  • samples An array providing information about the binary files observed referring to the specific C2 instance. Each element in the array is composed of:

    • md5hash The MD5 hash (in HEX format) of the binary file

    • sha256hash The SHA256 hash (in HEX format) of the binary file

    • ts The Unix timestamp at which the binary sample was observed and analyzed