Extended XBL (eXBL)
This is the metadata-enriched version of the eXploit BlockList, a list of public IP addresses on which our behavioral heuristics have identified indicators of compromised machines.
It is distributed as a single JSON file containing all the listings live at the time the file is generated (named eXBL to distinguish it from the plain DNSBL list, named XBL, which does not contain the metadata), or it can be queried through the API as the XBL dataset, where historical data is also available. In both cases, the record format is exactly the same.
Each record identifies a “detection” for the given IP. Multiple records for the same IP are possible when multiple bots (or what our analysts think are multiple bots) have been identified on the same IP resource.
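Because one IP can carry several records, consumers may want to group detections per address. The following is a minimal sketch in Python, assuming the records have already been parsed into dictionaries. The helper name `group_by_ip` and the sample values are hypothetical; only the `ipaddress` and `botname` fields described below are used.

```python
from collections import defaultdict


def group_by_ip(records):
    """Map each ipaddress to the sorted list of distinct botname values seen on it."""
    bots = defaultdict(set)
    for rec in records:
        bots[rec["ipaddress"]].add(rec["botname"])
    return {ip: sorted(names) for ip, names in bots.items()}


# Hypothetical sample records (documentation-range addresses, made-up bot names).
sample = [
    {"ipaddress": "192.0.2.10", "botname": "unknown"},
    {"ipaddress": "192.0.2.10", "botname": "examplebot"},
    {"ipaddress": "198.51.100.7", "botname": "examplebot"},
]
grouped = group_by_ip(sample)
```

An address with more than one entry in its list corresponds to the multi-bot case described above.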
Each record is composed of the following fields:
ipaddress: The IP address identified as the source of the bot-generated traffic. Always provided.
botname: The bot name associated with the detected activity. Where a clear association is not possible, “unknown” will be returned. Always provided.
botnam_malpedia: The bot name associated with the detected activity, as named by Malpedia. Where a clear association is not possible, “unknown” will be returned. Not always provided, particularly for historical listings.
seen: The Unix timestamp (rounded to the minute) of the last detected event for the given IP and the given botname. Always provided.
firstseen: The Unix timestamp (rounded to the minute) of the first detection event for this IP+botname combination. This will match the value of seen if it is the first sighting of this type on this particular IP. When there has been no activity for this combination for a month, the field is reset the next time the combination is observed. Always provided.
listed: The Unix timestamp (rounded to the minute) of when the entry reached our database. Usually, this is very close to the value of seen, except when the data comes from batched processes. Always provided.
valid_until: The Unix timestamp (rounded to the minute) of when the given entry will be considered “expired” from our dataset. Always provided.
detection: Human-readable text briefly describing how the data was collected. This field only appears when the heuristic can involve multiple ways of collecting the data.
rule: An internal ID pointing to the rule that produced the detection. Detections produced by different means or rules will show different IDs, even when they refer to the same detection. Always provided.
dstport: The destination port of the traffic that triggered the detection. Not always disclosed/available.
helo: When the detection comes from SMTP traffic, this is the HELO string used in the SMTP session triggering the detection.
helos: Specific to MPD detections only. This is an array enumerating all the HELO strings involved in the detection. Appears only in records for the MPD heuristic.
heuristic: The heuristic applied to generate the detection. Has a limited number of possible values.
asn: The Autonomous System Number (ASN) announcing the IP; predominantly obtained from routeviews data.
lat: Geographic latitude of the IP. Only provided when geolocation data is available.
lon: Geographic longitude of the IP. Only provided when geolocation data is available.
cc: The ISO Country Code of the nation where the IP resides. Only provided when geolocation data is available.
protocol: IP protocol of the traffic triggering the detection. Usually either UDP or TCP.
srcip: Source IP of the traffic triggering the detection. Except in rare cases, this matches the listed IP address for IPv4. For IPv6, where the granularity of XBL is the /64, it provides the specific IP (/128) causing the listing.
uri: Specific to the “SINKHOLE” heuristic, and to HTTP sinkhole detections in particular. This is the URI of the HTTP request triggering the listing. Not always available.
useragent: Specific to the “SINKHOLE” heuristic, and to HTTP sinkhole detections in particular. This is the User-Agent header of the HTTP request triggering the listing. Not always available.
domain: Mostly specific to the “SINKHOLE” heuristic, and to HTTP sinkholes in particular. This is the domain/hostname that the traffic triggering the detection is reaching, i.e., the sinkhole’d domain. Often obtained from the “host” header of the HTTP request triggering the listing. Not always available.