Passive DNS Daily Files Endpoint

The Passive DNS Daily Files Endpoint lets a user download daily passive DNS file dumps over HTTPS. The files are in CSV format (TXT and SOA files are in JSON format) and are compressed with gzip. The previous day's files are available by 00:30 UTC.

URL

https://daily-01.deteque.com/pdns/download/

Example Curl Command

curl -H "Authorization: Bearer <token>" "https://daily-01.deteque.com/pdns/download/address_ipv4_<YYYYMMDD>.csv.gz"
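
The download URL is the base URL followed by the file name. As a minimal sketch, the helper below assembles that URL from a feed name and a date, assuming the file name patterns listed under Supported Files (‘.json.gz’ for txt and soa, ‘.csv.gz’ for everything else). The function name pdns_url is hypothetical and not part of the service.

```shell
# Hypothetical helper: print the download URL for one feed and date.
# Usage: pdns_url <feed> <YYYYMMDD>
pdns_url() {
  base="https://daily-01.deteque.com/pdns/download"
  case "$1" in
    txt|soa) ext="json.gz" ;;   # TXT and SOA feeds are published as JSON
    *)       ext="csv.gz"  ;;   # all other feeds are published as CSV
  esac
  printf '%s/%s_%s.%s\n' "$base" "$1" "$2" "$ext"
}

pdns_url address_ipv4 20230407
pdns_url soa 20230407
```

The printed URL can be passed to curl together with the Authorization header shown above.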

Supported Files

Address IPv4

The address IPv4 files contain hostname and IPv4 address pairs with the epoch timestamp of the query. File names follow the pattern ‘address_ipv4_YYYYMMDD.csv.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

timestamp | ipv4 | hostname
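
A downloaded file can be streamed without unpacking it to disk. The sketch below assumes the fields are comma-separated in the order shown above (the delimiter is an assumption; adjust -F if the files use a different separator) and prints the rows whose hostname is a given domain or one of its subdomains. The function name rows_for_domain is hypothetical.

```shell
# Hypothetical filter: read "timestamp,ipv4,hostname" rows on stdin and
# print those whose hostname is <domain> or a subdomain of it.
# Assumes comma-separated fields; a trailing dot on the hostname is ignored.
rows_for_domain() {
  awk -F',' -v d="$1" '
    {
      h = $3
      sub(/\.$/, "", h)                 # drop a trailing root dot, if any
      n = length(h) - length(d)
      if (h == d || (n > 0 && substr(h, n) == "." d)) print
    }'
}

# Example (file name is illustrative):
#   gzip -dc address_ipv4_20230407.csv.gz | rows_for_domain example.com
```

Matching on the ".domain" suffix rather than a plain substring keeps unrelated names such as badexample.com out of the result.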

Address IPv6

The address IPv6 files contain hostname and IPv6 address pairs with the epoch timestamp of the query. File names follow the pattern ‘address_ipv6_YYYYMMDD.csv.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

timestamp | ipv6 | hostname

CNAME

The CNAME files contain hostname and canonical hostname pairs with the epoch timestamp of the query. File names follow the pattern ‘cname_YYYYMMDD.csv.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

timestamp | hostname | canonical hostname

MX

The MX files contain domain and MX pairs (priority included) with the epoch timestamp of the query. File names follow the pattern ‘mx_YYYYMMDD.csv.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

timestamp | domain | mx

Nameserver

The nameserver files contain domain and nameserver pairs with the epoch timestamp of the query. File names follow the pattern ‘nameserver_YYYYMMDD.csv.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

timestamp | domain | nameserver

TXT

The TXT files contain host and TXT record pairs with the epoch timestamp of the query. These files are in JSON format because the character sets found in TXT records are difficult to read in CSV. File names follow the pattern ‘txt_YYYYMMDD.json.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

{
	"timestamp": 1680876532,
	"qname": "example.com",
	"txt": "v=spf1 -all"
}
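
Because the records are JSON, jq can filter them directly. The sketch below assumes the file is a sequence of objects shaped like the one above (jq reads concatenated objects one at a time, so no particular line layout is required) and prints the qname of every record whose TXT value starts with "v=spf1". The function name spf_names is hypothetical.

```shell
# Hypothetical filter: read TXT records (JSON objects) on stdin and print
# the qname of each record whose txt value starts with "v=spf1".
spf_names() {
  jq -r 'select(.txt | tostring | startswith("v=spf1")) | .qname'
}

# Example (file name is illustrative):
#   gzip -dc txt_20230407.json.gz | spf_names
```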

SOA

The SOA files contain the SOA information of domains. For easier processing, these files are in JSON format. File names follow the pattern ‘soa_YYYYMMDD.json.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

{
	"timestamp": 1680876532,
	"domain": "example.com",
	"ns": "ns.icann.org.",
	"mbox": "noc.dns.icann.org.",
	"serial": 2023013039,
	"refresh": 7200,
	"retry": 3600,
	"expire": 1209600,
	"minttl": 3600
}

New Domains

The new domains files contain newly seen domains with the epoch timestamp at which each domain was first seen. This feed is a derivative of the nameserver/domain feed and contains subdomains. File names follow the pattern ‘new_domains_YYYYMMDD.csv.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

timestamp | domain

NBD

The Nothing But Domains (“NBD”) files contain only those newly seen domains that match the criteria outlined at https://docs.spamhaus.com/sia/docs/source/02-data-explained/data-anatomy.html#domain-address-records. This feed is a derivative of the New Domains feed and does not contain subdomains. File names follow the pattern ‘nbd_YYYYMMDD.csv.gz’, where ‘YYYYMMDD’ is the date of the file. The contents of the file are in the format below:

timestamp | domain
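
Since the NBD feed is derived from the New Domains feed, comparing the two files for the same day shows which New Domains entries were left out (subdomains, or names that do not match the NBD criteria). The sketch below assumes both files have been decompressed and use comma-separated "timestamp,domain" rows; the delimiter and the function name only_in_first are assumptions.

```shell
# Hypothetical helper: print the domains found in the first file but not
# in the second. Both arguments are paths to uncompressed
# "timestamp,domain" files (comma-separated fields are an assumption).
only_in_first() {
  a="$(mktemp)"; b="$(mktemp)"
  cut -d',' -f2 "$1" | sort -u > "$a"
  cut -d',' -f2 "$2" | sort -u > "$b"
  comm -23 "$a" "$b"          # lines unique to the first sorted list
  rm -f "$a" "$b"
}

# Example (file names are illustrative):
#   only_in_first new_domains_20230407.csv nbd_20230407.csv
```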

Example Script

Below is an example Bash script for downloading the files. It requires curl and jq:

#!/usr/bin/env bash

USER=""
PASSWORD=""
REALM="pdns"
LOGIN_URL="https://api.spamhaus.org/api/v1/login"

BASE_URL="https://daily-01.deteque.com/pdns/download"
DATADIR="/tmp"
TOKEN_FILE="token.json"
DATE=$(date -u --date="yesterday" "+%Y%m%d")   # e.g. 20220414; -u uses UTC to match the availability time (GNU date)

# Remove the files you don't want to download
files=(
  "address_ipv4" 
  "address_ipv6"
  "cname"
  "mx"
  "nameserver"
  "nbd"
  "new_domains"
)
json_files=(
  "txt"
  "soa"
)

load_token() {
  local current_time
  current_time=$(date +%s)

  if [ ! -f "$TOKEN_FILE" ]; then
    echo "No token file. Getting new token..."
    get_new_token
    return 0
  fi

  expires="$(jq -r '.expires' "$TOKEN_FILE")"
  token="$(jq -r '.token' "$TOKEN_FILE")"

  # jq prints a missing field as the string "null"
  if [ "$token" = "null" ] || [ "$expires" = "null" ]; then
    echo "Invalid token. Getting new token..."
    get_new_token
  elif [ "$current_time" -gt "$((expires - 300))" ]; then
    # Refresh when the token expires within the next 5 minutes
    echo "Token is expired. Getting new token..."
    get_new_token
  fi
}

get_new_token() {
  local response code
  response=$(curl -s -d "{
    \"username\":\"${USER}\",
    \"password\":\"${PASSWORD}\",
    \"realm\":\"${REALM}\"
  }" "${LOGIN_URL}")

  code="$(echo "$response" | jq -r '.code')"
  if [ "$code" != "200" ]; then
    echo "Invalid login request $response"
    exit 1
  fi

  # Cache the response only after the login has succeeded
  echo "$response" > "$TOKEN_FILE"
  expires="$(echo "$response" | jq -r '.expires')"
  token="$(echo "$response" | jq -r '.token')"
}

load_token

# Stop if no usable token could be obtained
if [ -z "$token" ] || [ "$token" = "null" ]; then
  echo "Token is empty or not found in response."
  exit 1
else
  echo "Token loaded."
fi

# -f makes curl fail on an HTTP error instead of saving the error page as a .gz file
for file in "${files[@]}"; do
  echo "[Downloading ${file} to ${DATADIR}/]"
  /usr/bin/curl -f -H "Authorization: Bearer ${token}" "${BASE_URL}/${file}_${DATE}.csv.gz" -o "${DATADIR}/${file}_${DATE}.csv.gz"
done

for file in "${json_files[@]}"; do
  echo "[Downloading ${file} to ${DATADIR}/]"
  /usr/bin/curl -f -H "Authorization: Bearer ${token}" "${BASE_URL}/${file}_${DATE}.json.gz" -o "${DATADIR}/${file}_${DATE}.json.gz"
done
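
After the downloads finish, it can be worth confirming that each file is a complete gzip archive before processing it, since an interrupted transfer or an error response would otherwise go unnoticed. The sketch below relies on gzip's built-in integrity test; the function name is_valid_gzip is hypothetical.

```shell
# Hypothetical check: succeed (exit status 0) only if the file exists,
# is non-empty, and passes gzip's integrity test; non-zero otherwise.
is_valid_gzip() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Example (path is illustrative):
#   is_valid_gzip /tmp/address_ipv4_20230407.csv.gz || echo "download failed"
```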