Threat Intelligence
SPYDER supports threat intelligence workflows by mapping the infrastructure behind malicious domains. Starting from known indicators of compromise (IOCs), it expands the graph through DNS resolution, TLS certificate analysis, and HTTP link extraction to uncover related infrastructure that shares hosting, certificates, or content relationships with known threats.
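Throughout this page, SPYDER output is treated as JSON Lines with an `edges` array on each record. A minimal loader, assuming that shape (the field names match the jq examples below; the sample values are illustrative), groups edges by type for the pivots that follow:

```python
import json
from collections import defaultdict

def load_edges(jsonl_text):
    """Group SPYDER edges by type: {edge_type: [(source, target), ...]}."""
    edges = defaultdict(list)
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        for edge in record.get("edges", []):
            edges[edge["type"]].append((edge["source"], edge["target"]))
    return edges

# Two sample records in the assumed JSONL shape
sample = "\n".join([
    json.dumps({"edges": [{"type": "RESOLVES_TO",
                           "source": "malware-delivery.example",
                           "target": "203.0.113.10"}]}),
    json.dumps({"edges": [{"type": "USES_NS",
                           "source": "malware-delivery.example",
                           "target": "ns1.shadydns.example"}]}),
])
edges = load_edges(sample)
```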
IOC Expansion from Known Malicious Domains
Seed and Expand
The core threat intel workflow with SPYDER starts with a known malicious domain and expands outward by analyzing its infrastructure relationships:
# Start with known IOCs
cat <<EOF > iocs.txt
malware-delivery.example
phishing-login.example
c2-callback.example
EOF
./bin/spyder -domains=iocs.txt -concurrency=64 -exclude_tlds=gov,mil,int,edu \
2>/dev/null > ioc-expansion.json
Pivot on Shared IP Addresses
Domains sharing an IP with a known malicious domain are candidates for related infrastructure:
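The jq pipelines below do this pivot on the command line. The same logic can be sketched in Python, assuming `(domain, ip)` pairs extracted from RESOLVES_TO edges as in those pipelines (the domains here are placeholder values):

```python
from collections import defaultdict

def shared_ip_pivot(resolves_to, iocs):
    """Flag non-IOC domains that share at least one IP with a known-bad domain.

    resolves_to: iterable of (domain, ip) pairs from RESOLVES_TO edges.
    Returns {candidate_domain: set of shared IPs}.
    """
    ip_domains = defaultdict(set)
    for domain, ip in resolves_to:
        ip_domains[ip].add(domain)
    # IPs that host at least one known-bad domain
    suspect_ips = {ip for ip, doms in ip_domains.items() if doms & iocs}
    candidates = defaultdict(set)
    for ip in suspect_ips:
        for domain in ip_domains[ip] - iocs:
            candidates[domain].add(ip)
    return dict(candidates)

pairs = [
    ("malware-delivery.example", "203.0.113.10"),
    ("innocent-neighbor.example", "203.0.113.10"),  # co-hosted with the IOC
    ("unrelated.example", "198.51.100.7"),
]
hits = shared_ip_pivot(pairs, {"malware-delivery.example"})
```

Note that shared hosting alone is weak evidence; domains on large shared hosts will co-resolve with almost anything, so treat these as candidates for further pivoting, not verdicts.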
# Extract IPs from known-bad domains
jq -r '.edges[] | select(.type=="RESOLVES_TO") | [.source, .target] | @tsv' \
ioc-expansion.json > ip-map.tsv
# Find all domains on those IPs (requires a second scan)
awk -F'\t' '{print $2}' ip-map.tsv | sort -u > suspect-ips.txt
# Cross-reference against a broader scan
./bin/spyder -domains=broad-domains.txt 2>/dev/null | \
jq -r '.edges[] | select(.type=="RESOLVES_TO") | [.source, .target] | @tsv' | \
grep -F -f suspect-ips.txt
Pivot on Shared Certificates
Certificates bind multiple domains together. A certificate used by a known-bad domain that also covers other domains via Subject Alternative Names reveals related infrastructure:
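A sketch of the certificate pivot in Python, assuming `(domain, fingerprint)` pairs from USES_CERT edges (the fingerprints here are shortened placeholders, not real SPKI hashes):

```python
from collections import defaultdict

def shared_cert_pivot(uses_cert, bad_domains):
    """Find certificates used by known-bad domains, then list every other
    domain observed using the same certificate."""
    cert_domains = defaultdict(set)
    for domain, cert in uses_cert:
        cert_domains[cert].add(domain)
    related = {}
    for cert, domains in cert_domains.items():
        if domains & bad_domains:
            others = domains - bad_domains
            if others:
                related[cert] = others
    return related

edges = [
    ("phishing-login.example", "abc123"),
    ("lookalike-portal.example", "abc123"),  # same cert as the IOC
    ("benign.example", "def456"),
]
related = shared_cert_pivot(edges, {"phishing-login.example"})
```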
# Extract certificate fingerprints from IOC scan
jq -r '.edges[] | select(.type=="USES_CERT") | .target' \
ioc-expansion.json | sort -u > bad-certs.txt
# Find other domains using the same certificates
./bin/spyder -domains=broad-domains.txt 2>/dev/null | \
jq -r '.edges[] | select(.type=="USES_CERT") | [.source, .target] | @tsv' | \
grep -F -f bad-certs.txt
Pivot on Nameserver Infrastructure
Threat actors often register multiple domains through the same registrar and delegate them to the same nameservers:
# Extract nameservers from IOC domains
jq -r '.edges[] | select(.type=="USES_NS") | .target' \
ioc-expansion.json | sort -u
# Find other domains using the same nameservers
jq -r '.edges[] | select(.type=="USES_NS") | [.source, .target] | @tsv' \
ioc-expansion.json | sort -k2
If malware-delivery.example and phishing-login.example both use ns1.shadydns.example and ns2.shadydns.example, that nameserver provider is a strong pivot point for discovering additional threat actor domains.
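That clustering can be sketched directly, assuming `(domain, nameserver)` pairs extracted from USES_NS edges; domains delegating to an identical nameserver set are grouped together:

```python
from collections import defaultdict

def group_by_nameserver_set(uses_ns):
    """Group domains by their full (sorted) nameserver set; clusters sharing
    an identical set are candidates for common registrant infrastructure."""
    domain_ns = defaultdict(set)
    for domain, ns in uses_ns:
        domain_ns[domain].add(ns)
    clusters = defaultdict(list)
    for domain, ns_set in domain_ns.items():
        clusters[tuple(sorted(ns_set))].append(domain)
    # Keep only clusters with more than one domain
    return {k: sorted(v) for k, v in clusters.items() if len(v) > 1}

edges = [
    ("malware-delivery.example", "ns1.shadydns.example"),
    ("malware-delivery.example", "ns2.shadydns.example"),
    ("phishing-login.example", "ns1.shadydns.example"),
    ("phishing-login.example", "ns2.shadydns.example"),
    ("benign.example", "ns1.example-dns.example"),
]
clusters = group_by_nameserver_set(edges)
```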
Fast-Flux Detection Through DNS Pattern Analysis
Fast-flux networks rapidly rotate the IP addresses behind a domain, making takedowns difficult. SPYDER can detect this pattern by running repeated scans and comparing DNS resolution results.
Detecting IP Rotation
# Scan the same domains at two different times
./bin/spyder -domains=suspect-domains.txt -run=scan-t1 2>/dev/null > scan-t1.json
sleep 300
./bin/spyder -domains=suspect-domains.txt -run=scan-t2 2>/dev/null > scan-t2.json
# Compare IP resolutions
diff <(jq -r '.edges[] | select(.type=="RESOLVES_TO") | [.source, .target] | @tsv' scan-t1.json | sort) \
<(jq -r '.edges[] | select(.type=="RESOLVES_TO") | [.source, .target] | @tsv' scan-t2.json | sort)
Domains with significantly different resolution sets between scans are candidates for fast-flux.
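The diff shows which individual edges changed; a per-domain churn score quantifies it. This sketch, assuming `(domain, ip)` pairs from two scans, computes the Jaccard distance between resolution sets (1.0 means the sets are completely disjoint):

```python
from collections import defaultdict

def resolution_churn(scan_a, scan_b):
    """Per-domain Jaccard distance between two scans' resolution sets."""
    def by_domain(pairs):
        m = defaultdict(set)
        for domain, ip in pairs:
            m[domain].add(ip)
        return m
    a, b = by_domain(scan_a), by_domain(scan_b)
    churn = {}
    for domain in a.keys() & b.keys():
        union = a[domain] | b[domain]
        churn[domain] = 1 - len(a[domain] & b[domain]) / len(union)
    return churn

# Hypothetical (domain, ip) pairs from two scans taken five minutes apart
t1 = [("flux.example", "198.51.100.1"), ("stable.example", "203.0.113.5")]
t2 = [("flux.example", "198.51.100.9"), ("stable.example", "203.0.113.5")]
churn = resolution_churn(t1, t2)
```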
Scoring IP Diversity
# Count unique IPs per domain
jq -r '.edges[] | select(.type=="RESOLVES_TO") | [.source, .target] | @tsv' \
scan-output.json | \
awk -F'\t' '{ips[$1]++} END {for (d in ips) if (ips[d] > 5) print ips[d] "\t" d}' | \
sort -rn
A domain resolving to many different IPs (especially in different subnets) is a strong fast-flux indicator. Legitimate CDNs also show high IP diversity, so combine this with AS number analysis to distinguish CDNs from botnets.
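The AS-number check can be sketched as follows; `ip_to_asn` is a hypothetical lookup table you would populate from routing data (the IPs and ASNs below are reserved documentation values). A CDN tends to concentrate many IPs in one or two ASNs, while fast-flux botnets scatter across many:

```python
def asn_diversity(domain_ips, ip_to_asn):
    """Count distinct origin ASNs per domain; IPs missing from the
    lookup table are skipped rather than guessed."""
    return {domain: len({ip_to_asn[ip] for ip in ips if ip in ip_to_asn})
            for domain, ips in domain_ips.items()}

domain_ips = {
    "flux.example": {"198.51.100.1", "203.0.113.7", "192.0.2.33"},
    "cdn.example": {"198.51.100.10", "198.51.100.11", "198.51.100.12"},
}
ip_to_asn = {
    "198.51.100.1": 64496, "203.0.113.7": 64497, "192.0.2.33": 64498,
    "198.51.100.10": 64499, "198.51.100.11": 64499, "198.51.100.12": 64499,
}
diversity = asn_diversity(domain_ips, ip_to_asn)
```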
Automated Fast-Flux Analysis
import json
from collections import defaultdict
from ipaddress import ip_address, ip_network
def detect_fast_flux(scan_files):
    """Analyze multiple scan results for fast-flux indicators."""
    domain_ips = defaultdict(set)
    for scan_file in scan_files:
        with open(scan_file) as f:
            for line in f:
                data = json.loads(line)
                for edge in data.get("edges", []):
                    if edge["type"] == "RESOLVES_TO":
                        domain_ips[edge["source"]].add(edge["target"])
    suspects = {}
    for domain, ips in domain_ips.items():
        if len(ips) < 3:
            continue
        # Count unique /24 subnets
        subnets = set()
        for ip_str in ips:
            try:
                addr = ip_address(ip_str)
                if addr.version == 4:
                    subnets.add(ip_network(f"{ip_str}/24", strict=False))
            except ValueError:
                continue
        if len(subnets) >= 3:
            suspects[domain] = {
                "ip_count": len(ips),
                "subnet_count": len(subnets),
                "ips": list(ips),
            }
    return suspects
C2 Infrastructure Mapping
Mapping Command and Control Networks
C2 servers often share infrastructure patterns. SPYDER can map these by starting from a known C2 domain and expanding through DNS and certificate relationships:
# Seed with known C2 domain
echo "known-c2.example" > c2-seed.txt
# Run recursive crawl to discover related infrastructure
./bin/spyder -domains=c2-seed.txt \
-continuous \
-max_domains=500 \
-concurrency=64 \
-exclude_tlds=gov,mil,int,edu \
-ua="ThreatResearch/1.0" \
2>/dev/null > c2-map.json
Identifying C2 Indicators
# Domains discovered through links from C2 (potential staging/exfil)
jq -r '.edges[] | select(.type=="LINKS_TO") | [.source, .target] | @tsv' \
c2-map.json | sort -k1
# Infrastructure sharing (domains on same IP as C2)
C2_IPS=$(jq -r '.edges[] | select(.type=="RESOLVES_TO" and .source=="known-c2.example") | .target' c2-map.json)
for ip in $C2_IPS; do
echo "=== Domains on $ip ==="
jq -r --arg ip "$ip" '.edges[] | select(.type=="RESOLVES_TO" and .target==$ip) | .source' c2-map.json
done
Building a C2 Infrastructure Profile
# Generate a C2 profile from SPYDER output
jq '{
c2_domain: "known-c2.example",
resolved_ips: [.edges[] | select(.type=="RESOLVES_TO" and .source=="known-c2.example") | .target] | unique,
nameservers: [.edges[] | select(.type=="USES_NS" and .source=="known-c2.example") | .target] | unique,
certificates: [.edges[] | select(.type=="USES_CERT" and .source=="known-c2.example") | .target] | unique,
linked_domains: [.edges[] | select(.type=="LINKS_TO" and .source=="known-c2.example") | .target] | unique,
co_hosted_domains: [.edges[] | select(.type=="RESOLVES_TO") | .source] | unique | length
}' c2-map.json
Certificate-Based Threat Actor Tracking
Certificate Fingerprint Tracking
TLS certificates provide durable tracking identifiers. Even when a threat actor changes IPs and domains, they may reuse the same certificate or certificate authority patterns.
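A sketch of cross-scan fingerprint tracking, assuming `(domain, spki_sha256)` pairs extracted from USES_CERT edges per scan (scan names and fingerprints below are placeholders):

```python
from collections import defaultdict

def track_cert_reuse(scans):
    """Index SPKI fingerprints across multiple scans.

    scans: {scan_name: [(domain, spki_sha256), ...]}.
    Returns fingerprints seen with more than one domain or in more
    than one scan -- i.e. candidates for durable actor tracking.
    """
    seen = defaultdict(lambda: {"domains": set(), "scans": set()})
    for scan_name, pairs in scans.items():
        for domain, spki in pairs:
            seen[spki]["domains"].add(domain)
            seen[spki]["scans"].add(scan_name)
    return {spki: info for spki, info in seen.items()
            if len(info["domains"]) > 1 or len(info["scans"]) > 1}

scans = {
    "scan-t1": [("malware-delivery.example", "spki-aaa")],
    "scan-t2": [("new-staging.example", "spki-aaa"),  # same key, new domain
                ("benign.example", "spki-bbb")],
}
reused = track_cert_reuse(scans)
```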
# Extract certificate details for threat actor domains
jq '[.nodes_cert[] | {
spki: .spki_sha256,
subject: .subject_cn,
issuer: .issuer_cn,
valid_from: .not_before,
valid_until: .not_after
}]' threat-scan.json
Certificate Authority Pattern Analysis
Threat actors tend to use consistent certificate providers. Track which CAs issue certificates to suspicious domains:
# CA distribution across suspect domains
jq -r '.nodes_cert[].issuer_cn' threat-scan.json | \
sort | uniq -c | sort -rn
A cluster of domains all using certificates from the same free CA, issued within a narrow time window, is a strong indicator of coordinated infrastructure setup.
Certificate Timeline Analysis
# Find certificates issued in a tight time window (batch registration)
jq -r '.nodes_cert[] | [.not_before, .subject_cn, .issuer_cn] | @tsv' \
threat-scan.json | sort -k1
If many certificates were issued on the same day for different domains, the domains were likely registered together as part of a campaign.
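The same-day check can be automated by bucketing certificates by issuance day. This sketch assumes `(not_before, subject_cn)` tuples as extracted by the jq command above, with ISO 8601 timestamps:

```python
from collections import defaultdict

def batch_issued(certs, threshold=3):
    """Group certificates by issuance day; days with at least `threshold`
    distinct subjects suggest batch registration of campaign domains.

    certs: [(not_before_iso, subject_cn), ...]
    """
    by_day = defaultdict(set)
    for not_before, subject in certs:
        by_day[not_before[:10]].add(subject)  # keep the YYYY-MM-DD prefix
    return {day: sorted(subjects) for day, subjects in by_day.items()
            if len(subjects) >= threshold}

certs = [
    ("2024-03-01T09:12:00Z", "malware-delivery.example"),
    ("2024-03-01T09:14:00Z", "phishing-login.example"),
    ("2024-03-01T09:15:00Z", "c2-callback.example"),
    ("2024-06-20T00:00:00Z", "benign.example"),
]
batches = batch_issued(certs, threshold=3)
```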
Wildcard Certificate Abuse Detection
# Find wildcard certificates and their associated domains
jq -r '.nodes_cert[] | select(.subject_cn | startswith("*")) | .subject_cn' \
threat-scan.json | sort -u
# Find which domains use these wildcards
WILDCARDS=$(jq -r '.nodes_cert[] | select(.subject_cn | startswith("*")) | .spki_sha256' threat-scan.json)
for spki in $WILDCARDS; do
echo "=== Certificate: $spki ==="
jq -r --arg s "$spki" '.edges[] | select(.type=="USES_CERT" and .target==$s) | .source' threat-scan.json
done
Integration with Threat Intelligence Platforms
STIX/TAXII Export
Convert SPYDER output to STIX indicators for ingestion into threat intelligence platforms:
import json
import uuid
from datetime import datetime
def spyder_to_stix(spyder_file):
    """Convert SPYDER edges to STIX 2.1 indicators."""
    indicators = []
    with open(spyder_file) as f:
        for line in f:
            data = json.loads(line)
            for edge in data.get("edges", []):
                if edge["type"] == "RESOLVES_TO":
                    indicators.append({
                        "type": "indicator",
                        "spec_version": "2.1",
                        "id": f"indicator--{uuid.uuid4()}",
                        "created": datetime.utcnow().isoformat() + "Z",
                        "name": f"{edge['source']} resolves to {edge['target']}",
                        "pattern": f"[domain-name:value = '{edge['source']}']",
                        "pattern_type": "stix",
                        "valid_from": edge["observed_at"],
                        "labels": ["malicious-activity"],
                    })
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}", "objects": indicators}
MISP Event Creation
# Extract IOCs for MISP import
jq -r '.edges[] | select(.type=="RESOLVES_TO") |
"domain-ip|" + .source + "|" + .target' threat-scan.json > misp-import.csv
jq -r '.nodes_cert[] |
"x509-fingerprint-sha256|" + .spki_sha256' threat-scan.json >> misp-import.csv
Blocklist Generation
# Generate domain blocklist from expanded IOCs
jq -r '.nodes_domain[].host' threat-scan.json | sort -u > blocklist-domains.txt
# Generate IP blocklist
jq -r '.nodes_ip[].ip' threat-scan.json | sort -u > blocklist-ips.txt
# Generate in common firewall formats
awk '{print "deny ip any host " $1}' blocklist-ips.txt > acl-rules.txt
Continuous Monitoring with Recursive Mode
Persistent Threat Monitoring
Use -continuous mode to keep expanding the threat actor's known infrastructure as new domains are discovered:
./bin/spyder -domains=threat-seeds.txt \
-continuous \
-max_domains=10000 \
-concurrency=128 \
-exclude_tlds=gov,mil,int,edu \
-ingest=https://threat-ingest.internal/v1/batch \
-probe=threat-monitor-1 \
-ua="ThreatIntel/1.0"
In this mode, when SPYDER discovers that malware-c2.example links to exfil-staging.example, the second domain is automatically added to the crawl queue. Its DNS, certificates, and links are then analyzed, potentially discovering payload-host.example, and so on.
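Conceptually this is a breadth-first expansion over LINKS_TO edges, capped by -max_domains. A sketch with a static link map standing in for live crawl results (the domains mirror the example above):

```python
from collections import deque

def expand_queue(seeds, links, max_domains):
    """Breadth-first expansion mimicking how -continuous mode feeds newly
    discovered domains back into the crawl queue.

    links: {domain: [linked domains]} -- a stand-in for live crawl output.
    Returns domains in the order they would be crawled.
    """
    seen = set(seeds)
    queue = deque(seeds)
    order = []
    while queue and len(order) < max_domains:
        domain = queue.popleft()
        order.append(domain)
        for nxt in links.get(domain, []):
            if nxt not in seen:  # never re-enqueue a known domain
                seen.add(nxt)
                queue.append(nxt)
    return order

links = {
    "malware-c2.example": ["exfil-staging.example"],
    "exfil-staging.example": ["payload-host.example"],
}
crawl_order = expand_queue(["malware-c2.example"], links, max_domains=10)
```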
Scheduled Threat Sweeps
Run periodic scans against known threat actor infrastructure to detect changes:
#!/bin/bash
# threat-sweep.sh - Run daily against known threat IOCs
DATE=$(date +%Y%m%d-%H%M)
SEEDS="/etc/spyder/threat-iocs.txt"
OUTPUT="/var/lib/spyder/threat-sweeps/sweep-${DATE}.json"
./bin/spyder \
-domains="${SEEDS}" \
-continuous \
-max_domains=5000 \
-concurrency=256 \
-probe=threat-sweep \
-run="sweep-${DATE}" \
-exclude_tlds=gov,mil,int,edu \
2>/dev/null > "${OUTPUT}"
# Alert on new infrastructure
PREV=$(ls -t /var/lib/spyder/threat-sweeps/sweep-*.json | sed -n '2p')
if [ -n "$PREV" ]; then
NEW_DOMAINS=$(diff <(jq -r '.nodes_domain[].host' "$PREV" | sort -u) \
<(jq -r '.nodes_domain[].host' "$OUTPUT" | sort -u) | grep "^>" | wc -l)
NEW_IPS=$(diff <(jq -r '.nodes_ip[].ip' "$PREV" | sort -u) \
<(jq -r '.nodes_ip[].ip' "$OUTPUT" | sort -u) | grep "^>" | wc -l)
if [ "$NEW_DOMAINS" -gt 0 ] || [ "$NEW_IPS" -gt 0 ]; then
echo "ALERT: ${NEW_DOMAINS} new domains, ${NEW_IPS} new IPs detected in threat sweep" | \
mail -s "SPYDER Threat Sweep Alert" soc@company.com
fi
fi
Redis-Backed Continuous Monitoring
For ongoing threat monitoring across a team of analysts, use Redis-backed queuing so that any analyst can add new seed domains and the crawl continues:
export REDIS_ADDR=redis.soc.internal:6379
export REDIS_QUEUE_ADDR=redis.soc.internal:6379
export REDIS_QUEUE_KEY=spyder:threat-intel
./bin/spyder \
-continuous \
-max_domains=50000 \
-concurrency=256 \
-ingest=https://threat-ingest.internal/v1/batch \
-probe=threat-worker-$(hostname) \
-domains=/etc/spyder/threat-seeds.txt
New IOCs can be pushed to the Redis queue independently, and the running probe instances will pick them up and begin expanding the infrastructure graph.
Operational Considerations
Rate Limiting and Ethics
When scanning threat infrastructure, use responsible rate limiting and identification:
./bin/spyder -domains=iocs.txt \
-concurrency=64 \
-ua="ThreatResearch/1.0 (+https://your-org.com/security-research)" \
-exclude_tlds=gov,mil,int,edu
SPYDER respects robots.txt by default. Threat infrastructure often blocks crawlers, so be aware that robots.txt compliance may limit visibility into some of the domains you most want to observe.
Data Handling
Threat intelligence data from SPYDER scans should be handled with appropriate controls:
- Store scan results in access-controlled storage
- Limit retention to the period required by your threat intel program
- Anonymize or redact data when sharing outside your organization
- Tag data with classification labels appropriate to your environment
- Consider legal jurisdiction when scanning infrastructure in other countries