Testing Guide
This guide covers running, writing, and maintaining tests for the SPYDER codebase. SPYDER uses Go's built-in testing framework with no external test dependencies.
Running Tests
Full Test Suite
Run every test in the project with a single command:
go test ./...
For verbose output showing individual test names and results:
go test -v ./...
The Makefile provides a shortcut that also generates a coverage profile:
make test
This runs go test ./... -coverprofile=coverage.txt under the hood.
Running Tests for a Specific Package
Target a single package when you are working on a particular subsystem:
# Test only the DNS resolver
go test -v ./internal/dns
# Test only the circuit breaker
go test -v ./internal/circuitbreaker
# Test only config loading and validation
go test -v ./internal/config
Running a Single Test
Use the -run flag with a regex that matches the test function name:
# Run only the YAML config loading test
go test -v -run TestLoadFromFile_YAML ./internal/config
# Run all circuit breaker state transition tests
go test -v -run TestCircuitBreaker ./internal/circuitbreaker
# Run only the concurrent dedup test
go test -v -run TestMemory_Concurrent ./internal/dedup
Test Coverage
Generating a Coverage Profile
go test -coverprofile=coverage.txt ./...
Viewing Coverage in the Terminal
go tool cover -func=coverage.txt
This prints per-function coverage percentages, for example:
github.com/gustycube/spyder/internal/config/config.go:52: SetDefaults 100.0%
github.com/gustycube/spyder/internal/config/config.go:92: Validate 100.0%
github.com/gustycube/spyder/internal/dedup/memory.go:15: Seen 100.0%
Viewing Coverage in a Browser
Generate an HTML report and open it:
go tool cover -html=coverage.txt -o coverage.html
open coverage.html # macOS
xdg-open coverage.html # Linux
The HTML report highlights covered lines in green and uncovered lines in red, making it easy to find gaps.
Per-Package Coverage
Check coverage for a single package during development:
go test -cover ./internal/rate
# ok github.com/gustycube/spyder/internal/rate 0.015s coverage: 87.5% of statements
Race Detection
Go's race detector finds data races at runtime. SPYDER uses goroutines extensively (worker pools, concurrent dedup, rate limiters), so race detection is critical.
Running Tests with Race Detection
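go test -race ./...
This builds the tests with Go's race detector, which is based on the ThreadSanitizer runtime. Tests run slower (typically 2-20x, with 5-10x higher memory use), but the detector can catch data races that would otherwise surface only under specific timing conditions. It reports only races on code paths that actually execute during the run, so tests need to exercise the concurrent code.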
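As a minimal sketch of the kind of bug the detector reports, the program below increments a shared counter from many goroutines. It is not SPYDER code: the counter type and countConcurrently function are hypothetical. With the mutex in place the program is race-free; removing the Lock/Unlock pair and running it with go run -race should produce a data race report on c.n.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical shared counter guarded by a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

// inc increments the counter. Without the mutex, concurrent calls
// would be an unsynchronized write to c.n, which is a data race.
func (c *counter) inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// countConcurrently starts one goroutine per worker, each calling inc
// once, waits for all of them, and returns the final count.
func countConcurrently(workers int) int {
	var (
		c  counter
		wg sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.inc()
		}()
	}
	wg.Wait() // all increments happen before this returns
	return c.n
}

func main() {
	fmt.Println(countConcurrently(100)) // prints 100
}
```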
Running Race Detection on Specific Packages
Packages with concurrent code that should always be tested with -race:
go test -race ./internal/dedup # concurrent map access via sync.Map
go test -race ./internal/rate # concurrent per-host limiter access
go test -race ./internal/circuitbreaker # state transitions under load
Building a Race-Instrumented Binary
For manual testing against a live environment:
go build -race -o bin/spyder-debug ./cmd/spyder
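./bin/spyder-debug -domains=configs/domains.txt -concurrency=64
Any detected race prints a diagnostic to stderr. By default the program keeps running and exits with a non-zero status (66) when it finishes; set GORACE="halt_on_error=1" to stop at the first report.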
Existing Test Packages
The following packages have test coverage. Use them as examples when writing new tests.
internal/circuitbreaker
Tests the three-state circuit breaker (Closed, Open, Half-Open) and the per-host breaker wrapper:
go test -v ./internal/circuitbreaker
Key test cases:
- TestCircuitBreaker_ClosedState -- successful requests keep circuit closed
- TestCircuitBreaker_OpensOnFailures -- exceeding failure ratio opens the circuit
- TestCircuitBreaker_HalfOpenState -- timeout transitions to half-open, successes close it
- TestCircuitBreaker_HalfOpenFailure -- failure in half-open reopens the circuit
- TestHostBreaker -- independent breakers per host, stats, reset
- TestExecuteWithRetry -- retry logic with exponential backoff
- TestExecuteWithRetry_CircuitOpen -- retries abort when circuit is open
internal/config
Tests YAML/JSON loading, default values, validation, flag merging, and environment variable loading:
go test -v ./internal/config
Key test cases:
- TestLoadFromFile_YAML -- loads and parses a YAML config file
- TestLoadFromFile_JSON -- loads and parses a JSON config file
- TestSetDefaults -- verifies all default values (concurrency=256, batch_max_edges=10000, etc.)
- TestValidate -- table-driven validation with valid and invalid configs
- TestMergeWithFlags -- CLI flags override file config, unset flags preserve originals
- TestLoadFromEnv -- reads REDIS_ADDR, REDIS_QUEUE_ADDR, REDIS_QUEUE_KEY from environment
internal/dedup
Tests the in-memory deduplication implementation:
go test -v ./internal/dedup
Key test cases:
- TestMemory_Seen -- first call returns false, second returns true
- TestMemory_Concurrent -- 100 goroutines racing on the same key; exactly one sees it as new
- BenchmarkMemory_Seen -- benchmarks for unique keys and repeated keys
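The concurrent test expects exactly one caller to observe a key as new. One way to get that guarantee is an atomic check-and-insert, sketched here on the assumption that the set is backed by sync.Map. The memory type and countFirstSeen helper below are hypothetical and may not match the SPYDER implementation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// memory is a hypothetical in-memory dedup set backed by sync.Map.
type memory struct {
	seen sync.Map
}

// Seen records key and reports whether it had already been recorded.
// LoadOrStore checks and inserts in one atomic step, so only one of
// many concurrent callers gets false for a given key.
func (m *memory) Seen(key string) bool {
	_, loaded := m.seen.LoadOrStore(key, struct{}{})
	return loaded
}

// countFirstSeen calls Seen(key) from n goroutines and returns how
// many of them observed the key as new.
func countFirstSeen(n int, key string) int {
	var (
		m     memory
		wg    sync.WaitGroup
		first atomic.Int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if !m.Seen(key) {
				first.Add(1)
			}
		}()
	}
	wg.Wait()
	return int(first.Load())
}

func main() {
	fmt.Println(countFirstSeen(100, "concurrent-key")) // prints 1
}
```

A separate Load followed by Store would leave a window in which two goroutines both see the key as missing, which is exactly what the concurrent test is designed to catch.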
internal/dns
Tests DNS resolution against live DNS servers:
go test -v ./internal/dns
Key test cases:
- TestResolveAll -- resolves google.com; checks for IPs, NS records, no trailing dots
- TestResolveAll_InvalidDomain -- non-existent domain returns empty results without panic
- TestResolveAll_ContextCancellation -- cancelled context returns empty results gracefully
- BenchmarkResolveAll -- benchmark for DNS resolution latency
Note: DNS tests make live network calls. They may be flaky in environments without DNS access (some CI containers, air-gapped networks). Consider using -short to skip them if needed.
internal/rate
Tests the per-host token bucket rate limiter:
go test -v ./internal/rate
Key test cases:
- TestPerHost_Allow -- burst allowance, rate limiting after exhaustion, independent host limits
- TestPerHost_Wait -- blocking wait respects rate interval
- TestPerHost_Concurrent -- 20 goroutines contend on same host; rate limiting applies
- TestPerHost_MultipleHosts -- each host gets its own burst allowance
- BenchmarkPerHost_Allow -- single-host and multi-host benchmark
internal/robots
Tests robots.txt caching and TLD exclusion:
go test -v ./internal/robots
Key test cases:
- TestCache_Get -- fetches from httptest server, verifies caching returns same instance
- TestCache_Get_404 -- 404 response returns empty (allow-all) robots data
- TestShouldSkipByTLD -- table-driven test for TLD exclusion (gov, mil, int)
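To make the TLD exclusion behaviour concrete, here is one possible implementation that is consistent with the cases in the table-driven test shown later in this guide (example.gov is skipped, gov.example.com is not). The skipByTLD function is a hypothetical sketch; the real ShouldSkipByTLD may differ, for example in how it treats hosts without a dot.

```go
package main

import (
	"fmt"
	"strings"
)

// skipByTLD reports whether the last label of host matches one of the
// excluded TLDs. Matching is case-insensitive and ignores a trailing dot.
func skipByTLD(host string, excluded []string) bool {
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	i := strings.LastIndexByte(host, '.')
	if i < 0 {
		return false // assumption: a bare label is never treated as a TLD match
	}
	tld := host[i+1:]
	for _, e := range excluded {
		if tld == strings.ToLower(e) {
			return true
		}
	}
	return false
}

func main() {
	excluded := []string{"gov", "mil", "int"}
	hosts := []string{"example.gov", "subdomain.example.gov", "example.com", "gov.example.com"}
	for _, h := range hosts {
		fmt.Printf("%-24s %v\n", h, skipByTLD(h, excluded))
	}
}
```

Comparing only the final label is what keeps gov.example.com from matching; a substring or prefix check would wrongly exclude it.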
Integration Testing with Redis
Several packages support Redis backends (dedup, queue). Integration tests for these require a running Redis instance.
Setting Up Redis for Tests
# Start Redis locally
redis-server &
# Or with Docker
docker run -d --name spyder-redis -p 6379:6379 redis:7-alpine
Running Integration Tests
Set the REDIS_ADDR environment variable to enable Redis-backed tests:
REDIS_ADDR=127.0.0.1:6379 go test -v ./internal/dedup
REDIS_ADDR=127.0.0.1:6379 go test -v ./internal/queue
Without REDIS_ADDR, the Redis dedup and queue tests are skipped, and only the in-memory implementations are tested.
Full Integration Test Run
# Start Redis, run all tests, stop Redis
docker run -d --name spyder-test-redis -p 6379:6379 redis:7-alpine
REDIS_ADDR=127.0.0.1:6379 go test -race -v ./...
docker rm -f spyder-test-redis
Benchmarks
Several packages include benchmarks for performance-sensitive code:
# Run all benchmarks
go test -bench=. ./...
# Run benchmarks for a specific package
go test -bench=. ./internal/dedup
go test -bench=. ./internal/rate
go test -bench=. ./internal/dns
# Run benchmarks with memory allocation stats
go test -bench=. -benchmem ./internal/dedup
# Run a specific benchmark
go test -bench=BenchmarkMemory_Seen ./internal/dedup
Example output:
BenchmarkMemory_Seen/UniqueKeys-8 5000000 234 ns/op 48 B/op 1 allocs/op
BenchmarkMemory_Seen/SameKey-8 20000000 62.3 ns/op 0 B/op 0 allocs/op
Writing New Tests
File Naming
Test files live alongside the code they test and use the _test.go suffix:
internal/
  extract/
    extract.go
    extract_test.go      # tests for extract.go
  httpclient/
    httpclient.go
    httpclient_test.go   # tests for httpclient.go
Test Function Naming
Follow Go conventions. Test functions start with Test, benchmarks with Benchmark:
func TestParseLinks(t *testing.T) { ... }
func TestParseLinks_EmptyBody(t *testing.T) { ... }
func BenchmarkParseLinks(b *testing.B) { ... }
Table-Driven Tests
SPYDER uses table-driven tests extensively. Follow this pattern for validation and transformation logic:
func TestShouldSkipByTLD(t *testing.T) {
	excluded := []string{"gov", "mil", "int"}
	tests := []struct {
		host     string
		expected bool
	}{
		{"example.gov", true},
		{"subdomain.example.gov", true},
		{"example.com", false},
		{"gov.example.com", false},
	}
	for _, tt := range tests {
		result := ShouldSkipByTLD(tt.host, excluded)
		if result != tt.expected {
			t.Errorf("ShouldSkipByTLD(%s) = %v, want %v",
				tt.host, result, tt.expected)
		}
	}
}
For subtests with names (useful for identifying failures):
func TestValidate(t *testing.T) {
	tests := []struct {
		name    string
		cfg     Config
		wantErr bool
	}{
		{
			name:    "valid config",
			cfg:     Config{Domains: "d.txt", Concurrency: 256, BatchMaxEdges: 10000, BatchFlushSec: 2},
			wantErr: false,
		},
		{
			name:    "missing domains",
			cfg:     Config{Concurrency: 256, BatchMaxEdges: 10000, BatchFlushSec: 2},
			wantErr: true,
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			err := tt.cfg.Validate()
			if (err != nil) != tt.wantErr {
				t.Errorf("Validate() error = %v, wantErr %v", err, tt.wantErr)
			}
		})
	}
}
Using httptest for HTTP Tests
When testing components that make HTTP calls, use net/http/httptest to avoid live network dependencies:
func TestCache_Get(t *testing.T) {
	// Local server that serves a robots.txt and returns 404 for everything else.
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/robots.txt" {
			w.WriteHeader(http.StatusOK)
			w.Write([]byte("User-agent: *\nDisallow: /private/\n"))
		} else {
			w.WriteHeader(http.StatusNotFound)
		}
	}))
	defer server.Close()
	client := &http.Client{Timeout: 2 * time.Second}
	cache := NewCache(client, "TestBot/1.0")
	ctx := context.Background()
	// server.URL has the form http://127.0.0.1:PORT; [7:] strips the
	// "http://" prefix so that only host:port is passed to Get.
	rd, err := cache.Get(ctx, server.URL[7:])
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if rd == nil {
		t.Fatal("expected robots data, got nil")
	}
}
Using t.TempDir() for File Tests
For tests that need temporary files (config loading, spool writing):
func TestLoadFromFile_YAML(t *testing.T) {
	yamlContent := `
probe: test-probe
domains: domains.txt
concurrency: 512
`
	tmpDir := t.TempDir()
	configFile := filepath.Join(tmpDir, "config.yaml")
	if err := os.WriteFile(configFile, []byte(yamlContent), 0644); err != nil {
		t.Fatal(err)
	}
	cfg, err := LoadFromFile(configFile)
	if err != nil {
		t.Fatalf("failed to load config: %v", err)
	}
	// assertions...
}
Testing Concurrent Code
Use sync.WaitGroup and verify that concurrent access is safe:
func TestMemory_Concurrent(t *testing.T) {
	d := NewMemory()
	var wg sync.WaitGroup
	firstSeen := 0
	var mu sync.Mutex
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if !d.Seen("concurrent-key") {
				mu.Lock()
				firstSeen++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	if firstSeen != 1 {
		t.Errorf("expected exactly 1 first occurrence, got %d", firstSeen)
	}
}
Always run concurrent tests with -race to catch data races that might not cause test failures on their own.
Linting
Running the Linter
make lint
This runs golangci-lint run, which checks for:
- govet -- suspicious constructs (e.g., printf format mismatches)
- staticcheck -- advanced static analysis
- errcheck -- unchecked error returns
- gosimple -- code simplifications
- ineffassign -- ineffectual variable assignments
Installing golangci-lint
# macOS
brew install golangci-lint
# Linux
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin
# Verify
golangci-lint version
CI Pipeline
The GitHub Actions CI pipeline (.github/workflows/ci.yml) runs on every push and pull request:
name: ci
on:
  push:
  pull_request:
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.23.x'
      - name: Build
        run: go build -v ./cmd/spyder
      - name: Test
        run: go test ./... -v
The CI pipeline:
- Checks out the repository
- Sets up Go 1.23.x (matching go.mod)
- Builds the spyder binary to verify compilation
- Runs the full test suite with verbose output
Tests must pass before a pull request can be merged. If a test fails in CI, check the Actions tab on GitHub for the full log output.
Quick Reference
| Task | Command |
|---|---|
| Run all tests | go test ./... |
| Run all tests (verbose) | go test -v ./... |
| Run with coverage | make test |
| Run with race detection | go test -race ./... |
| Run single package | go test -v ./internal/dns |
| Run single test | go test -v -run TestResolveAll ./internal/dns |
| Run benchmarks | go test -bench=. ./... |
| Coverage HTML report | go tool cover -html=coverage.txt -o coverage.html |
| Lint | make lint |
| Integration tests | REDIS_ADDR=127.0.0.1:6379 go test -v ./... |