Katana is a fast web crawler designed for web application security testing. It discovers endpoints, parameters, and hidden paths through intelligent crawling and optional JavaScript execution.
Basic Usage
- katana -u https://example.com - Crawl URL
- katana -list urls.txt - Crawl from list
- katana -u https://example.com -o output.txt - Save output
- katana -u https://example.com -json - JSON output
Options
- -u url - Target URL
- -list file - URLs from file
- -o file - Output file
- -json - JSON output format
- -depth N - Max crawl depth (default: 3)
- -scope domain - Restrict crawling to the given domain
- -crawl - Enable crawling
- -js-crawl - Parse and crawl JavaScript files for endpoints
- -headless - Use headless browser
- -timeout N - Request timeout (seconds)
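A typical combined run using the options above; the flag values are illustrative, not recommendations:
katana -u https://example.com -depth 5 -js-crawl -headless -timeout 15 -o endpoints.txt
Crawls five levels deep, renders JavaScript in a headless browser, allows 15 seconds per request, and writes discovered URLs to endpoints.txt.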
Filtering
- -match-regex pattern - Only output URLs matching the regex
- -filter-regex pattern - Exclude URLs matching the regex
- -match-status 200,301 - Match status codes
- -filter-status 404 - Filter status codes
- -match-size N - Match response size
- -filter-size N - Filter response size
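A filtering sketch combining the flags above; the regex and status code are illustrative and assume the flags behave as listed:
katana -u https://example.com -match-regex "api|admin" -filter-status 404 -o filtered.txt
Keeps only URLs matching the regex and drops 404 responses.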
Advanced Options
- -delay N - Delay between requests (seconds)
- -concurrency N - Concurrent requests
- -rate-limit N - Rate limit (req/sec)
- -proxy http://proxy:8080 - Use proxy
- -header "Name: Value" - Custom header
- -cookie "name=value" - Cookie
- -user-agent "UA" - User agent
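A throttled run through a local proxy with a custom header, combining the advanced flags above (values are illustrative):
katana -u https://example.com -rate-limit 10 -concurrency 5 -proxy http://127.0.0.1:8080 -header "Authorization: Bearer TOKEN" -o out.txt
Caps traffic at 10 requests per second across 5 concurrent fetchers, routes everything through the proxy, and attaches the auth header to every request.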
Common Examples
Basic Crawl
katana -u https://example.com
Crawl the target with default settings (depth 3, no JavaScript rendering).
Save Output
katana -u https://example.com -o urls.txt
Save discovered URLs.
JSON Output
katana -u https://example.com -json -o results.json
Output in JSON format.
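To consume the JSON results programmatically, a jq one-liner like the following is a reasonable sketch; it assumes jq is installed and that each result object exposes the discovered URL as request.endpoint, which may differ between katana versions:
katana -u https://example.com -json | jq -r '.request.endpoint' | sort -u > endpoints.txt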
JavaScript Crawling
katana -u https://example.com -js-crawl -headless
Render pages in a headless browser and parse JavaScript for additional endpoints (useful for single-page applications).
Depth Limit
katana -u https://example.com -depth 5
Follow links up to 5 levels deep (default is 3).
Scope to Domain
katana -u https://example.com -scope example.com
Limit to domain.
Filter Status
katana -u https://example.com -filter-status 404
Exclude 404 responses.
With Proxy
katana -u https://example.com -proxy http://127.0.0.1:8080
Route all requests through a local proxy such as Burp Suite.
Custom Headers
katana -u https://example.com -header "Authorization: Bearer token"
Add authentication header.
From List
katana -list urls.txt -o discovered.txt
Crawl multiple URLs.
Tips
- Use -js-crawl for modern SPAs
- Adjust -depth based on site size
- Use -scope to avoid crawling external sites
- Filter out noise with -filter-status
- Use -json for programmatic processing
- Combine with other tools in pipelines (see the sketch after these tips)
- Respect rate limits and robots.txt
- Great for endpoint discovery
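A minimal pipeline sketch for the tips above, assuming the companion ProjectDiscovery tools subfinder and httpx are installed and that all three accept piped input with -silent:
subfinder -d example.com -silent | httpx -silent | katana -silent -o all-endpoints.txt
Enumerates subdomains, probes for live hosts, then crawls each one, writing every discovered endpoint to all-endpoints.txt.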