Analytics Accuracy Guide
How to Verify Analytics Accuracy
Run the verification script to check for discrepancies:
php /var/www/verify-analytics.php [YYYY-MM-DD]
If no date is provided, it checks today's data.
Current Accuracy Issues Found
1. Returning Visitor Count Bug ⚠️
The summary shows incorrect returning visitor counts. The verification script counts unique returning visitors, but the summary reports a higher figure, so the summary logic appears flawed.
Impact: Returning visitor numbers are inflated.
2. RSS Click Tracking
RSS clicks are tracked in two ways:
- Button clicks on the page (tracked via JavaScript)
- Actual RSS feed fetches (tracked via PHP in feed.php)
Impact: RSS numbers may be double-counted or inconsistent.
3. No Bot Filtering
Bot traffic (search engines, crawlers) is currently counted as regular visitors.
Impact: Numbers may be inflated by 10-30% depending on site popularity.
4. Ad Blockers
Users with ad blockers may block the analytics script entirely.
Impact: Numbers may be deflated by 5-15% (depending on user base).
5. Self-Visits
Your own visits are not filtered out.
Impact: Development/testing visits inflate numbers.
6. Duplicate Pageviews
A pageview from the same visitor on the same page within 5 seconds of the previous one is a potential duplicate.
Impact: Rapid navigation or page refreshes create duplicates.
7. New vs Returning Logic
Currently only checks within the same day. A visitor who came yesterday but returns today is counted as "new" again.
Impact: Returning visitor counts are inaccurate across days.
Factors Affecting Accuracy
✅ What IS Tracked Accurately:
- Pageview timestamps (hourly breakdown is recalculated from raw data)
- Share counts (when JavaScript executes)
- Reaction counts (stored separately, very accurate)
⚠️ What May Be Inaccurate:
- Total visits: May include bots, duplicates, self-visits
- New vs Returning: Only accurate within same day
- RSS clicks: May have double-counting issues
- Unique visitors: Uses localStorage, can be cleared/blocked
Recommendations to Improve Accuracy
1. Filter Bot Traffic
Add bot detection in track.php:
// Check the user agent for common bot markers
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
// preg_match() returns 1 on a match, so compare to get a real boolean
$isBot = preg_match('/bot|crawler|spider|scraper/i', $ua) === 1;
if ($isBot) {
    // Skip tracking or mark as bot
}
2. Filter Self-Visits
Add your IP(s) to a blocklist in track.php:
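// Replace the placeholders with your own IP address(es)
$yourIPs = ['YOUR_IP_HERE', 'ANOTHER_IP'];
// Note: behind a proxy or CDN, REMOTE_ADDR is the proxy's address, not yours
if (in_array($_SERVER['REMOTE_ADDR'] ?? '', $yourIPs, true)) {
    // Skip tracking
}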
3. Fix Returning Visitor Logic
Store visitor history across days, not just within the same day.
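A minimal sketch of cross-day history, assuming each visitor already has a stable ID (for example the one kept in localStorage) and that a small JSON file is acceptable storage. The function names `classifyVisit`, `loadHistory`, and `saveHistory` are hypothetical and not part of the existing track.php:

```php
<?php
/**
 * Classify a visit as "new" or "returning" using a first-seen history
 * that persists across days. $history maps visitor ID => first-seen date.
 * Hypothetical helper, shown for illustration only.
 */
function classifyVisit(array &$history, string $visitorId, string $today): string
{
    if (!isset($history[$visitorId])) {
        // Never seen on any day: remember when this visitor first appeared.
        $history[$visitorId] = $today;
        return 'new';
    }
    // Seen before, either earlier today or on a previous day.
    return 'returning';
}

/** Load the history from a JSON file, or start empty if it is missing or invalid. */
function loadHistory(string $path): array
{
    if (!is_file($path)) {
        return [];
    }
    $data = json_decode((string) file_get_contents($path), true);
    return is_array($data) ? $data : [];
}

/** Write the history back to disk, holding an exclusive lock while writing. */
function saveHistory(string $path, array $history): void
{
    file_put_contents($path, json_encode($history), LOCK_EX);
}
```

Because the history is keyed by visitor ID rather than by date, a visitor who came yesterday is classified as "returning" today. A visitor who clears localStorage gets a new ID and still counts as "new".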
4. Deduplicate Rapid Pageviews
Add a cooldown period, e.g., ignore a pageview when the same visitor loads the same page again within 10 seconds.
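The cooldown could look roughly like this. It is a sketch, assuming a visitor ID and page path are available at tracking time; `shouldRecordPageview` and the `$lastSeen` map are hypothetical names, and how that map is persisted between requests is left open:

```php
<?php
/**
 * Decide whether a pageview should be recorded, ignoring repeats of the
 * same visitor + page inside a cooldown window (in seconds).
 * $lastSeen maps "visitorId|page" => Unix timestamp of the last RECORDED view.
 * Hypothetical helper, shown for illustration only.
 */
function shouldRecordPageview(
    array &$lastSeen,
    string $visitorId,
    string $page,
    int $now,
    int $cooldown = 10
): bool {
    $key = $visitorId . '|' . $page;
    if (isset($lastSeen[$key]) && ($now - $lastSeen[$key]) < $cooldown) {
        // Same visitor, same page, still inside the window: treat as duplicate.
        return false;
    }
    // Outside the window (or first view): record it and restart the window.
    $lastSeen[$key] = $now;
    return true;
}
```

The window is measured from the last recorded view, not the last attempt, so a visitor who refreshes every few seconds is still counted once per cooldown period instead of being suppressed indefinitely.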
5. Separate RSS Tracking
Distinguish between:
- RSS button clicks (user intent)
- RSS feed fetches (automatic, may be bots)
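One way to keep the two signals apart is to tag each event with its source and report them as separate totals. The sketch below assumes events are stored as arrays with a `type` and a `ua` (user agent) field; those field names, the type values, and `summarizeRssEvents` are illustrative assumptions, not the current storage format:

```php
<?php
/**
 * Tally RSS events by source so button clicks and feed fetches are
 * never added into a single number.
 * Each event: ['type' => 'rss_button_click' | 'rss_feed_fetch', 'ua' => string].
 * Hypothetical helper and event shape, shown for illustration only.
 */
function summarizeRssEvents(array $events): array
{
    $summary = ['button_clicks' => 0, 'feed_fetches' => 0, 'feed_fetches_bot' => 0];
    foreach ($events as $event) {
        $type = $event['type'] ?? '';
        if ($type === 'rss_button_click') {
            // A person clicked the RSS button on the page (user intent).
            $summary['button_clicks']++;
        } elseif ($type === 'rss_feed_fetch') {
            // feed.php was requested, often by a feed reader polling automatically.
            $summary['feed_fetches']++;
            $ua = $event['ua'] ?? '';
            if (preg_match('/bot|crawler|spider|scraper/i', $ua) === 1) {
                $summary['feed_fetches_bot']++;
            }
        }
    }
    return $summary;
}
```

Reporting `button_clicks` and `feed_fetches` side by side avoids the double-counting described earlier, since one subscriber can produce a single click followed by many automatic fetches.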
Understanding Your Numbers
Realistic Accuracy Range
- Pageviews: ±15-25% (due to bots, ad blockers, duplicates)
- Unique Visitors: ±20-30% (localStorage can be cleared/blocked)
- Shares: ±5% (very accurate, requires JavaScript)
- Reactions: ±1% (very accurate, stored server-side)
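To make these ranges concrete, the arithmetic can be sketched as a small helper that turns a reported count and an error percentage into a plausible low/high band. `plausibleRange` is a hypothetical name, and treating the error as symmetric is a simplification:

```php
<?php
/**
 * Convert a reported count and a +/- error percentage into a plausible range.
 * Hypothetical helper, shown for illustration only.
 */
function plausibleRange(int $reported, float $errorPercent): array
{
    $delta = $reported * $errorPercent / 100;
    return [
        'low'  => (int) round($reported - $delta),
        'high' => (int) round($reported + $delta),
    ];
}

// Example: 1000 reported pageviews at the pessimistic end of ±25%
// gives ['low' => 750, 'high' => 1250].
$range = plausibleRange(1000, 25);
```

In other words, a dashboard figure of 1,000 pageviews is best read as "somewhere between about 750 and 1,250".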
What the Numbers Mean
- Total Visits: All page loads, including bots and duplicates
- New Visitors: First-time visitors today (not lifetime)
- Returning Visitors: Visitors who visited earlier today (not yesterday)
- Hourly Breakdown: Accurate (recalculated from timestamps)
Best Practices
- Run the verification script regularly to catch discrepancies
- Focus on trends rather than absolute numbers
- Compare with server logs for validation
- Filter your own IP for more accurate numbers
- Monitor for anomalies (sudden spikes may be bots)
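For the server-log comparison, a rough cross-check might look like the sketch below. It assumes the web server writes the common "combined" access-log format; the function name `countLogHits`, the static-asset extension list, and the bot pattern are illustrative assumptions and would need adjusting for a real setup:

```php
<?php
/**
 * Count successful GET page requests for one date in access-log lines,
 * split into likely-human and likely-bot hits.
 * Assumes the "combined" log format, e.g.:
 *   1.2.3.4 - - [28/Dec/2025:10:15:32 +0000] "GET /post HTTP/1.1" 200 1234 "-" "Mozilla/5.0"
 * Hypothetical helper, shown for illustration only.
 */
function countLogHits(iterable $lines, string $date): array
{
    // Convert YYYY-MM-DD into the log's day format, e.g. 28/Dec/2025.
    $day = date('d/M/Y', strtotime($date));
    $pattern = '/^\S+ \S+ \S+ \[([^:\]]+):[^\]]*\] "GET (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"/';
    $counts = ['human' => 0, 'bot' => 0];

    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $m) !== 1) {
            continue; // not a GET request or not in the expected format
        }
        [, $logDay, $path, $status, $ua] = $m;
        if ($logDay !== $day || $status !== '200') {
            continue; // other day, or not a successful response
        }
        if (preg_match('/\.(css|js|png|jpe?g|gif|svg|ico|woff2?)(\?|$)/i', $path) === 1) {
            continue; // static asset, not a pageview
        }
        $isBot = preg_match('/bot|crawler|spider|scraper/i', $ua) === 1;
        $counts[$isBot ? 'bot' : 'human']++;
    }
    return $counts;
}
```

Any iterable of lines works as input, so a log file can be streamed with `new SplFileObject($path)` instead of being loaded into memory. The "human" count will not match the analytics total exactly (ad blockers and caching affect each side differently), but a large gap between the two is worth investigating.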
Quick Accuracy Check
# Check today's data
php /var/www/verify-analytics.php
# Check specific date
php /var/www/verify-analytics.php 2025-12-28
# Look for:
# - Discrepancies between summary and raw data
# - High bot counts
# - Duplicate pageviews
# - Rapid-fire visits
Expected Accuracy
For a typical personal blog:
- Pageviews: 70-85% accurate (after accounting for bots/ad blockers)
- Unique Visitors: 60-75% accurate (localStorage limitations)
- Engagement (shares/reactions): 95%+ accurate
The analytics are good enough for trends and general insights, but don't rely on exact numbers for critical decisions.