Fraggle 5958758b3f first commit		2025-12-07 12:14:33 -04:00
README.md	first commit	2025-12-07 12:14:33 -04:00
receipt-ocr-workflow.json	first commit	2025-12-07 12:14:33 -04:00

README.md

n8n Workflow: Receipt OCR Analysis

This workflow processes receipt images uploaded from the Axion HR system, extracts key information using OCR, and saves it to the backend database.

Workflow Overview

Webhook Receives Receipt - Receives POST request with receipt image (base64) and user ID
Extract Data - Extracts image and user ID from request
OCR API Call - Sends image to OCR.space API for text extraction
Parse Receipt Data - Uses regex patterns to extract the fields below, then calculates a confidence score:
- Amount (total)
- Date
- Vendor name
- Tax amount
Save to Backend - Saves extracted data to backend API
Respond Success - Returns success response with receipt ID and extracted amount
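
How the confidence score is calculated is not described here. One plausible approach, shown as a minimal TypeScript sketch, is to give each field a weight and add up the weights of the fields that were actually found. The `ExtractedFields` shape, the weights, the 0.9 threshold, and the `processed` status are assumptions made for illustration; only `needs_review` appears in this document.

```typescript
// Hypothetical shape of the fields produced by "Parse Receipt Data".
interface ExtractedFields {
  amount: number | null;
  date: string | null;
  vendor: string | null;
  tax: number | null;
}

// Assumed weights: the total matters most, the tax least. They sum to 1.
const FIELD_WEIGHTS: Record<keyof ExtractedFields, number> = {
  amount: 0.4,
  date: 0.25,
  vendor: 0.25,
  tax: 0.1,
};

// Assumed cut-off below which a person should check the receipt.
const REVIEW_THRESHOLD = 0.9;

function scoreExtraction(fields: ExtractedFields): { confidence: number; status: string } {
  let confidence = 0;
  for (const key of Object.keys(FIELD_WEIGHTS) as (keyof ExtractedFields)[]) {
    if (fields[key] !== null) {
      confidence += FIELD_WEIGHTS[key];
    }
  }
  // Round to two decimals to hide floating-point noise in the sum.
  confidence = Math.round(confidence * 100) / 100;
  // "processed" is a made-up status name; "needs_review" comes from this README.
  const status = confidence >= REVIEW_THRESHOLD ? "processed" : "needs_review";
  return { confidence, status };
}
```

With these weights, a receipt missing only the tax amount scores 0.9, while one where only the amount was found scores 0.4 and is flagged as `needs_review`.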

Setup Instructions

1. Import the Workflow

Open your n8n instance
Click "Workflows" → "Import from File"
Select receipt-ocr-workflow.json
The workflow will be imported with all nodes configured

2. Configure Environment Variables

Set these environment variables in your n8n instance:

OCR_API_KEY=your_ocr_space_api_key
BACKEND_API_URL=https://your-backend-api.com
BACKEND_API_KEY=your_backend_api_key

Getting an OCR API Key:

Sign up at https://ocr.space/ocrapi
Get your free API key (the free tier allows 25,000 requests/month)
Alternatively, use another OCR service (Google Vision, AWS Textract, etc.)

3. Configure Webhook URL

Click on the "Webhook - Receipt Upload" node
Note the webhook URL (e.g., https://your-n8n.com/webhook/receipt-upload)
Update your frontend to POST to this URL

4. Update Backend API Endpoint

Click on the "Save to Backend" node
Update the URL to match your backend API endpoint
Ensure your backend expects this data structure:

{
  "userId": "string",
  "amount": "number",
  "date": "string",
  "vendor": "string",
  "tax": "number",
  "confidence": "number",
  "status": "string",
  "extractedText": "string"
}
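
On the backend side, the structure above can be checked before a receipt is stored. The following is a minimal TypeScript sketch of such a check, not the backend's actual code; `ReceiptPayload`, `validateReceiptPayload`, and `isReceiptPayload` are hypothetical names, and `status` is kept as a plain string because its allowed values are not listed here.

```typescript
// Mirrors the data structure listed above.
interface ReceiptPayload {
  userId: string;
  amount: number;
  date: string;
  vendor: string;
  tax: number;
  confidence: number;
  status: string;
  extractedText: string;
}

const STRING_FIELDS = ["userId", "date", "vendor", "status", "extractedText"] as const;
const NUMBER_FIELDS = ["amount", "tax", "confidence"] as const;

// Returns a list of problems; an empty list means the payload has the expected shape.
function validateReceiptPayload(value: unknown): string[] {
  if (typeof value !== "object" || value === null) {
    return ["payload is not an object"];
  }
  const record = value as Record<string, unknown>;
  const problems: string[] = [];
  for (const field of STRING_FIELDS) {
    if (typeof record[field] !== "string") {
      problems.push(`${field} must be a string`);
    }
  }
  for (const field of NUMBER_FIELDS) {
    const fieldValue = record[field];
    if (typeof fieldValue !== "number" || !Number.isFinite(fieldValue)) {
      problems.push(`${field} must be a finite number`);
    }
  }
  return problems;
}

// Type guard built on the validator, for use in typed backend code.
function isReceiptPayload(value: unknown): value is ReceiptPayload {
  return validateReceiptPayload(value).length === 0;
}
```

Returning the list of problems rather than a bare boolean makes a rejected request easier to diagnose from the n8n execution log.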

Frontend Integration

Update your Receipts.tsx component to call the n8n webhook:

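const handleFile = async (file: File) => {
  setUploading(true);

  // Read the file as a base64 data URL ("data:image/...;base64,...")
  const reader = new FileReader();

  // If the file cannot be read, log the error and reset the uploading state
  reader.onerror = () => {
    console.error('Failed to read file:', reader.error);
    setUploading(false);
  };

  // onload fires only after a successful read, so reader.result is set
  reader.onload = async () => {
    const base64Image = reader.result as string;

    try {
      const response = await fetch('YOUR_N8N_WEBHOOK_URL', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          image: base64Image,
          userId: currentUser.id,
        }),
      });

      // fetch does not reject on HTTP errors, so check the status explicitly
      if (!response.ok) {
        throw new Error(`Webhook returned HTTP ${response.status}`);
      }

      const result = await response.json();

      if (result.success) {
        // Update UI with extracted data
        setFormData({
          amount: result.amount,
          // ... other fields
        });
      }
    } catch (error) {
      console.error('OCR processing failed:', error);
    } finally {
      setUploading(false);
    }
  };

  reader.readAsDataURL(file);
};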

Workflow Customization

Using a Different OCR Service

Replace the "OCR API Call" node with your preferred service:

Google Vision API:

// Use Google Vision API node or HTTP Request
POST https://vision.googleapis.com/v1/images:annotate
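
When calling this endpoint through an HTTP Request node, the JSON body needs the image as bare base64 in `requests[].image.content`, whereas the frontend snippet in this README sends a data URL. A minimal TypeScript sketch of building the body and reading the result follows; the helper names are hypothetical, authentication is not shown, and `TEXT_DETECTION` is only one of the feature types the API offers.

```typescript
// Strips a leading "data:<mime>;base64," prefix, if present.
function stripDataUrlPrefix(image: string): string {
  const comma = image.indexOf(",");
  return image.startsWith("data:") && comma !== -1 ? image.slice(comma + 1) : image;
}

// Builds the JSON body for POST https://vision.googleapis.com/v1/images:annotate
function buildVisionRequest(image: string) {
  return {
    requests: [
      {
        image: { content: stripDataUrlPrefix(image) },
        features: [{ type: "TEXT_DETECTION" }],
      },
    ],
  };
}

// Reads the recognised text from an images:annotate response, or "" if there is none.
function readVisionText(response: unknown): string {
  const body = (response ?? {}) as {
    responses?: { fullTextAnnotation?: { text?: string } }[];
  };
  return body.responses?.[0]?.fullTextAnnotation?.text ?? "";
}
```

The string returned by `readVisionText` can then be passed to the "Parse Receipt Data" node in place of the OCR.space text.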

AWS Textract:

// Use AWS Textract node

Improving Amount Extraction

Modify the regex in the "Parse Receipt Data" node:

// More robust amount regex
const amountRegex = /(?:total|amount|sum|balance|due|\$|€|£|USD|EUR)\s*:?\s*([\d,]+\.\d{2})/i;
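
Two details are worth handling when applying an amount pattern to real receipt text: "total" also occurs inside "Subtotal", and a line such as "Total: $45.99" has a currency symbol between the keyword and the number. The TypeScript sketch below is one possible variation, not the workflow's actual code: it adds word boundaries, allows an optional currency marker after the keyword, scans every match, and prefers amounts labelled "total". The function name and the selection rule are assumptions.

```typescript
// Returns the most likely receipt total, or null if no amount is found.
function extractAmount(text: string): number | null {
  // Group 1: keyword (if any). Group 2: number with two decimals.
  const pattern =
    /(?:\b(total|amount|sum|balance|due)\b\s*:?\s*(?:[$€£]|USD|EUR)?|[$€£]|USD|EUR)\s*([\d,]+\.\d{2})/gi;

  let best: number | null = null;
  let bestIsTotal = false;
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(text)) !== null) {
    const keyword = (match[1] ?? "").toLowerCase();
    // Drop thousands separators before converting: "1,234.50" becomes 1234.5
    const value = Number((match[2] ?? "").replace(/,/g, ""));
    if (!Number.isFinite(value)) {
      continue;
    }
    const isTotal = keyword === "total";
    // A "total" line beats any other marker; among equals, the later line wins.
    if (isTotal || !bestIsTotal) {
      best = value;
      bestIsTotal = isTotal;
    }
  }
  return best;
}
```

For a receipt reading "Subtotal: 40.00", "Tax: 5.99", "Total: $45.99", "Cash $50.00", this picks 45.99: the subtotal and tax lines do not match, and the cash line is ignored because a "total" line was already found.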

Adding Category Detection

Add a Code node after parsing to detect category:

const categoryKeywords = {
  'Office Supplies': ['office', 'supplies', 'staples', 'paper'],
  'Meals': ['restaurant', 'cafe', 'food', 'dining'],
  'Transportation': ['uber', 'lyft', 'taxi', 'gas', 'fuel'],
};

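// Read the vendor from the previous node's output
// (assumes "Parse Receipt Data" emits a `vendor` field)
const item = $input.first().json;
const vendor = item.vendor;

// Detect category based on vendor name
let category = 'Other';
for (const [cat, keywords] of Object.entries(categoryKeywords)) {
  if (keywords.some(kw => vendor?.toLowerCase().includes(kw))) {
    category = cat;
    break;
  }
}

// Pass the item on with the detected category attached
return [{ json: { ...item, category } }];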

Testing

Test the Workflow

Use n8n's "Execute Workflow" button
Or send a test POST request:

curl -X POST https://your-n8n.com/webhook/receipt-upload \
  -H "Content-Type: application/json" \
  -d '{
    "image": "base64_encoded_image_here",
    "userId": "test-user-123"
  }'

Expected Response

{
  "success": true,
  "receiptId": "receipt-123",
  "amount": 45.99,
  "confidence": 0.85,
  "status": "needs_review"
}

Troubleshooting

OCR Not Extracting Amount

Check that the OCR API key is valid
Verify image quality (clear, readable text)
Adjust the regex patterns in the "Parse Receipt Data" node
Inspect the OCR API response in the node output
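
When looking at the OCR node output, it helps to separate "the OCR call failed" from "the OCR call worked but returned no text". The TypeScript sketch below does this for the subset of the OCR.space response it relies on (`IsErroredOnProcessing`, `ErrorMessage`, `ParsedResults[].ParsedText`); the function and type names are hypothetical, and the real response contains more fields than are modelled here.

```typescript
// Subset of the OCR.space response used by this sketch.
interface OcrSpaceResponse {
  IsErroredOnProcessing?: boolean;
  ErrorMessage?: string | string[] | null;
  ParsedResults?: { ParsedText?: string }[];
}

type OcrOutcome =
  | { ok: true; text: string }
  | { ok: false; reason: string };

function inspectOcrResponse(response: OcrSpaceResponse): OcrOutcome {
  if (response.IsErroredOnProcessing) {
    const message = response.ErrorMessage;
    // The error message may be a single string or a list of strings.
    const reason = Array.isArray(message)
      ? message.join("; ")
      : message ?? "unknown OCR error";
    return { ok: false, reason };
  }
  // Join the text of all parsed pages and trim trailing line breaks.
  const text = (response.ParsedResults ?? [])
    .map((page) => page.ParsedText ?? "")
    .join("\n")
    .trim();
  if (text === "") {
    return { ok: false, reason: "OCR succeeded but returned no text" };
  }
  return { ok: true, text };
}
```

A failure with an error message usually points at the API key or request format, while an empty result points at image quality.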

Backend Save Failing

Verify backend API URL is correct
Check API authentication headers
Ensure backend endpoint accepts the data structure
Check n8n execution logs for errors

Low Confidence Scores

Improve image quality before upload
Adjust regex patterns to match your receipt format
Add more extraction patterns for different receipt types
Consider using ML-based extraction for better accuracy
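
As an illustration of adding extraction patterns, the TypeScript sketch below recognises three common date layouts and normalises them to YYYY-MM-DD. It is not the workflow's actual date logic. The function names are hypothetical, slash-separated dates are assumed to be month/day/year, only the first candidate of each pattern is examined, and the day check is a simple 1 to 31 range test.

```typescript
const MONTHS = ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec"];

function pad(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

// Formats the parts as YYYY-MM-DD, or returns null if month or day is out of range.
function toIsoDate(year: number, month: number, day: number): string | null {
  if (month < 1 || month > 12 || day < 1 || day > 31) {
    return null;
  }
  return `${year}-${pad(month)}-${pad(day)}`;
}

// Returns the first recognisable date in the text as YYYY-MM-DD, or null.
function extractDate(text: string): string | null {
  // Layout 1: 2025-12-07
  const iso = /\b(\d{4})-(\d{2})-(\d{2})\b/.exec(text);
  if (iso) {
    return toIsoDate(Number(iso[1]), Number(iso[2]), Number(iso[3]));
  }
  // Layout 2: 12/07/2025, assumed to be month/day/year
  const slashed = /\b(\d{1,2})\/(\d{1,2})\/(\d{4})\b/.exec(text);
  if (slashed) {
    return toIsoDate(Number(slashed[3]), Number(slashed[1]), Number(slashed[2]));
  }
  // Layout 3: "Dec 7, 2025" or "December 7 2025"
  const named = /\b([A-Za-z]{3})[A-Za-z]*\.?\s+(\d{1,2}),?\s+(\d{4})\b/.exec(text);
  if (named) {
    const month = MONTHS.indexOf((named[1] ?? "").toLowerCase()) + 1;
    if (month > 0) {
      return toIsoDate(Number(named[3]), month, Number(named[2]));
    }
  }
  return null;
}
```

Receipts that print dates as day/month/year would need the slash pattern's groups swapped, so the layout assumption should match the region the receipts come from.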

Alternative OCR Services

If OCR.space doesn't meet your needs:

Google Cloud Vision API - High accuracy, pay-per-use
AWS Textract - Good for structured documents
Azure Computer Vision - Microsoft's OCR service
Tesseract.js - Open source, runs locally
ABBYY FineReader - Enterprise-grade OCR

Update the "OCR API Call" node accordingly.