n8n Workflow: Receipt OCR Analysis

This workflow processes receipt images uploaded from the Axion HR system, extracts key information using OCR, and saves it to the backend database.

Workflow Overview

  1. Webhook Receives Receipt - Receives a POST request containing the receipt image (base64) and the user ID
  2. Extract Data - Pulls the image and user ID out of the request body
  3. OCR API Call - Sends the image to the OCR.space API for text extraction
  4. Parse Receipt Data - Applies regex patterns to the OCR text to extract the fields below, then calculates a confidence score:
    • Amount (total)
    • Date
    • Vendor name
    • Tax amount
  5. Save to Backend - Saves the extracted data to the backend API
  6. Respond Success - Returns a success response with the receipt ID and extracted amount
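
The regex patterns themselves live in the "Parse Receipt Data" node inside receipt-ocr-workflow.json and are not reproduced in this README. As a rough sketch of what step 4 could look like, the JavaScript below extracts the four fields and scores confidence as the share of fields found. The function name parseReceiptText, the patterns, and the scoring rule are assumptions made for illustration, not the workflow's actual logic.

```javascript
// Illustrative only: these patterns and the scoring rule are assumptions,
// not the code shipped in receipt-ocr-workflow.json.
function parseReceiptText(text) {
  const toNumber = (s) => Number(s.replace(/,/g, ''));

  // Amount: a "total"-like label followed by a decimal number.
  // The leading \b keeps "Subtotal" from matching.
  const amountMatch = text.match(
    /\b(?:total|amount due|balance due)\b\s*:?\s*[$€£]?\s*([\d,]+\.\d{2})/i
  );
  // Tax: a "tax"/"VAT"/"GST" label followed by a decimal number on the same line.
  const taxMatch = text.match(/\b(?:tax|vat|gst)\b[^\d\n]*([\d,]+\.\d{2})/i);
  // Date: ISO (2025-12-07) or slash-separated (12/07/2025) forms.
  const dateMatch = text.match(/\b(\d{4}-\d{2}-\d{2}|\d{1,2}\/\d{1,2}\/\d{2,4})\b/);
  // Vendor: assume the first non-empty line is the merchant name.
  const vendorLine = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .find((line) => line.length > 0);

  const fields = {
    amount: amountMatch ? toNumber(amountMatch[1]) : null,
    tax: taxMatch ? toNumber(taxMatch[1]) : null,
    date: dateMatch ? dateMatch[1] : null,
    vendor: vendorLine || null,
  };

  // Confidence: fraction of the four fields that were found (0 to 1).
  const found = Object.values(fields).filter((v) => v !== null).length;
  return { ...fields, confidence: found / 4 };
}
```

A score built this way only says how many fields matched, not whether the matches are right, so a review step for low scores remains useful.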

Setup Instructions

1. Import the Workflow

  1. Open your n8n instance
  2. Click "Workflows" → "Import from File"
  3. Select receipt-ocr-workflow.json
  4. The workflow will be imported with all nodes configured

2. Configure Environment Variables

Set these environment variables in your n8n instance:

OCR_API_KEY=your_ocr_space_api_key
BACKEND_API_URL=https://your-backend-api.com
BACKEND_API_KEY=your_backend_api_key
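
In n8n expressions these values are typically read as {{ $env.OCR_API_KEY }} and so on. A missing variable tends to surface later as a confusing HTTP error, so it can help to check all three up front. The helper below is a hypothetical sketch (missingEnvVars is not part of the workflow) that reports which of the variables listed above are absent or blank.

```javascript
// Names taken from the list above; the helper itself is hypothetical.
const REQUIRED_VARS = ['OCR_API_KEY', 'BACKEND_API_URL', 'BACKEND_API_KEY'];

// Returns the names of required variables that are missing or blank.
function missingEnvVars(env) {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return typeof value !== 'string' || value.trim() === '';
  });
}
```

In a Code node this could be called as missingEnvVars($env), assuming environment access is enabled on your instance; in a standalone Node.js script, pass process.env instead.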

Getting an OCR API Key:

  • Sign up at https://ocr.space/ocrapi
  • Get your free API key (25,000 requests/month free)
  • Or use alternative OCR services (Google Vision, AWS Textract, etc.)
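
For orientation, here is a minimal sketch of the request the "OCR API Call" node makes, written as plain JavaScript. The endpoint, the apikey header, the base64Image field, and the ParsedResults response shape follow OCR.space's public API documentation; the function names are hypothetical, and the node in the imported workflow may set additional options.

```javascript
// Endpoint and field names follow OCR.space's public API docs; verify them
// against the "OCR API Call" node in the imported workflow.
const OCR_SPACE_URL = 'https://api.ocr.space/parse/image';

// Build the form fields for a base64 upload. OCR.space expects the full data
// URL (e.g. "data:image/png;base64,...") in the base64Image field.
function buildOcrSpaceForm(dataUrl) {
  if (!/^data:[\w.+-]+\/[\w.+-]+;base64,/.test(dataUrl)) {
    throw new Error('Expected a data URL such as data:image/png;base64,...');
  }
  return { base64Image: dataUrl, language: 'eng' };
}

// Pull the recognized text out of an OCR.space JSON response, or throw with
// the service's own error message.
function extractOcrText(response) {
  if (response.IsErroredOnProcessing) {
    const msg = Array.isArray(response.ErrorMessage)
      ? response.ErrorMessage.join('; ')
      : String(response.ErrorMessage || 'unknown OCR error');
    throw new Error(`OCR.space error: ${msg}`);
  }
  const results = response.ParsedResults || [];
  return results.map((r) => r.ParsedText || '').join('\n').trim();
}

// Send the request as multipart form data; the key travels in the "apikey" header.
async function runOcr(dataUrl, apiKey) {
  const form = new FormData();
  for (const [key, value] of Object.entries(buildOcrSpaceForm(dataUrl))) {
    form.append(key, value);
  }
  const res = await fetch(OCR_SPACE_URL, {
    method: 'POST',
    headers: { apikey: apiKey },
    body: form,
  });
  if (!res.ok) throw new Error(`OCR.space returned HTTP ${res.status}`);
  return extractOcrText(await res.json());
}
```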

3. Configure Webhook URL

  1. Click on the "Webhook - Receipt Upload" node
  2. Note the webhook URL (e.g., https://your-n8n.com/webhook/receipt-upload)
  3. Update your frontend to POST to this URL

4. Update Backend API Endpoint

  1. Click on the "Save to Backend" node
  2. Update the URL to match your backend API endpoint
  3. Ensure your backend expects this data structure:
{
  "userId": "string",
  "amount": "number",
  "date": "string",
  "vendor": "string",
  "tax": "number",
  "confidence": "number",
  "status": "string",
  "extractedText": "string"
}
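
To catch mismatches before they reach the backend, the payload can be checked against this structure. The validator below is a hypothetical helper, not part of the workflow, and it assumes every field is required; if your backend accepts null for fields the OCR could not extract, relax the check accordingly.

```javascript
// Field list mirrors the structure above; the validator itself is a
// hypothetical helper and assumes every field is required.
const RECEIPT_FIELDS = {
  userId: 'string',
  amount: 'number',
  date: 'string',
  vendor: 'string',
  tax: 'number',
  confidence: 'number',
  status: 'string',
  extractedText: 'string',
};

// Returns a list of human-readable problems; an empty list means the payload fits.
function validateReceiptPayload(payload) {
  const errors = [];
  for (const [field, expected] of Object.entries(RECEIPT_FIELDS)) {
    const value = payload[field];
    if (typeof value !== expected) {
      const actual = value === null ? 'null' : typeof value;
      errors.push(`${field}: expected ${expected}, got ${actual}`);
    } else if (expected === 'number' && !Number.isFinite(value)) {
      errors.push(`${field}: must be a finite number`);
    }
  }
  return errors;
}
```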

Frontend Integration

Update your Receipts.tsx component to call the n8n webhook:

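const handleFile = async (file: File) => {
  setUploading(true);

  // FileReader produces a data URL ("data:image/...;base64,..."), not bare base64
  const reader = new FileReader();

  // Without this handler, a failed read would leave the UI stuck in "uploading"
  reader.onerror = () => {
    console.error('Could not read file:', reader.error);
    setUploading(false);
  };

  // onload fires only on a successful read, so reader.result is populated here
  reader.onload = async () => {
    const base64Image = reader.result as string;

    try {
      const response = await fetch('YOUR_N8N_WEBHOOK_URL', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          image: base64Image,
          userId: currentUser.id,
        }),
      });

      // fetch() rejects only on network failure, so check the HTTP status explicitly
      if (!response.ok) {
        throw new Error(`Webhook returned HTTP ${response.status}`);
      }

      const result = await response.json();

      if (result.success) {
        // Update UI with extracted data
        setFormData({
          amount: result.amount,
          // ... other fields
        });
      }
    } catch (error) {
      console.error('OCR processing failed:', error);
    } finally {
      setUploading(false);
    }
  };

  reader.readAsDataURL(file);
};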
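
Because readAsDataURL yields a data URL rather than bare base64, it is sometimes useful to split the two apart, for example to check the MIME type or estimate the upload size before sending. The helpers below are a hypothetical sketch; neither function exists in the Axion frontend.

```javascript
// Split a data URL into its MIME type and raw base64 payload.
function splitDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) throw new Error('Not a base64 data URL');
  return { mimeType: match[1], base64: match[2] };
}

// Decoded size in bytes, derived from the base64 length and its padding.
function base64ByteLength(base64) {
  const padding = base64.endsWith('==') ? 2 : base64.endsWith('=') ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}
```

Which form to send depends on the OCR service: some expect the full data URL, others only the part after the comma.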

Workflow Customization

Using a Different OCR Service

Replace the "OCR API Call" node with your preferred service:

Google Vision API:

// Use Google Vision API node or HTTP Request
POST https://vision.googleapis.com/v1/images:annotate
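
As a sketch of what an HTTP Request node would send to that endpoint and read back, the helpers below build the images:annotate body and extract the text. The request and response shapes follow Google's REST reference; the function names are hypothetical, and authentication (API key or OAuth) is left out.

```javascript
// Build the JSON body for POST https://vision.googleapis.com/v1/images:annotate.
// Vision expects raw base64 in image.content, without a "data:...;base64," prefix.
function buildVisionRequest(imageData) {
  const base64 = imageData.replace(/^data:[^;,]+;base64,/, '');
  return {
    requests: [
      { image: { content: base64 }, features: [{ type: 'TEXT_DETECTION' }] },
    ],
  };
}

// Read the full recognized text from the response (empty string if none).
function extractVisionText(response) {
  const first = (response.responses || [])[0] || {};
  if (first.error) throw new Error(`Vision API error: ${first.error.message}`);
  return first.fullTextAnnotation ? first.fullTextAnnotation.text : '';
}
```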

AWS Textract:

// Use AWS Textract node

Improving Amount Extraction

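Modify the regex in the "Parse Receipt Data" node:

// More robust amount regex. Inside a regex literal use single backslashes;
// double them only when the pattern is stored in a JSON string.
const amountRegex = /(?:total|amount|sum|balance|due|\$|€|£|USD|EUR)\s*:?\s*([\d,]+\.\d{2})/i;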
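
A receipt usually contains several labeled figures (subtotal, tax, total), so a single match may not be the grand total. One possible approach, sketched below, collects every match and keeps the largest. The extractAmount helper and the "largest figure is the total" heuristic are assumptions; the heuristic breaks on receipts that print a larger figure with a currency symbol, such as cash tendered.

```javascript
// Written as a regex literal, so each escape uses a single backslash.
// The g flag lets matchAll() find every labeled figure.
const amountRegexAll = /(?:total|amount|sum|balance|due|\$|€|£|USD|EUR)\s*:?\s*([\d,]+\.\d{2})/gi;

// Return the largest labeled amount, or null when nothing matches.
// Assumption: the grand total is the largest figure on the receipt.
function extractAmount(text) {
  const values = [...text.matchAll(amountRegexAll)].map((m) =>
    Number(m[1].replace(/,/g, '')) // drop thousands separators before parsing
  );
  return values.length ? Math.max(...values) : null;
}
```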

Adding Category Detection

Add a Code node (mode "Run Once for All Items") after parsing to detect the category:

const categoryKeywords = {
  'Office Supplies': ['office', 'supplies', 'staples', 'paper'],
  'Meals': ['restaurant', 'cafe', 'food', 'dining'],
  'Transportation': ['uber', 'lyft', 'taxi', 'gas', 'fuel'],
};

// Tag each incoming item with a category based on its vendor name
return $input.all().map((item) => {
  const vendor = (item.json.vendor || '').toLowerCase();
  let category = 'Other';
  for (const [cat, keywords] of Object.entries(categoryKeywords)) {
    if (keywords.some((kw) => vendor.includes(kw))) {
      category = cat;
      break;
    }
  }
  return { json: { ...item.json, category } };
});
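
Substring matching can misfire: "gas" also occurs inside "Las Vegas", so a restaurant there would be filed under Transportation. A possible refinement, sketched below as a standalone function, matches keywords only as whole words. The detectCategory and escapeRegExp names are hypothetical; the keyword lists are the ones shown above.

```javascript
const CATEGORY_KEYWORDS = {
  'Office Supplies': ['office', 'supplies', 'staples', 'paper'],
  'Meals': ['restaurant', 'cafe', 'food', 'dining'],
  'Transportation': ['uber', 'lyft', 'taxi', 'gas', 'fuel'],
};

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the first category whose keyword appears as a whole word in the
// vendor name, or 'Other' when nothing matches.
function detectCategory(vendor) {
  const name = (vendor || '').toLowerCase();
  for (const [category, keywords] of Object.entries(CATEGORY_KEYWORDS)) {
    const hit = keywords.some((kw) =>
      new RegExp(`\\b${escapeRegExp(kw)}\\b`).test(name)
    );
    if (hit) return category;
  }
  return 'Other';
}
```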

Testing

Test the Workflow

  1. Use n8n's "Execute Workflow" button
  2. Or send a test POST request:
curl -X POST https://your-n8n.com/webhook/receipt-upload \
  -H "Content-Type: application/json" \
  -d '{
    "image": "base64_encoded_image_here",
    "userId": "test-user-123"
  }'

Expected Response

{
  "success": true,
  "receiptId": "receipt-123",
  "amount": 45.99,
  "confidence": 0.85,
  "status": "needs_review"
}
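
On the client side it can be worth confirming that a response really has this shape before copying values into a form. The guard below is a hypothetical sketch based only on the sample above; the shape of a failure response is not documented here, so anything that does not match is simply treated as unusable.

```javascript
// Shape taken from the sample response above; the guard itself is a
// hypothetical helper, not part of the workflow or the frontend.
function isReceiptSuccess(result) {
  return (
    typeof result === 'object' &&
    result !== null &&
    result.success === true &&
    typeof result.receiptId === 'string' &&
    Number.isFinite(result.amount) &&
    Number.isFinite(result.confidence) &&
    typeof result.status === 'string'
  );
}
```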

Troubleshooting

OCR Not Extracting Amount

  • Check that the OCR API key is valid
  • Verify image quality (clear, readable text)
  • Adjust regex patterns in "Parse Receipt Data" node
  • Check OCR API response in node output

Backend Save Failing

  • Verify backend API URL is correct
  • Check API authentication headers
  • Ensure backend endpoint accepts the data structure
  • Check n8n execution logs for errors

Low Confidence Scores

  • Improve image quality before upload
  • Adjust regex patterns to match your receipt format
  • Add more extraction patterns for different receipt types
  • Consider using ML-based extraction for better accuracy
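
One way to add extraction patterns without tangling the parsing code is to keep them in an ordered list and take the first that matches. The sketch below does this for dates and normalizes the result to YYYY-MM-DD. The pattern set, the extractDate name, and the reading of slash dates as month/day/year are all assumptions; adjust them to the receipts you actually see.

```javascript
const MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
                'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];
const pad2 = (s) => String(s).padStart(2, '0');

// Tried in order; each entry converts its match to YYYY-MM-DD.
const DATE_PATTERNS = [
  {
    name: 'iso',
    regex: /\b(\d{4})-(\d{2})-(\d{2})\b/,
    toIso: (m) => `${m[1]}-${m[2]}-${m[3]}`,
  },
  {
    // Assumes month/day/year; swap the groups for day/month/year locales.
    name: 'us-slash',
    regex: /\b(\d{1,2})\/(\d{1,2})\/(\d{4})\b/,
    toIso: (m) => `${m[3]}-${pad2(m[1])}-${pad2(m[2])}`,
  },
  {
    name: 'day-month-name',
    regex: /\b(\d{1,2}) (jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]* (\d{4})\b/i,
    toIso: (m) =>
      `${m[3]}-${pad2(MONTHS.indexOf(m[2].toLowerCase()) + 1)}-${pad2(m[1])}`,
  },
];

// Return the first matching date plus the pattern that produced it, or null.
function extractDate(text) {
  for (const pattern of DATE_PATTERNS) {
    const match = pattern.regex.exec(text);
    if (match) return { date: pattern.toIso(match), pattern: pattern.name };
  }
  return null;
}
```

Recording which pattern matched makes it easier to see from execution logs which receipt formats are still falling through.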

Alternative OCR Services

If OCR.space doesn't meet your needs:

  1. Google Cloud Vision API - High accuracy, pay-per-use
  2. AWS Textract - Good for structured documents
  3. Azure Computer Vision - Microsoft's OCR service
  4. Tesseract.js - Open source, runs locally
  5. ABBYY FineReader - Enterprise-grade OCR

Update the "OCR API Call" node accordingly.
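
Each service returns its text in a different shape, so swapping providers also means adapting whatever feeds the "Parse Receipt Data" node. A small adapter, sketched below, can reduce every response to one plain string so the regex step stays unchanged. The provider labels and the normalizeOcrText name are hypothetical; the response fields follow each service's public documentation (OCR.space ParsedResults, Vision fullTextAnnotation, Textract LINE blocks, Tesseract.js data.text).

```javascript
// Reduce a provider-specific OCR response to a single plain-text string.
// The provider labels are arbitrary names chosen for this sketch.
function normalizeOcrText(provider, response) {
  switch (provider) {
    case 'ocrspace':
      return (response.ParsedResults || [])
        .map((r) => r.ParsedText || '')
        .join('\n')
        .trim();
    case 'google-vision': {
      const first = (response.responses || [])[0] || {};
      return first.fullTextAnnotation ? first.fullTextAnnotation.text.trim() : '';
    }
    case 'textract':
      // Textract returns PAGE, LINE and WORD blocks; LINE blocks carry one line each.
      return (response.Blocks || [])
        .filter((block) => block.BlockType === 'LINE')
        .map((block) => block.Text)
        .join('\n');
    case 'tesseract':
      return (response.data && response.data.text ? response.data.text : '').trim();
    default:
      throw new Error(`Unsupported OCR provider: ${provider}`);
  }
}
```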