OpenClaw Skillv1.0.6

DeepRead OCR

DeepRead.techby DeepRead.tech
Deploy on EasyClawdfrom $14.9/mo

AI-native OCR platform that turns documents into high-accuracy data in minutes. Using multi-model consensus, DeepRead achieves 97%+ accuracy and flags only uncertain fields for Human-in-the-Loop (HIL) review—reducing manual work from 100% to 5-10%. Zero prompt engineering required.

How to use this skill

OpenClaw skills run inside an OpenClaw container. EasyClawd deploys and manages yours — no server setup needed.

  1. Sign up on EasyClawd (2 minutes)
  2. Connect your Telegram bot
  3. Install DeepRead OCR from the skills panel
Get started — from $14.9/mo
7stars
4,324downloads
10installs
0comments
8versions

Latest Changelog

Fix display name to DeepRead OCR

Tags

latest: 1.0.6

Skill Documentation

---
name: deepread
title: DeepRead OCR
description: AI-native OCR platform that turns documents into high-accuracy data in minutes. Using multi-model consensus, DeepRead achieves 97%+ accuracy and flags only uncertain fields for Human-in-the-Loop (HIL) review—reducing manual work from 100% to 5-10%. Zero prompt engineering required.
disable-model-invocation: true
metadata:
  {"openclaw":{"requires":{"env":["DEEPREAD_API_KEY"]},"primaryEnv":"DEEPREAD_API_KEY","homepage":"https://www.deepread.tech"}}
---

# DeepRead - Production OCR API

DeepRead is an AI-native OCR platform that turns documents into high-accuracy data in minutes. Using multi-model consensus, DeepRead achieves 97%+ accuracy and flags only uncertain fields for Human-in-the-Loop (HIL) review—reducing manual work from 100% to 5-10%. Zero prompt engineering required.

## What This Skill Does

DeepRead is a production-grade document processing API that gives you high-accuracy structured data output in minutes with human review flagging so manual review is limited to the flagged exceptions

**Core Features:**
- **Text Extraction**: Convert PDFs and images to clean markdown
- **Structured Data**: Extract JSON fields with confidence scores
- **HIL Interface**: Built-in Human-in-the-Loop review — uncertain fields are flagged (`hil_flag`) so only exceptions need manual review
- **Multi-Pass Processing**: Multiple validation passes for maximum accuracy
- **Multi-Model Consensus**: Cross-validation between models for reliability
- **Free Tier**: 2,000 pages/month (no credit card required)

## Setup

### 1. Get Your API Key

Sign up and create an API key:
```bash
# Visit the dashboard
https://www.deepread.tech/dashboard

# Or use this direct link
https://www.deepread.tech/dashboard/?utm_source=clawdhub
```

Save your API key:
```bash
export DEEPREAD_API_KEY="sk_live_your_key_here"
```

### 2. Clawdbot Configuration (Optional)

Add to your `clawdbot.config.json5`:
```json5
{
  skills: {
    entries: {
      "deepread": {
        enabled: true
        // API key is read from DEEPREAD_API_KEY environment variable
        // Do NOT hardcode your API key here
      }
    }
  }
}
```

### 3. Process Your First Document

**Option A: With Webhook (Recommended)**
```bash
# Upload PDF with webhook notification
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "webhook_url=https://your-app.com/webhooks/deepread"

# Returns immediately
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}

# Your webhook receives results when processing completes (2-5 minutes)
```

**Option B: Poll for Results**
```bash
# Upload PDF without webhook
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]"

# Returns immediately
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued"
}

# Poll until completed
curl https://api.deepread.tech/v1/jobs/550e8400-e29b-41d4-a716-446655440000 \
  -H "X-API-Key: $DEEPREAD_API_KEY"
```

## Usage Examples

### Basic OCR (Text Only)

Extract text as clean markdown:

```bash
# With webhook (recommended)
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "webhook_url=https://your-app.com/webhook"

# OR poll for completion
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]"

# Then poll
curl https://api.deepread.tech/v1/jobs/JOB_ID \
  -H "X-API-Key: $DEEPREAD_API_KEY"
```

**Response when completed:**
```json
{
  "id": "550e8400-...",
  "status": "completed",
  "result": {
    "text": "# INVOICE\n\n**Vendor:** Acme Corp\n**Total:** $1,250.00..."
  }
}
```

### Structured Data Extraction

Extract specific fields with confidence scoring:

```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F 'schema={
    "type": "object",
    "properties": {
      "vendor": {
        "type": "string",
        "description": "Vendor company name"
      },
      "total": {
        "type": "number",
        "description": "Total invoice amount"
      },
      "invoice_date": {
        "type": "string",
        "description": "Invoice date in MM/DD/YYYY format"
      }
    }
  }'
```

**Response includes confidence flags:**
```json
{
  "status": "completed",
  "result": {
    "text": "# INVOICE\n\n**Vendor:** Acme Corp...",
    "data": {
      "vendor": {
        "value": "Acme Corp",
        "hil_flag": false,
        "found_on_page": 1
      },
      "total": {
        "value": 1250.00,
        "hil_flag": false,
        "found_on_page": 1
      },
      "invoice_date": {
        "value": "2024-10-??",
        "hil_flag": true,
        "reason": "Date partially obscured",
        "found_on_page": 1
      }
    },
    "metadata": {
      "fields_requiring_review": 1,
      "total_fields": 3,
      "review_percentage": 33.3
    }
  }
}
```

### Complex Schemas (Nested Data)

Extract arrays and nested objects:

```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F 'schema={
    "type": "object",
    "properties": {
      "vendor": {"type": "string"},
      "total": {"type": "number"},
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": {"type": "string"},
            "quantity": {"type": "number"},
            "price": {"type": "number"}
          }
        }
      }
    }
  }'
```

### Page-by-Page Breakdown

Get per-page OCR results with quality flags:

```bash
curl -X POST https://api.deepread.tech/v1/process \
  -H "X-API-Key: $DEEPREAD_API_KEY" \
  -F "[email protected]" \
  -F "include_pages=true"
```

**Response:**
```json
{
  "result": {
    "text": "Combined text from all pages...",
    "pages": [
      {
        "page_number": 1,
        "text": "# Contract Agreement\n\n...",
        "hil_flag": false
      },
      {
        "page_number": 2,
        "text": "Terms and C??diti??s...",
        "hil_flag": true,
        "reason": "Multiple unrecognized characters"
      }
    ],
    "metadata": {
      "pages_requiring_review": 1,
      "total_pages": 2
      }
  }
}
```

## When to Use This Skill

### ✅ Use DeepRead For:

- **Invoice Processing**: Extract vendor, totals, line items
- **Receipt OCR**: Parse merchant, items, totals
- **Contract Analysis**: Extract parties, dates, terms
- **Form Digitization**: Convert paper forms to structured data
- **Document Workflows**: Any process requiring OCR + data extraction
- **Quality-Critical Apps**: When you need to know which extractions are uncertain

### ❌ Don't Use For:

- **Real-time Processing**: Processing takes 2-5 minutes (async workflow)
- **Batch >2,000 pages/month**: Upgrade to PRO or SCALE tier

## How It Works

### Multi-Pass Pipeline

```
PDF → Convert → Rotate Correction → OCR → Multi-Model Validation → Extract → Done
```

The pipeline automatically handles:
- Document rotation and orientation correction
- Multi-pass validation for accuracy
- Cross-model consensus for reliability
- Field-level confidence scoring

### Human-in-the-Loop (HIL) Interface

DeepRead includes a built-in Human-in-the-Loop (HIL) review system. The AI compares extracted text to the original image and sets `hil_flag` on each field:

- **`hil_flag: false`** = Clear, confident extraction → Auto-process
- **`hil_flag: true`** = Uncertain extraction → Routed to human review

**How HIL works:**
1. Fields extracted with high confidence are auto-approved
2. Uncertain fields are flagged with `hil_flag: true` and a `reason`
3. Only flagged fields need human review (typically 5-10% of total fields)
4. Review flagged fields in **DeepRead Preview** (`preview.deepread.tech`) — a dedicated HIL review interface where reviewers can see the original d
Read full documentation on ClawHub
Security scan, version history, and community comments: view on ClawHub