Ingest a document end-to-end

Receipts, manuals, warranties, insurance docs — Dib stores them, runs OCR, and pulls out the useful fields. Here's the full path from a file to a searchable, attached document.

1. Attach the document

Give Dib a file_url and a title. Attach it to something with subject_type (home, inventory, or vehicle) plus the matching subject_id, or leave the subject off for a general team document. Storage is handled for you — you never deal with bucket keys.

curl -X POST https://dib.io/api/v1/documents \
  -H "Authorization: Bearer $DIB_API_KEY" \
  -H "Idempotency-Key: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Dishwasher warranty",
    "file_url": "https://example.com/warranty.pdf",
    "type": "warranty",
    "subject_type": "inventory",
    "subject_id": "6f1c2e2a-1b3c-4d5e-8f90-1a2b3c4d5e6f"
  }'
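
If you build the request body in code, the subject rule above (both subject_type and subject_id, or neither) is easy to check before you send anything. This is a minimal sketch: the helper name build_document_payload is made up for illustration, and whether Dib itself rejects a half-specified subject is an assumption.

```python
ALLOWED_SUBJECT_TYPES = {"home", "inventory", "vehicle"}


def build_document_payload(title, file_url, doc_type=None,
                           subject_type=None, subject_id=None):
    """Build the JSON body for POST /v1/documents (hypothetical helper)."""
    # A subject needs both halves; leaving both off means a general team document.
    if (subject_type is None) != (subject_id is None):
        raise ValueError("subject_type and subject_id must be given together")
    if subject_type is not None and subject_type not in ALLOWED_SUBJECT_TYPES:
        raise ValueError(f"unknown subject_type: {subject_type!r}")

    payload = {"title": title, "file_url": file_url}
    if doc_type is not None:
        payload["type"] = doc_type
    if subject_type is not None:
        payload["subject_type"] = subject_type
        payload["subject_id"] = subject_id
    return payload
```

Calling it with only a title and a file_url produces the body for a general team document.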

Dib fetches the file, stores it, and kicks off OCR. The new document shows up in the change feed as a document created event, and you can read extracted fields back from GET /v1/documents/{id} once processing finishes.
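
Since OCR runs after the create call returns, a client usually polls GET /v1/documents/{id} until the fields appear. The sketch below shows one way to do that. It assumes the extracted fields come back under data.fields (mirroring the extract response shown later on this page) and that an empty or missing map means processing is still running; the real response may signal completion differently. The function names are hypothetical.

```python
import json
import time
import urllib.request

API_BASE = "https://dib.io/api/v1"


def fetch_document(document_id, api_key):
    """GET /v1/documents/{id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/documents/{document_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_fields(document_id, api_key, fetch=fetch_document,
                    attempts=10, delay=3.0, sleep=time.sleep):
    """Poll until extracted fields show up; return None if they never do."""
    for attempt in range(attempts):
        body = fetch(document_id, api_key)
        # Assumption: fields live under data.fields once OCR has finished.
        fields = body.get("data", {}).get("fields")
        if fields:
            return fields
        if attempt < attempts - 1:
            sleep(delay)
    return None
```

The fetch and sleep parameters exist so the loop can be exercised without a network call. If you already consume the change feed, reacting to events there avoids polling altogether.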

2. List what's attached

Filter documents by their subject to show everything tied to one item or vehicle:

curl -H "Authorization: Bearer $DIB_API_KEY" \
     "https://dib.io/api/v1/documents?subject_type=inventory&subject_id=6f1c2e2a-1b3c-4d5e-8f90-1a2b3c4d5e6f"

Just need the data, not storage?

If you already have your own document store and only want the extraction, use POST /v1/documents/extract. It returns OCR text and a best-effort field map without persisting anything (needs documents:read).

curl -X POST https://dib.io/api/v1/documents/extract \
  -H "Authorization: Bearer $DIB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "file_url": "https://example.com/warranty.pdf", "document_type": "warranty" }'

{
  "data": {
    "document_type": "warranty",
    "summary": "2-year manufacturer warranty for a Bosch dishwasher...",
    "fields": { "brand": "Bosch", "warranty_expiry": "2028-05-24", "total": "$899.00" },
    "text": "BOSCH LIMITED WARRANTY ...",
    "stored": false
  },
  "meta": { "request_id": "req_abc123" }
}

Treat fields as suggestions rather than gospel: it's a model extraction, so confirm anything you act on.
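
One cheap first check is to parse the values you plan to use and flag anything malformed for a human to look at. The sketch below does that for the two fields in the sample response. The helper name check_fields is hypothetical, and the field names and formats (an ISO date, a dollar amount) are taken from the example above rather than from any guaranteed schema. Parsing only catches malformed values; a well-formed date can still be the wrong date.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def check_fields(fields):
    """Split extracted fields into parsed values and names needing review."""
    parsed = {}
    needs_review = []

    expiry = fields.get("warranty_expiry")
    if expiry is not None:
        try:
            parsed["warranty_expiry"] = date.fromisoformat(expiry)
        except (TypeError, ValueError):
            needs_review.append("warranty_expiry")

    total = fields.get("total")
    if total is not None:
        # Strip the currency symbol and thousands separators before parsing.
        cleaned = str(total).replace("$", "").replace(",", "").strip()
        try:
            parsed["total"] = Decimal(cleaned)
        except InvalidOperation:
            needs_review.append("total")

    return parsed, needs_review
```

For the sample response, both values parse and nothing is flagged; a value like "next May" for warranty_expiry would land in the review list instead.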

Good to know

Creating a document needs documents:write; the one-off extract needs documents:read and counts against your AI quota.
Send an Idempotency-Key on the create so a retry doesn't file the same document twice.
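
The key only protects you if the retry sends the same value. Re-running a command that calls $(uuidgen) inline generates a fresh key each time, so in code you would create the key once and reuse it across attempts. A minimal sketch, assuming a simple backoff and that 5xx responses and network errors are the ones worth retrying; the function names are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request
import uuid

API_BASE = "https://dib.io/api/v1"


def post_document(payload, api_key, idempotency_key):
    """POST /v1/documents with a caller-supplied Idempotency-Key."""
    req = urllib.request.Request(
        f"{API_BASE}/documents",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Idempotency-Key": idempotency_key,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def create_with_retry(payload, api_key, send=post_document,
                      attempts=3, sleep=time.sleep):
    """Create a document, reusing one idempotency key for every attempt."""
    key = str(uuid.uuid4())  # generated once, outside the retry loop
    for attempt in range(attempts):
        last = attempt == attempts - 1
        try:
            return send(payload, api_key, key)
        except urllib.error.HTTPError as err:
            # A 4xx means the request itself is wrong; retrying will not help.
            if err.code < 500 or last:
                raise
        except OSError:
            # Covers timeouts and connection failures (URLError is an OSError).
            if last:
                raise
        sleep(2 ** attempt)
```

The send and sleep parameters are there so the retry logic can be tested without touching the network.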
