Google Cloud Vision OCR Pricing 2026
Free tier limits, per-image rates, Document AI add-ons — and a flat-rate alternative if you don't want to manage GCP.
TL;DR
- Cloud Vision is a developer API: 1,000 free images per feature per month, then ~$1.50 per 1,000 images for TEXT_DETECTION or DOCUMENT_TEXT_DETECTION.
- Document AI is a separate product. Plain OCR is the same ~$1.50 / 1,000 pages, but Form Parser jumps to ~$30 / 1,000 pages for tables and forms.
- If you want a fixed monthly bill and a UI instead of an SDK, LensCopy starts at $0 (10 pages lifetime free) and scales to $14.99/mo for 2,500 pages with markdown output and table extraction included.
Bottom line: which should you use?
If you want OCR output you can paste straight into a document without managing a GCP project, LensCopy is the easier and more predictable choice. It returns markdown with reconstructed tables on every plan, including the free tier, and costs a flat $14.99/month for 2,500 pages. Cloud Vision instead meters each feature at ~$1.50 per 1,000 units, and once you need tables and forms you add Document AI Form Parser at ~$30 per 1,000 pages. LensCopy needs no SDK, no Cloud Functions, and no service account, and its 10-page lifetime free tier lets you test first.
Google Cloud Vision is still the better fit when you are already GCP-native, need fine-grained ML metadata like per-word confidence and bounding polygons, require region-pinned processing for compliance, or run millions of plain-text pages through a pipeline your engineers maintain.
LensCopy vs Google Cloud Vision OCR: feature comparison
Pricing snapshot (as of 2026-05-09)
LensCopy
- Free — $0 (10 pages lifetime)
- Basic — $14.99/mo (2,500 pages/month)
- Pro — $49.99/mo (10,000 pages/month)
- Business — Custom (Unlimited)
Google Cloud Vision OCR
- TEXT_DETECTION — Free first 1,000/mo (~$1.50 / 1,000 after)
- DOCUMENT_TEXT_DETECTION — Free first 1,000/mo (~$1.50 / 1,000 after)
- Document AI – OCR — ~$1.50 / 1,000 pages (Separate product)
- Document AI – Form Parser — ~$30 / 1,000 pages (For tables/forms)
FAQ: Google Cloud Vision OCR Pricing 2026
After the free tier of 1,000 units per feature per month, both TEXT_DETECTION and DOCUMENT_TEXT_DETECTION are billed at ~$1.50 per 1,000 units. Document AI Form Parser, which adds table and form structure, is billed separately at ~$30 per 1,000 pages.
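As a rough illustration of that billing rule, here is a minimal Python sketch that estimates the monthly charge for a single Cloud Vision feature. It assumes the approximate figures quoted on this page (1,000 free units, ~$1.50 per 1,000 after that), assumes usage above the allowance is prorated linearly, and ignores any volume tiers not covered here. The function name and defaults are illustrative, not part of any Google SDK.

```python
def vision_monthly_cost(units, free_units=1_000, rate_per_1000=1.50):
    """Estimate the monthly USD charge for ONE Cloud Vision feature.

    Assumptions (taken from this page, not from a billing API):
    - the first `free_units` units each month are free
    - every unit beyond that costs rate_per_1000 / 1000 dollars
    """
    billable = max(0, units - free_units)
    return round(billable * rate_per_1000 / 1_000, 2)


if __name__ == "__main__":
    for units in (800, 2_500, 50_000):
        print(f"{units:>6} units -> ${vision_monthly_cost(units):.2f}")
```

Under these assumptions, 2,500 units in a month leaves 1,500 billable units, which comes to about $2.25.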
There is a free tier of 1,000 units per feature per month: TEXT_DETECTION gets its own 1,000, DOCUMENT_TEXT_DETECTION gets its own 1,000, and Document AI processors are metered separately. The allowance resets monthly, and unused units do not roll over.
They are priced identically — ~$1.50 per 1,000 units after the free tier — but they are metered as separate features, so each gets its own 1,000-unit free allowance per month. TEXT_DETECTION is tuned for short text in natural images; DOCUMENT_TEXT_DETECTION is tuned for dense pages.
One unit is one image (or one page of a PDF) sent to one feature. If you request both TEXT_DETECTION and LABEL_DETECTION on the same image, that counts as two units, one against each feature's meter.
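The unit-counting rule can be sketched in a few lines of Python. The request dictionaries below follow the general shape of a Cloud Vision `images:annotate` request body (an `image` plus a list of `features`, each with a `type`). The bucket name, helper names, and the 1,000-unit constant are assumptions for illustration, and the sketch counts single images only, not multi-page PDFs.

```python
from collections import Counter

FREE_UNITS_PER_FEATURE = 1_000  # monthly allowance as quoted on this page


def count_units(annotate_requests):
    """Count one unit per (image, feature) pair, keyed by feature type."""
    units = Counter()
    for request in annotate_requests:
        for feature in request["features"]:
            units[feature["type"]] += 1
    return units


def billable_units(units):
    """Apply the free allowance to each feature meter independently."""
    return {
        feature: max(0, used - FREE_UNITS_PER_FEATURE)
        for feature, used in units.items()
    }


# Hypothetical workload: 1,200 images, each sent to two features.
requests = [
    {
        "image": {"source": {"imageUri": f"gs://example-bucket/page-{i}.png"}},
        "features": [{"type": "TEXT_DETECTION"}, {"type": "LABEL_DETECTION"}],
    }
    for i in range(1_200)
]

units = count_units(requests)
print(dict(units))
print(billable_units(units))
```

Here 1,200 images produce 2,400 units in total, but because each feature has its own meter, only 200 units per feature fall outside the free allowance.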
For non-developer workflows that want markdown output and a flat monthly bill, yes. LensCopy starts at $0 with a 10-page lifetime free tier, scales to $14.99/mo for 2,500 pages, and runs in a browser without a GCP project. Cloud Vision remains the right choice for production GCP-native pipelines that already live in Google's stack.
Cloud Vision is general-purpose OCR billed per image. Document AI is a higher-tier product family with specialized processors (Form Parser for tables, Invoice Parser, ID Parser, etc.) priced per page at significantly higher rates. If you need tables and forms, Document AI is required and pricing jumps from ~$1.50 to ~$30 per 1,000 pages.
LensCopy returns markdown tables on every plan, including the free tier. The closest Google equivalent is Document AI Form Parser at ~$30 per 1,000 pages — so 2,500 pages/month costs ~$75 with Form Parser vs. $14.99 flat on LensCopy Basic.
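The arithmetic behind that comparison can be reproduced with a short Python sketch. It uses only the approximate prices listed on this page and assumes Form Parser is billed linearly per page with no free allowance. The plan table and function names are illustrative, and the LensCopy Free plan is left out because its 10 pages are a lifetime allowance rather than a monthly one.

```python
FORM_PARSER_RATE_PER_1000 = 30.00  # approximate USD rate quoted on this page

# (plan name, pages per month, USD per month), as listed in the pricing snapshot
LENSCOPY_PLANS = [("Basic", 2_500, 14.99), ("Pro", 10_000, 49.99)]


def form_parser_cost(pages):
    """Estimated Document AI Form Parser charge for a month of pages."""
    return round(pages * FORM_PARSER_RATE_PER_1000 / 1_000, 2)


def lenscopy_plan(pages_per_month):
    """Return (plan name, monthly price) for the smallest plan that fits."""
    for name, page_limit, price in LENSCOPY_PLANS:
        if pages_per_month <= page_limit:
            return name, price
    return "Business", None  # custom pricing, so no number to compare


def break_even_pages(flat_price):
    """Form Parser page count at which the metered bill equals flat_price."""
    return round(flat_price / FORM_PARSER_RATE_PER_1000 * 1_000, 2)


if __name__ == "__main__":
    pages = 2_500
    print(form_parser_cost(pages))   # metered estimate
    print(lenscopy_plan(pages))      # flat-rate plan that covers the volume
    print(break_even_pages(14.99))   # pages where the two bills match
```

With these numbers, the metered Form Parser bill matches the Basic plan price at roughly 500 pages a month. Below that volume the metered option is cheaper on price alone.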
Both are strong on printed text. LensCopy tends to reconstruct mixed-content layouts (tables + equations + handwriting) more faithfully because the output is designed as markdown rather than raw blocks. Cloud Vision returns richer ML metadata (per-word confidence, bounding polygons, language detection) that is useful if you are building your own post-processing pipeline.
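To show what building on that metadata might look like, the following Python sketch walks a response shaped like the `fullTextAnnotation` JSON that Cloud Vision returns for document text (pages, blocks, paragraphs, words, symbols) and collects each word with its confidence and bounding polygon. The sample data is hand-written for illustration and is not real API output. The helper names and the 0.8 threshold are assumptions.

```python
def words_with_confidence(response):
    """Flatten a fullTextAnnotation-style dict into a list of word records."""
    words = []
    annotation = response.get("fullTextAnnotation", {})
    for page in annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                for word in paragraph.get("words", []):
                    # A word's text is spread across its symbols.
                    text = "".join(s["text"] for s in word.get("symbols", []))
                    vertices = word.get("boundingBox", {}).get("vertices", [])
                    # Missing coordinates are treated as 0 in this sketch.
                    polygon = [(v.get("x", 0), v.get("y", 0)) for v in vertices]
                    words.append({
                        "text": text,
                        "confidence": word.get("confidence"),
                        "polygon": polygon,
                    })
    return words


def low_confidence(words, threshold=0.8):
    """Return the text of words whose confidence is below the threshold."""
    return [
        w["text"] for w in words
        if w["confidence"] is not None and w["confidence"] < threshold
    ]


# Hand-written sample in the assumed response shape (not real output).
SAMPLE = {"fullTextAnnotation": {"pages": [{"blocks": [{"paragraphs": [{"words": [
    {"confidence": 0.99,
     "boundingBox": {"vertices": [{"x": 10, "y": 12}, {"x": 58, "y": 12},
                                  {"x": 58, "y": 30}, {"x": 10, "y": 30}]},
     "symbols": [{"text": c} for c in "Total"]},
    {"confidence": 0.62,
     "boundingBox": {"vertices": [{"x": 64, "y": 12}, {"x": 95, "y": 12},
                                  {"x": 95, "y": 30}, {"x": 64, "y": 30}]},
     "symbols": [{"text": c} for c in "$42"]},
]}]}]}]}}

words = words_with_confidence(SAMPLE)
print([w["text"] for w in words])
print(low_confidence(words))
```

A post-processing pipeline could use a filter like `low_confidence` to flag words for manual review, which is the kind of control that plain markdown output does not expose.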
Comparison data verified 2026-05-09.