Component: Smrt.ExtractText

Canonical source: SmrtApps/src/Smrt.ExtractText/README.md (mirrored below)


Smrt.ExtractText

Planning and orchestration for basic OCR (unstructured text extraction).

Overview and responsibilities

  • Plans OCR execution (local-first by default) and produces an ordered candidate list.
  • Defines vendor-agnostic contracts and delegates execution to a host-provided IOcrTextExecutor.

Public surface / entry points

  • Planning/orchestration APIs and execution contracts, including OcrTextOptions, OcrTextExecutionContract, and IOcrTextExecutor (see source for the full set of public types).

Dependencies and integrations

  • Local OCR candidates are intended to be executed by host executors in Smrt.ExtractText.Host.
  • Optional cloud candidates are provided via Smrt.CloudProviders (CapabilityId.OcrText).

Configuration and operational data

  • No persistent config/state is owned by this library.

Observability and diagnostics

  • Logs metadata only (planned candidates + used provider).
  • Never log image bytes or extracted text.
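
As an illustration of the metadata-only rule, the following is a minimal Python sketch, not the library's C# logging code. The function name ocr_log_record, the logger name, and the field names are hypothetical. The point it shows is that the logging helper accepts only the planned candidate names and the provider that was used, so image bytes and extracted text have no way into the log record.

```python
import logging
from typing import Dict, List, Optional, Sequence, Union

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("extract_text_sketch")


def ocr_log_record(
    planned_candidates: Sequence[str],
    used_provider: Optional[str],
) -> Dict[str, Union[List[str], Optional[str]]]:
    """Build and emit a log record containing metadata only.

    The signature deliberately has no parameter for image bytes or
    extracted text, so neither can end up in the record.
    """
    record = {
        "planned_candidates": list(planned_candidates),
        "used_provider": used_provider,
    }
    logger.info("OCR outcome: %s", record)
    return record
```

Keeping the payload out of the helper's signature makes the rule structural rather than a convention that each call site has to remember.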

Testing and validation

  • Build (Debug, win-x64):
    • dotnet build SmrtApps/src/Smrt.ExtractText/Smrt.ExtractText.csproj -c Debug -r win-x64
    • (End-to-end wiring) dotnet build SmrtApps/src/Smrt.ExtractText.Host/Smrt.ExtractText.Host.csproj -c Debug -r win-x64
  • Unit tests:
    • dotnet test SmrtApps/src/Smrt.ExtractText.Tests/Smrt.ExtractText.Tests.csproj -c Debug -r win-x64 --no-build
  • Integration tests (credential/network-gated):
    • Gate: set SMRTHUB_INTEGRATION_TESTS=1
    • Requires a non-production OCR.Space profile whose api-key secret is held in Credential Manager (never stored in config, state, or logs)

Support Bundle

  • Not applicable directly (library); collect host application logs via Support Bundle.

Design

  • Plans local-first execution when available:
    • Windows AI OCR (Microsoft.Windows.AI.Imaging.TextRecognizer) when ready.
    • Graceful fallback to legacy Windows OCR (Windows.Media.Ocr).
    • Optional cloud fallback via Smrt.CloudProviders (CapabilityId.OcrText).
  • If the caller prefers cloud (e.g., user explicitly selects a cloud OCR provider), the plan places the cloud candidate first and includes local OCR candidates as the automatic fallback chain.
  • Produces an execution contract (OcrTextExecutionContract) with an ordered candidate list and delegates execution + fallback to a host-provided executor (IOcrTextExecutor).
  • Logs metadata only (planned candidates + used provider). Never log image bytes or extracted text.
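
The ordering rules above can be modeled in a few lines. This is a language-neutral Python sketch under stated assumptions, not the library's C# API: the candidate identifiers, the Plan type, and the plan_ocr and execute functions are all hypothetical stand-ins for the planner, OcrTextExecutionContract, and a host executor. It assumes a candidate "fails" by raising an exception or returning None.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical candidate identifiers; the real library's names may differ.
WINDOWS_AI_OCR = "windows-ai-ocr"
LEGACY_WINDOWS_OCR = "legacy-windows-ocr"
CLOUD_OCR = "cloud-ocr"


@dataclass(frozen=True)
class Plan:
    """Stand-in for an execution contract: an ordered candidate list."""
    candidates: Tuple[str, ...]


def plan_ocr(
    windows_ai_ready: bool,
    legacy_available: bool,
    cloud_configured: bool,
    prefer_cloud: bool = False,
) -> Plan:
    """Order candidates local-first unless the caller prefers cloud."""
    local = []
    if windows_ai_ready:
        local.append(WINDOWS_AI_OCR)
    if legacy_available:
        local.append(LEGACY_WINDOWS_OCR)
    cloud = [CLOUD_OCR] if cloud_configured else []
    # Cloud preference moves the cloud candidate to the front; the local
    # candidates stay in the plan as the fallback chain.
    ordered = cloud + local if prefer_cloud else local + cloud
    return Plan(tuple(ordered))


def execute(
    plan: Plan,
    run: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Try each candidate in order; return (provider_used, text).

    `run` stands in for the host executor's per-candidate call.
    """
    for candidate in plan.candidates:
        try:
            text = run(candidate)
        except Exception:
            continue  # this candidate failed; fall through to the next one
        if text is not None:
            return candidate, text
    return None, None
```

With every engine available, plan_ocr(True, True, True) orders the candidates Windows AI OCR, legacy Windows OCR, cloud; passing prefer_cloud=True moves the cloud candidate to the front and leaves the two local candidates behind it as fallbacks.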

Layout-aware text (toggle)

OcrTextOptions includes PreferLayoutAwareText (default true). When enabled, host executors may use line/word metadata (when provided by OCR engines) to produce more readable plain text (line breaks, light indentation, and paragraph-like gaps). When disabled, executors return the raw Text property when available (or a simple line-join fallback) without additional formatting heuristics.
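
To make the toggle concrete, here is a rough Python sketch of one possible formatting heuristic. It is not the host executors' actual algorithm: the OcrLine shape, the to_plain_text function, and the indent_unit and gap_factor thresholds are assumptions made for illustration. It uses line bounding boxes to insert a blank line at large vertical gaps and two spaces of indentation per horizontal step, and falls back to the raw text (or a simple line join) when layout-aware output is disabled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OcrLine:
    """Hypothetical per-line metadata an OCR engine might provide."""
    text: str
    left: float    # x coordinate of the line's bounding box
    top: float     # y coordinate of the line's bounding box
    height: float  # height of the line's bounding box


def to_plain_text(
    lines: Sequence[OcrLine],
    raw_text: Optional[str] = None,
    prefer_layout_aware: bool = True,
    indent_unit: float = 40.0,  # assumed horizontal step per indent level
    gap_factor: float = 1.5,    # assumed gap (in line heights) for a paragraph break
) -> str:
    if not prefer_layout_aware or not lines:
        # Toggle off (or no metadata): raw text if present, else a plain join.
        if raw_text is not None:
            return raw_text
        return "\n".join(line.text for line in lines)

    ordered = sorted(lines, key=lambda line: line.top)
    margin = min(line.left for line in ordered)
    out = []
    prev = None
    for line in ordered:
        if prev is not None:
            gap = line.top - (prev.top + prev.height)
            if gap > gap_factor * prev.height:
                out.append("")  # paragraph-like gap
        indent_levels = int(round((line.left - margin) / indent_unit))
        out.append("  " * indent_levels + line.text)
        prev = line
    return "\n".join(out)
```

The thresholds are deliberately exposed as parameters because suitable values depend on image resolution and font size.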

Clipboard contract (SmrtHub UX)

SmrtHub treats ExtractText as clipboard-in / clipboard-out:

  • Input: the current Windows clipboard content (typically an image fragment, HTML <img>, or a captured screenshot).
  • Output: extracted text is written back to the Windows clipboard so the user can paste it or feed it into other SmrtHub actions.
  • Shared state: ClipboardMonitor/Python shared state is an ingestion and display surface; ExtractText should not “rewrite” shared state. If ExtractText produces output, it does so via the clipboard.

Important nuance:

  • Local structured extraction in Smrt.ExtractStructuredText does not route through Smrt.ExtractText. It calls the Windows OCR engines directly and preserves geometry/layout details for structured output.
  • Smrt.ExtractText remains a clipboard-focused, unstructured text extraction component.

Status

  • Planning/orchestration implemented.
  • Host-side OCR execution is implemented in Smrt.ExtractText.Host (not in this library).
  • Dev harness Smrt.Tools.ExtractLab wires the host executors to validate plan + execute end-to-end.