
Component: Smrt.Utils.Clipboard

Canonical source: SmrtApps/src/Smrt.Utils.Clipboard/README.md (mirrored below)


Smrt.Utils.Clipboard

Overview & Responsibilities

  • Extracts text, HTML, image, and CF_HDROP document/file fragments from the Windows clipboard and normalizes them for SmrtHub ingestion.
  • Provides an OCR “quick hint” that flags image fragments likely containing text using Windows OCR (when available) with a heuristic fallback.
  • Emits structured telemetry via SmrtHub.Logging so downstream services (for example ClipboardMonitor and HubWindow) can track extraction quality.

Dependencies & Integrations

  • Targets full .NET 8 (WinDesktop) and uses Windows Forms clipboard APIs; the host must call them from an STA thread.
  • Late-binds Windows.Media.Ocr projections if the host process ships CsWinRT; otherwise falls back to the built-in heuristic.
  • Configuration is delivered through host apps (typically via Smrt.Config) by calling OcrHintSettings.Configure at startup.
  • No third-party logging: everything flows through SmrtHub.Logging and follows the Operational Data Policy file naming conventions.
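
The late-binding bullet above can be illustrated with a small availability probe. This is a minimal sketch, not the library's implementation: LateBoundProbe and its members are hypothetical names, and the assembly-qualified name in the usage note is an assumption about where a CsWinRT host exposes the OCR projection.

```csharp
#nullable enable
using System;
using System.IO;

// Hypothetical helper (not part of Smrt.Utils.Clipboard): resolves a type by
// assembly-qualified name without a compile-time reference to its assembly.
public static class LateBoundProbe
{
    // Returns the resolved type, or null when the type or its assembly is missing.
    public static Type? TryResolve(string assemblyQualifiedName)
    {
        try
        {
            // throwOnError: false yields null for a missing type or assembly.
            return Type.GetType(assemblyQualifiedName, throwOnError: false);
        }
        catch (Exception ex) when (ex is ArgumentException
                                      or FileLoadException
                                      or BadImageFormatException
                                      or TypeLoadException)
        {
            // Malformed names or unloadable assemblies count as "not available".
            return null;
        }
    }

    public static bool IsAvailable(string assemblyQualifiedName) =>
        TryResolve(assemblyQualifiedName) is not null;
}
```

A host could call LateBoundProbe.IsAvailable("Windows.Media.Ocr.OcrEngine, Microsoft.Windows.SDK.NET") once at startup and route image fragments to the heuristic when it returns false. The assembly name is an assumption and should be checked against the projection the host actually ships.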

Configuration & Log Paths

  • This library has no configuration of its own; host components persist settings under their own slug at %AppData%/SmrtHub/Config/<slug>.
  • Logging uses the host component’s logger; the fragments library emits rate-limited informational traces (e.g., extraction counts, OCR hint metrics).
  • No additional config/log files live under this directory.

Runtime Workflow

  1. Host (ClipboardMonitor) initializes logging and calls OcrHintSettings.Configure if overrides are needed.
  2. Clipboard changes trigger ExtractionFragmentUtils.ExtractFragmentsFromClipboardAsync, passing a cancellation token to cancel stale work.
  3. The method probes HTML markup first (when an <img> tag is present), falling back to the plain-text and image paths when appropriate.
  4. Image fragments pass through WindowsOcrHintService.QuickDetectTextAsync, which attempts late-bound Windows OCR and, if OCR is unavailable or too slow, uses the ImageLikelyHasText heuristic.
  5. Returned fragments contain Type, Content, and optional textLikely hints for downstream prioritization.
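
Steps 2 and 4 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: StaleWorkCanceller and QuickHint are hypothetical names, and the OCR call and heuristic are passed in as delegates because the real signatures are not documented here.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: hands out one token per clipboard change and cancels
// the token from the previous change so stale extraction work can stop early.
public sealed class StaleWorkCanceller
{
    private CancellationTokenSource? _current;

    public CancellationToken Next()
    {
        var fresh = new CancellationTokenSource();
        var previous = Interlocked.Exchange(ref _current, fresh);
        previous?.Cancel(); // disposal is omitted to keep the sketch short
        return fresh.Token;
    }
}

// Hypothetical helper: tries the OCR delegate within a time budget and falls
// back to the heuristic when OCR fails or takes too long.
public static class QuickHint
{
    public static async Task<bool> DetectAsync(
        Func<CancellationToken, Task<bool>> ocr,
        Func<bool> heuristic,
        TimeSpan timeout,
        CancellationToken ct)
    {
        try
        {
            // WaitAsync throws TimeoutException when the budget elapses and
            // OperationCanceledException when the caller cancels.
            return await ocr(ct).WaitAsync(timeout, ct).ConfigureAwait(false);
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // OCR unavailable, failed, or too slow: use the heuristic.
            // Caller cancellation is not caught here and propagates.
            return heuristic();
        }
    }
}
```

In this sketch a timed-out OCR task keeps running in the background; a fuller version would link a second token to the timeout so the slow call can also be cancelled.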

Testing & Validation

  • Build the library directly: dotnet build SmrtApps/src/Smrt.Utils.Clipboard/ExtractionFragmentUtilsLib.csproj -c Debug.
  • ClipboardMonitor integration tests exercise the library under real STA threads; run dotnet test for that project after modifying extraction behavior.
  • When changing OCR hints, validate on a Windows machine with and without Windows OCR projections installed to ensure graceful fallback.
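
Because the clipboard APIs need an STA thread, tests and small repro tools often need a way to run a delegate on one. The helper below is a minimal sketch using standard .NET threading APIs; StaRunner is a hypothetical name, not a type shipped by this library, and the code is Windows-only.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs work on a dedicated STA thread and surfaces the
// result (or exception) through a Task.
public static class StaRunner
{
    public static Task<T> RunAsync<T>(Func<T> work)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        var thread = new Thread(() =>
        {
            try { tcs.SetResult(work()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        // The apartment state must be set before the thread starts.
        // This call throws PlatformNotSupportedException off Windows.
        thread.SetApartmentState(ApartmentState.STA);
        thread.IsBackground = true;
        thread.Start();
        return tcs.Task;
    }
}
```

In a project that references Windows Forms, a test could then call await StaRunner.RunAsync(() => System.Windows.Forms.Clipboard.GetText()) to read the clipboard from an STA thread.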

Key APIs

  • ExtractionFragmentUtils.ExtractFragmentsFromClipboardAsync — asynchronous clipboard normalization with HTML parsing.
  • WindowsOcrHintService.QuickDetectTextAsync — hybrid Windows OCR + heuristic quick hint.
  • OcrHintSettings.Configure — runtime configuration entry point for the OCR hint pipeline.

Observability

  • Extraction traces are rate-limited to avoid log spam ([Extraction] Contains …, [Extraction] Image fragment …).
  • The OCR hint service tracks totals, successes, timeouts, and backoff windows, and logs transitions (e.g., when Windows OCR is detected or when backoff clears).
  • Call WindowsOcrHintService.GetDiagnosticsSnapshot() for a structured view of hint metrics, and ResetDiagnostics() when automated tests or support workflows need to zero the counters.
  • Host components should include these events in support bundles when diagnosing clipboard issues.
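
The rate limiting and counter snapshots described above can be sketched as follows. This is an illustration only: RateLimitedTrace, HintCounters, and HintDiagnostics are hypothetical names, and the real shapes behind GetDiagnosticsSnapshot() and ResetDiagnostics() are not documented in this README.

```csharp
#nullable enable
using System;
using System.Threading;

// Hypothetical rate limiter: allows one trace per interval and counts the
// calls that were suppressed in between.
public sealed class RateLimitedTrace
{
    private readonly TimeSpan _minInterval;
    private readonly Func<DateTimeOffset> _clock;
    private readonly object _gate = new();
    private DateTimeOffset _last = DateTimeOffset.MinValue;
    private int _suppressed;

    public RateLimitedTrace(TimeSpan minInterval, Func<DateTimeOffset>? clock = null)
    {
        _minInterval = minInterval;
        _clock = clock ?? (() => DateTimeOffset.UtcNow);
    }

    // Returns true when the caller should log now; suppressedSinceLast then
    // holds how many calls were dropped since the previous emitted trace.
    public bool TryAcquire(out int suppressedSinceLast)
    {
        lock (_gate)
        {
            var now = _clock();
            if (now - _last < _minInterval)
            {
                _suppressed++;
                suppressedSinceLast = 0;
                return false;
            }
            _last = now;
            suppressedSinceLast = _suppressed;
            _suppressed = 0;
            return true;
        }
    }
}

// Hypothetical immutable snapshot of the hint counters.
public readonly record struct HintDiagnostics(long Total, long Successes, long Timeouts);

// Hypothetical thread-safe counters with snapshot and reset operations.
public sealed class HintCounters
{
    private long _total, _successes, _timeouts;

    public void RecordSuccess() { Interlocked.Increment(ref _total); Interlocked.Increment(ref _successes); }
    public void RecordTimeout() { Interlocked.Increment(ref _total); Interlocked.Increment(ref _timeouts); }

    public HintDiagnostics Snapshot() =>
        new(Interlocked.Read(ref _total), Interlocked.Read(ref _successes), Interlocked.Read(ref _timeouts));

    public void Reset()
    {
        Interlocked.Exchange(ref _total, 0);
        Interlocked.Exchange(ref _successes, 0);
        Interlocked.Exchange(ref _timeouts, 0);
    }
}
```

Injecting the clock keeps the limiter testable without sleeping. The snapshot reads each counter separately, so under concurrent updates the three values may come from slightly different moments.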

Support bundle

  • Clipboard-related investigations should capture the host component logs plus any relevant diagnostics snapshots.
  • Not documented yet: any clipboard-specific non-log artifacts that should be included by default.

Directory Map

  • ExtractionFragmentUtilsLib.cs — fragment extraction logic, HTML parsing, heuristics.
  • WindowsOcrHintService.cs — Windows OCR quick-hint implementation and fallback logic.
  • OcrHintSettings.cs — runtime configuration options and global accessor.
  • Build-ExtractionFragmentUtilsLib.ps1 / Clean-ExtractionFragmentUtilsLib.ps1 — helper scripts for local packaging/cleanup.
  • bin/, obj/ — generated build artifacts.

Policy References

  • Documentation standards: README.Files/System/Policies/SmrtHub-Documentation-Policy-v1.0.README.md
  • Logging policy: README.Files/Reference-Guides/SmrtHub.Logging.README.md
  • Operational data paths: README.Files/System/Policies/SmrtHub-Operational-Data-Policy-v1.0.README.md
  • ClipboardMonitor README: SmrtApps/CSApps/ClipboardMonitor/README.md
  • Tests README: SmrtApps/src/Smrt.Utils.Clipboard.Tests/README.md