Component: ClipboardMonitor

Canonical source: SmrtApps/CSApps/ClipboardMonitor/README.md (mirrored below)


ClipboardMonitor — Windows Clipboard → SmrtHub bridge

Lightweight Windows clipboard watcher that extracts normalized fragments (text, images, HTML) and forwards them to the Python Core over HTTP. Optimized for responsiveness and stability with debounce, cancellation, duplicate suppression, and an optional OCR quick‑hint for images.

Overview and responsibilities

  • Owns the Windows clipboard polling loop (STA) and fragment extraction.
  • Normalizes clipboard content into fragments and posts to the local Python bridge.
  • Stamps encounter provenance (IDs + digests) so downstream storage can correlate evidence.

Highlights

  • Synchronous STA polling loop with small debounce to ensure all Clipboard API calls occur on a single STA thread (Windows requirement)
  • Duplicate suppression (fixed five-second window; configuration toggle currently reserved while telemetry is gathered)
  • Image text‑likelihood hint (textLikely) via late‑bound Windows OCR, with heuristic fallback
  • Centralized logging and config, schema‑validated defaults, and sensible timeouts

How it works

1) Monitors the clipboard on a dedicated STA thread (no async context hops). The loop uses short Thread.Sleep delays to preserve STA.
2) On change, calls ExtractionFragmentUtils.ExtractFragmentsFromClipboardAsync(ct) to normalize fragments (HTML parsing and anchor/image extraction now live in the shared utility library):
  • text → { type: "text", content: "..." }
  • image → { type: "image", content: "<base64 png>", textLikely?: true|false }
3) Applies duplicate suppression based on the serialized fragment list within a short time window.
4) Wraps the fragments in an encounter envelope (encounterId + per-fragment SHA-256 digests + Storage Guard snapshot reference) and POSTs { fragments, encounter } to the Python Core.
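Step 3 can be illustrated with a small sketch. This is not the monitor's C# implementation; it is a minimal Python model of "skip an identical serialized fragment list seen within a short window". The class name, the injectable clock, and the choice not to refresh the timestamp when a duplicate is skipped are all assumptions made for illustration.

```python
import json
import time


class DuplicateSuppressor:
    """Hypothetical helper: skips a fragment list identical to the last posted one."""

    def __init__(self, window_s=5.0, clock=time.monotonic):
        self._window_s = window_s
        self._clock = clock  # injectable so the behavior can be exercised in tests
        self._last_key = None
        self._last_seen = None

    def should_post(self, fragments):
        # Serialize deterministically so equal fragment lists produce equal keys.
        key = json.dumps(fragments, sort_keys=True, separators=(",", ":"))
        now = self._clock()
        is_duplicate = (
            key == self._last_key
            and self._last_seen is not None
            and now - self._last_seen < self._window_s
        )
        if not is_duplicate:
            # Assumption: only posted batches move the window forward.
            self._last_key = key
            self._last_seen = now
        return not is_duplicate
```

With a five-second window, the same fragment list offered at t=0 and t=2 would be posted once; offered again at t=6 it would be posted a second time.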

Python endpoint (configurable): http://127.0.0.1:5001/clipboard

Dependencies and integrations

  • Local Python bridge endpoint (default): http://127.0.0.1:5001/clipboard
  • Shared extraction library: SmrtApps/src/Smrt.Utils.Clipboard
  • Logging: SmrtApps/src/Smrt.Logging
  • Config/path resolution: SmrtApps/src/Smrt.Config

Notes:

  • The legacy Python payload_check has been retired. The quick image text hint is computed in C#.
  • Windows Clipboard APIs require an STA thread; after an await, execution can resume on an MTA thread-pool thread, causing empty reads. The monitor keeps the entire loop on STA to avoid this.

Configuration

Config file path is resolved via SmrtHub’s centralized resolver to the Roaming profile, e.g. %APPDATA%/SmrtHub/Config/clipboard-monitor/clipboard-monitor-config.json.

Defaults and JSON schema ship with the app and are copied to output:

  • config/defaults/clipboard-monitor-settings.defaults.json
  • config/schema/clipboard-monitor-settings.schema.json

Settings (excerpt):

  • endpointUrl (string, uri): Flask bridge endpoint; default http://127.0.0.1:5001/clipboard
  • pollingIntervalMs (int 100–30000): Main loop delay between scans; default 500
  • duplicateDetection (bool): Reserved toggle for runtime suppression policy; currently logged only while fixed window remains in place
  • enableOcrHint (bool): Enable OCR quick‑hint for images; default true
  • ocrHintTimeoutMs (int 50–5000): Per‑image quick‑hint timeout; default 350
  • ocrHintMaxEdgePx (int 64–4096): Downscale longest image edge before hinting; default 1024
  • ocrHintLanguageTag (string|null): Optional BCP‑47 tag (e.g., "en-US")
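To make the documented ranges and defaults concrete, here is a rough Python stand-in for the range checks the JSON schema performs. The real application validates against its shipped schema; the function name, the error wording, and the null default for ocrHintLanguageTag are assumptions. duplicateDetection is left out because its default is not stated above.

```python
# Defaults taken from the settings excerpt; the language tag default is assumed.
DEFAULTS = {
    "endpointUrl": "http://127.0.0.1:5001/clipboard",
    "pollingIntervalMs": 500,
    "enableOcrHint": True,
    "ocrHintTimeoutMs": 350,
    "ocrHintMaxEdgePx": 1024,
    "ocrHintLanguageTag": None,
}

# Inclusive integer ranges from the settings excerpt.
RANGES = {
    "pollingIntervalMs": (100, 30000),
    "ocrHintTimeoutMs": (50, 5000),
    "ocrHintMaxEdgePx": (64, 4096),
}


def validate_settings(user_settings):
    """Hypothetical helper: merge user settings over defaults and range-check them."""
    merged = {**DEFAULTS, **user_settings}
    errors = []
    for name, (low, high) in RANGES.items():
        value = merged[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{name} must be an integer")
        elif not low <= value <= high:
            errors.append(f"{name} must be between {low} and {high}")
    if not isinstance(merged["enableOcrHint"], bool):
        errors.append("enableOcrHint must be a boolean")
    tag = merged["ocrHintLanguageTag"]
    if tag is not None and not isinstance(tag, str):
        errors.append("ocrHintLanguageTag must be a string or null")
    return merged, errors
```

Calling validate_settings({}) returns the defaults with an empty error list; a pollingIntervalMs of 50 would be reported as out of range.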

On startup, settings are loaded (or seeded from defaults), validated against the schema, and applied via OcrHintSettings.Configure(new OcrHintOptions { ... }) so the extraction library's hint path follows the configured policy.
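One of the options applied at startup, ocrHintMaxEdgePx, caps the longest image edge before hinting. The arithmetic can be sketched as follows, assuming the library preserves aspect ratio and never upscales; neither detail is confirmed by this README, and the function name is hypothetical.

```python
def downscaled_size(width, height, max_edge_px=1024):
    """Hypothetical helper: target size whose longest edge is at most max_edge_px."""
    longest = max(width, height)
    if longest <= max_edge_px:
        return width, height  # already small enough; assumed to be left untouched
    scale = max_edge_px / longest
    # Round to whole pixels and never let an edge collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Under these assumptions a 4096×2048 screenshot would be hinted at 1024×512, while an 800×600 image would pass through unchanged.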

Logging

Uses SmrtHub.Logging and writes structured logs under the component name ClipboardMonitor.

  • Startup summary with polling and duplicate detection mode, plus the configured endpointUrl
  • Per‑cycle trace if no fragments are detected (rate‑limited): “No clipboard fragments detected this cycle.”
  • Per‑post summary: “Posting {Count} fragment(s) to {Url}”
  • Errors for clipboard contention and HTTP failures (non‑fatal)

The extraction library also emits lightweight traces to aid diagnosis, e.g.:

  • [Extraction] Contains: html=…, htmlHasImg=…, text=…, image=…
  • [Extraction] HTML parse produced N fragment(s)
  • [Extraction] Text fragment length=…
  • [Extraction] Image fragment added: base64Len=…, textLikely=…

When OCR hinting is enabled, the library logs availability detection, success/failure counters, and brief backoff messages after repeated failures.

Build and run

  • Project: SmrtApps/CSApps/ClipboardMonitor/ClipboardMonitor.csproj
  • Target: net8.0-windows (STA background thread; no UI message pump)
  • Staging (via build scripts): Apps/<Configuration>/<RID>/ClipboardMonitor/ClipboardMonitor.exe

Try it (optional, via VS Code tasks): run the central build task “⚙️ Build & Stage Apps (Full Solution | Debug win-x64)”, then run the staged EXE.

Contract (inputs/outputs)

  • Input: current Windows clipboard state
  • Output: HTTP POST JSON body: { "fragments": [...], "encounter": { ... } }
  • text fragment: { type: "text", content: "..." }
  • image fragment: { type: "image", content: "<base64>", textLikely?: bool }
  • encounter envelope: { encounterId, issuedAtUtc, source: "clipboard-monitor", fragments: [{ fragmentId, contentType, sha256, size }], storageGuard: { snapshotPath, signaturePath, verified, verificationError?, signatureIssuedAtUtc?, keyId? } }
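The contract above can be illustrated by assembling a body of the same shape in Python. This is a sketch under several assumptions that the README does not state: IDs are UUIDs, issuedAtUtc is an ISO 8601 string, the digest and size are computed over the UTF-8 bytes of the fragment's content string, and contentType mirrors the fragment type. The storageGuard object is passed through untouched.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def build_payload(fragments, storage_guard=None):
    """Hypothetical helper: wrap fragments in an encounter envelope."""
    described = []
    for fragment in fragments:
        # Assumption: digest and size cover the UTF-8 bytes of the content string.
        content_bytes = fragment["content"].encode("utf-8")
        described.append({
            "fragmentId": str(uuid.uuid4()),
            "contentType": fragment["type"],
            "sha256": hashlib.sha256(content_bytes).hexdigest(),
            "size": len(content_bytes),
        })
    encounter = {
        "encounterId": str(uuid.uuid4()),
        "issuedAtUtc": datetime.now(timezone.utc).isoformat(),
        "source": "clipboard-monitor",
        "fragments": described,
        "storageGuard": storage_guard,
    }
    return {"fragments": fragments, "encounter": encounter}
```

A receiver could recompute each digest from the matching entry in fragments and compare it with encounter.fragments[i].sha256 to check that the batch arrived intact.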

Edge cases and behavior

  • Clipboard contention: access errors are logged; the loop continues.
  • Rapid changes: previous extraction is canceled in favor of the latest state; small debounce (100–300ms) coalesces bursts. Debounce sleeps are performed on the STA thread (no async context switching).
  • Duplicates: identical serialized fragment lists within ~5s are skipped.
  • OCR hint disabled: hint path is bypassed entirely via config.
  • Windows OCR unavailable: the library falls back to a heuristic and logs availability once.

Encounter provenance

  • Each clipboard batch now receives an encounter ID and per-fragment SHA-256 digests before leaving the STA loop.
  • Storage Guard snapshot + signature metadata are loaded from %ProgramData%/SmrtHub/Logs/system-info and embedded directly into the envelope when available; verification errors are logged and captured for downstream analysis.
  • Encounter envelopes are appended to %ProgramData%/SmrtHub/Logs/encounter-log/encounter-log-YYYYMMDD.ndjson (overridable via SMRTHUB_COMMON_APPDATA_OVERRIDE) whenever ClipboardMonitor mints a new batch.
  • The Python runtime stores the same encounter metadata, logs the persistence stage to the same ProgramData NDJSON for full provenance, and appends SmartSpace Archive records for each saved artifact.
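As an illustration of the NDJSON convention described above (one JSON object per line, one file per day), here is a minimal append routine. The directory is supplied by the caller; using the UTC date for the file name is an assumption, as is the function name.

```python
import json
import os
from datetime import datetime, timezone


def append_encounter(log_dir, encounter, now=None):
    """Hypothetical helper: append one envelope as a single NDJSON line."""
    now = now or datetime.now(timezone.utc)  # assumption: files roll over on the UTC date
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.join(log_dir, f"encounter-log-{now:%Y%m%d}.ndjson")
    with open(path, "a", encoding="utf-8") as handle:
        # json.dumps escapes embedded newlines, so each record stays on one line.
        handle.write(json.dumps(encounter, separators=(",", ":")) + "\n")
    return path
```

Because each record occupies exactly one line, the log can be tailed or filtered line by line without parsing the whole file.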

Testing and validation

  • Launch ClipboardMonitor and confirm logs show the resolved endpointUrl.
  • Copy text and an image: confirm the Python bridge receives fragments and processing continues (no crashes on contention).
  • Confirm encounter metadata is emitted to the canonical encounter log and includes fragment digests.

Support Bundle

  • ClipboardMonitor logs and recent encounter provenance should be captured via the Support Bundle using the canonical log and system-info exports.
  • When debugging cross-component issues, prefer exporting a Support Bundle instead of hand-copying logs.
  • Library: SmrtApps/src/Smrt.Utils.Clipboard (extraction + OCR hint)
  • Logging: SmrtApps/src/Smrt.Logging
  • Config: SmrtApps/src/Smrt.Config

Generated Output

  • bin/ and obj/ contain build artifacts produced by the .NET build; they remain excluded from recursive README coverage per the documentation policy.