SmrtExtensions
Smrt.Extensions (SmrtChrome)
Smrt.Extensions contains the MV3 browser extension that feeds the Smrt.Web stack. Chrome is the active target (Edge/Opera builds will reuse the same code once SmrtChrome is stabilized; Firefox will follow later because of the different extension model). Ignore SmrtChromeOriginal/; it only preserves the pre-refactor scripts.
Directory map
| Path | Purpose |
|---|---|
| `SmrtChrome/manifest.json` | Declares MV3 permissions (`tabs`, `contextMenus`, `pageCapture`, `nativeMessaging`) and wires `background.js` as the service worker. |
| `SmrtChrome/background.js` | Core logic: context menu wiring, capture orchestration, Native Messaging streaming, ACK/backpressure handling, and capability polling. |
| `SmrtChrome/content.js` | Lightweight notifier that pings the background worker on page load/interactions so tab metadata stays current. |
| `SmrtChrome/scrape.js` | Placeholder for future on-page scrape profiles; currently exposes `window.__smrt_scrape` for debugging extracted text. |
| `SmrtChrome/singlefile.js` | Placeholder hook to drive SingleFile/MHTML snapshotting (returns not-implemented today). |
| `HostManifest/com.smrt.web.template.json` | Native Messaging manifest template; scripts copy/patch this to register `Smrt.Web.Host.exe` with Chrome/Edge. |
Current capabilities
- Context menus (`background.js`)
    - Capture DOM: injects a script to clone `document.documentElement`, inserts a `<base>` tag, and streams `page-<timestamp>.html` in 128 KB base64 chunks via `START`/`CHUNK`/`END`.
    - Capture Visible Area: uses `chrome.tabs.captureVisibleTab`, enforces rate limiting, and streams `capture-<ts>.png` before issuing `FINISH_JOB`.
    - Capture DOM + Visible: sequentially performs both streams, allowing mixed OCR + scrape jobs.
    - Capture Full Page (Scroll OCR): leverages the tiling helpers to emit `tiles.json` plus `tile-r###-c###.png` artifacts for the worker stitcher.
- Snapshot (MHTML/single-file): uses `chrome.pageCapture.saveAsMHTML` when the “Snapshot” path is chosen (still marked preview until SingleFile integration lands).
- OCR mode submenu toggles per-job options passed to the host (`allowCloud`/`startWith`) so users can force local-only or top-cloud runs when cloud providers are available.
- Capabilities handshake (`GET_CAPABILITIES`) runs on startup and when the mode changes; the host relays worker capabilities (`local`, `providers[]`) so the UI hides cloud-first options when none are configured.
- Native Messaging transport
    - One `chrome.runtime.connectNative` port per job to keep the MV3 service worker alive.
    - Per-artifact ACK waiters detect host-side validation failures (size limits, base64 errors) and abort gracefully.
    - Grace window for early `job-not-found` responses so the host can finalize manifests before a second status poll.
- Optional Chrome notifications (currently suppressed because icons are not packaged).
- Telemetry bridge posts basic tab info (title/url/favicon) to `http://127.0.0.1:5000/tab-info`; this keeps Python-side heuristics aware of the active site even when no capture is running.
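The `START`/`CHUNK`/`END` streaming described for Capture DOM can be sketched roughly as follows. This is a minimal illustration, not the code in `background.js`: the frame field names (`jobId`, `name`, `seq`, `data`, `size`, `chunks`) and the helper names are hypothetical, and it assumes the 128 KB limit is measured in base64 characters, which the text does not specify.

```javascript
// Assumption: the 128 KB chunk size counts base64 characters, not raw bytes.
const CHUNK_CHARS = 128 * 1024;

// Encode a Uint8Array as base64 without overflowing the argument limit
// of String.fromCharCode on large payloads.
function toBase64(bytes) {
  let binary = "";
  const step = 0x8000;
  for (let i = 0; i < bytes.length; i += step) {
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + step));
  }
  return btoa(binary);
}

// Build the ordered frame list for one artifact (field names are hypothetical).
function buildFrames(jobId, name, bytes, chunkChars = CHUNK_CHARS) {
  const b64 = toBase64(bytes);
  const frames = [{ type: "START", jobId, name, size: bytes.length }];
  for (let offset = 0, seq = 0; offset < b64.length; offset += chunkChars, seq++) {
    frames.push({
      type: "CHUNK",
      jobId,
      name,
      seq,
      data: b64.slice(offset, offset + chunkChars),
    });
  }
  // frames currently holds START plus every CHUNK, so length - 1 is the chunk count.
  frames.push({ type: "END", jobId, name, chunks: frames.length - 1 });
  return frames;
}
```

In the extension, each frame would presumably be sent with `port.postMessage(frame)` on the per-job `chrome.runtime.connectNative` port. Because the chunk size is a multiple of 4, every chunk is valid base64 on its own, so the host could validate chunks individually.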
Native Messaging manifest
HostManifest/com.smrt.web.template.json is the canonical manifest. When registering the host:
- Copy the template to `%LocalAppData%\SmrtHub\SmtrWeb\com.smrt.web.json` (the install script handles this).
- Replace the `path` with the installed `Smrt.Web.Host.exe`.
- Replace `REPLACE_WITH_EXTENSION_ID` with the actual Chrome/Edge ID of the unpacked/packed extension.
- Import the registry keys (per Chrome instructions) pointing at the manifest path.
The host uses the same manifest for Chrome and Edge; future rebrands (SmrtHub Edge/Opera) still reference com.smrt.web so the native host stays shared.
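As a rough illustration of the patching steps above, the sketch below fills in a template object. It assumes the template follows the standard Chrome Native Messaging manifest shape (`name`, `description`, `path`, `type`, `allowed_origins`); the template contents shown here, the `REPLACE_WITH_HOST_PATH` placeholder, and the `patchHostManifest` helper are hypothetical, and the real install script may work differently.

```javascript
// Hypothetical template contents; the real com.smrt.web.template.json may differ.
const template = {
  name: "com.smrt.web",
  description: "Smrt.Web native messaging host",
  path: "REPLACE_WITH_HOST_PATH", // hypothetical placeholder
  type: "stdio",
  allowed_origins: ["chrome-extension://REPLACE_WITH_EXTENSION_ID/"],
};

// Return a patched copy of the template without mutating the original.
function patchHostManifest(tpl, hostExePath, extensionId) {
  // Chrome and Edge extension IDs are 32 characters drawn from a-p.
  if (!/^[a-p]{32}$/.test(extensionId)) {
    throw new Error("extension ID must be 32 characters in the range a-p");
  }
  const manifest = JSON.parse(JSON.stringify(tpl)); // deep copy
  manifest.path = hostExePath;
  manifest.allowed_origins = manifest.allowed_origins.map((origin) =>
    origin.replace("REPLACE_WITH_EXTENSION_ID", extensionId)
  );
  return manifest;
}
```

Validating the ID before writing the file catches the common mistake of leaving the placeholder in place, which Chrome reports only as a generic access error when the extension connects.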
Developer setup
- Install or build `Smrt.Web.Host` and run `Tools/Install-SmtrWebHost.ps1` to register the manifest.
- In Chrome, open `chrome://extensions`, enable Developer mode, and load `SmrtApps/Smrt.Extensions/SmrtChrome` as an unpacked extension.
- Watch the service worker console: a `PING→PONG` round-trip confirms the host wiring.
- Trigger context menus from any tab; artifacts appear under `%LocalAppData%/SmrtHub/SmtrWeb/Jobs/<jobId>/input` and the host forwards them to the worker.
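The `PING→PONG` check can also be run by hand. The sketch below is one possible way to do it, assuming the host accepts a message shaped like `{ type: "PING" }` and answers with `{ type: "PONG" }`; the exact message shape and the `pingHost` helper are assumptions, not taken from `background.js`. The runtime object is passed in so the function can be exercised outside the browser.

```javascript
const HOST_NAME = "com.smrt.web";

// Send one PING via the one-shot native messaging API and resolve on PONG.
// `runtime` is expected to look like chrome.runtime.
function pingHost(runtime, hostName = HOST_NAME) {
  return new Promise((resolve, reject) => {
    runtime.sendNativeMessage(hostName, { type: "PING" }, (reply) => {
      // Chrome reports transport failures through runtime.lastError.
      if (runtime.lastError) {
        reject(new Error(runtime.lastError.message));
        return;
      }
      if (reply && reply.type === "PONG") {
        resolve(reply);
      } else {
        reject(new Error("unexpected reply: " + JSON.stringify(reply)));
      }
    });
  });
}
```

From the service worker console this would be called as `pingHost(chrome.runtime).then(console.log, console.error)`.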
Known limitations / backlog
- DOM capture writes raw `outerHTML` plus a `<base>` tag. It is not yet self-contained; assets may not load when opened offline. `singlefile.js`/MHTML integration will fix this.
- Tile stitching exists in the worker, but the capture-side scroll step still needs tuning for complex nested scroll regions.
- Back-pressure is coarse: we wait for ACKs per chunk but have not implemented true windowing, zstd compression, or per-chunk hashes.
- Notifications are console-only until branded icons are added to the package.
- `scrape.js` and `singlefile.js` are placeholders; scraping profiles and SingleFile orchestration are tracked in the SmtrWeb backlog.
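To make the per-chunk ACK wait concrete, here is a minimal sketch of an ACK waiter registry. It is not the implementation in `background.js`: the message fields (`type`, `name`, `seq`, `error`), the `NACK` type, and the `createAckWaiters` helper are all hypothetical. Each pending chunk gets a promise that settles when the host replies or a timeout fires.

```javascript
// Registry of pending ACKs, keyed by "<artifact name>#<chunk sequence>".
function createAckWaiters() {
  const pending = new Map();
  return {
    // Returns a promise that resolves on ACK and rejects on NACK or timeout.
    waitFor(key, timeoutMs) {
      return new Promise((resolve, reject) => {
        const timer = setTimeout(() => {
          pending.delete(key);
          reject(new Error("ACK timeout for " + key));
        }, timeoutMs);
        pending.set(key, { resolve, reject, timer });
      });
    },
    // Feed every host message here; returns true if it settled a waiter.
    handleMessage(msg) {
      const key = msg.name + "#" + msg.seq;
      const entry = pending.get(key);
      if (!entry) return false;
      clearTimeout(entry.timer);
      pending.delete(key);
      if (msg.type === "ACK") {
        entry.resolve(msg);
      } else {
        entry.reject(new Error(msg.error || "host rejected chunk"));
      }
      return true;
    },
    get size() {
      return pending.size;
    },
  };
}
```

Sending one chunk and awaiting its waiter before sending the next gives the coarse back-pressure described above. True windowing would instead allow several waiters to be outstanding at once, up to a fixed limit.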
Troubleshooting
- “Error when communicating with the native messaging host”: reload the extension after updating host permissions, re-run the install script, and confirm `sendNativeMessage` returns `PONG`.
- No capabilities in the menu: ensure the worker is running; the host proxies capabilities and will report none when the pipe isn’t reachable.
- Stuck jobs: check `%LocalAppData%/SmrtHub/SmtrWeb/HostLogs/nm-*.log` for frame-by-frame diagnostics and `Smrt.Web.Host` structured logs for policy violations.
- Capture-visible failures: Chrome throttles `captureVisibleTab`. The extension spaces calls (`MIN_CAPTURE_INTERVAL_MS = 750`); if throttling persists, close other capture-heavy extensions.
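The call spacing mentioned in the last item can be pictured with the sketch below. Only `MIN_CAPTURE_INTERVAL_MS = 750` comes from the text; the `msUntilNextCapture` and `throttledCapture` helpers and the `state` object are hypothetical and may not match how `background.js` tracks the last capture.

```javascript
const MIN_CAPTURE_INTERVAL_MS = 750;

// How long to wait before the next capture is allowed (0 if it can run now).
function msUntilNextCapture(lastCaptureAt, now = Date.now()) {
  if (lastCaptureAt == null) return 0; // nothing captured yet
  return Math.max(0, lastCaptureAt + MIN_CAPTURE_INTERVAL_MS - now);
}

// Run `capture` only after the minimum interval has elapsed.
// `state` is a plain object holding the timestamp of the previous capture.
async function throttledCapture(state, capture) {
  const wait = msUntilNextCapture(state.lastCaptureAt);
  if (wait > 0) {
    await new Promise((resolve) => setTimeout(resolve, wait));
  }
  state.lastCaptureAt = Date.now();
  return capture();
}
```

In the extension, `capture` would typically wrap `chrome.tabs.captureVisibleTab(windowId, { format: "png" })`. Keeping the delay calculation in a pure function makes the interval easy to check without a browser.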