
Component: Utils

Canonical source: SmrtApps/PythonApp/python_core/utils/README.md (mirrored below)


python_core.utils

Overview

  • Location: SmrtApps/PythonApp/python_core/utils
  • Purpose: Shared utilities for logging, config/path resolution, file IO, filename normalization, heartbeat monitoring, HTML output, favicon caching, and shared-state access across Python Core.
  • Style: Inline docstrings stay terse; this README carries the operational details.

Modules

  • logger.py — unified logging with rotating text + JSON sinks; follows README.Files/Reference-Guides/SmrtHub.Logging.README.md for schema and configuration.
  • config_path_resolver.py — canonical config/log/SmrtDB path helpers plus health markers; extended notes in Python_Core.Config_Path_Resolver.README.md.
  • fs.py — duplicate-aware text writes (write_file) using MD5 compare and atomic swap.
  • filename_utils.py — normalization/extraction helpers for titles and clipboard data.
  • filesystem_identity.py — cross-platform get_unique_file_id (volume+file index on Windows, device+inode elsewhere).
  • exclusion_matcher.py — heuristics for junk directories/files to keep indexers clean.
  • favicon_cache.py — cached favicon fetch with PNG conversion and 7-day TTL.
  • html_formatter.py — renders the unified SmrtReader template (offline-safe, inline assets) from python_core/runtime/HTML_output/smrtreader_template.html.
  • encounter_log.py — ProgramData-aware writer for newline-delimited encounter evidence logs shared between python-net and python-runtime.
  • heartbeat.py — daemon heartbeat thread that logs status and updates health markers.
  • shared_state_window.py — safe snapshot of python_core.runtime.shared_state.results for UI/monitoring.
  • shared_data_path_resolver.py — ProgramData-aware helper that mirrors the C# shared resolver so Storage Guard + SmrtSpace metadata land in shared locations.
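
The duplicate-aware write listed for fs.py (MD5 compare, then atomic swap) can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the name write_file_sketch, its signature, its boolean return value, and the ".tmp" suffix are all assumptions, and the real write_file also logs duplicates via log_info, which is omitted here.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def write_file_sketch(path, content, encoding="utf-8"):
    """Write text to path unless identical content is already on disk.

    Returns True when the file was written, False when it was skipped as a
    duplicate. Hypothetical stand-in for fs.write_file; the real signature
    and return value may differ.
    """
    target = Path(path)
    new_bytes = content.encode(encoding)

    # Compare MD5 digests of the existing bytes and the new content.
    if target.is_file():
        old_digest = hashlib.md5(target.read_bytes()).hexdigest()
        if old_digest == hashlib.md5(new_bytes).hexdigest():
            return False  # duplicate: skip the disk write

    target.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temp file in the same directory, then swap it into place.
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_bytes)
        os.replace(tmp_name, target)  # atomic when both paths share a filesystem
    except BaseException:
        # Clean up the temp file if the write or the swap failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return True
```

Creating the temp file next to the target keeps both paths on one filesystem, which is what lets os.replace swap the file in a single step so readers never see a partially written file.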

Key Behaviors

  • Logging: All loggers derive from logger.py. Text logs write to <slug>-log.txt, structured logs to <slug>-log.json. Permission-safe rotation uses SafeRotatingFileHandler. Emojis stay in console streams only.
  • Config Paths: component_dir, config_file_path, state_file_path, log_paths, smrt_db_dir, and smrt_db_component_dir mirror the C# resolver. SMRTHUB_HOME (dev/test override) is treated as the root for the Config tree; see the resolver README for canonical layouts and slug rules.
  • Health Markers: write_health_marker writes {component, status, ts, pid, extra} JSON (UTF-8, ensure_ascii=False) under the component’s Config folder. Heartbeat refreshes it each cycle and flips status="degraded" on exceptions.
  • Duplicate-Aware Writes: fs.write_file hashes existing bytes vs. new content before writing. Duplicates log via log_info and skip disk writes.
  • Filename Normalization Pipeline: normalize_unicode → normalize_apostrophes → strip_punctuation. extract_filename_candidate matches regex patterns for filenames, falling back to fallback_ext when provided. generate_variants outputs lowercase variants for fuzzy matching.
  • Junk Filtering: is_junk_dir_name rejects known system/cache/VCS directories and variants such as .Trash-1000, cache (3), or tmp_copy. is_junk_file_name covers AppleDouble files, temp prefixes/suffixes, numeric extensions, and configured junk extensions.
  • HTML Formatting: format_smrtreader_output renders the single-flow SmrtReader template with inline Cal Sans and the SmrtHub/MKDocs palette (light/dark). Text, images, and files are streamed in order as one article; the page title/H1 uses metadata["source"] (fallback document_title) and the footer uses a single breadcrumb line built from save_path, source/source_url, and saved_at_iso/saved_at_tz (fallback saved_at). Clipboard HTML is sanitized conservatively, with a tiny emphasis-only inline-style allowlist.
  • Heartbeat Loop: Heartbeat logs ❤️ [Heartbeat:<tag>] entries at the configured interval using the dedicated heartbeat logger. The singleton start_heartbeat() lazily launches one instance.
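  • Shared State Window: get_shared_state_window() returns dict(shared_state.results), a shallow copy, so callers can add or replace top-level keys without touching the live state; nested values such as the clipboard list are still shared with it. Fields include clipboard (list of items), source_data, source_type, buffer, save_format, and editor_status.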
  • Shared ProgramData Paths: shared_data_path_resolver pins Storage Guard artifacts and other cross-identity files to %ProgramData%/SmrtHub/Config/<slug>/. Dev machines without ProgramData fall back to %LocalAppData%/%AppData% to keep tooling runnable.
  • Favicon Cache: Favicons are cached under component_dir("python-net")/favicons. Files expire after 7 days. When Pillow is available, non-PNG responses are converted to PNG before caching.

Related Documentation

  • README.Files/Reference-Guides/SmrtHub.Logging.README.md
  • Python_Core.Config_Path_Resolver.README.md
  • Python_Core.Config.README.md
  • README.Files/System/Policies/SmrtHub-Operational-Data-Policy-v1.0.README.md
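
The Health Markers behavior described under Key Behaviors (a {component, status, ts, pid, extra} JSON file that the heartbeat refreshes each cycle and marks as degraded on exceptions) might look roughly like this. Everything not named in the text above is an assumption made for illustration: the function names, the "health.json" file name, the epoch-seconds timestamp, the "ok" status value, and the "error" key inside extra.

```python
import json
import os
import time
from pathlib import Path


def write_health_marker_sketch(config_dir, component, status="ok", extra=None):
    """Write a {component, status, ts, pid, extra} marker as UTF-8 JSON.

    Hypothetical stand-in for write_health_marker: the file name, the
    timestamp format, and the default status are assumptions.
    """
    marker = {
        "component": component,
        "status": status,
        "ts": time.time(),
        "pid": os.getpid(),
        "extra": extra or {},
    }
    folder = Path(config_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "health.json"
    # ensure_ascii=False keeps non-ASCII text readable in the file.
    path.write_text(json.dumps(marker, ensure_ascii=False), encoding="utf-8")
    return path


def heartbeat_cycle_sketch(config_dir, component, work):
    """Run one cycle and refresh the marker, switching to degraded on errors."""
    try:
        work()
        status, extra = "ok", {}
    except Exception as exc:
        status, extra = "degraded", {"error": str(exc)}
    return write_health_marker_sketch(config_dir, component, status, extra)
```

In the real module the target folder would come from the config path resolver rather than being passed in; taking it as a parameter keeps the sketch self-contained.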

Notable Updates

  • 2025-11-08 — Refreshed utils docstrings with reStructuredText :param:/:returns: metadata to satisfy the Documentation Policy auto-import requirements.