installomator

The Installomator subsystem lives in its own subpackage, split into three modules:

parser

Tokenizes Installomator’s bash label fragments into structured field assignments without invoking a shell.

resolver

Evaluates dynamic field values (e.g. downloadURL=$(curl ... | grep ...)) inline in Python where possible, with a subprocess as an opt-in fallback. For the two-stage split with the macOS runner, see Resolution.

ingest

Pulls the Installomator label registry from GitHub and writes parsed/resolved rows into the catalog.

Parser

parse_fragment(fragment: str) → dict[str, Any]

Parse an Installomator label fragment into a dict of variable assignments.

Recognized syntaxes:

  • key="quoted value": string values, surrounding quotes stripped.

  • key=$(shell expression): preserved verbatim as the literal expression.

  • key=(arr "values" here): bash arrays returned as Python lists.

A key assigned exactly once maps to a scalar string (or a list, for a bash array). A key assigned more than once maps to the ordered list of every assignment, so the resolve step in a resolve-then-transform chain and the primary URL in an arch-conditional branch are never discarded. Consumers that need a single value should take the first element (see _scalar_for_column in the ingest module).

Blank lines and lines starting with # are skipped. The opening <label>) header, including multi-name headers of the form a|b|c), and the trailing ;; separator are stripped before parsing.

Parameters:

fragment (str)

Return type:

dict[str, Any]
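
The single-versus-repeated assignment rule can be illustrated with a short sketch. This is not the parser itself: collect_assignments and first_scalar are hypothetical helpers, and the input is a hand-written list of (key, value) pairs standing in for already-tokenized assignments.

```python
from typing import Any, Iterable


def collect_assignments(pairs: Iterable[tuple[str, Any]]) -> dict[str, Any]:
    """Group assignments by key: one assignment stays scalar, several become a list."""
    seen: dict[str, list[Any]] = {}
    for key, value in pairs:
        seen.setdefault(key, []).append(value)
    # Keep the value itself for a single assignment, the ordered list otherwise.
    return {key: (vals[0] if len(vals) == 1 else vals) for key, vals in seen.items()}


def first_scalar(value: Any) -> Any:
    """Hypothetical consumer-side helper: collapse a field to a single value."""
    if isinstance(value, list):
        # Note: a bash array assigned once is also a list, so it collapses too.
        return value[0] if value else None
    return value


pairs = [
    ("name", "Example App"),
    ("downloadURL", "https://example.com/arm64.dmg"),
    ("downloadURL", "https://example.com/x86_64.dmg"),
]
parsed = collect_assignments(pairs)
primary_url = first_scalar(parsed["downloadURL"])
```

Here parsed["name"] stays a plain string, while parsed["downloadURL"] keeps both assignments in order, so the primary URL is still available as the first element.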

Resolver

The resolver returns one of three outcomes (Resolved, Unresolvable, InvalidOutput) so callers can distinguish a clean value from a rejected one from nothing at all.
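
A caller typically branches on the outcome type. The sketch below uses stand-in dataclasses that mirror the documented constructor signatures, not the real classes, and column_update is a hypothetical helper that returns the value to store together with any raw value kept for review.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Resolved:  # stand-in mirroring Resolved(value)
    value: str


@dataclass(frozen=True)
class Unresolvable:  # stand-in mirroring Unresolvable(reason)
    reason: str


@dataclass(frozen=True)
class InvalidOutput:  # stand-in mirroring InvalidOutput(raw, reason)
    raw: str
    reason: str


Outcome = Union[Resolved, Unresolvable, InvalidOutput]


def column_update(outcome: Outcome) -> tuple[Optional[str], Optional[str]]:
    """Return (column_value, raw_kept_for_review) for one outcome."""
    if isinstance(outcome, Resolved):
        return outcome.value, None  # clean value: store it
    if isinstance(outcome, InvalidOutput):
        return None, outcome.raw  # rejected value: null the column, keep the raw text
    return None, None  # nothing at all: null the column
```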

class Resolved(value: str)

A pipeline (or literal) produced a final, usable value. Caller stores it.

Parameters:

value (str)

class Unresolvable(reason: str)

We couldn’t get a value at all: the pipeline contained an unsupported command, failed to parse, hit a network error, or produced empty output. Caller nulls the column.

Parameters:

reason (str)

class InvalidOutput(raw: str, reason: str)

We got a value, but it failed sanity checks (URL validator, etc). Caller nulls the column AND keeps the raw value for review. Distinct from Unresolvable so callers can log “we got something, but rejected it” vs “we got nothing.”

Parameters:
  • raw (str)

  • reason (str)

class PipelineResolver(http_client: Client | None = None, *, allow_subprocess_fallback: bool = False, context: dict | None = None)

Evaluate an Installomator label’s shell-expression value into a concrete string, in Python (no subprocess by default).

Holds the state that threads through pipeline execution — the httpx client reused across curl stages, the opt-in subprocess-fallback toggle, and the parsed-label context — so a caller resolving many labels constructs one resolver and reuses it across the batch. The stateless filter stages live in _filters; this class is the stateful execution core (orchestration + source commands).

Parameters:
  • http_client (httpx.Client | None) – Optional pre-configured httpx.Client. If omitted, a fresh client with a 30-second timeout is created and disposed per curl invocation. Tests inject a MockTransport-backed client to avoid hitting real URLs.

  • allow_subprocess_fallback (bool) – When True, pipelines that raise UnsupportedOperation during native dispatch fall through to _subprocess_fallback(). Off by default because the fallback invokes bash on a public-repo string, a real (accepted) shell-injection surface. Callers that pin the Installomator commit and trust the pipeline-string corpus can opt in.

  • context (dict | None) – Parsed label dict, so source commands can read sibling variables (downloadURLFromGit reads type/archiveName; echo "${updateFeed}" resolves the prior assignment).

resolve(expression: str | None, *, is_url: bool = False, is_version: bool = False) → Resolved | Unresolvable | InvalidOutput

Resolve a label variable’s value, evaluating shell-style pipelines in Python.

Parameters:
  • expression (str | None) – The label variable value as parsed from the .sh fragment. Plain strings ("121.0") pass through as literals; values shaped $(cmd | cmd | ...) are parsed and evaluated.

  • is_url (bool) – When True, the resolved value is run through looks_like_clean_http_url() before returning. Failures land as InvalidOutput so callers see “got something, rejected it” rather than “no value.” Pass for fields whose projected column gets serialized as Pydantic HttpUrl.

  • is_version (bool) – When True, the resolved value is run through looks_like_clean_version(). A pipeline that succeeds at the shell level but captures an HTML page, a header dump, or an un-filtered multi-line blob is rejected as InvalidOutput rather than stored as a bogus version. Pass for appNewVersion.

Returns:

A Resolved, Unresolvable, or InvalidOutput.

Return type:

ResolveOutcome
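
To make the literal-versus-pipeline distinction concrete, the sketch below splits a $(cmd | cmd | ...) value into stages using only the standard library. It is an illustration under assumptions, not the resolver's own parser: split_pipeline and the _SUBSTITUTION pattern are hypothetical names, and the URL is a placeholder.

```python
import re
import shlex
from typing import Optional

# Hypothetical anchored pattern: the whole value must be one $( ... ) substitution.
_SUBSTITUTION = re.compile(r"^\$\((.*)\)$", re.DOTALL)


def split_pipeline(expression: str) -> Optional[list[list[str]]]:
    """Return the pipeline stages as argv lists, or None for a plain literal."""
    match = _SUBSTITUTION.match(expression.strip())
    if match is None:
        return None  # plain strings such as "121.0" are treated as literals
    # Treat "|" as punctuation so it becomes its own token outside quotes.
    lexer = shlex.shlex(match.group(1), posix=True, punctuation_chars="|")
    lexer.whitespace_split = True
    lexer.commenters = ""  # do not treat "#" as a comment marker
    stages: list[list[str]] = [[]]
    for token in lexer:
        if token == "|":
            stages.append([])  # a bare pipe starts the next stage
        else:
            stages[-1].append(token)
    return stages


stages = split_pipeline(
    "$(curl -fsL https://example.com/latest | grep -E 'dmg|pkg' | head -1)"
)
```

The quoted 'dmg|pkg' argument stays in one piece, which is why a plain str.split("|") would not be enough for this job.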

resolve(expression: str | None, *, http_client: Client | None = None, is_url: bool = False, is_version: bool = False, allow_subprocess_fallback: bool = False, context: dict | None = None) → Resolved | Unresolvable | InvalidOutput

Resolve a single label value with a one-off PipelineResolver.

Convenience wrapper equivalent to PipelineResolver(http_client, allow_subprocess_fallback=...).resolve(expression, is_url=...). Callers resolving many labels in a batch should construct one PipelineResolver and reuse it so a single httpx.Client is shared across all of them.

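Parameters:
  • expression (str | None)

  • http_client (httpx.Client | None)

  • is_url (bool)

  • is_version (bool)

  • allow_subprocess_fallback (bool)

  • context (dict | None)
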
Returns:

A Resolved, Unresolvable, or InvalidOutput.

Return type:

ResolveOutcome

is_shell_expression(value: str | None) → bool

Detect whether an Installomator label value contains shell syntax that needs evaluation rather than being a usable literal.

resolve() handles values shaped exactly like $(... pipeline ...) (its regex is anchored). This helper is broader: it also catches embedded substitutions that resolve() passes through as literals and that callers should treat as unsafe:

  • Pure expressions: $(curl -fsL https://...) or $varname

  • Embedded substitutions: https://example.com$(curl ...) or ${baseURL}/path/to/installer.pkg

Useful as a safety net after resolve() returns a literal value to confirm the literal is genuinely a clean value, not an unresolvable fragment that snuck past the resolver’s anchored pattern.

Parameters:

value (str | None) – A parsed label-fragment value.

Returns:

True if the value contains any shell-expression artifacts.

Return type:

bool
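
The difference between the anchored match and the broader check can be sketched with two regular expressions. Both patterns and both function names are assumptions made for illustration; the real helper may recognize a different set of artifacts (the backtick case here, for example, is a guess).

```python
import re
from typing import Optional

# Assumed patterns, for illustration only.
ANCHORED_SUBSTITUTION = re.compile(r"^\$\(.*\)$", re.DOTALL)
ANY_SHELL_ARTIFACT = re.compile(r"\$\(|\$\{|\$[A-Za-z_]|`")


def is_pure_substitution(value: Optional[str]) -> bool:
    """True only when the whole value is a single $( ... ) expression."""
    return bool(value) and ANCHORED_SUBSTITUTION.match(value.strip()) is not None


def has_shell_artifacts(value: Optional[str]) -> bool:
    """True when shell syntax appears anywhere in the value."""
    return bool(value) and ANY_SHELL_ARTIFACT.search(value) is not None
```

A value such as https://example.com$(curl ...) fails the anchored test, so it would pass through as a literal, yet the broader search still flags it.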

looks_like_clean_http_url(value: str | None) → bool

Sanity-check that value is a single, reasonably-sized http(s) URL safe to store in a column the API later serializes through Pydantic’s HttpUrl type.

Catches three classes of garbage resolve() can produce when a pipeline succeeds at the shell level but the captured output isn’t a usable URL:

  • HTML response bodies: the upstream vendor returned a non-2xx response (404, 400, etc.) but curl didn’t see it as an error, so the response body landed in the value. These typically start with <!doctype or <html.

  • Multi-line concatenations: the Installomator pipeline’s final filter was unsupported (e.g. awk or head -n1), so the full grep output (every matched URL on the page, joined with newlines) landed in the value instead of a single line.

  • Non-http schemes: a handful of Installomator labels still use ftp:// sources. Pydantic’s HttpUrl rejects these and the catalog only documents http(s) URLs.

Also enforces a 2000-character ceiling. Pydantic’s HttpUrl maxes out at 2083 (the IE-era de-facto limit), so leaving 83 chars of headroom avoids edge cases at the boundary.

Parameters:

value (str | None) – Resolved or literal URL candidate.

Returns:

True when the value passes all sanity checks, False otherwise (including for None and empty strings).

Return type:

bool
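
The checks described above can be approximated as follows. This is a minimal sketch built from the prose, not the actual implementation: sketch_clean_http_url is a hypothetical name, and using urllib.parse for the scheme test is an assumption.

```python
from typing import Optional
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2000  # the documented ceiling, below Pydantic's 2083


def sketch_clean_http_url(value: Optional[str]) -> bool:
    """Approximate the documented sanity checks for a URL candidate."""
    if not value:
        return False  # None and empty strings never pass
    if len(value) > MAX_URL_LENGTH:
        return False
    if "\n" in value or "\r" in value:
        return False  # multi-line concatenation of several matches
    try:
        parts = urlsplit(value)
    except ValueError:
        return False  # malformed input such as an unclosed IPv6 bracket
    # HTML bodies have no scheme at all; ftp:// has the wrong one.
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```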

looks_like_clean_version(value: str | None) → bool

Sanity-check that value is a plausible version string, not pipeline garbage.

appNewVersion has no schema-level guard (unlike downloadURL, which Pydantic’s HttpUrl validates downstream), so a pipeline that succeeds at the shell level but captures the wrong thing would otherwise store junk as a version. Empirically that junk is: empty output, whole HTML pages, HTTP header dumps, and un-head’d multi-line lists. Each is rejected here:

  • Empty / whitespace-only: nothing to store; the column should be NULL.

  • Multi-line: a version is a single token. A newline means the final filter (head -1 etc.) was unsupported and the whole match list landed.

  • HTML / markup: an < or > means a page body, not a version.

  • Over-length: a real version is short; _MAX_VERSION_LENGTH caps it well above any legitimate 1.2.3-beta.4+build567 shape.

  • No digit: every version carries a number; a digit-free string is a stray word or label, not a version.

Internal spaces are allowed — a few labels legitimately produce "Build 4200"-style versions, and the multi-line and markup rules already catch the header/HTML dumps that contain spaces.

Parameters:

value (str | None) – Resolved or literal version candidate.

Returns:

True when the value passes all sanity checks, False otherwise (including for None and empty strings).

Return type:

bool
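
The five rejection rules translate into a short chain of checks. The sketch below follows the prose only; sketch_clean_version is a hypothetical name, and the value 64 for the length cap is an assumption, since the real _MAX_VERSION_LENGTH is not stated here.

```python
from typing import Optional

MAX_VERSION_LENGTH = 64  # assumed value; the real cap is not documented here


def sketch_clean_version(value: Optional[str]) -> bool:
    """Approximate the documented sanity checks for a version candidate."""
    if value is None or not value.strip():
        return False  # empty or whitespace-only
    if "\n" in value or "\r" in value:
        return False  # multi-line: the final filter did not run
    if "<" in value or ">" in value:
        return False  # markup means a page body, not a version
    if len(value) > MAX_VERSION_LENGTH:
        return False  # over-length
    return any(ch.isdigit() for ch in value)  # every version carries a number
```

Internal spaces are deliberately not rejected, so a value like "Build 4200" passes.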

Ingest

class FetchPlan(name_to_content: dict[str, str] = <factory>, name_to_blob_sha: dict[str, str] = <factory>, removed: frozenset[str] = <factory>, unchanged: int = 0, missing: int = 0, errored: int = 0)

Outcome of a gated label fetch.

Variables:
  • name_to_content – Raw .sh fragment text for every label that was actually fetched this run (i.e. SHA changed, new, or force=True). Empty when nothing changed upstream.

  • name_to_blob_sha – Full upstream view: every label name with its current blob SHA. Used by the ingest step to persist the SHA even for upserts of unchanged-but-re-fetched rows.

  • removed – Labels that exist in the local DB but are absent from upstream — caller is expected to delete these.

  • unchanged – Count of labels skipped because their SHA matched what’s already stored. Zero on a fresh DB or a --force run.

  • missing – Count of fragments that 404’d during fetch. Should be zero now that discovery is tree-driven (the tree only lists files that exist); kept for defensive logging if upstream removes a file mid-run.

  • errored – Count of fragments that failed with an unexpected error during fetch.
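
The SHA gating behind these fields can be sketched as a comparison between the upstream view and what is stored locally. PlanSketch and plan_fetch are hypothetical names used only for illustration, and the sketch leaves out the network fetch, so it has no missing or errored counts.

```python
from dataclasses import dataclass, field


@dataclass
class PlanSketch:
    """Simplified stand-in for FetchPlan: names to fetch instead of fetched content."""
    to_fetch: list[str] = field(default_factory=list)
    removed: frozenset[str] = frozenset()
    unchanged: int = 0


def plan_fetch(
    upstream: dict[str, str], stored: dict[str, str], force: bool = False
) -> PlanSketch:
    """Compare upstream blob SHAs with stored ones and decide what to fetch."""
    # Fetch when forced, when the label is new, or when its SHA changed.
    to_fetch = sorted(
        name for name, sha in upstream.items() if force or stored.get(name) != sha
    )
    # Labels known locally but gone upstream are candidates for deletion.
    removed = frozenset(stored) - frozenset(upstream)
    unchanged = len(upstream) - len(to_fetch)
    return PlanSketch(to_fetch, removed, unchanged)


upstream = {"alpha": "sha1", "beta": "sha2", "gamma": "sha3"}
stored = {"alpha": "sha1", "beta": "old", "delta": "sha4"}
plan = plan_fetch(upstream, stored)
```

With these inputs, beta (changed) and gamma (new) are fetched, alpha is counted as unchanged, and delta is reported as removed. Passing force=True or an empty stored mapping brings unchanged to zero.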

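Parameters:
  • name_to_content (dict[str, str])

  • name_to_blob_sha (dict[str, str])

  • removed (frozenset[str])

  • unchanged (int)

  • missing (int)

  • errored (int)
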
set_resolve_on_ingest(enabled: bool) → None

Override the PATCHER_API_RESOLVE_INGEST env default at runtime.

Lets the --resolve CLI flag turn resolution on explicitly, sidestepping the shell-export footgun where an unexported env var never reaches the ingest process. The resolution functions read this module global at call time, so setting it before they run takes effect.

Parameters:

enabled (bool)

Return type:

None
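
The override pattern described above, an env-derived module global that is read at call time, looks roughly like this. It is a sketch under assumptions: the set of truthy spellings and the resolution_enabled reader are hypothetical, and only the env variable name and the setter's signature come from this page.

```python
import os

# Module global seeded from the environment once, at import time.
# The accepted truthy spellings here are an assumption.
_RESOLVE_ON_INGEST = os.environ.get("PATCHER_API_RESOLVE_INGEST", "").lower() in {
    "1",
    "true",
    "yes",
}


def set_resolve_on_ingest(enabled: bool) -> None:
    """Override the env-derived default for the rest of the process."""
    global _RESOLVE_ON_INGEST
    _RESOLVE_ON_INGEST = enabled


def resolution_enabled() -> bool:
    """Hypothetical reader: looks up the global at call time, not import time."""
    return _RESOLVE_ON_INGEST
```

Because the reader consults the global on every call, a CLI flag handler can call set_resolve_on_ingest(True) after import and still affect every later resolution step.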