Plugin-Driven Entity Surface Notifications (Plugin API v7)
Motivation
A plugin that implements UpdateProvider (or any other plugin
that mutates the entity surface at runtime) has two mechanisms available
in v6:
declare the entity in the base manifest file (static, known at gateway startup).
rely on runtime discovery to notice a new ROS 2 node appearing on the graph.
Neither covers the OTA install case. The installer drops a new app on
disk, starts it, and wants the operator’s client (Web UI, MCP tool,
Foxglove panel) to see the change immediately. Runtime discovery
catches the new node within a few seconds but shows it as
source: orphan - it is not attached to any component so the entity
tree looks inconsistent. On plugin rollback the opposite problem
appears: the manifest entry (if it existed) lingers even though the
app is gone.
v7 adds two pieces that together close the gap:
PluginContext::notify_entities_changed(EntityChangeScope) - a plugin-driven refresh trigger.
discovery.manifest.fragments_dir - a directory of drop-in manifest YAML chunks that the gateway scans on every manifest load / reload.
Neither is OTA-specific. The same pair is usable by rosbag injectors, dynamic config deployers, hot-reloadable adapters - any plugin whose correct behaviour depends on adding or removing entities at runtime.
Lifecycle
(1) Plugin installs new app on disk
(2) Plugin writes /path/to/fragments/<deploy-id>.yaml declaring
    the app (is_located_on, ros_binding, ...)
(3) Plugin starts the app process
(4) Plugin calls
      ctx.notify_entities_changed(
          EntityChangeScope::for_component("my-ecu"));
(5) Gateway re-parses base manifest and re-scans fragments_dir,
    rebuilds the entity cache, runs a full discovery cycle
(6) Client's next GET /apps / GET /components/{id}/apps reflects
    the new app
Rollback reverses the sequence:
(7) Plugin stops the app process
(8) Plugin deletes /path/to/fragments/<deploy-id>.yaml
(9) Plugin calls
      ctx.notify_entities_changed(
          EntityChangeScope::for_component("my-ecu"));
(10) Gateway reloads, app is no longer in the merged manifest,
     cache drops it, clients refresh to find it gone
The scope hint in steps (4) / (9) is informational in v7: the gateway
always performs a full refresh_cache(). A future optimisation may
restrict the pass to the named area or component subtree; the plugin
API does not change when that lands.
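The install and rollback sequences above can be sketched from the plugin's side roughly as follows. This is an illustration under stated assumptions, not the gateway's real headers: EntityChangeScope and PluginContext are reduced to stand-in stubs, and RecordingContext, install_app and rollback_app are hypothetical names introduced here. Starting and stopping the app process is left as a comment, and the fragment body is passed in by the caller because the fragment schema is not reproduced in this section.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Stand-in for the real EntityChangeScope; only the factory used in the text.
struct EntityChangeScope {
  std::string component_id;
  static EntityChangeScope for_component(std::string id) {
    return EntityChangeScope{std::move(id)};
  }
};

// Stand-in for the real PluginContext (assumed shape, default no-op).
struct PluginContext {
  virtual ~PluginContext() = default;
  virtual void notify_entities_changed(const EntityChangeScope&) {}
};

// Hypothetical test double that records which components were notified.
struct RecordingContext : PluginContext {
  std::vector<std::string> notified;
  void notify_entities_changed(const EntityChangeScope& scope) override {
    notified.push_back(scope.component_id);
  }
};

// Steps (2)-(4): publish the fragment, start the app, notify the gateway.
inline void install_app(PluginContext& ctx, const fs::path& fragments_dir,
                        const std::string& deploy_id,
                        const std::string& component_id,
                        const std::string& fragment_yaml) {
  fs::create_directories(fragments_dir);  // the directory may be created lazily
  const fs::path tmp = fragments_dir / (".tmp-" + deploy_id + ".yaml");
  const fs::path dst = fragments_dir / (deploy_id + ".yaml");
  {
    std::ofstream out(tmp, std::ios::trunc);
    out << fragment_yaml;
  }  // stream closed here; a production plugin would also fsync before renaming
  fs::rename(tmp, dst);
  // (3) start the app process here (omitted in this sketch)
  ctx.notify_entities_changed(EntityChangeScope::for_component(component_id));
}

// Steps (7)-(9): stop the app, delete the fragment, notify the gateway.
inline void rollback_app(PluginContext& ctx, const fs::path& fragments_dir,
                         const std::string& deploy_id,
                         const std::string& component_id) {
  // (7) stop the app process here (omitted in this sketch)
  fs::remove(fragments_dir / (deploy_id + ".yaml"));
  ctx.notify_entities_changed(EntityChangeScope::for_component(component_id));
}
```

Both directions end with the same notification call, which matches the note that the scope argument is only a hint in v7.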
Fragment merge rules
Files in discovery.manifest.fragments_dir are parsed with
ManifestParser::parse_fragment_file and merged on top of the
loaded base manifest before validation runs.
Allowed in a fragment:
apps - appended to Manifest::apps
components - appended to Manifest::components
functions - appended to Manifest::functions
Forbidden in a fragment (owned by the base manifest):
areas
metadata (any field - name, description, version, created_at)
discovery
scripts
capabilities (vendor extensions)
lock_overrides
A fragment that declares any forbidden top-level field fails the load
with a FRAGMENT_FORBIDDEN_FIELD validation error. Each forbidden
field in a fragment is reported separately so a single load reports
every violation, not just the first. manifest_version is optional
in fragments - when omitted a synthetic "1.0" is injected before
parsing.
File ordering is deterministic: fragment files in the directory are sorted by full path before being merged. Duplicate IDs across the combined manifest (base + every fragment) are caught by the normal validator run and cause the load to fail with the same error the user would see from a single file, plus the offending fragment path in the error message.
A missing fragments directory is not an error - the plugin can create the directory lazily on first install.
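The forbidden-field rule and the deterministic ordering can be illustrated with a small sketch. It assumes the fragment's top-level keys have already been extracted by a YAML parser, and it assumes that "sorted by full path" means a plain lexicographic string sort. FragmentViolation, check_fragment_keys and merge_order are hypothetical names, not part of ManifestParser.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Top-level keys owned by the base manifest, per the list above.
inline const std::set<std::string>& forbidden_fragment_keys() {
  static const std::set<std::string> keys = {
      "areas", "metadata", "discovery",
      "scripts", "capabilities", "lock_overrides"};
  return keys;
}

// Hypothetical record for one reported violation.
struct FragmentViolation {
  std::string path;   // offending fragment file
  std::string field;  // forbidden top-level key
  std::string code;   // always "FRAGMENT_FORBIDDEN_FIELD" in this sketch
};

// Reports every forbidden key separately instead of stopping at the first.
inline std::vector<FragmentViolation> check_fragment_keys(
    const std::string& path, const std::vector<std::string>& top_level_keys) {
  std::vector<FragmentViolation> violations;
  for (const auto& key : top_level_keys) {
    if (forbidden_fragment_keys().count(key) != 0) {
      violations.push_back({path, key, "FRAGMENT_FORBIDDEN_FIELD"});
    }
  }
  return violations;
}

// Deterministic merge order: sort fragment files by full path.
// Assumption: lexicographic comparison of the path strings.
inline std::vector<std::string> merge_order(std::vector<std::string> paths) {
  std::sort(paths.begin(), paths.end());
  return paths;
}
```

Under that ordering assumption, numeric prefixes in file names (for example 10-..., 20-...) would give a plugin predictable merge order.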
All-or-nothing fragment contract
Fragment loading is intentionally atomic: if ANY fragment in
fragments_dir fails to parse, exceeds the size limit, declares a
forbidden top-level field, or causes the merged manifest to fail
validation, the entire load_manifest / reload_manifest call
returns failure. The previously-loaded manifest stays active until the
next successful reload. One broken fragment therefore blocks every
fragment in that directory (including valid ones) from taking effect.
The contract is deliberate - a partially-applied fragment set is worse than the pre-existing manifest, because it produces a client-visible entity tree that matches neither the old deployment nor the intended new one. Plugins that want independent failure domains should use separate fragment directories per domain.
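The all-or-nothing behaviour amounts to building a candidate manifest off to the side and swapping it in only if every fragment succeeded. A minimal sketch of that pattern follows; Manifest, FragmentResult and ManifestStore are simplified, hypothetical types, and an empty optional stands in for any of the failure causes listed above (parse error, size limit, forbidden field, validation failure).

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Simplified manifest: just a list of app IDs.
struct Manifest {
  std::vector<std::string> app_ids;
};

// One fragment's outcome: the apps it contributes, or std::nullopt on failure.
using FragmentResult = std::optional<std::vector<std::string>>;

class ManifestStore {
 public:
  explicit ManifestStore(Manifest initial) : active_(std::move(initial)) {}

  // Merges all fragments on top of the base. If any fragment failed, the
  // active manifest is left untouched and false is returned.
  bool reload(const Manifest& base,
              const std::vector<FragmentResult>& fragments) {
    Manifest candidate = base;
    for (const auto& fragment : fragments) {
      if (!fragment) {
        return false;  // one broken fragment blocks the whole set
      }
      candidate.app_ids.insert(candidate.app_ids.end(),
                               fragment->begin(), fragment->end());
    }
    active_ = std::move(candidate);  // commit only after every fragment passed
    return true;
  }

  const Manifest& active() const { return active_; }

 private:
  Manifest active_;
};
```

Because the candidate is discarded on failure, a later reload with the broken fragment fixed or removed starts again from the base manifest rather than from a half-merged state.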
Plugin write contract (TOCTOU and size safety)
To keep the all-or-nothing semantics from being tripped by a racing
reader, plugins that write into fragments_dir MUST publish each
fragment atomically:
write the final content to fragments_dir/.tmp-<id>.yaml and fsync it;
rename() to fragments_dir/<id>.yaml (POSIX rename is atomic within one filesystem).
The gateway scans the directory on every reload, and
notify_entities_changed runs the reload synchronously on the caller's
thread, so a partial write observed between open() and close()
can fail the entire merge if the plugin publishes in-place.
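The two publishing steps can be sketched with plain POSIX calls. publish_fragment is a hypothetical helper name; the sketch assumes a POSIX system and that the temporary and final paths are on the same filesystem, which holds when both live in fragments_dir.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <string>

// Writes `content` to fragments_dir/.tmp-<id>.yaml, fsyncs it, then renames
// it to fragments_dir/<id>.yaml. Returns true only if every step succeeded.
inline bool publish_fragment(const std::string& fragments_dir,
                             const std::string& id,
                             const std::string& content) {
  const std::string tmp_path = fragments_dir + "/.tmp-" + id + ".yaml";
  const std::string final_path = fragments_dir + "/" + id + ".yaml";

  const int fd = ::open(tmp_path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0) {
    return false;
  }

  bool ok = true;
  const char* cursor = content.data();
  std::size_t remaining = content.size();
  while (remaining > 0) {
    const ssize_t written = ::write(fd, cursor, remaining);
    if (written < 0) {
      if (errno == EINTR) {
        continue;  // interrupted by a signal, retry the write
      }
      ok = false;
      break;
    }
    cursor += written;
    remaining -= static_cast<std::size_t>(written);
  }

  // Flush file contents to stable storage before the name becomes visible.
  if (ok && ::fsync(fd) != 0) {
    ok = false;
  }
  if (::close(fd) != 0) {
    ok = false;
  }

  // Atomic publish: readers see either no final file or the complete one.
  if (ok && std::rename(tmp_path.c_str(), final_path.c_str()) != 0) {
    ok = false;
  }
  if (!ok) {
    ::unlink(tmp_path.c_str());  // best-effort cleanup of the temp file
  }
  return ok;
}
```

A plugin would call this before notify_entities_changed, so that the synchronous reload only ever observes the fully written fragment.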
Two other reader-side safeguards protect the gateway from malformed inputs:
Size cap. Each fragment file is rejected before it is read into memory if it exceeds ManifestParser::kMaxFragmentBytes (1 MiB). This protects against a misconfigured fragments_dir pointing at a log or data file.
Symlink scoping. Symlinks ARE followed (k8s ConfigMap mounts rely on that), but any directory entry whose resolved real path is not a descendant of the canonical fragments_dir is skipped with a warning. This stops an evil.yaml -> /etc/shadow escape while keeping ConfigMap use cases working.
Compatibility
PLUGIN_API_VERSION bumped from v6 to v7.
PluginContext::notify_entities_changed has a default no-op implementation, so plugin source code written against v6 compiles unchanged against v7 headers. No code changes are needed.
Binary compatibility is NOT provided: the plugin loader compares the exported plugin_api_version() against the gateway's PLUGIN_API_VERSION with strict equality, so a .so pre-compiled against v6 IS rejected. In-tree plugins pick up the bump automatically because they return PLUGIN_API_VERSION from the shared header; out-of-tree plugins must be recompiled.
A v7 gateway loading a recompiled v6 plugin is fully functional; the new lifecycle hooks are off by default until the plugin opts in by calling notify_entities_changed.
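The strict-equality check described above can be pictured as follows. The sketch assumes the version is a plain int and that the exported symbol has already been resolved to a function pointer; VersionFn and accept_plugin are hypothetical names, and the real loader's symbol lookup and error reporting are not shown.

```cpp
// Assumed representation: the API version as a plain integer.
constexpr int PLUGIN_API_VERSION = 7;

// Signature assumed for the plugin's exported plugin_api_version().
using VersionFn = int (*)();

// Accepts a plugin only when its compiled-in version equals the gateway's.
// Older AND newer versions are both rejected; there is no range check.
inline bool accept_plugin(VersionFn exported_version) {
  return exported_version != nullptr &&
         exported_version() == PLUGIN_API_VERSION;
}
```

A plugin that returns PLUGIN_API_VERSION from the shared header passes this check after a rebuild, while a binary built against v6 still returns 6 and is refused.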