157 lines
8.3 KiB
Markdown
157 lines
8.3 KiB
Markdown
# Screens Agent Streamer — Implementation Plan
|
||
|
||
**Source doc:** [screens_connection_updates.md](../screens_connection_updates.md)
|
||
**Created:** 2026-04-13
|
||
**Goal:** Add a thin WebSocket-based IQ streaming agent to `ria-toolkit-oss`, alongside the existing long-poll `NodeAgent`, and wire up the RIA Hub server to consume it.
|
||
|
||
---
|
||
|
||
## Architectural Decision
|
||
|
||
The existing [src/ria_toolkit_oss/agent.py](../src/ria_toolkit_oss/agent.py) (`NodeAgent`) uses HTTP long-polling and runs ONNX inference locally. It stays as-is.
|
||
|
||
The new **streamer agent** described in `screens_connection_updates.md` is a *different* execution mode — thin, WebSocket-based, server-driven inference. It will be added as a new submodule and exposed as a new CLI subcommand. Both modes coexist; users pick one based on deployment needs.
|
||
|
||
---
|
||
|
||
## Part A — ria-toolkit-oss (this repo)
|
||
|
||
### Phase 1 — SDR foundation
|
||
|
||
| # | Task | File(s) | Priority |
|
||
|---|------|---------|----------|
|
||
| 1 | Add `detect_available() -> dict[str, type]` that probes every driver without importing GUI deps | [src/ria_toolkit_oss/sdr/__init__.py](../src/ria_toolkit_oss/sdr/__init__.py) | P0 |
|
||
| 2 | Audit each SDR driver for headless cleanliness (no matplotlib/Qt at import time) | `src/ria_toolkit_oss/sdr/{pluto,hackrf,rtlsdr,usrp,blade,thinkrf}.py` | P1 |
|
||
| 3 | Raise typed `SdrDisconnectedError` from `radio.rx()` on USB drop instead of crashing | `src/ria_toolkit_oss/sdr/sdr.py` + drivers | P1 |
|
||
| 4 | Validate `load_recording` against SigMF captures from each radio type | `tests/io/` | P2 |
|
||
|
||
### Phase 2 — Agent package restructure (non-breaking)
|
||
|
||
Promote the existing module to a package and add the streamer next to it:
|
||
|
||
```
|
||
src/ria_toolkit_oss/
|
||
├── agent.py # DELETE after move
|
||
└── agent/
|
||
├── __init__.py # re-export NodeAgent for back-compat
|
||
├── legacy_executor.py # former agent.py (NodeAgent, unchanged behavior)
|
||
├── streamer.py # NEW — thin IQ streamer loop
|
||
├── ws_client.py # NEW — persistent WS client + heartbeat + reconnect
|
||
├── hardware.py # NEW — wraps sdr.detect_available()
|
||
├── config.py # shared: ~/.ria/agent.json load/save
|
||
└── cli.py # unified CLI with subcommands
|
||
```
|
||
|
||
Back-compat requirement: `from ria_toolkit_oss.agent import NodeAgent` must keep working. Keep the `ria-agent` console script entry point; expose both modes via subcommands.
|
||
|
||
### Phase 3 — Streamer implementation
|
||
|
||
| # | Task | File | Priority |
|
||
|---|------|------|----------|
|
||
| 5 | `ws_client.py` — persistent WebSocket, auto-reconnect, heartbeat loop | `src/ria_toolkit_oss/agent/ws_client.py` | P0 |
|
||
| 6 | `streamer.py` — main loop: receive `start` → open SDR → `radio.rx()` → send binary IQ → handle `stop`/`configure` | `src/ria_toolkit_oss/agent/streamer.py` | P0 |
|
||
| 7 | `hardware.py` — heartbeat payload builder, uses `sdr.detect_available()` | `src/ria_toolkit_oss/agent/hardware.py` | P0 |
|
||
| 8 | `config.py` — `~/.ria/agent.json` read/write, registration token storage | `src/ria_toolkit_oss/agent/config.py` | P1 |
|
||
| 9 | CLI subcommands: `ria-agent register`, `detect`, `stream` (new), `run` (legacy long-poll) | `src/ria_toolkit_oss/agent/cli.py` | P0 |
|
||
| 10 | Add `websockets` to dependencies | [pyproject.toml](../pyproject.toml) | P0 |
|
||
|
||
### Phase 4 — WebSocket protocol
|
||
|
||
Implement exactly the messages from `screens_connection_updates.md` §"WebSocket Protocol":
|
||
|
||
**Agent → Server (JSON control):**
|
||
```json
|
||
{"type": "heartbeat", "hardware": ["pluto", "hackrf"], "status": "idle"}
|
||
{"type": "status", "status": "streaming", "app_id": "abc"}
|
||
{"type": "error", "app_id": "abc", "message": "USB device disconnected"}
|
||
```
|
||
|
||
**Agent → Server (binary data):** raw interleaved float32 IQ bytes per `radio.rx()` call.
|
||
|
||
**Server → Agent (JSON control):**
|
||
```json
|
||
{"type": "start", "app_id": "...", "radio_config": {"device": "pluto", ...}}
|
||
{"type": "stop", "app_id": "..."}
|
||
{"type": "configure", "app_id": "...", "radio_config": {"center_frequency": 915000000}}
|
||
```
|
||
|
||
The agent does **not** interpret manifests, models, or preprocessing — it just applies `radio_config` to `ria_toolkit_oss.sdr.<device>`.
|
||
|
||
### Phase 5 — Tests
|
||
|
||
| # | Task | Location |
|
||
|---|------|----------|
|
||
| 11 | Unit: streamer loop against `sdr.mock` + fake WS | `tests/agent/test_streamer.py` |
|
||
| 12 | Unit: `ws_client` reconnect/heartbeat timing | `tests/agent/test_ws_client.py` |
|
||
| 13 | Unit: `hardware.detect_available()` and heartbeat payload | `tests/agent/test_hardware.py` |
|
||
| 14 | Integration: local `websockets` server → mock SDR → full start/stream/stop cycle | `tests/agent/test_integration.py` |
|
||
| 15 | Regression: `NodeAgent` still importable and functional | `tests/agent/test_legacy.py` |
|
||
|
||
---
|
||
|
||
## Part B — RIA Hub (hand-off to separate session)
|
||
|
||
> **Give this section to Claude in the ria-hub repo session.** It depends on Part A shipping first (or at minimum, stubbed protocol).
|
||
|
||
### Server-side tasks
|
||
|
||
| # | Task | File / Area | Priority |
|
||
|---|------|-------------|----------|
|
||
| B1 | Implement `AgentDataSource` (DataSource ABC, reads IQ from WebSocket connection) and register it in `build_data_source()` | `controller/app/modules/screens/data_sources.py` | P0 |
|
||
| B2 | Add `"agent"` to `dataSource.type` enum in manifest schema; update Pydantic/JSON schema validators | `controller/app/modules/screens/graph_derivation.py` + manifest schema files | P0 |
|
||
| B3 | Add agent WebSocket endpoint `POST /api/agent/ws` — accepts agent connections, auth via registration token, bridges the connection to the Celery task's `AgentDataSource` | `controller/app/modules/agent/routes.py` (new) | P0 |
|
||
| B4 | Agent registry: MongoDB collection tracking `agent_id`, hardware list, last heartbeat, online status, registration tokens | `controller/app/modules/agent/models.py` (new) | P1 |
|
||
| B5 | Registration endpoint `POST /api/agent/register` returning agent credentials for `~/.ria/agent.json` | `controller/app/modules/agent/routes.py` | P1 |
|
||
| B6 | Celery task wiring: when manifest `dataSource.type == "agent"`, look up the connected agent by `agent_id`, forward `radio_config` to it, and feed received IQ chunks into the inference loop via `AgentDataSource.next_chunk()` | `controller/app/modules/screens/tasks.py` | P0 |
|
||
| B7 | Device/agent picker UI in Screens app config (Vue 3) | frontend screens panel | P2 |
|
||
| B8 | Agents list/status admin view | frontend admin area | P2 |
|
||
|
||
### Protocol contract (must match Part A exactly)
|
||
|
||
See `screens_connection_updates.md` §"WebSocket Protocol". Key invariants:
|
||
- Binary frames are interleaved float32 IQ, one frame per `radio.rx()` call.
|
||
- `radio_config` is derived from the manifest's `dataSource.params` and forwarded verbatim.
|
||
- The server sends `configure` for retune-without-stop flow; the agent is responsible for applying it at the next capture boundary.
|
||
|
||
### Manifest example (new mode)
|
||
|
||
```json
|
||
{
|
||
"dataSource": {
|
||
"type": "agent",
|
||
"device": "pluto",
|
||
"agent_id": "agent-abc123",
|
||
"params": {
|
||
"identifier": "ip:192.168.3.1",
|
||
"sample_rate": 1000000,
|
||
"center_frequency": 2450000000,
|
||
"gain": 40,
|
||
"buffer_size": 1024
|
||
}
|
||
},
|
||
"preprocess": "magnitude_phase_window_stats",
|
||
"config": {"inference": {"knownDevices": [], "interval": 1}}
|
||
}
|
||
```
|
||
|
||
### Pipeline invariant
|
||
|
||
**No changes to `inference_core.py`, `preprocessors.py`, or `run_onnx_chain_loop()`.** `AgentDataSource` is a drop-in for `SdrDataSource`; everything downstream is unchanged.
|
||
|
||
---
|
||
|
||
## Rollout Order
|
||
|
||
1. Part A Phase 1 (SDR foundation) — safe to land independently.
|
||
2. Part A Phases 2–3 (agent package + streamer) — lands the `ria-agent stream` CLI; no server needed to build/test against a mock WS.
|
||
3. Part B B1, B2, B3, B6 — minimum viable server path; lets a real agent connect end-to-end.
|
||
4. Part A Phase 5 integration test against a dev RIA Hub instance.
|
||
5. Part B B4, B5 (registry + registration) — hardens multi-agent deployments.
|
||
6. Part B B7, B8 (UI) — operator-facing polish.
|
||
|
||
## Open Questions
|
||
|
||
- Auth model for the WebSocket handshake: bearer token in header, query param, or first-message auth frame?
|
||
- Should `configure` apply mid-buffer or only at capture boundaries? (Doc implies boundaries; confirm for retune latency budget.)
|
||
- Backpressure policy when the server's inference loop is slower than the agent's `rx()` cadence — drop frames, queue, or pause?
|