Defining the chiller control plane
Three months on one big thing — porting the legacy chiller-plant control stack onto a profile-based configuration system, and re-platforming it for alto-cero-edge-infra.
Executive summary
Three months of work on one big thing: porting the legacy chiller_optimization chiller-plant control stack onto a new profile-based configuration system that mirrors the DoHome air-side product, and re-platforming it for alto-cero-edge-infra.
The work decomposes into:
- A new entity model: `Blueprint → Automation → Profile → Calendar`, replacing the legacy "each agent owns its own settings page" pattern with a 3-level (automation × profile × date) configuration tree. Documented in 17 files under `docs/architecture/` with typed JSON references in `docs/architecture/jsonb/`.
- A Django backend app: `core/backend/alto/control_automation/` with `Automation`, `Blueprint`, `ActionEvent`, `State` models, JSONB-based config (not 12 per-blueprint tables), and a full DRF REST surface.
- Eight Volttron agents at v4.0.0: all 8 profile-aware chiller agents (`ChillerAddSubtract`, `ChillerPlantSchedule`, `ChillerSequence`, `CustomSchedule`, `DynamicSetpoint`, `PIDControl`, `SCHPControl`, `SmartCT`) reading config from Supabase, writing state + action events back, and emitting commands via `action_request/…` pubsub.
- A frontend rebuild: five-tab Profile Manager (Today / Profiles / Conditions / Metadata / Diagnostics), eight V2 modals (one per blueprint), AutopilotCard rewired to Supabase, and the midnight-copy override pattern for "edit for today only" UX.
- Infrastructure plumbing: Supabase agent gains `add/update/list_action_events` RPCs, a docker-compose bridge overlay for cross-network reach, energy-endpoint hardening (503/504 instead of 500), and the `infra.Supabase` data plane renamed to singular tables.
On dev, the work is concentrated in one foundation commit on 2026-05-14 that lands the entire end-to-end build atop Andaman's upstream main.
The problem this work solves
Before this work, the chiller plant configuration was fragmented:
- Chiller sequence used one schedule mechanism.
- Smart CT and PID controllers used a different settings mechanism.
- There was no concept of a "Weekday vs Weekend vs Holiday" profile — every change was a permanent global change.
- Customers had to call us to change anything by-date, because there was no calendar layer.
"Pre-configured operation profiles that customers can select / customize (like DoHome). The proper flow should be a 3-level system — Automation → Profile → Calendar Date."
That is what shipped.
Architecture — what got built
The new system is one consistent hierarchy across all chiller automations. The full board is on Miro; the diagrams below are extracts.
3.1 · Entity relationships
The system is built on seven entities. Blueprints define the kind of automation (e.g. "Chiller Sequence", "PID Speed Control"). Automations are configured instances of a blueprint. Profiles are named groups of automations ("Weekday", "Weekend"). Profile memberships bind automations to profiles, each with a priority. Calendars assign a profile to each date. States hold the live runtime data each automation reports. Action Events are the timeseries log of what each automation did.
This is the contract the entire chiller plant runs on top of — every control feature, present or future, lives somewhere on this diagram.
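As a rough illustration of how these entities relate, the sketch below models them as plain Python dataclasses. Every field name, key, and equipment ID here is an assumption made for illustration; the actual Django models in `core/backend/alto/control_automation/` may look quite different.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Any


@dataclass
class Blueprint:
    key: str  # hypothetical identifier, e.g. "chiller_sequence"
    default_config: dict[str, Any] = field(default_factory=dict)


@dataclass
class Automation:
    id: int
    blueprint: Blueprint  # an automation is a configured instance of a blueprint
    site_id: str
    config: dict[str, Any] = field(default_factory=dict)


@dataclass
class Profile:
    name: str
    # (automation, priority) pairs stand in for the membership junction table
    members: list[tuple[Automation, int]] = field(default_factory=list)


@dataclass
class Calendar:
    site_id: str
    by_date: dict[date, Profile] = field(default_factory=dict)


@dataclass
class State:
    state_type: str
    site_id: str
    data: dict[str, Any] = field(default_factory=dict)


@dataclass
class ActionEvent:
    automation_id: int
    at: datetime
    action: str


# Wire one automation into one profile and assign that profile to a date.
seq = Blueprint("chiller_sequence")
weekday_seq = Automation(1, seq, "KSPO", {"order": ["CH-1", "CH-2"]})
weekday = Profile("Weekday", [(weekday_seq, 1)])
cal = Calendar("KSPO", {date(2026, 6, 1): weekday})
```

Reading from the calendar down to the blueprint follows the same direction as the hierarchy described above: date, then profile, then automation, then blueprint.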
3.2 · Blueprint catalog (8 control types)
Eight blueprints cover every chiller-side control surface currently in production. Each blueprint declares the shape of its metadata (what site-specific equipment it needs) and its default config (what the operator can tune).
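To make the metadata / default-config split concrete, here is a minimal sketch of what a blueprint entry and its two checks could look like. The dictionary layout, the field names (`pump_ids`, `feedback_point`, `kp`, `ki`, `min_speed_pct`), and the shallow-merge rule are all assumptions for illustration, not the shipped schema.

```python
# Hypothetical blueprint entry: field names and values are illustrative only.
PID_BLUEPRINT = {
    "name": "PID Speed Control",
    # Shape of the site-specific equipment metadata the blueprint needs.
    "metadata_schema": {"pump_ids": list, "feedback_point": str},
    # Values the operator can tune; used when an automation sets nothing.
    "default_config": {"kp": 1.0, "ki": 0.1, "min_speed_pct": 30},
}


def validate_metadata(blueprint, metadata):
    """Return a list of problems; an empty list means the metadata fits."""
    problems = []
    for key, expected_type in blueprint["metadata_schema"].items():
        if key not in metadata:
            problems.append(f"missing: {key}")
        elif not isinstance(metadata[key], expected_type):
            problems.append(f"wrong type: {key}")
    return problems


def effective_config(blueprint, overrides):
    """Blueprint defaults first, then the automation's tuned values on top."""
    return {**blueprint["default_config"], **overrides}
```

Under these assumptions, `effective_config(PID_BLUEPRINT, {"kp": 2.5})` keeps the default `ki` and `min_speed_pct` and replaces only `kp`.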
3.3 · An end-to-end example — the KSPO site
The remaining tables show how a real customer (KSPO) is configured under the new model. They illustrate the same data the operator now sees in the UI.
Runtime state rows — each agent reports its live state to a shared row keyed by state_type and site_id. Multiple automations of the same kind share one state row; cross-automation reads happen by querying that row.
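The shared-row idea can be sketched with an in-memory stand-in for the state table. The class, method names, and field names below are hypothetical; the real agents write to Supabase, which this sketch does not model.

```python
from typing import Any


class StateStore:
    """In-memory stand-in for state rows keyed by (state_type, site_id)."""

    def __init__(self):
        self._rows: dict[tuple[str, str], dict[str, Any]] = {}

    def report(self, state_type, site_id, **fields):
        # Merge into the shared row, so automations of the same kind
        # update one row instead of each creating their own.
        row = self._rows.setdefault((state_type, site_id), {})
        row.update(fields)

    def read(self, state_type, site_id):
        # Cross-automation read: return a copy of the shared row.
        return dict(self._rows.get((state_type, site_id), {}))


store = StateStore()
# Two automations of the same kind report into the same row.
store.report("chiller_sequence", "KSPO", running=["CH-1"])
store.report("chiller_sequence", "KSPO", standby=["CH-2"])
```

After both reports, a single `store.read("chiller_sequence", "KSPO")` returns the fields from both automations.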
Automations — KSPO runs 19: three Chiller Sequences (Monday, other weekdays, weekends), two Schedules, two Dynamic Setpoints, one Add/Subtract, one Smart CT, five Custom Schedules, three PID Speed Controllers (PCHP, CT, CDP), and two SCHP staging configs.
Profiles — three operating modes for KSPO.
Profile membership — the junction table that binds automations to profiles, with priority. Custom Schedules sit at priority 100 (they override normal automations).
Calendar — assigns a default profile per day-type for the whole site. Specific dates can override.
3.4 · How the system resolves a date to a control plan
When a date arrives, the agent walks the hierarchy: specific date override → calendar day-type → profile → automations sorted by priority → highest priority wins per device + datapoint. Custom Schedules at priority 100 always override the normal automations (1–9). This is one of the key gains over the legacy model — operators can write a one-off override without touching the baseline configuration.
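The resolution walk described above can be sketched in a few lines. The lookup tables, automation names, device IDs, and values are hypothetical illustration data, and the real agents may implement tie-breaking and day-type detection differently.

```python
from datetime import date

# All names and values below are hypothetical illustration data.
DATE_OVERRIDES = {date(2026, 6, 3): "Holiday"}
DAY_TYPE_PROFILE = {"weekday": "Weekday", "weekend": "Weekend"}

# profile -> list of (priority, automation, {(device, datapoint): value})
MEMBERSHIP = {
    "Weekday": [
        (1, "chiller_schedule", {("CH-1", "enable"): 1, ("CH-2", "enable"): 1}),
        (100, "custom_schedule", {("CH-2", "enable"): 0}),
    ],
    "Weekend": [(1, "chiller_schedule", {("CH-1", "enable"): 1})],
    "Holiday": [(1, "chiller_schedule", {("CH-1", "enable"): 0})],
}


def resolve_profile(day):
    """A specific date override wins; otherwise fall back to the day-type."""
    if day in DATE_OVERRIDES:
        return DATE_OVERRIDES[day]
    day_type = "weekend" if day.weekday() >= 5 else "weekday"
    return DAY_TYPE_PROFILE[day_type]


def control_plan(day):
    """Highest priority wins per (device, datapoint)."""
    plan = {}
    members = sorted(MEMBERSHIP[resolve_profile(day)], key=lambda m: m[0])
    # Ascending order, so higher-priority writes overwrite lower ones.
    for _priority, automation, commands in members:
        for target, value in commands.items():
            plan[target] = (value, automation)
    return plan
```

In this sample data, the priority-100 custom schedule overrides only `CH-2`'s `enable` point on a weekday; `CH-1` still follows the baseline schedule, which is the "one-off override without touching the baseline" behaviour.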
3.5 · Schema design (illustrative)
Two examples of the typed schemas that define what each blueprint accepts — included so the reviewer can see the shape, not to be read in detail.
New profile-based control UI
The UI is one consolidated Profile Manager modal opened from the Autopilot card. Inside: Today (live read-only view), Calendar (assign profiles to dates), Metadata (per-site equipment ontology), Diagnostics (what the agent currently sees).
4.1 · Autopilot entry point
The autopilot widget on the chiller plant page surfaces the active profile, every automation row, and a status pill. Click the gear icon on any row to deep-link into that automation's tab.
4.2 · Today tab — live plant view
The Today tab is the operator's primary read-only surface. It shows the active profile, the live chiller sequence with running / standby / excluded status per priority, the live schedule timeline with a "now" marker, the lead-lag rotation timer, and any custom schedules layered on top.
Below that — stage up / stage down conditions (with live status + threshold), Smart CT groups (with current load tier and CDS setpoint), SCHP pump status (per zone + type), and PID controllers.
4.3 · Metadata tab — single source of equipment truth
Metadata is where the site's equipment ontology lives. Every modal in the system reads from here — equipment names are never manually typed inside automation modals. One source of truth, one place to maintain.
4.4 · Calendar tab — assign profiles to dates
The Calendar tab is how operators schedule the entire plant for the month. Quick actions for "All weekdays", "All weekends", or "Clear", and drag-select for date ranges. The picked profile is applied to the selection on Apply.
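A minimal sketch of the quick-action logic, assuming the calendar is a simple date-to-profile mapping. The function names and the mapping shape are hypothetical; the actual frontend and backend may represent selections differently.

```python
import calendar
from datetime import date


def month_dates(year, month, kind):
    """Dates in a month for a quick action: 'weekdays' or 'weekends'."""
    days_in_month = calendar.monthrange(year, month)[1]
    all_days = [date(year, month, d) for d in range(1, days_in_month + 1)]
    want_weekend = kind == "weekends"
    # weekday() returns 5 for Saturday and 6 for Sunday.
    return [d for d in all_days if (d.weekday() >= 5) == want_weekend]


def apply_profile(assignments, selection, profile):
    """Return a new mapping with `profile` set on every selected date."""
    updated = dict(assignments)
    for d in selection:
        updated[d] = profile
    return updated


# "All weekdays" then "All weekends" for June 2026.
june = apply_profile({}, month_dates(2026, 6, "weekdays"), "Weekday")
june = apply_profile(june, month_dates(2026, 6, "weekends"), "Weekend")
```

Returning a new mapping rather than mutating in place mirrors the "applied on Apply" behaviour: nothing changes until the selection is committed.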
4.5 · Profile editor — canonical configuration
The Profile editor is the only place where canonical (permanent) edits happen. Edits made on the Today tab are routed to a one-day override copy of the profile that auto-purges at midnight — this protects the baseline configuration from accidental permanent changes.
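The midnight-copy pattern can be sketched as copy-on-first-edit plus a purge step. The class, method names, and the `chw_setpoint_c` field are assumptions for illustration; how the shipped system stores and purges override copies is not shown here.

```python
import copy
from datetime import date


class ProfileStore:
    def __init__(self, profiles):
        self.canonical = profiles  # profile name -> config dict (baseline)
        self.overrides = {}        # (profile name, day) -> copied config

    def edit_for_today(self, name, day, changes):
        # Copy on first edit, so the baseline is never mutated from here.
        key = (name, day)
        if key not in self.overrides:
            self.overrides[key] = copy.deepcopy(self.canonical[name])
        self.overrides[key].update(changes)

    def active(self, name, day):
        # The one-day copy wins if it exists; otherwise use the baseline.
        return self.overrides.get((name, day), self.canonical[name])

    def purge_before(self, today):
        # Intended to run at midnight: drop copies from earlier days.
        self.overrides = {
            key: cfg for key, cfg in self.overrides.items() if key[1] >= today
        }


profiles = ProfileStore({"Weekday": {"chw_setpoint_c": 7.0}})
profiles.edit_for_today("Weekday", date(2026, 6, 1), {"chw_setpoint_c": 6.5})
```

After `profiles.purge_before(date(2026, 6, 2))`, the 1 June copy is gone and the baseline value is active again.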
Linear timeline
The complete day-by-day timeline lives in Linear. Status at end of period:
| Linear ID | Title | Status |
|---|---|---|
| DIZ-813 | Profile-based settings (parent) | In Review |
| DIZ-960 | Frontend | In Review |
| DIZ-961 | Backend (Django + REST API) | In Review |
| DIZ-962 | Chiller Schedule Agent | In Review |
| DIZ-963 | Chiller Sequence Agent | In Review |
| DIZ-1111 | Smart CT Agent | In Review |
| DIZ-1112 | Adaptive Start Agent | In Review |
| DIZ-1114 | Control / Custom Schedule Agent | In Review |
| DIZ-965 | Actionable chiller event card (parent) | In Progress |
| DIZ-967 | Event card — Backend | In Progress |
| DIZ-968 | Event card — Agent integration | In Progress |
| DIZ-966 | Event card — Frontend | Todo |
| DIZ-969 | Manual staging up/down button | Todo |
| DIZ-1113 | Adaptive Stop Agent | Todo |
The "In Review" items (the DIZ-813 parent and the seven items listed under it) together carry the foundation work: the eight rebuilt automations plus the backend and frontend they sit on. Reviews are with Andaman.
Adjacent work this period
Outside the profile-based control / edge app-framework track, the same window included several other tracks:
- Aj. Pisitchai's IPMVP & energy-saving work (early–late March 2026, follow-up 2 Apr): daily data aggregation; an audit of daily energy for chiller-plant `power_all_*` devices (with Zayar and Ham); revision of the energy-saving report for MBK / JWM / CP9; an 8-sites COP-per-component delivery (`monthly_energy_all_sites-rev1.xlsx`); a JWM IPMVP meeting on occupancy; TaTa's Enco IPMVP presentation; baseline IPMVP prediction models for CP9 / JWM / KSPO / MBK.
- RL on Energy Optimization (2 Apr 2026): kick-off literature review on chiller-plant energy optimisation with reinforcement learning; dependent / independent variable questions opened with Opal and Ham.
- BLA data analysis & meeting prep (30 Apr → 5 May 2026): BLA-site data work alongside the main edge build.
- NOVA / AltoACE Voice Stack (19 → 20 May 2026): moved the Pipecat backend and React UI in-tree, pinned `nvidia-pipecat`, fixed the tool-calling crash at startup (swap `LLMContext` → `OpenAILLMContext`), and validated an end-to-end voice loop (7 tools; ASR → tool dispatch → grounded TTS reply quoting real chiller numbers: kW/RT 0.61, CHW ΔT 5.6 °C).
- MBK Adaptive Stop mitigation (28 May 2026): partnered with P'Sit to mitigate the Adaptive Stop issue surfaced on MBK.
What's next (June)
Deployment goal: roll out this profile-based control system to the CP9 site in production.
Alongside that, three pieces of work remain to close out the v3.4 milestone:
- Actionable event card — operators can already see scheduled events; the remaining UI lets them reschedule, reassign equipment, or cancel events inline. Backend + agent are in progress; the consumer UI is the one piece still to build.
- Manual staging up / down — a single-click "add a chiller" / "drop a chiller" button on the Today tab, independent of the staging conditions.
- Adaptive Stop agent — port of the Adaptive Stop control feature onto the new profile-based framework. Mirrors the Adaptive Start work already in review.
All three are smaller in scope than what is already in review.
Where this plugs into the bigger picture
This is the first time the profile + calendar pattern has been applied on the water side, but the data model is intentionally generic. Future automation types (demand-limit optimization, optimum pump staging, optimum CT staging, recommendation/copilot mode) can be added as new blueprints without touching any of the framework already shipped. The frontend is data-driven from the blueprint catalog, so new automation types appear automatically once their blueprint is registered.