The auto-extracted manual SVGs were unusable PDF text-glyph soup. These are
fresh, theme-aware (currentColor everywhere, accent via the
--sl-color-accent CSS var), and built to teach.

src/assets/diagrams/handshake-sequence.svg
  Sequence diagram with CLIENT and CONTROLLER swim lanes, five steps:
  ClientRequestNewSession -> ControllerAckNewSession (carries SessionID)
  -> derive SessionKey (inline note) -> ClientRequestSecureSession
  (encrypted, accent-coloured) -> ControllerAckSecureSession (encrypted)
  -> first OmniLink2Message. Plaintext arrows in currentColor, encrypted
  arrows in accent.

src/assets/diagrams/packet-structure.svg
  Bytes-on-the-wire box diagram: outer Packet header (seq u16 + type +
  reserved + encrypted payload) decomposed below into the inner Message
  (start byte 0x21, length, opcode, data, CRC u16 LE). Plain vs encrypted
  fields colour-coded with a legend.

src/assets/diagrams/session-key-derivation.svg
  Quirk #1 visual. Three rows of byte cells: ControllerKey (16 bytes, with
  bytes 0..10 in plain colour and 11..15 highlighted), SessionID (5 bytes),
  and the resulting SessionKey with the XOR boundary visible. XOR operator
  in the accent colour to draw the eye.

src/assets/diagrams/per-block-whitening.svg
  Quirk #2 visual. seq pill at the top, three blocks below (block 1,
  block 2, block N) each showing 16 byte cells with the first two
  highlighted in accent and labelled with the seq XOR mask. Drives home
  that it's the SAME mask on EVERY block.

src/assets/diagrams/architecture.svg
  Three groups (LIBRARY, HA INTEGRATION, TEST SURFACE) with boxes inside.
  Library shows the four protocol-layer modules + connection + client +
  models + events. HA shows coordinator + 8 platforms. Test surface shows
  MockPanel (accent-coloured), HA test harness, e2e tests, unit tests. One
  accent-coloured arrow runs from OmniConnection across to MockPanel
  labelled 'TCP/4369 (encrypted)'.

src/assets/diagrams/pca-file-format.svg
  Key chain: hardcoded keyPC01 -> decrypts PCA01.CFG (boxes for the CFG
  fields including the highlighted pca_key) -> arrow showing the extracted
  pca_key -> decrypts the .pca file (boxes for PCA03 magic, account info,
  model byte, body, and the highlighted ControllerKey) -> caption 'feeds
  session-key derivation (quirk #1)'.

Wired in via inline-SVG-via-?raw-import + set:html (so currentColor adapts
to the theme). Required converting four pages to .mdx:

  reference/protocol.mdx        + handshake + packet diagrams
  reference/file-format.mdx     + pca-file-format diagram
  explanation/quirks.mdx        + session-key + whitening diagrams
  explanation/architecture.mdx  + architecture diagram

Two MDX paper cuts during conversion: bare '<100ms' and '<50ms' in
architecture.mdx confused the JSX parser; backticked them as `<100ms` and
`<50ms`.

Build: 23 pages clean. Verified inline SVG ships in the rendered HTML (grep
for SVG title IDs returns 2/2 hits per relevant page). Container rebuilt +
redeployed. Protocol page is now 92750 bytes (was ~63000), quirks page
84156 (was ~63000).
219 lines
9.8 KiB
Plaintext
---
title: The two non-public quirks
description: Why public Omni-Link clients silently fail on the first encrypted message (session key XOR mix and per-block pre-whitening before AES).
---

import SessionKey from '../../../assets/diagrams/session-key-derivation.svg?raw';
import Whitening from '../../../assets/diagrams/per-block-whitening.svg?raw';

The Omni-Link II protocol, as documented in the publicly available spec, looks
like a textbook AES-128-ECB session over TCP: handshake, derive a key, encrypt
everything from then on. As implemented by HAI's PC Access 3.17, it isn't.
There are two quirks, one in the way the session key is derived and one in the
way payload blocks are encrypted, that are not in any third-party Omni-Link
writeup we could find. Both are unambiguous in the decompiled C#
(`clsOmniLinkConnection.cs`). Both are load-bearing: if a client skips either,
the panel accepts the connection, completes the unencrypted handshake, and
then drops the session on the first encrypted message with
`ControllerSessionTerminated`. No diagnostic, no log.

## Why these quirks exist (informed speculation)

Both quirks have the texture of *defense by inconvenience*. Neither makes the
protocol meaningfully harder to attack: anyone with a packet capture and the
`ControllerKey` can reproduce both transformations in a few lines of code.
But both add just enough complexity that a casual reverse engineer reading
the public spec will write a client that doesn't work, and won't have an
obvious explanation for why.

It looks like the kind of thing where someone on the original team said
"let's not make it trivial for the obvious clones," and the implementation
has the slight inelegance of cargo-culted-from-one-block-to-all-blocks that
suggests it was added by hand rather than designed in. The first quirk may
also have been an attempt at session-key freshness: mix in a
controller-supplied nonce so that two sessions with the same `ControllerKey`
don't use literally the same AES key. That's a reasonable goal; a 5-byte XOR
is just an unusual way to achieve it.

Whatever the origin, both quirks are stable across the firmware versions PC
Access 3.17 supports (the v2-on-TCP path), and both must be implemented
exactly to talk to the panel.

## Quirk #1 — session key XOR mix

<div style="margin: 1rem 0 1.5rem;" set:html={SessionKey} />

The `ControllerKey` is the 16-byte AES-128 key that lives in the panel's
NVRAM and inside the encrypted `.pca` config file. The naive expectation is
that this key is what AES uses for the session. It isn't.

From `clsOmniLinkConnection.cs:1886-1892` (the TCP path):

```csharp
SessionKey = new byte[16];
ControllerKey.CopyTo(SessionKey, 0);
for (int j = 0; j < 5; j++)
{
    SessionKey[11 + j] = (byte)(ControllerKey[11 + j] ^ SessionID[j]);
}
AES = new clsAES(SessionKey);
```

The first 11 bytes of the session key are the `ControllerKey` verbatim. The
last 5 bytes are the `ControllerKey` XORed with a 5-byte `SessionID` nonce
that the controller sent in the unencrypted `ControllerAckNewSession` packet.
That's the entire key derivation. No PBKDF2, no HKDF, no PIN, no salt. Five
bytes of XOR.

The same five-iteration loop appears at `:1423-1429` for the UDP path.
Identical.

The Python equivalent:

```python
def derive_session_key(controller_key: bytes, session_id: bytes) -> bytes:
    assert len(controller_key) == 16
    assert len(session_id) == 5
    sk = bytearray(controller_key)
    for j in range(5):
        sk[11 + j] ^= session_id[j]
    return bytes(sk)
```

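To make the derivation concrete, here is a small worked example. It is only a sketch: the key and `SessionID` values are made up for illustration (a real `ControllerKey` comes from the `.pca` file or the panel), and the function is restated with slices so the block runs on its own.

```python
def derive_session_key(controller_key: bytes, session_id: bytes) -> bytes:
    """Copy bytes 0..10 verbatim; XOR bytes 11..15 with the 5-byte SessionID."""
    if len(controller_key) != 16 or len(session_id) != 5:
        raise ValueError("expected a 16-byte ControllerKey and a 5-byte SessionID")
    tail = bytes(k ^ s for k, s in zip(controller_key[11:], session_id))
    return controller_key[:11] + tail


# Illustrative values only, not taken from any real panel.
controller_key = bytes(range(16))          # 00 01 02 ... 0f
session_id = bytes.fromhex("a1b2c3d4e5")   # nonce from ControllerAckNewSession

session_key = derive_session_key(controller_key, session_id)
print(session_key.hex())  # 000102030405060708090aaabecedaea
```
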
A naive client that uses `ControllerKey` directly as the AES key will
encrypt `ClientRequestSecureSession` (the first encrypted packet) with the
wrong key. The panel decrypts it to garbage. ECB has no integrity check, so
no exception fires; the panel just sees that the SessionID echo doesn't match
what it sent, and drops the session with `ControllerSessionTerminated`.
PC Access surfaces this as `InvalidEncryptionKey`, which sounds like "your
ControllerKey is wrong" but, when the `ControllerKey` itself is correct,
really means "your *derived* key is wrong" because the XOR mix was not
applied.

## Quirk #2 — per-block XOR pre-whitening before AES

<div style="margin: 1rem 0 1.5rem;" set:html={Whitening} />

This is the headline.

Before AES-encrypting any payload block, the *first two bytes of every
16-byte block* get XORed with the packet's 16-bit sequence number (high byte
into byte 0, low byte into byte 1). Same XOR mask, every block of the packet.
From `clsOmniLinkConnection.cs:396-401`:

```csharp
for (num = 0; num < PKT.Data.Length; num += 16)
{
    PKT.Data[num] = (byte)(PKT.Data[num] ^ ((PKT.SequenceNumber & 0xFF00) >> 8));
    PKT.Data[num + 1] = (byte)(PKT.Data[num + 1] ^ (PKT.SequenceNumber & 0xFF));
}
PKT.Data = AES.Encrypt(PKT.Data);
```

And the inverse on receive (`:413-417`):

```csharp
PKT.Data = AES.Decrypt(PKT.Data);
for (int i = 0; i < PKT.Data.Length; i += 16)
{
    PKT.Data[i] = (byte)(PKT.Data[i] ^ ((PKT.SequenceNumber & 0xFF00) >> 8));
    PKT.Data[i + 1] = (byte)(PKT.Data[i + 1] ^ (PKT.SequenceNumber & 0xFF));
}
```

So the on-the-wire encryption is "AES-128-ECB of (payload XOR-prewhitened
with the seq number, two bytes per block)". This is *not* CBC. It is *not*
CTR. It is an outer transformation applied to the plaintext before AES
sees it (and reversed after AES decryption on receive), independent of
AES's mode.

The Python equivalent:

```python
# aes_ecb_encrypt / aes_ecb_decrypt stand in for any raw AES-128-ECB
# primitive (no padding); they are not defined in this snippet.

def whiten(data: bytes, seq: int) -> bytes:
    out = bytearray(data)
    seq_hi = (seq >> 8) & 0xFF
    seq_lo = seq & 0xFF
    for i in range(0, len(out), 16):
        out[i] ^= seq_hi
        out[i + 1] ^= seq_lo
    return bytes(out)

def encrypt_payload(payload: bytes, seq: int, session_key: bytes) -> bytes:
    # payload is already zero-padded to a 16-byte multiple by the caller.
    return aes_ecb_encrypt(whiten(payload, seq), session_key)

def decrypt_payload(ciphertext: bytes, seq: int, session_key: bytes) -> bytes:
    return whiten(aes_ecb_decrypt(ciphertext, session_key), seq)
```

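The whitening step can be exercised on its own, without AES. The sketch below restates `whiten` so it runs standalone and adds a zero-padding helper; the name `zero_pad16`, the sample payload, and the sequence number are assumptions made for illustration, not taken from PC Access.

```python
BLOCK = 16


def zero_pad16(payload: bytes) -> bytes:
    """Hypothetical helper: zero-pad up to the next 16-byte boundary."""
    return payload + bytes(-len(payload) % BLOCK)


def whiten(data: bytes, seq: int) -> bytes:
    """XOR bytes 0 and 1 of every 16-byte block with the seq high/low bytes."""
    if len(data) % BLOCK:
        raise ValueError("data must be a multiple of 16 bytes")
    out = bytearray(data)
    for i in range(0, len(out), BLOCK):
        out[i] ^= (seq >> 8) & 0xFF
        out[i + 1] ^= seq & 0xFF
    return bytes(out)


payload = zero_pad16(bytes(range(1, 21)))  # 20 bytes, padded to 32 (two blocks)
seq = 0x1234                               # illustrative sequence number

masked = whiten(payload, seq)
changed = [i for i, (a, b) in enumerate(zip(payload, masked)) if a != b]
print(changed)                         # [0, 1, 16, 17]
print(whiten(masked, seq) == payload)  # True
```
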
The `whiten` function is its own inverse (XORing twice with the same mask
restores the original), so the same helper works in both directions.

Cryptographically this is weak. The mask is not secret: the receiver has to
read the sequence number before it can undo the whitening, so it travels in
the plaintext packet header, and anyone holding a capture knows both mask
bytes for every packet. The whitening adds no key material; all the
confidentiality still comes from AES. And because every block in a packet
gets the same mask, two identical plaintext blocks inside one packet still
encrypt to identical ciphertext blocks. It feels like the original intent
might have been nonce-mixing (use the seq as a per-packet salt to defeat
ECB's identical-block-equals-identical-ciphertext property) and the
implementation got cargo-culted from one block (where it would have been
roughly defensible) to every block of the packet (where it isn't doing useful
work beyond the first one). Doesn't matter. It's the protocol. Implement it.
Move on.

## Why public OSS Omni-Link clients miss these

The two non-trivial public Omni-Link II clients we checked are
[`jomnilinkII`](https://github.com/digitaldan/jomnilinkII) (Java) and
[`pyomnilink`](https://github.com/excalq/pyomnilink) (Python), plus a
handful of writeups on personal blogs. None of them describe either quirk.
We can't be sure from the outside why, but there are two plausible
explanations:

1. **Inherited working code from a pre-quirk firmware era.** If an early
   version of the panel firmware used `ControllerKey` directly as the
   session key and didn't have the XOR pre-whitening, an OSS client
   written against that firmware would just keep working as long as the
   panel maintained backward compatibility on the wire, even though new
   firmware added the quirks for new clients. We don't have the firmware
   history to confirm or refute this.
2. **Serial-only / unencrypted paths.** Both quirks live in the
   `clsOmniLinkConnection.EncryptPacket` / `DecryptPacket` methods, which
   are only invoked on packet types `OmniLinkMessage` (0x10) and
   `OmniLink2Message` (0x20). The *unencrypted* twin packet types (0x11,
   0x21) bypass them entirely. A client that only ever talks to the panel
   over the unencrypted v1 serial path would never need them.

Either way, the practical outcome is that an existing OSS client is not a
useful reference for someone trying to write a v2-on-TCP encrypted client
from scratch. The decompiled PC Access C# is.

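
For reference, the packet-type gating from the second explanation can be sketched as follows. This is a minimal illustration, not the decompiled logic: the four type values come from the text above, but the member names for the unencrypted twins and the `needs_crypto` helper are hypothetical.

```python
from enum import IntEnum


class PacketType(IntEnum):
    OMNI_LINK_MESSAGE = 0x10               # named in the text; encrypted
    OMNI_LINK_UNENCRYPTED_MESSAGE = 0x11   # hypothetical name; plaintext twin
    OMNI_LINK2_MESSAGE = 0x20              # named in the text; encrypted
    OMNI_LINK2_UNENCRYPTED_MESSAGE = 0x21  # hypothetical name; plaintext twin


# Only these two types pass through the encrypt/decrypt path (and so the quirks).
ENCRYPTED_TYPES = frozenset(
    {PacketType.OMNI_LINK_MESSAGE, PacketType.OMNI_LINK2_MESSAGE}
)


def needs_crypto(packet_type: int) -> bool:
    """Return True if a packet of this type must be whitened and AES-processed."""
    return packet_type in ENCRYPTED_TYPES


for value in (0x10, 0x11, 0x20, 0x21):
    print(hex(value), needs_crypto(value))
```
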
## The mock panel as proof

The most direct way to prove our implementation of both quirks is correct is
to build a controller-side emulator that round-trips with the client.
`omni_pca.mock_panel.MockPanel` is exactly that: a TCP server that runs the
controller half of the handshake, derives the same `SessionKey`, applies
the same per-block XOR pre-whitening, and decodes / encodes real Omni-Link II
messages. The library's e2e test suite connects a real `OmniClient` to a
real `MockPanel` over a real TCP socket and exchanges real frames. Seventeen
of those tests cover the secure-session handshake, encrypted command
roundtrips, and the unsolicited push-event stream.

If either quirk were implemented incorrectly on either side, decryption
would produce garbage and the connection would drop. The fact that all
seventeen tests pass (including ones that subscribe to events and watch
them roundtrip cleanly through the encrypted channel) is bidirectional
validation that we have both quirks right.

That doesn't prove they're right against a *real* HAI panel. The user's
panel is currently offline (Ethernet module disabled in the panel firmware),
and the live-validation lap is on the backlog. But round-tripping with a
faithful emulator is meaningful evidence that the spec we extracted from
the C# is internally consistent, and that's the work that the public
clients didn't do.

## See also

- [Protocol reference](/reference/protocol/): full byte-level handshake
  including both quirks in their natural place in the flow.
- [Architecture overview](/explanation/architecture/): how the mock panel
  fits into the test stack.
- [The Journey](/journey/): what it took to find the quirks in the first
  place.