---
title: Safety Testing
description: Hamilton adversarial test suite for firmware safety validation, covering operator error, invalid inputs, state machine violations, and rapid-fire stress.
---

import { Badge, Aside, Steps, Tabs, TabItem } from '@astrojs/starlight/components';

The Hamilton adversarial test suite (`test_hamilton.py`) validates that the firmware survives every kind of wrong input an operator or buggy host software could send. It is named after Margaret Hamilton's methodology at the MIT Instrumentation Laboratory: she tested "what if the astronaut pushes the wrong button?" and found bugs that saved Apollo missions. <Badge text="v3.05" variant="success" />

```bash
sudo python3 tools/test_hamilton.py
```

<Aside type="note">
Requires `pyusb`, root access, and firmware v3.05.0 or later loaded on the device. The BCM4500 does not need to be booted beforehand: the test handles the boot and shutdown sequences internally.
</Aside>

## Test Philosophy

Traditional hardware testing validates the happy path: correct inputs producing correct outputs. Adversarial testing validates the *unhappy* path: what happens when the host sends garbage? The firmware must never:

1. **Hang**: every code path must terminate in bounded time
2. **Corrupt state**: invalid inputs must not leave the device in an inconsistent state
3. **Fire the watchdog**: the main loop must stay responsive
4. **Become unresponsive**: the device must accept new commands after any error

Each test follows the same pattern:

<Steps>
1. Read `last_error` before the operation
2. Send the invalid/abusive command
3. Verify the device is still alive (`GET_FW_VERS` returns `3.05.0`)
4. Read `last_error` again and check that it matches expectations
</Steps>
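
The four steps above can be sketched as a small helper. This is an illustrative outline only, not the code in `test_hamilton.py`: `FakeDevice`, its methods, and `run_adversarial_test` are hypothetical names, and the fake device stands in for real USB control transfers so the pattern can be read and run without hardware.

```python
EXPECTED_VERSION = "3.05.0"  # version string from step 3


class FakeDevice:
    """Hypothetical in-memory stand-in for the USB device."""

    def __init__(self):
        self.last_error = 0x00

    def get_last_error(self):
        return self.last_error

    def get_fw_version(self):
        return EXPECTED_VERSION

    def send(self, command):
        # Pretend every abusive command is rejected with ERR_NOT_SUPPORTED.
        self.last_error = 0x0A


def run_adversarial_test(dev, command, expected_error):
    before = dev.get_last_error()                      # step 1
    try:
        dev.send(command)                              # step 2
    except OSError:
        # A rejected transfer is acceptable (pyusb's USBError derives
        # from OSError); a dead device is not.
        pass
    alive = dev.get_fw_version() == EXPECTED_VERSION   # step 3
    after = dev.get_last_error()                       # step 4
    return {
        "before": before,
        "after": after,
        "passed": alive and after == expected_error,
    }


result = run_adversarial_test(FakeDevice(), b"\xff", expected_error=0x0A)
print(result["passed"])
```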

## Test Categories

<Tabs>
<TabItem label="DiSEqC">

### Category 1: DiSEqC Message Abuse

Tests the DiSEqC messaging path with invalid parameters. DiSEqC bits are keyed onto the 22 kHz carrier by Timer2-timed pulse-width keying, so without the v3.05.0 `diseqc_wait_ticks()` timeout fix a stuck timer would hang the firmware.

| Test | Input | Expected |
|------|-------|----------|
| 1a. Tone burst B | `wValue=1` | `ERR_NOT_SUPPORTED` (0x0A) |
| 1b. Tone burst 0xFF | `wValue=0xFF` | `ERR_NOT_SUPPORTED` (0x0A) |
| 1c. Too short | 2-byte payload | No hang |
| 1d. Too long | 8-byte payload | No hang |
| 1e. Empty | 0-byte payload | No hang |
| 1f. Recovery | Valid 4-byte message | No hang, correct execution |
| 1g. Motor halt | DiSEqC 1.2 `E0 31 60` (no motor connected) | No hang |
| 1h. Drive 255 steps | DiSEqC 1.2 `E0 31 68 FF` | No hang |
| 1i. Bogus USALS | DiSEqC 1.2 `E0 31 6E FF FF` | No hang |

</TabItem>
<TabItem label="Tune">

### Category 2: Tune Parameter Abuse

Sends tuning commands with out-of-range parameters. The BCM4500 demodulator receives these values directly over I2C, so the firmware must not hang regardless of what the host sends.

| Test | Input | Expected |
|------|-------|----------|
| 2a. Zero symbol rate | `SR=0` | No hang |
| 2b. Max symbol rate | `SR=0xFFFFFFFF` | No hang |
| 2c. Zero frequency | `freq=0` | No hang |
| 2d. Max frequency | `freq=0xFFFFFFFF` | No hang |
| 2e. Invalid modulation | `mod=0xFF` | No hang |
| 2f. Invalid FEC | `fec=0xFF` | No hang |
| 2g. Truncated (4 bytes) | 4 of 10 bytes sent | `ERR_EP0_TIMEOUT` (0x07) |
| 2h. Single byte | 1 of 10 bytes sent | `ERR_EP0_TIMEOUT` (0x07) |
| 2i. All zeros | 10 zero bytes | No hang |
| 2j. All 0xFF | 10 × `0xFF` | No hang |

Tests 2g and 2h specifically validate `ep0_wait_data()` together with the `EP0BCL` payload length check. The firmware waits for the EP0 data phase; when the host sends fewer bytes than expected, the timeout fires and the command is safely aborted.
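
The idea behind the bounded wait can be modelled in a few lines of Python. This is a simplified model under stated assumptions, not the firmware's C code: `wait_for_payload`, the `bytes_received` callback, the poll budget, and the `ERR_NONE` value are all hypothetical. Only `ERR_EP0_TIMEOUT` (0x07) and the 10-byte tune payload come from the table above.

```python
ERR_NONE = 0x00          # hypothetical "no error" value, not from the source
ERR_EP0_TIMEOUT = 0x07   # value from the table above
TUNE_PAYLOAD_LEN = 10    # tune payload length from the table above


def wait_for_payload(bytes_received, expected_len, max_polls=1000):
    """Poll for the data phase a bounded number of times.

    bytes_received() stands in for reading the endpoint byte count.
    The loop can never spin forever: it gives up after max_polls polls.
    """
    for _ in range(max_polls):
        if bytes_received() >= expected_len:
            return ERR_NONE       # full payload arrived, safe to process
    return ERR_EP0_TIMEOUT        # short payload: abort the command


full = wait_for_payload(lambda: 10, TUNE_PAYLOAD_LEN)
truncated = wait_for_payload(lambda: 4, TUNE_PAYLOAD_LEN)
print(full == ERR_NONE, truncated == ERR_EP0_TIMEOUT)
```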

</TabItem>
<TabItem label="I2C">

### Category 3: I2C Address Space Abuse

Sends I2C reads and writes to non-existent or invalid addresses. Before v3.05.0, the bus scan command (`0xB4`) used bare `while (!(I2CS & bmDONE))` loops that would hang if any address didn't ACK.

| Test | Input | Expected |
|------|-------|----------|
| 3a. Addr 0x7F | Non-existent I2C device | `ERR_I2C_NAK` (0x02) |
| 3b. Addr 0x00 | General call address | No hang |
| 3c. Page 0xFF | BCM4500 indirect read, invalid page | No hang |
| 3d. Count=0 | Multi-reg read with zero count | No hang |
| 3e. Count=255 | Multi-reg read exceeding 64 max | No hang (clamped) |
| 3f. Write to 0x7F | Raw I2C write to non-existent device | No hang |
| 3g. Reserved reg | BCM4500 direct read of reg 0xFF | No hang |

</TabItem>
<TabItem label="State Machine">

### Category 4: State Machine Violations

Tests operations in the wrong order, the classic "what if the astronaut pushes the wrong button?" scenario.

| Test | Scenario | Expected |
|------|----------|----------|
| 4a. Double boot | `BOOT_8PSK(1)` when already booted | No hang |
| 4b. Tune with BCM off | Power off BCM4500, then tune | `ERR_BCM_NOT_READY` (0x04) |
| 4c. Signal read, BCM off | Signal monitor with demod powered down | No hang |
| 4d. Bus scan, BCM off | I2C bus scan with demod powered down | No hang |
| 4e. Hotplug, BCM off | Force hotplug rescan with demod off | No hang |
| 4f. Recovery | Re-boot BCM4500 after power cycle | `STARTED \| FW_LOADED` |
| 4g. Arm + disarm | Arm streaming then immediately disarm | No hang |
| 4h. Disarm, not armed | Disarm when already disarmed | No hang |
| 4i. Rapid boot toggle | Off → on → off → on in quick succession | No hang |

Test 4b is the most important: it confirms that the `do_tune()` early guard works. Without the v3.05.0 fix, tuning with a powered-off BCM4500 would attempt I2C writes to a device that isn't there, potentially hanging on the bus.
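
The shape of such an early guard can be illustrated with a small Python model. This is a sketch of the pattern only, not the firmware's `do_tune()`: the `Tuner` class, its attributes, and the `ERR_NONE` value are hypothetical, while `ERR_BCM_NOT_READY` (0x04) is taken from the table above.

```python
ERR_NONE = 0x00            # hypothetical "no error" value, not from the source
ERR_BCM_NOT_READY = 0x04   # value from the table above


class Tuner:
    """Hypothetical model of a tune path with an early readiness guard."""

    def __init__(self):
        self.bcm_booted = False
        self.last_error = ERR_NONE
        self.i2c_writes = []   # records what would have gone out on the bus

    def do_tune(self, payload):
        # Early guard: refuse before any bus traffic if the demod is off.
        if not self.bcm_booted:
            self.last_error = ERR_BCM_NOT_READY
            return False
        self.i2c_writes.append(bytes(payload))
        self.last_error = ERR_NONE
        return True


tuner = Tuner()
tuner.do_tune(bytes(10))   # rejected: demodulator is powered off
print(hex(tuner.last_error), len(tuner.i2c_writes))
```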

</TabItem>
<TabItem label="Boundaries">

### Category 5: Boundary and Buffer Abuse

Tests USB transfer size mismatches: requesting more or fewer bytes than the firmware produces.

| Test | Scenario | Expected |
|------|----------|----------|
| 5a. 0 bytes from GET_CONFIG | Request 0 bytes (firmware sends 1) | No hang |
| 5b. 64 bytes from GET_CONFIG | Request 64 bytes (firmware sends 1) | No hang |
| 5c. 64 bytes from GET_LAST_ERROR | Request 64 bytes (firmware sends 1) | No hang |
| 5d. 1 byte from GET_FW_VERS | Request 1 byte (firmware sends 6) | No hang |
| 5e. 1 byte from GET_STREAM_DIAG | Request 1 byte (firmware sends 12) | No hang |
| 5f. STREAM_DIAG wval=0xFFFF | Non-standard wValue | No hang |
| 5g. HOTPLUG wval=0xFFFF | Non-standard wValue | No hang |

The USB layer handles size mismatches: the firmware writes its full response to EP0BUF regardless of how many bytes the host requested. The host-side libusb may return an `Overflow` error, but the device remains stable.

</TabItem>
<TabItem label="Stress">

### Category 6: Rapid-Fire Stress

Sends many commands in quick succession to test for timing-dependent failures.

| Test | Operations | Typical Total Time |
|------|------------|--------------------|
| 6a. Config reads | 200 × `GET_CONFIG` | ~9 ms |
| 6b. Error reads | 50 × `GET_LAST_ERROR` | ~2 ms |
| 6c. Signal monitors | 30 × `SIGNAL_MONITOR` | ~276 ms |
| 6d. Voltage toggles | 40 × `SET_LNB_VOLTAGE` | ~2 ms |
| 6e. DiSEqC messages | 10 × 4-byte DiSEqC switch | ~1,473 ms |

Single-byte vendor commands (config, error) complete in under 0.05 ms each. Signal monitors are slower (~9 ms each) because they perform multiple I2C reads. DiSEqC messages are the slowest (~147 ms each) due to Timer2-based bit-bang encoding.
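
The per-operation figures follow directly from the table by dividing each total by its operation count. The short script below reproduces that arithmetic; the dictionary layout and the `per_operation_ms` helper are illustrative, and the numbers are the approximate totals listed above.

```python
# (operation count, typical total time in ms) from the table above
STRESS_RUNS = {
    "GET_CONFIG": (200, 9),
    "GET_LAST_ERROR": (50, 2),
    "SIGNAL_MONITOR": (30, 276),
    "SET_LNB_VOLTAGE": (40, 2),
    "DiSEqC switch": (10, 1473),
}


def per_operation_ms(count, total_ms):
    """Average time per operation in milliseconds."""
    return total_ms / count


for name, (count, total_ms) in STRESS_RUNS.items():
    print(f"{name:16s} {per_operation_ms(count, total_ms):8.3f} ms/op")
```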

</TabItem>
<TabItem label="Invalid Cmds">

### Category 7: Invalid Vendor Commands

Sends vendor command codes that don't exist in the firmware.

| Test | Command | Expected |
|------|---------|----------|
| 7a-f. Unknown codes | `0xFF`, `0x01`, `0x50`, `0xFE`, `0x00`, `0x79` | USB STALL |
| 7g. Wrong direction | `GET_CONFIG` as OUT | No hang |
| 7h. Oversized payload | 64-byte payload to `GET_CONFIG` | No hang |

Unknown commands correctly return a USB STALL (the EP0 error response defined by the USB specification). The firmware's `handle_vendorcommand()` returns FALSE for unrecognized request codes, which causes fx2lib to send the STALL handshake.
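
The dispatch rule can be modelled as a lookup that falls through to a stall. This Python model is a sketch of the control flow only, not fx2lib or the firmware source: the handler table, `setup_stage`, and the `"OK"`/`"STALL"` result strings are hypothetical. The code `0xB4` is the bus scan command mentioned in the I2C category, and the unknown codes are those from the table above.

```python
UNKNOWN_CODES = [0xFF, 0x01, 0x50, 0xFE, 0x00, 0x79]  # from the table above


def handle_vendorcommand(request, handlers):
    """Return True if the request code is recognized and was handled."""
    handler = handlers.get(request)
    if handler is None:
        return False          # unrecognized code: let the caller stall
    handler()
    return True


def setup_stage(request, handlers):
    """Hypothetical stand-in for the library's EP0 setup handling."""
    return "OK" if handle_vendorcommand(request, handlers) else "STALL"


handlers = {0xB4: lambda: None}   # one known command: the I2C bus scan
for code in UNKNOWN_CODES:
    print(hex(code), setup_stage(code, handlers))
```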

</TabItem>
</Tabs>

## Results

The full test suite runs in under 30 seconds on hardware:

```
================================================================
HAMILTON ADVERSARIAL TEST — FINAL RESULTS
-----------------------------------------
Tests passed: 55
Tests failed: 0
Device alive: True
Final error: 0x08 [GPIF_TIMEOUT]
Watchdog fired: No
Verdict: PASS
================================================================
```

<Aside type="tip">
The final `GPIF_TIMEOUT` error is expected: it comes from the arm/disarm toggle test (4g). The GPIF state machine hadn't fully started before it was told to stop. The error code correctly reports what happened, and the device remains fully operational.
</Aside>

## Running the Tests

Run the test after any firmware change to verify that the safety properties are preserved:

```bash
# Load firmware and run tests
sudo python3 tools/fw_load.py load firmware/build/skywalker1.ihx --wait 5
sudo python3 tools/test_hamilton.py
```

The test automatically boots the BCM4500, runs all 55 tests, and powers everything down. If any test fails, the device state and error code are reported. If the device becomes unresponsive, the test halts immediately and reports which operation killed it.