pg_orrery/docs/DESIGN.md
Ryan Malloy cd338c3c64 Vendor SGP4/SDP4 as pure C, drop g++ dependency
Replace the sat_code git submodule (lib/sat_code/) with vendored
sources in src/sgp4/. The upstream .cpp files are renamed to .c —
the code is valid C99 with zero C++ features. This eliminates the
g++ and -lstdc++ build dependencies.

Adds 518 Vallado test vectors (AIAA 2006-6753-Rev1) as a 13th
regression suite to verify byte-identical numerical output.

Updates all documentation (CLAUDE.md, DESIGN.md, 11 MDX pages,
Dockerfile) to reflect the new layout and pure-C compilation.
2026-02-17 12:45:39 -07:00

748 lines
32 KiB
Markdown

# pg_orbit Design Document
Internal architecture notes. Documents WHY decisions were made,
not how to use the extension. Intended audience: future maintainers
who need to modify pg_orbit without breaking physical correctness.
## 1. Constant Chain of Custody
This is the single most critical design constraint in the extension.
Get it wrong and positions silently drift by kilometers. There is no
runtime check that can detect this class of error after the fact.
### The Problem
Two-Line Elements are not raw orbital measurements. They are *mean*
elements produced by a differential correction process that fits
observed positions against an SGP4 propagator running with a specific
set of geopotential constants (WGS-72). The mean elements absorb
geodetic model biases -- the eccentricity, inclination, and mean
motion encode corrections that only make physical sense when propagated
with the same constants used to generate them.
Substituting WGS-84 constants (or any other set) into the propagator
does not "upgrade" accuracy. It breaks the internal consistency of the
element set. The resulting position error can exceed the natural
prediction error of the TLE by an order of magnitude.
### The Rules
1. **SGP4/SDP4 propagation**: WGS-72 constants only (mu, ae, J2, J3, J4, ke).
These flow through sat_code's `norad_in.h` defines and are never
overridden.
2. **Coordinate output** (geodetic lat/lon/alt, topocentric az/el/range):
WGS-84 ellipsoid (a = 6378.137 km, f = 1/298.257223563). This is
the modern standard for ground-station positioning and GPS receivers.
3. **TEME frame**: The SGP4 output frame (True Equator, Mean Equinox)
uses only 4 of the 106 IAU-80 nutation terms. Applying the full
nutation model would "correct" for effects that SGP4 already accounts
for internally, introducing error rather than removing it.
4. **No other combination is valid.** WGS-72 for propagation, WGS-84 for
output. Perigee and apogee altitudes from `tle_perigee()` and
`tle_apogee()` use WGS-72 AE because they derive from mean elements.
Geodetic altitude from `eci_to_geodetic()` uses WGS-84 because it
converts a physical position.
### Constant Inventory
| Constant | Source Paper | Value | pg_orbit Location | Vendored SGP4 Location |
|----------|-------------|-------|-------------------|------------------------|
| ae (equatorial radius) | Hoots & Roehrich STR#3 | 6378.135 km | `types.h` (WGS72_AE) | `src/sgp4/norad_in.h` (earth_radius_in_km) |
| J2 | Hoots & Roehrich STR#3 | 0.001082616 | `types.h` (WGS72_J2) | `src/sgp4/norad_in.h` (xj2) |
| J3 | Hoots & Roehrich STR#3 | -2.53881e-6 | `types.h` (WGS72_J3) | `src/sgp4/norad_in.h` (xj3) |
| J4 | Hoots & Roehrich STR#3 | -1.65597e-6 | `types.h` (WGS72_J4) | `src/sgp4/norad_in.h` (xj4) |
| ke | Hoots & Roehrich STR#3 | 0.0743669161331734132 min^-1 | `types.h` (WGS72_KE) | `src/sgp4/norad_in.h` (xke) |
| mu | Hoots & Roehrich STR#3 | 398600.8 km^3/s^2 | `types.h` (WGS72_MU) | (implicit in xke) |
| WGS-84 a | NIMA TR8350.2 | 6378.137 km | `types.h:31` (WGS84_A) | -- |
| WGS-84 f | NIMA TR8350.2 | 1/298.257223563 | `types.h:32` (WGS84_F) | -- |
Note that `types.h` carries a parallel copy of the WGS-72 constants
even though sat_code defines them in `norad_in.h`. This is intentional:
`types.h` is the single header for all pg_orbit C sources, and
`norad_in.h` is an internal sat_code header not meant for external
consumers. The GiST index (`gist_tle.c`) and TLE accessor functions
(`tle_type.c`) need KE and AE without pulling in sat_code internals.
The values MUST be identical.
### Why Two Copies of AE?
`tle_perigee()`, `tle_apogee()`, and the GiST altitude-band computation
all use `WGS72_KE` and `WGS72_AE` from `types.h`. They compute:
a_er = (KE / n)^(2/3) [earth radii]
perigee_km = a_er * (1 - e) * AE - AE
This MUST use WGS-72 AE (6378.135), not WGS-84 (6378.137), because `n`
is a mean motion fitted against the WGS-72 geopotential. Using the
wrong radius shifts every altitude by 2 meters -- small in absolute terms
but wrong in principle, and the error compounds in index operations.
## 2. SGP4 Implementation Choice
pg_orbit wraps Bill Gray's `sat_code` library (MIT license, Project Pluto).
### Why sat_code over alternatives
**Vallado's reference implementation** (from the STR#3 revision paper) is
the canonical source but has two problems: it is written in C++ with heavy
use of global state, and its license is unclear for embedding in a
PostgreSQL extension.
**libsgp4** (various forks on GitHub) is typically a C++ class hierarchy
that assumes an object-per-satellite lifecycle. This conflicts with
PostgreSQL's per-call execution model where we cannot persist C++ objects
across function invocations.
**sat_code** was chosen because:
1. **Pure C linkage.** All public functions are declared `extern "C"` in
`norad.h` (lines 97-133). The library is compiled as C++ but exposes
a flat C function interface: `SGP4_init()`, `SGP4()`, `SDP4_init()`,
`SDP4()`, `parse_elements()`, `select_ephemeris()`.
2. **No global mutable state.** The propagator state lives in a caller-
allocated `double params[N_SAT_PARAMS]` array. This maps directly to
PostgreSQL's `palloc`-based memory model: allocate before propagation,
free after.
3. **Includes deep-space SDP4.** Many SGP4 implementations only handle
near-earth orbits (period < 225 minutes). sat_code includes the full
SDP4 with lunar/solar perturbations via `deep.c`, handling GEO,
Molniya, and GPS orbits.
4. **MIT license.** Compatible with the PostgreSQL License for embedding
in a shared library.
5. **Actively maintained.** Used in Bill Gray's Find_Orb production
astrometry software.
### Build Integration
The SGP4/SDP4 source is vendored into `src/sgp4/` the `.cpp` files
renamed to `.c` (the code is valid C99 with zero C++ features). The
Makefile compiles everything with `gcc` and links with `-lm` only. No
C++ compiler or runtime is required.
```
src/*.c --[gcc]--> .o --|
src/sgp4/*.c --[gcc]--> .o --|--> pg_orbit.so
-lm
```
The `-I$(SGP4_DIR)` flag lets our C files `#include "norad.h"` directly.
Provenance is recorded in `src/sgp4/PROVENANCE.md`.
## 3. Type System Design
### Design Principles
Every pg_orbit type is fixed-size, not varlena. This means:
- No TOAST overhead (no detoasting on access)
- Direct pointer access via `PG_GETARG_POINTER(n)` -- no copy
- Predictable memory layout for binary I/O (`tle_recv`/`tle_send`)
- All types use `ALIGNMENT = double` to satisfy the strictest member
alignment requirement (all structs contain `double` fields)
- `STORAGE = plain` -- the type is never compressed or moved to TOAST
### TLE Type (112 bytes)
The TLE type stores **parsed mean elements**, not raw text.
This is the most important type design decision. Alternatives considered:
1. **Store raw 69+69 character lines, parse on every propagation.**
Rejected. Parsing is ~10x slower than propagation itself for a
pre-initialized model. Every `sgp4_propagate()` call would pay
the parse cost.
2. **Store raw text plus parsed cache.**
Rejected. Doubles the storage for no benefit. The parsed form
round-trips perfectly through `write_elements_in_tle_format()`.
3. **Store parsed mean elements only.**
Chosen. Input validation happens once at `tle_in()` time via
sat_code's `parse_elements()`. The text representation is
reconstructed on output via `write_elements_in_tle_format()`.
Storage layout:
```
Offset Size Field
0 88 11 doubles: epoch, inclination, raan, eccentricity,
arg_perigee, mean_anomaly, mean_motion,
mean_motion_dot, mean_motion_ddot, bstar
(one unused slot in the double array for alignment)
80 12 3 int32: norad_id, elset_num, rev_num
92 12 classification (1), ephemeris_type (1),
intl_desig (9), pad (1)
----
112 bytes total
```
The SQL definition declares `INTERNALLENGTH = 112`. This is smaller
than the raw two-line text (138+ bytes with line separator) and avoids
the 4-byte varlena header overhead.
Angular elements are stored in radians (matching sat_code's internal
representation). Accessor functions convert to degrees for human
consumption. Mean motion is stored in radians/minute; the accessor
returns revolutions/day.
### ECI Position Type (48 bytes)
Six doubles: x, y, z (km), vx, vy, vz (km/s).
SGP4 outputs velocity in km/min. We convert to km/s at the boundary
(`sgp4_funcs.c`, lines 181-183: `vel[i] / 60.0`). This conversion
happens exactly once, at the point where the pg_eci struct is populated.
Internally, all velocity in pg_orbit is km/s.
### Geodetic Type (24 bytes)
Three doubles: lat, lon (radians), alt (km above WGS-84).
Radians internally, degrees in text representation. The accessor
functions perform the conversion.
### Observer Type (24 bytes)
Three doubles: lat, lon (radians), alt_m (meters above WGS-84).
Note the asymmetry with geodetic: observer altitude is in meters
(matching GPS receiver output and ground station databases), while
geodetic altitude is in km (matching orbital altitude conventions).
This prevents unit confusion at the API boundary -- you set up a
ground station in meters, you get satellite altitude in kilometers.
### Topocentric Type (32 bytes)
Four doubles: azimuth, elevation (radians), range_km, range_rate (km/s).
### Pass Event Type (48 bytes)
Three int64 (TimestampTz): aos_time, max_el_time, los_time.
Three doubles: max_elevation (degrees), aos_azimuth (degrees),
los_azimuth (degrees).
Times are stored as native PostgreSQL TimestampTz values (microseconds
since 2000-01-01 00:00:00 UTC). This allows direct comparison with
SQL timestamp expressions without conversion.
## 4. TEME to ECEF Transform
SGP4 outputs position and velocity in the TEME (True Equator, Mean
Equinox) frame. Converting to Earth-fixed coordinates for geodetic
and topocentric output requires a frame rotation.
### The Rotation
TEME to ECEF is a single Z-axis rotation by the negative of the
Greenwich Mean Sidereal Time (GMST) angle:
```
[x_ecef] [ cos(g) sin(g) 0 ] [x_teme]
[y_ecef] = [-sin(g) cos(g) 0 ] [y_teme]
[z_ecef] [ 0 0 1 ] [z_teme]
```
This is implemented in `coord_funcs.c:teme_to_ecef()` and
`pass_funcs.c:teme_to_ecef()` (duplicated -- see note below).
### GMST Computation
GMST is computed from the Julian date using the IAU 1982 formula,
which is the same low-precision model that SGP4 uses internally.
From Vallado (2013) Eq. 3-47:
```c
T_UT1 = (JD - 2451545.0) / 36525.0
GMST = 67310.54841
+ (876600*3600 + 8640184.812866) * T_UT1
+ 0.093104 * T_UT1^2
- 6.2e-6 * T_UT1^3
```
The result is in seconds of time, converted to radians by multiplying
by pi/43200 and normalized to [0, 2*pi).
We deliberately do NOT use a higher-precision GMST model (e.g., IAU 2000A).
The SGP4 output is only accurate to the precision of its own GMST model;
applying a more precise rotation would not improve the final position and
could introduce a systematic offset.
### Velocity Transform
The velocity transform includes the Earth angular velocity cross product
term, accounting for the rotating reference frame:
```
v_ecef = R(-g) * v_teme + omega_E x r_ecef
```
where omega_E = 7.2921158553e-5 rad/s. In `pass_funcs.c`, the velocity
cross product uses omega_E in rad/min (multiplied by 60.0) because
sat_code outputs velocity in km/min.
### ECEF to Geodetic
The ECEF-to-geodetic conversion uses Bowring's iterative method, which
converges in 2-3 iterations for LEO altitudes:
```
phi_0 = atan2(z, p * (1 - e^2)) [initial estimate]
N = a / sqrt(1 - e^2 * sin^2(phi)) [prime vertical radius]
phi = atan2(z + e^2 * N * sin(phi), p) [iterate]
```
Convergence criterion: 1.0e-12 radians (sub-millimeter at Earth's surface).
Maximum 10 iterations as a safety bound, though LEO cases converge in 2-3.
We chose Bowring's method over Vermeille's or Heikkinen's closed-form
solutions because it is simpler, well-tested, and the iteration cost is
negligible compared to the SGP4 propagation that precedes it.
### Why Duplicated Helpers
`coord_funcs.c` and `pass_funcs.c` each contain their own static copies
of `pg_tle_to_sat_code()`, `gmst_from_jd()`, `teme_to_ecef()`,
`observer_to_ecef()`, and `ecef_to_topocentric()`. All `.c` files link
into a single `.so`, so we could share these through a common object file.
We chose duplication instead because:
1. Each translation unit is self-contained. No hidden coupling between
files through shared internal functions.
2. The functions are small (5-20 lines each). The binary size increase
is negligible.
3. The compiler can inline them within each translation unit.
4. If the helpers ever need to diverge (e.g., pass_funcs using km/min
velocity while coord_funcs uses km/s), they can do so without
affecting each other.
## 5. Pass Prediction Algorithm
### The Core Problem
Given a TLE and a ground observer, find all time intervals where the
satellite is above the observer's local horizon. SGP4 has no closed-form
inverse for the elevation function -- you cannot algebraically solve
"at what time does elevation = 0?" You have to evaluate the function
at many points and search for zero crossings.
This is the same fundamental approach used by Skyfield's `find_events()`
and Gpredict's pass finder.
### Algorithm
1. **Coarse scan**: Step through the search window at 30-second intervals,
evaluating elevation at each step. A 30-second step is fine enough
for LEO (~90 min period, ~10 min pass duration) that no pass shorter
than about 1 minute gets skipped, and coarse enough that a 7-day
window requires ~20,000 propagation calls (fast on modern hardware).
2. **AOS detection**: When elevation transitions from negative to
positive between consecutive coarse steps, a rising edge has been
found. Bisect the interval to locate AOS to within 0.1 second
tolerance (BISECT_TOL_JD = 0.1/86400).
3. **LOS detection**: Continue the coarse scan from AOS until elevation
goes negative again. Bisect to refine LOS to the same 0.1 second
tolerance.
4. **Peak elevation**: Between AOS and LOS, the elevation function is
unimodal (one peak). Use ternary search (50 iterations) to locate
the maximum. Ternary search is chosen over golden section because
the convergence rate is adequate and the implementation is simpler.
5. **Degenerate pass filter**: Passes shorter than 10 seconds
(MIN_PASS_DURATION_JD) are discarded. These are typically numerical
noise from orbits just barely grazing the horizon.
6. **Minimum elevation filter**: Passes whose peak elevation falls below
the caller-specified threshold are silently skipped. The scan resumes
from their LOS.
7. **Post-LOS gap**: After finding a complete pass, the scan resumes
1 minute past LOS (POST_LOS_GAP_JD) to avoid re-detecting the
same descending arc.
### Error Handling in the Scan
If SGP4 returns a hard error (negative semi-major axis, nearly parabolic,
convergence failure) during the elevation scan, `elevation_at_jd()`
returns -pi radians. This is well below any real horizon elevation, so
the scan treats the point as "below horizon" and continues. The pass
finder does NOT abort the query on propagation errors during scanning --
a TLE might be valid for part of the search window and decayed for
the rest.
### SRF (Set-Returning Function) Pattern
`predict_passes()` uses PostgreSQL's ValuePerCall SRF protocol:
- First call: allocate a `predict_passes_ctx` struct in
`funcctx->multi_call_memory_ctx`, copy the TLE and observer into it,
initialize the scan position.
- Each subsequent call: call `find_next_pass()` starting from the
current scan position. If a pass is found, advance `current_jd`
past LOS and return the pass. If not, return DONE.
The TLE and observer are copied into the multi_call context because
the original function arguments may be freed between calls.
## 6. GiST Index Architecture
### The Indexing Problem
The natural query pattern for conjunction screening is: "which TLEs
have orbits that could intersect with this TLE?" Full conjunction
analysis requires propagating both objects and computing distance at
every timestep -- far too expensive for an index scan.
### Altitude-Band Approximation
Every orbit has a perigee altitude and an apogee altitude. Two orbits
can only come close to each other if their altitude ranges overlap.
This is a **necessary but not sufficient** condition for conjunction.
The GiST index stores [perigee_km, apogee_km] as its internal key
(`tle_alt_range` struct: two doubles, 16 bytes). This is derived from
the mean elements via Kepler's third law:
```
a = (KE / n)^(2/3) [earth radii, WGS-72]
perigee_km = a * (1 - e) * AE - AE
apogee_km = a * (1 + e) * AE - AE
```
The altitude band uses WGS-72 constants because n and e are WGS-72
mean elements.
### GiST Support Functions
| Function | Strategy | Description |
|----------|----------|-------------|
| compress | -- | Leaf: extract [perigee, apogee] from pg_tle. Internal: pass through (already a range). |
| decompress | -- | Identity. We operate on compressed keys directly. |
| consistent | `&&` (3), `@>` (7), `<@` (8) | Does the subtree's bounding range overlap/contain the query range? |
| union | -- | [min(low), max(high)] over all entries in a node. |
| penalty | -- | Increase in altitude span (km) from expanding the node's range to include the new entry. |
| picksplit | -- | Sort entries by altitude midpoint, split at the median. |
| same | -- | Endpoint comparison within 1e-9 km tolerance. |
| distance | `<->` (15) | Minimum gap between altitude bands (0 if overlapping). Drives KNN queries. |
### Always Recheck
`gist_tle_consistent()` always sets `*recheck = true`. Altitude band
overlap eliminates the vast majority of non-candidates (a LEO satellite
cannot conjunct with a GEO satellite), but it cannot confirm conjunction.
Two objects at the same altitude but on opposite sides of the earth are
not close. After the index scan filters candidates, the query must
propagate both TLEs to the reference time and compute actual distance.
### Picksplit Strategy
The picksplit function sorts entries by the midpoint of their altitude
band ((perigee + apogee) / 2) and splits at the median. This is optimal
for a 1-D range index because it minimizes overlap between the two
resulting subtrees. Orbits at similar altitudes end up in the same
subtree, which matches the access pattern of conjunction screening
(you are usually querying within a narrow altitude band).
## 7. Memory Management
### Allocation Strategy
All heap allocation goes through `palloc()` / `pfree()`. No `malloc()`,
no `new`, no static buffers.
- **Single-shot propagation** (`sgp4_propagate`, `tle_distance`):
`palloc(sizeof(double) * N_SAT_PARAMS)` for the params array,
propagate, `pfree(params)`. The params array lives in the current
memory context and is freed before the function returns.
- **Set-returning functions** (`sgp4_propagate_series`, `ground_track`,
`predict_passes`): The SRF context struct (containing `tle_t`,
`params[N_SAT_PARAMS]`, and scan state) is allocated in
`funcctx->multi_call_memory_ctx`. This context survives across calls
and is freed automatically when the SRF completes.
- **Type I/O functions** (`tle_in`, `eci_in`, etc.): The result struct
is allocated with `palloc()` in the current context. PostgreSQL
manages its lifecycle.
### No Global Mutable State
For v0.1.0/v0.2.0 functions, there are no file-scope variables, no
static locals that accumulate state, no caches. Every function computes
from its arguments alone.
The v0.3.0 DE ephemeris layer introduces per-backend static state in
`eph_provider.c` (file descriptor, coefficient cache, init flags). This
is safe because each backend gets its own copy after fork(). The handle
is cleaned up via `on_proc_exit()`. All pg_orbit functions remain
`PARALLEL SAFE` -- parallel workers each open their own DE handle
independently.
sat_code itself has no global mutable state. The propagator state is
entirely in the `params[]` array and the `tle_t` struct, both of which
are caller-provided.
### SRF Context Layout
For `sgp4_propagate_series` and `ground_track`, the params array is
embedded directly in the context struct (not a separate allocation):
```c
typedef struct {
tle_t sat;
double params[N_SAT_PARAMS]; /* ~92 doubles, embedded */
int is_deep;
double epoch_jd;
int64 start_ts;
int64 step_usec;
} propagate_series_ctx;
```
This puts everything in a single allocation, reducing palloc overhead
and keeping the data contiguous in memory for cache locality during
the per-row propagation loop.
## 8. Error Handling
### Error Classification
sat_code returns integer error codes from SGP4() and SDP4():
| Code | Constant | Severity | Meaning | pg_orbit Response |
|------|----------|----------|---------|-------------------|
| 0 | -- | OK | Normal | Return result |
| -1 | SXPX_ERR_NEARLY_PARABOLIC | Fatal | eccentricity >= 1 | `ereport(ERROR)` |
| -2 | SXPX_ERR_NEGATIVE_MAJOR_AXIS | Fatal | Decayed orbit | `ereport(ERROR)` |
| -3 | SXPX_WARN_ORBIT_WITHIN_EARTH | Warning | Entire orbit below surface | `ereport(NOTICE)`, return result |
| -4 | SXPX_WARN_PERIGEE_WITHIN_EARTH | Warning | Perigee below surface | `ereport(NOTICE)`, return result |
| -5 | SXPX_ERR_NEGATIVE_XN | Fatal | Negative mean motion | `ereport(ERROR)` |
| -6 | SXPX_ERR_CONVERGENCE_FAIL | Fatal | Kepler equation diverged | `ereport(ERROR)` |
The distinction between warnings (-3, -4) and errors (-1, -2, -5, -6) is
physical. A satellite with perigee within the earth is plausible during
reentry or shortly after launch -- the state vector is still mathematically
valid. A negative semi-major axis means the orbit has decayed past the
point where the model produces meaningful output.
### Error Handling in Different Contexts
**Direct propagation** (`sgp4_propagate`, `tle_distance`): Fatal errors
abort the query via `ereport(ERROR)`. Warnings emit a `NOTICE` and
return the result.
**Pass prediction** (`elevation_at_jd` in `pass_funcs.c`): Fatal errors
return -pi elevation instead of aborting. The scan treats this as "below
horizon" and continues. A TLE might be valid for the first 3 days of a
7-day search window and then decay -- the pass finder should return the
valid passes, not abort.
**SRF propagation** (`sgp4_propagate_series`): Fatal errors abort the
entire series. There is no meaningful way to "skip" a bad timestep in a
continuous time series.
### Input Validation
TLE parsing errors from `parse_elements()` are caught in `tle_in()` and
reported as `ERRCODE_INVALID_TEXT_REPRESENTATION`. Invalid TLEs never
make it into storage.
`select_ephemeris()` returning -1 (invalid mean motion or eccentricity
range) is caught at propagation time, not at storage time. A TLE with
marginal elements might parse correctly but fail when you try to
initialize the propagator.
## 9. Theory-to-Code Mapping
This table maps key equations from the SGP4 theory papers to their
implementation in pg_orbit and the vendored SGP4 code.
| Theory | Paper | What | Code Location |
|--------|-------|------|---------------|
| Mean element recovery | Brouwer (1959) | Recover original mean motion (xnodp) and semi-major axis (aodp) from input TLE elements, removing secular J2 perturbations | `src/sgp4/common.c:sxpall_common_init()` lines 17-35 |
| Secular perturbations | Lane & Cranford (1969), Hoots & Roehrich STR#3 | Secular rates of mean anomaly, argument of perigee, and RAAN due to J2, J4 | `src/sgp4/common.c:sxpx_common_init()` lines 86-101 |
| Atmospheric drag | Hoots & Roehrich STR#3 | B* formulation of drag, C1/C2/C4 coefficients, perigee-dependent s parameter | `src/sgp4/common.c:sxpx_common_init()` lines 47-84; `src/sgp4/sgp4.c:SGP4_init()` |
| Short-period perturbations | Lane & Cranford (1969), Brouwer (1959) | Oscillatory corrections to radius, argument of latitude, node, and inclination | `src/sgp4/common.c:sxpx_posn_vel()` lines 121-229 |
| Kepler equation | Classical | Newton-Raphson with second-order correction, bounded first step | `src/sgp4/common.c:sxpx_posn_vel()` lines 175-208 |
| Deep-space resonance | Hujsak (1979) | Lunar and solar gravitational perturbations, geopotential resonance for 12-hour and 24-hour orbits | `src/sgp4/deep.c:Deep_dpinit()`, `Deep_dpsec()`, `Deep_dpper()` |
| Near-earth propagation | Hoots & Roehrich STR#3 | SGP4 main loop: secular + short-period + drag terms | `src/sgp4/sgp4.c:SGP4()` |
| Deep-space propagation | Hoots & Roehrich STR#3 | SDP4: SGP4 core + deep-space secular/periodic corrections | `src/sgp4/sdp4.c:SDP4()` |
| Semi-major axis from n | Kepler's third law | a = (KE / n)^(2/3) in earth radii | `src/tle_type.c:tle_perigee()` line 415; `src/gist_tle.c:tle_to_alt_range()` line 76 |
| GMST | Vallado (2013) Eq. 3-47 | Greenwich Mean Sidereal Time from Julian date | `src/coord_funcs.c:gmst_from_jd()` lines 59-73; `src/pass_funcs.c:gmst_from_jd()` lines 129-151 |
| TEME to ECEF | Vallado (2013) | Z-axis rotation by -GMST, velocity cross-product correction | `src/coord_funcs.c:teme_to_ecef()` lines 83-103; `src/pass_funcs.c:teme_to_ecef()` lines 157-179 |
| Geodetic from ECEF | Bowring (1976) | Iterative latitude from ECEF Cartesian coordinates on WGS-84 | `src/coord_funcs.c:ecef_to_geodetic()` lines 109-138 |
| Topocentric transform | Standard SEZ | ECEF range vector rotated to South-East-Zenith, azimuth from north | `src/coord_funcs.c:ecef_to_topocentric()` lines 163-188 |
| Observer to ECEF | Geodesy standard | WGS-84 ellipsoid surface point to Cartesian | `src/coord_funcs.c:observer_to_ecef()` lines 143-156 |
| Range rate | Dot product | Projection of relative velocity onto line-of-sight unit vector | `src/coord_funcs.c:eci_to_topocentric()` line 618 |
| Near/deep selection | Hoots & Roehrich STR#3 | Period threshold: 225 minutes (n < 2*pi/225 rad/min) | `src/sgp4/norad.h:select_ephemeris()` |
## 10. JPL DE Ephemeris Architecture (v0.3.0)
v0.3.0 adds optional JPL DE440/441 ephemeris support (~0.1 milliarcsecond
accuracy) without modifying any existing VSOP87/ELP82B code path. This
section documents the architectural decisions specific to DE integration.
### The Fundamental Tension
pg_orbit's core properties (compiled-in coefficients, no file I/O, no
mutable state) are precisely what DE441 challenges. A ~3GB binary file
introduces file dependency, per-backend state (file descriptor,
coefficient cache), and OS-level file descriptor management across
PostgreSQL's multi-process model.
The architecture resolves this by treating DE as an **additive layer**:
new `_de()` function variants alongside existing functions. Existing
functions are untouched. VSOP87 is the bulkhead -- always compiled in,
always works, always IMMUTABLE.
### Why Separate Functions (Not a GUC Switch)
Considered and rejected: a single `planet_observe()` that dispatches to
DE or VSOP87 based on a GUC setting.
The problem is volatility. VSOP87 functions are IMMUTABLE -- their data
is compiled into the binary. PostgreSQL can:
- Cache results in expression indexes (almanac tables, pork chop grids)
- Constant-fold during planning (bake results into prepared statements)
If `planet_observe()` dispatched to DE based on a GUC, it would need to
be STABLE, losing these optimizations for ALL users, even those without
a DE file. Separate `_de()` variants let VSOP87 users keep IMMUTABLE
and DE users get STABLE -- honest volatility for both.
### Clean-Room DE Reader Design
The JPL DE binary format (used by DE405, DE430, DE440, DE441) consists
of fixed-size records of `double` values:
- **Record 1 (header):** Start/end JD, record interval, number of
coefficients, AU value, Earth-Moon mass ratio (EMRAT), coefficient
layout table (3 values per body: offset, ncoeff, nsub)
- **Record 2:** Constant values
- **Records 3+:** Chebyshev polynomial coefficients covering a time
interval for all 13 body groups
The reader is implemented in ~250 lines of C (`de_reader.c`), using:
- Raw POSIX I/O (`open`/`lseek`/`read`, not `fread`) to avoid libc
buffering issues across fork
- Clenshaw recurrence for Chebyshev evaluation (~15 lines)
- Single-record coefficient cache (most queries hit consecutive times)
#### Why Not jpl_eph
Bill Gray's `jpl_eph` (GPL-2+) would be the obvious choice, but:
1. GPL-2+ license constrains pg_orbit's licensing flexibility
2. Uses global statics (`static int init_err_code`)
3. Written in C++ (`jpleph.cpp`); pg_orbit is pure C
4. We only need position queries, not velocity or nutation
The format is well-documented and the algorithm is straightforward.
A clean-room implementation in ~250 lines avoids all four issues.
### Per-Backend Lazy Initialization
PostgreSQL parallel workers are forked from the **postmaster, not the
leader**. They don't inherit the leader's file descriptors or initialized
handles.
Each backend (including parallel workers) gets its own independently-
opened handle via lazy init on first `_de()` call:
```
Backend A calls planet_observe_de() -> opens own fd, own cache
Worker W1 calls planet_observe_de() -> opens own fd, own cache
Worker W2 calls planet_observe_de() -> opens own fd, own cache
```
Static per-backend variables (`de_handle_ptr`, `de_init_attempted`) are
safe because after `fork()`, each process gets its own copy-on-write
address space.
**Critical rule:** Never open the DE file in `_PG_init()`. That runs in
the postmaster, and the file descriptor would be inherited by all children
with undefined stream behavior. Always lazy init.
Cleanup is via `on_proc_exit(eph_cleanup, 0)`, registered in `_PG_init()`.
### ICRS-to-Ecliptic Frame Rotation
DE ephemerides return positions in the ICRS equatorial frame. The
pg_orbit observation pipeline expects ecliptic J2000. The conversion
happens at the provider boundary in `eph_provider.c`:
```
DE position (ICRS equatorial) -> equatorial_to_ecliptic() -> ecliptic J2000
^
already in astro_math.h since v0.2.0
```
This rotation is applied to BOTH the target body and Earth positions
before they enter `observe_from_geocentric()`. The frame conversion
happens exactly once, at the provider boundary.
### Earth Position from DE
DE ephemerides don't store Earth directly. They store the Earth-Moon
Barycenter (EMB, body 3) and the Moon (body 10). Earth's position is
derived:
```
Earth = EMB - Moon / (1 + EMRAT)
```
where EMRAT (~81.300587) is the Earth-Moon mass ratio from the DE
header. This calculation happens in `eph_provider.c:de_get_earth_helio_ecliptic()`.
### Constant Chain of Custody Extension
Three new rules for the DE pipeline:
| Rule | What | Why |
|------|------|-----|
| 6 | DE positions pass through `equatorial_to_ecliptic()` at the provider boundary | DE returns ICRS equatorial; the observation pipeline expects ecliptic J2000 |
| 7 | Both target and Earth must come from the same provider | Mixing DE Mars with VSOP87 Earth introduces frame inconsistency |
| 8 | DE header AU must match compiled-in `AU_KM` (149597870.7) | AU defines the length scale; a mismatch corrupts all distance calculations |
### Fallback Strategy
Every `_de()` function follows this pattern:
1. Try to get position from DE
2. If DE succeeds, use DE result
3. If DE fails (no file, JD out of range, I/O error):
- If DE was explicitly configured (GUC set), emit `ereport(NOTICE)`
- Fall back to VSOP87/ELP82B equivalent
- Return the VSOP87 result
When `pg_orbit.ephemeris_path` is empty (default), DE functions fall
back silently -- no NOTICE, no overhead, identical results to the
non-DE variants.
### Lauren Bugs (DE-Specific)
| Bug | What "Can't Happen" | How It Happens | Mitigation |
|-----|---------------------|----------------|------------|
| L1 | File won't change mid-query | DBA replaces file; new parallel worker opens new version | Per-backend handle is stable for backend lifetime |
| L2 | File will always be there | Docker restart with ephemeral volume | VSOP87 fallback always available |
| L3 | File is always valid | Partial download, bit rot, wrong endianness | Canary validation: Earth at J2000.0 ~1 AU from Sun; byte-order detection via AU constant |
| L4 | All backends use same version | Primary has DE441, replica has DE440 | AU consistency check; log ephemeris version at NOTICE |
| L5 | read() always succeeds | NFS timeout, disk error, fd limit | Every `read()` return checked; propagate as `ereport(ERROR)` |