API Reference
Python API
KernelDB
from spice_kernel_db import KernelDB
The main class. Wraps a DuckDB database and provides all scanning, lookup, and rewriting operations.
Constructor
KernelDB(db_path: str | Path | None = None, read_only: bool = False)
Opens (or creates) the database at the given path. When db_path is None (the default), the path is loaded from ~/.config/spice-kernel-db/config.toml. The schema is initialized automatically (unless read_only=True).
Set read_only=True for commands that only query the database — this allows concurrent access while another process holds a write lock.
Mission management
add_mission
db.add_mission(
name: str,
server_url: str,
mk_dir_url: str,
dedup: bool = True,
) -> None
Register a mission. If it already exists, replaces the existing entry.
list_missions
db.list_missions() -> list[dict]
List all configured missions. Returns dicts with keys: name, server_url, server_label, mk_dir_url, dedup, added_at.
remove_mission
db.remove_mission(name: str) -> bool
Remove a mission. Returns True if found and removed, False if not found.
get_mission
db.get_mission(name: str) -> dict | None
Look up a mission by name (case-insensitive). Returns a dict with the same keys as list_missions, or None.
Scanning
scan_directory
db.scan_directory(
root: str | Path,
mission: str | None = None,
extensions: set[str] | None = None,
verbose: bool = False,
archive_dir: str | Path | None = None,
) -> tuple[int, set[str]]
Recursively scan a directory tree and register all kernel files. Returns (count, missions_found): the number of files registered and the set of detected mission names.
If archive_dir is provided, files are moved into {archive_dir}/{mission}/{type}/{filename} and symlinks are left at the original locations.
If mission is not provided, it’s auto-detected from the path structure (looks for .../<MISSION>/kernels/...).
extensions defaults to all known SPICE kernel extensions (.tls, .tpc, .bsp, .bc, .tf, .ti, .tsc, .bds, .tm).
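As a rough illustration of the two path conventions above, the following sketch detects a mission from a .../<MISSION>/kernels/... path and builds the {archive_dir}/{mission}/{type}/{filename} destination. The helpers detect_mission and archive_destination are hypothetical names for this example and are not part of spice_kernel_db; how the library derives {type} is not described here, so the sketch simply takes it as an argument.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def detect_mission(path: str) -> str | None:
    """Return the path component directly before 'kernels', if there is one."""
    parts = PurePosixPath(path).parts
    for i, part in enumerate(parts):
        # Skip the case where 'kernels' sits directly under the root.
        if part == "kernels" and i > 0 and parts[i - 1] != "/":
            return parts[i - 1]
    return None


def archive_destination(
    archive_dir: str, mission: str, kernel_type: str, filename: str
) -> str:
    """Build {archive_dir}/{mission}/{type}/{filename}."""
    return str(PurePosixPath(archive_dir) / mission / kernel_type / filename)


# Hypothetical example path, for illustration only.
src = "/data/naif/JUICE/kernels/spk/example.bsp"
mission = detect_mission(src)
dest = archive_destination("/archive", mission, "spk", "example.bsp")
```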
register_file
db.register_file(
path: str | Path,
mission: str | None = None,
source_url: str | None = None,
archive_dir: str | Path | None = None,
expected_hash: str | None = None,
) -> str
Register a single kernel file. Returns its SHA-256 hash.
If archive_dir is provided, the file is moved into the archive and a symlink is left at the original location. If expected_hash is provided, the computed hash is verified before storing — raises ValueError on mismatch.
If the file’s content already exists in the database under a different filename, the new location is recorded pointing to the same hash. This is logged at the INFO level.
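The expected_hash check can be pictured with a small standard-library sketch. It is not the library's code: sha256_of and verify_hash are hypothetical helpers, and the only behavior taken from the description above is that a mismatch raises ValueError.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large kernels are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_hash(path: Path, expected_hash: str | None = None) -> str:
    """Return the SHA-256 of path, raising ValueError if it differs from expected_hash."""
    actual = sha256_of(path)
    if expected_hash is not None and actual != expected_hash.lower():
        raise ValueError(
            f"hash mismatch for {path}: expected {expected_hash}, got {actual}"
        )
    return actual


with tempfile.TemporaryDirectory() as tmp:
    kernel = Path(tmp) / "example.tls"
    kernel.write_bytes(b"KPL/LSK\n")
    digest = verify_hash(kernel)
    try:
        verify_hash(kernel, expected_hash="0" * 64)
        mismatch_raised = False
    except ValueError:
        mismatch_raised = True
```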
Lookup
find_by_filename
db.find_by_filename(filename: str) -> list[dict]
Find all locations for a kernel by filename (basename). Checks both the canonical filename in the kernels table and the actual filename in locations.abs_path.
Returns a list of dicts with keys: sha256, abs_path, mission, kernel_type, size_bytes.
find_by_hash
db.find_by_hash(sha256: str) -> list[dict]
Find all locations for a kernel by its SHA-256 hash.
Returns a list of dicts with keys: abs_path, mission, filename.
resolve_kernel
db.resolve_kernel(
filename: str,
preferred_mission: str | None = None,
) -> tuple[str | None, list[str]]
Resolve a kernel filename to an absolute path on disk.
Returns (path, warnings) where path is the resolved absolute path (or None if not found) and warnings is a list of human-readable strings about any fallback decisions.
Resolution priority (see Mission-aware resolution):
- Exact filename match in preferred_mission → no warning
- Exact filename match in any mission → warning
- Path-suffix match (file on disk registered under different name) → warning
- Not found → (None, []) + hint to run scan
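The priority order above can be sketched over an in-memory list of records. This is an illustration, not the library's implementation: the resolve function, the record layout, and the warning texts are all assumptions, and the path-suffix step is interpreted here as "the basename on disk matches even though the registered name differs".

```python
from __future__ import annotations


def resolve(
    filename: str,
    records: list[dict],
    preferred_mission: str | None = None,
) -> tuple[str | None, list[str]]:
    """Resolve filename using the four-step priority described above."""
    exact = [r for r in records if r["filename"] == filename]

    # 1. Exact filename match in the preferred mission: no warning.
    if preferred_mission is not None:
        for r in exact:
            if r["mission"] == preferred_mission:
                return r["abs_path"], []

    # 2. Exact filename match in any mission: warn about the fallback.
    if exact:
        r = exact[0]
        return r["abs_path"], [f"{filename}: using copy from mission {r['mission']}"]

    # 3. Path-suffix match: the file on disk carries this name,
    #    but it was registered under a different one.
    for r in records:
        if r["abs_path"].endswith("/" + filename):
            return r["abs_path"], [
                f"{filename}: matched by path suffix (registered as {r['filename']})"
            ]

    # 4. Not found.
    return None, []


# Hypothetical records, for illustration only.
records = [
    {"filename": "naif0012.tls", "mission": "generic",
     "abs_path": "/k/generic/lsk/naif0012.tls"},
    {"filename": "naif0012.tls", "mission": "JUICE",
     "abs_path": "/k/JUICE/lsk/naif0012.tls"},
    {"filename": "first_name.bsp", "mission": "generic",
     "abs_path": "/k/generic/spk/second_name.bsp"},
]

preferred = resolve("naif0012.tls", records, preferred_mission="JUICE")
fallback = resolve("naif0012.tls", records, preferred_mission="MRO")
suffix = resolve("second_name.bsp", records)
missing = resolve("missing.bsp", records)
```

Callers would typically print the warnings list whenever it is non-empty, since it records that a fallback was taken.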
aliases
db.aliases(name_or_hash: str) -> dict | None
Follow the deduplication trail for a kernel. name_or_hash may be a full or partial (≥6-char) SHA-256, the canonical filename, or any alias filename the content has been registered under.
Returns None if nothing matches, otherwise a dict:
{
"sha256": str,
"canonical": str, # first-seen filename (kernels.filename)
"kernel_type": str,
"size_bytes": int,
"superseded": bool,
"aliases": list[str], # every filename this content is known under
"locations": list[dict], # {"abs_path", "mission", "source_url"}
}
Aliases are derived from locations, not stored separately. The read-only view kernel_aliases (sha256, filename) exposes every distinct on-disk basename each content hash appears under, so a deduplicated kernel (identical bytes under several names) can be followed with no extra bookkeeping. The trade-off is that a name is forgotten once its file is pruned.
The companion alias_counts(shas: list[str]) -> dict[str, int] maps each hash to its number of known names, and alias_counts_by_name(filenames: list[str]) -> dict[str, int] maps each filename to how many other names its content is known under (used to render the (+N aliases) annotation in kernel listings).
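To make "derived from locations" concrete, here is a minimal sketch that groups on-disk basenames by content hash. The rows and the derive_aliases helper are hypothetical; the real view is defined in the database, and its exact SQL is not shown in this reference.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical location rows, for illustration only.
locations = [
    {"sha256": "aaa111", "abs_path": "/k/generic/lsk/naif0012.tls"},
    {"sha256": "aaa111", "abs_path": "/k/JUICE/lsk/naif0012.tls"},
    {"sha256": "aaa111", "abs_path": "/k/MRO/lsk/leapseconds.tls"},
    {"sha256": "bbb222", "abs_path": "/k/generic/pck/pck00011.tpc"},
]


def derive_aliases(rows: list[dict]) -> dict[str, list[str]]:
    """Map each content hash to the distinct basenames it appears under."""
    names: dict[str, set[str]] = defaultdict(set)
    for row in rows:
        names[row["sha256"]].add(PurePosixPath(row["abs_path"]).name)
    return {sha: sorted(found) for sha, found in names.items()}


aliases = derive_aliases(locations)
counts = {sha: len(found) for sha, found in aliases.items()}

# Dropping the MRO row (as a prune would) also drops the alias name.
after_prune = derive_aliases(locations[:2] + locations[3:])
```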
Metakernel operations
check_metakernel
db.check_metakernel(
mk_path: str | Path,
mission: str | None = None,
verbose: bool = False,
) -> dict
Check which kernels from a metakernel are available locally.
Returns a dict with keys:
- found: list of (raw_entry, local_path) tuples
- missing: list of raw_entry strings
- warnings: list of warning strings
rewrite_metakernel
db.rewrite_metakernel(
mk_path: str | Path,
output: str | Path,
*,
mission: str | None = None,
link_root: str | Path | None = None,
) -> tuple[Path, list[str]]
Rewrite a metakernel for local use with minimal edits.
Creates a symlink tree at link_root (default: kernels/ next to the output file) that mirrors the original directory structure. Only PATH_VALUES in the output .tm file is changed — KERNELS_TO_LOAD, PATH_SYMBOLS, and all comments are preserved.
Returns (output_path, warnings).
See Minimal metakernel edits for the design rationale.
get_metakernel
db.get_metakernel(
url: str,
download_dir: str | Path | None = None,
mission: str | None = None,
yes: bool = False,
force: bool = False,
) -> dict
Fetch a remote metakernel, display a status table, and download missing kernels.
- Fetches and parses the remote .tm file
- Resolves kernel entries to full URLs (using PATH_VALUES relative to the metakernel URL)
- Checks each kernel against the local database
- Queries remote file sizes via parallel HTTP HEAD requests
- Displays a summary table with kernel names, sizes, and status (in db / missing)
- Prompts for confirmation (unless yes=True)
- Downloads missing kernels, preserving subdirectory structure, and registers them
- Creates symlinks for kernels already in the database so the metakernel works immediately
download_dir defaults to the configured kernel_dir from ~/.config/spice-kernel-db/config.toml. Downloaded files are placed under <download_dir>/<mission>/<relpath>.
Returns a dict with keys: found, missing, downloaded, warnings.
update_metakernel
db.update_metakernel(
mk_path_or_name: str | Path,
mission: str | None = None,
download_dir: str | Path | None = None,
yes: bool = False,
force: bool = False,
) -> dict
Re-fetch a metakernel from its source URL and download new or missing kernels. Looks up the source URL from the metakernel_registry, falling back to the mission’s mk_dir_url + filename if no source URL is stored.
Raises LookupError if the metakernel is not found in the registry or was added via scan without a source URL. Raises MetakernelUnreachableError (a LookupError subclass) when the remote returns 403/404/410 — typically because NAIF rotated the file into former_versions/; use prune_metakernels() to clean up the stale registry row.
After downloading, automatically rescans the kernel directories referenced by the metakernel so newly downloaded files are immediately indexed.
Returns the same dict as get_metakernel.
verify_metakernel
db.verify_metakernel(
mk_path: str | Path,
*,
deep: bool = False,
) -> dict
Deeply cross-check a metakernel against the database. Each entry in KERNELS_TO_LOAD is substituted through PATH_VALUES/PATH_SYMBOLS to an absolute path and validated against the kernels row whose filename matches (case-insensitive).
Checks: traversal escape from PATH_VALUES root, dangling symlinks, missing files, size mismatch, SHA-256 mismatch (only in deep mode), AMBIGUOUS resolution (multiple non-superseded kernels rows share the filename — see superseded_by semantics), unregistered files, and non-absolute PATH_VALUES.
Returns a dict with keys entries, ok, fail, fatal, mk_path, deep. Each entry is a dict with raw, resolved, status, detail. fatal is True if any entry is in a P0 state (e.g. TRAVERSAL, HASH_MISMATCH, AMBIGUOUS).
This is the recommended cross-check after every get/update/rewrite. Run with deep=True periodically to catch content corruption that preserves size.
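One of the checks listed above, traversal escape from the PATH_VALUES root, can be illustrated with pathlib. This is a sketch of the general technique only; escapes_root is a hypothetical helper, and the library's actual check may differ in detail.

```python
from pathlib import Path


def escapes_root(root: str, entry: str) -> bool:
    """Return True if entry, joined onto root, resolves outside root."""
    root_path = Path(root).resolve()
    # resolve() normalizes '..' components and follows existing symlinks.
    candidate = (root_path / entry).resolve()
    return not candidate.is_relative_to(root_path)  # Python 3.9+


inside = escapes_root("/srv/kernels", "lsk/naif0012.tls")
outside = escapes_root("/srv/kernels", "../../etc/passwd")
absolute = escapes_root("/srv/kernels", "/etc/passwd")
```

An entry that escapes the root would correspond to the TRAVERSAL status mentioned above.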
prune_orphan_symlinks
db.prune_orphan_symlinks(dry_run: bool = True) -> list[str]
Walk every mission’s download tree and find symlinks whose target no longer exists. The tree root is derived from metakernel_registry.mk_path as the parent of the mk/ directory. Dangling symlinks accumulate when an underlying kernel store moves, or after the default prune removes a locations row: the symlinks pointing at that file are not themselves tracked in locations, so they survive as junk.
With dry_run=False, every dangling symlink is unlinked. Healthy symlinks and regular files are left alone.
Returns a list of absolute paths that were (or would be) removed.
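Detecting a dangling symlink needs only the standard library: os.path.islink is true for any symlink, while os.path.exists follows the link and is false when the target is gone. The sketch below assumes a POSIX filesystem, and find_dangling_symlinks is a hypothetical helper rather than the library's implementation.

```python
import os
import tempfile


def find_dangling_symlinks(root: str) -> list[str]:
    """Return symlinks under root whose target no longer exists."""
    dangling = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) and not os.path.exists(path):
                dangling.append(path)
    return sorted(dangling)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "real.bsp")
    open(target, "wb").close()
    os.symlink(target, os.path.join(tmp, "healthy.bsp"))
    os.symlink(os.path.join(tmp, "gone.bsp"), os.path.join(tmp, "orphan.bsp"))

    # Equivalent of a dry run: only report.
    found = [os.path.basename(p) for p in find_dangling_symlinks(tmp)]

    # Equivalent of dry_run=False: unlink what was reported.
    for path in find_dangling_symlinks(tmp):
        os.unlink(path)
    remaining = sorted(os.listdir(tmp))
```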
prune_metakernels
db.prune_metakernels(
dry_run: bool = True,
delete_files: bool = False,
timeout: float = 10.0,
) -> list[dict]
Find rows in metakernel_registry whose remote URL returns a permanent HTTP error (403/404/410) and optionally remove them. Sends a HEAD request to each URL with the given timeout.
The probe URL is derived per row: explicit source_url if set, otherwise mission.mk_dir_url + filename (same fallback as update_metakernel). Rows with neither — typically locally-created metakernels with no upstream — are noted in the output and skipped (use mk --remove <name> to drop them manually).
Transient failures (timeouts, DNS errors, 5xx) are deliberately never classified as dead — leaving a registry row stale is always safer than deleting on a network blip.
With dry_run=False, the matching rows are removed from metakernel_registry and metakernel_entries. With delete_files=True, the on-disk .tm file is also unlinked; symlink trees under the mission’s download directory are shared and never auto-cleaned.
Returns a list of dicts, one per dead metakernel: {'mk_path', 'filename', 'mission', 'source_url', 'status_code'}.
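The two rules described above (how the probe URL is chosen, and which outcomes count as dead) are small enough to sketch directly. The helper names probe_url and is_dead are hypothetical, and representing "no HTTP response at all" as None is an assumption of this example.

```python
from __future__ import annotations

# Status codes treated as permanent, per the description above.
PERMANENT_STATUSES = frozenset({403, 404, 410})


def probe_url(
    source_url: str | None, mk_dir_url: str | None, filename: str
) -> str | None:
    """Prefer an explicit source_url, else fall back to mk_dir_url + filename."""
    if source_url:
        return source_url
    if mk_dir_url:
        return mk_dir_url.rstrip("/") + "/" + filename
    return None  # no upstream known: such rows are skipped


def is_dead(status_code: int | None) -> bool:
    """Only permanent HTTP errors count; timeouts, DNS errors, and 5xx do not."""
    return status_code in PERMANENT_STATUSES


fallback_url = probe_url(None, "https://example.org/MISSION/kernels/mk/", "example.tm")
```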
index_metakernel
db.index_metakernel(mk_path: str | Path)
Parse a metakernel and store its entries in the metakernel_entries table for future reference.
list_metakernels
db.list_metakernels(mission: str | None = None) -> list[dict]
List all tracked metakernels, optionally filtered by mission. Each entry includes the number of kernel entries.
Alias rows (symlink metakernels created by get) inherit their target’s entry count and content fingerprint, so the listing reads consistently — both rows show the same kernel count and one of them is annotated ↳ identical to <target>.
Prints a summary table and returns a list of dicts with keys: filename, mission, source_url, acquired_at, mk_path, n_kernels, plus identical_to (set when another row has the same content fingerprint).
info_metakernel
db.info_metakernel(name: str) -> dict | None
Show detailed info about a tracked metakernel, looked up by filename. Displays per-kernel status (in db / missing) with kernel type and size.
Returns a dict with keys: filename, mission, source_url, acquired_at, mk_path, kernels, n_kernels, n_in_db, n_missing. Returns None if not found.
browse_remote_metakernels
db.browse_remote_metakernels(
mk_dir_url: str,
mission: str | None = None,
show_versioned: bool = False,
sort_by: str = "name",
filter: str | None = None,
) -> list[dict]
Scan a remote NAIF mk/ directory and show available metakernels. Groups entries by base name (stripping version tags like _v461_20251127_001), counts versioned snapshots, and checks which base metakernels have been locally acquired.
sort_by controls row ordering: "name" (default) sorts alphabetically; "date" sorts by latest remote modification date ascending, so the most recently updated metakernels appear at the bottom.
filter is an optional case-insensitive substring applied to entry filenames before grouping. Useful for narrowing large listings such as the ~1,400-entry JUICE former_versions/ archive.
Prints a summary table and returns a list of dicts with keys: base_name, n_versions, latest_date, is_local, filenames.
Deduplication
report_duplicates
db.report_duplicates(min_copies: int = 2) -> list[dict]
Find and report kernels that exist in multiple locations.
Returns a list of dicts with keys: sha256, filename, size_bytes, count, paths, missions, wasted_bytes.
deduplicate_plan
db.deduplicate_plan() -> list[dict]
Generate a deduplication plan without executing it.
For each set of identical files, selects a canonical copy (preferring the generic mission) and lists the rest as removable. Respects per-mission dedup settings from the missions table.
Returns a list of dicts with keys: filename, size_bytes, keep, remove.
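A plan of this shape can be sketched by grouping locations on their content hash and sorting each group so the preferred mission comes first. This is an illustration under assumptions: the plan function and the rows are hypothetical, the preferred mission is assumed to be named "generic", and the per-mission dedup settings are deliberately ignored to keep the example short.

```python
from collections import defaultdict

# Hypothetical location rows, for illustration only.
locations = [
    {"sha256": "aaa111", "mission": "MRO", "abs_path": "/k/MRO/lsk/naif0012.tls"},
    {"sha256": "aaa111", "mission": "generic", "abs_path": "/k/generic/lsk/naif0012.tls"},
    {"sha256": "aaa111", "mission": "JUICE", "abs_path": "/k/JUICE/lsk/naif0012.tls"},
    {"sha256": "bbb222", "mission": "JUICE", "abs_path": "/k/JUICE/spk/example.bsp"},
]


def plan(rows: list[dict], preferred_mission: str = "generic") -> list[dict]:
    """Pick one copy to keep per content hash and list the others as removable."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for row in rows:
        groups[row["sha256"]].append(row)

    result = []
    for copies in groups.values():
        if len(copies) < 2:
            continue  # a single copy is not a duplicate
        # False sorts before True, so the preferred mission comes first;
        # ties are broken by path to keep the plan deterministic.
        ordered = sorted(
            copies, key=lambda c: (c["mission"] != preferred_mission, c["abs_path"])
        )
        result.append(
            {"keep": ordered[0]["abs_path"],
             "remove": [c["abs_path"] for c in ordered[1:]]}
        )
    return result


dedup_plan = plan(locations)
```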
deduplicate_with_symlinks
db.deduplicate_with_symlinks(dry_run: bool = True) -> list[dict]
Replace duplicate files with symlinks to the canonical copy.
With dry_run=True (the default), only prints what would happen.
Statistics
stats
db.stats() -> dict
Print and return summary statistics.
Returns a dict with keys: n_kernels, n_locations, total_bytes, n_duplicates, missions.
parse_metakernel
from spice_kernel_db import parse_metakernel
parse_metakernel(path: str | Path) -> ParsedMetakernel
Parse a SPICE metakernel (.tm) file from a local path without needing a database.
parse_metakernel_text
from spice_kernel_db import parse_metakernel_text
parse_metakernel_text(text: str, source: str) -> ParsedMetakernel
Parse a SPICE metakernel from text content (e.g. fetched from a remote URL). The source string is stored as the source_path attribute for reference.
ParsedMetakernel
Dataclass with attributes:
| Attribute | Type | Description |
|---|---|---|
| source_path | Path | Absolute path to the source file |
| header | str | Everything before the first \begindata |
| path_values | list[str] | Contents of PATH_VALUES |
| path_symbols | list[str] | Contents of PATH_SYMBOLS |
| kernels | list[str] | Contents of KERNELS_TO_LOAD |
Properties and methods:
| Name | Returns | Description |
|---|---|---|
| symbol_map | dict[str, str] | Mapping from symbol name to path value |
| resolve(entry) | str | Replace $SYMBOLs with their values |
| kernel_filenames() | list[str] | Basenames of all kernels |
| kernel_relpaths() | list[str] | Relative paths with $SYMBOL stripped |
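The relationship between these attributes can be sketched with plain lists. This mirrors the table's descriptions rather than the dataclass's actual code: the sample values are hypothetical, and the pairing of PATH_SYMBOLS with PATH_VALUES by position follows standard SPICE metakernel semantics.

```python
from pathlib import PurePosixPath

# Hypothetical parsed contents, for illustration only.
path_values = ["/data/mission/kernels"]
path_symbols = ["KERNELS"]
kernels = ["$KERNELS/lsk/naif0012.tls", "$KERNELS/spk/example.bsp"]

# symbol_map: the i-th symbol stands for the i-th path value.
symbol_map = dict(zip(path_symbols, path_values))


def resolve(entry: str) -> str:
    """Replace $SYMBOLs with their values (longest symbol first to avoid prefix clashes)."""
    for symbol in sorted(symbol_map, key=len, reverse=True):
        entry = entry.replace("$" + symbol, symbol_map[symbol])
    return entry


def relpath(entry: str) -> str:
    """Strip a leading $SYMBOL/ prefix, leaving the relative path."""
    for symbol in symbol_map:
        prefix = "$" + symbol + "/"
        if entry.startswith(prefix):
            return entry[len(prefix):]
    return entry


resolved = [resolve(k) for k in kernels]
filenames = [PurePosixPath(k).name for k in kernels]
relpaths = [relpath(k) for k in kernels]
```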
Configuration
from spice_kernel_db import Config, ensure_config
ensure_config
ensure_config() -> Config
Load configuration from ~/.config/spice-kernel-db/config.toml. If no config file exists, runs an interactive first-time setup prompting for the database path and kernel storage directory.
Config
Dataclass with attributes:
| Attribute | Type | Default | Description |
|---|---|---|---|
| db_path | str | ~/.local/share/spice-kernel-db/kernels.duckdb | Path to the DuckDB database |
| kernel_dir | str | ~/.local/share/spice-kernel-db/kernels | Default directory for downloaded kernels |
Remote utilities
from spice_kernel_db import fetch_metakernel, resolve_kernel_urls
fetch_metakernel
fetch_metakernel(url: str) -> str
Download and return the text content of a remote metakernel file.
resolve_kernel_urls
resolve_kernel_urls(mk_url: str, parsed: ParsedMetakernel) -> list[str]
Resolve each KERNELS_TO_LOAD entry in a parsed metakernel to a full URL, using PATH_VALUES relative to the metakernel’s own URL.
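Resolving an entry relative to the metakernel's own URL is essentially what urllib.parse.urljoin does. The sketch below assumes a common layout in which PATH_VALUES is '..' (the parent of mk/); the resolve_urls helper and the example.org URL are hypothetical, and the library's own function may handle more cases.

```python
from urllib.parse import urljoin


def resolve_urls(
    mk_url: str,
    path_values: list[str],
    path_symbols: list[str],
    kernels: list[str],
) -> list[str]:
    """Substitute $SYMBOLs, then resolve each entry against the metakernel URL."""
    symbol_map = dict(zip(path_symbols, path_values))
    urls = []
    for entry in kernels:
        for symbol in sorted(symbol_map, key=len, reverse=True):
            entry = entry.replace("$" + symbol, symbol_map[symbol])
        # urljoin resolves '..' against the directory containing the .tm file.
        urls.append(urljoin(mk_url, entry))
    return urls


mk_url = "https://example.org/pub/MISSION/kernels/mk/example.tm"
urls = resolve_urls(mk_url, [".."], ["KERNELS"], ["$KERNELS/lsk/naif0012.tls"])
```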
list_remote_missions
from spice_kernel_db.remote import list_remote_missions
list_remote_missions(server_url: str) -> list[str]
List available mission directories from a SPICE archive server (NASA NAIF or ESA SPICE). Parses the Apache directory listing.
SPICE_SERVERS
from spice_kernel_db.remote import SPICE_SERVERS
Dict mapping server labels to URLs: {"NASA": "https://naif.jpl.nasa.gov/pub/naif/", "ESA": "https://spiftp.esac.esa.int/data/SPICE/"}.
list_remote_metakernels
from spice_kernel_db import list_remote_metakernels
list_remote_metakernels(mk_dir_url: str) -> list[RemoteMetakernel]
Parse an Apache directory listing at mk_dir_url and extract .tm metakernel entries. For each file, computes a base_name by stripping NAIF version tags (e.g. _v461_20251127_001). Returns a list sorted by (base_name, filename).
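Stripping a version tag can be sketched with a regular expression. The pattern below is an assumption inferred from the single example tag _v461_20251127_001 (a v-number, an 8-digit date, and a counter); the library's actual pattern may accept other shapes, and the filenames are hypothetical.

```python
from __future__ import annotations

import re

# Assumed tag shape: _v<digits>_<8-digit date>_<digits> directly before .tm
VERSION_TAG = re.compile(
    r"^(?P<stem>.+?)_(?P<tag>v\d+_\d{8}_\d+)(?P<ext>\.tm)$", re.IGNORECASE
)


def split_version(filename: str) -> tuple[str, str | None]:
    """Return (base_name, version_tag); the tag is None for unversioned files."""
    match = VERSION_TAG.match(filename)
    if match is None:
        return filename, None
    return match["stem"] + match["ext"], match["tag"]


names = ["mission_ops_v461_20251127_001.tm", "mission_ops.tm", "mission_plan.tm"]

# Sort by (base_name, filename), matching the ordering described above.
entries = sorted((split_version(n)[0], n) for n in names)
```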
RemoteMetakernel
Dataclass with attributes:
| Attribute | Type | Description |
|---|---|---|
| filename | str | Original filename from the directory listing |
| url | str | Full URL to the file |
| date | str | Last modified date from the listing |
| size | str | Size string from the listing (e.g. "12K") |
| base_name | str | Filename with version tag stripped |
| version_tag | str \| None | Version tag (e.g. "v461_20251127_001") or None |
probe_mk_candidates
from spice_kernel_db.remote import probe_mk_candidates
probe_mk_candidates(urls: list[str], *, max_workers: int = 8, timeout: float = 5.0) -> list[str]
Send parallel HEAD requests to candidate metakernel-directory URLs and return those that respond successfully, preserving the input order (which encodes priority). Duplicates are collapsed.
discover_mk_url
from spice_kernel_db.remote import discover_mk_url, DEFAULT_ALT_MK_PATHS
discover_mk_url(
server_url: str,
mission: str,
*,
registry_candidates: list[str] | None = None,
extra_paths: tuple[str, ...] = DEFAULT_ALT_MK_PATHS,
timeout: float = 5.0,
) -> list[str]
Discover live metakernel directory URLs for mission. Tries registry_candidates (highest priority) first, then expands the templates in extra_paths (placeholders {server} and {m}). Returns hits in priority order.
DEFAULT_ALT_MK_PATHS contains only {server}{m}/kernels/mk/. An empirical survey of NAIF found no missions using the alternative templates one might expect from naming conventions, so alternates should be added via the curated registry or --mk-dir-url rather than inferred.
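The candidate ordering described above (registry entries first, then expanded templates, duplicates collapsed) can be sketched without any network access. expand_candidates is a hypothetical helper covering only the expansion step; the real function additionally probes each URL and returns only those that respond.

```python
from __future__ import annotations

# The single default template named in the text above.
DEFAULT_TEMPLATE = "{server}{m}/kernels/mk/"


def expand_candidates(
    server_url: str,
    mission: str,
    registry_candidates: list[str] | None = None,
    extra_paths: tuple[str, ...] = (DEFAULT_TEMPLATE,),
) -> list[str]:
    """Return candidate URLs in priority order, without probing them."""
    server = server_url if server_url.endswith("/") else server_url + "/"
    ordered = list(registry_candidates or [])
    ordered += [tpl.format(server=server, m=mission) for tpl in extra_paths]
    # dict.fromkeys collapses duplicates while preserving first-seen order.
    return list(dict.fromkeys(ordered))


default_only = expand_candidates("https://naif.jpl.nasa.gov/pub/naif/", "JUNO")
with_registry = expand_candidates(
    "https://example.org/data",
    "MISSION",
    registry_candidates=[
        "https://example.org/data/MISSION/misc/mk/",
        "https://example.org/data/MISSION/kernels/mk/",
    ],
)
```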
Mission registry
from spice_kernel_db import registry
Reads mission_registry.toml shipped with the package. Used by mission add to find metakernel directories for missions whose layout deviates from the default.
MissionEntry
Dataclass: name, candidates: tuple[str, ...] (URL templates with {server}/{m} placeholders), planetarypy: bool.
load_registry
registry.load_registry() -> dict[str, MissionEntry]
Cached loader for the bundled registry. Clear the cache via registry.load_registry.cache_clear() after replacing the data file (only relevant in tests).
registry_candidates
registry.registry_candidates(mission: str, server_url: str) -> list[str]
Return registry-defined candidate metakernel URLs for mission with placeholders expanded. Empty list when the mission has no registry entry.
is_planetarypy_managed
registry.is_planetarypy_managed(mission: str) -> bool
True when the registry marks mission as managed by planetarypy.
planetarypy bridge
from spice_kernel_db import planetarypy_bridge
Optional integration hook for the planetarypy library. Activated via the [planetarypy] extra: pip install spice-kernel-db[planetarypy].
is_available
planetarypy_bridge.is_available() -> bool
True when planetarypy is importable in the current environment.
delegate_mission_add
planetarypy_bridge.delegate_mission_add(mission: str, server_url: str) -> dict | None
Stub delegation hook. Currently returns None; full integration is tracked in the project’s issue tracker.
Exceptions
from spice_kernel_db import ConcurrentModificationError, MetakernelUnreachableError
ConcurrentModificationError
Raised when KernelDB.get_metakernel() detects that another process modified the kernel records during its network I/O window (after the DB lock was released to allow parallel reads). Subclass of RuntimeError. Retry the operation after the other writer completes.
MetakernelUnreachableError
Raised by KernelDB.update_metakernel() when the remote metakernel returns HTTP 403/404/410 — typically because NAIF rotated the file into former_versions/. Subclass of LookupError. Attributes: url, status (int HTTP code), filename. Recovery: db.prune_metakernels(dry_run=False).