API Reference

Python API

KernelDB

from spice_kernel_db import KernelDB

The main class. Wraps a DuckDB database and provides all scanning, lookup, and rewriting operations.

Constructor

KernelDB(db_path: str | Path | None = None, read_only: bool = False)

Opens (or creates) the database at the given path. When db_path is None (the default), the path is loaded from ~/.config/spice-kernel-db/config.toml. The schema is initialized automatically (unless read_only=True).

Set read_only=True for commands that only query the database — this allows concurrent access while another process holds a write lock.


Mission management

add_mission

db.add_mission(
    name: str,
    server_url: str,
    mk_dir_url: str,
    dedup: bool = True,
) -> None

Register a mission. If a mission with the same name already exists, its entry is replaced.

list_missions

db.list_missions() -> list[dict]

List all configured missions. Returns dicts with keys: name, server_url, server_label, mk_dir_url, dedup, added_at.

remove_mission

db.remove_mission(name: str) -> bool

Remove a mission. Returns True if found and removed, False if not found.

get_mission

db.get_mission(name: str) -> dict | None

Look up a mission by name (case-insensitive). Returns dict with same keys as list_missions, or None.


Scanning

scan_directory

db.scan_directory(
    root: str | Path,
    mission: str | None = None,
    extensions: set[str] | None = None,
    verbose: bool = False,
    archive_dir: str | Path | None = None,
) -> tuple[int, set[str]]

Recursively scan a directory tree and register all kernel files. Returns (count, missions_found) — the number of files registered and a set of detected mission names.

If archive_dir is provided, files are moved into {archive_dir}/{mission}/{type}/{filename} and symlinks are left at the original locations.

If mission is not provided, it’s auto-detected from the path structure (looks for .../<MISSION>/kernels/...).

extensions defaults to all known SPICE kernel extensions (.tls, .tpc, .bsp, .bc, .tf, .ti, .tsc, .bds, .tm).

register_file

db.register_file(
    path: str | Path,
    mission: str | None = None,
    source_url: str | None = None,
    archive_dir: str | Path | None = None,
    expected_hash: str | None = None,
) -> str

Register a single kernel file. Returns its SHA-256 hash.

If archive_dir is provided, the file is moved into the archive and a symlink is left at the original location. If expected_hash is provided, the computed hash is verified before storing — raises ValueError on mismatch.

If the file’s content already exists in the database under a different filename, the new location is recorded pointing to the same hash. This is logged at the INFO level.
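The expected_hash check can be pictured with a minimal sketch. The helper names sha256_of_stream and checked_hash are hypothetical and are not part of spice_kernel_db; the sketch only shows one way to compute a SHA-256 digest in chunks and raise ValueError on a mismatch, as the description above states.

```python
import hashlib
import io
from typing import BinaryIO, Optional


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def checked_hash(stream: BinaryIO, expected_hash: Optional[str] = None) -> str:
    """Hypothetical helper: compute the digest and verify it if a hash is given."""
    actual = sha256_of_stream(stream)
    if expected_hash is not None and actual != expected_hash.lower():
        raise ValueError(f"hash mismatch: expected {expected_hash}, got {actual}")
    return actual


if __name__ == "__main__":
    # An in-memory stream stands in for an opened kernel file.
    print(checked_hash(io.BytesIO(b"example kernel bytes")))
```

Reading in chunks keeps memory use flat for large binary kernels such as .bsp files.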


Lookup

find_by_filename

db.find_by_filename(filename: str) -> list[dict]

Find all locations for a kernel by filename (basename). Checks both the canonical filename in the kernels table and the actual filename in locations.abs_path.

Returns a list of dicts with keys: sha256, abs_path, mission, kernel_type, size_bytes.

find_by_hash

db.find_by_hash(sha256: str) -> list[dict]

Find all locations for a kernel by its SHA-256 hash.

Returns a list of dicts with keys: abs_path, mission, filename.

resolve_kernel

db.resolve_kernel(
    filename: str,
    preferred_mission: str | None = None,
) -> tuple[str | None, list[str]]

Resolve a kernel filename to an absolute path on disk.

Returns (path, warnings) where path is the resolved absolute path (or None if not found) and warnings is a list of human-readable strings about any fallback decisions.

Resolution priority (see Mission-aware resolution):

  1. Exact filename match in preferred_mission → no warning
  2. Exact filename match in any mission → warning
  3. Path-suffix match (file on disk registered under different name) → warning
  4. Not found → (None, []) + hint to run scan
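The priority order above can be sketched over an in-memory list of records. This is an illustration only: the Record tuple, the resolve function, and the warning wording are assumptions made for the example, and the real method queries the database rather than a Python list.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for database rows: (registered filename, mission, abs_path).
Record = Tuple[str, str, str]


def resolve(
    records: List[Record],
    filename: str,
    preferred_mission: Optional[str] = None,
) -> Tuple[Optional[str], List[str]]:
    """Walk the four resolution steps in order and collect warnings."""
    warnings: List[str] = []
    exact = [r for r in records if r[0] == filename]

    # 1. Exact filename match in the preferred mission: no warning.
    for _, mission, path in exact:
        if preferred_mission is not None and mission == preferred_mission:
            return path, warnings

    # 2. Exact filename match in any mission: warn about the fallback.
    if exact:
        _, mission, path = exact[0]
        warnings.append(f"{filename}: using copy from mission '{mission}'")
        return path, warnings

    # 3. Path-suffix match: the file on disk carries this name, but it was
    #    registered under a different one.
    for name, _, path in records:
        if path.endswith("/" + filename):
            warnings.append(f"{filename}: matched by path, registered as '{name}'")
            return path, warnings

    # 4. Not found.
    return None, warnings
```

The sketch follows the list literally, so step 2 warns even when no preferred mission was given; whether the library does the same is not stated here.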

aliases

db.aliases(name_or_hash: str) -> dict | None

Follow the deduplication trail for a kernel. name_or_hash may be a full or partial (≥6-char) SHA-256, the canonical filename, or any alias filename the content has been registered under.

Returns None if nothing matches, otherwise a dict:

{
    "sha256": str,
    "canonical": str,          # first-seen filename (kernels.filename)
    "kernel_type": str,
    "size_bytes": int,
    "superseded": bool,
    "aliases": list[str],      # every filename this content is known under
    "locations": list[dict],   # {"abs_path", "mission", "source_url"}
}

Aliases are derived from locations, not stored separately. The read-only view kernel_aliases (sha256, filename) exposes every distinct on-disk basename each content hash appears under, so a deduplicated kernel (identical bytes under several names) can be followed with no extra bookkeeping. The trade-off is that a name is forgotten once its file is pruned.

Two companion methods report counts. alias_counts(shas: list[str]) -> dict[str, int] maps each hash to its number of known names. alias_counts_by_name(filenames: list[str]) -> dict[str, int] maps each filename to how many other names its content is known under (used to render the (+N aliases) annotation in kernel listings).
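As a rough illustration of deriving aliases from locations, the sketch below groups distinct basenames by content hash. The functions derive_aliases and count_names are hypothetical and operate on plain (sha256, abs_path) tuples; in the library this is done by a database view, not Python code.

```python
import posixpath
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def derive_aliases(locations: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each content hash to the sorted distinct basenames it appears under."""
    names = defaultdict(set)
    for sha256, abs_path in locations:
        names[sha256].add(posixpath.basename(abs_path))
    return {sha: sorted(found) for sha, found in names.items()}


def count_names(locations: Iterable[Tuple[str, str]], shas: List[str]) -> Dict[str, int]:
    """Number of known names per requested hash (0 when the hash is unknown)."""
    aliases = derive_aliases(locations)
    return {sha: len(aliases.get(sha, [])) for sha in shas}
```

Because the names come from the location rows themselves, removing a row removes the alias, which is the trade-off described above.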


Metakernel operations

check_metakernel

db.check_metakernel(
    mk_path: str | Path,
    mission: str | None = None,
    verbose: bool = False,
) -> dict

Check which kernels from a metakernel are available locally.

Returns a dict with keys:

  • found: list of (raw_entry, local_path) tuples
  • missing: list of raw_entry strings
  • warnings: list of warning strings

rewrite_metakernel

db.rewrite_metakernel(
    mk_path: str | Path,
    output: str | Path,
    *,
    mission: str | None = None,
    link_root: str | Path | None = None,
) -> tuple[Path, list[str]]

Rewrite a metakernel for local use with minimal edits.

Creates a symlink tree at link_root (default: kernels/ next to the output file) that mirrors the original directory structure. Only PATH_VALUES in the output .tm file is changed — KERNELS_TO_LOAD, PATH_SYMBOLS, and all comments are preserved.

Returns (output_path, warnings).

See Minimal metakernel edits for the design rationale.
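To make the "only PATH_VALUES changes" idea concrete, here is a minimal sketch that swaps the body of the PATH_VALUES assignment and leaves the rest of the text untouched. The function replace_path_values and its regular expression are assumptions for illustration, not the library's implementation. The sketch rewrites only the first PATH_VALUES assignment it finds and does not handle paths that contain a closing parenthesis.

```python
import re
from typing import List

# Matches "PATH_VALUES = ( ... )" and captures the prefix, body, and closing paren.
_PATH_VALUES = re.compile(r"(PATH_VALUES\s*=\s*\()(.*?)(\))", re.DOTALL)


def replace_path_values(mk_text: str, new_values: List[str]) -> str:
    """Return mk_text with the PATH_VALUES body replaced by new_values."""
    body = " " + " ".join(f"'{value}'" for value in new_values) + " "
    # A function replacement avoids backslash interpretation in the new paths.
    return _PATH_VALUES.sub(lambda m: m.group(1) + body + m.group(3), mk_text, count=1)


if __name__ == "__main__":
    text = (
        "KPL/MK\n\\begindata\n"
        "  PATH_VALUES  = ( '..' )\n"
        "  PATH_SYMBOLS = ( 'KERNELS' )\n"
        "  KERNELS_TO_LOAD = ( '$KERNELS/lsk/naif0012.tls' )\n"
        "\\begintext\n"
    )
    print(replace_path_values(text, ["/data/juice/kernels"]))
```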

get_metakernel

db.get_metakernel(
    url: str,
    download_dir: str | Path | None = None,
    mission: str | None = None,
    yes: bool = False,
    force: bool = False,
) -> dict

Fetch a remote metakernel, display a status table, and download missing kernels.

  1. Fetches and parses the remote .tm file
  2. Resolves kernel entries to full URLs (using PATH_VALUES relative to the metakernel URL)
  3. Checks each kernel against the local database
  4. Queries remote file sizes via parallel HTTP HEAD requests
  5. Displays a summary table with kernel names, sizes, and status (in db / missing)
  6. Prompts for confirmation (unless yes=True)
  7. Downloads missing kernels, preserving subdirectory structure, and registers them
  8. Creates symlinks for kernels already in the database so the metakernel works immediately

download_dir defaults to the configured kernel_dir from ~/.config/spice-kernel-db/config.toml. Downloaded files are placed under <download_dir>/<mission>/<relpath>.

Returns a dict with keys: found, missing, downloaded, warnings.

update_metakernel

db.update_metakernel(
    mk_path_or_name: str | Path,
    mission: str | None = None,
    download_dir: str | Path | None = None,
    yes: bool = False,
    force: bool = False,
) -> dict

Re-fetch a metakernel from its source URL and download new or missing kernels. Looks up the source URL from the metakernel_registry, falling back to the mission’s mk_dir_url + filename if no source URL is stored.

Raises LookupError if the metakernel is not found in the registry or was added via scan without a source URL. Raises MetakernelUnreachableError (a LookupError subclass) when the remote returns 403/404/410 — typically because NAIF rotated the file into former_versions/; use prune_metakernels() to clean up the stale registry row.

After downloading, automatically rescans the kernel directories referenced by the metakernel so newly downloaded files are immediately indexed.

Returns the same dict as get_metakernel.

verify_metakernel

db.verify_metakernel(
    mk_path: str | Path,
    *,
    deep: bool = False,
) -> dict

Cross-check a metakernel against the database. Each entry in KERNELS_TO_LOAD is expanded via PATH_VALUES/PATH_SYMBOLS into an absolute path and validated against the kernels row whose filename matches (case-insensitive).

Checks: traversal escape from PATH_VALUES root, dangling symlinks, missing files, size mismatch, SHA-256 mismatch (only in deep mode), AMBIGUOUS resolution (multiple non-superseded kernels rows share the filename — see superseded_by semantics), unregistered files, and non-absolute PATH_VALUES.

Returns a dict with keys entries, ok, fail, fatal, mk_path, deep. Each entry is a dict with raw, resolved, status, detail. fatal is True if any entry is in a P0 state (e.g. TRAVERSAL, HASH_MISMATCH, AMBIGUOUS).

This is the recommended cross-check after every get/update/rewrite. Run with deep=True periodically to catch content corruption that preserves size.
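One of the checks listed above, traversal escape from the PATH_VALUES root, can be sketched lexically. The function escapes_root is hypothetical, assumes POSIX-style paths, and does not follow symlinks, so it illustrates the idea rather than the library's actual check.

```python
import posixpath


def escapes_root(root: str, relative_entry: str) -> bool:
    """True when root/relative_entry normalizes to a path outside root."""
    root_norm = posixpath.normpath(root)
    target = posixpath.normpath(posixpath.join(root_norm, relative_entry))
    # commonpath compares whole path components, so "/a/kernels_evil"
    # is not mistaken for a child of "/a/kernels".
    return posixpath.commonpath([root_norm, target]) != root_norm


if __name__ == "__main__":
    print(escapes_root("/data/juice/kernels", "lsk/naif0012.tls"))
    print(escapes_root("/data/juice/kernels", "../../etc/passwd"))
```

Comparing path components rather than string prefixes is the design point here; a plain startswith test would accept sibling directories that merely share a prefix.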

prune_metakernels

db.prune_metakernels(
    dry_run: bool = True,
    delete_files: bool = False,
    timeout: float = 10.0,
) -> list[dict]

Find rows in metakernel_registry whose remote URL returns a permanent HTTP error (403/404/410) and optionally remove them. Sends a HEAD request to each URL with the given timeout.

The probe URL is derived per row: explicit source_url if set, otherwise mission.mk_dir_url + filename (same fallback as update_metakernel). Rows with neither — typically locally-created metakernels with no upstream — are noted in the output and skipped (use mk --remove <name> to drop them manually).

Transient failures (timeouts, DNS errors, 5xx) are deliberately never classified as dead — leaving a registry row stale is always safer than deleting on a network blip.
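The permanent-versus-transient distinction can be expressed as a small predicate. The names PERMANENT_STATUS and is_dead are hypothetical; None is used here as an assumed stand-in for a probe that produced no HTTP status at all (timeout or DNS error).

```python
from typing import Optional

# Status codes the text above treats as permanent failures.
PERMANENT_STATUS = frozenset({403, 404, 410})


def is_dead(status_code: Optional[int]) -> bool:
    """True only for permanent HTTP errors; timeouts (None) and 5xx are not dead."""
    return status_code in PERMANENT_STATUS
```

Defaulting every unknown outcome to "not dead" matches the stated policy that a stale row is safer than a wrong deletion.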

With dry_run=False, the matching rows are removed from metakernel_registry and metakernel_entries. With delete_files=True, the on-disk .tm file is also unlinked; symlink trees under the mission’s download directory are shared and never auto-cleaned.

Returns a list of dicts, one per dead metakernel: {'mk_path', 'filename', 'mission', 'source_url', 'status_code'}.

index_metakernel

db.index_metakernel(mk_path: str | Path)

Parse a metakernel and store its entries in the metakernel_entries table for future reference.

list_metakernels

db.list_metakernels(mission: str | None = None) -> list[dict]

List all tracked metakernels, optionally filtered by mission. Each entry includes the number of kernel entries.

Alias rows (symlink metakernels created by get) inherit their target’s entry count and content fingerprint, so the listing reads consistently — both rows show the same kernel count and one of them is annotated ↳ identical to <target>.

Prints a summary table and returns a list of dicts with keys: filename, mission, source_url, acquired_at, mk_path, n_kernels, plus identical_to (set when another row has the same content fingerprint).

info_metakernel

db.info_metakernel(name: str) -> dict | None

Show detailed info about a tracked metakernel, looked up by filename. Displays per-kernel status (in db / missing) with kernel type and size.

Returns a dict with keys: filename, mission, source_url, acquired_at, mk_path, kernels, n_kernels, n_in_db, n_missing. Returns None if not found.

browse_remote_metakernels

db.browse_remote_metakernels(
    mk_dir_url: str,
    mission: str | None = None,
    show_versioned: bool = False,
    sort_by: str = "name",
    filter: str | None = None,
) -> list[dict]

Scan a remote NAIF mk/ directory and show available metakernels. Groups entries by base name (stripping version tags like _v461_20251127_001), counts versioned snapshots, and checks which base metakernels have been locally acquired.

sort_by controls row ordering: "name" (default) sorts alphabetically; "date" sorts by latest remote modification date ascending, so the most recently updated metakernels appear at the bottom.

filter is an optional case-insensitive substring applied to entry filenames before grouping. Useful for narrowing large listings such as the ~1,400-entry JUICE former_versions/ archive.

Prints a summary table and returns a list of dicts with keys: base_name, n_versions, latest_date, is_local, filenames.


Deduplication

report_duplicates

db.report_duplicates(min_copies: int = 2) -> list[dict]

Find and report kernels that exist in multiple locations.

Returns a list of dicts with keys: sha256, filename, size_bytes, count, paths, missions, wasted_bytes.

deduplicate_plan

db.deduplicate_plan() -> list[dict]

Generate a deduplication plan without executing it.

For each set of identical files, selects a canonical copy (preferring generic mission) and lists the rest as removable. Respects per-mission dedup settings from the missions table.
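A minimal sketch of the canonical-copy selection follows. The function plan_dedup and the input shape are assumptions for illustration. It assumes the generic mission is literally named "generic" and leaves out the per-mission dedup settings.

```python
from collections import defaultdict
from typing import Dict, List


def plan_dedup(locations: List[Dict[str, str]], preferred_mission: str = "generic") -> List[dict]:
    """Group locations by hash; keep one copy per group, list the rest as removable."""
    groups = defaultdict(list)
    for loc in locations:
        groups[loc["sha256"]].append(loc)

    plan = []
    for sha256, copies in groups.items():
        if len(copies) < 2:
            continue  # nothing to deduplicate
        # sorted() is stable: copies in the preferred mission come first,
        # everything else keeps its input order.
        ordered = sorted(copies, key=lambda c: c["mission"] != preferred_mission)
        plan.append({
            "sha256": sha256,
            "keep": ordered[0]["abs_path"],
            "remove": [c["abs_path"] for c in ordered[1:]],
        })
    return plan
```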

Returns a list of dicts with keys: filename, size_bytes, keep, remove.

Statistics

stats

db.stats() -> dict

Print and return summary statistics.

Returns a dict with keys: n_kernels, n_locations, total_bytes, n_duplicates, missions.


parse_metakernel

from spice_kernel_db import parse_metakernel
parse_metakernel(path: str | Path) -> ParsedMetakernel

Parse a SPICE metakernel (.tm) file from a local path without needing a database.

parse_metakernel_text

from spice_kernel_db import parse_metakernel_text
parse_metakernel_text(text: str, source: str) -> ParsedMetakernel

Parse a SPICE metakernel from text content (e.g. fetched from a remote URL). The source string is stored as the source_path attribute for reference.

ParsedMetakernel

Dataclass with attributes:

Attribute     Type       Description
source_path   Path       Absolute path to the source file
header        str        Everything before the first \begindata
path_values   list[str]  Contents of PATH_VALUES
path_symbols  list[str]  Contents of PATH_SYMBOLS
kernels       list[str]  Contents of KERNELS_TO_LOAD

Properties and methods:

Name                Returns         Description
symbol_map          dict[str, str]  Mapping from symbol name to path value
resolve(entry)      str             Replace $SYMBOLs with their values
kernel_filenames()  list[str]       Basenames of all kernels
kernel_relpaths()   list[str]       Relative paths with $SYMBOL stripped
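The relationship between symbol_map and resolve(entry) can be illustrated with a short standalone sketch. The functions build_symbol_map and resolve_entry are hypothetical and assume that PATH_SYMBOLS and PATH_VALUES pair up by position.

```python
from typing import Dict, List


def build_symbol_map(path_symbols: List[str], path_values: List[str]) -> Dict[str, str]:
    """Pair each symbol with the value at the same position."""
    return dict(zip(path_symbols, path_values))


def resolve_entry(entry: str, symbol_map: Dict[str, str]) -> str:
    """Replace each $SYMBOL in entry with its mapped value."""
    # Longer symbols first, so $KERNELS_EXTRA is not clobbered by $KERNELS.
    for symbol in sorted(symbol_map, key=len, reverse=True):
        entry = entry.replace("$" + symbol, symbol_map[symbol])
    return entry


if __name__ == "__main__":
    symbols = build_symbol_map(["KERNELS"], [".."])
    print(resolve_entry("$KERNELS/lsk/naif0012.tls", symbols))
```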

Configuration

from spice_kernel_db import Config, ensure_config

ensure_config

ensure_config() -> Config

Load configuration from ~/.config/spice-kernel-db/config.toml. If no config file exists, runs an interactive first-time setup prompting for the database path and kernel storage directory.

Config

Dataclass with attributes:

Attribute   Type  Default                                        Description
db_path     str   ~/.local/share/spice-kernel-db/kernels.duckdb  Path to the DuckDB database
kernel_dir  str   ~/.local/share/spice-kernel-db/kernels         Default directory for downloaded kernels

Remote utilities

from spice_kernel_db import fetch_metakernel, resolve_kernel_urls

fetch_metakernel

fetch_metakernel(url: str) -> str

Download and return the text content of a remote metakernel file.

resolve_kernel_urls

resolve_kernel_urls(mk_url: str, parsed: ParsedMetakernel) -> list[str]

Resolve each KERNELS_TO_LOAD entry in a parsed metakernel to a full URL, using PATH_VALUES relative to the metakernel’s own URL.
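A sketch of resolving entries relative to the metakernel's own URL, using the standard library's urljoin. The function kernel_urls, its arguments, and the example.org URL are assumptions for illustration; the real function takes a ParsedMetakernel.

```python
from typing import Dict, List
from urllib.parse import urljoin


def kernel_urls(mk_url: str, symbol_map: Dict[str, str], kernels: List[str]) -> List[str]:
    """Expand $SYMBOLs, then join each relative path against the metakernel URL."""
    urls = []
    for entry in kernels:
        for symbol, value in symbol_map.items():
            entry = entry.replace("$" + symbol, value)
        # urljoin resolves ".." segments against the directory of mk_url.
        urls.append(urljoin(mk_url, entry))
    return urls


if __name__ == "__main__":
    print(kernel_urls(
        "https://example.org/MISSION/kernels/mk/example.tm",
        {"KERNELS": ".."},
        ["$KERNELS/lsk/naif0012.tls"],
    ))
```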

list_remote_missions

from spice_kernel_db.remote import list_remote_missions
list_remote_missions(server_url: str) -> list[str]

List available mission directories from a SPICE archive server (NASA NAIF or ESA SPICE). Parses the Apache directory listing.

SPICE_SERVERS

from spice_kernel_db.remote import SPICE_SERVERS

Dict mapping server labels to URLs: {"NASA": "https://naif.jpl.nasa.gov/pub/naif/", "ESA": "https://spiftp.esac.esa.int/data/SPICE/"}.

list_remote_metakernels

from spice_kernel_db import list_remote_metakernels
list_remote_metakernels(mk_dir_url: str) -> list[RemoteMetakernel]

Parse an Apache directory listing at mk_dir_url and extract .tm metakernel entries. For each file, computes a base_name by stripping NAIF version tags (e.g. _v461_20251127_001). Returns a list sorted by (base_name, filename).
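The version-tag stripping can be sketched with a regular expression. The pattern below is an assumption inferred from the single example tag _v461_20251127_001, and the sketch assumes base_name keeps the .tm extension. The filename juice_ops.tm is made up for the example.

```python
import re
from typing import Optional, Tuple

# Assumed tag shape: _v<digits>_<8-digit date>_<digits>, directly before ".tm".
_VERSION_TAG = re.compile(r"_(v\d+_\d{8}_\d+)(?=\.tm$)", re.IGNORECASE)


def split_version_tag(filename: str) -> Tuple[str, Optional[str]]:
    """Return (base_name, version_tag); version_tag is None when no tag is present."""
    match = _VERSION_TAG.search(filename)
    if match is None:
        return filename, None
    base_name = filename[: match.start()] + filename[match.end():]
    return base_name, match.group(1)
```

Sorting the resulting records by (base_name, filename) then groups every snapshot of a metakernel next to its unversioned name.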

RemoteMetakernel

Dataclass with attributes:

Attribute    Type        Description
filename     str         Original filename from the directory listing
url          str         Full URL to the file
date         str         Last modified date from the listing
size         str         Size string from the listing (e.g. "12K")
base_name    str         Filename with version tag stripped
version_tag  str | None  Version tag (e.g. "v461_20251127_001") or None

probe_mk_candidates

from spice_kernel_db.remote import probe_mk_candidates
probe_mk_candidates(urls: list[str], *, max_workers: int = 8, timeout: float = 5.0) -> list[str]

Send parallel HEAD requests to candidate metakernel-directory URLs and return those that respond successfully, preserving the input order (which encodes priority). Duplicates are collapsed.
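The order-preserving, duplicate-collapsing behaviour can be sketched without any network access by injecting the probe as a callable. The function probe_in_order and its is_alive parameter are hypothetical; the real function issues HEAD requests itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def probe_in_order(
    urls: List[str],
    is_alive: Callable[[str], bool],
    max_workers: int = 8,
) -> List[str]:
    """Probe URLs in parallel and return the live ones in their input order."""
    # dict.fromkeys collapses duplicates while keeping first-occurrence order.
    unique = list(dict.fromkeys(urls))
    if not unique:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs,
        # regardless of which probe finishes first.
        results = list(pool.map(is_alive, unique))
    return [url for url, alive in zip(unique, results) if alive]
```

Passing the probe in as a function also makes the ordering logic testable with a fake, which is how the sketch can be exercised offline.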

discover_mk_url

from spice_kernel_db.remote import discover_mk_url, DEFAULT_ALT_MK_PATHS
discover_mk_url(
    server_url: str,
    mission: str,
    *,
    registry_candidates: list[str] | None = None,
    extra_paths: tuple[str, ...] = DEFAULT_ALT_MK_PATHS,
    timeout: float = 5.0,
) -> list[str]

Discover live metakernel directory URLs for mission. Tries registry_candidates (highest priority) first, then expands the templates in extra_paths (placeholders {server} and {m}). Returns hits in priority order.
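Building the candidate list before probing might look like the sketch below. The function candidate_urls is hypothetical, the mission name EXAMPLE is made up, and collapsing duplicates at this stage is a choice of the sketch rather than documented behaviour.

```python
from typing import List, Optional, Sequence


def candidate_urls(
    server_url: str,
    mission: str,
    registry_candidates: Optional[List[str]] = None,
    extra_paths: Sequence[str] = ("{server}{m}/kernels/mk/",),
) -> List[str]:
    """Registry candidates first, then expanded templates, duplicates removed."""
    expanded = [template.format(server=server_url, m=mission) for template in extra_paths]
    ordered = list(registry_candidates or []) + expanded
    # Keep the first occurrence of each URL so priority order is preserved.
    return list(dict.fromkeys(ordered))


if __name__ == "__main__":
    print(candidate_urls("https://naif.jpl.nasa.gov/pub/naif/", "EXAMPLE"))
```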

DEFAULT_ALT_MK_PATHS contains only {server}{m}/kernels/mk/. An empirical survey of NAIF found no missions using the alternative templates that naming conventions might suggest, so alternates should be added via the curated registry or --mk-dir-url rather than inferred.

Mission registry

from spice_kernel_db import registry

Reads mission_registry.toml shipped with the package. Used by mission add to find metakernel directories for missions whose layout deviates from the default.

MissionEntry

Dataclass: name, candidates: tuple[str, ...] (URL templates with {server}/{m} placeholders), planetarypy: bool.

load_registry

registry.load_registry() -> dict[str, MissionEntry]

Cached loader for the bundled registry. Clear the cache via registry.load_registry.cache_clear() after replacing the data file (only relevant in tests).

registry_candidates

registry.registry_candidates(mission: str, server_url: str) -> list[str]

Return registry-defined candidate metakernel URLs for mission with placeholders expanded. Empty list when the mission has no registry entry.

is_planetarypy_managed

registry.is_planetarypy_managed(mission: str) -> bool

True when the registry marks mission as managed by planetarypy.

planetarypy bridge

from spice_kernel_db import planetarypy_bridge

Optional integration hook for the planetarypy library. Activated via the [planetarypy] extra: pip install spice-kernel-db[planetarypy].

is_available

planetarypy_bridge.is_available() -> bool

True when planetarypy is importable in the current environment.

delegate_mission_add

planetarypy_bridge.delegate_mission_add(mission: str, server_url: str) -> dict | None

Stub delegation hook. Currently returns None; full integration is tracked in the project’s issue tracker.

Exceptions

from spice_kernel_db import ConcurrentModificationError, MetakernelUnreachableError

ConcurrentModificationError

Raised when KernelDB.get_metakernel() detects that another process modified the kernel records during its network I/O window (after the DB lock was released to allow parallel reads). Subclass of RuntimeError. Retry the operation after the other writer completes.

MetakernelUnreachableError

Raised by KernelDB.update_metakernel() when the remote metakernel returns HTTP 403/404/410 — typically because NAIF rotated the file into former_versions/. Subclass of LookupError. Attributes: url, status (int HTTP code), filename. Recovery: db.prune_metakernels(dry_run=False).