pds.utils

General utilities for working with PDS data.

This module provides common, general-purpose utility functions for the PDS subpackage.

Functions

Name Description
complete_pid Return registered product IDs matching a prefix, for tab completion.
get_example_pid Return an example product ID from a registered PDS index.
get_index_names Return a sorted list of all index names for a given mission and instrument (from static and dynamic configs).
get_instrument_names Return a sorted list of all instruments for a given mission (from static and dynamic configs).
get_meta Return the metadata row for a product ID from a registered PDS index.
get_mission_names Return a sorted list of all available missions (from static and dynamic configs).
print_available_indexes List available index keys from static config plus dynamic handlers.
read_index_slice Read a column-projected, predicate-pushed-down slice of a PDS index.
rebuild_pid_cache Rebuild the sorted bare-PID cache for an index.
reorder_meta_row Reorder a meta row’s fields to [*_ID/FILE_NAME] + [*ANGLE*] + rest.

complete_pid

pds.utils.complete_pid(incomplete, index_key, *, max_results=50)

Return registered product IDs matching a prefix, for tab completion.

Backed by a sorted text cache built on first use from the index’s configured product-id column (PRODUCT_ID for most indexes, FILE_NAME for cassini.uvis, etc.). PDS path/extension and flight-software version suffixes are normalized away so the user types the bare PID they expect (EUV1999_007_17_05 not /COUVIS_0001/.../EUV1999_007_17_05.LBL; 1_N1454725799 not 1_N1454725799.122).

Parameters

Name Type Description Default
incomplete str Prefix to match against (case-insensitive). required
index_key str Dotted index key, e.g. "cassini.uvis.index". required
max_results int Cap on matches (default 50, plenty for shell completion). 50

Returns

Name Type Description
list of str Sorted, uppercased matches. Empty list if anything goes wrong (the shell completion path must never raise).
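
The prefix lookup described above can be sketched with a binary search over an already-sorted list. This is an illustration only, not the library's implementation: `complete_prefix` and the sample cache contents (apart from EUV1999_007_17_05) are hypothetical, and the real function reads its cache from disk.

```python
from bisect import bisect_left


def complete_prefix(sorted_pids, incomplete, max_results=50):
    """Return up to max_results entries of sorted_pids that start with the prefix.

    Hypothetical stand-in for the cache lookup; sorted_pids must be sorted
    and uppercased already.
    """
    try:
        prefix = incomplete.upper()  # case-insensitive match against an uppercase cache
        i = bisect_left(sorted_pids, prefix)  # first entry >= prefix
        matches = []
        while (
            i < len(sorted_pids)
            and len(matches) < max_results
            and sorted_pids[i].startswith(prefix)
        ):
            matches.append(sorted_pids[i])
            i += 1
        return matches
    except Exception:
        # Mirror the documented contract: the completion path never raises.
        return []


# Made-up cache contents for demonstration.
cache = sorted(["EUV1999_007_17_05", "EUV1999_007_18_10", "FUV1999_007_17_05"])
print(complete_prefix(cache, "euv1999"))
```

Because the cache is sorted, all matches for a prefix sit next to each other, so the scan stops at the first non-matching entry or once `max_results` is reached.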

get_example_pid

pds.utils.get_example_pid(instr_key)

Return an example product ID from a registered PDS index.

Looks up an arbitrary product identifier from the cumulative index associated with instr_key — useful for CLI examples, completion seeds, smoke tests, and notebook demos.

Parameters

Name Type Description Default
instr_key str Dotted index key in the form <mission>.<instrument>.<index>, e.g. "mro.ctx.edr" or "cassini.iss.index". Must be one of the keys registered in ~/.planetarypy_index_urls.toml or in the dynamic handler registry. required

Returns

Name Type Description
str A product identifier from the index, stripped of PDS whitespace padding.

Raises

Name Type Description
ValueError If instr_key is not a registered index, or if the index contains no recognizable product-id column.

get_index_names

pds.utils.get_index_names(mission_instrument)

Return a sorted list of all index names for a given mission and instrument (from static and dynamic configs).

get_instrument_names

pds.utils.get_instrument_names(mission)

Return a sorted list of all instruments for a given mission (from static and dynamic configs).

get_meta

pds.utils.get_meta(instr_key, product_id, long=False)

Return the metadata row for a product ID from a registered PDS index.

Generalizes lookup across all indexes by trying the catalog-registered product-id column first, then common fallbacks. Matching is tolerant of case differences and PDS path/extension/version-suffix decoration (same normalization as _bare_pid).

Indexes registered in planetarypy.pds.meta_display get instrument-specific shaping (e.g. HiRISE EDR collapses one observation’s 28 channel rows into a short per-color summary).

Parameters

Name Type Description Default
instr_key str Dotted index key, e.g. "mro.ctx.edr" or "cassini.iss.index". required
product_id str Product identifier to look up. May be bare (e.g. "P02_001916_2221_XI_42N027W") or include a PDS path/extension. required
long bool Per-instrument long-form toggle (currently used by HiRISE: with an obsid input, picks the RED3_1 channel and returns the full row). Generic indexes ignore this flag. False

Returns

Name Type Description
pandas.Series The matched row, indexed by column name and pre-ordered for display. String values are stripped of PDS whitespace padding.

Raises

Name Type Description
ValueError If instr_key is not registered, or no row matches product_id.
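
As a rough sketch of the lookup order described above (registered product-id column first, then fallbacks), the following uses plain dictionaries in place of a pandas index. `find_row`, the column order, and the sample rows are assumptions made for illustration; path, extension, and version-suffix handling is left out to keep the example short.

```python
def find_row(rows, product_id, id_columns=("PRODUCT_ID", "FILE_NAME")):
    """Return the first row whose id column matches product_id, ignoring case and padding.

    Hypothetical helper: id_columns lists the preferred column first,
    followed by fallbacks.
    """
    wanted = product_id.strip().upper()
    for column in id_columns:
        for row in rows:
            value = row.get(column)
            if isinstance(value, str) and value.strip().upper() == wanted:
                # Strip PDS whitespace padding from every string field.
                return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
    raise ValueError(f"no row matches {product_id!r}")


# Made-up rows; the second column name is invented for the example.
rows = [
    {"FILE_NAME": "EUV1999_007_17_05   ", "EXPOSURE": 4.0},
    {"FILE_NAME": "EUV1999_007_18_10   ", "EXPOSURE": 8.0},
]
print(find_row(rows, "euv1999_007_17_05"))
```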

get_mission_names

pds.utils.get_mission_names()

Return a sorted list of all available missions (from static and dynamic configs).

print_available_indexes

pds.utils.print_available_indexes(
    filter_mission=None,
    filter_instrument=None,
    *,
    keys_only=False,
)

List available index keys from static config plus dynamic handlers.

Combines all dotted index keys found in the remote static configuration with the dynamic indexes registered in DYNAMIC_URL_HANDLERS.

When keys_only is False (default), prints a tree of missions → instruments → indexes, optionally filtered by mission/instrument. When keys_only is True, returns a sorted list of dotted index keys instead of printing.

Args:
    filter_mission: If provided, only include this mission.
    filter_instrument: If provided, only include this instrument (requires filter_mission).
    keys_only: When True, return a list of keys instead of printing a tree.

Returns:
    list[str] when keys_only is True.
    None when printing a tree (keys_only is False).

Examples:
    >>> from planetarypy.pds.utils import print_available_indexes
    >>> print_available_indexes(keys_only=True)  # returns ["cassini.iss.index", …]
    >>> print_available_indexes('mro')  # prints tree for mro only
    >>> print_available_indexes('mro', 'ctx')  # prints tree for mro.ctx only
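
To make the mission → instrument → index grouping concrete, here is a minimal sketch that builds such a tree from dotted keys. `build_tree` is a hypothetical helper, not part of the package, and it takes the keys as an argument rather than reading any configuration.

```python
from collections import defaultdict


def build_tree(keys, filter_mission=None, filter_instrument=None):
    """Group dotted '<mission>.<instrument>.<index>' keys into nested dicts."""
    tree = defaultdict(lambda: defaultdict(list))
    for key in sorted(keys):
        mission, instrument, index = key.split(".", 2)
        if filter_mission and mission != filter_mission:
            continue
        if filter_instrument and instrument != filter_instrument:
            continue
        tree[mission][instrument].append(index)
    # Convert to plain dicts so the result compares and prints cleanly.
    return {m: dict(instruments) for m, instruments in tree.items()}


keys = ["mro.ctx.edr", "cassini.iss.index", "mro.hirise.edr", "cassini.uvis.index"]
tree = build_tree(keys)

for mission, instruments in tree.items():
    print(mission)
    for instrument, indexes in instruments.items():
        print(f"  {instrument}")
        for index in indexes:
            print(f"    {index}")
```

The same split also yields the flat name lists: `sorted(tree)` gives the missions, and `sorted(tree["mro"])` the instruments for one mission.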

read_index_slice

pds.utils.read_index_slice(index_key, filters=None, columns=None)

Read a column-projected, predicate-pushed-down slice of a PDS index.

Use this in place of get_index whenever you only need a few columns or a few rows. Parquet is columnar at the storage level, so column projection avoids reading the bytes of unused columns; predicate pushdown filters at the row-group level via min/max statistics, skipping irrelevant chunks entirely. On the HiRISE EDR index (2.6 M rows) a per-obsid lookup goes from ~3.3 s to ~0.03 s.

Parameters

Name Type Description Default
index_key str Dotted index key (e.g. "mro.hirise.edr"). required
filters list PyArrow predicate-pushdown filters in the form [(column, op, value)] or DNF-style nested lists. op is one of "=", "!=", "<", "<=", ">", ">=", "in", "not in". Values are compared byte for byte, so pass them in the canonical (typically uppercase, unpadded) form stored in the Parquet file. None
columns list of str Subset of columns to read. None (default) loads all columns. None

Returns

Name Type Description
pandas.DataFrame The filtered, projected slice. Empty if no rows match.
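
The two optimizations can be illustrated with a toy model in pure Python. This is not PyArrow and not the library's code: the row groups, statistics, column names, and observation IDs below are invented, and only a flat (AND-ed) filter list is handled, not nested DNF lists.

```python
import operator

OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "in": lambda a, b: a in b,
    "not in": lambda a, b: a not in b,
}


def make_group(rows, stat_columns=("OBSERVATION_ID",)):
    """Bundle rows with per-column (min, max) statistics, like a Parquet row group."""
    stats = {c: (min(r[c] for r in rows), max(r[c] for r in rows)) for c in stat_columns}
    return {"rows": rows, "stats": stats}


def group_may_match(stats, filters):
    """Rule out a group when an '=' value falls outside the group's min/max range."""
    for column, op, value in filters:
        if op == "=" and column in stats:
            low, high = stats[column]
            if value < low or value > high:
                return False
    return True


def read_slice(row_groups, filters=None, columns=None):
    """Return (matching rows, number of row groups actually scanned)."""
    filters = filters or []
    out, scanned = [], 0
    for group in row_groups:
        if not group_may_match(group["stats"], filters):
            continue  # predicate pushdown: skip the whole group
        scanned += 1
        for row in group["rows"]:
            if all(OPS[op](row[col], value) for col, op, value in filters):
                # Column projection: keep only the requested columns.
                out.append({c: row[c] for c in columns} if columns else dict(row))
    return out, scanned


groups = [
    make_group([{"OBSERVATION_ID": "PSP_001000_1000", "CCD_NAME": "RED3"},
                {"OBSERVATION_ID": "PSP_001500_1500", "CCD_NAME": "RED4"}]),
    make_group([{"OBSERVATION_ID": "PSP_002000_2000", "CCD_NAME": "RED3"},
                {"OBSERVATION_ID": "PSP_002500_2500", "CCD_NAME": "RED3"}]),
]
result, scanned = read_slice(
    groups,
    filters=[("OBSERVATION_ID", "=", "PSP_002000_2000")],
    columns=["CCD_NAME"],
)
print(result, scanned)
```

In this model the first group is never scanned because its maximum sorts below the requested value, which is the effect the min/max statistics have on a real row group.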

rebuild_pid_cache

pds.utils.rebuild_pid_cache(index_key)

Rebuild the sorted bare-PID cache for an index.

Reads only the configured completion column via column projection, normalizes each value through _normalize_pid (PDS path/extension, version suffix, and per-index prefix), uppercases for prefix matching, deduplicates, and writes a sorted text file.
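
The pipeline above (normalize, uppercase, deduplicate, sort, write) might look roughly like the following. `normalize_pid` and `write_pid_cache` are hypothetical stand-ins: the real normalizer is private to the package, and this sketch assumes that dropping the directory part and the last dotted suffix is enough; per-index prefixes are not modelled.

```python
import posixpath
import tempfile
from pathlib import Path


def normalize_pid(raw):
    """Drop padding, the directory part, and the last dotted suffix (assumed rule)."""
    name = posixpath.basename(raw.strip())
    return name.rsplit(".", 1)[0]


def write_pid_cache(values, cache_path):
    """Write a sorted, deduplicated, uppercased PID list, one per line."""
    pids = sorted({normalize_pid(v).upper() for v in values if v.strip()})
    Path(cache_path).write_text("\n".join(pids) + "\n")
    return pids


# Raw column values in the decorated forms mentioned for complete_pid.
raw = [
    "/COUVIS_0001/.../EUV1999_007_17_05.LBL",
    "1_N1454725799.122  ",
    "1_n1454725799.122",
]

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "pids.txt"
    pids = write_pid_cache(raw, cache_file)
    cache_text = cache_file.read_text()

print(pids)
```

Storing the list sorted is what allows a completion lookup to use a binary search instead of scanning every entry.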

reorder_meta_row

pds.utils.reorder_meta_row(row)

Reorder a meta row’s fields to [*_ID/FILE_NAME] + [*ANGLE*] + rest.

Used by the generic get_meta path and by per-instrument handlers that want the same ordering for their long-form output.
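
A minimal sketch of this ordering, using a plain dict instead of a pandas row. `reorder_fields` is hypothetical; it assumes that "*_ID" means a name ending in `_ID`, that "*ANGLE*" means a name containing `ANGLE`, and that fields keep their original relative order within each group. The sample field values are invented.

```python
def reorder_fields(row):
    """Return a new dict ordered as id-like fields, angle fields, then the rest."""
    id_keys = [k for k in row if k.endswith("_ID") or k == "FILE_NAME"]
    angle_keys = [k for k in row if "ANGLE" in k and k not in id_keys]
    rest = [k for k in row if k not in id_keys and k not in angle_keys]
    return {k: row[k] for k in id_keys + angle_keys + rest}


# Made-up row for demonstration.
row = {
    "IMAGE_TIME": "2006-12-01T00:00:00",
    "INCIDENCE_ANGLE": 54.2,
    "PRODUCT_ID": "P02_001916_2221_XI_42N027W",
    "EMISSION_ANGLE": 0.1,
    "FILE_NAME": "example.IMG",
}
print(list(reorder_fields(row)))
```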
