pds.index_main

Composition-based Index implementation using Remote classes.

This module provides a unified Index class that uses composition with Remote classes to handle URL management.

Classes

Name Description
Index Unified Index class using composition with Remote classes.
InventoryIndex Index class for inventory-style CSV indexes.

Index

pds.index_main.Index(index_key, local_dir=None, force_config_update=False)

Unified Index class using composition with Remote classes.

This class provides file management operations while delegating URL management to appropriate Remote classes based on the index configuration.
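The composition described above can be illustrated with a minimal, self-contained sketch. It is not the library's code: `StaticRemote`, `DynamicRemote`, `SketchIndex`, the `current_url` method, and the example.org URLs are all hypothetical names chosen for illustration. Only the idea (the index owns file handling and delegates URL questions to a remote object) comes from the description.

```python
from dataclasses import dataclass
from typing import Protocol


class Remote(Protocol):
    """Anything that can report the current URL of an index label (hypothetical)."""

    def current_url(self) -> str: ...


@dataclass
class StaticRemote:
    """Hypothetical remote whose URL is fixed in a configuration entry."""

    configured_url: str

    def current_url(self) -> str:
        return self.configured_url


@dataclass
class DynamicRemote:
    """Hypothetical remote whose URL is built from the latest known volume."""

    base_url: str
    latest_volume: str

    def current_url(self) -> str:
        return f"{self.base_url}/{self.latest_volume}/index/index.lbl"


class SketchIndex:
    """Holds index-related state and delegates URL handling to a Remote."""

    def __init__(self, index_key: str, remote: Remote) -> None:
        self.index_key = index_key
        self.remote = remote

    @property
    def remote_type(self) -> str:
        # Mirrors the documented 'static' / 'dynamic' distinction.
        return "dynamic" if isinstance(self.remote, DynamicRemote) else "static"

    @property
    def url(self) -> str:
        # Delegation: the index never assembles URLs itself.
        return self.remote.current_url()
```

Swapping the remote object changes how the URL is resolved without touching any file-handling code, which is the usual motivation for preferring composition over a subclass per URL scheme.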

Parameters

Name Type Description Default
index_key str Dotted key identifying the index (e.g., “mro.ctx.edr”) required
local_dir str | Path | None Local directory for index files. If None, a default path is used. None
force_config_update bool Whether to force update of static index configuration. False

Attributes

Name Description
dataframe Get the index data as a pandas DataFrame from parquet cache.
files_downloaded Check if index files exist locally.
isupper Check if filename uses uppercase extension.
label_filename Get the label filename from URL.
local_dir Get local directory for this index.
local_label_path Get the local label file path.
local_parq_path Get the local parquet file path.
local_table_path Get the local table file path.
remote Get the appropriate Remote instance for this index.
remote_type Get the type of remote handling used (‘static’ or ‘dynamic’).
tab_extension Get the appropriate table extension.
table_filename Get the table filename.
table_url Get the table URL from the label URL.
update_available Check if an update is available.
url Get the current URL for this index.
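The relationship between `url`, `isupper`, `tab_extension`, and `table_url` can be sketched as follows. This is an assumption about how such attributes might fit together, not the library's implementation: the function name `table_url_from_label_url`, the `.lbl`/`.tab` extensions, and the example.org URLs are illustrative.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit, urlunsplit


def table_url_from_label_url(label_url: str) -> str:
    """Derive a table URL by swapping the label extension, preserving its case."""
    parts = urlsplit(label_url)
    label_path = PurePosixPath(parts.path)

    # Assumption: an uppercase label extension implies an uppercase table extension.
    is_upper = label_path.suffix.isupper()
    tab_extension = ".TAB" if is_upper else ".tab"

    table_path = label_path.with_suffix(tab_extension)
    return urlunsplit(parts._replace(path=str(table_path)))


print(table_url_from_label_url("https://example.org/data/index/CUMINDEX.LBL"))
print(table_url_from_label_url("https://example.org/data/index/cumindex.lbl"))
```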

Methods

Name Description
convert_to_parquet Convert the downloaded index files to parquet format.
download Download the index files from remote URL.
ensure_parquet Ensure a parquet cache exists for this index.
read_index_data Read the index data from label and table files.
refresh_remote Force refresh of remote URL information.

convert_to_parquet
pds.index_main.Index.convert_to_parquet()

Convert the downloaded index files to parquet format.

download
pds.index_main.Index.download(force=False, convert_to_parquet=True)

Download the index files from remote URL.

Args:

  • force: Force download even if local files exist.
  • convert_to_parquet: Whether to convert to parquet after download.

Returns: True if download was successful.

ensure_parquet
pds.index_main.Index.ensure_parquet(force=False)

Ensure a parquet cache exists for this index.

  • If force is True, reconvert to parquet from the existing label+table.
  • If the parquet file is missing, convert when label+table exist.
  • If the label or the table is missing, perform a clean download.

Returns: True if a download was performed, False otherwise.
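The rules above can be read as a small decision table. The sketch below encodes one plausible reading. It is not the library's code: `Action`, `plan_parquet_action`, and `download_performed` are hypothetical names, and the ordering (missing source files take precedence over `force`, because reconversion needs both files) is an assumption the text does not spell out.

```python
from enum import Enum


class Action(Enum):
    DOWNLOAD = "download"  # fetch label+table (conversion would follow)
    CONVERT = "convert"    # rebuild parquet from existing label+table
    NOTHING = "nothing"    # cache is already usable


def plan_parquet_action(
    force: bool, has_label: bool, has_table: bool, has_parquet: bool
) -> Action:
    """Pick the step needed to end up with a parquet cache."""
    # Assumption: without both source files, nothing can be converted,
    # so a clean download wins even when force is True.
    if not (has_label and has_table):
        return Action.DOWNLOAD
    if force or not has_parquet:
        return Action.CONVERT
    return Action.NOTHING


def download_performed(action: Action) -> bool:
    """Mirror the documented return value: True only when a download happened."""
    return action is Action.DOWNLOAD
```

Under this reading, calling the method on a fully cached index is a no-op, and the boolean return value tells the caller whether the network was touched.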

read_index_data
pds.index_main.Index.read_index_data(convert_times=True)

Read the index data from label and table files.

refresh_remote
pds.index_main.Index.refresh_remote()

Force refresh of remote URL information.

InventoryIndex

pds.index_main.InventoryIndex(
    index_key,
    local_dir=None,
    force_config_update=False,
)

Index class for inventory-style CSV indexes. This class handles CSV files where:

  • First 3 columns are: volume, file_path, observation_id
  • Remaining columns contain comma-separated target names

The data is exploded so each target gets its own row, then grouped back by observation_id with targets as lists for efficient querying.
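The explode-then-regroup step can be sketched with the standard library alone. The sample rows, `explode_rows`, and `group_targets` are made up for illustration. The wording suggests the real class works on a pandas DataFrame (for example with `explode` and `groupby`), but that is an assumption.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in the documented layout:
# volume, file_path, observation_id, then one or more target names.
SAMPLE_CSV = """\
VOL_0001,data/a.img,OBS_A,MARS,PHOBOS
VOL_0001,data/b.img,OBS_B,MARS
"""


def explode_rows(text: str) -> list[tuple[str, str, str, str]]:
    """Return one (volume, file_path, observation_id, target) row per target."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        volume, file_path, observation_id, *targets = record
        for target in targets:
            target = target.strip()
            if target:  # skip empty trailing cells
                rows.append((volume, file_path, observation_id, target))
    return rows


def group_targets(rows: list[tuple[str, str, str, str]]) -> dict[str, list[str]]:
    """Collect the exploded targets back into one list per observation_id."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for _volume, _file_path, observation_id, target in rows:
        grouped[observation_id].append(target)
    return dict(grouped)


exploded = explode_rows(SAMPLE_CSV)
by_observation = group_targets(exploded)
```

The exploded form has one row per target, which suits filtering by a single target name. The grouped form has one entry per observation, which suits looking up everything known about one product.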

Attributes

Name Description
tab_extension Get the appropriate table extension.