koa_middleware.store
Classes
CalibrationStore – Class to manage the storage, retrieval, and synchronization of calibration data between a local database and the remote archive.
- class koa_middleware.store.CalibrationStore(instrument_name: str | None = None, cache_dir: str | None = None, local_database_filename: str | None = None, connect_remote: bool = True, use_cached: bool = None, origin: str | None = None, sync_on_init: bool = True)
Bases: object
Class to manage the storage, retrieval, and synchronization of calibration data between a local database and the remote archive.
The CalibrationStore class provides a unified interface for interacting with both local (SQLite) and remote calibration databases. It handles caching of calibration files, querying for specific calibrations, and synchronizing calibration metadata between local and remote repositories.
Constructing this class sets up the necessary directory structure for caching calibration files and initializes the LocalCalibrationDB instance for managing the local SQLite database.
Creates the cache_dir, cache_dir/calibrations/<instrument_name>, and cache_dir/database directories if they do not already exist.
Initializes self.local_db with a LocalCalibrationDB instance.
Initializes self.remote_db with a RemoteCalibrationDB instance (if connect_remote is True).
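The directory layout described above can be sketched with the standard library alone. This is a minimal illustration, not the class's actual code: the helper name ensure_cache_layout is hypothetical, and only the three directory paths come from the description.

```python
import tempfile
from pathlib import Path


def ensure_cache_layout(cache_dir: str, instrument_name: str) -> dict[str, Path]:
    """Create the three documented directories if they are missing (hypothetical helper)."""
    root = Path(cache_dir)
    layout = {
        "cache_dir": root,
        "calibrations": root / "calibrations" / instrument_name,
        "database": root / "database",
    }
    for path in layout.values():
        # parents=True builds intermediate directories; exist_ok=True makes
        # a repeated call a no-op instead of an error.
        path.mkdir(parents=True, exist_ok=True)
    return layout


with tempfile.TemporaryDirectory() as tmp:
    layout = ensure_cache_layout(tmp, "hispec")
    all_exist = all(p.is_dir() for p in layout.values())
    created = sorted(p.relative_to(tmp).as_posix() for p in layout.values())

print(created)
```

Calling the helper a second time on the same directory changes nothing, which matches the "if they do not already exist" wording.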
- Parameters:
instrument_name (str | None) – The name of the instrument associated with the calibration data (e.g., ‘hispec’, ‘liger’).
cache_dir (str | None) – The absolute path to the directory where calibration files and the local SQLite database will be stored. If None, uses the KOA_CALIBRATION_CACHE environment variable. Required either as parameter or environment variable.
local_database_filename (str | None) – The filename for the local SQLite database. If None, uses the KOA_LOCAL_CALIBRATION_DATABASE_FILENAME environment variable. If that is also None, defaults to f'{instrument_name.lower()}_calibrations.db'.
connect_remote (bool, optional) – Set to False to skip initializing the remote database connection. Default is True.
Environment Variables
KOA_CALIBRATION_CACHE (Required) – Path to cached calibrations directory.
KOA_CALIBRATIONS_URL (Optional) – Remote database URL.
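The fallback order for cache_dir and local_database_filename (explicit argument, then environment variable, then default) might look like the sketch below. The function names are hypothetical and the environment is passed in as a mapping so the example stays self-contained; in real use one would pass os.environ. Only the variable names and the default filename pattern are taken from the parameter descriptions.

```python
from typing import Mapping, Optional


def resolve_cache_dir(cache_dir: Optional[str], env: Mapping[str, str]) -> str:
    """Explicit argument first, then KOA_CALIBRATION_CACHE; fail if neither is set."""
    resolved = cache_dir or env.get("KOA_CALIBRATION_CACHE")
    if not resolved:
        raise ValueError("cache_dir or KOA_CALIBRATION_CACHE must be provided")
    return resolved


def resolve_db_filename(
    filename: Optional[str], instrument_name: str, env: Mapping[str, str]
) -> str:
    """Explicit argument, then the environment variable, then the documented default."""
    return (
        filename
        or env.get("KOA_LOCAL_CALIBRATION_DATABASE_FILENAME")
        or f"{instrument_name.lower()}_calibrations.db"
    )


env = {"KOA_CALIBRATION_CACHE": "/tmp/koa_cache/"}
print(resolve_cache_dir(None, env))
print(resolve_db_filename(None, "HISPEC", env))
```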
Examples
>>> import os
>>> from koa_middleware import CalibrationStore
>>> # Initialize with explicit parameters
>>> store = CalibrationStore(
...     instrument_name='hispec',
...     cache_dir='/tmp/koa_cache/',
...     local_database_filename='hispec_calibrations.db',
...     connect_remote=False
... )
>>> # Initialize using environment variables (assuming they are set)
>>> os.environ['KOA_CALIBRATION_CACHE'] = '/tmp/koa_cache/'
>>> store = CalibrationStore(instrument_name='hispec')
- calibration_file_in_cache(cal: dict | str | SupportsCalibrationModelIO, filename: str | None = None) -> str | None
Checks if a calibration file is already present in the local cache.
- Parameters:
cal (dict | str | SupportsCalibrationModelIO) –
- Can be one of:
str: A calibration ID string or filepath.
dict: A calibration metadata dict.
SupportsCalibrationModelIO: A calibration data model instance.
filename (str | None) – The filename to check for. If None, the filename will be extracted from the input cal parameter.
- Returns:
filepath – The absolute local file path if the calibration file is found in the cache, otherwise None.
- Return type:
str | None
- calibration_record_in_cache(cal: dict | str | SupportsCalibrationModelIO, mode: str = 'id') -> dict | None
Checks if a calibration is already present in the local cache.
- Parameters:
cal (dict | str | SupportsCalibrationModelIO) –
- Can be one of:
str: A calibration ID string or filepath.
dict: A calibration metadata dict.
SupportsCalibrationModelIO: A calibration data model instance.
mode (str) –
- The mode to check the cache. Can be one of:
‘id’ : Check by calibration ID (cal_id), the primary key in the database.
‘version-family’ : Check by the version family values.
‘md5’ : Check by the MD5 checksum of the calibration file.
- Returns:
The calibration metadata record if found, otherwise None.
- Return type:
dict | None
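To make the three lookup modes concrete, here is a small sketch over an in-memory list of records. It is an illustration only: the record keys ("id", "md5"), the RECORDS list, and the find_record helper are assumptions, and the version family columns are the documented defaults ('cal_type' and 'datetime_obs').

```python
from typing import Optional

# Hypothetical stand-in for rows in the local database.
RECORDS = [
    {"id": "a1", "cal_type": "dark", "datetime_obs": "2024-01-01T00:00:00", "md5": "abc"},
    {"id": "b2", "cal_type": "flat", "datetime_obs": "2024-01-02T00:00:00", "md5": "def"},
]
FAMILY_KEYS = ("cal_type", "datetime_obs")


def find_record(cal: dict, mode: str = "id") -> Optional[dict]:
    """Return the first stored record matching cal under the given mode."""
    if mode == "id":
        keys = ("id",)
    elif mode == "version-family":
        keys = FAMILY_KEYS
    elif mode == "md5":
        keys = ("md5",)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    for record in RECORDS:
        # A record matches when every key compared in this mode is equal.
        if all(record.get(k) == cal.get(k) for k in keys):
            return record
    return None


by_id = find_record({"id": "b2"})
by_family = find_record(
    {"cal_type": "dark", "datetime_obs": "2024-01-01T00:00:00"}, mode="version-family"
)
by_md5 = find_record({"md5": "zzz"}, mode="md5")
```

The 'md5' mode is useful when the same file content may have been registered under a different ID, while 'version-family' asks whether any calibration of the same family exists at all.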
- close()
Closes the connections to the local DB. Currently nothing is done to close the remote DB. The Keck Login session is cached for re-use within the same Python session.
- detect_version_issues()
- download_calibration_file(calibration: dict | str) -> str
Downloads a calibration file from the remote DB. This does not register the calibration in the local DB. Most use cases should use store.get_calibration() instead.
- Parameters:
calibration (dict | str) – A calibration metadata dictionary or calibration ID string.
- Returns:
The absolute local file path where the calibration file was downloaded.
- Return type:
str
- generate_calibration_version(cal: dict | SupportsCalibrationModelIO, origin: str | None = None) -> str
Generate the next calibration version (“001”, “002”, …), scoped to the calibration’s version family and origin.
- Parameters:
cal (dict | SupportsCalibrationModelIO) – The calibration record for which to generate the version. Must contain the necessary metadata fields to determine its version family (e.g. cal_type, datetime_obs, master_cal, spectrograph).
origin (str | None, optional) – The origin to use for generating the version. If None, the origin from the calibration metadata will be used.
- Returns:
The calibration version string
- Return type:
str
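A version counter scoped to a version family and an origin could be sketched as follows. This is not the method's actual implementation: the next_version helper and the "version" and "origin" record keys are hypothetical, and only the zero-padded three-digit format comes from the description.

```python
def next_version(existing: list[dict], family: dict, origin: str) -> str:
    """Return the next zero-padded version within one (family, origin) scope."""
    versions = [
        int(rec["version"])
        for rec in existing
        if rec["origin"] == origin
        and all(rec.get(key) == value for key, value in family.items())
    ]
    # With no prior versions in scope, max(..., default=0) yields 0, so numbering starts at "001".
    return f"{max(versions, default=0) + 1:03d}"


existing = [
    {"cal_type": "dark", "origin": "local", "version": "001"},
    {"cal_type": "dark", "origin": "local", "version": "002"},
    {"cal_type": "flat", "origin": "local", "version": "001"},
]

print(next_version(existing, {"cal_type": "dark"}, "local"))
print(next_version(existing, {"cal_type": "dark"}, "remote"))
```

Because the scope includes the origin, the same family can carry independent version sequences for different origins.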
- get_calibration(cal: dict | str) -> tuple[str, dict]
Retrieves the calibration file based on its record or ID. Checks if the calibration is already cached locally, and downloads it if not.
- Parameters:
cal (dict | str) – A calibration metadata dictionary, calibration ID string, or local filepath string.
- Returns:
result –
str: The absolute local file path where the calibration file is stored.
dict: The calibration metadata dictionary as stored in the local database.
- Return type:
tuple[str, dict]
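The check-cache-then-download behavior described for this method follows a common pattern, sketched below with a stand-in downloader. The names get_or_download and fake_download are hypothetical; a real remote fetch would replace the placeholder write.

```python
import tempfile
from pathlib import Path
from typing import Callable


def get_or_download(
    filename: str, cache_dir: Path, download: Callable[[str, Path], None]
) -> tuple[str, bool]:
    """Return (absolute path, downloaded?) for a file, fetching only on a cache miss."""
    target = cache_dir / filename
    if target.is_file():
        return str(target.resolve()), False
    download(filename, target)
    return str(target.resolve()), True


def fake_download(filename: str, target: Path) -> None:
    # Stand-in for a remote fetch; writes placeholder bytes to the cache.
    target.write_bytes(b"placeholder")


with tempfile.TemporaryDirectory() as tmp:
    first = get_or_download("dark_001.fits", Path(tmp), fake_download)
    second = get_or_download("dark_001.fits", Path(tmp), fake_download)

print(first[1], second[1])
```

The second call finds the file already cached and skips the download.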
- get_last_updated(source: str | None = None, **kwargs) -> str | None
Get the last updated timestamp for the instrument’s calibration data.
- Parameters:
source (str | None) – Whether to query from the ‘local’ or ‘remote’ database. If None, defaults to ‘remote’ if available, otherwise ‘local’.
**kwargs – Additional parameters to pass to local_db.get_last_updated() or remote_db.get_last_updated().
- Returns:
The last updated timestamp as a string, or None if no entries exist.
- Return type:
str | None
- get_missing_local_files() -> list[dict]
Identifies all calibration files that are recorded in the local SQLite DB but are missing from the local cache directory.
- Parameters:
instrument_name (str, optional) – The name of the instrument to check for missing files. If None, all instruments are checked.
- Returns:
A list of calibration metadata dictionaries for calibrations that are missing from the local cache.
- Return type:
list[dict]
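Conceptually, this check compares database records against the files on disk. A minimal sketch, assuming each record carries a "filename" key (an assumption for illustration) and using a hypothetical find_missing_files helper:

```python
import tempfile
from pathlib import Path


def find_missing_files(records: list[dict], cache_dir: Path) -> list[dict]:
    """Return the records whose file is not present under cache_dir."""
    return [rec for rec in records if not (cache_dir / rec["filename"]).is_file()]


records = [
    {"id": "a1", "filename": "dark_001.fits"},
    {"id": "b2", "filename": "flat_001.fits"},
]

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    (cache / "dark_001.fits").write_bytes(b"placeholder")  # only one file is cached
    missing = find_missing_files(records, cache)

print([rec["id"] for rec in missing])
```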
- get_missing_records(source: str = 'remote', mode: str = 'id') -> list[dict]
Identifies calibration entries present in one database but missing from another.
- Parameters:
source (str, optional) –
‘remote’ (default): Returns entries in remote DB but not in local DB.
‘local’: Returns entries in local DB but not in remote DB.
mode (str, optional) – The mode to determine which entries are considered missing. Options are:
‘id’ (default): Entries whose IDs are not present in the target database.
‘last_updated’: Entries with a last_updated timestamp greater than the most recent timestamp in the target database.
- Returns:
A list of dictionaries of metadata representing entries that are in the source DB but not yet in the target DB.
- Return type:
list[dict]
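The two modes can return different results for the same data, which the following sketch illustrates with plain lists. The missing_records helper and the record keys are assumptions for illustration, and timestamps are assumed to be ISO 8601 strings in one consistent format so that string comparison matches chronological order.

```python
def missing_records(source: list[dict], target: list[dict], mode: str = "id") -> list[dict]:
    """Return entries of source considered missing from target under the given mode."""
    if mode == "id":
        target_ids = {rec["id"] for rec in target}
        return [rec for rec in source if rec["id"] not in target_ids]
    if mode == "last_updated":
        # An empty target yields "", which sorts before any timestamp.
        latest = max((rec["last_updated"] for rec in target), default="")
        return [rec for rec in source if rec["last_updated"] > latest]
    raise ValueError(f"unknown mode: {mode!r}")


remote = [
    {"id": "a", "last_updated": "2024-01-01"},
    {"id": "b", "last_updated": "2024-02-01"},
    {"id": "c", "last_updated": "2024-01-15"},
]
local = [{"id": "a", "last_updated": "2024-01-20"}]

by_id = [rec["id"] for rec in missing_records(remote, local, mode="id")]
by_time = [rec["id"] for rec in missing_records(remote, local, mode="last_updated")]
print(by_id, by_time)
```

Here entry "c" is absent locally but older than the newest local timestamp, so only the 'id' mode reports it. The 'last_updated' mode is cheaper when records are always added in time order, and 'id' is the safer choice when they are not.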
- get_version_family_column_names(cal_type: str)
Retrieves the column names for the version family attributes. By default, this includes ‘cal_type’ and ‘datetime_obs’, but subclasses should override this method to specify different or additional columns for different calibration types.
- Parameters:
cal_type (str) – The type of calibration.
- get_version_family_values(cal: dict) -> dict
Retrieves the fields/values that determine whether or not a calibration requires a new version.
- Parameters:
cal (dict) – A calibration metadata record. One key must be ‘cal_type’ to determine the calibration type and thus the version family fields.
cal_type (str) – The type of calibration.
- Returns:
A dictionary containing only the keys/values for metadata that determines the version family.
- Return type:
dict
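The relationship between the column names and the extracted values can be sketched as a dictionary filter. The two functions below are hypothetical stand-ins; only the default columns ('cal_type' and 'datetime_obs') come from the documentation, and a subclass might return other columns per calibration type.

```python
def version_family_columns(cal_type: str) -> list[str]:
    # Documented default columns; cal_type is unused here but would let
    # an override pick different columns for different calibration types.
    return ["cal_type", "datetime_obs"]


def version_family_values(cal: dict) -> dict:
    """Keep only the keys/values that define the record's version family."""
    columns = version_family_columns(cal["cal_type"])
    return {name: cal[name] for name in columns}


record = {
    "id": "a1",
    "cal_type": "dark",
    "datetime_obs": "2024-01-01T00:00:00",
    "filename": "dark_001.fits",
}
family = version_family_values(record)
print(family)
```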
- query(source: str | None = None, **kwargs) -> list[dict] | dict | None
Query calibrations from the local or remote database.
Users can also query the local and remote databases directly using store.local_db.query() and store.remote_db.query().
This method may be removed in the future if not found useful.
- Parameters:
source (str | None) – Whether to query from the ‘local’ or ‘remote’ database. If None, defaults to ‘local’.
**kwargs – Additional parameters to pass to the underlying query method.
- Returns:
Query results from the specified source.
- Return type:
list[dict] | dict | None
- record_from(cal: dict | SupportsCalibrationModelIO) -> dict
Extracts a calibration record dictionary from a given input.
- Parameters:
cal (dict | SupportsCalibrationModelIO) – The input from which to extract the calibration record. Can be a dict or any object with a to_record() method.
- Returns:
A dictionary representing the calibration record.
- Return type:
dict
- Raises:
ValueError – If the input type is invalid or if the object does not have a to_record() method.
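The dict-or-to_record() behavior is a form of duck typing, sketched below. The as_record function and the DarkModel class are hypothetical; the sketch only mirrors the documented contract (accept a dict or an object with a to_record() method, raise ValueError otherwise).

```python
from typing import Any


class DarkModel:
    """Hypothetical data model exposing to_record()."""

    def to_record(self) -> dict:
        return {"id": "a1", "cal_type": "dark"}


def as_record(cal: Any) -> dict:
    """Return cal itself if it is a dict, else the result of cal.to_record()."""
    if isinstance(cal, dict):
        return cal
    to_record = getattr(cal, "to_record", None)
    if callable(to_record):
        return to_record()
    raise ValueError(f"cannot extract a record from {type(cal).__name__}")


from_dict = as_record({"id": "x"})
from_model = as_record(DarkModel())
```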
- register_calibration(cal: SupportsCalibrationModelIO, origin: str | None = None, new_version: bool = False) -> tuple[str, dict]
Registers a calibration to the local cache and metadata database.
- Parameters:
cal (SupportsCalibrationModelIO) – The datamodel object to register.
origin (str, optional) – The origin to register the calibration under.
new_version (bool, optional) – Whether to generate a new version for this calibration. If False, the method will check if a calibration with the same version family already exists in the cache and skip registration if so. Defaults to False.
- Returns:
A tuple containing:
str: The local file path where the calibration was saved.
dict: The calibration metadata dictionary as added to the database.
- Return type:
tuple[str, dict]
- save_calibration_file(cal: SupportsCalibrationModelIO, cal_record: dict | None = None) -> str
Saves a calibration file to the local cache directory.
- Parameters:
cal (SupportsCalibrationModelIO) – The calibration data model instance to save.
cal_record (dict | None) – The corresponding record.
- Returns:
The absolute local file path where the calibration file was saved.
- Return type:
str
- select_and_get_calibration(input, selector: CalibrationSelector) -> tuple[str, dict]
Selects the best calibration based on input data and a selection rule, then retrieves it.
This method uses a CalibrationSelector to identify the most appropriate calibration for the given input data. Once selected, it retrieves the calibration file, downloading it if it’s not already cached locally.
- Parameters:
input – The input data product for which a calibration is needed.
selector (CalibrationSelector) – An instance of a CalibrationSelector class.
- Returns:
str: The local file path of the retrieved calibration file.
dict: The record of the selected calibration from the local database.
- Return type:
tuple[str, dict]
Example
>>> # Assuming my_input_data and my_selector are defined
>>> local_filepath, calibration_record = store.select_and_get_calibration(my_input_data, my_selector)
>>> print(f"Calibration file: {local_filepath}")
>>> print(f"Calibration ID: {calibration_record['id']}")
- sync_records_from_cached_files(cals: SupportsCalibrationModelIO | Sequence[SupportsCalibrationModelIO]) -> None
Populates the local database from existing cached calibration files.
- Parameters:
cals (SupportsCalibrationModelIO | Sequence[SupportsCalibrationModelIO]) – A single calibration metadata dictionary or a data model instance, or a list of these.
Notes
This method may be removed in the future if not found useful.
- sync_records_from_remote(cals, mode: str = 'id') -> list[dict]
Synchronizes the local database with the remote database.
This method fetches entries from the remote database that are missing from the local database, as determined by the mode parameter (see below), and then adds them to the local database.
- Parameters:
mode (str, optional) –
The mode to determine which entries are considered missing. Options are:
‘last_updated’: Entries with a last_updated timestamp greater than the most recent timestamp in the local database.
‘id’ (default): Entries whose IDs are not present in the local database.
- Returns:
cals – A list of dictionaries representing calibration entries that were added to the local database during synchronization.
- Return type:
list[dict]
- sync_records_to_remote(mode: str = 'id') -> list[dict]
Uploads local calibration entries to the remote DB.
- Parameters:
mode (str, optional) –
The mode to determine which entries are considered missing. Options are:
‘last_updated’: Entries with a last_updated timestamp greater than the most recent timestamp in the remote database.
‘id’ (default): Entries whose IDs are not present in the remote database.
- Returns:
cals – A list of dictionaries representing calibration entries that were added to the remote database during synchronization.
- Return type:
list[dict]