Python API
This page documents the public Python API of ioc_cleanup.
Only stable, user-facing functions are listed here.
Transformations & Cleaning
Functions related to loading, applying, and managing cleaning transformations.
ioc_cleanup.load_transformation(ioc_code, sensor, src_dir=_constants.TRANSFORMATIONS_DIR)
Load a transformation definition for a station and sensor.
This is a convenience wrapper around
load_transformation_from_path that constructs the transformation
filename from the IOC station code and sensor identifier.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ioc_code | str | IOC station code. | required |
| sensor | str | Sensor identifier. | required |
| src_dir | str \| PathLike[str] | Directory containing transformation JSON files. | TRANSFORMATIONS_DIR |
Returns:
| Type | Description |
|---|---|
| Transformation | Parsed transformation model. |
Source code in ioc_cleanup/_tools.py
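The wrapper pattern described for load_transformation can be sketched as follows. This is a minimal illustration, not the library's implementation: the "<ioc_code>_<sensor>.json" naming scheme, the "transformations" directory, the helper name build_transformation_path, and the example station and sensor values are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical stand-in for the package's TRANSFORMATIONS_DIR constant.
TRANSFORMATIONS_DIR = Path("transformations")


def build_transformation_path(ioc_code, sensor, src_dir=TRANSFORMATIONS_DIR):
    """Build the path of a transformation file (assumed naming scheme)."""
    # Assumption: one JSON file per station/sensor pair,
    # named "<ioc_code>_<sensor>.json".
    return Path(src_dir) / f"{ioc_code}_{sensor}.json"


# Illustrative values only.
path = build_transformation_path("abed", "bub")
print(path.as_posix())
```

A wrapper of this shape keeps the naming convention in one place, so callers only need to know the station code and the sensor identifier.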
ioc_cleanup.load_transformation_from_path(path)
Load a transformation definition from a JSON file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| PathLike[str] | Path to a transformation JSON file describing cleaning rules. | required |
Returns:
| Type | Description |
|---|---|
| Transformation | Parsed transformation model. |
Source code in ioc_cleanup/_tools.py
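As a rough illustration of loading a JSON definition into a typed model, the sketch below uses only the standard library. The class TransformationSketch, its field names, and the helper load_sketch_from_path are hypothetical; the real Transformation model may have different fields and validation.

```python
import json
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class TransformationSketch:
    """Hypothetical stand-in for the Transformation model (field names are assumed)."""

    ioc_code: str
    sensor: str
    dropped_timestamps: list = field(default_factory=list)


def load_sketch_from_path(path):
    """Read a JSON file and build the model from its top-level keys."""
    with open(path, encoding="utf-8") as fd:
        payload = json.load(fd)
    return TransformationSketch(**payload)


# Round trip through a temporary file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "example.json"
    json_path.write_text(
        json.dumps(
            {
                "ioc_code": "abed",
                "sensor": "bub",
                "dropped_timestamps": ["2020-01-01T00:00:00"],
            }
        ),
        encoding="utf-8",
    )
    loaded = load_sketch_from_path(json_path)

print(loaded)
```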
ioc_cleanup.transform(df, transformation=None)
Apply a cleaning transformation to an IOC sea-level time series.
The transformation defines the valid time window, dropped timestamps, dropped date ranges, and sensor breakpoints. Bad data is dropped from the DataFrame; no offset correction is applied.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Raw IOC sea-level time series. The DataFrame must have | required |
| transformation | Transformation \| None | Cleaning transformation to apply. If not provided, it is loaded automatically using DataFrame attributes. | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | Cleaned time series with metadata stored in |
Source code in ioc_cleanup/_tools.py
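The drop-only behaviour described for transform can be illustrated without pandas. The sketch below works on plain (timestamp, value) pairs; the function name apply_cleaning, its parameter names, and the choice of inclusive bounds are assumptions, and sensor breakpoints are left out entirely.

```python
from datetime import datetime


def apply_cleaning(records, start, end, dropped_timestamps, dropped_ranges):
    """Keep records inside [start, end] that are not flagged as bad.

    All parameter names are hypothetical; records are (timestamp, value) pairs.
    """
    bad_stamps = set(dropped_timestamps)
    cleaned = []
    for stamp, value in records:
        if not (start <= stamp <= end):
            continue  # outside the valid time window
        if stamp in bad_stamps:
            continue  # individually dropped timestamp
        if any(lo <= stamp <= hi for lo, hi in dropped_ranges):
            continue  # inside a dropped date range
        cleaned.append((stamp, value))  # value kept unchanged: no offset correction
    return cleaned


# Ten daily samples whose value equals the day of the month.
records = [(datetime(2020, 1, day), float(day)) for day in range(1, 11)]
cleaned = apply_cleaning(
    records,
    start=datetime(2020, 1, 2),
    end=datetime(2020, 1, 9),
    dropped_timestamps=[datetime(2020, 1, 3)],
    dropped_ranges=[(datetime(2020, 1, 5), datetime(2020, 1, 7))],
)
print([stamp.day for stamp, _ in cleaned])  # → [2, 4, 8, 9]
```

The surviving values are identical to the inputs, which mirrors the statement that bad data is removed rather than corrected.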
ioc_cleanup.clean(df, station, sensor)
Clean a raw IOC time series using the corresponding transformation file.
This is a wrapper around transform that returns the cleaned series for a single sensor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Raw IOC station data. | required |
| station | str | IOC station code. | required |
| sensor | str | Sensor identifier. | required |
Returns:
| Type | Description |
|---|---|
| Series | Cleaned sea-level time series for the selected sensor. |
Source code in ioc_cleanup/_tools.py
Surge & Signal Processing
Utilities for tidal analysis, demeaning, and surge extraction.
ioc_cleanup.surge(ts, opts, rsmp)
Compute the non-tidal (surge) component of a sea-level time series.
Tidal constituents are estimated using UTide and reconstructed at the original timestamps. The tidal signal is then subtracted from the observed series.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ts | Series | Sea-level time series. | required |
| opts | Mapping[str, Any] | UTide solver options. | required |
| rsmp | int \| None | Optional resampling interval in minutes. If provided, the series is resampled before tidal analysis. | required |
Returns:
| Type | Description |
|---|---|
| Series | Surge (non-tidal residual) time series. |
Source code in ioc_cleanup/_tools.py
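The idea of fitting a tidal signal and subtracting it at the original timestamps can be shown with a deliberately simplified stand-in. The sketch below does not call UTide: it fits a mean plus a single M2 harmonic by ordinary least squares, using only the standard library. The function names solve_3x3 and surge_sketch and the synthetic data are assumptions for illustration, and the optional resampling step is omitted.

```python
import math

M2_PERIOD_HOURS = 12.4206012  # principal lunar semidiurnal constituent


def solve_3x3(a, b):
    """Solve a 3x3 linear system with Gaussian elimination and partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    x = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        acc = m[row][3] - sum(m[row][k] * x[k] for k in range(row + 1, 3))
        x[row] = acc / m[row][row]
    return x


def surge_sketch(hours, levels, period=M2_PERIOD_HOURS):
    """Fit mean + one harmonic, then subtract the fit at the original timestamps."""
    omega = 2.0 * math.pi / period
    basis = [(1.0, math.cos(omega * t), math.sin(omega * t)) for t in hours]
    # Normal equations (B^T B) x = B^T y for the three coefficients.
    ata = [[sum(r[i] * r[j] for r in basis) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(basis, levels)) for i in range(3)]
    mean, a_cos, b_sin = solve_3x3(ata, aty)
    tide = [mean + a_cos * c + b_sin * s for _, c, s in basis]
    return [y - t for y, t in zip(levels, tide)]


# Synthetic series: 1.5 m M2 tide around a 2 m mean, plus a 0.3 m bump at hour 30.
hours = [0.5 * i for i in range(200)]
omega = 2.0 * math.pi / M2_PERIOD_HOURS
levels = [
    2.0 + 1.5 * math.cos(omega * t - 0.4) + (0.3 if t == 30.0 else 0.0)
    for t in hours
]
residual = surge_sketch(hours, levels)
peak_index = max(range(len(residual)), key=lambda i: residual[i])
print(hours[peak_index], round(residual[peak_index], 2))
```

A real tidal analysis resolves many constituents and needs a long enough record to separate them, which is what the UTide solver options control.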
Station Metadata
Access to IOC station metadata and geographic information.
ioc_cleanup.get_meta()
cached
Retrieve IOC station metadata with geographic coordinates.
Metadata are collected from both the IOC web service and the IOC API and merged into a single GeoDataFrame.
Returns:
| Type | Description |
|---|---|
| GeoDataFrame | GeoDataFrame containing IOC station codes, longitude, latitude, and geometry in EPSG:4326. |
Source code in ioc_cleanup/_searvey.py
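The "cached" label suggests that repeated calls reuse the first result instead of querying the IOC services again. How the library implements this is not stated here; the sketch below assumes a standard functools cache and uses a hypothetical get_meta_sketch with placeholder rows.

```python
import functools

call_count = 0


@functools.lru_cache(maxsize=None)
def get_meta_sketch():
    """Hypothetical stand-in for an expensive metadata download."""
    global call_count
    call_count += 1
    # Placeholder row; real metadata would come from the IOC services.
    return ({"ioc_code": "demo", "lon": 0.0, "lat": 0.0},)


first = get_meta_sketch()
second = get_meta_sketch()
print(call_count, first is second)  # → 1 True
```

With this kind of cache every caller receives the same object, so it is safer to copy the result before modifying it.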
Data Download
Helpers for downloading and storing IOC raw data.
ioc_cleanup.download_raw(ioc_codes, start, end)
Download raw IOC sea-level data for multiple stations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ioc_codes | list[str] | List of IOC station codes. | required |
| start | Timestamp | Start timestamp. | required |
| end | Timestamp | End timestamp. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, DataFrame] | Dictionary mapping station codes to raw dataframes. |
Source code in ioc_cleanup/_searvey.py
ioc_cleanup.download_year_station(station, year, data_folder='./data')
Download and store one year of IOC data for a single station.
Data are saved as Parquet files under <data_folder>/<year>/.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| station | str | IOC station code. | required |
| year | int | Year to download. | required |
| data_folder | str | Base directory for storing downloaded data. | './data' |
Source code in ioc_cleanup/_searvey.py
Data Loading
Utilities for loading archived IOC data from disk.
ioc_cleanup.load_station(station, data_dir=Path('./data'), start_year=2011, end_year=2024)
Load multi-year IOC data for a station from local Parquet files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| station | str | IOC station code. | required |
| data_dir | Path | Base directory containing yearly Parquet files. | Path('./data') |
| start_year | int | First year to load (inclusive). | 2011 |
| end_year | int | Last year to load (exclusive). | 2024 |
Returns:
| Type | Description |
|---|---|
| DataFrame | Concatenated DataFrame containing the available station data. Returns an empty DataFrame if no data are found. |
Source code in ioc_cleanup/_searvey.py
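Because start_year is inclusive and end_year is exclusive, the defaults cover 2011 through 2023. The sketch below illustrates that half-open range and the handling of missing years using only the standard library. The helper existing_year_files and the "<data_dir>/<year>/<station>.parquet" file naming are assumptions; only the per-year folder layout is stated on this page.

```python
import tempfile
from pathlib import Path


def existing_year_files(station, data_dir, start_year=2011, end_year=2024):
    """List the yearly files that exist for a station (hypothetical file naming)."""
    found = []
    for year in range(start_year, end_year):  # end_year itself is not visited
        # Assumption: files are stored as "<data_dir>/<year>/<station>.parquet".
        candidate = Path(data_dir) / str(year) / f"{station}.parquet"
        if candidate.exists():
            found.append(candidate)
    return found


with tempfile.TemporaryDirectory() as tmp:
    for year in (2021, 2023, 2024):
        folder = Path(tmp) / str(year)
        folder.mkdir()
        (folder / "abed.parquet").touch()  # empty placeholder, not a real Parquet file
    years = [p.parent.name for p in existing_year_files("abed", tmp, 2021, 2024)]
    missing = existing_year_files("zzzz", tmp, 2021, 2024)

print(years, missing)  # → ['2021', '2023'] []
```

The 2024 file is skipped because it falls on the exclusive upper bound, and a station with no files yields an empty result rather than an error.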
Models
Core data models used by the cleaning workflow.
ioc_cleanup.Transformation
Bases: BaseModel