openavmkit.checkpoint
Notebook checkpoint system.
Provides from_checkpoint, the main mechanism the pipeline notebooks use to
save and resume intermediate state. Each checkpoint stores a function's result
to disk; on subsequent runs the checkpoint is loaded instead of re-executing
the function. Supports both pickled Python objects and parquet-backed
DataFrames / GeoDataFrames.
Typical use pattern in a notebook cell::
    sup = from_checkpoint("1-assemble-02-process_data", process_dataframes,
                          {"dataframes": dataframes, "settings": settings})
Set clear_checkpoints = True at the top of a notebook (and call
delete_checkpoints) to start over from scratch.
delete_checkpoints
delete_checkpoints(prefix)
Delete all checkpoint files that start with the given prefix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prefix` | `str` | The prefix to match checkpoint files against. | required |
Source code in openavmkit/checkpoint.py
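
Prefix-based deletion can be illustrated with a small standalone sketch. It does not call openavmkit: `delete_by_prefix`, the temporary directory, and the file names are all hypothetical, and where the real module stores its checkpoint files is not stated here.

```python
import tempfile
from pathlib import Path


def delete_by_prefix(directory: Path, prefix: str) -> list[str]:
    """Delete every file in `directory` whose name starts with `prefix`.

    Hypothetical stand-in, not the openavmkit implementation.
    Returns the sorted names of the deleted files.
    """
    deleted = []
    for file in directory.iterdir():
        if file.is_file() and file.name.startswith(prefix):
            file.unlink()
            deleted.append(file.name)
    return sorted(deleted)


# Demonstrate on a throwaway directory with made-up checkpoint names.
demo_dir = Path(tempfile.mkdtemp())
for name in ["1-assemble-01.pickle", "1-assemble-02.parquet", "2-clean-01.pickle"]:
    (demo_dir / name).touch()

removed = delete_by_prefix(demo_dir, "1-assemble")
remaining = sorted(p.name for p in demo_dir.iterdir())
```

In this sketch a prefix such as `"1-assemble"` removes every checkpoint for one notebook stage while leaving later stages untouched.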
exists_checkpoint
exists_checkpoint(path)
Check if a checkpoint exists at the specified path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the checkpoint file (without extension). | required |
Returns:
| Type | Description |
|---|---|
| `bool` | True if a checkpoint exists, False otherwise. |
Source code in openavmkit/checkpoint.py
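
Because the path is given without an extension and checkpoints may be pickled or parquet-backed, an existence check plausibly has to try more than one file name. The sketch below shows one way to do that. The helper `exists_any` and the extensions `.pickle` and `.parquet` are assumptions made for illustration; the names the real module uses are not given here.

```python
import tempfile
from pathlib import Path

# Assumed extensions; the real module may use different ones.
CANDIDATE_EXTENSIONS = (".pickle", ".parquet")


def exists_any(base: Path) -> bool:
    """Return True if `base` plus any candidate extension exists on disk."""
    # String concatenation (not Path.with_suffix) keeps names that already
    # contain dots intact.
    return any(Path(f"{base}{ext}").exists() for ext in CANDIDATE_EXTENSIONS)


demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "step-01.parquet").touch()

found = exists_any(demo_dir / "step-01")    # a parquet file is present
missing = exists_any(demo_dir / "step-02")  # nothing was written for this name
```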
from_checkpoint
from_checkpoint(path, func, params, use_checkpoint=True)
Run a function with parameters, using a checkpoint if available.
If a checkpoint exists at the specified path, its contents are read and returned, and the function is not executed.
If no checkpoint exists, the function is executed with the provided parameters, its result is saved to a checkpoint, and the result is returned.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the checkpoint file (without extension). | required |
| `func` | `callable` | The function to execute if the checkpoint does not exist. | required |
| `params` | `dict` | The parameters to pass to the function. | required |
| `use_checkpoint` | `bool` | Whether to use the checkpoint if it exists. Defaults to True. | `True` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | The result of the function execution or the checkpoint data. |
Source code in openavmkit/checkpoint.py
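
The load-or-compute behavior described for `from_checkpoint` can be sketched without the library. Everything below is a hypothetical stand-in: `cached_call` is not the openavmkit function, the `.pickle` extension is assumed, passing `params` as keyword arguments is an assumption suggested by the notebook example, and overwriting the file when `use_checkpoint=False` is a choice made only for this sketch.

```python
import pickle
import tempfile
from pathlib import Path


def cached_call(path, func, params, use_checkpoint=True):
    """Hypothetical stand-in for the load-or-compute pattern."""
    file = Path(f"{path}.pickle")  # assumed extension
    if use_checkpoint and file.exists():
        # Checkpoint found: return the stored result without calling func.
        with file.open("rb") as fh:
            return pickle.load(fh)
    result = func(**params)  # assumption: params are passed as keyword arguments
    with file.open("wb") as fh:
        pickle.dump(result, fh)
    return result


calls = []  # records each real execution of the function


def expensive_sum(a, b):
    calls.append((a, b))
    return a + b


base = Path(tempfile.mkdtemp()) / "demo-step"
first = cached_call(base, expensive_sum, {"a": 1, "b": 2})   # computes and saves
second = cached_call(base, expensive_sum, {"a": 1, "b": 2})  # loads from disk
third = cached_call(base, expensive_sum, {"a": 1, "b": 2}, use_checkpoint=False)  # recomputes
```

In this sketch the stored result is keyed by path alone, so changing `params` does not invalidate an existing file. That is why a stale result has to be cleared explicitly, either by deleting the checkpoint or by bypassing it with `use_checkpoint=False`.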
read_checkpoint
read_checkpoint(path)
Read a checkpoint from the specified path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the checkpoint file (without extension). | required |
Returns:
| Type | Description |
|---|---|
| `Any` | The data read from the checkpoint, which can be a DataFrame or GeoDataFrame. |
Source code in openavmkit/checkpoint.py
read_pickle
read_pickle(path)
Read a pickle file from the specified path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the pickle file (without extension). | required |
Returns:
| Type | Description |
|---|---|
| `Any` | The data read from the pickle file. |
Source code in openavmkit/checkpoint.py
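
The "without extension" convention means the caller passes a base path and the reader supplies the suffix. A minimal sketch of that convention using the standard `pickle` module follows. `load_pickle` is a hypothetical helper and the `.pickle` suffix is an assumption, not something confirmed for openavmkit.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Hypothetical reader: appends an assumed ".pickle" extension to `path`."""
    with open(f"{path}.pickle", "rb") as fh:
        return pickle.load(fh)


# Write a small object first so there is something to read back.
base = Path(tempfile.mkdtemp()) / "settings-snapshot"
original = {"model": "demo", "folds": 5}
with open(f"{base}.pickle", "wb") as fh:
    pickle.dump(original, fh)

restored = load_pickle(base)  # note: no extension in the argument
```

Unpickling can execute arbitrary code, so pickle files should only be loaded from sources you trust.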
write_checkpoint
write_checkpoint(data, path)
Write data to a checkpoint file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `Any` | The data to write to the checkpoint, which can be a DataFrame or GeoDataFrame. | required |
| `path` | `str` | The path to the checkpoint file (without extension). | required |
Source code in openavmkit/checkpoint.py