Directory Manager (evopt.directory_manager)
Warning
This module is part of the internal implementation of evopt and is not intended for direct use by end users. Its API may change without notice.
Directory and file management for evolutionary optimization runs.
This module provides utilities for managing the directory structure, file organization, and checkpointing functionality required for evolutionary optimization runs. It handles creating consistent directory hierarchies, managing unique run identifiers, and providing file paths for saving and retrieving optimization data.
The standard directory structure created is:
evolve_<id>/ # Root directory for a specific optimization run
├── epochs.csv # Aggregated statistics for each epoch
├── results.csv # Individual solution results
├── epochs/ # Directory containing epoch-specific data
│ ├── epoch0000/ # Data for epoch 0
│ │ ├── solution0000/ # Data for solution 0 of epoch 0
│ │ └── solution0001/ # Data for solution 1 of epoch 0
│ └── epoch0001/ # Data for epoch 1
├── checkpoints/ # Saved optimizer state for resumability
│ ├── checkpoint_epoch0000.pkl
│ └── checkpoint_epoch0001.pkl
└── logs/ # Log files for the optimization run
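The naming scheme in the tree above can be sketched as follows. This is a minimal illustration, not the module's implementation: the helper names `epoch_path` and `solution_path` are hypothetical, and the four-digit zero padding is inferred from entries such as `epoch0000` and `solution0001`.

```python
import os


def epoch_path(evolve_dir: str, epoch: int) -> str:
    """Hypothetical helper: folder for one epoch, e.g. .../epochs/epoch0005."""
    # Zero padding to four digits is an assumption read off the tree above.
    return os.path.join(evolve_dir, "epochs", f"epoch{epoch:04d}")


def solution_path(evolve_dir: str, epoch: int, solution: int) -> str:
    """Hypothetical helper: folder for one solution, nested in its epoch folder."""
    return os.path.join(epoch_path(evolve_dir, epoch), f"solution{solution:04d}")


# Build (but do not create) the path for solution 1 of epoch 0 in run 5.
root = os.path.join("opt_results", "evolve_5")
print(solution_path(root, 0, 1))
```

With this format specifier, numbers above 9999 simply produce longer names; how the module itself treats them is not documented here.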
- class evopt.directory_manager.DirectoryManager(base_dir: str, dir_id: int | None = None)
Bases: object
Manages directories for evolutionary optimization runs.
This class handles the creation and management of directory structures for evolutionary optimization runs. It provides methods for directory creation, solution organization, and checkpoint management.
The directory structure is organized hierarchically to separate epochs and solutions, enabling clean organization of optimization results and facilitating analysis and visualization of the optimization process.
- base_dir
Base directory where all optimization runs are stored.
- Type:
str
- dir_id
Unique identifier for this optimization run.
- Type:
int
- evolve_dir
Main directory for this specific optimization run.
- Type:
str
- epochs_csv
Path to the CSV file containing epoch statistics.
- Type:
str
- results_csv
Path to the CSV file containing individual solution results.
- Type:
str
- epochs_dir
Directory containing epoch-specific data.
- Type:
str
- checkpoint_dir
Directory for storing optimizer checkpoints.
- Type:
str
- logs_dir
Directory for log files.
- Type:
str
- logger
Logger instance for this optimization run.
- Type:
Logger
Example
>>> from evopt.directory_manager import DirectoryManager
>>> dm = DirectoryManager("./opt_results", dir_id=5)
>>> dm.setup_directory()  # Creates the directory structure
>>> epoch_folder = dm.create_epoch_folder(10)
>>> solution_folder = dm.create_solution_folder(10, 3)
>>> optimizer_state = {"params": [1.2, 3.4], "generation": 5}
>>> dm.save_checkpoint(optimizer_state, epoch=10)
>>> optimizer_state = dm.load_checkpoint(epoch=10)
- create_epoch_folder(epoch: int) -> str
Create a folder for a specific optimization epoch.
Creates a directory to store all data related to a specific epoch of the optimization process.
- Parameters:
epoch – The epoch number.
- Returns:
The path to the created epoch folder.
- Return type:
str
Example
>>> dm = DirectoryManager("./results")
>>> epoch_path = dm.create_epoch_folder(5)
>>> print(f"Epoch directory: {epoch_path}")
# Output: Epoch directory: ./results/evolve_0/epochs/epoch0005
- create_sample_folder(sample: int) -> str
Create a folder to store data for a specific sample within an exploratory study. The created folder is nested within the corresponding evolve folder.
- Parameters:
sample – The sample number within the exploratory study.
- Returns:
The path to the created sample folder.
- Return type:
str
Example
>>> dm = DirectoryManager("./results")
>>> sample_path = dm.create_sample_folder(7)
>>> print(f"sample directory: {sample_path}")
# Output: sample directory: ./results/evolve_0/samples/sample0007
- create_solution_folder(epoch: int, solution: int) -> str
Create a folder for a specific solution within an epoch.
Creates a directory to store data for a single solution evaluation within a particular epoch. The solution folder is nested within the corresponding epoch folder.
- Parameters:
epoch – The epoch number.
solution – The solution number within the epoch.
- Returns:
The path to the created solution folder.
- Return type:
str
Example
>>> dm = DirectoryManager("./results")
>>> solution_path = dm.create_solution_folder(2, 7)
>>> print(f"Solution directory: {solution_path}")
# Output: Solution directory: ./results/evolve_0/epochs/epoch0002/solution0007
- get_checkpoint_filepath(epoch: int) -> str
Get the filepath for a checkpoint file for a specific epoch.
Constructs the complete file path used to save or load the checkpoint for the specified epoch. The checkpoint number corresponds to the latest completed epoch.
- Parameters:
epoch – The latest complete epoch number.
- Returns:
The full filepath for the checkpoint file.
- Return type:
str
Example
>>> dm = DirectoryManager("./results")
>>> checkpoint_path = dm.get_checkpoint_filepath(15)
>>> print(f"Checkpoint file: {checkpoint_path}")
# Output: Checkpoint file: ./results/evolve_0/checkpoints/checkpoint_epoch0015.pkl
- get_dir_id(dir_id: int | None = None) -> int
Determine an available directory ID.
Finds an appropriate directory ID based on existing directories and the provided ID (if any). This ensures each optimization run has a unique identifier.
Logic:
- If dir_id is provided, use that ID.
- If dir_id is not provided, check the existing evolve directories:
  - If none exist, use ID 0.
  - Otherwise, use the smallest non-negative integer not already in use.
- Parameters:
dir_id – The directory ID to use if provided. Default is None.
- Returns:
An available directory ID.
- Return type:
int
- Raises:
FileNotFoundError – If base_dir does not exist.
Example
>>> # Let the system choose an ID
>>> dm = DirectoryManager("./results")
>>> print(f"Assigned ID: {dm.dir_id}")
>>> # Force a specific ID
>>> dm = DirectoryManager("./results", dir_id=42)
>>> print(f"Using ID: {dm.dir_id}")  # Will be 42
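The ID selection logic described for get_dir_id can be sketched as below. This is an illustration under stated assumptions rather than the module's code: the function name `next_free_dir_id` is hypothetical, and the `evolve_<id>` directory naming is taken from the directory tree at the top of this page.

```python
import os
import re
from typing import Optional


def next_free_dir_id(base_dir: str, dir_id: Optional[int] = None) -> int:
    """Hypothetical sketch: return dir_id if given, else the smallest unused ID."""
    if dir_id is not None:
        return dir_id

    pattern = re.compile(r"^evolve_(\d+)$")
    used = set()
    # os.listdir raises FileNotFoundError when base_dir does not exist.
    for name in os.listdir(base_dir):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(base_dir, name)):
            used.add(int(match.group(1)))

    # Smallest non-negative integer not already used (0 when none exist).
    candidate = 0
    while candidate in used:
        candidate += 1
    return candidate
```

Under this sketch, a base directory holding `evolve_0`, `evolve_1`, and `evolve_3` yields 2, so gaps left by deleted runs are reused before new IDs are appended.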
- load_checkpoint(epoch: int | None = None)
Load a checkpoint file.
Loads and deserializes optimizer state or other data from a checkpoint file. Can either load a specific epoch’s checkpoint or the latest available checkpoint.
- Parameters:
epoch – The specific epoch number to load the checkpoint from. If None, the latest checkpoint will be loaded. Default is None.
- Returns:
The data loaded from the checkpoint file, or None if no checkpoint is found.
- Return type:
Any
- Raises:
pickle.UnpicklingError – If the checkpoint file is corrupted or invalid.
Example
>>> dm = DirectoryManager("./results")
>>> # Load the latest checkpoint
>>> latest_state = dm.load_checkpoint()
>>>
>>> # Load a specific epoch's checkpoint
>>> state = dm.load_checkpoint(epoch=10)
>>> if state is not None:
...     print("Checkpoint loaded successfully")
... else:
...     print("No checkpoint found")
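One way the "latest available checkpoint" lookup could work is sketched here. The function `load_latest_checkpoint` is hypothetical and the `checkpoint_epochNNNN.pkl` file naming is taken from the directory tree at the top of this page; the module's actual selection rule may differ.

```python
import os
import pickle
import re
from typing import Any, Optional

# File naming assumed from the documented tree: checkpoint_epoch0000.pkl, ...
CHECKPOINT_RE = re.compile(r"^checkpoint_epoch(\d+)\.pkl$")


def load_latest_checkpoint(checkpoint_dir: str) -> Optional[Any]:
    """Hypothetical sketch: load the checkpoint with the highest epoch number."""
    if not os.path.isdir(checkpoint_dir):
        return None

    # Map epoch number -> file name for every file matching the pattern.
    by_epoch = {}
    for name in os.listdir(checkpoint_dir):
        match = CHECKPOINT_RE.match(name)
        if match:
            by_epoch[int(match.group(1))] = name

    if not by_epoch:
        return None  # mirrors the documented "None if no checkpoint is found"

    latest_name = by_epoch[max(by_epoch)]
    with open(os.path.join(checkpoint_dir, latest_name), "rb") as fh:
        # May raise pickle.UnpicklingError for a corrupted file.
        return pickle.load(fh)
```

Comparing the parsed epoch numbers rather than the file names keeps the ordering correct even if an epoch number outgrows the four-digit padding. Because `pickle.load` can execute arbitrary code, checkpoints should only be loaded from trusted locations.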
- load_sample_history()
Load the sample history from the results CSV file.
- save_checkpoint(data, epoch: int)
Save a checkpoint file for a specific epoch.
Serializes and saves the optimizer state or other data to a checkpoint file for later resumption of the optimization process.
- Parameters:
data – The data to save in the checkpoint file (typically optimizer state).
epoch – The epoch number.
- Returns:
None
- Raises:
PermissionError – If the file cannot be written due to permission issues.
pickle.PickleError – If the data cannot be serialized.
Example
>>> dm = DirectoryManager("./results")
>>> optimizer_state = {"params": [1.2, 3.4], "generation": 5}
>>> dm.save_checkpoint(optimizer_state, epoch=5)
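A checkpoint write along these lines could be sketched with the standard `pickle` module. The function `save_checkpoint_atomic` is hypothetical, and the write-to-temporary-file-then-rename step is an addition of this sketch for crash safety; nothing above states that the module does this.

```python
import os
import pickle
import tempfile
from typing import Any


def save_checkpoint_atomic(data: Any, checkpoint_dir: str, epoch: int) -> str:
    """Hypothetical sketch: pickle data to checkpoint_epochNNNN.pkl atomically."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    final_path = os.path.join(checkpoint_dir, f"checkpoint_epoch{epoch:04d}.pkl")

    # Write to a temporary file in the same directory, then rename it into
    # place, so an interrupted write never leaves a truncated checkpoint.
    fd, tmp_path = tempfile.mkstemp(dir=checkpoint_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            pickle.dump(data, fh)  # may raise pickle.PickleError
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial temporary file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

The temporary file is created inside `checkpoint_dir` so that `os.replace` stays on one filesystem, which is what makes the rename atomic.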
- setup_directory()
Create the main directory structure for the optimization run.
Establishes the core directory hierarchy needed for an optimization run, including directories for epochs, checkpoints, and logs.
- Returns:
None
Example
>>> dm = DirectoryManager("./results")
>>> dm.setup_directory()  # Creates all necessary directories
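The hierarchy that setup_directory establishes can be approximated with `os.makedirs`. The function `setup_run_directories` is a hypothetical stand-in, and the subdirectory names are taken from the directory tree at the top of this page.

```python
import os
from typing import Dict


def setup_run_directories(base_dir: str, dir_id: int) -> Dict[str, str]:
    """Hypothetical sketch: create the run root plus epochs/checkpoints/logs."""
    evolve_dir = os.path.join(base_dir, f"evolve_{dir_id}")
    paths = {
        "evolve_dir": evolve_dir,
        "epochs_dir": os.path.join(evolve_dir, "epochs"),
        "checkpoint_dir": os.path.join(evolve_dir, "checkpoints"),
        "logs_dir": os.path.join(evolve_dir, "logs"),
    }
    for path in paths.values():
        # exist_ok=True makes the call safe to repeat when resuming a run.
        os.makedirs(path, exist_ok=True)
    return paths
```

The returned keys mirror the attribute names documented for DirectoryManager, which is a choice made for this sketch only.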