Directory Manager (evopt.directory_manager)

Warning

This module is part of the internal implementation of evopt and is not intended for direct use by end users. Its API may change without notice.

Directory and file management for evolutionary optimization runs.

This module provides utilities for managing the directory structure, file organization, and checkpointing functionality required for evolutionary optimization runs. It handles creating consistent directory hierarchies, managing unique run identifiers, and providing file paths for saving and retrieving optimization data.

The standard directory structure created is:

evolve_<id>/                      # Root directory for a specific optimization run
├── epochs.csv                    # Aggregated statistics for each epoch
├── results.csv                   # Individual solution results
├── epochs/                       # Directory containing epoch-specific data
│   ├── epoch0000/                # Data for epoch 0
│   │   ├── solution0000/         # Data for solution 0 of epoch 0
│   │   └── solution0001/         # Data for solution 1 of epoch 0
│   └── epoch0001/                # Data for epoch 1
├── checkpoints/                  # Saved optimizer state for resumability
│   ├── checkpoint_epoch0000.pkl
│   └── checkpoint_epoch0001.pkl
└── logs/                         # Log files for the optimization run
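
The layout above can be reproduced with the standard library alone. The following is a minimal sketch, not the module's actual implementation: `build_run_layout` is a hypothetical helper, and the unpadded `evolve_<id>` naming is inferred from the example paths elsewhere on this page.

```python
import tempfile
from pathlib import Path


def build_run_layout(base_dir: str, dir_id: int) -> Path:
    """Create the evolve_<id> skeleton (hypothetical helper, not part of evopt)."""
    run_dir = Path(base_dir) / f"evolve_{dir_id}"
    # The three subdirectories named in the tree; exist_ok makes the call repeatable.
    for sub in ("epochs", "checkpoints", "logs"):
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    return run_dir


# Work in a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = build_run_layout(tmp, dir_id=0)
    created = sorted(p.name for p in run_dir.iterdir())
    print(created)
```

The CSV files are left out on purpose: the page does not say when they are first written, so the sketch creates only the directories.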
class evopt.directory_manager.DirectoryManager(base_dir: str, dir_id: int | None = None)

Bases: object

Manages directories for evolutionary optimization runs.

This class handles the creation and management of directory structures for evolutionary optimization runs. It provides methods for directory creation, solution organization, and checkpoint management.

The directory structure is organized hierarchically to separate epochs and solutions, enabling clean organization of optimization results and facilitating analysis and visualization of the optimization process.

base_dir

Base directory where all optimization runs are stored.

Type:

str

dir_id

Unique identifier for this optimization run.

Type:

int

evolve_dir

Main directory for this specific optimization run.

Type:

str

epochs_csv

Path to the CSV file containing epoch statistics.

Type:

str

results_csv

Path to the CSV file containing individual solution results.

Type:

str

epochs_dir

Directory containing epoch-specific data.

Type:

str

checkpoint_dir

Directory for storing optimizer checkpoints.

Type:

str

logs_dir

Directory for log files.

Type:

str

logger

Logger instance for this optimization run.

Type:

Logger

Example

>>> dm = DirectoryManager("./opt_results", dir_id=5)
>>> dm.setup_directory()  # Creates the directory structure
>>> epoch_folder = dm.create_epoch_folder(10)
>>> solution_folder = dm.create_solution_folder(10, 3)
>>> optimizer_state = {"params": [1.2, 3.4], "generation": 10}
>>> dm.save_checkpoint(optimizer_state, epoch=10)
>>> optimizer_state = dm.load_checkpoint(epoch=10)
create_epoch_folder(epoch: int) → str

Create a folder for a specific optimization epoch.

Creates a directory to store all data related to a specific epoch of the optimization process.

Parameters:

epoch – The epoch number.

Returns:

The path to the created epoch folder.

Return type:

str

Example

>>> dm = DirectoryManager("./results")
>>> epoch_path = dm.create_epoch_folder(5)
>>> print(f"Epoch directory: {epoch_path}")
# Output: Epoch directory: ./results/evolve_0/epochs/epoch0005
create_sample_folder(sample: int) → str

Create a folder to store data for a specific sample within an exploratory study. The created folder is nested within the corresponding evolve folder.

Parameters:

sample – The sample number within the exploratory study.

Returns:

The path to the created sample folder.

Return type:

str

Example

>>> dm = DirectoryManager("./results")
>>> sample_path = dm.create_sample_folder(7)
>>> print(f"sample directory: {sample_path}")
# Output: sample directory: ./results/evolve_0/samples/sample0007
create_solution_folder(epoch: int, solution: int) → str

Create a folder for a specific solution within an epoch.

Creates a directory to store data for a single solution evaluation within a particular epoch. The solution folder is nested within the corresponding epoch folder.

Parameters:
  • epoch – The epoch number.

  • solution – The solution number within the epoch.

Returns:

The path to the created solution folder.

Return type:

str

Example

>>> dm = DirectoryManager("./results")
>>> solution_path = dm.create_solution_folder(2, 7)
>>> print(f"Solution directory: {solution_path}")
# Output: Solution directory: ./results/evolve_0/epochs/epoch0002/solution0007
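
The nesting and naming can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: `make_solution_folder` is a hypothetical helper, and the four-digit zero padding is inferred from the example paths on this page.

```python
import os
import tempfile


def make_solution_folder(evolve_dir: str, epoch: int, solution: int) -> str:
    """Create <evolve_dir>/epochs/epochNNNN/solutionNNNN (hypothetical helper)."""
    path = os.path.join(
        evolve_dir,
        "epochs",
        f"epoch{epoch:04d}",        # zero-padded to four digits, as in the examples
        f"solution{solution:04d}",
    )
    # makedirs also creates the parent epoch folder if it does not exist yet.
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    solution_path = make_solution_folder(os.path.join(tmp, "evolve_0"), epoch=2, solution=7)
    relative = os.path.relpath(solution_path, tmp)
    created = os.path.isdir(solution_path)
    print(relative.split(os.sep))
```

Zero-padded names keep a plain lexicographic sort in epoch order, at least up to epoch 9999.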
get_checkpoint_filepath(epoch: int) → str

Get the filepath for a checkpoint file for a specific epoch.

Constructs the complete file path for saving or loading a checkpoint for the specified epoch. The checkpoint number corresponds to the latest completed epoch.

Parameters:

epoch – The number of the latest completed epoch.

Returns:

The full filepath for the checkpoint file.

Return type:

str

Example

>>> dm = DirectoryManager("./results")
>>> checkpoint_path = dm.get_checkpoint_filepath(15)
>>> print(f"Checkpoint file: {checkpoint_path}")
# Output: Checkpoint file: ./results/evolve_0/checkpoints/checkpoint_epoch0015.pkl
get_dir_id(dir_id: int | None = None) → int

Determine an available directory ID.

Finds an appropriate directory ID based on existing directories and the provided ID (if any). This ensures each optimization run has a unique identifier.

Logic:

  • If dir_id is provided, use that ID.

  • If not provided, check existing evolve directories and:

    • If none exist, use ID 0.

    • Otherwise, find the smallest non-negative integer not already used.

Parameters:

dir_id – The directory ID to use if provided. Default is None.

Returns:

An available directory ID.

Return type:

int

Raises:

FileNotFoundError – If base_dir does not exist.

Example

>>> # Let the system choose an ID
>>> dm = DirectoryManager("./results")
>>> print(f"Assigned ID: {dm.dir_id}")
>>> # Force a specific ID
>>> dm = DirectoryManager("./results", dir_id=42)
>>> print(f"Using ID: {dm.dir_id}")  # Will be 42
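
The automatic branch of the logic above (no `dir_id` given) can be sketched like this. `next_free_dir_id` is a hypothetical name, and matching directories with the pattern `evolve_<digits>` is an assumption based on the layout shown earlier. The real method may scan differently.

```python
import os
import re
import tempfile


def next_free_dir_id(base_dir: str) -> int:
    """Return the smallest non-negative integer with no evolve_<id> directory."""
    pattern = re.compile(r"^evolve_(\d+)$")
    used = set()
    # os.listdir raises FileNotFoundError when base_dir does not exist.
    for name in os.listdir(base_dir):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(base_dir, name)):
            used.add(int(match.group(1)))
    candidate = 0
    while candidate in used:
        candidate += 1
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    first_id = next_free_dir_id(tmp)      # no runs yet
    for existing in (0, 1, 3):
        os.mkdir(os.path.join(tmp, f"evolve_{existing}"))
    gap_id = next_free_dir_id(tmp)        # 0 and 1 are taken, 2 is the first gap
    print(first_id, gap_id)
```

Because the rule picks the smallest unused integer instead of `max + 1`, deleting an old run directory makes its ID available again.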
load_checkpoint(epoch: int | None = None)

Load a checkpoint file.

Loads and deserializes optimizer state or other data from a checkpoint file. Can either load a specific epoch’s checkpoint or the latest available checkpoint.

Parameters:

epoch – The specific epoch number to load the checkpoint from. If None, the latest checkpoint will be loaded. Default is None.

Returns:

The data loaded from the checkpoint file, or None if no checkpoint is found.

Return type:

Any

Raises:

pickle.UnpicklingError – If the checkpoint file is corrupted or invalid.

Example

>>> dm = DirectoryManager("./results")
>>> # Load the latest checkpoint
>>> latest_state = dm.load_checkpoint()
>>>
>>> # Load a specific epoch's checkpoint
>>> state = dm.load_checkpoint(epoch=10)
>>> if state is not None:
...     print("Checkpoint loaded successfully")
... else:
...     print("No checkpoint found")
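
How the "latest available checkpoint" might be located is not specified here. One plausible approach, sketched below, is to parse the epoch number out of each file name and take the maximum. `load_latest_checkpoint` is a hypothetical helper, and the `checkpoint_epochNNNN.pkl` naming is taken from the directory layout.

```python
import os
import pickle
import re
import tempfile


def load_latest_checkpoint(checkpoint_dir: str):
    """Unpickle the highest-numbered checkpoint file, or return None if there is none."""
    pattern = re.compile(r"^checkpoint_epoch(\d+)\.pkl$")
    by_epoch = {}
    for name in os.listdir(checkpoint_dir):
        match = pattern.match(name)
        if match:
            by_epoch[int(match.group(1))] = name
    if not by_epoch:
        return None
    latest_name = by_epoch[max(by_epoch)]
    with open(os.path.join(checkpoint_dir, latest_name), "rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    empty_result = load_latest_checkpoint(tmp)  # directory has no checkpoints yet
    for epoch in (0, 1, 10):
        with open(os.path.join(tmp, f"checkpoint_epoch{epoch:04d}.pkl"), "wb") as fh:
            pickle.dump({"epoch": epoch}, fh)
    latest_state = load_latest_checkpoint(tmp)
    print(latest_state)
```

Unpickling can execute arbitrary code, so checkpoint files should only be loaded from locations you trust.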
load_sample_history()

Load the sample history from the results CSV file.

save_checkpoint(data, epoch: int)

Save a checkpoint file for a specific epoch.

Serializes and saves the optimizer state or other data to a checkpoint file for later resumption of the optimization process.

Parameters:
  • data – The data to save in the checkpoint file (typically optimizer state).

  • epoch – The epoch number.

Returns:

None

Raises:
  • PermissionError – If the file cannot be written due to permission issues.

  • pickle.PickleError – If the data cannot be serialized.

Example

>>> dm = DirectoryManager("./results")
>>> optimizer_state = {"params": [1.2, 3.4], "generation": 5}
>>> dm.save_checkpoint(optimizer_state, epoch=5)
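
A save followed by a reload can be sketched with `pickle` directly. This is a minimal illustration, assuming the file naming from the directory layout. `save_checkpoint_sketch` is a hypothetical stand-in and does not reproduce the method's logging or error handling.

```python
import os
import pickle
import tempfile


def save_checkpoint_sketch(checkpoint_dir: str, data, epoch: int) -> str:
    """Pickle data to checkpoint_epochNNNN.pkl and return the file path."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    path = os.path.join(checkpoint_dir, f"checkpoint_epoch{epoch:04d}.pkl")
    with open(path, "wb") as fh:
        pickle.dump(data, fh)
    return path


with tempfile.TemporaryDirectory() as tmp:
    state = {"params": [1.2, 3.4], "generation": 5}
    saved_path = save_checkpoint_sketch(os.path.join(tmp, "checkpoints"), state, epoch=5)
    # Read the file back to confirm the state survives the round trip.
    with open(saved_path, "rb") as fh:
        restored = pickle.load(fh)
    saved_name = os.path.basename(saved_path)
    print(saved_name, restored == state)
```

Only picklable objects can be stored this way. State that holds open file handles, lambdas, or similar objects would need to be converted first.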
setup_directory()

Create the main directory structure for the optimization run.

Establishes the core directory hierarchy needed for an optimization run, including directories for epochs, checkpoints, and logs.

Returns:

None

Example

>>> dm = DirectoryManager("./results")
>>> dm.setup_directory()  # Creates all necessary directories