Option 1: Run data generation using the interactive interface
To use the interactive interface, either open scripts/interactive_interface.ipynb
or copy the following into a Jupyter notebook and follow the instructions:
from gridfm_datakit.interactive import interactive_interface
interactive_interface()
Option 2: Using the command line interface
Run the data generation routine from the command line:
gridfm_datakit path/to/config.yaml
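For scripted or batch runs, the same command can be launched from Python. The following is a minimal sketch, assuming the gridfm_datakit executable is available on PATH; the helper names build_command and run_datagen are hypothetical and not part of the package.

```python
import subprocess
from pathlib import Path


def build_command(config_path):
    """Return the argument list for the CLI call shown above."""
    return ["gridfm_datakit", str(Path(config_path))]


def run_datagen(config_path):
    """Run data generation; raises CalledProcessError on a non-zero exit code."""
    # Assumption: the `gridfm_datakit` executable is on PATH.
    return subprocess.run(build_command(config_path), check=True)
```

Calling run_datagen("path/to/config.yaml") is then equivalent to typing the command in a shell, with the added benefit that a failed run raises an exception in the calling script.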
Configuration Overview
Refer to the sections Network, Load Scenarios, and Topology perturbations for a description of the configuration parameters.
Sample configuration files are provided in scripts/config, e.g. default.yaml:
network:
  name: "case24_ieee_rts" # Name of the power grid network (without extension)
  source: "pglib" # Data source for the grid; options: pglib, pandapower, file
  network_dir: "scripts/grids" # if using source "file", this is the directory containing the network file (relative to the project root)
load:
  generator: "agg_load_profile" # Name of the load generator; options: agg_load_profile, powergraph
  agg_profile: "default" # Name of the aggregated load profile
  scenarios: 200 # Number of different load scenarios to generate
  # WARNING: the following parameters are only used if generator is "agg_load_profile"
  # if using generator "powergraph", these parameters are ignored
  sigma: 0.05 # max local noise
  change_reactive_power: true # If true, changes reactive power of loads. If False, keeps the ones from the case file
  global_range: 0.4 # Range of the global scaling factor. used to set the lower bound of the scaling factor
  max_scaling_factor: 4.0 # Max upper bound of the global scaling factor
  step_size: 0.025 # Step size when finding the upper bound of the global scaling factor
  start_scaling_factor: 0.8 # Initial value of the global scaling factor
topology_perturbation:
  type: "random" # Type of topology generator; options: n_minus_k, random, none
  # WARNING: the following parameters are only used if type is not "none"
  k: 1 # Maximum number of components to drop in each perturbation
  n_topology_variants: 5 # Number of unique perturbed topologies per scenario
  elements: ["line", "trafo", "gen", "sgen"] # elements to perturb options: line, trafo, gen, sgen
settings:
  num_processes: 10 # Number of parallel processes to use
  data_dir: "./data_out" # Directory to save generated data relative to the project root
  large_chunk_size: 50 # Number of load scenarios processed before saving
  no_stats: false # If true, disables statistical calculations
  overwrite: true # If true, overwrites existing files, if false, appends to files (note that bus_params.csv, edge_params.csv, scenarios_{load.generator}.csv and scenarios_{load.generator}.html will still be overwritten)
  mode: "pf" # Mode of the script; options: contingency, pf
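Before starting a long run, it can help to sanity-check a configuration and estimate its size. The sketch below works on a plain dict with the same keys as default.yaml, for instance the result of loading the file with a YAML parser. The allowed values are copied from the comments above. The helper names and the workload formulas (scenarios times n_topology_variants cases, one save per large_chunk_size scenarios) are assumptions drawn from those comments, not the package's own validation logic.

```python
import math

# Allowed values, taken from the comments in default.yaml.
ALLOWED = {
    ("network", "source"): {"pglib", "pandapower", "file"},
    ("load", "generator"): {"agg_load_profile", "powergraph"},
    ("topology_perturbation", "type"): {"n_minus_k", "random", "none"},
    ("settings", "mode"): {"contingency", "pf"},
}
ALLOWED_ELEMENTS = {"line", "trafo", "gen", "sgen"}


def check_options(cfg):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for (section, key), allowed in ALLOWED.items():
        value = cfg[section][key]
        if value not in allowed:
            problems.append(f"{section}.{key}={value!r} not in {sorted(allowed)}")
    topo = cfg["topology_perturbation"]
    if topo["type"] != "none":
        unknown = set(topo["elements"]) - ALLOWED_ELEMENTS
        if unknown:
            problems.append(f"unknown elements: {sorted(unknown)}")
    return problems


def estimate_workload(cfg):
    """Estimate (number of cases, number of save chunks)."""
    scenarios = cfg["load"]["scenarios"]
    topo = cfg["topology_perturbation"]
    # Assumption: with type "none", each scenario yields exactly one topology.
    variants = 1 if topo["type"] == "none" else topo["n_topology_variants"]
    chunks = math.ceil(scenarios / cfg["settings"]["large_chunk_size"])
    return scenarios * variants, chunks


# Subset of default.yaml that the two helpers need.
cfg = {
    "network": {"name": "case24_ieee_rts", "source": "pglib"},
    "load": {"generator": "agg_load_profile", "scenarios": 200},
    "topology_perturbation": {
        "type": "random",
        "k": 1,
        "n_topology_variants": 5,
        "elements": ["line", "trafo", "gen", "sgen"],
    },
    "settings": {"large_chunk_size": 50, "mode": "pf"},
}
print(check_options(cfg))      # []
print(estimate_workload(cfg))  # (1000, 4)
```

With the default values, 200 load scenarios and 5 topology variants per scenario give at most 1000 cases, saved in 4 chunks of 50 scenarios. Treat the case count as an upper bound, since failed cases are logged to error.log rather than written to the data files.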
Output Files
The data generation process produces several output files in the specified data directory:
- tqdm.log: Progress bar log.
- error.log: Log of the errors raised during data generation.
- args.log: Copy of the config file used.
- pf_node.csv: Data related to the nodes (buses) in the network, such as voltage levels and power injections.
- pf_edge.csv: Branch admittance matrix for each pf case.
- branch_idx_removed.csv: List of the indices of the branches (lines and transformers) that got removed when perturbing the topologies.
- edge_params.csv: Branch admittance matrix and branch rate limits for the unperturbed topology.
- bus_params.csv: Parameters for the buses (voltage limits and the base voltage).
- scenario_{args.load.generator}.csv: Element-level load profile produced by the load scenario generator.
- scenario_{args.load.generator}.html: Plots of the element-level load profile.
- scenario_{args.load.generator}.log: If generator is "agg_load_profile", stores the upper and lower bounds for the global scaling factor.
- stats.csv: Stats about the generated data.
- stats_plot.html: Plots of the stats about the generated data.
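After a run, a quick check that every expected file was written can catch silent failures early. The sketch below uses only the file names from the list above. The helper names are hypothetical, and two behaviours are assumptions rather than documented facts: that the scenario .log file is only written for the "agg_load_profile" generator, and that the two stats files are skipped when no_stats is true. The output directory is passed explicitly because the exact folder layout under data_dir is not described here.

```python
from pathlib import Path


def expected_outputs(generator="agg_load_profile", no_stats=False):
    """List the output file names described above for a given load generator."""
    names = [
        "tqdm.log", "error.log", "args.log",
        "pf_node.csv", "pf_edge.csv", "branch_idx_removed.csv",
        "edge_params.csv", "bus_params.csv",
        f"scenario_{generator}.csv", f"scenario_{generator}.html",
    ]
    # Assumption: the bounds log exists only for the aggregated load profile generator.
    if generator == "agg_load_profile":
        names.append(f"scenario_{generator}.log")
    # Assumption: stats files are not produced when no_stats is true.
    if not no_stats:
        names += ["stats.csv", "stats_plot.html"]
    return names


def missing_outputs(out_dir, generator="agg_load_profile", no_stats=False):
    """Return the expected file names that are not present in out_dir."""
    out_dir = Path(out_dir)
    return [
        name
        for name in expected_outputs(generator, no_stats)
        if not (out_dir / name).is_file()
    ]
```

An empty list from missing_outputs(...) means all listed files are present; it says nothing about their contents, so error.log is still worth inspecting.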