Configuration
Each pipeline layer accepts an optional config object. Defaults work well for most cases, but all parameters can be tuned.
Full Example
```python
import surrox

# `problem` and `df` are assumed to exist already (the problem definition and the input data).
result, analyzer = surrox.run(
    problem=problem,
    dataframe=df,
    surrogate_config=surrox.TrainingConfig(
        n_trials=100,
        ensemble_size=7,
        cv_folds=10,
        default_coverage=0.95,
    ),
    optimizer_config=surrox.OptimizerConfig(
        population_size=200,
        n_generations=500,
    ),
    analysis_config=surrox.AnalysisConfig(
        shap_background_size=200,
        pdp_grid_resolution=100,
    ),
)
```
TrainingConfig
Controls surrogate model training: HPO budget, ensemble construction, and conformal calibration.
| Parameter | Default | Description |
|---|---|---|
| `n_trials` | 50 | Optuna HPO trials per target column |
| `cv_folds` | 5 | Cross-validation folds |
| `ensemble_size` | 5 | Maximum models in the ensemble |
| `calibration_fraction` | 0.2 | Data fraction held out for conformal calibration |
| `default_coverage` | 0.9 | Conformal prediction interval coverage |
| `study_timeout_s` | 300 | Optuna study timeout in seconds |
| `min_r2` | 0.7 | Minimum R² quality threshold (None to disable) |
| `random_seed` | 42 | Random seed |
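To make `calibration_fraction` and `default_coverage` more concrete, the following is a minimal sketch of split conformal calibration in plain Python. It assumes absolute residuals as the nonconformity score and a symmetric interval, which may differ from what surrox does internally. The helper names `calibration_size` and `conformal_halfwidth` are hypothetical and not part of the surrox API.

```python
import math


def calibration_size(n_samples: int, calibration_fraction: float = 0.2) -> int:
    """Rows held out for calibration (assumption: rounded down)."""
    return int(n_samples * calibration_fraction)


def conformal_halfwidth(residuals: list[float], coverage: float = 0.9) -> float:
    """Half-width of a symmetric split-conformal interval.

    Uses the finite-sample rank ceil((n + 1) * coverage) over the sorted
    absolute calibration residuals. If that rank exceeds n, the calibration
    set is too small for the requested coverage and the width is infinite.
    """
    n = len(residuals)
    rank = math.ceil((n + 1) * coverage)
    if rank > n:
        return math.inf
    return sorted(abs(r) for r in residuals)[rank - 1]


# 1000 rows with the default 20% hold-out.
n_cal = calibration_size(1000)

# Toy calibration residuals 1.0, 2.0, ..., 20.0.
residuals = [float(i) for i in range(1, 21)]
halfwidth_90 = conformal_halfwidth(residuals, coverage=0.9)
halfwidth_95 = conformal_halfwidth(residuals, coverage=0.95)
```

Under these assumptions, raising `default_coverage` widens the interval, and a small calibration set limits the highest coverage that can be achieved with a finite width.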
FeatureReductionConfig
Controls automatic feature reduction (importance screening + correlation grouping). Nested inside TrainingConfig.
| Parameter | Default | Description |
|---|---|---|
| `enabled` | `True` | Enable automatic feature reduction |
| `importance_threshold` | 0.01 | Minimum relative importance to keep a feature (XGBoost-based screening) |
| `correlation_threshold` | 0.9 | Absolute correlation above which features are grouped via PCA |
Feature reduction is skipped when there are fewer than 10 features or fewer than 100 samples. Features involved in monotonic constraints are never dropped or grouped.
```python
surrox.TrainingConfig(
    feature_reduction=surrox.FeatureReductionConfig(
        enabled=True,
        importance_threshold=0.02,
        correlation_threshold=0.85,
    ),
)
```
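The skip rule and the protection of monotonic features can be sketched as follows. This is an illustration only: `should_reduce_features` and `droppable_features` are hypothetical helpers, and the sketch assumes that "relative importance" means importance normalized by the sum over all features.

```python
def should_reduce_features(n_features: int, n_samples: int, enabled: bool = True) -> bool:
    """Reduction is skipped with fewer than 10 features or fewer than 100 samples."""
    if not enabled:
        return False
    return n_features >= 10 and n_samples >= 100


def droppable_features(
    importances: dict[str, float],
    monotonic: set[str],
    importance_threshold: float = 0.01,
) -> list[str]:
    """Features whose normalized importance falls below the threshold.

    Features under a monotonic constraint are never returned.
    """
    total = sum(importances.values())
    if total == 0:
        return []
    return [
        name
        for name, value in importances.items()
        if name not in monotonic and value / total < importance_threshold
    ]


# Toy importances: "c" is weak, but protected when it carries a monotonic constraint.
toy = {"a": 0.9, "b": 0.095, "c": 0.005}
unprotected = droppable_features(toy, monotonic=set())
protected = droppable_features(toy, monotonic={"c"})
```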
See TrainingConfig for the full API.
OptimizerConfig
Controls the optimization strategy. The optimizer auto-selects between a global surrogate strategy (pymoo) for low-dimensional problems and a trust region strategy (TuRBO) for high-dimensional problems.
| Parameter | Default | Description |
|---|---|---|
| `strategy` | `None` | GLOBAL_SURROGATE, TRUST_REGION, or None (auto-select based on dim_threshold) |
| `dim_threshold` | 15 | Decision variable count above which TuRBO is auto-selected |
| `population_size` | 100 | Population size for pymoo (global strategy only) |
| `n_generations` | 200 | Number of generations for pymoo (global strategy only) |
| `extrapolation_k` | 5 | k-NN neighbors for extrapolation detection |
| `extrapolation_threshold` | 2.0 | Distance threshold for extrapolation flag |
| `constraint_confidence` | 0.95 | Conformal confidence for constraint evaluation |
| `seed` | 42 | Random seed |
| `turbo` | `TuRBOConfig()` | TuRBO-specific configuration (trust region strategy only) |
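The auto-selection rule driven by `strategy` and `dim_threshold` can be pictured roughly like this. The `Strategy` enum and `select_strategy` function below are stand-ins written for this sketch, not the surrox types, and the sketch assumes that "above" means strictly greater than the threshold.

```python
from enum import Enum
from typing import Optional


class Strategy(Enum):
    """Stand-in for the two strategy names listed in the table."""

    GLOBAL_SURROGATE = "global_surrogate"
    TRUST_REGION = "trust_region"


def select_strategy(
    n_decision_variables: int,
    strategy: Optional[Strategy] = None,
    dim_threshold: int = 15,
) -> Strategy:
    """Return the explicit strategy if given, otherwise pick by dimensionality."""
    if strategy is not None:
        return strategy
    if n_decision_variables > dim_threshold:  # assumption: strict comparison
        return Strategy.TRUST_REGION
    return Strategy.GLOBAL_SURROGATE


low_dim = select_strategy(15)
high_dim = select_strategy(16)
forced = select_strategy(30, strategy=Strategy.GLOBAL_SURROGATE)
```

Setting `strategy` explicitly bypasses the threshold entirely, which is useful when a high-dimensional problem is still cheap enough for the global strategy.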
TuRBOConfig
| Parameter | Default | Description |
|---|---|---|
| `n_initial` | `None` | Initial Sobol points (None = 2 × n_decision_variables) |
| `max_evaluations` | 500 | Total evaluation budget |
| `batch_size` | 1 | Candidates per iteration |
| `length_init` | 0.8 | Initial trust region side length in [0,1]^d |
| `length_min` | 0.0078125 | Minimum TR length before restart |
| `length_max` | 1.6 | Maximum TR length |
| `success_tolerance` | 3 | Consecutive successes before TR expansion |
| `failure_tolerance` | `None` | Consecutive failures before TR shrinkage (None = ceil(dim / batch_size)) |
| `n_restarts` | 3 | Maximum TR restarts before termination |
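The two `None` defaults in the table resolve from the problem dimension. A small sketch of that arithmetic, using a hypothetical helper `resolve_turbo_defaults` that is not part of the surrox API:

```python
import math
from typing import Optional


def resolve_turbo_defaults(
    n_decision_variables: int,
    n_initial: Optional[int] = None,
    failure_tolerance: Optional[int] = None,
    batch_size: int = 1,
) -> tuple[int, int]:
    """Fill in the None defaults as described in the table above."""
    if n_initial is None:
        # None = 2 x n_decision_variables
        n_initial = 2 * n_decision_variables
    if failure_tolerance is None:
        # None = ceil(dim / batch_size)
        failure_tolerance = math.ceil(n_decision_variables / batch_size)
    return n_initial, failure_tolerance


# The default length_min is exactly 2^-7.
LENGTH_MIN_DEFAULT = 2 ** -7

sequential = resolve_turbo_defaults(20)             # batch_size=1
batched = resolve_turbo_defaults(20, batch_size=8)  # larger batches shrink sooner
explicit = resolve_turbo_defaults(20, n_initial=10, failure_tolerance=5)
```

With 20 decision variables, the sequential case gives 40 initial points and a failure tolerance of 20. A batch size of 8 keeps the 40 initial points but lowers the failure tolerance to 3.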
See OptimizerConfig for the full API.
AnalysisConfig
Controls the post-optimization analysis.
| Parameter | Default | Description |
|---|---|---|
| `shap_background_size` | 100 | Background samples for SHAP |
| `pdp_grid_resolution` | 50 | Grid points for PDP/ICE |
| `pdp_percentiles` | (0.05, 0.95) | Grid range percentile bounds |
| `monotonicity_check_resolution` | 50 | Grid resolution for monotonicity checks |
| `random_seed` | 42 | Random seed |
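One plausible reading of how `pdp_percentiles` and `pdp_grid_resolution` interact is sketched below: the grid spans the chosen percentiles of a feature and is divided into evenly spaced points. Even spacing and linear-interpolation percentiles are assumptions of this sketch, and `pdp_grid` is a hypothetical helper rather than a surrox function.

```python
import math


def pdp_grid(
    values: list[float],
    percentiles: tuple[float, float] = (0.05, 0.95),
    resolution: int = 50,
) -> list[float]:
    """Evenly spaced grid between two percentiles of a feature column."""
    if resolution < 2:
        raise ValueError("resolution must be at least 2")
    ordered = sorted(values)

    def quantile(q: float) -> float:
        # Linear interpolation between the two closest order statistics.
        pos = q * (len(ordered) - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    low, high = quantile(percentiles[0]), quantile(percentiles[1])
    step = (high - low) / (resolution - 1)
    return [low + i * step for i in range(resolution)]


# Toy feature with values 0, 1, ..., 100.
grid = pdp_grid([float(v) for v in range(101)])
```

Clipping the range to the 5th and 95th percentiles keeps the grid away from sparse tails, where surrogate predictions are least reliable.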
See AnalysisConfig for the full API.