MOEA/D Searcher example
This example shows how to use MOEADSearcher for multi-objective optimization.
1. Import modules and prepare data
[1]:
from hypernets.core.random_state import set_random_state
set_random_state(1234)
from hypernets.utils import logging as hyn_logging
from hypernets.examples.plain_model import PlainModel, PlainSearchSpace
from hypernets.searchers.nsga_searcher import RNSGAIISearcher
from hypergbm import make_experiment
from hypernets.tabular import get_tool_box
from hypernets.tabular.datasets import dsutils
from hypernets.tabular.sklearn_ex import MultiLabelEncoder
hyn_logging.set_level(hyn_logging.WARN)
df = dsutils.load_bank().head(1000)
tb = get_tool_box(df)
df_train, df_test = tb.train_test_split(df, test_size=0.2, random_state=9527)
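The target column `y` is a binary label whose positive class is `"yes"` (it is passed as `pos_label` when scoring later). As an optional sanity check, plain pandas can show the class balance:
[ ]:
# Optional: inspect the class balance of the binary target 'y';
# the positive class "yes" is used as pos_label in later cells.
df_train['y'].value_counts()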
2. Run an experiment with MOEADSearcher
[2]:
experiment = make_experiment(df_train,
eval_data=df_test.copy(),
callbacks=[],
random_state=1234,
search_callbacks=[],
target='y',
searcher='moead', # available MOO searchers: moead, nsga2, rnsga2
reward_metric='logloss',
objectives=['nf'],
drift_detection=False,
early_stopping_rounds=30)
estimators = experiment.run(max_trials=30)
hyper_model = experiment.hyper_model_
hyper_model.searcher
[2]:
MOEADSearcher(objectives=[PredictionObjective(name=logloss, scorer=make_scorer(log_loss, needs_proba=True), direction=min), NumOfFeatures(name=nf, sample_size=1000, direction=min)], n_neighbors=2, recombination=SinglePointCrossOver(random_state=RandomState(MT19937)), mutation=SinglePointMutation(random_state=RandomState(MT19937), proba=0.7), population_size=6)
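The repr above shows the two objectives being minimized: logloss and nf (the number-of-features objective). As a sketch, assuming the searcher exposes the `objectives` list shown in the repr (check your hypernets version), you can inspect each objective's name and direction:
[ ]:
# Sketch: list each objective's name and optimization direction.
# Assumes `searcher.objectives` is accessible as the repr suggests.
for obj in hyper_model.searcher.objectives:
    print(obj.name, obj.direction)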
3. Summarize trials
[3]:
df_trials = hyper_model.history.to_df().copy().drop(['scores', 'reward'], axis=1)
df_trials[df_trials['non_dominated'] == True]
[3]:
|    | trial_no | succeeded | elapsed  | non_dominated | model_index | reward_logloss | reward_nf |
|----|----------|-----------|----------|---------------|-------------|----------------|-----------|
| 4  | 5        | True      | 0.446323 | True          | 0.0         | 0.217409       | 0.625     |
| 6  | 7        | True      | 4.100305 | True          | 1.0         | 0.537368       | 0.0       |
| 8  | 9        | True      | 4.796208 | True          | 2.0         | 0.253515       | 0.125     |
| 9  | 10       | True      | 1.060251 | True          | 3.0         | 0.246395       | 0.5625    |
| 22 | 30       | True      | 0.366623 | True          | 4.0         | 0.177716       | 0.75      |
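To shortlist candidates from the Pareto set, the non-dominated trials can be ranked by one objective with plain pandas (column names as in the table above):
[ ]:
# Rank Pareto-optimal trials by logloss (lower is better);
# reward_nf, the fraction of features used, is also minimized.
pareto = df_trials[df_trials['non_dominated'] == True]
pareto.sort_values('reward_logloss')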
4. Plot Pareto front
We can pick a model according to the decision maker's preferences from the Pareto plot; the numbers in the figure indicate the indices of the pipeline models.
[4]:
fig, ax = hyper_model.history.plot_best_trials()
fig.show()
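Assuming the returned `fig` is a Matplotlib figure (the `fig, ax` pair suggests so), the Pareto plot can also be saved to disk:
[ ]:
# Sketch: persist the Pareto-front plot; the filename is illustrative.
fig.savefig('pareto_front.png', dpi=150, bbox_inches='tight')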

5. Plot population
[5]:
fig, ax = hyper_model.searcher.plot_population()
fig.show()

6. Evaluate the selected model
[6]:
print(f"Number of pipeline: {len(estimators)} ")
pipeline_model = estimators[0] # selection the first pipeline model
X_test = df_test.copy()
y_test = X_test.pop('y')
preds = pipeline_model.predict(X_test)
proba = pipeline_model.predict_proba(X_test)
tb.metrics.calc_score(y_test, preds, proba, metrics=['auc', 'accuracy', 'f1', 'recall', 'precision'], pos_label="yes")
Number of pipelines: 5
[6]:
{'auc': 0.8417038690476191,
'accuracy': 0.855,
'f1': 0.17142857142857143,
'recall': 0.09375,
'precision': 1.0}
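The first pipeline has high precision but low recall; the other Pareto-optimal pipelines trade these off differently. A sketch that scores every returned pipeline with the same calls used above:
[ ]:
# Evaluate each Pareto-optimal pipeline on the hold-out set.
for i, model in enumerate(estimators):
    preds = model.predict(X_test)
    proba = model.predict_proba(X_test)
    scores = tb.metrics.calc_score(y_test, preds, proba,
                                   metrics=['auc', 'accuracy', 'f1'],
                                   pos_label="yes")
    print(f"pipeline {i}: {scores}")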
7. Automatically convert metrics to negatives for minimization
All objectives are minimized internally, so metrics whose natural direction is maximize (such as accuracy and precision) are negated automatically, and their rewards appear as negative values in the trial history.
[7]:
experiment = make_experiment(df_train,
eval_data=df_test.copy(),
callbacks=[],
random_state=1234,
search_callbacks=[],
target='y',
pos_label="yes",
searcher='moead',
reward_metric='accuracy',
objectives=['precision'],
drift_detection=False,
early_stopping_rounds=30)
estimators = experiment.run(max_trials=30)
hyper_model = experiment.hyper_model_
hyper_model.history.to_df().copy().drop(['scores', 'reward'], axis=1)[:5]
[7]:
|   | trial_no | succeeded | elapsed  | non_dominated | model_index | reward_accuracy | reward_precision |
|---|----------|-----------|----------|---------------|-------------|-----------------|------------------|
| 0 | 1        | True      | 0.645476 | False         | NaN         | -0.91           | -0.777778        |
| 1 | 2        | True      | 0.752835 | False         | NaN         | -0.8975         | -0.0             |
| 2 | 3        | True      | 0.779298 | False         | NaN         | -0.8975         | -0.0             |
| 3 | 4        | True      | 0.737511 | False         | NaN         | -0.905          | -0.625           |
| 4 | 5        | True      | 0.498275 | False         | NaN         | -0.90625        | -0.733333        |
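The negative values are the negated rewards of the maximize-direction metrics. To read them back in their natural direction, negate the reward columns (plain pandas):
[ ]:
# Recover the original metric values from the negated rewards.
df_moo = hyper_model.history.to_df()
df_moo[['reward_accuracy', 'reward_precision']] = \
    -df_moo[['reward_accuracy', 'reward_precision']]
df_moo[['reward_accuracy', 'reward_precision']].head()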