MOEA/D Searcher example

This example shows how to use MOEADSearcher for multi-objective optimization.

1. Import modules and prepare data

[1]:
from hypernets.core.random_state import set_random_state
set_random_state(1234)

from hypernets.utils import logging as hyn_logging
from hypernets.examples.plain_model import PlainModel, PlainSearchSpace

from hypergbm import make_experiment

from hypernets.tabular import get_tool_box
from hypernets.tabular.datasets import dsutils
from hypernets.tabular.sklearn_ex import MultiLabelEncoder


hyn_logging.set_level(hyn_logging.WARN)

df = dsutils.load_bank().head(1000)
tb = get_tool_box(df)
df_train, df_test = tb.train_test_split(df, test_size=0.2, random_state=9527)

2. Run an experiment with MOEADSearcher

[2]:
experiment = make_experiment(df_train,
                             eval_data=df_test.copy(),
                             callbacks=[],
                             random_state=1234,
                             search_callbacks=[],
                             target='y',
                             searcher='moead',  # available MOO searchers: moead, nsga2, rnsga2
                             reward_metric='logloss',
                             objectives=['nf'],  # 'nf': number of features (NumOfFeatures)
                             drift_detection=False,
                             early_stopping_rounds=30)

estimators = experiment.run(max_trials=30)
hyper_model = experiment.hyper_model_
hyper_model.searcher
[2]:
MOEADSearcher(objectives=[PredictionObjective(name=logloss, scorer=make_scorer(log_loss, needs_proba=True), direction=min), NumOfFeatures(name=nf, sample_size=1000, direction=min)], n_neighbors=2, recombination=SinglePointCrossOver(random_state=RandomState(MT19937)), mutation=SinglePointMutation(random_state=RandomState(MT19937), proba=0.7), population_size=6)

3. Summarize trials

[3]:
df_trials = hyper_model.history.to_df().copy().drop(['scores', 'reward'], axis=1)
df_trials[df_trials['non_dominated'] == True]
[3]:
    trial_no  succeeded   elapsed  non_dominated  model_index  reward_logloss  reward_nf
4          5       True  0.446323           True          0.0        0.217409     0.625
6          7       True  4.100305           True          1.0        0.537368     0.0
8          9       True  4.796208           True          2.0        0.253515     0.125
9         10       True  1.060251           True          3.0        0.246395     0.5625
22        30       True  0.366623           True          4.0        0.177716     0.75
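
To inspect the trade-off numerically, the Pareto-optimal trials can also be ordered by one of the objectives, for example by logloss (a minimal sketch over the df_trials DataFrame built above):

# Sketch: order the Pareto-optimal trials by logloss (lower is better)
pareto = df_trials[df_trials['non_dominated'] == True]
pareto.sort_values('reward_logloss')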

4. Plot Pareto front

We can pick a model according to the decision maker's preferences from the Pareto plot; the numbers in the figure indicate the indices of the pipeline models.

[4]:
fig, ax = hyper_model.history.plot_best_trials()
fig.show()
../_images/examples_63.MOEAD_example_8_0.png
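
The model_index column of the trial history ties the plotted numbers to the pipelines returned by experiment.run. A minimal sketch, assuming model_index values map to positions in the estimators list:

# Sketch: pick the pipeline behind one of the plotted Pareto-optimal points
df_best = hyper_model.history.to_df()
df_best = df_best[df_best['non_dominated'] == True]
picked = estimators[int(df_best['model_index'].iloc[0])]  # the model labeled "0" in the figure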

5. Plot population

[5]:
fig, ax = hyper_model.searcher.plot_population()
fig.show()
../_images/examples_63.MOEAD_example_10_0.png

6. Evaluate the selected model

[6]:
print(f"Number of pipeline: {len(estimators)} ")

pipeline_model = estimators[0]  # select the first pipeline model
X_test = df_test.copy()
y_test = X_test.pop('y')

preds = pipeline_model.predict(X_test)
proba = pipeline_model.predict_proba(X_test)

tb.metrics.calc_score(y_test, preds, proba, metrics=['auc', 'accuracy', 'f1', 'recall', 'precision'], pos_label="yes")
Number of pipelines: 5
[6]:
{'auc': 0.8417038690476191,
 'accuracy': 0.855,
 'f1': 0.17142857142857143,
 'recall': 0.09375,
 'precision': 1.0}
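
If no single preference stands out, every returned pipeline can be scored on the hold-out set for comparison (a sketch reusing the calls from the cell above):

# Sketch: score each Pareto-optimal pipeline on the hold-out set
for i, model in enumerate(estimators):
    scores = tb.metrics.calc_score(y_test, model.predict(X_test),
                                   model.predict_proba(X_test),
                                   metrics=['auc', 'accuracy'], pos_label="yes")
    print(f"pipeline {i}: {scores}")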

7. Automatically convert metrics to negatives for minimization

The MOO searchers minimize every objective, so metrics that are normally maximized (such as accuracy and precision) are automatically negated in the trial history, as the next cell shows.

[7]:
experiment = make_experiment(df_train,
                             eval_data=df_test.copy(),
                             callbacks=[],
                             random_state=1234,
                             search_callbacks=[],
                             target='y',
                             pos_label="yes",
                             searcher='moead',
                             reward_metric='accuracy',
                             objectives=['precision'],
                             drift_detection=False,
                             early_stopping_rounds=30)

estimators = experiment.run(max_trials=30)
hyper_model = experiment.hyper_model_
hyper_model.history.to_df().copy().drop(['scores', 'reward'], axis=1)[:5]
[7]:
   trial_no  succeeded   elapsed  non_dominated  model_index  reward_accuracy  reward_precision
0         1       True  0.645476          False          NaN            -0.91         -0.777778
1         2       True  0.752835          False          NaN          -0.8975         -0.0
2         3       True  0.779298          False          NaN          -0.8975         -0.0
3         4       True  0.737511          False          NaN           -0.905         -0.625
4         5       True  0.498275          False          NaN         -0.90625         -0.733333
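
Because both rewards are stored negated, the original metric values can be recovered by flipping the sign (a minimal sketch over the history DataFrame shown above):

# Sketch: recover the positive metric values from the negated rewards
df_m = hyper_model.history.to_df().copy()
df_m['accuracy'] = -df_m['reward_accuracy']
df_m['precision'] = -df_m['reward_precision']
df_m[['trial_no', 'accuracy', 'precision']][:5]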