Use HyperGBM with Python

HyperGBM is developed in Python. We recommend using the Python utility make_experiment to create an experiment and train the model.

The basic steps for training the model with make_experiment are as follows:

  • Prepare the dataset (a pandas or dask DataFrame)

  • Create an experiment with make_experiment

  • Call the .run() method of the experiment to train and get the model

  • Predict with the trained model or save it with the Python tool pickle

Prepare the dataset

Load the training data as a pandas or dask DataFrame, whichever suits your task; either can be used to train the model.

For example, the sklearn dataset breast_cancer can be loaded and split as follows:

import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_breast_cancer(as_frame=True, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=335)
train_data = pd.concat([X_train, y_train], axis=1)

where train_data is used for model training, while X_test and y_test are used for evaluating the model.
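As a quick sanity check of the split above, the sketch below repeats the same loading steps and prints the shape of train_data, the name of its last column, and the size of the test set. It relies only on pandas and sklearn. The label column produced by load_breast_cancer is named target, which is the column name the experiment needs to be told about later.

```python
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Same loading and splitting steps as in the example above.
X, y = datasets.load_breast_cancer(as_frame=True, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=335)
train_data = pd.concat([X_train, y_train], axis=1)

# 30 feature columns plus the label column; 70% of the 569 rows.
print(train_data.shape)        # (398, 31)

# The label Series keeps its name when concatenated as a column.
print(train_data.columns[-1])  # target

# The remaining rows are held out for evaluation.
print(len(X_test))             # 171
```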

Create experiment with make_experiment

Create an experiment for the prepared dataset and start training the model as follows:

from hypergbm import make_experiment


experiment = make_experiment(train_data, target='target', reward_metric='precision')
estimator = experiment.run()

where estimator is the trained model.

Save the model

We recommend saving the model with pickle:

import pickle
with open('model.pkl', 'wb') as f:
    pickle.dump(estimator, f)
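Loading the saved model back is the mirror image of saving it. The sketch below shows the round trip with the standard pickle module. To keep it runnable without a training run, it uses a plain dictionary as a stand-in for the trained estimator and writes to a temporary directory; both are assumptions made for illustration, not part of HyperGBM.

```python
import pickle
import tempfile
from pathlib import Path

# Placeholder for illustration only; in practice this is the object
# returned by experiment.run().
estimator = {"name": "placeholder-model"}

# Temporary location chosen for this sketch; use any path you like.
model_path = Path(tempfile.mkdtemp()) / "model.pkl"

# Save: open in binary write mode.
with open(model_path, "wb") as f:
    pickle.dump(estimator, f)

# Load: open in binary read mode.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

print(restored == estimator)  # True
```

As with any pickled object, the environment that loads the file needs the same libraries available as the one that saved it, and pickle files should only be loaded from sources you trust, because unpickling can execute arbitrary code.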

Evaluate the model

The model can be evaluated with tools provided by sklearn:

from sklearn.metrics import classification_report

y_pred = estimator.predict(X_test)
print(classification_report(y_test, y_pred, digits=5))

Output:

              precision    recall  f1-score   support

           0    0.96429   0.93103   0.94737        58
           1    0.96522   0.98230   0.97368       113

    accuracy                        0.96491       171
   macro avg    0.96475   0.95667   0.96053       171
weighted avg    0.96490   0.96491   0.96476       171
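To read the report, it can help to recompute a few of its figures by hand. The sketch below does this in plain Python. The per-class counts are not separate results: they are inferred from the support and recall columns of the table above (for example, a recall of 0.93103 on 58 samples corresponds to 54 correct predictions).

```python
# Counts inferred from the report above; hypothetical variable names.
correct0, missed0 = 54, 4    # class 0: 58 samples, 4 predicted as class 1
correct1, missed1 = 111, 2   # class 1: 113 samples, 2 predicted as class 0

# Recall: share of a class's true samples that were predicted correctly.
recall0 = correct0 / (correct0 + missed0)
recall1 = correct1 / (correct1 + missed1)

# Precision: share of predictions for a class that were correct.
# Samples missed from the other class count as false positives here.
precision0 = correct0 / (correct0 + missed1)
precision1 = correct1 / (correct1 + missed0)

# Accuracy: share of all samples predicted correctly.
total = correct0 + missed0 + correct1 + missed1
accuracy = (correct0 + correct1) / total

print(f"{precision0:.5f} {recall0:.5f}")  # 0.96429 0.93103
print(f"{precision1:.5f} {recall1:.5f}")  # 0.96522 0.98230
print(f"{accuracy:.5f}")                  # 0.96491
```

Since the experiment above was created with reward_metric='precision', the precision column is the one the search was asked to optimize.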

More info:

For more details on its parameters, refer to the docstring of make_experiment:

print(make_experiment.__doc__)

If you are using a Jupyter notebook or IPython, the following displays the signature and docstring of make_experiment:

make_experiment?