Version 0.2.2

This version adds the following new features:

Feature engineering

  • Feature generation

  • Feature dimension reduction

Data cleaning

  • Missing-value character handling

  • Column type correction

  • Constant column cleaning

  • Duplicate column cleaning

  • Deleting examples with missing targets

  • Replacing invalid values

  • ID column cleaning
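
As an illustration of what several of these cleaning steps involve, here is a minimal sketch in plain Python. It is not this package's implementation: the function `clean_columns` and the column-name-to-list table layout are assumptions made for the example.

```python
def clean_columns(table, target):
    """Drop rows with a missing target, then constant and duplicate columns.

    `table` maps column name -> list of values (all lists the same length).
    This helper and its data layout are hypothetical, for illustration only.
    """
    # Keep only the rows whose target value is present.
    keep = [i for i, y in enumerate(table[target]) if y is not None]
    table = {name: [col[i] for i in keep] for name, col in table.items()}

    cleaned, seen = {}, set()
    for name, col in table.items():
        if name != target:
            if len(set(col)) <= 1:      # constant column carries no signal
                continue
            if tuple(col) in seen:      # exact duplicate of an earlier column
                continue
            seen.add(tuple(col))
        cleaned[name] = col
    return cleaned


data = {
    "a": [1, 2, 3, 4],
    "a_copy": [1, 2, 3, 4],
    "const": [7, 7, 7, 7],
    "y": [0, 1, None, 1],
}
result = clean_columns(data, target="y")
print(result)
```

In this toy table the third row is dropped because its target is missing, and only the columns "a" and "y" survive.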

Dataset splitting

  • Adversarial validation

Modelling algorithms

  • XGBoost

  • CatBoost

  • LightGBM

  • HistGradientBoosting

Model training

  • Automatic task inference

  • Command line tools
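
Automatic task inference generally means guessing the task type (regression, binary, or multiclass classification) from the target column. The sketch below shows one simple heuristic. The function `infer_task` and its rules are assumptions made for illustration and may differ from the rules this package applies.

```python
def infer_task(y):
    """Guess the task type from target values (hypothetical heuristic)."""
    distinct = {v for v in y if v is not None}
    # A float target with a fractional part suggests a continuous value.
    if any(isinstance(v, float) and not v.is_integer() for v in distinct):
        return "regression"
    # Exactly two distinct labels suggests binary classification.
    if len(distinct) == 2:
        return "binary"
    return "multiclass"


print(infer_task([0, 1, 1, 0]))
print(infer_task([0.5, 1.2, 3.3]))
print(infer_task(["cat", "dog", "bird"]))
```

A production heuristic would likely also treat integer targets with many distinct values as regression. That case is left out here for brevity.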

Evaluation methods

  • Cross-validation

  • Train-validation-holdout
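
For readers unfamiliar with cross-validation, the following sketch shows how k-fold index splits can be produced with the standard library alone. The generator `k_fold_indices` is a hypothetical helper written for this example, not part of this package.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Yield (train, valid) index lists for each of n_folds folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal groups.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        valid = folds[i]
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, valid


splits = list(k_fold_indices(n_samples=10, n_folds=5))
for train, valid in splits:
    print(len(train), len(valid))
```

Each sample appears in exactly one validation fold, so every example is used for validation once and for training in the remaining folds.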

Search algorithms

  • Monte Carlo tree search

  • Evolutionary algorithms

  • Random search
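
Random search is the simplest of the three strategies: sample a configuration uniformly from the search space, evaluate it, and keep the best one seen so far. The sketch below illustrates the idea. The names `random_search`, `space`, and `toy_objective` are assumptions made for the example and do not reflect this package's API.

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Sample n_trials configurations and return the best (params, score)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter, independently and uniformly.
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space and a toy objective standing in for model training.
space = {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.1, 0.3]}


def toy_objective(params):
    return -abs(params["max_depth"] - 5) - abs(params["learning_rate"] - 0.1)


best_params, best_score = random_search(toy_objective, space, n_trials=50)
print(best_params, best_score)
```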

Imbalanced data handling

  • Class weight

  • Under-sampling

    • NearMiss

    • Tomek links

    • Random

  • Over-sampling

    • SMOTE

    • ADASYN

    • Random
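
The "Random" variants of under- and over-sampling can be sketched with the standard library alone. The helper `random_resample` below is hypothetical and only illustrates the idea: under-sampling draws each class down to the size of the smallest class, while over-sampling repeats minority examples up to the size of the largest.

```python
import random
from collections import Counter, defaultdict


def random_resample(X, y, mode="over", seed=0):
    """Balance classes by random over-sampling or under-sampling."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    sizes = [len(idx) for idx in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    chosen = []
    for idx in by_class.values():
        if len(idx) >= target:
            # Under-sample (or keep as is): draw without replacement.
            chosen += rng.sample(idx, target)
        else:
            # Over-sample: keep every example, then add random repeats.
            chosen += idx + rng.choices(idx, k=target - len(idx))
    return [X[i] for i in chosen], [y[i] for i in chosen]


X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 0, 0, 1, 1]
X_over, y_over = random_resample(X, y, mode="over")
X_under, y_under = random_resample(X, y, mode="under")
print(Counter(y_over), Counter(y_under))
```

SMOTE and ADASYN differ from random over-sampling in that they synthesize new minority examples rather than repeating existing ones.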

Early-stopping strategy

  • Stopping after n consecutive searches without improvement

  • Stopping after a maximum time limit is reached

  • Stopping after the expected performance is achieved
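
The three stopping criteria can be combined in a single check that runs after every trial. The class below is a minimal sketch under assumed names (`EarlyStopping`, `max_no_improvement`, `time_limit`, `expected_reward`). It is not this package's API.

```python
import time


class EarlyStopping:
    """Stop a search on stagnation, a time budget, or a reached target."""

    def __init__(self, max_no_improvement=None, time_limit=None,
                 expected_reward=None):
        self.max_no_improvement = max_no_improvement
        self.time_limit = time_limit            # seconds
        self.expected_reward = expected_reward
        self.best = float("-inf")
        self.stale_trials = 0
        self.start = time.monotonic()

    def should_stop(self, reward):
        # Track the best reward and how many trials passed without a new best.
        if reward > self.best:
            self.best, self.stale_trials = reward, 0
        else:
            self.stale_trials += 1

        if (self.max_no_improvement is not None
                and self.stale_trials >= self.max_no_improvement):
            return True
        if (self.time_limit is not None
                and time.monotonic() - self.start >= self.time_limit):
            return True
        if (self.expected_reward is not None
                and self.best >= self.expected_reward):
            return True
        return False


stopper = EarlyStopping(max_no_improvement=3, expected_reward=0.95)
rewards = [0.70, 0.80, 0.79, 0.78, 0.80, 0.90]
trials_run = 0
for reward in rewards:
    trials_run += 1
    if stopper.should_stop(reward):
        break
print(trials_run, stopper.best)
```

In this run the search stops after the fifth trial, because three trials in a row fail to beat the best reward of 0.80. Equalling the best does not count as an improvement.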

Advanced features

  • Two-stage search

    • Pseudo-label

    • Feature selection

  • Concept drift handling

  • Model ensemble