PyPruning
This package provides implementations of some common ensemble pruning algorithms. Pruning algorithms aim to select the best subset of a trained ensemble in order to minimize memory consumption and maximize accuracy. Currently, six types of pruning algorithms are implemented:
RandomPruningClassifier: Selects a random subset of classifiers. This is mainly used as a baseline.
RankPruningClassifier: Ranks each classifier according to a given metric and then selects the best K classifiers.
ClusterPruningClassifier: Clusters the classifiers according to a clustering method and then selects a representative from each cluster to form the sub-ensemble.
GreedyPruningClassifier: Proceeds in rounds and selects the best classifier in each round given the already selected sub-ensemble.
MIQPPruningClassifier: Constructs a mixed-integer quadratic problem and optimizes it to compute the best sub-ensemble.
ProxPruningClassifier: Minimizes a (regularized) loss function via (stochastic) proximal gradient descent over the ensemble's weights.
An example of how to use this code can be found in "Pruning an ensemble".
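To make the difference between rank-based and greedy selection from the list above concrete, here is a minimal, self-contained sketch in plain Python. It is not this package's implementation or API: the function names (`rank_prune`, `greedy_prune`, `majority_vote`), the use of accuracy as the metric, the tie-breaking rule, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def accuracy(pred: Sequence[int], y: Sequence[int]) -> float:
    """Fraction of samples where the prediction matches the label."""
    return sum(p == t for p, t in zip(pred, y)) / len(y)


def majority_vote(preds: List[Sequence[int]]) -> List[int]:
    """Per-sample majority vote; ties go to the smallest label (an assumption)."""
    voted = []
    for votes in zip(*preds):
        counts: Dict[int, int] = {}
        for v in votes:
            counts[v] = counts.get(v, 0) + 1
        voted.append(min(counts, key=lambda label: (-counts[label], label)))
    return voted


def rank_prune(preds, y, k):
    """Keep the k members with the highest individual accuracy."""
    order = sorted(range(len(preds)), key=lambda i: -accuracy(preds[i], y))
    return order[:k]


def greedy_prune(preds, y, k):
    """Each round, add the member that maximizes the sub-ensemble's accuracy."""
    selected: List[int] = []
    for _ in range(k):
        candidates = [i for i in range(len(preds)) if i not in selected]
        best = max(
            candidates,
            key=lambda i: accuracy(
                majority_vote([preds[j] for j in selected + [i]]), y
            ),
        )
        selected.append(best)
    return selected


# Hypothetical pruning data: true labels and the predictions of four members.
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
preds = [
    [0, 0, 0, 0, 0, 1, 1, 1, 0, 0],  # member 0: wrong on samples 8, 9
    [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],  # member 1: wrong on samples 7, 8, 9
    [1, 1, 1, 1, 0, 1, 1, 1, 1, 1],  # member 2: wrong on samples 0-3
    [0, 0, 0, 0, 1, 0, 0, 0, 1, 1],  # member 3: wrong on samples 4-7
]

rank_selected = rank_prune(preds, y, k=3)
greedy_selected = greedy_prune(preds, y, k=3)
rank_acc = accuracy(majority_vote([preds[i] for i in rank_selected]), y)
greedy_acc = accuracy(majority_vote([preds[i] for i in greedy_selected]), y)
print(rank_selected, rank_acc)
print(greedy_selected, greedy_acc)
```

On this toy data, ranking keeps members 0, 1 and 2, which share their mistakes on samples 8 and 9 (accuracy 0.8), whereas the greedy variant picks members 0, 2 and 3, whose errors do not overlap, and reaches accuracy 1.0.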
How to install
You can install this package directly from git via pip:
pip install git+https://github.com/sbuschjaeger/PyPruning.git
If you have trouble with dependencies, you can try setting up the conda environment that I use for development:
git clone git@github.com:sbuschjaeger/PyPruning.git
cd PyPruning
conda env create -f environment.yml
conda activate pypruning
Some notes on the MIQPPruningClassifier
For implementing the MIQPPruningClassifier we use cvxpy, which does not come with a MIQP solver. If you want to use this algorithm, you have to manually install a solver, e.g.
pip install cvxopt
for a free solver. If you prefer a commercial solver and use Anaconda, you can also install gurobi (with a free license):
conda install -c gurobi gurobi
For more information on setting the solver for MIQPPruningClassifier, have a look here.
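To illustrate why a dedicated solver is needed, the following sketch solves a tiny subset-selection problem with binary variables and a quadratic objective by plain enumeration. It is only an illustration under stated assumptions: the objective (individual error counts `q` plus pairwise joint-error counts `Q`), the function name `solve_by_enumeration`, and the numbers are hypothetical and are not taken from this package, which formulates the problem with cvxpy instead.

```python
from itertools import combinations


def solve_by_enumeration(q, Q, k):
    """Minimize q^T z + z^T Q z over binary z with exactly k ones.

    Every size-k subset is tried, so the cost grows combinatorially with
    the ensemble size; this is only feasible for very small ensembles.
    """
    best_subset, best_value = None, float("inf")
    for subset in combinations(range(len(q)), k):
        value = sum(q[i] for i in subset)  # linear term
        value += sum(Q[i][j] for i in subset for j in subset)  # quadratic term
        if value < best_value:
            best_subset, best_value = subset, value
    return list(best_subset), best_value


# Hypothetical numbers: q[i] counts the errors of member i on 10 samples,
# Q[i][j] counts the samples on which members i and j are both wrong.
q = [2, 3, 4, 4]
Q = [
    [0, 2, 0, 0],
    [2, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]

subset, value = solve_by_enumeration(q, Q, k=2)
print(subset, value)
```

With M classifiers and a target size of K there are "M choose K" candidate subsets, so enumeration stops being practical quickly; a MIQP solver searches the same space far more efficiently.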
Acknowledgements
The software is written and maintained by Sebastian Buschjäger as part of his work at the Chair for Artificial Intelligence at the TU Dortmund University and the Collaborative Research Center 876. If you have any questions, feel free to contact me at sebastian.buschjaeger@tu-dortmund.de.
Special thanks go to Henri Petuker, who provided parts of this implementation during his bachelor thesis, and David Clemens, who made the logo.