Ensemble

Some words about ensembles.

Available optimizations

The Ensemble object

class fastinference.models.Ensemble.Ensemble(classes, n_features, accuracy=None, name='model')

An ensemble implementation. There is nothing fancy going on here. It stores all models of the ensemble in an array self.models and the corresponding weights in self.weights.

__init__(classes, n_features, accuracy=None, name='model')

Constructor of this ensemble.

Parameters
  • classes (list of int) – The class mapping. The i-th output of the model belongs to class classes[i]. For example, with classes = [1,0,2] the first output of the model maps to class 1, the second output to class 0 and the third output to class 2.

  • n_features (int) – The number of features this model was trained on.

  • accuracy (float, optional) – The accuracy of this ensemble on some test data. Can be used to verify the correctness of the implementation. Defaults to None.

  • name (str, optional) – The name of this model. Defaults to 'model'.
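The classes mapping can be checked with a few lines of plain Python. This is only a sketch of the lookup described above: the outputs list is made-up data and nothing here calls into fastinference.

```python
# Illustrative only: plain Python, not part of the fastinference API.
classes = [1, 0, 2]          # classes[i] is the label of the i-th model output
outputs = [0.2, 0.7, 0.1]    # hypothetical scores for a single example

# Find the index of the largest output, then look up the label it maps to.
best_index = max(range(len(outputs)), key=lambda i: outputs[i])
predicted_label = classes[best_index]

print(best_index, predicted_label)
```

Here the second output is the largest, so the predicted label is classes[1], which is class 0.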

classmethod from_dict(data)

Generates a new ensemble from the given dictionary. It is assumed that the ensemble has previously been stored with the Ensemble.to_dict() method.

Parameters

data (dict) – The dictionary from which this ensemble should be generated.

Returns

The newly generated ensemble.

Return type

Ensemble

classmethod from_sklearn(sk_model, name='model', accuracy=None)

Generates a new ensemble from an sklearn ensemble.

Parameters
  • sk_model – A scikit-learn ensemble. Currently supported are {BaggingClassifier, RandomForestClassifier, ExtraTreesClassifier, AdaBoostClassifier, AdaBoostRegressor, GradientBoostingClassifier, GradientBoostingRegressor}.

  • name (str, optional) – The name of this model. Defaults to 'model'.

  • accuracy (float, optional) – The accuracy of this ensemble on some test data. Can be used to verify the correctness of the implementation. Defaults to None.

Returns

The newly generated ensemble.

Return type

Ensemble

implement(out_path, out_name, implementation_type, base_implementation, **kwargs)

Implements this ensemble.

Parameters
  • out_path (str) – Folder in which this ensemble should be stored.

  • out_name (str) – Filename in which this ensemble should be stored.

  • implementation_type (str) – The implementation which should be used to implement this ensemble, e.g. cpp.

  • base_implementation (str) – The implementation which should be used to implement the base learners, e.g. cpp.ifelse for trees.

optimize(optimizers, args, base_optimizers, base_args)

Optimizes this ensemble and all of its base learners.

Parameters
  • optimizers (list of str) – The optimizers, given as a list of strings, which should be applied to this ensemble.

  • args (list of dict) – A list of dictionaries containing the arguments for the respective optimizer, so that args[i] belongs to optimizers[i].

  • base_optimizers (list of str) – The optimizers, given as a list of strings, which should be applied to the base learners of this ensemble.

  • base_args (list of dict) – A list of dictionaries containing the arguments for the respective base optimizer, so that base_args[i] belongs to base_optimizers[i].
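The optimizer and argument lists are parallel: each dictionary belongs to the optimizer at the same position. The sketch below only illustrates that pairing. The optimizer names, the argument key and the pair_with_args helper are hypothetical placeholders, not identifiers from fastinference.

```python
def pair_with_args(names, arg_dicts):
    """Pair each optimizer name with its argument dictionary by position.

    Hypothetical helper for illustration; not part of fastinference.
    """
    if len(names) != len(arg_dicts):
        raise ValueError("expected one argument dictionary per optimizer")
    return list(zip(names, arg_dicts))


# Placeholder names and arguments, chosen only to show the list layout.
optimizers = ["optimizer_a", "optimizer_b"]
args = [{"level": 1}, {}]          # use an empty dict when there are no arguments

for name, kwargs in pair_with_args(optimizers, args):
    print(name, kwargs)
```

The same layout would apply to base_optimizers and base_args.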

predict_proba(X)

Applies this ensemble to the given data and provides the predicted probabilities for each example in X.

Parameters

X (numpy.array) – A (N, d) matrix where N is the number of data points and d is the feature dimension. If X has only one dimension, a single example is assumed and X is reshaped via X = X.reshape(1, X.shape[0]).

Returns

A (N, c) prediction matrix where N is the number of data points and c is the number of classes.

Return type

numpy.array
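The input and output shapes can be illustrated without the library. The sketch below uses nested Python lists instead of numpy arrays and assumes that the ensemble combines its base learners by a weighted sum of their outputs. That combination rule, the as_batch and weighted_proba helpers and the two toy models are assumptions made for this example, not the documented behaviour of predict_proba.

```python
def as_batch(X):
    """Treat a flat feature list as a single example, i.e. shape (d,) -> (1, d)."""
    if X and not isinstance(X[0], (list, tuple)):
        return [list(X)]
    return [list(row) for row in X]


def weighted_proba(models, weights, X):
    """Assumed combination rule: weighted sum of the base learners' outputs."""
    X = as_batch(X)
    n_classes = len(models[0](X[0]))
    result = [[0.0] * n_classes for _ in X]   # (N, c) output, filled below
    for model, weight in zip(models, weights):
        for row, x in zip(result, X):
            for c, p in enumerate(model(x)):
                row[c] += weight * p
    return result


# Two hypothetical base learners that ignore their input.
models = [lambda x: [1.0, 0.0], lambda x: [0.0, 1.0]]
weights = [0.75, 0.25]

single = weighted_proba(models, weights, [0.1, 0.2, 0.3])           # one example, d = 3
batch = weighted_proba(models, weights, [[0.1, 0.2], [0.3, 0.4]])   # N = 2, d = 2
```

In both calls the result has one row per example and one column per class, matching the (N, c) shape described above.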

to_dict()

Stores this ensemble as a dictionary which can be loaded with Ensemble.from_dict().

Returns

The dictionary representation of this ensemble.

Return type

dict
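The to_dict() / from_dict() pair follows a common round-trip pattern. The following sketch shows that pattern with a small stand-in class. ToyEnsemble and its dictionary keys are hypothetical and chosen for this example only; the actual keys written by Ensemble.to_dict() are not listed here.

```python
import json


class ToyEnsemble:
    """Hypothetical stand-in used for illustration; not fastinference's Ensemble."""

    def __init__(self, classes, n_features, accuracy=None, name="model"):
        self.classes = classes
        self.n_features = n_features
        self.accuracy = accuracy
        self.name = name
        self.weights = []  # one weight per base learner

    def to_dict(self):
        # Key names are assumptions chosen for this sketch.
        return {
            "classes": self.classes,
            "n_features": self.n_features,
            "accuracy": self.accuracy,
            "name": self.name,
            "weights": self.weights,
        }

    @classmethod
    def from_dict(cls, data):
        obj = cls(
            data["classes"],
            data["n_features"],
            data.get("accuracy"),
            data.get("name", "model"),
        )
        obj.weights = list(data.get("weights", []))
        return obj


original = ToyEnsemble(classes=[1, 0, 2], n_features=4, accuracy=0.9, name="demo")
original.weights = [0.5, 0.5]

# Round trip through JSON text to show that the dictionary is serialisable.
restored = ToyEnsemble.from_dict(json.loads(json.dumps(original.to_dict())))
```

After the round trip, restored carries the same classes, feature count, accuracy, name and weights as original.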