Ensemble

Some words about ensembles.

Available optimizations

The Ensemble object

class fastinference.models.Ensemble.Ensemble(classes, n_features, accuracy=None, name='model')

An Ensemble implementation. There is nothing fancy going on here: it stores all models of the ensemble in the array self.models and the corresponding weights in self.weights.

__init__(classes, n_features, accuracy=None, name='model')

Constructor of this ensemble.

Parameters

classes (list of int) – The class mappings. Each entry maps the given entry to the corresponding index so that the i-th output of the model belongs to class classes[i]. For example, with classes = [1,0,2] the second output of the model maps to class 0, the first output to class 1 and the third output to class 2.
n_features (int) – The number of features this model was trained on.
accuracy (float, optional) – The accuracy of this ensemble on some test data. Can be used to verify the correctness of the implementation. Defaults to None.
name (str, optional) – The name of this model. Defaults to “model”.
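The class mapping described for classes can be illustrated with a short, dependency-free sketch. The scores list is made-up data for a single example; only the mapping rule (output i belongs to class classes[i]) comes from the description above.

```python
# Class mapping from the parameter description: output i belongs to class classes[i].
classes = [1, 0, 2]

# Hypothetical scores for one example, one entry per model output.
scores = [0.2, 0.7, 0.1]

# Find the index of the largest score, then translate it through the mapping.
best_output = max(range(len(scores)), key=lambda i: scores[i])
predicted_class = classes[best_output]

print(best_output, predicted_class)
```

Here the second output (index 1) has the largest score, so the prediction is class 0, matching the example in the parameter description.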

classmethod from_dict(data)

Generates a new ensemble from the given dictionary. It is assumed that the ensemble has previously been stored with the Ensemble.to_dict() method.

Parameters: data (dict) – The dictionary from which this ensemble should be generated.
Returns: The newly generated ensemble.
Return type: Ensemble

classmethod from_sklearn(sk_model, name='model', accuracy=None)

Generates a new ensemble from an sklearn ensemble.

Parameters

sk_model – A scikit-learn ensemble. Currently supported are {BaggingClassifier, RandomForestClassifier, ExtraTreesClassifier, AdaBoostClassifier, AdaBoostRegressor, GradientBoostingClassifier, GradientBoostingRegressor}
name (str, optional) – The name of this model. Defaults to “model”.
accuracy (float, optional) – The accuracy of this ensemble on some test data. Can be used to verify the correctness of the implementation. Defaults to None.

Returns

The newly generated ensemble.

Return type

Ensemble

implement(out_path, out_name, implementation_type, base_implementation, **kwargs)

Implements this ensemble.

Parameters

out_path (str) – Folder in which this ensemble should be stored.
out_name (str) – Filename in which this ensemble should be stored.
implementation_type (str) – The implementation which should be used to implement this ensemble, e.g. cpp
base_implementation (str) – The implementation which should be used to implement the base learners, e.g. cpp.ifelse for trees.
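The conversion and code-generation calls documented above could be chained roughly as follows. This is a sketch, not a verified recipe: export_sklearn_ensemble is a hypothetical helper name, the import path is taken from the class signature at the top of this page, and the import is deferred into the function body so that defining the helper does not itself require fastinference to be installed.

```python
def export_sklearn_ensemble(sk_model, out_path, out_name="model", accuracy=None):
    """Convert a fitted scikit-learn ensemble and generate code for it.

    Hypothetical helper; assumes fastinference is installed when it is called.
    """
    # Deferred import: module path as given in the class signature on this page.
    from fastinference.models.Ensemble import Ensemble

    ensemble = Ensemble.from_sklearn(sk_model, name=out_name, accuracy=accuracy)
    # "cpp" and "cpp.ifelse" are the example values from the implement() description.
    ensemble.implement(out_path, out_name, "cpp", "cpp.ifelse")
    return ensemble
```

Calling the helper with, for example, a fitted RandomForestClassifier and an existing output folder is the intended use; the exact files produced depend on the chosen implementation and are not described in this section.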

optimize(optimizers, args, base_optimizers, base_args)

Optimizes this ensemble and all of its base learners.

Parameters

optimizers (list of str) – The names of the optimizers which should be applied to this ensemble.
args (list of dict) – A list of dictionaries containing the arguments for the respective optimizer.
base_optimizers (list of str) – The names of the optimizers which should be applied to the base learners of this ensemble.
base_args (list of dict) – A list of dictionaries containing the arguments for the respective base optimizer.
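The four arguments form two pairs of parallel lists. The sketch below assumes that the i-th dictionary belongs to the i-th optimizer name, which is how "respective" reads here. pair_with_args is a hypothetical helper, and the optimizer names and argument keys are placeholders, since the actual identifiers are not listed in this section.

```python
def pair_with_args(names, args):
    """Pair each optimizer name with its argument dictionary, position by position."""
    if len(names) != len(args):
        raise ValueError("expected one argument dictionary per optimizer")
    return list(zip(names, args))


# Placeholder names and arguments, not real fastinference optimizer identifiers.
optimizers = ["ensemble_opt_a", "ensemble_opt_b"]
args = [{"level": 1}, {}]
base_optimizers = ["base_opt_a"]
base_args = [{"level": 2}]

ensemble_plan = pair_with_args(optimizers, args)
base_plan = pair_with_args(base_optimizers, base_args)

print(ensemble_plan)
print(base_plan)
```

Checking the lengths up front catches a missing argument dictionary before any optimizer runs; an optimizer without arguments simply gets an empty dictionary.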

predict_proba(X)

Applies this ensemble to the given data and provides the predicted probabilities for each example in X.

Parameters: X (numpy.array) – A (N, d) matrix where N is the number of data points and d is the feature dimension. If X has only one dimension, a single example is assumed and X is reshaped via X = X.reshape(1, X.shape[0]).
Returns: A (N, c) prediction matrix where N is the number of data points and c is the number of classes
Return type: numpy.array
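The input and output shapes can be illustrated without NumPy. The sketch below assumes that the ensemble combines the base learners' class probabilities as a weighted sum using self.weights; this section does not state the combination rule, so treat that as an assumption. as_batch, weighted_proba and the two stub models are hypothetical names, and plain lists stand in for arrays.

```python
def as_batch(X):
    """Wrap a single flat example into a batch of one, mirroring the reshape to (1, d)."""
    if X and not isinstance(X[0], (list, tuple)):
        return [list(X)]
    return [list(row) for row in X]


def weighted_proba(models, weights, X):
    """Combine per-model class probabilities as a weighted sum (assumed rule)."""
    batch = as_batch(X)
    n_classes = len(models[0](batch[0]))
    out = [[0.0] * n_classes for _ in batch]
    for model, w in zip(models, weights):
        for n, x in enumerate(batch):
            for c, p in enumerate(model(x)):
                out[n][c] += w * p
    return out


# Two hypothetical base learners that ignore the input and return fixed probabilities.
def model_a(x):
    return [0.9, 0.1]


def model_b(x):
    return [0.5, 0.5]


# A single example with d = 3 features is treated as a batch with N = 1.
single = weighted_proba([model_a, model_b], [0.5, 0.5], [1.0, 2.0, 3.0])
# Two examples give a result with N = 2 rows and c = 2 columns.
double = weighted_proba([model_a, model_b], [0.5, 0.5], [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

With equal weights of 0.5 the result is the average of the two stub models, roughly 0.7 and 0.3 per row.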

to_dict()

Stores this ensemble as a dictionary which can be loaded with Ensemble.from_dict().

Returns: The dictionary representation of this ensemble.
Return type: dict
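The to_dict() / from_dict() round trip can be pictured with a small stand-in class. TinyEnsemble is hypothetical and is not fastinference's Ensemble; the dictionary keys are assumptions chosen to mirror the constructor arguments and the self.models / self.weights attributes mentioned above.

```python
import json


class TinyEnsemble:
    """Hypothetical stand-in used only to illustrate the round trip."""

    def __init__(self, classes, n_features, accuracy=None, name="model"):
        self.classes = classes
        self.n_features = n_features
        self.accuracy = accuracy
        self.name = name
        self.models = []   # base learners, kept as plain dicts in this sketch
        self.weights = []  # one weight per base learner

    def to_dict(self):
        # Key names are assumptions, not the library's actual schema.
        return {
            "classes": self.classes,
            "n_features": self.n_features,
            "accuracy": self.accuracy,
            "name": self.name,
            "models": list(self.models),
            "weights": list(self.weights),
        }

    @classmethod
    def from_dict(cls, data):
        obj = cls(data["classes"], data["n_features"],
                  data.get("accuracy"), data.get("name", "model"))
        obj.models = list(data.get("models", []))
        obj.weights = list(data.get("weights", []))
        return obj


original = TinyEnsemble([1, 0, 2], n_features=4, accuracy=0.93, name="demo")
original.models = [{"kind": "stub"}]
original.weights = [1.0]

# Passing through JSON shows the dictionary holds only plain, serializable values.
restored = TinyEnsemble.from_dict(json.loads(json.dumps(original.to_dict())))
```

Keeping the dictionary limited to plain values is what makes it easy to store as JSON and reload later.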