Tree

Some words about trees.

Available optimizations

fastinference.optimizers.tree.swap.optimize(model, **kwargs)

Performs swap optimization. Swaps the two child nodes of an inner node if the probability of visiting the left subtree is smaller than the probability of visiting the right subtree. This maximizes the probability of visiting the left subtree, which in turn improves branch prediction in the CPU pipeline. You can activate this optimization by passing "swap" to the optimizer, e.g.

import fastinference.Loader

loaded_model = fastinference.Loader.model_from_file("/my/nice/tree.json")
loaded_model.optimize("swap", None)

Reference: Buschjäger, Sebastian, et al. “Realization of random forest for real-time evaluation through tree framing.” 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 2018.

Parameters: model (Tree) – The tree model.
Returns: The tree model with swapped nodes.
Return type: Tree
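
The swap rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the fastinference implementation: the SketchNode class and all of its field names are hypothetical, and the sketch assumes that a split sends x[feature] <= split to the left child. Under that assumption, swapping the children also requires flipping the comparison so that predictions stay the same.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchNode:
    # Hypothetical node layout, used for illustration only.
    feature: Optional[int] = None
    split: Optional[float] = None
    prob_left: float = 0.0
    prob_right: float = 0.0
    left: Optional["SketchNode"] = None
    right: Optional["SketchNode"] = None
    prediction: Optional[int] = None  # set only on leaf nodes
    go_left_if_leq: bool = True       # True: x[feature] <= split goes left


def swap_children(root):
    """Place the more likely child on the left of every inner node."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.prediction is not None:
            continue  # leaf node, nothing to swap
        if node.prob_left < node.prob_right:
            node.left, node.right = node.right, node.left
            node.prob_left, node.prob_right = node.prob_right, node.prob_left
            # Flip the comparison so the tree still computes the same function.
            node.go_left_if_leq = not node.go_left_if_leq
        stack.extend([node.left, node.right])
    return root


def predict(root, x):
    """Follow the splits from the root down to a leaf."""
    node = root
    while node.prediction is None:
        leq = x[node.feature] <= node.split
        node = node.left if leq == node.go_left_if_leq else node.right
    return node.prediction


# A single split whose right child is visited more often than the left one.
root = SketchNode(feature=0, split=0.5, prob_left=0.2, prob_right=0.8,
                  left=SketchNode(prediction=0), right=SketchNode(prediction=1))
before = [predict(root, [0.3]), predict(root, [0.9])]
swap_children(root)
after = [predict(root, [0.3]), predict(root, [0.9])]
```

After the swap, the more likely child sits on the left while before and after hold the same predictions.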

fastinference.optimizers.tree.quantize.optimize(model, quantize_splits=None, quantize_leafs=None, **kwargs)

Quantizes the splits and predictions in the leaf nodes of the given tree and prunes away unreachable parts of the tree after quantization.

Note: The input data is not quantized when quantize_splits is set. You therefore have to scale the input data manually by the corresponding value to make sure that splits are performed correctly.

Parameters

model (Tree) – The tree.
quantize_splits (str or None, optional) – Can be “rounding” or an integer (given either as int or as a string). If set to “rounding”, each split is rounded down to the next integer. Otherwise, quantize_splits is interpreted as an integer value by which each split is scaled before it is rounded down to the next integer. Defaults to None.
quantize_leafs (str or None, optional) – Can be an integer (given either as int or as a string). quantize_leafs is interpreted as an integer value by which each leaf node is scaled before it is rounded down to the next integer. Defaults to None.

Returns

The quantized and potentially pruned tree.

Return type

Tree
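
The scaling and rounding described above, and the reason the input data has to be scaled as well, can be illustrated with a small sketch. The helper quantize_value is hypothetical and only mirrors the parameter description; it is not part of fastinference. The sketch also assumes that a split is evaluated as x <= split.

```python
import math


def quantize_value(value, mode):
    """Hypothetical helper that mirrors the parameter description above."""
    if mode is None:
        return value                      # no quantization requested
    if mode == "rounding":
        return math.floor(value)          # round down to the next integer
    return math.floor(value * int(mode))  # scale first, then round down


scale = "100"
q_split = quantize_value(0.374, scale)    # the split 0.374 is stored as 37

x = 0.9
# Without scaling, the raw input is compared against the scaled split.
unscaled_goes_left = x <= q_split
# With scaling, input and split live on the same scale again.
scaled_goes_left = x * int(scale) <= q_split
```

In the original tree, 0.9 <= 0.374 is false. Only the comparison with the scaled input reproduces that decision, while the unscaled comparison takes the other branch.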

The Tree object

class fastinference.models.Tree.Node

A single node of a Decision Tree. There is nothing fancy going on here. It stores all the relevant attributes of a node.

__init__(): Generates a new node. All attributes are initialize to None.

class fastinference.models.Tree.Tree(classes, n_features, accuracy=None, name='Model')

A Decision Tree implementation. There is nothing fancy going on here. It stores all nodes in an array self.nodes and has a pointer self.head which points to the root node of the tree. By construction, it is safe to assume that self.head = self.nodes[0].

__init__(classes, n_features, accuracy=None, name='Model')

Constructor of a tree.

Parameters

classes (list of int) – The class mappings. Each entry gives the class of the model output with the same index, so that the i-th output of the model belongs to class classes[i]. For example, with classes = [1,0,2] the first output of the model maps to class 1, the second output to class 0 and the third output to class 2.
n_features (int) – The number of features this model was trained on.
accuracy (float, optional) – The accuracy of this tree on some test data. Can be used to verify the correctness of the implementation. Defaults to None.
name (str, optional) – The name of this model. Defaults to “Model”.
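
The class mapping can be made concrete with a short sketch. The probability vector proba is a made-up model output for a single example, used only to show how an output index is translated into a class through classes.

```python
classes = [1, 0, 2]

# Hypothetical model output for one example: one score per output index.
proba = [0.1, 0.7, 0.2]

# Index of the largest output, then the class that this index maps to.
best_index = max(range(len(proba)), key=proba.__getitem__)
predicted_class = classes[best_index]
```

Here the second output is the largest, so the example is assigned to class classes[1].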

classmethod from_dict(data)

Generates a new tree from the given dictionary. It is assumed that a tree has previously been stored with the Tree.to_dict() method.

Parameters: data (dict) – The dictionary from which this tree should be generated.
Returns: The newly generated tree.
Return type: Tree

classmethod from_sklearn(sk_model, name='Model', accuracy=None, ensemble_type=None)

Generates a new tree from an sklearn tree.

Parameters

sk_model (DecisionTreeClassifier) – A DecisionTreeClassifier trained in sklearn.
name (str, optional) – The name of this model. Defaults to “Model”.
accuracy (float, optional) – The accuracy of this tree on some test data. Can be used to verify the correctness of the implementation. Defaults to None.
ensemble_type (str, optional) – Indicates in which scikit-learn ensemble (e.g. RandomForestClassifier, AdaBoostClassifier_SAMME.R, AdaBoostClassifier_SAMME) this DecisionTreeClassifier has been trained, because the probabilities of the leaf nodes are interpreted differently for each ensemble. If None is set, then a regular DecisionTreeClassifier is assumed. Defaults to None.

Returns

The newly generated tree.

Return type

Tree

populate_path_probs(node=None, curPath=None, allPaths=None, pathNodes=None, pathLabels=None)

predict_proba(X)

Applies this tree to the given data and provides the predicted probabilities for each example in X.

Parameters: X (numpy.array) – A (N, d) matrix where N is the number of data points and d is the feature dimension. If X has only one dimension, a single example is assumed and X is reshaped via X = X.reshape(1,X.shape[0]).
Returns: A (N, c) prediction matrix where N is the number of data points and c is the number of classes.
Return type: numpy.array
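
The shape convention of predict_proba can be sketched with plain Python lists standing in for numpy arrays. The node layout below (a flat list of dictionaries with the root at index 0 and the keys feature, split, left, right and proba) is hypothetical and chosen only for this illustration, as is the assumption that x[feature] <= split goes to the left child.

```python
def predict_proba_sketch(nodes, X):
    """Return one list of class probabilities per row of X."""
    # A flat list of numbers is treated as a single example, i.e. shape (1, d).
    if X and not isinstance(X[0], (list, tuple)):
        X = [X]
    out = []
    for x in X:
        i = 0  # start at the root, stored at index 0
        while "proba" not in nodes[i]:
            node = nodes[i]
            goes_left = x[node["feature"]] <= node["split"]
            i = node["left"] if goes_left else node["right"]
        out.append(nodes[i]["proba"])
    return out


# One split on feature 0 with two leaves, each holding c = 2 probabilities.
nodes = [
    {"feature": 0, "split": 0.5, "left": 1, "right": 2},
    {"proba": [0.9, 0.1]},
    {"proba": [0.2, 0.8]},
]

batch = predict_proba_sketch(nodes, [[0.1], [0.7]])  # N = 2 examples
single = predict_proba_sketch(nodes, [0.1])          # one-dimensional input
```

Both calls return a list of rows, so a one-dimensional input still yields a result with one row per example.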

to_dict()

Stores this tree as a dictionary which can be loaded with Tree.from_dict().

Returns: The dictionary representation of this tree.
Return type: dict