bayeso.trees.trees_common

It defines common functions for tree-based surrogates.

bayeso.trees.trees_common._mse(Y: ndarray) → float

It returns a mean squared loss over Y.

Parameters: Y (numpy.ndarray) – outputs in a leaf.
Returns: a loss value.
Return type: float
Raises: AssertionError
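
As a rough illustration, a leaf loss of this kind can be sketched as the mean squared deviation of the outputs from their mean. The name `leaf_mse` and the treatment of an empty leaf are assumptions made for this sketch; they are not taken from bayeso.

```python
import numpy as np

def leaf_mse(Y: np.ndarray) -> float:
    """Hypothetical stand-in: mean squared deviation of Y from its mean."""
    assert isinstance(Y, np.ndarray)
    if Y.shape[0] == 0:
        # Assumption: an empty leaf gets an infinite loss so it is never preferred.
        return float("inf")
    return float(np.mean((Y - np.mean(Y, axis=0)) ** 2))

Y = np.array([[1.0], [2.0], [3.0]])
loss = leaf_mse(Y)  # deviations are -1, 0, 1, so the loss is 2/3
```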

bayeso.trees.trees_common._predict_by_tree(bx: ndarray, tree: dict) → Tuple[float, float]

It predicts a posterior distribution over bx, given tree.

Parameters:

bx (numpy.ndarray) – an input. Shape: (d, ).
tree (dict) – a decision tree.

Returns:

posterior mean and standard deviation estimates.

Return type:

(float, float)

Raises:

AssertionError
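
The layout of the tree dictionary is not described here, so the following is only a sketch of how a single-input prediction could work. It assumes a hypothetical layout in which internal nodes carry `dim`, `location`, `left`, and `right` keys and a leaf is a list of (input, output) pairs; the function name `predict_by_tree_sketch` is likewise illustrative.

```python
import numpy as np

# Hypothetical tree layout, for illustration only.
tree = {
    "dim": 0,
    "location": 0.5,
    "left": [(np.array([0.1]), np.array([1.0])), (np.array([0.3]), np.array([3.0]))],
    "right": [(np.array([0.8]), np.array([10.0]))],
}

def predict_by_tree_sketch(bx: np.ndarray, tree: dict) -> tuple:
    node = tree
    # Descend until a leaf (a list of pairs) is reached.
    while isinstance(node, dict):
        go_left = bx[node["dim"]] < node["location"]
        node = node["left"] if go_left else node["right"]
    outputs = np.array([by for _, by in node])
    # Mean and standard deviation of the outputs stored in the leaf.
    return float(np.mean(outputs)), float(np.std(outputs))

mean_left, std_left = predict_by_tree_sketch(np.array([0.2]), tree)
mean_right, std_right = predict_by_tree_sketch(np.array([0.9]), tree)
```

In this sketch the estimate is taken from whichever leaf the input falls into, so a leaf with a single sample yields a standard deviation of zero.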

bayeso.trees.trees_common._predict_by_trees(bx: ndarray, list_trees: list) → Tuple[float, float]

It predicts a posterior distribution over bx, given list_trees.

Parameters:

bx (numpy.ndarray) – an input. Shape: (d, ).
list_trees (list) – a list of decision trees.

Returns:

posterior mean and standard deviation estimates.

Return type:

(float, float)

Raises:

AssertionError

bayeso.trees.trees_common._split(X: ndarray, Y: ndarray, num_features: int, split_random_location: bool) → dict

It splits X and Y into left and right leaves and returns them as a dictionary that also includes the split dimension and split location.

Parameters:

X (numpy.ndarray) – inputs. Shape: (n, d).
Y (numpy.ndarray) – outputs. Shape: (n, 1).
num_features (int) – the number of features to split.
split_random_location (bool) – flag for setting a split location randomly or not.

Returns:

a dictionary of left and right leaves, split dimension, and split location.

Return type:

dict

Raises:

AssertionError

bayeso.trees.trees_common._split_deterministic(X: ndarray, Y: ndarray, dim_to_split: int) → Tuple[int, float, Tuple]

bayeso.trees.trees_common._split_left_right(X: ndarray, Y: ndarray, dim_to_split: int, val_to_split: float) → tuple

It splits X and Y into left and right leaves.

Parameters:

X (numpy.ndarray) – inputs. Shape: (n, d).
Y (numpy.ndarray) – outputs. Shape: (n, 1).
dim_to_split (int) – a dimension to split.
val_to_split (float) – a value to split.

Returns:

a tuple of left and right leaves.

Return type:

tuple

Raises:

AssertionError
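
A threshold split of this kind can be sketched as follows. The function name `split_left_right_sketch`, the strict `<` comparison, and the representation of each leaf as a list of (input, output) pairs are assumptions for illustration.

```python
import numpy as np

def split_left_right_sketch(X, Y, dim_to_split, val_to_split):
    """Hypothetical partition: rows below the threshold go left, the rest go right."""
    assert X.shape[0] == Y.shape[0]
    left, right = [], []
    for bx, by in zip(X, Y):
        target = left if bx[dim_to_split] < val_to_split else right
        target.append((bx, by))
    return left, right

X = np.array([[0.1, 5.0], [0.9, 1.0], [0.4, 2.0]])
Y = np.array([[1.0], [2.0], [3.0]])
left, right = split_left_right_sketch(X, Y, dim_to_split=0, val_to_split=0.5)
```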

bayeso.trees.trees_common._split_random(X: ndarray, Y: ndarray, dim_to_split: int) → Tuple[int, float, Tuple]

bayeso.trees.trees_common.compute_sigma(preds_mu_leaf: ndarray, preds_sigma_leaf: ndarray, min_sigma: float = 0.0) → ndarray

It computes predictive standard deviation estimates.

Parameters:

preds_mu_leaf (numpy.ndarray) – predictive mean estimates of leaf. Shape: (n, ).
preds_sigma_leaf (numpy.ndarray) – predictive standard deviation estimates of leaf. Shape: (n, ).
min_sigma (float) – threshold for minimum standard deviation.

Returns:

predictive standard deviation estimates. Shape: (n, ).

Return type:

numpy.ndarray

Raises:

AssertionError
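
One standard way to combine per-component means and standard deviations is the law of total variance for an equally weighted mixture. Whether compute_sigma uses exactly this rule is not stated here, so the sketch below is an assumption throughout: the name `combine_sigma_sketch` is hypothetical, `min_sigma` is treated as a lower bound on each component, and the result is collapsed to a single value.

```python
import numpy as np

def combine_sigma_sketch(mu: np.ndarray, sigma: np.ndarray, min_sigma: float = 0.0) -> float:
    """Hypothetical helper: std of an equally weighted mixture of components."""
    assert mu.shape == sigma.shape
    # Assumption: min_sigma acts as a floor on each component's std.
    sigma = np.maximum(sigma, min_sigma)
    # Law of total variance: Var = E[mu_i^2 + sigma_i^2] - (E[mu_i])^2
    variance = np.mean(mu ** 2 + sigma ** 2) - np.mean(mu) ** 2
    # Guard against tiny negative values caused by floating-point rounding.
    return float(np.sqrt(np.maximum(variance, 0.0)))

mu = np.array([1.0, 3.0])
sigma = np.array([0.0, 0.0])
combined = combine_sigma_sketch(mu, sigma)
combined_floor = combine_sigma_sketch(mu, sigma, min_sigma=1.0)
```

With zero component spread, the combined value reduces to the spread of the means; raising `min_sigma` can only increase it.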

bayeso.trees.trees_common.get_inputs_from_leaf(leaf: list) → ndarray

It returns the inputs from a leaf.

Parameters: leaf (list) – pairs of input and output in a leaf.
Returns: inputs. Shape: (n, d).
Return type: numpy.ndarray
Raises: AssertionError

bayeso.trees.trees_common.get_outputs_from_leaf(leaf: list) → ndarray

It returns the outputs from a leaf.

Parameters: leaf (list) – pairs of input and output in a leaf.
Returns: outputs. Shape: (n, 1).
Return type: numpy.ndarray
Raises: AssertionError
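
To make the shapes concrete, the following sketch stacks a leaf of (input, output) pairs into arrays with plain NumPy. It assumes each input is a vector of shape (d, ) and each output a vector of shape (1, ); it does not call bayeso itself.

```python
import numpy as np

# A leaf holding n = 3 pairs with d = 2 input dimensions.
leaf = [
    (np.array([0.1, 0.2]), np.array([1.0])),
    (np.array([0.3, 0.4]), np.array([2.0])),
    (np.array([0.5, 0.6]), np.array([3.0])),
]

inputs = np.array([bx for bx, _ in leaf])   # stacked inputs, shape (n, d)
outputs = np.array([by for _, by in leaf])  # stacked outputs, shape (n, 1)
```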

bayeso.trees.trees_common.mse(left_right: tuple) → float

It returns a mean squared loss over left_right.

Parameters: left_right (tuple) – a tuple of left and right leaves.
Returns: a loss value.
Return type: float
Raises: AssertionError
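
A loss over a candidate split could be sketched as below. The sketch assumes the split loss is the sum of the two per-leaf losses and that both leaves are non-empty; the names `leaf_loss` and `split_loss` are hypothetical.

```python
import numpy as np

def leaf_loss(leaf: list) -> float:
    """Mean squared deviation of the outputs in one leaf (assumes a non-empty leaf)."""
    assert len(leaf) > 0
    outputs = np.array([by for _, by in leaf])
    return float(np.mean((outputs - outputs.mean()) ** 2))

def split_loss(left_right: tuple) -> float:
    """Assumption: the loss of a split is the sum of its two leaf losses."""
    left, right = left_right
    return leaf_loss(left) + leaf_loss(right)

left = [(np.array([0.1]), np.array([1.0])), (np.array([0.3]), np.array([3.0]))]
right = [(np.array([0.8]), np.array([10.0]))]
total = split_loss((left, right))  # left leaf contributes 1.0, right leaf 0.0
```

A lower value indicates that the outputs within each leaf are more homogeneous, which is why such a loss is a natural criterion for comparing candidate splits.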

bayeso.trees.trees_common.predict_by_trees(X: ndarray, list_trees: list) → Tuple[ndarray, ndarray]

It predicts a posterior distribution over X, given list_trees, using multiprocessing.

Parameters:

X (numpy.ndarray) – inputs. Shape: (n, d).
list_trees (list) – a list of decision trees.

Returns:

posterior mean and standard deviation estimates. Shape: ((n, 1), (n, 1)).

Return type:

(numpy.ndarray, numpy.ndarray)

Raises:

AssertionError

bayeso.trees.trees_common.split(node: dict, depth_max: int, size_min_leaf: int, num_features: int, split_random_location: bool, cur_depth: int) → None

It splits a root node to construct a tree.

Parameters:

node (dict) – a root node.
depth_max (int) – maximum depth of tree.
size_min_leaf (int) – minimum size of leaf.
num_features (int) – the number of split features.
split_random_location (bool) – flag for setting a split location randomly or not.
cur_depth (int) – depth of the current node.

Returns:

None.

Return type:

NoneType

Raises:

AssertionError
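
The roles of depth_max, size_min_leaf, and cur_depth can be illustrated with a small recursive builder. This is a sketch under stated assumptions rather than bayeso's procedure: it returns a nested dictionary instead of None, it cycles through dimensions and splits at the median instead of using num_features or split_random_location, and all names and dictionary keys are hypothetical.

```python
import numpy as np

def build_sketch(X, Y, depth_max, size_min_leaf, cur_depth=1):
    """Hypothetical recursive builder that returns a nested dict."""
    leaf = {"is_leaf": True, "mean": float(Y.mean()), "size": X.shape[0]}
    # Stop at the depth limit or when two children of minimum size cannot both fit.
    if cur_depth >= depth_max or X.shape[0] < 2 * size_min_leaf:
        return leaf
    dim = (cur_depth - 1) % X.shape[1]      # assumption: cycle through dimensions
    location = float(np.median(X[:, dim]))  # assumption: split at the median
    mask = X[:, dim] < location
    if mask.sum() < size_min_leaf or (~mask).sum() < size_min_leaf:
        return leaf
    return {
        "is_leaf": False, "dim": dim, "location": location,
        "left": build_sketch(X[mask], Y[mask], depth_max, size_min_leaf, cur_depth + 1),
        "right": build_sketch(X[~mask], Y[~mask], depth_max, size_min_leaf, cur_depth + 1),
    }

def leaf_sizes(node):
    """Collect the number of samples in every leaf of the sketch tree."""
    if node["is_leaf"]:
        return [node["size"]]
    return leaf_sizes(node["left"]) + leaf_sizes(node["right"])

rng = np.random.default_rng(0)
X = rng.uniform(size=(32, 2))
Y = X.sum(axis=1, keepdims=True)
tree = build_sketch(X, Y, depth_max=3, size_min_leaf=2)
sizes = leaf_sizes(tree)
```

In this sketch, recursion stops as soon as either limit is hit, so depth_max bounds the number of leaves and size_min_leaf bounds how few samples any leaf may hold.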

bayeso.trees.trees_common.subsample(X: ndarray, Y: ndarray, ratio_sampling: float, replace_samples: bool) → Tuple[ndarray, ndarray]

It subsamples X and Y to produce a bootstrap sample.

Parameters:

X (numpy.ndarray) – inputs. Shape: (n, d).
Y (numpy.ndarray) – outputs. Shape: (n, 1).
ratio_sampling (float) – ratio of sampling.
replace_samples (bool) – a flag for sampling with replacement or without replacement.

Returns:

a bootstrap sample as a tuple of inputs and outputs. Shape: ((m, d), (m, 1)).

Return type:

(numpy.ndarray, numpy.ndarray)

Raises:

AssertionError
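
Subsampling of this kind can be sketched with NumPy's random generator. The sketch assumes the sample size is m = floor(n * ratio_sampling) and that rows of X and Y are drawn with the same indices; the function name and the `seed` parameter are hypothetical additions for reproducibility.

```python
import numpy as np

def subsample_sketch(X, Y, ratio_sampling, replace_samples, seed=None):
    """Hypothetical subsampler that keeps inputs and outputs paired."""
    assert X.shape[0] == Y.shape[0]
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = int(n * ratio_sampling)  # assumption: sample size is floor(n * ratio)
    # Draw row indices once so that each input stays with its output.
    indices = rng.choice(n, size=m, replace=replace_samples)
    return X[indices], Y[indices]

X = np.arange(20.0).reshape(10, 2)  # row i is [2i, 2i + 1]
Y = np.arange(10.0).reshape(10, 1)  # row i is [i]
X_sub, Y_sub = subsample_sketch(X, Y, ratio_sampling=0.5, replace_samples=False, seed=0)
```

Note that in this sketch, sampling without replacement requires m to be at most n, since `Generator.choice` raises a ValueError otherwise.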

bayeso.trees.trees_common.unit_predict_by_trees(X: ndarray, list_trees: list) → Tuple[ndarray, ndarray]

It predicts a posterior distribution over X, given list_trees.

Parameters:

X (numpy.ndarray) – inputs. Shape: (n, d).
list_trees (list) – a list of decision trees.

Returns:

posterior mean and standard deviation estimates. Shape: ((n, 1), (n, 1)).

Return type:

(numpy.ndarray, numpy.ndarray)

Raises:

AssertionError