probnet.helpers package

probnet.helpers.data_handler module

class probnet.helpers.data_handler.Data(X=None, y=None, name='Unknown')[source]

Bases: object

Container for a dataset's features and labels, with helpers for scaling, label encoding, and train/test splitting.

Parameters:
  • X (np.ndarray) – The features of your data

  • y (np.ndarray) – The labels of your data

SUPPORT = {'scaler': ['standard', 'minmax', 'max-abs', 'log1p', 'loge', 'sqrt', 'sinh-arc-sinh', 'robust', 'box-cox', 'yeo-johnson']}
static encode_label(y)[source]
static scale(X, scaling_methods=('standard',), list_dict_paras=None)[source]
set_train_test(X_train=None, y_train=None, X_test=None, y_test=None)[source]

Set your own X_train, y_train, X_test, and y_test if you do not want to use the built-in split function.

Parameters:
  • X_train (np.ndarray) –

  • y_train (np.ndarray) –

  • X_test (np.ndarray) –

  • y_test (np.ndarray) –

split_train_test(test_size=0.2, train_size=None, random_state=41, shuffle=True, stratify=None, inplace=True)[source]

A wrapper around the train_test_split function in the scikit-learn library.

class probnet.helpers.data_handler.DataTransformer(scaling_methods=('standard',), list_dict_paras=None)[source]

Bases: BaseEstimator, TransformerMixin

The class is used to transform data using different scaling techniques.

Parameters:
  • scaling_methods (str, tuple, list, or np.ndarray) – The name of the scaler to use, or a sequence of names to apply several scalers. Supported scaler names are: ‘standard’, ‘minmax’, ‘max-abs’, ‘log1p’, ‘loge’, ‘sqrt’, ‘sinh-arc-sinh’, ‘robust’, ‘box-cox’, ‘yeo-johnson’.

  • list_dict_paras (dict or list of dict) – The parameters for the scaler(s). For a single scaler, pass a dict. For several scalers, pass a list of dicts.

SUPPORTED_SCALERS = {'box-cox': <class 'probnet.helpers.scaler.BoxCoxScaler'>, 'log1p': <class 'probnet.helpers.scaler.Log1pScaler'>, 'loge': <class 'probnet.helpers.scaler.LogeScaler'>, 'max-abs': <class 'sklearn.preprocessing._data.MaxAbsScaler'>, 'minmax': <class 'sklearn.preprocessing._data.MinMaxScaler'>, 'robust': <class 'sklearn.preprocessing._data.RobustScaler'>, 'sinh-arc-sinh': <class 'probnet.helpers.scaler.SinhArcSinhScaler'>, 'sqrt': <class 'probnet.helpers.scaler.SqrtScaler'>, 'standard': <class 'sklearn.preprocessing._data.StandardScaler'>, 'yeo-johnson': <class 'probnet.helpers.scaler.YeoJohnsonScaler'>}
fit(X, y=None)[source]

Fit the sequence of scalers on the data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – The input data.

  • y (Ignored) – Not used, exists for compatibility with sklearn’s pipeline.

Returns:

self – Fitted transformer.

Return type:

object

inverse_transform(X)[source]

Reverse the transformations applied to the data.

Parameters:

X (array-like) – Transformed data to invert.

Returns:

X_original – Original data before transformation.

Return type:

array-like

transform(X)[source]

Transform the input data using the sequence of fitted scalers.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input data to transform.

Returns:

X_transformed – Transformed data.

Return type:

array-like
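
The fit / transform / inverse_transform contract above can be illustrated with a minimal sketch. It assumes the scalers are applied in the order given and undone in reverse order. ToyStandard, ToyMinMax, chain_fit_transform, and chain_inverse are hypothetical stand-ins written in plain Python for a single feature; they are not probnet code.

```python
from statistics import mean, pstdev


class ToyStandard:
    """Hypothetical stand-in for a 'standard' scaler on a 1-D list."""

    def fit(self, xs):
        self.mu = mean(xs)
        self.sd = pstdev(xs) or 1.0  # avoid dividing by zero
        return self

    def transform(self, xs):
        return [(x - self.mu) / self.sd for x in xs]

    def inverse_transform(self, xs):
        return [x * self.sd + self.mu for x in xs]


class ToyMinMax:
    """Hypothetical stand-in for a 'minmax' scaler on a 1-D list."""

    def fit(self, xs):
        self.lo = min(xs)
        self.span = (max(xs) - self.lo) or 1.0
        return self

    def transform(self, xs):
        return [(x - self.lo) / self.span for x in xs]

    def inverse_transform(self, xs):
        return [x * self.span + self.lo for x in xs]


def chain_fit_transform(scalers, xs):
    # Fit each scaler on the output of the previous one.
    for s in scalers:
        xs = s.fit(xs).transform(xs)
    return xs


def chain_inverse(scalers, xs):
    # Undo the scalers in reverse order.
    for s in reversed(scalers):
        xs = s.inverse_transform(xs)
    return xs


scalers = [ToyStandard(), ToyMinMax()]
data = [2.0, 4.0, 6.0, 10.0]
scaled = chain_fit_transform(scalers, data)
restored = chain_inverse(scalers, scaled)
```

Reversing the order on the way back matters: each inverse only makes sense on the output of the scaler that produced it.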

class probnet.helpers.data_handler.FeatureEngineering[source]

Bases: object

Class used to create binary indicator columns for low values in the dataset. This is useful for identifying and processing low values in the data.

Parameters:

threshold (float) – The threshold value for identifying low values.

create_threshold_binary_features(X, threshold)[source]

Perform feature engineering to add binary indicator columns for values below the threshold. Add each new column right after the corresponding original column.

Parameters:
  • X (numpy.ndarray) – The input 2D matrix of shape (n_samples, n_features).

  • threshold (float) – The threshold value for identifying low values.

Returns:

The updated 2D matrix with binary indicator columns.

Return type:

numpy.ndarray
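
The column layout described above can be sketched in plain Python. The helper name add_low_value_indicators is hypothetical, and the sketch assumes that "below the threshold" means strictly less than; the library may treat the boundary differently.

```python
def add_low_value_indicators(rows, threshold):
    """Insert a 0/1 indicator right after each original column.

    Assumption: the indicator is 1 when value < threshold (strict).
    """
    out = []
    for row in rows:
        new_row = []
        for value in row:
            new_row.append(value)                          # original column
            new_row.append(1 if value < threshold else 0)  # its indicator
        out.append(new_row)
    return out


X = [[0.5, 3.0],
     [2.0, 0.1]]
X_new = add_low_value_indicators(X, threshold=1.0)
# X_new == [[0.5, 1, 3.0, 0], [2.0, 0, 0.1, 1]]
```

An input with n_features columns therefore yields 2 * n_features columns.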

class probnet.helpers.data_handler.TimeSeriesDifferencer(interval=1)[source]

Bases: object

Class used to perform differencing on time series data. This is useful for making the data stationary.

Parameters:

interval (int) – The interval for differencing. Default is 1, which means first difference.

difference(X)[source]
inverse_difference(diff_data)[source]
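
Differencing with an interval, and its inversion, can be sketched as follows. This is a plain-Python illustration, not the class's implementation. In particular, the head argument (the first interval original values) is an assumption of this sketch: differencing alone discards those values, so something must retain them for the inverse to work.

```python
def difference(series, interval=1):
    # d[i] = x[i] - x[i - interval], for i >= interval
    return [series[i] - series[i - interval]
            for i in range(interval, len(series))]


def inverse_difference(diffs, head, interval=1):
    """Rebuild the series from its differences.

    `head` holds the first `interval` original values (an assumption of
    this sketch; the differences alone do not determine them).
    """
    series = list(head)
    for d in diffs:
        series.append(series[-interval] + d)
    return series


x = [3, 5, 9, 10, 14]
d1 = difference(x)               # [2, 4, 1, 4]
d2 = difference(x, interval=2)   # [6, 5, 5]
x_from_d1 = inverse_difference(d1, head=x[:1])
x_from_d2 = inverse_difference(d2, head=x[:2], interval=2)
```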

probnet.helpers.distance module

probnet.helpers.distance.bhattacharyya_distance(x1, x2)[source]

Compute the Bhattacharyya distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Bhattacharyya distance between x1 and x2.

Return type:

float

probnet.helpers.distance.braycurtis_distance(x1, x2)[source]

Compute the Bray-Curtis distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Bray-Curtis distance between x1 and x2.

Return type:

float

probnet.helpers.distance.canberra_distance(x1, x2)[source]

Compute the Canberra distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Canberra distance between x1 and x2.

Return type:

float

probnet.helpers.distance.chebyshev_distance(x1, x2)[source]

Compute the Chebyshev distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Chebyshev distance between x1 and x2.

Return type:

float

probnet.helpers.distance.cityblock_distance(x1, x2)[source]

Compute the Cityblock distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Cityblock distance between x1 and x2.

Return type:

float

probnet.helpers.distance.correlation_distance(x1, x2)[source]

Compute the Correlation distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Correlation distance between x1 and x2.

Return type:

float

probnet.helpers.distance.cosine_distance(x1, x2)[source]

Compute the Cosine distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Cosine distance between x1 and x2.

Return type:

float
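
For reference, the usual definition of cosine distance is one minus the cosine of the angle between the two vectors. The sketch below uses that textbook form in plain Python; it is an illustration under that assumption, not the library's code, and it does not handle zero vectors (for which the quantity is undefined).

```python
import math


def cosine_distance(x1, x2):
    # 1 - (x1 . x2) / (||x1|| * ||x2||)
    dot = sum(a * b for a, b in zip(x1, x2))
    norm1 = math.sqrt(sum(a * a for a in x1))
    norm2 = math.sqrt(sum(b * b for b in x2))
    return 1.0 - dot / (norm1 * norm2)


d_orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors
d_parallel = cosine_distance([1.0, 2.0], [2.0, 4.0])    # same direction
```

Orthogonal vectors give a distance of 1, and vectors pointing in the same direction give a distance of (approximately) 0, regardless of their lengths.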

probnet.helpers.distance.dice_distance(x1, x2)[source]

Compute the Dice distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Dice distance between x1 and x2.

Return type:

float

probnet.helpers.distance.euclidean_distance(x1, x2)[source]

Compute the Euclidean distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Euclidean distance between x1 and x2.

Return type:

float

probnet.helpers.distance.hamming_distance(x1, x2)[source]

Compute the Hamming distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Hamming distance between x1 and x2.

Return type:

float

probnet.helpers.distance.hellinger_distance(x1, x2)[source]

Compute the Hellinger distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Hellinger distance between x1 and x2.

Return type:

float
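
The Hellinger distance is normally defined for two discrete probability distributions p and q as (1/sqrt(2)) * sqrt(sum((sqrt(p_i) - sqrt(q_i))^2)). The sketch below follows that standard form and assumes non-negative inputs that each sum to 1; whether the library normalizes its inputs or uses the 1/sqrt(2) factor is not stated here.

```python
import math


def hellinger_distance(p, q):
    # Sum of squared differences between the square roots of the entries.
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s) / math.sqrt(2.0)


d_same = hellinger_distance([0.2, 0.8], [0.2, 0.8])      # identical
d_disjoint = hellinger_distance([1.0, 0.0], [0.0, 1.0])  # no overlap
```

With this normalization the distance lies in [0, 1]: 0 for identical distributions and 1 for distributions with disjoint support.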

probnet.helpers.distance.jaccard_distance(x1, x2)[source]

Compute the Jaccard distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Jaccard distance between x1 and x2.

Return type:

float

probnet.helpers.distance.jensen_distance(x1, x2)[source]

Compute the Jensen distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Jensen distance between x1 and x2.

Return type:

float

probnet.helpers.distance.jensen_shannon_distance(x1, x2)[source]

Compute the Jensen-Shannon distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Jensen-Shannon distance between x1 and x2.

Return type:

float

probnet.helpers.distance.kappa_distance(x1, x2)[source]

Compute the Kappa distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Kappa distance between x1 and x2.

Return type:

float

probnet.helpers.distance.kulczynski_distance(x1, x2)[source]

Compute the Kulczynski distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Kulczynski distance between x1 and x2.

Return type:

float

probnet.helpers.distance.kulsinski_distance(x1, x2)[source]

Compute the Kulsinski distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Kulsinski distance between x1 and x2.

Return type:

float

probnet.helpers.distance.mahalanobis_distance(x1, x2, VI=None)[source]

Compute the Mahalanobis distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

  • VI (array-like of shape (n_features, n_features), optional) – The inverse covariance matrix. If None, the covariance matrix is used.

Returns:

distance – The Mahalanobis distance between x1 and x2.

Return type:

float
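
When the inverse covariance matrix VI is supplied, the Mahalanobis distance is sqrt((x1 - x2)^T VI (x1 - x2)). The sketch below implements only that case in plain Python with nested lists; how the library behaves when VI is None is not covered here.

```python
import math


def mahalanobis_distance(x1, x2, VI):
    diff = [a - b for a, b in zip(x1, x2)]
    n = len(diff)
    # Matrix-vector product VI @ diff
    tmp = [sum(VI[i][j] * diff[j] for j in range(n)) for i in range(n)]
    # Quadratic form diff^T (VI @ diff)
    return math.sqrt(sum(d * t for d, t in zip(diff, tmp)))


# With the identity matrix the result reduces to the Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
d_euclid = mahalanobis_distance([0.0, 0.0], [3.0, 4.0], identity)  # 5.0

# Covariance diag(4, 1) has inverse diag(0.25, 1): the first axis is
# measured in units of its standard deviation (2).
VI = [[0.25, 0.0], [0.0, 1.0]]
d_scaled = mahalanobis_distance([0.0, 0.0], [2.0, 0.0], VI)  # 1.0
```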

probnet.helpers.distance.manhattan_distance(x1, x2)[source]

Compute the Manhattan distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Manhattan distance between x1 and x2.

Return type:

float

probnet.helpers.distance.minkowski_distance(x1, x2, p=3)[source]

Compute the Minkowski distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

  • p (int, optional) – The order of the norm. Default is 3.

Returns:

distance – The Minkowski distance between x1 and x2.

Return type:

float
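
The Minkowski distance of order p is (sum(|x1_i - x2_i|^p))^(1/p); p=1 gives the Manhattan distance and p=2 the Euclidean distance. A minimal plain-Python sketch of that formula (an illustration, not the library's code):

```python
def minkowski_distance(x1, x2, p=3):
    # (sum of |difference|^p) ^ (1/p)
    return sum(abs(a - b) ** p for a, b in zip(x1, x2)) ** (1.0 / p)


a, b = [0.0, 0.0], [3.0, 4.0]
d1 = minkowski_distance(a, b, p=1)  # Manhattan: 3 + 4 = 7
d2 = minkowski_distance(a, b, p=2)  # Euclidean: sqrt(9 + 16) = 5
d3 = minkowski_distance(a, b)       # default p=3: (27 + 64) ** (1/3)
```

As p grows, the largest coordinate difference dominates, so the value moves from the Manhattan distance toward the Chebyshev distance (here 4).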

probnet.helpers.distance.morisita_distance(x1, x2)[source]

Compute the Morisita distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Morisita distance between x1 and x2.

Return type:

float

probnet.helpers.distance.morisita_horn_distance(x1, x2)[source]

Compute the Morisita-Horn distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Morisita-Horn distance between x1 and x2.

Return type:

float

probnet.helpers.distance.rogers_distance(x1, x2)[source]

Compute the Rogers distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Rogers distance between x1 and x2.

Return type:

float

probnet.helpers.distance.rogers_tanimoto_distance(x1, x2)[source]

Compute the Rogers-Tanimoto distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Rogers-Tanimoto distance between x1 and x2.

Return type:

float

probnet.helpers.distance.russellrao_distance(x1, x2)[source]

Compute the Russell-Rao distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Russell-Rao distance between x1 and x2.

Return type:

float

probnet.helpers.distance.sokalmichener_distance(x1, x2)[source]

Compute the Sokal-Michener distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Sokal-Michener distance between x1 and x2.

Return type:

float

probnet.helpers.distance.sokalsneath_distance(x1, x2)[source]

Compute the Sokal-Sneath distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Sokal-Sneath distance between x1 and x2.

Return type:

float

probnet.helpers.distance.yule_distance(x1, x2)[source]

Compute the Yule distance between two points x1 and x2.

Parameters:
  • x1 (array-like of shape (n_features,)) – First point.

  • x2 (array-like of shape (n_features,)) – Second point.

Returns:

distance – The Yule distance between x1 and x2.

Return type:

float

probnet.helpers.kernel module

probnet.helpers.kernel.bessel_kernel(dists, sigma)[source]

Compute the Bessel kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Bessel kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.cauchy_kernel(dists, sigma)[source]

Compute the Cauchy kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Cauchy kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.cosine_kernel(dists, sigma)[source]

Compute the Cosine kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Cosine kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.epanechnikov_kernel(dists, sigma)[source]

Compute the Epanechnikov kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Epanechnikov kernel values.

Return type:

np.ndarray
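
The textbook Epanechnikov kernel is K(u) = 0.75 * (1 - u^2) for |u| <= 1 and 0 otherwise, with u = dists / sigma. The sketch below assumes that form, including the 0.75 normalizing constant, which the library may or may not apply. It works on a plain list rather than an np.ndarray.

```python
def epanechnikov_kernel(dists, sigma):
    out = []
    for d in dists:
        u = d / sigma  # distance in units of the bandwidth
        out.append(0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0)
    return out


vals = epanechnikov_kernel([0.0, 1.0, 2.0, 3.0], sigma=2.0)
# vals == [0.75, 0.5625, 0.0, 0.0]
```

Unlike the Gaussian kernel, this kernel has compact support: points farther away than sigma contribute exactly zero.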

probnet.helpers.kernel.exponential_kernel(dists, sigma)[source]

Compute the Exponential kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Exponential kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.gaussian_kernel(dists, sigma)[source]

Compute the Gaussian kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Gaussian kernel values.

Return type:

np.ndarray
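
A common form of the Gaussian kernel is exp(-dists^2 / (2 * sigma^2)). The sketch below assumes that unnormalized form; the library may scale the exponent or the result differently. It works on a plain list rather than an np.ndarray.

```python
import math


def gaussian_kernel(dists, sigma):
    # exp(-d^2 / (2 * sigma^2)) for each distance d
    return [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in dists]


vals = gaussian_kernel([0.0, 1.0, 2.0], sigma=1.0)
wide = gaussian_kernel([0.0, 1.0, 2.0], sigma=2.0)
```

A distance of zero always maps to 1, values decay smoothly with distance, and a larger sigma makes the decay slower.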

probnet.helpers.kernel.inverse_multiquadric_kernel(dists, sigma)[source]

Compute the Inverse Multiquadric kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Inverse Multiquadric kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.laplace_kernel(dists, sigma)[source]

Compute the Laplace kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Laplace kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.linear_kernel(dists, sigma)[source]

Compute the Linear kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Linear kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.logistic_kernel(dists, sigma)[source]

Compute the Logistic kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Logistic kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.multiquadric_kernel(dists, sigma)[source]

Compute the Multiquadric kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Multiquadric kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.power_kernel(dists, sigma)[source]

Compute the Power kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Power kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.quartic_kernel(dists, sigma)[source]

Compute the Quartic kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Quartic kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.rational_quadratic_kernel(dists, sigma)[source]

Compute the Rational Quadratic kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Rational Quadratic kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.sigmoid_kernel(dists, sigma)[source]

Compute the Sigmoid kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Sigmoid kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.triangular_kernel(dists, sigma)[source]

Compute the Triangular kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Triangular kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.uniform_kernel(dists, sigma)[source]

Compute the Uniform kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Uniform kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.vonmises_fisher_kernel(dists, sigma)[source]

Compute the Von Mises-Fisher kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Von Mises-Fisher kernel values.

Return type:

np.ndarray

probnet.helpers.kernel.vonmises_kernel(dists, sigma)[source]

Compute the Von Mises kernel.

Parameters:
  • dists (np.ndarray) – The distances between points.

  • sigma (float) – The bandwidth parameter.

Returns:

The computed Von Mises kernel values.

Return type:

np.ndarray

probnet.helpers.metrics module

probnet.helpers.metrics.get_all_classification_metrics()[source]

Gets a dictionary of all supported classification metrics.

This function returns a dictionary where keys are metric names and values are their optimization types (“min” or “max”).

Returns:

A dictionary containing all supported classification metrics.

Return type:

dict

probnet.helpers.metrics.get_all_regression_metrics()[source]

Gets a dictionary of all supported regression metrics.

This function returns a dictionary where keys are metric names and values are their optimization types (“min” or “max”).

Returns:

A dictionary containing all supported regression metrics.

Return type:

dict
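
One use for the name-to-optimization-type mapping is to decide which of several scores is best. The sketch below is hypothetical: metric_types only imitates the shape of the returned dictionary (the keys "MSE" and "R2" are illustrative assumptions), and best_index is not a probnet function.

```python
# Hypothetical excerpt shaped like the returned dictionary:
# metric name -> optimization type ("min" or "max").
metric_types = {"MSE": "min", "R2": "max"}


def best_index(scores, metric_name, types=metric_types):
    """Return the index of the best score for the given metric."""
    pick = min if types[metric_name] == "min" else max
    return pick(range(len(scores)), key=lambda i: scores[i])


best_mse = best_index([0.30, 0.12, 0.25], "MSE")  # lowest error wins
best_r2 = best_index([0.71, 0.64, 0.88], "R2")    # highest score wins
```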

probnet.helpers.metrics.get_metric_sklearn(task='classification', metric_names=None)[source]

Creates a dictionary of scorers for scikit-learn cross-validation.

This function takes the task type (classification or regression) and a list of metric names. It creates an appropriate metrics instance (ClassificationMetric or RegressionMetric) and iterates through the provided metric names. For each metric name, it checks if it exists in the metrics instance and retrieves the corresponding method. Finally, it uses make_scorer to convert the method to a scorer and adds it to a dictionary.

Parameters:
  • task (str, optional) – The task type, either “classification” or “regression”. Defaults to “classification”.

  • metric_names (list, optional) – A list of metric names. Defaults to None.

Returns:

A dictionary of scorers for scikit-learn cross-validation.

Return type:

dict

probnet.helpers.metrics.get_metrics(problem, y_true, y_pred, metrics=None, testcase='test')[source]

Calculates metrics for regression or classification tasks.

This function takes the true labels (y_true), predicted labels (y_pred), problem type (regression or classification), a dictionary or list of metrics to calculate, and an optional test case name. It returns a dictionary containing the calculated metrics with descriptive names.

Parameters:
  • problem (str) – The type of problem, either “regression” or “classification”.

  • y_true (array-like) – The true labels.

  • y_pred (array-like) – The predicted labels.

  • metrics (dict or list, optional) – A dictionary or list of metrics to calculate. Defaults to None.

  • testcase (str, optional) – An optional test case name to prepend to the metric names. Defaults to “test”.

Returns:

A dictionary containing the calculated metrics with descriptive names.

Return type:

dict

Raises:

ValueError – If the metrics parameter is not a list or dictionary.

probnet.helpers.scaler module

class probnet.helpers.scaler.BoxCoxScaler(lmbda=None)[source]

Bases: BaseEstimator, TransformerMixin

Apply the Box-Cox transformation to stabilize variance and make the data more normally distributed. The Box-Cox transformation is only defined for positive data.

fit(X, y=None)[source]
inverse_transform(X)[source]
transform(X)[source]
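
For a fixed lambda, the Box-Cox transformation is (x^lambda - 1) / lambda when lambda is non-zero and log(x) when lambda is zero. The sketch below shows the scalar formula and its inverse in plain Python. It assumes a user-chosen lambda; how the class handles lmbda=None is not covered here.

```python
import math


def box_cox(x, lmbda):
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lmbda == 0:
        return math.log(x)
    return (x ** lmbda - 1.0) / lmbda


def inverse_box_cox(y, lmbda):
    if lmbda == 0:
        return math.exp(y)
    return (lmbda * y + 1.0) ** (1.0 / lmbda)


y = box_cox(4.0, lmbda=0.5)       # (sqrt(4) - 1) / 0.5 = 2.0
x_back = inverse_box_cox(y, 0.5)  # (0.5 * 2 + 1) ** 2 = 4.0
y_log = box_cox(1.0, lmbda=0)     # log(1) = 0.0
```
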
class probnet.helpers.scaler.LabelEncoder[source]

Bases: object

Encode categorical features as integer labels.

fit(y)[source]

Fit label encoder to a given set of labels.

Parameters:

y (array-like) – Labels to encode.

fit_transform(y)[source]

Fit label encoder and return encoded labels.

Parameters:

y (array-like of shape (n_samples,)) – Target values.

Returns:

y – Encoded labels.

Return type:

array-like of shape (n_samples,)

inverse_transform(y)[source]

Transform integer labels to original labels.

Parameters:

y (array-like) – Encoded integer labels.

Returns:

original_labels – Original labels.

Return type:

array-like

transform(y)[source]

Transform labels to encoded integer labels.

Parameters:
  • y (array-like) – Labels to encode.

Returns:

encoded_labels – Encoded integer labels.

Return type:

array-like
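
The fit / transform / inverse_transform round trip can be sketched with a small stand-in class. TinyLabelEncoder is hypothetical, and assigning codes in sorted order of the unique labels is an assumption of this sketch, not a documented property of the library's LabelEncoder.

```python
class TinyLabelEncoder:
    """Hypothetical stand-in; codes follow the sorted unique labels."""

    def fit(self, y):
        self.classes_ = sorted(set(y))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, y):
        return [self._index[label] for label in y]

    def fit_transform(self, y):
        return self.fit(y).transform(y)

    def inverse_transform(self, codes):
        return [self.classes_[c] for c in codes]


enc = TinyLabelEncoder()
original = ["cat", "dog", "cat", "bird"]
codes = enc.fit_transform(original)     # [1, 2, 1, 0]
labels = enc.inverse_transform(codes)   # back to the original labels
```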

class probnet.helpers.scaler.Log1pScaler[source]

Bases: BaseEstimator, TransformerMixin

Apply the log1p transformation, log(1 + x), to each element of the input data. This is useful for transforming data that may have a long tail distribution.

fit(X, y=None)[source]
inverse_transform(X)[source]
transform(X)[source]
class probnet.helpers.scaler.LogeScaler[source]

Bases: BaseEstimator, TransformerMixin

Apply the natural logarithm (base e) to each element of the input data. This is useful for transforming data that may have a long tail distribution.

fit(X, y=None)[source]
inverse_transform(X)[source]
transform(X)[source]
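
Going by the class names, the difference between Log1pScaler and LogeScaler is presumably log(1 + x) versus log(x). The sketch below contrasts the two transforms and their inverses using the standard library; the function names are hypothetical and the mapping to the two classes is an assumption.

```python
import math


def log1p_transform(xs):
    return [math.log1p(x) for x in xs]  # log(1 + x), defined at x = 0


def log1p_inverse(ys):
    return [math.expm1(y) for y in ys]  # exp(y) - 1


def loge_transform(xs):
    return [math.log(x) for x in xs]    # log(x), requires x > 0


def loge_inverse(ys):
    return [math.exp(y) for y in ys]


data = [0.0, 1.0, 9.0]
t1 = log1p_transform(data)              # zero is a valid input here
back1 = log1p_inverse(t1)

positive = [1.0, 10.0]
t2 = loge_transform(positive)           # zero would raise ValueError
back2 = loge_inverse(t2)
```
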
class probnet.helpers.scaler.ObjectiveScaler(obj_name='sigmoid', ohe_scaler=None)[source]

Bases: object

Scales labels for classification tasks (binary and multi-class).

inverse_transform(data)[source]
transform(data)[source]
class probnet.helpers.scaler.OneHotEncoder[source]

Bases: object

Encode categorical features as a one-hot numeric array. This is useful for converting categorical variables into a format that can be provided to ML algorithms.

fit(X)[source]

Fit the encoder to unique categories in X.

fit_transform(X)[source]

Fit the encoder to X and transform X.

inverse_transform(one_hot)[source]

Convert one-hot encoded format back to original categories.

transform(X)[source]

Transform X into one-hot encoded format.
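
The four methods above can be illustrated with a small stand-in for a single categorical column. TinyOneHotEncoder is hypothetical, and ordering the output columns by the sorted categories is an assumption of this sketch.

```python
class TinyOneHotEncoder:
    """Hypothetical stand-in for one categorical column."""

    def fit(self, X):
        self.categories_ = sorted(set(X))
        return self

    def transform(self, X):
        # One output column per category, 1 where the value matches.
        return [[1 if x == c else 0 for c in self.categories_] for x in X]

    def fit_transform(self, X):
        return self.fit(X).transform(X)

    def inverse_transform(self, one_hot):
        # The position of the 1 in each row identifies the category.
        return [self.categories_[row.index(1)] for row in one_hot]


enc = TinyOneHotEncoder()
colors = ["red", "green", "red"]
encoded = enc.fit_transform(colors)       # [[0, 1], [1, 0], [0, 1]]
decoded = enc.inverse_transform(encoded)  # ["red", "green", "red"]
```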

class probnet.helpers.scaler.SinhArcSinhScaler(epsilon=0.1, delta=1.0)[source]

Bases: BaseEstimator, TransformerMixin

Apply the sinh-arc-sinh transformation to increase the kurtosis and skewness of a normally distributed random variable. This transformation is useful for data that are normally distributed but need to be transformed to have higher kurtosis and skewness.

fit(X, y=None)[source]
inverse_transform(X)[source]
transform(X)[source]
class probnet.helpers.scaler.SqrtScaler[source]

Bases: BaseEstimator, TransformerMixin

Apply the square root transformation to each element of the input data. This is useful for transforming data that may have a long tail distribution.

fit(X, y=None)[source]
inverse_transform(X)[source]
transform(X)[source]
class probnet.helpers.scaler.YeoJohnsonScaler(lmbda=None)[source]

Bases: BaseEstimator, TransformerMixin

Apply the Yeo-Johnson transformation to stabilize variance and make the data more normally distributed. The Yeo-Johnson transformation can handle both positive and negative data.

fit(X, y=None)[source]
inverse_transform(X)[source]
transform(X)[source]

probnet.helpers.validator module

probnet.helpers.validator.check_bool(name: str, value: bool, bound=(True, False))[source]

Checks if a value is a boolean and optionally verifies it matches a specified bound.

Parameters:
  • name (str) – The name of the variable being checked.

  • value (bool) – The value to check.

  • bound (tuple, optional) – A tuple of allowed boolean values.

Returns:

The validated boolean value.

Return type:

bool

Raises:

ValueError – If the value is not a boolean or not in the bound (if provided).

probnet.helpers.validator.check_float(name: str, value: None, bound=None)[source]

Checks if a value is a float and optionally verifies it falls within a specified bound.

Parameters:
  • name (str) – The name of the variable being checked.

  • value (int or float) – The value to check.

  • bound (tuple, optional) – A tuple representing the lower and upper bound (inclusive).

Returns:

The validated float value.

Return type:

float

Raises:

ValueError – If the value is not a float or falls outside the bound (if provided).

probnet.helpers.validator.check_int(name: str, value: None, bound=None)[source]

Checks if a value is an integer and optionally verifies it falls within a specified bound.

Parameters:
  • name (str) – The name of the variable being checked.

  • value (int or float) – The value to check.

  • bound (tuple, optional) – A tuple representing the lower and upper bound (inclusive).

Returns:

The validated integer value.

Return type:

int

Raises:

ValueError – If the value is not an integer or falls outside the bound (if provided).

probnet.helpers.validator.check_str(name: str, value: str, bound=None)[source]

Checks if a value is a string and optionally verifies it exists within a provided list.

Parameters:
  • name (str) – The name of the variable being checked.

  • value (str) – The value to check.

  • bound (list, optional) – A list of allowed string values.

Returns:

The validated string value.

Return type:

str

Raises:

ValueError – If the value is not a string or not found in the bound list (if provided).

probnet.helpers.validator.check_tuple_float(name: str, values: tuple, bounds=None)[source]

Checks if a tuple contains only floats or integers and optionally verifies they fall within specified bounds.

Parameters:
  • name (str) – The name of the variable being checked.

  • values (tuple) – The tuple of values to check.

  • bounds (list of tuples, optional) – A list of tuples representing lower and upper bounds for each value.

Returns:

The validated tuple of floats.

Return type:

tuple

Raises:

ValueError – If the values are not all floats or integers or do not fall within the specified bounds.

probnet.helpers.validator.check_tuple_int(name: str, values: None, bounds=None)[source]

Checks if a tuple contains only integers and optionally verifies they fall within specified bounds.

Parameters:
  • name (str) – The name of the variable being checked.

  • values (tuple) – The tuple of values to check.

  • bounds (list of tuples, optional) – A list of tuples representing lower and upper bounds for each value.

Returns:

The validated tuple of integers.

Return type:

tuple

Raises:

ValueError – If the values are not all integers or do not fall within the specified bounds.

probnet.helpers.validator.is_in_bound(value, bound)[source]

Checks if a value falls within a specified numerical bound.

Parameters:
  • value (float) – The value to check.

  • bound (tuple) – A tuple representing the lower and upper bound (inclusive for lists).

Returns:

True if the value is within the bound, False otherwise.

Return type:

bool

Raises:

ValueError – If the bound is not a tuple or list.
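
One possible reading of "inclusive for lists" is that a list bound is a closed interval while a tuple bound is an open one. The sketch below implements that reading; it is an assumption about the intended behavior, not a confirmed description of the library's function.

```python
import operator


def is_in_bound(value, bound):
    if isinstance(bound, tuple):
        cmp = operator.lt   # assumed: tuple -> endpoints excluded
    elif isinstance(bound, list):
        cmp = operator.le   # assumed: list -> endpoints included
    else:
        raise ValueError("bound must be a tuple or a list")
    lower, upper = bound
    return cmp(lower, value) and cmp(value, upper)


inside = is_in_bound(3, (1, 5))        # True
edge_open = is_in_bound(1, (1, 5))     # False under this reading
edge_closed = is_in_bound(1, [1, 5])   # True
unbounded = is_in_bound(7.5, (float("-inf"), float("inf")))  # True
```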

probnet.helpers.validator.is_str_in_list(value: str, my_list: list)[source]

Checks if a string value exists within a provided list.

Parameters:
  • value (str) – The string value to check.

  • my_list (list) – The list of possible values.

Returns:

True if the value is in the list, False otherwise.

Return type:

bool