MLBoh
Functions
- mlboh.mlboh.manual_parallel_cv_threads(estimator, X, y, cv, metric, max_workers=4, *args, **kwargs)
Run cross-validation manually using multiple threads.
- Parameters:
estimator (BaseEstimator) – Machine learning estimator pipeline to use
X (np.ndarray) – Input variables/features to analyze without train/test subdivision
y (np.ndarray) – Input labels/ground-truths to analyze without train/test subdivision
cv – Cross-validation splitter that supplies the train and test indices for each fold
metric (callable) – Sklearn-like callable function in which the first argument is the y_true list and the second is the y_pred list
max_workers (int, default: 4) – Maximum number of threads to use for the parallelization
- Returns:
score – Values returned by the metric function for the labels predicted by the provided ML model, one entry per cross-validation fold.
- Return type:
np.ndarray
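The thread-based variant can be pictured with the following minimal sketch. It is not the library's code: it assumes that `cv` boils down to an iterable of `(train_idx, test_idx)` pairs and that the pool is a standard `concurrent.futures.ThreadPoolExecutor`. The names `MeanRegressor`, `mean_absolute_error`, `score_fold`, and `cv_threads` are hypothetical, and plain lists stand in for `np.ndarray` so the example needs only the standard library.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class MeanRegressor:
    """Hypothetical toy estimator: predicts the mean of the training labels."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def mean_absolute_error(y_true, y_pred):
    """Sklearn-like metric: y_true first, y_pred second."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def score_fold(estimator, X, y, train_idx, test_idx, metric):
    """Fit a private copy of the estimator on one fold and score it."""
    model = copy.deepcopy(estimator)  # each task works on its own copy
    model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
    y_pred = model.predict([X[i] for i in test_idx])
    return metric([y[i] for i in test_idx], y_pred)


def cv_threads(estimator, X, y, folds, metric, max_workers=4):
    """Score every (train_idx, test_idx) pair in `folds` on a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(score_fold, estimator, X, y, train_idx, test_idx, metric)
            for train_idx, test_idx in folds
        ]
        # Collect in submission order so scores[i] belongs to folds[i].
        return [future.result() for future in futures]


X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 2.0, 3.0]
folds = [([0, 1], [2, 3]), ([2, 3], [0, 1])]
scores = cv_threads(MeanRegressor(), X, y, folds, mean_absolute_error, max_workers=2)
```

Results are gathered in submission order rather than completion order, so each score can be matched to its fold. The sketch returns a plain list, whereas the documented return type is `np.ndarray`.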
- mlboh.mlboh.manual_parallel_cv_processes(estimator, X, y, cv, metric, max_workers=4, *args, **kwargs)
Run cross-validation manually using multiple processes.
- Parameters:
estimator (BaseEstimator) – Machine learning estimator pipeline to use
X (np.ndarray) – Input variables/features to analyze without train/test subdivision
y (np.ndarray) – Input labels/ground-truths to analyze without train/test subdivision
cv – Cross-validation splitter that supplies the train and test indices for each fold
metric (callable) – Sklearn-like callable function in which the first argument is the y_true list and the second is the y_pred list
max_workers (int, default: 4) – Maximum number of processes to use for the parallelization
- Returns:
score – Values returned by the metric function for the labels predicted by the provided ML model, one entry per cross-validation fold.
- Return type:
np.ndarray
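With processes, every task has to be pickled and sent to a child process, which constrains what can be passed as `estimator` and `metric`. The sketch below illustrates this under stated assumptions: `cv` is reduced to an iterable of `(train_idx, test_idx)` pairs and the pool is a standard `concurrent.futures.ProcessPoolExecutor`. The names `MajorityClassifier`, `accuracy`, `run_fold`, and `cv_processes` are hypothetical and do not come from the library.

```python
from concurrent.futures import ProcessPoolExecutor


class MajorityClassifier:
    """Hypothetical toy estimator: predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_] * len(X)


def accuracy(y_true, y_pred):
    """Sklearn-like metric: y_true first, y_pred second."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def run_fold(task):
    """Worker for one fold; defined at module level so it can be pickled."""
    estimator, X, y, train_idx, test_idx, metric = task
    # In a child process `estimator` is an unpickled copy of the parent's
    # object, so fitting it here leaves the parent's estimator untouched.
    estimator.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
    y_pred = estimator.predict([X[i] for i in test_idx])
    return metric([y[i] for i in test_idx], y_pred)


def cv_processes(estimator, X, y, folds, metric, max_workers=4):
    """Score every (train_idx, test_idx) pair in `folds` on a process pool."""
    tasks = [(estimator, X, y, train_idx, test_idx, metric)
             for train_idx, test_idx in folds]
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the order of `tasks`.
        return list(pool.map(run_fold, tasks))
```

On platforms that start workers by spawning a fresh interpreter, a call such as `cv_processes(MajorityClassifier(), X, y, folds, accuracy)` should sit under an `if __name__ == "__main__":` guard. Lambdas and locally defined functions cannot be pickled, so the metric and the worker need to be module-level callables.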
- mlboh.mlboh._train_and_score(estimator, X, y, train_idx, test_idx, metric, *args, **kwargs)
Fit the estimator pipeline on a single fold, predict on the test indices, and score the predictions.
- Parameters:
estimator (BaseEstimator) – Machine learning estimator pipeline to use
X (np.ndarray) – Input variables/features to analyze without train/test subdivision
y (np.ndarray) – Input labels/ground-truths to analyze without train/test subdivision
train_idx (list) – List of indices to use for the train set
test_idx (list) – List of indices to use for the test set
metric (callable) – Sklearn-like callable function in which the first argument is the y_true list and the second is the y_pred list
- Returns:
score – Output of the metric function applied to the labels predicted by the provided ML model.
- Return type:
float
Notes
This function makes an internal copy of the provided estimator. This is particularly important for thread-based parallelism, where all the variables involved are shared among the threads: if the pipeline is not copied manually, a slow thread could find the estimator already fitted by another thread and skip its own fit, which introduces errors in the data management.
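The effect of sharing one estimator object can be shown without any threads at all. The sketch below is illustrative only: `MeanRegressor` is a hypothetical toy estimator, and `copy.deepcopy` stands in for whatever copy the library actually performs (for scikit-learn estimators, `sklearn.base.clone` is the usual tool, though the source does not say which one is used).

```python
import copy


class MeanRegressor:
    """Hypothetical toy estimator: predicts the mean of the training labels."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


base = MeanRegressor()
X = [[0.0], [1.0]]

# Without a copy: both "folds" fit the same object, so the second fit
# overwrites the state the first fold was about to predict with.
fold_a = base.fit(X, [1.0, 1.0])
fold_b = base.fit(X, [5.0, 5.0])
shared_prediction = fold_a.predict([[0.0]])[0]  # reflects fold B's labels

# With a copy per fold: each fold keeps its own fitted state.
copy_a = copy.deepcopy(base).fit(X, [1.0, 1.0])
copy_b = copy.deepcopy(base).fit(X, [5.0, 5.0])
isolated_prediction = copy_a.predict([[0.0]])[0]
```

Here `fold_a` and `fold_b` are the same object, so fold A ends up predicting from fold B's training labels. With threads the same overwrite can happen between one thread's `fit` and its `predict`, which is why a per-fold copy matters.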