library.pipeline.pipeline_manager module
- class library.pipeline.pipeline_manager.PipelineManager(pipelines: dict[str, dict[str, Pipeline]], serializer_type: str = 'joblib', variables: dict = None)
Bases:
object
Trains and evaluates all pipelines.
- all_pipelines_execute(methodName: str, verbose: bool = False, exclude_category: str = '', exclude_pipeline_names: list[str] = [], **kwargs)
Executes a method on all pipelines, using threading for parallelization. The method name can use dot notation for nested attributes (e.g. “model.fit”).
Note on verbose output: if a given pipeline does not appear in the results, it has already been processed (it is a copy of another pipeline).
- Parameters:
methodName (str) – The name of the method to execute, as defined in the phases implementation.
verbose (bool) – Whether to print the results returned by the method to stdout.
exclude_category (str) – The category to exclude from execution (either baseline or not_baseline).
exclude_pipeline_names (list[str]) – The pipeline names to exclude from the execution.
**kwargs (dict) – Additional keyword arguments that are method-specific.
- Returns:
results – The results of the execution.
- Return type:
dict
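The dispatch described above can be sketched as follows. This is a minimal illustration rather than the library's implementation: `DummyModel`, `DummyPipeline`, and `execute_for_all` are hypothetical names, and both the identity-based skipping of shared pipelines and the shape of the returned dict are assumptions drawn from the note on verbose output.

```python
from concurrent.futures import ThreadPoolExecutor
from operator import attrgetter


class DummyModel:
    """Hypothetical stand-in for a model held by a pipeline."""

    def fit(self, scale=1):
        return 10 * scale


class DummyPipeline:
    """Hypothetical stand-in for a Pipeline; not the library's class."""

    def __init__(self):
        self.model = DummyModel()


def execute_for_all(pipelines, method_name, exclude_category="",
                    exclude_pipeline_names=(), **kwargs):
    """Call `method_name` once per distinct pipeline object, using threads."""
    resolve = attrgetter(method_name)  # handles dotted paths such as "model.fit"
    seen = set()   # ids of pipeline objects that were already scheduled
    futures = {}
    with ThreadPoolExecutor() as pool:
        for category, by_name in pipelines.items():
            if category == exclude_category:
                continue
            for name, pipeline in by_name.items():
                if name in exclude_pipeline_names or id(pipeline) in seen:
                    continue  # excluded, or a shared object already handled
                seen.add(id(pipeline))
                futures[(category, name)] = pool.submit(resolve(pipeline), **kwargs)
        # Block until every submitted call has finished
        return {key: future.result() for key, future in futures.items()}


shared = DummyPipeline()
pipelines = {
    "baseline": {"a": shared},
    "not_baseline": {"b": shared, "c": DummyPipeline()},
}
results = execute_for_all(pipelines, "model.fit", scale=2)
```

Here "b" is absent from `results` because it refers to the same object as "a", which mirrors the behaviour the verbose note describes.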
- create_pipeline_divergence(category: str, pipelineName: str, print_results: bool = False) → Pipeline
Initially, all pipelines point to the same object. This method copies the pipeline in its current state, creating a new, independent pipeline object. Changes to this pipeline then affect only the copy.
- Parameters:
category (str) – The category to create a divergence for.
pipelineName (str) – The pipeline name to create a divergence for.
print_results (bool) – Whether to print the results.
- Returns:
newPipeline – The new pipeline object.
Return type:
Pipeline
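The divergence idea can be illustrated with a deep copy. This is a sketch under assumptions: `DummyPipeline` and `diverge` are hypothetical, and using `copy.deepcopy` is one plausible way to obtain an independent object, not necessarily what the library does.

```python
import copy


class DummyPipeline:
    """Hypothetical stand-in for a Pipeline; not the library's class."""

    def __init__(self):
        self.params = {"max_depth": 3}


def diverge(pipelines, category, pipeline_name):
    """Replace one shared entry with an independent deep copy and return it."""
    new_pipeline = copy.deepcopy(pipelines[category][pipeline_name])
    pipelines[category][pipeline_name] = new_pipeline
    return new_pipeline


shared = DummyPipeline()
pipelines = {"not_baseline": {"rf": shared, "xgb": shared}}

rf = diverge(pipelines, "not_baseline", "rf")
rf.params["max_depth"] = 8  # only the diverged copy changes
```

After the call, "xgb" still refers to the original shared object, so its parameters are unchanged.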
- deserialize_models(models_to_deserialize: dict[str, str])
Deserializes the models.
- Parameters:
models_to_deserialize (dict[str, str]) – The models to deserialize.
- deserialize_pipelines(pipelines_to_deserialize: dict[str, str]) → None
Deserializes the pipelines.
- Parameters:
pipelines_to_deserialize (dict[str, str]) – The pipelines to deserialize.
- evaluate_store_final_models()
Evaluates and stores the final models (post-tuning).
- Return type:
None
- property pipeline_state
- select_best_performing_model(metric: str)
Selects the best performing model based on the classification report.
- Parameters:
metric (str) – The metric to use to select the best performing model.
- Returns:
best_model_name (str) – The name of the best performing model.
best_score (float) – The score of the best performing model.
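As a rough illustration of selecting by metric, the sketch below assumes each model has a report shaped like the dict returned by scikit-learn's `classification_report(..., output_dict=True)`, and that the "weighted avg" row is the one compared. Both are assumptions; `select_best` and the report values are hypothetical and purely illustrative.

```python
def select_best(reports, metric, average="weighted avg"):
    """Return (name, score) for the report with the highest `metric`."""
    scores = {name: report[average][metric] for name, report in reports.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name]


# Illustrative values only, not real results
reports = {
    "logreg": {"weighted avg": {"precision": 0.81, "recall": 0.80, "f1-score": 0.80}},
    "rf": {"weighted avg": {"precision": 0.86, "recall": 0.84, "f1-score": 0.85}},
}
best_model_name, best_score = select_best(reports, "f1-score")
```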