library.phases.phases_implementation.dataset.split.strategies.base module
- class library.phases.phases_implementation.dataset.split.strategies.base.Split(dataset)
Bases: ABC
- plot_per_set_distribution(features: list[str], save_plots: bool = False, save_path: str = None)
Plots the distribution of each given feature for the training, validation, and test sets. This is useful for checking that the three sets have similar statistical distributions. Note: for high-dimensional datasets, this can be computationally expensive.
Parameters:
- features: list[str]
The names of the features to plot
- save_plots: bool
Whether to save the plots
- save_path: str
The path to save the plots
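The kind of comparison this method supports can be sketched without the library. The snippet below is not the implementation of `plot_per_set_distribution`; it is a minimal, standard-library-only illustration that bins one feature with shared bin edges across the three sets, so the resulting fractions are directly comparable. The function name `per_set_histograms`, the `bins` parameter, and the row-dict data layout are all assumptions made for this example.

```python
def per_set_histograms(sets, feature, bins=10):
    """Return, for each set, the fraction of rows that fall in each bin.

    Hypothetical helper for illustration only.
    `sets` maps a set name (e.g. "train") to a list of row dicts.
    All sets share the same bin edges so their histograms are comparable.
    """
    values = {name: [row[feature] for row in rows] for name, rows in sets.items()}

    # Shared range across every set, so bin i means the same thing everywhere.
    lo = min(min(v) for v in values.values())
    hi = max(max(v) for v in values.values())
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # all values identical: everything lands in the first bin

    histograms = {}
    for name, vals in values.items():
        counts = [0] * bins
        for v in vals:
            # Clamp so the maximum value falls in the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # Normalise to fractions because the sets differ in size.
        histograms[name] = [c / len(vals) for c in counts]
    return histograms


example_sets = {
    "train": [{"x": 0}, {"x": 1}, {"x": 2}, {"x": 3}],
    "val": [{"x": 0}, {"x": 4}],
    "test": [{"x": 4}],
}
example_hist = per_set_histograms(example_sets, "x", bins=2)
for set_name, fractions in example_hist.items():
    print(set_name, fractions)
```

Normalising to fractions rather than raw counts matters here because the training set is usually much larger than the validation and test sets. Sets whose per-bin fractions differ noticeably for a feature are candidates for a closer look. The cost grows with the number of features, which is consistent with the note above about high-dimensional datasets.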