MonoPipeline¶
The MonoPipeline
class is part of the framework3.plugins.pipelines.parallel
module and is designed to run multiple pipelines in parallel, combining their outputs to create new features. This pipeline is ideal for scenarios where different models or transformations need to be applied simultaneously to the same dataset.
Module Contents¶
framework3.plugins.pipelines.parallel.parallel_mono_pipeline
¶
MonoPipeline
¶
Bases: ParallelPipeline
A pipeline that combines multiple filters in parallel and constructs new features from their outputs.
This pipeline allows for simultaneous execution of multiple filters on the same input data, and then combines their outputs to create new features. It's particularly useful for feature engineering and ensemble methods.
Key Features
- Parallel execution of multiple filters
- Combination of filter outputs for feature construction
- Support for evaluation metrics
Usage
from framework3.plugins.pipelines.parallel import MonoPipeline
from framework3.plugins.filters.transformation import PCAPlugin
from framework3.plugins.filters.classification import KnnFilter
from framework3.plugins.metrics import F1Score
from framework3.base import XYData
pipeline = MonoPipeline(
filters=[
PCAPlugin(n_components=2),
KnnFilter(n_neighbors=3)
],
metrics=[F1Score()]
)
x_train = XYData(...)
y_train = XYData(...)
x_test = XYData(...)
y_test = XYData(...)
pipeline.fit(x_train, y_train)
predictions = pipeline.predict(x_test)
evaluation = pipeline.evaluate(x_test, y_test, predictions)
print(evaluation)
Attributes:
Name | Type | Description |
---|---|---|
filters |
Sequence[BaseFilter]
|
A sequence of filters to be applied in parallel. |
Methods:
Name | Description |
---|---|
fit |
XYData, y: Optional[XYData], evaluator: BaseMetric | None = None) -> Optional[float]: Fit all filters in parallel. |
predict |
XYData) -> XYData: Make predictions using all filters in parallel. |
evaluate |
XYData, y_true: XYData | None, y_pred: XYData) -> Dict[str, Any]: Evaluate the pipeline using provided metrics. |
combine_features |
list[XYData]) -> XYData: Combine features from all filter outputs. |
start |
XYData, y: XYData | None, X_: XYData | None) -> XYData | None: Start the pipeline execution. |
Source code in framework3/plugins/pipelines/parallel/parallel_mono_pipeline.py
13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 |
|
__init__(filters)
¶
Initialize the MonoPipeline.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filters
|
Sequence[BaseFilter]
|
A sequence of filters to be applied in parallel. |
required |
Source code in framework3/plugins/pipelines/parallel/parallel_mono_pipeline.py
combine_features(pipeline_outputs)
staticmethod
¶
Combine features from all filter outputs.
This method concatenates the features from all filter outputs along the last axis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pipeline_outputs
|
List[XYData]
|
List of outputs from each filter. |
required |
Returns:
Name | Type | Description |
---|---|---|
XYData |
XYData
|
Combined output with concatenated features. |
Note
This method assumes that all filter outputs can be concatenated along the last axis. Ensure that your filters produce compatible outputs.
Source code in framework3/plugins/pipelines/parallel/parallel_mono_pipeline.py
evaluate(x_data, y_true, y_pred)
¶
Evaluate the pipeline using the provided metrics.
This method applies each metric in the pipeline to the predicted and true values, returning a dictionary of evaluation results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x_data
|
XYData
|
Input data used for evaluation. |
required |
y_true
|
XYData | None
|
True target data, if available. |
required |
y_pred
|
XYData
|
Predicted target data. |
required |
Returns:
Type | Description |
---|---|
Dict[str, Any]
|
Dict[str, Any]: A dictionary containing the evaluation results for each metric. |
Example
Source code in framework3/plugins/pipelines/parallel/parallel_mono_pipeline.py
fit(x, y, evaluator=None)
¶
Fit all filters in the pipeline to the input data in parallel.
This method applies the fit operation to each filter in the pipeline using the provided input data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x
|
XYData
|
The input data to fit the filters on. |
required |
y
|
XYData | None
|
The target data, if available. |
required |
evaluator
|
BaseMetric | None
|
An evaluator metric, if needed. Defaults to None. |
None
|
Returns:
Type | Description |
---|---|
Optional[float]
|
Optional[float]: The mean of the losses returned by the filters, if any. |
Note
Filters that raise NotTrainableFilterError will be initialized instead of fitted.
Source code in framework3/plugins/pipelines/parallel/parallel_mono_pipeline.py
predict(x)
¶
Make predictions using all filters in parallel and combine their outputs.
This method applies the predict operation to each filter in the pipeline and then combines the outputs using the combine_features method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x
|
XYData
|
The input data to make predictions on. |
required |
Returns:
Name | Type | Description |
---|---|---|
XYData |
XYData
|
The combined predictions from all filters. |
Source code in framework3/plugins/pipelines/parallel/parallel_mono_pipeline.py
start(x, y, X_)
¶
Start the pipeline execution by fitting the model and making predictions.
This method initiates the pipeline process by fitting the model to the input data and then making predictions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x
|
XYData
|
The input data for fitting and prediction. |
required |
y
|
XYData | None
|
The target data for fitting, if available. |
required |
X_
|
XYData | None
|
Additional input data for prediction, if different from x. |
required |
Returns:
Type | Description |
---|---|
XYData | None
|
XYData | None: The predictions made by the pipeline, or None if an error occurs. |
Raises:
Type | Description |
---|---|
Exception
|
If an error occurs during the fitting or prediction process. |
Source code in framework3/plugins/pipelines/parallel/parallel_mono_pipeline.py
Class Hierarchy¶
- MonoPipeline
MonoPipeline¶
MonoPipeline
extends ParallelPipeline
and provides functionality for executing multiple pipelines in parallel, combining their outputs to enhance feature sets.
Key Methods:¶
__init__(filters: Sequence[BaseFilter])
: Initializes the pipeline with a sequence of filters to be applied in parallel.start(x: XYData, y: XYData|None, X_: XYData|None) -> XYData|None
: Starts the pipeline execution, fitting the data and making predictions.fit(x: XYData, y: XYData|None = None)
: Fits all pipelines in parallel using the provided input data.predict(x: XYData) -> XYData
: Runs predictions on all pipelines in parallel and combines their outputs.evaluate(x_data: XYData, y_true: XYData|None, y_pred: XYData) -> Dict[str, Any]
: Evaluates the pipeline using the provided metrics.combine_features(pipeline_outputs: list[XYData]) -> XYData
: Combines features from all pipeline outputs.
Usage Examples¶
Creating and Using MonoPipeline¶
from framework3.plugins.pipelines.parallel.mono_pipeline import MonoPipeline
from framework3.base import XYData
# Define filters (assuming filters are defined elsewhere)
filters = [Filter1(), Filter2()]
# Initialize the MonoPipeline
mono_pipeline = MonoPipeline(filters=filters)
# Example data
x_data = XYData(_hash='x_data', _path='/tmp', _value=[[1, 2], [3, 4]])
y_data = XYData(_hash='y_data', _path='/tmp', _value=[0, 1])
# Start the pipeline
results = mono_pipeline.start(x_data, y_data, None)
# Evaluate the pipeline
evaluation_results = mono_pipeline.evaluate(x_data, y_data, results)
print(evaluation_results)
Best Practices¶
- Ensure that the filters used in the pipeline are compatible with the input data.
- Define a clear combiner function if the default concatenation does not meet your needs.
- Monitor the performance and resource utilization to optimize parallel execution.
Conclusion¶
MonoPipeline
provides a flexible and efficient way to execute multiple pipelines in parallel, enhancing feature sets through combined outputs. By following the best practices and examples provided, you can effectively integrate this pipeline into your machine learning workflows.