Transformation
framework3.plugins.filters.transformation
PCAPlugin
Bases: BaseFilter
A plugin for performing Principal Component Analysis (PCA) on input data.
This plugin integrates scikit-learn's PCA implementation into the framework3 ecosystem, allowing for easy dimensionality reduction within pipelines.
Key Features
- Utilizes scikit-learn's PCA for dimensionality reduction
- Supports customization of the number of components to keep
- Provides methods for fitting the PCA model and transforming data
- Integrates seamlessly with framework3's BaseFilter interface
- Includes a static method for generating parameter grids for hyperparameter tuning
Usage
The PCAPlugin can be used to perform dimensionality reduction on your data:
```python
from framework3.plugins.filters.transformation.pca import PCAPlugin
from framework3.base.base_types import XYData
import numpy as np

# Create a PCAPlugin instance
pca_plugin = PCAPlugin(n_components=2)

# Create some sample data
X = XYData.mock(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
y = None  # PCA doesn't use y for fitting

# Fit the PCA model
pca_plugin.fit(X, y)

# Transform new data
new_data = XYData.mock(np.array([[2, 3, 4], [5, 6, 7]]))
transformed_data = pca_plugin.predict(new_data)
print(transformed_data.value)  # A 2x2 array: two samples, two components
```
Attributes:
Name | Type | Description |
---|---|---|
_pca | PCA | The underlying scikit-learn PCA object used for dimensionality reduction. |
Methods:
Name | Description |
---|---|
fit | fit(x: XYData, y: Optional[XYData], evaluator: BaseMetric \| None = None) -> Optional[float]: Fit the PCA model to the given data. |
predict | predict(x: XYData) -> XYData: Apply dimensionality reduction to the input data. |
item_grid | item_grid(n_components: List[int]) -> Dict[str, Any]: Generate a parameter grid for hyperparameter tuning. |
Note
This plugin uses scikit-learn's implementation of PCA, which may have its own dependencies and requirements. Ensure that scikit-learn is properly installed and compatible with your environment.
Source code in framework3/plugins/filters/transformation/pca.py
__init__(n_components=2)
Initialize a new PCAPlugin instance.
This constructor sets up the PCAPlugin with the specified number of components and initializes the underlying scikit-learn PCA object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_components | int | The number of components to keep after dimensionality reduction. Defaults to 2. | 2 |
Note
The n_components parameter is passed directly to scikit-learn's PCA. Refer to scikit-learn's documentation for detailed information on this parameter.
fit(x, y, evaluator=None)
Fit the PCA model to the given data.
This method trains the PCA model on the provided input features.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | XYData | The input features to fit the PCA model. | required |
y | Optional[XYData] | Not used in PCA, but required by the BaseFilter interface. | required |
evaluator | BaseMetric \| None | An optional evaluator for the model. Not used in this method. | None |
Returns:
Type | Description |
---|---|
Optional[float] | Always None, since PCA has no standard evaluation metric. |
Note
This method uses scikit-learn's fit method internally. The y parameter is ignored as PCA is an unsupervised method.
item_grid(n_components)
staticmethod
Generate a parameter grid for hyperparameter tuning.
This static method creates a dictionary that can be used for grid search over different numbers of components in PCA.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_components | List[int] | A list of integers representing different numbers of components to try in the grid search. | required |
Returns:
Type | Description |
---|---|
Dict[str, Any] | A dictionary with the parameter name as key and the list of values to try as value. |
Note
This method is typically used in conjunction with hyperparameter tuning techniques like GridSearchCV.
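As a rough illustration of how such a grid is consumed, the sketch below expands a dictionary of candidate values into one configuration per combination, which is what a grid search iterates over. The helper names `make_grid` and `expand_grid` and the bare `n_components` key are assumptions made for this example; the key that item_grid actually emits is not shown on this page.

```python
from itertools import product
from typing import Any, Dict, List


def make_grid(n_components: List[int]) -> Dict[str, Any]:
    # Hypothetical stand-in for item_grid: parameter name -> values to try.
    # The real key produced by PCAPlugin.item_grid may differ.
    return {"n_components": n_components}


def expand_grid(grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    # Enumerate every combination of values, one dict per candidate configuration.
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[name] for name in names))]


candidates = expand_grid(make_grid([1, 2, 3]))
for params in candidates:
    print(params)
```

With a single parameter the expansion is just one candidate per value; with several parameters the number of candidates is the product of the list lengths.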
predict(x)
Apply dimensionality reduction to the input data.
This method uses the trained PCA model to transform new input data, reducing its dimensionality.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | XYData | The input features to transform. | required |
Returns:
Type | Description |
---|---|
XYData | The transformed data with reduced dimensionality, wrapped in an XYData object. |
Note
This method uses scikit-learn's transform method internally. The transformed data is wrapped in an XYData object for consistency with the framework.
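To make the fit and transform steps concrete, here is a minimal NumPy sketch of PCA by singular value decomposition, using the same arrays as the usage example for this class. It is neither the plugin's code nor scikit-learn's: the function names `pca_fit` and `pca_transform` are made up for illustration, and scikit-learn additionally normalizes component signs, so individual columns may differ in sign from its output.

```python
import numpy as np


def pca_fit(X: np.ndarray, n_components: int):
    # Center the data, then keep the top right-singular vectors as components.
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    # Project the centered samples onto the retained components.
    return (X - mean) @ components.T


X_train = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
X_new = np.array([[2, 3, 4], [5, 6, 7]], dtype=float)

mean, components = pca_fit(X_train, n_components=2)
reduced = pca_transform(X_new, mean, components)
print(reduced.shape)  # (2, 2): two samples, two components
```

Because the three training rows lie on a single line, all of the variance falls on the first component and the second column of the result is numerically zero.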
StandardScalerPlugin
Bases: BaseFilter
A plugin for standardizing features by removing the mean and scaling to unit variance.
This plugin integrates scikit-learn's StandardScaler into the framework3 ecosystem, allowing for easy feature standardization within pipelines.
Key Features
- Utilizes scikit-learn's StandardScaler for feature standardization
- Removes the mean and scales features to unit variance
- Provides methods for fitting the scaler and transforming data
- Integrates seamlessly with framework3's BaseFilter interface
Usage
The StandardScalerPlugin can be used to standardize features in your data:
```python
from framework3.plugins.filters.transformation.scaler import StandardScalerPlugin
from framework3.base.base_types import XYData
import numpy as np

# Create a StandardScalerPlugin instance
scaler_plugin = StandardScalerPlugin()

# Create some sample data
X = XYData.mock(np.array([[0, 0], [0, 0], [1, 1], [1, 1]]))
y = None  # StandardScaler doesn't use y for fitting

# Fit the StandardScaler
scaler_plugin.fit(X, y)

# Transform new data
new_data = XYData.mock(np.array([[2, 2], [-1, -1]]))
scaled_data = scaler_plugin.predict(new_data)
print(scaled_data.value)
# The new data is scaled with the mean (0.5) and standard deviation (0.5)
# learned from X, so the output is:
# [[ 3.  3.]
#  [-3. -3.]]
```
Attributes:
Name | Type | Description |
---|---|---|
_scaler | StandardScaler | The underlying scikit-learn StandardScaler object used for standardization. |
Methods:
Name | Description |
---|---|
fit | fit(x: XYData, y: Optional[XYData], evaluator: BaseMetric \| None = None) -> Optional[float]: Fit the StandardScaler to the given data. |
predict | predict(x: XYData) -> XYData: Perform standardization on the input data. |
Note
This plugin uses scikit-learn's implementation of StandardScaler, which may have its own dependencies and requirements. Ensure that scikit-learn is properly installed and compatible with your environment.
Source code in framework3/plugins/filters/transformation/scaler.py
__init__()
Initialize a new StandardScalerPlugin instance.
This constructor sets up the StandardScalerPlugin and initializes the underlying scikit-learn StandardScaler object.
Note
No parameters are required for initialization as StandardScaler uses default settings. For customized scaling, consider extending this class and modifying the StandardScaler initialization.
fit(x, y, evaluator=None)
Fit the StandardScaler to the given data.
This method computes the mean and standard deviation of the input features, which will be used for subsequent scaling operations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | XYData | The input features to fit the StandardScaler. | required |
y | Optional[XYData] | Not used in StandardScaler, but required by the BaseFilter interface. | required |
evaluator | BaseMetric \| None | An optional evaluator for the model. Not used in this method. | None |
Returns:
Type | Description |
---|---|
Optional[float] | Always None, since StandardScaler has no standard evaluation metric. |
Note
This method uses scikit-learn's fit method internally. The y parameter is ignored as StandardScaler is an unsupervised method.
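The statistics computed during fitting can be sketched in plain Python. This is an illustration rather than the plugin's implementation, and `fit_stats` is a hypothetical helper. It uses the population standard deviation, matching scikit-learn's StandardScaler, which divides by n rather than n - 1.

```python
from statistics import fmean, pstdev
from typing import List, Tuple


def fit_stats(rows: List[List[float]]) -> Tuple[List[float], List[float]]:
    # Compute the per-feature (column-wise) mean and population standard deviation.
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means, stds


means, stds = fit_stats([[0, 0], [0, 0], [1, 1], [1, 1]])
print(means)  # [0.5, 0.5]
print(stds)   # [0.5, 0.5]
```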
predict(x)
Perform standardization on the input data.
This method applies the standardization transformation to new input data, centering and scaling each feature with the mean and standard deviation computed during fit.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | XYData | The input features to standardize. | required |
Returns:
Type | Description |
---|---|
XYData | The standardized version of the input data, wrapped in an XYData object. |
Note
This method uses scikit-learn's transform method internally. The transformed data is wrapped in an XYData object for consistency with the framework.
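The transformation itself is z = (x - mean) / scale, applied per feature. The sketch below, built around the hypothetical helper `standardize`, applies it to the new data from the usage example with the statistics fitted on X (mean 0.5 and standard deviation 0.5 for each feature). A zero standard deviation is replaced by 1 so that constant features do not cause a division by zero, which mirrors how scikit-learn handles them.

```python
from typing import List


def standardize(rows: List[List[float]], means: List[float], stds: List[float]) -> List[List[float]]:
    # Guard against zero-variance features by dividing by 1 instead of 0.
    scales = [s if s != 0 else 1.0 for s in stds]
    return [
        [(value - m) / s for value, m, s in zip(row, means, scales)]
        for row in rows
    ]


scaled = standardize([[2, 2], [-1, -1]], means=[0.5, 0.5], stds=[0.5, 0.5])
print(scaled)  # [[3.0, 3.0], [-3.0, -3.0]]
```

Note that the transformed values only have zero mean and unit variance for the data the scaler was fitted on; new data is simply expressed in units of the fitted standard deviation.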
pca
__all__ = ['PCAPlugin']
module-attribute
scaler
__all__ = ['StandardScalerPlugin']
module-attribute