Text Processing

`framework3.plugins.filters.llm` ¶

`HuggingFaceSentenceTransformerPlugin` ¶

Bases: BaseFilter, BasePlugin

A plugin for generating sentence embeddings using Hugging Face's Sentence Transformers.

This plugin integrates Sentence Transformers from Hugging Face into the framework3 ecosystem, allowing for easy generation of sentence embeddings within pipelines.

Key Features

- Utilizes pre-trained Sentence Transformer models from Hugging Face
- Supports custom model selection
- Generates embeddings for input text data
- Integrates seamlessly with framework3's BaseFilter interface

Usage

The HuggingFaceSentenceTransformerPlugin can be used to generate embeddings for text data:

from framework3.plugins.filters.llm.huggingface_st import HuggingFaceSentenceTransformerPlugin
from framework3.base.base_types import XYData

# Create an instance of the plugin
st_plugin = HuggingFaceSentenceTransformerPlugin(model_name="all-MiniLM-L6-v2")

# Prepare input data
input_texts = ["This is a sample sentence.", "Another example text."]
x_data = XYData(_hash='input_data', _path='/tmp', _value=input_texts)

# Generate embeddings
embeddings = st_plugin.predict(x_data)
print(embeddings.value)

Attributes:

Name	Type	Description
`model_name`	`str`	The name of the Sentence Transformer model to use.
`_model`	`SentenceTransformer`	The underlying Sentence Transformer model.

Methods:

Name	Description
`fit(x: XYData, y: XYData \| None) -> float \| None`	Placeholder method for compatibility with BaseFilter interface.
`predict(x: XYData) -> XYData`	Generate embeddings for the input text data.

Note

This plugin requires the sentence-transformers library to be installed. Ensure that you have the necessary dependencies installed in your environment.
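
One way to check the dependencies up front is sketched below. It assumes the import names `sentence_transformers` (the module provided by the `sentence-transformers` package, typically installed with `pip install sentence-transformers`) and `torch` (which `predict` calls in the source listing). The helper `has_module` is hypothetical and not part of framework3.

```python
import importlib.util


def has_module(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Assumed import names: `sentence_transformers` for the sentence-transformers
# package, and `torch` because the plugin source calls torch.tensor.
missing = [name for name in ("sentence_transformers", "torch") if not has_module(name)]

if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All plugin dependencies are importable.")
```

Running a check like this before constructing the plugin gives a clearer message than an `ImportError` raised from inside the framework.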

Source code in framework3/plugins/filters/llm/huggingface_st.py

@Container.bind()
class HuggingFaceSentenceTransformerPlugin(BaseFilter, BasePlugin):
    """
    A plugin for generating sentence embeddings using Hugging Face's Sentence Transformers.

    This plugin integrates Sentence Transformers from Hugging Face into the framework3 ecosystem,
    allowing for easy generation of sentence embeddings within pipelines.

    Key Features:
        - Utilizes pre-trained Sentence Transformer models from Hugging Face
        - Supports custom model selection
        - Generates embeddings for input text data
        - Integrates seamlessly with framework3's BaseFilter interface

    Usage:
        The HuggingFaceSentenceTransformerPlugin can be used to generate embeddings for text data:

        ```python
        from framework3.plugins.filters.llm.huggingface_st import HuggingFaceSentenceTransformerPlugin
        from framework3.base.base_types import XYData

        # Create an instance of the plugin
        st_plugin = HuggingFaceSentenceTransformerPlugin(model_name="all-MiniLM-L6-v2")

        # Prepare input data
        input_texts = ["This is a sample sentence.", "Another example text."]
        x_data = XYData(_hash='input_data', _path='/tmp', _value=input_texts)

        # Generate embeddings
        embeddings = st_plugin.predict(x_data)
        print(embeddings.value)
        ```

    Attributes:
        model_name (str): The name of the Sentence Transformer model to use.
        _model (SentenceTransformer): The underlying Sentence Transformer model.

    Methods:
        fit(x: XYData, y: XYData | None) -> float | None:
            Placeholder method for compatibility with BaseFilter interface.
        predict(x: XYData) -> XYData:
            Generate embeddings for the input text data.

    Note:
        This plugin requires the `sentence-transformers` library to be installed.
        Ensure that you have the necessary dependencies installed in your environment.
    """

    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        """
        Initialize a new HuggingFaceSentenceTransformerPlugin instance.

        This constructor sets up the plugin with the specified Sentence Transformer model.

        Args:
            model_name (str): The name of the Sentence Transformer model to use.
                              Defaults to "all-MiniLM-L6-v2".

        Note:
            The specified model will be downloaded and loaded upon initialization.
            Ensure you have a stable internet connection and sufficient disk space.
        """
        super().__init__()
        self.model_name = model_name
        self._model = SentenceTransformer(self.model_name)

    def fit(self, x: XYData, y: XYData | None) -> float | None:
        """
        Placeholder method for compatibility with BaseFilter interface.

        This method is not implemented as Sentence Transformers typically don't require fitting.

        Args:
            x (XYData): The input features (not used).
            y (XYData | None): The target values (not used).

        Returns:
            float | None: Always returns None.

        Note:
            This method is included for API consistency but does not perform any operation.
        """
        ...

    def predict(self, x: XYData) -> XYData:
        """
        Generate embeddings for the input text data.

        This method uses the loaded Sentence Transformer model to create embeddings
        for the input text.

        Args:
            x (XYData): The input text data to generate embeddings for.

        Returns:
            XYData: The generated embeddings wrapped in an XYData object.

        Note:
            The input text should be in a format compatible with the Sentence Transformer model.
            The output embeddings are converted to a PyTorch tensor before being wrapped in XYData.
        """
        embeddings = self._model.encode(x.value)
        return XYData.mock(torch.tensor(embeddings))

`model_name = model_name` `instance-attribute` ¶

`__init__(model_name='all-MiniLM-L6-v2')` ¶

Initialize a new HuggingFaceSentenceTransformerPlugin instance.

This constructor sets up the plugin with the specified Sentence Transformer model.

Parameters:

Name	Type	Description	Default
`model_name`	`str`	The name of the Sentence Transformer model to use. Defaults to "all-MiniLM-L6-v2".	`'all-MiniLM-L6-v2'`

Note

The specified model will be downloaded and loaded upon initialization. Ensure you have a stable internet connection and sufficient disk space.
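
Because loading happens in the constructor, each new plugin instance pays the load cost once, and later calls reuse the stored model. The sketch below illustrates that pattern only: `FakeSentenceTransformer` and `EagerPlugin` are hypothetical stand-ins so the example runs without network access, and they are not the framework3 or sentence-transformers classes.

```python
class FakeSentenceTransformer:
    """Hypothetical stand-in for the real model class (no download happens here)."""

    load_count = 0  # how many times a "model" has been loaded

    def __init__(self, model_name: str) -> None:
        # The real class would download and load model weights at this point.
        FakeSentenceTransformer.load_count += 1
        self.model_name = model_name


class EagerPlugin:
    """Mirrors the documented constructor: the model is loaded in __init__."""

    def __init__(self, model_name: str = "all-MiniLM-L6-v2") -> None:
        self.model_name = model_name
        self._model = FakeSentenceTransformer(self.model_name)


plugin = EagerPlugin()
print(FakeSentenceTransformer.load_count)
```

Under this assumption, creating the plugin once and reusing it across a pipeline avoids repeated loads.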

Source code in framework3/plugins/filters/llm/huggingface_st.py

def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
    """
    Initialize a new HuggingFaceSentenceTransformerPlugin instance.

    This constructor sets up the plugin with the specified Sentence Transformer model.

    Args:
        model_name (str): The name of the Sentence Transformer model to use.
                          Defaults to "all-MiniLM-L6-v2".

    Note:
        The specified model will be downloaded and loaded upon initialization.
        Ensure you have a stable internet connection and sufficient disk space.
    """
    super().__init__()
    self.model_name = model_name
    self._model = SentenceTransformer(self.model_name)

`fit(x, y)` ¶

Placeholder method for compatibility with BaseFilter interface.

This method is not implemented as Sentence Transformers typically don't require fitting.

Parameters:

Name	Type	Description	Default
`x`	`XYData`	The input features (not used).	required
`y`	`XYData \| None`	The target values (not used).	required

Returns:

Type	Description
`float \| None`	Always returns None.

Note

This method is included for API consistency but does not perform any operation.

Source code in framework3/plugins/filters/llm/huggingface_st.py

def fit(self, x: XYData, y: XYData | None) -> float | None:
    """
    Placeholder method for compatibility with BaseFilter interface.

    This method is not implemented as Sentence Transformers typically don't require fitting.

    Args:
        x (XYData): The input features (not used).
        y (XYData | None): The target values (not used).

    Returns:
        float | None: Always returns None.

    Note:
        This method is included for API consistency but does not perform any operation.
    """
    ...

`predict(x)` ¶

Generate embeddings for the input text data.

This method uses the loaded Sentence Transformer model to create embeddings for the input text.

Parameters:

Name	Type	Description	Default
`x`	`XYData`	The input text data to generate embeddings for.	required

Returns:

Type	Description
`XYData`	The generated embeddings wrapped in an XYData object.

Note

The input text should be in a format compatible with the Sentence Transformer model. The output embeddings are converted to a PyTorch tensor before being wrapped in XYData.
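
The encode-then-wrap flow of `predict` can be sketched with standard-library stand-ins. Everything named `Fake...` below is hypothetical: `FakeXYData` models only the `value` attribute, and `FakeEncoder` returns toy two-number vectors instead of real embeddings. The conversion to a PyTorch tensor performed by the actual method is left out so that the sketch has no third-party dependencies.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeXYData:
    """Hypothetical stand-in for framework3's XYData; only `value` is modelled."""

    value: Any


class FakeEncoder:
    """Hypothetical stand-in for a loaded Sentence Transformer model."""

    def encode(self, sentences: list[str]) -> list[list[float]]:
        # One fixed-width vector per sentence: [character count, word count].
        return [[float(len(s)), float(len(s.split()))] for s in sentences]


def predict(model: FakeEncoder, x: FakeXYData) -> FakeXYData:
    # Same two steps as the documented method: encode x.value, then wrap the result.
    embeddings = model.encode(x.value)
    return FakeXYData(embeddings)


texts = ["This is a sample sentence.", "Another example text."]
out = predict(FakeEncoder(), FakeXYData(texts))

# One row per input sentence, one column per embedding dimension.
print(len(out.value), len(out.value[0]))
```

The point of the sketch is the shape contract: a list of N input strings produces N vectors of equal width, wrapped in a single container object.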

Source code in framework3/plugins/filters/llm/huggingface_st.py

def predict(self, x: XYData) -> XYData:
    """
    Generate embeddings for the input text data.

    This method uses the loaded Sentence Transformer model to create embeddings
    for the input text.

    Args:
        x (XYData): The input text data to generate embeddings for.

    Returns:
        XYData: The generated embeddings wrapped in an XYData object.

    Note:
        The input text should be in a format compatible with the Sentence Transformer model.
        The output embeddings are converted to a PyTorch tensor before being wrapped in XYData.
    """
    embeddings = self._model.encode(x.value)
    return XYData.mock(torch.tensor(embeddings))

`huggingface_st` ¶

`__all__ = ['HuggingFaceSentenceTransformerPlugin']` `module-attribute` ¶

