Text Processing
framework3.plugins.filters.llm
HuggingFaceSentenceTransformerPlugin
Bases: BaseFilter, BasePlugin
A plugin for generating sentence embeddings using Hugging Face's Sentence Transformers.
This plugin integrates Sentence Transformers from Hugging Face into the framework3 ecosystem, allowing for easy generation of sentence embeddings within pipelines.
Key Features
- Utilizes pre-trained Sentence Transformer models from Hugging Face
- Supports custom model selection
- Generates embeddings for input text data
- Integrates seamlessly with framework3's BaseFilter interface
Usage
The HuggingFaceSentenceTransformerPlugin can be used to generate embeddings for text data:
from framework3.plugins.filters.llm.huggingface_st import HuggingFaceSentenceTransformerPlugin
from framework3.base.base_types import XYData
# Create an instance of the plugin
st_plugin = HuggingFaceSentenceTransformerPlugin(model_name="all-MiniLM-L6-v2")
# Prepare input data
input_texts = ["This is a sample sentence.", "Another example text."]
x_data = XYData(_hash='input_data', _path='/tmp', _value=input_texts)
# Generate embeddings
embeddings = st_plugin.predict(x_data)
print(embeddings.value)
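Once embeddings are available, a common next step is to compare them. The following is a minimal sketch in plain Python, not part of framework3: the `cosine_similarity` helper and the three-dimensional vectors are made up for illustration (real `all-MiniLM-L6-v2` embeddings have 384 dimensions).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for two sentence embeddings (assumption).
emb_first = [1.0, 0.0, 1.0]
emb_second = [1.0, 0.0, 0.0]

similarity = cosine_similarity(emb_first, emb_second)
print(round(similarity, 4))
```

Values close to 1 indicate vectors pointing in nearly the same direction; values near 0 indicate unrelated directions.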
Attributes:
Name | Type | Description |
---|---|---|
model_name | str | The name of the Sentence Transformer model to use. |
_model | SentenceTransformer | The underlying Sentence Transformer model. |
Methods:
Name | Description |
---|---|
fit | fit(x: XYData, y: XYData \| None) -> float \| None. Placeholder method for compatibility with the BaseFilter interface. |
predict | predict(x: XYData) -> XYData. Generate embeddings for the input text data. |
Note
This plugin requires the sentence-transformers library to be installed. Ensure that you have the necessary dependencies installed in your environment.
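One way to verify the dependency before constructing the plugin is to check whether the package can be imported. This is a small sketch using only the standard library; the helper name `has_sentence_transformers` is hypothetical and not part of framework3. The pip package is named `sentence-transformers`, while the importable module is `sentence_transformers`.

```python
import importlib.util


def has_sentence_transformers():
    """Return True if the sentence_transformers package can be imported."""
    return importlib.util.find_spec("sentence_transformers") is not None


if not has_sentence_transformers():
    print("Missing dependency; install it with: pip install sentence-transformers")
```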
Source code in framework3/plugins/filters/llm/huggingface_st.py
model_name = model_name (instance attribute)
__init__(model_name='all-MiniLM-L6-v2')
Initialize a new HuggingFaceSentenceTransformerPlugin instance.
This constructor sets up the plugin with the specified Sentence Transformer model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_name | str | The name of the Sentence Transformer model to use. Defaults to "all-MiniLM-L6-v2". | 'all-MiniLM-L6-v2' |
Note
The specified model is loaded upon initialization and is downloaded first if it is not already in the local cache. Ensure you have a stable internet connection and sufficient disk space the first time a given model is used.
fit(x, y)
Placeholder method for compatibility with BaseFilter interface.
This method performs no work, because pre-trained Sentence Transformer models typically do not require fitting.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | XYData | The input features (not used). | required |
y | XYData \| None | The target values (not used). | required |
Returns:
Type | Description |
---|---|
float \| None | Always returns None. |
Note
This method is included for API consistency but does not perform any operation.
predict(x)
Generate embeddings for the input text data.
This method uses the loaded Sentence Transformer model to create embeddings for the input text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | XYData | The input text data to generate embeddings for. | required |
Returns:
Name | Type | Description |
---|---|---|
XYData | XYData | The generated embeddings wrapped in an XYData object. |
Note
The input should be text that the Sentence Transformer model can encode, such as the list of strings in the usage example. The output embeddings are converted to a PyTorch tensor before being wrapped in XYData.
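The fit/predict contract described above can be illustrated without downloading a model. The sketch below is not the plugin's actual source: `FakeXYData`, `FakeEncoder`, and `SketchEmbeddingFilter` are hypothetical stand-ins for XYData, SentenceTransformer, and the plugin, and the toy "embedding" is just a character count and a word count. Unlike the real plugin, it returns plain lists rather than a PyTorch tensor.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class FakeXYData:
    """Hypothetical stand-in for framework3's XYData: it only holds a value."""
    value: Any


class FakeEncoder:
    """Hypothetical stand-in for a SentenceTransformer model."""

    def encode(self, sentences: List[str]) -> List[List[float]]:
        # Toy 2-dimensional "embedding": character count and word count.
        return [[float(len(s)), float(len(s.split()))] for s in sentences]


class SketchEmbeddingFilter:
    """Illustrative filter: fit is a no-op, predict wraps the encoder output."""

    def __init__(self, model):
        self._model = model

    def fit(self, x, y=None):
        # Nothing to learn: the (pretend) pre-trained model is used as-is.
        return None

    def predict(self, x: FakeXYData) -> FakeXYData:
        # Unwrap the texts, encode them, and wrap the vectors again.
        vectors = self._model.encode(x.value)
        return FakeXYData(value=vectors)


sketch = SketchEmbeddingFilter(FakeEncoder())
texts = FakeXYData(value=["hello world", "hi"])
fit_result = sketch.fit(texts, None)
result = sketch.predict(texts)
print(result.value)
```

Because fit does nothing, predict can be called on a freshly constructed filter, which matches how the plugin is used in the usage example.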
huggingface_st
__all__ = ['HuggingFaceSentenceTransformerPlugin'] (module attribute)