Faiss Plugins

Faiss functions and classes

Faiss Data Plugin

The FaissDataPlugin integrates with a faiss index.

search_params can be any params object compatible with faiss search



FaissDataPlugin

 FaissDataPlugin (k:int, faiss_index:faiss.swigfaiss_avx2.Index,
                  item_data:Optional[List[Dict]]=None,
                  item_key:Optional[str]=None,
                  search_params:Optional[faiss.swigfaiss_avx2.SearchParameters]=None,
                  distance_cutoff:Optional[float]=None)

FaissDataPlugin - data plugin for working with a faiss vector index

The data query will run k nearest neighbors against faiss_index
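As a rough illustration of what a k nearest neighbors query returns, the sketch below does a brute-force search in plain Python. It is a stand-in for the faiss search, not the plugin's implementation: knn_search is a hypothetical helper, and squared L2 distance is assumed because that is what faiss.IndexFlatL2 reports.

```python
def knn_search(query, vectors, k):
    """Brute-force k nearest neighbors using squared L2 distance.

    Hypothetical helper for illustration only; a faiss index performs
    this search far more efficiently.
    """
    scored = []
    for i, vec in enumerate(vectors):
        # squared L2 distance between the query and stored vector i
        dist = sum((q - v) ** 2 for q, v in zip(query, vec))
        scored.append((dist, i))
    scored.sort()  # closest first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
distances, indices = knn_search([0.9, 0.1], vectors, k=2)
print(indices)  # indices of the two closest stored vectors, closest first
```

Each query yields k (distance, index) pairs, and the index identifies which stored embedding was matched.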

Optionally, item_data can be provided as a list of dicts, where item_data[i] corresponds to the data for embedding i in the faiss index.

If item_data is provided, item_data[i][item_key] defines the specific value for item i.
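The relationship between neighbor indices, item_data and item_key can be sketched in a few lines. This is a minimal illustration under assumptions: lookup_items is a hypothetical helper, and the shape of the returned records is invented for the example rather than taken from the plugin.

```python
# One metadata dict per embedding, in the same order as the index.
item_data = [
    {"item": "a", "other": 1},
    {"item": "b", "other": 2},
    {"item": "c", "other": 3},
]
item_key = "item"  # the key whose value identifies each item


def lookup_items(indices, item_data, item_key):
    """Map neighbor indices to item values (hypothetical helper)."""
    results = []
    for i in indices:
        record = item_data[i]  # metadata for embedding i
        results.append({"item": record[item_key], "data": record})
    return results


hits = lookup_items([2, 0], item_data, item_key)
print([h["item"] for h in hits])
```

The position of a vector in the index doubles as the position of its metadata in item_data, so the two must be built in the same order.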

search_params is an optional faiss.SearchParameters object used when searching the index.

If distance_cutoff is specified, query results with a distance greater than distance_cutoff are ignored.
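The cutoff rule can be sketched as a simple filter over the distances and indices a search returns. The helper apply_distance_cutoff is hypothetical and only mirrors the behavior described here; how the plugin applies the cutoff internally is not shown in this page.

```python
def apply_distance_cutoff(distances, indices, distance_cutoff=None):
    """Drop results farther than distance_cutoff (hypothetical helper).

    With distance_cutoff=None, every result is kept.
    """
    if distance_cutoff is None:
        return list(distances), list(indices)
    kept = [(d, i) for d, i in zip(distances, indices) if d <= distance_cutoff]
    return [d for d, _ in kept], [i for _, i in kept]


distances = [0.1, 0.4, 0.9, 1.5]
indices = [7, 2, 5, 11]
kept_d, kept_i = apply_distance_cutoff(distances, indices, distance_cutoff=0.5)
print(kept_i)  # only neighbors within the cutoff remain
```

A query can therefore return fewer than k results when a cutoff is set. With faiss.IndexFlatL2 the reported distances are squared L2 distances, so a cutoff would be chosen on that scale.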

                 Type                                                    Default  Details
k                int                                                              k nearest neighbors to return
faiss_index      Index                                                            faiss index
item_data        typing.Optional[typing.List[typing.Dict]]               None     Optional list of item data dicts
item_key         typing.Optional[str]                                    None     Optional key for item value (should be in each item_data dict)
search_params    typing.Optional[faiss.swigfaiss_avx2.SearchParameters]  None     faiss search params
distance_cutoff  typing.Optional[float]                                  None     query to result distance cutoff
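
import numpy as np
import faiss
# FaissDataPlugin, DataSourceModule and build_batch_from_embeddings come from this library

n_vectors = 256
d_vectors = 64
k = 10
n_queries = 5

# faiss works with float32 vectors
vectors = np.random.randn(n_vectors, d_vectors).astype('float32')

# one metadata dict per vector, in the same order as the index
vector_data = [{'other':np.random.randint(0,1000),
                'item':str(np.random.randint(0,10000))}
               for i in range(vectors.shape[0])]

# exact (brute-force) L2 index
index = faiss.IndexFlatL2(d_vectors)
index.add(vectors)

# return the k nearest neighbors, with item_data[i]['item'] as each item's value
data_function = FaissDataPlugin(k, index, vector_data, 'item')
data_module = DataSourceModule(data_function)

batch = build_batch_from_embeddings(np.random.randn(n_queries, d_vectors))
batch2 = data_module(batch)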