Tuning
KNN Convergence
Updating with a KNN method (TopKDiscreteUpdate, TopKContinuousUpdate) can result in slow convergence. This is because your effective step size is constrained by the maximum distance between your query and your query results.
For example, if you are querying a dense embedding space and returning the 10 nearest embeddings, the distance between your query embedding and result embeddings will be quite small. Running a KNN-style update on close embeddings results in small update steps.
This can be fixed by returning a larger number of query results. The downside is that this requires more compute from your Filter and Score steps. If this is prohibitive, consider another update method.
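As a rough illustration, here is a minimal NumPy sketch of a KNN-style continuous update. The helper name and mechanics are hypothetical (this is not the implementation of TopKContinuousUpdate): the new query is the mean of the retrieved embeddings, so the step length is bounded by the distance to the farthest retrieved neighbor, and enlarging the candidate pool enlarges that bound.

```python
import numpy as np

def knn_update(query, index_embeddings, k):
    # Hypothetical KNN-style update: move the query to the mean of its
    # k nearest indexed embeddings.
    dists = np.linalg.norm(index_embeddings - query, axis=1)
    nearest = np.argsort(dists)[:k]
    new_query = index_embeddings[nearest].mean(axis=0)
    # The step length can never exceed the distance to the farthest
    # retrieved neighbor, so tightly clustered results give tiny steps.
    step = np.linalg.norm(new_query - query)
    bound = dists[nearest].max()
    return new_query, step, bound

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10_000, 128)).astype(np.float32)
query = rng.normal(size=128).astype(np.float32)

for k in (10, 100, 1000):
    _, step, bound = knn_update(query, embeddings, k)
    print(f"k={k:4d}  step={step:.3f}  max neighbor distance={bound:.3f}")
```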
RL Update Divergence
When using a gradient-based update method (RLUpdate), you may observe a failure mode where your query embeddings overshoot your embedding space. This can be detected by monitoring distances between queries and query results. Overshooting is a sign that your RL learning rate is too high, or your RL distance penalty is too low. It may also be worth implementing an RL Update method using more sophisticated optimization algorithms or learning rate schedules.
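A minimal sketch of the penalized step and the monitoring signal, assuming NumPy arrays and a precomputed score gradient (the function names are illustrative, not the RLUpdate API):

```python
import numpy as np

def rl_update_step(query, result_embeddings, score_grad,
                   lr=0.1, distance_penalty=1.0):
    # One gradient-ascent step on (score - penalty). The penalty term
    # 0.5 * distance_penalty * ||query - centroid||^2 pulls the query
    # back toward its current results; its gradient is `pull` below.
    centroid = result_embeddings.mean(axis=0)
    pull = distance_penalty * (query - centroid)
    return query + lr * (score_grad - pull)

def query_result_gap(query, result_embeddings):
    # Mean distance from the query to its retrieved results. A steady
    # upward drift in this number is the overshoot symptom described above.
    return float(np.linalg.norm(result_embeddings - query, axis=1).mean())
```

Logging `query_result_gap` each iteration and cutting the learning rate (or raising the penalty) when it keeps growing is one simple way to catch divergence early.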
Best of Both Worlds: Grad Queries with KNN Updates
The above problems with KNN updates and RL updates can be simultaneously remedied using gradient queries (see UpdatePluginGradientWrapper and DataPluginGradWrapper). With this setup, we use a score gradient to generate multiple queries, but update with a KNN method.
This allows us to be very aggressive with our gradient query, sweeping a large range of learning rates and intentionally overshooting our embedding space. The KNN update then ensures that our new queries are pulled back into the embedding space.
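Sketched in NumPy with hypothetical helper names (not the wrapper APIs named above), the combination looks like this: sweep learning rates to generate aggressive candidate queries from the score gradient, then snap each candidate back onto the data with a KNN update.

```python
import numpy as np

def grad_query_candidates(query, score_grad, lrs=(0.1, 1.0, 10.0, 100.0)):
    # Aggressive learning-rate sweep: some candidates will overshoot the
    # embedding space, and that is fine at this stage.
    return [query + lr * score_grad for lr in lrs]

def knn_pullback(candidate, index_embeddings, k=10):
    # KNN-style update: snap a (possibly overshot) candidate back to the
    # mean of its k nearest indexed embeddings, so the final query is
    # always supported by real data.
    dists = np.linalg.norm(index_embeddings - candidate, axis=1)
    nearest = np.argsort(dists)[:k]
    return index_embeddings[nearest].mean(axis=0)

rng = np.random.default_rng(0)
index_embeddings = rng.normal(size=(10_000, 64)).astype(np.float32)
query = rng.normal(size=64).astype(np.float32)
score_grad = rng.normal(size=64).astype(np.float32)  # stand-in gradient

new_queries = [knn_pullback(c, index_embeddings)
               for c in grad_query_candidates(query, score_grad)]
```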