WIP: Normalize similarities to [0, 1]#1095

Draft
edwinyyyu wants to merge 2 commits into MemMachine:main from edwinyyyu:similarity_formulas

Conversation

Contributor

@edwinyyyu edwinyyyu commented Feb 10, 2026

Purpose of the change

Define common similarity and distance formulas.

The score threshold is currently defined so that a higher score is better, and the same parameter is meant to be reused across systems instead of adding a new parameter for each one.

Users may also be unwilling to interpret the raw values.

Description

Define score formulas.

TODO:

  • Implement score formulas for different database backends.
  • Consider the impact on rerankers. Rerankers return scores in different ranges, such as [0, 1], [-1, 1], or (-inf, inf) (e.g., logits).
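
To illustrate what normalizing to [0, 1] could look like, here is a minimal sketch. The formulas and function names below are assumptions for discussion, not the formulas defined in this PR: a linear rescale for cosine similarity, a reciprocal mapping for Euclidean distance, and a logistic sigmoid for reranker logits. In every case a higher score is better.

```python
import math


def normalize_cosine(similarity: float) -> float:
    """Map cosine similarity from [-1, 1] to [0, 1] (assumed linear rescale)."""
    return (1.0 + similarity) / 2.0


def normalize_euclidean(distance: float) -> float:
    """Map a non-negative Euclidean distance to (0, 1].

    Assumed reciprocal mapping: distance 0 gives score 1, and the score
    decreases toward 0 as the distance grows.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + distance)


def normalize_logit(logit: float) -> float:
    """Map an unbounded logit in (-inf, inf) to [0, 1] with a sigmoid.

    The two branches avoid overflow in math.exp for large magnitudes.
    """
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)
```

Note that these mappings are monotonic but not comparable with each other: a normalized 0.8 from cosine similarity does not mean the same thing as 0.8 from a reranker sigmoid, so a single overloaded threshold would still need per-system tuning.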

Alternative:

  • Define different scores for different systems and return the raw values (easiest to understand).

Breaking:

  • Existing tuned score thresholds will no longer work if they were tuned against embedding scores.
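
As a rough illustration of the migration this implies, a threshold tuned on raw scores would have to be passed through the same mapping as the scores themselves. The sketch below assumes a hypothetical linear rescale of cosine similarity from [-1, 1] to [0, 1]; the actual formula is whatever this PR ends up defining.

```python
def convert_cosine_threshold(raw_threshold: float) -> float:
    """Convert a threshold tuned on raw cosine similarity in [-1, 1]
    to the normalized [0, 1] scale (assumed mapping: (1 + s) / 2)."""
    if not -1.0 <= raw_threshold <= 1.0:
        raise ValueError("raw cosine threshold must be in [-1, 1]")
    return (1.0 + raw_threshold) / 2.0


# A threshold of 0.5 on the raw scale corresponds to 0.75 after normalization.
old_threshold = 0.5
new_threshold = convert_cosine_threshold(old_threshold)
```

Leaving the old value of 0.5 in place would admit every result with a raw cosine similarity of 0 or higher under this assumed mapping, which is why the change is breaking.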

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g., code style improvements, linting)
  • Documentation update
  • Project Maintenance (updates to build scripts, CI, etc., that do not affect the main project)
  • Security (improves security without changing functionality)

How Has This Been Tested?

  • Unit Test
  • Integration Test
  • End-to-end Test
  • Test Script (please provide)
  • Manual verification (list step-by-step instructions)

Checklist

  • I have signed the commit(s) within this pull request
  • My code follows the style guidelines of this project (See STYLE_GUIDE.md)
  • I have performed a self-review of my own code
  • I have commented my code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • Confirmed all checks passed
  • Contributor has signed the commit(s)
  • Reviewed the code
  • Run, Tested, and Verified the change(s) work as expected

Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
edwinyyyu added the question (Further information is requested) and documentation (Issues related to documentation) labels on Feb 13, 2026
Labels

  • documentation: Issues related to documentation
  • question: Further information is requested

2 participants