Nomic-embeddings-support #280
Conversation
Hey.
Do we not want to add aliases with the old class names to avoid breaking changes?
Never mind. No one should be accessing those classes directly. Hopefully.
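For what it's worth, the alias approach would be a one-liner per class; a minimal sketch (the old name below is a hypothetical placeholder, not the actual pre-refactor class name):

```python
# Refactored class from this PR (body elided for brevity).
class PooledNormalizedEmbedding:
    ...

# Hypothetical backwards-compatible alias so existing imports keep working.
OldJinaEmbedding = PooledNormalizedEmbedding
```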
```diff
 ),
 "nomic-ai/nomic-embed-text-v1.5": np.array(
-    [-1.6531514e-02, 8.5380634e-05, -1.8171231e-01, -3.9333291e-03, 1.2763254e-02]
+    [-0.15407836, -0.03053198, -3.9138033, 0.1910364, 0.13224715]
```
Were these embeddings obtained differently from nomic-ai/nomic-embed-text-v1?
Why do they have more digits after the decimal point?
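Independent of how the values were obtained, the magnitudes alone already tell us something: every component of an L2-normalized vector lies in [-1, 1], and the new canonical values contain -3.9138033, so they cannot come from a normalized embedding. A quick illustrative check (not part of the PR):

```python
import numpy as np

old = np.array([-1.6531514e-02, 8.5380634e-05, -1.8171231e-01, -3.9333291e-03, 1.2763254e-02])
new = np.array([-0.15407836, -0.03053198, -3.9138033, 0.1910364, 0.13224715])

# A unit-norm vector has every component in [-1, 1], so a component outside
# that range proves the full embedding was not L2-normalized.
print(np.abs(old).max() <= 1.0)  # True  -> consistent with a normalized embedding
print(np.abs(new).max() <= 1.0)  # False -> necessarily unnormalized
```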
Hi @I8dNLo, I'm currently working on adding support for Matryoshka Representation Learning embedding models, the main one being nomic-ai/nomic-embed-text-v1.5.

From the rest of the library, as well as Nomic's documentation, it seems we typically do normalize embeddings. I haven't been able to find the logic from sentence-transformers you are referring to when it comes to normalization; it appears that normalization is not tied to the model in sentence-transformers. Were you mainly referring to the [...]?

Ultimately, I am able to implement variable dimensionality without normalization (even if it isn't optimal and not recommended for [...]). I'd be happy to hear your opinion, as well as that of other contributors, on whether to go ahead with my current changes.
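For reference, the variable-dimensionality path I'm describing is just truncate-then-renormalize over the pooled embedding; a minimal numpy sketch (the helper name and dimensions are mine, not from this PR):

```python
import numpy as np

def truncate_embedding(embedding: np.ndarray, dim: int, normalize: bool = True) -> np.ndarray:
    """Keep the first `dim` components of a Matryoshka embedding and
    optionally renormalize so cosine similarity stays meaningful."""
    truncated = embedding[:dim]
    if normalize:
        norm = np.linalg.norm(truncated)
        if norm > 0:
            truncated = truncated / norm
    return truncated

full = np.random.default_rng(0).normal(size=768).astype(np.float32)
small = truncate_embedding(full, dim=256)
print(small.shape, np.linalg.norm(small))  # (256,) ~1.0
```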
Fix for the nomic embeddings from #204.
Also moved the jina/miniLM models to the refactored classes PooledEmbedding and PooledNormalizedEmbedding, which implement the embedding logic from the sentence-transformers library for the corresponding models (roughly sketched below).
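Roughly, the split between the two refactored classes looks like this; a sketch under my own assumptions about the method name and array shapes (only the class names come from this PR):

```python
import numpy as np

class PooledEmbedding:
    """Mean-pools token embeddings over non-padding tokens (sketch)."""

    def _post_process(self, token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
        # token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)
        mask = attention_mask[..., None].astype(token_embeddings.dtype)
        summed = (token_embeddings * mask).sum(axis=1)
        counts = np.clip(mask.sum(axis=1), 1e-9, None)
        return summed / counts

class PooledNormalizedEmbedding(PooledEmbedding):
    """Mean pooling followed by L2 normalization, as in sentence-transformers
    models that end with a Normalize module (sketch)."""

    def _post_process(self, token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
        pooled = super()._post_process(token_embeddings, attention_mask)
        return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

tokens = np.random.default_rng(0).normal(size=(2, 4, 8)).astype(np.float32)
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])
emb = PooledNormalizedEmbedding()._post_process(tokens, mask)
print(emb.shape, np.linalg.norm(emb, axis=1))  # (2, 8) [1. 1.]
```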
Todo: