[Model]: Add itdainb/PhoRanker and keepitreal/vietnamese-sbert #462

Open
duon9 opened this issue Feb 4, 2025 · 2 comments


duon9 commented Feb 4, 2025

Which model would you like to support?

https://huggingface.co/keepitreal/vietnamese-sbert
https://huggingface.co/itdainb/PhoRanker

What are the main advantages of this model?

I noticed that fastembed does not support any Vietnamese models, so I would be very grateful if you could add these to fastembed.

Member

joein commented Mar 2, 2025

Hey @duon9

Yeah, unfortunately, Vietnamese models are underrepresented in fastembed, and Vietnamese is currently supported only by the multilingual models.

It seems that these particular models do not get much attention (~1-10k downloads per month), and we are trying to keep fastembed slim.

Also, the authors of the models have not converted them to ONNX.
If you are willing to convert the models to ONNX yourself, you can use them through the custom model functionality we added in fastembed 0.6.0.

If a model follows typical preprocessing / postprocessing steps (just pooling / normalization), it can be added to fastembed at runtime via .add_custom_model (example from readme)
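
To make "just pooling / normalization" concrete, here is a minimal pure-Python sketch of mean pooling followed by L2 normalization. It only illustrates the idea and is not fastembed's internal implementation; the function names `mean_pool` and `l2_normalize` are made up for this example.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # avoid division by zero on an all-padding input
    return [t / count for t in total]


def l2_normalize(vector):
    """Scale the vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Two real tokens and one padding token; the padding vector is ignored.
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

If a model needs more than this on top of the raw ONNX output, it probably does not fit the custom model path described here.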

Once you have converted the model, you can either upload it to Hugging Face and load it from there, or, if you want to keep it private, load it via the specific_model_path param (the path to the directory where your model is stored on disk)
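
A rough sketch of how that could look, modelled on the custom model example in the fastembed README. Everything specific here is an assumption to adjust: the repo id `your-username/vietnamese-sbert-onnx` is a placeholder for wherever the converted files end up, `dim=768` and mean pooling are guesses that should be checked against the actual model, and `onnx/model.onnx` assumes that is where the converted file lives. The helper names are made up for this example.

```python
# Placeholder repo id for the converted ONNX files (hypothetical, does not exist).
MODEL_NAME = "your-username/vietnamese-sbert-onnx"


def constructor_kwargs(local_dir=None):
    """Build the TextEmbedding constructor arguments.

    When local_dir is given, the files are read from that directory on disk
    via specific_model_path instead of being fetched from Hugging Face.
    """
    kwargs = {"model_name": MODEL_NAME}
    if local_dir is not None:
        kwargs["specific_model_path"] = local_dir
    return kwargs


def register_and_embed(texts, local_dir=None):
    # Imported here so the sketch can be read without fastembed installed.
    from fastembed import TextEmbedding
    from fastembed.common.model_description import ModelSource, PoolingType

    TextEmbedding.add_custom_model(
        model=MODEL_NAME,
        pooling=PoolingType.MEAN,      # assumption: the model uses mean pooling
        normalization=True,            # assumption: outputs should be L2-normalized
        sources=ModelSource(hf=MODEL_NAME),
        dim=768,                       # assumption: verify against the converted model
        model_file="onnx/model.onnx",  # assumption: location of the converted file
    )
    model = TextEmbedding(**constructor_kwargs(local_dir))
    return list(model.embed(texts))
```

Calling `register_and_embed(["xin chào"])` would pull the files from the Hugging Face repo, while passing `local_dir="/path/to/converted/model"` (also a placeholder) would keep everything on disk.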

Author

duon9 commented Mar 6, 2025

Hi @joein

Thanks for the clarification! I understand the focus on keeping fastembed slim. I'll look into converting the models to ONNX and testing the custom model functionality in fastembed 0.6.0. Appreciate the insights!
