feat: add fp8 source files #50
base: main
e11ba83 to db8c667
Thanks for the PR, I will try to test this soon!
Signed-off-by: Mickael Seznec <[email protected]>
b6ca541 to 375edf5
Apologies for the very long delay! The fix for the 12.1 build issues has now landed on main, and the MLA work has landed too, so we are cleared to land this. However, is the
This reverts commit 375edf5. Signed-off-by: Mickael Seznec <[email protected]>
Signed-off-by: Mickael Seznec <[email protected]>
Great, I didn't know about torch.expand(). I'll use it on the vLLM side to avoid any magic hidden in the interface layer. Also, I've seen some API mismatches between
Awesome, thank you! Do you mind removing these bits from the flash_api.cpp in this PR?
Oh, good catch! Oof, not sure how that slipped by. PRs welcome 🙂
Hi @LucasWilkinson, I've updated my PR accordingly. Let me know if you have any further comments :)