Request: Add Flash Attention 2.0 Support for ViTMAEForPreTraining #36527
Labels: Feature request, Flash Attention, Good Second Issue, Vision
Hi Hugging Face team!
I am currently working on pre-training a foundation model with ViTMAEForPreTraining, and I was hoping to use Flash Attention 2.0 to speed up training and reduce memory usage. However, when I attempted to enable Flash Attention, I encountered the following error:
ValueError: ViTMAEForPreTraining does not support Flash Attention 2.0 yet. Please request to add support where the model is hosted, on its model hub page: https://huggingface.co//discussions/new or in the Transformers GitHub repo: https://github.com/huggingface/transformers/issues/new
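For reference, a minimal sketch of the loading code that triggers the error (the facebook/vit-mae-base checkpoint is only an example; any ViTMAE checkpoint should behave the same):

import torch
from transformers import ViTMAEForPreTraining

# Requesting Flash Attention 2 at load time currently raises the ValueError above,
# because ViTMAEForPreTraining does not yet declare Flash Attention 2 support.
model = ViTMAEForPreTraining.from_pretrained(
    "facebook/vit-mae-base",                  # example checkpoint
    torch_dtype=torch.float16,                # Flash Attention 2 requires fp16 or bf16
    attn_implementation="flash_attention_2",  # the option that is not supported yet
)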
Since MAE pre-training relies heavily on the attention mechanism, adding Flash Attention support would be a valuable enhancement, especially for larger ViT models and high-resolution datasets such as the Landsat imagery we are working with.
Feature Request
Why This Matters
Environment Details