2805 Bowers Ave, Santa Clara, CA 95051 | 408-730-2275
research@colfax-intl.com

FlexAttention + FlashAttention-4: Fast and Flexible (External)

In this PyTorch blog post, on which we collaborated, we explain the FlexAttention extension to FlashAttention-4 (or, viewed from another angle, the incorporation of FA-4 as an attention backend for the PyTorch FlexAttention API).
