[WARNING|logging.py:328] 2025-02-26 14:54:51,387 >> You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with model.to('cuda').
Does this warning indicate that I failed to enable Flash Attention 2?
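For reference, here is roughly the loading pattern the warning is talking about; this is only a sketch, and the model id and dtype below are placeholders, not my exact setup:

```python
import torch
from transformers import AutoModelForCausalLM

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder model id

# Load on CPU first while requesting Flash Attention 2.
# FA2 requires a half-precision dtype (fp16 or bf16).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
)

# The warning fires at load time because the weights are still on CPU;
# moving the model to GPU afterwards, as the message suggests, is what
# it asks for.
model = model.to("cuda")
```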