How do I evaluate the performance of a .pte file after Executorch quantization?
#14988
Replies: 2 comments
Sorry for the late reply. Do you mean that you want to measure accuracy on device? I think there are two ways: one is to bundle the test inputs and reference outputs with the model, and the other is to push the data to the device, run the model there, and pull the outputs back. cc: @Gasoonjia
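For the push/pull route, the comparison on the host side could look something like the sketch below. It assumes the device writes each output tensor as a raw float32 buffer in the same byte order as the host, and that a reference output from the unquantized model was saved the same way; the function names `read_f32` and `compare_outputs` are made up for illustration and are not part of ExecuTorch.

```python
import math
from array import array


def read_f32(raw: bytes) -> list[float]:
    """Decode a raw float32 buffer (assumed native byte order) into floats."""
    values = array("f")
    values.frombytes(raw)  # raises ValueError if len(raw) is not a multiple of 4
    return values.tolist()


def compare_outputs(device: list[float], reference: list[float]) -> dict[str, float]:
    """Return the max absolute error and cosine similarity of two outputs."""
    if len(device) != len(reference):
        raise ValueError("output sizes differ")
    max_abs_err = max((abs(d - r) for d, r in zip(device, reference)), default=0.0)
    dot = sum(d * r for d, r in zip(device, reference))
    norm = math.sqrt(sum(d * d for d in device)) * math.sqrt(sum(r * r for r in reference))
    if norm == 0.0:
        # At least one vector is all zeros; treat as a match only if both are equal.
        cosine = 1.0 if device == reference else 0.0
    else:
        cosine = dot / norm
    return {"max_abs_err": max_abs_err, "cosine": cosine}


if __name__ == "__main__":
    # Stand-ins for bytes read from files pulled back from the device.
    reference_raw = array("f", [0.5, -1.0, 2.0]).tobytes()
    device_raw = array("f", [0.5, -1.0, 2.25]).tobytes()
    print(compare_outputs(read_f32(device_raw), read_f32(reference_raw)))
```

In practice the two byte strings would come from `open(path, "rb").read()` on the pulled files. What counts as an acceptable error depends on the model and the quantization scheme.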
Hi! 👋 That's a great question. I'm not an expert in wav2vec2 quantization, but here's what I'd try if I were evaluating the .pte file on mobile:

Performance Metrics

- PER/FER: Feed some audio samples into the quantized model and compare the predicted transcription with the ground truth. Calculate PER (Phoneme Error Rate) or FER (Frame Error Rate) using a small evaluation script.
- RTF (Real-Time Factor): Measure the time the model takes to process an audio sample and divide it by the audio duration.
- Peak RAM Usage: On Android, you could use the Android Studio Profiler; on iOS, Xcode Instruments can track memory usage while running inference.

If executorch produced a .pte file, you'll likely need to load it using its runtime and run a loop over your test audio samples while logging time and memory.

Hugging Face has a guide on quantization. You can also check the torchaudio or transformers examples for evaluating WER/PER.

Hope this helps! I'd be curious to hear if you find a simple way to log all metrics together; I'm planning to try something similar in my own project.
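A small evaluation script for the metrics above might look roughly like this. It is a sketch under a few assumptions: PER is computed as the total edit distance over the total number of reference phonemes, FER as the fraction of frame labels that differ from frame-aligned references, and `infer` is a hypothetical zero-argument callable that wraps whatever runtime call loads and runs the .pte file. None of these names come from ExecuTorch.

```python
import time
from typing import Callable, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]], hyps: Sequence[Sequence[str]]) -> float:
    """Corpus-level PER: total edits divided by total reference phonemes."""
    total = sum(len(r) for r in refs)
    if total == 0:
        raise ValueError("no reference phonemes")
    return sum(edit_distance(r, h) for r, h in zip(refs, hyps)) / total


def frame_error_rate(ref_frames: Sequence[int], hyp_frames: Sequence[int]) -> float:
    """Fraction of frames whose predicted label differs from the reference."""
    if not ref_frames or len(ref_frames) != len(hyp_frames):
        raise ValueError("frame sequences must be non-empty and equally long")
    return sum(r != h for r, h in zip(ref_frames, hyp_frames)) / len(ref_frames)


def real_time_factor(infer: Callable[[], object], audio_seconds: float, runs: int = 5) -> float:
    """Mean wall-clock time of `infer` divided by the audio duration."""
    infer()  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    return (time.perf_counter() - start) / runs / audio_seconds


if __name__ == "__main__":
    print(phoneme_error_rate([["k", "ae", "t"]], [["k", "ah", "t", "s"]]))
    print(frame_error_rate([0, 0, 1, 1], [0, 1, 1, 1]))
```

An RTF below 1 means the model runs faster than real time. Timings taken on a development machine will not reflect the phone, so the same loop would need to run on the target device, with peak memory read from the platform profiler.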
Hello everyone!
I'm currently working on a react-native project and want to integrate a Hugging Face model (a wav2vec2 model) into my app.
As you know, to use a large model in a mobile environment, the model needs to be quantized.
So, I used executorch to quantize it, and it generated a .pte file.
For now, I need to evaluate the performance (PER, FER, RTF, peak RAM usage) of this file.
How can I do this?