MICROPHONE INPUT
(Real-Time Voice)
AUDIO LOADING
(Librosa)
RESAMPLING
(16 kHz)
NORMALIZATION & MONO CONVERSION
WHISPER ASR MODEL
(Speech → English Text)
ASR TEXT CLEANING
(Lowercase, Normalize)
REFERENCE TEXT
(Ground Truth)
WER CALCULATION
(Compare ASR vs Reference)
(ASR Accuracy Metric)
by Tanuja
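
The last three stages of the pipeline (ASR text cleaning, reference text, WER calculation) can be sketched in plain Python. This is a minimal illustration and not necessarily the implementation behind the diagram. The function names `clean_text` and `wer` are hypothetical, and the exact cleaning rules are an assumption, since the diagram only says "Lowercase, Normalize": here that is taken to mean lowercasing, stripping ASCII punctuation, and collapsing whitespace. WER is computed in the standard way, as the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference.

```python
import re
import string


def clean_text(text: str) -> str:
    """Lowercase, remove ASCII punctuation, and collapse whitespace.

    Assumption: this is one plausible reading of "Lowercase, Normalize";
    the original pipeline may normalize differently.
    """
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate of an ASR hypothesis against a reference transcript."""
    ref = clean_text(reference).split()
    hyp = clean_text(hypothesis).split()
    if not ref:
        raise ValueError("reference text must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (row-by-row Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference_text = "The cat sat on the mat."
    asr_text = "the cat sat on mat"
    print(f"WER: {wer(reference_text, asr_text):.3f}")
```

Cleaning both strings the same way before comparison keeps differences in case and punctuation from being counted as word errors. Note that WER can exceed 1.0 when the hypothesis contains many more words than the reference.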