FPUH Filled pauses and interjections
Definition
Hesitation markers and interjections: oh, er, hmm, uh, ah, etc.
Detection Rules
mfte
cql[word={words}]
Word list (23 items)
ah
aw
bye
eh
er
erm
ha
heh
hey
hi
hm
hmm
huh
mmm
oh
oi
ouch
ow
uh
uhm
um
woops
yeh
mfte
MFTE catch-all (line 1225): all remaining UH-tagged tokens become FPUH. This catches interjections not in the explicit word list (e.g., “Sure” tagged UH by Stanza). DMA tokens excluded because DMA runs earlier and consumes some UH tokens.
cql[pos="UH"]
combine: _ & !DMA
Normalization
Per words
Sources
- mfte — Le Foll, Elen & Shakir, Muhammad (2023/2025) : Multi-Feature Tagger of English (MFTE) — Python version