FPUH Filled pauses and interjections

Definition

Hesitation markers and interjections: oh, er, hmm, uh, ah, etc.

Detection Rules

mfte

cql[word={words}]
Word list (23 items)
ah aw bye eh er erm ha heh hey hi hm hmm huh mmm oh oi ouch ow uh uhm um woops yeh

Requires: word

mfte

MFTE catch-all (line 1225): all remaining UH-tagged tokens become FPUH. This catches interjections not in the explicit word list (e.g., “Sure” tagged UH by Stanza). DMA tokens excluded because DMA runs earlier and consumes some UH tokens.

cql[pos="UH"]
combine: _ & !DMA

Requires: pos

Normalization

Per words

Sources

  • mfte — Le Foll, Elen & Shakir, Muhammad (2023/2025) : Multi-Feature Tagger of English (MFTE) — Python version