QUTAG YAML source

code: QUTAG
mfte_code: QUTAG
name: Question tags
definition: >-
  Tag questions appended to statements: "isn't it?", "do they?", "won't you?"
normalization: finite_verbs
detection:
- source: mfte
  requires:
  - word
  - pos
  parts:
    p1:
      cql: "[word=\"can|could|will|would|shall|should|may|might|must|did|had|do\"] [word=\"not|n't\"] [pos=\"PRP\"] [word=\"?\"]"
    p2:
      cql: '[word="can|could|will|would|shall|should|may|might|must|did|had|do"] [pos="PRP"] [word="?"]'
    p3:
      cql: "[word=\"is|does|was|has|do\"] [word=\"not|n't\"] [word=\"it|she|he|they\"] [word=\"?\"]"
    p4:
      cql: '[word="is|does|was|has"] [word="it|she|he"] [word="?"]'
    p5:
      cql: "[word=\"do|were|are|have\"] [word=\"not|n't\"] [word=\"you|we|they\"] [word=\"?\"]"
    p6:
      cql: '[word="do|were|are|have"] [word="you|we|they"] [word="?"]'
    p7:
      cql: '[pos="PRP"] [word="can|could|will|would|shall|should|may|might|must|did|had"] [word="?"]'
    p8:
      cql: "[word=\"it|she|he|they\"] [word=\"is|does|was|has|do\"] [word=\"not|n't\"] [word=\"?\"]"
    p9:
      cql: '[word="it|she|he"] [word="is|does|was|has"] [word="?"]'
    p10:
      cql: "[word=\"you|we|they\"] [word=\"do|were|are|have\"] [word=\"not|n't\"] [word=\"?\"]"
    p11:
      cql: '[word="you|we|they"] [word="do|were|are|have"] [word="?"]'
  combine: "p1 | p2 | p3 | p4 | p5 | p6 | p7 | p8 | p9 | p10 | p11"
  description: >-
    MFTE tag question patterns (lines 607-620), in two groups: canonical
    (p1-p6: aux/modal + optional negation + pronoun + ?) and reversed
    (p7-p11: pronoun + aux/modal + optional negation + ?). The patterns
    use explicit word lists rather than POS classes to match MFTE's
    precision. MFTE applies negative WH-word checks at j-4/j-5 so that
    WH-question sentences are not tagged (e.g., in "What could I do?",
    "I do ?" must not be tagged as QUTAG). Because CQL cannot express
    this positional lookbehind, "do" is omitted from the reversed
    PRP + aux + ? pattern (p7) to avoid this false positive; the
    canonical patterns still catch "do you?" etc.
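  # Illustrative token sequences for the parts above. This is a sketch,
  # not taken from MFTE; it assumes "n't" is split off as its own token
  # and that personal pronouns are tagged PRP.
  #   is n't it ?    -> p3 (canonical, negated)
  #   do they ?      -> p2 and p6 (canonical)
  #   they could ?   -> p7 (reversed)
  #   I do ?         -> no match ("do" is not in the p7 word list, and
  #                     "I" is not in the p9/p11 pronoun lists)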
- requires:
  - word
  - pos
  cql: '[pos="MD|VBP|VBZ|VBD"] [pos="PRP"] [word="?"]'
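  description: 'Approximate: finite verb or modal (MD, VBP, VBZ, VBD) + personal pronoun (PRP) + question mark. The verb tags are not restricted to auxiliaries, so lexical verbs also match.'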
sources:
- mfte
notes: Relevant to conversational/speech registers.