NCOMP YAML source

code: NCOMP
mfte_code: NCOMP
name: Noun compounds
definition: >-
  Two or more nouns appearing adjacently (e.g., "computer screen", "sword blade").
normalization: nouns
detection:
- source: mfte
  requires:
  - word
  - pos
  parts:
    p1:
      cql: '[cat="NN|NNP|NNS|NNPS" & word=".{3,}"] [cat="NN|NNS" & word=".{2,}"]'
      anchor: last
    p2:
      cql: '[cat="NN|NNP|NNS|NNPS" & word=".{5,}"] [word="-"] [cat="NN|NNS" & word=".{3,}"]'
      anchor: last
  combine: "p1 | p2"
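  description: >-
    MFTE tags the second noun of each adjacent noun pair. The first noun
    must be at least 3 characters long and may carry any noun tag (NN, NNS,
    NNP, NNPS); the second must be at least 2 characters long and tagged
    NN or NNS. Proper-noun tags (NNP, NNPS) are excluded from the second
    slot to avoid tagging proper-name pairs such as "Barack Obama".
    Hyphenated compounds are also matched: a noun of 5+ characters, a
    hyphen token, and an NN or NNS of 3+ characters.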
- requires:
  - pos
  cql: '[pos="NN|NNS|NNP|NNPS"]{2,}'
examples:
- text: Surely this stone must be the last one to cover the dungeon _entrance_!
  source: le_foll_2024
- text: Experts say that the rare winter _phenomenon_ is a natural occurrence.
  source: le_foll_2024
sources:
- mfte
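
The p1 and p2 patterns above can be read as a small matcher over (word, tag) pairs. The following Python sketch is only an illustration of that reading, not the MFTE implementation: the function names and the tuple-based token format are hypothetical, and it assumes the hyphen arrives as its own token, as p2 implies. Tag sets and minimum lengths are copied from the two CQL patterns, and the returned indices correspond to `anchor: last`.

```python
# Illustrative sketch of "p1 | p2"; names and token format are hypothetical.
ANY_NOUN = {"NN", "NNP", "NNS", "NNPS"}   # tags allowed in the first slot
COMMON_NOUN = {"NN", "NNS"}               # tags allowed in the last slot


def is_noun(token, tags, min_len):
    """True if the (word, tag) pair has an allowed tag and enough characters."""
    word, tag = token
    return tag in tags and len(word) >= min_len


def ncomp_anchors(tokens):
    """Return sorted indices of the last noun in every p1 or p2 match."""
    hits = set()
    for i in range(len(tokens)):
        # p1: noun (3+ chars) directly followed by NN/NNS (2+ chars)
        if (i + 1 < len(tokens)
                and is_noun(tokens[i], ANY_NOUN, 3)
                and is_noun(tokens[i + 1], COMMON_NOUN, 2)):
            hits.add(i + 1)
        # p2: noun (5+ chars), a "-" token, then NN/NNS (3+ chars)
        if (i + 2 < len(tokens)
                and is_noun(tokens[i], ANY_NOUN, 5)
                and tokens[i + 1][0] == "-"
                and is_noun(tokens[i + 2], COMMON_NOUN, 3)):
            hits.add(i + 2)
    return sorted(hits)


if __name__ == "__main__":
    sample = [("the", "DT"), ("dungeon", "NN"), ("entrance", "NN"), ("!", ".")]
    print(ncomp_anchors(sample))
```

On the first example sentence, only "entrance" would be anchored, which matches the underscored token in the examples list. A pair tagged NNP NNP yields no match, because the last slot accepts only NN or NNS.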