Features by Category
447 features across 25 categories with 654 detection rules.
Adjective Modification (1)
| Code | Name | Rules |
|---|---|---|
| COMPAR | Comparatives | 1 |
Adjective Semantics (15)
Semantic classes of adjectives (Biber 2006).
Adjectives (8)
Adjective features (attributive, predicative).
Adverb Semantics (18)
Semantic classes of adverbs (Biber 2006).
Adverbials (7)
Adverb and adverbial expression features.
Adverbs (15)
Adverb features.
Conjunctions (7)
Conjunction features (coordinating, subordinating).
| Code | Name | Rules |
|---|---|---|
| ALTHOUGH | Although | 2 |
| AS_IF | As if | 2 |
| AS_THOUGH | As though | 2 |
| THOUGH | Though | 2 |
| TILL | Till | 2 |
| UNTIL | Until | 2 |
| WHETHER | Whether | 2 |
Derivational Morphology (62)
Derivational prefixes and suffixes, following Bohmann (2019), Baayen (1994), and Biermeier (2008). Regex-based detection on word forms.
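As a rough illustration of regex-based detection on word forms, the sketch below counts tokens that match a few affix patterns. The feature codes (`SUFFIX_NESS`, `SUFFIX_IZE`, `PREFIX_UN`) and the patterns themselves are hypothetical, chosen only for this example; they are not the tool's actual rules.

```python
import re
from collections import Counter

# Hypothetical affix patterns for illustration only; not the tool's real rules.
AFFIX_PATTERNS = {
    "SUFFIX_NESS": re.compile(r"^\w{3,}ness(?:es)?$", re.IGNORECASE),
    "SUFFIX_IZE": re.compile(r"^\w{3,}i[sz]e[sd]?$", re.IGNORECASE),
    "PREFIX_UN": re.compile(r"^un\w{3,}$", re.IGNORECASE),
}


def count_affixes(tokens):
    """Count how many tokens match each affix pattern."""
    counts = Counter()
    for token in tokens:
        for code, pattern in AFFIX_PATTERNS.items():
            if pattern.match(token):
                counts[code] += 1
    return counts


tokens = "The unkindness of critics can modernize or unsettle a writer".split()
print(count_affixes(tokens))
```

A token can match more than one pattern ("unkindness" counts for both `PREFIX_UN` and `SUFFIX_NESS`). Matching on word forms alone also overgenerates: "under" would match the hypothetical `PREFIX_UN` pattern, so real rules would presumably need exclusion lists or stricter patterns.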
Determinatives (7)
Determiners, demonstratives, quantifiers, numerals, and s-genitives.
| Code | Name | Rules |
|---|---|---|
| CD | Cardinal numbers | 1 |
| DEMO | Demonstrative determiners | 2 |
| DEMOP | Demonstrative pronouns | 3 |
| DT | Determiners | 2 |
| NUMERAL | Numerals | 1 |
| POS | S-genitives | 1 |
| QUAN | Quantifiers | 1 |
Determiners (5)
Determiner features.
| Code | Name | Rules |
|---|---|---|
| A_LOT_OF | A lot of | 2 |
| DEF_ART | Definite article the | 2 |
| INDEF_ART | Indefinite article a(n) | 1 |
| LOTS_OF | Lots of | 2 |
| MANY_MUCH | Many/much | 1 |
Discourse Organization (26)
Conjunctions, subordination, relative clauses, questions, discourse markers.
Function Words (69)
Individual function word frequencies, following stylometric tradition (Burrows, Mosteller & Wallace, Grieve 2023). Each feature measures the relative frequency of a single high-frequency function word.
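The relative frequency of a single function word can be sketched as its count divided by the total token count, scaled to a fixed base. The word list, the simple regex tokenizer, and the per-1,000-words base below are assumptions made for this example, not documented behavior of the tool.

```python
import re
from collections import Counter

# Hypothetical subset for illustration; the category itself has 69 features.
FUNCTION_WORDS = ("the", "of", "and", "to")


def relative_frequencies(text, per=1000):
    """Return each function word's frequency per `per` tokens."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {word: 0.0 for word in FUNCTION_WORDS}
    counts = Counter(tokens)
    return {word: counts[word] * per / len(tokens) for word in FUNCTION_WORDS}


print(relative_frequencies("The cat and the dog went to the old park"))
```

Normalizing by text length keeps the values comparable across texts of different sizes, which is the usual reason stylometric work reports relative rather than raw frequencies.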
General Text Properties (5)
Text-level measures: average word length, lexical density, mean sentence length, type-token ratio (TTR), and word count.
| Code | Name | Rules |
|---|---|---|
| AWL | Average word length | 1 |
| LDE | Lexical density | 1 |
| MSL | Mean sentence length | 1 |
| TTR | Type-token ratio | 1 |
| WORDCOUNT | Word count | 1 |
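Four of these measures follow standard definitions and can be sketched without linguistic annotation. The example below assumes a simple regex tokenizer and splits sentences on terminal punctuation; the tool's own tokenization may differ. Lexical density (LDE) is omitted because it depends on part-of-speech tags.

```python
import re


def text_properties(text):
    """Compute WORDCOUNT, AWL, TTR, and MSL under simplified assumptions."""
    # Assumed sentence splitter: break on runs of ., !, or ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    if n == 0:
        return {"WORDCOUNT": 0, "AWL": 0.0, "TTR": 0.0, "MSL": 0.0}
    return {
        "WORDCOUNT": n,                              # number of tokens
        "AWL": sum(len(t) for t in tokens) / n,      # mean characters per token
        "TTR": len(set(tokens)) / n,                 # distinct types / tokens
        "MSL": n / len(sentences),                   # mean tokens per sentence
    }


print(text_properties("The cat sat. The cat ran."))
```

TTR falls as texts grow longer, so values are only comparable between texts of similar length unless some length correction is applied.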
Lexis (9)
Noun counts, noun compounds, gerunds, nominalizations, and web-specific tokens (emoji, hashtags, URLs and email addresses).
| Code | Name | Rules |
|---|---|---|
| EMO | Emoji and emoticons | 1 |
| GER | Gerunds | 2 |
| HST | Hashtags | 1 |
| NCOMP | Noun compounds | 2 |
| NN | Total other nouns | 2 |
| NNP | Proper nouns | 1 |
| NN_ALL | Total nouns (all) | 1 |
| NOMZ | Nominalizations | 2 |
| URL | URLs and email addresses | 2 |
Modals (13)
Individual modal verbs and modal constructions.
| Code | Name | Rules |
|---|---|---|
| ABLE | BE ABLE TO | 1 |
| MDCA | Modal CAN | 3 |
| MDCO | Modal COULD | 3 |
| MDMM | Modals MAY and MIGHT | 2 |
| MDNE | Necessity modals | 2 |
| MDOU | Modal OUGHT | 2 |
| MDSL | Modal SHALL | 2 |
| MDWO | Modal WOULD | 3 |
| MDWS | Modals WILL and SHALL | 2 |
| POMD_ALL | Possibility modals | 2 |
| PREDMD_ALL | Predictive modals | 2 |
| WILL_CONT | Contracted will ('ll) | 1 |
| WILL_FULL | Uncontracted will | 2 |
Negation (3)
Negation features (analytic and synthetic).
| Code | Name | Rules |
|---|---|---|
| NEG_ALL | Negation (all) | 1 |
| XX0 | Analytic negation | 3 |
| XXSYN | Synthetic negation | 2 |
Noun Semantics (26)
Semantic classes of nouns (Biber 2006).
Prepositions (4)
Preposition counts.
| Code | Name | Rules |
|---|---|---|
| AMONG | Among | 2 |
| AMONGST | Amongst | 2 |
| IN | Prepositions | 3 |
| PREP_SEQ | Preposition sequences | 1 |
Pronouns (20)
Personal, demonstrative, indefinite, and quantifying pronouns.
Stance (12)
Stance-taking devices: amplifiers, downtoners, emphatics, hedges, politeness.
| Code | Name | Rules |
|---|---|---|
| AMP | Amplifiers | 3 |
| DEFNEG | Definite: negative | 2 |
| DEFPOS | Definite: positive | 2 |
| DWNT | Downtoners | 3 |
| EMPH | Emphatics | 3 |
| HDG | Hedges | 3 |
| POLITE | Politeness markers | 1 |
| RBAPPROX | Approximators | 2 |
| RBDIMIN | Diminishers | 2 |
| RBINTNS | Intensifiers (non-specific degree) | 2 |
| RBMAX | Maximisers | 2 |
| RBMIN | Minimisers | 2 |
Stance Complement Patterns (28)
That-clauses, to-clauses, and WH-clauses subcategorised by the stance type of the preceding adjective, noun, or verb (Biber 2006).
Stative Forms (3)
Existential THERE and copular BE.
| Code | Name | Rules |
|---|---|---|
| BEMA | BE as main verb | 3 |
| EX | Existential THERE | 2 |
| _EXTHERE | Existential there + BE | 1 |
Syntax (15)
Syntactic features: split auxiliaries, stranded prepositions, coordination, pied-piping.
Verb Features (43)
Verb morphology: tense, aspect, voice, contractions, particles.
Verb Semantics (26)
Semantic verb classes (activity, mental, communication, etc.).