This is a repost of my forensic linguistic analysis of "The Shadowbrokers texts", as posted on the Taia Global blog on August 18, 2016.
This is an initial linguistic analysis of the texts from “The Shadowbrokers”, as posted on Pastebin and taken from the Tumblr account. It is a qualitative analysis that looks at patterns of grammatical and orthographic errors to examine whether the author of the text is a native speaker/writer of US English. A quantitative analysis would help firm up our conclusions and estimate their reliability. This analysis assumes that all the texts were written by a single individual.
The texts contain a number of grammatical errors that are unusual for a native speaker of US English:
1. Omission of definite and indefinite articles (“a” and “the”)
2. Omission of infinitive “to” (e.g., “I want get” instead of “I want to get”)
3. Omission of modal verbs “should” and “must” and auxiliary verb “will”
4. Elision of “it” in “it is ...”
5. Use of progressive form “is Xing” instead of present or past tense form “X” (e.g., “He is breaking” instead of “he breaks” or “he broke”)
6. Use of “are X” instead of “are Xing” or “X” (“they are go” instead of “they are going” or “they go”)
7. Tense confusion: use of base verb form instead of past tense
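As a rough illustration of how two of these patterns (the omitted infinitive “to” and “are X”) could be flagged automatically, here is a minimal Python sketch. The word lists, the function name `find_candidates`, and the sample sentence are all hypothetical and deliberately tiny; they are not taken from the Shadowbrokers texts, and a real study would likely rely on a part-of-speech tagger rather than regular expressions.

```python
import re

# Hypothetical, deliberately small word lists (assumptions for illustration only).
CATENATIVE = ("want", "wants", "need", "needs", "try", "tries")
BASE_VERBS = ("get", "go", "make", "pay", "see", "send", "give")

# A verb like "want" followed directly by a base verb, with no "to" in between.
BARE_INFINITIVE = re.compile(
    r"\b(?:%s)\s+(?:%s)\b" % ("|".join(CATENATIVE), "|".join(BASE_VERBS)),
    re.IGNORECASE,
)

# "is"/"are" followed by a base verb; the trailing \b excludes "going", "gets", etc.
BE_PLUS_BASE = re.compile(
    r"\b(?:is|are)\s+(?:%s)\b" % "|".join(BASE_VERBS),
    re.IGNORECASE,
)


def find_candidates(text):
    """Return candidate matches for the two patterns, keyed by pattern name."""
    return {
        "bare_infinitive": [m.group(0) for m in BARE_INFINITIVE.finditer(text)],
        "be_plus_base": [m.group(0) for m in BE_PLUS_BASE.finditer(text)],
    }


# Made-up sample built from the illustrative phrases in the list above.
sample = "I want get the files. They are go now. I want to get paid."
print(find_candidates(sample))
```

Matches from a sketch like this are only candidates: each one would still need to be checked by hand, since a regular expression cannot tell a genuine error from a legitimate construction.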
Evidence that the author is a native speaker trying to appear non-native:
- Spelling. The spelling is entirely correct throughout, including some long and complex words such as “dictatorship”, “prostitutes”, and “consolation”. If this had been achieved through the use of spell-checking software, we would have expected to see at least one “Cupertino” (choice of a correctly-spelled but contextually wrong word).
- Inconsistent errors. Grammatical errors such as omitting the infinitive “to” or using “is breaking” to mean “breaks” result, in a non-native writer, from deeply held intuitions about how grammar works, so we would expect them to appear consistently. Errors 2, 3, 5, and 6 in the list above (omitted infinitive “to”, omitted modals and “will”, progressive in place of a simple tense, and “are X”) all occur inconsistently: they occur a majority of the time, but not by much. This indicates that someone was inserting errors, rather than making them naturally.
- Mutually inconsistent errors. Errors 5 and 6 (progressive “is Xing” and bare “are X”) are odd together: if the writer knows about the progressive (-ing) form, why do they use it only sometimes when combining the auxiliary “is” or “are” with a verb?
- Grammatical errors in idioms. There are a number of idioms that would be surprising for a low-skilled non-native speaker to use, and some of them are used with grammatical errors that a skilled English speaker would be unlikely to make. The most reasonable explanation, then, is that the errors were inserted by a native speaker after writing the idioms. Examples include:
  - “or [the] bid pump[s] [the] price up”
  - “bidding war”
  - “top friends”
  - “go bye bye”
  - “where [does that] leave Wealthy Elites”
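The "inconsistent errors" argument above could be made quantitative along the following lines. This is only a sketch under stated assumptions: it presumes someone has hand-annotated every context where a given error could have occurred, and the thresholds `low` and `high`, the function names, and the toy annotations are hypothetical placeholders, not counts from the Shadowbrokers texts.

```python
from collections import Counter


def error_rates(annotations):
    """Compute the error rate per error type.

    annotations: iterable of (error_type, made_error) pairs, one pair per
    context in which the error could have occurred.
    """
    opportunities = Counter()
    errors = Counter()
    for error_type, made_error in annotations:
        opportunities[error_type] += 1
        if made_error:
            errors[error_type] += 1
    return {t: errors[t] / opportunities[t] for t in opportunities}


def inconsistent(rates, low=0.5, high=0.9):
    """Error types made a majority of the time, but far from always.

    The cut-offs are assumptions chosen for illustration.
    """
    return sorted(t for t, r in rates.items() if low < r < high)


# Placeholder annotations, NOT measurements from the actual texts.
toy = (
    [("infinitive_to", True)] * 3
    + [("infinitive_to", False)] * 2
    + [("articles", True)] * 5
)
rates = error_rates(toy)
print(rates)
print(inconsistent(rates))
```

With real annotations, an error type landing between the two thresholds would match the pattern described above: present most of the time, but not reliably enough to look like a genuine grammatical intuition.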
In the (unlikely) event that the writer is, in fact, not a native English speaker, their native tongue is much more likely to be Slavic (e.g., Russian or Polish) than either Germanic or Romance.