I feel like, with the rise of AI, something that anonymizes writing styles should exist. For example, it could look for differences between American and British spelling, like color versus colour, or contextual differences like soccer versus football, and make edits accordingly. ChatGPT could be fed a prompt that says “Rewrite the following paragraphs as if they were written by an Australian,” but I don’t know if it would have a good enough grasp of the objective or if it would start shoehorning in references to koalas and fairy floss.
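
To make the spelling idea concrete, here is a minimal sketch of a word-level normalizer. The word list and the function names are made up for illustration, and a real tool would need a much larger dictionary:

```python
import re

# Hypothetical, tiny word list for illustration only.
BRITISH_TO_AMERICAN = {
    "colour": "color",
    "favourite": "favorite",
    "organise": "organize",
    "centre": "center",
}


def _match_case(source, replacement):
    """Copy the capitalization pattern of `source` onto `replacement`."""
    if source.isupper():
        return replacement.upper()
    if source[0].isupper():
        return replacement.capitalize()
    return replacement


def normalize_spelling(text, mapping=BRITISH_TO_AMERICAN):
    """Replace whole words found in `mapping`, keeping their capitalization."""
    alternatives = "|".join(re.escape(word) for word in mapping)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)

    def swap(match):
        word = match.group(0)
        return _match_case(word, mapping[word.lower()])

    return pattern.sub(swap, text)


print(normalize_spelling("My favourite Colour is grey."))
```

This only handles one-to-one spelling swaps. Contextual pairs like soccer versus football are harder, because "football" means different sports depending on the writer, so a plain lookup table could easily change the meaning.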

I tried searching online to see if something like this existed and found a few articles from the 2010s, such as “Software Helps Identify Anonymous Writers or Helps Them Stay That Way” by the New York Times. It talks about stylometry and Anonymouth, but it seems like Anonymouth hasn’t been updated in years. All the recent articles seem to be about plagiarism and AI.

For context, what got me thinking about the topic was remembering J.K. Rowling being revealed as the author of a mystery novel called The Cuckoo’s Calling. Smithsonian wrote an article about it called “How Did Computers Uncover J.K. Rowling’s Pseudonym?” I thought it could make for a neat post here.

  • Syn_Attck@lemmy.today
    8 months ago

    There is a program built into Whonix, I believe it’s called Kloak, that randomizes your keyboard input timing so you can’t be identified via JavaScript-based keystroke timing analysis. There’s also research into defeating stylometric analysis, such as Anonymouth, but I’m sure there are plenty of newer tools. If anyone finds any that work well, please reply here, as I haven’t looked in some years. ‘Stylometric analysis’ is the key phrase to search for.

    With AI this will get worse (better identification based on writing styles), but it will also get better because you can set up a local LLM and ask it to rewrite your text in a certain style. Touching on this, everyone uses a combination of unique phrases and misspelling or mis-spelling (see?) of words, and with enough text from a given account the probability of a correct attribution is very high. It’s how the Unabomber was identified after his manifesto was published: he used a very specific phrase in an unusual way and his brother recognized it, so the brother’s wife convinced him to contact the FBI about it.
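
As a rough illustration of the attribution idea described in the comment above, here is a minimal sketch that compares texts by how often they use common function words. The word list, the function names, and the choice of cosine similarity are all assumptions made for this example; real stylometry tools use far more features and more text:

```python
import math
import re
from collections import Counter

# Assumption: a short, hand-picked list of function words.
# Real studies typically use hundreds of features.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that",
                  "it", "is", "was", "but", "so", "because"]


def profile(text):
    """Relative frequency of each function word in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_likely_author(unknown, known_samples):
    """Return the name whose sample text has the profile closest to `unknown`."""
    target = profile(unknown)
    return max(
        known_samples,
        key=lambda name: cosine_similarity(target, profile(known_samples[name])),
    )
```

With only a sentence or two per author the result is close to noise. The point is just that habits a writer doesn't think about, like how often they say "because" or "so", add up to a fingerprint once there is enough text.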