Recently, on my main blog, I found people commenting to say that they were debating my gender going solely by my writing. That brought back an old set of ideas I had.
There’s no dearth of web apps that guess the gender of a writer from a sample of their writing. But most of these were error-prone when they started off – Jane Austen was classified as a male writer by one of them, I remember.
Now, however, GenderAnalyzer seems to have improved. You’d guess it’s due to learning, a bigger sample space, and so on. Not at all… they have just gone from randomly tagging things as Male to tagging things as Female.
I thought this was strictly for entertainment purposes, until I saw it listed as one of the possible tasks on the TREC Blog Track. That set me thinking.
The first application of such a technology that came to mind was inspired by Agatha Christie’s novels – determining whether the writer of threatening notes was a man or a woman. It helps narrow down the suspects, look out for possible accomplices… yeah, it can be put to various uses.
So over the next couple of days, I should try reading more on this, and try analyzing the rationale (if any) behind this task. I’m skeptical, as I feel something as inherently biological as gender does not map neatly onto socially and culturally influenced things like writing style, and hence any such task is an exercise in futility.
But let’s see.
Watch this space.