I didn’t do as much of a literature survey on this as I’d have liked, but I came across this paper [pdf]. Apparently word frequencies differ between men and women, and that’s the basis for telling them apart. Women use more pronouns than men do: their pronoun frequency is comparable to that of fiction, while men’s is comparable to that of nonfiction.
So I guess it should work like this: identify the genre of the piece first, and then identify the gender.
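Here's a rough sketch of how that two-step idea might look in Python. It is not the paper's method. The pronoun list is a small sample I picked for illustration, the cutoffs in `THRESHOLDS` are made-up numbers, and the genre is assumed to come from some separate step (a classifier, or just a label you already have).

```python
import re

# Small illustrative pronoun list (an assumption); a real feature set
# would be larger and would come from the literature.
PRONOUNS = {
    "i", "me", "my", "you", "your", "he", "him", "his", "she", "her",
    "we", "us", "our", "they", "them", "their", "it", "its",
}

# Hypothetical per-genre cutoffs, NOT taken from the paper.
# Fiction uses more pronouns overall, so its cutoff is set higher.
THRESHOLDS = {"fiction": 0.15, "nonfiction": 0.08}


def pronoun_rate(text):
    """Fraction of word tokens that are pronouns (0.0 for empty text)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(w in PRONOUNS for w in words) / len(words)


def guess_gender(text, genre):
    """Step 2: compare the pronoun rate with the cutoff for the given genre.

    Step 1 (identifying the genre) is assumed to have happened already.
    """
    return "female" if pronoun_rate(text) > THRESHOLDS[genre] else "male"
```

You'd call it as `guess_gender(post_text, "nonfiction")`. A single feature like this is obviously crude; the point is only that the cutoff depends on the genre, which is why the genre has to be identified first.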
When I was looking up stuff for Blog Gender Analysis, I came across uClassify.com. Great site. I guess it can be used for rapid prototyping, just to see whether a particular approach might work or not.
What is it used for, basically? Please do tell me… I’d like to know.