bookblog.net

 


 

August 22, 2003

Sex vs. Gender

Well. When The Gender Genie was launched, no one here expected it to take off and make its way around the Internet. Thanks to everyone who linked to it, and hope you’re having fun taunting your friends with their results.

From reading various sites that link to it, I’ve noticed that many have taken issue with both its results and stats. It also seems like a lot of people are confusing gender with sex, so I thought I’d write up a post to explain the difference.

Sex, apart from the act of having it, refers to biological or physical traits that determine whether one is a man or a woman. We all know the difference between a penis and a vagina, right?

Gender refers to society’s classification of characteristics perceived to be particular to a certain sex. For example, think about humans as hunter-gatherers. Hunting connotes a masculine activity, so your brain might conjure up images of burly men carrying huge rifles and wearing orange vests. But not all hunters are men and a woman who hunts is still biologically a woman. In imagining her, however, you might assign her some masculine traits like being butch or wearing iron-toed boots.

I realize the above gender example leans heavily toward stereotyping, but it gets the point across. Biology determines sex while society assigns gender. To relate this back to The Gender Genie, a woman author whose passage comes up with a male result is seen by Koppel and Argamon’s algorithm as having a masculine quality to her writing because she’s writing more about specific things (using keywords like "the," "a," "some," numbers, and "it") than connections (using keywords like "with," possessives, possessive pronouns, "for," and "not").

The Gender Genie should really come up with results like "masculine" or "feminine" rather than "male" or "female." However, the former set of terms is highly subjective since gender can be assigned by either society as a whole or individual members of society. If a user puts in a passage by a man and gets "feminine" as a result, the user might think of that man as having feminine qualities and answer yes when asked if the result is correct.

The stats themselves are not to be taken at face value. Their near 50/50 results show us that determining sex from a writing sample is hit or miss. Determining gender from writing, though, is another matter entirely.

As for all you men who think The Gender Genie is bunk because of your consistent female results, I suggest you stop fighting it and go buy a dress already.



comments
Trackback Excerpt: BookBlog is featuring a little toy called the Gender Genie. Based on an algorithm designed to link syntax and word choice with gender, the Genie is supposed to be able to intuit an author's sex based on a sample of their writing....

The page we got the algorithm from specifically mentions it's for fiction, too, so for everyone slapping blog posts and e-mail into it, you're bound to get off results anyway.

Still, a fun exercise.

A fun exercise indeed, and an interesting theory behind it.

But -- unless I have misunderstood the NYTimes description of the algorithm -- I think the programming on the Gender Genie may be a bit off. Step One is to count the total number of words in the document; I interpret the subsequent steps to mean adding to or subtracting from this total. Yet if I input the gender-marker-free sentence "Dogs run faster than cats" into the Gender Genie as it is programmed now, it gives me a score of 0 rather than the score of 7 (= total number of words, with no modifiers for gender) I would expect from the algorithm.

If this interpretation of the (rather ambiguous) algorithm is correct, the program will more likely misattribute men's writing as "female" than the other way 'round. Perhaps this, in addition to the already-mentioned tendency for users to report incorrect results more frequently than correct ones, is skewing the program's accuracy?

I forwarded this post to DH, thank you for writing it!! He took it, it said female and started laughing about it being wrong. I had to explain to him that, well honey, you're just more "feminine and sensitive" than other men. I don't think he appreciated it, but it was hilarious. He even got a kick out of it. I love it so much I linked to it on my blog!! Thanks again!!!

Heh. OK, in my earlier comment, that should be a word count of *five*. Duh.

I think perhaps I've spent too much time in front of the computer today. :)

What a load of rubbish. The Gender Genie assigns gender arbitrarily. How can a writing style be considered "masculine" when the results show that it is not typical of men, or considered "feminine" when the results show that it is not typical of women? All you've done is take one group of words and arbitrarily assign it the "feminine" label, and arbitrarily assign another group of words the label "masculine".

The results clearly show that this labelling is meaningless. The results have to do with the subject and purpose of the writing, but the subjects and purposes are not masculine or feminine - they are ungendered.

Karen, Rich and I debated your very point right after he built the original program. If you notice, though, the instructions don’t say: 1. Count the number of words in the document. 2. For each appearance in the document of the following words ADD the number of points indicated *to that total*. If we were to do that, the algorithm would come up with a male result nearly every time. When you add the keyword score to the word count, you practically guarantee a higher score because the keyword total would have to be a negative number for it to be lower. Considering that "the" is worth +17 points, it’s difficult to come up with a negative keyword total. Try it by hand on a few texts and you’ll see what we mean.
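To make the two readings concrete, here is a minimal sketch in Python. It is not the Gender Genie's actual code or the published formula. The only weight confirmed in this thread is "the" at 17 points; every other weight, the function names, and the convention that a positive total leans male are assumptions made for illustration.

```python
import re

# Hypothetical weights for illustration only. The thread confirms a single
# value ("the" is worth 17 points); all other numbers are placeholders.
MALE_WEIGHTS = {"the": 17, "a": 1, "some": 1, "it": 1}
FEMALE_WEIGHTS = {"with": 1, "for": 1, "not": 1}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_score(text):
    """Keyword points only. Assumed convention: a positive total leans male."""
    words = tokenize(text)
    male = sum(MALE_WEIGHTS.get(w, 0) for w in words)
    female = sum(FEMALE_WEIGHTS.get(w, 0) for w in words)
    return male - female


def score_added_to_word_count(text):
    """The alternative reading: start from the word count, then add keyword points."""
    return len(tokenize(text)) + keyword_score(text)


neutral = "Dogs run faster than cats"
print(keyword_score(neutral))              # 0
print(score_added_to_word_count(neutral))  # 5
```

Under the second reading, every text starts with a positive score equal to its length, so the female keywords would have to outweigh both the male keywords and the word count before the total could go negative.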

Cristina, glad to hear that you’ve been enjoying it and thanks for the link! DH now knows what kinds of words he needs to use in order to butch up his writing. :)

Aussieintn, I think your beef should be with the authors of the algorithm and not with us. As we’ve explained several times, we didn’t create the formula. We simply wrote the program that tests it. Although it seems like you’re a bit insecure with it having stuck a feminine label on some of your writing, you shouldn’t let The Gender Genie threaten your masculinity. Have a good cry, eat some chocolate, talk about it with your best friend, go shopping, buy a new pair of shoes, then move on. And if it makes you feel better, your comment above came up as male.

Aussieintn:

Here's a final attempt to help you "get it," which I expect to be ignored (given your demonstrated propensity for failing to acknowledge anything we say to you), but would feel remiss if I didn't make.

Try reading the article that gave us the idea.

It's radical, I know.

Bah. I don't know why I even try. It's apparent either A) you're just commenting here to bust on people you don't know, like any common Internet troll, or B) Mary's right, the formula keeps calling you a girly-man, and you need to go watch "An Affair to Remember" and treat yourself to a day at the spa.

Get over it.

hey ... it says i write like a girl.... it works correctly - that's that... excellent bit of coding, Mary!!!

(and kick arse coding, as well, rich!)

10-Q, 10-Q. :-)

On two out of the three things I pasted in your fun little Gender Genie it said I wrote like a man. Which I just find terribly funny. Since I've always wanted to be butch and wear iron-toed boots, now I have the perfect reason to shave my head and go buy some men's work boots. Thank you so much!!
Kidding...but I still think this is a fun little thing to play with, I need to test out some more stuff.

A fun exercise indeed :) But unfortunately has the potential to be taken much too seriously.

I do think it is a good idea to change 'The Gender Genie' to say 'personal/intimist' instead of 'female', and 'impersonal' instead of 'male'.

I personally think it's unfair to men to assume that they should always write like cold heartless bastards. :)

(A poetic blog post of mine came out as 'male', as did a blog post of mine on a technical subject. However, I stuck in two emails I wrote to a friend, and they turned out to be 'female'. :))

At least now we are not limited to "You throw like a girl!" or, as Austin Powers would put it, "You fight like a woman!"

I'm moved.

I want to formally disagree with changing the outcome labels from male/female to something dilute like specificist/intimist, or even masculine/feminine.

The formula came from an article about two guys who'd come up with an algorithm to attempt to determine the PHYSICAL SEX of the author. Not socially-defined "gender," and not "specificist" or "intimist."

The question the algorithm's creators wanted to answer was, "Is the author a boy or a girl?" It was not "are they masculine or feminine?" or "are they trying to write in a masculine or feminine manner?"

Mutter mutter politically-correct mumbo jumbo grumble grump geroff my yard!

To say that men (or the masculine element) use numbers more than women is perhaps almost as insulting as saying "men are better at mathematics than women."

Interesting, though I give it and its results about as much credence as a vegan gives an all-steak diet.

I found it funny that most of my prose comes out male, and most of my poetry comes out female, likely because I tend to exclude articles (a and the being big scorers) when I write poetry, for the sake of meter. (Not that my meter is anything to write home about.)

"The wedding made the whole audience cry. It was just a beautiful experience. The bride was charming and the groom was a charmer," is a prime example of the flaws in the program. Though this is a very stereotypically female line, it definitely comes out masculine. I personally don't think keywords can be used to detect gender of author with any real reliability, since syntax is so hard to determine and that's what's critical, though the stereotype of women writing about relationships and men writing about things may indeed be very accurate.

As far as your implementation goes, "their", "our", "your" and "my" were not recognized as possessive pronouns (as far as I could tell.) Also, "her" was not, though it is actually more likely to be used other than as a possessive. (Another syntax challenge.) I'm sure you're aware of these things. Still, it was a great toy to show some friends and made a great conversation starter.

I did notice (as some other people have) a distinct difference based on the person of a text. Almost all my prose that came out feminine was written in first person present. Most of my prose is written in third person past, and accordingly, most of it came out as masculine.

Just some observations. Thanks for sharing this toy with the internet.

I should have caught the gender versus sex distinction, as it annoys me when people completely interchange the two words. Then again, making the name sound good could have meant exactly that; sex genie would sound very different...

I wasn't completely surprised at my overwhelming tendency toward female results.

I would like to defend the position of aussieintn. He or she puts it rather bluntly, but in essence he/she is right.

I understand the difference between sex and gender: sex is biological and gender is sociological. Sex is rather straightforward to ascertain; gender, however, is a little more complicated. But the conclusion of the theory is that there is some connection between those two. Otherwise the definition of gender would be meaningless. Maybe you would like the world to be otherwise, and I would like it to be too, but it isn't.

The trick to proving this theory is to predict, from material that we assume to be gender-based, the sex of the person who produced it, in this case by means of an algorithm that skims the words of a text.
The results for the Gender Genie are now approximating 50/50, something you would expect if you randomly guessed people's sex, but not when there ought to be a correlation, as in this case. You would expect it to be off on one side, maybe just a little (because gender and sex aren't exactly the same, but there should be some similarity).

So this algorithm, fun as it may be, doesn't really do anything. What it does do is prove that there's no connection whatsoever between this algorithm and the sex of the writer (and, in connection with that, their gender). None whatsoever. Which is actually already something, albeit nothing spectacular. But the thought itself remains fun.

Steph, I’m going to have to agree with Rich (but in a much less curmudgeonly way) on leaving the Gender Genie’s results as is for now. Your point is well-taken and right in some respects, but gender is really all about stereotyping. Our little toy is just that, a toy, and no one should take what it says about their writing to heart.

Ben, somehow, I think "you write like a girl" doesn’t carry the same insult value as "you throw like a girl," but it makes us book geeks laugh anyway. :)

Dildo, sorry you’re insulted.

Leticia, thanks for your observations, and it’s true that the Gender Genie’s results mean little in the grand scheme of things. Just as an FYI, "their," "our," "your," "my," and "her" are all possessive adjectives and not possessive pronouns. Pronouns take the place of a noun while these words are used to modify a noun. (Look, Rich, another lesson about pronouns from the schoolteacher!)

Jay, one of the names we tossed around was "Sex Sensor," but it lost out to Gender Genie for fear of drawing the attention of every perv on the Internet. :)

Tim, we didn’t take offense to Aussieintn’s observations, just his tone of voice in this and another thread. The 50/50 results don’t surprise us at all since we knew the algorithm needed work before the application was written. It’s just a fun time-waster and does prove our original conclusion that sex cannot be definitively ascertained through writing samples. Since gender is a way of perpetuating stereotypes, it’s my opinion that we all need to live our lives dispelling such myths. As a result, the Gender Genie’s failure to be more accurate pleases me.

I just had a thought (and I hope it's not been brought up already or I'll look like an idiot)...does the fact that most people are analyzing short passages or stories that may be about one specific thing or person have anything to do with its inaccuracy? The algorithm is based on novels with lots of words and a series of episodes, right? This allows the algorithm to spread itself over a variety of styles and events in the novel because most novels are not about just one thing. Would that matter at all, do you think?

Could the Gender Genie read an XML feed and predict a blogger's gender from their musings?

Ben, the scientists who developed the algorithm used it on texts of around 42,000 words each. In addition, they used both fiction and non-fiction, but I'm guessing their sample was of more formal writing than what most bloggers have been putting into it. Both of these facts have probably added to its pitiful accuracy percentage.

Chris, I suppose it could do that, but it would require writing up a parsing program in order to pull whole blog posts. BookBlog has a parser for its home page which needs a serious overhaul, so I'm not about to combine that script with anything. A lot of XML feeds also only contain an excerpt of a post, and probably wouldn't give the genie enough material to make a worthwhile analysis.

Reference to Ben August 27th.

I have fed several samples of the book (fiction) I am writing into the Genie, with consistently MALE results (I am female). The largest sample was over 5000 words, which I think is plenty enough to give an accurate reading.

I am somewhat puzzled by Marydell's quibbling between sex and gender, surely if they wanted to distinguish between these labels they should have used 'Masculine writing' rather than 'Male Author'.

PS. I have tried selecting samples of prose with many uses of the 'female' words but alas, to no avail - apparently I am one butch chick when I write.

Hey-ho.


 
