Author Archives: zephoria

the biases of links

I have a hard time respecting anyone who believes that science or technology is neutral. Unfortunately, even when people consciously know that they are not, they give credence to the biased outputs without questioning the underlying assumptions. This is why i’m an academic – nothing gives me greater joy than to think about what biases go into the creation of a particular system.

After reminding folks at Blogher that there are gender differences in networking habits, i decided to do some investigation into the network structures of blogs. Kevin Marks of Technorati kindly gave me a random sample of 500 blogs to play with. I began coding them based on gender (which is surprisingly easy to do given the amount of personal information people put about themselves) and looking for patterns in links and blogrolls.

I decided to do the same for non-group blogs in the Technorati Top 100. I hadn’t looked at the Top 100 in a while and was floored to realize that most of those blogs are group blogs and/or professional blogs (with “editors” and clear financial backing). Most are covered in advertisements and other things meant to make them money. It’s very clear that their creators have worked hard to reach many eyes (for fame, power or money?).

Here are some of the patterns that i saw*:

Blogrolls:

  • All MSNSpaces users have a list of “Updated Spaces” that looks like a blogroll. It’s not. It’s a random list of 10 blogs on MSNSpaces that have been recently updated. As a result, without special code (like in Technorati), search engines see MSNSpaces bloggers as connecting to lots of other blogs. This creates an inaccurate impression of high network density among MSNSpaces blogs.
  • Few LiveJournals have a blogroll but almost all have a list of friends one click away. This is not considered by search tools that look only at the front page.
  • Bloggers who use hosting services tend to link to only others on the same hosting service (from the blogrolls on Xanga and Rakuten to the friend links on LJ). The blogroll structure on these is often set up to only accept lists of blogs from that service.
  • Blogrolls seem to be very common on politically-oriented blogs and always connect to blogs with similar political views (or to mainstream media).
  • Blogrolls by group blogging companies (like Weblogs, Inc.) always link to other blogs in the domain, using collective link power to help all.
  • A fraction of the Top 100 have blogrolls of blogs. Some have blogrolls that are a link away (like Crooked Timber). Quite a few use that space to advertise or link to mainstream media or companies.
  • Male bloggers who write about technology (particularly social software) seem to be the most likely to keep blogrolls. Their blogrolls tend to be dominantly male, even when few of the blogs they link to are about technology. I haven’t found one with >25% female bloggers (and most seem to be closer to 10%).
  • On LJ (even though it doesn’t count) and Xanga, there’s a gender division in blogrolls whereby female bloggers have mostly female “friends” and vice versa.
  • I was also fascinated that most of the mommy bloggers that i met at Blogher link to Dooce (in Top 100) but Dooce links to no one. This seems to be true of a lot of topical sites – there’s a consensus on who is in the “top” and everyone links to them but they link to no one.
  • I also get the impression that blogrolls are not frequently updated (although i have to imagine that the blogs one reads are). I wonder how static blogrolls are.
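To make the MSNSpaces point concrete, here is a minimal sketch of how platform-generated links can inflate measured network density. Everything in it is an assumption for illustration: the four blogs, the links between them, and the "recently updated" widget links are made-up toy data, not anything drawn from the Technorati sample.

```python
def density(nodes, edges):
    """Directed network density: observed links divided by possible links."""
    possible = len(nodes) * (len(nodes) - 1)
    return len(edges) / possible if possible else 0.0


# Hypothetical toy data: four blogs on one hosting service.
blogs = ["a", "b", "c", "d"]

# Links the bloggers chose themselves (made up for illustration).
chosen_links = {("a", "b"), ("b", "a")}

# Links injected by a hypothetical "recently updated" widget on each front page.
widget_links = {("a", "c"), ("a", "d"), ("b", "c"), ("c", "d"), ("d", "a")}

# What the bloggers actually expressed versus what a naive crawler would see.
chosen_density = density(blogs, chosen_links)
crawled_density = density(blogs, chosen_links | widget_links)

print(f"density from chosen links only: {chosen_density:.2f}")
print(f"density as a naive crawler sees it: {crawled_density:.2f}")
```

A crawler that cannot tell the widget from a real blogroll reports a much denser network than the one the bloggers actually built, which is the distortion described in the first bullet.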

Linking patterns:

  • The Top 100 tend to link to mainstream media, companies or websites (like Wikipedia, IMDB) more than to other blogs (Boing Boing is an exception).
  • Blogs on blogging services rarely link to blogs in the posts (even when they are talking about other friends who are in their blogroll or friends’ list). It looks like there’s a gender split in tool use; Mena said that LJ is like 75% female, while Typepad and Movable Type have far fewer women.
  • Bloggers often talk about other people without linking to their blog (as though the audience would know the blog based on the person). For example, a blogger might talk about Halley Suitt’s presence or comments at Blogher but never link to her. This is much rarer in the Top 100 who tend to link to people when they reference them.
  • Content type is correlated with link structure (personal blogs contain few links, politics blogs contain lots of links). There’s a gender split in content type.
  • When bloggers link to another blog, it is more likely to be same gender.

I began this investigation curious about gender differences. There are a few things that we know about social networks. First, our social networks are frequently split by gender (from childhood on). Second, men tend to have large numbers of weak ties and women tend to have fewer, but stronger, ties. This means that in traditional social networks, men tend to know far more people but not nearly as intimately as women know theirs. (This is a huge advantage for men in professional spheres but tends to wreak havoc when social support becomes more necessary, and it is often linked to depression later in life.)

While blog linking tends to be gender-dependent, the number of links seems to be primarily correlated with content type and service. Of course, since content type and service are correlated with gender, gender is likely a secondary effect.

Interestingly, there are distinct clusters of norms wrt linking in blogging, not a coherent and consistent one. The search engines (and the Technorati 100 and PubSub’s Daily 100 Top Links) are validating one of those clusters, regardless of whether or not that is what searchers are looking for. The Top 100 is a list of blogs who either fit into those norms or have adopted those norms in their patterns (most commonly the companies).

I also want to point out a few other issues in link biases that are relevant here:

  • All links are created equal. All relationships are not. Treating everything like a consistent weak tie is quantity over quality and in social networks, that means male over female.
  • When the data being measured has inconsistent structure rules, any ranking metric is inherently flawed. In blogs, there’s no consistency for what a link means, no consistent social norms for blogrolls, no agreed-upon linking norms. Metrics inherently squish out this nuance and force all of the square pegs into the round holes.
  • Links indicate no weight, no valence, no attributes. I know Technorati has asked folks to indicate positive/negative in their links or to use nofollow, but few do this. And even if people did, that kind of articulation is a social disaster (::cough:: think Friendster).
  • Traditionally, there is power in keeping your black book shut; one’s position in a network can be quite powerful. You get kudos by helping two unconnected people. You can limit information flow and acquire credit when you take something from one group to another. (This is the basis for some interesting work on creativity – creativity is when bridges connect information from disparate worlds.) While some think that transparency is good, some hide their network to maintain power. For example, if as a blogger, you provide “cool links,” you want others to read you, not the collection of people you read. Of course, a reasonable counter argument is that this person is no longer needed as a bridge, but as a curator. Still, some people hide so that they must be asked for recommendations directly and thus can control who they send people to. (Note: this is a particular kind of power move; transparency can also be a power move through gifting.)
  • There are social consequences to linking structures and those who have a lot of eyes on them are probably more aware of the consequences of their linking habits. This is another reason why people with a lot of eyes may get rid of blogrolls. Having to negotiate lots of requests for links can be a real turn-off.
  • People will try to manipulate any ranking if there is an advantage to being up top. Static measurement algorithms cause harm to the entire community that is being measured. Web search engines know this, but it’s equally critical for blog search.
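As a rough illustration of the first bullet above, the sketch below ranks the same toy link graph under two metrics. The graph is hypothetical, and treating reciprocated links as a stand-in for strong ties is an assumption made only for this example, not how Technorati or any other service actually computes rank.

```python
from collections import Counter

# Hypothetical (source, target) links: everyone links to "hub",
# "hub" links to no one, and the others link to each other.
links = [
    ("p", "hub"), ("q", "hub"), ("r", "hub"),
    ("p", "q"), ("q", "p"),
    ("q", "r"), ("r", "q"),
]


def inbound_counts(links):
    """Metric A: every inbound link counts the same."""
    return Counter(target for _, target in links)


def reciprocated_counts(links):
    """Metric B (assumed proxy for strong ties): count only links that are returned."""
    link_set = set(links)
    return Counter(t for s, t in link_set if (t, s) in link_set)


def ranking(counts):
    """Order blogs by score, highest first; ties broken alphabetically."""
    return sorted(counts, key=lambda name: (-counts[name], name))


by_inbound = ranking(inbound_counts(links))
by_reciprocity = ranking(reciprocated_counts(links))

print("ranked by inbound links:", by_inbound)
print("ranked by reciprocated links:", by_reciprocity)
```

Neither ordering is the neutral one. The first rewards the blog that everyone points to and that links to no one; the second drops it entirely. Each metric encodes a choice about which relationships matter.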

These services are definitely measuring something but what they’re measuring is what their algorithms are designed to do, not necessarily influence or prestige or anything else. They’re very effectively measuring the available link structure. The difficulty is that there is nothing consistent whatsoever with that link structure. There are disparate norms, varied uses of links and linking artifacts controlled by external sources (like the hosting company). There is power in defining the norms, but one should question whether companies or collectives should define them. By squishing everyone into the same rule set so that something can be measured, the people behind an algorithm are exerting authority and power, not of the collective, but of their biased view of what should be. This is inherently why there’s nothing neutral about an algorithm.

While i’ve been looking into the linking patterns, Mary Hodder has been thinking through new metrics for measurement. These are very important but not because one is better than the other. In fact, if we all switched to any of her metrics, we’d have just as many biases as we have now. And many of the Top blogs would try to figure out how to get rank in that system. The significance lies in the ability to offer choice.

Of course, choice is difficult. Lots of people want to know what the “best” one is and don’t want to think about the metrics behind it (yes, these are the “neutral” people). Unfortunately, many of those types have a lot of power that motivates people to want their attention. The press want a list of the best and many bloggers want the attention of the press and thus want to be listed among the best. Breaking this cycle is virtually impossible, but it is how power maintains power. And in our current system, we are doing a damn fine job of replicating the power structures that pervade everyday life under the auspices of creating a new system that usurps power. Ah, what fun.

Still, i think it’s critical to work on new metrics so that we can at least start showing alternate ways of organizing information if for no other reason than to push back against the conception of neutrality. And thus, i’m stoked to help Mary out and i would encourage everyone else interested in altering the power structure to do so as well.

At the least, i do think we need to really think about what is at stake and what we’re inadvertently supporting through our current systems. Are these the power structures that we want to maintain? Because there’s nothing neutral about our technological choices.

* Note: these are patterns, not findings. The methodology used here is not solid enough for findings. I am not offering quantitative data because i want it to be clear that these are trends based on tracking patterns. Think of them as guesstimated hypotheses (and i’d be ecstatic if someone would compute them).

Updated: Related Links

Note: i don’t agree with the points of all of the related posts but i do think they’re important to consider and i want to respond more broadly when i can. In the meantime, i figured that those interested in this post should know about them.

mapping sex in America

Ever have an amazing sexual experience? A night that has become a memory in your head that will always bring you smiles? Or interested in the sexual lives of others?

The Museum of Sex has decided to map the stories of people’s sexual lives. Just click on a state and add your story. And then wander around and read the adventures of others (definitely check out NYC). It’s fascinating to see what people attest to in different parts of the country. (And yes, it seems as though people have sex everywhere!)

(tx benchun)


finding fascinating flickr clusters

One of the best things about Flickr’s new clustering algorithm is that it brings out the treasure hunt desire. Surfing can go on for hours as you track down fascinating clusters amongst the bazillion photos. Ever since my night (where many hours disappeared), i forced myself to resist the temptation. But then, benchun went and pointed me to the twister cluster.

Has anyone else found fascinating ones?


interestingness

Flickr just released interestingness. This is a fascinating way to browse photos, checking out what people are into.

There are lots of things that make a photo ‘interesting’ (or not) in the Flickr. Where the clickthroughs are coming from; who comments on it and when; who marks it as a favorite; its tags and many more things which are constantly changing. Interestingness changes over time, as more and more fantastic photos and stories are added to Flickr.

So, engage away because your engagement affects the interestingness and there’s nothing like oohing and aahing over the pretty pictures this month.

(PS: they also released clustering so that you can check out tags based on related words. Check out all of the ones related to urban)


i came, i went…

So, Blogher was a complete and utter trip. It was great to see old friends and meet new ones. I have to admit that i was totally overwhelmed by the level of energy that so many people had – i totally crashed last night as a result. I spent the bulk of my day hearing the voices of different types of bloggers – the hiphop bloggers, the teen bloggers, the academic bloggers, the mommy bloggers. I had _no_ idea how many mommy bloggers were out there or the struggles with voice that they experience. That alone made the entire conference worth it for me.

Things got a little strange for me at the end. I was supposed to introduce the keynote speaker, Caterina Fake (Flickr). Caterina was supposed to speak about Yahoo! and what they’re doing in the social media space. (Yahoo! was a featured sponsor of the conference and the keynote position was given to them; they nominated Caterina to speak.) Due to an unexpected family crisis, she had to cancel at the last minute. So, instead of introducing her, i ended up doing a brief ad-hoc explanation of why i am consulting for the Yahoo! Research Labs-Berkeley, briefly explaining Jeff Weiner’s FUSE model – Find, Use, Share, Expand (see Supernova notes and Weiner interviews). I do genuinely believe that Jeff gets it and i love his model so i was happy to represent his mission, but it felt a little strange to be speaking as an insider when i just got my contractor badge 3 days ago. So trippy.

The point is… i was there.. it was fun… and i’m really really bad at writing up notes about what actually happened. Mostly, i had really good conversations and it was really invigorating to hear different perspectives and have conversations that i haven’t had over and over again.

(And now, i’m in a strange hotel in the middle of Michigan where i’m going to have to miraculously remember matrix algebra before tomorrow so as to not embarrass myself in front of a professor that i admire. ::gulp::)


@ Blogher

I’m at Blogher, which is a trip. Of course, the first thing you notice is how people greet each other – hugs, kisses, screams, joy. There are no feathers flailing, no chests puffing. I smiled – i’m so used to the boys’ world. I decided to sit back and watch the boys who are attending. The ones who usually have the most colorful feathers are sitting back, shoulders hunched, listening, trying. I remember what it was like when i first went to etech – i didn’t know how to talk to anyone. I knew no one and i felt like such an outsider. I was afraid of looking stupid. I wonder if they feel that way here.

I have to admit that the beginning conversation really got to me. There’s definitely a lot of frustration and anger here, frustration over the purported authority of the men in blogging, anger over the validation that the mass media gives them. So there was a lot of airing that negativity. That’s hard to hear.

Some of it, unfortunately, was lacking facts. One issue came up over the claim that women don’t network. Well, that’s bullshit. Actually, women are traditionally the maintainers of domestic social networks. They tend to network more than men. The gender difference concerns the style of networking. Men are more likely to gather many weak ties; women tend to work hard to maintain strong ties. Each has its value. But when it comes to technology like Technorati, there is a validation of weak ties over strong ties. Or more accurately, there’s an assumption that all ties are created equal, which inadvertently validates the weak ties over the strong ties.

My argument here is that we need to pay attention to the network structures. If folks are angry about their position in some purported hierarchy, they need to understand how the hierarchy works. And then change it. I’m not interested in having separate networks; i’m interested in making certain that people understand the gender bias they build into the network, and that the network represents a diversity of perspectives and is flexible enough to deal with a diversity of social structures.

Anyhow, it’s a fascinating place to be. I’m not going to be good about blogging this conference so definitely watch the links on Blogher.


what everyone should know about blog depression

Over at The Nonist, there’s a public service announcement concerning blog depression. To address this, jmorrison created an educational PDF to help you deal with depression.

To give you a sense, the first page asks what blog depression is. Some symptoms include:

  • Loss of pleasure in the internet
  • Feelings of sadness, disappointment, anger, self loathing, hopelessness, dementia
  • Passive aggressive moaning and a steady lengthening of the interval between posts

Definitely take a look at it – i’m super curious what others think of this.