(Note: This is the first part of a series of Digg-related articles, which will be published irregularly.)
Digg is Big. Very Big. 61k stories had been on the frontpage when I last checked, which was quite a while ago. Along with the fact that a lot of users contribute to it, it sure does yield some pretty interesting visualizations.
So, I grabbed the headlines from Digg into two different datasets (one from April 2006-March 2007 and the other from April 2007-March 2008), shipped them off to Jeff Clark, and waited till he was kind enough to generate this Document Contrast Diagram and send it to me.
(Click for larger image)
Words that occur more often have bigger circles. Words in bluish circles appear more in the April 2006-March 2007 period, while the ones in red circles appear more in the April 2007-March 2008 period. The purplish ones are common to both. This chart should be fairly self-explanatory. More information on interpreting the chart is available here.
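For the curious, here is a rough sketch of how that kind of size-and-color contrast could be computed from two lists of headlines. This is not Jeff's actual method, just an illustration under my own assumptions: the function names, the tokenizing regex, and the `margin` threshold are all made up for the example.

```python
import re
from collections import Counter


def count_words(headlines):
    """Count lowercase word occurrences across a list of headline strings."""
    counts = Counter()
    for headline in headlines:
        counts.update(re.findall(r"[a-z0-9']+", headline.lower()))
    return counts


def contrast(first_headlines, second_headlines):
    """Return {word: (total_count, share_in_first)}.

    total_count would drive the circle size. share_in_first is the word's
    relative frequency in the first dataset divided by the sum of its
    relative frequencies in both, so 1.0 means "only in the first dataset"
    and 0.0 means "only in the second".
    """
    first = count_words(first_headlines)
    second = count_words(second_headlines)
    n_first = sum(first.values()) or 1    # avoid dividing by zero
    n_second = sum(second.values()) or 1
    result = {}
    for word in set(first) | set(second):
        f1 = first[word] / n_first
        f2 = second[word] / n_second
        result[word] = (first[word] + second[word], f1 / (f1 + f2))
    return result


def color(share_in_first, margin=0.2):
    """Map a share to a rough color bucket (the margin is an arbitrary choice)."""
    if share_in_first > 0.5 + margin:
        return "blue"    # mostly in the first dataset
    if share_in_first < 0.5 - margin:
        return "red"     # mostly in the second dataset
    return "purple"      # roughly common to both
```

Dividing by each dataset's total word count keeps a bigger dataset from looking dominant just because it has more headlines.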
From now on, I will refer to the April 06-March 07 dataset as the first dataset and the April 07-March 08 dataset as the second one.
Consoles seem to be popular in the first dataset, while the second dataset is a tad more diverse. The smaller number of big circles towards the right suggests that discussion on Digg has begun to diversify, which might be related to the reported death of Tech on Digg. But still, of the big circles I can spot from 20 metres away, most of them (except some misc. curiosities, like “kill”) seem to be Tech related.
Also, the big size of “Top” shows that Lists are “very” popular on Digg. The presence of the word “says” there means that interviews are popular too.
I could go on with these, but since this is so simple, I’ll leave it to the commenters to find the cool ones.
Just get the bigger version of the picture & amuse yourself for some time.
Also, I’m making the list of Titles I used available for public download, under a Creative Commons Attribution License. Note that this applies only to the raw data as of now. Download it here: Apr06-Mar07, Apr07-Mar08 (links to direct text files, so right click and save). And, don’t forget the attribution!
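If you want to poke at the raw titles yourself, something like the following would get you started. It is only a sketch: I am assuming here that each text file holds one headline per line, and the file name in the usage note is a placeholder for wherever you saved the download.

```python
from collections import Counter
from pathlib import Path


def load_titles(path):
    """Read headlines from a text file, assuming one title per line."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    # Drop blank lines and surrounding whitespace.
    return [line.strip() for line in text.splitlines() if line.strip()]


def top_words(titles, n=10):
    """Return the n most common whitespace-separated words, lowercased."""
    counts = Counter(
        word for title in titles for word in title.lower().split()
    )
    return counts.most_common(n)
```

For example, `top_words(load_titles("first_dataset.txt"), 20)` would list the 20 most frequent words in that file (again, `first_dataset.txt` is a made-up name).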
(Side Note: I’ve been a fan of Jeff Clark for a long time, ever since I discovered his visualizations. Just check out his portfolio - amazing visualizations there.
Most are somewhat politically oriented though)
Update: Jeff’s observations:
- Words used much more frequently in 2006-2007 were: nintendo, wii, itunes, windows, vista, sony, flash, amd, intel, zune, ps3
- Words used much more frequently in 2007-2008 were: gonzales, cheney, pics, paul, impeachment, iphone, obama, clinton
- Common words used about the same in both periods: linux, video, ubuntu, make, pictures, take, year, best, music
- A large overlap in vocabulary used between the two periods.
- clinton & obama mentions strongly appear towards the end of the 2007-2008 period.
What are your observations?
8 responses so far ↓
1 Louis Gray // May 3, 2008 at 9:43 am
If it wasn’t for the iPhone, it’d be almost entirely split between “Digg used to show Tech” and “Digg is now mainstream”. It’s also interesting to see the Wii is no longer the conversation starter it once was. Funny how that’s the case, even when it’s still hard to find and still a major tech lust toy. Nice chart.
2 Yuvi // May 3, 2008 at 9:59 am
@Louis: Thanks!
I’m doing another set of charts comparing Digg Headlines vs Slashdot vs Reddit. That should prove more interesting as well.
3 engtech // May 3, 2008 at 11:03 am
Freaking amazing visualization technique.
Great idea. Looking forward to more stuff from your new blog, Yuvi.
4 Yuvi // May 3, 2008 at 11:20 am
@engtech: Thanks! Yep, more stuff coming soon….
5 A Year in Digg Headlines - Charting the Trends from 2006-2008 | RyanSpoon.com // May 5, 2008 at 8:21 pm
[...] is a charting of Digg’s popular headlines compared year-over-year. Digg is considered a haven for techies - but if you scan left-to-right, [...]
6 Weekend Reader - programming, lifehacks, code, blogging, funny « // Internet Duct Tape // May 10, 2008 at 8:58 am
[...] [DIGG] The StatBot pits Digg vs Digg, thestatbot.com [...]
7 jagi // May 28, 2008 at 7:53 am
hey cool work yuvi ~
8 mininglabs // May 28, 2008 at 11:32 am
Digg taking over Slashdot … says AideRSS…
Very much impressed by the recent article on 3d rails on when to publish a post to be noticed we decided to give the AideRSS api a try. This rather new service (the api) lets you dive into the huge amount of posts of any major feed available. They also…