Google trigram frequency visualization

9 May 2008

google_trigrams.jpg
a set of relational visualizations that show the relative frequencies of trigrams as they appear on the web, based on a massive (100GB) n-gram dataset from Google's corpus archive. an n-gram is a sequence of n consecutive words taken from a sentence. a trigram (n=3), for example, might be "I like food" or "frog is tasty".
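to make the definition concrete, here is a minimal Python sketch that pulls the trigrams out of a sentence. the function name `ngrams` and the whitespace-only tokenization are assumptions for illustration, not a description of how Google built its dataset.

```python
def ngrams(text, n=3):
    """Return every run of n consecutive words in text, as tuples.

    Assumption: words are separated by whitespace only; a real
    tokenizer would also handle punctuation and casing.
    """
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


# A seven-word sentence yields five overlapping trigrams.
trigrams = ngrams("I like food and frog is tasty")
for trigram in trigrams:
    print(" ".join(trigram))
```

each trigram overlaps the next by two words, which is why a corpus-wide count of them grows so large.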

the first visualization compares the 120 trigrams for the term 'He' with those for 'She', while the second is based on 75 trigrams for 'I' and 'You'. the second words of the trigrams are sorted in decreasing order of frequency. words are sized according to the square root of their use frequencies. the color-coded lines act like paths (similar to a tree structure), enumerating all of the occurring trigrams.
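the sorting and sizing rules can be sketched in a few lines of Python. this is only an illustration under stated assumptions: the counts are invented, trigrams are grouped by their first word, a second word's frequency is taken to be the sum over the trigrams that contain it, and the names `second_word_totals`, `font_size` and the 48-point maximum are hypothetical rather than taken from the original project.

```python
import math
from collections import defaultdict

# Made-up (trigram -> count) data, for illustration only.
counts = {
    ("He", "said", "that"): 900,
    ("He", "said", "he"): 700,
    ("He", "was", "a"): 400,
    ("She", "said", "that"): 400,
    ("She", "was", "a"): 500,
}


def second_word_totals(counts, first):
    """Sum counts per second word for trigrams starting with `first`,
    then sort the words by decreasing total frequency."""
    totals = defaultdict(int)
    for (w1, w2, _w3), count in counts.items():
        if w1 == first:
            totals[w2] += count
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def font_size(freq, max_freq, max_size=48.0):
    """Scale a word's size with the square root of its frequency."""
    return max_size * math.sqrt(freq / max_freq)


he = second_word_totals(counts, "He")
sizes = {word: font_size(freq, he[0][1]) for word, freq in he}
print(he)
print(sizes)
```

square-root scaling compresses the range: in this toy data a word used a quarter as often is drawn at half the size, so rare words stay legible next to very common ones.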

[link: chrisharrison.net]


