Data Hacks is a new library developed at popular URL-shortener bit.ly that consists of a set of command line tools to assist in quick and dirty data analysis. For instance, the library has tools to calculate 95th-percentile values, display a histogram, sample a given percentage of stdin, and pass stdin through to stdout for a set time period, all controlled from the command line. Such tools could be used to grab a sample from the log file of a running system. Combined with other command line tools, this could be useful for real-time evaluation of the effect of configuration changes.
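
To make the percentile and sampling ideas concrete, here is a minimal Python sketch. It is not the Data Hacks code itself: the function names `percentile` and `sample_lines` are hypothetical, and the nearest-rank percentile method is an assumption chosen for simplicity.

```python
import math
import random
from typing import Iterable, Iterator, Optional, Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Return the pct-th percentile of values (nearest-rank method, an assumption)."""
    if not values:
        raise ValueError("need at least one value")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    # Nearest rank: the smallest value with at least pct% of the data at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def sample_lines(
    lines: Iterable[str],
    percent: float,
    rng: Optional[random.Random] = None,
) -> Iterator[str]:
    """Yield each line independently with probability percent / 100."""
    rng = rng or random.Random()
    for line in lines:
        # random() is in [0, 1), so percent=100 keeps every line and percent=0 keeps none.
        if rng.random() * 100 < percent:
            yield line
```

Passing `sys.stdin` as `lines` would give the streaming behaviour described above; because each line is kept or dropped independently, the sketch never needs to hold the whole log in memory.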

By a bit of coincidence, they also ended up with horizontal bar graphs made in ASCII [jehiah.cz].

You can download it here.

Horizontal bar graphs are very old-school. In the earliest days of the web they were used there too: a row of █ characters (character 219 in the IBM PC extended ASCII set) does quite nicely when you don't have server-side image generators.


Mon 25 Oct 2010 at 4:29 AM
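
The row-of-blocks trick from the comment above can be sketched in a few lines of Python. The function name `bar_chart` and the output layout (label, bar, then the raw value) are assumptions for illustration, not something taken from Data Hacks.

```python
from typing import Dict, List


def bar_chart(counts: Dict[str, float], width: int = 30, block: str = "█") -> List[str]:
    """Render one text bar per entry, scaled so the largest value fills `width` blocks."""
    if not counts:
        return []
    top = max(counts.values())
    if top <= 0:
        raise ValueError("need at least one positive value")
    pad = max(len(label) for label in counts)
    lines = []
    for label, value in counts.items():
        # Linear scaling: bar length is proportional to the value.
        length = round(value * width / top)
        lines.append(f"{label:<{pad}} {block * length} {value}")
    return lines
```

Swapping `block` for `#` or `*` gives a pure 7-bit ASCII version for terminals that cannot show the block character.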

Sometimes, when you're comparing data, a linear scale doesn't cover a wide enough range of values. You can switch to logs, but then it's harder to see a difference between similar values. A solution I found to this was to use both (with horizontal ASCII bar charts), placing the log scale outside with square brackets - see http://www.acooke.org/cute/RXPYBenchm0.html

Sat 30 Oct 2010 at 12:39 AM
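
One possible reading of the linear-plus-log idea in the comment above, sketched in Python: the `#` run is drawn on a linear scale, and the closing square bracket marks where the same value falls on a log scale. The function name `dual_scale_bars` and the use of `log1p` (so that zero maps to zero) are assumptions; the linked page may lay its charts out differently.

```python
import math
from typing import Dict, List


def dual_scale_bars(values: Dict[str, float], width: int = 40) -> List[str]:
    """Draw '#' bars on a linear scale inside brackets placed on a log scale."""
    if not values:
        return []
    if min(values.values()) < 0:
        raise ValueError("values must be non-negative")
    top = max(values.values())
    if top <= 0:
        raise ValueError("need at least one positive value")
    pad = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        linear = round(value * width / top)
        # log1p is concave and zero at zero, so this position is never left of the linear bar.
        log = round(math.log1p(value) / math.log1p(top) * width)
        bar = "#" * linear + " " * (log - linear)
        lines.append(f"{label:<{pad}} [{bar}]")
    return lines
```

Small values that would vanish on the linear scale still get a visibly wide bracket, while similar large values remain distinguishable by their `#` runs.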
