<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>bitplane.net &#187; bash</title>
	<atom:link href="http://bitplane.net/tag/bash/feed/" rel="self" type="application/rss+xml" />
	<link>http://bitplane.net</link>
	<description>Rants, ramblings, free software</description>
	<lastBuildDate>Mon, 16 Jan 2012 16:58:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Chart Against The X Factor</title>
		<link>http://bitplane.net/2009/12/bash-for-xmas-number-one/</link>
		<comments>http://bitplane.net/2009/12/bash-for-xmas-number-one/#comments</comments>
		<pubDate>Sun, 06 Dec 2009 22:30:33 +0000</pubDate>
		<dc:creator>Gaz Davidson</dc:creator>
				<category><![CDATA[Music]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[bash]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[google charts]]></category>

		<guid isPermaLink="false">http://bitplane.net/?p=342</guid>
		<description><![CDATA[For the last four years running Simon Cowell&#8217;s plastic karaoke acts have held the Christmas #1 spot in the UK singles charts thanks to ITV&#8217;s hit show The X Factor. People have been complaining that this has ruined the great British tradition of betting on which artist will take the number one slot, as it&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>For the last four years running Simon Cowell&#8217;s plastic karaoke acts have held the Christmas #1 spot in the UK singles charts thanks to ITV&#8217;s hit show <a href="http://xfactor.itv.com/">The X Factor</a>. People have been complaining that this has ruined the great British tradition of betting on which artist will take the number one slot, as it&#8217;s traditionally the only time of year when the chart is dominated by wacky Christmas songs rather than the latest boy bands and whoever else thirteen-year-old girls spend their pocket money on.</p>
<p>I&#8217;m not too bothered about popular music, the singles chart or who gets the Xmas #1 slot, but last week I was invited to join a growing <a href="http://www.facebook.com/group.php?gid=2228594104">group on Facebook</a> who are campaigning to knock the X Factor winner from the top spot by mass purchasing Rage Against The Machine&#8217;s classic track <a href="http://www.youtube.com/watch?v=fkuOAY-S6OY">Killing in the Name</a>. The sound of rebellion to conquer the airwaves, political rap metal on future Christmas compilation albums, all for the princely sum of 79p? I don&#8217;t usually buy digital downloads but this time you can count me in!</p>
<p>According to <a href="http://news.sky.com/skynews/Home/Showbiz-News/X-Factor-Could-Be-Beaten-To-Christmas-Number-One-By-Anti-Simon-Cowell-Facebook-Campaign/Article/200912115491121">Sky News</a> the group had 43,000 members sometime on Friday, but by the time I got home on Saturday night there were 180,000 members and rising. As the media coverage increases so does the membership, which made me curious: how does a phenomenon like this evolve, and how will it turn out next Sunday? What happens when the UK Charts people decide that it&#8217;s against the rules and disqualify the single?</p>
<p>So I decided to log and graph the group&#8217;s membership. Every fifteen minutes I grab the page using wget, extract the number of users and append it to a text file along with the current date and time. Then I cut through it using a couple of awk and sed one-liners, dump the results into an HTML file, graph it using Google Charts and upload the output to my file dump.</p>
<p><strong>Update: These graphs are no longer live! Click for the <a href="http://dump.bitplane.net/ratm/index.html">live versions</a>, which are updated much more often using a different script.</strong></p>
<a href="http://dump.bitplane.net/ratm/index.html"><img class="size-full wp-image-345" title="Members" src="http://dump.bitplane.net/ratm/members.png" alt="Click for the source data" width="600" height="300" /></a>
<a href="http://dump.bitplane.net/ratm/index.html"><img class="size-full wp-image-345" title="Members/hr" src="http://dump.bitplane.net/ratm/perhr.png" alt="Members per hour" width="600" height="300" /></a>
<p>Here&#8217;s the scraping script:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/bin/bash</span>
&nbsp;
<span style="color: #7a0874; font-weight: bold;">cd</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>gaz<span style="color: #000000; font-weight: bold;">/</span>ratm<span style="color: #000000; font-weight: bold;">/</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># get the timestamp</span>
<span style="color: #007800;">timestamp</span>=<span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">date</span> <span style="color: #ff0000;">&quot;+%Y/%m/%d %H:%M:%S&quot;</span><span style="color: #000000; font-weight: bold;">`</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># get the file</span>
<span style="color: #c20cb9; font-weight: bold;">wget</span> <span style="color: #660033;">--max-redirect</span> <span style="color: #000000;">2</span> <span style="color: #660033;">-O</span> temp.html http:<span style="color: #000000; font-weight: bold;">//</span>www.facebook.com<span style="color: #000000; font-weight: bold;">/</span>group.php?<span style="color: #007800;">gid</span>=<span style="color: #000000;">2228594104</span> <span style="color: #660033;">--user-agent</span>=<span style="color: #ff0000;">&quot;Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.1.5) Gecko/20091109 Ubuntu/9.10 (karmic) Firefox/3.5.5&quot;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># extract user count from the file</span>
<span style="color: #007800;">usercount</span>=<span style="color: #000000; font-weight: bold;">`</span><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">&quot;s/.* of \(.*\) members.*/\1/p&quot;</span> temp.html<span style="color: #000000; font-weight: bold;">`</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># remove any commas from the string</span>
<span style="color: #007800;">usercount</span>=<span style="color: #800000;">${usercount//[,]/}</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># the count must not be empty, or the numeric comparisons later on will break when Facebook is having problems!</span>
<span style="color: #666666; font-style: italic;"># in this case, we just record -1 (not good practice from a stats PoV, but it keeps it simple)</span>
<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">${#usercount}</span>&quot;</span> <span style="color: #660033;">-eq</span> <span style="color: #ff0000;">&quot;0&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>
<span style="color: #000000; font-weight: bold;">then</span>
    <span style="color: #007800;">usercount</span>=<span style="color: #ff0000;">&quot;-1&quot;</span>
<span style="color: #000000; font-weight: bold;">fi</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># remove the temporary file</span>
<span style="color: #c20cb9; font-weight: bold;">rm</span> temp.html
&nbsp;
<span style="color: #666666; font-style: italic;"># write the output in CSV format</span>
<span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$timestamp</span>,<span style="color: #007800;">$usercount</span>&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;&gt;</span> data.dat
&nbsp;
<span style="color: #666666; font-style: italic;"># next I run the graph generating script</span></pre></div></div>
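<p>The extraction step can be tried on its own. Here is a minimal sketch that wraps the same sed expression, comma stripping and -1 fallback in a function. The function name <code>extract_count</code> and the sample lines are made up for illustration, and it assumes the page contains text of the form &#8220;N of X members&#8221;, which is what the sed pattern looks for.</p>

```shell
#!/bin/bash

# extract_count LINE: print the member count found in LINE, or -1 if none.
# (Hypothetical helper; it mirrors the steps of the scraping script.)
extract_count() {
    local count
    # capture whatever sits between " of " and " members"
    count=$(printf '%s\n' "$1" | sed -n -e 's/.* of \(.*\) members.*/\1/p')
    # strip the thousands separators
    count=${count//,/}
    # fall back to -1 when nothing matched
    if [ -z "$count" ]; then
        count="-1"
    fi
    echo "$count"
}

# sample lines are invented; the real Facebook markup is not shown here
extract_count '<span>Displaying 8 of 180,123 members</span>'   # prints 180123
extract_count '<title>Service unavailable</title>'             # prints -1
```

<p>Keeping the fallback inside the function means every row written to data.dat has a numeric second column, even when the fetch fails.</p>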

<p>And this one (no longer in use) creates the two charts above from the data:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/bin/bash</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># gets a column from a line of a CSV file. The first index is 1, not 0.</span>
getElement<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #7a0874; font-weight: bold;">&#123;</span>
    <span style="color: #007800;">RESULT</span>=<span style="color: #000000;">0</span>
    <span style="color: #7a0874; font-weight: bold;">local</span> <span style="color: #007800;">p</span>=<span style="color: #000000; font-weight: bold;">`</span><span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;$1&quot;</span>p<span style="color: #000000; font-weight: bold;">`</span>
    <span style="color: #007800;">RESULT</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #007800;">$2</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/,/\n/g'</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #007800;">$p</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #7a0874; font-weight: bold;">&#125;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># get the start and end times</span>
&nbsp;
getElement <span style="color: #000000;">1</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$(tail -1 data.dat)</span>&quot;</span>
<span style="color: #007800;">end</span>=<span style="color: #007800;">$RESULT</span>
getElement <span style="color: #000000;">1</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$(sed -n '1p' data.dat)</span>&quot;</span>
<span style="color: #007800;">start</span>=<span style="color: #007800;">$RESULT</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># get the current minimum and maximum values</span>
<span style="color: #007800;">min</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">cat</span> minval<span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">max</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">cat</span> maxval<span style="color: #7a0874; font-weight: bold;">&#41;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># get the last value</span>
getElement <span style="color: #000000;">2</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$(tail -1 data.dat)</span>&quot;</span>
<span style="color: #007800;">lastval</span>=<span style="color: #007800;">$RESULT</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># set new max value</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$lastval</span>&quot;</span> <span style="color: #660033;">-gt</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$max</span>&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>
<span style="color: #000000; font-weight: bold;">then</span>
    <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$lastval</span>&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;</span> maxval
    <span style="color: #007800;">max</span>=<span style="color: #007800;">$lastval</span>
    <span style="color: #7a0874; font-weight: bold;">echo</span> New maximum, <span style="color: #007800;">$lastval</span>
<span style="color: #000000; font-weight: bold;">fi</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># and the new min value</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$lastval</span>&quot;</span> <span style="color: #660033;">-gt</span> <span style="color: #000000;">0</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>
<span style="color: #000000; font-weight: bold;">then</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$lastval</span>&quot;</span> <span style="color: #660033;">-lt</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$min</span>&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>
    <span style="color: #000000; font-weight: bold;">then</span>
        <span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$lastval</span>&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;</span> minval
        <span style="color: #007800;">min</span>=<span style="color: #007800;">$lastval</span>
    <span style="color: #000000; font-weight: bold;">fi</span>
<span style="color: #000000; font-weight: bold;">fi</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># get values for the Y axis</span>
<span style="color: #007800;">quart</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #007800;">$max</span> - <span style="color: #007800;">$min</span><span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #000000; font-weight: bold;">/</span> <span style="color: #000000;">4</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">q1</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #007800;">$min</span> + <span style="color: #007800;">$quart</span> <span style="color: #000000; font-weight: bold;">*</span> <span style="color: #000000;">1</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">q2</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #007800;">$min</span> + <span style="color: #007800;">$quart</span> <span style="color: #000000; font-weight: bold;">*</span> <span style="color: #000000;">2</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #007800;">q3</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #007800;">$min</span> + <span style="color: #007800;">$quart</span> <span style="color: #000000; font-weight: bold;">*</span> <span style="color: #000000;">3</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># extract the data using awk and sed:</span>
<span style="color: #666666; font-style: italic;"># 1. get every 4th line of the file, meaning hourly</span>
<span style="color: #666666; font-style: italic;"># 2. keep only the member count from each line</span>
<span style="color: #666666; font-style: italic;"># 3. join the counts with commas and remove the trailing comma</span>
&nbsp;
<span style="color: #007800;">data</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'NR%4==0'</span> data.dat <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">&quot;s/.*,\([0-9]*\)/\1/p&quot;</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">tr</span> <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span> <span style="color: #ff0000;">&quot;,&quot;</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">&quot;s/\(.*\),/\1/&quot;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># build the URL to the total members chart</span>
<span style="color: #007800;">total_members</span>=<span style="color: #ff0000;">&quot;http://chart.apis.google.com/chart?chtt=Total+Members&amp;chs=600x300&amp;cht=ls&amp;chxt=x,y&amp;chxl=0:|<span style="color: #007800;">$start</span>|<span style="color: #007800;">$end</span>|1:|<span style="color: #007800;">$min</span>|<span style="color: #007800;">$q1</span>|<span style="color: #007800;">$q2</span>|<span style="color: #007800;">$q3</span>|<span style="color: #007800;">$max</span>&amp;chds=<span style="color: #007800;">$min</span>,<span style="color: #007800;">$max</span>&amp;chd=t:<span style="color: #007800;">$data</span>&quot;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># now let's do members per hour</span>
&nbsp;
<span style="color: #007800;">lastval</span>=<span style="color: #007800;">$min</span>
<span style="color: #007800;">min</span>=<span style="color: #000000;">0</span>
<span style="color: #007800;">max</span>=<span style="color: #000000;">0</span>
<span style="color: #007800;">data</span>=<span style="color: #ff0000;">&quot;&quot;</span>
<span style="color: #007800;">inputList</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'NR%4==0'</span> data.dat <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">&quot;s/.*,\([0-9]*\)/\1/p&quot;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
<span style="color: #000000; font-weight: bold;">while</span> <span style="color: #c20cb9; font-weight: bold;">read</span> line; <span style="color: #000000; font-weight: bold;">do</span>
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$line</span>&quot;</span> <span style="color: #660033;">-gt</span> <span style="color: #ff0000;">&quot;0&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>
    <span style="color: #000000; font-weight: bold;">then</span> 
        <span style="color: #007800;">val</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #007800;">$line</span> - <span style="color: #007800;">$lastval</span><span style="color: #7a0874; font-weight: bold;">&#41;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
        <span style="color: #007800;">lastval</span>=<span style="color: #007800;">$line</span>
    <span style="color: #000000; font-weight: bold;">else</span>
        <span style="color: #007800;">val</span>=<span style="color: #000000;">0</span>
    <span style="color: #000000; font-weight: bold;">fi</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$val</span>&quot;</span> <span style="color: #660033;">-gt</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$max</span>&quot;</span> <span style="color: #7a0874; font-weight: bold;">&#93;</span>
    <span style="color: #000000; font-weight: bold;">then</span>
        <span style="color: #007800;">max</span>=<span style="color: #007800;">$val</span>
    <span style="color: #000000; font-weight: bold;">fi</span>
&nbsp;
    <span style="color: #007800;">data</span>=<span style="color: #ff0000;">&quot;<span style="color: #007800;">$data</span>,<span style="color: #007800;">$val</span>&quot;</span>
<span style="color: #000000; font-weight: bold;">done</span> <span style="color: #000000; font-weight: bold;">&lt;&lt;&lt;</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$inputList</span>&quot;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># remove comma prefix</span>
<span style="color: #007800;">data</span>=$<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #ff0000;">&quot;<span style="color: #007800;">$data</span>&quot;</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">&quot;s/,\(.*\)/\1/g&quot;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># build the per hour chart</span>
<span style="color: #007800;">members_per_hr</span>=<span style="color: #ff0000;">&quot;http://chart.apis.google.com/chart?chtt=Members+per+hr&amp;chs=600x300&amp;cht=ls&amp;chxt=x,y&amp;chxl=0:|<span style="color: #007800;">$start</span>|<span style="color: #007800;">$end</span>|1:|<span style="color: #007800;">$min</span>|<span style="color: #007800;">$max</span>&amp;chds=<span style="color: #007800;">$min</span>,<span style="color: #007800;">$max</span>&amp;chd=t:<span style="color: #007800;">$data</span>&quot;</span>
&nbsp;
<span style="color: #666666; font-style: italic;"># I then create an HTML file from some templates and upload everything to my dump</span></pre></div></div>
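<p>The members-per-hour loop can also be written as a single awk pass. This is only a sketch of an alternative, not the script that produced the charts: the function name <code>hourly_deltas</code> and the sample rows are invented, and unlike the loop above it reports 0 for the first hourly sample instead of measuring it against the stored minimum.</p>

```shell
#!/bin/bash

# hourly_deltas FILE: print the comma-separated growth between every 4th sample.
# Rows recorded as -1 (failed fetches) contribute a 0 and do not move the baseline.
hourly_deltas() {
    awk -F, '
        NR % 4 != 0 { next }                          # one sample per hour
        $2 > 0  { delta = have_last ? $2 - last : 0   # growth since last good sample
                  last = $2; have_last = 1 }
        $2 <= 0 { delta = 0 }                         # failed fetch
        { out = out sep delta; sep = "," }            # join with commas
        END { print out }
    ' "$1"
}

# sample data in the same "timestamp,count" layout as data.dat (values invented)
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
2009/12/06 22:00:00,100
2009/12/06 22:15:00,110
2009/12/06 22:30:00,120
2009/12/06 22:45:00,130
2009/12/06 23:00:00,140
2009/12/06 23:15:00,150
2009/12/06 23:30:00,160
2009/12/06 23:45:00,190
2009/12/07 00:00:00,200
2009/12/07 00:15:00,210
2009/12/07 00:30:00,220
2009/12/07 00:45:00,-1
EOF

hourly_deltas "$tmp"   # prints 0,60,0
rm -f "$tmp"
```

<p>Building the list with a separator variable avoids having to trim a leading or trailing comma afterwards.</p>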

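<p>The <code>getElement</code> helper splits a CSV line by turning commas into newlines with sed, which depends on sed treating <code>\n</code> in the replacement as a newline. A sketch of the same lookup using <code>cut</code> is below. The name <code>get_field</code> and the sample row are hypothetical, and it prints the field rather than setting a global <code>RESULT</code> variable.</p>

```shell
#!/bin/bash

# get_field N LINE: print the Nth comma-separated field of LINE (first index is 1).
get_field() {
    printf '%s\n' "$2" | cut -d, -f"$1"
}

line='2009/12/06 22:30:00,180123'   # invented row in the data.dat layout

get_field 1 "$line"   # prints 2009/12/06 22:30:00
get_field 2 "$line"   # prints 180123
```

<p>A caller would capture the value with command substitution, for example <code>end=$(get_field 1 "$(tail -1 data.dat)")</code>.</p>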
]]></content:encoded>
			<wfw:commentRss>http://bitplane.net/2009/12/bash-for-xmas-number-one/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
	</channel>
</rss>

