<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>dekstop weblog : Music Feeds -- Pop Culture Snippets, Opinionated Commentary, and Lots and Lots of Noise</title>
    <link>http://dekstop.de/weblog/2009/07/music_feeds/</link>
    <description>Last weekend I was at the music hack day in London, organised by Dave Haynes and James Darling: a two-day event where software developers met up and wrote music-related software (or built hardware.) Instruments, a distributed content resolver, various SoundCloud tools, etc. Although the event attracted lots of interesting people ...</description>
    <dc:language>en-us</dc:language>
    <dc:rights>Copyright 2009 Martin Dittus</dc:rights>
    <lastBuildDate>Sat, 18 Jul 2009 16:15:03 GMT</lastBuildDate>
    <generator>MicroLinks 5.6 (dekstop.de)</generator>
    <managingEditor>public&#64;dekstop&#46;de</managingEditor>
    <webMaster>public&#64;dekstop&#46;de</webMaster>



    <item>
      <title>Music Feeds -- Pop Culture Snippets, Opinionated Commentary, and Lots and Lots of Noise</title>
      <link>http://dekstop.de/weblog/2009/07/music_feeds/</link> 
      <description><![CDATA[<p>Last weekend I was at the <a href="http://musichackday.org/">music hack day</a> in London, organised by <a href="http://www.bytesizemusic.net/">Dave Haynes</a> and <a href="http://coupde.com/">James Darling</a>: a two-day event where software developers met up and wrote music-related software (or built hardware.) <a href="http://davenoise.com/blog/">Instruments</a>, a <a href="http://www.playdar.org/">distributed content resolver</a>, various SoundCloud tools, <a href="http://musichackday.org/info/Hacks">etc</a>.</p>

<p>Although the event attracted lots of interesting people from all over the planet (well, Europe) I ended up coding most of the weekend instead of talking. (On that note, I&#x27;m still amazed by the amount of time coding requires, even after you have learned how to channel your ambitions more efficiently. Software development is still a painful process.)</p>

<p>I built a small single-page site: <a href="http://musicfeeds.screamorap.org/">Music Feeds</a>, a river-of-news aggregator of music-related RSS feeds, where you can filter the incoming posts via Last.fm user attention profiles. For example: my <a href="http://musicfeeds.screamorap.org/?lfm:user=martind">own profile</a> at the moment uncovers a lot of dubstep-related posts, because that is what I&#x27;ve been listening to. A surprising number of the Last.fm profiles I tested with evoked Michael Jackson-related posts. Etc.</p>

<table class="imagetable" border="0" width="500">
<tr>
    <td align="left">
        <a href="http://musicfeeds.screamorap.org/"><img src="http://farm3.static.flickr.com/2638/3731351853_d295570e84_o.png" width="500" height="524" alt="musicfeeds" /></a></td>
</tr>
<tr>
    <td align="left"><a href="http://musicfeeds.screamorap.org/">Music Feeds</a>: a simple blog filter, modulated by Last.fm attention data.</td>
</tr>
</table>

<p>Music Feeds provides you with multiple filters, and you can mix them freely: a Last.fm attention profile filter that uncovers posts referencing the names of the user&#x27;s most listened-to artists; the ability to browse by category/topic, as provided by the blog post&#x27;s author; and a filter by keyword search (which the other two are based on).</p>

<p>Some examples:</p>
<ul>
  <li><a href="http://musicfeeds.screamorap.org/?lfm:user=martind">Posts for Last.fm user &#x27;martind&#x27;</a></li>
  <li><a href="http://musicfeeds.screamorap.org/?lfm:user=martind&amp;q=dubstep">Additionally filtered with a keyword match for &#x27;dubstep&#x27;</a></li>
  <li><a href="http://musicfeeds.screamorap.org/?q=%22uk+funky%22+AND+enclosure_mimetype:audio/mpeg">UK Funky podcasts</a>, <a href="http://musicfeeds.screamorap.org/?q=skweee+AND+enclosure_mimetype:audio/mpeg">Skweee podcasts</a> (check the enclosure section at the end of each post, or subscribe to the feed)</li>
  <li><a href="http://musicfeeds.screamorap.org/?q=category:review">Just reviews</a> (or more specifically, only posts that have a category of &quot;review&quot;)</li>
</ul>
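
<p>A minimal sketch in Python of how such filters might be composed into a single query string and a result URL. Only the <code>q</code> parameter and the <code>enclosure_mimetype</code> field are taken from the example links above; the helper names, and the idea of OR-ing artist names together before AND-ing the other filters, are assumptions for illustration and not the code that runs on the site.</p>

```python
from urllib.parse import urlencode

# Base URL taken from the example links; everything else is hypothetical.
BASE_URL = "http://musicfeeds.screamorap.org/"


def quote_term(term):
    """Wrap a term in double quotes so multi-word names match as a phrase."""
    escaped = term.replace('"', '\\"')
    return '"' + escaped + '"'


def build_query(artists, keyword=None, podcasts_only=False):
    """Combine artist names with OR, then AND the optional extra filters."""
    clauses = []
    if artists:
        names = " OR ".join(quote_term(name) for name in artists)
        clauses.append("(" + names + ")")
    if keyword:
        clauses.append(quote_term(keyword))
    if podcasts_only:
        # Field name as it appears in the podcast example links.
        clauses.append("enclosure_mimetype:audio/mpeg")
    return " AND ".join(clauses)


def search_url(query):
    """Return a result page URL for the given query string."""
    return BASE_URL + "?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query(["Burial", "Kode9"], keyword="dubstep")
    print(query)
    print(search_url(query))
```

<p>The point of the sketch is only that a person&#x27;s top artists, a keyword, and a field filter all end up as clauses of one text search.</p>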

<p>I see it as a basic toolbox for writing simple notification mechanisms; a way to combine behavioural data and text search into a news filtering mechanism that is hopefully both sufficiently reactive to a person&#x27;s changes in interest and specific enough to pick out distinct elements from a noisy influx of posts.</p>

<p>Thanks to music feeds I already found out that FWD&gt;&gt; is now offering mp3 <a href="http://www.factmagazine.co.uk/index.php?option=com_content&amp;task=view&amp;id=3018&amp;Itemid=28">downloads</a> of their nights, that there is a new <a href="http://sonicrouter.blogspot.com/2009/07/subscribe-hotflush-podcast.html">Hotflush podcast</a>, and that <a href="http://www.factmagazine.co.uk/">FACT</a> generally keeps releasing great mixes. Finding out about this became effortless. I also learned that Lisa Blanning published a <a href="http://stonesthrow.com/news/2009/07/madlib-interview-the-wire">great interview with Madlib</a> in the Wire. And a random &quot;<a href="http://musicfeeds.screamorap.org/?q=shoreditch">Shoreditch</a>&quot; search uncovered that my colleague Helen is releasing a <a href="http://www.einsteinmusicjournal.co.nz/2009/07/15/pens/">PENS debut album</a>.</p>

<h3>So what is this.</h3>

<p>It is definitely not an edited medium. There is no flow, no binding voice, and the nature of what you see varies wildly with your search query and the time of search.</p>

<p>It&#x27;s not a recommendation mechanism. There is no reasoning about user taste models, no predictive algorithm behind what is shown. What&#x27;s shown is simply what could pass the filters.</p>

<p>It&#x27;s also not an archive. It has neither pagination nor permalinks, not even a URL structure. This is deliberate and will probably not change. (Partly born out of a consideration for &quot;intellectual property&quot; legislation, and partly because this shouldn&#x27;t turn into a republisher.) At the core of it there is just a stream of incoming posts and a search query that acts as a filter. It&#x27;s sort of a routing/messaging system; or at least it is more this than it is a corpus of documents that you access like a library.</p>

<p>I see it as a useful notification mechanism that you can make use of on the side. It&#x27;s a <em>supplementary medium</em>. A substitute for randomly turning on the TV. In its best moments it could be a substitute for actively pursuing news, but I wouldn&#x27;t expect that to happen a lot.</p>

<p>In the end it&#x27;s just a text search.</p>

<p>On the other hand I would still consider this a social filter, because <em>people now become shorthands for quite complex search queries</em>. Your search fu becomes stronger by getting to know other Last.fm users, or at least their profiles; this allows you to pick your &quot;viewpoint.&quot; You can learn about new music, or achieve a specific mixture, by browsing other people&#x27;s streams. So like with <a href="http://dekstop.de/weblog/2008/05/pool_radio/">Pool Radio</a> this is also about people as mediators.</p>

<table class="imagetable" border="0" width="500">
<tr>
    <td align="left">
        <a href="http://musicfeeds.screamorap.org/"><img src="http://farm3.static.flickr.com/2445/3732350330_cf2381b4e6.jpg" width="500" height="374" alt="musicfeeds-2" /></a></td>
</tr>
<tr>
    <td align="left">Music Feeds displays feed enclosures, and can be used as a simple podcast generator. Just subscribe to the feed of a search result page.</td>
</tr>
</table>

<h3>Limitations</h3>

<p><a href="http://musicfeeds.screamorap.org/?lfm:user=sideb0ard">Thor</a>&#x27;s stream, despite the interesting mixture of its topics, also demonstrated some systemic flaws. When I first started browsing it there was always a little too much Jay-Z in his stream. And he also always had a post by the same annoying real-estate feed right at the top, just because that seemed to be a really active feed, i.e. always had new stuff. (That feed has now been removed.)</p>

<p>These apparent flaws are also a little interesting, especially since the effect of this social filter may change over time. A lot of recent searches I made brought up Michael Jackson posts, both because the Last.fm users whose accounts I was testing with had listened to him a lot and because people wrote more about him. This will soon go away and then be replaced with something else.</p>

<p>Sometimes, however, you only get &quot;noise&quot;: too much stuff that matched random keywords regardless of actual theme. That is a good indicator that a.) the system still needs more feeds for loads of ill-represented musical subcultures, and b.) you do need to listen to a certain type of music to make this work.</p>

<p>It obviously works best with music that people write about <em>at this time</em>, because it&#x27;s current or topical. </p>

<p>Yet if your own listening habits are towards the non-topical this search model could still be interesting as a notification mechanism -- e.g. to keep looking for unexpected album releases, just in case.</p>

<p>But that requires that the artist names in one&#x27;s Last.fm profile are unique enough so they don&#x27;t cause too many false positives. My own Last.fm filter keeps letting posts through that randomly match the name of the grime artist &quot;Doctor&quot;, without actually being about the artist.</p>

<h3>On the Source Data</h3>

<p>It helps a lot that this is based on a fairly controlled data set -- these are mostly hand-picked feeds, even in cases where I didn&#x27;t do the picking myself. Initially I thought about implementing a crawler, but at this point that is probably counter-productive. I only want <em>good</em> feeds. I don&#x27;t want to have to waste time on implementing ranking algorithms.</p>

<p>But obviously I don&#x27;t want to hand-pick them all myself. So instead I&#x27;m concentrating on finding good mediators for feed URLs:</p>

<ul>
  <li>My own private subscriptions, just because you have to start somewhere.</li>
  <li>The <a href="http://hypem.com/list">Hypem blog list</a> (thx!), one of the best sources by far, but with a skew towards mp3 blogs (which means it is missing other kinds of good music blogs).</li>
  <li>The <a href="http://technorati.com/blogs/directory/entertainment/music">Technorati Music Blogs</a> directory.</li>
  <li>To a smaller extent: manually traversing music blog link rolls. (As a result there are a whole bunch of London-based blogs in there now.)</li>
  <li>And then <a href="http://twitter.com/sideb0ard">Thor</a> sent me his Google Reader OPML file -- ca. 400 great &amp; original music feeds I didn&#x27;t have! Goldmine!</li>
</ul>

<p>Finally, the inverse: I spent a fair amount of time on pruning feeds that didn&#x27;t quite fit. Gossip blogs, lifestyle wank, real estate &quot;reporting&quot; (esp. the vicarious kind), news, ... there&#x27;s a lot of adjacent stuff that sort of happens in a similar context, and it&#x27;s OK to have up to a degree. But mostly it&#x27;s just a distraction.</p>

<p>Next up, maybe: getting artist homepages from MusicBrainz and determining which ones have a feed. Still unsure about that one. I&#x27;m neither interested in PR blogs nor in the touring minutiae of random rock bands, so this might just be a Pandora&#x27;s box.</p>

<p>(Do you read a lot of music blogs? Or know other good music blog link lists? <a href="http://dekstop.de/contact/">Let me know</a>/send me your OPML file!)</p>

<h3>Briefly on the Technology</h3>

<p>I built a feed aggregator a couple of months ago in Python, with Mark Pilgrim&#x27;s feedparser, PostgreSQL, etc. At the moment it aggregates ca. 3k blogs, and the size of the archive just surpassed 700k posts. Music Feeds is based on this archive.</p>

<p>It&#x27;s using Solr for search. Artist name search is peculiar because stemming rules don&#x27;t really apply; which acts in our favour since it means we don&#x27;t have to worry about language models. Additionally we benefit from Last.fm&#x27;s scrobble <a href="http://blog.last.fm/2009/01/29/closing-in-on-clean-metadata-artist-and-track-spelling-auto-correction-is-here">metadata corrections</a>, i.e. the attention data we get is fairly clean, so a simple text search against our corpus works really well.</p>
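
<p>To illustrate why unstemmed matching suits artist names, here is a small pure-Python stand-in. It is not Solr and not the actual Music Feeds code; the function names and the sample sentences are made up for the example. It lowercases and tokenizes text, then looks for the artist name as an exact token sequence.</p>

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens, without stemming."""
    return re.findall(r"\w+", text.lower())


def mentions(artist, text):
    """Return True if the artist name occurs as an exact token sequence."""
    needle = tokenize(artist)
    haystack = tokenize(text)
    if not needle:
        return False
    width = len(needle)
    return any(
        haystack[start:start + width] == needle
        for start in range(len(haystack) - width + 1)
    )


def filter_posts(artists, posts):
    """Keep the posts that mention at least one of the given artists."""
    return [post for post in posts if any(mentions(a, post) for a in artists)]


if __name__ == "__main__":
    posts = [
        "New Madlib interview in the Wire",
        "Madlibs are a fun party game",
        "My doctor told me to rest",
    ]
    # Hypothetical top artists from a listening profile.
    print(filter_posts(["Madlib", "Doctor"], posts))
```

<p>Without stemming, &quot;Madlibs&quot; does not match &quot;Madlib&quot;, which is what you want for names. The same exactness cannot help with a name like &quot;Doctor&quot;, which still matches any post that happens to use the word.</p>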

<p>Music Feeds has a very simple PHP UI. I love removing features.</p>

<p>This was also a chance to try VirtualBox and run a Debian dev server on OS X. Virtualisation is great. VirtualBox is nice &amp; pretty, but at times it also becomes <a href="http://www.virtualbox.org/ticket/2524">apparent</a> that writing a solid VM is an art form that takes years of practice.</p>

<p>Music Feeds and the architecture behind it were partially written in Z&uuml;rich, San Francisco, and Sardinia. Mostly in London though. (This year I get to travel a lot.)</p>]]></description>
      <dc:creator>Martin Dittus</dc:creator>
      <category>data mining</category>
      <category>konsum</category>
      <category>pop culture</category>
      <category>recommendation engines</category>
      <category>tools</category>
      <category>web services</category>
      
      <guid isPermaLink="true">http://dekstop.de/weblog/2009/07/music_feeds/</guid>
      <pubDate>Sat, 18 Jul 2009 16:15:03 GMT</pubDate>
    </item>
  </channel>
</rss>
