The need for free and open music metadata
February 19, 2008
I made a post on what I thought was the essential architecture of a digital music package. The top left section represents the metadata or facts that describe:
1. An artist, or
2. A group, or
3. A song, or
4. An album or music collection
Presently there are a number of options for this:
All Music Group (AMG)
AllMusic.com
Pros: Structured data, highly accurate, deep in content for major artists
Cons: Private, terms and license not open, not offered as a Web service, not timely as most indie and non-English bands are not represented, cannot be updated by third parties
Audioscrobbler
Audioscrobbler.com
Pros: Web service, timely metadata
Cons: Private, terms and license not open, light on artist content as the metadata is focused on discovery and the social map, cannot be updated by third parties
Music Brainz
MusicBrainz.org
Pros: Non-profit, open, free, structured data, updateable by community, pull web service
Cons: Complex model, incomplete and inaccurate content, difficult to maintain and update, no API for updating/writing to the service
Wikipedia
Wikipedia.org
Pros: Non-profit, open, free, in depth and highly accurate data, timely updates
Cons: not structured data, not offered as Web service.
Why are non-profits better for offering a metadata service?
You may be wondering why I listed non-profit entities as a pro, and private companies as a con. My friends are wondering, that’s for sure. The reason is simple, and it has nothing to do with wanting to sit around the open Web campfire singing “Kumbaya”. It’s pure economics of the new digital world.
Non-profits are simply the stable entity for offering metadata, whereas private companies will be inherently unstable. Music metadata is factual content about known items. As such, the cost of acquiring this data is quite low and falling. The price of metadata, like that of music, will approach ‘near free’. Wikipedia already offers better quality music metadata than the other 3 services combined. If Wikipedia data were offered as a structured web service, it would be game over.
Check out Wikipedia’s entry on Pink Floyd.
Check out All Music Group’s entry on Pink Floyd
Save for music moods and similar “taste” data, Wikipedia’s is far richer for describing the band's history and relationships.
As for timeliness it’s no contest:
Check out Wikipedia’s entry for local Vancouver band, Art of Dying
Against All Music’s Art of Dying entry.
An indie band has to stop being "indie" before they will be properly covered on AMG.
Why didn’t Music Brainz make it as the default service?
It was staffed by wonderful, incredibly smart, and committed people who understand the need for a free and open metadata service. However, in my opinion Music Brainz is simply way too complex, tedious, and time-consuming to update. Wikipedia, on the other hand, is dead simple. End of story. However, Music Brainz still has a lot to offer, as we discovered.
Building a new music metadata Web service
Unless you have deep pockets, AMG is out of the question, but Wikipedia does not provide a web service with structured data. So, how can new Web-based music businesses effectively use the data?
So to solve this problem we have taken Wikipedia and joined it to Music Brainz to get structured Wikipedia music metadata. That’s cool and useful. At least we think it is.
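As a rough illustration of what such a join might look like (not the actual implementation), the Python sketch below assumes each structured artist record carries the title of its Wikipedia article and uses that title as the join key. The class names, field names, and the join_artists function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StructuredArtist:
    """Hypothetical structured record, e.g. exported from a MusicBrainz-style database."""
    artist_id: str
    name: str
    wikipedia_title: Optional[str]  # assumed join key; None if no article is linked


@dataclass
class ArtistProfile:
    """Joined result: structured identity plus free-text article content."""
    artist_id: str
    name: str
    biography: Optional[str]


def join_artists(
    structured: List[StructuredArtist],
    articles: Dict[str, str],
) -> List[ArtistProfile]:
    """Attach article text to each structured record, keyed on article title."""
    profiles = []
    for artist in structured:
        biography = None
        if artist.wikipedia_title is not None:
            # Missing articles simply leave the biography empty.
            biography = articles.get(artist.wikipedia_title)
        profiles.append(ArtistProfile(artist.artist_id, artist.name, biography))
    return profiles
```

Keeping the structured record as the backbone and treating the article text as an optional attachment means an artist with no article still appears in the result, just with an empty biography.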
We will be offering it as a free “for commercial use” Web service in a few months. Sure, we could offer this service at a rate to undercut All Music Group and there would be many takers, but then someone else would come along and undercut us, and so on, and so on, until finally the price was near the marginal cost of offering the service, or “near free”. So let’s skip all of that rigmarole and go right to a free service.
A service like this will give the hundreds of small music service companies a leg up: they can spend their time innovating rather than collecting or paying for music metadata.
Yes, there are costs to offering this service: support, hosting and maintenance. We figure the way to pay for this is for companies who hit the service frequently to pay a minimal fee. Hence, ‘near free’. So a new company can use the service for free and make money using it. Once they grow to the point where they use the service frequently, they can help support the infrastructure by paying a small fee.
We are still about 3 months away from releasing it as a service to the public. So, if someone comes along and offers this service before we do, great, we should all use it. The economic rules won’t change.
However, should people take to our implementation, we are going to need some help. Maybe Jimmy Wales can take it over as part of the Wikimedia Foundation. It is using data from his baby after all. Maybe Mozilla or Music Brainz can help and show how this can be managed. We are open to and are actively seeking suggestions.




Open Music Metadata
Came across your blog in search of better metadata services. How is your Wiki/Music Brainz project going? It seems like a really good idea.
My search was prompted by the constant amendments to my library's metadata for reasons of accuracy and custom details. I'm into old blues, jazz, country, folk. I think your model is good but there needs to be some kind of agreement as to the details that it harvests.
For me the ideal would be a system that displays the obvious: track, title, artist, composer, etc...but one that also displayed: the year the track was originally released, the musicians who perform on the track, the original label the track was recorded on. I would like each metadata detail to be able to link me to sites which provide more in-depth information.
For example, if I was playing Marty Robbins' 'El Paso', I should be able to see in the metadata that the superb guitar playing that makes the track was performed by Grady Martin. Then, on clicking on Grady Martin's name, I should be able to find out more about him, or which other tracks he sessioned on, like finding out he was the musician who conceived the immortal intro to Roy Orbison's 'Pretty Woman'. I think such an active system would generate sales as music nuts explore new musical tangents.
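To make that concrete, here is a minimal Python sketch of the kind of record this would need, with per-track performer credits and a lookup from a musician back to the tracks they played on. Every class, field, and function name here is hypothetical, and the year and label fields are left empty rather than guessed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Credit:
    """One performer credit on a track (hypothetical shape)."""
    musician: str
    role: str


@dataclass
class Track:
    title: str
    artist: str
    original_year: Optional[int] = None   # year of original release, if known
    original_label: Optional[str] = None  # original label, if known
    credits: List[Credit] = field(default_factory=list)


def tracks_featuring(musician: str, library: List[Track]) -> List[str]:
    """Return the titles of all tracks on which the given musician is credited."""
    return [
        track.title
        for track in library
        if any(credit.musician == musician for credit in track.credits)
    ]


# Example data taken from the two tracks mentioned above.
library = [
    Track("El Paso", "Marty Robbins", credits=[Credit("Grady Martin", "guitar")]),
    Track("Pretty Woman", "Roy Orbison", credits=[Credit("Grady Martin", "guitar")]),
]

print(tracks_featuring("Grady Martin", library))
```

Clicking a musician's name in a player would then amount to calling tracks_featuring with that name.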
Putting this kind of detail together would have to be a combination of collaborative enthusiasm and a sound CMS, as you have proposed.
Anyway, good luck.
MusicBrainz Model
I've looked at the MusicBrainz metadata spec, and I'm surprised that you see it as complex. To me it looks rather sparse. I can't see that it would provide much of the richer content you're after; it seems pretty much geared to providing music players with track lists and not much else.
Or are you referring to their API?
The data mostly
The data is poor. It is incomplete, often inaccurate, and the schema has flaws (for soundtracks, the artist is not the title of the movie). All of this would be solvable if the system was easy to update by a broad community, à la Wikipedia. However, the update process is a nightmare. Hey, if MusicBrainz worked, we would use it. Plain and simple. I can't find a music system of any worth that actually uses MusicBrainz. Maybe that situation will change, but I'm not holding my breath.