MetaBrainz Foundation Annual Report 2011
We did it! The Next Generation Schema (NGS) went live in May 2011 after more than five years of planning and two years of active development!
The changes in NGS move MusicBrainz away from being a "will this work?" open source project aimed at identifying audio CDs and solidly towards being a comprehensive music encyclopedia. The previous schema we used was simple, but it could not capture all of the nuances inherent in music metadata. For many years we'd known that we needed to overhaul our schema and our website, and deciding what we wanted to do and how to do it took more than 3 years of planning. Then, after 3 arduous years of rewriting all of MusicBrainz, we were finally somewhat ready to unleash NGS on the world. On May 18th we flipped the switch and set MusicBrainz NGS live on the 5 spiffy new servers we had just purchased with money raised in a fundraiser we held in March. For more information on NGS, please see the NGS release notes.
Our web service API also saw a major overhaul with version 2 adding support for all the new NGS entities.
Finally, MusicBrainz, through the efforts of Jamie McDonald, also released its first mobile application, the MusicBrainz Android application.
The Guardian is a British daily newspaper that started using our data in 2010 to identify and link to music-related articles on its website. In 2011 it became a customer of the MetaBrainz Foundation.
Google Summer of Code
We had three students for Google Summer of Code. Ian McEwen (ianmcorvidae) worked on collecting and visualizing our database statistics and Michael Wiencek (bitmap) took on the task of making Picard NGS-ready. Eliza Gebow also participated and hoped to create MusicBrainz widgets, but she never finished her project.
On a related note, Jamie McDonald's MusicBrainz Android App from last year's Summer of Code was officially published in the Android Market.
The 11th MusicBrainz Summit was held in Rotterdam, NL, from October 15th to 17th. The summit was a great success with 17 people in attendance, including representatives from the BBC, Google, Last.fm, musiXmatch, and Zvooq.
There were many popular topics discussed at the summit such as how to improve the edit system, how to define data quality, how to integrate third party datasets, and much more. Copious amounts of chocolate, energy drinks, stroopwafels, and sandwiches (the family-sized, Spanish variety) were consumed over the course of the summit. For lots more information, view the summit session notes.
Special thanks to Last.fm, musiXmatch, Google, and Grooveshark for sponsoring the summit.
| Live Data Feed licenses | $94,653.09 |
| Tagger Affiliate Program | $18,485.64 |
| CC Data License | $6,575.00 |
| In Kind Donations | $49,480.00 |
The Profit & Loss shows:
- Over the course of the year we spent $36,800.94 on hosting and hardware costs and served 4.6 billion web hits, of which 3.8 billion (93%) were web service hits. Calculating a cost per hit, we find that we spent $8.01 per million web hits and $8.56 per million web service hits. Compared to our 2010 numbers of $6.93 per million web hits and $14.50 per million web service hits, two things stand out:
  - We're getting a lot more web service hits compared to web hits.
  - Our overall cost per web hit has gone up for the first time in our history, but this was due to the massive cost of buying 5 new servers.
- Development costs in the form of salaries paid to Oliver Charles and Kuno Woudt came to $104,730.01, and administration costs including taxes came to $68,398.40.
- We earned $94,653.00 from live data feed licenses and $5,100 from Creative Commons licensed data for a total of $99,753.00. This is almost double compared to last year!
- End user donations via PayPal came to $14,851.60, which is significantly more than last year -- this was mostly due to our March hardware fundraiser, where we raised a total of $15,527.50 from end users and corporate sponsors. Sponsorships and large donations for 2011 came to $55,234.99. Additionally, we had a massive $49,480 donation of server hardware. Thank you Google and our anonymous donor for your support -- without you MusicBrainz would be moving a lot slower!
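The cost-per-hit figures in the first item above are presumably obtained by dividing the year's hosting and hardware spend by the number of hits, expressed in millions. A minimal sketch of that arithmetic follows; the function name `cost_per_million_hits` is hypothetical, and the hit count is the rounded figure quoted in the text, so the result only approximates the reported value.

```python
def cost_per_million_hits(total_cost_usd: float, hits: float) -> float:
    """Return the cost in USD per one million hits."""
    if hits <= 0:
        raise ValueError("hits must be positive")
    return total_cost_usd / (hits / 1_000_000)


# Figures quoted in the report. 4.6 billion is a rounded hit count,
# so this only approximates the per-million cost stated in the text.
hosting_cost_2011 = 36_800.94
web_hits_2011 = 4.6e9

web_cost = cost_per_million_hits(hosting_cost_2011, web_hits_2011)
print(f"${web_cost:.2f} per million web hits")
```

The same function can be applied to the web service hit count to get the second figure.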
A hardware fundraiser was held earlier in the year in preparation for the NGS release. After a month of gentle prodding and poking, our community and sponsors came through and exceeded our goal by donating a total of $15,527.50. Thank you Google, SoundCloud, EchoNest, Matt Mullenweg (WordPress), Decibel.net, Magic MP3 Tagger, Grooveshark, Songkick, and Affinity Chiropractic (the people MetaBrainz rents an office from), and thank you to the 83 individual donors who helped make this happen.
There was a significant increase in traffic over the course of 2011. We started the year at 9M hits/day and ended the year with a whopping 22M hits/day. This is a significant increase of more than 140%!
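The growth figure can be checked with a simple percentage-increase calculation. This is a minimal sketch using the rounded daily hit counts quoted above; the function name `percent_increase` is hypothetical.

```python
def percent_increase(start: float, end: float) -> float:
    """Return the relative increase from start to end, as a percentage."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (end - start) / start * 100.0


# Rounded hits per day at the start and end of 2011, as quoted in the report.
start_hits_per_day = 9_000_000
end_hits_per_day = 22_000_000

growth = percent_increase(start_hits_per_day, end_hits_per_day)
print(f"{growth:.1f}% increase in daily hits")
```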
As a consequence of our increasing traffic and finite resources, harsher rate limiting rules were implemented on a global level as well as for some specific misbehaving applications.
A big thank you to all 23,637 editors/voters who contributed in 2011! MusicBrainz would be nothing without your hard work!
At the end of 2011, MusicBrainz had 13 machines in service. From the top, going down:
- rika: User sandbox machine (mbsandbox.org)
- lolo: An extra front end web server
- scooby: Our aging catch-all server: blog, forums, mailing lists, etc.
- catbus: retired
- bender: retired
- blik: retired
- stimpy: MetaBrainz, MusicBrainz Classic
- dexter: retired
- cartman: Classic search server, index builder
- wiley: New catch-all server: SVN, git, jira, wiki, trac, mail, backups
- lenny/carl: Redundant network gateways
- tails: off currently
- asterix: web server
- astro: web server
- roobarb: search server
- pingu: web server
- dora: search server, memcached server
- totoro: database server
Over the course of 2012, we're going to remove and donate the retired servers and replace them with the servers that were donated to us at the end of 2011. MusicBrainz uses 8 Mbit/s of bandwidth on average and draws 27 amps of current, for a power consumption of about 2,970 watts. MusicBrainz physically occupies 20U of rack space (half of a rack) at Digital West in San Luis Obispo, CA.
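The power figure follows from multiplying current by supply voltage. The report does not state the voltage; the sketch below assumes a 110 V supply, which is the value implied by the quoted 27 amps and 2,970 watts. The function name `power_watts` is hypothetical, and the calculation ignores power factor.

```python
def power_watts(current_amps: float, voltage_volts: float) -> float:
    """Return power in watts as current times voltage (power factor ignored)."""
    return current_amps * voltage_volts


# ASSUMPTION: 110 V is not stated in the report; it is inferred from
# the quoted 27 amps and roughly 2,970 watts.
ASSUMED_SUPPLY_VOLTAGE = 110.0
current_draw_amps = 27.0

consumption = power_watts(current_draw_amps, ASSUMED_SUPPLY_VOLTAGE)
print(f"{consumption:,.0f} watts")
```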
Words of appreciation
2011 was a challenging year for us, but in the end everything turned out quite well. With the release of NGS and the many months of bug fixing that followed the release, we were able to complete a massive five-year project. Towards the end of the year we finally found time to work on new features and other improvements that were not related to cleaning up after NGS. We've now moved past NGS and have a fresh, clean and stable codebase that will allow us to make MusicBrainz even better.
Our traffic keeps increasing and community support is also on the rise. Between the fundraiser and the donated hardware, MusicBrainz looks quite a bit more mature than it did a year ago -- we've done a lot of growing up in the past year. And most of that was thanks to you, our users, editors, developers, sponsors, friends and board of directors. There are so many people to thank that we won't even attempt to do so, because we know we can't possibly think of everyone who helped us through this critical year.
Thank you for your continued support; we're going to work hard to make MusicBrainz even better!