The MetaBrainz Datasets

MetaBrainz's core mission is to curate and maintain public datasets that anyone can download and use. We ask commercial users to support us in order to help fund the creation and maintenance of these datasets. Personal use of our datasets will always be free. We appreciate end-user donations as well!

Our datasets fit into two main categories: main project data dumps, which contain the entirety of the data for a given project, and derived data dumps, which are based on the data in our main project databases and serve a more specific purpose.

Database Dumps

MusicBrainz PostgreSQL Data

This data dump includes all public portions of the data from our MusicBrainz project: all artists, releases, recordings, labels, and the relationships between them, and much more, including everything needed to run your own copy of MusicBrainz.

MusicBrainz JSON Data

We also provide the MusicBrainz data as easily consumable JSON documents. If you cannot work with PostgreSQL, or you prefer to work with a document-oriented data store, then this data dump is for you.
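
As a rough illustration of working with JSON documents outside of PostgreSQL, the sketch below streams documents one at a time. It assumes the extracted dump file holds one JSON document per line; that layout, the helper name iter_documents, and the "id" and "name" fields in the sample are assumptions made for this example, not a description of the actual dump format.

```python
import io
import json
from typing import Iterator, TextIO


def iter_documents(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON document per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Two made-up documents standing in for an extracted dump file.
# The field names here are hypothetical.
sample = io.StringIO(
    '{"id": "a1", "name": "Example Artist"}\n'
    '{"id": "b2", "name": "Another Artist"}\n'
)
names = [doc["name"] for doc in iter_documents(sample)]
print(names)
```

Reading line by line keeps memory use flat regardless of file size, which matters for a dump that covers an entire project. For a real file, an open file handle would take the place of the in-memory sample.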

ListenBrainz PostgreSQL Data

ListenBrainz collects user listening information and creates insights for its users about their own listening habits, as well as providing various tools for discovering new music. All of this data, indexed against MusicBrainz, is available for download.

CritiqueBrainz PostgreSQL Data

The CritiqueBrainz project collects user reviews of music and books, based on the data of the MusicBrainz and BookBrainz projects. All of the reviews, along with everything necessary to run your own copy of CritiqueBrainz, are included in these data dumps.

Derived Dumps

Derived dumps take data from one project and transform it into a new dataset that solves a different problem.

MusicBrainz Canonical Data Dumps

These dumps contain canonical MusicBrainz data: one single record for each musical recording and release in the database, which makes it easier to reason about the core metadata in MusicBrainz. This dataset is useful for matching data to or from MusicBrainz.
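
To make the idea of "one single record" concrete, the sketch below resolves any recording identifier to its canonical one through a redirect table. The CSV layout, the column names recording_mbid and canonical_recording_mbid, the helper names, and the sample identifiers are all assumptions made for illustration; check the dump's own documentation for the real file and column names.

```python
import csv
import io
from typing import Dict, TextIO


def load_redirects(stream: TextIO) -> Dict[str, str]:
    """Build a mapping from recording ID to canonical recording ID."""
    reader = csv.DictReader(stream)
    # Column names are assumed for this sketch.
    return {
        row["recording_mbid"]: row["canonical_recording_mbid"]
        for row in reader
    }


def canonical_id(recording_id: str, redirects: Dict[str, str]) -> str:
    """Return the canonical ID, or the input itself if no redirect exists."""
    return redirects.get(recording_id, recording_id)


# Made-up rows: two variants of a recording point at one canonical record.
sample = io.StringIO(
    "recording_mbid,canonical_recording_mbid\n"
    "rec-live,rec-studio\n"
    "rec-remaster,rec-studio\n"
)
redirects = load_redirects(sample)
print(canonical_id("rec-live", redirects))
print(canonical_id("rec-studio", redirects))
```

Treating an identifier with no redirect as already canonical is a design choice of this sketch: it lets the lookup be applied uniformly to every identifier in a dataset being matched.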

MLHD+ Data Dump

The Music Listening Histories Dataset (MLHD) is a large collection of music listening events assembled from more than 27 billion time-stamped logs extracted from Last.fm. Using the MusicBrainz canonical data, we cleaned up errors in this data in order to provide an improved version of the dataset.
