aerozol's comments

The MusicBrainz project by MetaBrainz has released their latest dataset, MusicBrainz Canonical Metadata. It addresses a number of problems involving matching music to the correct entry in the massive MusicBrainz database. Previously, it was difficult to programmatically identify the main (canonical) release of an album or song. The dataset makes that possible for anyone interested in building their own music database, tagger application, or other music-related application.

You can find all the MetaBrainz datasets here: https://metabrainz.org/dataset…

The MusicBrainz database aims to collect all the metadata for all music that has ever been published. For popular albums and songs, which have been released many times, it can be hard to answer the question “which one is the main (canonical) entry?” Using the new dataset, a user can enter any release or recording MBID (MusicBrainz identifier), and match it to the canonical entry.

The tables included in the dataset contain all the string metadata necessary to make effective use of it. Artist names, release names and recording names are all present, indexed against the MBIDs. This lowers the barrier to entry for music-based development considerably: anyone can now import the dataset into their favourite datastore and start looking up tracks.
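To make the "import and look up" idea concrete, here is a minimal sketch in Python using SQLite. The column names (`recording_mbid`, `canonical_recording_mbid`, `artist_name`, `recording_name`), the table name, the helper functions, and the sample rows are all assumptions made up for illustration; check the actual dataset files for the real layout before relying on any of this.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows. The column names are assumptions for
# illustration, not the dataset's documented schema.
SAMPLE_CSV = """recording_mbid,canonical_recording_mbid,artist_name,recording_name
11111111-1111-1111-1111-111111111111,aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa,Example Artist,Example Song
22222222-2222-2222-2222-222222222222,aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa,Example Artist,Example Song (Remaster)
aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa,aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa,Example Artist,Example Song
"""


def load_canonical_table(csv_text):
    """Load the (hypothetical) CSV into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE canonical_recording (
            recording_mbid TEXT PRIMARY KEY,
            canonical_recording_mbid TEXT NOT NULL,
            artist_name TEXT,
            recording_name TEXT
        )
        """
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    # Named placeholders are filled from each row dictionary.
    conn.executemany(
        "INSERT INTO canonical_recording VALUES "
        "(:recording_mbid, :canonical_recording_mbid, "
        ":artist_name, :recording_name)",
        rows,
    )
    conn.commit()
    return conn


def canonical_for(conn, mbid):
    """Return the canonical MBID for a recording MBID, or None if unknown."""
    row = conn.execute(
        "SELECT canonical_recording_mbid FROM canonical_recording "
        "WHERE recording_mbid = ?",
        (mbid,),
    ).fetchone()
    return row[0] if row else None


conn = load_canonical_table(SAMPLE_CSV)
print(canonical_for(conn, "22222222-2222-2222-2222-222222222222"))
```

With real data you would read from the downloaded file instead of an in-memory string, and probably use an on-disk database, but the lookup itself stays a single indexed query.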

The MetaBrainz Foundation offers a number of different datasets, often under the Creative Commons Zero (CC0) licence. These datasets can be used to build applications, databases, or train machine learning algorithms/AI. MetaBrainz Foundation datasets power countless projects, and stand behind the scenes of many of today’s largest tech companies, such as Microsoft, Google, and Amazon. The MetaBrainz Foundation datasets are all available on the MetaBrainz datasets page. The MetaBrainz Foundation uses the new MusicBrainz canonical metadata dataset themselves, primarily in the tagging application MusicBrainz Picard, and the social music site ListenBrainz.


Heads up that the MetaBrainz page you link returns a 404 (page not found).

This info is most awesome to know though, thank you.


I think it was meant to be https://metabrainz.org/datasets, which is the same as the top level link.


I hate to say this, but I suspect that an LLM that has been trained on how to post on HN truncated the link because links on HN are (visually) truncated.


A suspicion that I also have often, these days… but no, no LLM in this case!


Is this an AI summary post?


Hey, I was away for the long weekend, so a bit late…

To answer your question, I’m 99.9% sure I’m not AI, just a derp who pastes in truncated URLs.


The Discogs ‘schema’ doesn’t attempt to solve any issues brought up in the article.


I believe digs.fm is partly powered by MusicBrainz already, hurray :D


That's true! All release groups from MusicBrainz are fed into the Digs database.

