aerozol's comments

aerozol · on June 1, 2023

The MusicBrainz project by MetaBrainz has released its latest dataset, MusicBrainz Canonical Metadata. It addresses a number of problems with matching music to the correct entry in the massive MusicBrainz database. Previously, it was difficult to programmatically identify the main (canonical) release of an album or song. The dataset makes this straightforward for anyone building their own music database, tagger application, or other music-related application.

You can find all the MetaBrainz datasets here: https://metabrainz.org/dataset…

The MusicBrainz database aims to collect all the metadata for all music that has ever been published. For popular albums and songs, which have been released many times, it can be hard to answer the question “which one is the main (canonical) entry?” Using the new dataset, a user can enter any release or recording MBID (MusicBrainz identifier), and match it to the canonical entry.

The tables included in the dataset contain all the string metadata necessary to make effective use of it. Artist names, release names and recording names are all present, indexed against the MBIDs. This lowers the barrier to entry for music-based development considerably: anyone can now import the dataset into their favourite datastore and start looking up tracks.
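As a rough sketch of that import-and-lookup workflow, here is what loading a redirect table into SQLite and resolving an MBID could look like. Everything specific here is an assumption for illustration: the column names (`recording_mbid`, `canonical_recording_mbid`, `canonical_release_mbid`), the table name, the made-up sample MBIDs, and the choice to treat an MBID with no redirect row as already canonical. Check the actual files in the dataset before relying on any of it.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for a redirect file from the dataset.
# ASSUMPTION: these column names are hypothetical, not taken from the post.
SAMPLE_CSV = """recording_mbid,canonical_recording_mbid,canonical_release_mbid
aaaaaaaa-0000-0000-0000-000000000001,aaaaaaaa-0000-0000-0000-000000000009,bbbbbbbb-0000-0000-0000-000000000001
aaaaaaaa-0000-0000-0000-000000000002,aaaaaaaa-0000-0000-0000-000000000009,bbbbbbbb-0000-0000-0000-000000000001
"""


def load_redirects(conn, csv_file):
    """Create the (hypothetical) redirect table and fill it from a CSV file object."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recording_redirect ("
        "recording_mbid TEXT PRIMARY KEY, "
        "canonical_recording_mbid TEXT NOT NULL, "
        "canonical_release_mbid TEXT NOT NULL)"
    )
    # DictReader yields one dict per row, which matches the named placeholders.
    conn.executemany(
        "INSERT OR REPLACE INTO recording_redirect VALUES "
        "(:recording_mbid, :canonical_recording_mbid, :canonical_release_mbid)",
        csv.DictReader(csv_file),
    )
    conn.commit()


def canonical_recording(conn, mbid):
    """Return the canonical recording MBID for `mbid`.

    ASSUMPTION: an MBID without a redirect row is treated as already canonical.
    """
    row = conn.execute(
        "SELECT canonical_recording_mbid FROM recording_redirect "
        "WHERE recording_mbid = ?",
        (mbid,),
    ).fetchone()
    return row[0] if row else mbid


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_redirects(conn, io.StringIO(SAMPLE_CSV))
    print(canonical_recording(conn, "aaaaaaaa-0000-0000-0000-000000000001"))
```

With a real download, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by an opened CSV file, and the table definition adjusted to whatever columns the file actually has.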

The MetaBrainz Foundation offers a number of different datasets, often under the Creative Commons Zero (CC0) licence. These datasets can be used to build applications and databases, or to train machine learning algorithms/AI. They power countless projects, and sit behind the scenes at many of today’s largest tech companies, such as Microsoft, Google, and Amazon. All of them are available on the MetaBrainz datasets page. The MetaBrainz Foundation uses the new MusicBrainz Canonical Metadata dataset itself, primarily in the tagging application MusicBrainz Picard and the social music site ListenBrainz.

canadiantim · on June 2, 2023

Heads up that the MetaBrainz page you link to returns a 404 (page not found).

This info is most awesome to know though, thank you

foolswisdom · on June 2, 2023

I think it was meant to be https://metabrainz.org/datasets, which is the same as the top level link.

pessimizer · on June 2, 2023

I hate to say this, but I suspect that an LLM that has been trained on how to post on HN truncated the link because links on HN are (visually) truncated.

aerozol · on June 5, 2023

A suspicion that I also have often, these days… but no, no LLM in this case!

have_faith · on June 2, 2023

Is this an AI summary post?

aerozol · on June 5, 2023

Hey, I was away for the long weekend, so a bit late…

To answer your question, I’m 99.9% sure I’m not AI, just a derp who pastes in truncated URLs.

aerozol · on Feb 20, 2023

The Discogs ‘schema’ doesn’t attempt to solve any issues brought up in the article.

aerozol · on Aug 23, 2022

I believe digs.fm is partly powered by MusicBrainz already, hurray :D

throwaway874839 · on Aug 23, 2022

That's true! All release groups from MusicBrainz are fed into the Digs database.