Monday, November 28, 2016 at 17:06

Added Info from Discogs.com API to my Database

By Eric Antoine Scuccimarra

Discogs.com is one of my favorite websites. I discovered it 15 years ago and have used it ever since to research records and music. I recently discovered that it has an API, so I used it to search for each record in my collection. When a search returned results, I imported the link to the Discogs page, along with a link to the thumbnail image, into my database.

The process was rather convoluted. To start, I searched using the data exactly as it was in my database: artist, title, label and catalog number. But Discogs often has multiple entries for each record - it may have been released in different countries or re-released, or there may be separate entries for promos, test pressings and white labels. So my starting algorithm was as follows:

  • Send the full data from my database as a search request to the Discogs API.
  • If one and only one result is returned, take that one and link it.
  • If more than one result is returned, filter out the ones for promos, mispresses, white labels, etc.
  • If there is now just one result, use that one. Otherwise record the number of results returned in the database and move on to the next record.
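
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code I actually ran: the endpoint and query parameter names (artist, release_title, label, catno, token) and the "format" field on each result are my reading of the Discogs API and should be checked against its documentation, and the helper names (build_search_url, is_standard_release, pick_match) and the record keys are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the Discogs API documentation.
SEARCH_URL = "https://api.discogs.com/database/search"

# Format descriptions treated as non-standard pressings (assumed spellings).
EXCLUDED_FORMATS = {"promo", "mispress", "white label", "test pressing"}


def build_search_url(record, token):
    """Build a search URL from whichever fields the record has."""
    params = {
        "type": "release",
        "artist": record.get("artist"),
        "release_title": record.get("title"),
        "label": record.get("label"),
        "catno": record.get("catno"),
        "token": token,
    }
    # Drop empty fields so they do not constrain the search.
    return SEARCH_URL + "?" + urlencode({k: v for k, v in params.items() if v})


def is_standard_release(result):
    """True unless the result is tagged as a promo, mispress, etc."""
    formats = {f.lower() for f in result.get("format", [])}
    return not (formats & EXCLUDED_FORMATS)


def pick_match(results):
    """Return (match, count).

    match is the single usable result, or None if there is no unique one.
    count is the number of candidates left to store for later review.
    """
    if len(results) == 1:
        return results[0], 1
    filtered = [r for r in results if is_standard_release(r)]
    if len(filtered) == 1:
        return filtered[0], 1
    return None, len(filtered)
```

Here the stored count is the number of candidates remaining after filtering; storing the raw number of results instead would be an equally reasonable choice.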

This matched a couple hundred out of the couple thousand records in my database. Most of my records got 0 matches on Discogs, while some still had multiple matches - anywhere from 2 to 35. So I started reviewing the ones with multiple matches by hand, and I realized that for most of the records with fewer than 5 matches, the matches were pretty much equivalent. For those I just took the first match and assigned it. This matched another couple hundred records.
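
The "take the first match when there are only a few" rule is simple enough to show in a few lines. This is only a sketch; the function name and the idea of passing the candidates in as a list are assumptions made for illustration.

```python
# Below this many candidates, treat them as interchangeable.
MAX_EQUIVALENT = 5


def resolve_small_ties(candidates, limit=MAX_EQUIVALENT):
    """Return the first candidate when there are fewer than `limit`, else None."""
    if 1 <= len(candidates) < limit:
        return candidates[0]
    # Zero candidates, or too many to assume they are equivalent.
    return None
```

Records for which this returns None are the ones left over for manual review.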

Next I put aside the few remaining records with 5 to 35 potential matches and focused on the thousand or so that had no matches. Reviewing some of them manually, I found that many of the misses were due to typos in my database. So my next step was to omit the artist field and just search on the title, label and catalog number. I got another couple hundred matches using this method. Then I went on and searched using only the catalog number. This method matched about half of the remaining unmatched records - but I had to manually verify each match because some catalog numbers are not unique.
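
The progressively looser searches described above amount to a cascade of passes. Here is a rough sketch of that idea, assuming a hypothetical search callable that takes a dict of fields and returns a list of results; the field names and the function name are illustrative, not taken from my actual code.

```python
# Field sets tried in order, from strictest to loosest.
SEARCH_PASSES = [
    ("artist", "title", "label", "catno"),
    ("title", "label", "catno"),  # drop the artist to tolerate typos
    ("catno",),                   # catalog number only
]


def cascade_search(record, search):
    """Try each pass in turn; return (results, fields_used, needs_review)."""
    for fields in SEARCH_PASSES:
        query = {f: record[f] for f in fields if record.get(f)}
        if len(query) < len(fields):
            continue  # the record lacks a field this pass requires
        results = search(query)
        if results:
            # Catalog numbers are not unique, so flag those matches
            # for manual verification.
            needs_review = fields == ("catno",)
            return results, fields, needs_review
    return [], None, False
```

A record with no catalog number skips every pass in this sketch, which mirrors the fact that those records are the hardest to match.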

Unfortunately I do not have a catalog number for every record in my collection, and as of now about a third of the records in my database are still unmatched. For those that are matched, the record information page now shows a link to the discogs.com page for that record, as well as a thumbnail pulled from discogs.com if one is available. For anyone interested in collecting records I highly recommend discogs.com, as it is by far the most comprehensive database of music releases I know of.

Tags: personal, music

