There was an open-source audio fingerprinting system called Echoprint, which implemented the Shazam algorithm in a way that made it hard to claim it was the same approach as Shazam's, but in reality it was almost the same. The hardest part of these kinds of services is designing the fingerprints so that you can search them effectively. The audio part is interesting and fun, but actually less critical.
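To make the "searchable fingerprint" point concrete, here is a minimal sketch of one common way to do it: pair up spectrogram peaks and pack each pair into a single integer, so that lookup becomes an exact-match key search rather than a fuzzy audio comparison. The bit layout, the `fan_out` parameter, and the function names are assumptions chosen for illustration, not what Echoprint or Shazam actually use.

```python
def pack_hash(f1: int, f2: int, dt: int) -> int:
    """Pack two frequency bins and a time delta into one integer key.

    Assumed layout (illustrative only): 10 bits per frequency bin,
    12 bits for the time delta between the two peaks.
    """
    if not (0 <= f1 < 1024 and 0 <= f2 < 1024 and 0 <= dt < 4096):
        raise ValueError("value out of range for the assumed bit layout")
    return (f1 << 22) | (f2 << 12) | dt


def unpack_hash(h: int) -> tuple[int, int, int]:
    """Inverse of pack_hash, useful for debugging."""
    return (h >> 22) & 0x3FF, (h >> 12) & 0x3FF, h & 0xFFF


def fingerprint(peaks: list[tuple[int, int]], fan_out: int = 3) -> list[tuple[int, int]]:
    """Turn time-sorted (time, freq_bin) peaks into (hash, anchor_time) pairs.

    Each peak is paired with up to `fan_out` peaks that follow it.
    """
    out = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            out.append((pack_hash(f1, f2, t2 - t1), t1))
    return out
```

Because the key depends only on the two frequencies and the time difference, the same pair of peaks produces the same key no matter where in the track the recording starts, which is what makes a plain hash table or database index usable for search.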
What would you say is the difference between fingerprinting and using something like OpenAI's Whisper approach (visual spectrogram ML) for finding the music?
Tangent: I'm also thinking about some fast text-search algorithm, maybe related to Spotify. Damn, it was a long time ago that I read that article.
You should be careful with this. Last time I saw an article about reproducing Shazam's algorithm, their lawyers came after them and eventually the article was removed.
There were questions as to the validity of the threats their lawyers used, but even a bulletproof case is a costly endeavor when going up against companies of that scale.
The title card of the video is "Please don't sue me", so I assume OP is at least somewhat familiar with the risks.
and then goes on to say he lets people put in Spotify links to add songs. Spotify won't let you download songs, but he uses their API to get the band and title... then searches for it on YouTube and downloads the song from there instead
PFFFT that's the sound of YouTube's lawyers spitting out their coffee and sprinting back to their desks
The Shazam patent is out there.
i want to believe
Related: I recreated Shazam’s algorithm with Go (494 points, 7 months ago, 117 comments) https://news.ycombinator.com/item?id=41127726
How Shazam Works (2003) [pdf] (117 points, 11 months ago, 29 comments) https://news.ycombinator.com/item?id=40029036 - there's a lot of links to past Shazam stories in comments
I also did this in Go about 8 years ago for a company. I wonder if that company still exists, actually; it was called SpotOn.
Hmm, they do seem to have gotten some more customers after I left, but the website is all glitchy now, so I guess it's abandoned.
https://spot-on.media/
Git repository of this project:
https://github.com/cgzirim/seek-tune
This pops up every now and then, so a question from an uneducated guinea pig: isn't the main ingredient the songs database?
Well sure, a songs database is important. But song databases like https://acoustid.org/ exist, which let you look up songs that share the same audio "fingerprint" (https://github.com/acoustid/chromaprint). You need the full track to make that fingerprint.
Shazam can take only a tiny snippet, and can guess quite accurately just from that snippet. By comparison to AcoustID, which is also a song database (with an entirely different purpose) we can say that the "main ingredient" is Shazam's system for identifying songs from short snippets.
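For anyone wondering how a short snippet can be enough, here is a rough sketch of the usual trick: look up each snippet hash in an inverted index and vote on the time offset between snippet and track. A real match produces many hashes that agree on one offset, while chance collisions scatter. The hash values, track IDs, and function names below are made up for illustration; this is not Shazam's or AcoustID's actual code.

```python
from collections import Counter, defaultdict


def build_index(tracks):
    """tracks: {track_id: [(hash, time), ...]} -> {hash: [(track_id, time), ...]}"""
    index = defaultdict(list)
    for track_id, prints in tracks.items():
        for h, t in prints:
            index[h].append((track_id, t))
    return index


def best_match(index, snippet):
    """Return (track_id, offset, votes) for the most consistent alignment, or None."""
    votes = Counter()
    for h, t_snip in snippet:
        for track_id, t_track in index.get(h, ()):
            # Hashes from a true match all agree on the same track-minus-snippet offset.
            votes[(track_id, t_track - t_snip)] += 1
    if not votes:
        return None
    (track_id, offset), score = votes.most_common(1)[0]
    return track_id, offset, score


# Toy data: integer hashes stand in for real fingerprint hashes.
tracks = {
    "a": [(11, 0), (22, 5), (33, 9), (44, 14)],
    "b": [(22, 1), (55, 3), (33, 20)],
}
snippet = [(22, 0), (33, 4), (44, 9)]  # taken from track "a", starting at time 5
print(best_match(build_index(tracks), snippet))  # ('a', 5, 3)
```

Track "b" shares two hashes with the snippet, but they disagree on the offset, so each gets a single vote. That is why a few seconds of audio can be enough, as long as the fingerprints are indexed by hash.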
I'm curious which is better, Shazam or SoundHound.
I love this explanation
Can we see the code?
Yes