Ever Wonder How the Shazam Algorithm Works?

Time: 2025-04-26 20:52:07


Your phone's ability to identify any song it listens to is pure technological magic. In this article, I'll show you how one of the most popular apps, Shazam, does it. Interestingly, the founders of Shazam released a paper in 2003 documenting how it works, and I have been working on an open source implementation of that paper in a project I call abracadabra.

Where the paper doesn't explain something, I will fill in the gaps with how abracadabra approaches it. I've also included links to the corresponding part of the abracadabra codebase in relevant sections so you can follow along in Python if you prefer.

Granted, the state of the art has moved on since this paper was published, and Shazam has probably evolved its algorithm since it was acquired by Apple in 2018. However, the core principles of audio identification systems have not changed, and the accuracy you can obtain using the original Shazam method is impressive.

To get the most out of this article, you should understand:

Frequency and pitch
Frequency is "how often" something happens, or the number of cycles a soundwave completes in a second, measured in hertz (Hz). Pitch is the human perception of the frequency of sound, with higher frequencies being heard as higher pitches and lower frequencies as lower pitches.
Waves
Waveforms are the shapes or patterns that sound would make if you could see it. They show how the air moves back and forth when something makes a noise.
Graphs and axes
Graphs are pictures that show information using lines, dots, or bars. Axes are the two lines on a graph that help you see where the information belongs, with one line usually going side to side (horizontal) and the other going up and down (vertical).
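
If you'd like to see these three ideas together, here is a minimal Python sketch. It is not taken from abracadabra, and the function names `sine_wave` and `estimate_frequency` are made up for this illustration. It builds the waveform of a pure tone as a list of samples (time on the horizontal axis, amplitude on the vertical axis) and then roughly estimates its frequency by counting how many times the wave crosses zero on the way up in one second.

```python
import math


def sine_wave(frequency_hz, duration_s=1.0, sample_rate=8000):
    """Return the samples of a pure tone.

    Each sample is the wave's amplitude (vertical axis) at one
    point in time (horizontal axis).
    """
    n_samples = int(duration_s * sample_rate)
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate)
        for n in range(n_samples)
    ]


def estimate_frequency(samples, sample_rate=8000):
    """Roughly estimate frequency in Hz from upward zero crossings.

    The wave crosses zero going upward once per cycle, so the
    number of such crossings per second approximates the frequency.
    This only works for a clean, single tone; real music needs the
    techniques described in the rest of the article.
    """
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if prev < 0 <= cur
    )
    duration_s = len(samples) / sample_rate
    return crossings / duration_s


if __name__ == "__main__":
    a4 = sine_wave(440)   # the note A above middle C
    a5 = sine_wave(880)   # one octave higher: double the frequency
    print("A4 estimate (Hz):", estimate_frequency(a4))
    print("A5 estimate (Hz):", estimate_frequency(a5))
```

The estimate can be off by about one cycle because the wave may start or stop partway through a cycle. The tone at 880 Hz completes twice as many cycles per second as the one at 440 Hz, which is why we hear it as a higher pitch.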
