
Unicode audio analyzer

· 4 min read
Bruno Felix
Digital plumber, organizational archaeologist and occasional pixel pusher

What do Unicode, audio processing, and admittedly bad early-2000s Internet memes have to do with one another?

In the previous post in the deep dive into Unicode series we explored how combining characters, such as diacritics, work. One interesting property of Unicode is that multiple combining characters can be stacked on a single base character.

Stacked characters: ç̰̀́̂

The example above shows the letter c with several combining characters, and as we can see they stack up quite nicely. This is the basis for an early-2000s Internet meme called Zalgo text1. We can take this to the next level with a "winamp style" analyzer bar, rendered entirely in (zalgo) text for an extra metal look and feel. 🤘 🤘
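Stacking combining characters is really just string concatenation. As a minimal sketch (my own illustration, not the actual ZalgoPlayer implementation), a text "bar" of a given height can be built by appending combining marks to a base character:

```javascript
// Combining marks used for illustration: grave, acute, circumflex, and
// tilde below. Any mark from the Combining Diacritical Marks block
// (U+0300-U+036F) would work.
const MARKS = ["\u0300", "\u0301", "\u0302", "\u0330"];

// Build a zalgo "bar": a base character with `height` combining marks
// stacked onto it. Each mark is a separate code point, but they all
// render attached to the same base character.
function zalgo(base, height) {
  let out = base;
  for (let i = 0; i < height; i++) {
    out += MARKS[i % MARKS.length];
  }
  return out;
}

console.log(zalgo("c", 4)); // one base character plus four combining marks
```

A string built this way is still just a sequence of code points, which is why it can be rendered anywhere plain text is accepted.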

----------------------------------------------------------------

The reality is that this was really an excuse to play around with the Web Audio API2 and some modern React (I hadn't touched front-end development in a while), and I learned a few things in the process.

Implementation and technicalities

From an implementation perspective, the first challenge was to understand what the Web Audio API offers in terms of digital signal processing and how to use it. The documentation is excellent, and the gist of it is that audio operations happen in an Audio Context, which represents an audio processing graph built from several AudioNodes linked together in such a way that the output of one node serves as the input of the next. Because I wanted to extract frequency-domain data from the audio signal in order to render it on screen, I used an AnalyserNode3, which doesn't modify the audio but returns data about the frequency domain, computed with a trusty old FFT4.

The following code example puts all of these concepts together:

// audioNode is a React ref holding an <audio> element
const context = new AudioContext();
const theAnalyser = context.createAnalyser();
const source = context.createMediaElementSource(audioNode.current);
// build the audio processing graph, connecting the input source
// to the analyser node, and the output of the analyser to the
// output (i.e. the speakers) of the Audio Context.
source.connect(theAnalyser);
theAnalyser.connect(context.destination);
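With the graph in place, the analyser's frequency data can be polled each frame with analyser.getByteFrequencyData(bins), which fills a Uint8Array with magnitudes between 0 and 255. The helper below is a hypothetical sketch (the name and parameters are mine, not the package's) of how those magnitudes could be reduced to a handful of bar heights:

```javascript
// Hypothetical helper: reduce FFT magnitudes (0-255, as produced by
// analyser.getByteFrequencyData) into `barCount` bar heights in the
// range 0..maxStack, suitable for stacking combining characters.
function toBarHeights(bins, barCount, maxStack) {
  const binsPerBar = Math.floor(bins.length / barCount);
  const bars = [];
  for (let i = 0; i < barCount; i++) {
    // average the magnitudes that fall into this bar
    let sum = 0;
    for (let j = 0; j < binsPerBar; j++) {
      sum += bins[i * binsPerBar + j];
    }
    const avg = sum / binsPerBar;
    bars.push(Math.round((avg / 255) * maxStack));
  }
  return bars;
}

console.log(toBarHeights(new Uint8Array([255, 255, 0, 0]), 2, 8)); // [ 8, 0 ]
```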

Another interesting learning was about the advantages of requestAnimationFrame5 (RAF) over a plain old setInterval for rendering. Since I wanted performant and smooth updates, RAF was an interesting choice: the browser tries to match the display's refresh rate, and callbacks are paused when the page runs in a background or hidden tab, which means better performance and battery life.
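The pattern boils down to a callback that re-schedules itself. The sketch below is my own illustration, not the published component's code; the injectable `schedule` parameter is an addition so the loop can also run outside a browser:

```javascript
// Start a render loop. `schedule` defaults to requestAnimationFrame in
// the browser; injecting a scheduler makes the loop testable elsewhere.
function startRenderLoop(render, schedule) {
  const raf =
    schedule ||
    (typeof requestAnimationFrame !== "undefined"
      ? requestAnimationFrame
      : (cb) => setTimeout(cb, 16)); // ~60fps fallback outside a browser
  let running = true;
  function frame(time) {
    if (!running) return;
    render(time);
    raf(frame); // schedule the next frame only after rendering this one
  }
  raf(frame);
  // return a cancel function, handy for React effect cleanup
  return () => {
    running = false;
  };
}
```

Returning a cancel function fits naturally with React: a useEffect can start the loop and return the canceller as its cleanup.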

Finally, why not put everything together in a nice NPM package? Since I don't usually work in the JS ecosystem, this was a nice opportunity to get some hands-on experience. The npmjs6 documentation is very good, and the setup was straightforward, especially if you have already published packages to Maven Central, Artifactory, or an equivalent. Top marks there. You can find the package here: https://www.npmjs.com/package/@felix.bruno/zalgo-player and installation is of course super easy:

$ npm install @felix.bruno/zalgo-player

This being the JavaScript/TypeScript ecosystem, not everything was smooth sailing: I discovered that Create React App7 still doesn't support TypeScript 5, and the project seems somewhat inactive on GitHub, which is a bit of a bummer. After spending some time looking around, Vite8 seemed like a decent choice for setting up a basic React library with properly configured TypeScript support.

In this case, since I wanted to publish only a React component and not a full-blown web application, I had to make some changes to what Vite offers out-of-the-box9, but I am quite happy with the end result. The npm module is less than 15KB uncompressed and has no runtime dependencies (since this is a React component it can only be used in a React context, so React itself doesn't need to be shipped with the package).
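For reference, a sketch of what a Vite library-mode config for such a component might look like (the entry point and names here are illustrative assumptions, not the actual project's configuration):

```javascript
// vite.config.js -- library mode: build the component as a distributable
// package instead of bundling a full application.
import { defineConfig } from "vite";

export default defineConfig({
  build: {
    lib: {
      entry: "src/index.ts", // illustrative entry point
      name: "ZalgoPlayer",
      fileName: "zalgo-player",
    },
    rollupOptions: {
      // keep React out of the bundle; the consuming app provides it
      external: ["react", "react/jsx-runtime"],
    },
  },
});
```

Marking React as external is what keeps the published module small and dependency-free.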

The code is available on GitHub: https://github.com/felix19350/zalgo-player

Note: in a future iteration I will make the component responsive; for now it may not render very well on a mobile phone.


Footnotes

  1. Zalgo - Wikipedia

  2. Web Audio API docs

  3. Web audio visualizations and AnalyserNode docs

  4. FFT - Fast Fourier Transform. This video provides a nice intuition for how Fourier Transforms work in general, so go watch it!

  5. requestAnimationFrame documentation

  6. npmjs documentation

  7. Create React App

  8. Vite

  9. This article was quite helpful to get me up-to-speed on the changes that I needed to make in order to publish the ZalgoPlayer component as a library.