Ask HN: Is active music cancelation possible?
10 points by solardev 43 days ago | 15 comments
Any audio engineers out there? I don't know enough about waveforms, but I was wondering if it might be possible to combine active noise cancelation techniques (as in AirPods or other headphones) with music fingerprinting and waveform inversion in order to make headphones that can cancel out music?

For example, let's say you want to go to a coffee shop, but don't like the music that they play. Regular active noise cancelation headphones can filter out some of the background noise already, but what if they could also recognize the song that's playing (using existing fingerprinting techniques), download it, invert the waveform, and then use the microphone to measure delay and frequency shifts in real time and try to destructively cancel it out? (Only for the headphones wearer, not the actual source of the music.)

My hope is that while regular noise cancelation works best on repetitive waveforms (like an engine hum or an electrical whine) because it's limited to what the mic hears in real time, having the exact song downloaded ahead of time would allow you to more easily apply the corrections in sync with the waveform.

Is that conceivable?




Back in 2018 I spent a week or two messing around with this idea, and produced a hacky proof-of-concept.[1] It's not intended for real-time or production use; I just made a prototype to see if it could be done.

The README explains the method: once the contaminating song is identified, it syncs up the recordings in time with a correlation analysis, adjusts for frequency-dependent gain effects, then subtracts the undesired content.
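A minimal sketch of those three steps, not the tracksubtract code itself: it assumes a single integer delay and one broadband gain, whereas the README describes frequency-dependent gain. The names `best_lag` and `subtract_track` are hypothetical, and the pure-Python loops are only meant for short test signals.

```python
import math
import random


def best_lag(mix, ref, max_lag):
    """Delay (in samples) at which ref correlates most strongly with mix."""
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(ref), len(mix) - lag)
        score = sum(mix[lag + i] * ref[i] for i in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best


def subtract_track(mix, ref, max_lag):
    """Align ref to mix, fit one least-squares gain, and subtract it."""
    lag = best_lag(mix, ref, max_lag)
    n = min(len(ref), len(mix) - lag)
    num = sum(mix[lag + i] * ref[i] for i in range(n))
    den = sum(ref[i] * ref[i] for i in range(n))
    gain = num / den if den else 0.0
    cleaned = list(mix)
    for i in range(n):
        cleaned[lag + i] -= gain * ref[i]
    return cleaned, lag, gain


# Synthetic demo (assumed data): a noise-like "song" leaks into a quiet tone.
random.seed(0)
song = [random.uniform(-1, 1) for _ in range(400)]
wanted = [0.3 * math.sin(2 * math.pi * 5 * t / 500) for t in range(500)]
mix = list(wanted)
for i, s in enumerate(song):
    mix[37 + i] += 0.6 * s  # song arrives 37 samples late at 60% level

cleaned, lag, gain = subtract_track(mix, song, max_lag=100)
print(lag, round(gain, 2))
```

A real recording would also need sub-sample alignment and per-band gain, which is presumably where most of the quality is lost or won.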

Warning: I'm not an audio engineer, and the output sound quality is NOT good! This was just a toy project in my early days of learning to code. I assume there are much better ways to approach this that would yield significantly better results.

[1] https://github.com/mitchellpkt/tracksubtract


This is really cool, thank you for sharing! It's a great proof of concept that significantly reduces the song volume.

I'll have to take a look at the code in depth to try to understand how the signal processing works. I really appreciate you making it open source and providing great documentation too!


The tricky part would be maintaining the equalization + delay matching in a changing environment - imagine a person suddenly walks in between you and the speaker that's playing the music.

Unless your correction system can respond in real-time, you've accidentally created an audio-frequency bistatic radar :P


If the waveform is just slightly out of sync, it should still be partially destructive, right? If a person walks in between, maybe it reduces the cancelation by a bit for a few seconds, but it should still be quieter than not having any cancelation... I think?
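One way to check that intuition for a single pure tone: subtracting a copy that is off by a phase error phi leaves a residual with amplitude 2*|sin(phi/2)| relative to the original, so cancelation only helps while phi stays under 60 degrees. A rough sketch (the function name and the 0.1 ms timing error are assumptions for illustration):

```python
import math


def residual_gain_db(freq_hz, timing_error_s):
    """Level of (tone minus mistimed copy) relative to the tone alone, in dB."""
    phase_error = 2 * math.pi * freq_hz * timing_error_s
    amplitude = 2 * abs(math.sin(phase_error / 2))
    if amplitude == 0:
        return float("-inf")  # perfect cancelation
    return 20 * math.log10(amplitude)


# Same 0.1 ms timing error, different frequencies.
for freq in (100, 1000, 5000):
    print(freq, round(residual_gain_db(freq, 0.0001), 1))
```

With a fixed timing error, low frequencies are still attenuated, while high frequencies can come out louder than with no cancelation at all (up to about +6 dB when the copy is half a period off).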

I think the correction system would necessarily have to be real time, like regular active noise cancelation already is. The main difference would just be that it has an additional input source (the downloaded waveform) that it can use to assist what the microphone hears.

I think normal noise cancelation continuously records for a few milliseconds and then uses that waveform for the cancelation, while continuing to record and update in real time. This would still be that same approach, except it knows what's going to happen the next second because it already has the waveform downloaded.
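Feeding the known song in as an extra input resembles a classic adaptive noise canceller, where a reference signal is filtered to match what the microphone hears and then subtracted. Below is a rough normalized-LMS sketch under strong assumptions: the downloaded track is already time-aligned, the room behaves like a short fixed FIR filter, and nothing else is playing. `nlms_cancel`, the tap count, and the step size are illustrative choices, not how any particular headphone works.

```python
import random


def nlms_cancel(mic, ref, taps=8, mu=0.2, eps=1e-8):
    """Adapt an FIR filter so filtered ref tracks mic; return residual and weights."""
    w = [0.0] * taps
    residual = []
    for n in range(len(mic)):
        # Latest reference samples, newest first, zero-padded before the start.
        x = [ref[n - k] if n >= k else 0.0 for k in range(taps)]
        estimate = sum(wk * xk for wk, xk in zip(w, x))
        e = mic[n] - estimate  # what is left after cancelation
        power = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / power for wk, xk in zip(w, x)]
        residual.append(e)
    return residual, w


# Demo (assumed data): the "room" delays the song by one sample and adds echoes.
random.seed(1)
song = [random.uniform(-1, 1) for _ in range(4000)]
room = [0.0, 0.5, 0.3, -0.1]
mic = [sum(room[k] * song[n - k] for k in range(len(room)) if n >= k)
       for n in range(len(song))]

residual, weights = nlms_cancel(mic, song)
tail_in = sum(v * v for v in mic[-500:])
tail_out = sum(v * v for v in residual[-500:])
print(tail_out / tail_in)
```

In a real setting the residual would also contain everything that is not the song (voices, clatter), which is the point; the hard parts this sketch skips are clock drift, a changing room response, and the latency budget at the ear.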


It's a very interesting problem. I'm worried about echoes and distortion from crappy speakers, but it looks totally possible.


There’s comb filtering too.
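For context, comb filtering is what happens when the direct sound and a delayed reflection add at the listener: some frequencies reinforce and others nearly cancel, so the music at the ear no longer matches the clean download. A small sketch for one reflection, with a hypothetical 1 ms extra path delay and a reflection gain of 0.8 (both made-up values):

```python
import cmath
import math


def comb_gain_db(freq_hz, delay_s, reflection_gain=1.0):
    """Level of direct sound plus one delayed reflection, relative to direct alone."""
    h = 1 + reflection_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    mag = abs(h)
    return 20 * math.log10(mag) if mag > 0 else float("-inf")


# Notches fall at odd multiples of 1 / (2 * delay): 500 Hz, 1500 Hz, ...
for freq in (250, 500, 1000, 1500):
    print(freq, round(comb_gain_db(freq, 0.001, 0.8), 1))
```

Because the notch positions depend on the delay, they shift whenever the listener or a reflecting surface moves, which is one reason a fixed equalization curve would not stay matched.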


I have wanted this at the speaker level: to be able to cancel environmental noise coming through, e.g., a window. A library of cancellation patterns, delivered through an aptly named sound bar.


If the noise is repetitive and predictable (like from a motor) and the environment is relatively stable, I think there are existing industrial noise cancelation systems that can do that. Some cars have that built in too.


Here is one approach that works: block/reduce the majority of sound physically. And then use a microphone with an adaptive filterbank to let in only the sounds you want to have. This is how hearing protection with speech passthrough works.

Masking with a static noise source can also help.


A. Time Domain

Fingerprinting a song introduces latency because the song has to play long enough to get an adequate sample for the fingerprint.

What would the headphones do before a match is established?

B. Space Domain

Canceling requires a waveform 180 degrees out of phase. It needs to be about the same size as the song. If you stream that waveform, there’s latency and mechanical rights issues. If it is stored locally, you need space for all-the-songs.

C. Music is bigger than Texas

Remixes, covers, recordings of live performances add complexity to fingerprinting and increase the size of the phased wave database.

Good luck.


I don't think it has to be perfect to be useful.

A) Sure, it takes a few seconds to not only fingerprint but also start downloading the song. That's okay. It would just sound like normal noise cancelation until the music cancelation begins a few seconds later.

B) I don't see why you can't just download the current song that's playing. It should only take a few seconds over a typical phone or wifi connection.

C) Remixes and covers can already be identified by existing fingerprinting apps, can't they? It probably just won't work for live performances, but why would someone even go to a live performance if they don't want to hear the music?
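A back-of-the-envelope check of point B, using assumed numbers that are not from the thread (a 4-minute track, 256 kbit/s encoding, a 10 Mbit/s link):

```python
def track_bytes(duration_s, bitrate_bps):
    """Approximate encoded size of a track in bytes."""
    return duration_s * bitrate_bps / 8


def download_seconds(duration_s, bitrate_bps, link_bps):
    """Time to fetch that much audio over a link with the given throughput."""
    return duration_s * bitrate_bps / link_bps


full = download_seconds(240, 256_000, 10_000_000)
first_10s = download_seconds(10, 256_000, 10_000_000)
print(track_bytes(240, 256_000) / 1e6, full, first_10s)
```

Under those assumptions the whole track is about 7.7 MB and takes roughly 6 seconds, while the first 10 seconds of audio arrive in about a quarter of a second. This ignores fingerprint lookup time, real-world throughput, and licensing.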

Even if it takes a few seconds per song to initiate, requires a connected smartphone or watch, and only works 90% of the time... that'd still be a huge improvement for me and some noise-sensitive people I know, in the use case I'm imagining (trying to tune out background music while working).

I wish I knew how to build something like this... I'd gladly pay for it if someone else built it.


> but why would someone even go to a live performance if they don't want to hear the music

Once I went with my wife for dinner in a restaurant and there was an unexpected live tango pianist. It was nice, but sometimes I prefer a quiet dinner.

Also, I think the GP was talking about the multiple versions of live performances that are recorded and played later on radio or TV. Each one has minor variations, but identifying the exact one may be tricky.


> It was nice, but sometimes I prefer a quiet dinner.

Oh yeah, I get that. Often I have to leave when trying to work at a place once the music starts.

On the other hand, my gf probably wouldn't be too happy with me if I put on noise-canceling headphones while she's trying to talk to me at dinner :)

If the technology could be further tuned to not only isolate unwanted music and noise, but also amplify & enhance desired frequencies (like your wife's specific voiceprint), that'd be cool. The ultimate hearing aid?

> Also, I think the GP was talking about the multiple versions of live performances that are recorded and played later on radio or TV. Each one has minor variations, but identifying the exact one may be tricky.

Oh, gotcha. That makes more sense. I might try to build a simple PoC in software, and that'll be one of the edge cases to consider.


If regular noise canceling headphones are good enough while waiting on latency, they are probably good enough after that for three or more sigmas of people who use noise canceling headphones for coffee shop music.

People who noise cancel coffee shop music are only a minority fraction of coffee shop patrons. And people spend only a small fraction of their lives in coffee shops.

Anyway, building it is the simplest way to learn how to build it, and the way to get what you want or understand its practicality or impracticality. Paying someone else to develop it is a possibility, but typically naught but a logical one I’ve included for completeness.

For reference, I was gifted some AirPods Pro and they handle the coffee shop reasonably well provided I have music or similar playing in them. When I want just silence, I use cheap earplugs.


You're right, it would probably be a niche thing. I just wish they existed so I could buy some for myself to use.

I have no background in audio processing at all beyond fiddling around in Audacity a bit. It would be a cool side-project rabbit hole to dive into, though! Another user, mitchellpkt, has an open-source project doing this same thing on recorded audio tracks: https://news.ycombinator.com/item?id=41088865

It's not real-time, but it's still way beyond anything I know how to do right now. Time to learn!



