I think everyone knows there's only one correct answer, and that's phase cancellation: the dry vocal track is typically dead center in the mix.
Here it happens because when the earphones' ground connection is disconnected from the source, the earpiece drivers remain connected to the L and R hot signals, and their former common grounding point floats, leaving them in series. This means that any signal present equally on L and R drives no current through the transducers. You hear only the difference signal in both ears, although to be precise, one ear gets it in opposite polarity.
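As a sketch of that difference-signal behavior (a toy model with made-up sample values, ignoring driver impedance and levels):

```python
# With the ground floating, the two drivers sit in series between the
# L and R hot wires, so the voltage across each is the channel difference.
# (Assumption: idealized identical drivers, no impedance details.)

def floating_ground_output(left, right):
    """Per-ear signals when the common ground is disconnected."""
    left_ear = [l - r for l, r in zip(left, right)]   # hears L - R
    right_ear = [r - l for l, r in zip(left, right)]  # hears R - L (opposite polarity)
    return left_ear, right_ear

vocal = [0.5, -0.2, 0.7]        # identical in L and R: dead center
guitar_left = [0.3, 0.1, -0.4]  # panned hard left
L = [v + g for v, g in zip(vocal, guitar_left)]
R = vocal[:]

left_ear, right_ear = floating_ground_output(L, R)
# The centered vocal cancels; each ear hears only the guitar,
# in opposite polarity between the two ears.
```

Anything common to both channels simply never produces a voltage across either driver, which is why there is nothing left to "cancel" acoustically.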
Why repeat exactly what's in the linked page? Probably half the people reading that Stack Overflow question from HN have EE degrees, so the answer was obvious to most, but I don't see the point of just copying the answer over to HN.
I didn't think anyone had covered the small leap from "disconnected ground" to "you get a difference signal" as presented in other comments. Glad to hear there are a lot of EEs here!
As a non-EE, something that confused me briefly is that "L-R" doesn't seem symmetric even though the scenario is symmetric. But I see now that you'd hear L-R in your left ear and R-L in your right ear. The R current goes the "wrong way" through the left speaker and causes the coil to move opposite to the way it does in the right speaker (and vice versa).
Discovering this trick as a kid was one of the many steps on the path to my fascination with all things acoustic. The widening effect of having the two earpieces out of phase was particularly intriguing to my young mind.
The answer from the link said 'in-phase cancellation', which isn't really a 'thing', nor is it accurate. He was just trying to say that when the signals are in phase you wouldn't get sound. There isn't actually any cancellation going on here. When the signal is the same in both channels there is just no voltage across the speaker, so it doesn't do anything.
Their signals cancel out because they are in phase; that is a thing. It describes what is happening and why. From my perspective, you are arguing semantics more than a technical point. I agree with the physical reason; just because it is simple does not mean you cannot have a term for it in this case.
I am, in a way, arguing semantics, because I believe it's confusing to non-EE people reading this to use a term like 'phase cancellation' in a way that is very different from its usage in almost every other context. In general it refers to the sum of two waves equaling zero (or some reduced value), i.e. destructive interference. In this case the opposite is happening: the vocals are removed not due to a sum of the waves but rather a difference equating to zero.
Yes, we have interacted there a few times. I would probably use the term "common-mode cancellation." It is a very basic concept in EE, so the other terminology makes sense to me, although for non-EEs I can see the issue.
You should put your EE website in your Hacker News profile. Your account looks somewhat bare there; add some information!
There is no phase inversion here. When the signals are the same there is just no voltage across the speaker. Think of it as putting 5 volts on both sides of the speaker: the voltage difference from one side of the speaker to the other is 0 volts, so no current flows and the speaker is at rest.
Stereo audio has completely separate audio signals for the left and the right channels. One of the reasons to use separate channels is to allow creation of a "sound stage". If a track is well mastered/mixed and you have good speakers/headphones, you should be able to pick out each performer's location as if they were on a stage in front of you. For example, you should be able to discern where the lead guitarist is standing vs. where the bass guitarist is standing. This is done by controlling the volume and phase of the sound in each channel for each performer. Generally the vocalist is placed dead center of the "sound stage", and the way to achieve that effect with stereo audio is to feed the exact same signal (in both phase and volume) into both the left and right channels. The 'mix' is just a term used for the way in which the various recordings of each performer are combined to create the final product.
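A toy sketch of amplitude panning onto that "sound stage" (the equal-power pan law used here is one common choice, an assumption, not necessarily what any particular mix uses):

```python
import math

def pan(signal, position):
    """Equal-power pan: position -1.0 = hard left, 0.0 = dead center, +1.0 = hard right."""
    angle = (position + 1.0) * math.pi / 4.0   # maps -1..+1 onto 0..pi/2
    gain_l, gain_r = math.cos(angle), math.sin(angle)
    return [gain_l * s for s in signal], [gain_r * s for s in signal]

vocal_l, vocal_r = pan([1.0, 0.5], 0.0)   # dead center: identical level in L and R
lead_l, lead_r = pan([1.0, 0.5], -0.7)    # lead guitar placed toward the left
# Summing each performer's panned channels gives the final stereo mix;
# only the center-panned parts end up identical in L and R.
```

Since only the dead-center performer is identical in both channels, it is exactly (and only) that performer who disappears in the floating-ground trick.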
There are a lot of fun things you can do acoustically with stereo sound. Some phasing effects can actually be pretty 'trippy', for lack of a better word.
Interesting. Any comment on how they get the "front of stage" and "back of stage" effect? I used to not believe this was possible, until I listened to a good recording on good speakers and could place the bass player clearly to the left and behind the singer.
Some depth can be modeled with phase control. By controlling the phase of a signal relative to another you can create a perceived time delay, which makes it appear as though one performer is behind another. I'd also add that this is much easier to do with low frequencies (bass player) due to the wavelength being so long. At 49 Hz (G1 on a bass guitar) the acoustic wavelength is ~6.9 meters, which means even small phase shifts (time delays) can create meaningful depth. This approach is pretty useless at higher frequencies, as the time delay (and depth) achievable gets very small.
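Back-of-the-envelope numbers for that (assuming ~343 m/s for the speed of sound, which lands near the ~6.9 m figure; the 45-degree shift is just an arbitrary example value):

```python
speed_of_sound = 343.0   # m/s, room-temperature assumption
freq = 49.0              # Hz, roughly G1 on a bass guitar

wavelength = speed_of_sound / freq       # ~7.0 m
delay_per_degree = 1.0 / (360.0 * freq)  # seconds of delay per degree of phase

# The same 45-degree phase shift implies a hundredfold smaller path
# difference two decades up in frequency, which is why the depth trick
# fades out for high-pitched sources:
path_diff_49hz = 45.0 * delay_per_degree * speed_of_sound      # ~0.88 m
path_diff_4900hz = 45.0 / (360.0 * 4900.0) * speed_of_sound    # ~0.0088 m
```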
There are other ways to do this via more complex 3D acoustical modeling, mostly focused on modeling reverberation effects, but you don't see that much in music recording. It is used a lot in games, though.
In addition to what mbell said, another simple approach for making one sound seem further away than another is changing the balance of the direct vs. reverberant sound (more distant == more reverberant sound, lower direct sound volume). This would happen naturally in a stereo recording with two microphones, and can also be done artificially.
Predelay time on the input to the reverb. The longer the gap between the direct and reflected sound, the closer to you the direct sound will seem to be.