But that's what I mean -- swapping seems indeed suprisingly easy to write incorrectly, just as you say.
But just randomly picking from a pile seems pretty easy to write correctly. No? And if you have an off-by-one error, your deck of 52 cards gets turned into 51, so it's not like you won't catch it...
Maintaining a "pile" of not-yet-selected items, and randomly removing elements from it, turns your shuffle from an O(n) algorithm into an O(n^2) one (unless you go out of your way to use a more sophisticated data structure than either an array or a linked list).
It's not likely to matter much if you're only shuffling 52 cards, but if you're writing a general-purpose shuffle algorithm, there's no reason to needlessly make it inefficient.
But just randomly picking from a pile seems pretty easy to write correctly. No? And if you have an off-by-one error, your deck of 52 cards gets turned into 51, so it's not like you won't catch it...