
Oh, thanks for the correction.

If all the vectors are on the unit sphere, then cosine similarity equals the dot product. And then the dot product is just a linear transformation away from the squared Euclidean distance:

https://math.stackexchange.com/questions/1236465/euclidean-d...

If you're using it in a machine learning model, things that are one linear transform away are more or less interchangeable (the model might need a few more parameters/layers/etc. to absorb the difference).

If you're using it for classical statistics (analytics), then right, they're not equivalent, and it's good to keep that distinction in mind.




To be very explicit: if |x| = |y| = 1, we have |x - y|^2 = |x|^2 - 2(x·y) + |y|^2 = 2 - 2(x·y) = 2 - 2*cos(th). So they are not identical, but minimizing the Euclidean distance between two unit vectors is the same as maximizing the cosine similarity.
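A quick numerical sanity check of that identity (a minimal sketch using numpy; the random vectors and dimension are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)
y = rng.normal(size=5)

# Project onto the unit sphere so |x| = |y| = 1
x /= np.linalg.norm(x)
y /= np.linalg.norm(y)

cos_sim = x @ y                 # dot product == cosine similarity on unit vectors
sq_dist = np.sum((x - y) ** 2)  # squared Euclidean distance |x - y|^2

# The identity above: |x - y|^2 = 2 - 2*cos(th)
assert np.isclose(sq_dist, 2 - 2 * cos_sim)
```

So ranking neighbors by smallest Euclidean distance or by largest cosine similarity gives the same ordering, as long as everything is normalized first.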



