Distance Metrics

2024.11.07 09:32

Use the distance metric that matches the model that you’re using.

Cosine Similarity

Cosine similarity measures the cosine of the angle between two vectors in a multi-dimensional space, with the idea that similar vectors point in a similar direction. It ranges from -1 (opposite directions) through 0 (perpendicular) to 1 (same direction). Cosine similarity is commonly used in Natural Language Processing (NLP) because it measures the similarity between documents regardless of their magnitude.
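As a minimal sketch in plain Python (the function name `cosine_similarity` is hypothetical here, not taken from any particular library), the similarity can be computed as the dot product of the two vectors divided by the product of their lengths:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Same direction, different magnitude: the result is approximately 1.
print(cosine_similarity([1, 2, 3], [2, 4, 6]))
# Perpendicular vectors: the result is 0.
print(cosine_similarity([1, 0], [0, 1]))
```

Scaling either vector leaves the result unchanged, which is why the measure ignores magnitude.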

Dot Product

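The dot product takes two vectors of equal length, multiplies their corresponding components, and sums the results. It is also known as the scalar product since the output is a single (scalar) value. The dot product shows the alignment of two vectors and, unlike cosine similarity, also depends on their magnitudes. It is positive if the vectors point in broadly the same direction (the angle between them is less than 90°), zero if they are perpendicular, and negative if they point in broadly opposite directions (the angle is greater than 90°).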
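A short sketch of the calculation, assuming plain Python lists as vectors (the name `dot_product` is chosen only for illustration):

```python
def dot_product(a, b):
    """Sum of the products of corresponding components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

print(dot_product([1, 2], [3, 4]))    # 1*3 + 2*4 = 11 (similar direction)
print(dot_product([1, 0], [0, 1]))    # 0 (perpendicular)
print(dot_product([1, 2], [-1, -2]))  # -1 - 4 = -5 (opposite direction)
```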

Squared Euclidean (L2-Squared)

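The L2 norm takes the square root of the sum of the squared vector values, and the Euclidean distance between two vectors is the L2 norm of their difference. The squared Euclidean distance skips the square root: it is simply the sum of the squared differences between corresponding components. Because the square root is monotonic, both versions rank neighbors in the same order, and the squared form is cheaper to compute.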
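As a rough illustration in plain Python (the function name `squared_euclidean` is hypothetical), the squared distance sums the squared component differences, and taking its square root gives the ordinary Euclidean distance:

```python
import math

def squared_euclidean(a, b):
    """Sum of squared differences between corresponding components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))

d2 = squared_euclidean([0, 0], [3, 4])
print(d2)             # 3**2 + 4**2 = 25
print(math.sqrt(d2))  # Euclidean (L2) distance: 5.0
```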

Manhattan (L1 Norm or Taxicab Distance)

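The L1 norm is calculated by taking the sum of the absolute values of the vector. The Manhattan distance between two vectors is the L1 norm of their difference, that is, the sum of the absolute differences between corresponding components. It can be cheaper to calculate than the Euclidean distance since it needs no squaring and no square root. Its value is never smaller than the Euclidean distance between the same two vectors.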
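A minimal sketch, again assuming plain Python lists and a hypothetical function name `manhattan`:

```python
def manhattan(a, b):
    """Sum of absolute differences between corresponding components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))

# |0 - 3| + |0 - 4| = 7; the Euclidean distance for the same pair is 5.
print(manhattan([0, 0], [3, 4]))
```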

Hamming

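The Hamming distance is a metric for comparing two vectors of equal length. It counts the positions at which the corresponding elements differ, that is, how many changes are needed to convert one vector into the other. The fewer changes required, the more similar the vectors. It can be used to compare two numeric vectors or two binary vectors; for binary vectors it is the number of differing bits.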
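The following sketch shows both cases under simple assumptions: vectors are Python lists, and binary vectors may also be packed into integers. The names `hamming` and `hamming_bits` are illustrative only.

```python
def hamming(a, b):
    """Number of positions where two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)

def hamming_bits(a, b):
    """Hamming distance between two binary vectors packed into integers."""
    # XOR leaves a 1 bit wherever the inputs differ; count those bits.
    return bin(a ^ b).count("1")

print(hamming([1, 0, 1, 1], [1, 1, 0, 1]))  # positions 1 and 2 differ: 2
print(hamming([7, 3, 9], [7, 4, 9]))        # one numeric element differs: 1
print(hamming_bits(0b1011, 0b1101))         # 0b0110 has two set bits: 2
```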
