This blog post gives a brief introduction to the information that word embeddings hold about words with multiple senses.
Word embeddings can be interpreted as points in a high-dimensional space, which I will refer to as a word space. In this word space, cosine similarity is commonly used to measure how similar two words are. Cosine similarity tells us that great is similar to good, terrific and huge. But is great similar to good in the same way as it is similar to huge? Of course not!
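As a quick illustration, here is what cosine similarity looks like in code. The three-dimensional vectors are hypothetical toy values chosen for this sketch (real embeddings have hundreds of dimensions), so only the relative ordering of the similarities matters:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy 3-dimensional vectors, purely for illustration.
great = np.array([0.9, 0.8, 0.1])
good  = np.array([0.8, 0.9, 0.2])
huge  = np.array([0.2, 0.1, 0.9])

print(cosine_similarity(great, good))  # high: great and good point in similar directions
print(cosine_similarity(great, huge))  # lower, but still positive
```

Both similarities are positive, yet the score alone cannot tell us that great relates to good and to huge in quite different ways.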
In a high-dimensional word space, a word can be similar to other words in completely different ways. In their 2015 paper, Amaru Gyllensten and Magnus Sahlgren propose the use of relative neighborhood graphs to explore the local neighborhood around a word. Simply put, a relative neighborhood graph links two words together if no third word is closer to both of them than they are to each other.
By constructing the relative neighborhood graph of great and some of its closest neighbors, we get a glimpse of the local structures found in the word space:
Great has four branches, connecting it to good, greatest, tremendous and terrific. This can be interpreted as great being similar to these words in different ways. On the other hand, words along the same branch are similar to great in the same way. For example, good and nice are both similar to great in one way, while tremendous and huge are similar to great in another.
The relative neighborhood graph reveals that the local structure in word space holds semantic information about different senses of a word. This can be useful for word sense disambiguation when querying for similar words in a word space. It is also interesting to keep in mind that this information is available for downstream machine learning models that operate on the word embeddings.
This was just a quick introduction to the usefulness of relative neighborhood graphs as a tool for exploring word embeddings. If you want to know more, I recommend reading Amaru and Magnus’s paper!
Master’s thesis student at Gavagai