How to Compute Word Embeddings with Pure Optimization

Word embeddings are one of the most important achievements in Natural Language Processing (NLP) of recent years. Unlike the ever-smaller percentage improvements on some famous task, which are usually the highlight of many papers, word embeddings are a conceptual achievement. Today they are so well known that any further words of mine about them would be quite useless. But just to set the pace: word embeddings are vectors of real numbers in a fairly high-dimensional space (anywhere from 50 dimensions up), such that words that are "related" have similar embeddings.
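
To make the idea concrete, here is a minimal sketch with made-up toy vectors (real embeddings have 50+ dimensions and are learned, not hand-written); cosine similarity is one common way to quantify how "similar" two embeddings are:

```python
import numpy as np

# Toy 4-dimensional embeddings, invented purely for illustration.
embeddings = {
    "king":  np.array([0.8, 0.3, 0.1, 0.6]),
    "queen": np.array([0.7, 0.4, 0.2, 0.6]),
    "apple": np.array([0.1, 0.9, 0.7, 0.0]),
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# "Related" words should score higher than unrelated ones.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # ~0.99
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # ~0.35
```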