GloVe
GloVe is a very important word-vectorization tool in NLP. Its name is an acronym for Global Vectors for Word Representation: global vectors used to represent words.
GloVe was first introduced in 2014 and is an innovation of Stanford University. It is an unsupervised algorithm that has already been trained on a huge corpus to learn how closely words are related to each other, building a co-occurrence matrix of the words and placing related words near each other, and unrelated words far apart, in the vector space.
The main advantage of GloVe over Word2Vec and over training embedding values from scratch is that GloVe does not compute word vectors only from the local data being trained on, but from the huge global corpora on which Stanford trained it. This makes the resulting representations far more reliable, even when our own text contains few words. In other words, GloVe combines the benefit of the local semantics of the text under analysis with the global usage of each word captured by the Stanford project.
It is based on the idea of relationships between words: for example, the close and distant relationships among south, north, east, and west; among father, mother, brother, and sister; or among any other set of related words. GloVe is therefore used to capture semantic relationships such as synonyms, antonyms, relationships between pieces of equipment, comparatives, superlatives, cities and their countries, and so on.
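As an illustration of these relationships (this example is not from the original article), pretrained GloVe vectors can be queried through the gensim library's downloader; the model name below is one of the pretrained GloVe models gensim ships, and loading it requires a one-time download:

```python
import gensim.downloader as api

# Load pretrained GloVe vectors (100-dimensional, trained on Wikipedia + Gigaword).
glove = api.load("glove-wiki-gigaword-100")

# Words with related meanings sit close together in the vector space.
print(glove.most_similar("north", topn=3))

# Analogy arithmetic: father - man + woman should land near "mother".
print(glove.most_similar(positive=["father", "woman"], negative=["man"], topn=1))
```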
The training of GloVe is based on calculating what is called the co-occurrence matrix. It is a square matrix whose rows and columns are all the unique words in the text (after removing repetitions), and the values in its cells indicate the extent to which each pair of words appears together in a specific context or a specific sentence.
For example, the co-occurrence matrix of the following text:
Amazon Alexa developed by Amazon. First used in Amazon Echo.
with a window size of 1 (window size is explained below) is as follows:

            Amazon  Alexa  developed  by  First  used  in  Echo
Amazon         0      1        0       1    0      0    1    1
Alexa          1      0        1       0    0      0    0    0
developed      0      1        0       1    0      0    0    0
by             1      0        1       0    0      0    0    0
First          0      0        0       0    0      1    0    0
used           0      0        0       0    1      0    1    0
in             1      0        0       0    0      1    0    0
Echo           1      0        0       0    0      0    0    0
To calculate the co-occurrence matrix, we must first choose the window size: the number of words to look at to the right and to the left of the target word. If the window size equals 1, we calculate the relationship of each word with only the previous word and the next word. If the window size equals 2, then the two words before and the two words after the target word are considered, and so on. There is one condition: the target word and the words inside its window must belong to the same sentence. If the target word is at the beginning of a sentence, we do not look at the word before it, even if it is adjacent, because it belongs to a different sentence. Likewise, if the target word is at the end of a sentence, we do not look at the word after it, because that word starts a new sentence. A small sketch of this procedure is shown below.
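The following minimal Python sketch (the function name and the simplistic tokenization are my own; a real pipeline would use a proper tokenizer) builds such co-occurrence counts while respecting sentence boundaries:

```python
from collections import defaultdict

def co_occurrence_counts(text, window_size=1):
    # Very simple splitting: sentences on ".", tokens on whitespace.
    sentences = [s.split() for s in text.split(".") if s.strip()]
    counts = defaultdict(int)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window_size)
            hi = min(len(tokens), i + window_size + 1)
            for j in range(lo, hi):
                if j != i:
                    # Neighbours are counted only within the same sentence,
                    # so sentence boundaries are respected automatically.
                    counts[(word, tokens[j])] += 1
    return counts

counts = co_occurrence_counts("Amazon Alexa developed by Amazon. First used in Amazon Echo.")
print(counts[("Amazon", "Alexa")])  # 1
print(counts[("Amazon", "Echo")])   # 1
print(counts[("Amazon", "First")])  # 0: adjacent in the text, but in different sentences
```

Now, we must look at what is called the probability ratios table.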
The Probability Ratios Table
In the probability ratios table, each column represents a word that we want to measure, and each row represents a topic. The values in the table represent the probability that the word appears in the context of that specific topic, while the last row is the first row divided by the second, i.e. the ratio of the first probability to the second. A well-known example, taken from the original GloVe paper (Pennington et al., 2014), uses the two topics ice and steam:

                            k = solid   k = gas    k = water   k = fashion
P(k | ice)                   1.9e-4      6.6e-5     3.0e-3      1.7e-5
P(k | steam)                 2.2e-5      7.8e-4     2.2e-3      1.8e-5
P(k | ice) / P(k | steam)     8.9         8.5e-2      1.36        0.96
The values in the ratio row indicate the extent to which a word is associated with one topic more than the other: a ratio much greater than 1 (solid) means the word belongs with ice, a ratio much smaller than 1 (gas) means it belongs with steam, and a ratio close to 1 (water, fashion) means the word relates to both topics equally or to neither.
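These probabilities come straight out of the co-occurrence matrix: P(k | i) is the count of word k appearing near word i, divided by the total co-occurrence count of word i. A short sketch with made-up counts (the numbers below are hypothetical, chosen only to mimic the shape of the table above):

```python
# Hypothetical co-occurrence counts X[topic][word]; real GloVe gathers these
# from corpora with billions of tokens.
X = {
    "ice":   {"solid": 190, "gas": 66,  "water": 3000, "fashion": 17},
    "steam": {"solid": 22,  "gas": 780, "water": 2200, "fashion": 18},
}

def prob(topic, word):
    # P(word | topic) = count(topic, word) / total count for the topic.
    return X[topic][word] / sum(X[topic].values())

for k in ["solid", "gas", "water", "fashion"]:
    print(f"P({k}|ice) / P({k}|steam) = {prob('ice', k) / prob('steam', k):.2f}")
```

These ratios reproduce the pattern above: large for solid, small for gas, and near 1 for water and fashion.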
References:
1- Hesham Asem, NLP, GloVe, Part Two.
2- Jeffrey Pennington, Richard Socher, and Christopher D. Manning, "GloVe: Global Vectors for Word Representation," EMNLP 2014.