The traditional approach to text analytics
In traditional customer experience systems, the analyst starts by building a model and predefining topics such as “customer satisfaction” and “price”. The analyst then creates an explicit list of words that belong to each topic. For each topic, the system then calculates sentiment, which is again based on a static list of words for each sentiment.
This approach has three main issues. First, creating the topics is very time-consuming, and the analyst risks missing words and expressions that rightfully belong to a certain topic. Second, and more importantly, the model may miss something that users feel is important but that the analyst didn’t think of. Finally, because it relies on static lists for sentiments, the model will either need constant updating or risk becoming obsolete as language usage changes over time.
Let’s take an example and assume that you are analyzing a restaurant chain. You may create topics for “food”, “service” and “premises” and start classifying your texts into these buckets. Now suppose that some of the restaurants have a hygiene problem that is making customers sick. If the analyst did not anticipate this, no topic will capture it and the problem will not show up in the analysis.
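To make the limitation concrete, here is a minimal sketch of keyword-list classification for the restaurant example. The topic names come from the text above; the word lists, the function name classify and the sample reviews are invented for illustration and do not describe any particular product.

```python
import re

# Hypothetical word lists that an analyst might define up front.
TOPIC_KEYWORDS = {
    "food": {"pizza", "pasta", "dish", "taste", "menu"},
    "service": {"waiter", "staff", "service", "rude", "friendly"},
    "premises": {"table", "decor", "parking", "noisy"},
}


def classify(text):
    """Return the sorted topics whose word list shares a word with the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(
        topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords
    )


reviews = [
    "The waiter was friendly and the pasta was great.",
    "Half of our group got sick after eating here.",
]

for review in reviews:
    print(classify(review), "<-", review)
```

The second review matches no list and so lands in no bucket, even though it describes the most serious problem in the data.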
Contrast that with Gavagai’s approach
Topics are calculated automatically, using our unsupervised AI. The system suggests probable topics, and the analyst’s job is to refine them, which is much easier than coming up with them from scratch. If food poisoning is indeed a problem, it will show up by itself. Finally, since the system is self-learning, it keeps up to date with new language usage.
N-grams are a powerful concept that deserves special attention. They are combinations of words that do not mean the same thing as the individual words. For example, “new” and “york” taken separately mean something completely different from “new york”. This also applies to expressions longer than two words. Gavagai has a unique ability to find N-grams automatically, which adds significant power to the system.
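One common textbook way to find such word pairs is to score adjacent words by pointwise mutual information (PMI): a pair that occurs together far more often than its individual word frequencies would predict is a candidate n-gram. The sketch below illustrates that general idea only. It is not a description of Gavagai’s method, and the function name find_bigrams, the thresholds and the toy corpus are assumptions made for this example.

```python
import math
from collections import Counter


def find_bigrams(sentences, min_count=2, min_pmi=1.0):
    """Score adjacent word pairs by PMI and keep the frequent, high-scoring ones."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    scored = {}
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue  # pairs seen only once are too unreliable to score
        p_pair = count / total_bigrams
        p_first = unigrams[first] / total_unigrams
        p_second = unigrams[second] / total_unigrams
        pmi = math.log2(p_pair / (p_first * p_second))
        if pmi >= min_pmi:
            scored[(first, second)] = pmi
    return scored


corpus = [
    "i moved to new york last year",
    "new york is a big city",
    "we bought a new car",
    "the new york subway was busy",
]

candidates = find_bigrams(corpus)
print(candidates)
```

On this toy corpus only the pair ("new", "york") passes both thresholds, while ("new", "car") is dropped because it occurs just once. Longer expressions could be handled by repeating the procedure on text in which detected pairs have already been merged into single tokens.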
For these reasons, Gavagai’s technology not only has better coverage (it finds more examples of what you are looking for) and higher accuracy (a higher proportion of texts are classified correctly), but is also cheaper to maintain. This means that Gavagai can support 46 different languages natively. Most other systems, to the extent that they support many languages at all, typically do so by keeping a single language model in English and auto-translating texts from the target language, which does not produce good results.
What is semantic memory?
Our semantic memories are inspired by how the human brain understands text. We humans learn what words mean simply from their usage and context. We do this effortlessly and seamlessly, and more often than we realize.
Gavagai’s technology works in a similar way. It learns the meanings of words by observing their usages and contexts, and it never stops learning: language evolves constantly, as do our semantic memories. Our semantic memories are built for Big Data; they learn from all available text data and are always online.
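The idea of learning meaning from context can be illustrated with a very small distributional model: represent each word by counts of the words that appear near it, then compare words by the cosine similarity of those count vectors. This is a generic sketch under simplifying assumptions (a tiny corpus, whitespace tokenization, a window of two words), not Gavagai’s implementation. The names build_context_vectors and cosine are made up for this example.

```python
import math
from collections import Counter, defaultdict


def build_context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


corpus = [
    "the pizza was delicious",
    "the pasta was delicious",
    "the pizza was cold",
    "the pasta was cold",
    "the waiter was rude",
]

vectors = build_context_vectors(corpus)
print("pizza vs pasta :", cosine(vectors["pizza"], vectors["pasta"]))
print("pizza vs waiter:", cosine(vectors["pizza"], vectors["waiter"]))
```

Because “pizza” and “pasta” occur in the same contexts, they score as more similar to each other than either does to “waiter”, without anyone having defined what the words mean. Adding new sentences simply updates the counts, which is the sense in which such a model can keep learning.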
If you invent a new word and start using it on social media, our models will have learned it in a matter of minutes. The same goes for new languages: as long as there are texts available, we can learn a semantic memory for that language. All this is made possible by clever engineering and the use of hyperdimensional representations.
Have a look under the hood
Would you like to sneak a peek into our semantic memory? Head over to Gavagai’s Living Lexicon, where you can see and explore the wordspaces supporting our text analytics software. Our English wordspace has a vocabulary of 1 million words, compared with about 40 thousand words in an English dictionary.
Look up words to see their current left-side neighbors, right-side neighbors, n-grams, semantically similar words, and related topics, and to better understand how our model works.
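As a rough illustration of what left-side and right-side neighbors mean, the sketch below counts which words appear immediately before and immediately after each word in a small corpus. How the Living Lexicon actually computes its neighbor lists is not described here, so the function neighbor_counts and the sample phrases are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def neighbor_counts(sentences):
    """Count the words found immediately to the left and right of each word."""
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for previous, following in zip(tokens, tokens[1:]):
            right[previous][following] += 1  # `following` sits to the right of `previous`
            left[following][previous] += 1   # `previous` sits to the left of `following`
    return left, right


phrases = ["new york city", "new york times", "in new york"]
left, right = neighbor_counts(phrases)

print("left of 'york' :", left["york"].most_common())
print("right of 'york':", right["york"].most_common())
```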
Data security
We offer multiple solutions to ensure data security, depending on your needs. All your sensitive data is secured using the latest encryption algorithms, and all of our products can be deployed on-site. Gavagai’s servers are located in Stockholm. We are not affiliated with Amazon Web Services (AWS) or Microsoft Azure.
Data integrity is important to us, and we are fully compliant with the General Data Protection Regulation (GDPR) and similar frameworks internationally.