Natural Language Processing
The NLP module is an unsupervised machine learning module for analyzing text data: it builds a topic model to find hidden semantic structure in documents. The module comes with a wide range of built-in text pre-processing techniques. Pre-processing is the fundamental first step in any NLP problem, because it transforms raw text into a format that machine learning algorithms can learn from.
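To make the idea of pre-processing concrete, here is a minimal sketch in plain Python. It is not the module's own implementation: the `preprocess` and `bag_of_words` helpers and the tiny stop-word list are hypothetical names chosen for illustration. It lowercases the text, keeps alphabetic tokens, drops stop words, and turns each document into a row of term counts that an algorithm can consume.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(documents):
    """Return (vocabulary, count matrix) for a list of raw strings."""
    tokenized = [preprocess(d) for d in documents]
    vocabulary = sorted({t for doc in tokenized for t in doc})
    counts = [Counter(doc) for doc in tokenized]
    # One row per document, one column per vocabulary term.
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


raw = ["The cat sat on the mat.", "Dogs and cats are pets!"]
vocabulary, matrix = bag_of_words(raw)
```

Real pipelines usually add further steps such as lemmatization or bigram extraction; this sketch only shows the shape of the transformation from raw strings to numeric rows.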
The NLP module supports only the English language and provides several popular implementations of topic models, from Latent Dirichlet Allocation to Non-Negative Matrix Factorization. It has over 5 ready-to-use algorithms and over 10 plots for analyzing text. The NLP module also implements a unique function that allows you to tune the hyperparameters of a topic model to optimize a supervised learning objective.
What is a topic model? In machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for discovering hidden semantic structures in a body of text.
In our case, every text column in your table can be considered a corpus, and each row of that column can be considered a document (part of the corpus). Each group of rows can then be associated with a specific topic, according to the number of topics you specify before starting the topic modeling process.
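The mapping from table to corpus to topics can be sketched as follows. This is an illustrative, dependency-free example and not the module's API: the toy `table`, the `review` column, and the `nmf` helper are all hypothetical. It uses a small Non-Negative Matrix Factorization with multiplicative updates as a stand-in topic model, then assigns each row to the topic with the largest weight.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius distance between v and the product w @ h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor v (documents x terms) into w (documents x topics) @ h (topics x terms)."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in v]
    h = [[rng.random() + 0.1 for _ in v[0]] for _ in range(n_topics)]
    for _ in range(n_iter):
        # Multiplicative update for the topic-term matrix h.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(len(h[0]))]
             for i in range(n_topics)]
        # Multiplicative update for the document-topic matrix w.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_topics)]
             for i in range(len(w))]
    return w, h


# Toy table (made up for illustration): the "review" column is the corpus.
table = [
    {"id": 1, "review": "great battery great screen"},
    {"id": 2, "review": "battery life screen quality"},
    {"id": 3, "review": "slow delivery late package"},
    {"id": 4, "review": "package arrived late delivery"},
]
corpus = [row["review"] for row in table]    # one text column = one corpus
documents = [doc.split() for doc in corpus]  # one row = one document
vocabulary = sorted({t for doc in documents for t in doc})
v = [[doc.count(term) for term in vocabulary] for doc in documents]

n_topics = 2  # chosen before modeling starts
w, h = nmf(v, n_topics)
# Each row is associated with the topic that carries the largest weight.
topic_of_row = [max(range(n_topics), key=lambda t: weights[t]) for weights in w]
```

Rows that share a value in `topic_of_row` form one group. The number of groups can never exceed `n_topics`, which is why that number has to be fixed up front.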
We will discuss all of the above in detail next.