Basic NLP Parlance

What Is NLP?

“Natural language processing (NLP) is a combination of linguistics, computer science, information engineering, and Artificial Intelligence. It is concerned with the interactions between computers and human (natural) languages.”

NLP is the complete process of consuming natural language data, processing the data, understanding it, and then producing the required human-understandable output. NLP is broadly classified into two parts: NLU (Natural Language Understanding) and NLG (Natural Language Generation). Natural Language Understanding deals with the problem of making sense out of the data, whereas Natural Language Generation involves the system’s production of natural language.

Human language is ambiguous, which makes NLP difficult. As humans, most of us naturally understand the context of a conversation and how context changes the meaning of words. An example from the NIU computer science department illustrates this perfectly:

“I made her a duck” can have various interpretations, such as:

I cooked waterfowl for her.
I cooked waterfowl belonging to her.
I created the (toy or sculpture?) duck she owns.

Over time, the techniques for solving NLP problems have developed significantly. Initially, rule-based techniques were used; these gradually gave way to more sophisticated statistical techniques such as Naïve Bayes and hidden Markov models. In the past few years, however, Deep Learning approaches have produced state-of-the-art results for various NLP tasks. More specifically, models such as BERT, ELMo, UniLM, and ULMFiT have shown great results in question answering, text summarization, text classification, and other areas.

NLP Is Ubiquitous

We use NLP in our daily lives without realizing it. For example, Siri, Google Search, automatic spam detection, and automatic spelling correction all use NLP. The following are some of the most common tasks and applications of NLP.

Machine translation

Machine translation is an NLP task that converts text from one natural language into another. It is very helpful in data mining tasks, customer support, and similar settings.

Sentiment analysis

Sentiment analysis is used to figure out the feeling behind speech or text. There are various real-life implementations of sentiment analysis; it helps companies determine users’ response to their products, campaigns, business decisions, etc.
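
To make the idea concrete, here is a minimal sketch of one simple way to score sentiment: counting words from a small lexicon. The word lists and the function name score_sentiment are hypothetical, made up for this illustration; real systems typically use much larger lexicons or trained models, and this sketch ignores negation, sarcasm, and context.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicon (illustration only).
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "terrible", "disappointed"}


def score_sentiment(text):
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for review in [
        "I love this product, it is great",
        "Terrible support, I am disappointed",
        "The package arrived on Tuesday",
    ]:
        print(review, "->", score_sentiment(review))
```

A company could run a function like this over product reviews to get a rough picture of user response, although a phrase such as "not good" would be mislabeled by this simple counting approach.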

Text classification

Text classification is an NLP task that organizes texts into different categories. It is similar to other classification problems in machine learning, but its input is text rather than numerical data. Many problems can be addressed directly through text classification, and in many other cases it acts as an intermediate step toward solving a bigger problem.
For instance, suppose we were developing a ticket management system that takes a user’s description of a ticket as input, categorizes it based on the text, and then generates a ticket and forwards it to the user and any other concerned party. The text classification model would not solve the complete problem here, but it would resolve a significant part of it.
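
The categorization step of such a ticket system could be sketched as follows. This is a minimal illustration that assumes a multinomial Naïve Bayes classifier with add-one smoothing, written in plain Python; the training tickets, the category names (billing, technical, account), and the names NaiveBayesTicketClassifier and route_ticket are all hypothetical. A production system would need far more training data and would likely rely on an established library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTicketClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # Count documents per category and word occurrences per category.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training tickets (illustration only).
TRAIN_TICKETS = [
    ("I was charged twice on my invoice", "billing"),
    ("Refund my payment for the last invoice", "billing"),
    ("The app crashes when I open settings", "technical"),
    ("Error message appears and the app freezes", "technical"),
    ("I cannot log in and need a password reset", "account"),
    ("Please reset my password, my account is locked", "account"),
]

texts, labels = zip(*TRAIN_TICKETS)
classifier = NaiveBayesTicketClassifier().fit(texts, labels)


def route_ticket(description):
    """Build a ticket record; the category could decide who receives it."""
    return {"description": description, "category": classifier.predict(description)}


if __name__ == "__main__":
    print(route_ticket("The app crashes with an error"))
```

Here the classifier only assigns the category. Generating the ticket and forwarding it to the right people would be handled by the rest of the system, which is why classification is an intermediate step rather than the whole solution.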

Numerous applications

There is a plethora of other NLP applications, including reading comprehension, text summarization, relation extraction, image captioning, and emotion recognition.
NLP applications seldom achieve 100% accuracy, but they are an integral part of our lives because they help us with our daily tasks.

References

Source: https://en.wikipedia.org/wiki/Natural_language_processing
Source: http://faculty.cs.niu.edu/~freedman/csnl/intro-duck02.txt

-Authored by Pranav Sharma, Data Scientist at Absolutdata

Technical articles are published from the Absolutdata Labs group, and hail from The Absolutdata Data Science Center of Excellence. These articles also appear in BrainWave, Absolutdata’s quarterly data science digest.
