What Is Natural Language Processing and How Does It Work?

NeoSpeech looks into everything about NLP

In 1950, Alan Turing published his famous paper titled “Computing Machinery and Intelligence”. The paper proposed a test to determine if a machine was artificially intelligent. Basically, Turing said that if a machine could have a conversation with a human and trick the human into thinking the machine was a person itself, then it was artificially intelligent.

This became known as the Turing Test, and passing it has been one of the most sought-after goals in computer science. Passing the Turing Test would signal the birth of artificial intelligence.

The most essential part of the Turing Test is communication. The computer has to be able to communicate with a human in the human’s own language. The technology that makes this possible is called natural language processing.

While we still haven’t quite achieved artificial intelligence, natural language processing has become very popular over the past few years and is used in many products today. It’s a fascinating technology that’s already changed the tech world a lot and promises to do so even more in the future. So what makes this bit of technology so exciting? Let’s find out.

Natural Language Processing Definition

Natural language processing (NLP) can be defined as the ability of a machine to analyze, understand, and generate human language, whether spoken or written. The goal of NLP is to make interactions between computers and humans feel exactly like interactions between humans and humans.

And when we say interactions between humans and humans, we’re talking about how humans communicate with each other by using natural language. Natural language is a language that is native to people.  English, Spanish, French, and Mandarin are all examples of a natural language.

On the other hand, computers have always operated on artificial languages (computer programming languages such as SQL, Java, C++, etc.).  These languages were constructed to communicate instructions to machines.

Because computers operate on artificial languages, they are unable to understand natural language. This is the problem that NLP solves. With NLP, a computer is able to listen to a natural language being spoken by a person, understand the meaning of it, and then if needed, respond to it by generating natural language to communicate back to the person.

Of course, there are several complex steps involved in that process.  NLP is a field of computer science that has been around for a while, but has gained much popularity in recent years as advances in technology have made it easier to develop computers with NLP abilities.

How is natural language processing used today?

There are several different tasks that NLP can be used to accomplish, and each of those tasks can be done in many different ways. Let’s look at some of the most common applications for NLP today:

Spam filters

One of the biggest headaches of email is spam. To set up a first line of defense, services such as Gmail use NLP to determine which emails are good and which are spam. These spam filters scan the text in all the emails you receive, and attempt to understand the meaning of that text to determine if it’s spam or not.
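
To make the idea concrete, here is a minimal sketch of a text classifier that labels a message as spam or not. It uses naive Bayes, a common textbook technique; this is not a description of how Gmail or any other service actually works, and the training messages and function names are made up for illustration.

```python
import math
from collections import Counter

# Tiny hand-made training set (hypothetical examples, not real data).
TRAINING = [
    ("win a free prize now", "spam"),
    ("claim your free money today", "spam"),
    ("meeting moved to monday morning", "ham"),
    ("please review the project report", "ham"),
]

def train(examples):
    """Count how often each word appears in each class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the higher log-probability under naive Bayes."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Start from the class prior, then add per-word log-likelihoods.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

word_counts, doc_counts = train(TRAINING)
print(classify("free prize today", word_counts, doc_counts))
print(classify("project meeting on monday", word_counts, doc_counts))
```

A real filter would be trained on a very large number of messages and would look at far more than individual words, but the principle of scoring text against what was seen in each class is the same.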

Algorithmic trading

Wouldn’t it be amazing if you could master the stock market without having to do a thing? That’s what algorithmic trading is for. Using NLP, this technology reads news stories concerning companies and stocks and attempts to understand the meaning of them to determine if you should buy, sell, or hold onto certain stocks.

Answering questions

If you’ve ever typed a question in Google search, or asked Siri for directions, then you’ve seen this form of NLP in action. A major use of NLP is to make search engines understand the meaning of what we are asking, and then often to generate natural language in return to give us the answers we’re looking for.

Summarizing information

There’s a lot of information on the web, and a lot of that information is in the form of long documents or articles. NLP is used to understand the meaning of this information and then generate shorter summaries of it so humans can understand it more quickly.
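
One simple way to approximate this is extractive summarization: score each sentence by how many frequently used words it contains and keep the top scorers. The sketch below assumes that approach; the stopword list, the scoring rule, and the sample text are illustrative choices, not a description of any particular product.

```python
import re
from collections import Counter

# Small, hypothetical stopword list; real systems use much longer ones.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it"}

def content_words(text):
    """Lowercase the text and drop very common words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def summarize(text, max_sentences=1):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average word frequency, so long sentences are not favored unfairly.
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore original order
    return " ".join(sentences[i] for i in chosen)

TEXT = (
    "Speech recognition turns speech into text. "
    "The weather was nice yesterday. "
    "Text analysis finds meaning in text."
)
print(summarize(TEXT))
```

Note that this only selects existing sentences. Systems that write new summary sentences also need the generation step described later in this post.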

Many devices use NLP today

Those are just a handful of the ways NLP is used today. But by looking at those few examples you might have spotted some patterns. Did you notice that in all examples, NLP was used to understand natural language? And in most cases, it was also used to generate natural language. These are generally considered the two main components of NLP. They are Natural Language Understanding (NLU) and Natural Language Generation (NLG).

So how does natural language processing work?

To understand how NLP works, we have to take a look at the two main components of it, NLU and NLG. These two parts of NLP are very different from each other and are achieved by using different methods.

Natural Language Understanding

The most difficult part of NLP is understanding, or providing meaning to the natural language that the computer received.

First, if the natural language arrives as speech, the computer must convert it into text. This is what speech recognition, or speech-to-text, does, and it is the first step of NLU. Once the information is in text form, NLU can take place to try to understand the meaning of that text.

Many speech recognition systems have traditionally been based on Hidden Markov Models (HMMs). These are statistical models that help turn your speech into text by calculating the most likely sequence of sounds and words behind the audio.

An HMM-based recognizer does this by breaking your speech down into small units (usually 10-20 milliseconds), then comparing each unit against models built from recorded speech to estimate which phoneme you said (a phoneme is the smallest unit of sound that distinguishes one word from another). Then, it looks at the series of phonemes and statistically determines the most likely words and sentences you were saying. It outputs this information in the form of text.
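
The "statistically determines the most likely" step is usually done with the Viterbi algorithm, which finds the most probable sequence of hidden states (here, phonemes) given a sequence of observations (here, audio frames). The sketch below is a toy version under strong assumptions: each frame has already been reduced to one of three made-up labels, and all the probabilities are invented for illustration. Real recognizers work with continuous acoustic features and far larger models.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Pick the best final state, then follow the back-pointers to the start.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))

# Hypothetical model for the word "cat": phonemes k, ae, t.
STATES = ["k", "ae", "t"]
START = {"k": 0.8, "ae": 0.1, "t": 0.1}
TRANS = {
    "k": {"k": 0.3, "ae": 0.6, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.3, "t": 0.6},
    "t": {"k": 0.1, "ae": 0.1, "t": 0.8},
}
# Probability that each phoneme produces each (made-up) frame label.
EMIT = {
    "k": {"burst": 0.7, "vowel": 0.1, "stop": 0.2},
    "ae": {"burst": 0.1, "vowel": 0.8, "stop": 0.1},
    "t": {"burst": 0.2, "vowel": 0.1, "stop": 0.7},
}

frames = ["burst", "vowel", "vowel", "stop"]
print(viterbi(frames, STATES, START, TRANS, EMIT))  # ['k', 'ae', 'ae', 't']
```

The vowel is recognized across two frames because a single phoneme normally lasts longer than one 10-20 millisecond unit.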

The next, and hardest, step of NLU is the actual understanding part.

Again, different NLP systems use different techniques. However, the process is generally similar. First, the computer must understand what each word is. It tries to understand if it’s a noun or a verb, if it’s past or present tense, and so on. This is called Part-of-Speech tagging (POS).
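
As a rough illustration of POS tagging, here is a minimal sketch that looks each word up in a small lexicon and falls back on suffix rules to guess tense. The lexicon, the tag names, and the rules are all hypothetical; production taggers are statistical and trained on large annotated corpora.

```python
# Hypothetical mini-lexicon mapping base words to a part of speech.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN", "park": "NOUN",
    "chase": "VERB", "walk": "VERB", "see": "VERB",
    "in": "PREP",
}

def tag_word(word):
    """Return (word, part of speech, tense or None)."""
    token = word.lower()
    if token in LEXICON:
        pos = LEXICON[token]
        return (word, pos, "present" if pos == "VERB" else None)
    # Suffix rule: "-ed" on a known verb stem suggests past tense.
    if token.endswith("ed"):
        for stem in (token[:-2], token[:-1]):
            if LEXICON.get(stem) == "VERB":
                return (word, "VERB", "past")
    # Suffix rule: "-s" on a known stem keeps that stem's part of speech.
    if token.endswith("s") and token[:-1] in LEXICON:
        pos = LEXICON[token[:-1]]
        return (word, pos, "present" if pos == "VERB" else None)
    return (word, "UNKNOWN", None)

def tag(sentence):
    return [tag_word(word) for word in sentence.split()]

print(tag("The dog chased a cat"))
```

Rules like these break down quickly on real text (for example, "walks" can be a noun or a verb), which is one reason modern taggers rely on statistics and context instead.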

NLP systems also have a lexicon (a vocabulary) and a set of grammar rules coded into the system. Modern NLP algorithms use statistical machine learning to apply these rules to the natural language and determine the most likely meaning behind what was said.

By the end of the process, the computer should understand the meaning of what you said. There are several challenges in accomplishing this when considering problems such as words having several meanings (polysemy) or different words having similar meanings (synonymy), but developers encode rules into their NLU systems and train them to learn to apply the rules correctly.
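
To show how a system might handle a word with several meanings, here is a minimal sketch of a dictionary-overlap approach (a simplified form of the Lesk algorithm): pick the sense whose related words overlap most with the rest of the sentence. The sense inventory below is made up for illustration, and this is only one of many possible techniques.

```python
# Hypothetical sense inventory: each sense lists words that tend to appear near it.
SENSES = {
    "bank": {
        "financial institution": {"money", "deposit", "loan", "account", "cash"},
        "river edge": {"river", "water", "fishing", "shore", "boat"},
    }
}

def disambiguate(word, sentence):
    """Choose the sense whose signature words overlap most with the sentence."""
    context = set(sentence.lower().split())
    senses = SENSES[word]
    return max(senses, key=lambda sense: len(senses[sense] & context))

print(disambiguate("bank", "she opened an account at the bank to deposit money"))
print(disambiguate("bank", "we went fishing on the bank of the river"))
```

When no signature words appear in the sentence, this sketch simply returns the first sense listed, which is a reminder of why real systems are trained on large amounts of data rather than relying on a handful of hand-written hints.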

Natural Language Generation

NLG is much simpler to accomplish. NLG translates a computer’s artificial language into text, and can also go a step further by translating that text into audible speech with text-to-speech.

First, the NLP system determines what information to translate into text. If you asked your computer a question about the weather, it most likely did an online search to find your answer, and from there it decides that the temperature, wind, and humidity are the parts that should be read aloud to you.

Then, it organizes the structure of how it’s going to say it. This is similar to NLU except backwards. Using a lexicon and a set of grammar rules, an NLG system can form complete sentences.
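
The two steps just described, choosing what to say and then deciding how to say it, can be sketched with a simple template-based generator for the weather example. The field names, phrases, city, and values below are all hypothetical, and templates are only one of several ways to build an NLG system.

```python
# Hypothetical phrase templates, one per field worth reading aloud.
PHRASES = {
    "temperature_c": "the temperature is {} degrees Celsius",
    "wind_kph": "the wind is blowing at {} kilometers per hour",
    "humidity_pct": "the humidity is {} percent",
}

def select_content(report, wanted=("temperature_c", "wind_kph", "humidity_pct")):
    """Step 1: keep only the fields that should be spoken."""
    return {key: report[key] for key in wanted if key in report}

def realize(city, facts):
    """Step 2: turn the selected facts into one grammatical sentence."""
    clauses = [PHRASES[key].format(value) for key, value in facts.items()]
    if not clauses:
        return f"No weather data is available for {city}."
    if len(clauses) == 1:
        body = clauses[0]
    elif len(clauses) == 2:
        body = " and ".join(clauses)
    else:
        body = ", ".join(clauses[:-1]) + ", and " + clauses[-1]
    return f"In {city}, {body}."

# Made-up search result; "station_id" is deliberately left out of the answer.
report = {"temperature_c": 21, "wind_kph": 12, "humidity_pct": 40, "station_id": "X1"}
print(realize("Springfield", select_content(report)))
```

The joining logic at the end plays the role of the grammar rules: it decides where commas and "and" belong depending on how many facts there are.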

Finally, if the natural language text is going to be read aloud, text-to-speech takes over. The text-to-speech engine analyzes the text using a prosody model, which determines breaks, duration, and pitch. Then, using a speech database (recordings from a voice actor), the engine puts together all the recorded phonemes to form one coherent string of speech.
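
Here is a very rough sketch of that last stage, assuming a tiny pronunciation dictionary and a speech database in which each phoneme is stored as a short list of numbers standing in for recorded audio. Everything in it is hypothetical. The prosody step only models breaks (a pause at punctuation) and duration (stretching the last phoneme before a break); pitch is left out entirely.

```python
# Hypothetical pronunciation dictionary: word -> phoneme sequence.
PRONUNCIATIONS = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# Hypothetical speech database: phoneme -> fake waveform samples.
SPEECH_DB = {
    phoneme: [float(i), float(i)]
    for i, phoneme in enumerate(["HH", "AH", "L", "OW", "W", "ER", "D"], start=1)
}
SILENCE = [0.0, 0.0]

def plan_prosody(text):
    """Decide breaks and durations for each phoneme in the text."""
    plan = []
    for raw in text.lower().split():
        word = raw.strip(",.!?")
        phonemes = PRONUNCIATIONS[word]
        at_break = raw[-1] in ",.!?"
        for i, phoneme in enumerate(phonemes):
            # Lengthen the final phoneme before a break.
            is_final = at_break and i == len(phonemes) - 1
            plan.append({"phoneme": phoneme, "stretch": 2 if is_final else 1})
        if at_break:
            plan.append({"phoneme": "PAUSE", "stretch": 1})
    return plan

def synthesize(plan):
    """Concatenate the stored units according to the prosody plan."""
    audio = []
    for unit in plan:
        samples = SILENCE if unit["phoneme"] == "PAUSE" else SPEECH_DB[unit["phoneme"]]
        # Repeating samples is a crude stand-in for real duration control.
        audio.extend(samples * unit["stretch"])
    return audio

plan = plan_prosody("Hello, world.")
audio = synthesize(plan)
print(len(plan), len(audio))
```

A real engine selects among many recordings of each sound and smooths the joins between them, which is a large part of what makes synthesized speech sound natural.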

(If you’re more curious about text-to-speech, read our blog on What Is Text-To-Speech And How Does It Work? You can also learn more about the two most common methods for creating a speech database in HTS vs. USS: Which Speech Synthesis Technique Is Better?)

We could’ve gone into so much more detail on everything above, but this should give you a general understanding of how NLP works today.

The Future of NLP

Deep neural networks might be the future of speech technology

We’re already seeing new approaches emerge, and developers are building even better systems. Companies like Google are experimenting with Deep Neural Networks (DNNs) to push the limits of NLP and make it possible for human-to-machine interactions to feel just like human-to-human interactions.

You can read more about how DNNs can significantly improve text-to-speech technology in this article. While DNN-based text-to-speech engines are still some way from hitting the market, the potential for this technology is exciting!

Let Us Know What You Think!

What are your thoughts on NLP? If you have any questions, comments or ideas, feel free to comment below and join the discussion!

Learn More about Text-to-Speech

To learn more about the different areas in which Text-to-Speech technology can be used, visit our Text-to-Speech Areas of Application page.

If you’re interested in adding text-to-speech software to your application or would like to learn more about TTS, please fill out our Sales Inquiry form and one of our friendly team members will be happy to help.

2 Comments
  • Nikki D.

    October 21, 2016 at 2:25 pm

    Awesome break-down of how NLP works behind the scenes without the unnecessary technical mumbo jumbo that I’ve been seeing on other sites. Thank you!!

    • neoadmin

      October 25, 2016 at 2:19 pm

      Hi Nikki, glad you enjoyed our post about it! I agree that a lot of information regarding NLP out there is confusing or incomplete, so I’m glad this helped!
