Introduction to Natural Language Processing in Python

This tutorial introduces the basics of natural language processing (NLP) in Python. If you have encountered a pile of textual data for the first time, this is the right place for you to begin your journey of making sense of the data. This tutorial is based on Python version 3.6.5 and NLTK version 3.3.

Before tackling common NLP tasks such as word frequency, word clouds, named entity recognition (NER) and TF-IDF, the data should be cleaned: tokenize the text into words, convert the words to their canonical form, and remove noise.
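The three cleaning steps can be sketched with the standard library alone. This is a minimal illustration, not the tutorial's NLTK pipeline: the stop-word set, the lemma table and the function name `clean_tweet` are hypothetical stand-ins for the fuller resources a library such as NLTK provides.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only; a real
# pipeline would rely on a full stop-word list and a proper lemmatizer.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "to", "of", "and", "in", "on"}
LEMMAS = {"flights": "flight", "delayed": "delay", "bags": "bag"}


def clean_tweet(text):
    """Tokenize one tweet, remove noise, and map words to a canonical form."""
    text = re.sub(r"https?://\S+", " ", text)  # noise: URLs
    text = re.sub(r"@\w+", " ", text)          # noise: @mentions
    # Crude word tokenization: lowercase alphabetic runs, allowing one apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    # Noise: stop words carry little meaning for frequency-based analysis.
    tokens = [tok for tok in tokens if tok not in STOP_WORDS]
    # Canonical form: look each token up in the (toy) lemma table.
    return [LEMMAS.get(tok, tok) for tok in tokens]


cleaned = clean_tweet("@airline The flights were delayed! http://t.co/x")
print(cleaned)
```

A dictionary lookup is used for the canonical form only to keep the sketch self-contained; it covers nothing beyond the words listed in the table.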

Computing word frequency is the most basic form of analysis on textual data. A single tweet is too small a sample to reveal the distribution of words, so the frequency analysis is run over all 20,000 tweets.
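Counting words across the whole collection rather than per tweet might look like the sketch below. The three token lists are made-up sample data standing in for the 20,000 cleaned tweets, and `word_frequencies` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical, already-cleaned tweets standing in for the full data set.
cleaned_tweets = [
    ["flight", "delay", "again"],
    ["great", "flight", "crew"],
    ["delay", "flight", "cancel"],
]


def word_frequencies(token_lists):
    """Accumulate word counts over every tweet in the collection."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return counts


freq = word_frequencies(cleaned_tweets)
top_two = freq.most_common(2)  # the two most frequent words with their counts
print(top_two)
```

`Counter` is used here because it ships with Python; NLTK's `FreqDist` offers a similar interface.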

After you have plotted the most frequent words, you may want to visualize the distribution of words. You can do so by creating a word cloud with the wordcloud package.

Named Entity Recognition (NER) is the process of detecting named entities such as persons, locations and organizations in your text.

TF-IDF (term frequency - inverse document frequency) is a statistic that indicates how important a term is to a document relative to the rest of the collection. Ideally, the terms at the top of the TF-IDF list should play an important role in deciding the topic of the text.
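As a rough illustration, the sketch below computes one common TF-IDF variant: term frequency is the term's count divided by the document length, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the term. Libraries often apply different smoothing, so their scores can differ. The sample documents and the function name `tf_idf` are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # tf = count / document length, idf = ln(N / document frequency)
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenized documents for illustration.
docs = [
    ["flight", "delay", "delay"],
    ["flight", "crew", "great"],
    ["flight", "cancel"],
]
scores = tf_idf(docs)
top_term = max(scores[0], key=scores[0].get)  # highest-scoring term of the first document
print(top_term, scores[0])
```

In this toy data "flight" appears in every document, so its score is zero, while "delay" is both frequent in the first document and absent elsewhere, which places it at the top of that document's list.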

For a step-by-step tutorial on natural language processing in Python, see Natural Language Processing in Python 3 Using NLTK.

Related Blog Posts

An Introduction to Core Machine Learning

Simply put, the Core Machine Learning Framework enables developers to integrate their machine learning models into iOS applications.

Three libraries associated with Core ML form part of its functionality:

  1. Vision: this library handles face identification, feature detection, and classification of image and video scenes. It is based on computer vision techniques and high-performance image processing.
  2. Foundation (NLP): this library provides the tools for natural language processing in iOS apps.
  3. GameplayKit: this kit uses decision trees for game development and artificial intelligence purposes.

The following are the requirements for setting up a simple Core ML project:

  1. Operating system: macOS (Sierra 10.12 or above)
  2. Programming language: Python for Mac (Python 2.7) and pip
  3. Coremltools: the package that converts machine learning models written in Python to a format that the Core ML framework understands
  4. Xcode 9: the default platform on which iOS applications are built

How to Create and Deploy a Pre-Trained Word2Vec Deep Learning REST API

Word vectors have recently been shaking up the deep learning world due to their flexibility and ease of training. Word embeddings have revolutionized the field of NLP. In this tutorial, we will make a pre-trained deep learning model named Word2Vec available to other services by building a REST API from the ground up.

Prerequisites:

  1. A Unix-based machine such as an Alibaba Cloud Elastic Compute Service (ECS) instance, preferably with more compute power
  2. An understanding of Python and pip commands
  3. Knowledge of how to use the Linux operating system to create, navigate and edit folders and files

Related Products

Machine Learning Platform for AI

Machine Learning Platform for AI is an end-to-end platform that provides various machine learning algorithms to meet your data mining and analysis requirements, including text processing components for NLP.

Realtime Compute

Realtime Compute offers a one-stop, high-performance platform that enables real-time big data processing based on Apache Flink. It is widely used in diverse scenarios, such as streaming data processing, offline data processing, and data lake computing.


Alibaba Clouder
