Syntax

How words are arranged.


Text Classification

Text classification is one of the most commonly and widely known NLP tasks. In this note, several text classification tasks, algorithms for different tasks and their evaluation are summarized.


Language Model - N-gram

Words, Sentences, Paragraphs, Documents, Corpus are all sparse. they are made up of characters and have uncountable variations. Let’s simplify the question: How can we learn from the meaning of a piece of text based on a given corpus? Language Models are devised to tackle this challenge. In this notes, N-gram Model, its smoothing, application and evaluation are summarized.


Preprocessing - Word Sequences and Documents

Preprocessing is the first step of almost all NLP tasks. What techniques are commonly used? Why is it important? Which preprocessing techniques should we use in a specific task? How it will affect the outcome?