A Glance at Natural Language Processing: Theories and Practices

Name

Zhengge Tang

Major

Data Science

Class

2022

About

Zhengge Tang is an undergraduate student in the Class of 2022 at Duke Kunshan University, majoring in Data Science.

Signature Work Project Overview

Natural Language Processing (NLP) is an important subfield of data science with broad application prospects and a wide range of techniques. In recent years, deep learning architectures such as the Transformer and Bidirectional Encoder Representations from Transformers (BERT) have greatly improved how efficiently NLP tasks can be handled in real-world practice and have become increasingly popular in the field. Motivated by my interest in the area and by the goal of building a solid foundation for future study and research in related fields, I set three objectives for the Signature Work project: (1) master common NLP methods, models, and architectures, and understand their mathematical and statistical foundations; (2) build deep learning models based on architectures such as the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM), which are easier to train with available resources, and test them on common NLP tasks such as text-based sentiment analysis; (3) experiment with models and architectures that are harder to build from scratch, such as the Transformer and Word2Vec, using existing pretrained resources, and compare their performance with that of the earlier models on the same NLP tasks under various criteria.

Signature Work Presentation Video