Project Title: Toxic Comment Classification
Category: Natural Language Processing / Deep Learning
Umair Ali
umairali@vu.edu.pk
live:umairalihamid_1
Toxic Comment Classification
Project Domain / Category
Natural Language Processing/Deep Learning
Abstract / Introduction
With the rise of social media and online platforms, toxic comments such as hate speech, bullying, and offensive language have become widespread. These comments negatively impact online communities, mental health, and constructive communication. Detecting and filtering such harmful content manually is inefficient and impractical given the massive volume of user-generated content. This project aims to develop a machine learning–based system to automatically classify comments as toxic or non-toxic.
Online platforms face a serious challenge in moderating abusive and harmful comments. Traditional moderation methods (manual review, keyword filtering) are limited, prone to bias, and fail to adapt to new slang or expressions. An automated and intelligent toxic comment classification system is required to improve moderation efficiency and user safety.
Functional Requirements:
The Admin (Student) will design and develop a system that meets the following requirements:
Design and implement a text classification model for detecting toxic comments.
Preprocess and clean raw text data (remove noise, normalize language, handle slang).
Extract meaningful features using techniques such as TF-IDF and word embeddings.
Train and evaluate machine learning and deep learning models (e.g., Logistic Regression, SVM, LSTM, BERT).
Compare the models on accuracy, precision, recall, and F1-score, and inspect each model's confusion matrix.
Build a simple prototype interface where users can enter a comment and get its classification result.
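As a rough illustration of the preprocessing, feature extraction, and evaluation requirements above, the following is a minimal sketch in plain Python, not a prescribed implementation. The slang map, the function names (clean_text, tfidf, binary_metrics), and the label convention (1 = toxic, 0 = non-toxic) are assumptions made for this example. The TF-IDF weighting shown (term frequency times log(N / document frequency)) is only one of several common variants, and a real project would more likely rely on a library implementation.

```python
import math
import re
from collections import Counter

# Hypothetical slang map for illustration only; a real project would curate its own.
SLANG = {"u": "you", "r": "are", "ur": "your"}


def clean_text(comment):
    """Lowercase a comment, strip URLs and non-letters, and expand a few slang tokens."""
    text = comment.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits and punctuation
    return [SLANG.get(tok, tok) for tok in text.split()]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = log(N / df).
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return vectors


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and derived scores, treating 1 as 'toxic'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In use, each raw comment would pass through clean_text, the resulting token lists through tfidf to obtain feature vectors for a classifier, and the classifier's predictions through binary_metrics so that different models can be compared on the same held-out data.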
Dataset:
https://drive.google.com/drive/folders/1VOuo8byq10Bt4Hr552ch60OcU6KZiRjM?usp=sharing
*You must use your VU email id to access/download the dataset.
Tools:
Use Python with NLP libraries (e.g., NLTK, spaCy) in Jupyter Notebook, VS Code, or similar environments.
Prerequisite:
Artificial Intelligence, Machine Learning, and Natural Language Processing concepts.
In addition to the initial documentation (i.e., the SRS and Design document), the Admin (Student) will complete short courses relevant to these concepts.
Topic # | Weblink
1 | https://www.python.org/
2 | https://www.w3schools.com/python/
3 | https://www.tutorialspoint.com/python/index.htm
4 | https://www.kaggle.com/learn/python
5 | https://www.kaggle.com/learn/intro-to-machine-learning
6 | https://developers.google.com/machine-learning/crash-course
7 | https://www.kaggle.com/learn/intro-to-deep-learning
8 | https://www.tutorialspoint.com/python_deep_learning/index.htm
9 | https://www.tutorialspoint.com/deep-learning-tutorials/index.asp
10 |
11 |
12 |
13 | Custom NER with spaCy v3 Tutorial | Free NER Data Annotation
Here are some additional tips for finding freely available courses and resources on text classification and NLP:
Use keywords such as "text classification," "text analysis," "NLP," and "natural language processing" in your search.
Look for websites that specialize in NLP education and resources.
Check MOOC platforms such as Coursera, edX, and Udacity for free courses and tutorials.
Read blog posts and articles written by experts in the field.
Join online communities and forums dedicated to NLP.
Supervisor:
Name: Umair Ali
Email ID: umairali@vu.edu.pk
MS Teams ID: umair.ali.hamid@outlook.com