ASSISTIVE VOCALIZER USING OPENCV FOR SIGN LANGUAGE RECOGNITION AND TRANSLATION

Project Details

Project Information

Project Title: ASSISTIVE VOCALIZER USING OPENCV FOR SIGN LANGUAGE RECOGNITION AND TRANSLATION

Category: Image Processing

Semester: Fall 2025

Course: CS619

Complexity: Very Complex

Project Description

Project Domain / Category

 

Image Processing / Artificial Intelligence / Web App / Deep Learning

 

Abstract / Introduction

 

The Assistive Vocalizer is a two-way communication system that combines computer vision and speech recognition to assist people with speech and hearing impairments. A camera built into a Django-based web interface captures real-time hand gestures, which are processed with OpenCV to detect the hand, segment it, and extract landmarks. These gesture features are then classified by a TensorFlow/Keras convolutional neural network (CNN) and mapped to alphabets, words, or phrases. Recognized gestures are displayed on the web front-end and converted into audible speech through text-to-speech integration, enabling smooth communication between sign language users and non-signers.

The system also includes voice recognition for two-way communication: spoken words are translated into sign language gestures shown in the interface. Sentiment analysis improves the natural flow of conversation by recognizing emotional tone, and Urdu translation makes the application inclusive for local users. The project embeds its logic in Django views and templates to provide a clean, user-friendly interface that can be deployed on the web. By combining image processing, deep learning, and natural language processing on a single interactive platform, with Django integrating the front end and back end, this design gives the speech-impaired community an effective and scalable assistive tool that can be used across a variety of devices.
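The capture-classify-display-speak loop described above can be sketched at a high level. The function and label names below are hypothetical, and the classifier and speech stages are passed in as stubs so the control flow can be shown without the OpenCV, CNN, or text-to-speech dependencies:

```python
# Hypothetical sketch of the recognition pipeline described above.
# classify_frame() stands in for the OpenCV + CNN stages; speak() stands
# in for the text-to-speech stage. Both are stubs here.

GESTURE_LABELS = [chr(c) for c in range(ord("A"), ord("Z") + 1)]  # A-Z alphabet signs

def label_for_prediction(class_index: int) -> str:
    """Map the CNN's predicted class index to a display label."""
    if not 0 <= class_index < len(GESTURE_LABELS):
        raise ValueError(f"unknown class index: {class_index}")
    return GESTURE_LABELS[class_index]

def run_pipeline(frames, classify_frame, speak):
    """For each camera frame: classify the gesture, collect the text,
    then hand the accumulated message to text-to-speech."""
    message = []
    for frame in frames:
        class_index = classify_frame(frame)   # OpenCV preprocessing + CNN inference
        message.append(label_for_prediction(class_index))
    text = "".join(message)
    speak(text)                               # gTTS / pyttsx3 in the real system
    return text
```

In the actual system the classifier callback would wrap the OpenCV preprocessing and the trained Keras model, and the speak callback would call the chosen text-to-speech engine.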

 

Functional Requirements:

 

        The system will capture real-time hand gestures through a camera and process them with OpenCV to detect hands and identify landmarks.

 

        The system will classify detected gestures as alphabets, words, or phrases using a trained TensorFlow/Keras deep learning model.

 

        The system will convert recognized gestures into text and then into audible speech using text-to-speech software.

 

        The system will include voice recognition to translate spoken input into the sign language gestures displayed on the interface.

 

        The system will incorporate sentiment analysis to understand the emotional tone of the conversation.

 

        The system will offer multilingual support, including English and Urdu translation, to keep the application inclusive.
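The first requirement above, turning a raw camera frame into the input the classifier expects, can be sketched as a preprocessing step. The real pipeline would use OpenCV calls such as `cv2.cvtColor` and `cv2.resize`; the NumPy-only version below shows the same grayscale, downsample, and normalize steps, and the 64x64 patch size is an illustrative assumption:

```python
import numpy as np

def preprocess_frame(frame: np.ndarray, size: int = 64) -> np.ndarray:
    """Convert a BGR camera frame into the normalized grayscale patch a
    gesture classifier might expect. A real pipeline would use
    cv2.cvtColor and cv2.resize; channel averaging and strided slicing
    keep this sketch dependency-free."""
    gray = frame.mean(axis=2)                       # crude grayscale
    step_y = max(1, gray.shape[0] // size)
    step_x = max(1, gray.shape[1] // size)
    small = gray[::step_y, ::step_x][:size, :size]  # naive downsample
    return (small / 255.0).astype(np.float32)       # scale pixels to [0, 1]
```

Landmark extraction on the detected hand region would follow this step before classification.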

 

Non-functional Requirements:

 

        Respond to gestures with low latency so that interaction feels natural.

 

        Work reliably under varying lighting conditions, backgrounds, and hand positions.

 

        Adapt to individual users, languages, and custom gestures.

 

        Scale easily to support additional gestures and words.

 

 


 

        Be usable and accessible to people with disabilities or limitations.

 

        Offer a simple, user-friendly interface.

 

Prerequisites:

 

        Have a good understanding of Python.

 

        Have knowledge of basic deep learning concepts and models.

 

        Understanding of basic image processing techniques (preferable but not mandatory).

 

        Have basic experience working with image-related datasets.

 

Tools:

 

        Programming & Framework:

 

Python for core development and Django for front-end and back-end integration.

 

        Computer Vision & Deep Learning Libraries:

 

The system uses OpenCV for image processing and TensorFlow/Keras for building and training the CNN used for gesture classification.
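A minimal sketch of such a classifier is shown below. The input shape (64x64 grayscale), class count (26 alphabet gestures), and layer sizes are illustrative assumptions, not a prescribed architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_gesture_cnn(input_shape=(64, 64, 1), num_classes=26):
    """Small CNN in the spirit described above: stacked conv/pool blocks
    feeding a softmax over the gesture classes. Shapes and class count
    are illustrative assumptions."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])

# Typical compile step before training on labelled gesture images:
# model.compile(optimizer="adam",
#               loss="sparse_categorical_crossentropy",
#               metrics=["accuracy"])
```

The model would be trained on preprocessed gesture images and then loaded inside the Django back end for real-time inference.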

 

        Speech Processing Tools:

 

Text-to-Speech (gTTS or pyttsx3) for audio output and the SpeechRecognition library for voice input.
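The speech side of the two-way flow might be glued together as follows. The helper that maps recognized speech to gesture labels is a hypothetical simplification (a real system would match whole-word signs before falling back to letter-by-letter spelling), and the pyttsx3 import is deferred so the text-mapping helper stays dependency-free:

```python
def text_to_gesture_labels(text: str) -> list[str]:
    """Map recognized speech to the sequence of sign images to display.
    Letters become A-Z gesture labels; everything else is skipped.
    (A real system would also match whole-word signs first.)"""
    return [ch.upper() for ch in text if ch.isalpha()]

def speak(text: str) -> None:
    """Audio output via pyttsx3 (offline); gTTS would be the online
    alternative. Imported inside the function so the helper above can
    be used without the speech libraries installed."""
    import pyttsx3
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

For voice input, the SpeechRecognition library's recognizer would produce the text that `text_to_gesture_labels` then turns into the gesture images shown on the interface.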

 

        Development & Deployment Environment: an IDE (PyCharm or VS Code) for coding.

 

Supervisor:

Name: Muhammad Wasif Mairaj

 

Email ID: wasif.mairaj@vu.edu.pk

 

MS Teams ID: wasif.vu@outlook.com

Languages

  • Python

Tools

  • Django, CNN, OpenCV, TensorFlow, Keras, gTTS, pyttsx3, SpeechRecognition, PyCharm, VS Code
