Rudra Roy

MSc EEE @ BUET · AI · Computer Vision · Speech/Audio

I am Rudra Roy, an MSc student in Electrical & Electronic Engineering (Communication & Signal Processing) at Bangladesh University of Engineering and Technology (BUET). My work spans AI, Computer Vision, Speech and Audio Processing, OCR, and Medical Image Analysis, with a strong focus on turning research ideas into practical systems.

Currently, I work as a Machine Learning Engineer (MLE) at ACI PLC, where I build applied AI solutions for Bangla OCR, legal LLMs, ASR systems, and document intelligence pipelines.

Skill Set

Python · PyTorch / TensorFlow · OpenCV · Librosa / Torchaudio · Scikit-learn · MATLAB / Simulink · Embedded / IoT (Arduino, RPi) · ROS

Research

Exploring the frontiers of AI, signal processing, and embedded systems

Sign Language Recognition

Developing real-time sign language recognition systems using custom deep learning architectures and GAN-aided data augmentation for robust performance across diverse signing styles and environments.

Speech Enhancement

Building speech enhancement models for low-SNR conditions, using architectures that combine waveform-domain processing with the discrete wavelet transform for noise suppression.

IoT & Robotics

Designing embedded sensing systems and real-time monitoring solutions with control pipelines for practical industrial and agricultural applications, from greenhouse automation to gas burner safety.

Projects

Research projects and engineering builds

CAPRes50-GAN — Word-Level Sign Language Recognition

Ongoing
CV · SLR · GAN

pFLOCT — Personalized FL for OCT Classification

Ongoing
FL · Medical Imaging · OCT

CAR-UNet — Speech Enhancement with ConvNeXt + Attention

Ongoing
Speech · ConvNeXt · Attention · UNet

BreastDCGAN — End-to-End Breast Cancer Segmentation & Classification

Ongoing
Medical Imaging · Breast Cancer · GAN

Publications

Peer-reviewed research contributions

Enhancing Communication for the Deaf: Real-Time Sign Language Recognition & Translation

IEEE R10-HTC 2024, Kuala Lumpur

DPMAS-Net: Privacy-Preserving EMG Hand Gesture Recognition

TENSYMP 2024, New Delhi

N2N2N: Clean Data Independent Speech Enhancement with Modified cGAN

TENCON 2024, Singapore

Upcoming News

What I am currently building and what is coming next

Upcoming · June 12–17, 2026

APIE Advanced Camp 04 (Funded by SOI Asia)

I will attend APIE Advanced Camp 04, funded by SOI Asia.

SOI Asia

Program Page

In Progress · Q3 2026

Production OCR Module Rollout

Preparing the next milestone release of enterprise OCR pipelines with improved invoice and legal document parsing quality.

ACI PLC, Dhaka

Upcoming · Q4 2026

Bangla ASR Domain Adaptation Update

Upcoming internal benchmark update for Bangla ASR with petroleum domain vocabulary expansion and evaluation reporting.

Dhaka / Remote

Planned · 2026

New Research Collaboration Window

Open slot for collaborative work in speech, vision, and applied LLM systems for real-world deployments.

Remote