Feasibility Aspect

Strategies Used

Multiformat Input Support

Accepts PDF, image, and audio uploads, handled by pdf2image, OpenCV, and Vosk respectively, so every input enters the same processing pipeline
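
A minimal sketch of how uploads might be routed by type. The extension table and the function names `detect_kind` and `load_pages` are illustrative assumptions, not part of the project; pdf2image and OpenCV are imported lazily so the routing logic runs without them.

```python
from pathlib import Path

# Extension-to-kind table; the extensions listed are illustrative assumptions.
KIND_BY_SUFFIX = {
    ".pdf": "pdf",
    ".png": "image", ".jpg": "image", ".jpeg": "image", ".tif": "image", ".tiff": "image",
    ".wav": "audio", ".mp3": "audio", ".m4a": "audio",
}


def detect_kind(path: str) -> str:
    """Classify an upload as 'pdf', 'image', or 'audio' from its file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in KIND_BY_SUFFIX:
        raise ValueError(f"unsupported file type: {path!r}")
    return KIND_BY_SUFFIX[suffix]


def load_pages(path: str) -> list:
    """Return page images for a PDF or image upload (audio is handled separately)."""
    kind = detect_kind(path)
    if kind == "pdf":
        from pdf2image import convert_from_path  # lazy import; needs Poppler installed
        return convert_from_path(path, dpi=300)  # list of PIL images, one per page
    if kind == "image":
        import cv2  # lazy import
        image = cv2.imread(path)  # BGR NumPy array, or None if the file is unreadable
        if image is None:
            raise ValueError(f"could not read image: {path}")
        return [image]
    raise ValueError(f"{path} is audio; route it to the transcription step")
```

Keeping the type check separate from the loaders means a new format only needs a new table entry and one more branch.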

Text Extraction & Recognition

Uses Tesseract OCR with OpenCV pre-processing to extract text in regional scripts (Nepali, Sinhalese)
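
A hedged sketch of the OCR step, assuming the pytesseract wrapper and the `nep` and `sin` Tesseract language packs are installed. The function names and the choice of Otsu binarisation are assumptions for illustration; the project's actual pre-processing may differ.

```python
# Tesseract traineddata codes for the two target languages.
TESSERACT_LANG = {"nepali": "nep", "sinhalese": "sin"}


def tesseract_lang(language: str, with_english: bool = False) -> str:
    """Build the `lang` argument for Tesseract, e.g. 'nep' or 'nep+eng'."""
    code = TESSERACT_LANG[language.strip().lower()]
    return f"{code}+eng" if with_english else code


def extract_text(image_path: str, language: str) -> str:
    """Grayscale and binarise the image, then run OCR on it.

    Requires opencv-python, pytesseract, and a local Tesseract install.
    """
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise ValueError(f"could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu's method picks one global threshold from the image histogram.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary, lang=tesseract_lang(language))
```

Passing `with_english=True` lets Tesseract also recognise Latin text, which may help on mixed-script documents.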

Audio Processing & Transcription

Uses pydub for audio clean-up (filtering, normalization) and Vosk for offline speech-to-text conversion
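
The audio path could look roughly like the sketch below. It uses pydub's high-pass filter and normalization as a basic stand-in for noise removal, and `model_dir` is a placeholder for a Vosk model directory that must be supplied for the target language. The 100 Hz cutoff and the chunk size are assumed values.

```python
import json


def join_transcript(result_jsons: list[str]) -> str:
    """Merge Vosk result strings (JSON objects with a 'text' field) into one transcript."""
    parts = [json.loads(r).get("text", "") for r in result_jsons]
    return " ".join(p for p in parts if p)


def transcribe(audio_path: str, model_dir: str, rate: int = 16000) -> str:
    """Clean up audio with pydub, then decode it with Vosk.

    Requires pydub (with ffmpeg for compressed formats) and vosk.
    """
    from pydub import AudioSegment
    from pydub.effects import normalize
    from vosk import KaldiRecognizer, Model

    audio = AudioSegment.from_file(audio_path)
    # Vosk expects 16-bit mono PCM at the sample rate given to the recognizer.
    audio = audio.set_channels(1).set_frame_rate(rate).set_sample_width(2)
    # Cut low-frequency hum, then even out the volume (assumed clean-up steps).
    audio = normalize(audio.high_pass_filter(100))

    recognizer = KaldiRecognizer(Model(model_dir), rate)
    pcm = audio.raw_data
    results = []
    step = 8000  # bytes per chunk; even, so 16-bit samples are never split
    for start in range(0, len(pcm), step):
        if recognizer.AcceptWaveform(pcm[start:start + step]):
            results.append(recognizer.Result())
    results.append(recognizer.FinalResult())
    return join_transcript(results)
```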

AI-Powered Language Translation

Integrates Hugging Face Transformers for contextual, fluent Nepali/Sinhalese-to-English translation
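
One way the translation step might be wired up. The source does not name a model, so the `facebook/nllb-200-distilled-600M` checkpoint is an assumption, chosen because it covers both languages. The sentence splitter is deliberately naive (it also splits on decimal points) and is included only to keep each model input short.

```python
# FLORES-200 language codes used by NLLB checkpoints.
NLLB_CODE = {"nepali": "npi_Deva", "sinhalese": "sin_Sinh", "english": "eng_Latn"}

SENTENCE_END = "\u0964.?!"  # Devanagari danda, full stop, question mark, exclamation mark


def split_sentences(text: str) -> list[str]:
    """Split text after each sentence-ending character, dropping empty pieces."""
    sentences, current = [], ""
    for ch in text:
        current += ch
        if ch in SENTENCE_END:
            if current.strip():
                sentences.append(current.strip())
            current = ""
    if current.strip():
        sentences.append(current.strip())
    return sentences


def translate(text: str, source: str,
              model_name: str = "facebook/nllb-200-distilled-600M") -> str:
    """Translate Nepali or Sinhalese text to English.

    Requires transformers and a locally cached copy of the model.
    """
    from transformers import pipeline

    sentences = split_sentences(text)
    if not sentences:
        return ""
    translator = pipeline(
        "translation",
        model=model_name,
        src_lang=NLLB_CODE[source.lower()],
        tgt_lang=NLLB_CODE["english"],
    )
    outputs = translator(sentences, max_length=256)
    return " ".join(o["translation_text"] for o in outputs)
```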

Quick Text Summarization

Employs DistilBART for concise AI-generated summaries of lengthy documents
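
A sketch of how lengthy documents could be summarised in pieces, since DistilBART accepts only a limited input length. The 400-word chunk size, the helper names, and the `sshleifer/distilbart-cnn-12-6` checkpoint are assumptions. The summariser is passed in as a plain function so the chunking logic can be exercised without loading a model.

```python
from typing import Callable


def chunk_words(text: str, max_words: int = 400) -> list[str]:
    """Split text into consecutive chunks of at most `max_words` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long(text: str, summarize: Callable[[str], str], max_words: int = 400) -> str:
    """Summarise each chunk separately and join the partial summaries."""
    return " ".join(summarize(chunk) for chunk in chunk_words(text, max_words))


def distilbart_summarizer(
    model_name: str = "sshleifer/distilbart-cnn-12-6",
) -> Callable[[str], str]:
    """Wrap a Transformers summarization pipeline as a plain str -> str function."""
    from transformers import pipeline

    pipe = pipeline("summarization", model=model_name)

    def summarize(chunk: str) -> str:
        result = pipe(chunk, max_length=130, min_length=30, do_sample=False, truncation=True)
        return result[0]["summary_text"]

    return summarize
```

With a model available, the call would be `summarize_long(document, distilbart_summarizer())`.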

User-Friendly Interface

Next.js and Material UI deliver an intuitive frontend for document upload, results display, and export

Secure Data Management

Uses PostgreSQL for scalable, privacy-focused storage of data and translation history
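
For illustration, translation history might be stored along these lines. The table layout, column names, and the use of psycopg2 are hypothetical; the source only states that PostgreSQL is used.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical table layout, not the project's actual schema.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS translation_history (
    id            BIGSERIAL PRIMARY KEY,
    source_lang   TEXT NOT NULL,
    source_text   TEXT NOT NULL,
    english_text  TEXT NOT NULL,
    summary       TEXT,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now()
)
"""

INSERT_SQL = (
    "INSERT INTO translation_history (source_lang, source_text, english_text, summary) "
    "VALUES (%s, %s, %s, %s) RETURNING id"
)


@dataclass(frozen=True)
class TranslationRecord:
    source_lang: str
    source_text: str
    english_text: str
    summary: Optional[str] = None

    def as_params(self) -> tuple:
        """Values in the same order as the placeholders in INSERT_SQL."""
        return (self.source_lang, self.source_text, self.english_text, self.summary)


def save_record(dsn: str, record: TranslationRecord) -> int:
    """Insert one record and return its id. Requires psycopg2 and a reachable server."""
    import psycopg2

    conn = psycopg2.connect(dsn)
    try:
        with conn, conn.cursor() as cur:  # `with conn` commits on success, rolls back on error
            cur.execute(CREATE_TABLE_SQL)
            cur.execute(INSERT_SQL, record.as_params())
            return cur.fetchone()[0]
    finally:
        conn.close()
```

Parameterised queries keep user-supplied text out of the SQL string itself.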

Offline Operation

All core functionality runs fully on the local device, ensuring privacy, speed, and reliability without an internet connection

Scalability & Integration Ready

Modular design with MCP tool support for easy plug-in integration and expansion into other platforms and AI systems

Continuous Model Improvement

Enables feedback-based fine-tuning via PyTorch, so model output improves as user feedback accumulates
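
A rough sketch of a feedback loop, assuming feedback arrives as records with hypothetical `source`, `output`, and `corrected` fields and that the model is a Hugging Face seq2seq model. The loop is a bare single-example PyTorch loop; batching, evaluation, and checkpointing are left out.

```python
def build_training_pairs(feedback: list[dict]) -> list[tuple[str, str]]:
    """Keep only records where the user's correction differs from the model output."""
    pairs = []
    for item in feedback:
        corrected = (item.get("corrected") or "").strip()
        output = (item.get("output") or "").strip()
        if corrected and corrected != output:
            pairs.append((item["source"], corrected))
    return pairs


def fine_tune(model, tokenizer, pairs, epochs: int = 1, lr: float = 2e-5) -> float:
    """Train on (source, target) pairs and return the mean loss of the last epoch.

    Requires torch and transformers.
    """
    import torch

    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    mean_loss = 0.0
    for _ in range(epochs):
        total = 0.0
        for source, target in pairs:
            batch = tokenizer(source, text_target=target, return_tensors="pt", truncation=True)
            loss = model(**batch).loss  # the model computes the loss when labels are present
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            total += loss.item()
        mean_loss = total / max(len(pairs), 1)
    return mean_loss
```

Filtering to genuine corrections avoids retraining on outputs the user already accepted.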
