Develop a Python SDK for TranscribeIt

Project Size: Medium (175 hours) Difficulty: Medium

Mentors: Keerthana, Shalini, Bowrna

Project Link: https://codeberg.org/fossiaorg/transcribeit

Description

TranscribeIt currently provides multimedia accessibility services, namely audio and video transcription (with optional timestamps and speaker diarization) and video descriptions, through a web interface backed by a FastAPI server. As a result, developers who want to integrate these services into their own software must write custom wrappers around TranscribeIt's RESTful API, which is tedious and error-prone.

A well-documented, configurable Python SDK that factors out the components of the backend server will help ensure modularity. This project focuses on creating a robust, well-documented Python SDK that allows developers to integrate TranscribeIt into scripts, applications, and data pipelines.
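As a rough illustration of what such an SDK could hide from callers, the sketch below wraps a single HTTP call behind a typed client. The route `/transcriptions`, the payload fields, the class names `TranscribeItClient` and `TranscribeOptions`, and the JSON response shape are all assumptions made for illustration; they are not taken from TranscribeIt's actual API.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional
from urllib import request


@dataclass
class TranscribeOptions:
    """Hypothetical per-request options; field names are illustrative only."""
    timestamps: bool = False
    diarization: bool = False
    language: Optional[str] = None


class TranscribeItClient:
    """Thin wrapper that hides HTTP details behind one method call."""

    def __init__(self, base_url: str, opener: Callable = request.urlopen):
        self.base_url = base_url.rstrip("/")
        # The opener is injectable so it can be replaced with a stub in tests.
        self._opener = opener

    def build_request(self, media_url: str, options: TranscribeOptions) -> request.Request:
        # Assemble the JSON body from the typed options.
        payload = {
            "media_url": media_url,
            "timestamps": options.timestamps,
            "diarization": options.diarization,
        }
        if options.language is not None:
            payload["language"] = options.language
        return request.Request(
            f"{self.base_url}/transcriptions",  # assumed route, not the real API
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    def transcribe(self, media_url: str, options: Optional[TranscribeOptions] = None) -> dict:
        # Send the request and decode the JSON response body.
        req = self.build_request(media_url, options or TranscribeOptions())
        with self._opener(req) as resp:
            return json.loads(resp.read().decode("utf-8"))
```

Making the HTTP opener injectable is one possible design choice: it lets the client be exercised in unit tests without a running server.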

Required Skills

  • Python programming

Bonus Skills

  • SDK or library design experience
  • Familiarity with multimedia processing systems: FFmpeg, OpenAI Whisper
  • Python packaging and versioning

Recommended Reading / Resources

Package TranscribeIt Web Application as a Cross-Platform Desktop App (Wails)

Project Size: Small (90 hours) Difficulty: Easy–Medium

Mentors: Aqsa, Keerthana

Project Link: https://codeberg.org/fossiaorg/transcribeit

Description

Package the existing TranscribeIt web application into a cross-platform desktop application using the Wails framework for Windows, macOS, and Linux.

Package the server as an independent service that runs alongside the desktop application (the sidecar pattern), so that multimedia accessibility operations can run locally. This lets end users navigate the application intuitively.

Required Skills

  • TypeScript
  • Next.js

Bonus Skills

  • Go (basics)
  • Wails (or similar frameworks like Tauri, Electron)

Recommended Reading / Resources

Improve User Experience and Accessibility of TranscribeIt Web Interface

Project Size: Medium (175 hours) Difficulty: Medium

Mentors: Deepraj, Keerthana, Shalini

Project Link: https://codeberg.org/fossiaorg/transcribeit

Description

Improve the usability and accessibility of the TranscribeIt web interface and make it WCAG 2.2 AA compliant by performing an accessibility audit, remediation, testing, and validation during development.

Improve the end-user experience by integrating multimedia accessibility features such as keyboard bindings and customization of accessibility services (transcription, diarization, video description, etc.).

Required Skills

  • Next.js/React
  • WCAG knowledge (2.2)
  • Accessibility testing with axe-core, WAVE and manual testing
  • State Management

Recommended Reading / Resources

Integrate ASR Support for Indic Languages in TranscribeIt

Project Size: Small (90 hours) Difficulty: Medium

Mentors: Bowrna, Shalini

Project Link: https://codeberg.org/fossiaorg/transcribeit

Description

Currently, the project integrates faster-whisper, a reimplementation of OpenAI Whisper, which provides timestamped transcription for multi-lingual audio (an audio stream containing multiple languages). However, the accuracy of the base model for Indic languages is not optimal. Extend the ASR capabilities by adding on-demand support for multiple Indic ASR models, such as Whisperx-Hindi, to reduce the word error rate (WER) from 170% to 5%. Similarly, extend support to other Indian languages, and integrate multi-lingual detection with these language models in a parallelized manner. This would significantly improve accessibility for the Indian audience.
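For context on the metric, WER (word error rate) is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. Because insertions count as errors, WER can exceed 100% when the output contains many spurious words. The function below is a minimal sketch of that calculation; the name `word_error_rate` and the whitespace tokenization are assumptions for illustration, and a real evaluation would normally normalize text first and use an established library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Two reference words, four wrong output words: 2 substitutions + 2 insertions.
print(word_error_rate("good morning", "could mourn in the"))  # 2.0, i.e. 200%
```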

Required Skills

  • Python
  • Basics of machine learning or NLP
  • Audio processing fundamentals

Bonus Skills

  • Experience with Whisper, Kaldi, or Vosk
  • Knowledge of Indic language datasets

Recommended Reading / Resources

Develop Optimized Multi-Lingual Video Description Generation for framestoryx

Project Size: Medium (175 hours) Difficulty: Medium

Mentors: Keerthana, Shalini, Deepraj, Bowrna

Project Link: https://codeberg.org/fossiaorg/framestoryx

Description

Build an optimized, portable pipeline for generating multi-lingual video descriptions to improve accessibility for visually impaired users.

Required Skills

  • Python
  • Computer vision basics
  • NLP fundamentals

Recommended Reading / Resources

Develop and Deploy Backend for TravelFolks

Project Size: Large (350 hours) Difficulty: Medium–Hard

Mentors: Varshha, Deepraj, Keerthana

Description

Design and deploy a scalable backend to support TravelFolks user accounts, travel listings, recommendations, and accessibility features.

Required Skills

  • Backend development (Node.js / Python / Java)
  • RESTful API design
  • SQL or NoSQL databases

Recommended Reading / Resources

Develop an Accessible Web Interface for TravelFolks

Project Size: Large (350 hours) Difficulty: Medium

Mentors: Varshha, Deepraj

Description

Build a modern, accessible, and user-friendly web interface for TravelFolks, ensuring inclusivity for users with diverse accessibility needs.

Required Skills

  • HTML, CSS, JavaScript
  • Frontend frameworks (React, Vue, etc.)
  • Accessibility best practices

Bonus Skills

  • UX research and usability testing
  • Design systems and component libraries

Recommended Reading / Resources