JakobGr/ADA

This repository contains code for training and evaluating transformer-based models such as TimeSformer and VideoMAE for sign language recognition on the WLASL (Word-Level American Sign Language) dataset. The project includes frame sampling techniques, preprocessing pipelines, fine-tuning strategies, and performance evaluation using top-1, top-5, and top-10 accuracy.
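
Because results are reported as top-1, top-5, and top-10 accuracy, the following is a minimal sketch of how such top-k metrics can be computed with PyTorch; the function and variable names are illustrative and are not taken from the notebooks.

```python
import torch

def topk_accuracy(logits: torch.Tensor, labels: torch.Tensor, ks=(1, 5, 10)):
    """Return a dict mapping k -> top-k accuracy for a batch of predictions.

    logits: (batch, num_classes) raw scores from the model
    labels: (batch,) integer ground-truth gloss ids
    """
    max_k = max(ks)
    # Indices of the max_k highest-scoring classes per clip: (batch, max_k)
    _, top_pred = logits.topk(max_k, dim=1)
    # Compare every top prediction against the ground-truth label
    correct = top_pred.eq(labels.unsqueeze(1))
    # A clip counts as correct at k if its label appears among the first k guesses
    return {k: correct[:, :k].any(dim=1).float().mean().item() for k in ks}

# Toy example: 8 clips, 100 WLASL glosses
logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
print(topk_accuracy(logits, labels))
```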


Project Structure

The repository is organized as follows:

  • WLASL100_videos: Folder containing video data used for training and evaluation.
  • nslt_100.json: Annotation file mapping videos to their corresponding labels and subsets (train, validation, test); a loading sketch follows this list.
  • TimeSformer.ipynb: Notebook implementing the TimeSformer model for ASL recognition.
  • VideoMAE.ipynb: Initial implementation of VideoMAE for ASL recognition.
  • VideoMAE_correct_split.ipynb: Corrected implementation of VideoMAE with proper dataset splits.
  • VideoMAE_correct_split.pth: Pretrained weights for the corrected VideoMAE implementation.
  • .gitattributes: Configuration for handling large files in GitHub.
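
As a companion to the annotation entry above, the sketch below shows one way nslt_100.json could be loaded and grouped into train/validation/test splits. The "subset" and "action" keys follow the public WLASL annotation format, and the .mp4 extension is assumed; verify both against the files in this repository before relying on them.

```python
import json
import os
from collections import defaultdict

# Paths assume the layout described above; adjust to the local checkout.
ANNOTATION_FILE = "nslt_100.json"
VIDEO_DIR = "WLASL100_videos"

with open(ANNOTATION_FILE) as f:
    annotations = json.load(f)

# Group (video path, label) pairs by subset so each split gets its own loader.
splits = defaultdict(list)
for video_id, entry in annotations.items():
    # The public WLASL release stores a "subset" name ("train"/"val"/"test")
    # and an "action" list whose first element is the gloss label id; confirm
    # these keys against the annotation file shipped here.
    label = entry["action"][0]
    video_path = os.path.join(VIDEO_DIR, f"{video_id}.mp4")
    splits[entry["subset"]].append((video_path, label))

print({subset: len(items) for subset, items in splits.items()})
```

Each split can then be wrapped in a dataset class that samples frames from the listed videos before feeding them to TimeSformer or VideoMAE.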

Other trained models and configurations are available upon request.
