VGGish

VGGish features come from a CNN pretrained by Google (research paper). Apple has a clear, accessible explanation.

They benchmark their approach against AudioSet (note: it also contains animal sounds!). It seems to be just labels attached to YouTube videos?

  • AED: Acoustic Event Detection
  • VGGish: Seems to be features from a pretrained CNN? Not sure, but the repo is linked here
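
To make "features from a pretrained CNN" a bit more concrete: in the public VGGish release, each embedding covers 0.96 s of audio and has 128 dimensions, so a clip turns into a (frames, 128) array. The sketch below only estimates that shape from the clip length; `vggish_output_shape` is a hypothetical helper, it assumes non-overlapping windows, and it does not run the model itself.

```python
import math
from typing import Tuple

# Values from the public VGGish release: each embedding covers 0.96 s of
# audio and has 128 dimensions.
VGGISH_WINDOW_SECONDS = 0.96
VGGISH_EMBEDDING_SIZE = 128


def vggish_output_shape(clip_seconds: float) -> Tuple[int, int]:
    """Estimate (num_frames, embedding_size) for a clip.

    Assumption: windows do not overlap and a trailing partial window is
    dropped. This is a hypothetical helper, not part of any library.
    """
    num_frames = math.floor(clip_seconds / VGGISH_WINDOW_SECONDS)
    return (num_frames, VGGISH_EMBEDDING_SIZE)


print(vggish_output_shape(10.0))  # (10, 128)
```

Under these assumptions a 10-second clip gives 10 frames of 128 values each.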

Where do I find the code?

Currently working on data preprocessing in data.ipynb.

How to run

  1. Download the OpenMIC-2018 dataset and add it as a subfolder of data (data/openmic-2018/all/goes/here).
  2. Make sure you have Docker up and running, and open the project in Visual Studio Code.
  3. VS Code should prompt you to open the project in the .devcontainer.
  4. Accept.
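
Before opening the notebook, it can help to confirm that step 1 worked. The sketch below checks that the dataset folder exists and is not empty. The path `data/openmic-2018` is taken from step 1, `dataset_status` is a hypothetical helper, and the check says nothing about whether the files inside are complete.

```python
from pathlib import Path

# Assumed location, taken from step 1; adjust if your layout differs.
DATASET_DIR = Path("data") / "openmic-2018"


def dataset_status(dataset_dir: Path) -> str:
    """Return "missing", "empty", or "ok" for the given folder.

    Hypothetical helper: it only checks that the folder exists and
    contains at least one entry, not that the dataset is complete.
    """
    if not dataset_dir.is_dir():
        return "missing"
    if not any(dataset_dir.iterdir()):
        return "empty"
    return "ok"


if __name__ == "__main__":
    # Run from the project root so the relative path resolves.
    print(f"{DATASET_DIR}: {dataset_status(DATASET_DIR)}")
```

Run it from the project root, either inside the dev container or on the host, since the path is relative.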