Steps to reproduce:

  1. Clone the repository with the shared task data:
     https://github.com/sigtyp/ST2024.git
  2. Install the requirements (in a virtual environment):
     pip install -r requirements.txt
  3. Convert the data for training:
     python convert_lemmatisation.py
     python convert_mlm.py
     python convert_mlm_decomposed.py
     python convert_tagging.py
  4. Train custom tokenizers
  5. Train the models
  6. Make predictions
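
The first three steps can be chained in a small helper script. The following is a minimal sketch, not part of the repository: the function names `build_commands` and `run_steps` are hypothetical, and it assumes that `requirements.txt` and the four conversion scripts sit in the current working directory and that `git` is on the `PATH`. It uses `python -m pip` in place of the bare `pip` command so the install targets the active interpreter. Steps 4 to 6 are left out because no commands are given for them.

```python
import subprocess
import sys

# URL and script names are taken from the steps above.
DATA_REPO = "https://github.com/sigtyp/ST2024.git"
CONVERSION_SCRIPTS = [
    "convert_lemmatisation.py",
    "convert_mlm.py",
    "convert_mlm_decomposed.py",
    "convert_tagging.py",
]


def build_commands(python=sys.executable):
    """Return the commands for steps 1-3 as argument lists, in order."""
    commands = [
        ["git", "clone", DATA_REPO],
        # Assumption: requirements.txt is in the current working directory.
        [python, "-m", "pip", "install", "-r", "requirements.txt"],
    ]
    # Assumption: the conversion scripts are in the current working directory.
    commands += [[python, script] for script in CONVERSION_SCRIPTS]
    return commands


def run_steps(dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError at the first failing step.
            subprocess.run(command, check=True)
```

Calling `run_steps()` only prints the six commands, which makes it easy to check them first. `run_steps(dry_run=False)` executes them in order and stops at the first failure.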