- 🚀[2025/2/28] EgoGPT codebase is released!
- Clone this repository.
```bash
git clone https://github.com/egolife-ntu/EgoLife
cd EgoLife/EgoGPT
```
- Install the dependencies.
```bash
conda create -n egogpt python=3.10
conda activate egogpt
pip install --upgrade pip
pip install -e .
```
- Install the dependencies for training and inference.
```bash
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
```
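Optionally, sanity-check that the key dependencies import cleanly (a minimal sketch; it assumes a CUDA-capable GPU is visible):
```python
# Verify that PyTorch sees the GPU and that flash-attn built correctly.
import torch
import flash_attn

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("flash-attn:", flash_attn.__version__)
```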
- Download EgoGPT-7b from 🤗EgoGPT and the audio encoder from Audio Encoder.
- Download the EgoIT dataset from 🤗Huggingface and construct the directory as follows:
```python
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="EgoGPT/EgoIT_Video",
    repo_type="dataset",
    local_dir="data",
)
```
```
data/                 # The directory for videos and audio (keep the same as the huggingface dataset)
├── ADL/
│   ├── images/
│   └── audio/
├── ChardesEgo/
│   ├── *.mp4
│   └── ...
└── ...
datasets/             # The directory for JSON annotation files
├── ADL/
│   └── ADL.json
├── ChardesEgo/
│   └── ChardesEgo.json
├── ...
└── EgoIT.json        # The concatenated JSON for training
```
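The released dataset should already match this layout. If you assemble the per-dataset annotations yourself, the sketch below builds the concatenated `EgoIT.json` (it assumes each `datasets/<name>/<name>.json` is a flat JSON list of samples sharing one schema; verify the result against the released file before training):
```python
import json
from pathlib import Path

DATASET_DIR = Path("datasets")

# Collect every per-dataset annotation file (e.g. datasets/ADL/ADL.json)
# and concatenate their sample lists into a single training file.
samples = []
for sub in sorted(p for p in DATASET_DIR.iterdir() if p.is_dir()):
    ann = sub / f"{sub.name}.json"
    if ann.is_file():
        samples.extend(json.loads(ann.read_text()))

(DATASET_DIR / "EgoIT.json").write_text(json.dumps(samples, indent=2))
print(f"Wrote {len(samples)} samples to {DATASET_DIR / 'EgoIT.json'}")
```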
- If you want to train EgoGPT from scratch (e.g., from LLaVA-OneVision), please download the audio projector from here or via:
```python
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="lmms-lab/EgoIT-99K",
    filename="speech_projector/7B/speech_projector_7b.bin",
    local_dir="./",
    repo_type="model",
)
```
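With `local_dir`, `hf_hub_download` keeps the repo-relative path, so the projector lands at `./speech_projector/7B/speech_projector_7b.bin`; point `SPEECH_PROJECTOR_PATH` in the training script at this file.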
Run the following command for inference:
```bash
python inference.py \
    --pretrained_path checkpoints/EgoGPT-7b-EgoIT-EgoLife \
    --video_path data/train/A1_JAKE/DAY1/DAY1_A1_JAKE_11223000.mp4 \
    --audio_path audio/DAY1_A1_JAKE_11223000.mp3 \
    --query "Please describe the video in detail."
```
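To caption several clips in one run, here is a minimal batch wrapper around `inference.py` (a sketch: the clip directory and the assumption that each `.mp4` has a same-named `.mp3` under `audio/` are hypothetical, so adjust the paths to your data):
```python
import subprocess
from pathlib import Path

VIDEO_DIR = Path("data/train/A1_JAKE/DAY1")  # hypothetical clip directory
AUDIO_DIR = Path("audio")

# Invoke inference.py once per clip, pairing each .mp4 with the
# same-named .mp3 in AUDIO_DIR.
for video in sorted(VIDEO_DIR.glob("*.mp4")):
    audio = AUDIO_DIR / video.with_suffix(".mp3").name
    subprocess.run(
        [
            "python", "inference.py",
            "--pretrained_path", "checkpoints/EgoGPT-7b-EgoIT-EgoLife",
            "--video_path", str(video),
            "--audio_path", str(audio),
            "--query", "Please describe the video in detail.",
        ],
        check=True,
    )
```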
Run the following command to start a demo identical to the EgoGPT Demo:
```bash
python gradio_demo.py
```
Please replace the `DATA_PATH`, `MODEL_PATH`, `SPEECH_PROJECTOR_PATH`, and `SPEECH_ENCODER_PATH` in the following command with your own paths.
```bash
bash scripts/train_egogpt.sh
```
Our evaluations are conducted with lmms-eval. Please refer to the lmms-eval repository for the evaluation setup.
```bash
python3 -m accelerate.commands.launch \
    --main_process_port 10043 \
    --num_processes=8 \
    -m lmms_eval \
    --model egogpt \
    --model_args pretrained=YOUR_EGOGPT_MODEL_PATH,conv_template="qwen_1_5" \
    --tasks egoplan,egothink \
    --batch_size 1 \
    --log_samples \
    --output_path YOUR_OUTPUT_PATH
```
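With `--log_samples`, lmms-eval also writes per-sample model outputs alongside the aggregate scores under `YOUR_OUTPUT_PATH`, which is useful for post-hoc analysis.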
Our code is released under the Apache-2.0 License.
If you find our work useful, please cite:
```bibtex
@inproceedings{yang2025egolife,
  title={EgoLife: Towards Egocentric Life Assistant},
  author={Yang, Jingkang and Liu, Shuai and Guo, Hongming and Dong, Yuhao and Zhang, Xiamengwei and Zhang, Sicheng and Wang, Pengyun and Zhou, Zitang and Xie, Binzhu and Wang, Ziyue and Ouyang, Bei and Lin, Zhengyu and Cominelli, Marco and Cai, Zhongang and Zhang, Yuanhan and Zhang, Peiyuan and Hong, Fangzhou and Widmer, Joerg and Gringoli, Francesco and Yang, Lei and Li, Bo and Liu, Ziwei},
  booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025},
}
```