
The EgoLife Project


teaser.png

Figure 1. Overview of the EgoLife Project. EgoLife is an ambitious egocentric AI project capturing the multimodal daily activities of six participants over a week. Using Meta Aria glasses, synchronized third-person cameras, and mmWave sensors, it provides a rich dataset for long-term video understanding. Leveraging this dataset, the project enables AI assistants, powered by EgoGPT and EgoRAG, to support memory, habit tracking, event recall, and task management, advancing real-world egocentric AI applications.

🚀 News

🤹 2025-02: We provide a Hugging Face Gradio demo and a self-deployable demo for EgoGPT.

🌟 2025-02: The EgoLife videos are released on Hugging Face and uploaded to YouTube as a video collection.

🌟 2025-02: We release the EgoIT-99K dataset on Hugging Face.

🌟 2025-02: We release the first version of the EgoGPT and EgoRAG codebases.

📖 2025-02: Our arXiv submission is currently on hold. For an overview, please visit our academic page.

🎉 2025-02: The paper is accepted to CVPR 2025. You are warmly invited to visit our online EgoHouse.

What is in this repo?

🧠 EgoGPT: Clip-Level Multimodal Understanding

EgoGPT is an omni-modal vision-language model fine-tuned on egocentric datasets. It performs continuous video captioning, extracting key events, actions, and context from first-person video and audio streams.

Key Features:

  • Dense captioning for visual and auditory events.
  • Fine-tuned for egocentric scenarios (optimized for EgoLife data).
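
To make clip-level captioning concrete, here is a minimal, hypothetical sketch of splitting an egocentric recording into fixed-length clips and captioning each one. The `ClipCaption` container, the `caption_video` helper, and the `model.generate(...)` / `model.video_duration(...)` calls are illustrative assumptions, not the actual EgoGPT API; see the EgoGPT/ directory for the real inference entry points.

```python
# Hypothetical sketch of clip-level dense captioning with EgoGPT.
# All names below are assumptions for illustration, not the repository's API.

from dataclasses import dataclass
from typing import List


@dataclass
class ClipCaption:
    start_s: float  # clip start time in seconds
    end_s: float    # clip end time in seconds
    caption: str    # dense caption of visual and auditory events


def caption_video(model, video_path: str, clip_len_s: float = 30.0) -> List[ClipCaption]:
    """Split a recording into fixed-length clips and caption each clip.

    `model` is assumed to expose `.video_duration(path)` and
    `.generate(path, start, end)`; adapt these calls to the real
    EgoGPT inference interface.
    """
    captions = []
    duration = model.video_duration(video_path)  # assumed helper
    t = 0.0
    while t < duration:
        end = min(t + clip_len_s, duration)
        text = model.generate(video_path, start=t, end=end)
        captions.append(ClipCaption(start_s=t, end_s=end, caption=text))
        t = end
    return captions
```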

📖 EgoRAG: Long-Context Question Answering

EgoRAG is a retrieval-augmented generation (RAG) module that enables long-term reasoning and memory reconstruction. It retrieves relevant past events and synthesizes contextualized answers to user queries.

Key Features:

  • Hierarchical memory bank (hourly, daily summaries).
  • Time-stamped retrieval for context-aware Q&A.
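
As a rough illustration of the two ideas above, the sketch below stores time-stamped entries at clip, hourly, and daily levels and retrieves the most relevant ones up to a given timestamp, so answers about past events never leak future context. The `MemoryBank` class and its naive keyword scoring are assumptions for illustration only; the actual retrieval pipeline lives in the EgoRAG/ module.

```python
# Hypothetical sketch of a hierarchical memory bank with time-stamped retrieval.
# Names and scoring are illustrative, not EgoRAG's real implementation.

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryEntry:
    timestamp: float  # seconds since the start of the recording week
    level: str        # "clip", "hour", or "day"
    text: str         # caption or summary stored at this level


@dataclass
class MemoryBank:
    entries: List[MemoryEntry] = field(default_factory=list)

    def add(self, timestamp: float, level: str, text: str) -> None:
        self.entries.append(MemoryEntry(timestamp, level, text))

    def retrieve(self, query: str, before: Optional[float] = None, k: int = 5) -> List[MemoryEntry]:
        """Return the k entries most relevant to `query`, optionally restricted
        to events that happened before a given timestamp. Relevance here is a
        naive keyword overlap; a real system would use an embedding retriever."""
        candidates = [e for e in self.entries if before is None or e.timestamp <= before]
        query_words = set(query.lower().split())
        scored = sorted(
            candidates,
            key=lambda e: len(query_words & set(e.text.lower().split())),
            reverse=True,
        )
        return scored[:k]


# Example: restrict retrieval to events from the first 36 hours of the week.
# bank.retrieve("where did I leave my badge?", before=36 * 3600, k=3)
```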

📂 Code Structure

EgoLife/
│── assets/                # General assets used across the project
│── EgoGPT/                # Core module for egocentric omni-modal model
│── EgoRAG/                # Retrieval-augmented generation (RAG) module
│── README.md              # Main documentation for the overall project

Please dive into the EgoGPT and EgoRAG subprojects for more details.

📢 Citation

If you use EgoLife in your research, please cite our work:

@inproceedings{yang2025egolife,
  title={EgoLife: Towards Egocentric Life Assistant},
  author={Yang, Jingkang and Liu, Shuai and Guo, Hongming and Dong, Yuhao and Zhang, Xiamengwei and Zhang, Sicheng and Wang, Pengyun and Zhou, Zitang and Xie, Binzhu and Wang, Ziyue and Ouyang, Bei and Lin, Zhengyu and Cominelli, Marco and Cai, Zhongang and Zhang, Yuanhan and Zhang, Peiyuan and Hong, Fangzhou and Widmer, Joerg and Gringoli, Francesco and Yang, Lei and Li, Bo and Liu, Ziwei},
  booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025},
}

📝 License

This project is licensed under the MIT License. See the LICENSE file for details.

Star History

Star History Chart