Through it all
Computer Vision;
Master@ZJU; Bachelor@NUAA
- Zhejiang University
- Hangzhou
- https://yuqianyuan.github.io/
Highlights
- Pro
Pinned
- TokenPacker (Public): The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".
- LiWentomng/BoxInstSeg (Public): A toolbox for box-supervised instance segmentation.
- DAMO-NLP-SG/VideoRefer (Public): The code for "VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM"
- DAMO-NLP-SG/VideoLLaMA3 (Public): Frontier Multimodal Foundation Models for Image and Video Understanding