邓安东 Andong Deng

I am a second-year PhD student at the Center for Research in Computer Vision, University of Central Florida, advised by Dr. Chen Chen.

From 2021 to 2022, I spent a wonderful period of time at GeWu Lab, working with Dr. Di Hu. I also worked as a deep learning research intern at Big Data Lab, Baidu, with Dr. Xingjian Li and Dr. Dejing Dou. Prior to that, I obtained my master's degree from Shanghai Jiao Tong University in 2021 and worked as a research assistant advised by Dr. Weiwei Cai. I spent a few stupid years from 2016 to 2018, during which I obtained my bachelor's degree from Sichuan University in 2017.

Beyond research, I regularly spend time on photography, hold'em, music, and a variety of sports (especially basketball and surfing, but not soccer).

Email  /  Curriculum Vitae  /  X (Twitter)  /  知乎 Zhihu  /  Google Scholar

Research Interests

I'm interested in computer vision, video understanding, multimodal learning, and cognitive science.

Updates

[2024-11-17] GroundMoRe is now available on arXiv!

[2024-06-25] GroundMoRe will be coming soon!

[2024-05-27] I will be joining UII this summer as a research intern with Dr. Zhongpai Gao and Dr. Ziyan Wu.

[2023-07-14] Two papers accepted by ICCV 2023!

[2023-07-02] One paper accepted by IEEE TCSVT 2023! Congrats to Shoubin and Zhongying!

[2023-03-13] One paper accepted by ICME 2023! Congrats to Wenke!

[2022-03-09] I will be joining UCF this summer as a CS PhD student with Dr. Chen Chen.

[2022-02-03] One paper accepted by CVPR 2022! Congrats to Xiaokang and Yake!

* equal contribution

Preprints
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng, Tongjia Chen, Shoubin Yu, Taojiannan Yang, Lincoln Spencer, Yapeng Tian, Ajmal Saeed Mian, Mohit Bansal and Chen Chen
arXiv, 2024
arxiv / project

We present GroundMoRe, a benchmark for the novel task of Motion-Grounded Video Reasoning, designed to assess multimodal models' reasoning and perception capabilities for motion understanding.

Order-aware Interactive Segmentation
Bin Wang, Anwesa Choudhuri, Meng Zheng, Zhongpai Gao, Benjamin Planche, Andong Deng, Qin Liu, Terrence Chen, Ulas Bagci and Ziyan Wu
arXiv, 2024
arxiv / project

We propose OIS (order-aware interactive segmentation) to explicitly integrate the missing relative depth information into 2D interactive segmentation.

Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports
Haopeng Li, Andong Deng, Qiuhong Ke, Jun Liu, Hossein Rahmani, Yulan Guo, Bernt Schiele and Chen Chen
arXiv, 2024
arxiv / code

In this paper, we introduce Sports-QA, the first dataset specifically designed for the sports-related VideoQA task.

Publications
A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition
Andong Deng*, Taojiannan Yang*, and Chen Chen
ICCV, 2023
arxiv / code / CVF / Supp

We introduce BEAR, a new BEnchmark on video Action Recognition. BEAR is a collection of 18 video datasets grouped into 5 categories, covering a diverse set of real-world applications.

Towards Inadequately Pre-trained Models in Transfer Learning
Andong Deng*, Xingjian Li*, Di Hu, Tianyang Wang, Haoyi Xiong and Chengzhong Xu
ICCV, 2023
arxiv / CVF / Supp

Inadequately pre-trained models often extract better features than fully pre-trained ones, and deep models tend to first learn spectral components corresponding to large singular values.

Regularity Learning via Explicit Distribution Modeling for Skeletal Video Anomaly Detection
Shoubin Yu, Zhongying Zhao, Haoshu Fang, Andong Deng, Haisheng Su, Dongliang Wang, Weihao Gan, Cewu Lu and Wei Wu
IEEE Transactions on Circuits and Systems for Video Technology, 2022
arxiv

A motion prior distribution is utilized to construct the pose representation. MoPRL achieves state-of-the-art performance, with an average improvement of 4.7% AUC on several challenging datasets.

Robust Cross-Modal Knowledge Distillation for Unconstrained Videos
Wenke Xia, Xingjian Li, Andong Deng, Haoyi Xiong, Dejing Dou, and Di Hu
ICME, 2023
arxiv

We propose a Modality Noise Filter module to erase irrelevant noise in the teacher modality using cross-modal context, and design a Contrastive Semantic Calibration module to adaptively distill useful knowledge for the target modality by referring to the differentiated sample-wise semantic correlation in a contrastive fashion.

Balanced Multimodal Learning via On-the-fly Gradient Modulation
Xiaokang Peng*, Yake Wei*, Andong Deng, Dong Wang and Di Hu
CVPR, 2022   (Oral Presentation)
arxiv / CVF / code

We modulate the gradients of the two modalities to adaptively balance the optimization process of multimodal learning.

Service

Conference Reviewer: CVPR 2022, ECCV 2022, ICCV 2023, and ECCV 2024

Journal Reviewer: IEEE Internet of Things Journal, IEEE Transactions on Neural Networks and Learning Systems

Gallery

Updated: November 17th, 2024.


Thanks to Jon Barron for this amazing template.