👋 Hello, it's a happy day~~~
- My name is Jinrong Zhang, and I am a researcher specializing in computer vision and multimodal large models. 🚀
- If you have any interest in collaboration or academic exchange, please feel free to contact me.
🧑‍💻 About Me
📚 PhD Student in Electronic Information at Harbin Institute of Technology, Shenzhen.
🔬 Research Interests:
- Video Understanding and Generation
- Multimodal Representation
- Temporal Action Segmentation
📄 Research Papers
I love publishing and sharing my findings with the world! Here's a list of some of my published research papers:
Just a Few Glances: Open-Set Visual Perception with Image Prompt Paradigm – AAAI, CCF-A, 2025
End-to-End Streaming Video Temporal Action Segmentation with Reinforce Learning – TNNLS, CCF-B, IF=10.2, 2025
Flexible Streaming Temporal Action Segmentation with Diffusion Models – ICME, CCF-B, 2025
DTOS: Dynamic Time Object Sensing with Large Multimodal Model – CVPR, CCF-A, 2025
Cluster-Refined Optimal Transport for Unsupervised Action Segmentation – ICASSP, CCF-B, 2025
Unsupervised Temporal Action Segmentation Based on Wavelet Feature Processing – IJCNN, CCF-C, 2025
On the Papers page, you can also access the key details of these research papers.
💼 Internship Experience
- Xiaomi AI Lab – AI Research Intern
2024/2 – 2025/10
- I provided a large-model solution for access permission detection at the Xiaomi car factory and successfully implemented it.
- During my internship, I published a paper at AAAI.
🌐 My Profile
- 🌍 Website: Google Scholar