Doctoral Forum

Doctoral Forum Mentors

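Cairong Zhao (赵才荣)

Tongji University

Xiongkuo Min (闵雄阔)

Shanghai Jiao Tong University

Chao Ma (马超)

Shanghai Jiao Tong University

Dingwen Zhang (张鼎文)

Northwestern Polytechnical University

Shanshan Zhang (张姗姗)

Nanjing University of Science and Technology

Tianyang Xu (徐天阳)

Jiangnan University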

Doctoral Student Speakers

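Zifan Song (宋子帆)

Tongji University

Ling Zhang (张玲)

East China Normal University

Jianzhi Lu (卢健之)

Fudan University

Feifeng Wang (王飞锋)

Shanghai University

Hao Zhang (张豪)

Shanghai Jiao Tong University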

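Cairong Zhao (赵才荣)

Tongji University

Mentor Profile

Cairong Zhao is a professor and doctoral supervisor in the School of Computer Science and Technology at Tongji University and head of its Department of Big Data and Artificial Intelligence. Zhao chairs the Computer Vision Technical Committee of the Shanghai Computer Society, serves as secretary-general of the Youth Working Committee of the China Society of Image and Graphics (CSIG), and is a young editorial board member of the Journal of Image and Graphics (《中国图象图形学报》), Computer Science (《计算机科学》), and Computer Engineering (《计算机工程》). Zhao's main research area is computer vision, with a focus on efficient and trustworthy person re-identification, multimodal data-driven autonomous driving, and knowledge representation and reasoning for vertical-domain models. Zhao has published more than 50 papers in venues including TPAMI, IJCV, IEEE TIP, IEEE TIFS, Scientia Sinica Informationis (《中国科学：信息科学》), CVPR, ICML, NIPS, ICLR, AAAI, ACM MM, and ECCV, and holds 18 granted invention patents. Honors include the First Prize of the 2022 Shanghai Science and Technology Progress Award (ranked 4th of 13 contributors), the Second Prize of the 2023 Shanghai Natural Science Award (ranked 1st of 4), the 2023 Hot Paper Award of Scientia Sinica Informationis, the Grand Prize at the finals of the 2024 National Artificial Intelligence Application Scenario Innovation Challenge (as project chief scientist), and the First Prize of the 2024 Shanghai Computer Society Teaching Achievement Award (ranked 1st of 8). Zhao has led four National Natural Science Foundation of China (NSFC) projects. Students supervised by Zhao have received the Outstanding Master's Thesis awards of the Chinese Institute of Electronics and of the Shanghai Computer Society.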

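Xiongkuo Min (闵雄阔)

Shanghai Jiao Tong University

Mentor Profile

Xiongkuo Min is an associate professor at Shanghai Jiao Tong University whose main research area is multimedia signal processing. Min has led National Natural Science Foundation of China (NSFC) projects including a Young Scientists Fund (Category B) project and a General Program project, and was selected for the Postdoctoral Innovative Talent Support Program (博新计划). Domestic honors include the Wu Wenjun AI Youth Science and Technology Award (吴文俊人工智能青年科技奖), the Outstanding Doctoral Dissertation Award of the Chinese Institute of Electronics, the First Prize of the CSIG Technological Invention Award, and the First Prize of the Chinese Institute of Electronics Technological Invention Award. Min has also received nine international paper awards, including the IEEE TBC Best Paper Award and an IEEE TMM Best Paper Award nomination, and seven first-place finishes in international challenges held at CVPR, ECCV, and other venues. Min has published more than 100 papers in IEEE/ACM Transactions and CCF-A journals and conferences, 10 of which were selected as ESI Hot or Highly Cited Papers, has more than 10,000 Google Scholar citations, and serves on the editorial boards of journals including ACM TOMM.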

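Chao Ma (马超)

Shanghai Jiao Tong University

Mentor Profile

Chao Ma is a professor and doctoral supervisor at the Artificial Intelligence Institute of Shanghai Jiao Tong University, a Shanghai Pujiang Talent, and a recipient of the CSIG Outstanding Doctoral Dissertation Award. Ma earned a PhD through a joint training program between Shanghai Jiao Tong University and the University of California, Merced, and was a postdoctoral researcher at the Australian Centre for Robotic Vision (University of Adelaide). Ma's main research area is computer vision. With more than 14,000 Google Scholar citations, Ma has been named an Elsevier Highly Cited Chinese Researcher for five consecutive years (2020-2024). Ma chairs the CSIG Outstanding Doctoral Dissertation Club and is deputy secretary-general of the CSIG Youth Working Committee. Ma has served as an area chair for CVPR 2024/2025, ICCV 2025, ICLR 2025/2026, and AAAI 2026, and sits on the editorial boards of IEEE Transactions on Multimedia (TMM) and the Journal of Artificial Intelligence Research (JAIR). Ma leads an NSFC Young Scientists Fund (Category B) project. Awards include the CSIG Young Scientist Award, the sole Best Paper Award at the 30th International Conference on Multimedia Modeling (MMM 2024), and Huawei's 2021 Outstanding Technical Achievement Award in the technical cooperation field.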

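Dingwen Zhang (张鼎文)

Northwestern Polytechnical University

Mentor Profile

Dingwen Zhang is a professor at Northwestern Polytechnical University, a recipient of the National Science Fund for Excellent Young Scholars, and a Clarivate "Highly Cited Researcher". In 2015 Zhang went to Carnegie Mellon University in the United States for a two-year research visit. Zhang's work aims to build a new generation of computer vision learning frameworks that operate in open environments and have dynamic learning capabilities. To date, Zhang has published more than 60 papers as first or corresponding author in major international journals and conferences, including TPAMI, IJCV, IEEE SPM, TIP, CVPR, ICCV, and Science China: Information Science. Zhang was selected for the China Postdoctoral Innovative Talent Program and the AI Chinese Young Scholars list (AI 华人青年学者榜单), and has received the Wu Wenjun AI Outstanding Youth Award (吴文俊人工智能优秀青年奖), the 2021 IEEE TCSVT Best Paper Award, and the CSIG Outstanding Doctoral Dissertation Award. Zhang is deputy secretary-general of the CSIG Youth Working Committee and an associate editor of IEEE TMM and TCSVT.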

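Shanshan Zhang (张姗姗)

Nanjing University of Science and Technology

Mentor Profile

Shanshan Zhang is a professor and doctoral supervisor at the School of Computer Science (School of Artificial Intelligence), Nanjing University of Science and Technology, and a recipient of the National Science Fund for Excellent Young Scholars and the Jiangsu Province Distinguished Young Scholars Fund. Her research areas are pattern recognition and computer vision. She received her PhD from the University of Bonn, Germany, and was a postdoctoral researcher at the Max Planck Institute for Informatics. In 2018 she was selected for the Young Elite Scientists Sponsorship Program of the China Association for Science and Technology and for Microsoft's StarTrack scholar program (铸星学者); in 2021 she received the CSIG Shi Qingyun Women Scientist Award; from 2022 to 2024 she was named an Elsevier Highly Cited Chinese Researcher for three consecutive years; and in 2024 she received an outstanding project award from the CAAI-Huawei MindSpore Academic Fund. She currently serves on the editorial board of the journal Pattern Recognition, as an area chair for CVPR, ICCV, AAAI, and other conferences, as deputy director of the Jiangsu Key Laboratory of Image and Video Understanding for Social Security, and as a VALSE executive area chair.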

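Tianyang Xu (徐天阳)

Jiangnan University

Mentor Profile

Tianyang Xu is an associate professor and doctoral supervisor at Jiangnan University. Xu's research covers video understanding and multimodal fusion. Xu has published more than 100 journal and conference papers, including more than 50 in CCF-A venues and IEEE Transactions, 9 of them in IEEE TPAMI or IJCV, and has more than 6,000 Google Scholar citations. Xu has led an NSFC General Program project, an NSFC Young Scientists Fund project, a Jiangsu Province Distinguished Young Scholars Fund project, and a sub-project of an NSFC Key Program. Xu received the CSIG Outstanding Doctoral Dissertation Award, has earned more than 10 first- and second-place finishes in competitions held at international conferences such as CVPR, ICCV, and ECCV (VOT, MMVRAC, Anti-UAV, AI City Challenge, Perception Test Challenge), and is listed in Stanford University's annual World's Top 2% Scientists ranking.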

Zifan Song (宋子帆)

Tongji University

Talk Title

Data-Centric Synchronous Optimization of Images and Labels

Abstract

Deep learning research has focused predominantly on model optimization, yet static, suboptimal training data is itself a performance bottleneck. This report presents a data-centric learning paradigm that jointly optimizes the feature inputs and the supervision signals, moving beyond the limits of static datasets and guiding models toward more efficient and robust learning. The paradigm is implemented mainly through perturbation learning based on analytical optimization: analytical methods compute small, precise perturbations to features and labels as targeted corrections, producing customized training sample pairs. The report aims to show systematically that data-centric collaborative optimization is an effective general strategy for improving performance, and to offer research insights and practical directions for extending the generalization capabilities of models.

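Speaker Bio

Zifan Song is a PhD student at Tongji University, advised by Prof. Cairong Zhao (赵才荣), and holds a bachelor's degree from Tongji University. Song has interned at institutions including Shanghai AI Laboratory and Microsoft Research Asia, and has published five first-author papers in international journals and conferences including TPAMI, NeurIPS, and ICML. Song's main research interests are multimodal learning and large-model agents.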

Ling Zhang (张玲)

East China Normal University

Talk Title

Towards a Multimodal Pathology Intelligence Framework: From Genetic Biomarker Prediction to Pathology Report Generation

Abstract

This talk presents a unified research line toward multimodal pathology intelligence, bridging genetic biomarker prediction and pathology report generation. We first propose PromptBio, an LLM-guided framework leveraging medical prompts for instance extraction and tumor microenvironment mining. Next, D²Bio employs dictionary-based hierarchical pathology mining and debiasing to capture fine-grained interactions across tumor regions. Finally, BiGen introduces a bi-modal concurrent learning strategy guided by historical reports to learn visual and textual knowledge. Together, these works advance interpretable and clinically applicable computational pathology.

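Speaker Bio

Ling Zhang is a doctoral student (2025 cohort) in Information and Communication Engineering at East China Normal University, advised by Prof. Yan Wang (王妍) of the same university. Her research area is medical image analysis. She has published two first-author papers at MICCAI, a top conference in medical image computing, and received the Best Computational Pathology Paper award at MICCAI 2025. As team leader, she won the Shanghai Grand Prize in the 2025 Challenge Cup (挑战杯) and the First Prize in the 2024 "Tianyi Cloud Cup" (天翼云杯) Shanghai College Student Artificial Intelligence Algorithm Innovation Competition.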

Jianzhi Lu (卢健之)

Fudan University

Talk Title

Dynamic 3D Reconstruction and Structural Modeling

Abstract

Current 3D reconstruction research primarily emphasizes static geometry recovery, whereas capturing the dynamic nature of real-world objects—continuous motion and structural variation—is equally important. This report focuses on a dynamic reconstruction paradigm that models both spatial structures and their temporal or conformational evolution. By integrating representation learning and decomposition strategies, it advances 3D understanding across different domains and scales. The paradigm is exemplified through two technical frameworks: FacialFlowNet, which introduces a large-scale facial optical flow dataset and a decomposed model to disentangle head motion and expression dynamics, and CryoDyna, which reconstructs protein conformations by learning a continuous structural space from cryo-EM projections. Together, these works demonstrate how dynamic 3D modeling can unify motion and deformation analysis, providing new insights into reconstructing complex, evolving structures in both visual and biological systems.

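Speaker Bio

Jianzhi Lu is a doctoral student at Fudan University, advised by Prof. Bo Yan (颜波), and holds a bachelor's degree from East China University of Science and Technology. Lu has published one first-author paper at the international conference ACM Multimedia. Lu's main research interests are 3D reconstruction and dynamic structural modeling, focusing on how to recover the geometric structure of 3D space and its dynamic changes from 2D observations. The work spans 3D computer vision, dynamic scene reconstruction, and deep learning-based structural representation learning.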

Feifeng Wang (王飞锋)

Shanghai University

Talk Title

Deep Learning-Based Efficient Compression for Screen Content Images and Videos

Abstract

Deep learning-based compression research has focused mainly on natural images and videos and has not fully adapted to the distinctive characteristics of screen content. Screen Content Images (SCIs) have limited color values and abundant repetitive patterns, while Screen Content Videos (SCVs) contain homogeneous regions and abrupt content changes. As a result, general-purpose compression schemes perform poorly on screen content, which becomes a bottleneck for efficient transmission. This report proposes a compression paradigm specific to screen content: by optimizing the image and video compression modules for these characteristics, it overcomes the limitations of general schemes and achieves more efficient compression. The paradigm relies on the following core technologies. For SCIs, the framework combines a Color Context Generator (CCG), which eliminates color redundancy via the main color components, with Region-Based Block Aggregation (RBA), which aggregates repetitive features through block matching, plus a Diverse Template Library (DTL) scheme that exploits inter- and intra-image redundancy. For SCVs, the framework combines Superpixel-Constrained Motion Estimation (SCME), which captures large-scale motions using superpixel global correlations, with Inter-Intra Context Aggregation (I2CA), which achieves content-aware fusion and long-range repetitive feature localization via a gating mechanism and displacement-guided window attention. The report seeks to show that screen content-specific compression optimization is an effective, scenario-driven way to improve performance, and to offer actionable research insights and technical pathways for advancing screen content compression.
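As a rough illustration of why limited color values matter, the sketch below applies a toy palette code to a tiny screen-like image. It is a generic textbook-style example and not the CCG, RBA, or DTL modules named in the abstract; the image data and the function names `palette_encode` and `palette_decode` are assumptions made purely for illustration.

```python
# Toy palette coding of a screen-content-like image (hypothetical example data).
import math


def palette_encode(image):
    """Map each pixel to an index into a table of the distinct colors."""
    palette = sorted({pixel for row in image for pixel in row})
    index_of = {color: i for i, color in enumerate(palette)}
    indices = [[index_of[pixel] for pixel in row] for row in image]
    return palette, indices


def palette_decode(palette, indices):
    """Rebuild the original pixels from the palette and the index map."""
    return [[palette[i] for i in row] for row in indices]


WHITE, BLACK, BLUE = (255, 255, 255), (0, 0, 0), (0, 0, 255)
image = [
    [WHITE, WHITE, BLACK, WHITE],
    [WHITE, BLUE, BLUE, WHITE],
    [WHITE, WHITE, BLACK, WHITE],
]

palette, indices = palette_encode(image)
num_pixels = sum(len(row) for row in image)

# Fixed-length index cost: enough bits to address every palette entry.
bits_per_index = max(1, math.ceil(math.log2(len(palette))))
raw_bits = 24 * num_pixels  # 8 bits per RGB channel, no compression
coded_bits = 24 * len(palette) + bits_per_index * num_pixels
```

With only three distinct colors, the index map plus the palette takes far fewer bits than raw RGB, and the gap widens as the image grows while the palette stays small. Natural images rarely have this property, which is one reason codecs tuned for them fit screen content poorly.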

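Speaker Bio

Feifeng Wang is a doctoral student in Information and Communication Engineering at Shanghai University, advised by Research Professor Liquan Shen (沈礼权). Wang's research focuses on applying deep learning to image and video processing and compression, with particular attention to efficient compression of screen content images and videos. Wang has published two first-author papers in international journals and conferences including TCSVT and CVPR. The work addresses the poor adaptability of general-purpose compression schemes and provides academic support for scenario-specific improvements to image and video compression.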

Hao Zhang (张豪)

Shanghai Jiao Tong University

Talk Title

Efficient Convergence, Generalization, and Adaptation in Federated Learning under Heterogeneous Settings

Abstract

The growth of distributed data and the enactment of privacy regulations have driven a paradigm shift from centralized machine learning to Federated Learning (FL). In FL, clients collaboratively train a model by performing local updates on heterogeneous data and communicating only model parameters with a central server. However, data and system heterogeneity in FL introduce core challenges in convergence, generalization, and adaptation. This report presents three contributions to enhance efficiency across these domains. First, to address data heterogeneity, this report introduces a novel stability metric for FL that unifies the theoretical bounds of convergence and generalization. It then proposes a convergence algorithm integrating quasi-Nesterov optimization and variance reduction, which achieves a theoretical linear speedup. Second, to address the challenge of learning from decentralized data, this report applies the Information Bottleneck principle to construct a regularization for features and model parameters. This approach learns a minimal, sufficient, and invariant cross-client representation, and the resulting regularization is proven to tighten the federated generalization upper bound. Finally, to address resource heterogeneity, the research presented here proposes a reinforcement learning-based, resource-aware compression strategy that dynamically allocates computation bit-widths and employs extreme, sign-based communication compression.
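The basic training loop described at the start of the abstract (local updates on each client, with only model parameters sent to the server) can be sketched with plain federated averaging. This is a minimal toy under stated assumptions, not the quasi-Nesterov or variance-reduced algorithm proposed in the talk: the one-parameter model, the client data, and the function names `local_update` and `fedavg_round` are all hypothetical.

```python
# Minimal federated averaging sketch on a 1-D linear model y = w * x.
# Toy setup for illustration only; not the algorithm proposed in the talk.


def local_update(w, data, lr=0.1, steps=5):
    """Run a few gradient-descent steps on one client's local data."""
    for _ in range(steps):
        # Gradient of the mean of 0.5 * (w*x - y)^2 with respect to w.
        grad = sum((w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fedavg_round(w_global, clients):
    """One communication round: clients train locally, server averages."""
    total = sum(len(data) for data in clients)
    local_weights = [local_update(w_global, data) for data in clients]
    # Weight each client's parameter by its share of the total samples.
    return sum(w * len(data) / total for w, data in zip(local_weights, clients))


# Two clients with different input distributions but the same relation y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (3.0, 6.0), (4.0, 8.0)],
]

w = 0.0
for _ in range(50):
    w = fedavg_round(w, clients)
```

Only the scalar `w` crosses the client-server boundary; the raw `(x, y)` pairs never leave each client. In this toy both clients share the same optimum, so averaging converges cleanly. When client optima differ, local steps pull the models apart between rounds, which is the data-heterogeneity problem the abstract targets.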

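Speaker Bio

Hao Zhang is a doctoral student in Information and Communication Engineering at Shanghai Jiao Tong University, advised by Prof. Chenglin Li (李成林) of the same university. Zhang's research interests include federated learning, distributed optimization, the information bottleneck, and Bayesian learning. Zhang has published six first-author papers in IEEE Transactions and top conferences in machine learning and artificial intelligence: two regular papers in IEEE T-PAMI and T-NNLS, and four conference papers at NeurIPS and ICML.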

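Hosts

China Society of Image and Graphics (CSIG)

Chinese Association for Artificial Intelligence (CAAI)

China Computer Federation (CCF)

Chinese Association of Automation (CAA)

Organizers

Shanghai Jiao Tong University (SJTU)

Shanghai Feiteng Culture Communication Co., Ltd. (上海飞腾文化传播有限公司)

Co-organizers

AutoDL

East China Normal University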
