Keynote Speakers

  • Prof. Song-Chun Zhu
  • Dean of Beijing Institute for General Artificial Intelligence, Chair Professor of Peking University, Chair Professor of Basic Science of Tsinghua University
  • Bio: Song-Chun Zhu was born in Ezhou, Hubei Province, China. He is a world-renowned expert in computer vision, statistics, applied mathematics, and artificial intelligence. He graduated from the University of Science and Technology of China in 1991, moved to the United States in 1992, and received a PhD degree in computer science from Harvard University in 1996. From 2002 to 2020, he was a professor in the Departments of Statistics and Computer Science at UCLA and the director of the UCLA Center for Vision, Cognition, Learning and Autonomous Robotics. He has published more than 300 papers in top international journals and conferences and has won many international awards in computer vision, pattern recognition, and cognitive science, including three top international awards in computer vision, among them the Marr Prize and the Helmholtz award. He twice served as chair of the Conference on Computer Vision and Pattern Recognition (CVPR 2012 and CVPR 2019), and between 2010 and 2020 he twice served as director of a MURI, a US multi-university, interdisciplinary cooperative project, in the fields of vision, cognitive science, and artificial intelligence. Professor Zhu has long been committed to building a unified mathematical framework for computer vision, cognitive science, and artificial intelligence more broadly. After 28 years in the United States, he returned to China in September 2020 to serve as Dean of the Beijing Institute for General Artificial Intelligence, Chair Professor of Peking University, and Chair Professor of Basic Science of Tsinghua University.

  • Title: Computer Vision: A Task-oriented and Agent-based Perspective

  • Prof. Larry Davis
  • University of Maryland, USA
  • Bio: Larry S. Davis is a Distinguished University Professor in the Department of Computer Science and director of the Center for Automation Research (CfAR). His research focuses on object and action recognition, scene analysis, event modeling and recognition, image and video databases, tracking, human movement modeling, 3-D human motion capture, and camera networks. Davis is also affiliated with the Computer Vision Laboratory in CfAR. He served as chair of the Department of Computer Science from 1999 to 2012. He received his doctorate from the University of Maryland in 1976. He has been named an IAPR Fellow, an IEEE Fellow, and an ACM Fellow.

  • Prof. Yoichi Sato
  • University of Tokyo, Japan
  • Bio: Yoichi Sato is a professor at the Institute of Industrial Science, the University of Tokyo. He received his B.S. degree from the University of Tokyo in 1990, and his MS and PhD degrees in robotics from the School of Computer Science, Carnegie Mellon University, in 1993 and 1997. His research interests include first-person vision, gaze sensing and analysis, physics-based vision, and reflectance analysis. He has served, or is serving, in several journal editorial and conference organization roles, including for IEEE Transactions on Pattern Analysis and Machine Intelligence, International Journal of Computer Vision, and Computer Vision and Image Understanding, and as CVPR 2023 General Co-Chair, ICCV 2021 Program Co-Chair, ACCV 2018 General Co-Chair, ACCV 2016 Program Co-Chair, and ECCV 2012 Program Co-Chair.

  • Title: Understanding Human Activities from First-Person Perspectives

  • Abstract: Wearable cameras have become widely available as off-the-shelf products. First-person videos captured by wearable cameras provide close-up views of fine-grained human behavior, such as interaction with objects using hands, interaction with people, and interaction with the environment. First-person videos also provide an important clue to the intention of the person wearing the camera, such as what they are trying to do or what they are attending to. These advantages are unique to first-person videos and distinguish them from videos captured by fixed cameras such as surveillance cameras. As a result, there has been increasing interest in developing computer vision methods that use first-person videos as input. On the other hand, first-person videos pose a major challenge to computer vision due to multiple factors such as continuous and often violent camera movements, a limited field of view, and rapid illumination changes. In this talk, I will present our attempts to develop first-person vision methods for different tasks, including action recognition, future person localization, and gaze estimation.

  • Prof. Michael Black
  • Max Planck Institute for Intelligent Systems, Germany
  • Bio: Michael Black received his B.Sc. from the University of British Columbia (1985), his M.S. from Stanford (1989), and his Ph.D. from Yale University (1992). After post-doctoral research at the University of Toronto, he worked at Xerox PARC as a member of the research staff and area manager. From 2000 to 2010 he was on the faculty of Brown University in the Department of Computer Science (Assoc. Prof. 2000-2004, Prof. 2004-2010). He is one of the founding directors of the Max Planck Institute for Intelligent Systems in Tübingen, Germany, where he leads the Perceiving Systems department. He is also a Distinguished Amazon Scholar (VP), an Honorarprofessor at the University of Tübingen, and an Adjunct Professor at Brown University. His work has won several awards, including the IEEE Computer Society Outstanding Paper Award (1991), Honorable Mention for the Marr Prize (1999 and 2005), and all three major test-of-time awards: the 2010 Koenderink Prize, the 2013 Helmholtz Prize, and the 2020 Longuet-Higgins Prize. He is a foreign member of the Royal Swedish Academy of Sciences. In 2013 he co-founded Body Labs Inc., which was acquired by Amazon in 2017.