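Yongtao Wang (王勇涛)

Associate Researcher. He received his Ph.D. in 2009 from the image institute (图像所) of Huazhong University of Science and Technology. From 2010 to 2011 he was a postdoctoral researcher at Temasek Laboratories, Nanyang Technological University, Singapore. He is a member of CCF YOCSEF and of the Technical Committee on Document Image Analysis and Recognition of the China Society of Image and Graphics. His current research focuses on document image understanding, pattern recognition, computer vision, and deep learning. As principal investigator, he has led five national and provincial/ministerial-level projects, including grants from the National Natural Science Foundation of China, a sub-project of a National Major Science and Technology Project, and a grant from the Beijing Natural Science Foundation, and he has completed several industry collaboration projects with companies such as Alibaba and Hisense. He has published more than 30 papers in journals and conferences including TIP, PR, ICCV, AAAI, and MM, and has achieved top rankings in several international benchmark competitions on object detection for autonomous vehicles and drones and on scene text detection and recognition.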

Research Interests

Complex document image understanding
Computer vision and pattern recognition
Deep learning and its applications

Recent Publications

[1]. Zheqi He, Yafeng Zhou, Yongtao Wang*, Siwei Wang, Xiaoqing Lu, Zhi Tang, and Ling Cai, “An End-to-End Quadrilateral Regression Network for Comic Panel Extraction,” ACM Multimedia (MM), Seoul, Korea, Oct. 2018.
[2]. Ting Guo, Rundong Cui, Xiaoran Qin, Yongtao Wang*, and Zhi Tang, “Bottom-up/Top-down Geometric Object Reconstruction with CNN Classification for Mobile Education,” The 26th Pacific Conference on Computer Graphics and Applications (Pacific Graphics), Hong Kong, China, Oct. 2018.
[3]. Jiahui Li, Siwei Wang, Yongtao Wang*, and Zhi Tang, “Synthetic Data for Text Recognition with Style Transfer,” Multimedia Tools and Applications, 2018.
[4]. Yuan Liao, Xiaoqing Lu, Chengcui Zhang, Yongtao Wang, and Zhi Tang, “Mutual Enhancement for Detection of Multiple Logos in Sports Videos,” International Conference on Computer Vision (ICCV), Venice, Italy, pp. 4846-4855, Oct. 2017.
[5]. Zheqi He, Yafeng Zhou, Yongtao Wang*, and Zhi Tang, “SReN: Shape Regression Network for Comic Storyboard Extraction,” The Thirty-First AAAI Conference on Artificial Intelligence (AAAI), San Francisco, USA, pp. 4937-4938, Feb. 2017.
[6]. Ting Guo, Yongtao Wang, Yafeng Zhou, Zheqi He, and Zhi Tang, “Geometric Object 3D Reconstruction from Single Line Drawing Image with Bottom-Up and Top-Down Classification and Sketch Generation,” The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, pp. 670-676, Nov. 2017.
[7]. Xiaoran Qin, Yafeng Zhou, Zheqi He, Yongtao Wang*, and Zhi Tang, “A Faster R-CNN based Method for Comic Characters Face Detection,” The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, pp. 1074-1080, Nov. 2017.
[8]. Jinxin Zheng, Yongtao Wang*, and Zhi Tang, “Context-aware Geometric Object Reconstruction for Mobile Education,” ACM Multimedia (MM), Amsterdam, The Netherlands, pp. 367-371, Oct. 2016.
[9]. Jinxin Zheng, Yongtao Wang*, and Zhi Tang, “Recovering solid geometric object from single line drawing image,” Multimedia Tools and Applications, 75(17), pp. 10153-10174, 2016.
[10]. Yongtao Wang*, Yafeng Zhou, Dong Liu, and Zhi Tang, “Comic storyboard extraction via edge segment analysis,” Multimedia Tools and Applications, 75(5), pp. 2637-2654, 2016.
[11]. Yongtao Wang, Xicheng Liu, and Zhi Tang, “An R-CNN Based Method to Localize Speech Balloons in Comics,” Proc. of Multi-Media Modeling (MMM), Miami, USA, pp. 444-453, Jan. 2016.
[12]. Lu Liu, Xiaoqing Lu, Yuan Liao, Yongtao Wang, and Zhi Tang, “Improving Retrieval of Plane Geometry Figure with Learning to Rank,” Pattern Recognition Letters, 83, pp. 423-439, 2016.
[13]. Luyuan Li, Yongtao Wang*, Ching Y. Suen, Zhi Tang, and Dong Liu, “A Tree Conditional Random Field Model for Panel Detection in Comic Images,” Pattern Recognition, 48(7), pp. 2129-2140, 2015.
[14]. Yongtao Wang, Yafeng Zhou, and Zhi Tang, “Comic Frame Extraction via Line Segments Combination,” The 13th IAPR International Conference on Document Analysis and Recognition (ICDAR), Nancy, France, pp. 856-860, Aug. 2015.
[15]. Xicheng Liu, Yongtao Wang, and Zhi Tang, “A Clump Splitting-based Method to Localize Speech Balloons in Comics,” The 13th IAPR International Conference on Document Analysis and Recognition (ICDAR), Nancy, France, pp. 901-905, Aug. 2015.
[16]. Jinxin Zheng, Yongtao Wang*, and Zhi Tang, “Solid Geometric Object Reconstruction from Single Line Drawing Image,” The 10th International Conference on Computer Graphics Theory and Applications (GRAPP), Berlin, Germany, pp. 391-400, Mar. 2015.
[17]. Yongtao Wang, Zheqi He, Xicheng Liu, Zhi Tang, and Luyuan Li, “A fast and robust ellipse detector based on top-down least-square fitting,” The 26th British Machine Vision Conference (BMVC), Swansea, UK, pp. 1-12, Sep. 2015.
[18]. Luyuan Li, Yongtao Wang*, Liangcai Gao, Zhi Tang, and Ching Y. Suen, “Comic2CEBX: A System for Automatic Comic Content Adaptation,” ACM/IEEE Joint Conference on Digital Libraries (JCDL), London, United Kingdom, pp. 299-308, Sep. 2014.
[19]. Luyuan Li, Yongtao Wang*, Zhi Tang, and Liangcai Gao, “Automatic Comic Page Segmentation Based on Polygon Detection,” Multimedia Tools and Applications, 69(1), pp. 171-197, 2014.
[20]. Dong Liu, Yongtao Wang*, Zhi Tang, Luyuan Li, and Liangcai Gao, “Automatic Comic Page Image Understanding Based on Edge Segment Analysis,” SPIE Document Recognition and Retrieval XXI, San Francisco, California, USA, pp. 90210J/1-12, 2014.
[21]. Xin Tao, Zhi Tang, Canhui Xu, and Yongtao Wang, “Logical Labeling of Fixed Layout PDF Documents Using Multiple Contexts,” The 11th IAPR International Workshop on Document Analysis Systems (DAS), Tours, France, pp. 360-364, Apr. 2014.
[22]. Dong Liu, Yongtao Wang*, Zhi Tang, and Xiaoqing Lv, “A Robust Circle Detection Algorithm Based on Top-down Least-square Fitting Analysis,” Computers & Electrical Engineering, 40, pp. 1415-1428, 2014.
[23]. Dazhi Zhang, Yongtao Wang, Wenbing Tao, and Chengyi Xiong, “Epipolar Geometry Estimation for Wide Baseline Stereo by Clustering Pairing Consensus,” Pattern Recognition Letters, 36, pp. 1-9, 2014.
[24]. Huanqiang Zeng, Yongtao Wang, Zhe Wei, and Canhui Cai, “Efficient Two-stage Early SKIP Mode Termination for Depth Video Coding,” Computers & Electrical Engineering, 40(4), pp. 1344-1352, 2014.
[25]. Luyuan Li, Yongtao Wang*, Zhi Tang, and Dong Liu, “Comic Image Understanding Based on Polygon Detection,” SPIE Document Recognition and Retrieval XX, San Francisco, California, USA, pp. 86580B/1-11, Feb. 2013.
[26]. Luyuan Li, Yongtao Wang*, Zhi Tang, Xiaoqing Lu, and Liangcai Gao, “Unsupervised Speech Text Localization in Comic Images,” The Twelfth International Conference on Document Analysis and Recognition (ICDAR), Washington, DC, USA, pp. 1190-1194, Aug. 2013.
[27]. Liangcai Gao, Yongtao Wang, Zhi Tang, and Xiaofan Lin, “Newspaper Article Reconstruction Using Ant Colony Optimization and Bipartite Graph,” Applied Soft Computing, 13(6), pp. 3033-3046, 2013.
[28]. Chenqiang Gao, Deyu Meng, Yi Yang, Yongtao Wang, Xiaofang Zhou, and Alexander G. Hauptmann, “Infrared Patch-Image Model for Small Target Detection in a Single Image,” IEEE Trans. on Image Processing, 22(12), pp. 4996-5009, 2013.
[29]. Liangcai Gao, Zhi Tang, Xiaoyan Lin, and Yongtao Wang, “A Graph-based Method of Newspaper Article Reconstruction,” The Twenty-first International Conference on Pattern Recognition (ICPR), Tsukuba, Japan, pp. 1566-1569, Nov. 2012.
[30]. Yongtao Wang, Junbin Gong, Dazhi Zhang, Chenqiang Gao, Jinwen Tian, and Huanqiang Zeng, “Large Disparity Motion Layer Extraction via Topological Clustering,” IEEE Trans. on Image Processing, 20(1), pp. 43-52, 2011.
[31]. Chao Tao, Yihua Tan, Yongtao Wang, and Jinwen Tian, “Discard Wide-baseline Mismatch Using Contour Fragments,” Electronics Letters, 46(12), pp. 834-835, 2010.
[32]. Yongtao Wang, Dazhi Zhang, and Jinwen Tian, “Discarding Wide Baseline Mismatches via Topological Clustering,” Electronics Letters, 44(11), pp. 670-671, 2008.
[33]. Yongtao Wang, Dazhi Zhang, and Jinwen Tian, “Topological Clustering and Its Application for Discarding Wide Baseline Mismatches,” Optical Engineering, 47(5), pp. 057202-1-6, 2008.

International Competition Awards

ECCV 2018: Vision Meets Drone Challenge (Task 2: Object Detection in Videos), 1st place
CVPR 2018 Workshop: Autonomous Driving (Task 2: Road Object Detection), 2nd place
ICDAR 2017: Robust Reading Challenge on COCO-Text (Task 1: Text Localization), 1st place
ICDAR 2017: Robust Reading Challenge on COCO-Text (Task 3: End-to-End), 2nd place
ICDAR 2017: Reading Chinese Text in the Wild (Task 1: Text Localization), 1st place

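Selected Research Projects

Research on Line Drawing Image Understanding Methods (61673029), National Natural Science Foundation of China, General Program, 2017.01-2020.12, Principal Investigator
Research on Complex Document Image Understanding Methods for Mobile Reading (61300061), National Natural Science Foundation of China, Young Scientists Fund, 2014.01-2016.12, Principal Investigator
Research on Comic Image Understanding Methods Based on Energy Minimization Models (20130001120012), Ministry of Education Doctoral Program Fund for New Teachers, 2014.01-2016.12, Principal Investigator
Research on Comic Image Understanding Methods for Mobile Reading (4132033), Beijing Natural Science Foundation, General Program, 2013.01-2015.12, Principal Investigator
Research and Development of Copyright Protection Technologies for Content Trading and Distribution (GXTC-CZ-1015004/03), Major Science and Technology Project, 2011.01-2014.07, Principal Investigator
Research on 3D Bounding Box Regression Algorithms for Vehicles in Intelligent Driving Scenarios, industry collaboration project with Alibaba AI Labs, 2018.01-2018.09, Principal Investigator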

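Academic Service

Member of CCF YOCSEF; member of the Technical Committee on Document Image Analysis and Recognition, China Society of Image and Graphics
ICDAR 2017 PC Member, ISPACS 2017 Publication Co-Chair, AAAI 2019 PC Member
Reviewer for international journals including TIP, PR, TCSVT, and TKDE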

Contact:

Tel: 010-82529542
Fax: 010-82529207
E-mail: wyt@pku.edu.cn