STRUCT::ICMEW17

Team Name	School / Organization	Team Members	mAP::IOU=0.5	mAP::IOU=0.7	mAP::IOU=0.95
HRI	Hikvision	Chao Li, Qiaoyong Zhong, Liang Ma, Di Xie	0.95004	0.94754	0.91242
DIGCASIA	Institute of Automation, Chinese Academy of Sciences	Hongsong Wang, Yuqi Zhang, Liang Wang	0.90291	0.90022	0.86009
THU_IVG	Tsinghua University	Yansong Tang, Lei Chen, Ruogu Lin, Zi'an Wang, Jiwen Lu, Jie Zhou	0.70960	0.70459	0.65153
NJUST_IMAG	Nanjing University of Science and Technology	Jinhui Tang, Yan Song, Zechao Li, Xiangbo Shu, Chengcheng Ning, Lingling Fa, Minglei Yang, Li Sun	0.70818	0.69008	0.58939
Inteam	Northwestern Polytechnical University	Bo Li, Huahui Chen, Xuelian Cheng, Yucheng Chen, Yi Li, Yuchao Dai, Mingyi He.	0.58533	0.55953	0.48247
404 NOT FOUND	Zhejiang University	Songyang Zhang, Yimeng Li, Ze Li, Yunan Ye, Jun Xiao	0.35834	0.35112	0.32129

Information for authors

1. Participants are required to submit a paper describing your method for this challenge.

2. Papers must be no longer than 4 pages, including all text, figures and references.

3. Submission must be in English.

4. Submission must be in PDF format, with all fonts and subsets embedded and under 20M bytes.

5. Templates for submission: Word LaTex

6. Supplemental material is allowed, and should be zipped into a single file under 30M bytes.

7. Please send the paper by email with the final test results.

8. The evaluation of recognition task follows the basic protocol. That is we calculate the precision ratio to measure the number of correct prediction of video among all test clips. Participants should submit a list of predicted label for each testing video clip.

9. We adopt the Mean Average Precision of Actions (mAP) for detection evaluation: AP is used as the metric for evaluating the results on each activity category. The AP is averaged over all the activity categories. For more information, please refer to PKUMMD.

About this challenge

Large Scale 3D Human Activity Analysis has been attracting an increasing attention. As a significant part of video understanding in the ﬁeld of computer vision and multimedia, activity analysis such as Action Recognition and Action Detection remains a challenging problem. The technology will potentially facilitate a wide range of practical applications.

For human activity analysis, many great algorithms have been designed for RGB videos recorded by 2D cameras in the past couple of decades. Recently, with the prevalence of the affordable color-depth sensing cameras like Microsoft Kinect, it is much easier and cheaper to obtain depth data and the 3D skeleton of human body. As an intrinsic high level representation, 3D skeleton is valuable and comprehensive for summarizing a series of human dynamics in the video, and thus benefits the more general action analysis. Besides of succinctness and effectiveness, it shows great robustness to illumination, clustered background, and camera motion. Additionally, it is captured based on the infrared ray which can avoid the accuracy loss caused by object occlusion.

Recently, large scale data and deep learning have been revolutionizing computer vision research. To address the lack of a large scale 3D dataset for activity analysis, we build a new dataset and establish a Half-Day workshop to stimulate the computer vision community to design models and algorithms which can improve the performance of human activity analysis on 3D skeleton data

Topics Of Interest

The challenge focuses on analysis of daily indoor activities from skeleton data captured by 3D cameras for two different tasks.

⦁ Segmented Action Recognition Challenge:
Given a well-segmented skeleton video clip, predict the label of the activity present in the video clip.
⦁ Untrimmed Action Detection Challenge:
Given a long skeleton video, predict the action intervals with labels of the activities.

Significance

Activity analysis is an important area in computer vision and strongly relevant to multimedia. Different from other topics in the main conference, this workshop focuses on 3D human activity analysis which has been shown to have a potentially large impact in broad practical applications like visual surveillance, human-robot interaction, elderly assistance systems, etc.

Organizers

Dr. Jiaying Liu
Associate Professor, Institute of Computer Science and Technology
Peking University, Beijing, P.R. China
Email: liujiaying@pku.edu.cn

Dr. Wenjun Zeng
Principal Research Manager, Internet Media Group and the Media Computing Group
Microsoft Research Asia, Beijing, P.R. China
Email: wezeng@microsoft.com

Dr. Gang Wang
Assistant Professor, School of EEE
Nanyang Technological University, Singapore
Email: wanggang@ntu.edu.sg

Format

⦁ For Action Recognition

Top 5 will be invited to ICME 2017 workshop as invited speakers to present their works.
Top 10 will be invited to submit technical report for presentation in the ICME 2017 workshop.

⦁ For Action Detection

Top 3 will be invited to ICME 2017 workshop as invited speakers to present their works.
Top 7 will be invited to submit technical report for presentation in the ICME 2017 workshop

TPC Members

⦁ Xilin Chen, Chinese Academy of Sciences, Institute of Computing Technology, China

⦁ Wanli Ouyang, The Chinese University of Hong Kong, Hong Kong

⦁ Jiashi Feng, National University of Singapore, Singapore

⦁ Yuchao Dai, Australian National University, Australia

⦁ Qi Tian, University of Texas at San Antonio, USA

⦁ Yizhou Wang, Peking University, China

⦁ Zicheng Liu, Microsoft Research, USA

⦁ Liang Wang, Chinese Academy of Sciences, Institute of Automation, China

Team Name
School/Organization Name
Team Members' Name
Email Address

ICME 2017 Workshop

Recent News