Multimedia Information Processing
ICST has been dedicated to research on digital video/audio technologies since 1995. Our computer-assisted animation system was selected as one of the Top-10 Scientific and Technological Progress achievements in Chinese Colleges and Universities in 1998. The Digital Broadcasting Control System for TV Programs won the second prize of the "National Science and Technology Progress Award" in 2007. Our research focuses on video/audio processing, transmission, and retrieval.
We have published a number of high-quality papers in top-tier international journals and conferences, including IEEE TIP, TCSVT, TMM, CVPR, ACM MM, AAAI, IJCAI, ICCV, and INFOCOM. Our video retrieval technology won first place in the Instance Search task at TRECVID every year from 2012 to 2016.
We have led and participated in a series of research projects funded by the 973 Program, the 863 Program, the National Key Technology R&D Program, and NSFC. We hold over 100 granted patents, and many of our research achievements have already been applied in industry.
Our main research topics include the following:
· Video coding standards, including H.264/AVC, H.264/SVC, AVS, and HEVC. We have extensively explored the optimization of video coding algorithms.
· Image restoration and enhancement, including super-resolution, illumination enhancement, denoising, stylization, and bad-weather image restoration. This work emphasizes recovering an original image from its degraded counterpart by investigating image models and different priors.
· Video action analytics and understanding, especially action recognition, detection, and prediction based on big data analysis. This work focuses on extracting discriminative spatial and temporal features to model the spatial and temporal evolution of different actions.
· Information-centric future Internet. We study content delivery networks, cache-aware routing, and collaborative caching to improve the efficiency of content distribution over the Internet.
· Adaptive Internet video streaming. We study HTTP adaptive bitrate streaming, viewport-adaptive panoramic streaming, and real-time video communication.
· Music and sound analysis and retrieval, mainly including recognition of high-level semantic concepts such as music emotion, genre, usage, and artist; music retrieval and recommendation techniques such as music identification, version recognition, similarity search, and playlist generation; and detection, recognition, and retrieval techniques for audio events and scenes.
· Image and video understanding and retrieval, including image/video fine-grained analysis, image/video concept detection, image/video object detection, and large-scale index structures. These techniques enable image/video analysis, recognition, and retrieval, and provide technical support for Internet image/video retrieval and supervision applications.
· Cross-media analysis and reasoning, including cross-media unified representation, correlation understanding, knowledge graph construction, knowledge reasoning, and content description and generation. These techniques enable cross-media analysis, retrieval, mining, and reasoning, and provide technical support for cross-media intelligence.
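To make the HTTP adaptive bitrate streaming topic in the list above more concrete, the following is a minimal sketch of a throughput-based bitrate selection rule of the kind commonly used by adaptive streaming clients. It is an illustration only and is not taken from ICST's systems: the function name `select_bitrate`, the example bitrate ladder, and the `safety` factor are all assumptions.

```python
from statistics import harmonic_mean


def select_bitrate(ladder_kbps, recent_throughput_kbps, safety=0.8):
    """Pick the highest bitrate that fits a discounted throughput estimate.

    ladder_kbps: available encoding bitrates (hypothetical ladder).
    recent_throughput_kbps: measured throughput of recent segment downloads.
    safety: assumed discount factor that leaves headroom for fluctuations.
    """
    if not ladder_kbps:
        raise ValueError("bitrate ladder must not be empty")
    ladder = sorted(ladder_kbps)
    if not recent_throughput_kbps:
        # No measurements yet: start conservatively at the lowest bitrate.
        return ladder[0]
    # The harmonic mean is less sensitive to short throughput spikes
    # than the arithmetic mean.
    estimate = harmonic_mean(recent_throughput_kbps)
    budget = safety * estimate
    candidates = [b for b in ladder if b <= budget]
    # Fall back to the lowest bitrate if nothing fits the budget.
    return candidates[-1] if candidates else ladder[0]


ladder = [300, 750, 1500, 3000, 6000]
chosen = select_bitrate(ladder, [2000, 2000, 2000])
```

With the assumed ladder and a steady 2000 kbps of measured throughput, the budget is 1600 kbps, so the rule selects the 1500 kbps rendition. Practical clients usually combine such a rule with buffer occupancy, which this sketch leaves out.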
Major research achievements include the following:
· Research and applications of digital video/audio control
We developed a digital video/audio control system, which won the second prize of the "National Science and Technology Progress Award" in 2007.
Fig 2.3 Network Video Applications
We developed key technologies and applications for a large-scale, multi-terminal network video system, covering the acquisition, editing, optimized coding, storage, content-based retrieval, and publishing of network videos. The system has been applied to WebTV, IPTV, mobile TV, and multimedia digital signal in news publishing, advertising, and emerging media, and it won the second prize of the Science and Technology Progress Award of the Ministry of Education in 2011.
· Innovation and application of high efficiency video coding (HEVC)
We developed Lentoid, a practical codec product based on the latest international video coding standard, HEVC/H.265. Lentoid is an industry-leading H.265 codec that combines a high-performance parallel coding framework, fast rate-distortion optimization, and an efficient decoding solution for all platforms. The Lentoid H.265 codec provides HEVC service to over 150 million people in China via the Xunlei Kankan player, which supports 1.3 million complete HEVC video playbacks per day.
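For readers unfamiliar with rate-distortion optimization, the general idea is to choose among candidate coding options by minimizing a Lagrangian cost J = D + λR, where D is the distortion, R is the bit cost, and λ sets the trade-off between them. The sketch below illustrates only this general principle. It is not Lentoid's algorithm, and the function name `best_mode`, the candidate names, and the numbers are made up for illustration.

```python
def best_mode(candidates, lam):
    """Return the name of the candidate minimizing J = D + lam * R.

    candidates: iterable of (name, distortion, rate_bits) tuples.
    lam: Lagrange multiplier; larger values favor cheaper (lower-rate) modes.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]


# Hypothetical candidates for one block: (name, distortion, rate in bits).
modes = [
    ("skip", 120.0, 2),
    ("intra", 40.0, 60),
    ("inter", 55.0, 25),
]

low_lambda_choice = best_mode(modes, 0.1)   # quality matters most
mid_lambda_choice = best_mode(modes, 0.5)   # balanced trade-off
high_lambda_choice = best_mode(modes, 5.0)  # bits are expensive
```

With these made-up numbers, a small λ selects the low-distortion "intra" candidate, a moderate λ selects "inter", and a large λ selects the nearly free "skip" candidate. Fast rate-distortion optimization methods typically aim to reach a similar decision without evaluating every candidate exhaustively.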
· Internet multimodal content analysis and recognition
To address the two major problems of Internet multimodal big data, namely that it is "hard to supervise" and "difficult to exploit", we have achieved several major technical breakthroughs and inventions in the following four areas: image and video concept detection based on incremental deep learning and attention models; visual object detection based on cascaded classifiers and distinctive topological constraints; multimodal data fusion; and hot topic detection based on knowledge elements and emotional perspective.
Fig 2.4 Network Video Retrieval and Recognition Application
We have obtained 50 granted invention patents and published more than 100 papers, including 27 papers in top international journals and full papers at top conferences, and we have won first place in the authoritative international evaluation TRECVID five times. The research achievements have been applied in important departments such as the Office of the Central Leading Group for Cyberspace Affairs, the Ministry of Public Security of the People’s Republic of China, and the State Administration of Press, Publication, Radio, Film and Television of the People’s Republic of China, resulting in significant economic and social benefits. This work won the first prize of the "Beijing Science and Technology Award for Technological Invention" in 2016.