Measuring the perceptual quality of multimedia, known as quality of experience (QoE), is fundamental to developing and evaluating human-centric image and video processing algorithms. We are working on important issues around QoE, including developing efficient subjective quality evaluation methods, evaluating image/video compression methods, studying the perceptual quality of 4K UHD, 3D, and scalable video, and benchmarking objective quality metrics.
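As a small illustration of what benchmarking objective quality metrics involves, the sketch below computes PSNR, one of the most basic full-reference metrics, with NumPy. The image size and noise level are arbitrary toy values, not data from our studies; in practice such metric scores are benchmarked against subjective mean opinion scores.

```python
import numpy as np

def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio between a reference and a distorted image."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: a random "reference" image and a noisy version of it.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = np.clip(ref + rng.normal(0.0, 5.0, size=ref.shape), 0, 255)
print(round(psnr(ref, noisy), 2))
```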
Brain-computer interface (BCI) technology is a powerful way to understand users' perception directly by monitoring their brain activity. We have been investigating effective BCI techniques using EEG and fNIRS for implicit monitoring of users' perception of given multimedia content, where the perceptual dimensions include satisfaction, quality, immersiveness, emotion, and sense of reality. In particular, we focus on understanding perception of novel types of media such as 3D, high dynamic range (HDR), and 4K UHD. Furthermore, we also consider peripheral physiological signals such as skin temperature, skin conductance, and heart rate, which have the potential to complement brain activity.
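A common first step in EEG analysis is extracting spectral band power, e.g. in the alpha band (roughly 8-13 Hz). The sketch below is a minimal, self-contained illustration using a synthetic signal, not our actual processing pipeline; the sampling rate, band edges, and noise level are assumed toy values.

```python
import numpy as np

def band_power(signal, fs, f_lo, f_hi):
    """Absolute power of `signal` in the band [f_lo, f_hi] Hz via the FFT."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[mask].sum()

# Synthetic 2-second "EEG" trace: a 10 Hz (alpha) oscillation plus noise.
fs = 256  # sampling rate in Hz
t = np.arange(0, 2, 1.0 / fs)
rng = np.random.default_rng(1)
eeg = np.sin(2 * np.pi * 10 * t) + 0.2 * rng.normal(size=t.size)

alpha = band_power(eeg, fs, 8, 13)   # band containing the oscillation
beta = band_power(eeg, fs, 14, 30)   # band with noise only
print(alpha > beta)
```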
Deep learning has received significant attention due to its power to learn effective representations from data. We are working on developing deep learning techniques for the analysis of multimedia data, including images, videos, and music.
The demand for wireless video delivery is ever increasing with the spread of powerful portable devices. A key issue for successful video delivery services is how to satisfy users' quality of experience (QoE) for the delivered content. We have been developing techniques for QoE-optimized video communications and for QoE measurement in video communications, particularly based on massive MIMO and scalable video coding (SVC).
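One simple way SVC supports QoE-aware delivery is rate adaptation: a receiver decodes the base layer plus as many enhancement layers as the channel allows. The sketch below illustrates this with a greedy cumulative-layer selection; the per-layer bitrates and the budget are hypothetical values, not measurements from our systems.

```python
def select_layers(layer_rates, budget_kbps):
    """Pick the largest prefix of SVC layers (base layer first, then
    enhancement layers) whose total bitrate fits within the channel budget."""
    total, chosen = 0.0, 0
    for rate in layer_rates:
        if total + rate > budget_kbps:
            break  # higher layers cannot be decoded without the lower ones
        total += rate
        chosen += 1
    return chosen, total

# Hypothetical per-layer bitrates (kbps): base layer + two enhancement layers.
rates = [500, 700, 1300]
print(select_layers(rates, 1500))  # -> (2, 1200.0): base + first enhancement
```

In a real system the budget would come from channel estimation (e.g. over a massive-MIMO link) and the selection would also weigh each layer's contribution to perceived quality.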
Human visual attention allows us to focus on a small region of a scene at a time while ignoring the rest. Understanding how visual attention works via eye tracking is important for designing perceptually optimized multimedia processing and machine vision algorithms. Examples of our work on visual attention analysis include audio-visual focus of attention, gaze pattern analysis for temporal resolution changes, and gaze pattern analysis for video packet loss.
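Eye-tracking analyses often start by aggregating raw gaze samples into an attention map over the frame. The sketch below shows this aggregation step with a coarse 2D histogram; the frame size, bin counts, and gaze samples are illustrative assumptions, and real pipelines typically add fixation filtering and Gaussian smoothing.

```python
import numpy as np

def gaze_heatmap(gaze_points, width, height, bins=(16, 16)):
    """Accumulate (x, y) gaze samples into a coarse attention map,
    normalized so that its values sum to 1."""
    xs = [p[0] for p in gaze_points]
    ys = [p[1] for p in gaze_points]
    hist, _, _ = np.histogram2d(ys, xs, bins=bins,
                                range=[[0, height], [0, width]])
    return hist / hist.sum()

# Hypothetical gaze samples clustered near the center of a 640x480 frame.
samples = [(320 + dx, 240 + dy) for dx in (-5, 0, 5) for dy in (-5, 0, 5)]
heat = gaze_heatmap(samples, 640, 480)
print(heat.shape, round(float(heat.max()), 3))
```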
Image super-resolution is the process of reconstructing a high-resolution image from a single low-resolution image. It has a wide range of applications in computer vision, medical imaging, satellite image processing, surveillance systems, etc. Recently, we have developed a blind single-image super-resolution technique: while existing methods usually require prior knowledge of the parameters used for low-pass filtering during down-scaling, our method works robustly under various down-scaling conditions. In addition, we have studied proper methodologies for quality assessment of super-resolution results.
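To make the evaluation setup concrete, the sketch below simulates the super-resolution problem on toy data: a high-resolution image is degraded by block averaging and decimation, then reconstructed with a naive nearest-neighbor baseline, and the reconstruction error is measured against the ground truth. This is only an illustrative baseline, not our blind super-resolution method.

```python
import numpy as np

def downscale(img, factor):
    """Down-scale by block averaging (a simple low-pass filter + decimation)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upscale_nearest(img, factor):
    """Naive nearest-neighbor up-scaling, used here as a reconstruction baseline."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def mse(a, b):
    """Mean squared error, a simple full-reference quality measure."""
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(2)
hr = rng.random((32, 32))   # "ground-truth" high-resolution image
lr = downscale(hr, 2)       # simulated low-resolution observation
sr = upscale_nearest(lr, 2) # baseline reconstruction at the original size
print(mse(hr, sr) < mse(hr, np.zeros_like(hr)))  # baseline beats a trivial guess
```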
Nowadays, people easily capture large numbers of videos using handheld devices, but many of these videos have poor aesthetic quality. We have been developing automated video editing methods that generate aesthetically improved versions of user-generated videos. Such methods not only exploit computer vision techniques such as motion tracking and shot detection, but also draw on in-depth knowledge of human aesthetic perception and cinematography.
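Shot detection, one of the building blocks mentioned above, can be illustrated in its simplest form by thresholding the difference between consecutive frames. The frames and threshold below are toy values assumed for the example; production detectors use more robust features such as color histograms or learned representations.

```python
import numpy as np

def shot_boundaries(frames, threshold=0.3):
    """Flag frame indices where the mean absolute difference between
    consecutive frames exceeds `threshold` (pixel values in [0, 1])."""
    cuts = []
    for i in range(1, len(frames)):
        diff = np.mean(np.abs(frames[i] - frames[i - 1]))
        if diff > threshold:
            cuts.append(i)  # a hard cut starts at frame i
    return cuts

# Toy clip: two static "shots" (dark, then bright) of 8x8 frames.
dark = [np.zeros((8, 8)) for _ in range(3)]
bright = [np.ones((8, 8)) for _ in range(3)]
print(shot_boundaries(dark + bright))  # -> [3]: the cut between the shots
```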
Music streaming services are now prevalent. In such services, it is important to understand listeners' satisfaction and to recommend relevant songs at the right moment. We are seeking answers to various research questions, such as: how to generate music playlists that satisfy users, how to recommend particular music clips based on listeners' states, how to sense listeners' responses to given songs, and how to predict the popularity of songs.
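As a minimal sketch of content-based recommendation, songs can be ranked by the cosine similarity between a listener's taste vector and each song's audio-feature vector. The song names, feature dimensions (tempo, energy, acousticness), and values below are all hypothetical, chosen only to make the example self-contained.

```python
import numpy as np

def recommend(user_profile, catalog, k=2):
    """Return the names of the top-k songs whose feature vectors are most
    similar (by cosine similarity) to the listener's taste vector."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {name: cosine(user_profile, feats) for name, feats in catalog.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical feature vectors: (tempo, energy, acousticness), scaled to [0, 1].
catalog = {
    "song_a": np.array([0.9, 0.8, 0.1]),  # fast, energetic
    "song_b": np.array([0.2, 0.1, 0.9]),  # slow, acoustic
    "song_c": np.array([0.8, 0.7, 0.2]),  # fast, energetic
}
listener = np.array([0.85, 0.75, 0.15])   # prefers energetic tracks
print(recommend(listener, catalog))       # -> ['song_a', 'song_c']
```

A deployed recommender would additionally condition on listening context and feedback signals rather than a static taste vector.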