Scientific Program

Conference Series Ltd invites participants from across the globe to attend the 4th Global Summit and Expo on Multimedia & Artificial Intelligence in Rome, Italy.

Day 14 :

  • Animation and Simulations | Artificial Intelligence | Image Processing | Computer Vision & Pattern Recognition | Multimedia Networking | Internet of Things

Chair

K Asari Vijayan

University of Dayton, USA


Co-Chair

Hector Perez-Meana

National Polytechnic Institute, Mexico

Session Introduction

Haipeng Peng

Beijing University of Posts and Telecommunications, China

Title: Semi-tensor product compressive sensing and its application in image processing
Biography:

Haipeng Peng received the MS degree in System Engineering from Shenyang University of Technology, Shenyang, China, in 2006, and the PhD degree in Signal and Information Processing from Beijing University of Posts and Telecommunications, Beijing, China, in 2010. He is currently a Professor at the School of Cyber Space Security, Beijing University of Posts and Telecommunications, China. His research interests include information security, network security, complex networks and control of dynamical systems. He is the co-author of over 100 scientific papers. He has received over 1400 SCI citations from other scholars and over 2700 Google citations.

 

Abstract:

Compressive sensing (CS) is a popular research topic at home and abroad. One of the foundations of modern signal processing is the Shannon-Nyquist sampling theory, in which the number of discrete samples required to reconstruct a signal without distortion is determined by its bandwidth. As a new sampling theory, CS exploits the sparsity of the signal to obtain discrete samples by random sampling at a rate far below the Shannon-Nyquist rate, and then reconstructs the original signal with non-linear reconstruction algorithms. CS was proposed by Terence Tao (Fields Medal winner), Emmanuel Candès (IEEE Fellow), and David Donoho (member of the US National Academy of Sciences) in 2004. The theory has received wide attention in academia and industry since it was put forward. It has been applied in signal processing, digital communication, network security, image processing, medical imaging, geologic survey, radiating systems, etc. It was rated as one of the top 10 scientific and technological advances of 2007 by American Science and Technology Review. This speech focuses on the latest research progress in semi-tensor compressive sensing theory and its application in image processing. It introduces the basic theory of CS from the point of view of collecting data more effectively, and then introduces our newly proposed semi-tensor compressive sensing theory, which breaks through the limitation of the traditional compressive sensing matrix. The dimension of the measurement matrix that must be processed during sampling and recovery is greatly reduced, so the signal is processed and restored at a lower sampling rate than in classical CS theory.
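The reduction in measurement-matrix size can be illustrated with a small sketch. It assumes the standard left semi-tensor product, A ⋉ B = (A ⊗ I_{t/n})(B ⊗ I_{t/p}) with t = lcm(n, p), which may differ in detail from the construction presented in the talk. The function name, the matrix sizes, the sparsity level and the random seed are illustrative choices only, and no reconstruction step is shown.

```python
import numpy as np


def semi_tensor_product(a, b):
    """Left semi-tensor product of two 2-D arrays.

    With n = a.shape[1], p = b.shape[0] and t = lcm(n, p), this returns
    (a kron I_{t/n}) @ (b kron I_{t/p}). It reduces to a @ b when n == p.
    """
    n, p = a.shape[1], b.shape[0]
    t = int(np.lcm(n, p))
    return np.kron(a, np.eye(t // n)) @ np.kron(b, np.eye(t // p))


rng = np.random.default_rng(0)            # illustrative seed
x = np.zeros((256, 1))                    # length-256 signal as a column
support = rng.choice(256, size=8, replace=False)
x[support, 0] = rng.standard_normal(8)    # 8 non-zero entries (sparse signal)

a_small = rng.standard_normal((16, 64))   # the only matrix that is stored
y = semi_tensor_product(a_small, x)       # 64 measurements of the signal

# A conventional measurement matrix for the same sizes would be 64 x 256.
print(y.shape, a_small.size, 64 * 256)
```

In this toy setting the stored matrix has 1,024 entries, whereas a conventional 64 x 256 measurement matrix would have 16,384.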

 

 

Speaker
Biography:

Qaysar S Mahdi completed his PhD and his Postdoctoral studies at Engineering Military College. He is the Director of the IT Services & Website Department, Ishik University. He has published more than 15 papers in reputed journals and has been serving as an Editorial Board Member.

 

Abstract:

Atmospheric propagation strongly affects the performance of wireless, mobile, radar, and communication systems. In this paper, different atmospheric models are constructed under different atmospheric conditions, and the performance of a GSM mobile communication system is tested under each model. The results show that the coverage of the mobile system antenna changes considerably when the refractive index model of a given country is changed. It is concluded that atmospheric propagation is an essential parameter to take into account when the siting of a GSM mobile network is evaluated and designed. This study will be useful for predicting the performance of ground radio and airborne systems.
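To give a feel for why the refractive index model matters for coverage, the sketch below uses common textbook relations rather than the author's models: the two-term radio refractivity formula, the effective Earth-radius factor k for a linear refractivity gradient, and the smooth-Earth radio horizon. The antenna height and the gradient values are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def refractivity(pressure_hpa, temperature_k, vapour_pressure_hpa):
    """Radio refractivity N (N-units) from the common two-term formula."""
    return 77.6 / temperature_k * (
        pressure_hpa + 4810.0 * vapour_pressure_hpa / temperature_k
    )


def effective_earth_factor(dn_dh_per_km):
    """Effective Earth-radius factor k for a linear gradient dN/dh.

    Not meaningful near dN/dh = -157 N-units/km, where the denominator
    approaches zero (ducting conditions).
    """
    return 1.0 / (1.0 + EARTH_RADIUS_KM * dn_dh_per_km * 1e-6)


def radio_horizon_km(antenna_height_m, dn_dh_per_km):
    """Smooth-Earth radio horizon for one antenna, in kilometres."""
    k = effective_earth_factor(dn_dh_per_km)
    return math.sqrt(2.0 * k * EARTH_RADIUS_KM * antenna_height_m / 1000.0)


# Illustrative gradients (N-units/km): standard, no refraction, stronger bending.
for gradient in (-39.0, 0.0, -80.0):
    print(gradient, round(radio_horizon_km(30.0, gradient), 2))
```

With the standard gradient of about -39 N-units/km, k is close to the familiar 4/3 value. A weaker gradient shortens the horizon of the same antenna and a stronger one extends it.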

 

 

Byung Geun Lee

Gwangju Institute of Science and Technology, South Korea

Title: Recent research on hardware neural network
Biography:

Byung Geun Lee received the BS degree in Electrical Engineering from Korea University, Seoul, South Korea, in 2000, and the MS and PhD degrees in Electrical and Computer Engineering from the University of Texas at Austin in 2004 and 2007, respectively. From 2008 to 2010, he was a Senior Design Engineer at Qualcomm Inc., San Diego, CA, USA, where he was involved in the development of mixed-signal ICs. Since 2010, he has been with the Gwangju Institute of Science and Technology, where he is currently an Associate Professor with the School of Electrical and Computer Science. His research interests include the development of high-speed data converters, CMOS image sensors, and neuromorphic systems.

 

 

Abstract:

An artificial neural network (ANN) is a computational model inspired by the neocortex of the human brain that is capable of solving a variety of problems in recognition, prediction, optimization, and control. It can also be described as a network of synaptically connected neurons that can create, modify, and preserve information through sequential learning procedures. Recently, the hardware implementation of artificial neural networks, called the hardware neural network (HNN), has been gaining popularity because of its potential for industrial applications requiring recognition, optimization, and prediction on complex data sets. However, hardware implementation issues need to be solved before HNNs can be widely adopted. In this presentation, I will summarize recent HNN implementation efforts, with the pros and cons of each approach.

 

Lixiang Li

Beijing University of Posts and Telecommunications, China

Title: Foraging behavior of ants and its application in optimization field
Biography:

Lixiang Li is currently a Professor at the School of Cyber Space Security, Beijing University of Posts and Telecommunications, China. She received the PhD degree in Signal and Information Processing from Beijing University of Posts and Telecommunications, Beijing, China, in 2006. Her interests include compressive sensing, swarm intelligence, neural networks and complex networks. She is the co-author of more than 150 papers. She has received over 2000 SCI citations from other scholars and over 3400 Google citations. In 2014, her swarm intelligence result published in PNAS was widely reported by more than 200 domestic and foreign media (such as Time Magazine, Science Daily, the Christian Science Monitor, the Daily Mail, and Science and Technology Daily). In 2015, her result on memristive neural networks published in EPJB was selected as a highlight paper and was reported by at least 28 international media.

 

 

Abstract:

Ants and other social animals have captured the attention of many scientists because of their self-organizing behavior and the high level of structure their colonies can achieve, especially when compared to the relative simplicity of the individuals. The study of the foraging behavior of group animals (especially ants) is of practical ecological importance, but it also contributes to the development of widely applicable optimization techniques. In recent years, algorithms inspired by models of animal group behaviors have achieved increasing success among researchers in computer science, communication networks and operations research. This talk introduces the basic mechanisms of effective foraging for social insects or group animals that have a home. The whole foraging process of ants is controlled by three successive strategies: hunting, homing, and path building. These learning strategies offer advantages for the Internet optimization process. The talk also introduces some dynamical models of ant foraging. We describe how foraging behavior is influenced by the special region around the nest, the size of the food source, the search range, the limits of the ants' physical ability, and the ants' learning process. Our analysis suggests that group animals that have a home do not perform random walks, but rather deterministic walks in a random environment. They use their knowledge to guide them, and their behavior is also influenced by their physical abilities, their age, and the existence of homes. In this talk, we will also introduce the application fields of ant foraging behavior, such as network optimization, signal processing, network security, and distributed control.

 

Hanmin Jung

Korea Institute of Science and Technology Information, South Korea

Title: Detection of uneven road surfaces on internet of vehicles
Biography:

Hanmin Jung has worked as the Head of the Scientific Data Research Center and Chief Researcher at Korea Institute of Science and Technology Information, Korea, since 2004. He received his BS, MS, and PhD degrees in Computer Science and Engineering from Pohang University of Science and Technology, Korea, in 1992, 1994, and 2003, respectively. Previously, he was a Senior Researcher at Electronics and Telecommunications Research Institute, Korea, and worked as CTO at DiQuest Inc., Korea. He is also a full Professor at University of Science & Technology, Korea; Visiting Professor at National Human Resources Development Institute, Korea; Visiting Fellow at University of Southampton, UK; Guest Professor at Graz University, Austria; Guest Professor at Paderborn University, Germany; Editor at Korea Contents Association; Director at Korean Society for Internet Information; Director at Korean Society for Big Data Service; Director at the Korean Society of Computer and Information; and Committee Member of ISO/IEC JTC1/SC32. His current research interests include the fourth industrial revolution, semantic web, artificial intelligence, text mining, big data, information retrieval, human-computer interaction (HCI), data analytics, natural language processing (NLP), and Internet of Things (IoT).

 

Abstract:

This study aims to find uneven road surfaces using an Internet of Vehicles (IoV) network composed of about 40 taxis and 2 dedicated vehicles. The taxis have been operating for 7 months in Daegu, the fourth-largest metropolitan city in Korea (883.56 km²), and transmit sensor data every 10 seconds. The network thus generates about 1.1 million sensing points per month. Ten types of sensors, including a vibration sensor, particulate matter sensors (PM10 and PM2.5), a carbon monoxide (CO) sensor, and a nitrogen dioxide (NO2) sensor, are mounted inside the taxi cab light. For the dedicated vehicles, 32 types of sensors, including a vibration sensor, an acceleration sensor, a gyro sensor, and a black box, are mounted inside a hard-shell carrier. In this study, we have drawn a heat map, as shown in Fig. 2, from the vibration-related sensor data without pre-fixed road segmentation. Intensive road imbalances have been observed on major arterial roads leading from Daegu city hall (center of the map) to the Seongseo industrial complex (left of the map), which implies that such roads are affected by frequent traffic and heavy trucks. Field inspections allowed us to identify a series of potholes and cracks at these points, many of which occur on maintained roads. This indicates that maintenance methods for preventing and repairing potholes and pavement cracks need to be considered, along with environmental factors such as temperature, humidity, and traffic volume. Future work will include monitoring and analyzing road conditions in other cities in Korea and abroad, and tracking changes in those conditions through stable IoV deployment. Further, we will investigate the relationships among the factors affecting road imbalances on major bridges, since such abnormalities can cause serious disasters.
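A heat map that does not rely on pre-fixed road segments can be built by binning sensing points into a regular grid, as in the minimal sketch below. This is not the authors' pipeline: the grid size, the synthetic coordinates and vibration values, and the function name `vibration_heat_map` are all assumptions made for illustration.

```python
import numpy as np


def vibration_heat_map(lon, lat, vibration, bins=50):
    """Mean vibration per grid cell; cells without samples are NaN."""
    total, lon_edges, lat_edges = np.histogram2d(
        lon, lat, bins=bins, weights=vibration
    )
    count, _, _ = np.histogram2d(lon, lat, bins=[lon_edges, lat_edges])
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count          # 0 / 0 becomes NaN for empty cells
    return mean, lon_edges, lat_edges


# Synthetic stand-in for the sensing points (not real measurements).
rng = np.random.default_rng(1)
lon = rng.uniform(128.4, 128.8, size=10_000)
lat = rng.uniform(35.7, 36.0, size=10_000)
vibration = rng.gamma(shape=2.0, scale=1.0, size=10_000)

heat, lon_edges, lat_edges = vibration_heat_map(lon, lat, vibration, bins=40)
print(heat.shape, np.nanmax(heat))
```

Cells with a high mean value would be candidates for field inspection. A plotting library can render `heat` directly as an image.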

 

Break: Networking & Refreshments 15:55-16:15 @ Foyer

Michael Robert Doran

Translational Research Institute - Queensland University of Technology, Australia

Title: Use of multimedia in grant applications
Biography:

Michael Robert Doran completed a combined Postdoctoral Fellowship at the University of Queensland and Mater Medical Research Institute. He is a National Health and Medical Research Fellow and an Associate Professor at the Queensland University of Technology (Australia). Currently his laboratory is located at the Translational Research Institute (TRI) on the Princess Alexandra Hospital campus in Brisbane. His group’s multidisciplinary research interests include the study of bone, bone marrow, cartilage, and cancers that metastasize to the bone.

 

Abstract:

The Problem: Researchers and funding organizations are struggling with the ever-increasing time and effort needed to prepare and review grant proposals. Ioannidis argued in Nature that burdensome funding systems mean that “scientists don’t have time for science any more”. We estimated that Australian researchers invest an average of 38 days preparing each new NHMRC Project Grant proposal, and 28 days on a resubmission. Based on anecdotal data and the similarity of the systems, we believe that ARC Discovery Project time burdens are similar. In 2017, Australian researchers submitted 3,136 ARC Discovery Project Grant proposals and 3,345 NHMRC Project Grant proposals. Assuming a conservative time of 28 working days per proposal, Australian researchers would have invested around 500 years preparing these proposals in 2017. Time is also needed to review proposals, and in 2011 we estimated that $1,700 of reviewer time is needed to review a project grant proposal, giving an estimated $5.6 million in review costs for NHMRC Project Grants alone. Despite the enormous investment by applicants and reviewers, estimates are that for one-third of grant proposals, success is somewhat random because of the variability among peer reviewers.
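The "around 500 years" figure can be reproduced from the numbers quoted in the paragraph above. The sketch assumes that the total number of days is divided by 365 days per year; the abstract does not state the conversion used.

```python
arc_proposals = 3136        # ARC Discovery Project proposals, 2017
nhmrc_proposals = 3345      # NHMRC Project Grant proposals, 2017
days_per_proposal = 28      # conservative estimate quoted in the abstract

total_days = (arc_proposals + nhmrc_proposals) * days_per_proposal
years = total_days / 365    # assumption: 365 days per year

print(total_days, round(years))
```

Under that assumption the total is 181,468 days, or roughly 497 years.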

 

Necessary Change: The grant preparation and review systems must improve to address the current challenges. Funding allocation should remain merit-based, but preparing proposals should be less burdensome. Written proposals are often dense and tiring to review. The proposal format should engage reviewers and clearly contain the detailed information needed to assess the proposal’s feasibility, novelty and impact.

 

Possible Solution: Recently we argued in Nature, Trends in Biochemical Sciences, a Cell Press Video and Nature Index that video is an effective mechanism for enhancing communication between applicants and reviewers. Researchers routinely prepare PowerPoint presentations for conferences and record such presentations as lecture material. PowerPoint presentations with voice recordings are a logical potential alternative to written project descriptions. Such videos may be highly effective at transferring the key ideas from the minds of the authors to the reviewers, leading to better decision-making. This talk will outline our preliminary data and discuss the merits of trialing 15-minute PowerPoint presentations, with voice recording, as an alternative to traditional text-based grant applications. We reason that this approach will enable more effective and efficient communication and more reliable ranking of proposals than current written grant project descriptions.

 

 

Speaker
Biography:

Cagcag Yolcu Ozge is a Professor at the Department of Industrial Engineering of Giresun University, Turkey. She received the MSc (Thesis title: Optimization of patient waiting time in the department of brain surgery of Ondokuz Mayis University by simulation) and the PhD (Thesis title: A new hybrid fuzzy time series approach) degrees in statistics at the Faculty of Science and Arts of the University of Ondokuz Mayis, Turkey, in 2010 and 2013, respectively. In 2015 she spent a year as a Postdoctoral researcher in the Robotics Research Group, Department of Informatics, at King’s College London. Her expertise is in time series analysis, fuzzy inference systems, artificial neural networks, artificial intelligence optimization algorithms, and robust statistics. She has published various studies on time series prediction models, including fuzzy inference systems and computational systems.

 

Abstract:

As a data mining field, time series analysis has been one of the main research subjects for decades. Many models have been put forward in the literature for prediction problems. Traditional prediction models, which can be grouped as probabilistic methods, may fail on complex real-world time series because of the several assumptions they require. An effective alternative is to use advanced time series prediction models that can be grouped as non-probabilistic, including fuzzy inference systems based on fuzzy sets and fuzzy arithmetic, and computational inference systems based on artificial neural networks. Because they require no such assumptions, advanced time series prediction models are applicable in many fields. The multilayer perceptron (MLP), whose neuron model goes back to McCulloch and Pitts (1943), has been commonly used as a computational method. The single multiplicative neuron model (S-MNM), which does not contain this type of problem, was introduced by Yadav et al. (2007). In contrast to the MLP, which uses an additive function, S-MNM uses a multiplicative function as the aggregation function in its neuron. An ANN structure named linear and non-linear artificial neural network, which incorporates the properties of these two neural networks, was suggested by Yolcu et al. (2013). Fuzzy time series (FTS) approaches, introduced by Song and Chissom (1993), are another prediction tool that has been used efficiently in recent years. To improve prediction performance, Cagcag Yolcu (2013) and Cagcag Yolcu et al. (2016) put forward hybrid FTS models that combine artificial neural networks and fuzzy clustering. This talk will emphasize why researchers need non-probabilistic prediction methods and will discuss some models in fuzzy inference systems and computational inference systems, along with some of their applications.
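The difference between additive and multiplicative aggregation can be sketched as follows. This is a minimal illustration that assumes the commonly cited form of the single multiplicative neuron, output = f(∏(w_i·x_i + b_i)) with a logistic activation f; the weights, biases and lagged inputs are made-up values, and no training procedure is shown.

```python
import numpy as np


def sigmoid(u):
    """Logistic activation."""
    return 1.0 / (1.0 + np.exp(-u))


def additive_neuron(x, w, b):
    """Additive aggregation: sigmoid(sum_i w_i * x_i + b), one scalar bias."""
    return sigmoid(np.dot(w, x) + b)


def multiplicative_neuron(x, w, b):
    """Multiplicative aggregation: sigmoid(prod_i (w_i * x_i + b_i))."""
    return sigmoid(np.prod(w * x + b))


# Three lagged observations of a (scaled) time series as inputs.
lags = np.array([0.42, 0.45, 0.51])
weights = np.array([0.8, -0.3, 1.1])      # illustrative values
biases = np.array([0.5, 0.9, 0.2])        # one bias per input

print(additive_neuron(lags, weights, 0.1))
print(multiplicative_neuron(lags, weights, biases))
```

The multiplicative neuron has one weight and one bias per input and no hidden layer, so the number of parameters is fixed once the number of lagged inputs is chosen.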

 

 

Speaker
Biography:

Tian Tian received her BS and PhD from Huazhong University of Science and Technology (HUST), China, and visited Oakland University as a Visiting Scholar in 2012. She is currently with the School of Computer Science, China University of Geosciences, Wuhan, China. Her major interests include computer vision and remote sensing image processing, and she has worked on feature extraction and image classification methods since her postgraduate studies. Currently, she focuses on local image descriptors, biologically inspired image processing and deep learning methods for remote sensing image classification.

 

 

Abstract:

Statement of the Problem: Color information has been acknowledged for its important role in object recognition and scene classification. Describing color characteristics and extracting combined spatial and chromatic features is a challenging task in computer vision. Plenty of approaches have been proposed for describing colors. Studying our visual system and mimicking its structure and mechanism has naturally been proposed in research on color perception.

 

Methodology: The color information processing is implemented under a biologically inspired hierarchical framework, in which cone cells, single-opponent (SO) cells and double-opponent (DO) cells are simulated in turn to mimic the color perception of the primate visual system. In more detail, SO responses are obtained by linear combination of retinal cone-like responses generated by Gaussian functions. The receptive field of a DO cell is seen as the overlap of two oriented cells with inverse phases, and the double-opponent channels are simulated with oriented filters of inverse phases. Then the robust SIFT feature is extended as a shape description on the processed opponent color channels to obtain a spatio-chromatic descriptor for color object recognition.

 

Findings: The biologically inspired method is tested for color object recognition task on two public datasets, and the results support the potential of the proposed approach.

 

 

  • Virtual reality | Neural Networks | Artificial Intelligence | Image Processing | Computer Vision & Pattern Recognition | Multimedia Networking
Location: Olimpica 2

Chair

Daphne Economou

Westminster University, UK

Session Introduction

Hector Perez Meana

National Polytechnic Institute, Mexico

Title: Face expression recognition in constrained and unconstrained environments
Biography:

Hector Perez Meana received his PhD degree in Electrical Engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 1989. He is the Dean of the Graduate Studies and Research Section of the Mechanical and Electrical Engineering School, Culhuacan Campus, of the National Polytechnic Institute of Mexico. In 1991 he received the IEICE Excellent Paper Award, and in 2000 the IPN Research Award and the IPN Research Diploma. In 1998 he was Chair of ISITA’98, and in 2009 the General Chair of the IEEE Midwest Symposium on Circuits and Systems (MWSCAS). He has published more than 150 papers in indexed journals and two books. He has also directed 20 PhD theses. He is a senior member of the IEEE and a member of the IEICE, the Mexican Researcher System and the Mexican Academy of Science. His principal research interests are adaptive systems, image processing, pattern recognition, watermarking and related fields.

 

Abstract:

Facial expression recognition (FER) systems are used to recognize a person's mood. Because determining the mood of a given person may be important in several practical applications, several efficient algorithms have been proposed to this end. Most of them achieve high recognition rates under controlled conditions of lighting and of the position of the person with respect to the camera. Most FER systems use the Viola-Jones algorithm for face detection in both images and video frames. However, because the eye and mouth regions provide the most relevant information for FER systems, some segmentation scheme must be used to estimate the region of interest (ROI) used for feature extraction. Besides ROI estimation, the orientation of the face relative to the camera is another important issue: if the person is not looking straight at the camera, partial occlusion of the face may occur, as may shadows due to poor illumination. To reduce these problems, we propose an algorithm that detects the face orientation in the frame under analysis, so that the ROI is estimated only if the face is perpendicular to the camera. After ROI estimation, each region is segmented into a set of N×M blocks to get the feature vector using the modal value. The resulting feature matrix is then passed to PCA and LDA for dimensionality reduction. The proposed algorithm was trained using the KDEF database, which consists of 490 images of 70 people divided into 7 facial expressions (Afraid, Angry, Disgusted, Happy, Sad, Surprise and Neutral). Finally, the proposed system is tested using the HOHA database, which consists of 150 videos from 32 movies. The evaluation results show that the proposed system provides recognition rates of about 90%.
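The block-wise feature step can be sketched as follows, under the assumption that "modal value" means the most frequent gray level inside each block of an 8-bit ROI. The function name, the toy ROI and the tie-breaking rule are illustrative; face detection, orientation checking, PCA and LDA are not shown.

```python
import numpy as np


def block_mode_features(roi, n_rows, n_cols):
    """Split a 2-D uint8 ROI into n_rows x n_cols blocks and return
    the most frequent gray level of each block, row by row.

    Ties are resolved in favour of the smallest gray level.
    """
    features = []
    for band in np.array_split(roi, n_rows, axis=0):
        for block in np.array_split(band, n_cols, axis=1):
            counts = np.bincount(block.ravel(), minlength=256)
            features.append(int(np.argmax(counts)))
    return np.array(features)


# Toy 4 x 4 "ROI" split into 2 x 2 blocks.
roi = np.array(
    [[1, 1, 2, 2],
     [1, 3, 2, 4],
     [5, 5, 7, 7],
     [5, 6, 7, 8]],
    dtype=np.uint8,
)
print(block_mode_features(roi, 2, 2))
```

Stacking one such vector per training image gives the feature matrix that would then be passed to a dimensionality reduction step.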

 

 

Break: Lunch Break 13:00-14:00 @ Hotel Restaurant
Speaker
Biography:

Benjamin Seide has expertise in Animation, Visual Effects and Virtual Reality. His visual effects work as an animation practitioner contributed to international feature films such as Roman Polanski's Oliver Twist, Wim Wenders' Don't Come Knocking and Martin Scorsese’s Hugo. His research focuses on interdisciplinary collaboration of art and technology and on immersive media experiences such as 360° films, 3D stereoscopy and Virtual Reality.

Abstract:

Cultural heritage commonly utilizes laser scanning, CGI, 360-degree imagery and photogrammetry, aiming to create photorealistic and accurate representations of historical environments. The goal of being as accurate and realistic as possible has not been fully accomplished yet, but considering the rate of improvement, virtual environments and augmented extensions will become indistinguishable from reality. A countermovement of artists and researchers creates artistic impressions of virtual environments, not aiming for photorealistic perfection but adding an interpretation to the debate on how the deeper meaning beyond the visual representation can best be represented. Philosopher Merleau-Ponty quotes Rodin, “It is the artist who is truthful, while the photograph lies; for, in reality, time never stops.” This research project investigates the possibilities of impressionism in virtual reality by exploring and comparing the effect of stylized interpretations, photorealistic representations, and attempted but failed photorealism in virtual reality environments. I propose that the meaning of heritage is not just the form of a heritage site but could be understood as different layers. Interactive and immersive applications, such as augmented and virtual reality applications, enable us to explore alternative layers beyond the basic image acquisition. These layers are commonly understood as additional information layers, from superimposed text providing more detailed information to animated CG characters performing a relevant historic scene inside the virtual environment. Layers of meaning could also be interpreted in a more artistic sense by creating impressions rather than photorealistic representations. These artistic impressions utilize animation, laser scanning and photogrammetry to create representations on an abstract interpretation level, aiming to create a sense of atmosphere and trigger a stronger emotional response.

Speaker
Biography:

Shi Jinn Horng received the BS degree in Electronic Engineering from National Taiwan Institute of Technology, Taiwan; the MS degree in Information Engineering from National Central University, Taiwan, and the PhD degree in Computer Science from National Tsing Hua University, Taiwan, in 1980, 1984, and 1989, respectively. Currently, he is a Chair Professor in the Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology. His research interests include Deep Learning, Biometric Recognitions and Image Processing.

Abstract:

Because it is difficult to find the specific features of faces, low-resolution face image recognition is one of the challenging problems in computer vision, and recognition accuracy is still quite low. We address this problem using deep learning techniques. The proposed method has two major parts: first, a restricted Boltzmann machine is used to preprocess the face images; then, a deep convolutional neural network is used for classification. The data set combines face images from the Georgia Institute of Technology and from Aleix Martinez and Robert Benavente. We conducted the training and testing processes on this combined data. The proposed method is the first to combine a restricted Boltzmann machine and deep convolutional neural networks for low-resolution face image recognition. The experimental results show that, compared to existing methods, the proposed method greatly improves recognition accuracy. The proposed method is shown in Figure 1. The experimental results are shown in Table 1.

 

Image

 

Kai Lung Hua

National Taiwan University of Science and Technology, Taiwan

Title: Multimodal image popularity prediction on social media
Biography:

Kai Lung Hua received the BS degree in Electrical Engineering from National Tsing Hua University in 2000, and the MS degree in Communication Engineering from National Chiao Tung University in 2002, both in Hsinchu, Taiwan. He received the PhD degree from the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, in 2010. Since 2010, he has been with National Taiwan University of Science and Technology, where he is currently an Associate Professor in the Department of Computer Science and Information Engineering. He is a member of Eta Kappa Nu and Phi Tau Phi, as well as a recipient of MediaTek Doctoral Fellowship. His current research interests include digital image and video processing, computer vision, and multimedia networking. He has received several research awards, including Top 10% Paper Award of 2015 IEEE International Workshop on Multimedia Signal Processing, the Second Award of the 2014 ACM Multimedia Grand Challenge, the Best Paper Award of the 2013 IEEE International Symposium on Consumer Electronics, and the Best Poster Paper Award of the 2012 International Conference on 3D Systems and Applications.

 

Abstract:

Social media websites are among the most important channels for content sharing and communication between users on social networks. The images posted on these websites, even those from the same user, generally obtain very different numbers of views. This motivates researchers to predict the popularity of a candidate image on social media. To address this task, we investigate the effects of multiple modalities: user profile, post metadata, and photo aesthetics. The proposed method is evaluated on a large number of real image posts from Flickr. The experimental results verify the effectiveness of the proposed method.

Speaker
Biography:

Xiaodong Huang is an Associate Professor at Capital Normal University, China. He received his PhD degree in Computer Science from the Beijing University of Posts and Telecommunications in 2010, his MS degree in Computer Science from the Beijing University of Posts and Telecommunications in 2006, and his BS degree in Computer Science from Wuhan University of Technology in 1995. His research interests include pattern recognition and computer vision.

Abstract:

Compared with other video semantic clues, such as gestures and motions, video text generally provides highly useful and fairly precise semantic information, the analysis of which can greatly facilitate video and scene understanding. Video text can be observed to show stronger edges. The nonsubsampled contourlet transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edges and silhouettes of text characters well. Therefore, in this paper, a new approach is proposed to detect video text based on NSCT. First, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which keeps the horizontal, vertical and diagonal edge features and suppresses other directional edge features. Then the various directional pixels of the DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations on our collected TV video data set demonstrate that our method significantly outperforms 3 other video text detection algorithms in both detection speed and accuracy, especially under challenges such as video text with various sizes, languages, colors and fonts, and short or long text lines.

 

Break: Networking & Refreshments 15:40-16:05 @ Foyer
Biography:

Daijin Kim received the BS degree in Electronic Engineering from Yonsei University, Seoul, South Korea, in 1981, and the MS degree in Electrical Engineering from the Korea Advanced Institute of Science and Technology (KAIST), Taejon, in 1984. In 1991, he received the PhD degree in Electrical and Computer Engineering from Syracuse University, Syracuse, NY. During 1992-1999, he was an Associate Professor in the Department of Computer Engineering at DongA University, Pusan, Korea. He is currently a Professor in the Department of Computer Science and Engineering at POSTECH, Pohang, Korea. His research interests include face and human analysis, machine intelligence and advanced driver assistance systems.

Abstract:

Recently, many face alignment methods using convolutional neural networks (CNN) have been introduced because of their high accuracy. However, they do not run in real time because of their high computational costs. In this paper, we propose a three-stage convolutional neural regression network (CNRN) to achieve highly accurate face alignment in real time. The first stage consists of one CNRN that maps the facial image to the center positions of seven facial parts, such as the eyes, nose and mouth. We obtain 68 local facial patches by aligning the center positions of the seven facial parts onto the mean shape. The second stage consists of seven independent CNRNs, where each CNRN maps the local facial patches within its facial part to their displacements in the x and y directions needed to reach the target positions. We obtain the fitted whole facial features and make a warped facial image from them. The third stage consists of one CNRN that maps the warped facial image to the appearance error. We repeat the second and third stages until the appearance error becomes small. The proposed method is fast because it first fits the facial parts and then the facial features within each facial part, in a coarse-to-fine manner, and because each CNRN is relatively simple. The proposed method is highly accurate because it fits the facial features iteratively by performing local regression on the facial features and global regression on the warped appearance image. In the experiments, the proposed method yields more accurate and stable face alignment and tracking under heavy occlusion and large pose variation than existing state-of-the-art methods, and runs in real time.

 

Yaqi Mi

Beijing University of Posts and Telecommunications, China

Title: A new matrix operation in compressed sensing
Biography:

Yaqi Mi received her Bachelor of Science degree in Computer Science and Technology from Zhejiang University of Finance & Economics, Hangzhou, China, in 2016. She is now working towards the Master of Science degree in Information Security at Beijing University of Posts and Telecommunications, Beijing, China. Her major interests are compressive sensing and signal processing.

Abstract:

In this speech, we propose a new matrix operation called the P-tensor product (PTP) and apply it to compressed sensing (CS); the new CS model is named PTP-CS. To break the restrictions of traditional matrix multiplication, the PTP makes the dimensions of two matrices match by means of the Kronecker product. To address the large storage required by the random matrix in CS, the PTP can construct a high-dimensional matrix from a smaller matrix, which can be chosen as a random matrix or a generalized permutation matrix. As in traditional CS, we analyze some reconstruction conditions of PTP-CS, such as the spark, the coherence and the restricted isometry property (RIP). The experimental results demonstrate that our PTP-CS model not only increases the choice of Kronecker matrix and decreases the storage of traditional CS, but also maintains considerable recovery performance.
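The storage argument can be illustrated with a generic Kronecker-structured measurement. The abstract does not define the PTP, so the sketch assumes, for illustration only, that the effective measurement matrix has the form A ⊗ P for a small stored matrix A and a small matrix P (here a generalized permutation matrix). The function name, sizes and seed are hypothetical. The product is applied through the identity (A ⊗ P)x = vec(A X Pᵀ), with X the row-major reshape of x, so the large matrix is never formed.

```python
import numpy as np


def kron_measure(a, p, x):
    """Compute (a kron p) @ x without building the Kronecker product.

    a has shape (m, n), p has shape (r, t) and x has length n * t.
    Uses (a kron p) @ x == (a @ X @ p.T).ravel() with X = x.reshape(n, t).
    """
    n, t = a.shape[1], p.shape[1]
    return (a @ x.reshape(n, t) @ p.T).ravel()


rng = np.random.default_rng(2)                # illustrative seed
a = rng.standard_normal((16, 64))             # small stored random matrix
# Generalized permutation matrix: one non-zero entry per row and column.
p = np.eye(4)[[2, 0, 3, 1]] * np.array([1.0, -1.0, 1.0, -1.0])
x = rng.standard_normal(256)

y = kron_measure(a, p, x)
print(y.shape, np.allclose(y, np.kron(a, p) @ x))
```

Only the 16 x 64 matrix and the 4 x 4 matrix are stored, while the measurement is equivalent to multiplying by a 64 x 256 matrix.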