Dr Wanqing Li B.Sc (ZJU), M.Sc (ZJU), PhD (UWA), SMIEEE
Office: Building 3.101
Tel: +61 2 4221 5410 or 4661, Fax: +61 2 4221 4170 or +61 2 4227 3277
Wanqing Li received his PhD in electronic engineering from The University of Western Australia. He was a Principal Researcher at Motorola Lab from 1998 to 2003 and a visiting researcher at Microsoft Research, Redmond in 2008, 2010 and 2013. He is currently an Associate Professor and Co-Director of the Advanced Multimedia Research Lab (AMRL) at the University of Wollongong, Australia. His research areas include 3D computer vision, 3D multimedia signal processing and medical image analysis.
Dr. Li is a Senior Member of IEEE and currently a co-chair of the 3D Rendering, Processing and Communications Interest Group, Multimedia Technical Committee of the IEEE Communications Society. He is a guest editor of the special issue on Human Activity Understanding from 2D and 3D Data (2015), International Journal of Computer Vision, and the special issue on Visual Understanding and Applications with RGB-D Cameras (2013), Journal of Visual Communication and Image Representation. He served as a co-organizer of the IEEE International Workshop on Human Activity Understanding from 3D Data (HAU3D) (2011-2013) and Hot Topics in 3D Multimedia (Hot3D) (2014), an area chair of the International Conference on Multimedia & Expo (ICME) 2014, a publication chair of the IEEE Workshop on Multimedia Signal Processing (MMSP) 2008, General Co-Chair of ASIACCS'09 and DRMTICS'05, and a technical committee member of numerous international conferences and workshops including CVPR, ICME, ICIP, MMSP and 3DTV-Con.
· Machine Learning and 3D Computer Vision – Human activity understanding, human detection, gait recognition, 3D sensing and reconstruction from RGB-D data
· 3D Multimedia Signal Processing and Understanding – scene analysis and event detection
· Free Viewpoint Video (FVV) – Acquisition, processing, understanding and compression
· Medical Image Processing and Understanding – Image reconstruction for low-dose X-ray and fast MR imaging, segmentation of medical images
Refereed Journal Articles (Selected)
14. C. Tang, J. Wu, Y. Hou, P. Wang and W. Li, A Spectral and Spatial Approach of Coarse-to-Fine Blurred Image Region Detection, IEEE Signal Processing Letters, 23(11), pp.1652-1656, 2016
19. P. Wang, W. Li, Z. Gao, J. Zhang, C. Tang and P. Ogunbona, Action Recognition from Depth Maps Using Deep Convolutional Neural Networks, IEEE Trans. Human-Machine Systems, 46(4), pp. 498-509, 2016
20. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, Human Detection from Images and Videos: A Survey, Pattern Recognition, 51, 2016, 148-175
21. Yasmine Probst, Duc Thanh Nguyen, Minh Khoi Tran and Wanqing Li, Dietary Assessment on a Mobile Phone Using Image Processing and Pattern Recognition Techniques: Algorithm Design and System Prototyping, Nutrients, 7(8), 2015, 6128-38
22. J. Zhang, L. Wang, L. Zhou, and W. Li, Learning Discriminative Stein Kernel for SPD Matrices and Its Applications, IEEE Trans Neural Networks and Learning Systems, (to appear, online first on 17 June 2015)
23. J. Zhang, L. Zhou, L. Wang and W. Li, Functional Brain Network Classification With Compact Representation of SICE Matrices, IEEE Trans Biomedical Engineering, 62(6), 2015, pp.1623-1634
24. H. Shidanshidi, F. Safaei and W. Li, Estimation of Signal Distortion Using Effective Sampling Density for Light Field-based Free Viewpoint Video, IEEE Trans Multimedia, 17(10), 2015, pp. 1677-1693
25. D. T. Nguyen, Z. Zong, P. Ogunbona, Y. Probst, W. Li, Food image classification using local appearance and global structural information, Neurocomputing, 140, 2014, pp.242-251.
26. H. Tian, W. Li, L. Wang and P. Ogunbona, Smoke Detection in Video: An Image Separation Approach, International Journal of Computer Vision, 106, 2013, pp.192-209.
27. Duc Thanh Nguyen, Wanqing Li, Philip O. Ogunbona, Inter-Occlusion Reasoning for Human Detection Based on Variational Mean Field, Neurocomputing, 110, 2013, pp.56-61.
28. Thanh Duc Nguyen, P. Ogunbona and W. Li, A Novel Shape-Based Non-Redundant Local Binary Pattern Descriptor for Object Detection, Pattern Recognition, 46(5), 2013, pp.1485-1500.
29. C. Zhan, W. Li and P. Ogunbona, Measuring the Degree of Face Familiarity Based on Extended NMF, ACM Transactions on Applied Perception, 10(2), 2013, pp.8:1-8:21.
30. Jianhua Luo, Shanshan Wang, Wanqing Li and Yuemin Zhu, Removal of Truncation Artefacts in Magnetic Resonance by Recovering Missing Spectral Data, Journal of Magnetic Resonance, 224, 2012, pp.82-93.
31. C. Zhan, W. Li and P. Ogunbona, Local representation of faces through extended NMF, Electronics Letters, 48(7), 2012, pp.373-375.
32. J. Luo, Y. Zhu, W. Li, P. Croisille and I. E. Magnin, MRI Reconstruction From 2D Truncated k-Space, Journal of Magnetic Resonance Imaging, 35(5), 2012, pp.1196-206
33. J. Luo, J. Liu, W. Li and Y. Zhu, Image Reconstruction from Sparse Projections Using S-Transform, Journal of Mathematical Imaging and Vision, 43, 2012, pp.227-239.
34. Wanqing Li, Philip Ogunbona, Chris deSilver and Yannia Attikiouzel, Semi-Supervised MAP Segmentation of Brain Tissues from Dual Echo MR Scans Using Incomplete Training Data, IET Image Processing, 5(3), pp.222-232, April 2011.
35. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, A Local Intensity Distribution Descriptor for Object Detection, Electronics Letters, 47(5), 2011, p. 322-324.
38. L. Dong, G. Yu, P. Ogunbona and W. Li, An Efficient Iterative Algorithm for Image Thresholding, Pattern Recognition Letters, 29, 2008, pp.1311-1316.
39. J. Randall, L. Guan, W. Li and X. Zhang, The HCM for Perceptual Image Segmentation, NeuroComputing, 71(10-12), 2008, pp.1966-1979.
40. Weerasinghe, C.,
41. Ce Zhan,
42. I. Kharitonenko, W. Li, C. Weerasinghe, X. Zhang, A Prototype of Intelligent Video Surveillance Cameras, International Journal of Information and Systems Science, 3(3), Sept. 2007. pp.222-230.
43. J. Randall, L. Guan, and W. Li, A Hierarchical Neural Network Model for Image Analysis, International Journal of Fuzzy Systems, Vol.6, No.3, September 2004, pp.136-146.
44. W. Li, P. Ogunbona, Y. Shi, and I. Kharitonenko, CMOS sensor cross-talk compensation for digital cameras, IEEE Trans Consumer Electronics, Volume: 48 Issue: 2 , May 2002, pp.292-297.
45. J. C. Bezdek, W. Li, Y. Attikiouzel, M. Windham, A geometric approach to cluster validity for normal mixture, Soft Computing, 1, 1997, pp.166-179.
Refereed International Conference Papers (Selected)
46. J. Zhang, W. Li and P. Ogunbona, Joint Geometrical and Statistical Alignment for Visual Domain Adaptation, IEEE CVPR 2017
47. P. Wang, W. Li and P. Ogunbona, Scene flow to action map: A new representation for RGB-D based action recognition with convolutional neural networks, IEEE CVPR 2017
48. Z. Ding, W. Li, P. Wang, P. Ogunbona and L. Qin, Weakly structured information aggregation for upper-body posture assessment using CONVNETS, IEEE ICME 2017
49. P. Wang, W. Li, S. Liu, Z. Gao, C. Tang and P. Ogunbona, Large-scale Isolated Gesture Recognition Using Convolutional Neural Networks, ICPR ChaLearn Contest of Isolated Gesture Recognition 2016 (2nd Place)
50. P. Wang, W. Li, S. Liu, Y. Zhang, Z. Gao, P. Ogunbona, Large-scale Continuous Gesture Recognition Using Convolutional Neural Networks, ICPR ChaLearn Contest of Continuous Gesture Recognition 2016 (3rd Place)
55. L. Wang, J. Zhang, L. Zhou, C. Tang, and W. Li, Beyond Covariance: Feature Representation with Nonlinear Kernel Matrices, International Conference on Computer Vision (ICCV), 2015
56. Pichao Wang, Wanqing Li, Zhimin Gao, Chang Tang, Jing Zhang and Philip Ogunbona, ConvNets-Based Action Recognition from Depth Maps through Virtual Cameras and Pseudocoloring, ACM Multimedia 2015 (accepted)
57. Song Liu, Wanqing Li, Philip Ogunbona and Yang-Wai Chow, Creating Simplified 3D Models with High Quality Textures, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2015 (oral, APRS best paper award)
58. H. Shidanshidi, F. Safaei, and W. Li, Optimization Of The Number Of Rays In Interpolation For Light Field Based Free Viewpoint Systems, IEEE ICME 2015
59. Pichao Wang, Wanqing Li, Philip Ogunbona, Zhimin Gao and Hanling Zhang, Mining Mid-level Features for Action Recognition Based on Effective Skeleton Representation, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2014
60. Lijuan Zhou, Wanqing Li, Yuyao Zhang, Philip Ogunbona, Duc Thanh Nguyen and Hanling Zhang, Discriminative Key Pose Extraction using Extended LC-KSVD for Action Recognition, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2014.
61. H. Tian, W. Li, P. Ogunbona and L. Wang, Single Image Smoke Detection, Asian Conference on Computer Vision (ACCV) 2014.
62. J. Zhang, L. Zhou, L. Wang, and W. Li, Exploring Compact Representation of SICE Matrices for Functional Brain Network Classification, MICCAI Workshop on Machine Learning in Medical Imaging (MLMI), Boston, USA, 2014
63. Jianjia Zhang, Lei Wang, Lingqiao Liu, Luping Zhou and Wanqing Li, Accelerating the Divisive Information-Theoretic Clustering of Visual Words, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2013
64. Yuyao Zhang, Philip O. Ogunbona, Wanqing Li, Bridget Munro and Gordon G. Wallace, Pathological Gait Detection of Parkinson’s Disease using Sparse Representation, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2013
65. F. Safaei, P. Mokhtarian, H. Shidanshidi, W. Li, M. Namazi-Rad, A. Mousavinia, Scene-adaptive Configuration of Two Cameras using the Correspondence Field Function, IEEE ICME 2013, San Jose, USA, 15-19 July 2013 (oral, nomination for the best paper award).
66. H. Shidanshidi, F. Safaei, and W. Li, A method for calculating the minimum number of cameras in a light field based free viewpoint video system, IEEE ICME 2013, San Jose, USA, 15-19 July 2013 (oral).
67. H. Shidanshidi, F. Safaei, and W. Li, Non-uniform sampling of plenoptic signal based on the scene complexity variations for a free viewpoint video system, IEEE ICIP 2013, Melbourne, Australia, 15-18 September 2013.
68. Lei Wang, Jianjia Zhang, Luping Zhou, Wanqing Li, A Fast Approximate AIB Algorithm for Distributional Word Clustering, IEEE CVPR, Portland, Oregon, 2013.
69. Elahe Farahzadeh, Cham Tat-jen and Wanqing Li, Incorporating Local and Global Information using a Novel Distance Function for Scene Recognition, IEEE Workshop On Robot Vision (WoRV) 2013.
70. Hongda Tian, Wanqing Li, Lei Wang, Philip Ogunbona, A Novel Video-Based Smoke Detection Method Using Image Separation, IEEE ICME 2012.
71. Ce Zhan, Wanqing Li, and Philip Ogunbona, Measuring Face Familiarity and Its Application to Face Recognition, IEEE Workshop on the Applications of Computer Vision (WACV) 2012
72. Qishen Wang, Ou Wu, Weiming Hu, Jinfeng Yang and Wanqing Li, Ranking Social Emotions by Learning Listwise Preference, Asian Conference on Pattern Recognition (ACPR), 2011
73. Ce Zhan, Wanqing Li, and Philip Ogunbona, Face Representation Based on Extended Non-negative Matrix Factorization, International Conference Image and Vision Computing New Zealand 2011
74. Ce Zhan, Wanqing Li, and Philip Ogunbona, Age Estimation Based on Extended Non-negative Matrix Factorization, IEEE Workshop on Multimedia Signal Processing 2011
75. Hongda Tian, Wanqing Li, Philip Ogunbona, Duc Thanh Nguyen, Ce Zhan, Smoke Detection in Videos Using Non-Redundant Local Binary Pattern-Based Features, IEEE Workshop on Multimedia Signal Processing 2011
76. Hooman Shidanshidi, Farzad Safaei, Wanqing Li, Objective Evaluation Of Light Field Rendering Methods Using Effective Sampling Density, IEEE Workshop on Multimedia Signal Processing 2011
77. Ramakrishna Kakarala, Prabhu Kaliamoorthi, Wanqing Li, Viewpoint invariants from three-dimensional data: the role of reflection in human activity understanding, Proc. Computer Vision and Pattern Recognition (CVPR) Workshop on Human Activity Understanding from 3D Data, 2011
78. Duc Thanh Nguyen, Philip Ogunbona, and Wanqing Li, Detecting Humans Under Occlusion Using Variational Mean Field Method, IEEE ICIP 2011
79. Duc Thanh Nguyen, Philip Ogunbona, and Wanqing Li, Human Detection With Contour-Based Local Motion Binary Patterns, IEEE ICIP 2011
80. Hooman Shidanshidi, Farzad Safaei, Wanqing Li, A Quantitative Approach For Comparison And Evaluation Of Light Field Rendering Techniques, IEEE ICME 2011
81. Ce Zhan,
82. Duc Thanh Nguyen,
83. Duc Thanh Nguyen, Zhimin Zong, Philip Ogunbona, Wanqing Li, Object Detection Using Non-Redundant Local Binary Patterns, IEEE ICIP, 2010.
84. LI Li, Weiming Hu, Bing Li, Chunfeng Yuan, Pengfei Zhu, Wanqing Li, Event Recognition based on Top-Down Motion Attention, Proc Intl Conference on Pattern Recognition (ICPR), 2010.
85. Zhimin Zong, Duc Thanh Nguyen, Philip Ogunbona, Wanqing Li, On the Combination of Local Texture and Global Structure for Food Classification, IEEE Intl Symposium on Multimedia, 2010
86. Ce Zhan, Wanqing Li, and Philip Ogunbona, Head Pose Estimation Based on Extended Non-negative Matrix Factorization, Proc Image and Vision Computing New Zealand (IVCNZ) 2010.
87. Wanqing Li, Zhengyou Zhang, Zicheng Liu, Action recognition based on a bag of 3D points, Proc. Computer Vision and Pattern Recognition (CVPR) Workshop, 2010, pp.9-14.
88. Duc Thanh Nguyen,
89. Ce Zhan, Wanqing Li and Philip Ogunbona, Face Recognition from Single Sample based on Human perception, Proc Image and Vision Computing New Zealand (IVCNZ) 2009
90. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, A Part-based Template Matching for Multi-view Human Detection, Proc Image and Vision Computing New Zealand (IVCNZ) 2009
91. Peng Chen, Wanqing Li and Philip Ogunbona, Kernel PCA of HOG features for Posture Detection, Proc Image and Vision Computing New Zealand (IVCNZ) 2009
92. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, A Novel Template Matching Method For Human Detection, IEEE ICIP 2009
93. Alister Cordiner, Philip Ogunbona and Wanqing Li, Face Detection Using Generalised Integral Image Features, IEEE ICIP 2009
94. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, An Improved Template Matching Method for Object Detection, The Ninth Asian Conference on Computer Vision (ACCV) 2009
95. Li Li, Weiming Hu, Wanqing Li, Xiaoqing Zhang, Ying Chen, A New Shot Detection Method Based On Optical Flow, IEEE Pacific Rim Conference on Multimedia 2009
96. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, Human Detection Based On Weighted Template Matching, IEEE ICME 2009
97. Peng Chen, Wanqing Li and Philip Ogunbona, Greedy approximation of kernel PCA by minimizing the mapping error, Digital Image Computing: Techniques and Applications (DICTA) 2009
98. Li Li, Weiming Hu, Ying Chen,
99. Wanqing Li, Zhengyou Zhang and Zicheng Liu, Graphical Modeling and Decoding of Human Actions, IEEE MMSP 2008, pp. 175-180.
100. Xianglin Zeng, Weiming Hu, Wanqing Li, Xiaoqin Zhang and Bo Xu, Key-frame Extraction Using Dominant-Set Clustering, IEEE ICME 2008, pp.1285-1288.
101. Alister Cordiner, Philip Ogunbona and Wanqing Li, Illumination Invariant Face Detection Using Classifier Fusion, LNCS 5353, Springer-Verlag, 2008, pp.456-465.
102. Yuan Zhong, Lei Ye, Wanqing Li and Philip Ogunbona, Perceived Similarity and Visual Descriptions in Content-Based Image Retrieval, Proc. IEEE ISM2007, pp.173-180.
103. Ce Zhan, Wanqing Li, Philip Ogunbona, and Farzad Safaei, Emotional States Control for On-line Game Avatars, Proceedings of the 6th ACM SIGCOMM workshop on Network and system support for games, NetGames 2007, pp.31-35.
104. Wenming Lu, Wanqing Li, Rei Safavi-Naini, Philip Ogunbona, A Maximum Likelihood Watermark Decoding Scheme, ICME 2007, pp.1247-1250.
105. Ce Zhan, Wanqing Li, Philip Ogunbona, and Farzad Safaei, Real-Time Facial Feature Point Extraction, LNCS 4810, Springer-Verlag, 2007, pp.88-97.
106. Gang Zheng, Wanqing Li, Philip Ogunbona, Liju Dong, and Igor Kharitonenko, Human Motion Simulation and Action Corpus, LNCS 4561, Springer-Verlag, 2007, pp.314-322.
107. Ce Zhan, Wanqing Li, Philip Ogunbona, and Farzad Safaei, Face to Face Communications in Multiplayer Online Games: A Real-Time System, LNCS 4553, Springer-Verlag, 2007, pp.401-410.
108. Ce Zhan, Wanqing Li, Philip Ogunbona, and Farzad Safaei, Facial expression recognition for multiplayer online game, Joint International Conference on CyberGames and Interactive Entertainment 2006 (CGIE2006), 4-6 December 2006 in Perth, Western Australia, IE'06, pp.52-58
109. Wanqing Li, Igor Kharitonenko, Serge Lichman, Chaminda Weerasinghe, A Prototype of Autonomous Intelligent Surveillance Cameras, IEEE AVSS 2006, 22-24 November 2006, Sydney, Australia, pp.101-106.
110. Ying Chen, Weiming Hu, Xianglin Zeng and Wanqing Li, Indexing and Matching of Video Shot Based on Motion Analysis, ICARCV06.
111. Gavin Zheng, Wanqing Li and Ce Zhan, Cryptographic Key Generation from Biometric Data Using Lattice Mapping, ICPR 2006, vol.4, pp.513-516.
112. Liju Dong, Philip Ogunbona, Wanqing Li, Ge Yu, and Linan Fan, A fast algorithm for color image segmentation, ICIC 2006.
113. Wenming Lu, Wanqing Li, Rei Safavi-Naini and Philip Ogunbona, A pixel-based robust image watermarking system, ICME 2006. pp.1565-1568.
117. L.Ye, L. Cao, P. Ogunbona and W. Li, Description of evolution changes in image time sequences using visual descriptors, LNCS vol. 3893, Springer-Verlag, 2006, pp.189-197.
118. W. Lu, W. Li, R. Safavi-Naini, and P. Ogunbona, A new QIM-based image watermarking method and system, 2005 Asia-Pacific Workshop on Visual Information Processing, Hong Kong, December 2005, pp. 160–164.
119. Wanqing Li, Mingren Shi, Philip Ogunbona, A New Divide and Conquer Algorithm for Image and Video Segmentation, IEEE MMSP 2005, pp.585-588.
120. W. Li, C. deSilver and Y. Attikiouzel, Simultaneous MAP estimation of inhomogeneity and segmentation of brain tissues from MR images, IEEE ICIP 2005, Genova, Italy. vol.2, pp.1234– 1237.
121. I. Kharitonenko, W. Li, and C. Weerasinghe, Novel Architecture for Surveillance Cameras with Complementary Metal Oxide Semiconductor Image Sensors, IEEE ICCE 2005, p. 6.4-9.
122. W. Li, P. Ogunbona, L. Ye and I. Kharitonenko, Visual Process Model and Object Segmentation, The 7th International Conference on Signal Processing, Beijing, Sept. 2004, pp.753-756
123. W. Li, C. deSilver and Y. Attikiouzel, A Semi-Supervised Segmentation of Brain Tissues, The 7th International Conference on Signal Processing, Beijing, Sept. 2004, pp.757-760
124. W. Lu, R. Safavi-Naini, T. Uehara and W. Li, A Scalable and Oblivious Digital Watermarking for Images, The 7th International Conference on Signal Processing, Beijing, Sept. 2004, pp.2338-2341.
125. C. Weerasinghe, W. Li and P. Ogunbona, Stereoscopic panoramic video generation using centro-circular projection technique, ICASSP’03, vol.III, pp.473-476.
126. J. Randall, L. Guan, X. Zhang and W. Li, Hierarchical cluster model for perceptual image processing, ICASSP’02, Orlando, Florida, May 13 - 17, 2002, vol. 1, pp.1041-1044.
127. W. Li, P. Ogunbona, Y. Shi, and I. Kharitonenko, Modelling of color cross-talk in CMOS image sensors, ICASSP’02, Orlando, Florida, May 13 - 17, 2002, vol.IV, pp.3576-3579.
128. J. Randall, L. Guan, W. Li and X. Zhang, The hierarchical cluster model for image region segmentation, IEEE ICME, August 2002, Proceedings. Vol.1, pp.693-696.
129. J. Randall, L. Guan, X. Zhang and W. Li, The self-organising tree map for color image segmentation, International Symposium on Intelligent Signal Processing and Communications, November 2001
130. C. Weerasinghe, P. Ogunbona and W. Li, 2D to pseudo-3D conversion of “head and shoulder” images using feature based parametric display maps, ICIP 2001, vol.iii, pp.963-966.
131. W. Li, P. Ogunbona, and C. Weerasinghe, Stereoscopic video coding: an overview, APRS/IEEE Workshop on Stereo Image and Video Processing, December 2000, Sydney, Australia
132. C. Weerasinghe, P. Ogunbona and W. Li, Depth creation: a review of current technologies for monoscopic to pseudo stereoscopic conversion of video sequences, APRS/IEEE Workshop on Stereo Image and Video Processing, December 2000, Sydney, Australia
133. J. Randall, L. Guan, X. Zhang and W. Li, Investigation of the self organising tree map, Proceedings ICONIP’99, vol.2, 1999, pp.724-828.
134. W. Li, J. Bezdek, Y. Attikiouzel, Estimating the number of components in a normal mixture, Proceedings of International conference on Information, Statistics and Induction in Science, Melbourne, Australia, August, 1996.
135. W. Li, M. Morrison, Y. Attikiouzel, Unsupervised Segmentation of Dual-echo MR Images by a Sequentially Learned Gaussian Mixture Model, IEEE ICIP 1995, Washington, D.C., USA, pp.576-579.
136. W. Li and Y. Attikiouzel, Unsupervised Segmentation of Dual-echo MR Images With an ART-Based Neural Network, ICNN’95, Perth, Australia, pp.2600-2604.
137. W. Li and Y. Attikiouzel, Initialization of Clustering Algorithms for Unsupervised Segmentation of Multi-echo MR Images, ANZIIS’95, Perth, Australia, pp.88-92.
138. W. Li, H. Xie and Y. Attikiouzel, An Efficient Method of Volume Rendering for Medical Slices, IEEE ICIP 1994, Austin, Texas, USA, Nov. 1994, pp.652-656.
IEEE Technical Committees
· Secretary (executive member) of the Multimedia Technical Committee (MMTC), IEEE Communications Society, 2016-2018
· Associate Editor, Journal of Visual Communication and Image Representation, 2016 - Present
· Guest Editor, International Journal of Computer Vision, Special issue on Human Activity Understanding from 2D and 3D Data, 2015
· Guest Editor, Journal of Visual Communication and Image Representation, Special issue on Visual Understanding and Applications with RGB-D Cameras, June 2013
· Guest Editor, International Journal of Information and Systems Science, Special issue on Visual Information Processing for Large Scale Pattern Recognition, Vol.3, No.3, 2007
· Co-chair, IEEE International Workshop on Human Activity Understanding from 3D Data (HAU3D), 2011, 2012, 2013 in conjunction with CVPR
· Co-Chair, The Australian Summit on 3D Multimedia (AU3DMM), 2011
· General Co-chair, ACM Symposium on InformAtion, Computer and Communications Security 2009 (AsiaCCS’09)
· Publication Chair, IEEE International Workshop on Multimedia Signal Processing 2008 (MMSP'08)
· Co-chair of the special session on "Visual Information Processing for Large Scale Pattern Recognition", International Conference on Control, Automation, Robotics and Vision 2006 (ICARCV'06)
· General Co-chair, First International Conference on Digital Rights Management: Technology, Issues, Challenges and Systems 2005 (DRMTICS’05).
· Australian Research Council (ARC)
· International Journal of Computer Vision, IEEE Transactions on Pattern Analysis and Machine Intelligence; IEEE Transactions on Neural Networks; IEEE Transactions on Circuits and Systems on Video Technology; IEEE Transactions on Multimedia; Computer Vision and Image Understanding, IEEE Signal Processing Letters, IEEE Transactions on Electronic Devices; Journal of Computer Science and Technology, Image and Vision Computing.
This part is provided by Dr Zicheng Liu, Microsoft Research Redmond, USA, and used to be hosted under Dr Zicheng Liu's home page.
The dataset was captured by a Kinect device. There are 12 dynamic
American Sign Language (ASL) gestures, and 10 people. Each person
performs each gesture 2-3 times. There are 336 files in total, each
corresponding to a depth sequence. The hand portion (above the wrist)
has been segmented. The file name has the format sub_depth_m_n where m
is the person index. n ranges from 1 to 36. Note that for some (m,n),
the file sub_depth_m_n does not exist. For example, there is no
"sub_depth_02_03". The reason is that some of the bad sequences are
excluded from the dataset. The mapping from n to gesture type is the
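Each file is a MAT file which can be loaded with 64-bit MATLAB. Below is sample MATLAB code to load a file:
% Example file name only; substitute any existing sub_depth_m_n file.
x = load('sub_depth_01_01.mat');
width = size(x.depth_part,1);
height = size(x.depth_part,2);
nFrames = size(x.depth_part,3);
% Depth value at position (i,j) of frame k; any valid indices may be used.
i = 1; j = 1; k = 1;
depthval = x.depth_part(i,j,k);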
The following two papers reported experiment results on this dataset:
· Alexey Kurakin, Zhengyou Zhang, Zicheng Liu, A Real-Time System for Dynamic Hand Gesture Recognition with a Depth Sensor, EUSIPCO, 2012.
· Jiang Wang, Zicheng Liu, Jan Chorowski, Zhuoyuan Chen, Ying Wu, Robust 3D Action Recognition with Random Occupancy Patterns, ECCV, 2012.
The dataset was captured by using a Kinect device. There are 16
activities: drink, eat, read book, call cellphone, write on a paper, use
laptop, use vacuum cleaner, cheer up, sit still, toss paper, play game,
lie down on sofa, walk, play guitar, stand up, sit down. There are 10
subjects. Each subject performs each activity twice, once in standing
position, and once in sitting position. There is a sofa in the scene.
Three channels are recorded: depth maps (.bin), skeleton joint positions
(.txt), and RGB video (.avi). There are 16*10*2=320 files for each
channel. In total, there are 320*3=960 files. Note that the RGB channel
and depth channel are recorded independently, so they are not strictly
The format of the skeleton file is as follows. The first integer is the number of frames. The second integer is the number of joints which is always 20. For each frame, the first integer is the number of rows. This integer is 40 when there is exactly one skeleton being detected in this frame. It is zero when no skeleton is detected. It is 80 when two skeletons are detected (in that case which is rare, we simply use the first skeleton in our experiments). For most of the frames, the number of rows is 40. Each joint corresponds to two rows. The first row is its real world coordinates (x,y,z) and the second row is its screen coordinates plus depth (u, v, depth) where u and v are normalized to be within [0,1]. For each row, the integer at the end is supposed to be the confidence value, but it is not useful.
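The layout described above can be read with a short parser. The following Python sketch is illustrative only and is not part of the dataset's tooling. It assumes that the file is whitespace-separated text, that the header and row counts are written as plain integers, and that every row holds four values (three coordinates followed by the confidence value). The function name `parse_skeleton_text` and the parameter `values_per_row` are hypothetical names chosen for this example.

```python
from typing import List, Tuple

Triple = Tuple[float, float, float]
# One joint: (real-world x, y, z) and (screen u, v, depth).
Joint = Tuple[Triple, Triple]


def parse_skeleton_text(text: str, values_per_row: int = 4) -> List[List[Joint]]:
    """Parse skeleton text into one list of joints per frame.

    Assumption: tokens are whitespace-separated and each row carries
    `values_per_row` numbers, the last being the (unused) confidence.
    """
    tokens = text.split()
    n_frames = int(tokens[0])   # first integer: number of frames
    n_joints = int(tokens[1])   # second integer: number of joints (20)
    pos = 2
    frames: List[List[Joint]] = []
    for _ in range(n_frames):
        n_rows = int(tokens[pos])  # 0, 40 or 80 according to the description
        pos += 1
        rows: List[Triple] = []
        for _ in range(n_rows):
            vals = [float(t) for t in tokens[pos:pos + values_per_row]]
            pos += values_per_row
            rows.append((vals[0], vals[1], vals[2]))  # drop the confidence
        # Keep only the first skeleton when two are present.
        first = rows[:2 * n_joints]
        joints = [(first[2 * j], first[2 * j + 1]) for j in range(len(first) // 2)]
        frames.append(joints)
    return frames
```

A frame in which no skeleton was detected comes back as an empty list, so callers can skip it or interpolate as they see fit.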
An activity recognition experiment with this dataset is reported in the following paper:
Mining Actionlet Ensemble for Action Recognition with Depth Cameras, Jiang Wang, Zicheng Liu, Ying Wu, Junsong Yuan, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), Providence, Rhode Island, June 16-21, 2012.
20 action types, 10 subjects, each subject performs each action 2 or
3 times. There are 567 depth map sequences in total. The resolution is
320x240. The data was recorded with a depth sensor similar to the Kinect
device. The dataset is described in the following paper. Click here for
a description of the subject splits used in various papers.
Action Recognition Based on A Bag of 3D Points, Wanqing Li, Zhengyou Zhang, Zicheng Liu, IEEE International Workshop on CVPR for Human Communicative Behavior Analysis (in conjunction with CVPR2010), San Francisco, CA, June, 2010.
Better classification results are reported in the following paper:
Mining Actionlet Ensemble for Action Recognition with Depth Cameras, Jiang Wang, Zicheng Liu, Ying Wu, Junsong Yuan, IEEE Conference on Computer Vision and Pattern Recognition (CVPR2012), Providence, Rhode Island, June 16-21, 2012. Note that there is an error in the paper on the number of samples being used for the experiment. The number 402 in the paper is not correct. The correct number is 557. Out of the original 567 sequences in MSR Action3D Dataset, 10 sequences are not used in this paper's experiment because the skeletons are either missing or too erroneous. Here is a list of the file names that are used in the experiment: list of file names.
Sample code to load MSR Action3D Dataset (drawskt.zip)
Skeleton Data in screen coordinates (MSRAction3DSkeleton (20joints).rar) (Thanks to Yi Wen Wan, University of North Texas, for data cleaning and conversion). There is a skeleton sequence file for each depth sequence in the Action3D dataset. A skeleton has 20 joint positions (see the image for illustrations of the joint positions). Four real numbers are stored for each joint: u, v, d, c where (u,v) are screen coordinates, d is the depth value, and c is the confidence score. If a depth sequence has n frames, then the number of real numbers stored in the corresponding skeleton file is equal to: n*20*4. Click here for MATLAB code to visualize the skeleton motions (The code is provided by Antonio Vieira from Federal University of Minas Gerais).
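As a rough illustration of the n*20*4 layout described above, the Python sketch below groups the numbers of one skeleton file into frames of 20 joints, each joint being (u, v, d, c). It assumes the file is plain text with whitespace-separated real numbers stored frame by frame and joint by joint, which the description implies but does not state outright. The function name `parse_action3d_skeleton` is a hypothetical name for this example.

```python
from typing import List, Tuple

JointUVDC = Tuple[float, float, float, float]  # (u, v, d, c)


def parse_action3d_skeleton(text: str, n_joints: int = 20,
                            n_fields: int = 4) -> List[List[JointUVDC]]:
    """Split a flat list of numbers into frames of `n_joints` joints.

    Assumption: values are ordered frame by frame, joint by joint,
    with the four fields u, v, d, c stored consecutively.
    """
    values = [float(t) for t in text.split()]
    per_frame = n_joints * n_fields  # 20 * 4 = 80 numbers per frame
    if len(values) % per_frame != 0:
        raise ValueError("number of values is not a multiple of n_joints * n_fields")
    frames: List[List[JointUVDC]] = []
    for f in range(len(values) // per_frame):
        base = f * per_frame
        frame = []
        for j in range(n_joints):
            start = base + j * n_fields
            u, v, d, c = values[start:start + n_fields]
            frame.append((u, v, d, c))
        frames.append(frame)
    return frames
```

The number of frames n is recovered as the total count of values divided by 80, so a file whose length is not a multiple of 80 is rejected rather than silently truncated.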
This diagram shows the correspondence between the 20 points in the skeleton data and the joints (Thanks to Yu Zhong from AIT, BAE Systems for providing this diagram).
Skeleton Data in real world coordinates (MSRAction3DSkeletonREal3D.rar) (Thanks to Ferda Ofli, UC Berkeley, for processing the data).
Human activity understanding from RGB-D data has attracted increasing attention since the first work reported in 2010. Over this period, many benchmark datasets have been created to facilitate the development and evaluation of new algorithms. However, the existing datasets are mostly captured in laboratory environment with small number of actions and small variations, which impede the development of higher level algorithms for real world applications. Thus, this paper proposes a large scale dataset along with a set of evaluation protocols. The large dataset is created by combining nine existing publicly available datasets and can be expanded easily by adding more datasets. The large dataset has 94 actions and is suitable for testing algorithms from different perspectives using the proposed evaluation protocols. Four state-of-the-art algorithms are evaluated on the large combined dataset and the results have verified the limitations of current algorithms and the effectiveness of the large dataset.
Readers are referred to the following paper for details. If you are to
use the combined dataset, please cite the following paper as well as all
the original papers of the individual datasets.
Zhang, Jing and Li, Wanqing and Wang, Pichao and Ogunbona, Philip and Liu, Song and Tang, Chang, A Large Scale RGB-D Dataset for Action Recognition, International Workshop on Understanding Human Activities through 3D Sensors (UHA3DS) 2016 in conjunction with 23rd International Conference on Pattern Recognition (ICPR2016).
Last updated: 02 April 2017