Dr Wanqing Li B.Sc (ZJU), M.Sc (ZJU), PhD (UWA), SMIEEE
Director, Advanced Multimedia Research Lab (AMRL)
Office: Building 3.101
Tel: +61 2 4221 5410 or 4661, Fax: +61 2 4221 4170 or +61 2 4227 3277
· 06-01-18, Appointment as an Associate Editor for IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) from Jan 1, 2018 to Dec 31 2019
· 19-02-18, paper on “Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN” is accepted by CVPR’18
· 04-01-18, paper on “Robust Unsupervised Feature Selection via Dual Self-representation and Manifold Regularization” is accepted by Knowledge-Based Systems.
Wanqing Li received his PhD in electronic engineering from The University of Western Australia. He was a Principal Researcher at Motorola Lab from 1998 to 2003 and a visiting researcher at Microsoft Research, Redmond in 2008, 2010 and 2013. He is currently an Associate Professor and Co-Director of the Advanced Multimedia Research Lab (AMRL) of the University of Wollongong, Australia. His research areas include 3D computer vision, 3D multimedia signal processing and medical image analysis.
Dr. Li is a Senior Member of IEEE and currently the Secretary of Multimedia Technical Committee (MMTC) of IEEE Communication Society (2017-2018). He was a co-chair of the 3D Rendering, Processing and Communications Interest Group of the MMTC (2014-2016).
He is the guest editor of the special issue on Human activity understanding from 2D and 3D data (2015), International Journal of Computer Vision, and the special issue on Visual Understanding and Applications with RGB-D Cameras (2013), Journal of Visual Communication and Image Representation (JVCI). He is currently an Associate Editor for the IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) (2018- ) and an Associate Editor for JVCI (2016- ).
Dr Li served as a Co-organizer of the IEEE International workshop on Human Activity Understanding from 3D Data (HAU3D) (2011-2013) and Hot Topics in 3D multimedia (Hot3D) (2014), an Area Chair of International Conference on Multimedia & Expo (ICME) 2014 and 2018, a publication chair of IEEE Workshop on Multimedia Signal Processing (MMSP) 2008, General Co-Chair of ASIACCS'09 and DRMTICS'05, and a technical committee member of numerous international conferences and workshops including CVPR, ICME, ICIP, MMSP and 3DTV-Con.
· Machine Learning and 3D Computer Vision - Human activity understanding, human detection, gait recognition, 3D sensing and reconstruction from RGB-D data
· 3D Multimedia Signal Processing and Understanding – scene analysis and event detection
· Free Viewpoint Video (FVV) – Acquisition, processing, understanding and compression
· Medical Image Processing and Understanding – Image reconstruction for low-dose X-ray and fast MR imaging, segmentation of medical images
Refereed Journal Articles (Selected)
9. C. Tang, W. Li, P. Wang, L. Wang, Online Human Action Recognition Based on Incremental Learning of Weighted Covariance Descriptors, Information Sciences (accepted on 2 August 2018)
10. P. Wang, W. Li, P. Ogunbona, J. Wan and S. Escalera, RGB-D-based Human Motion Recognition with Deep Learning: A Survey, Computer Vision and Image Understanding, (accepted on 25 April 2018)
11. S. Li, D. Florencio, W. Li, Y. Zhao and C. Cook, A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain, IEEE Transactions on Image Processing, 27(8), pp.3918-3930, 2018
12. P. Wang, W. Li, Z. Gao, C. Tang and P. Ogunbona, Depth Pooling Based Large-scale 3D Action Recognition with Convolutional Neural Networks, IEEE Transactions on Multimedia, 20(5), pp.1051-1061, 2018
25. C. Tang, J. Wu, Y. Hou, P. Wang and W. Li, A Spectral and Spatial Approach of Coarse-to-Fine Blurred Image Region Detection, IEEE Signal Processing Letters, 23(11), pp.1652-1656, 2016
30. P. Wang, W. Li, Z. Gao, J. Zhang, C. Tang and P. Ogunbona, Action Recognition from Depth Maps Using Deep Convolutional Neural Networks, IEEE Trans. Human-Machine Systems, 46(4), pp. 498-509, 2016
31. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, Human Detection from Images and Videos: A Survey, Pattern Recognition, 51, 2016, 148-175
32. Yasmine Probst, Duc Thanh Nguyen, Minh Khoi Tran and Wanqing Li, Dietary Assessment on a Mobile Phone Using Image Processing and Pattern Recognition Techniques: Algorithm Design and System Prototyping, Nutrients, 7(8), 2015, 6128-38
33. J. Zhang, L. Wang, L. Zhou, and W. Li, Learning Discriminative Stein Kernel for SPD Matrices and Its Applications, IEEE Trans Neural Networks and Learning Systems, (to appear, online first on 17 June 2015)
34. J. Zhang, L. Zhou, L. Wang and W. Li, Functional Brain Network Classification With Compact Representation of SICE Matrices, IEEE Trans Biomedical Engineering, 62(6), 2015, pp.1623-1634,
35. H. Shidanshidi, F. Safaei and W. Li, Estimation of Signal Distortion Using Effective Sampling Density for Light Field-based Free Viewpoint Video, IEEE Trans Multimedia, 17(10), 2015, pp. 1677-1693
36. D. T. Nguyen, Z. Zong, P. Ogunbona, Y. Probst, W. Li, Food image classification using local appearance and global structural information, Neurocomputing, 140, 2014, pp.242-251.
37. H. Tian, W. Li, L. Wang and P. Ogunbona, Smoke Detection in Video: An Image Separation Approach, International Journal of Computer Vision, 106, 2013, pp.192-209.
38. Duc Thanh Nguyen, Wanqing Li, Philip O. Ogunbona, Inter-Occlusion Reasoning for Human Detection Based on Variational Mean Field, Neurocomputing, 110, 2013, pp.56-61.
39. Thanh Duc Nguyen, P. Ogunbona and W. Li, A Novel Shape-Based Non-Redundant Local Binary Pattern Descriptor for Object Detection, Pattern Recognition, 46(5), 2013, pp.1485-1500.
40. C. Zhan, W. Li and P. Ogunbona, Measuring the Degree of Face Familiarity Based on Extended NMF, ACM Transactions on Applied Perception, 10(2), 2013, pp.8:1-8:21.
41. Jianhua Luo, Shanshan Wang, Wanqing Li and Yuemin Zhu, Removal of Truncation Artefacts in Magnetic Resonance by Recovering Missing Spectral Data, Journal of Magnetic Resonance, 224, 2012, pp.82-93.
42. C. Zhan, W. Li and P. Ogunbona, Local representation of faces through extended NMF, Electronics Letters, 48(7), 2012, pp.373-375.
43. J. Luo, Y. Zhu, W. Li, P. Croisille and I. E. Magnin, MRI Reconstruction From 2D Truncated k-Space, Journal of Magnetic Resonance Imaging, 35(5), 2012, pp.1196-206
44. J. Luo, J. Liu, W. Li and Y. Zhu, Image Reconstruction from Sparse Projections Using S-Transform, Journal of Mathematical Imaging and Vision, 43, 2012, pp.227-239.
45. Wanqing Li, Philip Ogunbona, Chris deSilver and Yannia Attikiouzel, Semi-Supervised MAP Segmentation of Brain Tissues from Dual Echo MR Scans Using Incomplete Training Data, IET Image Processing, 5(3), pp.222-232, April 2011.
46. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, A Local Intensity Distribution Descriptor for Object Detection, Electronics Letters, 47(5), 2011, p. 322-324.
47. Jianhua Luo,
49. L. Dong, G. Yu, P. Ogunbona and W. Li, An Efficient Iterative Algorithm for Image Thresholding, Pattern Recognition Letters, 29, 2008, pp.1311-1316.
50. J. Randall, L. Guan, W. Li and X. Zhang, The HCM for Perceptual Image Segmentation, NeuroComputing, 71(10-12), 2008, pp.1966-1979.
53. I. Kharitonenko, W. Li, C. Weerasinghe, X. Zhang, A Prototype of Intelligent Video Surveillance Cameras, International Journal of Information and Systems Science, 3(3), Sept. 2007. pp.222-230.
54. J. Randall, L. Guan, and W. Li, A Hierarchical Neural Network Model for Image Analysis, International Journal of Fuzzy Systems, Vol.6, No.3, September 2004, pp.136-146.
55. W. Li, P. Ogunbona, Y. Shi, and I. Kharitonenko, CMOS sensor cross-talk compensation for digital cameras, IEEE Trans Consumer Electronics, Volume: 48 Issue: 2 , May 2002, pp.292-297.
56. J. C. Bezdek, W. Li, Y. Attikiouzel, M. Windham, A geometric approach to cluster validity for normal mixture, Soft Computing, 1, 1997, pp.166-179.
Refereed International Conference Papers (Selected)
57. P. Wang, W. Li, J. Wan, P. Ogunbona, X. Liu, Cooperative Training of Deep Aggregation Networks for RGB-D Action Recognition, AAAI 2018
58. Z. Ding, W. Li, P. Wang, P. Ogunbona and L. Qin, Weakly structured information aggregation for upper-body posture assessment using convnets, ICME 2017 (oral)
59. C. Li, P. Wang, S. Wang, Y. Hou, W. Li, Skeleton-based action recognition using LSTM and CNN, ICMEW 2017
56. P. Wang, S. Wang*, Z. Gao, Y. Hou, and W. Li, Structured Images for RGB-D Action Recognition, ICCVW 2017
57. H. Wang, P. Wang, Z. Song, and W. Li, Large-scale Multimodal Gesture Recognition Using Heterogeneous Networks, ICCVW2017
58. H. Wang, Pichao Wang, Z. Song, and Wanqing Li, Large-scale Multimodal Gesture Segmentation and Recognition based on Convolutional Neural Network, ICCVW2017
59. J. Zhang, W. Li and P. Ogunbona, Joint Geometrical and Statistical Alignment for Visual Domain Adaptation, IEEE CVPR 2017
60. P. Wang, W. Li and P. Ogunbona, Scene flow to action map: A new representation for RGB-D based action recognition with convolutional neural networks, IEEE CVPR 2017
61. Z. Ding, W. Li, P. Wang, P. Ogunbona and L. Qin, Weakly structured information aggregation for upper-body posture assessment using CONVNETS, IEEE ICME 2017
62. P. Wang, W. Li, S. Liu, Z. Gao, C. Tang and P. Ogunbona, Large-scale Isolated Gesture Recognition Using Convolutional Neural Networks, ICPR ChaLearn Contest of Isolated Gesture Recognition 2016 (2nd Place)
63. P. Wang, W. Li, S. Liu, Y. Zhang, Z. Gao, P. Ogunbona, Large-scale Continuous Gesture Recognition Using Convolutional Neural Networks, ICPR ChaLearn Contest of Continuous Gesture Recognition 2016 (3rd Place)
68. L. Wang, J. Zhang, L. Zhou, C. Tang, and W. Li, Beyond Covariance: Feature Representation with Nonlinear Kernel Matrices, International Conference on Computer Vision (ICCV), 2015
69. Pichao Wang, Wanqing Li, Zhimin Gao, Chang Tang, Jing Zhang and Philip Ogunbona, ConvNets-Based Action Recognition from Depth Maps through Virtual Cameras and Pseudocoloring, ACM Multimedia 2015 (accepted)
70. Song Liu, Wanqing Li, Philip Ogunbona and Yang-Wai Chow, Creating Simplified 3D Models with High Quality Textures, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2015 (oral, APRS best paper award)
71. H. Shidanshidi, F. Safaei, and W. Li, Optimization Of The Number Of Rays In Interpolation For Light Field Based Free Viewpoint Systems, IEEE ICME 2015
72. Pichao Wang, Wanqing Li, Philip Ogunbona, Zhimin Gao and Hanling Zhang, Mining Mid-level Features for Action Recognition Based on Effective Skeleton Representation, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2014
73. Lijuan Zhou, Wanqing Li, Yuyao Zhang, Philip Ogunbona, Duc Thanh Nguyen and Hanling Zhang, Discriminative Key Pose Extraction using Extended LC-KSVD for Action Recognition, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2014.
74. H. Tian, W. Li, P. Ogunbona and L. Wang, Single Image Smoke Detection, Asian Conference on Computer Vision (ACCV) 2014.
75. J. Zhang, L. Zhou, L. Wang, and W. Li, Exploring Compact Representation of SICE Matrices for Functional Brain Network Classification, MICCAI Workshop on Machine Learning in Medical Imaging (MLMI), Boston, USA, 2014
76. Jianjia Zhang, Lei Wang, Lingqiao Liu, Luping Zhou and Wanqing Li, Accelerating the Divisive Information-Theoretic Clustering of Visual Words, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2013
77. Yuyao Zhang, Philip O. Ogunbona, Wanqing Li, Bridget Munro and Gordon G. Wallace, Pathological Gait Detection of Parkinson’s Disease using Sparse Representation, International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2013
78. F. Safaei, P. Mokhtarian, H. Shidanshidi, W. Li, M. Namazi-Rad, A. Mousavinia, Scene-adaptive Configuration of Two Cameras using the Correspondence Field Function, IEEE ICME 2013, San Jose, USA, 15-19 July 2013 (oral, nomination for the best paper award).
79. H. Shidanshidi, F. Safaei, and W. Li, A method for calculating the minimum number of cameras in a light field based free viewpoint video system, IEEE ICME 2013, San Jose, USA, 15-19 July 2013 (oral).
80. H. Shidanshidi, F. Safaei, and W. Li, Non-uniform sampling of plenoptic signal based on the scene complexity variations for a free viewpoint video system, IEEE ICIP 2013, Melbourne, Australia, 15-18 September 2013.
81. Lei Wang, Jianjia Zhang, Luping Zhou, Wanqing Li, A Fast Approximate AIB Algorithm for Distributional Word Clustering, IEEE CVPR Portland, Oregon 2013.
82. Elahe Farahzadeh, Cham Tat-jen and Wanqing Li, Incorporating Local and Global Information using a Novel Distance Function for Scene Recognition, IEEE Workshop On Robot Vision (WoRV) 2013.
83. Hongda Tian, Wanqing Li, Lei Wang, Philip Ogunbona, A Novel Video-Based Smoke Detection Method Using Image Separation, IEEE ICME 2012.
84. Ce Zhan, Wanqing Li, and Philip Ogunbona, Measuring Face Familiarity and Its Application to Face Recognition, IEEE Workshop on the Applications of Computer Vision (WACV) 2012
85. Qishen Wang, Ou Wu, Weiming Hu, Jinfeng Yang and Wanqing Li, Ranking Social Emotions by Learning Listwise Preference, Asian Conference on Pattern Recognition (ACPR), 2011
86. Ce Zhan, Wanqing Li, and Philip Ogunbona, Face Representation Based on Extended Non-negative Matrix Factorization, International Conference Image and Vision Computing New Zealand 2011
87. Ce Zhan, Wanqing Li, and Philip Ogunbona, Age Estimation Based on Extended Non-negative Matrix Factorization, IEEE Workshop on Multimedia Signal Processing 2011
88. Hongda Tian, Wanqing Li, Philip Ogunbona, Duc Thanh Nguyen, Ce Zhan, Smoke Detection in Videos Using Non-Redundant Local Binary Pattern-Based Features, IEEE Workshop on Multimedia Signal Processing 2011
89. Hooman Shidanshidi, Farzad Safaei, Wanqing Li, Objective Evaluation Of Light Field Rendering Methods Using Effective Sampling Density, IEEE Workshop on Multimedia Signal Processing 2011
90. Ramakrishna Kakarala, Prabhu Kaliamoorthi, Wanqing Li, Viewpoint invariants from three-dimensional data: the role of reflection in human activity understanding, Proc. Computer Vision and Pattern Recognition (CVPR) Workshop on Human Activity Understanding from 3D Data, 2011
91. Duc Thanh Nguyen, Philip Ogunbona, and Wanqing Li, Detecting Humans Under Occlusion Using Variational Mean Field Method, IEEE ICIP 2011
92. Duc Thanh Nguyen, Philip Ogunbona, and Wanqing Li, Human Detection With Contour-Based Local Motion Binary Patterns, IEEE ICIP 2011
93. Hooman Shidanshidi, Farzad Safaei, Wanqing Li, A Quantitative Approach For Comparison And Evaluation Of Light Field Rendering Techniques, IEEE ICME 2011
94. Ce Zhan,
96. Duc Thanh Nguyen, Zhimin Zong, Philip Ogunbona, Wanqing Li, Object Detection Using Non-Redundant Local Binary Patterns, IEEE ICIP, 2010.
97. LI Li, Weiming Hu, Bing Li, Chunfeng Yuan, Pengfei Zhu, Wanqing Li, Event Recognition based on Top-Down Motion Attention, Proc Intl Conference on Pattern Recognition (ICPR), 2010.
98. Zhimin Zong, Duc Thanh Nguyen, Philip Ogunbona, Wanqing Li, On the Combination of Local Texture and Global Structure for Food Classification, IEEE Intl Symposium on Multimedia, 2010
99. Ce Zhan, Wanqing Li, and Philip Ogunbona, Head Pose Estimation Based on Extended Non-negative Matrix Factorization, Proc Image and Vision Computing New Zealand (IVCNZ) 2010.
100. Wanqing Li, Zhengyou Zhang, Zicheng Liu, Action recognition based on a bag of 3D points, Proc. Computer Vision and Pattern Recognition (CVPR) Workshop, 2010, pp.9-14.
Duc Thanh Nguyen,
102. Ce Zhan, Wanqing Li and Philip Ogunbona, Face Recognition from Single Sample based on Human perception, Proc Image and Vision Computing New Zealand (IVCNZ) 2009
103. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, A Part-based Template Matching for Multi-view Human Detection, Proc Image and Vision Computing New Zealand (IVCNZ) 2009
104. Peng Chen, Wanqing Li and Philip Ogunbona, Kernel PCA of HOG features for Posture Detection, Proc Image and Vision Computing New Zealand (IVCNZ) 2009
105. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, A Novel Template Matching Method For Human Detection, IEEE ICIP 2009
106. Alister Cordiner, Philip Ogunbona and Wanqing Li, Face Detection Using Generalised Integral Image Features, IEEE ICIP 2009
107. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, An Improved Template Matching Method for Object Detection, The Ninth Asian Conference on Computer Vision (ACCV) 2009
108. Li Li, Weiming Hu, Wanqing Li, Xiaoqing Zhang, Ying Chen, A New Shot Detection Method Based On Optical Flow, IEEE Pacific Rim Conference on Multimedia 2009
109. Duc Thanh Nguyen, Wanqing Li and Philip Ogunbona, Human Detection Based On Weighted Template Matching, IEEE ICME 2009
110. Peng Chen, Wanqing Li and Philip Ogunbona, Greedy approximation of kernel PCA by minimizing the mapping error, Digital Image Computing: Techniques and Applications (DICTA) 2009
Li Li, Weiming
Hu, Ying Chen,
112. Wanqing Li, Zhengyou Zhang and Zicheng Liu, Graphical Modeling and Decoding of Human Actions, IEEE MMSP 2008, pp. 175-180.
113. Xianglin Zeng, Weiming Hu, Wanqing Li, Xiaoqin Zhang and Bo Xu, Key-frame Extraction Using Dominant-Set Clustering, IEEE ICME 2008, pp.1285-1288.
114. Alister Cordiner, Philip Ogunbona and Wanqing Li, Illumination Invariant Face Detection Using Classifier Fusion, LNCS 5353, Springer-Verlag, 2008, pp.456-465.
115. Yuan Zhong, Lei Ye, Wanqing Li and Philip Ogunbona, Perceived Similarity and Visual Descriptions in Content-Based Image Retrieval, Proc. IEEE ISM2007, pp.173-180.
116. Ce Zhan, Wanqing Li, Philip Ogunbona, and Farzad Safaei, Emotional States Control for On-line Game Avatars, Proceedings of the 6th ACM SIGCOMM workshop on Network and system support for games, NetGames 2007, pp.31-35.
117. Wenming Lu, Wanqing Li, Rei Safavi-Naini, Philip Ogunbona, A Maximum Likelihood Watermark Decoding Scheme, ICME 2007, pp.1247-1250.
118. Ce Zhan, Wanqing Li, Philip Ogunbona, and Farzad Safaei, Real-Time Facial Feature Point Extraction, LNCS 4810, Springer-Verlag, 2007, pp.88-97.
119. Gang Zheng, Wanqing Li, Philip Ogunbona, Liju Dong, and Igor Kharitonenko, Human Motion Simulation and Action Corpus, LNCS 4561, Springer-Verlag , 2007, pp.314-322.
120. Ce Zhan, Wanqing Li, Philip Ogunbona, and Farzad Safaei, Face to Face Communications in Multiplayer Online Games: A Real-Time System, LNCS 4553, Springer-Verlag, 2007, pp.401-410.
121. Ce Zhan, Wanqing Li, Philip Ogunbona, and Farzad Safaei, Facial expression recognition for multiplayer online game, Joint International Conference on CyberGames and Interactive Entertainment 2006 (CGIE2006), 4-6 December 2006 in Perth, Western Australia, IE'06, pp.52-58
122. Wanqing Li, Igor Kharitonenko, Serge Lichman, Chaminda Weerasinghe, A Prototype of Autonomous Intelligent Surveillance Cameras, IEEE AVSS 2006, 22-24 November 2006, Sydney, Australia, pp.101-106.
123. Ying Chen, Weiming Hu, and Xianglin Zeng, Wanqing Li, Indexing and Matching of Video Shot Based on Motion Analysis, ICARCV06.
124. Gavin Zheng, Wanqing Li and Ce Zhan, Cryptographic Key Generation from Biometric Data Using Lattice Mapping, ICPR 2006, vol.4, pp.513-516.
125. Liju Dong, Philip Ogunbona, Wanqing Li, Ge Yu, and Linan Fan, A fast algorithm for color image segmentation, ICIC 2006.
126. Wenming Lu, Wanqing Li, Rei Safavi-Naini and Philip Ogunbona, A pixel-based robust image watermarking system, ICME 2006. pp.1565-1568.
130. L.Ye, L. Cao, P. Ogunbona and W. Li, Description of evolution changes in image time sequences using visual descriptors, LNCS vol. 3893, Springer-Verlag, 2006, pp.189-197.
131. W. Lu, W. Li, R. Safavi-Naini, and P. Ogunbona, A new QIM-based image watermarking method and system, 2005 Asia-Pacific Workshop on Visual Information Processing, Hong Kong, December 2005, pp. 160–164.
132. Wanqing Li, Mingren Shi, Philip Ogunbona, A New Divide and Conquer Algorithm for Image and Video Segmentation, IEEE MMSP 2005, pp.585-588.
133. W. Li, C. deSilver and Y. Attikiouzel, Simultaneous MAP estimation of inhomogeneity and segmentation of brain tissues from MR images, IEEE ICIP 2005, Genova, Italy. vol.2, pp.1234– 1237.
134. I. Kharitonenko, W. Li, and C. Weerasinghe, Novel Architecture for Surveillance Cameras with Complementary Metal Oxide Semiconductor Image Sensors, IEEE ICCE 2005, p. 6.4-9.
135. W. Li, P. Ogunbona, L. Ye and I. Kharitonenko, Visual Process Model and Object Segmentation, The 7th International Conference on Signal Processing, Beijing, Sept. 2004, pp.753-756
136. W. Li, C. deSilver and Y. Attikiouzel, A Semi-Supervised Segmentation of Brain Tissues, The 7th International Conference on Signal Processing, Beijing, Sept. 2004, pp.757-760
137. W. Lu, R. Safavi-Naini, T. Uehara and W. Li, A Scalable and Oblivious Digital Watermarking for Images, The 7th International Conference on Signal Processing, Beijing, Sept. 2004, pp.2338-2341.
138. C. Weerasinghe, W. Li and P. Ogunbona, Stereoscopic panoramic video generation using centro-circular projection technique, ICASSP’03, vol.III, pp.473-476.
139. J. Randall, L. Guan, X. Zhang and W. Li, Hierarchical cluster model for perceptual image processing, ICASSP’02, Orlando, Florida, May 13 - 17, 2002, vol. 1, pp.1041-1044.
140. W. Li, P. Ogunbona, Y. Shi, and I. Kharitonenko, Modelling of color cross-talk in CMOS image sensors, ICASSP’02, Orlando, Florida, May 13 - 17, 2002, vol.IV, pp.3576-3579.
141. J. Randall, L. Guan, W. Li and X. Zhang, The hierarchical cluster model for image region segmentation, IEEE ICME, August 2002, Proceedings. Vol.1, pp.693-696.
142. J. Randall, L. Guan, X. Zhang and W. Li, The self-organising tree map for color image segmentation, International Symposium on Intelligent Signal Processing and Communications, November 2001
143. C. Weerasinghe, P. Ogunbona and W. Li, 2D to pseudo-3D conversion of “head and shoulder” images using feature based parametric display maps, ICIP 2001, vol.iii, pp.963-966.
144. W. Li, P. Ogunbona, and C. Weerasinghe, Stereoscopic video coding: an overview, APRS/IEEE Workshop on Stereo Image and Video Processing, December 2000, Sydney, Australia
145. C. Weerasinghe, P. Ogunbona and W. Li, Depth creation: a review of current technologies for monoscopic to pseudo stereoscopic conversion of video sequences, APRS/IEEE Workshop on Stereo Image and Video Processing, December 2000, Sydney, Australia
146. J. Randall, L. Guan, X. Zhang and W. Li, Investigation of the self organising tree map, Proceedings ICONIP’99, vol.2, 1999, pp.724-828.
147. W. Li, J. Bezdek, Y. Attikiouzel, Estimating the number of components in a normal mixture, Proceedings of International conference on Information, Statistics and Induction in Science, Melbourne, Australia, August, 1996.
148. W. Li, M. Morrison, Y. Attikiouzel, Unsupervised Segmentation of Dual-echo MR Images by a Sequentially Learned Gaussian Mixture Model, IEEE ICIP 1995, Washington, D.C., USA, pp.576-579.
149. W. Li and Y. Attikiouzel, Unsupervised Segmentation of Dual-echo MR Images With an ART-Based Neural Network, ICNN’95, Perth, Australia, pp.2600-2604.
150. W. Li and Y. Attikiouzel, Initialization of Clustering Algorithms for Unsupervised Segmentation of Multi-echo MR Images, ANZIIS’95, Perth, Australia, pp.88-92.
151. W. Li, H. Xie and Y. Attikiouzel, An Efficient Method of Volume Rendering for Medical Slices, IEEE ICIP 1994, Austin, Texas, USA, Nov. 1994, pp.652-656.
IEEE Technical Committees
· Secretary of the Multimedia Technical Committee (MMTC), IEEE Communication Society, 2016-2018
· Associate Editor, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2018-
· Associate Editor, Journal of Visual Communication and Image Representation (JVCI), 2016 - Present
· Guest Editor, International Journal of Computer Vision, Special issue on Human Activity Understanding from 2D and 3D data, 2015
· Guest Editor, Journal of Visual Communication and Image Representation, Special issue on Visual Understanding and Applications with RGB-D Cameras, June 2013
· Guest Editor, International Journal of Information and Systems Science, Special issue on Visual Information Processing for Large Scale Pattern Recognition, Vol.3, No.3, 2007
· Co-chair, IEEE International Workshop on Human Activity Understanding from 3D Data (HAU3D), 2011, 2012, 2013 in conjunction with CVPR
· Co-Chair, The Australian Summit on 3D Multimedia (AU3DMM), 2011
· General Co-chair, ACM Symposium on InformAtion, Computer and Communications Security 2009 (AsiaCCS’09)
· Publication Chair, IEEE International Workshop on Multimedia Signal Processing 2008 (MMSP'08)
· Co-chair of the special session on "Visual Information Processing for Large Scale Pattern Recognition", International Conference on Control, Automation, Robotics and Vision 2006 (ICARCV'06)
· General Co-chair, First International Conference on Digital Rights Management: Technology, Issues, Challenges and Systems 2005 (DRMTICS’05).
· Australian Research Council (ARC)
· International Journal of Computer Vision; IEEE Transactions on Pattern Analysis and Machine Intelligence; IEEE Transactions on Neural Networks; IEEE Transactions on Circuits and Systems for Video Technology; IEEE Transactions on Multimedia; Computer Vision and Image Understanding; IEEE Signal Processing Letters; IEEE Transactions on Electronic Devices; Journal of Computer Science and Technology; Image and Vision Computing.
This part is provided by Dr Zicheng Liu, Microsoft Research Redmond, USA, and used to be hosted on Dr Zicheng Liu's home page.
The dataset was captured by a Kinect device. There are 12 dynamic American Sign Language (ASL) gestures, and 10 people. Each person performs each gesture 2-3 times. There are 336 files in total, each corresponding to a depth sequence. The hand portion (above the wrist) has been segmented. The file name has the format sub_depth_m_n where m is the person index. n ranges from 1 to 36. Note that for some (m,n), the file sub_depth_m_n does not exist. For example, there is no "sub_depth_02_03". The reason is that some of the bad sequences are excluded from the dataset.
The mapping from n to gesture type is
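Each file is a MAT file which can be loaded with 64-bit MATLAB. Below is sample MATLAB code to load a file:
x = load('sub_depth_01_01');   % example name only (assumed); use any file that exists in the dataset
width = size(x.depth_part,1);
height = size(x.depth_part,2);
nFrames = size(x.depth_part,3);
% depth value at pixel (i,j) of frame k, with 1<=i<=width, 1<=j<=height, 1<=k<=nFrames
depthval = x.depth_part(i,j,k);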
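The file naming scheme described above can also be handled outside MATLAB. The following Python sketch parses names of the form sub_depth_m_n and lists the (m, n) pairs that have no file. It assumes that the person index m runs from 1 to 10 and that names may carry a .mat extension; neither is stated on this page, and the helper names `parse_name` and `missing_pairs` are hypothetical. Under these assumptions there are 10*36=360 possible names, so the 336 existing files would leave 24 pairs without a file.

```python
import re
from typing import Iterable, List, Optional, Tuple

N_PEOPLE = 10   # number of people in the dataset description (index range 1..10 is assumed)
N_MAX = 36      # n ranges from 1 to 36

# Matches e.g. "sub_depth_02_03" or "sub_depth_02_03.mat" (the extension is an assumption).
_NAME = re.compile(r"^sub_depth_(\d+)_(\d+)(?:\.mat)?$")


def parse_name(name: str) -> Optional[Tuple[int, int]]:
    """Return (m, n) for a valid file name, or None if the name does not fit the scheme."""
    match = _NAME.match(name)
    if match is None:
        return None
    m, n = int(match.group(1)), int(match.group(2))
    if not (1 <= m <= N_PEOPLE and 1 <= n <= N_MAX):
        return None
    return m, n


def missing_pairs(names: Iterable[str]) -> List[Tuple[int, int]]:
    """List every (m, n) in the assumed index ranges that has no corresponding file name."""
    present = {pair for pair in map(parse_name, names) if pair is not None}
    return [
        (m, n)
        for m in range(1, N_PEOPLE + 1)
        for n in range(1, N_MAX + 1)
        if (m, n) not in present
    ]
```

Passing the directory listing of the dataset folder to `missing_pairs` would show which sequences were excluded, such as (2, 3) for the example given above.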
The following two papers reported experiment results on this dataset:
 Alexey Kurakin, Zhengyou Zhang, Zicheng Liu, A Real-Time System for Dynamic Hand Gesture Recognition with a Depth Sensor, EUSIPCO, 2012.
 Jiang Wang, Zicheng Liu, Jan Chorowski, Zhuoyuan Chen, Ying Wu, Robust 3D Action Recognition with Random Occupancy Patterns, ECCV, 2012.
The dataset was captured by using a Kinect device. There are 16 activities: drink, eat, read book, call cellphone, write on a paper, use laptop, use vacuum cleaner, cheer up, sit still, toss paper, play game, lie down on sofa, walk, play guitar, stand up, sit down. There are 10 subjects. Each subject performs each activity twice, once in standing position, and once in sitting position. There is a sofa in the scene. Three channels are recorded: depth maps (.bin), skeleton joint positions (.txt), and RGB video (.avi). There are 16*10*2=320 files for each channel. In total, there are 320*3=960 files. Note that the RGB channel and depth channel are recorded independently, so they are not strictly synchronized.
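The file counts above follow from enumerating every (activity, subject, position) combination once per channel. A small Python sketch of that arithmetic is given below; the variable names are illustrative only, and the actual file names used in the dataset are not reproduced here because they are not given on this page.

```python
from itertools import product

ACTIVITIES = [
    "drink", "eat", "read book", "call cellphone", "write on a paper",
    "use laptop", "use vacuum cleaner", "cheer up", "sit still", "toss paper",
    "play game", "lie down on sofa", "walk", "play guitar", "stand up", "sit down",
]
SUBJECTS = range(1, 11)                    # 10 subjects
POSITIONS = ["standing", "sitting"]        # each activity is performed once in each position
CHANNELS = ["depth (.bin)", "skeleton (.txt)", "RGB (.avi)"]

# One recording per (activity, subject, position) combination.
recordings = list(product(ACTIVITIES, SUBJECTS, POSITIONS))
files_per_channel = len(recordings)
total_files = files_per_channel * len(CHANNELS)

print(files_per_channel, total_files)  # 320 960
```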
The format of the skeleton file is as follows. The first integer is the number of frames. The second integer is the number of joints which is always 20. For each frame, the first integer is the number of rows. This integer is 40 when there is exactly one skeleton being detected in this frame. It is zero when no skeleton is detected. It is 80 when two skeletons are detected (in that case which is rare, we simply use the first skeleton in our experiments). For most of the frames, the number of rows is 40. Each joint corresponds to two rows. The first row is its real world coordinates (x,y,z) and the second row is its screen coordinates plus depth (u, v, depth) where u and v are normalized to be within [0,1]. For each row, the integer at the end is supposed to be the confidence value, but it is not useful.
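The skeleton file layout described above can be read with a short parser. The Python sketch below is one possible reading of that description, not the dataset authors' code. It assumes the file is plain text with whitespace-separated numbers and that every row holds four values (three coordinates followed by the confidence value); the function name `parse_skeleton_text` is hypothetical. Frames with no skeleton are returned as None, and when two skeletons are present only the first is kept, as in the experiments mentioned above.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Joint = Tuple[Vec3, Vec3]   # (real world (x, y, z), screen (u, v, depth))

VALUES_PER_ROW = 4          # assumption: three coordinates plus the trailing confidence value


def parse_skeleton_text(text: str) -> List[Optional[List[Joint]]]:
    """Parse the contents of one skeleton .txt file into a list of frames."""
    tokens = iter(text.split())
    n_frames = int(next(tokens))
    n_joints = int(next(tokens))        # stated to be always 20

    frames: List[Optional[List[Joint]]] = []
    for _ in range(n_frames):
        n_rows = int(next(tokens))      # 0, 40 or 80 in the dataset
        rows = [
            [float(next(tokens)) for _ in range(VALUES_PER_ROW)]
            for _ in range(n_rows)
        ]
        if n_rows < 2 * n_joints:
            frames.append(None)         # no skeleton detected in this frame
            continue
        first_skeleton = rows[: 2 * n_joints]   # ignore a second skeleton if present
        joints: List[Joint] = []
        for j in range(n_joints):
            world_row = first_skeleton[2 * j]
            screen_row = first_skeleton[2 * j + 1]
            # Drop the trailing confidence value, which is described as not useful.
            joints.append((
                (world_row[0], world_row[1], world_row[2]),
                (screen_row[0], screen_row[1], screen_row[2]),
            ))
        frames.append(joints)
    return frames
```

Reading a file would then amount to `parse_skeleton_text(open(path).read())`, where `path` points to one of the skeleton .txt files.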
Activity recognition experiment with this dataset is reported in the following paper:
Mining Actionlet Ensemble for Action Recognition with Depth Cameras, Jiang Wang, Zicheng Liu, Ying Wu, Junsong Yuan, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), Providence, Rhode Island, June 16-21, 2012.
20 action types, 10 subjects, each subject performs each action 2 or 3 times. There are 567 depth map sequences in total. The resolution is 640x240. The data was recorded with a depth sensor similar to the Kinect device. The dataset is described in the following paper.
Action Recognition Based on A Bag of 3D Points, Wanqing Li, Zhengyou Zhang, Zicheng Liu, IEEE International Workshop on CVPR for Human Communicative Behavior Analysis (in conjunction with CVPR2010), San Francisco, CA, June, 2010.
Code to load and display depth maps (Load MSRAction3D_depth.zip) is provided by Josue Rocha Lima.
Better classification results are reported in the following paper:
Mining Actionlet Ensemble for Action Recognition with Depth Cameras, Jiang Wang, Zicheng Liu, Ying Wu, Junsong Yuan, IEEE Conference on Computer Vision and Pattern Recognition (CVPR2012), Providence, Rhode Island, June 16-21, 2012. Note that there is an error in the paper on the number of samples being used for the experiment. The number 402 in the paper is not correct. The correct number is 557. Out of the original 567 sequences in MSR Action3D Dataset, 10 sequences are not used in this paper's experiment because the skeletons are either missing or too erroneous.
Sample code to load MSR Action3D Dataset (drawskt.zip)
Skeleton Data in screen coordinates (MSRAction3DSkeleton (20joints).rar) (Thanks to Yi Wen Wan, University of North Texas, for data cleaning and conversion). There is a skeleton sequence file for each depth sequence in the Action3D dataset. A skeleton has 20 joint positions (see the image for illustrations of the joint positions). Four real numbers are stored for each joint: u, v, d, c where (u,v) are screen coordinates, d is the depth value, and c is the confidence score. If a depth sequence has n frames, then the number of real numbers stored in the corresponding skeleton file is equal to: n*20*4. Click here for MATLAB code to visualize the skeleton motions (The code is provided by Antonio Vieira from Federal University of Minas Gerais).
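Since each skeleton file stores n*20*4 real numbers, the frames can be recovered by grouping the values in blocks of 80. The Python sketch below shows that grouping. It assumes the numbers are stored as whitespace-separated text in frame-major, joint-major order (u, v, d, c for joint 1, then joint 2, and so on), which is not stated explicitly above; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

JOINTS_PER_FRAME = 20
VALUES_PER_JOINT = 4    # u, v, d, c

Joint = Tuple[float, float, float, float]


def split_skeleton_values(values: Sequence[float]) -> List[List[Joint]]:
    """Group a flat sequence of n*20*4 numbers into n frames of 20 (u, v, d, c) joints."""
    per_frame = JOINTS_PER_FRAME * VALUES_PER_JOINT
    if len(values) % per_frame != 0:
        raise ValueError(
            "expected a multiple of %d values, got %d" % (per_frame, len(values))
        )
    frames: List[List[Joint]] = []
    for start in range(0, len(values), per_frame):
        chunk = values[start:start + per_frame]
        frames.append([
            (chunk[i], chunk[i + 1], chunk[i + 2], chunk[i + 3])
            for i in range(0, per_frame, VALUES_PER_JOINT)
        ])
    return frames


def parse_screen_skeleton(text: str) -> List[List[Joint]]:
    """Parse the text of one skeleton file (assumed whitespace-separated numbers)."""
    return split_skeleton_values([float(token) for token in text.split()])
```

The number of frames n of the corresponding depth sequence is then simply the length of the returned list.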
This diagram shows the correspondence between the 20 points in the skeleton data and the joints (Thanks to Yu Zhong from AIT, BAE Systems for providing this diagram).
Skeleton Data in real world coordinates (MSRAction3DSkeletonREal3D.rar) (Thanks to Ferda Ofli, UC Berkeley, for processing the data).
Human activity understanding from RGB-D data has attracted increasing attention since the first work reported in 2010. Over this period, many benchmark datasets have been created to facilitate the development and evaluation of new algorithms. However, the existing datasets are mostly captured in laboratory environments with a small number of actions and small variations, which impedes the development of higher level algorithms for real world applications. Thus, this paper proposes a large scale dataset along with a set of evaluation protocols. The large dataset is created by combining nine existing publicly available datasets and can be expanded easily by adding more datasets. The large dataset has 94 actions and is suitable for testing algorithms from different perspectives using the proposed evaluation protocols. Four state-of-the-art algorithms are evaluated on the large combined dataset and the results have verified the limitations of current algorithms and the effectiveness of the large dataset.
Readers are referred to the following paper for details. If you use the combined dataset, please cite the following paper as well as all the original papers of the individual datasets.
Zhang, Jing and Li, Wanqing and Wang, Pichao and Ogunbona, Philip and Liu, Song and Tang, Chang, A Large Scale RGB-D Dataset for Action Recognition, International Workshop on Understanding Human Activities through 3D Sensors (UHA3DS) 2016 in conjunction with 23rd International Conference on Pattern Recognition (ICPR2016).
The CAMO_UOW dataset is used for camouflaged foreground detection (background subtraction). It contains 10 high-resolution videos captured in real scenes, both indoor and outdoor. In each video, one or two persons appear in the scene wearing clothes similar in colour to the background. Ground truth foreground masks are manually labelled for all the frames and provided in the dataset. More details can be found in the following papers.
Please cite the following papers if you use the dataset.
S. Li, D. Florencio, Y. Zhao, C. Cook, W. Li, "Foreground detection in camouflaged scenes", IEEE International Conference on Image Processing, ICIP 2017, Beijing, China, September 2017.
S. Li, D. Florencio, W. Li, Y. Zhao, C. Cook, "A Fusion Framework for camouflaged moving foreground detection in the wavelet domain", IEEE Transactions on Image Processing, (to appear) 2018.
UOW Online Action3D Dataset
The UOW Online Action3D Dataset consists of action sequences of skeleton videos; the 20 actions are those of the original MSR Action3D Dataset. The action videos were recorded by Microsoft Kinect V.2 at an average frame rate of 20 frames/s.
There are 20 participants, and each performs the actions according to his/her personal habits. Each participant first repeats each action 3-5 times, then performs the 20 actions continuously in a random order. The continuous action sequences can be used for online action recognition testing, and the repeated action sequences for training.
So that the dataset can be used for cross-dataset testing, the 20 participants performed the actions in 4 different environments.
Please cite the following papers if you use the dataset.
Chang Tang, Wanqing Li, Pichao Wang, Lizhe Wang, Online Human Action Recognition Based on Incremental Learning of Weighted Covariance Descriptors, Information Sciences, 2018 (to appear).
Last updated: 11 April 2018