Time: 3:00-4:00 p.m., March 11, 2015
Venue: Conference Room 1016, Mingde Main Building, Renmin University of China
Title: Principal Component Analysis: Regularization, Supervision, Application, Asymptotics
Speaker: Haipeng Shen
1. Department of Statistics & Operations Research, University of North Carolina at Chapel Hill
2. Innovation and Information Management, School of Business, Faculty of Business and Economics, University of Hong Kong
Abstract:
Principal component analysis (PCA) is a ubiquitous technique for dimension reduction of multivariate data. Regularization of PCA becomes essential for high dimensionality, for example, in techniques such as functional PCA and sparse PCA. Maximizing the variance of a standardized linear combination of variables is the standard textbook treatment of PCA. A more general perspective on PCA is by way of fitting low-rank approximations to the data matrix. I shall first take this low-rank-approximation perspective and describe a general regularization framework for PCA that leads to alternative approaches for its regularized siblings. This perspective can then be extended to a framework that incorporates supervision into (regularized) PCA when there is auxiliary information relevant for dimension reduction. I shall finally introduce a general asymptotic framework for studying consistency properties of PCA. The framework includes several existing domains of asymptotics as special cases, and furthermore enables one to investigate interesting connections and transitions among the various domains. The various methods will be demonstrated with applications from bioinformatics, neuroimaging, and business analytics.
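As a rough illustration of the two views of PCA mentioned in the abstract, the sketch below checks on simulated data that the variance-maximizing direction coincides (up to sign) with the leading right singular vector of the centered data matrix, which gives the best rank-1 approximation. The data, the variable names, and the use of NumPy are assumptions made for this example; it is not taken from the talk and does not show the speaker's regularization framework.

```python
import numpy as np

# Simulated data (an assumption for illustration): 200 samples, 5 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
Xc = X - X.mean(axis=0)  # center each column
n = Xc.shape[0]

# Textbook view: the first PC direction maximizes the variance of a
# unit-norm linear combination, i.e. it is the leading eigenvector
# of the sample covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
v_eig = eigvecs[:, -1]

# Low-rank view: the best rank-1 approximation of the centered data
# matrix (in Frobenius norm) comes from the leading singular triplet.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
v_svd = Vt[0]
rank1 = s[0] * np.outer(U[:, 0], v_svd)

# The two directions agree up to sign.
same_direction = bool(np.isclose(abs(v_eig @ v_svd), 1.0))

# The squared approximation error equals the sum of the remaining
# squared singular values (Eckart-Young theorem).
residual = np.linalg.norm(Xc - rank1) ** 2

print("directions agree:", same_direction)
print("residual matches:", bool(np.isclose(residual, np.sum(s[1:] ** 2))))
```

In this low-rank formulation, penalties on the singular vectors (for sparsity or smoothness, say) can be added to the approximation criterion, which is one common way regularized variants of PCA are set up.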
About the speaker:
Haipeng Shen received his PhD in Statistics from The Wharton School of Business, University of Pennsylvania in 2003. He is a full professor of Statistics and Operations Research at the University of North Carolina at Chapel Hill, and a visiting professor of Innovation and Information Management at the School of Business, Faculty of Business and Economics, University of Hong Kong. His research revolves around the theme of data-driven decision making in the face of uncertainty, including fundamental statistical research on challenges imposed by big data (high dimensionality and complex structure), as well as interdisciplinary analytical research in business analytics, neuroimaging, bioinformatics, and network traffic modeling. His work has been supported by US NSF Statistics, NSF Service Enterprise Systems, NIH, The Xerox Foundation, and Hong Kong Stanley Ho Challenge Fund. He has published research articles in top journals in both Statistics (JASA and AOAS) and Operations Management (MSOM). He has collaborated with industry partners such as Alcatel-Lucent, Bank of America, Xerox, and i-MD.