Time: November 29, 2:00-3:30
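Venue: Mingde Main Building, Room 1016
Speakers:
1. Zhang Zhongyuan, School of Statistics, Central University of Finance and Economics
Title: Nonnegative Matrix Factorization and its applications in data mining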
Non-negative Matrix Factorization (NMF for short) can be traced back to the 1970s (notes from Golub), and has attracted much attention in recent years since it was studied extensively by Paatero & Tapper (1994) and Lee & Seung (1999, 2001). Mathematically, NMF can be described as follows: given an n × m matrix V composed of non-negative elements, where n >> m is usually the case, the task is to factorize V into a non-negative matrix W of size n × r and another non-negative matrix H of size r × m such that V ≈ WH. The parameter r is pre-assigned and usually satisfies r < nm/(n + m), so that W and H together have fewer entries than V. W and H can be interpreted in various ways depending on the field, the purpose, and the analyst. In general, W can be seen as the basis matrix of the reduced-dimensional space, and H as the projection of V onto W. NMF has a wide variety of applications, since many pattern recognition and data mining problems focus on the analysis of non-negative matrix data and require non-negative results. In this presentation, I will give a brief review of NMF, including the standard model, algorithms, and its relationships to some existing clustering methods (equivalence to K-means and PLSI). I will then present two extensions of NMF: Binary Matrix Factorization (BMF for short), specially designed for binary data and biclustering, and Non-negative Tri-factor Bi-orthogonal 3D tensor Decomposition (Tri-ONTD for short), specially designed for three-dimensional tensor data analysis. Finally, I will discuss the open challenges and future prospects of NMF.
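For readers unfamiliar with the standard model, the factorization V ≈ WH can be sketched with the multiplicative update rules of Lee & Seung for the squared Frobenius error. This is only an illustrative sketch, not necessarily the algorithm covered in the talk; the function name nmf_multiplicative, the parameters n_iter, eps and seed, and the synthetic test matrix are all assumptions made for this example.

```python
import numpy as np


def nmf_multiplicative(V, r, n_iter=300, eps=1e-9, seed=0):
    """Approximate a non-negative n x m matrix V by W (n x r) times H (r x m).

    Uses Lee & Seung style multiplicative updates for the squared
    Frobenius error. Hypothetical helper written for illustration.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps

    def rel_error():
        return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

    errors = [rel_error()]
    for _ in range(n_iter):
        # Multiplying by non-negative ratios preserves non-negativity;
        # eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(rel_error())
    return W, H, errors


# Synthetic data (an assumption for this example): an exactly rank-3
# non-negative matrix with n = 60, m = 8, so r = 3 < nm/(n + m) ≈ 7.06.
data_rng = np.random.default_rng(1)
V = data_rng.random((60, 3)) @ data_rng.random((3, 8))

W, H, errors = nmf_multiplicative(V, r=3)
print("relative error before and after:", errors[0], errors[-1])
```

Because each update only multiplies by non-negative ratios, W and H stay non-negative throughout, which is the property that distinguishes NMF from an unconstrained low-rank factorization.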
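2. Liu Miao, School of Statistics, Central University of Finance and Economics
Title: Exploring statistical learning methods in automatic text classification
Automatic text classification is an important research topic in text mining. This talk introduces several aspects of automatic text classification research that I find particularly interesting. The issues covered mainly concern text representation, text feature extraction, and automatic text classification algorithms. I hope to better integrate the ideas and methods of statistical learning into this work.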