Seminar Information

2018-10-24 Mingyao Ai (艾明要): Optimal Subsampling Algorithm for Big Data Generalized Linear Models
Posted: 2018-10-18

Time: 2018-10-24, 14:00-15:00

Venue: Conference Room 1016, Mingde Main Building

Title: Optimal Subsampling Algorithm for Big Data Generalized Linear Models


Abstract:

  To quickly approximate the maximum likelihood estimator with massive data, Wang et al. (JASA, 2017) proposed an optimal subsampling method under the A-optimality criterion (OSMAC) for logistic regression. This paper extends the scope of the OSMAC framework to include generalized linear models with canonical link functions. The consistency and asymptotic normality of the estimator from a general subsampling algorithm are established, and optimal subsampling probabilities under the A- and L-optimality criteria are derived. Furthermore, using a Frobenius-norm matrix concentration inequality, finite-sample properties of the subsample estimator based on the optimal subsampling probabilities are derived. Since the optimal subsampling probabilities depend on the full-data estimate, an adaptive two-step algorithm is developed. Asymptotic normality and optimality of the estimator from this adaptive algorithm are established.

  The proposed methods are illustrated and evaluated through numerical experiments on simulated and real datasets.
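
  As a rough illustration of the adaptive two-step idea described in the abstract, the following Python sketch treats logistic regression, the special case handled by the original OSMAC method. It is a minimal sketch under stated assumptions, not the speaker's implementation: the function names, the pilot and second-stage sizes `r0` and `r`, and the simulated data are all hypothetical, and the probabilities proportional to |y_i - p_i| * ||x_i|| are assumed here to be the L-optimality-style choice for logistic regression.

```python
import numpy as np


def weighted_logistic_mle(X, y, w, iters=50, tol=1e-10):
    """Maximize a weighted logistic log-likelihood by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (w * (y - p))                       # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def two_step_subsample(X, y, r0, r, rng):
    """Hypothetical two-step subsampling estimator (illustrative only)."""
    n = X.shape[0]

    # Step 1: uniform pilot subsample gives a rough estimate of beta.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = weighted_logistic_mle(X[idx0], y[idx0], np.ones(r0))

    # Step 2: approximate the subsampling probabilities with the pilot
    # estimate. Assumed form: pi_i proportional to |y_i - p_i| * ||x_i||.
    p = 1.0 / (1.0 + np.exp(-(X @ beta0)))
    score = np.abs(y - p) * np.linalg.norm(X, axis=1)
    pi = score / score.sum()
    idx1 = rng.choice(n, size=r, replace=True, p=pi)

    # Combine both subsamples and weight each point by the inverse of its
    # sampling probability (1/n for the uniform pilot draws).
    idx = np.concatenate([idx0, idx1])
    w = np.concatenate([np.full(r0, float(n)), 1.0 / pi[idx1]])
    w = w / w.mean()  # rescaling does not change the maximizer
    return weighted_logistic_mle(X[idx], y[idx], w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20000
    X = rng.normal(size=(n, 3))
    beta_true = np.array([0.5, -0.5, 0.5])  # made-up coefficients
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)

    print("full data :", weighted_logistic_mle(X, y, np.ones(n)))
    print("subsample :", two_step_subsample(X, y, r0=500, r=2000, rng=rng))
```

  The inverse-probability weights are what keep the subsample estimator targeting the full-data maximum likelihood estimate even though the second-stage points are drawn non-uniformly.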


About the speaker:

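  Mingyao Ai (艾明要) is a professor and doctoral supervisor at the School of Mathematical Sciences, Peking University, and head of its statistics teaching and research section. Professor Ai also serves as an executive council member of the Chinese Association for Applied Statistics (中国现场统计研究会), chair of its Experimental Design branch, vice chair of its High-Dimensional Data Statistics branch, and secretary-general of its Spatial Statistics branch. Professor Ai is an associate editor of the international statistics journals Statistica Sinica, Journal of Statistical Planning and Inference, Statistics and Probability Letters, and STAT, and an editorial board member of the Chinese mathematics journal 《系统科学与数学》 (Journal of Systems Science and Mathematical Sciences).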

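  Professor Ai's teaching and research cover the design and analysis of experiments, computer experiments, big data analysis, and applied statistics. Professor Ai has published more than sixty papers in leading journals in China and abroad, including Ann. Statist., JASA, Biometrika, Technometrics, and Statist. Sinica, has led and completed five General Program projects and one Key Program sub-project of the National Natural Science Foundation of China, and has taken part in completing two projects under the Ministry of Science and Technology's 973 Program.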