Lecture Information

2021-04-28 Ping Ma: Subsampling Problems in Data Science
Date: 2021-04-25

Talk time: April 28, 2021, 9:00-11:00 a.m.

Format: Tencent Meeting (Meeting ID: 352 163 401)

Speaker: Ping Ma

Topic: Subsampling Problems in Data Science


Abstract

The rapid advance of science and technology over the past decade has produced an extraordinary amount of data that were previously inaccessible, offering researchers an unprecedented opportunity to tackle much larger and more complex research challenges. This opportunity, however, has not yet been fully exploited, because effective and efficient statistical and computing tools for analyzing super-large datasets are still lacking. One major challenge is that the advance of computing technologies still lags far behind the exponential growth of databases. One option is to invent algorithms that make better use of a fixed amount of computing power.

In this talk, I will review an emerging family of subsampling methods developed to achieve this goal. In subsampling methods, we draw a small proportion of the data (a subsample) from the full sample, and then perform the intended computations for the full sample using the small subsample as a surrogate. In the classical statistical literature, subsampling refers to the ‘m-out-of-n’ bootstrap, whose primary motivation is to make approximate inference when analytical expressions are difficult or intractable to derive. The general motivation of subsampling methods in data science is different from that of traditional subsampling. I will present challenges and opportunities.
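
As a rough illustration of the surrogate idea described above, the following minimal Python sketch fits a simple linear regression on a uniformly drawn subsample and compares it with the fit on the full sample. It is not taken from the talk: the uniform sampling scheme, the simulated data, and the names `fit_line` and `subsample_fit` are all assumptions made for illustration, and the methods reviewed in the talk may use other sampling probabilities.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def subsample_fit(xs, ys, r, rng):
    """Fit the line on r points drawn uniformly without replacement.

    Uniform sampling is an assumption of this sketch, chosen for simplicity.
    """
    idx = rng.sample(range(len(xs)), r)
    return fit_line([xs[i] for i in idx], [ys[i] for i in idx])


# Simulated "full sample" (hypothetical data): y = 2 + 3x + noise.
rng = random.Random(0)
n = 100_000
xs = [rng.uniform(0, 10) for _ in range(n)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

full = fit_line(xs, ys)                  # uses all n points
sub = subsample_fit(xs, ys, 1000, rng)   # uses 1% of the points

print("full sample:", full)
print("subsample:  ", sub)
```

The subsample fit touches only 1,000 of the 100,000 points, so its cost is a small fraction of the full computation, at the price of some extra variance in the estimates.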


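Speaker Bio

Professor Ping Ma is a Distinguished Professor at the University of Georgia (USA) and co-director of its Big Data Analytics Lab. He received his Ph.D. from Purdue University in 2003 and was a postdoctoral researcher at Harvard University from 2003 to 2005. From 2005 to 2013 he was an assistant professor and then an associate professor at the University of Illinois at Urbana-Champaign. He has held a Beckman chair professorship at the University of Illinois Center for Advanced Study and a chair professorship at the National Center for Supercomputing Applications, and he is a recipient of the National Science Foundation CAREER Award. One of his papers won the 2011 best paper award of The Canadian Journal of Statistics. He is the speaker for a 2021 National Science Foundation distinguished lecture in biotechnology. He is a Fellow of the American Statistical Association.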


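Scan the QR code below to register.

All announcements will be posted in both groups at the same time.

Please do not join more than one group.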