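Hu Wei, a doctoral student at our school, has published a paper in the Electronic Journal of Statistics. The study targets large-scale network data and proposes a crawling subsampling method that yields consistent parameter estimates for the multivariate spatial autoregression model, together with the corresponding theoretical properties. It thereby addresses a limitation of earlier sampling methods, whose sampled data could not produce consistent estimates for this class of models.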
Paper title
Crawling subsampling for multivariate spatial autoregression model in large-scale networks
Abstract
In network data analysis, multivariate spatial autoregression (MSAR) models may be used to analyze the autocorrelation among multiple responses. With large-scale networks, the estimation for MSAR on the entire network is computationally expensive. In this case, the subsampling method could be adopted. This approach selects a sample of nodes and then uses the estimate based on the sample to approximate the estimate on the full data. However, traditional sampling methods cannot obtain unbiased parameter estimates. Considering the second-order friend information of sampled nodes, we propose the crawling subsampling (CS) method for a general framework. Thereafter, based on the sampled data only, we construct the least-squares objective function. Under certain conditions, the computational complexity of optimizing the objective function is linear with the sample size n_s. The identification condition for the parameters on the sampled network is theoretically provided. The sample size order requirement is provided, and the asymptotic properties of the least-squares estimators are investigated. The numerical results for the simulated and real data are presented to demonstrate the performance of the proposed CS method and least-squares estimator for the MSAR model.
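As a rough illustration of what "second-order friend information" of sampled nodes could look like in practice, the sketch below draws a few seed nodes and then crawls outward to collect their friends and friends-of-friends. It is not the paper's CS procedure. The adjacency-dictionary representation, the function names `draw_seeds` and `two_hop_neighbourhood`, and the uniform choice of seeds are all assumptions made for illustration only.

```python
import random


def draw_seeds(adj, n_seeds, rng_seed=0):
    """Pick seed nodes uniformly at random (an assumed sampling rule)."""
    rng = random.Random(rng_seed)
    return set(rng.sample(sorted(adj), n_seeds))


def two_hop_neighbourhood(adj, seeds):
    """Return (first_order, second_order) neighbours of the seed set.

    adj is a hypothetical representation: dict mapping node -> set of neighbours.
    """
    seeds = set(seeds)
    # Friends of the seeds, excluding the seeds themselves.
    first_order = set()
    for s in seeds:
        first_order |= adj[s]
    first_order -= seeds
    # Friends of friends, excluding anything already collected.
    second_order = set()
    for f in first_order:
        second_order |= adj[f]
    second_order -= seeds | first_order
    return first_order, second_order


# Toy path graph 0 - 1 - 2 - 3 - 4.
toy_adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
first, second = two_hop_neighbourhood(toy_adj, {2})
print(sorted(first), sorted(second))
```

On the toy path graph with seed node 2, the first-order neighbours are nodes 1 and 3 and the second-order neighbours are nodes 0 and 4. How the paper uses this information in its least-squares objective is not shown here.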
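About the authors
Hu Wei is a doctoral student at the School of Statistics, Renmin University of China. His main research interests include network data analysis and variable screening.
Huang Danyang is an associate professor at the School of Statistics, Renmin University of China, and a researcher at the Center for Applied Statistics. Her main research interests are dimension reduction for ultrahigh-dimensional data, complex network modeling, and internet credit reporting.
Zhang Bo is a professor at the School of Statistics, Renmin University of China, and a researcher at the Center for Applied Statistics. His main research interests are stochastic analysis in finance, high-frequency financial data analysis, and mathematical finance.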
Publication page