
Communication-efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates

[Publication Time] 2021-05-01

[Lead Author] Zhou, Ping

[Corresponding Author] Ma, Jingyi

[Journal] Computational Statistics and Data Analysis


[Abstract]

Nowadays, large-scale data sets are increasingly stored in a distributed fashion across a large number of clients. This study develops a distributed estimator for generalized linear models (GLMs) in the “large n, diverging pn” framework under a weak assumption on the number of clients. When the dimension pn diverges at the rate o(n), the asymptotic efficiency of the global maximum likelihood estimator (MLE), the one-step MLE, and the aggregated estimating equation (AEE) estimator for GLMs is established. A novel distributed estimator requiring only two rounds of communication is then proposed; it attains the same asymptotic efficiency as the global MLE when pn = o(n). Its assumption on the number of clients is weaker than that of the AEE estimator, which makes the proposed method more practical for real-world applications. Simulations and a case study demonstrate the satisfactory finite-sample performance of the proposed estimator.
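To make the two-round idea concrete, the following is a minimal sketch (not the paper's exact algorithm, and all function names are hypothetical) of a one-step distributed estimator for logistic regression, a canonical GLM: in round one each client fits a local MLE and the server averages them into an initial estimate; in round two each client returns its score vector and Hessian evaluated at that initial estimate, and the server performs a single aggregated Newton update.

```python
import numpy as np

def local_newton_mle(X, y, iters=25):
    """Local logistic-regression MLE on one client via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))          # fitted probabilities
        grad = X.T @ (y - p)                          # score vector
        H = X.T @ (X * (p * (1 - p))[:, None])        # observed information
        beta += np.linalg.solve(H, grad)
    return beta

def distributed_one_step(clients):
    """Two-round distributed estimator (illustrative sketch).

    clients: list of (X, y) pairs, one per client.
    """
    # Round 1: each client sends its local MLE; the server averages them.
    beta0 = np.mean([local_newton_mle(X, y) for X, y in clients], axis=0)
    # Round 2: clients send score and Hessian at beta0; one Newton step.
    grad = np.zeros_like(beta0)
    H = np.zeros((beta0.size, beta0.size))
    for X, y in clients:
        p = 1.0 / (1.0 + np.exp(-X @ beta0))
        grad += X.T @ (y - p)
        H += X.T @ (X * (p * (1 - p))[:, None])
    return beta0 + np.linalg.solve(H, grad)

# Synthetic demonstration: 10 clients, 500 observations each.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -0.5, 0.25])
clients = []
for _ in range(10):
    X = rng.normal(size=(500, 3))
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))
    clients.append((X, y))
beta_hat = distributed_one_step(clients)
```

The one-step update is what drives the efficiency result: averaging local MLEs alone carries a bias of order 1/(local sample size), while the aggregated Newton correction restores the first-order behavior of the global MLE while communicating only O(pn^2) numbers per client.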


[Keywords]

Generalized linear models; Large-scale distributed data; Asymptotic efficiency; One-step MLE; Diverging p