High-dimensional inference for spatial error models
In econometric theory and applications, problems in urban, real estate, agricultural, and environmental economics, among others, commonly involve data collected spatially from cross-sectional units, and in these settings the spatial relation among the sampling sites cannot be ignored. Spatial autocorrelation is thus introduced to model the correlation among values of a single variable strictly attributable to their relatively close locational positions on a two-dimensional surface, extending autocorrelation in time series to spatial dimensions.
With the growth of computer capabilities, databases are becoming progressively larger and more complex, making traditional statistical methods less effective or sometimes even unsuitable. Data from high-frequency economic transactions, detailed macroeconomic data collected by a multitude of sources with varying data quality and varying sampling frequencies, and data on large economic and social networks are just a few examples of the content of enormous databases that are now subject to thorough examination.
This dissertation discusses (high-dimensional) variable selection and estimation methods, and the corresponding theory, for a spatial error model in a regression context, where the spatial autocorrelation arises from the disturbances across cross-sectional units.
In the first part, we propose a generalized Lasso-type estimator for the spatial error model, where the disturbance terms are autocorrelated across cross-sectional units. We further prove the estimation consistency and selection sign consistency of the parameter estimator under both the low-dimensional setting, where the dimension of the parameter p is fixed and smaller than the sample size n, and the high-dimensional setting, where p is greater than n and may grow with n. In both settings, the number of non-zero components of the parameter is assumed to be small relative to the number of observations (sparsity).
In the second part, we investigate post-model selection estimators, which apply estimation to the model chosen by a first-step variable selection. We show that by separating the model selection and estimation processes, the post-model selection estimator can perform at least as well as the simultaneous variable selection and estimation method in terms of the rate of convergence. The convergence rates of the estimation error in both the ℓ2 and sup norms are studied. Moreover, under perfect model selection, that is, when the selection process correctly identifies the significant covariates of the true model with probability tending to 1, the oracle convergence rate can be reached.
In the last part, we sketch future work on high-dimensional analysis of the mixed regressive, spatial autoregressive model, in which the response of a unit depends not only on the explanatory variables but also on the responses of its neighboring units.
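For orientation, the model class and the two estimation steps summarized in the abstract can be written in a standard textbook form. This is only a sketch: the symbols W, ρ, ρ̂, λ_n, and Ŝ are assumed notation and are not taken from the dissertation, and the dissertation's generalized Lasso-type estimator may use a different weighting or penalty.

```latex
% Spatial error model (standard form; notation assumed, not from the dissertation).
% y: n x 1 response, X: n x p design, W: known n x n spatial weights matrix,
% rho: spatial autocorrelation parameter, varepsilon: i.i.d. mean-zero errors.
\begin{aligned}
y &= X\beta + u, \qquad u = \rho W u + \varepsilon .
\end{aligned}

% One common Lasso-type construction (an assumption): filter the spatial
% dependence with a preliminary estimate \hat\rho, then penalize with the l1 norm.
\begin{aligned}
\hat\beta &= \arg\min_{\beta \in \mathbb{R}^p}
  \frac{1}{2n}\,\bigl\| (I_n - \hat\rho W)(y - X\beta) \bigr\|_2^2
  + \lambda_n \|\beta\|_1 .
\end{aligned}

% Post-model selection step: refit without the penalty on the selected support.
\begin{aligned}
\hat S &= \{\, j : \hat\beta_j \neq 0 \,\}, \\
\tilde\beta_{\hat S} &= \arg\min_{b}
  \bigl\| (I_n - \hat\rho W)(y - X_{\hat S}\, b) \bigr\|_2^2,
  \qquad \tilde\beta_{\hat S^{c}} = 0 .
\end{aligned}

% Sign consistency, with \beta^{*} denoting the true parameter:
\begin{aligned}
\Pr\bigl( \operatorname{sign}(\hat\beta) = \operatorname{sign}(\beta^{*}) \bigr)
  \;\longrightarrow\; 1 \quad \text{as } n \to \infty .
\end{aligned}
```

In this notation, sparsity means that the number of non-zero entries of β* is small relative to n, even when p exceeds n.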
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Cai, Liqian
- Thesis Advisors: Maiti, Tapabrata
- Committee Members: Calantone, Roger J.; Lim, Chae Young; Zhong, Pingshou
- Date: 2016
- Program of Study: Statistics - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: viii, 106 pages
- ISBN: 9781339966021; 1339966026