In statistics, a proxy variable is a variable that is probably not of any great interest in itself, but from which a variable of interest can be inferred. For this to work, the proxy variable must be closely correlated, not necessarily linearly or positively, with the inferred value.
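The point that the correlation need not be linear or positive can be illustrated with a minimal sketch. The data below are hypothetical: a made-up proxy and a variable of interest that falls off exponentially as the proxy grows. The helper names (`pearson`, `ranks`, `spearman`) are introduced only for this illustration. The sketch compares the Pearson (linear) coefficient with the Spearman rank coefficient, which measures any monotonic relationship.

```python
import math


def pearson(xs, ys):
    """Pearson (linear) correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Ranks from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data (an assumption for illustration only):
# the variable of interest decays exponentially as the proxy increases.
proxy = [float(i) for i in range(10)]
target = [math.exp(-0.5 * p) for p in proxy]

linear_r = pearson(proxy, target)
rank_r = spearman(proxy, target)

print(f"Pearson:  {linear_r:.3f}")
print(f"Spearman: {rank_r:.3f}")
```

In this sketch the Spearman coefficient is essentially -1, because the relationship is perfectly monotonic, while the Pearson coefficient is noticeably weaker in magnitude. A proxy of this kind would still be usable even though the relationship is negative and nonlinear.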
Examples
Per-capita GDP is often used as a proxy for measures of standard of living or quality of life.
When performing social collections, the gender of the respondent is an important variable, as gender commonly influences how one responds. Most general collections therefore collect data on the respondent's sex and age, and the recorded sex is used as a proxy for gender. In most general collections, the proportion of transsexual and transgender individuals is low, making the correlation reasonably good.
Likewise, country of origin or birthplace might be used as a proxy for race.