Stereotype accuracy
Admin
In case anyone missed my earlier tweets: we are running a study of stereotype accuracy about immigrant groups in Denmark.

The study proposal is here: https://docs.google.com/document/d/1vm_wVG1Dih3zz83cCE52xc0rfS52-qne9h5h0ApMjqM/edit#

We are collecting data now and will continue for the next two weeks or so.

Initial results reveal high levels of accuracy.

> describe(pers.cors) #desc. stats
  vars  n mean   sd median trimmed  mad   min  max range  skew kurtosis   se
1    1 25 0.51 0.24   0.58    0.55 0.12 -0.25 0.79  1.04 -1.72     2.35 0.05
One of the graphs has three outliers. Try removing them and redrawing the graph to see what changes.
Errors-in-variables (EIV) regression.

I have never recommended this technique to anyone before, although I could have. The reason is that EIV regression was only available in Stata, or so I thought. There is still no R package as far as I know, but I recently found a hand-written R function by Culpepper & Aguinis (2011).

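# Errors-in-variables regression (Culpepper & Aguinis, 2011)
# reliability: one value per predictor, in the order they appear in the formula
eiv<-function(formula,reliability,data){
# design matrix (first column is the intercept); p = number of predictors, n = number of rows
mfx<-model.matrix(formula,data=data)
p<-length(mfx[1,])-1;n<-length(mfx[,1])
# rebuild the model frame so that its first column is the response
mf <- match.call(expand.dots = FALSE)
m <- match(c("formula", "data", "subset", "weights", "na.action", "offset"), names(mf), 0L)
mf <- mf[c(1L, m)]
mf$drop.unused.levels <- TRUE
mf[[1L]] <- as.name("model.frame")
mf <- eval(mf, parent.frame())
mf<-data.frame(mf)
# observed covariance matrix of the predictors, and their covariances with the response
MXX<-var(mfx[,c(2:(p+1))]);MXY<-var(mfx[,c(2:(p+1))],mf[,1])
# measurement-error variances: (1 - reliability) * observed variance, on the diagonal
Suu<-matrix(0,p,p)
if(p==1) Suu=(1-reliability)*MXX else
if(p>1) diag(Suu)<-(1-reliability)*diag(MXX)
# error-corrected covariance matrix and corrected slopes
Mxx<-MXX-(1-p/n)*Suu;Btilde<-solve(Mxx)%*%MXY
vY=var(mf[,1])
# residual mean square and variance-covariance matrix of the corrected slopes
MSEtilde<-as.numeric(n*(vY-2*t(Btilde)%*%MXY+t(Btilde)%*%MXX%*%Btilde)/(n-p-1))
Rhat<-matrix(0,p,p);diag(Rhat)<-(t(Btilde)%*%Suu)^2
VCtilde<-MSEtilde*(1/n)*solve(Mxx)+(1/n)*solve(Mxx)%*%(Suu*MSEtilde+Suu%*%Btilde%*%t(Btilde)%*%Suu+2*Rhat)%*%solve(Mxx)
# t statistics and two-sided p-values on n - p degrees of freedom
ttilde<-Btilde/sqrt(diag(VCtilde))
output<-cbind(reliability,Btilde,sqrt(diag(VCtilde)),ttilde,2*(1-pt(abs(ttilde),n-p)))
colnames(output)<-c('Reliability','Est.','S.E.','t','Prob.(>|t|)')
output
}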

eiv(educ~wordsum+logincome+age,reliability=c(.71,.75,1),data=d)


The code above is the version posted on Aguinis's website, which differs from the version printed in the Culpepper & Aguinis paper. Still, with this code I get the same parameter estimates as the eivreg command in Stata.
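For intuition about what the correction does, here is a minimal single-predictor sketch on simulated data. It is not the authors' code: the variable names, the simulated values and the reliability of 0.8 are all assumptions chosen for illustration. With one predictor, subtracting the error variance (1 - reliability) * var(x) from the observed variance amounts, apart from the small (1 - p/n) adjustment used in eiv(), to dividing the OLS slope by the reliability.

```r
# Illustrative simulation only; all names and values are assumptions.
set.seed(1)
n      <- 10000
x_true <- rnorm(n)                       # error-free predictor, variance 1
x_obs  <- x_true + rnorm(n, sd = 0.5)    # observed predictor, error variance 0.25
y      <- 2 * x_true + rnorm(n)          # true slope is 2
rel    <- 1 / (1 + 0.25)                 # reliability = true variance / observed variance

b_ols <- cov(x_obs, y) / var(x_obs)      # OLS slope, attenuated toward 2 * rel
# Subtract the measurement-error variance from the observed variance
b_eiv <- cov(x_obs, y) / (var(x_obs) - (1 - rel) * var(x_obs))

round(c(ols = b_ols, eiv = b_eiv), 2)
```

The OLS slope should come out noticeably below 2, while the corrected slope should land close to it.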

The drawbacks are that the function reports no intercept, no confidence intervals, no R² and no F test, and that it cannot use sampling weights.
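In the meantime, approximate confidence intervals can be built by hand from the 'Est.' and 'S.E.' columns that eiv() already returns. The sketch below is an assumption, not part of the authors' program: the helper name eiv_ci is hypothetical, it reuses the n - p degrees of freedom that eiv() uses for its p-values, and the small matrix at the end holds made-up numbers purely to show the call.

```r
# Hypothetical helper, not part of the Culpepper & Aguinis program.
# out: matrix returned by eiv(), with columns 'Est.' and 'S.E.'
# n: sample size; p: number of predictors
eiv_ci <- function(out, n, p, level = 0.95) {
  crit <- qt(1 - (1 - level) / 2, df = n - p)   # same df as the p-values in eiv()
  cbind(out,
        Lower = out[, "Est."] - crit * out[, "S.E."],
        Upper = out[, "Est."] + crit * out[, "S.E."])
}

# Made-up numbers, only to show the shape of the input and the call
demo <- cbind("Est." = c(0.40, 1.20), "S.E." = c(0.05, 0.30))
eiv_ci(demo, n = 500, p = 2)
```

These are t-based intervals that take the reported standard errors at face value, so they are only as good as those standard errors.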

I emailed the authors several days ago, asking whether they could extend this program. No answer so far, and I don't expect one (as always...).

One more thing. EIV regression is known to give only unstandardized coefficients, not standardized ones. I have asked how to obtain them here:
http://stats.stackexchange.com/questions/129689/how-would-you-get-the-standardized-coefficients-in-errors-in-variables-eiv-reg

Generally I prefer SEM, but when it is too complicated or there are not enough variables to build latent variables, I would suggest EIV regression.


EDIT:

I got an answer from Culpepper, though later than expected. He says he will make the improvements later, and that if I need the intercept, it can be obtained the same way as in classical regression: intercept = Ybar - b1*X1bar - b2*X2bar, where Ybar, X1bar and X2bar are the means of Y, X1 and X2, and b1 and b2 are the slope coefficients of X1 and X2.
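Culpepper's formula is easy to check in R. The sketch below uses simulated data and ordinary lm() slopes, because lm() also reports an intercept to compare against; the variable names and values are assumptions for illustration only.

```r
# Illustrative data only; names and values are assumptions.
set.seed(2)
d <- data.frame(x1 = rnorm(200, mean = 5), x2 = rnorm(200, mean = -3))
d$y <- 1.5 + 0.8 * d$x1 - 0.4 * d$x2 + rnorm(200)

fit    <- lm(y ~ x1 + x2, data = d)
slopes <- coef(fit)[c("x1", "x2")]

# intercept = mean of Y minus each slope times the mean of its predictor
intercept <- mean(d$y) - sum(slopes * colMeans(d[, c("x1", "x2")]))

all.equal(intercept, unname(coef(fit)["(Intercept)"]))
```

For eiv(), the same line should work with the 'Est.' column of its output in place of slopes, provided the means are computed on the same complete cases the model used.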