"Btw, you are using the old version of the merger.R file. The new one does not include the libraries that gave you trouble."
You're right. It's my fault for not having downloaded the 2nd version; it finally works now.
> write.meta()
Error: could not find function "write.meta"
You forgot to run source("merger.R") first, which is what loads the write.mega() function (note the spelling too: write.mega(), not write.meta()).
Perhaps you can open a thread covering this basic stuff. I'm no good at R when it comes to preparing data. For example, I don't understand the two # warning comments here:
read.mega = function(filename){
  return(read.csv(filename, sep=";", row.names=1, #this loads the rownames
                  stringsAsFactors=FALSE,         #FACTORS SUCK AVOID LIKE THE PLAGUE
                  check.names=FALSE))             #avoid prepending X to columns
}
R has a special object type for nominal data called a factor. When you load a file, R attempts to auto-detect strings that are factors and converts them. This caused severe problems when working with the megadataset, due to a technical point I won't go into here. The short story is that I spent 6 hours tracking down the source of an error, and it was that default setting (the default is stringsAsFactors=TRUE).
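To see the kind of thing that goes wrong, here is a toy illustration (a made-up vector, not the megadataset itself):

```r
# Toy illustration of the factor pitfall -- not the megadataset itself.
x = factor(c("10", "2", "30"))  # strings silently become a factor
as.numeric(x)                   # 1 2 3 -- the internal level codes, not the numbers!
as.numeric(as.character(x))     # 10 2 30 -- the safe way to recover the values
```

When a numeric column gets auto-converted to a factor, as.numeric() returns the level codes rather than the values, which is exactly the kind of silent error that is hard to track down.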
check.names=TRUE makes R prepend an X to variable names that begin with a number. I thought it was silly, but then I discovered that many functions in R don't work if a variable name starts with a number. So in the 3rd version of merger.R I removed that part.
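A minimal sketch of what check.names does, using made-up year columns (not the megadataset):

```r
# Hypothetical file with numeric column names, read inline for illustration
d1 = read.csv(text = "2000;2001\n1;2", sep = ";")  # check.names=TRUE is the default
names(d1)  # "X2000" "X2001" -- an X gets prepended
d2 = read.csv(text = "2000;2001\n1;2", sep = ";", check.names = FALSE)
names(d2)  # "2000" "2001" -- kept as-is, but must be quoted with backticks: d2$`2000`
```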
I will try the Rcmdr package next time. I know from having tried it that I didn't understand what to do with it. I may try RStudio later. I have noticed in several YouTube videos that users usually work through RStudio.
You mean R Commander? It's a GUI (provided by the Rcmdr package), whereas RStudio is an IDE, like those for other programming languages. I always use RStudio. Great IDE. :)
The reason I need menus is that they ensure you're not making mistakes in your coding, and the data window shows you what your columns look like. When you use the menus, the output is displayed along with the syntax. That's how I learned to create variables, display stats with the conditional "do if", or run an analysis repeatedly by groups. See attachment.
As I mentioned, RStudio lets you view your data objects.
I don't understand why they need to make R so complicated. In Stata/SPSS, when you create variables with missing values, you don't need to do stuff like:
use="complete"
na.rm=TRUE
It is not normal that R forces you to learn so much; it is extremely error prone. Of course, the fact that authors who work with R do not show their syntax does not make things easier. This is irritating. R is free software, so it would be very helpful to show us the syntax.
Many authors do show their code. When I blog research, I post the code too so others can replicate my analyses. Many bloggers do that, and many researchers attach syntax as supplementary material. This should be mandatory for anyone using syntax, to ensure that analyses are replicable.
R is somewhat silly about handling missing data. Some functions automatically ignore missing values, some require you to add na.rm=TRUE (remove missing), and others want you to specify exactly how (e.g. use="pairwise.complete.obs" for cor() when you want a correlation matrix).
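The different conventions side by side, on a toy vector and matrix:

```r
x = c(1, 2, NA, 4)
mean(x)                # NA -- mean() does not drop missing values on its own
mean(x, na.rm = TRUE)  # 2.333...
m = cbind(a = c(1, 2, NA, 4), b = c(2, 4, 6, 8))
cor(m)                                 # NA wherever the incomplete column is involved
cor(m, use = "pairwise.complete.obs")  # drops missings pair by pair instead
```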
It is true that R allows weights, but it is not true that you can really work with them. As far as I know, there is one package you can use to create an object that will weight your analyses:
http://r-survey.r-forge.r-project.org/survey/html/svydesign.html
However, this object only works within the survey package (svydesign() is one of its functions).
In other words, you can't do linear regression, correlation, tobit, etc. as usual. You have to use the relevant functions within the survey package, and in that package I don't see regression, logistic regression, tobit, SEM, factor analysis, etc.
Even if you can create the following object:
dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
You cannot do this :
model<-lm(wordsum ~ race + cohort + race:cohort, data=GSSsubset, dstrat)
This is bothersome. It's just like AMOS. In that situation, the only thing you can do with R is to use an input data matrix instead of raw data: with Stata, SAS, or SPSS you generate your correlation/covariance matrix using the sampling weights, and then switch to R. But in that case, I'd rather do the entire analysis in Stata/SPSS.
I think there are some more functions for working with weighted data. Some functions support it natively; for instance, lm() (see its help page) has a weights argument. This means you can do any linear regression, including simple correlations, with weights.
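A sketch with simulated data (the pw weight variable here is made up, not real sampling weights):

```r
set.seed(1)
d = data.frame(x = rnorm(100))
d$y = 2 * d$x + rnorm(100)
d$pw = runif(100, 0.5, 2)                # hypothetical sampling weights
fit = lm(y ~ x, data = d, weights = pw)  # weighted least squares
coef(fit)
# base R also has cov.wt() for a weighted covariance/correlation matrix:
cov.wt(d[, c("x", "y")], wt = d$pw, cor = TRUE)$cor
```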
There is another package called weights that has a few more generic functions, such as a weighted t-test.
-> "Read the help file for read.csv() if you need to know how to read semi-colon separated files."
Ok, but I asked the question because I wanted to work with AMOS. I think SEM software should provide graphical output of the results, which AMOS does but R does not (at least, not the lavaan package). Furthermore, since I think R is error prone, I prefer to replicate my analysis in another piece of software.
No graphical output as far as I am aware. Maybe there is a package for that. I haven't used SEM extensively, so I don't know.
Concerning the relevant question, the syntax does not display everything I wanted, notably the partial correlations, but I managed to get them using this:
partial.r(DF.C.PISA.IQ, c(1:18),19)
There are indeed numbers outside the normal [-1, +1] range, and even a correlation of -2.16. As you know, I hate R for a lot of things. For example, the syntax above partials out variable 19, which should be the 19th in your list. The name is not displayed, so it's not easy. What is variable 19, then? LV2012IQ?
Yes. Just use colnames(DF.C.PISA.IQ) to get the variable names.
By the way, why did you use
lm = lm(formula,DF.C.PISA.IQ.Z)
When I worked with regression before, I sometimes standardized my variables first, but I don't think it's a good idea, and anyway, regression output provides both unstandardized and standardized coefficients. So why use z-scores, then?
Joost de Winter asked me to do it that way.
lm() does not provide standardized coefficients by default (very silly). However, you can get them using lm.beta() from the QuantPsyc package. The standardized betas are identical to the unstandardized betas you get when you standardize the variables before running the regression. However, I have seen a case where this was not quite true (the results differed slightly).
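To illustrate with toy data (base R only; in this simple one-predictor case lm.beta() should agree):

```r
set.seed(1)
d = data.frame(x = rnorm(50))
d$y = 3 * d$x + rnorm(50)
raw = lm(y ~ x, data = d)                        # unstandardized slope
std = lm(y ~ x, data = as.data.frame(scale(d)))  # z-score everything first
coef(std)[["x"]]  # the standardized beta...
cor(d$x, d$y)     # ...equals the simple correlation when there is one predictor
```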