I finished redoing all the stuff that was rushed.
Now that we are switching to bins instead of SNPs, we need to figure out how to decide whether a bin is beneficial or detrimental. Should I pick the SNP with the highest R2 in each bin and set the bin to beneficial if that SNP's allele is beneficial, and to detrimental otherwise? I don't think this will work well. Imagine that you have two bins, one from 0-300K and one from 300K-600K. If there is a SNP with very high R2 just left of the 300K border, there is probably going to be another SNP with high R2 just right of the border, so both bins end up labeled by the same linked signal. This method is heavily influenced by linkage.
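To make the border problem concrete, here is a rough sketch of the top-R2-per-bin idea. The function name, the tuple layout and the SNP values are all made up for illustration; only the 300K bin width comes from the example above.

```python
BIN_SIZE = 300_000  # bin width from the example above


def label_bins_by_top_snp(snps, bin_size=BIN_SIZE):
    """Label each bin by its single highest-R2 SNP.

    snps: iterable of (position, r2, is_beneficial) tuples (hypothetical layout).
    Returns {bin_index: is_beneficial}.
    """
    best = {}  # bin index -> (r2, is_beneficial) of the best SNP seen so far
    for pos, r2, beneficial in snps:
        b = pos // bin_size
        if b not in best or r2 > best[b][0]:
            best[b] = (r2, beneficial)
    return {b: beneficial for b, (_, beneficial) in best.items()}


# Hypothetical SNPs: two high-R2 SNPs sit on either side of the 300K border.
snps = [
    (120_000, 0.20, False),
    (299_000, 0.90, True),   # just left of the border
    (301_000, 0.88, True),   # just right of the border, likely linked
    (450_000, 0.30, False),
]

labels = label_bins_by_top_snp(snps)
print(labels)
```

With this made-up data both bins come out beneficial, even though the two deciding SNPs are only 2 kb apart and probably reflect a single signal.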
Another method would be to look at all SNPs over a certain R2 level within each bin and then set the bin to beneficial if most of the SNPs within it are beneficial, and vice versa. I think this is what you suggested here:
...[D]ivide the genomes in chunks of 500kb and treat each one as a single SNP to control for linkage. This will greatly reduce the significance of the good alleles but will also make the bad alleles (p<0.005) non significant so it's gonna weaken and strengthen our results and in the end it shouldn't make them weaker.
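This might not work too well either, since you would expect there to be mostly beneficial SNPs within each bin in the reference population too, just fewer than for the brainiacs. A simple majority vote could then label the same bins beneficial in both groups and hide the difference between them.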
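For reference, this is roughly how I picture the majority-vote version. It is only a sketch: the 0.5 R2 cutoff, the tie handling (a tied bin is left unlabeled) and the SNP values are my assumptions, and the 500kb bin size is taken from the quote above.

```python
from collections import defaultdict


def label_bins_by_majority(snps, bin_size=500_000, min_r2=0.5):
    """Label each bin by a majority vote of its SNPs above an R2 cutoff.

    snps: iterable of (position, r2, is_beneficial) tuples (hypothetical layout).
    Returns {bin_index: True | False | None}, where None marks a tie (assumed rule).
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [n_beneficial, n_detrimental]
    for pos, r2, beneficial in snps:
        if r2 <= min_r2:
            continue  # only SNPs over the R2 level get a vote
        counts[pos // bin_size][0 if beneficial else 1] += 1

    labels = {}
    for b, (n_ben, n_det) in counts.items():
        labels[b] = None if n_ben == n_det else n_ben > n_det
    return labels


# Hypothetical SNPs spread over two 500kb bins.
snps = [
    (100_000, 0.8, True),
    (200_000, 0.7, True),
    (300_000, 0.6, False),
    (400_000, 0.1, False),  # below the cutoff, does not vote
    (600_000, 0.9, False),
    (700_000, 0.9, True),
]

labels = label_bins_by_majority(snps)
print(labels)
```

Running the same function on the reference population and on the brainiacs would show whether the labels actually differ between the two groups, which is the concern here.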
How should I solve this?