Imagine we had 5 NAEP Reading tests, and 2 of them showed no DIF and no measurement non-invariance. Imagine that the H/W difference on these 2 tests was 0.65 SD. This would provide evidence that there was a "true" population-level latent H/W ability difference of 0.65 SD, no? Now imagine we had 3 other NAEP tests for which DIF and measurement non-invariance were found, but adjusted scores were not presented. Imagine that these 3 tests also showed an average score difference of 0.65 SD. Knowing nothing else, we can infer that the psychometric bias on the latter 3 tests is not accounting for much of the average 0.65 SD H/W score difference, because the evidence shows that there is, in fact, a 0.65 SD latent ability difference.

I understand what you mean, and I think I said something like this earlier. My point is that your argument is correct only if the first 2 NAEP tests (invariant; d=0.65) and the last 3 NAEP tests (non-invariant; d=0.65) have the same or similar properties. (Note: if that happens, there is some indirect evidence that the bias is not cumulative. I'm talking about cumulative bias, since non-cumulative bias is generally irrelevant when it concerns IQ.) The reason I said earlier that most people (including practitioners) do not understand what measurement equivalence/invariance is has to do with test composition. When MI is violated, the group difference varies depending on the kind of subtests/items the test is composed of. This is probably why Wicherts said something like "scores are not comparable" when MI is not fulfilled. It's not 100% wrong, but it's highly misleading.

However, when you are talking about tests tapping into different cognitive dimensions (e.g., reading vs. math, achievement vs. IQ), the assumption that the tests have similar properties is very likely to be violated. In that case, generalizing from one set of tests to the other becomes impossible.
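To make the test-composition point concrete, here is a rough sketch in Python. It assumes a simple one-factor model with standardized items, a within-group latent variance of 1, uncorrelated residuals, and bias that takes the form of a uniform intercept shift on some items. The loadings (0.7), the bias size (0.2), and the function name `composite_d` are all made up for illustration; only the 0.65 latent gap comes from the example above.

```python
from math import sqrt

def composite_d(loadings, intercept_bias, latent_gap):
    """Standardized group gap on a sum score under a one-factor model.

    Assumptions (illustrative only): each item has within-group variance 1,
    so its residual variance is 1 - loading**2; residuals are uncorrelated;
    bias is a constant intercept shift per item.
    """
    mean_diff = latent_gap * sum(loadings) + sum(intercept_bias)
    within_var = sum(loadings) ** 2 + sum(1 - l ** 2 for l in loadings)
    return mean_diff / sqrt(within_var)

def subset(values, indices):
    """Pick the items that make up one hypothetical test."""
    return [values[i] for i in indices]

LATENT_GAP = 0.65                  # latent gap from the example above
loadings = [0.7, 0.7, 0.7, 0.7]    # hypothetical item loadings
no_bias = [0.0, 0.0, 0.0, 0.0]     # measurement invariance holds
some_bias = [0.0, 0.0, 0.2, 0.2]   # hypothetical bias on items 3 and 4

first_pair, last_pair = (0, 1), (2, 3)

# Invariant items: the gap is the same whichever pair makes up the test.
d_inv_first = composite_d(subset(loadings, first_pair), subset(no_bias, first_pair), LATENT_GAP)
d_inv_last = composite_d(subset(loadings, last_pair), subset(no_bias, last_pair), LATENT_GAP)

# Non-invariant items: the gap now depends on which pair is chosen.
d_bias_first = composite_d(subset(loadings, first_pair), subset(some_bias, first_pair), LATENT_GAP)
d_bias_last = composite_d(subset(loadings, last_pair), subset(some_bias, last_pair), LATENT_GAP)

print(round(d_inv_first, 3), round(d_inv_last, 3))
print(round(d_bias_first, 3), round(d_bias_last, 3))
```

With invariant items the two hypothetical tests give the same gap, while with biased items the gap shifts according to which items are included. That is all "test composition" means here; real data would of course need a proper MGCFA or DIF analysis rather than this toy calculation.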

-----

Oh, concerning the median/mean issue...

You can use the following spreadsheet formula:

=MEDIAN(F47:F64)

When you do this, the d gaps for blacks are 1.05, 0.85, and 1.00, whereas using means gives 0.98, 0.80, and 0.98.
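If you prefer to check this outside the spreadsheet, the same comparison can be sketched in Python. The d values below are placeholders I made up to stand in for one column such as F47:F64; they are not the actual figures, and `summarize` is just a helper name for this sketch.

```python
from statistics import mean, median

# Hypothetical d values standing in for one spreadsheet column
# (e.g. F47:F64); NOT the real data.
d_values = [0.70, 0.85, 0.90, 1.00, 1.05, 1.10, 1.60]

def summarize(ds):
    """Return the mean and median of a list of d values, rounded to 2 places."""
    return {"mean": round(mean(ds), 2), "median": round(median(ds), 2)}

summary = summarize(d_values)
print(summary)
```

The median is less sensitive to a single outlying study than the mean, which is why the two summaries can differ by a few hundredths of an SD.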