NIGMS blogger (oh, and yeah, the Director) Jeremy Berg has posted a very interesting set of data on the review of grants.
Director Berg examined the scoring of the 360 R01 applications assigned to his Institute for the October 2009 Council round. This, you will recall, was the first round to use the current scoring scheme, so in some sense it should be regarded as the baseline.
The analysis Director Berg shows in the graph is the correlation between the "Significance" score and the Overall Impact Score. If you will recall, there has been a bit of grumbling on the part of reviewers and applicants alike about the weird disconnect of the new system.
Each of the five criteria (Significance, Investigator(s), Innovation, Approach and Environment) is to receive an individual score from each of the ~three assigned reviewers. The assigned reviewers are also to give a preliminary Overall Impact Score prior to the study section meeting. At the meeting this becomes the post-discussion Overall Impact Score, voted by the entire panel, generally within the range defined by the assigned reviewers' scores. The weird part I alluded to is that there is not supposed to be any explicit numerical connection between the criterion scores and the Overall Impact Score. Maddening.
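For the code-minded, here is a toy sketch of what one reviewer's scoresheet amounts to. The class and field names are my own invention, not NIH's; the point is simply that Overall Impact is recorded as its own judgment, not computed from the five criterion scores.

```python
# Hypothetical representation of one assigned reviewer's scoresheet.
# Names are illustrative, not NIH's. NIH scores run 1 (best) to 9 (worst).
from dataclasses import dataclass

@dataclass
class ReviewerScores:
    significance: int
    investigator: int
    innovation: int
    approach: int
    environment: int
    overall_impact: int  # an independent judgment -- no formula ties it
                         # to the five criterion scores above
```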
Plot of Significance and Overall Impact scores in a sample of 360 NIGMS R01 applications reviewed during the October 2009 Council round. [source]
What Director Berg has done is ask how these component scores related to the Overall Impact in the actual scoring. Now, one thing that was unclear to me is whether he plotted the eventual voted Overall Impact score or the average of the individual reviewers' pre-discussion scores. I thought the latter [update: I was wrong; Director Berg confirmed via email that it was the eventual voted score] because the graph does not show the clustering around integer scores that was predictable, and that indeed turned up, e.g., in this NIAID graph from a single study section.
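To illustrate why the clustering argument seemed plausible, here is a toy simulation, entirely my own construction with made-up vote distributions: if a panel of ~20 voters converges near a consensus integer, the voted means pile up close to integers, whereas means of three independent preliminary scores can only land on thirds and look smoother.

```python
# Toy simulation of integer clustering; distributions are invented, not NIH data.
import numpy as np

rng = np.random.default_rng(0)
n_apps = 360  # matching the size of the NIGMS sample, for flavor

# Three assigned reviewers scoring independently on the 1-9 scale:
prelim = rng.integers(1, 10, size=(n_apps, 3))
prelim_means = prelim.mean(axis=1)  # can only land on multiples of 1/3

# ~20 panel members voting tightly around a consensus integer:
consensus = rng.integers(1, 10, size=n_apps)
jitter = rng.choice([-1, 0, 0, 0, 1], size=(n_apps, 20))
voted_means = (consensus[:, None] + jitter).clip(1, 9).mean(axis=1)

for label, x in [("preliminary means", prelim_means), ("voted means", voted_means)]:
    frac = np.mean(np.abs(x - np.round(x)) < 0.15)
    print(f"{label}: {frac:.0%} within 0.15 of an integer")
```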
As anticipated, the Significance and Overall Impact scores are reasonably strongly correlated, with a Pearson correlation coefficient of 0.63. Similar comparisons with the other peer review criteria revealed correlation coefficients of 0.74 for Approach, 0.54 for Innovation, 0.49 for Investigator and 0.37 for Environment.
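Replicating that arithmetic is straightforward. A minimal sketch, assuming a hypothetical scores.csv (not NIGMS's actual data format) with one row per application and columns for each criterion plus the voted overall impact:

```python
import pandas as pd

df = pd.read_csv("scores.csv")  # hypothetical: one row per application

criteria = ["significance", "approach", "innovation", "investigator", "environment"]
for criterion in criteria:
    r = df[criterion].corr(df["overall_impact"])  # Series.corr defaults to Pearson
    print(f"{criterion}: r = {r:.2f}")
```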
Makes you salivate for similar analyses from all the other ICs, doesn't it? And for individual study sections, perhaps? It would be fascinating to see whether behavior was more or less consistent across the CSR review sections: whether some prioritized Innovation, or whether others were GoodOldBoyGirl sections that prioritized the Investigator. I can dream, can't I?
Anyway, the take-home from this particular data set seems to be that Approach and Significance are still the most consistent predictors of the Overall Impact Score.