Disrupting with data

Aug 12 2010 Published under Tactics

Fellow Scientopian DrugMonkey exults at the downfall of "supplemental materials" in a favorite journal. Also-fellow-Scientopian Christina wonders if that's an outlier or a trend harbinger.

Let's sit back and think about this. (What, you thought I was going to panic? Or go all spittle-flecked Data Are The Future How Dare They rabid? Nah, I'm good.)

One conclusion that jumps out at me is that the journal in question discovered that its processes didn't handle data well. This doesn't surprise me; handling data well is rather difficult, especially in these wild-west standardsless days. Rather than learn how, the journal bailed on it altogether. As a librarian with a strong interest in data management, that cheers me up remarkably, and I dearly hope more journals follow suit. As a librarian with a strong interest in Clayton-Christensen-disrupting the current journal universe, it cheers me even more.

See, one of the lesser-known bits of Christensen's market-disruption pattern is that the disrupting force needs to start out by "competing against nonconsumption." You can't take on the incumbent on its own turf; the incumbent will eat your lunch and you for dessert. (What's the lesson for institutional repositories here? Starting with peer-reviewed journal articles was a doomed strategy, that's what. Those are the crown jewels. The incumbents own those.) You have to find something else to work with, something unused or underserved that the incumbents turn up their august noses at—a low-end market, a different raw material—establish a market beachhead there, and expand your beachhead over time.

Well, isn't it interesting that a journal just turned up its nose at research data. Why, yes. Yes, it is. And now perhaps you see why I think that was a strategic mistake by the journal. Short-term, sure, it'll make their lives easier. Long-term, it gives us disruptive librarians an in, if we've got the will to take it.

Another interesting tidbit is how researchers were using supplemental materials to bolster their arguments. They weren't reluctantly turning this stuff over to the journal because the journal or their funder insisted. They thought their data helped their case—even, as DrugMonkey darkly hints, if only by snowing reviewers under with hard-to-interpret evidence. Are they going to stop thinking this suddenly? I wonder.

The paradigm case for data citation standards is giving credit to third parties for data they produced that you used. I wonder if that's the wrong case, large though it looms in the minds of nervous researchers who don't want to be scooped. Surely what's more likely at first is researchers wanting to cite their own data in a publication, wherever those data happen to be housed. That's how one gets around the stunt this journal just pulled, if one truly believes in one's data.

And this, Peter Murray-Rust, is partly why I believe institutions are not out of the data picture yet. The quickest, lowest-friction data-management service may well reside at one's institution. It's not to be found at a picky, red-tape-encumbered, heavily quality-controlled disciplinary data service on the ICPSR model, which is the model most disciplinary services I know of use. It's certainly possible, even likely, that data will migrate through institutions to disciplinary services over time, and I have no problem with that whatever—but when the pressure is on to publish, I suspect researchers will come to a local laissez-faire service before they'll put in the work to burnish up what they've got for the big dogs. (Can institutional data services disrupt the big-dog disciplinary data services? Interesting question. I don't know. I think a lot depends on how loosely-coupled datasets can be. Loose coupling works better for some than others.)

Finally, of course, we have further indication that the peer-review system is breaking down under load. (Professor In Training has an amusing growl that is yet further indication.) I haven't anything to say about it that Christina isn't better placed to say; I'm just pointing to yet more evidence.


6 responses so far

  • Christina Pikas says:

    We can only hope that authors put their data in local repositories. One thing the optics and astro journals promise is to preserve these supplemental materials in the same way they preserve the articles. That's one reason the formats have been such a critical discussion. The editorial in J Neuroscience mentions that authors may choose to link to data on their own site and if these links break, so what, it's not necessary for the understanding of the paper. This worries me. I hope local repositories will take this data, but I'm afraid they won't or they won't let the scientists know that they will.

    • Dorothea Salo says:

      Quite so. Yesterday on FriendFeed I backed up a librarian whose director is starting an IR and (predictably) wanted to make a policy that only peer-reviewed articles would be allowed.

      "Tell your director not to be a freakin' idiot," I said, and I stand by that, rudeness aside. (Do these director-type people not read? I can understand not reading me on the subject, but I'm hardly the only one who's pointed out dismal self-archiving rates.)

      I still have an article in the back of my head about repos, disruption, and service models. My frustration-with-teh-st00pid level is almost into the red zone that will make me actually write it.

  • Patricia Hswe says:

    As someone starting out in the digital curation profession, I regularly learn a lot from your posts, Dorothea, and this one is no exception. I'm curious, then, to see what you think of initiatives like Dryad (http://datadryad.org/), based at UNC-Chapel Hill but partnering with a variety of institutions, journals, and professional societies. It seems to be a step in the right direction. My question, in the end, is typically: how useful will this prove to be to researchers, and how do we measure that?

    • Dorothea Salo says:

      I'm quite enthusiastic about Dryad, not least because its staff includes fantastic librarians of my acquaintance. (I still wonder why they chose DSpace to work from, but they seem to be beating it into submission.)

      From a big-picture perspective, the journals that work with Dryad are taking a different road from DrugMonkey's Journal of Neuroscience. They are acknowledging the importance of data, but rather than pick up the necessary skills themselves, they're outsourcing the problem to Dryad. The collaboration is likely to be productive in the short and medium term for journals, researchers, and librarians.

      Longer-term, who knows? It'll be fun to watch. Will data disrupt publication? Will data disrupt publication when the data are open but the publication isn't? (The day that a dataset gets more citations than the paper it's supporting will be a red-letter day. RED-LETTER. I am serenely confident it will happen, and I strongly believe the shortchanged paper will be toll-access.) Will open data exert gravitational pull toward open-access publishing? Or will data take on sufficient importance that Dryad's publishing partners slowly wither away?

      I don't know. Interesting times.

  • Jill O'Neill says:

    Just something to keep on your radar: NFAIS and NISO are working together on an initiative (Best Practices kind of thing) pertaining to supplemental materials.