This is the last of my practice essays before I take the real comps test at the end of July. I need to correct the record, though. Apparently, although all of these questions came from my advisor, he didn't write them all. These were ones proposed by committee members and rejected for inclusion in the exam. (The gaps in numbering you see are two essays that didn't go well.) This particular question might be by my advisor, with an OK from the two STS committee members. I didn't have any STS questions to practice with, so he came up with this one, which I think is an excellent question.
question:
Discuss the forces that move scientists towards open sharing of information and the countervailing forces that prevent scientists from sharing information or encourage them to actively guard information. You may want to distinguish between information on research problems and hypotheses, raw data / data sets, information on methods and apparatus, and information on results. Consider the role of technology in your answer.
I know, right?
My answer:
0. Introduction
In the past two decades, much controversy and discussion have centered on public access to scientific information, the cost of scholarly journals, and information sharing within science. There are many strong forces that encourage scientists to share and equally many countervailing forces that discourage scientists from sharing. This essay describes these forces and the role of technology. The essay ends by considering the role of various mandates in supporting information and data sharing in science.
1. Forces That Encourage Scientists to Share
There are many forces acting on scientists to encourage them to share information and data. These include:
- wider recognition
- finding collaborators and making information available within collaborations
- making scientific information available to the public, scientists not in research institutions, and for data mining or serendipitous location
- for generalized reciprocity, in order to get data
- to increase the speed of science or creatively solve problems
1.1 Recognition
Science runs on reputation and recognition; that is, promotion, tenure, winning grants, and attracting graduate students all depend on successful publication of research results in prestigious journals and the citedness of those journal publications. Research has shown that there are many correlates to higher citation rates outside of the quality of the document. These include:
- article is on the cover
- article is discussed in the media
- article is a review article
- article is longer
- article is in a more prestigious publication
- article is open access.
This final correlate is somewhat disputed: some studies show that open access favorably affects citedness, immediacy, and usage, while others show no statistically significant correlation between open access and citedness over the long term. Even if open access is not significant, we can see that being on the cover and being discussed in the media are both ways that the research is brought to the attention of other scholars. The point is that information sharing with the media increases article citation and recognition of the scientists.
Likewise the sharing of data, workflows, and algorithms in disciplinary repositories can lead to greater recognition of the scientist and his or her lab. Deposits to disciplinary repositories are signed, so high quality results are attributable to their source. The technology of the repository and standards for information structures within repositories make the shared information findable and useful.
1.2 Collaboration
In addition to recognition for promotion, tenure, and grants, recognition can also help in finding new collaborators and in sharing information within collaborations. By seeing the contributions of a person to a data, workflow, or e-print repository, a scientist looking for collaborators can judge the relevance of that person's experience and can also assess his or her expertise in an area.
Once scientists are in a collaboration, open and free information sharing is necessary for trust and to make the project work. This seems obvious but it must still be stated as the lack of information sharing within collaborations is frequently listed as a reason collaborations fail.
1.3 Making information available outside of the invisible college
Despite the frequent mentions in the literature that scientists do not want to consider the societal impact of their research (Polanyi, Merton) and do not want to communicate with the public (Weigold), recent surveys indicate that 75% of scientists do communicate with the media about their research and most scientists want their research to be useful and used. Forces moving scientists toward open information sharing include making information available and useful:
- to scientists who cannot afford toll access to the literature
- to scientists outside of the particular research area
- for data mining
- to the public.
1.3.1 Scientists without toll access
Scientists who are not in large research institutions do not have the same access to the literature because many of the abstracting and indexing services and journals are extremely expensive. Scientists publish in open access journals, post e-prints to their web pages or repositories, or respond favorably to reprint requests to make their research available to these scientists. There is some altruism involved, but the point of publication is to make the results of research available, so sharing of publications does this.
1.3.2 Serendipitous Finds
Scientists who share information in places indexed by major search engines enable serendipitous discovery by researchers outside of the invisible college. Scientists within the research area likely know what labs are doing which work and have access to new research results. Scientists outside of the research area might happen upon this work when looking for something else.
1.3.3 Data mining
In many if not most or all areas of science, computational methods that leverage large collections of data are being used to make new knowledge. Scientists are encouraged to share information and data without restrictive licenses to enable these new uses.
1.3.4 The Public
Open sharing of information with the public can have a positive impact on the government funding of research as well as showing return on past investments in science. Besides getting government funding, scientists can be altruistic, too. While rare, there are often-touted examples of parents researching the biomedical literature to assist in the diagnosis and treatment of their sick children.
1.4 Reciprocity
Scientists might share data for specific or generalized reciprocity. In other words, scientists might share data in order to get data from another scientist in particular or in hopes of getting data in the future from some other scientist.
1.5 Speeding Up Science
Scientists might want to share information openly to get feedback, to solve problems, and to speed up the cycle of science. Posting data or publications on a web page prior to official publication makes that information usable sooner and by a larger group. Many scientific instruments output electronic information. This information can be shared in real time via the web to allow multiple simultaneous diverse uses.
2. Forces That Prevent or Discourage Scientists from Sharing
There are many forces acting on scientists to discourage them from sharing information and data. These include:
- fear of being scooped or ideas being stolen
- Ingelfinger-type rules preventing information sharing prior to publication
- intellectual property concerns of the organization
- sensitivities of information regarding human subjects or national security
- concern over misuse of information by anti-science groups
- effort required to describe or format information for deposit or reuse
2.1 Being Scooped
Scientists sometimes do not want to share data until they have "wrung" all of the possible publishable science out of it. The concerns are that another scientist will publish the same information more quickly without the expense of gathering the data or that another scientist will find different information in the data that the original scientist missed (Birnholtz).
Indeed, a cited form of misbehavior in peer review is that the reviewer who is a competitor might use information in the submitted article or might hold up publication of an article until his or her own article is published first.
Some conferences and small workshops do not consider information shared to be "published" and there are guidelines on how this information can be used. Nevertheless, attendees might act on the information and might publish first.
2.2 Ingelfinger Rules
The Ingelfinger rule from the New England Journal of Medicine states that the journal will not publish any information previously presented in any venue or discussed with the media. Similarly, many journals have an embargo on discussing findings with the media until the date of publication of the journal or the posting of the article on the journal's web page in "early view." Scientists might not share information if they fear that by sharing they will not be able to publish in a prestigious journal. Some of these rules were strengthened after the cold fusion episode in which the scientists held press conferences before peer review of their work. Subsequent peer review and evaluation by other scientists found that their results were not reproducible.
2.3 Intellectual Property Concerns
Scientists might be prevented from sharing data or publishing if their organization intends to patent their discovery. Discussing a discovery or publishing the results starts a clock for patent application or can prevent a patent from being filed.
2.4 Sensitivities
Scientists who work with human subjects or with national security information might be discouraged from sharing due to sensitivities about protecting the privacy of the subjects or concerns over export control or classified information. There are ways to anonymize human subjects data but this still presents a barrier. Likewise, scientific facts should not be classified, but the sensitivities of the research funder trump the forces encouraging the scientists to share information.
2.5 Concern Over Misuse
Open sharing of research using animal experimentation or stem cell research has endangered the physical security of the researchers. By publishing in obscure disciplinary journals, the information is available to other scientists but less likely to attract attention from anti-science groups who have reacted violently in the past. Short of these violent reactions, scientists might be concerned that their research will not be understood.
2.6 Effort Required
Finally, a force acting against the sharing of data is the effort required to describe and make data accessible for wider use. In some fields it is quite easy and straightforward to share data in pre-existing, established, and well-supported repositories. In other fields there might not be any repositories, or the repositories that exist might be fragmented, with uneven funding and support (Borgman). It is often easier to save the data on a CD-ROM in a box under your desk than to properly document it and find a place to store it online.
3. Mandates
Despite the competing forces that encourage scientists to share information and discourage them from doing so, there are mandates to share information coming from several sources. First, funding bodies may require submission of a journal publication to an open repository as a condition of accepting the grant, and funders of big science projects require them to make resulting data freely available online. Second, research institutions may mandate submission of publications to the institutional repository for all of their scholars. Third, and most successfully so far, groups of journals in a research area might require that the supporting data be submitted to a repository at the time of publication.
4. Conclusion
There are many competing forces moving scientists to share and not to share data, workflows, and publications. The salience of each of these depends on a number of factors not discussed explicitly above but including:
- the norms and the culture of the research area (sub-discipline)
- the existence of standards and established infrastructure to support sharing
- the funding source for the research
- the scientist's employer
- the scientist's place in his or her field (in other words, an established scientist might be less concerned with being scooped)
Information scientists can support information sharing by removing barriers related to finding a repository and making the deposit of data or publications. Likewise, we can address ways to secure information such that only those who should have access do. We can also help scientists discuss information sharing with publishers and other scientists to make these concerns explicit and to remove unnecessary barriers.
Technology has facilitated information sharing and discovery, but it alone does not address the cultural and social barriers to information sharing. Ultimately, understanding the social aspects of science along with the technological requirements for information sharing is needed to encourage scientists to share.