Archive for the 'Information Science' category

Instrument bibliographies, data citation, searching for data

Jan 11 2013 Published by under bibliometrics, Information Science

My place of work builds spacecraft and instruments that fly on other folks' spacecraft. So one of the things that we need to do is to come up with a list of publications that use our data. It's the same thing with telescopes, and it ends up being a lot more difficult than you might expect. There are a few articles on how to do it from ADS staff and from librarians at ESO (Uta Grothkopf and co-authors), STScI, and other telescopes. It turns out that you really have to do a full-text search to get close to comprehensive. ADS has a full-text search covering some things, but I asked the experts in the Physics-Math-Astro Division of SLA and many of them also use the full-text searches on the journal publisher pages (which are of varying quality). I found that Google Scholar was the only thing that got book chapters. This is all pretty complicated if your instrument has a pretty common name or a name that is a very common word.

Other suggestions were to use funding data from Web of Science or elsewhere (soon to be part of CrossRef data), but that really only gets the science team for the instrument. Our main point is to find out who is downloading the data from the NASA site (or elsewhere?) and doing good science with it.

Heather Piwowar has done a lot of research on data citation (is it done, how do you find it), but I believe mostly with life sciences data. Joe Hourclé has also presented several times on data citation and there is the DataCite organization to work on this issue. But this is all future stuff. Right now it's the wild west.


No responses yet

ASIST2012: Other random sessions

Oct 31 2012 Published by under Information Science

These are random notes from the sessions I attended Sunday. I need a new laptop so I didn't bring my tired old one to live blog - these are from my scribbled notes on paper.

How much change do you get from 40$ - Erik Choi - this was a typology of failures in social q&a. The system offers some suggestions for how to do better questions but I think their intention was to use this research to help people ask better questions. As Joe Hourclé pointed out in questions - Stack Exchange supports query negotiation/refinement but they're looking at what to do with Yahoo, which is the most popular and has a lot of failed questions. Their big categories were: unclear, complex, inappropriate (prank, awkward...), multiquestion.

Dynamic query suggestions - dynamic search results - Chirag Shah. This was looking at Google's way of showing you results as you type and also offering search completions as you type. Google says it saves 2-5s per search, but they wanted to test it. They did it in a laboratory setting with 3 conditions - neither, only autocompletion, all. They gave a task asking users to search for information on the Velvet Revolution and other revolutions and they looked at the number of pages viewed, concepts (noun phrases?) used, and eye tracking. The dynamic stuff didn't change the number of concepts in a query; the queries were shorter but not necessarily better.

How do libraries use social networking software to communicate to users - they looked at big libraries in English-speaking countries and "greater China" (Taiwan + Hong Kong + PRC). They looked at the posts and interviewed a librarian from each. Some discussion afterward how Weibo is better at supporting conversations than Twitter - it would almost have to be :)

Barriers to collaborative information seeking in organizations - I'll have to read this paper... he spent too much time on methods and really cut his results section short.


No responses yet

Clustering articles using Carrot2

Sep 11 2012 Published by under bibliometrics, Information Science

I did a very basic intro to using some social network analysis tools for bibliometrics here. This post will also be a brief how-I-did-something for people with similar skills to mine. In other words, if you're a computer programmer or the like, this will be too basic.

I've been getting into information analysis more and more at work. I've tried to learn this stuff and keep up as time goes on because I know that it's a very useful tool but it's taken this long to get any real uptake at my place of work. Typically I'll use metadata from either WoS or Scopus or from a collection of specific research databases filtered through RefWorks. Once I have the metadata from the research database I'll often run it through Vantage Point for some cleaning and to make matrices (co-authorship, author x keyword, etc). More recently, I've been using Sci2 for some of this.
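For readers who do program a little, here is a minimal sketch of what a co-authorship matrix boils down to: counting how often each pair of authors shares a record. This is only an illustration, not how Vantage Point or Sci2 do it; the function name and the sample records are made up, and it assumes author names have already been cleaned so that one person has one spelling.

```python
from collections import Counter
from itertools import combinations

def coauthorship_counts(records):
    """Count how often each pair of authors appears on the same record.

    `records` is a list of author-name lists, one list per article.
    """
    pairs = Counter()
    for authors in records:
        # Deduplicate and sort so (A, B) and (B, A) count as the same pair.
        unique = sorted(set(authors))
        for a, b in combinations(unique, 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical sample data, only for illustration.
records = [
    ["Smith, J.", "Lee, K."],
    ["Lee, K.", "Smith, J.", "Garcia, M."],
    ["Garcia, M."],
]
counts = coauthorship_counts(records)
for (a, b), n in sorted(counts.items()):
    print(f"{a} + {b}: {n}")
```

The same pair-counting idea extends to an author x keyword matrix by pairing each author with each keyword on a record instead of with other authors.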

All of the tools I use so far work with metadata but I get a lot of calls for doing mining with the text. I do know of tons of tools to do this but I think they all take a little more commitment than I'm willing to give right now (learning to program, for example). Some things can be done in R, but I really haven't tried that either as there is still a steep learning curve.

Anyway, a software developer (well, he's really a lot more than that - he does rapid prototyping for human language technologies) buddy of mine from work has recommended Carrot2 a bunch of times. I now have a project that gives me an excuse to give it a whirl. We're mapping an area of research to pick out the key institutions, key authors, key venues... but also looking at the major areas of research. This could be done with author co-citation analysis (ACA) or bibliographic coupling, but another way is to cluster the articles based on their abstracts - I used Carrot2 for this. A reason not to use Sci2 with WoS data to do ACA or bib coupling is that for this particular research area I was having a very hard time getting a nice clean tight search in WoS, whereas some social sciences databases were yielding great results. As I was just telling a group at work, a lot depends on your starting set - if you do a crap search with lots of noise, then your bibliometrics aren't reliable and can be embarrassing.

Carrot2 out of the box will cluster search engine results from Bing, Google, and PubMed. If you download it, you can incorporate it into various programming thingies, and you can also use the document clustering workbench on any xml file or feed (like rss). They have a very simple xml input format and you use an xslt to get your base file or feed to look like that. I exported my records from RefWorks in their xml and I started reading up on XSLT...after some playing around I had an epiphany - I could just make a custom export format to get the input format directly from RefWorks!

I started from the RW xml but could have gone from scratch. In the output style editor, bibliography settings:

reference list title: <?xml version="1.0"  ?>\n<searchresult>\n<query></query>\n

text after bibliography: </searchresult>

Then I only defined generic and all the ref types use that:

refid precede with <document id="

follow with ">

Basically do the same for title, primary; abstract (call it snippet); and url

Then add text to output: </document>

You end up with

<?xml version="1.0" ?>
<searchresult>
<query>paste your subject here, it's supposed to help</query>
<document id="ID">
<title>article title</title>
<url>url</url>
<snippet>abstract</snippet>
</document>
<document id="ID">
<title>article title</title>
<url>url</url>
<snippet>abstract</snippet>
</document>
</searchresult>

More or less, that is. I had some spaces I needed to remove. There was also one weird character that caused an error.
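If you would rather script it than build a custom RefWorks output style, the same file can be put together in a few lines of Python. This is a rough sketch under some assumptions: the records are already in hand as dictionaries with id, title, url, and abstract keys (those key names and the function names are mine, not RefWorks' or Carrot2's), and the element names simply follow the layout shown above. Letting an XML library write the file takes care of escaping things like & and <, and the clean() helper trims stray spaces and drops control characters that XML 1.0 does not allow, which is one likely cause of the kind of error I hit (I did not track down which character it actually was).

```python
import re
import xml.etree.ElementTree as ET

# Control characters that are not legal in XML 1.0 documents.
_INVALID_XML = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

def clean(text):
    """Drop characters that are illegal in XML 1.0 and trim surrounding spaces."""
    return _INVALID_XML.sub("", text or "").strip()

def build_carrot2_xml(records, query=""):
    """Return a string in the searchresult/document layout shown above.

    `records` is a list of dicts with hypothetical keys:
    "id", "title", "url", "abstract".
    """
    root = ET.Element("searchresult")
    ET.SubElement(root, "query").text = clean(query)
    for rec in records:
        doc = ET.SubElement(root, "document", id=str(rec["id"]))
        ET.SubElement(doc, "title").text = clean(rec.get("title"))
        ET.SubElement(doc, "url").text = clean(rec.get("url"))
        ET.SubElement(doc, "snippet").text = clean(rec.get("abstract"))
    # encoding="unicode" returns a str; special characters are escaped for us.
    return ET.tostring(root, encoding="unicode")

# Made-up record, only to show the call.
sample = [{"id": 1, "title": "  Cats & dogs ", "url": "http://example.org/1",
           "abstract": "Why A < B matters."}]
print(build_carrot2_xml(sample, query="pets"))
```

Write the returned string to a file (adding the <?xml version="1.0" ?> line at the top if you want it) and point the workbench at that file as described below.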

Then in Carrot2 workbench you select XML and then identify the location of the file and shazaam! You get 3 different visualizations and you can export the clusters. One of my biggest was copyright Sage but it can be tuned and you can add to the stopword list if you want. I still want to play with the tuning and the different clustering methods.

 


No responses yet

Re-post: Commentary on: The persistence of behavior and form in the organization of personal information

Feb 10 2012 Published by under Information Science

This was originally posted on my blog on November 17, 2007. Deborah Barreau passed away this morning from cancer. There's a lovely message from Gary Marchionini on asis-l.

---

This post is a review and commentary on: Barreau, D. (2008). The persistence of behavior and form in the organization of personal information. Journal of the American Society for Information Science and Technology 59, 307-317. DOI: 10.1002/asi.20752

Goal: Barreau re-visits her 1993 study (published in 1995) in which she interviewed seven managers to determine how they manage electronic documents. In particular, in her 1995 study, her goal was to examine how Kwasnik's (1991) dimensions of organization of print materials translated into the electronic domain. In this study, her goal is to learn what has changed in the more than ten years and what impact new technologies have had.

Methods: Her sample consists of 4 of the 7 managers interviewed in her earlier study. She asked the participants broad questions on what personal information they have in their office, how they got it, how they organize it, and how they find things in it. She also asked what changes they would like to see in the technology.

The responses were coded using Kwasnik's dimensions. No information is provided on how the interviews were conducted and how the coding was actually performed. There are mentions of transcripts and notes, however. A sample of the statements were "double-coded" and an intercoder reliability check was done. (I almost missed this bit because the html is a bit goofy to read)

Results: I will just pull out a few interesting points here.

  • participants saw their intranet as an extension of personal space when they had bookmarked or used send to desktop as a link to keep some information.
  • they bookmark stuff and then never use it
  • participants were split between keeping a clean e-mail inbox by acting on or deleting things immediately and reporting that their e-mail was out of control
  • retrieval is through browsing an ordered list

Changes they would like to see: synchronized single sign on

Conclusions: Many things remained the same. The way the managers name files, and use catch-all directories were two things in particular. Some things that have changed include the extension of the personal space to include bookmarked things from the web and the sheer number of different systems required to do the job. New dimensions are suggested to update Kwasnik's listing.

Commentary: My immediate reaction to this article was very positive -- mostly perhaps because it resonates with my own findings (Pikas, 2007). More information on methods is required to adequately judge the validity and transferability of this work.

She makes the point that corporations need to do better to back up users' work. This is something that also came out in my study. It could be that the corporations *are* doing a good job of backing information up but are not *communicating* well enough so that users trust the backups.

She also makes the point that organizations need to do better with e-mail. First, for records management purposes, they should discourage the retention of older e-mails. I strongly, strongly disagree with this. Much valuable information is included in e-mails - only in e-mails - and there should not be an arbitrary retention policy requiring their deletion if the user finds them useful (yes, I do know about e-discovery, but if you're not doing anything wrong - I guess I'm naive). Second, she states that organizations should do something about advertising e-mails received (ok, this is fine), about broader distribution lists than are required for the job (ok, I was getting e-mails in Maryland once for things lost and found in the Philadelphia office - so this is clearly a management issue), and about too many interruptions. I disagree that the interruptions are truly something that the organization as a whole can/should fix through rule making. This article says to me that more training is required on the effective use of e-mail and IM. Perhaps the users should employ a do not disturb message on IM and log out of e-mail if they are working on an intensive task.

This is my first use of the BPR3 logo so I would be happy to take comments on that (or complaints if I'm not doing it right!)

---
Barreau, D. K. (1995). Context as a factor in personal information management systems. Journal of the American Society for Information Science and Technology, 46(5), 327-339. DOI: 10.1002/(SICI)1097-4571(199506)46:5<327::aid-asi4>3.0.CO;2-C

Kwasnik, B. H. (1991). The importance of factors that are not document attributes in the organization of personal documents. Journal of Documentation, 47(4), 389-398.

Pikas, C. K. (2007). Personal information management strategies and tactics used by senior engineers. Proceedings of the Annual Meeting of the American Society for Information Science and Technology, Milwaukee, WI, 44, paper 14. (This will be made available open access 90 days after the conference)



No responses yet

Research Database Vendors Should Know

Research database vendors - the ones who design the interfaces that the end users use - should know that data export is not a trivial addition. Rather it is an essential part of their product.

Over and over and over again, librarians complain about one interface that works one day and doesn't work the next. The one that doesn't output the DOI unless you select complete format. The one that all of a sudden stopped exporting the journal name. The interfaces that don't work with any known citation manager. The ones that download a text file with 3 random fields instead of directly exporting the full citation and abstract.

But you blow us off and you act like it's not important.

Well. I was just talking to a faculty member at another institution - even though a particular database is most appropriate for her research area and she finds interesting papers there, she now refuses to use it because it doesn't export to EndNote right. She's tired of the frustration and she is tired of finding that she has to edit everything she's imported so she's just given it up.

Once again librarians are telling you something and you need to listen. Researchers and faculty are super busy. They will not keep using your product if it makes their life harder. If they don't use your product then we'll stop subscribing. That's all there is to it.


2 responses so far

It’s not programming, it’s not in my job description, but it took a lot of time these past 2 weeks

Aug 07 2011 Published by under Information Science

I didn’t participate in the library day exercise this time – that’s when you document a day in the life in your job as a librarian so that others can see what it’s like to be a librarian. I have a somewhat non-traditional job so my normal day in the life isn’t like the normal day in the life for most other librarians. It struck me earlier this week – as I got frustrated with things going wrong on 3 fronts at once – that I’m not even sure how to describe the work I’m doing.

Project one: we have an intranet search service that goes way beyond the standard appliance you might purchase. In addition to connecting to our document repository, sucking in the index from our web crawler, and indexing our SharePoint installation, it has an “expertise” tab. This is an index of custom compiled profiles for our employees with information pulled from their resumes, their MySites (a part of SharePoint where you can have a profile), the corporate directories, and, more recently, internal grant submissions and social network participation. One of the obvious things that’s missing is a listing of the external research articles the employees have written. For various complicated reasons, another librarian and I maintain the most complete listing of these documents.  We take alerts from all of our various databases, import the records into RefWorks, and then export them in a custom built Movable Type format which we then import into our listing which is on a MT blog. The tool we’re using can’t just index the web listing, as the author names are not pulled out and we need to link the articles to the directory IDs so that they turn up in the right profiles. The obvious thing was to just export the RefWorks records in some tagged or XML format. The only thing is that this would show the citation, but not provide any help on getting to the full text (we have a very hard time marketing our services). Great, so you can’t update an export format but you can make a custom bibliography with an open URL link. Anyway, to end this long description, I had to completely make an export format from scratch so that another librarian could import it in to Excel, run a script to parse out authors and match them to the directory ID (most of the time), and then upload them to a SharePoint list (ew.)

Project two: my larger institution has been working very hard on migrating to a new interface to our catalog. This runs on Blacklight, an open source effort led by UVa. It’s fabulous and we’re all very excited about it. Unfortunately, this means that other tools that made calls to the old interface will have to be changed. This includes Z39.50 services and things like LibX. If you aren’t aware of LibX, it’s a fabulous browser add-on that adds cues to bookstore websites so you can see if the book you’re viewing is available at your library; hotlinks PubMed IDs, DOIs, ISBNs, ISSNs so that you can click on them to see if the resource is available from your library; lets you reload a page through your proxy server for off-site access; and lets you search for highlighted things on the page in your catalog or other services you’ve added. Ok, so obviously my larger institution’s LibX needed to be updated. I’m the only one left who knows how (although I should have been training 2 other people but have not - not their fault, but mine since I’ve been busy) and I’m the primary maintainer. Mostly because I volunteered. Anyhow, I was totally baffled by the edition builder for a while, but then I was able to see what UVa did with theirs and then edit that. Basically I tried to take theirs and then substitute in any changes I knew about and then sent the information to our real programmer (who has been slammed with work) to see if I was close. He made a couple of suggestions and we were off to the races… except for … CRAP! It uses an OCLC service that needed to have our institution’s registry updated. Neither of us knew who was *supposed* to fix that, so I put in a couple of tickets and changed my lab’s registry information while the programmer changed the larger institution’s registry. And crap again, because then it came back thinking we could only ever search one ISBN at a time, which is not true – you can OR a large (if not infinite) number. Finally, I got that semi-fixed (it now works for up to 25 ISBNs coming back from the xISBN service). We’ve tested it and I pushed it out live on Friday.
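To make the ISBN part concrete: the idea is simply to take the list of related ISBNs and OR them together, a limited number at a time. The sketch below is only an illustration of that idea in Python. It is not how LibX or the registry is configured (that is all done through settings), the function names are made up, and the plain " OR " syntax is an assumption, since every catalog has its own query format. The 25 is the limit mentioned above.

```python
def batches(items, size=25):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def or_queries(isbns, size=25):
    """Build one OR-ed query string per chunk of ISBNs.

    The " OR " separator is a stand-in; substitute whatever syntax
    your catalog actually expects.
    """
    return [" OR ".join(chunk) for chunk in batches(list(isbns), size)]

# Placeholder identifiers, not real ISBNs.
isbns = [f"isbn{n:03d}" for n in range(60)]
queries = or_queries(isbns)
print(len(queries))   # number of separate searches needed
print(queries[-1])    # the last, shorter batch
```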

Project three: we’re updating our internal portal page for information services – we had planned to suck in listings and descriptions and what not from the services offered by our parent institution… but ARGH, the site is being built in some version of SharePoint and of course it’s acting all wonky. Even embedding a catalog search is turning out to be a hassle. So I started listing resources and building resource guides – which is actually a very typical job for a librarian… next job is to help the people figure out how to embed these services even though a) I’m not a programmer, b) I don’t know anything about SP, and c) I have other stuff I need to be doing.

Project four: I’m embedded in a team in a sponsor-facing department working on a distributed knowledge management system that’s running on Semantic Media Wiki. Unfortunately the two members of the team with whom I work most closely have been completely pulled off for a few weeks to work on another project so that leaves me. So I’ve learned how to write these hugely complicated nested queries using arraymaps and templates to display results. Once again, it’s not programming, but it’s not literature searching either. It doesn’t seem that difficult … but it was, for me. I wish I could show this off, maybe eventually once it’s delivered.

Project five: It started with standard scientometrics, but I was working with an HCI/visualization expert who prototyped a new system for exploring and visualizing connections, etc. I had already done a lot of data cleanup and visualization, but he needed data in a different format. So this is another example of me massaging data exports from various tools (Sci2, VantagePoint, and originally WoS) for import to another system. We also did a proposal for an internal grant for next year to continue working on this so that took some time.

 

I think there were more, but these were the ones that struck me. It’s not programming, it’s really messing with settings with existing products, but that doesn’t seem to really capture the complexity or frustration. Oh, and in the middle of this my institution changed over to default to the new RefWorks interface – good – so I had to redo the tutorials (only got one done). About 18 hours after I did the update and announced it, the interface changed again so I needed to redo a few screenshots… sigh.


One response so far

Another tilt at a holy grail: identifying emerging research areas by mapping the literature

Jul 10 2011 Published by under bibliometrics, Information Science, STS

Technology surprise, disruptive technologies, or being caught unaware when managing a research portfolio or research funding are some of the fears that keep research managers and research funders up at night. Individual scientists might see some interesting things at conferences and might keep a mental note, but unless they can see the connection to their own work, will likely not bring it back. Even if they do, they might not be able to get funding for their new idea if the folks with the bucks don’t see where it’s going. Consequently, there are lots of different ways to do technology forecasting and the like. One of the main ways has been to mine the literature. Of course, as with anything using the literature you’re looking at some time delay. After all, a journal article may appear three years or more after the research was started.

I’ve been party to a bunch of conversations about this and I’ve also dabbled in the topic so I was intrigued when I saw this article in my feed from the journal. Plus, it uses a tool I’ve had a lot of success with recently in my work, Sci2.

Citation: Guo, H., Weingart, S., & Börner, K. (2011). Mixed-indicators model for identifying emerging research areas Scientometrics DOI: 10.1007/s11192-011-0433-7

The data set they are using is all the articles from PNAS and Scientometrics over a period of thirty years from 1980 to 2010. They’re using the information from Thomson Reuters Web of Science, not the full text of the articles.

Their indicators of emerging areas are:

  • Addition of lots of new authors
  • Increased interdisciplinarity of citations
  • Bursts of new phrases/words

This differs from other work from the likes of Chen and Zitt and others that cluster on citation or co-citation networks and also look at nodes with high betweenness centrality for turning points.

The addition of new authors per year is pretty straightforward, but the other two methods deserve some description. For disciplinarity, each cited article is given a score based on its journal’s location on the UCSD map of science. Then a Rao-Stirling diversity score is calculated for each article to provide a measure of the interdisciplinarity of its citations. For each pair of citations in the reference list, the probability of the first being in a discipline, the probability of the second being in its discipline, and the great circle distance between the two disciplines are used for the score (the map is on a sphere, hence not Euclidean distance). The limitations are pretty numerous. First, the map is only journals, only 16k journals, and only journals that were around 2001-2005 (think of how many journals have come out in the last few years). Articles with more than 50% of the citations not going to things on the map were dropped. They mentioned areas with a lot of citations to monographs, but I would think the bigger problem would be conferences. Newer research areas might have problems finding a home in established journals or might be too new for journals and might only be appearing at conferences.
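To make the diversity score a little more concrete, here is a rough sketch of a Rao-Stirling calculation for a single article. It uses the standard form of the index, the sum over pairs of different disciplines of p_i * p_j * d_ij, with great circle distance for d_ij. This is my reading of the description, not the authors' code: the discipline names, the (latitude, longitude) positions, and the function names are all made up for illustration, and the article may normalize or count pairs differently.

```python
import math
from collections import Counter

def great_circle(a, b):
    """Central angle in radians between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    # Haversine formula on a unit sphere.
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(h)))

def rao_stirling(cited_disciplines, positions):
    """Sum p_i * p_j * d_ij over ordered pairs of distinct disciplines.

    `cited_disciplines` lists the discipline of each cited item;
    `positions` maps each discipline to a (lat, lon) spot on the map.
    """
    counts = Counter(cited_disciplines)
    total = sum(counts.values())
    score = 0.0
    for i in counts:
        for j in counts:
            if i != j:
                p_i = counts[i] / total
                p_j = counts[j] / total
                score += p_i * p_j * great_circle(positions[i], positions[j])
    return score

# Hypothetical map positions and reference list.
positions = {"physics": (0.0, 0.0), "biology": (0.0, 90.0), "chemistry": (45.0, 45.0)}
refs = ["physics", "physics", "biology", "chemistry"]
print(round(rao_stirling(refs, positions), 3))
```

An article that only cites one discipline scores 0, and the score goes up both when citations are spread more evenly and when the cited disciplines sit farther apart on the map.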

For word bursts, I know from using Sci2 that they’re using Kleinberg’s (2002, $, free pdf on CiteSeer) algorithm, but I don’t believe they state that in the article. Their description of their implementation of the algorithm is here. I’ve been curious about it but haven’t had the time to read the original article.

 

In general, this article is pretty cool because it’s completely open science. You can go use their tool, use their dataset, and recreate their entire analysis – they encourage you to do so. However, I’m not completely convinced that their areas work for detecting emerging research areas given the plots of one of their methods against another. Their datasets might be to blame. They look at two individual journals so any given new research area might not catch on in a particular journal if there’s a conservative editor or something. PNAS is a general journal so the articles probably first appeared in more specialized journals (they would have to be big time before hitting Science, Nature, or PNAS). Also, the interesting thing with h-index and impact factor (the two emerging areas looked at for Scientometrics) is not their coverage in the information science disciplines, but h-index’s emergence from the physics literature and the coverage of both in biology, physics, and other areas. If you look across journals, your dataset quickly becomes huge, but you might get a better picture. Impact factor was first introduced decade(s) before the start of their window, but socially it’s become so controversial because of the funding tied to it and promotion and tenure tied to it – I believe a more recent phenomenon. Coding journals by discipline has often been done using JCR’s subject categories (lots of disagreement there) but others have mapped science journals by looking at citation networks. It would be computationally extremely expensive, but very cool to actually have a subdisciplinary category applied to each article – not taken from the journal in which it was published. Also, someone really needs to do more of this for areas with important conference papers like engineering and cs.


10 responses so far

Many people really don’t know what to do with ebooks

Jun 24 2011 Published by under Information Science

Ever since MPOW got rid of our physical collection, I’ve had more and more to do with ebooks. We can’t replace the collection we built over the course of 60 years but we do have access to a lot of stuff. The biggest single provider we have is Springer for many reasons. Their books come as unlocked pdfs, one per chapter. So they kind of in a way work like journal articles. I mean, you’re on a platform that also has journals and it really doesn’t seem much different. But…. if you’re presented with a print book and you need to find information in it, you rely on the training you had from librarians and teachers from elementary school onward. You start with the table of contents or the index. Makes sense for an ebook, too, in many cases. People instead recognize it as a web search pattern, not a book.

I have a hard time convincing our technical staff to try the ebooks. They are skeptical. They don’t believe they’re actually getting the book. They don’t think it’s going to work.

So it was with interest that I saw this evidence summary in the new issue of Evidence Based Library and Information Practice: Undergraduate Science Students are Uncertain of How to Find Facts in E-books Compared to Print Books by Christina E. Carter. The students thought the ebooks should work like Google but they didn’t. The reviewer commented that they didn’t compare the ebooks to journals, which I pointed out above.

This was reviewing: Berg, S. A., Hoffmann, K., & Dawson, D. (2010). Not on the same page: Undergraduates' information retrieval in electronic and print books. The Journal of Academic Librarianship, 36(6), 518-525.

I’m not going to summarize the summary further – particularly since it’s open access.

I take it for granted about the standard ebooks (not the goofy platforms that require some bizarre check out and download and authentication), but I shouldn’t. People still don’t know how to use them or even believe they exist.


5 responses so far

Science Mapping: Which tool to use?

May 17 2011 Published by under bibliometrics, Information Science


Update 5/18/2011: A product marketing manager from Wiley has informed me that she has made this article free to view for 30 days because of this blog post. Thanks!

Update 6/22/2011: The first author contacted me to let me know the full citation since the article is fully published now.

Science mapping is using the network formed from the links between articles (citations, co-authorship), patents, or other information things, to understand the structure of science (Börner, Chen, & Boyack, 2003),  look for turning points or bursty periods in which lots of publishing happens, locate research specialties (Morris & Martens, 2008), locate important research institutes, look for geographic concentrations, and trace the history of an idea.  It’s been around for a while, but it’s gotten even more popular with better visualization techniques and more powerful computers. It’s particularly hot right now with the two major data providers, Thomson Reuters and Elsevier, heavily marketing analysis products. Elsevier sponsored a webinar last week in their Research Connect series on this topic and had more than 500 people dialed in (slides will be posted but I haven’t  been notified that they are yet).

In addition to a bunch of different really expensive commercial products that cover some aspects of this process, there are a ton of tools available from universities and research centers that are either free or inexpensive for non-profit or educational use. This article reviews the field – the general techniques – but mostly reviews the tools. For more on the whys and whats refer to the ARIST articles cited earlier or, really, read articles that cite them or are cited by them.

The article:

Cobo, M., López-Herrera, A., Herrera-Viedma, E., & Herrera, F. (2011). Science mapping software tools: Review, analysis, and cooperative study among tools Journal of the American Society for Information Science and Technology, 62, 1382–1402. DOI: 10.1002/asi.21525

This article is pretty readable and really useful – with the proliferating tools, it’s nice to compare them without having to install them all. They review: Bibexcel, CiteSpace, CoPalRed, IN-SPIRE, Leydesdorff’s software, Network Workbench, Sci2, VantagePoint, and VOSviewer. Hey, CiteSpace now takes data from ADS – that’s cool! IN-SPIRE is kind of weird – different – since it doesn’t do the thing with bibliographic data. Likewise, Leydesdorff’s software is more a series of utilities to deal with bibliographic data. VantagePoint is commercial and somewhat expensive (I have it at work) but it’s really cool how it does data cleanup. They also provide helpful tables to compare the tools.

Interestingly, they tested things like I typically use them: all at the same time. They use one to clean, another to analyze, and a third to visualize. Since Sci2 is free, powerful, and well documented, it looks like a good bet. I’ve tried CiteSpace before but it looks like it has improved a lot since then.

 

Other references

Börner, K., Chen, C., & Boyack, K. W. (2003). Visualizing knowledge domains. Annual Review of Information Science and Technology, 37, 179–255. doi: 10.1002/aris.1440370106

Morris, S. A. and Van der Veer Martens, B. (2008), Mapping research specialties. Annual Review of Information Science and Technology 42, 213–295. doi: 10.1002/aris.2008.1440420113


2 responses so far

About the preservation of databases

Feb 16 2011 Published by under Information Science, Uncategorized

Egon Willighagen asked on Chm-Inf about why libraries aren't preserving databases. Beth Brown provided one reply.

I commented there and hopefully my comment will show up eventually but I seriously doubt we'll be able to help with this.

NASA, DOD, NIH, NSF, and others fund the development and first few years of hundreds if not thousands of databases. Then the database becomes less about new science and more about infrastructure or operations. Then the PI gets bored. Then maybe the users start to drop off... then the database disappears. I was just looking for information in a NASA database that was referenced all over the place. When I got there all I found was a notice that it wasn't funded anymore so no data for me!

We've been hearing this with data - about how it cost so much to gather but then is abandoned.  Libraries are working to try to take up some of the slack with this, but it's hard. Look, if NASA and DOD with big offices for science and technology information can't preserve their own stuff, they're not going to fund us to do so. Libraries don't have the money or the mandate.

I was at SLA whenever it was in DC and saw a presentation about yet another NASA database - even at the time the only thing I could think was how close is the PI to retirement?

Funders should ask about preservation plans for these things. I don't think they do.


3 responses so far
