The Data Imperative: Libraries and Research Data : comment

I put this in to a seperate post. It continues on from my previous post, but didn’t want my notes of the day to be taken over by my ill thought views.

Personal Thoughts

Reluctant to give some thoughts as I know so little about the service. However… (!)

There seems to be two clear areas here: Data formatting and Data storing. There is some linkage (Preserving surely covers both, formats can become obsolete, Servers die), yet the two seem to be somewhat seperate.

Both require IT skills, but IT is a broad church, the former is technical metadata (and is very much IT and library) and in the general area that I sees covered in the Eduserv efoundations blog.

The latter in its simplest form is hard core infrastructure. Disks, sans, servers, security, but also has elements at the application level (how do we access it, using what software, repositories? CRIS? Fedora?).

On another issue, while it is easy to say that libraries should take the lead, I think we need to be cautious. With the current climate of frozen or decreasing budgets nationally, and journal subscription pressure, how wise is it to go to the University’s executive and demand funding for resources/staff for data management. We know it’s important and could make the process of research more efficient, but there are other things higher up a Universities list of priorities (NSS/atracting good students, REF, research funding). Even at a library level, journals help researchers do research (which brings funding), and keep students happy because we have the stuff they need (NSS). How many journals should we cancel to focus on Research Data? Why? The recent JISC call will help with providing a business case.

The problem at the moment is that there are not enough clear benefits for most Universities to steam ahead with this. Let’s clarify this: not enough benefits for the institution itself. The benefits are for the UK as a while (actually, the while world). It’s the UK-wide economy and research that will benefit. So maybe it needs UK-wide funding. It’s easier to convince someone (or something) to spend money when the benefits for them are clear. In this case the benefits are for UK so it should the UK which sets aside explicit cash (via HEFCE, JISC, and so on).

And this is happening, with the JISC call (talked about today), amongst other things it will help build examples.

But I’m not sure if the institutional level is the best one. Australia has been successful with a centralised approach. We have a number of small Universities, and those which only have one or two departments which are research active. Yet the resources/knowledge required of them will be similar to that of a large institution. Will this leave them at a disadvantage?

On another note, it seems the range of data is vast. When dicussing this, I always – incorrectly – picture text based data, of vearying size, perhaps using XML. Of course this is blinkered. For auido, images and similar should a data service just provide a method to download, or a method to browse and view/listen? When it comes to storage and delivery, should we just treat all data as ‘blobs’ – things to be downloaded as a file, and we no nothing more with it? This makes it easy and repository softwareapplications (eprints/dspace/fedora) are well placed to cater to this need. But I get the impression that this is somewhat simplistic. Perhaps this means a data service needs a clear scope, otherwise we could end up building front end applications which mimic flickr, youtube and last.fm all in one. A costly path to go down.

[all views are my own. are wrong, badly worded, ill thought, why are you reading this?, just think the opposite and it will be right, etc]