or08: Open Repositories 2008

I’m at Open Repositories 2008.

Got in to Southampton Central at 8:15am for a 9am start. Thought I had loads of time, but with the wait for the bus, the journey, and wandering around campus, I only just got here for 9. So did just about everyone else, so no usernames and passwords were allocated until coffee time :( [but now online, hence you seeing this!]

Draft notes from the first session, unedited and unchecked.
Peter Murray-Rust
Repositories data

Believes data is the most important thing for scientists, as opposed to open access final full text.

“PDF destroys information”: Word files contain data which PDF just loses. Word files (and LaTeX, XML) are useful for scientists because they can reuse the data, formulae, metadata etc. contained within, all of which is lost in a PDF file.

Academic theses are one of the most important things for institutions/researchers. Electronic theses are going to be very powerful.

Technical problems slowed the talk down.

Showed the PDB repository (Protein Data Bank), going since the 70s. Showed research he *put* into the *repository* while working at Glaxo.

The message is that scientists are already putting stuff in repositories.

CrystalEye, built/started by a postgrad. Now has over 100,000 crystal structures. Harvests from those publishers that release their crystallography (ACS, RSC, etc). Links to the paper via DOI.

Scientists will not put things into repositories (I presume he means article-based IRs, as he has just been describing how scientists do!).

OSCAR text extraction. Showed an example of cutting and pasting the text of a PDF doc into OSCAR; it produced a table of the formulae that were contained within the text.
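
To give a rough feel for "text in, table of formulae out", here is a toy Python sketch. It is not how OSCAR works (the talk did not go into that); the regular expression and the function name `extract_formulae` are made up for illustration, and the pattern only catches simple formulae that contain at least one digit.

```python
import re

# Toy pattern: two or more "element symbol + optional count" pairs,
# e.g. "C6H12O6" or "CH3OH". Purely illustrative, NOT OSCAR's method.
ELEMENT = r"(?:[A-Z][a-z]?)"
FORMULA = re.compile(rf"\b(?:{ELEMENT}\d*){{2,}}\b")


def extract_formulae(text):
    """Return candidate formulae in order of first appearance, no duplicates."""
    found = []
    for match in FORMULA.finditer(text):
        token = match.group(0)
        # Require a digit so acronyms such as "PDF" or "DNA" are skipped.
        if any(ch.isdigit() for ch in token) and token not in found:
            found.append(token)
    return found


if __name__ == "__main__":
    sample = ("The sample of C6H12O6 was dissolved in H2O, "
              "then CH3OH was added. More H2O followed.")
    for formula in extract_formulae(sample):
        print(formula)
```

A real chemistry text-mining tool has to cope with names, abbreviations and context, which a single regular expression cannot do; this only shows the shape of the input and output.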

Royal Society of Chemistry: PROSPECT, semantic markup of papers.
SPECTRa (Cam/Imperial): how can we capture data as part of the academic process?

“Do not try to invent electronic notebooks”: success rate approx zero. I.e. don’t try to make capturing data the integral part of their workflow.

No one knows when their paper gets published.

“get at the authoring process”, that is the key.

