DAMS Meeting

1) Luna on Fedora. The idea is to pull information from Fedora and run the Luna interface on top of it, taking the coordinates overlay into account.

2) Symplectic – REF call. It is a proprietary system that deals with science journals, maths, etc., and was sold to central admin to hold all REF items. Get the holdings -> can get the entire Symplectic metadata. There is no “munging” of records, so many of them are kept.

3) Islamic Digitisation Project. JISC (Cambridge) + Yale Maps (Fedora, but at the moment sitting in Greenstone – not shareable).

4) PASIG – 24th June in Malta – preservation archive. Stanford: how do you architect a digital library?

5) MDICS – cross-reference Google with OPAC records. A set of tools to pull the results of the Google digitisation process.

DIAMMS – music manuscripts with metadata (variable), use a set of tools on disc without individually going through Fedora (“fedora light”). Just update index after the processing with Fedora light and update the count of GB.

HADB (the Honeycomb internal database) keeps track of where the objects are.

object_id  -> object_id_datastream

Not only list the objects but also query a specific collection.
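To make the lookup above a bit more concrete for myself, here is a minimal Python sketch of an index that lists objects, maps an object_id to its datastreams and queries one collection. The class name `ObjectIndex`, the `<object_id>_<datastream>` naming and the sample ids are my own assumptions for illustration; this is not HADB's actual schema or API.

```python
from collections import defaultdict


class ObjectIndex:
    """Toy index: object_id -> datastream ids, plus a per-collection listing."""

    def __init__(self):
        self._datastreams = defaultdict(list)  # object_id -> [datastream ids]
        self._collections = defaultdict(set)   # collection -> {object ids}

    def add(self, object_id, datastream, collection):
        # Hypothetical naming: a datastream id is "<object_id>_<datastream>".
        self._datastreams[object_id].append(f"{object_id}_{datastream}")
        self._collections[collection].add(object_id)

    def datastreams(self, object_id):
        """Return the datastream ids recorded for one object."""
        return list(self._datastreams.get(object_id, []))

    def list_objects(self):
        """List every known object id."""
        return sorted(self._datastreams)

    def query_collection(self, collection):
        """List only the object ids that belong to one collection."""
        return sorted(self._collections.get(collection, set()))


# Made-up sample data.
index = ObjectIndex()
index.add("obj1", "image", "maps")
index.add("obj1", "metadata", "maps")
index.add("obj2", "image", "music")

print(index.list_objects())
print(index.query_collection("maps"))
print(index.datastreams("obj1"))
```

The two dictionaries are kept separate so that listing everything and querying a single collection are both simple lookups.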

Open storage boxes (Sun supporting) – will be used as NFS boxes (VMs) with a web-based config interface. They inherit components from Honeycomb2.

Check Sun QFS – it has objects + metadata, like Honeycomb.

For us, Fedora is the architecture (bags of stuff) plus a way to keep the APIs simple (REST). However, it has not been updated yet; the codebase has stalled.

D-space (Mike Diggery).

6) OULS Digital Preservation Strategy – covers the digital matters. DAMS + electronic records management (e.g. SharePoint) will be thrown at the department.

Bourne digital -> DAMS (preservation, not access).

Hydra Project (Stanford) same structures of DAMS: Apps -> Tools -> Object management -> Storage -> VMs.

DAMS Project

These notes come from a meeting of the DAMS team. The project is ambitious and has a lot going for it. I’m still not fully involved in the project, and I hope that this will happen soon.

The project and services pages are at http://damssupport.ouls.ox.ac.uk/trac, and more specific information about the DAMS project can also be found there.

The trac system is used. It is equivalent to a wiki and offers file attachments, code (Subversion), tickets (bug tracking), a roadmap (milestones), etc. It is also an event management system and a single place for all projects.

Why use the trac system?

  • JISC accepts blogs and wikis as documentation
  • has RSS feeds for timeline changes
  • XSLT >> reports
  • backend is in Python, currently in VMWare

Vocabulary used as standard and created by projects will be at http://vocab.ouls.ox.ac.uk/

Event-message system

The filesystem has the structure:

Storage-messaging-index

JMS – Java Message Service, queue with msgs (stack)

AMQP – Advanced Message Queuing Protocol, email-like msg type

RabbitMQ (silent: success, noise: fail) – indexes, specify in advance a queue

Fedora – 2 queues: API-M and API-A (CRUD – create, read, update, delete)
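To picture the storage-messaging-index flow, I put together a small Python sketch that uses the standard library `queue` module in place of a real broker such as RabbitMQ. The routing rule (create/update/delete go to the API-M queue, read goes to the API-A queue), the message fields and the function names are my assumptions, not how Fedora or the DAMS code actually does it.

```python
import queue

# Two in-process queues standing in for the broker's queues.
# Assumption: management events go to "API-M", access events to "API-A".
queues = {"API-M": queue.Queue(), "API-A": queue.Queue()}

MANAGEMENT_ACTIONS = {"create", "update", "delete"}


def publish(action, object_id):
    """Storage side: put an event message on the matching queue."""
    name = "API-M" if action in MANAGEMENT_ACTIONS else "API-A"
    queues[name].put({"action": action, "object_id": object_id})
    return name


def drain_into_index(index):
    """Index side: consume all pending management events and apply them."""
    q = queues["API-M"]
    while not q.empty():
        msg = q.get()
        if msg["action"] == "delete":
            index.discard(msg["object_id"])
        else:  # create or update
            index.add(msg["object_id"])
        q.task_done()
    return index


# Made-up events.
publish("create", "obj1")
publish("create", "obj2")
publish("read", "obj1")
publish("delete", "obj2")

index = drain_into_index(set())
print(sorted(index))
```

The point of the sketch is that storage never talks to the index directly: it only posts messages, and the index catches up whenever it reads the queue.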

Why DAMS?

  • digitisation
  • Store “things” and metadata about them, independent of Fedora
  • components, open interfaces, open standards

Scalability

  • object storage, cluster / distributed system, live replicas
  • MAID (massive array of idle disks)
  • Honeycomb (self-healing system)
  • search engines (indexing accessing)

Longevity

  • simple interfaces
  • reduce dependency on third parties
  • abstraction layers/resolvers
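As a note to self on what a resolver buys you, here is a minimal Python sketch: clients keep a stable identifier, and only the resolver knows where the object currently lives. The `Resolver` class, the identifiers and the `file:///` locations are all hypothetical and only there for illustration.

```python
class Resolver:
    """Maps a stable identifier to wherever the object currently lives."""

    def __init__(self):
        self._locations = {}  # identifier -> current location

    def register(self, identifier, location):
        # Registering again simply replaces the old location.
        self._locations[identifier] = location

    def resolve(self, identifier):
        try:
            return self._locations[identifier]
        except KeyError:
            raise LookupError(f"unknown identifier: {identifier}") from None


resolver = Resolver()
resolver.register("obj1", "file:///store-a/obj1")
# Storage is migrated: only the resolver entry changes, not the identifier.
resolver.register("obj1", "file:///store-b/obj1")
print(resolver.resolve("obj1"))
```

If the storage layer is replaced later, only the resolver entries need updating, which is the reduced third-party dependency the list above is after.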

Availability

  • enhance long-term availability
  • disaster-recovery
  • snapshot of VMs on the system
  • digital preservation
  • represent the entire collection

Interoperability

  • implementation of interfaces
  • avoid low-level interfaces (use standard interfaces)

Sustainability

  • budget, archival, migrate skills
  • from analogue to digital culture

DAMS Phase 2

Skills needed for the future and the current projects.

Honeycomb – write once, read many. Make amendments, diff and commit later.
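A minimal Python sketch of how I understand the "write once, read many" idea combined with amend, diff and commit later: every commit adds a new version and nothing is overwritten. The `WriteOnceStore` class and the sample record are my own invention for illustration, not the Honeycomb API.

```python
import difflib


class WriteOnceStore:
    """Toy write-once store: committed versions are never overwritten."""

    def __init__(self):
        self._versions = {}  # object_id -> list of committed texts

    def commit(self, object_id, text):
        # Each commit appends a new version; older ones stay readable.
        self._versions.setdefault(object_id, []).append(text)
        return len(self._versions[object_id])  # 1-based version number

    def read(self, object_id, version=None):
        """Return the latest version, or a specific 1-based version."""
        versions = self._versions[object_id]
        return versions[-1] if version is None else versions[version - 1]

    def diff(self, object_id, amended_text):
        """Unified diff between the latest committed version and a draft."""
        current = self.read(object_id).splitlines()
        return list(difflib.unified_diff(
            current, amended_text.splitlines(),
            "committed", "draft", lineterm=""))


# Made-up record: amend locally, look at the difference, commit later.
store = WriteOnceStore()
store.commit("obj1", "title: Example\nyear: 1900")
draft = "title: Example\nyear: 1901"
changes = store.diff("obj1", draft)
version = store.commit("obj1", draft)
print("\n".join(changes))
```

Because old versions are kept, the first commit can still be read after the amended one is stored.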

Check also: Less talk, more code blog at oxfordrepo.blogspot.com

Library formats

Well, I took some time to post the rest of my notes. I shouldn’t have taken that long, but I was catching up with some Python “de-rusting” and learning, plus some other projects that will appear in the latest posts.

The library formats post is about a meeting I had that explained a bit about the transition from the old formats and systems to what is used today. Some of the formats are still in use, and library services have too many legacy systems that have to be taken into consideration.

From what I could gather, MARC 21 (a format for book catalogue records) was the very first model, and many of its concepts are still in use. After that, UNIMARC was the European version, and then came AACR2 (with some rules for populating the records).

The Library Management System became the Integrated Management System, and there was still a need to account for out-of-print materials such as manuscripts and images.

OLIS used Geac Advance.

OPAC was the catalogue for circulation.

The idea was to modernise: not to have only hardcopy material to search in the library, but a resource discovery system.

The ultimate goal is to have the material for long term preservation and that’s when DAMS, FutureArch and other projects will shine.

A simple schema of the relations:

Projects schema