What we’ve been up to…

After a year or so of planning, the last few months have been gratifyingly practical. Work is now racing forward on many fronts. With this blog post I’ll attempt to explain the various strands of work and how they all fit together to attempt to realise the JISC and RLUK resource discovery taskforce (rdtf) vision.

First a bit of background. The vision sets out an approach that we will be pursuing up until the end of 2012. As our work is more about defining this approach, rather than something firmer like an architecture, we need to be able to learn as we go along. Therefore the work between now and December 2012 has been split into 3 iterative phases. We are currently in the 1st phase that runs from January to July. This phase is all about laying the groundwork for the next two phases so the 2 main outputs of phase 1 are data that we can reuse in phases 2 and 3 and definitions of approaches for issues like metadata, technical architecture and sustainability.

Phase 1 activity

Institutional projects

We have funded 8 projects to investigate new approaches to making metadata about the collections of libraries, museums and archives available in a way that enables the metadata to be reused to enrich the original collection and make it more visible. Many of these projects are investigating a linked data approach. The 8 projects are summarised on the JISC website and also in the newsletter mentioned below. Each project has a blog so it is easy to follow along with any that interest you.

Existing aggregation projects

There are 2 projects at Edina and 2 projects at Mimas that are working with Copac, ArchivesHub, Suncat and Go Geo. The projects are all focused on making it easier for people to reuse the data from these services. They are just getting started and more details will be available soon.

Studies and reports on specific issues

We have commissioned a few reports to address specific challenges involved in the vision. There have been 3 of these so far:

Management framework

The management framework project is the central project that helps bring together all of the strands of work above into one cohesive whole. The framework project has 3 headline purposes:
  1. To gather together the knowledge and lessons from the projects above and to turn them into advice and guidance for libraries, museums and archives
  2. To produce a website to enable developers to engage with reusing metadata and to enable other stakeholders to follow progress and identify opportunities
  3. To provide advice to JISC and partners on the infrastructure required to realise the vision
The project has a website which describes the deliverables in more detail. It is a very collaborative project as can be seen from the people involved. Specifically, the framework project is supported by 2 advisory groups: one focused on technical issues and one on management issues.

Communications

The management framework project works extremely closely with a dedicated communications and relationship management project. The purpose of this project is to work with the vast range of stakeholders involved in this work (see the table in the implementation plan) and to ensure that we keep on top of their needs, issues and use cases. The project will also provide progress updates and information to help information professionals see how this work can benefit their collections and end users. You can see the first newsletter on rdtf progress on the site now. You can also use that site to sign up for future issues. If you are lucky, you could grab one of the few remaining invites to the April event this project is organising to explore the issue of open data for libraries, museums and archives.
The picture below illustrates how these projects all fit together. As you can see the management framework sits at the centre of the implementation and we’ll use this project and their website to make it easy for people to follow the progress with the work and to engage with any elements that interest them.
Alongside all of this work, I remain busy behind the scenes trying to ensure that the work to implement the rdtf vision takes into account relevant work happening in other nations and with other content types.
Phase 2 will start after July and the intention is to focus on the issue of aggregation. The exact shape of phase 2 will be determined by the work of the projects in phase 1 but we’ve started the planning process already and I’m excited by some of the ideas that are emerging.

A framework for managing the implementation of the vision

We realise that the scale and complexity of the RDTF Vision work requires a robust management framework. Over the next eight months we’ll be working closely with Mimas to design and implement that framework.  In this post Joy Palmer from Mimas shares the approach and ethos in carrying forward this work with us, and she also provides an overview of the management framework activities which will be taking place between now and July 2011. –  Andy McGregor.

Guest Post by Joy Palmer, Mimas


At Mimas, we are very pleased to be working with JISC in taking this ambitious work forward. As a service provider and National Data Centre, Mimas obviously has a key stake in the outcomes of this work. I’m going to use this opportunity to convey how we at Mimas perceive the RDTF Vision and the challenges ahead, and I’ll also touch on the specific activities we have planned for the Management Framework project. More detailed information will follow as we move on.

Adhering to the principle of ‘data in, data out’

The focus of the Vision is on aggregations and exploiting the strengths of current infrastructures, making data ‘work harder.’ A mixed economy of technological solutions and approaches is the only way forward, and we’ll be paying attention to creating an open infrastructure that enables a broad range of activity, particularly innovation. It’s also critical that the work we plan around licensing and ‘Open Data’ advocacy contributes to an outcome where more metadata is made openly available through a variety of mechanisms, not limited to ‘centralised’ aggregation. We’ll be working with Eduserv and UKOLN to promote and understand the value of technological alternatives to centralised aggregation, including API development and deployment, and ‘web of data’ approaches. We’ll be working closely with Paul Walk and others to understand how existing aggregations can be joined, shared, or potentially merged, and to develop a blueprint for a future infrastructure.

Mimas has been negotiating this space for several decades now, so we have a healthy understanding of the technical and cultural challenges ahead. The only way we can ensure success is to approach this as a collective with our key stakeholder partners. We know establishing the technical requirements for the infrastructure will be a major hurdle, but understanding the requirements of the different domains and gaining buy-in will be our most significant challenge.

Collaboration not competition in approaches to national aggregations

A key threat to success is a scenario where institutions participate in an increasing number of aggregation schemes, all with differing standards, objectives, and approaches to licensing and ‘openness.’ We want to work collectively and not competitively. There are a number of national aggregation players at present, all undertaking significant work on behalf of their communities. We’re going to be working in close consultation with other UK bodies that provide centralised aggregation services for their communities. Key amongst those are Collections Trust (behind Culture Grid) and RLUK, who provide the bibliographic data that underpins Copac. We’re also partnered with The UK Archives Discovery Network (UKAD) through the Archives Hub service engagement (http://www.ukad.org). Collectively, these partners represent many of the key stakeholders in the UK libraries, museums and archives sectors, and can help us reach individual institutions across the country. At the same time, as usual we’ll be working closely with EDINA throughout the process, and we’re already consulting with OCLC to identify ways we can effectively collaborate.

Tackling the complexity in licensing issues

We see tremendous advantages in building on the work of the Open Bibliographic Data Guide project, and we’re very pleased that David Kay, Paul Miller, and Owen Stephens have agreed to work with us in this complex area. As highlighted in the recommendations for future work in the OBDG project, while there is increased understanding of the benefits of Open metadata in the library community, there are still gaps in understanding across all three domains which we will work to identify. Developing further clarity around licensing will be critical, but just as critical will be the work to advocate and drive the opening up of metadata.

Challenging times ahead, no doubt.  We are looking forward to working with JISC and developing a practical route forward—a route that clearly articulates the benefits we can achieve for key stakeholders, and the specific tactics we’re deploying to get there. In the rest of this post I describe the key areas of activity over the next eight months.

Stakeholder engagement and communications

This work will encompass a range of activity associated with advocacy of open data to the end of making it available to national, subject, local and other aggregations.  We’ll be working closely with the RDTF Communications Framework project team to develop a stakeholder engagement strategy.  A focal point for the community will be a website articulating the Vision and its benefits, including use and business cases relevant for differing stakeholders. The site will include a developers’ area akin to that of Digital New Zealand (http://digitalnz.org/developer) that provides support tools.  The development of this site will include work to establish an appropriate name and visual identity to engage these stakeholders (we realise that the ‘Resource Discovery Taskforce Vision Management Framework’ doesn’t exactly trip off the tongue).  In addition, we’ll be holding a ‘Web of Data’ training and advocacy event with Eduserv in the Spring to engage developers at institutions.

Technical architecture and metadata standards

This area of work will define a fit-for-purpose infrastructure for aggregation. We’ll be working with UKOLN and our partners to identify the requirements for aggregation in the context of resource discovery for Libraries, Museums and Archives.  A key activity will involve identifying the existing infrastructures for aggregation across sectors, establishing a blueprint of the current data flow, including the role of vendors, the purpose of the aggregation, and the user and business needs they support.  We’ll be looking at how value is added within existing flows, and also existing barriers and opportunities – we aim to develop a functionally specific picture of business needs of the different sectors, and what aggregation must or could support.  We want to find out if ‘mutuality’ or a federation of aggregations is feasible to serve these needs.

Licensing

Here we’ll be building on the work already undertaken through the JISC funded Open Bibliographic Data Guide project. We’ll address gaps in understanding, and in identifying and communicating the specific benefits of ‘Open’ metadata to differing stakeholders and curatorial domains which have a related but at times differing agenda. We’ll work to establish clear licensing arrangements around aggregated data, particularly covering re-use.  A key deliverable will be a Risk Management Framework, which will help senior decision-makers weigh the risks and benefits of opening up data. At the same time we’re developing a plan for advocacy and community engagement to articulate the benefits of ‘Open’ metadata and to motivate institutions to open their data.  Part of this work will involve developing a plan to address gaps in community understanding and issues emerging particularly within the museum and archival sectors.

Scoping Study Final Report: Aggregations of Metadata about Images and Time-based Media (Films and Sounds)

Guest Post by Sheila Fraser of EDINA

JISC recently funded EDINA to undertake a short (3-month) scoping study to assess the feasibility, viability and value of creating an aggregation of metadata about images and time-based media, i.e. collections of information (e.g. catalogue information) about digital resources rather than collections of digital resources themselves.

The final report is now available and indicates benefits of, and opportunities for, aggregations of metadata describing images and time-based media, and the barriers to having open and shareable metadata for these resources. It also describes a number of scenarios in which aggregations of metadata about images and time-based media would be useful or required, and different models for aggregating metadata.

The report concludes: “Aggregations of metadata about images and time-based media are useful … it is desirable that these metadata be aggregated [and] it is valuable to have digital metadata for physical resources to enable discoverability of related resources”.

The team would like to thank all participants in the consultation process for their time and expertise. To enable further opportunity for contribution from the wide range of stakeholders, feedback on this report is encouraged as part of the consultation exercise. The full report is available at http://mass.blogs.edina.ac.uk/, where comments are welcome. Further project details are at: http://edina.ac.uk/projects/Aggregations_Scoping_summary.html.

Addressing some technical questions posed by the RDTF vision

The resource discovery taskforce (RDTF) vision poses a number of challenging technical questions such as:

  • What is an aggregation?
  • How do institutions contribute open metadata?
  • What metadata and standards do we use?
  • How do you build interfaces that developers will be keen to use?
  • What needs to be done to existing services and aggregations?

Implementing the resource discovery taskforce vision largely depends on addressing these challenges. Fortunately there are a lot of smart people in the HE community so I fancy our chances. Paul Walk and Adrian Stevenson of UKOLN are managing a project called the IE technical review which has been set up to examine these kinds of issues. As part of this project Paul and Adrian pulled together a group of experts to discuss the technical side of the RDTF vision. You can read a summary of the meeting on the IE technical review blog.

The meeting was very productive and the group came up with a list of recommendations. We are issuing a number of calls for funding designed to further the RDTF implementation plan in the next month and the recommendations have been written into these calls. We expect to fund a number of different types of project aimed at institutions, existing aggregations and at overall management of the open metadata and aggregations. These projects will take these outline recommendations and build on them to attempt to answer the big questions we face.

Update on current work

Work on realising the resource discovery taskforce (RDTF) vision is well underway. We will be funding the projects that will form the backbone of the work in the next couple of months but we have a couple of interesting projects happening at the moment.

A guide to open bibliographic data

There is a lot of interesting work happening with open bibliographic data at the moment. A number of libraries in Europe have made all their bibliographic data openly available and there is a lively discussion happening on the OKFN mailing list. Karen Coyle’s blog post sums up efforts in this area nicely.

This area is obviously very relevant to the work of the RDTF but it is in the early stages with lots of independent innovation happening all over the place. So we decided to fund a resource that would describe why open bibliographic data is interesting for libraries and to help those who want to engage in this area to do so.

The work is being carried out by Sero consulting, Owen Stephens and Paul Miller. It is a couple of months off finishing at the moment but in the best spirit of openness they have decided to show their working and you can see the prototype for the guide by heading over to Owen’s blog. It is a work in progress and neither the content nor the design is in the final form but you can see where they’re heading.

Aggregations of metadata about images and time based media

Edina are preparing a scoping study to look at the issues involved in creating aggregations of metadata about images and time based media. For the RDTF to be useful we need it to cover a wide range of content and institutions. Different content types bring their own challenges and there is a particularly interesting set of challenges for images and time based media. So for the RDTF we need to know more about what those challenges are and how we could go about addressing them. Edina are particularly well placed to do this work due to their experience with the visual sound and media portal.

Sheila Fraser, the project manager, has produced a very clear one page summary of the work.

This project is in the information gathering stage and, if this sounds like an area you are interested in, you can get involved by following the instructions below:

We would like to hear from a wide range of people, particularly those who:

  • Use images, films/videos or sounds in learning, teaching or research,
  • Own, curate or manage collections (large and small) that contain images, films/videos or sounds, especially museums, libraries, archives and university departments who hold images or multimedia resources,
  • Work with metadata or collections of metadata in other areas, for example in document or learning repositories or archives,
  • Might be interested in developing collections of metadata for images and time-based media, or
  • Might be interested in developing services using collections of metadata, or in using collections of metadata e.g. in research.

We invite you to take part in the survey, available at http://www.surveymonkey.com/s/YHT8JNR, which should take between 10 and 25 minutes to complete depending on your answers.

Launch of vision

After a year and a half’s work by the resource discovery taskforce our vision was launched at a JISC and UKOLN event called Survive or Thrive.

The vision focuses on the aggregation of metadata about library, museum and archive collections to allow the creation or enhancement of innovative resource discovery and library collection management services. The vision paper can be downloaded from the IE repository.

The vision is designed to steer work up until the end of 2012. The implementation plan describes the work to realise the vision in more detail. JISC will be funding projects that contribute to the implementation plan. However achieving the vision will only be possible by working with a wide range of stakeholders and the work of those people will also contribute to the implementation plan.

The implementation plan has a page on this blog and it will continue to be developed and will track all work relevant to the plan.

If you want to learn more about the taskforce you can explore the reports from each meeting or read about related work by browsing the pages menu on the right hand side. Future posts will go into more detail about the background and the context for this work as well as reporting on progress with the implementation plan.

Final meeting of the taskforce

The final meeting of the Resource Discovery Taskforce will take place on the 2nd of December. The focus of the meeting will be reaching a consensus on the final vision and starting to develop an implementation plan to work towards achieving the vision.

Once the final meeting is over, the vision will be communicated to the rest of the community and a working implementation plan will be created. This will be a living document and partnerships with stakeholders will be developed to enable the work in the implementation plan to take place.

Documents from the meeting will be available from this blog.

Information gathering report

To help the Resource Discovery Taskforce members think about a vision for UK HE resource discovery JISC commissioned a report to look at how other nations are addressing resource discovery and what some commercial companies are providing in this area.

The report was written by Rightscom and is a detailed look at the work on resource discovery of:

  • Australia
  • Sweden
  • The Netherlands
  • Canada
  • Amazon
  • LibraryThing
  • OCLC
  • Biblios.net

Read the full report

Here is the executive summary of the report to whet your appetite:

This report describes and analyses some examples of resource discovery services (in Australia, Canada, the Netherlands and Sweden) incorporating national union catalogues, and also examines some commercial and non-commercial services which incorporate resource discovery as a key part of their operations.

In partnership with RLUK, JISC has set up a Resource Discovery Taskforce to focus on defining the vision and requirements for the provision of a shared UK infrastructure for resource discovery for libraries, archives and museums to support education and research. It is important for the Taskforce to be aware of relevant services, infrastructures and technologies used in other countries and in the commercial sector for resource discovery. The purpose of this report is to provide the Taskforce with that information. The report should be read in that context.

It is not possible to condense the description of each service into this Summary in a meaningful way. However, there are some common strategic challenges facing the national resource discovery services, and they are taking similar approaches to solving them. Common challenges include:

  • Making services more welcoming to those accustomed to using Google and social networking sites
  • Enabling discovery across media types
  • Ensuring that the information contained in library catalogues is exposed as widely as possible
  • Making it as easy as possible to move from finding an item to getting it

Common solutions include:

  • Redesigning interfaces
  • Incorporating relevance ranking, faceted search and results clustering, as well as features such as ‘Did you mean…?’
  • Importing book cover art, reviews and tagging from sources such as Amazon and LibraryThing (these services are examined in the report)
  • Improving and extending metadata
  • Taking library data out to users in various ways rather than expecting them to come and find it. Examples include exposing records via Google BookSearch, Google Scholar and Yahoo! either directly or via WorldCat; embedding search boxes in Facebook; making the entire catalogue available in Linked Data
  • Tying finding more tightly to getting, for example by deep linking to the catalogues of a user’s local library; providing pre-paid non-mediated Inter Library Loan accounts for users; linking to bookshops’ websites; experimenting with home delivery

The national resource discovery services face some dilemmas and barriers as well:

  • Very limited resources in comparison with commercial services, especially in the current economic situation. This applies whether the services are directly taxpayer-funded or dependent on subscriptions from libraries
  • Balancing the need to integrate into the wider network for greater visibility with the desire not to become too dependent on large and powerful organisations such as Google
  • Breaking down technical and human/organisational barriers in order to present a more coherent discovery environment to users, for example, enabling single search across media types, integrating multiple portals
  • Making crucial technology decisions (for example, on open source, buy or build, semantic web technologies) when the environment is changing so rapidly
  • Balancing public accountability and a culture of caution against the need to take risks and adopt a ‘beta’ approach to development
  • The need to scope what national resource discovery services should and should not encompass e.g. there may be a temptation to try to build a community where it’s not appropriate or feasible

For the sake of convenience, the other services are referred to as ‘commercial’, though their business models differ considerably.

Drivers for these services include:

  • Shaping their services to compete effectively, which in turn may or may not imply selling more to the user, but in all cases implies creating and sustaining users’ trust
  • This is bound up with creating an environment in which users value and trust the contributions of other users, and satisfying their desire to pass their own knowledge and opinions on to others
  • To be competitive, these services must be able to provide user-friendly interfaces, additional features, depth of content and a sufficiently large user community to maintain content and ensure freshness
  • A business model that enables partnerships is critical

Challenges and barriers for these services:

  • The main barrier facing most of the services is the presence of a strong leading competitor, making it difficult for the others to get traction
  • Small numbers of users and small volumes of metadata restrict the usefulness of both automated and human-based recommendation and evaluation services; in addition, a relatively small proportion of active users means the services have to attract large total user numbers
  • A key challenge for smaller services is to retain access to (preferably open) sources of data required for sustainability

Conclusions specific to national resource discovery services:

  • All of the services examined in this report understand that change is vital if library catalogues are to retain relevance and visibility in the wider networked discovery environment
  • Some have made more progress towards change than others, but they are on similar paths, albeit with variations according to circumstances and strategies
  • There is a need to make more concerted ongoing efforts to understand users’ needs and behaviours (not only when developing new interfaces) and where appropriate, segment their user bases and market services more effectively to these different groups
  • Community features need scale and the services are right to follow the path of relying on tags and reviews from e.g. LibraryThing and Amazon, though this does imply dependence on services which are outwith their control

Overall conclusions, which apply to all services:

  • Scale is vital for user-generated material: both the size of the user community and the size of the metadata collection will make a considerable difference to a service’s ability to attract and retain users.
  • The increasing dependence of resource discovery services (both commercial and non-commercial) on a number of large external datasets is of interest. This interdependence seems to be at the data level: as yet, functionality is not shared between the resource discovery services studied here. Although not at the moment an area for concern, dependence on data from other resource discovery services can cause some problems—as the recent reaction to OCLC’s change of policy on WorldCat records illustrates. In the long term, it may pay to monitor the risks of a monoculture for data. Models that allow many smaller services to contribute data, or for their data to be used in real-time, may offer some insurance against data monocultures.
  • More needs to be done to understand user journeys: individual sites monitor what users do, but the way they move between different sites is not known. This would help analyse how users get to resource discovery services and what they do once they have finished. Understanding user journeys would also help to elucidate the relationship between (for example) either identifying a book through Amazon and then using a union catalogue to locate a copy to borrow; or finding something in a union catalogue that was not available to borrow locally and purchasing a copy.

Welcome to the Resource Discovery Taskforce blog

JISC and RLUK have set up a taskforce to discuss what needs to be provided to help people discover and access items from Higher Education Libraries, Museums and Archives throughout the UK.

The taskforce first met in November 2008 and it will run until December 2009. We have had 3 meetings so far with one more planned in December 2009. The aim of the taskforce is to produce a vision for resource discovery and delivery across UK HE museums, libraries and archives.

Once the taskforce has produced its vision, an implementation plan and communications plan will be produced to help the HE sector achieve the vision set out by the taskforce.

The blog will be used to distribute information about the taskforce and all of the documents that the taskforce produces.

This blog includes pages that describe:

  • The scope and terms of reference of the taskforce.
  • The membership
  • A summary of meetings 1 – 3

See the links on the blog header for these pages