Wednesday 2nd of August, Morning Workshops
How to Make Your Research Networking System (RNS) Invaluable to Your Institution
- Brian Turner, Eric Meeks, Anirvan Chatterjee - University of California San Francisco
- Lamont Cannon, Julia Trimmer - Duke University
- Douglas Picadio, Lars Oestergaard, Kelechi Okere - Elsevier (Pure)
Duration: 3.5 hours
Attendees will be prepared to implement several different tactics to strengthen the utility and prominence of their RNS for their institution.
A half-day workshop for up to 20 people. We’ll need internet connectivity, a projector, classroom-style seating for attendees with table/laptop space, and a lectern.
This workshop is designed to help institutions build, leverage, and deploy the information within their RNS across the institution. The goal is to increase awareness of, engagement with and dependence on your RNS to solidify the RNS’ roles in supporting researchers. Note that the takeaways from this workshop can be applied to your RNS regardless of the underlying product, and will work for a VIVO, Profiles, “home grown,” or commercial RNS installation.
We intend to provide a mix of lecture, discussion, exercises, and templates to enable participants to replicate the successful engagement at UCSF, Duke and Elsevier — and avoid our mistakes.
- John Riemer, chair - University of California Los Angeles (UCLA)
- Amber Billey - Columbia University
- Michelle Durocher, PoCo representative - Harvard University
- Paul Frank, PCC NACO - Library of Congress
- Stephen Hearn - University of Minnesota
- Violeta Ilik - Northwestern University Feinberg School of Medicine
- Jennifer Liss - Indiana University
- Andrew MacEwan - British Library
- Erin Stalberg - Mount Holyoke College
- All organizers are members of the PCC Task Group on Identity Management in NACO
Duration: 3.5 hours
- Understand the differences and similarities between authority control and identity management
- Identify common areas of interest between libraries and research profiling systems
- Contribute to the growing number of use cases for applying authority data in new ways
- Understand how the traditional curating role for authorities data in libraries today may become much more focused on identity matching and disambiguation in the future
Expected participants would include experts in the field of authority control as used in libraries and established experts familiar with the new initiatives that look to integrate the identity management in systems such as research profiling systems, institutional repositories, and …
This specific workshop has not been organized before. The PCC TG on Identity Management in NACO began its work in the spring of 2016.
With increasing frequency, terminology like “Identity Management” is being used in many settings including libraries where the familiar term is “Authority Control.” Librarians are interested in understanding the difference between those concepts to better align their work with the new developments and new technologies and enable the use of the authority files and identity management registries in various settings. Our task group, Program for Cooperative Cataloging Task Group on Identity Management in NACO, would like to explore and discuss with the VIVO community our common areas of interest. Come hear about some of the emerging use cases illustrating the difference, where library authority data is being utilized in new ways and join us in discussing some of the implications these developments have for the broader community.
Libraries are shifting traditional notions of authority control from an approach primarily based on creating text strings to one focused on managing identities and entities. This workshop will examine the library experience of working collaboratively over centuries to standardize name forms, share important lessons learned, and explore what infrastructures might be put in place by libraries and institutions/organizations to enable us to work most effectively together going forward: minting and sharing identifiers, linking local identifiers to globally established ones, and creating metadata enrichment lifecycles that enable broad sharing of identity management activity.
The workshop will address the new initiative to start a pilot membership program for PCC (and other) institutions with the ISNI. This new initiative is intended to help create a pathway for globally shared identifier management work in libraries, in support of not only traditional uses, like including identifiers in MARC authority work, but also forward looking projects like linked data and non-MARC library initiatives like institutional repositories, faculty profiling systems and many other use cases.
Wednesday 2nd of August, Afternoon Workshops
Managing Assets as Linked Data with Fedora
- David Wilcox
- Andrew Woods
Duration: 3.5 hours
Fedora is a flexible, extensible, open source repository platform for managing, preserving, and providing access to digital content. Fedora is used in a wide variety of institutions including libraries, museums, archives, and government organizations. Fedora 4 introduces native linked data capabilities and a modular architecture based on well-documented APIs and ease of integration with existing applications. Recent community initiatives have added more robust functionality for exporting resources from Fedora in standard formats to support complete digital preservation workflows. Both new and existing Fedora users will be interested in learning about and experiencing Fedora features and functionality first-hand.
Attendees will be given pre-configured virtual machines that include Fedora bundled with the Solr search application and a triplestore that they can install on their laptops and continue using after the workshop. These virtual machines will be used to participate in hands-on exercises that will give attendees a chance to experience Fedora by following step-by-step instructions. Participants will learn how to create and manage content in Fedora in accordance with linked data best practices and the Portland Common Data Model. Attendees will also learn how to import resources into Fedora and export resources from Fedora to external systems and services as part of a digital curation workflow. Finally, participants will learn how to search and run SPARQL queries against content in Fedora using the included Solr index and triplestore.
Promoting FAIR data principles with figshare
- Alan Hyndman
Duration: 3.5 hours
There has been much talk around FAIR repositories, which make content in a repository Findable, Accessible, Interoperable, and Reusable, to help create efficiencies throughout the research workflow and allow researchers to build on data and research that came before them. Figshare works with researchers and publishers to help bridge this gap and connect the valuable underlying data to both the article and the researcher themselves, allowing for more credit for non-traditional outputs of research to spur scientific discovery and incentivize data sharing. This presentation will show how, by providing valuable infrastructure and bringing non-traditional research outputs to the forefront, discoverability and data reuse can raise researcher profiles and allow publishers to provide additional value to the journal article itself.
Openly available academic data on the web will soon become the norm. Funders and publishers are already making preparations for how this content will be best managed and preserved. The coming open data mandates from funders and governments mean that we are now talking about ‘when’, not ‘if’, the majority of academic outputs will live openly on the World Wide Web. The EPSRC of the UK is mandating dissemination of all of the digital products of research they fund this year. Similarly, the European Commission, the White House’s OSTP, and the Government of Canada are pushing ahead with directives that are also causing a chain effect of open data directives amongst European governments and North American funding bodies.
This workshop will be a mix of group discussion and case study presentations from Carnegie Mellon University and St Edward’s University, who will be talking through their approach to implementing figshare and the tools they have built on top of the figshare API. The half day will look at the research data management landscape, from the different approaches on the institutional level that are being taken to adjust to the various funder mandates to the ways your institution can ensure researchers comply with these funder requirements. In doing so, we will explore how existing workflows will be disrupted and what potential opportunities there are for adding value to academic research and profile at your institution. It will also take the audience through the experience of figshare and how we’re attempting to contribute in an area that has many stakeholders - funders, governments, institutions and the researchers themselves.
Crosswalking Research Area Vocabularies in VIVO
Duration: 3.5 hours
Many VIVO sites use different vocabularies to indicate the research areas they are affiliated with. For example, the biological sciences use PubMed MeSH subject headings, but the physical sciences might use a controlled vocabulary from a commercial vendor, like Clarivate’s Web of Science Keywords, or FAST terms derived from the Library of Congress Subject Headings. This can lead to redundancy and confusion on a VIVO site that allows the end user to filter based on a vocabulary term: the same or similar terms might display multiple times. Generally, an end user isn’t concerned with the originating vocabulary of the term. They just want to filter or center their experience on that term.
An example is the equivalence one can draw between the MeSH term “Textile Industry” and the same AGROVOC term. Both indicate “Textile Industry”. In VIVO, the problem arises if one publication indicates the MeSH “Textile Industry” term while a different publication indicates the AGROVOC “Textile Industry” term. VIVO will then show two “Textile Industry” concepts.
It gets more confusing as we search through the other vocabularies. Some sites, like Wikidata, might have links to the term in various vocabularies, but not all. Looking at Wikidata, we see links to other vocabulary synonyms for “Textile”, but no links to FAST, MeSH, LCSH, Fields of Research (FOR), or others. This presents challenges for VIVO sites that ingest publications from various sources, either directly or via applications like Symplectic Elements.
The University of Colorado VIVO site is now affected by this problem. We have thousands of publications from various sources using different vocabularies for research terms. We would like to import these publications and their terms into our VIVO. As a university that serves many disciplines, how do we standardize which terms we will use? At first glance, the amount of manual curation needed to do this properly seems daunting.
The question then becomes what the use cases are for using Research Areas and harmonizing the terms within a site or across multiple sites. An obvious case would be a journalist searching an institution for experts within a certain subject area. The journalist might not know specifically what the subject area is, so it’s important to provide a top-level view of general subject areas and allow them to drill down. This also might imply that the vocabularies utilize a SKOS-type broader/narrower implementation. In this case, each of the broader and narrower terms also needs to be harmonized with other vocabularies.
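The drill-down use case can be illustrated with a minimal sketch, assuming the broader/narrower hierarchy is available as a simple mapping from each term to its narrower terms. The term labels, researcher names, and function names below are hypothetical and chosen only for illustration; they are not taken from any real vocabulary or from VIVO.

```python
def narrower_closure(term, narrower):
    """Return the term plus every term reachable through narrower links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the hierarchy
        seen.add(current)
        stack.extend(narrower.get(current, []))
    return seen


def experts_for(term, narrower, expert_areas):
    """List experts tagged with the term or with any of its narrower terms."""
    covered = narrower_closure(term, narrower)
    return sorted(
        name for name, areas in expert_areas.items() if covered & set(areas)
    )


# Hypothetical hierarchy: each key lists its direct narrower terms.
narrower = {
    "Industry": ["Textile Industry", "Food Industry"],
    "Textile Industry": ["Cotton Processing"],
}

# Hypothetical experts and the research areas they are tagged with.
expert_areas = {
    "Researcher A": ["Cotton Processing"],
    "Researcher B": ["Food Industry"],
    "Researcher C": ["Astronomy"],
}

top_level = experts_for("Industry", narrower, expert_areas)
drilled = experts_for("Textile Industry", narrower, expert_areas)
print(top_level)
print(drilled)
```

Starting from the general term returns everyone tagged anywhere beneath it, and each step down the hierarchy narrows the list. The sketch assumes the terms have already been harmonized into a single hierarchy.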
Solving this problem is crucial, especially if one wants to traverse multiple machine-readable VIVO sites to locate items that might share a similar research area. One potential solution is for a VIVO site to import a crosswalk list of same-as statements between different research vocabularies, or to utilize a lookup service. Other options include a federated vocabulary harmonizing service where all VIVOs register and have their taxonomies mined in order to be synced with a master service, perhaps something similar to a distributed blockchain service. A federated approach might be preferable because many, if not most, VIVO sites require some sort of autonomy regarding the use of terms and their associations with other objects. Hence it’s imperative that the VIVO application continues to offer this flexibility.
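The crosswalk-list option can be sketched as follows, assuming the crosswalk arrives as pairs of equivalent term identifiers. The identifiers (`mesh:TextileIndustry`, `agrovoc:TextileIndustry`, `fast:TextileIndustry`) are hypothetical placeholders, not real vocabulary URIs, and the union-find grouping is just one possible way to resolve chains of same-as statements; it is not a feature of VIVO.

```python
from collections import Counter


def build_canonical_map(same_as_pairs):
    """Group terms linked by same-as pairs and map each to one representative."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the alphabetically smaller id so the result is deterministic.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {term: find(term) for term in list(parent)}


def facet_counts(publication_terms, canonical):
    """Count publications per harmonized concept."""
    counts = Counter()
    for terms in publication_terms.values():
        # A set avoids counting one publication twice for the same concept.
        for concept in {canonical.get(t, t) for t in terms}:
            counts[concept] += 1
    return counts


# Hypothetical same-as statements between vocabularies.
crosswalk = [
    ("mesh:TextileIndustry", "agrovoc:TextileIndustry"),
    ("agrovoc:TextileIndustry", "fast:TextileIndustry"),
]

# Hypothetical publications tagged from different source vocabularies.
publications = {
    "pub1": ["mesh:TextileIndustry"],
    "pub2": ["agrovoc:TextileIndustry"],
    "pub3": ["fast:TextileIndustry", "mesh:TextileIndustry"],
}

canonical = build_canonical_map(crosswalk)
counts = facet_counts(publications, canonical)
print(counts)
```

With the crosswalk applied, the three publications collapse onto a single facet entry instead of one entry per source vocabulary. A site that needs autonomy over its terms could maintain its own pair list and still use the same resolution step.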
This workshop will discuss and weigh the various options for modeling and displaying this data, in machine-readable and HTML formats, and align these options with the needs of typical VIVO sites, taking into account the governance mechanisms and use cases for these various VIVO scenarios. This is a very broad topic, so discussion will be scoped toward the objective of having a VIVO site display research areas in a similar fashion to commercial sites like Amazon.