DLF Metadata Assessment Working Group

Metadata Assessment Framework and Guidance

Metadata metrics

Determine what metadata quality means to you

Metadata quality is subjective. How you define metadata quality will be unique to the core functions, mission, and needs of your institution. Bruce and Hillmann’s 2004 article, “The Continuum of Metadata Quality,” defines a framework with seven categories of metadata quality (completeness, accuracy, provenance, conformance to expectations, logical consistency and coherence, timeliness, and accessibility) and is noteworthy for influencing the subject’s subsequent exploration in the libraries, archives, and museums (LAM) context. Please see our list of recommended publications for further investigation.

Sample metrics for common metadata quality criteria

The following list presents common quality criteria and assertions and is intended to serve as a prompt for thinking about your metadata. The list is not exhaustive; rather, it provides focus areas for evaluating metadata. Because metadata supports use, there may be other quality dimensions and definitions, not listed here, that would be more meaningful measures for your use case.
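As a concrete illustration, below is a minimal sketch, in Python, of one common metric: per-field completeness, the percentage of records in which a field is present and non-empty. The records and field names are invented examples, not drawn from any particular system.

```python
# A minimal sketch: per-field completeness across a batch of records.
# The records and field names below are invented examples.
records = [
    {"title": "Letter, 1912", "creator": "Doe, Jane", "date": "1912"},
    {"title": "Photograph of Main Street", "creator": "", "date": "1920"},
    {"title": "Oral history interview", "creator": "Roe, Richard", "date": ""},
]

for field in ["title", "creator", "date"]:
    filled = sum(1 for r in records if r.get(field, "").strip())
    print(f"{field}: {filled}/{len(records)} records "
          f"({100 * filled / len(records):.0f}%) complete")
```

The same pattern extends to other assertions, such as whether a value conforms to a controlled vocabulary or matches an expected date pattern.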

Determine what to assess

Documentation as a starting point

Well-documented practices and policies increase assessment efficiency and effectiveness. Strategic documentation can improve communication across different communities (creators, stakeholders, users) and provide a record of decisions. Documentation about local metadata practices and policies can take many forms and may be known as guidelines, standards, data dictionaries, or application profiles. In the following sections we use the term “application profile” to refer to this type of documentation.

Your application profile

An application profile is a document that outlines your institutional/consortial metadata schema practice. It defines your metadata elements and properties, and delineates obligations and constraints for use. An application profile also establishes context for metadata implementers and aggregators. The document provides a human-readable summary of your schema’s characteristics, which is critical for metadata assessment planning, review, and revision. Application profiles could be used for non-MARC (e.g., DPLA) or even MARC metadata (e.g., BIBCO Profiles).

Your application profile establishes a foundation for the development of your approach to metadata assessment by clearly specifying requirements, ranges (e.g., controlled vocabularies and/or data types), and permissible cardinality. Application profiles can also include how external standards and schemas map to your institutional metadata.

For example, an application profile can describe each element in the profile with components such as the element’s name and definition, its obligation (required, recommended, or optional), its cardinality, the controlled vocabularies or data types that constrain its values, and its mappings to external schemas, as in the sketch below.
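The sketch below, written in Python purely for illustration, encodes one hypothetical element description as a data structure and checks a single record against it. The element, obligation values, vocabulary, and record are all invented; they are not drawn from any institution’s actual profile.

```python
# A hypothetical application-profile entry for a single element,
# expressed as a Python dict for illustration only.
profile_entry = {
    "element": "dc:subject",
    "definition": "Topic of the resource",
    "obligation": "required",        # required | recommended | optional
    "cardinality": "1-n",            # at least one value expected
    "controlled_vocabulary": ["Art", "History", "Science"],  # invented
    "maps_to": {"MODS": "subject/topic"},
}

def check_record(record: dict, entry: dict) -> list[str]:
    """Return a list of problems found for one element in one record."""
    problems = []
    values = record.get(entry["element"], [])
    if entry["obligation"] == "required" and not values:
        problems.append(f"{entry['element']} is required but missing")
    vocab = entry.get("controlled_vocabulary")
    if vocab:
        for v in values:
            if v not in vocab:
                problems.append(f"{v!r} is not in the controlled vocabulary")
    return problems

print(check_record({"dc:subject": ["Art", "Cooking"]}, profile_entry))
# ["'Cooking' is not in the controlled vocabulary"]
```

In practice such entries often live in a spreadsheet or table rather than code; the point is that each element carries enough structure to be checked mechanically.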

Whether creating an application profile from scratch or updating a legacy profile, it’s helpful to review the application profiles of other institutions, especially those with similar collections and/or functional requirements. Also, while your application profile can be highly tailored to the metadata elements, controlled vocabularies, or functional requirements of your institution, it’s important to be aware of metadata standards within the cultural heritage community (or the communities with which your institution is associated) and, where it makes sense to do so, to align local practices with community standards. Examples of metadata profiles and mappings between common metadata schema are available in the Metadata Application Profile (MAP) Clearinghouse, a project maintained by the DLF AIG Metadata Assessment Working Group.

Considerations for prioritizing assessment

How you prioritize assessment depends on your application profile, technical and human resources, institutional strategic goals, and immediate needs. Below are some sample considerations for prioritizing your approach:

Keeping things simple

The best assessment approach is one that you’re able to put into practice to meet your objectives. There are many well-documented assessment approaches that use weighted scoring and algorithms; some may suit your technological capabilities and assessment goals. A couple of examples from our Zotero citations list include:

If the intricacies of weighted scoring loom as obstacles to progress, use an approach that’s most readily implemented to kickstart your metadata assessment and get work done.
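For a sense of what weighted scoring involves, here is a minimal sketch in which each field carries an importance weight and a record’s score is the weighted share of its fields that are filled. The fields and weights are invented for illustration and are not taken from any published scoring model.

```python
# A minimal sketch of weighted scoring: fields deemed more important
# contribute more to a record's overall score. Weights are invented.
weights = {"title": 3, "creator": 2, "date": 2, "description": 1}

def record_score(record: dict) -> float:
    """Weighted share of fields that are present and non-empty (0.0-1.0)."""
    total = sum(weights.values())
    earned = sum(w for field, w in weights.items()
                 if record.get(field, "").strip())
    return earned / total

print(record_score({"title": "Map of Ohio", "date": "1850",
                    "creator": "", "description": ""}))
# 0.625 -> title (3) + date (2) out of a possible 8
```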

Get your data

Preparation

You’ll need access to your metadata in a format suitable for analysis. Some systems include functionality for exporting data from a user interface; others do not, and you may need to work with your developers to extract the data directly from your database. It’s helpful to know ahead of time what data format you’ll need, which will depend on the tools and techniques you’ll use to assess your metadata. While this document’s purpose is to address metadata assessment, data format decisions also affect remediation work in the cycle of metadata review and revision.

Data formats

If your data is “locked” into specific formats or systems, it may limit your ability to use different types of assessment tools and procedures. Use common, structured formats (like the ones described below) and extract the data in a way that minimizes risk of unintentional modification.

If you are exporting from a database, you may want your data in a format that you can iterate through with a script, such as JSON, XML, or a character-delimited format (CSV, TSV). If you’ll be analyzing your data in a spreadsheet or in a tool like OpenRefine (see the next section), you may prefer something flatter and less hierarchical, like CSV. Keep in mind that some file formats, like Microsoft Excel (.xls, .xlsx), contain embedded formatting that can make them more difficult to process in analysis and refinement tools; embedded formatting can also cause problems when (re)importing into a system.
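As a minimal sketch using only Python’s standard library, the following iterates over records exported as CSV and as JSON; the file names and the “title” field are hypothetical stand-ins for your own export.

```python
import csv
import json

# Hypothetical export files; substitute your own paths and field names.
with open("export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):      # each row is a dict of strings
        print(row.get("title", ""))

with open("export.json", encoding="utf-8") as f:
    for record in json.load(f):        # assumes a top-level JSON array
        print(record.get("title", ""))
```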

Data export considerations

Metadata assessment is valuable as an iterative process over time. The following considerations and documentation recommendations aim to ensure that reproducibility and transparency are part of your data export process.
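One lightweight way to build in that reproducibility, sketched below with invented values, is to save a small provenance note alongside each export that records when, from where, and how the data was pulled, so a later export can be repeated and compared.

```python
import json
from datetime import date

# A hypothetical provenance note saved next to the exported data.
# All values are examples; adapt the fields to your own workflow.
export_note = {
    "exported_on": date.today().isoformat(),
    "source_system": "repository database",   # e.g., your DAMS or ILS
    "query_or_scope": "all records in Collection X",
    "format": "CSV",
    "record_count": 1542,                      # invented count
    "exported_by": "metadata librarian",
}

with open("export-note.json", "w", encoding="utf-8") as f:
    json.dump(export_note, f, indent=2)
```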

Select the right tools

The ideal tool or toolset is one within technical reach that meets your purposes and aligns with your budget. There are many options out there. Fortunately, there are resources available to help with selecting and using new tools.

The DLF Metadata Assessment Environmental Scan identified a variety of tools and sorted them into categories that may match your metadata assessment project needs. An updated overview of the tools is forthcoming.

As you review potential tools, revisit the factors that shaped your assessment approach. Also, consider how you will be getting your data; different tools may be suitable depending on how you can export and transform it. When evaluating possible tools, think about the tools and methodologies you are already familiar with and how much time you might have to learn a new software program or acquire a new skill. Metadata assessment is a good opportunity to begin learning some code, or to expand your coding experience.

Supporting tool use

You’ll likely need to account for the following resources when working with a new tool for metadata assessment:

In addition to the three resource considerations mentioned above, there is a fourth key resource: help when you need it.

It is inevitable that once you plunge enthusiastically into metadata assessment, you’ll find you need a bit of help. Plan ahead. Some tools have more extensive documentation or more active communities than others; these resources may be helpful or daunting depending on your skill level. For a resource that requires no metadata or technical expertise, we recommend the tool-agnostic DLF Metadata Support Group, which provides a friendly community on Slack that is “a place to share resources, strategies for working through some common metadata conundrums, and reassurances that you’re not the only one that has no idea how that happened.” Identifying resources to support your tool use before you start will provide a foundation for finding solutions when you encounter issues down the road.

Documenting assessment

Documentation is an important part of metadata quality analysis.