1.3 Open Access and Metadata
The organization and dissemination of OA materials is presently passing through a complex phase. The major stakeholders of OA infrastructure, such as publishers, researchers, institutes, funders and end users, have different concepts of and expectations from OA systems and services. For example, governments (as funding agencies) want to ensure wide availability of research publications in the public domain, and many governments are developing policies in this direction (please refer to Module 3, Unit 1 for further details). End users want to know what research is accessible to them, and to what extent they can reuse accessible contents. Another problematic zone is 'hybrid journals', in which some of the articles are freely available (authors pay to make their papers freely available to readers) while the rest of the journal content is available only against subscription fees. This varied environment limits: i) effective resource discovery; ii) clarity in reuse rights; and iii) the possibility of adopting standards to bridge the requirements of stakeholders. To date, no standardized bibliographic metadata schema has metadata elements to specify whether a given article is openly accessible and what reuse rights are associated with it.
1.3.1 Policy Framework
An OA service (whether Gold or Green) needs to develop a policy framework for metadata, in view of the importance of metadata in OA discussed in previous sections. The policy framework for metadata needs to address issues such as: i) Who can enter or edit metadata? ii) Which metadata standards are to be followed? iii) Are different metadata schemas required for describing different types of documents? iv) Does the repository system allow metadata harvesting by service providers? v) Which protocols should the OA system support for metadata harvesting? As per the OpenDOAR database (OpenDOAR, 2013), more than 84% of repositories have not defined a metadata policy (Figure 1). Analysis of ROARMAP also shows that most OA repositories (OARs) have no metadata policy, although almost all OARs clearly state that anyone may access the metadata.
An efficient OA service must work on the basis of a standard metadata policy. Let us discuss metadata policy requirements for organizing OA resources one by one. The policy issues related to metadata are discussed on the basis of recommendations of OA experts and subsequent analysis of ROARMAP database.
Policy Issue I: Who can create or edit metadata?
OA experts' view: Many OA experts suggest (Graaf & Eijndhoven, 2008; Barton & Walker, 2002) that contributors of open contents may enter simple descriptive metadata such as creator, title and keywords. In case of difficulty they may seek the help of intermediaries such as library professionals. Some researchers and OA service providers (DINI, 2003; Pinfield, Gardner & MacColl, 2002) have advocated that standardized metadata should be created and provided for exchange and harvesting services.
ROARMAP analysis: Only a few OARs (see Table 1 for an illustrative list) have suggested that metadata should be created and provided by authors or eligible contributors. Library staff may, if necessary, edit the metadata or create additional metadata.
Policy Issue II: What metadata standards are to be used?
OA experts' view: OAR systems differ widely in selecting and applying metadata schemas to support the ingest, management, and use of data in their collections. Most researchers recommend qualified Dublin Core as the general metadata standard for organizing OA resources (Graaf & Eijndhoven, 2008; Gibbons, 2004), but some researchers are of the opinion that OA service providers should employ domain-specific metadata for organizing specialized contents such as ETDs and learning objects.
ROARMAP analysis: The study also makes clear that almost all OARs use Dublin Core standards. A few repositories have implemented additional or extended metadata schemas for domain-specific datasets (see Table 1 for an illustrative list).
Policy Issue III: How to standardize subject access metadata elements?
OA experts' view: Experts and OA service providers (DINI, 2003; Nolan & Costanza, 2006) recommend that standard vocabularies should be adopted for populating the subject access fields of the metadata schema in use.
ROARMAP analysis: The analysis of the dataset shows that only a few OA service providers use a controlled vocabulary to populate the subject access metadata element (i.e. DC.Subject) and so standardize subject indexing. Other metadata elements also require the use of authority lists, for example language codes for DC.Language.
Policy Issue IV: Should metadata sets be open for harvesting?
OA experts' view: Most OA researchers are in favor of metadata harvesting to support the development of federated search interfaces (Hirwade & Hirwade, 2006; Singh, Pandita & Dash, 2008; Sarkar & Mukhopadhyay, 2010). OA experts are also of the opinion that Gold and Green OA systems must be compliant with the OAI-PMH standard to support metadata harvesting.
ROARMAP analysis: A detailed report of the present statistics on OAI-PMH compliant repositories is given in Table 2.
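To make the harvesting requirement concrete, the following is a minimal sketch of how a service provider might compose an OAI-PMH request URL. The base URL is hypothetical and the helper name `build_oai_request` is introduced here only for illustration; the verbs and the `metadataPrefix` and `from` arguments are those defined by OAI-PMH 2.0. The sketch only builds the URL and does not contact any repository.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_oai_request(base_url, verb, **arguments):
    """Return the GET URL for an OAI-PMH request (illustrative helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        # 'from' is a Python keyword, so callers pass 'from_' instead.
        params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository base URL; ask for Dublin Core records
# added or changed since 1 January 2013.
url = build_oai_request(
    "https://repository.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
    from_="2013-01-01",
)
print(url)
```

A harvester would send this URL as an HTTP GET request and parse the XML response; a repository that does not answer such requests is not open for harvesting in the sense discussed above.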
Policy Issue V: If open for harvesting, what should be the metadata re-use policy?
OA systems need to follow a policy framework for metadata reuse to resolve issues such as: i) whether harvesting requires prior permission; ii) whether a link or acknowledgement is mandatory; and iii) whether harvesting is open to all or restricted to non-commercial use only. Analysis of ROARMAP shows that only a few OARs have a metadata reuse policy (see Table 2 for an illustrative list). Most of these OARs allow metadata harvesting in any medium without prior permission for not-for-profit purposes. Some OARs add the restriction that metadata must not be re-used in any medium for commercial purposes without formal permission.
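The three reuse questions can be expressed as a small machine-readable policy. The sketch below is only an illustration: the class, field and function names are hypothetical and are not taken from ROARMAP, OpenDOAR or any repository software, and the default values simply mirror the most common policy described above (free not-for-profit reuse, formal permission for commercial reuse).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataReusePolicy:
    # All field names are illustrative assumptions.
    permission_required: bool = False          # i) prior permission needed?
    attribution_required: bool = False         # ii) link/acknowledgement mandatory?
    commercial_needs_permission: bool = True   # iii) commercial reuse restricted?


def may_reuse(policy, commercial, has_permission=False, gives_attribution=True):
    """Return True if the intended reuse is allowed under the policy."""
    if policy.attribution_required and not gives_attribution:
        return False
    if policy.permission_required and not has_permission:
        return False
    if commercial and policy.commercial_needs_permission and not has_permission:
        return False
    return True


policy = MetadataReusePolicy()
print(may_reuse(policy, commercial=False))                      # not-for-profit harvester
print(may_reuse(policy, commercial=True))                       # commercial, no permission
print(may_reuse(policy, commercial=True, has_permission=True))  # commercial, with permission
```

Under the default policy the first and third calls are allowed and the second is refused.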
1.3.2 Application Framework
On the basis of the metadata policies discussed in the previous section, a set of recommendations may be drawn up to help in applying metadata standards for organizing OA resources. The major decisions related to OA metadata are listed below:
1) Anyone may access the metadata free of charge;
2) All metadata in the repository should be based on a recognized global standard;
3) A qualified version of the Dublin Core schema will be used as the descriptive metadata standard;
4) Community/domain-specific metadata elements will be used where no suitable element or element refinement exists in a generic schema like DCMES;
5) DCMES is recommended as the generic metadata schema, with respective domain-specific schemas suggested for special objects like ETDs (UK-ETD), learning objects (IEEE-LOM), journal articles (Qualified DCMES) etc. on the basis of a set of standard parameters;
6) Deposit of materials to the OA system requires a minimum set of descriptive information (metadata) to be provided at the point of deposit;
7) Basic metadata will be created by authors or their delegated agents at the time of submission;
8) Library professionals will create additional metadata elements and edit the basic metadata set, if required, to ensure the quality of complete metadata records;
9) Basic cataloging standards (AACR/RDA) are recommended for rendering personal and corporate names;
10) OA systems may allow metadata harvesting and support metadata extraction through the OAI-PMH standard;
11) Metadata elements must support basic retrieval tasks, including an advanced set of search operators;
12) Controlled vocabularies will be used to maintain consistency and to enhance the quality of records exposed to search and browse services;
13) The metadata of withdrawn items shall not be searchable;
14) Appropriate standard lists (e.g. geographic area codes), international standards (e.g. ISO date format), and authority lists (e.g. name authority) may be used to ensure the quality of metadata.
Similarly, a set of recommendations may be drawn up in the area of metadata reuse:
1) The metadata may be re-used in any medium without prior permission for not-for-profit purposes; and
2) The metadata must not be re-used in any medium for commercial purposes without formal permission.
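Several of the descriptive recommendations in the first list (a minimum deposit set, ISO date format, controlled vocabularies and authority lists) lend themselves to automated checks at the point of deposit. The following is a minimal sketch under stated assumptions: the required elements, the tiny subject vocabulary and the language list are placeholders chosen for illustration, and a real repository would define its own. The date check is purely syntactic and does not verify that the month or day is valid.

```python
import re

# Assumed minimum deposit set; each repository defines its own.
REQUIRED_ELEMENTS = ("title", "creator", "date")
# Placeholder controlled vocabulary and language authority list.
SUBJECT_VOCABULARY = {"Open access", "Metadata", "Institutional repositories"}
LANGUAGE_CODES = {"en", "fr", "hi"}
# Accepts YYYY, YYYY-MM or YYYY-MM-DD (syntax only).
ISO_DATE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")


def validate_record(record):
    """Return a list of problems found in a Dublin Core style record."""
    problems = []
    for element in REQUIRED_ELEMENTS:
        if not record.get(element):
            problems.append(f"missing required element: {element}")
    date = record.get("date")
    if date and not ISO_DATE.fullmatch(date):
        problems.append(f"date is not in ISO 8601 form: {date}")
    for subject in record.get("subject", []):
        if subject not in SUBJECT_VOCABULARY:
            problems.append(f"subject not in controlled vocabulary: {subject}")
    language = record.get("language")
    if language and language not in LANGUAGE_CODES:
        problems.append(f"unknown language code: {language}")
    return problems


good = {
    "title": "Metadata practices in open repositories",
    "creator": ["Doe, Jane"],
    "date": "2013-05-17",
    "subject": ["Metadata"],
    "language": "en",
}
bad = {
    "title": "Untitled deposit",
    "date": "17/05/2013",
    "subject": ["OA stuff"],
    "language": "english",
}
print(validate_record(good))
print(validate_record(bad))
```

The first record passes with an empty problem list; the second is flagged for a missing creator, a non-ISO date, an uncontrolled subject term and an unrecognized language code. In practice, library staff could use such a report when editing author-supplied metadata.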
1.3.3 Usage Metadata
Another important aspect of the OA metadata landscape is usage metadata. There are many standards and initiatives for describing and storing usage metadata in the domain of OA, such as SURE (Statistics on the Usage of Repositories), PIRUS (Publishers and Institutional Repository Usage Statistics), OA-Statistik, NEEO (Network of European Economists Online), KE-USG (Knowledge Exchange Usage Statistics Guidelines), and OpenAIRE, that specify metadata formats for recording information about usage events. Usage metadata may serve as an important value-added service for users of open contents. Apart from the contributors and users of open access resources, funding agencies are also interested in the availability of integrated usage data to measure research impact and to analyze trends over time. For example, PIRUS suggests including the following metadata elements to record usage of OA resources: i) either print ISSN or online ISSN; ii) article version, where available; iii) article DOI; iv) online publication date or date of first successful request; and v) monthly count of the number of successful full-text requests. Other optional but desirable metadata elements are: i) journal title; ii) publisher name; iii) platform name; iv) journal DOI; v) article title; and vi) article type. Item-level granularity in PIRUS is achieved through two additional metadata elements: the article DOI and ORCID as author identifier.
Most of these initiatives are based on the OpenURL ContextObject format. This format includes six elements: i) Referent (the item that was used, e.g. a paper deposited in a repository); ii) Referring Entity (the "atomic" entity within the referrer that contains the reference to the referent, e.g. a Google search hit); iii) Requester (the user or client requesting the referenced item, identified by its IP address); iv) Service Type (the action that is associated with the requested item, e.g. download or metadata view); v) Resolver (the service that holds or resolves to the requested item, e.g. the OAI base-URL of the repository); and vi) Referrer (the web service that provides a reference to the referent, e.g. the Google search engine).
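As an illustration of how the six ContextObject elements relate to the PIRUS monthly count, the sketch below models a usage event as a simple record and counts successful full-text requests per article DOI per month. It is not the OpenURL ContextObject serialization or the PIRUS code: the class name, field names, the service type labels "fulltext" and "abstract", the timestamp field and all sample values are assumptions made for the example. Filtering of robots or repeated clicks, which a production usage-statistics service would need to consider, is omitted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    referent: str          # item used, here identified by article DOI
    referring_entity: str  # e.g. the search hit that pointed to the item
    requester: str         # client, identified by IP address
    service_type: str      # assumed labels: "fulltext" or "abstract"
    resolver: str          # e.g. OAI base URL of the repository
    referrer: str          # web service that supplied the reference
    timestamp: str         # assumed ISO 8601 string, e.g. "2013-05-17T10:15:00"


def monthly_fulltext_counts(events):
    """Count full-text requests per (DOI, YYYY-MM) pair."""
    counts = Counter()
    for event in events:
        if event.service_type == "fulltext":
            month = event.timestamp[:7]  # "YYYY-MM" prefix of the timestamp
            counts[(event.referent, month)] += 1
    return counts


# Hypothetical sample data; DOIs, URLs and IP addresses are placeholders.
RESOLVER = "https://repository.example.org/oai"
events = [
    UsageEvent("10.1234/example.1", "search hit 1", "192.0.2.10", "fulltext",
               RESOLVER, "search engine", "2013-05-17T10:15:00"),
    UsageEvent("10.1234/example.1", "search hit 7", "192.0.2.11", "fulltext",
               RESOLVER, "search engine", "2013-05-20T08:00:00"),
    UsageEvent("10.1234/example.1", "search hit 2", "192.0.2.12", "abstract",
               RESOLVER, "search engine", "2013-05-21T09:30:00"),
    UsageEvent("10.1234/example.2", "search hit 3", "192.0.2.10", "fulltext",
               RESOLVER, "search engine", "2013-06-02T14:45:00"),
]
counts = monthly_fulltext_counts(events)
for (doi, month), n in sorted(counts.items()):
    print(doi, month, n)
```

Keeping the DOI as the referent is what gives the count article-level granularity; the metadata view in the sample data is recorded as an event but does not contribute to the full-text total.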