3.6 OAI-PMH Mechanism

A harvester is a client application operated by a service provider to collect metadata from repositories. Repositories are accessible over the network by means of six OAI-PMH requests (popularly called OAI verbs), which act as a content negotiation mechanism between data providers (holders of metadata) and service providers (gatherers of metadata). There are thus two classes of participants in the OAI-PMH framework (Lagoze & Sompel, 2003): on one side the service provider, powered by a harvester or harvesting software, and on the other side the data provider, backed by a repository. The OAI-PMH is a lightweight standard protocol for harvesting metadata records from ‘data providers’ to ‘service providers’. It provides rules for harvesting the metadata of a repository, not the full content; the content itself must be retrieved from the source repository. Figure 3.7 shows how a service provider issues a request to a data provider.

Service Providers use metadata harvested via the OAI-PMH as a basis for building value-added services; and

Data Providers administer systems that support the OAI-PMH as a means of exposing metadata.
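As an illustration of the six OAI verbs mentioned above, the following minimal Python sketch builds the GET request URLs a harvester would send to a data provider. The verb names and the `verb` and `metadataPrefix` arguments come from version 2.0 of the protocol; the base URL and the helper name `build_request_url` are assumptions made only for this example.

```python
from urllib.parse import urlencode

# The six request types (verbs) defined by OAI-PMH version 2.0.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)

# Hypothetical base URL of a data provider (an assumption for illustration).
BASE_URL = "https://repository.example.org/oai"


def build_request_url(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request (illustrative helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    # Every request carries the verb plus any verb-specific arguments.
    query = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(query)}"


if __name__ == "__main__":
    # Ask the repository to describe itself.
    print(build_request_url(BASE_URL, "Identify"))
    # Ask for all records in unqualified Dublin Core.
    print(build_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

The sketch only constructs URLs; a real harvester would also send the request, handle errors, and follow resumption tokens for long result lists.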

Any one of the following harvesters can be used to harvest metadata from data providers for service providers using the six OAI verbs (Sutradhar, 2013):

  • Arc
  • Citebase
  • CYCLADES
  • DP9
  • DLESE OAI Software
  • OAI Repository Explorer
  • OAIster
  • OASIC
  • OAIHarvester
  • MeInd
  • METALIS
  • My OAI
  • Perseus
  • Public Knowledge Project-Open Archives Harvester

Repositories are always managed by a data provider, which makes its OAR open to harvesting. OAI-PMH distinguishes between three entities: resource, item and record. A resource is the object that the metadata describes, an item is the repository constituent from which metadata about a resource can be disseminated, and a record is that metadata expressed in a specific metadata format. The service provider sends requests over HTTP, and the data provider responds in XML. Requests are issued using the HTTP GET or POST method. In this mechanism a service provider may fetch OAI-PMH compliant metadata records from different data providers, and data providers may also act as aggregators. A repository can act as both service provider and data provider at the same time, or as only one of the two.
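To make the XML side of this exchange concrete, the following Python sketch parses a data provider's response and extracts the record header and Dublin Core title. The XML namespaces are those defined by OAI-PMH 2.0 and Dublin Core; the sample response, the identifier `oai:repository.example.org:42`, and the function name `parse_records` are invented for illustration and do not come from any real repository.

```python
import xml.etree.ElementTree as ET

# Namespaces used in OAI-PMH 2.0 responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative (made-up) response to a GetRecord request.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2021-03-31T10:00:00Z</responseDate>
  <request verb="GetRecord">https://repository.example.org/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:repository.example.org:42</identifier>
        <datestamp>2021-03-30</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Thesis</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return a list of dicts, one per <record> element in the response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        header = record.find("oai:header", NS)
        records.append({
            # The identifier names the item inside the repository.
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            # The record is the item's metadata in one format (here oai_dc).
            "titles": [t.text for t in record.iterfind(".//dc:title", NS)],
        })
    return records


if __name__ == "__main__":
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec["identifier"], rec["datestamp"], rec["titles"])
```

Note that the response carries only metadata about the resource; the resource itself (for example, the full text of the thesis) is not part of the response.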



The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) supports interoperability and the sharing of metadata across an array of institutions. Building large repositories with the OAI-PMH is an advantageous way to bring together scholarly information-bearing objects and cultural resources. However, mixing metadata from a variety of institutions and communities poses difficulties for discovery and interoperability. Open source OAI harvesting tools make this difficult job easier. As mentioned earlier, there is an array of open source harvester software compatible with OAI-PMH version 2.0. The PKP (Public Knowledge Project) harvester, developed at the University of British Columbia, has already proved to be an excellent metadata harvesting and presentation tool. This multi-platform, Web-based tool extracts data and presents it in a coherent manner, and it employs an intuitive user interface to organize data (see Evaluation of Open Source Spidering Tools, note 74). Please see Table 3.6 in section 3.7 for a comparison of open source harvesting software.

Last modified: Wednesday, 31 March 2021, 4:49 PM