1.6 Metadata Modeling
As a library professional, you know that the Paris Principles and the ISBDs have served as the bibliographic foundation for almost all national and international cataloguing codes. But the environment within which cataloguing principles and standards operate has changed fundamentally because of the emergence of computerized processing of bibliographic data, the growth of large-scale databases, the increasing use of shared cataloguing programmes, and the proliferation of digital resources on the Web and in libraries. Such a situation requires a general framework to assist in the understanding and further development of conventions for bibliographic description. Models for bibliographic description provide a logical base for correlating cataloguing rules with the data encoding structure. A model for bibliographic description endeavors to address complex bibliographic problems and provides a strong foundation to support retrieval, presentation and transfer systems in an integrated environment.
1.6.1 Bibliographic Data Models
Some of the groundbreaking works towards developing bibliographic data models are discussed here to show you how such models are applied in resource description.
A. UKOLN’s Analytical Model of Collections and their Catalogues
This model was developed in 2000 by the United Kingdom Office for Library and Information Networking (UKOLN) under the Research Support Libraries Programme (RSLP). It is applicable to physical and digital collections of all kinds, including library, art and museum materials. The model identifies three main entities and their associated attributes: Objects (Content, Item, Collection, Location, Content-Component, Item-Component); Agents (Creator, Producer, Collector, Owner, Administrator); and Indirect-Agents (Creator’s Assignee, Producer’s Assignee). It also prescribes two types of relationships: internal relationships (between the entities within a Collection Description) and external relationships (between Collection Descriptions themselves). The model tries to clarify the points at which rights and conditions of access and use become operable, and attempts to act as a bridge linking collections and their users.
B. IFLA Models
Between 1998 and 2010, IFLA developed three related bibliographic data models to upgrade the standards of resource description for the digital environment. FRBR (Functional Requirements for Bibliographic Records) appeared first in 1998, followed by FRAD (Functional Requirements for Authority Data) in 2009 and FRSAD (Functional Requirements for Subject Authority Data) in 2010. All three models are based on entity-relationship (E-R) data modeling and can be applied to print as well as digital resources. FRBR deals with the E-R modeling of bibliographic data, FRAD with authority data and FRSAD with subject authority data; together they aim to manage bibliographic and authority data in tandem.
The FRBR model is a conceptual model that was developed by an IFLA group of experts from 1992 to 1997 and published in 1998. The model uses entity-relationship techniques to identify the entities, attributes and relationships of the bibliographic universe. It also identifies the relevance of each attribute and relationship to the generic tasks performed by users of bibliographic data. In the FRBR model, the entities of the bibliographic universe are divided into three groups: i) the first group includes the products of intellectual or artistic endeavor; ii) the second group comprises the entities responsible for the intellectual or artistic content; and iii) the third group identifies entities that serve as the subjects of intellectual or artistic endeavor.
- Group I: The entities of this group represent the different aspects of user interests in the products of intellectual or artistic endeavor. These are: Work (a distinct intellectual or artistic creation); Expression (the intellectual or artistic realization of a work); Manifestation (the physical embodiment of an expression of a work); and Item (a single exemplar of a manifestation).
- Group II: The entities in the second group represent those responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship of the entities in the first group. The entities in this group are Person (an individual) and Corporate Body (an organization or group of individuals and/or organizations).
- Group III: The entities of this group represent an additional set of entities that serve as the subjects of works. They are Concept (an abstract notion or idea), Object (a material thing), Event (an action or occurrence), and Place (a location).
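The Group I hierarchy described above can be sketched as a small data structure. This is only an illustrative sketch in Python, not part of the FRBR specification: the class attributes (title, language, publisher, year, identifier), the helper count_items and all sample values are assumptions chosen for the example, and Group II and Group III entities are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar of a manifestation (e.g. one library copy)."""
    identifier: str  # hypothetical attribute, e.g. a barcode


@dataclass
class Manifestation:
    """The physical embodiment of an expression of a work."""
    publisher: str
    year: int
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """The intellectual or artistic realization of a work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """A distinct intellectual or artistic creation."""
    title: str
    creator: str  # stands in for a Group II entity (person or corporate body)
    subjects: List[str] = field(default_factory=list)  # stand in for Group III entities
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Walk Work -> Expression -> Manifestation -> Item and count the copies."""
    return sum(
        len(manifestation.items)
        for expression in work.expressions
        for manifestation in expression.manifestations
    )


# Illustrative data only: publishers, years and barcodes are made up.
hamlet = Work(
    title="Hamlet",
    creator="William Shakespeare",
    subjects=["Revenge"],
    expressions=[
        Expression(
            language="English",
            manifestations=[
                Manifestation("Example Press", 2001, [Item("B001"), Item("B002")]),
            ],
        ),
        Expression(
            language="German",
            manifestations=[
                Manifestation("Beispiel Verlag", 1995, [Item("B003")]),
            ],
        ),
    ],
)

print(count_items(hamlet))
```

The nesting mirrors the chain of relationships in Group I: one work is realized through one or more expressions, each expression is embodied in manifestations, and each manifestation is exemplified by items.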
FRAD (Functional Requirements for Authority Data) is the authority data model developed by IFLA. It is a conceptual model and a companion document to the Functional Requirements for Bibliographic Records (FRBR) conceptual model. A library catalogue supports two major groups of functions: i) the finding function; and ii) the collocation function. Collocation functions require the support of authority control. The typical functions of authority control are as follows: 1) document decisions; 2) serve as a reference tool; 3) control the forms of access points; 4) support access to the bibliographic file; and 5) link bibliographic and authority files.
FRAD includes additional attributes for each of the Group I, II and III entities, as well as a new Group II entity (Family). It also includes entities intended to support the authority control process (Name, Identifier, Controlled Access Point, Rules, and Agency). In addition to the expanded entities and attributes, FRAD defines a different set of user tasks for authority data than FRBR did for bibliographic data. Here, the user tasks are Find, Identify, Contextualize, and Justify. The FRAD model, together with FRBR, serves as the foundation of the content standard Resource Description and Access (RDA).
In 2010 the FRSAR Group finalized the FRSAD model (Functional Requirements for Subject Authority Data), which was published in English both as a printed book and online. This model focuses on the relationships between a work, its subjects, the way these subjects are named, and the information contained in indexing schemes about both the concepts and the appellations that refer to them.
C. XML Organic Bibliographic Information Schema (XOBIS)
XOBIS attempts to restructure bibliographic and authority data in a consistent and unified manner using Extensible Markup Language (XML). It was developed at Lane Medical Library, Stanford University, under the Medlane Project. The preliminary (alpha) version of XOBIS appeared in September 2002. XOBIS prescribes a tripartite record structure in which each record consists of three required components: Control Data (metadata about the record), Principal Elements (ten categories of data that provide bibliographic access and authority control for a wide variety of resources) and Relationships (an element that accommodates links between any pair of principal elements). The basic structure of XOBIS is illustrated in Figure 4.
XOBIS is an experimental model for resource description expressed as an XML schema. The aim of XOBIS is to integrate the digital world and the print world as far as resource description is concerned. It is generally used to retrieve MARC records from remote library catalogs, including OCLC’s WorldCat, to facilitate copy cataloging and the sharing of bibliographic records.
1.6.2 Applications of RDF and XML
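The Network Working Group of the Internet Society issued a memo in December 1999 (RFC 2731) on encoding DC metadata in HTML. As per this memo, the general syntax is <meta name="PREFIX.ELEMENT_NAME" content="ELEMENT_VALUE">, where the prefix for Dublin Core elements is conventionally DC.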
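The meta-tag convention of RFC 2731 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name dc_meta_tags and the sample record are hypothetical, and the link element that points the DC prefix at the DCMES 1.1 namespace follows the schema.PREFIX convention as we understand it from the RFC.

```python
from html import escape

# Namespace URI for DCMES version 1.1, as recommended by DCMI.
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"


def dc_meta_tags(record):
    """Render (element, value) pairs as HTML meta tags in the DC.element style.

    `record` is a list of pairs so that repeated elements keep their order.
    """
    # Declare which schema the "DC" prefix refers to.
    lines = ['<link rel="schema.DC" href="%s">' % DC_NAMESPACE]
    for element, value in record:
        # Escape &, <, > and quotes so that the value is a safe attribute value.
        lines.append(
            '<meta name="DC.%s" content="%s">' % (element, escape(value, quote=True))
        )
    return "\n".join(lines)


# Hypothetical record; the values are made up for illustration.
sample = [
    ("title", "Example University Home Page"),
    ("creator", "Example University"),
    ("language", "en"),
]

print(dc_meta_tags(sample))
```

The generated lines are meant to be placed inside the head element of the HTML page being described.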
The current activities of W3C are centered on the development and standardization of two important projects, XML and RDF. The Extensible Markup Language (XML) is a data format for structured document interchange on the web. XML permits web authors to add tags as necessary. It is intended to make the use of SGML on the web easy and straightforward. The extensible nature of XML makes the encoding of metadata easier and more flexible. But this strength of XML leads to a serious problem in standardization: anyone can create a set of tags for describing resources, which reduces the scope for harmonizing the various metadata schemas. Thus, along with XML, the web also requires a unifying architecture to accommodate different metadata schemas from various communities. The Resource Description Framework (RDF) is a W3C initiative in this direction.
DC metadata (IETF RFC 2413) and RDF are two distinct specifications, but the two communities have a number of members in common and have evolved side by side. In fact, RDF is based on the Warwick Framework, a major recommendation of the Second DC Workshop at Warwick in 1996. The co-evolution of DCMES and RDF forms a natural complement within the web's greater metadata architecture. DC has provided a semantic focus for RDF and, in turn, RDF has clarified the importance of a formal underlying data model for DC metadata.
RDF is a meta-language for representing information, and serves as a key piece of the technical framework underlying Semantic Web activities. RDF expresses each statement, or fact, as a “triple” with three parts: the subject is what is being described, the predicate (i.e. the verb) indicates which property of the subject is being described by the statement, and the object is the value of that property.
Viewed as a graph, the subject is what is at the start of an edge, the predicate is the type of the edge (its label), and the object is what is at the end of the edge. The subjects, predicates, and objects in RDF always indicate things: concrete things or abstract concepts. The things that names denote are called resources, nodes or entities. Predicates indicate relations between two things. RDF also specifies that the names of subjects, predicates, and objects must be expressed as Uniform Resource Identifiers (URIs), although an object may also be a literal value such as a string.
RDF uses XML namespaces to identify metadata schemas. An XML namespace is a collection of names, identified by a URI reference, that are used in XML documents as element types and attribute names. As per the recommendation of DCMI, the URI of the namespace for all DCMI elements that comprise DCMES version 1.1 is http://purl.org/dc/elements/1.1/. Therefore, within RDF documents, it may appear as xmlns:dc="http://purl.org/dc/elements/1.1/".
To recap, an expression in RDF is a “triple” consisting of a subject (the thing being described, e.g. the sky), a predicate (an element or field describing the subject, e.g. colour), and an object (the value that the predicate takes on, e.g. blue). A set of RDF triples is called an RDF graph. Let’s see an example (Table 6) showing the representation of the Web site of the University of Burdwan by using DCMES as schema and RDF as framework.
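Independently of the table referred to above, the triple model itself can be sketched in a few lines of Python. This is an illustrative sketch only: the resource URI http://www.example.org/ and the property values are hypothetical, the helper names (literal, to_ntriples, objects_of) are our own, and the output uses the simple line-based N-Triples notation rather than RDF/XML.

```python
# Namespace URI for DCMES version 1.1; predicates are built by appending element names.
DC = "http://purl.org/dc/elements/1.1/"


def literal(text):
    """Format a plain string as an N-Triples literal (escaping backslash and quote)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def to_ntriples(triples):
    """Serialize (subject URI, predicate URI, formatted object) tuples, one per line."""
    return "\n".join("<%s> <%s> %s ." % (s, p, o) for s, p, o in triples)


def objects_of(triples, subject, predicate):
    """Return every object recorded for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Hypothetical resource and values, chosen only to illustrate the structure.
resource = "http://www.example.org/"
graph = [
    (resource, DC + "title", literal("Example University")),
    (resource, DC + "language", literal("en")),
]

print(to_ntriples(graph))
print(objects_of(graph, resource, DC + "title"))
```

Each tuple is one edge of the RDF graph: the first element is the node where the edge starts, the second is the edge label, and the third is the node or literal value where it ends.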
Almost all advanced repository management software supports RDF-based encoding of DCMES. For example, a record deposited in the EPrints archive software stores its DC metadata elements in the following RDF format (the metadata of a digital resource submitted to EPrints can also be exported in RDF format).
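As a generic illustration of DC metadata wrapped in RDF/XML (not the actual export produced by any particular repository software), the following sketch builds such a document with the Python standard library. The function name dc_to_rdfxml, the record URI and the element values are hypothetical; the two namespace URIs are the standard RDF syntax namespace and the DCMES 1.1 namespace.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the customary prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dc_to_rdfxml(about, elements):
    """Build an rdf:RDF document describing one resource with DC elements.

    `about` is the URI of the resource (the subject of every triple);
    `elements` is a list of (DC element name, value) pairs.
    """
    root = ET.Element("{%s}RDF" % RDF_NS)
    description = ET.SubElement(
        root,
        "{%s}Description" % RDF_NS,
        {"{%s}about" % RDF_NS: about},
    )
    for name, value in elements:
        # Each child element is one predicate; its text is the literal object.
        ET.SubElement(description, "{%s}%s" % (DC_NS, name)).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; the URI and values are made up for illustration.
xml_text = dc_to_rdfxml(
    "http://www.example.org/eprint/1",
    [("title", "An Example Deposit"), ("creator", "Doe, Jane"), ("type", "Article")],
)
print(xml_text)
```

Every dc child of rdf:Description corresponds to one triple whose subject is the URI given in rdf:about.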