7 Collection Development in Digital Library

Jagdish Arora and Kannan P


I. Objectives


The objective of this module is to impart knowledge of the following aspects of digital libraries:

•   Concept of collection development in digital library

•   Various digital materials and their sources

•   Digitization issues and benefits

•   Selection criteria to evaluate digital resources

•   Licensing issues and pricing models for electronic resources



II. Learning Outcomes


After going through this lesson, you will understand the significance of identifying and selecting electronic resources, and the criteria used to select them. You will also gain knowledge of the different pricing models and the issues involved in licensing e-resources for your library.



III. Structure 


1. Introduction

2. Collection Development

3. Digital Collection

4. Identification

5. Digitization

5.1. Benefits of Digitization

6. Selection of Electronic Resources

6.1. Selection Criteria and Evaluation of Electronic Resources

6.1.1. Content

6.1.2. Functionality and Reliability

6.1.3. Technical Support

6.1.4. Vendor support

6.1.5. Pricing Model

Print + E Model

Electronic Only

Full-time Equivalent Models

Concurrent-Users Model

Perpetual Access v/s Annual Lease

Back-file Access

Document Delivery and Pay-Per-View Models

6.2. Licensing Consideration

6.2.1. Access Concern

6.2.2. Copyright and Fair Use

6.2.3.  Flexibility and Enhancement

6.2.4.  Legal Issue

7. Review of Digital Resources

8. Summary





1.    Introduction 


The most important component of a digital library is the digital collection it holds or has access to. The viability and usefulness of a digital library depend on the critical mass of its digital collection. The digital collection infrastructure typically consists of two components: the collection of digital objects, and metadata. The collection is an organized set of digital objects, each object represents an item of digital material, and the metadata provides bibliographic or index information for the digital objects and the collection. While digital objects are the primary documents that users wish to access, it is metadata that facilitates their identification and location through a variety of searching techniques. Depending on the media types it contains, the information content of a digital library may include a combination of structured or unstructured text, numerical data, scanned images, graphics, and audio and video recordings. Different types of resources need to be handled differently in a digital library. This module discusses various aspects of collection development in a digital library, including identification, selection, pricing models, licensing, and review and renewal.
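
The relationship between a collection, its digital objects and their metadata can be illustrated with a small data structure. The following is only a minimal sketch: the class names, fields and the sample record are hypothetical and do not correspond to any particular digital library system or metadata standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metadata:
    """Bibliographic or index information describing one digital object."""
    title: str
    creator: str
    media_type: str  # e.g. "text", "image", "audio", "video"
    keywords: List[str] = field(default_factory=list)


@dataclass
class DigitalObject:
    """The primary document, identified and described by its metadata."""
    identifier: str
    location: str  # hypothetical file path or URL
    metadata: Metadata


@dataclass
class Collection:
    """An organized set of digital objects."""
    name: str
    objects: List[DigitalObject] = field(default_factory=list)

    def find_by_keyword(self, keyword: str) -> List[DigitalObject]:
        """Locate objects through their metadata rather than their content."""
        wanted = keyword.lower()
        return [
            obj for obj in self.objects
            if wanted in (k.lower() for k in obj.metadata.keywords)
        ]


# Hypothetical sample record, for illustration only.
theses = Collection("Theses and dissertations")
theses.objects.append(DigitalObject(
    identifier="thesis-001",
    location="/repository/thesis-001.pdf",
    metadata=Metadata("A Study of Consortia", "A. Author", "text",
                      ["consortia", "licensing"]),
))

matches = theses.find_by_keyword("Licensing")
print([obj.identifier for obj in matches])
```

The search never opens the digital object itself; it works entirely on the metadata, which is the point made above about metadata facilitating identification and location.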



2.    Collection Development 


The Encyclopedia of Library & Information Science (2000) defines a library collection as the “sum total of library materials – books, manuscripts, serials, government documents, pamphlets, catalogues, reports, recordings, microfilm reels, micro cards and microfiche, punch cards, computer tapes etc. that make up the holding of a particular library.”



Libraries are charged with the responsibility of applying judgement to determine what to select, organize, preserve and make accessible to the user community [1]. Irrespective of media type, i.e. print, audio-visual or digital, libraries are primarily responsible for identifying, selecting, organizing, preserving and providing access to resources for their users. Traditional libraries will not become digital libraries, but will rather acquire access to ever-growing digital collections for their users. Collection development is a broad term for the management of collections of information resources, and involves their identification, selection, acquisition, evaluation and sustaining [2]. Collection development in a digital or hybrid library needs pre-defined policies and practices similar to those followed in a traditional library, while keeping in view the issues and complexities that are specific to digital materials. The important processes involved in collection development in the digital environment include identification, selection and licensing of digital materials.


3.    Digital Collection 


Digital collection development can be seen as part of the broader concept of collection development and connected to the organization’s mission on collection development.


The Online Dictionary of Library and Information Science [3] defines a digital collection as “A collection of library or archival materials converted to machine-readable format for preservation or to provide electronic access. Also, library materials produced in electronic formats, including e-zines, e-journals, e-books, reference works published online and on CD-ROM, bibliographic databases, and other Web-based resources”.


Digital data form part of a digital collection. However, there is a substantial difference between born-digital data and library holdings, such as printed books, journal articles, audio and video, that have been digitized as part of a digitization project. Some of the key sources of digital data are:


•    A library’s own holdings that have been digitized

•    Purchased datasets on CD-ROM

•    Purchased datasets that are online

•    Electronic equivalents of printed publications

•    Electronic reference works

•    E-books



4.    Identification


Collection development is a challenging area of activity in a digital or hybrid environment. With recent developments in ICT, individuals and institutions publish their own content on the web and make it accessible free or for a fee. While most traditional publishers make their products available in print as well as in digital format, several new publishers offer their products in digital formats only. Digital content, therefore, may not be available through the well-established distribution and marketing channels that exist for printed publications, which makes it difficult to identify. Moreover, there is no effective bibliographic control over the products and services generated by electronic publishing, and selection tools such as the national bibliographies and union catalogues that exist for printed publications are not available for digital resources.


The identification process for digital resources can be time-consuming and laborious. Digital materials may be software or machine dependent, and specific software or hardware may be required to convert a print collection into digital form. The process of identifying and selecting digital resources, therefore, requires an understanding of the library’s existing computing and network environment, as well as of trends in the development of digital information resources. The library must ensure that it has adequate technical infrastructure to support digitization, and to access or host the digital resources being purchased or leased from the publisher. The technical infrastructure may have to be evaluated in terms of computer hardware and operating system, initial storage capacity and rate of growth, software required to access or manage the resource, frequency of updates, network capabilities, storage and distribution media, cost of maintenance, access limitations (multi-user or stand-alone), site limitations, etc.


5.    Digitization 


The larger part of a digital library’s collection comprises material that is born-digital, such as electronic journals, electronic books and electronic databases. Many libraries also hold large and valuable resources in print or analogue form, such as rare books, theses and dissertations, and audio and video tapes, which are not in digital format but are important candidates for digitization. Digitization is the conversion of an analogue signal or code into a digital signal or code [4]. Digitization of the existing collection plays a vital role in building any digital library.


Digitization in a library may start from an institutional plan for mass digitization, a user request for a specific collection, or special funding for digitization. Whatever the origin of a digitization project, the first and foremost step is the selection of the materials to be digitized. This selection depends on various factors, such as the organization’s mission, the target users, the available resources, and the frequency of use and physical condition of the source materials.


The Harvard University Library Digitization Initiative provides the following guidelines for digitization of images and text materials [5]:


•   Determine whether page images, full text, or both need to be produced to meet project requirements;

•   Assess source materials and plan appropriate preparation, transfer, handling and disposition procedures;

•   Create archival versions of page images and / or full text for long-term storage and production of deliverables as needed; and

•   Create deliverables for distribution as page images and / or full text.


5.1. Benefits of Digitization 


The two major benefits of digitization are improved access and longer-term preservation. Once digitized, material can be accessed simultaneously by many users from different places. Unlike printed resources, digitized resources are not damaged by heavy and frequent use. The advantages of digitization include:


•   Immediate access to high-demand and frequently used items.

•   Rapid access to materials held remotely.

•   The ability to reinstate out-of-print materials.

•   The potential to display materials that are in inaccessible formats, for instance, large volumes or maps.

•   The potential to conserve fragile / precious originals while presenting surrogates in more accessible forms.

•   The potential for presenting a critical mass of materials.


6.    Selection of Electronic Resources 


In general, the majority of the digital resources that a library provides access to come from outside the library, from publishers, aggregators, other libraries, and the websites of organizations and individuals that provide digital content. This content is derived from a wide variety of sources, in many different data types, each with its own set of formats, and is delivered to the library by various methods. As such, the selection of electronic resources for a digital library is a complicated process. Electronic resources raise more issues than print or digitized resources, including different access methods, infrastructure, pricing and licensing, ownership, and formats and standards. The selection process involves receiving requests from users for new electronic resources and feedback on existing ones, examining usage statistics for existing resources, and reviewing the selection against the collection policy of the organization. The library should keep its users informed about new content and services, temporary problems in accessing electronic resources, and trial access being arranged for new resources. Electronic resources represent an increasingly important component of the collection-building activities of libraries.


6.1.  Selection Criteria and Evaluation of Electronic Resources 


The selection criteria that apply in a traditional library also apply to the selection of electronic resources in a hybrid library. These criteria include: relevance to actual or potential users, scope and content, depth of the existing collection in the subject, currency and validity of information, cost-effectiveness, intellectual level and quality of information, authority of the producer, uniqueness and completeness of information, etc. These criteria should be evaluated by the subject expert, or whoever is responsible for the collection, when selecting electronic resources.
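
One way a subject expert might apply several of these criteria consistently across candidate resources is a weighted score. The sketch below is illustrative only: the criteria chosen, the weights, the 1 to 5 rating scale and the two candidate resources are all hypothetical assumptions, and each library would set its own values in its collection policy.

```python
# Hypothetical weights (summing to 1.0); not prescribed by any standard.
CRITERIA_WEIGHTS = {
    "relevance": 0.30,
    "currency": 0.15,
    "authority": 0.20,
    "uniqueness": 0.15,
    "cost_effectiveness": 0.20,
}


def weighted_score(ratings: dict) -> float:
    """Combine per-criterion ratings (1 = poor, 5 = excellent) into one score."""
    missing = set(CRITERIA_WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(CRITERIA_WEIGHTS[name] * ratings[name]
               for name in CRITERIA_WEIGHTS)


# Illustrative ratings a selector might assign to two candidate resources.
candidates = {
    "Database A": {"relevance": 5, "currency": 4, "authority": 4,
                   "uniqueness": 3, "cost_effectiveness": 2},
    "Database B": {"relevance": 3, "currency": 5, "authority": 3,
                   "uniqueness": 4, "cost_effectiveness": 5},
}

# Rank candidates from highest to lowest weighted score.
ranked = sorted(candidates,
                key=lambda name: weighted_score(candidates[name]),
                reverse=True)
for name in ranked:
    print(f"{name}: {weighted_score(candidates[name]):.2f}")
```

A score of this kind supports, but does not replace, the judgement of the selector; criteria such as intellectual level are difficult to reduce to a single number.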


6.1.1. Content 


It is often assumed that electronic resources are simply the counterparts of their print versions. However, the electronic version can differ in a number of important ways, such as simultaneous access, multimedia integration and update mechanisms, which enhance overall library usage. Electronic resources are first evaluated from the perspective of content, as in the case of print resources. Some of the criteria to be considered when selecting electronic resources are:


•   The electronic resource must support research needs of the organization.

•   The resource should add depth to the existing collection.

•   Information must be current and updated regularly, in step with the print counterpart.

•   The electronic resource should come from an established and reputable author or publisher, and its content should be peer reviewed or professionally reviewed.

•   The resource should be as accurate and complete as the print format. This means that the electronic resource should have all the articles, illustrations, graphs and tables as they appear in the print counterpart.


6.1.2. Functionality and Reliability 


Functionality refers to the ease and flexibility with which the appropriate information can be accessed quickly. The content should be structured and logically arranged so that the user can navigate to the pertinent information within a few clicks. To assess the functionality and reliability of electronic resources, the library should evaluate the following:


•   The interface should be user-friendly, with features such as introductory screens, online tutorials, context-sensitive help, pop-ups and menus provided by the publisher.

•   The search and retrieval software must be powerful and flexible. Some features that should be available include keyword and Boolean searching, full-text searching, truncation, browsing (index and title), relevancy ranking, thesaurus and search history.

•   The system should support multiple export options (email, printing, and downloading) and provision of citation downloads into citation management software.

•   The system should provide access to other electronic resources and support resource integration via reference and full-text linking.
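
To make two of the search features listed above concrete, the sketch below combines Boolean AND searching with right-hand truncation (a trailing `*`). It is a toy illustration over three hypothetical records, not the retrieval software of any vendor; real systems add indexing, relevancy ranking, thesauri and much more.

```python
import re


def term_matches(term: str, words: set) -> bool:
    """A term ending in '*' is truncated: it matches any word with that prefix."""
    if term.endswith("*"):
        prefix = term[:-1]
        return any(word.startswith(prefix) for word in words)
    return term in words


def boolean_and_search(query: str, records: dict) -> list:
    """Return the ids of records whose text contains every query term (AND)."""
    terms = query.lower().split()
    hits = []
    for record_id, text in records.items():
        # Split the record text into lower-case words for matching.
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        if all(term_matches(term, words) for term in terms):
            hits.append(record_id)
    return hits


# Hypothetical records, for illustration only.
records = {
    "r1": "Digitization of rare manuscripts",
    "r2": "Licensing digital resources for consortia",
    "r3": "Digital preservation and licensing models",
}

print(boolean_and_search("digit* licens*", records))
```

Here `digit*` matches both "digitization" and "digital", so truncation widens the search, while the implicit AND narrows it to records that also contain a word beginning with "licens".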


6.1.3. Technical Support 


Electronic resources can be intimidating and difficult for the first-time user. Proper training should be provided so that users can use the electronic resources effectively. Technical issues also need to be considered, to ensure that the electronic resources are compatible with existing hardware and software and that the library has the capability to provide continuous access to its users. The following criteria should be considered to ensure technical support:


•   Method of Access: The publisher should provide web-based access to content hosted remotely on its own servers, rather than requiring the library to host or mount the content locally. Remote hosting encourages use of the electronic resources and reduces the library’s burden of storage, preservation and maintenance.

•   Authentication: The publisher should provide access via IP filtering, so that a larger number of users can access the resource simultaneously. IP address authentication should be supported by a proxy server, so that users can access the electronic resources through the proxy from multiple locations.

•   Compatibility: Electronic resources should be compatible with the systems already available in the library, such as the operating system, hardware and software. The content format should also be considered, since each format, such as HTML, XML or PDF, has its own advantages and disadvantages. XML is often regarded as the most desirable format, as it does not require any special software to be installed and copes well with large files.
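
The IP-based authentication described above amounts to checking whether a request comes from an address range that the institution has registered with the publisher. The following minimal sketch uses Python’s standard `ipaddress` module; the address ranges are hypothetical (they are reserved documentation blocks, not real institutional addresses), and a real publisher platform would implement this check on its own servers.

```python
import ipaddress

# Hypothetical ranges registered with the publisher. The second could be
# the range in which the library's proxy server sits, so that off-campus
# users who log in to the proxy appear to come from a registered address.
REGISTERED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_authorized(client_ip: str) -> bool:
    """Return True if the client address falls inside any registered range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in REGISTERED_RANGES)


for ip in ["192.0.2.17", "198.51.100.200", "203.0.113.5"]:
    print(ip, "->", "access granted" if is_authorized(ip) else "access denied")
```

Because the check is on the address and not on an individual login, any number of users inside the registered ranges can be served at the same time, subject to the terms of the license.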


6.1.4. Vendor support


The reputation of the vendor, and the technical and user support it offers, should be considered when selecting electronic resources. The vendor should have prior experience in delivering electronic resources. It is useful to determine the range of vendor support services available, including:


•   Trial Evaluation and Product Demonstration: The vendor should provide the resource on trial for a certain period and demonstrate how to use it. This helps in evaluating the electronic resource in terms of functionality and reliability.

•   User Training and Support: The vendor should provide training to users, with proper documentation, and should support ongoing training throughout the access period.

•   Technical Support: The vendor should have the capability to resolve technical issues. The support provided should be timely, professional and effective.

•   Data Security and Archiving: The vendor’s approach to backup frequency and backup format should be considered. The library should establish how the resource will be accessed if the vendor goes into liquidation, and whether the library has the capacity to manage the electronic resources locally. Consideration should also be given to whether the backup data is compliant with LOCKSS / CLOCKSS or any alternative archival arrangement.

•   Bibliographic Data Provision: The vendor should be able to provide bibliographic data in a preferred format, to reduce the burden on libraries of developing a local bibliographic database.

•   Statistical Reporting: The quality and availability of statistical data is important for evaluating the usage of electronic resources and analyzing their cost-effectiveness. This helps the library make decisions on renewal or de-selection. The vendor / publisher should provide COUNTER / SUSHI compliant usage data to the subscribing libraries at regular intervals.
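
The cost-effectiveness analysis mentioned under Statistical Reporting is often expressed as cost per use: the annual cost of a resource divided by the number of times it was used. The sketch below assumes the usage counts have already been taken from the vendor’s usage reports; the resource names, costs, usage figures and the review threshold are all hypothetical.

```python
def cost_per_use(annual_cost: float, total_uses: int) -> float:
    """Annual cost divided by recorded uses; infinite if the resource was unused."""
    if total_uses <= 0:
        return float("inf")
    return annual_cost / total_uses


# Hypothetical figures: (annual cost, uses recorded in the year).
resources = {
    "Journal package X": (12000.0, 4800),
    "Database Y": (9000.0, 300),
}

# Hypothetical threshold above which a resource is flagged for review.
REVIEW_THRESHOLD = 10.0

review_list = []
for name, (cost, uses) in resources.items():
    cpu = cost_per_use(cost, uses)
    print(f"{name}: {cpu:.2f} per use")
    if cpu > REVIEW_THRESHOLD:
        review_list.append(name)

print("Flag for renewal review:", review_list)
```

A high cost per use does not by itself justify de-selection; it only identifies resources whose renewal deserves a closer look against the other selection criteria.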


6.1.5. Pricing Model 


One of the major concerns of publishers is to protect their economic interests while providing electronic access to their printed publications. Publishers make a significant investment in producing a journal, which involves activities such as peer review, administration, editing, layout design, production, subscription management and distribution. Most of the activities performed in publishing a journal are common to both electronic and paper media, except for production and distribution, where the cost involved in electronic publishing is relatively low. Selectors should carefully review the pricing models available for the resource under consideration, as there is no standard pricing model for electronic resources. Pricing models are often based on a number of criteria and variables, such as the size of the user population and the number of simultaneous users. Pricing models may include, but are not limited to, the following.


Print + E Model


The print + electronic model was evolved by publishers as a natural extension of their print subscription model. The publisher provides electronic access to all subscribed titles, as well as to all or some of the un-subscribed titles of a given subject collection, on additional payment of a certain percentage of the library’s current print spending. The additional percentage varies from publisher to publisher, in the range of 5% to 30%. Libraries are expected to retain the print subscriptions that existed at the time of signing the deal with the publisher; that is, the library is obliged to maintain its current level of print journal subscriptions. In a library consortium, a member library may have the liberty to drop its subscription to a journal, but should replace it with another journal of the same or higher value. Managing this model may pose significant operational problems for both consortia and publishers. The print + electronic model also provides access to back-files in addition to current-year access. Moreover, depending on the deal, the publisher may also allow cross-sharing of subscribed titles across members of the consortium. Archiving rights in such cases are generally limited to titles that are subscribed in print.


Electronic Only


E-only models offer electronic access to journals irrespective of print subscriptions. Under such offers, publishers offer libraries a pre-defined set of journals at a pre-determined cost. In the case of a consortium, publishers develop consortium-specific offers that take into account current print spending by the member institutions, to ensure that they do not lose revenue from print cancellations. The proposal is made more attractive by offering discounts to those members of the consortium who wish to maintain their print subscriptions.


Responding to the demands from libraries and library consortia, publishers are moving gradually towards the e-only model. E-only models grant consortia-wide archiving and perpetual access rights for the subscribed years’ content. Access and archiving rights for back-file content are offered either as an inclusive value of the offer price or for a one-time additional payment.


Full-time Equivalent Models


Full-time Equivalent (FTE) models are priced according to the total number of potential users per site. Generally the entire population of the organization, including students, faculty, researchers and employees, is counted for FTE. Publishers like Nature and Science, which had multiple subscriptions across campuses, follow this model on the consideration that online access could lead to the extinction of their print versions over time.


Concurrent-Users Model


The concurrent-users model provides a fixed number of concurrent accesses to all the members of a consortium, treating the whole consortium as one single entity or site. Database providers, such as Web of Science, use this model predominantly. Universities with multiple sites and national consortia can negotiate this kind of model.


Perpetual Access v/s Annual Lease


Libraries and library consortia are increasingly demanding perpetual access to content, based on the subscription model followed by libraries in the print environment. However, the cost charged for perpetual access, especially by aggregators like Ovid and OCLC, is prohibitive. Annual lease models, on the other hand, offer a significant cost advantage.


Back-file Access


Access to the back-files of journals is a critical necessity, especially for scholarly journals. Several leading publishers have embarked on projects to digitize the complete back-files of their journals. Most publishers, including commercial publishers, academic societies and university presses, have launched their complete journal archives.


While several publishers, like ACM, offer access to their entire back-file collection as part of the current print subscription, a number of publishers offer free online access to only 5 to 10 years’ content as part of the print subscription and charge separately for back-file access. Most publishers who have created back-files from volume 1 offer them on a “one-time purchase and perpetual access” basis.


Document Delivery and Pay-Per-View Models


Document delivery is an extension of the inter-library lending practice for resource sharing, which has been widely practiced the world over as an exchange of photocopies of articles among libraries. The emerging pay-per-view model, made available by several publishers and third-party aggregators, is likely to replace the old document delivery model. In this model, the library does not subscribe to the complete journal, but pays for what is used. This is an ideal model for the content of non-subscribed journals. Consortia can explore this model for less-used journals in their negotiations, and engage publishers for advance purchase of articles at a lower fee per article. The pay-per-view model is driven and promoted by publishers, and may replace document delivery completely in the future.
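
The arithmetic behind two of the pricing models in this section can be sketched as follows. The first function applies the print + electronic surcharge (5% to 30% of current print spending, as stated above); the second finds the number of article requests at which subscribing to a journal becomes no more expensive than pay-per-view. All amounts, the subscription price and the per-article fee are hypothetical figures chosen for illustration.

```python
import math


def print_plus_e_cost(print_spend: float, surcharge_rate: float) -> float:
    """Total annual cost when e-access is a percentage surcharge on print spend."""
    if not 0 <= surcharge_rate <= 1:
        raise ValueError("surcharge_rate must be a fraction between 0 and 1")
    return print_spend + print_spend * surcharge_rate


def pay_per_view_break_even(subscription_price: float,
                            fee_per_article: float) -> int:
    """Smallest yearly number of article requests at which a subscription
    costs no more than paying per article."""
    if fee_per_article <= 0:
        raise ValueError("fee_per_article must be positive")
    return math.ceil(subscription_price / fee_per_article)


# Hypothetical library spending 100,000 a year on print subscriptions.
print_spend = 100_000.0
low = print_plus_e_cost(print_spend, 0.05)
high = print_plus_e_cost(print_spend, 0.30)
print(f"Print + E total cost: between {low:,.0f} and {high:,.0f}")

# Hypothetical journal: subscription 1,500 a year, 30 per article viewed.
break_even = pay_per_view_break_even(1500.0, 30.0)
print(f"Pay-per-view is cheaper below {break_even} requests a year")
```

For a little-used journal whose expected demand is well below the break-even figure, pay-per-view is the cheaper option, which is why the model suits non-subscribed and less-used titles.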


6.2.  Licensing Consideration 


Most electronic resources are licensed to the subscribing institutions under a written agreement that explains in detail the user’s rights and the restrictions on usage. Electronic resources are leased or made accessible on annual payment and are not sold. Libraries therefore do not own the material in a digital environment; instead, they license or lease access to digital material on behalf of their users for a defined period of time, under terms and conditions usually defined by the publishers in their license agreements. It is, therefore, necessary that librarians or purchase personnel fully understand the terms of the license agreement before selecting an electronic resource.


Currently, there are no standards for licenses; each publisher / vendor has its own proprietary license agreement, with terms and conditions set by the publisher / vendor. Librarians or purchase personnel are, therefore, required to study the license document carefully before signing it. It is common practice for clauses of license documents to be modified, or for clauses specified by the licensee to be added to the agreement, on the basis of negotiations between licensee and licensor. The sections of the agreement that should be carefully understood and, if need be, modified are: authorised users, limitations on usage, responsibility of the institution for monitoring or controlling access, archival access to subscribed content (especially for the subscribed period), responsibility for the actions of users, basic rights of users / institutions under the “fair use doctrine”, legal jurisdiction in case of dispute, etc. The library should seek modification or inclusion of clauses in the agreement that are required to support scholarship, research and educational use, as well as those rights that are considered “fair use” for printed materials.


International Coalition of Library Consortia (ICOLC) has developed model license agreements that can be studied before selection of electronic resources and signing a license document.


6.2.1.  Access Concern 


The selection process should also address the issue of remote access v/s local hosting, wherever applicable. Remote access essentially means that access is provided, usually via the Internet, through the publisher’s web server. There are around 50,000 scholarly electronic journals that are made accessible online by some 4,000 publishers. Several journal publishers do not own the technology required to host, provide access to, search and manage their content. It is important that the following points regarding access by library users are covered by any licensing agreement that a library, institution or consortium signs:


•   Authorized users should be defined as broadly as possible (all persons with a current, authenticated affiliation with the subscribing organization, including employees, students, etc.). Visitors who have permission to use the institution’s library should also have access to the licensed resource.

•   Authorized sites should be defined as broadly as possible. Authorized users should be permitted to access the electronic resource from anywhere via the institution’s secure network.

•   Access should be permitted via IP authentication for the entire institution, including simultaneous access for multiple users.

•   Archiving Policy: The resource provider should provide a clearly articulated archiving policy for the licensed resources.

•   Perpetual Access: The provider shall grant access to the licensed content of the resource for the mutually agreed time period. The purchasing or leasing of electronic data should include provision for perpetual access to that data. After termination of the license agreement, the institution’s perpetual electronic access to the previously subscribed content should be guaranteed.

•   Institutional Archives / Self-archiving: The resource provider should allow authors at subscribing institutions to upload their works to the institutional repository (IR) in either pre-print or post-print form.


6.2.2.  Copyright and Fair Use 


Electronic resources provided to users by institutions are governed by license agreements negotiated between the library or library consortium and the publishers or vendors. The publisher should allow authorized users to access these electronic resources for non-commercial, educational, scholarly and research activity. The following considerations regarding fair use, user statistics and liability for unauthorized use should be addressed in any licensing agreement that a library, its governing institution, or its consortium signs:


•   The license should permit fair use of all information for non-commercial, educational, instructional, and research purposes by the Libraries and authorized users. These include viewing, downloading, printing, e-reserves and course packs.

•   Pay-per-view services should be available for access to articles that are not in the library’s print or digital collection.

•   User Statistics: The information provider shall provide COUNTER / SUSHI compliant usage statistics in the appropriate format to the library or consortium administrator concerned.

•   In general, the vendor should employ a standard agreement that describes the rights of the Libraries in easy-to-understand and explicit language.



6.2.3. Flexibility and Enhancement 


The library or library consortium should carefully evaluate the terms and conditions governing cancellation of electronic resources. Cancellation might mean leaving a bundled deal and moving to selected content, cancelling outright, or cancelling print-linked products. Models that impose ‘no print cancellation’ clauses, limits on the number of titles, or financial penalties should be avoided. The price of the electronic version should be less than or equal to that of the print version. Any increase in price should be matched by a corresponding increase in access and functionality.


6.2.4.  Legal Issue 


The library or library consortium should consult a legal authority before signing any license agreement with a publisher. Agreements include provisions for payment and delivery of the product, warranties and limits, termination of the agreement, customer service information, and the responsibility of the licensee for the security of the product. The organization’s payment liability should commence on the date the resources become accessible. The publisher should maintain the organization’s access for a grace period of at least one month before discontinuing it. The license agreement should not restrict any legal rights of the library or library consortium under the laws governing the organization or consortium.


7.  Review of Digital Resources 


Given rapid changes in technology, new offers from digital content providers in terms of price and package, and continuous pressure on the library budget, it is essential to review existing digital resources for their usefulness and relevance. A systematic and complete review of digital resources should take place at least once a year from the time of purchase or digitization. Subject experts and potential users of the digital resources should assess the continuing value of the resources to the user community, as well as other related resources. In addition to assessing user demand and potential uses of content, libraries should look at actual results. Usage statistics should be used to compare the usage of digital resources with that of the corresponding print copies, and to weigh the budgetary implications of providing digital resources instead of the print versions of titles. The library should make sure that the publisher allows sufficient advance time to undertake an effective review before the renewal of existing electronic resources.


8.  Summary 


Collection development in a digital or hybrid library is an important activity. It includes various complex tasks, such as identification, selection, providing access and review. The library should have a well-defined selection policy and guidelines in order to fulfil the organization’s mission and meet user expectations. This module discussed digital collections, identification, and the benefits of digitization. It also elaborated on the significance of selecting electronic resources and on the various selection criteria to be considered, such as content scope, functionality, technical support, vendor support, pricing models and licensing issues.







1. Borgman, C (2000) From Gutenberg to the global information infrastructure: access to information in the networked world, New York, ACM Press

2. National Research Foundation (2010), Managing Digital Collections: A Collaborative Initiative on the South African Framework.

3. Online Dictionary of Library and Information Science (http://www.abc-clio.com/ODLIS/odlis_A.aspx)

4. Lee, S. D. (2000) Digital Imaging: a practical handbook, Library Association Publishing.

5. Hamson, A (2001) Case Study: practical experience of Digitization in the BUILDER hybrid library project, Program.

6. NISO Framework Advisory Group (2004), A Framework of Guidance for Building Good Digital Collections. National Information Standards Organization, 2nd ed. (www.niso.org/framework/framework2.html)

7. E-Resources Collection Development Policy. Available at (http://www.lib.colum.edu/about/ecollectiondevelopment.php)

8. Electronic Resources Collection Development Policy (http://lib.hku.hk/cd/policies/erp.html)

9. Key Issues for E-Resource Collection Development: A Guide for Libraries (http://www.ifla.org/files/assets/acquisition-collection-development/publications/Electronic-resource-guide.pdf)

10. Selection: Electronic and Internet Resources (http://www.azlibrary.gov/cdt/slrer.aspx)