2 Historical Evolution of Digital Libraries

Jagdish Arora


I.     Objectives


The objective of this module is to discuss and impart knowledge on the historical evolution of the different types of information systems and technologies that led to the development of digital libraries.


II.     Learning Outcomes 


After going through this module, learners will be acquainted with the tools, technologies, IT-based services and products, and their developments, that were enabling factors for the evolution of digital libraries.


III.     Structure 


1. Introduction

2. Computers and Microprocessor Technology

3. Digital Storage Technology

3.1. Magnetic Storage Media

3.2. Optical Storage Media

3.3. Flash Memory Devices or USB Drives

4. Online Databases and Information Retrieval System (IRS)

5. Computer-based Information Storage and Retrieval System

6. Digital Imaging Technology

7. Institutional Repositories

8. Internet Technology and its Services

9. Development of Web Browsers

10. Hyperlinks and Development of World Wide Web

11. Electronic Resources

12. Summary





1. Introduction 


Although the term “digital library” has gained popularity only in recent years, digital libraries have evolved along the technological ladder for the past thirty years. In the early 1970s, digital libraries were built around mini and mainframe computers, providing remote access and online search and retrieval services to online databases using the computer and communication technology available at that time.


The historical evolution of digital libraries has not been linear. The digital library is an eclectic field. Contributions to the evolution of digital libraries have come from several disciplines, leading to multiple conceptions of digital libraries, each one influenced by the perspective of its primary discipline. The history of digital libraries, therefore, is the history of a variety of different types of information systems and technologies that have been considered as “digital libraries” or their precursors. These systems and technologies are very heterogeneous in their objectives, scope and functionalities. As such, their evolution does not follow a single path. Most of the systems described in this module are still in use in one form or another, using newer technological solutions, and have applications in diverse fields of information management.


The rest of this module delves into the historical evolution of different systems and technologies that contributed to the evolution of digital libraries.


2.  Computers and Microprocessor Technology 


Developments in digital libraries progressed along with developments in computer and communication technology. Two significant computers were built in 1946 and 1947. The ENIAC (Electronic Numerical Integrator and Computer) was developed by John Mauchly and J. Presper Eckert at the University of Pennsylvania. It contained about 18,000 vacuum tubes, weighed thirty tons and occupied two stories of a building. Another computer, EDVAC, was designed to store two programs at once and switch between the sets of instructions. A major breakthrough occurred in 1947 when Bell Laboratories invented the transistor, which went on to replace the vacuum tube. Transistors decreased the size of computers and at the same time increased their speed and capacity. The UNIVAC I (Universal Automatic Computer), a vacuum-tube machine and the first commercial computer produced in the United States, was used at the U.S. Bureau of the Census from 1951 until 1963. Software development was also in progress during this time: operating systems and programming languages were developed for the computers being built. The integrated circuit was invented independently by Jack Kilby of Texas Instruments and Robert Noyce of Fairchild Semiconductor (later a co-founder of Intel) in 1958-59. All the components of an electronic circuit were placed onto a single “chip” of silicon (Bosworth, 2011).


Dramatic reductions in the size and cost of computer components, and equally impressive gains in the speed, storage capacity and reliability of hardware components, have rapidly expanded their use in all activities and functions of a library and information centre. Notable reductions in the size of microprocessors, combined with dramatically enhanced capacity, have added new dimensions to computer hardware technology. Initially, small silicon chips contained only a few components and circuits, but the number of components per chip has roughly doubled every one to two years since 1965. Early small-scale integration efforts first gave way to large-scale integration (LSI) chips that contained thousands of components, then to very large-scale integration (VLSI) chips that contained hundreds of thousands of components and circuits, and now to ultra large-scale integration (ULSI) chips that come with millions of components and circuits.


A microprocessor, also known as the Central Processing Unit (CPU) of a computer, is a complete computation engine fabricated on a single chip. Different companies such as Intel, Advanced Micro Devices (AMD) and Motorola manufacture microprocessors. However, Intel manufactures the most widely used microprocessors. The first microprocessor used in a PC was the Intel 8080. Introduced in 1974, it was a complete 8-bit microprocessor on a single chip. The Intel 8088, introduced in 1979 and used in the IBM PC launched in 1981, was the first microprocessor that made its presence felt in the market. The PC market moved from the 8088 to the 80286, 80386, 80486, Pentium, Pentium II, III and IV, and now to Intel Dual Core and Quad Core processors.


3.  Digital Storage Technology 


Like microprocessor technology, digital storage devices have also witnessed notable reductions in their size and cost, with dramatic enhancement in storage capacity. There are two types of data storage devices, i.e. removable data storage devices and non-removable data storage devices. Data storage devices come in many sizes and shapes. Storage devices can also be categorized by the media used for storage, for example magnetic storage media, optical storage media, and metal-oxide semiconductor or flash memory devices (popularly known as pen drives or USB drives).


3.1.  Magnetic Storage Media 


Magnetic storage media are commonly used for large volumes of data (e.g., video, image, or remote sensing data). The first magnetic audio recorder was invented by Valdemar Poulsen in 1898 (Wikipedia, 2014). While the early magnetic storage devices were designed to record analog audio signals, most magnetic storage devices today, including those used in computers, store digital data. Large amounts of data are stored on magnetic tapes and disks because their capacity is huge: three billion bits (three gigabits) of data per square inch can fit on a single magnetic disk. Hard discs, floppies, tapes, cartridges, etc. are examples of magnetic media.


3.2.  Optical Storage Media 


Although research into optical data storage had been going on for several decades, the first popular optical storage medium, the compact disc (CD), was introduced in 1982. Following the release of the Yellow Book (the CD-ROM standard) in 1985, recordable versions of the CD followed from 1988 onwards, i.e. i) CD-R: write-once, read-many (data once written cannot be erased); and ii) CD-RW: the data once written can be erased completely and the same storage device can be used again for storing different data. A typical disc used in a computer-based CD drive stores 700 MB (Wikipedia, 2014).


DVD (initially “Digital Video Disc”, then modified to “Digital Versatile Disc”) is, like the CD, an optical storage system. It was rolled out in 1996 as the successor to the CD. A recordable format of DVD, called DVD-R, was released in 1997 and another, called DVD+R, was released in 2002 (Wikipedia, 2014). The DVD format provides several configurations of data layers, moving from 2D storage to 3D storage. Each configuration is designed to provide additional storage capacity. The commonly used DVD (one side, one layer) has a storage capacity of 4.7 GB, in contrast to the 700 MB of a CD.


3.3.  Flash Memory Devices or USB Drives 


Metal-oxide semiconductor-based storage devices, popularly known as pen drives, USB drives or flash memory devices, were patented in 1999 by IBM (Wikipedia, 2014). Flash memory is a non-volatile computer memory that can be electrically erased and reprogrammed. It is a technology that is primarily used in memory cards and USB flash drives for general storage and transfer of data between computers and other digital products. Flash memory stores information in an array of memory cells made from floating-gate transistors. In traditional single-level cell (SLC) devices, each cell stores only one bit of information. Some newer flash memory, known as multi-level cell (MLC) devices, can store more than one bit per cell by choosing between multiple levels of electrical charge to apply to the floating gates of its cells. Kingston Technology Company released USB 3.0 flash drives with 1 TB storage capacity in 2013 (Kingston Technology Company, 2014).
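The relationship between charge levels and bits per cell described above can be illustrated with a little arithmetic. The following is a minimal sketch, not a model of any real device: the function names are hypothetical, and it assumes the number of distinguishable charge levels is a power of two, so that a cell with 2 levels holds 1 bit (SLC) and a cell with 4 levels holds 2 bits.

```python
def bits_per_cell(levels: int) -> int:
    """Bits one flash cell can hold, given its number of charge levels.

    Assumption for this sketch: levels is a power of two (2, 4, 8, ...).
    """
    if levels < 2 or levels & (levels - 1) != 0:
        raise ValueError("levels must be a power of two and at least 2")
    # A cell with 2**n distinguishable levels encodes n bits.
    return levels.bit_length() - 1


def cells_needed(num_bytes: int, levels: int) -> int:
    """Number of cells required to store num_bytes of data."""
    total_bits = num_bytes * 8
    per_cell = bits_per_cell(levels)
    # Ceiling division: a partly used cell still counts as a whole cell.
    return -(-total_bits // per_cell)


if __name__ == "__main__":
    for name, levels in [("SLC", 2), ("MLC with 4 levels", 4)]:
        print(name, bits_per_cell(levels), "bit(s) per cell;",
              cells_needed(1024, levels), "cells for 1 KB")
```

Doubling the number of bits per cell halves the number of cells needed for the same data, which is why multi-level cells raise capacity without a larger array.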


4. Online Databases and Information Retrieval System (IRS) 


The creation and remote accessibility of online databases through information retrieval systems (IRS) can be considered an important landmark and a precursor to digital libraries. The first databases were bibliographic in nature and were online versions of existing indexing and abstracting services such as Biological Abstracts, Index Medicus, Chemical Abstracts, etc. These databases used data files generated in the process of electronic phototypesetting of printed abstracting and indexing services and other primary journals. These bibliographic databases were hosted on mini and mainframe computers, providing remote access and online search and retrieval services to users using the computer and communication technology available at that time (Arora, 1996).


The earliest application of digital library concepts involved character-coded storage and full-text indexing of legal and scientific documents. The Legal Information through Electronics (LITE) System was first implemented by the US Air Force in 1967. DIALOG became the first commercial online service in 1972. By 1975, there were 300 publicly available online databases. By 1988, 3,893 online databases were available from 1,723 database producers and 576 online services.


Sophisticated information storage and retrieval systems were built during the 1980s using state-of-the-art distributed database management system technology linking different remote systems. By 1988, only half of all databases were bibliographic. With the introduction of a number of online databases containing textual information, news, statistics, commodity prices, etc., a third type of database holding the text of full-length documents started appearing. As such, online hosts like DIALOG, STN, BLAISE and ESA-IRS were offering not only online databases but also full-text online journals for several years, although as simple ASCII or text files without graphics and pictures. In 1989, there were almost 1,700 full-text sources in sixteen online systems (Arora, 1996).


The availability of CD-ROM in the late 1980s, as a medium with high storage capacity, longevity and ease of transportation, triggered the production of several CD-ROM information products which were earlier available through online vendors or as conventional abstracting and indexing services in printed format. Moreover, several full-text databases also started appearing in the late 1980s and early 1990s, marking the beginning of the digital era. Some of the important full-text digital collections available on CD-ROM include: ADONIS, IEEE / IEE Electronic Library (IEL), ABI/INFORM, UMI’s Business Periodicals Ondisc and General Periodicals Ondisc, Espace World, US Patents, etc.


5. Computer-based Information Storage and Retrieval System


Several software packages were released during the mid and late 1970s for computer-based storage, indexing and retrieval of documents in character-coded form. Some of the better known text storage and retrieval packages included: IBM’s Storage and Information Retrieval System (STAIRS), Battelle Automated Search Information System (BASIS), INQUIRE, BRS/SEARCH, DOCU/MASTER, ASSASSIN, STATUS, CAIRS, etc. By the late 1980s, text storage and retrieval programs were available from dozens of vendors for major computing environments including mainframes, microcomputers and LANs. Micro-CDS/ISIS, one such advanced non-numerical information storage and retrieval software package, developed by UNESCO in 1985, was used extensively by libraries, especially in developing countries. Micro-CDS/ISIS is currently available in different flavours including CDS/ISIS for Windows, GenISIS, JavaISIS, WEBLIS, WWW-ISIS, etc.


The availability of a wide range of database management systems (DBMS) such as Ingres, Microsoft Access, MS SQL Server, MS FoxPro, MySQL, Oracle, Postgres and SQLite, and later NoSQL systems such as MongoDB, from the 1990s onwards also contributed to the evolution of digital libraries.


6. Digital Imaging Technology 


Digital imaging was developed in the 1960s and 1970s to avoid the operational weaknesses of film cameras in scientific and military missions. The first digital image was produced in 1920; however, it was the invention of the CCD (charge-coupled device) in 1969 at AT&T Bell Labs by Willard Boyle and George E. Smith that led to the application of imaging technology in consumer products like digital scanners and digital cameras (Wikipedia, 2014).


Digital document imaging systems, which employ computer hardware and software to scan and store images of documents in digitized formats, evolved in the early 1980s to overcome the limitation of text storage and retrieval systems, which could only store textual information. The earliest application of a document imaging system was the “Optical Disk Pilot Project” at the Library of Congress. Several document imaging software packages are currently available in the market.


7. Institutional Repositories 


The history of institutional repositories (IRs) is closely associated with the history of the open access movement. Major milestones in the history of institutional repositories can broadly be categorized under four major subheads, namely: i) institutional repositories and service providers that harvest metadata from the growing number of IRs; ii) major catalytic initiatives / movements; iii) OAI-compliant open source software for IRs; and iv) publishers’ response to IRs. A brief account of developments in IRs is given below (Arora, 2007).


The IR began as a movement to give scientists remote electronic access to digitized preprints. The informal FTP servers run by academic departments were referred to as “archives” or “repositories”. One of the earliest examples of a digital repository was arXiv (http://www.arxiv.org/), launched in 1991 by Prof. Paul Ginsparg at Los Alamos National Laboratory. ArXiv is a digital repository in the fields of physics, mathematics, non-linear science, computer science and quantitative biology. It was originally created at Los Alamos National Laboratory but later moved to Cornell University with Dr. Ginsparg. Over the past decade, arXiv has evolved into a global repository for non-peer-reviewed research papers in a variety of physics research areas. The Mathematical Physics Preprint Archive (mp_arc) was also launched in 1991 by H. Koch, R. de la Llave, and C. Radin at the University of Texas at Austin. Actual growth in IRs began with the Internet and the availability of browser software (e.g., Netscape, Mosaic) around the mid-1990s.


Several IRs were established in subsequent years, including: Computer Science Technical Reports (CS-TR), 1992; NASA’s Langley Technical Report Server, 1993; Networked Computer Science Technical Reference Library (NCSTRL) launched by DARPA and NSF, 1994; NASA Technical Report Server (NTRS), 1994; Networked Digital Library of Theses and Dissertations (NDLTD) launched by Virginia Polytechnic Institute, 1996; Research Papers in Economics (RePEc), 1997; PubMed launched by the National Center for Biotechnology Information, 1997; CogPrints launched by Stevan Harnad, 1997; CiteSeer (also called ResearchIndex) launched by the NEC Research Institute, 1997; Computing Research Repository (CoRR) launched by the ACM, arXiv, NCSTRL and AAAI, 1998; Open Citation Project (OpCit), 1999; PubMed Central, 2000; Citebase, launched at Southampton University, 2001; and eScholarship Repository launched by the California Digital Library, 2002.


With the proliferation of domain-specific digital repositories and institutional repositories, it became difficult to support searching across multiple repositories. Repositories needed greater capabilities to automatically identify and access papers that had been deposited in other repositories. A need was, therefore, felt to build a framework to integrate these e-print / pre-print archives and solve these problems. The launch of the Open Archives Initiative (OAI) and its Protocol for Metadata Harvesting (OAI-PMH) can be considered one of the major initiatives that proved a catalyst to the growth of IRs. The OAI grew out of the meeting of the Universal Preprint Service held in Santa Fe, New Mexico in 1999 to explore cooperation among the growing number of scholarly IRs. The protocol provides an interoperability framework based on the harvesting of metadata. OAI-PMH defines a simple set of metadata that can facilitate federated resource discovery. A number of services were launched to harvest metadata from the growing number of institutional repositories, with the aim of providing integrated access to content distributed across thousands of digital repositories and open access journals and of providing associated services to users. The important service providers include: the Cross-Archive Searching Service (ARC) launched by Old Dominion University in 2000, OAIster launched by the University of Michigan Libraries Digital Library Production Services in 2002, and Google Scholar in 2004.
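To make the idea of metadata harvesting more concrete, the sketch below shows what a harvester's request and the extraction of titles from a response might look like. It is an illustration under stated assumptions, not the implementation of any of the services named above: the repository address is hypothetical, the sample response is hand-written and heavily abbreviated, and no network request is made. The `ListRecords` verb, the `metadataPrefix` and `from` arguments, and the `oai_dc` (unqualified Dublin Core) format come from the OAI-PMH specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# XML namespaces used in OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written, abbreviated sample of a ListRecords response (an assumption).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample preprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build the URL a harvester would request to list records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return every Dublin Core title found in a response document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iterfind(".//dc:title", NS)]


if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2002-01-01"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A service provider would repeat such requests against many repositories and merge the harvested metadata into one searchable index, which is the federated discovery the protocol is meant to support.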


While the Open Archives Initiative (OAI) is one of the major catalysts of the IR movement, other catalysts include: the Budapest Open Access Initiative (BOAI) launched by the Open Society Institute in 2002; Project RoMEO and Project SHERPA launched in 2002; and the mandate for faculty to deposit articles in open access repositories introduced by the University of Southampton Department of Electronics and Computer Science in 2003.


The availability of a number of OAI-compliant open source software packages for IRs led to the widespread establishment of IRs in several research institutions and universities, especially in developing countries. Some of the important OAI-compliant open source software packages are: EPrints (Southampton University, 2000), CDSware (CERN, 2002), DSpace (MIT, 2002) and FEDORA (University of Virginia and Cornell University, 2003).


Responding to the pressure created by the academic community with the spread of open access journals, institutional repositories and authors’ web sites, several publishers have come out with policies on self-archiving. Major publishers like Elsevier, Sage, Springer, IEEE and IEE allow their authors to self-archive their research articles in IRs or on the author’s web site. 72% of the 1,411 publishers surveyed by the SHERPA / RoMEO project have agreed to allow authors to self-archive papers published in their journals (pre-print or post-print) in their institutional repositories. The remaining 28% do not formally support “self-archiving”.


8. Internet Technology and its Services 


The history of the Internet (Arora, 2004) can be traced back to 1957, when the erstwhile Soviet Union launched its first satellite, Sputnik I, prompting US President Dwight Eisenhower to launch the Defence Advanced Research Projects Agency (DARPA) to regain the lead in the technological race. DARPA’s mission was to advance science and technology for military applications. DARPA developed its first successful satellite in 18 months. By the end of 1960, it began to focus on computer networking and communication technology, essentially to establish communication links between research centres and universities across the country as part of its overall mission. ARPANET was commissioned in 1969, and by 1971 it had 15 nodes and 23 hosts. E-mail was invented in 1972 by Ray Tomlinson to send messages across a distributed network. In 1973, the first international connections to the evolving Internet were established at University College London and the Royal Radar Establishment (Norway). In the same year, DARPA initiated a research program to investigate techniques and technologies for interlinking packet networks of various kinds. The objective was to develop communication protocols based on “packet-switching” that would allow networked computers to communicate seamlessly across multiple, geographically dispersed locations. Packet-switching splits the data to be transmitted into tiny packets that can take different routes to their destination, where they are put back together. This was called the Internetting project, and the system of networks that emerged from the research was known as the “Internet”. The system of protocols developed over the course of this research effort became known as the TCP/IP Protocol Suite, after the two initial protocols developed: Transmission Control Protocol (TCP) and Internet Protocol (IP).
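The packet-switching idea described above can be sketched in a few lines. This is a deliberately simplified illustration, not the TCP/IP implementation: the function names and the packet layout (a sequence number paired with a payload) are assumptions made for the example, and shuffling the list stands in for packets arriving out of order after taking different routes.

```python
import random


def packetize(data: bytes, size: int):
    """Split data into (sequence_number, payload) packets of at most size bytes."""
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets):
    """Rebuild the original data, whatever order the packets arrived in."""
    # Sorting by sequence number restores the original order.
    return b"".join(payload for _, payload in sorted(packets))


if __name__ == "__main__":
    message = b"Packets may take different routes to the same destination."
    packets = packetize(message, size=8)

    # Simulate out-of-order arrival (stand-in for different network routes).
    random.shuffle(packets)

    assert reassemble(packets) == message
    print(len(packets), "packets reassembled correctly")
```

The sequence number is what lets the receiver ignore the order of arrival. Real protocols add addressing, error checking and retransmission of lost packets, none of which is modelled here.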


The operational management of the emerging Internet was handed over to the Defence Communication Agency (DCA) in 1975. The Unix to Unix Copy Program (UUCP) was developed at Bell Labs (AT&T) in 1976. 1977 witnessed the development of mail specifications (RFC 733). Usenet was established in 1979 using UUCP between Duke University and the University of North Carolina (UNC). DARPA also established the Internet Configuration Control Board (ICCB) in 1979.


In 1981, CSNET (Computer Science Network) was built with the collaboration of a number of universities and industries in the USA. The National Science Foundation gave financial support to CSNET to provide networking services. CSNET used the Phonenet MMDF protocol for telephone-based electronic mail relaying and, in addition, pioneered the first use of TCP/IP over X.25 using commercial public data networks. The CSNET server provided an early example of a white pages directory service, and this software is still in use at numerous sites. At its peak, CSNET had approximately 200 participating sites and international connections to approximately fifteen countries. Another important development in the same year was the creation of BITNET (Because It’s Time Network). BITNET was started as a cooperative network at the City University of New York, with the first connection to Yale University. At its peak in 1991, BITNET was connected to almost 500 organizations and 3,000 nodes in educational institutions in North America, Europe (as EARN), Israel (as ISRAEARN), India (TIFR) and some Persian Gulf states (as GulfNet). It was also very popular in other parts of the world, especially in South America, where about 200 nodes were implemented and heavily used in the late 1980s and early 1990s. With the rapid growth of TCP/IP systems and the Internet in the early 1990s, and the phasing out of the IBM mainframe that was its base platform, BITNET’s popularity and use diminished quickly. In 1996, CREN ended its support for BITNET. As of 2007, BITNET has essentially ceased operation.


1982 was a year of great significance in the growth and development of the Internet. The Defence Communication Agency (DCA) and DARPA adopted the Transmission Control Protocol (TCP) and Internet Protocol (IP) suite (commonly known as TCP/IP) as the official protocol suite for ARPANET. This led to one of the first definitions of the Internet as a connected set of networks using TCP/IP. In the same year, EUnet (European UNIX Network) was created to provide e-mail and Usenet services in Europe. The Exterior Gateway Protocol (EGP) was also developed in the same year; it defines protocols for connecting networks that were not based on TCP/IP with the Internet. The University of Wisconsin developed the “Name Server” in 1982, which facilitated the translation of names into strings of numbers. This development led to the practice of assigning domain names to sites, which continues even now. Another significant development that took place in 1982 was the splitting of ARPANET into ARPANET and MILNET. MILNET was later integrated with the Defence Data Network created in 1981.


The launch of desktop computers in 1982 led to a major shift from having a single, large mainframe computer connected to the Internet at each site to having an entire local area network connected to the Internet. In the same year, the Internet Activities Board (IAB) replaced the ICCB, with a primary mission to guide the evolution of the TCP/IP protocol suite and to provide research advice to the Internet community.


Domain Name Servers, working as distributed databases, were introduced in 1984 to facilitate translation from domain names to IP addresses. The transition from numeric addresses to naming standards proved to be very helpful in the popularisation of the Internet. For example, it is much easier to remember www.yahoo.com than its numerical equivalent.
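The translation from names to numbers can be pictured as a lookup that walks a hierarchy of zones from right to left. The sketch below is a toy model only: the zone data, names and addresses are made up for illustration (the addresses come from a range reserved for documentation), everything is held in one in-memory dictionary, and no real DNS query is sent. In the real system each level of the hierarchy is held by different servers, which is what makes the database distributed.

```python
# Toy zone data: top-level domain -> domain -> host -> IP address.
# All entries are hypothetical and exist only for this example.
ZONES = {
    "org": {"example": {"www": "192.0.2.10"}},
    "com": {"example": {"www": "192.0.2.20", "mail": "192.0.2.21"}},
}


def resolve(name: str):
    """Return the address for a host name, or None if it is not known."""
    node = ZONES
    # Domain names are read right to left: "org", then "example", then "www".
    for label in reversed(name.lower().rstrip(".").split(".")):
        if not isinstance(node, dict) or label not in node:
            return None
        node = node[label]
    # Only a complete host name maps to an address string.
    return node if isinstance(node, str) else None


if __name__ == "__main__":
    for host in ["www.example.org", "mail.example.com", "www.example.net"]:
        print(host, "->", resolve(host))
```

In practice a program would simply ask the operating system's resolver, for example through Python's `socket.gethostbyname`, rather than walk the hierarchy itself.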


In 1986, the U.S. National Science Foundation (NSF) initiated the development of the NSFNET, which today provides a major backbone communication service for the Internet. The National Aeronautics and Space Administration (NASA) and the U.S. Department of Energy contributed additional backbone facilities in the form of the NSINET and ESNET respectively. In the same year, the Network News Transfer Protocol (NNTP) was designed to enhance Usenet news performance over TCP/IP.


In 1987, the NSF signed a cooperative agreement with Merit Networks, Inc. to manage the NSFNet backbone. Merit, IBM and MCI later founded Advanced Network and Services, Inc. (ANS). BITNET and CSNET merged in 1989 to form the Corporation for Research and Educational Networking (CREN). In the fall of 1991, CSNET service was discontinued, having fulfilled its important early role in the provision of academic networking service. A key feature of CREN was that its operational costs were fully met through dues paid by its member organizations.


In 1988, a computer virus for the first time affected approximately 6,000 of the total 60,000 hosts on the Internet. The vulnerability of the Internet and the need for more security were realised for the first time. DARPA formed the Computer Emergency Response Team (CERT) in response. In the same year, the Department of Defence adopted Open Systems Interconnection (OSI).


The total number of hosts on the Internet rose to 100,000 in 1989. The year also witnessed the first relays between commercial electronic mail carriers and the Internet: MCI Mail connected through the Corporation for National Research Initiatives (CNRI) and CompuServe connected through Ohio State University. The Corporation for Research and Educational Networking (CREN) was formed with the merger of CSNET and BITNET. The Internet Engineering Task Force (IETF) and the Internet Research Task Force (IRTF) also came into existence under the IAB in 1989. In the same year, several other countries were connected to the NSFNet, including Australia, Germany, Israel, Italy, Japan, Mexico, the Netherlands, New Zealand, Puerto Rico and the United Kingdom. In Europe, major international backbones such as NORDUNET and others provided connectivity to over one hundred thousand computers on a large number of networks. During the course of its evolution, particularly after 1989, the Internet system began to integrate support for other protocol suites into its basic networking architecture. The emphasis in the system shifted to multi-protocol internetworking and, in particular, to the integration of the Open Systems Interconnection (OSI) protocols into the architecture.


During the early 1990s, OSI protocol implementations also became available and, by the end of 1991, the Internet had grown to include some 5,000 networks in over three dozen countries, serving over 700,000 host computers used by over 4,000,000 people. The ARPANET ceased to exist in 1990. Commercial network providers in the U.S. and Europe began to offer Internet backbone and access support on a competitive basis to interested parties. Access to the Internet was first offered on a commercial basis by “World” (world.std.com), which thus became the first Internet Service Provider (ISP) offering dial-up access. Several other countries were connected to the Internet in 1990, including Argentina, Austria, Belgium, Brazil, Chile, Greece, India, Ireland, South Korea, Spain and Switzerland.


Wide Area Information Servers (WAIS) were invented in 1991 by Brewster Kahle and released by the Thinking Machines Corporation. These servers became the basis of indices to information available on the Internet. The indexing and search techniques implemented by these engines allow Internet users to find information using keywords across vast resources available on the net.


The most significant development in the history of the Internet was the invention of the World Wide Web (WWW) by Tim Berners-Lee at the CERN Laboratory in 1991. The graphical Web browser “Mosaic”, released in 1993, took the Internet by storm. Several other countries were connected to the Internet in 1993. The InterNIC was created in 1993 to provide specific Internet services including i) directory and database services; ii) registration services; and iii) information services.


In 1994, the Internet (ARPANET) celebrated its 25th anniversary. Internet shopping and e-commerce commenced operation on the net. Growth in Internet traffic became geometric: NSFNet traffic passed 10 trillion bytes per month during 1994. The WWW became the second most popular service on the net (behind FTP), leaving Telnet in third place. In March 1995, the WWW surpassed FTP as the service with the greatest traffic on NSFNet, based on packet count.


Several traditional dial-up systems in the USA, including CompuServe, America Online and Prodigy, began to provide Internet access for services other than e-mail, i.e. WWW, Gopher, FTP and so on.


The technologies of the decade were WWW and search engines. New technologies emerged in late 1990s, including client-based code loaded from Web servers such as Java, JavaScript and ActiveX, etc. The research and development on the Internet and related technologies continues even today.


A great deal of support for the Internet community has come from the U.S. Federal Government, since the Internet was originally part of a federally funded research program and has subsequently become a major part of the U.S. research infrastructure. During the late 1980s, however, the population of Internet users and network constituents expanded internationally and began to include commercial facilities. Indeed, the bulk of the system today is made up of private networking facilities in educational and research institutions, businesses and government organizations across the globe.


9.  Development of Web Browsers 


The beginning of full-text digital libraries involved building several client systems usable in a multitude of environments, such as MS Windows, MS DOS, Apple Macintosh and a diversity of UNIX systems, as well as terminal-oriented mainframe systems, notably VT-100 and VT-220. Scaling up a digital library in those days entailed huge maintenance problems, because every client system had to be upgraded and scaled for new facilities and emerging techniques and processes. However, the 1990s brought in a true revolution in digital library systems.


The advent of the World Wide Web (WWW) offered a crucial advantage with the availability of ready-to-use, publicly available, user-friendly graphical web browsers for all prevalent platforms. Standard WWW clients such as Internet Explorer and Google Chrome are upgraded regularly for added functionality, such as an e-mail client, support for Java and ActiveX, and the ability to view important document formats without having to install plug-ins for them. These browsers solved the maintenance problem, allowing developers to concentrate fully on the server side and not to bother with the client side. These browsers are freely available and easy to use, eliminating the need for extensive support and user training. The Internet and associated technologies made it possible for digital libraries to include multimedia objects such as text, image, audio and video.


10.  Hyperlinks and Development of World Wide Web 


Vannevar Bush, one of President Roosevelt’s advisers during World War II, is generally credited with being the first thinker to suggest mechanical and electronic means of handling information. In his seminal article published in the “Atlantic Monthly” in 1945, Bush conceptualized a memory extender called “MemEx”, a thinking machine in which an individual could store information and link items together. However, given the state of digital technology at that time, the MemEx was essentially proposed as an analogue machine that would store information on microfilm and link it through a mechanical process.


Vannevar Bush’s microfilm-based “MemEx”, in turn, inspired Ted Nelson and Douglas Engelbart to carry forward its underlying concept. In 1962, Engelbart started work on the Augment Project, which aimed to produce tools to aid human capabilities and productivity. He developed NLS (oN-Line System), which allowed researchers in the Augment Project to access all stored working papers in a shared “journal” that eventually held 100,000 items and was one of the largest early digital libraries. Engelbart is also credited with inventing the pointing device known as the mouse, which he demonstrated publicly in 1968. Ted Nelson designed the “Xanadu System” in 1965, coined the word “Hypertext” and proposed a system in which all publications in the world would be deeply inter-linked. Nelson also tackled the problems of copyright and payment by proposing an electronic copyright management system that would keep track of access to information and charge for it.


The most significant development in the history of the Internet was the invention of the World Wide Web (WWW) by Tim Berners-Lee at the CERN Laboratory, which he proposed in 1989 and made publicly available in 1991. The crucial underlying concept behind the World Wide Web is hypertext, which has its origin in Ted Nelson’s Project Xanadu and Douglas Engelbart’s oN-Line System (NLS). Berners-Lee, in his book titled “Weaving the Web”, explained that he had repeatedly suggested to members of both technical communities that a marriage between hypertext and the Internet was possible, but when no one took up his suggestion, he finally took up the project himself. In the process, he developed three essential technologies (Wikipedia, 2014), i.e.:


i)  a system of globally unique identifiers for resources on the Web and elsewhere, the universal document identifier (UDI), later known as uniform resource locator (URL) and uniform resource identifier (URI);

ii)  Hypertext Markup Language (HTML); and

iii)  Hypertext Transfer Protocol (HTTP)
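
The way these three technologies fit together can be illustrated with a minimal Python sketch that uses only the standard library and needs no network access. The address `www.example.org`, the sample page and the names `LinkCollector` and `build_get_request` are hypothetical, introduced only for illustration; the sketch shows the general idea and is not how any particular browser or server is implemented.

```python
from urllib.parse import urlparse
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href targets of <a> tags found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # A hyperlink in HTML is an <a> tag whose href attribute holds a URL.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def build_get_request(url):
    """Return the text of a minimal HTTP/1.1 GET request for the given URL.

    Simplified for illustration: query strings and ports are not handled specially.
    """
    parts = urlparse(url)
    path = parts.path or "/"
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\nConnection: close\r\n\r\n"


# Hypothetical address and page, used only for illustration.
EXAMPLE_URL = "http://www.example.org/library/index.html"
EXAMPLE_HTML = (
    '<html><body>'
    '<a href="http://www.example.org/catalogue.html">Catalogue</a>'
    '</body></html>'
)

# i) URL: the identifier is split into scheme, host and path.
parts = urlparse(EXAMPLE_URL)

# ii) HTML: the document carries hyperlinks to other resources.
collector = LinkCollector()
collector.feed(EXAMPLE_HTML)

# iii) HTTP: the request a client would send to fetch the resource.
request_text = build_get_request(EXAMPLE_URL)

print(parts.scheme, parts.netloc, parts.path)
print(collector.links)
print(request_text)
```

In this sketch the URL names the resource, the HTML page carries links to further URLs, and the HTTP request is the message a client would send to retrieve the page; following a hyperlink amounts to repeating the same three steps with the linked URL.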


11.  Electronic Resources


The first electronic resources, in the true sense, appeared in the form of bibliographic records in libraries. Machine Readable Cataloguing (MARC), introduced in 1964, can be considered a major development in this regard. Soon after, in the 1970s, automation of libraries started in a big way with the introduction of integrated library automation packages. The trend picked up in the early 1980s with the introduction of PCs at a cost affordable to libraries. The computerized catalogues of individual libraries led to the formation of union catalogues through library networks like OCLC, which were developed to facilitate online and copy cataloguing and resource sharing. By the early 1970s, library OPACs and union databases were accessible from remote locations. Moreover, online search services, such as DIALOG, ORBIT, BRS Search and Datastar in the USA; BLAISE and Pergamon Infoline in the UK; DIMDI in Germany; Euronet and Diane in Europe; ESA-IRS in Italy; and CAN/OLE in Canada, were also made accessible online to the research community. The appearance of bibliographic and full-text databases on CD-ROM by the late 1980s can be considered a major breakthrough in the evolution of electronic resources. Most of the bibliographic databases that were accessible through online search services like DIALOG and STN became available on CD-ROM (Arora, 2007).


The emergence of the Internet and the World Wide Web (WWW) in the early 1990s, as a new medium of information storage and delivery, came as a real boon for the evolution of electronic resources. As searching bibliographic databases became popular, it created demand for the actual content in full text, which was difficult for libraries to obtain. Coinciding with the evolution of the World Wide Web, display technology evolved, the cost of storage came down drastically and networks became faster. It became possible for publishers to deliver content either as bitmap page images or in structured formats such as HTML, PDF or RTF. An increasingly large number of publishers started using the Internet as a global channel for offering their publications to the international community of scientists and technologists, since the technology could deliver more content to more users at a significantly lower cost per user. These new technologies are continuously driving electronic resources to new peaks of usage, significantly beyond the library’s subscribed content.


These Internet and web technologies brought to electronic resources and digital libraries the graphical components that were missing earlier.


There has, thus, been a steady move up the technological scale for electronic resources, from the early (late 1980s) low-end electronic publications available as ASCII files, to resources organized and searchable on Gopher servers (1992), and to resources tagged and graphically viewable on World Wide Web sites (1994).



12.  Summary 


Although the term digital library has gained popularity only in recent years, digital libraries have evolved along the technological ladder for the past thirty years. In the early 1970s, digital libraries were built around mini and mainframe computers providing remote access and online search and retrieval services to online databases using the computer and communication technology available at that time. While the creation and remote accessibility of online databases through information retrieval systems (IRS) can be considered an important landmark and a precursor to digital libraries, contributions to the evolution of digital libraries have come from several disciplines, leading to multiple conceptions of digital libraries, each one influenced by the perspective of its primary discipline. The history of digital libraries, therefore, is the history of a variety of information systems and technologies that have been considered “digital libraries” or their precursors. The module elaborates on the historical evolution of the following information systems and technologies:


•    Computers and Microprocessor Technology

•    Digital Storage Technology

•    Online Databases and Information Retrieval System (IRS)

•    Computer-based Information Storage and Retrieval System

•    Digital Imaging Technology

•    Institutional Repositories

•    Internet Technology and its Services

•    Development of Web Browsers

•    Hyperlinks and Development of World Wide Web

•    Electronic Resources





  1. Arora, Jagdish. Online information retrieval system: Databases, data networks and online search services in India. In: Librarianship Today and Tomorrow (Dr. S.C. Verma Festschrift) India (eds. U.C. Sharma and M.R. Rawtani). New Delhi, Ess Ess Publications, p.173-192, 1996.
  2. Arora, Jagdish. Basics of Internet. Course Material on “Internet Resources and Services (Block 4- Unit 1)” for M.L.I.Sc. Programme offered by the Indira Gandhi National Open University (IGNOU). New Delhi: IGNOU, 2004.
  3. Arora, Jagdish. Course Material on “E-resources” for Certificate Programme on Soft Skills (course 3- Unit-9) offered by the Vardhaman Open University, Kota, 2007.
  4. Arora, Jagdish. Institutional Repositories: An overview (Unit 12). Course Material on “Use of ICT in Libraries” for M. Phil. in Library and Information Science Programme offered by the Vardhaman Open University, Kota, 2007.
  5. Bosworth, Edward L. Textbook for design and architecture of digital computers: An introduction (CPSC 5155). Bosworth, 2011. (http://www.edwardbosworth.com/My5155Textbook/MyText5155_AFrontMatter.htm)
  6. Internet economy indicator. (http://www.internetindicators.com/factfigure.html)
  7. Internet World Stats: Usage and population statistics. (http://www.InternetWorldStats.com)
  8. Kingston Technology Company. Kingston Digital Ships Its Fastest, World’s Largest-Capacity USB 3.0 Flash Drive, 2013. (http://www.kingston.com/en/company/press/article/6487)
  9. Living Internet. (http://www.livinginternet.com/)
  10. McBride, P.K. Internet made simple. 2nd ed. Oxford, Butterworth-Heinemann, 1999. 270p.
  11. Whittaker, Jason. Internet: basics. London, Routledge, 2002. 228p.
  12. Wikipedia. Charge-coupled device, 2014. (http://en.wikipedia.org/wiki/Charge-coupled_device)
  13. Wikipedia. Magnetic storage, 2014. (http://en.wikipedia.org/wiki/Magnetic_storage)
  14. Wikipedia. Optical storage, 2014. (http://en.wikipedia.org/wiki/Optical_storage)
  15. Wikipedia. USB flash drive, 2014. (http://en.wikipedia.org/wiki/USB_flash_drive)
  16. Wikipedia. World Wide Web, 2014. (http://en.wikipedia.org/wiki/World_Wide_Web)

Web Sites (last visited on 22nd Feb., 2014)

•    ARC (http://arc.cs.odu.edu/)
•    ArXiv (http://www.arxiv.org/)
•    Budapest Open Access Initiative (http://www.soros.org/openaccess/)
•    Cogprints (http://www.cogprints.org/)
•    Dspace (http://www.dspace.org)
•    E-prints (http://www.eprints.org/)
•    Fedora (http://www.fedora.info/)
•    Google Scholar (http://scholar.google.com/)
•    Directory of Open Access Repositories (OpenDOAR) (http://www.opendoar.org/)
•    Mathematical Physics Preprint Archive (http://www.ma.utexas.edu/mp_arc/)
•    OAIster (http://www.oaister.org/)
•    Open Archives Initiative (http://www.openarchives.org/)
•    Open Citation Project (OpCit) (http://opcit.eprints.org/)
•    Philosophy of Science Archive (http://philsci-archive.pitt.edu/)
•    Research Papers in Economics (http://repec.org/)
•    PubMed Central (http://www.pubmedcentral.nih.gov/)
•    SHERPA/RoMEO – Publisher copyright policies and self-archiving (http://www.sherpa.ac.uk/romeo.php)
•    Extensible Markup Language (XML): main page of the World Wide Web Consortium (W3C) (www.w3.org/XML/)