Organizing Digital Collections:
For the purposes of this paper, digital libraries can be considered similar to traditional libraries, except that the collection is held entirely in digital format rather than on shelves. Otherwise, many of the roles and functions of a library persist in the digital model, such as the collection, organization, storage, use and dissemination of the information it contains. Digital libraries therefore have information policies, however casual or even unconscious. Decisions are made at every stage of the information cycle, although some roles, such as dissemination, are less within the control of the digital librarian (web site manager) than they would be in a physical setting. Even so, the organization of a digital library strongly affects which information its users are able to find, and dissemination is affected as a result.
If the goal of a digital library is to make all the documents it holds available to a reading public, then it must draw upon the expertise and skills of librarianship in order to make all the documents findable. The extent to which this is true for the Bahá'í Academics Resource Library will be examined in this paper. The various ideas for organization of digital collections and technologies for information retrieval will be reviewed, and their applicability to the Bahá'í Academics Resource Library will be assessed. The goal of this paper is to review the usability of this digital library, and make recommendations based on current research which could improve its usability for current and future readers.
2. Overview of the Bahá'í Academics Resource Library
The Bahá'í Academics Resource Library started in 1997 as an accidental initiative of one religious studies scholar, Jonah Winters (See History at the Bahá'í Academics Resource Library site). Mr. Winters's goal is to provide other scholars with convenient access to as much literature and documentation relevant to the academic study of the Bahá'í Faith as possible. Digital versions of published print books, articles, essays and discussion group postings are available, as well as non-text documentation such as maps, images and charts. I'm not aware of any sound files, such as lectures, being available, but I'm sure that Mr. Winters would be open to the idea if a talk contained ideas and information not available elsewhere.
Documents are in various formats and states of "cleanliness" depending on the interest and availability of volunteers to scan, edit and proofread (i.e. compare with the original print text). Fully formatted lengthy texts are broken down into sections, with links provided for the 'previous' and 'next' sections, the table of contents and the section of the web site which contains the book or article. Original page numbers from the print version are included in the files, which is invaluable for discussion and citation of sources. If the books were only in html, with the original pagination ignored, it would be harder to refer to pages accurately. A link to an excellent site by Melvin Page called A Brief Citation Guide for Internet Sources in History and the Humanities is provided on the Some notes on copyright page of the Bahá'í Academics Resource Library. The Library thus provides scholars with the tools they need both to a) ensure that they are using reliable, authoritative sources and to b) prove that fact to their readers.
For most files, there is a note regarding the permissions obtained and conditions of use for the text, and further information is available in the Some notes on copyright page.
There is a lot of activity in the area of digital library research, and much of it is being done by computer scientists who are looking for technological fixes for information organization, searching and retrieval. Börner (2000), Chung et al. (1999), Liu (2000), Maarek (1995) and Shipman (2000) are just a very few examples in this rapidly growing field. The Bahá'í Academics Resource Library works with only a minimum of technology (e.g. the basic search engines attached to specific sections), but is very usable nonetheless. The Library has been organized in a very thoughtful manner, much in the same way that large physical academic collections were managed before there were online catalogues. The most obvious similarity is how it is divided up into relatively small sections. Large universities often had their collections physically divided by department (which would also usually correspond to subject), and each department library would have its own catalogue for accessing the collection. It is quite common for departmental collections to be integrated into the main collection once a library's catalogue is computerized.
3. Navigability of the Library
To experience how the Library is structured, start at "Primary Source Material." Following through a series of links will demonstrate the hierarchical structure of the site, and how it moves from the general to the specific, while always offering the opportunity to return to the next level up, as well as to the original starting point. [Note: the index page Dharlene is referring to here, "Primary Source Material," has been merged into the main Index page. -J.W.]
In this respect, the Bahá'í Academics Resource Library compares favourably with other digital libraries, and might even surpass them in ease of navigation. The survey by Yin Leng Theng (1999) showed that the most common indication of "lostness" in digital libraries was not being able to return to previously visited information. This problem is eliminated in the Bahá'í Academics Resource Library by the care taken in linking documents to the levels above them, as well as the generally transparent structure of the Library as a whole. Also, once the searcher actually reaches the text of a passage from a book, reference is made to the physical copy of the book from which the passage was taken. This helps scholars who use the Library to access their personal collections more efficiently, as well as those who want to cite from a physical copy of the book rather than an electronic version.
4. Shortcomings of the Bahá'í Academics Resource Library
The main, and quite important, element lacking in the Bahá'í Academics Resource Library is full subject access. The items in the Library are not catalogued in the traditional sense, and they don't have metadata with subject headings. The search engines do not have the benefit of a thesaurus, which would help with problems like synonymy and varying forms of people's names. Schatz et al. (1996) present an intriguing technical solution which provides an intermediate level in the search process, allowing the user to refer to controlled vocabulary without having to use a separate document. However, this is an expensive technical solution, well out of the reach of a small volunteer-based project such as the Bahá'í Academics Resource Library.
Much of the work of dealing with vocabulary/search terms problems is left to the searcher, or to the website manager who also functions as a reference librarian much of the time, pointing people to the files they need. The Bulletin Board and direct email are used for reference requests. Some decent alternatives are, however, available for searchers who need subject access, mainly in the Resource Tools section, and these will be looked at below.
Despite its relative simplicity, and the narrow scope of the subject being collected, growth of the Library is becoming unsustainable. Almost every file that is posted to the Library is scanned, proofread, marked up for html and saved to the server by hand. Because these activities are so time-consuming, volunteers contributing to the Library are reluctant to add further information to the files, such as metadata following a standard like Dublin Core, even if they have the technical knowledge to do so. As it stands now, the website manager estimates that there is a four-year backlog of contributed texts waiting to be formatted and posted to the Library, and that is without the additional task of subject analysis.
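To give a sense of what the metadata step would add to each file, here is a minimal sketch in Python that generates Dublin Core `<meta>` elements for the head of an html file. It is an illustration only, not a description of how the Library is built: the function name `dublin_core_meta`, the sample record and its subject terms are all assumptions, and only the `DC.` naming convention comes from the Dublin Core guidelines for html.

```python
from html import escape


def dublin_core_meta(record):
    """Return html <meta> lines for a dict of Dublin Core elements.

    `record` maps an element name (e.g. "title") to a string or a list
    of strings; a repeated element such as "subject" becomes several tags.
    """
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, values in record.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Escape quotes and ampersands so the attribute stays valid html.
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Hypothetical record; the subject terms are assumed, not real headings.
example = {
    "title": "Sex, Gender, and New Age Stereotyping",
    "creator": "Ta'eed, Lata",
    "subject": ["Gender issues", "Equality"],
}
print(dublin_core_meta(example))
```

Generating the tags is the trivial part. The costly step remains the one described above: a person still has to decide what belongs in each field, especially the subject terms.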
The Library is very well organized for known-item searches, which also happens to be the most common way it is used, according to the very informal surveys by Mr. Winters. Some of the fundamentals of information organization are evident in the Library. It does bring like items together, in that items in a hierarchy may be by the same author (in the Sacred Writings section), or of a similar format or nature (such as in the "Secondary Source Material" section) [Note: the index page Dharlene is referring to here, "Secondary Source Material," has been merged into the main Index page. -J.W.]. It works very well for helping people find items that they already know exist, especially when they do a very specific search in the correct section using the search engines, but falls short in helping people find items of which they were previously unaware. The Resource Guide for the Scholarly Study of the Bahá'í Faith by Jonah Winters and Robert Stockman, which is available in the "Resources" section, takes care of some of this type of searching, but the Guide doesn't come close to covering the scope of the Library itself.
As you'll remember from the History paragraph, the Resource Guide for the Scholarly Study of the Bahá'í Faith was written before the Bahá'í Academics Resource Library started, and was therefore never intended to be used as access to the Bahá'í Academics Resource Library. I still feel, however, that it is a very reliable way for scholars to determine which books and articles in the Library are relevant to their interests.
My example begins on the Table of Contents page of the Resource Guide for the Scholarly Study of the Bahá'í Faith. I clicked on the Table of Contents of the Annotated Bibliography of Scholarship on the Bahá'í Faith. The Table of Contents functions somewhat as a list of subject headings. I chose section 27: Gender Issues and Equality. The write-up for section 27 is in a narrative format. Some explanation of the topic and the Bahá'í perspective is given, then relevant pages from some standard works are listed, along with references to entire books, articles and passages from other books. Full bibliographic detail of the works is given in the Bibliography of All Works Cited. The searcher can then, armed with specific details of authors and titles, use the search engines on the Bahá'í Academics Resource Library. I chose "Sex, Gender, and New Age Stereotyping" by Lata Ta'eed, and was able to find it in the Published Articles section at: http://bahai-library.com/taeed_sex_gender_stereotyping.
The Resource Guide for the Scholarly Study of the Bahá'í Faith has an index, which is also included in the online version. It is of very low quality, because it was generated automatically, but there is a friendly, honest note attesting to this fact. For the online version of the Resource Guide, however, the index is unfortunately entirely useless, because the page numbers it refers to are not included in the main body of the guide. Making the page numbers of this index into links is not a solution I would recommend, unless the index were rewritten by a professional.
Since the Resource Guide for the Scholarly Study of the Bahá'í Faith is the best subject access available on the Bahá'í Academics Resource Library, and since it is not perfect, other options for subject access are provided by the website manager.
5. Alternatives for Subject Access provided by Internet Resources Outside of the Bahá'í Academics Resource Library
The Bahá'í Academics Resource Library provides access to a lot of resources outside of its own collection. The main starting points are under "Resources" and "Catalogues" [Note: these indexes are no longer online. -J.W.]. There is nothing that provides something like subject headings as well as the Resource Guide, but many of the bibliographies are annotated and can be perused for topics.
The catalogues are actually all bibliographies or lists, and none have subject headings beyond some annotations. Landegg University, a Bahá'í university in Switzerland, is not linked in the catalogues section of the Bahá'í Academics Resource Library, only under Organizational sites. I thought it must have a good collection, so I investigated its library catalogue, and present a short evaluation here for the sake of comparison.
The Landegg International University Library has a Microsoft Access database of its catalogue available on the web site, but it is not linked from the Bahá'í Academics Resource Library, for a few good reasons. It offers a title/author keyword search, and pull-down lists of subject headings and languages, which I thought was promising. Unfortunately, the subject headings seem to be only the very broadest Library of Congress Classification headings, identified by their classification letters alone. The list runs from A to Z in the Library of Congress Classification, with Videos added in the normally unused V area, and Holy Writings by the Báb and Bahá'u'lláh moved to the front of the classification in Ab, although these are not subject headings, but rather just a grouping by author. (The classification Ac says it is for Writings of Abdu'l-Bahá, but from the results list, seems to actually be for books on the environment.) The list of subject headings is very long indeed, and includes many headings for which there are no holdings and are never likely to be any. For instance, it is highly unlikely that the History of the Netherlands will be a major topic collected in this library. Other classification headings, however, are already overfull, such as the History of Asia, because the Bahá'í Faith began in Persia, and the history of the Faith's beginnings is usually studied in conjunction with the history of conditions in the region.
The main problem, though, is that in the subject headings pull-down under BP, where the Bahá'í Faith is listed in the Library of Congress Classification, there is no subdivision whatsoever. If the searcher chooses this option, they get a very long list of books. It's disappointing to see this, actually, when William Collins at the International Bahá'í Library, the depository library for the Bahá'í world, has provided a very useful expansion to the BP schedule which can handle all Bahá'í material very well. The International Bahá'í Library has also developed Bahá'í Subject Headings for use by Bahá'í Librarians managing Bahá'í collections.
The bibliographic information in the Landegg catalogue is minimal, providing only author, title, language (which the searcher already knows, because she chose it in the pull-down list) and the truncated LC classification number. The Bahá'í Academics Resource Library as it stands now, without the benefit of a database, is more useful than the Landegg International University Library catalogue for providing lists of Bahá'í literature, and even goes further by offering the full text. The Landegg catalogue seems to be simply a tool for locating the books on the shelf at the university itself, but walking up to the shelves and looking would probably be equally, if not more, effective. This catalogue is therefore insufficient as an alternative means of subject access to the Bahá'í Academics Resource Library.
6. Possible Improvements
Improvement #1: Apply a controlled vocabulary to each file in the Bahá'í Academics Resource Library and provide this in the metadata
Advantages: This would dramatically increase the efficiency of the search engines currently running on the system, since more meaningful terms could be added to the files than might currently be there, and similar documents would be brought together in searches.
Applying a thesaurus to control synonymy and authorities for name control to the library would be an incredible improvement, since, in the case of Bahá'í studies, name authorities are essential to control variations. Not only is there a tremendous problem with transliteration and diacritics, but individuals also sometimes have a few names associated with them. Just one example would be Montreal-born Mary Maxwell. When she married Shoghi Effendi, the great-grandson of Bahá'u'lláh, the prophet-founder of the Faith, she was given the name Amatu'l-Baha Ruhiyyih Khanum, which is subject to many spelling variations.
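The kind of name control described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the authority file below has a single entry, the choice of preferred heading is mine rather than one taken from the Bahá'í Subject Headings, and the helper names `match_key`, `build_index` and `authorized_form` are hypothetical. The matching key strips diacritics, case, apostrophes, hyphens and spaces so that spelling variants collapse to the same string.

```python
import unicodedata


def match_key(name):
    """Reduce a name to lowercase letters and digits for loose matching."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks (accents), then anything that is not a letter or digit.
    base = (c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in base if c.isalnum()).casefold()


# Preferred heading -> other names for the same person.
# The preferred form chosen here is an assumption for illustration.
AUTHORITY = {
    "Amatu'l-Bahá Rúhíyyih Khánum": [
        "Mary Maxwell",
        "Amatu'l-Baha Ruhiyyih Khanum",
    ],
}


def build_index(authority):
    """Map the matching key of every known form to its preferred heading."""
    index = {}
    for heading, variants in authority.items():
        for form in [heading, *variants]:
            index[match_key(form)] = heading
    return index


def authorized_form(name, index):
    """Return the preferred heading for a name, or None if it is unknown."""
    return index.get(match_key(name))


index = build_index(AUTHORITY)
print(authorized_form("Amatul-Baha Ruhiyyih Khanum", index))
print(authorized_form("mary maxwell", index))
```

Spelling variants are caught by the normalization alone, but a different name for the same person, such as a birth name, can only be caught if someone has recorded it in the authority file. That part of the work is intellectual, not technical.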
Disadvantages: Someone would have to go through and apply subject headings to each file manually. Bahá'í Subject Headings are available at the International Bahá'í Library web site. They are in an automated format rather than a text that can be browsed easily, but using them is doable. Ideally, of course, the International Bahá'í Library would make its catalogue available online, so that copy cataloguing could be done, thereby eliminating the time-consuming task of intellectually analysing the books, articles and other items in the Bahá'í Academics Resource Library for the best subject headings. It is very possible that the Bahá'í Academics Resource Library has many items that the International Bahá'í Library does not (because of unpublished items). Even so, the ability to copy catalogue would dramatically decrease the amount of time it would take to add essential metadata to those files which do correspond.
Improvement #2: Put the whole library into a database, with each file being available in full text
Advantages: This would aid in searching and inventory of the Library, because author, title and text would be searchable in separate indexes, thereby improving the precision of searching. Searchers could choose different ways to display and sort bibliographic records.
Disadvantages: Each file would have to be manipulated by hand to have the various parts (title, author, text, copyright info, etc.) put into the various fields. This would require many, many hours and extra training of volunteers, not only to do this kind of work but to learn to perform it according to standards. A highly detailed procedures manual would have to be written.
This wouldn't solve the problem of the lack of subject access, unless Improvement #1 were combined with Improvement #2.
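The gain in precision from separate fields can be illustrated with a small sketch using Python's built-in `sqlite3` module. The table layout, the `search` helper and the placeholder body text are assumptions made for this example. The two titles and their authors come from this paper, but the body text is invented to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE documents (
           id INTEGER PRIMARY KEY,
           author TEXT, title TEXT, section TEXT, body TEXT)"""
)
# Body text below is placeholder content, not the real documents.
conn.executemany(
    "INSERT INTO documents (author, title, section, body) VALUES (?, ?, ?, ?)",
    [
        ("Ta'eed, Lata", "Sex, Gender, and New Age Stereotyping",
         "Published Articles", "placeholder text citing Winters"),
        ("Stockman, Robert; Winters, Jonah",
         "A Resource Guide for the Scholarly Study of the Bahá'í Faith",
         "Resources", "placeholder text with a section on gender"),
    ],
)

SEARCHABLE = {"author", "title", "body"}


def search(conn, field, term):
    """Return titles whose given field contains the term (ASCII case-insensitive)."""
    if field not in SEARCHABLE:
        raise ValueError(f"cannot search on field {field!r}")
    # The field name is checked against a whitelist; the term is bound safely.
    sql = f"SELECT title FROM documents WHERE {field} LIKE ? ORDER BY title"
    return [row[0] for row in conn.execute(sql, (f"%{term}%",))]


print(search(conn, "author", "Winters"))  # works written by Winters
print(search(conn, "body", "Winters"))    # works that merely mention him
```

A single full-text search engine cannot tell these two cases apart, which is the precision problem that fielded records address. As noted above, the subject field would still be empty unless Improvement #1 were carried out as well.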
Improvement #3: Create a completely separate union catalogue of the literature of the Bahá'í world, to which various libraries could contribute records and add a tag indicating which items they hold.
Advantages: Creating a completely separate union catalogue is very attractive, in that it would potentially provide access to all Bahá'í collections, rather than just the Bahá'í Academics Resource Library, and therefore have more long-term value. It would be like Materials in Dutch Libraries, but with multiple locations for each item, and many, many of those locations would be live links to the digital files held at the Bahá'í Academics Resource Library. It would have to be in database format, rather than a bibliography format, though, in order to provide the benefits mentioned in Improvement #2. Improvement #1 would of course also be implicit in this project, since to be a proper, authoritative, universal catalogue of Bahá'í literature, the records should be as complete as possible.
A resource like this would serve as a permanent union catalogue which Bahá'í libraries, physical or digital, all over the world, could use in lieu of creating their own cataloguing. They would simply have to look into the catalogue for the items they have in hand, and add their location code. If their item was not there, then they could contribute original cataloguing. It could work the same way that OCLC Worldcat and other bibliographic utilities do, but on a much smaller scale, because of the restricted subject scope of the relatively few participating libraries. Searchers would not have to use many different tools (bibliographies, lists, guides, etc.) to access many different collections. There would be one complete, comprehensive resource.
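The workflow just described, in which a library either attaches its location code to an existing record or contributes a new one, can be sketched as a small Python data structure. Everything named here is hypothetical: the class and method names, the location codes `BARL` and `IBL`, and the shelf mark. Matching records on author and title alone is a simplification; a real union catalogue would need stronger rules for deciding when two records describe the same work.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    author: str
    title: str
    # location code -> where to find the item (a shelf mark or a URL)
    holdings: dict = field(default_factory=dict)


class UnionCatalogue:
    def __init__(self):
        self._records = {}

    @staticmethod
    def _key(author, title):
        # Simplified matching rule: same author and title, ignoring case.
        return (author.strip().casefold(), title.strip().casefold())

    def add_holding(self, author, title, location, where):
        """Attach a location to a record, creating the record if needed.

        Returns True when this call contributed a new (original) record.
        """
        key = self._key(author, title)
        created = key not in self._records
        record = self._records.setdefault(key, Record(author, title))
        record.holdings[location] = where
        return created

    def locations(self, author, title):
        """Return the sorted location codes holding this item."""
        record = self._records.get(self._key(author, title))
        return sorted(record.holdings) if record else []


catalogue = UnionCatalogue()
# First contributor supplies original cataloguing plus a live link.
catalogue.add_holding(
    "Ta'eed, Lata", "Sex, Gender, and New Age Stereotyping",
    "BARL", "http://bahai-library.com/taeed_sex_gender_stereotyping",
)
# A second library finds the record and only adds its location code.
catalogue.add_holding(
    "Ta'eed, Lata", "Sex, Gender, and New Age Stereotyping",
    "IBL", "hypothetical shelf mark",
)
print(catalogue.locations("Ta'eed, Lata", "Sex, Gender, and New Age Stereotyping"))
```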
The other very attractive advantage is that the Bahá'í Academics Resource Library could be left intact in the format and structure it already has, and which users are currently finding very easy to use. This solution fits with the spirit of the Bahá'í Academics Resource Library in pointing to resources outside itself. Its missions and goals would not have to be altered in any way.
Disadvantages: This would be relatively simple for a team of trained librarians to do, but would not be free, even if the librarians volunteered. It would have to be done on a database, so that searchers could restrict searches to certain fields, which would be separately indexed, thereby allowing searches by title, author, subject, format, etc. Databases which are sophisticated enough to be useful for easy searching of bibliographic information, such as Inmagic, are far from cheap. Less expensive databases, such as Microsoft Access, would be unsatisfactory. In addition, a reliable server would have to be available.
The person-hours required to do this would also be considerable. All of the disadvantages of Improvements #1 and #2 are applicable here as well. It would make a tremendous difference, though, if the catalogue of the International Bahá'í Library were made available to the union catalogue to form its core, and if its subsequent cataloguing could be copied to it in the future, just as the Library of Congress holdings make up a very substantial proportion of the OCLC Worldcat database.
Digital libraries have emerged recently as an invaluable resource for all kinds of information gathering, but their development has been dominated by attempts to automate all processes associated with them, while abandoning the experience and wisdom which have been accumulated in centuries of library science. There is no technical replacement for intelligent and thoughtful organization, as the success of the Bahá'í Academics Resource Library proves very well. There is much that can be done to improve access not only to that digital library, but to all related collections, if the same energy and dedication which have gone into the Bahá'í Academics Resource Library could be applied to the new project of a universal Bahá'í catalogue.
Arms, Caroline. (2001) Review of The Intellectual Foundation of Information Organization by Elaine Svenonius, Cambridge: MIT Press, 2000. D-Lib Magazine. v.7:no.1 (2001:Jan.). [http://www.dlib.org/dlib/january01/01bookreview.html]
Bates, Marcia J. (1998) Indexing and Access for Digital Libraries and the Internet: Human, Database and Domain Factors. Journal of the American Society for Information Science. v.49: no.13 (1998), pp. 1185-1205.
Börner, Katy. (2000) Extracting and Visualizing Semantic Structures in Retrieval Results for Browsing. Proceedings of the 5th ACM Conference on Digital Libraries. June 2000, San Antonio, Texas. pp. 234-235.
Carroll, John M. (1995) How to Avoid Designing Digital Libraries: A Scenario-based Approach. Allerton 1995 Papers. 37th Allerton Institute 1995: How We do User-Centered Design and Evaluation of Digital Libraries: A Methodological Forum. [http://edfu.lis.uiuc.edu/allerton/95/s2/carroll.html]
Chung, Yi-Ming, Qin He, Kevin Powell and Bruce Schatz. (1999) Semantic Indexing for a Complete Subject Discipline. Proceedings of the 4th ACM Conference on Digital Libraries. Aug. 1999, Berkeley, California. pp. 39-48.
Collins, William P. (Last visited April 26, 2001) Full BP 300 Modification Based on the Library of Congress Classification. [http://library.bahai.org/cat/fullbp.html]
Committee on an Information Technology Strategy for the Library of Congress, Computer Science and Telecommunications Board, Commission on Physical Sciences, Mathematics and Applications, National Research Council. (2001) LC21: A Digital Strategy for the Library of Congress. [http://books.nap.edu/html/lc21.html]
Fox, Edward A., Robert M. Akscyn, Richard K. Furuta, and John J. Leggett. (1995) Digital Libraries. Communications of the ACM. v.38:no.4 (1995:April), pp. 23-28. [http://www.acm.org/pubs/contents/journals/cacm/1995-38/]
Hodge, Gail. (2000) Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. [online]: Council on Library and Information Resources. [http://www.clir.org/pubs/reports/pub91/contents.html]
International Bahá'í Library. [http://library.bahai.org]
International Bahá'í Library. (Last visited April 26, 2001) Bahá'í Subject Headings. [http://library.bahai.org/cat/headings.html]
Internet Engineering Taskforce (IETF). (2001) Uniform Resource Names (urn) Charter. [http://www.ietf.org/html.charters/urn-charter.html]
Jermey, Jonathan and Glenda Browne. (2001) Excerpt from their book Website Indexing. [http://members.optusnet.com.au/~webindexing/Webbook.htm]
Klemperer, Katharina and Stephen Chapman. (1997) Digital Libraries: a Selected Resource Guide. ITAL v.16: no.3 [http://www.lita.org/ital/1603_klemperer.htm]
Landegg International University Library. [http://library.landegg.edu]
Liu, Yew-Huey, Paul Dantzig, Martin Sachs. (2000) Visualizing Document Classification: A Search Aid for the Digital Library. Journal of the American Society for Information Science. v.51: no.3 (2000), pp. 216-227.
Lunin, Lois F. (1993) Introduction and Overview to Special Issue on Digital Libraries. Journal of the American Society for Information Science. v.44:no.8 (1993) pp. 441-445.
Maarek, Yoelle S. (1995) Organizing Documents to Support Browsing in Digital Libraries. Allerton 1995 Papers. 37th Allerton Institute 1995: How We do User-Centered Design and Evaluation of Digital Libraries: A Methodological Forum.
Marchionini, Gary and Hermann Maurer. (1995) The Roles of Digital Libraries in Teaching and Learning. Communications of the ACM. v.38:no.4 (1995:April), pp. 67-75. [http://www.acm.org/pubs/contents/journals/cacm/1995-38/]
Meyyappan, N., G.G. Chowdhury and Schubert Foo. (2000) A Review of the Status of 20 Digital Libraries. Journal of Information Science. v.26:no.5 (2000), pp. 337-355.
Monostori, Krisztián, Arkady Zaslavsky, and Heinz Schmidt. (2000) Document Overlap Detection System for Distributed Digital Libraries. Proceedings of the 5th ACM Conference on Digital Libraries. June 2000, San Antonio, Texas. pp. 226-227.
Nevill-Manning, Craig G., Ian H. Witten and Gordon W. Paynter. (1997) Browsing in Digital Libraries: a Phrase-Based Approach. Proceedings of the 2nd ACM Conference on Digital Libraries. June 1997, Philadelphia, PA. pp. 230-236. [http://www.acm.org/pubs/contents/proceedings/dl/263690/]
Nilan, Michael S. (1995) Ease of User Navigation through Digital Information Spaces. Allerton 1995 Papers. 37th Allerton Institute 1995: How We do User-Centered Design and Evaluation of Digital Libraries: A Methodological Forum. [http://edfu.lis.uiuc.edu/allerton/95/s4/nilan.html]
Paynter, Gordon W., Ian H. Witten, Sally Jo Cunningham and George Buchanan. (2000) Scalable Browsing for Large Collections. Proceedings of the 5th ACM Conference on Digital Libraries. June 2000, San Antonio, Texas. pp. 215-223.
Schatz, Bruce R., Eric H. Johnson, Pauline A. Cochrane and Hsinchun Chen. (1996) Interactive Term Suggestion for Users of Digital Libraries: Using Subject Thesauri and Co-occurrence Lists for Information Retrieval. Proceedings of the 1st ACM Conference on Digital Libraries. June 1996, Bethesda, Maryland. pp. 126-133. [http://www.acm.org/pubs/contents/proceedings/dl/226931/]
Shipman, Frank M. et al. (2000) Guided Paths through Web-based Collections: Design, Experiences, and Adaptations. Journal of the American Society for Information Science. v.51: no.3 (2000), pp. 260-272.
Smith, Philip J. (1995) Barriers to the Effective Search and Exploration of Large Document Databases. Allerton 1995 Papers. 37th Allerton Institute 1995: How We do User-Centered Design and Evaluation of Digital Libraries: A Methodological Forum. [http://edfu.lis.uiuc.edu/allerton/95/s4/psmith.html]
Sreenivasulu, V. (2000) The Role of a Digital Librarian in the Management of Digital Information Systems (DIS). The Electronic Library. v.18:no.1 (2000), pp. 12-20.
Stockman, Robert H. and Jonah Winters. (1997) A Resource Guide for the Scholarly Study of the Bahá'í Faith. Wilmette, IL: Research Office of the Bahá'í National Center.
Suleman, Hussein, Edward A. Fox and Marc Abrams. (2000) Building Quality into a Digital Library. Proceedings of the 5th ACM Conference on Digital Libraries. June 2000, San Antonio, Texas. pp. 228-229. [http://www.acm.org/pubs/contents/proceedings/dl/336597/]
Theng, Yin Leng. (1999) "Lostness" and Digital Libraries. Proceedings of the 4th ACM Conference on Digital Libraries. Aug. 1999, Berkeley, California. pp. 250-251. [http://www.acm.org/pubs/contents/proceedings/dl/313238/]