In recent times, digital resources have taken on steadily greater importance in the field of Buddhist studies, with increasing numbers of digitized versions of Buddhist canonical texts, representation of material culture, and other objects of research becoming available on Web. However, despite the basic availability of such resources, most of them are not set up in an optimal way for usage by researchers; nor are they for the most part integrated with each other. For example, there does not yet exist a system that can operate with equal efficacy with philological data related to Sanskrit, Chinese, Tibetan language materials. Therefore, a comprehensive and concrete framework is needed. Although a method of text description is fairly well established in the form of TEI P5, neither the interfaces, nor the methods of presentation of results for digitized works are as yet satisfactory for the scholars of Buddhism. In this paper, we will present our approach to the establishment of requirements for various kinds of materials used in Buddhist studies and make some suggestions for the implementation of more functional interfaces as a Web research environment for such scholars.
At first, it must be noted that it is very difficult to define adequate requirements for the full range of the scholars of Buddhism, who come from a broad array of language training and methodological approaches. Thus, this paper will focus primarily on the fulfillment of the requirements for the scholars who are dealing with authentic scholarly digitized texts. In the field of Buddhist studies, where texts have been translated across a number of languages in different regions at various points in history, we have no recourse but to deal with several versions of a text, including transmitted, diffused, and translated variations at the same time (Fig.1).
In order to realize such relationship in digitized resources, it is necessary to provide the data in a punctuation-neutral manner. This is because separation of words or sentences is not always obvious; punctuating text itself amounts to an independent and original scholarly contribution, especially in the cases of Sanskrit and classical Chinese manuscripts. Therefore, the basic concepts are:
In addition, in order to preserve compatibility with past research results, all units should be identifiable in legacy media through traditional referencing methods such as T0001,01,0001a01 which means Taishō Daizōkyō, vol.1, page 1, register a (among a, b, and c), line 1. Then, they should be located by URI by implementation of Web API so that the other persons or applications can freely refer to arbitrary units through Web.
In order to realize such concepts, a collaborative editing system needs to be developed. Initially, users should be registered according to their own roles such as visitor, editor, or administrative editor. The role of the editor is that of inputting and checking of the data; the role of administrative editor is checking of the data and the determination of the distribution of the data. Both editor and administrative editor should be able to efficiently select arbitrary units with an easy method such as mouse click or drag on the text body without any restriction posed by the textual structure. Then they can append any necessary information to the units and link selected units to other ones (Fig. 2).
Finally, we will explain the Web Database function. The Web Database stores only relational data that includes one or two locations in the original materials, its annotation or relationship information, contributor's name, data composition and other related information. The writing rule of the location allows the user to use the information even without digitized textual material. However, if the material is originally in digitized form, the data must refer to a logical location such as the URI. Users don't need to be aware of separation between the database and the materials presented to them because to the viewer of the materials, they seem to be seamlessly integrated with the database. By this kind of method, it will be easier to integrate the other materials which are released on other Web sites into the environment. The database can provide arbitrary data according to the user's preference through Web API, as well as retrieve and show the data provided by other DBs. From here, the data can be published under various formats such as RDF. However, in principle, this kind of system would not be able to avoid the problem of overlap if there is an intent to publish its data fully including the original materials. So we are trying to do so according to TEI P5 by using a kind of stand-off markup.
Although we have just begun to develop this approach, it has already been tested by some scholars and we have seen positive results. Actually, these early testers say that it is very useful for them because their accessibility and permissions are limited and explicitly shown. However, some enhancements have been suggested, which mainly concern the function of browsing. We will work on this until the time of DH2011, when we will able to present a much more matured system.
Burnard, L. Bauman, S. 2007 P5: Guidelines for Electronic Text Encoding and Interchange, (link)
Caton, P. 2007 “Distributed Multivalent Encoding, ” Digital Humanities 2007, University of Illinois, Urbana-Champaign, 2-8 June, 2007 33-34
DeRose, S. 2004 “Overlap: A Review and a Horse, ” Extreme Markup Languages 2004, Montreal, 2-6 Aug., 2004 (link)
Lavagnino, J. 2009 “Access, ” Literary and Linguistic Computing, 24 (1) 63-76
Nagasaki, K. 2008 “A Collaboration System for the Philology of the Buddhist Study , ” Digital Humanities 2008, Oulu, Finland, 25-29 June, 2008 262-263
Rehm, G. Witt, A. 2008 “Aspects of Sustainability in Digital Humanities , ” Digital Humanities 2008, Oulu, Finland, 25-29 June, 2008 21-29
Renear, A. H. 2004 “Text Encoding, ” A Companion to Digital Humanities, Schreibman, S. Siemens, R. Unsworth, J. Blackwell Oxford 218-239
Steinkellner, E. 1988 “Methodological Remarks On The Constitution Of Sanskrit Texts From The Buddhist Pramāṇa-Tradition, ” Wiener Zeitschrift für die Kunde Südasiens, Band XXXII, 103-129