Next: References Up: The SUDA project: Collaborative Previous: Implementation method

Experience, Limitations, and Generalizations

Because the SOL package is still under development, we don't yet have much experience on which to base an evaluation. Our approach has been to build the fundamental structure as quickly as possible and to improve it iteratively, adding features as we understand the problems better and see needs for enhancement. In this way, we expect to identify many limitations and, we hope, remove them during the course of development.

We believe that the SOL tools are generally useful for collaborative projects. We hope to see them adopted and enhanced by other scholars in their own work. Our design does have some built-in limitations, however, which might stand in the way of using our tools for other cooperative projects.

SOL is a translation effort. Collaborative work aimed at, for instance, cataloging paintings in museums is similar and could likely use the same sort of tools as SOL. Collaborative work for designing an airplane would most likely not fit into the SOL framework.
SOL is text-based. Not only the raw material (Greek text) but also the derived result (annotated English text) can be represented in ASCII, albeit with some encoding for Greek. Generalizing, we could imagine a collaborative effort to convert a large database stored in any raw form into some other derived form. Raw forms are not necessarily limited to text. For example, a collaboration might try to summarize the important aspects of a database of paintings. Here, the raw form would be paintings, represented as graphics. Derived forms are not limited, either; a collaboration might produce performances of a database of musical scores. Here, the derived form could be an audio file. Our tools could be modified to handle graphical raw forms such as paintings and musical scores, but it is not easy to see how non-textual derived forms such as performances could be submitted by the "translators." Web forms are designed to accept text, not graphics or audio.
Furthermore, the fact that the derived form is text allows SOL to search through the translations based on content. If the raw form were text but the derived form were not, SOL could be modified to search the raw domain and present the associated derived material. If neither is text-based, searching becomes much harder, although methods are being studied for classifying multimedia data for search purposes [YGA96].
There is a one-to-one relation between raw and derived material. For every encyclopedia entry in the Suda, there is a single translation, albeit modified and enhanced by vetting. This correspondence allows us to display progress towards the final goal in a graphical way and to generate reasonable assignments. A collaborative effort to review movies, on the other hand, would admit multiple outputs for each input.
The raw material is organized serially. Because the text is organized alphabetically, we can represent contiguous ranges of entries efficiently. If a translator wants to be assigned all entries having to do with, say, pre-Socratic philosophers, the appropriate assignment consists of singleton ranges scattered across many different letters. Such an assignment is inefficient to represent and awkward for the managing editor to specify. Unfortunately, it is quite likely that translators with particular areas of interest or expertise will request exactly such an assignment. Awkward as it is, our current data structures are able to handle this situation.
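The contrast between contiguous and scattered assignments can be illustrated with a minimal sketch. It assumes, purely for illustration, that each entry is identified by its integer position in the alphabetical order and that an assignment is stored as a list of inclusive ranges; the name `to_ranges` and the sample positions are hypothetical and do not describe SOL's actual data structures.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: an inclusive (first, last) pair of entry
# positions in the serial (alphabetical) order of the raw material.
Range = Tuple[int, int]


def to_ranges(positions: Iterable[int]) -> List[Range]:
    """Coalesce entry positions into maximal contiguous inclusive ranges."""
    ranges: List[Range] = []
    for p in sorted(set(positions)):
        if ranges and p == ranges[-1][1] + 1:
            # p extends the most recent range by one entry.
            ranges[-1] = (ranges[-1][0], p)
        else:
            # p starts a new range (a singleton until extended).
            ranges.append((p, p))
    return ranges


# A contiguous block of 100 entries collapses to a single range.
contiguous = to_ranges(range(100, 200))

# A topical assignment (made-up positions) stays mostly singletons.
scattered = to_ranges([3, 57, 58, 410, 999, 2048])

print(len(contiguous), len(scattered))
```

Under these assumptions the block of 100 entries costs one range, while the six scattered entries cost five, which is the inefficiency described above: the representation still works, but its size grows with the number of scattered entries rather than staying constant.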
Since this project involves people acting in a community, many non-technical human issues can arise. For instance, we have little control over the quality of the translations. To provide some control, we have editors who can vet the translations. It remains to be seen how effectively this arrangement maintains translation quality.


Raphael Finkel
6/2/1998