A picture is worth a thousand words. So a concrete example that you can not only see, but also play with, is worth ten thousand words.
The Quranic Arabic Corpus incorporates much of my vision for the study of the Bible in the third millennium of our civilization. For “under the hood” details, see the description of the research of Kais Dukes, who is — of all things! — a VP of Merrill Lynch.
Three elements of this project are morphological annotation, a syntax treebank and a semantic ontology. All three are combined in a web interface designed to make collaboration possible. Members of the general public who are interested in the Quran itself can browse the original Arabic text and dive into morphology, syntax and semantics as desired. Scholars can work on the actual analysis simply by logging in.
This model of linguistic annotation of a corpus can easily be extended to include bibliography, web resources, archaeological and historical data — the possibilities are endless.
One extension ought to be the ability to add user annotations that are stored locally on the visitor’s own computer but integrate seamlessly with the website.
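As a rough illustration of how locally stored annotations might work, here is a minimal sketch that assumes the browser's Web Storage API (`localStorage`) as the backing store. Everything named here is hypothetical and not part of the Quranic Arabic Corpus: the `userNote:` key prefix, the `saveNote` and `loadNotes` helpers, and the choice of keying notes by a chapter:verse:word location string such as "1:1:2". The in-memory `MemoryStore` is only a stand-in so the sketch also runs outside a browser.

```typescript
// Minimal subset of the Web Storage API; in a browser, window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Hypothetical key scheme: one stored entry per word location, e.g. "1:1:2".
const PREFIX = "userNote:";

// Return all notes saved for a location, or an empty list if there are none.
function loadNotes(store: KeyValueStore, location: string): string[] {
  const raw = store.getItem(PREFIX + location);
  return raw === null ? [] : (JSON.parse(raw) as string[]);
}

// Append a note to the list kept for a location and write the list back.
function saveNote(store: KeyValueStore, location: string, note: string): void {
  const notes = loadNotes(store, location);
  notes.push(note);
  store.setItem(PREFIX + location, JSON.stringify(notes));
}
```

In a browser one would pass `window.localStorage` as the store; the page could then call `loadNotes` for each displayed word and render the visitor's private notes next to the shared annotations, without anything being sent to the server.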
I noticed one feature that is lacking: support for complex searches that draw on the morphological, syntactic and semantic annotations. There is a search box for simple text queries, but a more sophisticated search engine would greatly enhance the value of this remarkable resource.
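To make the idea of annotation-aware search concrete, here is a small sketch of a query that combines morphological, syntactic and semantic criteria. The `AnnotatedToken` record, its field names, the tag values and the sample data are all invented for illustration; the corpus's real schema is certainly richer and may look quite different.

```typescript
// Hypothetical, simplified annotation record for one word of the text.
interface AnnotatedToken {
  location: string;        // chapter:verse:word, e.g. "1:1:2"
  form: string;            // surface form (placeholder strings below)
  pos: string;             // morphology: part-of-speech tag
  root?: string;           // morphology: lexical root, when applicable
  syntacticRole?: string;  // syntax: role in the treebank, e.g. "subject"
  concept?: string;        // semantics: concept in the ontology
}

// A query may constrain any of the annotation layers.
type TokenQuery = Partial<Pick<AnnotatedToken, "pos" | "root" | "syntacticRole" | "concept">>;

const QUERY_FIELDS = ["pos", "root", "syntacticRole", "concept"] as const;

// Keep the tokens that match every field the query specifies.
function searchTokens(tokens: AnnotatedToken[], query: TokenQuery): AnnotatedToken[] {
  return tokens.filter(token =>
    QUERY_FIELDS.every(field => query[field] === undefined || token[field] === query[field])
  );
}

// Invented sample data, not taken from the corpus.
const sample: AnnotatedToken[] = [
  { location: "1:1:1", form: "wordA", pos: "N", root: "abc", syntacticRole: "subject", concept: "person" },
  { location: "1:1:2", form: "wordB", pos: "V", root: "abc" },
  { location: "1:2:1", form: "wordC", pos: "N", root: "xyz", syntacticRole: "object", concept: "place" },
];

// "All nouns that act as a subject": combines a morphological and a syntactic criterion.
const subjectNouns = searchTokens(sample, { pos: "N", syntacticRole: "subject" });
```

A plain text search box cannot express a query like this, because the criteria live in the annotation layers rather than in the surface text. A real search engine would also need indexing and a query syntax, which this linear filter leaves out.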
Tags: data visualization, linguistics, morphology, semantics, syntax
Hmmm… HTML5 offers options for local storage on the user’s computer. (The alternative would probably be browser-specific plugins, but I haven’t thought that much about this.)