Computers and the Humanities (November 2004), 38 (4), pg. 343-362
This paper discusses some pitfalls in corpus research and suggests solutions on the basis of examples and computer simulations. We first address reliability problems in language transcriptions, agreement between transcribers, and how disagreements can be dealt with. We then show that the frequencies of occurrence obtained from a...
Automatic Acquisition and Expansion of Hypernym Links
Computers and the Humanities (November 2004), 38 (4), pg. 363-396
Recent developments in computational terminology call for the design of multiple and complementary tools for the acquisition, the structuring and the exploitation of terminological data. This paper proposes to bridge the gap between term acquisition and thesaurus construction by offering a framework for automatic structuring of multi-word candidate...
Experimenting with a Question Answering System for the Arabic Language
Computers and the Humanities (November 2004), 38 (4), pg. 397-415
The World Wide Web (WWW) today is so vast that it has become more and more difficult to find answers to questions using standard search engines. Current search engines can return ranked lists of documents, but they do not deliver direct answers to the user. The goal ...
Evaluation of Linguistic Features for Word Sense Disambiguation with Self-Organized Document Maps
Computers and the Humanities (November 2004), 38 (4), pg. 417-435
Word sense disambiguation automatically determines the appropriate senses of a word in context. We have previously shown that self-organized document maps have properties similar to a large-scale semantic structure that is useful for word sense disambiguation. This work evaluates the impact of different linguistic features on self...
Multiple Heuristics and Their Combination for Automatic WordNet Mapping
Computers and the Humanities (November 2004), 38 (4), pg. 437-455
This paper presents an automatic construction of Korean WordNet from preexisting lexical resources. We develop a set of automatic word sense disambiguation techniques to link a Korean word sense collected from a bilingual machine-readable dictionary to a single corresponding English WordNet synset. We show how individual links provided...
A Stylometric Analysis of Yaşar Kemal's "İnce Memed" Tetralogy
Computers and the Humanities (November 2004), 38 (4), pg. 457-467
We analyze four İnce Memed novels of Yaşar Kemal using six style markers: "most frequent words," "syllable counts," "word type - or part of speech - information," "sentence length in terms of words," "word length in text," and "word length in ...
Stochastic Models for Automatic Diacritics Generation of Arabic Names
Computers and the Humanities (November 2004), 38 (4), pg. 469-481
In this paper, two new models for generating diacritics for Arabic names are proposed. The first proposed model is called the N-gram model. It is a stochastic model that is based on generating a corpus database of N-grams extracted from a large database of names with their ...
Computers and the Humanities (November 2004), 38 (4), pg. iii-iv