Browsing by Author "Cicekli, I."
Now showing 1 - 15 of 15
Item Open Access
Abstract MetaProlog engine (Elsevier, 1998) Cicekli, I.
A compiler-based meta-level system for the MetaProlog language is presented. Since MetaProlog is a meta-level extension of Prolog, the Warren Abstract Machine (WAM) is extended to obtain an efficient implementation of meta-level facilities; this extension is called the Abstract MetaProlog Engine (AMPE). Since theories and proofs are the main meta-level objects in MetaProlog, we discuss their representations and implementations in detail. First, we describe how to represent theories and derivability relations efficiently. At the same time, we present the core part of the AMPE, which supports multiple theories and fast context switching among theories in the MetaProlog system. We then describe how to compute proofs, how to shrink the search space of a goal using partially instantiated proofs, and how to represent other control knowledge in a WAM-based system. In addition to computing proofs that are simply the success branches of search trees, fail branches can also be computed and used in the reasoning process.

Item Open Access
Automatic categorization and summarization of documentaries (Sage Publications Ltd., 2010) Demirtas, K.; Cicekli, N. K.; Cicekli, I.
In this paper, we propose automatic categorization and summarization of documentaries using the subtitles of videos. We propose two methods for video categorization. The first performs unsupervised categorization by applying natural language processing techniques to video subtitles, using the WordNet lexical database and WordNet domains. The second has the same extraction steps but uses a learning module to categorize. Experiments with documentary videos give promising results in discovering the correct categories of videos. We also propose a video summarization method using the subtitles of videos and text summarization techniques.
Significant sentences in the subtitles of a video are identified using these techniques, and a video summary is then composed by finding the video parts corresponding to these summary sentences. © 2010 The Author(s).

Item Open Access
Design and evaluation of an ontology-based information extraction system for radiological reports (Pergamon Press, 2010) Soysal, E.; Cicekli, I.; Baykal, N.
This paper describes an information extraction system that extracts the available information in free-text Turkish radiology reports and converts it into a structured information model using manually created extraction rules and a domain ontology. The ontology provides flexibility in the design of the extraction rules and determines the information model for the extracted semantic information. Although our information extraction system mainly concentrates on abdominal radiology reports, it can be used in other fields of medicine by adapting its ontology and extraction rule set. We achieved very high precision and recall results when evaluating the developed system on unseen radiology reports. © 2010 Elsevier Ltd.

Item Open Access
Formalizing the specification and execution of workflows using the event calculus (Elsevier Inc., 2006-08-03) Cicekli, N. K.; Cicekli, I.
The event calculus is a logic programming formalism for representing events and their effects, especially in database applications. This paper proposes the event calculus as a logic-based methodology for the specification and execution of workflows. It is shown that the control flow graph of a workflow specification can be expressed as a set of logical formulas, and that the event calculus can be used to specify the role of a workflow manager through a set of rules for the execution dependencies of activities. The proposed framework for a workflow manager maintains a history of events to control the execution of activities. The events are instructions to the workflow manager to coordinate the execution of activities.
Based on the events that have already occurred, the workflow manager triggers new events to schedule new activities in accordance with the control flow graph of the workflow. The net effect is an alternative approach to defining a workflow engine whose operational semantics is naturally integrated with that of a deductive database. Within this framework it is possible to model sequential and concurrent activities with or without synchronization, as well as agent assignment and the execution of concurrent workflow instances. The paper thus contributes a logical perspective to the task of developing a formalization for workflow management systems. © 2005 Elsevier Inc. All rights reserved.

Item Open Access
Generalizing predicates with string arguments (Springer New York LLC, 2006-06) Cicekli, I.; Cicekli, N. K.
The least general generalization (LGG) of strings may cause over-generalization in the generalization process of the clauses of predicates with string arguments. We propose a specific generalization (SG) for strings to reduce over-generalization. SGs of strings are used in the generalization of a set of strings representing the arguments of a set of positive examples of a predicate with string arguments. To create an SG of two strings, a unique match sequence between the strings is first found. A unique match sequence of two strings consists of similarities and differences representing the similar and differing parts of those strings. The differences in the unique match sequence are replaced to create an SG of those strings. In the generalization process, a coverage algorithm based on SGs of strings or learning heuristics based on match sequences is used.
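The match-sequence idea in the abstract above can be illustrated with a small sketch. This is a hypothetical illustration built on generic sequence matching, not the authors' actual algorithm; the function name and variable-naming scheme are invented for the example:

```python
from difflib import SequenceMatcher

def specific_generalization(s1, s2):
    """Sketch: align two strings into similarities and differences,
    keep the similar parts, and replace each differing part with a
    fresh variable to obtain a generalized pattern."""
    matcher = SequenceMatcher(None, s1, s2)
    pattern = []
    var_count = 0
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            pattern.append(s1[i1:i2])        # similarity: keep as-is
        else:
            var_count += 1
            pattern.append(f"X{var_count}")  # difference: a variable
    return "".join(pattern)

# Two strings that share a prefix and suffix generalize into a
# pattern with one variable covering the differing middle part.
print(specific_generalization("abcdef", "abXYef"))
```

Here generic character-level alignment stands in for the paper's unique match sequence; the real SG operates on the string arguments of positive examples of a predicate.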
© Springer Science + Business Media, LLC 2006.

Item Open Access
Generic text summarization for Turkish (Oxford University Press, 2010) Kutlu, M.; Cığır, C.; Cicekli, I.
In this paper, we propose a generic text summarization method that generates summaries of Turkish texts by ranking sentences according to their scores. Sentence scores are calculated from their surface-level features, and summaries are created by extracting the highest-ranked sentences from the original documents. To extract sentences that form a summary with extensive coverage of the main content of the text and little redundancy, we use features such as term frequency, key phrase (KP), centrality, title similarity and sentence position. The rank of a sentence is computed by a score function that combines its feature values with the weights of the features. The best feature weights are learned using machine-learning techniques with the help of human-constructed summaries. Performance evaluation is conducted by comparing the summarization outputs with manual summaries from two newly created Turkish data sets. This paper presents one of the first Turkish summarization systems, and its results are promising. We introduce the use of KP as a surface-level feature in text summarization, and we show the effectiveness of the centrality feature in text summarization. The effectiveness of the features in Turkish text summarization is also analyzed in detail. © The Author 2008. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved.

Item Open Access
An intelligent backtracking schema in a logic programming environment (ACM, 1997) Cicekli, I.
We present a new method for representing variable bindings in the Warren Abstract Machine (WAM), so that the ages of variable bindings can easily be found using this new representation in our intelligent backtracking schema.
The age of a variable bound to a non-variable term is the youngest choice point such that backtracking to that choice point can make the variable unbound again. The procedure backtracking point is the choice point of the procedure currently being executed, or the choice point of its first ancestor that has a choice point. Variable ages and procedure backtracking points are used to determine backtracking points in our intelligent backtracking schema. Our schema performs much better on deterministic programs than other intelligent backtracking methods reported in the literature, and its performance on non-deterministic programs is comparable with theirs.

Item Open Access
Keyphrase extraction through query performance prediction (Sage Publications Ltd., 2012) Ercan, G.; Cicekli, I.
Previous research shows that keyphrases are useful tools in document retrieval and navigation. While these findings point to a relation between keyphrases and document retrieval performance, no other work uses this relationship to identify the keyphrases of a given document. This work aims to establish a link between the problems of query performance prediction (QPP) and keyphrase extraction. To this end, features used in QPP are evaluated for keyphrase extraction using a naïve Bayes classifier. Our experiments indicate that these features improve the effectiveness of keyphrase extraction in documents of different lengths. More importantly, the commonly used features of frequency and first position in text perform poorly on shorter documents, whereas QPP features are more robust and achieve better results. © 2012 The Author(s).

Item Open Access
Learning translation templates from bilingual translation examples (Kluwer Academic Publishers, 2001-07) Cicekli, I.; Güvenir, H. A.
A mechanism for learning lexical correspondences between two languages from sets of translated sentence pairs is presented.
These lexical-level correspondences are learned using analogical reasoning between two translation examples. Given two translation examples, the similar parts of the sentences in the source language must correspond to the similar parts of the sentences in the target language. Similarly, the differing parts must correspond to the respective parts of the translated sentences. The correspondences between similarities and between differences are learned in the form of translation templates. A translation template is a generalized translation exemplar pair in which some components are generalized by replacing them with variables in both sentences and establishing bindings between these variables. The learned translation templates are obtained by replacing differences or similarities with variables. This approach has been implemented and tested on a set of sample training datasets and produced promising results that warrant further investigation.

Item Open Access
Natural language querying for video databases (Elsevier Inc., 2008-06-15) Erozel, G.; Cicekli, N. K.; Cicekli, I.
Video databases have become popular in various areas due to recent advances in technology, and video archive systems need user-friendly interfaces for retrieving video frames. In this paper, a user interface to a video database system based on natural language processing (NLP) is described. The video database is built on a content-based spatio-temporal video data model. The data model focuses on semantic content, which includes objects, activities, and spatial properties of objects. Spatio-temporal relationships between video objects, as well as trajectories of moving objects, can be queried with this data model. In this video database system, a natural language interface enables flexible querying. Queries, given as English sentences, are parsed using a link parser, and the semantic representations of the queries are extracted from their syntactic structures using information extraction techniques.
The extracted semantic representations are used to call the relevant parts of the underlying video database system to return the results of the queries. Not only exact matches but also similar objects and activities are returned from the database with the help of the conceptual ontology module. This module is implemented using a distance-based semantic similarity search over the domain-independent ontology WordNet. © 2008 Elsevier Inc. All rights reserved.

Item Open Access
Pragmatics in human-computer conversations (Elsevier, 2002) Saygin, A. P.; Cicekli, I.
This paper provides a pragmatic analysis of some human-computer conversations carried out during the past six years within the context of the Loebner Prize Contest, an annual competition in which computers participate in Turing Tests. The Turing Test posits that for a computer to be granted intelligence, it should imitate human conversational behavior so well as to be indistinguishable from a real human being. We carried out an empirical study exploring the relationship between computers' violations of Grice's cooperative principle and conversational maxims, and their success in imitating human language use. Based on conversation analysis and a large survey, we found that different maxims have different effects when violated, but more often than not, when computers violate the maxims, they reveal their identity. The results indicate that Grice's cooperative principle is at work during conversations with computers. On the other hand, studying human-computer communication may require some modifications to existing frameworks in pragmatics because of certain characteristics of these conversational environments. Pragmatics constitutes a serious challenge to computational linguistics. While existing programs have other significant shortcomings, the biggest hurdle in developing computer programs that can successfully carry out conversations may be modeling the ability to 'cooperate'.
© 2002 Elsevier Science B.V. All rights reserved.

Item Open Access
A ranking method for example-based machine translation results by learning from user feedback (Springer New York LLC, 2011-10) Daybelge, T.; Cicekli, I.
Example-Based Machine Translation (EBMT) is a corpus-based approach to Machine Translation (MT) that utilizes the concept of translation by analogy. In our EBMT system, translation templates are extracted automatically from bilingual aligned corpora by substituting the similarities and differences in pairs of translation examples with variables. In earlier versions of the discussed system, the translation results were ranked solely using the confidence factors of the translation templates. In this study, we introduce an improved ranking mechanism that dynamically learns from user feedback. When a user, such as a professional human translator, submits an evaluation of the generated translation results, the system learns "context-dependent co-occurrence rules" from this feedback. The newly learned rules are later consulted while ranking the results of subsequent translations. Through successive translation-evaluation cycles, we expect the output of the ranking mechanism to comply better with user expectations, listing the more preferred results at higher ranks. We also present an evaluation of our ranking method using precision values at top results and the BLEU metric. © 2010 Springer Science+Business Media, LLC.

Item Open Access
Turing test: 50 years later (Springer, 2000) Saygin, A. P.; Cicekli, I.; Akman, V.
The Turing Test is one of the most disputed topics in artificial intelligence, philosophy of mind, and cognitive science. This paper is a review of the past 50 years of the Turing Test. Philosophical debates, practical developments and repercussions in related disciplines are all covered. We discuss Turing's ideas in detail and present the important comments that have been made on them.
Within this context, behaviorism, consciousness, the 'other minds' problem, and similar topics in the philosophy of mind are discussed. We also cover the sociological and psychological aspects of the Turing Test. Finally, we look at the current situation and analyze programs that have been developed with the aim of passing the Turing Test. We conclude that the Turing Test has been, and will continue to be, an influential and controversial topic. © 2001 Kluwer Academic Publishers.

Item Open Access
Two learning approaches for protein name extraction (Academic Press, 2009) Tatar, S.; Cicekli, I.
Protein name extraction, one of the basic tasks in the automatic extraction of information from biological texts, remains challenging. In this paper, we explore the use of two different machine learning techniques and present the results of the conducted experiments. In the first method, a bigram language model is used to extract protein names. In the second, we use an automatic rule learning method that can identify protein names in biological texts. In both cases, we generalize protein names by using hierarchically categorized syntactic token types. We conducted our experiments on two different datasets. Our first method, based on the bigram language model, achieved an F-score of 67.7% on the YAPEX dataset and 66.8% on the GENIA corpus. The developed rule learning method obtained an F-score of 61.8% on the YAPEX dataset and 61.0% on the GENIA corpus. The results of the comparative experiments demonstrate that both techniques are applicable to the task of automatic protein name extraction, a prerequisite for the large-scale processing of biomedical literature. © 2009 Elsevier Inc. All rights reserved.

Item Open Access
Using lexical chains for keyword extraction (Elsevier Ltd, 2007-11) Ercan, G.; Cicekli, I.
Keywords can be considered condensed versions of documents and short forms of their summaries.
In this paper, the problem of automatically extracting keywords from documents is treated as a supervised learning task. A lexical chain holds a set of semantically related words of a text, and it can be said that a lexical chain represents the semantic content of a portion of the text. Although lexical chains have been used extensively in text summarization, their use for the keyword extraction problem has not been fully investigated. In this paper, a keyword extraction technique that uses lexical chains is described, and encouraging results are obtained. © 2007 Elsevier Ltd. All rights reserved.
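The lexical-chain idea in the abstract above can be sketched as follows. This is a toy illustration, with a hand-made relatedness table standing in for WordNet relations and chain length standing in for a real chain-scoring function; it is not the authors' algorithm:

```python
# Toy relatedness table: in a real system, WordNet relations
# (synonymy, hypernymy, etc.) would determine relatedness.
RELATED = {
    "car": {"car", "vehicle", "engine", "wheel"},
    "vehicle": {"car", "vehicle", "engine", "wheel"},
    "engine": {"car", "vehicle", "engine", "wheel"},
    "wheel": {"car", "vehicle", "engine", "wheel"},
    "dog": {"dog", "puppy"},
    "puppy": {"dog", "puppy"},
}

def lexical_chains(words):
    """Greedily group words into chains of semantically related words."""
    chains = []
    for w in words:
        for chain in chains:
            # attach the word to the first chain it is related to
            if any(w in RELATED.get(c, {c}) for c in chain):
                chain.append(w)
                break
        else:
            chains.append([w])  # no related chain: start a new one
    return chains

def keywords(words, k=2):
    """Propose keywords from the strongest (here: longest) chains."""
    chains = sorted(lexical_chains(words), key=len, reverse=True)
    return [chain[0] for chain in chains[:k]]

text = "car engine dog wheel vehicle puppy sky".split()
print(keywords(text))
```

The vehicle-related words form the longest chain and the dog-related words the second, so one representative from each is proposed as a keyword; a supervised system like the one described would instead learn to score chains and their members from training data.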