Hayri Sever | Researchain

Publication

Featured research published by Hayri Sever.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 1995

On the reuse of past optimal queries

Vijay V. Raghavan; Hayri Sever

Vijay V. Raghavan (The Center for Advanced Computer Studies, University of Southwestern Louisiana, Lafayette, LA 70504, USA) and Hayri Sever (The Department of Computer Science, University of Southwestern Louisiana, Lafayette, LA 70504, USA).

Information Retrieval (IR) systems exploit user feedback by generating an optimal query with respect to a particular information need. Since obtaining an optimal query is an expensive process, the need for mechanisms to save and reuse past optimal queries for future queries is obvious. In this article, we propose the use of a query base, a set of persistent past optimal queries, and investigate similarity measures between queries. The query base can be used either to answer user queries or to formulate optimal queries. We justify the former case analytically and the latter case by experiment.
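The idea of a query base can be illustrated with a minimal sketch. The abstract does not say which similarity measures were studied, so cosine similarity over term-weight vectors is an assumption here, and the `QueryBase` class, its method names, and the 0.5 threshold are hypothetical.

```python
import math


def cosine(q1, q2):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * q2.get(t, 0.0) for t, w in q1.items())
    n1 = math.sqrt(sum(w * w for w in q1.values()))
    n2 = math.sqrt(sum(w * w for w in q2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


class QueryBase:
    """Hypothetical store of past optimal queries, keyed by a label."""

    def __init__(self):
        self._queries = {}

    def add(self, label, weights):
        self._queries[label] = dict(weights)

    def most_similar(self, user_query, threshold=0.5):
        """Return (label, similarity) of the closest stored query, or None."""
        best = max(
            ((label, cosine(user_query, q)) for label, q in self._queries.items()),
            key=lambda pair: pair[1],
            default=None,
        )
        return best if best and best[1] >= threshold else None


# Made-up stored queries, for illustration only.
qb = QueryBase()
qb.add("rough-sets", {"rough": 0.9, "set": 0.7, "classifier": 0.4})
qb.add("stemming", {"turkish": 0.8, "stemming": 0.9})
print(qb.most_similar({"rough": 1.0, "classifier": 1.0}))
```

A stored query that clears the threshold could be reused to answer the new request directly or as a starting point for formulating a new optimal query, which corresponds to the two uses named in the abstract.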

Knowledge Discovery and Data Mining | 1998

Feature selection and effective classifiers

Jitender S. Deogun; Suresh K. Choubey; Vijay V. Raghavan; Hayri Sever

In this article, we develop and analyze four algorithms for feature selection in the context of rough set methodology. The initial state and the feasibility criterion of all these algorithms are the same. That is, they start with a given feature set and progressively remove features, while controlling the amount of degradation in classification quality. These algorithms, however, differ in the heuristics used for pruning the search space of features. Our experimental results confirm the expected relationship between the time complexity of these algorithms and the classification accuracy of the resulting upper classifiers. Our experiments demonstrate that a u-reduct of a given feature set can be found efficiently. Although we have adopted upper classifiers in our investigations, the algorithms presented can, however, be used with any method of deriving a classifier, where the quality of classification is a monotonically decreasing function of the size of the feature set. We compare the performance of upper classifiers with those of lower classifiers. We find that upper classifiers perform better than lower classifiers for a duodenal ulcer data set. This should be generally true when there is a small number of elements in the boundary region. An upper classifier has some important features that make it suitable for data mining applications. In particular, we have shown that the upper classifiers can be summarized at a desired level of abstraction by using extended decision tables. We also point out that an upper classifier results in an inconsistent decision algorithm, which can be interpreted deterministically or non-deterministically to obtain a consistent decision algorithm.

... patterns from large databases. As described in Fayyad (1996) and Simoudis (1996), this process is typically made up of selection and sampling, preprocessing and cleaning, transformation and reduction, data mining, and evaluation steps. The first step in the data-mining process is to select a target data set from a database (or a data warehouse) and to possibly sample the target data. The preprocessing and data cleaning step handles noise and unknown values, as well as accounting for missing data fields, time sequence information, and so forth. The data reduction and transformation step involves finding relevant features depending on the goal of the task and certain transformations on the data such as converting one type of data to another (e.g., changing nominal values into numeric ones, discretizing continuous values), and/or defining new attributes. In the mining step, the user may apply one or more knowledge discovery techniques on the transformed data to extract valuable patterns. Finally, the evaluation step involves interpreting the result (or discovered pattern) with respect to the goal/task at hand. Note that the data-mining process is not linear and involves a variety of feedback loops, because any one step can result in changes in preceding or succeeding steps. Furthermore, the nature of a large, real-world data set, which may contain noisy, incomplete, dynamic, redundant, sparse, and missing values, certainly requires that existing techniques and approaches be extended to cope with such problems (Deogun, Raghavan, Sarkar, & Sever,
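The shared skeleton of the algorithms (start from the full feature set, remove features, and bound the loss in classification quality) can be sketched as follows. This is only an illustration under stated assumptions: `quality` uses the standard rough-set dependency degree rather than the upper-classifier measure of the paper, the features are tried in a fixed order rather than with any of the four heuristics, and the function names, the `max_loss` parameter, and the toy data are all hypothetical.

```python
from collections import defaultdict


def quality(rows, features, decision):
    """Fraction of rows whose feature values determine the decision uniquely.

    This is the usual rough-set dependency degree, used here as a stand-in
    (an assumption) for the paper's quality-of-classification measure.
    """
    labels = defaultdict(set)
    counts = defaultdict(int)
    for row in rows:
        key = tuple(row[f] for f in features)
        labels[key].add(row[decision])
        counts[key] += 1
    consistent = sum(n for key, n in counts.items() if len(labels[key]) == 1)
    return consistent / len(rows)


def backward_eliminate(rows, features, decision, max_loss=0.0):
    """Greedily drop features while quality stays within max_loss of the start."""
    kept = list(features)
    floor = quality(rows, kept, decision) - max_loss
    for f in list(features):
        trial = [g for g in kept if g != f]
        # Never drop the last feature; accept the removal only if quality holds.
        if trial and quality(rows, trial, decision) >= floor:
            kept = trial
    return kept


# Made-up decision table: feature "c" alone determines decision "d".
rows = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "no"},
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]
print(backward_eliminate(rows, ["a", "b", "c"], "d"))
```

Raising `max_loss` above zero trades classification quality for a smaller feature set, which is the controlled degradation the abstract refers to.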

Linux Journal | 1997

Data Mining: Trends in Research and Development

Jitender S. Deogun; Vijay V. Raghavan; Amartya Sarkar; Hayri Sever

Data mining is an interdisciplinary research area spanning several disciplines such as database systems, machine learning, intelligent information systems, statistics, and expert systems. Data mining has evolved into an important and active area of research because of theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Many aspects of data mining have been investigated in several related fields. But the problem is unique enough that there is a great need to extend these studies to include the nature of the contents of the real-world databases. In this chapter, we discuss the theory and foundational issues in data mining, describe data mining methods and algorithms, and review data mining applications. Since a major focus of this book is on rough sets and its applications to database mining, one full section is devoted to summarizing the state of rough sets as related to data mining of real-world databases. More importantly, we provide evidence showing that the theory of rough sets constitutes a sound basis for data mining applications.

IEEE International Conference on Fuzzy Systems | 1996

A comparison of feature selection algorithms in the context of rough classifiers

Suresh K. Choubey; Jitender S. Deogun; Vijay V. Raghavan; Hayri Sever

We study the feature selection problem and develop and analyze four algorithms for feature selection in the context of rough set methodology. The initial state and the feasibility criterion of all these algorithms are the same: they start from a given feature set and progressively remove features, while controlling the amount of degradation in classification quality. They differ in the heuristic used for pruning the search space of features. Our experimental results confirm the analytical results on the complexity of the algorithms as well as on the controlled degradation of upper classification. Although we have adopted the upper classifier in our study, the algorithms presented can be used with any method of deriving a classifier where the quality of classification is a monotonically decreasing function as the feature set is reduced. The upper classifier has some important features that make it suitable for database mining applications. In particular, we have shown that the upper classifier can be summarized at a desired level of abstraction by using extended decision tables. We also point out that an inconsistent decision algorithm can be interpreted as if it were a consistent decision algorithm.
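For readers unfamiliar with the terminology, the sketch below computes the lower and upper approximations of a concept in the usual rough-set sense. It assumes that the upper classifier is built from upper approximations, which the abstract does not spell out. The function name and the toy table are hypothetical.

```python
from collections import defaultdict


def approximations(rows, features, decision, target):
    """Return (lower, upper) approximations of the rows labelled `target`.

    Both results are sets of row indices. Rows with identical feature values
    form one block (equivalence class).
    """
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[f] for f in features)].append(i)
    lower, upper = set(), set()
    for members in blocks.values():
        labels = {rows[i][decision] for i in members}
        if target in labels:
            upper.update(members)      # block overlaps the concept
            if labels == {target}:
                lower.update(members)  # block lies entirely inside the concept
    return lower, upper


# Made-up table: rows 0 and 1 share feature values but disagree on the label.
table = [
    {"f1": "high", "f2": "high", "label": "yes"},
    {"f1": "high", "f2": "high", "label": "no"},
    {"f1": "low", "f2": "high", "label": "yes"},
    {"f1": "low", "f2": "low", "label": "no"},
]
lower, upper = approximations(table, ["f1", "f2"], "label", "yes")
print(sorted(lower), sorted(upper))
```

The difference between the two sets is the boundary region. Rules drawn from the upper approximation also cover the boundary rows, which is why such rules can conflict with one another.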

String Processing and Information Retrieval | 2003

FindStem: Analysis and Evaluation of a Turkish Stemming Algorithm

Hayri Sever; Yiltan Bitirim

In this paper, we evaluate the effectiveness of a new stemming algorithm, FINDSTEM, for use with Turkish documents and queries, and compare it with two previously defined Turkish stemmers, namely the "A-F" and "L-M" algorithms. Of these, the FINDSTEM and A-F algorithms employ inflectional and derivational stemmers, whereas the L-M algorithm handles only inflectional rules. The stemming algorithms were compared manually using 5,000 distinct words, on which FINDSTEM, A-F, and L-M failed in 49, 270, and 559 cases, respectively. A medium-size collection, comprising 2,468 law records with 280K document words, 15 natural-language queries with an average length of 17 search words, and complete relevance information for each query, was used to evaluate the effectiveness of FINDSTEM. We localized the SMART retrieval system in terms of a stopping list, the introduction of Turkish characters (i.e., the ISO8859-9 (Latin-5) code set), a stemming algorithm (FINDSTEM), and a Turkish translation at the message level. Our results, based on average precision values at 11-point recall levels, show that indexing document terms as well as search terms with FINDSTEM is clearly and consistently more effective than indexing the terms as they are (that is, with no stemming at all).
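The effectiveness figure mentioned above, average precision at 11 recall levels, can be computed for a single query as sketched below. The sketch uses the common interpolation rule (precision at a recall level is the best precision at that level or beyond). Whether the paper's SMART setup applied exactly this rule is an assumption, and the function name and document identifiers are hypothetical.

```python
def eleven_point_average_precision(ranking, relevant):
    """Mean interpolated precision at recall 0.0, 0.1, ..., 1.0 for one query."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("at least one relevant document is required")
    points = []  # (recall, precision) observed at each relevant hit
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    total = 0.0
    for step in range(11):
        level = step / 10
        # Interpolated precision: best precision at any recall >= level.
        total += max((p for r, p in points if r >= level - 1e-12), default=0.0)
    return total / 11


# Made-up ranking with relevant documents at ranks 1 and 3.
score = eleven_point_average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})
print(round(score, 4))
```

Averaging this value over all queries, once with stemmed and once with unstemmed indexing, gives the kind of comparison the abstract reports.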

Lecture Notes in Computer Science | 2000

Application of Metadata Concepts to Discovery of Internet Resources

Mehmet Emin Küçük; Baha Olgun; Hayri Sever

Internet resources are not yet machine-understandable. A number of studies have addressed this problem. One such study is the Resource Description Framework (RDF), which is supported by the World Wide Web (WWW) Consortium. The Dublin Core (DC) metadata elements have been defined using the extensibility property of RDF to handle electronic metadata information. In this article, an authoring editor called H-DCEdit is introduced. This editor uses the RDF/DC model to describe the contents of Turkish electronic resources. SGML (Standard Generalized Markup Language) is used to serialize (or code) an RDF model. In addition, a possible view of RDF/DC documents is provided using the Document Style Semantics and Specification Language (DSSSL) standard. H-DCEdit supports the use of the Turkish language in describing Internet resources. The Isite/Isearch system, developed by the Center for Networked Information Discovery and Retrieval (CNIDR) in accordance with the Z39.50 standard, can index documents and allows one to query the indexed terms in tagged elements (e.g., terms in RDF/DC elements). Within the scope of our work, the localization of the Isite/Isearch system has been completed in terms of sorting, comparison, and stemming. Support for queries over tags provides the basis for integrating the H-DCEdit authoring tool with the Isite/Isearch search engine.

Cybernetics and Systems | 1999

ILA-2: An Inductive Learning Algorithm for Knowledge Discovery

Mehmet R. Tolun; Hayri Sever; Mahmut Uludag; Saleh M. Abu-Soud

In this paper we describe the ILA-2 rule induction algorithm, an improved version of the novel inductive learning algorithm ILA. We first outline the basic ILA algorithm and then show how it is improved using a new evaluation metric that handles uncertainty in the data. With this new soft computing metric, users can reflect their preferences through a penalty factor to control the performance of the algorithm. ILA-2 also has a faster pass criterion feature, not available in basic ILA, which reduces the processing time without sacrificing much accuracy. We show experimentally that the performance of ILA-2 is comparable to that of well-known inductive learning algorithms, namely CN2, OC1, ID3, and C4.5.

International Symposium on Methodologies for Intelligent Systems | 1999

Concept Based Retrieval by Minimal Term Sets

Ali H. Alsaffar; Jitender S. Deogun; Vijay V. Raghavan; Hayri Sever

The problem of bridging the terminological gap between the way users prefer to specify their information needs and the way queries are formulated in terms of words or text expressions is of considerable interest. The central ideas of existing approaches based on expert systems technology were introduced in the context of a system called RUBRIC. In RUBRIC, user query topics (or concepts) are captured in a rule base, and the rule base is represented as an AND/OR tree. Determining the retrieval output by evaluating the AND/OR tree is exponential in m, where m is the maximum number of conjunctions in the DNF expression associated with a query topic. In this paper, we propose a method of computing retrieval output that involves preprocessing the rule base to generate what we call Minimal Term Sets (MTSs), which speed up the computations needed for retrieval. The computational complexity associated with the proposed approach is polynomial in m. We also show that MTSs can provide additional advantages for users by enabling them to (i) choose query topics that best suit their needs from among existing ones and (ii) use retrieval functions that yield more refined and controlled retrieval output than is possible with the AND/OR tree.
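The preprocessing idea can be illustrated with a purely Boolean simplification: expand an AND/OR tree into its DNF conjunctions and keep only those with no proper subset among the others. This sketch ignores rule weights, and the paper's actual MTS construction may differ. The nested-tuple tree encoding, the function names, and the example topic are hypothetical.

```python
from itertools import product


def dnf(node):
    """Expand an AND/OR tree into a set of conjunctions (frozensets of terms).

    A node is either a term (str) or a tuple ("AND" | "OR", child, child, ...).
    """
    if isinstance(node, str):
        return {frozenset([node])}
    op, *children = node
    parts = [dnf(child) for child in children]
    if op == "OR":
        return set().union(*parts)
    if op == "AND":
        # One conjunction per way of picking a conjunction from each child.
        return {frozenset().union(*combo) for combo in product(*parts)}
    raise ValueError(f"unknown operator: {op}")


def minimal_term_sets(node):
    """Keep only conjunctions that have no proper subset among the others."""
    conjunctions = dnf(node)
    return {c for c in conjunctions if not any(o < c for o in conjunctions)}


# Made-up query topic, for illustration only.
topic = (
    "OR",
    ("AND", "query", ("OR", "feedback", "reuse")),
    ("AND", "query", "feedback", "optimal"),
    "stemming",
)
for terms in sorted(minimal_term_sets(topic), key=sorted):
    print(sorted(terms))
```

In this example the conjunction {query, feedback, optimal} is dropped because {query, feedback} already satisfies the topic. The expansion is done once per topic, ahead of retrieval, so that retrieval itself works only with the resulting term sets.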

Intelligent Information Systems | 2000

Enhancing Concept-Based Retrieval Based on Minimal Term Sets

Ali H. Alsaffar; Jitender S. Deogun; Vijay V. Raghavan; Hayri Sever

There is considerable interest in bridging the terminological gap that exists between the way users prefer to specify their information needs and the way queries are expressed in terms of keywords or text expressions that occur in documents. One of the approaches proposed for bridging this gap is based on technologies for expert systems. The central idea of such an approach was introduced in the context of a system called Rule Based Information Retrieval by Computer (RUBRIC). In RUBRIC, user query topics (or concepts) are captured in a rule base represented by an AND/OR tree. The evaluation of the AND/OR tree is essentially based on the minimum and maximum weights of query terms for conjunctions and disjunctions, respectively. The time to generate the retrieval output of the AND/OR tree for a given query topic is exponential in m, the number of conjunctions in the DNF expression associated with the query topic. In this paper, we propose a new approach for computing the retrieval output. The proposed approach involves preprocessing the rule base to generate Minimal Term Sets (MTSs) that speed up the retrieval process. The computational complexity of the on-line query evaluation following the preprocessing is polynomial in m. We show that the computation and use of MTSs allow a user to choose query topics that best suit their needs and to use retrieval functions that yield a more refined and controlled retrieval output than is possible with the AND/OR tree when document terms are binary. We incorporate the p-Norm model into the process of evaluating MTSs to handle the case where the weights of both documents and query terms are non-binary.
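The min/max evaluation rule described in this abstract can be sketched in a few lines. The sketch assumes that each leaf scores as the weight of that term in the document and leaves out any rule weights that RUBRIC attaches to the tree. The nested-tuple encoding, the function name, and the example weights are hypothetical.

```python
def evaluate(node, doc_weights):
    """Score a document against an AND/OR tree: min for AND, max for OR.

    A node is either a term (str) or a tuple ("AND" | "OR", child, child, ...).
    Terms absent from the document score 0.0.
    """
    if isinstance(node, str):
        return doc_weights.get(node, 0.0)
    op, *children = node
    scores = [evaluate(child, doc_weights) for child in children]
    if op == "AND":
        return min(scores)
    if op == "OR":
        return max(scores)
    raise ValueError(f"unknown operator: {op}")


# Made-up topic and document term weights, for illustration only.
topic = ("OR", ("AND", "rough", "set"), "stemming")
print(evaluate(topic, {"rough": 0.8, "set": 0.5}))
```

With binary document weights this reduces to ordinary Boolean matching. The p-Norm model mentioned in the abstract replaces the hard min and max with smoother functions for the non-binary case.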

The Scientific World Journal | 2014

Performance Evaluation of the Machine Learning Algorithms Used in Inference Mechanism of a Medical Decision Support System

Mert Bal; M. Fatih Amasyali; Hayri Sever; Guven Kose; Ayşe Demirhan

Decision support systems are increasingly important in supporting decision making under uncertainty and incomplete information, and they are widely used in fields such as engineering, finance, and medicine. Medical decision support systems help healthcare personnel select the optimal method during the treatment of patients. Decision support systems are intelligent software systems that support decision makers in their decisions. The design of a decision support system consists of four main components: the inference mechanism, the knowledge base, the explanation module, and the active memory. The inference mechanism constitutes the basis of a decision support system. Various methods can be used in these mechanisms, including decision trees, artificial neural networks, statistical methods, and rule-based methods. In decision support systems, these methods can be used separately or combined in a hybrid system. In this study, synthetic data sets with 10, 100, 1000, and 2000 records were produced to reflect the probabilities of the ALARM network. The accuracy of 11 machine learning methods for the inference mechanism of a medical decision support system is compared on various data sets.
