
Publication


Featured research published by Sholom M. Weiss.


IEEE Intelligent Systems & Their Applications | 1999

Maximizing text-mining performance

Sholom M. Weiss; Chidanand Apte; Fred J. Damerau; David E. Johnson; Frank J. Oles; Thilo Goetz; Thomas Hampp

The authors' adaptive resampling approach surpasses previous decision-tree performance and validates the effectiveness of small, pooled local dictionaries. They demonstrate the approach using the Reuters-21578 benchmark data and a real-world customer e-mail routing system.
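
The core of adaptive resampling can be sketched as follows. This is a minimal, hedged illustration of the general boosting-style idea, not the paper's exact scheme; the function names, the duplicate-the-misclassified resampling rule, and the `rounds` parameter are assumptions for this sketch:

```python
def adaptive_resample(examples, labels, train, predict, rounds=5):
    """Boosting-style adaptive resampling: after each round, examples the
    current model misclassifies are duplicated in the training pool so the
    next classifier focuses on them; prediction is a majority vote."""
    models = []
    pool_x, pool_y = list(examples), list(labels)
    for _ in range(rounds):
        model = train(pool_x, pool_y)
        models.append(model)
        for x, y in zip(examples, labels):
            if predict(model, x) != y:   # hard example: give it more weight
                pool_x.append(x)
                pool_y.append(y)

    def vote(x):
        preds = [predict(m, x) for m in models]
        return max(set(preds), key=preds.count)
    return vote
```

Any `train`/`predict` pair (e.g. a decision-tree learner) can be plugged in; the resampling loop is independent of the base classifier.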


European Conference on Principles of Data Mining and Knowledge Discovery | 2000

Lightweight document clustering

Chidanand Apte; Sholom M. Weiss; Brian F. White

A lightweight document clustering method is described that operates in high dimensions, processes tens of thousands of documents and groups them into several thousand clusters, or, by varying a single parameter, into a few dozen clusters. The method uses a reduced indexing view of the original documents, where only the k best keywords of each document are indexed. An efficient procedure for clustering is specified in two parts: (a) compute the k most similar documents for each document in the collection, and (b) group the documents into clusters using these similarity scores. The method has been evaluated on a database of over 50,000 customer service problem reports that are reduced to 3,000 clusters and 5,000 exemplar documents. Results demonstrate efficient clustering performance with excellent group similarity measures.
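
The reduced-indexing step can be illustrated with a short sketch. This is not the published algorithm: the greedy single-pass grouping, the keyword-overlap `threshold`, and all function names are assumptions made for illustration only:

```python
from collections import defaultdict

def top_k_keywords(doc_tokens, k=5):
    """Reduced indexing: keep only the k most frequent tokens of a document."""
    counts = defaultdict(int)
    for tok in doc_tokens:
        counts[tok] += 1
    return set(sorted(counts, key=counts.get, reverse=True)[:k])

def cluster(docs, k=5, threshold=2):
    """Greedy single-pass grouping: a document joins the first cluster whose
    exemplar shares at least `threshold` of its top-k keywords."""
    indexed = [top_k_keywords(d, k) for d in docs]
    clusters = []  # each cluster: (exemplar keyword set, list of doc indices)
    for i, kw in enumerate(indexed):
        for exemplar, members in clusters:
            if len(kw & exemplar) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((kw, [i]))
    return clusters
```

Because each document is represented by a small keyword set rather than its full term vector, similarity tests stay cheap even for tens of thousands of documents.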


IBM Systems Journal | 2002

Predictive algorithms in the management of computer systems

Ricardo Vilalta; Chidanand Apte; Joseph L. Hellerstein; Sheng Ma; Sholom M. Weiss

Predictive algorithms play a crucial role in systems management by alerting the user to potential failures. We report on three case studies dealing with the prediction of failures in computer systems: (1) long-term prediction of performance variables (e.g., disk utilization), (2) short-term prediction of abnormal behavior (e.g., threshold violations), and (3) short-term prediction of system events (e.g., router failure). Empirical results show that predictive algorithms can be successfully employed in the estimation of performance variables and the prediction of critical events.


Machine Learning and Data Mining in Pattern Recognition | 2001

Advances in predictive models for data mining

Se June Hong; Sholom M. Weiss

Expanding application demand for data mining of massive data warehouses has fueled advances in automated predictive methods. We examine a few successful application areas and their technical challenges. We review the key theoretical developments in PAC and statistical learning theory that have led to the development of support vector machines and to the use of multiple models for increased predictive accuracy.


IBM Journal of Research and Development | 2003

Data-intensive analytics for predictive modeling

Chidanand Apte; Se June Hong; Ramesh Natarajan; Edwin P. D. Pednault; Fateh A. Tipu; Sholom M. Weiss

The Data Abstraction Research Group was formed in the early 1990s to bring focus to the work of the Mathematical Sciences Department in the emerging area of knowledge discovery and data mining (KD & DM). Most activities in this group have been performed in the technical area of predictive modeling, roughly at the intersection of machine learning, statistical modeling, and database technology. There has been a major emphasis on using business and industrial problems to motivate the research agenda. Major accomplishments include advances in methods for feature analysis, rule-based pattern discovery, and probabilistic modeling, and novel solutions for insurance risk management, targeted marketing, and text mining. This paper presents an overview of the group's major technical accomplishments.


IBM Systems Journal | 2007

Analytics-driven solutions for customer targeting and sales-force allocation

Richard D. Lawrence; Claudia Perlich; Saharon Rosset; J. Arroyo; M. Callahan; J. M. Collins; A. Ershov; S. Feinzig; Ildar Khabibrakhmanov; Shilpa N. Mahatma; M. Niemaszyk; Sholom M. Weiss

Sales professionals need to identify new sales prospects, and sales executives need to deploy the sales force against the sales accounts with the best potential for future revenue. We describe two analytics-based solutions developed within IBM to address these related issues. The Web-based tool OnTARGET provides a set of analytical models to identify new sales opportunities at existing client accounts and noncustomer companies. The models estimate the probability of purchase at the product-brand level. They use training examples drawn from historical transactions and extract explanatory features from transactional data joined with company firmographic data (e.g., revenue and number of employees). The second initiative, the Market Alignment Program, supports sales-force allocation based on field-validated analytical estimates of future revenue opportunity in each operational market segment. Revenue opportunity estimates are generated by defining the opportunity as a high percentile of a conditional distribution of the customer's spending, that is, what we could realistically hope to sell to this customer. We describe the development of both sets of analytical models, the underlying data models, and the Web sites used to deliver the overall solution. We conclude with a discussion of the business impact of both initiatives.
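
The percentile-based definition of revenue opportunity can be sketched in a few lines. This is a simplified empirical-percentile estimate, not the paper's conditional-distribution model; the function name, the `pct` parameter, and the peer-group framing are assumptions for illustration:

```python
def revenue_opportunity(peer_spending, pct=0.9):
    """Opportunity estimate: a high percentile of the spending distribution
    of comparable customers, i.e. what one could realistically hope to sell.
    Uses a simple empirical percentile over the sorted peer spending values."""
    ordered = sorted(peer_spending)
    idx = min(int(pct * len(ordered)), len(ordered) - 1)
    return ordered[idx]
```

In the paper the distribution is conditional on customer attributes; here a flat list of peer spending stands in for that conditional distribution.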


Knowledge Discovery and Data Mining | 2001

Solving regression problems with rule-based ensemble classifiers

Nitin Indurkhya; Sholom M. Weiss

We describe a lightweight learning method that induces an ensemble of decision-rule solutions for regression problems. Instead of direct prediction of a continuous output variable, the method discretizes the variable by k-means clustering and solves the resultant classification problem. Predictions on new examples are made by averaging the mean values of classes with votes that are close in number to the most likely class. We provide experimental evidence that this indirect approach can often yield strong results for many applications, generally outperforming direct approaches such as regression trees and rivaling bagged regression trees.
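
The two key steps, discretizing the target by 1-D k-means and decoding a prediction by averaging the means of near-winning classes, can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation; the `margin` vote-closeness rule and all names are assumptions:

```python
import random

def kmeans_1d(values, k, iters=20):
    """Discretize a continuous target into k classes by 1-D k-means;
    returns the class centers (the mean value of each class)."""
    centers = sorted(random.sample(values, k))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            # assign each value to its nearest center
            groups[min(range(k), key=lambda j: abs(v - centers[j]))].append(v)
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

def decode(votes, centers, margin=1):
    """Predict by averaging the mean values of classes whose vote counts
    come within `margin` of the most likely class."""
    best = max(votes)
    close = [centers[i] for i, v in enumerate(votes) if best - v <= margin]
    return sum(close) / len(close)
```

A rule-based ensemble classifier trained on the discretized labels would supply the `votes`; the decoding step converts those votes back to a continuous prediction.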


Knowledge Discovery and Data Mining | 2003

Knowledge-based data mining

Sholom M. Weiss; Stephen J. Buckley; Shubir Kapoor; Søren Damgaard

We describe techniques for combining two types of knowledge systems: expert and machine learning. Both the expert system and the learning system represent information by logical decision rules or trees. Unlike the classical views of knowledge-base evaluation or refinement, our view accepts the contents of the knowledge base as completely correct. The knowledge base and the results of its stored cases will provide direction for the discovery of new relationships in the form of newly induced decision rules. An expert system called SEAS was built to discover sales leads for computer products and solutions. The system interviews executives by asking questions and, based on the responses, recommends products that may improve a business's operations. Leveraging this expert system, we record the results of the interviews and the program's recommendations. The very same data stored by the expert system is used to find new predictive rules. Among the potential advantages of this approach are (a) the capability to spot new sales trends and (b) the substitution of less expensive probabilistic rules that use database data instead of interviews.


European Conference on Principles of Data Mining and Knowledge Discovery | 2001

Lightweight Collaborative Filtering Method for Binary-Encoded Data

Sholom M. Weiss; Nitin Indurkhya

A lightweight method for collaborative filtering is described that processes binary-encoded data. Examples of transactions that can be described in this manner are items purchased by customers or web pages visited by individuals. As with all collaborative filtering, the objective is to match a person's records to customers with similar records. For example, based on the prior purchases of a customer, one might recommend new items for purchase by examining stored records of other customers who made similar purchases. Because the data are binary (true-or-false) encoded, and not ranked preferences on a numerical scale, efficient and lightweight schemes are described for compactly storing data, computing similarities between new and stored records, and making recommendations tailored to an individual.
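
With binary data, similarity reduces to set (or bit-vector) intersection, which keeps storage and scoring cheap. A minimal sketch of that idea, assuming set-based records, a simple overlap score, and a hypothetical `top_n` neighbor count, none of which come from the paper:

```python
def recommend(new_purchases, stored, top_n=2):
    """Score each stored customer by overlap with the new record (binary
    data: set intersection), then recommend items the most-similar
    customers bought that the new customer has not."""
    scored = sorted(stored, key=lambda s: len(new_purchases & s), reverse=True)
    items = set()
    for record in scored[:top_n]:
        items |= record - new_purchases   # items the neighbor has, the new customer lacks
    return items
```

In a production setting the sets would typically be packed into bit vectors so that intersection is a bitwise AND, which is what makes the binary encoding "lightweight".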


IEEE Intelligent Systems & Their Applications | 2000

Lightweight document matching for help-desk applications

Sholom M. Weiss; Brian F. White; Chidanand Apte; Fredrick J. Damerau

For decades, researchers have been working on ways to process text for classification and for retrieving relevant documents in response to queries. The authors describe a method that uses minimal data structures and lightweight algorithms to match new documents to those stored in a database. It is a completely automated Java-based document matcher that accepts an unlimited-length textual structure as input and employs a fast matching algorithm to produce, like a search engine, a ranked list of relevant documents.
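
The matching step can be illustrated with a small sketch (in Python rather than the Java of the original system). The shared-token scoring and all names here are assumptions for illustration, not the published matcher:

```python
def rank_documents(query_text, database):
    """Tokenize the query, score every stored document by shared-token
    count, and return a ranked (score, doc_id) list, like a search engine."""
    query = set(query_text.lower().split())
    scores = [(len(query & set(text.lower().split())), doc_id)
              for doc_id, text in database.items()]
    return sorted((s for s in scores if s[0] > 0), reverse=True)
```

For a help-desk application, `database` would hold previously resolved problem reports, and the ranked list surfaces the closest stored cases for a new incoming report.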
