Nitin Indurkhya | Researchain


Publication

Featured researches published by Nitin Indurkhya.

Journal of Hypertension | 1994

The relationships between casual and ambulatory blood pressure measurements and central hemodynamics in essential human hypertension

William B. White; Per Lund-Johansen; Sholom M. Weiss; Per Omvik; Nitin Indurkhya

Objective: To determine the association between ambulatory blood pressure (ABP) and central hemodynamics in hypertensive patients and between the area under the 24-h blood pressure curve and the hemodynamic indexes. Patient population: Forty untreated essential hypertensive patients (28 previously untreated, 12 withdrawn from therapy for > 12 weeks). Methods: Patients underwent casual and 24-h ABP monitoring and invasive measurements of central hemodynamics. Central measures of ABP included 24-h mean, awake, and sleep values guided by activity journals. The ABP data were modeled by Fourier series, and the ability of the smoothed and unsmoothed data to predict hemodynamics was compared. Individual blood pressure curves were analyzed by calculating the area under the curve using different threshold awake and sleep values to test the correlations between this form of blood pressure load and hemodynamics. Results: Hemodynamic measures were not predicted by casual blood pressure but were related to ABP. Total peripheral resistance was strongly predicted by the area under the diastolic blood pressure (DBP) curve using an awake threshold of 90 mmHg and a sleep threshold of 80 mmHg (r = 0.56, P < 0.001). Data smoothing using Fourier transformation did not alter any correlations between ABP and hemodynamics. Exercise stroke index, an indicator of cardiac function impaired in early hypertensive heart disease, was also best predicted by area under the DBP curve using the same thresholds as above (r = -0.56, P < 0.001). Conclusions: These data imply that integrated areas under the ABP curve are related to hemodynamic hypertensive indexes and could be used to assess the extent of hypertensive burden in clinical trials.
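The abstract does not spell out how the area under the curve was computed. As an illustration only, the sketch below takes one plausible reading: the area of the diastolic pressure excess above a threshold that switches between awake (90 mmHg) and sleep (80 mmHg) values, integrated with the trapezoidal rule. The function name `pressure_load`, its parameters, and the sample readings are hypothetical and are not taken from the paper.

```python
def pressure_load(times_h, dbp, asleep, awake_thr=90.0, sleep_thr=80.0):
    """Trapezoidal area (mmHg * h) of DBP above a state-dependent threshold.

    times_h: measurement times in hours, increasing
    dbp:     diastolic pressure readings in mmHg
    asleep:  True where the activity journal marks the patient as asleep
    This is an assumed definition of "blood pressure load", not the paper's.
    """
    # Excess over the threshold that applies at each reading (never negative).
    excess = [
        max(p - (sleep_thr if s else awake_thr), 0.0)
        for p, s in zip(dbp, asleep)
    ]
    area = 0.0
    for i in range(1, len(times_h)):
        width = times_h[i] - times_h[i - 1]
        area += 0.5 * (excess[i - 1] + excess[i]) * width
    return area


# Made-up readings: three awake hours at 100 mmHg give 10 mmHg excess over 2 h.
load = pressure_load([0, 1, 2], [100, 100, 100], [False, False, False])
```

A reading of 85 mmHg contributes nothing while awake but 5 mmHg of excess while asleep, which is the effect of using separate thresholds.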

Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery | 2015

Emerging directions in predictive text mining

Nitin Indurkhya

In recent years, Text Mining has seen a tremendous spurt of growth as data scientists focus their attention on analyzing unstructured data. The main drivers for this growth have been big data as well as complex applications where the information in the text is often combined with other kinds of information in building predictive models. These applications require highly efficient and scalable algorithms to meet the overall performance demands. In this context, six main directions are identified where research in text mining is heading: Deep Learning, Topic Models, Graphical Modeling, Summarization, Sentiment Analysis, Learning from Unlabeled Text. Each direction has its own motivations and goals. There is some overlap of concepts because of the common themes of text and prediction. The predictive models involved are typically ones that involve meta‐information or tags that could be added to the text. These tags can then be used in other text processing tasks such as information extraction. While the boundary between the fields of Text Mining and Natural Language Processing is becoming increasingly blurry, the importance of predictive models for various applications involving text means there is still substantial growth potential within the traditional sub‐fields of text mining. These data‐centric directions are also likely to influence future research in Natural Language Processing, especially in resource‐poor languages and in multilingual texts. WIREs Data Mining Knowl Discov 2015, 5:155–164. doi: 10.1002/widm.1154

Archive | 2015

Information Retrieval and Text Mining

Sholom M. Weiss; Nitin Indurkhya; Tong Zhang

Information retrieval is described as a predictive text-mining task. The methods for retrieval can be considered variations of similarity-based nearest-neighbor methods. Both key word search and full document matching are examined. Different methods of measuring similarity are considered including cosine similarity. Classical information retrieval has evolved from retrieval of documents stored in databases to web-based documents. These documents have richer representations with links among documents. Link analysis for ranking similarity of documents is reviewed. Some performance issues for computing similarity are considered including the specification of inverted lists for indexing documents.
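To make the ideas of cosine similarity and inverted lists concrete, here is a minimal sketch using raw term counts and whitespace tokenization. The function names `build_index` and `search` and the sample documents are assumptions for illustration; the chapter's own methods may weight and tokenize text differently.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted list (term -> {doc_id: count}) and per-document norms."""
    index = defaultdict(dict)
    norms = {}
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        for term, c in counts.items():
            index[term][doc_id] = c
        norms[doc_id] = math.sqrt(sum(c * c for c in counts.values()))
    return index, norms


def search(query, index, norms):
    """Rank documents by cosine similarity to the query, best match first."""
    q = Counter(query.lower().split())
    q_norm = math.sqrt(sum(c * c for c in q.values()))
    dots = defaultdict(float)
    # Only documents sharing at least one query term are ever touched.
    for term, qc in q.items():
        for doc_id, dc in index.get(term, {}).items():
            dots[doc_id] += qc * dc
    scores = {d: dot / (q_norm * norms[d]) for d, dot in dots.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


docs = {
    "a": "text mining text",
    "b": "data mining",
    "c": "blood pressure",
}
index, norms = build_index(docs)
results = search("text mining", index, norms)
```

The inverted list is what keeps the similarity computation cheap: document "c" shares no term with the query, so it is never scored.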

Archive | 2015

From Textual Information to Numerical Vectors

Sholom M. Weiss; Nitin Indurkhya; Tong Zhang

Documents are composed of words, and machine learning methods process numerical vectors. This chapter discusses how words are transformed into vectors, readying them for processing by predictive methods. Documents may appear in different formats and may be collected from different sources. With minor modifications, they can be organized and unified for prediction by specifying them in a standard descriptive language, XML. The words or tokens may be further reduced to common roots by stemming. These tokens are added to a dictionary. The words in a document can be converted to vectors using local or global dictionaries. The value of each entry in the vector will be based on measures of the frequency of occurrence of words in a document, such as term frequency (tf) and inverse document frequency (idf). An additional entry in a document vector is a label of the correct answer, such as its topic. Dictionaries can be extended to multiword features like phrases. Dictionary size may be significantly reduced by attribute ranking. The general approach is purely empirical, preparing data for statistical prediction. Linguistic concepts are also discussed, including part-of-speech tagging, word sense disambiguation, phrase recognition, parsing and feature generation.
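The conversion from tokens to vectors can be sketched in a few lines. The version below assumes whitespace tokenization, a single global dictionary, no stemming, and the common weighting tf * log(N / df); the function name `tfidf_vectors` is hypothetical and the chapter may use other variants of the weighting.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (dictionary, vectors) with one tf-idf vector per document."""
    tokenized = [d.lower().split() for d in docs]
    # Global dictionary: every distinct token in the collection, sorted.
    dictionary = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n / df[t]) for t in dictionary}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([tf[t] * idf[t] for t in dictionary])
    return dictionary, vectors


dictionary, vectors = tfidf_vectors(["text mining", "data mining"])
```

With this weighting a token that occurs in every document, such as "mining" here, receives a weight of zero, since it does not help tell the documents apart.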

International Health Informatics Symposium | 2010

Predictive rule discovery from electronic health records

Sholom M. Weiss; Nitin Indurkhya; Chidanand Apte

Automated procedures are described for discovering predictive rules from electronic health records. These patient records are structured, but are not collected relative to any targeted labels or study objectives. The learning methods cycle through all features, simulating labels and converting the problem from unlabeled learning to supervised classification and regression. Each feature in turn is processed as a simulated label, and a prediction is made from the remaining features. Using a decision-rule representation for knowledge extraction, machine learning techniques are applied to a large collection of electronic health records. Many rules are readily induced with significant predictive performance. By formulating the rules as queries to a web search engine, and then counting hit frequencies, we show how medical researchers can assess and rank potential for new insight among a collection of empirically strong associations.
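The idea of treating each feature in turn as a simulated label can be illustrated with a toy example. The sketch below assumes binary features and searches only for the best single-condition rule, scored on the same records it was found on; the paper's actual rule-induction method is more elaborate and is not reproduced here. All names and the sample records are hypothetical.

```python
def best_single_rule(records, target):
    """Best rule of the form 'if feature == value then target = 1, else 0'.

    Returns (feature, value, accuracy), scored on the given records.
    """
    features = [f for f in records[0] if f != target]
    best = None
    for f in features:
        for v in (0, 1):
            correct = sum(
                (1 if r[f] == v else 0) == r[target] for r in records
            )
            acc = correct / len(records)
            if best is None or acc > best[2]:
                best = (f, v, acc)
    return best


def cycle_features(records):
    """Treat every feature in turn as the label, predicting it from the rest."""
    return {target: best_single_rule(records, target) for target in records[0]}


# Made-up records for illustration only.
records = [
    {"high_glucose": 1, "diabetic": 1, "smoker": 0},
    {"high_glucose": 0, "diabetic": 0, "smoker": 0},
    {"high_glucose": 1, "diabetic": 1, "smoker": 1},
    {"high_glucose": 0, "diabetic": 0, "smoker": 1},
]
rules = cycle_features(records)
```

In practice accuracy would be estimated on held-out records rather than the training set, and the strongest rules would then be ranked for novelty, as the abstract describes.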

Archive | 2015

Data Sources for Prediction: Databases, Hybrid Data and the Web

Sholom M. Weiss; Nitin Indurkhya; Tong Zhang

Data for automated prediction comes from many sources. In this chapter we expand our horizons to encompass both text and structured numerical data. Initially, we review the ideal data representations for prediction using either numerical or text data. We consider numerous sources of data including databases, the web, and hybrid forms of text and numerical data. Prototypical examples of blended numerical and text data are given. Using the web as a source of data for prediction is examined. Among the examples presented of web-sourced data are downloaded scientific publications formatted in XML, stock price data and related newswire headlines. Sentiment and opinion analysis are considered with examples from online product reviews and Twitter data. Predictive mining of electronic medical records is presented as an example of mixed-data mining.

Archive | 2010

Overview of Text Mining

Sholom M. Weiss; Nitin Indurkhya; Tong Zhang

Text mining and data mining are contrasted relative to automated prediction. Models are constructed by training on samples of unstructured documents, and results are projected to new text. A standard data format for input to prediction methods is described. The key objective of data preparation is to transform text into a numerical format, eventually sharing a common representation with numerical data mining. Different text-mining tasks are introduced that fit within a predictive framework for machine-learning. These include document classification, information retrieval, clustering of documents, information extraction, and performance evaluation.

Archive | 2015

Using Text for Prediction

Sholom M. Weiss; Nitin Indurkhya; Tong Zhang

Once text is transformed into numerical vectors, automated prediction methods can be applied. Predictive text mining is described in terms of an empirical analysis that looks for word patterns, especially for document classification. Fundamental methods of machine learning from sample data are outlined including similarity-based methods, decision rules and trees, probabilistic methods and linear methods. Evaluation techniques are examined to estimate future performance and to maximize empirical results. Errors and pitfalls in big data evaluation are considered, and graph models for social networks are introduced.

Archive | 2015

Finding Structure in a Document Collection

Sholom M. Weiss; Nitin Indurkhya; Tong Zhang

Document collections are frequently encountered without labels. Labels may be determined by clustering the documents into disparate groups and implicitly finding common themes among the document clusters. This chapter describes methods for clustering documents. A key theme for document clustering is computing measures of similarity. We review the major clustering methods: k-means clustering, hierarchical clustering and the EM algorithm. Strategies for assigning meaning to algorithmically generated clusters and labels are considered. Performance evaluation helps determine the empirical characteristics of desirable clusters.
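Of the clustering methods named, k-means is the simplest to sketch. The version below assumes documents have already been converted to numerical vectors, uses squared Euclidean distance, and seeds the centroids with the first k points so the result is deterministic; these choices, the function names, and the sample points are illustrative assumptions, not the chapter's prescription.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means; returns (labels, centroids)."""
    # Assumed initialization: the first k points serve as starting centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(points, k=2)
```

For document vectors, cosine similarity is often preferred over Euclidean distance; normalizing each vector to unit length before clustering is one common way to approximate that.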

Archive | 2015

Looking for Information in Documents

Sholom M. Weiss; Nitin Indurkhya; Tong Zhang

A common task in natural language processing and text mining is the extraction and formatting of information from unstructured text. One can think of the end goal of information extraction in terms of filling templates codifying the extracted information. The templates are then put into a knowledge database for future use. This chapter describes several models and learning methods that can be used to solve information extraction. We focus on two major subtasks: one is to extract entities, such as person names and organizations, from sentences; the other is to determine the relationships among the extracted entities.
