Sandip Debnath
Pennsylvania State University
Publication
Featured research published by Sandip Debnath.
electronic commerce | 2003
Sandip Debnath; David M. Pennock; C. Lee Giles; Steve Lawrence
We analyze data from 52 online in-game sports betting markets (where betting is allowed continuously throughout a game), including 34 markets based on soccer (European football) games from the 2002 World Cup, and 18 basketball games from the 2002 USA National Basketball Association (NBA) championship. We show that prices on average approach the correct outcome over time, and the price dynamics in the markets are closely coupled with game events, agreeing with efficient market assumptions. We also examine qualitative distinctions between the two types of games.
international symposium on methodologies for intelligent systems | 2005
Sandip Debnath; Prasenjit Mitra; C. Lee Giles
electronic commerce and web technologies | 2005
Sandip Debnath; Tracy Mullen; Arun Upneja; C. Lee Giles
international conference on knowledge capture | 2005
Sandip Debnath; C. Lee Giles
Intelligent information processing systems, such as digital libraries or search engines, index web pages according to their informative content. However, web pages also contain non-informative content, e.g., navigation sidebars, advertisements, copyright notices, etc. It is very important to separate the informative “primary content blocks” from these non-informative blocks. In this paper, two algorithms, FeatureExtractor and K-FeatureExtractor, are proposed to identify the “primary content blocks” based on their features. Neither algorithm requires supervised learning, yet both identify the “primary content blocks” with high precision and recall. Operating on several thousand web pages obtained from 15 different websites, our algorithms significantly outperform the Entropy-based algorithm proposed by Lin and Ho [14] in both precision and run-time.
acm symposium on applied computing | 2005
Sandip Debnath; Prasenjit Mitra; C. Lee Giles
The Web continues to grow at a tremendous rate, and search engines find it increasingly difficult to provide useful results. To manage this explosively large number of Web documents, automatically clustering documents and organising them into domain-dependent directories has become very popular. In most cases, these directories represent a hierarchical structure of categories and sub-categories for domains and sub-domains. To fill these directories with instances, individual documents are automatically analysed and placed into them according to their relevance. Though individual documents in these collections may not be ranked efficiently, together they provide an excellent knowledge source for facilitating ontology construction in that domain. In (mainly automatic) ontology construction steps, we need to find and use relevant knowledge for a particular subject or term. News documents provide an excellent source of relevant and up-to-date knowledge. In this paper, we focus our attention on building business ontologies. To do that, we use news documents from business domains to get up-to-date knowledge about a particular company. To extract this knowledge in the form of important “terms” related to the company, we apply a novel method to find “related terms” given the company name. We show by examples that our technique can be successfully used to find “related terms” in similar cases.
First Monday | 2003
Amanda Spink; Yashmeet Khopkar; Prital Shah; Sandip Debnath
Metadata plays a crucial role in making documents easier to organise and archive. News metadata includes DateLine, ByLine, HeadLine and many others. We found that HeadLine information is useful for guessing the theme of a news article. For financial news articles in particular, we found that the HeadLine can be especially helpful in locating explanatory sentences for major events such as significant changes in stock prices. In this paper we explore a support vector based learning approach to automatically extract the HeadLine metadata. We find that the classification accuracy of finding the HeadLines improves if DateLines are identified first. We then use the extracted HeadLines to initiate a pattern matching of keywords to find the sentences responsible for the story theme. Using this theme and a simple language model, it is possible to locate explanatory sentences for any significant price change.
uncertainty in artificial intelligence | 2002
David M. Pennock; Sandip Debnath; Eric J. Glover; C. Lee Giles
the florida ai research society | 2005
Sandip Debnath; Prasenjit Mitra; C. Lee Giles
electronic commerce | 2002
Sandip Debnath; David M. Pennock; Steve Lawrence; Eric J. Glover; C. Lee Giles
First Monday, ISSN 1396-0466 | 2003
Yashmeet Khopkar; Amanda Spink; C. Lee Giles; Prital Shah; Sandip Debnath