Daniel Hasan Dalip

Universidade Federal de Minas Gerais

Publication

Featured research published by Daniel Hasan Dalip.

ACM/IEEE Joint Conference on Digital Libraries | 2009

Automatic quality assessment of content created collaboratively by web communities: a case study of Wikipedia

Daniel Hasan Dalip; Marcos André Gonçalves; Marco Cristo; Pável Calado

The old dream of a universal repository containing all human knowledge and culture is becoming possible through the Internet and the Web. Moreover, this is happening with the direct collaborative participation of people. Wikipedia is a great example. It is an enormous repository of information with free access and editing, created by the community in a collaborative manner. However, this large amount of information, made available democratically and virtually without any control, raises questions about its relative quality. In this work we explore a significant number of quality indicators, some of them proposed by us and used here for the first time, and study their capability to assess the quality of Wikipedia articles. Furthermore, we explore machine learning techniques to combine these quality indicators into one single assessment judgment. Through experiments, we show that the most important quality indicators are the easiest ones to extract, namely, textual features related to length, structure and style. We were also able to determine which indicators did not contribute significantly to the quality assessment. These were, coincidentally, the most complex features, such as those based on link analysis. Finally, we compare our combination method with a state-of-the-art solution and show significant improvements in terms of effective quality prediction.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2013

Exploiting user feedback to learn to rank answers in Q&A forums: a case study with Stack Overflow

Daniel Hasan Dalip; Marcos André Gonçalves; Marco Cristo; Pável Calado

Collaborative web sites, such as collaborative encyclopedias, blogs, and forums, are characterized by a loose edit control, which allows anyone to freely edit their content. As a consequence, the quality of this content raises much concern. To deal with this, many sites adopt manual quality control mechanisms. However, given their size and change rate, manual assessment strategies do not scale and content that is new or unpopular is seldom reviewed. This has a negative impact on the many services provided, such as ranking and recommendation. To tackle this problem, we propose a learning to rank (L2R) approach for ranking answers in Q&A forums. In particular, we adopt an approach based on Random Forests and represent query and answer pairs using eight different groups of features. Some of these features are used in the Q&A domain for the first time. Our L2R method was trained to learn the answer rating, based on the feedback users give to answers in Q&A forums. Using the proposed method, we were able (i) to outperform a state-of-the-art baseline with gains of up to 21% in NDCG, a metric used to evaluate rankings; we also conducted a comprehensive study of the features, showing that (ii) review and user features are the most important in the Q&A domain although text features are useful for assessing quality of new answers; and (iii) the best set of new features we proposed was able to yield the best quality rankings.
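
The abstract reports gains in NDCG without defining the metric. As a rough illustration only, the sketch below computes NDCG for one ranked list of answers, assuming the common formulation with gain 2^rel - 1 and a log2 position discount. The paper may use a different variant, and the function names `dcg` and `ndcg` are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance grades listed in ranked order.

    Assumes gain 2**rel - 1 and discount log2(position + 1), positions from 1.
    """
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ranked = list(relevances)
    ideal = sorted(ranked, reverse=True)
    if k is not None:
        ranked, ideal = ranked[:k], ideal[:k]
    best = dcg(ideal)
    return dcg(ranked) / best if best > 0 else 0.0
```

For instance, `ndcg([3, 2, 1])` is 1.0 because the answers are already in ideal order, while `ndcg([1, 2, 3])` is lower because the best answer is ranked last.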

Journal of Data and Information Quality | 2011

Automatic Assessment of Document Quality in Web Collaborative Digital Libraries

Daniel Hasan Dalip; Marcos André Gonçalves; Marco Cristo; Pável Calado

The old dream of a universal repository containing all of human knowledge and culture is becoming possible through the Internet and the Web. Moreover, this is happening with the direct collaborative participation of people. Wikipedia is a great example. It is an enormous repository of information with free access and open editing, created by the community in a collaborative manner. However, this large amount of information, made available democratically and virtually without any control, raises questions about its quality. In this work, we explore a significant number of quality indicators and study their capability to assess the quality of articles from three Web collaborative digital libraries. Furthermore, we explore machine learning techniques to combine these quality indicators into one single assessment. Through experiments, we show that the most important quality indicators are those which are also the easiest to extract, namely, the textual features related to the structure of the article. Moreover, to the best of our knowledge, this work is the first to show an empirical comparison between Web collaborative digital libraries regarding the task of assessing article quality.

Association for Information Science and Technology | 2017

A general multiview framework for assessing the quality of collaboratively created content on Web 2.0

Daniel Hasan Dalip; Marcos André Gonçalves; Marco Cristo; Pável Calado

User‐generated content is one of the most interesting phenomena of current published media, as users are now able not only to consume, but also to produce content in a much faster and easier manner. However, such freedom also carries concerns about content quality. In this work, we propose an automatic framework to assess the quality of collaboratively generated content. Quality is addressed as a multidimensional concept, modeled as a combination of independent assessments, each regarding different quality dimensions. Accordingly, we adopt a machine‐learning (ML)‐based multiview approach to assess content quality. We perform a thorough analysis of our framework on two different domains: Question and Answer Forums and Collaborative Encyclopedias. This allowed us to better understand when and how the proposed multiview approach is able to provide accurate quality assessments. Our main contributions are: (a) a general ML multiview framework that takes advantage of different views of quality indicators; (b) an improvement of up to 30% in quality assessment over the best state‐of‐the‐art baseline methods; (c) a thorough feature and view analysis regarding impact, informativeness, and correlation, based on two distinct domains.

ACM/IEEE Joint Conference on Digital Libraries | 2011

Building a research social network from an individual perspective

Alberto H. F. Laender; Mirella M. Moro; Marcos André Gonçalves; Clodoveu A. Davis; Altigran Soares da Silva; Allan J. C. Silva; Carolina A. S. Bigonha; Daniel Hasan Dalip; Eduardo M. Barbosa; Eli Cortez; Peterson S. Procópio; Rafael Odon de Alencar; Thiago N. C. Cardoso; Thiago Salles

In this poster paper, we present an overview of CiênciaBrasil, a research social network involving researchers within the Brazilian INCT program. We describe its architecture and the solutions adopted for data collection, extraction, and deduplication, and for materializing and visualizing the network.

ACM/IEEE Joint Conference on Digital Libraries | 2014

Quality assessment of collaborative content with minimal information

Daniel Hasan Dalip; Harlley Lima; Marcos André Gonçalves; Marco Cristo; Pável Calado

Content generated by users is one of the most interesting phenomena of published media. However, the possibility of unrestricted editing raises doubts about its quality. This issue has motivated many studies on how to automatically assess content quality in collaborative web sites. Generally, these studies use machine learning techniques to combine a large number of quality indicators into a single value representing the overall quality of the document. This need for a high number of indicators, however, has detrimental implications both on the efficiency and on the effectiveness of the quality assessment algorithms. In this work, we exploit and extend a feature selection method based on the SPEA2 multi-objective genetic algorithm. Results show that we can reduce the feature set to between 15% and 25% of the original, while obtaining error rates comparable to the state of the art.
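
The multi-objective idea behind this kind of feature selection can be illustrated with a Pareto filter: a candidate feature subset is kept only if no other subset is at least as good on both objectives (fewer features, lower error) and strictly better on one. The sketch below is not SPEA2 and is not the authors' implementation. It shows only the dominance test that such algorithms build on, and the function names and candidate numbers are hypothetical.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b on both objectives
    and strictly better on one. Candidates are (n_features, error) pairs,
    and lower is better for both values."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical (n_features, error) pairs, invented for illustration only.
candidates = [(68, 0.80), (17, 0.82), (10, 0.90), (30, 0.85)]
front = pareto_front(candidates)
```

Here `(30, 0.85)` is dropped because `(17, 0.82)` uses fewer features and has a lower error. The remaining three candidates represent different trade-offs between feature count and error.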

Social Informatics | 2013

Polarity Detection of Foursquare Tips

Felipe Moraes; Marisa A. Vasconcelos; Patrick Prado; Daniel Hasan Dalip; Jussara M. Almeida; Marcos André Gonçalves

In location-based social networks, such as Foursquare, users may post tips with their opinions about visited places. Tips may directly impact the behavior of future visitors, providing valuable feedback to business owners. Sentiment or polarity detection has attracted great attention due to its vast applicability in opinion summarization, ranking or recommendation. However, the automatic detection of polarity of tips faces challenges due to their short sizes and informal content. This paper presents an empirical study of supervised and unsupervised techniques to detect the polarity of Foursquare tips. We evaluate the effectiveness of four methods on two sets of tips, finding that a simpler lexicon-based approach, which does not require costly manual labeling, can be as effective as state-of-the-art supervised methods. We also find that a hybrid approach that combines all considered methods by means of stacking does not significantly outperform the best individual method.
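
To make the lexicon-based idea concrete, the sketch below scores a tip by counting words from small positive and negative word lists. It is a minimal illustration under stated assumptions: the two word lists and the function name `polarity` are hypothetical and are not the lexicon or method evaluated in the paper.

```python
import re

# Hypothetical toy lexicons, invented for illustration only.
POSITIVE = {"great", "good", "love", "delicious", "friendly"}
NEGATIVE = {"bad", "slow", "rude", "dirty", "awful"}


def polarity(tip):
    """Label a tip by comparing counts of positive and negative words."""
    tokens = re.findall(r"[a-z']+", tip.lower())
    score = (sum(t in POSITIVE for t in tokens)
             - sum(t in NEGATIVE for t in tokens))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With these toy lists, `polarity("Great coffee and friendly staff!")` returns `"positive"`, and a tip with no lexicon words, such as `"Open until 9pm"`, returns `"neutral"`. No labeled training data is needed, which is the property the abstract highlights.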

Theory and Practice of Digital Libraries | 2012

On multiview-based meta-learning for automatic quality assessment of wiki articles

Daniel Hasan Dalip; Marcos André Gonçalves; Marco Cristo; Pável Calado

The Internet has seen a surge of new types of repositories with free access and open collaborative editing. However, this large amount of information, made available democratically and virtually without any control, raises questions about its quality. In this work, we investigate the use of meta-learning techniques to combine sets of semantically related quality indicators (a.k.a. views) in order to automatically assess the quality of wiki articles. The idea is inspired by the combination of multiple (quality) experts. We perform a thorough analysis of the proposed multiview-based meta-learning approach in three collections. In our experiments, meta-learning was able to improve the performance of a state-of-the-art method in all tested datasets, with gains of up to 27% in quality assessment.

ACM/IEEE Joint Conference on Digital Libraries | 2011

GreenWiki: a tool to support users' assessment of the quality of Wikipedia articles

Daniel Hasan Dalip; Raquel Lara dos Santos; Diogo Rennó Rocha de Oliveira; Valéria Freitas Amaral; Marcos André Gonçalves; Raquel Oliveira Prates; Raquel Cardoso de Melo Minardi; Jussara M. Almeida

In this work, we present GreenWiki, which is a wiki with a panel of quality indicators to assist the reader of a Wikipedia article in assessing its quality.

International Conference on Weblogs and Social Media | 2016