Roman Suvorov | Researchain

Publication

Featured research published by Roman Suvorov.

European Conference on Information Retrieval | 2016

Exactus Like: Plagiarism Detection in Scientific Texts

Ilya Sochenkov; Denis Zubarev; Ilya Tikhomirov; Ivan Smirnov; Artem Shelmanov; Roman Suvorov; Gennady Osipov

The paper presents an overview of Exactus Like – a plagiarism detection system. Deep parsing for text alignment helps the system to find moderate forms of disguised plagiarism. The features of the system and its advantages are discussed. We describe the architecture of the system and present its performance.

International Conference on Pattern Recognition Applications and Methods | 2015

Assessment of Dendritic Cell Therapy Effectiveness Based on the Feature Extraction from Scientific Publications

Alexey Yu. Lupatov; A. I. Panov; Roman Suvorov; Alexander Shvets; Konstantin N. Yarygin; Galina D. Volkova

Dendritic cell (DC) vaccination is a promising way to combat cancer metastases, especially in the case of immunogenic tumors. Unfortunately, a particular DC vaccine only rarely achieves a satisfactory clinical outcome in the majority of patients treated with it. Apparently, DC vaccination can be successful with certain combinations of features of the tumor and the patient's immune system that are not yet fully revealed. Difficulty in predicting the results of the therapy and the high price of preparing individual vaccines prevent wider use of DC vaccines in medical practice. Here we propose an approach aimed at uncovering correlations between the effectiveness of specific DC vaccine types and personal characteristics of patients to increase the efficiency of cancer treatment and reduce prices. To accomplish this, we suggest a two-step analysis of published clinical trial results for DC vaccines: first, the information extraction subsystem is trained, and, second, the extracted data is analyzed using JSM and AQ methodology.

International Conference on Speech and Computer | 2013

Method for Pornography Filtering in the WEB Based on Automatic Classification and Natural Language Processing

Roman Suvorov; Ilya Sochenkov; Ilya Tikhomirov

The paper presents a method for pornography detection in web pages based on natural language processing. The described classification method uses a feature set of single words and groups of words. Syntax analysis is performed to extract collocations. A modification of TF-IDF is used to weight terms. An evaluation and comparison of the quality and performance of classification are given.
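As a rough illustration of term weighting over single words and word groups, the sketch below computes classic TF-IDF over unigrams and adjacent word pairs. It is an assumption-laden stand-in: the abstract does not specify the paper's TF-IDF modification, and adjacent pairs here replace the syntax-based collocation extraction. The function names `terms` and `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def terms(text):
    """Return unigrams plus adjacent word pairs.

    Adjacent pairs are a crude stand-in for collocations; the paper
    extracts collocations with syntax analysis, which is not shown here.
    """
    words = text.lower().split()
    pairs = [" ".join(p) for p in zip(words, words[1:])]
    return words + pairs

def tf_idf(docs):
    """Compute classic (unmodified) TF-IDF weights per document."""
    term_lists = [terms(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for tl in term_lists for t in set(tl))
    weights = []
    for tl in term_lists:
        counts = Counter(tl)
        total = len(tl)
        weights.append({
            t: (c / total) * math.log(n_docs / df[t])
            for t, c in counts.items()
        })
    return weights

# Hypothetical toy corpus for illustration only.
docs = [
    "free adult video site",
    "free news video site",
    "weather news today",
]
weights = tf_idf(docs)
```

In this toy corpus, a term that occurs in only one document (such as "adult") receives a higher weight than a term shared by several documents (such as "free"), which is the property a classifier built on such features relies on.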

International Conference on Pattern Recognition Applications and Methods | 2015

Assessment of the Extent of the Necessary Clinical Testing of New Biotechnological Products Based on the Analysis of Scientific Publications and Clinical Trials Reports

Roman Suvorov; Ivan Smirnov; K. V. Popov; Nikolay Yarygin; Konstantin N. Yarygin

To estimate patients' risks and make clinical decisions, evidence-based medicine (EBM) relies upon the results of reproducible trials and experiments supported by accurate mathematical methods. Experimental and clinical evidence is crucial, but laboratory testing and especially clinical trials are expensive and time-consuming. On the other hand, a new medical product to be evaluated may be similar to one or many already tested. Results of the studies hitherto performed with similar products may be a useful tool to determine the extent of further pre-clinical and clinical testing. This paper suggests a workflow design aimed at supporting such an approach, including methods for information collection, assessment of research reliability, extraction of structured information about trials, and meta-analysis. Additionally, the paper contains a discussion of the issues emerging during development of an integrated software system that implements the proposed workflow.

Artificial Intelligence: Methodology, Systems, and Applications | 2014

Training Datasets Collection and Evaluation of Feature Selection Methods for Web Content Filtering

Roman Suvorov; Ilya Sochenkov; Ilya Tikhomirov

This paper focuses on the main aspects of developing a high-quality system for dynamic content filtering. These aspects include the collection of meaningful training data and feature selection techniques. The Web changes rapidly, so the classifier needs to be regularly re-trained. The problem of training data collection is treated as a special case of focused crawling. A simple and easy-to-tune technique was proposed, implemented and tested. The proposed feature selection technique tends to minimize the feature set size without loss of accuracy and to account for the interlinked nature of the Web. This is essential to make a content filtering solution fast and non-burdensome for end users, especially when content filtering is performed on restricted hardware. Evaluation and comparison of various classifiers and techniques are provided.

WCSC | 2018

Automatic Image Classification for Web Content Filtering: New Dataset Evaluation

V. P. Fralenko; Roman Suvorov; Ilya Tikhomirov

The paper presents an experimental evaluation of image classification in the field of web content filtering using bag-of-visual-features and convolutional neural network approaches. A data set more difficult than those traditionally used was built from very similar types of images in order to make conditions closer to real-world practice. The F1-measure of classifiers based on bags of visual features was significantly lower than that reported in previously published papers. Convolutional neural networks performed much better. We also measured and compared the training and prediction times of various algorithms.

Archive | 2018

Scientific Research Funding Criteria: An Empirical Study of Peer Review and Scientometrics

Dmitry Devyatkin; Roman Suvorov; Ilya Tikhomirov; Oleg Grigoriev

In this paper we investigated the problem of scientific research funding from the perspective of data mining. The objective was to conduct a versatile retrospective analysis of decisions made by the Russian Foundation for Basic Research regarding scientific research funding. The central task of the analysis was to compare the impact of various items of information on final decision making. In other words, we tried to answer two questions: (a) what does an evaluation committee mainly look at when it selects projects for funding; (b) are scientometric indicators (or science metrics) useful in decision analysis? To achieve this, we built predictive models (classifiers), performed introspection (extracted feature importance) and compared them. The input data was a set of review forms (questionnaires) from the Russian Foundation for Basic Research completed by peer reviewers. The final decision is made by the foundation board (an evaluation committee). Finally, we concluded that the available input (project proposals, expert assessments and scientometric data) was not enough to explain all the decisions. We showed that scientometric data does not have any significant influence on the assessment of project proposals. It also means that h-index, mean impact factor, and publication and citation counts cannot supersede the peer review procedure.

Foresight and STI Governance | 2018

Mapping the Research Landscape of Agricultural Sciences

Dmitry Devyatkin; Elena Nechaeva; Roman Suvorov; Ilya Tikhomirov

A research landscape is a high-level description of the current state of a certain scientific field and its dynamics. High-quality research landscapes are important tools that allow for more effective research management. This paper presents a novel framework for the mapping of research. It relies on full-text mining and topic modeling to pool data from many sources without relying on any specific taxonomy of scientific fields and areas. The framework is especially useful for scientific fields that are poorly represented in scientometric databases, i.e., Scopus or Web of Science. The high-level algorithm consists of (1) full-text collection from reliable sources; (2) the automatic extraction of research fields using topic modeling; (3) semi-automatic linking to scientometric databases; and (4) a statistical analysis of metrics for the extracted scientific areas. Full-text mining is crucial due to (a) the poor representation of many Russian research areas in systems like Scopus or Web of Science; (b) the poor quality of Russian Science Index data; and (c) the differences between taxonomies used in different data sources. Major advantages of the proposed framework include its data-driven approach, its independence from scientific subjects’ taxonomies, and its ability to integrate data from multiple heterogeneous data sources. Furthermore, this framework complements traditional approaches to research mapping using scientometric software like Scopus or Web of Science rather than replacing them. We experimentally evaluated the framework using agricultural science as an example, but the framework is not limited to any particular domain. As a result, we created the first research landscape covering young researchers in agricultural science. Topic modeling yielded six major scientific areas within the field of agriculture. We found that statistically significant differences between these areas exist. This means that a differentiated approach to research management is critical. Further research on this subject includes the application of the framework to other scientific fields and the integration of other collections of research and technical documentation (especially patents).

Artificial Intelligence and Natural Language | 2017

Active Learning with Adaptive Density Weighted Sampling for Information Extraction from Scientific Papers

Roman Suvorov; Artem Shelmanov; Ivan Smirnov

The paper addresses the task of information extraction from scientific literature with machine learning methods. In particular, the tasks of definition and result extraction from scientific publications in Russian are considered. We note that annotating scientific texts to create a training dataset is a very labor-intensive and expensive process. To tackle this problem, we propose methods and tools based on active learning. We describe and evaluate a novel adaptive density-weighted sampling (ADWeS) meta-strategy for active learning. The experiments demonstrate that active learning can be a very efficient technique for scientific text mining, and the proposed meta-strategy can be beneficial for corpus annotation with a strongly skewed class distribution. We also investigate informative task-independent features for information extraction from scientific texts and present an openly available tool for corpus annotation, which is equipped with ADWeS and compatible with well-known sampling strategies.
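To make the idea of density weighting concrete, here is a minimal sketch of the generic density-weighted query heuristic from the active learning literature: each unlabeled example is scored by its uncertainty multiplied by its average similarity to the rest of the pool. This is not the ADWeS meta-strategy itself, whose adaptive component the abstract does not describe. The function names, the `beta` parameter, and the toy pool are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def entropy(p):
    """Binary entropy of a predicted positive-class probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def density_weighted_pick(vectors, probs, beta=1.0):
    """Return the index of the example with the highest
    uncertainty * (average similarity to the pool) ** beta.

    beta = 0 reduces the rule to plain uncertainty sampling.
    """
    best_idx, best_score = None, -1.0
    for i, (x, p) in enumerate(zip(vectors, probs)):
        density = sum(cosine(x, y) for y in vectors) / len(vectors)
        score = entropy(p) * density ** beta
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx

# Hypothetical pool: three similar examples and one outlier.
pool = [(1.0, 0.0), (0.9, 0.1), (0.8, 0.2), (0.0, 1.0)]
# Hypothetical classifier probabilities for the positive class.
probs = [0.6, 0.55, 0.9, 0.5]
choice = density_weighted_pick(pool, probs)
```

With these toy values, plain uncertainty sampling (`beta=0.0`) selects the outlier at index 3 because its probability is exactly 0.5, whereas the density-weighted rule selects index 1, an uncertain example that sits in the dense part of the pool.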

Scientific and Technical Information Processing | 2017

An Information Retrieval System for Decision Support: An Arctic-Related Mass Media Case Study

Dmitry Devyatkin; Roman Suvorov; Ilya Sochenkov

This paper discusses the problem of building a comprehensive information retrieval system that facilitates the decision-making process on a specified broad topic. We analyze the requirements for such a system, types of information sources, and typical search queries, and propose an architecture and an integrated pipeline. We also present a case study in the field of Arctic exploration (oil & mining, ecology issues, etc.). The results, including vibrant topics and typical associations between entities, are also presented.
