Valery Solovyev
Kazan Federal University
Publication
Featured research published by Valery Solovyev.
Journal of the Royal Society Interface | 2014
V. Bochkarev; Valery Solovyev; Søren Wichmann
The frequency with which we use different words changes all the time, and every so often, a new lexical item is invented or another one ceases to be used. Beyond a small sample of lexical items whose properties are well studied, little is known about the dynamics of lexical evolution. How do the lexical inventories of languages, viewed as entire systems, evolve? Is the rate of evolution of the lexicon contingent upon historical factors or is it driven by regularities, perhaps to do with universals of cognition and social interaction? We address these questions using the Google Books N-Gram Corpus as a source of data and relative entropy as a measure of changes in the frequency distributions of words. It turns out that there are both universals and historical contingencies at work. Across several languages, we observe similar rates of change, but only at timescales of at least around five decades. At shorter timescales, the rate of change is highly variable and differs between languages. Major societal transformations as well as catastrophic events such as wars lead to increased change in frequency distributions, whereas stability in society has a dampening effect on lexical evolution.
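As a rough illustration of the measure named in this abstract, relative entropy (Kullback-Leibler divergence) between the word-frequency distributions of two periods can be sketched as follows. The toy counts, the function name `relative_entropy`, the additive smoothing, and the choice of which period serves as the reference are all assumptions made for this example; they are not taken from the paper.

```python
import math
from collections import Counter


def relative_entropy(p_counts, q_counts, smoothing=1.0):
    """Return D(P || Q) in bits for two word-count tables.

    Additive smoothing over the joint vocabulary (an assumption of this
    sketch) keeps the result finite when a word is missing from one period.
    """
    vocab = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + smoothing * len(vocab)
    q_total = sum(q_counts.values()) + smoothing * len(vocab)
    divergence = 0.0
    for word in vocab:
        p = (p_counts.get(word, 0) + smoothing) / p_total
        q = (q_counts.get(word, 0) + smoothing) / q_total
        divergence += p * math.log2(p / q)
    return divergence


# Hypothetical counts for two periods (illustrative only).
counts_1900 = Counter({"the": 900, "carriage": 60, "telegraph": 40})
counts_2000 = Counter({"the": 900, "carriage": 5, "internet": 95})

same = relative_entropy(counts_1900, counts_1900)
changed = relative_entropy(counts_2000, counts_1900)
```

Identical distributions give a divergence of zero, and the value grows as the later distribution departs from the earlier one.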
International Conference on Computational Linguistics | 2013
Rinat Gareev; Maksim Tkachenko; Valery Solovyev; Andrey Simanovsky; Vladimir Ivanov
Current research efforts in Named Entity Recognition deal mostly with the English language. Even though interest in multi-language Information Extraction is growing, only a few works report results for the Russian language. This paper introduces quality baselines for the Russian NER task. We propose a corpus which was manually annotated with organization and person names. The main purpose of this corpus is to provide a gold standard for evaluation. We implemented and evaluated two approaches to NER: knowledge-based and statistical. The first one comprises several components: dictionary matching, pattern matching and rule-based search of lexical representations of entity names within a document. We assembled a set of linguistic resources and evaluated their impact on performance. For the data-driven approach we utilized our implementation of a linear-chain CRF which uses a rich set of features. The performance of both systems is promising (62.17% and 75.05% F1 measure), although they do not employ morphological or syntactical analysis.
Web Intelligence, Mining and Semantics | 2011
Valery Solovyev; Nikita Zhiltsov
Even though the Linking Open Data cloud is constantly growing, there is a serious lack of published data sets related to the domain of academic mathematics. At the same time, since most scholarly publications in mathematics are well-structured and conventional, it is promising to derive a detailed representation of them. The paper describes an approach to extracting and analyzing the structure of mathematical papers. We present the Mocassin ontology, which is used by the analysis algorithms and can be considered an ontology of the structure of scholarly publications in mathematics. The proposed semantic model has been evaluated on a set of real mathematical papers, and preliminary evaluation results are encouraging. We also discuss potential applications of the model to specific information retrieval tasks, including semantic search.
Neurocomputing | 2016
Ildar Z. Batyrshin; Valery Solovyev; Vladimir Ivanov
The paper gives a new definition of non-statistical time series shape association measures that can measure positive and negative shape associations between time series. Local trend association measures based on linear regressions in a sliding window are considered. Methods for extracting and presenting positive and negative local trend association patterns from pairs of time series are described. Examples of applying these methods to the analysis of associations between securities data from Google Finance and between exchange rates are discussed. It is shown on a benchmark example and in the analysis of real time series that the correlation coefficient, in spite of its fundamental role in statistics, is not useful here and can cause confusion in the analysis of time series shape similarity and shape associations.
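One way to picture a local trend association measure of the kind this abstract describes is to fit a least-squares line in every sliding window of each series and then compare the two resulting slope sequences. The sketch below uses the cosine of the slope vectors as the association value; that choice, the function names, and the sample data are assumptions for illustration and may differ from the definition given in the paper.

```python
import math


def window_slopes(series, window):
    """Least-squares slope of each length-`window` sliding window (window >= 2)."""
    xs = range(window)
    x_mean = (window - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slopes = []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        y_mean = sum(chunk) / window
        sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, chunk))
        slopes.append(sxy / sxx)
    return slopes


def local_trend_association(a, b, window=3):
    """Cosine of the two slope sequences: near +1 when local trends agree,
    near -1 when they are inverse, 0.0 if either series is locally flat."""
    sa, sb = window_slopes(a, window), window_slopes(b, window)
    dot = sum(u * v for u, v in zip(sa, sb))
    norm = math.sqrt(sum(u * u for u in sa)) * math.sqrt(sum(v * v for v in sb))
    return dot / norm if norm else 0.0


# Hypothetical series: one rising, and its mirror image.
rising = [1, 2, 4, 7, 11]
falling = [-v for v in rising]

agree = local_trend_association(rising, rising)
inverse = local_trend_association(rising, falling)
```

In this sketch the window length controls how local the compared trends are: short windows react to brief movements, long windows to the overall direction.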
Lobachevskii Journal of Mathematics | 2014
Alexander Elizarov; Alexander Kirillovich; Evgeny Lipachev; Olga Nevzorova; Valery Solovyev; Nikita Zhiltsov
The paper provides a survey of semantic methods for solving fundamental tasks in mathematical knowledge management. Ontological models and formalisms are discussed. We propose an ontology of mathematical knowledge covering a wide range of fields of mathematics. We demonstrate applications of this representation in mathematical formula search and learning.
Computational Intelligence and Neuroscience | 2016
Valery Solovyev; Vladimir Ivanov
Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work in the development of an extraction system is indispensable, either in corpus annotation or in the creation of vocabularies and patterns for a knowledge-based system. Recent works have focused on the adaptation of existing systems (for extraction from English texts) to new domains. Event extraction in other languages has not been studied due to the lack of resources and algorithms necessary for natural language processing. In this paper we define a set of linguistic resources that are necessary for the development of a knowledge-based event extraction system in Russian: a vocabulary of subordination models, a vocabulary of event triggers, and a vocabulary of Frame Elements that are basic building blocks for semantic patterns. We propose a set of methods for creating such vocabularies in Russian and other languages using the Google Books NGram Corpus. The methods are evaluated in the development of an event extraction system for Russian.
Text, Speech and Dialogue | 2014
Valery Solovyev; Vladimir Ivanov
This paper describes a system for problem phrase extraction from texts that contain users’ reviews of products. In contrast to recent works, this system is based on dictionaries and heuristics, not on machine learning algorithms. We explored two approaches to dictionary construction: manual and automatic. We evaluated the system on a dataset constructed using Amazon Mechanical Turk. Performance values are compared to a machine learning baseline.
Automatic Documentation and Mathematical Linguistics | 2014
E. V. Biryaltsev; Alexander Elizarov; Nikita Zhiltsov; Evgeny Lipachev; O. A. Nevzorova; Valery Solovyev
A survey of the key approaches to the semantic processing of mathematical texts is presented. A software platform prototype for the electronic storage of mathematical documents, which is based on the linked open-data (LOD) model and uses semantic information for data management, including formula-fragment searching, is proposed. The analysis of mathematical documents and the extraction of semantic information from the latter are carried out based on the electronic collection of the Izv. Vyssh. Uchebn. Zaved., Mat. (1995–2009) using special-purpose ontologies, metadata representation in the RDF (Resource Description Framework) format, and integration with existing LOD sets.
Linguistic Typology | 2009
Vladimir Polyakov; Valery Solovyev; Søren Wichmann; Oleg Belyaev
The article's primary concern is to address the usage of The world atlas of language structures by comparing it with another typological database of similar scope, Jazyki mira. The comparison is carried out based on a set of criteria. First, the scope of the databases is compared, as well as their differences and similarities in structure, in the number of errors, and in the existing user interfaces. Then calculations of typological similarity and temporal stability of language features based on the data provided by both databases are compared. Finally, conclusions are drawn as to the relative efficiency and usefulness of these databases for different research or educational goals.
Mexican Conference on Pattern Recognition | 2014
Ildar Z. Batyrshin; Valery Solovyev
The paper introduces new time series shape association measures based on Euclidean distance. The method of analysis of associations between time series based on separate analysis of positively and negatively associated local trends is discussed. The examples of application of the proposed measures and methods to analysis of associations between historical prices of securities obtained from Google Finance are considered. An example of time series with inverse associations between them is discussed.