Maria Nisheva
Sofia University
Publication
Featured research published by Maria Nisheva.
international conference on conceptual structures | 2017
Milko Krachunov; Maria Nisheva; Dimitar Vassilev
In high-variation genomics datasets, such as those found in metagenomics or complex polyploid genome analysis, error detection and variant calling are impeded by the difficulty of discerning sequencing errors from actual biological variation. Confirming base candidates with a high frequency of occurrence is no longer a reliable measure, because of the natural variation and the presence of rare bases. This work employs machine learning models to classify bases into erroneous and rare variations, after preselecting potential error candidates with a weighted frequency measure, which aims to focus on unexpected variations by using the inter-sequence pairwise similarity. Different similarity measures are used to account for different types of datasets. Four machine learning models are tested.
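The weighted frequency idea in this abstract can be pictured with a minimal sketch. It rests on assumptions not taken from the paper: the sequences are already aligned to equal length, pairwise similarity is plain fractional identity, and the function names `identity` and `weighted_frequency` are hypothetical. The paper's actual measure and similarity functions may differ.

```python
from typing import Sequence


def identity(a: str, b: str) -> float:
    """Fraction of aligned positions where two equal-length sequences agree.

    Assumption: a stand-in for the pairwise similarity measures the abstract mentions.
    """
    return sum(x == y for x, y in zip(a, b)) / len(a)


def weighted_frequency(seqs: Sequence[str], idx: int, pos: int) -> float:
    """Similarity-weighted share of other sequences that agree with seqs[idx] at pos.

    A low value means that sequences similar to seqs[idx] mostly disagree with
    its base at this position, so the base is an unexpected variation and a
    possible error candidate.
    """
    target = seqs[idx]
    agree = 0.0
    total = 0.0
    for j, other in enumerate(seqs):
        if j == idx:
            continue  # a sequence does not vote for its own base
        weight = identity(target, other)
        total += weight
        if other[pos] == target[pos]:
            agree += weight
    return agree / total if total else 0.0


# Toy alignment: the final 'A' of the third sequence is supported only by a
# dissimilar sequence, so its weighted frequency is low.
reads = ["ACGT", "ACGT", "ACGA", "TTTA"]
score = weighted_frequency(reads, idx=2, pos=3)
```

In this toy alignment a plain column frequency would give the final `A` a share of one half, while the weighted version discounts the vote of the dissimilar fourth sequence.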
federated conference on computer science and information systems | 2016
Atanas Radenski; Todor V. Gurov; Kalinka Kaloyanova; Nikolay Kirov; Maria Nisheva; Peter Stanchev; Eugenia Stoimenova
Big data is a broad term with numerous dimensions, most notably: big data characteristics, techniques, software systems, application domains, computing platforms, and big data milieu (industry, government, and academia). In this paper we briefly introduce fundamental big data characteristics and then present seven case studies of big data techniques, systems, applications, and platforms, as seen from an academic perspective (industry and government perspectives are not the subject of this publication). While we feel that it is difficult, if at all possible, to encapsulate all of the important big data dimensions in a strict and uniform, yet comprehensible language, we believe that a set of diverse case studies like the one offered in this paper, spread over the principal big data dimensions, can indeed be beneficial to the broad big data community by helping experts in one realm to better understand current trends in the other realms.
international conference on conceptual structures | 2015
Milko Krachunov; Dimitar Vassilev; Maria Nisheva; Ognyan Kulev; Valeriya Simeonova; Vladimir Dimitrov
NGS data processing in metagenomics studies has to deal with noisy data that can contain a large number of reading errors which are difficult to detect and account for. This work introduces a fuzzy indicator of reliability technique to facilitate solutions to this problem. It includes modified Hamming and Levenshtein distance functions that are intended as drop-in replacements in NGS analysis procedures which rely on distances, such as phylogenetic tree construction. The distances utilise fuzzy sets of reliable bases or an equivalent fuzzy logic, potentially aggregating multiple sources of base reliability.
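As an illustration of how a reliability-aware distance might work, the sketch below modifies the Hamming distance. It assumes that each base carries a reliability degree in [0, 1] and that a mismatch counts only as much as the less reliable of the two bases (a fuzzy AND via `min`). The function name `fuzzy_hamming` and this weighting rule are assumptions for illustration, not the definition used in the paper.

```python
from typing import Sequence


def fuzzy_hamming(
    a: str,
    b: str,
    rel_a: Sequence[float],
    rel_b: Sequence[float],
) -> float:
    """Hamming-style distance in which each mismatch is weighted by reliability.

    rel_a[i] and rel_b[i] are membership degrees in [0, 1] of the bases a[i]
    and b[i] in a fuzzy set of reliable bases (assumed representation).
    A mismatch contributes min(rel_a[i], rel_b[i]) instead of 1.
    """
    if not (len(a) == len(b) == len(rel_a) == len(rel_b)):
        raise ValueError("sequences and reliability vectors must have equal length")
    distance = 0.0
    for x, y, ra, rb in zip(a, b, rel_a, rel_b):
        if x != y:
            distance += min(ra, rb)
    return distance


# With full reliability everywhere the result equals the classic Hamming distance.
classic = fuzzy_hamming("ACGT", "TCGA", [1.0] * 4, [1.0] * 4)

# A mismatch on a low-reliability base contributes only a fraction of a unit.
discounted = fuzzy_hamming("ACGT", "ACGA", [1.0] * 4, [1.0, 1.0, 1.0, 0.25])
```

Because the function takes the same two sequences as an ordinary Hamming distance plus the reliability vectors, it could be swapped into a distance-based procedure without changing the rest of the pipeline, which matches the drop-in intent stated in the abstract.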
artificial intelligence: methodology, systems, applications | 2018
Iliyan Mihaylov; Maria Nisheva; Dimitar Vassilev
The use of machine learning in disease prediction and prognosis is part of a growing trend of personalized and predictive medicine. Cancer studies are a domain of active machine learning implementation, in particular with regard to the accuracy of cancer prognosis and prediction. The accuracy of survival time prediction in breast cancer is the main object of the study. Two major features for survival time prediction based on clinical data are used: the tumor integrated clinical feature created in the study and the Nottingham prognostic index. The applied machine learning methods, along with data normalisation and classification, provide promising results for the accuracy of survival time prediction. Results showed prepotency of the support vector regression models - linear and decision tree regression models, for more accurate prediction of the survival time in breast cancer. Cross-validation, based on four parameters for error evaluation, confirms the results of the model performance concerning the accuracy of survival time prediction in breast cancer.
artificial intelligence: methodology, systems, applications | 2018
Milko Krachunov; Maria Nisheva; Dimitar Vassilev
Genomics studies have increasingly had to deal with datasets containing high variation between the sequenced nucleotide chains. This is most common in metagenomics studies and polyploid studies, where the biological nature of the studied samples requires analysis of multiple variants of nearly identical sequences. The high variation makes it more difficult to determine the correct nucleotide sequences, as well as to distinguish signal from noise, producing digital results with higher error rates than the ones that can be achieved in samples with low variation. This paper presents an original pure machine learning-based approach for detecting and potentially correcting those errors. It uses a generic machine learning-based model that can be applied to different types of sequencing data with minor modifications. As presented in a separate part of this work, these models can be combined with data-specific error candidate selection for refined error discovery but, as shown here, they can also be used independently.
international symposium on methodologies for intelligent systems | 2017
Milko Krachunov; Peter Petrov; Maria Nisheva; Dimitar Vassilev
A system for automated prediction and inference of cross-ontology links is presented. External knowledge sources are used to create a primary body of predictions. The structure of the projected super-ontology is then used to automatically infer additional predictions. Probabilistic scores are attached to all of these predictions, allowing them to be filtered using a statistically-selected threshold. Three anatomical ontologies were mapped in pairs, and all the predicted mapping links were individually checked by a manual curator, allowing a closer look at the quality of the chosen prediction procedures, and the validity of the resulting mappings.
The first computers | 2017
Milko Krachunov; Maria Nisheva; Dimitar Vassilev
For metagenomics datasets, datasets of complex polyploid genomes, and other high-variation genomics datasets, there are difficulties with the analysis, error detection and variant calling, stemming from the challenges of discerning sequencing errors from biological variation. Confirming base candidates with high frequency of occurrence is no longer a reliable measure because of the natural variation and the presence of rare bases. The paper discusses an approach to the application of machine learning models to classify bases into erroneous and rare variations after preselecting potential error candidates with a weighted frequency measure, which aims to focus on unexpected variations by using the inter-sequence pairwise similarity. Different similarity measures are used to account for different types of datasets. Four machine learning models are implemented and tested.
BIOMATH | 2012
Peter Petrov; Milko Krachounov; Ognyan Kulev; Maria Nisheva; Dimitar Vassilev
Archive | 2018
Milko Krachunov; Milena Sokolova; Valeriya Simeonova; Maria Nisheva; Irena Avdjieva; Dimitar Vassilev
F1000Research | 2015
Milko Krachunov; Dimitar Vassilev; Maria Nisheva; Ognyan Kulev; Valeriya Simeonova; Vladimir Dimitrov