Publication


Featured research published by Hayda Almeida.


Database | 2015

mycoCLAP, the database for characterized lignocellulose-active proteins of fungal origin: resource and text mining curation support

Kimchi-Audrey Strasser; Erin McDonnell; Carol Nyaga; Min Wu; Hayda Almeida; Marie-Jean Meurs; Leila Kosseim; Justin Powlowski; Greg Butler; Adrian Tsang

Enzymes active on components of lignocellulosic biomass are used for industrial applications ranging from food processing to biofuels production. These include a diverse array of glycoside hydrolases, carbohydrate esterases, polysaccharide lyases and oxidoreductases. Fungi are prolific producers of these enzymes, spurring fungal genome sequencing efforts to identify and catalogue the genes that encode them. To facilitate the functional annotation of these genes, biochemical data on over 800 fungal lignocellulose-degrading enzymes have been collected from the literature and organized into the searchable database, mycoCLAP (http://mycoclap.fungalgenomics.ca). First implemented in 2011, and updated as described here, mycoCLAP is capable of ranking search results according to closest biochemically characterized homologues: this improves the quality of the annotation, and significantly decreases the time required to annotate novel sequences. The database is freely available to the scientific community, as are the open source applications based on natural language processing developed to support the manual curation of mycoCLAP. Database URL: http://mycoclap.fungalgenomics.ca


IEEE Transactions on Nanobioscience | 2016

Data Sampling and Supervised Learning for HIV Literature Screening

Hayda Almeida; Marie-Jean Meurs; Leila Kosseim; Adrian Tsang

This paper presents a supervised learning approach to support the screening of HIV literature. The manual screening of biomedical literature is an important task in the process of systematic reviews. Researchers and curators have the very demanding, time-consuming, and error-prone task of manually identifying documents that should be included in a systematic review concerning a specific problem. We developed a supervised learning approach to support screening tasks, by automatically flagging potentially relevant documents from a list retrieved by a literature database search. To overcome the main issues associated with the automatic literature screening task, we evaluated the use of data sampling, feature combinations, and feature selection methods, generating a total of 105 classification models. The models yielding the best results were composed of a Logistic Model Trees classifier, a fairly balanced training set, and feature combination of Bag-Of-Words and MeSH terms. According to our results, the system correctly labels the great majority of relevant documents, making it usable to support HIV systematic reviews to allow researchers to assess a greater number of documents in less time.


North American Chapter of the Association for Computational Linguistics | 2016

Automatic Triage of Mental Health Online Forum Posts: CLPsych 2016 System Description.

Hayda Almeida; Marc Queudot; Marie-Jean Meurs

This paper presents a system capable of performing automatic triage of forum posts from ReachOut.com, a mental health online forum. The system assigns to each post a tag that indicates how urgently moderator attention is needed. The evaluation is based on experiments conducted on the CLPsych 2016 task, and the system is released as open-source software.


Bioinformatics and Biomedicine | 2015

Supporting HIV literature screening with data sampling and supervised learning

Hayda Almeida; Marie-Jean Meurs; Leila Kosseim; Adrian Tsang

This paper presents a supervised learning approach to support the screening of HIV literature. The manual screening of biomedical literature is an important task in the process of systematic reviews. Researchers and curators have the very demanding, time-consuming and error-prone task of manually identifying documents that must be included in a systematic review concerning a specific problem. We implemented a supervised learning approach to support screening tasks, by automatically flagging potentially selected documents in a list retrieved by a literature database search. To overcome the main issues associated with the automatic literature screening task, we evaluated the use of data sampling, feature combinations, and feature selection methods, generating a total of 105 classification models. The models yielding the best results were composed of the Logistic Model Trees classifier, a fairly balanced training set, and a feature combination of Bag-Of-Words and MeSH terms. According to our results, the system correctly labels the great majority of relevant documents, and it could be used to support HIV systematic reviews to allow researchers to assess a greater number of documents in less time.


Computational Intelligence | 2018

An open source and modular search engine for biomedical literature retrieval

Hayda Almeida; Ludovic Jean-Louis; Marie-Jean Meurs

This work presents the bioMine system, a full‐text natural language search engine for biomedical literature. bioMine provides search capabilities based on the full‐text content of documents belonging to a database composed of scientific articles and allows users to submit their search queries using natural language. Beyond the text content of articles, the system engine also uses article metadata, empowering the search by considering extra information from picture and table captions. bioMine is publicly released as an open‐source system under the MIT license.


Canadian Conference on Artificial Intelligence | 2018

Analysis of Social Media Posts for Early Detection of Mental Health Conditions

Antoine Briand; Hayda Almeida; Marie-Jean Meurs

This paper presents a multipronged approach to predict early risk of mental health issues from user-generated content in social media. Supervised learning and information retrieval methods are used to estimate the risk of depression for a user given the content of their posts on Reddit. The approach presented here was evaluated on the CLEF eRisk 2017 pilot task. We describe the details of five systems submitted to the task, and compare their performance. The comparisons show that combining information retrieval and machine learning methods gives the best results.


International Conference on E-Technologies | 2017

Supervised Methods to Support Online Scientific Data Triage

Hayda Almeida; Marc Queudot; Leila Kosseim; Marie-Jean Meurs

This paper presents machine learning approaches based on supervised methods applied to the triage of health and biomedical data. We discuss the applications of such approaches in three different tasks, and evaluate the usage of triage pipelines, as well as data sampling and feature selection methods, to improve performance on each task. The scientific data triage systems are based on a generic, lightweight pipeline that is nevertheless flexible enough to perform triage on distinct data. The presented approaches were developed to be integrated as part of web-based systems, providing real-time feedback to health and biomedical professionals. All systems are publicly available as open source.


Canadian Conference on Artificial Intelligence | 2016

Mining Biomedical Literature: An Open Source and Modular Approach

Hayda Almeida; Ludovic Jean-Louis; Marie-Jean Meurs

This paper presents the ongoing development of a full-text natural language search engine for biomedical literature. The system aims to provide search on the full-text content of documents belonging to a database composed of scientific articles, while allowing users to submit their search queries using natural language. Beyond the text content of articles, the system engine also utilizes article metadata, empowering the search by considering extra information from picture and table captions. Accepting queries in natural language frees users from the burden of translating their search needs into a query language.


CLEF (Working Notes) | 2017

Detecting Early Risk of Depression from Social Media User-generated Content.

Hayda Almeida; Antoine Briand; Marie-Jean Meurs


F1000Research | 2017

Retrieving biomedical literature: an open source search engine based on open access resources

Hayda Almeida; Ludovic Jean-Louis; Marie-Jean Meurs

Collaboration


Dive into Hayda Almeida's collaborations.

Top Co-Authors

Ludovic Jean-Louis

École Polytechnique de Montréal

Antoine Briand

Université du Québec à Montréal

Marc Queudot

Université du Québec à Montréal
