Publication


Featured research published by Jacob Carter.


Database | 2016

BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID

Sun Kim; Rezarta Islamaj Doğan; Andrew Chatr-aryamontri; Christie S. Chang; Rose Oughtred; Jennifer M. Rust; Riza Theresa Batista-Navarro; Jacob Carter; Sophia Ananiadou; Sérgio Matos; André Santos; David Campos; José Luís Oliveira; Onkar Singh; Jitendra Jonnagaddala; Hong-Jie Dai; Emily Chia Yu Su; Yung Chun Chang; Yu-Chen Su; Chun-Han Chu; Chien Chin Chen; Wen-Lian Hsu; Yifan Peng; Cecilia N. Arighi; Cathy H. Wu; K. Vijay-Shanker; Ferhat Aydın; Zehra Melce Hüsünbeyi; Arzucan Özgür; Soo-Yong Shin

BioC is a simple XML format for text, annotations and relations, and was developed to achieve interoperability for biomedical text processing. Following the success of BioC in BioCreative IV, the BioCreative V BioC track addressed a collaborative task to build an assistant system for BioGRID curation. In this paper, we describe the framework of the collaborative BioC task and discuss our findings based on the user survey. This track consisted of eight subtasks including gene/protein/organism named entity recognition, protein–protein/genetic interaction passage identification and annotation visualization. Using BioC as their data-sharing and communication medium, nine teams, world-wide, participated and contributed either new methods or improvements of existing tools to address different subtasks of the BioC track. Results from different teams were shared in BioC and made available to other teams as they addressed different subtasks of the track. In the end, all submitted runs were merged using a machine learning classifier to produce an optimized output. The biocurator assistant system was evaluated by four BioGRID curators in terms of practical usability. The curators’ feedback was overall positive and highlighted the user-friendly design and the convenient gene/protein curation tool based on text mining. Database URL: http://www.biocreative.org/tasks/biocreative-v/track-1-bioc/


Database | 2014

Text Mining-assisted Biocuration Workflows in Argo

Rafal Rak; Riza Theresa Batista-Navarro; Andrew Rowley; Jacob Carter; Sophia Ananiadou

Biocuration activities have been broadly categorized into the selection of relevant documents, the annotation of biological concepts of interest and identification of interactions between the concepts. Text mining has been shown to have a potential to significantly reduce the effort of biocurators in all the three activities, and various semi-automatic methodologies have been integrated into curation pipelines to support them. We investigate the suitability of Argo, a workbench for building text-mining solutions with the use of a rich graphical user interface, for the process of biocuration. Central to Argo are customizable workflows that users compose by arranging available elementary analytics to form task-specific processing units. A built-in manual annotation editor is the single most used biocuration tool of the workbench, as it allows users to create annotations directly in text, as well as modify or delete annotations created by automatic processing components. Apart from syntactic and semantic analytics, the ever-growing library of components includes several data readers and consumers that support well-established as well as emerging data interchange formats such as XMI, RDF and BioC, which facilitate the interoperability of Argo with other platforms or resources. To validate the suitability of Argo for curation activities, we participated in the BioCreative IV challenge whose purpose was to evaluate Web-based systems addressing user-defined biocuration tasks. Argo proved to have the edge over other systems in terms of flexibility of defining biocuration tasks. As expected, the versatility of the workbench inevitably lengthened the time the curators spent on learning the system before taking on the task, which may have affected the usability of Argo. The participation in the challenge gave us an opportunity to gather valuable feedback and identify areas of improvement, some of which have already been introduced. Database URL: http://argo.nactem.ac.uk


PLOS ONE | 2016

Text Mining the History of Medicine.

Paul Thompson; Riza Theresa Batista-Navarro; Georgios Kontonatsios; Jacob Carter; Elizabeth Toon; John McNaught; Carsten Timmermann; Michael Worboys; Sophia Ananiadou

Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, due to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid-19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform.


Database | 2014

Processing biological literature with customizable Web services supporting interoperable formats

Rafal Rak; Riza Theresa Batista-Navarro; Jacob Carter; Andrew Rowley; Sophia Ananiadou

Web services have become a popular means of interconnecting solutions for processing a body of scientific literature. This has fuelled research on high-level data exchange formats suitable for a given domain and ensuring the interoperability of Web services. In this article, we focus on the biological domain and consider four interoperability formats, BioC, BioNLP, XMI and RDF, that represent domain-specific and generic representations and include well-established as well as emerging specifications. We use the formats in the context of customizable Web services created in our Web-based, text-mining workbench Argo that features an ever-growing library of elementary analytics and capabilities to build and deploy Web services straight from a convenient graphical user interface. We demonstrate a 2-fold customization of Web services: by building task-specific processing pipelines from a repository of available analytics, and by configuring services to accept and produce a combination of input and output data interchange formats. We provide qualitative evaluation of the formats as well as quantitative evaluation of automatic analytics. The latter was carried out as part of our participation in the fourth edition of the BioCreative challenge. Our analytics built into Web services for recognizing biochemical concepts in BioC collections achieved the highest combined scores out of 10 participating teams. Database URL: http://argo.nactem.ac.uk.


Wellcome Open Research | 2016

SciLite: a platform for displaying text-mined annotations as a means to link research articles with biological data

Aravind Venkatesan; Jee-Hyub Kim; Francesco Talo; Michele Ide-Smith; Julien Gobeill; Jacob Carter; Riza Theresa Batista-Navarro; Sophia Ananiadou; Patrick Ruch; Johanna McEntyre

The tremendous growth in biological data has resulted in an increase in the number of research papers being published. This presents a great challenge for scientists in searching and assimilating facts described in those papers. In particular, biological databases depend on curators to add highly precise and useful information that is usually extracted by reading research articles. Therefore, there is an urgent need to find ways to improve linking literature to the underlying data, thereby minimising the effort in browsing content and identifying key biological concepts. As part of the development of Europe PMC, we have developed a new platform, SciLite, which integrates text-mined annotations from different sources and overlays those outputs on research articles. The aim is to aid researchers and curators using Europe PMC in finding key concepts more easily and provide links to related resources or tools, bridging the gap between literature and biological data.


Digital Heritage 2015 | 2015

Semantically enhanced search system for historical medical archives

Paul M. Thompson; Jacob Carter; John McNaught; Sophia Ananiadou

Large-scale efforts to digitise historical documents are making it increasingly easy for researchers of history to carry out searches over vast amounts of historical data from their computers. Although the constant growth in the volume of digitised historical text is enriching the body of knowledge that scholars of history have at their fingertips, it can often be difficult to explore such data collections efficiently without becoming overwhelmed. Standard keyword-based search systems treat documents as collections of unrelated words, and do not take into account their structure and meaning. Accordingly, keyword searches will often return many irrelevant documents. Equally, shifts in terminology usage over time can make it difficult to formulate queries that will retrieve all relevant documents from long-spanning historical archives. In this paper, we describe a new semantically oriented system for searching archives of historical medical documents covering wide time spans. By applying text mining techniques to the archives, the system allows for efficient searching, firstly by automatically suggesting ways to expand queries with (possibly time-sensitive) related terms, and secondly by allowing search results to be refined/explored using medically and historically relevant semantic information.


Database | 2016

Argo: enabling the development of bespoke workflows and services for disease annotation

Riza Theresa Batista-Navarro; Jacob Carter; Sophia Ananiadou


Language Resources and Evaluation | 2014

Interoperability and Customisation of Annotation Schemata in Argo

Rafal Rak; Jacob Carter; Andrew Rowley; Riza Theresa Batista-Navarro; Sophia Ananiadou


Meeting of the Association for Computational Linguistics | 2013

Development and Analysis of NLP Pipelines in Argo

Rafal Rak; Andrew Rowley; Jacob Carter; Sophia Ananiadou


Proceedings of the Fourth BioCreative Challenge Evaluation Workshop, pp. 270-278 | 2013

Customisable Curation Workflows in Argo

Rafal Rak; Riza Theresa Batista-Navarro; Andrew Rowley; Jacob Carter; Sophia Ananiadou

Collaboration


Dive into Jacob Carter's collaborations.

Top Co-Authors

Andrew Rowley

University of Manchester

Rafal Rak

University of Manchester

John McNaught

University of Manchester

Elizabeth Toon

University of Manchester

Paul M. Thompson

University of Southern California
