Publication


Featured research published by Tanja Bekhuis.


Artificial Intelligence in Medicine | 2012

Screening nonrandomized studies for medical systematic reviews: A comparative study of classifiers

Tanja Bekhuis; Dina Demner-Fushman

OBJECTIVES To investigate whether (1) machine learning classifiers can help identify nonrandomized studies eligible for full-text screening by systematic reviewers; (2) classifier performance varies with optimization; and (3) the number of citations to screen can be reduced.

METHODS We used an open-source, data-mining suite to process and classify biomedical citations that point to mostly nonrandomized studies from 2 systematic reviews. We built training and test sets for citation portions and compared classifier performance by considering the value of indexing, various feature sets, and optimization. We conducted our experiments in 2 phases. The design of phase I with no optimization was: 4 classifiers × 3 feature sets × 3 citation portions. Classifiers included k-nearest neighbor, naïve Bayes, complement naïve Bayes, and evolutionary support vector machine. Feature sets included bag of words, and 2- and 3-term n-grams. Citation portions included titles, titles and abstracts, and full citations with metadata. Phase II with optimization involved a subset of the classifiers, as well as features extracted from full citations, and full citations with overweighted titles. We optimized features and classifier parameters by manually setting information gain thresholds outside of a process for iterative grid optimization with 10-fold cross-validations. We independently tested models on data reserved for that purpose and statistically compared classifier performance on 2 types of feature sets. We estimated the number of citations needed to screen by reviewers during a second pass through a reduced set of citations.

RESULTS In phase I, the evolutionary support vector machine returned the best recall for bag of words extracted from full citations; the best classifier with respect to overall performance was k-nearest neighbor. No classifier attained good enough recall for this task without optimization. In phase II, we boosted performance with optimization for evolutionary support vector machine and complement naïve Bayes classifiers. Generalization performance was better for the latter in the independent tests. For evolutionary support vector machine and complement naïve Bayes classifiers, the initial retrieval set was reduced by 46% and 35%, respectively.

CONCLUSIONS Machine learning classifiers can help identify nonrandomized studies eligible for full-text screening by systematic reviewers. Optimization can markedly improve performance of classifiers. However, generalizability varies with the classifier. The number of citations to screen during a second independent pass through the citations can be substantially reduced.
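
As a rough illustration of the phase I factorial design described in this abstract, the sketch below enumerates the 4 × 3 × 3 combinations and shows what a 46% or 35% reduction would mean for a retrieval set. The function names and the 1,000-citation retrieval set are hypothetical; the abstract does not report code or this set size, and this is not the authors' data-mining workflow.

```python
from itertools import product

# Factor levels as named in the abstract for phase I (no optimization).
CLASSIFIERS = [
    "k-nearest neighbor",
    "naive Bayes",
    "complement naive Bayes",
    "evolutionary support vector machine",
]
FEATURE_SETS = ["bag of words", "2-term n-grams", "3-term n-grams"]
CITATION_PORTIONS = [
    "titles",
    "titles and abstracts",
    "full citations with metadata",
]


def phase_one_design():
    """Return every (classifier, feature set, citation portion) combination."""
    return list(product(CLASSIFIERS, FEATURE_SETS, CITATION_PORTIONS))


def remaining_after_reduction(n_citations, reduction):
    """Citations left to screen after a fractional reduction (hypothetical helper)."""
    return round(n_citations * (1.0 - reduction))


if __name__ == "__main__":
    design = phase_one_design()
    print(len(design))  # 4 x 3 x 3 experimental cells
    # Hypothetical retrieval set of 1,000 citations (an assumption, not study data).
    print(remaining_after_reduction(1000, 0.46))
    print(remaining_after_reduction(1000, 0.35))
```

Each tuple in the design corresponds to one model that would be trained and tested in phase I, which is why an unoptimized comparison already requires 36 runs.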


PLOS ONE | 2014

Feature Engineering and a Proposed Decision-Support System for Systematic Reviewers of Medical Evidence

Tanja Bekhuis; Eugene Tseytlin; Kevin J. Mitchell; Dina Demner-Fushman

Objectives Evidence-based medicine depends on the timely synthesis of research findings. An important source of synthesized evidence resides in systematic reviews. However, a bottleneck in review production involves dual screening of citations with titles and abstracts to find eligible studies. For this research, we tested the effect of various kinds of textual information (features) on performance of a machine learning classifier. Based on our findings, we propose an automated system to reduce screening burden, as well as offer quality assurance.

Methods We built a database of citations from 5 systematic reviews that varied with respect to domain, topic, and sponsor. Consensus judgments regarding eligibility were inferred from published reports. We extracted 5 feature sets from citations: alphabetic, alphanumeric+, indexing, features mapped to concepts in systematic reviews, and topic models. To simulate a two-person team, we divided the data into random halves. We optimized the parameters of a Bayesian classifier, then trained and tested models on alternate data halves. Overall, we conducted 50 independent tests.

Results All tests of summary performance (mean F3) surpassed the corresponding baseline, P<0.0001. The ranks for mean F3, precision, and classification error were statistically different across feature sets averaged over reviews; P-values for Friedman's test were .045, .002, and .002, respectively. Differences in ranks for mean recall were not statistically significant. Alphanumeric+ features were associated with best performance; mean reduction in screening burden for this feature type ranged from 88% to 98% for the second pass through citations and from 38% to 48% overall.

Conclusions A computer-assisted, decision support system based on our methods could substantially reduce the burden of screening citations for systematic review teams and solo reviewers. Additionally, such a system could deliver quality assurance both by confirming concordant decisions and by naming studies associated with discordant decisions for further consideration.
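
The summary measure in this abstract, F3, is the F-beta score with beta set to 3, which weights recall more heavily than precision. A minimal sketch of the standard formula follows; the function name and the example precision and recall values are illustrative assumptions, not figures from the study.

```python
def f_beta(precision, recall, beta=3.0):
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily than precision; beta = 3 gives F3.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0  # both inputs are zero, so the score is defined as zero
    return (1 + b2) * precision * recall / denom


if __name__ == "__main__":
    # Illustrative values only: a high-recall, low-precision screening model.
    p, r = 0.5, 0.9
    print(f_beta(p, r, beta=1.0))  # ordinary F1
    print(f_beta(p, r, beta=3.0))  # F3 sits closer to recall
```

For citation screening, where missing an eligible study is costlier than reading an ineligible one, a recall-weighted score of this kind is a natural choice.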


Journal of Evidence Based Dental Practice | 2009

Music Therapy May Reduce Pain and Anxiety in Children Undergoing Medical and Dental Procedures

Tanja Bekhuis

Article Title and Bibliographic Information: Music for pain and anxiety in children undergoing medical procedures: a systematic review of randomized controlled trials. Klassen JA, Liang Y, Tjosvold L, Klassen TP, Hartling L. Ambulatory Pediatrics 2008;8(2):117-28.
Reviewer: Tanja Bekhuis, PhD, MS, MLIS
Purpose/Question: To evaluate the effectiveness of music therapy for reducing pain and anxiety in children undergoing clinical procedures.
Source of Funding: NLM/NIDCR Pittsburgh Biomedical Informatics Training Program 5 T15 LM/DE07059-22
Type of Study/Design: Systematic review
Level of Evidence: Level 2: Limited-quality, patient-oriented evidence
Strength of Recommendation Grade: Grade B: Inconsistent or limited-quality patient-oriented evidence


Journal of Medical Internet Research | 2011

Using natural language processing to enable in-depth analysis of clinical messages posted to an Internet mailing list: a feasibility study.

Tanja Bekhuis; Marcos Kreinacke; Heiko Spallek; Mei Song; Jean A. O'Donnell

Background An Internet mailing list may be characterized as a virtual community of practice that serves as an information hub with easy access to expert advice and opportunities for social networking. We are interested in mining messages posted to a list for dental practitioners to identify clinical topics. Once we understand the topical domain, we can study dentists’ real information needs and the nature of their shared expertise, and can avoid delivering useless content at the point of care in future informatics applications. However, a necessary first step involves developing procedures to identify messages that are worth studying given our resources for planned, labor-intensive research.

Objectives The primary objective of this study was to develop a workflow for finding a manageable number of clinically relevant messages from a much larger corpus of messages posted to an Internet mailing list, and to demonstrate the potential usefulness of our procedures for investigators by retrieving a set of messages tailored to the research question of a qualitative research team.

Methods We mined 14,576 messages posted to an Internet mailing list from April 2008 to May 2009. The list has about 450 subscribers, mostly dentists from North America interested in clinical practice. After extensive preprocessing, we used the Natural Language Toolkit to identify clinical phrases and keywords in the messages. Two academic dentists classified collocated phrases in an iterative, consensus-based process to describe the topics discussed by dental practitioners who subscribe to the list. We then consulted with qualitative researchers regarding their research question to develop a plan for targeted retrieval. We used selected phrases and keywords as search strings to identify clinically relevant messages and delivered the messages in a reusable database.

Results About half of the subscribers (245/450, 54.4%) posted messages. Natural language processing (NLP) yielded 279,193 clinically relevant tokens or processed words (19% of all tokens). Of these, 2.02% (5634 unique tokens) represent the vocabulary for dental practitioners. Based on pointwise mutual information score and clinical relevance, 325 collocated phrases (eg, fistula filled obturation and herpes zoster) with 108 keywords (eg, mercury) were classified into 13 broad categories with subcategories. In the demonstration, we identified 305 relevant messages (2.1% of all messages) over 10 selected categories with instances of collocated phrases, and 299 messages (2.1%) with instances of phrases or keywords for the category systemic disease.

Conclusions A workflow with a sequence of machine-based steps and human classification of NLP-discovered phrases can support researchers who need to identify relevant messages in a much larger corpus. Discovered phrases and keywords are useful search strings to aid targeted retrieval. We demonstrate the potential value of our procedures for qualitative researchers by retrieving a manageable set of messages concerning systemic and oral disease.
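
The abstract ranks collocated phrases by pointwise mutual information (PMI). The sketch below scores adjacent word pairs with the textbook PMI formula using only the Python standard library. It is an assumption-laden illustration: the study used the Natural Language Toolkit after extensive preprocessing, and the function name and toy tokens here are hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Score each adjacent word pair by pointwise mutual information (base 2).

    PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ), with probabilities estimated
    from raw unigram and bigram frequencies in the token list.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # avoid division by zero for tiny inputs
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


if __name__ == "__main__":
    # Toy message text, invented for illustration only.
    toy = "patient with herpes zoster pain and herpes zoster rash with pain".split()
    ranked = sorted(bigram_pmi(toy).items(), key=lambda kv: kv[1], reverse=True)
    for pair, score in ranked[:3]:
        print(pair, round(score, 2))
```

Raw PMI tends to favor rare pairs, so in practice a minimum frequency filter is usually applied before ranking, and in the study human classification by clinical relevance followed the scoring step.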


BMC Oral Health | 2013

Are dentists interested in the oral-systemic disease connection? A qualitative study of an online community of 450 practitioners

Mei Song; Jean A. O’Donnell; Tanja Bekhuis; Heiko Spallek

Background Dentists in the US see an increasing number of patients with systemic conditions. These patients are challenging to care for when the relationship between oral and systemic disease is not well understood. The prevalence of professional isolation exacerbates the problem due to the difficulty in finding expert advice or peer support. This study aims to identify whether dentists discuss the oral-systemic connection and what aspects they discuss; to understand their perceptions of and attitudes toward the connection; and to determine what information they need to treat patients with systemic conditions.

Methods We retrieved 14,576 messages posted to the Internet Dental Forum from April 2008 to May 2009. Using natural language processing and human classification, we identified substantive phrases and keywords and used them to retrieve 141 messages on the oral-systemic connection. We then conducted coding and thematic analysis to identify recurring themes on the topic.

Results Dentists discuss a variety of topics on oral diseases and systemic health, with the association between periodontal and systemic diseases, the effect of dental materials or procedures on general health, and the impact of oral-systemic connection on practice behaviors as the leading topics. They also disseminate and share research findings on oral and systemic health with colleagues online. However, dentists are very cautious about the nature of the oral-systemic linkage that may not be causal. Nonetheless, they embrace the positive association as a motivating point for patients in practice. When treating patients with systemic conditions, dentists enquire about the cause of less common dental diseases potentially in relation to medical conditions in one-third of the cases and in half of the cases seek clinical guidelines and evidence-based interventions on treating dental diseases with established association with systemic conditions.

Conclusions Dentists’ unmet information needs call for more research into the association between less studied dental conditions and systemic diseases, and more actionable clinical guidelines for well-researched disease connections. To improve dissemination and foster behavioral change, it is imperative to understand what information clinicians need and in which situations. Leveraging peer influence via social media could be a useful strategy to achieve the goal.


Journal of The Medical Library Association | 2014

Mentors and tormentors on the road to informatics.

Tanja Bekhuis

My career path to biomedical informatics, late at every stage, is peculiar if I compare myself to other psychologists. On the other hand, a peculiar path is the norm for my peers who are loath to call themselves “informaticians,” so tightly coupled is their identity to their former lives as librarians, clinicians, engineers, or computer and social scientists. In thinking about how I came to be on the faculty of a medical school doing informatics research, it occurred to me that my mentors had a powerful influence and not all in the same way. Alas, a few tormentors almost derailed me. I dithered over that last descriptor because I could have more charitably called a tormentor an obstructor, but consider this: The origin of the former word is from the Old French tormentum, an engine for hurling stones, which seems apt [1], and one of several synonyms is backscratcher, the irony of which makes me laugh. So for those of you wrestling with a midlife career change, consider this editorial a call to courage, as well as an abridged guide to mentors and tormentors. First, I give you a brief backstory.


Journal of Evidence Based Dental Practice | 2011

Chlorhexidine Varnish may Prevent Dental Caries in Children and Adolescents

Tanja Bekhuis

Article Title and Bibliographic Information: The caries-preventive effect of chlorhexidine varnish in children and adolescents: a systematic review. James P, Parnell C, Whelton H. Caries Res 2010;44:333-40.
Reviewer: Tanja Bekhuis, PhD, MS, MLIS
Purpose/Question: To evaluate the effectiveness of chlorhexidine varnish for preventing dental caries in primary and permanent teeth of children and adolescents when compared to placebo or no treatment, as well as to fluoride varnish.
Source of Funding: Health Research Board, Ireland (Grant No. S/A013)
Type of Study/Design: Systematic review
Level of Evidence: Level 2: Limited-quality, patient-oriented evidence
Strength of Recommendation Grade: Grade B: Inconsistent or limited-quality patient-oriented evidence


Journal of Biomedical Informatics | 2017

Automated annotation and classification of BI-RADS assessment from radiology reports

Sergio M. Castro; Eugene Tseytlin; Olga Medvedeva; Kevin J. Mitchell; Shyam Visweswaran; Tanja Bekhuis; Rebecca S. Jacobson

The Breast Imaging Reporting and Data System (BI-RADS) was developed to reduce variation in the descriptions of findings. Manual analysis of breast radiology report data is challenging but is necessary for clinical and healthcare quality assurance activities. The objective of this study is to develop a natural language processing (NLP) system for automated extraction of BI-RADS categories from breast radiology reports. We evaluated an existing rule-based NLP algorithm, and then we developed and evaluated our own method using a supervised machine learning approach. We divided the BI-RADS category extraction task into two specific tasks: (1) annotation of all BI-RADS category values within a report, and (2) classification of the laterality of each BI-RADS category value. We used one algorithm for task 1 and evaluated three algorithms for task 2. Across all evaluations and model training, we used a total of 2159 radiology reports from 18 hospitals, from 2003 to 2015. Performance with the existing rule-based algorithm was not satisfactory. Conditional random fields showed high performance for task 1, with an F-1 measure of 0.95. Rules from the partial decision trees (PART) algorithm showed the best performance across classes for task 2, with a weighted F-1 measure of 0.91 for BI-RADS 0-6 and 0.93 for BI-RADS 3-5. Classification performance by class showed that performance improved for all classes from Naïve Bayes to Support Vector Machine (SVM), and also from SVM to PART. Our system is able to annotate and classify all BI-RADS mentions present in a single radiology report and can serve as the foundation for future studies that will leverage automated BI-RADS annotation to provide feedback to radiologists as part of a learning health system loop.
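
The task 2 results above are reported as a weighted F-1 measure. A minimal sketch of the usual support-weighted definition follows, assuming each class contributes its F-1 score in proportion to its number of true instances. The function names and the per-class numbers are illustrative assumptions; the abstract does not give per-class figures or state exactly how the weighting was computed.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; zero when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_f1(per_class):
    """Support-weighted average of per-class F-1 scores.

    per_class: iterable of (precision, recall, support) tuples, where support
    is the number of true instances of that class.
    """
    rows = list(per_class)
    total = sum(support for _, _, support in rows)
    if total == 0:
        raise ValueError("total support must be positive")
    return sum(f1(p, r) * support for p, r, support in rows) / total


if __name__ == "__main__":
    # Hypothetical laterality classes with made-up scores and supports.
    classes = [
        (0.95, 0.93, 120),  # e.g. left
        (0.94, 0.96, 130),  # e.g. right
        (0.80, 0.75, 30),   # e.g. bilateral or unspecified
    ]
    print(round(weighted_f1(classes), 3))
```

Weighting by support keeps a rare class from dominating the summary, but it can also hide weak performance on that class, which is why the abstract's by-class comparison is informative alongside the weighted figure.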


Journal of The Medical Library Association | 2015

HTA Database Canadian Repository.

Ashleigh Faith; Tanja Bekhuis

Librarians, informationists, researchers, health policymakers, and health care professionals retrieve reports of health technology assessments (HTAs) to review the evidence on medical, socioeconomic, and ethical implications of health care and health care investments [1]. These assessments are a useful type of gray literature that include cost-benefit analyses of health interventions, syntheses of intervention research findings, and discussion of the economic implications of health care and medical technology [2]. The HTA Database Canadian Repository is a new, valuable tool that aggregates these assessments and makes them findable via a convenient search interface. This international repository includes records for in-progress and published reports of HTAs, reviews, guidelines, protocols, journals, and articles. Launched in January of 2015, the database includes records beginning in 2003. The HTA Database is funded by the United Kingdom's National Institute for Health Research (NIHR) and administered by the University of York Centre for Reviews and Dissemination. The Canadian repository aggregates HTA records from Ontario, Quebec, Alberta, and pan-Canadian agencies (CADTH) in addition to records from the International Network of Agencies for Health Technology Assessment (INAHTA). The database includes reports from 93 INAHTA and other international repositories [3]. All records and documents are translated into English, French, and/or Spanish, representing about 20 source languages. As of May 2015, the Canadian repository contains over 14,000 unique records; approximately 2,400 are Canadian.

Canadian search interface

The Canadian interface for the HTA Database is free for noncommercial use [4]. The basic search can include title, full text, province, author, agency, and funder. Additional options allow the user to limit searches to Canadian databases, Canadian and international HTA databases, the Database of Abstracts of Reviews of Effects (DARE), Cochrane reviews, and the National Health Service Economic Evaluation Database (NHS EED) of assessed economic evaluations. While items in the DARE and NHS EED databases are critically appraised (in other words, reviewed for trustworthiness, value, and contextual relevance), items in the HTA database are not. However, if a critical appraisal exists, it is linked with the HTA database record [1]. Other search criteria include record date, publication year, and Medical Subject Headings (MeSH) term search. While search text can be truncated and wildcards are accepted, special characters, such as the German umlaut, are not. Search results include year (although it is unclear if this indicates publication or record year), source database, source agency, and title. Although additional metadata exist for each record, search queries are limited to the basic metadata only. The Boolean operators AND, OR, and NOT are available, as well as the proximity operators NEAR and ADJ. Using proximity operators and restricting searches to specific fields (e.g., author [:au], title [:ti], language [:lp], source journal [:so], and funding [xfu]) are particularly helpful when performing advanced, full-text, and keyword searches.

The MeSH terminology search is also helpful, if not ideal. MeSH terms can be searched by string (permut), stemming, or tree criteria, but combinations of terms cannot be selected for post-coordinated searching by index concept. Terms selected for searching are not visible in the interface for review or modification; using the interface can be frustrating when executing a high volume of advanced or expert searches. Also, terms cannot be exploded, and MeSH terms are not accompanied by MeSH tree numbers. This is significant for content and keyword analysis, because the tree number indicates the context of the MeSH term. Overall, the MeSH terminology search will not hinder basic or advanced terminology searches but may hinder expert searches. Records can be exported in hypertext markup language (HTML) format; however, full-text reports, when available, must be downloaded one at a time.

Metadata

Each record notes the title, agency location, publisher, year of publication, publication type, agency-assigned MeSH terms, HTA accession number, language, source database, and uniform resource locator (URL), along with other metadata. Some metadata are unique to the database from which they derive. For example, metadata vary in records from DARE, NHS EED, or HTA. Conformity of metadata is primarily applied when agencies upload records and input their metadata through the HTA interface.

Considerations and advantages

For record inclusion, the HTA Database has an “extremely flexible definition of what constitutes a health technology assessment,” which, although potentially beneficial to a variety of users, does not provide a reliable description of publications included in the database [3]. As a result, the database also contains records such as protocols for systematic reviews and care guidelines. Also, documentation is incomplete for metadata fields and database coverage, and there are no descriptions of governance policies. For example, in reviewing a sample of records, we found that not all metadata fields include the same type of information; however, this was a minor occurrence. As a final note, records appear to be added weekly, but since agencies add content on a voluntary basis, frequency of updates varies.

The HTA Database Canadian repository and its interface will be helpful to librarians who support HTA investigators and policymakers, as well as patrons interested in assessments of health and health technologies. This new resource facilitates information retrieval and research and may reduce costly duplication of effort by enabling queries of both Canadian and international databases via one search interface. Additionally, the search interface enables discovery of gray literature that is otherwise unavailable or subject to purchase. Although its interface does not support expert searching, the superior aggregation of content makes the HTA Database Canadian Repository a valuable resource.


Bioinformatics and Biomedicine | 2015

A prototype for a hybrid system to support systematic review teams: A case study of organ transplantation

Tanja Bekhuis; Eugene Tseytlin; Kevin J. Mitchell

We describe a prototype for a hybrid system designed to reduce the number of citations needed to re-screen (NNRS) by systematic reviewers, where citations include titles, abstracts, and metadata. The system obviates the need for screening the entire set of citations a second time, which is typically done to control human error. The reference set is based on a complex review about organ transplantation (N=10,796 citations). Data were split into 50% training and test sets, randomly stratified for percentage eligible citations. The system consists of a rule-based module and a machine-learning (ML) module. The former substantially reduces the number of negative citations passed to the ML module and improves imbalance. Relative to the baseline, the system reduces classification error (5.6% vs 2.9%) thereby reducing NNRS by 47.3% (300 vs 158). We discuss the implications of de-emphasizing sensitivity (recall) in favor of specificity and negative predictive value to reduce screening burden.
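
The reported 47.3% reduction in NNRS follows directly from the two counts in the abstract (300 vs 158). The sketch below reproduces that arithmetic and adds the standard definitions of specificity and negative predictive value that the abstract discusses. The function names and the confusion-matrix counts in the usage example are hypothetical and are not taken from the study.

```python
def relative_reduction(baseline, system):
    """Fractional reduction of `system` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - system) / baseline


def specificity(tn, fp):
    """True negative rate: share of ineligible citations correctly excluded."""
    return tn / (tn + fp)


def negative_predictive_value(tn, fn):
    """Share of predicted-ineligible citations that are truly ineligible."""
    return tn / (tn + fn)


if __name__ == "__main__":
    # NNRS counts reported in the abstract: baseline 300, hybrid system 158.
    print(round(100 * relative_reduction(300, 158), 1))  # 47.3

    # Hypothetical confusion-matrix counts, for illustration only.
    tn, fp, fn = 900, 50, 10
    print(round(specificity(tn, fp), 3))
    print(round(negative_predictive_value(tn, fn), 3))
```

A high negative predictive value is what would let reviewers skip re-screening citations the system marks as ineligible, which is the sense in which the abstract argues for emphasizing it over recall.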

Collaboration


Top co-authors of Tanja Bekhuis.

Heiko Spallek, University of Pittsburgh

Dina Demner-Fushman, National Institutes of Health

Mei Song, University of Pittsburgh

Ashleigh Faith, University of Pittsburgh

Abby J. Brodie, Nova Southeastern University