Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Craig Harman is active.

Publication


Featured research published by Craig Harman.


Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality | 2014

Quantifying Mental Health Signals in Twitter

Glen Coppersmith; Mark Dredze; Craig Harman

The ubiquity of social media provides a rich opportunity to enhance the data available to mental health clinicians and researchers, enabling a better-informed and better-equipped mental health field. We present an analysis of mental health phenomena in publicly available Twitter data, demonstrating how rigorous application of simple natural language processing methods can yield insight into specific disorders as well as mental health writ large, along with evidence that as-yet undiscovered linguistic signals relevant to mental health exist in social media. We present a novel method for gathering data for a range of mental illnesses quickly and cheaply, then focus on analysis of four in particular: post-traumatic stress disorder (PTSD), depression, bipolar disorder, and seasonal affective disorder (SAD). We intend for these proof-of-concept results to inform the necessary ethical discussion regarding the balance between the utility of such data and the privacy of mental health related information.


North American Chapter of the Association for Computational Linguistics | 2015

From ADHD to SAD: Analyzing the Language of Mental Health on Twitter through Self-Reported Diagnoses

Glen Coppersmith; Mark Dredze; Craig Harman; Kristy Hollingshead

Many significant challenges exist for the mental health field, but one in particular is a lack of data available to guide research. Language provides a natural lens for studying mental health: much existing work and therapy have strong linguistic components, so the creation of a large, varied, language-centric dataset could provide significant grist for the field of mental health research. We examine a broad range of mental health conditions in Twitter data by identifying self-reported statements of diagnosis. We systematically explore language differences between ten conditions with respect to the general population, and to each other. Our aim is to provide guidance and a roadmap for where deeper exploration is likely to be fruitful.


North American Chapter of the Association for Computational Linguistics | 2015

CLPsych 2015 Shared Task: Depression and PTSD on Twitter

Glen Coppersmith; Mark Dredze; Craig Harman; Kristy Hollingshead; Margaret Mitchell

This paper presents a summary of the Computational Linguistics and Clinical Psychology (CLPsych) 2015 shared and unshared tasks. These tasks aimed to provide apples-to-apples comparisons of various approaches to modeling language relevant to mental health from social media. The data used for these tasks come from Twitter users who state a diagnosis of depression or post-traumatic stress disorder (PTSD) and demographically matched community controls. The unshared task was a hackathon held at Johns Hopkins University in November 2014 to explore the data, and the shared task was conducted remotely, with each participating team submitting scores for a held-back test set of users. The shared task consisted of three binary classification experiments: (1) depression versus control, (2) PTSD versus control, and (3) depression versus PTSD. Classifiers were compared primarily via their average precision, though a number of other metrics were reported alongside it to allow a more nuanced interpretation of performance.
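The primary metric named above, average precision, rewards classifiers that rank true positives near the top of a scored list. A minimal sketch of the computation (the labels and scores below are invented illustrations, not shared-task data):

```python
# Average precision for one binary experiment (e.g., depression vs.
# control): rank users by classifier score, then average the precision
# at each rank where a true positive is retrieved.

def average_precision(labels, scores):
    """labels: 1 = condition, 0 = control; scores: classifier confidence."""
    ranked = [lab for _, lab in sorted(zip(scores, labels), reverse=True)]
    hits, precisions = 0, []
    for rank, lab in enumerate(ranked, start=1):
        if lab == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1]
print(round(average_precision(labels, scores), 3))  # 0.747
```

A single number like this summarizes the whole ranking, which is why the task organizers supplemented it with other metrics for a more nuanced view.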


Meeting of the Association for Computational Linguistics | 2014

I’m a Belieber: Social Roles via Self-identification and Conceptual Attributes

Charley Beller; Rebecca Knowles; Craig Harman; Shane Bergsma; Margaret Mitchell; Benjamin Van Durme

Motivated by work predicting coarse-grained author categories in social media, such as gender or political preference, we explore whether Twitter contains information to support the prediction of fine-grained categories, or social roles. We find that the simple self-identification pattern “I am a ___” supports significantly richer classification than previously explored, successfully retrieving a variety of fine-grained roles. For a given role (e.g., writer), we can further identify characteristic attributes using a simple possessive construction (e.g., writer’s ___). Tweets that incorporate the attribute terms in first-person possessives (my ___) are confirmed to be an indicator that the author holds the associated social role.
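The self-identification pattern described above can be sketched as a simple regular-expression match over tweet text; the regex and example tweets here are illustrative assumptions, not the paper's actual implementation:

```python
import re

# Self-identification pattern: first-person statements like
# "I am a writer" / "I'm an engineer". This regex is an illustrative
# approximation of the "I am a ___" pattern, not the paper's code.
SELF_ID = re.compile(r"\b(?:I am|I'm) an? (\w+)", re.IGNORECASE)

def extract_role(tweet):
    """Return the self-identified role word in a tweet, or None."""
    m = SELF_ID.search(tweet)
    return m.group(1).lower() if m else None

for t in ["I'm a belieber and proud of it",
          "I am an engineer by day",
          "Nice weather today"]:
    print(extract_role(t))  # belieber, engineer, None
```

In practice such a high-precision pattern yields seed users for each role, whose tweets can then train richer classifiers.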


Computational Social Science | 2014

Predicting Fine-grained Social Roles with Selectional Preferences

Charley Beller; Craig Harman; Benjamin Van Durme

Selectional preferences, the tendencies of predicates to select for certain semantic classes of arguments, have been successfully applied to a number of tasks in computational linguistics, including word sense disambiguation, semantic role labeling, relation extraction, and textual inference. Here we apply the information encoded in selectional preferences to the task of predicting fine-grained categories of authors on the social media platform Twitter. First-person uses of verbs that select for a given social role as subject (e.g., I teach ... for teacher) are used to quickly build up binary classifiers for that role.


Conference on Human Information Interaction and Retrieval | 2016

Vapor Engine: Demonstrating an Early Prototype of a Language-Independent Search Engine for Speech

Douglas W. Oard; Rashmi Sankepally; Jerome White; Craig Harman

Typical search engines for spoken content begin with some form of language-specific audio processing such as phonetic word recognition. Many languages, however, lack the language-tuned preprocessing tools that are needed to create indexing terms for speech. One approach in such cases is to rely on repetition, detected using acoustic features, to find terms that might be worth indexing. Experiments have shown that this approach yields term sets that might be sufficient for some applications in both spoken term detection and ranked retrieval experiments. Such approaches currently work only with spoken queries, however, and only when the searcher is able to speak in a manner similar to that of the speakers in the collection. This demonstration paper proposes Vapor Engine, a new tool for selectively transcribing repeated terms that can be automatically detected from spoken content in any language. These transcribed terms could then be matched to queries formulated using written terms. Vapor Engine is early in development: it currently supports only single-term queries and has not yet been formally evaluated. This paper introduces the interface and summarizes the challenges it seeks to address.


North American Chapter of the Association for Computational Linguistics | 2015

A Concrete Chinese NLP Pipeline

Nanyun Peng; Francis Ferraro; Mo Yu; Nicholas Andrews; Jay DeYoung; Max Thomas; Matthew R. Gormley; Travis Wolfe; Craig Harman; Benjamin Van Durme; Mark Dredze

Natural language processing research increasingly relies on the output of a variety of syntactic and semantic analytics. Yet integrating output from multiple analytics into a single framework can be time consuming and slow research progress. We present a CONCRETE Chinese NLP Pipeline: an NLP stack built using a series of open source systems integrated based on the CONCRETE data schema. Our pipeline includes data ingest, word segmentation, part of speech tagging, parsing, named entity recognition, relation extraction and cross document coreference resolution. Additionally, we integrate a tool for visualizing these annotations as well as allowing for the manual annotation of new data. We release our pipeline to the research community to facilitate work on Chinese language tasks that require rich linguistic annotations.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2015

A Test Collection for Spoken Gujarati Queries

Douglas W. Oard; Rashmi Sankepally; Jerome White; Aren Jansen; Craig Harman

The development of a new test collection is described in which the task is to search naturally occurring spoken content using naturally occurring spoken queries. To support research on speech retrieval for low-resource settings, the collection includes terms learned by zero-resource term discovery techniques. Use of a new tool designed for exploration of spoken collections provides some additional insight into characteristics of the collection.


International Conference on Weblogs and Social Media | 2014

Measuring Post Traumatic Stress Disorder in Twitter

Glen Coppersmith; Craig Harman; Mark Dredze


Transactions of the Association for Computational Linguistics | 2015

Semantic Proto-Roles

Drew Reisinger; Rachel Rudinger; Francis Ferraro; Craig Harman; Kyle Rawlins; Benjamin Van Durme

Collaboration


Dive into Craig Harman's collaborations.

Top Co-Authors

Mark Dredze

Johns Hopkins University

Charley Beller

Johns Hopkins University

James Mayfield

Johns Hopkins University Applied Physics Laboratory

Kristy Hollingshead

Florida Institute for Human and Machine Cognition

Tim Finin

University of Maryland