Suin Kim
KAIST
Publication
Featured research published by Suin Kim.
conference on information and knowledge management | 2012
Dongwoo Kim; Suin Kim; Alice Haeyun Oh
Topic models such as latent Dirichlet allocation (LDA) and hierarchical Dirichlet processes (HDP) are simple solutions to discover topics from a set of unannotated documents. While they are simple and popular, a major shortcoming of LDA and HDP is that they do not organize the topics into a hierarchical structure which is naturally found in many datasets. We introduce the recursive Chinese restaurant process (rCRP) and a nonparametric topic model with rCRP as a prior for discovering a hierarchical topic structure with unbounded depth and width. Unlike previous models for discovering topic hierarchies, rCRP allows the documents to be generated from a mixture over the entire set of topics in the hierarchy. We apply rCRP to a corpus of New York Times articles, a dataset of MovieLens ratings, and a set of Wikipedia articles and show the discovered topic hierarchies. We compare the predictive power of rCRP with LDA, HDP, and nested Chinese restaurant process (nCRP) using heldout likelihood to show that rCRP outperforms the others. We suggest two metrics that quantify the characteristics of a topic hierarchy to compare the discovered topic hierarchies of rCRP and nCRP. The results show that rCRP discovers a hierarchy in which the topics become more specialized toward the leaves, and topics in the immediate family exhibit more affinity than topics beyond the immediate family.
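The abstract above builds on the Chinese restaurant process (CRP). As a rough illustration only, here is a minimal sketch of the ordinary, non-recursive CRP seating rule. It is not the rCRP model from the paper, and the function name `sample_crp` and its parameters are hypothetical names chosen for this sketch.

```python
import random


def sample_crp(num_customers, alpha, seed=0):
    """Seat customers one by one under a plain Chinese restaurant process.

    Hypothetical helper for illustration; alpha > 0 is the concentration
    parameter. This is the standard CRP, not the recursive variant (rCRP).
    """
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index of customer i
    for _ in range(num_customers):
        # An existing table is chosen in proportion to its size,
        # a new table in proportion to alpha.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[table] += 1    # join an existing table
        assignments.append(table)
    return assignments, table_sizes


assignments, table_sizes = sample_crp(num_customers=50, alpha=1.0)
print(len(table_sizes), "tables for", len(assignments), "customers")
```

Larger values of `alpha` tend to open more tables, which is the sense in which the number of clusters is left unbounded.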
international conference on computer safety reliability and security | 2010
Eunkyoung Jee; Suin Kim; Sung Deok Cha; Insup Lee
We present FBDTestMeasurer, an automated test coverage measurement tool for function block diagram (FBD) programs which are increasingly used in implementing safety critical systems such as nuclear reactor protection systems. We have defined new structural test coverage criteria for FBD programs in which dataflow-centric characteristics of FBD programs were well reflected. Given an FBD program and a set of test cases, FBDTestMeasurer produces test coverage score and uncovered test requirements with respect to the selected coverage criteria. Visual representation of uncovered data paths enables testers to easily identify which parts of the program need to be tested further. We found many aspects of the FBD logic that were not tested sufficiently when conducting a case study using test cases prepared by domain experts for reactor protection system software. Domain experts found this technique and tool highly intuitive and useful to measure the adequacy of FBD testing and generate additional test cases.
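The input/output relationship described above (a program's test requirements plus a set of test cases yield a coverage score and a list of uncovered requirements) can be sketched generically. This is not FBDTestMeasurer's implementation: the function `measure_coverage`, the requirement identifiers, and the test names below are all hypothetical, and how a real tool derives requirements from an FBD program is not shown.

```python
def measure_coverage(requirements, covered_by_test):
    """Return (coverage score, sorted uncovered requirements).

    requirements: iterable of requirement ids (e.g. data-path ids; hypothetical).
    covered_by_test: dict mapping a test-case name to the ids it exercises.
    """
    required = set(requirements)
    covered = set()
    for hits in covered_by_test.values():
        # Ignore anything a test reports that is not a known requirement.
        covered |= set(hits) & required
    uncovered = required - covered
    # An empty requirement set is treated as fully covered by convention.
    score = len(covered) / len(required) if required else 1.0
    return score, sorted(uncovered)


score, uncovered = measure_coverage(
    ["path1", "path2", "path3", "path4"],
    {"test_a": ["path1", "path2"], "test_b": ["path2", "path4"]},
)
print(score, uncovered)  # 0.75 ['path3']
```

The uncovered list is what a tester would inspect to decide which additional test cases to write.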
human factors in computing systems | 2015
Jae Won Kim; Dongwoo Kim; Brian Keegan; Suin Kim; Alice H. Oh
Sporting championships and other media events can induce very strong feelings of co-presence that can change communication patterns within large communities. Live tweeting reactions to media events provide high-resolution data with time-stamps to understand these behavioral dynamics. We employ a computational focus group method to identify a population of 790,744 international Twitter users, and we track their behavior before, during, and after the 2014 FIFA World Cup. We pick, in particular, a set of Twitter users who specified the teams that they are supporting, such that we can identify communities of fans of the teams, as well as the entire community of World Cup fans. The structure, dynamics, and content of communication of these communities of users are analyzed to compare behavior outside of the matches to behavior during the event and to examine behavioral responses across languages. Specifically, the temporal patterns of the tweeting volume, topics, retweeting, and mentioning behaviors are analyzed. We find there are similarities in the responses to media events, characteristic changes in activity patterns of users, and substantial differences in linguistic features. These findings have implications for designing more resilient socio-technical systems during crises and developing better models of complex social behavior.
international world wide web conferences | 2014
Yeooul Kim; Suin Kim; Alejandro Jaimes; Alice H. Oh
Agenda setting theory explains how media affects its audience. While traditional media studies have done extensive research on agenda setting, there are important limitations in those studies, including using a small set of issues, running costly surveys of public interest, and manually categorizing the articles into positive and negative frames. In this paper, we propose to tackle these limitations with a computational approach and a large dataset of online news. Overall, we demonstrate how to carry out large-scale computational research on agenda setting with online news data using machine learning.
PLOS ONE | 2016
Suin Kim; Sungjoon Park; Scott A. Hale; Soo-Young Kim; Jeongmin Byun; Alice H. Oh
Multilingualism is common offline, but we have a more limited understanding of the ways multilingualism is displayed online and the roles that multilinguals play in the spread of content between speakers of different languages. We take a computational approach to studying multilingualism using one of the largest user-generated content platforms, Wikipedia. We study multilingualism by collecting and analyzing a large dataset of the content written by multilingual editors of the English, German, and Spanish editions of Wikipedia. This dataset contains over two million paragraphs edited by over 15,000 multilingual users from July 8 to August 9, 2013. We analyze these multilingual editors in terms of their engagement, interests, and language proficiency in their primary and non-primary (secondary) languages and find that the English edition of Wikipedia displays different dynamics from the Spanish and German editions. Users primarily editing the Spanish and German editions make more complex edits than users who edit these editions as a second language. In contrast, users editing the English edition as a second language make edits that are just as complex as the edits by users who primarily edit the English edition. In this way, English serves a special role bringing together content written by multilinguals from many language editions. Nonetheless, language remains a formidable hurdle to the spread of content: we find evidence for a complexity barrier whereby editors are less likely to edit complex content in a second language. In addition, we find that multilinguals are less engaged and show lower levels of language proficiency in their second languages. We also examine the topical interests of multilingual editors and find that there is no significant difference between primary and non-primary editors in each language.
learning at scale | 2016
Suin Kim; Jae Won Kim; Jungkook Park; Alice H. Oh
We present Elice, an online CS (computer science) education platform, and Elivate, a system that takes student learning data from Elice and infers students' progress through an educational taxonomy tailored for programming education. Elice captures detailed student learning activities, such as the intermediate revisions of code as students make progress toward completing their programming exercises. With those data, Elivate recognizes each student's progression through an educational taxonomy that organizes intermediate stages of learning, so that the taxonomy can be used to evaluate student progress as well as to design and improve course materials and structure. With more than 240,000 intermediate source code revisions generated by 1,000 students, we demonstrate the practicality of Elice and Elivate. We present case studies that confirm that categorizing student actions into the different steps of the taxonomy results in a better understanding of the effect of TAs' assistance and students' performance.
conference on computer supported cooperative work | 2017
Jungkook Park; Yeong Hoon Park; Suin Kim; Alice H. Oh
In this paper, we investigate the effectiveness of visualization of code history on peer assessment in computer science education. Peer assessment is found to be an effective learning tool for programming education. While many systems have been proposed to support peer assessment in programming education, little effort has been devoted to finding ways to improve peer assessment by helping students understand the programs they are assessing. We introduce Eliph, a web-based peer assessment system for programming education with code history visualization. Eliph incorporates the visualization of character-level code history, selection-based history tracking, and the integration of execution events to assist students in understanding programs written by peers, thereby leading to more effective peer assessment. We evaluate Eliph with an experiment in an undergraduate CS course. We show that visualization of code history has a positive effect on the quality of peer feedback by helping assessors understand the author's intention and thought process.
learning at scale | 2018
Jungkook Park; Yeong Hoon Park; Jinhan Kim; Jeongmin Cha; Suin Kim; Alice H. Oh
In programming education, instructors often supplement lectures with active learning experiences by offering programming lab sessions where learners themselves practice writing code. However, widely accessed instructional programming screencasts are not equipped with an assessment format that encourages such hands-on programming activities. We introduce Elicast, a screencast tool for recording and viewing programming lectures with embedded programming exercises, to provide hands-on programming experiences in the screencast. In Elicast, instructors embed multiple programming exercises while creating a screencast, and learners engage in the exercises by writing code within the screencast, receiving auto-graded results immediately. We conducted an exploratory study of Elicast with five experienced instructors and 63 undergraduate students. We found that instructors structured the lectures into small learning units using embedded exercises as checkpoints. Also, learners engaged more actively in the screencast lectures, checked their understanding of the content through the embedded exercises, and more frequently modified and executed the code during the lectures.
user interface software and technology | 2017
Ja-Ryoung Choi; Suin Kim; Soon-Bum Lim
The emergence of social reading services has enabled readers to participate actively in reading activities by means of sharing and feedback. Readers can state their opinion on a book by providing feedback. However, because current e-books are published with fixed, unchangeable content, it is difficult to reflect readers' feedback in them. In this paper, we propose a system for an adaptive e-book that dynamically updates itself based on user participation. To achieve this, we designed a Feedback Block Model and a Feedback Engine. In the Feedback Block Model, at the time of publication, the author defines the type of feedback expected from readers. After publication, the Feedback Engine collects and aggregates the readers' feedback. The Feedback Engine can be configured with drag-and-drop block programming, and hence, even authors inexperienced in programming can create an adaptive e-book.
learning at scale | 2016
Suin Kim; Jae Won Kim; Jungkook Park; Alice H. Oh
We present Elice, an online CS (computer science) education platform, and Elivate, a system for (i) taking student learning data from Elice, (ii) inferring students' progress through an educational taxonomy tailored for programming education, and (iii) generating real-time assistance for students and lecturers. Online courses suffer from high average attrition rates, and early prediction can enable early personalized feedback to motivate and assist students who may be having difficulties. Elice captures detailed student learning activities, including intermediate revisions of code as students make progress toward completing their programming exercises and timestamps of student logins and submissions. Elivate then takes those data to analyze each student's progress and estimate the time to completion. In doing so, Elivate uses a learning taxonomy and automatic clustering of source code revisions. Using more than 240,000 code revisions generated by 1,000 students, we demonstrate how Elivate processes large-scale student data and generates appropriate real-time feedback for students.