
Tingting He

Central China Normal University

Publication

Featured research published by Tingting He.

computer science and software engineering | 2008

Query-Focused Multi-document Summarization Using Keyword Extraction

Liang Ma; Tingting He; Fang Li; Zhuoming Gui; Jinguang Chen

This paper proposes a sentence selection strategy for query-focused multi-document summarization based on extracting keywords from the relevant document set. It calculates a query-related feature and a topic-related feature for every word in the relevant document set, then obtains the importance of each word by combining the two features. The score of a candidate sentence is computed from the importance of the words it contains, and a modified MMR technique is used to adjust the scores of the candidate sentences. The candidate sentence with the highest score is then selected as a summary sentence, and the process repeats until the summary is long enough. Experimental results show that our method performs very well on the DUC 2005 and DUC 2006 corpora.
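The selection loop described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the word importance values are assumed to be given, the base score is taken to be the mean importance of a sentence's words, redundancy is measured with Jaccard word overlap, and `lam` is a hypothetical trade-off weight. The paper's "modified MMR" may differ on all of these points.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def jaccard(a, b):
    """Word-overlap similarity between two token lists (an assumed choice)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def summarize(sentences, word_importance, max_words, lam=0.7):
    """Greedy MMR-style selection over keyword-scored sentences."""
    tokens = [tokenize(s) for s in sentences]
    # Base score: mean importance of the words a sentence contains (assumption).
    base = [
        sum(word_importance.get(w, 0.0) for w in t) / len(t) if t else 0.0
        for t in tokens
    ]
    selected, length = [], 0
    remaining = set(range(len(sentences)))

    def mmr(i):
        # Penalize candidates that overlap with already selected sentences.
        redundancy = max(
            (jaccard(tokens[i], tokens[j]) for j in selected), default=0.0
        )
        return lam * base[i] - (1 - lam) * redundancy

    while remaining:
        # Only consider candidates that still fit in the length budget.
        fits = [i for i in sorted(remaining) if length + len(tokens[i]) <= max_words]
        if not fits:
            break
        best = max(fits, key=mmr)
        selected.append(best)
        length += len(tokens[best])
        remaining.remove(best)
    return [sentences[i] for i in selected]
```

With two near-duplicate sentences and one distinct sentence, the redundancy penalty makes the loop skip the duplicate and pick the distinct sentence second, even though its keyword score is slightly lower.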

international conference on advanced language processing and web information technology | 2008

A New Feature-Fusion Sentence Selecting Strategy for Query-Focused Multi-document Summarization

Tingting He; Fang Li; Wei Shao; Jinguang Chen; Liang Ma

The most important step of query-focused extractive summarization is deciding which sentences should be included in the final summary. In this paper, we propose a feature-fusion-based sentence selection strategy to identify sentences with high query relevance and high information density. We score each sentence by computing its similarity and its Skip-Bigram co-occurrence with the query. These two features measure query relevance in terms of content and structure, respectively. Then, we re-score the sentences using an information density feature obtained from a text graph, which can provide position information. Finally, we adopt MMR for sentence extraction. Experimental results indicate that this method is effective in capturing important sentences. The ROUGE-2 and ROUGE-SU4 scores are 0.0640 and 0.1233, which are at the top of the DUC2005 scores.
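As an illustration of the Skip-Bigram co-occurrence feature mentioned in this abstract, the sketch below counts ordered word pairs shared by a sentence and the query. The whitespace tokenization, the `max_gap` window of four intervening words, and the normalization by the number of query skip-bigrams are assumptions made for the example; the abstract does not specify them.

```python
from itertools import combinations


def skip_bigrams(tokens, max_gap=4):
    """Return ordered word pairs with at most max_gap words between them."""
    pairs = set()
    for i, j in combinations(range(len(tokens)), 2):
        if j - i - 1 <= max_gap:
            pairs.add((tokens[i], tokens[j]))
    return pairs


def skip_bigram_overlap(sentence, query, max_gap=4):
    """Fraction of the query's skip-bigrams that also occur in the sentence."""
    s = skip_bigrams(sentence.lower().split(), max_gap)
    q = skip_bigrams(query.lower().split(), max_gap)
    return len(s & q) / len(q) if q else 0.0
```

Because the pairs are ordered, the feature is sensitive to word order as well as word choice, which is one way to read the abstract's remark that it measures relevance from structure.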

international conference natural language processing | 2011

Tag-topic model for semantic knowledge acquisition from blogs

Fang Li; Huiyu Shen; Tingting He

This paper proposes a tag-topic model for semantic knowledge acquisition from blogs. The model extends Latent Dirichlet Allocation by adding a tag layer between the document layer and the topic layer. It represents each document as a mixture of tags; each tag is associated with a multinomial distribution over topics, and each topic is associated with a multinomial distribution over words. After parameter estimation, the tags are regarded as concepts, the top words assigned to the top topics are selected as related words of the concepts, and PMI-IR is used to filter out noisy words and improve the quality of the semantic knowledge. Experimental results show that the tag-topic model can effectively capture semantic knowledge.
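One way to read the layered structure in this abstract is as a generative process: for each word, draw a tag from the document's tag mixture, then a topic from that tag, then a word from that topic. The sampler below sketches that reading only. The function name, the dictionary-based distributions, and the fixed seed are hypothetical, and parameter estimation (the hard part of the model) is not shown.

```python
import random


def sample_document(tag_weights, tag_topic, topic_word, n_words, seed=0):
    """Sample (tag, topic, word) triples from a document -> tag -> topic -> word chain."""
    rng = random.Random(seed)

    def draw(dist):
        # Draw one key from a {key: probability} dictionary.
        keys = list(dist)
        return rng.choices(keys, weights=[dist[k] for k in keys])[0]

    triples = []
    for _ in range(n_words):
        tag = draw(tag_weights)          # document is a mixture of tags
        topic = draw(tag_topic[tag])     # each tag has a distribution over topics
        word = draw(topic_word[topic])   # each topic has a distribution over words
        triples.append((tag, topic, word))
    return triples
```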

international conference natural language processing | 2010

Emotion analysis in blogs at sentence level using a Chinese emotion corpus

Changqin Quan; Tingting He; Fuji Ren

Previous research on emotion analysis of text has covered a variety of content: weblogs, stories, news, text messages, spoken dialogs, and so on. Compared with other text styles, emotional expression in blogs has the following main characteristics: (1) a highly personal, subjective writing style; (2) new words and expressions that are constantly emerging; (3) integrity and continuity of language use. In this study, we use a Chinese emotion corpus (Ren-CECps) to analyze emotional expression in blogs at the sentence level. First, we separate the sentences into two classes: simple sentences (sentences without negative words, conjunctions, or question marks) and complex sentences (sentences with negative words, conjunctions, or question marks). Then we compare the two classes on sentence emotion recognition based on emotional words. Furthermore, we analyze the following factors for emotion change at the sentence level: negative words, conjunctions, punctuation marks, and contextual emotions. Finally, we hypothesize that the emotional focus of a sentence can be expressed by a certain clause in that sentence. The experimental results confirm this hypothesis and show that correctly selecting the clauses containing the emotional focus of a sentence helps to recognize sentence emotions.

international conference natural language processing | 2010

Research on sentiment classification of Blog based on PMI-IR

Xiuting Duan; Tingting He; Le Song

The growth of blog text on the internet has brought new challenges to Chinese text classification. Aiming to solve the semantic deficiency problem of traditional methods for Chinese text classification, this paper implements a method that classifies a blog as joy, angry, sad, or fear using a simple unsupervised learning algorithm. The class of a blog text is predicted from the maximum semantic orientation (SO) of the phrases in the text that contain adjectives or adverbs. The SO of a phrase is calculated as the mutual information between the phrase and the polar words, and the SO of the blog text is determined by the maximum mutual information value. For example, a blog text is classified as joy if the SO of its phrases is joy. Two corpora are used to test the method: the blog corpus collected by the Monitor and Research Center for National Language Resource Network Multimedia Sub-branch Center, and the Chinese dataset provided by the COAE2008 task. On both datasets, the method achieves a substantial improvement over traditional methods.
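The classification rule in this abstract can be sketched with pointwise mutual information computed from counts. The example assumes that occurrence and co-occurrence counts are already available (in PMI-IR they would typically come from search engine hit counts) and that each emotion class is represented by a single polar seed word. The function names and the count dictionaries are hypothetical.

```python
import math


def pmi(joint, count_a, count_b, total):
    """Pointwise mutual information (base 2) from raw counts."""
    if joint == 0 or count_a == 0 or count_b == 0:
        return float("-inf")
    return math.log2(joint * total / (count_a * count_b))


def classify(phrase_counts, seed_counts, joint_counts, total):
    """Label a text with the emotion of its highest-PMI (phrase, seed) pair.

    phrase_counts: {phrase: occurrences of the phrase}
    seed_counts:   {emotion: occurrences of its polar seed word}
    joint_counts:  {(phrase, emotion): co-occurrences of phrase and seed word}
    """
    best_label, best_score = None, float("-inf")
    for phrase, p_count in phrase_counts.items():
        for emotion, s_count in seed_counts.items():
            joint = joint_counts.get((phrase, emotion), 0)
            score = pmi(joint, p_count, s_count, total)
            if score > best_score:
                best_label, best_score = emotion, score
    return best_label, best_score
```

Taking the maximum over all phrase and seed pairs mirrors the abstract's statement that the text's orientation is determined by the maximum mutual information value.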

granular computing | 2012

Pseudo-relevance feedback query based on Wikipedia

Tingting He; Xionglu Dai

Traditional information retrieval (IR) models usually use either a BOW (bag-of-words)-based retrieval model or a concept-based retrieval model alone. However, BOW-based models ignore the rich semantic relations between words and text, and concept-based models often introduce noisy concepts and lose precision. Pseudo-relevance feedback (PRF) is a widely used method for improving retrieval effectiveness, but it depends strongly on the precision of the initial retrieval results. To address these issues, this paper proposes a new concept generator called Enrichment-ESA, an enrichment of the Explicit Semantic Analysis (ESA) method. With the help of Enrichment-ESA, we propose a novel PRF method that combines the BOW-based and concept-based retrieval models to mitigate the shortcomings of existing IR models. The experimental results show that our method improves over the baseline method and performs better than the common PRF method.

international conference natural language processing | 2010

Obtaining Chinese semantic knowledge from online encyclopedia

Liu Yang; Tingting He; Xinhui Tu; Jinguang Chen

This paper proposes a method to obtain semantic knowledge from an online encyclopedia called Hudong encyclopedia (hudong baike). We obtain concepts and their semantically related concepts, and compute the semantic relatedness by using the inner hyperlinks and the open category information in Hudong encyclopedia. By comparing our results with human judgments, we show that our relatedness computing method is quite effective.

granular computing | 2010

Document Relevance Identifying and its Effect in Query-Focused Text Summarization

Tingting He; Fang Li; Liang Ma

An important issue in text summarization is that a summary has to reflect the user's personal information need and provide an indicative message to the user. In this paper, a method of acquiring relevant documents based on user-feedback information and transductive inference SVM machine learning is presented. This method avoids the subjectivity of deciding relevant documents empirically. Furthermore, a sentence selection strategy based on keyword extraction is proposed. It calculates each word's query-related feature through a word co-occurrence window and its topic-related feature through the likelihood ratio, then combines the two features to extract keywords and score the candidate sentences. The experimental results show that the proposed methods can capture the main idea of the document set and satisfy the query demand effectively.

international conference natural language processing | 2008

Automatic construction of biomedical abbreviations dictionary from text

Changqin Quan; Fuji Ren; Tingting He; Po Hu

The number of biomedical abbreviations and their growth rate are increasing very fast. Automatic construction of a biomedical abbreviation dictionary from text helps to understand biomedical literature and to update existing databases, ontologies, and dictionaries. This paper proposes a new method for automatically constructing a biomedical abbreviation dictionary from text by combining a string matching algorithm and a searching algorithm. The string matching algorithm extracts abbreviations and their long forms. The searching algorithm corrects the false long forms produced by the string matching algorithm. It is based on the idea that readers often look up related articles to judge whether the long form of an abbreviation is correct. Our experiments show that the algorithm has high precision (97.5%) and recall (82.2%). Because a tagged corpus is not necessary, the method is also highly efficient.
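The string matching step can be illustrated with a deliberately simplified first-letter heuristic: for each parenthesized uppercase abbreviation, take as many preceding words as the abbreviation has letters and accept them if their initials spell it. This is not the algorithm from the paper, whose matching rules the abstract does not describe, and the searching step that corrects false long forms is not shown. The function name and the regular expression are assumptions.

```python
import re


def find_abbreviations(text):
    """Map parenthesized abbreviations to long forms using first-letter matching."""
    pairs = {}
    for match in re.finditer(r"\(([A-Z]{2,10})\)", text):
        abbr = match.group(1)
        # Candidate long form: the len(abbr) words just before the parenthesis.
        words = text[: match.start()].split()
        candidate = words[-len(abbr):]
        if len(candidate) == len(abbr) and all(
            word[0].upper() == letter for word, letter in zip(candidate, abbr)
        ):
            pairs[abbr] = " ".join(candidate)
    return pairs
```

A heuristic this strict misses long forms that contain function words or contribute several letters from one word, which is the kind of error a second correction step would need to handle.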

computer science and software engineering | 2008

Application of Transductive Inference SVM Based Relevant Documents Acquiring in Query-Biased Summarization

Fang Li; Tingting He; Liang Ma; Wei Shao; Jinguang Chen

An important issue in text summarization is that a summary has to reflect the user's personal information need and provide an indicative message to the user. In this paper, a method of acquiring relevant documents based on user-feedback information and transductive inference SVM machine learning technology is presented. This method avoids the subjectivity of deciding relevant documents empirically. To validate its effect, we extract important sentences as the final summary using a feature-fusion sentence selection strategy. The results show that the method can effectively improve the performance of query-biased summarization.
