Songchang Jin
National University of Defense Technology
Publications
Featured research published by Songchang Jin.
Cluster Computing | 2015
Songchang Jin; Wangqun Lin; Hong Yin; Shuqiang Yang; Aiping Li; Bo Deng
Social media networks are playing an increasingly prominent role in people's daily lives. Community structure is one of the salient features of social media networks and has practical applications such as recommendation systems and network marketing. With the rapid growth of social media and the surge of information they carry, identifying communities in big data scenarios has become a challenge. Based on our previous work and the map equation (an information-theoretic formulation for community mining), we develop a novel distributed community structure mining framework. In the framework, (1) we propose a new link information update method that tries to avoid data-writing operations and speed up the process; (2) we use local information from the nodes and their neighbors, instead of PageRank, to calculate the probability distribution of the nodes; and (3) we drop the network partitioning step of our previous work and run the map equation directly on MapReduce. Empirical results on real-world social media networks and artificial networks show that the new framework outperforms our previous work and well-known algorithms such as Radetal and FastGN in accuracy, speed, and scalability.
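To make the map equation concrete, the sketch below computes the description length L(M) of a given partition on a single machine. It is not the authors' distributed MapReduce implementation; node visit probabilities are approximated by normalized degree (the kind of "local information" mentioned above) rather than PageRank, and the toy graph is invented for illustration.

    # Minimal single-machine sketch of the map equation L(M); lower is a better partition.
    from math import log2
    from collections import defaultdict

    def plogp(p):
        return p * log2(p) if p > 0 else 0.0

    def map_equation(edges, partition):
        """edges: undirected (u, v) pairs; partition: node -> module id."""
        degree = defaultdict(float)
        for u, v in edges:
            degree[u] += 1.0
            degree[v] += 1.0
        total = sum(degree.values())                      # = 2 * |E|
        p = {n: d / total for n, d in degree.items()}     # degree-based visit probabilities

        exit_p = defaultdict(float)                       # module exit probabilities q_m
        inside_p = defaultdict(float)                     # sum of visit probabilities inside m
        for n, prob in p.items():
            inside_p[partition[n]] += prob
        for u, v in edges:
            if partition[u] != partition[v]:              # random-walk step leaving a module
                exit_p[partition[u]] += p[u] / degree[u]
                exit_p[partition[v]] += p[v] / degree[v]

        q = sum(exit_p.values())
        modules = set(partition.values())
        # index codebook term + per-module codebook terms
        index_term = -sum(plogp(exit_p[m] / q) for m in modules) * q if q > 0 else 0.0
        module_term = 0.0
        for m in modules:
            p_circ = exit_p[m] + inside_p[m]
            h = -plogp(exit_p[m] / p_circ)
            h -= sum(plogp(p[n] / p_circ) for n in partition if partition[n] == m)
            module_term += p_circ * h
        return index_term + module_term

    # toy usage: two triangles joined by one edge, split into two modules
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    print(map_equation(edges, {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}))

A distributed variant would evaluate candidate module moves in map tasks and aggregate the exit and visit probabilities in reduce tasks; the formula itself is what the framework minimizes.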
International Conference on Trustworthy Computing and Services | 2012
Songchang Jin; Shuqiang Yang; Xiang Zhu; Hong Yin
This paper analyses the data security issues in the Hadoop platform and proposes a design for a trusted file system for Hadoop. The design uses recent cryptography, namely fully homomorphic encryption, together with authentication agent technology, and ensures reliability and safety at three levels: hardware, data, and users and operations. The homomorphic encryption technology allows computation to be carried out directly on encrypted data, protecting data confidentiality without sacrificing application efficiency. The authentication agent technology offers a variety of access control rules, combining access control mechanisms, privilege separation, and security audit mechanisms, to ensure the safety of the data stored in the Hadoop file system.
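The paper relies on fully homomorphic encryption, which is too heavy to reproduce here. As a small stand-in only, the sketch below uses the Paillier cryptosystem, which is merely additively homomorphic, to illustrate why "operating on encrypted data" is possible at all; the toy primes and values are illustrative and unrelated to the paper's system.

    # Paillier stand-in: adding plaintexts by multiplying ciphertexts (additive homomorphism only).
    import math, secrets

    def keygen(p=61, q=53):                       # toy primes; never use in practice
        n = p * q
        lam = math.lcm(p - 1, q - 1)
        g = n + 1
        mu = pow(lam, -1, n)                      # valid because g = n + 1
        return (n, g), (lam, mu, n)

    def encrypt(pub, m):
        n, g = pub
        r = secrets.randbelow(n - 1) + 1
        while math.gcd(r, n) != 1:                # r must be invertible mod n
            r = secrets.randbelow(n - 1) + 1
        return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

    def decrypt(priv, c):
        lam, mu, n = priv
        L = (pow(c, lam, n * n) - 1) // n
        return (L * mu) % n

    pub, priv = keygen()
    c1, c2 = encrypt(pub, 20), encrypt(pub, 22)
    c_sum = (c1 * c2) % (pub[0] ** 2)             # plaintexts are added without decrypting
    assert decrypt(priv, c_sum) == 42

A fully homomorphic scheme extends this idea to both addition and multiplication, which is what lets the file system keep data encrypted while applications still compute over it.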
International Conference on Computer Science and Network Technology | 2012
Songchang Jin; Shuqiang Yang; Yan Jia
With the arrival of the big data era, parallel processing is essential to processing a massive volume of data in a timely manner. Map-Reduce, which has become widely popular, is a scalable and fault-tolerant data processing framework that processes massive volumes of data in parallel on many low-end computing nodes. As an important part of the framework, map task assignment has a significant impact on Map-Reduce performance. However, when allocating input files to map tasks, the Map-Reduce framework does not take into account the distribution of the input data blocks in the file system or the load on the computing nodes themselves, which increases the amount of network data transfer and the system load when running map tasks. Especially when the framework uses the FIFO job scheduling strategy to handle a large number of small jobs, performance degrades severely. In this paper, we design and implement a new task assignment strategy to increase the performance and efficiency of the Map-Reduce framework.
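The following greedy sketch illustrates the general idea of locality- and load-aware assignment described above; it is not the authors' strategy, and the task, node, and block names are invented. Each map task is preferentially placed on a node that already stores a replica of its input block, breaking ties by current load, with a least-loaded remote node as fallback.

    # Hypothetical locality- and load-aware map task assignment.
    def assign_map_tasks(tasks, node_load):
        """tasks: {task_id: [nodes holding the task's input block]};
        node_load: {node: number of tasks already queued}."""
        assignment = {}
        for task, replicas in sorted(tasks.items()):
            local = [n for n in replicas if n in node_load]
            pool = local if local else list(node_load)        # remote fallback if no local replica
            target = min(pool, key=lambda n: node_load[n])    # least-loaded candidate
            assignment[task] = target
            node_load[target] += 1
        return assignment

    tasks = {"m1": ["nodeA", "nodeB"], "m2": ["nodeB"], "m3": ["nodeC"], "m4": ["nodeA"]}
    load = {"nodeA": 2, "nodeB": 0, "nodeC": 1}
    print(assign_map_tasks(tasks, load))   # e.g. m1 -> nodeB (local and idle)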
International Conference on Computer Science and Network Technology | 2012
Hui Zhao; Shuqiang Yang; Zhikun Chen; Hong Yin; Songchang Jin
Scheduling algorithms play a crucial role in MapReduce systems. Several recent scheduling algorithms, however, all follow a Job-Task scheduling model, which confines task scheduling and leads to poor support for scheduling preferences such as data locality and scan sharing. These characteristics are important heuristics in data-intensive computing and help improve system throughput. In this paper, we first design a novel scheduling model, termed Tasks-Job scheduling, to overcome the above issues. Furthermore, we propose a locality-aware algorithm to improve system throughput. Comprehensive experiments compare the proposed scheduling model and algorithm with state-of-the-art Job-Task based algorithms, and the experimental results validate the efficiency and effectiveness of our proposed algorithm.
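As a hypothetical illustration of the scan-sharing idea behind a Tasks-Job model (not the paper's algorithm), the sketch below groups pending map tasks from different jobs that read the same input block, so a single scan of that block can feed all of them instead of scheduling strictly job by job.

    # Hypothetical scan-sharing grouping of pending tasks by input block.
    from collections import defaultdict

    def group_by_block(pending_tasks):
        """pending_tasks: list of (job_id, task_id, block_id) triples."""
        shared = defaultdict(list)
        for job, task, block in pending_tasks:
            shared[block].append((job, task))
        # blocks wanted by the most tasks are scanned first to maximize sharing
        return sorted(shared.items(), key=lambda kv: -len(kv[1]))

    pending = [("job1", "t1", "blk7"), ("job2", "t4", "blk7"),
               ("job3", "t2", "blk7"), ("job1", "t5", "blk9")]
    for block, consumers in group_by_block(pending):
        print(block, "->", consumers)     # blk7 is served by one scan for three jobs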
Web-Age Information Management | 2015
Xiang Zhu; Yuanping Nie; Songchang Jin; Aiping Li; Yan Jia
Millions of users generate and propagate information in online social networks. Search engines and data mining tools allow people to track hot topics and events online. However, the massive use of social media also makes it easier for malicious users, known as social spammers, to flood social networks with junk information. To solve this problem, a classifier is needed to detect social spammers. One effective way to detect spammers is based on content and user information. Nevertheless, social spammers are tricky and able to fool the system by evolving their content and profiles. First, social spammers continually change their patterns to deceive the detection system. Second, spammers try to gain influence and disguise themselves as far as possible. Because of these dynamic patterns, it is difficult for existing methods to respond to social spammers effectively and efficiently. In this paper, we present a logistic regression model that considers both content attributes and behavior attributes of users in social networks. We analyze user attributes to inherently differentiate spammers from non-spammers. Experimental results on Twitter data show the effectiveness and efficiency of the proposed method.
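A minimal sketch of this kind of classifier is shown below, assuming scikit-learn; the content and behavior features (URL ratio, posting rate, follower/followee ratio, mention ratio) and the synthetic data are made up for illustration, while the real feature set and Twitter data belong to the paper.

    # Logistic regression over illustrative content + behavior features.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    # columns: url_ratio, posts_per_hour, follower_followee_ratio, mention_ratio
    X_spam = rng.normal([0.8, 20.0, 0.1, 0.7], 0.2, size=(200, 4))
    X_ham = rng.normal([0.2, 1.5, 1.2, 0.2], 0.2, size=(200, 4))
    X = np.vstack([X_spam, X_ham])
    y = np.array([1] * 200 + [0] * 200)            # 1 = spammer, 0 = legitimate user

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("held-out accuracy:", clf.score(X_te, y_te))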
The Scientific World Journal | 2014
Hong Yin; Shuqiang Yang; Xiaoqian Zhu; Songchang Jin; Xiang Wang
Satellite fault diagnosis plays an important role in enhancing the safety, reliability, and availability of the satellite system. However, the enormous number of parameters and the presence of multiple faults make satellite fault diagnosis challenging: interactions between parameters and misclassifications arising from multiple faults increase both the false alarm rate and the false negative rate. Moreover, for each satellite fault there is not enough fault data for training, which degrades the performance of most classification algorithms. In this paper, we propose an improved SVM based on a hybrid voting mechanism (HVM-SVM) to deal with the problems of enormous parameters, multiple faults, and small samples. Extensive experimental results show that HVM-SVM improves the accuracy of fault diagnosis.
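The exact hybrid voting mechanism of HVM-SVM is not reproduced here. As a generic stand-in, the sketch below shows the common baseline such methods refine: one-vs-one SVMs whose pairwise votes are tallied to pick the fault class, which also copes reasonably with few samples per class. The toy data and class layout are invented.

    # One-vs-one SVMs with majority voting (generic baseline, not HVM-SVM itself).
    import numpy as np
    from itertools import combinations
    from sklearn.svm import SVC

    def ovo_vote(X_train, y_train, X_test):
        classes = np.unique(y_train)
        votes = np.zeros((len(X_test), len(classes)))
        for i, j in combinations(range(len(classes)), 2):
            mask = np.isin(y_train, [classes[i], classes[j]])
            clf = SVC(kernel="rbf", gamma="scale").fit(X_train[mask], y_train[mask])
            pred = clf.predict(X_test)
            votes[:, i] += (pred == classes[i])
            votes[:, j] += (pred == classes[j])
        return classes[votes.argmax(axis=1)]       # class with the most pairwise votes

    # toy data: 3 fault classes, few samples each (the small-sample setting)
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 0.3, size=(15, 5)) for c in (0.0, 1.0, 2.0)])
    y = np.repeat([0, 1, 2], 15)
    print(ovo_vote(X, y, X[[0, 20, 40]]))          # expected: [0 1 2]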
Technology and Health Care | 2015
Aiping Li; Songchang Jin; Lumin Zhang; Yan Jia
Although diagnostic expert systems that use a knowledge base modeling the decision-making of traditional experts can provide important information to non-experts, they tend to duplicate the errors made by those experts. A Decision-Theoretic Model (DTM) is therefore very useful in an expert system, since it guards against incorrect reasoning under uncertainty. For the diagnostic expert system, we study the corresponding DTM and algorithms and give a sequential diagnostic decision-theoretic model based on a Bayesian network. In the model, the alternative features are divided into two classes (disease features and test features), and an algorithm for computing the prior of a test is provided. How different features affect the weights of other features is also discussed. A Bayesian network is adopted to handle the representation and propagation of uncertainty. The model can help knowledge engineers model the knowledge involved in sequential diagnosis and decide the priority among alternative pieces of evidence. A practical example of the model is also presented: at any point in the diagnostic process the expert is provided with a dynamically updated list of suggested tests to support the decision about which test to execute next. The results show that it outperforms the traditional, experience-based diagnostic model.
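The sketch below is a small, hypothetical illustration of sequential test selection under a Bayesian model, using a single disease node with naive conditional independence rather than the paper's full Bayesian network. The next test is the one whose outcome is expected to reduce the entropy of the disease distribution the most; the diseases, tests, and probabilities are invented.

    # Suggest the next test by expected entropy reduction (value-of-information proxy).
    from math import log2

    prior = {"flu": 0.5, "cold": 0.4, "pneumonia": 0.1}
    # P(test positive | disease) for each candidate test (illustrative numbers)
    likelihood = {
        "fever_check": {"flu": 0.9, "cold": 0.3, "pneumonia": 0.8},
        "chest_xray":  {"flu": 0.1, "cold": 0.05, "pneumonia": 0.9},
    }

    def entropy(dist):
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    def posterior(prior, lik, positive):
        unnorm = {d: p * (lik[d] if positive else 1 - lik[d]) for d, p in prior.items()}
        z = sum(unnorm.values())
        return {d: v / z for d, v in unnorm.items()}

    def expected_gain(prior, lik):
        p_pos = sum(prior[d] * lik[d] for d in prior)      # marginal P(test positive)
        post_pos, post_neg = posterior(prior, lik, True), posterior(prior, lik, False)
        return entropy(prior) - (p_pos * entropy(post_pos) + (1 - p_pos) * entropy(post_neg))

    best = max(likelihood, key=lambda t: expected_gain(prior, likelihood[t]))
    print("suggest next test:", best)

After the chosen test is executed, its result updates the distribution via the posterior and the ranking of remaining tests is recomputed, which is the dynamically updated suggestion list the abstract describes.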
CCL | 2015
Zongsheng Xie; Yuanping Nie; Songchang Jin; Shudong Li; Aiping Li
Community question answering (CQA) portals have become one of the most important sources for people seeking information on the Internet. With large numbers of online users ready to help, askers are willing to post questions in CQA and are likely to obtain desirable answers. However, answer quality in CQA varies widely, from helpful answers to abusive spam, so answer quality assessment is of great significance. Most existing approaches evaluate answer quality based on the relevance between questions and answers; because of the lexical gap between questions and answers, these approaches are not quite satisfactory. In this paper, a novel approach is proposed to rank candidate answers that utilizes support sets to reduce the impact of the lexical gap between questions and answers. First, similar questions are retrieved and support sets are built from their high-quality answers. Based on the assumption that high-quality answers to similar questions also share intrinsic similarity, the quality of candidate answers is then evaluated through their distance from the support sets in terms of both content and structure. Unlike most existing approaches, prior knowledge from similar question-answer pairs is used to bridge the lexical and semantic gaps between questions and answers. Experiments on approximately 2.15 million real-world question-answer pairs from Yahoo! Answers verify the effectiveness of our approach. Results on the MAP@K and MRR metrics show that the proposed approach ranks candidate answers precisely.
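A content-only sketch of the support-set idea follows (the paper also uses structural distance, which is omitted here), assuming scikit-learn; the example answers are made up. Candidates that are closer to the high-quality answers of similar questions receive higher scores.

    # Rank candidate answers by their mean similarity to a support set of good answers.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    support_set = [                     # best answers retrieved for similar questions
        "Restart the router and check that the DNS server address is correct.",
        "Flush the DNS cache and switch to a public DNS resolver.",
    ]
    candidates = [
        "Try flushing your DNS cache, then point the router at a public resolver.",
        "First!!! Follow my page for great deals!!!",
    ]

    vec = TfidfVectorizer().fit(support_set + candidates)
    S = vec.transform(support_set)
    C = vec.transform(candidates)
    scores = cosine_similarity(C, S).mean(axis=1)      # mean similarity to the support set
    for score, answer in sorted(zip(scores, candidates), reverse=True):
        print(round(float(score), 3), answer)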
Web-Age Information Management | 2016
Fenglan Li; Anming Ji; Songchang Jin; Shuqiang Yang; Qiang Liu
Many individuals participate in more than one social network and maintain relationships across them. The information integrated from multiple social networks is highly valuable, yet research on multiple social networks remains limited. The work presented in this paper taps into the abundant information of multiple social networks and aims to solve the initial-phase problem of MapReduce-based analysis of multi-related social networks: partitioning the multi-related social networks into non-intersecting subsets. To concretize the discussion, we propose a new multilevel framework (CPMN) that proceeds in four stages: a Merging Phase, a Coarsening Phase, an Initial Partitioning Phase, and an Uncoarsening Phase. We propose a modified matching strategy for the second stage and a modified refinement algorithm for the fourth stage. We demonstrate the effectiveness of CPMN on both synthetic and real datasets. Experiments show that the same node appearing in different social networks is assigned to the same partition 100% of the time, without sacrificing load balance or edge cut too much. We believe that our work will shed light on the study of multiple social networks based on MapReduce.
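As a hypothetical sketch of the Merging Phase idea only, not CPMN itself, the code below collapses copies of the same user from different networks into one super-node before partitioning; any partitioner run afterwards then necessarily places all copies of that user in the same part. Network names, node IDs, and the identity mapping are invented.

    # Merge multi-network copies of the same user into super-nodes with weights.
    from collections import defaultdict

    def merge_networks(networks, identity):
        """networks: {net_name: [(u, v), ...]}; identity: (net, node) -> user id."""
        merged_edges = defaultdict(int)
        weight = defaultdict(int)                       # super-node weight, used for load balance
        for net, edges in networks.items():
            seen = set()
            for u, v in edges:
                a, b = identity[(net, u)], identity[(net, v)]
                seen.update([(net, u), (net, v)])
                if a != b:
                    merged_edges[tuple(sorted((a, b)))] += 1
            for key in seen:
                weight[identity[key]] += 1
        return dict(merged_edges), dict(weight)

    networks = {"twitter": [("t1", "t2"), ("t2", "t3")], "weibo": [("w1", "w3")]}
    identity = {("twitter", "t1"): "alice", ("twitter", "t2"): "bob",
                ("twitter", "t3"): "carol", ("weibo", "w1"): "alice",
                ("weibo", "w3"): "carol"}
    print(merge_networks(networks, identity))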
Archive | 2013
Songchang Jin; Shuqiang Yang; Songhe Jin; Hui Zhao; Xiang Wang
With the development of computer and network technology, large-scale network attacks such as worms, distributed denial of service, and Trojans occur frequently on the Internet. These attacks have serious impacts on network services and network infrastructure. Analyzing data from a variety of network attacks can help infer the type of attack, and based on association rules an appropriate response can be deployed in time to effectively reduce the harm caused by network attacks. A large-scale complex network, however, requires substantial investment in servers, switches, routers, and other physical devices, and its topology is not easy to change to support flexible network structures. This paper designs and implements an experiment environment for large-scale network attack and defense, and presents metrics and indices for evaluating the effectiveness of large-scale network attacks and defenses. Experiments show that the system supports a variety of network attack experiments running at the same time, analyzes network attack data effectively, and predicts attack trends.