Derong Shen
Northeastern University
Publication
Featured research published by Derong Shen.
web age information management | 2004
Derong Shen; Ge Yu; Tiezheng Nie; Rui Li; Xiaochun Yang
Quality of Service (QoS) is an important factor for selecting a better Web service from numerous semantically equivalent Web services in Web services composition. In this paper, a QoS model (e_QoS) for evaluating semantically equivalent Web services is proposed, in which the factors used to evaluate Web services include Accessed Times, Invoked Success Rate, Average Responding Time, Stability Variance, and Customers Estimation. The performance of Web services in the Latest Period of Time, the performance in history, and the performance estimated by customers are considered together. Experiments show that e_QoS is effective for estimating Web services and can greatly improve the quality of Web services composition.
International Journal of Computer Integrated Manufacturing | 2007
Derong Shen; Ge Yu; Yue Kou; Tiezheng Nie; Zhibin Zhao
Web-service composition provides more powerful service functions by integrating existing Web services on the Internet, and collaborative working and resource sharing among enterprises on the Internet have become a popular trend by means of Web-service composition technologies. Network manufacturing plays an important role in the manufacturing domain and can be realized based on Web-service composition. However, heterogeneity exists widely in complicated Web-service composition, except for manufacturing resources, and this must be resolved. In this paper, we focus on the heterogeneity existing in the execution of composite Web services in network manufacturing, in which the heterogeneity is classified into the heterogeneity between semantically equivalent Web services and the heterogeneity between subsequent Web services in a composite service. We summarize five types of conflicts, namely Semantic Conflict, Parameter Data Type Conflict, Parameter Structure Conflict, Parameter Number Conflict, and Parameter Data Unit Conflict. For these five types of conflicts, an ontology-based approach for resolving them is proposed, including the definition of domain business process ontology knowledge, domain and industry ontology knowledge, and meta-ontology knowledge about transformation rules. Finally, a case is given, and the implementation technologies are introduced. The ontology-based heterogeneity resolution policy should be established as middleware and applied to any domain to make Web-service composition more available.
Journal of Computer Science and Technology | 2013
Dan Yang; Derong Shen; Ge Yu; Yue Kou; Tiezheng Nie
Keyword query has attracted much research attention due to its simplicity and wide applications. The inherent ambiguity of keyword queries often leads to unsatisfactory query results. Moreover, some existing techniques for Web queries and for keyword queries in relational databases and XML databases cannot be completely applied to keyword queries in dataspaces. We therefore propose KeymanticES, a novel keyword-based semantic entity search mechanism in dataspaces that combines both keyword query and semantic query features. We focus on the query intent disambiguation problem and propose a novel three-step approach to resolve it. Extensive experimental results show the effectiveness and correctness of our proposed approach.
workshop on information security applications | 2011
Tiezheng Nie; Derong Shen; Yue Kou; Ge Yu; Dejun Yue
This paper proposes a relation extraction model based on semantic pattern matching in the Web environment. It consists of frequent pattern extraction, pattern clustering based on density, and pattern matching based on semantic similarity. First, based on the entities with known relations in a limited training set, we extract relation patterns containing these named entities from web pages. Then the relations between entities from web pages in specific areas can be extracted based on these relation patterns. Experiments show the effectiveness and self-adaptability of our method in extracting relations between entities from a dynamic Web environment.
international congress on big data | 2013
Xite Wang; Derong Shen; Ge Yu; Tiezheng Nie; Yue Kou
MapReduce has been proven to be a highly desirable platform for scalable parallel data analysis. Task scheduling in MapReduce is crucial for job execution and has a marked impact on system performance. To the best of our knowledge, previous scheduling algorithms rarely consider job-intensive environments and are not able to provide high system throughput. Hence this paper proposes a novel technique for job-intensive scheduling to improve system throughput. First, through an in-depth analysis of job-intensive environments, we identify four major factors that affect system throughput. Second, based on these factors, an efficient technique called the throughput-driven task scheduler is proposed, in which we adopt a series of effective measures to improve the throughput of a MapReduce cluster system. Finally, extensive simulation experiments are conducted, and the experimental results show that the scheduler can provide higher throughput than previous systems and is able to meet the requirements of practical job-intensive applications.
international conference for young computer scientists | 2008
Yubin Bao; Jie Song; Daling Wang; Derong Shen; Ge Yu
As access control models are widely used in systems, a more agile access control model is required to solve complicated modeling, user authorization, and verification problems. In this paper, an access control model based on the concepts of role, attribute, and context, named C-RBAC, is proposed. This model builds on and extends role-based access control (RBAC). The proposed model adds system conditions to access control, distinguishes users that belong to one role by user attributes, provides an agile and dynamic role model by adopting the concept of conditional role, and designs a more flexible access authorization mechanism to reinforce the role model of RBAC. The implementation and the UML-modeling approaches of the proposed model are also explained in this paper. Theoretical analysis and experiments show that the new access control model is more effective than the traditional RBAC model.
ieee international conference on dependable, autonomic and secure computing | 2014
Dejun Yue; Ge Yu; Derong Shen; Xiaocong Yu
Many challenging problems could be better solved by exploiting crowdsourcing platforms than by traditional machine-based methods. However, data quality in crowdsourcing applications has become a crucial aspect since crowdsourcing workers may have different capabilities. In this paper, we propose a novel weighted aggregation rule (WAR) to improve the result accuracy in crowdsourcing systems. According to the agreement of answers given by the workers, we classify all the tasks into high-agreement tasks and low-agreement tasks. For the high-agreement tasks, we use simple majority voting to select the correct answer while ensuring the result accuracy. For the low-agreement tasks, we adopt a weighted majority voting strategy, which assigns a weight to each worker according to the worker's performance on the high-agreement tasks. We evaluate the effectiveness of our proposed method using three real-world datasets on AMT. The experimental results show that our method achieves excellent result accuracy.
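The two-stage voting idea described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's WAR algorithm: the agreement measure (share of the most common answer), the `agreement_threshold` parameter, the weight formula (a worker's accuracy on high-agreement tasks), and the default weight of 0.5 for unseen workers are all assumptions made for illustration, and the function name `aggregate` is hypothetical.

```python
from collections import Counter, defaultdict


def aggregate(answers, agreement_threshold=0.8):
    """Aggregate crowd answers in two passes.

    answers: {task_id: {worker_id: answer}}
    Returns {task_id: chosen_answer}.
    The threshold and the weighting scheme are illustrative assumptions.
    """
    results = {}
    low_agreement = []
    correct = defaultdict(int)  # matches with the majority on high-agreement tasks
    seen = defaultdict(int)     # high-agreement tasks each worker answered

    # Pass 1: simple majority voting on high-agreement tasks.
    for task, votes in answers.items():
        top_answer, top_count = Counter(votes.values()).most_common(1)[0]
        if top_count / len(votes) >= agreement_threshold:
            results[task] = top_answer
            for worker, answer in votes.items():
                seen[worker] += 1
                correct[worker] += int(answer == top_answer)
        else:
            low_agreement.append(task)

    # Pass 2: weighted majority voting on low-agreement tasks.
    for task in low_agreement:
        scores = defaultdict(float)
        for worker, answer in answers[task].items():
            # Assumed weight: accuracy on high-agreement tasks,
            # or a neutral 0.5 if the worker has no history.
            weight = correct[worker] / seen[worker] if seen[worker] else 0.5
            scores[answer] += weight
        results[task] = max(scores, key=scores.get)

    return results


if __name__ == "__main__":
    demo = {
        "t1": {"w1": "A", "w2": "A", "w3": "A", "w4": "B"},
        "t2": {"w1": "C", "w2": "C", "w3": "C", "w4": "D"},
        "t3": {"w1": "P", "w2": "P", "w3": "Q", "w4": "Q", "w5": "Q"},
    }
    print(aggregate(demo, agreement_threshold=0.7))
```

In the demo data, a plain majority vote on `t3` would pick `Q`, but worker `w4` disagreed with the majority on both high-agreement tasks and therefore contributes no weight, so the weighted vote favours `P`.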
Frontiers of Computer Science in China | 2015
Xite Wang; Derong Shen; Mei Bai; Tiezheng Nie; Yue Kou; Ge Yu
MapReduce is a popular parallel data-processing system, and task scheduling is one of the kernel techniques in MapReduce. In many applications, users have requirements that their MapReduce jobs should be completed before specific deadlines. Hence, in this paper, a novel scheduling algorithm based on the most effective sequence (SAMES) is proposed for deadline-constraint jobs in MapReduce. First, according to the characteristics of MapReduce, we propose a novel sequence-based execution strategy for MapReduce jobs and a new concept, the effective sequence (ES). Then, we design some efficient approaches for finding ESes and choose the most effective sequence (MES) for job execution. We also propose methods for MES-updates and exception handling. Finally, we verify the effectiveness of SAMES through experiments. The experimental results show that SAMES is an efficient scheduling algorithm for deadline-constraint jobs in MapReduce.
database systems for advanced applications | 2013
Mingdong Zhu; Derong Shen; Ge Yu; Yue Kou; Tiezheng Nie
The explosive growth of data is bringing more and more challenges and opportunities to data mining. In data mining, learning decision trees is a common method, in which determining split points is the key problem. Existing methods of calculating split points in the distributed setting on large data either (1) cause high communication overhead or (2) are not universal for different levels of skewness of data distribution. In this paper, we study the properties of Gini impurity, which is a measure for determining split points, and design new algorithms for calculating split points in MapReduce. Empirical evaluation demonstrates that our method outperforms existing state-of-the-art techniques on communication cost and universality.
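As background for the measure named in this abstract, the following Python sketch computes Gini impurity and picks a split point on one numeric attribute by exhaustive search on a single machine. It is the textbook procedure, not the paper's MapReduce algorithm, and the function names `gini` and `best_split` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best binary split.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values. Returns (None, inf) if no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, weighted)
    return best


if __name__ == "__main__":
    print(gini(["a", "a", "b", "b"]))
    print(best_split([1, 2, 3, 4], ["a", "a", "b", "b"]))
```

This exhaustive version rebuilds the left and right label lists for every candidate, which is quadratic in the number of records; it is meant only to make the measure concrete, not to reflect the communication-efficient design the abstract describes.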
web age information management | 2008
Tiezheng Nie; Derong Shen; Ge Yu; Yue Kou
To access large-scale data sources efficiently and automatically, it is necessary to classify these data sources into different domains and categories. In this paper, we propose a novel classification approach to classify data sources into detailed domain subjects by query probing. In our approach, we train sample instances for each subject category and use them to probe the data scale of each source and category. We then build a matrix to classify a data source into one or more subject categories and develop a decision algorithm based on probing iteration to rectify the classification result. Our experiments over real deep web sources show that our approach can achieve higher accuracy across a variety of data sources.