Yongning Tang
Illinois State University
Publication
Featured research published by Yongning Tang.
integrated network management | 2005
Yongning Tang; Ehab Al-Shaer; Raouf Boutaba
Fault localization is a core element in fault management. Many fault reasoning techniques use a deterministic or probabilistic symptom-fault causality model for fault diagnosis and localization. A symptom-fault map is commonly used to describe symptom-fault causality in fault reasoning. However, because fault reasoning systems that passively collect symptoms suffer from lost and spurious symptoms, the performance and accuracy of fault localization can be significantly degraded. In this paper, we propose an extended symptom-fault-action model that incorporates actions into the fault reasoning process to tackle this problem. The technique is called active integrated fault reasoning (AIR) and contains three modules: fault reasoning, fidelity evaluation, and action selection. The corresponding fault reasoning and action selection algorithms are elaborated. A simulation study shows that both the performance and the accuracy of fault reasoning can be greatly improved by taking actions, especially when the rate of spurious and lost symptoms is high.
IEEE Transactions on Network and Service Management | 2008
Yongning Tang; Ehab Al-Shaer; Raouf Boutaba
Fault localization is the core element in fault management. A symptom-fault map is commonly used to describe the symptom-fault causality in fault reasoning. For Internet service networks, a well-designed monitoring system can effectively correlate the observable symptoms (i.e., alarms) with the critical network faults (e.g., link failure). However, lost and spurious symptoms can significantly degrade the performance and accuracy of a passive fault localization system. For overlay networks, due to limited accessibility of the underlying network, as well as overlay scalability and dynamics, it is impractical to build a static overlay symptom-fault map. In this paper, we first propose a novel active integrated fault reasoning (AIR) framework that incrementally incorporates active investigation actions into the passive fault reasoning process based on an extended symptom-fault-action (SFA) model. Second, we propose an overlay network profile (ONP) to facilitate the dynamic creation of an overlay symptom-fault-action (called O-SFA) model, such that the AIR framework can be applied seamlessly to overlay networks (called O-AIR). The corresponding fault reasoning and action selection algorithms are elaborated. Extensive simulations and Internet experiments show that AIR and O-AIR can significantly improve both accuracy and performance in fault reasoning for Internet and overlay service networks, especially when the ratio of lost and spurious symptoms is high.
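The three AIR modules named in the abstract (fault reasoning, fidelity evaluation, action selection) can be illustrated with a minimal sketch. This is not the authors' algorithm: the symptom-fault map, the greedy set-cover reasoning, and the definition of fidelity as "fraction of expected symptoms actually observed" are all assumptions made here for illustration, and every name in the code is hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical symptom-fault map: each fault lists the symptoms it can cause.
SYMPTOM_FAULT_MAP: Dict[str, Set[str]] = {
    "link_A_down": {"s1", "s2", "s3"},
    "link_B_down": {"s3", "s4"},
    "router_C_down": {"s5"},
}


def reason(observed: Set[str], sf_map: Dict[str, Set[str]]) -> List[str]:
    """Fault reasoning (assumed greedy set cover): repeatedly pick the fault
    that explains the most still-unexplained observed symptoms."""
    unexplained = set(observed)
    hypothesis: List[str] = []
    while unexplained:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(sf_map), key=lambda f: len(sf_map[f] & unexplained))
        if not sf_map[best] & unexplained:
            break  # leftover symptoms match no fault (possibly spurious)
        hypothesis.append(best)
        unexplained -= sf_map[best]
    return hypothesis


def expected_symptoms(hypothesis: List[str], sf_map: Dict[str, Set[str]]) -> Set[str]:
    """All symptoms the hypothesized faults would be expected to produce."""
    return set().union(*(sf_map[f] for f in hypothesis))


def fidelity(hypothesis: List[str], observed: Set[str],
             sf_map: Dict[str, Set[str]]) -> float:
    """Fidelity evaluation (assumed definition): share of expected symptoms
    that were actually observed."""
    expected = expected_symptoms(hypothesis, sf_map)
    return len(expected & observed) / len(expected) if expected else 0.0


def select_actions(hypothesis: List[str], observed: Set[str],
                   sf_map: Dict[str, Set[str]]) -> Set[str]:
    """Action selection (assumed): actively verify every expected symptom
    that was not passively observed (it may have been lost)."""
    return expected_symptoms(hypothesis, sf_map) - observed


observed = {"s1", "s2", "s4"}
hyp = reason(observed, SYMPTOM_FAULT_MAP)
print(hyp)                                                # ['link_A_down', 'link_B_down']
print(fidelity(hyp, observed, SYMPTOM_FAULT_MAP))         # 0.75
print(select_actions(hyp, observed, SYMPTOM_FAULT_MAP))   # {'s3'}
```

In this toy run the hypothesis explains every observed symptom, but its fidelity is below 1 because symptom s3 was expected and not seen, so s3 becomes the target of an active probe.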
international conference on natural computation | 2011
Feng Chen; Xinxin Sun; Dali Wei; Yongning Tang
Particle Swarm Optimization (PSO) is a class of population-based stochastic search algorithms. Due to its simplicity of implementation and promising optimization capability, PSO has been successfully applied to a wide class of scientific and engineering optimization problems. However, PSO has some drawbacks, such as high computational complexity and premature convergence. Inspired by the tradeoff strategy between exploration and exploitation in reinforcement learning, we propose an improved PSO. The sigmoid function is incorporated into the velocity update equation of PSO to tackle these drawbacks. A comparison with inertia weight PSO, constriction factor PSO, and Tribe PSO on classic benchmark functions demonstrates that our approach achieves a good tradeoff between exploration and exploitation, and thus obtains better global optimization results and faster convergence.
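The abstract does not state how the sigmoid enters the velocity update, so the following is only one plausible reading, not the paper's formulation: a standard inertia-weight PSO in which the inertia weight follows a sigmoid schedule, high early (exploration) and low late (exploitation). The schedule `sigmoid_inertia`, its steepness `k`, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def sigmoid_inertia(t, t_max, w_min=0.4, w_max=0.9, k=10.0):
    """Assumed schedule: inertia falls from about w_max to about w_min
    along a sigmoid centered at the midpoint of the run."""
    return w_min + (w_max - w_min) / (1.0 + math.exp(k * (t / t_max - 0.5)))


def sphere(x):
    """Classic benchmark: f(x) = sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def pso(f, dim=2, n=20, iters=100, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Inertia-weight PSO with the sigmoid schedule above.
    Returns (best position, best value, best-value history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for t in range(iters):
        w = sigmoid_inertia(t, iters)
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


best_pos, best_val, history = pso(sphere)
print(best_pos, best_val)
```

The schedule keeps particles moving freely in early iterations and damps their velocity later, which is the exploration/exploitation tradeoff the abstract refers to.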
Journal of Network and Systems Management | 2002
Ehab Al-Shaer; Yongning Tang
New network monitoring tools are necessary for supporting the deployment and operation of multicast services on the Internet. Because of the peculiar characteristics of multicast routing (e.g., multicast forwarding trees) and the potential for message implosion problems, traditional network management tools are not sufficient for monitoring the quality of multicast delivery, such as packet loss, delay, and jitter. Current multicast monitoring tools are either not scalable, limited in their functionality, or difficult to deploy in enterprise networks. In this paper, we present a new monitoring framework (called SMRM, SNMP-based multicast reachability monitoring) for multicast reachability based on SNMP. The SMRM framework is used for actively monitoring the health and the quality of service of multicast networks. SMRM provides scalable real-time feedback on the packet loss, delay, and jitter of any selected segments of multicast delivery trees. In addition, NOC (network operations center) personnel can easily understand, deploy, and extend the SMRM framework in order to detect and isolate reachability and performance problems in multicast sessions. SMRM combines distributed monitoring and centralized control, which offers scalability and simplicity. The integration of SMRM into SNMP is motivated by the wide distribution of SNMP agents in networks today, which significantly facilitates the deployment of SMRM in existing networks.
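The three delivery-quality metrics the abstract names (packet loss, delay, jitter) can be computed from probe timestamps roughly as follows. This is a generic sketch, not SMRM's SNMP-based mechanism: the function name and input format are hypothetical, and the jitter estimator used here is the smoothed interarrival jitter from RFC 3550, swapped in as a common choice.

```python
from typing import List, Optional, Tuple


def delivery_metrics(sent: List[float],
                     received: List[Optional[float]]) -> Tuple[float, float, float]:
    """Return (loss ratio, mean one-way delay, jitter) for a probe stream.

    sent[i] is the send time of packet i; received[i] is its arrival time,
    or None if the packet was lost. Times are in seconds.
    """
    if not sent or len(sent) != len(received):
        raise ValueError("sent and received must be non-empty and equal length")
    # Transit time of every packet that actually arrived.
    transits = [r - s for s, r in zip(sent, received) if r is not None]
    loss = 1.0 - len(transits) / len(sent)
    mean_delay = sum(transits) / len(transits) if transits else float("nan")
    # RFC 3550 style smoothing: J += (|D| - J) / 16, where D is the change
    # in transit time between consecutive received packets.
    jitter = 0.0
    for prev, cur in zip(transits, transits[1:]):
        jitter += (abs(cur - prev) - jitter) / 16.0
    return loss, mean_delay, jitter


sent = [0.00, 0.02, 0.04, 0.06]
received = [0.05, 0.075, None, 0.11]  # third packet lost
loss, delay, jitter = delivery_metrics(sent, received)
print(loss)  # 0.25
print(round(delay, 4), round(jitter, 6))
```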
IEEE Transactions on Network and Service Management | 2012
Yongning Tang; Ehab Al-Shaer; Kaustubh R. Joshi
The performance and reliability of overlay services rely on the underlying overlay network's ability to effectively diagnose and recover from faults such as link failures and overlay node outages. However, overlay networks bring new challenges to fault diagnosis, such as large-scale deployment, inaccessible underlay network information, dynamic symptom-fault causality relationships, and multi-layer complexity. In this paper, we develop an evidential overlay fault diagnosis framework called DigOver to tackle these challenges. First, DigOver identifies a set of potentially faulty components based on shared end-user observed negative symptoms. Then, each potentially faulty component is evaluated to quantify its fault likelihood and the corresponding evaluation uncertainty. Finally, DigOver dynamically constructs a plausible fault graph to locate the root causes of end-user observed negative symptoms. Both simulation and Internet experiments demonstrate that DigOver can effectively and accurately diagnose overlay faults based on end-user observed negative symptoms.
integrated network management | 2007
Yongning Tang; Ehab Al-Shaer; Bin Zhang
Overlay networks have emerged as a powerful and flexible platform for developing new disruptive network applications. The performance and reliability of overlay applications depend on the capability of overlay networks to dynamically adapt to various factors such as link/node failures, overlay link quality, and overlay node characteristics. To achieve this, overlay applications require scalable and open overlay monitoring services to monitor and aggregate globally distributed events and to take appropriate control actions. In this paper, we propose techniques and algorithms to create an optimal event monitoring and aggregation infrastructure (called MOON) that minimizes the monitoring latency (i.e., event retrieval/detection time) and event aggregation cost (i.e., intrusiveness), considering the large-scale geographical and network distribution of overlay nodes. The proposed monitoring infrastructure, MOON, clusters and organizes overlay nodes efficiently such that overlay applications can globally monitor and query correlated events in an overlay network with minimum latency and monitoring cost. Our simulations and experimental studies evaluate MOON under various topological structures, network sizes, and event aggregation volumes.
network operations and management symposium | 2002
Ehab Al-Shaer; Yongning Tang
One of the main challenges of deploying multicast services in the Internet is the lack of active monitoring tools that can detect and isolate multicast reachability problems in real time. Existing multicast monitoring tools are either not scalable or use proprietary protocols, which limits their deployment in enterprise networks. This paper presents SNMP-based multicast reachability monitoring (SMRM), a framework for monitoring the health and the quality of multicast delivery paths (or forwarding trees) in real time. SMRM addresses these limitations by using SNMP as a core component, which significantly facilitates the wide deployment of SMRM in existing networks. The SMRM framework combines distributed monitoring and centralized control, which offers a scalable, easy-to-use, and easy-to-deploy multicast monitoring service.
dependable systems and networks | 2009
Yongning Tang; Ehab Al-Shaer
The dependability of overlay services relies on the overlay network's capability to effectively diagnose and recover from faults (e.g., link failures, overlay node outages). However, overlay applications bring new challenges to overlay fault diagnosis, including large-scale deployment, inaccessible underlying network information, dynamic symptom-fault causality relationships, and multi-layer complexity. In this paper, we develop an evidential overlay fault diagnosis framework (called DigOver) to tackle these challenges. First, DigOver identifies a set of potentially faulty components based on shared end-user observed negative symptoms. Then, each potentially faulty component is evaluated to quantify its fault likelihood and the corresponding evaluation uncertainty. Finally, DigOver dynamically constructs a plausible fault graph to locate the root causes of end-user observed negative symptoms.
international conference on computer communications | 2009
Yongning Tang; Ehab Al-Shaer
The attractive characteristics of overlay networks bring new challenges to overlay fault diagnosis, including inaccessible underlying network information, incomplete and inaccurate network status observations, dynamic symptom-fault causality relationships, and multi-layer complexity. To address these challenges, we propose a novel evidential-reasoning-based overlay fault diagnosis technique called ERD. First, by analyzing end-user observed network symptoms, ERD narrows down suspicious components and investigates their status (i.e., good or bad) with likelihood measurement and uncertainty evaluation using a novel evidence-driven belief function. Next, ERD adapts to changes in highly dynamic overlay networks by dynamically constructing a plausible fault diagnosis graph based on belief evaluation. Finally, ERD conducts plausible fault reasoning to locate the root causes of observed network symptoms.
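The abstract does not define ERD's evidence-driven belief function. As background on the general idea of evidential reasoning with explicit uncertainty, the sketch below applies the standard Dempster rule of combination to a component whose status is either good or bad. It is not ERD itself; the mass values and the function name are hypothetical.

```python
from typing import Dict

Mass = Dict[str, float]  # keys: "good", "bad", "unknown" (mass on {good, bad})


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule of combination on the frame {good, bad}.

    Mass assigned to "unknown" represents uncertainty: evidence that does
    not distinguish between the two states.
    """
    # Conflict: one source says good while the other says bad.
    conflict = m1["good"] * m2["bad"] + m1["bad"] * m2["good"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    combined = {}
    for state in ("good", "bad"):
        combined[state] = (m1[state] * m2[state]
                           + m1[state] * m2["unknown"]
                           + m1["unknown"] * m2[state]) / norm
    combined["unknown"] = m1["unknown"] * m2["unknown"] / norm
    return combined


# Two hypothetical end-user observations about the same overlay link.
obs1 = {"good": 0.1, "bad": 0.6, "unknown": 0.3}
obs2 = {"good": 0.2, "bad": 0.5, "unknown": 0.3}
result = combine(obs1, obs2)
print({k: round(v, 3) for k, v in result.items()})
# {'good': 0.133, 'bad': 0.759, 'unknown': 0.108}
```

Combining two partly agreeing observations raises the belief that the link is bad and shrinks the uncertainty mass, which is the behavior that makes evidential reasoning attractive when individual observations are incomplete or inaccurate.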
Journal of Internet Services and Applications | 2016
Yongning Tang; Guang Cheng; Zhiwei Xu; Feng Chen; Khalid Elmansor; Yangxuan Wu
Fault localization for SDN has become one of the most critical but difficult tasks. Existing tools typically address only a specific part of the problem (e.g., control plane verification, flow checking). In this paper, we propose a new approach to SDN fault localization that automatically models the causality between SDN faults and their symptoms as a belief network, an approach called Modeling via Policy Inference (MPI). In the MPI system, a service-oriented, high-level policy language is used to specify network services provisioned between end nodes. MPI parses each service provisioning policy into a logical policy view, which consists of a pair of logical end nodes, a traffic pattern specification, and a list of required network functions (or a service function chain). An SDN controller takes the policies from multiple parties and provisions the requested services on its orchestrated SDN network. MPI queries the controller about the network topology and retrieves flow rules from all SDN switches. MPI maps the policy view to the corresponding implementation view, in which all the logical components in the policy view are mapped to the actual system components along with the actual network topology. By referring to component causality graph templates derived from the SDN reference model, the implementation view of the currently running network services can be modeled as a belief network. A heuristic fault reasoning algorithm is adopted to search for the most likely root causes. MPI has been evaluated in both a simulation environment and a real network system for its accuracy and efficiency. The evaluation shows that MPI is a highly scalable, effective, and flexible modeling approach to fault localization challenges in a highly dynamic and agile SDN network.