Sujing Wang
Lamar University
Publication
Featured research published by Sujing Wang.
Knowledge Discovery and Data Mining | 2013
Zechun Cao; Sujing Wang; Germain Forestier; Anne Puissant; Christoph F. Eick
Cities all around the world are in constant evolution due to numerous factors, such as fast urbanization and new ways of communication and transportation. Since understanding the composition of cities is the key to intelligent urbanization, there is a growing need to develop urban computing and analysis tools to guide the orderly development of cities, as well as to enhance their smooth and beneficial evolution. This paper presents a spatial clustering approach to discover interesting regions and regions which serve different functions in cities. Spatial clustering groups the objects in a spatial dataset and identifies contiguous regions in the space of the spatial attributes. We formally define the task of finding uniform regions in spatial data as a maximization problem of a plug-in measure of uniformity and introduce a prototype-based clustering algorithm named CLEVER to find such regions. Moreover, polygon models which capture the scope of a spatial cluster and histogram-style distribution signatures are used to annotate the content of a spatial cluster in the proposed methodology; they play a key role in summarizing the composition of a spatial dataset. Furthermore, algorithms for identifying popular distribution signatures and approaches for identifying regions which express a particular distribution signature are presented. The proposed methodology is demonstrated and evaluated in a challenging real-world case study centering on analyzing the composition of the city of Strasbourg in France.
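As a rough illustration of what a histogram-style distribution signature could look like, the sketch below summarizes the objects inside one spatial cluster as category proportions and compares two signatures with an L1 distance. The function names, the building categories, and the choice of L1 distance are assumptions made for this example; the abstract does not specify how the paper computes or compares signatures.

```python
from collections import Counter


def distribution_signature(labels, categories):
    """Return the share of each category among the objects of one cluster.

    `labels` holds one category label per object in the cluster;
    `categories` fixes the histogram bins (hypothetical building types here).
    """
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {c: 0.0 for c in categories}
    return {c: counts[c] / total for c in categories}


def signature_distance(sig_a, sig_b):
    """L1 distance between two signatures defined over the same categories."""
    return sum(abs(sig_a[c] - sig_b[c]) for c in sig_a)


# Hypothetical clusters of buildings labeled by type.
categories = ["house", "school", "shop", "church"]
cluster_a = ["house", "house", "school", "shop"]
cluster_b = ["shop", "shop", "shop", "house"]

sig_a = distribution_signature(cluster_a, categories)
sig_b = distribution_signature(cluster_b, categories)
gap = signature_distance(sig_a, sig_b)
```

Clusters whose signatures are close under such a distance could then be treated as regions serving a similar function.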
Computers & Chemical Engineering | 2013
Tianxing Cai; Sujing Wang; Qiang Xu
Regions with a high concentration of chemical plants may suffer localized and transient air pollution events that violate national ambient air quality standards (NAAQS). Flaring emissions, especially intensive start-up flaring emissions from chemical plants, have potentially significant impacts on local air quality. Thus, when multiple plants in an industrial zone plan to start up within the same time period, their start-up plans should be evaluated and optimally controlled so as to avoid unexpected air-quality violations in any air-quality concern regions (AQCRs). In this paper, a general systematic methodology for multi-plant start-up emission evaluation and control has been developed. The methodology starts with collecting regional meteorological information such as wind speed and temperature; geographical information of all of the involved chemical plants and AQCRs; as well as plant operation data such as the start-up time window, start-up duration, and estimated emission profile. Next, a regional air-quality evaluation based on a Gaussian dispersion model is conducted. If any air-quality violation is predicted for an AQCR, a multi-objective scheduling problem is generated and solved to optimize the start-up sequence and start-up beginning time for all chemical plants. The scheduling model minimizes the overall air-quality impacts on all of the AQCRs and also minimizes the total start-up time mismatch of all plants, subject to the principles of atmospheric pollutant dispersion. This study may provide valuable quantitative decision support for multiple stakeholders, including government environmental agencies, regional chemical plants, and local communities.
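For readers unfamiliar with Gaussian dispersion models, the following minimal sketch evaluates the textbook steady-state Gaussian plume equation for a continuous point source with ground reflection. It is not taken from the paper: the function name and parameters are assumptions, and the dispersion coefficients `sigma_y` and `sigma_z` are passed in directly rather than derived from downwind distance and atmospheric stability, which a real evaluation would have to do.

```python
import math


def gaussian_plume(q, u, sigma_y, sigma_z, y, z, h):
    """Concentration from a continuous point source (textbook plume equation).

    q        emission rate (e.g. g/s)
    u        wind speed along the plume axis (m/s)
    sigma_y  lateral dispersion coefficient at the receptor's downwind distance (m)
    sigma_z  vertical dispersion coefficient at the same distance (m)
    y        crosswind offset of the receptor from the plume centerline (m)
    z        receptor height above ground (m)
    h        effective release height (m)
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    lateral = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # The second exponential is the image source that models ground reflection.
    vertical = (math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


# Hypothetical ground-level receptor 50 m off the centerline of a 30 m release.
c = gaussian_plume(q=100.0, u=4.0, sigma_y=80.0, sigma_z=40.0, y=50.0, z=0.0, h=30.0)
```

Summing such contributions over all plants that are starting up at a given time gives a predicted concentration at each receptor, which can then be compared against a threshold.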
Geoinformatica | 2014
Sujing Wang; Christoph F. Eick
Polygons provide natural representations for many types of geospatial objects, such as countries, buildings, and pollution hotspots. Thus, polygon-based data mining techniques are particularly useful for mining geospatial datasets. In this paper, we propose a polygon-based clustering and analysis framework for mining multiple geospatial datasets that have inherently hidden relations. In this framework, polygons are first generated from multiple geospatial point datasets by using a density-based contouring algorithm called DCONTOUR. Next, a density-based clustering algorithm called Poly-SNN with novel dissimilarity functions is employed to cluster polygons to create meta-clusters of polygons. Finally, post-processing analysis techniques are proposed to extract interesting patterns and user-guided summarized knowledge from meta-clusters. These techniques employ plug-in reward functions that capture a domain expert’s notion of interestingness to guide the extraction of knowledge from meta-clusters. The effectiveness of our framework is tested in a real-world case study involving ozone pollution events in Texas. The experimental results show that our framework can reveal interesting relationships between different ozone hotspots represented by polygons; it can also identify interesting hidden relations between ozone hotspots and several meteorological variables, such as outdoor temperature, solar radiation, and wind speed.
Proceedings of the 1st ACM SIGSPATIAL International Workshop on Data Mining for Geoinformatics | 2010
Sujing Wang; Chun-Sheng Chen; Vadeerat Rinsurongkawong; Fatih Akdag; Christoph F. Eick
Polygons can serve an important role in the analysis of geo-referenced data: they provide a natural representation for particular types of spatial objects, and they can be used as models for spatial clusters. This paper claims that polygon analysis is particularly useful for mining related spatial datasets. A novel methodology for clustering polygons that have been extracted from different spatial datasets is proposed; it consists of a meta-clustering module that clusters polygons and a summary generation module that creates a final clustering from a polygonal meta-clustering based on user preferences. Moreover, a density-based polygon clustering algorithm is introduced. Our methodology is evaluated in a real-world case study involving ozone pollution in Texas; it was able to reveal interesting relationships between different ozone hotspots and interesting associations between ozone hotspots and other meteorological variables.
International Conference on Data Mining | 2013
Sujing Wang; Tianxing Cai; Christoph F. Eick
Spatiotemporal clustering is a process of grouping a set of objects based on their spatial and temporal similarities. In this paper we propose two new spatiotemporal clustering algorithms, called Spatiotemporal Shared Nearest Neighbor clustering algorithm (ST-SNN) and Spatiotemporal Separated Shared Nearest Neighbor clustering algorithm (ST-SEP-SNN), to cluster overlapping polygons that can change their locations, sizes, and shapes over time. Both ST-SNN and ST-SEP-SNN are based on the well-established generic density-based clustering algorithm Shared Nearest Neighbor (SNN), which can find clusters of different sizes, shapes, and densities in high-dimensional data. New similarity functions are also proposed for computing spatiotemporal similarities between spatiotemporal polygons. We evaluate and demonstrate the effectiveness of our approaches in a case study involving ozone pollution events in the Houston-Galveston-Brazoria (HGB) area. The experimental results show that both ST-SNN and ST-SEP-SNN can find interesting spatiotemporal patterns in ozone pollution data.
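The core idea of generic SNN can be sketched in a few lines: two objects are similar to the extent that their k-nearest-neighbor lists overlap. The example below works on a plain precomputed distance matrix and counts shared neighbors only when the two objects appear in each other's lists, a common convention for SNN. It does not implement the spatiotemporal polygon similarity functions proposed in the paper, and all names are assumptions.

```python
def knn_lists(dist, k):
    """For each object, return the set of indices of its k nearest neighbors."""
    n = len(dist)
    neighbors = []
    for i in range(n):
        # Sort the other objects by distance; break ties by index for determinism.
        order = sorted((j for j in range(n) if j != i), key=lambda j: (dist[i][j], j))
        neighbors.append(set(order[:k]))
    return neighbors


def snn_similarity(dist, k):
    """Shared-nearest-neighbor similarity matrix.

    sim[i][j] is the number of neighbors i and j share, counted only when
    i and j are in each other's k-nearest-neighbor lists; otherwise it is 0.
    """
    nn = knn_lists(dist, k)
    n = len(dist)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j in nn[i] and i in nn[j]:
                sim[i][j] = sim[j][i] = len(nn[i] & nn[j])
    return sim


# Toy data: two well-separated groups of points on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
sim = snn_similarity(dist, k=2)
```

In the full algorithm, a density is then derived from these similarities and core objects are grouped into clusters; that part is omitted here.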
Computers & Chemical Engineering | 2016
Honglin Qu; Sujing Wang; Qiang Xu
Multi-recipe and multi-stage material handling (M3H) processes are widely employed by various industries where complex multi-product manufacturing and assembly tasks are required. Cyclic hoist scheduling (CHS) could significantly improve the productivity of an M3H process. In this paper, a novel CHS methodology has been developed, which employs multiple sub-cycles to efficiently deal with job duality issues associated with multi-capacity processing units. The methodology contains three modeling and solving stages. In Stage I, a mixed-integer linear programming (MILP) model is developed to obtain the optimal solution of a sub-cycle CHS problem with the tolerance of job duality. In Stage II, the obtained solution is examined to see whether job duality exists. Once a job duality issue is identified, another MILP model is solved in Stage III to schedule an additional sub-cycle CHS problem, which targets the minimum slot usage discrepancy caused by the identified job duality. After that, the combined CHS solutions from the previous and additional sub-cycles are fed back to Stage II for another job duality examination. By iteratively checking and rescheduling between Stages II and III, job duality issues are eventually eliminated and a full scheduling cycle coupling multiple sub-cycles is accomplished. The methodology can significantly increase the CHS optimality for M3H processes, as demonstrated by two case studies.
Computers & Chemical Engineering | 2017
Jialin Xu; Shujing Zhang; Jian Zhang; Sujing Wang; Qiang Xu
Scheduling of front-end crude-oil transfer and refinery processing are two critically important and challenging tasks for petroleum refineries. However, the simultaneous scheduling of front-end crude-oil transfer and refinery operations has never been considered in previous studies due to the large scale and complexity of the resultant optimization problem. In this paper, a systematic methodology for simultaneous scheduling of front-end crude transfer and refinery processing has been developed. It provides a large-scale continuous-time scheduling model for crude unloading, transferring, and processing (CUTP) to simulate and optimize the front-end and refinery crude-oil operations simultaneously. The CUTP model consists of a newly developed refinery processing sub-model, a crude processing status transition sub-model, and a borrowed front-end crude transferring sub-model. The objective is to maximize the total operational profit while satisfying various constraints such as operation and production specifications, inventory limits, and production demands. The efficacy of the proposed scheduling model has been demonstrated by an industrial-scale case study.
Computer-Aided Chemical Engineering | 2014
Tianxing Cai; Sujing Wang; Qiang Xu
Geographic allocation of chemical plants significantly affects industrial business sustainability as well as regional environmental sustainability. According to site selection rules, the air-quality impact of a newly constructed chemical plant on surrounding communities must be taken into account. To address this issue, regional background air-quality information, new plant emissions, and local statistical meteorological conditions have to be considered simultaneously. Based on that, the potential air-quality impacts from candidate sites of a new chemical plant can be thoroughly evaluated and the final site determination can be optimized to minimize air-quality impacts based on the likelihood of local meteorological conditions. In this paper, a systematic methodology for this purpose has been developed. It includes the modeling and optimization work to apply Monte Carlo optimization for optimal site selection of new chemical plants with their given emission data. This study can not only determine the potential impact of the distribution of new chemical plants with respect to regional statistical meteorological conditions, but also identify an optimal site for each new chemical plant with minimal environmental impact on surrounding communities. Case studies are employed to demonstrate the efficacy of the developed methodology.
Journal of Environmental Management | 2015
Tianxing Cai; Sujing Wang; Qiang Xu
The geographic distribution of chemical manufacturing sites has a significant impact on both the business sustainability of industrial development and regional environmental sustainability. Common site selection rules include evaluating the air-quality impact of a newly constructed chemical manufacturing site on surrounding communities. To this end, the regional background air-quality information, the emissions of the new manufacturing site, and the statistical pattern of local meteorological conditions should be considered simultaneously. Based on this information, a risk assessment can be conducted for the potential air-quality impacts from candidate locations of a new chemical manufacturing site, and the final site selection can be optimized by minimizing its air-quality impacts. This paper provides a systematic methodology for this purpose. The modeling and optimization work has two stages: i) Monte Carlo simulation to identify the background pollutant concentration based on existing emission sources and regional statistical meteorological conditions; and ii) multi-objective Monte Carlo optimization (simultaneous minimization of both the peak pollutant concentration and the standard deviation of the spatial distribution of pollutant concentration at air-quality concern regions) for optimal location selection of new chemical manufacturing sites according to their design data of potential emissions. This study can help both to determine the potential air-quality impact of the geographic distribution of multiple chemical plants with respect to regional statistical meteorological conditions and to identify an optimal site for each new chemical manufacturing site with minimal environmental impact on surrounding communities. The efficacy of the developed methodology has been demonstrated through case studies.
International Symposium on Methodologies for Intelligent Systems | 2017
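The two objectives named above (peak concentration and spatial standard deviation at the concern regions) can be illustrated with a toy Monte Carlo comparison of candidate sites. Everything below is an assumption made for illustration: the impact function is a made-up stand-in for a dispersion model, the wind distributions are uniform placeholders rather than regional statistics, and the two objectives are combined with a simple weighted sum, which is only one of several ways to handle multiple objectives and is not necessarily what the paper does.

```python
import math
import random
import statistics


def toy_concentration(site, receptor, wind_dir, wind_speed):
    """Hypothetical impact model, not a real dispersion model.

    The impact decays with distance, with wind speed, and with the angular
    offset between the site-to-receptor bearing and the direction the wind
    blows toward (radians).
    """
    dx, dy = receptor[0] - site[0], receptor[1] - site[1]
    bearing = math.atan2(dy, dx)
    # Wrap the angular offset into [-pi, pi].
    offset = math.atan2(math.sin(bearing - wind_dir), math.cos(bearing - wind_dir))
    return math.exp(-offset ** 2 / 0.5) / (wind_speed * (1.0 + math.hypot(dx, dy)))


def evaluate_site(site, receptors, n_samples, rng):
    """Average the peak and the spatial standard deviation over sampled weather."""
    peaks, spreads = [], []
    for _ in range(n_samples):
        wind_dir = rng.uniform(-math.pi, math.pi)   # placeholder distribution
        wind_speed = rng.uniform(1.0, 10.0)         # placeholder distribution
        conc = [toy_concentration(site, r, wind_dir, wind_speed) for r in receptors]
        peaks.append(max(conc))
        spreads.append(statistics.pstdev(conc))
    return statistics.fmean(peaks), statistics.fmean(spreads)


def pick_site(candidates, receptors, n_samples=2000, weight=0.5, seed=0):
    """Return the candidate with the lowest weighted score, plus all scores."""
    scores = {}
    for site in candidates:
        rng = random.Random(seed)  # every candidate sees the same weather samples
        peak, spread = evaluate_site(site, receptors, n_samples, rng)
        scores[site] = weight * peak + (1.0 - weight) * spread
    return min(scores, key=scores.get), scores


# Hypothetical concern regions and candidate sites on a planar grid.
receptors = [(0.0, 0.0), (1.0, 0.0)]
candidates = [(0.5, 0.5), (50.0, 50.0)]
best, scores = pick_site(candidates, receptors, n_samples=500)
```

Reusing the same seed for every candidate keeps the comparison between sites from being dominated by sampling noise.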
Yongli Zhang; Sujing Wang; Amar Mani Aryal; Christoph F. Eick
Spatio-temporal clustering, which is a process of grouping objects based on their spatial and temporal similarity, is gaining increasing scientific attention. Research in spatio-temporal clustering mainly focuses on approaches that use time and space in parallel. In this paper, we introduce a serial spatio-temporal clustering algorithm, called ST-DPOLY, which creates spatial clusters first and then creates spatio-temporal clusters by identifying continuing relationships between the spatial clusters in consecutive time frames. We compare this serial approach with a parallel approach named ST-SNN. Both ST-DPOLY and ST-SNN are density-based clustering approaches: while ST-DPOLY employs a density-contour based approach that operates on an actual density function, ST-SNN is based on the well-established generic clustering algorithm Shared Nearest Neighbor (SNN). We demonstrate the effectiveness of these two approaches in a case study involving a New York City taxi trip dataset. The experimental results show that both ST-DPOLY and ST-SNN can find interesting spatio-temporal patterns in the dataset. Moreover, ST-DPOLY has advantages over ST-SNN in terms of time and space complexity, while ST-SNN is superior in terms of temporal flexibility. In terms of clustering results, those of ST-DPOLY are easier to interpret, while ST-SNN obtains more clusters that overlap with each other either spatially or temporally, which makes interpreting its clustering results more complicated.
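The serial idea described above (cluster each time frame spatially, then connect clusters in consecutive frames) can be sketched as follows. This is not ST-DPOLY itself: spatial clusters are represented as sets of grid cells rather than density-contour polygons, the "continuing relationship" is approximated by a Jaccard overlap threshold, and the function names and the threshold value are assumptions.

```python
def jaccard(a, b):
    """Overlap of two cell sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_frames(frames, threshold=0.3):
    """Group per-frame spatial clusters into spatio-temporal clusters.

    `frames` is a list over time; each entry is a list of spatial clusters,
    each given as a set of grid cells. Clusters in consecutive frames are
    linked when their overlap reaches `threshold`. Returns a sorted list of
    groups, each a sorted list of (frame_index, cluster_index) pairs.
    """
    parent = {(t, i): (t, i) for t, clusters in enumerate(frames)
              for i in range(len(clusters))}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for t in range(len(frames) - 1):
        for i, a in enumerate(frames[t]):
            for j, b in enumerate(frames[t + 1]):
                if jaccard(a, b) >= threshold:
                    parent[find((t, i))] = find((t + 1, j))

    groups = {}
    for node in parent:
        groups.setdefault(find(node), []).append(node)
    return sorted(sorted(g) for g in groups.values())


# Three hypothetical time frames; one cluster drifts, two appear only once.
frames = [
    [{1, 2, 3}, {10, 11}],
    [{2, 3, 4}],
    [{3, 4, 5}, {20}],
]
tracks = link_frames(frames)
```

Because linking only looks at adjacent frames, each resulting group reads as the history of one region over time, which matches the interpretability argument made for the serial approach.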