Jiansong Zhang

University of Illinois at Urbana–Champaign

Publication

Featured research published by Jiansong Zhang.

Journal of Computing in Civil Engineering | 2016

Semantic NLP-Based Information Extraction from Construction Regulatory Documents for Automated Compliance Checking

Jiansong Zhang; Nora El-Gohary

Automated regulatory compliance checking requires automated extraction of requirements from regulatory textual documents and their formalization in a computer-processable rule representation. Such information extraction (IE) is a challenging task that requires complex analysis and processing of text. Natural language processing (NLP) aims to enable computers to process natural language text in a human-like manner. This paper proposes a semantic, rule-based NLP approach for automated IE from construction regulatory documents. The proposed approach uses a set of pattern-matching-based IE rules and conflict resolution (CR) rules in IE. A variety of syntactic (syntax/grammar-related) and semantic (meaning/context-related) text features are used in the patterns of the IE and CR rules. Phrase structure grammar (PSG)-based phrasal tags and separation and sequencing of semantic information elements are proposed and used to reduce the number of needed patterns. An ontology is used to aid in the recognition...

Computing in Civil Engineering | 2012

Extraction of Construction Regulatory Requirements from Textual Documents Using Natural Language Processing Techniques

Jiansong Zhang; Nora El-Gohary

Automated regulatory compliance checking requires automated information extraction (IE) from regulatory textual documents (e.g. building codes). Automated IE is a challenging task that requires complex processing of text. Natural Language Processing (NLP) aims at enabling computers to process natural language text in a human-like manner using a variety of text processing techniques, such as phrase-structure parsing, dependency parsing, etc. This paper proposes a hybrid syntactic (syntax/grammar-related) and semantic (meaning/context-related) NLP approach for automated IE from construction regulatory documents, and explores the use of two techniques (phrase-structure grammar and dependency grammar) for extracting information from complex sentences. IE rules were developed based on Chapter 12 of the 2006 International Building Code; and the approach was tested on Chapter 12 of the 2009 International Fire Code. Initial experimental results are presented, empirically evaluated in terms of precision and recall, and discussed.

Journal of Computing in Civil Engineering | 2015

Automated Information Transformation for Automated Regulatory Compliance Checking in Construction

Jiansong Zhang; Nora El-Gohary

To fully automate regulatory compliance checking of construction projects, regulatory requirements need to be automatically extracted from various construction regulatory documents and then transformed into a formalized format that enables automated reasoning. To address this need, the authors propose an approach for automatically extracting information from construction regulatory textual documents and transforming them into logic clauses that could be directly used for automated reasoning. This paper focuses on presenting the proposed information transformation (ITr) methodology and the corresponding algorithms. The proposed ITr methodology utilizes a rule-based, semantic natural language processing (NLP) approach. A set of semantic mapping (SeM) rules and conflict resolution (CoR) rules are used to enable the automation of the transformation process. Several syntactic text features (captured using NLP techniques) and semantic text features (captured using an ontology) are used in the SeM and Co...

2013 ASCE International Workshop on Computing in Civil Engineering, IWCCE 2013 | 2013

Information transformation and automated reasoning for automated compliance checking in construction

Jiansong Zhang; Nora El-Gohary

This paper presents a new approach for automated compliance checking in the construction domain. The approach utilizes semantic modeling, semantic Natural Language Processing (NLP) techniques (including text classification and information extraction), and logic reasoning to facilitate automated textual regulatory document analysis and processing for extracting requirements from these documents and formalizing these requirements in a computer-processable format. The approach involves developing a set of algorithms and combining them into one computational platform: 1) semantic machine-learning-based algorithms for text classification (TC), 2) hybrid syntactic-semantic rule-based algorithms for information extraction (IE), 3) semantic rule-based algorithms for information transformation (ITr), and 4) logic-based algorithms for compliance reasoning (CR). This paper focuses on presenting our algorithms for ITr. A semantic logic-based representation for construction regulatory requirements is described. Semantic mapping rules and conflict resolution rules for transforming the extracted information into the representation are discussed. Our combined TC, IE and ITr algorithms were tested in extracting and formalizing quantitative requirements in the 2006 International Building Code, achieving 96% and 92% precision and recall, respectively.

Construction Research Congress 2012: Construction Challenges in a Flat World | 2012

Automated regulatory information extraction from building codes leveraging syntactic and semantic information

Jiansong Zhang; Nora El-Gohary

Manual regulatory compliance checking of construction projects is usually time-consuming and error-prone. There have been efforts both in academia and industry to automate this process. However, none of them achieved full automation. Specifically, the extraction of rules from regulatory text (e.g. building codes) and their representation in a computer-processable format is still conducted manually or semi-automatically. Natural language processing (NLP) aims at enabling computers to process natural language text in a human-like manner. It provides basic concepts and methods for text processing and analysis, such as part-of-speech (POS) tagging, tokenization, sentence splitting, named entity recognition, and semantic role labeling. This paper explores the effectiveness of utilizing syntactic (i.e. grammatical) and semantic (i.e. meaning-descriptive) features of the text (using NLP tools and techniques) to automatically extract regulatory information from building codes. An automated information extraction (IE) approach, involving the use of IE rules, is proposed. Chapter 12 of the 2006 International Building Code was used to develop the IE rules, while Chapter 12 of the 2009 International Fire Code was used to test the approach. An overall F-measure of 0.94 shows the potential of the proposed approach. Based on the experimental results and their analysis, we conclude the paper by pinpointing possible ways of improving the proposed approach.

Journal of Computing in Civil Engineering | 2017

Semantic-Based Logic Representation and Reasoning for Automated Regulatory Compliance Checking

Jiansong Zhang; Nora El-Gohary

Existing automated compliance checking (ACC) efforts are limited in their automation and reasoning capabilities; the state of the art in ACC still uses ad hoc reasoning schema/methods, with lack of support for complete automation in ACC reasoning. First-order logic (FOL) representation and reasoning can provide a generalized reasoning method to facilitate complete automation in ACC reasoning. This paper presents a new FOL-based information representation and compliance reasoning (IRep and CR) schema for representing and reasoning about regulatory information and design information for checking regulatory compliance of building designs. The schema formalizes the representation of regulatory information and design information in the form of semantic-based (ontology-based) logic clauses that could be directly used for automated compliance reasoning. Two alternative subschemas, following a closed-world assumption and an open-world assumption for noncompliance detection, respectively, were proposed and...

2014 Construction Research Congress: Construction in a Global Network, CRC 2014 | 2014

Automated Reasoning for Regulatory Compliance Checking in the Construction Domain

Jiansong Zhang; Nora El-Gohary

Automating the process of compliance checking is expected to reduce the time and cost of the process, as well as reduce the probability of making compliance assessment errors. Automated reasoning is essential for enabling the automation of compliance checking. Among the different types of formally defined logic, which have varying degrees of descriptive capability, first-order logic (FOL) is the most widely used for logical inference-making. In this paper, we present our FOL-based representation method for supporting automated regulatory compliance checking in construction. The expressivity of FOL is leveraged to describe various concepts and their relations in construction regulations. Prolog is the most widely used logic programming language and reasoner. We used B-Prolog (an implementation of Prolog) for implementing our proposed method. We tested the method on representing and reasoning about quantitative requirements in Chapter 12 of the 2006 edition of the International Building Code. We developed 109 instances of project information as the test set. We tested the performance of our proposed method in detecting non-compliance instances. Using automatically extracted and transformed regulatory information (represented in the form of logic clauses), we achieved 0.929 and 0.981 for precision and recall in detecting non-compliance instances, respectively. We also compared automated checking to manual checking in terms of time efficiency. Automated checking took less than 1/10,000 of the time required for manual checking.
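
The precision and recall figures reported above follow the standard definitions for a detection task. A minimal sketch of that computation is given here for readers unfamiliar with the metrics; the function name and the instance IDs are hypothetical and are not taken from the paper or its test set.

```python
def precision_recall(flagged, actual):
    """Return (precision, recall) for a set of flagged instances.

    flagged: IDs the checker reported as non-compliant.
    actual:  IDs that are non-compliant in the gold standard.
    """
    flagged, actual = set(flagged), set(actual)
    true_positives = len(flagged & actual)
    # Guard against empty sets so the ratios stay defined.
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical instance IDs, for illustration only (not data from the paper).
flagged = {"i1", "i2", "i3", "i4"}
gold = {"i1", "i2", "i3", "i5"}
precision, recall = precision_recall(flagged, gold)
print(precision, recall)
```

In this toy example three of the four flagged instances are truly non-compliant, and three of the four truly non-compliant instances are flagged, so both ratios come out to 0.75.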

2015 ASCE International Workshop on Computing in Civil Engineering, IWCCE 2015 | 2015

Automated Extraction of Information from Building Information Models into a Semantic Logic-Based Representation

Jiansong Zhang; Nora El-Gohary

One of the major goals of building information modeling is to support automated compliance checking (ACC). To support ACC, building design information needs to be extracted from building information models (BIMs) and transformed into a representation that would allow for automated reasoning about that design information in combination with information from regulatory documents. However, existing BIM information extraction (IE) efforts are limited in supporting complete automation of ACC. Complete automation of ACC requires (1) automating both the extraction of information from BIMs and the extraction of regulatory information from regulatory documents and (2) aligning the instances of information concepts and relations extracted from a BIM with those extracted from regulatory documents, in order to facilitate direct automated reasoning about both types of information for compliance assessment. To address this gap, this paper proposes an automated BIM IE method for extracting design information from industry foundation classes (IFC)-based BIMs into a semantic logic-based representation that is aligned with a matching semantic logic-based representation of regulatory information. The proposed BIM IE method utilizes semantic natural language processing (NLP) techniques and Java Standard Data Access Interface (JSDAI) techniques to automatically extract project information from IFC-based BIMs and transform it into a logic format (logic facts) that is ready to be automatically checked against logic-represented regulatory rules (logic rules). The BIM IE method was tested on extracting design information from a Duplex Apartment BIM model. Compared to a manually developed gold standard, the testing results showed 100% precision and a short time of 15.02 seconds for processing 38,898 lines of data.

Construction Research Congress 2018 | 2018

Towards Systematic Understanding of Geometric Representations in BIM Standard: An Empirical Data-Driven Approach

Jiansong Zhang

The use of Industry Foundation Classes (IFC) data can facilitate interoperability of building information modeling (BIM) among different applications, alleviating the problems of missing and inconsistent information. Because IFC is designed to be transparent and open, IFC data can be opened and viewed in any text editor. But manually interpreting IFC data normally requires a significant amount of effort, due to (1) its large number of entities; and (2) the complex connections between one entity and another. In addition, the explanations of IFC entities in the IFC schema specifications are difficult to understand or verify. To address such difficulties, in this paper, an empirical data-driven approach is proposed for achieving a systematic understanding of entity definitions in an IFC schema. The approach utilizes IFC data and schema in a synergistic way to facilitate such systematic understanding. Experimental testing is used to verify and accumulate this understanding, and BIM tools are developed as a byproduct. The proposed approach was tested on understanding entities for geometric representations in the IFC2X3_TC1 schema. Through the experimental testing, a systematic understanding of 62 IFC entities was obtained, and a visualization algorithm was developed and implemented based on this understanding.

Archive | 2015

A semantic similarity-based method for semi-automated IFC extension

Jiansong Zhang; Nora El-Gohary

The Industry Foundation Classes (IFC) schema was designed as a comprehensive data schema to cover information of all phases of a building project and all disciplines of the AEC industry. But due to its limited coverage of details in certain subdomains, the IFC schema needs to be extended for many information processing tasks such as information extraction for automated regulatory compliance checking. Previous IFC extension efforts typically extended IFC in an ad-hoc and subjective manner. A more objective, standardized, and application-independent method for extending IFC is, thus, needed. To address this gap, a new method for extending the IFC schema objectively and semi-automatically is proposed. The proposed method utilizes a semantic relation-based concept matching algorithm to find concepts – from domain documents – to incorporate into the current IFC schema class hierarchy. It utilizes the hypernymy, hyponymy, and synonymy semantic relations. This paper focuses on presenting the proposed semantic relation-based concept matching algorithm: the ZESeM (Zhang and El-Gohary Semantic Matching) algorithm. The ZESeM algorithm was tested on processing concepts from Chapter 12 of the International Building Code 2006. Different semantic similarity computation methods were tested in combination with the proposed ZESeM algorithm. The ZESeM algorithm was evaluated based on adoption rate, which is the number of concepts found by the ZESeM algorithm that are adopted divided by the total number of concepts found by the ZESeM algorithm. An adoption rate of 85.8% was achieved. The proposed semantic relation-based concept matching algorithm offers a more efficient concept matching method for semi-automatically extending the IFC schema.
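
The adoption rate defined in the abstract above is a simple ratio: concepts found by the algorithm that were adopted, divided by all concepts found by the algorithm. A minimal sketch of that calculation follows; the function name and the example concepts are hypothetical and do not come from the paper's data.

```python
def adoption_rate(found, adopted):
    """Share of the concepts found by the matcher that were adopted.

    found:   concepts the matching algorithm proposed.
    adopted: concepts accepted into the schema after review.
    """
    found = set(found)
    if not found:
        raise ValueError("adoption rate is undefined when no concepts are found")
    # Only adopted concepts that the algorithm actually found count.
    return len(found & set(adopted)) / len(found)


# Hypothetical concepts, for illustration only (not data from the paper).
found = ["skylight", "courtyard", "ventilation area", "yard"]
adopted = ["skylight", "courtyard", "yard"]
print(adoption_rate(found, adopted))
```

Here three of the four found concepts are adopted, giving a rate of 0.75.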
