Detecting Ontological Conflicts in Protocols between Semantic Web Services

Abstract

The task of verifying the compatibility between interacting web services has traditionally been limited to checking the compatibility of the interaction protocol in terms of message sequences and the type of data being exchanged. Since web services are developed largely in an uncoordinated way, different services often use independently developed ontologies for the same domain instead of adhering to a single ontology as standard. In this work we investigate the approaches that can be taken by the server to verify the possibility to reach a state with semantically inconsistent results during the execution of a protocol with a client, if the client ontology is published. Often database is used to store the actual data along with the ontologies instead of storing the actual data as a part of the ontology description. It is important to observe that at the current state of the database the semantic conflict state may not be reached even if the verification done by the server indicates the possibility of reaching a conflict state. A relational algebra based decision procedure is also developed to incorporate the current state of the client and the server databases in the overall verification procedure.


International Journal of Web & Semantic Technology (IJWest) Vol.1, Num.4, October 2010

Detecting Ontological Conflicts in Protocols between Semantic Web Services

Priyankar Ghosh and Pallab Dasgupta

Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, India
{priyankar, pallab}@cse.iitkgp.ernet.in

Ontology is regarded as a formal specification of a (usually hierarchical) set of concepts and the relations between them. The need for developing intelligent web services that can automatically interact with other web services has been one of the primary forces behind recent research towards standardization of ontologies of specific domains of interest [1, 2, 3, 4, 5]. For example, if several online book stores follow the same ontology for the book domain, then it facilitates an intelligent web service to automatically search these book stores to find books in a particular category.

In the context of the next generation of the web, it is envisaged that intelligent agents will find, combine, and act upon information on the web, thereby performing routine day-to-day jobs independently. The protocols that will be used by such intelligent agents to communicate with the semantic web services will play an extremely important role towards materializing the next generation of the web. The protocol may contain branches, which are decisions made on the basis of the previous information exchange. Along with defining the information exchange between the client and server in the form of a set of query-answer pairs, independent actions will be described as a part of the protocol. An action may be automatically executed or may need manual intervention for completion, but the information required to initiate the action is provided by the answers to the previous queries. We present an example of such a protocol in Section 2.

When two communicating web services use ontologies, the following scenarios are possible with respect to semantic conflict.

Scenario-1 :

If the web services choose to use the same ontology, there will be no semantic conflict. In this paper we observe that the requirement that the ontologies used by communicating web services must match is a very strong requirement which is often not needed in practice.

DOI : 10.5121/ijwest.2010.1403

Scenario-2 :

If two communicating web services use different ontologies, then they may potentially reach a state where there is a semantic conflict/mismatch arising out of the differences between their ontologies. For example, suppose the ontologies of web service A and web service B recognize the class vehicle and its sub-classes, namely, car, truck and bike. The ontology of A defines color as an attribute of class vehicle, whereas the ontology of B defines color as an attribute of the sub-classes car and bike only. Now suppose A wants to follow the following protocol with B:

Step-1: Ask B for the registration number of a vehicle which is owned by a given person.
Step-2: If B finds the registration number, then ask B for the color of the vehicle.

Several executions of this protocol are possible for different valuations of the data exchanged by the protocol. Semantic conflicts arising out of the differences in ontologies may occur in some of these cases, but not always. For example:

– If B does not find the registration number, then Step-2 is not executed and there is no semantic conflict.
– If B finds the registration number and the vehicle happens to be a truck, then Step-2 of the protocol will lead to a semantic conflict, since in B's ontology, the color attribute is not defined for trucks.
– If B finds the registration number and the vehicle happens to be a car or a bike, then Step-2 will not lead to a semantic conflict, since in B's ontology, the color attribute is defined for cars and bikes.

If the ontology of A and the protocol are made available to B, then B can formally verify whether any execution of the protocol may lead to a semantic conflict and warn A accordingly before the actual execution of the protocol begins.

There has been considerable research in the recent past on matching ontologies and finding semantic conflicts/mismatches between two ontologies [6, 7, 8]. In many cases, two web services may have conflicting ontologies, but the protocol between them may avoid the conflict scenarios. Consider the scenario where the direction of query-answer is reversed, that is the same sequence of queries is made by A and answered by B. Also A makes the query about the color of the vehicle only if the vehicle is not a truck. In this case the conflict will not be sensitized by the protocol. In other words, two agents may not agree on all concepts in their universe, but may still be able to support certain protocols as long as they avoid the contentious issues, a fact which is often ignored in world politics! Therefore an approach which rules out communication between two services on the grounds that their ontologies do not match is too conservative in practice. Since the standardization of ontologies and their acceptance in industrial practice seems to be a distant possibility, we believe that the verification problem presented in this paper and its solution are very relevant at present.

Scenario-3 :

The ontologies can be visualized as a combination of meta-data and a set of instances. Classes, relations and data-types form the meta-data part of the ontology, whereas the individuals and the valuations of the attributes are the actual data. It is often the case that the actual data is stored in a database, and ontologies are used as a wrapper on top of the databases. Therefore the state of the database has to be incorporated while the server checks whether the protocol can possibly reach a conflict state. Since the protocol between the client and the server typically has branches and the decision for making the next query is dependent on the answer of the current query, the conflict that is present at the ontology level may not be sensitized due to the answers generated from the back-end database. We present a relational algebra based decision procedure to check whether the conflicts that are present at the ontology level are actually present with respect to the current state of the back-end database.

Scenario-4 :

It is important to observe that the protocol has different runs depending on the instantiation of the variables that are used in the protocol. Since the conflict may not be sensitized in a particular run of the protocol, the server may choose to start the protocol and check the possibility of getting into a conflict after every information exchange. Depending on how the conversation progresses, the server may either continue to run the protocol, or may terminate the conversation when it finds that the conflict is inevitable.

A preliminary version of this work is published in [9]. In that version we presented the verification algorithm for Scenario-2. In this paper we include the algorithms for Scenario-3, i.e. the verification of the spuriousness of an ontological conflict with respect to the current state of the back-end database. We also show that the same algorithm can be used by the server for Scenario-4.

The paper is organized as follows. The syntax for describing a protocol is described in Section 2. In Section 3 we present a graph based model for representing the ontologies. The proposed formal method for detecting semantic conflicts at the ontology level is presented in Section 4. The notion of ontology with database, query answering with the back-end database, and the algorithm to verify the conflicts at the ontology level in the presence of the database are presented in Section 5. Related works are briefly discussed in Section 6. Finally we present the conclusion in Section 7.

In this section we present a formalism similar to SQL for the specification of the protocol. It may be noted that other formalisms can also be used to specify a protocol as long as the formalism has expressive power similar to the formalism used in this paper. We present two example protocols and also describe the notion of the conflict that we have addressed in this paper.

Typically, a protocol consists of a sequence of queries and answers. The query specifies a set of variables through the "Get" keyword and specifies a set of classes using the "from" keyword. The valuations corresponding to the variable set are generated from those classes. Also an optional "where" keyword is used to specify the conditions on the variables. The answer of a query is a tuple of valuations corresponding to the variable set specified in the query. The branching is specified using "if-else" statements.
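The protocol syntax described above can be captured in a small data structure. The following Python sketch is only one possible encoding, not the paper's representation; the names `Query`, `IfElse` and `queries_in`, and the variable names `t1`, `a1`, and so on, are assumptions introduced for illustration. The sample value encodes the publication protocol that the paper presents in Figure 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Query:
    """One 'Get ... from ... where ...' query of a protocol."""
    variables: Dict[str, str]        # attribute name -> variable name
    classes: List[str]               # classes or dotted specialization sequences
    where: Optional[str] = None      # optional condition on the variables


@dataclass
class IfElse:
    """A branch decided on the answer of an earlier query."""
    condition: str
    then_branch: List["Step"] = field(default_factory=list)
    else_branch: List["Step"] = field(default_factory=list)


Step = Union[Query, IfElse]


def queries_in(steps):
    """Yield every query of a protocol, descending into both branches."""
    for step in steps:
        if isinstance(step, Query):
            yield step
        else:
            yield from queries_in(step.then_branch)
            yield from queries_in(step.else_branch)


# Hypothetical encoding of the publication protocol (variable names are illustrative).
protocol_1 = [
    Query({"title": "t1", "author": "a1", "date": "d1"}, ["Manual"],
          "t1 = 'ManualName'"),
    Query({"title": "t2", "author": "a1"}, ["Book"]),
    IfElse("t2 != null", [
        Query({"title": "t3", "author": "a1", "date": "d3"}, ["Book.Proceedings"]),
    ]),
]

for q in queries_in(protocol_1):
    print(q.classes, sorted(q.variables))
```

Keeping branches as nested lists makes it straightforward to enumerate all queries for an ontology-level check while still preserving the path conditions needed later for the database-level check.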

[Protocol - 1 :]

Consider the protocol shown in Figure 1. The protocol depicts a conversation between a client and a server over the publication domain. The query of the client is about the author of some specific manual. Then the client makes a query to retrieve a book by the author of that manual. According to the ontology of the client, 'Proceedings' is a subclass of 'Book' and the client makes the next query to retrieve the proceedings by the same author. If the server does not recognize 'Proceedings' as a sub class of 'Book', the query cannot be answered by the server due to the mismatch in the ontologies.

[Protocol - 2 :]

In Figure 2 we present another protocol that exchanges information about the automobile domain. The client makes a query to retrieve a brand which has sold more than a specific number of vehicles in a particular year. The next query is made in the context of the previous query to check whether that brand manufactures 'Red Trucks'. According to the ontology of the client, color is a property of the vehicle class and therefore all subclasses of the vehicle class will have the color attribute. However if the server recognizes 'color' as an attribute of some of the sub-classes (suppose 'car' and 'two-wheeler') instead of as an attribute of the class 'Vehicle' itself, the query cannot be answered by the server due to the mismatch in the ontology.

    Client → Server:  Get (title : t, author : a, date : d) from Manual where (t = 'ManualName')
    Server → Client:  ⟨t, a, d⟩
    Client → Server:  Get (title : t, author : a) from Book
    Server → Client:  ⟨t, a⟩
    Client:           if (t != null)
    Client → Server:  Get (title : t, author : a, date : d) from Book.Proceedings
    Server → Client:  ⟨t, a, d⟩

Fig. 1. Protocol on Publication Domain

[Protocol - 3 :]

In this example we present a protocol of an intelligent agent. Consider the semantic web service for an online store. The online store can be queried to retrieve the relevant information about the available items. Also consider a multi-cuisine restaurant which is a client of that store. Whenever the stock of some item, say i, falls below some level, the intelligent agent that works on behalf of the restaurant searches for the availability of i by querying the online store. Suppose i comes in two qualities, q1 and q2. The protocol that is used by that agent to find and buy the item under consideration is presented below using a format similar to pseudo code. Here the buy action is carried out by the agent automatically, if the precondition is satisfied.

Protocol for Buying an Item

    Get the availability of i of quality q1;
    If (i of quality q1 is available)
        Get the price of i of quality q1;
        If (the price is less than C)
            Buy i of quantity Q;
        Else
            Inform the Manager of the store;
    Else
        Get the price of i of quality q2;
        If (the price is less than C)
            Buy i of quantity Q;
        Else
            Inform the Manager of the store;

    Client → Server:  Get (Brand : b, ItemsSold : c, Year : y) from SaleStats where (c > … and y = …)
    Server → Client:  ⟨b, c, y⟩
    Client → Server:  Get (Brand : b, Model : mod, Date : d, Color : col) from Vehicle.Truck where (d.year > … and col = 'Red')
    Server → Client:  ⟨b, mod, d, col⟩

Fig. 2. Protocol on Automobile Domain
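The branching of the item-buying protocol (Protocol 3) can be sketched as follows. This is a minimal illustration under stated assumptions: the store is modelled as a plain dictionary rather than a web service, and the function name `run_buy_protocol`, the dictionary layout, and the single price limit `cap` (the paper's C) are hypothetical choices made for this example.

```python
def run_buy_protocol(store, item, q1, q2, cap, quantity):
    """Follow the branches of the buying protocol and return the action taken.

    `store` maps (item, quality) to a dict with 'available' and 'price';
    this layout is an assumption made for the sketch.
    """
    first = store.get((item, q1), {})
    # First query: is the item available in quality q1?
    quality = q1 if first.get("available", False) else q2
    # Second query: price of the chosen quality.
    price = store.get((item, quality), {}).get("price")
    if price is not None and price < cap:
        return ("buy", item, quality, quantity)
    return ("inform_manager", item, quality)


store = {
    ("rice", "q1"): {"available": False, "price": 40},
    ("rice", "q2"): {"available": True, "price": 25},
}
print(run_buy_protocol(store, "rice", "q1", "q2", cap=30, quantity=10))
print(run_buy_protocol(store, "rice", "q1", "q2", cap=20, quantity=10))
```

Each action depends only on the answers to the preceding queries, which is the property that allows such a protocol to be analysed before it is run.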

We focus on the following two types of mismatch between the client and server ontologies in this paper.

Specialization Mismatch(Type-1):

In this type of incompatibility the client recognizes a class $c_1$ as the specialization of another class $c_2$, whereas the server recognizes $c_1$ as the specialization of some other class $c_2'$. Our first example (Figure 1) is an instance of this type.

Attribute Assignment Mismatch (Type-2):

A very common type of incompatibility arises where the client and the server both recognize classes $c'_1, \ldots, c'_n$ as the specializations of another class $c$, but the client associates an attribute $\alpha$ with the super class $c$, whereas the server associates $\alpha$ with some of the sub classes $c'_i, \ldots, c'_j$, $0 < i, j \le n$. Since we view the mismatches from the query answering perspective, we use the notion of this conflict from the query perspective. If the set of variables that is used in a query $q$ is not available at the server side, we denote that as an attribute level (Type-2) mismatch. Our second example (Figure 2) is an instance of this type.

While describing an ontology using OWL, classes and attributes (modeled as properties in the context of OWL) are used to represent the meta-data. We use a graph based approach to model the meta-data that are described as classes and attributes in OWL. Since attributes are expressed as properties in OWL, we use the terms property and attribute interchangeably. We define the ontology graph as follows.

Definition 1. A graph model for an ontology $O$ is $G = (V, E)$ where $V$ is the set of vertices and $E$ is the set of directed edges. Each node $v_i \in V$ represents a class in the OWL ontology and $v_i$ is associated with a property list $L(v_i)$ whose elements are the data properties of the class. The directed edges can be of the following types.

Inheritance-Edge: There is an inheritance-edge $e_{ij} \in E$ from $v_i$ to $v_j$, where $v_i, v_j \in V$, if $v_j$ is a sub class of $v_i$.
Property-Edge: There is a property-edge $e_{ij} \in E$ from $v_i$ to $v_j$, where $v_i, v_j \in V$, if $v_j$ is an object property of $v_i$.

In this section we present the relevant formalisms and present the overall algorithm for solving the problem. The variable set and the class set specified in the query $q$ are denoted by $S_v(q)$ and $S_c(q)$ respectively. We present a graph search based structural matching algorithm to check the semantic safety of the protocol.

Definition 2.

The specialization sequence $\sigma = \langle c_1.c_2.\cdots.c_k \rangle$ in a query $q$ is the sequence of classes that are concatenated through the '.' operator, and for any two consecutive classes $c_i$ and $c_{i+1}$ in the sequence, $c_i$ is the super class of $c_{i+1}$. Therefore the elements of $S_c(q)$ can be individual classes or specialization sequences.

Algorithm 1: Check-Consistency

    input: The Protocol P and the Server Ontology O_s
    V ← {};
    foreach query q in the protocol P do
        foreach element τ in S_c(q) do
            if τ is a specialization sequence then
                c ← the first concept of τ;
                c_t ← FindMatch(O_s, c);
                for i ← 2 to length(τ) do
                    c_m ← the i-th concept of τ;
                    if any class c'_t equivalent to c_m is not found as a sub class of c_t in O_s then
                        Report Mismatch at c_m;
                    else
                        c_t ← c'_t
                    end
                end
                V ← V ∪ property set for c_t;
            else
                /* c is an individual class */
                c ← τ;
                c_t ← FindMatch(O_s, c);
                V ← V ∪ property set for c_t;
            end
        end
        if S_v(q) ⊈ V then
            Report {S_v(q) − V} as unmatched variables;
        end
    end

Function FindMatch(O_s, c_i)

    Find the class c_t which is equivalent to c_i in O_s;
    if c_t is not found in O_s then
        Report Mismatch at c_i;
        exit;
    end
    return c_t;

We present a working example to describe how the algorithm works. Consider the protocol shown in Figure 1. We elaborate the steps of applying Algorithm 1 with respect to the fragments of the client and server ontologies shown in Figure 3 and Figure 4 respectively. These fragments are taken from the benchmark provided by [10]. The benchmark has one reference ontology and four other real ontologies, and the domain of these ontologies is bibliographic references. We have used the reference ontology as the server ontology and another real ontology named INRIA as the client ontology. We have used a pictorial representation which is similar to an entity-relationship diagram to show the fragments of the ontologies. The classes are represented by rounded rectangles and the ovals represent the properties of a particular class. The class hierarchy is shown using arrows, that is a sub class is connected to its super class by an arrow which is directed towards the sub class. The properties that belong to a particular class are connected to the rounded rectangle corresponding to that class through a line.
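The ontology graph of Definition 1 and the Check-Consistency procedure can be sketched in Python as follows. This is an illustrative approximation rather than the authors' implementation: class equivalence is reduced to name equality, only inheritance edges and property lists are modelled, the set V is reset for every query, and the tiny server ontology is a hypothetical stand-in that only preserves the one fact stated in the text, namely that 'Proceedings' is not a sub class of 'Book' on the server side.

```python
class OntologyGraph:
    """Definition 1 restricted to classes, property lists and inheritance edges."""

    def __init__(self):
        self.props = {}        # class name -> property list L(v)
        self.subclasses = {}   # class name -> direct sub classes

    def add_class(self, name, props=(), parent=None):
        self.props[name] = set(props)
        self.subclasses.setdefault(name, set())
        if parent is not None:
            self.subclasses.setdefault(parent, set()).add(name)


def find_match(server, cls):
    """FindMatch: equivalence is approximated by name equality (an assumption)."""
    return cls if cls in server.props else None


def check_consistency(protocol, server):
    """Return a list of (kind, detail) mismatch reports for the protocol."""
    reports = []
    for variables, class_set in protocol:
        available = set()                 # the set V, reset for every query here
        all_matched = True
        for element in class_set:
            chain = element.split(".")    # a single class is a chain of length 1
            current = find_match(server, chain[0])
            if current is None:
                reports.append(("class", chain[0]))
                return reports            # FindMatch exits on failure
            for nxt in chain[1:]:
                if nxt not in server.subclasses[current]:
                    reports.append(("specialization", nxt))
                    current = None
                    break
                current = nxt
            if current is None:
                all_matched = False
            else:
                available |= server.props[current]
        if all_matched and not set(variables) <= available:
            reports.append(("variables", set(variables) - available))
    return reports


# Hypothetical server fragment: all three classes are siblings under 'Entry'.
server = OntologyGraph()
server.add_class("Entry")
for name in ("Manual", "Book", "Proceedings"):
    server.add_class(name, {"title", "author", "date"}, parent="Entry")

# Queries as (attribute names, class set) pairs, following Figure 1.
protocol_1 = [
    ({"title", "author", "date"}, ["Manual"]),
    ({"title", "author"}, ["Book"]),
    ({"title", "author", "date"}, ["Book.Proceedings"]),
]
print(check_consistency(protocol_1, server))  # [('specialization', 'Proceedings')]
```

The first two queries pass and the third is reported as a Type-1 mismatch, which mirrors the outcome of the working example discussed in the text.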

Step-1:

While applying Algorithm 1 to the server ontology, the individual class 'Manual' is searched and since the search is successful, it is checked that the attributes that are associated with class 'Manual' in the query in the protocol are actually answerable by the server, and this check turns out to be successful for the ontologies that are presented here.

Step-2:

The next query uses the class 'Book'. Algorithm 1 performs the consistency checking in the same way as for the previous query and the check is successful.

Step-3:

The third query uses a specialization sequence 'Book.Proceedings'. Algorithm 1 searches for the 'Book' class in the server ontology and then checks whether 'Proceedings' is a sub class of 'Book' in the server ontology. Algorithm 1 reports a failure since in the server ontology 'Proceedings' is not a sub class of 'Book'.

Theorem 1. [Soundness] The mismatches returned by Algorithm 1 are correct.

Proof. Algorithm 1 reports mismatch in three cases. We observe each of the cases as follows.

Mismatch in individual class:

If Algorithm 1 does not find a matching class $c$ which is used in a query, a conflict is reported. Since the class is not recognized by the server, it is not possible for the server to answer the query. Therefore the outcome of the algorithm is correct.

Mismatch in specialization sequence:

Consider a specialization sequence $\sigma = \langle c_1.c_2.\cdots.c_k \rangle$ in a query $q$ on which Algorithm 1 returns a mismatch. We prove the correctness of the consistency checking by induction on the length $k$ of $\sigma$.

Basis ($k = 1$): In this case there is only one class in the specialization sequence and this case falls under the case of mismatch in individual classes.

Inductive Step:

Suppose Algorithm 1 returns the mismatch correctly for specialization sequences having length $k$. We prove that Algorithm 1 reports the conflicts correctly for the specialization sequences having length $k + 1$. There can be two possible cases.

a. The conflict is reported for a class that appears in the $i$-th location of the sequence, where $1 < i < k + 1$. The reported mismatch is correct according to the inductive hypothesis.
b. The conflict is reported for the $(k+1)$-th class of the sequence. In this case there exists a matching specialization sequence at the server ontology up to length $k$. But $c_{k+1}$ is not a sub class of class $c_k$ according to the server ontology. Therefore the conflict reported by Algorithm 1 is correct.

[Figure: class hierarchy with attached properties; class labels recoverable from the diagram include Reference, Informal, Academic, Book, Proceedings, Collection, Monograph, Manual, Booklet and LectureNotes.]

Fig. 3. Fragment of Client Ontology

Mismatch on variables:

Suppose the set of variables that are specified by the client is $V_c$ in a query $q$ corresponding to the class set $S_c(q)$ and the failure is reported on some variable in $V_c$. Since Algorithm 1 first finds the matches corresponding to the classes in $S_c(q)$ and then checks for the answerability with respect to the variable set, in this case every class in $S_c(q)$ is matched with suitable classes in the server side. Now Algorithm 1 reports a conflict if there exists any variable that is not recognized by the server as an attribute of at least one of the classes that correspond to the classes in $S_c(q)$. Therefore the reported conflict falls under the Type-2 or attribute level conflict category. □

Theorem 2. [Completeness] For any protocol $P$, if there is any mismatch of type-1 or type-2, Algorithm 1 reports it.

Proof. This proof is done by construction. For each type of mismatch we show that Algorithm 1 uses a sequence of operations through which the mismatch is detected. We present the proof for each mismatch type.

Type-1 Mismatch:

Consider a specialization sequence $\sigma = \langle c_1.c_2.\cdots.c_k \rangle$ which is used in query $q$. Algorithm 1 starts by finding the class that is equivalent to $c_1$ at the server side. If there is only one class in $\sigma$ then Algorithm 1 reports a mismatch when the corresponding class is not found in the server ontology. When the length of $\sigma$ is greater than 1, Algorithm 1 continues to check whether $c_{i+1}$ is a subclass of $c_i$, where $1 \le i < k$. A mismatch is reported by Algorithm 1 whenever $c_{i+1}$ is not a subclass of $c_i$ for some $1 \le i < k$. Hence if there exists any mismatch in any specialization sequence, the algorithm reports it.

[Figure: class hierarchy with attached properties; class labels recoverable from the diagram include Entry, Published, Composite, Informal, Book, Article, Monograph, Collection, Proceedings, TechReport, Booklet and Manual.]

Fig. 4. Fragment of Server Ontology

Type-2 Mismatch:

Consider a query $q$ made by the client and let the set of variables in $q$ be $V_c$. The set of classes is denoted by $S_c(q)$. We argue that, if there exists a Type-2 mismatch for query $q$, Algorithm 1 reports it. For Type-2 mismatches Algorithm 1 first checks the presence of the equivalent classes $c^s_i$ in the server ontology and computes the union $V_s$ of the attributes corresponding to every $c^s_i$. If there are any variables in $V_c$ that are not present in $V_s$, a conflict is reported by Algorithm 1. Hence if there exists a Type-2 mismatch for a query, Algorithm 1 reports it. □

In this section we describe the two level representation for describing ontologies: using OWL to describe the classification and using a database to store the instances. This type of representation is helpful for describing domains with a large number of instances. From the point of view of the instances of classes, the classes in an ontology can be categorized as follows.

a. Classes of Abstract Type – these classes are used purely for the purpose of describing a domain hierarchically. These classes do not have any instances. They act only as the super classes of other classes.
b. Classes with Instances – these classes may act as super classes of other classes but they have a non-empty set of instances.

Consider the ontology fragment in Fig. 4. Here Entry, Informal, and Composite are examples of abstract classes. On the other hand, Book, Monograph etc. are examples of classes with instances. Although Book is a super class of Monograph and Collection, it is possible to have instances of Book which are neither Monograph nor Collection.

While using the two level representation, it is important to keep the database schema consistent with the wrapper ontology. One choice for describing the database schema is to maintain a table for each of the non-abstract classes present in the ontology. Alternative ways of describing the database are possible, but we use this simplistic representation of the database schema to present the proposed algorithm.
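The table-per-class schema described above can be illustrated with Python's standard `sqlite3` module. The fragment is a sketch under assumptions: the class list, the attribute names and the helper `create_schema` are hypothetical, every column is typed as TEXT for simplicity, and an in-memory database stands in for the server's back-end.

```python
import sqlite3

# Hypothetical ontology fragment: class name -> (is_abstract, attributes).
classes = {
    "Entry": (True, []),
    "Book": (False, ["title", "author", "date"]),
    "Monograph": (False, ["title", "author", "date"]),
}


def create_schema(conn, classes):
    """Create one table per non-abstract class; abstract classes get no table."""
    for name, (is_abstract, attrs) in classes.items():
        if is_abstract:
            continue
        columns = ", ".join(f"{attr} TEXT" for attr in attrs)
        conn.execute(f"CREATE TABLE {name} ({columns})")


conn = sqlite3.connect(":memory:")
create_schema(conn, classes)

# An instance of Book that is neither a Monograph nor a Collection.
conn.execute("INSERT INTO Book VALUES (?, ?, ?)",
             ("Some Title", "Some Author", "2009"))

tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
print(tables)  # ['Book', 'Monograph']
```

Because abstract classes have no instances, leaving them out of the schema loses no data, while the ontology still records their place in the hierarchy.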

When the server side adheres to the two layer structure for its ontology, every query in the protocol is answered by generating corresponding tuples from the back-end database. In the context of the back-end database, the occurrences of variables in a protocol can be categorized into the following types.

Uninstantiated:

When a variable is placed in a query for the first time without initialization, it is referred to as an uninstantiated occurrence of the variable or in short uninstantiated variable. The values for the variables are instantiated at the side where the query is evaluated.

Instantiated:

Other than the first occurrence without initialization, all other occurrences of a variable are referred to as instantiated occurrences of that variable or in short instantiated variable. At these occurrences, the variables are already assigned some value by the server. These occurrences are used for value propagation.

[Evaluation Semantics of a Query :]

The semantics of the evaluation of the query is similar to Conjunctive Datalog. The evaluator of the query tries to assign values to the uninstantiated variables and forms a tuple which satisfies the logical and of the conditions specified in the where clause of the query. The same variable appearing in different classes specified in the where clause of the query has to be assigned the same value.

Consider the protocol presented in Fig. 1. In Section 4.2 we have shown that the protocol has an ontological conflict when the client and the server use the ontologies in Fig. 3 and Fig. 4 respectively. Note that the condition (t != null) may always evaluate to false due to the actual data that is stored in the database of the server. In that case, the ontological conflict in the last query, [Get (title : t, author : a, date : d) from Book.Proceedings], will never be sensitized. In other words the conflicts at the ontology level may turn out to be spurious. We define the spuriousness of an ontological conflict as follows.

Deﬁnition 3.

An ontological conflict is spurious when, for all possible correct instantiations of the variables, the conflict is not reachable from the start state of the protocol, due to the decisions taken at different stages of the protocol. By correct instantiations we mean the instantiations that conform to the evaluation semantics defined earlier.
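The definition can be illustrated with a deliberately small Python sketch. It assumes that the answers the server can currently produce are already available as a list of rows and that the path to the conflicting query is guarded by a list of conditions; the function name `conflict_is_spurious` and the sample rows are hypothetical, and the sketch ignores how the answers are computed from the database.

```python
def conflict_is_spurious(answers, guards):
    """Return True when no answer the server can currently produce
    satisfies every guard on the path to the conflicting query."""
    return not any(all(guard(row) for guard in guards) for row in answers)


# Hypothetical answers to the second query of Figure 1 (books by the author).
no_book = [{"title": None, "author": "A. Author"}]
some_book = [{"title": "Some Book", "author": "A. Author"}]


def title_found(row):
    """The guard (t != null) placed before the conflicting query."""
    return row["title"] is not None


print(conflict_is_spurious(no_book, [title_found]))    # True
print(conflict_is_spurious(some_book, [title_found]))  # False
```

With the first data set the branch containing the 'Book.Proceedings' query is never taken, so the ontology-level conflict cannot be reached; with the second it can.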

Here we present the relevant formalisms for describing the algorithm to check the presence of the conflict detected by Algo. 1 at the current state of the server database.

Deﬁnition 4.

The assignable set of values for a variable $\varphi$ is the set of values that can be assigned to $\varphi$ during the instantiation and it is denoted as AssignableSet($\varphi$).

Suppose in a protocol $P$, a query $q$ has variable set $v = \{\varphi_1, \ldots, \varphi_n\}$ and concept set $C = \{C_1, \ldots, C_m\}$. Let us also assume that in $P$ all the variables of $q$ are uninstantiated variables. The notion of assignable set in the presence of previously instantiated variables is discussed later. The evaluation of the query assigns a value to each of the variables in that query. All the variables together form a tuple $\tau = \langle val_1, val_2, \ldots, val_n \rangle$ such that if any variable $\varphi_k$ is common between class $C_i$ and class $C_j$ then both the classes have to assign the same value to the variable $\varphi_k$. All such possible tuples that can be populated by the evaluator side form the assignable set of values for $v$, and the assignable set for a variable $\varphi_i$ is:

$$\mathrm{AssignableSet}(\varphi_i) = \{\, val \mid \exists\, \tau \in \mathrm{AssignableSet}(v) \wedge \tau = \langle val_1, val_2, \ldots, val_n \rangle \wedge val_i = val \,\}$$

The dependencies among the variables play an important role in determining the AssignableSet for a variable.
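The assignable set can be computed by a brute-force join, as the following Python sketch shows. It is an illustration under assumptions rather than the paper's procedure: each class is given as a list of rows (dictionaries from variable names to values), all variables are taken to be uninstantiated, and the function names and sample data are hypothetical.

```python
from itertools import product


def assignable_tuples(variables, tables):
    """All tuples over `variables` on which every class agrees for shared variables."""
    result = set()
    for rows in product(*tables):            # pick one row from each class
        merged = {}
        consistent = True
        for row in rows:
            for var, val in row.items():
                # A variable shared by two classes must receive the same value.
                if merged.setdefault(var, val) != val:
                    consistent = False
        if consistent and all(var in merged for var in variables):
            result.add(tuple(merged[var] for var in variables))
    return result


def assignable_set(var, variables, tables):
    """Project the assignable tuples onto a single variable."""
    index = variables.index(var)
    return {tup[index] for tup in assignable_tuples(variables, tables)}


# Hypothetical instances of two classes that share the variable 'a'.
class_1 = [{"a": "Smith", "t": "X"}, {"a": "Jones", "t": "Y"}]
class_2 = [{"a": "Smith", "d": 2009}]
variables = ["t", "a", "d"]

print(assignable_tuples(variables, [class_1, class_2]))   # {('X', 'Smith', 2009)}
print(assignable_set("t", variables, [class_1, class_2]))  # {'X'}
```

The row for 'Jones' is discarded because the second class has no matching value for the shared variable, which is the agreement condition stated in the definition.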

Deﬁnition 5.

In a query, if some of the variables are previously instantiated, we say that the previously instantiated set of variables is constraining the set of values of the uninstantiated variables. Suppose in the same query $q$, among the variables specified in $q$, $\varphi_1, \cdots, \varphi_k$ are previously instantiated and $\varphi_{k+1}, \cdots, \varphi_n$ are the variables that are instantiated by the evaluation of $q$. We define the constrain relation $R_C$ and the ConstrainSet as follows.

$$R_C = \{\, (\varphi_i, \varphi_j) \mid \varphi_i \in \{\varphi_1, \cdots, \varphi_k\} \text{ and } \varphi_j \in \{\varphi_{k+1}, \ldots, \varphi_n\} \,\}$$

$$\mathrm{ConstrainSet}(\varphi_i) = \{\varphi_{k+1}, \varphi_{k+2}, \cdots, \varphi_n\}$$

Consider the same query $q$. The AssignableSet for the set of variables of $q$ is the set of all tuples $\tau = \langle val_1, val_2, \ldots, val_n \rangle$ such that the following conditions hold.

– If any variable $\varphi_k$ is placed in more than one concept, all the concepts assign the same value to $\varphi_k$.
– $(val_1 \in A_1) \wedge \ldots \wedge (val_k \in A_k)$, where $A_1, \cdots, A_k$ are the assignable sets of variables $\varphi_1, \cdots, \varphi_k$ respectively.

Definition 6.

The RestrictSet for a variable set $v$ is obtained by computing the transitive closure of $R_C$ on $v$.

We use the notion of the split operation on the assignable set of values of a variable and it works as follows. Let a query $q$ consist of concept $C_i$ with an uninstantiated variable $\varphi_i$ and a previously instantiated variable $\varphi_j$. Suppose a decision is made on the variable $\varphi_j$. In each branch, the possible values of $\varphi_j$ form a subset of its assignable set. Since the value of $\varphi_i$ is dependent on $\varphi_j$, in each branch the possible values for $\varphi_i$ also form a subset of the assignable set of $\varphi_i$.

Definition 7.

The SplitSet for a variable set $v$ is a subset of RestrictSet($v$) and is defined as:

SplitSet($v$) = { $\varphi_j$ | $\varphi_j \in$ RestrictSet($\varphi_i$) and $\varphi_j$ appears in a condition in the path of the protocol from the start of the protocol to the query with ontological conflict $\varphi_i$ }

Definition 8.

RelevantConditionSet of a variable set v is the set of conditions in true form on thevariable set v split , which have to be true for reaching the conﬂicting query. nternational Journal of Web & Semantic Technology (IJWest) Vol.1, Num.4, October 2010 Verify the Conﬂicts on Back-end Database Initialize a hash table H t ; /* In the hash table H t , a set of variables v forms the key, which is mapped to theAssignableSet of the variable set v */ foreach conﬂicting query q do v ← The set of instantiated variables speciﬁed in q ; if VerifyConﬂict( v ) then Report mismatch on variable v at database level; else Report the conﬂict as spurious; end Function

VerifyConﬂict( v ) v restrict ← The RestrictSet for the variable set v ; v split ← The SplitSet for the variable set v ; v srestrict ← MakeSets( v restrict ); Construct a priority queue Γ of variable sets; /* Γ is ordered according to the order of the instantiations of its variable sets */ forall the variable set v i ∈ v srestrict do Enqueue v i in Γ ; end Table set S t ← {} ; while Γ is not empty do u ← Dequeue ( Γ ); if (VerifyConﬂict( u )) then /* The set of possible valuations for u is not empty */ t ← Search H t and return the table containing u ; if t / ∈ S t then S t ← S t ∪ { t } ; end else /* The set of possible valuations for u is empty, so the conflict is spurious */ return false; end end Find the query q that instantiates variable set v ; if v split != ∅ then c ← The RelevantConditionSet on the variable set v split ; δ ← SplitAssignableSet ( δ, v split , c ); end if δ == ∅ then Report the conﬂict on v as spurious; return false; else Insert δ in H t ; return true; end nternational Journal of Web & Semantic Technology (IJWest) Vol.1, Num.4, October 2010 Function

MakeSets( v ) Initialize set of variable sets v ret = {} ; while v is not empty do Find a query q that instantiates some of the variables in v ; Initialize variable set v temp = {} ; forall the variable ϕ i ∈ v and ϕ i is instantiated by q do v ← v − { ϕ i } ; v temp ← v temp ∪ { ϕ i } ; end v ret ← v ret ∪ { v temp } ; end Function

GenerateAssignableSet( q , S t ) /* Suppose q is made with the concepts C , ..., C n and ϕ i , . . . , ϕ ik are the uninstantiatedvariables corresponding to the concept C i */ v ← { ϕ ij | ϕ ij = ∗} ; if S t == Φ then /* All the variables of q are uninstantiated */ Tuple set T ← ( C ⋊⋉ C ⋊⋉ ... ⋊⋉ C n ); else /* Some of the variables of q are previously instantiated and t , ..., t m ∈ S t are thetuple sets corresponding to those variables */ Tuple set T ← ( C ⋊⋉ C ⋊⋉ . . . ⋊⋉ C n ⋊⋉ t ⋊⋉ . . . ⋊⋉ t m ); end Relational algebra query q Rel ← π v ( T ); Compute q Rel and return the set of tuples;

Function

SplitAssignableSet( δ , v split , c ) /* Suppose c , · · · , c i ∈ c */ Relational algebra query q Rel ← σ ( c ∨ c ∨ ... ∨ c i ) ( δ ); Compute q Rel and return the set of tuples;
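The two relational-algebra functions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tables are in-memory lists of dictionaries rather than a real database, at least one concept table is supplied, conditions are plain Python predicates over a variable binding, and the `Flight` concept, the `earlier` table and all values are hypothetical. The selection uses a disjunction of the conditions, following the pseudocode of SplitAssignableSet.

```python
def natural_join(r, s):
    """Natural join of two tables given as lists of dicts."""
    out = []
    for a in r:
        for b in s:
            # rows match when they agree on every shared attribute
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out.append({**a, **b})
    return out

def generate_assignable_set(concepts, prior_tables, variables):
    """pi_v(C_1 join ... join C_n join t_1 join ... join t_m)."""
    tables = list(concepts) + list(prior_tables)
    joined = tables[0]                     # assumes at least one concept
    for table in tables[1:]:
        joined = natural_join(joined, table)
    # projection onto `variables`; the set removes duplicate tuples
    return {tuple(row[v] for v in variables) for row in joined}

def split_assignable_set(delta, variables, conditions):
    """sigma_(c_1 or ... or c_i)(delta): keep tuples meeting some condition."""
    return {tau for tau in delta
            if any(cond(dict(zip(variables, tau))) for cond in conditions)}

# Hypothetical concept table and a table from an earlier query.
Flight = [{"city": "Paris", "price": 120},
          {"city": "Rome", "price": 300},
          {"city": "Oslo", "price": 90}]
earlier = [{"city": "Paris"}, {"city": "Rome"}]   # previously instantiated
variables = ["city", "price"]

delta = generate_assignable_set([Flight], [earlier], variables)
spurious = split_assignable_set(delta, variables,
                                [lambda b: b["price"] < 100])
reachable = split_assignable_set(delta, variables,
                                 [lambda b: b["price"] < 200,
                                  lambda b: b["city"] == "Oslo"])
```

Here the join with `earlier` removes the Oslo row, so a path condition `price < 100` leaves an empty set and the conflict would be reported as spurious, whereas the second condition set leaves one tuple and the conflict would be kept.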

This algorithm can also be used by the server as the protocol progresses (described as Scenario-4 in Section 1). In that case, the variables in the queries that have already been executed have values assigned to them, and the algorithm treats those variables as instantiated.

The proof of correctness of Algo. 3 is presented below. Algo. 3 verifies the spuriousness of conflicts returned by Algo. 1 on the server database.

Theorem 3. [Soundness] Algorithm 3 correctly reports the spuriousness of a conflict on the set of variables $v'$, where $v' = v \cup \mathit{RestrictSet}(v)$ and $v$ is the set of previously instantiated variables in a query $q$ of protocol $P$ with ontological conflict.

Proof. The proof is by induction on the integer parameter $n$, where $n$ is the total number of VerifyConflict function calls made by Algorithm 3 for $q$. Among these calls, the first is made by Algorithm 3 and the others are recursive calls.

[Basis ($n = 1$):] In this case RestrictSet($v$) $= \emptyset$. If AssignableSet($v$) is $\emptyset$, Algo. 3 correctly reports the conflict as spurious; otherwise Algo. 3 reports the conflict as not spurious, which is correct.

[Inductive Step:] We assume that the spuriousness of a conflict reported in $n$ steps for the queries with ontological conflict is correct. We now prove that the spuriousness of a conflict reported in $(n + 1)$ steps is correct. Consider the VerifyConflict function call in Algo. 3; without loss of generality, we can take this call to be the $(n + 1)$th function call (in the order in which the function calls return). The other calls are therefore recursive calls made by VerifyConflict to itself. The following two cases are possible.

a. The conflict is detected as spurious by some call which is not the $(n + 1)$th call. In this case the spuriousness of the conflict is correct by the inductive hypothesis.

b. The conflict is detected as spurious at the $(n + 1)$th call to VerifyConflict. All previous calls to VerifyConflict add a table to $H_t$, and the set of tables is kept in $S_t$. After that, the function GenerateAssignableSet is called to compute the assignable set for the set of previously instantiated variables $v$ in the query $q$ with ontological conflict. It follows from the description of the function that it restricts the set of valuations of $v$ by taking the natural join with the valuations of the variables in RestrictSet($v$). Since the conflict is not detected as spurious on the variables in RestrictSet($v$), when the function detects the conflict as spurious, the condition $\delta = \emptyset$ is true. Therefore $q$ is not reachable from the start state of the protocol. □

Theorem 4. [Completeness] If there is a spurious conflict on the set of variables $v'$, where $v' = v \cup \mathit{RestrictSet}(v)$ and $v$ is the previously instantiated variable set specified in a query $q$ of protocol $P$ with ontological conflict, the algorithm reports it.

Proof. We establish the contrapositive of the statement, i.e. if Algorithm 3 reports the ontological conflict as not spurious, then $q$ is reachable from the start state of $P$. Suppose $v' = \{\phi_1, \ldots, \phi_n\}$. Let the valuations of the variables in $v'$ be $(val_1, \ldots, val_n)$ when the conflict in $q$ is not spurious. In this case the conflict may occur in the following way. Consider the VerifyConflict function calls made to determine the spuriousness of the ontological conflict in $q$, among which the first call is made by Algo. 3 and the subsequent calls are recursive calls. The conflict is detected as not spurious only if all the recursive calls to VerifyConflict add a table to $H_t$ and the set of tables is kept in $S_t$. Since the conflict is determined to be not spurious, $\delta$ is not empty. Therefore in $P$, $q$ is reachable from the start state of the protocol using any instantiation of variables belonging to $\delta$. □

Different aspects of web service interaction have been an active area of research.
However, most of this research considers the interaction at the syntactic level. Foster et al. addressed the compatibility verification of web services in [11]. They adopted a model-based approach for checking the compatibility of web services at different levels of abstraction. However, the semantics of the exchanged data is not addressed there. In [12] the researchers address interaction among web services that is asynchronous in nature and propose a design pattern to help the development of composite web services based on asynchronous interaction. Zhao et al. provide a formal treatment of web service choreography in [13]. They define a formal model of WS-CDL and propose a methodology to formally verify the correctness of a choreography using the model checker SPIN. In [14] the authors proposed a formalism for specifying web service interfaces. They discuss three kinds of constraints that a web service interface can impose. The propositional constraints are imposed by an interface by specifying the methods that can be invoked by the clients along with the constraints on the input and output parameters (signature constraints). Protocol constraints specify the temporal requirements on the sequence of method invocations. An algorithm is proposed to check compatibility among the web services based on these constraints. However, all the proposed verification strategies work at a syntactic level, without considering the semantics of the exchanged data.

On the other hand, current research in the semantic web is focused on standardizing the ontologies used by web services, with a vision of computers becoming capable of analyzing all web data. Semantic matchmaking [15, 1] and discovery of semantic web services [16, 17, 18] are two important research directions in the semantic web. The underlying objective of these approaches is to compare facts belonging to different ontologies and to evaluate their compatibility. Standards like RDF, OWL, WSML etc. are developed for this purpose.

Ontology plays an important role in enhancing the integration and interoperability of semantic web services. A significant amount of research has been done towards formalizing the notion of conflict between two ontologies. In [6], the authors present a detailed classification of conflicts by distinguishing between conceptualization and explication mismatches. In [19] the authors further generalize the notion of conflicts and classify semantic mismatches into language level mismatches and ontology level mismatches. Ontology level mismatches are then further classified into conceptualization mismatches and explication mismatches. Further research in the same direction [20] adds a few new types of conceptualization mismatches. The researchers in [21] present alternative types of conflicts that are primarily relevant to OWL based ontologies. However, the primary focus of these works is the interoperability between two ontologies, rather than the correctness of the protocol for information exchange with respect to the interpretation.

Ontology mapping primarily focuses on combining multiple heterogeneous ontologies. In [22] the authors address the problem of specifying a mapping between a global ontology and a set of local ontologies. In [23] the authors discuss establishing a mapping between local ontologies. In [24] the problem of ontology alignment and automatic merging is addressed.

A significant amount of research has been done on the development of protocols. In [25] the researchers proposed a methodology for developing protocols in a multi agent environment. They extend propositional dynamic logic to formally specify the protocol and also use an extension of statecharts for visual representation. In [26] a step by step procedure is presented for the development of web service interaction protocols from the problem definition to the final specification. However, these approaches are focused on the development of protocols for multi agent environments. The semantics of the exchanged data is not addressed in these works.

The problem of checking compatibility between two ontologies with respect to a protocol is new, and to the best of our knowledge there is no prior work on this topic.

In this paper we addressed the problem of detecting the presence of semantic mismatch where the data exchange between two ontologies is defined in terms of a protocol. We believe that the proposed methodology will be very helpful for the integration of web services that are developed independently. Moreover, the future of internet applications lies in exchanging knowledge, where semantic conflict will be a major issue.

Bibliography

(5-6) (2002) 265–273
[9] Ghosh, P., Dasgupta, P.: A formal method for detecting semantic conflicts in protocols between services with different ontologies. In Meghanathan, N., Boumerdassi, S., Chaki, N., Nagamalai, D., eds.: Recent Trends in Networks and Communications. Volume 90 of Communications in Computer and Information Science. Springer Berlin Heidelberg (2010) 553–562
[10] OAEI Benchmark: http://oaei.ontologymatching.org/2009/benchmarks/
[11] Foster, H., Uchitel, S., Magee, J., Kramer, J.: Compatibility verification for web service choreography. In: ICWS. (2004) 738–741
[12] Betin-Can, A., Bultan, T., Fu, X.: Design for verification for asynchronously communicating web services. In: WWW. (2005) 750–759
[13] Zhao, X., Yang, H., Qiu, Z.: Towards the formal model and verification of web service choreography description language. In: WS-FM. (2006) 273–287
[14] Beyer, D., Chakrabarti, A., Henzinger, T.A.: Web service interfaces. In: WWW. (2005) 148–159
[15] Guo, R., Le, J., Xia, X.: Capability matching of web services based on OWL-S. In: DEXA Workshops. (2005) 653–657
[16] Pathak, J., Koul, N., Caragea, D., Honavar, V.: A framework for semantic web services discovery. In: WIDM. (2005) 45–50
[17] Klusch, M., Fries, B., Sycara, K.P.: Automated semantic web service discovery with OWLS-MX. In: AAMAS. (2006) 915–922
[18] Vu, L.H., Hauswirth, M., Aberer, K.: Towards P2P-based semantic web service discovery with QoS support. In: Business Process Management Workshops. (2005) 18–31
[19] Klein, M.: Combining and relating ontologies: an analysis of problems and solutions. In: Workshop on Ontologies and Information Sharing, IJCAI'01, Seattle, USA (2001)
[20] Qadir, M.A., Fahad, M., Noshairwan, M.W.: On conceptualization mismatches between ontologies. In: GrC. (2007) 275–278
[21] Li, C., Ling, T.W.: OWL-based semantic conflicts detection and resolution for data interoperability. In: ER (Workshops). (2004) 266–277
[22] Calvanese, D., Giacomo, G.D., Lenzerini, M.: A framework for ontology integration, IOS Press (2001) 303–316
[23] Madhavan, J., Bernstein, P.A., Domingos, P., Halevy, A.Y.: Representing and reasoning about mappings between domain models. (2002) 80–86
[24] Noy, N.F., Musen, M.A.: Anchor-PROMPT: Using non-local context for semantic matching. In: Proceedings of the Workshop on Ontologies and Information Sharing at the International Joint Conference on Artificial Intelligence (IJCAI). (2001) 63–70
[25] Paurobally, S., Cunningham, J.: Developing agent interaction protocols using graphical and logical methodologies. In: PROMAS, volume 3067 of LNCS, Springer (2003) 149–168
[26] Oluyomi, A., Sterling, L.: A dedicated approach for developing agent interaction protocols. In: PRIMA. (2004) 162–177