Semantic Jira - Semantic Expert Finder in the Bug Tracking Tool Jira
Velten Heyn and Adrian Paschke
Corporate Semantic Web, Institute of Computer Science, Koenigin-Luise-Str. 24, 14195 Berlin, Germany
Abstract.
The semantic expert recommender extension for the Jira bug tracking system semantically searches for similar tickets in Jira and recommends experts and links to existing organizational (Wiki) knowledge for each ticket. This helps to avoid redundant work and supports the search for and collaboration with experts in the project management and maintenance phase, based on semantically enriched tickets in Jira.
Keywords:
Bug Tracking, Semantic Expert Finder, Semantic Web
There is a huge economic potential in the use of Corporate Semantic Web tools in software engineering and project management. Bug tracking systems, such as Jira, can benefit from such semantic support. With the information about software bugs, issues, and project tasks accumulated by a bug tracking system, and with integrated semantic techniques for transforming this information into meaningful knowledge, it becomes possible to automatically support employees in their daily tasks.

If they need help with a particular task, they can browse through similar tickets in the bug tracking system and reuse the documented solutions to solve their problem. However, as the information available in such an enterprise information system grows, it becomes harder to find the relevant tickets.

In this paper we propose a semantic extension to Jira in order to overcome this problem. This Semantic Jira supports semantic search on the knowledge documented in Jira. Similar tickets are semantically inferred by the system and ranked by different metrics (tf-idf, freshness). Furthermore, the most active employees of the found similar tickets are extracted, ranked, and proposed as knowledge experts. The underlying rationale for this approach is that they are very likely experts in the field of knowledge to which the ticket belongs. Asking experts for help has the advantage that the experts can communicate their knowledge directly and can abstract away unnecessary details if the colleague seeking help is not familiar with the domain.

Our expert recommender approach differs from typical existing solutions, which are expert finder search tools. The drawback of search by users is that employees and project managers need to use the right search terms to describe the required skills of an expert for a given problem (bug, issue). This is a non-trivial problem, in particular for non-IT persons.
In our recommender approach an expert is recommended by a recommendation system, which uses the reported bug to infer the knowledge field and automatically find the experts in this field. These experts are recommended in the ticket by the Semantic Jira system together with information about similar tasks or tickets (documenting existing problem solutions) and matching Wiki/Wikipedia articles which document existing enterprise knowledge.

The main research questions in the proposed solution are:
1. How to find similar tickets in an automated way? What defines similarity?
2. How do we find possible experts and which information sources should we use for this?
3. Which possibilities do we have to request articles from enterprise Wiki systems which document relevant knowledge?

The benefit of the proposed solution, based on a Semantic Jira approach, is that it increases the visibility of implicit and explicit knowledge in the company and that it links this knowledge to the ongoing activities of employees. This helps to avoid redundant work in a company and it helps to optimise the distribution of work on the general resources of employees.

The remainder of the paper is structured as follows: Section 2 describes the related work. Section 3 introduces the conceptual solution, a Semantic Jira bug tracking system. Section 4 gives more details about the implementation of this system. The evaluation in section 5 is based on a user study performed in a company. Finally, section 6 concludes this work.

“Who Knows about That Bug? Automatic Bug Report Assignment with a Vocabulary-Based Developer Expertise Model” [1] from Dominique Urs Philipp Matter is the largest project we found in our state-of-the-art analysis. Their Expert Recommender uses source code analysis to find the most applicable expert for a given bug tracking item.
The main drawback of this approach is that it needs half a year of active contribution to the source code from a developer before it can make proper recommendations. Their use case is to assign developers to new bug tracking items. The fact that they primarily consider source code for expert triage makes the system unusable for non-IT users. That is why we decided to stick to the data available directly in the bug tracking system.

“Expertise Recommender: A Flexible Recommendation System and Architecture” [2] from David W. McDonald and Mark S. Ackerman is another paper which makes use of the change history of source code. A (proprietary) “Tech Support” database is used in their field study, which was done in a company. With their approach it is only feasible to locate experts within the IT department. The paper further focuses on the architectural aspects of creating a reusable system which can take many different algorithms into account.

“Expert Recommender Systems in Practice: Evaluating Semi-automatic Profile Generation” [3] from Tim Reichling and Volker Wulf uses another source to tackle this problem. They use a client program which examines the documents within a folder, including its subfolders, that was selected by the user. This program sends the resulting word statistics to the server and compares them with other statistics. Using emails as a data source was also considered but discarded because of privacy protection issues.

“Using Domain Ontologies for Finding Experts in Corporate Wikis” [4] from Ralph Schäfermeier and Adrian Paschke takes the Wiki entries of users as source. They use the SEOntology to infer whether the users' Wiki contributions use an expert's vocabulary. They also include an author reputation metric to gain more precision.

All these previous works have the problem that they are not sufficiently integrated into the task workflow of an employee, which is directly managed and supported by a bug tracking and project management system such as Jira.
Expertise represents the implicit knowledge and the competencies of an expert. The proposed conceptual solution allows inferring expertise from an expertise model which considers the knowledge fields of a ticket saved inside typical bug tracking systems such as Jira.

Typical roles involved in a ticket for such a model are:
– the ticket reporter / creator
– the ticket assignee / solver
– the ticket follower

Before the actual opening of a ticket and the specification of the involved roles, the ticket creator has to create the ticket in a relatively complex decision process as shown in figure 1.

Fig. 1. Process of Ticket Creation

Ticket Reporter
The creator needs to have enough information and knowledge to distinguish whether the occurred problem was a problem of misuse (which could lead to a feature request) or a bug in the used system. If the creator can distinguish the problem, it either means that he can act as an expert for the reported part of the system or that he might know somebody who can act as contact person. That means, the more tickets a particular person has for the same area of work, the more likely it is that he has knowledge either about the distribution of tasks and skills or about the integration within the system.
Ticket Assignee
The ticket reporter and the assignee have different kinds of knowledge in a specific area. During the process of problem solving the assignee creates new artifacts (e.g. source code) and, if it is not repetitive work, new knowledge, especially about specific details of the implementation. Thus, the more tickets a person solves in a particular area, the more profound knowledge he should have.

It is likely that the words used within the description of a ticket capture the knowledge required to solve new tickets. That means, to find similar tickets and experts who can possibly help, it is required to classify the text in order to discover the knowledge field addressed by the ticket.
We apply a simple statistical tf-idf measure to get the relevancy of the words within the ticket description. After taking the k most relevant words of the ticket (where k is normally a value between 5 and 20) it is possible to search for other tickets which contain similar relevant words. The score for each ticket is then

$$\mathrm{similarity}(s, d) = \frac{\sum_{w \in W} \mathrm{tw}(w, s) \cdot \mathrm{tw}(w, d)}{\sqrt{\sum_{w \in W} \mathrm{tw}(w, s)^2} \cdot \sqrt{\sum_{w \in W} \mathrm{tw}(w, d)^2}}$$

(where $W$ is the set of relevant words, $\mathrm{tw}$ is the tf-idf weighting function, $s$ is the source ticket and $d$ the ticket whose similarity is to be determined). The numerator is the sum of the products of the weights from both tickets. To normalise the results if the tickets contain different amounts of words, the numerator is divided by the second norm of the weights of each ticket. If you look at $W$ as a vector you can transform the formula above to

$$\mathrm{similarity}(s, d) = \frac{\mathrm{tW}(W, s) \cdot \mathrm{tW}(W, d)}{|\mathrm{tW}(W, s)| \cdot |\mathrm{tW}(W, d)|}$$

(where $\mathrm{tW}$ takes the vector of the words $W$ and weights them with the tf-idf measure with regard to the ticket given in the second argument). We can either apply a linear scoring or an inverted exponential score for the retrieval and ranking of similar tickets to a given new ticket.

We further recommend tickets which are semantically similar according to a domain ontology representing the expert vocabulary used in the ticket domain. The underlying rationale for this approach is that experts typically use a topic-specific vocabulary which is modelled by the domain ontology. We apply a taxonomic ontology matcher which identifies the tickets' terms as resources from the expert taxonomy. The matcher computes the similarity between two concepts $c_1$ and $c_2$ based on the distance $d_c(c_1, c_2)$ between them, which reflects their respective position in the concept hierarchy. The matcher is able to handle multiple inheritance of concepts at the leaf level of a taxonomy.

The concept similarity is defined as:

$$\mathrm{sim}(c_1, c_2) = 1 - d_c(c_1, c_2)$$

Given the closest common parent (ccp) of the two concepts, the distance is calculated as:

$$d_c(c_1, c_2) = d_c(c_1, ccp) + d_c(c_2, ccp)$$
$$d_c(c, ccp) = \mathrm{milestone}(ccp) - \mathrm{milestone}(c)$$

The milestone values of concepts in a taxonomy are calculated either
– with a linear milestone decrease $\mathrm{milestone}(n) = 1 - [l(n)/l(N)]$, where $l(n)$ is the depth of the node $n$ in the taxonomic hierarchy and $l(N)$ is the deepest hierarchy level, or
– with an exponential milestone decrease $\mathrm{milestone}(n) = 0.5 / k^{l(n)}$, where $k$ is a factor greater than 1 indicating the rate at which the milestone values decrease along the hierarchy.

After the retrieval and ranking of (statistically and semantically) similar tickets, in the next step the experts are identified. In general, the expertise score is a function $a \times t \rightarrow e$ for a given author $a$ and a topic $t$. The set of potential experts consists of the persons involved in these tickets either as creator, solver, or follower. For each person an expert score is calculated as the accumulated similarity measure (linear or inverted exponential). We distinguish two dimensions, the organisation score and the developer score, and calculate the overall expert score $e = e_o + e_d$.

Additionally, we retrieve relevant Wiki pages from an Enterprise Wiki documenting expert knowledge. We assume that an individual who contributes content relevant to a specific ticket topic to such a Wiki has expertise in this topic. We calculate a simple contribution-based expertise score as follows:

$$\mathrm{expertise}_{simple}(a, t) := \sum_{s \in S_{a,t}} \mathrm{weight}_s(\mathrm{level}(s)) + \sum_{w \in W_{a,t}} \mathrm{weight}_w(w) \qquad (1)$$

where by $S_{a,t}$ we denote the set of sections that cover the topic $t$ and under which author $a$ has contributed content. $t$ is again identified as a concept in the tickets' domain ontology. $\mathrm{weight}_s$ is a weighting factor depending on the section level. We used a simple milestone metric in order to express the relevance of a section according to its level.
The underlying assumption is that a section with a higher level is about a more general topic, and contributions to a highly specialized topic should be reflected with a higher weighting in the expertise score. Accordingly, by $W_{a,t}$ we denote all occurrences of terms contributed by author $a$ that can be mapped to ontology topic $t$, weighted by a relevance function $\mathrm{weight}_w$.

We further semantically consolidate the expertise score by considering authors' Wiki contributions about topics which are semantically similar to the topic of the addressed ticket. We utilize the class hierarchy established by the owl:subClassOf OWL property and other selectable subtypes of owl:ObjectProperty in order to capture concept relatedness.
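
The statistical part of the approach described above, the tf-idf based ticket similarity and the accumulation of similarity values into an expert score, can be illustrated with a minimal Python sketch. It is not the plugin's implementation: the function names are hypothetical, the tf-idf variant (relative term frequency times log(N/df)) is an assumption because the text does not fix one, and only the linear accumulation of scores is shown.

```python
import math
from collections import Counter, defaultdict


def tf_idf_weights(tickets):
    """Map each ticket id to {word: tf-idf weight}.

    `tickets` maps a ticket id to the list of words of its description.
    Assumption: tf is the relative term frequency, idf is log(N / df).
    """
    n = len(tickets)
    df = Counter(w for words in tickets.values() for w in set(words))
    weights = {}
    for tid, words in tickets.items():
        tf = Counter(words)
        weights[tid] = {
            w: (count / len(words)) * math.log(n / df[w])
            for w, count in tf.items()
        }
    return weights


def top_k_words(ticket_weights, k):
    """Return the k words with the highest tf-idf weight (the set W)."""
    ranked = sorted(ticket_weights, key=ticket_weights.get, reverse=True)
    return set(ranked[:k])


def similarity(ws, wd, relevant):
    """Cosine similarity of two tickets restricted to the relevant words."""
    dot = sum(ws.get(w, 0.0) * wd.get(w, 0.0) for w in relevant)
    norm_s = math.sqrt(sum(ws.get(w, 0.0) ** 2 for w in relevant))
    norm_d = math.sqrt(sum(wd.get(w, 0.0) ** 2 for w in relevant))
    if norm_s == 0.0 or norm_d == 0.0:
        return 0.0
    return dot / (norm_s * norm_d)


def rank_experts(source_id, tickets, people, k=10):
    """Accumulate ticket similarities per involved person (linear scoring).

    `people` maps a ticket id to the persons involved in it
    (creator, solver, follower).
    """
    weights = tf_idf_weights(tickets)
    relevant = top_k_words(weights[source_id], k)
    scores = defaultdict(float)
    for tid in tickets:
        if tid == source_id:
            continue
        sim = similarity(weights[source_id], weights[tid], relevant)
        for person in people.get(tid, ()):
            scores[person] += sim
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Restricting the sums to the k most relevant words of the source ticket mirrors the configurable number of relevant words; a production system would delegate the weighting and retrieval to the search index rather than recompute it in memory.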
[Figure 2: excerpt of the Wiki page "BIRT/FAQ/Data Access" on ODBC/JDBC drivers, annotated with ontology concepts (Database System, Relational Database System, Oracle, MySQL, Database API, ODBC, JDBC, Programming Language, Object Oriented Language, Java, C++, Objective C) and ef:related_to relations.]
Fig. 2.
Weighting of Detected and Related Features

Based on these relations, we calculate a relevance score using ontology-based similarity measures until a defined threshold is reached (see figure 2). All concepts with a similarity value higher than this threshold are considered similar and added to the feature set, each weighted by its similarity value. This yields a consolidated expertise score which is calculated as follows:

$$\mathrm{expertise}(a, t) := \sum_{t_{sim} \in T,\ \mathrm{sim}(t, t_{sim}) \geq sim_{min}} \mathrm{expertise}_{simple}(a, t_{sim}) \cdot \mathrm{sim}(t, t_{sim}) \qquad (2)$$

Even if topic $t$ is never referenced by author $a$, but neighbouring topics $t_{sim}$ with a similarity to $t$ greater than the threshold $sim_{min}$ are, $t$ benefits from $a$'s expertise in each topic $t_{sim}$. The more similar $t$ and $t_{sim}$ are, the higher the benefit for $t$. This score can be further adjusted by additionally considering the author's reputation, based on e.g. the revision history, stability, and lifetime of Wiki entries [4].

The developed plugin for Jira delivers the following basic features:
– Automatic indexing of all tickets with usage of
  • An adapter for the connection between the index and the Jira ticket data structure
  • Automatic translation for ensuring a consistent language basis
– Recommendation of alternative tickets which are similar to the current ticket
  • Calculation of different rankings (administrational and development score)
  • Recommendation of experts
– Integration of matching Wiki articles and adapted expert score from the search engine and expert finder
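
The milestone-based concept similarity and the consolidated expertise score of equation (2) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the SemF matcher: it assumes a single-inheritance taxonomy given as a child-to-parent map, uses the exponential milestone variant, and all function names and the example taxonomy in the usage note are hypothetical.

```python
def depth(concept, parent):
    """Number of edges between the concept and the taxonomy root."""
    level = 0
    while parent[concept] is not None:
        concept = parent[concept]
        level += 1
    return level


def milestone(concept, parent, k=2.0):
    """Exponential milestone decrease: 0.5 / k ** depth, with k > 1."""
    return 0.5 / k ** depth(concept, parent)


def ancestors(concept, parent):
    """The concept itself followed by all its ancestors up to the root."""
    chain = [concept]
    while parent[concept] is not None:
        concept = parent[concept]
        chain.append(concept)
    return chain


def closest_common_parent(c1, c2, parent):
    """First concept on the path from c2 to the root that also covers c1."""
    seen = set(ancestors(c1, parent))
    for candidate in ancestors(c2, parent):
        if candidate in seen:
            return candidate
    raise ValueError("concepts do not share a common root")


def concept_similarity(c1, c2, parent, k=2.0):
    """sim(c1, c2) = 1 - d_c(c1, c2), with distances via the ccp."""
    ccp = closest_common_parent(c1, c2, parent)
    m_ccp = milestone(ccp, parent, k)
    distance = (m_ccp - milestone(c1, parent, k)) \
        + (m_ccp - milestone(c2, parent, k))
    return 1.0 - distance


def consolidated_expertise(simple_scores, topic, parent, sim_min, k=2.0):
    """Equation (2): similarity-weighted sum over sufficiently similar topics.

    `simple_scores` maps each topic to expertise_simple(a, topic) of one
    author; topics the author never touched contribute nothing and can
    therefore be omitted from the map.
    """
    total = 0.0
    for t_sim, score in simple_scores.items():
        sim = concept_similarity(topic, t_sim, parent, k)
        if sim >= sim_min:
            total += score * sim
    return total
```

For example, with a hypothetical taxonomy in which Oracle and MySQL are children of Relational Database System, an author who only contributed to MySQL sections still receives a non-zero consolidated score for an Oracle ticket, while contributions to a distant concept such as Java fall below the threshold and are ignored.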
Fig. 3.
Overview of the interaction between the different systems

As indexing and information retrieval platform, Lucene with its efficient indexing engine is used. On top, a Solr server manages the Lucene index, supplies Lucene with data, and retrieves the data. Solr provides a REST connector, over which all established (CRUD) operations are realized.

For the semantic similarity computation we use the CSW Semantic Similarity Matchmaking Framework (SemF). The framework allows taxonomic and non-taxonomic concept matching techniques to be applied to selected object properties.

The MediaWiki software does have its own search engine for finding articles, but its features are limited. It is not possible to search for more than one term at once; otherwise all given words have to appear in the returned articles. With the extension LuceneSearch it is possible to make more complex queries and to obtain a ranking for the returned articles. To calculate the semantic expert score from the Wiki contributions we adapt and integrate the Wiki expert finder described in [4].

Figure 3 shows the interaction between the different systems and components. Jira is at the core of the system. From Jira the EventListener is activated to handle all operations on a ticket and to send them to the Solr instance. The ExpertFinderContextProvider is triggered when a user requests a ticket details page; it gathers all necessary information from Solr, processes it and fills the view.

For the evaluation, employees from the IT department of a German mid-sized company with around 60 employees were asked to give estimates for the best fitting experts for a self-chosen ticket they had already worked on themselves. Afterwards all experts for the chosen tickets were collected from the system. Altogether 32 tickets have been evaluated, 98 experts have been given and 267 experts were proposed by the system. The results are shown in 5 and 4.
Table 1.
System evaluation
Fig. 4.
Evaluation Precision-Recall diagram

As described in section 3.1 it is possible to configure how many words should be taken into account as relevant for the similarity measurement. This has a direct impact on the quality of the results as shown in figure 5.

The Wiki extension has been tested and evaluated using the project Wiki of the Eclipse Foundation (http://wiki.eclipse.org/Main_Page) and the Software Engineering Ontology (SEOntology) [5].
Fig. 5.
Best parameter for the amount of relevant words / terms
The recommender already reaches relatively good results. Future work might additionally consider multi-lingual translations and wordnets together with larger background knowledge coming, e.g., from linked open data sources such as DBpedia. In future work a comparison on the basis of the bug tracking data from the Eclipse project, which was used by some of the related works, would be useful. A problem to solve is the import of this data into a Jira bug tracking instance. Furthermore, much other information is stored or available in the bug tracking system which could be used to optimise the results, e.g. the history of tickets (the change of assignees, the change of the status or of the resolution).
This work has been partially supported by the InnoProfile project “Corporate Semantic Web” funded by the German Federal Ministry of Education and Research (BMBF).
References
1. D. Matter, A. Kuhn, and O. Nierstrasz. Assigning bug reports using a vocabulary-based expertise model of developers. In Mining Software Repositories, 2009. MSR '09. 6th IEEE International Working Conference on, pages 131–140. IEEE, 2009.
2. David W. McDonald and Mark S. Ackerman. Expertise recommender: a flexible recommendation system and architecture. In Proceedings of the 2000 ACM conference on Computer supported cooperative work, CSCW '00, pages 231–240, New York, NY, USA, 2000. ACM.
3. Tim Reichling and Volker Wulf. Expert recommender systems in practice: evaluating semi-automatic profile generation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '09, pages 59–68, New York, NY, USA, 2009. ACM.
4. Ralph Schäfermeier and Adrian Paschke. Using domain ontologies for finding experts in corporate wikis. In Proceedings of the 7th International Conference on Semantic Systems, I-Semantics '11, pages 63–70, New York, NY, USA, 2011. ACM.
5. P. Wongthongtham, E. Chang, and T. Dillon. Software Design Process Ontology Development. In