COBI-GRINE: A Tool for Visualization and Advanced Evaluation of Communities in Mass Channel Similarity Graphs
Karsten Wüllems, Daniel Göbel, Annika Zurowietz, Hanna Bednarz, Karsten Niehaus, Tim W. Nattkemper
A Preprint
Karsten Wüllems
International DFG Research Training Group GRK 1906, Biodata Mining Group, Faculty of Technology, Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
Daniel Göbel
Biodata Mining Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
Annika Zurowietz
Proteome and Metabolome Research, Faculty of Biology, Bielefeld University, Bielefeld, Germany
Hanna Bednarz
Proteome and Metabolome Research, Faculty of Biology, Bielefeld University, Bielefeld, Germany
Karsten Niehaus
Proteome and Metabolome Research, Faculty of Biology, Bielefeld University, Bielefeld, Germany
Tim W. Nattkemper
Biodata Mining Group, Faculty of Technology, Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
September 25, 2020
Abstract
The detection of groups of molecules that co-localize with histopathological patterns or sub-structures is an important step in combining the rich high-dimensional content of mass spectrometry imaging (MSI) with classic histopathological staining. Here we present the evolution of GRINE to COBI-GRINE, an interactive web tool that maps MSI data onto a graph structure to detect communities of molecules with similar lateral distributions and co-visualizes the communities with Hematoxylin and Eosin (HE) stained images. The tool thereby enables biologists and pathologists to examine the MSI image graph in a target-oriented manner and links molecular co-localization to pathology. Another feature is the manual optimization of cluster results with the assistance of graph statistics in order to improve the community results. As the graphs can become very complex, these statistics provide heuristics that support and accelerate the detection of sub-clusters and misclusterings. This kind of edited cluster optimization allows the integration of expert background knowledge into the clustering result and a more precise analysis of links between molecular co-localization and pathology.
Keywords
MALDI imaging, Networks, Clustering, Community detection, Visualization, Graphs
Cluster analysis is one of the most common unsupervised analysis methods used in computer-aided exploration of mass spectrometry imaging (MSI) data. There are two main approaches to cluster MSI data: 1. clustering pixels according to mass spectra similarities or 2. clustering mass channels according to similarities in molecular lateral distribution patterns. The result of the first approach is usually displayed as a segmentation map. The second approach groups mass channels according to their co-localization, expressed by similar distributions. As mass channels usually do not co-localize perfectly and the definition of a distance function for lateral distributions is not straightforward, this clustering result is more ambiguous than segmentation maps. While many works on the pixel clustering approach are available [1][2][3], works on mass channel clustering are less prominent, with GRINE [4] as one example. As the co-localization of mass channels often appears fuzzy and ambiguous, we propose to explore the cluster results in combination with other image modalities, such as optical scans or histologically stained consecutive slides. Here we present the visual exploration tool COBI-GRINE (analysis, cluster optimization and biological interpretation of graph mapped image data networks). COBI-GRINE features new tools to visually explore but also edit co-localizing mass channel communities. One main feature is the option to link and fuse the community visualizations with other modalities, like Hematoxylin and Eosin (HE) stained sections, which paves the way to link groups of co-localizing molecules to distinct tissue substructures. To the best of our knowledge, there is no MSI visualization tool that features these functions.

COBI-GRINE is a web tool developed to be used via localhost. The backend is written in Python using Flask as the web application framework. The frontend is written in JavaScript using Vue as the JavaScript framework and BootstrapVue as the CSS framework. To facilitate its use, COBI-GRINE runs in a Docker container. This way the user only needs Docker and a web browser (Chrome or Firefox are recommended). The source code is available under the GNU GPLv3 license at https://github.com/Kawue/grine-v2/ .

COBI-GRINE visualizes hierarchically organized communities (or clusters) computed from a so-called Mass Channel Similarity Graph (MCSG). An MCSG represents the strengths of the molecular co-localizations in one MSI data set. It contains two types of nodes: 1. mass channel nodes, each representing a single mass channel image, and 2. community nodes, each representing a cluster of mass channel images. Edges can be classified into three categories: 1. mass channel edges, connecting two mass channel nodes, with a length proportional to the co-localization of the respective mass channels, 2. community edges, connecting two different communities, with a length proportional to the mean co-localization of all mass channel edges that run between both communities, and 3. hybrid edges, connecting a mass channel node and a community node, which indicate a connection to a mass channel node hidden within the community node. The weight of hybrid edges is a very small constant, since they are only visual indicators and should not affect the graph layout algorithm. The whole MCSG structure is illustrated in Figure 1 I.
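The edge categories described above can be illustrated with a minimal Python sketch. It is not taken from the COBI-GRINE code base: the function names, the dictionary layout, the m/z labels and the value of `HYBRID_EDGE_WEIGHT` are assumptions chosen for illustration. The sketch only shows how a community edge weight can be derived as the mean of the mass channel edges between two communities, and how hybrid edges receive a small constant weight.

```python
from collections import defaultdict
from statistics import mean

# Assumed value; the text only states that the weight is "a very small constant".
HYBRID_EDGE_WEIGHT = 1e-6


def community_edges(channel_edges, membership):
    """Aggregate mass channel edges into community edges.

    channel_edges: dict mapping (node_u, node_v) -> co-localization weight
    membership:    dict mapping mass channel node -> community id
    Returns a dict mapping (community_a, community_b) -> mean weight of all
    mass channel edges that run between the two communities.
    """
    between = defaultdict(list)
    for (u, v), weight in channel_edges.items():
        cu, cv = membership[u], membership[v]
        if cu != cv:  # edges inside one community do not form a community edge
            between[tuple(sorted((cu, cv)))].append(weight)
    return {pair: mean(weights) for pair, weights in between.items()}


def hybrid_edges(channel_edges, membership, collapsed):
    """Edges between a visible mass channel node and a collapsed community node."""
    result = {}
    for (u, v) in channel_edges:
        cu, cv = membership[u], membership[v]
        if cu == cv:
            continue
        if cu in collapsed and cv not in collapsed:
            result[(v, cu)] = HYBRID_EDGE_WEIGHT
        elif cv in collapsed and cu not in collapsed:
            result[(u, cv)] = HYBRID_EDGE_WEIGHT
    return result


# Toy example with hypothetical m/z labels and weights.
edges = {
    ("mz_100", "mz_101"): 0.9,
    ("mz_100", "mz_200"): 0.4,
    ("mz_101", "mz_200"): 0.6,
}
members = {"mz_100": "X", "mz_101": "X", "mz_200": "Y"}
comm = community_edges(edges, members)
hyb = hybrid_edges(edges, members, collapsed={"X"})
print(comm)
print(hyb)
```

In this toy graph, community X is collapsed while the single node of community Y stays visible, so the two mass channel edges into X are replaced by one hybrid edge from `mz_200` to X.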
Figure 1: I. Outline of the Mass Channel Similarity Graph (MCSG) structure. Nodes can either be mass channel nodes (A) or community nodes (B). Edges can be mass channel edges (a) if they connect two mass channel nodes, community edges (b) if they connect two communities, or hybrid edges (c) if a mass channel node is connected with another mass channel node hidden within a community. The weight of a mass channel edge corresponds to the similarity between the respective mass channel images, while the weight of a community edge corresponds to the mean weight of all edges that run between mass channel nodes of the respective communities. II. The NodeTrix presentation with examples for a cluster that is homogeneous (X), builds subclusters (Y) or is heterogeneous (Z).
A script presented in our previous work [4] performs both graph mapping and community detection and provides all files necessary to run COBI-GRINE. The respective code is available at https://github.com/Kawue/msi-community-detection under the GNU GPLv3 license. For a detailed explanation of the basic interactions and functionalities we refer to our previous work on GRINE [4]. However, the visual design of the tool has been changed significantly to integrate the new functions and to improve usability. An example screenshot overview is shown in Figure 2. In summary, COBI-GRINE offers the following four main innovations to make community detection on MCSGs more accessible and to increase its capabilities for biologists and pathologists:
Figure 2: Overview of the COBI-GRINE interface. (A) shows the graph panel with an expanded community and an active hover annotation on the selected community. (B) shows the image panel, where all images are displayed. (b1) - (b4) are different mass channel images or mass channel image stacks. (b5) is an RGB three component dimension reduction projection as described in our previous work [4]. (b6) shows an optical image. The lasso region selection is available for (b4) and (b6). (C) and (D) are lists that display QGP values and mass channel values (or annotations), respectively. (e1) and (e2) are settings elements.
NodeTrix Representation:
NodeTrix [5] represents the adjacency matrix of a selected subset of graph nodes as a heatmap, visually embedded in the entire graph structure. This is useful to visually explore densely connected subgraphs and provides a very fast overview of the structural features of the selected subset of nodes, i.e. whether it shows overall strong interconnection, overall weak interconnection or strongly interconnected subgroups. Thus the heatmap can reveal whether a subset is homogeneous, heterogeneous or builds subclusters (exemplified in Figure 1 II). This can be of great benefit to assess the quality of a cluster or to detect starting points for cluster modifications.
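The matrix behind such a heatmap can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the tool's implementation: the function names, the edge dictionary layout and the use of the mean off-diagonal weight as a rough homogeneity indicator are all hypothetical.

```python
def adjacency_submatrix(edges, nodes):
    """Return the weighted adjacency matrix (list of rows) for `nodes`.

    edges: dict mapping (u, v) -> weight of an undirected edge
    nodes: ordered list of the selected nodes; row/column i belongs to nodes[i]
    Missing edges are encoded as 0.0.
    """
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for (u, v), weight in edges.items():
        if u in index and v in index:
            matrix[index[u]][index[v]] = weight
            matrix[index[v]][index[u]] = weight  # undirected: keep it symmetric
    return matrix


def mean_off_diagonal(matrix):
    """Mean weight over all ordered node pairs, a rough homogeneity indicator."""
    n = len(matrix)
    if n < 2:
        return 0.0
    total = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


# Hypothetical selection: a and b are strongly linked, c is only weakly attached.
edges = {("a", "b"): 0.9, ("a", "c"): 0.1, ("b", "c"): 0.2, ("c", "d"): 0.8}
matrix = adjacency_submatrix(edges, ["a", "b", "c"])
for row in matrix:
    print(" ".join(f"{value:.1f}" for value in row))
```

Rendering `matrix` as a heatmap would show one bright cell pair for (a, b) and dim cells for c, the pattern of a subset that is not homogeneous. The edge to `d` is ignored because `d` is not part of the selection.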
Image Guided Exploration:
If any aligned image modality is provided, a lasso selection on this modality can be used to select a region of interest. COBI-GRINE will visually indicate all nodes that correspond to mass channel images in which at least a share σ of the pixels within the selected region reaches the minimum intensity µ. Both parameters µ and σ are user-defined percentage thresholds (exemplified in Figure 3).
Manual Cluster Modification:
As mentioned before, the clustering of mass channel images is ambiguous and leads to results that require subsequent edits by the user. A new intuitive interface allows the user to merge and split communities or to change the community assignment of single nodes. The modified MCSG can be exported as a JSON file and used by other researchers. This way the results can be easily stored and shared.
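The merge, split and reassignment operations described above can be sketched as pure functions on a node-to-community mapping. The function names, the community ids and the JSON layout are assumptions made for this example; the actual export format of COBI-GRINE is not specified here.

```python
import json


def merge_communities(membership, source, target):
    """Return a new membership in which every node of `source` belongs to `target`."""
    return {node: (target if comm == source else comm) for node, comm in membership.items()}


def split_community(membership, nodes, new_community):
    """Return a new membership in which `nodes` form the community `new_community`."""
    moved = set(nodes)
    return {node: (new_community if node in moved else comm) for node, comm in membership.items()}


def reassign_node(membership, node, community):
    """Return a new membership with a single node moved to `community`."""
    updated = dict(membership)
    updated[node] = community
    return updated


def export_membership(membership):
    """Serialize the edited assignment; the JSON layout here is hypothetical."""
    return json.dumps({"membership": membership}, indent=2, sort_keys=True)


# Hypothetical mass channel labels and community ids.
membership = {"mz_100": 0, "mz_101": 0, "mz_200": 1, "mz_201": 2}
edited = merge_communities(membership, source=2, target=1)
edited = split_community(edited, ["mz_101"], new_community=3)
edited = reassign_node(edited, "mz_100", 1)
exported = export_membership(edited)
print(exported)
```

Each function returns a new mapping instead of modifying its input, so the original assignment stays available for comparison or undo.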
Quantitative Graph Property Guided Exploration:
To further use the advantages of the graph structure we have adopted a proposal from our previous work and implemented a set of quantitative graph properties (QGPs). These QGPs can be used as a heuristic to identify nodes with special characteristics, like community cores (hubs), potential singletons, wrong assignments and bridges. For a large and complex graph, QGPs can serve as a starting point for further detailed analysis.
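The text does not list the individual properties, so the following Python sketch uses two simple stand-ins to show how such heuristics can work: the total edge weight of a node (strength) and the share of that weight that stays inside its own community. The property names, the threshold of 0.5 and the toy graph are assumptions for illustration, not the QGP set implemented in COBI-GRINE.

```python
from collections import defaultdict


def node_properties(edges, membership):
    """Compute two simple per-node properties on a weighted, undirected graph.

    Returns a dict: node -> {"strength": total edge weight,
                             "intra_ratio": share of that weight inside its own community}.
    """
    strength = defaultdict(float)
    intra = defaultdict(float)
    for (u, v), weight in edges.items():
        for node in (u, v):
            strength[node] += weight
            if membership[u] == membership[v]:
                intra[node] += weight
    return {
        node: {
            "strength": strength[node],
            "intra_ratio": intra[node] / strength[node] if strength[node] else 0.0,
        }
        for node in membership
    }


# Hypothetical graph: d is assigned to community 1 but mostly tied to community 0,
# and f has no edges at all.
edges = {("a", "b"): 0.8, ("a", "c"): 0.7, ("b", "c"): 0.9, ("c", "d"): 0.6, ("d", "e"): 0.1}
membership = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
props = node_properties(edges, membership)

hub = max(props, key=lambda node: props[node]["strength"])
singletons = [n for n, p in props.items() if p["strength"] == 0]
# 0.5 is an arbitrary illustrative cut-off for "mostly connected to other communities".
suspects = [n for n, p in props.items() if p["strength"] > 0 and p["intra_ratio"] < 0.5]
print(hub, singletons, suspects)
```

In this toy graph, the node with the highest strength is a hub candidate, the isolated node is a potential singleton, and the node whose weight mostly leaves its community is a candidate for a wrong assignment or a bridge.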
Figure 3: Outline of the image guided lasso exploration (panels A, B and C; the example uses σ = 0.8 and µ = 0.2).
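The selection rule of the image guided exploration can be sketched as follows. The sketch assumes that intensities are scaled to [0, 1], that µ is compared directly against these scaled values and that σ is the required share of region pixels. These are interpretations for illustration, and the function name and toy images are hypothetical. The thresholds σ = 0.8 and µ = 0.2 are the values shown in Figure 3.

```python
def channel_matches_region(image, region, mu, sigma):
    """Check whether a mass channel image is 'present' in a selected region.

    image:  2D list of intensities, assumed to be scaled to [0, 1]
    region: iterable of (row, col) pixel coordinates inside the lasso
    mu:     minimum intensity a pixel needs to count as a hit
    sigma:  minimum share of region pixels that must be hits
    """
    pixels = list(region)
    if not pixels:
        return False
    hits = sum(1 for row, col in pixels if image[row][col] >= mu)
    return hits / len(pixels) >= sigma


# Two hypothetical 2x3 mass channel images and a lasso covering five pixels.
images = {
    "mz_100": [[0.9, 0.8, 0.0], [0.7, 0.6, 0.5]],
    "mz_200": [[0.9, 0.1, 0.0], [0.3, 0.0, 0.4]],
}
region = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2)]
highlighted = [
    name
    for name, image in images.items()
    if channel_matches_region(image, region, mu=0.2, sigma=0.8)
]
print(highlighted)
```

Here all five region pixels of `mz_100` reach the intensity threshold, while only three of five do for `mz_200`, so only the first node would be indicated in the graph.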
COBI-GRINE is a comprehensive extension of the previously presented GRINE framework. The new features simplify the exploration and evaluation of MCSG communities, enable the integration of expert knowledge into the community detection result by manual modification, and extend the field of application to areas such as digital histopathology by integrating other image modalities.
Availability
Source code is available on GitHub.
Project name: Grine-v2
Project home page: https://github.com/Kawue/grine-v2/
Operating system(s): Platform independent
Programming language: Python
Other requirements: Python 3.5 or higher
License: GNU GPLv3
Acknowledgement
We acknowledge the financial support of KW by the German Research Foundation (DFG) as part of the GRK 1906.
References
[1] Jan Kölling, Daniel Langenkämper, Sylvie Abouna, Michael Khan, and Tim W Nattkemper. Whide - a web tool for visual data mining colocation patterns in multivariate bioimages. Bioinformatics, 28(8):1143–1150, 2012.
[2] Theodore Alexandrov, Michael Becker, Sören-Oliver Deininger, Gunther Ernst, Liane Wehder, Markus Grasmair, Ferdinand von Eggeling, Herbert Thiele, and Peter Maass. Spatial segmentation of imaging mass spectrometry data with edge-preserving image denoising and clustering. Journal of Proteome Research, 9(12):6535–6546, 2010.
[3] Soren-Oliver Deininger, Matthias P Ebert, Arne Futterer, Marc Gerhard, and Christoph Rocken. MALDI imaging combined with hierarchical clustering as a new tool for the interpretation of complex human cancers. Journal of Proteome Research, 7(12):5230–5236, 2008.
[4] Karsten Wüllems, Jan Kölling, Hanna Bednarz, Karsten Niehaus, Volkmar H Hans, and Tim W Nattkemper. Detection and visualization of communities in mass spectrometry imaging data. BMC Bioinformatics, 20(1):303, 2019.
[5] Nathalie Henry, Jean-Daniel Fekete, and Michael J McGuffin. NodeTrix: a hybrid visualization of social networks.