Network clustering is an important problem that has recently drawn a lot of attention. [...] methods. First, it is able to detect associations between clusters from different domains, which is not addressed by any existing method. Second, it achieves more consistent clustering results on multiple networks by leveraging the relationship between clustering individual networks and inferring cross-network cluster alignment. Finally, it provides a multi-network clustering solution that is more robust to noise and errors. We perform extensive experiments on a variety of real and synthetic networks to demonstrate the effectiveness and efficiency of MCA.

I. Introduction

Networks (or graphs) are widely used to represent relationships between instances: each node corresponds to an instance, and each edge depicts the relationship between a pair of instances. Network clustering (or graph clustering) [1]–[3] has become an effective means of discovering modules formed by closely related instances in such networks, which may in turn reveal the functional structure of the networks. Recently, attention has moved from clustering a single homogeneous network (built on instances from one domain) to joint clustering of multiple heterogeneous networks (from different but related domains). The reason is that integrating information from different but related domains may not only help resolve ambiguity and inconsistency in the clustering outcome but also discover and leverage strong associations between clusters from different domains. Consequently, these multi-view network clustering methods [3], [4] are able to substantially improve clustering accuracy.

For example, millions of genetic variants on the human genome have been reported to be disease related, most of which are in the form of single nucleotide polymorphisms (SNPs). These SNPs do not function independently. Instead, a set of SNPs may play joint roles in a disease.
Such interactions between SNPs can be modeled by a SNP interaction network. Fig. 1 (left) shows an exemplar SNP interaction network of 17 SNPs, in which nodes are SNPs and weighted edges represent interactions between SNPs. Even though the underlying biological processes are complex and only partially understood, it is well established that SNPs may alter the expression levels of related genes, which may in turn have a cascading effect on other genes, e.g., in the same biological pathways [5]. The interactions between genes can be measured by correlations of gene expression and represented by a gene interaction network. Fig. 1 (right) shows an exemplar gene interaction network of 20 genes, in which nodes are genes and weighted edges represent interactions between genes. These two networks are heavily related because of the (complicated) relationships between SNPs and genes, as demonstrated in many expression quantitative trait loci (eQTL) studies. These cross-domain relationships are represented by dotted edges between SNPs and genes in Fig. 1. The strength of each relationship is encoded by the edge weight. It is evident that a joint analysis becomes essential in these related domains.

Fig. 1. An exemplar SNP interaction network and gene interaction network in an eQTL study.

Despite the success of previous approaches to network clustering, they still suffer from two common limitations. First, existing methods usually assume that the information collected in different domains is for the same set of instances, so the cross-domain instance relationships are strict one-to-one correspondences. This assumption may not hold in many applications. More often than not, data instances (e.g., SNPs) in one domain may be related to multiple instances (e.g., genes) in another domain. Methods that can account for many-to-many cross-domain relationships are needed [6].
Second, existing approaches tend to focus on network clustering alone and ignore any associations that may exist between clusters from different domains. However, an “alignment” between clusters from multiple domains may provide a more comprehensive depiction of the whole system. For example, a cluster of SNPs may jointly regulate the expression of a cluster of genes, which may be revealed by cluster-level associations. Fig. 1 shows two SNP clusters, A (including SNPs {1, 2, 3, 4 [...]}) and B (including SNPs {12, 13, 14, 16 [...]}), and three gene clusters: C (including genes { [...] between clustering in individual networks and inferring cross-network cluster alignment enables mutual reinforcement when both tasks [...]
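The data described in this section (two weighted interaction networks, many-to-many cross-domain edges, and cluster-level associations) can be made concrete with a small sketch. The following Python example is illustrative only and is not the MCA algorithm: the node names, weights, cluster labels, and the helper functions `make_network`, `is_one_to_one`, and `cluster_association` are all hypothetical, and summing cross-domain edge weights is just one simple, assumed way to score an association between two clusters.

```python
from collections import Counter, defaultdict


def make_network(weighted_edges):
    """Build a symmetric adjacency map {node: {neighbor: weight}}."""
    adj = defaultdict(dict)
    for u, v, w in weighted_edges:
        adj[u][v] = w
        adj[v][u] = w
    return dict(adj)


def is_one_to_one(cross_edges):
    """True only if every instance has exactly one cross-domain partner."""
    snp_degree = Counter(s for s, _ in cross_edges)
    gene_degree = Counter(g for _, g in cross_edges)
    return (all(d == 1 for d in snp_degree.values())
            and all(d == 1 for d in gene_degree.values()))


def cluster_association(cross_edges, snp_cluster, gene_cluster):
    """Sum cross-domain edge weights per (SNP cluster, gene cluster) pair.

    Assumption: a plain weight sum stands in for a cluster-level
    association score; the source does not specify a scoring rule here.
    """
    assoc = defaultdict(float)
    for (snp, gene), weight in cross_edges.items():
        assoc[(snp_cluster[snp], gene_cluster[gene])] += weight
    return dict(assoc)


# Hypothetical toy data (not the networks shown in Fig. 1).
snp_net = make_network([("s1", "s2", 0.9), ("s2", "s3", 0.1)])
gene_net = make_network([("g1", "g2", 0.8), ("g2", "g3", 0.2)])

# Cross-domain edges: (SNP, gene) -> relationship strength.
# s2 relates to two genes and g1 to two SNPs, so this is many-to-many.
cross = {
    ("s1", "g1"): 0.5,
    ("s2", "g1"): 0.25,
    ("s2", "g2"): 0.25,
    ("s3", "g3"): 1.0,
}

# Hypothetical cluster labels for each domain.
snp_cluster = {"s1": "A", "s2": "A", "s3": "B"}
gene_cluster = {"g1": "C", "g2": "C", "g3": "D"}

assoc = cluster_association(cross, snp_cluster, gene_cluster)
```

In this toy setting, `is_one_to_one(cross)` is false, which is the situation the first limitation describes, and `assoc` links SNP cluster A to gene cluster C and SNP cluster B to gene cluster D.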