View all Seminars  |  Download ICal for this event

Statistical Network Analysis: Community Structure, Fairness Constraints,and Emergent Behavior

Series: Ph.D (Engg.) Colloquium- ON-LINE

Speaker: Mr. Shubham Gupta Ph.D Student Dept. of CSA

Date/Time: Mar 10 16:00:00

Location: Microsoft Teams - ON-LINE

Faculty Advisor: Prof. Ambedkar Dukkipati

Networks or graphs provide mathematical tools for describing and analyzing relational data. They are used in biology to model interactions between proteins, in economics to identify trade alliances among countries, in epidemiology to study the spread of diseases, and in computer science to rank webpages on a search engine, to name a few. Each application domain in this wide assortment encounters networks with diverse properties and imposes various constraints. For example, networks may be dynamic, heterogeneous, or attributed, and an application domain may require a fairness constraint on the communities. However, most existing research is concerned with the simplest type of networks with a fixed set of nodes and edges and focuses on the canonical forms of tasks like community detection and link prediction. This thesis aims at bridging this gap by proposing community detection and link prediction methods to analyze different types of networks from various perspectives.
Our first contribution is a spectral algorithm with theoretical guarantees that finds 'fair' clusters. We define a notion of individual fairness in communities using an auxiliary representation graph. Nodes are connected in this graph if they can represent each others interests in various communities. Informally speaking, a node considers a community fair if an adequate number of its representatives belong to that community. The goal is to find communities that are considered fair by all nodes under the representation graph. We show that our proposed fairness criterion (i) generalizes the idea of statistical fairness and (ii) is also applicable in cases where the sensitive node attributes (like gender and race) are not observable but instead manifest themselves as intrinsic or latent features of a social network. We develop a fair spectral clustering algorithm and prove that it is weakly consistent (#mistakes = o(n) with probability 1 - o(1)) under a proposed variant of the stochastic block model.
Second, we propose a community-based statistical model for dynamic networks where edges appear and disappear over time. Many networks like social networks, citation networks, contact networks, etc., are dynamic in nature. Our model embeds the nodes and communities in a d-dimensional latent space and specifies a procedure for updating these embeddings over time to model the network's evolution. Given an observed dynamic network, we infer these latent quantities using variational inference and use them for link forecasting and community detection. Unlike existing approaches, our model supports the birth and death of communities. It also allows us to use powerful neural networks during inference. Experiments demonstrate that our model is better at link forecasting and community detection as compared to existing approaches. Moreover, it discovers stable communities, as quantified by the normalized mutual information (NMI) score between communities discovered at successive time steps. This desirable quality is absent in methods that ignore the network dynamics.
Third, we propose a statistical model for heterogeneous dynamic networks where the nodes and relations additionally have a type associated with them (e.g., knowledge graphs). Besides the latent node attributes, this model also encodes a set of interaction matrices for each type of relation. These matrices specify the affinity between nodes based on their attribute values and can represent both homophilic (like attracts like) and heterophilic relationships (opposites attract). We develop a scalable neural network-based inference procedure for this model and demonstrate that it outperforms existing state-of-the-art approaches on several homogeneous and heterogeneous dynamic network datasets, particularly the temporal knowledge graphs.
Fourth, we develop a model for networks with node covariates to bring explainability to community detection. This model integrates node covariates into a stochastic block model using restricted Boltzmann machines. We subscribe to the view that a community can be explained by identifying the defining covariates of its member nodes. Our model provides the relative importance of various covariates for each community, thereby explaining its decision to group the members. Existing approaches for modeling networks with covariates lack this property, especially the ones that are based on deep neural networks. We also derive an efficient inference procedure that runs in linear time in the number of nodes and edges. Experiments confirm that our model's community detection performance is comparable with recent deep neural network-based approaches. However, it additionally offers the advantage of explainability.
The discussion till now views communities as passive structures arising out of interactions between nodes. However, just like existing links in a network determine future links, communities also play a functional role in shaping the behavior of the nodes (for example, preference for a clothing brand). Our final contribution explores this functional view of communities and shows that they affect emergent communication in a networked multi-agent reinforcement learning setting.

Meeting Link :

Speaker Bio:

Host Faculty: