SeminarsView all Seminars | Download ICal for this event
Hypergraph Network Models: Learning, Prediction, and Representation in the Presence of Higher-Order Relations
Series: Ph.D (Engg.) Thesis Defence - ON-LINE
Speaker: Mr. Govind Sharma Ph.D Student Dept. of CSA
Date/Time: Dec 08 10:00:00
Location: Microsoft Teams - ON-LINE
Faculty Advisor: Prof. M Narasimha Murty / Prof. Shalabh Bhatnagar
The very thought about relating objects makes us assume the relation would be pairwise, and not of a higher-order -- involving possibly more than two of them at a time. Yet in reality, higher-order relations do exist and are spread across multiple domains: medical science (e.g., co-existing diseases/symptoms), pharmacology (e.g., reacting chemicals), bibliometrics (e.g., collaborating researchers), the film industry (e.g., cast/crew), human resource (e.g., a team), social sciences (e.g., negotiating/conflicting nations), and so on. Since a collection of intersecting higher-order relations lose context when represented by a graph, "hypergraphs" -- graph-like structures that allow edges (called hyperedges/hyperlinks) spanning possibly more than two nodes -- capture them better. In a quest to better understand such relations, in this thesis we focus on solving a few network-science oriented problems involving hypergraphs.
"Hypergraphs and Pairwise Links": In the first of three broad parts, we study the behavior of usual graph-oriented networks that have an otherwise-ignored hypergraph underpinning. We particularly establish the skewness a hypergraph introduces into its induced graphs, and the effect of these biases on the structure and evaluation of the well-known problem of link prediction in networks. We find that an underlying hypergraph structure makes popular heuristics such as common-neighbors overestimate their ability to predict links. Gathering enough evidence -- both theoretical and empirical -- to support the need to reestablish the evaluations of link prediction algorithms on hypergraph-derived networks, we propose adjustments that essentially undo the undesired effects of hypergraphs in performance scores. Motivated by this observation, we extend graph-based structural node similarity measures to cater to hypergraphs (although still, for similarity between pairs of nodes). To be specific, we first establish mathematical transformations that could transfer any graph-structure-based notion of similarity between node pairs to a hypergraph-structure-based one. Using exhaustive combinations of the newly established scores with the existing ones, we could show improvements in the performance of both structural as well as temporal link prediction.
"Predicting Higher-order Relations": For the second part of our thesis, we turn our attention towards a more central problem in hypergraphs -- the hyperlink/hyperedge prediction problem. It simply refers to developing models to predict the occurrence of missing or future hyperedges. We first study the effect of negative sampling (sampling the negative class) -- an exercise performed due to the extreme intractability of the set of all non-hyperlinks, also known as the class imbalance problem -- on hyperlink prediction, which has never been studied in the past. Since we observe hyperlink prediction algorithms performing differently under different negative sampling techniques, our experiments help the seemingly unimportant procedure gain some significance. Moreover, we contribute towards two benchmark negative sampling algorithms that would help standardize the step. Moving on from the negative sampling problem to predicting hyperlinks themselves, we work on two different approaches: a clique-closure based approach, and a sub-higher-order oriented one. While in the former, we develop and successfully test the clique-closure hypothesis -- that hyperlinks mostly form from cliques or near-cliques -- and are able to utilize it for hyperlink prediction via matrix completion (C3MM), the latter approach works on a novel information flow model in hypergraphs. More precisely, we introduce the concept of sub-hyperedges to capture the sub-higher-order notion in relations, and utilize an attention-based neural network model called SHONeN focusing on sub-hyperedges of a hyperedge. Owing to SHONeNs computational complexity, we propose a sub-optimal heuristic that is able to perform better than its baselines on the downstream task of predicting hyperedges.
"Higher-order Bipartite Relations": The third and final part of our thesis is dedicated exclusively to "bipartite hypergraphs": structures that are used to capture higher-order relations between two disjoint node sets, e.g., a patients diagnosis (possibly multiple diseases and symptoms), a movie project (multiple actors and crew members), etc. We first capture the structure of real-world such networks using per-fixed bipartite hypergraphs (those where the set of left and right hyperedges is fixed beforehand), and then focus on the bipartite hyperlink prediction problem. Since existing self-attention based approaches meant for usual hypergraphs do not work for bipartite hypergraphs -- a fact that our experiments validate, we propose a cross-attention model for the same, and use the notion of set-matching over collections of sets to solve for bipartite hyperlink prediction. As a result, we develop a neural network architecture called CATSETMAT that performs way better than any of the other approaches meant to solve the bipartite hyperlink prediction problem. Last but not least, we also explain how, owing to an observation we call the positive-negative dilemma, existing state-of-the-art algorithms fail on bipartite hypergraphs.
Microsoft Teams Link: