Representing data using graphs: A sparse signal approximation view


Why do we need to represent data using graphs?

Graph driven machine learning has seen a surge of interest in the past few years with several applications in social sciences, biology, and network analysis, to name a few.

However, in some scenarios, no graph is given a priori and one has to infer and construct a graph to fit the data given. These graphs are useful for a few reasons:

  1. Unlike machine learning methods that learn a function to each point, one can leverage explicitly the information across points to obtain better functions over regions of space.
  2. Allows one to characterize data under minimal assumptions for use in semi-supervised and unsupervised learning scenarios

In a typical graph learning problem, we are given N data points and the goal is to learn an efficient graph representation of the data. The keyword here is efficient: An efficient graph can be defined as one with number of edges of the same order as the number of nodes (N). Efficient graphs lead to faster downstream processing making them

A primer on sparse signal approximation

Graph construction as sparse signal approximation

knn graph nnk graph
KNN (Left) vs NNK (Right) graph for different choices of K