|
|
|
|
|
Scalable Algorithms for Data and Network Analysis
Author(s): Shang-Hua Teng
Source: Journal:Foundations and Trends® in Theoretical Computer Science ISSN Print:1551-305X, ISSN Online:1551-3068 Publisher:Now Publishers Volume 12 Number 1-2, Pages: 278(1-274) DOI: 10.1561/0400000051
Abstract:
In the age of Big Data, efficient algorithms are now in higher demand
more than ever before. While Big Data takes us into the asymptotic
world envisioned by our pioneers, it also challenges the classical notion
of efficient algorithms: Algorithms that used to be considered efficient,
according to polynomial-time characterization, may no longer be adequate
for solving today’s problems. It is not just desirable, but essential,
that efficient algorithms should be scalable. In other words, their complexity
should be nearly linear or sub-linear with respect to the problem
size. Thus, scalability, not just polynomial-time computability, should
be elevated as the central complexity notion for characterizing efficient
computation.
In this tutorial, I will survey a family of algorithmic techniques for
the design of provably-good scalable algorithms. These techniques include
local network exploration, advanced sampling, sparsification, and
geometric partitioning. They also include spectral graph-theoretical
methods, such as those used for computing electrical flows and sampling
from Gaussian Markov random fields. These methods exemplify
the fusion of combinatorial, numerical, and statistical thinking in network
analysis. I will illustrate the use of these techniques by a few basic
problems that are fundamental in network analysis, particularly for the
identification of significant nodes and coherent clusters/communities in
social and information networks. I also take this opportunity to discuss
some frameworks beyond graph-theoretical models for studying conceptual
questions to understand multifaceted network data that arise
in social influence, network dynamics, and Internet economics.
|
|
|
|