Network Models

Exponential Random Graphs and their Log-Linear Representations

by Philipp Burckhardt

Outline

Motivating Example
Features of Social Networks
History of Social Network Models
Exponential Random Graph Models
- $\beta$-Model
- $p_1$-Model
- Log-linear representation
- Exponential Random Graph Models (ERGMs)

Sampson Monk Data Set

Common Features of Social Networks

Mutuality of ties
Heterogeneity in propensity to form ties
Homophily of attributes (formation of ties due to similar personal traits)
Context is important $\implies$ triads, not dyads should be fundamental social unit, Simmel (1950)
...

A Bit of History

Early work in Sociology (Moreno, Simmel)
Statistical Network Models 1970s (Holland & Leinhardt, Fienberg & Wasserman)
Journal Social Networks published since 1979

A Bit of History

Early work in Sociology (Moreno, Simmel)
Statistical Network Models 1970s (Holland & Leinhardt, Fienberg & Wasserman)
Journal Social Networks published since 1979
Spatial Statistics (Besag 1974, Ripley 1981)
Exponential Family Theory (Barndorff-Nielsen 1978)
Statistical Physics (Barabási, Watts)
Machine Learning:
- Stochastic blockmodels (Choi, Snijders)
- Mixed-Membership Models (Airoldi et. al, 2008)
Graphical Models?

Edges of Graph are Random Variables

	Graphical Model	Network Model
Nodes	Random	Fixed
Edges	Fixed, express conditional independence	Random, express relation between nodes

Edges of Graph are Random Variables

We first consider undirected graphs
Let $X_{ij} \in \{ 0, 1\}$ be indicator random variables.
$X_{i,j} = X_{j,i} = 1$ if there exists an edge between $i$ and $j$.

$X_{ii} = 0$ by convention
Set $\{ X_{i,j} : i < j\}$ of indicator random variables uniquely characterizes network

Beta Model

Extension of Erdős–Rényi random graph model in which the probability of observing an edge between two nodes is $p$
Allows for heterogeneity in propensity of nodes to form ties: $P \left( X_{i,j} = 1\right)$ not all equal to $p$

Beta Model

Models log odds of existences of edge between nodes $i$ and $j$:

$$ \log \frac{P \left( X_{ij} = 1\right)}{1 - P \left( X_{ij} = 1\right)} = \beta_i + \beta_j \quad \forall i \ne j $$ $$ \definecolor{lviolet}{RGB}{114,0,172} \definecolor{lgreen}{RGB}{45,177,93} \definecolor{lred}{RGB}{251,0,29} \definecolor{lblue}{RGB}{18,110,213} \definecolor{circle}{RGB}{217,86,16} \definecolor{average}{RGB}{203,23,206} $$

Beta Model

Models log odds of existences of edge between nodes $i$ and $j$:

$$ \log \frac{ {\color{lred} P \left( X_{ij} = 1\right) }} {1 - {\color{lred} P \left( X_{ij} = 1\right)}} = \beta_i + \beta_j \quad \forall i \ne j $$

In simplest form, model for graph with undirected edges and $n$ vertices. Edges are Bernoulli random variables with probabilities $\color{lred}p_{i,j}$

Beta Model

Models log odds of existences of edge between nodes $i$ and $j$:

$$ \log \frac{ {\color{lred} p_{i,j} }} {1 - {\color{lred} p_{i,j} }} = \beta_i + \beta_j \quad \forall i \ne j $$

In simplest form, model for graph with undirected edges and $n$ vertices. Edges are Bernoulli random variables with probabilities $\color{lred}p_{i,j}$

MLE

The likelihood function for the $\beta$ model is given by

$$ p(x ; \beta) = \prod_{i < j} p_{i,j}^{x_{i,j}} \left( 1 - p_{i,j} \right)^{1-x_{i,j}} = \prod_{i < j} \left( \frac{ p_{i,j}}{1-p_{i,j}} \right)^{x_{ij}} \left( 1 - p_{i,j} \right) $$

MLE

The likelihood function for the $\beta$ model is given by

$$ p(x ; \beta) = \prod_{i < j} p_{i,j}^{x_{i,j}} \left( 1 - p_{i,j} \right)^{1-x_{i,j}} = \prod_{i < j} \left( \frac{ p_{i,j}}{1-p_{i,j}} \right)^{x_{ij}} \left( 1 - p_{i,j} \right) $$

Written in exponential family form:

$$ p(x ; \beta) = \exp \left( \sum_{i < j} x_{i,j} \log \frac{ p_{i,j}}{1-p_{i,j}} + \log(1+p_{i,j}) \right) $$

Sufficient Statistics

$$ \begin{align} p(x ; \beta) &= \exp \left( \sum_{i < j} x_{i,j} \; {\color{lred} \log \frac{ p_{i,j}}{1-p_{i,j}} } + \sum_{i < j} \log(1+p_{i,j}) \right) \\ &= \exp \left( \sum_{i < j} x_{i,j} \; {\color{lred}\beta_i} + \sum_{i < j} x_{i,j} \; {\color{lred}\beta_j} - A(\beta) \right) \\ &= \exp \left( \sum_{i=1}^n \beta_i \left[ \sum_{j > i} x_{i,j} + \sum_{j < i} x_{i,j} \right] - A(\beta) \right) \end{align} $$

Sufficient Statistics

$$ p(x ; \beta) = \exp \left( \sum_{i=1}^n \beta_i \underbrace{ \left[ \sum_{j > i} x_{i,j} + \sum_{j < i} x_{i,j}\right]}_{{\color{lviolet}d_i}} - A(\beta) \right) $$

Minimal sufficient statistics $\color{lviolet} d_i$, the number of neighbors of node $i$ (degree of node $i$)
the vector $\color{lviolet} d(x)$ consisting of all $\color{lviolet} d_i$ is called the degree sequence of the graph $x$

Moment Equations for MLE

Maximum likelihood estimates are given by solution to equation $\mathbb{E}[{\color{lviolet}d(X)}] = \color{lviolet} d(x)$

Moment Equations for MLE

Maximum likelihood estimates are given by solution to equation $\mathbb{E}[{\color{lviolet}d(X)}] = \color{lviolet} d(x)$
Hence, $\hat \beta$ found by $$ d_i = \sum_{j \ne i} {\color{circle}\frac{\exp(\hat \beta_i + \hat \beta_j)}{1 + \exp(\hat \beta_i + \hat \beta_j)}},\qquad i=1, \ldots, n $$ as $$ \mathbb{E}[{\color{lviolet}d_i(X)}] = \mathbb{E} \left[ {\color{lviolet} \sum_{j > i} X_{i,j} + \sum_{j < i} X_{i,j} } \right] = \mathbb{E}\left[\sum_{j \ne i} X_{ij} \right] = \sum_{j \ne i} {\color{circle}p_{ij}} $$

Existence of MLE

Does the MLE exist?
Note that parameter space $\{ \beta_i \}_{i=1, \ldots, n}$ grows with the number of nodes
But although we only observe one graph, there are ${n \choose 2} = \frac{n (n-1)}{2}$ indicator variables $X_{ij}$, making estimation tractable most of the time
Conditions for Existence of MLE in the $\beta$-model:

Rinaldo, A., Petrović, S., & Fienberg, S. E. (2011). Maximum lilkelihood estimation in the $\beta$-model.

$p_1$-Model

(Holland & Leinhardt 1981)

Again $n$ actors, but this time directed edges

Decompose X into $n \choose 2$ dyads $D_{ij} = (X_{ij},X_{ji})$ for $i < j$

$p_1$-Model

Model the probability of an edge between nodes $i$ and $j$:

$p_1$-Model

Model the probability of an edge between nodes $i$ and $j$:
${\color{lviolet} \rho}$ is force of reciprocation

$p_1$-Model

Model the probability of an edge between nodes $i$ and $j$:
${\color{lviolet} \rho}$ is force of reciprocation
${\color{circle} \theta}$ is density parameter (= governs number of edges in digraph)

$p_1$-Model

Model the probability of an edge between nodes $i$ and $j$:
${\color{lviolet} \rho}$ is force of reciprocation
${\color{circle} \theta}$ is density parameter (= governs number of edges in digraph)
${\color{lgreen} \alpha_i}$ is productivity parameter, governs how likely it is for node $i$ to have outgoing ties

$p_1$-Model

Model the probability of an edge between nodes $i$ and $j$:
${\color{lblue} \rho}$ is force of reciprocation
${\color{circle} \theta}$ is density parameter (= governs number of edges in digraph)
${\color{lgreen} \alpha_i}$ is productivity parameter, governs how likely it is for node $i$ to have outgoing ties
${\color{lviolet} \beta_i}$ is attractiveness parameter as it governs how many incoming ties a node has

$p_1$-Model

$$ \begin{align} \log P(D_{ij} &= (0,0)) = \lambda_{ij} \\ \log P(D_{ij} &= (1,0)) = \lambda_{ij} + {\color{lgreen} \alpha_i} + {\color{lviolet} \beta_j} + {\color{circle}\theta} \\ \log P(D_{ij} &= (0,1)) = \lambda_{ij} + {\color{lgreen} \alpha_j} + {\color{lviolet} \beta_i} + {\color{circle}\theta} \\ \log P(D_{ij} &= (1,1)) = \lambda_{ij} + {\color{lgreen} \alpha_i} + {\color{lviolet} \beta_j} + {\color{lgreen} \alpha_j} + {\color{lviolet} \beta_i} + 2{\color{circle}\theta} + {\color{lblue} \rho} \end{align} $$

Likelihood is of exponential form and evaluates to $$ p_1(x) \propto \exp \left( {\color{lblue} \rho} \sum_{ij} x_{ij} x_{ji} + {\color{circle}\theta} x_{++} + \sum_i {\color{lgreen} \alpha_i} x_{i+} + \sum_j {\color{lviolet} \beta_j} x_{+j} \right) $$

Log-Linear Representation

Define $$ y_{{\color{blue}i}{\color{circle}j}{\color{average}k}{\color{lgreen}l}} = \begin{cases} 1 & \text{if } D_{{\color{blue}i}{\color{circle}j}} = (x_{{\color{blue}i}{\color{circle}j}}, x_{{\color{circle}j}{\color{blue}i}}) = ({\color{average}k},{\color{lgreen}l}) \\ 0 & \text{otherwise} \end{cases} $$ which yields ${\color{blue} n} \times {\color{circle}n} \times {\color{average}2} \times {\color{lgreen}2}$ array with $y_{iikl}$ equal to zero for all $i$ and $y_{ijkl} = y_{jilk}$.
Fienberg & Wasserman (1981) have shown that fitting the $p_1$ model to $x$ corresponds to estimating the no second-factor interaction model on $y$, i.e. the following log-linear model: $$ [12] [13] [24] [14] [23] [34] $$
Use of standard iterative proportional fitting

Log-Linear Representation

Corresponding log-linear model: $$ \scriptsize \log y_{ijkl} = u_{12(ij)} + u_{13(ik)} + u_{24(jl)} + u_{14(il)} + u_{23(jk)} + u_{34(kl)}. $$

Here: $$ {\scriptsize \begin{align} u_{12(ij)} &= \lambda_{ij} \\ u_{13(ik)} &= \left( \alpha_i + \theta \right) \times I(k=1) \\ u_{24(jl)} &= \left( \alpha_j + \theta \right) \times I(l=1) \\ u_{14(il)} &= \beta_i \times I(l=1) \\ u_{23(jk)} &= \beta_j \times I(k=1) \\ u_{34(kl)} &= \rho \times I(k=1,l=1) \\ \end{align} } $$

Sufficient Statistics

$$ \begin{align} 1/2 y_{++11} &= \sum_{i < j} x_{ij} x_{ji} &\text{number of mutuals} \\ y_{i+1+} &= x_{i+} &\text{out-degree of node } i \\ y_{+j+1} &= x_{+j} &\text{in-degree of node } j \\ y_{++1+} &= x_{++} &\text{total number of choices} \\ \end{align} $$

ERGMs or $p^*$-models

(Frank & Strauss 1986)

Extend the $p_1$ class by adding topoligical features of the graph, e.g. the mumber of triangles $$ T(x) = \sum_{i,j,k} x_{ij} x_{jk} x_{ki}. $$ by adding an additional parameter multiplied with $T(x)$ to the likelihood of the $p_1$ model: $$ \log p(x) \propto \tau T(x) + \rho m + \theta x_{++} + \sum_i a_i x_{i+} + \sum_j \beta_j x_{+j} $$

No dyadic independence anymore, joint density cannot be written as product of individual components

ERGMs or $p^*$-models

(Frank & Strauss 1986)

Normalizing contant for $p(x)$ is a sum over all possible graphs $\implies$ intractable to compute
- Independent-dyad approximation (Strauss & Ikeda, 1990)
- MCMC approaches (e.g. Snijders, 2002)