Why is counting triangles important
Triangle counting is an important problem in graph mining. Two frequently used metrics in complex network analysis that require the count of triangles are the clustering coefficients and the transitivity ratio of the graph. Triangles have been used successfully in several real-world applications, such as detection of spamming activity, uncovering the hidden thematic structure of the web and link recommendation in online social networks.
Using brute force
A triangle in a graph is a set of three vertices , where every two of them connected by an edge. An obvious algorithm for finding a triangle inspects every triple of vertices, and thus it needs roughly operations for an -vertex graph (there are triples to look at, and is approximately for large ).
Using matrix multiplication
There is a faster method for breaking the barrier which is algebraic and based on fast matrix multiplication.
For notational convenience let the vertex set of the given graph be labelled as . The adjacency matrix of is the matrix with
The key insight is to understand the square . By the definition of matrix multiplication we have , and
So counts the number of common neighbors of and .
Finding a triangle is equivalent to finding two adjacent vertices with a common neighbor . So we look for two indices such that both and . Summing up all the s where both and are non zero gives us times the number of triangles in the graph as the same triangle is counted twice for each of the three pairs of indices comprising the triangle.
If we compute the matrix according to the definition, we need about arithmetic operations and thus we save nothing compared to the naive method of inspecting all triples of vertices. However, ingenious algorithms are known that multiply matrices asymptotically faster. The oldest one, due to Strassen, needs roughly arithmetic operations.
Using the cube of the adjancency matrix
The cube of the adjancency matrix gives us the number of paths of length between any pair of vertices. We need to look at the trace of the matrix so that we get the number of triangles in the graph. The trace has to be divided by because each triangle is counted twice for each vertex in the triangle.
Using node-iteration
In this algorithm we iterate over all nodes and tests for each pair of neighbors if they are connected by an edge. The asymptotic running time is given by the expression and is bounded by . Here each triangle is counted thrice, once for each node.
Python code for counting triangles
import networkx as nx
from itertools import combinations, product
import numpy as np
G = nx.erdos_renyi_graph(20, 0.5, seed=123, directed=False)
def count_triangles_brute_force(G):
cnt_triangles = 0
for i,j,k in combinations(G.nodes, 3):
if G.has_edge(i,j) and G.has_edge(j,k) and G.has_edge(i,k):
cnt_triangles += 1
return cnt_triangles
def count_triangles_mat_mul(G):
cnt_triangles = 0
A = nx.to_numpy_matrix(G)
B = A.dot(A)
for i,j in product(G.nodes, G.nodes):
if G.has_edge(i,j) and B[i,j] != 0:
cnt_triangles += B[i,j]
return cnt_triangles/6
def count_triangles_mat_mul2(G):
A = nx.to_numpy_matrix(G)
return np.trace(np.linalg.matrix_power(A,3))/6
def count_triangles_node_iterator(G):
cnt_triangles = 0
for i in G.nodes:
for j,k in combinations(G.neighbors(i),2):
if G.has_edge(j,k):
cnt_triangles += 1
return cnt_triangles/3
print(count_triangles_brute_force(G))
print(count_triangles_mat_mul(G))
print(count_triangles_node_iterator(G))
print(count_triangles_mat_mul2(G))
Listing triangles using edge-iteration
The algorithm edge-iterator iterates over all edges and compares the adjacency data structure of both incident nodes. For an edge the nodes induce a triangle if and only if node is present in both adjacency data structures and .The running time can then be expressed with
Python code for listing triangles
def list_triangles_edge_iteration(G):
triangles = set()
for u,w in G.edges:
for v in set(G.neighbors(u)).intersection(G.neighbors(w)):
triangles.add(frozenset([u,v,w]))
return triangles
print(list_triangles_edge_iteration(G))
References
Thirty-three Miniatures: Mathematical and Algorithmic Applications of Linear Algebra
Finding, Counting and Listing all Triangles in Large Graphs, An Experimental Study