2021 C++17 Python NumPy NetworkX SciPy

Expanding-Shrinking Network Model

A novel random graph model combining growth and shrinkage dynamics to produce networks with real-world properties

Overview

Most random graph models — Erdős-Rényi, Barabási-Albert, Watts-Strogatz — only grow networks over time. But real networks both grow and shrink: web pages are added and removed, trade partnerships form and dissolve. The Expanding-Shrinking (ES) model addresses this gap. Starting from an arbitrary seed graph, at each iteration it either shrinks the network (merging two random nodes, preserving edges) or expands it (splitting a random node, both copies inheriting all edges). A single parameter α controls the shrink/expand probability. Despite its simplicity, the model produces networks with several hallmarks of real-world systems: giant connected components, shrinking distances, non-negligible transitivity for α > 0.5, and the emergence of high-degree hubs — all without embedding prior structural knowledge. The C++/Python polyglot pipeline generates networks in C++ and analyzes them with Python, including portrait divergence comparisons against real-world datasets like international trade networks and IMDB.

Architecture

The ES model pipeline is a C++/Python polyglot system. C++ handles network generation — the core ES loop iterates over time steps, choosing merge or split operations parameterized via std::function for swappable strategies. Generated networks are serialized and analyzed in Python using NetworkX and custom portrait divergence code (Jensen-Shannon divergence between network portraits). The pipeline sweeps across α × time grids, producing heatmaps and statistical profiles. A reusable C++ graph library with dual adjacency lists underpins all experiments, including community detection benchmarks and 37M-node scalability challenges.

Key Concepts

Expanding-Shrinking Dynamics

The core insight: real networks simultaneously grow and shrink, yet most models only capture growth. The ES model's shrink operation merges two random nodes (à la the war pact model), while the expand operation splits a node — both new nodes inherit all edges. The expected network size follows n_e = n_i + t(1 − 2α), meaning α < 0.5 grows, α > 0.5 shrinks, and α = 0.5 preserves size. Crucially, only a small amount of shrinking (even in growing networks) is needed to produce short distances — an effect Watts and Strogatz achieved through random rewiring, here arising naturally from node merging.

Portrait Divergence Analysis

To compare generated networks against each other and against real-world systems, the project uses portrait divergence — a Jensen-Shannon divergence measure between network portraits that captures structural similarity beyond simple statistics. Heatmaps across α parameter grids reveal that networks with α ≤ ~0.35 converge to structurally similar graphs regardless of exact parameters, suggesting the expansion dynamic dominates structure formation in that regime. Grid search against the international trade network achieved a portrait divergence of 0.18 with α = 0.7.

Custom Graph Library & Performance Engineering

All experiments run on a custom header-only C++ graph library (1400+ lines) with dual adjacency lists for directed graphs, supporting betweenness centrality (Brandes), closeness, PageRank, and community detection. A stripped-down variant with raw node_t* pointer arrays and heuristic 5× pre-allocation handles the 37M-node challenge. The library also integrates SIMD-vectorized RNG (AVX2/AVX512 xorshift128+) at 1.88 cycles/op for fast graph generation.

Code Highlights

ES Model — Higher-Order with Swappable Operations

void expanding_shrinking_model(
    Graph &g, float alpha, int iter, int seed, bool verbose) {
    expanding_shrinking_model(alpha, iter, seed,
        [&g]() {
            int a, b;
            do { a = rand() % g.get_nodes_count();
                 b = rand() % g.get_nodes_count();
            } while (a == b);
            return make_pair(a, b);
        },
        [&g]() { return rand() % g.get_nodes_count(); },
        [&g](auto a, auto b) { merge_nodes(g, a, b); },
        [&g](auto a) { split_node(g, a); },
        verbose);
}

Node Splitting — Expand Operation

void split_node(Graph &g, int node) {
    auto neighbours = g.get_neighbours(node);
    int new_node = g.add_node();
    for (auto neighbour : neighbours) {
        g.add_edge(new_node, neighbour);
    }
}

Raw Pointer Adjacency List for 37M Nodes

graph(size_t nodes_count, size_t edges_count) {
    out_adj_list.reserve(nodes_count);
    sizes = new size_t[nodes_count];
    auto reserve = (size_t) (((double) edges_count
        / (double) nodes_count + 1) * 5);
    for (size_t i = 0; i < nodes_count; i++) {
        node_t *adj = new node_t[reserve];
        out_adj_list.push_back(adj);
        sizes[i] = 0;
    }
}

Performance

The supporting graph library enables the ES model to scale: 37M-node Erdős-Rényi generation + LCC in ~8 seconds using raw pointer adjacency lists. Brandes betweenness on the IMDB 2M-edge dataset: 140s vs 540s reference (3.8× speedup). Portrait divergence grid searches over 11×11 α × time parameter spaces complete in minutes thanks to C++ generation with Python analysis.

Highlights

Novel ES model: at each step, either merge two random nodes (shrink) or split a random node inheriting all edges (expand) — controlled by a single α parameter
Generated networks exhibit real-world properties — hub emergence, shrinking distances, giant connected component, and non-negligible clustering — without encoding preferential attachment or any prior knowledge
Portrait divergence (Jensen-Shannon) heatmaps across 11×11 α grids reveal a phase transition: networks with α ≤ ~0.35 converge to structurally similar graphs regardless of parameters
Matched real-world international trade network with portrait divergence of 0.18 via grid search over (α, seed size, iterations)
Custom C++ graph library (1400+ lines) with dual adjacency lists — Brandes betweenness 3.8× faster than professor's Java reference on 2M-edge IMDB, closeness centrality 20× faster
37M-node Erdős-Rényi challenge: raw pointer adjacency lists with heuristic pre-allocation, generating and analyzing G(37M, ~80M) in ~8 seconds
Community detection benchmarking across 14+ real-world networks (Facebook 63K nodes, Google Web 875K, IMDB 2M+ edges) with Label Propagation, Louvain, and InfoMap

Related Projects

Music Classification with Deep Learning

IEEE-format research paper with novel noise-augmentation for robust music identification

Python TensorFlow/Keras LaTeX Swift

Published IEEE-format research paper presenting a novel noise-augmentation technique for robust music identification — training for real-world degradation rather than clean-sample accuracy

SwifyPy Numerical Mathematics Library

Protocol-oriented numerical computing in Swift with BLAS-accelerated matrix operations

Swift 5.1 Apple Accelerate BLAS/LAPACK PythonKit

Protocol-oriented numerical computing: six matrix types sharing a MatrixProtocol interface with compile-time BLAS dispatch through protocol witness tables

Back to CV