Reference

Landmarks

CGE.landmarks - Function
landmarks(edges::Array{Int,2}, vweights::Vector{Float64}, clusters::Vector{Vector{Int}},
    embedding::Array{Float64,2}, verbose::Bool, land::Int, forced::Int, method::Function)

Arguments

  • edges::Array{Int,2} array of edge definitions (two whitespace-separated vertex ids)
  • weights::Vector{Float64} edge weights
  • vweights::Vector{Float64} vertex weights
  • clusters::Vector{Vector{Int}} vector of vectors indicating an initial 1-based cluster assignment of vertices
  • embedding::Array{Float64,2} array of vertex embeddings
  • comm::Array{Int,2} assignment of vertices to communities
  • verbose::Bool verbose switch; if true, prints additional processing information
  • land::Int number of landmarks to generate
  • forced::Int required maximum number of forced splits of a cluster
  • method::Function method used to generate landmarks
  • directed::Bool flag for the directed version of landmark-based graph creation

CGE.runsplit - Function
runsplit(embedding, w, initial_clusters, n, s, rule)

Takes an embedding with weights w, in which each column is a single observation, and initial_clusters, a vector of vectors indicating an initial 1-based cluster assignment of vertices. Returns a vector of 0-based assignments of vertices to n groups. Each cluster is guaranteed to be split to at most s landmarks (may be less if its size is less than s).
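
The two representations mentioned above differ in shape and in index base. The short sketch below only illustrates that difference; it does not call runsplit, and the names initial_clusters, assignment and nvertices are chosen here for illustration.

```julia
# Input-style representation: one vector of 1-based vertex ids per cluster.
initial_clusters = [[1, 3], [2, 4, 5]]
nvertices = 5

# Output-style representation: one 0-based group label per vertex.
assignment = zeros(Int, nvertices)
for (c, members) in enumerate(initial_clusters)
    assignment[members] .= c - 1   # cluster 1 becomes label 0, and so on
end

println(assignment)  # [0, 1, 0, 1, 1]
```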


CGE.split_cluster_diameter - Function
split_cluster_diameter(m, w)

Splits cluster m with weights w into two clusters along its first principal component so as to make both clusters have approximately the same diameter along the first principal component.
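
As a rough illustration of this idea, the sketch below projects the observations onto the first principal component and cuts at the midpoint of the projected range. This is one plausible reading of the description, not the CGE implementation; first_pc_projection and sketch_split_diameter are hypothetical names, and observations are assumed to be stored as columns of m.

```julia
using LinearAlgebra

# Hypothetical helper, not part of CGE: project weighted observations
# (columns of m) onto the first principal component.
function first_pc_projection(m::Matrix{Float64}, w::Vector{Float64})
    mu = (m * w) ./ sum(w)                    # weighted mean of the columns
    centered = m .- mu                        # center every observation
    scatter = (centered .* w') * centered'    # weighted scatter matrix (d x d)
    # eigenvalues of a Symmetric matrix are returned in ascending order,
    # so the last eigenvector belongs to the largest eigenvalue
    pc = eigen(Symmetric(scatter)).vectors[:, end]
    return centered' * pc                     # one coordinate per observation
end

# Cut at the midpoint of the projected range, so both sides cover the
# same length along the first principal component.
function sketch_split_diameter(m::Matrix{Float64}, w::Vector{Float64})
    p = first_pc_projection(m, w)
    cut = (minimum(p) + maximum(p)) / 2
    return findall(<=(cut), p), findall(>(cut), p)
end

m = [0.0 1.0 2.0 10.0;
     0.0 0.1 0.0 0.1]
w = ones(4)
left, right = sketch_split_diameter(m, w)
```

The sign of an eigenvector is arbitrary, so which side is returned first may vary; only the partition itself is meaningful.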


CGE.split_cluster_size - Function
split_cluster_size(m, w)

Splits cluster m with weights w into two clusters along its first principal component so as to make the clusters have equal size (number of edges).
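
A minimal sketch of an equal-size split, assuming that size is measured as total weight and that the projections p onto the first principal component have already been computed. The function sketch_split_size is hypothetical and is not the CGE implementation.

```julia
# Hypothetical helper, not part of CGE. `p` holds the projections of the
# observations onto the first principal component, `w` their weights.
function sketch_split_size(p::Vector{Float64}, w::Vector{Float64})
    order = sortperm(p)                  # observations sorted along the component
    cum = cumsum(w[order])               # running total weight
    half = cum[end] / 2
    k = findfirst(>=(half), cum)         # first position reaching half of the total
    k = min(k, length(p) - 1)            # keep the second cluster non-empty
    return order[1:k], order[k+1:end]
end

p = [0.3, -1.2, 2.5, 0.9, -0.4]
w = [1.0, 2.0, 1.0, 3.0, 1.0]
left, right = sketch_split_size(p, w)
println(left, " ", right)  # [2, 5, 1] [4, 3]
```

In this example both sides carry a total weight of 4.0; in general the two totals are only as close as the weights allow.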


CGE.split_cluster_rss - Function
split_cluster_rss(m, w)

Splits cluster m with weights w into two clusters along its first principal component so as to minimize the maximum RSS (residual sum of squares) over the two resulting clusters.
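
The criterion can be illustrated with a brute-force search over all cut positions. The sketch below is not the CGE implementation: weighted_rss and sketch_split_rss are hypothetical names, observations are assumed to be columns of m, RSS is assumed to be the weighted sum of squared distances to the weighted cluster mean, and the projections p are assumed to be given.

```julia
# Weighted residual sum of squares of the columns of m around their weighted mean.
function weighted_rss(m::AbstractMatrix{Float64}, w::AbstractVector{Float64})
    mu = (m * w) ./ sum(w)
    return sum(w[j] * sum(abs2, m[:, j] .- mu) for j in axes(m, 2))
end

# Order the observations by their projection `p` and try every cut,
# keeping the one whose worse side has the smallest RSS.
function sketch_split_rss(m::Matrix{Float64}, w::Vector{Float64}, p::Vector{Float64})
    order = sortperm(p)
    best_k, best_cost = 0, Inf
    for k in 1:length(order)-1
        l, r = order[1:k], order[k+1:end]
        cost = max(weighted_rss(m[:, l], w[l]), weighted_rss(m[:, r], w[r]))
        if cost < best_cost
            best_k, best_cost = k, cost
        end
    end
    return order[1:best_k], order[best_k+1:end]
end

# One-dimensional data, so the projections are the coordinates themselves.
m = [0.0 1.0 2.0 10.0 11.0]
w = ones(5)
p = vec(m)
left, right = sketch_split_rss(m, w, p)
println(left, " ", right)  # [1, 2, 3] [4, 5]
```

This version recomputes both RSS values for every candidate cut, which is simple but quadratic in the cluster size.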


CGE.split_cluster_rss2 - Function
split_cluster_rss2(m, w)

Splits cluster m with weights w into two clusters along its first principal component so as to minimize the maximum RSS (residual sum of squares) over the two resulting clusters.

This is a second, slower version of the algorithm that uses sorting. It is retained for testing purposes.


Divergence

CGE.wGCL - Function

wGCL(edges::Array{Int,2}, weights::Vector{Float64}, comm::Array{Int,2}, embed::Array{Float64,2}, distances::Vector{Float64}, verbose::Bool = false)

Calculates the Weighted Geometric Chung-Lu model and the divergence score for a graph and its embedding.

Arguments

  • edges::Array{Int,2} array of edge definitions (two whitespace-separated vertex ids)
  • eweights::Vector{Float64} edge weights
  • comm::Array{Int,2} assignment of vertices to communities
  • embed::Array{Float64,2} array of vertex embeddings
  • distances::Vector{Float64} distances between vertices
  • vweights::Vector{Float64} total weights of landmarks (used only with the landmarks approximation)
  • init_vweights::Vector{Float64} vector of original (full) vertex weights (used only with the landmarks approximation)
  • v_to_l::Vector{Int} mapping from vertices to landmarks, i.e. landmark membership (used only with the landmarks approximation)
  • init_edges::Array{Int,2} array of original (full) graph edges (used only with the landmarks approximation)
  • init_eweights::Vector{Float64} vector of original (full) edge weights (used only with the landmarks approximation)
  • init_embed::Matrix{Float64} array with the original embedding of the full graph (used only with the landmarks approximation)
  • split::Bool indicator for splitting the JS divergence score (global score)
  • seed::Int RNG seed for the local measure score
  • auc_samples::Int number of samples for the local measure score
  • verbose::Bool verbose switch; if true, prints additional processing information

CGE.wGCL_directed - Function

wGCL_directed(edges::Array{Int,2}, weights::Vector{Float64}, comm::Array{Int,2}, embed::Array{Float64,2}, distances::Vector{Float64}, verbose::Bool = false)

Calculates the directed Weighted Geometric Chung-Lu model and the divergence score for a graph and its embedding.

Arguments

  • edges::Array{Int,2} array of edge definitions (two whitespace-separated vertex ids)
  • eweights::Vector{Float64} edge weights
  • comm::Array{Int,2} assignment of vertices to communities
  • embed::Array{Float64,2} array of vertex embeddings
  • distances::Vector{Float64} distances between vertices
  • vweights::Vector{Float64} total weights of landmarks (used only with the landmarks approximation)
  • init_vweights::Vector{Float64} vector of original (full) vertex weights (used only with the landmarks approximation)
  • v_to_l::Vector{Int} mapping from vertices to landmarks, i.e. landmark membership (used only with the landmarks approximation)
  • init_edges::Array{Int,2} array of original (full) graph edges (used only with the landmarks approximation)
  • init_eweights::Vector{Float64} vector of original (full) edge weights (used only with the landmarks approximation)
  • init_embed::Matrix{Float64} array with the original embedding of the full graph (used only with the landmarks approximation)
  • split::Bool indicator for splitting the JS divergence score (global score)
  • seed::Int RNG seed for the local measure score
  • auc_samples::Int number of samples for the local measure score
  • verbose::Bool verbose switch; if true, prints additional processing information

Auxiliary

CGE.dist - Function
dist(i::Int, j::Int, embed::Array{Float64,2})

Calculates the Euclidean distance between two vectors from the embedding array.

Arguments

  • i::Int index of the first vector
  • j::Int index of the second vector
  • embed::Array{Float64,2} graph embedding array
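
For illustration, the computation can be sketched as follows. The sketch assumes that each row of the array holds one vector; that orientation is an assumption made here and is not stated above. The name sketch_dist is hypothetical and stands apart from the CGE function.

```julia
# Euclidean distance between rows i and j of the embedding array
# (row-per-vector layout is an assumption of this sketch).
sketch_dist(i::Int, j::Int, embed::Matrix{Float64}) =
    sqrt(sum(abs2, embed[i, :] .- embed[j, :]))

embed = [0.0 0.0;
         3.0 4.0;
         6.0 8.0]

println(sketch_dist(1, 2, embed))  # 5.0
```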

CGE.JS - Function
JS(vC::Vector{Float64}, vB::Vector{Float64},
    vI::Vector{Int}, internal::Int, vLen::Int)

Jensen-Shannon divergence with Dirichlet-like prior.

Arguments

  • vC::Vector{Float64} first distribution of edges within and between communities
  • vB::Vector{Float64} second distribution of edges within and between communities
  • vI::Vector{Int} indicator of internal (1) and external (0) edges w.r.t. communities; if empty, the overall JS distance is computed
  • internal::Int internal JS distance switch; if 1, the internal distance is returned, otherwise the external one
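
The underlying quantity is the standard Jensen-Shannon divergence. The sketch below shows the textbook formula with additive smoothing standing in for the prior; the pseudo-count alpha and the name sketch_js are assumptions made for illustration, and the internal/external selection via vI is left out. It is not the CGE implementation.

```julia
# Jensen-Shannon divergence between two count vectors, smoothed with a
# pseudo-count `alpha` (hypothetical parameter) before normalization.
function sketch_js(c::Vector{Float64}, b::Vector{Float64}; alpha::Float64 = 1.0)
    p = (c .+ alpha) ./ sum(c .+ alpha)   # smoothed, normalized first distribution
    q = (b .+ alpha) ./ sum(b .+ alpha)   # smoothed, normalized second distribution
    m = (p .+ q) ./ 2                     # mixture distribution
    kl(x, y) = sum(x .* log.(x ./ y))     # Kullback-Leibler divergence
    return (kl(p, m) + kl(q, m)) / 2
end

a = [10.0, 5.0, 1.0]
b = [2.0, 6.0, 8.0]
same = sketch_js(a, a)   # identical inputs give zero divergence
diff = sketch_js(a, b)   # different inputs give a positive value below log(2)
```

A positive alpha keeps every smoothed entry strictly positive, so the logarithms stay finite even when some counts are zero.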

Clustering

CGE.louvain_clust - Function

louvain_clust(edges::String)

Calculates communities in a graph using the Louvain algorithm.

Arguments

  • edges::String name of the file with edge definitions

louvain_clust(filename::String, edges::Array{Int,2}, weights::Array{Float64,1})

Calculates communities in a weighted graph using the Louvain algorithm.

Arguments

  • filename::String name of the file with edge definitions
  • edges::Array{Int,2} list of edges
  • weights::Array{Float64,1} array of edge weights