FlashGraphng
A new frontier in largescale graph analysis and data mining

Compute kmeans on matrix of features. More...
Classes  
class  FG_vector 
FlashGraph vector that provides several parallelized methods when compared to an STLvector. NOTE: Not an STLcompatible data structure. This vector is also ideally used with numeric data types. Methods marked with the keyword parallel are parallelized implementations. More...  
class  FG_graph 
A userfriendly wrapper for FlashGraph's raw graph type. Very usefule when when utilizing FlashGraph prewritten/library algorithms. More...  
class  graph_config 
class  compute_vertex 
Class from which users' vertexcentric programs should inherit. Serial code written when implementing run*. methods here will be run in parallel within the graph engine. More...  
class  compute_directed_vertex 
A directed version of the compute_vertex class that users inherit from when using the FlashGraph engine. More...  
class  vertex_scheduler 
class  vertex_filter 
When the graph engine starts, a user can use this filter to decide what vertices are activated for the first time. More...  
class  vertex_initializer 
A user may be decide to initialize individual vertex state in a custom way not expressible via the vertex constructor. This provides that capability. More...  
class  vertex_query 
Parallized query of the vertex state of all vertices in the graph. Each worker thread gets an instance of the query and the perthread query results will be merged in the end. Inherit from this class to run queries in parallel. More...  
class  graph_engine 
This is the class that coordinates how & where algorithms are run. It can be seen as the central organ of FlashGraph. More...  
class  graph_header 
class  graph_index 
This file contains a set of graph index implementation. The graph index is the inmemory container to keep all vertices of a graph. More...  
class  vertex_message 
struct  local_vid_t 
class  ts_edge_data 
class  edge_count 
class  page_vertex 
Vertex representation when in the page cache. More...  
class  TS_page_vertex 
class  page_directed_vertex 
class  page_undirected_vertex 
This vertex class represents an undirected vertex in the page cache. More...  
class  vertex_program 
class  vertex_program_creater 
Extend/Override when defining a custom vertex program. The graph engine uses this to construct vertex programs for each worker thread. More...  
class  vertex_program_impl 
The default implementation of a vertex program in the graph engine. More...  
class  ts_vertex_request 
Typedefs  
typedef unsigned int  vsize_t 
Basic data types used in FlashGraph.  
typedef std::pair< int, struct local_vid_t >  vertex_loc_t 
Enumerations  
enum  directed_triangle_type 
Triangle computation type. More...  
enum  edge_type { , IN_EDGE, OUT_EDGE, BOTH_EDGES, NUM_TYPES } 
Edge type of an edge in the graph. More...  
Functions  
template<class T , class ApplyFunc >  
void  multi_vec_apply (const std::vector< typename FG_vector< T >::ptr > &inputs, typename FG_vector< T >::ptr output, ApplyFunc apply) 
Apply a user defined function to multipl FG_vectors. parallel More...  
FG_vector< vertex_id_t >::ptr  compute_wcc (FG_graph::ptr fg) 
Compute all weakly connectected components of a graph. More...  
FG_vector< vertex_id_t >::ptr  compute_sync_wcc (FG_graph::ptr fg) 
Compute all weakly connectected components of a graph synchronously. The reason of having this implementation is to understand the performance of synchronous wcc and asynchronous wcc. More...  
FG_vector< vertex_id_t >::ptr  compute_ts_wcc (FG_graph::ptr fg, time_t start_time, time_t time_interval) 
Compute all weakly connectected components of a timeseries graph in a specified time interval. More...  
FG_vector< vertex_id_t >::ptr  compute_scc (FG_graph::ptr fg) 
Compute all strongly connected components of a graph. More...  
FG_vector< size_t >::ptr  compute_directed_triangles (FG_graph::ptr fg, directed_triangle_type type) 
Compute the directed triangle count for each each vertex. Currently, it only counts the number of cycle triangles. More...  
FG_vector< size_t >::ptr  compute_undirected_triangles (FG_graph::ptr fg) 
Compute undirected triangle count for each vertex. More...  
FG_vector< size_t >::ptr  compute_local_scan (FG_graph::ptr) 
Compute the pervertex local Scan Statistic. More...  
FG_vector< std::pair < vertex_id_t, size_t > >::ptr  compute_topK_scan (FG_graph::ptr, size_t topK) 
Obtain the top K vertices with the largest local Scan Statistic value. More...  
size_t  estimate_diameter (FG_graph::ptr fg, int num_bfs, bool directed) 
Compute the diameter estimation for a graph. More...  
FG_vector< float >::ptr  compute_pagerank (FG_graph::ptr fg, int num_iters, float damping_factor) 
Compute the PageRank of a graph using the pull method where vertices request the data from all their neighbors each iteration. Tends to converge to stable values. More...  
FG_vector< float >::ptr  compute_pagerank2 (FG_graph::ptr, int num_iters, float damping_factor) 
Compute the PageRank of a graph using the push method where vertices send deltas of their PageRank to neighbors in the event their own PageRank changes. More...  
in_mem_subgraph::ptr  fetch_subgraph (FG_graph::ptr graph, const std::vector< vertex_id_t > &vertices) 
Fetch the clusters with the wanted cluster IDs. More...  
FG_vector< size_t >::ptr  compute_kcore (FG_graph::ptr fg, size_t k, size_t kmax=0) 
Compute the kcore/coreness of a graph. The algorithm will determine which vertices are between core k and kmax – all other vertices will be assigned to core 0. More...  
FG_vector< vsize_t >::ptr  get_degree (FG_graph::ptr fg, edge_type type) 
Get the degree of all vertices in the graph. More...  
FG_vector< float >::ptr  compute_transitivity (FG_graph::ptr fg) 
Compute the transitivity/clustering coefficient of a graph. More...  
FG_vector< float >::ptr  compute_betweenness_centrality (FG_graph::ptr fg, const std::vector< vertex_id_t > &vids) 
Compute the betweeenness centrality of a graph. More...  
FG_vector< vsize_t >::ptr  get_ts_degree (FG_graph::ptr fg, edge_type type, time_t start_time, time_t time_interval) 
Get the degree of all vertices in a specified time interval in a timeseries graph. More...  
std::pair< time_t, time_t >  get_time_range (FG_graph::ptr fg) 
Get the time range in which the timeseries graph is. More...  
void  compute_overlap (FG_graph::ptr fg, const std::vector< vertex_id_t > &vids, std::vector< std::vector< double > > &overlap_matrix) 
Get the neighborhood overlap of each pair of vertices in `vids'. More...  
void  compute_louvain (FG_graph::ptr fg, const uint32_t levels) 
Compute louvain clustering for a graph. More...  
sem_kmeans_ret::ptr  compute_sem_kmeans (FG_graph::ptr fg, const size_t k, const std::string init, const unsigned max_iters, const double tolerance) 
Compute SemiExternal Memory kmeans. More...  
Variables  
const vertex_id_t  MAX_VERTEX_ID = UINT_MAX 
Compute kmeans on matrix of features.
matrix  The matrix who's row IDs are being clustered. 
clusters  The cluster centers (means). 
cluster_assignments  Which cluster each sample falls into. 
cluster_assignment_counts  How many members each cluster has. 
num_rows  The number of rows in matrix . 
nev  The number of eigenvalues / number of columns in matrix . 
k  The number of clusters required. 
max_iters  The maximum number of iterations of Kmeans to perform. 
init  The type of initilization ["random", "forgy", "kmeanspp"] 
Copyright 2014 Open Connectome Project (http://openconnecto.me) Written by Da Zheng (zheng) da19 36@gm ail. com
This file is part of FlashGraph.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
typedef std::pair<int, struct local_vid_t> fg::vertex_loc_t 
The vertex location in a partition. first: the partition ID. second: the location in the partition.
Triangle computation type.
enum fg::edge_type 
FG_vector<float>::ptr fg::compute_betweenness_centrality  (  FG_graph::ptr  fg, 
const std::vector< vertex_id_t > &  vids  
) 
Compute the betweeenness centrality of a graph.
fg  The FlashGraph graph object for which you want to compute. 
vids  The vertex IDs for which BC should be computed 
FG_vector<size_t>::ptr fg::compute_directed_triangles  (  FG_graph::ptr  fg, 
directed_triangle_type  type  
) 
Compute the directed triangle count for each each vertex. Currently, it only counts the number of cycle triangles.
fg  The FlashGraph graph object for which you want to compute. 
type  The type of triangles you wish to count. 
FG_vector<size_t>::ptr fg::compute_kcore  (  FG_graph::ptr  fg, 
size_t  k,  
size_t  kmax = 0 

) 
Compute the kcore/coreness of a graph. The algorithm will determine which vertices are between core k
and kmax
– all other vertices will be assigned to core 0.
fg  The FlashGraph graph object for which you want to compute. 
k  The core value to be computed. 
kmax  (Optional) The kmax value. If omitted then all cores are computed i.e., coreness. This is not recommended for very large graphs. 
FG_vector
containing the core of each vertex between k
and kmax
. All other vertices are assigned to core 0. FG_vector<size_t>::ptr fg::compute_local_scan  (  FG_graph::ptr  ) 
Compute the pervertex local Scan Statistic.
fg  The FlashGraph graph object for which you want to compute. 
void fg::compute_louvain  (  FG_graph::ptr  fg, 
const uint32_t  levels  
) 
Compute louvain clustering for a graph.
fg  The FlashGraph graph object for which you want to compute. 
levels  The number of levels of the hierarchy to do. 
void fg::compute_overlap  (  FG_graph::ptr  fg, 
const std::vector< vertex_id_t > &  vids,  
std::vector< std::vector< double > > &  overlap_matrix  
) 
Get the neighborhood overlap of each pair of vertices in `vids'.
fg  The FlashGraph graph object for which you want to compute. 
vids  The vertices whose neighborhood overlap is computed. 
overlap_matrix  A dense matrix that stores the overlap of each pair of vertices. 
FG_vector<float>::ptr fg::compute_pagerank  (  FG_graph::ptr  fg, 
int  num_iters,  
float  damping_factor  
) 
Compute the PageRank of a graph using the pull method where vertices request the data from all their neighbors each iteration. Tends to converge to stable values.
fg  The FlashGraph graph object for which you want to compute. 
num_iters  The maximum number of iterations for PageRank. 
damping_factor  The damping factor. Originally .85. 
FG_vector<float>::ptr fg::compute_pagerank2  (  FG_graph::ptr  , 
int  num_iters,  
float  damping_factor  
) 
Compute the PageRank of a graph using the push method where vertices send deltas of their PageRank to neighbors in the event their own PageRank changes.
fg  The FlashGraph graph object for which you want to compute. 
num_iters  The maximum number of iterations for PageRank. 
damping_factor  The damping factor. Originally .85. 
FG_vector<vertex_id_t>::ptr fg::compute_scc  (  FG_graph::ptr  fg  ) 
Compute all strongly connected components of a graph.
fg  The FlashGraph graph object for which you want to compute. 
sem_kmeans_ret::ptr fg::compute_sem_kmeans  (  FG_graph::ptr  fg, 
const size_t  k,  
const std::string  init,  
const unsigned  max_iters,  
const double  tolerance  
) 
Compute SemiExternal Memory kmeans.
fg  The FlashGraph graph object for which you want to compute. 
k  The number of clusters. 
init  Initialization type [random, forgy, kmeanspp]. 
max_iters  The max number of iterations to compute for. 
tolerance  The min fraction of changes from 1 iter to next required to converge. 
FG_vector<vertex_id_t>::ptr fg::compute_sync_wcc  (  FG_graph::ptr  fg  ) 
Compute all weakly connectected components of a graph synchronously. The reason of having this implementation is to understand the performance of synchronous wcc and asynchronous wcc.
fg  The FlashGraph graph object for which you want to compute. 
FG_vector<std::pair<vertex_id_t, size_t> >::ptr fg::compute_topK_scan  (  FG_graph::ptr  , 
size_t  topK  
) 
Obtain the top K vertices with the largest local Scan Statistic value.
fg  The FlashGraph graph object for which you want to compute. 
topK  The value for K used for the top K vertices. 
FG_vector< float >::ptr fg::compute_transitivity  (  FG_graph::ptr  fg  ) 
Compute the transitivity/clustering coefficient of a graph.
Compute transitivity of all vertices in the graph.
fg  The FlashGraph graph object for which you want to compute. 
fg  The FlashGraph graph object for which you want to compute. 
FG_vector<vertex_id_t>::ptr fg::compute_ts_wcc  (  FG_graph::ptr  fg, 
time_t  start_time,  
time_t  time_interval  
) 
Compute all weakly connectected components of a timeseries graph in a specified time interval.
fg  The FlashGraph graph object for which you want to compute. 
start_time  The start time of the time interval. 
time_interval  The length of the time interval. 
FG_vector<size_t>::ptr fg::compute_undirected_triangles  (  FG_graph::ptr  fg  ) 
Compute undirected triangle count for each vertex.
fg  The FlashGraph graph object for which you want to compute. 
FG_vector<vertex_id_t>::ptr fg::compute_wcc  (  FG_graph::ptr  fg  ) 
Compute all weakly connectected components of a graph.
fg  The FlashGraph graph object for which you want to compute. 
size_t fg::estimate_diameter  (  FG_graph::ptr  fg, 
int  num_bfs,  
bool  directed  
) 
Compute the diameter estimation for a graph.
fg  The FlashGraph graph object for which you want to compute. 
in_mem_subgraph::ptr fg::fetch_subgraph  (  FG_graph::ptr  graph, 
const std::vector< vertex_id_t > &  vertices  
) 
Fetch the clusters with the wanted cluster IDs.
fg  The FlashGraph graph object for which you want to compute. 
vertices  The vertices that the induced subgraph has. 
Get the degree of all vertices in the graph.
fg  The FlashGraph graph object for which you want to compute. 
type  The edge type: IN_EDGE, OUT_EDGE, BOTH_EDGES. 
std::pair<time_t, time_t> fg::get_time_range  (  FG_graph::ptr  fg  ) 
Get the time range in which the timeseries graph is.
fg  The FlashGraph graph object for which you want to compute. 
FG_vector<vsize_t>::ptr fg::get_ts_degree  (  FG_graph::ptr  fg, 
edge_type  type,  
time_t  start_time,  
time_t  time_interval  
) 
Get the degree of all vertices in a specified time interval in a timeseries graph.
fg  The FlashGraph graph object for which you want to compute. 
type  The edge type: IN_EDGE, OUT_EDGE, BOTH_EDGES. 
start_time  The start time of the time interval. 
time_interval  length of the time interval. 
void fg::multi_vec_apply  (  const std::vector< typename FG_vector< T >::ptr > &  inputs, 
typename FG_vector< T >::ptr  output,  
ApplyFunc  apply  
) 
Apply a user defined function to multipl FG_vectors. parallel
inputs  A vector of FG_vectors that are the inputs. 
output  A FG_vector that are the outputs. 
apply  The userdefined function that will be applied to all vecotors. 
const vertex_id_t fg::MAX_VERTEX_ID = UINT_MAX 
Used to represent vertex IDs in graph