Project Release Information
The clustering program mcl is now considerably faster due to optimizations in the memory management code. Additionally, it uses plain (dense) matrix/vector multiplication where this is faster than sparse matrix/vector multiplication. The speed gains range from slightly below twofold to sixfold, depending on graph size and edge density. Large speed increases may be obtained for graphs with up to several hundred thousand nodes. The graph query program has acquired several options for edge weight output and summary statistics, and the documentation was fixed and updated.
The clustering program is somewhat faster now, the program for creating networks from tabular data has become more capable, and throughout the suite parallelisation support was improved and streamlined. Graph transformations are now available to many programs in a single specification language, and more transformations have been added.
This release adds a network analysis program that generates statistics at different edge weight cutoffs. The gene expression data parser has been updated, and a mutual k-nearest-neighbour network reduction option was added. Some obscure options were removed and integration between the different clustering and network analysis programs was tightened.
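A mutual k-nearest-neighbour reduction keeps an edge only when each endpoint ranks the other among its k strongest neighbours. The following Python sketch illustrates the idea and is not the code used in the suite. The function name `mutual_knn` and the dict-of-dicts graph representation are hypothetical, and it assumes that edge weights are similarities (higher means nearer) and that the graph is stored symmetrically.

```python
def mutual_knn(graph, k):
    """Reduce an undirected weighted graph to its mutual k-NN edges.

    graph: dict mapping node -> dict of neighbour -> weight (assumed symmetric).
    Returns a new dict of the same shape containing only mutual k-NN edges.
    """
    # For every node, select its k highest-weight neighbours.
    # Ties are broken by neighbour name so the result is deterministic.
    top = {
        node: set(sorted(nbrs, key=lambda n: (-nbrs[n], n))[:k])
        for node, nbrs in graph.items()
    }
    reduced = {node: {} for node in graph}
    for node, nbrs in graph.items():
        for nb, weight in nbrs.items():
            # Keep the edge only if the ranking holds in both directions.
            if nb in top[node] and node in top.get(nb, ()):
                reduced[node][nb] = weight
    return reduced


example = {
    "a": {"b": 0.9, "c": 0.5, "d": 0.1},
    "b": {"a": 0.9, "c": 0.8},
    "c": {"a": 0.5, "b": 0.8, "d": 0.7},
    "d": {"a": 0.1, "c": 0.7},
}
print(mutual_knn(example, 2))
```

With k = 2 the weak edges a-c and a-d are dropped in this example, because a is not among the two strongest neighbours of c, and d is not among the two strongest neighbours of a.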
This release improves support for reading and transforming mRNA array data. MCL has acquired an option to sparsify input graphs, and analysis modes have been split off and are now available as modes of the clm program. A bug introduced in mcl-09-182 in the cluster interpretation routines has been fixed. The mcx program can now compute both node eccentricity and betweenness centrality, parallelized over multiple machines and multiple threads. Minor improvements have been made throughout the entire suite of programs.
The mcl suite is moving towards a wider focus on general purpose large scale graph analysis, with the emphasis, besides clustering, on basic graph and clustering measures and transformations. The program mcxarray can now transform tabular gene expression data into graph input. The clm utility computes clustering coefficients, diameter and eccentricity, and betweenness centrality. Many fixes and improvements were made throughout.
MCL-edge is an integrated, command-line-driven workbench for large scale network analysis. It includes programs for the computation of shortest paths, diameter, clustering coefficient, betweenness centrality, and network shuffles. A module for loading and analyzing gene expression data as a network is provided. The MCL algorithm is a fast and highly scalable clustering algorithm for networks based on stochastic flow. The flow process employed by the algorithm is mathematically sound and intrinsically tied to cluster structure, which is revealed as the imprint left by the process. The threaded implementation has handled networks with millions of nodes within hours and is widely used in the fields of bioinformatics, graph clustering, and network analysis.
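The stochastic flow process can be illustrated with a small dense sketch in Python. It alternates expansion (squaring a column-stochastic matrix) with inflation (raising entries to a power and renormalizing columns) until the matrix stops changing, then reads clusters from the nonzero rows. This is a teaching sketch only and not the mcl program's implementation, which works on sparse matrices and prunes small entries. The function names, the added self-loops, the tolerance, and the default inflation value of 2.0 are assumptions made for this example.

```python
def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                m[i][j] /= total
    return m


def expand(m):
    """Expansion step: plain matrix squaring."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: element-wise power followed by column normalization."""
    return normalize_columns([[x ** power for x in row] for row in m])


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster a graph given as a square adjacency matrix (list of lists)."""
    n = len(adjacency)
    # Assumption of this sketch: add a self-loop of weight 1 to every node.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Rows that retain flow belong to attractor nodes; the columns with
    # nonzero entries in such a row form one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
bridged = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(bridged))
```

Raising the inflation value strengthens strong flows relative to weak ones and generally yields finer-grained clusterings. The dense representation costs cubic time per expansion step, so this sketch is only suitable for very small graphs.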