In this work, we incorporate new edges from a paraclique-identification approach into the output of the MST-kNN graph partitioning method. We present a statistical analysis of the results on a dataset originating from a computational linguistic study of 84 Indo-European languages. We also present results from a computational stylistic study of 168 plays of the Shakespearean era. For the latter, the Kruskal-Wallis test 1 (observed vs. all permutations) yielded a p-value of 1.62E-11 and the Wilcoxon test a p-value of 8.1E-12. Overall, our results show in both cases that the modified approach provides statistically more significant results than MST-kNN alone, thus offering a highly scalable and statistically sound alternative for data clustering.
Proceedings of the Australasian Conference on Artificial Life and Computational Intelligence (ACALCI 2015), Newcastle, N.S.W., 5-7 February 2015, pp. 373-386