This repository contains a Python implementation of the DRS-means algorithm, based on its description in [1] by Olzhas Kozbagarov and Rustam Mussabayev. The implementation, written by Tapio Pahikkala, uses the Random Swap algorithm by Pasi Fränti and Juha Kivijärvi [2].
This implementation is used in our paper:
- N. Karmitsa, V.-P. Eronen, M.M. Mäkelä, T. Pahikkala, A. Airola, "Stochastic limited memory bundle algorithm for clustering in big data", Pattern Recognition, Vol. 165, 111654, 2025.
- DRSmeans.py
  - Main program for DRS-means
- RandomSwapAlt.py
  - Accelerated Random Swap algorithm. The original algorithm is available at https://github.com/uef-machine-learning/RandomSwap
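
For orientation, the basic random swap idea from [2] can be sketched as follows: replace one randomly chosen center with a randomly chosen data point, refine with a couple of k-means steps, and keep the result only if the objective improves. This is a simplified plain-Python illustration, not the code in RandomSwapAlt.py; the function names, the `steps=2` default, and the toy data are all assumptions made for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def objective(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))


def kmeans_steps(points, centers, steps=2):
    """Run a few k-means iterations (the step count is an assumption)."""
    centers = [tuple(c) for c in centers]
    for _ in range(steps):
        labels = assign(points, centers)
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, assign(points, centers)


def random_swap(points, k, iterations, seed=0):
    """Simplified random swap: trial swap, local refinement, accept if better."""
    rng = random.Random(seed)
    centers, labels = kmeans_steps(points, rng.sample(points, k))
    best = objective(points, centers, labels)
    for _ in range(iterations):
        trial = list(centers)
        trial[rng.randrange(k)] = rng.choice(points)  # the swap
        trial, trial_labels = kmeans_steps(points, trial)
        cost = objective(points, trial, trial_labels)
        if cost < best:  # keep the swap only if it improves the objective
            centers, labels, best = trial, trial_labels, cost
    return centers, labels, best


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels, best = random_swap(points, k=2, iterations=50)
print(centers, labels, best)
```

The accelerated variant in this repository and the distributed scheme of [1] build on this loop; see the cited papers for their details.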
To use the code:
- Download DRSmeans.py and RandomSwapAlt.py.
- Set the data file, the number of clusters, and the number of random swap iterations at the end of the DRSmeans.py file.
- Run "python DRSmeans.py".
The algorithm returns the MSSC (minimum sum-of-squares clustering) function value and the elapsed time, as well as the cluster centers and distributions.
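
As a reminder of what the MSSC function measures, the sketch below computes the sum of squared Euclidean distances from each point to its nearest center. It is a plain-Python illustration with a hypothetical function name and toy data, not an excerpt from DRSmeans.py, and it assumes the unnormalized sum; whether the script reports the sum or a value scaled by the number of points is not stated here.

```python
def mssc_objective(points, centers):
    """Sum over all points of the squared distance to the nearest center."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total


# Toy data (hypothetical): every point lies at distance 1 from its nearest center.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
value = mssc_objective(points, centers)
print(value)  # 4.0
```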
[1] O. Kozbagarov, R. Mussabayev, "Distributed random swap: An efficient algorithm for minimum sum-of-squares clustering", Information Sciences 681 (2024) 121204.
[2] P. Fränti, J. Kivijärvi, "Randomized local search algorithm for the clustering problem", Pattern Analysis and Applications 3 (4) (2000) 358-369.