Clustering Words

This project presents an algorithm to automatically compute the number of clusters and shifts in a subject's list of words. The corresponding code is experimental and was applied to three datasets.

The code performs 3 main steps, data cleaning, similarity, and clustering.

To run the algorithm with the provided example, download all codes, install the corresponding libraries, and run the code without any change. Otherwise, to apply the model to your own data, you have to change the paths of files for clustering the data, following the specifications of the CSV files (see below for more details).

Requirements

This project runs under Rstudio 4.2.2. You can install all the libraries using install.packages("namePackage"). The original code uses the library ggplot2 and reshape2

Uses

Suggestion

The R file clusteringWord.R has the main code and the corresponding instructions. To use this clustering model the dataset needs to be in a csv file with the two following columns: subject (the number/name of the subject, e.g. 1), and word (the raw word listed by a subject, e.g. “shark”, (please keep the order of the columns, but the labels of the head columns are not important).

The beginning of the code contains the main variables for the model, which may be changed according to your needs

nameData : The path and name of the file that contains the dataset.
simThreshold : The similarity threshold used for the clusterization process

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
Animals_11.csv		Animals_11.csv
Animals_15.csv		Animals_15.csv
Animals_7.csv		Animals_7.csv
Comidas_11.csv		Comidas_11.csv
Comidas_15.csv		Comidas_15.csv
Comidas_7.csv		Comidas_7.csv
LICENSE		LICENSE
Prendas_11.csv		Prendas_11.csv
Prendas_15.csv		Prendas_15.csv
Prendas_7.csv		Prendas_7.csv
README.md		README.md
clusteringWords.R		clusteringWords.R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Clustering Words

Requirements

Uses

Suggestion

About

Uh oh!

Releases

Packages

Languages

License

smorenoa/clusteringWords

Folders and files

Latest commit

History

Repository files navigation

Clustering Words

Requirements

Uses

Suggestion

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages