Dimensionality-reduction preview for tabular resources. The extension adds a resource view that projects a tabular resource into two or three dimensions and renders the result as a scatter plot.
Create the view, select a method, and generate a 2D or 3D projection of your data. You can color points by a chosen column and control which columns are used as features.
- Data loading: adapters handle CSV/TSV/XLS/XLSX; row sampling via `ckanext.dimred.max_rows`.
- Feature prep: numeric columns included; low-cardinality categoricals one-hot encoded if enabled; user can pick feature columns.
- Dimensionality reduction: choose UMAP, t-SNE, or PCA, with configurable defaults and per-view JSON overrides.
- Rendering: configurable backend, either interactive Apache ECharts with 3D scatter support (default) or static Matplotlib PNG (2D/3D); choose per view in the form, with the config value as the default; pluggable to a custom renderer if you override the bundle/module.
- API: `dimred_get_dimred_preview` returns the embedding and metadata (prep info, method params) for programmatic use.
- Caching: results are cached in Redis by default so repeat calls with the same settings avoid recomputing the projection (configurable TTL and on/off toggle).
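The feature-prep step in the list above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the extension's actual code: the helper name `prepare_features` and its parameters are hypothetical, cell values are assumed to arrive as strings (as they would from a CSV reader), and `max_categories` only mirrors the role of `ckanext.dimred.max_categories_for_ohe`.

```python
def _is_number(value):
    """Return True if the value can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def prepare_features(rows, enable_categorical=True, max_categories=30):
    """Turn a list of dict rows into (column_names, numeric_matrix).

    Hypothetical helper for illustration only.
    """
    if not rows:
        return [], []
    names, columns = [], []
    for col in rows[0]:
        values = [row[col] for row in rows]
        if all(_is_number(v) for v in values):
            # Numeric column: keep it as a single float feature.
            names.append(col)
            columns.append([float(v) for v in values])
        elif enable_categorical:
            categories = sorted(set(values))
            # Only low-cardinality columns are one-hot encoded;
            # anything above the threshold is dropped.
            if len(categories) <= max_categories:
                for cat in categories:
                    names.append(f"{col}={cat}")
                    columns.append([1.0 if v == cat else 0.0 for v in values])
    # Transpose from column-major to one list per row.
    matrix = [list(r) for r in zip(*columns)]
    return names, matrix


rows = [
    {"Sepal.Length": "5.1", "Petal.Width": "0.2", "Species": "setosa"},
    {"Sepal.Length": "7.0", "Petal.Width": "1.4", "Species": "versicolor"},
    {"Sepal.Length": "6.3", "Petal.Width": "2.5", "Species": "virginica"},
]
names, matrix = prepare_features(rows)
```

With these three rows, `names` holds the two numeric columns followed by one indicator column per species, and each row of `matrix` is a list of floats ready to pass to a reduction method. In practice a label column such as `Species` would more likely be used for coloring than as a feature.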
- Add a tabular resource (csv/tsv/xls/xlsx).
- Create a new resource view of type `dimred_view`.
- (Optional) Choose method (`UMAP`/`t-SNE`/`PCA`), pick `Color by` column, and select feature columns.
- (Optional) Choose output components (`2` or `3`); defaults come from the method config (e.g., `ckanext.dimred.umap.n_components`).
- (Optional) Pick render backend (`ECharts` interactive or `Matplotlib` PNG); defaults to the config value.
- Save or Preview to see the rendered embedding (interactive or PNG, depending on `ckanext.dimred.render_backend`), and use "Download embedding (CSV)" to get the coordinates.
API: use `dimred_get_dimred_preview` with `id` (resource id) and `view_id` to retrieve the embedding and metadata.
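As a rough sketch of calling this action over HTTP, the snippet below builds a standard CKAN Action API request (`POST /api/3/action/<action_name>` with a JSON body). The helper names, the example host, and the placeholder ids are assumptions for illustration; the exact structure of the returned result is not documented here, so the code only unwraps CKAN's usual `success`/`result` envelope.

```python
import json
import urllib.request


def build_preview_request(ckan_url, resource_id, view_id, api_token=None):
    """Build a POST request for the dimred_get_dimred_preview action."""
    url = f"{ckan_url.rstrip('/')}/api/3/action/dimred_get_dimred_preview"
    payload = json.dumps({"id": resource_id, "view_id": view_id}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_token:
        # CKAN reads the API token from the Authorization header.
        headers["Authorization"] = api_token
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def fetch_preview(request, timeout=60):
    """Send the request and return the action's "result" payload."""
    # Non-2xx responses raise urllib.error.HTTPError before this point.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    if not body.get("success"):
        raise RuntimeError(f"CKAN action failed: {body.get('error')}")
    return body["result"]


# Hypothetical host and ids; replace with real values before sending.
request = build_preview_request(
    "https://ckan.example.org", "<resource-id>", "<view-id>"
)
# result = fetch_preview(request)  # requires a reachable CKAN instance
```

Private datasets need an API token passed as `api_token`; public resources can usually be queried without one.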
- Set `n_components` to `3` in the form (or method parameters) to get a 3D embedding.
- Interactive backend (`ckanext.dimred.render_backend = echarts`) uses ECharts with `echarts-gl` so you can rotate/zoom/pan the point cloud.
- Static backend (`render_backend = matplotlib`) renders a 3D scatter PNG with a fixed view angle; rotation is available only with the interactive backend.
Iris dataset:
| rownames | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---|---|---|---|---|---|
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| ... | ... | ... | ... | ... | ... |
| 51 | 7.0 | 3.2 | 4.7 | 1.4 | versicolor |
| 52 | 6.4 | 3.2 | 4.5 | 1.5 | versicolor |
| ... | ... | ... | ... | ... | ... |
| 101 | 6.3 | 3.3 | 6.0 | 2.5 | virginica |
| 102 | 5.8 | 2.7 | 5.1 | 1.9 | virginica |
| ... | ... | ... | ... | ... | ... |
Creating the dimred view: Method, Feature selection, Color by:
Rendered 2D embedding PNG:
Compatibility with core CKAN versions:
| CKAN version | Compatible? |
|---|---|
| 2.9 and earlier | no |
| 2.10+ | yes |
To install ckanext-dimred:
- Activate your CKAN virtual environment, for example:

      . /usr/lib/ckan/default/bin/activate

- Clone the source and install it in the virtualenv:

      git clone https://github.com/DataShades/ckanext-dimred.git
      cd ckanext-dimred
      pip install -e .

- Add `dimred` to the `ckan.plugins` setting in your CKAN config file (by default the config file is located at `/etc/ckan/default/ckan.ini`).

- Restart CKAN. For example, if you've deployed CKAN with Apache on Ubuntu:

      sudo service apache2 reload

General defaults:
- `ckanext.dimred.default_method` (default: `umap`)
- `ckanext.dimred.allowed_methods` (default: `umap tsne pca`)
- `ckanext.dimred.max_file_size_mb` (default: `50`)
- `ckanext.dimred.max_rows` (default: `50000`)
- `ckanext.dimred.enable_categorical` (default: `true`)
- `ckanext.dimred.max_categories_for_ohe` (default: `30`)
- `ckanext.dimred.export_enabled` (default: `true`)
- `ckanext.dimred.cache_enabled` (default: `true`)
- `ckanext.dimred.cache_ttl` (default: `3600`)
- `ckanext.dimred.render_backend` (default: `echarts`; `echarts` for interactive chart, `matplotlib` for static PNG)
- `ckanext.dimred.render_asset` (optional; override the webassets bundle for the configured render backend)
- `ckanext.dimred.render_module` (optional; override the CKAN JS module for the configured render backend)
- `ckanext.dimred.embedding_decimals` (default: `3`; decimal places to round embedding coordinates before returning/exporting)
UMAP defaults:
- `ckanext.dimred.umap.n_neighbors` (default: `15`)
- `ckanext.dimred.umap.min_dist` (default: `0.1`)
- `ckanext.dimred.umap.n_components` (default: `2`)
t-SNE defaults:
- `ckanext.dimred.tsne.perplexity` (default: `30`)
- `ckanext.dimred.tsne.n_components` (default: `2`)
PCA defaults:
- `ckanext.dimred.pca.n_components` (default: `2`)
- `ckanext.dimred.pca.whiten` (default: `false`)
Example:
ckan.plugins = ... dimred
ckanext.dimred.allowed_methods = umap
ckanext.dimred.max_rows = 10000
ckanext.dimred.enable_categorical = true
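The caching options listed among the general defaults (`ckanext.dimred.cache_enabled`, `ckanext.dimred.cache_ttl`) rely on repeat calls with the same settings mapping to the same cache entry. One way such a key could be derived is sketched below. The function name, key prefix, and hashing scheme are assumptions for illustration, not the extension's actual implementation.

```python
import hashlib
import json


def make_cache_key(resource_id, view_id, settings, prefix="ckanext:dimred"):
    """Derive a deterministic cache key from a view's settings.

    Hypothetical scheme for illustration only.
    """
    # sort_keys makes the JSON text independent of dict insertion order,
    # so logically identical settings always hash to the same digest.
    canonical = json.dumps(settings, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}:{resource_id}:{view_id}:{digest}"


key = make_cache_key(
    "<resource-id>",
    "<view-id>",
    {"method": "umap", "n_components": 2, "n_neighbors": 15},
)
# With redis-py, the value could then be stored with an expiry, e.g.:
# redis_client.setex(key, 3600, serialized_result)
```

Changing any setting (for example `n_components` from `2` to `3`) produces a different digest, so a stale projection is never returned for new parameters.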
To install ckanext-dimred for development, activate your CKAN virtualenv and do:
git clone https://github.com/DataShades/ckanext-dimred.git
cd ckanext-dimred
pip install -e .
pip install -r dev-requirements.txt
To run the tests, do:
pytest --ckan-ini=test.ini


