Labeled multi-dimensional arrays for JavaScript
A JavaScript/TypeScript implementation inspired by Python's xarray library
Features • Installation • Quick Start • API • Examples
Working with multi-dimensional labeled data in JavaScript shouldn't be painful. jaxray brings the power of Python's xarray to the JavaScript ecosystem, making it easy to work with scientific data, climate datasets, and any labeled arrays.
Perfect for:
- 🌍 Climate and weather data analysis
- 📊 Time series with multiple dimensions
- 🗺️ Geospatial data processing
- 🔬 Scientific computing in the browser
- 📈 Large dataset streaming and processing
- ✨ Labeled Arrays: Named dimensions and coordinates like xarray
- 🎯 Smart Selection: Select by labels with nearest neighbor, forward/backward fill
- 🌊 Streaming: Process massive datasets chunk-by-chunk with progress tracking
- 📦 Zarr Support: Read sharded Zarr stores directly from IPFS
- 🔐 Encryption: Transparent XChaCha20-Poly1305 authenticated encryption for Zarr datasets
- 🔒 Type-Safe: Full TypeScript support with complete type definitions
- 💨 Memory Efficient: Stream large selections without loading everything
- 🔄 Immutable: All operations return new instances
npm install @dclimate/jaxrayimport { DataArray } from '@dclimate/jaxray';
// Simple 1D array with labeled coordinates
const temperatures = new DataArray([20, 22, 25, 23, 21], {
dims: ['time'],
coords: {
time: ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
},
attrs: {
units: 'celsius',
description: 'Daily temperatures'
}
});
console.log(temperatures.data); // [20, 22, 25, 23, 21]
console.log(temperatures.shape); // [5]const gridData = new DataArray(
[
[1, 2, 3],
[4, 5, 6]
],
{
dims: ['y', 'x'],
coords: {
y: [0, 10],
x: [0, 10, 20]
}
}
);// Select by label
const wednesday = await temperatures.sel({ time: 'Wed' });
console.log(wednesday.data); // 25
// Select multiple values
const selected = await temperatures.sel({ time: ['Mon', 'Wed', 'Fri'] });
console.log(selected.data); // [20, 25, 21]
// Slice selection
const midweek = await temperatures.sel({ time: { start: 'Tue', stop: 'Thu' } });
console.log(midweek.data); // [22, 25, 23]
// Select by integer position
const byIndex = await temperatures.isel({ time: 2 });
console.log(byIndex.data); // 25jaxray supports xarray-style nearest neighbor lookups and interpolation methods:
const data = new DataArray([10, 20, 30, 40, 50], {
dims: ['x'],
coords: {
x: [0, 5, 10, 15, 20]
}
});
// Find nearest coordinate
const nearest = await data.sel({ x: 7 }, { method: 'nearest' });
console.log(nearest.data); // 20 (nearest to x=7 is x=5)
// Forward fill (last value <= target)
const ffill = await data.sel({ x: 12 }, { method: 'ffill' });
console.log(ffill.data); // 30 (last value where x <= 12 is x=10)
// Backward fill (first value >= target)
const bfill = await data.sel({ x: 12 }, { method: 'bfill' });
console.log(bfill.data); // 40 (first value where x >= 12 is x=15)
// With tolerance
const tolerant = await data.sel({ x: 7 }, {
method: 'nearest',
tolerance: 3
});
// Succeeds because distance is 2
// This would throw an error (distance 7 > tolerance 2)
await data.sel({ x: 13 }, { method: 'nearest', tolerance: 2 });jaxray can auto-detect whether a CID points to a sharded Zarr store or a HAMT-backed one. All you need is the CID:
import { Dataset, openIpfsStore } from '@dclimate/jaxray';
const cid = 'bafy…';
// Automatically detects the underlying layout and selects the correct store
const { type, store } = await openIpfsStore(cid);
console.log(`Detected store type: ${type}`); // "sharded" or "hamt"
// Use the store with the regular Dataset API
const ds = await Dataset.open_zarr(store);
// Work with the dataset immediately
console.log(ds.dataVars);
const snapshot = await ds.sel({
latitude: 45,
longitude: 34,
time: '2025-09-02T00:00:00Z'
});
console.log(snapshot.getVariable('temperature').values);The helper uses dClimate's public gateway by default. Provide your own gateway or custom IPFS primitives when needed:
import { createIpfsElements, openIpfsStore } from '@dclimate/jaxray';
await openIpfsStore(cid, { gatewayUrl: 'https://ipfs.my-org.dev' });
// or reuse your own IPFS primitives
const customElements = createIpfsElements('https://ipfs.my-org.dev');
const { type, store } = await openIpfsStore(cid, { ipfsElements: customElements });jaxray supports transparent encryption and decryption of Zarr datasets using XChaCha20-Poly1305 authenticated encryption. This provides both confidentiality and integrity protection for your data.
Before opening encrypted datasets, register the encryption codec with your key management:
import { registerXChaCha20Poly1305Codec } from '@dclimate/jaxray';
// Register with a hex-encoded 256-bit key
registerXChaCha20Poly1305Codec({
getKey: () => '00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff',
// Optional: customize nonce generation (defaults to crypto.getRandomValues)
// nonceGenerator: () => crypto.getRandomValues(new Uint8Array(24))
});Key Management Best Practices:
- 🔑 Key Generation: Use cryptographically secure random number generators to create 256-bit (32 byte) keys
- 🔒 Key Storage: Store keys securely using environment variables, key management services (KMS), or secure enclaves
- 🎲 Nonce Generation: The default nonce generator uses
crypto.getRandomValues()for secure randomness. Each encryption operation automatically generates a fresh nonce ⚠️ Never reuse nonces with the same key - the implementation handles this automatically
Once the codec is registered, encrypted datasets can be opened transparently:
import { ZarrBackend } from '@dclimate/jaxray';
// Open an encrypted Zarr store
const dataset = await ZarrBackend.open(encryptedStore);
// Check if the dataset is encrypted
console.log(dataset.isEncrypted); // true
// Access data - decryption happens automatically
const variable = dataset.getVariable('temperature');
const data = await variable.compute();
console.log(data.data); // Decrypted valuesjaxray automatically detects encrypted datasets by inspecting the Zarr metadata for XChaCha20-Poly1305 codecs:
const dataset = await ZarrBackend.open(store);
if (dataset.isEncrypted) {
console.log('Dataset contains encrypted variables');
// Ensure codec is registered with correct key
}- Encryption: When data is written, each chunk is encrypted with a randomly generated 24-byte nonce. The nonce is prepended to the ciphertext (first 24 bytes)
- Decryption: When reading, the nonce is extracted from the chunk, and the data is decrypted and authenticated
- Authentication: XChaCha20-Poly1305 includes a 16-byte authentication tag that protects against tampering. Decryption will fail if the data has been modified or if the wrong key is used
If decryption fails (wrong key, corrupted data, or tampering detected), an error will be thrown:
try {
const data = await variable.compute();
} catch (error) {
if (error.message.match(/tag|auth/i)) {
console.error('Decryption failed: Invalid key or corrupted data');
}
}- ✅ Authenticated Encryption: XChaCha20-Poly1305 provides both confidentiality and integrity
- ✅ Unique Nonces: Each encryption operation uses a fresh random nonce
- ✅ Extended Nonce: XChaCha20 uses 192-bit nonces, eliminating collision concerns
⚠️ Key Protection: The security depends entirely on keeping your encryption keys secret⚠️ Metadata: Zarr metadata (array shapes, dimension names, etc.) is not encrypted
See the xchacha20poly1305 example for a complete working example.
For large datasets that don't fit in memory, use streaming to process data in chunks:
// Stream large time series data
const largeData = new DataArray(/* ... large array ... */, {
dims: ['time'],
coords: { time: /* ... */ }
});
const stream = largeData.selStream(
{ time: ['2020-01-01', '2020-12-31'] },
{ chunkSize: 50 } // Process in 50MB chunks
);
for await (const chunk of stream) {
console.log(`Progress: ${chunk.progress}%`);
console.log(`Processing ${chunk.bytesProcessed} / ${chunk.totalBytes} bytes`);
// Process this chunk (e.g., write to file, compute statistics)
await processChunk(chunk.data);
}Streaming also works with Datasets:
const stream = dataset.selStream(
{ time: ['2020-01-01', '2020-12-31'], lat: 45, lon: -73 },
{ chunkSize: 100, method: 'nearest' }
);
for await (const chunk of stream) {
const temp = chunk.data.getVariable('temperature');
const pressure = chunk.data.getVariable('pressure');
// Write chunk to disk or process incrementally
await writeToFile(temp, pressure);
}// Sum all values
const total = temperatures.sum();
console.log(total); // 111
// Mean of all values
const average = temperatures.mean();
console.log(average); // 22.2
// Sum along a dimension
const rowSums = gridData.sum('x');
console.log(rowSums.data); // [6, 15]import { DataArray, Dataset } from '@dclimate/jaxray';
// Create multiple related DataArrays
const temp = new DataArray(
[[15, 16], [18, 19]],
{
dims: ['lat', 'lon'],
coords: {
lat: [40.0, 40.5],
lon: [-74.0, -73.5]
}
}
);
const pressure = new DataArray(
[[1013, 1014], [1012, 1013]],
{
dims: ['lat', 'lon'],
coords: {
lat: [40.0, 40.5],
lon: [-74.0, -73.5]
}
}
);
// Combine into a Dataset
const weather = new Dataset({
temperature: temp,
pressure: pressure
});
console.log(weather.dataVars); // ['temperature', 'pressure']
console.log(weather.dims); // ['lat', 'lon']
// Select from Dataset
const location = await weather.sel({ lat: 40.0, lon: -73.5 });
const tempAtLocation = location.getVariable('temperature');
console.log(tempAtLocation?.data); // 16
// Works with nearest neighbor too
const nearLocation = await weather.sel(
{ lat: 40.2, lon: -73.7 },
{ method: 'nearest' }
);const humidity = new DataArray([[65, 70], [68, 72]], {
dims: ['lat', 'lon'],
coords: {
lat: [40.0, 40.5],
lon: [-74.0, -73.5]
}
});
const humidityData = new Dataset({ humidity });
const combined = weather.merge(humidityData);
console.log(combined.dataVars); // ['temperature', 'pressure', 'humidity']Concatenate datasets along a dimension to combine time-series data from multiple sources. This is particularly useful for joining finalized and non-finalized data, or combining historical and recent observations:
// Create first dataset (historical data)
const historical = new Dataset({
temperature: new DataArray(
[[15.2, 16.8], [18.3, 19.1], [20.5, 21.2]],
{
dims: ['time', 'location'],
coords: {
time: ['2020', '2021', '2022'],
location: ['Station-A', 'Station-B']
}
}
)
});
// Create second dataset (recent data)
const recent = new Dataset({
temperature: new DataArray(
[[22.1, 23.4], [24.2, 25.1]],
{
dims: ['time', 'location'],
coords: {
time: ['2023', '2024'],
location: ['Station-A', 'Station-B']
}
}
)
});
// Concatenate along time dimension
const combined = historical.concat(recent, { dim: 'time' });
console.log(combined.sizes.time); // 5
console.log(combined.coords.time); // ['2020', '2021', '2022', '2023', '2024']
// Query across both datasets
const data = await combined.sel({ time: ['2021', '2022', '2023'] });
const computed = await data.compute();
console.log(computed.getVariable('temperature').data);
// [[18.3, 19.1], [20.5, 21.2], [22.1, 23.4]]Key Features:
- 🔗 Lazy Evaluation: The result is a lazy dataset that only fetches data when needed
- 🎯 Smart Routing: Automatically queries the correct source dataset(s) based on selection
- 📊 Multi-Variable: Works with datasets containing multiple variables
- ✅ Validation: Ensures dimensions and variables match across datasets
Use Cases:
- Combining finalized and provisional weather data
- Joining historical archives with recent observations
- Merging data from different time periods or sources
See the concat-datasets example for more details.
Use the where() method to filter data based on conditions. Create conditions using comparison methods like .lt(), .gt(), .le(), .ge():
const data = new DataArray([10, 20, 30, 40, 50], {
dims: ['time'],
coords: { time: ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'] }
});
// Filter where values > 25 (keep values meeting condition, replace others with NaN)
const condition = data.gt(25);
const filtered = data.where(condition);
console.log(filtered.data); // [NaN, NaN, 30, 40, 50]
// Replace values not meeting condition with 0
const replaced = data.where(condition, 0);
console.log(replaced.data); // [0, 0, 30, 40, 50]
// Works with datasets too
const temp = weather.getVariable('temperature');
const hot = temp.gt(15);
const filtered_weather = weather.where(hot);
// Both temperature and pressure filtered by the conditionAvailable comparison methods: .lt(), .le(), .gt(), .ge(), .equal(), .notEqual()
Calculate rolling statistics over a dimension:
const temps = new DataArray([10, 15, 20, 18, 22, 25, 23], {
dims: ['time'],
coords: { time: ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'] }
});
// Rolling mean with window size 3
const rolling_mean = temps.rolling('time', 3).mean();
console.log(rolling_mean.data);
// [NaN, NaN, 15, 17.67, 20, 21.67, 23.33]
// Rolling sum
const rolling_sum = temps.rolling('time', 3).sum();
console.log(rolling_sum.data);
// [NaN, NaN, 45, 53, 60, 65, 70]
// Works with datasets
const rolling_weather = weather.rolling('lat', 2).mean();
// Both temperature and pressure computed with rolling meanRename data variables in a dataset:
const weather = new Dataset({
temperature: temp,
pressure: pressure
});
// Rename single variable
const renamed = weather.rename({ temperature: 'temp' });
console.log(renamed.dataVars); // ['temp', 'pressure']
// Rename multiple variables
const renamed_all = weather.rename({
temperature: 'temp',
pressure: 'pres'
});
console.log(renamed_all.dataVars); // ['temp', 'pres']Update or add coordinates to a dataset:
const weather = new Dataset({
temperature: temp,
pressure: pressure
});
// Update existing coordinates
const updated = weather.assignCoords({
lat: [41.0, 41.5], // New latitude values
lon: [-75.0, -74.5] // New longitude values
});
// Coordinates can also be DataArrays
const new_lats = new DataArray([41.0, 41.5], {
dims: ['lat']
});
const updated2 = weather.assignCoords({ lat: new_lats });Remove variables from a dataset:
const weather = new Dataset({
temperature: temp,
pressure: pressure,
humidity: humidity
});
// Drop single variable
const dropped = weather.dropVars('humidity');
console.log(dropped.dataVars); // ['temperature', 'pressure']
// Drop multiple variables
const dropped_multiple = weather.dropVars(['humidity', 'pressure']);
console.log(dropped_multiple.dataVars); // ['temperature']Remove dimensions of size 1:
const data = new DataArray(
[[[1, 2, 3]]],
{
dims: ['x', 'y', 'z'],
coords: {
x: [0], // size 1 - will be squeezed
y: [10], // size 1 - will be squeezed
z: [100, 101, 102]
}
}
);
const squeezed = data.squeeze();
console.log(squeezed.dims); // ['z']
console.log(squeezed.shape); // [3]
console.log(squeezed.data); // [1, 2, 3]
// Works with datasets too
const squeezed_weather = weather.squeeze();
// All dimensions of size 1 are removed from all variablesnew DataArray(data, options?)Parameters:
data: Multi-dimensional array of valuesoptions:dims: Array of dimension namescoords: Coordinate values for each dimensionattrs: Metadata attributesname: Name of the DataArray
data: Get the underlying datavalues: Alias for datadims: Array of dimension namesshape: Array of dimension sizescoords: Coordinate valuesattrs: Metadata attributesname: Name of the DataArrayndim: Number of dimensionssize: Total number of elements
sel(selection, options?): Select by coordinate labelsoptions.method: Selection method ('nearest', 'ffill', 'bfill', 'pad', 'backfill')options.tolerance: Maximum distance for method selection
selStream(selection, options?): Stream selection in chunks (returns AsyncGenerator)options.chunkSize: Target chunk size in MB (default: 100)options.dimension: Dimension to chunk along (default: auto-detect)options.method: Selection methodoptions.tolerance: Maximum distance for method selection
isel(selection): Select by integer positionswhere(condition, other?, options?): Filter data based on a conditioncondition: Boolean DataArray (create with comparison methods like.lt(),.gt(), etc.)other: Optional replacement value where condition is false (default: NaN)options.keepAttrs: Preserve attributes in result (default: false)options.skipNa: Skip NaN values in condition (default: false)
rolling(dim, window, options?): Create rolling window object- Returns
DataArrayRollingwith.mean()and.sum()methods window: Size of rolling windowoptions.min_periods: Minimum observations in window (default: window size)options.center: Center the window (default: false)
- Returns
squeeze(): Remove dimensions of size 1sum(dim?): Sum along dimension (or all values)mean(dim?): Mean along dimension (or all values)toObject(): Convert to plain JavaScript objecttoJSON(): Convert to JSON string
new Dataset(dataVars?, options?)Parameters:
dataVars: Object mapping variable names to DataArraysoptions:coords: Shared coordinatesattrs: Metadata attributes
dataVars: Array of variable namesdims: Array of all dimension namescoords: Shared coordinatesattrs: Metadata attributessizes: Object mapping dimension names to sizes
addVariable(name, dataArray): Add a new variablegetVariable(name): Get a variable by name (throws if not found)get(key): Dictionary-style access -get('varname')orget(['var1', 'var2'])hasVariable(name): Check if variable existsremoveVariable(name): Remove a variablesel(selection, options?): Select by coordinate labelsoptions.method: Selection method ('nearest', 'ffill', 'bfill', 'pad', 'backfill')options.tolerance: Maximum distance for method selection
selStream(selection, options?): Stream selection in chunks (returns AsyncGenerator)options.chunkSize: Target chunk size in MB (default: 100)options.dimension: Dimension to chunk along (default: auto-detect)options.method: Selection methodoptions.tolerance: Maximum distance for method selection
isel(selection): Select by integer positionsmap(fn): Apply function to all variableswhere(condition, other?, options?): Filter data based on a conditioncondition: Boolean DataArray or Dataset (create with comparison methods like.lt(),.gt(), etc.)other: Optional replacement value where condition is false (default: NaN)options.keepAttrs: Preserve attributes in result (default: false)options.skipNa: Skip NaN values in condition (default: false)
rolling(dim, window, options?): Create rolling window object- Returns
DatasetRollingwith.mean()and.sum()methods window: Size of rolling windowoptions.min_periods: Minimum observations in window (default: window size)options.center: Center the window (default: false)
- Returns
rename(mapping): Rename variablesmapping: Object mapping old names to new names
assignCoords(coords): Update or add coordinatescoords: Object mapping dimension names to coordinate values or DataArrays
dropVars(names): Remove variablesnames: String or array of variable names to drop
squeeze(): Remove dimensions of size 1merge(other): Merge with another DatasettoObject(): Convert to plain JavaScript objecttoJSON(): Convert to JSON string
See the examples directory for more detailed usage examples.
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm testMIT
Contributions are welcome! Please feel free to submit a Pull Request.
This library is inspired by Python's xarray library, which provides labeled multi-dimensional arrays for scientific computing.