Commit 2ec3370

[new release] raven (11 packages) (1.0.0~alpha1)
CHANGES:

This release expands the Raven ecosystem with three new libraries (Talon, Saga, Fehu) and significant enhancements to existing ones. `alpha1` focuses on breadth, adding foundational capabilities across data processing, NLP, and reinforcement learning, while continuing to iterate on core infrastructure.

### New Libraries

#### Talon - DataFrame Processing

We've added Talon, a new DataFrame library inspired by pandas and polars:

- Columnar data structures that support mixed types (integers, floats, strings, etc.) within a single table (aka heterogeneous datasets)
- Operations: filter rows, group by columns, join tables, compute aggregates
- Load and save data in CSV and JSON formats
- Seamless conversion to/from Nx arrays for numerical operations

#### Saga - NLP & Text Processing

Saga is a new text processing library for building language models. It provides:

- Tokenizers: Byte-pair encoding (BPE), WordPiece subword tokenization, and character-level splitting
- Text generation: Control output with temperature scaling, top-k filtering, nucleus (top-p) sampling, and custom sampling strategies
- Language models: Train and generate text with statistical n-gram models (bigrams, trigrams, etc.)
- I/O: Read large text files line-by-line and batch-process corpora

#### Fehu - Reinforcement Learning

Fehu brings reinforcement learning to Raven, with an API inspired by Gymnasium and Stable-Baselines3:

- Standard RL environment interface (reset, step, render) with example environments like Random Walk and CartPole
- Environment wrappers to modify observations, rewards, or episode termination conditions
- Vectorized environments to collect experience from multiple parallel rollouts
- Training utilities: Generalized advantage estimation (GAE), trajectory collection and management
- RL algorithms: Policy gradient method (REINFORCE), deep Q-learning (DQN) with replay buffer
- Use Kaun neural networks as function approximators for policies and value functions

### Major Enhancements

#### Nx - Array Computing

We've significantly expanded Nx following early user feedback from alpha0:

- Complete linear algebra suite: LAPACK-backed operations matching NumPy, including singular value decomposition (SVD), QR factorization, Cholesky decomposition, eigenvalue/eigenvector computation, matrix inverse, and solving linear systems
- FFT operations: Fast Fourier transforms (FFT/IFFT) for frequency domain analysis and signal processing
- Advanced operations: Einstein summation notation (`einsum`) for complex tensor operations, extract/construct diagonal matrices (`diag`), cumulative sums and products along axes
- Extended dtypes: Machine learning-focused types including bfloat16 (brain floating point), complex16, and float8 for reduced-precision training
- Symbolic shapes: Internal infrastructure for symbolic shape inference to enable dynamic shapes in future releases (not yet exposed in public API)
- Lazy views: Array views only copy and reorder memory when stride patterns require it, avoiding unnecessary allocations

#### Rune - Autodiff & JIT

We've continued iterating on Rune's autodiff capabilities, and made progress on upcoming features:

- Forward-mode AD: Compute Jacobian-vector products (`jvp`) for forward-mode automatic differentiation, complementing existing reverse-mode
- JIT: Ongoing development of LLVM-based just-in-time compilation for Rune computations (currently in prototype stage)
- vmap: Experimental support for vectorized mapping to automatically batch operations (work-in-progress, not yet stable)
- LLVM backend: Added compilation backend with support for LLVM versions 19, 20, and 21
- Metal backend: Continued work on GPU acceleration for macOS using Metal compute shaders

#### Kaun - Deep Learning

We've expanded Kaun with high-level APIs for deep learning. These APIs are inspired by popular Python frameworks like TensorFlow, PyTorch, and Flax, and should feel familiar to users building models in Python:

- High-level training: Keras-style `fit()` function to train models with automatic batching, gradient computation, and parameter updates
- Training state: Encapsulated training state (TrainState) holding parameters, optimizer state, and step count; automatic history tracking of loss and metrics
- Checkpoints: Save and load model weights to disk for model persistence and transfer learning
- Metrics: Automatic metric computation during training including accuracy, precision, recall, F1 score, mean absolute error (MAE), and mean squared error (MSE)
- Data pipeline: Composable dataset operations (map, filter, batch, shuffle, cache) inspired by TensorFlow's `tf.data` for building input pipelines
- Model zoo: Reference implementations of classic and modern architectures (LeNet5 for basic CNNs, BERT for masked language modeling, GPT2 for autoregressive generation) including reusable transformer components
- Ecosystem integration: Load HuggingFace model architectures (`kaun.huggingface`), access common datasets like MNIST and CIFAR-10 (`kaun.datasets`), and use standardized model definitions (`kaun.models`)

### Contributors

Thanks to everyone who contributed to this release:

- @adamchol (Adam Cholewi) - Implemented the initial `associative_scan` native backend operation for cumulative operations
- @akshay-gulab (Akshay Gulabrao)
- @DhruvMakwana (Dhruv Makwana) - Implemented `einsum` for Einstein summation notation
- @gabyfle (Gabriel Santamaria) - Built PocketFFT bindings that replaced our custom FFT kernels
- @lukstafi (Lukasz Stafiniak) - Major contributions to Fehu and the FunOCaml workshop on training Sokoban agents
- @nickbetteridge
- @sidkshatriya (Sidharth Kshatriya)
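
The release notes list generalized advantage estimation (GAE) among Fehu's training utilities without showing the computation. The following is a minimal, standard-library-only sketch of the usual GAE recurrence; it is not Fehu's API. The function name `gae`, its labelled arguments, and the assumption of a single unterminated trajectory with a bootstrap `last_value` are all hypothetical choices made for illustration.

```ocaml
(* Hypothetical helper, NOT part of Fehu: generalized advantage estimation
   over one trajectory, assuming no episode boundary inside it.

   delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
   A_t     = delta_t + gamma * lambda * A_{t+1}

   [last_value] is the value estimate of the state following the final step. *)
let gae ~gamma ~lambda ~rewards ~values ~last_value =
  let n = Array.length rewards in
  if Array.length values <> n then
    invalid_arg "gae: rewards and values must have the same length";
  let advantages = Array.make n 0.0 in
  let next_advantage = ref 0.0 in
  let next_value = ref last_value in
  (* Walk the trajectory backwards so A_{t+1} is known when computing A_t. *)
  for t = n - 1 downto 0 do
    let delta = rewards.(t) +. (gamma *. !next_value) -. values.(t) in
    next_advantage := delta +. (gamma *. lambda *. !next_advantage);
    advantages.(t) <- !next_advantage;
    next_value := values.(t)
  done;
  advantages
```

With `lambda = 1.0` and an all-zero value function, the advantages reduce to discounted returns, which makes the sketch easy to check by hand. A real implementation would also reset the recurrence at episode terminations.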
1 parent 6471587 commit 2ec3370

File tree

11 files changed, +587 -0 lines changed
  • packages
    • fehu/fehu.1.0.0~alpha1
    • hugin/hugin.1.0.0~alpha1
    • kaun/kaun.1.0.0~alpha1
    • nx-datasets/nx-datasets.1.0.0~alpha1
    • nx/nx.1.0.0~alpha1
    • quill/quill.1.0.0~alpha1
    • raven/raven.1.0.0~alpha1
    • rune/rune.1.0.0~alpha1
    • saga/saga.1.0.0~alpha1
    • sowilo/sowilo.1.0.0~alpha1
    • talon/talon.1.0.0~alpha1

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
opam-version: "2.0"
synopsis: "Reinforcement learning framework for OCaml"
description:
  "Fehu is a reinforcement learning framework built on Raven's ecosystem, providing environments, algorithms, and training utilities"
maintainer: ["Thibaut Mattio <[email protected]>"]
authors: ["Thibaut Mattio <[email protected]>"]
license: "ISC"
tags: [
  "reinforcement-learning" "machine-learning" "ai" "environments" "agents"
]
homepage: "https://github.com/raven-ml/raven"
doc: "https://raven-ml.dev/docs/"
bug-reports: "https://github.com/raven-ml/raven/issues"
depends: [
  "ocaml" {>= "5.3.0"}
  "dune" {>= "3.19"}
  "rune" {= version}
  "kaun" {= version}
  "yojson" {>= "2.0.0"}
  "alcotest" {with-test}
  "odoc" {with-doc}
]
build: [
  ["dune" "subst"] {dev}
  [
    "dune"
    "build"
    "-p"
    name
    "-j"
    jobs
    "--promote-install-files=false"
    "@install"
    "@runtest" {with-test}
    "@doc" {with-doc}
  ]
  ["dune" "install" "-p" name "--create-install-files" name]
]
dev-repo: "git+https://github.com/raven-ml/raven.git"
x-maintenance-intent: ["(latest)"]
url {
  src:
    "https://github.com/raven-ml/raven/releases/download/1.0.0_alpha1/raven-1.0.0.alpha1.tbz"
  checksum: [
    "sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867"
    "sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c"
  ]
}
x-commit-hash: "c9e8fe4badb33afbec7bb18e04698e2e249542aa"
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
opam-version: "2.0"
synopsis: "Visualization library for OCaml"
description:
  "Hugin is a powerful visualization library for OCaml that produces publication-quality plots and charts. It integrates with the Raven ecosystem to provide visualization of Nx data."
maintainer: ["Thibaut Mattio <[email protected]>"]
authors: ["Thibaut Mattio <[email protected]>"]
license: "ISC"
tags: ["visualization" "plotting" "charts" "data-science" "graphics"]
homepage: "https://github.com/raven-ml/raven"
doc: "https://raven-ml.dev/docs/"
bug-reports: "https://github.com/raven-ml/raven/issues"
depends: [
  "ocaml" {>= "5.3.0"}
  "dune" {>= "3.19"}
  "dune-configurator" {build}
  "conf-sdl2"
  "cairo2"
  "nx" {= version}
  "alcotest" {with-test}
  "odoc" {with-doc}
]
build: [
  ["dune" "subst"] {dev}
  [
    "dune"
    "build"
    "-p"
    name
    "-j"
    jobs
    "--promote-install-files=false"
    "@install"
    "@runtest" {with-test}
    "@doc" {with-doc}
  ]
  ["dune" "install" "-p" name "--create-install-files" name]
]
dev-repo: "git+https://github.com/raven-ml/raven.git"
x-maintenance-intent: ["(latest)"]
url {
  src:
    "https://github.com/raven-ml/raven/releases/download/1.0.0_alpha1/raven-1.0.0.alpha1.tbz"
  checksum: [
    "sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867"
    "sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c"
  ]
}
x-commit-hash: "c9e8fe4badb33afbec7bb18e04698e2e249542aa"
Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
opam-version: "2.0"
2+
synopsis: "Flax-inspired neural network library for OCaml"
3+
description:
4+
"Kaun brings modern deep learning to OCaml with a flexible, type-safe API for building and training neural networks. It leverages Rune for automatic differentiation and computation graph optimization while maintaining OCaml's functional programming advantages."
5+
maintainer: ["Thibaut Mattio <[email protected]>"]
6+
authors: ["Thibaut Mattio <[email protected]>"]
7+
license: "ISC"
8+
tags: ["neural-networks" "machine-learning" "deep-learning"]
9+
homepage: "https://github.com/raven-ml/raven"
10+
doc: "https://raven-ml.dev/docs/"
11+
bug-reports: "https://github.com/raven-ml/raven/issues"
12+
depends: [
13+
"ocaml" {>= "5.3.0"}
14+
"dune" {>= "3.19"}
15+
"logs"
16+
"yojson" {>= "2.0.0"}
17+
"domainslib" {>= "0.5.0"}
18+
"saga" {= version}
19+
"rune" {= version}
20+
"nx-datasets" {= version}
21+
"alcotest" {with-test}
22+
"odoc" {with-doc}
23+
]
24+
build: [
25+
["dune" "subst"] {dev}
26+
[
27+
"dune"
28+
"build"
29+
"-p"
30+
name
31+
"-j"
32+
jobs
33+
"--promote-install-files=false"
34+
"@install"
35+
"@runtest" {with-test}
36+
"@doc" {with-doc}
37+
]
38+
["dune" "install" "-p" name "--create-install-files" name]
39+
]
40+
dev-repo: "git+https://github.com/raven-ml/raven.git"
41+
x-maintenance-intent: ["(latest)"]
42+
url {
43+
src:
44+
"https://github.com/raven-ml/raven/releases/download/1.0.0_alpha1/raven-1.0.0.alpha1.tbz"
45+
checksum: [
46+
"sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867"
47+
"sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c"
48+
]
49+
}
50+
x-commit-hash: "c9e8fe4badb33afbec7bb18e04698e2e249542aa"
Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
opam-version: "2.0"
2+
synopsis: "Common datasets for machine learning"
3+
description:
4+
"A collection of common datasets for machine learning tasks, including image classification, regression, and more. This package provides easy access to popular datasets in a format compatible with Nx."
5+
maintainer: ["Thibaut Mattio <[email protected]>"]
6+
authors: ["Thibaut Mattio <[email protected]>"]
7+
license: "ISC"
8+
tags: [
9+
"datasets"
10+
"machine-learning"
11+
"data-science"
12+
"image-classification"
13+
"regression"
14+
]
15+
homepage: "https://github.com/raven-ml/raven"
16+
doc: "https://raven-ml.dev/docs/"
17+
bug-reports: "https://github.com/raven-ml/raven/issues"
18+
depends: [
19+
"ocaml" {>= "5.3.0"}
20+
"dune" {>= "3.19"}
21+
"ocurl"
22+
"csv"
23+
"logs"
24+
"nx" {= version}
25+
"alcotest" {with-test}
26+
"odoc" {with-doc}
27+
]
28+
build: [
29+
["dune" "subst"] {dev}
30+
[
31+
"dune"
32+
"build"
33+
"-p"
34+
name
35+
"-j"
36+
jobs
37+
"--promote-install-files=false"
38+
"@install"
39+
"@runtest" {with-test}
40+
"@doc" {with-doc}
41+
]
42+
["dune" "install" "-p" name "--create-install-files" name]
43+
]
44+
dev-repo: "git+https://github.com/raven-ml/raven.git"
45+
x-maintenance-intent: ["(latest)"]
46+
url {
47+
src:
48+
"https://github.com/raven-ml/raven/releases/download/1.0.0_alpha1/raven-1.0.0.alpha1.tbz"
49+
checksum: [
50+
"sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867"
51+
"sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c"
52+
]
53+
}
54+
x-commit-hash: "c9e8fe4badb33afbec7bb18e04698e2e249542aa"

packages/nx/nx.1.0.0~alpha1/opam

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
opam-version: "2.0"
synopsis: "High-performance N-dimensional array library for OCaml"
description:
  "Nx is the core component of the Raven ecosystem providing efficient numerical computation with multi-device support. It offers NumPy-like functionality with the benefits of OCaml's type system."
maintainer: ["Thibaut Mattio <[email protected]>"]
authors: ["Thibaut Mattio <[email protected]>"]
license: "ISC"
tags: ["numerical-computation" "tensor-library" "machine-learning"]
homepage: "https://github.com/raven-ml/raven"
doc: "https://raven-ml.dev/docs/"
bug-reports: "https://github.com/raven-ml/raven/issues"
depends: [
  "ocaml" {>= "5.3.0"}
  "dune" {>= "3.19"}
  "dune-configurator" {build}
  "conf-pkg-config" {build}
  "conf-zlib" {build}
  "stdlib-shims"
  "alcotest" {with-test}
  "mdx" {with-test}
  "odoc" {with-doc}
]
build: [
  ["dune" "subst"] {dev}
  [
    "dune"
    "build"
    "-p"
    name
    "-j"
    jobs
    "--promote-install-files=false"
    "@install"
    "@runtest" {with-test}
    "@doc" {with-doc}
  ]
  ["dune" "install" "-p" name "--create-install-files" name]
]
dev-repo: "git+https://github.com/raven-ml/raven.git"
x-maintenance-intent: ["(latest)"]
depexts: [
  ["libc-dev" "openblas-dev" "lapack-dev"] {os-distribution = "alpine"}
  ["epel-release" "openblas-devel"] {os-distribution = "centos"}
  ["libopenblas-dev" "liblapacke-dev"] {os-family = "debian"}
  ["libopenblas-dev" "liblapacke-dev"] {os-family = "ubuntu"}
  ["openblas-devel"] {os-family = "fedora"}
  ["libopenblas_openmp-devel"] {os-family = "suse" | os-family = "opensuse"}
  ["openblas" "lapacke" "cblas"] {os-distribution = "arch"}
  ["openblas"] {os = "macos" & os-distribution = "homebrew"}
  ["openblas" "lapacke"] {os = "freebsd"}
]
x-ci-accept-failures: [
  "oraclelinux-7"
  "oraclelinux-8"
  "oraclelinux-9"
]
url {
  src:
    "https://github.com/raven-ml/raven/releases/download/1.0.0_alpha1/raven-1.0.0.alpha1.tbz"
  checksum: [
    "sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867"
    "sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c"
  ]
}
x-commit-hash: "c9e8fe4badb33afbec7bb18e04698e2e249542aa"
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
opam-version: "2.0"
synopsis: "Interactive notebook for OCaml data science"
description:
  "Quill is an interactive notebook application for data exploration, prototyping, and knowledge sharing in OCaml. It provides a Jupyter-like experience with rich visualization and documentation capabilities."
maintainer: ["Thibaut Mattio <[email protected]>"]
authors: ["Thibaut Mattio <[email protected]>"]
license: "ISC"
tags: [
  "notebook" "interactive-computing" "data-science" "literate-programming"
]
homepage: "https://github.com/raven-ml/raven"
doc: "https://raven-ml.dev/docs/"
bug-reports: "https://github.com/raven-ml/raven/issues"
depends: [
  "ocaml" {>= "5.3.0"}
  "dune" {>= "3.19"}
  "dune-site" {>= "3.19.0"}
  "cmdliner"
  "wasm_of_ocaml-compiler"
  "js_of_ocaml-toplevel"
  "dream" {>= "1.0.0~alpha8"}
  "ppx_deriving_yojson"
  "crunch"
  "cmarkit"
  "vdom"
  "brr"
  "base64"
  "nx" {= version}
  "nx-datasets" {= version}
  "saga" {= version}
  "rune" {= version}
  "kaun" {= version}
  "sowilo" {= version}
  "hugin" {= version}
  "alcotest" {with-test}
  "odoc" {with-doc}
]
dev-repo: "git+https://github.com/raven-ml/raven.git"
x-maintenance-intent: ["(latest)"]
build: [
  ["dune" "subst"] {dev}
  [
    "dune"
    "build"
    "--root"
    "."
    "--only-packages"
    name
    "--no-config"
    "--profile"
    "release"
    "-j"
    jobs
    "--auto-promote"
    "--promote-install-files=false"
    "@install"
    "@runtest" {with-test}
    "@doc" {with-doc}
  ]
  ["dune" "install" "-p" name "--create-install-files" name]
]
url {
  src:
    "https://github.com/raven-ml/raven/releases/download/1.0.0_alpha1/raven-1.0.0.alpha1.tbz"
  checksum: [
    "sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867"
    "sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c"
  ]
}
x-commit-hash: "c9e8fe4badb33afbec7bb18e04698e2e249542aa"
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
opam-version: "2.0"
synopsis: "Meta package for the Raven ML ecosystem"
description:
  "Raven is a comprehensive machine learning ecosystem for OCaml. This meta package installs all Raven components including Nx (tensors), Hugin (plotting), Quill (notebooks), Rune (autodiff), Kaun (neural networks), and Sowilo (computer vision)."
maintainer: ["Thibaut Mattio <[email protected]>"]
authors: ["Thibaut Mattio <[email protected]>"]
license: "ISC"
tags: ["machine-learning" "data-science" "numerical-computation"]
homepage: "https://github.com/raven-ml/raven"
doc: "https://raven-ml.dev/docs/"
bug-reports: "https://github.com/raven-ml/raven/issues"
depends: [
  "dune" {>= "3.19"}
  "nx" {= version}
  "nx-datasets" {= version}
  "saga" {= version}
  "rune" {= version}
  "kaun" {= version}
  "sowilo" {= version}
  "fehu" {= version}
  "quill" {= version}
  "hugin" {= version}
  "odoc" {with-doc}
]
build: [
  ["dune" "subst"] {dev}
  [
    "dune"
    "build"
    "-p"
    name
    "-j"
    jobs
    "--promote-install-files=false"
    "@install"
    "@runtest" {with-test}
    "@doc" {with-doc}
  ]
  ["dune" "install" "-p" name "--create-install-files" name]
]
dev-repo: "git+https://github.com/raven-ml/raven.git"
x-maintenance-intent: ["(latest)"]
url {
  src:
    "https://github.com/raven-ml/raven/releases/download/1.0.0_alpha1/raven-1.0.0.alpha1.tbz"
  checksum: [
    "sha256=8e277ed56615d388bc69c4333e43d1acd112b5f2d5d352e2453aef223ff59867"
    "sha512=369eda6df6b84b08f92c8957954d107058fb8d3d8374082e074b56f3a139351b3ae6e3a99f2d4a4a2930dd950fd609593467e502368a13ad6217b571382da28c"
  ]
}
x-commit-hash: "c9e8fe4badb33afbec7bb18e04698e2e249542aa"
