Conversation

@mhov mhov commented Jun 24, 2025

Faster arrays!

Now that Array<T> is stable, this PR ports in performance enhancements for creating PostgreSQL int[] arrays, based on the optimizations found in PostgreSQL's contrib/intarray extension (specifically new_intArrayType(int num)).

The goal is to significantly speed up the conversion from &[i32] to a PostgreSQL ArrayType Datum.

Background

Currently, to return a PostgreSQL int[] from a Rust function, we typically return a Vec<i32>, which is then converted to a Datum via array_datum_from_iter(..).

This approach uses:

  • pg_sys::initArrayResult
  • pg_sys::accumArrayResult
  • pg_sys::makeArrayResult

These APIs build a varlena array by accumulating values and repeatedly reallocating memory as the capacity grows. For larger arrays, this results in substantial performance overhead.
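The accumulate-then-finalize flow can be sketched in plain Rust, using a Vec as a stand-in for PostgreSQL's ArrayBuildState (the function below is illustrative, not the actual pg_sys API):

```rust
// Analogy for initArrayResult / accumArrayResult / makeArrayResult:
// start empty, append one element at a time (reallocating as capacity
// grows), then hand back the finished result.
fn accumulate(values: &[i32]) -> Vec<i32> {
    let mut state = Vec::new(); // initArrayResult: empty build state
    for &v in values {
        state.push(v); // accumArrayResult: may reallocate as the state grows
    }
    state // makeArrayResult: finalize into the result
}
```

The repeated growth is what makes this pattern expensive for large arrays.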

Optimization

PostgreSQL's contrib/intarray takes a different approach for fixed-size datums: it pre-allocates the entire array up front and directly memcpys the data into ARR_DATA_PTR. This avoids the need for reallocations and is much faster.
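As a rough illustration of the single-allocation idea (the header size below is a hypothetical placeholder, not the real PostgreSQL ArrayType layout):

```rust
use std::mem::size_of;

// Hypothetical fixed overhead for a 1-D, no-nulls array header.
const HEADER_BYTES: usize = 20;

// Size the whole buffer up front, then copy the element payload in one
// pass after the header — no incremental reallocation.
fn build_array_bytes(elems: &[i32]) -> Vec<u8> {
    let data_bytes = elems.len() * size_of::<i32>();
    let mut buf = vec![0u8; HEADER_BYTES + data_bytes]; // single allocation
    let dst = &mut buf[HEADER_BYTES..]; // analogous to ARR_DATA_PTR
    for (chunk, e) in dst.chunks_exact_mut(size_of::<i32>()).zip(elems) {
        chunk.copy_from_slice(&e.to_ne_bytes());
    }
    buf
}
```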

What I've added

  • Implemented BoxRet for Array<'a, T> when T is a fixed-size numeric type (i8, i16, i32, f32, f64)
    • Enables using Array<T> directly as a return type from #[pg_extern] functions.
  • Adds optimized allocation logic for these numeric types
    • Pre-allocates the full ArrayType
    • Uses memcpy-like behavior for fast construction from slices
  • Adds helper functions for:
    • Creating empty arrays
    • Creating Array<T> from &[T]

Benchmarks

I've benchmarked two approaches for turning Vec<i32> -> Array<i32>:

  1. The current array_datum_from_iter(..)
  2. The new fast pre-allocated method

Each test used 10,000 rows of randomly generated int[] in a temp table, with work_mem = '1GB'. I used Instant::now() to time only the conversion itself, in microseconds, and summed those timings across all rows:

    #[pg_extern]
    fn accum_alloc(input: Vec<i32>) -> i32 {
        let start = std::time::Instant::now();
        let array = input.into_datum().unwrap();  // this uses array_datum_from_iter(..)
        let duration = start.elapsed();
        duration.subsec_micros() as i32
    }
    
    #[pg_extern]
    fn fast_alloc(input: Vec<i32>) -> i32 {
        let start = std::time::Instant::now();
        let array = Array::<i32>::new_from_slice(input.as_slice()).expect("couldn't allocate");
        let duration = start.elapsed();
        duration.subsec_micros() as i32
    }
| Row Count | int[] Length | accum_alloc (total μs) | fast_alloc (total μs) | Improvement |
|-----------|--------------|------------------------|-----------------------|-------------|
| 10,000    | 10           | 16,933.00              | 2,120.00              | 8.0×        |
| 10,000    | 100          | 155,352.00             | 10,228.00             | 15.2×       |
| 10,000    | 1,000        | 1,375,613.00           | 30,443.00             | 45.2×       |
| 10,000    | 10,000       | 13,666,835.00          | 251,825.00            | 54.3×       |

It's been a while since I contributed, so I'm a little rusty on matching the team's style and conventions, and as usual the lifetime differences between PG-allocated and Rust-allocated memory still escape me. I'm sure the ergonomics of the new Array functions could be improved. Am I doing anything the wrong way here?

@workingjubilee (Member) left a comment

Some initial comments, mostly style nits. Let's discuss the motivation more.

If memory serves, I believe an array is a varlena allocation that does not have to match the precise size of the array's length, yes? That is, the varlena can be allocated bigger than the array needs. Why not something more like the spare_capacity_mut API that Vec has, so we can construct things without first zeroing them? I believe the zeroing overhead is in many cases inconsequential and often optimized-out anyways, but we can always make our code easier to optimize.
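For reference, the Vec::spare_capacity_mut API mentioned above exposes the uninitialized tail of a Vec as MaybeUninit slots, so you can fill it without zeroing first and then commit the length:

```rust
// Reserve capacity, write directly into the uninitialized tail, then
// commit the length — no zero-fill of the buffer beforehand.
fn fill_without_zeroing(n: usize) -> Vec<i32> {
    let mut v: Vec<i32> = Vec::with_capacity(n);
    for (i, slot) in v.spare_capacity_mut().iter_mut().enumerate().take(n) {
        slot.write(i as i32 * 10);
    }
    // SAFETY: exactly the first `n` elements were initialized in the loop above.
    unsafe { v.set_len(n) };
    v
}
```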

    let elem_size = std::mem::size_of::<T>();
    let nbytes: usize = port::ARR_OVERHEAD_NONULLS(1) + elem_size * len;

    unsafe {
Member

In new code, a safe function with internal unsafe should explain why the code is sound.
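A minimal sketch of that convention (the function below is hypothetical, not code from this PR): each unsafe operation inside a safe function carries a SAFETY comment stating why it cannot trigger undefined behavior.

```rust
// Safe wrapper around an unsafe raw-pointer read.
fn first_elem(v: &[u8]) -> Option<u8> {
    if v.is_empty() {
        return None;
    }
    // SAFETY: `v` is non-empty (checked above), so reading index 0 through
    // the raw pointer stays within the slice's allocation.
    Some(unsafe { *v.as_ptr() })
}
```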


mhov commented Sep 3, 2025

@usamoi @workingjubilee instead of all the T: Sized + IntoDatum + UnboxDatum stuff, should we just make an opt-in trait, like we do for RangeSubType, that we specifically implement for i8, i16, i32, i64, f32, f64?

@workingjubilee (Member) commented

@mhov Probably so, I think that would be a better starting point anyways as then it would be clearer what we're asking for.
