Skip to content

Conversation

giladchase
Copy link
Contributor

@giladchase giladchase commented Sep 14, 2025

When slicing bytespans (to-be-implemented), the end-offset is trimmed by shifting the word to the
right.
The left offset, however, will be trimmed lazily only if the `ByteSpan` is casted into a `ByteArray`.

Note: Lazily removing the end-offset will require saving an additional field, `end_offset` in
`ByteSpan`, due to how strings are represented inside felt252 (the first byte is the msb of the
word). In other words, we cannot just reduce the remainder_len, because then it'd be impossible to
know how much to trim off from the remainder word at `to_byte_array`.

@reviewable-StarkWare
Copy link

This change is Reviewable

Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @TomerStarkware)


corelib/src/byte_array.cairo line 635 at r1 (raw file):

            remainder_word: 0,
            remainder_len: downcast(0).unwrap(),
        }

Suggestion:

            data: [].span(),
            first_char_start_offset: 0,
            remainder_word: 0,
            remainder_len: 0,
        }

corelib/src/byte_array.cairo line 657 at r1 (raw file):

impl ByteSpanIntoByteArray of Into<ByteSpan, ByteArray> {
    fn into(mut self: ByteSpan) -> ByteArray {
        let start_offset = upcast(self.first_char_start_offset);

delay the upcast - as not really needed for comparison.

Code quote:

        let start_offset = upcast(self.first_char_start_offset);

corelib/src/byte_array.cairo line 659 at r1 (raw file):

        let start_offset = upcast(self.first_char_start_offset);
        // Span is aligned to word boundaries.
        if start_offset == 0 || self.data.is_empty() {

this sounds possibly wrong - as if you had the bytearray "short" and took the sub "hort" you would have empty data, but this implementation would be wrong.

Code quote:

        if start_offset == 0 || self.data.is_empty() {

corelib/src/test/byte_array_test.cairo line 555 at r1 (raw file):

fn test_span_into_bytearray() {
    let empty_ba: ByteArray = "";
    assert_eq!(empty_ba.span().into(), empty_ba, "empty round-trip");

remove at most tests - this does not add clarity here.

Suggestion:

    assert_eq!(empty_ba.span().into(), empty_ba);

@giladchase giladchase force-pushed the gilad/09-03-feat_byte_array_add_bytearray_span_and_span_len_ branch from 87a0aa6 to d8c0e8b Compare September 14, 2025 09:43
@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch 2 times, most recently from 3ba3ba2 to 3129ad4 Compare September 14, 2025 10:18
Copy link
Contributor Author

@giladchase giladchase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @orizi and @TomerStarkware)


corelib/src/byte_array.cairo line 657 at r1 (raw file):

Previously, orizi wrote…

delay the upcast - as not really needed for comparison.

Done.


corelib/src/byte_array.cairo line 659 at r1 (raw file):

Previously, orizi wrote…

this sounds possibly wrong - as if you had the bytearray "short" and took the sub "hort" you would have empty data, but this implementation would be wrong.

I'm testing for this case in the next PR, see test_span_slice_under_31_bytes. It works as is, but the conditional (or the way ByteSpan::slice handles < 31 byte slices) is a bit misleading.

Slices that are < 31 bytes (so are held in pending/remainder word as a felt) split off the start prefix and save the word offset-free.

Meaning, the the structure of "short".span().slice(1,4) would be:

ByteSpan {
   data: [],
   remainder: "hort"
   start_offset: 0
}

rather than

ByteSpan {
   data: [],
   remainder: "short"
   start_offset: 1
}

Motivation for this is detailed in the slice PR, but the tldr is that the logic is simpler, and slice representation more consistent, if we add an invariant that the remainder word cannot have a start offset, just an end offset.
There is a small overhead for this of course (cost of split_bytes31), but i wanted to optimize for simplicity and consistency for starters.

Added a comment to clarify this.


corelib/src/test/byte_array_test.cairo line 555 at r1 (raw file):

Previously, orizi wrote…

remove at most tests - this does not add clarity here.

Done, tnx for letting me know these aren't required 🙏


corelib/src/byte_array.cairo line 635 at r1 (raw file):

            remainder_word: 0,
            remainder_len: downcast(0).unwrap(),
        }

🙏 Not sure why i thought this wasn't possible.

@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from 3129ad4 to 7b0a77a Compare September 14, 2025 10:29
Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orizi reviewed all commit messages.
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @giladchase and @TomerStarkware)


corelib/src/byte_array.cairo line 659 at r1 (raw file):

Previously, giladchase wrote…

I'm testing for this case in the next PR, see test_span_slice_under_31_bytes. It works as is, but the conditional (or the way ByteSpan::slice handles < 31 byte slices) is a bit misleading.

Slices that are < 31 bytes (so are held in pending/remainder word as a felt) split off the start prefix and save the word offset-free.

Meaning, the the structure of "short".span().slice(1,4) would be:

ByteSpan {
   data: [],
   remainder: "hort"
   start_offset: 0
}

rather than

ByteSpan {
   data: [],
   remainder: "short"
   start_offset: 1
}

Motivation for this is detailed in the slice PR, but the tldr is that the logic is simpler, and slice representation more consistent, if we add an invariant that the remainder word cannot have a start offset, just an end offset.
There is a small overhead for this of course (cost of split_bytes31), but i wanted to optimize for simplicity and consistency for starters.

Added a comment to clarify this.

i'm not sure it is better than allowing the slice itself to be simplest.
imagine, slice of slice of slice, and evaluation only here - this causes actual slicing of data with work, while when doing the final iteration should have been the only actual pricy action.

@giladchase giladchase changed the base branch from gilad/09-03-feat_byte_array_add_bytearray_span_and_span_len_ to graphite-base/8416 September 14, 2025 11:49
@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from 7b0a77a to 055a0e3 Compare September 14, 2025 13:50
@giladchase giladchase changed the base branch from graphite-base/8416 to gilad/09-03-feat_byte_array_add_bytearray_span_and_span_len_ September 14, 2025 13:50
@giladchase giladchase changed the base branch from gilad/09-03-feat_byte_array_add_bytearray_span_and_span_len_ to graphite-base/8416 September 15, 2025 10:46
@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from 055a0e3 to 558ac51 Compare September 16, 2025 10:06
@giladchase giladchase changed the base branch from graphite-base/8416 to gilad/09-03-feat_byte_array_add_bytearray_span_and_span_len_ September 16, 2025 10:06
Copy link
Contributor Author

@giladchase giladchase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @orizi and @TomerStarkware)


corelib/src/byte_array.cairo line 659 at r1 (raw file):

Previously, orizi wrote…

i'm not sure it is better than allowing the slice itself to be simplest.
imagine, slice of slice of slice, and evaluation only here - this causes actual slicing of data with work, while when doing the final iteration should have been the only actual pricy action.

Done.

Discussed offline: start offset will be applied lazily only in into, end-offset is shifted-right on the spot (otherwise we have to save an extra field on ByteSpan in order to apply it lazily, as discussed offline).

@giladchase giladchase changed the title feat(byte_array): add ByteSpan::into and bytes31_slice feat(byte_array): add ByteSpan::into Sep 16, 2025
Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orizi reviewed all commit messages.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @TomerStarkware)


corelib/src/byte_array.cairo line 696 at r4 (raw file):

    fn into(mut self: ByteSpan) -> ByteArray {
        let remainder_len = upcast(self.remainder_len);
        let Some(first_word) = self.data.pop_front() else {

can probably mix the code a bit the the rest as well.

Suggestion:

    fn into(mut self: ByteSpan) -> ByteArray {
        if self.first_char_start_offset == 0 {
            let mut ba = ByteArray { 
                data: self.data.into(),
                pending_word: 0,
                pending_word_len: 0,
            };
            ba.append_word(self.remainder_word, self.remainder_len);
            return ba;
        }
        let remainder_len = upcast(self.remainder_len);
        let Some(first_word) = self.data.pop_front() else {

Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orizi reviewed 1 of 2 files at r3, 1 of 1 files at r4.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @giladchase and @TomerStarkware)

@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from 672763a to 127e45d Compare September 29, 2025 14:11
Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@orizi reviewed 1 of 1 files at r12, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @TomerStarkware)

@giladchase giladchase force-pushed the gilad/09-29-refactor_byte_array_extract_bytearray_append_logic branch from ea841b4 to b581c8f Compare September 30, 2025 09:23
@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from 127e45d to 1000d8d Compare September 30, 2025 09:23
Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orizi reviewed 1 of 1 files at r13, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @TomerStarkware)

@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from 1000d8d to 1379fb9 Compare October 5, 2025 13:05
@giladchase giladchase force-pushed the gilad/09-29-refactor_byte_array_extract_bytearray_append_logic branch from b581c8f to 3910f15 Compare October 5, 2025 13:05
Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orizi reviewed 1 of 1 files at r14, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @TomerStarkware)

@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from 1379fb9 to ef06e84 Compare October 8, 2025 06:23
@giladchase giladchase force-pushed the gilad/09-29-refactor_byte_array_extract_bytearray_append_logic branch from 3910f15 to c27c17f Compare October 8, 2025 06:23
Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orizi reviewed 2 of 2 files at r15, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @TomerStarkware)

@giladchase giladchase force-pushed the gilad/09-29-refactor_byte_array_extract_bytearray_append_logic branch from c27c17f to ddbba3e Compare October 8, 2025 08:32
@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch 2 times, most recently from 36fe3fb to dc2c4f0 Compare October 8, 2025 08:40
@giladchase giladchase force-pushed the gilad/09-29-refactor_byte_array_extract_bytearray_append_logic branch 2 times, most recently from 26ed464 to 4903e01 Compare October 8, 2025 08:51
@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from dc2c4f0 to a4b8ab5 Compare October 8, 2025 08:51
@giladchase giladchase force-pushed the gilad/09-29-refactor_byte_array_extract_bytearray_append_logic branch from 4903e01 to 1ec28a3 Compare October 8, 2025 08:54
@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from a4b8ab5 to 1fb81cc Compare October 8, 2025 08:54
Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orizi reviewed 2 of 2 files at r16, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @TomerStarkware)

@graphite-app graphite-app bot changed the base branch from gilad/09-29-refactor_byte_array_extract_bytearray_append_logic to graphite-base/8416 October 8, 2025 09:12
When slicing bytespans (to-be-implemented), the end-offset is trimmed by shifting the word to the
right.
The left offset, however, will be trimmed lazily only if the `ByteSpan` is casted into a `ByteArray`.

Note: Lazily removing the end-offset will require saving an additional field, `end_offset` in
`ByteSpan`, due to how strings are represented inside felt252 (the first byte is the msb of the
word). In other words, we cannot just reduce the remainder_len, because then it'd be impossible to
know how much to trim off from the remainder word at `to_byte_array`.
@giladchase giladchase force-pushed the gilad/09-11-feat_byte_array_add_bytespan_into_ branch from 1fb81cc to 380727c Compare October 8, 2025 09:25
@graphite-app graphite-app bot changed the base branch from graphite-base/8416 to main October 8, 2025 09:26
Copy link

graphite-app bot commented Oct 8, 2025

Merge activity

  • Oct 8, 9:26 AM UTC: Graphite rebased this pull request, because this pull request is set to merge when ready.

@giladchase giladchase added this pull request to the merge queue Oct 8, 2025
Merged via the queue into main with commit 13b62db Oct 8, 2025
106 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants