-
Notifications
You must be signed in to change notification settings - Fork 635
feat(byte_array): add ByteSpan
iterator
#8530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(byte_array): add ByteSpan
iterator
#8530
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 825 at r1 (raw file):
byte_in_word_index: usize, remainder_word: felt252, remainder_len: usize,
If we're already messing with bytearray, maybe we can add a Word
struct that includes this couple (breaking change for bytearray).
Code quote:
remainder_word: felt252,
remainder_len: usize,
corelib/src/byte_array.cairo
line 863 at r1 (raw file):
#[inline] fn next(ref self: ByteSpanIter) -> Option<u8> {
Inlining because it's called implicitly from for
loops.
Code quote:
#[inline]
fn next(ref self: ByteSpanIter) -> Option<u8> {
73fe947
to
218a5bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed all commit messages.
Reviewable status: 0 of 4 files reviewed, 3 unresolved discussions (waiting on @giladchase and @TomerStarkware)
corelib/src/byte_array.cairo
line 825 at r1 (raw file):
Previously, giladchase wrote…
If we're already messing with bytearray, maybe we can add a
Word
struct that includes this couple (breaking change for bytearray).
short-str
sort of struct?
in any case, it is a more serious breakage, as direct normal usages will still work as is, and with this change they will not.
corelib/src/byte_array.cairo
line 817 at r2 (raw file):
/// An iterator struct over a ByteSpan. // TODO(giladchase): Use this in byte array into_iter and deprecate ByteArrayIter.
wrong place for the todo - as it would easily be forgotten here.
corelib/src/byte_array.cairo
line 825 at r2 (raw file):
byte_in_word_index: usize, remainder_word: felt252, remainder_len: usize,
or BoundedInt<0, 31>, whatever is actually correct.
Suggestion:
remainder_len: BoundedInt<0, 30>,
a29cd34
to
2280b52
Compare
96b6e7e
to
7e50300
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed all commit messages.
Reviewable status: 0 of 4 files reviewed, 3 unresolved discussions (waiting on @giladchase and @TomerStarkware)
corelib/src/test/byte_array_test.cairo
line 870 at r3 (raw file):
assert_eq!(iter.next(), Some('C')); assert_eq!(iter.next(), None); assert_eq!(iter.next(), None, "Idempotent empty");
and so on
Suggestion:
let empty: ByteArray = "";
assert_eq!(empty.span().into_iter().collect(), array![]);
let ba: ByteArray = "A";
assert_eq!(ba.span().into_iter().collect(), array!['A']);
let ba: ByteArray = "ABC";
assert_eq!(ba.span().into_iter().collect(), array!['A', 'B', 'C']);
7e50300
to
d32b608
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 825 at r1 (raw file):
short-str
sort of struct?
Yep, that's better :)
in any case, it is a more serious breakage, as direct normal usages will still work as is, and with this change they will not.
Got it, resolving.
corelib/src/byte_array.cairo
line 817 at r2 (raw file):
Previously, orizi wrote…
wrong place for the todo - as it would easily be forgotten here.
Right, and indeed it was forgotten at #8539 :)
Done + consolidated existing similar old todo.
corelib/src/byte_array.cairo
line 825 at r2 (raw file):
Previously, orizi wrote…
or BoundedInt<0, 31>, whatever is actually correct.
Thought about this too, but this hits a snag in next()
, where it's compared with byte_in_word_index
. So given that it's not possible to directly compare usize with BoundedInt
(?), this ends up adding an additional upcast
every next
call by either:
- upcasting
remainder_len
before comparing withbyte_in_word_index
. - make
byte_in_word_index
also aBoundedInt
and increment it withbyte31_index_inc
(which contains an upcast + trim_max).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed all commit messages.
Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @giladchase and @TomerStarkware)
corelib/src/byte_array.cairo
line 825 at r2 (raw file):
Previously, giladchase wrote…
Thought about this too, but this hits a snag in
next()
, where it's compared withbyte_in_word_index
. So given that it's not possible to directly compare usize withBoundedInt
(?), this ends up adding an additionalupcast
everynext
call by either:
- upcasting
remainder_len
before comparing withbyte_in_word_index
.- make
byte_in_word_index
also aBoundedInt
and increment it withbyte31_index_inc
(which contains an upcast + trim_max).
got it - in any case use (0..current_word_len).into_iter()
to iterate over both.
7e3b9ef
to
727b961
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 825 at r2 (raw file):
Previously, orizi wrote…
got it - in any case use
(0..current_word_len).into_iter()
to iterate over both.
Done.
corelib/src/test/byte_array_test.cairo
line 870 at r3 (raw file):
Previously, orizi wrote…
and so on
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed all commit messages.
Reviewable status: 0 of 4 files reviewed, 3 unresolved discussions (waiting on @TomerStarkware)
corelib/src/byte_array.cairo
line 973 at r5 (raw file):
data_iter: [].span().into_iter(), current_word: Default::default(), current_word_len: Default::default(),
this includes current_word_len
Code quote:
current_word_len: Default::default(),
02d1817
to
826e33e
Compare
727b961
to
15d5c7d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed 1 of 3 files at r3, all commit messages.
Reviewable status: 1 of 4 files reviewed, 1 unresolved discussion (waiting on @giladchase and @TomerStarkware)
corelib/src/byte_array.cairo
line 863 at r1 (raw file):
Previously, giladchase wrote…
Inlining because it's called implicitly from
for
loops.
doesn't matter - should still be decided by the algorithm.
15d5c7d
to
8ae9468
Compare
826e33e
to
1b13525
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed all commit messages.
Reviewable status: 1 of 4 files reviewed, 1 unresolved discussion (waiting on @giladchase and @TomerStarkware)
8ae9468
to
def6613
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 4 files reviewed, 1 unresolved discussion (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 863 at r1 (raw file):
Previously, orizi wrote…
doesn't matter - should still be decided by the algorithm.
Done
corelib/src/byte_array.cairo
line 973 at r5 (raw file):
Previously, orizi wrote…
this includes
current_word_len
Not sure i understand what you mean, do you mean that current_word_len is redundant because it's already baked into byte_in_word_iter
?
If so, then I thought of this as well, but since we are saving the iterator, rather than the Range
, we don't have access to range.end
anymore. Alternatively, we can save a Range
object instead of both current_word_len
and byte_in_word_iter
and then increment Range.start
on next()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 3 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 1104 at r14 (raw file):
Previously, orizi wrote…
and update inplementation.
Won't work for remainder words (since we consider end_index to represent len
for remainders, rather than len - 1
as is the case for current_word). For example, if pop returns Option<u8>
we'll get:
"A".span().iter().next() // => None
// The `next` calls self.pop_first(), which returns None because `iter.remainder.end_index == 0` so the `dec` on 0 returns None.
Conversly, if we return something like (None, last_byte)
we hit the issue in the other direction with remainder: the last pop will return word[len]
which is not defined for remainder words (so "".span().iter().next()
will still output a byte from iter.word even though it's not initiailized)
I think we have to separate the peek from the pop if we want to allow the caller to overload this struct's "length" like we are doing.
Either that or stop overloading this struct and create a different struct for the remainder 🥲
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 3 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 1104 at r14 (raw file):
Previously, giladchase wrote…
Won't work for remainder words (since we consider end_index to represent
len
for remainders, rather thanlen - 1
as is the case for current_word). For example, if pop returnsOption<u8>
we'll get:"A".span().iter().next() // => None // The `next` calls self.pop_first(), which returns None because `iter.remainder.end_index == 0` so the `dec` on 0 returns None.Conversly, if we return something like
(None, last_byte)
we hit the issue in the other direction with remainder: the last pop will returnword[len]
which is not defined for remainder words (so"".span().iter().next()
will still output a byte from iter.word even though it's not initiailized)I think we have to separate the peek from the pop if we want to allow the caller to overload this struct's "length" like we are doing.
Either that or stop overloading this struct and create a different struct for the remainder 🥲
NVM just spotted the suggestion of using Boundedint<0,31> instead, ignore the above :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed 2 of 2 files at r15, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @TomerStarkware)
ccfb107
to
5c79087
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 969 at r14 (raw file):
Previously, orizi wrote…
updated short-str - now i think the game about the index is no longer required.
Done.
corelib/src/byte_array.cairo
line 1104 at r14 (raw file):
Previously, giladchase wrote…
NVM just spotted the suggestion of using Boundedint<0,31> instead, ignore the above :)
Done
corelib/src/byte_array.cairo
line 1084 at r14 (raw file):
// choice. end_index: Bytes31Index, }
Done.
5c79087
to
d71857a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 3 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 980 at r15 (raw file):
pub struct ByteSpanIter { // Whether all bytes have been consumed. iterator_exhausted: bool,
Removed, it's no longer necessary since exhaustion is now unambiguously implied from the data and reminder word being empty.
This makes next()
on an exhausted itertor slightly more expensive, but next()
on non-exhausted iterators faster (also less code and struct size)
Code quote:
iterator_exhausted: bool,
corelib/src/byte_array.cairo
line 1074 at r17 (raw file):
struct ShortString { /// The actual data. word: u256,
Saves an extra into::<u256>
for u8_at_u256, and since word
is read-only, we don't gain anything by keeping it a felt.
Code quote:
word: u256,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 6 unresolved discussions (waiting on @TomerStarkware)
corelib/src/byte_array.cairo
line 1074 at r17 (raw file):
Previously, giladchase wrote…
Saves an extra
into::<u256>
for u8_at_u256, and sinceword
is read-only, we don't gain anything by keeping it a felt.
no idea what you are mentioning.
corelib/src/byte_array.cairo
line 1004 at r16 (raw file):
if self.remainder.word_len == 0 { // Remainder is empty. return None; }
works without - and basically as efficient - as the entire structure is returned in any case.
corelib/src/byte_array.cairo
line 1030 at r16 (raw file):
} // If remainder length is nonzero then it's strictly larger than the start offset.
No need - more code, and still equivalent.
Suggestion:
// On empty data span, remainder length is larger than or equals to the start offset.
corelib/src/byte_array.cairo
line 1037 at r16 (raw file):
return ByteSpanIter { data_iter, current_word: ShortString { word: self.remainder_word, word_len: word_len },
Suggestion:
data_iter,
current_word: ShortString { word: self.remainder_word, word_leמ },
corelib/src/byte_array.cairo
line 1267 at r16 (raw file):
pub fn length_sub_offset(length: BoundedInt<0, 31>, offset: Bytes31Index) -> BoundedInt<0, 31> { let diff = bounded_int::sub(length, offset); bounded_int::constrain::<_, 0>(diff).unwrap_err()
dont unwrap - return option - and handle with empty viable code.
Code quote:
bounded_int::constrain::<_, 0>(diff).unwrap_err()
c79bbee
to
e11d736
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed 1 of 1 files at r18, all commit messages.
Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @giladchase and @TomerStarkware)
corelib/src/byte_array.cairo
line 1076 at r18 (raw file):
word: u256, /// The actual length of the short string in bytes. word_len: BoundedInt<0, 31>,
cleaner names.
Suggestion:
/// The actual data.
data: u256,
/// The actual length of the short string in bytes.
len: BoundedInt<0, 31>,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 6 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 1074 at r17 (raw file):
Previously, orizi wrote…
no idea what you are mentioning.
Rationale why it's ok to make word
a u256
, rather than felt252
.
corelib/src/byte_array.cairo
line 1037 at r16 (raw file):
return ByteSpanIter { data_iter, current_word: ShortString { word: self.remainder_word, word_len: word_len },
Done.
e11d736
to
b66e573
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed 1 of 2 files at r16.
Reviewable status: 4 of 5 files reviewed, 5 unresolved discussions (waiting on @giladchase and @TomerStarkware)
corelib/src/byte_array.cairo
line 1074 at r17 (raw file):
Previously, giladchase wrote…
Rationale why it's ok to make
word
au256
, rather thanfelt252
.
but it is larger - and it requires the change for the remainder as well.
revert for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed all commit messages.
Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @giladchase and @TomerStarkware)
corelib/src/byte_array.cairo
line 999 at r20 (raw file):
Some(word) => { let word_len = upcast(BYTES_IN_BYTES31); self.current_word = ShortString { word: (*word).into(), word_len };
rather have this, as the create sierra is possibly smaller, and it is still the same meaning (the name includes the number in any case..)
Suggestion:
self.current_word = ShortString { word: (*word).into(), word_len: 31 };
b74fbe0
to
0eb401e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 5 files reviewed, 5 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 1076 at r18 (raw file):
Previously, orizi wrote…
cleaner names.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed 1 of 1 files at r21, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @TomerStarkware)
0eb401e
to
afeac4a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 1004 at r16 (raw file):
Previously, orizi wrote…
works without - and basically as efficient - as the entire structure is returned in any case.
Done.
corelib/src/byte_array.cairo
line 1030 at r16 (raw file):
Previously, orizi wrote…
No need - more code, and still equivalent.
Done.
corelib/src/byte_array.cairo
line 1267 at r16 (raw file):
Previously, orizi wrote…
dont unwrap - return option - and handle with empty viable code.
Done, wasn't sure what you meant in the last part of the sentence, i assumed you meant to expect
on the option in the callsite.
(Failure in either callsite is a bug on our end, so if we return an empty iterator it would be hiding the bug.)
corelib/src/byte_array.cairo
line 999 at r20 (raw file):
Previously, orizi wrote…
rather have this, as the create sierra is possibly smaller, and it is still the same meaning (the name includes the number in any case..)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed 1 of 1 files at r22, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @TomerStarkware)
corelib/src/byte_array.cairo
line 1022 at r22 (raw file):
upcast(self.remainder_len), self.first_char_start_offset, ) .expect('Start offset is at most len');
Suggestion:
let len = match helpers::length_sub_offset(
upcast(self.remainder_len), self.first_char_start_offset,
) {
Some(len) => len,
// Can't actually happen, as start offset is at most lengnth.
None => 0,
};
corelib/src/byte_array.cairo
line 1032 at r22 (raw file):
let len = helpers::length_sub_offset(upcast(BYTES_IN_BYTES31), self.first_char_start_offset) .expect('Start offset is at most 30');
Suggestion:
// TODO(orizi): Use `complement_to_31` when added.
let len = match helpers::length_sub_offset(upcast(BYTES_IN_BYTES31), self.first_char_start_offset) {
Some(len) => len,
// Can't actually happen, as start offset is at most 30.
None => 0,
};
afeac4a
to
459d8cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @orizi and @TomerStarkware)
corelib/src/byte_array.cairo
line 1022 at r22 (raw file):
upcast(self.remainder_len), self.first_char_start_offset, ) .expect('Start offset is at most len');
Done.
Note that these trigger an incorrect warning from cairo-analyzer extension, it suggests to replace them with unwrap_or_default
, which doesn't work because len
is a BoundedInt, so it has no default.
(unwrap_or(0)
does work though.)
corelib/src/byte_array.cairo
line 1032 at r22 (raw file):
let len = helpers::length_sub_offset(upcast(BYTES_IN_BYTES31), self.first_char_start_offset) .expect('Start offset is at most 30');
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orizi reviewed 1 of 1 files at r23, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @TomerStarkware)
corelib/src/byte_array.cairo
line 1022 at r22 (raw file):
Previously, giladchase wrote…
Done.
Note that these trigger an incorrect warning from cairo-analyzer extension, it suggests to replace them with
unwrap_or_default
, which doesn't work becauselen
is a BoundedInt, so it has no default.
(unwrap_or(0)
does work though.)
cute lint - ignore in any case.
No description provided.