SIMD-0363: Simple Alpenglow Clock #363

Conversation
removed this paragraph (see discussion with ashwin)
Could you articulate how these clock conditions interact with fast leader handover / a changing parent, and why your approach still works under those conditions? Separately, with this SIMD it may be worth including a "parent block (slot, hash)" field in the block footer in addition to the existing clock. This would be the "final parent" of the block, taking fast leader handover into account. Two good things happen as a result:
(1) given the final parent, we can skip a block with a bad clock quickly on shred ingest rather than after replay, since the first shred of the last
(2) repair in a block with an UpdateParent marker gets slightly better because now there are two shreds with the "final parent" that race each other

EDIT: fixed an issue with comment (1) above.
> In Alpenglow, the current block leader includes an updated integer clock value (Rust system time in nanoseconds) in its block *b* in slot *s* in the block footer. This value is bounded by the clock value in the parent block.
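For concreteness, a minimal sketch of what such a bound check could look like. The 400 ms constant and the "strictly higher, at most 2x the elapsed time" formula are assumptions pieced together from the discussion below, not normative text from the SIMD:

```rust
/// Assumed nominal slot duration; the real bound formula may differ.
const NOMINAL_SLOT_NANOS: u64 = 400_000_000;

/// Accept a proposed footer clock only if it is strictly higher than the
/// parent's clock and does not advance more than twice the nominal time
/// elapsed since the parent (hypothetical rule for illustration).
fn clock_in_bounds(parent_nanos: u64, proposed_nanos: u64, slots_since_parent: u64) -> bool {
    let max_advance = 2 * slots_since_parent * NOMINAL_SLOT_NANOS;
    proposed_nanos > parent_nanos && proposed_nanos <= parent_nanos + max_advance
}
```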
At what point should we get the system clock time? When it is time to create our leader slot? When we actually send out the block footer? Or just refer to what the current clock sysvar says?
You insert your system time when you produce the slice with the clock info in the block marker.
I mean - what precisely do you mean by "system time when you produce the slice?"
E.g. is this:
- The beginning of block production for a slot?
- The end of block production for a slot, post block building?
- When we actually create the block footer, post block building, and pre-shredding for turbine dissemination?
The specific choice isn't really enforceable, although, for default implementation purposes / consistency, it's probably worth opining on the specifics in the SIMD.
All of these are possible (and enforceable). This depends on how we organize meta-data in a block (i.e., our discussion in Chicago). If we may change the parent, then the header (the first slice) is the wrong place, but the footer (the last slice) should be alright.
For now, it looks like nothing prevents a validator from reporting a timestamp that is off by +/- 50 ms - e.g., an honest validator might capture its timestamp at the beginning of block production, somewhere in the middle, or even at the end.
To somewhat make the notion of a clock uniform across validator implementations, we should probably specify roughly at what point in a leader's production of a block the timestamp should be captured.
If this can somehow be enforced, that's even better imho.
> For now, it looks like nothing prevents a validator from reporting a timestamp that is off by +/- 50 ms...
Well, I would say it can even be up to 800 ms wrongly reported. Nothing we can do here.
But we established that the clock should be in the block footer, and we established that it should be captured before putting it in there, so what's still unclear?
> All of these are possible (and enforceable).

> Well, I would say it can even be up to 800 ms wrongly reported. Nothing we can do here.
This part is a bit unclear - these two statements seem contradictory. I agree with the second statement you're making re. being off by up to 800 ms, though.
> But we established that the clock should be in the block footer, and we established that it should be captured before putting it in there, so what's still unclear?
The part that's unclear - at what exact part of the leader block production phase should we have leaders set the block timestamp?
Should the timestamp be associated with when the leader starts producing their block? When the leader conclusively knows what the block parent is? When the leader actually constructs the footer itself?
I'm aware that it's impossible to enforce the particular point in time, as you pointed out re. the 800 ms piece. This being said, it's worth having a sane default.
"We assume that a correct leader inserts its correct local time..."
Maybe this is not clear enough? When you insert the time value in the footer, you insert your local time at that point. This way we have the least (additional) skew. Note that we will still have a systematic lag, since the slice then has to be encoded, sent through Rotor/Turbine, and decoded on the receiver side. We might consider taking that systematic lag into account.
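As a minimal sketch of "insert your local time at that point" (illustrative only; the SIMD text, not this code, is authoritative):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Capture the local wall-clock time at the moment the footer is assembled,
/// i.e., the "least additional skew" point discussed above.
fn footer_timestamp_nanos() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the UNIX epoch")
        .as_nanos() as u64
}
```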
This should be mentioned in the fast leader handover SIMD, in my opinion. (1) If a parent is changed and we consider everything before the parent change irrelevant, then we must include the clock again. (2) If the metadata of the block before the parent change is still relevant even after the parent change, then we don't include it again. In any case, we must have exactly 1 valid clock entry per block. But mentioning it in the fast handover SIMD is more natural because it's the same exception for all the cases.
We could decide to have the clock always in the last slice (if we have other information that always goes in the last slice). If this is the way to go, then we can describe it here as well. Thanks for these good questions.
What do you mean by "we must include the clock again?" The block footer, which is where the clock value will reside, will appear exactly once per block, after the final entry batch (see SIMD 0307). Could we mention SIMD 0307 + specify that it's specifically the
The clock won't always be in the last slice / FEC Set, but rather in the last block marker, which will eventually span multiple FEC Sets. This is why I'm suggesting that we place parent block information into the footer as part of this SIMD - if (1) the clock and (2) the final parent are both included within the same shred, we can run the clock check in this SIMD exactly once, directly in shred ingest, prior to replay.
Sure, footer is okay. In fact, line 40 already says footer.
I think there's a bit of confusion here. Yes, we're in agreement that the clock should go into the footer; e.g., SIMD 0307 specifies this. This isn't what I'm referring to, though. To clarify - at the moment, the only fields included in the block footer are:
I'm saying that, in this SIMD, we should consider proposing a new third field to the block footer of type
I don't disagree. But this should rather be in SIMD 0307.
FYI - after a few conversations, looks like we'll be punting on placing a third field in the block footer denoting the parent. We plan on accomplishing this via other means (can elaborate if there's interest) in later work.
Okay, now I understand your argument. Why only include the parent and the slot of the parent, and not also the actual time of that parent slot? Then our check is even easier because all the data is already there... Would you agree that this is a slippery slope? Including the same information a second time is problematic in my opinion. Now we would additionally have to check that the second inclusion of the information equals the first. What if it does not? Is the block then still skipped?
Including the parent (again) is reasonable for repair. But this is orthogonal to this clock SIMD, isn't it?
## Security Considerations
Assuming my understanding is correct that:
- If no blocks are produced, then the next leader can set the time to be the same as the last (no change) or up to 2x the time since the last block;
- OR; if no slots are produced at all (no skips), the clock will simply halt, with no chance for the next leader to place the correct time.
Then I think the following risk scenario is worth considering:
- Money market (MM), let's say marginfi, uses pyth pull oracles, where a user is able to update the oracle and then use the MM.
- Chain halts for 1 hour; now the clock is 1 hour behind. Imagine the chain halt coincided with market volatility.
- Attacker uses the 1-hour-old pyth payload to open insolvent positions on the MM.
This is an example where the clock is not off by a few seconds but can be off by hours. In this scenario most current MMs would be vulnerable, as they use the following code:

```rust
let price = price_feed_account
    .get_price_no_older_than_with_custom_verification_level(
        clock,
        max_age,
        feed_id,
        MIN_PYTH_PUSH_VERIFICATION_LEVEL,
    )
    .map_err(|e| {
        debug!("Pyth push oracle error: {:?}", e);
        let error: MarginfiError = e.into();
        error
    })?;
```

Which in turn relies on the following check:

```rust
check!(
    price
        .publish_time
        .saturating_add(maximum_age.try_into().unwrap())
        >= clock.unix_timestamp,
    GetPriceError::PriceTooOld
);
```

This check will be completely broken until the clock catches up, allowing stale prices to be pushed.
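To make the failure mode concrete, a minimal sketch with invented numbers (every value below is an assumption):

```rust
fn main() {
    // Invented numbers: the chain halted ~1 hour ago and just restarted.
    let clock_unix_timestamp: i64 = 1_700_000_000; // lagging on-chain clock after restart
    let publish_time: i64 = 1_699_999_970;         // price published just before the halt
    let maximum_age: i64 = 60;                     // app accepts prices up to 60 s old
    // The staleness check passes even though ~1 hour of wall-clock time has
    // elapsed, because the lagging on-chain clock has not yet caught up.
    assert!(publish_time.saturating_add(maximum_age) >= clock_unix_timestamp);
}
```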
Caveat: The same issue exists with the current clock, which I believe will have the clock for block N use the votes from N-1, which of course will be pre-halt and thus stale. That said, it will correct within a few slots, as opposed to this clock, which will have a much longer vulnerability window.
> Assuming my understanding is correct that:
> - If no blocks are produced, then the next leader can set the time to be the same as the last (no change) or up to 2x the time since the last block;
Basically, yes. However, the new value has to be strictly higher (just by 1 tick though).
> ... Chain halts for 1 hour ...
Wait, what? So you're saying that the whole chain is down for a full hour? None of the expected 9,000 blocks in that hour appended? I would say that in this case we have much bigger problems, don't we?
What we could do, of course, is narrow the time the leader can choose in such a case - in the most extreme case even narrow it down to basically 1 hour +/- 400 ms. (This is an oversimplification, because if the chain was really down for 1 hour, our emergency protocol would kick in, and slots would eventually get exponentially longer.)
But, essentially, we could change the formula, and give the leader a narrower window of choice if the chain was down for a very long time.
Would that make it better?
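A minimal sketch of what "narrowing the window" might look like. Every constant here is invented, and how `expected_gap_nanos` would be derived (e.g., from slot timing under the emergency protocol) is left open:

```rust
/// Returns the (min, max) clock values a leader may propose, hypothetically
/// tightening the window after a long gap since the parent block.
fn allowed_clock_range(parent_nanos: u64, expected_gap_nanos: u64) -> (u64, u64) {
    const LONG_OUTAGE_NANOS: u64 = 60_000_000_000; // assumed: gaps over 60 s are outages
    const DELTA_NANOS: u64 = 10_000_000_000;       // assumed: +/- 10 s freedom afterwards
    if expected_gap_nanos > LONG_OUTAGE_NANOS {
        // After a long outage, pin the new clock near the expected elapsed time.
        (
            parent_nanos + expected_gap_nanos.saturating_sub(DELTA_NANOS),
            parent_nanos + expected_gap_nanos + DELTA_NANOS,
        )
    } else {
        // Normal rule from the discussion: strictly higher, up to 2x the gap.
        (parent_nanos + 1, parent_nanos + 2 * expected_gap_nanos)
    }
}
```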
I think the core protocol is sound and desirable in 99.99% of operation. In the 0.01% where we have a chain halt it would be ideal if:
- The protocol restarted with the correct time
- OR; There was a way for onchain applications to detect that a halt/emergency mode had occurred
The former of the two is desirable as it protects applications that rely on semi accurate time without those applications needing to change their logic.
> So you're saying that the whole chain is down for a full hour? None of the expected 9,000 blocks in that hour appended? I would say that in this case we have much bigger problems, don't we?
It's entirely feasible that the Solana blockchain halts during a liquidation cascade that promptly reverts. If money markets come back online after the recovery, they should have minimal bad debt. However, if the clock lags, then these money markets will observe all of those prices as the clock catches up. In this case an attacker will be able to open positions at the maximally depressed prices, resulting in much more bad debt and potentially blowing up all on-chain lending protocols in the worst case.
The reasons chain halts are bad are:
1. Continuous services stop (CLOBs, payments). Not scary.
2. Protocol assumptions break down (liquidations, oracle update timeliness). Very scary.

Unfortunately, the only way to address the liquidations part of (2) is to have 100% uptime. However, we can address oracle update timeliness by providing an accurate post-restart clock.
> But, essentially, we could change the formula, and give the leader a narrower window of choice if the chain was down for a very long time.
> Would that make it better?
This sounds perfect, how easy is this to achieve?
> This sounds perfect, how easy is this to achieve?
Not difficult. The question is how exactly we would be doing it.
- The simplest way is narrowing the window (as suggested in the previous answer).
There might be more elaborate ways (also easy to implement):
- For instance, after a long outage without any blocks, have the next 3 or 5 (or your favorite odd number) leaders propose a time, and we take the median as a new starting time. But there are downsides to this solution; in particular, we would have 3 (or 5) blocks without a good time.
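A minimal sketch of the median idea, assuming a plain (unweighted) median over an odd number of leader proposals:

```rust
/// Pick the median of the proposed restart times; with 3 or 5 proposals and
/// an honest majority, the median is bounded by honest values.
fn restart_time_nanos(mut proposals: Vec<u64>) -> Option<u64> {
    if proposals.is_empty() {
        return None;
    }
    proposals.sort_unstable();
    Some(proposals[proposals.len() / 2])
}
```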
Re solution 1: For a 1-hour outage, you say that choosing a new time in [epsilon, 2 hours] is too much freedom. What makes most sense? [1 hour - delta, 1 hour + delta] for what delta? What's the maximum delta that still makes sense? (A larger delta is better because it increases the chances that the actual time is in the interval.)
Hmm, I guess the question might boil down to: is it worse to
- Take a dependency on NTP (but only during a restart/major slot skip);
- OR; Accept that it's possible for the clock to be 30m+ out of sync while it's catching up.
If we take a dependency on NTP conditional on some critical failure having occurred prior, what do you see as the increased risk to the protocol? I would assume we now have a risk that a % of our validators have poisoned NTP and thus we cannot restart the protocol until the NTP issues are manually corrected/overridden?
A dependency on NTP is a big NO from my side. It directly affects consensus, and I cannot prove correctness of consensus anymore. And you only get the 30 minute skew after a 30 minute outage (which we will never have, fingers crossed).
BUT: It's good that you raised the question, and that we are aware of it. I will (eventually) think more seriously about it. At this point, I would suggest that we eventually have a special program which can be used to forward the blockchain time. Anybody can propose a vote to jump to a future time; then all the stakers can suggest a time, and we take the median of those suggestions. If enough stake participates (i.e., sends its time vote to the program in the allowed time frame), we jump there. I don't think we need to immediately implement this, but it's good to have the thought ready if we need it.
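A minimal sketch of such a vote. The stake weighting of the median is my assumption; the comment above only says "median":

```rust
/// Given (proposed_time, stake) votes, return the stake-weighted median:
/// the smallest proposed time covering at least half of the voting stake.
fn stake_weighted_median(mut votes: Vec<(u64, u64)>) -> Option<u64> {
    let total_stake: u64 = votes.iter().map(|&(_, stake)| stake).sum();
    if total_stake == 0 {
        return None;
    }
    votes.sort_unstable_by_key(|&(time, _)| time);
    let mut seen = 0u64;
    for (time, stake) in votes {
        seen += stake;
        if seen * 2 >= total_stake {
            return Some(time);
        }
    }
    None // unreachable when total_stake > 0
}
```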
Okay, that sounds fair. The only thing to flag is that it would be ideal if the clock time was fixed before any DeFi transactions were processed. For instance, if it took 32 slots (random number I chose) to fix the clock, then ideally for the first 32 slots no user txs are processed, to prevent the looting of vulnerable DeFi protocols.
Though agreed - ideally we don't have a 30 minute outage ever again... or at least it doesn't coincide with large market volatility.
We could try to expose this information to the protocols and have the vulnerable protocols do this logic of dismissing all transactions for some time. But I would definitely not want to halt all user transactions just because some protocols have a bug.
If there's a way to expose that the clock is/may be out of sync to the application layer, then I agree that would solve this. My fear was that doing so would be roughly equivalent to reaching consensus on the clock time itself, but if that's not the case then perfect.
@rogerANZA here are implementation details we'll want to include in the SIMD:

Implementation

In proposing a block, a leader will include a special marker called a "block footer," which stores a UNIX timestamp (in nanoseconds). As of the writing of this SIMD, the

Upon receiving a block footer from a leader, non-leader validators will:
Separately, could you add my name to the authors list please (
@OliverNChalk @qkniep - any further updates on this? In the interest of getting this SIMD across the finish line, I'd strongly prefer coming up with a solution to this in a follow-up SIMD, while keeping this SIMD about the design for when the chain is online.
Works for me - happy for @qkniep to propose exposing additional information to DeFi apps in a subsequent proposal |