Conversation
invalidation_cagg_add_entries(int32 ht_id, Datum start, Datum end)
{
    ContinuousAggInfo info = ts_continuous_agg_get_all_caggs_info(ht_id);
    ListCell *lc;

    foreach (lc, info.mat_hypertable_ids)
    {
        int32 mat_ht_id = lfirst_int(lc);

        invalidation_cagg_add_entry(mat_ht_id, start, end);
    }
}
@svenklemm will we change this semantic in this PR? Today all the invalidation logs are per-hypertable, and the refresh procedure then moves the entries to the cagg invalidation log. According to the design document we agreed this will not change now, since we should benchmark first to check that it does not hurt performance for cases where a user has 1 hypertable and N caggs. /cc @gayyappan
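For context, here is a minimal standalone model of the current two-step flow described above. All names in it (LogEntry, log_dml_invalidation, refresh_move_invalidations) are made up for illustration and are not the real TimescaleDB catalog APIs: the DML path appends one row to a per-hypertable log regardless of how many caggs exist, and the refresh later moves those rows into the cagg invalidation log (shown for a single cagg to keep the sketch short).

#include <stdio.h>
#include <stdint.h>

#define MAX_LOG 16

/* Toy invalidation log entry: a hypertable id plus an invalidated range. */
typedef struct
{
    int32_t table_id;
    int64_t start;
    int64_t end;
} LogEntry;

static LogEntry hypertable_log[MAX_LOG];
static int ht_log_len = 0;
static LogEntry cagg_log[MAX_LOG];
static int cagg_log_len = 0;

/* DML path: one extra write per invalidating transaction, independent of N. */
static void
log_dml_invalidation(int32_t raw_ht_id, int64_t start, int64_t end)
{
    hypertable_log[ht_log_len++] = (LogEntry) { raw_ht_id, start, end };
}

/* Refresh path: move the accumulated entries into the cagg invalidation log. */
static void
refresh_move_invalidations(int32_t mat_ht_id)
{
    for (int i = 0; i < ht_log_len; i++)
        cagg_log[cagg_log_len++] =
            (LogEntry) { mat_ht_id, hypertable_log[i].start, hypertable_log[i].end };
    ht_log_len = 0; /* consumed by the refresh */
}

int
main(void)
{
    log_dml_invalidation(1, 1000, 2000); /* e.g. a backfill touching old data */
    refresh_move_invalidations(2);       /* refresh of the single cagg */
    printf("hypertable log rows: %d, cagg log rows: %d\n", ht_log_len, cagg_log_len);
    return 0;
}

The diff above replaces this deferred move with an N-way fan-out directly on the DML path.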
Yes, @fabriziomello and I discussed the potential problem with this.
With the current invalidation approach, we have 2x writes.
Directly writing to the mat invalidation log tables will result in N times write amplification (N = number of caggs defined on the same hypertable) and slow down the regular DML path.
Just to clarify, this change will slow down DML only when it produces invalidation log entries, so it can potentially affect backfills.
@gayyappan I didn't get your point about 2x writes. Currently we write invalidations into the hypertable invalidation log based on the chunks affected. With this implementation, that number of rows will be inserted N times, depending on the number of associated caggs.
There's a tradeoff in moving to this implementation, and the price is potentially slowing down backfill operations. Is this a big problem? I really don't know, and I'd prefer to have some numbers before deciding to move in this direction.
Sorry, my phrasing was not precise. Today the write amplification is 1 extra write per transaction behind the invalidation threshold (i.e. it is not isolated to backfills, e.g. 5-minute caggs on a hypertable with weekly chunks). With this change, the write amplification is N times.
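To make that comparison concrete, a back-of-the-envelope sketch of the extra writes on the DML path: the transaction count and cagg count below are invented, and the per-transaction costs are the 1-vs-N figures quoted in this thread, not measurements.

#include <stdio.h>

int
main(void)
{
    long txns = 1000000; /* transactions behind the invalidation threshold (made up) */
    int n_caggs = 5;     /* caggs defined on the same hypertable (made up) */

    /* Current scheme: one extra row into the hypertable invalidation log per
     * invalidating transaction; the move into the cagg log happens later,
     * inside the refresh, off the DML path. */
    long current_dml_rows = txns * 1;

    /* Proposed scheme: one row per cagg written directly on the DML path. */
    long proposed_dml_rows = (long) n_caggs * txns;

    printf("current:  %ld extra DML-path rows\n", current_dml_rows);
    printf("proposed: %ld extra DML-path rows\n", proposed_dml_rows);
    return 0;
}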