v1.0.0-rc.1
Pre-release
What's Changed
- chore(deps): bump golang.org/x/sync from 0.15.0 to 0.16.0 by @dependabot[bot] in #1160
- feat: Introduce pluggable queue framework by @LukeAVanDrie in #1138
- removed USE_STREAMING env var from conformance + tests by @nirrozenbaum in #1157
- Conformance: Fixes the EPP ConfigMap Namespace by @danehans in #1166
- feat: Introduce pluggable intra-flow dispatch policy framework by @LukeAVanDrie in #1139
- Add support for plugin configuration in the InferencePool helm chart by @ahg-g in #1168
- feat(epp): use kebab-cased flags for epp by @Xunzhuo in #1177
- chore: remove duplicated import for code polish by @Xunzhuo in #1179
- Add documentation for the new Configuration via text feature by @shmuelk in #1110
- fix: set epp image tag when releasing by @Xunzhuo in #1182
- feat: Introduce pluggable inter-flow dispatch policy framework by @LukeAVanDrie in #1167
- Update istio release by @LiorLieberman in #1186
- test: kubectl-validate manifests in presubmit by @chewong in #1083
- Delete the unnecessary Marshal of processRequestBody by @whzghb in #1127
- feat(flowcontrol): Introduce ManagedQueue and Service Contracts by @LukeAVanDrie in #1174
- (feat) initial types and interfaces for pluggable data layer by @elevran in #1154
- Fix a regression in prefix plugin which can cause data race by @liu-cong in #1188
- feat: generate crd with version annotation. by @zetxqx in #1134
- chore: update vllm deployment tag to latest by @Xunzhuo in #1184
- moved build details to version package by @nirrozenbaum in #1185
- Add an "Implementing a Compatible Data Plane" section to the implementers guide by @AndresGuedez in #1143
- feat(flowcontrol): Implement registry shard by @LukeAVanDrie in #1187
- feat(flowcontrol): refine types and consolidate docs by @LukeAVanDrie in #1191
- docs: update to use kebab-cased flags changed at #1177 by @nekomeowww in #1193
- added graceful shutdown when scheduler config is not initialized by @nirrozenbaum in #1198
- feat: move x-k8s to apix and add v1 InferencePool to api/v1 by @capri-xiyue in #1116
- feat: Change epp and conformance to use v1 type InferencePool by @capri-xiyue in #1118
- chore(deps): bump the kubernetes group with 6 updates by @dependabot[bot] in #1200
- Enhanced InferencePool Chart Configurability by @vMaroon in #1211
- refactor(flowcontrol): Enable behavioral mocking by @LukeAVanDrie in #1202
- random endpoint pick on tie break in max score picker by @nirrozenbaum in #1205
- removed cmd/registry file by @nirrozenbaum in #1206
- Support scraping metrics from target running with TLS by @pierDipi in #1190
- gke-gateway v0.5.0 conformance test report 9/9 by @zetxqx in #1005
- added join slack badge to readme by @nirrozenbaum in #1218
- chore: 🔨 Use the v0.3.0 llm-d-inference-sim image tag. by @yafengio in #1140
- style: ✨ optimize import order and more readable. by @yafengio in #1220
- Remove TODO stubs from website by @sats-23 in #1221
- docs: update whole repo to v1 inferencepool by @capri-xiyue in #1213
- release issue template: updated the tag command to include the -s for signing the tag by @nirrozenbaum in #1196
- fix try it out section in quickstart by @nirrozenbaum in #1197
- Do not log potentially sensitive data below DEBUG log level by @pierDipi in #1192
- Update index.md with gateway-inference-extension slack by @LiorLieberman in #1225
- Add fallback logic to support multiple endpoints by @rlakhtakia in #1122
- chore: 🔨 add fmt-imports tool for import order. by @yafengio in #1228
- fix: missing permission to list inference.networking.k8s.io/v1/inferencepool by @nekomeowww in #1230
- fix: Make test iter deterministic to fix flake by @LukeAVanDrie in #1231
- feat(flowcontrol): Implement ShardProcessor engine by @LukeAVanDrie in #1203
- Add a set of configuration defaults by @shmuelk in #1223
- Proposing the successor to the InferenceModel API by @kfswain in #1199
- cleanup of unused fields and functions by @nirrozenbaum in #1233
- chore: update CRD BundleVersion to main-dev by @zetxqx in #1216
- Change String() to accept a value receiver. by @elevran in #1239
- renamed kvcache-scorer to kvcache-utilization-scorer by @nirrozenbaum in #1238
- Add unit tests by @elevran in #1195
- test-report: istio 1.28-alpha v0.4.0 & v0.5.0 report 9/9 by @aslakknutsen in #1102
- added scheduler config logging on bootstrap by @nirrozenbaum in #1247
- fix: updated to v1 inferencepool in manifests by @capri-xiyue in #1248
- chore(deps): bump github.com/onsi/gomega from 1.37.0 to 1.38.0 by @dependabot[bot] in #1253
- chore(deps): bump sigs.k8s.io/yaml from 1.5.0 to 1.6.0 by @dependabot[bot] in #1251
- chore(deps): bump google.golang.org/grpc from 1.73.0 to 1.74.2 by @dependabot[bot] in #1252
- Update the Endpoint Picker Protocol with a new metadata field that communicates status associated with picked endpoints by @AndresGuedez in #1226
- chore(deps): bump sigs.k8s.io/controller-tools from 0.17.3 to 0.18.0 by @dependabot[bot] in #1254
- Update golangci lint to v2.x by @elevran in #1256
- Add nightly benchmarking documentation by @kaushikmitr in #1234
- normalize score to make sure it is always in the range of [0,1] by @nirrozenbaum in #1236
- updated metrics and logging for plugins by @nirrozenbaum in #1235
- fix(flowcontrol): Prevent panic on nil item during shard shutdown by @LukeAVanDrie in #1257
- chore(deps): bump github.com/elastic/crd-ref-docs from 0.1.0 to 0.2.0 by @dependabot[bot] in #1250
- cleanup of config from scheduling package by @nirrozenbaum in #1263
- Add support for multi platform image by @adarshagrawal38 in #1010
- epp: add more error integration test cases by @zhengkezhou1 in #1074
- fix: split EPP RBAC into cluster and namespaced scoped permission by @chewong in #1071
- Renaming InferenceModel to InferenceObjectives by @kfswain in #1255
- cleanup: refactor PodList calls to prepare for making pod metrics staleness configurable by @nayihz in #1046
- refactor(conformance): restructure tests and resources by @zetxqx in #1232
- Refactor(conformance): merge similar helper functions and make the condition check on inferencePool stricter by @zetxqx in #1261
- Update lora affinity to be a scorer. by @rlakhtakia in #1121
- fix image not build issue by @zetxqx in #1286
- Fixes for `make fmt-imports` by @elevran in #1287
- Docs: fixed InferenceObjective in docs by @capri-xiyue in #1284
- adding fairness-id header to be used in flow control by @kfswain in #1282
- update release template to include patch release by @nirrozenbaum in #1270
- Switch to the new default scheduler plugins in integration test by @liu-cong in #1291
- Revert "fix image not build issue" by @danehans in #1295
- Rename lora affinity plugin to lora-affinity-scorer to be consistent with others by @liu-cong in #1297
- revert #1010 to resume new main EPP image build by @zetxqx in #1300
- chore(deps): bump github.com/prometheus/client_golang from 1.22.0 to 1.23.0 by @dependabot[bot] in #1303
- fix(conformance): Reduce flakiness by using service selector modification in `EppUnAvailableFailOpen` by @zetxqx in #1265
- Promote plugin v2 config to be the default by @liu-cong in #1290
- feat: changed to support both v1 and v1a2 ip in EPP by @capri-xiyue in #1277
- Deprecate legacy filters by @liu-cong in #1305
- Updating tabs to spaces by @kfswain in #1311
- Pluggable metrics collection by @elevran in #1237
- Removing concurrency issues from Random Picker by @kfswain in #1314
- Docs: Bumps kgateway Version in Quickstart by @danehans in #1318
- Conformance: Adds Report for kgateway by @danehans in #1317
- Refactor the configuration defaults handling code by @shmuelk in #1294
- Select InferenceObjective by header by @kfswain in #1307
- refactor: 👷 clean unused randomGenerator parameter. by @yafengio in #1322
- Docs: Updates kgateway in Implementations Guide by @danehans in #1325
- remove protocol specifics from cmd-line flags by @nirrozenbaum in #1296
- Adds envoy-ai-gateway conformance report by @Xunzhuo in #1320
- Filter inference objectives based on inference pool group by @nicolexin in #1306
- Correcting title name by @kfswain in #1334
- added plugin state that can be used to share data between different extension point of a plugin by @nirrozenbaum in #1299
- Updating model name rewrite to be done by header key by @kfswain in #1331
- docs: Enable GIE in Istio installation command by @zhengkezhou1 in #1345
- InferenceObjective: Updates `PoolRef` Group Version by @danehans in #1346
- Modifying Criticality; from string, to int by @kfswain in #1348
- explicit return from test after nil check by @elevran in #1350
- fix: change the inferenceobjective to use v1 inferencepool by @capri-xiyue in #1338
- [bug] Fix datalayer Collector test flake by @elevran in #1342
- cleanup: fix typo and delete useless parameter by @nayihz in #1310
- refactor(flowcontrol): Adopt Composite FlowKey as Primary Identifier by @LukeAVanDrie in #1340
- Conformance: Adds InferenceObjective Request Header by @danehans in #1353
- test: enhance plugin state test coverage and readability by @yankay in #1349
- refactor(conformance): Relocate constants to minimize package dependencies by @zetxqx in #1355
- [BBR] perf: optimize model name extraction with selective JSON unmarshaling using struct tags by @pierDipi in #1359
- chore(deps): bump google.golang.org/protobuf from 1.36.6 to 1.36.7 by @dependabot[bot] in #1358
- Rename criticality to priority by @ahg-g in #1363
- Affiliate ready state with leader election with a flag ha-enable-leader-election by @yangligt2 in #1337
- cleanup: simplify endpointpickerconfig by @capri-xiyue in #1324
- Apply shedding upon saturation for priority below 0 by @ahg-g in #1361
- Add agentgateway as implementation by @howardjohn in #1321
- Upgrade the inferencePool selector to a struct from a map. by @zetxqx in #1330
- fix: make extensionRef to be optional by @capri-xiyue in #1365
- fix: make v1a2 remove inline and change conversion logic to match the convention by @capri-xiyue in #1368
- Pluggable data layer: transition `backend/metrics` to use type aliases from `datalayer` package by @elevran in #1351
- docs: match Inference Extension CRDs by @zhengkezhou1 in #1377
- refactor: prevent double logging of NamespacedName across reconcilers by @chewong in #1379
- Enable kubeapilinter for GIE APIs by @rikatz in #1366
- feat: TargetPortNumber int32 to become TargetPorts []Port by @capri-xiyue in #1354
- Update doc on sglang models support. by @ReneeZhuGG in #1369
- feat: added shortname as alias by @capri-xiyue in #1375
- Tooling: Adds PR Template by @danehans in #1385
- docs: add prometheus + grafana deployment guide by @EyalPazz in #1019
- docs: how to debug integration tests by @zhengkezhou1 in #1067
- doc: update the release-quickstart.sh to include the image tag for lora-syncer by @Ruoyu-y in #1080
- Fixes Quickstart Script by @danehans in #1388
- gitignore macOS generated files. by @bexxmodd in #1378
- feat: added env var for pool group by @capri-xiyue in #1328
- updated env var example in helm chart by @nirrozenbaum in #1390
- remove InferenceModel section from EPP protocol by @nirrozenbaum in #1389
- fix: make port number become pointer by @capri-xiyue in #1400
- Fix: Handle empty string healthcheck as readiness healthcheck by @yangligt2 in #1402
- fix: updated listtype for targetports by @capri-xiyue in #1401
- chore(deps): bump the kubernetes group with 6 updates by @dependabot[bot] in #1403
- chore(deps): bump github.com/onsi/ginkgo/v2 from 2.23.4 to 2.24.0 by @dependabot[bot] in #1404
- deprecate post cycle from scheduling framework by @nirrozenbaum in #1392
- Conformance: Updates InferencePoolInvalidEPPService Test Manifest by @danehans in #1417
- Conformance: Fixes Secondary InferencePool Selector by @danehans in #1419
- Fix Makefile syntax and version consistency by @ErikJiang in #1413
- Enable pluggable datalayer as experimental feature by @elevran in #1391
- fix(cleanup): change the naming to be endpointpickerref by @capri-xiyue in #1420
- fix api-ref-docs make target and generate result by @kfswain in #1422
- trace logging for scores per pod by @nirrozenbaum in #1395
- remove env vars for cmd-line args by @nirrozenbaum in #1397
- Ensure EPP flags are configurable via Helm chart by @rahulgurnani in #1302
- feat(flowcontrol): Implement the FlowRegistry by @LukeAVanDrie in #1319
- fix(cleanup): change spec doc by @capri-xiyue in #1434
- Cleanup helm flags which have default values from values yaml by @rahulgurnani in #1429
- Fix epp startup error due to missing plugin config file flag by @liu-cong in #1439
- Update inference gateway public docs to use helm charts by @rahulgurnani in #1370
- Fix typo. by @zetxqx in #1437
- Final bits of v1.0 API cleanup by @robscott in #1441
- remove wg serving leads from owners file by @nirrozenbaum in #1445
- fix: first hash of prefix cache with same model name by @livelxw in #1341
- cleanup: final clean up of pointer by @capri-xiyue in #1444
- adding mutex and contention profile gathering by @kfswain in #1448
- remove empty condition list when doing v1 and v1alpha2 conversion. by @zetxqx in #1447
- Updates InferencePool API Conversion Code by @danehans in #1451
New Contributors
- @whzghb made their first contribution in #1127
- @AndresGuedez made their first contribution in #1143
- @nekomeowww made their first contribution in #1193
- @vMaroon made their first contribution in #1211
- @pierDipi made their first contribution in #1190
- @sats-23 made their first contribution in #1221
- @yangligt2 made their first contribution in #1337
- @rikatz made their first contribution in #1366
- @ReneeZhuGG made their first contribution in #1369
- @Ruoyu-y made their first contribution in #1080
- @bexxmodd made their first contribution in #1378
- @ErikJiang made their first contribution in #1413
- @rahulgurnani made their first contribution in #1302
- @livelxw made their first contribution in #1341
Full Changelog: v0.5.1...v1.0.0-rc.1