Description
Search before reporting
- I searched in the issues and found nothing similar.
Motivation
The Apache Pulsar KinesisSink utilises the KinesisProducer from the Amazon Kinesis Producer Library (KPL). The KPL provides a comprehensive set of configuration parameters through its KinesisProducerConfiguration class.
In the current KinesisSink implementation, most of these parameters, such as collectionMaxCount, collectionMaxSize, connectTimeout, maxConnections, and minConnections, are not configurable. This restricts users' ability to optimise the sink's performance or to meet network and operational requirements, potentially affecting throughput, latency, and resource utilisation.
Solution
We propose the following improvements to KinesisSinkConfig:
- Expose Key KPL Parameters Directly: Incorporate a select group of critical KPL configuration parameters (e.g., collectionMaxCount, collectionMaxSize, maxConnections, minConnections) directly into the KinesisSink configuration. This allows users to easily adjust these essential settings without additional complexity.
- Introduce an Optional Hashmap for Additional Configuration: Add an optional parameter, extraKinesisProducerConfiguration, which accepts a hashmap of key-value pairs corresponding to any other KPL configuration parameters not directly exposed. This ensures users retain full control over the KinesisProducer's behavior without overloading the primary sink configuration interface.
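To make the two proposals concrete, here is a minimal sketch of how the directly exposed fields and the extraKinesisProducerConfiguration map might be merged into a single set of producer settings. The class name KinesisSinkConfigSketch, the method toProducerProperties, the use of java.util.Properties as the merge target, and the rule that directly exposed fields win over the map on conflict are all assumptions made for illustration; none of them is taken from the existing KinesisSink code, and the step that hands the result to the KPL is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of the proposed fields; this is NOT the actual Pulsar KinesisSinkConfig.
class KinesisSinkConfigSketch {
    // Directly exposed KPL parameters. null means "not set, keep the KPL default".
    Long collectionMaxCount;
    Long collectionMaxSize;
    Long maxConnections;
    Long minConnections;

    // Optional pass-through for any KPL parameter that is not directly exposed.
    Map<String, String> extraKinesisProducerConfiguration = new LinkedHashMap<>();

    // Merge both sources into one Properties object (assumed merge target).
    Properties toProducerProperties() {
        Properties props = new Properties();
        // Extras are copied first so that directly exposed fields win on conflict
        // (an assumed precedence rule, not something the proposal specifies).
        if (extraKinesisProducerConfiguration != null) {
            extraKinesisProducerConfiguration.forEach(props::setProperty);
        }
        putIfSet(props, "collectionMaxCount", collectionMaxCount);
        putIfSet(props, "collectionMaxSize", collectionMaxSize);
        putIfSet(props, "maxConnections", maxConnections);
        putIfSet(props, "minConnections", minConnections);
        return props;
    }

    // Only write a key when the user actually provided a value.
    private static void putIfSet(Properties props, String key, Long value) {
        if (value != null) {
            props.setProperty(key, Long.toString(value));
        }
    }

    public static void main(String[] args) {
        KinesisSinkConfigSketch config = new KinesisSinkConfigSketch();
        config.maxConnections = 48L;
        config.extraKinesisProducerConfiguration.put("connectTimeout", "3000");
        // Prints the merged settings; iteration order of Properties is not guaranteed.
        System.out.println(config.toProducerProperties());
    }
}
```

Keeping the exposed fields nullable means an unset field never overrides a KPL default, while the map covers everything else. Whether the keys in the map should match the KPL's own property names exactly, and how unknown keys should be rejected, would still need to be decided in the actual PR.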
Alternatives
No response
Anything else?
No response
Are you willing to submit a PR?
- I'm willing to submit a PR!