diff --git a/docs/index.asciidoc b/docs/index.asciidoc index 9d6d5bc..063ef74 100644 --- a/docs/index.asciidoc +++ b/docs/index.asciidoc @@ -20,9 +20,12 @@ include::{include_path}/plugin_header.asciidoc[] ==== Description -This output will send events to a Redis queue using RPUSH. -The RPUSH command is supported in Redis v0.0.7+. Using -PUBLISH to a channel requires at least v1.3.8+. +This output will send events to a Redis queue using RPUSH/ZADD/PUBLISH. + +The RPUSH command is supported in Redis v0.0.7+. +Using ZADD is supported in Redis v1.2.0+. +Using PUBLISH to a channel requires at least v1.3.8+. + While you may be able to make these Redis versions work, the best performance and stability will be found in more recent stable versions. Versions 2.6.0+ are recommended. @@ -43,10 +46,12 @@ This plugin supports the following configuration options plus the <> |<>|No | <> |<>|No | <> |<>|No -| <> |<>, one of `["list", "channel"]`|No +| <> |<>, one of `["list", "sortedset", "channel"]`|No | <> |<>|No | <> |<>|No | <> |<>|No +| <> |<>|No +| <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No @@ -60,38 +65,38 @@ output plugins.   [id="plugins-{type}s-{plugin}-batch"] -===== `batch` +===== `batch` * Value type is <> * Default value is `false` -Set to true if you want Redis to batch up values and send 1 RPUSH command -instead of one command per value to push on the list. Note that this only -works with `data_type="list"` mode right now. +Set to true if you want Redis to batch up values and send 1 RPUSH or 1 ZADD command +instead of one command per value to push on the list or set. Note that this only +works with `data_type="list"` and `data_type="sortedset"` mode right now. -If true, we send an RPUSH every "batch_events" events or +If true, we send an RPUSH or ZADD every "batch_events" events or "batch_timeout" seconds (whichever comes first). -Only supported for `data_type` is "list". +Only supported for `data_type` "list" or "sortedset". 
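For orientation, here is a sketch of a pipeline that enables batching with the new sorted-set type (the key name is illustrative, not a plugin default):

```
output {
  redis {
    host          => ["127.0.0.1"]
    key           => "logstash-scored"
    data_type     => "sortedset"
    batch         => true
    batch_events  => 50   # one ZADD per 50 queued events...
    batch_timeout => 5    # ...or every 5 seconds, whichever comes first
  }
}
```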
[id="plugins-{type}s-{plugin}-batch_events"] -===== `batch_events` +===== `batch_events` * Value type is <> * Default value is `50` -If batch is set to true, the number of events we queue up for an RPUSH. +If batch is set to true, the number of events we queue up for an RPUSH or ZADD. [id="plugins-{type}s-{plugin}-batch_timeout"] -===== `batch_timeout` +===== `batch_timeout` * Value type is <> * Default value is `5` -If batch is set to true, the maximum amount of time between RPUSH commands +If batch is set to true, the maximum amount of time between RPUSH or ZADD commands when there are pending events to flush. [id="plugins-{type}s-{plugin}-congestion_interval"] -===== `congestion_interval` +===== `congestion_interval` * Value type is <> * Default value is `1` @@ -100,30 +105,32 @@ How often to check for congestion. Default is one second. Zero means to check on every event. [id="plugins-{type}s-{plugin}-congestion_threshold"] -===== `congestion_threshold` +===== `congestion_threshold` * Value type is <> * Default value is `0` -In case Redis `data_type` is `list` and has more than `@congestion_threshold` items, +In case Redis `data_type` is `list` or `sortedset` and has more than `@congestion_threshold` items, block until someone consumes them and reduces congestion, otherwise if there are no consumers Redis will run out of memory, unless it was configured with OOM protection. But even with OOM protection, a single Redis list can block all other users of Redis, until Redis CPU consumption reaches the max allowed RAM size. A default value of 0 means that this limit is disabled. -Only supported for `list` Redis `data_type`. +Only supported for `list` and `sortedset` Redis `data_type`. [id="plugins-{type}s-{plugin}-data_type"] -===== `data_type` +===== `data_type` - * Value can be any of: `list`, `channel` + * Value can be any of: `list`, `sortedset`, `channel` * There is no default value for this setting. Either list or channel. 
If `redis_type` is list, then we will set RPUSH to key. If `redis_type` is channel, then we will PUBLISH to `key`. +If `redis_type` is sortedset, then we will ZADD to `key` with the score set +to the content of `priority_field`. [id="plugins-{type}s-{plugin}-db"] -===== `db` +===== `db` * Value type is <> * Default value is `0` @@ -131,7 +138,7 @@ RPUSH to key. If `redis_type` is channel, then we will PUBLISH to `key`. The Redis database number. [id="plugins-{type}s-{plugin}-host"] -===== `host` +===== `host` * Value type is <> * Default value is `["127.0.0.1"]` @@ -148,14 +155,33 @@ For example: ["127.0.0.1:6380", "127.0.0.1"] [id="plugins-{type}s-{plugin}-key"] -===== `key` +===== `key` * Value type is <> * There is no default value for this setting. -The name of a Redis list or channel. Dynamic names are +The name of a Redis list, sortedset or channel. Dynamic names are valid here, for example `logstash-%{type}`. +[id="plugins-{type}s-{plugin}-priority_field"] +===== `priority_field` + + * Value type is <> + * Default value is `epoch` + +Priority field to use for data_type `sortedset`. If the field doesn't exist, the priority will be `priority_default`. +The score values should be the string representation of a double precision floating point number; `+inf` and `-inf` are valid values as well (see https://redis.io/commands/zadd). + +[id="plugins-{type}s-{plugin}-priority_default"] +===== `priority_default` + + * Value type is <> + * Default value is `-1` + +Default priority for data_type `sortedset` when the priority field is not found in the event. +The score values should be the string representation of a double precision floating point number; `+inf` and `-inf` are valid values as well (see https://redis.io/commands/zadd). + + [id="plugins-{type}s-{plugin}-name"] ===== `name` (DEPRECATED) @@ -166,7 +192,7 @@ valid here, for example `logstash-%{type}`. Name is used for logging in case there are multiple instances. 
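To illustrate the two new settings, a hypothetical output block (the `app_priority` field name is an assumption for this example, not a plugin default):

```
output {
  redis {
    host             => ["127.0.0.1"]
    key              => "logstash-scored"
    data_type        => "sortedset"
    priority_field   => "app_priority"  # score is read from this event field
    priority_default => -1              # used when the field is missing or invalid
  }
}
```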
[id="plugins-{type}s-{plugin}-password"] -===== `password` +===== `password` * Value type is <> * There is no default value for this setting. @@ -174,7 +200,7 @@ Name is used for logging in case there are multiple instances. Password to authenticate with. There is no authentication by default. [id="plugins-{type}s-{plugin}-port"] -===== `port` +===== `port` * Value type is <> * Default value is `6379` @@ -192,7 +218,7 @@ The name of the Redis queue (we'll use RPUSH on this). Dynamic names are valid here, for example `logstash-%{type}` [id="plugins-{type}s-{plugin}-reconnect_interval"] -===== `reconnect_interval` +===== `reconnect_interval` * Value type is <> * Default value is `1` @@ -200,7 +226,7 @@ valid here, for example `logstash-%{type}` Interval for reconnecting to failed Redis connections [id="plugins-{type}s-{plugin}-shuffle_hosts"] -===== `shuffle_hosts` +===== `shuffle_hosts` * Value type is <> * Default value is `true` @@ -208,7 +234,7 @@ Interval for reconnecting to failed Redis connections Shuffle the host list during Logstash startup. [id="plugins-{type}s-{plugin}-timeout"] -===== `timeout` +===== `timeout` * Value type is <> * Default value is `5` diff --git a/lib/logstash/outputs/redis.rb b/lib/logstash/outputs/redis.rb index aa289a8..2514f6f 100644 --- a/lib/logstash/outputs/redis.rb +++ b/lib/logstash/outputs/redis.rb @@ -3,13 +3,16 @@ require "logstash/namespace" require "stud/buffer" -# This output will send events to a Redis queue using RPUSH. -# The RPUSH command is supported in Redis v0.0.7+. Using -# PUBLISH to a channel requires at least v1.3.8+. +# This output will send events to a Redis queue using RPUSH/ZADD/PUBLISH. + +# The RPUSH command is supported in Redis v0.0.7+. +# Using ZADD is supported in Redis v1.2.0+. +# Using PUBLISH to a channel requires at least v1.3.8+. + # While you may be able to make these Redis versions work, # the best performance and stability will be found in more # recent stable versions. 
Versions 2.6.0+ are recommended. -# + # For more information, see http://redis.io/[the Redis homepage] # class LogStash::Outputs::Redis < LogStash::Outputs::Base @@ -51,7 +54,7 @@ class LogStash::Outputs::Redis < LogStash::Outputs::Base # Password to authenticate with. There is no authentication by default. config :password, :validate => :password - # The name of the Redis queue (we'll use RPUSH on this). Dynamic names are + # The name of the Redis queue/sortedset (using the RPUSH/ZADD commands). Dynamic names are # valid here, for example `logstash-%{type}` config :queue, :validate => :string, :deprecated => true @@ -61,40 +64,50 @@ class LogStash::Outputs::Redis < LogStash::Outputs::Base # Either list or channel. If `redis_type` is list, then we will set # RPUSH to key. If `redis_type` is channel, then we will PUBLISH to `key`. - config :data_type, :validate => [ "list", "channel" ], :required => false + # If `redis_type` is sortedset, then we will ZADD to `key` with the score set + # to the content of `priority_field`. + config :data_type, :validate => [ "list", "channel", "sortedset" ], :required => false - # Set to true if you want Redis to batch up values and send 1 RPUSH command - # instead of one command per value to push on the list. Note that this only - # works with `data_type="list"` mode right now. + # Set to true if you want Redis to batch up values and send 1 RPUSH or 1 ZADD command + # instead of one command per value to push on the list or set. Note that this only + # works with `data_type="list"` and `data_type="sortedset"` modes right now. # - # If true, we send an RPUSH every "batch_events" events or + # If true, we send an RPUSH or ZADD every "batch_events" events or # "batch_timeout" seconds (whichever comes first). - # Only supported for `data_type` is "list". + # Only supported for `data_type` "list" or "sortedset". config :batch, :validate => :boolean, :default => false - # If batch is set to true, the number of events we queue up for an RPUSH. 
+ # If batch is set to true, the number of events we queue up for an RPUSH or ZADD. config :batch_events, :validate => :number, :default => 50 - # If batch is set to true, the maximum amount of time between RPUSH commands + # If batch is set to true, the maximum amount of time between RPUSH or ZADD commands # when there are pending events to flush. config :batch_timeout, :validate => :number, :default => 5 # Interval for reconnecting to failed Redis connections config :reconnect_interval, :validate => :number, :default => 1 - # In case Redis `data_type` is `list` and has more than `@congestion_threshold` items, + # In case Redis `data_type` is `list` or `sortedset` and has more than `@congestion_threshold` items, # block until someone consumes them and reduces congestion, otherwise if there are # no consumers Redis will run out of memory, unless it was configured with OOM protection. # But even with OOM protection, a single Redis list can block all other users of Redis, # until Redis CPU consumption reaches the max allowed RAM size. # A default value of 0 means that this limit is disabled. - # Only supported for `list` Redis `data_type`. + # Only supported for `list` and `sortedset` Redis `data_type`. config :congestion_threshold, :validate => :number, :default => 0 # How often to check for congestion. Default is one second. # Zero means to check on every event. config :congestion_interval, :validate => :number, :default => 1 + # Priority field to use for data_type `sortedset`. If the field doesn't exist, the priority will be `priority_default`. + # The score values should be the string representation of a double precision floating point number. `+inf` and `-inf` are valid values as well. 
(see https://redis.io/commands/zadd) + config :priority_field, :validate => :string, :default => "epoch" + + # Default priority for data_type `sortedset` when the priority field is not found in the event. + # The score values should be the string representation of a double precision floating point number. `+inf` and `-inf` are valid values as well. (see https://redis.io/commands/zadd) + config :priority_default, :validate => :number, :default => -1 + def register require 'redis' @@ -118,7 +131,7 @@ def register if @batch - if @data_type != "list" + if @data_type != "list" and @data_type != "sortedset" raise RuntimeError.new( "batch is not supported with data_type #{@data_type}" ) @@ -176,8 +189,13 @@ def flush(events, key, close=false) # we should not block due to congestion on close # to support this Stud::Buffer#buffer_flush should pass here the :final boolean value. congestion_check(key) unless close - @redis.rpush(key, events) + if @data_type == 'sortedset' then + @redis.zadd(key, events.map{ |event| [priorize(event), event] }) + else + @redis.rpush(key, events) + end end + # called from Stud::Buffer#buffer_flush when an error occurs def on_flush_error(e) @logger.warn("Failed to send backlog of events to Redis", @@ -222,6 +240,30 @@ def connect Redis.new(params) end # def connect + private + def priorize(event) + if event.is_a?(String) then + begin + @codec.decode(event) do |event_decoded| + event = event_decoded + end + rescue => e # parse or event creation error + @logger.warn("Default priority [#{@priority_default}] used, can't decode event [#{event}]: #{e.message}") + return @priority_default.to_s + end + end + + priority_value = event.get(@priority_field) + + if priority_value.nil? 
|| priority_value.to_s !~ /\A[-+]?([0-9]+(\.[0-9]+)?|inf)\z/ then + @logger.debug("Default priority [#{@priority_default}] used, field [#{@priority_field}] doesn't exist or doesn't contain a valid score") + priority_value = @priority_default + end + + return priority_value.to_s + end + # A string used to identify a Redis instance in log messages def identity @name || "redis://#{@password}@#{@current_host}:#{@current_port}/#{@db} #{@data_type}:#{@key}" @@ -231,7 +273,7 @@ def send_to_redis(event, payload) # How can I do this sort of thing with codecs? key = event.sprintf(@key) - if @batch && @data_type == 'list' # Don't use batched method for pubsub. + if @batch && (@data_type == 'list' || @data_type == 'sortedset') # Don't use batched method for pubsub. # Stud::Buffer buffer_receive(payload, key) return @@ -242,6 +284,9 @@ if @data_type == 'list' congestion_check(key) @redis.rpush(key, payload) + elsif @data_type == 'sortedset' + congestion_check(key) + @redis.zadd(key, priorize(event), payload) else @redis.publish(key, payload) end diff --git a/spec/integration/outputs/redis_spec.rb b/spec/integration/outputs/redis_spec.rb index 4d531f7..525200e 100644 --- a/spec/integration/outputs/redis_spec.rb +++ b/spec/integration/outputs/redis_spec.rb @@ -3,14 +3,15 @@ require "logstash/json" require "redis" require "flores/random" +require 'securerandom' describe LogStash::Outputs::Redis do context "integration tests", :integration => true do shared_examples_for "writing to redis list" do |extra_config| - let(:key) { 10.times.collect { rand(10).to_s }.join("") } + let(:key) { SecureRandom.hex } let(:event_count) { Flores::Random.integer(0..10000) } - let(:message) { Flores::Random.text(0..100) } + let(:message) { SecureRandom.hex } # We use hex generation to avoid escaping issues on Windows let(:default_config) { { "key" => key, @@ -72,6 +73,96 @@ end end end + + + shared_examples_for "writing to redis sortedset" do |extra_config| + 
let(:key) { SecureRandom.hex } + let(:event_count) { Flores::Random.integer(12..1000) } # Minimum 12 to test two-digit cases + let(:default_config) { + { + "key" => key, + "data_type" => "sortedset", + "host" => "localhost", + "priority_field" => "epoch" + } + } + let(:redis_config) { + default_config.merge(extra_config || {}) + } + let(:redis_output) { described_class.new(redis_config) } + + before do + redis = Redis.new(:host => "127.0.0.1") + insist { redis.zcard(key) } == 0 + redis.close() + + redis_output.register + + event_count_1 = event_count / 2 + event_count_2 = event_count - event_count_1 + + # Add half of the events in ascending order + event_count_1.times do |i| + event = LogStash::Event.new("message" => { "i" => i }, "epoch" => i ) + redis_output.receive(event) + end + # and add the other half in descending order to verify that events get sorted + event_count_2.times do |j| + i = event_count - j - 1 + event = LogStash::Event.new("message" => { "i" => i }, "epoch" => i ) + redis_output.receive(event) + end + + redis_output.close + end + + it "should successfully send all events to redis" do + redis = Redis.new(:host => "127.0.0.1") + + # The sorted set should contain the number of elements our agent pushed up. + insist { redis.zcard(key) } == event_count + + # Now check all events for order and correctness. 
+ event_count.times do |i| + # zrange returns members in ascending score order + item = redis.zrange(key, i, i).first + event = LogStash::Event.new(LogStash::Json.load(item)) + insist { event.get("[message][i]") } == i + insist { event.get("[epoch]") } == i + end + end + + after do + # Clear the sorted set after each example + redis = Redis.new(:host => "127.0.0.1") + + redis.zremrangebyrank(key, 0, -1) + # The sorted set should now be empty + insist { redis.zcard(key) } == 0 + end + end + + + context "when batch_mode is false" do + include_examples "writing to redis sortedset" + end + + context "when batch_mode is true" do + batch_events = Flores::Random.integer(1..1000) + batch_settings = { + "batch" => true, + "batch_events" => batch_events + } + + include_examples "writing to redis sortedset", batch_settings do + + # A canary to make sure we're actually enabling batch mode + # in this shared example. + it "should have batch mode enabled" do + expect(redis_config).to include("batch") + expect(redis_config["batch"]).to be_truthy + end + end + end end end
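Stepping outside the diff: the two behaviors this patch adds, per-event score selection and the batched single-ZADD flush, can be sketched in plain Ruby with no Redis server. `score_for` is an illustrative stand-in for the patch's `priorize` helper, not the plugin's API; the `epoch` field matches the `priority_field` default.

```ruby
require "json"

# Stand-in for the score-selection rule: read the score from an event field,
# falling back to a default when the field is missing or not a valid ZADD
# score ("+inf"/"-inf" are allowed by ZADD). A plain Hash stands in for
# LogStash::Event here.
SCORE_PATTERN = /\A[-+]?([0-9]+(\.[0-9]+)?|inf)\z/

def score_for(event, priority_field, priority_default)
  value = event[priority_field]
  return priority_default.to_s if value.nil? || value.to_s !~ SCORE_PATTERN
  value.to_s
end

# Shape of the batched flush: one ZADD carrying [score, member] pairs,
# i.e. the equivalent of @redis.zadd(key, events.map { |e| [score, e] }).
events = [
  '{"epoch": 30, "message": "c"}',
  '{"epoch": 10, "message": "a"}',
  '{"epoch": 20, "message": "b"}',
]
pairs = events.map { |e| [score_for(JSON.parse(e), "epoch", -1).to_f, e] }

# Redis keeps members ordered by ascending score, so ZRANGE key 0 -1
# would return the members as a, b, c:
ordered = pairs.sort_by(&:first).map { |(_, e)| JSON.parse(e)["message"] }
puts ordered.inspect  # => ["a", "b", "c"]
```

Sorting the pairs by score mirrors what the spec's `zrange(key, i, i)` checks verify against a real server.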