
Conversation

@kportertx (Contributor)
No description provided.

@dwelch-spike (Contributor) left a comment

This LGTM, thanks for the PR. @kportertx could you re-run the failing test action? It isn't giving me the option. The TPS limit tests that failed are known to fail intermittently, but I'd like to see them pass.

EDIT: never mind; I triggered another run with some usage text changes.

@dwelch-spike (Contributor)

Question about performance tuning: if each event loop creates new transactions, then `--event-loops` is the thread count to tune in async mode, correct? It's essentially the `--threads` of async mode. I wonder how we can make this as straightforward as possible for users. I'm thinking we just add some text to the `--threads` help that says to use `--event-loops` when running in async mode.

@kportertx (Contributor, Author)

Maybe drop the --event-loops config and use --threads to determine number of event loops? The help for --threads could explain that it is the number of event-loops when running async.

@dwelch-spike (Contributor)

> Maybe drop the --event-loops config and use --threads to determine number of event loops? The help for --threads could explain that it is the number of event-loops when running async.

I like that idea, I'll run it past @hev.

@dwelch-spike (Contributor)

@kportertx I think the async throttle tests are failing because `dyn_throttle_pause_for` is now being called by multiple threads (worker and event loop), leading to race conditions when throttling modifies `tdata.dyn_throttle`. If we move to this thread model then I think we need a lock around the code in `dyn_throttle_pause_for()`.

@dwelch-spike (Contributor)

> @kportertx I think the async throttle tests are failing because `dyn_throttle_pause_for` is now being called by multiple threads (worker and event loop), leading to race conditions when throttling modifies `tdata.dyn_throttle`. If we move to this thread model then I think we need a lock around the code in `dyn_throttle_pause_for()`.

I'll implement the lock and let you know so you can take a look

@dwelch-spike (Contributor)

@kportertx Allowing the event loop threads to create transactions causes some issues with the `--throughput` throttling functionality, and potentially other areas. It was assumed that the thread data struct `tdata_t` and dynamic throttle struct `dyn_throttle_t` would only ever be modified from the same thread. Since `dyn_throttle_pause_for` is now called by multiple threads, I think race conditions can occur.

Throttling used to take place in the one thread putting transactions onto the event loop. Now that transactions are added and throttled in the event loop threads, throughput isn't being throttled correctly. I think one reason is that many transactions are started by an event loop thread before it reaches a callback where it can do any throttling.

Since higher async TPS is possible by running multiple asbench processes, we are treating this as low priority. I'll come back to this when I can, but I don't have the time to look into fixes for this change and `--throughput` right now.

Observations:
- Running with a low `--async-max-commands` makes throttling more accurate.
- Low TPS limits cause timeouts.

dwelch@dwelch-MBP-XQ05R aerospike-benchmark % ./target/asbench --workload RU,0.0001 --duration 15 --start-key 0 --keys 1000000000 -o I --throughput 10 -z 1 --async
hosts:                  127.0.0.1
port:                   3000
user:                   (null)
services-alternate:     false
max error rate:         0
tender interval:        1000 ms
error rate window:      1
max socket idle:        55 secs
namespace:              test
set:                    testset
start-key:              0
keys/records:           1000000000
object spec:            I4
- duration: 15
  desc: default config (specify your own with --workload-stages)
  tps: 10
  key-start: 0
  key-end: 1000000000
  pause: 0
  batch-size: 1
  async: true
  random: false
  ttl: 0
  workload: RU,0.0001%,100%,100%
  stage: 1
  object-spec: I4
  read-bins: (null)
  write-bins: (null)
threads:                1
enable compression:     false
compression ratio:      1.000000
connect timeout:        1000 ms
read socket timeout:    30000 ms
write socket timeout:   30000 ms
read total timeout:     1000 ms
write total timeout:    1000 ms
max retries:            1
sleep between retries:  0 ms
debug:                  false
latency:                false
latency histogram:      false
cumulative HDR hist:    false
shared memory:          false
send-key:               false
read replica:           sequence
read mode AP:           one
read mode SC:           session
write commit level:     all
min conns per node:       0
max conns per node:       300
conn pools per node:      1
async min conns per node: 0
async max conns per node: 300
async max commands:       50
event loops:              1
auth mode:              INTERNAL
2023-07-10 11:15:14.803 INFO Add node BB9030011AC4202 127.0.0.1:3000
2023-07-10 11:15:14.816 INFO Start 1 transaction threads
Stage 1: default config (specify your own with --workload-stages)
2023-07-10 11:15:15.816 INFO write(tps=298 (hit=298 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=298 (hit=298 miss=0) timeouts=0 errors=0)
2023-07-10 11:15:16.816 INFO write(tps=127 (hit=127 miss=0) timeouts=1 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=127 (hit=127 miss=0) timeouts=1 errors=0)
2023-07-10 11:15:17.816 INFO write(tps=97 (hit=97 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=97 (hit=97 miss=0) timeouts=0 errors=0)
2023-07-10 11:15:18.819 INFO write(tps=48 (hit=48 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=48 (hit=48 miss=0) timeouts=0 errors=0)
2023-07-10 11:15:19.836 INFO write(tps=13 (hit=13 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=13 (hit=13 miss=0) timeouts=0 errors=0)
2023-07-10 11:15:20.842 INFO write(tps=2 (hit=2 miss=0) timeouts=11 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=2 (hit=2 miss=0) timeouts=11 errors=0)
2023-07-10 11:15:21.837 INFO write(tps=0 (hit=0 miss=0) timeouts=11 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=11 errors=0)
2023-07-10 11:15:22.832 INFO write(tps=0 (hit=0 miss=0) timeouts=11 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=11 errors=0)
2023-07-10 11:15:23.832 INFO write(tps=0 (hit=0 miss=0) timeouts=10 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=10 errors=0)
2023-07-10 11:15:24.830 INFO write(tps=0 (hit=0 miss=0) timeouts=10 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=10 errors=0)
2023-07-10 11:15:25.833 INFO write(tps=0 (hit=0 miss=0) timeouts=11 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=11 errors=0)
2023-07-10 11:15:26.851 INFO write(tps=0 (hit=0 miss=0) timeouts=9 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=9 errors=0)
2023-07-10 11:15:27.846 INFO write(tps=0 (hit=0 miss=0) timeouts=10 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=10 errors=0)
2023-07-10 11:15:28.867 INFO write(tps=0 (hit=0 miss=0) timeouts=11 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=11 errors=0)
2023-07-10 11:15:29.821 INFO write(tps=0 (hit=0 miss=0) timeouts=9 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=0 (hit=0 miss=0) timeouts=9 errors=0)
dwelch@dwelch-MBP-XQ05R aerospike-benchmark % 

Higher limits still get low performance (this command normally gets around 3k TPS on my setup):

dwelch@dwelch-MBP-XQ05R aerospike-benchmark % ./target/asbench --workload RU,0.0001 --duration 15 --start-key 0 --keys 1000000000 -o I -z 1 --event-loops 3 --throughput 1000
hosts:                  127.0.0.1
port:                   3000
user:                   (null)
services-alternate:     false
max error rate:         0
tender interval:        1000 ms
error rate window:      1
max socket idle:        55 secs
namespace:              test
set:                    testset
start-key:              0
keys/records:           1000000000
object spec:            I4
- duration: 15
  desc: default config (specify your own with --workload-stages)
  tps: 1000
  key-start: 0
  key-end: 1000000000
  pause: 0
  batch-size: 1
  async: false
  random: false
  ttl: 0
  workload: RU,0.0001%,100%,100%
  stage: 1
  object-spec: I4
  read-bins: (null)
  write-bins: (null)
threads:                1
enable compression:     false
compression ratio:      1.000000
connect timeout:        1000 ms
read socket timeout:    30000 ms
write socket timeout:   30000 ms
read total timeout:     1000 ms
write total timeout:    1000 ms
max retries:            1
sleep between retries:  0 ms
debug:                  false
latency:                false
latency histogram:      false
cumulative HDR hist:    false
shared memory:          false
send-key:               false
read replica:           sequence
read mode AP:           one
read mode SC:           session
write commit level:     all
min conns per node:       0
max conns per node:       300
conn pools per node:      1
async min conns per node: 0
async max conns per node: 300
async max commands:       50
event loops:              3
auth mode:              INTERNAL
2023-07-10 11:20:50.497 INFO Add node BB9030011AC4202 127.0.0.1:3000
2023-07-10 11:20:50.507 INFO Start 1 transaction threads
Stage 1: default config (specify your own with --workload-stages)
2023-07-10 11:20:51.507 INFO write(tps=44 (hit=44 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=44 (hit=44 miss=0) timeouts=0 errors=0)
2023-07-10 11:20:52.507 INFO write(tps=38 (hit=38 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=38 (hit=38 miss=0) timeouts=0 errors=0)
2023-07-10 11:20:53.507 INFO write(tps=48 (hit=48 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=48 (hit=48 miss=0) timeouts=0 errors=0)
2023-07-10 11:20:54.507 INFO write(tps=45 (hit=45 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=45 (hit=45 miss=0) timeouts=0 errors=0)
2023-07-10 11:20:55.507 INFO write(tps=40 (hit=40 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=40 (hit=40 miss=0) timeouts=0 errors=0)
2023-07-10 11:20:56.507 INFO write(tps=30 (hit=30 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=30 (hit=30 miss=0) timeouts=0 errors=0)
2023-07-10 11:20:57.507 INFO write(tps=58 (hit=58 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=58 (hit=58 miss=0) timeouts=0 errors=0)
2023-07-10 11:20:58.507 INFO write(tps=31 (hit=31 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=31 (hit=31 miss=0) timeouts=0 errors=0)
2023-07-10 11:20:59.507 INFO write(tps=32 (hit=32 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=32 (hit=32 miss=0) timeouts=0 errors=0)
2023-07-10 11:21:00.507 INFO write(tps=36 (hit=36 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=36 (hit=36 miss=0) timeouts=0 errors=0)
2023-07-10 11:21:01.507 INFO write(tps=49 (hit=49 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=49 (hit=49 miss=0) timeouts=0 errors=0)
2023-07-10 11:21:02.507 INFO write(tps=57 (hit=57 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=57 (hit=57 miss=0) timeouts=0 errors=0)
2023-07-10 11:21:03.507 INFO write(tps=118 (hit=118 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=118 (hit=118 miss=0) timeouts=0 errors=0)
2023-07-10 11:21:04.507 INFO write(tps=54 (hit=54 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=54 (hit=54 miss=0) timeouts=0 errors=0)
2023-07-10 11:21:05.507 INFO write(tps=49 (hit=49 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=49 (hit=49 miss=0) timeouts=0 errors=0)
2023-07-10 11:21:05.512 INFO write(tps=202 (hit=202 miss=0) timeouts=0 errors=0) read(tps=0 (hit=0 miss=0) timeouts=0 errors=0) total(tps=202 (hit=202 miss=0) timeouts=0 errors=0)

I did try putting a lock around the throttle changes in the test-throttle-lock branch, but it didn't change the throttling behavior; that race may not be a big issue.

@kportertx (Contributor, Author) commented Jul 10, 2023

The Java client's benchmark doesn't implement throttling for async, so I implemented a method a while back; see `git difftool dc2d2eb3b53d45112edf3b31e267ac1a3f648747\^ 02d9e06dda24c69ba4b890af29eed1115273cefb -d` in the Java client repo.

Basically, there is a quota for how many transactions need to be completed within the current interval. Worker threads decrement the quota until it reaches 0 and then stop. A manager thread periodically increases the quota and restarts the stopped threads.

This worked well for the testing that we were doing at the time.

@dwelch-spike (Contributor) commented Jul 11, 2023

Oops, did not mean to close.

@dwelch-spike (Contributor)

> The Java client's benchmark doesn't implement throttling for async, so I implemented a method a while back; see `git difftool dc2d2eb3b53d45112edf3b31e267ac1a3f648747\^ 02d9e06dda24c69ba4b890af29eed1115273cefb -d` in the Java client repo.
>
> Basically, there is a quota for how many transactions need to be completed within the current interval. Worker threads decrement the quota until it reaches 0 and then stop. A manager thread periodically increases the quota and restarts the stopped threads.
>
> This worked well for the testing that we were doing at the time.

Thanks, I will take a look

- Seems to work with `--event-loops`.

FIXME
- Multiple `--event-loops` eventually breach the throttle.
- Noticed that it occasionally freezes at the end.
  - Workload threads have all exited; could be a pre-existing bug.
@dwelch-spike (Contributor) left a comment

Thanks for the PR! Just a few minor questions.

dwelch-spike and others added 3 commits November 20, 2023 15:07
- Add parens to clarify `min_usleep` calculation.
- Remove defunct `target_per_thread` variable and loop.
@dwelch-spike (Contributor) left a comment

The code LGTM but we need the tests to pass. I'll take a closer look at them this week.

@kportertx (Contributor, Author)

@dwelch-spike, looks like the async batch workloads are broken. Unsure when I'll be able to look more closely at this.

@dwelch-spike (Contributor)

> @dwelch-spike, looks like the async batch workloads are broken. Unsure when I'll be able to look more closely at this.

@kportertx I worked on async batches somewhat recently so I'll take a look soon.
