Cassandra Java driver - high latency while extracting data with multiple threads
I'm seeing strange behavior with the DataStax Cassandra driver (3.0). I've created a new Cluster and started a set of threads that all use the same Cluster object. If I keep the number of threads at 1 or 2, I see an average extraction time of 5 ms; if I increase it to 60 threads, the extraction time grows to 200 ms (per single thread). The strange thing is that if I leave the 60-thread app running and start, on the same machine, a process with 1 thread, the extraction time of the single-threaded app is again 5 ms. So it seems related to the client. I've repeated the same tests many times to rule out cache cold-start problems. Here is how the Cluster object is configured:
PoolingOptions poolingOptions = new PoolingOptions();
poolingOptions
    .setConnectionsPerHost(HostDistance.LOCAL, parallelism, parallelism + 20)
    .setConnectionsPerHost(HostDistance.REMOTE, parallelism, parallelism + 20)
    .setMaxRequestsPerConnection(HostDistance.LOCAL, 32768)
    .setMaxRequestsPerConnection(HostDistance.REMOTE, 2000);

this.cluster = Cluster.builder()
    .addContactPoints(nodes)
    .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
    .withReconnectionPolicy(new ConstantReconnectionPolicy(100L))
    .withLoadBalancingPolicy(new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
    .withCompression(Compression.LZ4)
    .withPoolingOptions(poolingOptions)
    .withProtocolVersion(ProtocolVersion.V4)
    .build();
Has anyone experienced the same problem? It looks like a client configuration issue. Maybe some additional Netty configuration is missing?
Update 1: The application is extracting chunks of data using a query like:
SELECT * FROM table WHERE id = ? AND ts >= ? AND ts < ?
So I have 60 threads extracting data in parallel; id is the partition key. Every query is executed by a thread as:
// prepare statement
PreparedStatement stmt = ... // prepared statement, cached
BoundStatement bstmt = stmt.bind(...)

// execute query
long te1 = System.nanoTime();
ResultSet rs = this.session.execute(bstmt);
long te2 = System.nanoTime();

// fetch...
Iterator<Row> iterator = rs.iterator();
while (!rs.isExhausted() && iterator.hasNext()) {
    ....
}
The Session is a single instance shared across all threads. I'm measuring the average time of the session.execute() method call.
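For reference, here is a minimal sketch of this setup (not the real application code; the keyspace name, partition keys, and time bounds are placeholders, and the table is the one from Update 2 below):

import com.datastax.driver.core.*;

import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// One Cluster/Session shared by a fixed pool of worker threads.
ExecutorService pool = Executors.newFixedThreadPool(60);
Session session = cluster.connect("my_keyspace"); // cluster built as shown above

PreparedStatement stmt = session.prepare(
    "SELECT * FROM d_t WHERE id = ? AND ts >= ? AND ts < ?");

final Date from = new Date(System.currentTimeMillis() - 3600_000L);
final Date to = new Date();

for (long id : new long[] {1L, 2L, 3L}) { // example partition keys
    pool.submit(() -> {
        long t1 = System.nanoTime();
        ResultSet rs = session.execute(stmt.bind(id, from, to));
        for (Row row : rs) {
            // consume rows; the driver pages through results transparently
        }
        long t2 = System.nanoTime();
        System.out.printf("id=%d took %.2f ms%n", id, (t2 - t1) / 1e6);
    });
}
pool.shutdown();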
thanks!
Update 2: Here is the schema definition:
CREATE TABLE d_t (
    id bigint,
    xid bigint,
    ts timestamp,
    avg double,
    ce double,
    cg double,
    p double,
    w double,
    c double,
    sum double,
    last double,
    max double,
    min double,
    p75 double,
    p90 double,
    p95 double,
    squad double,
    sumq double,
    wavg double,
    weight double,
    PRIMARY KEY ((id), xid, ts)
) WITH CLUSTERING ORDER BY (xid DESC, ts DESC)
    AND compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND gc_grace_seconds = 86400
    AND caching = { 'keys': 'all', 'rows_per_partition': '36000' }
    AND min_index_interval = 2
    AND max_index_interval = 20;
Update 3: I tried:
.setMaxRequestsPerConnection(HostDistance.LOCAL, 1)
.setMaxRequestsPerConnection(HostDistance.REMOTE, 1)
with no change in the results.
Ultimately I think it will depend on what your code is doing. Can you share an example?
With regards to the increased latency, how are you measuring it? I ask based on this statement:
"The strange thing is that if I leave the 60-thread app running and start, on the same machine, a process with 1 thread, the extraction time of the single-threaded app is again 5 ms."
60 concurrent requests isn't much and, in general, you shouldn't need a thread per request when using the DataStax Java driver. You can achieve high throughput with a single application thread, since the Netty event loop group the driver uses does most of the work.
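For example (a sketch, assuming a session and a list of bound statements prepared as in the question), you can keep all 60 requests in flight from one thread with executeAsync():

import com.datastax.driver.core.*;

import java.util.ArrayList;
import java.util.List;

// Fire all queries from a single thread; the driver's Netty event
// loop handles the I/O, so no thread-per-request is needed.
List<ResultSetFuture> futures = new ArrayList<>();
for (BoundStatement bstmt : boundStatements) { // boundStatements: assumed input
    futures.add(session.executeAsync(bstmt));
}

// All requests are now in flight concurrently; gather the results.
for (ResultSetFuture future : futures) {
    ResultSet rs = future.getUninterruptibly();
    for (Row row : rs) {
        // process row...
    }
}

Note that iterating a ResultSet can still block when it fetches additional pages; if you want fully non-blocking processing you can instead register callbacks on each future with Guava's Futures.addCallback.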
The native protocol C* uses allows many requests per connection. As you have configured it here, each connection is maxed out at 32768 concurrent requests. In reality, you don't need to touch this configuration at all: the default (1000 requests per connection) is sensible, since in practice C* is not going to process more than native_transport_max_threads (128 by default in cassandra.yaml) at a time and will queue the rest.
Because of this, you do not need many connections to each host. The default of 1 core connection per host should be more than enough for 60 concurrent requests. Increasing the number of connections per host won't help much; in profiling I've found diminishing returns beyond 8 connections per host at high throughputs (thousands of concurrent requests), and throughput actually getting worse past 16 connections per host, though your mileage may vary based on environment.
With that said, I recommend not configuring PoolingOptions beyond the defaults, other than maybe setting core and max to 8 in scenarios where you are trying to achieve higher throughputs (> 10k requests/sec).
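Concretely, that would look something like this (a sketch; only raise the connection counts if you are actually pushing that kind of throughput):

import com.datastax.driver.core.*;

// For most workloads, just leave pooling at the defaults:
Cluster cluster = Cluster.builder()
    .addContactPoints(nodes)
    .build();

// Only for very high throughput (> 10k requests/sec), consider
// bumping core/max connections to 8 and leaving everything else alone:
PoolingOptions poolingOptions = new PoolingOptions()
    .setConnectionsPerHost(HostDistance.LOCAL, 8, 8);

Cluster highThroughputCluster = Cluster.builder()
    .addContactPoints(nodes)
    .withPoolingOptions(poolingOptions)
    .build();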