java - How to control the number of mappers per region server when reading an HBase table
I have an HBase table (written through Apache Phoenix) that I need to read and write out to a flat text file. The current bottleneck is that the HBase (Phoenix) table has 32 salt buckets, so only 32 mappers are opened to read it, and once the data grows past 100 billion this becomes time-consuming. Can someone point me to how to control the number of mappers per region server when reading an HBase table? I have seen a program that explains this at the URL below, "https://gist.github.com/bbeaudreault/9788499", but it does not have a driver program that explains it fully. Can someone help?
In my observation, the number of regions of the table = the number of mappers opened by the framework.
So reducing the number of regions will in turn reduce the number of mappers.
How this can be done:
1) Pre-split the HBase table while creating it, e.g. on 0-9.
2) Load the data into these regions by generating a row prefix between 0-9.
Below are various ways of splitting:
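As one illustration of the two steps above, here is a minimal sketch in plain Java. It only builds the split points for a 0-9 prefix scheme and shows one possible way to derive the prefix for a row key. The class and method names are hypothetical, the hash-based prefix is an assumption rather than something the answer prescribes, and a Phoenix-salted table already has its salt byte managed by Phoenix, so this applies to row keys you generate yourself.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Sketch of a 0-9 prefix scheme. All names here are hypothetical. */
final class PrefixSplitSketch {

    /** Number of prefix buckets; 10 gives the single-digit prefixes 0-9. */
    static final int BUCKETS = 10;

    /**
     * Split points for a table with BUCKETS regions. N regions need N-1
     * split points, so this returns "1".."9": the first region holds rows
     * starting with "0", the last one rows starting with "9".
     */
    static byte[][] splitKeys() {
        List<byte[]> keys = new ArrayList<>();
        for (int i = 1; i < BUCKETS; i++) {
            keys.add(Integer.toString(i).getBytes(StandardCharsets.UTF_8));
        }
        return keys.toArray(new byte[0][]);
    }

    /**
     * Prepends a stable single-digit prefix derived from the key's hash
     * (an assumed scheme), so rows spread across the pre-split regions.
     */
    static String prefixedRowKey(String key) {
        int bucket = Math.floorMod(key.hashCode(), BUCKETS);
        return bucket + "-" + key;
    }

    public static void main(String[] args) {
        // With an HBase client on the classpath, the split points would be
        // handed to the createTable overload that accepts split keys, e.g.
        // admin.createTable(tableDescriptor, splitKeys()). Not called here.
        System.out.println("split points: " + splitKeys().length);
        System.out.println("example row key: " + prefixedRowKey("user42"));
    }
}
```

With ten regions created this way, a table-scan job would be expected to open roughly ten mappers, in line with the one-mapper-per-region observation above.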
Also, have a look at apache-hbase-region-splitting-and-merging.
Moreover, setting the number of mappers does not guarantee that that many will be opened; the actual number is driven by the input splits.
You can change the number of mappers using setNumMapTasks
or conf.set("mapred.map.tasks", "numberOfMappersYouWantToSet")
(but this is only a suggestion to the framework, not a hard setting).
About the link you provided, I don't know how it works; you can check with the author.