Python multiprocessing pool with shared data (dictionary keys)
I'm attempting to speed up a multivariate fixed-point iteration algorithm using multiprocessing; however, I'm running into issues dealing with shared data. My solution vector is actually a named dictionary rather than a vector of numbers. Each element of the vector is computed using a different formula. At a high level, I have an algorithm like this:
```python
current_estimate = previous_estimate
while True:
    for state in all_states:
        current_estimate[state] = state.getValue(previous_estimate)
    if norm(current_estimate, previous_estimate) < tolerance:
        break
    else:
        previous_estimate, current_estimate = current_estimate, previous_estimate
```
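For context, a runnable serial version of this loop might look like the sketch below. The update rule and the `norm()` function are stand-ins I made up for illustration (the real formulas differ):

```python
# Serial sketch of the fixed-point loop above. The update rule and
# norm() are hypothetical stand-ins; the real formulas differ.
class State(object):
    def __init__(self, target):
        self.target = target

    def getValue(self, est):
        # toy update: move halfway toward a fixed target each iteration
        return 0.5 * (est[self] + self.target)

def norm(a, b):
    # max absolute difference between the two estimate dictionaries
    return max(abs(a[k] - b[k]) for k in a)

all_states = [State(2.0), State(-1.0)]
previous_estimate = dict((s, 0.0) for s in all_states)
current_estimate = dict(previous_estimate)
tolerance = 1e-9

while True:
    for state in all_states:
        current_estimate[state] = state.getValue(previous_estimate)
    if norm(current_estimate, previous_estimate) < tolerance:
        break
    else:
        previous_estimate, current_estimate = current_estimate, previous_estimate
```

With this toy rule each entry converges to its `target`, so the loop terminates once successive estimates agree to within `tolerance`.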
I'm trying to parallelize the for-loop part using multiprocessing. The previous_estimate variable is read-only, and each process only needs to write to one element of current_estimate. My current attempt at rewriting the for-loop is as follows:
```python
import itertools
from multiprocessing import Manager, Pool

# class and function definitions
class A(object):
    def __init__(self, val):
        self.val = val

    # representative getValue function
    def getValue(self, est):
        return est[self] + self.val

def worker(state, in_est, out_est):
    out_est[state] = state.getValue(in_est)

def worker_star(a_b_c):
    """Allow multiple arguments with Pool.
    Taken from http://stackoverflow.com/a/5443941/3865495
    """
    return worker(*a_b_c)

# initialize the test environment
manager = Manager()
estimates = manager.dict()
all_states = []
for i in range(5):
    a = A(i)
    all_states.append(a)
    estimates[a] = 0

pool = Pool(processes=2)
prev_est = estimates
curr_est = estimates
pool.map(worker_star,
         itertools.izip(all_states, itertools.repeat(prev_est), itertools.repeat(curr_est)))
```
The issue I'm running into is that the elements added to the all_states array are not the same as those added to the manager.dict(). I keep getting KeyError exceptions when trying to access elements of the dictionary using elements of the array. While debugging, I found that none of the elements are the same:
```python
print map(id, estimates.keys())
>>> [19558864, 19558928, 19558992, 19559056, 19559120]
print map(id, all_states)
>>> [19416144, 19416208, 19416272, 19416336, 19416400]
```
This is happening because the objects you're putting into the estimates DictProxy aren't actually the same objects that live in the underlying dict. The manager.dict() call returns a DictProxy, which proxies access to a dict that actually lives in a completely separate manager process. When you insert things into it, they're really being copied and sent to that remote process, which means each copy is going to have a different identity.
To work around this, you can define your own __eq__ and __hash__ functions on A, as described in this question:
```python
class A(object):
    def __init__(self, val):
        self.val = val

    # representative getValue function
    def getValue(self, est):
        return est[self] + self.val

    def __hash__(self):
        return hash(self.__key())

    def __key(self):
        return (self.val,)

    def __eq__(x, y):
        return x.__key() == y.__key()
```
This means that key lookups for items in estimates will use the value of the val attribute to establish identity and equality, rather than the id assigned by Python.