python - redis.exceptions.ConnectionError after approximately one day of Celery running
This is the full trace:
Traceback (most recent call last):
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/app/trace.py", line 283, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/base.py", line 256, in store_result
    request=request, **kwargs)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/base.py", line 490, in _store_result
    self.set(self.get_key_for_task(task_id), self.encode(meta))
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 160, in set
    return self.ensure(self._set, (key, value), **retry_policy)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 149, in ensure
    **retry_policy
  File "/home/server/backend/venv/lib/python3.4/site-packages/kombu/utils/__init__.py", line 243, in retry_over_time
    return fun(*args, **kwargs)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 169, in _set
    pipe.execute()
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/client.py", line 2593, in execute
    return execute(conn, stack, raise_on_error)
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/client.py", line 2447, in _execute_transaction
    connection.send_packed_command(all_cmds)
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/connection.py", line 532, in send_packed_command
    self.connect()
  File "/home/pserver/backend/venv/lib/python3.4/site-packages/redis/connection.py", line 436, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 0 connecting to localhost:6379. Error.

[2016-09-21 10:47:18,814: WARNING/Worker-747] Data collector is not contactable. This can be because of a network issue or because the data collector is being restarted. If contact cannot be made after a period of time, please report this problem to New Relic support for further investigation. The error raised was ConnectionError(ProtocolError('Connection aborted.', BlockingIOError(11, 'Resource temporarily unavailable')),).
I searched for this ConnectionError but could not find a problem matching mine.
My platform is Ubuntu 14.04. This is the relevant part of my Redis config. (I can share the whole redis.conf file if needed. By the way, the parameters in the LIMITS section are all commented out.)
# By default Redis listens for connections from all the network interfaces
# available on the server. It is possible to listen to just one or multiple
# interfaces using the "bind" configuration directive, followed by one or
# more IP addresses.
#
# Examples:
#
# bind 192.168.1.100 10.0.0.1
bind 127.0.0.1

# Specify the path for the unix socket that will be used to listen for
# incoming connections. There is no default, so Redis will not listen
# on a unix socket when not specified.
#
# unixsocket /var/run/redis/redis.sock
# unixsocketperm 755

# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0

# TCP keepalive.
#
# If non-zero, use SO_KEEPALIVE to send TCP ACKs to clients in absence
# of communication. This is useful for two reasons:
#
# 1) Detect dead peers.
# 2) Take the connection alive from the point of view of network
#    equipment in the middle.
#
# On Linux, the specified value (in seconds) is the period used to send ACKs.
# Note that to close the connection the double of the time is needed.
# On other kernels the period depends on the kernel configuration.
#
# A reasonable value for this option is 60 seconds.
tcp-keepalive 60
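To confirm that the running server actually uses these values, they can be read back at runtime. A small check with redis-py (a standalone snippet, not part of my wrapper below):

import redis

r = redis.Redis(host='127.0.0.1', port=6379)

# These should match the redis.conf excerpt above (timeout 0, tcp-keepalive 60).
print(r.config_get('timeout'))
print(r.config_get('tcp-keepalive'))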
This is my mini Redis wrapper:
import redis
from django.conf import settings

redis_pool = redis.ConnectionPool(host=settings.REDIS_HOST, port=settings.REDIS_PORT)


def get_redis_server():
    return redis.Redis(connection_pool=redis_pool)
And this is how I use it:
from celery import shared_task

from redis_wrapper import get_redis_server

# the view and the task run in different, independent processes


def sample_view(request):
    rs = get_redis_server()
    # get/set stuff with redis


@shared_task
def sample_celery_task():
    rs = get_redis_server()
    # get/set stuff with redis
Package versions:
celery==3.1.18
django-celery==3.1.16
kombu==3.0.26
redis==2.10.3
So the problem is this: the connection error starts occurring some time after the Celery workers are started, and once the error is first seen, all subsequent tasks fail with the same error until I restart the Celery workers. (Interestingly, Celery Flower also fails during the problematic period.)
I suspect the way I use the Redis connection pool, or the Redis configuration, or, less likely, a network issue. Any ideas about the reason? What am I doing wrong?
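One thing I am considering is creating the pool with keepalive and timeout options, so dead connections are detected and timed-out commands are retried instead of failing the task. A minimal sketch of such a pool, assuming the installed redis-py version accepts the socket_keepalive, socket_timeout and retry_on_timeout keyword arguments:

import redis
from django.conf import settings

# Same wrapper as above, but the pool is created with TCP keepalive and a
# socket timeout so stale connections fail fast and timeouts are retried once.
redis_pool = redis.ConnectionPool(
    host=settings.REDIS_HOST,
    port=settings.REDIS_PORT,
    socket_keepalive=True,   # pairs with tcp-keepalive 60 on the server side
    socket_timeout=10,       # seconds
    retry_on_timeout=True,
)


def get_redis_server():
    # Redis objects are cheap; they borrow connections from the shared pool.
    return redis.Redis(connection_pool=redis_pool)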
(PS: I will add the redis-cli INFO results when I see the error again today.)
UPDATE:

I temporarily solved the problem by adding the --maxtasksperchild parameter to the worker start command, set to 200. Of course this is not a proper way to solve the problem, only a symptomatic cure: it recycles the worker instance periodically (the old process is closed and a new one is created when the old one has processed 200 tasks), which also refreshes the global Redis pool and its connections. So I think I should focus on the way the global Redis connection pool is used, and I'm still waiting for new ideas and comments.
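For reference, the same limit can also be set in the project settings instead of on the command line (Celery 3.x setting name; "proj" below stands for the actual app name):

# Equivalent to starting the worker with:
#   celery -A proj worker --maxtasksperchild=200
# In settings.py (Celery 3.x / django-celery setting name):
CELERYD_MAX_TASKS_PER_CHILD = 200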
Sorry for my bad English, and thanks in advance.
Have you enabled the RDB background save method in Redis?

If so, check the size of the dump.rdb file in /var/lib/redis. The file can grow until it fills the root partition, after which the Redis instance cannot save it anymore and starts rejecting writes.

You can stop Redis from rejecting writes after a failed background save by issuing the

config set stop-writes-on-bgsave-error no

command on redis-cli.
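A quick way to check whether this is what is happening, sketched with redis-py (standard Redis INFO/CONFIG fields; adjust host and port if Redis is not local):

import redis

r = redis.Redis(host='localhost', port=6379)

# 'err' means the last background save failed (for example because the disk
# holding /var/lib/redis/dump.rdb is full); by default Redis then refuses writes.
print(r.info('persistence').get('rdb_last_bgsave_status'))

# Current value of the option mentioned above ('yes' is the default).
print(r.config_get('stop-writes-on-bgsave-error'))

# Equivalent to the redis-cli command above:
# r.config_set('stop-writes-on-bgsave-error', 'no')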