Remove PID files


If we rely on systemd to manage daemons, we no longer need PID files.

This should be done for every daemon that is not a "forking" systemd service (i.e. all except asterisk, wazo-confgend and wazo-provd).

To remove the PID files, we also need to move the restart logic from monit to systemd.

An example of a service file:
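A minimal sketch of such a unit, assuming a non-forking daemon and using the values worked out further down in this ticket. The service name, the ExecStart path and the choice of Restart=on-failure are assumptions made for illustration, not taken from an actual Wazo package.

```ini
# Hypothetical unit: the description and ExecStart path are placeholders.
[Unit]
Description=Example Wazo daemon (placeholder)
# Rate limiting: stop retrying after 15 starts within 150 seconds.
# Manual restarts count toward this limit too.
StartLimitIntervalSec=150
StartLimitBurst=15

[Service]
# Non-forking service: systemd tracks the main process itself,
# so no PIDFile= directive is needed.
Type=simple
ExecStart=/usr/bin/wazo-example
# Restart logic formerly handled by monit (on-failure is an assumption).
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

If the start limit is hit during development, systemctl reset-failed followed by the unit name (here it would be wazo-example.service) clears the failed state so the service can be started again.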

Some questions remain:

  • Should we increase the delay between restarts? (default: RestartSec=100ms)

    • Yes, if we have the use case: when a service fails to restart because it is waiting for another service, we should add a reasonable delay between retries

    • The old behavior (with monit) was to retry the restart every 2 min

    • A nice delay would be 4-5 seconds

  • Should we change StartLimitIntervalSec or StartLimitBurst? Manual restarts also count toward the limit, which can be painful during development (i.e. we have to use systemctl reset-failed to unlock the service)

    • we can multiply both values by a number of cycles, so that restart loops are still detected but development is not hindered

      • e.g. instead of StartLimitIntervalSec=3m and StartLimitBurst=5,
        we can use 3 cycles: StartLimitIntervalSec=9m and StartLimitBurst=15

To calculate StartLimitIntervalSec:

  • StartLimitBurst * (RestartSec + <service init time>) = 5 * (5s + 5s) = 50s

If we multiply by 3 cycles to avoid the manual restart issue, we get the following values:

  • StartLimitBurst=15

  • StartLimitIntervalSec=150

  • RestartSec=5


Remaining uses of PID files:

  • services using twistd still use a PID file (i.e. wazo-provd and wazo-confgend)

  • services using celery still use a PID file (i.e. wazo-webhookd)

    • IMO (fblackburn): it should not be up to the daemon to start subprocesses; it should be up to the admin to scale processes as desired. The PID file would then be removed in this scenario

  • services using a pidfile for scripts still use a PID file (i.e. wazo-call-logs, xivo-stat, wazo-purge-db)

    • needed to avoid running the command twice

  • services using custom PID logic still use a PID file (i.e. xivo-dxtora)




Sébastien Duthil
June 26, 2020, 6:28 PM

We could also configure monit to monitor the systemd service status instead of the PID file. I see no easy way to achieve the same behavior as monit (restart at most 5 times) with systemd.

François Blackburn
June 30, 2020, 2:11 PM

I confirm that once StartLimitBurst is hit and the service has gone into the failed state, systemd will not try to restart it after StartLimitIntervalSec has elapsed.


