Linit — Lindenii init system and service manager
Status: early design RFC, no implementation yet.
"System state is a computed function of the dependency graph rather than a
mutable, imperative set of unit states."
Concepts
- Each service configuration is called a unit.
- Unit names are irrelevant to daily operation.
- Units provide one or more targets, which are used to specify
dependencies. It is illegal for more than one unit to provide the same target.
- Units and targets are distinct concepts!
- Runtime states
- A target's state is equivalent to the state of the unit that provides it.
- A unit (and therefore its targets) may have one of the following states:
- Disabled: Not required, so it's just off.
- Enabled: Collection of states where the graph calculates that the service
shouldn't be disabled:
- Waiting for dependencies
- Failed: The unit failed (non-zero exit, unexpected exit, etc.)
- Starting: The unit is being started and isn't ready yet.
- Active:
- Running: The unit is working properly.
- Reloading: The unit is working properly and is currently being reloaded.
- Exited: The main process exited successfully, which is expected for e.g. a
oneshot unit.
- Running (degraded): The unit is running, but one or more of its direct or
transitive dependencies are not working. None of the broken dependencies is
a direct hard dependency, since that would stop the unit. The unit is still
considered well and running; all of its dependents will be degraded too
(but not stopped).
- Stopping: The unit is being stopped.
- A target in a state other than "disabled" may have a "coincidental" flag set.
The flag means the target itself is not needed, but the unit that provides
it is not disabled because it also provides other targets that are needed,
so this target is coincidentally provided.
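The state model above can be written down as a small enum. This is only a sketch: the names State, ACTIVE, is_enabled, is_active and target_state are made up here, and it assumes that "Running (degraded)" counts as one of the active sub-states, which the "still considered well and running" wording suggests but does not spell out.

```python
from enum import Enum


class State(Enum):
    DISABLED = "disabled"
    WAITING = "waiting for dependencies"
    FAILED = "failed"
    STARTING = "starting"
    RUNNING = "running"
    RELOADING = "reloading"
    EXITED = "exited"
    DEGRADED = "running (degraded)"
    STOPPING = "stopping"


# Assumption: degraded is grouped under "Active" together with the
# running, reloading and exited sub-states.
ACTIVE = frozenset(
    {State.RUNNING, State.RELOADING, State.EXITED, State.DEGRADED}
)


def is_enabled(state):
    """Everything except DISABLED belongs to the "Enabled" collection."""
    return state is not State.DISABLED


def is_active(state):
    return state in ACTIVE


def target_state(target, provider_of, unit_states):
    """A target's state is the state of the unit that provides it."""
    return unit_states[provider_of[target]]


provider_of = {"imapd": "maddy", "smtpd": "maddy"}
unit_states = {"maddy": State.RUNNING}
print(target_state("smtpd", provider_of, unit_states).value)
```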
- Types of units
foreground-notified: The unit is a supervised foreground process that
supports sd_notify. It is considered starting until it sends READY=1 via
sd_notify. Other sd_notify states such as RELOADING=1 are also supported.
foreground-supervised: The unit is a supervised foreground process that does
not support sd_notify. By default it is considered active while the process
is running. If ready-wait is specified, the supervisor waits that many
seconds before considering the unit ready, provided the process has not
exited. If ready-grep is specified, the process must print a line matching
the specified regular expression to standard output or standard error
before it is considered ready.
background-pidfile: Traditional forking daemon that starts in the
foreground and forks into the background when it's ready, leaving a PID
file at a predetermined path.
background-scripted: Start/stop/reload are implemented by custom scripts.
oneshot: The unit is just a command run once. When it's running it's
considered "starting". It is considered "active but exited" when it exits
successfully, and it's considered "failed" when it exits unsuccessfully.
virtual: Exists solely for dependency management.
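To make the ready-grep idea a bit more concrete, here is a rough sketch of the check a supervisor might run over a unit's output. The function name ready_by_grep is hypothetical, and it assumes the pattern is applied to each line with search semantics (so a leading ^ anchors to the start of the line); the RFC does not pin down the regex dialect or matching rules.

```python
import re
from typing import Iterable


def ready_by_grep(lines: Iterable[str], pattern: str) -> bool:
    """Return True once any output line matches the ready-grep pattern.

    `lines` stands in for the merged stdout/stderr of the unit; a real
    supervisor would read these incrementally from pipes.
    """
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            return True
    # Output ended without a match: the unit never became ready.
    return False


output = ["loading config", "connected to irc.example.org"]
print(ready_by_grep(output, r"^connected to"))
```

The host name in the sample output is a placeholder, not something the RFC defines.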
- Units may have start and stop timeouts. If a unit is in the
"starting"/"stopping" states for longer than the timeout specified, it is
terminated (killed if unable to terminate), and it enters the failed state.
- There are several types of dependency relationships. All dependency
specifications use target names, not unit names.
depends-on marks a hard runtime dependency. The dependent does not start
until the dependency is active. If a dependency leaves the active state,
the dependent is stopped.
depends-ms marks a startup milestone. The dependency must start
successfully before the dependent starts. Stopping the dependency later
does not automatically stop the dependent.
waits-for waits for the dependency to either become active or fail before
the dependent is started.
Circular dependencies are illegal.
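The difference between the three relationships is easiest to see as two predicates: "may the dependent start?" and "must a running dependent be stopped?". The sketch below is one possible reading, not a settled design. The function names and the plain-string state names are invented for illustration, and it assumes "start successfully" for depends-ms means "reached an active state".

```python
# Assumption: these four sub-states all count as "active".
ACTIVE = {"running", "reloading", "exited", "degraded"}


def may_start(deps, states):
    """deps: list of (kind, target); states: target -> state name."""
    for kind, target in deps:
        state = states[target]
        if kind in ("depends-on", "depends-ms"):
            # Both require the dependency to be up before we start.
            if state not in ACTIVE:
                return False
        elif kind == "waits-for":
            # Either outcome is fine, as long as it is settled.
            if state not in ACTIVE and state != "failed":
                return False
        else:
            raise ValueError(f"unknown dependency kind: {kind!r}")
    return True


def must_stop(deps, states):
    """Only depends-on keeps binding after the dependent has started."""
    return any(
        kind == "depends-on" and states[target] not in ACTIVE
        for kind, target in deps
    )


deps = [("depends-ms", "network-online"), ("waits-for", "ircd")]
print(may_start(deps, {"network-online": "running", "ircd": "failed"}))
print(must_stop(deps, {"network-online": "failed", "ircd": "failed"}))
```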
- There are no ad-hoc commands to start/stop individual services. At each point
in time, the system is given one target to satisfy, and it deterministically
finds the minimum set of units needed to satisfy that target. If you want
temporary things to run, then just make a new target that lists your typical
target as a dependency, along with the targets of the temporary things.
- Typically you don't want to use depends-on in your main system target,
because when its dependency fails it would take down the entire system.
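Since each target has exactly one provider, finding the minimum set is a plain graph walk from the goal target. A rough sketch follows, using the units from the Examples section plus a hypothetical "mail" virtual unit that only needs smtpd. The function name and the dict layout are made up, and it assumes every dependency kind (depends-on, depends-ms, waits-for) pulls its target into the set.

```python
def minimum_units(goal, units):
    """units: unit name -> {"provides": [targets], "deps": [targets]}."""
    provider_of = {}
    for name, unit in units.items():
        for target in unit["provides"]:
            if target in provider_of:
                raise ValueError(
                    f"target {target!r} is provided by both "
                    f"{provider_of[target]!r} and {name!r}"
                )
            provider_of[target] = name

    needed_targets, enabled_units = set(), set()
    stack = [goal]
    while stack:
        target = stack.pop()
        if target in needed_targets:
            continue
        needed_targets.add(target)
        unit = provider_of[target]  # KeyError if nothing provides it
        enabled_units.add(unit)
        stack.extend(units[unit]["deps"])

    # Targets provided by an enabled unit that nobody asked for.
    provided = {t for u in enabled_units for t in units[u]["provides"]}
    return enabled_units, provided - needed_targets


units = {
    "mail": {"provides": ["mail"], "deps": ["smtpd"]},  # hypothetical
    "maddy": {"provides": ["imapd", "smtpd"], "deps": ["network-online"]},
    "network-online": {"provides": ["network-online"],
                       "deps": ["netif", "dhcp", "dns"]},
    "dhcpcd": {"provides": ["dhcp"], "deps": ["netif"]},
    "netif": {"provides": ["netif"], "deps": []},
    "unbound": {"provides": ["dns"], "deps": ["netif"]},
    "inspircd": {"provides": ["ircd"], "deps": ["network-online"]},
}
enabled, coincidental = minimum_units("mail", units)
print(sorted(enabled), sorted(coincidental))
```

Here inspircd stays disabled because nothing reachable from "mail" needs ircd, and imapd ends up coincidental because maddy is only enabled for smtpd.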
- Things are started as concurrently as possible, constrained by the dependency
model. For example, the first step could launch every unit that has no
dependencies, out of the ones calculated to be required to satisfy the
ultimate target. Set a maximum number of units starting at a time via the
central config file? Failures due to memory exhaustion during startup (should
be rare) will be handled via the ordinary restarts/retries when that gets
defined.
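One way to picture "as concurrently as possible" is to group the enabled units into waves, where each wave only depends on earlier waves. This is a simplification: a real supervisor would more likely start each unit the moment its own dependencies are active instead of waiting for a whole wave. The function name start_waves is made up, and the input is assumed to be already resolved from targets to units, with every dependency also present as a key. As a side effect, the loop notices circular dependencies.

```python
def start_waves(deps):
    """deps: unit -> set of units it depends on (enabled units only)."""
    remaining = {unit: set(d) for unit, d in deps.items()}
    waves = []
    while remaining:
        # Units whose dependencies have all been started already.
        ready = sorted(u for u, d in remaining.items() if not d)
        if not ready:
            raise ValueError(
                "circular dependency among: " + ", ".join(sorted(remaining))
            )
        waves.append(ready)
        for unit in ready:
            del remaining[unit]
        for d in remaining.values():
            d.difference_update(ready)
    return waves


deps = {
    "netif": set(),
    "dhcpcd": {"netif"},
    "unbound": {"netif"},
    "network-online": {"netif", "dhcpcd", "unbound"},
    "maddy": {"network-online"},
}
for wave in start_waves(deps):
    print(wave)
```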
- Imperative restarts are supported because they don't move a unit between
enabled and disabled.
TODOs
- How do we reload unit files? Reloading an entire directory of unit files at
once, yes, but what are the details?
- Formally define what "degraded" means. Currently it just means "that'd cause
the main system-state target to be degraded too" and it's surfaced to the
operator, so it's not behavior-changing. Should we make it behave
differently?
- Formally, how do we move from the current to the desired state?
- Formalize the semantics of restarts, retries, timeouts, exponential back-off, etc.
- Transient targets to perhaps add imperative control temporarily for
debugging?
- background-scripted needs to be used very carefully; it's kinda necessary
for compatibility with sysvinit scripts...
- More robust crash recovery in general
- Targets may currently only have one provider unit. This makes it easy to
determine the "minimum set". But it is a bit limiting. Research better ways to
select which unit provides each target. I don't want to go with eselect-level
complexity.
I could go with an "installed set" of unit files that's a subset of all
available unit files? But that feels a bit weird.
- Test_User's comments:
- "all units that provide a target should be started (optionally in parallel)
when that target is needed; the target is considered degraded if one fails
or w/e"
- "you'd need an actual disable mechanism that's prob better than editing
dependency graphs"
- How will compatibility with existing definitions work? Perhaps it'd be useful
to have a way to automatically translate systemd/OpenRC files into Linit
units. We also need formalized ways to support SysV/Upstart/etc. init scripts.
Examples
I forgot how the services below start or present readiness... these are just
conceptual examples.
Let's say you run Maddy, an email server that provides IMAP and SMTP in one
binary. It uses sd_notify and depends on the network being online — but we
don't want to terminate it when we go offline. Note that exec-stop is typically
unnecessary as we send SIGTERM by default; we just list it here to note that
you could change it.
unit: maddy
type: foreground-notified
provides: imapd, smtpd
depends-ms: network-online
exec-start: /usr/bin/maddy run
exec-reload: /bin/kill -USR1 $MAINPID
exec-stop: /bin/kill -TERM $MAINPID
network-online is a virtual unit that is satisfied if and only if your DHCP
client, network interfaces, local resolver, etc., are ready.
unit: network-online
type: virtual
provides: network-online
depends-on: netif, dhcp, dns
Of course, these dependencies also need to be defined:
unit: dhcpcd
provides: dhcp
depends-on: netif
...
unit: netif
provides: netif
type: oneshot
exec-start: /etc/my-script-to-set-up-interfaces-with-iproute2
...
unit: unbound
provides: dns
depends-on: netif
...
Let's run an IRC daemon.
unit: inspircd
provides: ircd
depends-ms: network-online
...
We also have an IRC bot that connects to multiple networks. We want it to wait
for our own IRC network (ircd) to be up before it begins, but if our own
network fails then it should just connect to the other networks anyway.
unit: irc-bot
type: foreground-supervised
provides: irc-bot
depends-ms: network-online
waits-for: ircd
ready-grep: ^connected to
exec-start: /usr/local/bin/irc-bot -c /etc/irc-bot.conf
exec-reload: /bin/kill -USR1 $MAINPID
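The unit file syntax is not specified anywhere in this RFC; the examples above are conceptual. Purely as an illustration, and assuming a unit really is a block of "key: value" lines where provides and the dependency keys take comma-separated lists, a parser could be as small as the sketch below. The names parse_unit and LIST_KEYS are hypothetical.

```python
# Assumption: these keys hold comma-separated target lists.
LIST_KEYS = {"provides", "depends-on", "depends-ms", "waits-for"}


def parse_unit(text):
    """Parse one unit block into a dict; list keys become Python lists."""
    unit = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        key, value = key.strip(), value.strip()
        if key in LIST_KEYS:
            unit[key] = [v.strip() for v in value.split(",") if v.strip()]
        else:
            unit[key] = value
    return unit


MADDY = """\
unit: maddy
type: foreground-notified
provides: imapd, smtpd
depends-ms: network-online
exec-start: /usr/bin/maddy run
exec-reload: /bin/kill -USR1 $MAINPID
exec-stop: /bin/kill -TERM $MAINPID
"""
print(parse_unit(MADDY)["provides"])
```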
Acknowledgements
Partially inspired by Dinit.