systemd System and Service Manager
CHANGES WITH 249:
* When operating on disk images via the --image= switch of various
tools (such as systemd-nspawn or systemd-dissect), or when udev finds
no 'root=' parameter on the kernel command line, and multiple
suitable root or /usr/ partitions exist in the image, then a simple
comparison inspired by strverscmp() is done on the GPT partition
label, and the newest partition is picked. This permits a simple and
generic whole-file-system A/B update logic where new operating system
versions are dropped into partitions whose label is then updated with
a matching version identifier.
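The selection rule above can be illustrated with a short Python sketch.
This is not systemd's implementation: the ordering below is a simplified
stand-in for a strverscmp()-style comparison (digit runs compare as
numbers, everything else as text), and the partition labels are
hypothetical.

```python
import re

def version_key(label: str):
    """Split a label into text and number chunks so that digit runs are
    compared numerically rather than character by character."""
    chunks = re.split(r"(\d+)", label)
    # Tag each chunk (0 = text, 1 = number) so that text and integers are
    # never compared with each other directly.
    return [(1, int(c)) if c.isdecimal() else (0, c) for c in chunks if c]

def pick_newest(labels):
    """Return the label that sorts highest under version_key()."""
    return max(labels, key=version_key)

# Hypothetical GPT partition labels of two root partitions in one image.
labels = ["fooOS_9.2", "fooOS_10.1"]
print(pick_newest(labels))
```

A plain string comparison would prefer "fooOS_9.2" here, because "9"
sorts after "1"; the version-aware key picks "fooOS_10.1" instead.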
* systemd-sysusers now supports querying the passwords to set for the
users it creates via the "credentials" logic introduced in v247: the
passwd.hashed-password.<user> and passwd.plaintext-password.<user>
credentials are consulted for the password to use (either in UNIX
hashed form, or literally). By default these credentials are inherited
down from PID1 (which in turn imports them from a container manager if
there is one). This permits easy configuration of user passwords
during first boot. Example:
# systemd-nspawn -i foo.raw --volatile=yes --set-credential=passwd.plaintext-password.root:foo
Note that systemd-sysusers operates in purely additive mode: it
executes no operation if the declared users already exist, and hence
doesn't set any passwords as effect of the command line above if the
specified root user exists already in the image. (Note that
--volatile=yes ensures it doesn't, though.)
* systemd-firstboot now also supports querying various system
parameters via the credentials subsystem. Thus, as above, this may be
used to initialize important system parameters on first boot of
previously unprovisioned images (i.e. images with a mostly empty
/etc/).
* PID 1 may now show both the unit name and the unit description
strings in its status output during boot. This may be configured with
StatusUnitFormat=combined in system.conf or
systemd.status-unit-format=combined on the kernel command line.
* The systemd-machine-id-setup tool now supports a --image= switch for
provisioning a machine ID file into an OS disk image, similar to how
--root= operates on an OS file tree. This matches the existing switch
of the same name for systemd-tmpfiles, systemd-firstboot, and
systemd-sysusers tools.
* Similarly, systemd-repart gained support for the --image= switch too.
In combination with the existing --size= option, this makes the tool
particularly useful for easily growing disk images in a single
invocation, following the declarative rules included in the image
itself.
* systemd-repart's partition configuration files gained support for a
new switch MakeDirectories= which may be used to create arbitrary
directories inside file systems that are created, before registering
them in the partition table. This is useful in particular for root
partitions to create mount point directories for other partitions
included in the image. For example, a disk image that contains root,
/home/, and /var/ partitions may set "MakeDirectories=/home /var" to
create /home/ and /var/ as empty directories in the root file system
on its creation, so that the resulting image can be mounted
immediately, even in read-only mode.
* systemd-repart's CopyBlocks= setting gained support for the special
value "auto". If used, a suitable matching partition on the booted OS
is found as source to copy blocks from. This is useful when
implementing replicating installers, that are booted from one medium
and then stream their own root partition onto the target medium.
* systemd-repart's partition configuration files gained support for
Flags=, ReadOnly= and NoAuto= settings, allowing control of these
GPT partition flags for the created partitions: this is useful for
marking newly created partitions as read-only, or as not being
subject to automatic mounting from creation on.
* The /etc/os-release file has been extended with two new (optional)
variables IMAGE_VERSION= and IMAGE_ID=, carrying identity and version
information for OS images that are updated comprehensively and
atomically as one image. Two new specifiers, %M and %A, now resolve
to IMAGE_ID= and IMAGE_VERSION= respectively in the various
configuration options that resolve specifiers.
* portablectl gained a new switch --extension= for enabling portable
service images with extensions that follow the extension image
concept introduced with v248, and thus allows layering multiple
images when setting up the root filesystem of the service.
* systemd-coredump will now extract ELF build-id information from
processes dumping core and include it in the coredump report.
Moreover, it will look for ELF .note.package sections with
distribution packaging meta-information about the crashing process.
This is useful to directly embed the rpm or deb (or any other)
package name and version in ELF files, making it easy to match
coredump reports with the specific package for which the software was
compiled. This is particularly useful in environments with ELF files
from multiple vendors, different distributions and versions, as is
common today in our containerized and sand-boxed world. For further
information, see:
https://systemd.io/COREDUMP_PACKAGE_METADATA
* A new udev hardware database has been added for FireWire devices
(IEEE 1394).
* The "net_id" built-in of udev has been updated with three
backwards-incompatible changes:
- PCI hotplug slot names on s390 systems are now parsed as
hexadecimal numbers. They were incorrectly parsed as decimal
previously, or ignored if the name was not a valid decimal
number.
- PCI onboard indices up to 65535 are allowed. Previously, numbers
above 16383 were rejected. This primarily impacts s390 systems,
where values up to 65535 are used.
- Invalid characters in interface names are replaced with "_".
The new version of the net naming scheme is "v249". The previous
scheme can be selected via the "net.naming-scheme=v247" kernel
command line parameter.
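Why the first of these changes is backwards-incompatible can be seen in
a small Python sketch. The two helper functions are hypothetical and
only model the parsing rules as worded above; they are not taken from
the udev sources.

```python
def parse_slot_pre_v249(name: str):
    """Old behaviour as described above: decimal only, otherwise ignored."""
    if name.isascii() and name.isdecimal():
        return int(name, 10)
    return None

def parse_slot_v249(name: str):
    """New behaviour as described above: the name is a hexadecimal number."""
    try:
        return int(name, 16)
    except ValueError:
        return None

# Hypothetical slot names: one that is valid in both bases, one hex-only.
for name in ("10", "1a"):
    print(name, parse_slot_pre_v249(name), parse_slot_v249(name))
```

The slot name "10" yields a different number under the new rule (16
instead of 10), and "1a" is no longer ignored, so the resulting
interface names can differ between the two naming schemes.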
* sd-bus' sd_bus_is_ready() and sd_bus_is_open() calls now accept a
NULL bus object, for which they will return false. Or in other words,
an unallocated bus connection is neither ready nor open.
* The sd-device API acquired a new API function
sd_device_get_usec_initialized() that returns the monotonic time when
the udev device first appeared in the database.
* sd-device gained new APIs sd_device_trigger_with_uuid() and
sd_device_get_trigger_uuid(). The former is similar to
sd_device_trigger() but returns a randomly generated UUID that is
associated with the synthetic uevent generated by the call. This UUID
may be read from the sd_device object a monitor eventually receives,
via sd_device_get_trigger_uuid(). This interface requires kernel
4.13 or above to work, and allows tracking a synthetic uevent through
the entire device management stack. The "udevadm trigger --settle"
logic has been updated to make use of this concept if available to
wait precisely for the uevents it generates. "udevadm trigger" also
gained a new parameter --uuid that prints the UUID for each generated
uevent.
* sd-device also gained new APIs sd_device_new_from_ifname() and
sd_device_new_from_ifindex() for allocating an sd-device object for
the specified network interface. The former accepts an interface name
(either a primary or an alternative name), the latter an interface
index.
* The native Journal protocol has been documented. Clients may use it
as an alternative to the classic BSD syslog protocol for locally
delivering log records to the Journal. The protocol has been stable
for a long time and has in fact already been implemented in a variety
of alternative client libraries. This documentation makes the support
for that official:
https://systemd.io/JOURNAL_NATIVE_PROTOCOL
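As a rough illustration of the wire format, here is a minimal Python
sketch of the field serialization, based on a reading of the linked
documentation: simple values are written as "FIELD=value" lines, while
values containing a newline are written as the field name, a newline, a
64-bit little-endian length, the data and a trailing newline. The
function name is hypothetical, field names are not validated, and
actually sending the result to the Journal is left out.

```python
import struct

def serialize_fields(fields: dict) -> bytes:
    """Serialize journal fields into a single native-protocol payload."""
    out = bytearray()
    for name, value in fields.items():
        data = value.encode("utf-8") if isinstance(value, str) else bytes(value)
        key = name.encode("ascii")
        if b"\n" in data:
            # Binary-safe form: name, newline, 64-bit LE size, data, newline.
            out += key + b"\n" + struct.pack("<Q", len(data)) + data + b"\n"
        else:
            # Simple form for values without embedded newlines.
            out += key + b"=" + data + b"\n"
    return bytes(out)

payload = serialize_fields({"MESSAGE": "hello", "PRIORITY": "6"})
print(payload)
```

The length-prefixed form is what allows multi-line or binary values to
be carried without any escaping.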
* A new BPFProgram= setting has been added to service files. It may be
set to a path to a loaded kernel BPF program, i.e. a path to a bpffs
file, or a bind mount or symlink to one. This may be used to upload
and manage BPF programs externally and then hook arbitrary systemd
services into them.
* The "home.arpa" domain, which RFC 8375 officially designates for use
in local home networks, has been added to the default NTA (negative
trust anchor) list of systemd-resolved, since DNSSEC is generally not
available on private domains.
* The CPUAffinity= setting of unit files now resolves "%" specifiers.
* A new ManageForeignRoutingPolicyRules= setting has been added to
.network files which may be used to exclude foreign-created routing
policy rules from systemd-networkd management.
* systemd-networkd-wait-online gained two new switches -4 and -6 that
may be used to tweak whether to wait for only IPv4 or only IPv6
connectivity.
* .network files gained a new RequiredFamilyForOnline= setting to
fine-tune whether to require an IPv4 or IPv6 address in order to
consider an interface "online".
* networkctl will now show an overall "online" state in the per-link
information.
* In .network files a new OutgoingInterface= setting has been added to
specify the output interface in bridge FDB setups.
* In .network files the Multipath group ID may now be configured for
[NextHop] entries, via the new Group= setting.
* The DHCP server logic configured in .network files gained a new
setting RelayTarget= that turns the server into a DHCP server relay.
The RelayAgentCircuitId= and RelayAgentRemoteId= settings may be used
to further tweak the DHCP relay behaviour.
* The DHCP server logic also gained a new ServerAddress= setting in
.network files that explicitly specifies the server IP address to
use. If not specified, the address is determined automatically, as
before.
* The DHCP server logic in systemd-networkd gained support for static
DHCP leases, configurable via the [DHCPServerStaticLease]
section. This allows explicitly mapping specific MAC addresses to
fixed IP addresses and vice versa.
* The RestrictAddressFamilies= setting in service files now supports a
new special value "none". If specified, sockets of all address
families will be made unavailable to services configured that way.
* systemd-fstab-generator and systemd-repart have been updated to
support booting from disks that carry only a /usr/ partition but no
root partition yet, and where systemd-repart can add it in on the
first boot. This is useful for implementing systems that ship with a
single /usr/ file system, and whose root file system shall be set up
and formatted on a LUKS-encrypted volume whose key is generated
locally (and possibly enrolled in the TPM) during the first boot.
* The [Address] section of .network files now accepts a new
RouteMetric= setting that configures the routing metric to use for
the prefix route created as effect of the address configuration.
Similarly, the [DHCPv6PrefixDelegation] and [IPv6Prefix] sections
gained matching settings for their prefix routes. (The option of the
same name in the [DHCPv6] section is moved to [IPv6AcceptRA], since
it conceptually belongs there; the old option is still understood for
compatibility.)
* The DHCPv6 IAID and DUID are now explicitly configurable in .network
files.
* A new udev property ID_NET_DHCP_BROADCAST on network interface
devices is now honoured by systemd-networkd, controlling whether to
issue DHCP offers via broadcasting. This is used to ensure that s390
layer 3 network interfaces work out-of-the-box with systemd-networkd.
* nss-myhostname and systemd-resolved will now synthesize address
records for a new special hostname "_outbound". The name will always
resolve to the local IP addresses most likely used for outbound
connections towards the default routes. On multi-homed hosts this is
useful to have a stable handle referring to "the" local IP address
that matters most, to the extent that this is defined.
* The Discoverable Partition Specification has been updated with a new
GPT partition flag "grow-file-system" defined for its partition
types. Whenever partitions with this flag set are automatically
mounted (i.e. via systemd-gpt-auto-generator or the --image= switch
of systemd-nspawn or other tools; and as opposed to explicit mounting
via /etc/fstab), the file system within the partition is
automatically grown to the full size of the partition. If the file
system size already matches the partition size this flag has no
effect. Previously, this functionality has been available via the
explicit x-systemd.growfs mount option, and this new flag extends
this to automatically discovered mounts. A new GrowFileSystem=
setting has been added to systemd-repart drop-in files that allows
configuring this partition flag. This new flag defaults to on for
partitions automatically created by systemd-repart, except if they
are marked read-only. See the specification for further details:
https://systemd.io/DISCOVERABLE_PARTITIONS
* .network files gained a new setting RoutesToNTP= in the [DHCPv4]
section. If enabled (which is the default), and an NTP server address
is acquired through a DHCP lease on this interface, an explicit route
to this address is created on this interface to ensure that NTP
traffic to the NTP server acquired on an interface is also routed
through that interface. The pre-existing RoutesToDNS= setting that
implements the same for DNS servers is now enabled by default.
* A pair of service settings SocketBindAllow= + SocketBindDeny= have
been added that may be used to restrict the socket addresses (address
family and IP ports) that sockets created by the service may be bound
to. This is implemented via BPF.
* A new ConditionFirmware= setting has been added to unit files to
conditionalize on certain firmware features. At the moment it may
check whether running on a UEFI system, a device-tree system, or if
the system is compatible with some specified device-tree feature.
* A new ConditionOSRelease= setting has been added to unit files to
check os-release(5) fields. The "=", "!=", "<", "<=", ">=", ">"
operators may be used to check if some field has some specific value
or do an alphanumerical comparison. Equality comparisons are useful
for fields like ID, while relative comparisons are useful for fields
like VERSION_ID or IMAGE_VERSION.
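The following Python sketch shows how such a check could be evaluated
against os-release data. It is illustrative only: the parser handles
quoting in the simplest possible way, the relative comparison is a
simplified numeric-aware ordering rather than systemd's own, the
treatment of missing fields is not modelled, and the os-release content
used in the example is hypothetical.

```python
import re

def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines, ignoring comments; quoting is handled only
    in the simplest way (surrounding quotes are stripped)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    return fields

def version_key(value: str):
    """Simplified ordering: digit runs compare as numbers, the rest as text."""
    chunks = re.split(r"(\d+)", value)
    return [(1, int(c)) if c.isdecimal() else (0, c) for c in chunks if c]

def check(condition: str, fields: dict) -> bool:
    """Evaluate an expression such as 'VERSION_ID>=20.10'."""
    # Two-character operators are listed first so "<=" is not read as "<".
    m = re.fullmatch(r"([A-Z0-9_]+)(<=|>=|!=|<|>|=)(.*)", condition)
    if m is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    key, op, wanted = m.groups()
    have = fields.get(key, "")
    a, b = version_key(have), version_key(wanted)
    results = {"=": have == wanted, "!=": have != wanted,
               "<": a < b, "<=": a <= b, ">": a > b, ">=": a >= b}
    return results[op]

# Hypothetical os-release content.
fields = parse_os_release('ID=fooos\nVERSION_ID="21.04"\n')
print(check("ID=fooos", fields), check("VERSION_ID>=20.10", fields))
```

Equality is checked on the raw strings, whereas the relative operators
go through version_key(), so that "21.04" is not considered smaller
than "9".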
* hostnamed gained a new Describe() D-Bus method that returns a JSON
serialization of the host data it exposes. This is exposed via
"hostnamectl --json=" to acquire a host identity description in JSON.
It's our intention to add similar features to most services and
objects systemd manages, in order to simplify integration with
program code that can consume JSON.
* Similarly, networkd gained a Describe() method on its Manager and
Link bus objects. This is exposed via "networkctl --json=".
* hostnamectl's various "get-xyz"/"set-xyz" verb pairs
(e.g. "hostnamectl get-hostname", "hostnamectl set-hostname") have
been replaced by a single "xyz" verb (e.g. "hostnamectl hostname")
that is used both to get the value (when no argument is given), and
to set the value (when an argument is specified). The old names
continue to be supported for compatibility.
* systemd-detect-virt and ConditionVirtualization= are now able to
correctly identify Amazon EC2 environments.
* The LogLevelMax= setting of unit files now applies not only to log
messages generated *by* the service, but also to log messages
generated *about* the service by PID 1. To suppress logs concerning a
specific service comprehensively, set this option to a high log
level.
* bootctl gained support for a new --make-machine-id-directory= switch
that allows precise control on whether to create the top-level
per-machine directory in the boot partition that typically contains
Type 1 boot loader entries.
* During build SBAT data to include in the systemd-boot EFI PE binaries
may be specified now.
* /etc/crypttab learnt a new option "headless". If specified, any
requests to query the user interactively for passwords or PINs will
be skipped. This is useful on systems that are headless, i.e. where
an interactive user is generally not present.
* /etc/crypttab also learnt a new option "password-echo=" that allows
configuring whether the encryption password prompt shall echo the
typed password and if so, do so literally or via asterisks. (The
default is the same behaviour as before: provide echo feedback via
asterisks.)
* FIDO2 support in systemd-cryptenroll/systemd-cryptsetup and
systemd-homed has been updated to allow explicit configuration of the
"user presence" and "user verification" checks, as well as whether a
PIN is required for authentication, via the new switches
--fido2-with-user-presence=, --fido2-with-user-verification=,
--fido2-with-client-pin= to systemd-cryptenroll and homectl. Which
features are available, and may be enabled or disabled, depends on the
FIDO2 token used.
* systemd-nspawn's --private-users= switch now accepts the special value
"identity" which configures a user namespacing environment with an
identity mapping of 65535 UIDs. This means the container UID 0 is
mapped to the host UID 0, and the UID 1 to host UID 1. On first look
this doesn't appear to be useful, however it does reduce the attack
surface a bit, since the resulting container will possess process
capabilities only within its namespace and not on the host.
* systemd-nspawn's --private-users-chown switch has been replaced by a
more generic --private-users-ownership= switch that accepts one of
four values: "chown" is equivalent to the old --private-users-chown,
and "off" is equivalent to the absence of the old switch. The value
"map" uses the new UID mapping mounts of Linux 5.12 to map ownership
of files and directories of the underlying image to the chosen UID
range for the container. "auto" is equivalent to "map" if UID mapping
mounts are supported, otherwise it is equivalent to "chown". The short
-U switch of systemd-nspawn now implies --private-users-ownership=auto
instead of the old --private-users-chown. Effectively this means: if
the backing file system supports UID mapping mounts the feature is
now used by default if -U is used. Generally, it's a good idea to use
UID mapping mounts instead of recursive chown()ing, since it allows
running containers off immutable images (since no modifications of
the images need to take place), and share images between multiple
instances. Moreover, the recursive chown()ing operation is slow and
can be avoided. Conceptually it's also a good thing if transient UID
range uses do not leak into persistent file ownership anymore. TLDR:
finally, the last major drawback of user namespacing has been
removed, and -U should always be used (unless you use btrfs, where
UID mapped mounts do not exist; or your container actually needs
privileges on the host).
* nss-systemd now synthesizes user and group shadow records in addition
to the main user and group records. Thus, hashed passwords managed by
systemd-homed are now accessible via the shadow database.
* The userdb logic (and thus nss-systemd, and so on) now reads
additional user/group definitions in JSON format from the drop-in
directories /etc/userdb/, /run/userdb/, /run/host/userdb/ and
/usr/lib/userdb/. This is a simple and powerful mechanism for making
additional users available to the system, with full integration into
NSS including the shadow databases. Since the full JSON user/group
record format is supported this may also be used to define users with
resource management settings and other runtime settings that
pam_systemd and systemd-logind enforce at login.
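A minimal Python sketch of generating such a drop-in follows. It makes
several assumptions that should be checked against the userdb and JSON
user record documentation: that records are stored as "<name>.user"
files, that the field names userName, uid, gid, homeDirectory and shell
are sufficient for a minimal record, and that a matching group record is
provided separately. The helper name and the example user are
hypothetical, and the record is written to a temporary directory rather
than to one of the real drop-in directories.

```python
import json
import os
import tempfile

def write_user_record(dropin_dir, name, uid, gid, home, shell="/bin/sh"):
    """Write a minimal JSON user record and return the path of the file."""
    record = {
        "userName": name,
        "uid": uid,
        "gid": gid,
        "homeDirectory": home,
        "shell": shell,
    }
    # Assumed naming convention: one "<name>.user" file per user.
    path = os.path.join(dropin_dir, f"{name}.user")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
        f.write("\n")
    return path

with tempfile.TemporaryDirectory() as tmp:
    path = write_user_record(tmp, "alice", 4711, 4711, "/home/alice")
    print(os.path.basename(path))
```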
* The userdbctl tool gained two new switches --with-dropin= and
--with-varlink= which can be used to fine-tune the sources used for
user database lookups.
* systemd-nspawn gained a new switch --bind-user= for binding a host
user account into the container. This does three things: the user's
home directory is bind mounted from the host into the container,
below the /run/userdb/home/ hierarchy. A free UID is picked in the
container, and a user namespacing UID mapping to the host user's UID
installed. And finally, a minimal JSON user and group record (along
with its hashed password) is dropped into /run/host/userdb/. These
records are picked up automatically by the userdb drop-in logic
described above, and allow the user to log in with the same password as
on the host. Effectively this means: if host and container run new
enough systemd versions, making a host user available to the container
is trivially simple.
* systemd-journal-gatewayd now supports the switches --user, --system,
--merge, --file= that are equivalent to the same switches of
journalctl, and permit exposing only the specified subset of the
Journal records.
* The OnFailure= dependency between units is now augmented with an
implicit reverse dependency OnFailureOf= (this new dependency cannot
be configured directly; it is only created as effect of an OnFailure=
dependency in the reverse order, but it is visible in "systemctl
show"). Similarly, Slice= now has a reverse dependency SliceOf=,
that is also not configurable directly, but useful to determine all
units that are members of a slice.
* A pair of new dependency types between units PropagatesStopTo= +
StopPropagatedFrom= has been added, that allows propagation of unit
stop events between two units. It operates similarly to the existing
PropagatesReloadTo= + ReloadPropagatedFrom= dependencies.
* A new dependency type OnSuccess= has been added (plus the reverse
dependency OnSuccessOf=, which cannot be configured directly, but
exists only as effect of the reverse OnSuccess=). It is similar to
OnFailure=, but triggers in the opposite case: when a service exits
cleanly. This allows "chaining up" of services where one or more
services are started once another service has successfully completed.
* A new dependency type Upholds= has been added (plus the reverse
dependency UpheldBy=, which cannot be configured directly, but exists
only as effect of Upholds=). This dependency type is a stronger form
of Wants=: if a unit has an Upholds= dependency on some other unit
and the former is active then the latter is started whenever it is
found inactive (and no job is queued for it). This is an alternative
to Restart= inside service units, but less configurable, and the
request to uphold a unit is not encoded in the unit itself but in
another unit that intends to uphold it.
* The systemd-ask-password tool now also supports reading passwords
from the credentials subsystem, via the new --credential= switch.
* The systemd-ask-password tool learnt a new switch --emoji= which may
be used to explicitly control whether the lock and key emoji (🔐) is
shown in the password prompt on suitable TTYs.
* The --echo switch of systemd-ask-password now optionally takes a
parameter that controls character echo. It may either show asterisks
(default, as before), turn echo off entirely, or echo the typed
characters literally.
* The systemd-ask-password tool also gained a new -n switch for
suppressing output of a trailing newline character when writing the
acquired password to standard output, similar to /bin/echo's -n
switch.
* New documentation has been added that describes the organization of
the systemd source code tree:
https://systemd.io/ARCHITECTURE
* Units using ConditionNeedsUpdate= will no longer be activated in
the initrd.
* It is now possible to list a template unit in the WantedBy= or
RequiredBy= settings of the [Install] section of another template
unit, which will be instantiated using the same instance name.
* A new MemoryAvailable property is available for units. If the unit,
or the slice(s) it is part of, have a memory limit set via MemoryMax=/
MemoryHigh=, MemoryAvailable will indicate how much more memory the
unit can claim before hitting the limit(s).
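One plausible reading of this property, sketched in Python: for the unit
and each enclosing slice that has a limit, compute the headroom between
the lowest applicable limit and the current usage, then report the
smallest headroom. This is an assumption about the semantics, not a
description of the actual implementation; the class, the function and
all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MiB = 1024 * 1024

@dataclass
class CgroupLevel:
    name: str
    current: int                       # bytes currently in use
    memory_high: Optional[int] = None  # None means no limit configured
    memory_max: Optional[int] = None

def memory_available(levels) -> Optional[int]:
    """Smallest headroom over the unit and its slices, or None if no
    level has a limit configured."""
    headrooms = []
    for level in levels:
        limits = [x for x in (level.memory_high, level.memory_max)
                  if x is not None]
        if limits:
            headrooms.append(max(0, min(limits) - level.current))
    return min(headrooms) if headrooms else None

# A service with 724 MiB of headroom inside a slice with only 248 MiB.
levels = [
    CgroupLevel("foo.service", current=300 * MiB, memory_max=1024 * MiB),
    CgroupLevel("bar.slice", current=1800 * MiB, memory_high=2048 * MiB),
]
print(memory_available(levels) // MiB)
```

In this example the slice, not the service's own MemoryMax=, is the
binding constraint.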
* systemd-coredump will now try to stay below the cgroup memory limit
placed on itself or one of the slices it runs under, if the storage
area for core files (/var/lib/systemd/coredump/) is placed on a tmpfs,
since files written on such filesystems count toward the cgroup memory
limit. If there is not enough available memory in such cases to store
the core file uncompressed, systemd-coredump will skip to compressed
storage directly (if enabled) and it will avoid analyzing the core file
to print backtrace and metadata in the journal.
* tmpfiles.d/ drop-ins gained a new '=' modifier to check if the type
of a path matches the configured expectations, and remove it if not.
* tmpfiles.d/'s 'Age' now accepts an 'age-by' argument, which allows
specifying which of the several available filesystem timestamps (access
time, birth time, change time, modification time) to look at when
deciding whether a path has aged enough to be cleaned.
* A new IPv6StableSecretAddress= setting has been added to .network
files, which takes an IPv6 address to use as secret for IPv6 address
generation.
* The [DHCPServer] logic in .network files gained support for a new
UplinkInterface= setting that permits configuration of the uplink
interface name to propagate DHCP lease information from.
* The WakeOnLan= setting in .link files now accepts a list of flags
instead of a single one, to configure multiple wake-on-LAN policies.
* User-space defined tracepoints (USDT) have been added to udev at
strategic locations. This is useful for tracing udev behaviour and
performance with bpftrace and similar tools.
* systemd-journald-upload gained a new NetworkTimeoutSec= option for
setting a network timeout.
* If a system service is running in a new mount namespace (RootDirectory=
and friends), all file systems will be mounted with MS_NOSUID by
default, unless the system is running with SELinux enabled.
* When enumerating time zones the timedatectl tool will now consult the
'tzdata.zi' file shipped by the IANA time zone database package, in
addition to 'zone1970.tab', as before. This makes sure time zone
aliases are now correctly supported. Some distributions so far did
not install this additional file, most do however. If your
distribution does not install it yet, it might make sense to change
that.
Contributions from: Aakash Singh, adrian5, Albert Brox,
Alexander Sverdlin, Alexander Tsoy, Alexey Rubtsov, alexlzhu,
Allen Webb, Alvin Šipraga, Alyssa Ross, Anders Wenhaug,
Andrea Pappacoda, Anita Zhang, asavah, Balint Reczey, Bertrand Jacquin,
borna-blazevic, caoxia2008cxx, Carlo Teubner, Christian Göttsche,
Christian Hesse, Daniel Schaefer, Dan Streetman,
David Santamaría Rogado, David Tardon, Deepak Rawat, dgcampea,
Dimitri John Ledkov, ei-ke, Emilio Herrera, Emil Renner Berthing,
Eric Cook, Flos Lonicerae, Franck Bui, Francois Gervais,
Frantisek Sumsal, Gibeom Gwon, gitm0, Hamish Moffatt, Hans de Goede,
Harsh Barsaiyan, Henri Chain, Hristo Venev, Icenowy Zheng, Igor Zhbanov,
imayoda, Jakub Warczarek, James Buren, Jan Janssen, Jan Macku,
Jan Synacek, Jason Francis, Jayanth Ananthapadmanaban, Jeremy Szu,
Jérôme Carretero, Jesse Stricker, jiangchuangang, Joerg Behrmann,
Jóhann B. Guðmundsson, Jörg Deckert, Jörg Thalheim, Juergen Hoetzel,
Julia Kartseva, Kai-Heng Feng, Khem Raj, KoyamaSohei, laineantti,
Lennart Poettering, LetzteInstanz, Luca Adrian L, Luca Boccassi,
Lucas Magasweran, Mantas Mikulėnas, Marco Antonio Mauro, Mark Wielaard,
Masahiro Matsuya, Matt Johnston, Michael Catanzaro, Michal Koutný,
Michal Sekletár, Mike Crowe, Mike Kazantsev, Milan, milaq,
Miroslav Suchý, Morten Linderud, nerdopolis, nl6720, Noah Meyerhans,
Oleg Popov, Olle Lundberg, Ondrej Kozina, Paweł Marciniak, Perry.Yuan,
Peter Hutterer, Peter Kjellerstedt, Peter Morrow, Phaedrus Leeds,
plattrap, qhill, Raul Tambre, Roman Beranek, Roshan Shariff,
Ryan Hendrickson, Samuel BF, scootergrisen, Sebastian Blunt,
Seong-ho Cho, Sergey Bugaev, Sevan Janiyan, Sibo Dong, simmon,
Simon Watts, Srinidhi Kaushik, Štěpán Němec, Steve Bonds, Susant Sahani,
sverdlin, syyhao1994, Takashi Sakamoto, Topi Miettinen, tramsay,
Trent Piepho, Uwe Kleine-König, Viktor Mihajlovski, Vincent Dechenaux,
Vito Caputo, William A. Kennington III, Yangyang Shen, Yegor Alexeyev,
Yi Gao, Yu Watanabe, Zbigniew Jędrzejewski-Szmek, zsien, наб
— Edinburgh, 2021-07-07
CHANGES WITH 248:
* A concept of system extension images is introduced. Such images may
be used to extend the /usr/ and /opt/ directory hierarchies at
runtime with additional files (even if the file system is read-only).
When a system extension image is activated, its /usr/ and /opt/
hierarchies and os-release information are combined via overlayfs
with the file system hierarchy of the host OS.
A new systemd-sysext tool can be used to merge, unmerge, list, and
refresh system extension hierarchies. See
https://www.freedesktop.org/software/systemd/man/systemd-sysext.html.
The systemd-sysext.service automatically merges installed system
extensions during boot (before basic.target, but not in very early
boot, since various file systems have to be mounted first).
The SYSEXT_LEVEL= field in os-release(5) may be used to specify the
supported system extension level.
* A new ExtensionImages= unit setting can be used to apply the same
system extension image concept from systemd-sysext to the namespaced
file hierarchy of specific services, following the same rules and
constraints.
* Support for a new special "root=tmpfs" kernel command-line option has
been added. When specified, a tmpfs is mounted on /, and mount.usr=
should be used to point to the operating system implementation.
* A new configuration file /etc/veritytab may be used to configure
dm-verity integrity protection for block devices. Each line is in the
format "volume-name data-device hash-device roothash options",
similar to /etc/crypttab.
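The line format can be illustrated with a small Python parser sketch.
Only the field order comes from the description above; the handling of
comments and blank lines, the comma-separated options list, and treating
the options field as optional are assumptions borrowed from the
/etc/crypttab conventions. Device paths, the root hash and the option
names in the example are placeholders.

```python
from typing import List, NamedTuple

class VerityEntry(NamedTuple):
    volume_name: str
    data_device: str
    hash_device: str
    roothash: str
    options: List[str]

def parse_veritytab(text: str) -> List[VerityEntry]:
    """Parse veritytab-style text into a list of entries."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumed: blank lines and comments are skipped
        fields = line.split()
        if len(fields) < 4:
            raise ValueError(f"too few fields: {raw!r}")
        options = fields[4].split(",") if len(fields) > 4 else []
        entries.append(VerityEntry(*fields[:4], options))
    return entries

# Placeholder values throughout; the option names are not real options.
sample = "# example\nusr /dev/sda2 /dev/sda3 " + "0123abcd" * 8 + " option-one,option-two\n"
for entry in parse_veritytab(sample):
    print(entry.volume_name, entry.data_device, entry.options)
```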
* A new kernel command-line option systemd.verity.root_options= may be
used to configure dm-verity behaviour for the root device.
* The key file specified in /etc/crypttab (the third field) may now
refer to an AF_UNIX/SOCK_STREAM socket in the file system. The key is
acquired by connecting to that socket and reading from it. This
allows the implementation of a service to provide key information
dynamically, at the moment when it is needed.
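A service of this kind could look roughly like the following Python
sketch. It assumes the simplest possible exchange (the client connects,
the service writes the key and closes the connection); how
systemd-cryptsetup frames the exchange in detail is not covered here.
All function names, the socket path and the key are hypothetical, and
read_key() merely stands in for the consuming side so that the example
is self-contained.

```python
import os
import socket
import tempfile
import threading

def serve_key_once(sock_path: str, key: bytes, ready: threading.Event) -> None:
    """Listen on an AF_UNIX/SOCK_STREAM socket and hand the key to one client."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)
        server.listen(1)
        ready.set()
        conn, _ = server.accept()
        with conn:
            conn.sendall(key)  # closing the connection ends the key data

def read_key(sock_path: str) -> bytes:
    """Stand-in client: connect and read until the peer closes."""
    chunks = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)

def demo() -> bytes:
    """Run server and client against a socket in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "key.sock")
        ready = threading.Event()
        server = threading.Thread(
            target=serve_key_once, args=(path, b"hypothetical-key", ready))
        server.start()
        if not ready.wait(timeout=5):
            raise RuntimeError("server did not start")
        key = read_key(path)
        server.join()
        return key

if __name__ == "__main__":
    print(demo())
```

Because the key is produced only when a client connects, the service
can fetch or derive it on demand instead of keeping it in a file.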
* When the hostname is set explicitly to "localhost", systemd-hostnamed
will respect this. Previously such a setting would be mostly silently
ignored. The goal is to honour configuration as specified by the
user.
* The fallback hostname that will be used by the system manager and
systemd-hostnamed can now be configured in two new ways: by setting
DEFAULT_HOSTNAME= in os-release(5), or by setting
$SYSTEMD_DEFAULT_HOSTNAME in the environment block. As before, it can
also be configured during compilation. The environment variable is
intended for testing and local overrides, the os-release(5) field is
intended to allow customization by different variants of a
distribution that share the same compiled packages.
* The environment block of the manager itself may be configured through
a new ManagerEnvironment= setting in system.conf or user.conf. This
complements existing ways to set the environment block (the kernel
command line for the system manager, the inherited environment and
user@.service unit file settings for the user manager).
* systemd-hostnamed now exports the default hostname and the source of
the configured hostname ("static", "transient", or "default") as
D-Bus properties.
* systemd-hostnamed now exports the "HardwareVendor" and
"HardwareModel" D-Bus properties, which are supposed to contain a
pair of cleaned up, human readable strings describing the system's
vendor and model. It's typically sourced from the firmware's DMI
tables, but may be augmented from a new hwdb database. hostnamectl
shows this in the status output.
* Support has been added to systemd-cryptsetup for extracting the
PKCS#11 token URI and encrypted key from the LUKS2 JSON embedded
metadata header. This allows the information on how to open the
encrypted device to be embedded directly in the device and obviates
the need for configuration in an external file.
* systemd-cryptsetup gained support for unlocking LUKS2 volumes using
TPM2 hardware, as well as FIDO2 security tokens (in addition to the
pre-existing support for PKCS#11 security tokens).
* systemd-repart may enroll encrypted partitions using TPM2
hardware. This may be useful for example to create an encrypted /var
partition bound to the machine on first boot.
* A new systemd-cryptenroll tool has been added to enroll TPM2, FIDO2
and PKCS#11 security tokens to LUKS volumes, list and destroy
them. See:
http://0pointer.net/blog/unlocking-luks2-volumes-with-tpm2-fido2-pkcs11-security-hardware-on-systemd-248.html
It also supports enrolling "recovery keys" and regular passphrases.
* The libfido2 dependency is now based on dlopen(), so that the library
is used at runtime when installed, but is not a hard runtime
dependency.
* systemd-cryptsetup gained support for two new options in
/etc/crypttab: "no-write-workqueue" and "no-read-workqueue" which
request synchronous processing of encryption/decryption IO.
* The manager may be configured at compile time to use the fexecve()
system call instead of execve() when spawning processes. Using
fexecve() closes a window between checking the security context of an
executable and spawning it, but unfortunately the kernel displays
stale information in the process' "comm" field, which impacts ps
output and such.
* The configuration option -Dcompat-gateway-hostname has been dropped.
"_gateway" is now the only supported name.
* The ConditionSecurity=tpm2 unit file setting may be used to check if
the system has at least one TPM2 (tpmrm class) device.
* A new ConditionCPUFeature= has been added that may be used to
conditionalize units based on CPU features. For example,
ConditionCPUFeature=rdrand will condition a unit so that it is only
run when the system CPU supports the RDRAND opcode.
* The existing ConditionControlGroupController= setting has been
extended with two new values "v1" and "v2". "v2" means that the
unified v2 cgroup hierarchy is used, and "v1" means that legacy v1
hierarchy or the hybrid hierarchy are used.
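A rough way to tell the two cases apart from a script is sketched below in Python. It relies on a common heuristic (the root of a unified hierarchy contains a "cgroup.controllers" file), which is an assumption and not necessarily how the condition is evaluated internally. The function name is hypothetical.

```python
import os


def cgroup_hierarchy_version(cgroup_root="/sys/fs/cgroup"):
    """Return "v2" if cgroup_root looks like a unified hierarchy, else "v1".

    "v1" covers both the legacy and the hybrid setup, matching the meaning
    of the two values described above.
    """
    # Heuristic: only a cgroup2 mount has this file in its root directory.
    if os.path.exists(os.path.join(cgroup_root, "cgroup.controllers")):
        return "v2"
    return "v1"
```

The root path is a parameter so the check can be pointed at a test directory instead of the live /sys/fs/cgroup.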
* A new PrivateIPC= setting on a unit file allows executed processes to
be moved into a private IPC namespace, with separate System V IPC
identifiers and POSIX message queues.
A new IPCNamespacePath= allows the unit to be joined to an existing
IPC namespace.
* The tables of system calls in seccomp filters are now automatically
generated from kernel lists exported on
https://fedora.juszkiewicz.com.pl/syscalls.html.
The following architectures should now have complete lists:
alpha, arc, arm64, arm, i386, ia64, m68k, mips64n32, mips64, mipso32,
powerpc, powerpc64, s390, s390x, tilegx, sparc, x86_64, x32.
* The MountAPIVFS= service file setting now additionally mounts a tmpfs
on /run/ if it is not already a mount point. A writable /run/ has
always been a requirement for a functioning system, but this was not
guaranteed when using a read-only image.
Users can always specify BindPaths= or InaccessiblePaths= as
overrides, and they will take precedence. If the host's root mount
point is used, there is no change in behaviour.
* New bind mounts and file system image mounts may be injected into the
mount namespace of a service (without restarting it). This is exposed
respectively as 'systemctl bind <unit> <path>…' and
'systemctl mount-image <unit> <image>…'.
* The StandardOutput= and StandardError= settings can now specify files
to be truncated for output (as "truncate:<path>").
* The ExecPaths= and NoExecPaths= settings may be used to specify
noexec for parts of the file system.
* sd-bus has a new function sd_bus_open_user_machine() to open a
connection to the session bus of a specific user in a local container
or on the local host. This is exposed in the existing -M switch to
systemctl and similar tools:
systemctl --user -M lennart@foobar start foo
This will connect to the user bus of a user "lennart" in container
"foobar". If no container name is specified, the specified user on
the host itself is connected to:
systemctl --user -M lennart@ start quux
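The "user@container" syntax shown in the two commands above can be modelled with a small Python sketch. The function name is hypothetical, and treating a string without "@" as a plain container name is an assumption. An empty container part is reported as None, meaning the local host.

```python
def parse_machine_spec(spec):
    """Split a "user@container" string into a (user, container) tuple.

    A missing part is returned as None; a None container stands for the
    local host.
    """
    user, sep, container = spec.partition("@")
    if not sep:
        # No "@" at all: treat the whole string as a container name
        # (assumption for this sketch).
        return None, spec
    return (user or None), (container or None)
```

For example, parse_machine_spec("lennart@foobar") yields ("lennart", "foobar"), while parse_machine_spec("lennart@") yields ("lennart", None).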
* sd-bus also gained a convenience function sd_bus_message_send() to
simplify invocations of sd_bus_send(), taking only a single
parameter: the message to send.
* sd-event allows rate limits to be set on event sources, for dealing
with high-priority event sources that might starve out others. See
the new man page sd_event_source_set_ratelimit(3) for details.
* systemd.link files gained a [Link] Promiscuous= switch, which allows
the device to be raised in promiscuous mode.
New [Link] TransmitQueues= and ReceiveQueues= settings allow the
number of TX and RX queues to be configured.
New [Link] TransmitQueueLength= setting allows the size of the TX
queue to be configured.
New [Link] GenericSegmentOffloadMaxBytes= and
GenericSegmentOffloadMaxSegments= allow capping the packet size and
the number of segments accepted in Generic Segment Offload.
* systemd-networkd gained support for the "B.A.T.M.A.N. advanced"
wireless routing protocol that operates on ISO/OSI Layer 2 only and
uses ethernet frames to route/bridge packets. This encompasses a new
"batadv" netdev Type=, a new [BatmanAdvanced] section with a bunch of
new settings in .netdev files, and a new BatmanAdvanced= setting in
.network files.
* systemd.network files gained a [Network] RouteTable= configuration
switch to select the routing policy table.
systemd.network files gained a [RoutingPolicyRule] Type=
configuration switch (one of "blackhole", "unreachable", "prohibit").
systemd.network files gained [IPv6AcceptRA] RouteDenyList= and
RouteAllowList= settings to ignore/accept route advertisements from
routers matching specified prefixes. The DenyList= setting has been
renamed to PrefixDenyList= and a new PrefixAllowList= option has been
added.
systemd.network files gained a [DHCPv6] UseAddress= setting to
optionally ignore the address provided in the lease.
systemd.network files gained a [DHCPv6PrefixDelegation]
ManageTemporaryAddress= switch.
systemd.network files gained a new ActivationPolicy= setting which
allows configuring how the UP state of an interface shall be managed,
i.e. whether the interface is always upped, always downed, or may be
upped/downed by the user using "ip link set dev".
* The default for the Broadcast= setting in .network files has slightly
changed: the broadcast address will not be configured for wireguard
devices.
* systemd.netdev files gained a [VLAN] Protocol=, IngressQOSMaps=,
EgressQOSMaps=, and [MACVLAN] BroadcastMulticastQueueLength=
configuration options for VLAN packet handling.
* udev rules may now set log_level= option. This allows debug logs to
be enabled for select events, e.g. just for a specific subsystem or
even a single device.
* udev now exports the VOLUME_ID, LOGICAL_VOLUME_ID, VOLUME_SET_ID, and
DATA_PREPARED_ID properties for block devices with ISO9660 file
systems.
* udev now exports decoded DMI information about installed memory slots
as device properties under the /sys/class/dmi/id/ pseudo device.
* /dev/ is not mounted noexec anymore. This didn't provide any
significant security benefits and would conflict with the executable
mappings used with /dev/sgx device nodes. The previous behaviour can
be restored for individual services with NoExecPaths=/dev (or by
allow-listing and excluding /dev from ExecPaths=).
* Permissions for /dev/vsock are now set to 0o666, and /dev/vhost-vsock
and /dev/vhost-net are owned by the kvm group.
* The hardware database has been extended with a list of fingerprint
readers that correctly support USB auto-suspend using data from
libfprint.
* systemd-resolved can now answer DNSSEC questions through the stub
resolver interface in a way that allows local clients to do DNSSEC
validation themselves. For a question with DO+CD set, it'll proxy the
DNS query and respond with a mostly unmodified packet received from
the upstream server.
* systemd-resolved learnt a new boolean option CacheFromLocalhost= in
resolved.conf. If true the service will provide caching even for DNS
lookups made to an upstream DNS server on the 127.0.0.1/::1
addresses. By default (and when the option is false) systemd-resolved
will not cache such lookups, in order to avoid duplicate local
caching, under the assumption the local upstream server caches
anyway.
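The caching decision described above can be summarised in a few lines of Python. This is a sketch, not the resolver's code: the function name is hypothetical, and using the ipaddress module's is_loopback test (which covers all of 127.0.0.0/8 as well as ::1) is an assumption that is slightly broader than the two addresses named above.

```python
import ipaddress


def should_cache(upstream_server, cache_from_localhost=False):
    """Decide whether answers from upstream_server should be cached."""
    addr = ipaddress.ip_address(upstream_server)
    # Lookups sent to a local upstream server are only cached when the
    # option is explicitly enabled, to avoid caching the same data twice.
    if addr.is_loopback and not cache_from_localhost:
        return False
    return True
```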
* systemd-resolved now implements RFC5001 NSID in its local DNS
stub. This may be used by local clients to determine whether they are
talking to the DNS resolver stub or a different DNS server.
* When resolving host names and other records resolvectl will now
report where the data was acquired from (i.e. the local cache, the
network, locally synthesized, …) and whether the network traffic it
effected was encrypted or not. Moreover the tool acquired a number of
new options --cache=, --synthesize=, --network=, --zone=,
--trust-anchor=, --validate= that take booleans and may be used to
tweak a lookup, i.e. whether it may be answered from cached
information, locally synthesized information, information acquired
through the network, the local mDNS/LLMNR zone, the DNSSEC trust
anchor, and whether DNSSEC validation shall be executed for the
lookup.
* systemd-nspawn gained a new --ambient-capability= setting
(AmbientCapability= in .nspawn files) to configure ambient
capabilities passed to the container payload.
* systemd-nspawn gained the ability to configure the firewall using the
nftables subsystem (in addition to the existing iptables
support). Similarly, systemd-networkd's IPMasquerade= option now
supports nftables as back-end, too. In both cases NAT on IPv6 is now
supported too, in addition to IPv4 (the iptables back-end still is
IPv4-only).
"IPMasquerade=yes", which was the same as "IPMasquerade=ipv4" before,
retains its meaning, but has been deprecated. Please switch to either
"ipv4" or "both" (if covering IPv6 is desired).
* systemd-importd will now download .verity and .roothash.p7s files
along with the machine image (as exposed via machinectl pull-raw).
* systemd-oomd now gained a new DefaultMemoryPressureDurationSec=
setting to configure the time a unit's cgroup needs to exceed memory
pressure limits before action will be taken, and a new
ManagedOOMPreference=none|avoid|omit setting to avoid killing certain
units.
systemd-oomd is now considered fully supported (the usual
backwards-compatibility promises apply). Swap is not required for
operation, but it is still recommended.
* systemd-timesyncd gained a new ConnectionRetrySec= setting which
configures the retry delay when trying to contact servers.
* systemd-stdio-bridge gained --system/--user options to connect to the
system bus (previous default) or the user session bus.
* systemd-localed may now call locale-gen to generate missing locales
on-demand (UTF-8-only). This improves integration with Debian-based
distributions (Debian/Ubuntu/PureOS/Tanglu/...) and Arch Linux.
* systemctl --check-inhibitors=true may now be used to obey inhibitors
even when invoked non-interactively. The old --ignore-inhibitors
switch is now deprecated and replaced by --check-inhibitors=false.
* systemctl import-environment will now emit a warning when called
without any arguments (i.e. to import the full environment block of
the called program). This command will usually be invoked from a
shell, which means that it'll inherit a bunch of variables which are
specific to that shell, and usually to the TTY the shell is connected
to, and don't have any meaning in the global context of the system or
user service manager. Instead, only specific variables should be
imported into the manager environment block.
Similarly, programs which update the manager environment block by
directly calling the D-Bus API of the manager, should also push
specific variables, and not the full inherited environment.
* systemctl's status output now shows unit state with a more careful
choice of Unicode characters: units in maintenance show a "○" symbol
instead of the usual "●", failed units show "×", and services being
reloaded "↻".
* coredumpctl gained a --debugger-arguments= switch to pass arguments
to the debugger. It also gained support for showing coredump info in
a simple JSON format.
* systemctl/loginctl/machinectl's --signal= option now accepts a special
value "list", which may be used to show a brief table with known
process signals and their numbers.
* networkctl now shows the link activation policy in status.
* Various tools gained --pager/--no-pager/--json= switches to
enable/disable the pager and provide JSON output.
* Various tools now accept two new values for the SYSTEMD_COLORS
environment variable: "16" and "256", to configure how many terminal
colors are used in output.
* less 568 or newer is now required for the auto-paging logic of the
various tools. Hyperlink ANSI sequences in terminal output are now
used even if a pager is used, and older versions of less are not able
to display these sequences correctly. SYSTEMD_URLIFY=0 may be used to
disable this output again.
* Builds with support for separate / and /usr/ hierarchies ("split-usr"
builds, non-merged-usr builds) are now officially deprecated. A
warning is emitted during build. Support is slated to be removed in
about a year (when the Debian Bookworm release development starts).
* Systems with the legacy cgroup v1 hierarchy are now marked as
"tainted", to make it clearer that using the legacy hierarchy is not
recommended.
* systemd-localed will now refuse to configure a keymap which is not
installed in the file system. This is intended as a bug fix, but
could break cases where systemd-localed was used to configure the
keymap in advance of it being installed. It is necessary to install
the keymap file first.
* The main git development branch has been renamed to 'main'.
* mmcblk[0-9]boot[0-9] devices will no longer be probed automatically
for partitions, as in the vast majority of cases they contain none
and are used internally by the bootloader (e.g. U-Boot).
* systemd will now set the $SYSTEMD_EXEC_PID environment variable for
spawned processes to the PID of the process itself. This may be used
by programs for detecting whether they were forked off by the service
manager itself or are a process forked off further down the tree.
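A program could perform this check roughly as follows. The Python sketch below is illustrative only: the function name is hypothetical, and the environment and PID are optional parameters so the logic can be exercised without actually running under the service manager.

```python
import os


def spawned_by_service_manager(environ=None, pid=None):
    """Return True if $SYSTEMD_EXEC_PID names this very process."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    value = environ.get("SYSTEMD_EXEC_PID")
    if value is None:
        # Variable not set: not started by a manager that sets it.
        return False
    try:
        # A child further down the tree inherits the variable, but its
        # own PID differs from the recorded one.
        return int(value) == pid
    except ValueError:
        return False
```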
* The sd-device API gained four new calls: sd_device_get_action() to
determine the uevent add/remove/change/… action the device object has
been seen for, sd_device_get_seqno() to determine the uevent sequence
number, sd_device_new_from_stat_rdev() to allocate a new sd_device
object from stat(2) data of a device node, and sd_device_trigger() to
write to the 'uevent' attribute of a device.
* For most tools the --no-legend= switch has been replaced by
--legend=no and --legend=yes, to force whether tables are shown with
headers/legends.
* Units acquired a new property "Markers" that takes a list of zero,
one or two of the following strings: "needs-reload" and
"needs-restart". These markers may be set via "systemctl
set-property". Once a marker is set, "systemctl reload-or-restart
--marked" may be invoked to execute the operation the units are
marked for. This is useful for package managers that want to mark
units for restart/reload while updating, but carry out the actual
operations together in a later step.
* The sd_bus_message_read_strv() API call of sd-bus may now also be
used to parse arrays of D-Bus signatures and D-Bus paths, in addition
to regular strings.
* bootctl will now report whether the UEFI firmware used a TPM2 device
and measured the boot process into it.
* systemd-tmpfiles learnt support for a new environment variable
$SYSTEMD_TMPFILES_FORCE_SUBVOL which takes a boolean value. If true
the v/q/Q lines in tmpfiles.d/ snippets will create btrfs subvolumes
even if the root fs of the system is not itself a btrfs volume.
* systemd-detect-virt/ConditionVirtualization= will now explicitly
detect Docker/Podman environments where possible. Moreover, they
should be able to generically detect any container manager as long as
it assigns the container a cgroup.
* portablectl gained a new "reattach" verb for detaching/reattaching a
portable service image, useful for updating images on-the-fly.
* Intel SGX enclave device nodes (which expose a security feature of
newer Intel CPUs) will now be owned by a new system group "sgx".
Contributions from: Adam Nielsen, Adrian Vovk, AJ Jordan, Alan Perry,
Alastair Pharo, Alexander Batischev, Ali Abdallah, Andrew Balmos,
Anita Zhang, Annika Wickert, Ansgar Burchardt, Antonio Terceiro,
Antonius Frie, Ardy, Arian van Putten, Ariel Fermani, Arnaud T,
A S Alam, Bastien Nocera, Benjamin Berg, Benjamin Robin, Björn Daase,
caoxia, Carlo Wood, Charles Lee, ChopperRob, chri2, Christian Ehrhardt,
Christian Hesse, Christopher Obbard, clayton craft, corvusnix, cprn,
Daan De Meyer, Daniele Medri, Daniel Rusek, Dan Sanders, Dan Streetman,
Darren Ng, David Edmundson, David Tardon, Deepak Rawat, Devon Pringle,
Dmitry Borodaenko, dropsignal, Einsler Lee, Endre Szabo,
Evgeny Vereshchagin, Fabian Affolter, Fangrui Song, Felipe Borges,
feliperodriguesfr, Felix Stupp, Florian Hülsmann, Florian Klink,
Florian Westphal, Franck Bui, Frantisek Sumsal, Gablegritule,
Gaël PORTAY, Gaurav, Giedrius Statkevičius, Greg Depoire-Ferrer,
Gustavo Costa, Hans de Goede, Hela Basa, heretoenhance, hide,
Iago López Galeiras, igo95862, Ilya Dmitrichenko, Jameer Pathan,
Jan Tojnar, Jiehong, Jinyuan Si, Joerg Behrmann, John Slade,
Jonathan G. Underwood, Jonathan McDowell, Josh Triplett, Joshua Watt,
Julia Cartwright, Julien Humbert, Kairui Song, Karel Zak,
Kevin Backhouse, Kevin P. Fleming, Khem Raj, Konomi, krissgjeng,
l4gfcm, Lajos Veres, Lennart Poettering, Lincoln Ramsay, Luca Boccassi,
Luca BRUNO, Lucas Werkmeister, Luka Kudra, Luna Jernberg,
Marc-André Lureau, Martin Wilck, Matthias Klumpp, Matt Turner,
Michael Gisbers, Michael Marley, Michael Trapp, Michal Fabik,
Michał Kopeć, Michal Koutný, Michal Sekletár, Michele Guerini Rocco,
Mike Gilbert, milovlad, moson-mo, Nick, nihilix-melix, Oğuz Ersen,
Ondrej Mosnacek, pali, Pavel Hrdina, Pavel Sapezhko, Perry Yuan,
Peter Hutterer, Pierre Dubouilh, Piotr Drąg, Pjotr Vertaalt,
Richard Laager, RussianNeuroMancer, Sam Lunt, Sebastiaan van Stijn,
Sergey Bugaev, shenyangyang4, simmon, Simonas Kazlauskas,
Slimane Selyan Amiri, Stefan Agner, Steve Ramage, Susant Sahani,
Sven Mueller, Tad Fisher, Takashi Iwai, Thomas Haller, Tom Shield,
Topi Miettinen, Torsten Hilbrich, tpgxyz, Tyler Hicks, ulf-f,
Ulrich Ölmann, Vincent Pelletier, Vinnie Magro, Vito Caputo, Vlad,
walbit-de, Whired Planck, wouter bolsterlee, Xℹ Ruoyao, Yangyang Shen,
Yuri Chornoivan, Yu Watanabe, Zach Smith, Zbigniew Jędrzejewski-Szmek,
Zmicer Turok, Дамјан Георгиевски
— Berlin, 2021-03-30
CHANGES WITH 247:
* KERNEL API INCOMPATIBILITY: Linux 4.14 introduced two new uevents
"bind" and "unbind" to the Linux device model. When this kernel
change was made, systemd-udevd was only minimally updated to handle
and propagate these new event types. The introduction of these new
uevents (which are typically generated for USB devices and devices
needing a firmware upload before being functional) resulted in a
number of issues which we so far didn't address. We hoped the kernel
maintainers would themselves address these issues in some form, but
that did not happen. To handle them properly, many (if not most) udev
rules files shipped in various packages need updating, and so do many
programs that monitor or enumerate devices with libudev or sd-device,
or otherwise process uevents. Please note that this incompatibility
is not the fault of systemd or udev, but caused by an incompatible kernel
change that happened back in Linux 4.14, but is becoming more and
more visible as the new uevents are generated by more kernel drivers.
To minimize issues resulting from this kernel change (but not avoid
them entirely) starting with systemd-udevd 247 the udev "tags"
concept (which is a concept for marking and filtering devices during
enumeration and monitoring) has been reworked: udev tags are now
"sticky", meaning that once a tag is assigned to a device it will not
be removed from the device again until the device itself is removed
(i.e. unplugged). This makes sure that any application monitoring
devices that match a specific tag is guaranteed to both see uevents
where the device starts being relevant, and those where it stops
being relevant (the latter now regularly happening due to the new
"unbind" uevent type). The udev tags concept is hence now a concept
tied to a *device* instead of a device *event* — unlike for example
udev properties whose lifecycle (as before) is generally tied to a
device event, meaning that the previously determined properties are
forgotten whenever a new uevent is processed.
With the newly redefined udev tags concept, sometimes it's necessary
to determine which tags are the ones applied by the most recent
uevent/database update, in order to discern them from those
originating from earlier uevents/database updates of the same
device. To accommodate this, a new automatic property CURRENT_TAGS
has been added that works similarly to the existing TAGS property but
only lists tags set by the most recent uevent/database
update. Similarly, the libudev/sd-device API has been updated with
new functions to enumerate these 'current' tags, in addition to the
existing APIs that now enumerate the 'sticky' ones.
To properly handle "bind"/"unbind" on Linux 4.14 and newer it is
essential that all udev rules files and applications are updated to
handle the new events. Specifically:
• All rule files that currently use a header guard similar to
ACTION!="add|change",GOTO="xyz_end" should be updated to use
ACTION=="remove",GOTO="xyz_end" instead, so that the
properties/tags they add are also applied whenever "bind" (or
"unbind") is seen. (This is most important for all physical device
types — those for which "bind" and "unbind" are currently
generated, for all other device types this change is still
recommended but not as important — but certainly prepares for
future kernel uevent type additions).
• Similarly, all code monitoring devices that contains an 'if' branch
discerning the "add" + "change" uevent actions from all other
uevents actions (i.e. considering devices only relevant after "add"
or "change", and irrelevant on all other events) should be reworked
to instead negatively check for "remove" only (i.e. considering
devices relevant after all event types, except for "remove", which
invalidates the device). Note that this also means that devices
should be considered relevant on "unbind", even though conceptually
this — in some form — invalidates the device. Since the precise
effect of "unbind" is not generically defined, devices should be
considered relevant even after "unbind", however I/O errors
accessing the device should then be handled gracefully.
• Any code that uses device tags for deciding whether a device is
relevant or not most likely needs to be updated to use the new
udev_device_has_current_tag() API (or sd_device_has_current_tag()
in case sd-device is used), to check whether the tag is set at the
moment an uevent is seen (as opposed to the existing
udev_device_has_tag() API which checks if the tag ever existed on
the device, following the API concept redefinition explained
above).
We are very sorry for this breakage and the requirement to update
packages using these interfaces. We'd again like to underline that
this is not caused by systemd/udev changes, but is the result of a kernel
behaviour change.
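The logic recommended in the second and third bullet points above can be sketched in Python as follows. Plain strings and sets stand in for the libudev/sd-device objects here, and the function name is hypothetical. Real code would call udev_device_has_current_tag() or sd_device_has_current_tag() instead of testing set membership.

```python
def device_is_relevant(action, current_tags, wanted_tag):
    """Decide whether a device should be tracked after a uevent.

    action:       the uevent action string ("add", "bind", "unbind", ...)
    current_tags: tags set by the most recent uevent/database update
    wanted_tag:   the tag the application filters on
    """
    # Check negatively for "remove" only: every other action, including
    # "bind" and "unbind", leaves the device relevant.
    if action == "remove":
        return False
    # Use the *current* tags, since sticky tags remain on the device even
    # after the rules stop assigning them.
    return wanted_tag in current_tags
```

Since a device stays relevant after "unbind", callers should still be prepared to handle I/O errors when accessing it, as noted above.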
* UPCOMING INCOMPATIBILITY: So far most downstream distribution
packages have not retriggered devices once the udev package (or any
auxiliary package installing additional udev rules) is updated. We
intend to work with major distributions to change this, so that
"udevadm trigger -a change" is issued on such upgrades, ensuring that
the updated ruleset is applied to the devices already discovered, so
that (asynchronously) after the upgrade completed the udev database
is consistent with the updated rule set. This means udev rules must
be ready to be retriggered with a "change" action any time, and
result in correct and complete udev database entries. While the
majority of udev rule files known to us currently get this right,
some don't. Specifically, there are udev rules files included in
various packages that only set udev properties on the "add" action,
but do not handle the "change" action. If a device matching those
rules is retriggered with the "change" action (as is intended here)
it would suddenly lose the relevant properties. This always has been
problematic, but as soon as all udev devices are triggered on relevant
package upgrades this will become particularly so. It is strongly
recommended to fix offending rules so that they can handle a "change"
action at any time, and acquire all necessary udev properties even
then. Or in other words: the header guard mentioned above
(ACTION=="remove",GOTO="xyz_end") is the correct approach to handle
this, as it makes sure rules are rerun on "change" correctly, and
accumulate the correct and complete set of udev properties. udev rule
definitions that cannot handle "change" events being triggered at
arbitrary times should be considered buggy.
* The MountAPIVFS= service file setting now defaults to on if
RootImage= and RootDirectory= are used, which means that with those
two settings /proc/, /sys/ and /dev/ are automatically properly set
up for services. Previous behaviour may be restored by explicitly
setting MountAPIVFS=off.
* Since PAM 1.2.0 (2015) configuration snippets may be placed in
/usr/lib/pam.d/ in addition to /etc/pam.d/. If a file exists in the
latter it takes precedence over the former, similar to how most of
systemd's own configuration is handled. Given that PAM stack
definitions are primarily put together by OS vendors/distributions
(though possibly overridden by users), this systemd release moves its
own PAM stack configuration for the "systemd-user" PAM service (i.e.
for the PAM session invoked by the per-user user@.service instance)
from /etc/pam.d/ to /usr/lib/pam.d/. We recommend moving all
packages' vendor versions of their PAM stack definitions from
/etc/pam.d/ to /usr/lib/pam.d/, but if such OS-wide migration is not
desired the location to which systemd installs its PAM stack
configuration may be changed via the -Dpamconfdir Meson option.
* The runtime dependencies on libqrencode, libpcre2, libidn/libidn2,
libpwquality and libcryptsetup have been changed to be based on
dlopen(): instead of regular dynamic library dependencies declared in
the binary ELF headers, these libraries are now loaded on demand
only, if they are available. If the libraries cannot be found the
relevant operations will fail gracefully, or a suitable fallback
logic is chosen. This is supposed to be useful for general purpose
distributions, as it allows minimizing the list of dependencies the
systemd packages pull in, permitting building of more minimal OS
images, while still making use of these "weak" dependencies should
they be installed. Since many package managers automatically
synthesize package dependencies from ELF shared library dependencies,
some additional manual packaging work has to be done now to replace
those (slightly downgraded from "required" to "recommended" or
whatever is conceptually suitable for the package manager). Note that
this change does not alter build-time behaviour: as before the
build-time dependencies have to be installed during build, even if
they now are optional during runtime.
* sd-event.h gained a new call sd_event_add_time_relative() for
installing timers relative to the current time. This is mostly a
convenience wrapper around the pre-existing sd_event_add_time() call
which installs absolute timers.
* sd-event event sources may now be placed in a new "exit-on-failure"
mode, which may be controlled via the new
sd_event_source_get_exit_on_failure() and
sd_event_source_set_exit_on_failure() functions. If enabled, any
failure returned by the event source handler functions will result in
exiting the event loop (unlike the default behaviour of just
disabling the event source but continuing with the event loop). This
feature is useful to set for all event sources that define "primary"
program behaviour (where failure should be fatal) in contrast to
"auxiliary" behaviour (where failure should remain local).
* Most event source types sd-event supports now accept a NULL handler
function, in which case the event loop is exited once the event
source is to be dispatched, using the userdata pointer — converted to
a signed integer — as exit code of the event loop. Previously this
was supported for IO and signal event sources already. Exit event
sources still do not support this (simply because it makes little
sense there, as the event loop is already exiting when they are
dispatched).
* A new per-unit setting RootImageOptions= has been added which allows
tweaking the mount options for any file system mounted as an effect of
the RootImage= setting.
* Another new per-unit setting MountImages= has been added, that allows
mounting additional disk images into the file system tree accessible
to the service.
* Timer units gained a new FixedRandomDelay= boolean setting. If
enabled, the random delay configured with RandomizedDelaySec= is
selected in a way that is stable on a given system (though still
different for different units).
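The idea of a delay that is stable per system yet differs between units can be illustrated with the Python sketch below. It is not the algorithm the timer logic uses: hashing a machine identifier together with the unit name using SHA-256 is an assumption made for illustration, and the function name and parameters are hypothetical.

```python
import hashlib


def fixed_random_delay_usec(machine_id, unit_name, max_delay_usec):
    """Derive a repeatable delay in the range [0, max_delay_usec)."""
    if max_delay_usec <= 0:
        return 0
    # Same machine and unit always hash to the same value, so the delay
    # is stable across reboots; other units get other hash inputs.
    seed = (machine_id + "\n" + unit_name).encode("utf-8")
    digest = hashlib.sha256(seed).digest()
    return int.from_bytes(digest[:8], "little") % max_delay_usec
```

Here max_delay_usec plays the role of the value configured with RandomizedDelaySec=, expressed in microseconds.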
* Socket units gained a new setting Timestamping= that takes "us", "ns"
or "off". This controls the SO_TIMESTAMP/SO_TIMESTAMPNS socket
options.
* systemd-repart now generates JSON output when requested with the new
--json= switch.
* systemd-machined's OpenMachineShell() bus call will now pass
additional policy metadata fields to the PolicyKit
authentication request.
* systemd-tmpfiles gained a new -E switch, which is equivalent to
--exclude-prefix=/dev --exclude-prefix=/proc --exclude-prefix=/run
--exclude-prefix=/sys. It's particularly useful in combination with --root=,
when operating on OS trees that do not have any of these four runtime
directories mounted, as this means no files below these subtrees are
created or modified, since those mount points should probably remain
empty.
* systemd-tmpfiles gained a new --image= switch which is like --root=,
but takes a disk image instead of a directory as argument. The
specified disk image is mounted inside a temporary mount namespace
and the tmpfiles.d/ drop-ins stored in the image are executed and
applied to the image. systemd-sysusers similarly gained a new
--image= switch, that allows the sysusers.d/ drop-ins stored in the
image to be applied onto the image.
* Similarly, the journalctl command also gained an --image= switch,
which is a quick one-step solution to look at the log data included
in OS disk images.
* journalctl's --output=cat option (which outputs the log content
without any metadata, just the pure text messages) will now make use
of terminal colors when run on a suitable terminal, similarly to the
other output modes.
* JSON group records now support a "description" string that may be
used to add a human-readable textual description to such groups. This
is supposed to match the user's GECOS field which traditionally
didn't have a counterpart for group records.
* The "systemd-dissect" tool that may be used to inspect OS disk images
and that was previously installed to /usr/lib/systemd/ has now been
moved to /usr/bin/, reflecting its updated status of an officially
supported tool with a stable interface. It gained support for a new
--mkdir switch which, when combined with --mount, first creates the
directory to mount the image to if it is missing. It also gained two
new commands --copy-from and --copy-to for
copying files and directories in and out of an OS image without the
need to manually mount it. It also acquired support for a new option
--json= to generate JSON output when inspecting an OS image.
* The cgroup2 file system is now mounted with the
"memory_recursiveprot" mount option, supported since kernel 5.7. This
means that the MemoryLow= and MemoryMin= unit file settings now apply
recursively to whole subtrees.
* systemd-homed now defaults to using the btrfs file system — if
available — when creating home directories in LUKS volumes. This may
be changed with the DefaultFileSystemType= setting in homed.conf.
It's now the default file system in various major distributions and
has the major benefit for homed that it can be grown and shrunk while
mounted, unlike the other contenders ext4 and xfs, which can both be
grown online, but not shrunk (in fact xfs is the technically most
limited option here, as it cannot be shrunk at all).
* JSON user records managed by systemd-homed gained support for
"recovery keys". These are basically secondary passphrases that can
unlock user accounts/home directories. They are computer-generated
rather than user-chosen, and typically have greater entropy.
homectl's --recovery-key= option may be used to add a recovery key to
a user account. The generated recovery key is displayed as a QR code,
so that it can be scanned to be kept in a safe place. This feature is
particularly useful in combination with systemd-homed's support for
FIDO2 or PKCS#11 authentication, as a secure fallback in case the
security tokens are lost. Recovery keys may be entered wherever the
system asks for a password.
* systemd-homed now maintains a "dirty" flag for each LUKS encrypted
home directory which indicates that a home directory has not been
deactivated cleanly when offline. This flag is useful to identify
home directories for which the offline discard logic did not run when
offlining, and where it would be a good idea to log in again to catch
up.
* systemctl gained a new parameter --timestamp= which may be used to
change the style in which timestamps are output, i.e. whether to show
them in local timezone or UTC, or whether to show µs granularity.
* Alibaba's "pouch" container manager is now detected by
systemd-detect-virt, ConditionVirtualization= and similar
constructs. Similarly, they now also recognize IBM PowerVM machine
virtualization.
* systemd-nspawn has been reworked to use /run/host/incoming/ as the
place for propagating external mounts into the
container. Similarly /run/host/notify is now used as the socket path
for container payloads to communicate with the container manager
using sd_notify(). The container manager now uses the
/run/host/inaccessible/ directory to place "inaccessible" file nodes
of all relevant types which may be used by the container payload as
bind mount source to over-mount inodes to make them inaccessible.
/run/host/container-manager will now be initialized with the same
string as the $container environment variable passed to the
container's PID 1. /run/host/container-uuid will be initialized with
the same string as $container_uuid. This means the /run/host/
hierarchy is now the primary way to make host resources available to
the container. The Container Interface documents these new files and
directories:
https://systemd.io/CONTAINER_INTERFACE
* Support for the "ConditionNull=" unit file condition has been
deprecated and undocumented for 6 years. systemd started to warn
about its use 1.5 years ago. It has now been removed entirely.
* sd-bus.h gained a new API call sd_bus_error_has_names(), which takes
a sd_bus_error struct and a list of error names, and checks if the
error matches one of these names. It's a convenience wrapper that is
useful in cases where multiple errors shall be handled the same way.
* A new system call filter list "@known" has been added, that contains
all system calls known at the time systemd was built.
* Behaviour of system call filter allow lists has changed slightly:
system calls that are contained in @known will result in EPERM by
default, while those not contained in it result in ENOSYS. This
should improve compatibility because known system calls will thus be
communicated as prohibited, while unknown (and thus newer ones) will
be communicated as not implemented, which hopefully has the greatest
chance of triggering the right fallback code paths in client
applications.
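From the client side, the distinction described above might be handled
roughly as follows. This is a minimal Python sketch, not code from
systemd; call_with_fallback() and the two callables passed to it are
hypothetical names used for illustration only.

```python
import errno


def call_with_fallback(new_call, old_call):
    """Try new_call(); use old_call() only if the kernel says ENOSYS.

    ENOSYS is treated as "this system call is not implemented here",
    so an older interface is tried instead. Any other error, including
    EPERM ("known but prohibited"), is passed on to the caller.
    """
    try:
        return new_call()
    except OSError as e:
        if e.errno == errno.ENOSYS:
            return old_call()
        raise
```

With this pattern a filtered but known system call surfaces as a
permission error instead of silently triggering the fallback path.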
* "systemd-analyze syscall-filter" will now show two separate sections
at the bottom of the output: system calls known during systemd build
time but not included in any of the filter groups shown above, and
system calls defined on the local kernel but not known during systemd
build time.
* If the $SYSTEMD_LOG_SECCOMP=1 environment variable is set for
systemd-nspawn all system call filter violations will be logged by
the kernel (audit). This is useful for tracking down system calls
invoked by container payloads that are prohibited by the container's
system call filter policy.
* If the $SYSTEMD_SECCOMP=0 environment variable is set for
systemd-nspawn (and other programs that use seccomp) all seccomp
filtering is turned off.
* Two new unit file settings ProtectProc= and ProcSubset= have been
added that expose the hidepid= and subset= mount options of procfs.
All processes of the unit will only see processes in /proc that are
owned by the unit's user. This is an important new sandboxing
option that is recommended to be set on all system services. All
long-running system services that are included in systemd itself set
this option now. This option is only supported on kernel 5.8 and
above, since the hidepid= option supported on older kernels was not a
per-mount option but actually applied to the whole PID namespace.
* Socket units gained a new boolean setting FlushPending=. If enabled
all pending socket data/connections are flushed whenever the socket
unit enters the "listening" state, i.e. after the associated service
exited.
* The unit file setting NUMAMask= gained a new "all" value: when used,
all existing NUMA nodes are added to the NUMA mask.
* A new "credentials" logic has been added to system services. This is
a simple mechanism to pass privileged data to services in a safe and
secure way. It's supposed to be used to pass per-service secret data
such as passwords or cryptographic keys but also associated less
private information such as user names, certificates, and similar to
system services. Each credential is identified by a short user-chosen
name and may contain arbitrary binary data. Two new unit file
settings have been added: SetCredential= and LoadCredential=. The
former allows setting a credential to a literal string, the latter
sets a credential to the contents of a file (or data read from a
user-chosen AF_UNIX stream socket). Credentials are passed to the
service via a special credentials directory, one file for each
credential. The path to the credentials directory is passed in a new
$CREDENTIALS_DIRECTORY environment variable. Since the credentials
are passed in the file system they may be easily referenced in
ExecStart= command lines too, thus no explicit support for the
credentials logic in daemons is required (though ideally daemons
would look for the bits they need in $CREDENTIALS_DIRECTORY
themselves automatically, if set). The $CREDENTIALS_DIRECTORY is
backed by unswappable memory if privileges allow it, immutable if
privileges allow it, is accessible only to the service's UID, and is
automatically destroyed when the service stops.
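A daemon that wants to pick up its credentials itself could do so
along these lines. This is a minimal sketch in Python, assuming only
what is described above (one file per credential below
$CREDENTIALS_DIRECTORY); the function name read_credential() and the
credential name "db-password" are hypothetical.

```python
import os
from pathlib import Path


def read_credential(name, default=None):
    """Return the raw bytes of credential `name`, or `default`.

    `default` is returned if $CREDENTIALS_DIRECTORY is unset or the
    credential file does not exist.
    """
    if "/" in name:
        raise ValueError("credential names must not contain '/'")
    cred_dir = os.environ.get("CREDENTIALS_DIRECTORY")
    if not cred_dir:
        return default
    try:
        return (Path(cred_dir) / name).read_bytes()
    except FileNotFoundError:
        return default


if __name__ == "__main__":
    # Hypothetical credential name, for illustration only.
    password = read_credential("db-password")
    print("credential found" if password is not None else "no credential")
```

Returning bytes rather than text reflects that credentials may contain
arbitrary binary data.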
* systemd-nspawn supports the same credentials logic. It can both
consume credentials passed to it via the aforementioned
$CREDENTIALS_DIRECTORY protocol as well as pass these credentials on
to its payload. The service manager/PID 1 has been updated to match
this: it can also accept credentials from the container manager that
invokes it (in fact: any process that invokes it), and passes them on
to its services. Thus, credentials can be propagated recursively down
the tree: from a system's service manager to a systemd-nspawn
service, to the service manager that runs as container payload and to
the service it runs below. Credentials may also be added on the
systemd-nspawn command line, using new --set-credential= and
--load-credential= command line switches that match the
aforementioned service settings.
* systemd-repart gained new settings Format=, Encrypt=, CopyFiles= in
the partition drop-ins which may be used to format/LUKS
encrypt/populate any created partitions. The partitions are
encrypted/formatted/populated before they are registered in the
partition table, so that they appear atomically: either the
partitions do not exist yet or they exist fully encrypted, formatted,
and populated — there is no time window where they are
"half-initialized". Thus the system is robust to abrupt shutdown: if
the tool is terminated half-way during its operations, on next boot it
will start from the beginning.
* systemd-repart's --size= operation gained a new "auto" value. If
specified, and operating on a loopback file it is automatically sized
to the minimal size the size constraints permit. This is useful to
use "systemd-repart" as an image builder for minimally sized images.
* systemd-resolved now gained a third IPC interface for requesting name
resolution: besides D-Bus and local DNS to 127.0.0.53 a Varlink
interface is now supported. The nss-resolve NSS module has been
modified to use this new interface instead of D-Bus. Using Varlink
has a major benefit over D-Bus: it works without a broker service,
and thus already during earliest boot, before the dbus daemon has
been started. This means name resolution via systemd-resolved now
works at the same time systemd-networkd operates: from earliest boot
on, including in the initrd.
* systemd-resolved gained support for a new DNSStubListenerExtra=
configuration file setting which may be used to specify additional IP
addresses the built-in DNS stub shall listen on, in addition to the
main one on 127.0.0.53:53.
* Name lookups issued via systemd-resolved's D-Bus and Varlink
interfaces (and thus also via glibc NSS if nss-resolve is used) will
now honour a trailing dot in the hostname: if specified the search
path logic is turned off. Thus "resolvectl query foo." is now
equivalent to "resolvectl query --search=off foo.".
* systemd-resolved gained a new D-Bus property "ResolvConfMode" that
exposes how /etc/resolv.conf is currently managed: by resolved (and
in which mode if so) or another subsystem. "resolvectl" will display
this property in its status output.
* The resolv.conf snippets systemd-resolved provides will now set "."
as the search domain if no other search domain is known. This turns
off the derivation of an implicit search domain by nss-dns for the
hostname, when the hostname is set to an FQDN. This change is done to
make nss-dns using resolv.conf provided by systemd-resolved behave
more similarly to nss-resolve.
* systemd-tmpfiles' file "aging" logic (i.e. the automatic clean-up of
/tmp/ and /var/tmp/ based on file timestamps) now looks at the
"birth" time (btime) of a file in addition to the atime, mtime, and
ctime.
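The aging decision can be pictured as follows. This is a simplified
Python sketch, not the systemd-tmpfiles implementation; it assumes
that any sufficiently recent timestamp protects a file from clean-up,
and is_expired() and the dictionary keys are hypothetical names.

```python
def is_expired(timestamps, now, max_age):
    """Decide whether a file counts as aged.

    `timestamps` maps names such as "atime", "mtime", "ctime" and
    "btime" to seconds since the epoch, or None where a value is not
    available. The file is considered aged only if every known
    timestamp is older than now - max_age.
    """
    cutoff = now - max_age
    known = [t for t in timestamps.values() if t is not None]
    return bool(known) and all(t < cutoff for t in known)
```

Under this model a file that was created recently is kept even if
its mtime was set to an old value, for example by unpacking an archive
that preserves modification times.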
* systemd-analyze gained a new verb "capability" that lists all
capabilities known to the systemd build and to the kernel.
* If a file /usr/lib/clock-epoch exists, PID 1 will read its mtime and
advance the system clock to it at boot if it is noticed to be before
that time. Previously, PID 1 would only advance the time to an epoch
time that is set during build-time. With this new file OS builders
can change this epoch timestamp on individual OS images without
having to rebuild systemd.
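The effect can be sketched in a few lines of Python. This is an
illustration of the rule described above, not PID 1's actual code;
read_clock_epoch(), corrected_time() and the built_in_epoch parameter
are hypothetical names.

```python
import os


def read_clock_epoch(path="/usr/lib/clock-epoch", built_in_epoch=0.0):
    """Return the mtime of `path`, or the built-in epoch if it is absent."""
    try:
        return os.stat(path).st_mtime
    except FileNotFoundError:
        return built_in_epoch


def corrected_time(now, epoch):
    """The clock is only ever advanced: times at or after the epoch are kept."""
    return max(now, epoch)
```

An image builder would thus only need to update the file's
modification time, for example with touch(1), when producing a new
image.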
* systemd-logind will now listen to the KEY_RESTART key from the Linux
input layer and reboot the system if it is pressed, similarly to how
it already handles KEY_POWER, KEY_SUSPEND or KEY_SLEEP. KEY_RESTART
was originally defined in the Multimedia context (to restart playback
of a song or film), but is now primarily used in various embedded
devices for "Reboot" buttons. Accordingly, systemd-logind will now
honour it as such. This may be configured in more detail via the new
HandleRebootKey= and RebootKeyIgnoreInhibited=.
* systemd-nspawn/systemd-machined will now reconstruct hardlinks when
copying OS trees, for example in "systemd-nspawn --ephemeral",
"systemd-nspawn --template=", "machinectl clone" and similar. This is
useful when operating with OSTree images, which use hardlinks heavily
throughout, and where such copies previously resulted in "exploding"
hardlinks.
* systemd-nspawn's --console= setting gained support for a new
"autopipe" value, which is identical to "interactive" when invoked on
a TTY, and "pipe" otherwise.
* systemd-networkd's .network files gained support for explicitly
configuring the multicast membership entries of bridge devices in the
[BridgeMDB] section. It also gained support for the PIE queuing
discipline in the [FlowQueuePIE] sections.
* systemd-networkd's .netdev files may now be used to create "BareUDP"
tunnels, configured in the new [BareUDP] section.
* systemd-networkd's Gateway= setting in .network files now accepts the
special values "_dhcp4" and "_ipv6ra" to configure additional,
locally defined, explicit routes to the gateway acquired via DHCP or
IPv6 Router Advertisements. The old setting "_dhcp" is deprecated,
but still accepted for backwards compatibility.
* systemd-networkd's [IPv6PrefixDelegation] section and
IPv6PrefixDelegation= options have been renamed as [IPv6SendRA] and
IPv6SendRA= (the old names are still accepted for backwards
compatibility).
* systemd-networkd's .network files gained the DHCPv6PrefixDelegation=
boolean setting in [Network] section. If enabled, the delegated prefix
gained by another link will be configured, and an address within the
prefix will be assigned.
* systemd-networkd's .network files gained the Announce= boolean setting
in [DHCPv6PrefixDelegation] section. When enabled, the delegated
prefix will be announced through IPv6 router advertisement (IPv6 RA).
The setting is enabled by default.
* VXLAN tunnels may now be marked as independent of any underlying
network interface via the new Independent= boolean setting.
* systemctl gained support for two new verbs: "service-log-level" and
"service-log-target" may be used on services that implement the
generic org.freedesktop.LogControl1 D-Bus interface to dynamically
adjust the log level and target. All of systemd's long-running
services support this now, but ideally all system services would
implement this interface to make the system more uniformly
debuggable.
* The SystemCallErrorNumber= unit file setting now accepts the new
"kill" and "log" actions, in addition to arbitrary error number
specifications as before. If "kill" the processes are killed on the
event, if "log" the offending system call is audit logged.
* A new SystemCallLog= unit file setting has been added that accepts a
list of system calls that shall be logged about (audit).
* The OS image dissection logic (as used by RootImage= in unit files or
systemd-nspawn's --image= switch) has gained support for identifying
and mounting explicit /usr/ partitions, which are now defined in the
discoverable partition specification. This should be useful for
environments where the root file system is
generated/formatted/populated dynamically on first boot and combined
with an immutable /usr/ tree that is supplied by the vendor.
* In the final phase of shutdown, within the systemd-shutdown binary
we'll now try to detach MD devices (i.e. software RAID) in addition to
loopback block devices and DM devices as before. This is supposed to
be a safety net only, in order to increase robustness if things go
wrong. Storage subsystems are expected to properly detach their
storage volumes during regular shutdown already (or in case of
storage backing the root file system: in the initrd hook we return to
later).
* If the SYSTEMD_LOG_TID environment variable is set all systemd tools
will now log the thread ID in their log output. This is useful when
working with heavily threaded programs.
* If the SYSTEMD_RDRAND environment variable is set to "0", systemd will
not use the RDRAND CPU instruction. This is useful in environments
such as replay debuggers where non-deterministic behaviour is not
desirable.
* The autopaging logic in systemd's various tools (such as systemctl)
has been updated to turn on "secure" mode in "less"
(i.e. $LESSSECURE=1) if execution in a "sudo" environment is
detected. This disables invoking external programs from the pager,
via the pipe logic. This behaviour may be overridden via the new
$SYSTEMD_PAGERSECURE environment variable.
* Units which have resource limits (.service, .mount, .swap, .slice,
.socket, and .scope) gained new configuration settings
ManagedOOMSwap=, ManagedOOMMemoryPressure=, and
ManagedOOMMemoryPressureLimitPercent= that specify resource pressure
limits and optional action taken by systemd-oomd.
* A new service systemd-oomd has been added. It monitors resource
contention for selected parts of the unit hierarchy using the PSI
information reported by the kernel, and kills processes when memory
or swap pressure is above configured limits. This service is only
enabled by default in developer mode (see below) and should be
considered a preview in this release. Behaviour details and option
names are subject to change without the usual backwards-compatibility
promises.
* A new helper oomctl has been added to introspect systemd-oomd state.
It is only enabled by default in developer mode and should be
considered a preview without the usual backwards-compatibility
promises.
* New meson option -Dcompat-mutable-uid-boundaries= has been added. If
enabled, systemd reads the system UID boundaries from /etc/login.defs
at runtime, instead of using the built-in values selected during
build. This is an option to improve compatibility for upgrades from
old systems. It's strongly recommended not to make use of this
functionality on new systems (or even enable it during build), as it
makes something runtime-configurable that is mostly an implementation
detail of the OS, and permits avoidable differences in deployments
that create all kinds of problems in the long run.
* New meson option '-Dmode=developer|release' has been added. When
'developer', additional checks and features are enabled that are
relevant during upstream development, e.g. verification that
semi-automatically-generated documentation has been properly updated
following API changes. Those checks are considered hints for
developers and are not actionable in downstream builds. In addition,
extra features that are not ready for general consumption may be
enabled in developer mode. It is thus recommended to set
'-Dmode=release' in end-user and distro builds.
* systemd-cryptsetup gained support for processing detached LUKS
headers specified on the kernel command line via the header=
parameter of the luks.options= kernel command line option. The same
device/path syntax as for key files is supported for header files
like this.
* The "net_id" built-in of udev has been updated to ignore ACPI _SUN
slot index data for devices that are connected through a PCI bridge
where the _SUN index is associated with the bridge instead of the
network device itself. Previously this would create ambiguous device
naming if multiple network interfaces were connected to the same PCI
bridge. Since this is a naming scheme incompatibility on systems that
possess hardware like this it has been introduced as new naming
scheme "v247". The previous scheme can be selected via the
"net.naming-scheme=v245" kernel command line parameter.
* ConditionFirstBoot= semantics have been modified to be safe towards
abnormal system power-off during first boot. Specifically, the
"systemd-machine-id-commit.service" service now acts as boot
milestone indicating when the first boot process is sufficiently
complete in order to not consider the next following boot also a
first boot. If the system is reset before this unit is reached the
first time, the next boot will still be considered a first boot; once
it has been reached, no further boots will be considered a first
boot. The "first-boot-complete.target" unit now acts as official hook
point to order against this. If a service shall be run on every boot
until the first boot fully succeeds it may thus be ordered before
this target unit (and pull it in) and carry ConditionFirstBoot=
appropriately.
* bootctl's set-default and set-oneshot commands now accept the three
special strings "@default", "@oneshot", "@current" in place of a boot
entry id. These strings are resolved to the current default and
oneshot boot loader entry, as well as the currently booted one. Thus
a command "bootctl set-default @current" may be used to make the
currently booted menu item the new default for all subsequent boots.
* "systemctl edit" has been updated to show the original effective unit
contents in commented form in the text editor.
* Units in user mode are now segregated into three new slices:
session.slice (units that form the core of graphical session),
app.slice ("normal" user applications), and background.slice
(low-priority tasks). Unless otherwise configured, user units are
placed in app.slice. The plan is to add resource limits and
protections for the different slices in the future.
* New GPT partition types for RISCV32/64 for the root and /usr
partitions, and their associated Verity partitions have been defined,
and are now understood by systemd-gpt-auto-generator, and the OS
image dissection logic.
Contributions from: Adolfo Jayme Barrientos, afg, Alec Moskvin, Alyssa
Ross, Amitanand Chikorde, Andrew Hangsleben, Anita Zhang, Ansgar
Burchardt, Arian van Putten, Aurelien Jarno, Axel Rasmussen, bauen1,
Beniamino Galvani, Benjamin Berg, Bjørn Mork, brainrom, Chandradeep
Dey, Charles Lee, Chris Down, Christian Göttsche, Christof Efkemann,
Christoph Ruegge, Clemens Gruber, Daan De Meyer, Daniele Medri, Daniel
Mack, Daniel Rusek, Dan Streetman, David Tardon, Dimitri John Ledkov,
Dmitry Borodaenko, Elias Probst, Elisei Roca, ErrantSpore, Etienne
Doms, Fabrice Fontaine, fangxiuning, Felix Riemann, Florian Klink,
Franck Bui, Frantisek Sumsal, fwSmit, George Rawlinson, germanztz,
Gibeom Gwon, Glen Whitney, Gogo Gogsi, Göran Uddeborg, Grant Mathews,
Hans de Goede, Hans Ulrich Niedermann, Haochen Tong, Harald Seiler,
huangyong, Hubert Kario, igo95862, Ikey Doherty, Insun Pyo, Jan Chren,
Jan Schlüter, Jérémy Nouhaud, Jian-Hong Pan, Joerg Behrmann, Jonathan
Lebon, Jörg Thalheim, Josh Brobst, Juergen Hoetzel, Julien Humbert,
Kai-Chuan Hsieh, Kairui Song, Kamil Dudka, Kir Kolyshkin, Kristijan
Gjoshev, Kyle Huey, Kyle Russell, Lee Whalen, Lennart Poettering,
lichangze, Luca Boccassi, Lucas Werkmeister, Luca Weiss, Marc
Kleine-Budde, Marco Wang, Martin Wilck, Marti Raudsepp, masmullin2000,
Máté Pozsgay, Matt Fenwick, Michael Biebl, Michael Scherer, Michal
Koutný, Michal Sekletár, Michal Suchanek, Mikael Szreder, Milo
Casagrande, mirabilos, Mitsuha_QuQ, mog422, Muhammet Kara, Nazar
Vinnichuk, Nicholas Narsing, Nicolas Fella, Njibhu, nl6720, Oğuz Ersen,
Olivier Le Moal, Ondrej Kozina, onlybugreports, Pass Automated Testing
Suite, Pat Coulthard, Pavel Sapezhko, Pedro Ruiz, perry_yuan, Peter
Hutterer, Phaedrus Leeds, PhoenixDiscord, Piotr Drąg, Plan C,
Purushottam choudhary, Rasmus Villemoes, Renaud Métrich, Robert Marko,
Roman Beranek, Ronan Pigott, Roy Chen (陳彥廷), RussianNeuroMancer,
Samanta Navarro, Samuel BF, scootergrisen, Sorin Ionescu, Steve Dodd,
Susant Sahani, Timo Rothenpieler, Tobias Hunger, Tobias Kaufmann, Topi
Miettinen, vanou, Vito Caputo, Weblate, Wen Yang, Whired Planck,
williamvds, Yu, Li-Yu, Yuri Chornoivan, Yu Watanabe, Zbigniew
Jędrzejewski-Szmek, Zmicer Turok, Дамјан Георгиевски
– Warsaw, 2020-11-26
CHANGES WITH 246:
* The service manager gained basic support for cgroup v2 freezer. Units
can now be suspended or resumed either using new systemctl verbs,
freeze and thaw respectively, or via D-Bus.
* PID 1 may now automatically load pre-compiled AppArmor policies from
/etc/apparmor/earlypolicy during early boot.
* The CPUAffinity= setting in service unit files now supports a new
special value "numa" that causes the CPU affinity mask to be set
based on the NUMA mask.
* systemd will now log about all left-over processes remaining in a
unit when the unit is stopped. It will now warn about services using
KillMode=none, as this is generally an unsafe thing to make use of.
* Two new unit file settings
ConditionPathIsEncrypted=/AssertPathIsEncrypted= have been
added. They may be used to check whether a specific file system path
resides on a block device that is encrypted on the block level
(i.e. using dm-crypt/LUKS).
* Another pair of new settings ConditionEnvironment=/AssertEnvironment=
has been added that may be used for simple environment checks. This
is particularly useful when passing in environment variables from a
container manager (or from PAM in case of the systemd --user
instance).
* .service unit files now accept a new setting CoredumpFilter= which
allows configuration of the memory sections coredumps of the
service's processes shall include.
* .mount units gained a new ReadWriteOnly= boolean option. If set
it will not be attempted to mount a file system read-only if mounting
in read-write mode doesn't succeed. An option x-systemd.rw-only is
available in /etc/fstab to control the same.
* .socket units gained a new boolean setting PassPacketInfo=. If
enabled, the kernel will attach additional per-packet metadata to all
packets read from the socket, as an ancillary message. This controls
the IP_PKTINFO, IPV6_RECVPKTINFO, NETLINK_PKTINFO socket options,
depending on socket type.
* .service units gained a new setting RootHash= which may be used to
specify the root hash for verity enabled disk images which are
specified in RootImage=. RootVerity= may be used to specify a path to
the Verity data matching a RootImage= file system. (The latter is
only useful for images that do not contain the Verity data embedded
into the same image that carries a GPT partition table following the
Discoverable Partition Specification). Similarly, systemd-nspawn
gained a new switch --verity-data= that takes a path to a file with
the verity data of the disk image supplied in --image=, if the image
doesn't contain the verity data itself.
* .service units gained a new setting RootHashSignature= which takes
either a base64 encoded PKCS#7 signature of the root hash specified
with RootHash=, or a path to a file to read the signature from. This
allows validation of the root hash against public keys available in
the kernel keyring, and is only supported on recent kernels
(>= 5.4)/libcryptsetup (>= 2.3.0). A similar switch has been added to
systemd-nspawn and systemd-dissect (--root-hash-sig=). Support for
this mechanism has also been added to systemd-veritysetup.
* .service unit files gained two new options
TimeoutStartFailureMode=/TimeoutStopFailureMode= that may be used to
tune behaviour if a start or stop timeout is hit, i.e. whether to
terminate the service with SIGTERM, SIGABRT or SIGKILL.
* Most options in systemd that accept hexadecimal values prefixed with
0x in addition to the usual decimal notation now also support octal
notation when the 0o prefix is used and binary notation if the 0b
prefix is used.
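The four notations can be illustrated with a short Python sketch.
This relies on Python's own literal rules via int(text, 0), which
happen to use the same 0x, 0o and 0b prefixes; it is not systemd's
parser, and parse_number() is a hypothetical name. Details such as
leading zeros or digit separators may be treated differently by
systemd.

```python
def parse_number(text):
    """Parse decimal, 0x (hex), 0o (octal) or 0b (binary) notation.

    Base 0 tells int() to pick the base from the prefix. Invalid
    input raises ValueError.
    """
    return int(text.strip(), 0)


if __name__ == "__main__":
    for sample in ("42", "0x2a", "0o52", "0b101010"):
        print(sample, "->", parse_number(sample))
```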
* Various command line parameters and configuration file settings that
configure key or certificate files now optionally take paths to
AF_UNIX sockets in the file system. If configured that way a stream
connection is made to the socket and the required data read from
it. This is a simple and natural extension to the existing regular
file logic, and permits other software to provide keys or
certificates via simple IPC services, for example when unencrypted
storage on disk is not desired. Specifically, systemd-networkd's
Wireguard and MACSEC key file settings as well as
systemd-journal-gatewayd's and systemd-journal-remote's PEM
key/certificate parameters support this now.
* Unit files, tmpfiles.d/ snippets, sysusers.d/ snippets and other
configuration files that support specifier expansion learnt six new
specifiers: %a resolves to the current architecture, %o/%w/%B/%W
resolve to the various ID fields from /etc/os-release, %l resolves to
the "short" hostname of the system, i.e. the hostname configured in
the kernel truncated at the first dot.
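The truncation rule for %l amounts to the following. This is a
minimal Python sketch of the rule as stated above; short_hostname()
is a hypothetical name and the host names used are examples.

```python
def short_hostname(hostname):
    """Return the part of `hostname` before the first dot."""
    return hostname.split(".", 1)[0]


if __name__ == "__main__":
    print(short_hostname("web01.example.com"))
```

A hostname without any dot is returned unchanged.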
* Support for the .include syntax in unit files has been removed. The
concept has been obsolete for 6 years and we started warning about
its pending removal 2 years ago (also see NEWS file below). It's
finally gone now.
* StandardError= and StandardOutput= in unit files no longer support
the "syslog" and "syslog+console" switches. They were long removed
from the documentation, but will now result in warnings when used,
and be converted to "journal" and "journal+console" automatically.
* If the service setting User= is set to the "nobody" user, a warning
message is now written to the logs (but the value is nonetheless
accepted). Setting User=nobody is unsafe, since the primary purpose
of the "nobody" user is to own all files whose owner cannot be mapped
locally. It's in particular used by the NFS subsystem and in user
namespacing. By running a service under this user's UID it might get
read and even write access to all these otherwise unmappable files,
which is quite likely a major security problem.
* tmpfs mounts automatically created by systemd (/tmp, /run, /dev/shm,
and others) now have size and inode limits applied (50% of RAM for
/tmp and /dev/shm, 10% of RAM for other mounts, etc.). Please note
that the implicit kernel default is 50% too, so there is no change
in the size limit for /tmp and /dev/shm.
* nss-mymachines lost support for resolution of users and groups, and
now only does resolution of hostnames. This functionality is now
provided by nss-systemd. Thus, the 'mymachines' entry should be
removed from the 'passwd:' and 'group:' lines in /etc/nsswitch.conf
(and 'systemd' added if it is not already there).
* A new kernel command line option systemd.hostname= has been added
that allows controlling the hostname that is initialized early during
boot.
* A kernel command line option "udev.blockdev_read_only" has been
added. If specified all hardware block devices that show up are
immediately marked as read-only by udev. This option is useful for
making sure that a specific boot under no circumstances modifies data
on disk. Use "blockdev --setrw" to undo the effect of this, per
device.
* A new boolean kernel command line option systemd.swap= has been
added, which may be used to turn off automatic activation of swap
devices listed in /etc/fstab.
* New kernel command line options systemd.condition-needs-update= and
systemd.condition-first-boot= have been added, which override the
result of the ConditionNeedsUpdate= and ConditionFirstBoot=
conditions.
* A new kernel command line option systemd.clock-usec= has been added
that allows setting the system clock to the specified time in µs
since Jan 1st, 1970 early during boot. This is in particular useful
in order to make test cases more reliable.
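A test harness might compute the value to pass roughly like this.
This is a minimal Python sketch; to_clock_usec() and
kernel_argument() are hypothetical helper names, and integer
arithmetic is used to avoid floating point rounding.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_clock_usec(moment):
    """Convert a timezone-aware datetime to µs since Jan 1st, 1970 UTC."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # Dividing two timedeltas with // yields an exact integer.
    return (moment - EPOCH) // timedelta(microseconds=1)


def kernel_argument(moment):
    """Format the kernel command line option for the given moment."""
    return f"systemd.clock-usec={to_clock_usec(moment)}"
```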
* The fs.suid_dumpable sysctl is set to 2 / "suidsafe". This allows
systemd-coredump to save core files for suid processes. When saving
the core file, systemd-coredump will use the effective uid and gid of
the process that faulted.
* The /sys/module/kernel/parameters/crash_kexec_post_notifiers file is
now automatically set to "Y" at boot, in order to enable pstore
generation for collection with systemd-pstore.
* We provide a set of udev rules to enable auto-suspend on PCI and USB
devices that were tested to correctly support it. Previously, this
was distributed as a set of udev rules, but has now been replaced by
a set of hwdb entries (and a much shorter udev rule to take action
if the device modalias matches one of the new hwdb entries).
As before, entries are periodically imported from the database
maintained by the ChromiumOS project. If you have a device that
supports auto-suspend correctly and where it should be enabled by
default, please submit a patch that adds it to the database (see
/usr/lib/udev/hwdb.d/60-autosuspend.hwdb).
* systemd-udevd gained the new configuration option timeout_signal= as well
as a corresponding kernel command line option udev.timeout_signal=.
The option can be used to configure the UNIX signal that the main
daemon sends to the worker processes on timeout. Setting the signal
to SIGABRT is useful for debugging.
* .link files managed by systemd-udevd gained options RxFlowControl=,
TxFlowControl=, AutoNegotiationFlowControl= in the [Link] section, in
order to configure various flow control parameters. They also gained
RxMiniBufferSize= and RxJumboBufferSize= in order to configure jumbo
frame ring buffer sizes.
* networkd.conf gained a new boolean setting ManageForeignRoutes=. If
enabled systemd-networkd manages all routes configured by other tools.
* .network files managed by systemd-networkd gained a new section
[SR-IOV], in order to configure SR-IOV capable network devices.
* systemd-networkd's [IPv6Prefix] section in .network files gained a
new boolean setting Assign=. If enabled an address from the prefix is
automatically assigned to the interface.
* systemd-networkd gained a new section [DHCPv6PrefixDelegation] which
controls delegated prefixes assigned by the DHCPv6 client. The section
has three settings: SubnetID=, Assign=, and Token=. The setting
SubnetID= allows explicit configuration of the preferred subnet that
systemd-networkd's Prefix Delegation logic assigns to interfaces. If
Assign= is enabled (which is the default) an address from any acquired
delegated prefix is automatically chosen and assigned to the
interface. The setting Token= specifies an optional address generation
mode for Assign=.
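To illustrate what picking a subnet by ID means, here is a rough Python sketch that carves one subnet out of a delegated prefix. It assumes /64 subnets and treats the subnet ID as a plain integer; the function name subnet_for_id() is hypothetical and this is not systemd-networkd's actual implementation.

```python
import ipaddress


def subnet_for_id(delegated_prefix: str, subnet_id: int,
                  subnet_len: int = 64) -> ipaddress.IPv6Network:
    """Return the subnet with the given ID inside a delegated prefix.

    Illustrative only: assumes /64 subnets unless told otherwise.
    """
    net = ipaddress.IPv6Network(delegated_prefix)
    if net.prefixlen > subnet_len:
        raise ValueError("delegated prefix is longer than the subnet length")
    # Number of bits available for numbering subnets.
    id_bits = subnet_len - net.prefixlen
    if not 0 <= subnet_id < (1 << id_bits):
        raise ValueError("subnet ID does not fit into the delegated prefix")
    # Place the ID directly above the host part of the address.
    base = int(net.network_address) + (subnet_id << (128 - subnet_len))
    return ipaddress.IPv6Network((base, subnet_len))


if __name__ == "__main__":
    print(subnet_for_id("2001:db8:1200::/56", 0x2A))
```

A /56 delegation leaves 8 bits for the subnet ID, so 256 distinct /64 subnets can be selected this way.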
* systemd-networkd's [Network] section gained a new setting
IPv4AcceptLocal=. If enabled the interface accepts packets with local
source addresses.
* systemd-networkd gained support for configuring the HTB queuing
discipline in the [HierarchyTokenBucket] and
[HierarchyTokenBucketClass] sections. Similarly, the "pfifo" qdisc may
be configured in the [PFIFO] section, "GRED" in
[GenericRandomEarlyDetection], "SFB" in [StochasticFairBlue], "cake"
in [CAKE], "PIE" in [PIE], "DRR" in [DeficitRoundRobinScheduler] and
[DeficitRoundRobinSchedulerClass], "BFIFO" in [BFIFO],
"PFIFOHeadDrop" in [PFIFOHeadDrop], "PFIFOFast" in [PFIFOFast], "HHF"
in [HeavyHitterFilter], "ETS" in [EnhancedTransmissionSelection] and
"QFQ" in [QuickFairQueueing] and [QuickFairQueueingClass].
* systemd-networkd gained support for a new Termination= setting in the
[CAN] section for configuring the termination resistor. It also
gained a new ListenOnly= setting for controlling whether to only
listen on CAN interfaces, without interfering with traffic otherwise
(which is useful for debugging/monitoring CAN network
traffic). DataBitRate=, DataSamplePoint=, FDMode=, FDNonISO= have
been added to configure various CAN-FD aspects.
* systemd-networkd's [DHCPv6] section gained a new option WithoutRA=.
When enabled, DHCPv6 will be attempted right away without requiring a
Router Advertisement packet suggesting it first (i.e. without the 'M'
or 'O' flags set). The [IPv6AcceptRA] section gained a boolean option
DHCPv6Client= that may be used to turn off the DHCPv6 client even if
the RA packets suggest it.
* systemd-networkd's [DHCPv4] section gained a new setting UseGateway=
which may be used to turn off use of the gateway information provided
by the DHCP lease. A new FallbackLeaseLifetimeSec= setting may be
used to configure how to process leases that lack a lifetime option.
* systemd-networkd's [DHCPv4] and [DHCPServer] sections gained a new
setting SendVendorOption= allowing configuration of additional vendor
options to send in the DHCP requests/responses. The [DHCPv6] section
gained a new SendOption= setting for sending arbitrary DHCP
options. RequestOptions= has been added to request arbitrary options
from the server. UserClass= has been added to set the DHCP user class
field.
* systemd-networkd's [DHCPServer] section gained a new set of options
EmitPOP3=/POP3=, EmitSMTP=/SMTP=, EmitLPR=/LPR= for including server
information about these three protocols in the DHCP lease. It also
gained support for including "MUD" URLs ("Manufacturer Usage
Description"). Support for "MUD" URLs was also added to the LLDP
stack, configurable in the [LLDP] section in .network files.
* The Mode= settings in [MACVLAN] and [MACVTAP] now support 'source'
mode. Also, the sections now support a new setting SourceMACAddress=.
* systemd-networkd's .netdev files now support a new setting
VLANProtocol= in the [Bridge] section that allows configuration of
the VLAN protocol to use.
* systemd-networkd supports a new Group= setting in the [Link] section
of the .network files, to control the link group.
* systemd-networkd's [Network] section gained a new
IPv6LinkLocalAddressGenerationMode= setting, which specifies how the
IPv6 link-local address is generated.
* A new default .network file is now shipped that matches TUN/TAP
devices that begin with "vt-" in their name. Such interfaces will
have IP routing onto the host links set up automatically. This is
supposed to be used by VM managers to trivially acquire a network
interface which is fully set up for host communication, simply by
carefully picking an interface name to use.
* systemd-networkd's [DHCPv6] section gained a new setting RouteMetric=
which sets the route priority for routes specified by the DHCP server.
* systemd-networkd's [DHCPv6] section gained a new setting VendorClass=
which configures the vendor class information sent to the DHCP server.
* The BlackList= settings in .network files' [DHCPv4] and
[IPv6AcceptRA] sections have been renamed DenyList=. The old names
are still understood to provide compatibility.
* networkctl gained the new "forcerenew" command for forcing all DHCP
server clients to renew their lease. The interface "status" output
will now show numerous additional fields of information about an
interface. There are new "up" and "down" commands to bring specific
interfaces up or down.
* systemd-resolved's DNS= configuration option now optionally accepts a
port number (after ":") and a host name (after "#"). When the host
name is specified, the DNS-over-TLS certificate is validated to match
the specified hostname. Additionally, in case of IPv6 addresses, an
interface may be specified (after "%").
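The extended syntax can be illustrated with a small Python parser. This is a sketch, not systemd-resolved's own parser: it assumes that an IPv6 address carrying a port is written in square brackets (as in "[2001:db8::1]:853"), and the names DnsServer and parse_dns_server() are hypothetical.

```python
import ipaddress
from typing import NamedTuple, Optional


class DnsServer(NamedTuple):
    address: str
    port: Optional[int]
    ifname: Optional[str]
    server_name: Optional[str]


def parse_dns_server(spec: str) -> DnsServer:
    """Split a DNS= entry into address, port, interface and host name."""
    # The host name used for certificate validation follows "#".
    rest, _, server_name = spec.partition("#")
    # The interface name follows "%".
    rest, _, ifname = rest.partition("%")
    port: Optional[int] = None
    if rest.startswith("["):
        # Bracketed IPv6 address, optionally followed by ":port".
        address, closing, tail = rest[1:].partition("]")
        if not closing or (tail and not tail.startswith(":")):
            raise ValueError(f"malformed bracketed address: {spec!r}")
        if tail:
            port = int(tail[1:])
    elif rest.count(":") == 1:
        # Exactly one colon: IPv4 address followed by a port.
        address, _, port_text = rest.partition(":")
        port = int(port_text)
    else:
        address = rest
    ipaddress.ip_address(address)  # raises ValueError if malformed
    if port is not None and not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return DnsServer(address, port, ifname or None, server_name or None)


if __name__ == "__main__":
    print(parse_dns_server("192.0.2.1:853#dns.example.com"))
    print(parse_dns_server("[2001:db8::1]:853%eth0#dns.example.com"))
```

The addresses and host names used here are documentation placeholders.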
* systemd-resolved may be configured to forward single-label DNS names.
This is not standard-conformant, but may make sense in setups where
public DNS servers are not used.
* systemd-resolved's DNS-over-TLS support gained SNI validation.
* systemd-nspawn's --resolv-conf= switch gained a number of new
supported values. Specifically, options starting with "replace-" are
like those prefixed "copy-" but replace any existing resolv.conf
file. And options ending in "-uplink" and "-stub" can now be used to
propagate other flavours of resolv.conf into the container (as
defined by systemd-resolved).
* The various programs included in systemd can now optionally output
their log messages on stderr prefixed with a timestamp, controlled by
the $SYSTEMD_LOG_TIME environment variable.
* systemctl gained a new "-P" switch that is a shortcut for "--value
--property=…".
* "systemctl list-units" and "systemctl list-machines" no longer hide
their first output column with --no-legend. To hide the first column,
use --plain.
* "systemctl reboot" takes the option "--reboot-argument=".
The optional positional argument to "systemctl reboot" is now
being deprecated in favor of this option.
* systemd-run gained a new switch --slice-inherit. If specified the
unit it generates is placed in the same slice as the systemd-run
process itself.
* systemd-journald gained support for zstd compression of large fields
in journal files. The hash tables in journal files have been hardened
against hash collisions. This is an incompatible change and means
that journal files created with new systemd versions are not readable
with old versions. If the $SYSTEMD_JOURNAL_KEYED_HASH boolean
environment variable for systemd-journald.service is set to 0 this
new hardening functionality may be turned off, so that generated
journal files remain compatible with older journalctl
implementations.
* journalctl will now include a clickable link in the default output for
each log message for which a URL with further documentation is
known. This is only supported on terminal emulators that support
clickable hyperlinks, and is turned off if a pager is used (since
"less" still doesn't support hyperlinks,
unfortunately). Documentation URLs may be included in log messages
either by including a DOCUMENTATION= journal field in it, or by
associating a journal message catalog entry with the log message's
MESSAGE_ID, which then carries a "Documentation:" tag.
* journald.conf gained a new boolean setting Audit= that may be used to
control whether systemd-journald will enable audit during
initialization.
* When systemd-journald's log stream is broken up into multiple lines
because the PID of the sender changed, this is indicated in the
generated log records via the _LINE_BREAK=pid-change field.
* journalctl's "-o cat" output mode will now show one or more journal
fields specified with --output-fields= instead of unconditionally
MESSAGE=. This is useful to retrieve a very specific set of fields
without any decoration.
* The sd-journal.h API gained two new functions:
sd_journal_enumerate_available_unique() and
sd_journal_enumerate_available_data() that operate like their
counterparts that lack the _available_ in the name, but skip items
that cannot be read and processed by the local implementation
(i.e. are compressed in an unsupported format or such).
* coredumpctl gained a new --file= switch, matching the same one in
journalctl: a specific journal file may be specified to read the
coredump data from.
* coredumps collected by systemd-coredump may now be compressed using
the zstd algorithm.
* systemd-binfmt gained a new switch --unregister for unregistering all
registered entries at once. This is now invoked automatically at
shutdown, so that binary formats registered with the "F" flag will
not block clean file system unmounting.
* systemd-notify's --pid= switch gained new values: "parent", "self",
"auto" for controlling which PID to send to the service manager: the
systemd-notify process' PID, or the one of the process invoking it.
* systemd-logind's Session bus object learnt a new method call
SetType() for temporarily updating the session type of an already
allocated session. This is useful for upgrading tty sessions to
graphical ones once a compositor is invoked.
* systemd-socket-proxy gained a new switch --exit-idle-time= for
configuring an exit-on-idle time.
* systemd-repart's --empty= setting gained a new value "create". If
specified a new empty regular disk image file is created under the
specified name. Its size may be specified with the new --size=
option. The latter is also supported without the "create" mode, in
order to grow existing disk image files to the specified size. These
two new options are useful when creating or manipulating disk images
instead of operating on actual block devices.
* systemd-repart drop-ins now support a new UUID= setting to control
the UUID to assign to a newly created partition.
* systemd-repart's SizeMin= per-partition parameter now defaults to 10M
instead of 0.
* systemd-repart's Label= setting now supports the usual, simple
specifier expansion.
* systemd-homed's LUKS backend gained the ability to discard empty file
system blocks automatically when the user logs out. This is enabled
by default to ensure that home directories take minimal space when
logged out but get full size guarantees when logged in. This may be
controlled with the new --luks-offline-discard= switch to homectl.
* If systemd-homed detects that /home/ is encrypted as a whole it will
now default to the directory or subvolume backends instead of the
LUKS backend, in order to avoid double encryption. The default
storage and file system may now be configured explicitly, too, via
the new /etc/systemd/homed.conf configuration file.
* systemd-homed now supports unlocking home directories with FIDO2
security tokens that support the 'hmac-secret' extension, in addition
to the existing support for unlocking with PKCS#11 security
tokens. Note that many recent hardware security tokens support both
interfaces. The FIDO2 support is accessible via homectl's
--fido2-device= option.
* homectl's --pkcs11-uri= setting now accepts two special parameters:
if "auto" is specified and only one suitable PKCS#11 security token
is plugged in, its URI is automatically determined and enrolled for
unlocking the home directory. If "list" is specified a brief table of
suitable PKCS#11 security tokens is shown. Similarly, the new
--fido2-device= option also supports these two special values, for
automatically selecting and listing suitable FIDO2 devices.
* The /etc/crypttab tmp option now optionally takes an argument
selecting the file system to use. Moreover, the default is now
changed from ext2 to ext4.
* There's a new /etc/crypttab option "keyfile-erase". If specified the
key file listed in the same line is removed after use, regardless of
whether volume activation was successful. This is useful if the key
file is only acquired transiently at runtime and shall be erased
before the system continues to boot.
* There's also a new /etc/crypttab option "try-empty-password". If
specified, an attempt is made to unlock the volume with an empty
password before the user is asked for one. This is useful for
installing encrypted images whose password shall be set on first boot
instead of at installation time.
* systemd-cryptsetup will now automatically attempt to load the keys
for unlocking volumes from the files
/etc/cryptsetup-keys.d/<volume>.key and
/run/cryptsetup-keys.d/<volume>.key, if any of these files exist.
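The lookup can be pictured with the following Python sketch. It is not systemd-cryptsetup's code: the function name find_key_file() is hypothetical, and searching /etc/ before /run/ is an assumption made for the example.

```python
from pathlib import Path
from typing import Optional, Sequence

# Directories named in the entry above; the search order is an assumption.
KEY_DIRS = ("/etc/cryptsetup-keys.d", "/run/cryptsetup-keys.d")


def find_key_file(volume: str,
                  key_dirs: Sequence[str] = KEY_DIRS) -> Optional[Path]:
    """Return the first existing <volume>.key file, or None if there is none."""
    if not volume or "/" in volume:
        raise ValueError("volume name must be a single path component")
    for directory in key_dirs:
        candidate = Path(directory) / f"{volume}.key"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # "data" is a placeholder volume name.
    print(find_key_file("data"))
```

For a volume called "data" this checks /etc/cryptsetup-keys.d/data.key first and /run/cryptsetup-keys.d/data.key second.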
* systemd-cryptsetup may now activate Microsoft BitLocker volumes via
/etc/crypttab, during boot.
* logind.conf gained a new RuntimeDirectoryInodesMax= setting to
control the inode limit for the per-user $XDG_RUNTIME_DIR tmpfs
instance.
* A new generator systemd-xdg-autostart-generator has been added. It
generates systemd unit files from XDG autostart .desktop files, and
may be used to let the systemd user instance manage services that are
started automatically as part of the desktop session.
* "bootctl" gained a new verb "reboot-to-firmware" that may be used
to query and change the firmware's 'reboot into firmware' setup flag.
* systemd-firstboot gained a new switch --kernel-command-line= that may
be used to initialize the /etc/kernel/cmdline file of the image. It
also gained a new switch --root-password-hashed= which is like
--root-password= but accepts a pre-hashed UNIX password as
argument. The new option --delete-root-password may be used to unset
any password for the root user (dangerous!). The --root-shell= switch
may be used to control the shell to use for the root account. A new
--force option may be used to override any already set settings with
the parameters specified on the command line (by default, the tool
will not override what has already been set before, i.e. is purely
incremental).
* systemd-firstboot gained support for a new --image= switch, which is
similar to --root= but accepts the path to a disk image file, on
which it then operates.
* A new sd-path.h API has been added to libsystemd. It provides a
simple API for retrieving various search paths and primary
directories for various resources.
* A new call sd_notify_barrier() has been added to the sd-daemon.h
API. The call will block until all previously sent sd_notify()
messages have been processed by the service manager. This is useful
to remove races caused by a process already having disappeared at the
time a notification message is processed by the service manager,
making correct attribution impossible. The systemd-notify tool will
now make use of this call implicitly, but this can be turned off again
via the new --no-block switch.
* When sending a file descriptor (fd) to the service manager to keep
track of, using the sd_notify() mechanism, a new parameter FDPOLL=0
may be specified. If passed the service manager will refrain from
poll()ing on the file descriptor. Traditionally (and when the
parameter is not specified), the service manager will poll it for
POLLHUP or POLLERR events, and immediately close the fds in that
case.
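A rough Python sketch of how a service might hand over file descriptors with polling disabled follows. FDSTORE=1 and FDNAME= are the usual sd_notify() fields for the fd store and are not mentioned in the entry above; the function names fdstore_message() and store_fds() are hypothetical. A real service would normally call sd_pid_notify_with_fds() from libsystemd instead.

```python
import array
import os
import socket
from typing import Sequence


def fdstore_message(name: str, poll: bool = True) -> bytes:
    """Build the notification payload for storing fds under a name."""
    fields = ["FDSTORE=1", f"FDNAME={name}"]
    if not poll:
        # Ask the service manager not to poll() the stored fds.
        fields.append("FDPOLL=0")
    return "\n".join(fields).encode()


def store_fds(fds: Sequence[int], name: str, poll: bool = True) -> bool:
    """Send fds to the service manager; returns False if not run under one."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes an abstract namespace socket.
        path = "\0" + path[1:]
    # The fds travel as SCM_RIGHTS ancillary data next to the payload.
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.sendmsg([fdstore_message(name, poll)], ancillary)
    return True


if __name__ == "__main__":
    print(fdstore_message("cache", poll=False).decode())
```

Disabling polling is mainly of interest for fds on which POLLHUP or POLLERR does not mean that the fd has become useless.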
* The service manager (PID1) gained a new D-Bus method call
SetShowStatus() which may be used to control whether it shall show
boot-time status output on the console. This method has a similar
effect to sending SIGRTMIN+20/SIGRTMIN+21 to PID 1.
* The sd-bus API gained a number of convenience functions that take
va_list arguments rather than "...". For example, there's now
sd_bus_call_methodv() to match sd_bus_call_method(). Those calls make
it easier to build wrappers that accept variadic arguments and want
to pass a ready va_list structure to sd-bus.
* sd-bus vtable entries can have a new SD_BUS_VTABLE_ABSOLUTE_OFFSET
flag which alters how the userdata pointer to pass to the callbacks
is determined. When the flag is set, the offset field is converted
as-is into a pointer, without adding it to the object pointer the
vtable is associated with.
* sd-bus now exposes four new functions,
sd_bus_interface_name_is_valid(), sd_bus_service_name_is_valid(),
sd_bus_member_name_is_valid() and sd_bus_object_path_is_valid(), which
validate strings to check if they qualify as various D-Bus concepts.
* The sd-bus API gained the SD_BUS_METHOD_WITH_ARGS(),
SD_BUS_METHOD_WITH_ARGS_OFFSET() and SD_BUS_SIGNAL_WITH_ARGS() macros
that simplify adding argument names to D-Bus methods and signals.
* The man pages for the sd-bus and sd-hwdb APIs have been completed.
* Various D-Bus APIs of systemd daemons now have man pages that
document the methods, signals and properties.
* The expectations on user/group name syntax are now documented in
detail; documentation on how classic home directories may be
converted into home directories managed by homed has been added;
documentation regarding integration of homed/userdb functionality in
desktops has been added:
https://systemd.io/USER_NAMES
https://systemd.io/CONVERTING_TO_HOMED
https://systemd.io/USERDB_AND_DESKTOPS
* Documentation for the on-disk Journal file format has been updated
and has now moved to:
https://systemd.io/JOURNAL_FILE_FORMAT
* The interface for containers (https://systemd.io/CONTAINER_INTERFACE)
has been extended by a set of environment variables that expose
select fields from the host's os-release file to the container
payload. Similarly, the host's os-release files can be mounted into the
container underneath /run/host. Together, those mechanisms provide a
standardized way to expose information about the host to the
container payload. Both interfaces are implemented in systemd-nspawn.
* All D-Bus services shipped in systemd now implement the generic
LogControl1 D-Bus API which allows clients to change log level +
target of the service during runtime.
* Only relevant for developers: the mkosi.default symlink has been
dropped from version control. Please create a symlink to one of the
distribution-specific defaults in .mkosi/ based on your preference.
Contributions from: 24bisquitz, Adam Nielsen, Alan Perry, Alexander
Malafeev, Amitanand.Chikorde, Alin Popa, Alvin Šipraga, Amos Bird,
Andreas Rammhold, AndreRH, Andrew Doran, Anita Zhang, Ankit Jain,
antznin, Arnaud Ferraris, Arthur Moraes do Lago, Arusekk, Balaji
Punnuru, Balint Reczey, Bastien Nocera, bemarek, Benjamin Berg,
Benjamin Dahlhoff, Benjamin Robin, Chris Down, Chris Kerr, Christian
Göttsche, Christian Hesse, Christian Oder, Ciprian Hacman, Clinton Roy,
codicodi, Corey Hinshaw, Daan De Meyer, Dana Olson, Dan Callaghan,
Daniel Fullmer, Daniel Rusek, Dan Streetman, Dave Reisner, David
Edmundson, David Wood, Denis Pronin, Diego Escalante Urrelo, Dimitri
John Ledkov, dolphrundgren, duguxy, Einsler Lee, Elisei Roca, Emmanuel
Garette, Eric Anderson, Eric DeVolder, Evgeny Vereshchagin,
ExtinctFire, fangxiuning, Ferran Pallarès Roca, Filipe Brandenburger,
Filippo Falezza, Finn, Florian Klink, Florian Mayer, Franck Bui,
Frantisek Sumsal, gaurav, Georg Müller, Gergely Polonkai, Giedrius
Statkevičius, Gigadoc2, gogogogi, Gaurav Singh, gzjsgdsb, Hans de
Goede, Haochen Tong, ianhi, ignapk, Jakov Smolic, James T. Lee, Jan
Janssen, Jan Klötzke, Jan Palus, Jay Burger, Jeremy Cline, Jérémy
Rosen, Jian-Hong Pan, Jiri Slaby, Joel Shapiro, Joerg Behrmann, Jörg
Thalheim, Jouke Witteveen, Kai-Heng Feng, Kenny Levinsen, Kevin
Kuehler, Kumar Kartikeya Dwivedi, layderv, laydervus, Lénaïc Huard,
Lennart Poettering, Lidong Zhong, Luca Boccassi, Luca BRUNO, Lucas
Werkmeister, Lukas Klingsbo, Lukáš Nykrýn, Łukasz Stelmach, Maciej
S. Szmigiero, MadMcCrow, Marc-André Lureau, Marcel Holtmann, Marc
Kleine-Budde, Martin Hundebøll, Matthew Leeds, Matt Ranostay, Maxim
Fomin, MaxVerevkin, Michael Biebl, Michael Chapman, Michael Gubbels,
Michael Marley, Michał Bartoszkiewicz, Michal Koutný, Michal Sekletár,
Mike Gilbert, Mike Kazantsev, Mikhail Novosyolov, ml, Motiejus Jakštys,
nabijaczleweli, nerdopolis, Niccolò Maggioni, Niklas Hambüchen, Norbert
Lange, Paul Cercueil, pelzvieh, Peter Hutterer, Piero La Terza, Pieter
Lexis, Piotr Drąg, Rafael Fontenelle, Richard Petri, Ronan Pigott, Ross
Lagerwall, Rubens Figueiredo, satmandu, Sean-StarLabs, Sebastian
Jennen, sterlinghughes, Surhud More, Susant Sahani, szb512, Thomas
Haller, Tobias Hunger, Tom, Tomáš Pospíšek, Tomer Shechner, Tom Hughes,
Topi Miettinen, Tudor Roman, Uwe Kleine-König, Valery0xff, Vito Caputo,
Vladimir Panteleev, Vladyslav Tronko, Wen Yang, Yegor Vialov, Yigal
Korman, Yi Gao, YmrDtnJu, Yuri Chornoivan, Yu Watanabe, Zbigniew
Jędrzejewski-Szmek, Zhu Li, Дамјан Георгиевски, наб
– Warsaw, 2020-07-30
CHANGES WITH 245:
* A new tool "systemd-repart" has been added, that operates as an
idempotent declarative repartitioner for GPT partition tables.
Specifically, a set of partitions that must or may exist can be
configured via drop-in files, and during every boot the partition
table on disk is compared with these files, creating missing
partitions or growing existing ones based on configurable relative
and absolute size constraints. The tool is strictly incremental,
i.e. does not delete, shrink or move partitions, but only adds and
grows them. The primary use-case is OS images that ship in minimized
form, that on first boot are grown to the size of the underlying
block device or augmented with additional partitions. For example,
the root partition could be extended to cover the whole disk, or a
swap or /home partition could be added on first boot. It can also be
used for systems that use an A/B update scheme but ship images with
just the A partition, with B added on first boot. The tool is
primarily intended to be run in the initrd, shortly before
transitioning into the host OS, but can also be run after the
transition took place. It automatically discovers the disk backing
the root file system, and should hence not require any additional
configuration besides the partition definition drop-ins. If no
configuration drop-ins are present, no action is taken.
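The "strictly incremental" behaviour can be modelled with a toy Python planner. This is only an illustration of the idea and not the tool's algorithm: partitions are identified by a plain name, sizes are minimum sizes in bytes, and the function name plan_changes() is hypothetical.

```python
from typing import Dict, List, Tuple

GiB = 1024 ** 3


def plan_changes(existing: Dict[str, int],
                 wanted: Dict[str, int]) -> List[Tuple[str, str, int]]:
    """Compare on-disk partitions with definitions and list the changes.

    Only "create" and "grow" actions are ever produced; partitions that
    are already large enough, or that no definition mentions, are left
    untouched.
    """
    actions: List[Tuple[str, str, int]] = []
    for name, min_size in wanted.items():
        current = existing.get(name)
        if current is None:
            actions.append(("create", name, min_size))
        elif current < min_size:
            actions.append(("grow", name, min_size))
        # current >= min_size: nothing to do, never shrink.
    return actions


if __name__ == "__main__":
    on_disk = {"root": 2 * GiB}
    definitions = {"root": 4 * GiB, "swap": 1 * GiB}
    for action in plan_changes(on_disk, definitions):
        print(action)
```

Running the planner a second time on the resulting layout yields no actions, which is the sense in which the operation is idempotent.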
* A new component "userdb" has been added, along with a small daemon
"systemd-userdbd.service" and a client tool "userdbctl". The framework
allows defining rich user and group records in a JSON format,
extending on the classic "struct passwd" and "struct group"
structures. Various components in systemd have been updated to
process records in this format, including systemd-logind and
pam-systemd. The user records are intended to be extensible, and
allow setting various resource management, security and runtime
parameters that shall be applied to processes and sessions of the
user as they log in. This facility is intended to allow associating
such metadata directly with user/group records so that they can be
produced, extended and consumed in unified form. We hope that
eventually frameworks such as sssd will generate records this way, so
that for the first time resource management and various other
per-user settings can be configured in LDAP directories and then
provided to systemd (specifically to systemd-logind and pam-systemd)
to apply on login. For further details see:
https://systemd.io/USER_RECORD
https://systemd.io/GROUP_RECORD
https://systemd.io/USER_GROUP_API
* A small new service systemd-homed.service has been added, that may be
used to securely manage home directories with built-in encryption.
The complete user record data is unified with the home directory,
thus making home directories naturally migratable. Its primary
back-end is based on LUKS volumes, but fscrypt, plain directories,
and other storage schemes are also supported. This solves a couple of
problems we saw with traditional ways to manage home directories, in
particular when it comes to encryption. For further discussion of
this, see the video of Lennart's talk at AllSystemsGo! 2019:
https://media.ccc.de/v/ASG2019-164-reinventing-home-directories
For further details about the format and expectations on home
directories this new daemon makes, see:
https://systemd.io/HOME_DIRECTORY
* systemd-journald is now multi-instantiable. In addition to the main
instance systemd-journald.service there's now a template unit
systemd-journald@.service, with each instance defining a new named
log 'namespace' (whose name is specified via the instance part of the
unit name). A new unit file setting LogNamespace= has been added,
taking such a namespace name, that assigns services to the specified
log namespaces. As each log namespace is serviced by its own
independent journal daemon, this functionality may be used to improve
performance and increase isolation of applications, at the price of
losing global message ordering. Each instance of journald has a
separate set of configuration files, with possibly different disk
usage limitations and other settings.
journalctl now takes a new option --namespace= to show logs from a
specific log namespace. The sd-journal.h API gained
sd_journal_open_namespace() for opening the log stream of a specific
log namespace. systemd-journald also gained the ability to exit on
idle, which is useful in the context of log namespaces, as this means
log daemons for log namespaces can be activated automatically on
demand and will stop automatically when no longer used, minimizing
resource usage.
* When systemd-tmpfiles copies a file tree using the 'C' line type it
will now label every copied file according to the SELinux database.
* When systemd/PID 1 detects it is used in the initrd it will now boot
into initrd.target rather than default.target by default. This should
make it simpler to build initrds with systemd as for many cases the
only difference between a host OS image and an initrd image now is
the presence of the /etc/initrd-release file.
* A new kernel command line option systemd.cpu_affinity= is now
understood. It's equivalent to the CPUAffinity= option in
/etc/systemd/system.conf and allows setting the CPU mask for PID 1
itself and the default for all other processes.
* When systemd/PID 1 is reloaded (with systemctl daemon-reload or
equivalent), the SELinux database is now reloaded, ensuring that
sockets and other file system objects are generated taking the new
database into account.
* systemd/PID 1 accepts a new "systemd.show-status=error" setting, and
"quiet" has been changed to imply that instead of
"systemd.show-status=auto". In this mode, only messages about errors
and significant delays in boot are shown on the console.
* The sd-event.h API gained native support for the new Linux "pidfd"
concept. This permits watching processes using file descriptors
instead of PID numbers, which fixes a number of races and makes
process supervision more robust and efficient. All of systemd's
components will now use pidfds if the kernel supports it for process
watching, with the exception of PID 1 itself, unfortunately. We hope
to move PID 1 to exclusively using pidfds too eventually, but this
requires some more kernel work first. (Background: PID 1 watches
processes using waitid() with the P_ALL flag, and that does not play
together nicely with pidfds yet.)
* Closely related to this, the sd-event.h API gained two new calls
sd_event_source_send_child_signal() (for sending a signal to a
watched process) and sd_event_source_set_child_process_own() (for
marking a process so that it is killed automatically whenever the
event source watching it is freed).
* systemd-networkd gained support for configuring Token Bucket Filter
(TBF) parameters in its qdisc configuration support. Similarly,
support for Stochastic Fairness Queuing (SFQ), Controlled-Delay
Active Queue Management (CoDel), and Fair Queue (FQ) has been added.
* systemd-networkd gained support for Intermediate Functional Block
(IFB) network devices.
* systemd-networkd gained support for configuring multi-path IP routes,
using the new MultiPathRoute= setting in the [Route] section.
* systemd-networkd's DHCPv4 client has been updated to support a new
SendDecline= option. If enabled, duplicate address detection is done
after a DHCP offer is received from the server. If a conflict is
detected, the address is declined. The DHCPv4 client also gained
support for a new RouteMTUBytes= setting that allows configuring the
MTU size to be used for routes generated from DHCPv4 leases.
* The PrefixRoute= setting in systemd-networkd's [Address] section of
.network files has been deprecated, and replaced by AddPrefixRoute=,
with its sense inverted.
* The Gateway= setting of [Route] sections of .network files gained
support for a special new value "_dhcp". If set, the configured
static route uses the gateway host configured via DHCP.
* New User= and SuppressPrefixLength= settings have been implemented
for the [RoutingPolicyRule] section of .network files to configure
source routing based on UID ranges and prefix length, respectively.
* The Type= match property of .link files has been generalized to
always match the device type shown by 'networkctl status', even for
devices where udev does not set DEVTYPE=. This allows e.g. Type=ether
to be used.
* sd-bus gained a new API call sd_bus_message_sensitive() that marks a
D-Bus message object as "sensitive". Those objects are erased from
memory when they are freed. This concept is intended to be used for
messages that contain security sensitive data. A new flag
SD_BUS_VTABLE_SENSITIVE has been introduced as well to mark methods
in sd-bus vtables, causing any incoming and outgoing messages of
those methods to be implicitly marked as "sensitive".
* sd-bus gained a new API call sd_bus_message_dump() for dumping the
contents of a message (or parts thereof) to standard output for
debugging purposes.
* systemd-sysusers gained support for creating users with the primary
group named differently than the user.
* systemd-growfs (i.e. the x-systemd.growfs mount option in /etc/fstab)
gained support for growing XFS partitions. Previously it supported
only ext4 and btrfs partitions.
* The support for /etc/crypttab gained a new x-initrd.attach option. If
set, the specified encrypted volume is unlocked already in the
initrd. This concept corresponds to the x-initrd.mount option in
/etc/fstab.
* systemd-cryptsetup gained native support for unlocking encrypted
volumes utilizing PKCS#11 smartcards, i.e. for example to bind
encryption of volumes to YubiKeys. This is exposed in the new
pkcs11-uri= option in /etc/crypttab.
* The /etc/fstab support in systemd now supports two new mount options
x-systemd.{required,wanted}-by=, for explicitly configuring the units
that the specified mount shall be pulled in by, in place of
the usual local-fs.target/remote-fs.target.
* The https://systemd.io/ web site has been relaunched, directly
populated with most of the documentation included in the systemd
repository. systemd also acquired a new logo, thanks to Tobias
Bernard.
* systemd-udevd gained support for managing "alternative" network
interface names, as supported by new Linux kernels. For the first
time this permits assigning multiple (and longer!) names to a network
interface. systemd-udevd will now by default assign the names
generated via all supported naming schemes to each interface. This
may be further tweaked with .link files and the AlternativeName= and
AlternativeNamesPolicy= settings. Other components of systemd have
been updated to support the new alternative names wherever
appropriate. For example, systemd-nspawn will now generate
alternative interface names for the host-facing side of container
veth links based on the full container name without truncation.
* systemd-nspawn interface naming logic has been updated in another way
too: if the main interface name (i.e. as opposed to new-style
"alternative" names) based on the container name is truncated, a
simple hashing scheme is used to give different interface names to
multiple containers whose names all begin with the same prefix. Since
this changes the primary interface names pointing to containers if
truncation happens, the old scheme may still be requested by
selecting an older naming scheme, via the net.naming-scheme= kernel
command line option.
* PrivateUsers= in service files now works in services run by the
systemd --user per-user instance of the service manager.
* A new per-service sandboxing option ProtectClock= has been added that
locks down write access to the system clock. It takes away device
node access to /dev/rtc as well as the system calls that set the
system clock and the CAP_SYS_TIME and CAP_WAKE_ALARM capabilities.
Note that this option does not affect access to auxiliary services
that allow changing the clock, for example access to
systemd-timedated.
* The systemd-id128 tool gained a new "show" verb for listing or
resolving a number of well-known UUIDs/128bit IDs, currently mostly
GPT partition table types.
* The Discoverable Partitions Specification has been updated to support
/var and /var/tmp partition discovery. Support for this has been
added to systemd-gpt-auto-generator. For details see:
https://systemd.io/DISCOVERABLE_PARTITIONS
* "systemctl list-unit-files" has been updated to show a new column
with the suggested enablement state based on the vendor preset files
for the respective units.
* "systemctl" gained a new option "--with-dependencies". If specified,
commands such as "systemctl status" or "systemctl cat" will now show
all specified units along with all units they depend on.
* networkctl gained support for showing per-interface logs in its
"status" output.
* systemd-networkd-wait-online gained support for specifying the maximum
operational state to wait for, and to wait for interfaces to
disappear.
* The [Match] section of .link and .network files now supports a new
option PermanentMACAddress= which may be used to check against the
permanent MAC address of a network device even if a randomized MAC
address is used.
* The [TrafficControlQueueingDiscipline] section in .network files has
been renamed to [NetworkEmulator] with the "NetworkEmulator" prefix
dropped from the individual setting names.
* Any .link and .network files that have an empty [Match] section (this
also includes empty and commented-out files) will now be
rejected. systemd-udevd and systemd-networkd started warning about
such files in version 243.
* systemd-logind will now validate access to the operation of changing
the virtual terminal via a polkit action. By default, only users
with at least one session on a local VT are granted permission.
* When systemd sets up PAM sessions that invoked service processes
shall run in, the pam_setcred() API is now invoked, thus permitting
PAM modules to set additional credentials for the processes.
* portablectl attach/detach verbs now accept --now and --enable options
to combine attachment with enablement and invocation, or detachment
with stopping and disablement.
* UPGRADE ISSUE: a bug where some jobs were trimmed as redundant was
fixed, which in turn exposed bugs in unit configuration of services
which have Type=oneshot and should only run once, but do not have
RemainAfterExit=yes set. Without RemainAfterExit=yes, a one-shot
service may be started again after exiting successfully, for example
as a dependency in another transaction. Affected services included
some internal systemd services (most notably
systemd-vconsole-setup.service, which was updated to have
RemainAfterExit=yes), and plymouth-start.service. Please ensure that
plymouth has been suitably updated or patched before upgrading to
this systemd release. See
https://bugzilla.redhat.com/show_bug.cgi?id=1807771 for some
additional discussion.
Contributions from: AJ Bagwell, Alin Popa, Andreas Rammhold, Anita
Zhang, Ansgar Burchardt, Antonio Russo, Arian van Putten, Ashley Davis,
Balint Reczey, Bart Willems, Bastien Nocera, Benjamin Dahlhoff, Charles
(Chas) Williams, cheese1, Chris Down, Chris Murphy, Christian Ehrhardt,
Christian Göttsche, cvoinf, Daan De Meyer, Daniele Medri, Daniel Rusek,
Daniel Shahaf, Dann Frazier, Dan Streetman, Dariusz Gadomski, David
Michael, Dimitri John Ledkov, Emmanuel Bourg, Evgeny Vereshchagin,
ezst036, Felipe Sateler, Filipe Brandenburger, Florian Klink, Franck
Bui, Fran Dieguez, Frantisek Sumsal, Greg "GothAck" Miell, Guilhem
Lettron, Guillaume Douézan-Grard, Hans de Goede, HATAYAMA Daisuke, Iain
Lane, James Buren, Jan Alexander Steffens (heftig), Jérémy Rosen, Jin
Park, Jun'ichi Nomura, Kai Krakow, Kevin Kuehler, Kevin P. Fleming,
Lennart Poettering, Leonid Bloch, Leonid Evdokimov, lothrond, Luca
Boccassi, Lukas K, Lynn Kirby, Mario Limonciello, Mark Deneen, Matthew
Leeds, Michael Biebl, Michal Koutný, Michal Sekletár, Mike Auty, Mike
Gilbert, mtron, nabijaczleweli, Naïm Favier, Nate Jones, Norbert Lange,
Oliver Giles, Paul Davey, Paul Menzel, Peter Hutterer, Piotr Drąg, Rafa
Couto, Raphael, rhn, Robert Scheck, Rocka, Romain Naour, Ryan Attard,
Sascha Dewald, Shengjing Zhu, Slava Kardakov, Spencer Michaels, Sylvain
Plantefeve, Stanislav Angelovič, Susant Sahani, Thomas Haller, Thomas
Schmitt, Timo Schlüßler, Timo Wilken, Tobias Bernard, Tobias Klauser,
Tobias Stoeckmann, Topi Miettinen, tsia, WataruMatsuoka, Wieland
Hoffmann, Wilhelm Schuster, Will Fleming, xduugu, Yong Cong Sin, Yuri
Chornoivan, Yu Watanabe, Zach Smith, Zbigniew Jędrzejewski-Szmek, Zeyu
DONG
– Warsaw, 2020-03-06
CHANGES WITH 244:
* Support for the cpuset cgroups v2 controller has been added.
Processes may be restricted to specific CPUs using the new
AllowedCPUs= setting, and to specific memory NUMA nodes using the new
AllowedMemoryNodes= setting.
* The signal used in restart jobs (as opposed to e.g. stop jobs) may
now be configured using a new RestartKillSignal= setting. This
allows units which use signals to request termination to implement
different behaviour when stopping in preparation for a restart.
* "systemctl clean" may now be used also for socket, mount, and swap
units.
* systemd will also read configuration options from the EFI variable
SystemdOptions. This may be used to configure systemd behaviour when
modifying the kernel command line is inconvenient, but configuration
on disk is read too late, for example for the options related to
cgroup hierarchy setup. 'bootctl systemd-efi-options' may be used to
set the EFI variable.
* systemd will now disable printk ratelimits in early boot. This should
allow us to capture more logs from the early boot phase where normal
storage is not available and the kernel ring buffer is used for
logging. Configuration on the kernel command line has higher priority
and overrides the systemd setting.
systemd programs which log to /dev/kmsg directly use internal
ratelimits to prevent runaway logging. (Normally this is only used
during early boot, so in practice this change has very little
effect.)
* Unit files now support top level dropin directories of the form
<unit_type>.d/ (e.g. service.d/) that may be used to add configuration
that affects all corresponding unit files.
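
As a rough sketch of the lookup this enables, the helper below lists the
drop-in directory names that would apply to a given unit name: the
type-wide directory first, then the per-unit one. The function name is
hypothetical, and both the ordering and the omission of the search paths
are simplifying assumptions rather than a description of the actual
implementation.

```python
def dropin_dir_names(unit):
    """Return drop-in directory names for a unit, most generic first.

    Illustrative only: the real lookup also walks several search paths,
    which this sketch leaves out.
    """
    # The unit type is whatever follows the last dot, e.g. "service".
    _, _, unit_type = unit.rpartition(".")
    return [f"{unit_type}.d", f"{unit}.d"]


if __name__ == "__main__":
    # Hypothetical unit name, used purely for demonstration.
    print(dropin_dir_names("foo.service"))
```

A setting placed in service.d/ would thus be seen by every service unit,
while foo.service.d/ keeps affecting only foo.service.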
* systemctl gained support for 'stop --job-mode=triggering' which will
stop the specified unit and any units which could trigger it.
* Unit status display now includes units triggering and triggered by
the unit being shown.
* The RuntimeMaxSec= setting is now supported by scopes, not just
.service units. This is particularly useful for PAM sessions which
create a scope unit for the user login. The systemd.runtime_max_sec=
setting may be used with the pam_systemd module to limit the duration
of the PAM session, for example for time-limited logins.
* A new @pkey system call group is now defined to make it easier to
allow-list memory protection syscalls for containers and services
which need to use them.
* systemd-udevd: removed the 30s timeout for killing stale workers on
exit. systemd-udevd now waits for workers to finish. The hard-coded
exit timeout of 30s was too short for some large installations, where
driver initialization could be prematurely interrupted during initrd
processing if the root file system had been mounted and init was
preparing to switch root. If udevd is run without systemd and workers
are hanging while udevd receives an exit signal, udevd will now exit
when udev.event_timeout is reached for the last hanging worker. With
systemd, the exit timeout can additionally be configured using
TimeoutStopSec= in systemd-udevd.service.
* udev now provides a program (fido_id) that identifies FIDO CTAP1
("U2F")/CTAP2 security tokens based on the usage declared in their
report descriptor and outputs suitable environment variables.
This replaces the externally maintained allow lists of all known
security tokens that were used previously.
* Automatically generated autosuspend udev rules for allow-listed
devices have been imported from the Chromium OS project. This should
improve power saving with many more devices.
* udev gained a new "CONST{key}=value" setting that allows matching
against system-wide constants without forking a helper binary.
Currently "arch" and "virt" keys are supported.
* udev now opens CDROMs in non-exclusive mode when querying their
capabilities. This should fix issues where other programs trying to
use the CDROM cannot gain access to it, but carries a risk of
interfering with programs writing to the disk, if they did not open
the device in exclusive mode as they should.
* systemd-networkd does not create a default route for IPv4 link local
addressing anymore. The creation of the route was unexpected and was
breaking routing in various cases, but people who rely on it being
created implicitly will need to adjust. Such a route may be requested
with DefaultRouteOnDevice=yes.
Similarly, systemd-networkd will not assign a link-local IPv6 address
when IPv6 link-local routing is not enabled.
* Receive and transmit buffers may now be configured on links with
the new RxBufferSize= and TxBufferSize= settings.
* systemd-networkd may now advertise additional IPv6 routes. A new
[IPv6RoutePrefix] section with Route= and LifetimeSec= options is
now supported.
* systemd-networkd may now configure "next hop" routes using the
[NextHop] section and Gateway= and Id= settings.
* systemd-networkd will now retain DHCP config on restarts by default
(but this may be overridden using the KeepConfiguration= setting).
The default for SendRelease= has been changed to true.
* The DHCPv4 client now uses the OPTION_INFORMATION_REFRESH_TIME option
received from the server.
The client will use the received SIP server list if UseSIP=yes is
set.
The client may be configured to request specific options from the
server using a new RequestOptions= setting.
The client may be configured to send arbitrary options to the server
using a new SendOption= setting.
A new IPServiceType= setting has been added to configure the "IP
service type" value used by the client.
* The DHCPv6 client learnt a new PrefixDelegationHint= option to
request prefix hints in the DHCPv6 solicitation.
* The DHCPv4 server may be configured to send arbitrary options using
a new SendOption= setting.
* The DHCPv4 server may now be configured to emit SIP server list using
the new EmitSIP= and SIP= settings.
* systemd-networkd and networkctl may now renew DHCP leases on demand.
networkctl has a new 'networkctl renew' verb.
* systemd-networkd may now reconfigure links on demand. networkctl
gained two new verbs: "reload" will reload the configuration, and
"reconfigure DEVICE…" will reconfigure one or more devices.
* .network files may now match on SSID and BSSID of a wireless network,
i.e. the access point name and hardware address using the new SSID=
and BSSID= options. networkctl will display the current SSID and
BSSID for wireless links.
.network files may also match on the wireless network type using the
new WLANInterfaceType= option.
* systemd-networkd now includes default configuration that enables
link-local addressing when connected to an ad-hoc wireless network.
* systemd-networkd may configure the Traffic Control queueing
disciplines in the kernel using the new
[TrafficControlQueueingDiscipline] section and Parent=,
NetworkEmulatorDelaySec=, NetworkEmulatorDelayJitterSec=,
NetworkEmulatorPacketLimit=, NetworkEmulatorLossRate=,
NetworkEmulatorDuplicateRate= settings.
* systemd-tmpfiles gained a new w+ setting to append to files.
* systemd-analyze dump will now report when the memory configuration in
the kernel does not match what systemd has configured (usually,
because some external program has modified the kernel configuration
on its own).
* systemd-analyze gained a new --base-time= switch that instructs the
'calendar' verb to resolve times relative to that timestamp instead
of the present time.
* journalctl --update-catalog now produces deterministic output (making
reproducible image builds easier).
* A new devicetree-overlay setting is now documented in the Boot Loader
Specification.
* The default value of the WatchdogSec= setting used in systemd
services (the ones bundled with the project itself) may be set at
configuration time using the -Dservice-watchdog= setting. If set to
empty, the watchdogs will be disabled.
* systemd-resolved validates IP addresses in certificates now when GnuTLS
is being used.
* libcryptsetup >= 2.0.1 is now required.
* A configuration option -Duser-path= may be used to override the $PATH
used by the user service manager. The default is again to use the same
path as the system manager.
* The systemd-id128 tool gained a new switch "-u" (or "--uuid") for
outputting the 128bit IDs in UUID format (i.e. in the "canonical
representation").
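
For illustration, the UUID form is the usual 8-4-4-4-12 grouping of the
same 32 hexadecimal digits. A minimal Python sketch of the conversion
follows; the helper name is hypothetical and this is not the tool's own
code.

```python
import uuid


def id128_to_uuid(hex_id):
    """Format a 32-digit hexadecimal 128bit ID as 8-4-4-4-12 UUID text."""
    return str(uuid.UUID(hex=hex_id))


if __name__ == "__main__":
    # Arbitrary sample ID, not a well-known one.
    print(id128_to_uuid("0123456789abcdef0123456789abcdef"))
```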
* Service units gained a new sandboxing option ProtectKernelLogs= which
makes sure the program cannot get direct access to the kernel log
buffer anymore, i.e. the syslog() system call (not to be confused
with the API of the same name in libc, which is not affected), the
/proc/kmsg and /dev/kmsg nodes and the CAP_SYSLOG capability are made
inaccessible to the service. It's recommended to enable this setting
for all services that should not be able to read from or write to the
kernel log buffer, which are probably almost all.
Contributions from: Aaron Plattner, Alcaro, Anita Zhang, Balint Reczey,
Bastien Nocera, Baybal Ni, Benjamin Bouvier, Benjamin Gilbert, Carlo
Teubner, cbzxt, Chen Qi, Chris Down, Christian Rebischke, Claudio
Zumbo, ClydeByrdIII, crashfistfight, Cyprien Laplace, Daniel Edgecumbe,
Daniel Gorbea, Daniel Rusek, Daniel Stuart, Dan Streetman, David
Pedersen, David Tardon, Dimitri John Ledkov, Dominique Martinet, Donald
A. Cupp Jr, Evgeny Vereshchagin, Fabian Henneke, Filipe Brandenburger,
Franck Bui, Frantisek Sumsal, Georg Müller, Hans de Goede, Haochen
Tong, HATAYAMA Daisuke, Iwan Timmer, Jan Janssen, Jan Kundrát, Jan
Synacek, Jan Tojnar, Jay Strict, Jérémy Rosen, Jóhann B. Guðmundsson,
Jonas Jelten, Jonas Thelemann, Justin Trudell, J. Xing, Kai-Heng Feng,
Kenneth D'souza, Kevin Becker, Kevin Kuehler, Lennart Poettering,
Léonard Gérard, Lorenz Bauer, Luca Boccassi, Maciej Stanczew, Mario
Limonciello, Marko Myllynen, Mark Stosberg, Martin Wilck, matthiasroos,
Michael Biebl, Michael Olbrich, Michael Tretter, Michal Sekletar,
Michal Sekletár, Michal Suchanek, Mike Gilbert, Mike Kazantsev, Nicolas
Douma, nikolas, Norbert Lange, pan93412, Pascal de Bruijn, Paul Menzel,
Pavel Hrdina, Peter Wu, Philip Withnall, Piotr Drąg, Rafael Fontenelle,
Renaud Métrich, Riccardo Schirone, RoadrunnerWMC, Ronan Pigott, Ryan
Attard, Sebastian Wick, Serge, Siddharth Chandrasekara, Steve Ramage,
Steve Traylen, Susant Sahani, Thibault Nélis, Tim Teichmann, Tom
Fitzhenry, Tommy J, Torsten Hilbrich, Vito Caputo, ypf791, Yu Watanabe,
Zach Smith, Zbigniew Jędrzejewski-Szmek
– Warsaw, 2019-11-29
CHANGES WITH 243:
* This release enables unprivileged programs (i.e. requiring neither
setuid nor file capabilities) to send ICMP Echo (i.e. ping) requests
by turning on the "net.ipv4.ping_group_range" sysctl of the Linux
kernel for the whole UNIX group range, i.e. all processes. This
change should be reasonably safe, as the kernel support for it was
specifically implemented to allow safe access to ICMP Echo for
processes lacking any privileges. If this is not desirable, it can be
disabled again by setting the parameter to "1 0".
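
To illustrate how the two-number value of this sysctl reads: it holds an
inclusive range of group IDs, and a range whose lower bound exceeds its
upper bound (such as "1 0") matches no group at all. The helper below is
hypothetical and only models the range check for a single GID, not the
kernel's implementation; "0 2147483647" is an assumed example of a range
covering all groups.

```python
def ping_allowed(range_value, gid):
    """Return True if gid lies inside the inclusive "low high" range."""
    low, high = (int(field) for field in range_value.split())
    return low <= gid <= high


if __name__ == "__main__":
    # Assumed example of a range that covers every group.
    print(ping_allowed("0 2147483647", 1000))
    # The value named in the text for turning the feature off again.
    print(ping_allowed("1 0", 1000))
```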
* Previously, filters defined with SystemCallFilter= would have the
effect that any calling of an offending system call would terminate
the calling thread. This behaviour never made much sense, since
killing individual threads of unsuspecting processes is likely to
create more problems than it solves. With this release the default
action changed from killing the thread to killing the whole
process. For this to work correctly both a kernel version (>= 4.14)
and a libseccomp version (>= 2.4.0) supporting this new seccomp
action are required. If an older kernel or libseccomp is used, the old
behaviour continues to be used. This change does not affect any
services that have no system call filters defined, or that use
SystemCallErrorNumber= (and thus see EPERM or another error instead
of being killed when calling an offending system call). Note that
systemd documentation always claimed that the whole process is
killed. With this change behaviour is thus adjusted to match the
documentation.
* On 64 bit systems, the "kernel.pid_max" sysctl is now bumped to
4194304 by default, i.e. the full 22bit range the kernel allows, up
from the old 15bit range (32768). This should improve security and
robustness, as PID collisions are made less likely (though certainly
still possible). There are rumours this might create compatibility
problems, though at this moment no practical ones are known to
us. Downstream distributions are hence advised to undo this change in
their builds if they are concerned about maximum compatibility, but
for everybody else we recommend leaving the value bumped. Besides
improving security and robustness this should also simplify things as
the maximum number of allowed concurrent tasks was previously bounded
by both "kernel.pid_max" and "kernel.threads-max" and now effectively
only a single knob is left ("kernel.threads-max"). There have been
concerns that usability is affected by this change because larger PID
numbers are harder to type, but we believe the change from 5 digits
to 7 digits doesn't hamper usability.
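
The arithmetic behind the numbers quoted above can be checked with a few
lines of Python. The previous limit of 32768 is an assumption based on
the common kernel default; the text itself only gives the new value.

```python
NEW_PID_MAX = 1 << 22   # the 22bit range named in the text
OLD_PID_MAX = 32768     # assumed previous default, not stated in the text


def digits(number):
    """Number of decimal digits needed to write a non-negative integer."""
    return len(str(number))


if __name__ == "__main__":
    # The largest PID that can be handed out is one below the limit.
    print(NEW_PID_MAX, digits(NEW_PID_MAX - 1))
    print(OLD_PID_MAX, digits(OLD_PID_MAX - 1))
```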
* MemoryLow= and MemoryMin= gained hierarchy-aware counterparts,
DefaultMemoryLow= and DefaultMemoryMin=, which can be used to
hierarchically set default memory protection values for a particular
subtree of the unit hierarchy.
* Memory protection directives can now take a value of zero, allowing
explicit opting out of a default value propagated by an ancestor.
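
The lookup described in the two entries above can be pictured roughly as
follows. This is a simplified model with hypothetical class and function
names, assuming that an explicitly set MemoryLow= (including zero) wins,
that otherwise the nearest ancestor's DefaultMemoryLow= applies, and that
the fallback is no protection. It is not the service manager's actual
code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    name: str
    parent: Optional["Unit"] = None
    memory_low: Optional[int] = None          # MemoryLow=, None if unset
    default_memory_low: Optional[int] = None  # DefaultMemoryLow=, None if unset


def effective_memory_low(unit):
    """Resolve the protection value that applies to a unit, in bytes."""
    # An explicit value, including zero, takes precedence.
    if unit.memory_low is not None:
        return unit.memory_low
    # Otherwise use the default of the nearest ancestor that sets one.
    ancestor = unit.parent
    while ancestor is not None:
        if ancestor.default_memory_low is not None:
            return ancestor.default_memory_low
        ancestor = ancestor.parent
    return 0


# Hypothetical unit tree used for demonstration.
MIB = 1024 * 1024
workload = Unit("workload.slice", default_memory_low=512 * MIB)
inherits = Unit("a.service", parent=workload)
opts_out = Unit("b.service", parent=workload, memory_low=0)
```

Here a.service picks up the 512M default from its slice, while
b.service opts out of it by setting zero explicitly.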
* systemd now defaults to the "unified" cgroup hierarchy setup during
build-time, i.e. -Ddefault-hierarchy=unified is now the build-time
default. Previously, -Ddefault-hierarchy=hybrid was the default. This
change reflects the fact that cgroupsv2 support has matured
substantially in both systemd and in the kernel, and is clearly the
way forward. Downstream production distributions might want to
continue to use -Ddefault-hierarchy=hybrid (or even =legacy) for
their builds as unfortunately the popular container managers have not
caught up with the kernel API changes.
* Man pages are not built by default anymore (html pages were already
disabled by default), to make development builds quicker. When
building systemd for a full installation with documentation, meson
should be called with -Dman=true and/or -Dhtml=true as appropriate.
The default was changed based on the assumption that quick one-off or
repeated development builds are much more common than full optimized
builds for installation, and people need to pass various other
options when doing "proper" builds anyway, so the gain from making
development builds quicker is bigger than the one time disruption for
packagers.
Two scripts are created in the *build* directory to generate and
preview man and html pages on demand, e.g.:
build/man/man systemctl
build/man/html systemd.index
* libidn2 is used by default if both libidn2 and libidn are installed.
Please use -Dlibidn=true if libidn is preferred.
* The D-Bus "wire format" of the CPUAffinity= attribute is changed on
big-endian machines. Before, bytes were written and read in native
machine order as exposed by the native libc __cpu_mask interface.
Now, little-endian order is always used (CPUs 0–7 are described by
bits 0–7 in byte 0, CPUs 8–15 are described by byte 1, and so on).
This change fixes D-Bus calls that cross an endianness boundary.
The presentation format used for CPUAffinity= by "systemctl show" and
"systemd-analyze dump" is changed to present CPU indices instead of
the raw __cpu_mask bitmask. For example, CPUAffinity=0-1 used to be
shown as CPUAffinity=03000000000000000000000000000… (on
little-endian) or CPUAffinity=00000000000000300000000000000… (on
64-bit big-endian), and is now shown as CPUAffinity=0-1, matching the
input format. The maximum integer that will be printed in the new
format is 8191 (four digits), while the old format always used a very
long number (with the length varying by architecture), so they can be
unambiguously distinguished.
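
The byte layout and the index-based presentation described above can be
sketched in a few lines. The function names are hypothetical, and the
space used to separate multiple ranges is an assumption; only the bit
layout and the "0-1" form are taken from the text.

```python
def cpus_to_wire(cpus):
    """Encode CPU indices as a little-endian bitmask.

    CPU n is stored as bit (n % 8) of byte (n // 8).
    """
    cpus = set(cpus)
    mask = bytearray(max(cpus) // 8 + 1 if cpus else 0)
    for cpu in cpus:
        mask[cpu // 8] |= 1 << (cpu % 8)
    return bytes(mask)


def wire_to_cpus(mask):
    """Decode a little-endian bitmask back into sorted CPU indices."""
    return [
        index * 8 + bit
        for index, byte in enumerate(mask)
        for bit in range(8)
        if byte & (1 << bit)
    ]


def format_ranges(cpus):
    """Render CPU indices in range form, e.g. [0, 1] becomes "0-1"."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    # Separator between runs is an assumption of this sketch.
    return " ".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last in ranges
    )
```

With this layout CPUs 0 and 1 end up as the single byte 0x03 regardless
of the endianness of the machine producing or consuming the message.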
* /usr/sbin/halt.local is no longer supported. Implementation in
distributions was inconsistent and it seems this functionality was
very rarely used.
To replace this functionality, users should:
- either define a new unit and make it a dependency of final.target
(systemctl add-wants final.target my-halt-local.service)
- or move the shutdown script to /usr/lib/systemd/system-shutdown/
and ensure that it accepts "halt", "poweroff", "reboot", and
"kexec" as an argument, see the description in systemd-shutdown(8).
* When a [Match] section in .link or .network file is empty (contains
no match patterns), a warning will be emitted. Please add any "match
all" pattern instead, e.g. OriginalName=* or Name=* in case all
interfaces should really be matched.
* A new setting NUMAPolicy= may be used to set process memory
allocation policy. This setting can be specified in
/etc/systemd/system.conf and hence will set the default policy for
PID1. The default policy can be overridden on a per-service
basis. The related setting NUMAMask= is used to specify NUMA node
mask that should be associated with the selected policy.
* PID 1 will now listen to Out-Of-Memory (OOM) events the kernel
generates when processes it manages are reaching their memory limits,
and will place their units in a special state, and optionally kill or
stop the whole unit.
* The service manager will now expose bus properties for the IO
resources used by units. This information is also shown in "systemctl
status" now (for services that have IOAccounting=yes set). Moreover,
the IO accounting data is included in the resource log message
generated whenever a unit stops.
* Units may now configure an explicit timeout to wait for when killed
with SIGABRT, for example when a service watchdog is hit. Previously,
the regular TimeoutStopSec= timeout was applied in this case too —
now a separate timeout may be set using TimeoutAbortSec=.
* Services may now send a special WATCHDOG=trigger message with
sd_notify() to trigger an immediate "watchdog missed" event, and thus
trigger service termination. This is useful both for testing watchdog
handling and for defining error paths in services that shall be
handled the same way as watchdog events.
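
A service not linked against libsystemd can send the same message by
writing a datagram to the socket named in $NOTIFY_SOCKET. The following
is a minimal sketch of that protocol with hypothetical helper names; it
is not a replacement for sd_notify() and leaves out error handling.

```python
import os
import socket


def notify(message):
    """Send one notification datagram; return False if no socket is set."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a socket in the abstract namespace.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True


def trigger_watchdog():
    """Request an immediate "watchdog missed" event for this service."""
    return notify("WATCHDOG=trigger")
```

When run outside the service manager, $NOTIFY_SOCKET is normally unset
and the helper returns False without sending anything.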
* There are two new per-unit settings IPIngressFilterPath= and
IPEgressFilterPath= which allow configuration of a BPF program
(usually by specifying a path to a program uploaded to /sys/fs/bpf/)
to apply to the IP packet ingress/egress path of all processes of a
unit. This is useful to allow running systemd services with BPF
programs set up externally.
* systemctl gained a new "clean" verb for removing the state, cache,
runtime or logs directories of a service while it is terminated. The
new verb may also be used to remove the state maintained on disk for
timer units that have Persistent= configured.
* During the last phase of shutdown systemd will now automatically
increase the log level configured in the "kernel.printk" sysctl so
that any relevant loggable events happening during late shutdown are
made visible. Previously, loggable events happening so late during
shutdown were generally lost if the "kernel.printk" sysctl was set to
high thresholds, as regular logging daemons are terminated at that
time and thus nothing is written to disk.
* If processes terminated during the last phase of shutdown do not exit
quickly systemd will now show their names after a short time, to make
debugging easier. After a longer timeout they are forcibly killed,
as before.
* journalctl (and the other tools that display logs) will now highlight
warnings in yellow (previously, both LOG_NOTICE and LOG_WARNING were
shown in bright bold, now only LOG_NOTICE is). Moreover, audit logs
are now shown in blue color, to separate them visually from regular
logs. References to configuration files are now turned into clickable
links on terminals that support that.
* systemd-journald will now stop logging to /var/log/journal during
shutdown when /var/ is on a separate mount, so that it can be
unmounted safely during shutdown.
* systemd-resolved gained support for a new 'strict' DNS-over-TLS mode.
* systemd-resolved "Cache=" configuration option in resolved.conf has
been extended to also accept the 'no-negative' value. Previously,
only a boolean option was allowed (yes/no), having yes as the
default. If this option is set to 'no-negative', negative answers are
not cached while the old cache heuristics are used for positive
answers. The default remains unchanged.
* The predictable naming scheme for network devices now supports
generating predictable names for "netdevsim" devices.
Moreover, the "en" prefix was dropped from the ID_NET_NAME_ONBOARD
udev property.
Those two changes form a new net.naming-scheme= entry.
Distributions which want to preserve naming stability may want to set
the -Ddefault-net-naming-scheme= configuration option.
* systemd-networkd now supports MACsec, nlmon, IPVTAP and Xfrm
interfaces natively.
* systemd-networkd's bridge FDB support now allows configuring a
destination address for each entry (Destination=) and the VXLAN VNI
(VNI=), and gained an option to declare what an entry is associated
with (AssociatedWith=).
* systemd-networkd's DHCPv4 support now understands a new MaxAttempts=
option for configuring the maximum number of DHCP lease requests. It
also learnt a new BlackList= option for deny-listing DHCP servers (a
similar setting has also been added to the IPv6 RA client), as well
as a SendRelease= option for configuring whether to send a DHCP
RELEASE message when terminating.
* systemd-networkd's DHCPv4 and DHCPv6 stacks can now be configured
separately in the [DHCPv4] and [DHCPv6] sections.
* systemd-networkd's DHCP support will now optionally create an
implicit host route to the DNS server specified in the DHCP lease, in
addition to the routes listed explicitly in the lease. This should
ensure that in multi-homed systems DNS traffic leaves the systems on
the interface that acquired the DNS server information even if other
routes such as default routes exist. This behaviour may be turned on
with the new RoutesToDNS= option.
* systemd-networkd's VXLAN support gained a new option
GenericProtocolExtension= for enabling VXLAN Generic Protocol
Extension support, as well as IPDoNotFragment= for setting the IP
"Don't fragment" bit on outgoing packets. A similar option has been
added to the GENEVE support.
* In systemd-networkd's [Route] section you may now configure
FastOpenNoCookie= for configuring per-route TCP fast-open support, as
well as TTLPropagate= for configuring Label Switched Path (LSP) TTL
propagation. The Type= setting now supports local, broadcast,
anycast, multicast, any, xresolve routes, too.
* systemd-networkd's [Network] section learnt a new option
DefaultRouteOnDevice= for automatically configuring a default route
onto the network device.
* systemd-networkd's bridging support gained two new options ProxyARP=
and ProxyARPWifi= for configuring proxy ARP behaviour as well as
MulticastRouter= for configuring multicast routing behaviour. A new
option MulticastIGMPVersion= may be used to change bridge's multicast
Internet Group Management Protocol (IGMP) version.
* systemd-networkd's FooOverUDP support gained the ability to configure
local and peer IP addresses via Local= and Peer=. A new option
PeerPort= may be used to configure the peer's IP port.
* systemd-networkd's TUN support gained a new setting VnetHeader= for
tweaking Generic Segmentation Offload support.
* The address family for policy rules may be specified using the new
Family= option in the [RoutingPolicyRule] section.
* networkctl gained a new "delete" command for removing virtual network
devices, as well as a new "--stats" switch for showing device
statistics.
* networkd.conf gained two new settings, SpeedMeter= and
SpeedMeterIntervalSec=, to measure bitrate of network interfaces. The
measured speed may be shown by 'networkctl status'.
* "networkctl status" now displays MTU and queue lengths, and more
detailed information about VXLAN and bridge devices.
* systemd-networkd's .network and .link files gained a new Property=
setting in the [Match] section, to match against devices with
specific udev properties.
* systemd-networkd's tunnel support gained a new option
AssignToLoopback= for selecting whether to use the loopback device
"lo" as underlying device.
* systemd-networkd's MACAddress= setting in the [Neighbor] section has
been renamed to LinkLayerAddress=, and it now allows configuration of
IP addresses, too.
* systemd-networkd's handling of the kernel's disable_ipv6 sysctl is
simplified: systemd-networkd will disable the sysctl (enable IPv6) if
IPv6 configuration (static or DHCPv6) was found for a given
interface. It will not touch the sysctl otherwise.
* The order of entries in $PATH used by the user manager instance was
changed to put bin/ entries before the corresponding sbin/ entries.
It is recommended to not rely on this order, and only ever have one
binary with a given name in the system paths under /usr.
* A new tool systemd-network-generator has been added that may generate
.network, .netdev and .link files from IP configuration specified on
the kernel command line in the format used by Dracut.
* The CriticalConnection= setting in .network files is now deprecated,
and replaced by a new KeepConfiguration= setting which allows more
detailed configuration of the IP configuration to keep in place.
* systemd-analyze gained a few new verbs:
- "systemd-analyze timestamp" parses and converts timestamps. This is
similar to the existing "systemd-analyze calendar" command which
does the same for recurring calendar events.
- "systemd-analyze timespan" parses and converts timespans (i.e.
durations as opposed to points in time).
- "systemd-analyze condition" will parse and test ConditionXYZ=
expressions.
- "systemd-analyze exit-status" will parse and convert exit status
codes to their names and back.
- "systemd-analyze unit-files" will print a list of all unit
file paths and unit aliases.
* SuccessExitStatus=, RestartPreventExitStatus=, and
RestartForceExitStatus= now accept exit status names (e.g. "DATAERR"
is equivalent to "65"). Those exit status name mappings may be
displayed with the systemd-analyze exit-status verb described above.
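
The name/number equivalence can be illustrated with the BSD sysexits
constants that Python exposes as os.EX_*. This is a sketch with
hypothetical helper names: Python's table only covers that subset, and
its name for status 0 is not necessarily the one systemd uses, so it is
no substitute for the mappings printed by the tool itself.

```python
import os

# Map "DATAERR" -> 65 and so on, derived from os.EX_DATAERR etc.
NAME_TO_STATUS = {
    name[len("EX_"):]: value
    for name, value in vars(os).items()
    if name.startswith("EX_") and isinstance(value, int)
}
STATUS_TO_NAME = {value: name for name, value in NAME_TO_STATUS.items()}


def parse_exit_status(token):
    """Accept either a number ("65") or a name ("DATAERR")."""
    if token.isdigit():
        return int(token)
    return NAME_TO_STATUS[token.upper()]
```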
* systemd-logind now exposes a per-session SetBrightness() bus call,
which may be used to securely change the brightness of a kernel
brightness device, if it belongs to the session's seat. By using this
call unprivileged clients can make changes to "backlight" and "leds"
devices securely with strict requirements on session membership.
Desktop environments may use this to generically make brightness
changes to such devices without shipping private SUID binaries or
udev rules for that purpose.
* "udevadm info" gained a --wait-for-initialization switch to wait for
a device to be initialized.
* systemd-hibernate-resume-generator will now look for resumeflags= on
the kernel command line, which is similar to rootflags= and may be
used to configure device timeout for the hibernation device.
* sd-event learnt a new API call sd_event_source_disable_unref() for
disabling and unref'ing an event source in a single function. A
related call sd_event_source_disable_unrefp() has been added for use
with gcc's cleanup extension.
* The sd-id128.h public API gained a new definition
SD_ID128_UUID_FORMAT_STR for formatting a 128bit ID in UUID format
with printf().
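For reference, "UUID format" means the 16 bytes of the ID printed as lowercase hexadecimal in groups of 8-4-4-4-12 digits. The sketch below reproduces that layout outside of C; the helper name format_id128_as_uuid() is made up for this example and is not part of sd-id128.h.

```python
import uuid


def format_id128_as_uuid(raw):
    """Format 16 raw bytes as xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx."""
    if len(raw) != 16:
        raise ValueError("a 128-bit ID must be exactly 16 bytes long")
    digits = raw.hex()
    return "-".join((digits[0:8], digits[8:12], digits[12:16],
                     digits[16:20], digits[20:32]))


if __name__ == "__main__":
    sample = bytes(range(16))
    print(format_id128_as_uuid(sample))
    # The standard library produces the same text for the same bytes.
    print(uuid.UUID(bytes=sample))
```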
* "busctl introspect" gained a new switch --xml-interface for dumping
XML introspection data unmodified.
* PID 1 may now show the unit name instead of the unit description
string in its status output during boot. This may be configured in
the StatusUnitFormat= setting in /etc/systemd/system.conf or the
kernel command line option systemd.status_unit_format=.
* PID 1 now understands a new option KExecWatchdogSec= in
/etc/systemd/system.conf to set a watchdog timeout for kexec reboots.
Previously watchdog functionality was only available for regular
reboots. The new setting defaults to off, because we don't know in
the general case if the watchdog will be reset after kexec (some
drivers do reset it, but not all), and the new userspace might not be
configured to handle the watchdog.
Moreover, the old ShutdownWatchdogSec= setting has been renamed to
RebootWatchdogSec= to more clearly communicate what it is about. The
old name is still accepted for compatibility.
* The systemd.debug_shell kernel command line option now optionally
takes a tty name to spawn the debug shell on, which allows a
different tty to be selected than the built-in default.
* Service units gained a new ExecCondition= setting which will run
before ExecStartPre= and either continue execution of the unit (for
clean exit codes), stop execution without marking the unit failed
(for exit codes 1 through 254), or stop execution and fail the unit
(for exit code 255 or abnormal termination).
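The three outcomes listed above can be summarized as a small decision function. This is a sketch of the rule as stated in this entry, not of the service manager's code; the function name and the returned strings are illustrative, and a negative value stands for termination by a signal, following the convention of Python's subprocess module.

```python
def exec_condition_outcome(returncode):
    """Map the result of an ExecCondition= command to what happens to the unit."""
    if returncode == 0:
        return "continue"  # clean exit: go on with ExecStartPre=/ExecStart=
    if 1 <= returncode <= 254:
        return "skip"      # stop execution, unit is not marked as failed
    return "fail"          # exit code 255 or abnormal termination


if __name__ == "__main__":
    for code in (0, 1, 254, 255, -9):
        print(code, exec_condition_outcome(code))
```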
* A new service systemd-pstore.service has been added that pulls data
from /sys/fs/pstore/ and saves it to /var/lib/pstore for later
review.
* timedatectl gained new verbs for configuring per-interface NTP
service configuration for systemd-timesyncd.
* "localectl list-locales" won't list non-UTF-8 locales anymore. It's
2019. (You can set non-UTF-8 locales though, if you know their name.)
* If variable assignments in sysctl.d/ files are prefixed with "-" any
failures to apply them are now ignored.
* systemd-random-seed.service now optionally credits entropy when
applying the seed to the system. Set $SYSTEMD_RANDOM_SEED_CREDIT to
true for the service to enable this behaviour, but please consult the
documentation first, since this comes with a couple of caveats.
* systemd-random-seed.service is now a synchronization point for full
initialization of the kernel's entropy pool. Services that require
/dev/urandom to be correctly initialized should be ordered after this
service.
* The systemd-boot boot loader has been updated to optionally maintain
a random seed file in the EFI System Partition (ESP). During the boot
phase, this random seed is read and updated with a new seed
cryptographically derived from it. Another derived seed is passed to
the OS. The latter seed is then credited to the kernel's entropy pool
very early during userspace initialization (from PID 1). This allows
systems to boot up with a fully initialized kernel entropy pool from
earliest boot on, and thus entirely removes all entropy pool
initialization delays from systems using systemd-boot. Special care
is taken to ensure different seeds are derived on system images
replicated to multiple systems. "bootctl status" will show whether
a seed was received from the boot loader.
* bootctl gained two new verbs:
- "bootctl random-seed" will generate the file in ESP and an EFI
variable to allow a random seed to be passed to the OS as described
above.
- "bootctl is-installed" checks whether systemd-boot is currently
installed.
* bootctl will warn if it detects that boot entries are misconfigured
(for example if the kernel image was removed without purging the
bootloader entry).
* A new document has been added describing systemd's use and support
for the kernel's entropy pool subsystem:
https://systemd.io/RANDOM_SEEDS
* When the system is hibernated the swap device to write the
hibernation image to is now automatically picked from all available
swap devices, preferring the swap device with the highest configured
priority over all others, and picking the device with the most free
space if there are multiple devices with the highest priority.
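The selection rule described above amounts to ordering the candidates by priority first and by free space second. A minimal sketch follows, assuming each swap device is described by the size, used and priority values shown in /proc/swaps; the SwapDevice type and the function name are hypothetical.

```python
from typing import NamedTuple


class SwapDevice(NamedTuple):
    path: str
    priority: int
    size_kib: int
    used_kib: int


def pick_hibernation_swap(devices):
    """Prefer the highest priority; among equals, prefer the most free space."""
    if not devices:
        return None
    return max(devices, key=lambda d: (d.priority, d.size_kib - d.used_kib))


if __name__ == "__main__":
    candidates = [
        SwapDevice("/dev/sda2", priority=-2, size_kib=8_000_000, used_kib=0),
        SwapDevice("/dev/sdb1", priority=10, size_kib=4_000_000, used_kib=3_000_000),
        SwapDevice("/dev/sdc1", priority=10, size_kib=4_000_000, used_kib=1_000_000),
    ]
    print(pick_hibernation_swap(candidates).path)
```

In this example /dev/sdc1 is chosen: it shares the highest priority with /dev/sdb1 but has more free space, while the larger /dev/sda2 loses on priority.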
* /etc/crypttab support has learnt a new keyfile-timeout= per-device
option that sets how long to wait for a device holding an encryption
key to show up before falling back to asking for the password.
* IOWeight= has learnt to properly set the IO weight when using the
BFQ scheduler officially found in kernels 5.0+.
* A new mailing list has been created for reporting of security issues:
systemd-security@redhat.com. For more details, see
https://systemd.io/CONTRIBUTING#security-vulnerability-reports.
Contributions from: Aaron Barany, Adrian Bunk, Alan Jenkins, Albrecht
Lohofener, Andrej Valek, Anita Zhang, Arian van Putten, Balint Reczey,
Bastien Nocera, Ben Boeckel, Benjamin Robin, camoz, Chen Qi, Chris
Chiu, Chris Down, Christian Göttsche, Christian Kellner, Clinton Roy,
Connor Reeder, Daniel Black, Daniel Lublin, Daniele Medri, Dan
Streetman, Dave Reisner, Dave Ross, David Art, David Tardon, Debarshi
Ray, Dimitri John Ledkov, Dominick Grift, Donald Buczek, Douglas
Christman, Eric DeVolder, EtherGraf, Evgeny Vereshchagin, Feldwor,
Felix Riemann, Florian Dollinger, Francesco Pennica, Franck Bui,
Frantisek Sumsal, Franz Pletz, frederik, Hans de Goede, Iago López
Galeiras, Insun Pyo, Ivan Shapovalov, Iwan Timmer, Jack, Jakob
Unterwurzacher, Jan Chren, Jan Klötzke, Jan Losinski, Jan Pokorný, Jan
Synacek, Jan-Michael Brummer, Jeka Pats, Jeremy Soller, Jérémy Rosen,
Jiri Pirko, Joe Lin, Joerg Behrmann, Joe Richey, Jóhann B. Guðmundsson,
Johannes Christ, Johannes Schmitz, Jonathan Rouleau, Jorge Niedbalski,
Jörg Thalheim, Kai Krakow, Kai Lüke, Karel Zak, Kashyap Chamarthy,
Krayushkin Konstantin, Lennart Poettering, Lubomir Rintel, Luca
Boccassi, Luís Ferreira, Marc-André Lureau, Markus Felten, Martin Pitt,
Matthew Leeds, Mattias Jernberg, Michael Biebl, Michael Olbrich,
Michael Prokop, Michael Stapelberg, Michael Zhivich, Michal Koutný,
Michal Sekletar, Mike Gilbert, Milan Broz, Miroslav Lichvar, mpe85,
Mr-Foo, Network Silence, Oliver Harley, pan93412, Paul Menzel, pEJipE,
Peter A. Bigot, Philip Withnall, Piotr Drąg, Rafael Fontenelle, Robert
Scheck, Roberto Santalla, Ronan Pigott, root, RussianNeuroMancer,
Sebastian Jennen, shinygold, Shreyas Behera, Simon Schricker, Susant
Sahani, Thadeu Lima de Souza Cascardo, Theo Ouzhinski, Thiebaud
Weksteen, Thomas Haller, Thomas Weißschuh, Tomas Mraz, Tommi Rantala,
Topi Miettinen, VD-Lycos, ven, Vladimir Yerilov, Wieland Hoffmann,
William A. Kennington III, William Wold, Xi Ruoyao, Yuri Chornoivan,
Yu Watanabe, Zach Smith, Zbigniew Jędrzejewski-Szmek, Zhang Xianwei
– Camerino, 2019-09-03
CHANGES WITH 242:
* In .link files, MACAddressPolicy=persistent (the default) is changed
to cover more devices. For devices like bridges, tun, tap, bond, and
similar interfaces that do not have other identifying information,
the interface name is used as the basis for persistent seed for MAC
and IPv4LL addresses. The handling of devices that were already
covered previously is not changed; this change is only about covering
more devices than before with the "persistent" policy.
MACAddressPolicy=random may be used to force randomized MACs and
IPv4LL addresses for a device if desired.
Hint: the log output from udev (at debug level) was enhanced to
clarify what policy is followed and which attributes are used.
`SYSTEMD_LOG_LEVEL=debug udevadm test-builtin net_setup_link /sys/class/net/<name>`
may be used to view this.
Hint: if a bridge interface is created without any slaves, and gains
a slave later, then now the bridge does not inherit slave's MAC.
To inherit slave's MAC, for example, create the following file:
```
# /etc/systemd/network/98-bridge-inherit-mac.link
[Match]
Type=bridge
[Link]
MACAddressPolicy=none
```
* The .device units generated by systemd-fstab-generator and other
generators do not automatically pull in the corresponding .mount unit
as a Wants= dependency. This means that simply plugging in the device
will not cause the mount unit to be started automatically. But please
note that the mount unit may be started for other reasons, in
particular if it is part of local-fs.target, and any unit which
(transitively) depends on local-fs.target is started.
* networkctl list/status/lldp now accept globbing wildcards for network
interface names to match against all existing interfaces.
* The $PIDFILE environment variable is set to point to the absolute path
configured with PIDFile= for processes of that service.
* The fallback DNS server list was augmented with Cloudflare public DNS
servers. Use `-Ddns-servers=` to set a different fallback.
* A new special target usb-gadget.target will be started automatically
when a USB Device Controller is detected (which means that the system
is a USB peripheral).
* A new unit setting CPUQuotaPeriodSec= assigns the time period
relative to which the CPU time quota specified by CPUQuota= is
measured.
* A new unit setting ProtectHostname= may be used to prevent services
from modifying hostname information (even if they otherwise would
have privileges to do so).
* A new unit setting NetworkNamespacePath= may be used to specify a
namespace for service or socket units through a path referring to a
Linux network namespace pseudo-file.
* The PrivateNetwork= setting and JoinsNamespaceOf= dependencies now
have an effect on .socket units: when used the listening socket is
created within the configured network namespace instead of the host
namespace.
* ExecStart= command lines in unit files may now be prefixed with ':'
in which case environment variable substitution is
disabled. (Supported for the other ExecXYZ= settings, too.)
* .timer units gained two new boolean settings OnClockChange= and
OnTimezoneChange= which may be used to also trigger a unit when the
system clock is changed or the local timezone is
modified. systemd-run has been updated to make these options easily
accessible from the command line for transient timers.
* Two new conditions for units have been added: ConditionMemory= may be
used to conditionalize a unit based on installed system
RAM. ConditionCPUs= may be used to conditionalize a unit based on
installed CPU cores.
* The @default system call filter group understood by SystemCallFilter=
has been updated to include the new rseq() system call introduced in
kernel 4.18.
* A new time-set.target has been added that indicates that the system
time has been set from a local source (possibly imprecise). The
existing time-sync.target is stronger and indicates that the time has
been synchronized with a precise external source. Services where
approximate time is sufficient should use the new target.
* "systemctl start" (and related commands) learnt a new
--show-transaction option. If specified brief information about all
jobs queued because of the requested operation is shown.
* systemd-networkd recognizes a new operational state 'enslaved', used
(instead of 'degraded' or 'carrier') for interfaces which form a
bridge, bond, or similar, and a new 'degraded-carrier' operational
state used for the bond or bridge master interface when one of the
enslaved devices is not operational.
* .network files learnt the new IgnoreCarrierLoss= option for leaving
networks configured even if the carrier is lost.
* The RequiredForOnline= setting in .network files may now specify a
minimum operational state required for the interface to be considered
"online" by systemd-networkd-wait-online. Related to this
systemd-networkd-wait-online gained a new option --operational-state=
to configure the same, and its --interface= option was updated to
optionally also take an operational state specific for an interface.
* systemd-networkd-wait-online gained a new setting --any for waiting
for only one of the requested interfaces instead of all of them.
* systemd-networkd now implements L2TP tunnels.
* Two new .network settings UseAutonomousPrefix= and UseOnLinkPrefix=
may be used to cause autonomous and onlink prefixes received in IPv6
Router Advertisements to be ignored.
* New MulticastFlood=, NeighborSuppression=, and Learning= .network
file settings may be used to tweak bridge behaviour.
* The new TripleSampling= option in .network files may be used to
configure CAN triple sampling.
* New .netdev settings PrivateKeyFile= and PresharedKeyFile= may be
used to point to private or preshared key for a WireGuard interface.
* /etc/crypttab now supports the same-cpu-crypt and
submit-from-crypt-cpus options to tweak encryption work scheduling
details.
* systemd-tmpfiles will now take a BSD file lock before operating on the
contents of a directory. This may be used to temporarily exclude
directories from aging by taking the same lock (useful for example
when extracting a tarball into /tmp or /var/tmp as a privileged user,
which might create files with really old timestamps, which
nevertheless should not be deleted). For further details, see:
https://systemd.io/TEMPORARY_DIRECTORIES
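A program that wants to protect a directory while populating it can hold a BSD lock on the directory itself for the duration of the work. The sketch below assumes an exclusive flock() on a file descriptor of the directory is the lock to take; the helper name directory_locked() is made up, and the document linked above remains the authoritative description.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def directory_locked(path):
    """Hold an exclusive BSD file lock on a directory while the block runs."""
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is available
        yield fd
    finally:
        os.close(fd)  # closing the descriptor releases the lock


if __name__ == "__main__":
    import tempfile

    workdir = tempfile.mkdtemp()
    with directory_locked(workdir):
        # Files created here may carry old timestamps, e.g. from a tarball.
        with open(os.path.join(workdir, "extracted.txt"), "w") as handle:
            handle.write("payload\n")
```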
* systemd-tmpfiles' h line type gained support for the
FS_PROJINHERIT_FL ('P') file attribute (introduced in kernel 4.5),
controlling project quota inheritance.
* sd-boot and bootctl now implement support for an Extended Boot Loader
(XBOOTLDR) partition, that is intended to be mounted to /boot, in
addition to the ESP partition mounted to /efi or /boot/efi.
Configuration file fragments, kernels, initrds and other EFI images
to boot will be loaded from both the ESP and XBOOTLDR partitions.
The XBOOTLDR partition was previously described by the Boot Loader
Specification, but implementation was missing in sd-boot. Support for
this concept allows using the sd-boot boot loader in more
conservative scenarios where the boot loader itself is placed in the
ESP but the kernels to boot (and their metadata) in a separate
partition.
* A system may now be booted with systemd.volatile=overlay on the
kernel command line, which causes the root file system to be set up as
an overlayfs mount combining the read-only root directory with a
writable tmpfs. In this setup, the underlying root device is not
modified, and any changes are lost at reboot.
* Similarly, systemd-nspawn can now boot containers with a volatile
overlayfs root with the new --volatile=overlay switch.
* systemd-nspawn can now consume OCI runtime bundles using a new
--oci-bundle= option. This implementation is fully usable, with most
features in the specification implemented, but since this is a lot of
new code and functionality, this feature should most likely not
be used in production yet.
* systemd-nspawn now supports various options described by the OCI
runtime specification on the command-line and in .nspawn files:
--inaccessible=/Inaccessible= may be used to mask parts of the file
system tree, --console=/--pipe may be used to configure how standard
input, output, and error are set up.
* busctl learned the `emit` verb to generate D-Bus signals.
* systemd-analyze cat-config may be used to gather and display
configuration spread over multiple files, for example system and user
presets, tmpfiles.d, sysusers.d, udev rules, etc.
* systemd-analyze calendar now takes an optional new parameter
--iterations= which may be used to show a maximum number of iterations
the specified expression will elapse next.
* The sd-bus C API gained support for naming method parameters in the
introspection data.
* systemd-logind gained D-Bus APIs to specify the "reboot parameter"
the reboot() system call expects.
* journalctl learnt a new --cursor-file= option that points to a file
from which a cursor should be loaded in the beginning and to which
the updated cursor should be stored at the end.
* ACRN hypervisor and Windows Subsystem for Linux (WSL) are now
detected by systemd-detect-virt (and may also be used in
ConditionVirtualization=).
* The behaviour of systemd-logind may now be modified with environment
variables $SYSTEMD_REBOOT_TO_FIRMWARE_SETUP,
$SYSTEMD_REBOOT_TO_BOOT_LOADER_MENU, and
$SYSTEMD_REBOOT_TO_BOOT_LOADER_ENTRY. They cause logind to either
skip the relevant operation completely (when set to false), or to
create a flag file in /run/systemd (when set to true), instead of
actually commencing the real operation when requested. The presence
of /run/systemd/reboot-to-firmware-setup,
/run/systemd/reboot-to-boot-loader-menu, and
/run/systemd/reboot-to-boot-loader-entry, may be used by alternative
boot loader implementations to replace some steps logind performs
during reboot with their own operations.
* systemctl can be used to request a reboot into the boot loader menu
or a specific boot loader entry with the new --boot-loader-menu= and
--boot-loader-entry= options to a reboot command. (This requires a
boot loader that supports this, for example sd-boot.)
* kernel-install will no longer unconditionally create the output
directory (e.g. /efi/<machine-id>/<kernel-version>) for boot loader
snippets, but will do so only if the machine-specific parent directory
(i.e. /efi/<machine-id>/) already exists. bootctl has been modified
to create this parent directory during sd-boot installation.
This makes it easier to use kernel-install with plugins which support
a different layout of the bootloader partitions (for example grub2).
* During package installation (with `ninja install`), we would create
symlinks for getty@tty1.service, systemd-networkd.service,
systemd-networkd.socket, systemd-resolved.service,
remote-cryptsetup.target, remote-fs.target,
systemd-networkd-wait-online.service, and systemd-timesyncd.service
in /etc, as if `systemctl enable` was called for those units, to make
the system usable immediately after installation. Now this is not
done anymore, and instead calling `systemctl preset-all` is
recommended after the first installation of systemd.
* A new boolean sandboxing option RestrictSUIDSGID= has been added that
is built on seccomp. When turned on creation of SUID/SGID files is
prohibited.
* The NoNewPrivileges= and the new RestrictSUIDSGID= options are now
implied if DynamicUser= is turned on for a service. This hardens
these services, so that they neither can benefit from nor create
SUID/SGID executables. This is a minor compatibility breakage, given
that when DynamicUser= was first introduced SUID/SGID behaviour was
unaffected. However, the security benefit of these two options is
substantial, and the setting is still relatively new, hence we opted
to make it mandatory for services with dynamic users.
Contributions from: Adam Jackson, Alexander Tsoy, Andrey Yashkin,
Andrzej Pietrasiewicz, Anita Zhang, Balint Reczey, Beniamino Galvani,
Ben Iofel, Benjamin Berg, Benjamin Dahlhoff, Chris, Chris Morin,
Christopher Wong, Claudius Ellsel, Clemens Gruber, dana, Daniel Black,
Davide Cavalca, David Michael, David Rheinsberg, emersion, Evgeny
Vereshchagin, Filipe Brandenburger, Franck Bui, Frantisek Sumsal,
Giacinto Cifelli, Hans de Goede, Hugo Kindel, Ignat Korchagin, Insun
Pyo, Jan Engelhardt, Jonas Dorel, Jonathan Lebon, Jonathon Kowalski,
Jörg Sommer, Jörg Thalheim, Jussi Pakkanen, Kai-Heng Feng, Lennart
Poettering, Lubomir Rintel, Luís Ferreira, Martin Pitt, Matthias
Klumpp, Michael Biebl, Michael Niewöhner, Michael Olbrich, Michal
Sekletar, Mike Lothian, Paul Menzel, Piotr Drąg, Riccardo Schirone,
Robin Elvedi, Roman Kulikov, Ronald Tschalär, Ross Burton, Ryan
Gonzalez, Sebastian Krzyszkowiak, Stephane Chazelas, StKob, Susant
Sahani, Sylvain Plantefève, Szabolcs Fruhwald, Taro Yamada, Theo
Ouzhinski, Thomas Haller, Tobias Jungel, Tom Yan, Tony Asleson, Topi
Miettinen, unixsysadmin, Van Laser, Vesa Jääskeläinen, Yu, Li-Yu,
Yu Watanabe, Zbigniew Jędrzejewski-Szmek
— Warsaw, 2019-04-11
CHANGES WITH 241:
* The default locale can now be configured at compile time. Otherwise,
a suitable default will be selected automatically (one of C.UTF-8,
en_US.UTF-8, and C).
* The version string shown by systemd and other tools now includes the
git commit hash when built from git. An override may be specified
during compilation, which is intended to be used by distributions to
include the package release information.
* systemd-cat can now filter standard input and standard error streams
for different syslog priorities using the new --stderr-priority=
option.
* systemd-journald and systemd-journal-remote reject entries which
contain too many fields (CVE-2018-16865) and set limits on the
process' command line length (CVE-2018-16864).
* $DBUS_SESSION_BUS_ADDRESS environment variable is set by pam_systemd
again.
* A new network device NamePolicy "keep" is implemented for link files,
and used by default in 99-default.link (the fallback configuration
provided by systemd). With this policy, if the network device name
was already set by userspace, the device will not be renamed again.
This matches the naming scheme that was implemented before
systemd-240. If naming-scheme < 240 is specified, the "keep" policy
is also enabled by default, even if not specified. Effectively, this
means that if naming-scheme >= 240 is specified, network devices will
be renamed according to the configuration, even if they have been
renamed already, if "keep" is not specified as the naming policy in
the .link file. The 99-default.link file provided by systemd includes
"keep" for backwards compatibility, but it is recommended for user
installed .link files to *not* include it.
The "kernel" policy, which keeps kernel names declared to be
"persistent", now works again as documented.
* kernel-install script now optionally takes the paths to one or more
initrd files, and passes them to all plugins.
* The mincore() system call has been dropped from the @system-service
system call filter group, as it is pretty exotic and may potentially
be used for side-channel attacks.
* -fPIE is dropped from compiler and linker options. Please specify
-Db_pie=true option to meson to build position-independent
executables. Note that the meson option is supported since meson-0.49.
* The fs.protected_regular and fs.protected_fifos sysctls, which were
added in Linux 4.19 to make some data spoofing attacks harder, are
now enabled by default. While this will hopefully improve the
security of most installations, it is technically a backwards
incompatible change; to disable these sysctls again, place the
following lines in /etc/sysctl.d/60-protected.conf or a similar file:
fs.protected_regular = 0
fs.protected_fifos = 0
Note that the similar hardlink and symlink protection has been
enabled since v199, and may be disabled likewise.
* The files read from the EnvironmentFile= setting in unit files now
parse backslashes inside quotes literally, matching the behaviour of
POSIX shells.
* udevadm trigger, udevadm control, udevadm settle and udevadm monitor
now automatically become NOPs when run in a chroot() environment.
* The tmpfiles.d/ "C" line type will now copy directory trees not only
when the destination is so far missing, but also if it already exists
as a directory and is empty. This is useful to cater for systems
where directory trees are put together from multiple separate mount
points but otherwise empty.
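The condition under which a "C" line now copies can be written down as a short predicate. This is a sketch of the rule as stated in this entry only; the function name is illustrative and the real implementation may treat corner cases such as symlinks differently.

```python
import os


def c_line_would_copy(destination):
    """True if the destination is missing, or is an existing empty directory."""
    if not os.path.lexists(destination):
        return True
    return os.path.isdir(destination) and not os.listdir(destination)


if __name__ == "__main__":
    import tempfile

    empty = tempfile.mkdtemp()
    print(c_line_would_copy(empty))
    print(c_line_would_copy(os.path.join(empty, "missing")))
```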
* A new function sd_bus_close_unref() (and the associated
sd_bus_close_unrefp()) has been added to libsystemd, that combines
sd_bus_close() and sd_bus_unref() in one.
* udevadm control learnt a new option --ping for testing whether a
systemd-udevd instance is running and reacting.
* udevadm trigger learnt a new option --wait-daemon for waiting for the
systemd-udevd daemon to be initialized.
Contributions from: Aaron Plattner, Alberts Muktupāvels, Alex Mayer,
Ayman Bagabas, Beniamino Galvani, Burt P, Chris Down, Chris Lamb, Chris
Morin, Christian Hesse, Claudius Ellsel, dana, Daniel Axtens, Daniele
Medri, Dave Reisner, David Santamaría Rogado, Diego Canuhe, Dimitri
John Ledkov, Evgeny Vereshchagin, Fabrice Fontaine, Filipe
Brandenburger, Franck Bui, Frantisek Sumsal, govwin, Hans de Goede,
James Hilliard, Jan Engelhardt, Jani Uusitalo, Jan Janssen, Jan
Synacek, Jonathan McDowell, Jonathan Roemer, Jonathon Kowalski, Joost
Heitbrink, Jörg Thalheim, Lance, Lennart Poettering, Louis Taylor,
Lucas Werkmeister, Mantas Mikulėnas, Marc-Antoine Perennou,
marvelousblack, Michael Biebl, Michael Sloan, Michal Sekletar, Mike
Auty, Mike Gilbert, Mikhail Kasimov, Neil Brown, Niklas Hambüchen,
Patrick Williams, Paul Seyfert, Peter Hutterer, Philip Withnall, Roger
James, Ronnie P. Thomas, Ryan Gonzalez, Sam Morris, Stephan Edel,
Stephan Gerhold, Susant Sahani, Taro Yamada, Thomas Haller, Topi
Miettinen, YiFei Zhu, YmrDtnJu, YunQiang Su, Yu Watanabe, Zbigniew
Jędrzejewski-Szmek, zsergeant77, Дамјан Георгиевски
— Berlin, 2019-02-14
CHANGES WITH 240:
* NoNewPrivileges=yes has been set for all long-running services
implemented by systemd. Previously, this was problematic due to
SELinux (as this would also prohibit the transition from PID1's label
to the service's label). This restriction has since been lifted, but
an SELinux policy update is required.
(See e.g. https://github.com/fedora-selinux/selinux-policy/pull/234.)
* DynamicUser=yes is dropped from systemd-networkd.service,
systemd-resolved.service and systemd-timesyncd.service, which was
enabled in v239 for systemd-networkd.service and systemd-resolved.service,
and since v236 for systemd-timesyncd.service. The users and groups
systemd-network, systemd-resolve and systemd-timesync are created
by systemd-sysusers again. Distributors or system administrators
may need to create these users and groups if they do not exist (or need
to re-enable DynamicUser= for those units) while upgrading systemd.
Also, the clock file for systemd-timesyncd may need to move from
/var/lib/private/systemd/timesync/clock to /var/lib/systemd/timesync/clock.
* When unit files are loaded from disk, previously systemd would
sometimes (depending on the unit loading order) load units from the
target path of symlinks in .wants/ or .requires/ directories of other
units. This meant that a unit could be loaded from different paths
depending on whether the unit was requested explicitly or as a
dependency of another unit, not honouring the priority of directories
in search path. It also meant that it was possible to successfully
load and start units which are not found in the unit search path, as
long as they were requested as a dependency and linked to from
.wants/ or .requires/. The target paths of those symlinks are not
used for loading units anymore and the unit file must be found in
the search path.
* A new service type has been added: Type=exec. It's very similar to
Type=simple but ensures the service manager will wait for both fork()
and execve() of the main service binary to complete before proceeding
with follow-up units. This is primarily useful so that the manager
propagates any errors in the preparation phase of service execution
back to the job that requested the unit to be started. For example,
consider a service that has ExecStart= set to a file system binary
that doesn't exist. With Type=simple starting the unit would be
considered instantly successful, as only fork() has to complete
successfully and the manager does not wait for execve(), and hence
its failure is seen "too late". With the new Type=exec service type
starting the unit will fail, as the manager will wait for the
execve() and notice its failure, which is then propagated back to the
start job.
NOTE: with the next release 241 of systemd we intend to change the
systemd-run tool to default to Type=exec for transient services
started by it. This should be mostly safe, but in specific corner
cases might result in problems, as the systemd-run tool will then
block on NSS calls (such as user name look-ups due to User=) done
between the fork() and execve(), which under specific circumstances
might cause problems. It is recommended to specify "-p Type=simple"
explicitly in the few cases where this applies. For regular,
non-transient services (i.e. those defined with unit files on disk)
we will continue to default to Type=simple.
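The difference between stopping at fork() and also waiting for execve() can be illustrated with a common UNIX technique: a close-on-exec pipe that the child only writes to if the exec fails. This is a generic sketch and makes no claim about how the service manager implements Type=exec; the function name is made up for the example.

```python
import os
import sys


def fork_and_report_exec(argv):
    """Fork, exec argv in the child, and report whether the exec succeeded.

    Returns (pid, None) if the exec succeeded, or (pid, errno) if it failed.
    """
    read_fd, write_fd = os.pipe()  # Python creates both ends close-on-exec
    pid = os.fork()
    if pid == 0:
        # Child: a successful execv() never returns and closes the pipe for us.
        os.close(read_fd)
        try:
            os.execv(argv[0], argv)
        except OSError as error:
            os.write(write_fd, str(error.errno).encode())
        finally:
            os._exit(127)
    # Parent: a Type=simple-like caller would return right here, after fork().
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        data = pipe.read()  # EOF without data means the exec succeeded
    return pid, (int(data) if data else None)


if __name__ == "__main__":
    for command in (["/nonexistent/binary"], [sys.executable, "-c", "pass"]):
        pid, exec_errno = fork_and_report_exec(command)
        os.waitpid(pid, 0)
        print(command[0], "->", os.strerror(exec_errno) if exec_errno else "exec ok")
```

With the missing binary the parent learns about the failure immediately, which is the behaviour the entry above describes for Type=exec; a caller that only waits for fork() would have reported success.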
* The Linux kernel's current default RLIMIT_NOFILE resource limit for
userspace processes is set to 1024 (soft) and 4096
(hard). Previously, systemd passed this on unmodified to all
processes it forked off. With this systemd release the hard limit
systemd passes on is increased to 512K, overriding the kernel's
defaults and substantially increasing the number of simultaneous file
descriptors unprivileged userspace processes can allocate. Note that
the soft limit remains at 1024 for compatibility reasons: the
traditional UNIX select() call cannot deal with file descriptors >=
1024 and increasing the soft limit globally might thus result in
programs unexpectedly allocating a high file descriptor and thus
failing abnormally when attempting to use it with select() (of
course, programs shouldn't use select() anymore, and prefer
poll()/epoll, but the call unfortunately remains undeservedly popular
at this time). This change reflects the fact that file descriptor
handling in the Linux kernel has been optimized in more recent
kernels and allocating large numbers of them should be much cheaper
both in memory and in performance than it used to be. Programs that
want to benefit from the increased limit have to explicitly opt in to
high file descriptors by raising their soft limit. Of
course, when they do that they must acknowledge that they cannot use
select() anymore (and neither can any shared library they use — or
any shared library used by any shared library they use and so on).
Which default hard limit is most appropriate is of course hard to
decide. However, given reports that ~300K file descriptors are used
in real-life applications we believe 512K is sufficiently high as new
default for now. Note that there are also reports that using very
high hard limits (e.g. 1G) is problematic: some software allocates
large arrays with one element for each potential file descriptor
(Java, …) — a high hard limit thus triggers excessively large memory
allocations in these applications. Hopefully, the new default of 512K
is a good middle ground: higher than what real-life applications
currently need, and low enough to avoid triggering excessively large
allocations in problematic software. (And yes, somebody should fix
Java.)
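Opting in, as described above, means raising the soft limit up to the hard limit early during start-up. A minimal sketch using Python's resource module follows; the function name is illustrative, and the program doing this must not rely on select() afterwards.

```python
import resource


def raise_nofile_soft_limit():
    """Raise the RLIMIT_NOFILE soft limit to the hard limit; return the new pair."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        # Raising the soft limit up to the hard limit needs no privileges.
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = raise_nofile_soft_limit()
    print("soft:", soft, "hard:", hard)
```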
* The fs.nr_open and fs.file-max sysctls are now automatically bumped
to the highest possible values, as separate accounting of file
descriptors is no longer necessary, as memcg tracks them correctly as
part of the memory accounting anyway. Thus, from the four limits on
file descriptors currently enforced (fs.file-max, fs.nr_open,
RLIMIT_NOFILE hard, RLIMIT_NOFILE soft) we turn off the first two,
and keep only the latter two. A set of build-time options
(-Dbump-proc-sys-fs-file-max=false and -Dbump-proc-sys-fs-nr-open=false)
has been added to revert this change in behaviour, which might be
an option for systems that turn off memcg in the kernel.
* When no /etc/locale.conf file exists (and hence no locale settings
are in place), systemd will now use the "C.UTF-8" locale by default,
and set LANG= to it. This locale is supported by various
distributions including Fedora, with clear indications that upstream
glibc is going to make it available too. This locale enables UTF-8
mode by default, which appears appropriate for 2018.
* The "net.ipv4.conf.all.rp_filter" sysctl will now be set to 2 by
default. This effectively switches the RFC3704 Reverse Path filtering
from Strict mode to Loose mode. This is more appropriate for hosts
that have multiple links with routes to the same networks (e.g.
a client with both Wi-Fi and Ethernet connected to the internet).
Consult the kernel documentation for details on this sysctl:
https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
* The v239 change to turn on "net.ipv4.tcp_ecn" by default has been
reverted.
* CPUAccounting=yes no longer enables the CPU controller when using
kernel 4.15+ and the unified cgroup hierarchy, as the required
accounting statistics are now provided independently of the CPU
controller.
* Support for disabling a particular cgroup controller within a sub-tree
has been added through the DisableControllers= directive.
* cgroup_no_v1=all on the kernel command line now also implies
using the unified cgroup hierarchy, unless one explicitly passes
systemd.unified_cgroup_hierarchy=0 on the kernel command line.
* The new "MemoryMin=" unit file property may now be used to set the
memory usage protection limit of processes invoked by the unit. This
controls the cgroup v2 memory.min attribute. Similarly, the new
"IODeviceLatencyTargetSec=" property has been added, wrapping the new
cgroup v2 io.latency cgroup property for configuring per-service I/O
latency.
* systemd now supports the cgroup v2 devices BPF logic, as a counterpart
to the cgroup v1 "devices" cgroup controller.
* systemd-escape is now able to combine --unescape with --template. It
also learnt a new option --instance for extracting and unescaping the
instance part of a unit name.
* sd-bus now provides sd_bus_message_readv(), which is similar to
sd_bus_message_read() but takes a va_list object. The pair
sd_bus_set_method_call_timeout() and sd_bus_get_method_call_timeout()
has been added for configuring the default method call timeout to
use. sd_bus_error_move() may be used to efficiently move the contents
from one sd_bus_error structure to another, invalidating the
source. sd_bus_set_close_on_exit() and sd_bus_get_close_on_exit() may
be used to control whether a bus connection object is automatically
flushed when an sd-event loop is exited.
* When processing classic BSD syslog log messages, journald will now
save the original time-stamp string supplied in the new
SYSLOG_TIMESTAMP= journal field. This permits consumers to
reconstruct the original BSD syslog message more correctly.
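A consumer might use the new field roughly as in the following Python
sketch, which rebuilds a BSD syslog line from a journal entry given as
a plain dictionary. This is an illustration only, not journald's own
logic: the function name rebuild_bsd_syslog_line() is hypothetical,
the entry is assumed to carry the PRIORITY, SYSLOG_FACILITY,
SYSLOG_IDENTIFIER, SYSLOG_PID and MESSAGE fields as strings, and the
timestamp field is assumed to need only surrounding whitespace
stripped.

```python
def rebuild_bsd_syslog_line(entry):
    """Rebuild '<PRI>TIMESTAMP TAG[PID]: MESSAGE' from journal fields."""
    # The syslog PRI value encodes facility and severity together.
    pri = int(entry["SYSLOG_FACILITY"]) * 8 + int(entry["PRIORITY"])

    # Use the sender's original timestamp string verbatim.
    timestamp = entry["SYSLOG_TIMESTAMP"].strip()

    tag = entry.get("SYSLOG_IDENTIFIER", "")
    pid = entry.get("SYSLOG_PID")
    if tag and pid:
        tag = f"{tag}[{pid}]"

    header = f"<{pri}>{timestamp} "
    if tag:
        return f"{header}{tag}: {entry['MESSAGE']}"
    return header + entry["MESSAGE"]


example = {
    "PRIORITY": "6",
    "SYSLOG_FACILITY": "3",
    "SYSLOG_TIMESTAMP": "Oct 12 06:30:00",
    "SYSLOG_IDENTIFIER": "sshd",
    "SYSLOG_PID": "1234",
    "MESSAGE": "Accepted publickey for root",
}
print(rebuild_bsd_syslog_line(example))
```

Without the field, a consumer would have to re-render the timestamp
from the journal's own time data, which may not match the string the
sender originally wrote.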
* StandardOutput=/StandardError= in service files gained support for
new "append:…" parameters, for connecting STDOUT/STDERR of a service
to a file, and appending to it.
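The effect of appending, as opposed to overwriting, can be sketched in
Python as follows. This only illustrates append-mode semantics and is
not a description of how systemd sets up the file descriptors; the
helper name append_output() and the file mode 0o644 are assumptions
made for the example.

```python
import os


def append_output(path, data):
    """Write data to the end of path, creating the file if needed."""
    # O_APPEND makes every write land at the current end of the file,
    # so earlier content is preserved.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
```

Calling append_output() twice on the same path leaves both pieces of
output in the file, in the order they were written.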
* The signal to use as last step of killing of unit processes is now