systemd/backport-service-add-ability-to-pin-fd-store.patch

From b9c1883a9cd9b5126fe648f3e198143dc19a222d Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Wed, 29 Mar 2023 22:07:22 +0200
Subject: [PATCH] service: add ability to pin fd store
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Oftentimes it is useful to allow the per-service fd store to survive
longer than for a restart. This is useful in various scenarios:
1. An fd to some security relevant object needs to be stashed somewhere,
that should not be cleaned automatically, because the security
enforcement would be dropped then.
2. A user namespace fd should be allocated on first invocation and be
kept around until the user logs out (i.e. systemd --user ends), á la
#16328 (This does not implement what #16318 asks for, but should
solve the use-case discussed there.)
3. There's interest in allowing a concept of "userspace reboots" where the
kernel stays running, and userspace is swapped out (i.e. all services
exit, and the rootfs is transitioned into a new version of it) while
keeping some select resources pinned, very similar to how we
implement a switch root. Thus it is useful to allow services to exit,
while leaving their fds around till the very end.
This is exposed through a new FileDescriptorStorePreserve= setting that
is closely modelled after RuntimeDirectoryPreserve= (in fact it reuses
the same internal type), since we want similar behaviour in the end, and
the two settings will probably often be used together.
Conflict: Adaptation Context. The FileDescriptorStorePreserve= field is added to the directives.service file to prevent test case failures.
Reference: https://github.com/systemd/systemd/commit/b9c1883a9cd9b5126fe648f3e198143dc19a222d
---
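Note (not part of the upstream commit): the release rules this patch adds to
src/core/service.c can be summarised with a small standalone sketch. The type
and function names below are simplified stand-ins chosen for illustration, not
the systemd identifiers, and the store is reduced to a plain entry count; the
three predicates mirror the checks the hunks below add to
service_determine_dead_state(), service_enter_dead() and
service_release_resources().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for ExecPreserveMode (no / yes / restart). */
typedef enum { PRESERVE_NO, PRESERVE_YES, PRESERVE_RESTART } PreserveMode;

/* Hypothetical stand-in for the two "dead" service states. */
typedef enum { STATE_DEAD, STATE_DEAD_RESOURCES_PINNED } DeadState;

/* A stopped service is "dead with pinned resources" only if the store is
 * non-empty and the mode is "yes"; otherwise it is plainly dead. */
static DeadState determine_dead_state(size_t n_fd_store, PreserveMode mode) {
        return (n_fd_store > 0 && mode == PRESERVE_YES)
                ? STATE_DEAD_RESOURCES_PINNED
                : STATE_DEAD;
}

/* With "no", the store is dropped as soon as the service enters a dead state. */
static bool release_on_enter_dead(PreserveMode mode) {
        return mode == PRESERVE_NO;
}

/* When the unit's resources are released, every mode except "yes" drops
 * the store. */
static bool release_on_release_resources(PreserveMode mode) {
        return mode != PRESERVE_YES;
}
```

In this model, "restart" (the default) keeps the store while the service enters
a dead state but drops it once the unit's resources are released, whereas "yes"
keeps it through both steps.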
man/org.freedesktop.systemd1.xml | 6 +++
man/systemd.service.xml | 21 +++++++++-
src/basic/unit-def.c | 1 +
src/basic/unit-def.h | 1 +
src/core/dbus-execute.c | 8 ++--
src/core/dbus-execute.h | 2 +
src/core/dbus-service.c | 4 ++
src/core/load-fragment-gperf.gperf.in | 3 +-
src/core/load-fragment.c | 2 +-
src/core/load-fragment.h | 2 +-
src/core/service.c | 43 ++++++++++++++++-----
src/core/service.h | 1 +
src/shared/bus-unit-util.c | 3 +-
test/fuzz/fuzz-unit-file/directives.service | 1 +
14 files changed, 78 insertions(+), 20 deletions(-)
diff --git a/man/org.freedesktop.systemd1.xml b/man/org.freedesktop.systemd1.xml
index 74a9202..ec2148d 100644
--- a/man/org.freedesktop.systemd1.xml
+++ b/man/org.freedesktop.systemd1.xml
@@ -2343,6 +2343,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice {
readonly u FileDescriptorStoreMax = ...;
@org.freedesktop.DBus.Property.EmitsChangedSignal("false")
readonly u NFileDescriptorStore = ...;
+ @org.freedesktop.DBus.Property.EmitsChangedSignal("false")
+ readonly s FileDescriptorStorePreserve = '...';
readonly s StatusText = '...';
readonly i StatusErrno = ...;
readonly s Result = '...';
@@ -2898,6 +2900,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice {
<!--property NFileDescriptorStore is not documented!-->
+ <!--property FileDescriptorStorePreserve is not documented!-->
+
<!--property StatusErrno is not documented!-->
<!--property ReloadResult is not documented!-->
@@ -3430,6 +3434,8 @@ node /org/freedesktop/systemd1/unit/avahi_2ddaemon_2eservice {
<variablelist class="dbus-property" generated="True" extra-ref="NFileDescriptorStore"/>
+ <variablelist class="dbus-property" generated="True" extra-ref="FileDescriptorStorePreserve"/>
+
<variablelist class="dbus-property" generated="True" extra-ref="StatusText"/>
<variablelist class="dbus-property" generated="True" extra-ref="StatusErrno"/>
diff --git a/man/systemd.service.xml b/man/systemd.service.xml
index 350bc5f..ed9b410 100644
--- a/man/systemd.service.xml
+++ b/man/systemd.service.xml
@@ -1050,7 +1050,7 @@
<literal>FDSTORE=1</literal> messages. This is useful for implementing services that can restart
after an explicit request or a crash without losing state. Any open sockets and other file
descriptors which should not be closed during the restart may be stored this way. Application state
- can either be serialized to a file in <filename>/run/</filename>, or better, stored in a
+ can either be serialized to a file in <varname>RuntimeDirectory=</varname>, or stored in a
<citerefentry><refentrytitle>memfd_create</refentrytitle><manvolnum>2</manvolnum></citerefentry>
memory file descriptor. Defaults to 0, i.e. no file descriptors may be stored in the service
manager. All file descriptors passed to the service manager from a specific service are passed back
@@ -1059,12 +1059,29 @@
details about the precise protocol used and the order in which the file descriptors are passed). Any
file descriptors passed to the service manager are automatically closed when
<constant>POLLHUP</constant> or <constant>POLLERR</constant> is seen on them, or when the service is
- fully stopped and no job is queued or being executed for it. If this option is used,
+ fully stopped and no job is queued or being executed for it (the latter can be tweaked with
+ <varname>FileDescriptorStorePreserve=</varname>, see below). If this option is used,
<varname>NotifyAccess=</varname> (see above) should be set to open access to the notification socket
provided by systemd. If <varname>NotifyAccess=</varname> is not set, it will be implicitly set to
<option>main</option>.</para></listitem>
</varlistentry>
+ <varlistentry>
+ <term><varname>FileDescriptorStorePreserve=</varname></term>
+ <listitem><para>Takes one of <constant>no</constant>, <constant>yes</constant>,
+ <constant>restart</constant> and controls when to release the service's file descriptor store
+ (i.e. when to close the contained file descriptors, if any). If set to <constant>no</constant> the
+ file descriptor store is automatically released when the service is stopped; if
+ <constant>restart</constant> (the default) it is kept around as long as the unit is neither inactive
+ nor failed, or a job is queued for the service, or the service is expected to be restarted. If
+ <constant>yes</constant> the file descriptor store is kept around until the unit is removed from
+ memory (i.e. is not referenced anymore and inactive). The latter is useful to keep entries in the
+ file descriptor store pinned until the service manager exits.</para>
+
+ <para>Use <command>systemctl clean --what=fdstore …</command> to release the file descriptor store
+ explicitly.</para></listitem>
+ </varlistentry>
+
<varlistentry>
<term><varname>USBFunctionDescriptors=</varname></term>
<listitem><para>Configure the location of a file containing
diff --git a/src/basic/unit-def.c b/src/basic/unit-def.c
index 2667e61..09b2747 100644
--- a/src/basic/unit-def.c
+++ b/src/basic/unit-def.c
@@ -196,6 +196,7 @@ static const char* const service_state_table[_SERVICE_STATE_MAX] = {
[SERVICE_FINAL_SIGTERM] = "final-sigterm",
[SERVICE_FINAL_SIGKILL] = "final-sigkill",
[SERVICE_FAILED] = "failed",
+ [SERVICE_DEAD_RESOURCES_PINNED] = "dead-resources-pinned",
[SERVICE_AUTO_RESTART] = "auto-restart",
[SERVICE_CLEANING] = "cleaning",
};
diff --git a/src/basic/unit-def.h b/src/basic/unit-def.h
index 08651ef..c9e41bd 100644
--- a/src/basic/unit-def.h
+++ b/src/basic/unit-def.h
@@ -141,6 +141,7 @@ typedef enum ServiceState {
SERVICE_FINAL_SIGTERM, /* In case the STOP_POST executable hangs, we shoot that down, too */
SERVICE_FINAL_SIGKILL,
SERVICE_FAILED,
+ SERVICE_DEAD_RESOURCES_PINNED, /* Like SERVICE_DEAD, but with pinned resources */
SERVICE_AUTO_RESTART,
SERVICE_CLEANING,
_SERVICE_STATE_MAX,
diff --git a/src/core/dbus-execute.c b/src/core/dbus-execute.c
index d8931c1..97277ca 100644
--- a/src/core/dbus-execute.c
+++ b/src/core/dbus-execute.c
@@ -46,7 +46,7 @@
BUS_DEFINE_PROPERTY_GET_ENUM(bus_property_get_exec_output, exec_output, ExecOutput);
static BUS_DEFINE_PROPERTY_GET_ENUM(property_get_exec_input, exec_input, ExecInput);
static BUS_DEFINE_PROPERTY_GET_ENUM(property_get_exec_utmp_mode, exec_utmp_mode, ExecUtmpMode);
-static BUS_DEFINE_PROPERTY_GET_ENUM(property_get_exec_preserve_mode, exec_preserve_mode, ExecPreserveMode);
+BUS_DEFINE_PROPERTY_GET_ENUM(bus_property_get_exec_preserve_mode, exec_preserve_mode, ExecPreserveMode);
static BUS_DEFINE_PROPERTY_GET_ENUM(property_get_exec_keyring_mode, exec_keyring_mode, ExecKeyringMode);
static BUS_DEFINE_PROPERTY_GET_ENUM(property_get_protect_proc, protect_proc, ProtectProc);
static BUS_DEFINE_PROPERTY_GET_ENUM(property_get_proc_subset, proc_subset, ProcSubset);
@@ -1181,7 +1181,7 @@ const sd_bus_vtable bus_exec_vtable[] = {
SD_BUS_PROPERTY("Personality", "s", property_get_personality, offsetof(ExecContext, personality), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("LockPersonality", "b", bus_property_get_bool, offsetof(ExecContext, lock_personality), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("RestrictAddressFamilies", "(bas)", property_get_address_families, 0, SD_BUS_VTABLE_PROPERTY_CONST),
- SD_BUS_PROPERTY("RuntimeDirectoryPreserve", "s", property_get_exec_preserve_mode, offsetof(ExecContext, runtime_directory_preserve_mode), SD_BUS_VTABLE_PROPERTY_CONST),
+ SD_BUS_PROPERTY("RuntimeDirectoryPreserve", "s", bus_property_get_exec_preserve_mode, offsetof(ExecContext, runtime_directory_preserve_mode), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("RuntimeDirectoryMode", "u", bus_property_get_mode, offsetof(ExecContext, directories[EXEC_DIRECTORY_RUNTIME].mode), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("RuntimeDirectory", "as", NULL, offsetof(ExecContext, directories[EXEC_DIRECTORY_RUNTIME].paths), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("StateDirectoryMode", "u", bus_property_get_mode, offsetof(ExecContext, directories[EXEC_DIRECTORY_STATE].mode), SD_BUS_VTABLE_PROPERTY_CONST),
@@ -1551,7 +1551,7 @@ static BUS_DEFINE_SET_TRANSIENT_PARSE(protect_home, ProtectHome, protect_home_fr
static BUS_DEFINE_SET_TRANSIENT_PARSE(keyring_mode, ExecKeyringMode, exec_keyring_mode_from_string);
static BUS_DEFINE_SET_TRANSIENT_PARSE(protect_proc, ProtectProc, protect_proc_from_string);
static BUS_DEFINE_SET_TRANSIENT_PARSE(proc_subset, ProcSubset, proc_subset_from_string);
-static BUS_DEFINE_SET_TRANSIENT_PARSE(preserve_mode, ExecPreserveMode, exec_preserve_mode_from_string);
+BUS_DEFINE_SET_TRANSIENT_PARSE(exec_preserve_mode, ExecPreserveMode, exec_preserve_mode_from_string);
static BUS_DEFINE_SET_TRANSIENT_PARSE_PTR(personality, unsigned long, parse_personality);
static BUS_DEFINE_SET_TRANSIENT_TO_STRING_ALLOC(secure_bits, "i", int32_t, int, "%" PRIi32, secure_bits_to_string_alloc_with_check);
static BUS_DEFINE_SET_TRANSIENT_TO_STRING_ALLOC(capability, "t", uint64_t, uint64_t, "%" PRIu64, capability_set_to_string_alloc);
@@ -1842,7 +1842,7 @@ int bus_exec_context_set_transient_property(
return bus_set_transient_proc_subset(u, name, &c->proc_subset, message, flags, error);
if (streq(name, "RuntimeDirectoryPreserve"))
- return bus_set_transient_preserve_mode(u, name, &c->runtime_directory_preserve_mode, message, flags, error);
+ return bus_set_transient_exec_preserve_mode(u, name, &c->runtime_directory_preserve_mode, message, flags, error);
if (streq(name, "UMask"))
return bus_set_transient_mode_t(u, name, &c->umask, message, flags, error);
diff --git a/src/core/dbus-execute.h b/src/core/dbus-execute.h
index c538341..5926bdb 100644
--- a/src/core/dbus-execute.h
+++ b/src/core/dbus-execute.h
@@ -28,6 +28,8 @@ int bus_property_get_exec_output(sd_bus *bus, const char *path, const char *inte
int bus_property_get_exec_command(sd_bus *bus, const char *path, const char *interface, const char *property, sd_bus_message *reply, void *userdata, sd_bus_error *ret_error);
int bus_property_get_exec_command_list(sd_bus *bus, const char *path, const char *interface, const char *property, sd_bus_message *reply, void *userdata, sd_bus_error *ret_error);
int bus_property_get_exec_ex_command_list(sd_bus *bus, const char *path, const char *interface, const char *property, sd_bus_message *reply, void *userdata, sd_bus_error *ret_error);
+int bus_property_get_exec_preserve_mode(sd_bus *bus, const char *path, const char *interface, const char *property, sd_bus_message *reply, void *userdata, sd_bus_error *ret_error);
int bus_exec_context_set_transient_property(Unit *u, ExecContext *c, const char *name, sd_bus_message *message, UnitWriteFlags flags, sd_bus_error *error);
int bus_set_transient_exec_command(Unit *u, const char *name, ExecCommand **exec_command, sd_bus_message *message, UnitWriteFlags flags, sd_bus_error *error);
+int bus_set_transient_exec_preserve_mode(Unit *u, const char *name, ExecPreserveMode *p, sd_bus_message *message, UnitWriteFlags flags, sd_bus_error *error);
diff --git a/src/core/dbus-service.c b/src/core/dbus-service.c
index c0ec277..4b05968 100644
--- a/src/core/dbus-service.c
+++ b/src/core/dbus-service.c
@@ -240,6 +240,7 @@ const sd_bus_vtable bus_service_vtable[] = {
SD_BUS_PROPERTY("BusName", "s", NULL, offsetof(Service, bus_name), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("FileDescriptorStoreMax", "u", bus_property_get_unsigned, offsetof(Service, n_fd_store_max), SD_BUS_VTABLE_PROPERTY_CONST),
SD_BUS_PROPERTY("NFileDescriptorStore", "u", property_get_size_as_uint32, offsetof(Service, n_fd_store), 0),
+ SD_BUS_PROPERTY("FileDescriptorStorePreserve", "s", bus_property_get_exec_preserve_mode, offsetof(Service, fd_store_preserve_mode), 0),
SD_BUS_PROPERTY("StatusText", "s", NULL, offsetof(Service, status_text), SD_BUS_VTABLE_PROPERTY_EMITS_CHANGE),
SD_BUS_PROPERTY("StatusErrno", "i", bus_property_get_int, offsetof(Service, status_errno), SD_BUS_VTABLE_PROPERTY_EMITS_CHANGE),
SD_BUS_PROPERTY("Result", "s", property_get_result, offsetof(Service, result), SD_BUS_VTABLE_PROPERTY_EMITS_CHANGE),
@@ -477,6 +478,9 @@ static int bus_service_set_transient_property(
if (streq(name, "FileDescriptorStoreMax"))
return bus_set_transient_unsigned(u, name, &s->n_fd_store_max, message, flags, error);
+ if (streq(name, "FileDescriptorStorePreserve"))
+ return bus_set_transient_exec_preserve_mode(u, name, &s->fd_store_preserve_mode, message, flags, error);
+
if (streq(name, "NotifyAccess"))
return bus_set_transient_notify_access(u, name, &s->notify_access, message, flags, error);
diff --git a/src/core/load-fragment-gperf.gperf.in b/src/core/load-fragment-gperf.gperf.in
index 42441ea..49e036f 100644
--- a/src/core/load-fragment-gperf.gperf.in
+++ b/src/core/load-fragment-gperf.gperf.in
@@ -127,7 +127,7 @@
{{type}}.MountFlags, config_parse_exec_mount_flags, 0, offsetof({{type}}, exec_context.mount_flags)
{{type}}.MountAPIVFS, config_parse_exec_mount_apivfs, 0, offsetof({{type}}, exec_context)
{{type}}.Personality, config_parse_personality, 0, offsetof({{type}}, exec_context.personality)
-{{type}}.RuntimeDirectoryPreserve, config_parse_runtime_preserve_mode, 0, offsetof({{type}}, exec_context.runtime_directory_preserve_mode)
+{{type}}.RuntimeDirectoryPreserve, config_parse_exec_preserve_mode, 0, offsetof({{type}}, exec_context.runtime_directory_preserve_mode)
{{type}}.RuntimeDirectoryMode, config_parse_mode, 0, offsetof({{type}}, exec_context.directories[EXEC_DIRECTORY_RUNTIME].mode)
{{type}}.RuntimeDirectory, config_parse_exec_directories, 0, offsetof({{type}}, exec_context.directories[EXEC_DIRECTORY_RUNTIME].paths)
{{type}}.StateDirectoryMode, config_parse_mode, 0, offsetof({{type}}, exec_context.directories[EXEC_DIRECTORY_STATE].mode)
@@ -395,6 +395,7 @@ Service.SysVStartPriority, config_parse_warn_compat,
Service.NonBlocking, config_parse_bool, 0, offsetof(Service, exec_context.non_blocking)
Service.BusName, config_parse_bus_name, 0, offsetof(Service, bus_name)
Service.FileDescriptorStoreMax, config_parse_unsigned, 0, offsetof(Service, n_fd_store_max)
+Service.FileDescriptorStorePreserve, config_parse_exec_preserve_mode, 0, offsetof(Service, fd_store_preserve_mode)
Service.NotifyAccess, config_parse_notify_access, 0, offsetof(Service, notify_access)
Service.Sockets, config_parse_service_sockets, 0, 0
Service.BusPolicy, config_parse_warn_compat, DISABLED_LEGACY, 0
diff --git a/src/core/load-fragment.c b/src/core/load-fragment.c
index f357408..7e5f919 100644
--- a/src/core/load-fragment.c
+++ b/src/core/load-fragment.c
@@ -133,7 +133,7 @@ DEFINE_CONFIG_PARSE_ENUM(config_parse_job_mode, job_mode, JobMode, "Failed to pa
DEFINE_CONFIG_PARSE_ENUM(config_parse_notify_access, notify_access, NotifyAccess, "Failed to parse notify access specifier");
DEFINE_CONFIG_PARSE_ENUM(config_parse_protect_home, protect_home, ProtectHome, "Failed to parse protect home value");
DEFINE_CONFIG_PARSE_ENUM(config_parse_protect_system, protect_system, ProtectSystem, "Failed to parse protect system value");
-DEFINE_CONFIG_PARSE_ENUM(config_parse_runtime_preserve_mode, exec_preserve_mode, ExecPreserveMode, "Failed to parse runtime directory preserve mode");
+DEFINE_CONFIG_PARSE_ENUM(config_parse_exec_preserve_mode, exec_preserve_mode, ExecPreserveMode, "Failed to parse resource preserve mode");
DEFINE_CONFIG_PARSE_ENUM(config_parse_service_type, service_type, ServiceType, "Failed to parse service type");
DEFINE_CONFIG_PARSE_ENUM(config_parse_service_restart, service_restart, ServiceRestart, "Failed to parse service restart specifier");
DEFINE_CONFIG_PARSE_ENUM(config_parse_service_timeout_failure_mode, service_timeout_failure_mode, ServiceTimeoutFailureMode, "Failed to parse timeout failure mode");
diff --git a/src/core/load-fragment.h b/src/core/load-fragment.h
index 45e9c39..974bb42 100644
--- a/src/core/load-fragment.h
+++ b/src/core/load-fragment.h
@@ -93,7 +93,7 @@ CONFIG_PARSER_PROTOTYPE(config_parse_exec_selinux_context);
CONFIG_PARSER_PROTOTYPE(config_parse_exec_apparmor_profile);
CONFIG_PARSER_PROTOTYPE(config_parse_exec_smack_process_label);
CONFIG_PARSER_PROTOTYPE(config_parse_address_families);
-CONFIG_PARSER_PROTOTYPE(config_parse_runtime_preserve_mode);
+CONFIG_PARSER_PROTOTYPE(config_parse_exec_preserve_mode);
CONFIG_PARSER_PROTOTYPE(config_parse_exec_directories);
CONFIG_PARSER_PROTOTYPE(config_parse_set_credential);
CONFIG_PARSER_PROTOTYPE(config_parse_load_credential);
diff --git a/src/core/service.c b/src/core/service.c
index af6360e..9f16790 100644
--- a/src/core/service.c
+++ b/src/core/service.c
@@ -60,6 +60,7 @@ static const UnitActiveState state_translation_table[_SERVICE_STATE_MAX] = {
[SERVICE_FINAL_SIGTERM] = UNIT_DEACTIVATING,
[SERVICE_FINAL_SIGKILL] = UNIT_DEACTIVATING,
[SERVICE_FAILED] = UNIT_FAILED,
+ [SERVICE_DEAD_RESOURCES_PINNED] = UNIT_INACTIVE,
[SERVICE_AUTO_RESTART] = UNIT_ACTIVATING,
[SERVICE_CLEANING] = UNIT_MAINTENANCE,
};
@@ -84,6 +85,7 @@ static const UnitActiveState state_translation_table_idle[_SERVICE_STATE_MAX] =
[SERVICE_FINAL_SIGTERM] = UNIT_DEACTIVATING,
[SERVICE_FINAL_SIGKILL] = UNIT_DEACTIVATING,
[SERVICE_FAILED] = UNIT_FAILED,
+ [SERVICE_DEAD_RESOURCES_PINNED] = UNIT_INACTIVE,
[SERVICE_AUTO_RESTART] = UNIT_ACTIVATING,
[SERVICE_CLEANING] = UNIT_MAINTENANCE,
};
@@ -121,6 +123,8 @@ static void service_init(Unit *u) {
s->watchdog_original_usec = USEC_INFINITY;
s->oom_policy = _OOM_POLICY_INVALID;
+
+ s->fd_store_preserve_mode = EXEC_PRESERVE_RESTART;
}
static void service_unwatch_control_pid(Service *s) {
@@ -906,8 +910,10 @@ static void service_dump(Unit *u, FILE *f, const char *prefix) {
if (s->n_fd_store_max > 0)
fprintf(f,
"%sFile Descriptor Store Max: %u\n"
+ "%sFile Descriptor Store Pin: %s\n"
"%sFile Descriptor Store Current: %zu\n",
prefix, s->n_fd_store_max,
+ prefix, exec_preserve_mode_to_string(s->fd_store_preserve_mode),
prefix, s->n_fd_store);
cgroup_context_dump(UNIT(s), f, prefix);
@@ -1101,7 +1107,7 @@ static void service_set_state(Service *s, ServiceState state) {
s->control_command_id = _SERVICE_EXEC_COMMAND_INVALID;
}
- if (IN_SET(state, SERVICE_DEAD, SERVICE_FAILED, SERVICE_AUTO_RESTART)) {
+ if (IN_SET(state, SERVICE_DEAD, SERVICE_FAILED, SERVICE_AUTO_RESTART, SERVICE_DEAD_RESOURCES_PINNED)) {
unit_unwatch_all_pids(UNIT(s));
unit_dequeue_rewatch_pids(UNIT(s));
}
@@ -1202,7 +1208,8 @@ static int service_coldplug(Unit *u) {
return r;
}
- if (!IN_SET(s->deserialized_state, SERVICE_DEAD, SERVICE_FAILED, SERVICE_AUTO_RESTART, SERVICE_CLEANING)) {
+ if (!IN_SET(s->deserialized_state, SERVICE_DEAD, SERVICE_FAILED, SERVICE_AUTO_RESTART, SERVICE_CLEANING,
+ SERVICE_DEAD_RESOURCES_PINNED)) {
(void) unit_enqueue_rewatch_pids(u);
(void) unit_setup_dynamic_creds(u);
(void) unit_setup_exec_runtime(u);
@@ -1718,6 +1725,12 @@ static bool service_will_restart(Unit *u) {
return unit_will_restart_default(u);
}
+static ServiceState service_determine_dead_state(Service *s) {
+ assert(s);
+
+ return s->fd_store && s->fd_store_preserve_mode == EXEC_PRESERVE_YES ? SERVICE_DEAD_RESOURCES_PINNED : SERVICE_DEAD;
+}
+
static void service_enter_dead(Service *s, ServiceResult f, bool allow_restart) {
ServiceState end_state;
int r;
@@ -1734,10 +1747,10 @@ static void service_enter_dead(Service *s, ServiceResult f, bool allow_restart)
if (s->result == SERVICE_SUCCESS) {
unit_log_success(UNIT(s));
- end_state = SERVICE_DEAD;
+ end_state = service_determine_dead_state(s);
} else if (s->result == SERVICE_SKIP_CONDITION) {
unit_log_skip(UNIT(s), service_result_to_string(s->result));
- end_state = SERVICE_DEAD;
+ end_state = service_determine_dead_state(s);
} else {
unit_log_failure(UNIT(s), service_result_to_string(s->result));
end_state = SERVICE_FAILED;
@@ -1793,6 +1806,10 @@ static void service_enter_dead(Service *s, ServiceResult f, bool allow_restart)
/* Also, remove the runtime directory */
unit_destroy_runtime_data(UNIT(s), &s->exec_context);
+ /* Also get rid of the fd store, if that's configured. */
+ if (s->fd_store_preserve_mode == EXEC_PRESERVE_NO)
+ service_release_fd_store(s);
+
/* Get rid of the IPC bits of the user */
unit_unref_uid_gid(UNIT(s), true);
@@ -2449,7 +2466,7 @@ static int service_start(Unit *u) {
if (s->state == SERVICE_AUTO_RESTART)
return -EAGAIN;
- assert(IN_SET(s->state, SERVICE_DEAD, SERVICE_FAILED));
+ assert(IN_SET(s->state, SERVICE_DEAD, SERVICE_FAILED, SERVICE_DEAD_RESOURCES_PINNED));
r = unit_acquire_invocation_id(u);
if (r < 0)
@@ -2501,7 +2518,7 @@ static int service_stop(Unit *u) {
/* A restart will be scheduled or is in progress. */
if (s->state == SERVICE_AUTO_RESTART) {
- service_set_state(s, SERVICE_DEAD);
+ service_set_state(s, service_determine_dead_state(s));
return 0;
}
@@ -3312,6 +3332,7 @@ static void service_notify_cgroup_empty_event(Unit *u) {
* up the cgroup earlier and should do it now. */
case SERVICE_DEAD:
case SERVICE_FAILED:
+ case SERVICE_DEAD_RESOURCES_PINNED:
unit_prune_cgroup(u);
break;
@@ -4332,7 +4353,7 @@ int service_set_socket_fd(
assert(!s->socket_peer);
- if (s->state != SERVICE_DEAD)
+ if (!IN_SET(s->state, SERVICE_DEAD, SERVICE_DEAD_RESOURCES_PINNED))
return -EAGAIN;
if (getpeername_pretty(fd, true, &peer_text) >= 0) {
@@ -4369,7 +4390,7 @@ static void service_reset_failed(Unit *u) {
assert(s);
if (s->state == SERVICE_FAILED)
- service_set_state(s, SERVICE_DEAD);
+ service_set_state(s, service_determine_dead_state(s));
s->result = SERVICE_SUCCESS;
s->reload_result = SERVICE_SUCCESS;
@@ -4531,14 +4552,19 @@ static void service_release_resources(Unit *u) {
/* Don't release resources if this is a transitionary failed/dead state
* (i.e. SERVICE_DEAD_BEFORE_AUTO_RESTART/SERVICE_FAILED_BEFORE_AUTO_RESTART), insist on a permanent
* failure state. */
- if (!IN_SET(s->state, SERVICE_DEAD, SERVICE_FAILED))
+ if (!IN_SET(s->state, SERVICE_DEAD, SERVICE_FAILED, SERVICE_DEAD_RESOURCES_PINNED))
return;
log_unit_debug(u, "Releasing resources...");
service_close_socket_fd(s);
service_release_stdio_fd(s);
- service_release_fd_store(s);
+
+ if (s->fd_store_preserve_mode != EXEC_PRESERVE_YES)
+ service_release_fd_store(s);
+
+ if (s->state == SERVICE_DEAD_RESOURCES_PINNED && !s->fd_store)
+ service_set_state(s, SERVICE_DEAD);
}
static const char* const service_restart_table[_SERVICE_RESTART_MAX] = {
diff --git a/src/core/service.h b/src/core/service.h
index 2e803a3..04bdcb2 100644
--- a/src/core/service.h
+++ b/src/core/service.h
@@ -194,6 +194,7 @@ struct Service {
size_t n_fd_store;
unsigned n_fd_store_max;
unsigned n_keep_fd_store;
+ ExecPreserveMode fd_store_preserve_mode;
char *usb_function_descriptors;
char *usb_function_strings;
diff --git a/src/shared/bus-unit-util.c b/src/shared/bus-unit-util.c
index d3a5b25..b944cd0 100644
--- a/src/shared/bus-unit-util.c
+++ b/src/shared/bus-unit-util.c
@@ -1999,7 +1999,8 @@ static int bus_append_service_property(sd_bus_message *m, const char *field, con
"USBFunctionStrings",
"OOMPolicy",
"TimeoutStartFailureMode",
- "TimeoutStopFailureMode"))
+ "TimeoutStopFailureMode",
+ "FileDescriptorStorePreserve"))
return bus_append_string(m, field, eq);
if (STR_IN_SET(field, "PermissionsStartOnly",
diff --git a/test/fuzz/fuzz-unit-file/directives.service b/test/fuzz/fuzz-unit-file/directives.service
index de7d2c7..0509c36 100644
--- a/test/fuzz/fuzz-unit-file/directives.service
+++ b/test/fuzz/fuzz-unit-file/directives.service
@@ -163,6 +163,7 @@ ExecStopPost=
ExtensionImages=
FailureAction=
FileDescriptorStoreMax=
+FileDescriptorStorePreserve=
FinalKillSignal=
Group=
GuessMainPID=
--
2.33.0