
> it's literally a new random problem every other boot because of this non-deterministic startup, which was never a problem with traditional init or /etc/rc

This gave me a good chuckle. Systemd literally was created to solve the awful race conditions and non-determinism in other init systems. And it has done a tremendous job at it. Hence the litany of options to ensure correct order and execution: https://www.freedesktop.org/software/systemd/man/latest/syst...

And outside of esoteric setups I haven't ever encountered the problems you mentioned with service files.





systemd was created to solve the problems of a directory full of shell scripts. A single shell script has completely different problems. And traditional init uses inittab, which is not /etc/init.d, and works more like runit.

runit's approach is to just keep trying to start the shell script every 2 seconds until it works. It's one of those worse-is-better ideas: really dumb, and effective. You can check for arbitrary conditions and error-exit, and it will keep trying. If you need the time synced, you can just make your script fail if the time is not synced.
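
The retry-until-it-works idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not runit's actual implementation: `supervise`, its parameters, and the injected `sleep` are hypothetical names, and the 2-second default is simply the figure quoted above.

```python
import time
from typing import Callable, Optional


def supervise(start: Callable[[], bool],
              delay: float = 2.0,
              sleep: Callable[[float], None] = time.sleep,
              max_attempts: Optional[int] = None) -> int:
    """Call start() repeatedly until it reports success.

    start() stands in for the service's run script: it returns False when
    a precondition (e.g. "time is synced") is not met yet.
    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        if start():
            return attempts
        # Optional safety limit; by default the loop retries forever.
        if max_attempts is not None and attempts >= max_attempts:
            raise RuntimeError(f"gave up after {attempts} attempts")
        sleep(delay)
```

The point of the design is that the supervisor needs no dependency model at all: every precondition lives inside the script being retried.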

traditional inittab is older than that and there's not any reason to use it when you could be using runit, really.


yeah, many options that are complicated beyond the understanding of the distro maintainers, and yet still don't allow expression of common semantics required to support network services reliably

like "at least one real IP address is available" or "time has been synced"

and it's not esoteric, even ListenAddress with sshd doesn't even work reliably

the ONLY piece of systemd I've not had problems with is systemd-boot, and then it turned out they didn't write that


> like "at least one real IP address is available" or "time has been synced"

"network-online.target is a target that actively waits until the network is “up”, where the definition of “up” is defined by the network management software. Usually it indicates a configured, routable IP address of some kind. Its primary purpose is to actively delay activation of services until the network has been set up."

For time sync checks, I assume one of the targets available will effectively mean a time sync has happened. Or you can do something with ExecStartPre. You could run a shell command that checks for the most recent time sync or forces one.
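
A rough sketch of what such an ExecStartPre check could look like, assuming a systemd new enough to provide `timedatectl show` and its `NTPSynchronized` property. The function names and how the script is installed are hypothetical; the idea is only that a nonzero exit status from an ExecStartPre command makes the unit's start fail.

```python
import subprocess


def is_synchronized(timedatectl_output: str) -> bool:
    """Interpret the output of `timedatectl show --property=NTPSynchronized --value`."""
    return timedatectl_output.strip() == "yes"


def main() -> int:
    """Return 0 if the clock is reported as synchronized, 1 otherwise."""
    try:
        result = subprocess.run(
            ["timedatectl", "show", "--property=NTPSynchronized", "--value"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # timedatectl is not installed: treat as "not synchronized".
        return 1
    if result.returncode != 0:
        return 1
    return 0 if is_synchronized(result.stdout) else 1


# When saved as a standalone script, finish with: raise SystemExit(main())
```

The unit would then point ExecStartPre= at wherever this script is installed, so the main process is only started when the check exits with status 0.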


it's the "usually" that's the problem

this service (untouched by me) had:

After=local-fs.target network-online.target remote-fs.target time-sync.target

but it was still started without an IP address, and then failed to bind

just like this sort of problem: https://github.com/systemd/systemd/issues/4880#issuecomment-...

the entire thing is unreliable and doesn't act like you'd expect

> Or you can do something with ExecStartPre. You could run a shell command that checks for the most recent time sync or forces one.

at that point I might as well go back to init=/etc/rc


Are you running this particular unit file as a user unit or a system unit? Some targets like network-online.target don't work from user unit files.

You could also try targeting NetworkManager or networkd's "wait-online" services. Or if that doesn't work, something is telling systemd that you have an IP when you don't. NetworkManager has "ipv4.may-fail" and "ipv6.may-fail" that might be erroneously true.

> at that point I might as well go back to init=/etc/rc

The difference is that systemd is much better at ensuring correctness. If you write the invoked shell command properly, it'll communicate failure or success correctly and systemd will then communicate that state to the unit. It's still a lot more robust than before.


it's a system service file

the problem is systemd

> The difference is that systemd is much better at ensuring correctness.

yeah, whatever mate


Seems like you have an axe to grind with systemd because it replaced your familiar (but extremely cruddy) init system and now you refuse to debug the problem because you prefer being able to blame systemd.

There is so much granularity and flexibility in what you can do it seems rather unlikely you cannot make it happen correctly. And if it is truly a bug... open an issue? They're rather responsive to it. And it isn't like the legacy init systems were bug free from inception (well, lord knows they were still chock full of bugs even when they were replaced).

Edit: sitting here with a grin .. HN downvoting the advice of checking logs, debugging and opening an issue. I wish the companies y'all work at good luck.. they'll need it.


> Seems like you have an axe to grind with systemd because it replaced your familiar (but extremely cruddy) init system and now you refuse to debug the problem because you prefer being able to blame systemd.

I'm a pragmatist: I just want it to work

my solution to MULTIPLE different services failing to IP bind is to turn on the non-local ip binding sysctl, bypassing systemd's brokenness entirely
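
For reference, the sysctl in question is presumably `net.ipv4.ip_nonlocal_bind` (an assumption; the comment does not name it). A minimal sketch for checking whether it is enabled, with a hypothetical function name and the file path as a parameter:

```python
from pathlib import Path


def nonlocal_bind_enabled(
        proc_file: str = "/proc/sys/net/ipv4/ip_nonlocal_bind") -> bool:
    """Return True if the kernel allows binding to addresses not yet configured."""
    try:
        return Path(proc_file).read_text().strip() == "1"
    except OSError:
        # File missing or unreadable (e.g. not Linux): report as disabled.
        return False
```

With the sysctl set to 1, a daemon can bind to its configured address before any interface carries it, which sidesteps the ordering question instead of answering it.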

> There is so much granularity and flexibility in what you can do it seems rather unlikely you cannot make it happen correctly.

I've written an init before (in C), I know how the netlink interface to set an IP address and add routing table entries works

I understand the difference between monotonic and wall clocks

I understand the difference between Wants and Requires

I know what's going on at every, single, level

and I can't stand how unreliable systemd makes nearly every single one of my, bluntly, completely vanilla systems

> And if it is truly a bug... open an issue?

did you read the link I pasted earlier?

I'm not wasting my time with that level of idiocy (from LP himself)


> Some targets like network-online.target don't work from user unit files.

So basically it just doesn't work sometimes for no particular reason.

> The difference is that systemd is much better at ensuring correctness

Uh, well, you just said that it isn't, because some targets like network-online.target don't work from user unit files.


> https://github.com/systemd/systemd/issues/4880

I'm not a systemd hater or anything, but I continue to read stuff from Poettering which to me is deeply disturbing given the programs he works on.

Saying it's not a bug that a service is launched even though a stated required dependency failed... WTF?

Sure, I agree with him that most computers should probably boot despite NTP being unable to sync. But proposing that the solution to that is breaking Requires is just wild to me.


I'm not sure I understand why you think the solution proposed there is so bad.

The question in that issue is around the semantics of time-sync.target. Targets are synchronization points for the system and don't (afaik) generally make promises about the units that are ordered before them (in this case chrony-wait.service).

Does that answer your specific objection of "proposing that the solution to that is breaking Requires is just wild to me"? Basically, what is proposed in that issue is not breaking Requires=. The proposition is that the user add their own, specific Requires= as a drop-in configuration since that's not a generally-applicable default.
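
As a rough illustration of that drop-in (one reading of the proposal, not a quote from the issue), here is a small Python sketch that renders the file contents and the path it would live at. The helper names, the drop-in file name, and `myapp.service` are hypothetical; `chrony-wait.service` is the unit mentioned above.

```python
def render_dropin(required_unit: str) -> str:
    """Render a [Unit] drop-in that hard-requires and orders after a unit.

    Requires= alone does not imply ordering, so After= is added as well.
    """
    lines = [
        "[Unit]",
        f"Requires={required_unit}",
        f"After={required_unit}",
        "",
    ]
    return "\n".join(lines)


def dropin_path(service: str, name: str = "10-require-time-sync.conf") -> str:
    """Return the conventional location for an admin-provided drop-in."""
    return f"/etc/systemd/system/{service}.d/{name}"
```

For example, `render_dropin("chrony-wait.service")` written to `dropin_path("myapp.service")` would make the hypothetical myapp.service fail to start when chrony-wait.service fails, rather than relying on time-sync.target to carry that failure.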


No, that does not make sense, because it goes against the systemd documentation.

Targets[1]: Target units do not offer any additional functionality on top of the generic functionality provided by units. They merely group units, allowing a single target name to be used in Wants= and Requires= settings to establish a dependency on a set of units defined by the target, and in Before= and After= settings to establish ordering.

boot-complete.target[2]: Order units that shall only run when the boot process is considered successful after the target unit and pull in the target from it, also with Requires=.

Note use of "only run" with a reference to Requires=.

time-sync.target[3]: This target provides stricter clock accuracy guarantees than time-set.target (see above), but likely requires network communication and thus introduces unpredictable delays. Services that require clock accuracy and where network communication delays are acceptable should use this target.

Especially note the last sentence there.

The documentation clearly indicates that targets can fail, and that services that need the target to succeed should use Requires= to specify that.

If the above is not true, the Requires= and Targets documentation should be rewritten to explicitly say that targets might fulfill Requires= regardless of state. Also, the documentation for time-sync.target should explicitly remove the last sentence and instead state there is no functional difference between Requires=time-sync.target and Wants=time-sync.target, it is best-effort only.

[1]: https://www.freedesktop.org/software/systemd/man/latest/syst...

[2]: https://www.freedesktop.org/software/systemd/man/latest/syst...

[3]: https://www.freedesktop.org/software/systemd/man/latest/syst...


That seems like a fair point about the documentation! As far as I can see, you're right.

So that's why I find his statements disturbing.

If he really doesn't want targets to deliver failed/success guarantees, then they've massively miscommunicated in their documentation. That in my book is a huge deal.

In either case the issue should in no circumstance be casually dismissed as not-a-bug without further action.


I don't personally find it as disturbing as you do, I think. Which isn't to say that I don't think it should be fixed, etc. etc.

I'm sure the project would accept a documentation patch to amend this discrepancy. At the end of the day (despite what some people on the internet might like to allege), systemd is a free software project that, despite having (more or less) a BDFL, is ultimately a relatively bazaar-like project.

Though since these targets and unit properties are very core to systemd-the-service-manager, I do think that this is a bigger documentation oversight than most.


The disturbing part isn't the bug in time-sync.target or documentation, the disturbing part is how casually he brushes the issue away.

To me this is a huge red flag for a senior contributor to a core systems component, signalling some fundamental lack of understanding or imagination.

I very much disagree with not fixing time-sync.target, but if he had instead written a well-reasoned explanation for why time-sync.target should not propagate failed states and flagged it as a documentation bug, then that's something I'd respect and would be fine with. Or, even better IMHO, he'd fix time-sync.target and state that users who want to boot regardless should use Wants instead.


Is it possible for network-online to mean that, or does network-online actually mean that?

It is possible for a specification to be so abstract that it's useless.


That's entirely defined by whatever units order themselves before network-online.target (normally a network management daemon like NetworkManager or systemd-networkd). systemd itself doesn't define the details; that's left up to how that distro and sysadmin have configured the network manager/system.

Sysadmins really hate the word "usually", and that is at the root of just about every systemd headache I've had

Same. I run a server with a ton of services running on it which all have what I think are pretty complex dependency chains. And I also have used Linux with systemd on my laptop. Systemd has never, once, caused me issues.


