Ha, systemd is basically my git. I'm always surprised by how many developers don't grok git despite understanding significantly more complex things. I keep thinking if they would just take a day or two, they would get it and then wouldn't complain about having to copy paste commands and "rm -rf && git clone" to fix their workspaces.
And yet I do the _exact_ same thing with systemd. I don't get it, I don't like it, and every time I need to do _anything_, I'm googling, copy pasting commands into my terminal until something vaguely correct happens. The CLI options are indecipherable to me, like reading hieroglyphics.
Deep down I understand that it solves some pretty complex things when it comes to system initialization, but something about the UX of using journalctl and systemctl is so bad (for me) that it creates a mental block that I cannot overcome.
Luckily, nothing I do in my day job requires messing with systemd.
It's pretty well documented, especially for Linux. Follow the links from that page at least one level deep.
Part of the complexity is that it's tightly integrated into the same Linux concepts that enable containers. In fact, systemd-nspawn can manage containers on your behalf.
But start with the concepts and unit files, then read the man pages for systemctl and journalctl. That'll get you most of the knowledge you need to interact with it on a day-to-day basis.
> I'm always surprised by how many developers don't grok <things> despite understanding significantly more complex things.
You used git as a (counter?) example, but for me systemd has this issue because I touch it so infrequently that whatever I learned about it has aged out of mental cache.
I have the same issue with `jq`, GNU `parallel`, and `ffmpeg` (which I'll grant even if I used it daily I probably wouldn't be able to keep up with).
> UX of using journalctl and systemctl is so bad (for me)
Not JUST you. I've heard all the arguments 100x for "binary logs good/bad", but I feel this was a step down for some puritanical idea that doesn't translate for many to practicality.
how on earth is remembering the location of some log file, buried in /var/log somewhere, hopefully, or maybe next to the application in /etc/ or /opt/, easier than remembering journalctl -u unitfilename ? or journalctl -u unitfilename -f to follow the logs -- the very same flag tail uses?
after a decade of systemd usage I've never once had to actually think about the format the logs are stored in, or where they're stored; journalctl is actually easier than using tail and cat
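For anyone who wants the commands spelled out, here is a minimal sketch of the invocations mentioned above, wrapped in small shell functions so they can be pasted into a shell without running anything immediately. The function names and the example unit name `myapp.service` are made up for illustration; the flags (`-u`, `-f`, `-p`, `-b`, `--since`, `--no-pager`) are standard journalctl options.

```shell
# Hypothetical wrapper functions; only the journalctl flags are real.

# All stored logs for one unit, oldest first, without the pager.
unit_logs() {
    journalctl --no-pager -u "$1"
}

# Follow new log lines for one unit, like `tail -f`.
unit_follow() {
    journalctl -f -u "$1"
}

# Only messages of priority "err" or worse from the current boot.
# An empty result means nothing was logged at that priority, not
# that the service is necessarily healthy.
unit_errors() {
    journalctl --no-pager -b -p err -u "$1"
}

# Logs for one unit from the last hour.
unit_recent() {
    journalctl --no-pager --since "1 hour ago" -u "$1"
}

# Example (assumed unit name): unit_follow myapp.service
```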
> how on earth is remembering the location of some log file, buried in /var/log somewhere, hopefully, or maybe next to the application in /etc/ or /opt/, easier than remembering journalctl -u unitfilename ?
This may seem obvious to you, but I for one find filenames much easier to remember. Of course the logs are inside /var/log (and not anywhere else).
Moreover, with filenames, you can check your logs with a program that is independent of the program that produces it. Why implement the -f option on other programs when it is already implemented in tail?
That's the thing though, you have to remember filenames. I don't have to remember anything, just journalctl, because it can be run without any arguments and get logs for -everything that's running- including units I totally forgot about!
More than once I've forgotten the name of some unit I wrote -- the same as I easily forget filenames for random services I've installed, and where I might have installed them. I can discover the service name by calling systemctl status, or I can just call journalctl with no arguments and pass directly to grep to find what I need, in whatever unit it's in.
Try that when you have log files scattered all around your filesystem.
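A rough sketch of that discovery workflow, again as shell functions. The function names are illustrative assumptions; `systemctl list-units` and piping `journalctl` into `grep` are the real pieces.

```shell
# Hypothetical helpers for finding a unit whose name you have forgotten.

# List loaded service units whose name or description matches a pattern.
find_unit() {
    systemctl list-units --type=service --all --no-pager | grep -i -- "$1"
}

# Search the log lines of every unit from the current boot for a string.
# Each line of the default output names the process that logged it,
# which usually points back at the unit.
grep_journal() {
    journalctl --no-pager -b | grep -i -- "$1"
}

# Example (assumed search term): find_unit backup
```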
I concede it's as easy if you're just reading one log from one machine, but what if you're harvesting and/or merging multiple logs from multiple machines? It's much easier to do with simple unix tools and plaintext logs.
> after a decade of systemd usage I've never once had to actually think about the format the logs are stored in, or where they're stored; journalctl is actually easier than using tail and cat
It's hard to explain the problem with journalctl... it just always does something completely different from what I expect it to do.
The option to fix it is always there, individualized by some rule that is completely different from what a command line interface tends to use, so it's mostly unrecognizable.
It wouldn't really be that bad, except for the fact that it's very hard to be confident it's doing the job you wanted. So it's showing no errors with this service... does that mean the service has no errors?
Sounds like a you problem. You have identified a hole in your knowledge, you should address it. Systemd is not rocket science, and I would say it's more intuitive than git is anyway.
It’s not rocket science, but it is baroque. TFA ends with 30 lines of config to make it boot and do nothing, including some to disable a bunch of spooky action at a distance invisible default dependencies, which still ends up with a system with a bunch of other units that aren’t explained.
I was finally impressed with systemd when I recently had to port our single-process multi threaded TCP/UDP server (listening on a heap of ports) at work to a multiple binary/process, single threaded asynchronous IO system.
PartOf= let us still have one top level “lithium” service to start and stop which starts and stops all of the other services, while being able to use journalctl to view them as one log file via a wildcard!
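For readers who haven't used `PartOf=`, here is a minimal sketch of how such a layout might look. It writes two example unit files into a scratch directory rather than `/etc/systemd/system`, so it is safe to run as-is. Apart from the name "lithium" taken from the comment above, everything here (`lithium-worker@.service`, the `/opt/lithium/worker` path, the `tcp`/`udp` instance names) is an assumption for illustration, not the poster's actual setup.

```shell
unit_dir=$(mktemp -d)

# Top-level unit: does nothing itself, but pulls in the workers on start.
cat > "$unit_dir/lithium.service" <<'EOF'
[Unit]
Description=Lithium (umbrella unit)
Wants=lithium-worker@tcp.service lithium-worker@udp.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true

[Install]
WantedBy=multi-user.target
EOF

# Template for each worker process. PartOf= propagates stop and restart
# of lithium.service to every instance of this unit.
cat > "$unit_dir/lithium-worker@.service" <<'EOF'
[Unit]
Description=Lithium worker (%i)
PartOf=lithium.service
After=lithium.service

[Service]
ExecStart=/opt/lithium/worker --mode %i
Restart=on-failure
EOF

ls "$unit_dir"

# After copying these to /etc/systemd/system and running
# `systemctl daemon-reload`, one would use:
#   systemctl start lithium.service
#   journalctl -u 'lithium*'      # merged view of all matching units
```

Note that `PartOf=` only covers stop and restart; starting the workers along with the top-level unit is what the `Wants=` line is for.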
Systemd is a pretty neat little thing. It's what keeps me running Ubuntu and Debian servers instead of switching to Alpine, at least in those situations where I can't containerize what I want to do.
The thing that got me into it was systemd timers. I really like being able to split up what should run from when should it run. `cron` works best if you desire simplicity above all else, and Slack used `cron` for a very long time before moving to something more custom. But I ultimately don't feel like the learning curve for systemd timers is that bad, and the benefits it can bring (like writing all your echo statements to journalctl instead of some random hard coded logfile in a script) feel very worth it to me.
>`cron` works best if you desire simplicity above all else
It may be simple, but its config file is hard to comprehend compared to a systemd timer. Being able to just use "Friday 17:00" or "hourly" is so much more readable.
Debugging and traceability is just nuts too. With a timer and a run-once service it's easy to see what happens, happened, and will happen:
systemctl status foo.timer tells you when it was last triggered and when it will trigger next, if it is still enabled.
systemctl status foo.service tells you if it has been triggered from the timer and when, or ran manually with start
I get the conceptual utter simplicity of cron in place of timer + a script in place of service but in practice systemd is much simpler and more consistent to manage.
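To make the timer/service split concrete, a minimal sketch: the block below writes a `foo.timer` and a matching `foo.service` into a scratch directory. The name `foo` comes from the comment above; the script path `/usr/local/bin/foo.sh` and the schedule are placeholders.

```shell
unit_dir=$(mktemp -d)

# What should run.
cat > "$unit_dir/foo.service" <<'EOF'
[Unit]
Description=Example job

[Service]
Type=oneshot
ExecStart=/usr/local/bin/foo.sh
EOF

# When it should run. By default a timer activates the service
# with the same base name.
cat > "$unit_dir/foo.timer" <<'EOF'
[Unit]
Description=Run foo.service every Friday at 17:00

[Timer]
OnCalendar=Fri 17:00
# Catch up after boot if the machine was off at the scheduled time.
Persistent=true

[Install]
WantedBy=timers.target
EOF

cat "$unit_dir/foo.timer"

# Once installed under /etc/systemd/system:
#   systemctl daemon-reload
#   systemctl enable --now foo.timer
#   systemctl status foo.timer      # last and next trigger time
#   systemctl list-timers           # overview of all timers
#   journalctl -u foo.service       # anything the script printed
```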
Exactly, and there are still a lot of old school "cron" users who don't touch the better "anacron" processes. I hate managing that stuff.
Systemd > Anacron > Cron
Even worse was when I was finding "crons" run from Java apps with cron4j. I hated on SystemD a lot in the early days, and there are still many legit criticisms that don't have good answers, but I have come to appreciate its power and usefulness very much. Writing your own units seems so much cleaner than the old way, and I say that as an everyday bash script writer/user. (example, systemd-nspawn for containerization)
PS: mcron is what I have been tinkering with and think will probably replace my systemd timers for most purposes.
I started writing a "oh, I never found it that difficult" comment. Then I thought to test my own belief and tried to type out a cron schedule for "run this every hour", and... Well...
As well as the already-mentioned `n * * * *` solution, there's also dropping a symlink in /etc/cron.hourly to the program to run. Or a 2-line shell script if you need to add command-line params. (The other line is the shebang.)
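The 2-line script variant might look like the sketch below. It writes the script to a scratch directory instead of `/etc/cron.hourly`; the program path `/usr/local/bin/report` and its `--quiet` flag are invented for the example.

```shell
scratch=$(mktemp -d)

# The wrapper itself: a shebang plus one command line.
cat > "$scratch/report" <<'EOF'
#!/bin/sh
exec /usr/local/bin/report --quiet
EOF
chmod +x "$scratch/report"

cat "$scratch/report"

# To install: copy it to /etc/cron.hourly/report.
# The crontab equivalent, running at minute 17 of every hour, would be:
#   17 * * * * /usr/local/bin/report --quiet
```

One gotcha worth knowing: Debian-style `run-parts` skips file names containing a dot by default, so the wrapper is better off without a `.sh` suffix.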
The cron.hourly directory is a non-standard Linux concept provided by run-parts. The FreeBSD equivalent is periodic, but that's really for system tasks and doesn't have anything more granular than daily.
> There's a single implementation as far as I know
That's exactly what I mean.
Systemd itself, including timers, is deeply coupled with the Linux API, and cannot be (easily) ported to any other system. Cron, on the other hand, exists on almost every UNIX-like system.
systemd is many things, but obvious and accessible is not one of them. It has a lot of power and flexibility, but the trade off is a lot of complexity. Just look at the unit configuration file:
I have no idea from that documentation how I'd run something every hour. I guess I have to create a new unit file, and use the OnCalendar stanza? But what do I set it to? I'm directed to this page:
I started my career as a Unix SA over 25 years ago and have worked with a lot of different Unix-flavored operating systems: SunOS/Solaris, HP/UX, FreeBSD, Linux (Slackware, RedHat, Debian), macOS/OS X, AIX. I'm familiar with all their different variations of init.
All of them are esoteric in one way or another. Some of them have their behavior right out in the open where it's easy to see (inittab and rc scripts). Others hide away massive complexity (launchd and systemd) and require extensive documentation to understand.
I appreciate all the power that systemd provides. It gets a lot of things right. But in terms of complexity, it's almost an operating system unto itself.
I'm not quite sure why you felt pressed to write this long-winded response to something I didn't write. I never said that systemd is easy. I said that if someone is running *BSD, today, in 2024, they'll cope with cron's syntax.
I'll still reply to one part:
> Oh, I see I can use "hourly" but it turns out that's syntactic sugar for:
> *-*-* *:00:00
>Is that really any easier than the cron equivalent?
> 0 * * * *
...Yes? Ask someone who's not familiar with either cron or systemd to guess what each one means, with the context that it's supposed to be something related to dates and times. If you're vaguely familiar with how dates are usually written down, and that a star is a wildcard, you can immediately guess that the first one means "any year, any month, any day, any hour, at the first minute, first second". The cron one? It's anyone's guess.
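For a side-by-side, here is a toy sketch that maps the simplest cron case, "at minute M of every hour", onto the `OnCalendar=` syntax. The function name is made up, and it deliberately handles only that one pattern.

```shell
# Hypothetical helper: cron "M * * * *"  ->  OnCalendar "*-*-* *:MM:00".
# Pass the minute as a plain number without a leading zero (0-59).
cron_minute_to_oncalendar() {
    printf '*-*-* *:%02d:00\n' "$1"
}

cron_minute_to_oncalendar 0     # cron: 0 * * * *
cron_minute_to_oncalendar 15    # cron: 15 * * * *

# If systemd-analyze is installed, it can show how systemd normalizes
# an expression and when it would next elapse:
#   systemd-analyze calendar hourly
```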
Hah, that's fair. I inferred from your comment the corollary that systemd is not arcane, and thought "oh, it's just arcane differently," hence my long-winded reply.
You can argue that no other system matters (which I strongly disagree with for reasons I don't want to get into right now), but that doesn't make systemd timers any more portable.
I’d argue that the degree of portability implied by the term is very much context dependent. “Portable between Linux systems” is a perfectly valid use of the term, imo.
Do you take portable to mean a program that can run on every processor arch and OS available?
> Do you take portable to mean a program that can run on every processor arch and OS available?
The definition of the word "portable" is not the point of this comment thread (although if you want to hear my definition, see the reply to a sibling comment).
Let's take the context into account:
> hashworks:
While I support systemd timers over cron, AFAIK cron has stuff like @hourly.
> caiusdurling:
Some cron implementations do, it's not portable.
> bheadmaster:
Neither are systemd timers. I don't think there's a single system out there implementing the systemd timer interface.
User caiusdurling said @hourly is not portable between cron implementations as a way to discredit hashworks' argument about cron having @hourly.
However, that's a disingenuous argument, because systemd timers aren't any more portable than cron implementations that use @hourly - in fact, there isn't a single system out there implementing the systemd timer interface except systemd.
Portable means easy to make it work while changing the underlying system or its implementation details - where the underlying system refers to architecture, OS, API or any other dependencies needed to run the program.
In context of systemd, systemd isn't portable because it highly depends on Linux API and thus cannot be ported to any other UNIX-like (or unlike) OS.
In context of systemd timers, systemd timers aren't portable because there isn't a single alternative implementation that can interpret systemd timer unit files.
I wish I could make friends with systemd like y'all. I spent at least a week wrestling unsuccessfully with systemd (admittedly starting from complete ignorance) to make a service that synchronizes a local directory on a laptop with a remote directory on a server after a docker container is stopped but before the filesystems and network stack are torn down during poweroff. It was easy enough to accomplish by manually invoking a script prior to powering off but no amount of cajoling, poring over documentation, or reading and adapting examples from forum posts on my part would get systemd to do it automatically without conflicts, race conditions, or timeouts. Is this not exactly the kind of problem systemd dependencies are intended to solve?
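One commonly suggested shape for this kind of job, offered as an untested sketch rather than a known-good answer: a unit that does nothing at start and does the real work in `ExecStop=`. Since systemd stops units in the reverse of their start order, ordering the unit `After=network.target` and `Before=docker.service` should mean that at poweroff Docker is stopped first, then the sync runs, and only then is the network taken down. The unit name, the script path `/usr/local/bin/sync-dir.sh`, and the timeout are placeholders, and it assumes the container is managed through `docker.service`.

```shell
unit_dir=$(mktemp -d)

cat > "$unit_dir/sync-on-shutdown.service" <<'EOF'
[Unit]
Description=Sync local directory to server at shutdown
# Stop order is the reverse of start order:
#  - started after network.target  => stopped before the network goes down
#  - started before docker.service => stopped after Docker has stopped
After=network.target
Before=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=/usr/local/bin/sync-dir.sh
TimeoutStopSec=5min

[Install]
WantedBy=multi-user.target
EOF

cat "$unit_dir/sync-on-shutdown.service"
```

The unit has to be enabled and started (it will show as "active (exited)") for `ExecStop=` to fire at shutdown. Whether the link is really still up at that point depends on what manages it; a Wi-Fi connection owned by a user session, for instance, may already be gone.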
From personal experience, systemd works quite well when it works, but sometimes I ran into problems with it that were very hard for me to solve. I admit I never became a systemd expert, but I think I had a decent working knowledge of it.
The blog post series author then did an online playground [1] that lets you create systemd examples and experiment with systemd in the browser. It's really interesting.
Not a lot of people, a very loud minority. They can go use Void Linux or other niche distros which replace it with the monstrosity that were SysV init scripts. It's the type of people that never had to write one of those scripts by hand, and maybe support two or three different distros.
The UX of systemd tools (systemctl, journalctl) is a bit crap, to be honest, but systemd does what it needs to do well and it is an established component of a modern Linux system.
I'm really tired of people pretending that systemd is the only modern init system around. If you're going to compare it to the status quo, at least don't strawman it. If SysV init scripts are your comparison, it just sounds like you're regurgitating outdated talking points.
Sorry but this argument really falls flat. All of the other init systems' service files are reminiscent of SysV, say OpenRC or runit.
Moreover systemd achieves what it does because it provides a unified way of doing system administration. It is not only a service manager. It is the much needed dynamic system layer for Linux.
A modern Linux engineer or sysadmin no longer needs to make 5 to 10 different services work together by duct taping everything or writing abstraction layers per integration. Any system that has the correct level of abstraction will be at least as complex as systemd.
I never said it was. I said it's the only relevant one. Upstart was half-decent, for example, but no one uses it any more. Yes, there are other modern init systems, but none so brilliantly designed and perfect that they are worth this kind of resistance. They're all crappy in their own way. So what if systemd has won.
I do not get why people want to die on this inconsequential hill.
> regurgitating outdated talking points
No, I'm not being paid by Poettering. I've been running Linux systems and software for 25 years, and I have formed my opinion on the matter, thank you very much.
That's several conjectures on your part.
Give me metrics about the minority and then back up the fact that most of those people never wrote a sysinit or rc script by hand.
Like it or not Systemd breaks the Unix way and for a certain amount of people (I am not pretending to have a measuring stick) the dogma and philosophy are much more important than perceived ease of use or features.
All the major distros, especially server distros, run systemd. That is proof enough that NOT running systemd is a niche and unsupported configuration in any half-serious setup.
A conjecture is not automatically wrong. The point of a conjecture is to challenge someone else to prove it wrong. You'd have a hard time showing proof that systemd is niche.
And honestly, it does not break UNIX, let's be serious now. A haphazard bunch of shell scripts is not much better than a monolithic binary, and UNIX does not give a damn either way.
I do not care about dogma and philosophy, I am an engineer with software to run and deadlines to meet.
> I am an engineer with software to run and deadlines to meet.
Sure, but UNIX, with its minimalist philosophy, was - luckily for us! - not created under that slogan. With respect to quality, this slogan has always been the one driving its decrease that we see all around us today.
UNIX was created by engineers to run software, not as a philosophical device.
Sorry to say, yours is a terrible excuse. I respect that you might not like systemd, no software is perfect and systemd has many flaws, but the "think of the UNIX philosophy" is righteous nonsense that I cannot take seriously.
The real world is not black-and-white, with or against us. One of the most beloved UNIX tools, Emacs, is far removed from the minimalist philosophy that some have decided is UNIX's crowning achievement. It is not. Linux and UNIX have not won because we're all into minimalism, or dogmatic zealotry.
I don't care about a 50 year old technology philosophy. Times have changed.
Cloud computing and the prevalence of highly distributed systems have completely changed how software is deployed, scaled, and managed. These environments often rely on complex orchestration which can stretch the boundaries of the UNIX philosophy towards systems that are more about interactions between distributed components than about simple, single-purpose tools.
I don't see how it's relevant. An orchestration tool can also follow the Unix way by not implementing too much functionality and relying on the small tools for subtasks.
And do you know what systemd is? A project that produces multiple orthogonal executables that share a syntax paradigm and communicate over standardized IPC mechanisms.
I think the only valid-ish argument these Unix philosophy discussions boil down to is "why didn't they use purely text based interfaces". Well, because they are inefficient and hard to debug and reason about. Making things text-only doesn't make them more understandable, but it does make sure programs waste energy and development time on meaningless parsing code that is full of security issues.
You have a point concerning the text files. (Although I would be really interested to see an example where efficiency of text files is insufficient on modern hardware.)
However, "multiple orthogonal executables" is just a strawman: their interfaces are unstable and non-straightforward, and one can't reliably replace one of them with an alternative. It's technically true but in practice not. This effectively makes it a huge, inflexible, not-easily-verifiable blob with wide permissions and, potentially, many bugs.
> and communicate over standardized IPC mechanisms
I've never seen that anyone could replace one systemd component with an alternative implementation. Did you?
You’re commenting on a post by someone that managed to encapsulate a fully functional instance of an init manager, ripped out all of its default configuration, and replaced them with dummy files - just to find out the whole thing works exactly as documented.
And now you seriously argue this is brittle, non-straightforward, inflexible, and hard to verify? For real?
If I ever have seen a bloody well designed software system, it is this!
I really dislike that systemctl subcommands do not respect the accepted standard of `command subcommand -h`. For example, try `git branch -h` or `systemctl start -h`. The latter won't give any useful information.
The man page does not help either. `man git branch` opens the manual of the `git-branch` binary, so only the relevant options are shown. `man systemctl start` will show the man of the entire systemctl command, which is a bit unwieldy.
`journalctl` is the same: it both does paging through the logs, but also manages rotating or vacuuming the logs. Why not have two commands?
>The UX of systemd tools (systemctl, journalctl) is a bit crap....
And this is easily 80% of why I don't like it. I _know_ it's better* but my use-cases have never clashed with the limitations of init.d.
*better: I've recently had an issue that I was trying to troubleshoot while using `journalctl -fu service-name` and it just wasn't giving me enough information/the same content that /var/log/messages was giving me on my other OS (FreeBSD). Perhaps there was a flag I was missing (increase verbosity??) but the UX did get in the way.
Side note: I do chuckle every time I end up using the -fu flag. =)
I think it depends on what you get used to first. I only started using Linux back in 2016-17 (I'm 25), and for me systemd is just so insanely good compared to anything else I've tried. Maybe it's just because I never got used to what came before it or to current alternatives.
I also don't care that much about the "Unix philosophy" or whatever, so there's that.
Yep it's magical to have a working runfile or config on demand. Actually a 10x boost in productivity in not having to spelunk through docs to find the obscure option
I haven’t been in the weeds with systemd for a while, but this post was very helpful in understanding how systemd actually works, which didn’t match my (or, I would think, nearly anyone’s) intuition as a user of it. I don’t know if there’s any other full technical explanation of how units, jobs, transactions, etc. are actually handled.
>systemd’s developers and almost all public documentation revolve around a “unit-centric” definition of systemd (except when discussing bugs), but my position is that for systemd-the-service-manager the most important construct is not the unit but the job, and hence one ought to understand systemd in a “job-centric” way.
Took me a while to read through so I could come back and comment.
I've got to say, this is probably the best single explainer of the basics of systemd I've read.
I've only recently started fiddling with systemd extensively since I had my Linux-coming-of-age prior to its widespread adoption. I've got to be honest. I've found the experience very confusing and disorientating. The man pages feel really terse, even by manpage standards, and all the information is heavily split up so it's quite difficult to build a mental model of the full system IMO.
Something like this would have been magic before, but it was still really helpful to me now with the way it explained things. I feel like it took me most of the way from cargo culting to actual understanding.
funny, to me systemd == no docker, no containers, just a VM.
it's my go-to way to keep my programs running and have them restarted if the vm reboots. I use VMs like "pods". I deploy code directly to the VM and run it there along with other programs. I scale up and scale down with: https://www.pulumi.com/
If I correctly understand your point, then it's worth noting the article is using a container merely to showcase the different parts of systemd in a sterile, reproducible environment. They're not running containers as systemd services, even though that's possible with podman
oh I guess I mean "most kids today" know nothing of running code outside docker. And don't ever need systemd. The "orchestration" of their pods is done by k8s. But I just don't get why. The extra complexity for what? Just make each ec2 instance VM a pod.
Back in the day when LinuxAcademy was still cool, I asked for a systemd course... and they did it. Can't find the link right now, but that was what made systemd really click for me.
Weird that still there is no apress/o'reilly/packt/manning introductory book about systemd.
A casualty of acquisitions, I think. Linux Academy got bought by A Cloud Guru, who then got bought by Vista Equity along with Pluralsight. I think the material lives there now [0].
Packt does have "Linux Service Management Made Easy with systemd" as of Feb 2022, but I've not yet read it.
Personally I find systemd man pages pretty great, they comprehensively explain pretty much every thing that you need to know. Sure, they are more of a reference than a tutorial and there is lot of material to read there, but still I'd recommend reading the relevant sections whenever you are working with systemd
I am not having a good time with systemd on my embedded device.
Somewhat complicated scenario, I need to change the mode of a wifi driver (which I do via modprobe), use iw to create a second interface for that wifi device, and finally use networkmanager to create a wifi hotspot based on the host name of the device. We set the hostname based on the mac address of wlan0.
There are a lot of places this can go wrong, and you need a lot of deep knowledge of systemd to get it right.
I start with three services, one to set the hostname, one to create the wlan0_ap device, and one to start up the wifi hotspot using nmcli.
Alright, simple enough, this is the kind of thing systemd is good at, right?
Well no, (and pardon me if my memory isn't perfect on this, it's tricky to debug weird race conditions) sometimes setting the hostname fails. Why? Well sometimes wlan0 doesn't exist, there's a udev rule that renames wlan0 during start up. Apparently I can add the systemd tag to that udev rule, and it will create a `.device` dependency I can Require. Alright, use both require and after keywords just to be sure. Good enough.
Huh, oneshot services are listed as stopped even though they've run. Do some research, find out I need the `RemainAfterExit` keyword too. Sure, dependencies are working.
Now wlan0_ap device creation sometimes fails. Why? No really, why, this is hard to debug. Well looking around at dmesg it looks like the problem is that networkmanager is trying to do something with the device at the same time I am. Looks like it's enumerating device capabilities at the same time I'm trying to create a new device? Oh well, I'll just add the retry keyword to my service.
Oh, wait, I can't add the retry keyword to one-shot services? Really? Internet tells me to change the service type to simple. Sure, we'll try that. Now that wlan0_ap device exists, let's start our access point.
Also it's still sometimes failing on the hostname step? The set-hostname command is asynchronous I think? I'll add a line to the start_access_point script that waits until the hostname isn't the default.
Everything is looking good. Oh wait, it's sometimes failing intermittently. What the hell, why doesn't wlan0_ap exist? Service type `simple` considers a service to be started when the command first launches, not when it exits successfully. I've introduced another race condition.
How can I make a service that works like a oneshot, but that you can retry? What, write a shell script and use systemd-notify to say it's complete?
Long story short I ended up just writing a traditional shell-script based init script. Maybe I'm thinking about systemd wrong or something.
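For reference, a sketch of how the pieces mentioned above might fit into one unit, written to a scratch directory. This is an assumption-laden illustration, not a tested fix: the unit name, the script path, and the retry delay are invented; it assumes the udev rule for `wlan0` carries `TAG+="systemd"` so the `.device` unit exists; and it relies on `Restart=on-failure` being accepted for `Type=oneshot`, which as far as I know requires systemd 244 or newer.

```shell
unit_dir=$(mktemp -d)

cat > "$unit_dir/create-wlan0-ap.service" <<'EOF'
[Unit]
Description=Create wlan0_ap virtual interface
# Wait for the renamed wlan0 device to appear.
Requires=sys-subsystem-net-devices-wlan0.device
After=sys-subsystem-net-devices-wlan0.device
# Run before NetworkManager starts probing the device.
Before=NetworkManager.service

[Service]
Type=oneshot
# Keep the unit "active" after the script exits so it is not shown as stopped.
RemainAfterExit=yes
ExecStart=/usr/local/bin/create-wlan0-ap.sh
# Retry if the script exits non-zero.
Restart=on-failure
RestartSec=2

[Install]
WantedBy=multi-user.target
EOF

cat "$unit_dir/create-wlan0-ap.service"
```

How units ordered after this one behave while it is being retried is something I would verify on the target systemd version rather than assume.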
> I need to change the mode of a wifi driver (which I do via modprobe)
Is that something that could be added to modprobe.conf to have it work right without any modprobe calls?
> Huh, oneshot services are listed as stopped even though they've run. Do some research, find out I need the `RemainAfterExit` keyword too. Sure, dependencies are working.
Yeah I've also been burned not having RemainAfterExit=y in a oneshot unit. What's worse, such a unit can get started multiple times.
> Well looking around at dmesg it looks like the problem is that networkmanager is trying to do something with the device at the same time I am. Looks like it's enumerating device capabilities at the same time I'm trying to create a new device?
To be fair such race condition could have happened in any init system.
> Oh well, I'll just add the retry keyword to my service.
Perhaps having a dependency to start your service before network-manager would have solved this.
> Long story short I ended up just writing a traditional shell-script based init script. Maybe I'm thinking about systemd wrong or something.
It's perfectly fine IMO to have small shell script services for such stuff.
Probably some of the problems are due to network-manager being mostly designed/used for desktop use cases. It also likes to take complete control of all the interfaces so calling iw/ifconfig behind its back will cause tears (as you found out).
I do agree that race conditions caused by parallel service startup really suck in embedded devices. Systemd really could use a "please be as deterministic as possible" mode for embedded (if it doesn't already have one).
>Is that something that could be added to modprobe.conf to have it work right without any modprobe calls?
Yes, that's how I do it. But that just switches the mode to one that supports virtual interfaces, I still need to add the virtual interface, and I can't do that in a modprobe conf.
>To be fair such race condition could have happened in any init system.
I feel like it's really easy to make that happen under systemd if you're not completely on the ball, whereas it's more difficult in things like openrc.
>Yeah I've also been burned not having RemainAfterExit=y in a oneshot unit. What's worse, such a unit can get started multiple times.
You live and you learn, on its own something like that isn't a huge problem.
>Perhaps having a dependency to start your service before network-manager would have solved this.
It was one in a long-chain of similar issues, but yes. There are a lot of ways I probably could have made this work, but at the end of the day I'm still writing shell scripts and not using very many of systemd's "helpful" features.
I think you're trying to force the init and service manager systemd side to do network things. Well those are more complex than the purely service manager side to handle and that's ultimately why you struggle.
All the extra complexity is the reason that systemd ships with its own network daemon systemd-networkd to handle that complexity separately. It sounds like you need a .netdev file to create your virtual interface. Also .device files are also handled by networkd. I use them on our embedded devices pretty successfully. Take a look (especially wlan type): https://www.freedesktop.org/software/systemd/man/latest/syst...
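A minimal sketch of what such a `.netdev` file might look like, written to a scratch directory. I believe the `[WLAN]` section with `PhysicalDevice=` and `Type=` needs a fairly recent systemd (251 or later, if memory serves), so check `man systemd.netdev` on the target before relying on it. The file name and the `phy0` value are assumptions; `wlan0_ap` is the interface name from the thread.

```shell
conf_dir=$(mktemp -d)

# Ask systemd-networkd to create an access-point interface on top of
# the physical radio.
cat > "$conf_dir/25-wlan0-ap.netdev" <<'EOF'
[NetDev]
Name=wlan0_ap
Kind=wlan

[WLAN]
PhysicalDevice=phy0
Type=ap
EOF

cat "$conf_dir/25-wlan0-ap.netdev"

# The installed location would be /etc/systemd/network/, which is read
# by systemd-networkd.
```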