Ask HN: How to get into OS/systems programming in 2023?
I graduated from college a decade or so ago and took a few OS & systems programming courses, but didn't find them interesting back then. Now, more than a decade later, I'm finding myself bored at my cookie-cutter C#/ASP.NET/SQL (+ cloud) job. What would be your suggestions for getting into OS/systems programming in 2023? For starters, I'm thinking about learning assembly, which I never learned.
* Start with C programming language. K&R book is still the best, even though outdated.
* Take a course like Nand2Tetris (freely available online)
* Read a good OS book, e.g. The Design of the UNIX Operating System by Maurice Bach or any Andrew Tanenbaum book on the subject.
* Search for "unix systems programming" or "linux systems programming" and you will find some fantastic free books online.
* Read the free Linux Device Driver book. It is outdated but will still provide enough information to help you self navigate Linux driver code.
* Visit https://wiki.osdev.org/Expanded_Main_Page
* And as they say, "Read the source, Luke"; read the Linux source code. First you will have to identify which subsystem you are interested in, and then start browsing that.
If it's outdated, is it still the best? Did you mean: even though it's old?
“The C Programming Language, Second Edition” is an excellent book for learning the basics of ANSI C. However, there have been a few revisions of the C standard since then. While none of the changes are extreme (it’s not like learning Java 1.2 or Python 2 in 2023, and there are many ANSI C code bases still being maintained today), it’s still good to know of the newer features of the language. Additionally, the book does not cover certain pitfalls that affect the security and reliability of programs written in C, such as security problems caused by faulty uses of gets() and strcpy().
It’s unfortunate Dennis Ritchie is no longer with us; it would be nice to have a third edition of K&R that covers modern C standards and has information about best practices for avoiding situations that can lead to insecure code.
I would highly recommend C Programming: A Modern Approach by K.N. King as an alternative and replacement for K&R.
http://knking.com/books/c2/index.html
There is one free book for modern C https://gustedt.gitlabpages.inria.fr/modern-c
Did not check whether it covers security issues.
In general, a combined OpenBSD-and-MISRA approach is a good way to code: code defensively and check all external untrusted input (the Windows NT kernel, for example, has checked and unchecked syscall variants).
The "Downloads" link is dead but the other 2 work.
Outdated because the versions have changed a lot since the last publication of the book.
It is still good because it gives information about general things which are applicable across any version.
BTW, I just checked the meaning of "outdated" on the Cambridge dictionary portal; outdated means "no longer useful or modern". When I used the word outdated I meant "no longer modern" :)
I'm following the OSTEP website and the book:
https://pages.cs.wisc.edu/~remzi/OSTEP/
In the first step they ask you to build a simple shell with redirection and parallel execution, which seems accessible to me (I don't have a CS background).
MIT also makes its OS course materials available to the public:
https://pdos.csail.mit.edu/6.S081/2020/
I would learn C first, plus enough x86/ARM assembly to bootstrap.
It's usually good to learn how to write ld linker scripts and how to make something bootable from grub.
Then I would look at Rust, because it makes it easier to create correct no_std kernels.
EDIT: Link spew turned into a gist https://gist.github.com/5a1b94e8fde45e37c55c5a13d97c9b3f
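For the linker-script step, a minimal GNU ld script for a kernel that GRUB loads at 1 MiB looks roughly like this (section layout and the `_start` entry symbol are the usual conventions, not requirements; the Multiboot header section name is an assumption of this sketch):

```
ENTRY(_start)
SECTIONS
{
    . = 1M;                              /* GRUB loads Multiboot kernels here */
    .text : { *(.multiboot) *(.text) }   /* Multiboot header must come early */
    .rodata : { *(.rodata) }
    .data : { *(.data) }
    .bss : { *(COMMON) *(.bss) }
}
```

You pass it with `ld -T kernel.ld ...`, and GRUB finds the Multiboot header by scanning the first few kilobytes of the resulting image, which is why the header section is placed first.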
As a hobby or as a career switch?
I've been programming embedded systems for the majority of my career and if you were to tell me that you just learned assembly, I'd wonder why you were wasting time like that. The last time I did more than glance at assembly language would have been around 1999.
If it's for a hobby, grab an arduino clone board and go for it! I'd suggest something ESP32-based though.
My experience has been the complete opposite. When I was in Uni I was required to take an assembly language course and was told it was foundational but I'd never use it in industry.
Straight out of Uni in 2001, I became an operating systems programmer bringing up a new platform and pretty much immediately started debugging C++ code almost exclusively in an asm debugger (it's what we had for an OS-level debugger). I also started implementing capabilities provided by a hypervisor that were only directly available via asm instructions (because there wasn't a C construct for doing so).
After that I became a toolchain programmer (working mostly on new system libraries) and would write in C and Assembly language and debug mostly in Assembly language because, when you're working on system libraries and the toolchain for a new platform, you can't always trust the compiler, and the compiler can't always generate optimal code (or any code) based on the new instruction set. Oh, and you can't always trust the debugger either if it doesn't understand the instruction set.
While there I would work on 9 new platform bringups and would often need to stub in assembly language routines for capabilities that were in the architecture but not yet available in the toolchain. Other times, new system call routines, locking intrinsics, unwinders, exception handlers, and some optimized routines would require implementation in asm as well.
If you're only working on the non-platform part of an operating system, you might not need to learn assembly language, but if you touch platform code you will likely need to know it.
Sounds like fun experience to me. I guess grass is greener everywhere else but what are places that hire for a large number of such positions (and don't require much industry experience)? But I'm in Canada so that could be an issue...
I was recruited directly out of a feeder school (University) for this company after focusing my undergrad course selection on operating system topics.
Compared to twenty years ago, there are significantly fewer operating systems which means fewer companies have a need for this type of work. If you're looking to get into this work I'd suggest looking at Apple, Samsung, Nvidia, Qualcomm, Google, Microsoft, Arm, and any Arm architecture licensee. It seems that the people with the most luck these days end up tinkering with Linux at university, do a Google Summer of Code project and score an internship with one of the above.
Thanks, looks like that's way off my trajectory. But at least I can keep this as a hobby~
What if it's for a career switch? I work as a data engineer but have no idea how to make the switch (it looks like a huge leap indeed). I did start working on some personal projects, but I don't think they're enough to prove my skills. I actually don't know how many projects would be enough.
I want to say apply to any job that seems interesting, since some of the best programmers I've worked with on embedded systems came from desktop & web backgrounds: they had a good grasp of modern development techniques as opposed to "hack & fix."
But those were systems where we abstracted away as much of the hardware as we could so everyone could focus on "business logic." The hardware interface is the part that's hardest to debug, so we try to put as little as possible into those layers.
Unfortunately, it sounds like that's exactly where you want to be working :-(
Thanks. I understand it's kinda niche field nowadays. At least I can still pursue it as a hobby.
I haven't been an active BSD operator for several years (the old hardware died), but the lack of corporate influence in DragonFly/Free/Net/OpenBSD might make them a space to check out. They've got non-trivial standards for what gets accepted into their source trees, but a significant number of contributors to those projects are professional or (at the very least) lifelong systems coders. OpenBSD would be the best choice if you're into what's current in the hardened-systems arena. NetBSD has historically been the OS that runs on nearly every general computing platform that can still boot (I thought I had read something about Net sunsetting support for the VAX architecture, but I just checked--still listed on their tier 2 list).
OP here. A while ago, I saw a book and set of videos (if my memory serves me right) going over the design and source code for one of the BSD flavors. I was thinking about finding that book.
Not sure about videos, but there's a series of books on FreeBSD called The Design and Implementation of the FreeBSD Operating System. The 1996 edition, under a mildly different name, is freely available [1], but I recommend picking up a newer version if it resonates with you: the 2014 version covers up to FreeBSD 11, while the 2004 version covers up to FreeBSD 5, and there were a lot of important changes in that ten-year period. It will be a little dated, but the broad strokes are still the same.
OSTEP, linked elsewhere in the thread is good, too.
[1] https://docs.freebsd.org/en/books/design-44bsd/
For videos, it might be this: https://www.mckusick.com/courses/advdescrip.html
Yep, this is what I saw years ago. Thank you for the link. Damn, those videos are pricey, and they cover outdated code as well.
> They've got non-trivial standards for what gets excepted into their source trees
The Wireguard drama in FreeBSD debunks this myth.
Eh. Okay. Like I said, I've been out of direct contact with the *BSD world for a while, so take it less as mythologizing than as being out of date :)
Nobody (except hardware guys and compiler writers) needs to learn asm, but you definitely have to acquire the skill of reading it. There are a lot of asms (one per computing architecture, plus several possible syntaxes). Of course you need to be an expert in C; in my understanding this is far more important than asm. One of the easiest ways to start systems programming is to build some electronic toy based on a microcontroller and write code for it to do something useful. Bonus points if you manage to pack some complicated behaviour in so tightly that your controller ends up with zero free memory.
Depending on what you mean by "hardware guys", this isn't completely accurate. Though you may not need to "learn" it in the sense that you do other programming languages, or the way you previously did.
It's conceivable that they might have to write ASM at some point. However it will probably be very minimal and geared to a specific low level (hardware) goal.
There's still some value to "learning" ASM, or at least having worked with it if you want to write low level code, though it might be minimal. I wouldn't understand C and low level software without having worked with ASM at some point, even though I don't use it.
I probably wouldn't recommend it as a good use of time, though.
It is good to know how control flow works under the hood.
Like C loops and ifs are made of jumps and compares in assembly.
If you do it correctly, it can help demystify function call/return and pointers... although, as I understand it, pointers are NOT just addresses in memory.
I think there is also the type of the data it points to, or at least the length of the data it points to? I was wondering how the OS understands that it only needs to grab 4 bytes for an int32_t but 8 for an int64_t; I guess such information is included.
Generally, the CPU has instructions for loading memory into registers and storing registers into memory and they specify how much memory to read/write.
eg:
ARM (32) has LDR (32-bit), LDRH (16-bit), and LDRB (8-bit)
68K has MOVE.L (32-bit), MOVE.W (16-bit), MOVE.B (8-bit)
There are also some architectures (early Alpha for example) where memory can only be read in aligned 32-bit chunks or aligned 64-bit chunks. If you want to read 8 bits... well the compiler will generate a 32-bit read and then shift/mask the data to get the 8-bits that you care about.
Oh, you are right, I completely forgot the translation part I've read about tons of times.
> jumps
<irony>goto is considered harmful</irony>
I will go against the trend and disagree. IMHO, for anyone writing software in C or C++, ASM is an extremely valuable skill. It's really hard to debug code if you don't understand what the compiler is outputting.
If you ever do software vulnerability research, or reverse engineering it is also very important.
I know it's a spicy take, probably due to my reversing background, but I almost look at C as more of a macro language for ASM. In fact, when I write in C, oftentimes I think about what the compiler will produce. It also makes sure that when I write things, I'm not doing them in an architecture-specific way.
One thing I found pretty useful is to port very early Linux command line programs to modern platforms.
I just "ported" (nerfed the number of switches) the 1991 version of `ls.c` to modern Ubuntu and learned a lot about `stat()`, `lstat()`, `struct stat`, file types in Linux, and some programming tricks. It's only 1.2K LoC, so it's easy to sort out the logic and write your own from scratch if needed.
I'd dig into genode[1], which is a capability based operating system. You'll likely see an upsurge in interest in capability based systems in the next decade.
[1] https://genode.org/
You can try working on the Linux kernel, perhaps with the kernel janitors project (https://code.google.com/archive/p/kernel-janitors/ ). I recommend hanging out on the mailing list a little while to get some idea of what to do, as well as reading the wikis on that page.
Also worth mentioning SerenityOS https://github.com/SerenityOS/serenity. It’s a very friendly environment and easy to get involved.
Assembly isn't very fun, but maybe reading an old book on assembly for an old 8- or 16-bit cpu will satisfy your curiosity.
[dead]