suprjami 2 years ago

Interesting idea, my work is all kernel so running containers has traditionally not been interesting to me.

Raw gdb against vmlinux is really doing it the hard way, though. How about crash with pykdump?

https://github.com/crash-utility/crash

https://pykdump.readthedocs.io/
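
Roughly what that looks like (paths, the extension filename, and the script name here are just illustrative):

    # open a crash dump (point it at a vmlinux with debug info)
    crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux /var/crash/vmcore

    crash> bt                    # backtrace of the crashed task
    crash> extend mpykdump.so    # load the pykdump extension
    crash> epython myscript.py   # run a Python analysis script through pykdump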

  • 0xricksanchez 2 years ago

    Hi author of like-dbg here :)! The attached debugger is not just raw GDB but is using https://hugsy.github.io/gef/ to make debugging less of a pain. It's still not perfect but helps plenty already. I was not aware of the crash utility you linked, I'll definitely take a closer look :)!

  • ungamed 2 years ago

    I don't think I could do it without crash; just too many things can go wrong.

  • acoard 2 years ago

    What’s your local dev environment like? VMs?

    • suprjami 2 years ago

      Yes, all KVM on Linux.

      For released code, work has an internal instance of the brew build system that the Fedora Project uses.

nyanpasu64 2 years ago

Can you run the kernel on real hardware and debug from another PC (automated, either container or not)? Most of the kernel nightmares I've seen and wanted to debug involve drivers (amdgpu sleep-wake GPU hangs, ALSA USB audio setup, touchpad resolution, display backlight PWM frequency), and containers are insulated from physical hardware by design.

  • mikepurvis 2 years ago

    You probably want kgdboe for this kind of thing? http://sysprogs.com/VisualKernel/kgdboe/

    On embedded targets you'd have a JTAG port for full out-of-band debugging, but that's obviously not an option for your typical desktop scenario.
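
    If it helps, the in-tree serial variant (kgdboc) gives a feel for the workflow; kgdboe basically swaps the serial transport for UDP over ethernet. A rough sketch of the serial flow (device names and baud rate are just an example):

        # on the target: enable kgdb over serial at boot with kernel parameters
        #   kgdboc=ttyS0,115200 kgdbwait
        # ...or at runtime, then break into the debugger:
        echo ttyS0,115200 > /sys/module/kgdboc/parameters/kgdboc
        echo g > /proc/sysrq-trigger

        # on the debugging machine, using the target's vmlinux with symbols:
        gdb ./vmlinux
        (gdb) target remote /dev/ttyUSB0   # whatever serial device is wired to the target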

    • nyanpasu64 2 years ago

      Do I build the kernel on the debugging or victim machine, and do I need to use the same distro on both? Does my kernel configuration have to match the victim machine's distro's default settings? How would I get kernel symbols for binary kernels, or transfer symbols from the victim to debugger machine?

    • phendrenad2 2 years ago

      I never understood why there wasn't a cheap Raspberry Pi-like x86 board with JTAG...

      • cpgxiii 2 years ago

        There's an interface for debugging many (not all) Intel machines via USB that's been around for years. Needs to be enabled in firmware first, given the obvious security implications.

        • bitwize 2 years ago

          Intel CPUs are remotely exploitable at higher-than-root privilege, by design. Closing the USB debugging root hole while that exists seems penny-wise and pound-foolish.

  • 0xricksanchez 2 years ago

    In a future (very, very, very future) PR I'd be interested in exploring this idea of building and pushing stuff to a dev board and debugging it remotely. However, right now I don't have any specific use case for this scenario.

0xjmp 2 years ago

I have no idea what I'm doing, so it's a night-and-day kernel programming experience to be able to commit container state, conduct experiments, and reset back to a clean state in one command.

Another key element that makes this so useful to me is that I work on a MacBook (gasp), and the hacky stuff I have to do in that context is the first suspect when, e.g., a header gives weird type errors.

Thank you so much for putting this together.

  • 0xricksanchez 2 years ago

    You're welcome! Note that this is still very much early alpha stage at best.

    PS: If you're using case-insensitive APFS, Linux kernel compilation will fail. I'm already aware of that.

sneak 2 years ago

These are privileged containers that can mount things and change kernel settings and do real virtualization.

What is the point, then, of doing all this in docker containers? If you're virtualizing with qemu anyway then why containerize?

I was expecting something like software (non-kvm) emulation in a userspace process in an unprivileged container, or perhaps UML, real brain-in-a-jar stuff.

  • tecleandor 2 years ago

    I haven't read much, but I think the idea is to do the whole build process in a container so you don't fill your system with headers, libraries, and dependencies.

    I don't know how well this deals with caches and the like, though.

    • ta988 2 years ago

      Yes, it's just used as a chroot that's easier to set up, manage, and dispose of, which I believe is also a perfectly good use of Docker. It's not all about isolation; sometimes you just want to be able to experiment and go back easily. Sure, there are filesystems that let you take snapshots, but not everybody uses those, for various reasons.

  • 0xricksanchez 2 years ago

    I wanted containers for the following things:

    - Minimal host system requirements, as kernel toolchains can be very much a pain to set up correctly. If I want to compile different kernel versions for different architectures, this seemed like a good trade-off.

    - I wanted to encapsulate every major logic component into a separate instance so that working on the individual components does not require building one insanely large image.

    Also note this is very much an early prototype and me just experimenting. Nothing is set in stone yet. Things are likely to change along the way.
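
    To give a rough idea of what the containers spare the host from (image name and flags here are just illustrative, not the actual invocation), a cross-build boils down to something like:

        # toolchain lives in the image; kernel sources are bind-mounted from the host
        docker run --rm -it -v "$PWD/linux:/src" -w /src my-kernel-toolchain:latest \
            make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j"$(nproc)"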

  • galangalalgol 2 years ago

    If it is using qemu, could you do this to debug an ARM image from x86_64?

    • 0xricksanchez 2 years ago

      Exactly the use-case I had in mind with this as well :).

  • raverbashing 2 years ago

    > If you're virtualizing with qemu anyway then why containerize?

    You're exactly right. Docker adds pretty much nothing here.

    For real, dockerizing stuff has become a meme lately.

    What it (actually) does is simplify dependencies. But you know, most of the time, if you can run Docker you can install stuff locally.

    Docker has a significant performance penalty unless you're running Linux already

    • sph 2 years ago

      A meme? You've said it simplifies dependencies and then go on to say you can install stuff locally, forgetting how much of a nightmare it is to maintain OS-level dependencies, especially when one app depends on a specific version and another app depends on a different one.

      Containers solve real problems and are here to stay. Feel free not to use them, but saying they're a meme is just trying to be edgy.

      Keep on containerising applications.

      Sincerely, a sysadmin that has been fighting conflicting dependency requirements, deprecated packages and major distro upgrades for more than a decade.

      • raverbashing 2 years ago

        Containers are not a meme

        What's a meme is creating a container for any minor thing like npm + some js project. Or to run one Go binary.

        > a sysadmin

        Sure, for the deployment of stuff containers are very good. But you know people were using stuff like chroots before that, right?

        I don't think I'm deploying this debugging system in a webserver.

        • piaste 2 years ago

          > What's a meme is creating a container for any minor thing like npm + some js project. Or to run one Go binary.

          You couldn't have picked two more different examples.

          Go apps are generally pretty clean statically-linked, native binaries and will be happy to just run from their directory without touching anything else. Containerization is usually unneeded, though if you already have a container ecosystem then they will fit in with no fuss, since the container images are basically glorified tarballs (COPY app/ . RUN foo).
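
          To sketch it out (versions and paths here are arbitrary), the final image really is just the binary dropped into an empty filesystem:

              # build stage: produce a static binary
              FROM golang:1.21 AS build
              WORKDIR /src
              COPY . .
              RUN CGO_ENABLED=0 go build -o /app .

              # final stage: nothing but the binary
              FROM scratch
              COPY --from=build /app /app
              ENTRYPOINT ["/app"]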

          Node.js apps, on the other hand, combine an interpreted language, a messy package management ecosystem, and a fragile, persnickety toolchain that will happily leave dirty stuff in your filesystem and/or your PATH, as if to maximize the risk of dependency-related screwups. Containerization is an absolute godsend for them.
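
          By contrast, the point of a sketch like this (versions arbitrary) is that everything npm touches stays inside the image instead of on your machine:

              FROM node:20-alpine
              WORKDIR /app
              COPY package*.json ./
              RUN npm ci            # node_modules lives only in the image
              COPY . .
              CMD ["node", "index.js"]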

          • raverbashing 2 years ago

            > Go apps are generally pretty clean statically-linked,

            I know, that's why I put it as an example (of something that doesn't need docker)

            > and a fragile and persnickety toolchain that will happily leave dirty stuff in your filesystem and/or your PATH

            Docker existing shouldn't be an excuse for npm behaving like this ;)

            But I agree, for deployment it is good. But you can still run npm projects without it

            • yjftsjthsd-h 2 years ago

              > Docker existing shouldn't be an excuse for npm behaving like this ;)

              I don't care about the direction of causality, I just want the stupid thing to run on my computer and not pollute my home directory, and Docker solves that and solves it well.

              Edit: Although actually about the only time I'm touching node is to ship it in a container image for deployment on a cluster somewhere, which is even more compelling.

              • sedachv 2 years ago

                There are much better tools for doing that, such as Guix profiles and nix-shell, which also happen to be better tools for making container images. Linux container images are a distribution mechanism that does not do anything to address package and dependency management other than shifting the problem somewhere else.
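
                For the node case above, that could be a shell.nix along these lines (attribute names vary between nixpkgs versions, so treat this as a sketch):

                    # shell.nix -- `nix-shell` drops you into an environment with these on PATH
                    { pkgs ? import <nixpkgs> {} }:
                    pkgs.mkShell {
                      packages = [ pkgs.nodejs_20 pkgs.yarn ];
                    }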

                • yjftsjthsd-h 2 years ago

                  Nix/Guix are also good solutions, but they have a much higher learning curve than a Dockerfile. I would even be willing to suggest that Docker is not the true best solution to any problem it solves, but in my experience it is the easiest solution to most of them.

        • calineczka 2 years ago

          > creating a container for any minor thing like npm + some js project

          I am a backend engineer with a non-JS background, and to me that's a very useful thing. I really don't want to install a specific node version, yarn version, etc. to run some Node/JS application. If I can just docker-compose up, that's much easier for me, and I know it doesn't affect anything globally, so removing it after playing around is easy and risk-free.
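
          That is, something like this checked into the repo (a sketch; image tag, ports, and commands are arbitrary):

              # docker-compose.yml
              services:
                app:
                  image: node:20-alpine
                  working_dir: /app
                  volumes:
                    - ./:/app
                  command: sh -c "yarn install && yarn dev"
                  ports:
                    - "3000:3000"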

          • rixthefox 2 years ago

            I think the point he was making was that too many people and projects are turning towards containers as a sort of "end-all," abandoning local system dependencies by shipping applications with their own bundled versions of libraries.

            My biggest problem with this shift is that people are running their distribution updates while containers are still shipping with vulnerable versions of libraries. So a server admin updates his servers thinking he's all good. No, no he's not, because $DEV hasn't updated their container dependencies, and now that server gets pwned, because here's yet another layer of dependencies I have to deal with on top of the ones I'm maintaining with the OS.

            So no, please don't slap containers on everything just because you can, because more often than not the container you never intended for production WILL be used in production, because someone was lazy and didn't want to deal with the dependencies.

            Additionally, to your point: specific node versions? Specific yarn versions? Is it really that unreasonable to demand that libraries be backwards compatible, with warnings when a method is being deprecated? I can understand it for major revision changes, but seriously: I should be able to run the latest version of node and have that code still work, with warnings telling me that the code I've written will eventually stop working in later versions due to changes in the underlying setup.

            My point being: don't lean on containers as a crutch for badly coded software. Operating systems have moved towards rolling releases; it's about time libraries and programming languages followed suit.

            • raverbashing 2 years ago

              Thanks, you got my point!

              And might I add, some people are shipping containers with a full unoptimized Ubuntu (or other large distros) for running projects that are kilobytes in size.

            • dotancohen 2 years ago

              In what scenario does compromising a container pwn the server?

              If the server is running just for that container's application, then the server admin knows it is there and needs updating.

              • rixthefox 2 years ago

                Container escapes are a very real security threat that needs to be taken seriously, just like you wouldn't let all your users have a remote shell on your machines even if they are locked down. Just because they can't get anywhere easily doesn't mean they can't find a way out. If you're on the machine and can run whatever code you want, it's not a matter of if, it's when.

                The admin may know what applications they are running in the container, but I can bet you they don't know every library that container is shipping, and I have very little faith that admins are going to maintain a laundry list of every single container they run, along with all the different versions of libraries each container brings, and constantly check that list for CVEs. This problem grows with every single container you bring onto that server.

                Edit:

                I love containers, don't get me wrong. I've seen first-hand how incredible it is to be able to set up an application inside a container and then remove that application and have no residual packages containing the libraries and dependencies that program needed. I get it. I just don't like how dependent we've become on them.

                Now I'm a network engineer first, and server admin second. I don't want to spend a majority of my time pinging container maintainers to update their dependencies when an updated version of a library comes out. I expect to be able to update my local copy of that library and get on with my day and not have to worry about when this library is going to get patched in each of those containers.

                • dotancohen 2 years ago

                  I agree with every word, well stated.

        • __MatrixMan__ 2 years ago

          Sure they are. Or rather, their pattern of use is. Wearing your seatbelt is also a meme. Just because something is a meme doesn't mean it's not useful.

    • cmm 2 years ago

      Dockerfiles (or, even better but less popular, {default,shell,flake}.nix) are code that can be tracked in a repository and reasoned about; "stuff installed locally" isn't.

      • encryptluks2 2 years ago

        I don't see how Nix is better than Docker, as they serve different purposes. While I agree that using containers helps automate build environments, you can also check shell scripts and Ansible roles into repos. I understand the criticism, and hopefully people aren't going to be so thin-skinned as to take offense, but at the same time, if it works then it works.

        • Strum355 2 years ago

          Nix and Docker share the same purpose in this context: declaring the dependencies needed for an environment. This can include shell scripts, system-level dependencies, etc. If anything I would say Nix is better here (a logical middle ground in filesystem separation between having access only to what's in the container and having to juggle system-level dependency versions for the entire system), while also having better version pinning.

        • pxc 2 years ago

          > I don't see how Nix is better than Docker as they serve different purposes

          Nix can build many things, among them whole operating systems on the metal, virtual machine images, and container images (including Docker images).

          When you use Nix to generate Docker images and deploy them in production using Docker, Nix is not functioning as an alternative to Docker, but your *.nix file is functioning as an alternative to your Dockerfile.

          Defining container images in terms of Nixpkgs rather than via a Dockerfile, and assembling them via Nix rather than `docker build`, does have some advantages.

          1. It moves you from mere repeatability in the direction of true reproducibility: it makes your outcomes more predictable and reliable.

          2. Knowing the actual dependencies on the system down to the package (or subpackage) level, Nix can generate minimal images where Docker can't (see the sketch below).
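
          A minimal sketch with nixpkgs' dockerTools (the package choice is arbitrary); the image contains just that package's closure rather than a whole base distro:

              # image.nix -- `nix-build image.nix`, then `docker load < result`
              { pkgs ? import <nixpkgs> {} }:
              pkgs.dockerTools.buildImage {
                name = "hello";
                tag = "latest";
                config.Cmd = [ "${pkgs.hello}/bin/hello" ];
              }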

          Plus there's the prospect of reuse, I guess; an environment you've defined in Nix for building Docker containers is easy to install and debug in other environments, without the extra complexity of Docker and filesystem exports/imports, port forwarding, etc. That can be nice, to be able to easily separate out what you're troubleshooting or learning as you're creating or modifying the image's environment.

          But yeah, that doesn't mean that pairing Dockerfiles with imperative or convergent configuration management or provisioning tools can't also sometimes work well enough in a given situation.

    • BossingAround 2 years ago

      > Docker has a significant performance penalty unless you're running Linux already

      Ah yes, the famed Windows/macOS kernel developer... :)

      (Not that it's not possible of course, I just found it chuckle-worthy within the context of this article)

    • EddySchauHai 2 years ago

      I like simplicity, and for that reason the first thing I do on any project larger than a bash script is set up a container for it. It helps isolate my OS from the project, allows me to easily share it via Docker Hub so others can run it, lets me have a shareable dev environment so people can just open it in VS Code and edit the project, etc. I've yet to regret using a container, and it takes barely more time than setting everything up directly on the OS anyway.

bitwize 2 years ago

Author's username is a highly plausible story beat. "Wubba-lubba-dub-dub! I created a Linux kernel debugging environment and got my name in the hexad*burp*ecimal numbering system, Morty. Now when all those less-intelligent-than me nerds are debugging their kernels, they'll see me in their memory dumps. I'm a part of math, Morty! I'm gonna live forever!"

witnesser 2 years ago

I have a long-standing question. All interrupts in a Docker container are supposed to be soft interrupts, meaning they are different from network I/O events backed by physical hardware. Correct me if I'm wrong. I remember a professor explaining soft vs. hard interrupts as: if you don't handle my event, I won't turn off that red LED. Given that, do we observe an exceptional number of missed signals in a container? If so, how are they handled?

My understanding is that interrupt handling is essentially signal processing: hard means a hardware chip guarantees reliability, soft means the signal is just a memory address that can easily be wiped out.

  • geofft 2 years ago

    Docker containers don't have their own kernels. They share the kernel with the host operating system, and the host kernel handles interrupts. The Docker container is just a group of processes started in a particular way so they have a different view of the filesystem, the network, and so forth. Like any other userspace process, they get UNIX signals, but they don't get hardware interrupts, which are handled by the kernel.
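
    You can see the shared kernel for yourself:

        uname -r                          # host kernel version
        docker run --rm alpine uname -r   # same version; the container shares that kernel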

    For this project, in order to have a kernel to debug, they run a separate kernel inside the qemu virtual machine emulator. qemu is a normal userspace process, and therefore that process can run inside a Docker container. So the kernel being debugged has interrupts, but the "hardware" generating those interrupts (and all the other "hardware," including the CPU itself) is all just software emulating a PC.