"However, while real X86 processors have a maximum instruction length of 15 bytes, QEMU's instruction decoder for X86 does not place any limit on the instruction and length or the number of instruction prefixes."
Interesting, and not your usual type of exploit. Guessing this isn't one that will have the Rust crowd doling out "told ya so" :). Logic error only. No buffer overflow, not much strong types do for you, etc.
Well, look on the bright side: once we eliminate the boring old memory safety bugs, and the XSS, and the SQL injection, the exploits that remain will at least be interesting.
If we remove memory corruption errors, that it already one less class of errors to worry about.
As for the <whatever type safe systems programming language> crowd, these complaints have been done in the past by fairly unknown people like C. A. R. Hoare, Niklaus Wirth, James G. Mitchell, Alan Kay, Luca Cardelli,.... so what do they know about computers.
If you really wanted to fish for a "told ya so", someone could just point out that by eliminating all of those other classes of bugs, developers could have spent more time looking for logic errors.
I got that while trolling Rust developers a bit on this point, and had to concede-- it's a fair point.
I do think that rust doesn't (yet) have enough affordances for formal verification of algorithmic correctness-- but if you're not chasing memory safety you have more time to deal with other issues.
If a developer finishes his task he will be given a new task, I don't think you will get a task "go and review our code again for weird logic errors" you will be sent to add a new feature or fix a bug, so you will work on that and create a few new bugs.
Range checking is done deal with sufficiently strong types (read - dependent types). It was done in the Epigram LONG time ago, ten years ago if not more.
For example, when you fetching bytes for decode you can type-check that you are in the quota and act accordingly.
Working with two page quota (as QEMU dev intended) at type level won't allow you to pass through three page (as QEMU code allows and what is exploited).
As the bug seems to rely on a maximum instruction length that is present in hardware x86 but not in QEMU's x86, the reach of this particular bug seems to be just the software emulated mode.
It might be used for special applications, but not for your typical server that is connected to the internet, simply because it's horribly slow compared to the virtualization support (VT-x / AMD-V) most modern CPUs offer since at least 2010.
It's the bit where it says 'is it used in conjunction with a hypervisor?'. That's how we define the use cases that count as defendable against malicious guests.
This covers more than just the TCG cpu emulation because it also means that any device model that can only be used with an emulated CPU is also out of scope for CVEs and hasn't been audited to confirm it has no VM-escape bugs. So the internal documentation of TCG itself isn't really the right place to document this I think.
Hi, wanted to quickly ask a slightly offtopic question about TCG that I've wondered about for a few years. I was never sure who to ask.
Rob Landley made some small noises about the possibility of leveraging TCG as a successor to tcc (I read about the tinycc/tcc debacle (http://www.landley.net/code/tinycc/) - really sad). I was just curious if such an idea - turning QEMU's code generator into a standalone compiler - is technically feasible in terms of architectural sanity and practical maintainability.
Well, I guess technically you could do it -- the codegen started out as code from tcc, after all. However we've made enough changes over the years that right now it has some specializations to QEMU's needs. Also questions like "is this optimization pass worthwhile" have definitely different answers for QEMU's JIT purposes compared to an actual compiler -- we care a lot more about codegen speed so there are some optimizations that we prototyped but abandoned because they didn't improve the runtime performance enough to make up for codegen getting slower.
TCG used to have no optimizations at all, so the current ones shouldn't be worse than what tcc used to have... That said I have never looked at tcc's code generator.
I didn't actually know that TCG was built from tcc, interesting.
Leveraging TCG in its current JIT-optimized state could actually be very compelling though: the only "C interpreters" I'm aware of are PicoC and Ch (and pedantically, tcc -run), all very different codebases. None offer a JIT-optimizing C interpreter though.
Dang has addressed this matter several times in the past [1]. The dupe detector is 'deliberately porous' so good stories have multiple chances to get exposure.
"However, while real X86 processors have a maximum instruction length of 15 bytes, QEMU's instruction decoder for X86 does not place any limit on the instruction and length or the number of instruction prefixes."
Interesting, and not your usual type of exploit. Guessing this isn't one that will have the Rust crowd doling out a "told ya so" :). A logic error only: no buffer overflow, not much strong types can do for you, etc.
Well, look on the bright side: once we eliminate the boring old memory safety bugs, and the XSS, and the SQL injection, the exploits that remain will at least be interesting.
And they'll have time to deal with them.
If we remove memory corruption errors, that is already one less class of errors to worry about.
As for the <whatever type-safe systems programming language> crowd: these complaints have been made in the past by such fairly unknown people as C. A. R. Hoare, Niklaus Wirth, James G. Mitchell, Alan Kay, Luca Cardelli… so what would they know about computers?
If you really wanted to fish for a "told ya so", someone could just point out that by eliminating all of those other classes of bugs, developers could have spent more time looking for logic errors.
I got that while trolling Rust developers a bit on this point, and had to concede-- it's a fair point.
I do think that rust doesn't (yet) have enough affordances for formal verification of algorithmic correctness-- but if you're not chasing memory safety you have more time to deal with other issues.
Will you have time for that after fighting the borrow checker?
If a developer finishes his task, he will be given a new one. I don't think anyone gets a task like "go and review our code again for weird logic errors"; you will be sent to add a new feature or fix a bug, and while working on that you will create a few new bugs.
Range checking is a done deal with sufficiently strong types (read: dependent types). It was done in Epigram a LONG time ago, ten years ago if not more.
For example, when you are fetching bytes for decode, you can type-check that you are within the quota and act accordingly.
Still a developer choice as to what the quota is though, right? That's the logic error here, quota set too high.
No.
Working with a two-page quota (as the QEMU devs intended) at the type level won't allow you to read through three pages (as the QEMU code allows, and which is what is exploited).
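Rust doesn't have full dependent types in the Epigram sense, but a const-generic wrapper gives a flavour of the idea being argued here: the quota becomes part of the type, and the only fetch primitive checks against it, so "read a third page" is simply not expressible through the API. All names below are invented for illustration; this is not QEMU's code.

```rust
/// A decode window whose size (the "two page quota") is fixed in the type.
struct DecodeWindow<const QUOTA: usize> {
    buf: [u8; QUOTA], // the only bytes the decoder is allowed to touch
    pos: usize,
}

impl<const QUOTA: usize> DecodeWindow<QUOTA> {
    fn new(buf: [u8; QUOTA]) -> Self {
        Self { buf, pos: 0 }
    }

    /// The sole fetch primitive: returns None once the quota is spent,
    /// so a runaway decode loop cannot walk past the window.
    fn fetch(&mut self) -> Option<u8> {
        let b = *self.buf.get(self.pos)?;
        self.pos += 1;
        Some(b)
    }
}

fn main() {
    // A tiny "two page" window of 8 bytes for the demo.
    let mut w = DecodeWindow::new([0x90u8; 8]);
    let mut n = 0;
    while w.fetch().is_some() {
        n += 1;
    }
    // No matter how the loop is written, it can never consume more
    // than the quota baked into the type.
    assert_eq!(n, 8);
    println!("ok");
}
```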
> To be clear: As far as I know, this bug only affects the TCG mode (without hardware acceleration), not KVM VMs or so.
I wonder what's the reach of that bug.
As the bug seems to rely on a maximum instruction length that is present in hardware x86 but not in QEMU's x86, the reach of this particular bug seems to be just the software emulated mode.
Yes, I get that. I was wondering how widely used qemu is in x86 software mode.
It might be used for special applications, but not for your typical server that is connected to the internet, simply because it's horribly slow compared to the virtualization support (VT-x / AMD-V) most modern CPUs offer since at least 2010.
Right, if you're running on x86 hardware. The use case is probably non-x86 hardware running x86 VMs.
For example, I remember using qemu to emulate RPi with ARM software emulation on my x86 machine.
Zero. TCG is not considered secure/trusted by any means by the QEMU team, unlike KVM or Xen. It has never received a serious security audit.
That doesn't mean people don't use it.
Is what you're saying here documented anywhere?
It is documented here, but you're right that ideally we could mention it somewhere more prominent.
http://wiki.qemu-project.org/SecurityProcess#How_impact_and_...
Is this the correct link? I can find nothing about TCG nor about "tiny code generator" there.
It would be nice to warn about lack of security properties of TCG in some of these places: http://git.qemu-project.org/?p=qemu.git;a=blob_plain;f=tcg/R... http://wiki.qemu-project.org/Documentation/TCG
It's the bit where it says 'is it used in conjunction with a hypervisor?'. That's how we define the use cases that count as defendable against malicious guests.
This covers more than just the TCG cpu emulation because it also means that any device model that can only be used with an emulated CPU is also out of scope for CVEs and hasn't been audited to confirm it has no VM-escape bugs. So the internal documentation of TCG itself isn't really the right place to document this I think.
Hi, wanted to quickly ask a slightly offtopic question about TCG that I've wondered about for a few years. I was never sure who to ask.
Rob Landley made some small noises about the possibility of leveraging TCG as a successor to tcc (I read about the tinycc/tcc debacle (http://www.landley.net/code/tinycc/) - really sad). I was just curious if such an idea - turning QEMU's code generator into a standalone compiler - is technically feasible in terms of architectural sanity and practical maintainability.
Well, I guess technically you could do it -- the codegen started out as code from tcc, after all. However we've made enough changes over the years that right now it has some specializations to QEMU's needs. Also questions like "is this optimization pass worthwhile" have definitely different answers for QEMU's JIT purposes compared to an actual compiler -- we care a lot more about codegen speed so there are some optimizations that we prototyped but abandoned because they didn't improve the runtime performance enough to make up for codegen getting slower.
TCG used to have no optimizations at all, so the current ones shouldn't be worse than what tcc used to have... That said I have never looked at tcc's code generator.
I didn't actually know that TCG was built from tcc, interesting.
Leveraging TCG in its current JIT-optimized state could actually be very compelling though: the only "C interpreters" I'm aware of are PicoC and Ch (and, pedantically, tcc -run), all very different codebases. None of them offers JIT optimization, though.
I take it that all that would be necessary would be ripping out the disassembler frontend and wiring in something like http://www.quut.com/c/ANSI-C-grammar-y.html + http://www.quut.com/c/ANSI-C-grammar-l-2011.html (or an equivalent), or maybe even a (heavily) forward-ported copy of tcc's C parser?
Not sure why this wasn't duped to https://news.ycombinator.com/item?id=13921305
Dang has addressed this matter several times in the past [1]. The dupe detector is 'deliberately porous' so good stories have multiple chances to get exposure.
[1] https://hn.algolia.com/?query=dang%20porous&sort=byPopularit...
Ah, interesting, I didn't know this. Thanks.