My favorite compiler bug story:
I started my career out in compiler test at Microsoft. I actually hadn't taken any compiler classes in college, so I figured that, being unfamiliar with the field, compilers would be a fun place to start off.
Microsoft had these amazing enterprise support contracts (maybe they still do, not sure!) where if you reported a bug it would keep on being escalated up until the dev team of the product was contacted to help you out.
One of the Japanese car manufacturers reached out, through a translator, with some C code that was crashing at runtime. Eventually it worked its way on up to us.
So, sidenote, my internship prior to this was at Boeing on the 787 program, and Boeing had training on cross-cultural communication that I signed up for.
So anyway, the first few levels of support aren't able to help, the code gets to us, and we have a corporate-provided translator assigned to help us work with the customer team in Japan. (Microsoft's enterprise support is really good!)
I am trying to remember code from 15 or 16 years ago, but IIRC the code was a bit convoluted, and the issue amounted to accessing an array out of bounds.
So if this had been an American engineer who contacted me, no problem: "Hi, through this series of 4 steps you are accessing an array out of bounds, thanks for contacting us, byes!"
Except, cultural awareness courses. There were a lot of people on the thread, and I knew it wouldn't be nice to reply back the next day with a highlight saying "You made an obvious mistake, please stop accessing arrays out of bounds".
So over the next three days, working through a translator, I communicated that maybe, just maybe, there could be a complicated-and-easily-not-seen issue that could be revealed if these particular lines were looked over.
Customer left happy, I got a congrats pat on the back.
Unfortunately compiler authors can be remarkably unconcerned about codegen bugs. I'm still waiting for a longjmp bug in clang from over a decade ago to be fixed: https://github.com/llvm/llvm-project/issues/21557
At least they didn't turn it into a language feature
https://stackoverflow.com/questions/18808226/why-is-typeof-n...
https://softwareengineering.stackexchange.com/a/336857
I was going to complain about a GCC codegen bug open since 2007 but it looks like it finally got fixed in 2020!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33799
Clang still has the same bug though, filed in 2012.
https://bugs.llvm.org/show_bug.cgi?id=12286
Oh. My. Goodness.
That is sickening. I use Clang. I use longjmp().
I'm glad I'm going to rewrite all of my stuff into my own language with a custom compiler.
There are lots of compiler bugs. The further you stray from the common use cases the more likely you are to hit them.
There are also an appreciable number of hardware bugs. Some of the complexity in compilers is working around those so that applications don't need to.
Your own compiler, assuming it emits machine code and avoids documented and undocumented hardware bugs, can be error free.
Brilliant :)
I also agree with JonChesterfield on this one: even home-brew compilers might have bugs! That's why I always write in HEX machine code!
This is from 2010, so things have improved somewhat since. Namely, his team has developed a lot of compiler fuzzing tools and gotten bugs fixed in mainline compilers. But for UB issues in your own code, `-fsanitize=undefined` is a better workaround than `-O0`.
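To make the `-fsanitize=undefined` point concrete, here is a minimal sketch. The function names are made up for illustration and are not from the article. `add_unchecked` has undefined behavior on signed overflow, which `-O0` may merely hide, while a build with `-fsanitize=undefined` (supported by both GCC and Clang) reports the overflow at runtime. `checked_add` is one way to write the same operation so that it stays well defined.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical example: signed overflow here is undefined behavior.
 * Built with -fsanitize=undefined, a call such as
 * add_unchecked(INT_MAX, 1) is reported at runtime instead of
 * silently producing whatever the optimizer decided on. */
int add_unchecked(int a, int b) {
    return a + b;
}

/* Well-defined alternative: test the bounds before adding.
 * Returns true and stores the sum in *out when it fits in an int,
 * returns false (leaving *out untouched) when it would overflow. */
bool checked_add(int a, int b, int *out) {
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

Something like `cc -fsanitize=undefined file.c` is enough to turn the sanitizer on. It only catches UB on paths that actually execute, so it complements fuzzing rather than replacing it.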
(2010)
> The base version of gcc was fine, but the Ubuntu people applied about 5 MB of patches
Yikes. I'm guessing there are other surprises to be found there.
Maybe they learned from RedHat (gcc 2.96)