kjksf 11 days ago

I wrote about how I keep build times sane in SumatraPDF at https://blog.kowalczyk.info/article/96a4706ec8e44bc4b0bafda2...

The idea is the same: reduce the duplicate parsing of .h files.

I don't use any tools, just a hard-core discipline of only #include'ing .h in .cpp files.

The problem is that if you start #include'ing .h in .h, you quickly start introducing duplication that is intractable, for a human, to avoid.

On another note: C++ compiler should by default keep statistics about the chain of #include's / parsing during compilation and dump it to a file at the end and also summarize how badly you're re-parsing the same .h files during build.

That info would help people remove redundant #include's.

But of course even if they do have such options, you have to turn on some flags and they'll spam your build output instead of writing to a file.
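
For reference, the closest existing knob I know of is `-H` in GCC and Clang, which prints the #include hierarchy as files are opened; it goes to stderr, so you can at least redirect it to a file yourself. A rough sketch:

    g++ -H -c foo.cpp 2> foo.includes.txt
    # -H prints every header as it is opened, prefixed by dots showing include depth,
    # so grepping/sorting that file shows which headers get dragged in the most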

  • Maxatar 11 days ago

    >On another note: C++ compiler should by default keep statistics about the chain of #include's / parsing during compilation and dump it to a file at the end and also summarize how badly you're re-parsing the same .h files during build.

    Clang does offer something very close to this, and if you use it you'll find parsing the same/duplicate header files contributes on the order of microseconds to the overall compile time. Your article is from 1989, which is likely before compilers implemented duplicate header file elimination [1], but nowadays all C++ compilers optimize the common #ifndef/#define header guard as well as #pragma once so that they entirely ignore parsing the same file over and over again.

    [1] https://gcc.gnu.org/onlinedocs/cpp/Once-Only-Headers.html

    • WalterBright 11 days ago

      True, but when you're compiling a.c and then b.c, the .h files get reparsed all over again.

      • highfrequency 11 days ago

        Correct me if I'm wrong, but I believe the parent comment's strategy of only #including header files within .c files just helps reduce duplicate header file parsing within each compilation unit. So it wouldn't do anything to improve the case you mention (duplicate header compilation across compilation units) anyway, while adding much additional overhead in manually tracking header file dependencies.

        Also, given your experience in compilers - keen to see if you agree that because modern compilers optimize away re-scanning of the same header file within a compilation unit anyway (in the presence of include guards), the strategy of only #including header files within .c files is close to useless.

        • WalterBright 11 days ago

          The not rescanning if the #include guards are there goes back to the mid 1980s. It's not a modern feature :-)

          > the strategy of only #including header files within .c files is close to useless

          It probably is. It also means the user of the .h file has to manage the .h file's dependencies, which is not the best practice. .h files should be self-contained.

      • kazinator 11 days ago

        This is not only not true, but not possible. Many kinds of definitions in .h files may not be repeated without error, like:

          struct foo { int bar; };
        
        or

          typedef int xyzzy_t;
        
        
        This is why you have include guards:

          #ifndef FOO_H_3DF0_755A
          #define FOO_H_3DF0_755A
        
          struct foo { int bar; };
        
          #endif
        
        GCC optimized the handling of headers with include guards 30 years ago already.

        If you think most of your compile time is spent in preprocessing, benchmark a clean build with the optimization set to -O0 versus your full optimization like -O2 or whatever you are using.

        Both builds perform the same preprocessing; thus the preprocessing time is bounded by the total time spent in the -O0 build: even if the actual semantic analysis and code generation at -O0 took next to no time at all, we could attribute at most the entire -O0 time to tokenizing and preprocessing. And even then, all the additional time observed under -O2 is not tokenizing and preprocessing.

        • WalterBright 11 days ago

          Um, I think there is a misunderstanding.

              gcc -c a.c
              gcc -c b.c
          
          requires reparsing of the .h files used by both.
  • WalterBright 11 days ago

    Having written, benchmarked, and maintained C and C++ compilers for decades, I know why the compiles are slow:

    1. phases of translation

    2. constant rescanning and reparsing of .h files

    3. cannot parse without doing semantic analysis

    4. the preprocessor has its own tokens - so you gotta tokenize the .h file, do the preprocessing, convert it back to text, then tokenize the text again with the C/C++ compiler. This is madness. (Although with the C compiler I did manage to merge the preprocessor lexer with the compiler lexer, this made it the speed champ.)

    This experience fed into D which:

    1. uses modules instead of .h files. No matter how many times a module is imported, it is lexed/parsed/semanticed exactly once.

    2. module semantics are independent of who/what imports them

    3. no phases of translation

    4. lexing and parsing is independent of semantic analysis

    • kazinator 11 days ago

      Having done some casual benchmarking recently, I found that GCC is about 15 times slower when optimizing than when not. In both situations, the compiler is scanning the same header files, so that activity is bound up within the 1/15th of the optimized compilation time.

      It used to be common wisdom that the character-level processing of code took the most time, just like the old "floating-point is slow; always use integer when possible" advice.

      Also note that the ccache tool greatly speeds up C and C++ builds. Yet, the input to ccache is the preprocessed translation unit! When you're using ccache, none of the preprocessing is skipped. ccache hashes the preprocessed translation unit (plus compiler command line options, the path of the compiler executable and such) in an intelligent way and then checks its cache. If there is a hit, it pulls the .o out of its cache, otherwise it invokes the compiler on the preprocessed translation unit.

      If most of the time were spent in preprocessing, a much more modest speedup would be observed with ccache.

      • WalterBright 11 days ago

        Generally when benchmarking compile speeds, the unoptimized build is used, as that is the edit-compile-debug loop. It's always been true that a good optimizer will dominate the build times.

        Back in the Bronze Age (1990s) I endeavored to speed up compilation in a manner that you describe ccache as doing. After the .h files were taken care of, the compiler would roll out to disk the state of the compiler. (It could also do this with individual .h files.) Then, instead of doing all the .h files again, it would just memory map in the precompiled .h file.

        And yes, it resulted in a dramatic improvement in compile times, as you describe.

        The downside was one had to be extremely careful about compiling the .h files the same way each time. One difference could affect the path through the .h files, and invalidate the precompiled version.

        It was quite a lot of careful work to make that work, and I expect ccache is also a complex piece of work.

        What I learned from that is it's easier to just fix the language so none of that is necessary. C/C++ can be so fixed, the proof is ImportC, a C compiler that can use imports instead of .h files, and can compile multiple .c files in one invocation and merge them into a single .o file.

        • SleepyMyroslav 10 days ago

          There is a reason why

          > unoptimized build is used, as that is the edit-compile-debug loop

          is no longer true.

          Modern C++ has a lot of metaprogramming abstractions in it, and they are only zero-cost in optimized builds.

          In my years of gamedev work I have not met a sizeable project that was working in unoptimized builds even for debug purposes. Unoptimized only worked in unit tests or small tools.

          • 59nadir 10 days ago

            I think at that point the real solution is to seriously consider all of the language constructs you use and their compile-time cost as well. It's not a given that using more of C++ is always better; real, sustainable improvement in compile times can be had by moving closer to C in many ways while keeping some of the safety C++ provides.

            (I'm sure you've been there, though; gamedev is one of the areas I would expect people to be more sensible about their C++ feature usage in.)

            • SleepyMyroslav 10 days ago

              If you are implying that we can go back to force inlining everything and only using small wrappers around memcpy then I will have to say that that ship has sailed years ago. I do not know anyone who wants to go back for more than brief moments while changed header causes cascade of rebuilds.

              Now the elephant in the room of build times that no one wants to talk about is 'the optimized' build with PGO+LTO. I think none of the projects I worked on that relied on it ever had a local pipeline to do it xD. But if you ask people if they want to ship a build without it the answer is a clear 'no'.

              I will totally understand if the authors of the linked article also don't like to talk about it. What I am trying to do here is to clear up confusion about the importance of it. Pretending that IWYU is more than polishing the last 5% of build times helps almost no one. YMMV ofc.

              • 59nadir 10 days ago

                There are plenty of constructs in C++ that are safer than C and still don't impact compile times that much and some that, while safer and better in some regards, are murder for compile times. I'm saying there is a tradeoff to be made and faster iteration speeds are oftentimes more valuable for end result quality than (often perceived) safety.

      • taylorius 10 days ago

        "Just like the old floating-point is slow; always use integer when possible."

        I know this was only an aside - but it took me the longest time to properly internalize that floats were fast these days. I'm still getting used to the idea that double precision isn't a preposterous extravagance. :D

        • Arech 10 days ago

          It depends on what you're doing. Doubles are still slower than floats because they put twice the pressure on memory bandwidth and cache size. So if you're doing "calculator"-style work, there's not much difference, but if you're processing large arrays of data, it's still something you should think about.

          • taylorius 10 days ago

            Yeah, exactly that. I was getting artifacts processing long vectors, and it was 32 bit float precision that was the culprit.

        • spacechild1 10 days ago

          On certain platforms this rule still holds. Recently I have been working with the ESP32. Although some versions have an FPU, floating point math is still slower than integer math. Also, the FPU can only process 32-bit floats, 64-bit doubles are emulated in software and terribly slow.

    • pajko 10 days ago

      Precompiled headers eliminate some of these issues, if used right. At one of my former companies every available build optimization had been bolted onto the code base over the years by the DevOps guys without any thought about how the pieces would work together, and the build just got slower and slower. In the end, the initial 5-minute complete build time had increased to about 35 minutes, of which about 10 minutes could be attributed to various refactors and an extremely high amount of templating; we couldn't find a cause for the extra 20-minute increase. It took about 15 minutes to test just a single change.

    • uecker 10 days ago

      This is the same for C and C++, yet C compile times are dramatically shorter than C++'s.

      Also doing semantic analysis during parsing should save time compared to having additional tree walking later (and does in my experience).

    • zerr 10 days ago

      Isn't `#pragma once` helpful for avoiding reparsing headers?

      • knome 10 days ago

        `#pragma once` is to prevent a header from being reparsed repeatedly for the same translation unit when ten different headers all include a common one transitively.

        It replaces the prior pattern of wrapping the header's contents in a check for a (hopefully) unique macro that is defined only inside that same block, to avoid double parsing:

            /* if it isn't unique, you're going to have a bad time */
            #ifndef SOME_HOPEFULLY_UNIQUE_DEFINITION
            #define SOME_HOPEFULLY_UNIQUE_DEFINITION
            
            ...code...
            
            #endif /* SOME_HOPEFULLY_UNIQUE_DEFINITION */
        
        This, however, doesn't stop those same headers from needing to be reread and reparsed and reread and reparsed for every single cpp file in the project.
  • kazinator 11 days ago

    In modern compilers, the scanning of tokens is a tiny fraction of the compile time.

    If the header files are properly protected with inclusion guards, then at worst the contents are tokenized by the preprocessor, and not seen by anything else. But for decades now, compilers have been smart enough to recognize include guards and not actually open the file. So that is to say, when the compiler has seen a file of this form once:

      #ifndef FOO
      #define FOO
      ...
      #endif
    
    and is asked to #include that same file again, it knows that this file is guarded by FOO. If FOO exists, it won't even open that file.

    There are reasons to avoid including .h files in .h files, but you're not going to get compile time gains out of it, other than perhaps through secondary effects.

    By secondary effects I mean this: when in a project you forbid .h files including other .h files, it hurts to add new dependencies into header files. When you change a header such that it needs some definition in another, you have to go into every .cpp file and add the #include. So, you find ways to reduce the dependencies to avoid doing that.

    When .h files are allowed to include others, the dependencies grow rampant. You want to compile a little banana, but it depends on the gorilla, which needs the whole jungle, just to be defined. Juggling the jungle definition takes a bit of time. C++ definitions are complicated, requiring lots of processing to develop. Not as much as optimizing the banana that is the actual subject being compiled to code, but not as little as skipping header files.

  • jay-barronville 11 days ago

    As much as I hate long compilation times, I also value code discoverability, readability, and consistency. Your (i.e., Rob Pike’s) strategy seems like a nightmare to me. I love Rob, but I can’t follow this rule.

  • highfrequency 11 days ago

    > We try to mitigate it with #ifdef guards, #pragma once etc. but in my experience those band-aids don’t solve the problem.

    Why don't include guards solve the problem of duplicate parsing of .h files within one compilation unit? I believe modern compilers can entirely optimize away even the opening and scanning of the file. And even without that, modern NVMe disks are so fast I would imagine the file opens would be negligible.

    Curious to hear if anyone has data on whether duplicate header file parsing is still an actual performance issue even with modern compilers, modern SSDs, and #pragma once.

  • greenavocado 11 days ago

    Only a handful of commenters have mentioned "#pragma once", which is alarmingly few people considering how many C++ practitioners are out there. "#pragma once" is the obvious way to deal with this problem.

    • flohofwoe 10 days ago

      That's a somewhat unrelated problem; it just protects against parsing the same header multiple times within one translation unit. The rule that headers should not include other headers is more of an organizational thing.

      You can immediately see the include complexity of each source file by looking at the top of the file: all required includes are there as a flat list. Otherwise a single #include may pull in dozens, hundreds or even thousands of other includes hidden in a deep dependency tree.

      It also nudges the programmer away from "every class in a tiny header/source pair" nonsense.

    • Arech 10 days ago

      `#pragma once` isn't standardized. Even though all major modern compilers do support it (hopefully, but not guaranteed in the same way), if you care about portability to some not-even-very-esoteric platform, it most likely uses an ancient compiler (or one based on an ancient version of a major compiler) that might not support it. But besides that, I too don't see a reason not to use the nicer `#pragma once` instead of include guards...

      • maccard 10 days ago

        Pragma once is a de facto standard and is supported by every compiler and platform you’re ever going to work with. If you need to support the targets it doesn’t work with, then you’ll get a loud compile error.

        The old include guards are error prone, requiring a unique key per header. Getting that wrong can cause anything from a compile error to an obtuse linker error to a runtime bug, depending on what’s in the header.

        The chances of someone copy pasting a header and breaking the include guard are significantly higher than the chances of someone deciding that we need to support an esoteric embedded target overnight.

        • rcxdude 10 days ago

          This. #pragma once is more consistently supported than many parts of the standard. Just use it. (And I've worked with some of those esoteric embedded targets. Haven't run into a compiler that didn't support it yet)

      • flohofwoe 10 days ago

        FWIW I've never seen a compiler that doesn't support '#pragma once' (at least since the late 90s).

        I did come across some esoteric path-related problems with pragma once though which wouldn't occur with include guards. IIRC it was related to filesystem links.

        • jstimpfle 9 days ago

          I've come across obscure bugs with header guards in open source software that went unnoticed for years -- essentially, there were competing definitions of a global config struct in the code base. Some parts of the code would access Version A, while others would access the same data but thinking it was version B.

          Most .c files had ended up including both versions, but the headers had the same guards so only the first included definition would be in effect.

          The project maintainer had been fixing obscure memory bugs for years until someone noticed the real issue.

          Wouldn't have happened with #pragma once I suppose. With pragma once you don't have to painfully choose an identifier for the header guard, and you can't get that wrong.

          • uecker 9 days ago

            Pragma once itself gets it wrong, if there are multiple paths to the same file which apparently happens quite a lot.

            • jstimpfle 8 days ago

              Never seen it happen. Seems like a myth: who uses hardlinks/symlinks in source projects? And isn't pragma once implemented based on inodes?

              Anyway, false negative identity tests are not a problem the way false positive identity tests are.

      • greenavocado 10 days ago

        Microsoft Visual C++ (MSVC): Supported since Visual C++ 6.0 (1998).

        GNU Compiler Collection (GCC): Supported since GCC 3.4 (2004).

        Clang (based on LLVM): Supported since Clang 2.9 (2009).

        Intel C++ Compiler: Supported since version 8.0 (2003).

  • johannes1234321 11 days ago

    > On another note: C++ compiler should by default keep statistics about the chain of #include's / parsing during compilation and dump it to a file at the end and also summarize how badly you're re-parsing the same .h files during build.

    Not exactly that, but do you know clang's -ftime-trace and tools like https://github.com/aras-p/ClangBuildAnalyzer which help analyzing where time is actually spent? (In small repeated headers I don't see much of a problem, but they of course may contain not so small things ...)
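
    For example (a rough sketch; exact file names depend on your build setup):

        clang++ -ftime-trace -c widget.cpp -o widget.o
        # also writes widget.json (a Chrome trace event file) next to the object;
        # ClangBuildAnalyzer aggregates those per-file traces into a build-wide report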

  • electroly 11 days ago

    If you use a "unity" build (all .cpp files concatenated into one compilation unit) with the normal '#pragma once' or guard in the headers, then you can be certain that every header file is parsed only once, with no discipline needed. CMake can do this for you with an option. You lose the ability to do incremental builds of individual changed files, but it may be fast enough that you don't care.

    For my own personal projects, I just use ccache and precompiled headers. It's good enough for me. I don't want to have to apply "hard-core discipline" to my projects.

  • eru 11 days ago

    Oh, the joys of using a language that thinks (automated) copy-and-paste coding is the same as a module system.

  • timvdalen 10 days ago

    Thanks for building SumatraPDF! I just switched away from Windows, but I've been a happy user for a very long time, and it's hard to get that mix of user experience just right. Really appreciate your work.

  • intelVISA 10 days ago

    The entire article's premise is misguided as slow C++ build times are usually a result of inexperienced SWEs unfamiliar with the language's "fun" build rules.

    • maccard 10 days ago

      This is the equivalent of saying C is safe to use, as long as you use it correctly, or your iPhone is fine you’re just holding it wrong.

      If build times are a wide spread problem with a language, that’s a language problem not a user problem.

      • exe34 10 days ago

        Knives are really difficult to use, everybody keeps holding the nice shiny metal part, but somehow you're meant to just know to hold the wooden part.

    • Arech 10 days ago

      In enterprise you are almost always under pressure to deliver something ASAP, so some bad practices are just a natural consequence of that. They would have liked to do it differently, but then they would only be halfway to delivering what they have already shipped. There are always tradeoffs.

  • nextaccountic 11 days ago

    Shouldn't precompiled headers mitigate that?

  • sfpotter 11 days ago

    How much of a speedup did you get switching over to Rob Pike style includes?

    • cpeterso 11 days ago

      I had to look this up:

      Rob Pike’s rule is that header files should not include other header files; they should only document which other header files they depend on so programmers can directly include those other header files in their .c files.

      https://bytes.com/topic/c/answers/217074-rob-pikes-simple-in...

      • fooker 10 days ago

        This seems like the kind of thing that a compiler should be good at instead of making the user jump through hoops.

        • techbrovanguard 10 days ago

          I'm glad Rob Pike didn't contribute to Go, that'd have been catastrophic to my productivity! Imagine having tools do work instead of users, it'd be like a world without lawyers. Anyhow, I've got to refactor my WIP code for the 10th time today since unused variables are compiler errors. Seeya!

      • iainmerrick 10 days ago

        Wow, I hadn't heard of that one. It sounds very unpleasant.

        I have the exact opposite rule -- it should always be OK to include a header file on its own. As a corollary, headers must include or forward-declare everything they need.

      • ErikCorry 10 days ago

        If the .h file is documenting which other header files it needs, what keeps that documentation up to date? This sounds like a very painful way to work, where .h files fail to compile (with very strange error messages because of the way C++ works) and then you have to hunt down the missing dependency.

        • flohofwoe 10 days ago

          In my header libraries I have checks like this which produce a meaningful error message:

              #if !defined(SOKOL_GFX_INCLUDED)
              #error "Please include sokol_gfx.h before sokol_shape.h"
              #endif
snypehype46 11 days ago

Coincidentally, in the project I'm currently working on I managed to reduce our compile times significantly (~35% faster) using ClangBuildAnalyzer [1]. The main two things that helped were precompiled headers and explicit template instantiations.

Unfortunately, the project still remains heavy to compile because of our use of Eigen throughout the entire codebase. The analysis with Clang's "-ftime-trace" shows that 75-80% of the compilation time is spent in the optimisation stage, but I'm not really sure what to do about that.

[1] https://github.com/aras-p/ClangBuildAnalyzer

  • celrod 11 days ago

    What's the trick with explicit template instantiations? Including them in the precompiled header?

    • logicchains 11 days ago

      You leave them uninstantiated in the header file and explicitly instantiate them in the .cc file (only works if there's a small number of possible instantiations).
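
      A rough sketch of the pattern (names made up; the `extern template` lines are what stop every includer from instantiating its own copy):

          // matrix.h
          template <typename T>
          struct Matrix {
              T data[16];
              T trace() const;   // declared here, defined only in matrix.cc
          };

          // Promise that these instantiations exist in exactly one .cc file,
          // so other translation units don't re-instantiate them.
          extern template struct Matrix<float>;
          extern template struct Matrix<double>;

          // matrix.cc
          #include "matrix.h"

          template <typename T>
          T Matrix<T>::trace() const {
              T sum = T{};
              for (int i = 0; i < 4; ++i) sum += data[i * 4 + i];
              return sum;
          }

          template struct Matrix<float>;    // the only instantiations in the program
          template struct Matrix<double>;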

petermcneeley 11 days ago

The video game industry uses bulk builds (master files), which group all the .cc files into a few very large single .cc files. The speedups here are at least 5-10x. These bulk files are sent to other developers' machines, possibly with caching. The result is 12-minute builds instead of 6 hours.

  • shanemhansen 11 days ago

    Chrome supported this for a long time and it really helped small developers outside of Google be able to build chrome without specialist machines.

    But that feature was pulled by the Chrome team, with the stated justification that since C++ guarantees different things (IIRC around variable scopes outside of a namespace) for one file vs. multiple files, supporting the jumbo build option meant writing some language that was "not C++".

    Unfortunate.

    • Shorel 7 days ago

      When "best" is the enemy of "good enough"

  • anybodyz 10 days ago

    Many of them use this tool: https://www.fastbuild.org/docs/home.html

    It was built and is regularly maintained by a Riot Games principal C++ architect, and it automatically compiles files in large "unity" chunks, distributes builds across all machines in an organization, and creates convenient Visual Studio .sln files and Xcode projects. It's also all command-line driven and open source.

    This is industrial strength C++ builds for very large rapidly changing code bases. It works.

  • celrod 11 days ago

    If using cmake, you can try a unity build. https://cmake.org/cmake/help/latest/prop_tgt/UNITY_BUILD.htm...

    You can also specify a `-DUNITY_BUILD_BATCH_SIZE` to control how many get grouped, so you can still get some parallelism. However, I think it'd be more natural to be able to specify number of batches (e.g. `nproc`) than their size.

    Code bases may need some updating to work.

  • WalterBright 11 days ago

    How it works with D is you can do separate compilation:

        dmd -c a.d
        dmd -c b.d
        dmd a.o b.o
    
    or do it all in one go:

        dmd a.d b.d
    
    Over time, the latter became the preferred method. With it, the compiler generates one large .o file for a.d and b.d, more or less creating a "pre-linked" object file. This also means lots of inlining opportunities are present without needing linker support for intermodule inlining.
  • Twirrim 11 days ago

    sqlite uses something like this approach too, and there are additional optimisation advantages from keeping everything in a single file:

    https://sqlite.org/amalgamation.html

        Over 100 separate source files are concatenated into a single large file of C-code named "sqlite3.c" and referred to as "the amalgamation". The amalgamation contains everything an application needs to embed SQLite.
        
        Combining all the code for SQLite into one big file makes SQLite easier to deploy — there is just one file to keep track of. And because all code is in a single translation unit, compilers can do better inter-procedure and inlining optimization resulting in machine code that is between 5% and 10% faster.
  • josephg 11 days ago

    It makes sense. As projects grow, the average header file is included O(n) times from O(n) different .cc files - leading to O(n^2) parsed header files during compilation. And thus, O(n^2) work for the compiler.

    Merging everything into one big .cc file reduces the compilation job back to an O(n) task, since each header only needs to be parsed once.

    It's stupid that any of this is necessary, but I suppose it's easier to hack around the problem than to fix it in the language.

    • WalterBright 11 days ago

      Those problems are fixable with C/C++, but nobody seems to want to do it. They are fixed with dlang's ImportC. You can do things like:

          dmd a.c b.c
      
      and it will compile and link the C files together. ImportC also supports modules (to solve the .h problems).

      It's all quite doable.

      • Gibbon1 10 days ago

        I wonder if you could create a pragma in C to do almost the same.

        I don't have a good name for it but it would force the compiler to ignore previous definitions. With an 'undo' pragma as well.

    • gpderetta 10 days ago

      Which compiler parses the same header file multiple times in the same translation unit? Compilers have been optimizing around pragma once and header guards for multiple decades.

      edit: ok, you meant that each header is included once in each translation unit.

      • josephg 10 days ago

        Yep. Worst case, every header is included in every translation unit. Assuming you have a similar proportion of code in your headers and source files, compilation time will land somewhere between O(n) and O(n^2) where n = the number of files. IME in large projects it's usually closer to n^2 than n.

        (Technically big-O notation specifically refers to worst case performance - but that's not how most people use the notation.)

  • forrestthewoods 11 days ago

    I'm starting to believe that one static/shared library should be produced by compiling exactly one cpp file. Go ahead and logically break your code into as many cpp files as you want. But there should then be a single cpp file that includes all other cpp files.

    The whole C++ build model is terrible and broken. Everyone knows n^2 algorithms are bad and yet here we are.

    • sfpotter 11 days ago

      Everyone: "O(n^2) algorithms are bad."

      Also everyone: "Just do the stupidest thing in the shortest amount of time possible. We'll fix it later."

  • Shorel 7 days ago

    Nice, that's like doing the link phase before the compilation.

andersa 10 days ago

It's frustrating to see the C++ committee spend year after year on pointless new over-engineered libraries instead of finally fixing the compile times. At a high level, with only one change to the language, we could entirely eliminate this problem!

Consider the following theoretically simple change:

A definition in a file may not affect headers included after it. If you want global configuration, define them at the project level, or in a header included by all files that need it.

i.e. we need to break this construct:

    #define MY_CONFIG 1
    #include "header_using_MY_CONFIG.h"

That's really all we need to do to completely eliminate the nonsense of constant re-parsing of headers and turn the build process into a task graph where each file is processed exactly once and each template is instantiated exactly once, with intermediate outputs that can be fully cached.
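
For illustration, the "project level" alternative could be as simple as a single shared config header (or a -D flag passed by the build system), so the included header means exactly the same thing in every translation unit and its output can be cached (hypothetical names):

    // project_config.h - the one place configuration macros are defined
    #ifndef PROJECT_CONFIG_H
    #define PROJECT_CONFIG_H
    #define MY_CONFIG 1
    #endif

    // header_using_MY_CONFIG.h then includes "project_config.h" itself,
    // instead of relying on whatever the current .cpp happened to #define first.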

Most real-world large projects already practice IWYU meaning they are already fully compatible with this.

There are some videos by Jonathan Blow explaining that this is exactly why the Jai compiler is so fast. Why must we still suffer these outdated design decisions from 50 years ago in C++? Why can't the tech evolve?

/end rant

  • James_K 10 days ago

    I used to think the same thing myself, but now I'm not sure that would solve much. The problem you describe is really just a matter of caching. I believe you should be able to process the tokens of a header to determine all the identifiers in it, at which point you can model the header compilation as a memoised function from the definition of those identifiers to the compiled file. So for your example, every include of the header where MY_CONFIG=1 could re-use the same results.

    The real issue is just that C++ compilers are horrendously slow. They have been designed with the intention of producing fast executables rather than compiling quickly. Think of Rust, which has a high degree of structure in its compilation process and so a high degree of optimisability, yet it still suffers slow compilation due to its usage of a C++ compiler backend.

    I think this really because C++ builds are fundamentally unstructured. Rather than invoking a compiler on the entire build directory and letting it handle each file, it is invoked once for every file in a way that might be non-trivial. Improving the build process almost always comes at the

    Beyond that, C++ developers simply do not care about slow compilations times. If they did, they wouldn't be using C++. It's my personal theory that C++ as a language has effectively self-selected a user-base that is immensely tolerant of this kind of thing by driving off anyone who isn't.

    • andersa 10 days ago

      > I think this really because C++ builds are fundamentally unstructured. Rather than invoking a compiler on the entire build directory and letting it handle each file, it is invoked once for every file in a way that might be non-trivial.

      True, this would also need to be fixed. Compilation would need to become a single process that can effectively manage concurrency and share data.

  • affgrff2 10 days ago

    Aren't there modules since c++20 which solve this problem?

    • andersa 10 days ago

      Modules are a massively over engineered "solution" to the problem that require significant refactoring to actually make use of them. Have you tried to properly use modules (i.e. create ones in your software, not just import std)? It's super clunky and still hardly usable.

      I doubt we'll see Unreal Engine get any benefit from that in a long time for example. It could be so much better, working fully automatically with almost all existing code so long as you use IWYU, which is already standard for large projects where this is needed the most.

diath 11 days ago

Ever since I tried -ftime-trace in Clang to improve build times in a project a while ago, I've been very conscious about using forward declarations wherever possible. However, I wish we had proper module support that actually worked well; having to keep this in mind whenever writing new code just so your project doesn't take forever to compile sucks. This shouldn't even be something we have to keep in mind in 2024.
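
For anyone unfamiliar with the technique, a rough sketch (names made up): as long as a type only appears as a pointer, reference, or function parameter/return type in the header, a forward declaration is enough and the heavy #include can move into the .cpp file:

    // scene.h - forward declarations instead of #include "mesh.h" / "camera.h"
    class Mesh;
    class Camera;

    class Scene {
    public:
        void add(Mesh* mesh);                  // pointer parameter: declaration suffices
        void render(const Camera& camera);     // reference parameter: declaration suffices
    private:
        Mesh* meshes_[64] = {};                // pointer members don't need the full type either
        int count_ = 0;
    };

    // scene.cpp - the heavy headers are paid for once, here
    #include "scene.h"
    #include "mesh.h"
    #include "camera.h"

    void Scene::add(Mesh* mesh) { meshes_[count_++] = mesh; }
    void Scene::render(const Camera& camera) { /* full definitions available here */ }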

Kelteseth 11 days ago

I recently created https://arewemodulesyet.org/ to track module adoption in the C++ ecosystem. I used how often a library's vcpkg manifest file changed to get a rough estimate of how popular it is.

  • lastgeniusua 10 days ago

    Thank you very much for this resource! Reading through the article and the discussion here I was really surprised why nobody discussed using actually existing modules, this clarifies how far that is from a solution.

    You should really post this as a separate Show HN story!

    "Estimated finish by the year 5134" made me chuckle

    • Kelteseth 10 days ago

      And we just improved the waiting time by about 1000 years by adding flux to the list of supported projects. But sarcasm aside, yes, this is a bit of criticism of a C++ ecosystem that moves a bit too slowly. Many libraries pride themselves on only using C++11/14.

nrclark 10 days ago

Back when I was doing C++ more often, I got a big speedup on my builds (and also smaller binaries) by using C-style forward declarations for everything that I possibly could, including class methods. Method definitions / template instantiations would be placed into a corresponding .cpp file.

I know that it's unpopular in a world of header-only libraries, but it really does make a dramatic difference to build-time. Especially if your code is heavy on templates.

LeSaucy 11 days ago

I use C++ daily and find that ccache and an M1/M2/M3 CPU go a very long way toward reducing build times.

  • ErikCorry 10 days ago

    Ccache is great but then change one central .h file and you are back to square one.

    • gpderetta 10 days ago

      Ccache + distcc (or similar caching/distributed build solutions) work very well.

  • MathMonkeyMan 11 days ago

    Not sure why this was downvoted. It's true that ccache and build parallelization (e.g. icecream) can grease the wheels enough that builds are no longer a dev cycle bottleneck.

    What the article is about, though, is changing the source code so that it is intrinsically faster to compile. At some point you say "this program isn't complicated, why does it take so long to compile?" Then you start looking at unnecessary includes, transitive includes, forward declarations, excessive inlining, etc.

    • rjkaplan 11 days ago

      I'm guessing the comment was downvoted because the suggestions are mentioned in the first paragraph of the article...

      > After trying a few stopgap solutions—like purchasing M1 Maxs for our team—build times gradually reverted to their original pace; Ccache and remote caching weren’t enough either.

      • gpderetta 10 days ago

        Parallelization is not a stopgap solution. It's the only scalable one, as C++ projects easily grow to multimillion lines of code. And with distcc (or similar) you do not need to buy your developers beefy workstations (although you should!).

  • scheme271 11 days ago

    Yeah, I got 15 minute build times down to under 30s using ccache. Doesn't help with cold rebuilds but once you have a cache, it really does help things significantly.

pietroppeter 10 days ago

> It is written purely in Python and does not use Clang, which makes it fast to run—usually in just a couple of seconds.

Oh, the irony

  • intelVISA 10 days ago

    Indeed, this article is supposed to promote Figma's engineering team for 'solving' a solved problem; reading it does quite the opposite.

chipdart 12 days ago

This blog entry is highly disappointing. The Figma blog post reads as if they reinvented the wheel, with basic information that is not only widely known and understood but also featured in books published decades ago.

The blog post authors would do well if they got up to speed on the basics of working with C++ projects. Books such as "Large scale C++ vol1" by John Lakos already cover this and much more.

  • rileymat2 11 days ago

    It could be my bias but it seems a lot of inexperienced developers no longer read comprehensive books on topics but survive on Google, stack overflow, some documentation with examples/simple tutorials, blog posts and now Gpts.

    All are useful tools but they are very poor in eliminating unknown unknowns like a book would.

    • bdowling 11 days ago

      You think only inexperienced developers have stopped reading books?

      • rileymat2 11 days ago

        That's fair, I bet it is more widespread, but people starting in the last 15-20 years did not even have the initial introduction. I read a lot, it surprised me to find out that it was atypical in an industry that is supposed to be somewhat about leveraging brainpower. May as well stand on the shoulders of giants.

jupp0r 11 days ago

Bazel + remote workers yields a great user experience with small infrastructure footprint per developer, but requires quite a bit of work to initially set up. You get reproducible builds, caching of test results and blazingly fast CI as a side effect.

  • jack_pp 11 days ago

    when you use tensorflow from python you're basically using c++

  • bburnett44 11 days ago

    lol I initially read this as remote people workers and was super confused

tinganho 10 days ago

One thing that crossed my mind when I was looking at the TypeScript compiler was that it parsed header "d.ts" files even though they weren't really used in the source, which only had some references to a type in a small main function.

IIRC, I think this is how most compilers do it. The downside is that transitive deps can easily explode. Thus, compiling a super small main function can take seconds.

I did suggest a solution: just lazily parse/check symbols when they are encountered in the source, instead of having to parse all the transitive header files of the file that defines a type as soon as you include it.

wifijammer 11 days ago

I really hate the repetition from separating out header and definition files so I've been writing my whole codebase headers only.

I feel like this kills my compile time but I'm not sure how to fix it. Precompiled headers?

  • flohofwoe 11 days ago

    If you have all your application code in headers you'll get the fastest build times (for full rebuilds at least) by including all headers into a single main.cpp file and build just that, since that way there is no redundant code for the compiler to build at all.

    Of course the downside is that every tiny code change triggers a full rebuild then, but it's quite likely that the most time is spent in the linker anyway, so maybe worth a try.

    • runevault 11 days ago

      I think I've heard this called a Unity build where there's a precompile step that just dumps everything into a single file and compiles that so it doesn't have to re-include everything at different compilation units (when I first heard the term I got confused because it was in a game dev context but had nothing to do with the Unity engine lol).

    • asvitkine 11 days ago

      Except then there's no parallelism since you're only building one file. Ideally you'd split it into N files to take advantage of multiple cores, but then you have to decide how to split it...

      • flohofwoe 10 days ago

        Right. My rule of thumb would be "one implementation file per system", or what would be called a "module" in other languages. So that a moderately complex code base ends up with about a couple dozen source files to build.

        And each system should only have a single 'public interface header' to keep the number of cross-system include dependencies low.

  • pjmlp 10 days ago

    C++20 modules.

    Currently mostly usable on the latest VC++ and Clang 17, with Clang 18 bringing in support for C++23 import std (VC++ already does it).

    Sadly GCC is still far behind, not to mention all the other ones still catching up with C++17.

  • ranger_danger 11 days ago

    ccache is one solution, or a script/IDE plugin that will create both the header/definition from a signature you provide?

stockhorn 10 days ago

I've also tried to optimize C++ compile times on large projects a few times. I never got IWYU working properly and I always hated the fact that I still have to care about header files at all. Then I switched to doing Rust full time, which made all the fiddling with header files obsolete. This felt amazing. But now I'm facing the same problem, slow compile times :). Only this time I have to rely on the compiler team to make the improvements and I can't do much on my side AFAIK.

  • Nereuxofficial 10 days ago

    Well that's not quite true. You can do a few things:

    1. Reduce dependencies and features of dependencies

    2. Use a faster linker like mold

    3. Use a faster compiler backend like cranelift (if possible)

    4. Use the parallel compiler frontend (again, if using nightly is possible)

    5. Use sccache to cache dependencies

    But I do get what you mean. Especially in CI the build times are often long.

    • fanf2 10 days ago

      Split up crates so your compilation units are smaller.

Scubabear68 11 days ago

I find it hard to believe that this post indicates that C++ build times are proportional to included bytes, period.

I haven’t used C++ in quite a while, but aren’t templates a big part of this issue?

  • flohofwoe 11 days ago

    It becomes believable when you consider that your own code is just a very tiny appendix dangling off the end of a massive chunk of included data. For instance just including <vector> results in a 24kloc compilation unit of gnarly template code in Clang with C++23:

    https://www.godbolt.org/z/G18WGdET5

    ...add <string> and <algorithm> and you're at 45kloc:

    https://www.godbolt.org/z/Whv73YPYh

    ...and those numbers have been growing steadily by a couple thousand lines in each new C++ version.

    Multiply this by a few thousand source files (not atypical with the old 'clean code' rule to prefer small source files, e.g. one file per class), and that's already dozens to hundreds of millions of lines of code the compiler needs to process on a full rebuild, all spent on compiling <vector> over and over again.

    TL;DR: the most effective way to improve build times in C++ is to split your project into few big source files instead of many small files (either manually, e.g. one big source file per 'system', or let the build system take care of it via 'unity' or 'jumbo' builds).

    • gpderetta 10 days ago

      From a quick test, including every single header in the standard C and C++ library is 180k loc in an otherwise empty .cc file and compiling it with g++13 in C++23 mode takes 1.6s on my machine. Not amazing, but not terrible either.

      I also don't believe in tiny instantiation units, but when compiling real code, parsing the headers themselves is not necessarily the bottleneck.

    • mgaunard 11 days ago

      Except that doesn't improve the time of iterative builds, which are the only ones that really matter to software development.

      • flohofwoe 11 days ago

        It does though for header changes, which then may trigger fewer source file compilations. IME in incremental builds the most time is spent in the linker anyway.

        > which are the only ones that really matter to software development.

        Debatable in this age of cloud CI builds ;)

        • josephg 11 days ago

          > IME in incremental builds the most time is spent in the linker anyway.

          A lot of time is spent in the linker because the linker needs to parse and deduplicate any monomorphized C++ classes (like vector). This takes time proportional to the number of compiled copies of the class / function that are kicking around.

          So I'd expect linking times to also decrease if you're compiling fewer, larger source files.

        • mgaunard 10 days ago

          Splitting everything in small files makes it so that very little needs to recompile when you change something.

          Linking is pretty much just I/O-bound unless you're using LTO. This is assuming you're using a modern linker like mold.

          • flohofwoe 10 days ago

            You also need to take into consideration, though, that there are no "small" source files anymore as soon as you include anything from the C++ stdlib. Each tiny source file ends up anywhere between 20 and 100kloc after includes.

            Also, many C++ projects I've seen indirectly include almost anything into anything under the hood, so a header change on one end of the project may trigger a rebuild of seemingly unrelated source files.

            • mgaunard 10 days ago

              They're small in the sense that their dependency on the rest of your codebase is small.

      • SleepyMyroslav 10 days ago

        The typical solution is to automatically exclude modified files from the unity build. So when you edit a single .cpp you build two translation units the first time and only one afterwards if you keep iterating on it.

        Such a solution introduces more funny failure modes into the build though. People get quite irritated when a no-change edit to a single file breaks the build :)

gosub100 11 days ago

I just checked and distcc [1] is still a thing. It's been around a while now, I remember messing with it in about 2010. It allows you to parallelize your builds across multiple machines. Not as relevant now with todays multicore CPUs, but if you or your employer can't afford to buy you a work station, distcc might be your answer.

[1] https://www.distcc.org/

  • gpderetta 10 days ago

    Even if you have a beefy workstation, if your project has thousands of translation units (i.e. any moderately large C++ program), distcc (and ccache) still help significantly.

    Unfortunately linking becomes the bottleneck.

  • curiousgal 10 days ago

    We use Incredibuild at work.

jsbus 10 days ago

I was thinking about applying at Figma a couple of days ago. Seeing the engineering culture portrayed in this post, I'm not any more.

ErikCorry 10 days ago

My tip: Increase the size of .cc files. Since each .cc file is including tens of thousands of .h lines, you should not allow developers to check in < 100 line .cc files.

The OP was seeing build times increase faster than loc. Probably someone on the team likes small .cc files.

  • choppaface 10 days ago

    But translation units / small .cc files can be built in parallel and cached, so with multi-core machines it's desirable to have many small translation units. Except of course when there's eventually one large translation unit that needs everything and then link time dominates ...

    The article emphasizes a common issue about headers. Xcode and Visual Studio work around this to some extent with precompiled headers, something that can be really hard to set up in ccache. If Figma's whole team is using Macs (they mention getting everybody MacBooks?) then I wonder if they could just switch to Xcode and use built-in PCH support. While that introduces a dependency on Xcode :( maybe their whole C++ stack will get effectively rewritten in the next couple of years anyway?

    • ErikCorry 10 days ago

      OK, when you only have as many .cc files as you have CPU cores, you should stop making them bigger. This is not an issue for big projects.

      The caching of compilation fails if you touch a central .h file. When you work on projects like that you start dreading a change to the .h files because development slows to a crawl as the compile times explode.

      I worked on V8 and over time the .cc files got smaller and the build times got much worse. Some people felt this was neater, but if you are not in an office with 20 beefy workstations using distcc the effect is brutal.

threesmegiste 10 days ago

Whenever I try to learn C++, these build/compile issues put me off. I really don't understand how such an important language could still have tooling problems, especially for beginners. I don't wanna learn a language in a browser, and I don't wanna download Visual Studio. Is there a way, or a lightweight complete solution or app, to resolve this issue for beginners? I really wanna learn this language to understand the computer. I don't love Rust or Carbon or Go or Zig or any other language called a systems language. I wanna learn C++. I know BASIC and Pascal; yes, you guessed right, I am old.

taylodl 10 days ago

Back when I was doing a lot of C++ development, precompiled headers had become the solution to this problem. If memory serves me correctly, I think it was Symantec who first implemented this feature. It dramatically sped up compilation times on processors that were only running at 33 MHz - or slower!

Though I was an early adopter of STL at the time, it still hadn't enjoyed widespread use yet. Are templates now the problem with pre-compiled headers? If so, then that should be the problem we tackle.
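
For reference, the mechanism still exists and is one command away in GCC; a rough sketch (the precompiled header is only picked up when the compile flags match):

    g++ -x c++-header -std=c++17 -O0 common.h    # emits common.h.gch
    g++ -std=c++17 -O0 -c foo.cpp                # '#include "common.h"' now loads the .gch
                                                 # instead of reparsing the header text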

delta_p_delta_x 11 days ago

A big improvement in compile times with modern C++ can be had with C++20 standard modules.

This requires extremely bleeding-edge toolchains, though: VS 2022 17.10, Clang 18, and GCC 14.

DylanSp 10 days ago

I'm curious how bad the build times were; minutes, tens of minutes, hours? I didn't see any absolute times in the post, just percentages.

saidinesh5 10 days ago

Funnily enough, this topic was all they asked about in one of the job interviews I had when I was younger: they shared a bunch of source files and asked me where/what I could do to improve the compilation speeds.

I didn't realize how big the impact of these little changes were (only include what you use, forward declare as much as you can, PIMPL etc..) until I worked on a large codebase in that company.

mgaunard 11 days ago

DIWYDU sounds like a better tool than IWYU.

feverzsj 11 days ago

On the contrary, I prefer putting everything in header files, and including the related headers in the same source file just for compilation. It's basically a unity build but without its drawbacks.

setheron 11 days ago

Isn't this what header guards or #pragma once are for?

rurban 9 days ago

The best thing would be to convert to C and use tcc. Other options would be dlang or go.

Cfront with tcc would sound nice enough to try

ramon156 10 days ago

The longest acronym I know is WYSIWYG (What You See Is What You Get)

  • 8372049 10 days ago

    TIMTOWTDI (There Is More Than One Way To Do It)

chrisjj 10 days ago

> the perennial issue of slow build times

No, that's long build times.

anthk 10 days ago

Use ccache and stop reinventing the wheel.

teunispeters 11 days ago

(silly answer) Use Visual C++. I've been coding across macOS, Linux, and Windows 11, with roughly comparable machines in a lot of ways (all Intel). Visual C++, while awkward for compatibility, is very very fast for builds, and not bad as an IDE for rapid fixes.

More serious - I moved to CMake presets and with that came a lot of cache optimization - including parallel builds. MacOS is now almost as fast as Windows for build, and Linux/gcc not far behind. Windows C++ seems to have the lowest modern feature compatibility, followed by MacOS/Clang, with Linux/recent GCC being the most complex. A lot of the newer features seem to add a lot to the build time..

... mind I've been working with C++ only for the last few months, and C for many years before, so consider it a beginner post in a lot of ways. Still, it was interesting to explore, and I'll be continuing to explore - I haven't yet enabled ccache for instance which I suspect will improve a lot.

  • jupp0r 11 days ago

    My experience is the exact opposite. Moving a multimillion line C++ code base from msbuild to CMake/ninja on Windows cut the build time in half.

    Chrome got even better speedups I believe by building with clang/ninja on Windows.

    Bazel is where the real benefits lie by reusing other people's (or CI machine's) partial build artifacts via a centralized cache and by avoiding to run tests that are not affected by code changes.

    • spacemanspiff01 11 days ago

      How does bazel work with cmake builds?

      • tambre 10 days ago

        Seems it has the necessary integration points to run CMake builds as an external command. The same way you could build Make, Autotools, Meson or Bazel projects from CMake with the necessary external command plumbings.

        Obviously both fill the same purpose of being a build system, though Bazel is also a build executor, not just a generator. Integration would mean either adding BUILD language support to CMake or vice versa, but you wouldn't get the particular benefits of either this way.

  • gpderetta 10 days ago

    Haven't used MSVC in a long while, but at the last place where we were doing multi-compiler builds, MSVC was always slower by far.

juunpp 11 days ago

Just another self-promotion blog post with near zero information density. "Behold, the vastness of our vanity, regurgitating old news like we just discovered something new." Do these posts really help with hiring?

CLion also highlights unused includes, nothing new here. Use a good IDE. A networked ccache also does wonders if your org allows it.

Slow builds otherwise stem from a combination of: a) lack of proper modules in C++ (until recently) and b) unidiomatic or just terrible code bases. To help with the latter, hide physical implementations (PIMPL for class state, forward declarations for imports), avoid OOP-style C++ above all, minimize use of templates, design sound and minimal modules. No rocket science.

  • MathMonkeyMan 11 days ago

    Three times I've joined a team that has a substantial C++ codebase, and three times I've been tempted to use libclang based tooling to automate changes, or at least to identify patterns that could be changed.

    This article, while not the nerdy deep dive I'd like, does touch on what happens when you try to do that. You realize that the C++ standard library is really complicated, that your existing code is really fucked up, and that libclang is too limited a tool. You end up writing an XSLT engine in hacked-up Python, but by a different name.

    [LibTooling][1] is probably The Right Thing ("in C++", as the article says), but I never spent the time to get it working.

    Somebody write a DSL for C++ inspection and transformations that uses LibTooling as a backend. I bet there are many, but none close at hand.

    edit: [this][2] is close...

    [1]: https://clang.llvm.org/docs/LibTooling.html

    [2]: https://clang.llvm.org/docs/LibASTMatchersTutorial.html#inte...
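
    For the curious, a rough sketch of the LibASTMatchers approach from [2], assuming a reasonably recent LLVM/Clang (the CommonOptionsParser API has changed across releases). The tool name and matcher here are made up; it just lists class definitions in the main file, as a stand-in for whatever pattern you actually want to find:

      // match_classes.cpp -- a minimal ClangTool sketch, not production code
      #include "clang/ASTMatchers/ASTMatchFinder.h"
      #include "clang/ASTMatchers/ASTMatchers.h"
      #include "clang/Tooling/CommonOptionsParser.h"
      #include "clang/Tooling/Tooling.h"
      #include "llvm/Support/CommandLine.h"
      #include "llvm/Support/raw_ostream.h"

      using namespace clang;
      using namespace clang::ast_matchers;
      using namespace clang::tooling;

      static llvm::cl::OptionCategory ToolCategory("match-classes options");

      // Match every class/struct *definition* in the main file (skip headers).
      DeclarationMatcher ClassMatcher =
          cxxRecordDecl(isDefinition(), isExpansionInMainFile()).bind("class");

      class ClassPrinter : public MatchFinder::MatchCallback {
      public:
        void run(const MatchFinder::MatchResult &Result) override {
          if (const auto *RD = Result.Nodes.getNodeAs<CXXRecordDecl>("class"))
            llvm::outs() << RD->getQualifiedNameAsString() << "\n";
        }
      };

      int main(int argc, const char **argv) {
        auto Options = CommonOptionsParser::create(argc, argv, ToolCategory);
        if (!Options) {
          llvm::errs() << Options.takeError();
          return 1;
        }
        ClangTool Tool(Options->getCompilations(), Options->getSourcePathList());

        ClassPrinter Printer;
        MatchFinder Finder;
        Finder.addMatcher(ClassMatcher, &Printer);
        return Tool.run(newFrontendActionFactory(&Finder).get());
      }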

  • stefan_ 11 days ago

    Is there anyone else who gets unreasonably angry at stuff like PIMPL? It is truly the most braindead, senseless activity in this world. There was a comment in one of the many Rust threads that called C++ a respectable language now, but then things like PIMPL snap you right back into the wasteland it is.

    • wakawaka28 11 days ago

      PIMPL is an elegant solution to multiple problems. Idk what you could possibly have against it besides the extra work involved. I don't think any language has solved the fundamental problem of hiding details better than PIMPL does.

      • bananaboy 11 days ago

        I really like C#'s `partial` keyword as a solution to the problem of hiding implementation details. It lets you declare a class over several files, so you can have one file which is only the public interface, and another which has private implementation.

        • wakawaka28 11 days ago

          That is essentially the same idea as PIMPL. You put the private parts of the class (e.g., the data layout) in some file that is held privately. I guess you could argue that there is extra syntax involved with PIMPL because C++ is more low-level than C#, but it's not so bad. The actual implementation of a class can be spread over as many files as you want in C++.

          • bananaboy 10 days ago

            Yes, but pimpl is really just a hack and a workaround for the fact that you can't separate the public interface from the implementation details in C++, since everything has to go in the class definition. imho `partial` is superior, as you don't need an additional allocation and indirection.

            • nrclark 10 days ago

              You actually don't need to do that at all. It's common style in C++, but the language does not require it.

              With the right techniques, you can absolutely forward-declare basically all of a class's functionality. Then you can put it into its own translation unit.

              Members and function signatures have to be declared in the header, but details about member values/initialization and function implementations can absolutely be placed in a single translation unit.
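
              A minimal sketch of that split (Counter is a made-up name): members and signatures are declared in the header, while every initializer and function body lives in one .cpp.

                // counter.h -- declarations only; layout is visible, logic is not
                #pragma once

                class Counter {
                public:
                    Counter();
                    void add(int delta);
                    int total() const;
                private:
                    int total_;
                };

                // counter.cpp -- the single translation unit with the implementations
                #include "counter.h"

                Counter::Counter() : total_(0) {}
                void Counter::add(int delta) { total_ += delta; }
                int Counter::total() const { return total_; }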

              • wakawaka28 10 days ago

                The data layout of a class must be part of its definition. So either you expose the layout (data members), add a layer of indirection with PIMPL, or resort to ugly hacks to otherwise hide the data layout such as having buffers as members. Another possibility is to not use exposed classes for member functions. Then you can just pass pointers around only and never use C++ features. Out of all of these, PIMPL solves the problem the best.

                • nrclark 10 days ago

                  Yes, that's true. But if the concern is build-times, exposing the data layout is harmless. We don't necessarily need full PIMPL just to get improved build times. By keeping the data layout in the .hpp, you can guarantee that your class can still stack-allocate.

                  • iainmerrick 9 days ago

                    > if the concern is build-times, exposing the data layout is harmless

                    Not at all! If your private member variables use any interesting types (and why shouldn't they?) you need to include all the headers for those types too. That's exactly why you get an explosion of header inclusion.

                  • wakawaka28 9 days ago

                    If you change the data layout of your exposed class, you must recompile anything that uses it. That increases build times and also breaks ABI. And as the other guy commented, the data itself has types that also need definitions and file inclusions. Without PIMPL, your data layout can change without touching your header, due to changes in a related header (even a 3rd party header).

            • Maxatar 10 days ago

              You don't need any indirection with partial because in C# all classes already go through a layer of indirection.

              However, consider that in C# if you add a field to a struct, which is a value type and hence no indirection, then you do need to recompile all assemblies that make use of that struct. It's no different than C++ in this regard.

          • comex 11 days ago

            And yet C, an even lower-level language, achieves the same effect without the duplication of PIMPL. You just forward-declare a struct, and declare functions that accept pointers to it: the header doesn't need to contain the struct fields, and you don't need to define any wrapper functions. Technically you can do the same in C++. But in C++ to make an idiomatic API you need methods instead of free functions, and you can't declare methods on forward-declared classes. Why not? Well, I can imagine some reasons… but they have more to do with C++'s idiosyncrasies than any fundamental limitation of a low-level language.

            The C++ committee could address this, but instead they seem to want to pretend separate compilation doesn't exist. (Why are there no official headers to forward-declare STL types, except for whatever happens to be in <iosfwd>?) Then they complain about how annoying it is to preserve ABI stability for the standard library, blaming the very concept of a stable ABI [1] [2], all while there are simple language tweaks that could make it infinitely more tractable! But now I'm ranting.

            [1] https://cor3ntin.github.io/posts/abi/

            [2] https://thephd.dev/binary-banshees-digital-demons-abi-c-c++-...
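
            For what it's worth, the C pattern does translate to C++ almost verbatim if you settle for free functions instead of methods; a sketch with made-up names (Database, database_open):

              // database.h -- the type stays opaque to users: no fields, no wrappers
              #pragma once

              class Database;                     // forward declaration only

              Database* database_open(const char* path);
              void      database_close(Database* db);

              // database.cpp -- the only file that ever sees the fields
              #include "database.h"
              #include <string>

              class Database {
              public:
                  std::string path;               // layout can change without touching database.h
              };

              Database* database_open(const char* path) { return new Database{path}; }
              void      database_close(Database* db)    { delete db; }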

            • uecker 10 days ago

              It is indeed one of the most grotesque language design errors C++ made early on.

            • wakawaka28 10 days ago

              First of all, comparing C to C++ in this way is silly, because C++ is a very different language. But there are some similarities.

              > You just forward-declare a struct, and declare functions that accept pointers to it: the header doesn't need to contain the struct fields, and you don't need to define any wrapper functions.

              Those functions would be more verbose because they must contain an explicit `this` equivalent pointer. This would have to be repeated at every single call site. So it's not really helping.

              You don't need wrapper functions for PIMPL. You can have them if you think it's worthwhile, of course.

              >Technically you can do the same in C++. But in C++ to make an idiomatic API you need methods instead of free functions, and you can't declare methods on forward-declared classes. Why not?

              There are good technical reasons why you can't tack member functions onto the interface of a forward-declared class. There would be nowhere for that information to go, if nothing else. I think I heard a talk about adding new metaprogramming features to C++ that might address this in C++26 or so, but anyway it's not a significant problem to simply work around.

              I think you can probably make some template-based thing that would automate implementing the wrappers for you. But it would be a convoluted solution to what I consider a non-problem.

              >The C++ committee could address this, but instead they seem to want to pretend separate compilation doesn't exist. (Why are there no official headers to forward-declare STL types, except for whatever happens to be in <iosfwd>?)

              Most of the STL types that people need are based on templates. It does not make sense to forward-declare those. I just don't see a use case for forward-declaring much besides io stuff and maybe strings.

              >Then they complain about how annoying it is to preserve ABI stability for the standard library, blaming the very concept of a stable ABI [1] [2], all while there are simple language tweaks that could make it infinitely more tractable! But now I'm ranting.

              There seems to be a faction of the C++ committee that does not share the traditional commitment to backward compatibility. They have gone so far as to lobby for a rolling-release language, which is guaranteed to be a disaster if implemented. I think wanting to break ABI might be a sign of that. Let's hope they use good judgement and don't turn the language into an ever-shifting code-rot generator.

              Keep in mind, there may be ABI breakage coming from your library provider anyway, on top of what the committee wants. So it's not necessarily a cataclysmic surprise; it's something you're supposed to plan around anyway. ABI stability between language standards is mostly a concern for people who link code built with different C++ standards (probably a lot of code). It wouldn't be the end of the world if you had to recompile old code with a newer standard, but it might generate significant work.

              • gpderetta 10 days ago

                > There are good technical reasons why you can't tack member functions onto the interface of a forward-declared class.

                Are there? You could have a class decorator to mark a definition as incomplete and only allow member functions, types and constant definitions:

                  // in Foo.h
                  incomplete struct Foo {
                  public:
                     Foo();
                     Foo(const Foo&);
                     void frobnicate();
                  };

                  // in foo.cc

                  struct Foo { // redefinition of incomplete structs is allowed
                  public:
                     Foo(){...}
                     Foo(const Foo&) {...}
                     void frobnicate(){...}
                  private:
                     void bar() {...}
                     int baz;
                     std::string bax;
                  };
                
                edit: there are also very good reasons to fwd declare templates. You might want to add support in your interface for an std template without imposing it on all users of your header. In most companies I have worked at, we had technically illegal fwd headers for standard templates.

                • wakawaka28 10 days ago

                  >Are there? You could have a class decorator to mark a definition as incomplete and only allow member functions, types and constant definitions:

                  C++ already has a way to do this via inheritance, even multiple inheritance. I suppose some of the same machinery used for inheritance could be repurposed for partial class definitions but it is unnecessary.

                  Edit: I think I overlooked something here at first glance. Yes it might be nice to have a public partial definition of a class and a private full definition. But the technical reason you can't have this is that using a class in C++ requires knowing its memory characteristics. If that information does not come from the code, then it must come from somewhere else like a binary. Maybe the partial definition could be shorthand for "use PIMPL" but I haven't thought through all the ways it could go wrong, such as with inheritance.

                  >edit: there are also very good reasons to fwd declare templates. You might want to add support in your interface for an std template without imposing it to all users of your header. In most companies I have worked, we had technically illegal fwd headers for standard templates.

                  I have never seen illegal forward declaration headers. Not at any company I've worked at, nor in any open-source project. I don't think there is a reasonable value proposition to doing that. What kind of speedup are you expecting from that?

                  >You might want to add support in your interface for an std template without imposing it to all users of your header.

                  This sounds good in theory, but in practice most interfaces I've seen use the same handful of types or std headers, so it can't be avoided; furthermore, you'd be forcing everyone to bring their own std headers every time (and they'd probably forget why they ever included them in the first place). That's a lot of trouble to maybe save one simple include, and it introduces a lot of potential for unused and noisy includes elsewhere.

                  • gpderetta 10 days ago

                    Yes, that's why I used the 'incomplete' keyword. Of course you can only pass around pointers and references to incomplete classes (although there might be ways around that).

                    Base classes almost work, but you either need all your functions to be virtual or you need to resort to nasty casts in your member functions.

                    re std fwds, typically the forwarding is needed when specializing traits while metaprogramming.

      • kevin_thibedeau 11 days ago

        Ada had all of C++'s problems figured out in 1983. PIMPL as a means of boosting compiler performance is fundamentally braindead. We shouldn't be bending over backwards with broken tools to make them sort of work.

        • wakawaka28 11 days ago

          PIMPL doesn't only boost compiler performance. It provides code-hiding and ABI stability for everyone using it effectively. It's like killing 3 birds with one stone. PIMPL for sure isn't gonna be the thing to convince me that C++ is broken.

          Ada has piqued my curiosity before but I think if it was as good as you make it sound, it might have at least 1% market share after 40 years. It doesn't. I can't justify the time investment to learn it unless I get a job that demands it.

          • iainmerrick 9 days ago

            It's not PIMPL per se that's the problem, it's that C++ needs it but makes it very awkward to write. It feels like the language is fighting against you rather than setting you up for success. At least that's been my angle in this discussion.

        • pjmlp 10 days ago

          Unfortunately, "bending over backwards with broken tools" is exactly what made C++ a success in the first place, an idea also adopted by Objective-C and TypeScript.

          Trying to make the best out of an improved language while not touching the broken tools of the existing kingdom they were trying to build upon.

          Naturally such a decision cuts both ways: it helps gain adoption, and it becomes a huge weight to carry around when backwards compatibility matters for staying relevant.

    • Maxatar 11 days ago

      No, maybe the acronym is ugly (it sounds like "pimple"?), but other than that the technique is invaluable for writing stable ABIs, which in turn makes distributing C++ libraries a lot easier.

    • iainmerrick 10 days ago

      Yes! Why should I need to do extra work, messing up runtime performance by adding a pointer indirection, just to improve compilation time a bit?

      In C, idiomatic C, you can forward-declare a struct and the functions that operate on it, and you don't need any indirection at runtime. C++ has plenty of nice features, and in general I'd reach for it rather than C, but for some reason it can't do that!

      • Maxatar 10 days ago

        This doesn't make much sense.

        Sure you can forward declare a struct and functions that operate on it, but you can't call the function or instantiate the struct without the definition. That's no different than in C++.

        The purpose of PIMPL is that you can actually call the function with a complete instantiation of the struct in such a way that changes to the struct do not require a rebuild of anything that uses the struct.

        It's not about just declaring things, it's about actually being able to use them.
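
        To make that concrete, a minimal PIMPL sketch (Document/Impl are made-up names): document.h stays byte-for-byte identical while the fields behind Impl change, so code that uses Document doesn't need to recompile.

          // document.h -- stable interface; users recompile only if this file changes
          #pragma once
          #include <memory>
          #include <string>

          class Document {
          public:
              Document();
              ~Document();                    // defined in the .cpp, where Impl is complete
              std::string title() const;
          private:
              struct Impl;                    // all data members live behind this
              std::unique_ptr<Impl> impl_;
          };

          // document.cpp -- fields can be added or removed freely here
          #include "document.h"

          struct Document::Impl {
              std::string title = "untitled";
          };

          Document::Document() : impl_(std::make_unique<Impl>()) {}
          Document::~Document() = default;    // ~unique_ptr<Impl> needs the complete Impl
          std::string Document::title() const { return impl_->title; }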

        • iainmerrick 10 days ago

          I'm thinking of this kind of declaration in C:

            typedef struct foo Foo;
          
            extern Foo* foo_create();
          
            extern const char* foo_getStringOrSomething(Foo* foo);
          
          That's a fully opaque type, and it's reasonably efficient. The one thing you can't do is store a Foo on the stack, because you don't know its internal size and layout. So it's always a heap pointer, but there's only one level of indirection.

          In idiomatic C++ I think you'd have something like:

            class Foo {

              struct impl;

              std::unique_ptr<impl> _impl;

            public:

              std::string getStringOrSomething();

            };
          
          If I have a pointer to a Foo, that's two pointer indirections to get to the opaque _impl. So, okay, I can store my Foo on the stack and then I'm back to one pointer. But if I want shared access from multiple places, I use shared_ptr<Foo>, and then I'm back to two indirections again.

          The idiomatic C++ way to avoid those indirections is to declare the implementation inline, but make it private so it's still opaque. But then you get the exploding compile times that people are complaining about in this thread.

          The C approach is a nice balance between convenience, compilation speed and runtime performance. But you can't do this in idiomatic C++! It's an OO approach, but C++'s classes and templates don't help you implement it. C++ gives you RAII, which is very nice and a big advantage over C, but in other respects it just gets in the way.

          Edit to add: now that I look at this, Foo::getStringOrSomething() will always be called through a pointer (or a Foo&), so it will always need a double dereference to access the impl. Unless, again, you inline the definition so the compiler has enough information to flatten that indirection.

          I don't see how that pImpl approach can ever be as performant as the basic C approach. Am I missing something?

          • Maxatar 10 days ago

            In C++, if you want an opaque type similar to your C example, you give your class a static factory function.

                class Foo {
                  public:
                    static std::unique_ptr<Foo> create();

                    virtual ~Foo() = default;  // so deleting through Foo* is well-defined

                    std::string getStringOrSomething();

                  protected:
                    Foo() = default;           // only the hidden FooImp can construct one

                    Foo(const Foo&) = delete;

                  private:
                    struct FooImp;

                    FooImp& self();            // defined in the .cpp, where FooImp is complete
                };
            
            And then you use inheritance to hide your implementation in a single translation unit, i.e. a .cpp source file.

                struct Foo::FooImp : Foo {
                  std::string something;
                };

                Foo::FooImp& Foo::self() {
                  return static_cast<FooImp&>(*this);
                }

                std::string Foo::getStringOrSomething() {
                  return self().something;
                }
            
            With this, there is a single indirection to access the object just as in the C example.

            But PIMPL is used when you want to preserve value semantics, things like a copy constructor, move semantics, assignment operations, RAII, etc...

            • iainmerrick 9 days ago

              That’s a nice trick, I don’t recall seeing that one before!

              It still seems like an awful lot of boilerplate just to reproduce the C approach, albeit with the addition of method call syntax and scoped destructors.

              I feel like there must be an easier way to do it. Hmm, maybe I’m at risk of becoming a Go fan...!

              • Maxatar 9 days ago

                The boilerplate gives you additional type safety compared to C.

                If you want the same type safety as C, which is basically none, then you can write it as:

                    // foo.hpp
                    struct Foo {
                      static Foo* make();
                
                      void method();
                    };
                
                    // foo.cpp
                    struct FooImp : Foo {
                      std::string something;
                    };
                
                    Foo* Foo::make() {
                      return new FooImp();
                    }
                
                    void Foo::method() {
                      ...
                    }
                
                And yes, it's rare because nowadays most C++ developers stick to value semantics as much as possible rather than reference semantics, but this approach was very common in the early 2000s, especially when writing Windows COM components.

                Nowadays if you want ABI stability, you'd use PIMPL. Qt is probably the biggest library that uses this approach to preserve ABI.

                • iainmerrick 9 days ago

                  I don't think it's fair to say that C has no type safety. To recap, I had:

                    typedef struct foo Foo;
                  
                    extern Foo* foo_create();
                  
                    extern const char* foo_getStringOrSomething(Foo* foo);
                  
                  The only valid way to get a Foo (or a Foo ptr) is by calling foo_create(). Inside foo_getStringOrSomething(), the pointer is definitely the correct type unless the caller has done something naughty.

                  Of course there are a few caveats. First, the Foo could have been deleted, so you have a use-after-free. That's a biggie for sure! Likewise the caller could pass NULL, but that's easily checked at runtime. Those are part of the nature of C, but they're not "no type safety".

                  You can also cast an arbitrary pointer to Foo*, but that's equally possible in C++.

                  • Maxatar 9 days ago

                    A comment like "basically none" should not be taken literally. It is intended to indicate that the difference between the C++ approach and the C approach is that the C++ approach gives you a great deal of type safety to the point that the C approach looks downright error prone.

                    The C++ approach of sticking to value semantics doesn't involve any of the issues you get working with pointers: lifetime issues, null pointer checks, invalid casts, or forgetting how to properly deallocate. For example, you have a foo_create but didn't provide the corresponding foo_delete. Do I delete it using free, which could potentially lead to memory leaks? The type system gives me no indication of how I am supposed to properly deallocate your foo.

                    You don't like boilerplate - fair enough, it's annoying to write - but is boilerplate in the implementation worse than burdening every user of your class by prefixing every single function name with foo_?

                    The C++ approach allows you to treat the class like any other built in type, so you can give it a copy and assignment operator, or give it move semantics.

                    So no, it's not literally true that C has absolutely zero type safety. It is true that, compared to the C++ approach, it is incredibly error-prone.

                    While older C++ code is rampant with pointers, references, and runtime polymorphism, best practice when writing modern C++ is to stick to value types, abstain from exposing pointers in your APIs, and prefer type-checked parametric polymorphism over object-oriented programming.

                    If anything, the worst parts of C++, including your point about being able to perform an invalid cast, are inherited from C. C++ casts, for example, do not allow arbitrary pointers to be cast to each other.

    • bun_terminator 11 days ago

      I have never used or seen pimpl in my life, and C++ is all I do.

      • MathMonkeyMan 10 days ago

        Does somebody using my "class Foo" really need to know everything about its "std::unordered_map<std::string, std::unique_ptr<NetBeanFactory<RageMixin, DefaultBowelEvictionPolicy>>>" data member?

        No, they need to know only the size and alignment of "class Foo". Unfortunately, in C++, either a client has to see every type definition recursively down to the primitives, or you give them an opaque pointer and hide all of the internals in the implementation (all they know about "sizeof(Foo)" is "it contains a pointer to something on the heap, probably").

        edit: Ok, there's also the copy constructors and other possibly auto-generated value semantic operations, but pretend you've defined those explicitly, too.

        • bun_terminator 10 days ago

          I know what pimpl is - I relearn it before every job change for interviews. But I've never seen it in use, and don't see a use for it. Compile times are rarely an issue in my experience. At least not big enough to warrant something that extreme.

          I like to keep things simple: I need something, so I include something. Compile times are really orthogonal to that, and mostly a job for compiler devs, hardware people, modules, or whatever. Changing my code because of compile times seems pretty harsh.

          • Liquid_Fire 9 days ago

            pimpl is not for improving compile times (although it can help with that). It's for maintaining ABI compatibility by keeping your implementation details out of public headers.

        • gpderetta 10 days ago

          You can also provide pointers to base classes, but, unless you do some heroics, you are forced to use virtual functions.

allpaca 11 days ago

C++ is evolving so much, but I don't understand one thing: why do people continue to develop AI projects in Python? I'd choose C++ instead...

  • bpicolo 11 days ago

    Because they're Python libraries that just wrap C and C++. All the performance upside with better ergonomics.
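
    As a hedged illustration of what that wrapping can look like, here's a sketch using pybind11 (one common way such bindings are written, though not necessarily what any given AI library uses); the fastops module and sum_of_squares function are made up:

      // fastops.cpp -- the hot loop stays in C++; Python just calls into it
      #include <pybind11/pybind11.h>
      #include <pybind11/stl.h>   // converts Python lists to std::vector and back
      #include <vector>

      double sum_of_squares(const std::vector<double>& xs) {
          double total = 0.0;
          for (double x : xs) total += x * x;
          return total;
      }

      PYBIND11_MODULE(fastops, m) {
          m.doc() = "tiny example of a C++ kernel exposed to Python";
          m.def("sum_of_squares", &sum_of_squares);
      }

      // In Python, after building the extension:
      //   import fastops
      //   fastops.sum_of_squares([1.0, 2.0, 3.0])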

    • oivey 11 days ago

      Putting this another way, people use Python because it makes it way easier to compose the underlying C++ code. Composition and polymorphism in C++’s static type system are rather weak.

      Of course there is also the relative succinctness of Python, and other advantages too.

  • VHRanger 11 days ago

    Because you need a dynamic language to do rapid iteration.

    You don't want to recompile, re-parse a 12GB parquet file, etc. every time you try a new parameter in a model.