kjksf 11 days ago

I wrote about how I keep build times sane in SumatraPDF at https://blog.kowalczyk.info/article/96a4706ec8e44bc4b0bafda2...

The idea is the same: reduce the duplicate parsing of .h files.

I don't use any tools, just a hard-core discipline of only #include'ing .h in .cpp files.

The problem is that if you start #include'ing .h in .h, you quickly start introducing duplication that is intractable, for a human, to avoid.

On another note: C++ compiler should by default keep statistics about the chain of #include's / parsing during compilation and dump it to a file at the end and also summarize how badly you're re-parsing the same .h files during build.

That info would help people remove redundant #include's.

But of course even if they do have such options, you have to turn on some flags and they'll spam your build output instead of writing to a file.
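
For reference, the closest existing knob I know of is `-H` in GCC and Clang, which prints the #include hierarchy as files are opened; it goes to stderr, so you can at least redirect it to a file yourself. A rough sketch:

    g++ -H -c foo.cpp 2> foo.includes.txt
    # -H prints every header as it is opened, prefixed by dots showing include depth,
    # so grepping/sorting that file shows which headers get dragged in the most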

  • Maxatar 11 days ago

    >On another note: C++ compiler should by default keep statistics about the chain of #include's / parsing during compilation and dump it to a file at the end and also summarize how badly you're re-parsing the same .h files during build.

    Clang does offer something very close to this, and if you use it you'll find parsing the same/duplicate header files contributes on the order of microseconds to the overall compile time. Your article is from 1989, which is likely before compilers implemented duplicate header file elimination [1], but nowadays all C++ compilers optimize the common #ifndef/#define header guard as well as #pragma once so that they entirely ignore parsing the same file over and over again.

    [1] https://gcc.gnu.org/onlinedocs/cpp/Once-Only-Headers.html

    • WalterBright 11 days ago

      True, but when you're compiling a.c and then b.c, the .h files get reparsed all over again.

      • highfrequency 11 days ago

        Correct me if I'm wrong, but I believe the parent comment's strategy of only #including header files within .c files just helps reduce duplicate header file parsing within each compilation unit. So it wouldn't do anything to improve the case you mention (duplicate header compilation across compilation units) anyway, while adding much additional overhead in manually tracking header file dependencies.

        Also, given your experience in compilers - keen to see if you agree that because modern compilers optimize away re-scanning of the same header file within a compilation unit anyway (in the presence of include guards), the strategy of only #including header files within .c files is close to useless.

        • WalterBright 11 days ago

          The not rescanning if the #include guards are there goes back to the mid 1980s. It's not a modern feature :-)

          > the strategy of only #including header files within .c files is close to useless

          It probably is. It also means the user of the .h file has to manage the .h file's dependencies, which is not the best practice. .h files should be self-contained.

      • kazinator 11 days ago

        This is not only not true, but not possible. Many kinds of definitions in .h files may not be repeated without error, like:

          struct foo { int bar; };
        
        or

          typedef int xyzzy_t;
        
        
        This is why you have include guards:

          #ifndef FOO_H_3DF0_755A
          #define FOO_H_3DF0_755A
        
          struct foo { int bar; };
        
          #endif
        
        GCC optimized the handling of headers with include guards 30 years ago already.

        If you think most of your compile time is spent in preprocessing, benchmark a clean build with the optimization set to -O0 versus your full optimization like -O2 or whatever you are using.

        Both builds perform the same preprocessing; thus the preprocessing time is bounded by the total time spent in the -O0 build: even if the actual semantic analysis and code generation at -O0 took next to no time at all, we could attribute at most the entire -O0 time to tokenizing and preprocessing. And even then, all the additional time observed under -O2 is not tokenizing and preprocessing.

        • WalterBright 11 days ago

          Um, I think there is a misunderstanding.

              gcc -c a.c
              gcc -c b.c
          
          requires reparsing of the .h files used by both.
  • WalterBright 11 days ago

    Having written, benchmarked, and maintained C and C++ compilers for decades, I know why the compiles are slow:

    1. phases of translation

    2. constant rescanning and reparsing of .h files

    3. cannot parse without doing semantic analysis

    4. the preprocessor has its own tokens - so you gotta tokenize the .h file, do the preprocessing, convert it back to text, then tokenize the text again with the C/C++ compiler. This is madness. (Although with the C compiler I did manage to merge the preprocessor lexer with the compiler lexer, this made it the speed champ.)

    This experience fed into D which:

    1. uses modules instead of .h files. No matter how many times a module is imported, it is lexed/parsed/semanticed exactly once.

    2. module semantics are independent of who/what imports them

    3. no phases of translation

    4. lexing and parsing is independent of semantic analysis

    • kazinator 11 days ago

      Having done some casual benchmarking recently, I found that GCC is about 15 times slower when optimizing than when not. In both situations, the compiler is scanning the same header files, so that activity is bound up within the 1/15th of the optimized compilation time.

      It used to be common wisdom that the character-level processing of code took the most time, just like the old "floating-point is slow; always use integer when possible" advice.

      Also note that the ccache tool greatly speeds up C and C++ builds. Yet, the input to ccache is the preprocessed translation unit! When you're using ccache, none of the preprocessing is skipped. ccache hashes the preprocessed translation unit (plus compiler command line options, the path of the compiler executable and such) in an intelligent way and then checks its cache. If there is a hit, it pulls the .o out of its cache, otherwise it invokes the compiler on the preprocessed translation unit.

      If most of the time were spent in preprocessing, a much more modest speedup would be observed with ccache.

      • WalterBright 11 days ago

        Generally when benchmarking compile speeds, the unoptimized build is used, as that is the edit-compile-debug loop. It's always been true that a good optimizer will dominate the build times.

        Back in the Bronze Age (1990s) I endeavored to speed up compilation in a manner that you describe ccache as doing. After the .h files were taken care of, the compiler would roll out to disk the state of the compiler. (It could also do this with individual .h files.) Then, instead of doing all the .h files again, it would just memory map in the precompiled .h file.

        And yes, it resulted in a dramatic improvement in compile times, as you describe.

        The downside was one had to be extremely careful about compiling the .h files the same way each time. One difference could affect the path through the .h files, and invalidate the precompiled version.

        It was quite a lot of careful work to make that work, and I expect ccache is also a complex piece of work.

        What I learned from that is it's easier to just fix the language so none of that is necessary. C/C++ can be so fixed, the proof is ImportC, a C compiler that can use imports instead of .h files, and can compile multiple .c files in one invocation and merge them into a single .o file.

        • SleepyMyroslav 10 days ago

          There is a reason why

          > unoptimized build is used, as that is the edit-compile-debug loop

          is no longer true.

          Modern C++ has a lot of metaprogramming abstractions in it, and they are only zero-cost in optimized builds.

          In my years of gamedev work I have not met a sizeable project that was working in unoptimized builds even for debug purposes. Unoptimized only worked in unit tests or small tools.

          • 59nadir 10 days ago

            I think at that point the real solution is to seriously consider all of the language constructs you use and their compile-time cost as well. It's not a given that using more of C++ is always better; real, sustainable improvement in compile times can be had by moving closer to C in many ways while keeping some of the safety C++ provides.

            (I'm sure you've been there, though; gamedev is one of the areas I would expect people to be more sensible about their C++ feature usage in.)

            • SleepyMyroslav 10 days ago

              If you are implying that we can go back to force inlining everything and only using small wrappers around memcpy then I will have to say that that ship has sailed years ago. I do not know anyone who wants to go back for more than brief moments while changed header causes cascade of rebuilds.

              Now the elephant in the room of build times that no one wants to talk about is 'the optimized' build with PGO+LTO. I think none of the projects I worked on that relied on it ever had a local pipeline to do it xD. But if you ask people if they want to ship a build without it the answer is a clear 'no'.

              I will totally understand if the authors of the linked article also don't like to talk about it. What I am trying to do here is to clear up confusion about the importance of it. Pretending that IWYU is more than polishing the last 5% of build times helps almost no one. YMMV ofc.

              • 59nadir 10 days ago

                There are plenty of constructs in C++ that are safer than C and still don't impact compile times that much and some that, while safer and better in some regards, are murder for compile times. I'm saying there is a tradeoff to be made and faster iteration speeds are oftentimes more valuable for end result quality than (often perceived) safety.

      • taylorius 10 days ago

        "Just like the old floating-point is slow; always use integer when possible."

        I know this was only an aside - but it took me the longest time to properly internalize that floats were fast these days. I'm still getting used to the idea that double precision isn't a preposterous extravagance. :D

        • Arech 10 days ago

          It depends on what you're doing. Doubles are still slower than floats because they put twice the pressure on memory bandwidth and cache size. So if you're doing "calculator"-style work, there's not much difference, but if you're processing large arrays of data, it's still something you should think about.

          • taylorius 10 days ago

            Yeah, exactly that. I was getting artifacts processing long vectors, and it was 32 bit float precision that was the culprit.

        • spacechild1 10 days ago

          On certain platforms this rule still holds. Recently I have been working with the ESP32. Although some versions have an FPU, floating point math is still slower than integer math. Also, the FPU can only process 32-bit floats, 64-bit doubles are emulated in software and terribly slow.

    • pajko 10 days ago

      Precompiled headers eliminate some of these issues, if used right. At one of my former companies every available build optimization had been bolted onto the code base over the years by the DevOps guys without any thought about how the pieces would work together, and the build just got slower and slower. In the end, the initial 5-minute complete build time had increased to about 35 minutes, of which about 10 minutes could be attributed to various refactors and an extremely high amount of templating; we couldn't find a cause for the extra 20-minute increase. It took about 15 minutes to test just a single change.

    • uecker 10 days ago

      This is the same for C and C++, yet C compile times are dramatically shorter than C++'s.

      Also doing semantic analysis during parsing should save time compared to having additional tree walking later (and does in my experience).

    • zerr 10 days ago

      Isn't `#pragma once` helpful for avoiding reparsing headers?

      • knome 10 days ago

        `#pragma once` is to prevent a header from being reparsed repeatedly for the same translation unit when ten different headers all include a common one transitively.

        It replaces the prior pattern of wrapping the header's contents in a check for a (hopefully) unique macro that is defined only inside that same block, to avoid double parsing:

            /* if it isn't unique, you're going to have a bad time */
            #ifndef SOME_HOPEFULLY_UNIQUE_DEFINITION
            #define SOME_HOPEFULLY_UNIQUE_DEFINITION
            
            ...code...
            
            #endif /* SOME_HOPEFULLY_UNIQUE_DEFINITION */
        
        This, however, doesn't stop those same headers from needing to be reread and reparsed and reread and reparsed for every single cpp file in the project.
  • kazinator 11 days ago

    In modern compilers, the scanning of tokens is a tiny fraction of the compile time.

    If the header files are properly protected with inclusion guards, then at worst the contents are tokenized by the preprocessor, and not seen by anything else. But for decades now, compilers have been smart enough to recognize include guards and not actually open the file. So that is to say, when the compiler has seen a file of this form once:

      #ifndef FOO
      #define FOO
      ...
      #endif
    
    and is asked to #include that same file again, it knows that this file is guarded by FOO. If FOO exists, it won't even open that file.

    There are reasons to avoid including .h files in .h files, but you're not going to get compile time gains out of it, other than perhaps through secondary effects.

    By secondary effects I mean this: when in a project you forbid .h files including other .h files, it hurts to add new dependencies into header files. When you change a header such that it needs some definition in another, you have to go into every .cpp file and add the #include. So, you find ways to reduce the dependencies to avoid doing that.

    When .h files are allowed to include others, the dependencies grow rampant. You want to compile a little banana, but it depends on the gorilla, which needs the whole jungle, just to be defined. Juggling the jungle definition takes a bit of time. C++ definitions are complicated, requiring lots of processing to develop. Not as much as optimizing the banana that is the actual subject being compiled to code, but not as little as skipping header files.

  • jay-barronville 11 days ago

    As much as I hate long compilation times, I also value code discoverability, readability, and consistency. Your (i.e., Rob Pike’s) strategy seems like a nightmare to me. I love Rob, but I can’t follow this rule.

  • highfrequency 11 days ago

    > We try to mitigate it with #ifdef guards, #pragma once etc. but in my experience those band-aids don’t solve the problem.

    Why don't include guards solve the problem of duplicate parsing of .h files within one compilation unit? I believe modern compilers can entirely optimize away even the opening and scanning of the file. And even without that, modern NVMe disks are so fast I would imagine the file opens would be negligible.

    Curious to hear if anyone has data on whether duplicate header file parsing is still an actual performance issue even with modern compilers, modern SSDs, and #pragma once.

  • greenavocado 11 days ago

    Only a handful of commenters have mentioned "#pragma once", which is alarmingly few people considering how many C++ practitioners are out there. "#pragma once" is the obvious way to deal with this problem.

    • flohofwoe 10 days ago

      That's a somewhat unrelated problem; it just protects against parsing the same header multiple times within one translation unit. The rule that headers should not include other headers is more of an organizational thing.

      You can immediately see the include complexity of each source file by looking at the top of the file: all required includes are there as a flat list. Otherwise a single #include may pull in dozens, hundreds or even thousands of other includes hidden in a deep dependency tree.

      It also nudges the programmer away from "every class in a tiny header/source pair" nonsense.

    • Arech 10 days ago

      `#pragma once` isn't standardized. Even though all major modern compilers do support it (hopefully, but not guaranteed in the same way), if you care about portability to some not-even-very-esoteric platform, it most likely uses an ancient compiler (or one based on an ancient version of a major compiler) that might not support it. But besides that, I too don't see a reason not to use the nicer `#pragma once` instead of include guards...

      • maccard 10 days ago

        Pragma once is a de facto standard and is supported by every compiler and platform you’re ever going to work with. If you need to support the targets it doesn’t work with, then you’ll get a loud compile error.

        The old include guards are error prone, requiring a unique key per header. Getting that wrong can cause anything from a compile error to an obtuse linker error to a runtime bug, depending on what’s in the header.

        The chances of someone copy pasting a header and breaking the include guard are significantly higher than the chances of someone deciding that we need to support an esoteric embedded target overnight.

        • rcxdude 10 days ago

          This. #pragma once is more consistently supported than many parts of the standard. Just use it. (And I've worked with some of those esoteric embedded targets. Haven't run into a compiler that didn't support it yet)

      • flohofwoe 10 days ago

        FWIW I've never seen a compiler that doesn't support '#pragma once' (at least since the late 90s).

        I did come across some esoteric path-related problems with pragma once though which wouldn't occur with include guards. IIRC it was related to filesystem links.

        • jstimpfle 9 days ago

          I've come across obscure bugs with header guards in open source software that went unnoticed for years -- essentially, there were competing definitions of a global config struct in the code base. Some parts of the code would access Version A, while others would access the same data but thinking it was version B.

          Most .c files had ended up including both versions, but the headers had the same guards so only the first included definition would be in effect.

          The project maintainer had been fixing obscure memory bugs for years until someone noticed the real issue.

          Wouldn't have happened with #pragma once I suppose. With pragma once you don't have to painfully choose an identifier for the header guard, and you can't get that wrong.

          • uecker 9 days ago

            Pragma once itself gets it wrong, if there are multiple paths to the same file which apparently happens quite a lot.

            • jstimpfle 8 days ago

              Never seen it happen. Seems like a myth: who uses hardlinks/symlinks in source projects? And isn't pragma once implemented based on inodes?

              Anyway, false negative identity tests are not a problem the way false positive identity tests are.

      • greenavocado 10 days ago

        Microsoft Visual C++ (MSVC): Supported since Visual C++ 6.0 (1998).

        GNU Compiler Collection (GCC): Supported since GCC 3.4 (2004).

        Clang (based on LLVM): Supported since Clang 2.9 (2009).

        Intel C++ Compiler: Supported since version 8.0 (2003).

  • johannes1234321 11 days ago

    > On another note: C++ compiler should by default keep statistics about the chain of #include's / parsing during compilation and dump it to a file at the end and also summarize how badly you're re-parsing the same .h files during build.

    Not exactly that, but do you know clang's -ftime-trace and tools like https://github.com/aras-p/ClangBuildAnalyzer which help analyzing where time is actually spent? (In small repeated headers I don't see much of a problem, but they of course may contain not so small things ...)
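
    For example (a rough sketch; exact file names depend on your build setup):

        clang++ -ftime-trace -c widget.cpp -o widget.o
        # also writes widget.json (a Chrome trace event file) next to the object;
        # ClangBuildAnalyzer aggregates those per-file traces into a build-wide report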

  • electroly 11 days ago

    If you use a "unity" build (all .cpp files concatenated into one compilation unit) with the normal '#pragma once' or guard in the headers, then you can be certain that every header file is parsed only once, with no discipline needed. CMake can do this for you with an option. You lose the ability to do incremental builds of individual changed files, but it may be fast enough that you don't care.

    For my own personal projects, I just use ccache and precompiled headers. It's good enough for me. I don't want to have to apply "hard-core discipline" to my projects.

  • eru 11 days ago

    Oh, the joys of using a language that thinks (automated) copy-and-paste coding is the same as a module system.

  • timvdalen 10 days ago

    Thanks for building SumatraPDF! I just switched away from Windows, but I've been a happy user for a very long time, and it's hard to get that mix of user experience just right. Really appreciate your work.

  • intelVISA 10 days ago

    The entire article's premise is misguided as slow C++ build times are usually a result of inexperienced SWEs unfamiliar with the language's "fun" build rules.

    • maccard 10 days ago

      This is the equivalent of saying C is safe to use, as long as you use it correctly, or your iPhone is fine you’re just holding it wrong.

      If build times are a wide spread problem with a language, that’s a language problem not a user problem.

      • exe34 10 days ago

        Knives are really difficult to use, everybody keeps holding the nice shiny metal part, but somehow you're meant to just know to hold the wooden part.

    • Arech 10 days ago

      In enterprise you are almost always under pressure to deliver something ASAP, so some bad practices are just a natural consequence of that. They would have liked to do it differently, but then they would only be halfway to delivering what they have already shipped. There are always tradeoffs.

  • nextaccountic 11 days ago

    Shouldn't precompiled headers mitigate that?

  • sfpotter 11 days ago

    How much of a speedup did you get switching over to Rob Pike style includes?

    • cpeterso 11 days ago

      I had to look this up:

      Rob Pike’s rule is that header files should not include other header files; they should only document which other header files they depend on so programmers can directly include those other header files in their .c files.

      https://bytes.com/topic/c/answers/217074-rob-pikes-simple-in...

      • fooker 10 days ago

        This seems like the kind of thing that a compiler should be good at instead of making the user jump through hoops.

        • techbrovanguard 10 days ago

          I'm glad Rob Pike didn't contribute to Go, that'd have been catastrophic to my productivity! Imagine having tools do work instead of users, it'd be like a world without lawyers. Anyhow, I've got to refactor my WIP code for the 10th time today since unused variables are compiler errors. Seeya!

      • iainmerrick 10 days ago

        Wow, I hadn't heard of that one. It sounds very unpleasant.

        I have the exact opposite rule -- it should always be OK to include a header file on its own. As a corollary, headers must include or forward-declare everything they need.

      • ErikCorry 10 days ago

        If the .h file is documenting which other header files it needs, what keeps that documentation up to date? This sounds like a very painful way to work, where .h files fail to compile (with very strange error messages because of the way C++ works) and then you have to hunt down the missing dependency.

        • flohofwoe 10 days ago

          In my header libraries I have checks like this which produce a meaningful error message:

              #if !defined(SOKOL_GFX_INCLUDED)
              #error "Please include sokol_gfx.h before sokol_shape.h"
              #endif
snypehype46 11 days ago

Coincidentally, in the project I'm currently working on I managed to reduce our compile times significantly (~35% faster) using ClangBuildAnalyzer [1]. The main two things that helped were precompiled headers and explicit template instantiations.

Unfortunately, the project still remains heavy to compile because of our use of Eigen throughout the entire codebase. The analysis with Clang's "-ftime-trace" shows that 75-80% of the compilation time is spent in the optimisation stage, but I'm not really sure what to do about that.

[1] https://github.com/aras-p/ClangBuildAnalyzer

  • celrod 11 days ago

    What's the trick with explicit template instantiations? Including them in the precompiled header?

    • logicchains 11 days ago

      You leave them uninstantiated in the header file and explicitly instantiate them in the .cc file (only works if there's a small number of possible instantiations).
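
      A rough sketch of the pattern (names made up; the `extern template` lines are what stop every includer from instantiating its own copy):

          // matrix.h
          template <typename T>
          struct Matrix {
              T data[16];
              T trace() const;   // declared here, defined only in matrix.cc
          };

          // Promise that these instantiations exist in exactly one .cc file,
          // so other translation units don't re-instantiate them.
          extern template struct Matrix<float>;
          extern template struct Matrix<double>;

          // matrix.cc
          #include "matrix.h"

          template <typename T>
          T Matrix<T>::trace() const {
              T sum = T{};
              for (int i = 0; i < 4; ++i) sum += data[i * 4 + i];
              return sum;
          }

          template struct Matrix<float>;    // the only instantiations in the program
          template struct Matrix<double>;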

petermcneeley 11 days ago

The video game industry uses bulk builds (master files), which group all the .cc files into a few very large single .cc files. The speedups here are at least 5-10x. These bulk files are sent to other developers' machines, possibly with caching. The result is 12-minute builds instead of 6 hours.

  • shanemhansen 11 days ago

    Chrome supported this for a long time and it really helped small developers outside of Google be able to build chrome without specialist machines.

    But that feature was pulled by the Chrome team, with the stated justification that since C++ guarantees different things (IIRC around variable scopes outside of a namespace) for one file vs. multiple files, supporting the jumbo build option meant writing some language that was "not C++".

    Unfortunate.

    • Shorel 7 days ago

      When "best" is the enemy of "good enough"

  • anybodyz 10 days ago

    Many of them use this tool: https://www.fastbuild.org/docs/home.html

    It was built and is regularly maintained by a Riot Games principal C++ architect, and it automatically compiles files in large "unity" chunks, distributes builds across all machines in an organization, and creates convenient Visual Studio .sln files and Xcode projects. It's also all command-line driven and open source.

    This is industrial strength C++ builds for very large rapidly changing code bases. It works.

  • celrod 11 days ago

    If using cmake, you can try a unity build. https://cmake.org/cmake/help/latest/prop_tgt/UNITY_BUILD.htm...

    You can also specify a `-DUNITY_BUILD_BATCH_SIZE` to control how many get grouped, so you can still get some parallelism. However, I think it'd be more natural to be able to specify number of batches (e.g. `nproc`) than their size.

    Code bases may need some updating to work.

  • WalterBright 11 days ago

    How it works with D is you can do separate compilation:

        dmd -c a.d
        dmd -c b.d
        dmd a.o b.o
    
    or do it all in one go:

        dmd a.d b.d
    
    Over time, the latter became the preferred method. With it, the compiler generates one large .o file for a.d and b.d, more or less creating a "pre-linked" object file. This also means lots of inlining opportunities are present without needing linker support for intermodule inlining.
  • Twirrim 11 days ago

    sqlite uses something like this approach too, and there are additional optimisation advantages from keeping everything in a single file:

    https://sqlite.org/amalgamation.html

        Over 100 separate source files are concatenated into a single large file of C-code named "sqlite3.c" and referred to as "the amalgamation". The amalgamation contains everything an application needs to embed SQLite.
        
        Combining all the code for SQLite into one big file makes SQLite easier to deploy — there is just one file to keep track of. And because all code is in a single translation unit, compilers can do better inter-procedure and inlining optimization resulting in machine code that is between 5% and 10% faster.
  • josephg 11 days ago

    It makes sense. As projects grow, the average header file is included O(n) times from O(n) different .cc files - leading to O(n^2) parsed header files during compilation. And thus, O(n^2) work for the compiler.

    Merging everything into one big .cc file reduces the compilation job back to an O(n) task, since each header only needs to be parsed once.

    It's stupid that any of this is necessary, but I suppose it's easier to hack around the problem than to fix it in the language.

    • WalterBright 11 days ago

      Those problems are fixable with C/C++, but nobody seems to want to do it. They are fixed with dlang's ImportC. You can do things like:

          dmd a.c b.c
      
      and it will compile and link the C files together. ImportC also supports modules (to solve the .h problems).

      It's all quite doable.

      • Gibbon1 10 days ago

        I wonder if you could create a pragma in C to do almost the same.

        I don't have a good name for it but it would force the compiler to ignore previous definitions. With an 'undo' pragma as well.

    • gpderetta 10 days ago

      Which compiler parses the same header file multiple times in the same translation unit? Compilers have been optimizing around pragma once and header guards for multiple decades.

      edit: ok, you meant that each header is included once in each translation unit.

      • josephg 10 days ago

        Yep. Worst case, every header is included in every translation unit. Assuming you have a similar proportion of code in your headers and source files, compilation time will land somewhere between O(n) and O(n^2) where n = the number of files. IME in large projects it's usually closer to n^2 than n.

        (Technically big-O notation specifically refers to worst case performance - but that's not how most people use the notation.)

  • forrestthewoods 11 days ago

    I'm starting to believe that one static/shared library should be produced by compiling exactly one cpp file. Go ahead and logically break your code into as many cpp files as you want. But there should then be a single cpp file that includes all other cpp files.

    The whole C++ build model is terrible and broken. Everyone knows n^2 algorithms are bad and yet here we are.

    • sfpotter 11 days ago

      Everyone: "O(n^2) algorithms are bad."

      Also everyone: "Just do the stupidest thing in the shortest amount of time possible. We'll fix it later."

  • Shorel 7 days ago

    Nice, that's like doing the link phase before the compilation.

andersa 10 days ago

It's frustrating to see the C++ committee spend year after year on pointless new over-engineered libraries instead of finally fixing the compile times. At a high level, with only one change to the language, we could entirely eliminate this problem!

Consider the following theoretically simple change:

A definition in a file may not affect headers included after it. If you want global configuration, define them at the project level, or in a header included by all files that need it.

i.e. we need to break this construct:

    #define MY_CONFIG 1
    #include "header_using_MY_CONFIG.h"

That's really all we need to do to completely eliminate the nonsense of constant re-parsing of headers and turn the build process into a task graph where each file is processed exactly once and each template is instantiated exactly once, with intermediate outputs that can be fully cached.
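
For illustration, the "project level" alternative could be as simple as a single shared config header (or a -D flag passed by the build system), so the included header means exactly the same thing in every translation unit and its output can be cached (hypothetical names):

    // project_config.h - the one place configuration macros are defined
    #ifndef PROJECT_CONFIG_H
    #define PROJECT_CONFIG_H
    #define MY_CONFIG 1
    #endif

    // header_using_MY_CONFIG.h then includes "project_config.h" itself,
    // instead of relying on whatever the current .cpp happened to #define first.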

Most real-world large projects already practice IWYU meaning they are already fully compatible with this.

There are some videos by Jonathan Blow explaining that this is exactly why the Jai compiler is so fast. Why must we still suffer these outdated design decisions from 50 years ago in C++? Why can't the tech evolve?

/end rant

  • James_K 10 days ago

    I used to think the same thing myself, but now I'm not sure that would solve much. The problem you describe is really just a matter of caching. I believe you should be able to process the tokens of a header to determine all the identifiers in it, at which point you can model the header compilation as a memoised function from the definition of those identifiers to the compiled file. So for your example, every include of the header where MY_CONFIG=1 could re-use the same results.

    The real issue is just that C++ compilers are horrendously slow. They have been designed with the intention of producing fast executables rather than compiling quickly. Think of Rust, which has a high degree of structure in its compilation process and so a high degree of optimisability, yet it still suffers slow compilation due to its usage of a C++ compiler backend.

    I think this really because C++ builds are fundamentally unstructured. Rather than invoking a compiler on the entire build directory and letting it handle each file, it is invoked once for every file in a way that might be non-trivial. Improving the build process almost always comes at the

    Beyond that, C++ developers simply do not care about slow compilations times. If they did, they wouldn't be using C++. It's my personal theory that C++ as a language has effectively self-selected a user-base that is immensely tolerant of this kind of thing by driving off anyone who isn't.

    • andersa 10 days ago

      > I think this really because C++ builds are fundamentally unstructured. Rather than invoking a compiler on the entire build directory and letting it handle each file, it is invoked once for every file in a way that might be non-trivial.

      True, this would also need to be fixed. Compilation would need to become a single process that can effectively manage concurrency and share data.

  • affgrff2 10 days ago

    Aren't there modules since c++20 which solve this problem?

    • andersa 10 days ago

      Modules are a massively over engineered "solution" to the problem that require significant refactoring to actually make use of them. Have you tried to properly use modules (i.e. create ones in your software, not just import std)? It's super clunky and still hardly usable.

      I doubt we'll see Unreal Engine get any benefit from that in a long time for example. It could be so much better, working fully automatically with almost all existing code so long as you use IWYU, which is already standard for large projects where this is needed the most.

diath 11 days ago

Ever since I tried -ftime-trace in Clang to improve build times in a project a while ago, I've been very conscious about using forward declarations wherever possible. However, I wish we had proper module support that actually worked well; having to keep this in mind whenever writing new code just so your project doesn't take forever to compile sucks. This shouldn't even be something we have to keep in mind in 2024.
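
For anyone unfamiliar with the technique, a rough sketch (names made up): as long as a type only appears as a pointer, reference, or function parameter/return type in the header, a forward declaration is enough and the heavy #include can move into the .cpp file:

    // scene.h - forward declarations instead of #include "mesh.h" / "camera.h"
    class Mesh;
    class Camera;

    class Scene {
    public:
        void add(Mesh* mesh);                  // pointer parameter: declaration suffices
        void render(const Camera& camera);     // reference parameter: declaration suffices
    private:
        Mesh* meshes_[64] = {};                // pointer members don't need the full type either
        int count_ = 0;
    };

    // scene.cpp - the heavy headers are paid for once, here
    #include "scene.h"
    #include "mesh.h"
    #include "camera.h"

    void Scene::add(Mesh* mesh) { meshes_[count_++] = mesh; }
    void Scene::render(const Camera& camera) { /* full definitions available here */ }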

Kelteseth 11 days ago

I recently created https://arewemodulesyet.org/ to track module adoption in the C++ ecosystem. I used how often a library's vcpkg manifest file changed to get a rough estimate of how popular it is.

  • lastgeniusua 10 days ago

    Thank you very much for this resource! Reading through the article and the discussion here I was really surprised why nobody discussed using actually existing modules, this clarifies how far that is from a solution.

    You should really post this as a separate Show HN story!

    "Estimated finish by the year 5134" made me chuckle

    • Kelteseth 10 days ago

      And we just improved the waiting time by about 1000 years by adding flux to the list of supported projects. But sarcasm aside, yes, this is a bit of criticism of a C++ ecosystem that moves a bit too slowly. Many libraries pride themselves on only using C++11/14.

nrclark 10 days ago

Back when I was doing C++ more often, I got a big speedup on my builds (and also smaller binaries) by using C-style forward declarations for everything that I possibly could, including class methods. Method definitions / template instantiations would be placed into a corresponding .cpp file.

I know that it's unpopular in a world of header-only libraries, but it really does make a dramatic difference to build-time. Especially if your code is heavy on templates.

LeSaucy 11 days ago

I use C++ daily and find that ccache and an M1/M2/M3 CPU go a very long way toward reducing build times.

  • ErikCorry 10 days ago

    Ccache is great but then change one central .h file and you are back to square one.

    • gpderetta 10 days ago

      Ccache + distcc (or similar caching/distributed build solutions) work very well.

  • MathMonkeyMan 11 days ago

    Not sure why this was downvoted. It's true that ccache and build parallelization (e.g. icecream) can grease the wheels enough that builds are no longer a dev cycle bottleneck.

    What the article is about, though, is changing the source code so that it is intrinsically faster to compile. At some point you say "this program isn't complicated, why does it take so long to compile?" Then you start looking at unnecessary includes, transitive includes, forward declarations, excessive inlining, etc.

    • rjkaplan 11 days ago

      I'm guessing the comment was downvoted because the suggestions are mentioned in the first paragraph of the article...

      > After trying a few stopgap solutions—like purchasing M1 Maxs for our team—build times gradually reverted to their original pace; Ccache and remote caching weren’t enough either.

      • gpderetta 10 days ago

        Parallelization is not a stopgap solution. It's the only scalable one, as C++ projects easily grow to multimillion lines of code. And with distcc (or similar) you do not need to buy your developers beefy workstations (although you should!).

  • scheme271 11 days ago

    Yeah, I got 15 minute build times down to under 30s using ccache. Doesn't help with cold rebuilds but once you have a cache, it really does help things significantly.

pietroppeter 10 days ago

> It is written purely in Python and does not use Clang, which makes it fast to run—usually in just a couple of seconds.

Oh, the irony

  • intelVISA 10 days ago

    Indeed, this article is supposed to promote Figma's engineering team for 'solving' a solved problem; reading it does quite the opposite.

chipdart 12 days ago

This blog entry is highly disappointing. The Figma blog post reads as if they reinvented the wheel, with basic information that is not only widely known and understood but also featured in books published decades ago.

The blog post authors would do well if they got up to speed on the basics of working with C++ projects. Books such as "Large scale C++ vol1" by John Lakos already cover this and much more.

  • rileymat2 11 days ago

    It could be my bias but it seems a lot of inexperienced developers no longer read comprehensive books on topics but survive on Google, stack overflow, some documentation with examples/simple tutorials, blog posts and now Gpts.

    All are useful tools but they are very poor in eliminating unknown unknowns like a book would.

    • bdowling 11 days ago

      You think only inexperienced developers have stopped reading books?

      • rileymat2 11 days ago

        That's fair, I bet it is more widespread, but people starting in the last 15-20 years did not even have the initial introduction. I read a lot, it surprised me to find out that it was atypical in an industry that is supposed to be somewhat about leveraging brainpower. May as well stand on the shoulders of giants.

jupp0r 11 days ago

Bazel + remote workers yields a great user experience with small infrastructure footprint per developer, but requires quite a bit of work to initially set up. You get reproducible builds, caching of test results and blazingly fast CI as a side effect.

  • jack_pp 11 days ago

    when you use tensorflow from python you're basically using c++

  • bburnett44 11 days ago

    lol I initially read this as remote people workers and was super confused

tinganho 10 days ago

One thing that crossed my mind when I was looking at the TypeScript compiler was that it parsed header "d.ts" files even though they weren't really used in the source, which only had some references to a type in a small main function.

IIRC, I think this is how most compilers do it. The downside is that transitive deps can easily explode. Thus, compiling a super small main function can take seconds.

I did suggest a solution: just lazily parse/check symbols when they are encountered in the source, instead of having to parse all the transitive header files of the file that defines a type as soon as you include it.

wifijammer 11 days ago

I really hate the repetition from separating out header and definition files so I've been writing my whole codebase headers only.

I feel like this kills my compile time but I'm not sure how to fix it. Precompiled headers?

  • flohofwoe 11 days ago

    If you have all your application code in headers you'll get the fastest build times (for full rebuilds at least) by including all headers into a single main.cpp file and build just that, since that way there is no redundant code for the compiler to build at all.

    Of course the downside is that every tiny code change triggers a full rebuild then, but it's quite likely that the most time is spent in the linker anyway, so maybe worth a try.

    • runevault 11 days ago

      I think I've heard this called a Unity build where there's a precompile step that just dumps everything into a single file and compiles that so it doesn't have to re-include everything at different compilation units (when I first heard the term I got confused because it was in a game dev context but had nothing to do with the Unity engine lol).

    • asvitkine 11 days ago

      Except then there's no parallelism since you're only building one file. Ideally you'd split it into N files to take advantage of multiple cores, but then you have to decide how to split it...

      • flohofwoe 10 days ago

        Right. My rule of thumb would be "one implementation file per system", or what would be called a "module" in other languages. So that a moderately complex code base ends up with about a couple dozen source files to build.

        And each system should only have a single 'public interface header' to keep the number of cross-system include dependencies low.

  • pjmlp 10 days ago

    C++20 modules.

    Currently mostly usable on the latest VC++ and Clang 17, with Clang 18 bringing in support for C++23 import std (VC++ already does it).

    Sadly GCC is still far behind, not to mention all the other ones still catching up with C++17.

  • ranger_danger 11 days ago

    ccache is one solution, or a script/IDE plugin that will create both the header/definition from a signature you provide?

stockhorn 10 days ago

I've also tried to optimize C++ compile times on large projects a few times. I never got IWYU working properly and I always hated the fact that I still have to care about header files at all. Then I switched to doing Rust full time, which made all the fiddling with header files obsolete. This felt amazing. But now I'm facing the same problem, slow compile times :). Only this time I have to rely on the compiler team to make the improvements and I can't do much on my side AFAIK.

  • Nereuxofficial 10 days ago

    Well that's not quite true. You can do a few things:

    1. Reduce dependencies and features of dependencies

    2. Use a faster linker like mold

    3. Use a faster compiler backend like cranelift (if possible)

    4. Use the parallel compiler frontend (again, if using nightly is possible)

    5. Use sccache to cache dependencies

    But I do get what you mean. Especially in CI the build times are often long.

    • fanf2 10 days ago

      Split up crates so your compilation units are smaller.

Scubabear68 11 days ago

I find it hard to believe that this post indicates that C++ build times are proportional to included bytes, period.

I haven’t used C++ in quite a while, but aren’t templates a big part of this issue?

  • flohofwoe 11 days ago

    It becomes believable when you consider that your own code is just a very tiny appendix dangling off the end of a massive chunk of included data. For instance just including <vector> results in a 24kloc compilation unit of gnarly template code in Clang with C++23:

    https://www.godbolt.org/z/G18WGdET5

    ...add <string> and <algorithm> and you're at 45kloc:

    https://www.godbolt.org/z/Whv73YPYh

    ...and those numbers have been growing steadily by a couple thousand lines in each new C++ version.

    Multiply this by a few thousand source files (not atypical with the old 'clean code' rule to prefer small source files, e.g. one file per class), and that's already dozens to hundreds of millions of lines of code the compiler needs to process on a full rebuild, all spent on compiling <vector> over and over again.

    TL;DR: the most effective way to improve build times in C++ is to split your project into few big source files instead of many small files (either manually, e.g. one big source file per 'system', or let the build system take care of it via 'unity' or 'jumbo' builds).

    • gpderetta 10 days ago

      From a quick test, including every single header in the standard C and C++ library is 180k loc in an otherwise empty .cc file and compiling it with g++13 in C++23 mode takes 1.6s on my machine. Not amazing, but not terrible either.

      I also don't believe in tiny instantiation units, but when compiling real code, parsing the headers themselves is not necessarily the bottleneck.

    • mgaunard 11 days ago

      Except that doesn't improve the time of iterative builds, which are the only ones that really matter to software development.

      • flohofwoe 11 days ago

        It does though for header changes, which then may trigger fewer source file compilations. IME in incremental builds the most time is spent in the linker anyway.

        > which are the only ones that really matter to software development.

        Debatable in this age of cloud CI builds ;)

        • josephg 11 days ago

          > IME in incremental builds the most time is spent in the linker anyway.

          A lot of time is spent in the linker because the linker needs to parse and deduplicate any monomorphized C++ classes (like vector). This takes time proportional to the number of compiled copies of the class / function that are kicking around.

          So I'd expect linking times to also decrease if you're compiling fewer, larger source files.

        • mgaunard 10 days ago

          Splitting everything in small files makes it so that very little needs to recompile when you change something.

          Linking is pretty much just I/O-bound unless you're using LTO. This is assuming you're using a modern linker like mold.

          • flohofwoe 10 days ago

            You also need to take into consideration, though, that there are no "small" source files anymore as soon as you include anything from the C++ stdlib. Each tiny source file ends up anywhere between 20 and 100kloc after includes.

            Also, many C++ projects I've seen indirectly include almost anything into anything under the hood, so a header change on one end of the project may trigger a rebuild of seemingly unrelated source files.

            • mgaunard 10 days ago

              They're small in the sense that their dependency on the rest of your codebase is small.

      • SleepyMyroslav 10 days ago

        The typical solution is to automatically exclude modified files from the unity build. So when you edit a single .cpp you build two translation units the first time and only one afterwards if you keep iterating on it.

        Such a solution introduces more funny failure modes into the build though. People get quite irritated when a no-change edit to a single file breaks the build :)

gosub100 11 days ago

I just checked and distcc [1] is still a thing. It's been around a while now, I remember messing with it in about 2010. It allows you to parallelize your builds across multiple machines. Not as relevant now with todays multicore CPUs, but if you or your employer can't afford to buy you a work station, distcc might be your answer.

[1] https://www.distcc.org/

  • gpderetta 10 days ago

    Even if you have a beefy workstation, if your project has thousands of translation units (i.e. any moderately large C++ program), distcc (and ccache) still help significantly.

    Unfortunately linking becomes the bottleneck.

  • curiousgal 10 days ago

    We use Incredibuild at work.

jsbus 10 days ago

I was thinking about applying at Figma a couple of days ago. Seeing the engineering culture portrayed in this post, I'm not any more.

ErikCorry 10 days ago

My tip: Increase the size of .cc files. Since each .cc file is including tens of thousands of .h lines, you should not allow developers to check in < 100 line .cc files.

The OP was seeing build times increase faster than loc. Probably someone on the team likes small .cc files.

  • choppaface 10 days ago

    But translation units / small .cc files can be built in parallel and cached, so with multi-core machines it's desirable to have many small translation units. Except of course when there's eventually one large translation unit that needs everything and then link time dominates ...

    The article emphasizes a common issue about headers. Xcode and Visual Studio work around this to some extent with precompiled headers, something that can be really hard to set up in ccache. If Figma's whole team is using Macs (they mention getting everybody MacBooks?) then I wonder if they could just switch to Xcode and use built-in PCH support. While that introduces a dependency on Xcode :( maybe their whole C++ stack will get effectively rewritten in the next couple of years anyway?

    • ErikCorry 10 days ago

      OK, when you only have as many .cc files as you have CPU cores, you should stop making them bigger. This is not an issue for big projects.

      The caching of compilation fails if you touch a central .h file. When you work on projects like that you start dreading a change to the .h files because development slows to a crawl as the compile times explode.

      I worked on V8 and over time the .cc files got smaller and the build times got much worse. Some people felt this was neater, but if you are not in an office with 20 beefy workstations using distcc the effect is brutal.

threesmegiste 10 days ago

Whenever I try to learn C++, these build/compile issues put me off. I really don't understand how such an important language could still have tooling problems, especially for beginners. I don't wanna learn a language in a browser, and I don't wanna download Visual Studio. Is there a way, or a lightweight complete solution or app, to resolve this issue for beginners? I really wanna learn this language to understand the computer. I don't love Rust or Carbon or Go or Zig or any other language called a systems language. I wanna learn C++. I know BASIC and Pascal; yes, you guessed right, I am old.

taylodl 10 days ago

Back when I was doing a lot of C++ development, precompiled headers had become the solution to this problem. If memory serves me correctly, I think it was Symantec who first implemented this feature. It dramatically sped up compilation times on processors that were only running at 33 MHz - or slower!

Though I was an early adopter of STL at the time, it still hadn't enjoyed widespread use yet. Are templates now the problem with pre-compiled headers? If so, then that should be the problem we tackle.
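
For reference, the mechanism still exists and is one command away in GCC; a rough sketch (the precompiled header is only picked up when the compile flags match):

    g++ -x c++-header -std=c++17 -O0 common.h    # emits common.h.gch
    g++ -std=c++17 -O0 -c foo.cpp                # '#include "common.h"' now loads the .gch
                                                 # instead of reparsing the header text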

delta_p_delta_x 11 days ago

A big improvement in compile times with modern C++ can be had with C++20 standard modules.

This requires extremely bleeding-edge toolchains, though: VS 2022 17.10, Clang 18, and GCC 14.

DylanSp 10 days ago

I'm curious how bad the build times were; minutes, tens of minutes, hours? I didn't see any absolute times in the post, just percentages.

saidinesh5 10 days ago

Funnily enough, this topic was all they asked about in one of the job interviews I had when I was younger: they shared a bunch of source files and asked me where/what I could do to improve the compilation speeds.

I didn't realize how big the impact of these little changes were (only include what you use, forward declare as much as you can, PIMPL etc..) until I worked on a large codebase in that company.

mgaunard 11 days ago

DIWYDU sounds like a better tool than IWYU.

feverzsj 11 days ago

On the contrary, I prefer putting everything in header files, and including the related headers in the same source file just for compilation. It's basically a unity build but without its drawbacks.

setheron 11 days ago

Isn't this what header guards or #pragma once are for?

rurban 9 days ago

The best thing would be to convert to C and use tcc. Other options would be dlang or go.

Cfront with tcc would sound nice enough to try

ramon156 10 days ago

The longest acronym I know is WYSIWYG (What You See Is What You Get)

  • 8372049 10 days ago

    TIMTOWTDI (There Is More Than One Way To Do It)

chrisjj 10 days ago

> the perennial issue of slow build times

No, that's long build times.

anthk 10 days ago

Use ccache and stop reinventing the wheel.

teunispeters 11 days ago

(silly answer) Use Visual C++. I've been coding across macOS, Linux, and Windows 11, with roughly comparable machines in a lot of ways (all Intel). Visual C++, while awkward for compatibility, is very very fast for builds, and not bad as an IDE for rapid fixes.

More serious - I moved to CMake presets and with that came a lot of cache optimization - including parallel builds. MacOS is now almost as fast as Windows for build, and Linux/gcc not far behind. Windows C++ seems to have the lowest modern feature compatibility, followed by MacOS/Clang, with Linux/recent GCC being the most complex. A lot of the newer features seem to add a lot to the build time..

... mind I've been working with C++ only for the last few months, and C for many years before, so consider it a beginner post in a lot of ways. Still, it was interesting to explore, and I'll be continuing to explore - I haven't yet enabled ccache for instance which I suspect will improve a lot.

  • jupp0r 11 days ago

    My experience is the exact opposite. Moving a multimillion line C++ code base from msbuild to CMake/ninja on Windows cut the build time in half.

    Chrome got even better speedups I believe by building with clang/ninja on Windows.

    Bazel is where the real benefits lie by reusing other people's (or CI machine's) partial build artifacts via a centralized cache and by avoiding to run tests that are not affected by code changes.

    • spacemanspiff01 11 days ago

      How does bazel work with cmake builds?

      • tambre 10 days ago

        Seems it has the necessary integration points to run CMake builds as an external command. The same way you could build Make, Autotools, Meson or Bazel projects from CMake with the necessary external command plumbings.

        Obviously both fill the same purpose of being a build system, though Bazel is also a build executor, not just a generator. Integration would mean either adding BUILD language support to CMake or vice versa, but you wouldn't get the particular benefits of either this way.

  • gpderetta 10 days ago

    Haven't used MSVC in a long while, but at the last place where we were doing multi-compiler builds, MSVC was always slower by far.

juunpp 11 days ago

Just another self-promotion blog post with near zero information density. "Behold, the vastness of our vanity, regurgitating old news like we just discovered something new." Do these posts really help with hiring?

CLion also highlights unused includes, nothing new here. Use a good IDE. A networked ccache also does wonders if your org allows it.

Slow builds otherwise stem from a combination of: a) lack of proper modules in C++ (until recently) and b) unidiomatic or just terrible code bases. To help with the latter, hide physical implementations (PIMPL for class state, forward declarations for imports), avoid OOP-style C++ above all, minimize use of templates, design sound and minimal modules. No rocket science.

  • MathMonkeyMan 11 days ago

    Three times I've joined a team that has a substantial C++ codebase, and three times I've been tempted to use libclang based tooling to automate changes, or at least to identify patterns that could be changed.

    This article, while not the nerdy deep dive I'd like, does touch on what happens when you try to do that. You realize that the C++ standard library is really complicated, that your existing code is really fucked up, and that libclang is too limited a tool. You end up writing an XSLT engine in hacked-up Python, but by a different name.

    [LibTooling][1] is probably The Right Thing ("in C++", as the article says), but I never spent the time to get it working.

    Somebody write a DSL for C++ inspection and transformations that uses LibTooling as a backend. I bet there are many, but none close at hand.

    edit: [this][2] is close...

    [1]: https://clang.llvm.org/docs/LibTooling.html

    [2]: https://clang.llvm.org/docs/LibASTMatchersTutorial.html#inte...
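
    For the curious, a rough sketch of the LibASTMatchers approach from [2], assuming a reasonably recent LLVM/Clang (the CommonOptionsParser API has changed across releases). The tool name and matcher here are made up; it just lists class definitions in the main file, as a stand-in for whatever pattern you actually want to find:

      // match_classes.cpp -- a minimal ClangTool sketch, not production code
      #include "clang/ASTMatchers/ASTMatchFinder.h"
      #include "clang/ASTMatchers/ASTMatchers.h"
      #include "clang/Tooling/CommonOptionsParser.h"
      #include "clang/Tooling/Tooling.h"
      #include "llvm/Support/CommandLine.h"
      #include "llvm/Support/raw_ostream.h"

      using namespace clang;
      using namespace clang::ast_matchers;
      using namespace clang::tooling;

      static llvm::cl::OptionCategory ToolCategory("match-classes options");

      // Match every class/struct *definition* in the main file (skip headers).
      DeclarationMatcher ClassMatcher =
          cxxRecordDecl(isDefinition(), isExpansionInMainFile()).bind("class");

      class ClassPrinter : public MatchFinder::MatchCallback {
      public:
        void run(const MatchFinder::MatchResult &Result) override {
          if (const auto *RD = Result.Nodes.getNodeAs<CXXRecordDecl>("class"))
            llvm::outs() << RD->getQualifiedNameAsString() << "\n";
        }
      };

      int main(int argc, const char **argv) {
        auto Options = CommonOptionsParser::create(argc, argv, ToolCategory);
        if (!Options) {
          llvm::errs() << Options.takeError();
          return 1;
        }
        ClangTool Tool(Options->getCompilations(), Options->getSourcePathList());

        ClassPrinter Printer;
        MatchFinder Finder;
        Finder.addMatcher(ClassMatcher, &Printer);
        return Tool.run(newFrontendActionFactory(&Finder).get());
      }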

  • stefan_ 11 days ago

    Is there anyone else who gets unreasonably angry at stuff like PIMPL? It is truly the most braindead, senseless activity in this world. There was a comment in one of the many Rust threads that called C++ a respectable language now, but then things like PIMPL snap you right back into the wasteland it is.

    • wakawaka28 11 days ago

      PIMPL is an elegant solution to multiple problems. Idk what you could possibly have against it besides the extra work involved. I don't think any language has solved the fundamental problem of hiding details better than PIMPL does.

      • bananaboy 11 days ago

        I really like C#'s `partial` keyword as a solution to the problem of hiding implementation details. It lets you declare a class over several files, so you can have one file which is only the public interface, and another which has private implementation.

        • wakawaka28 11 days ago

          That is essentially the same idea as PIMPL. You put the private parts of the class (e.g., the data layout) in some file that is held privately. I guess you could argue that there is extra syntax involved with PIMPL because C++ is more low-level than C#, but it's not so bad. The actual implementation of a class can be spread over as many files as you want in C++.

          • bananaboy 10 days ago

            Yes, but pimpl is really just a hack and a workaround for the fact that you can't separate the public interface from the implementation details in C++, since everything has to go in the class definition. imho `partial` is superior, as you don't need an additional allocation and indirection.

            • nrclark 10 days ago

              You actually don't need to do that at all. It's common style in C++, but the language does not require it.

              With the right techniques, you can absolutely forward-declare basically all of a class's functionality. Then you can put it into its own translation unit.

              Members and function signatures have to be declared in the header, but details about member values/initialization and function implementations can absolutely be placed in a single translation unit.
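
              A minimal sketch of that split (Counter is a made-up name): members and signatures are declared in the header, while every initializer and function body lives in one .cpp.

                // counter.h -- declarations only; layout is visible, logic is not
                #pragma once

                class Counter {
                public:
                    Counter();
                    void add(int delta);
                    int total() const;
                private:
                    int total_;
                };

                // counter.cpp -- the single translation unit with the implementations
                #include "counter.h"

                Counter::Counter() : total_(0) {}
                void Counter::add(int delta) { total_ += delta; }
                int Counter::total() const { return total_; }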

              • wakawaka28 10 days ago

                The data layout of a class must be part of its definition. So either you expose the layout (data members), add a layer of indirection with PIMPL, or resort to ugly hacks to otherwise hide the data layout such as having buffers as members. Another possibility is to not use exposed classes for member functions. Then you can just pass pointers around only and never use C++ features. Out of all of these, PIMPL solves the problem the best.

                • nrclark 10 days ago

                  Yes, that's true. But if the concern is build-times, exposing the data layout is harmless. We don't necessarily need full PIMPL just to get improved build times. By keeping the data layout in the .hpp, you can guarantee that your class can still stack-allocate.

                  • iainmerrick 9 days ago

                    > if the concern is build-times, exposing the data layout is harmless

                    Not at all! If your private member variables use any interesting types (and why shouldn't they?) you need to include all the headers for those types too. That's exactly why you get an explosion of header inclusion.

                  • wakawaka28 9 days ago

                    If you change the data layout of your exposed class, you must recompile anything that uses it. That increases build times and also breaks ABI. And as the other guy commented, the data itself has types that also need definitions and file inclusions. Without PIMPL, your data layout can change without touching your header, due to changes in a related header (even a 3rd party header).

            • Maxatar 10 days ago

              You don't need any indirection with partial because in C# all classes already go through a layer of indirection.

              However, consider that in C# if you add a field to a struct, which is a value type and hence no indirection, then you do need to recompile all assemblies that make use of that struct. It's no different than C++ in this regard.

          • comex 11 days ago

            And yet C, an even lower-level language, achieves the same effect without the duplication of PIMPL. You just forward-declare a struct, and declare functions that accept pointers to it: the header doesn't need to contain the struct fields, and you don't need to define any wrapper functions. Technically you can do the same in C++. But in C++ to make an idiomatic API you need methods instead of free functions, and you can't declare methods on forward-declared classes. Why not? Well, I can imagine some reasons… but they have more to do with C++'s idiosyncrasies than any fundamental limitation of a low-level language.

            The C++ committee could address this, but instead they seem to want to pretend separate compilation doesn't exist. (Why are there no official headers to forward-declare STL types, except for whatever happens to be in <iosfwd>?) Then they complain about how annoying it is to preserve ABI stability for the standard library, blaming the very concept of a stable ABI [1] [2], all while there are simple language tweaks that could make it infinitely more tractable! But now I'm ranting.

            [1] https://cor3ntin.github.io/posts/abi/

            [2] https://thephd.dev/binary-banshees-digital-demons-abi-c-c++-...
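
            For what it's worth, the C pattern does translate to C++ almost verbatim if you settle for free functions instead of methods; a sketch with made-up names (Database, database_open):

              // database.h -- the type stays opaque to users: no fields, no wrappers
              #pragma once

              class Database;                     // forward declaration only

              Database* database_open(const char* path);
              void      database_close(Database* db);

              // database.cpp -- the only file that ever sees the fields
              #include "database.h"
              #include <string>

              class Database {
              public:
                  std::string path;               // layout can change without touching database.h
              };

              Database* database_open(const char* path) { return new Database{path}; }
              void      database_close(Database* db)    { delete db; }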

            • uecker 10 days ago

              It is indeed one of the most grotesque language design errors C++ made early on.

            • wakawaka28 10 days ago

              First of all, comparing C to C++ in this way is silly, because C++ is a very different language. But there are some similarities.

              > You just forward-declare a struct, and declare functions that accept pointers to it: the header doesn't need to contain the struct fields, and you don't need to define any wrapper functions.

              Those functions would be more verbose because they must contain an explicit `this` equivalent pointer. This would have to be repeated at every single call site. So it's not really helping.

              You don't need wrapper functions for PIMPL. You can have them if you think it's worthwhile, of course.

              >Technically you can do the same in C++. But in C++ to make an idiomatic API you need methods instead of free functions, and you can't declare methods on forward-declared classes. Why not?

              There are good technical reasons why you can't tack member functions onto the interface of a forward-declared class. There would be nowhere for that information to go, if nothing else. I think I heard a talk about adding new metaprogramming features to C++ that might address this in C++26 or so, but anyway it's not a significant problem to simply work around.

              I think you can probably make some template-based thing that would automate implementing the wrappers for you. But it would be a convoluted solution to what I consider a non-problem.

              >The C++ committee could address this, but instead they seem to want to pretend separate compilation doesn't exist. (Why are there no official headers to forward-declare STL types, except for whatever happens to be in <iosfwd>?)

              Most of the STL types that people need are based on templates. It does not make sense to forward-declare those. I just don't see a use case for forward-declaring much besides io stuff and maybe strings.

              >Then they complain about how annoying it is to preserve ABI stability for the standard library, blaming the very concept of a stable ABI [1] [2], all while there are simple language tweaks that could make it infinitely more tractable! But now I'm ranting.

              There seems to be a faction of the C++ committee that does not share the traditional commitment to backward compatibility. They have gone so far as to lobby for a rolling-release language, which is guaranteed to be a disaster if implemented. I think wanting to break ABI might be a sign of that. Let's hope they use good judgement and don't turn the language into an ever-shifting code-rot generator.

              Keep in mind, there may be ABI breakage coming from your library provider anyway, on top of what the committee wants. So it's not necessarily a cataclysmic surprise; it's something you're supposed to plan around anyway. ABI stability between language standards is mostly a concern for people who link code built with different C++ standards (probably a lot of code). It wouldn't be the end of the world if you had to recompile old code with a newer standard, but it might generate significant work.

              • gpderetta 10 days ago

                > There are good technical reasons why you can't tack member functions onto the interface of a forward-declared class.

                Are there? You could have a class decorator to mark a definition as incomplete and only allow member functions, types and constant definitions:

                  // in Foo.h
                  incomplete struct Foo {
                  public:
                     Foo();
                     Foo(const Foo&);
                     void frobnicate();
                  };

                  // in foo.cc

                  struct Foo { // redefinition of incomplete structs is allowed
                  public:
                     Foo(){...}
                     Foo(const Foo&) {...}
                     void frobnicate(){...}
                  private:
                     void bar() {...}
                     int baz;
                     std::string bax;
                  };
                
                edit: there are also very good reasons to fwd declare templates. You might want to add support in your interface for an std template without imposing it on all users of your header. In most companies I have worked at, we had technically illegal fwd headers for standard templates.

                • wakawaka28 10 days ago

                  >Are there? You could have a class decorator to mark a definition as incomplete and only allow member functions, types and constant definitions:

                  C++ already has a way to do this via inheritance, even multiple inheritance. I suppose some of the same machinery used for inheritance could be repurposed for partial class definitions but it is unnecessary.

                  Edit: I think I overlooked something here at first glance. Yes it might be nice to have a public partial definition of a class and a private full definition. But the technical reason you can't have this is that using a class in C++ requires knowing its memory characteristics. If that information does not come from the code, then it must come from somewhere else like a binary. Maybe the partial definition could be shorthand for "use PIMPL" but I haven't thought through all the ways it could go wrong, such as with inheritance.

                  >edit: there are also very good reasons to fwd declare templates. You might want to add support in your interface for an std template without imposing it to all users of your header. In most companies I have worked, we had technically illegal fwd headers for standard templates.

                  I have never seen illegal forward declaration headers. Not at any company I've worked at, nor in any open-source project. I don't think there is a reasonable value proposition to doing that. What kind of speedup are you expecting from that?

                  >You might want to add support in your interface for an std template without imposing it to all users of your header.

                  This sounds good in theory, but in practice most interfaces I've seen use the same handful of types or std headers, so it can't be avoided; furthermore, you'd be forcing everyone to bring their own std headers every time (and they'd probably forget why they ever included them in the first place). That's a lot of trouble to maybe save one simple include, and it introduces a lot of potential for unused and noisy includes elsewhere.

                  • gpderetta 10 days ago

                    Yes, that's why I used the 'incomplete' keyword. Of course you can only pass around pointers and references to incomplete classes (although there might be ways around that).

                    Base classes almost work, but you either need all your functions to be virtual or you need to resort to nasty casts in your member functions.

                    re std fwds, typically the forwarding is needed when specializing traits while metaprogramming.

      • kevin_thibedeau 11 days ago

        Ada had all of C++'s problems figured out in 1983. PIMPL as a means of boosting compiler performance is fundamentally braindead. We shouldn't be bending over backwards with broken tools to make them sort of work.

        • wakawaka28 11 days ago

          PIMPL doesn't only boost compiler performance. It provides code-hiding and ABI stability for everyone using it effectively. It's like killing 3 birds with one stone. PIMPL for sure isn't gonna be the thing to convince me that C++ is broken.

          Ada has piqued my curiosity before but I think if it was as good as you make it sound, it might have at least 1% market share after 40 years. It doesn't. I can't justify the time investment to learn it unless I get a job that demands it.

          • iainmerrick 9 days ago

            It's not PIMPL per se that's the problem, it's that C++ needs it but makes it very awkward to write. It feels like the language is fighting against you rather than setting you up for success. At least that's been my angle in this discussion.

        • pjmlp 10 days ago

          Unfortunately, "bending over backwards with broken tools" is exactly what made C++ a success in the first place, an idea also adopted by Objective-C and TypeScript.

          Trying to make the best out of an improved language while not touching the broken tools of the existing kingdom they were trying to build upon.

          Naturally such a decision cuts both ways: it helps gain adoption, and it becomes a huge weight to carry around when backwards compatibility matters for staying relevant.

    • Maxatar 11 days ago

      No, maybe the acronym is ugly (it sounds like "pimple"?), but other than that the technique is invaluable for writing stable ABIs, which in turn makes distributing C++ libraries a lot easier.

    • iainmerrick 10 days ago

      Yes! Why should I need to do extra work, messing up runtime performance by adding a pointer indirection, just to improve compilation time a bit?

      In C, idiomatic C, you can forward-declare a struct and the functions that operate on it, and you don't need any indirection at runtime. C++ has plenty of nice features, and in general I'd reach for it rather than C, but for some reason it can't do that!

      • Maxatar 10 days ago

        This doesn't make much sense.

        Sure you can forward declare a struct and functions that operate on it, but you can't call the function or instantiate the struct without the definition. That's no different than in C++.

        The purpose of PIMPL is that you can actually call the function with a complete instantiation of the struct in such a way that changes to the struct do not require a rebuild of anything that uses the struct.

        It's not about just declaring things, it's about actually being able to use them.
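
        To make that concrete, a minimal PIMPL sketch (Document/Impl are made-up names): document.h stays byte-for-byte identical while the fields behind Impl change, so code that uses Document doesn't need to recompile.

          // document.h -- stable interface; users recompile only if this file changes
          #pragma once
          #include <memory>
          #include <string>

          class Document {
          public:
              Document();
              ~Document();                    // defined in the .cpp, where Impl is complete
              std::string title() const;
          private:
              struct Impl;                    // all data members live behind this
              std::unique_ptr<Impl> impl_;
          };

          // document.cpp -- fields can be added or removed freely here
          #include "document.h"

          struct Document::Impl {
              std::string title = "untitled";
          };

          Document::Document() : impl_(std::make_unique<Impl>()) {}
          Document::~Document() = default;    // ~unique_ptr<Impl> needs the complete Impl
          std::string Document::title() const { return impl_->title; }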

        • iainmerrick 10 days ago

          I'm thinking of this kind of declaration in C:

            typedef struct foo Foo;
          
            extern Foo* foo_create();
          
            extern const char* foo_getStringOrSomething(Foo* foo);
          
          That's a fully opaque type, and it's reasonably efficient. The one thing you can't do is store a Foo on the stack, because you don't know its internal size and layout. So it's always a heap pointer, but there's only one level of indirection.

          In idiomatic C++ I think you'd have something like:

            class Foo {

              struct impl;

              std::unique_ptr<impl> _impl;

            public:

              std::string getStringOrSomething();

            };
          
          If I have a pointer to a Foo, that's two pointer indirections to get to the opaque _impl. So, okay, I can store my Foo on the stack and then I'm back to one pointer. But if I want shared access from multiple places, I use shared_ptr<Foo>, and then I'm back to two indirections again.

          The idiomatic C++ way to avoid those indirections is to declare the implementation inline, but make it private so it's still opaque. But then you get the exploding compile times that people are complaining about in this thread.

          The C approach is a nice balance between convenience, compilation speed and runtime performance. But you can't do this in idiomatic C++! It's an OO approach, but C++'s classes and templates don't help you implement it. C++ gives you RAII, which is very nice and a big advantage over C, but in other respects it just gets in the way.

          Edit to add: now that I look at this, Foo::getStringOrSomething() will always be called through a pointer (or a Foo&), so it will always need a double dereference to access the impl. Unless, again, you inline the definition so the compiler has enough information to flatten that indirection.

          I don't see how that pImpl approach can ever be as performant as the basic C approach. Am I missing something?

          • Maxatar 10 days ago

            In C++, if you want an opaque type similar to your C example, you give your class a static factory function.

                class Foo {
                  public:
                    static std::unique_ptr<Foo> create();

                    virtual ~Foo() = default;  // so deleting through Foo* is well-defined

                    std::string getStringOrSomething();

                  protected:
                    Foo() = default;           // only the hidden FooImp can construct one

                    Foo(const Foo&) = delete;

                  private:
                    struct FooImp;

                    FooImp& self();            // defined in the .cpp, where FooImp is complete
                };
            
            And then you use inheritance to hide your implementation in a single translation unit, i.e. a .cpp source file.

                struct Foo::FooImp : Foo {
                  std::string something;
                };

                Foo::FooImp& Foo::self() {
                  return static_cast<FooImp&>(*this);
                }

                std::string Foo::getStringOrSomething() {
                  return self().something;
                }
            
            With this, there is a single indirection to access the object just as in the C example.

            But PIMPL is used when you want to preserve value semantics, things like a copy constructor, move semantics, assignment operations, RAII, etc...

            • iainmerrick 9 days ago

              That’s a nice trick, I don’t recall seeing that one before!

              It still seems like an awful lot of boilerplate just to reproduce the C approach, albeit with the addition of method call syntax and scoped destructors.

              I feel like there must be an easier way to do it. Hmm, maybe I’m at risk of becoming a Go fan...!

              • Maxatar 9 days ago

                The boilerplate gives you additional type safety compared to C.

                If you want the same type safety as C, which is basically none, then you can write it as:

                    // foo.hpp
                    struct Foo {
                      static Foo* make();
                
                      void method();
                    };
                
                    // foo.cpp
                    struct FooImp : Foo {
                      std::string something;
                    };
                
                    Foo* Foo::make() {
                      return new FooImp();
                    }
                
                    void Foo::method() {
                      ...
                    }
                
                And yes, it's rare because nowadays most C++ developers stick to value semantics as much as possible rather than reference semantics, but this approach was very common in the early 2000s, especially when writing Windows COM components.

                Nowadays if you want ABI stability, you'd use PIMPL. Qt is probably the biggest library that uses this approach to preserve ABI.

                • iainmerrick 9 days ago

                  I don't think it's fair to say that C has no type safety. To recap, I had:

                    typedef struct foo Foo;
                  
                    extern Foo* foo_create();
                  
                    extern const char* foo_getStringOrSomething(Foo* foo);
                  
                  The only valid way to get a Foo (or a Foo ptr) is by calling foo_create(). Inside foo_getStringOrSomething(), the pointer is definitely the correct type unless the caller has done something naughty.

                  Of course there are a few caveats. First, the Foo could have been deleted, so you have a use-after-free. That's a biggie for sure! Likewise the caller could pass NULL, but that's easily checked at runtime. Those are part of the nature of C, but they're not "no type safety".

                  You can also cast an arbitrary pointer to Foo*, but that's equally possible in C++.

                  • Maxatar 9 days ago

                    A comment like "basically none" should not be taken literally. It is intended to indicate that the difference between the C++ approach and the C approach is that the C++ approach gives you a great deal of type safety to the point that the C approach looks downright error prone.

                    The C++ approach of sticking to value semantics doesn't involve any of the issues you get working with pointers: lifetime issues, null pointer checks, invalid casts, or forgetting how to properly deallocate. For example, you have a foo_create but didn't provide the corresponding foo_delete. Do I delete it using free, which could potentially lead to memory leaks? The type system gives me no indication of how I am supposed to properly deallocate your foo.

                    You don't like boilerplate - fair enough, it's annoying to write - but is boilerplate in the implementation worse than burdening every user of your class by prefixing every single function name with foo_?

                    The C++ approach allows you to treat the class like any other built in type, so you can give it a copy and assignment operator, or give it move semantics.

                    So no, it's not literally true that C has absolutely zero type safety. It is true that, compared to the C++ approach, it is incredibly error-prone.

                    While older C++ code is rampant with pointers, references, and runtime polymorphism, best practice when writing modern C++ is to stick to value types, abstain from exposing pointers in your APIs, and prefer type-checked parametric polymorphism over object-oriented programming.

                    If anything, the worst parts of C++, including your point about being able to perform an invalid cast, are inherited from C. C++ casts, for example, do not allow arbitrary pointers to be cast to each other.

    • bun_terminator 11 days ago

      I have never used or seen pimpl in my life, and C++ is all I do.

      • MathMonkeyMan 10 days ago

        Does somebody using my "class Foo" really need to know everything about its "std::unordered_map<std::string, std::unique_ptr<NetBeanFactory<RageMixin, DefaultBowelEvictionPolicy>>>" data member?

        No, they need to know only the size and alignment of "class Foo". Unfortunately, in C++, either a client has to see every type definition recursively down to the primitives, or you give them an opaque pointer and hide all of the internals in the implementation (all they know about "sizeof(Foo)" is "it contains a pointer to something on the heap, probably").

        edit: Ok, there's also the copy constructors and other possibly auto-generated value semantic operations, but pretend you've defined those explicitly, too.

        • bun_terminator 10 days ago

          I know what pimpl is - I relearn it before every job change for interviews. But I've never seen it in use, and don't see a use for it. Compile times are rarely an issue in my experience. At least not big enough to warrant something that extreme.

          I like to keep things simple: I need something, so I include something. Compile times are really orthogonal to that, and mostly a job for compiler devs, hardware people, modules, or whatever. Changing my code because of compile times seems pretty harsh.

          • Liquid_Fire 9 days ago

            pimpl is not for improving compile times (although it can help with that). It's for maintaining ABI compatibility by keeping your implementation details out of public headers.

        • gpderetta 10 days ago

          You can also provide pointers to base classes, but, unless you do some heroics, you are forced to use virtual functions.

allpaca 11 days ago

C++ is evolving so much, but I don't understand one thing: why do people continue to develop AI projects in Python? I'd choose C++ instead...

  • bpicolo 11 days ago

    Because they're Python libraries that just wrap C and C++. All the performance upside with better ergonomics.
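
    As a hedged illustration of what that wrapping can look like, here's a sketch using pybind11 (one common way such bindings are written, though not necessarily what any given AI library uses); the fastops module and sum_of_squares function are made up:

      // fastops.cpp -- the hot loop stays in C++; Python just calls into it
      #include <pybind11/pybind11.h>
      #include <pybind11/stl.h>   // converts Python lists to std::vector and back
      #include <vector>

      double sum_of_squares(const std::vector<double>& xs) {
          double total = 0.0;
          for (double x : xs) total += x * x;
          return total;
      }

      PYBIND11_MODULE(fastops, m) {
          m.doc() = "tiny example of a C++ kernel exposed to Python";
          m.def("sum_of_squares", &sum_of_squares);
      }

      // In Python, after building the extension:
      //   import fastops
      //   fastops.sum_of_squares([1.0, 2.0, 3.0])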

    • oivey 11 days ago

      Putting this another way, people use Python because it makes it way easier to compose the underlying C++ code. Composition and polymorphism in C++’s static type system are rather weak.

      Of course there is also the relative succinctness of Python, and other advantages too.

  • VHRanger 11 days ago

    Because you need a dynamic language to do rapid iteration.

    You don't want to recompile, re-parse a 12GB parquet file, etc. every time you try a new parameter in a model.