The Success and Failure of Ninja

209 points by dochtman 5 years ago

drothlis 5 years ago

The OP is 100% right that ninja's biggest architectural insight is the "assembler" metaphor. I came to this same conclusion in https://lwn.net/Articles/706404/

Sad to hear the overall experience has been so negative for him. Ninja has certainly had a large impact! I can relate to the open-source burnout, even though I've experienced only a tiny tiny fraction of it compared to the success of Ninja.

Re. speed, I wonder how much of Ninja's speed over make is thanks to Ninja's "deps log" binary format. In my (fabricated) benchmarks, Make spent 98% of its time processing the compiler-generated ".d" files that are used to track dependencies on header files: https://david.rothlis.net/ninja-benchmark/

evmar 5 years ago

Ninja originally used `.d` files as well (it was built while incrementally replacing a make-based system so it did many things like make). My recollection is that it was already faster than make without the deps log, but it is almost certainly project specific, and perhaps due to the ways we were abusing make (search for "clever hacks" on http://neugierig.org/software/chromium/notes/2011/02/ninja.h... for some discussion of this).
With that said, the deps log processing of .d files was a huge win for us. One thing to observe is that it's the same code and amount of work either way, but if you load them at startup they are processed during the critical path, while the deps log parsing happens during the build so their cost is amortized.
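To make the per-file work concrete, here is a rough sketch of parsing one compiler-generated `.d` file. It is not Ninja's actual code: the `parse_depfile` name is made up for illustration, and the parser is deliberately simplified (no escaped spaces, no multiple rules, no Windows drive letters).

```python
def parse_depfile(text):
    """Parse a single Makefile-style rule into (target, [dependencies]).

    Simplified on purpose: one rule only, no escaped spaces.
    """
    # Join backslash-continued lines into one logical line.
    joined = text.replace("\\\n", " ")
    # Everything before the first colon is the target, the rest are deps.
    target, _, deps = joined.partition(":")
    return target.strip(), deps.split()


example = "foo.o: foo.c foo.h \\\n  bar.h\n"
target, deps = parse_depfile(example)
print(target, deps)
```

The parsing cost is the same whenever it runs; what changes is whether it sits on the startup critical path or is spread across the build.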

modeless 5 years ago

I actually submitted a pull request to Ninja adding a daemon mode, knowing that not being a daemon was an intentional choice in the original design. The optimal design can change as a project matures. I think it makes sense to pay the complexity cost for the startup time optimization now that ninja is so widely used.

I'm still using the crash-only principle as much as possible; the daemon simply exits if build files or arguments change, rather than attempting to update its state. Although cross-platform IPC is complex, once that's abstracted away the changes to ninja itself are small and non-invasive. I think it's less complex than other proposals to speed up ninja startup, like a new binary input format.
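As an illustration of the crash-only idea only (this is a sketch, not the code in the PR, and the function names `build_fingerprint` and `should_exit` are hypothetical), a daemon could record a fingerprint of its arguments and build files at startup and simply exit when a later request no longer matches:

```python
import hashlib
import os


def build_fingerprint(build_files, argv):
    """Hash the arguments plus each build file's path, mtime and size."""
    h = hashlib.sha256()
    for arg in argv:
        h.update(arg.encode() + b"\0")
    for path in sorted(build_files):
        st = os.stat(path)
        h.update(f"{path}:{st.st_mtime_ns}:{st.st_size}".encode() + b"\0")
    return h.hexdigest()


def should_exit(startup_fingerprint, build_files, argv):
    """Crash-only rule: never patch state in place, just exit when stale."""
    return build_fingerprint(build_files, argv) != startup_fingerprint
```

The point of the design is that there is no "update in-memory state" code path to get wrong; a fresh process always starts from the files on disk.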

Anyway, if anyone is working on a very large project and wishes ninja started faster, try this PR: https://github.com/ninja-build/ninja/pull/1438

The_rationalist 5 years ago

Interesting, did you take inspiration from gradle daemon?

greggman3 5 years ago

the whole post is interesting but if you're short on time jump to the end "Open Source"

Then try to have a better attitude when leaving issues on github or whatever. Be thankful. Take responsibility if you want a change to make the change yourself. Offer it as a PR but don't expect it to be pulled. Others can grab the PR if they want it even if the maintainer doesn't feel it's a fit. If you have a suggestion offer it as a question like (would it be faster if this did X instead of Y? or would it be useful if feature A existed?)

I'm not good at this myself but I'm slowly getting better over the years.

Also also, I wish there was a better way to express thanks. In the article it was mentioned that thank yous are rare whereas angry demands are common. But, there is no easy way to express thanks. Like opening an issue and saying "Thank you" is frowned on. Even leaving a comment under a tech blog post that just says "thanks" is considered by some to be unwanted noise. I don't have a solution here. Maybe you have some suggestions?

Also, if you feel it's appropriate, sponsor your favorite projects. I've wondered if there is a good way to promote sponsorship. I don't want to brag that I'm donating each month to various projects but I do want more people to consider sponsoring. Especially user oriented projects like say Blender, software that 100k to 1000k people are using which would normally probably cost $1500k-$3000k a year, maybe a coffee a month would be worth it if you're using it often? Same for things like Kodi? Or VLC? both of which I've got 1000s of hours of usage out of.

StavrosK 5 years ago

I've registered the domain www.osscoc.com and I keep planning to write something but never get the inspiration. I want it to be a reminder that people can easily link to, e.g. when someone is being belligerent, the CoC can remind them that everyone is on the same side, and lay out some scenarios like "my PR didn't get approved" and what to remember.
Also, the best thanks are PRs, I think.

aplaice 5 years ago

> Maybe you have some suggestions?
I'm probably not the intended "you", but one idea would be a "testimonials"/thanks section in GitHub/GitLab/Gitea, in addition to the standard Issues and Pull/Merge requests.
It'd allow users of the tool/library to express gratitude, give the developer feedback about their creation and possibly even provide inspiration to other users. People could set their own notification preferences, and the UI could make it clear to those thanking that they shouldn't necessarily expect a reply.
The "testimonials/thanks template" could include a suggestion to donate if you've made a commercial product using the software or if you're just a happy end-user.
GitHub is trying out "discussions", but that's far less targeted.
From a personal point of view, I just try to express thanks whenever I otherwise have a substantive point to make (bug report, comment on an issue), but as you point out that doesn't allow for situations where you don't have anything else to say.

jcelerier 5 years ago

Ninja is great! In particular the author's point about iteration time rings really true: if things don't happen within a span of 2 seconds, attention is lost.

I've been developing a small cmake frontend which uses it by default as well as generally optimizes for build speed - https://github.com/jcelerier/cninja

ncmncm 5 years ago

This article is almost a course in how to make good utility software of any kind.

Being short and not very specific, it probably takes close reading to get much useful from it, but not a word is wasted, and nothing in it will lead one astray.

albertzeyer 5 years ago

> I strongly believe that iteration time has a huge impact on programmer satisfaction, and Ninja is used exactly in the edit-compile loop where the difference of 1 second and 4 seconds is critical. I think I am personally more latency-sensitive than the average programmer, but I also believe that programmers feel latency and it affects their mood even if they don't notice it. (Google has recently done some research in this area that kinda confirmed my belief, here's hoping they'll publish it publicly!)

This is something to emphasize.

I often feel that my productivity is strongly linked to the latency of actions I do. I like to play around a lot if the latency is low. Once it becomes higher, I generally don't do that, and in principle I often additionally think about how to work around stuff, maybe write another complexity layer on top, maybe even do less testing, etc (many of these things happen unconsciously). So it not only adds the latency time, but really makes the whole development cycle slower, and less optimal. Also I tend to get distracted more easily. When I know that the next action (compiling, startup time, whatever) takes 30-60 seconds, I will definitely not just wait, but do something in the meanwhile. But also you cannot really do something productive in that time. So I end up checking mails or HN or so.

Also, since I started with my PhD, doing research on machine learning, I discovered a whole new scale of this problem, which I had not experienced before. Now the latency time has usually increased from a couple of seconds to a day, or a week. This drastically changes how you work. To stay productive, you have to parallelize your work. When you are trying out things, you will try out many variations in parallel. Also you will look at the intermediate results, and maybe already draw conclusions from that.

If there is a chance to refactor your current pipeline to reduce that iteration loop time, or the wait-time (which might be the startup time of some tool, or whatever), I think it's well worth doing.

rjsw 5 years ago

We have come full circle back to the days of batch processing and punched cards.
- mcny 5 years ago
  
  Reminds me of this comment:
  https://news.ycombinator.com/item?id=18442941
  https://archive.fo/FEtN0
  Oracle Database 12.2.
  It is close to 25 million lines of C code.
  What an unimaginable horror! You can't change a single line of code in the product without breaking 1000s of existing tests. Generations of programmers have worked on that code under difficult deadlines and filled the code with all kinds of crap.
  Very complex pieces of logic, memory management, context switching, etc. are all held together with thousands of flags. The whole code is ridden with mysterious macros that one cannot decipher without picking a notebook and expanding relevant parts of the macros by hand. It can take a day to two days to really understand what a macro does.
  Sometimes one needs to understand the values and the effects of 20 different flags to predict how the code would behave in different situations. Sometimes 100s too! I am not exaggerating.
  The only reason why this product is still surviving and still works is due to literally millions of tests!
  Here is how the life of an Oracle Database developer is:
  - Start working on a new bug.
  - Spend two weeks trying to understand the 20 different flags that interact in mysterious ways to cause this bug.
  - Add one more flag to handle the new special scenario. Add a few more lines of code that checks this flag and works around the problematic situation and avoids the bug.
  - Submit the changes to a test farm consisting of about 100 to 200 servers that would compile the code, build a new Oracle DB, and run the millions of tests in a distributed fashion.
  - Go home. Come the next day and work on something else. The tests can take 20 hours to 30 hours to complete.
  - Go home. Come the next day and check your farm test results. On a good day, there would be about 100 failing tests. On a bad day, there would be about 1000 failing tests. Pick some of these tests randomly and try to understand what went wrong with your assumptions. Maybe there are some 10 more flags to consider to truly understand the nature of the bug.
  - Add a few more flags in an attempt to fix the issue. Submit the changes again for testing. Wait another 20 to 30 hours.
  - Rinse and repeat for another two weeks until you get the mysterious incantation of the combination of flags right.
  - Finally one fine day you would succeed with 0 tests failing.
  - Add a hundred more tests for your new change to ensure that the next developer who has the misfortune of touching this new piece of code never ends up breaking your fix.
  - Submit the work for one final round of testing. Then submit it for review. The review itself may take another 2 weeks to 2 months. So now move on to the next bug to work on.
  - After 2 weeks to 2 months, when everything is complete, the code would be finally merged into the main branch.
  The above is a non-exaggerated description of the life of a programmer in Oracle fixing a bug. Now imagine what horror it is going to be to develop a new feature. It takes 6 months to a year (sometimes two years!) to develop a single small feature (say something like adding a new mode of authentication like support for AD authentication).
  The fact that this product even works is nothing short of a miracle!
  I don't work for Oracle anymore. Will never work for Oracle again!

biggestdecision 5 years ago

Funny to hear the author of Ninja say that he's never used CMake...

And they shouldn't have been surprised at the number of Ninja users that were on Windows: Ninja is so much faster than the alternatives if you are using CMake on Windows.

gjvc 5 years ago

I have found the combination of CMake, Ninja, and ccache to be excellent.

RobotCaleb 5 years ago

I use cmake and ninja with ice cream to distribute across the various machines around my office. I've been quite happy with it.

ncmncm 5 years ago

Ninja and ccache actually made SCons almost tolerable.

carapace 5 years ago

> Related work

> I mentioned that I stumbled through Ninja's design. I regret not spending more time researching before building, ... Since then I have come to appreciate how important it is to actually understand the design space when building a thing. I now find myself noticing how rare it is for programmers to discuss related work and it now drives me mad.

FWIW, I think this is a key take-away.

rwmj 5 years ago

OT but I recently had a rethink about make, and wrote an experimental replacement and did a talk about it: https://rwmj.wordpress.com/2020/01/14/goals-an-experimental-...

drothlis 5 years ago

Thanks for the link -- it was really interesting.
For other HN readers: The parent works at RedHat (on distro-wide build/packaging? He talks about building Fedora packages) and there are some useful ideas in there.
My own notes & thoughts having just watched the video:
## "Tactic" allows targets that aren't files ##
- For example a target #url("http://xyz/abc"): to "build" it, scp the dependency to the server.
[Edit: I had to replace * with # in "* url" above because of HN formatting.]
How does it check if the file at the url is out of date? Would be terribly slow if it had to fetch it each time.
- For example #koji-built("package-name"): used to rebuild fedora packages in the right dependency order, if needed, using a remote build service.
I suppose you can implement this in make/ninja by creating dummy marker files like ".url.http_xyz_abc.built" as your target, but it's a hassle.
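A minimal sketch of that marker-file workaround, assuming a hypothetical helper `stamp_name` in a build-file generator. It only derives the dummy filename; the rule that does the upload and then touches the marker is not shown.

```python
import re


def stamp_name(kind, arg):
    """Map a non-file target such as url("http://xyz/abc") to a marker filename."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    safe = re.sub(r"[^A-Za-z0-9]+", "_", arg).strip("_")
    return f".{kind}.{safe}.built"


print(stamp_name("url", "http://xyz/abc"))
```

Note that this naive mangling can map two different arguments to the same filename (for example "a/b" and "a_b"), which is part of why doing it by hand is a hassle.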
## Multiple parameters ##
- make rules support a single parameter, i.e. the "%" in "%.o: %.c"
- his tool supports arbitrary parameters:
```
    goal compile (name, debug) =
    "%name.%debug.o" : "%name.c" {
        if [ %debug = d ]; then f=-DDEBUG=1; fi
        ...
    }
```
I've been using a python script to generate ninja files, so I achieve the same by using normal Python functions with arguments; plus ninja's built-in support for rebuilding the target if the recipe has changed.
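A minimal sketch of that approach, with made-up names (`compile_c`, a `cc` rule invoking `gcc`); it is not the actual generator script. The two parameters become ordinary function arguments, and the per-target flag becomes a variable on the ninja `build` statement:

```python
# One shared rule; per-target differences are passed in via $cflags.
RULE = "rule cc\n  command = gcc $cflags -c $in -o $out\n"


def compile_c(name, debug):
    """Return (target, ninja_text) for compiling name.c with or without DEBUG."""
    target = f"{name}.{debug}.o"
    cflags = "-DDEBUG=1" if debug == "d" else ""
    text = f"build {target}: cc {name}.c\n  cflags = {cflags}\n"
    return target, text


target, text = compile_c("foo", "d")
print(RULE + text)
```

Because `cflags` ends up in the command line ninja records, changing the flag changes the recipe and so triggers a rebuild of that target.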
## Shell problems in make ##
1. filenames with spaces - his tool supports it by forcing filenames to be quoted. "some_name" is syntactic sugar for #file("some_name") (see "tactics" above).
2. make (by default) uses a separate shell per recipe line.
3. dollar interpreted by make so have to use $$var - his tool solves this by using % as the escape character.
I think ninja solves the first 2 automatically. I solve the 3rd one by doing the escaping ($ -> $$) in the Python script that generates my ninja files.
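The escaping step can be as small as the following sketch (the function name `ninja_escape` is illustrative). In ninja files `$$` stands for a literal dollar sign, so a shell variable has to be doubled before it is written out:

```python
def ninja_escape(s):
    """Escape a literal '$' so ninja does not treat it as a variable reference."""
    return s.replace("$", "$$")


print(ninja_escape("echo $HOME"))
```

This should only be applied to text meant for the shell; ninja's own variables such as `$in` and `$out` must be added after escaping.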
## Named goals ##
If you have a "goal" (rule) called "compile" then you can use the goal invocation as a dependency. For example these 2 are equivalent:
```
    goal link = "prog": "foo.o"
```
and
```
    goal link = "prog": compile("foo")
```
THIS IS A BIG DEAL. It means you don't have to come up with artificial target names (filenames) for intermediate build steps. In my own Python+ninja builds, each Python function returns the target, which you can use as input to other functions, and this has turned out to be super useful. For example (Python code):
```
    def build_package(name, build_root_yaml):
        build_root = build_container(name, build_root_yaml)
        src = tarball_to_ostree(git_archive(repo, sha, prefix="src/"))
        return run_command("make -C /src install DESTDIR=/dest",
                           ostree_combine([build_root, src]),
                           chroot=True,
                           out_subdir="/dest")
```
This snippet creates a filesystem image (build_root), creates a source tarball using "git archive", unpacks that tarball into the build root, then chroots into the build root to run a build command, capturing the tree created in "/dest" as a new build artifact. The return value is a target representing this build artifact -- you can pass it into other functions. It's still a filename, but it can be generated automatically, for example as a hash of the inputs.
Note that running this python code doesn't actually do any of this building; it just writes out a ninja file that will do it. The return value (from the python function) is the target name. Each of "build_container", "tarball_to_ostree", "git_archive", "run_command", and "ostree_combine" are Python functions that behave in a similar way (writing out ninja rules and returning the target).
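To show the shape of such a helper, here is a self-contained sketch under stated assumptions: the `NinjaWriter` class, the `run` rule, the `out/` prefix and the 16-character hash are all invented for illustration and are not the real build system. Each call only appends ninja text and returns a target name derived from a hash of its inputs:

```python
import hashlib


class NinjaWriter:
    """Collects ninja text; each helper returns the target it defines."""

    def __init__(self):
        # Hypothetical generic rule: run $cmd and capture stdout as the output.
        self.lines = ["rule run", "  command = $cmd > $out"]

    def run_command(self, command, inputs=()):
        # Name the output after a hash of everything that determines it.
        key = "\0".join([command, *inputs]).encode()
        target = "out/" + hashlib.sha256(key).hexdigest()[:16]
        self.lines.append(" ".join([f"build {target}:", "run", *inputs]))
        self.lines.append(f"  cmd = {command}")
        return target


w = NinjaWriter()
src = w.run_command("git archive HEAD")
listing = w.run_command(f"tar -tf {src}", [src])
print("\n".join(w.lines))
```

Nothing is built when this runs; the returned names are just strings that later calls can use as inputs, which is what removes the need to invent intermediate filenames by hand.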
(I need to do a proper write-up of how we use Python+ninja+ostree in our build system. I hope the above made some sense at least.)
## Computer sciency things ##
He talks about how his named goals are like functions, that can be called by name or by pattern matching. I need to think about it a bit more to really understand the implications.
---
Thanks again for the talk. I hope this doesn't come across as "but you can already do this in ninja!". I just find it interesting to have new ways & words to think about various build system features/capabilities, and to compare how different build tools tackle it. (In this vein I recommend the "build systems a la carte" talk/paper mentioned in the OP.)

memco 5 years ago

I've never worked with it myself, but Textmate (https://github.com/textmate/textmate) has used ninja for years. It was the only project I knew that used it, but was always interested in that decision.

malkia 5 years ago

"stat 30k files in 10s of milliseconds" - I'm not sure how is this possible, as GetFileAttributesEx() is roughly 65k/s - e.g. 64-65k ops (my computer, my drive, etc.) - all cached. It's possible that Ninja maybe using something else (directory lookup?)

JNRowe 5 years ago

The paragraph mentions that it is talking about Linux. Further down in the article it digs in to the speed problems on Windows, and links to a quick run down of the speed comparisons¹.
1. https://github.com/ninja-build/ninja/wiki/Timing-Numbers
- malkia 5 years ago
  
  Just saw it, if I'm interpreting the numbers there - it's even worse (as expected on Windows/NTFS), even from cache. Still though I love ninja, and just recently used it for generating .proto files to 4 different languages.
  Also love bazel (after using 2-3 years blaze), but for the use case above it would've been a real "killer" :)

acqq 5 years ago

Have you measured the speed of FindFirstFile, if the goal is only to have many file times, maybe that could be faster for a hot cache, compared to GetFileAttributesEx? (For cold cache, everything is slow anyway.)
In the specific case of the ninja build tool, it may be that they simply don't achieve the same speed on Linux and Windows, as JNRowe writes. It is known that NTFS and Windows are traditionally both visibly slower with many operations on many files, compared to Linux.
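For anyone who wants to try the directory-enumeration approach without writing Win32 code, a rough Python sketch follows (the function name `mtimes_in_dir` is made up). On Windows, `os.scandir` is implemented on top of FindFirstFileW/FindNextFileW and `DirEntry.stat()` normally needs no extra system call per file; on POSIX systems it still performs a stat call for each entry, so the benefit is platform specific.

```python
import os


def mtimes_in_dir(directory):
    """Return {name: mtime_ns} for regular files, using one directory scan."""
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                result[entry.name] = entry.stat().st_mtime_ns
    return result
```

Timing this against a loop of individual `os.stat` calls over the same files would give a first impression of the hot-cache difference on a given machine.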
- malkia 5 years ago
  
  You can FindFirstFile (not that, but directly reading the directory as a file) really fast (haven't measured that recently), but there is latency
  
  acqq 5 years ago
  
  > FindFirstFile (not that, but directly reading the directory as a file)
  I always believed FindFirstFile actually does read the directory as a file once (or in some convenient chunks), keeps the handle to that and returns an element of that each time FindNextFile is called. Moreover, NTFS has some stat-like file information in the directory file, exactly to be able to return it this way. And as always, then the problem with having the cached information is, what if it is stale. But sometimes, for some purposes, it can be good enough.
- evmar 5 years ago
  
  The first attempt at Windows tried to do some memory-resident file watching thing that I eventually rejected as too complex. We circled a few more times on different approaches and the final Windows approach is pretty complex.
  Look at the WIN32-related ifdefs in https://github.com/ninja-build/ninja/blob/master/src/disk_in... (I would link to something specifically but it's not that long of a file.)
  
  malkia 5 years ago
  
  I wonder how that works, and hasn't it run into problems? (E.g. relying on FindFirstFile for info). Maybe it's all right when comes to compiling... just this https://devblogs.microsoft.com/oldnewthing/20111226-00/?p=88...
  
  acqq 5 years ago
  
  > Maybe it's all right when comes to compiling
  I would not expect problems, once there are no bugs in the code. As far as I understand, a build tool doesn't need file sizes, only the timestamps.
  The code appears to do sensible things, e.g. it notes "FindExInfoBasic is 30% faster than FindExInfoStandard" and uses the former when possible.
  Looking "from 20000 ft", the only to me unexpected thing I see is handling ".." there, somehow I expected that code which uses that wouldn't even need the timestamp of "..".
- pianoben 5 years ago
  
  As it turns out, this is more or less what ninja does:
  https://github.com/ninja-build/ninja/blob/master/src/disk_in...

highlysyntropic 5 years ago

particularly responding to the authors story about all the disappointment that open source involves.

open source maintainers this is a public service announcement. You're killing yourselves by being too nice. Don't be afraid to embrace your inner despot because really that's what you are when you are maintainer over a project. you need to be sort of a rougher, less experienced than, less polished version of Linus's or Guido's benevolent dictator for life. if you don't like something just shut it down if you don't like an issue just close it if you don't like a PR just close it if you don't like a request just deny it.

if this is starting to rub off on you in a good way and you're liking what I'm saying well then just double down and do it. embrace your inner despot. it's okay. if you're still sitting on the fence let me bring you to the good side with a little bit of advice, if you need to have it sold you like this. That's fine. it's not about you. keeping you in the best mental health possible, taking care of you is really about taking care of the project and all the people involved in it. you need to come first so that the project survives and can thrive. so uncomfortable as it may be to embrace your inner despot that's the best way to secure your own mental health.

clear boundaries. unhesitating expression of what you want to do. no apologies.

they want to fork it? not your problem. stick a fork in it.

you don't like the way someone's behaving? block them.

you can choose to make this a great experience for you I believe. you just need the discipline to stick to that path. at every branch in the road choose this: what you want to do and only what you want to do. no compromises. That's the discipline you need to practice. if a project that used to be fun and give you a whole set of positive emotions, opens up, that should really only increase the positivity for you once you share it with others. I believe if you picked the right path by embracing your inner despot you can do that. you end up with a lot of people expressing their negative emotions, but you know what? who cares. their feelings are not your responsibility. it's not your problem.

be a despot. you'll like it. and it might be the only chance you have to do that. and it might be the way to save your mental health and your project.

and maybe just maybe if you be a despot the community will change. people who support you will gather around you and protect you once they see that strength which is inspiring: that you stick to your principles and your vision. and maybe just maybe if you're lucky, there's no guarantee, but maybe the community will start protecting you. and if they do well just don't get soft. keep being a despot because that's what you need to do that's your job. you created this thing it's your responsibility to keep it on the best path possible, and that involves keeping yourself on the best path possible. you already know how to do that you just have to choose that at every little decision you have. if you're not ready to make a decision right now just don't. delay it until you feel ready to do it.

good luck. end of message.

TheDong 5 years ago

If I were to be a despot, I wouldn't feel better, I would feel worse.
To quote the article, "I wanted to repay their effort with a thorough explanation about why I was turning them down, and doing that was itself exhausting" -- the author feels as if they need to do these things, and at the same time doing so is draining. Simply replying with "no, I'm not taking this" and closing it very well might make the author feel even worse because it's not like that contributor meant anything ill, it's not like they don't deserve an explanation... it's simply an imbalance that there's an order of magnitude more contributors than maintainers.
If the problems were clearly abusive people that could be blocked, the solution would be easy. Unfortunately, the feeling of "people care about this project, and I owe them respect, even if it's draining on me" isn't something nearly so easily solved.
If you want to program in the style you speak of, you wouldn't even publish the code in the first place. If the point of the code is as a hedonistic selfish art-form, it never needs to leave your computer. You can be a tyrant over your own code. However, if you decide you wish others to see your code, use it, or learn from it... well, you clearly have a desire that others will use it, and now you're stuck between feeling as if you need to do X in pursuit of that, and burning out.
Frankly, I'd rather have shared my code, and learnt the hell that results, than to never have shared my code at all.
- highlysyntropic 5 years ago
  
  Disagree. This is too compromising. You'll suffer.
  Doesn't make sense. You say you'd feel worse being a despot, but say you'd have "learnt the hell" that results by compromising. Worse than hell?
  Boundaries are not just for people you want to block. It's for everybody. It's not about them. It's about what you want. It's what's good for you.
  You choose your own experience pal, but the way I see it is, if you're feeling these things the author describes...that's not good for you.
  I'm not trying to convince you. You've got to choose your own way. To me it's a clear choice. What's good for you needs to trump what's good for others, otherwise you'll hurt yourself. So, boundaries.
  I understand if it's the first time you've encountered the concept. Or seen it so boldly applied. But...I think it's warranted. The magnitude of suffering of these OSS guys is staggering. And to my view, they do it to themselves by not saying no. That's all.
  I am grateful you helped me explore it more...but I found that everything you raised I'd already thought of, or was already covered by the approach I propose to it. Good chance to reinforce the idea tho.
  
  austinjp 5 years ago
  
  Horses for courses. Some people will respond better in one environment, others won't. There's no One True Way.
  
  highlysyntropic 5 years ago
  
  Not really tho. In this case, there really is a One True Way, here! If you're doing more than you want to, and what you don't want to, for other people, you're not gonna enjoy it. Going beyond your boundaries, and violating your own comfort and principles, it's not gonna be good for you. Should be obvious.
  If people are asking you to do things, and you don't want to, just do what you want. I know it's hard to resist sometimes. But that's sort of why the discipline is essential.
  Of course, there's also..."push out of your comfort zone" and "stretch to your limit" and "live at your edge"... but that's a whole different thing. Those things should be taken in small doses, for you for excitement and enjoyment, like extreme training or skydiving or whatever, at totally consensual choice by yourself. But that's not what we're talking about.
  Sure some people choose self-punishment. But that doesn't mean it's good for them. People choose a lot of stuff, for long times, that aren't necessarily good for them. They complain about it. When they could have just said no. Humans.
  
  austinjp 5 years ago
  
  The behaviour described is entirely selfish. There's plenty of evidence that the opposite, selfless behaviour, is highly beneficial and can actually be enjoyable.
  If you never compromise for the sake of others you'll find yourself lonely and disliked.
  There's no correct or incorrect here, no black and white. It's a big spectrum of grey. There is a balance, and it's hard to find.

finnthehuman 5 years ago

I can't agree more; say "no" early and often.
But there are cultural forces, even moreso in software development workplace culture, that make saying "no" hard/uncomfortable, and receiving a "no" feel like a bigger deal than it is.
It leads to the passive aggressiveness we're all familiar with, and that leads to a further slide, adding even more ambiguity to a rejection on the scale between "eh, maybe not right now" and "fuck off and die in a fire."
- highlysyntropic 5 years ago
  
  When it could have just been a simple no.
  Then again, workplace is so different to OSS. It's a different power dynamic. You're not a despot in the workplace if you're taking orders from above. In a real sense, when you work for someone, saying "no" to work tasks can mean getting fired.
  But in another sense, you need to be able to talk to people in your workplace as peers (again different to the OSS situation we are talking about here), using NVC and what not, to make sure your needs are getting met, you feel heard, and you're healthy. A "no" as in "I'm not unwilling to do the work here" but "this doesn't work for me" and talking is necessary. A workplace where you can't say no does not sound very good.
  Well, receiving a no that's a different story, and people need to get that it's their responsibility to get used to that, and it's okay to get a no.
  Giving a no, that discipline is essential