zeroxfe 19 days ago

Okay, so when I worked at Sony about 25 years ago, I got assigned this project to fix our order management system, which was extremely slow and kept crashing.

I jumped in and started digging around, and to my horror, the OMS was a giant set of shell scripts running on an AIX server, which had evolved over a decade and then been abandoned. It was over 50,000 lines of code! It was horrendous and shit kept timing out everywhere -- orders, payments, and other information were moved from server to server over FTP, parsed with complicated sed/awk, and inventory was tracked in text files (also FTP'd around).

At the time, Perl seemed like the most practical way for me to migrate the mess -- I rewrote all of the shell piece by piece, starting with the simplest pieces and replacing them with small Perl modules as part of a larger Perl application, refactoring along the way. It took me 3 months and I moved the whole thing to about 5,000 lines of Perl, and it ran 10-100x faster with almost none of the failures of the original system.

As terrible as it was, it's one of the most satisfying things I've ever done. :-)

  • martin-t 19 days ago

    Just 3 months?

    That's deleting 800 lines a day, each day. Did you need to read through the original code, get a deep understanding and match its behavior exactly or did you throw away huge chunks and write new code as you thought it should behave?

    Was there a lot of boilerplate that could be replaced quickly?

    • zeroxfe 17 days ago

      There was tons of duplicate code, unnecessary code, dead code, etc. There was also a lot of code that CPAN modules could entirely replace. (Also, I'm a workaholic who obsesses about a problem until it's fully solved.)

    • almostgotcaught 18 days ago

      > That's deleting 800 lines a day, each day

      50,000/90 = 555 ???

      • ciupicri 18 days ago

        Not all days are working days.

  • shawn_w 19 days ago

    perl is still the most practical way to mitigate shell script abominations like that. Though tcl's a good option too.

    • chubot 19 days ago

      Oils aims to be the absolute best way to migrate shell scripts! (I created the project, and the wiki page being discussed)

      https://www.oilshell.org/

      OSH is the most bash-compatible shell in the world, and YSH is a new language

          ls | sort | uniq | wc -l   # this is both OSH and YSH
      
          var mydict = {foo: 42, bar: ['a', 'b']}   # this is new YSH stuff you can start using
          json write (mydict)
      
      
      The difference between OSH and YSH is exactly a set of "shopt" options [1], although YSH feels like a brand new language too! There is a smooth blend.

      I think it's worth it for two things alone:

      - YSH checks all errors - you never lose an exit code

      - YSH has real arrays and doesn't mangle your variables with word splitting
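
      For contrast, a rough bash-only illustration (not Oils code) of the behaviors those two bullets are about:

          cat missing-file | wc -l    # pipeline exit status is wc's, so cat's failure is silently lost
          files="a b.txt"
          ls $files                   # word splitting: ls gets two arguments, "a" and "b.txt"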

      There's a lot more: modules with namespaces (use mymodule.ysh), buffered I/O that's not slow, etc.

      Gradually upgrading - https://github.com/oils-for-unix/oils/wiki/Gradually-Upgradi... (people are writing new YSH, but not many people have gradually upgraded, so I'd definitely appreciate feedback from people with a big "shell script problem")

      ---

      There is a FAQ here about Perl:

      Are you reinventing Perl? - https://www.oilshell.org/blog/2021/01/why-a-new-shell.html#a...

      Not to say that migrating to Perl is worse in any way, i.e. if you already know Perl or your team knows it.

      But objectively YSH is also a shell, so I think more of the code carries over, and there is a more direct upgrade path.

      ---

      [1] Unix Shell Should Evolve like Perl 5 - https://www.oilshell.org/blog/2020/07/blog-roadmap.html#the-... - i.e. with compatible upgrade options

      • RestartKernel 18 days ago

        That looks great! I moved from Fish and NuShell to Zsh because of its reasonable compatibility with bash, so OSH seems right up my alley.

    • nextos 19 days ago

      Or Ruby, which is essentially Smalltalk for Unix, plus lots of Perl-isms.

      Haskell (e.g. shh) and Clojure (Babashka) are also nice for this use case, but more niche options.

PeterWhittaker 20 days ago

Oh, no, now I have to go dig out some of mine....

The first really big one I wrote was the ~7000 line installer for the Entrust CA and directory, which ran on, well, all Unixes at that time. It didn't initially, of course, but it grew with customer demand.

The installation itself wasn't especially complicated, but upgrades were, a little, and this was back when every utility on every Unix had slight variations.

Much of the script was figuring out and managing those differences, much was error detection and recovery and rollback, some was a very primitive form of package and dependency management....

DEC's Unix (the other one, not Ultrix) was the most baffling. It took me days to realize that all command line utilities truncated their output at column width. Every single one. Over 30 years later and that one still stands out.

Every release of HP-UX had breaking changes, and we covered 6.5 to 11, IIRC. I barely remember Ultrix or the Novell one or Next, or Sequent. I do remember AIX as being weird but I don't remember why. And of course even Sun's three/four OS's had their differences (SunOS pre 4.1.3; 4.1.3; Solaris pre 2; and 2+) but they had great FMs. The best.

  • emmelaich 20 days ago

    That column truncation sounds bizarre. Are you sure the terminal didn't have some sort of sideways scroll available?

    • dspillett 20 days ago

      I think he meant that they truncated the lines even when called from a script, with their output going somewhere other than a terminal, not just when run interactively.

      • emmelaich 19 days ago

        Yep, but I'm curious enough to quiz it.

        Weirdly, today I ran wish in MacOS Sequoia (15.1.x) and had the (exception) output truncated at terminal width!

        • p_l 19 days ago

          Because macOS's closest relative isn't Free/NetBSD, but OSF/1, which, under a few different names, was sold by Digital as Unix for Alpha (there were a few rare builds for MIPS too).

          • skissane 19 days ago

            > Because macOS's closest relative isn't Free/NetBSD, but OSF/1

            What you say here contains some truth (certainly with respect to the kernel), but I doubt that has anything to do with the behaviour the grandparent is reporting–the wish command in Tcl/Tk truncating output at terminal width. That behaviour would be determined by the Tcl/Tk code, nothing inherently to do with the underlying OS.

            > but OSF/1, which, under a few different names, was sold by Digital as Unix for Alpha (there were a few rare builds for MIPS too).

            IBM also briefly sold a port of OSF/1 to IBM mainframes, AIX/ESA: it was available to customers in June 1992, and withdrawn from marketing in June 1993–it was discontinued so quickly due to lack of customer interest, and also because IBM was about to release (in 1994) a UNIX compatibility subsystem for MVS (OpenEdition), which was a more attractive UNIX option for many of their mainframe customers.

            I believe IBM's abortive Workplace OS – shipped in beta form only as OS/2 PowerPC Edition – was also partially derived from the OSF/1 code base.

            • p_l 19 days ago

              I believe the Workplace OS connection was related to the use of the Mach microkernel. I knew IBM was looking into at least one use of OSF/1 as a shipped Unix, but wasn't sure which one (and AIX by the mid-1990s was weird enough to confuse most people...)

          • emmelaich 19 days ago

            Interesting, I thought MacOS was basically a FreeBSD variant.

            But I just tried it again on a resized terminal window and I couldn't reproduce it!

            • p_l 19 days ago

              OSX took in a bit of FreeBSD and NetBSD to modernise some areas, but the main thing for the 10.0 release involved pulling in the latest OSFMK.

              Also, several APIs introduced after BSD 4.4 are very visibly missing, pointing to how little was taken from Free/NetBSD.

        • PeterWhittaker 19 days ago

          dspillett was exactly right: ps, e.g., truncated its output at $COLUMNS and there was no horizontal scroll.

          As suggested above, it did this even when called from a script.

          The fix was easy, set COLUMNS ridiculously large if DEC Unix, but it took days of apparent WTF-grade undefined behavior before I realized how simple the explanation was. It just seemed haphazard: I'd reposition and resize a window so I could run the script in one while manually running the commands in another, get inconsistent results, rinse, repeat...

          ...and eventually realize the common element in each test was me, and the variations I was introducing were window size.

          I cursed their engineers for trying to be "helpful" and keep things "pretty".
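
          Roughly, the workaround (reconstructed from memory, and assuming Digital Unix reports "OSF1" from uname):

              # force a huge width so utilities like ps stop truncating at the window size
              if [ "$(uname -s)" = "OSF1" ]; then
                  COLUMNS=10000
                  export COLUMNS
              fi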

      • nikau 19 days ago

        If he is talking about OSF/1/Tru64, that's the first I've heard of it.

  • throw16180339 20 days ago

    > DEC's Unix (the other one, not Ultrix) was the most baffling. It took me days to realize that all command line utilities truncated their output at column width. Every single one. Over 30 years later and that one still stands out.

    Do you mean OSF/1, Digital Unix, or Tru64 Unix?

    • PeterWhittaker 19 days ago

      Oh, yes, I think it was Digital Unix. IIRC, we toyed with OSF/1, but there wasn't much call for it.

      • p_l 19 days ago

        OSF/1, Digital Unix, and Tru64 are the same OS at different points in time.

        Technically, OSF/1 was supposed to be the commercial BSD answer to System V; in practice only a few niche vendors used it, plus Digital and NeXT (and through NeXT, Apple, which continues the line to this day).

        • PeterWhittaker 19 days ago

          Thanks for the clarification. It was so long ago, it’s all a bit hazy.

          Other than the COLUMNS thing. That is burnt into my memory forever.

  • raffraffraff 20 days ago

    I made it to thousands but more like 2000. At least I only had to support Redhat and Ubuntu (modern ones, at that)

  • banku_brougham 20 days ago

    Thank you for your service, I'm so glad you could share. I'd be interested to read more.

  • PeterWhittaker 19 days ago

    OK, so JOOC I ran wc against the main binary and supporting libraries for a project I did last year: It's a script to manage linear assured pipelines implemented as a series of containers (an input protocol adapter, one or more filters, an output protocol adapter). The whole thing was intended to be useful for people who aren't necessarily experts in either containers or protocols, but who have an idea of how they want to filter/transform files as they transit the pipeline.

    It's 6224 lines, so far.

    There is a top-level binary with sub-functions, sort of like how

       git [git options] <git action> [action options]
    
    or

      systemctl [etc.]
    
    work.

    There is a sub command to add a new sub command, which creates the necessary libraries and pre-populates function definitions from a template; the template includes short and long usage functions, so that

      cbap -h
    
    or

      cbap pipeline -h
    
    give useful and reasonable advice.

    There are subcommands for manipulating base images, components (which are images with specific properties for use as containers in the pipelines), and pipelines themselves. A LOT of code is for testing, to make sure that the component and pipeline definitions are correctly formatted. (Pipelines are specified in something-almost-TOML, so there is code to parse toml, convert sections to arrays, etc., while components are specified as simple key=value files, so there is code to parse those, extract LHS and RHS, perform schema validation, etc.).
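
    A hypothetical sketch of the key=value side of that parsing (file name and all made up, not the actual code):

      # read LHS=RHS pairs, skipping blanks and comments, before schema validation runs
      while IFS='=' read -r lhs rhs; do
          case $lhs in ''|'#'*) continue ;; esac
          printf 'key=%s value=%s\n' "$lhs" "$rhs"
      done < component.conf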

    Since pipeline components can share properties, there is code to find common properties in var and etc files, specify component properties, etc.

    There are a lot of user, group, directory, and FIFO manipulation functions tailored to the security requirements: when a pipeline is set up, users and groups and SEL types and MCS categories are generated and applied, then mapped into the service files that start the components (so there is a lot of systemd manipulation as well).

    Probably the single biggest set of calls are the functions that get/set component properties (which are really container properties) and allow us to use data-driven container definitions, with each property having a get function, a validation function, and an inline (in a pipeline) version, for maximum flexibility.

    Finally, there is code that uses a lot of bash references to set variables either from files, the environment, or the command line, so that we can test rapidly.
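
    For illustration, a sketch of the nameref pattern that kind of layered lookup typically uses (names are invented, not the project's code):

      # resolve a setting from the command line, then the environment, then a file default
      set_from_any() {
          local -n target=$1                  # bash nameref to the caller's variable
          local key=$2 cli_value=$3 conf=$4
          if [ -n "$cli_value" ]; then
              target=$cli_value
          elif [ -n "${!key:-}" ]; then       # indirect lookup in the environment
              target=${!key}
          else
              target=$(awk -F= -v k="$key" '$1 == k { print $2; exit }' "$conf")
          fi
      }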

    It also supports four levels of user: maintainers (people who work on the code itself), developers (people who develop component definitions), integrators (people who build pipelines from components), and operators (people who install pipelines), with the ability to copy and package itself for export to users at any of those levels (there is a lot of data-driven, limited recursive stuff happening therein).

    Since target systems can be any Linux, it uses makeself to package and extract itself.

    For example, an integrator can create a pipeline definition, which will produce a makeself file that, when run on the target system, will create all users, groups, directories, FIFOs (the inter-component IPC), apply DAC and MAC, create systemd files, copy images to each user, and launch the pipeline - with a delete option to undo all of that.

    There is some seccomp in there as well, but we've paused that as we need to find the right balance between allow- and deny- listing.

    (Yes, I use shellcheck. Religiously. :->)

    • ndsipa_pomu 19 days ago

      That sounds both great and horrifying

RodgerTheGreat 20 days ago

At one point I considered writing an interpreter for my scripting language Lil in bash to maximize portability, but quickly realized that floating-point arithmetic would be extremely painful (can't even necessarily depend on bc/dc being available in every environment) and some of the machines in my arsenal have older versions of bash with very limited support for associative arrays. My compromise was to instead target AWK, which is a much more pleasant general-purpose language than most shells, and available in any POSIX environment: https://beyondloom.com/blog/lila.html

  • seiferteric 20 days ago

    > can't even necessarily depend on bc/dc being available in every environment

    Just discovered this myself, also trying to make a language target shell. Was really surprised bc/dc wasn't present, I think in the Ubuntu install under WSL2. Also using awk for floating point math, but just shelling out to it.

    • RodgerTheGreat 20 days ago

      Yep! I considered shelling out to AWK for the same reason, as a bc/dc alternative, but rapidly found that nearly everything else bash could do was easier and less error-prone (and workable on much older systems) if I moved the whole script into pure AWK.

kamaal 20 days ago

As someone who has written and maintained large Perl programs at various points in my career, I can tell you there is a reason why people do this: Java- and Python-like languages work fine when interfaces and formats are defined and you often have zero OS interaction. That is, you use JSON/XML/YAML or interact with a database or other programs via http(s). This creates an ideal situation where these languages can shine.

When people do work that is heavy on text processing and OS interaction, languages like Java and Python are a giant pain, and you begin to notice how Shell/Perl make this kind of work a breeze.

This means nearly every automation task, chaotic non-standard interfaces, working with text/log files, or other data formats that are not structured (or at least not well enough). Add to this Perl's commitment to backwards compatibility, a large install base, and performance, and you have zero alternatives apart from Perl if you are working on these kinds of tasks.

I have long believed that a big reason for so much manual drudgery these days, with large companies hiring thousands of people to do trivially automatable tasks, is that Perl usage dropped. People attempt to use Python or Java for some big automation task and quit soon enough when they are faced with the magnitude of verbosity and the overall size of code they have to churn out and maintain to get it done.

  • stackskipton 19 days ago

    Strong disagree that it's because of "omg, no more Perl". It's just that complexity got cranked up, the Perl person stitching scripts together found it had become their full-time job, and obviously Perl only got you so far. So now you have an additional FTE who is probably expensive.

    Also, if the end user is on Windows, there is already a Perl-like option on their desktop: it's called PowerShell, and it will perform similarly to Perl.

  • GoblinSlayer 19 days ago

    I did a big automation task in native code, because efficiency is desirable in such cases, while bash+grep favor running a new process for every text line. To be efficient you need to minimize work, and thus batch and deduplicate it, which means handling data in a stateful manner while tracking deduplication context. That is easier in a proper programming language, whereas bash+grep favor stateless text processing and thus lead to a lot of duplicated work.

    Another strategy for minimizing work is accurate filtering, which is easier to express imperatively, with nice formatting, in a proper programming language; grep and regexes are completely unsuitable for this. And if you use a line-separated format, git rewards you with escaping to accommodate unusual filenames, which is inconsistently supported and can be disabled by asking for null-terminated output with the -z option. I don't think bash has a good way to handle that, while in a sufficiently low-level language it's natural, and it also allows incremental streaming so you don't have to start a new process for every text line.

    As a bonus you can use single code base for everything no matter if there's http or something else in the line.

  • chubot 19 days ago

    Yes I agree - my favorite language is Python, but it can be annoying/inefficient for certain low-level OS things. This is why I created https://www.oilshell.org (and the linked wiki page)

    A few links for context:

    Are you reinventing Perl?

    https://www.oilshell.org/blog/2021/01/why-a-new-shell.html#a...

    The Unix Shell Should Evolve Like Perl 5 (with compatible upgrade options, rather than a big bang like Perl 6/Raku)

    https://www.oilshell.org/blog/2020/07/blog-roadmap.html#the-...

    A Tour of YSH - https://www.oilshell.org/release/latest/doc/ysh-tour.html

  • hiAndrewQuinn 19 days ago

    I've been seriously considering learning some Perl 5-fu ever since I realized it's installed by default on so many Linux and BSD systems. I think even OpenBSD comes with perl installed.

    That may not seem like a big advantage until you're working in an environment where you don't actually have the advantage of just installing things from the open Internet (or reaching the Internet at all).

ulrischa 20 days ago

I think the main problem with writing large programs as bash scripts is that shell scripting languages were never really designed for complexity. They excel at orchestrating small commands and gluing together existing tools in a quick, exploratory way. But when you start pushing beyond a few hundred lines of Bash, you run into a series of limitations that make long-term maintenance and scalability a headache.

First, there’s the issue of readability. Bash's syntax can become downright cryptic as it grows. Variable scoping rules are subtle, error handling is primitive, and string handling quickly becomes messy. These factors translate into code that’s harder to maintain and reason about. As a result, future maintainers are likely to waste time deciphering what’s going on, and they’ll also have a harder time confidently making changes.

Next, there’s the lack of robust tooling. With more mature languages, you get static analysis tools, linters, and debuggers that help you spot common mistakes early on. For bash, most of these are either missing or extremely limited. Without these guardrails, large bash programs are more prone to silent errors, regressions, and subtle bugs.

Then there’s testing. While you can test bash scripts, the process is often more cumbersome. Complex logic or data structures make it even trickier. Plus, handling edge cases—like whitespace in filenames or unexpected environment conditions—means you end up writing a ton of defensive code that’s painful to verify thoroughly.
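
For example, the whitespace edge case alone pushes you toward defensive idioms like null-delimited filename handling:

    # one common idiom: filenames with spaces or newlines survive the pipe intact
    find . -name '*.log' -print0 | while IFS= read -r -d '' f; do
        gzip -- "$f"
    done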

Finally, the ecosystem just isn’t built for large-scale Bash development. You lose out on modularity, package management, standardized dependency handling, and all the other modern development patterns that languages like Python or Go provide. Over time, these deficits accumulate and slow you down.

I think using Bash for one-off tasks or simple automation is fine — it's what it’s good at. But when you start thinking of building something substantial, you’re usually better off reaching for a language designed for building and maintaining complex applications. It saves time in the long run, even if the initial learning curve or setup might be slightly higher.

  • ndsipa_pomu 20 days ago

    Using ShellCheck as a linter can catch a lot of the common footguns and there are a LOT of footguns and/or unexpected behaviour that can catch out even experienced Bash writers. However, Bash/shell occupies a unique place in the hierarchy of languages in that it's available almost everywhere and will still be around in 30 years. If you want a program that will run almost everywhere and still run in 30 years time, then shell/Bash is a good choice.
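
    A tiny example of the kind of footgun ShellCheck catches (SC2086, if I remember the code right):

        rm -rf $prefix/cache     # unquoted: an empty or space-containing $prefix ends badly
        rm -rf "$prefix/cache"   # the quoted form does what was intended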

    • norir 19 days ago

      I'd almost always prefer C99 to shell for anything more than 100 lines of code or so. There is even a project I saw here recently that can bootstrap tcc in pure shell (which can then be used to bootstrap gcc). I'm somewhat skeptical that bash will still be used for anything but legacy scripts in 30 years, despite its impressive longevity to this point, but I could sadly be proven wrong.

      • ndsipa_pomu 19 days ago

        So, if you wanted to write something that you would be pretty sure could easily run on machines in 30 years time, what would you use?

        I don't think c99 would be a good choice, as processors will likely be different in 30 years' time. If you had your program on e.g. a USB stick and managed to load it onto a machine, it'd only be able to run if you had the same architecture. Even nowadays, you'd run into difficulties with ARM and x86 differences.

        Some kind of bytecode language might seem better (e.g. java), but I have my doubts about backwards compatibility. I wonder if Java code from 20 years ago would just run happily on a new Java version. However, there's also the issue of Java not being installed everywhere.

        • wiseowise 19 days ago

          > I wonder if Java code from 20 years ago would just run happily on a new Java version.

          Absolutely.

          • ndsipa_pomu 19 days ago

            That's good to know. I haven't touched Java myself in years, but at work I hear of developers complaining that our code runs on Java 11 and they haven't been given the time to move it to a more recent version.

            Personally, I've encountered great difficulties with some old SAN software that required a Java 6 web plugin that I couldn't get running on anything other than Internet Explorer - I kept an XP VM with the correct version just for that. I suspect a large part of the problem was that the software incorrectly attempted to check that the version was at least 6, but failed when the version was newer (they obviously didn't test it when later versions got released).

  • JoyfulTurkey 19 days ago

    Dealing with this at work right now. Digging through thousands of lines of Bash. This script wasn’t written a long time ago, so no clue why they went with Bash.

    The script works but it always feels like something is going to break if I look at the code the wrong way.

    • chubot 19 days ago

      If you have thousands of lines of bash, don't like maintaining it, but don't necessarily want to rewrite the whole thing at once, that's what https://www.oilshell.org/ is for!

      See my comment here, with some details: https://news.ycombinator.com/item?id=42354095

      (I created the project and the wiki page. Right now the best bet is to join https://oilshell.zulipchat.com/ if it interests you. People who want to test it out should be comfortable with compiling source tarballs, which is generally trivial because shells have almost no dependencies.)

      The first step is:

          shopt --set strict:all  # at the top of the file
      
      Or to run under bash

          shopt -s strict:all 2>/dev/null || true
      
      And then run with "osh myscript.bash"

      OSH should run your script exactly the same as bash, but with better error messages, and precise source locations.

      And you will get some strictness errors, which can help catch coding bugs. It's a little like ShellCheck, except it can detect things at runtime, whereas ShellCheck can't.

  • anthk 20 days ago

    Bash/ksh have -x as a debug/tracing argument.
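
    For example:

        set -x                                # trace each command as it runs
        PS4='+ ${BASH_SOURCE}:${LINENO}: '    # bash: richer prefix for the trace output
        bash -x ./myscript.sh                 # or trace a whole script from the outside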

voxadam 20 days ago

I'm pretty sure the largest handwritten shell program I used back in the day on a regular basis was abcde (A Better CD Encoder)[1] which clocks in at ~5500 LOC.[2]

[1] https://abcde.einval.com

[2] https://git.einval.com/cgi-bin/gitweb.cgi?p=abcde.git;a=blob...

  • lelandfe 20 days ago

    Not that I'd know anything about it, but this was one of the tools recommended on What.CD back in the day. Along with Max (my friends tell me) https://github.com/sbooth/Max

    • voxadam 20 days ago

      Probably every rip I posted to What.CD, and OiNK before it, was created using abcde.

      Allegedly.

      • lelandfe 19 days ago

        The greatest loss was truly not even What.CD the incredible tracker but the forums. I've never again found a more concentrated group of people with taste.

      • throwup238 20 days ago

        You gotta use the SWIM acronym, for the ultimate callback to the aughts.

        • voxadam 20 days ago

          Honestly, I came so close, so damn close. :)

  • dlcarrier 20 days ago

    I've used that before. It works really well and was pretty easy to use. I had no idea the whole thing is just a giant shell script.

ykonstant 20 days ago

Many of these programs are true gems; the rkhunter script, for instance, is both nice code (it can be improved) and a treasure trove of information*.

Note that much of the code size of these scripts is dedicated to ensuring that the right utilities exist across the various platforms and perform as expected with their various command line options. This is the worst pain point of any serious shell script author, even worse than signals and subprocesses (unless one enjoys the pain).

*Information that, I would argue, would be less transparent if rkhunter had been written in a "proper" programming language. It might be shoved off in some records in data structures to be retrieved; actions might be complex combinations of various functions---or, woe, methods and classes---on nested data structures; logging could be JSON-Bourned into pieces and compressed in some database to be accessed via other methods and so on.

Shell scripts, precisely due to the lack of such complex tools, tend to "spill the beans" on what is happening. This makes rkhunter, for instance, a decent documentation of various exploits and rootkits without having to dig into file upon file, structure upon structure, DB upon DB.

cperciva 20 days ago

The FreeBSD Update client is about 3600 lines of sh code. Not huge compared to some of the other programs mentioned here, but I'm inclined to say that "tool for updating an entire operating system" is a pretty hefty amount of functionality.

The code which builds the updates probably adds up to more lines, but that's split across many files.

xyst 20 days ago

It’s “only” 7.1K LoC, but my favorite is the “acme.sh” script, which is used to issue and renew certs from Let's Encrypt.

https://github.com/acmesh-official/acme.sh/blob/master/acme....

  • Brian_K_White 20 days ago

    already in the list

    • dizhn 20 days ago

      Parent might have meant that they like it. I was going to say the same thing. That one and distrobox are quite impressive in how well they work.

      • Macha 19 days ago

        Personally I abandoned acme.sh for lego because it didn't work well. For example, they lost track of the environment variables they were using for the server in their acme dns plugin across versions, thereby breaking what's supposed to be a fire and forget process.

        That, and the CA that was exploiting shell injection in acme.sh, convinced me it was time to move on.

        • dizhn 18 days ago

          I have also moved everything over to working with Caddy. It's so convenient that for one domain I even set up a little job to copy over the web certificate from it to be used for my smtp/imap.

eschneider 19 days ago

Sometimes shell is the only thing you can guarantee is available and life is such you have to have portability, but in general, if you've got an enormous shell app, you might want to rethink your life choices. :/

  • GuB-42 19 days ago

    The problem is: which shell? Bash is far from being universal. Also, there is usually not much you can do with a shell without commands (find, grep, sed, cat, head, tail, cut...), and commands have their own portability issues.

    Targeting busybox may be your best bet, but once you are leaving your typical Linux system, writing portable (Bourne) shell scripts becomes hard to impossible.
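
    A small illustration of where the pain starts:

        [[ $name == foo* ]] && echo match          # bashism: fails under plain POSIX sh (e.g. dash)
        case $name in foo*) echo match ;; esac     # POSIX: works in any Bourne-style shell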

  • norir 19 days ago

    I hear this fairly often and I'm genuinely curious how often you have shell but _not_ a c compiler or the ability to install a c compiler via the shell. Once you have a c compiler, you can break out of shell and either write c programs that the shell script composes or install a better scripting language like lua. At this point in time, it feels quite niche to me that one would _need_ to exclusively use shell.

    • bhawks 19 days ago

      There are plenty of contexts where you won't have a compiler today - embedded (optimize for space) and very security hardened deployments (minimize attack surface).

      Historically people used to sell compilers - so minimizing installation to dev machines probably was a savings (and in those times space was at a premium everywhere).

      That said - I am with you, give me any other programming language besides shell!

    • chubot 19 days ago

      That's what I thought -- I thought that OS X was the main Unix where it is "annoying" to get a C compiler (huge XCode thing IIRC), and it isn't used for servers much.

      But people have told me stories about working for the government

      (my background was more "big tech", and video games, which are both extremely different)

      Some government/defense systems are extremely locked down, and they don't have C compilers

      So people make do with crazy shell script hacks. This is obviously suboptimal, but it is not that surprising in retrospect!

      • mdaniel 19 days ago

        > it is "annoying" to get a C compiler (huge XCode thing IIRC)

        FWIW, there is an apple.com .dmg "command line tools" package that is much smaller than Xcode proper. But I actually came here to say that, to the very best of my knowledge, a $(ruby -e $(curl ...)) to install brew will then download pre-built binaries from GitHub's docker registry.

sigoden 20 days ago

If you're looking for a tool to simplify the building of big shell programs, I highly recommend using argc (https://github.com/sigoden/argc). It's a powerful Bash CLI framework that significantly simplifies the process of developing feature-rich command-line interfaces.

jefftk 19 days ago

Back when I worked on mod_pagespeed we wrote shell scripts for our end-to-end tests. This was expedient when getting started, but then we just kept using it long past when we should have switched away. At one point I got buy-in for switching to python, but (inexperience) I thought the right way to do it was to build up a parallel set of tests in python and then switch over once everything had been ported. This, of course, didn't work out. If I were doing this now I'd do it incrementally, since there's no reason you can't have a mix of shell and python during the transition.

I count 10k lines of hand-written bash in the system tests:

    $ git clone git@github.com:apache/incubator-pagespeed-mod.git
    $ git clone git@github.com:apache/incubator-pagespeed-ngx.git
    $ find incubator-pagespeed-* | \
         grep sh$ | \
         grep system_test | \
         xargs cat | \
         wc -l
    10579

  • gjvc 19 days ago

    lines ending in | do not require \

    • michaelcampbell 19 days ago

      I (also?) never knew this; thanks.

      My old finger memory will still probably put them in, alas.

    • greazy 19 days ago

      In my experience they do. Could it be related to strict bash mode?

      • gjvc 19 days ago

        dunno. give some evidence.

zabzonk 20 days ago

Don't know about the biggest, although it was quite big, but the best shell program I ever wrote was in ReXX for a couple of IBM 4381s running VM/CMS, and it did distributed printing across a number of physical sites. It saved us a ton of money, as it only needed a cheap serial terminal and printer, when IBM wanted to charge us an ungodly amount for their own printers and associated comms. One of the pieces of software I'm most proud of (written in the mid 1980s), to this day.

  • banku_brougham 20 days ago

    Well, you gotta post this somewhere so we can see

    • zabzonk 20 days ago

      Like much of what i wrote before the days of distributed version control, this is now lost in the mists of time. And the code wouldn't belong to me anyway.

  • walterbell 19 days ago

    Thanks for the reminder that Rexx was open-sourced!

    https://rexxinfo.org

    • kristopolous 19 days ago

      I remember an article I read, probably around 1997, about CGI languages. It considered, I believe, Rexx, Tcl, Perl, and Python. I bet it's at archive.org somewhere.

      hah, found it. Byte, 1998: https://archive.org/details/199806_byte_magazine_vol_23_06_w...

      I tried metacard after reading that. It ran on linux: http://www.sai.msu.su/sal/F/5/metacard.gif ... I think I might have written some things with it. Cool, good luck on me trying to find 26 year old software.

      You can totally still run this if you want btw - just download some old linux ISOs from archive.org and install it in a VM ... hope they survive all their lawsuits; we're so lucky to have them around.

michaelcampbell 19 days ago

Probably my largest one was an order of magnitude smaller than these for the most part, but it checked whether my VPN was up and started it if not. (And restarted various media-based docker containers.)

If it was up, it would do a speedcheck and record that for the IP the VPN was using, then check how that speed compared to the average, with a standard deviation and z-score. It would then calculate how long it should wait before it recycled the VPN client. Slow VPN endpoints would cycle quicker; faster ones would wait longer to cycle. Speeds outside a standard deviation or so would get rechecked sooner than the last delta; speeds within one z would expand the delta before it checked again.
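
A hypothetical sketch of that logic (not the original script), with awk doing the statistics:

    # z-score of the latest speed against the history for this IP, then adjust the delay
    z=$(awk -v s="$speed" '{ n++; sum += $1; sq += $1 * $1 }
        END { m = sum / n; sd = sqrt(sq / n - m * m)
              if (sd > 0) print (s - m) / sd; else print 0 }' speeds.log)
    if awk -v z="$z" 'BEGIN { exit !(z < -1) }'; then
        delay=$(( delay / 2 ))    # more than a standard deviation slow: recycle sooner
    else
        delay=$(( delay * 2 ))    # within a z of the mean: back off before rechecking
    fi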

Another one about that size would, based on current time, scrape the local weather and sunup/sundown times for my lat/long, and determine how long to wait before turning on an outdoor hose, and for how long to run it via X10 with a switch on the laptop that was using a serial port to hook into the X10 devices. The hose was attached to a sprinkler on my roof which would spray down the roof to cool it off. Hotter (and sunnier) weather would run longer and wait shorter, and vice versa. I live in the US South where shedding those BTUs via evaporation did make a difference in my air conditioning power use.

  • Y_Y 19 days ago

    For those of you not familiar with "British Thermal Units", they're about 7e-14 firkin square furlongs per square fortnight.

  • eszed 19 days ago

    These are my two favorite on the page, and seem somehow emblematic of the "hacker spirit".

tpoacher 19 days ago

I'm writing a ticketing manager for the terminal entirely in bash. Reasonably non-trivial project, and it's been pretty enjoyable working "exclusively" with bash. ("exclusively" here used in quotes, because the whole point of a shell scripting language is to act as a glue between smaller programs or core utilities in the first place, which obviously may well have been written in other languages. but you get the point).

Having said that, if I were to start experimenting with an altogether different shell, I would be very tempted to try jshell!

Incidentally, I hate when projects say stuff like "Oils is our upgrade path from bash to a better language and runtime". Whether a change of this kind is an "upgrade" is completely subjective, and the wording is unnecessarily haughty / dismissive. And very often you realise that projects who say that kind of thing are basically just using the underlying tech wrongly, and trying to reinvent the wheel.

Honestly, I've almost developed a knee reflex to seeing the words "upgrade" and "better" in this kind of context by now. Oils may be a cool project but that description is not making me want to find out more about it.

  • bbkane 19 days ago

    Man if you think Bash doesn't need an upgrade, more power to you, but every time I use it for anything slightly complicated I regret it, so I'm firmly in the "dear Lord, let's find a smooth upgrade from Bash" camp and I'm excited about Oils

    • tpoacher 16 days ago

      Well I didn't necessarily mean that Bash is perfect and could never be improved; or that Oils may not have good ideas that are an 'improvement' over some things in bash.

      I meant that, if I come up with a project called "Spills: an improved Oils without all the awful warts", this says nothing good or meaningful about my project itself, and all it really does is make a casual implication that Oils is crap. So such a sentence would (personally) put me off Spills, rather than get me excited about it. Especially if I was already happy with Oils and in fact found it enjoyable, and certainly not having "awful warts".

      Yes you can go 'delve' into the project to figure out if and why Spills is actually better than Oils (and if and why Oils is 'crap' according to your tagline), but a tagline is chosen for a reason. It's a single sentence you use to describe and sell your project with. If your best tagline is "that other product is crap" then I'm not that interested to do any 'delving' in the first place. That was my point.

      Also, I think partly the reason bash gets a bad rep is because it's a scripting language, and the canonical one for that matter. A lot of the time people get frustrated with bash, I find it's because they're trying to use it as a 'system' language and get everything done via bash. But you're not supposed to. It's a scripting language, it's intended as 'glue'. Where system languages rely on external libraries, bash relies on external programs. You want to validate your inputs? Use a validator program. You want to operate on specific types? Use a type-specific program. If you need type-specific validation and you're doing it in bash, then of course you're going to get frustrated; it probably can be done, and possibly even well, but the language was just never designed for that kind of thing and you'd have to get knee-deep into arcane hackery rather than have lovely clean maintainable code.

      • bbkane 13 days ago

        I respect that, and if I was happier with Bash I would probably agree with you :)

oneeyedpigeon 19 days ago

On a macOS machine, this:

  $ file /usr/bin/* | grep "shell script" | cut -f1 -d':' | xargs wc -l | sort -n
gives me:

  6431 /usr/bin/tkcon
but that's another Tk script disguised as a shell script; the next is:

  1030 /usr/bin/dtruss
which is a shell script wrapper around dtrace.

gosub100 19 days ago

Since the topic is shell, can I shamelessly ask a question?

I'm an SRE for a service everyone has heard of. I have inadvertently pasted multi-line text into my terminal prompt multiple times now, which has attempted to run each line as a command. I see there is a way to disable this at the shell for each client, but what about at the server level? That way I could enforce it as a policy, and not have to protect every single user (including myself) individually. Said differently, I want to keep everyone who SSHes into a prod machine from being able to paste and execute multiple lines. But not forbid paste entirely.

The only thing I could think of would be to recompile bash and detect if the input was from a tty. If so, require at least 200ms between commands, and error out if the threshold exceeded. This would still allow the first pasted command to run, however.

  • porridgeraisin 19 days ago

    Bracketed paste might help you. It's an option for readline so it goes in ~/.inputrc. There's a way to set these options in bashrc as well which I don't remember.

    It inserts a control sequence before and after the pasted contents that makes bash not execute all the lines. Instead it keeps them in the command line, after which you can choose to execute all of them in one go with Enter or cancel with Ctrl-C.
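
    The readline option itself, for reference:

      # ~/.inputrc
      set enable-bracketed-paste on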

    • teo_zero 19 days ago

      Everything you can do in inputrc can be done in bashrc if you prepend "bind". In this case:

        bind 'set enable-bracketed-paste on'

  • TacticalCoder 19 days ago

    Not an answer to your question but here's a "fun" thing I used to do... If you want to run a program from the CLI, which blocks you terminal (say an xterm), you can use that terminal as a temporary paste buffer. But with a trick.

    Imagine you want to run, say, Firefox like that (say because you'd like to see the stdin/stderr output of what's going on without having to find which log file it's writing to: it's really just a silly example):

        xterm>  firefox
        <-- the xterm is now "stuck" here (until Firefox exits)
    
    if you now write, into that "blocked" xterm, the things you write shall execute when you exit/kill Firefox:

        Hello, world!
    
    But one thing I used to do all the time and still occasionally do, first do this:

        xterm> firefox
        <-- the xterm is now "stuck" here (until Firefox exits)
        cat > /dev/null
    
    You can now use that xterm as a temp paste buffer.

    So, yup, a good old cat > /dev/null works wonder.

  • fargle 19 days ago

    everybody works differently. what seems like a sensible guardrail for you would be extremely annoying for others.

    so whatever you do, it should be a feature, even defaulted on. but never a policy that you enforce to "everyone who ssh into a prod machine"

    if you find something that works well for you, add it as a suggestion to your developer docs.

    • gosub100 19 days ago

      all good points, and it's not a great way to "make friends and influence people" by screwing with their workflow. After making this mistake at least twice myself (mainly due to fumbling with MacOS mouse/keyboard differences on my machine), I just wanted to prevent a disaster in the future from me or anyone else. But alas, I just need to be more careful and encourage others to learn from my mistakes :)

alsetmusic 20 days ago

I love exploring things like this. The demo for ble.sh interactive text editor made me chuckle with delight.

mulle_nat 20 days ago

I think, for sport, I could wrap all the various mulle-sde and mulle-bashfunction files back into one and make it > 100K lines. It wouldn't even be cheating, because it naturally fractalized into multiple sub-projects with sub-components from a monolithic script over time.

anothername12 20 days ago

I would add Bash Forth to that. String-threaded concatenative programming!

transcriptase 20 days ago

Sometimes I do things I know are cursed for the sheer entertainment of being able to say it worked. E.g. my one absurdly complex R script that would write ungodly long bash scripts based on the output of various domain specific packages.

It began:

# Yeah yeah I know

sn9 20 days ago

I think around a decade ago, I tried installing a copy of Mathematica and the installer from Wolfram was a bash program that was over a GB in size.

I tried opening it up just to look at it and most text editors just absolutely choked on it. I can't remember, but it was either Vim xor Emacs that could finally handle opening it.

  • zertrin 20 days ago

    Most likely it embedded a (g)zip inside the shell script? I've seen this frequently.

  • szszrk 20 days ago

    Some installers include binaries inside their shell scripts. So the script extracts data from itself. Not great for transparency, but works and is single file.
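
    A rough sketch of the usual trick (hypothetical, not any specific installer): everything after a marker line is treated as an archive rather than shell code.

      # $install_dir would have been chosen earlier by the installer
      payload=$(awk '/^__ARCHIVE__$/ { print NR + 1; exit }' "$0")
      tail -n +"$payload" "$0" | tar xzf - -C "$install_dir"
      exit 0
      __ARCHIVE__
      (binary tar.gz bytes appended after this line)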

    • anthk 20 days ago

      shar, shell archives.

      • szszrk 19 days ago

        A bit of a pain in the ass in some corporate environments, where binaries are scanned before use by DLP software ;/

fny 19 days ago

I feel like this merits having a Computer Benchmarks Game for different shells.

svilen_dobrev 19 days ago

Around ~2000, my build/install script had to simulate some kind of OO-like inheritance. There was Python, but no one understood it (and even fewer had it installed), so: bash. Aliases had priority over functions, which had priority over whatever executables were found in PATH. So there you go - a whole 3 levels of it, with the lowest (PATH) level being changeable.
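
A tiny illustration of the precedence being exploited (a sketch, not the original script): bash resolves aliases before functions, and functions before anything in PATH.

    shopt -s expand_aliases            # aliases are off by default in non-interactive shells
    build() { echo "function build"; }
    alias build='echo "alias build"'
    build                              # prints "alias build"; unalias it and the function wins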

khushy 19 days ago

Most shell script installers are works of art

pwdisswordfishz 19 days ago

Surprised not to see Arch Linux’s makepkg on the list, btw.

  • NekkoDroid 17 days ago

    Fun fact: makepkg isn't that big. IIRC it's <5k SLOC (I guess a bit bigger than some of the smaller ones on this list though).

rajamaka 20 days ago

Would love to see the same for batch on Windows

  • denistaran 19 days ago

    If you’re scripting on Windows, it’s better to use PowerShell instead of batch. Compared to Bash, PowerShell is also better suited for large scripts because it works with objects rather than plain text. This makes handling structured data like JSON, XML, or command outputs much easier, avoiding the need for error-prone text parsing.

    • shawn_w 19 days ago

      PowerShell is definitely better for new projects, but there's lots of legacy batch files out there.