kragen 2 months ago

I tried this a few years ago; http://canonical.org/~kragen/sw/dev3/colors.html has them as foreground colors and http://canonical.org/~kragen/sw/dev3/colors.2.html has them as background colors. I tested 3-letter words as well as 6-letter words, and used 1 as "l" as well as "I", but I didn't try aghasemi's very productive suggestion of using 5 as S. I don't remember if it didn't occur to me or if I tried it and didn't like the results.

Some of them are pretty #bad (#011 doesn't really look much like "oil") and some, though they read quite well, correspond to awful colors; you might even say, #faeca1 colors. Still, I've made my #bed, #0dd as it may be; now I must #11e in it. I think I've #fed you enough #babb1e for today.

js2 2 months ago

The gist is rather a pipeline of Unix commands with no bash necessarily involved. Here it is in shellcheck-compliant 100% bash:

    #!/usr/bin/env bash
    shopt -s nocasematch
    while read -r word; do
        if [[ $word =~ ^[abcdefoi]{6,6}$ ]]; then
            word=${word//o/0}
            word=${word//i/1}
            word=${word^^}
            printf '#%s\n' "$word"
        fi
    done < /usr/share/dict/words

This could be collapsed to one line with semicolons. On the macOS 12.6 dictionary I get 59 words.

Edit: and in sed which someone just asked me for elsewhere:

    sed -n -e '
    /^[abcdefoi]\{6,6\}$/I {
    s/o/0/g;
    s/i/1/g;
    s/^/#/;
    y/abcdef/ABCDEF/;
    p;}' < /usr/share/dict/words
  • kps 2 months ago

         sed -n -e 'y/abcdefOoIi/ABCDEF0011/' -e 's/^[A-F01]\{6\}$/#&/p' /usr/share/dict/words
    • js2 2 months ago

      I’d golf with you but I think you got a hole in one there. I didn’t spend any time thinking about how to make the sed more compact. I sorta just translated what I’d already written in bash.

  • version_five 2 months ago

    Thanks for this. I'd probably call the original the GNU coreutils version. The linked github also has a sed-only version in the comments. It's instructive to see the different versions.

    • kps 2 months ago

      > I'd probably call the original the GNU coreutils version.

      Why? The only GNUish bit is the grep -P option, which is unnecessary (-E will do as well).

      • version_five 2 months ago

        I would have considered tr to be part of gnu coreutils, awk, not necessarily but the default on a mac is gawk I believe

        • kragen 2 months ago

          tr predates GNU by about a decade.

    • js2 2 months ago

      I just added a sed version as well. I'll have to click through and see how closely it resembles what's in the gist.

      bash is actually pretty powerful if you don't mind its baroque syntax. Writing it in POSIX would be a bit more challenging. You could use a case statement for the pattern matching, but I'm not sure about the substitution.

nine_k 2 months ago

Never mind the colors.

This snippet demonstrates how a number of small tools, each doing its narrow job, strung together via the most trivial interface, produces a non-trivial result.

This composability is still unreachable to the vast majority of GUI tools.

  • vesinisa 2 months ago

    The non-trivial part here is actually the source data (the dict file). It is also its pitfall - after adding 5 for S you should see a litany of plurals. Most dict files (for English anyway) however seem to omit plural nouns. I guess the logic is that in English most plurals are regular, and the naive algorithm for deriving them from the singular forms (correctly most of the time) is quite trivial.

  • throwing_away 2 months ago

    SaaS companies hate this one weird trick!

  • miohtama 2 months ago

    While it is a neat trick as a one-liner, I would recommend against doing anything like this in any software that requires maintenance. The code is hard or impossible to follow and has no comments. It is brittle, and only a few people can understand what it really does. A better option would be 10 lines of Python or JavaScript with some comments.

    • kragen 2 months ago

      I thought it was trivial to understand, though the comment above it helps a lot, and it's maybe an unfair advantage that I'd done the same thing in pretty much the same way four years ago. It probably depends on your background; I wouldn't write it that way for people who didn't know shell, just like I wouldn't write this comment in English for people who speak only Spanish.

      I'm not convinced that it's easier to understand in Python (even though I simplified it a bit, in part because one piece of the Python 3 braindamage was moving string.maketrans to bytes):

          import re
      
      
          def main(words):
              for word in words:
                  word = word.strip().upper()
                  if re.compile(r'[A-FOI]{6}$').match(word):
                      print('#' + word.replace('I', '1').replace('O', '0'))
      
          if __name__ == '__main__':
              main(open('/usr/share/dict/words'))
      
      I think the shell version is clearly better for interactive improvisation, though.
      • js2 2 months ago

        I prefer search with an explicit '^' in the pattern to using match. For a throw-away script I'd probably do this:

            import re
            is_hex_like = re.compile(r"^[a-foi]{6}$", re.I).search
            for word in filter(is_hex_like, open("/usr/share/dict/words")):
                hexword = word.upper().replace("O", "0").replace("I", "1").rstrip()
                print(f"#{hexword}")
        • Too 2 months ago

          findall and multiline mode make it even easier, at the cost of loading the whole file into memory though; for that reason your alternative is probably better

              import re
              wordlist = open("/usr/share/dict/words").read()
              for word in re.findall(r"^[a-foi]{6}$", wordlist, re.IGNORECASE | re.MULTILINE):
                  hexword = word.upper().replace("O", "0").replace("I", "1")
                  print(f"#{hexword}")
        • kragen 2 months ago

          That's nicer than my version! I'm curious why you prefer search(), though.

          • js2 2 months ago

            1. I don't have to remember which implicitly anchors to the start of the string and which doesn't. 2. I prefer the explicitness of '^' (maybe that's just another way of stating (1)). 3. I can use re.M to modify '^' to match at the start of each line on multiline strings, whereas match will still keep searching from the front. 4. The asymmetry of anchoring the front but not the end is weird. Python now has fullmatch, but ugh, just use the pattern for that if you need it. 5. Off the top of my head, I can't think of another language that has a regex function that implicitly anchors the front.
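The anchoring differences listed above can be checked directly. This is a small sketch using only the standard `re` module; the sample strings and the `found` helper are made up for illustration.

```python
import re


def found(result):
    """Turn a match object (or None) into a plain boolean."""
    return result is not None


# match() only tries position 0; search() looks for a starting position.
print(found(re.match(r"bad", "so bad")))
print(found(re.search(r"bad", "so bad")))

# match() anchors the front but not the end; fullmatch() anchors both.
print(found(re.match(r"bad", "badge")))
print(found(re.fullmatch(r"bad", "badge")))

# Explicit anchors with search() behave like fullmatch() here.
print(found(re.search(r"^bad$", "badge")))

# With re.M, '^' and '$' apply per line for search(), while match()
# still only tries the very start of the string.
text = "good\nbad\n"
print(found(re.search(r"^bad$", text, re.M)))
print(found(re.match(r"^bad$", text, re.M)))
```

The two `re.M` calls are the case from point 3: `search` finds the second line, `match` does not.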

            • kragen 2 months ago

              Hmm, I see. Interesting! I think of regexps as state machines, so I think of the implicit loop to find a starting position as extra complexity, which can give rise to for example performance problems, though it's true that in many languages you can't avoid it.

    • rascul 2 months ago

      Comments can be added. Understanding it requires learning the tools. Just like understanding python or javascript requires learning python or javascript. It's not impossible to follow.

    • lrvick 2 months ago

      I understood it instantly on first read. Probably depends on how much shell you write.

pwpwp 2 months ago

It's missing #DADB0D

  • kragen 2 months ago

    I look forward to your improved version that tests against the Cartesian product of /usr/dict/words with itself plus the empty string and maybe some slang words like "bod". I suggest you limit to shortish words before the Cartesian product rather than after.

    • gabrielsroka 2 months ago

      I installed the American English large dictionary on Ubuntu. It has `bod`.

      • kragen 2 months ago

        Nice! I'm just using the 102'401-entry version.

  • kgwxd 2 months ago

    Wish I could say the same.

b800h 2 months ago

Is HEX another of these words which gets erroneously capitalised, like SCRUM or GAP analysis?

  • markrages 2 months ago

    I've noticed that for years in embedded (where we use "Intel HEX" formatted files) but I ascribed it to a field full of eccentric loners doing idiosyncratic things, or some kind of DOS 8.3 brain damage.

Waterluvian 2 months ago

Does anyone have a link to a guide on how to write Python or node or rust programs that behave well with bash? Ie. Streaming inputs and outputs and other things I probably don’t know about?

  • KMnO4 2 months ago

    It’s pretty easy. You have three basic streams:

    1. Stdin - just iterate through sys.stdin

    2. Stdout - regular printing will go there

    3. Stderr - print errors here eg with print(…, file=sys.stderr)

    And then beyond that as long as your script gets invoked by the interpreter (Ie #!/usr/bin/env python) everything will “just work”.
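The three streams described above could look like this in a small filter. This is a sketch, not a definitive template: `hex_filter` is a made-up name, and reporting empty lines on stderr is only there to show a diagnostic going to a different stream from the results.

```python
import io
import re
import sys

HEX_WORD = re.compile(r"^[a-foi]{6}$", re.IGNORECASE)


def hex_filter(lines, out, err):
    """Write hex-like words from `lines` to `out`; diagnostics go to `err`."""
    matched = 0
    for number, line in enumerate(lines, start=1):
        word = line.strip()
        if not word:
            # Diagnostics belong on stderr so they never pollute the pipeline.
            print(f"line {number}: empty line skipped", file=err)
            continue
        if HEX_WORD.search(word):
            code = word.upper().replace("O", "0").replace("I", "1")
            print("#" + code, file=out)
            matched += 1
    return matched


if __name__ == "__main__":
    # Demo on an in-memory stream; in a pipeline, pass sys.stdin instead.
    demo_input = io.StringIO("coffee\n\nzebra\noffice\n")
    hex_filter(demo_input, sys.stdout, sys.stderr)
```

In a real script the last line would be `hex_filter(sys.stdin, sys.stdout, sys.stderr)`, which lets it sit in the middle of a pipe such as `cat words | ./hexwords.py | sort` (the script name is hypothetical).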

    • IgorPartola 2 months ago

      Don’t you also have to keep in mind how often you flush outputs/how you buffer? Encoding? Handle EOF correctly?

      Not saying it’s hard but also it’s not 100% covered by what you said.

      • markrages 2 months ago

        Those are advanced topics and you can look them up if you need them.

        Generally, Python does the right thing by default for scripting use: line buffered, system encoding, EOF handled naturally by the iterator protocol.

    • gnubison 2 months ago

      And preferably use fileinput for the stdin so that you can name files on the command line as well
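A sketch of the `fileinput` approach, assuming the same hex-word filter as elsewhere in the thread. `hex_codes_from` and `main` are illustrative names. When the list of paths is empty, `fileinput.input` falls back to reading standard input, which is what makes the script usable both ways.

```python
import fileinput
import re

HEX_WORD = re.compile(r"^[a-foi]{6}$", re.IGNORECASE)


def hex_codes_from(paths):
    """Yield color codes from the named files, or from stdin if none are given."""
    with fileinput.input(files=paths) as lines:
        for line in lines:
            word = line.strip()
            if HEX_WORD.search(word):
                yield "#" + word.upper().replace("O", "0").replace("I", "1")


def main(argv):
    """Print one code per line; argv is the list of file names (may be empty)."""
    for code in hex_codes_from(argv):
        print(code)
```

A script would finish with `main(sys.argv[1:])` after importing `sys`, so that both `script.py /usr/share/dict/words` and `script.py < /usr/share/dict/words` work.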

    • Calzifer 2 months ago

      And avoid seek. Pipes are not random access. I once tried to use a python library to convert a file from stdin but it failed on a f.seek(0) the library added 'just in case' in the beginning.
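One defensive pattern for the seek problem, sketched under the assumption that the input fits in memory: ask the stream whether it is seekable and only buffer it when it is not. `as_seekable` is a hypothetical helper, and the pipe in the demo stands in for piped stdin.

```python
import io
import os


def as_seekable(stream):
    """Return `stream` if it supports seek(), else an in-memory copy of it."""
    if stream.seekable():
        return stream
    # Pipes are read-once: drain the stream and wrap the bytes instead.
    return io.BytesIO(stream.read())


if __name__ == "__main__":
    # Build a real pipe so the demo input behaves like piped stdin.
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"coffee\n")
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe_end:
        print(pipe_end.seekable())
        buffered = as_seekable(pipe_end)
        buffered.seek(0)  # safe now, even though the pipe itself cannot seek
        print(buffered.read())
```

For standard input the binary stream is `sys.stdin.buffer`; code that never calls `seek` at all does not need the wrapper.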

  • jeroenjanssens 2 months ago

    My book Data Science at the Command Line has a chapter about this that scratches the surface and lists some resources in case you want to dive deeper [1]. I can also recommend checking out packages such as Rich [2] and Click [3], if only to get an idea of the possibilities when it comes to creating command-line tools with Python.

    [1] https://datascienceatthecommandline.com/2e/chapter-4-creatin...

    [2] https://github.com/Textualize/rich

    [3] https://click.palletsprojects.com/en/8.1.x/

  • eyelidlessness 2 months ago

    This is oddly something that some of the earliest Node interfaces do quite well. (I say “oddly” because Node was mostly promoted early on for network/server use cases.) It’s generally not idiomatic in these days of async/await and Web Streams, but streaming IO was a core async primitive from very early on. 0.1.90 for child processes, unspecified for the main process object so possibly from the first release. Granted the interfaces really show their age in terms of incidental complexity, they’re far from being as simple as their shell equivalents. But as far as behaving well, streaming is solid and there’s a wealth of compatibility affordances depending on how portable your script needs to be.

netule 2 months ago

Reminds me of debugging pointer values in C with 0xDEADBEEF.

dwheeler 2 months ago

I appreciate the presence of #C0FFEE.

Can't do computing without that!! :-)

  • layer8 2 months ago

    That color doesn’t look healthy though. ;)

silisili 2 months ago

Fun idea. Perhaps could stretch a little like we did in calculators and add 5 for S, or even 7 for T, but that would likely be a bit less readable.
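To compare substitution sets such as O/I only versus adding 5 for S and 7 for T, the mapping can be made a parameter. This is a sketch: `make_hexifier` is an invented name, and it assumes every key in the mapping is a single ASCII letter.

```python
import re


def make_hexifier(substitutions):
    """Build a word -> color-code function for a letter-to-digit mapping.

    `substitutions` maps lowercase lookalike letters to digits,
    e.g. {"o": "0", "i": "1"}. Keys are assumed to be single ASCII letters.
    """
    letters = "abcdef" + "".join(substitutions)
    pattern = re.compile(rf"^[{letters}]{{6}}$", re.IGNORECASE)
    # The table works on the upper-cased word, so its keys are upper case too.
    table = str.maketrans({k.upper(): v for k, v in substitutions.items()})

    def hexify(word):
        word = word.strip()
        if not pattern.search(word):
            return None  # not expressible with this substitution set
        return "#" + word.upper().translate(table)

    return hexify


if __name__ == "__main__":
    basic = make_hexifier({"o": "0", "i": "1"})
    extended = make_hexifier({"o": "0", "i": "1", "s": "5", "t": "7"})
    for word in ["office", "assess", "tested", "zigzag"]:
        print(word, basic(word), extended(word))
```

Running both functions over the same dictionary shows how many extra words each added substitution buys; whether the results stay readable is the judgment call discussed here.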

  • ghasemi 2 months ago

    I added a comment for 5 vs S. 7/T looks like it's a bit too much :D

  • bawolff 2 months ago

    You could just do full 1337 speek.

    • genewitch 2 months ago

      pager code, probably better. "143" = I love you; but 177427*711773 = what time. I don't miss those days. I never had a pager, and i managed to convince all my friends that they shouldn't, either, by pager bombing them. Pagers are still in use, and they're plaintext over the air so if you live near a place that uses pagers (hospitals still use them, for instance), you can get all the messages in real time. It's the frequency. It's in VHF (iirc) so it goes places microwaves cannot; it's also low bandwidth, so the small spectrum carved out for it is usually enough for hundreds of pagers in the area.

      And since there's no real place to mention this elsewhere, there's a HTML color bot on fediverse (botsin.space) that periodically posts two colors, that work as complements as foreground and background, and vice versa. I haven't seen it in a while, but our little instance has gotten popular so the feed rate is up near a few hundred posts an hour to sift through.

    • mod 2 months ago

      Little town I frequently drive through has a population of 1337.

      I always have a little giggle.

      • hoyd 2 months ago

        what town and country?

        • mod 2 months ago

          I like my pseudo-anonymity here.

          It's in the US. Here's the census data to discover many occurrences of "1337"

          https://www.census.gov/data/tables/time-series/demo/popest/2...

          FWIW the town I'm talking about has a different population listed there, a little bit short. The road sign still says 1337, though, as of Thursday.

    • silisili 2 months ago

      come to think of it, doing a separate list of toLower l -> 1 isn't a bad idea either...

Yenrabbit 2 months ago

It makes me happy that #ACAC1A is about the right colour for the flowers of the sweet acacia tree (a pale yellow).

dspillett 2 months ago

I know this is only looking at single words, so would miss this, but I always like to work ABAD1DEA into PoC work.

  • eyelidlessness 2 months ago

    I like this! I usually try to pick a word/set of words that relates to the subject matter I’m testing, or something off the top of my head when that fails. But ABAD1DEA is a great default for exploratory work.

    This is also an 8 character string, which I had wrongly inferred from usage in existing code to be restricted to certain APIs, but I looked it up and it’s evidently part of CSS Color Module Level 4 and has wide browser support. The one-liner could trivially be expanded to support 8-character codes. Not sure how trivial multiple words would be, my gut says “reasonably so but won’t feel quite so reasonable on one line”. Alas I’m on mobile so I’m not gonna try it right now.

    • dspillett 2 months ago

      Just as RRGGBB has a three-character shorthand, you can use four characters too: RGBA as a shorthand for RRGGBBAA.
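The shorthand rule is that each digit is doubled, so three or four characters expand to six or eight. A minimal sketch (the function name `expand_shorthand` is made up):

```python
import string


def expand_shorthand(code):
    """Expand #RGB or #RGBA to #RRGGBB or #RRGGBBAA; pass long forms through."""
    digits = code.lstrip("#")
    if not digits or any(ch not in string.hexdigits for ch in digits):
        raise ValueError(f"not a hex color: {code!r}")
    if len(digits) in (3, 4):
        # Each shorthand digit stands for that digit repeated twice.
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) not in (6, 8):
        raise ValueError(f"unsupported length: {code!r}")
    return "#" + digits.upper()


if __name__ == "__main__":
    for code in ["#FAB", "#bed", "#F00D", "#ABAD1DEA"]:
        print(code, "->", expand_shorthand(code))
```

This also means 3-, 4-, 6- and 8-letter words are all candidates, not just 6-letter ones.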

1vuio0pswjnm7 2 months ago

Not sure why this is being called "Bash" one-liner. It will work with many shells. It will run noticeably faster in Dash, for example. Test it yourself. Linux chooses Dash for non-interactive use, like this one-line script, because it is faster than Bash.

  • 1vuio0pswjnm7 2 months ago

    Some examples of where one finds Dash (NetBSD-derived Almquist shell, or "ash") in Linux

       The git.kernel.org repository
       Slackware
       Debian 
       Ubuntu
       Gentoo
       Arch initramfs
       Alpine 
       Tiny Core 
       OpenWRT
       Any other distrib that uses Busybox
       Android
    
    What the OP fails to mention is that this shell one-liner (cf. "Bash one-liner"), as written, requires GNU grep, thanks to "-P".

    BusyBox grep does not have a "-P" option.

    In the case of Android, Google uses NetBSD userland programs, e.g., grep, which also does not include PCRE, i.e., "-P".

    https://coral.googlesource.com/android-core/+/3458bb6ce1d3e7...

    https://git.kernel.org/pub/scm/utils/dash/dash.git/

       curl -O https://mirror.rackspace.com/archlinux/iso/2022.10.01/arch/boot/x86_64/initramfs-linux.img
       xz -dc < initramfs-linux.img|cpio -t|grep -m1 usr/bin/ash
    • kps 2 months ago

      It's written with `-P` but doesn't actually need it. Standard `-E` works just fine instead.

      • 1vuio0pswjnm7 2 months ago

        How many "professional" programmers even know the difference between BRE, ERE and PCRE?

        Perhaps this is why use of regex is so controversial amongst a majority of "professional" programmers. They are trying to use PCRE for every pattern matching task, i.e., even ones where it is not necessary, whether it is within their programming language or with command-line utilities. This "Bash one-liner" is a simple example.

        I have reviewed a number of books written about regular expressions and for the most part^1 they focus only on regex as implemented in popular programming languages. That almost invariably is PCRE or some form of PCRE-like pattern matching. There is little distinction, let alone acknowledgment, between PCRE/PCRE-like patterns and anything simpler.

        Not being a "professional" programmer, I use regex everyday but I never (intentionally) use PCRE.^2 Too complicated for my tastes, not to mention slow if using backtracking.

        1. I recall one older book that did include an incomplete table attempting to show which type of regex was used by various UNIX utilities in addition to what regex was used by popular programming languages of the day.

        2. For programs that optionally link to a PCRE library, I re-compile without it.

  • LambdaComplex 2 months ago

    > Linux chooses Dash for non-interactive use

    That entirely depends on the Linux distro.

ratsmack 2 months ago

I don't like using multiple commands.

    mawk 'BEGIN{b = "[abcdefois]"; l = "[a-z]"; W = "^" b l l l l l "$"}; $0 ~ W {print "#" toupper($0);}' /usr/share/dict/words
  • kbr2000 2 months ago

    I came up with:

      gawk 'BEGIN {IGNORECASE=1} ((length($1) == 6) && /^[a-fois]+$/) {gsub(/o/,0);gsub(/i/,1);gsub(/s/,5); print toupper("#"$1)}' /usr/share/dict/words
    
    (caveat: it does not filter out duplicates)
  • adrianmonk 2 months ago

    You can also do it entirely in sed:

        sed -E -e '/^[a-fio]{6}$/!d; y/abcdefioIO/ABCDEF1010/; s/^/#/' /usr/share/dict/words
    • xertopertha 2 months ago

      This produces 35 items. The grep version gives 93

      • adrianmonk 2 months ago

        Yeah, I failed to make the pattern case insensitive.

        Here's a fixed version that also handles S/5:

            sed -E -e '/^[A-FIOSa-fios]{6}$/!d; y/abcdefiosIOS/ABCDEF105105/; s/^/#/' /usr/share/dict/words
  • Keyframe 2 months ago

    you also aren't going to get valid color codes

kgwxd 2 months ago

I wanted a t-shirt that is the color #FAB; and says #FAB; on it, thought it'd be a fun one for digital artists, then I found out how hard it would be to get a t-shirt that matches it just right.

teaearlgraycold 2 months ago

Fun fact: Every Java .class file starts with the magic bytes C0FEBABE

  • belter 2 months ago

    CAFEBABE

    "...We used to go to lunch at a place called St Michael’s Alley. According to local legend, in the deep dark past, the Grateful Dead used to perform there before they made it big. It was a pretty funky place that was definitely a Grateful Dead Kinda Place. When Jerry died, they even put up a little Buddhist-esque shrine. When we used to go there, we referred to the place as Cafe Dead. Somewhere along the line, it was noticed that this was a HEX number. I was re-vamping some file format code and needed a couple of magic numbers: one for the persistent object file, and one for classes. I used CAFEDEAD for the object file format, and in grepping for 4 character hex words that fit after “CAFE” (it seemed to be a good theme) I hit on BABE and decided to use it. At that time, it didn’t seem terribly important or destined to go anywhere but the trash can of history. So CAFEBABE became the class file format, and CAFEDEAD was the persistent object format. But the persistent object facility went away, and along with it went the use of CAFEDEAD – it was eventually replaced by RMI...."

    - James Gosling
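The magic number is easy to check by hand. In this sketch the eight header bytes are written out by hand for illustration rather than read from a real file; the layout assumed is the class-file header: a 4-byte magic followed by 2-byte minor and major version numbers, all big-endian.

```python
import struct

JAVA_CLASS_MAGIC = 0xCAFEBABE


def read_class_header(data):
    """Return (minor, major) version if `data` starts like a class file, else None."""
    if len(data) < 8:
        return None
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != JAVA_CLASS_MAGIC:
        return None
    return minor, major


if __name__ == "__main__":
    # Hand-written header: magic, minor version 0, major version 61.
    header = bytes.fromhex("CAFEBABE0000003D")
    print(read_class_header(header))
    # The spelling with a zero in it is not the magic number.
    print(read_class_header(bytes.fromhex("C0FEBABE0000003D")))
```

For a real file, pass the first eight bytes read in binary mode, for example `open(path, "rb").read(8)`.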

    • jrumbut 2 months ago

      I had the distinct pleasure of discovering CAFEBABE myself, in high school (not sure what direction this is dating myself in but I'll risk it), when I went on a tear of opening odd things in a hex editor.

      Now I will never be able to see it without thinking of this story: https://aphyr.com/posts/341-hexing-the-technical-interview

    • TillE 2 months ago

      I've been using that as my own alternative to DEADBEEF for years, I had no idea it was part of the official Java spec. Maybe it got lodged in my brain subconsciously at some point.

nick0garvey 2 months ago

Interesting one liner but would like to see the colors it generates

pushedx 2 months ago

What about 7 for T and also 3 for E?

  • jaclaz 2 months ago

    E is a legit hex character:

    0123456789ABCDEF

    isn't it?

    The 3 for E in 1337 speak was on numerical calculators that didn't display letters.

    • pushedx 2 months ago

      Using 3 you can get more colors with human readable names, and maybe pick the canonical color for any given word based on some criteria of interestingness.