> Some routers and firewalls silently kill idle TCP connections without telling the application. Some code (like HTTP client libraries and database clients) keeps a pool of TCP connections for reuse, which can be silently invalidated. To solve this you can configure system TCP keepalive. For HTTP you can use the Connection: keep-alive and Keep-Alive: timeout=30, max=1000 headers.
Once a TCP connection has been established there is no state on routers in between the 2 ends of the connection. The issue here is firewalls / NAT entries timing out. And indeed, no RSTs are sent.
We had the issue in K8s with the conntrack timeout set too low.
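For reference, enabling TCP keepalive on a single socket looks roughly like this in Python (a sketch: the TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT option names are Linux-specific, and the timeout values here are arbitrary):

```python
import socket

# Enable TCP keepalive so an idle connection generates probe packets
# instead of being silently dropped by a firewall/NAT in the middle.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# These three tuning knobs exist on Linux; other platforms differ.
if hasattr(socket, "TCP_KEEPIDLE"):
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle secs before probing
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # secs between probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before reset
```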
Note that TCP Keep-Alive might not play well with mobile devices.
Mobile operating systems can use application-level keep-alive packets, because those can be easily attributed to individual applications: an application receives a TCP/UDP packet during low-power CPU sleep mode, asks the system to wake up (e.g. by taking a wake-lock), and the system takes note of who caused the wake-up. TCP Keep-Alive happens below the application level, so it may be disabled even when the application could still be reached.
> A method that returns Optional<T> may return null.
projects that do this drive me bananas
If I had the emotional energy, I'd open a JEP for a new @java.lang.NonNullReference and any type annotated with it would be a compiler error to assign null to it
public interface Alpha {}
@java.lang.NonNullReference
public interface Beta {}
Alpha a = null; // ok
Beta b = null; // compiler error
javac will tolerate this
Beta b;
if (new java.util.Random().nextBoolean()) {
b = getBeta();
} else {
b = newBeta();
}
but I would need to squint at the language specification to see if dead code elimination is a nicety or a formality
Beta b;
if (true) {
b = getBeta();
} else {
b = null; // I believe this will be elided and thus technically legal
}
I question the wisdom of even having Optional<T> in a language with nulls. It would raise some eyebrows if a function in Python returned an Optional type object rather than T | None. You have to do a check either way unless you're doing some cute monad-y stuff.
It works quite well in Scala, which still tolerates nulls due to being on the JVM and having Java interop. Realistically nothing in the language is going to return null, so the only time you might have to care is when you call Java classes, and all of the Java standard library comes scalaified into having no nulls. And yes, there is enough monadic behavior in the standard library to make Option and Either quite useful, instead of just sum types.
Java really suffers with optional because the language has such love for backwards compatibility that it's extremely unlikely that nulls would even be removed from the standard library in the first place. The fact that the ecosystem relies on ugly auto wiring hacks instead of mandating explicit constructors doesn't help either.
> because the language has such love for backwards compatibility
I still remember when Java 9 introduced modules. And I’m currently pulling my hair out because Jakarta EE renamed all javax.* into jakarta.* (Oracle kept the javax trademark when it handed Java EE over), and lots of libs now ship a “-jakarta” variant.
But somehow I still have to deal with nulls everywhere and erased-at-runtime generics because Java loves backwards compatibility so much. The simple fact that all those libs released a “-jakarta” variant proves the ecosystem is actively maintained (plus CVEs mean unmaintained libs aren’t allowed in production), so they could very well release a -jdk25 version with non-null types.
Maybe this is cute monady stuff, but there isn't an equivalent to Optional<Optional<T>> with only null/None. You usually don't directly write that, but you might incidentally instantiate that type when composing generic code, or a container/function won't allow nulls.
> In what context would you not want to treat Optional.of(null) and null as the same?

In any context where you're combining two things that have different kinds of absence. E.g. if you have a cache around an expensive API call, you want to be able to cache null results from that API call, so you need a distinction between "not in the cache" and "cache entry where the API returned null".
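A Python sketch of that cache shape, using a private sentinel to stand in for the outer layer of absence (CachedLookup and _MISSING are illustrative names, not a real API):

```python
_MISSING = object()  # sentinel: never equal to any cached value

class CachedLookup:
    """Cache around an expensive call that may itself return None."""

    def __init__(self, fetch):
        self._fetch = fetch   # the expensive call; may legitimately return None
        self._cache = {}

    def get(self, key):
        value = self._cache.get(key, _MISSING)
        if value is _MISSING:          # absent from cache: do the real call
            value = self._fetch(key)
            self._cache[key] = value   # None results get cached too
        return value
```

The sentinel is exactly the distinction a flat None cannot express: `None` in the cache means "the API returned null", while `_MISSING` means "never asked".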
Java for example has Map.computeIfAbsent. This function returns a nullable value and does not need 2 layers of absence to work. I would personally argue that exposing the 2 levels of absence is exposing an implementation detail, but I'll accept this as a valid usage of double optionals.
computeIfAbsent doesn't need a second kind of absence because it can never return absence. That's fine if that API is a good fit for your use case, but it isn't always (there's a reason Map.get exists as well as Map.computeIfAbsent).
> I would personally argue that exposing the 2 levels of absence is exposing an implementation details
Probably you wouldn't want to return Option<Option<X>> (or Nullable) to outside callers - maybe you want to convert None to one kind of domain-meaningful error and Some(None) to a different kind of domain-meaningful error, maybe you want to take different codepaths to "recover" from the different kinds of absence.
But it's extremely valuable to be able to compose together existing libraries that might use absence to mean something and have them just do the right thing rather than always having to worry about the edge cases where one has a kind of absence that's subtly different from the other's kind of absence. I mean fundamentally you can't ever assume that a random third-party function in Java is safe to call with null, because many of them aren't. But you also can't ever assume that a random third-party function won't return null, because some of them do. So even to just compose two functions you've got to check their docs and think about the behaviour of this special value, and it's just all so avoidable.
In JSON/REST API bindings, where a deserializer maps JSON to language-native object/struct type, I'll often need to know the difference between:
{}
and
{ "foo": null }
and
{ "foo": 42 }
So I'll represent that (in e.g. Rust) as:
struct Whatever {
foo: Option<Option<u32>>,
}
None means not present, Some(None) means present but null, and Some(Some(42)) means present with a value.
I'll often use this in PATCH endpoints, where not-present means to leave the current value alone, null means to unset it, and a value means to set to that value.
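The same three-state PATCH semantics can be sketched in Python, where there is no Option<Option<T>> and a sentinel has to play the role of the outer layer (ABSENT and apply_patch are made-up names for illustration):

```python
ABSENT = object()  # stands in for "key not present in the JSON at all"

def apply_patch(record: dict, patch: dict) -> dict:
    """Three-state PATCH: absent = keep, null = unset, value = set."""
    updated = dict(record)
    for field in ("foo",):               # fields this endpoint accepts
        value = patch.get(field, ABSENT)
        if value is ABSENT:
            continue                      # {}: leave current value alone
        if value is None:
            updated.pop(field, None)      # {"foo": null}: unset the field
        else:
            updated[field] = value        # {"foo": 42}: set to that value
    return updated
```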
How about a situation where the inner Optional<T> is acquired from another system or database, and the outer Optional<Optional<T>> is a local cache of the value. If the outer Optional is empty, then you need to query the other system. If the outer Optional is filled and the inner Optional is empty, then you know that the other system explicitly has no value for the data item, and can skip the query. Seems like using nested optionals would be natural here, although of course alternative representations are possible.
There's a lot of quality-of-life stuff enabled by it in Java, since the base language's equivalents to Optional.empty(), Optional.ofNullable(...).orElse(...), etc are painfully verbose by comparison.
You're far from alone, it does make it a tiny bit easier to see which functions are expected to return null, but that's about it and messing around with it always feels like wasted effort.
Yeah, sure, and thankfully everyone has already switched all their teams to superior languages
Anyway, I believe what you're referring to is the "?" syntax that annotates types in Kotlin but doesn't help the resulting bytecode, which means that every single library ever would need to convert to kotlin to benefit
fun doit() : java.io.InputStream? { return null }
kotlinc test.kt
javap -c test.class
public final java.io.InputStream doit();
Code:
0: aconst_null
1: areturn
So even they didn't have the courtesy of marking a known-nullable result as Optional<java.io.InputStream> when interfacing with the existing Java ecosystem
> Java, C# and JS use UTF-16-like encoding for in-memory string
That’s incorrect for Java, possibly also for C# and JS.
In any language where strings are opaque enough types [1], the in-memory representation is an implementation detail. Java has been such a language since release 9 (https://openjdk.org/jeps/254)
[1] The ‘enough’ is because some languages have fully opaque types, but specify efficiency of some operations and, through them, effectively prescribe implementation details. Having a foreign function interface also often means implementation details cannot be changed, because doing that would break backwards compatibility.
> JS use floating point for all numbers. The max accurate integer is 2⁵³−1
That is incorrect. Much larger integers can be represented exactly, for example 2¹⁰⁰.
What is true is that 2⁵³−1 is the largest integer n such that n-1, n, and n+1 can be represented exactly in an IEEE double. That, in turn, means n == n-1 and n == n+1 both will evaluate to false, as expected in ‘normal’ arithmetic.
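This is easy to check in Python, whose floats are IEEE doubles:

```python
# Many integers far beyond 2**53 are exactly representable;
# any value with a short enough mantissa qualifies.
assert float(2**100) == 2**100          # a power of two: exact
assert float(2**53 - 1) == 2**53 - 1    # still exact

# 2**53 is the first integer whose neighbor is not representable:
# 2**53 + 1 rounds to 2**53, so distinct integers start to collide.
assert float(2**53) == float(2**53 + 1)
assert float(2**53 - 1) != float(2**53 - 2)  # below the limit, no collisions
```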
The representation for C# is very much fixed, as it allows, and very commonly uses, direct access into the string buffer as a ReadOnlySpan<char> or a raw char pointer, where char is the type of UTF-16 code units.
I started to say something about C# strings and then I remembered the clusterfuck when it came to Windows development and strings and depending on which API you call, a string is represented by one of a dozen different ways.
> > Java, C# and JS use UTF-16-like encoding for in-memory string
>
> That’s incorrect for Java,
Maybe so, technically, but if you Base64-encode a string in a language that uses UTF-8 (or UTF-16 with the other endianness) and decode it in Java, Java's UTF-16 representation will be the problem you are dealing with.
That's a nice compendium of tips and useful information.
I wonder if anyone can learn from this. I feel like I only understood what I already knew, or at least was very close to knowing. That's the same thing that happens with teaching manuals about any topic: they're organized in a way that makes sense and it's easy for people who already know the topics, but often very bad at teaching the same topics to an audience that doesn't know anything.
> with teaching manuals about any topic: they're organized in a way that makes sense and it's easy for people who already know the topic
I think that's the reason for a manual's existence. To have a written record so we don't have to trust our memory. This is what most unix manuals are. You already know what the software can do, you just need to remember the specifics of how to get something done.
> often very bad at teaching the same topics to an audience that doesn't know anything.
What you need then is a tutorial (beginner seeking to learn) or a guide (beginner/intermediate seeking to do). Manuals in this case only serve to have better questions (Now you know what you don't know).
For what it's worth, I really enjoyed "Traceroute Isn't Real." I have noticed for quite a while that the data from it is at best patchy, often apparently meaningless. So it's helpful to see the explanation of why that is expected behavior:
I'm not regularly a Python developer and I just spent a ton of time earlier this week tripped up by the default argument being a stored value. I was using it to create an empty set if no set was passed in... but the set was not always empty because it was being reused. Took me forever to figure out what was going on.
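The trap in miniature (add_bad/add_good are illustrative names):

```python
# The trap: a default argument is evaluated once, at definition time,
# so every call without an explicit argument shares one set object.
def add_bad(item, seen=set()):
    seen.add(item)
    return seen

assert add_bad(1) == {1}
assert add_bad(2) == {1, 2}   # surprise: state leaked from the previous call

# The idiomatic fix: default to None and create the set per call.
def add_good(item, seen=None):
    if seen is None:
        seen = set()
    seen.add(item)
    return seen

assert add_good(1) == {1}
assert add_good(2) == {2}     # fresh set each call
```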
I guess the first trap should really be: "You cannot read any CSS property in isolation; just like the name implies, defaults and computed values cascade through all the rules your document ends up using"
CSS cascade for text properties more or less makes sense.
I have been unable to comprehend CSS layout from any perspective: page designer, implementer, user, anything. It must have someone in mind, but I have no idea who that is.
https://every-layout.dev has by far the best explanations and coherent usage of CSS I've encountered since I started doing webdev for a living in 1998.
> Unicode unification. Different characters in different language use the same code point. Different languages' font variants render the same code point differently. 語
This isn't a trap. The given example character means the same thing in Chinese and Japanese, and the Japanese version was imported from China. People from both languages recognize both font variants as the same conceptual character.
The author is making it sound like the letter 'A' in English should have a different code point than an 'A' in French. Or that a lowercase 'a' with the top tail should be a different character than a lowercase 'a' without the top tail.
> There is a negative zero -0.0 which is different from normal zero. The negative zero equals zero when using floating point comparison. Normal zero is treated as "positive zero".
And there are two ways to distinguish negative zero from normal zero: By their integer bit patterns, or by the fact that 1.0/-0.0 == -Inf vs. 1.0/0.0 == +Inf.
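In Python the division trick doesn't apply directly (1.0/-0.0 raises ZeroDivisionError rather than returning -Inf), but the sign and bit-pattern checks are easy to show:

```python
import math
import struct

# -0.0 compares equal to 0.0, so == cannot tell them apart.
assert -0.0 == 0.0

# Distinguish them by sign bit (math.copysign)...
assert math.copysign(1.0, -0.0) == -1.0
assert math.copysign(1.0, 0.0) == 1.0

# ...or by raw IEEE-754 bit pattern.
assert struct.pack(">d", -0.0) != struct.pack(">d", 0.0)
```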
> It's recommended to configure the server's time zone as UTC.
Big yes. I use UTC for servers, logs, photos, and anything that is worth archiving and timestamping properly. Local time is only for colloquial use.
> For integer (low + high) / 2 may overflow. A safer way is low + (high - low) / 2
Yes, but if low and high could be negative numbers, then you've just shifted the overflow to a different range. This matters for general binary search over an integer range, as opposed to unsigned binary search over an array.
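Python integers never overflow, but the failure mode can be simulated by wrapping intermediate results to 32 bits (i32 is an illustrative helper, not a real API):

```python
def i32(x):
    """Wrap x the way 32-bit two's-complement arithmetic would."""
    return (x + 2**31) % 2**32 - 2**31

INT_MAX = 2**31 - 1

low, high = 1, INT_MAX
naive = i32(low + high) // 2        # low + high wraps to a negative value
safe = low + i32(high - low) // 2   # high - low stays in range here
true_mid = (low + high) // 2
assert naive != true_mid            # the naive midpoint is wrong
assert safe == true_mid             # the rewritten form is correct

# The caveat above: with a negative low, even high - low can wrap,
# so the "safe" form only covers the array-index case where low >= 0.
assert i32(INT_MAX - (-INT_MAX)) != 2 * INT_MAX
```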
> C/C++
I'm going to throw in one of my lists of pitfalls - just using integer types and arithmetic correctly in C/C++ is a massive developer trap. That's like the most basic thing in programming. https://www.nayuki.io/page/summary-of-c-cpp-integer-rules
> Rebase can rewrite history
"Can" is a weasel word; rebase does nothing but rewrite history.
> People from both languages recognize both font variants as the same conceptual character.
A character that carries the same concept, yes. A mere "font variant", no. It's absolutely a trap to think you can safely render one regional form in place of the other just because they share a Unicode code point; Japanese people will avoid your product if you do this.
> The author is making it sound like the letter 'A' in English should have a different code point than an 'A' in French. Or that a lowercase 'a' with the top tail should be a different character than a lowercase 'a' without the top tail.
But we do have А and A. Even though they look the same. And unified Han characters are often quite distinct, it tripped me up as a learner of Chinese more than once. For example, a very common character '喝' (drink) looks quite a bit different: https://en.wiktionary.org/wiki/%E5%96%9D - they have a different number of strokes even. And I can't even copy-paste it here to demonstrate, because it changes form once I copy it from the Wikipedia article.
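The asymmetry is easy to verify in Python: the lookalike Latin and Cyrillic letters are separate code points, while the visually distinct Chinese and Japanese forms of a unified Han character share one (escapes used below so the lookalikes can't be confused in source):

```python
import unicodedata

a_cyrillic = "\u0410"   # looks like "A" in most fonts
a_latin = "A"

# Identical glyphs, distinct code points:
assert a_cyrillic != a_latin
assert unicodedata.name(a_cyrillic) == "CYRILLIC CAPITAL LETTER A"
assert unicodedata.name(a_latin) == "LATIN CAPITAL LETTER A"

# One unified code point, regional glyphs left to the font:
assert len("\u559d") == 1   # 喝, the character from the comment above
```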
CSS and C++ both have the “pick a subset and enforce that, or suffer” nature. On my to-do list: make a github action that requires manual override to merge any pull request with a css attribute not already present
I am unsure how this is supposed to work for CSS. To my knowledge, most CSS properties cannot be substituted for each other. If the subset to be enforced is "CSS properties already present", what is a developer supposed to do if their CSS property is not already present? Change the design?
Well, (like C++) new css attributes are constantly added. This means you constantly have to choose between the old way or the new way: either is fine, but “pick old or new at random on a per pull request basis” isn’t.
You seem to assume that old CSS properties can be substituted for new ones. But as I said, to my knowledge this isn’t possible in most cases. Can you give an example of two CSS properties where 'either is fine, but only one should be used'?
Or do you mean something else altogether by 'CSS attributes'?
Regex semantics is subtly different across languages. E.g. a{,3} matches between 0 and 3 "a" characters in Python. In JavaScript it matches the literal string "a{,3}".
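A quick check of the Python half of that claim:

```python
import re

# In Python, an omitted lower bound defaults to 0, so a{,3} means a{0,3}.
assert re.fullmatch(r"a{,3}", "") is not None
assert re.fullmatch(r"a{,3}", "aaa") is not None
assert re.fullmatch(r"a{,3}", "aaaa") is None

# JavaScript instead treats "a{,3}" as the literal characters a { , 3 }.
# The portable spelling is an explicit lower bound:
assert re.fullmatch(r"a{0,3}", "aa") is not None
```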
Regex is more a technique than an actual specification. It would be best to find the time to go and read an introductory book about Theory of Computation where they explain the underlying mechanism.
It's half a chapter in most books I know. Or a subset of this 1h MIT video [0], where the instructor also explains Finite Automata, which is the basic mechanism behind all this stuff.
I'll assume sarcasm (from your comment history), but for people actually taking this at face value: good luck debugging an incorrect regex if you haven't practiced regexes. Especially if it was generated by an LLM.
In some common implementations, if $LANG is set to certain values, [a-z] will fail to match some ASCII letters. This is because not all languages that use Latin characters put Z last in the alphabet.
Try this (you probably need to enable and generate the locale first)
echo y | LANG=lv_LV.UTF-8 grep '[a-z]'
Locales in general should be considered a "trap", just look at Windows CSV separator handling, etc.
It depends on its use, ultimately, but if your goal is to find a string of letters (a common use IMO), you'll want to use something like \p{L} to ensure you don't miss non-ASCII characters.
eta: fixed regex, I had typed \L, shared from my faulty memory.
>A volatile write operation prevents earlier memory operations on the thread from being reordered to occur after the volatile write. A volatile read operation prevents later memory operations on the thread from being reordered to occur before the volatile read
Looks like release/acquire to me? A total ordering would be sequential consistency.
"In C#, using the volatile modifier on a field guarantees that every access to that field is a volatile memory operation"
This makes it sound like you are right and the volatile keyword has the same behaviour as the Volatile class which explicitly says it has acquire-release ordering.
But that seems to contradict "The volatile keyword doesn't provide atomicity for operations other than assignment, doesn't prevent race conditions, and doesn't provide ordering guarantees for other memory operations." from the volatile keyword documentation?
I too interpret those docs as contradictory, and I wonder if, like how Java 5 strengthened volatile semantics, this happened at some point in C# too and the docs weren't updated? Either way the specification, which the docs say is definitive, says it's acquire/release.
"When a field_declaration includes a volatile modifier, the fields introduced by that declaration are volatile fields. [...] For volatile fields, such reordering optimizations are restricted:
A read of a volatile field is called a volatile read. A volatile read has “acquire semantics”; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence.
A write of a volatile field is called a volatile write. A volatile write has “release semantics”; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence."
Acquire-release ordering provides ordering guarantees for all memory operations. If an acquire observes a releases, the thread is also guaranteed to see all the previous writes done by the other thread - regardless of the atomicity of those writes. (There still can't be any other data races though.)
This volatile keyword appears to only consider that specific memory location, whereas the Volatile class seems to implement acquire-release.
Somewhat off topic, but what is a realistic example of where you need atomics with sequential consistency? Like, what useful data structure or pattern requires it? I feel like I've seen every other ordering except that one (and consume) in real world code.
A mutex would be the most trivial example. I don't believe that is possible to implement, in the general case, with only acquire-release.
Sequential consistency mostly becomes relevant when you have more than two threads interacting with both reads and writes. However, if you only have a single consumer (i.e. only one thread reading) or a single producer (i.e. only one thread writing), then acquire-release semantics end up being sequential, since the single consumer/producer implicitly enforces a sequential ordering. I can potentially see some multi-producer multi-consumer lock-free queues needing sequential atomics.
I think it's rare to see atomics with sequential consistency in practice since you typically either choose (1) a mutex to simplify the code at the expense of locking or (2) acquire-release (or weaker) to minimize the synchronization.
No, sorry. I was just remembering where I've typically seen sequential consistency being used. For instance, Peterson's algorithm was what I had in mind. Spinlock is indeed a good example (although a terrible algorithm which I hope you haven't seen used in practice) of a mutex algorithm which only requires acquire-release.
Better to make the requirement explicit, instead of relying on the argument-parsing details of rm or some other command:
# Default message
$ rm -rf "${DIR:?}"
bash: DIR: parameter null or not set
# Custom message
$ rm -rf "${DIR:?It is not set OMG}"
bash: DIR: It is not set OMG
Does anyone truly understand all the little edge cases with CSS?
I've written tons and tons of CSS, have done for a decade. I don't sit and think about the exact interactions; I just know a couple of things that might work if I'm getting something unexpected.
I don't really see how it's possible to commit that to memory, unless I literally start working on an interpreter myself.
I think there can be a different way to think about CSS that can help with that feeling of never understanding it all. Recently I’ve heard people influential in the CSS world describe it as a “suggestion” to the browser. The browser has its own styles, the user might have some custom stylesheet on top of the browser’s version, extensions, etc etc and at some point CSS is really more a long list of “suggestions” about how the site should look.
If you embrace that idea to the fullest, you can create some interesting designs/patterns that can be more resilient. The “downside” is that this way of writing CSS will likely make the pixel-perfect head of the marketing department hate you unless they also write code.
I think it’s also okay to say that some ways of writing css just aren’t relevant anymore. A good parallel in mind is building construction and general carpentry. These days, a quick 2x4 stud wall or insulated concrete forms is fast, cheap, and standardized around the world. However, many craftspeople still exist that will create beautiful joinery for what is ultimately a simple thing, but we can appreciate that art standalone. With CSS, I don’t suspect we will ever need to go back to floats or crazy background images or whatever but it’s nice that those tools are still there for not only the sake of back compat, but also as a way to tinker and “craft” something bespoke for a special project or just because you like it. Education will eventually catch up and grid and flexbox will keep gaining popularity until we decide that it’s too complicated and come up with some new algorithm. That can all be true though and you can bring value as a developer without knowing every single aspect to the public API.
But you need to, you know, actually float something in a text. I think to do it with flexbox/grid you need JS that calculates heights and then manually splits the text into boxes with those heights, so essentially you are doing the rendering yourself.
Also is there another way to position boxes side-by-side in an inline context without float?
"Associativity law and distribution law doesn't strictly hold because of inaccuracy." should be due to precision loss not inaccuracy (they are different).
I would agree on the self part. But otherwise this point of view is distinctly in contrast with the recently republished "work on things that don't scale" article from pg.
Also, as a corollary: earlier today I was thinking about games from the 90s I spent hundreds of hours playing back then. Bolo and Escape Velocity in particular come to mind. They were "simple" games with immense depth. But after some fruitless searching, all I find is scattered questions and comments over the last few years looking for modern equivalents, and a handful of recommendations for games that are no longer developed or are defunct.
There's clear prior evidence of both success and lack of modern supply. Want to have a minimally successful game with an established nostalgic audience? Make a new version of EV that is that game at its core. Don't be fancy, just do the thing Ambrosia did. Expand from there.
The simple model is look forward, and also look back. The first itch is small. For yourself. Find some others with a similar itch.
The hard part is pushing your idea into something more people want. That's where pg's 2013 article comes in to play. That's the hard part.
But that leaves a huge space between 0 and 10 where someone can find a successful niche.
Hell I've thought about trying to make a modern EV. It's tantalizing. But I've never made a game in my life. Ok I've rewritten Game of Life a lot. Good concept to try new ideas with.
But the amount of work for a solo dev trying to recapture that original magic with no background in game dev is daunting.
Which is incredibly painful on Windows systems doing a git clone of shell scripts, since core.autocrlf is often helpful, but not for shell scripts, where it causes the weirdest-looking error messages.
> Division is much slower than multiplication (unless using approximation). Dividing many numbers with one number can be optimized by firstly computing reciprocal then multiply by reciprocal.
It's hardware-level: division requires more CPU cycles than multiplication on most processor architectures, making this optimization pattern relevant across virtually all programming languages.
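A sketch of the reciprocal transformation in Python (divide_all is an illustrative name; note that x * (1/d) can differ from x / d in the last bit because of double rounding):

```python
def divide_all(xs, d):
    """Replace many divisions by one division plus many multiplications."""
    inv = 1.0 / d                      # one (slow) division
    return [x * inv for x in xs]       # many (fast) multiplications

exact = [x / 7.0 for x in (1.0, 2.0, 3.0)]
fast = divide_all((1.0, 2.0, 3.0), 7.0)

# The results agree to within rounding error, not necessarily bit-for-bit.
assert all(abs(a - b) < 1e-15 for a, b in zip(exact, fast))
```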
Adjacent to the bit about 100vh, I’d add an item about the much older fundamental stupidity of viewport units, which Firefox tried to fix but no one else got on board and so they eventually removed their patch and went back to being as broken as everyone else. That is: they ignore scrollbars. Use a platform with 17px-wide scrollbars, and in a document with vertical scrolling, 100vw is now 100% of the viewport width… plus 17px. People get this wrong all the time, and it’s only increased as more platforms have switched to overlay scrollbars by default.
> Blur does not consider ambient things.
Need to make that backdrop blur.
> Whitespace collapse. HTML Whitespace is Broken
I always hated that article title. It’s not broken, it’s just different from what you occasionally expected, mostly for very good reasons that let it do what you expected most of the time. I also disagreed with some significant other parts of it: https://news.ycombinator.com/item?id=42971415.
> Two concepts: code point (rune), grapheme cluster:
I have problems with this lot.
Firstly: add code units. When dealing with UTF-8 and you can access individual bytes, they’re UTF-8 code units. When dealing with UTF-16 and you can access 16-bit values, including surrogates, they’re UTF-16 code units.
Secondly: don’t talk about code points in general, talk about scalar values, which are what code points would have been were it not for that accursèd abomination UTF-16. Specifically, scalar values excludes the surrogate code points.
Sensible things work with scalar values. Python is the only thing I can think of that gets this wrong: its strings are sequences of code points, so you can represent lone surrogates (representable in neither UTF-8 nor UTF-16). This is so stupid.
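A quick demonstration of that point:

```python
# Python strings are sequences of code points, not scalar values, so a
# lone surrogate is a legal str value even though no UTF can encode it.
s = "\ud800"
assert len(s) == 1

try:
    s.encode("utf-8")       # "surrogates not allowed"
    raised = False
except UnicodeEncodeError:
    raised = True
assert raised
```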
Thirdly: ditch the alias “rune”, Go invented it and it never caught on. I don’t know of anything else that uses the term (though maybe a few things do), and the term rune gets used for other unrelated things (e.g. in Svelte 5).
> Some text files have byte order mark (BOM) at the beginning. For example, EF BB BF is a BOM that denotes the file is in UTF-8 encoding. It's mainly used in Windows. Some non-Windows software does not handle BOM.
BOM was about endianness, and U+FEFF. Link to https://en.wikipedia.org/wiki/Byte_order_mark. It made sense in UTF-16 and UTF-32: for UTF-16, if the file starts with FE FF it’s UTF-16BE, if FF FE it’s UTF-16LE. UTF-8 application of it was always a mess, never really caught on, and I think can fairly be considered completely obsolete now: the typical person or software will never encounter it. Very little software will actually handle it any more. Fortunately, it doesn’t tend to matter much: U+FEFF is the BOM for a reason, it’s ZERO WIDTH NO-BREAK SPACE.
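In Python terms ("utf-8-sig" is the codec that strips a leading UTF-8 BOM):

```python
import codecs

# The UTF-8 BOM is the bytes EF BB BF: U+FEFF encoded as UTF-8.
assert codecs.BOM_UTF8 == b"\xef\xbb\xbf"

data = codecs.BOM_UTF8 + "hello".encode("utf-8")

# "utf-8-sig" strips a leading BOM; plain "utf-8" keeps it as a
# ZERO WIDTH NO-BREAK SPACE at the start of the string.
assert data.decode("utf-8-sig") == "hello"
assert data.decode("utf-8") == "\ufeffhello"
```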
> YAML:
I’d add “disallows Tab indentation”. It’s the only thing I can immediately think of that’s like that. (Make is roughly the opposite, recipe lines having to start with a single Tab. Python doesn’t let you mix spaces and tabs. These are all I can immediately think of.)
Now, you can try to put in an HTTP Keep-Alive, but that will not help you. The HTTP Keep-Alive is merely for connection re-use at the HTTP level, i.e. it doesn't close the connection: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...
An HTTP Keep-Alive does not generate any packets, it merely postpones the close.
A TCP Keep-Alive generates packets which reset the timers.
Python has Optional[T] defined as T | None in the stdlib https://docs.python.org/3/library/typing.html#typing.Optiona...
In what context would you not want to treat Optional.of(null) and null as the same? It shouldn't be a big deal.
In any context where you're combining two things that have different kinds of absence. E.g. if you have a cache around an expensive API call, you want to be able to cache null results from that API call, so you need a distinction between "not in the cache" and "cache entry where the API returned null".
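A minimal Python sketch of that cache, using a sentinel object as the analogue of Optional<Optional<T>> (the class and function names here are hypothetical, not from any library): MISSING means "not in the cache", while None means "cached result: the API returned null".

```python
MISSING = object()  # sentinel: "absent from the cache"

class CachingClient:
    def __init__(self, api_call):
        self._api_call = api_call  # hypothetical expensive call
        self._cache = {}

    def lookup(self, key):
        cached = self._cache.get(key, MISSING)
        if cached is MISSING:             # first kind of absence: not cached
            cached = self._api_call(key)  # may legitimately return None
            self._cache[key] = cached     # cache even a None result
        return cached

calls = []
def fake_api(key):
    calls.append(key)
    return None  # second kind of absence: the API "returned null"

client = CachingClient(fake_api)
client.lookup("a")
client.lookup("a")
print(len(calls))  # 1 -- the None result was cached, not re-fetched
```

Collapsing the two absences into one would force a re-fetch every time the API legitimately returns null.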
Java for example has Map.computeIfAbsent. This function returns a nullable value and does not need two layers of absence to work. I would personally argue that exposing the two levels of absence is exposing an implementation detail, but I'll accept this as a valid usage of double optionals.
computeIfAbsent doesn't need a second kind of absence because it can never return absence. That's fine if that API is a good fit for your use case, but it isn't always (there's a reason Map.get exists as well as Map.computeIfAbsent).
> I would personally argue that exposing the 2 levels of absence is exposing an implementation details
Probably you wouldn't want to return Option<Option<X>> (or Nullable) to outside callers - maybe you want to convert None to one kind of domain-meaningful error and Some(None) to a different kind of domain-meaningful error, maybe you want to take some different codepaths to respond to "recover" from the different kinds of absence.
But it's extremely valuable to be able to compose together existing libraries that might use absence to mean something and have them just do the right thing rather than always having to worry about the edge cases where one has a kind of absence that's subtly different from the other's kind of absence. I mean fundamentally you can't ever assume that a random third-party function in Java is safe to call with null, because many of them aren't. But you also can't ever assume that a random third-party function won't return null, because some of them do. So even to just compose two functions you've got to check their docs and think about the behaviour of this special value, and it's just all so avoidable.
The None branch of each level of a nested Optional has a different meaning.
But typically it boils down to either you have the data or you don't. It's a subtle difference which I argue you can live without.
Often people use optional or nullable types as a convenient approximation to an Either type.
I still don't see why it would be a problem merging then down even when used like an either. If there is no value then there is no value.
In JSON/REST API bindings, where a deserializer maps JSON to language-native object/struct type, I'll often need to know the difference between:
a key that is absent, a key that is present but null, and a key that is present with a value. So I'll represent that (in e.g. Rust) as a nested Option: None means not present, Some(None) means present but null, and Some(Some(42)) means present with a value. I'll often use this in PATCH endpoints, where not-present means to leave the current value alone, null means to unset it, and a value means to set it to that value.
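The same three PATCH cases can be sketched in Python with a sentinel standing in for the outer Option (the field name 'x' is just for illustration):

```python
import json

MISSING = object()  # sentinel for "key not present in the payload"

def classify(patch_body, field):
    # Distinguish absent, null, and valued keys in a JSON PATCH body
    doc = json.loads(patch_body)
    value = doc.get(field, MISSING)
    if value is MISSING:
        return "leave unchanged"
    if value is None:
        return "unset"
    return f"set to {value!r}"

print(classify('{}', 'x'))           # leave unchanged
print(classify('{"x": null}', 'x'))  # unset
print(classify('{"x": 42}', 'x'))    # set to 42
```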
How about a situation where the inner Optional<T> is acquired from another system or database, and the outer Optional<Optional<T>> is a local cache of the value. If the outer Optional is null, then you need to query the other system. If the outer Optional is filled and the inner Optional is null, then you know that the other system explicitly has no value for the data item, and can skip the query. Seems like using nested optionals would be natural here, although of course alternative representations are possible.
what is the advantage of using that over 2 variables? the cost is extra mental load, what does it buy?
There's a lot of quality-of-life stuff enabled by it in Java, since the base language's equivalents to Optional.empty(), Optional.ofNullable(...).orElse(...), etc are painfully verbose by comparison.
You're far from alone, it does make it a tiny bit easier to see which functions are expected to return null, but that's about it and messing around with it always feels like wasted effort.
[dead]
In Kotlin this would already be a compile error, no need for another annotation.
Yeah, sure, and thankfully everyone has already switched all their teams to superior languages
Anyway, I believe what you're referring to is the "?" syntax that annotates types in Kotlin but doesn't help the resulting bytecode, which means that every single library ever would need to convert to kotlin to benefit
So even they didn't have the courtesy of marking the result of a known Optional result as Optional<java.io.InputStream> when interfacing with the existing Java ecosystem

> Java, C# and JS use UTF-16-like encoding for in-memory string
That’s incorrect for Java, possibly also for C# and JS.
In any language where strings are opaque enough types [1], the in-memory representation is an implementation detail. Java has been such a language since release 9 (https://openjdk.org/jeps/254)
[1] The ‘enough’ is because some languages have fully opaque types, but specify efficiency of some operations and through it, effectively prescribe implementation details. Having a foreign function interface also often means implementation details cannot be changed because doing that would break backwards compatibility.
> JS use floating point for all numbers. The max accurate integer is 2⁵³−1
That is incorrect. Much larger integers can be represented exactly, for example 2¹⁰⁰.
What is true is that 2⁵³−1 is the largest integer n such that n-1, n, and n+1 can be represented exactly in an IEEE double. That, in turn, means n == n-1 and n == n+1 both will evaluate to false, as expected in ‘normal’ arithmetic.
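Both claims can be checked directly with Python floats, which are IEEE doubles:

```python
n = 2 ** 53
# At 2**53 - 1, the integer and both neighbors still round-trip exactly
assert float(n - 1) + 1 == float(n)
# But 2**53 + 1 is not representable: adding 1 rounds back to 2**53
assert float(n) + 1 == float(n)

# Much larger integers are still exact, as long as they fit in 53
# significant bits -- e.g. a power of two like 2**100:
assert float(2 ** 100) == 2 ** 100
# ...though its neighbor 2**100 + 1 rounds away to 2**100:
assert float(2 ** 100 + 1) == float(2 ** 100)
```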
> possibly also for C# and JS
The representation for C# is very much fixed, as it allows, and very commonly uses, direct access into the string buffer as a ReadOnlySpan<char> or a raw char pointer, where char is a UTF-16 code unit.
JS could maybe get away with it.
When you have code that works a lot with strings the cost overhead of building an app on iso-latin-1 but encoding as utf-16 can be substantial.
I think Java moved away from this back around 8, or possibly 9.
I started to say something about C# strings and then I remembered the clusterfuck when it came to Windows development and strings and, depending on which API you call, a string is represented in one of a dozen different ways.
https://stackoverflow.com/questions/689211/interop-sending-s...
Yeah, I think they didn't mean max "accurate" integer and rather meant max "safe" integer.
Thanks I will correct that
> > Java, C# and JS use UTF-16-like encoding for in-memory string
>
> That’s incorrect for Java,
Maybe so, technically, but if you Base64-encode a string in a language that uses UTF-8 (or UTF-16 with the other endianness) and decode it in Java, Java's UTF-16 representation will be the problem you will be dealing with.
That's why when you are constructing a String with a byte array, you always, always, always use the constructor that also takes a character set.
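The same failure mode, sketched in Python as an analogue of Java's charset-less String(byte[]) constructor: decoding the bytes with the wrong charset produces mojibake.

```python
import base64

# Encode a string as UTF-8 bytes, Base64 it as if sending over the wire
payload = base64.b64encode("héllo".encode("utf-8"))
raw = base64.b64decode(payload)

# Decoding with the charset spelled out is fine:
print(raw.decode("utf-8"))    # héllo
# Decoding with the wrong (implicit) charset garbles the text:
print(raw.decode("latin-1"))  # hÃ©llo
```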
That's a nice compendium of tips and useful information.
I wonder if anyone can learn from this. I feel like I only understood what I already knew, or at least was very close to knowing. That's the same thing that happens with teaching manuals about any topic: they're organized in a way that makes sense and it's easy for people who already know the topics, but often very bad at teaching the same topics to an audience that doesn't know anything.
> with teaching manuals about any topic: they're organized in a way that makes sense and it's easy for people who already know the topic
I think that's the reason for a manual's existence. To have a written record so we don't have to trust our memory. This is what most unix manuals are. You already know what the software can do, you just need to remember the specifics of how to get something done.
> often very bad at teaching the same topics to an audience that doesn't know anything.
What you need then is a tutorial (beginner seeking to learn) or a guide (beginner/intermediate seeking to do). Manuals in this case only serve to have better questions (Now you know what you don't know).
Kind of what I noticed for myself.
When I was a kid I was trying to learn Linux and commands and it was disappointing.
Over the years of using it I don’t need to learn it but I do need to look stuff up.
This looks like not so much traps, but a list of things the author has learned.
Much of it would only apply in certain relatively narrow contexts, but the contexts aren't necessarily mentioned.
Some of it appears to be just wrong.
I guess I'm saying: I would not take this literally, but as something almost like a stream-of-consciousness.
For what it's worth, I really enjoyed "Traceroute Isn't Real." I have noticed for quite a while that the data from it is at best patchy, often apparently meaningless. So it's helpful to see the explanation of why that is expected behavior:
https://gekk.info/articles/traceroute.htm
(If it's outdated I'm curious if anyone knows relevant updates?)
> Python: - Default argument is a stored value that will not be re-created on every call.
PSA for anyone working with datetime variables!
I'm not regularly a Python developer and I just spent a ton of time earlier this week tripped up by the default argument being a stored value. I was using one to make an empty set if no set was passed in... but the set was not always empty because it was being reused. Took me forever to figure out what was going on.
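A minimal reproduction of the trap described above, with the conventional fix (function names here are made up for illustration):

```python
def add_item_buggy(item, seen=set()):
    # set() is evaluated once, at definition time -- every call without an
    # explicit argument shares the same set object
    seen.add(item)
    return seen

def add_item_fixed(item, seen=None):
    # the usual fix: default to None and create the set inside the body,
    # so each call gets a fresh one
    if seen is None:
        seen = set()
    seen.add(item)
    return seen

print(sorted(add_item_buggy("a")))  # ['a']
print(sorted(add_item_buggy("b")))  # ['a', 'b'] -- the "empty" default wasn't empty
print(sorted(add_item_fixed("a")))  # ['a']
print(sorted(add_item_fixed("b")))  # ['b']
```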
The first "trap" on the page says "min-width: auto makes min width determined by content", but this is false outside of flex/grid.
From MDN: "For block boxes, inline boxes, inline blocks, and all table layout boxes auto resolves to 0."
https://developer.mozilla.org/en-US/docs/Web/CSS/min-width
I guess the first trap should really be: "You cannot read any CSS property in isolation, as just like what the name implies, defaults and what values end up doing cascades through all the rules your document ends up using"
CSS cascade for text properties more or less makes sense.
I have been unable to comprehend CSS layout from any perspective: page designer, implementer, user, anything. It must have someone in mind but I have no idea who that is.
https://every-layout.dev has by far the best explanations and coherent usage of CSS I've encountered since I started doing webdev for a living in 1998.
Every Layout changed how I look at and do CSS. Great resource with a good philosophy behind it: CubeCSS. It really made CSS fun for me again.
Layout is more bazaar than cathedral. It has had many ideas mixed in by different contributors over decades.
Thanks I will correct that
Largely a good listicle. Some feedback:
> Unicode unification. Different characters in different language use the same code point. Different languages' font variants render the same code point differently. 語
This isn't a trap. The given example character means the same thing in Chinese and Japanese, and the Japanese version was imported from China. People from both languages recognize both font variants as the same conceptual character.
The author is making it sound like the letter 'A' in English should have a different code point than an 'A' in French. Or that a lowercase 'a' with the top tail should be a different character than a lowercase 'a' without the top tail.
Anyway, this is discussed at length in https://en.wikipedia.org/wiki/Han_unification
> There is a negative zero -0.0 which is different to normal zero. The negative zero equals zero when using floating point comparision. Normal zero is treated as "positive zero".
And there are two ways to distinguish negative zero from normal zero: By their integer bit patterns, or by the fact that 1.0/-0.0 == -Inf vs. 1.0/0.0 == +Inf.
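Both ways can be demonstrated in Python; note that 1.0/0.0 raises ZeroDivisionError in Python rather than producing an infinity, so the sign is inspected via the bit pattern and math.copysign instead:

```python
import math
import struct

# Negative zero compares equal to positive zero in float comparison:
assert -0.0 == 0.0

# Way 1: the IEEE-754 bit patterns differ (sign bit set on -0.0)
assert struct.pack(">d", -0.0) != struct.pack(">d", 0.0)
print(struct.pack(">d", -0.0).hex())  # 8000000000000000
print(struct.pack(">d", 0.0).hex())   # 0000000000000000

# Way 2: the sign is observable, e.g. via copysign (in C, 1.0/-0.0
# would give -Inf while 1.0/0.0 gives +Inf)
assert math.copysign(1.0, -0.0) == -1.0
assert math.copysign(1.0, 0.0) == 1.0
```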
> It's recommended to configure the server's time zone as UTC.
Big yes. I use UTC for servers, logs, photos, and anything that is worth archiving and timestamping properly. Local time is only for colloquial use.
> For integer (low + high) / 2 may overflow. A safer way is low + (high - low) / 2
Yes, but if low and high could be negative numbers, then you've just shifted the overflow to a different range. This matters for general binary search over an integer range, as opposed to unsigned binary search over an array.
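A quick sketch of the trap from the quoted tip. Python ints are arbitrary precision and never overflow, so 32-bit two's-complement wraparound is simulated by hand here to show what a C/Java int would do:

```python
def add_i32(a, b):
    # simulate 32-bit two's-complement wraparound
    s = (a + b) & 0xFFFFFFFF
    return s - 2**32 if s >= 2**31 else s

low, high = 2_000_000_000, 2_100_000_000  # both fit in int32

naive_mid = add_i32(low, high) // 2  # (low + high) wrapped around
safe_mid = low + (high - low) // 2   # difference fits, no overflow

print(naive_mid)  # -97483648 -- garbage from the wrapped sum
print(safe_mid)   # 2050000000
```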
> C/C++
I'm going to throw in one of my lists of pitfalls - just using integer types and arithmetic correctly in C/C++ is a massive developer trap. That's like the most basic thing in programming. https://www.nayuki.io/page/summary-of-c-cpp-integer-rules
> Rebase can rewrite history
"Can" is a weasel word; rebase does nothing but rewrite history.
> People from both languages recognize both font variants as the same conceptual character.
A character that carries the same concept yes. A mere "font variant" no. It's absolutely a trap to think that you can safely replace one character with another just because they have the same unicode codepoint; Japanese people will avoid your product if you do this.
> The author is making it sound like the letter 'A' in English should have a different code point than an 'A' in French. Or that a lowercase 'a' with the top tail should be a different character than a lowercase 'a' without the top tail.
But we do have А and A. Even though they look the same. And unified Han characters are often quite distinct, it tripped me up as a learner of Chinese more than once. For example, a very common character '喝' (drink) looks quite a bit different: https://en.wiktionary.org/wiki/%E5%96%9D - they have a different number of strokes even. And I can't even copy-paste it here to demonstrate, because it changes form once I copy it from the Wikipedia article.
Han unification is a mess.
> If you already use locking, no volatile needed.
Kinda misleading. volatile is for memory-mapped I/O and such. volatile means the memory access really happens.
I changed the wording of it.
> There are subtle differences between numpy and pytorch.
This isn't really a trap, and it doesn't help anyone; it looks like "I got burned but I don't want to share the specifics".
CSS and C++ both have the “pick a subset and enforce that, or suffer” nature. On my to-do list: make a github action that requires manual override to merge any pull request with a css attribute not already present
I am unsure how this is supposed to work for CSS. To my knowledge, most CSS properties cannot be substituted for each other. If the subset to be enforced is "CSS properties already present", what is a developer supposed to do if their CSS property is not already present? Change the design?
Well, (like C++) new css attributes are constantly added. This means you constantly have to choose between the old way or the new way: either is fine, but “pick old or new at random on a per pull request basis” isn’t.
You seem to assume that old CSS properties can be substituted for new ones. But as I said, to my knowledge this isn’t possible in most cases. Can you give an example of two CSS properties where 'either is fine, but only one should be used'?
Or do you mean something else altogether by 'CSS attributes'?
The specific case that inspired this comment was a random mix of margin and gap
A recent trap for me:
Regex semantics is subtly different across languages. E.g. a{,3} matches between 0 and 3 "a" characters in Python. In JavaScript it matches the literal string "a{,3}".
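The Python side of that difference is easy to check; JavaScript's behavior (matching the literal text) can't be demonstrated here, but the portable spelling {0,3} works in both engines:

```python
import re

# Python treats {,3} as {0,3}: between zero and three repetitions
assert re.fullmatch(r"a{,3}", "") is not None
assert re.fullmatch(r"a{,3}", "aaa") is not None
assert re.fullmatch(r"a{,3}", "aaaa") is None

# The explicit lower bound is understood by both Python and JavaScript
assert re.fullmatch(r"a{0,3}", "aa") is not None
```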
Regex is more a technique than an actual specification. It would be best to find the time to go and read an introductory book about Theory of Computation where they explain the underlying mechanism.
[flagged]
It's half a chapter in most books I know. Or a subset of this 1h MIT videos [0], but the instructor also explains Finite Automata which is the basic mechanism that does all the stuff.
[0]: https://www.youtube.com/watch?v=9syvZr-9xwk
I'll assume sarcasm (from your comment history) but for people actually believing this first degree: good luck debugging an incorrect regex if you haven't practiced regexes. Especially if it was generated by an llm.
I always use regex101 to develop my regexes. It allows you to switch between different engines.
Honorable mention to [a-z], gotta be my favorite trap
What's the trap for this one? I can't think of any engine that parses this to mean anything other than the letters a through z.
In some common implementations if $LANG is set to certain values, it will fail to match some ASCII letters. This is because not all latin character using languages put Z last in the alphabet.
Try this (you probably need to enable and generate the locale first)
Locales in general should be considered a "trap", just look at Windows CSV separator handling, etc.

That's wild. Thanks for explaining. I had no idea this depends on the locale. Looks like I have about a million scripts to fix...
Not in general, but using locales for something different than affecting presentation.
It depends on its use, ultimately, but if your goal is to find a string of letters (a common use IMO), you'll want to use something like \p{L} to ensure you don't miss non-ASCII characters.
eta: fixed regex, I had typed \L, shared from my faulty memory.
[A-z] is a fun one though, as it includes a few extra symbols between upper and lowercase.
Does it? I thought regexes were defined on character classes, not on numeric ASCII values. What would a regex do on a different encoding then?
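In ASCII-based engines such as Python's re, a range is over code points: 'Z' is 90 and 'a' is 97, so [A-z] spans 65..122 and also matches the six punctuation characters in between.

```python
import re

# The six characters between 'Z' (90) and 'a' (97) in ASCII
extras = [chr(c) for c in range(ord("Z") + 1, ord("a"))]
print(extras)  # ['[', '\\', ']', '^', '_', '`']

for ch in extras:
    assert re.fullmatch(r"[A-z]", ch) is not None  # matched by [A-z]
assert re.fullmatch(r"[a-z]", "^") is None         # but not by [a-z]
```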
The part about C# volatile accesses using release-acquire ordering seems to be wrong if I read the C# docs correctly.
"There is no guarantee of a single total ordering of volatile writes as seen from all threads of execution"
https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...
>A volatile write operation prevents earlier memory operations on the thread from being reordered to occur after the volatile write. A volatile read operation prevents later memory operations on the thread from being reordered to occur before the volatile read
Looks like release/acquire to me? A total ordering would be sequential consistency.
I think you are quoting from https://learn.microsoft.com/en-us/dotnet/api/system.threadin...
"In C#, using the volatile modifier on a field guarantees that every access to that field is a volatile memory operation"
This makes it sound like you are right and the volatile keyword has the same behaviour as the Volatile class which explicitly says it has acquire-release ordering.
But that seems to contradict "The volatile keyword doesn't provide atomicity for operations other than assignment, doesn't prevent race conditions, and doesn't provide ordering guarantees for other memory operations." from the volatile keyword documentation?
I too interpret those docs as contradictory, and I wonder if, like how Java 5 strengthened volatile semantics, this happened at some point in C# too and the docs weren't updated? Either way the specification, which the docs say is definitive, says it's acquire/release.
https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...
"When a field_declaration includes a volatile modifier, the fields introduced by that declaration are volatile fields. [...] For volatile fields, such reordering optimizations are restricted:
Acquire-release ordering provides ordering guarantees for all memory operations. If an acquire observes a releases, the thread is also guaranteed to see all the previous writes done by the other thread - regardless of the atomicity of those writes. (There still can't be any other data races though.)
This volatile keyword appears to only consider that specific memory location whereas the Volatile class seem to implement acquire-release.
Somewhat off topic, but what is a realistic example of where you need atomics with sequential consistency? Like, what useful data structure or pattern requires it? I feel like I've seen every other ordering except that one (and consume) in real world code.
A mutex would be the most trivial example. I don't believe that is possible to implement, in the general case, with only acquire-release.
Sequential consistency mostly becomes relevant when you have more than two threads interacting with both reads and writes. However, if you only have a single consumer (i.e. only one thread reading) or a single producer (i.e. only one thread writing) then the acquire-release semantics end up becoming sequential, since the single consumer/producer implicitly enforces a sequential ordering. I can potentially see some multi-producer multi-consumer lock-free queues needing sequential atomics.
I think it's rare to see atomics with sequential consistency in practice since you typically either choose (1) a mutex to simplify the code at the expense of locking or (2) acquire-release (or weaker) to minimize the synchronization.
> A mutex would be the most trivial example. I don't believe that is possible to implement, in the general case, with only acquire-release.
Wait, what? So you're saying this spinlock is buggy? What's the bug?
https://en.cppreference.com/w/cpp/atomic/atomic_flag.html
No, sorry. I was just remembering where I've typically seen sequential consistency being used. For instance, Peterson's algorithm was what I had in mind. Spinlock is indeed a good example (although a terrible algorithm which I hope you haven't seen used in practice) of a mutex algorithm which only requires acquire-release.
> Unset variables. If DIR is unset, rm -rf $DIR/ becomes rm -rf /. Using set -u can make bash error when encountering unset variable.
sweet mercy :O
Someone call the Inquisition
Instead, say

    rm -rf $DIR

That is, skip the trailing slash. Then if $DIR is not set, it becomes an invalid command, because no file names were supplied.

Better to make the requirement explicit, instead of relying on the argument-parsing details of rm or some other command:

    rm -rf "${DIR:?}"/
This was a very famous Steam bug
Does anyone truly understand all the little edge cases with CSS?
I've written tons and tons of CSS, have done for a decade. I don't sit and think about the exact interactions, I just know a couple things that might work if I'm getting something unexpected.
I don't really see it possible to commit that to memory, unless I literally start working on an interpreter myself.
I think there can be a different way to think about CSS that can help with that feeling of never understanding it all. Recently I’ve heard people influential in the CSS world describe it as a “suggestion” to the browser. The browser has its own styles, the user might have some custom stylesheet on top of the browser’s version, extensions, etc etc and at some point CSS is really more a long list of “suggestions” about how the site should look.
If you embrace that idea to the fullest, you can create some interesting designs/patterns that can be more resilient. The "downside" is that this way of writing css will likely make the pixel perfect head of the marketing department hate you unless they also write code.
I think it’s also okay to say that some ways of writing css just aren’t relevant anymore. A good parallel in mind is building construction and general carpentry. These days, a quick 2x4 stud wall or insulated concrete forms is fast, cheap, and standardized around the world. However, many craftspeople still exist that will create beautiful joinery for what is ultimately a simple thing, but we can appreciate that art standalone. With CSS, I don’t suspect we will ever need to go back to floats or crazy background images or whatever but it’s nice that those tools are still there for not only the sake of back compat, but also as a way to tinker and “craft” something bespoke for a special project or just because you like it. Education will eventually catch up and grid and flexbox will keep gaining popularity until we decide that it’s too complicated and come up with some new algorithm. That can all be true though and you can bring value as a developer without knowing every single aspect to the public API.
But you need to, you know, actually float something in a text. I think to do it with flexbox/grid you need JS that calculates heights and then manually splits the text into boxes with heights, so essentially you are doing rendering.
Also is there another way to position boxes side-by-side in an inline context without float?
"Associativity law and distribution law doesn't strictly hold because of inaccuracy." should be due to precision loss not inaccuracy (they are different).
updated
The biggest trap of all: building things that no one, including yourself, wants
I would agree on the self part. But otherwise this point of view is distinctly in contrast with the recently republished "work on things that don't scale" article from pg.
Also as a corollary, I was thinking about games from the 90s I spent hundreds of hours playing back then earlier today. Bolo and Escape Velocity in particular come to mind. They were "simple" games with immense depth. But after some fruitless searching, all I find is scattered questions and comments over the last few years looking for modern equivalents are a handful of recommendations for games that are no longer developed or are defunct.
There's clear prior evidence of both success and lack of modern supply. Want to have a minimally minor successful game with established nostalgic audience? Make a new version of EV that is that game at its core. Don't be fancy, just do the thing Ambrosia did. Expand from there.
The simple model is look forward, and also look back. The first itch is small. For yourself. Find some others with a similar itch.
The hard part is pushing your idea into something more people want. That's where pg's 2013 article comes in to play. That's the hard part.
But that leaves a huge space between 0 and 10 where someone can find a successful niche.
Hell I've thought about trying to make a modern EV. It's tantalizing. But I've never made a game in my life. Ok I've rewritten Game of Life a lot. Good concept to try new ideas with. But the amount of work for a solo dev trying to recapture that original magic with no background in game dev is daunting.
well if you at least build something you want, you are doing something that doesn't scale no?
> Golang use UTF-8 for in-memory string.
Nope. It’s just bytes with no encoding.
https://go.dev/blog/strings
Corrected.
There is no such thing as "just bytes" when it comes to Unicode. UTF-8 is a way to represent Unicode codepoints in binary.
But I agree that author's statement is wrong. Go stings are equivalent to byte slices.
Go strings are just bytes. There is no Unicode or encodings.
yaml: https://www.bram.us/2022/01/11/yaml-the-norway-problem/
bash: errexit depends on caller's context, will utterly fail you one day: https://lists.gnu.org/archive/html/bug-bash/2012-12/msg00093...
Added
LF vs CRLF
Which is incredibly painful on Windows systems doing a git clone of shell scripts, since core.autocrlf is often helpful, but not for shell scripts, since it causes the weirdest looking error messages:
e.g. https://askubuntu.com/questions/370124/not-found-error-when-...

Polite projects will have .gitattributes specifying that .sh or .bash or bin/* or whatever are to always checkout with eol=lf <https://git-scm.com/docs/gitattributes#_eol>
As best I can tell, .ps1 does tolerate Unix lf but I'd bet good money that .bat and .cmd definitely do not
Thanks for reminding. Added.
> Division is much slower than multiplication (unless using approximation). Dividing many numbers with one number can be optimized by firstly computing reciprocal then multiply by reciprocal.
Is this a general fact or specific to a language?
It's generally true even for CPU instructions: https://electronics.stackexchange.com/questions/280673/why-d...
It's hardware-level - division requires more CPU cycles than multiplication on most processor architectures, making this optimization pattern relevant across virtually all programming languages.
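The pattern from the quoted tip, sketched in Python. The speed difference only shows up in compiled code, but the rounding caveat is visible anywhere: multiplying by a precomputed reciprocal is not always bit-identical to dividing, which is why compilers only apply this automatically under flags like -ffast-math.

```python
import math
import random

random.seed(0)
xs = [random.uniform(1.0, 100.0) for _ in range(1000)]
d = 7.3

# One division up front, then only multiplications in the hot loop
inv = 1.0 / d
fast = [x * inv for x in xs]
slow = [x / d for x in xs]

# Results agree to within a couple of ulps, but are not guaranteed
# to be bit-identical
assert all(math.isclose(a, b, rel_tol=1e-15) for a, b in zip(fast, slow))
```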
Adjacent to the bit about 100vh, I’d add an item about the much older fundamental stupidity of viewport units, which Firefox tried to fix but no one else got on board and so they eventually removed their patch and went back to being as broken as everyone else. That is: they ignore scrollbars. Use a platform with 17px-wide scrollbars, and in a document with vertical scrolling, 100vw is now 100% of the viewport width… plus 17px. People get this wrong all the time, and it’s only increased as more platforms have switched to overlay scrollbars by default.
> Blur does not consider ambient things.
Need to make that backdrop blur.
> Whitespace collapse. HTML Whitespace is Broken
I always hated that article title. It’s not broken, it’s just different from what you occasionally expected, mostly for very good reasons that let it do what you expected most of the time. I also disagreed with some significant other parts of it: https://news.ycombinator.com/item?id=42971415.
> Two concepts: code point (rune), grapheme cluster:
I have problems with this lot.
Firstly: add code units. When dealing with UTF-8 and you can access individual bytes, they’re UTF-8 code units. When dealing with UTF-16 and you can access 16-bit values, including surrogates, they’re UTF-16 code units.
Secondly: don’t talk about code points in general, talk about scalar values, which are what code points would have been were it not for that accursèd abomination UTF-16. Specifically, scalar values excludes the surrogate code points.
Sensible things work with scalar values. Python is the only thing I can think of that gets this wrong: its strings are sequences of code points, so you can represent lone surrogates (representable in neither UTF-8 nor UTF-16). This is so stupid.
Thirdly: ditch the alias “rune”, Go invented it and it never caught on. I don’t know of anything else that uses the term (though maybe a few things do), and the term rune gets used for other unrelated things (e.g. in Svelte 5).
> Some text files have byte order mark (BOM) at the beginning. For example, EF BB BF is a BOM that denotes the file is in UTF-8 encoding. It's mainly used in Windows. Some non-Windows software does not handle BOM.
BOM was about endianness, and U+FEFF. Link to https://en.wikipedia.org/wiki/Byte_order_mark. It made sense in UTF-16 and UTF-32: for UTF-16, if the file starts with FE FF it’s UTF-16BE, if FF FE it’s UTF-16LE. UTF-8 application of it was always a mess, never really caught on, and I think can fairly be considered completely obsolete now: the typical person or software will never encounter it. Very little software will actually handle it any more. Fortunately, it doesn’t tend to matter much: U+FEFF is the BOM for a reason, it’s ZERO WIDTH NO-BREAK SPACE.
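A quick Python check of both behaviours: the UTF-16 BOM actually selects the byte order, while the UTF-8 "BOM" carries no byte-order information and Python's utf-8-sig codec exists purely to strip it.

```python
# UTF-16: the leading U+FEFF tells the decoder which byte order follows
assert b"\xff\xfeh\x00i\x00".decode("utf-16") == "hi"  # FF FE -> little-endian
assert b"\xfe\xff\x00h\x00i".decode("utf-16") == "hi"  # FE FF -> big-endian

# UTF-8: EF BB BF is just U+FEFF encoded in UTF-8; plain utf-8 keeps it,
# utf-8-sig strips it
assert b"\xef\xbb\xbfhi".decode("utf-8-sig") == "hi"
assert b"\xef\xbb\xbfhi".decode("utf-8") == "\ufeffhi"
```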
> YAML:
I’d add “disallows Tab indentation”. It’s the only thing I can immediately think of that’s like that. (Make is roughly the opposite, recipe lines having to start with a single Tab. Python doesn’t let you mix spaces and tabs. These are all I can immediately think of.)
Thanks for your suggestions.
[dead]