rocqua 10 months ago

The image where adding an axiom / base fact causes a previous conclusion to be invalidated shows a key fact that goes unmentioned in the article: the inference steps the author uses are, mathematically speaking, unsound (or else the new axiom introduces a contradiction).

This is not to say the inference steps were mistakes; I believe the author when he claims they were backed up, reasonable, etc. Instead, my point is that 'real world reasoning' is very often unsound out of the sheer necessity of having to make a decision. This is a very important fact to acknowledge when doing real-world reasoning: confusing this form of reasoning with mathematically sound reasoning is quite seductive, but will often lead to reality punching you in the face.

Based on this, I do object to the author's use of the word 'axiom': it suggests a much more rigorous form of argument than is actually being discussed.

  • Joker_vD 10 months ago

    To quote from Carroll's "Symbolic Logic":

        He will reply "Why, it's perfectly correct, of course! And if your precious Logic-book tells you it *isn't*, don't believe it! You don't mean to tell me those tourists *need* to run? If I were one of them, and knew the Premisses to be true, I should be *quite* clear that I *needn't* run — and I *should walk*!"
    
        And you will reply "But suppose there was a mad bull behind you?"
    
        And then your innocent friend will say "Hum! Ha! I must think that over a bit!"
    
        You may then explain to him, as a convenient *test* of the soundness of a Syllogism, that, if circumstances can be invented which, without interfering with the truth of the Premisses, would make the Conclusion false, the Syllogism *must* be unsound.
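
    A brute-force rendering of Carroll's test in code (the toy propositions below are mine, not Carroll's or the article's): enumerate the possible situations and look for one that keeps every Premiss true while making the Conclusion false.

        # Sketch of Carroll's soundness test: search for a counterexample.
        from itertools import product

        def counterexamples(premises, conclusion, n_vars):
            """Yield assignments where every premise holds but the conclusion fails."""
            for values in product([False, True], repeat=n_vars):
                if all(p(*values) for p in premises) and not conclusion(*values):
                    yield values

        # "If there is a bull, the tourists run" plus "there is a bull" entails "they run":
        run_if_bull = lambda bull, run: (not bull) or run
        there_is_a_bull = lambda bull, run: bull
        they_run = lambda bull, run: run
        print(list(counterexamples([run_if_bull, there_is_a_bull], they_run, 2)))  # [] -> sound

        # Drop the bull premise and a counterexample appears (no bull, nobody runs):
        print(list(counterexamples([run_if_bull], they_run, 2)))  # [(False, False)] -> unsound
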
    • danaris 10 months ago

      That sounds to me like it may also describe a situation where there are unstated premises—basically, unexamined assumptions, like "continuing to walk will remain safe".

  • atoav 10 months ago

    First principles thinking combined with pattern matching is really good when you are trying to find solutions to problems with limited and known domain boundaries: e.g. debugging software, fixing an electronic circuit, etc.

    Many problems don't have a fixed or even a known domain boundary. If you e.g. apply to an art school as a painter, most of the stuff that can make or break your application is outside the bounds of what you can know and/or influence: e.g. who else is applying, who will review your application and for how long, what experiences they have had with similar students, etc. That means you can do everything right and still fail. The same is true for a lot of things.

  • tunesmith 10 months ago

    Yeah, I agree. Theoretically you can make those syllogisms sound if you also explicitly list the assumptions you are taking to be true. Then, when a scenario like the one he describes happens, you can point to which of those assumptions turned out to be false.

    • rocqua 10 months ago

      Yeah, and that shows why this exhaustive list of assumptions is rarely made. It is useful after you realize the argument was wrong, but that isn't the critical point in time. The critical point is when you evaluate the argument, before you act on it. Listing 1000 assumptions, each of which you are 99% confident in, doesn't help you evaluate whether the argument is true. Focusing on the uncertain assumptions is much more valuable.

  • codeflo 10 months ago

    I'll prefix my point by saying I'm not one of those people who think all reasoning should (or can) be Bayesian, but here, I think it applies at least as an analogy. In what you call "real world reasoning", you never have the amount of logical rigor that you get when going from mathematical axiom to theorem. Instead, there are always varying degrees of knowledge, certainty and understanding. In such a setting, it's very easy to imagine a new, non-contradictory fact pushing a conclusion from 80% certainty down to 40% certainty.
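
    A minimal numeric sketch of that last sentence (the prior and likelihoods below are invented, not from the article): a single new observation that is merely more likely under the negation, while contradicting nothing, takes an 80% conclusion down to 40%.

        def bayes_update(prior, p_obs_if_true, p_obs_if_false):
            """Posterior P(conclusion | observation) via Bayes' rule."""
            num = p_obs_if_true * prior
            return num / (num + p_obs_if_false * (1 - prior))

        prior = 0.80                    # start out 80% confident in the conclusion
        # The new fact is six times as likely if the conclusion is false (0.6 vs 0.1):
        posterior = bayes_update(prior, p_obs_if_true=0.1, p_obs_if_false=0.6)
        print(round(posterior, 2))      # 0.4 -- no contradiction, just less certainty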

  • YeGoblynQueenne 10 months ago

    If I understand the diagrams in the article correctly (I haven't had a careful read) the correct terminology is that reasoning with new observations can be non-monotonic, not that it is unsound.

    Non-monotonic reasoning means that the set of true consequences of a theory can both increase and decrease in cardinality, as new observations are made. A "theory" here is a set of axioms (which are assumed to be true, without further proof), and the theorems that can be derived from the axioms, under some set of rules of inference. If the rules of inference themselves are sound, then any decrease or increase of the set of true consequences of the theory is also sound.

    Non-monotonic reasoning is claimed to be a better match than classical logic (which is monotonic) for the way humans reason under uncertainty in the real world, because it means a reasoning agent can "change its mind" when new information comes in, and so the agent does not have to be stuck with faulty initial assumptions.

    This new information is in the form of "observations", which the article labels together with axioms. Technically speaking, observations are logical atoms, i.e. "facts", while the axioms of a theory can be both facts and "rules" (where rules are more complex formulae composed of atoms). We normally don't directly observe new rules of the world but are left having to infer them by induction (which is not sound). New facts about the world can also be inferred: by abduction, like the thing Sherlock Holmes does (which is also unsound, and maps to probabilistic inference). In any case observations are not axioms, because they tend to come from a different source than axioms, which are basically made up by humans: e.g. "two parallel lines never meet" is an axiom, while "things fall towards the ground" is an observation. On the other hand, it makes sense to lump observations together with axioms because they both make up the basis from which conclusions are drawn. So the author of the article is fudging it a bit, but is not entirely off.

    tl;dr, under a non-monotonic inference framework the results of inference can change when new observations are made, and that creates no contradiction with the fact that inference steps are not taken in error.
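
    A toy sketch of that non-monotonicity (the classic "birds fly by default" example, not anything from the article): adding one new observation removes a previously derivable consequence, without any earlier inference step having been an error.

        def consequences(facts):
            """Default rule: a bird is taken to fly unless it is known to be a penguin."""
            derived = set(facts)
            for name, kind in facts:
                if kind == "bird" and (name, "penguin") not in facts:
                    derived.add((name, "flies"))
            return derived

        kb = {("tweety", "bird")}
        print(consequences(kb))           # includes ("tweety", "flies")

        kb.add(("tweety", "penguin"))     # new observation, no contradiction
        print(consequences(kb))           # ("tweety", "flies") is no longer derived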

    But note that all this is basically logic talk and has nothing to do with business. I was drawn to the article because of its title, but I find it has nothing to do with logical reasoning (which interests me) but is instead about thinking about business (which doesn't). I'm not sure if the framework of soundness (and completeness) and monotonicity can be directly applied to a business context.

apienx 10 months ago

First Principles Thinking, by definition, only assumes the most well-established axioms. If your assumptions get broken by a paradigm shift, then you simply failed at proper First Principles Thinking, IMHO.

The issue with First Principles Thinking is the required resources (cognitive, temporal, financial, etc.).

P.S. The abstraction-level example the author gives in the blog post looks like a case of incomplete information.

zootboy 10 months ago

I know this is a totally off-topic complaint, but why did the author choose purple as the CSS link color? Perhaps this is just a sign that I'm old, but I genuinely was confused for a moment, wondering how I had possibly visited all the links in a blog I don't remember ever seeing before.

  • PakG1 10 months ago

    This is where first principles thinking would be useful. Har har. But as an older guy, I have the same question. :)

tbrownaw 10 months ago

Formal yes / no logic with an assumption of fully explicit axioms isn't a great model for things like this.

You never know everything you need to know, so you need a big pile of default assumptions (a model of the world). Hopefully as you go along you can figure out which assumptions you're using and make them explicit.

You need to work with probabilities (is it going to rain tomorrow?) and also validity domains (if I go to a different city I need to get a new weather forecast).

Bayesian training does a decent job of formalizing some of this.
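
A rough sketch of the "validity domains" point (the cities and numbers are invented): a probabilistic belief carries the context it was formed in and is not reused outside it.

    # P(rain tomorrow), only valid for the city it was estimated in:
    forecasts = {"Seattle": 0.70, "Phoenix": 0.05}

    def p_rain(city):
        if city not in forecasts:            # outside the validity domain
            raise LookupError(f"no forecast for {city}; get a new one")
        return forecasts[city]

    print(p_rain("Seattle"))                 # 0.7
    try:
        p_rain("Lisbon")
    except LookupError as err:
        print(err)                           # refuses to extrapolate across domains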

ajuc 10 months ago

Traditional programming vs machine learning shows the limitations of first principles thinking.

You can write huge programs with byzantine nested chains of logic. But you can't write an OCR with a bunch of ifs.

  • WJW 10 months ago

    When you drill down, OCRs are indeed just a (large) bunch of ifs. All programs are ultimately composed of only calculations and conditional JMP statements.

    • bruce343434 10 months ago

      Aren't most machine learned models matrices on which multiplication is performed?

      • Bootvis 10 months ago

        Which are implemented using simpler calculations and jumps.

      • HPsquared 10 months ago

        And, most importantly, data is used in the calculation. Data which has statistical properties.

  • worrycue 10 months ago

    > But you can't write an OCR with a bunch of ifs.

    Aren’t all programs, including AIs, just a bunch of “if”s at the machine instruction level?

    • HPsquared 10 months ago

      The program yes, but the model is based on data which is fuzzy and uncertain.

    • jmopp 10 months ago

      Technically, yes. I believe there is a theorem which states that any neural network can be transformed into a decision tree. Good luck, though, trying to write an OCR algorithm with a bunch of ifs.
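
      A toy illustration of that "technically, yes" (the weights are made up, not from any real model): a single ReLU neuron unrolled into an explicit if, which also hints at why writing a whole OCR model this way by hand is hopeless.

          def neuron(x1, x2):
              # one learned unit: weighted sum followed by ReLU (invented weights)
              return max(0.0, 0.7 * x1 - 1.2 * x2 + 0.3)

          def neuron_as_if(x1, x2):
              z = 0.7 * x1 - 1.2 * x2 + 0.3
              if z > 0:          # the ReLU is literally a conditional branch
                  return z
              return 0.0

          assert neuron(0.5, 0.1) == neuron_as_if(0.5, 0.1)   # same function, spelled as "if"s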

    • ajuc 10 months ago

      They are. Try to write one :)

smitty1e 10 months ago

Patterns and first principles seem like tactics and strategy. Both needed.

But then he makes the point that hidden information causes bewilderment. Decisions are closer to poker than chess.

Keep it humble and attentive.

elisharobinson 10 months ago

When reasoning from first principles there is a "log" you can look back on and reflect over, which helps you re-evaluate either the axioms or the inference steps. This is more time-consuming.

Reasoning by pattern matching is quicker; when you fail, you search for new patterns. This is useful when the outcome is of low impact.

Both are useful tools, in my experience.

est 10 months ago

This article talks about "pattern matching". I am wondering: is it really the counterpart to "first principles thinking"? Are these two comparable at all?

  • wenc 10 months ago

    First-principles thinking's counterpart is analogical thinking -- which is pattern matching.

  • HPsquared 10 months ago

    Empiricism vs rationalism

  • worrycue 10 months ago

    “Pattern matching” feels like cargo cult science…