snidane 2 months ago

The syntax for joins is unnecessarily complex. The difference in behaviour is determined by the operator used for comparison between the two tables. If there existed modifiers for the operators to be sensitive to nulls, no need for special syntax would be needed.

If we define /<op> to make the operator include nulls on the left side and <op>\ to include nulls on the right side, we can do joins of all kinds without the JOIN ON syntax.


inner join

select * from a,b where

left join

select * from a,b where\

right join

select * from a,b where

outer join

select * from a,b where\

  • Svip 2 months ago

    That's the old join syntax before ANSI SQL was designed. Except your operators look like "* =" and "= *" (without whitespace of course, blame HN's markdown parser). Sybase supported this syntax well into its version 15.x (maybe still in version 16?). And Sybase did not (does not?) support full outer joins.

    It may seem simple on the surface, but once you get into complex queries, it's a horrible syntax, and it's easy to forget the relations between the tables, particularly with several tables (4+) involved.

    As someone who has converted an old and large SQL code base from non-ANSI JOINs to ANSI JOINs, I am happy to see this style of syntax abandoned.

    • noisy_boy 2 months ago

      IIRC, Oracle also supports this "* =" and "= *" syntax. I started with that and then learned ANSI JOINs; was difficult initially because I was used to the former syntax but I would agree that ANSI JOINs are more clearer/cleaner. That and the "with" clause makes queries so much more readable.

    • arafa 2 months ago

      Can't agree more, having been in a very similar position. From what I understand (based on the book SQL for Smarties), it's also more efficient for the query planner in some cases (especially when filter/non-join predicates are added to the JOIN).

  • 3pt14159 2 months ago

    Eh—I would have designed SQL differently if I were in charge (especially for casts, function definitions, and string munging) but leaving that aside, I disagree that the JOIN ON syntax is unnecessarily complex.

    We do not write programming languages solely for computers. We write them for humans too. Humans make mistakes and come at codebases at all manners of experience, both in breadth and depth.

    For example, it may literally be someone's first time looking at SQL when they're confronted with the following:

        SELECT *
        FROM classrooms, enrolments
    It is inherently less clear than the following:

        SELECT *
        FROM classrooms
        LEFT JOIN enrolments
        ON = classroom_id;
    It's easier to figure out bugs too, because it leaves more room to understand what the person intended to do rather than what they actually did. For example, say the classrooms table had, due to legacy issues, a nullable id field, but we didn't want to include those classrooms. Someone may have inadvertently typed:

        SELECT *
        FROM classrooms
        LEFT JOIN enrolments
        ON = classroom_id
        WHERE classroom_id IS NOT NULL;
        //    ^ should have been
    Because the NOT NULL criteria is in the where clause, it's clear it's not supposed to impact the join. It's supposed to be part of the filter. I know it's the same thing practically, but they're conceptually different.

    Further, as you get into more complex queries (unions, recursive, cross joins, aggregations) stuffing everything into one hairy WHERE statement gets impossible to understand because it does everything all at once.

  • codeulike 2 months ago

    But sometimes you want to outer join on the ID matching and some other condition, and thats where the JOIN syntax gets more useful, because the 'outer' then obviously applies to a whole series of logic statements that are and/or'd together. If you write reports its not long before you start hitting really obscure joins and conditions and subqueries and so on.

  • jontro 2 months ago

    How would you solve a left join like this with that kind of syntax?

    LEFT JOIN ( inner join table2 ON table2.ID = table1.table2id and table2.othertype = 1234 ) on table1.outerid = outertable.Id

    • Svip 2 months ago

      The subquery would be in the WHERE clause, and it would be called for each row, thus making it N times slower than the JOIN syntax. There is a reason why this style has been abandoned.

      • kijin 2 months ago

        Modern RDBMS's are smart enough to figure out when the result of a subquery doesn't depend on rows outside of it. Then the subquery is run only once, and its result is treated as a temporary table to be JOINed.

        • Svip 2 months ago

          For most simple queries, that is correct. But if the query gets complicated, and/or the tables involved have several indexes, where several might be useful, then it might make the wrong decision. I've written plenty of complicated queries, where I have to help the query optimiser along.

          Although, the most modern I've had this experience with is SQL Server 2014, but that definitely still needed guiding a few times.

    • snidane 2 months ago

      How about

      select * from outertable, (select *, outerid from table1,table2 where table1.table2id = table2.ID and table2.othertype = 1234) t12 where outertable.Id =\ t12.outerid

sheeshkebab 2 months ago

Contrary to articles conclusion, there are databases (i.e. Oracle) where it doesn’t matter how you write joins, but oracle also has support for left/right/full outer joins in the where clause

Left join ... is the same as leftcol=rightcol(+) (oracle has special syntax for this with that + sign).

In summary, depending on database, it may not matter how you write these joins, but it’s best to stick to SQL standard using JOIN if you’d like cross database support.

  • irrational 2 months ago

    I have noticed that many authors of SQL articles don't seem to be aware of Oracles quirks. For example, recently I read an article about NULLs in SQL. The author didn't specify any particular database, but wrote as if what he was saying applied to all relational databases. But Oracle treats empty strings and NULLs as the same thing, which I don't think the article's author was aware of.

    • RHSeeger 2 months ago

      > treats empty strings and NULLs as the same thing

      Wow, that's ridiculous. They are very clearly two different things. For example "this has no value" and "we don't know the value of this".

      • paulddraper 2 months ago

        More like "this value is empty text" and "this has no value/is unknown".

        I agree though. Makes as much sense as treating 0 and NULL as the same.

        • kijin 2 months ago

          What does ANSI SQL say about the matter? If Oracle's behavior is non-standard, it makes sense for the aforementioned author to ignore it.

          • irrational 2 months ago

            My understanding is that Oracle's behavior existed prior to the standard. That is, Oracle came first, then the standard was written years later. Oracle hasn't changed to match the standard because of issues with backwards compliance.

    • lukaseder 2 months ago

      > I have noticed that many authors of SQL articles don't seem to be aware of Oracles quirks

      I've spent much more time with Oracle than with other RDBMS. So, what made you think so?

      > The author didn't specify any particular database, but wrote as if what he was saying applied to all relational databases

      Yes, in this article's case, the article and conclusion applies to all relational databases.

      > But Oracle treats empty strings and NULLs as the same thing, which I don't think the article's author was aware of.

      I am aware of this, but how is it related to the article?

    • Svip 2 months ago

      Indeed, Sybase for instance treats a single space (e.g. ' ') the same as an empty string (e.g. ''). For the purpose of concatenating in Sybase, NULL is '' (and thus ' '). So `SELECT a = "a" + NULL` would yield a = "a". But so would `SELECT a = "a" + " "`.

      MSSQL allows this behaviour, since it's based on Sybase, but you have to explicitly turn it on.

    • panarky 2 months ago

      "this may not continue to be true in future releases, and Oracle recommends that you do not treat empty strings the same as nulls"

      • mcguire 2 months ago

        How are we supposed to treat them differently? Oracle's the one putting the damn nulls in the db.

        (Recently tripped over this one.)

      • Foobar8568 2 months ago

        They finally did a 180 on that bullshit after years trying to convinced everyone that empty string = null. (marketing, blogs, oracle ace etc.)

  • lukaseder 2 months ago

    Oracle's (+) syntax does not allow for full outer joins.

    The conclusion would be the same with Oracle's syntax. It's just more obscure, as it would turn into a "fa.film_id < 10" vs "fa.film_id(+) < 10" discussion.

jimmytucson 2 months ago

As soon as you filter on the table you’re joining with in the where clause it becomes an inner join.

    FROM foo
    LEFT JOIN bar ON ...
    /* Makes it an inner join: */
    WHERE bar.baz ...
  • astine 2 months ago

    Only if you're resting for a non null value. If you test for a null value, you'll be looking for all values in one table that don't have a match in the other table.

    • Svip 2 months ago

      Indeed, a faster way to check if a table does _not_ have a comparable row is a LEFT JOIN with a WHERE NULL clause:

      SELECT t1.* FROM t1 LEFT JOIN t2 ON = t1.foreign_id WHERE IS NULL

      Is a lot faster than the naïve alternative:

      SELECT t1.* FROM t1 WHERE NOT EXISTS (SELECT * FROM t2 WHERE id = t1.foreign_id)

      • brokensegue 2 months ago

        Unless the query is optimized?

        • Svip 2 months ago

          For a simple query like this, then yes, it will be optimised. But once your queries get more complicated, the optimiser might not be able to make that deduction on its own.