pedrovhb a year ago

The author conflates generators[0] with generator expressions[1]. He's correct in that generator expressions lazily evaluate and lists don't, but actual generators are a much more powerful construct - they're functions which use the `yield` keyword. They're most frequently used as iterators, but they can also communicate with the outside state via receiving values (`x = yield y`), being explicitly closed with `.close()`, or having an exception raised at the yield point with `.throw()`.
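
A minimal sketch of those three channels, using a made-up `running_average` generator (the name and the reset-on-`ValueError` behavior are just for illustration):

```python
def running_average():
    """Generator function: yields the current average, receives values via send()."""
    total = 0.0
    count = 0
    average = None
    try:
        while True:
            try:
                # `x = yield y`: hand `average` out, receive the sent value back
                value = yield average
            except ValueError:
                # .throw(ValueError(...)) surfaces here, at the paused yield
                total, count, average = 0.0, 0, None
                continue
            total += value
            count += 1
            average = total / count
    finally:
        # .close() raises GeneratorExit at the paused yield, so cleanup runs here
        print("generator closed")


gen = running_average()
next(gen)                                     # prime: run up to the first yield
first = gen.send(10)                          # 10.0
second = gen.send(20)                         # 15.0
after_reset = gen.throw(ValueError("reset"))  # state is reset, yields None
third = gen.send(4)                           # 4.0
gen.close()
```

Note the initial `next(gen)`: a generator has to be advanced to its first `yield` before it can receive anything.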

They're so powerful in fact, that asyncio is implemented under the hood with a lot of generator cleverness. You rarely see it, but the `__await__` dunder method which powers `await` is just a synchronous function which returns a generator.
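
To make that concrete, here is a small sketch with a hypothetical `Ready` awaitable (not anything from asyncio itself). Its `__await__` is an ordinary generator function, and the coroutine can be driven by hand with `.send()` just like a generator:

```python
import asyncio


class Ready:
    """Hypothetical awaitable that suspends once, then produces a value."""

    def __init__(self, value):
        self.value = value

    def __await__(self):
        # A plain synchronous generator function: no async/await in here.
        yield              # suspend once; whoever drives the coroutine sees None
        return self.value  # becomes the result of `await Ready(...)`


async def main():
    return await Ready(42)


# Drive the coroutine manually, without an event loop.
coro = main()
steps = []
try:
    while True:
        steps.append(coro.send(None))
except StopIteration as stop:
    result = stop.value

# The same coroutine function also runs under a real event loop.
via_loop = asyncio.run(main())
```

The manual loop is roughly the job an event loop does, minus the scheduling and I/O.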

Generator expressions are useful when there are memory constraints that preclude using a list comprehension, but list comprehensions are actually faster because of CPython's internal optimizations.
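
A rough way to check both halves of that claim on your own machine. This is only a sketch: `N` is arbitrary, `sys.getsizeof` reports just the container object (not the integers inside it), and the timings will vary by Python version and hardware, so no numbers are claimed here.

```python
import sys
import timeit

N = 100_000

squares_list = [n * n for n in range(N)]  # materializes all N results up front
squares_gen = (n * n for n in range(N))   # computes nothing until iterated

list_bytes = sys.getsizeof(squares_list)  # grows with N
gen_bytes = sys.getsizeof(squares_gen)    # small and independent of N

# Time consuming each form end to end.
list_time = timeit.timeit(lambda: sum([n * n for n in range(N)]), number=20)
gen_time = timeit.timeit(lambda: sum(n * n for n in range(N)), number=20)

print(f"list: {list_bytes} bytes, generator: {gen_bytes} bytes")
print(f"sum over list comprehension: {list_time:.3f}s")
print(f"sum over generator expression: {gen_time:.3f}s")
```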

[0] https://peps.python.org/pep-0255/ [1] https://peps.python.org/pep-0289/

  • henrydark a year ago

    Generators and generator expressions are useful every time your api involves an unbounded sequence.

    I once wrote a library of signal processing components based on the premise that the raw input signal is such a sequence, and its concrete type was Iterable[np.ndarray] (iterable of chunks of audio signal). Since the signal is unbounded - we just continue sampling from, say, a microphone forever - the types can't involve finite sequences like lists. A general iterable works here, and then components that transform the sequence are themselves implemented as generators, or if they are very simple, as generator expressions.

    Consider the following function for amplifying a signal:

      def amplify(signal: Iterable[np.ndarray], a: float):
        return (a*chunk for chunk in signal)
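
    A sketch of how such components chain together. To stay within the standard library it swaps np.ndarray for plain lists of floats, and `fake_microphone` and `clip` are made-up stand-ins, not part of the library described above:

```python
from itertools import count, islice
from typing import Iterable, Iterator, List

Chunk = List[float]  # stand-in for np.ndarray in this sketch


def fake_microphone(chunk_size: int = 4) -> Iterator[Chunk]:
    """Hypothetical unbounded source: yields chunks forever."""
    for start in count(step=chunk_size):
        yield [float(start + i) for i in range(chunk_size)]


def amplify(signal: Iterable[Chunk], a: float) -> Iterator[Chunk]:
    # Simple enough to be a generator expression.
    return ([a * sample for sample in chunk] for chunk in signal)


def clip(signal: Iterable[Chunk], limit: float) -> Iterator[Chunk]:
    # A slightly larger component, written as a generator function.
    for chunk in signal:
        yield [max(-limit, min(limit, sample)) for sample in chunk]


# Nothing is sampled until a consumer pulls chunks through the pipeline.
pipeline = clip(amplify(fake_microphone(), 2.0), limit=10.0)
first_two = list(islice(pipeline, 2))
# [[0.0, 2.0, 4.0, 6.0], [8.0, 10.0, 10.0, 10.0]]
```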
    
    You can flip the whole infinite streams interface and talk about "pushing" chunks around, and it's an esthetic choice.
  • enragedcacti a year ago

    > but they can also communicate with the outside state via receiving values (`x = yield y`)

    I'm dumbfounded that I have written python for as long as I have without ever encountering this, TIL!

    https://peps.python.org/pep-0342/

    • photochemsyn a year ago

      Just found an interesting example explaining how that works, I wasn't aware of it either (this site also has nice 'yield from' explainer):

      https://lerner.co.il/2020/05/08/making-sense-of-generators-c...

      > "The above code looks a bit weird, in that “yield” is on the right side of an assignment statement. This means that “yield” must be providing a value to the generator. Where is it getting that value from?

      Answer: From the “send” method, which can be invoked in place of the “next” function. The “send” method works just like “next”, except that you can pass any Python data structure you want into the generator. And whatever you send to the generator is then assigned to “x”."

      • memco a year ago

        Knew about it, but didn't quite understand how to use send until I read through the link you posted. Thanks for that!

        Regarding that article specifically I think it could've done better service to 'yield from' had it mentioned recursive generators. One example would be flattening nested lists: have a flatten function that checks if its argument is iterable: if not just yield the thing else yield from flatten for every item in the iterable.
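
        Roughly what I have in mind, as a sketch (treating strings and bytes as atoms is my own choice here, otherwise they would recurse forever):

```python
from collections.abc import Iterable


def flatten(item):
    """Recursively yield the non-iterable leaves of arbitrarily nested iterables."""
    if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
        for sub_item in item:
            # Delegate to the recursive generator; its values pass straight through.
            yield from flatten(sub_item)
    else:
        yield item


nested = [1, [2, [3, 4]], (5, "six"), []]
flat = list(flatten(nested))
# [1, 2, 3, 4, 5, 'six']
```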

    • tomn a year ago

      Back in the days before async/await, twisted (an asynchronous I/O framework) used these to emulate async/await -- you could yield a deferred value (like a promise/future), and it would pass it back when the value was available, saving some effort making your state explicit and writing many callbacks:

      https://twisted.org/documents/21.2.0/api/twisted.internet.de...

    • knlb2022 a year ago

      I used this in AoC for the first time a couple of years ago, as a consequence of diving deeply into asyncio. https://explog.in/static/aoc2019/AoC23.html

      • memco a year ago

        I'm glad to see someone with the same idea was able to take it to completion. Life got in the way for me and I never got to implement that part. If I ever get back around to it I will consult your solution if I get stuck.

  • nijave a year ago

    Besides memory, they're also great for expensive operations. You can do pagination with network calls, disk I/O, etc.
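
    A sketch of the pagination case. `fetch_page` is a made-up stand-in for a network call (it just slices a local list and records that it was called), so the laziness is visible:

```python
from itertools import islice

calls = []  # records which pages were actually requested


def fetch_page(page_number, page_size=3):
    """Hypothetical stand-in for an expensive network or disk request."""
    calls.append(page_number)
    data = list(range(10))
    start = page_number * page_size
    return data[start:start + page_size]


def iter_items(page_size=3):
    """Yield items one by one, fetching the next page only when needed."""
    page_number = 0
    while True:
        page = fetch_page(page_number, page_size)
        if not page:
            return  # an empty page means there is nothing left
        yield from page
        page_number += 1


first_four = list(islice(iter_items(), 4))
# first_four == [0, 1, 2, 3]; only pages 0 and 1 were fetched
```

    The caller just iterates; pages it never reaches are never requested.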

  • charlieflowers a year ago

    To be fair, I don’t see a conflation here. He chose not to elaborate on generator functions, but that appears to be due to a defensible conscious choice about article scope.

    Imo it’s a well written, technically accurate article.

Waterluvian a year ago

I have a rule for comprehensions that I try not to break: they should only have two phrases in them. Whether that’s two “for x in y” or one “for” and one “if”. Anything more becomes confusing and illegible.

Complicated comprehensions are not clever or impressive, they’re annoying.

  • code_biologist a year ago

    You can abuse it, but the lineage of comprehensions (Haskell's list monad, SQL) makes it clear that longer examples are certainly useful.

    Haskell list comprehensions limited to two clauses, SQL queries limited to a single FROM and a WHERE, or a FROM and a JOIN sound pretty limited. I use Python comprehensions to do those same kinds of data querying without leaving the Python context, so it seems weird to limit myself for the same reason.

    Using imperative constructs certainly isn't wrong, but they tend to crawl off the right hand side of the page.

    • maxbond a year ago

      Yeah, I personally find a triply-nested comprehension to be more readable than a triply-nested loop. Once you get comfortable with the syntax, it's nice to be able to fit it compactly on your screen, so that your eyes can roll all over it without having to scroll. I agree with the general principle that terseness can be a detriment to readability, but sometimes it's nice for everything to be in one place.

      I suppose this is only true in combination with verbose function names that capture the logic of the operation you're performing, with the comprehension expressing how this logic is composed. In my mind, comprehensions are sugar over map(), filter(), etc., and before I was using comprehensions I had a mess of nested calls to these functions with lots of lambdas. Comprehensions are a big step up from that as far as readability goes.

    • im3w1l a year ago

      Tbh I often find I would like to use variables in sql. Something like

        americans := SELECT username FROM users WHERE country = "us";
        SELECT url, visits FROM posts, americans ON posts.author = americans.username ORDER BY visits DESC LIMIT 10;
      • hackandthink a year ago

        Common Table Expressions are often good enough:

        With americans as (select ...) select from ..., americans ...

        https://www.draxlr.com/blogs/common-table-expressions-and-it...

        • code_biologist a year ago

          The lack of variables is a major pain point for many devs when learning SQL. CTEs address one aspect. It's maybe not 1:1 or ideal, but lateral joins are a great way to get in-query variables and are extremely powerful for controlling cardinality!

          https://sqlfordevs.com/for-each-loop-lateral-join

        • im3w1l a year ago

          Oh huh, I didn't know that was a thing. Exactly what I wanted.

      • bit_for_a_byte a year ago

        It wouldn’t work as a variable, you’d get race conditions unless you explicitly wrapped both query statements in a transaction.

        It could work as an expression which gets injected into the 2nd query. So the second query would effectively have a nested SELECT statement as defined by the americans expression.

        But this has the downside that if you reuse americans you have to know that it will be recomputed at each query. So unless you know the semantics of these variable-expression things over time you get race conditions. The solution again would be to wrap everything in a transaction, but at that point you just have to constantly think about transactions, which you don’t if your query is a one-liner.

        Also good luck writing a query planner that has to efficiently take into account variables.

      • hyencomper a year ago

        For cases like these, I use sql views - CREATE VIEW view as SELECT * .. Then DROP VIEW view.

  • ddejohn a year ago

    My rule has always been that if it doesn't fit on one line it gets re-written, or if practical, I will factor out what I can into helpers. As soon as a comprehension spills over into a new line, the readability tanks for me.

    • 83457 a year ago

      With proper indentation, a comp with long variable names or a two-part if is very readable for me when the if is on its own line. Put the from on its own line and it is like a SQL statement.

  • TheAdamist a year ago

    Yeah, I did a multi-level nested one once, including else clauses, and it didn't actually assign to anything, only side effects. Felt clever at the time.

    But it was completely incomprehensible and unmaintainable. So got replaced by me later.

    • Waterluvian a year ago

      omg. A comprehension for side effects only. That’s so horrible I wish I could see it. :)

      • leetrout a year ago

        I worked with a guy that did this.

        Instead of writing for loops he would write comprehensions everywhere because "list comprehensions are pythonic".

        Think like:

          [send_welcome_email(u) for u in users]
        
        
        Just hanging out in the middle of the module or function.
        • 83457 a year ago

          I prefer

          sent = [(u, send_welcome_email(u)) for u in users]

          • Ultimatt a year ago

            I prefer

            not_sent = users - {u for u in users if send_welcome_email(u)}

          • Waterluvian a year ago

            Just make the function return the argument passed into it. ;)

      • BeetleB a year ago

        I do it all the time :-)

        Just make sure it isn't consuming a lot of RAM and it's totally OK.

  • cuteboy19 a year ago

    Yes, the equivalent itertools expression is much more easily readable

    • code_biologist a year ago

      What do you mean by the equivalent itertools expression? It's not clear to me how an expression like this is made more readable by itertools. I'm not sure how itertools is even relevant.

          sum(
              a + b
              for a in range(10)
              if a % 2 == 0
              for b in range(10)
          )
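
      For anyone unsure how the clauses nest: they read left to right exactly like nested statements. A sketch of the equivalent loops:

```python
# The generator expression from above.
total = sum(
    a + b
    for a in range(10)
    if a % 2 == 0
    for b in range(10)
)

# The same computation written as nested statements, in the same clause order.
loop_total = 0
for a in range(10):
    if a % 2 == 0:
        for b in range(10):
            loop_total += a + b

# Both are 425: each even a contributes 10 * a + 45.
```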
      • cuteboy19 a year ago

        I don't even understand what this code does but I was referring to itertools.chain for the nested comprehension in the post

    • Waterluvian a year ago

      Yes!! itertools. Always be aware of itertools.

RobertoG a year ago

Well, I learned a few details from the article, so, thanks.

On the other hand, and maybe it's me being too literal, I think it gives the wrong impression about what a generator is. It has nothing to do with one-liners using brackets. My generators use 'yield' and look like a 'for' and they are still generators.

gompertz a year ago

For those interested... Generators first appeared in Icon programming language in 1977. https://en.m.wikipedia.org/wiki/Icon_%28programming_language... (ctrl+f Generators and Goal Directed evaluation)

Incredible it has taken 40+ years for some languages to just catch up to the concept!

  • antod a year ago

    40yrs? Python has had generators for approx 20yrs already. Unless you're talking about a different language.

antirez a year ago

I always thought list comprehension shows the problem with Python: too many ad hoc ideas. It's a language that looks a lot more like a sum of many tools than a few well chosen orthogonal ideas. Ruby is a much better language with a much worse implementation, community, set of libraries.

js2 a year ago

I find chain much easier to read than nested comprehensions:

    from itertools import chain
    list(chain(*trees))
    list(chain.from_iterable(trees))
    list(chain(*dog_breeds.values()))
    list(chain.from_iterable(dog_breeds.values()))
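
For comparison, a runnable side-by-side sketch. The contents of `trees` and `dog_breeds` are made up here, since only their shapes matter (a list of lists, and a dict of lists):

```python
from itertools import chain

# Hypothetical sample data with the assumed shapes.
trees = [["oak", "elm"], ["pine"], []]
dog_breeds = {"small": ["pug"], "large": ["mastiff", "great dane"]}

# Nested comprehension: the outer loop comes first, the inner loop second.
flat_trees_comp = [tree for group in trees for tree in group]
flat_breeds_comp = [breed for breeds in dog_breeds.values() for breed in breeds]

# chain.from_iterable says "flatten one level" directly, and stays lazy
# until list() consumes it.
flat_trees_chain = list(chain.from_iterable(trees))
flat_breeds_chain = list(chain.from_iterable(dog_breeds.values()))
```

`chain.from_iterable(x)` also avoids unpacking the whole outer iterable into arguments the way `chain(*x)` does.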
  • Jensson a year ago

    Nested comprehensions are a more powerful construct: you can do more with them than with chain or the itertools module. Chain is cleaner for this case since it is weaker, so it is clearer what it does, but since list comprehension is more powerful you ought to learn it first and chain is just a nicety.

    • js2 a year ago

      I’ve been using Python as my primary language since before it had list comprehensions and I still find nested comprehensions to be hard to comprehend, hard to compose, and hard to refactor.

      I usually resort to building up generator expressions, using itertools, or writing traditional loops.

      $0.02.

aunty_helen a year ago

Nice article. I learnt some stuff.

WRT length of a statement, look at how many operations are being done. If it’s doing something like filtering that should be 2-3 ops on the same line. Perfect application of a comprehension with an if statement.

If it’s 5+ different things jammed into a comprehension, you’re failing PR. Don’t think it needs to hit 2 LOC before refactoring.

Build a list, take a property, call a function with that value, iterate, if something, else something, all on one line?? Aren’t you clever. Now, let’s write professional code.

xwowsersx a year ago

   In a REPL session, "_" means "the previous output"

Wow TIL. After all these years. I would always just arrow up and go to the beginning of the line to bind to a variable if I missed it the first time. Nice to learn something new, however small!
  • JonathanMerklin a year ago

    I just checked and this is true for the Node.js REPL (Tried in v12, v14, v16, and v18) as well, which I'm also just learning for the first time (and I had the same workflow as you before this). Neat!

    • xwowsersx a year ago

      Ha! Funny that you'd even think to check. In Scala and Python (I'm guessing other langs as well), the underscore can be used for ignored variables, among other things. Never thought it had another usage in the REPL. Actually really useful!

      • JonathanMerklin a year ago

        Yeah, I've used it in Haskell for ignored variables, holes, type wildcards, etc.

        Another interesting little rabbit hole that this little conversation led me to: In the Chrome (and Firefox) developer tools, the variable for the previous output is "$_", which I imagine is the case because of how common it was to assign the main export of the Underscore.js library to "_" (and in the days before a lot of websites would mostly e.g. have their site's code in a webpack-induced closure, they would e.g. grab underscore (or lodash, in those times?) from a CDN and pollute the global scope).

        Since _ is a valid identifier, it also turns out that in Node.js's REPL, it warns you when you clobber _, but (weirdly) not if the clobbering is with a block-scoped declaration.

          $ node
          Welcome to Node.js v18.12.1.
          Type ".help" for more information.
          > "asdf"
          'asdf'
          > _
          'asdf'
          > var _ = 1
          Expression assignment to _ now disabled.
          undefined
          > "asdf"
          'asdf'
          > _
          1
        
          $ node
          Welcome to Node.js v18.12.1.
          Type ".help" for more information.
          > const _  = 1
          undefined
          > _
          1
          > "asdf"
          'asdf'
          > _
          1
        
        And obviously, there's no warning for assigning to it outside of the REPL (`node -e "var _ = 2;"`), which makes obvious sense to me.
buttocks a year ago

I love explainers like these because my Python code looks like C without the curly braces and semicolons.

pizza a year ago

Neat trick with generators: refactoring out related looping structures across multiple different functions, for greater reusability
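
A sketch of what I mean, with a made-up grid example: the bounds-checked neighbor walk is written once as a generator, and each function that needs it just iterates.

```python
def neighbors(grid, row, col):
    """Yield (row, col, value) for the in-bounds orthogonal neighbors of a cell."""
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
            yield r, c, grid[r][c]


# Two different functions share the looping structure above.
def neighbor_sum(grid, row, col):
    return sum(value for _, _, value in neighbors(grid, row, col))


def has_larger_neighbor(grid, row, col):
    return any(value > grid[row][col] for _, _, value in neighbors(grid, row, col))


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
center_sum = neighbor_sum(grid, 1, 1)              # 2 + 8 + 4 + 6 = 20
corner_sum = neighbor_sum(grid, 0, 0)              # 4 + 2 = 6
corner_is_peak = not has_larger_neighbor(grid, 2, 2)  # True: 9 beats 6 and 8
```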

henron a year ago

I really prefer working with functional transformations over containers (including iterators).

  • saeranv a year ago

    What's a functional transformation?

    • ghostwriter a year ago

      fmap, fold, traverse, join
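
      Very loosely, in Python terms (a sketch only: these are rough analogues for lists and iterators, not the general Haskell notions, and traverse has no direct standard-library counterpart, so it is left out):

```python
from functools import reduce
from itertools import chain
from operator import add

numbers = [1, 2, 3, 4]

# fmap: apply a function to every element, keeping the container's shape.
doubled = list(map(lambda n: n * 2, numbers))

# fold: collapse the container into one value with a combining function.
total = reduce(add, numbers, 0)

# join: flatten one level of nesting.
joined = list(chain.from_iterable([[1, 2], [3], []]))
```

      Each step takes a container and a function and returns a new value, with no explicit loop or mutation.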

      • saeranv a year ago

        I guess I'm lacking some background to understand this.

        So, I recognize fold, from the above operations ('reduce' in python), what makes it a "functional transformation"?

        I'm googling function/functional transformations, and the results refer to translating, flipping or scaling a function, like transforming f(x) to f(x + 2) or f(x) + 2. It doesn't seem like it has anything to do with what you guys are referring to, but correct me if I'm wrong.

ggm a year ago

A parenthetical comment: I've always liked the imagery a "crash" course gives me:

Either it's the course of treatment you get after a crash

Or it's the course of actions which is going to lead to a crash.

  • eyelidlessness a year ago

    I’ve always taken it to mean it’s fast and unthorough, which I suppose aligns with your latter case but I think it’s a specialized usage still.

    • ggm a year ago

      It's usually the least worst choice for somebody thrown into unexpected situations. Deliberately walking to a crash course on anything to avoid the long learning path is like autodidactism: it works, kinda, and kinda not.

      I should add I'm a serial offender.