Python and manually creating the requirements.txt vs. pip freeze

20 points by robinweiss 2 years ago

Hey everyone!

I just had an argument with a colleague that I feel like I had a million times before: Should the requirements.txt be manually created with only first-order dependencies or automatically created through `pip freeze` and hence contain all the transitive dependencies?

I am very strongly in favor of the `pip freeze` approach since it guarantees reproducible virtual environments (at least in theory). My colleague, who is, unfortunately, developing on Windows, tells me that this regularly breaks on his machine and insists on manual files with only the first-level dependencies.

Are there known cross-platform issues with the `pip freeze` approach? I have searched the internet high and low and couldn't find a sound argument for either side being right or wrong on this topic.

I would appreciate any input on this :)

roberrrrrrt 2 years ago

I put direct dependencies in requirements.txt. Then I run `pip freeze > requirements.lock` within a docker container of the team's target deployment environment. I commit both files, and the deployed application is configured to use the requirements.lock.
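A minimal sketch of the deploy side of this flow (the base image tag, paths, and entry point are hypothetical examples, not from the comment):

```dockerfile
# Build in the target deployment environment and install from the frozen file,
# so production gets exactly the versions that were locked.
FROM python:3.11-slim
WORKDIR /app
COPY requirements.lock .
RUN pip install --no-cache-dir -r requirements.lock
COPY . .
CMD ["python", "app.py"]
```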

frou_dh 2 years ago

Look at what the well-regarded pip-tools (https://github.com/jazzband/pip-tools) does, for a disciplined approach to these requirements files. You don't necessarily need to adopt that tool, but can maybe take inspiration from it.

  • robinweiss 2 years ago

    Thanks! I personally use those tools, but a lot of times it's hard to convince people to switch from their hand-written requirements.txt. I was looking for some bullet-proof arguments to end that discussion once and for all :D

    • mlegendre 2 years ago

"Reproducible environments" sounds like a strong argument to me. If you don't have a reproducible environment, you have no guarantee that your code will work at any point in time. The more complex the project and its dependencies, the more likely breakage will happen with a new version of a dependency.

Maybe this is something that is best learnt the hard way? Something you understand once you reflect back on how many hours you have lost fighting with dependencies, instead of doing what you actually wanted to do with the code (be it running it, developing, debugging, bisecting...).

Freezing dependencies comes at a cost though, especially since dependency management in Python is a PITA. So this is a trade-off. For simpler projects or scripts, I personally don't bother with freezing dependencies, and I handle issues when they happen. (And they do happen.)

    • mattbillenstein 2 years ago

I use pip-compile in my flows - it's the best of both worlds: you declare your direct dependencies in requirements.in and get the full-blown locked dependencies in requirements.txt. It's my preferred way to manage this now.
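A sketch of what that looks like on disk (the package names are made-up examples):

```
# requirements.in — hand-written, direct dependencies only
flask
requests>=2.28

# pip-compile requirements.in   → writes a fully pinned requirements.txt
# pip-sync                      → makes the venv match requirements.txt exactly
```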

      • frou_dh 2 years ago

        And a key thing is that both files can be committed to source control, not just one or the other.

        • mattbillenstein 2 years ago

          Yeah, and the other bits about pip-compile -P to upgrade a single package and whatnot are very handy.

n8henrie 2 years ago

Depending on the project I've often done both, with something like `pip list --not-required | awk something-to-strip-versions > requirements.txt; pip freeze > requirements.lock`. I generally keep both in VCS so I can easily install all up-to-date versions for dev; once it works fine and tests pass, I overwrite the "lockfile" (problems with transitive dependencies acknowledged; probably not the right word). End users can fall back to just installing from the lockfile if they're not doing dev work.
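A guess at what the elided awk step could be (the author doesn't say): `pip list --not-required` prints a two-line-header "Package Version" table, so one option is to skip the header and keep only the name column. The table below is canned sample input standing in for the real `pip list` output:

```shell
# Sample `pip list --not-required`-style output; keep only package names.
printf 'Package    Version\n---------- -------\nrequests   2.28.1\nflask      2.2.2\n' \
  | awk 'NR > 2 { print $1 }'
# prints:
# requests
# flask
```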

EDIT: re: cross-platform concerns, I'll also note that since moving to an M1 Mac I've never been so annoyed with the Python packaging situation, and even my own libraries are routinely broken without obvious paths forward. Learning docker as a result, and looking to rust for new projects.

samwillis 2 years ago

Somewhat convoluted but I have started experimenting with treating my venv as immutable. So I do this:

- Manually created requirements.txt with top level production requirements

- Manually created requirements-dev.txt with top level dev requirements.

- pip freeze > requirements-lock.txt (with no dev dependencies)

When installing a new requirement:

1. Delete old venv and create new empty one.

2. pip install -r requirements.txt

3. pip freeze > requirements-lock.txt

4. pip install -r requirements-dev.txt

This ensures your requirements-lock.txt doesn’t include dev dependencies. With most packages that have binaries now releasing wheels, the install isn’t too slow.

Technically you could just save a copy of the venv after step 2 to speed up the process of installing new requirements.
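The steps above can be sketched as a script. This demo uses empty placeholder requirements files so the mechanics are visible without any real installs; in practice the two files would list actual dependencies:

```shell
set -e
cd "$(mktemp -d)"
touch requirements.txt requirements-dev.txt    # empty placeholders for the demo

rm -rf .venv && python3 -m venv .venv          # 1. delete old venv, create a fresh one
.venv/bin/pip install -q -r requirements.txt   # 2. production deps only
.venv/bin/pip freeze > requirements-lock.txt   # 3. lock before any dev deps go in
.venv/bin/pip install -q -r requirements-dev.txt   # 4. dev deps, after the lock
```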

bestcoder69 2 years ago

Tangentially related, I came up with something atrocious last night.

If you install pex with pipx (or put pex in $PATH another way), you can declare your python script dependencies in the shebang line, like:

  #!/usr/bin/env pex requests==2.28.1 other_dep==1.2.3 --exe

  import requests

  def main():
    ...
Then, you can `chmod +x` your script and call it with "--" between the script name and your arguments, if any:

  ./main.py -- arg1 arg2

Requests will be installed the first time you run it, then pex should use its cache for subsequent invocations.

I wonder if anyone's done this in earnest?

  • robinweiss 2 years ago

    This ... I don't know what to say :D I guess I have to thank you, this helps me see the bigger picture and not be too hung up on the manual requirements.txt files of my colleagues. Things truly could be worse.

nicolaslem 2 years ago

My simple low-tech approach to the problem:

requirements-base.txt:

    flask
    celery
    package-with-broken-release==2.4.3

requirements.txt generated via pip freeze:

    amqp==5.1.1
    billiard==3.6.4.0
    celery==5.2.7
    click==8.1.3
    click-didyoumean==0.3.0
    click-plugins==1.1.1
    click-repl==0.2.0
    Flask==2.2.2
    package-with-broken-release==2.4.3
    ...

  • robinweiss 2 years ago

    The only problem I see here is that these files could diverge and lead to all kinds of annoying problems down the line. Would be great if they could be automatically linked somehow.
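One low-tech guard (a hypothetical check, not from the thread): fail CI if any name in requirements-base.txt is missing from the frozen requirements.txt. This sketch assumes base entries are bare package names; the demo writes its own sample files:

```shell
# Demo files; in a real repo these would be the committed requirements files.
cd "$(mktemp -d)"
printf 'flask\ncelery\n' > requirements-base.txt
printf 'celery==5.2.7\nclick==8.1.3\nFlask==2.2.2\n' > requirements.txt

# Every base package must appear (case-insensitively) as a pin in the freeze.
missing=0
while read -r pkg; do
  grep -qi "^${pkg}==" requirements.txt || { echo "missing: $pkg"; missing=1; }
done < requirements-base.txt
test "$missing" -eq 0 && echo "in sync"
```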

tracnar 2 years ago

pip freeze might not work with platform-specific dependencies: you only get the dependencies of the platform of whoever ran `pip freeze`. So you might end up with `linux-only==1.2.3` in your requirements.txt, which won't install on Windows (or you'll be missing Windows-specific deps).
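For reference, requirements files support environment markers, which are one way to make such platform-specific pins explicit (package names and versions here are illustrative):

```
# Only installed on the matching platform; ignored elsewhere.
linux-only==1.2.3 ; sys_platform == "linux"
pywin32==305 ; sys_platform == "win32"
```

Plain `pip freeze` output does not include these markers, though, so they have to be added or generated by a tool.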

poetry should handle this, and you also get hashes for free!

sls1 2 years ago

I'd strongly advise using pip freeze, as you might otherwise end up breaking your requirements.txt through typos, syntax issues, or wrong versions eventually. It also enforces a standardized listing of the dependencies and makes things easier.

However, I've been using poetry for a while as it makes it easier for me to just focus on development and not hassle with virtualenvs, broken APIs (pip search), and manually keeping files in sync.

jmconfuzeus 2 years ago

Never use pip freeze.

Instead, install pip-tools[0] then use the pip-compile command.

Why?

pip freeze will also pin dependencies of your dependencies, which makes your requirements.txt hard to read and extend.

Never manually create requirements.txt either because a programmer's job is to automate boring tasks like dependency pinning.

[0] https://github.com/jazzband/pip-tools

  • robinweiss 2 years ago

What's bad about pinning the transitive dependencies? I feel like otherwise this could lead to really hard-to-debug errors down the road if second-order dependency versions diverge between developers.

    And I fully agree on the tools. I personally would always use something like pip-tools or poetry, but it's not always an option to use them, unfortunately.

deeteecee 2 years ago

I'm not an expert, but I think you can just put the first-order dependencies in requirements.txt, then do a pip freeze into constraints.txt and run `pip install -r requirements.txt -c constraints.txt`. That'll make you both happy.

Before constraints.txt came into play, I would've done what you said with pip freeze. With only first-order dependencies, that sounds more likely to break at some point due to the other potential version changes. I think your colleague needs to explain why things are breaking in his case.
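A sketch of the file layout this implies (package names are made-up examples):

```
# requirements.txt — first-order dependencies only
flask

# constraints.txt — generated with `pip freeze`; pins versions but installs nothing
click==8.1.3
Flask==2.2.2
itsdangerous==2.1.2

# install with: pip install -r requirements.txt -c constraints.txt
```

A nice property of constraints files is that they only constrain versions of packages that would be installed anyway; entries for packages nothing depends on are simply ignored.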

mkranjec 2 years ago

Is using Poetry, pdm, or even the pip-tools linked above not a viable solution?

  • robinweiss 2 years ago

    Absolutely they are. But I work a lot with teams that are on the fringes of tech and business like Data Science & BI. I am trying to find the right balance between doing things cleanly but not overburdening non-engineers with too many tools and abstractions.

neeh0 2 years ago

Any reason not to use `poetry` these days?