Show HN: Panoptisch – A recursive dependency scanner for Python projects

First of all: I'm glad that more people are trying to tackle this problem!

That being said, I'm not sure if I would encourage this approach: this conflates modules (a property of the Python language) with dependencies (a thing that maps roughly to packages/distributions, which are a property of Python packaging). The two actually aren't that connected: there's no guaranteed 1-1 (or even 1-N) mapping between a dependency's package name and its importable modules, meaning that knowledge of a malicious package doesn't imply that you can derive how that package's module(s) get imported at runtime.

More perniciously: module names aren't static. It's pretty easy to construct a dynamic module object, or to rename (or alias) an existing module object to avoid this kind of detection.
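
A small sketch of what that can look like, using only the standard library. The names `harmless_utils` and `totally_fine` are made up for illustration; nothing here is taken from a real package.

```python
import importlib
import sys
import types

# The module name is assembled at runtime, so it never appears as a literal.
name = "".join(["so", "ck", "et"])
net = importlib.import_module(name)

# Alias the real module under an innocuous, made-up name.
sys.modules["harmless_utils"] = net
import harmless_utils  # resolved from sys.modules; this is the socket module

# Build a fresh module object that has no file on disk at all.
fake = types.ModuleType("totally_fine")
fake.connect = net.create_connection

print(harmless_utils.__name__, fake.__name__)
```

A scanner that matches on module names would have to see through all three tricks to notice that `socket` is in play.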

Finally: walking a project's import tree isn't safe in the general case! Lots of packages have side effects when imported, and malicious dependencies definitely take advantage of that ability. Running this tool might find a malicious import by virtue of actually running malicious code, which isn't ideal.
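
As a harmless illustration of import-time side effects, the sketch below writes a throwaway module to a temporary directory and imports it. The module name `side_effect_demo` and the environment variable are hypothetical; a real malicious package could do anything at this point, not just set a variable.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Top-level statements in a module run as soon as it is imported.
    Path(tmp, "side_effect_demo.py").write_text(
        "import os\n"
        "os.environ['SIDE_EFFECT_DEMO'] = 'ran at import time'\n"
    )
    sys.path.insert(0, tmp)
    try:
        # Merely importing the module is enough to execute its body.
        importlib.import_module("side_effect_demo")
    finally:
        sys.path.remove(tmp)

print(os.environ.get("SIDE_EFFECT_DEMO"))
```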

If your goal is to detect malicious API patterns at runtime (which is effectively what you're doing when you walk the package import tree), I think runtime audit hooks[1] are probably a better fit. Those aren't foolproof either, but they'll probably be more reliable (and don't require as much context awareness to determine maliciousness).
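
A minimal sketch of an audit hook, assuming Python 3.8 or later. The hook name `audit` and the choice to record only `socket.*` events are illustrative; note that hooks added with `sys.addaudithook` cannot be removed and are called for every audit event, so they should stay cheap.

```python
import socket
import sys

observed = []


def audit(event, args):
    # Called for every audit event; record only the socket-related ones.
    if event.startswith("socket."):
        observed.append(event)


sys.addaudithook(audit)

# Creating a socket (without connecting) is enough to raise "socket.__new__".
s = socket.socket()
s.close()

print(observed)
```

The hook sees the event no matter which module name the caller used to reach `socket`, which is why it does not depend on import names.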

[1]: https://peps.python.org/pep-0578/

maweki a year ago

> It's pretty easy [...] to avoid this kind of detection.
And by Rice's theorem, it is generally undecidable whether any hidden modules get loaded.
Static (or in this case maybe not so static) analysis of arbitrary code will never lead to 100% safety. You'll always need some static restrictions on what code you're even allowed to write.
- janalsncm a year ago
  
  Coming at this from a naive perspective, do we care if a module is provably, definitely imported? I think for supply chain attacks it should be sufficient to say this module might be imported and deserves attention. I assume Rice’s theorem would also say that even though malicious code is imported it might not run the malicious bits today or on this machine [1]. But that doesn’t mean I’m ok with having it there.
  [1] https://en.m.wikipedia.org/wiki/Stuxnet
  
  maweki a year ago
  
  It's undecidable whether anything gets imported at all. By Rice's theorem, every non-trivial semantic property of a program is undecidable.
  So the question of whether a certain line is reached is just as undecidable as the question of what values the arguments can have at that line.
  So it's undecidable whether a dynamic import statement is reached and what its import argument will be. And even if the value is static, it's undecidable whether the content of the imported module has just been changed.

ashishbijlani a year ago

Good to see this project here! Have you added support for permissions already? Would love to integrate this in https://github.com/ossillate-inc/packj

r9295 a year ago

Working on it!

gegtik a year ago

I've been using pip-compile from https://github.com/jazzband/pip-tools for this use case; a standard project Makefile defines "make update" which pip-compiles the current requirements, and "make install" installs the frozen requirements list.

This way I can install the same bill of materials every time.

r9295 a year ago

I think we have different motivations. pip-compile can only fetch and install dependencies which have been declared.
For example, let's say I have a malicious yaml parser package. It should not need requests as a dependency. The odds are that a project already has requests installed as a sub-dependency of another dependency. I can then just try to import requests in a try/except block and, if it's available, use it to fetch malicious artefacts, for example. Panoptisch would report this.
Also, the usage of operating system or builtin modules such as socket, sys or importlib is not something which is analyzed by pip-compile.
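
To illustrate the opportunistic-import pattern described above, here is a rough sketch that lists the top-level names a piece of source code imports, using the standard `ast` module. This is not Panoptisch's implementation; `SOURCE` and `imported_top_level_names` are hypothetical, and a purely static scan like this misses dynamically constructed imports, as noted earlier in the thread.

```python
import ast

# Hypothetical source of a "yaml parser" that opportunistically imports requests.
SOURCE = '''
import re
try:
    import requests
except ImportError:
    requests = None
from os import path
'''


def imported_top_level_names(source):
    """Collect the top-level module names named in import statements."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import, not a relative one.
            names.add(node.module.split(".")[0])
    return names


found = imported_top_level_names(SOURCE)
print(sorted(found))
```

Comparing `found` against what a package declares as dependencies would flag `requests` here, even though the import sits inside a try/except block.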