Show HN: Tach – Visualize and untangle your Python codebase
Hey everyone! We're Evan and Caelean, the authors of Tach.
Tach gives you the tools to build a modular monolith in Python: visibility into the state of your dependencies, as well as the tools to fix them.
Since our last Show HN (https://news.ycombinator.com/item?id=41359181) we've shipped support for layers, third-party dependencies, visualizations, and more.
Tach is:
* Open source (MIT) and completely free
* Fast (written in Rust)
* In use by teams at NVIDIA, PostHog, and more
As your team and codebase grow, code gets tangled up. This hurts developer velocity and increases cognitive load for engineers.
Tach differs from existing systems that handle this problem (build systems, import linters, etc.) in its ability to be adopted incrementally and in its runtime speed.
If you struggle with dependencies, onboarding new engineers, or a massive codebase, Tach is for you! We built it with developers in mind - with clean integrations into Git, CI/CD, and IDEs, and the performance for it to be effective in any form factor.
When I tried it, it seemed like you really need to list all modules in `tach.toml`.
What I wanted was to work at a coarser package level. For example if you have the modules `foo.a`, `foo.b`, `bar.a`, and `bar.b`, I'd like a rule that `bar` can import from `foo` but not vice versa, without having to list or care about the submodules.
Is that something you'd want to support?
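To make the rule in the comment above concrete, here is a minimal Python sketch of a package-level check. It is not Tach's implementation or configuration format; `ALLOWED`, `top_level`, and `import_allowed` are hypothetical names used only for illustration.

```python
# Hypothetical illustration, not part of Tach: rules are keyed by
# top-level package, so submodules never need to be listed.
ALLOWED = {
    "bar": {"foo"},  # bar may import from foo
    "foo": set(),    # foo may not import from bar
}


def top_level(module: str) -> str:
    """Return the top-level package of a dotted module path."""
    return module.split(".", 1)[0]


def import_allowed(importer: str, imported: str) -> bool:
    """Check an import against the package-level rules above."""
    src, dst = top_level(importer), top_level(imported)
    if src == dst:
        return True  # imports within the same package are always fine
    return dst in ALLOWED.get(src, set())
```

With these rules, `import_allowed("bar.a", "foo.b")` is true and `import_allowed("foo.a", "bar.b")` is false, regardless of which submodules exist.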
Really excited to see this project gain traction.
> Note that this graph is generated remotely with the contents of your `tach.toml`
Isn't shipping off parts of your codebase to a 3rd party without warning in the CLI a security risk? In regulated environments you can be audited to confirm that your code was only stored on properly vetted services, which is why some sales cycles for AI coding assistant tools are so long. It would be kind of frustrating to have something like that happen and get set back on licensing, etc.
Just from the video, there doesn't seem to be any sort of warning that you are shipping config files to your servers, and the URL that you produced doesn't seem to have any authentication.
Maybe I am misunderstanding that functionality, but it gives me pause to use it.
Co-author here, fair question!
In short, we want to make the visualization UX as smooth as possible, and this is best done with a web app. The URLs use UUIDs, and the contents being sent don't include literal source code, only module names and Tach configuration. We will also delete graphs by UUID on request, and have done so in the past.
That said, we do try to be up-front about this, which is why that disclaimer exists, and when running this command on the CLI, you must supply an explicit `--web` argument to `tach show`. Otherwise, the default behavior is to generate a GraphViz DOT file locally.
Why not just let users run the web app locally? There's no reason it needs to be remote.
Also, the mere fact that it sends any data, no matter what you say it contains, is a non-starter at many places. And even module names can contain proprietary data.
I can understand the frustration, but I think there are legitimate reasons to run this remotely.
Tach is an installable Python package; shipping a full web app would have to come in a separate form factor and has significant maintenance implications. Given that we are explicit about the remote app before anything is sent, require explicit opt-in, and provide usable alternatives locally, we prioritize shipping a useful graph experience that is immediately usable.
If you are at an enterprise that cannot tolerate this, then you can use a local viewer with either GraphViz DOT or Mermaid output, generated with `tach show` or `tach show --mermaid` respectively.
Tools like this rub me the wrong way.
We have well established conventions like prefixing private modules and symbols with an underscore, or declaring your public interfaces in the __init__.py file, but the Python developer decries it as "busywork", "weird" and "hard to read", so we instead use tools like this.
We can manage dependencies with protocols, a type checker and generally following SOLID principles, but the Python developer decries it as "too indirect and convoluted", so we instead use tools like this.
This is more commentary on the Python developer than this tool. Tach looks great.
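For readers unfamiliar with the protocol-based approach mentioned in this comment, here is a minimal sketch using `typing.Protocol` from the standard library. The names `Notifier`, `EmailNotifier`, and `place_order` are hypothetical and chosen only for illustration.

```python
from typing import Protocol


class Notifier(Protocol):
    """The interface the ordering code depends on (hypothetical)."""

    def send(self, message: str) -> None: ...


class EmailNotifier:
    """Satisfies Notifier structurally; it never imports or subclasses it."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    def send(self, message: str) -> None:
        self.sent.append(message)


def place_order(item: str, notifier: Notifier) -> None:
    # This function knows only the protocol, not any concrete class,
    # so it carries no import dependency on the notification code.
    notifier.send(f"order placed: {item}")
```

A type checker such as mypy verifies that whatever is passed to `place_order` has a compatible `send` method, which is how the dependency is kept pointing at an interface rather than at another module's internals.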
Co-author here, I can understand where you're coming from!
Part of the philosophy here is that the tools and techniques you're describing can (and should) be used diligently to solve this problem, and Tach is often a complement to this approach.
The benefit of centralizing the concern into a single tool, and often a single config file, is that teams get better documentation, earlier feedback (in-editor vs. code review), and more visibility when planning new development. Teams also get to choose _how_ they would like to satisfy Tach's config, and other teams can still rely on the same guarantees due to Tach's static checks.
> We have well established conventions like prefixing private modules and symbols with an underscore, or declaring your public interfaces in the __init__.py file,
The language doesn't enforce them, so they may as well not exist. See: python dependency management.
> This is more commentary on the Python developer than this tool.
100%. Python has become an unstructured Wild West, perhaps even worse than modern JavaScript. The "Zen of Python" is a bald-faced lie.
Python has incredible use cases. It blends together different disciplines effectively. But perhaps we should ask ourselves whether or not it's a language suitable for writing large monoliths in.
> The language doesn't enforce them, so they may as well not exist. See: python dependency management.
I see your point. You can enforce them with mypy by declaring your exports in your __init__.py file, using the `as` aliasing method or using `__all__`: https://mypy.readthedocs.io/en/stable/command_line.html#cmdo....
The conventions are widely used and Python is used successfully in numerous “large monoliths”. Saying that the conventions may as well not exist if they’re not enforced is demonstrably nonsense.
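The export convention described in this comment can be sketched as follows. This is an illustration under stated assumptions: the package name `demo_shapes_pkg` and its functions are hypothetical, and the package is written to a temporary directory only so the example runs on its own. The runtime part shows `__all__` limiting what `from ... import *` exposes. The static part is checked by mypy's `--no-implicit-reexport` flag (also enabled by `--strict`), under which only names in `__all__` or imported with the `as` alias form count as exported.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical name and contents).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_shapes_pkg"
pkg.mkdir()
(pkg / "_impl.py").write_text(
    "def area(w, h):\n"
    "    return w * h\n"
    "\n"
    "def _validate(w, h):\n"
    "    return w >= 0 and h >= 0\n"
)
(pkg / "__init__.py").write_text(
    "from ._impl import area as area  # explicit re-export via `as` alias\n"
    "from ._impl import _validate  # internal use only, not re-exported\n"
    "__all__ = ['area']  # the declared public interface\n"
)

sys.path.insert(0, str(root))
shapes = importlib.import_module("demo_shapes_pkg")

# A star import honours __all__, so only `area` comes through.
namespace = {}
exec("from demo_shapes_pkg import *", namespace)
public_names = sorted(n for n in namespace if not n.startswith("__"))
```

Here `public_names` contains only `area`. Python itself will still let a caller reach `demo_shapes_pkg._validate` directly, which is the gap a type checker or a tool like Tach is meant to close.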
This looks nice! I vaguely know Grimp as a similar tool, any idea how they differ/compare?
Thanks! You can think of Grimp as a lower-level tool for interacting with the import graph in Python, while Tach is a high-level tool responsible for 'modularity' as a whole (e.g. modules, interfaces, layers, deprecations etc.)
Tach is also more opinionated - so it doesn't require you to write any custom code, and uses declarative config to enforce your desired architecture.
Having the example be a video that changes was confusing at first, and if you are going to show me something that is changing, I would like to be able to rewind to the beginning. But really I just think it's a bad idea to show something like that without making it obvious what it is.
Thanks for this feedback! Here's a more structured walkthrough: https://www.loom.com/share/7c03d72ea2b54212a3509d4333f61b99?...
We'll add this to the README.
Cool! Do you have plans to launch a paid offering? The website makes me think it's a company but I didn't see any pricing / sales details.
Co-author here - We do provide a web platform (https://www.gauge.sh/platform) which we have been developing with design partners. The fundamental difference between using Tach alone vs. the platform is that the platform provides incremental enforcement at the pull request level.
We're always happy to chat about adding more design partners! email: founders@gauge.sh
This is pretty cool, thanks for sharing