Show HN: 3LC – Illuminate the ML Black Box

pypi.org

8 points by Pendresen 11 days ago

Hello HackerNews,

I am Paul, and I would like to get some feedback on the tool we are releasing as beta today.

3LC is an ML tool that provides detailed insights, real-time, data-centric iterative workflows for training and finetuning, and data quality improvements for your machine learning datasets and models. It serves as a visualizer, editor, and debugger, focusing on how models learn from the training data.

Key Features of 3LC:

  • Detailed Data Analysis: 3LC enables users to dive into model performance beyond typical labeling errors. It offers the capability to analyze intricate false positives, track embedding-space dynamics, and perform interactive metrics analysis on a per-sample, per-epoch basis.

  • Real-Time Visualization: Users can visualize every data point immediately after training in our Dashboard: create interactive 2D/3D plots, filter outliers in real time, and find correlations in the recorded metrics. All results are visualized and linked to the underlying original data.

  • Interactive Data Editing: At its core, 3LC allows for on-the-fly modifications of training data. Users can, for example, adjust bounding boxes in image datasets or change sample weights based on their trajectories in embedding space, directly influencing subsequent training rounds.

  • Seamless Integration: 3LC can integrate into existing PyTorch training scripts without significantly changing your established workflow. It operates within any system setup, be it locally on your laptop, on-prem HPC, or at your favorite cloud provider.

  • Non-Intrusive Data Revisions: 3LC's data modifications are sparse and never duplicate or relocate your data. No upload of data to a SaaS solution!

We have tried to make 3LC as minimally intrusive as possible, enabling full data-centric workflows wherever you run your training or finetuning.
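
As a rough illustration of what sparse, non-duplicating revisions can look like in general, here is a minimal sketch. It is not 3LC's actual API or storage format; `RevisedTable` and its methods are hypothetical names, and the sample rows are made up for the example. Each revision keeps a reference to the untouched base rows plus a small dictionary of changed fields.

```python
class RevisedTable:
    """Hypothetical sparse revision: base rows plus a dict of per-row changes."""

    def __init__(self, base, edits=None):
        self._base = base          # original rows, never mutated or copied
        self._edits = edits or {}  # {row_index: {field: new_value}}

    def edit(self, index, **changes):
        """Return a new revision that records only the changed fields."""
        row_edits = {**self._edits.get(index, {}), **changes}
        return RevisedTable(self._base, {**self._edits, index: row_edits})

    def __getitem__(self, index):
        # Overlay the sparse edits on the base row at read time.
        return {**self._base[index], **self._edits.get(index, {})}

    def __len__(self):
        return len(self._base)


# Made-up sample data for illustration only.
base = [
    {"image": "img_0.jpg", "bbox": (10, 10, 50, 50), "weight": 1.0},
    {"image": "img_1.jpg", "bbox": (5, 20, 40, 60), "weight": 1.0},
]

v1 = RevisedTable(base)
v2 = v1.edit(1, bbox=(8, 22, 42, 58))  # adjust one bounding box
v3 = v2.edit(0, weight=0.5)            # down-weight one sample
```

Every revision stays readable, the base rows are left as they were, and the cost of a revision grows with the number of edits rather than the size of the dataset.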

Why This Matters:

We have designed 3LC to provide insights into machine learning workflows that are not often visible in traditional setups. The aim is, of course, to help produce more accurate ML models, but we have also seen users reduce both their model and training dataset sizes while improving accuracy.

This first release is aimed at the Computer Vision domain. However, we are hard at work on UI improvements and integrations to support LLM finetuning.

To integrate 3LC into your projects, you can start with a simple installation:

    pip install 3lc

or visit

    https://pypi.org/project/3lc

For further details and documentation, visit https://docs.3lc.ai

We welcome feedback from the HackerNews community to help us improve and develop 3LC further.

Thank you for your time!

luke-stanley 10 days ago

It says "Sign up for a free account" on the PyPI page. That's not the usual way developers go about using a Python module. You might think you have a really good reason that warrants this, but I think it is going to add a lot of friction - it's a big turn-off. It conflicts with the "local data" selling point.

  • Pendresen 7 days ago

    luke-stanley: We've listened to your feedback and have removed the signup process entirely. Thank you!

    pip install 3lc

    3lc service

    We host a set of demo projects, so you don't need to integrate with your training scripts to test it out :-)

    • luke-stanley 6 days ago

      That's great. I suspect the developer experience will be much better, and I hope it goes well! Balancing the pressure for commercial success against the needs of developers and open source is tricky, but 3LC looks interesting. I think running local servers to use powerful and easy-to-develop browser-based UIs is a great pattern. Good luck with this!

  • Pendresen 10 days ago

    Account management in 3LC is SaaS-based, while the data you operate on is always local to you. The beta is free, and 3LC will remain free for non-commercial use after the beta period. The sign-up process is minimally invasive, only requiring an email.

JustFinishedBSG 10 days ago

Please don't call it an "ML tool" if it only works with Deep Learning.

Nothing wrong with that, but it's annoying to read intros/docs only to reach the part where it becomes clear it's an NN-specific tool.

  • Pendresen 10 days ago

    Even though our initial examples and integrations focus on deep learning (where it is most used), you can register any data and metrics through the Python API and use the Dashboard. You don't have to run an NN training. We have users who use it without PyTorch or an NN, just to browse data, plot statistics/metrics, and make data edits. I agree that this should be much clearer, and thanks for the feedback!