This doesn't look like a typical round raised to fund operations. Instead:
1. Liquidity: Early investors can sell to late-stage investors, since the company hasn't IPO'd. Their previous round looked like that.
2. Markup: The previous investors can increase their valuation by doing another round. It also provides a paper valuation for acquiring new companies. That, combined with preferred stock (they always get 1x back first), might be appealing and make some investors more generous on valuation.
So if I understand correctly, investors are not really investing for the company's results, but more on the hope that people will continue to invest in the company?
A Ponzi scheme is the extreme case, where the underlying asset is worthless.
Databricks is a fast-growing company with ~$4B in annualised revenue and huge potential.
Many rounds include a portion earmarked for liquidity. Similarly, markup strategies are common and valid. For existing investors, it works because they have already done research on the company and believe in it, so they put their money where their mouth is. For the company, it may speed up the fundraising process.
> So if I understand well, investors are not really investing for the company results
I don't know where you got that idea. Investors are putting their money into this company because they like the results and believe it's a better investment than their alternatives.
Any time you sell shares you generate some signal about what a company is worth. You can claim the company is worth $100B all day long, but until you can sell a significant number of fractional shares at that valuation, it's just talk.
> In a kind of a ... ponzi pyramid?
A ponzi scheme or pyramid scheme implies that the company is lying about their results and books. Classic ponzi schemes might not have any real assets at all. The operators lie about the company and rely on incoming cash from new investors to pay out claims from past investors.
There's no ponzi here unless you believe Databricks is completely falsifying their operations and results. If any of those investors took their shares to the secondary market there would be plenty of other investors interested in buying them because they represent shares in the real company.
Not a Ponzi, but definitely some markup and valuation engineering.
Let’s say that Databricks has 100B valuation (just for the sake of simplicity).
They do this round, and thanks to this markup they can do acquisitions via stock swaps. For instance, say you're Neon, and as a founder you want some sort of exit.
It's preferable to get acquired for $1B, with, say, $100M in cash and $900M in Databricks shares at paper valuation, than to wait out a long IPO process.
If the mothership company (Databricks) goes public, you get liquidity and a good payday; in the meantime, you can sell secondaries at a discount.
Man, I am not kidding, at least this company has some returns. Yes, to me it's also definitely risky, but it still has some decent intrinsic value compared to companies whose sole objective is to trade within (MNC's should be illegal).
Also, maybe I just want to talk about it, but whenever I hear "ponzi pyramid" I think about cryptocoins like Bitcoin, and then remember the people paying $2 to buy $1 worth of BTC in American institutional markets.
My rant about crypto is unwarranted but I want to still share it. Stablecoins are really really cool but any native coins/tokens are literally ponzi pyramids / scams.
I had no idea how preferred shares actually worked, so I went down a rabbit hole looking it up. That "always get 1x back" thing you mentioned is called a liquidation preference, which means preferred shareholders get their money back first before anyone else sees a dime.
Turns out there are different flavors too. "Non-participating" means preferred gets their original investment back, then common stock splits whatever's left. "Participating" means preferred gets their money back AND also gets to participate in splitting the leftovers with common shareholders. No wonder investors are willing to pay up for these late-stage rounds when they've got that safety net.
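To make the difference between the two flavors concrete, here's a small Python sketch of a 1x liquidation preference. All the numbers are hypothetical (a $100M investment for 20% of the company); none come from the thread.

```python
def payout(exit_value, invested=100, pct=0.20, participating=False):
    """Preferred investor's payout (in $M) at a given exit value (in $M)."""
    if participating:
        # 1x participating: get the investment back, then share the rest pro rata.
        return invested + pct * max(exit_value - invested, 0)
    # 1x non-participating: take the better of the 1x preference
    # or converting to common and taking the pro-rata share.
    return max(min(invested, exit_value), pct * exit_value)

# Downside case: a $150M exit still returns the full $100M preference,
# even though 20% of $150M is only $30M.
print(payout(150))
# Upside case: at a $1B exit, converting to common (20% = $200M) beats the preference.
print(payout(1000))
# Participating preferred does even better: $100M back plus 20% of the remaining $900M.
print(payout(1000, participating=True))
```

That downside case is the "safety net" the comment above describes: at a $100B valuation, a late investor with a 1x preference is protected even if the company later exits well below that mark.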
From my perspective, Databricks definitely seems like a bad investment.
Databricks is great at offering a "distributed Spark/Kubernetes in a box" platform. But its AI integration is one of the least helpful I've experienced. It's very disruptive to a workflow and very rarely offers genuinely useful help. Most users I've seen turn it off, something Databricks must be aware of, because they require admin permission for users to opt out of AI.
I don't mean to rant, there's lots that is useful in databricks, but it doesn't seem like this funding round is targeting any of that.
i don't think it's possible to raise at $100 billion without name-dropping AI in every sentence of every meeting you have with a potential investor....
what is the investor thesis for coming in with such a multiple?
You know they will have to find a greater fool, possibly the public, to buy at an even higher multiple to make any profit on that....
This isn't really venture investing at this point. The valuation risk calculation is very different for preferred shares than common stock, and with a healthy ARR they have very little risk (maybe not much profit, but it's not that different than a bond on some level...).
My company is heavily invested in Databricks, and let me tell you, it sucks. 5 minutes to spin up a job that needs to run for 10 seconds is a terrible way to spend one's time and money.
use serverless then. it's literally the simplest solution. what sort of poor decisions is your team making? it needs to run for 10 seconds but you're still spinning up a cluster?
Unfortunately, what I see is companies, especially smaller companies that originally got into Databricks because they hired people with Databricks/Spark experience, trying to get away from the platform because it is too expensive -- and at that kind of money it is just easier to use Snowflake.
Yeah but looks like it is more "managed" and analysts especially prefer writing SQL over Python.
Honestly, as a Data engineer on the DWH side, I figured that my career is going to come to an end in a few years. AI + Cloud managed DWH are going to make all technical issues trivial, and I'm not someone who is interested in business context. Not sure where to move though.
I'm really surprised to hear this. If anything, I'd expect that every company transitioning from
> "we want to store/retrieve thin event logs and clickstreams"
to
> "we need to store/retrieve/join thick prose from customer interactions/reviews at every layer of the stack to give our LLMs the right context"
would create a significant need for data engineering for bespoke/enterprise/retail-monster use cases. (And data analysis too, until LLMs get better at tabular data.)
Are you seeing that this transformation need is actually being sufficiently covered by cloud providers, on the ground?
Or that people aren't seeing the problem this way, and are just doing prompt engineering with minimal RAG on static help-center datasets? That seems suboptimal, to say the least.
That's what I think too, but meh I'm not super interested in that I guess, as it definitely rings the death bell. But I agree that's definitely the future. And people are going to be trained to accommodate AI instead of the other way around -- it's much easier!
What's the obvious rationale for going through the whole alphabet of funding rounds, instead of going public / IPO after «the usual» number of raises?
Wouldn’t the current strategy result in some serious stock dilution for the early investors?
Investors put $10 billion in during a previous round; that's a lot. Somehow, more is needed now. $100M is just 1% of that, so it's not going to massively move the needle. But it does raise the question of where all that cash is going.
My guess is that they might be about to embark on a shopping spree and acquire some more VC-backed companies. They've actually bought quite a few companies already in the past few years, and they would need cash to buy more. The company itself seems healthy and generating revenue, so it shouldn't strictly need a lot of extra capital. Acquisitions would be the exception. You can do those via share swaps or cash, and of course the cash would mostly go to the VCs backing the acquired companies. Which is an interesting way to liquidate investments. I would not be surprised to learn that there's a large overlap between the VCs of those companies and those backing Databricks. $100M on top of $10B sounds like somebody wants in on that action.
As a financial construction it's a bit shady of course. VCs are using money from big institutional investors to artificially inflate one of their companies so that it can create exits for some of their other investments via acquisitions financed with more investment. It creates a steady stream of "successes". But it sounds a bit like a pyramid game. At some point the big company will have to deliver some value. I assume the hope is some gigantic IPO here to offload the whole construction to the stock market.
In some ways, the dynamics of these new private venture markets are starting to resemble the art market. I remember I used to follow several private auctions where most of the participants formed a ring of sorts, and from time to time someone needed liquidity.
Even in situations where some artworks would fetch far less at the public auction houses (Christie's, Phillips, Sotheby's), their preference was to trade within this circuit of private auctions.
Right, I'm curious how long many of these deca/centa-unicorn startups can make payroll and pay their cloud bills if all of this AI-FOMO unlimited-exit-liquidity VC investment takes even a pause.
Stock dilution doesn't work like that. If a seed investor invests for 5% at a $10M valuation, and the company goes 10x (i.e. a valuation of $100M), and the company now raises a $100M Series K, the Series K investor owns 50% of the company and the seed investor gets diluted down to 2.5%. However, the company's new valuation is $200M including the cash the new investor brought in, so the seed investor's stake is worth the same.
It's a smaller piece of a bigger pie.
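The arithmetic above can be sketched in a few lines of Python, using the same hypothetical numbers as the comment (5% seed stake, $10M seed valuation, 10x growth, $100M Series K):

```python
# All dollar amounts in $M.
seed_pct = 0.05          # seed investor bought 5% at a $10M valuation
pre_money = 100.0        # company later worth $100M (10x)
raise_amt = 100.0        # Series K raises $100M at that valuation

post_money = pre_money + raise_amt                  # $200M after the round
new_investor_pct = raise_amt / post_money           # new money owns 50%
seed_pct_after = seed_pct * (1 - new_investor_pct)  # seed diluted to 2.5%

stake_before = seed_pct * pre_money        # value of the seed stake before the round
stake_after = seed_pct_after * post_money  # value of the seed stake after the round
# stake_before == stake_after: a smaller piece of a bigger pie.
```

The seed stake is worth $5M either way; dilution only hurts if the round is raised at a valuation below the investor's implied mark.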
To answer your question, the right question to ask is why go public when you can remain private? Public means more paperwork, more legalese, more scrutiny, and less control for the founder, and all of that only to get a bit more liquidity for your stock. If you can remain private, there really isn't much of a reason to not do that.
An IPO means selling to a whole bunch of people, whereas fundraising rounds pre-IPO mean courting a small number of large investors. I think it's partly a sign of the times that there's enough concentrated capital that you can get enough money from private hands to not need the IPO route yet.
The private market is getting out of hand, then. I think it makes sense for private companies beyond a certain size to have the same reporting requirements that listed ones do. At these valuations the private market for startups is becoming systemically important.
To some degree, they do -- under SEC rules (Exchange Act §12(g)), private companies with >$10M in assets and 2,000+ shareholders (or 500+ non-accredited investors) have to start public-style reporting.
I assume there's some clever accounting to ensure they're not at the 2,000 shareholder cap (perhaps double-trigger RSUs don't count as being a shareholder yet?)
This heavily depends on share classes and preferences. Surely the new investor wants better terms. The issue isn't so much dilution as preference: the added risk of never getting a payout at all.
Both have benefits. Staying private means far fewer distractions, less investor scrutiny (good and bad), and the general ability to do whatever you want (good and bad).
It's a lot easier to stay long-term focused without investors breathing down your neck. As a private company you're not dealing with shortsellers, retail memers, institutional capital that wants good earnings now, etc..
Of course, the bad side is that if the company gets mismanaged, there's far less accountability and thus it could continue until it's too late. In the public markets it's far easier to oust the C-suite if things go south.
It's a shame that the trend of staying private longer means retail gets shut out from companies like this.
An order of magnitude less scrutiny, but also an order of magnitude smaller investor base. The private markets trade at Palantir levels, so why go public? Also, the private markets now routinely do secondary transactions, so there's even less reason to go public.
it's funny how we're letting private companies get away with made-up numbers. Rather than making IPOs easier, owning a private company above a certain valuation should come with at least an obligation for GAAP accounting, independent audits, etc.
This is really for the greater good -- so what if we see 2-5 years of a beautiful AI bubble if it's going to come crashing down again? It's lawmakers' and regulators' role to smooth out and dampen the markets' natural tendency to go bubbly.
In India, Zomato[0] (now listed) and Swiggy[1] both had a Series K. SpaceX has only gotten to a Series J, but they've done some secondary sales since. Apparently, Palantir[2] has had a Series K as well, back in 2015.
If you Google "Series K investment" basically all the hits are about this. Same applies for J and I - you have to get back to H before you start seeing anyone else.
Why Databricks would do this (rather than IPO) is obvious. When you can raise privately, it’s way easier than IPO. The real question to me is why the investors (new and previous) are going along with it?
You'd think previous investors would want some actual liquidity though at some point. The early investors have had plenty of chances now but surely not everyone's been able to cash out. But hey, they have lots of funny money now I guess?
Looks like someone is thinking “hey let’s wave our hands in the air and talk about AI and someone will write us a cheque!” as a way to kick the can down the road that this far into it they’re still not selling a product that’s making money. Looks a bit desperate TBH.
A lot of usage has moved to Snowflake. I know Snowflake cannot do everything Spark does, but a huge number of Spark workloads can be moved to Snowflake (which has a superior UX).
Depends on how you define cheaper - you could set up Apache Iceberg, Spark, MLFlow, AirFlow, JupyterLab, etc and create an abomination that sort of looks like Databricks if you squint, but then you have to deal with set up, maintenance, support, etc.
Computationally speaking - again depends on what your company does - Collect a lot of data? You need a lot of storage.
Train ML Models, you will need GPUs - and you need to think about how to utilise those GPUs.
Or...you could pay databricks, log in and start working.
I worked at a company that tried to roll their own, and they wasted about a year doing it, and it was flaky as hell and fell apart. Self-hosting makes sense if you have the people to manage it, but the vast majority of medium-sized companies will have engineers who think they can manage this, try it, fail, and move on to another company.
Don't worry, most places go straight with databricks and get a flaky as hell system that falls apart anyway, but then they can blame databricks instead of their own incompetence.
I'm surprised at how often this is reality.
Bureaucrat at the top of the decision tree smiles smugly while describing how easily they're accomplishing <goal> with <system>. I've been that bureaucrat too many times.
yeah where IT blocks half of the config, and you disable half of the features that could make it great, just to make sure they definitely don't give control to..GASP... A DATA ENGINEER
I don't think there is anything out there that really bundles everything exactly like databricks does.
There are better storage solutions, better compute, and better AI/ML platforms, but once you start with Databricks you dig yourself into a hole, because replacing it is hard: it has such a specific subset of features across multiple domains.
In our multinational environment, we have a few companies that are on different tech stacks (result of M&A). I can say Snowflake can do a lot of the things Databricks does, but not everything. Teradata is also great and somehow not gaining a lot of traction. But they are near impossible to get into as a startup, which does not attract new talent to give it a go.
On the ML side, Dataiku and Datarobot are great.
Tools like Talend, snaplogic, fivetran are also really good at replacing parts of databricks.
So you see, there are better alternatives for sure, cheaper at the same time too, but there is no drop-in replacement I can think of
Exactly this. But you don't really want to bundle straight away -- think about the exact problem you have and then solve exactly that problem. After you've solved a few problems like this, consider whether a bundled platform is useful.
Maybe I wasn't super clear. Wasn't looking for a 1:1 replacement.
Trying to understand what other options are out there for small teams / projects that don't need all those enterprise features that Databricks offers (governance etc).
It's been mentioned, but I want to add that the original idea of the post (mid-size VPS hosting Apache Spark) might be missing that Spark is ideal for distributed and resilient work (if a node fails, the framework can avoid losing that work).
If you don't need these features, especially the distributed one, going tall (a single high-capacity instance, replicated when necessary) or going simpler (multiple servers, but without Spark coordinating the work) could be good options, depending on your/the team's knowledge.
Exasol costs us a fraction of what we used to pay for Databricks, and that is even with us serving far more users than we used to do (from a data size perspective we are not at the petabytes scale yet, but getting there).
My little Databricks story: we set up hosted model inference for an in-house model. Worked great for several months!
But then they did maintenance and broke the entire feature. Reconfiguring everything from scratch didn't work. A key part where a Docker image is selected was replaced with a hard-coded value including a long system path (and employee name -- verified via LinkedIn).
Because of constant turnover in account reps we couldn't get any help there. General support was of no use. We finally got acknowledgement of the issue when we got yet another new account rep, but all they did was push us towards paid support.
We exhaustively investigated the issue and it was clearly the case that nothing could be done on our end to resolve it. The entire underlying compute layer was busted.
Eventually they released a newer version of the feature which did work again, but at this point it has become impossible to justify the cost of the platform and we're 100% off.
Good luck to them, but from my experience the business fundamentals are misaligned and it's not a company I hope to ever work with again.
Just finished ripping out Databricks at one of my clients, and have several more queued up. Folks can't wait to get as far away as they can, and as fast as they can from any of their offerings. Poor performance, bad product, bad UX: hard to get even decent logs out of the damn thing, and it's incredibly overpriced.
They told a good story and had a good sales team, but the writing is on the wall for them.
Databricks on azure is huge. I've heard that in some Azure regions, over 70% of the compute usage is just Databricks. So there is definitely an incentive for MS to acquire them.
Except that Microsoft looks better if you have the illusion of choice. Azure Databricks or Azure Databricks but you have to build it yourself out of janky azure services
I always struggled to understand how you get a company to adopt a platform like Databricks to «manage data». Isn't managing data a minefield, with plenty of open source pieces of software that serve different purposes? Who is the typical Databricks customer?
I think that's the main offering of Databricks: you get a "data platform in a box", and navigating the forest of piecemeal solutions is replaced with telling your data science and analytics teams to "use Databricks".
It's easy to look on knowing lots about data tools and say "this could be better done with open source tools for a fraction of the cost", but if you're not a big tech company, hiring a team to manage your data platform for 5 analysts is probably a lot more expensive than just buying databricks.
We have a large postgres server running on a dedicated server that handles millions of users, billions of record updates and inserts per day, and when I want to run an analysis I just open up psql. I wrote some dashboards and alerting in python that took a few hours to spin up. If we ever ran into load issues, we'd just set up some basic replication. It's all very simple and can easily scale further.
Sounds like you have the benefit of a nicely designed server and good practices. A lot of companies aren't the same.
Imagine you're a big company with loads of teams/departments multiple different types of SQL servers for data reporting, plus some parquet datalakes, and hey, just for fun why not a bunch of csvs.
Getting data from all these locations becomes a full time job, so at some point someone wants some tool/ui that lets data analysts log into a single thing, and get the experience that you currently have with one postgres server.
I think it's not a problem of scale in the CS sense, more the business sense where big organisations become complex and disorganised and need abstractions on top to make them workable.
We have Databricks at my company: $50M ARR, 150 employees, still growing 15% YoY, with 0 full-time data engineers (1 data scientist + 1 DB admin co-manage everything on there part-time; they have their own full-time roles). We have data from roughly 100 transactional database tables, Zendesk, logs of every API call, every single event from every user in our mobile and web applications, banking data, calendar data, Google Play Store data, and Apple App Store data, all in one place. We are a 2-sided marketplace; we can easily get a 360-degree view of our B2B customers and B2C customers, and measure employee productivity across all departments. It's that deep data understanding of our customers that powers our growth.
My team of 3 data scientists is able to support a culture of experimentation and data-informed decision making across the entire org.
And we do all that on a $30k annual Databricks spend. That's less than 1/5 the cost of 1 software engineer. Excellent value for money if you ask me.
I really struggle to imagine doing that any cheaper. How else could we engineer a data hub for all of our data, manage appropriate access and permissions, run complex calculations in seconds (yes, we have replaced overnight complex calculations done by engineering teams), and join data from so many disparate sources, at a total cost (tool + labor) under $80k/yr? I double dare you to suggest or find me a cheaper option for our use case.
you kill off all open source pieces, in turn compliance is happy, and a CTO is happy because he has a maintenance contract and can blame other people if stuff goes wrong.
It's a way to get those pesky Python people to shut up
Oh, and a CTO is always valued more if he manages a $5 million Databricks budget, where he can prove his worth by showing a 5% discount he negotiated very well, than a $1 million whatever-else budget that would be best in class. Everybody wins.
My company is doing the dbx thing, and the best I can tell my manager is that I'm neutral on it.
My working theory is that the UI, a low-grade web-based SQL editor and catalog browser, is more integrated than the hodgepodge of tools we were using before, and people may gain something from that. I've seen similar with in-house tools that collect ad-hoc/reporting/ETL into one app, and one should never underestimate the value that people give to the UI.
But we give up price-performance; the only way it can work is if we shrink the workload. So it's a cleanup of stale pipelines combined with a migration. Chaos in other words.
I think the governance stuff might push it over the top for a lot of organisations; it's pretty well integrated with IAM providers not only for structured/modelled data but also workspaces for the data sciencey stuff. Pretty much everything has permissions associated with it. When you have a big data engineering/science push off the back of the AI hype I think it appeals to the cheque writers to have something centralised and controlled.
Aside from that I do get the feeling that most small and medium sized companies have been oversold on it - they don't really have enough data to leverage a lot of the features and they don't really have the skill a lot of the time to avoid shooting themselves in the foot. It's possible for a reporting analyst upskilling to learn the programming skill to not create a tangled web of christmas lights but not probable in most situations. There seems to be a whole cottage industry of consultancies now that purport to get you up and running with limited actual success.
At least it's an incentive for companies to get their data in order and standardise on one place and a set of processes.
In terms of actual development the notebook IDE feels like big old turd to use tho and it feels slow in general if you're at all used to local dev. People do kinda like these web based tools tho. Can't trust people all the time! There's VS code and PyCharm extensions but my team work mainly with notebooks at the moment for good or ill and the experience there is absolute flaky dogshit.
I think it's possible to make some good stuff with it and it's paying my bills at the moment, but I think a lot of the adoption may be doomed to failure lol
Since this year, employees are vesting RSUs (not options, and with no expiry date) quarterly; a portion is sold automatically to pay taxes to the government at each vesting event, as the expiry date no longer exists. For liquidity there are tenders where employees sell their stock privately, so employees no longer need an IPO to cash out.
Just to clarify: for many years employees were getting RSUs, not options, just with an expiration date attached -- which is gone as of this year.
Options can be a significant portion of a sign-on bonus, but they typically vest over several years, so I guess they are hoping for an IPO eventually. IMHO Databricks will be overtaken by "events", including AI disillusionment, broader open source tools, and broader education across the workforce. So the eventual IPO will not happen.
Depends. Some options only vest in the case of an "exit event", i.e. an acquisition or an IPO. At this point I would assume such options are borderline worthless.
Yeah I think this is how it usually works, and yeah at $100bn valuation they are now 100% worthless, because investors get paid first, and there's no way they'll get sold or IPO for more than $100bn.
> Yeah I think this is how it usually works, and yeah at $100bn valuation they are now 100% worthless, because investors get paid first, and there's no way they'll get sold or IPO for more than $100bn.
Not quite right? The raise-implied valuation doesn't account for preferences. The IPO could be for $50bn and the latest investors could still do well, given a preference stack where the last money in is the first money out.
This curvature of spacetime is caused by the mass of the AI bubble.
While many comments focus on the "K" letter, I wanted to remind us all that OpenAI stretched their Series E from Jan 23, 2023 to Nov 22, 2024 -- 23 months, squeezing in 6 rounds.
Their product looks like basic wrappers for managing postgres instances and dashboards. Why would anyone with even minimal technical expertise pay for a generic service like that?
A lot of people purchasing their products have a vague understanding of the problem they're trying to solve and an even worse grasp of how dbx solves it for them. I'm living this first hand.
We have Databricks at my company: $50M ARR, 150 employees, with 0 full-time data engineers (1 data scientist + 1 DB admin manage everything on there as part-time jobs). We have data from roughly 100 transactional database tables, Zendesk, all our logs, every single event from every user in our mobile application, and banking data, all in one place. We are a 2-sided marketplace; we can easily get a 360-degree view of our B2B customers and B2C customers, and measure employee productivity.
My team of 3 data scientists is able to support a culture of experimentation and data-informed decision making across the entire org, and we are still growing 15% YoY.
And we do all that on a $30k annual Databricks spend. That's less than 1/5 the cost of 1 software engineer. Excellent value for money if you ask me.
I struggle to imagine how else we could engineer a hub for all of our data and manage permissions appropriately at lower tooling and engineering cost.
no we don't. Our plan includes some support, but we honestly haven't needed it.
We are also aggressive about sizing compute resources to the task, and foregoing some of the more costly "easier serverless options" that databricks provides. Their serverless SQL though is excellent value for money.
Regardless of the product and idea they had, a company that is 15 years old and raised 10+ billion dollars still needing to raise money after all this time is ridiculous.
Not being sustainable after all this time and billions of dollars is a sign the company is just burning money, and a lot of it. WeWork vibes.
They were expecting to be cash flow positive in Jan 2025, according to [0]. That said, it is hard to tell if they actually became cash flow positive since with them still being a private company, they aren't required to release that information.
Whenever companies release glowing fluff PR about their amazing financials they key word in there is “non-GAAP.”
i.e. when we exclude a bunch of pesky costs and other expenses that are the reason we’re not doing so well, we’re actually doing really well!
Non-GAAP has its place, but if used to say the company is doing well (vs like actual accounting) that’s usually not a good sign. Real healthy companies don’t need to hide behind non-GAAP.
Yes but free cash flow is free cash flow, and that's what matters for survival (i.e. run-rate). So long as fcf is positive, you'll never go bankrupt.
Really what they don't tell you is how much SBC they have. That's what crushes public tech stocks so much. They'll have nice fcf, but when you look under the hood you realize they're diluting you by 5% every year. Take a look at MongoDB (picked one randomly). It went public in 2016 with 48.9m shares outstanding. Today, it has 81.7m shares outstanding. 67% dilution in 9 years.
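A quick sketch of the dilution arithmetic, using the MongoDB share counts quoted above (figures as stated in this comment, not independently verified):

```python
# Total and annualised share dilution from the numbers above.
shares_at_ipo = 48.9e6   # shares outstanding at the 2016 IPO
shares_today = 81.7e6    # shares outstanding today
years = 9

total_dilution = shares_today / shares_at_ipo - 1                # ~67%
annual_dilution = (shares_today / shares_at_ipo) ** (1 / years) - 1

print(f"total: {total_dilution:.0%}, per year: {annual_dilution:.1%}")
# -> total: 67%, per year: 5.9%
```

So the "~5% every year" quoted above is roughly the annualised rate behind that 67% total.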
This. To me if you are still unprofitable after 15 years you are not really a business.
However, I'm genuinely curious about the thesis applied by the VCs/funds that invest in such a late-stage round. Is it simply that they are taking a chance that they won't be the last person holding the potato? Like they will get out in a Series L or M round, or the company may IPO by then, and either way they will make a small return? Or is the calculus different?
The last person in usually gets the best deal, in that they can get preference and push everyone else (previous investors, founders, and employees) down. If things go south, they get their money out before anyone else.
Isn't everyone "the last" at the moment they take part in the round? If someone thinks they're going to get preferential treatment in Series C or D, and then someone comes in at Series E with preferential treatment, then what?
Why don't early investors put clauses in their investment to protect themselves against being screwed over by later investors? It seems like an obvious thing to ask for if you're giving someone a lot of money, so I'm assuming there must be a very good reason it's not done.
Early investors (the main ones at least) usually get pro-rata rights, which means you can invest in later rounds to maintain your ownership percentage (i.e. a later round dilutes your ownership, so you invest a bit more until your ownership stays the same).
But the pref stack always favors later investors, partly because that's just the way it's always been, and if you try to change that now no one will take your money, and later investors will not want to invest in a company unless they get the senior liquidity pref.
> However genuinely curious about the thesis applied by the VC’s/Funds that invest in such a late stage round
1) It's evaluated as any other deal. If you model out a good return quantitatively/qualitatively, then you do the deal. Doesn't really matter how far along it is.
2) Large private funds have far fewer opportunities to deploy because of the scale. If you have a $10B fund, you'd need to fund 2,000 seed companies (at a generous $5m on $25m cap). Obviously that's not scalable and too diversified. With this Databricks round, you can invest a few billion in one go, which solves both problems.
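The deployment arithmetic in point 2, spelled out (the $5m seed check is the comment's hypothetical; the size of a single late-stage check is my assumption):

```python
fund_size = 10e9        # hypothetical $10B fund
seed_check = 5e6        # a generous $5m seed check (on a $25m cap)
seed_deals_needed = fund_size / seed_check        # 2,000 seed deals to deploy it all

late_stage_check = 2e9  # assumed size of one multi-billion late-stage check
late_deals_needed = fund_size / late_stage_check  # just 5 deals of this size
```

Two thousand seed checks versus a handful of late-stage ones is the whole scaling argument.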
This! We did some simple testing on their platform to integrate it into our product for a customer. In a few days of light work we rang up a huge bill, many multiples of what we spend on OpenAI, which gets heavy use.
That may be, but our use of DB was 1/1000 of what we do in a month with OpenAI, and the bill we racked up was $3,000 in one day. We talked with them, and because we freaked out and deleted the widget (whatever the connectors are called), they didn't have logs for what we did, so they couldn't refund anything (they were willing). The fact that they couldn't find anything because we deleted whatever it was, that was weird, because they could certainly bill us. We're never using them again.
I worked at a place once where the CEO basically said that it's a lot easier to raise money when you don't need it than to raise it when you do. The US economy is looking pretty weird with a bunch of conflicting predictors. Maybe they're buffering for a recession.
It's always true, whether you are a startup or an individual. People throw money at you when you least need it, but when you do need it, they give you all kinds of hassle.
Depends on who is making the decision and how exactly the funding round is structured; for some investors, diluting other shareholders is actually a good thing. For existing employees, getting an option to partially cash out now is probably better than waiting indefinitely for an IPO, etc.
At that time the Palantir valuation was considered 'hefty / overpriced' at $9B.
Its current post-IPO valuation is completely detached from fundamentals at ~$378B.
If you were to apply the same 42x ratio to Databricks, it would have to trade at around $4.2 trillion, more than the entire market cap of Apple, with plenty to spare. A completely rational market if you ask me.
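For reference, the ratio arithmetic behind this comparison (valuations as quoted in this comment):

```python
palantir_then = 9e9      # the 'hefty' private valuation
palantir_now = 378e9     # post-IPO market cap quoted above
ratio = palantir_now / palantir_then      # 42x

databricks_valuation = 100e9
implied = databricks_valuation * ratio    # what the same multiple would imply
print(f"{ratio:.0f}x -> ${implied / 1e12:.1f}T")
# -> 42x -> $4.2T
```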
It really is the most expensive I've ever come across. It would be a flat-out no-go if it weren't for Microsoft pushing everyone onto this platform, supported by their network of absolutely neutral Gartner friends and Deloitte/KPMG/Accenture/TCS "experts" who recommend whatever lines their pockets.
Same story with Splunk. Yet it was acquired by Cisco for $28 billion.
The valuations of Silicon Valley companies, and their ability to burn cash for 10+ years, never cease to amaze me.
Just curious, wouldn't it get harder for companies like Databricks or Clickhouse to compete against AWS in the long run? They have better products, for sure. Yet over time, the product gap between what they offer and what AWS offers will narrow, and as a result the cost will be what matters most to the customers. And how can they compete on cost given that they run on AWS?
AWS is always going to be the lego-bricks, choose-your-own-adventure, assemble-your-own-stuff option for people who want to struggle with IAM policies. Plus AWS VPs compete against each other, so there is no concept of one best, opinionated way of doing things, just a bunch of random options with tradeoffs. Might attract some nerds; nobody else.
Theoretically, yes, if AWS were really focused on it, they could probably deliver something like Databricks; all the components are off the shelf, and a significant number of Databricks clusters are on AWS anyway. The question, though, is why: Databricks is already driving a lot of traffic to AWS and managing all the end-customer stuff. The benefit of killing Databricks is less than letting it live and grow and buy more from AWS.
Good point. I had assumed that AWS EMR and Redshift had incentives to compete with Databricks. Another assumption was that someone at AWS would eventually be ambitious enough to add offerings similar to Databricks', like how AWS added MSK and OpenSearch. Both assumptions may well be wrong, though.
Who the fuck wants to do Sarbanes-Oxley? SOX killed IPOs. The private market is quite liquid. Why attract activists and losers with an agenda to your company?
I'm as skeptical as anyone, but have you ever heard of companies like Oracle, which got rich off a database, or Snowflake (current market cap $65B)? Companies pay oodles of money for those capabilities.
Because it's recommended by nearly all consultants and by Microsoft.
Simple as that: it's consulting heaven. Much like SAS and SAP. Everybody's happy.
Now, to be fair to Databricks: if it's used properly and you ignore the cost, it does actually function pretty well. Compared to Synapse, Power BI Tabular, Fabric, Azure ML, ... that's already a big, big, big step forward.
To be honest, I've completely lost my sense of scale with money in general. It all feels like Zimbabwe dollars to me. The news talks billions and trillions; meanwhile, friends who used to be well-off (in the US/UK/EU) struggle with mortgage payments and/or bills, and the ones not laid off are expected to grind 10 hours per day to keep their jobs.
All good if you don't have any data on the Databricks lakehouse. If it's already on the lakehouse, it's one more click to reverse-ETL all of it to Postgres Lakebase and get sub-millisecond query response.
It doesn't look like a typical round for raising capital for investments. Instead:
1. Liquidity: early investors can sell to late-stage investors, since the company hasn't IPO'd. Their previous round looked like that.
2. Markup: the previous investors can increase their valuation by doing another round. It also provides a paper valuation for acquiring new companies. That, combined with preferred stock (always get 1x back), might be appealing and make some investors more generous on valuation.
So if I understand correctly, investors are not really investing for the company's results, but more on the hope that people will continue to invest in the company?
In a kind of a ... ponzi pyramid?
A Ponzi scheme is extreme, where the underlying asset is worthless.
Databricks is a fast-growing company with ~$4B in annualised revenue and huge potential.
Many rounds allocate some portion of the round for liquidity. Similarly, markup strategies are common and valid. For existing investors it works because they have already done research on the company and believe in it, so they put their money where their mouth is. For the company, it may speed up the fundraising process.
Though those strategies carry some risks.
potato, herbalife.
So it is a Ponzi scheme; we are just discussing the size.
> So if I understand well, investors are not really investing for the company results
I don't know where you got that idea. Investors are putting their money into this company because they like the results and believe it's a better investment than their alternatives.
Any time you sell shares you generate some signal about what a company is worth. You can claim the company is worth a $100B all day long, but until you can sell a significant number of fractional shares of the company at that valuation it's just talk.
> In a kind of a ... ponzi pyramid?
A Ponzi scheme or pyramid scheme implies that the company is lying about its results and books. Classic Ponzi schemes might not have any real assets at all; the operators lie about the company and rely on incoming cash from new investors to pay out claims from past investors.
There's no Ponzi here unless you believe Databricks is completely falsifying its operations and results. If any of those investors took their shares to the secondary market, there would be plenty of other investors interested in buying them, because they represent shares in a real company.
Not a Ponzi, but definitely some markup and valuation engineering.
Let’s say that Databricks has 100B valuation (just for the sake of simplicity).
They do this round, and thanks to the markup they can do acquisitions via stock swaps. For instance, let's say that you're Neon, and as a founder you want some sort of exit.
It's preferable to get acquired for $1B with, say, $100M in cash and $900M in Databricks shares at paper valuation, than to wait through a long IPO process.
If the mothership company (Databricks) goes public, you get liquidity and a good payday; in the meantime, you can sell secondaries at a discount.
Man, I am not kidding: at least this company has some returns. Yes, to me it's also definitely risky, but it still has some decent intrinsic value compared to companies whose sole objective is to trade within (MNCs like that should be illegal).
Also, maybe I just want to talk about it, but whenever I hear "Ponzi pyramid" I think about cryptocoins like Bitcoin, and then I remember the people paying $2 to buy $1 worth of BTC in American institutional markets.
My rant about crypto is unwarranted, but I still want to share it: stablecoins are really, really cool, but any native coins/tokens are literally Ponzi pyramids / scams.
Welcome to Silicon Valley in the 21st century
yes.
I had no idea how preferred shares actually worked, so I went down a rabbit hole looking it up. That "always get 1x back" thing you mentioned is called a liquidation preference, which means preferred shareholders get their money back first before anyone else sees a dime.
Turns out there are different flavors too. "Non-participating" means preferred gets their original investment back, then common stock splits whatever's left. "Participating" means preferred gets their money back AND also gets to participate in splitting the leftovers with common shareholders. No wonder investors are willing to pay up for these late-stage rounds when they've got that safety net.
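A toy waterfall makes the difference concrete. This is a minimal sketch with a single preferred series and a 1x preference (the function and numbers are hypothetical; real cap tables stack multiple series, each with its own terms):

```python
def waterfall(exit_value, invested, pct_owned, participating):
    """Split exit proceeds between one preferred series (1x pref) and common."""
    pref = min(invested, exit_value)        # preferred recovers up to 1x first
    leftover = exit_value - pref
    if participating:
        # participating: 1x back AND a pro-rata share of what's left
        to_preferred = pref + leftover * pct_owned
    else:
        # non-participating: the better of the 1x pref or converting to common
        to_preferred = max(pref, exit_value * pct_owned)
    return to_preferred, exit_value - to_preferred

# $100m invested for 20% ownership:
waterfall(1000e6, 100e6, 0.20, participating=False)  # big exit: preferred ~$200m, common ~$800m
waterfall(1000e6, 100e6, 0.20, participating=True)   # big exit: preferred ~$280m, common ~$720m
waterfall(80e6, 100e6, 0.20, participating=False)    # fire sale: the 1x pref takes everything
```

That last case is the safety net: at a low exit, the preferred holders are made whole (as far as the proceeds allow) before common sees anything.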
You should read Venture Deals! Great book by Brad Feld and Jason Mendelson that's really comprehensive in this.
Also a good book for those interested in VC is The Venture Mindset by Ilya Strebulaev.
I can't help feeling this is the first major misstep from Databricks; they are raising the money for their hosted Postgres and AI platform.
AI is not far from dropping into the "trough of disillusionment," and I can't see why Databricks even needs Postgres.
Hopefully I’m wrong as I’m a big fan of databricks.
These definitely seem like bad investments, from my perspective on Databricks.
Databricks is great at offering a "distributed Spark/Kubernetes in a box" platform. But its AI integration is one of the least helpful I've experienced. It's very disruptive to a workflow, and it very rarely offers genuinely useful help. Most users I've seen turn it off, something Databricks must be aware of because they require admin permission for users to opt out of AI.
I don't mean to rant, there's lots that is useful in databricks, but it doesn't seem like this funding round is targeting any of that.
>Most users I've seen turn it off, something Databricks must be aware of because they require admin permission for users to opt out of AI.
This is a very worrying trend: AI enabled by default that you cannot turn off unless you're the admin.
First I've heard of this, but I don't work in tech. Absolutely insane behavior.
Yeah doesn’t seem like core functionality
Everybody wants a piece of that AI bubble, whether it sticks or not, and that's a bad thing for companies' long-term vision.
It might come down like the dotcom-bubble fallout when this thing bursts.
The first major misstep? Brother, they already raised A, B, C, D, E, F, G, H, and I. At what point do you end the suffering?
My personal theory of startups starts with "Series F is for F*cked", I have no idea what it takes to get to a Series K...
Anyone investing is in a k hole
I don't think that it is possible to raise a 100 billion without name-dropping AI in every sentence of every meeting you have with a potential investor...
They are not raising $100B, they are raising money at a valuation of $100B.
What is the investor thesis for coming in at such a multiple? You know they will have to find a greater fool, possibly the public, to buy at an even higher ratio to make any profit on that...
This isn't really venture investing at this point. The valuation risk calculation is very different for preferred shares than common stock, and with a healthy ARR they have very little risk (maybe not much profit, but it's not that different than a bond on some level...).
My company is heavily invested in Databricks, and let me tell you, it sucks. Five minutes to spin up a job that needs to run for 10 seconds is a terrible way to spend one's time and money.
Yes, but you can pay to spin up unlimited numbers of jobs that spend 97% of their time spinning up! Truly cloud scale.
Use serverless then; it's literally the simplest solution. What sort of poor decisions is your team making? It needs to run for 10 seconds, but you're still spinning up a cluster?
Unfortunately, what I see is companies, especially smaller companies who originally got into Databricks because they hired people with Databricks/Spark experience, trying to get away from the platform because it is too expensive. And at that kind of spend, it is just easier to use Snowflake.
... which is also not cheap
Yeah but looks like it is more "managed" and analysts especially prefer writing SQL over Python.
Honestly, as a data engineer on the DWH side, I figure my career is going to come to an end in a few years. AI plus cloud-managed DWHs are going to make all the technical issues trivial, and I'm not someone who is interested in business context. Not sure where to move, though.
I'm really surprised to hear this. If anything, I'd expect that every company transitioning from
> "we want to store/retrieve thin event logs and clickstreams"
to
> "we need to store/retrieve/join thick prose from customer interactions/reviews at every layer of the stack to give our LLMs the right context"
would create a significant need for data engineering for bespoke/enterprise/retail-monster use cases. (And data analysis too, until LLMs get better at tabular data.)
Are you seeing that this transformation need is actually being sufficiently covered by cloud providers, on the ground?
Or that people aren't seeing the problem this way, and are just doing prompt engineering with minimal RAG on static help-center datasets? That seems suboptimal, to say the least.
That's what I think too, but meh, I'm not super interested in that I guess, as it definitely rings the death knell. But I agree that's definitely the future. And people are going to be trained to accommodate AI instead of the other way around; it's much easier!
What's the obvious rationale for going through the whole alphabet of funding rounds, instead of going public / IPOing after "the usual" number of raises?
Wouldn’t the current strategy result in some serious stock dilution for the early investors?
Investors put in $10 billion in a previous round; that's a lot. Somehow, more is needed now. $100M is just 1% of that, so it's not going to massively move the needle. But it does raise the question of where all that cash is going.
My guess is that they might be about to embark on a shopping spree and acquire some more VC-backed companies. They've actually bought quite a few companies already in the past few years, and they would need cash to buy more. The company itself seems healthy and is generating revenue, so it shouldn't strictly need a lot of extra capital; acquisitions would be the exception. You can do those either via share swaps or cash, and of course the cash would mostly go to the VCs backing the acquired companies. Which is an interesting way to liquidate investments. I would not be surprised to learn that there's a large overlap between the VCs backing those companies and those backing Databricks. $100M on top of $10B sounds like somebody wants in on that action.
As a financial construction, it's a bit shady, of course. VCs are using money from big institutional investors to artificially inflate one of their companies so that it can create exits for some of their other investments, via acquisitions financed with more investment. It creates a steady stream of "successes," but it sounds a bit like a pyramid game. At some point the big company will have to deliver some value. I assume the hope here is a gigantic IPO to offload the whole construction onto the stock market.
At least in some ways, the dynamics of this new venture market look similar to the art market. I used to follow several private auctions where most of the auctioneers had some sort of ring, in which from time to time someone needed some liquidity.
Even in situations where some artworks would have had far less value at the public auction houses (Christie's, Phillips, Sotheby's), their preference was to trade them within this circuit of private auctions.
Right. I'm curious how long many of these deca-/centa-unicorn startups can make payroll and pay their cloud bills if all of this AI-FOMO unlimited-exit-liquidity VC investment takes even a pause.
Where did you get the 100M figure from?
Because they don't want the public market to put a real valuation on the company, when they can still raise money with a made up valuation.
Stock dilution doesn't work like that. If a seed investor invests for 5% at a $10mil valuation, and the company goes 10x (i.e., a valuation of $100mil), and the company now raises a $100mil Series K, then the Series K investor owns 50% of the company, and the seed investor gets diluted down to 2.5%. However, the new valuation of the company is now $200mil with the cash that the new investor brought in, effectively making the seed investor's investment worth the same.
It's a smaller piece of a bigger pie.
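The arithmetic from the comment above, spelled out with the same hypothetical numbers:

```python
seed_pct = 0.05       # seed investor bought 5% at a $10m valuation
pre_money = 100e6     # the company has since 10x'd to $100m
new_money = 100e6     # hypothetical $100m Series K

post_money = pre_money + new_money                  # $200m
new_investor_pct = new_money / post_money           # 50%
seed_pct_after = seed_pct * (1 - new_investor_pct)  # diluted from 5% to 2.5%

value_before = seed_pct * pre_money                 # $5m
value_after = seed_pct_after * post_money           # still $5m
```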
To answer your question, the right question to ask is why go public when you can remain private? Public means more paperwork, more legalese, more scrutiny, and less control for the founder, and all of that only to get a bit more liquidity for your stock. If you can remain private, there really isn't much of a reason to not do that.
An IPO means selling to a whole bunch of people, whereas pre-IPO fundraising rounds mean courting a small number of large investors. I think it's partly a sign of the times that there's enough concentrated capital that you can get enough money from private hands to not need to go the IPO route yet.
The private market is getting out of hand, then. I think it makes sense for private companies beyond a certain size to have the same reporting requirements that listed ones do. At these valuations the private market for startups is becoming systemically important.
To some degree, they do -- under SEC rules (Exchange Act §12(g)), private companies with >$10M in assets and 2,000+ shareholders (or 500+ non-accredited investors) have to start public-style reporting. I assume there's some clever accounting to ensure they're not at the 2,000 shareholder cap (perhaps double-trigger RSUs don't count as being a shareholder yet?)
This heavily depends on share classes and preferences. Surely the new investor wants better terms. The issue isn't so much the dilution as the preference: the added risk of never getting a payout at all.
> If you can remain private, there really isn't much of a reason to not do that.
With the exception of founders, it's better for literally everybody else: more scrutiny, more pressure on the C-suite, more liquidity, etc.
Both have benefits. Staying private means a lot less distractions, less investor scrutiny (good and bad), and the general ability to do whatever you want (good and bad).
It's a lot easier to stay long-term focused without investors breathing down your neck. As a private company you're not dealing with shortsellers, retail memers, institutional capital that wants good earnings now, etc..
Of course, the bad side is that if the company gets mismanaged, there's far less accountability and thus it could continue until it's too late. In the public markets it's far easier to oust the C-suite if things go south.
It's a shame that the trend of staying private longer means retail gets shut out from companies like this.
An order of magnitude less scrutiny, but also an order of magnitude smaller investor base. The private markets trade at Palantir levels, so why go public? Also, the private markets now routinely do secondary transactions, so there's even less reason to go public.
A new round is easier than IPO. Especially when the IPO outcome is not necessarily positive.
An IPO needs real numbers; VCs just want buzzwords.
It's funny how we let private companies get away with made-up numbers. Rather than making IPOs easier, owning a private company above a certain valuation should come with at least an obligation for GAAP accounting, independent audits, etc. This is really for the greater good: so what if we see 2-5 years of a beautiful AI bubble if it's going to come crashing down again? It's the lawmakers' and regulators' role to smooth out and dampen the markets' natural tendency to go bubbly.
The premise, presumably, is that VCs are qualified/educated investors and know what they are doing.
If the private markets can offer you the liquidity you need on your terms, then why subject yourself to the scrutiny of the public markets?
Plus the markets are in a weird state right now.
I've never seen a Series K before. I wonder what their cap table looks like.
In India, Zomato[0] (now listed) and Swiggy[1] both had a Series K. SpaceX has only gotten to a Series J, but they've done some secondary sales since. Apparently, Palantir[2] has had a Series K as well, back in 2015.
[0]: https://appedus.com/indias-zomato-raised-500-million-in-seri...
[1]: https://techcrunch.com/2022/01/24/indian-food-delivery-giant...
[2]: https://www.finsmes.com/2015/12/palantir-technologies-raises...
If you Google "Series K investment" basically all the hits are about this. Same applies for J and I - you have to get back to H before you start seeing anyone else.
Technically speaking - "FUCKED"
Why Databricks would do this (rather than IPO) is obvious: when you can raise privately, it's way easier than an IPO. The real question to me is why the investors (new and previous) are going along with it.
Because it is a better valuation than what they would get in the public markets with an IPO?
You'd think previous investors would want some actual liquidity though at some point. The early investors have had plenty of chances now but surely not everyone's been able to cash out. But hey, they have lots of funny money now I guess?
If Thrive and Andreessen, who are both very old investors, are leading the round, clearly they are doubling down on it.
Looks like someone is thinking "hey, let's wave our hands in the air and talk about AI and someone will write us a cheque!" as a way to kick the can down the road on the fact that, this far in, they're still not selling a product that's making money. Looks a bit desperate, TBH.
Yep, I still don't understand how hosting notebooks and Spark (which, btw, is ancient big-data tech) is worth $100B.
What's so hard about this? I don't get it.
What do most people use now in place of Spark?
A lot of usage has moved to Snowflake. I know Snowflake cannot do everything Spark does, but a huge number of Spark workloads can be moved to Snowflake (which has a superior UX).
All the smart boomers retired 100 years ago. Now we are left with the dumdums. Forget "it's the economy, stupid"; now it's "it's the boomers, stupid."
Are there any cheaper alternatives to the Databricks + EC2 + DynamoDB + S3 solution, where cost is more predictable and controlled?
What's a good roll-your-own solution? The DB storage doesn't need to be dynamic like with DynamoDB; at most 1TB, maybe double in the future.
Could this be done on a mid-size VPS (32GB RAM) hosting Apache Spark etc., or would it be better to have a couple?
P.S. Total beginner in this space, hence the (naive) question.
Depends on how you define cheaper. You could set up Apache Iceberg, Spark, MLflow, Airflow, JupyterLab, etc. and create an abomination that sort of looks like Databricks if you squint, but then you have to deal with setup, maintenance, support, etc.
Computationally speaking, it again depends on what your company does. Collect a lot of data? You need a lot of storage.
Train ML models? You'll need GPUs, and you need to think about how to utilise those GPUs.
Or...you could pay databricks, log in and start working.
I worked at a company that tried to roll their own, and they wasted about a year on it, and it was flaky as hell and fell apart. Self-hosting makes sense if you have the people to manage it, but the vast majority of medium-sized companies will have engineers who think they can manage this, try it, fail, and move on to another company.
Don't worry, most places go straight with databricks and get a flaky as hell system that falls apart anyway, but then they can blame databricks instead of their own incompetence.
I'm surprised at how often this is reality. Bureaucrat at the top of the decision tree smiles smugly while describing how easy they're accomplishing <goal> with <system>. I've been that bureaucrat too many times.
Yeah, where IT blocks half of the config, and you disable half of the features that could make it great, just to make sure they definitely don't give control to... GASP... A DATA ENGINEER.
To be fair, the data engineer is probably either a data analyst turned DBA who yearns for the comfort of SQL Server or, worse, the lowest bidder.
Hey I did more than that (eventually) but this person knows what they are talking about lol.
I don't think there is anything out there that really bundles everything exactly like databricks does.
There are better storage solutions, better compute, and better AI/ML platforms, but once you start with Databricks you dig yourself into a hole, because replacing it is hard: it has such a specific subset of features across multiple domains.
In our multinational environment, we have a few companies on different tech stacks (the result of M&A). I can say Snowflake can do a lot of the things Databricks does, but not everything. Teradata is also great and somehow not gaining a lot of traction. But it is near impossible to get into as a startup, which does not attract new talent to give it a go.
On the ML side, Dataiku and Datarobot are great.
Tools like Talend, SnapLogic, and Fivetran are also really good at replacing parts of Databricks.
So you see, there are better alternatives for sure, and cheaper at the same time, but there is no drop-in replacement I can think of.
Exactly this. But you don't really want to bundle straight away: think about the exact problem you have and then solve exactly that problem. After you've sorted a few problems like this, consider whether a bundled platform is useful.
Thanks for this. Lots to look into.
Maybe I wasn't super clear: I wasn't looking for a 1:1 replacement.
I'm trying to understand what other options are out there for small teams / projects that don't need all those enterprise features Databricks offers (governance etc.).
For a few TBs of data, well partitioned and stored in Parquet or some such format, you could just use DuckDB on a single node.
Thanks - will check out DuckDB.
It's been mentioned, but I want to add that the original idea in the post (a mid-size VPS hosting Apache Spark) might be missing that Spark is built for distributed and resilient work (if a node fails, the framework is able to avoid losing that work).
If you don't need these features, especially the distributed one, going tall (a single instance with high capacity, replicated when necessary) or going simpler (multiple servers, but without Spark coordinating the work) could be good options, depending on your/the team's knowledge.
Exasol costs us a fraction of what we used to pay for Databricks, and that is even with us serving far more users than we used to do (from a data size perspective we are not at the petabytes scale yet, but getting there).
Self host on Hetzner, it will save you time, money and troubles.
> Series K
I've never seen such an investment round. Aren't you supposed to stop at C or D... or at least at some point?
Yes, they need to stop at Z.
triple AAA
If they run out of letters, will they eventually raise a series AA?
Imagine the funding they get 10 years later when they finally do a AAA round.
/s
lmao good one
My little Databricks story: we set up hosted model inference for an in-house model. Worked great for several months!
But then they did maintenance and broke the entire feature. Reconfiguring everything from scratch didn't work: a key part where a Docker image is selected had been replaced with a hard-coded value that included a long system path (and an employee name, verified via LinkedIn).
Because of constant turnover in account reps, we couldn't get any help there. General support was of no use. We finally got acknowledgement of the issue when we got yet another new account rep, but all they did was push us towards paid support.
We exhaustively investigated the issue and it was clearly the case that nothing could be done on our end to resolve it. The entire underlying compute layer was busted.
Eventually they released a newer version of the feature which did work again, but at this point it has become impossible to justify the cost of the platform and we're 100% off.
Good luck to them, but from my experience the business fundamentals are misaligned and it's not a company I hope to ever work with again.
$2-3B in 2024 revenue, based on estimates I can find. That's a 33-50x revenue multiple, lol.
Also, announcing the signed term sheet but not the close; so this is a PR push to find more investors?
It's $3.7 billion as of now, as a Google search will tell you.
I prefer full year actuals but thanks!
A PE ratio of 40 isn't bad in this market, actually. Mature companies like Google/Meta are hovering around 30.
That is earnings (net income), not revenue (top line), so these are wildly different and incomparable numbers.
Got it - thanks for the correction.
Just finished ripping out Databricks at one of my clients, and have several more queued up. Folks can't wait to get as far away as they can, and as fast as they can from any of their offerings. Poor performance, bad product, bad UX: hard to get even decent logs out of the damn thing, and it's incredibly overpriced.
They told a good story and had a good sales team, but the writing is on the wall for them.
What's the deal size you're able to rip out?
$150k for the current client deal, but I've had projects as high as $300k converting their offerings to plain Kubernetes and Serverless.
Palantir did the same, and did pretty well in the end with that last surge of cash.
Foundry is a MARVELOUS stack! [And VERY expensive!]
Prediction for 2026 - investors will be shitting bricks.
Shitting data bricks, presumably.
It will be a nice discount acquihire for Microsoft in a few years.
Databricks on azure is huge. I've heard that in some Azure regions, over 70% of the compute usage is just Databricks. So there is definitely an incentive for MS to acquire them.
Except that Microsoft looks better if you have the illusion of choice: Azure Databricks, or Azure Databricks but you have to build it yourself out of janky Azure services.
So, Azure growth is circular: VC funding for OpenAI and Databricks gets funneled into Azure for growth. What happens when things go into reverse gear?
I always struggled to understand how you get a company to adopt a platform like Databricks to "manage data". Isn't managing data a minefield with plenty of open-source pieces of software that serve different purposes? Who is the typical Databricks customer?
I think that's the main offering of Databricks: you get a "data platform in a box", and navigating the forest of piecemeal solutions is replaced with telling your data science and analytics teams to "use Databricks".
It's easy to look on knowing lots about data tools and say "this could be better done with open source tools for a fraction of the cost", but if you're not a big tech company, hiring a team to manage your data platform for 5 analysts is probably a lot more expensive than just buying databricks.
What exactly is a "data platform"?
We have a large postgres server running on a dedicated server that handles millions of users, billions of record updates and inserts per day, and when I want to run an analysis I just open up psql. I wrote some dashboards and alerting in python that took a few hours to spin up. If we ever ran into load issues, we'd just set up some basic replication. It's all very simple and can easily scale further.
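As a sketch of how little code that style of alerting needs (sqlite3 stands in here for a live Postgres connection; the table and threshold are made up for illustration):

```python
import sqlite3

# Stand-in for a psycopg2/psql connection to the production replica.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, "ok"), (2, "error"), (3, "error")])

def check_error_rate(conn, threshold=0.5):
    """Alert if the share of error rows exceeds a threshold."""
    total, = conn.execute("SELECT COUNT(*) FROM events").fetchone()
    errors, = conn.execute(
        "SELECT COUNT(*) FROM events WHERE status = 'error'").fetchone()
    rate = errors / total
    return rate, rate > threshold

rate, alert = check_error_rate(conn)
print(f"error rate {rate:.0%}, alert={alert}")  # error rate 67%, alert=True
```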
Sounds like you have the benefit of a nicely designed server and good practices. A lot of companies aren't the same.
Imagine you're a big company with loads of teams/departments, multiple different types of SQL servers for data reporting, plus some Parquet data lakes, and hey, just for fun, why not a bunch of CSVs.
Getting data from all these locations becomes a full time job, so at some point someone wants some tool/ui that lets data analysts log into a single thing, and get the experience that you currently have with one postgres server.
I think it's not a problem of scale in the CS sense, more the business sense where big organisations become complex and disorganised and need abstractions on top to make them workable.
We have Databricks at my company: $50M ARR, 150 employees, still growing at 15% YoY, with 0 full-time data engineers (1 data scientist + 1 DB admin co-manage everything on there as part-time jobs; they have their own full-time roles). We are able to have data from around 100 transactional database tables, Zendesk, all our logs of every API call, every single event from every user in our mobile and web applications, banking data, calendar data, Google Play Store data, Apple App Store data, all in one place. We are a 2-sided marketplace, so we can easily get a 360-degree view of our B2B customers and B2C customers, and measure employee productivity across all departments. It's that deep data understanding of our customers that powers our growth.
My team of 3 data scientists is able to support a culture of experimentation and data-informed decision making across the entire org.
And we do all that on a $30k annual spend on Databricks. That's less than 1/5 the cost of 1 software engineer. Excellent value for money if you ask me.
I really struggle to imagine being able to do that any cheaper. How else can we engineer a data hub for all of our data, manage appropriate access & permissions, run complex calculations in seconds (yes, we have replaced overnight complex calculations done by engineering teams), and join data from so many disparate sources, at a total cost (tool + labor) < $80k/yr? I double-dare you to suggest or find me a cheaper option for our use case.
Simple businesses don't need Databricks. One humongous Postgres handling operational transactions is what very simple businesses need.
You kill off all the open-source pieces; in turn compliance is happy, and the CTO is happy because he has a maintenance contract and can blame other people if stuff goes wrong.
It's a way to get those pesky Python people to shut up
Oh, and a CTO is always valued more if he manages a $5 million Databricks budget, where he can prove his worth by showing a 5% discount he negotiated very well, than a $1 million whatever-else budget that would be best in class. Everybody wins.
makes for good boilerplate conversation while playing golf too
> who is the typical databricks customer?
The CTO of a "traditional" company who is responsible for "implementing digital transition".
My company is doing the dbx thing, and the best I can tell my manager is that I'm neutral on it.
My working theory is that the UI, a low-grade web-based SQL editor and catalog browser, is more integrated than the hodgepodge of tools that we were using before, and people may gain something from that. I've seen similar with in-house tools that collect ad-hoc/reporting/ETL into one app, and one should never underestimate the value that people place on the UI.
But we give up price-performance; the only way it can work is if we shrink the workload. So it's a cleanup of stale pipelines combined with a migration. Chaos in other words.
I think the governance stuff might push it over the top for a lot of organisations; it's pretty well integrated with IAM providers not only for structured/modelled data but also workspaces for the data sciencey stuff. Pretty much everything has permissions associated with it. When you have a big data engineering/science push off the back of the AI hype I think it appeals to the cheque writers to have something centralised and controlled.
Aside from that, I do get the feeling that most small and medium-sized companies have been oversold on it: they don't really have enough data to leverage a lot of the features, and a lot of the time they don't really have the skills to avoid shooting themselves in the foot. It's possible for a reporting analyst who is upskilling to learn the programming skills to not create a tangled web of Christmas lights, but not probable in most situations. There seems to be a whole cottage industry of consultancies now that purport to get you up and running, with limited actual success.
At least it's an incentive for companies to get their data in order and standardise on one place and a set of processes.
In terms of actual development, the notebook IDE feels like a big old turd to use, and it feels slow in general if you're at all used to local dev. People do kinda like these web-based tools, though. Can't trust people all the time! There are VS Code and PyCharm extensions, but my team works mainly with notebooks at the moment, for good or ill, and the experience there is absolutely flaky dogshit.
I think it's possible to make some good stuff with it and it's paying my bills at the moment, but I think a lot of the adoption may be doomed to failure lol
Wonder what the employees think. Will they ever IPO and cash out?
Since this year the employees vest RSUs (not options, and with no expiry date) quarterly; a portion is sold automatically to pay taxes at each vesting event, since the expiry date no longer exists. For liquidity there are tenders where employees sell their stock privately, so employees no longer need an IPO to cash out.
Just to clarify: for many years employees were getting RSUs, not options, just with an expiration date attached, and that date is gone as of this year.
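A toy sketch of the sell-to-cover mechanics described above; the share count, price, and tax rate here are hypothetical, not Databricks figures:

```python
def sell_to_cover(shares_vested, share_price, tax_rate):
    """At each vesting event, automatically sell enough shares to cover withholding."""
    tax_due = shares_vested * share_price * tax_rate
    shares_sold = tax_due / share_price   # fractional shares, for simplicity
    return shares_vested - shares_sold, shares_sold

# Hypothetical vesting event: 100 shares at $50, 40% withholding.
kept, sold = sell_to_cover(shares_vested=100, share_price=50.0, tax_rate=0.4)
print(kept, sold)  # 60.0 shares kept, 40.0 sold to cover tax
```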
So what happened to employees who had RSUs with expirations that have passed? Do they lose the value? I know my startup stock had 10yr expirations.
It didn't happen; they were careful to run a tender before expiration hit anyone.
I've heard they're regularly doing buybacks for employees.
If their options haven't converted to stock yet, it's not looking good. This is the sort of shenanigans that demand a strike. And ideally regulation.
Options can be a significant portion of a sign-on bonus, but they typically vest over several years, so I guess they are hoping for an IPO eventually. IMHO Databricks will be overtaken by "events", including AI disillusionment, broader open-source tools, and broader education across the workforce. So the eventual IPO will not happen.
Depends. Some options only vest in the case of an "exit event", i.e. an acquisition or an IPO. At this point I would assume such options are borderline worthless.
Yeah I think this is how it usually works, and yeah at $100bn valuation they are now 100% worthless, because investors get paid first, and there's no way they'll get sold or IPO for more than $100bn.
> Yeah I think this is how it usually works, and yeah at $100bn valuation they are now 100% worthless, because investors get paid first, and there's no way they'll get sold or IPO for more than $100bn.
Not quite right? The raise-implied valuation doesn't account for preferences. The IPO could be for $50B and the latest investors could still do well, given that the preference stack puts the last money in first in line to get out.
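A toy waterfall illustrating the point, with all numbers invented for the sketch (a 1x non-participating senior preference is assumed; real terms vary):

```python
def last_money_payout(invested, post_money, exit_value):
    """1x non-participating senior preference: the investor takes the greater
    of their money back or their pro-rata share of the exit."""
    ownership = invested / post_money
    return max(invested, ownership * exit_value)

# Put $1B in at a $100B post-money; the company later exits at only $50B.
payout = last_money_payout(invested=1e9, post_money=100e9, exit_value=50e9)
print(payout / 1e9)  # 1.0 -> made whole by the preference, vs $0.5B as common
```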
This curvature of spacetime is caused by the mass of the AI bubble.
While many comments were focused on the "K" letter, I wanted to remind us all that OpenAI stretched their Series E from Jan 23, 2023 to Nov 22, 2024 -- 23 months, squeezing in 6 rounds
source: https://tracxn.com/d/companies/openai/__kElhSG7uVGeFk1i71Co9...
New meaning for "AI singularity" there.
Pull out the Prince albums, it's time to party like it's 1999.
https://www.youtube.com/watch?v=I6IQ_FOCE6I
For laypeople this is like the "what does Salesforce even do" meme, but the explanation is a million times more ridiculous...
How is this company worth even 1% of that?
Their product looks like basic wrappers for managing postgres instances and dashboards. Why would anyone with even minimal technical expertise pay for a generic service like that?
A lot of people purchasing their products have a vague understanding of the problem they're trying to solve and an even worse grasp of how dbx solves it for them. I'm living this first hand.
We have Databricks at my company: $50M ARR, 150 employees, with 0 full-time data engineers (1 data scientist + 1 DB admin manage everything on there as part-time jobs). We are able to have data from around 100 transactional database tables, Zendesk, all our logs, every single event from every user in our mobile application, and banking data, all in one place. We are a 2-sided marketplace; we can easily get 360-degree data on our B2B customers and B2C customers, and measure employee productivity.
My team of 3 data scientists is able to support a culture of experimentation and data-informed decision making across the entire org, and we are still growing 15% YoY.
And we do all that on a $30k annual spend on Databricks. That's less than 1/5 the cost of 1 software engineer. Excellent value for money if you ask me.
I struggle to imagine how else we could engineer a hub for all of our data and manage permissions appropriately at less tooling and engineering cost.
Do you use any paid support from Databricks?
No, we don't. Our plan includes some support, but we honestly haven't needed it. We are also aggressive about sizing compute resources to the task, and we forgo some of the more costly "easier serverless" options that Databricks provides. Their serverless SQL, though, is excellent value for money.
A company that does $4B in revenue at nice margins should be worth more than $1B.
what screen are you even looking at when you log on to databricks? does your admin not provide you any access?
Not sure if this reply was meant for me, but I've never even heard of databricks until today. I had just taken a look at their website.
At my company we just use a large self-managed postgres server and I access it directly.
Private markets are starting to look a little frothy, aren't they?
Regardless of the product and the idea they had, a company that is 15 years old and has raised 10+ billion dollars still needing to raise money after all this time is ridiculous.
Not being sustainable after all this time and billions of dollars is a sign the company is just burning money, and a lot of it. WeWork vibes.
They were expecting to be cash flow positive in Jan 2025, according to [0]. That said, it is hard to tell if they actually became cash flow positive since with them still being a private company, they aren't required to release that information.
[0]: https://www.databricks.com/company/newsroom/press-releases/d...
Whenever companies release glowing fluff PR about their amazing financials, the key word in there is "non-GAAP."
i.e. when we exclude a bunch of pesky costs and other expenses that are the reason we’re not doing so well, we’re actually doing really well!
Non-GAAP has its place, but if used to say the company is doing well (vs like actual accounting) that’s usually not a good sign. Real healthy companies don’t need to hide behind non-GAAP.
Yes, but free cash flow is free cash flow, and that's what matters for survival (i.e., runway). So long as FCF is positive, you'll never go bankrupt.
Really, what they don't tell you is how much SBC they have. That's what crushes public tech stocks so much. They'll have nice FCF, but when you look under the hood you realize they're diluting you by ~5% every year. Take a look at MongoDB (picked randomly): it went public in 2017 with 48.9m shares outstanding. Today it has 81.7m shares outstanding, 67% dilution in 8 years.
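The share-count arithmetic, as a quick sketch (share counts are the comment's figures, not independently verified):

```python
shares_ipo = 48.9e6   # shares outstanding at IPO (comment's figure)
shares_now = 81.7e6   # shares outstanding today (comment's figure)

dilution = shares_now / shares_ipo - 1
print(f"{dilution:.0%} more shares outstanding")  # 67% more shares outstanding
```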
This. To me if you are still unprofitable after 15 years you are not really a business.
However, I'm genuinely curious about the thesis applied by the VCs/funds that invest in such a late-stage round. Is it simply that they are taking a chance that they won't be the last person holding the potato? Like, they will get out in Series L or M rounds, or the company may IPO by then, and either way they will make a small return? Or is the calculus different?
The last person in usually gets the best deal, in that they can get preference and push everyone else (previous investors, founders, and employees) down. If things go south, they get their money out before anyone else.
Isn't everyone "the last" at the moment they are participating in the round? If someone thinks they're going to get preferential treatment in Series C or D, and then someone comes in at Series E with preferential treatment, then they get pushed down the stack too.
Why don't early investors put clauses in their investment to protect themselves against being screwed over by later investors? It seems like an obvious thing to ask for if you're giving someone a lot of money, so I'm assuming there must be a very good reason it's not done.
Early investors (the main ones at least) usually get pro-rata rights, which means you can invest in later rounds to maintain your ownership percentage (i.e., a later round dilutes your ownership, so you invest a bit until the ownership stays the same).
But the pref stack always favors later investors, partly because that's just the way it's always been: if you try to change that now, no one will take your money, and later investors will not invest in a company unless they get the senior liquidation pref.
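A quick sketch of the pro-rata arithmetic under illustrative numbers (none of these are real figures): keeping your percentage costs your ownership fraction times the new round's size.

```python
def pro_rata(your_shares, total_shares, round_size, post_money):
    """Check: investing (ownership fraction) x (round size) keeps ownership flat."""
    ownership = your_shares / total_shares
    investment = ownership * round_size
    price = (post_money - round_size) / total_shares   # pre-money price per share
    new_shares_issued = round_size / price
    your_new_shares = investment / price
    after = (your_shares + your_new_shares) / (total_shares + new_shares_issued)
    return ownership, after, investment

# Own 10% pre-round; the company raises $1B at a $5B post-money.
before, after, investment = pro_rata(
    your_shares=10e6, total_shares=100e6, round_size=1e9, post_money=5e9)
print(before, after, investment)  # ownership stays at 10%; costs $100M
```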
The VCs should; those are called anti-dilution measures.
It's the less financially/legally savvy parties, like angel investors and early employees, who (sometimes) get screwed out of valuation.
> However genuinely curious about the thesis applied by the VC’s/Funds that invest in such a late stage round
1) It's evaluated as any other deal. If you model out a good return quantitatively/qualitatively, then you do the deal. Doesn't really matter how far along it is.
2) Large private funds have far fewer opportunities to deploy because of the scale. If you have a $10B fund, you'd need to fund 2,000 seed companies (at a generous $5m on $25m cap). Obviously that's not scalable and too diversified. With this Databricks round, you can invest a few billion in one go, which solves both problems.
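The deployment math in (2), spelled out (the $10B fund and $5m check are the comment's illustrative numbers):

```python
fund_size = 10e9    # a $10B late-stage fund
seed_check = 5e6    # a generous seed check ($5m on a $25m cap)

deals_needed = fund_size / seed_check
print(int(deals_needed))  # 2000 seed deals to deploy the whole fund
```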
I guess making a quick buck pre-IPO? It's essentially lending cash on loose terms.
Why they do it via an equity offering and not debt is unclear. You'd imagine the latter is cheaper for a hectocorn.
Anchoring IPO expectations and hype. 100B valuation is useful.
Even more if you are familiar with their pricing.
This! We did some simple testing on their platform to integrate it into our product for a customer. A few days of light work rang up a huge bill, many multiples of what we spend on OpenAI, which gets heavy use.
It does have some good aspects for engineers working with these tools in some cases: https://ludic.mataroa.blog/blog/i-accidentally-saved-half-a-...
Man this article was fantastic. Thanks for sharing!
Granted, OpenAI is burning tons of cash as well. The great tech race of cash burning is at full steam!
It's unfair to compare Databricks to OpenAI because they're at very different points in the enshittification[1] process.
OpenAI is still early, burning VC money to acquire customers by operating at a loss. This makes it appear cheap.
DataBricks is further along, attempting to claw back the value they provided to customers by raising prices.
[1] https://en.wikipedia.org/wiki/Enshittification
That may be, but our use of DB was 1/1000 of what we do in a month with OpenAI and the bill we racked up was $3,000 in 1 day. We talked with them and because we freaked out and deleted the widget (whatever the connectors are called) they didn't have logs for what we did, so they couldn't refund anything (they were willing). The fact that they couldn't find anything because we deleted whatever it was, that was weird, because they could certainly bill us. We're never using them again.
But they can't capture that value as long as they have to compete with Snowflake.
They and Snowflake have been in an acquisition race, gobbling up data-engineering startups like Pac-Man.
That costs a fair bit of dosh.
What's the end game? What are investors expecting and how are they expecting the company to get $100b in profits and over what period of time?
Being the dominant player, I presume, starve out the rest by being _the_ way to do your big data.
Do we know that they need to raise and are not sustainable? I don't think them raising is evidence of either.
Why would they raise money if they do not need it? Raising money dilutes existing shareholders, who are probably not too happy about it.
I worked at a place once where the CEO basically said that it's a lot easier to raise money when you don't need it than to raise it when you do. The US economy is looking pretty weird with a bunch of conflicting predictors. Maybe they're buffering for a recession.
It's always true, whether you are a startup or an individual. People throw money at you when you least need it. But when you do need it, they give you all types of hassle.
Depends on who is making the decision and how exactly the funding round is structured; for some investors, diluting other shareholders is actually a good thing. For existing employees, getting an option to partially cash out now is probably better than waiting indefinitely for an IPO, etc.
Afaik Databricks is just selling shares on private markets rather than IPOing in order to retain more independence.
I can't know if it's completely true, of course, but that's what employees are told.
At least it is not unprecedented. Palantir raised a series I in 2020 after 17 years of operation.
At that time the Palantir valuation was considered "hefty/overpriced" at $9B. The current post-IPO valuation is a completely-detached-from-fundamentals ~$378B.
If you were to apply the same ~42x ratio to Databricks, it would have to trade at about $4,200,000,000,000 USD, roughly the entire annual GDP of Japan. A completely rational market if you ask me.
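For what it's worth, the ratio arithmetic, sketched (valuations as quoted in the comment):

```python
palantir_series_i = 9e9   # 'hefty' 2020 Series I valuation
palantir_now = 378e9      # post-IPO market cap per the comment
databricks_round = 100e9  # the current raise valuation

ratio = palantir_now / palantir_series_i
implied = ratio * databricks_round
print(f"{ratio:.0f}x -> ${implied / 1e12:.1f} trillion")  # 42x -> $4.2 trillion
```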
Is this just a case of waiting to stay private while still giving current employees some liquidity?
It feels like they may have got market share using low costs and this has led to this situation.
The costs of using Databricks are anything but low, though.
True, but it can still be lower than alternatives or lower than the cost to provide.
It really is the most expensive I've ever come across. It would be a flat-out no-go if it weren't for Microsoft pushing everyone onto this platform, supported by their network of really absolutely neutral Gartner friends and Deloitte/KPMG/Accenture/TCS "experts" to recommend whatever lines their pockets.
The same story played out with Splunk. Yet it was acquired by Cisco for $28 billion. The valuations, and the ability of Silicon Valley companies to burn cash for 10+ years, never cease to amaze me.
If they’re not profitable by now watch Oracle just buy them in the future and that’ll be that.
Just go public already WTF
Just curious: wouldn't it get harder for companies like Databricks or ClickHouse to compete against AWS in the long run? They have better products, for sure. Yet over time the product gap between what they offer and what AWS offers will narrow, and as a result cost will be what matters most to customers. And how can they compete on cost, given that they run on AWS?
AWS is always going to be a lego-bricks, choose-your-own-adventure, assemble-your-own-stuff option for people who want to struggle with IAM policies. Plus, AWS VPs compete against each other, so there is no concept of the best opinionated way of doing things, just a bunch of random options with tradeoffs. That might attract some nerds; nobody else.
Theoretically, yes: if AWS were really focused on it, they could probably deliver something like Databricks; all the components are off the shelf, and a significant number of Databricks clusters run on AWS anyway. The question, though, is why: Databricks is already driving a lot of traffic to AWS and managing all the end-customer stuff. The benefit of killing Databricks is less than letting it live, grow, and buy more from AWS.
Good point. I had assumed that AWS EMR and Redshift had incentives to compete with Databricks. Another assumption was that someone at AWS would eventually be ambitious enough to add offerings similar to Databricks', like how AWS added MSK and OpenSearch. Both assumptions could well be wrong, though.
> Series K
What do they do when they run out of letters?
We'll transition to using a UUID v4
This fucker Ghodsi will do everything but go public
then he'll really have come clean
Who the fuck wants to do Sarbanes-Oxley? SOX killed IPOs. The private market is quite liquid. Why attract activists and losers with an agenda to your company?
>Series K
Mega lmao. They already owe $20B.
Their revenue is good, though, further adding to the mystery.
I'm as skeptical as anyone, but have you ever heard of companies like Oracle, which got rich off a database, or Snowflake (current market cap $65B)? Companies pay oodles of money for those capabilities.
oracle succeeded because of its lobbyists and sales contacts, so much so that they spun out into another multi billion $ org
I'd imagine pretty much all of the S&P 500 companies rely on Databricks, or at least a large percentage of them.
For what? Managed Postgres and some ML training tools?
Because it's recommended by nearly all consultants, and by Microsoft.
Simple as that: it's consulting heaven, much like SAS and SAP. Everybody's happy. Now, to be fair to Databricks, if used properly (and ignoring the cost) it does actually function pretty well. Compared to Synapse, Power BI Tabular, Fabric, Azure ML, ... that's already a big, big, big step forward.
Databricks is janky, but so much better than the Azure services for data.
If you're buying from Microsoft it won't be cheap either way, so you might as well treat yourself a little bit.
Don't forget Salesforce.
Spark
> and Lakebase, a new type of operational database (OLTP), built on open source Postgres, and optimized for AI Agents.
Rust + Cloud Object Store/serverless/S3 + Postgres. Slap "AI agents" on top: keyword peak reached. So they will easily raise the 100bn.
Meanwhile, this is Lakebase/Neon: https://blog.opensecret.cloud/why-we-migrated-from-neon-to-p...
Due diligence? Taboo.
They’re not raising $100b. They’re raising _at_ $100b.
I stand corrected.
To be honest, I've completely lost the sense of scale with money in general. It all feels like Zimbabwe dollars to me. The news talks billions and trillions. Meanwhile, friends who used to be well-off (in the US/UK/EU) struggle with mortgage payments and/or bills. And the ones not laid off are expected to grind 10 hours per day to keep their jobs.
https://en.wikipedia.org/wiki/HyperNormalisation
https://www.theguardian.com/wellness/ng-interactive/2025/may...
All good if you don't have any data on the Databricks lakehouse. If it's already on the lakehouse, it's one more click to reverse-ETL all that to the Postgres Lakebase and get sub-millisecond query response.