Maybe it's the quotes selected for the article, but it seems like the judge simply doesn't get the objections. And the reasoning is really strange:
"Even if the Court were to entertain such questions, they would only work to unduly delay the resolution of the legal questions actually at issue."
So because the lawsuit pertains to copyright, we can ignore possible constitutional issues because it'll make things take longer?
Also, rejecting something out of hand simply because a lawyer didn't draft it seems really antithetical to what a judge should be doing. There is no requirement for a lawyer to be utilized.
> … but it seems like the judge simply doesn't get the objections. And the reasoning is really strange
The full order is linked in the article: https://cdn.arstechnica.net/wp-content/uploads/2025/06/NYT-v.... If you read that it becomes more clear: The person who complained here filed a specific "motion to intervene" which has a strict set of requirements. These requirements were not met. IANAL, but it doesn't seem too strange to me here.
> Also, rejecting something out of hand simply because a lawyer didn't draft it seems really antithetical to what a judge should be doing. There is no requirement for a lawyer to be utilized.
This is also mentioned in the order: An individual has the right to represent themselves, but a corporation does not. This was filed by a corporation initially. The judge did exactly what a judge is supposed to do: interpret the law as written.
That appears to be the case to me too. There are so many similar services that people use that could fall victim to this same issue of privacy. We should have extremely strong privacy laws preventing this, somehow blocking over-broad court orders. And we don't. Imagine these cases:
1. a court order that a dating service saves off all chats and messages between people connecting on the service (instead of just saving off say the chats from a suspected abuser)
2. saving all text messages going through a cell phone company
3. how about saving all google docs? Probably billions of these a day are being created.
4. And how has the govt not tried to put out a legal request that Signal add backdoors and save all text messages (because there will no doubt be nefarious users like our own secretary of defense)? I think it would take a very significant reason to succeed against a private organization like Signal.
The power and reach of this makes me wonder if the US govt already has been doing this to normal commercial services (outside of phone calls and texting). I recall reading back in the day they were "tapping" / legally accessing through some security laws phone company trunks. And then we learned about tapping google communications from Edward Snowden.
> We should have extremely strong privacy laws preventing this, somehow blocking over-broad court orders
Quick question. Should your perceived "right to privacy" supersede all other laws?
To extrapolate into the real world. Should it be impossible for the police to measure the speed of your vehicle to protect your privacy? Should cameras in stores be unable to record you stealing for fear of violating your privacy?
I think there's an idea akin to Europe's "right to be forgotten" here.
We can all observe the world in the moment. Police can obtain warrants to wiretap (or the digital equivalent) suspects in real-time. That's fine!
The objection is that we are ending up with laws and rulings that require a recording of history of everyone by everyone - just so the police can have the convenience of trawling through data everyone reasonably felt was private and shouldn't exist except transiently? Not to mention that perhaps the state should pay for all this commercially unnecessary storage? Our digital laws are so out-of-touch with the police powers voters actually consented to - that mail (physical post) and phone calls shall not be intercepted except under probable cause (of a particular suspect performing a specific crime) and a judge's consent. Just carry that concept forward.
On a technical level, I feel a "perfect forward secrecy" technique should be sufficient for implementers. A warrant should have a causal (and pinpoint) effect on what is collected by providers. Of course we can also subpoena information that everyone reasonably expected was recorded (i.e. not transient and private). This matches the "physical reality" of yesteryear - the police can't execute a warrant for an unrecorded person-to-person conversation that happened two weeks ago; you need to kindly ask one of the conversants (who have their own rights to privacy / silence, are forgetful, and can always "plead the 5th").
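To make the forward secrecy point concrete, here's a minimal sketch of the ephemeral-key idea in Python, using the `cryptography` package (the session flow and names are illustrative, not any real provider's protocol):

```python
# Sketch: forward secrecy via ephemeral X25519 key agreement.
# Each session uses fresh key pairs; once the private keys are discarded,
# past transcripts cannot be decrypted even if a warrant arrives later.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def ephemeral_key_pair():
    # One key pair per conversation, never written to disk.
    priv = X25519PrivateKey.generate()
    return priv, priv.public_key()

alice_priv, alice_pub = ephemeral_key_pair()
bob_priv, bob_pub = ephemeral_key_pair()

def derive_session_key(priv, peer_pub):
    # Both sides arrive at the same symmetric key from the exchange.
    shared = priv.exchange(peer_pub)
    return HKDF(algorithm=hashes.SHA256(), length=32,
                salt=None, info=b"session").derive(shared)

assert derive_session_key(alice_priv, bob_pub) == derive_session_key(bob_priv, alice_pub)

# After the session, the private keys are deleted. A provider that only
# ever held ephemeral keys has nothing responsive to a retention order.
del alice_priv, bob_priv
```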
> Not to mention that perhaps the state should pay for all this commercially unnecessary storage?
They do. Not upfront, but they pay very nicely for inconveniencing businesses when demanding the data.
Didn't the Supremes decide there was no constitutional right to privacy as a side effect of overturning Roe? (Or, at least, throw it into full Calvinball mode...)
Not really. The reason the overturning of Roe was widely considered to be inevitable, even by jurists who were pro-choice, is that the theory of privacy used in that case was fundamentally incompatible with a broad range of regulatory powers most people think the Constitution grants the Federal government.
The reasoning behind Roe was generally regarded as tenuous even by the justices that supported it. Overturning it was required to defend the government’s Constitutional authority for agencies like the FDA, which was undermined by inconsistencies introduced by Roe v Wade. Eventually those judicial inconsistencies come home to roost.
tl;dr: Roe being overturned had little to do with privacy and more to do with protecting specific regulatory powers from being rendered unconstitutional by the same reasoning introduced in Roe v Wade.
Removing such decisions from Federal purview was an elegant solution to the problem, with the practical effect of deferring all such decisions to voters at the State level.
I think both you and pyuser583 are correct to a certain extent. The stated reasons for overturning Roe were because of the tenuous basis as a privacy grounded position. On the other hand I completely believe that Roe was such a politically charged issue that the judges voted according to their allegiance, even though they are not supposed to have such allegiances. Everybody would have been surprised if any of the judges had decided differently to how they did. So, while there may have been a legal argument to be made, I don't think that particular issue was decided on those grounds.
Your two examples don't map to the concern about data privacy.
Speed cameras only operate on public roads. The camera in the store is operated by the store owner. In both cases one of the parties involved in the transaction (driving, purchasing) is involved in enforcement. It is clear in both cases that these measures protect everyone and they have clear limits also.
Better examples would be police searching your home all the time, whenever they want (This maps to device encryption).
Or store owners surveilling competing stores / forcing people to wear cameras 24/7 "to improve the customer experience" (This maps to what Facebook / Google try to do, or what internet wire tapping does).
> searching your home all the time, whenever they want
What? How does OpenAI map to your home at all? This is pure nonsense. You seem to have been a little too quick to dismiss the comparison to driving out of hand.
The internet is, like the roads, public infrastructure. You can't claim that encryption makes all traffic on the public infrastructure as private as staying home.
You sound like one of those "free man of the land" guys: "I'm not driving your honor, I was traveling."
No right or law supersedes all other laws, and obviously no-one is asking for that. But nor should a court just be able to order anything they want without regard to who else is affected.
Judges generally are either lawyers or have legal experience. The judge in question was formerly a practicing lawyer, albeit one who claims to have nearly completed a Ph.D. in zoology.
It's extremely unlikely that a protected class is going to start treating a non-protected class with the same regard in society.
I'm not even sure how there could be a constitutional issue here, but it probably isn't for this court to figure out anyways.
>Also, rejecting something out of hand simply because a lawyer didn't draft it seems really antithetical to what a judge should be doing. There is no requirement for a lawyer to be utilized.
This is 100% wrong. Pro se litigation is well regulated. In the first case, a non-lawyer tried to file on behalf of his company, which is not himself, so it's not pro se; and you need to be a licensed lawyer to represent someone else.
"So because the lawsuit pertains to copyright, we can ignore possible constitutional issues because it'll make things take longer?"
Not quite, the contention is that the judge doesn't see how it would be successful, so it would be a delay that never addresses a constitutional issue by her judgment.
>So because the lawsuit pertains to copyright, we can ignore possible constitutional issues because it'll make things take longer?
What constitutional issues do you believe are present?
> There is no requirement for a lawyer to be utilized.
Corporations must be represented by an attorney, by law. So that's not true outright. Second, if someone did file something pro se, they might get a little leeway. But the business isn't represented pro se, so why on earth would the judge apply a lower standard appropriate for a pro se party to a sophisticated law firm, easily one of the largest and best in the country?
When you are struggling to reason around really straightforward issues like that, it does not leave me with confidence about your other judgments regarding the issues present here.
Do you think the 4th amendment enjoins courts from requiring the preservation of records as part of discovery? The court is just requiring OpenAI to maintain records it already maintains and segregate them. Even if one thinks that _is_ a government seizure, which it isn't (see Burdeau v. McDowell, 256 U.S. 465 (1921); cf. Walter v. United States, 447 U.S. 649, 656 (1980), discussing the "state agency" requirement), no search or seizure has even occurred. There's no reasonable expectation of privacy in the records you're sending to OpenAI (you know OpenAI has them! See, e.g., Smith v. Maryland, 442 U.S. 735 (1979)) and you don't have any possessory interest in the records. See, e.g., United States v. Jacobsen, 466 U.S. 109 (1984).
This would help explain why entities with a “zero data retention” agreement are “not affected,” then, per OpenAI’s statement at the time? Because records aren’t created for those queries in the first place, so there’s nothing to retain?
AIUI, because if you have a zero data retention agreement you are necessarily not in the class of records at issue (since enterprise customers' records are not affected, again AIUI per plaintiffs' original motion, which might be because they don't think they're relevant for market harm or something).
So I think that this is more so an artefact of the parameters than an outcome of some mechanism of law.
> The court is requiring OpenAI to maintain records it would have not maintained otherwise.
Not quite. The court is requiring OpenAI to maintain records longer than it would otherwise retain them. It's not making them maintain records that they never would have created in the first place (like if a customer of theirs has a zero-retention agreement in place).
Legal holds are a thing; you're not going to successfully argue against them on 4A grounds. This might seem like an overly broad legal hold, though, but I'm not sure if there are any rules that prevent that sort of thing.
So the parties are just like duking it out in a parking lot or something? If the government is not involved, then why does OpenAI even bother listening to the judge?
How so? The comment I was responding to made no such distinction - it merely claimed "the government is not involved". In the US, "government" is generally taken to be anybody acting under the banner of nation/state authority (contrast with say the UK). So the government is most certainly involved here - adjudicating the case and issuing this retention order.
(for example, your own comment: the executive and judicial branches [of government])
> How so? The comment I was responding to made no such distinction
Should I mention that water is wet every time I mention water? The executive is the executive, the judicial is the judicial. It's inherent in the discussion and pretending otherwise only for the benefit of furthering obtuse points that go nowhere serves the benefit of no one. So you either didn't know, and do now, or you're just cratering the discussion.
I made an obtuse reply to an obtuse wrong assertion. You then baselessly claimed I didn't know about different branches of government. That second bit is what tends to crater conversations.
The distinction of the judiciary is most certainly relevant to the actual legal analysis here - the judiciary often reserves sweeping authoritarian powers for themselves, even when they do act to restrain the legislative/executive. So without even really analyzing the details, I am pretty sure that the law as written supports this action.
But the comment I was responding to wasn't making a larger more nuanced argument - rather it said that the government was not involved, defining away the actions of the judiciary as somehow not being governmental action, regardless of them being done with the authority of government.
The overall analysis is that if people are up in arms about this, it just reinforces the need for some actual privacy laws in this country - both to protect from corporations themselves abusing our data, and in this case to prevent the government from creating overly broad judicial orders that may only target specific companies but end up running roughshod over many individuals' rights.
(and just to be clear to avoid going off into the terminology weeds again: the definition of rights I'm using is the one of imagined natural rights, not merely what has been codified into law)
It's either ambiguous or a logical fallacy - the original concern is about government overreach in general, so it cannot be dismissed by focusing on the executive when it's obviously not the executive acting.
The commenter I responded to is an attorney who presumably just tried to cut out having to detail a longer argument based on the actual nuances that have been interpreted from the 4th amendment. Still, the argument bit off too much so in the interest of deeper analysis and rational discussion it seemed worth calling out.
The topic's entire complaint is of people not liking a government action, arrived at by government policy. You made a comment that the "government is not involved". Reading your other comments in this thread, it seems as if your point is to brush aside people's concerns as if they are nonsensical. But that simplistic statement is obviously false, because we are talking about a government order. My comments only seem pedantic because I've had to tediously spell out the details in response to your argumentative wriggling instead of you just accepting that your simplistic statement was wrong.
> When you are struggling to reason around really straightforward issues like that, it does not leave me with confidence about your other judgments regarding the issues present here.
Or, perhaps, that's not something known by most. I didn't struggle to understand that, I simply didn't know it. Also, again, the article could have mentioned that, and I started my statement by saying maybe the article was doing a bad job conveying things.
> What constitutional issues do you believe are present?
This method of interrogation of online comments is always interesting to me. Because you seem to want to move the discussion to that of whether or not the issues are valid, which wasn't what I clearly was discussing. When you are struggling to reason around really straightforward issues like that, it does not leave me with confidence about your other judgments regarding the issues present here.
Method of interrogation? I'm not a mind reader. If you don't want to make yourself clear, that's fine. If you want to be petty, that's fine as well.
>Or, perhaps, that's not something known by most. I didn't struggle to understand that, I simply didn't know it.
Sorry, but the concept you put forward was that because a lawyer isn't required (not true, but granting you this for the sake of the conversation), we shouldn't hold lawyers to the standard of a lawyer anyway? That's facially silly.
This doesn't seem especially newsworthy. Oral arguments are set for OpenAI itself to oppose the preservation order that has everyone so (understandably) up in arms. Seems unlikely that two motions from random ChatGPT users were going to determine the outcome in advance of that.
Seems that a judge does not understand the impact of asking company X to "retain all data" and is unwilling to rapidly reconsider. Part of what makes this newsworthy is the impact of the initial ruling.
Retention orders of this kind are not uncommon and the judge has not ordered it be turned over to anyone until they hear arguments on it.
I note with amazement that tons of HN users with zero legal experience, let alone judicial experience, are sure it's the judge who doesn't understand, not them. Based on what, I don't know, but they really are sure they get it more than the judge!
Underlying this issue is that the judicial system (or the patent system, or the political system) is not populated with enough individuals possessing software engineering "common sense."
It is highly likely that this is not confined to just software, I'm sure other engineering or complex disciplines feel the same way about their discipline.
How do we have experts inform these decisions without falling into the trap of lobbying where the rich control the political and legal sphere?
Anyway, I cede you the point that the US law does not match my "common sense" esp around this 3rd party rule mentioned in other comments. It kind of sucks that US "winning the internet" means that even non-US citizens are subject to US law in this regard.
The judicial system is supposed to apply the law, not "common sense". How could it be otherwise? If you don't like the law then take that up with the legislative branch.
Who is meant to pay for all this data retention? If OpenAI win the argument, can they claim the storage costs from plaintiffs?
It's OK to say "don't throw out a few pieces of paper for a bit", but that doesn't compare to "please spend $500k/month more on S3 bills until whenever we get around to hearing the rest of the case". (Perhaps that much money isn't that important to either side in this _particular_ case, but there is a cost to all this data retention stuff).
> Seems that a judge does not understand the impact of asking company X to "retain all data"
You can count on the fact that the judge does in fact understand that this is a very routine part of such a process.
It is more like the users of ChatGPT don't understand the implications of giving "the cloud" sensitive information and what can happen to it.
It might surprise many such users the extent to which the data they casually place in the hands of giant third parties can be, and routinely has been, the target of successful subpoenas.
As an illustration, if two huge companies sue each other, part of the legal process involves disclosure. This means inhaling vast quantities of data from their data stores, their onsite servers, executives laptops. Including those laptops that have Ashley Madison data on them. Of course, part of the legal process is motions to exclude this and that, but that may well be after the data is extracted.
Not a real answer, but I think a local LLM is going to be the way to go. I've been playing with them for some time now, and, yeah, they're still not there with saving context, and the hardware requirements for a really good model are steep... But I suspect that, like anything else in tech, a year or two from now a decent local LLM will not be such a stretch.
I can't wait, actually. It's less about privacy to me than about being offline.
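As a rough sketch of what running one locally already looks like (assuming `llama-cpp-python` and some quantized GGUF model you've downloaded; the path is a placeholder):

```python
# Fully local inference: no account, no network, and no server-side logs
# for a court to order preserved. A sketch, not an endorsement of any model.
from llama_cpp import Llama

llm = Llama(model_path="./models/some-model.gguf",  # hypothetical path
            n_ctx=4096, verbose=False)

resp = llm.create_chat_completion(
    messages=[{"role": "user",
               "content": "Summarize the third-party doctrine in two sentences."}],
    max_tokens=200,
)
print(resp["choices"][0]["message"]["content"])
```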
Non-technicals don't know how LLMs work, and, more importantly, don't care about their privacy.
For a technology to be widely used, by definition, you need to make it appealing to the masses, and there is almost zero demand for private LLMs right now.
That's why I don't think local LLMs will win. There are narrow use cases where regulations can force local LLM usage (like for medical stuff), but overall I think that services will win (as they always do).
I still happen to trust Apple with my cloud data, with their secure enclave. To that end, an Apple solution where my history/context is kept in my cloud account, perhaps even a future custom Apple chip that could run some measure of a local LLM.... This "aiPhone" might be the mainstream solution that non-technicals will enjoy.
> there is almost zero demand for private LLM right now.
You need some really expensive hardware to run a local LLM, hardware that is mostly out of reach for the average user. The demand might simply be hidden, as these users do not know about, nor want to expend the resources on, running one.
But I have hope that the hardware costs will come down eventually, enough to reveal the demand for local LLMs.
After all, I prefer that my private questions to an LLM never be revealed.
You don't have any recourse, at least not under American law. This a textbook third-party doctrine case: American law and precedent is unambiguous that once you voluntarily give your data to a third party-- e.g. when you sent it to OpenAI-- it's not yours anymore and you have no reasonable expectation of privacy about it. Probably people are going to respond to this with a bunch of exceptions, but those exceptions all have to be enumerated and granted specifically with new laws; they don't exist by default, and don't exist for OpenAI.
Like it or not, the judge's ruling sits comfortably within the framework of US law as it exists at present: since there's no reasonable expectation of privacy for chat logs sent to OpenAI, there's nothing to weigh against the competing interest of the active NYT case.
> once you voluntarily give your data to a third party-- e.g. when you sent it to OpenAI-- it's not yours anymore and you have no reasonable expectation of privacy about it.
The 3rd party doctrine is worse than that - the data you gave is not only not yours anymore, it is not theirs either, but the government's. They're forced to act as a government informant, without any warrant requirements. They can say "we will do our very best to keep your data confidential", and contractually bind themselves to do so, but hilariously, in the Supreme Court's wise and knowledgeable legal view, this does not create an "expectation of privacy", despite whatever vaults and encryption and careful employee vetting and armed guards stand between your data and unauthorized parties.
I don't think it is accurate to say that the data becomes the government's or they have to act as an informant (I think that implies a bit more of an active requirement than responding to a subpoena), but I agree with the gist.
> You don't have any recourse, at least not under American law.
Implying that the recourse is to change the law.
Those precedents are also fairly insane and not even consistent with one another. For example, the government needs a warrant to read your mail in the possession of the Post Office -- not only a third party but actually part of the government -- but not the digital equivalent of this when you transfer some of your documents via Google or Microsoft?
This case is also not the traditional third party doctrine case. Typically you would have e.g. your private project files on Github or something which Github is retaining for reasons independent of any court order and then the court orders them to provide them to the court. In this case the judge is ordering them to retain third party data they wouldn't have otherwise kept. It's not clear what the limiting principle there would be -- could they order Microsoft to retain any of the data on everyone's PC that isn't in the cloud, because their system updater gives them arbitrary code execution on every Windows machine? Could they order your home landlord to make copies of the files in your apartment without a warrant because they have a key to the door?
> It's not clear what the limiting principle there would be -- could they order Microsoft to retain any of the data on everyone's PC that isn't in the cloud, because their system updater gives them arbitrary code execution on every Windows machine?
My understanding is it's closer to something like: they cannot order a company to create new tools, but they can tell it not to destroy the data it already has. So MS having the ability to create a tool that extracts your data is not the same as MS already having that tool functioning and collecting all of your data, which they store and are then told to simply not destroy. Similarly, VPNs that are not set up to create logs can't keep or hand over what they don't have.
Laws can be made to require the collection and storage of all user data by every online company, but we're not there -- yet. Many companies already do it on their own, and the user then decides if that's acceptable or not to continue using that service.
If the company had created their service to not have the data in the first place, this probably never would have found its way to a judge. Their service would cost more, be slower, and probably be difficult to iterate on, as it's easier to hack things together in a fast-moving space than to build privacy/security-first solutions.
The issue is, what does "the data they already have" mean? Does your landlord "have" all the files in your apartment because they have a key to the door?
Real Property, Tenant's rights, and Landlord laws are an entirely separate domain. However, I believe in some places if you stop paying rent long enough, then yes all your stuff now belongs to the landlord because they have the "key".
"the data they already have" means the data the user gave the company (no one is "giving" their files to their landlord) and that the company is in full possession of and now owns. Users in this case are not in possession or ownership of the data they gave away at this point.
If you hand out photocopies of the files in your apartment, the files in your apartment are still yours, but the copies you gave away to a bunch of companies are not. Those now belong to the company you gave them to and they can do whatever they want with it. So if they keep it and a judge tells them the documents are not to be destroyed (because laws things), they would probably get into trouble if they went against the order.
Which is what I was trying to bring attention to; the fact that the company has a choice in what data (if any) they decided to collect, possess, and own. If they never collected/stored it then no one's privacy would be threatened.
The third-party doctrine has been weakened by the Supreme Court recently, in United States v. Jones and Carpenter v. United States. Those are court decisions, not new laws passed by Congress.
If OpenAI doesn't succeed at oral argument, then in theory they could try for an appeal either under the collateral order doctrine or seeking a writ of mandamus, but apparently these rarely succeed, especially in discovery disputes.
Yep. This is why we need constitutional amendments or more foundational laws around privacy that changes this default. Which should be a bipartisan issue, if money had less influence in politics.
This is the perverse incentives one rather than the money one. The judges want to order people to do things and the judges are the ones who decide if the judges ordering people to do things is constitutional.
To prevent that you need Congress to tell them no, but that creates a sort of priority inversion: The machinery designed to stop the government from doing something bad unless there is consensus is then enabling government overreach unless there is consensus to stop it. It's kind of a design flaw. You want checks and balances to stop the government from doing bad things, not enable them.
> once you voluntarily give your data to a third party-- e.g. when you sent it to OpenAI-- it's not yours anymore and you have no reasonable expectation of privacy about it.
sorry for the layperson question, but does this apply then to my company's storage of confidential info on say google drive, even with an enterprise agreement?
OpenAI is the actual counterparty here though and not a third party. Presumably their contracts with their users are still enforceable.
Furthermore, if the third party doctrine is upheld in its most naïve form, then this would breach the EU-US Data Privacy Framework. The US must ensure equivalent privacy protections to those under the GDPR in order for the agreement to be valid. The agreement also explicitly forbids transferring information to third parties without informing those whose information is transferred.
Well, I don't think anyone is expecting the framework to work this time either, after earlier tries have been invalidated. It is just panicked politicians trying to kick the can to avoid the fallout that happens when it can't be kicked anymore.
Yes, and I suppose the courts can't care that much about executive orders. Even so, one would think that they had some sense and wouldn't stress things that the politicians have built.
3rd party doctrine in the US is actual law... so I'm not sure what's confusing about that. The president has no power to change discovery law. That's congress. Why would a judge abrogate US law like that?
You're confused. This is not about the FBI's right to data, it's about the New York Times' right to the same. The doctrine you're referencing is irrelevant.
The magistrate is suggesting that there is no reasonable expectation of privacy in chats OpenAI agreed to delete, at the request of users. This is bizarre, because there's no way for OpenAI to use data that is deleted. It's gone. It doesn't require abrogation of US law, it requires a sensible judge to sit up and recognize they just infringed on the privacy expectations of millions of people.
They probably do already, but won't this ruling force OpenAI to operate separate services for the US and EU? US users must accept that their logs are stored indefinitely, while an EU user is entitled to have theirs deleted.
Stop giving your information to third parties with the expectation that they keep it private when they won't and cannot. Your banking information is also subject to subpoena... I don't see anyone here complaining about that. Just the hot legal issue of the day that programmers are intent on misunderstanding.
Do you really think a European court wouldn't similarly force a provider to preserve records in response to being accused of destroying records pertinent to a legal dispute?
Fundamentally, on-prem or just forgoing these services is the safest way, yes. If one still uses these remote services, it's also important to be prudent about exactly what data you share with them when doing so[0]. Note I did not say "Send your sensitive data to these countries instead".
The laws still look completely different in US and EU though. EU has stronger protections and directives on privacy and weaker supremacy of IP owners. I do not believe lawyers in any copyright case would get access to user data in a case like this. There is also a gap in the capabilities and prevalence of govt to force individual companies or even employees to insert and maintain secret backdoors with gag orders outside of court (though parts of the EU seem to be working hard to close that gap recently...).
[0]: Using it to derive baking recipes is not the same as using it to directly draft personal letters. Using it over VPN with pseudonym account info is not the same as using it from your home IP registered to your personal email with all your personals filled out and your credit card linked. Running a coding agent straight on your workstation is different to sandboxing it yourself to ensure it can only access what it needs.
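For illustration, a toy pre-send scrubber along those lines (the patterns are examples, not a complete PII filter; real redaction needs far more than regexes):

```python
# Strip obvious identifiers from a prompt before it goes to any remote API.
import re

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

prompt = "Draft a letter for john.doe@example.com, phone 555-867-5309."
print(scrub(prompt))  # -> Draft a letter for [EMAIL], phone [PHONE].
```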
> I do not believe lawyers in any copyright case would get access to user data in a case like this.
Based on what? Keep in mind that the data is to be used for litigation purposes only and cannot be disclosed except to the extent necessary to address the dispute. It can't be given to third parties who aren't working on the issue.
> There is also a gap in the capabilities and prevalence of govt to force individual companies or even employees to insert and maintain secret backdoors with gag orders outside of court
There's no secret backdoor here. OpenAI isn't being asked to write new code--and in fact their zero-data-retention (ZDR) API hasn't changed to record data that it never recorded in the first place. They were simply ordered to disable deletion functionality in their main API, and they were not forbidden from disclosing that change to their customers.
If anyone is under the impression OpenAI isn't saving every character typed into the chats and every bit sent to the API, I would implore them to look at the current board members.
Even if OpenAI and other LLM providers were prohibited by law from retaining the data (the opposite of this forced retention), no one should trust them to comply.
If you want to input sensitive data into an LLM, do so locally.
We (various human societies) do need to deal with this new ability to surveil every aspect of our lives. There are clear and obvious benefits in the future - medicine and epidemiology will have enormous reservoirs of data to draw on, entire new fields of mass behavioural psychology will come into being (I call it Massive Open Online Psychology, or MOOP), and we might even find governments able to use minute-by-minute data of their citizens to, you know, provide services to the ones they miss…
But all of this assumes a legal framework we can trust - and I don’t think this comes into being piecemeal with judges.
My personal take is that data that would not exist, or would be different, without the activity of a natural human must belong to that human - and that it can only be held in trust without explicit payment to that human if the data is used in the best interests of that human (something something criminal notwithstanding).
Blathering on a bit, I know, but I think "in the best interests of the user / citizen" is a really high and valuable bar, and a default of "if my activities create or enable the data, it belongs to me" really forces data companies to think.
Zero knowledge proofs + blockchain stream payments + IPFS or similar based storage with encryption and incentive mechanisms.
It's still outside the Overton window (especially on HN), but the only way that I've seen where we can get the benefits of big data and maintain privacy is by locking the data to the user and not aggregating it in all these centralized silos that are then incentivized to build black markets around that data.
As far as cryptographic solutions go: what would be ideal is homomorphic encryption, where the server can do the calculations on data it can't decrypt (your query) and send you something back that only you can decrypt. Assuming that's unworkable, we could still have anonymity via cryptocurrency payments for tokens (don't do accounts) + ipfs or tor or similar. You can carry around your query + answer history with you.
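To give a flavor of the homomorphic idea, here is a toy Paillier sketch in Python (tiny, insecure parameters, purely to show a server adding numbers it cannot read):

```python
# Paillier is additively homomorphic: multiplying ciphertexts adds the
# underlying plaintexts. Toy parameters only -- never use primes this small.
import math, random

p, q = 61, 53
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)   # Carmichael's lambda(n)
mu = pow(lam, -1, n)           # valid because we pick g = n + 1

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    x = pow(c, lam, n2)
    return (((x - 1) // n) * mu) % n

c1, c2 = encrypt(20), encrypt(22)
c_sum = (c1 * c2) % n2         # server-side: add without decrypting
assert decrypt(c_sum) == 42    # the server never saw 20 or 22
```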
Magistrate judges are more variable, not subject to Senate confirmation, do not serve for life, render decisions that very often are different in character from those of regular judges-- focusing more on "trees" than "forest". Without consent, their scope of duties is limited and they cannot simply swap in for a district judge. They actually are supervised by a district judge, and appeals, as it were, are directed to that officer not an appellate court.
In a nutshell, I used quotes to indicate how the position was described by the article. These judicial officers are not interchangeable with judges in the federal system, and in my experience this distinction is relevant to both why this person issued the kind of decision they did, and what it means for confidence in the US justice system.
So in your view, there is no expectation of privacy in anything typed into the Internet. And, if a news organization, or a blogger, or whoever, came up with some colorable argument to discover everything everyone types into the Internet, and sued the right Internet hub-- you think this is totally routine and no one should be annoyed in the least, and moreover no one should be allowed to intervene or move for protective order, because it would be more convenient for the court to just hand over all the Internet to the news or blogger or whoever.
It's precisely that perspective that I think should sink a magistrate, hundreds of times over.
The internet isn't magic: if you send data to business X that is under the jurisdiction of country Z, its judicial system can get it by court order.
It has always been like this. You are on HN, did you think E2EE was just a LARP? It's not even like this is some Patriot Act gag-order bullshit; if you could claim a privacy exception for any user data, 99% of companies would be immune to discovery.
So no, the spooks are not gonna look at your deepest secrets that you put in CleverBot 9000, but giving your data to Sam "Give me your eyes for 20 bucks" Altman was stupid. Yes, if you are capable of reaching this site it's your *fault*, you should know better.
Well, the magistrate's job is to apply precedent, and I have no idea why you believe this is not a routine application of 3rd party doctrine. So there's no reason to hold it against the magistrate, even if you disagree with the law.
Second, what colorable argument? There is no colorable argument that entitles you to "discover everything everyone types into the Internet" so there's no need to pretend there is for the purpose of this conversation. Feel free to posit one. You didn't, because none exist. Discovery is limited and narrow. Here, what the court is demanding from OpenAI is limited and narrow, unlike the ridiculous scenario you offered.
> So in your view, there is no expectation of privacy in anything typed into the Internet.
In the view of American law, as it is currently written and settled, when what you've typed into the internet is relevant to ongoing litigation, yes, there is no expectation of privacy from discovery for anything you typed into the particular service on the internet that's being litigated. Likewise, there's no expectation of privacy if you're not either litigant, but you have been subpoenaed, and forced to testify. The fifth amendment only protects you from self-incrimination.
There are far more horrifying aspects of American law, as it is currently written and settled, I can't say I have the energy to be all that outraged over this one, as opposed to any of the other insane shit that's currently going on.
When people are routinely being disappeared without due process or legal recourse, the issue of 'a few lawyers sworn to secrecy going over some user queries under the constraints of a judge in an active litigation' is not actually a serious issue. This category of thing happens all the time, and it's uncomfortable for third parties involved, but a millennium of common law has generally put the needs of the courts reaching a fair decision in a case above the needs of unrelated third parties to not be bothered by them.
Losing this case would be an incredibly serious issue for OpenAI's business model, though, which is why it's throwing shit at the wall to see if it sticks, and is shouting for sympathy to anyone who wants to listen. I can't say I give a fig about their financial well-being, though.
> So in your view, there is no expectation of privacy in anything typed into the Internet.
This is a good point because chat gpt is The Internet and any order pertaining to a specific website applies to every website. Similarly if the police get a warrant to search a house it applies to every house on the earth
Also not deleting user-submitted content is the same thing as mass surveillance. For example this website doesn’t allow you to delete comments after a certain period, so Hacker News is a surveillance apparatus
No judge can block any kind of mass surveillance program, which has been ongoing for more than a decade now. This is a joke and completely irrelevant. OpenAI, just like every other corp, is storing as much as they can to profile you and your bit stream.
I find it really strange how many people are outraged or shocked about this.
I have to assume that they are all simply ignorant of the fact that this exact same preservation of your data happens in every other service you use constantly other than those that are completely E2EE like signal chats.
Gmail is preserving your emails and documents. Your cell provider is preserving your texts and call histories. Reddit is preserving your posts and DMs. Xitter is preserving your posts and DMs.
This is not to make a judgement about whether or not this should be considered acceptable, but it is the de facto state of online services.
>I find it really strange how many people are outraged
>This is not to make a judgement about whether or not this should be considered acceptable
A person is outraged because they find it unacceptable. This is beyond terms and conditions: OpenAI is being forced to keep data it wants to discard for the user.
I am shocked that you are shocked that people are taking a position on this, when you suggest you don't take a position on it.
Outrage suggests a level of surprise with the anger. This is not surprising at all.
When you hand over your data to a 3rd party, you should not expect it to remain private from the government that rules over that party. The entire 21st century has been a constant deluge of learning how much we are all monitored constantly.
I think you'd be surprised at the amount of people who don't understand this. Go ask a random person on the street or a nontechnical friend you have and see what answer you get.
To me, this is akin to Google saying that they don't want to follow a court order because it would be a privacy invasion. I feel like OpenAI framed the issue as a privacy conversation and some news journalists are going along with it without questioning the source and their current privacy policy re: data retention and data-sharing affiliates, vendors, etc.
It takes 30 seconds to save the privacy policy and upload it to an LLM and ask it questions and it quickly becomes clear that their privacy policy allows them to hold onto data indefinitely as is.
Even though this doesn't apply to enterprise customers, I'm just waiting for European customers to wake up and realize that ChatGPT isn't compatible with the GDPR today. And if the court suddenly decides that enterprise customers should also be part of the preservation order it'll be a big hit for OpenAI.
It's crazy how much I hate every single top level take in this thread.
Real human beings' actual real work is allegedly being abused to commit fraud at a massive scale, robbing those artists of the ability to sustain themselves. Your false perception of intimacy while asking the computer Oracle to write you smut does not trump the fair and just discovery process.
Yeah, nice strawman, but there have been plenty of leaked chats with ChatGPT (just stuff that's been set to public accidentally) that are so obviously "private" in nature that it's not funny.
Sorry but, humans have a right to privacy beyond your dislike of the services they use.
You clearly have an emotional connection to people that you feel are being harmed by AI, so I’m not going to gaslight you about that.
But I will tell you that real humans asking private, real questions of LLMs is also happening, and these two things aren’t related. Many people who don’t have the technical literacy to understand the implications are sending messages with extremely sensitive personal, medical, and financial information.
Straw-manning all of these users as tech-savvy, horny IP thieves is ridiculous. I could find your argument more persuasive if you actually considered the privacy needs of the people who had nothing to do with building or perpetuating the systems.
EDIT: to be clear I’m not sure what I think the solution should be, as I also understand the need for discovery.
After reading the actual Order, it appears the plaintiffs filed an application to the Court expressing concern that the defendant, OpenAI, was potentially destroying evidence, and to prevent a spoliation claim (relief due to destruction of evidence), the Judge ordered OpenAI to stop destruction of anything (i.e., to preserve everything).
A person not a party to the action then filed an application to intervene in the lawsuit because the Judge's Preservation Order constituted a breach of the terms of his contract with OpenAI regarding his use of OpenAI's product - more specifically that the Intervenor entered into usage of OpenAI's product upon the agreement that OpenAI would not preserve any portion of Intervenor's communication with the OpenAI product.
The problem, as I see it, is that the Judge did not address the issue that her Order constituted a breach of Intervenor's contractual interests. That suggests to me that Intervenor did not expressly state that he held contractual rights that the Court's Order was violating. I would think the next step would be to file an Order to Show Cause directly against the Magistrate Judge claiming the Magistrate's Order constitutes an unconstitutional government taking of property without Due Process.
There should be a law about what "delete" means for online services. I used to delete old comments on reddit until their AI caught up to my behavior and shadow-banned my 17-year-old account. As soon as that happened, I could see every comment I ever deleted in my history again. The only consolation is that no one but me could see my history anymore.
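To illustrate what I mean by the ambiguity: many services implement "delete" as a soft delete. A sketch (schema invented for illustration):

```python
# Soft delete hides a row; the data still exists and remains subject to
# subpoenas, breaches, and preservation orders. Hard delete removes it
# (modulo backups).
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (id INTEGER, body TEXT, deleted INTEGER)")
db.execute("INSERT INTO comments VALUES (1, 'my old comment', 0)")

# "Delete" as many services implement it: flip a flag, keep the data.
db.execute("UPDATE comments SET deleted = 1 WHERE id = 1")
print(db.execute("SELECT body FROM comments WHERE id = 1").fetchone())
# -> ('my old comment',)  -- still there

# Actual deletion: the row is gone.
db.execute("DELETE FROM comments WHERE id = 1")
print(db.execute("SELECT body FROM comments WHERE id = 1").fetchone())
# -> None
```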
They're both paper tigers. The CLOUD Act and that massive data center in Utah trump* each of them respectively.
What happens in the US stays in the US, delete button or not.
This kind of conspiracy theory comes up a lot on here. Most of these products have the option to allow or deny that and contrary to the opinions here those policies are then followed. This whole episode is news because it violates that.
I remember back in the day when I made the mistake to use Facebook and iPhones that a) Facebook never actually deleted anything and b) iMessage was also not deleting (but both were merely hiding).
This is why in this part of the world we have GDPR, and it would be amazing to see OpenAI receiving penalties of billions of euros, while at the same time a) the EU will receive more money to spend, and b) the US apparatus will grow stronger because it will know everything about everyone (the very few things they didn't already know via the FAANGs).
Lately I have been thinking that "they" play chess with our lives, and we are sleepwalking into either a Brave New World (for the elites) and/or a 1984/Animal Farm for the rest. To give a more pleasant analogy, the humans in WALL-E; or a darker analogy, the humans in the Matrix.
"creating mass surveillance program harming all ChatGPT users" is just taking the lawyers' words out of their mouth at face value. Totally ridiculous. And of course its going to lead to extreme skepticism from the crowd here when its put forward that way. Another way to do describe this: "legal discovery process during a lawsuit continues on as it normally would in any other case"
""Proposed Intervenor does not explain how a court’s document retention order that directs the preservation, segregation, and retention of certain privately held data by a private company for the limited purposes of litigation is, or could be, a 'nationwide mass surveillance program,'" Wang wrote. "It is not. The judiciary is not a law enforcement agency.""
This is a horrible view of privacy.
This gives unlimited ability for judges to violate the privacy rights of people while stating they are not law enforcement.
For example, if the New York Times sues claiming that people using a no-scripts add-on are bypassing its paywall, can a judge require that the add-on collect and retain all sites visited by all its users, and then say it's OK because the judiciary is not a law enforcement agency?
> This gives unlimited ability for judges to violate the privacy rights of people while stating they are not law enforcement.
See my comment above in reply to aydyn: in general, "privacy rights" do not exist in American law, and as such the judge is violating nothing.
People are always surprised to learn this, but it's the truth. There's the Fourth Amendment, but courts have consistently interpreted that very narrowly to mean your personal effects in your possession are secure against seizure specifically by the government. It does not apply to data you give to third-parties, under the third-party doctrine. There are also various laws granting privacy rights in specific domains, but those only apply to the extent of the law in question; there is no constitutional right to privacy and no broad law granting it either.
Until that situation changes, you probably shouldn't use the term "privacy rights" in the context of American law: since those don't really exist, you'll just end up confusing yourself and others.
I really wish we had broad laws requiring telling the truth about concrete things (in contracts, by government officials, etc.). But we don't have any real enforcement even for blatant perjury.
"However, McSherry warned that "it's only a matter of time before law enforcement and private litigants start going to OpenAI to try to get chat histories/records about users for all sorts of purposes, just as they do already for search histories, social media posts, etc.""
If this is a concern, is the best course of action for McSherry to stop using ChatGPT?
We have read this sort of "advice" countless times in HN comments relating to use of software/websites controlled by so-called "tech" companies.
Something like, "If you are concerned about [e.g., privacy, whatever], then do not use it. Most users do not care."
Don't use _____.
This is a common refrain in HN comment threads.
"OpenAI will have a chance to defend panicked users on June 26, when Wang hears oral arguments over the ChatGPT maker's concerns about the preservation order."
"Some users appear to be questioning how hard OpenAI will fight. In particular, Hunt is worried that OpenAI may not prioritize defending users' privacy if other concerns-like "financial costs of the case, desire for a quick resolution, and avoiding reputational damage"-are deemed more important, his filing said."
"Intervening ChatGPT users had tried to argue that, at minimum, OpenAI should have been required to directly notify users that their deleted and anonymous chats were being retained. Hunt suggested that it would have stopped him from inputting sensitive data sooner."
Any OpenAI argument that invokes "user privacy" is only doing so as an attempt to protect OpenAi from potentially incriminating discovery. OpenAI will argue for its own interests.
So the arguments are sound but the procedure wasn't followed, so someone else just needs to follow the procedure and get our chats deletable?
The judge went further to say the arguments weren't sound, either.
They went further and said they didn’t find the arguments sound.
Personally I find it really hard to see anything sound in the initial order.
>2. saving all text messages going through a cell phone company
Point of order, phone companies already do that. Third Party Doctrine. I don't believe they should, but as of right now, that's where we're at.
"saving off"?
this is a strange turn of phrase
Maybe it's a regional dialect? You can find a lot of examples of it in the wild, for example on microsoft forums:
1. https://answers.microsoft.com/en-us/msoffice/forum/all/savin...
2. https://community.fabric.microsoft.com/t5/Service/End-User-S...
In both cases it's being used for "copying".
We can also see it was used as early as 15 years ago on this very site ( https://news.ycombinator.com/item?id=1182478 ), so it's not a new turn of phrase.
My experience has definitely been that I've heard it more online and in San Francisco, and not very often in Germany, Texas, or Russia. Are you in one of those areas?
I feel like the etymology is something like "print off a few sheets" becoming "copy off a sheet with the copier", and then to the more general digital copy meaning.
Huh. I'm from the Boston area, 50yo, avid reader, former English teacher, fairly well-travelled...
I've definitely been using phrasing like it ("save off a copy", etc) for at least the past 20-25 years (upstate NY, moderately online, avid reader, parent is an English professor, fairly well-traveled).
Conceptually, I think that the "off" serves the purpose of aligning it with something like "split off"—you're essentially forking the history by creating a separate saved copy.
> We should have extremely strong privacy laws preventing this, somehow blocking over-broad court orders
Quick question: should your perceived "right to privacy" supersede all other laws?
To extrapolate into the real world. Should it be impossible for the police to measure the speed of your vehicle to protect your privacy? Should cameras in stores be unable to record you stealing for fear of violating your privacy?
I think there's an idea akin to Europe's "right to be forgotten" here.
We can all observe the world in the moment. Police can obtain warrants to wiretap (or the digital equivalent) suspects in real-time. That's fine!
The objection is that we are ending up with laws and rulings that require everyone to record everyone else's history - just so the police can have the convenience of trawling through data everyone reasonably felt was private and shouldn't exist except transiently. Not to mention that perhaps the state should pay for all this commercially unnecessary storage. Our digital laws are badly out of touch with the police powers voters actually consented to - that mail (physical post) and phone calls shall not be intercepted except under probable cause (of a particular suspect performing a specific crime) and with a judge's consent. Just carry that concept forward.
On a technical level, I feel a "perfect forward secrecy" technique should be sufficient for implementers. A warrant should have a causal (and pinpoint) effect on what is collected by providers. Of course we can also subpoena information that everyone reasonably expected was recorded (i.e. not transient and private). This matches the "physical reality" of yesteryear - the police can't execute a warrant for an unrecorded person-to-person conversation that happened two weeks ago; you need to kindly ask one of the conversants (who have their own rights to privacy / silence, are forgetful, and can always "plead the 5th").
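To sketch what I mean by "perfect forward secrecy" (a toy Python example using the cryptography package's X25519 primitives - not any provider's actual protocol, just the shape of the idea): each session is keyed from throwaway key pairs, so once those are discarded there is nothing left for a later order to seize.

    # Toy sketch: both sides generate throwaway (ephemeral) key pairs,
    # derive a shared session key, then discard the private halves.
    # Nothing retained afterward can decrypt the session.
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
    from cryptography.hazmat.primitives.kdf.hkdf import HKDF

    # Each party generates an ephemeral key pair for this session only.
    alice_priv = X25519PrivateKey.generate()
    bob_priv = X25519PrivateKey.generate()

    # They exchange public keys and compute the same shared secret.
    alice_shared = alice_priv.exchange(bob_priv.public_key())
    bob_shared = bob_priv.exchange(alice_priv.public_key())
    assert alice_shared == bob_shared

    # Derive the symmetric session key from the shared secret.
    session_key = HKDF(
        algorithm=hashes.SHA256(), length=32, salt=None, info=b"session",
    ).derive(alice_shared)

    # Once the ephemeral private keys are deleted, no warrant served later
    # can recover session_key - there is simply nothing left to hand over.
    del alice_priv, bob_priv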
Unfortunately, that often creates an incentive for the business to err on the side of sharing too much with the authorities, even when proper procedure (warrants) has not been followed. It's the only way to retroactively get paid for all that storage and for the code to retrieve that data.
Do you have any sources for this? Never heard of the justice system compensating a company for providing data.
Here's a source discussing Google specifically: https://www.nytimes.com/2020/01/24/technology/google-search-...
I know of smaller companies that have been charging since at least 2014.
Why do you think telecom companies are so eager to comply with wiretap laws?
There are better extrapolations into the real world:
Cellphone location data: https://en.wikipedia.org/wiki/Carpenter_v._United_States
Thermal imaging a home: https://en.wikipedia.org/wiki/Kyllo_v._United_States
Biometric unlocking: https://cdt.org/insights/circuit-court-split-lays-the-ground...
In these cases, privacy, or rather, the constitution, does supersede all other laws.
Didn't the Supremes decide there was no constitutional right to privacy as a side effect of overturning Roe? (Or, at least, throw it into full Calvin Ball mode...)
https://www.americanbar.org/groups/communications_law/public...
Every decision to overturn a decided opinion can potentially harm all other decided opinions.
Beyond that, no, it didn’t impact anything other than abortion.
When the SCOTUS ruled the constitution protected the right to engage in gay sex, and later gay marriage, precedents were overturned.
Conservatives claimed this might make it easier to overturn Roe. It didn’t.
Roe wasn’t in danger until SCOTUS had six reliable anti-Roe justices.
Not really. The reason the overturning of Roe was widely considered to be inevitable, even by jurists who were pro-choice, is that the theory of privacy used in that case was fundamentally incompatible with a broad range of regulatory powers most people think the Constitution grants the Federal government.
The reasoning behind Roe was generally regarded as tenuous even by the justices that supported it. Overturning it was required to defend the government’s Constitutional authority for agencies like the FDA, which was undermined by inconsistencies introduced by Roe v Wade. Eventually those judicial inconsistencies come home to roost.
tl;dr: Roe being overturned had little to do with privacy and more to do with protecting specific regulatory powers from being ruled unconstitutional under the same reasoning introduced in Roe v Wade.
Removing such decisions from Federal purview was an elegant solution to the problem, with the practical effect of deferring all such decisions to voters at the State level.
I think both you and pyuser583 are correct to a certain extent. The stated reasons for overturning Roe were because of the tenuous basis as a privacy grounded position. On the other hand I completely believe that Roe was such a politically charged issue that the judges voted according to their allegiance, even though they are not supposed to have such allegiances. Everybody would have been surprised if any of the judges had decided differently to how they did. So, while there may have been a legal argument to be made, I don't think that particular issue was decided on those grounds.
Your two examples don't map to the concern about data privacy.
Speed cameras only operate on public roads. The camera in the store is operated by the store owner. In both cases one of the parties involved in the transaction (driving, purchasing) is involved in enforcement. It is clear in both cases that these measures protect everyone and they have clear limits also.
Better examples would be police searching your home all the time, whenever they want (This maps to device encryption).
Or store owners surveilling competing stores / forcing people to wear cameras 24/7 "to improve the customer experience" (This maps to what Facebook / Google try to do, or what internet wire tapping does).
> searching your home all the time, whenever they want
What? How does OpenAI map to your home at all? This is pure nonsense. You seem to have entirely dismissed the comparison to driving a little too out of hand.
The internet is, like the roads, public infrastructure. You can't claim that encryption makes all traffic on the public infrastructure as private as staying home.
You sound like one of those "free man of the land" guys: "I'm not driving your honor, I was traveling."
No right or law supersedes all other laws, and obviously no-one is asking for that. But nor should a court just be able to order anything they want without regard to who else is affected.
This is like storing copies of all physical mail because one letter could contain a Xerox of a newspaper.
Judges generally are either lawyers or have legal experience. The judge in question was formerly a practicing lawyer, though she claims to have nearly completed a Ph.D. in zoology.
It's extremely unlikely that a protected class is going to start treating a non-protected class with the same regard in society.
I'm not even sure how there could be a constitutional issue here, but it probably isn't for this court to figure out anyways.
>Also, rejecting something out of hand simply because a lawyer didn't draft it seems really antithetical to what a judge should be doing. There is no requirement for a lawyer to be utilized.
This is 100% wrong. Pro se litigation is well regulated. In the first case, a non-lawyer tried to file in representation of his company, which is not himself, so it's not pro se, and you need to be a licensed lawyer to represent someone else.
"So because the lawsuit pertains to copyright, we can ignore possible constitutional issues because it'll make things take longer?"
Not quite. The contention is that the judge doesn't see how it would be successful, so, in her judgment, it would be a delay that never addresses a constitutional issue.
>So because the lawsuit pertains to copyright, we can ignore possible constitutional issues because it'll make things take longer?
What constitutional issues do you believe are present?
> There is no requirement for a lawyer to be utilized.
Corporations must be represented by an attorney, by law. So that's not true outright. Second, if someone did file something pro se, they might get a little leeway. But the business isn't represented pro se, so why on earth would the judge apply a lower standard appropriate for a pro se party to a sophisticated law firm, easily one of the largest and best in the country?
When you are struggling to reason around really straightforward issues like that, it does not leave me with confidence about your other judgments regarding the issues present here.
I read the ruling (judgment?) first and the fine article second. If I'd just read the article I'd be more confused than this guy.
The PDF is easy to read and really lucid once you get past the formatting. Ars should have just converted it to markdown.
Tech coverage of legal stuff is terrible. Ars has zero interest in representing this story in a fair way. They are making controversy for page views.
> What constitutional issues do you believe are present?
4th Amendment (Search and Seizure)
Do you think the 4th amendment enjoins courts from requiring the preservation of records as part of discovery? The court is just requiring OpenAI to maintain records it already maintains and segregate them. Even if one thinks that _is_ a government seizure, which it isn't (see Burdeau v. McDowell, 256 U.S. 465 (1921); cf. Walter v. United States, 447 U.S. 649, 656 (1980), discussing the "state agency" requirement), no search or seizure has even occurred. There's no reasonable expectation of privacy in the records you're sending to OpenAI (you know OpenAI has them; see, e.g., Smith v. Maryland, 442 U.S. 735 (1979)), and you don't have any possessory interest in the records. See, e.g., United States v. Jacobsen, 466 U.S. 109 (1984).
This would help explain why entities with a “zero data retention” agreement are “not affected,” then, per OpenAI’s statement at the time? Because records aren’t created for those queries in the first place, so there’s nothing to retain?
AIUI, because if you have a zero data retention agreement you are necessarily not in the class of records at issue (since enterprise customers' records are not affected, again AIUI, per plaintiffs' original motion, which might be because they don't think they're relevant for market harm or something).
So I think this is more an artefact of the parameters than an outcome of some mechanism of law.
> There's no reasonable expectation of privacy in the records
There is a reasonable expectation that deleted and anonymous chats would not be indefinitely retained.
> The court is just requiring OpenAI to maintain records it already maintains and segregate them.
Incorrect. The court is requiring OpenAI to maintain records it would not have maintained otherwise.
That is the crux of this entire thing.
> The court is requiring OpenAI to maintain records it would not have maintained otherwise.
Not quite. The court is requiring OpenAI to maintain records longer than it would otherwise retain them. It's not making them maintain records that they never would have created in the first place (like if a customer of theirs has a zero-retention agreement in place).
Legal holds are a thing; you're not going to successfully argue against them on 4A grounds. This might seem like an overly broad legal hold, though, but I'm not sure if there are any rules that prevent that sort of thing.
> This might seem like an overly broad legal hold
Exactly
Litigation holds do not violate the 4th amendment.
How?
the government is not involved at all in this dispute, neither state or federal.
So the parties are just like duking it out in a parking lot or something? If the government is not involved, then why does OpenAI even bother listening to the judge?
This only demonstrates that you do not understand the difference between the executive and judicial branches. It does not demonstrate a good point.
How so? The comment I was responding to made no such distinction - it merely claimed "the government is not involved". In the US, "government" is generally taken to be anybody acting under the banner of nation/state authority (contrast with say the UK). So the government is most certainly involved here - adjudicating the case and issuing this retention order.
(for example, your own comment: the executive and judicial branches [of government])
> How so? The comment I was responding to made no such distinction
Should I mention that water is wet every time I mention water? The executive is the executive, the judicial is the judicial. That's inherent in the discussion, and pretending otherwise just to further obtuse points that go nowhere serves no one. So either you didn't know, and do now, or you're just cratering the discussion.
I made an obtuse reply to an obtuse wrong assertion. You then baselessly claimed I didn't know about different branches of government. That second bit is what tends to crater conversations.
The distinction of the judiciary is most certainly relevant to the actual legal analysis here - the judiciary often reserves sweeping authoritarian powers for themselves, even when they do act to restrain the legislative/executive. So without even really analyzing the details, I am pretty sure that the law as written supports this action.
But the comment I was responding to wasn't making a larger more nuanced argument - rather it said that the government was not involved, defining away the actions of the judiciary as somehow not being governmental action, regardless of them being done with the authority of government.
The overall analysis is that if people are up in arms about this, it just reinforces the need for some actual privacy laws in this country - both to protect from corporations themselves abusing our data, and in this case to prevent the government from creating overly broad judicial orders that may only target specific companies but end up running roughshod over many individuals' rights.
(and just to be clear to avoid going off into the terminology weeds again: the definition of rights I'm using is the one of imagined natural rights, not merely what has been codified into law)
So what? The poster was obviously referring to the executive, so pretending there was any ambiguity was ridiculous.
It's either ambiguous or a logical fallacy - the original concern is about government overreach in general, so it cannot be dismissed by focusing on the executive when it's obviously not the executive acting.
The commenter I responded to is an attorney who presumably just tried to cut out having to detail a longer argument based on the actual nuances that have been interpreted from the 4th amendment. Still, the argument bit off too much so in the interest of deeper analysis and rational discussion it seemed worth calling out.
It's a legal dispute. The role the "government" plays in a legal dispute is very clear. You're being obnoxiously pedantic.
The topic's entire complaint is of people not liking a government action, arrived at by government policy. You made a comment that the "government is not involved". Reading your other comments in this thread, it seems as if your point is to brush aside people's concerns as if they are nonsensical. But that simplistic statement is obviously false, because we are talking about a government order. My comments only seem pedantic because I've had to tediously spell out the details in response to your argumentative wriggling instead of you just accepting that your simplistic statement was wrong.
It's a civil case and the gov't is not involved so...
> When you are struggling to reason around really straightforward issues like that, it does not leave me with confidence about your other judgments regarding the issues present here.
Or, perhaps, that's not something known by most. I didn't struggle to understand that, I simply didn't know it. Also, again, the article could have mentioned that, and I started my statement by saying maybe the article was doing a bad job conveying things.
> What constitutional issues do you believe are present?
This method of interrogation of online comments is always interesting to me. Because you seem to want to move the discussion to that of whether or not the issues are valid, which wasn't what I clearly was discussing. When you are struggling to reason around really straightforward issues like that, it does not leave me with confidence about your other judgments regarding the issues present here.
Now that you’ve both done it, can we stop with the ad hominem?
Method of interrogation? I'm not a mind reader. If you don't want to make yourself clear, that's fine. If you want to be petty, that's fine as well.
>Or, perhaps, that's not something known by most. I didn't struggle to understand that, I simply didn't know it.
Sorry you struggled to not understand your own concept that you put forward that because a lawyer isn't required (not true, but granting you this for the sake of this conversation), we shouldn't hold lawyers up to the standard of a lawyer anyway? That's facially silly.
> If you don't want to make yourself clear, that's fine. If you want to be petty, that's fine as well.
I literally just repeated to you what you said to me. But, yeah, I'm the petty one.
> Sorry you struggled to not understand your own concept that you put forward that because a lawyer isn't required
What? Why are you misinterpreting everything I wrote?
> we shouldn't hold lawyers up to the standard of a lawyer anyway? That's facially silly.
Where in the world did I say this?
This doesn't seem especially newsworthy. Oral arguments are set for OpenAI itself to oppose the preservation order that has everyone so (understandably) up in arms. Seems unlikely that two motions from random ChatGPT users were going to determine the outcome in advance of that.
Seems that a judge does not understand the impact of asking company X to "retain all data" and is unwilling to rapidly reconsider. Part of what makes this newsworthy is the impact of the initial ruling.
Retention orders of this kind are not uncommon and the judge has not ordered it be turned over to anyone until they hear arguments on it.
I note with amazement that tons of HN users with zero legal experience, let alone experience as a judge, are sure it's the judge who doesn't understand, not them. Based on what, I don't know, but they really are sure they get it more than the judge!
Underlying this issue is that the judicial system (or the patent system, or the political system) is not populated with enough individuals possessing software engineering "common sense."
It is highly likely that this is not confined to just software; I'm sure practitioners of other engineering or complex disciplines feel the same way about their own fields.
How do we have experts inform these decisions without falling into the trap of lobbying where the rich control the political and legal sphere?
Anyway, I cede you the point that US law does not match my "common sense", especially around this 3rd party rule mentioned in other comments. It kind of sucks that the US "winning the internet" means that even non-US citizens are subject to US law in this regard.
The judicial system is supposed to apply the law, not "common sense". How could it be otherwise? If you don't like the law then take that up with the legislative branch.
Is this a legislative issue?
Yes
Who is meant to pay for all this data retention? If OpenAI win the argument, can they claim the storage costs from plaintiffs?
It's OK to say "don't throw out a few pieces of paper for a bit", but that doesn't compare to "please spend $500k/month more on S3 bills until whenever we get around to hearing the rest of the case". (Perhaps that much money isn't that important to either side in this _particular_ case, but there is a cost to all this data retention stuff).
Once data exists in persisted form, it has that curious tendency to leak or be repurposed.
Lex non cogit ad impossibilia. - The law cannot compel the impossible.
A judicial system populated by people who don't understand what's possible is a real issue.
I noted that ordering retention is not the same as ordering that the data be turned over to authorities.
However the risk of data being leaked, or data being requested through a gag order, cannot be ignored.
That said, I don't think those arguments were made; the judge is right to dismiss arguments that don't address these nuances.
I wonder what the precedent with google searches is.
> Seems that a judge does not understand the impact of asking company X to "retain all data"
You can count on the fact that the judge does in fact understand that this is a very routine part of such a process.
It is more like the users of ChatGPT don't understand the implication of giving "the cloud" sensitive information and what can happen to it.
It might surprise many such users to learn the extent to which the data they casually place in the hands of giant third parties can be, and routinely has been, the target of successful subpoenas.
As an illustration, if two huge companies sue each other, part of the legal process involves disclosure. This means inhaling vast quantities of data from their data stores, their onsite servers, executives' laptops. Including those laptops that have Ashley Madison data on them. Of course, part of the legal process is motions to exclude this and that, but that may well be after the data is extracted.
For understanding of this topic, pay attention to what DannyBee says https://news.ycombinator.com/item?id=44361478 and not what HN users wish were true.
If HN decided these things, decisions would be a lot quicker and easier to predict.
Unfortunately I believe that the obvious conclusion being reached is in fact newsworthy these days...
Well for one, we established that users have a right to argue in the case, and we don't need to rely on OpenAI.
You can file a motion yourself pro se as the original plaintiff did. Regardless of whether you'd be successful, it is your right.
The judge is clearly not caring about this issue so arguing before her seems pointless. What is the recourse for OpenAI and users?
Not a real answer, but I think a local LLM is going to be the way to go. I've been playing with them for some time now, and, yeah, they're still not there on saved context or the hardware requirements needed for a really good model... But I suspect, like anything else in tech, a year or two from now a decent local LLM will not be such a stretch.
I can't wait actually. It's less about privacy to me than to being offline.
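To make "local" concrete, here's a minimal sketch of the kind of setup I mean, assuming an Ollama server running on its default port with some model already pulled (the model name below is just an example) - the point being that the prompt never leaves your machine:

    # Minimal sketch: query a locally hosted model via Ollama's HTTP API.
    # Assumes `ollama serve` is running locally and the named model has
    # been pulled; the prompt never leaves this machine.
    import json
    import urllib.request

    def ask_local(prompt: str, model: str = "llama3") -> str:
        payload = json.dumps({
            "model": model,
            "prompt": prompt,
            "stream": False,  # one JSON object back instead of a stream
        }).encode()
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",  # Ollama's default endpoint
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]

    print(ask_local("Summarize why local inference helps privacy."))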
> a local LLM is going to be the way to go
Non-technicals don't know how LLMs work, and, more importantly, don't care about their privacy.
For a technology to be widely used, by definition, you need to make it appealing to the masses, and there is almost zero demand for private LLM right now.
That's why I don't think that local LLMs will win. There are narrow use cases where regulations can force local LLM usage (like for medical stuff), but overall I think that services will win (as they always do).
I still happen to trust Apple with my cloud data, with their secure enclave. To that end, imagine an Apple solution where my history/context is kept in my cloud account, perhaps even a future custom Apple chip that could run some measure of a local LLM. This "aiPhone" might be the mainstream solution that non-technicals will enjoy.
> there is almost zero demand for private LLM right now.
You need some really expensive hardware to run a local LLM, most of which is unavailable to the average user. The demand might simply be hidden, as these users do not know about, nor want to expend the resources on, running one.
But I have hope that the hardware costs will come down eventually, enough that it reveals the demand for local LLMs.
After all, I prefer that my private questions to an LLM never be revealed.
> I think that services will win (as they always do)
We can have services but also private history/contexts. Those can be "local" (and encrypted).
Most companies would kill for private LLM capabilities were it possible. I think Mistral's even making it part of their enterprise strategy.
You don't have any recourse, at least not under American law. This a textbook third-party doctrine case: American law and precedent is unambiguous that once you voluntarily give your data to a third party-- e.g. when you sent it to OpenAI-- it's not yours anymore and you have no reasonable expectation of privacy about it. Probably people are going to respond to this with a bunch of exceptions, but those exceptions all have to be enumerated and granted specifically with new laws; they don't exist by default, and don't exist for OpenAI.
Like it or not, the judge's ruling sits comfortably within the framework of US law as it exists at present: since there's no reasonable expectation of privacy for chat logs sent to OpenAI, there's nothing to weigh against the competing interest of the active NYT case.
> once you voluntarily give your data to a third party-- e.g. when you sent it to OpenAI-- it's not yours anymore and you have no reasonable expectation of privacy about it.
The 3rd party doctrine is worse than that - the data you gave is not only not yours anymore, it is not theirs either, but the government's. They're forced to act as a government informant, without any warrant requirements. They can say "we will do our very best to keep your data confidential", and contractually bind themselves to do so, but hilariously, in the Supreme Court's wise and knowledgeable legal view, this does not create an "expectation of privacy", despite whatever vaults and encryption and careful employee vetting and armed guards stand between your data and unauthorized parties.
I don't think it is accurate to say that the data becomes the government's or they have to act as an informant (I think that implies a bit more of an active requirement than responding to a subpoena), but I agree with the gist.
This clearly seems counter to the spirit of the 4th amendment.
> You don't have any recourse, at least not under American law.
Implying that the recourse is to change the law.
Those precedents are also fairly insane and not even consistent with one another. For example, the government needs a warrant to read your mail in the possession of the Post Office -- not only a third party but actually part of the government -- but not the digital equivalent of this when you transfer some of your documents via Google or Microsoft?
This case is also not the traditional third party doctrine case. Typically you would have e.g. your private project files on Github or something which Github is retaining for reasons independent of any court order and then the court orders them to provide them to the court. In this case the judge is ordering them to retain third party data they wouldn't have otherwise kept. It's not clear what the limiting principle there would be -- could they order Microsoft to retain any of the data on everyone's PC that isn't in the cloud, because their system updater gives them arbitrary code execution on every Windows machine? Could they order your home landlord to make copies of the files in your apartment without a warrant because they have a key to the door?
> It's not clear what the limiting principle there would be -- could they order Microsoft to retain any of the data on everyone's PC that isn't in the cloud, because their system updater gives them arbitrary code execution on every Windows machine?
My understanding is it's closer to something like: they cannot order a company to create new tools, but they can tell it not to destroy the data it already has. So MS having the ability to create a tool that extracts your data is not the same as MS already having that tool functioning, collecting and storing your data, and then being told to simply not destroy it. Similarly, VPNs that are not set up to create logs can't keep or hand over what they don't have.
Laws can be made to require the collection and storage of all user data by every online company, but we're not there -- yet. Many companies already do it on their own, and the user then decides if that's acceptable or not to continue using that service.
If the company had created their service to not have the data in the first place, this probably never would have found its way to a judge. Their service would cost more, be slower, and probably be difficult to iterate on, as it's easier to hack things together in a fast-moving space than to build privacy/security-first solutions.
The issue is, what does "the data they already have" mean? Does your landlord "have" all the files in your apartment because they have a key to the door?
Real Property, Tenant's rights, and Landlord laws are an entirely separate domain. However, I believe in some places if you stop paying rent long enough, then yes all your stuff now belongs to the landlord because they have the "key".
"the data they already have" means the data the user gave the company (no one is "giving" their files to their landlord) and that the company is in full possession of and now owns. Users in this case are not in possession or ownership of the data they gave away at this point.
If you hand out photocopies of the files in your apartment, the files in your apartment are still yours, but the copies you gave away to a bunch of companies are not. Those now belong to the company you gave them to and they can do whatever they want with it. So if they keep it and a judge tells them the documents are not to be destroyed (because laws things), they would probably get into trouble if they went against the order.
Which is what I was trying to bring attention to; the fact that the company has a choice in what data (if any) they decided to collect, possess, and own. If they never collected/stored it then no one's privacy would be threatened.
The third-party doctrine has been weakened by the Supreme Court recently, in United States v. Jones and Carpenter v. United States. Those are court decisions, not new laws passed by Congress. See also this quote:
https://en.wikipedia.org/wiki/Third-party_doctrine#:~:text=w...
If OpenAI doesn't succeed at oral argument, then in theory they could try for an appeal either under the collateral order doctrine or seeking a writ of mandamus, but apparently these rarely succeed, especially in discovery disputes.
Justice Sotomayor's concurrence in U.S. v. Jones is not binding precedent, so I wouldn't characterize it as weakening the third-party doctrine yet.
Yep. This is why we need constitutional amendments or more foundational laws around privacy that changes this default. Which should be a bipartisan issue, if money had less influence in politics.
This is the perverse incentives one rather than the money one. The judges want to order people to do things and the judges are the ones who decide if the judges ordering people to do things is constitutional.
To prevent that you need Congress to tell them no, but that creates a sort of priority inversion: The machinery designed to stop the government from doing something bad unless there is consensus is then enabling government overreach unless there is consensus to stop it. It's kind of a design flaw. You want checks and balances to stop the government from doing bad things, not enable them.
OpenAI is the actual counterparty here though and not a third party. Presumably their contracts with their users are still enforceable.
Furthermore, if the third party doctrine is upheld in its most naïve form, then this would breach the EU-US Data Privacy Framework. The US must ensure equivalent privacy protections to those under the GDPR in order for the agreement to be valid. The agreement also explicitly forbids transferring information to third parties without informing those whose information is transferred.
Well, I don't think anyone is expecting the framework to work this time either, after earlier tries have been invalidated. It is just panicked politicians trying to kick the can to avoid the fallout that happens when it can't be kicked anymore.
Yes, and I suppose the courts can't care that much about executive orders. Even so, one would think that they had some sense and wouldn't stress things that the politicians have built.
3rd party doctrine in the US is actual law... so I'm not sure what's confusing about that. The president has no power to change discovery law. That's congress. Why would a judge abrogate US law like that?
You're confused. This is not about the FBI's right to data, it's about the New York Times' right to the same. The doctrine you're referencing is irrelevant.
The magistrate is suggesting that there is no reasonable expectation of privacy in chats OpenAI agreed to delete, at the request of users. This is bizarre, because there's no way for OpenAI to use data that is deleted. It's gone. It doesn't require abrogation of US law, it requires a sensible judge to sit up and recognize they just infringed on the privacy expectations of millions of people.
It's a routine discovery hearing regarding documents that OpenAI creates and keeps for a period of time in the normal practice of its business.
They probably do already, but won't this ruling force OpenAI to operate separate services for the US and EU? US users must accept that their logs are stored indefinitely, while EU users are entitled to have theirs deleted.
Stop giving your information to third parties with the expectation that they will keep it private when they won't and cannot. Your banking information is also subject to subpoena... I don't see anyone here complaining about that. Just the hot legal issue of the day that programmers are intent on misunderstanding.
appealing whatever ruling this judge makes?
You can't appeal a case you're not a party to.
It's a direct answer to the question what recourse OpenAI has.
Users should stop sending information that shouldn't be public to US cloud giants like OpenAI.
Do you really think a European court wouldn't similarly force a provider to preserve records in response to being accused of destroying records pertinent to a legal dispute?
Fundamentally, on-prem or just foregoing is the safest way, yes. If one still uses these remote services it's also important to be prudent about exactly what data you share with them when doing so[0]. Note I did not say "Send your sensitive data to these countries instead".
The laws still look completely different in US and EU though. EU has stronger protections and directives on privacy and weaker supremacy of IP owners. I do not believe lawyers in any copyright case would get access to user data in a case like this. There is also a gap in the capabilities and prevalence of govt to force individual companies or even employees to insert and maintain secret backdoors with gag orders outside of court (though parts of the EU seem to be working hard to close that gap recently...).
[0]: Using it to derive baking recipes is not the same as using it to directly draft personal letters. Using it over VPN with pseudonym account info is not the same as using it from your home IP registered to your personal email with all your personals filled out and your credit card linked. Running a coding agent straight on your workstation is different to sandboxing it yourself to ensure it can only access what it needs.
> I do not believe lawyers in any copyright case would get access to user data in a case like this.
Based on what? Keep in mind that the data is to be used for litigation purposes only and cannot be disclosed except to the extent necessary to address the dispute. It can't be given to third parties who aren't working on the issue.
> There is also a gap in the capabilities and prevalence of govt to force individual companies or even employees to insert and maintain secret backdoors with gag orders outside of court
There's no secret backdoor here. OpenAI isn't being asked to write new code--and in fact their zero-data-retention (ZDR) API hasn't changed to record data that it never recorded in the first place. They were simply ordered to disable deletion functionality in their main API, and they were not forbidden from disclosing that change to their customers.
>"What is the recourse for OpenAI and users?"
Start using services of countries who are unlikely to submit data to the US.
"OpenAI user" is not an inherent trait. Just use another product, make it OpenAI's problem.
If anyone is under the impression OpenAI isn't saving every character typed into the chats and every bit sent to the API, I would implore them to look at the current board members.
smh it's not like one of their founders is out there collecting biometrics
collecting eyeballs
Even if OpenAI and other LLM providers were prohibited by law not to retain the data (opposite of this forced retention), no one should trust them to do so.
If you want to input sensitive data into an LLM, do so locally.
> prohibited by law not to retain the data
is the same as forced retention. You've got a double negative here that I think you didn't intend.
Oops, you're correct it was not intended. "prohibited by law from retaining data"
We (various human societies) do need to deal with this new ability to surveil every aspect of our lives. There are clear and obvious benefits in the future - medicine and epidemiology will have enormous reservoirs of data to draw on, entire new fields of mass behavioural psychology will come into being (I call it MAssive open online psychology, or moop), and we might even find governments able to use minute-by-minute data about their citizens to, you know, provide services to the ones they miss…
But all of this assumes a legal framework we can trust - and I don’t think this comes into being piecemeal with judges.
My personal take is that data that would not exist, or would be different, without the existence or activity of a natural human must belong to that human - and that it can only be held in trust without explicit payment to that human if the data is used in the best interests of the human (something something criminal notwithstanding).
Blathering on a bit, I know, but I think "in the best interests of the user / citizen" is a really high and valuable bar, and also that the default that, if my activities create or enable the data, it belongs to me really forces data companies to think.
Be interested in some thoughts
Zero knowledge proofs + blockchain stream payments + IPFS or similar based storage with encryption and incentive mechanisms.
It's still outside the Overton window (especially on HN), but the only way I've seen to get the benefits of big data and maintain privacy is by locking the data to the user and not aggregating it in all these centralized silos that are then incentivized to build black markets around that data.
How do you apply ZKPs to ChatGPT queries?
As far as cryptographic solutions go: what would be ideal is homomorphic encryption, where the server can do the calculations on data it can't decrypt (your query) and send you something back that only you can decrypt. Assuming that's unworkable, we could still have anonymity via cryptocurrency payments for tokens (don't do accounts) + ipfs or tor or similar. You can carry around your query + answer history with you.
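To give a taste of the additive case, here's a toy with the phe Paillier library (additively homomorphic only - full LLM inference under fully homomorphic encryption is still far from practical, so treat this as the shape of the idea, not a proposal):

    # Toy example of computing on encrypted data with the additively
    # homomorphic Paillier scheme (pip install phe). The "server" only
    # ever sees ciphertexts; only the key holder can decrypt the result.
    from phe import paillier

    public_key, private_key = paillier.generate_paillier_keypair()

    # Client encrypts its private inputs and sends only ciphertexts.
    enc_a = public_key.encrypt(42)
    enc_b = public_key.encrypt(8)

    # Server computes on ciphertexts: addition of ciphertexts and
    # multiplication by plaintext constants are what Paillier supports.
    enc_result = (enc_a + enc_b) * 3

    # Only the client, holding the private key, can read the answer.
    assert private_key.decrypt(enc_result) == 150  # (42 + 8) * 3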
Previous discussion:
OpenAI slams court order to save all ChatGPT logs, including deleted chats
https://news.ycombinator.com/item?id=44185913
The "judge" here is actually a magistrate whose term expires in less than a year.[1]
Last time I saw such weak decision-making from a magistrate I was pleased to see they were not renewed, and I hope the same for this individual.
[1] https://nysd.uscourts.gov/sites/default/files/2025-06/Public...
Not sure why scare quotes are warranted here. Your so-called '"judge"' is, in fact, a judge.
https://www.uscourts.gov/about-federal-courts/types-federal-...
Magistrate judges are more variable, not subject to Senate confirmation, do not serve for life, render decisions that very often are different in character from those of regular judges-- focusing more on "trees" than "forest". Without consent, their scope of duties is limited and they cannot simply swap in for a district judge. They actually are supervised by a district judge, and appeals, as it were, are directed to that officer not an appellate court.
In a nutshell, I used quotes to indicate how the position was described by the article. These judicial officers are not interchangeable with judges in the federal system, and in my experience this distinction is relevant both to why this person issued the kind of decision they did, and to what it means for confidence in the US justice system.
I don't think a routine application of 3rd party doctrine should sink a magistrate judge.
So in your view, there is no expectation of privacy in anything typed into the Internet. And, if a news organization, or a blogger, or whoever, came up with some colorable argument to discover everything everyone types into the Internet, and sued the right Internet hub-- you think this is totally routine and no one should be annoyed in the least, and moreover no one should be allowed to intervene or move for protective order, because it would be more convenient for the court to just hand over all the Internet to the news or blogger or whoever.
It's precisely that perspective that I think should sink a magistrate, hundreds of times over.
The internet isn't magic: if you send data to business X that is under the jurisdiction of country Z, its judicial system can get it by court order.
It has always been like this - you are on HN; did you think E2EE was just a LARP? This isn't even some Patriot Act gag-order bullshit. If you could claim a privacy exception for any user data, 99% of companies would be immune to discovery.
So no, the spooks are not gonna look at your deepest secrets that you put in CleverBot 9000, but giving your data to Sam "Give me your eyes for 20 bucks" Altman was stupid. Yes, if you are capable of reaching this site it's your *fault*, you should know better.
Well, the magistrate's job is to apply precedent, and I have no idea why you believe this is not a routine application of the 3rd party doctrine. So that's no reason to hold it against the magistrate, even if you disagree with the law.
Second, what colorable argument? There is no colorable argument that entitles you to "discover everything everyone types into the Internet" so there's no need to pretend there is for the purpose of this conversation. Feel free to posit one. You didn't, because none exist. Discovery is limited and narrow. Here, what the court is demanding from OpenAI is limited and narrow, unlike the ridiculous scenario you offered.
> So in your view, there is no expectation of privacy in anything typed into the Internet.
In the view of American law, as it is currently written and settled, when what you've typed into the internet is relevant to ongoing litigation, yes, there is no expectation of privacy from discovery for anything you typed into the particular service on the internet that's being litigated. Likewise, there's no expectation of privacy if you're not either litigant, but you have been subpoenaed, and forced to testify. The fifth amendment only protects you from self-incrimination.
There are far more horrifying aspects of American law, as it is currently written and settled, I can't say I have the energy to be all that outraged over this one, as opposed to any of the other insane shit that's currently going on.
When people are routinely being disappeared without due process or legal recourse, the issue of 'a few lawyers sworn to secrecy going over some user queries under the constraints of a judge in an active litigation' is not actually a serious issue. This category of thing happens all the time, and it's uncomfortable for third parties involved, but a millennium of common law has generally put the needs of the courts reaching a fair decision in a case above the needs of unrelated third parties to not be bothered by them.
Losing this case would be an incredibly serious issue for OpenAI's business model, though, which is why it's throwing shit at the wall to see if it sticks, and shouting for sympathy to anyone who wants to listen. I can't say I give a fig about their financial well-being.
> So in your view, there is no expectation of privacy in anything typed into the Internet.
This is a good point because chat gpt is The Internet and any order pertaining to a specific website applies to every website. Similarly if the police get a warrant to search a house it applies to every house on the earth
Also not deleting user-submitted content is the same thing as mass surveillance. For example this website doesn’t allow you to delete comments after a certain period, so Hacker News is a surveillance apparatus
I don't either. Of the long list of 'Horrible, far-reaching legal decisions that have been made in the past week/month/year', this isn't one.
No judge can block the kind of mass surveillance programs that have been ongoing for more than a decade now. This is a joke and completely irrelevant. OpenAI, just like every other corp, is storing as much as it can to profile you and your bit stream.
What we type into the textarea is more valuable than the returned output from the llm.
I find it really strange how many people are outraged or shocked about this.
I have to assume that they are all simply ignorant of the fact that this exact same preservation of your data happens constantly in every other service you use, other than those that are completely E2EE, like Signal chats.
Gmail is preserving your emails and documents. Your cell provider is preserving your texts and call histories. Reddit is preserving your posts and DMs. Xitter is preserving your posts and DMs.
This is not to make a judgement about whether or not this should be considered acceptable, but it is the de facto state of online services.
>I find it really strange how many people are outraged
>This is not to make a judgement about whether or not this should be considered acceptable
A person is outraged because they find it unacceptable. This is beyond terms and conditions: OpenAI is being forced to keep data it wants to discard for the user.
I am shocked that you are shocked that people are taking a position on this, when you suggest you don't take a position on it.
A person should get out more. They'd probably learn a lot, by the sound of it.
Outrage suggests a level of surprise with the anger. This is not surprising at all.
When you hand over your data to a 3rd party, you should not expect it to remain private from the government that rules over that party. The entire 21st century has been a constant deluge of learning how much we are all monitored constantly.
I think you'd be surprised at the amount of people who don't understand this. Go ask a random person on the street or a nontechnical friend you have and see what answer you get.
To me, this is akin to Google saying it doesn't want to follow a court order because it would be a privacy invasion. I feel like OpenAI framed the issue as a privacy conversation, and some journalists are going along with it without questioning the source and its current privacy policy re: data retention and data sharing with affiliates, vendors, etc.
It takes 30 seconds to save the privacy policy and upload it to an LLM and ask it questions and it quickly becomes clear that their privacy policy allows them to hold onto data indefinitely as is.
Fighting Microsoft to get the illusion of privacy is the modern-day fight against windmills.
Even though this doesn't apply to enterprise customers, I'm just waiting for European customers to wake up and realize that ChatGPT isn't compatible with the GDPR today. And if the court suddenly decides that enterprise customers should also be part of the preservation order it'll be a big hit for OpenAI.
Your honor, it appears somebody has obtained your Netflix video rental history and has been asking ChatGPT about it.
OpenAI already saves all chats
It's crazy how much I hate every single top level take in this thread.
Real human beings' actual work is allegedly being abused to commit fraud at a massive scale, robbing those artists of the ability to sustain themselves. Your false perception of intimacy while asking the computer Oracle to write you smut does not trump the fair and just discovery process.
Yeah, nice strawman, but there have been plenty of leaked chats with ChatGPT (just stuff that's been set to public accidentally) that are so obviously "private" in nature that it's not funny.
Sorry but, humans have a right to privacy beyond your dislike of the services they use.
You clearly have an emotional connection to people that you feel are being harmed by AI, so I’m not going to gaslight you about that.
But I will tell you that real humans asking private, real questions of LLMs is also happening, and these two things aren’t related. Many people who don’t have the technical literacy to understand the implications are sending messages with extremely sensitive personal, medical, and financial information.
Straw-manning all of these users as tech-savvy, horny IP thieves is ridiculous. I could find your argument more persuasive if you actually considered the privacy needs of the people who had nothing to do with building or perpetuating the systems.
EDIT: to be clear I’m not sure what I think the solution should be, as I also understand the need for discovery.
After reading the actual Order, it appears the plaintiffs filed an application to the Court expressing concern that the defendant, OpenAI, was potentially destroying evidence, and, to prevent a spoliation claim (relief due to destruction of evidence), the Judge ordered OpenAI to stop destruction of anything (i.e., to preserve everything).
A person not a party to the action then filed an application to intervene in the lawsuit because the Judge's Preservation Order constituted a breach of the terms of his contract with OpenAI regarding his use of OpenAI's product - more specifically that the Intervenor entered into usage of OpenAI's product upon the agreement that OpenAI would not preserve any portion of Intervenor's communication with the OpenAI product.
The problem, as I see it, is that the Judge did not address the issue that her Order constituted a breach of Intervenor's contractual interests. That suggests to me that Intervenor did not expressly state that he held contractual rights that the Court's Order was violating. I would think the next step would be to file an Order to Show Cause directly against the Magistrate Judge claiming the Magistrate's Order constitutes an unconstitutional government taking of property without Due Process.
Is there any proof that ChatGPT was deleting chats? I would think they would keep them to use as training data.
There should be a law about what "delete" means for online services. I used to delete old comments on reddit until their AI caught up to my behavior and shadow-banned my 17 year-old account. As soon as that happened, I could see every comment I ever deleted in my history again. The only consolation is that no one but me could see my history anymore.
As long as you are in the EU, the GDPR gives you the right to be forgotten.
Or CCPA in California
They're both paper tigers. The CLOUD Act and that massive data center in Utah trump* each of them respectively. What happens in the US stays in the US, delete button or not.
* Deliberately ambiguous.
This kind of conspiracy theory comes up a lot on here. Most of these products have the option to allow or deny that, and contrary to the opinions here, those policies are then followed. This whole episode is news because it violates that.
the only "conspiracy" here is acting like ignoring privacy laws and toggles is not the default for these corporation and has been for over a decade.
I remember, back in the day when I made the mistake of using Facebook and iPhones, that a) Facebook never actually deleted anything and b) iMessage was also not deleting (both were merely hiding).
This is why in this part of the world we have the GDPR, and it would be amazing to see OpenAI receiving penalties of billions of euros, while at the same time a) the EU will receive more money to spend, and b) the US apparatus will grow stronger because it will know everything about everyone (the very few things it didn't already know via the FAANGs).
Lately I have been thinking that "they" play chess with our lives, and we are sleepwalking into either a Brave New World (for the elites) and/or a 1984/Animal Farm for the rest. To give a more pleasant analogy, the humans in WALL-E; or a darker analogy, the humans in the Matrix.
CCPA also allows for you to have your data deleted
"creating mass surveillance program harming all ChatGPT users" is just taking the lawyers' words out of their mouth at face value. Totally ridiculous. And of course its going to lead to extreme skepticism from the crowd here when its put forward that way. Another way to do describe this: "legal discovery process during a lawsuit continues on as it normally would in any other case"
""Proposed Intervenor does not explain how a court’s document retention order that directs the preservation, segregation, and retention of certain privately held data by a private company for the limited purposes of litigation is, or could be, a 'nationwide mass surveillance program,'" Wang wrote. "It is not. The judiciary is not a law enforcement agency.""
This is a horrible view of privacy.
This gives unlimited ability for judges to violate the privacy rights of people while stating they are not law enforcement.
For example, if the New York Times sues, claiming that people using a no-scripts add-on are bypassing its paywall, can a judge require that the add-on collect and retain all sites visited by all its users, and then say it's OK because the judiciary is not a law enforcement agency?
> This gives unlimited ability for judges to violate the privacy rights of people while stating they are not law enforcement.
See my comment above in reply to aydyn: in general, "privacy rights" do not exist in American law, and as such the judge is violating nothing.
People are always surprised to learn this, but it's the truth. There's the Fourth Amendment, but courts have consistently interpreted that very narrowly to mean your personal effects in your possession are secure against seizure specifically by the government. It does not apply to data you give to third-parties, under the third-party doctrine. There are also various laws granting privacy rights in specific domains, but those only apply to the extent of the law in question; there is no constitutional right to privacy and no broad law granting it either.
Until that situation changes, you probably shouldn't use the term "privacy rights" in the context of American law: since those don't really exist, you'll just end up confusing yourself and others.
You don’t have privacy rights once you hand over your data to a third party.
This isn’t a new issue OpenAI is forcing the courts to wrestle with for the first time.
"Judge creates mass surveillance program; denies it."
I really wish we had broad laws requiring telling the truth about concrete things (in contracts, by government officials, etc.). But we don't even have any real enforcement even for blatant perjury.
> Judge denies creating “mass surveillance program” harming all ChatGPT users
What a horribly worded title.
A judge rejected the creation of a mass surveillance program?
A judge denied that creating a mass surveillance program harms all ChatGPT users?
A judge denied that she created a mass surveillance program, and its creation (in the opinion of the columnist) harms all ChatGPT users?
The judge's act of denying resulted in the creation of a mass surveillance program?
The fact that a judge denied what she did harms all ChatGPT users?
(After reading the article, it's apparently the third one.)
The third one is the only correct way to interpret the title.
"However, McSherry warned that "it's only a matter of time before law enforcement and private litigants start going to OpenAI to try to get chat histories/records about users for all sorts of purposes, just as they do already for search histories, social media posts, etc.""
If this is a concern, is the best course of action for McSherry to stop using ChatGPT?
We have read this sort of "advice" countless times in HN comments relating to the use of software/websites controlled by so-called "tech" companies.
Something like, "If you are concerned about [e.g., privacy, whatever], then do not use it. Most users do not care."
Don't use _____.
This is a common refrain in HN comment threads.
"OpenAI will have a chance to defend panicked users on June 26, when Wang hears oral arguments over the ChatGPT maker's concerns about the preservation order."
"Some users appear to be questioning how hard OpenAI will fight. In particular, Hunt is worried that OpenAI may not prioritize defending users' privacy if other concerns-like "financial costs of the case, desire for a quick resolution, and avoiding reputational damage"-are deemed more important, his filing said."
"Intervening ChatGPT users had tried to argue that, at minimum, OpenAI should have been required to directly notify users that their deleted and anonymous chats were being retained. Hunt suggested that it would have stopped him from inputting sensitive data sooner."
Any OpenAI argument that invokes "user privacy" is only doing so as an attempt to protect OpenAI from potentially incriminating discovery. OpenAI will argue for its own interests.
Correction: If this is a concern, is the best course of action for _Hunt_ to stop using ChatGPT?