53 points by caltlgin 2 months ago
I'm having the opposite problem. I want a site I admin added back to the Wayback Machine; the previous owner of the domain got it removed. It seems really hard to get anyone's attention. I've tried email and Twitter...
Have you tried https://twitter.com/textfiles/ ?
Jason also frequents here under the same username.
oh, will do!
I think we got our site back on the archive by adjusting our robots.txt.
Hmm, at the moment I have no robots.txt. Is there anything I can do to make it archive.org friendly?
For every copy of your site made by archive.org, there are thousands from webcrawlers that don't respect robots.txt.
A tasteful way to go about this would be to give context to past posts with a note at the same url.
To the people who want to be forgotten, you will be, but the internet is too valuable to the future for retroactive censorship.
> but the internet is too valuable to the future for retroactive censorship
If I don't allow you to copy and publish my content, I'm not censoring you. Using that term in this context makes no sense.
As for webcrawlers: sure, they are out there. And when their makers publish your content, they are infringing your rights and (in many countries) committing a crime.
I'm not a lawyer, but I'd argue you've published your content by making it available in your index. An archive or library doesn't publish content, but offers access to content having been published.
If you made a reasonable effort to control access to your works that might be a different story.
When I publish copyrighted material that I own and I do not grant others a license to copy it, their reproduction rights are limited to fair use. Reasonable uses of fair use are things such as keeping a private copy for your own study and use of small excerpts in scholarly articles and criticisms.
Fair use has never been considered by the courts to include making unlimited copies of the entirety of the work and distributing them worldwide to others. This is clearly a violation of copyright and is a criminal act under US law.
That's true for books (and public libraries, a status which the Internet Archive holds in the US, if I'm not mistaken; that doesn't change their status w.r.t. the rest of the world, however), but I don't believe you can directly transfer rules regarding books to websites.
The Internet Archive publishes the content, they don't use a robots.txt, canonical-tag (they do set a link-header) or robots meta-tag asking search engines not to index their version. For all intents and purposes, it's just a copy of your content published on their website.
I understand their goal, I'm not opposed to it on a fundamental level, but I do believe that the choice of participation should rest with the content creator.
Since copyright is first and foremost intended to be a mechanism to enrich the long-term public good, I'd argue that a rightsholder's short-sighted intent to control their work is outweighed by the public good done through archiving and preserving a work.
Copyright was never intended to give authors ironclad control over their works in perpetuity. It was intended to give a limited period of exclusivity, in return for the expectation that the work would always become public property. Eternal copyright upsets that balance, and shackles public culture behind bars.
"...but I don't believe you can directly transfer rules regarding books to websites."
The rules for copyright do apply to websites. Here is what the US Copyright Office says:
"What does copyright protect?
Copyright, a form of intellectual property law, protects original works of authorship including literary, dramatic, musical, and artistic works, such as poetry, novels, movies, songs, computer software, and architecture. Copyright does not protect facts, ideas, systems, or methods of operation, although it may protect the way these things are expressed. See Circular 1, Copyright Basics, section 'What Works Are Protected.'
When is my work protected?
Your work is under copyright protection the moment it is created and fixed in a tangible form that it is perceptible either directly or with the aid of a machine or device."
Source URL = https://www.copyright.gov/help/faq/faq-general.html
More FAQs: https://www.copyright.gov/help/faq/
'Circular 1' URL = https://www.copyright.gov/circs/circ01.pdf
The difference is that, while the contents of a book are covered by copyright in the same way that a website, a handwritten letter, or a photograph is, the physical artifact that is the book itself can be loaned, sold, etc. because of first sale doctrine.
Physical libraries lending books are much more protected by first sale doctrine than any other legislation. There are some additional rights for libraries/archives at both the federal and (in some cases) state level but it's very far from a carte blanche to replicate and freely distribute copyrighted digital content.
"archive.org" obeys "robots.txt". Over-obeys, in fact. If you add a robots.txt file that locks them out, old archives disappear, too. Even if the domain name has changed hands.
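For anyone wanting to check what a given robots.txt would tell the archive's crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the crawler identifies itself as `ia_archiver` (the token historically associated with the Wayback Machine; verify this against current Internet Archive documentation), and `example.com` plus the two sample files are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that locks the archive crawler out entirely.
# "ia_archiver" is an assumed user-agent token, not confirmed by this thread.
BLOCKING = """\
User-agent: ia_archiver
Disallow: /
"""

# Hypothetical robots.txt that explicitly allows the archive crawler
# while keeping one directory off-limits to every other crawler.
FRIENDLY = """\
User-agent: ia_archiver
Allow: /

User-agent: *
Disallow: /private/
"""

URL = "https://example.com/index.html"

if __name__ == "__main__":
    print("blocking file allows ia_archiver:", is_allowed(BLOCKING, "ia_archiver", URL))
    print("friendly file allows ia_archiver:", is_allowed(FRIENDLY, "ia_archiver", URL))
```

This only models how a well-behaved parser reads the file. Whether the archive honors the result, for crawling or for display of old snapshots, is a policy question the code cannot answer.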
> A few months ago we stopped referring to robots.txt files on U.S. government and military web sites for both crawling and displaying web pages (though we respond to removal requests sent to [email protected]). As we have moved towards broader access it has not caused problems, which we take as a good sign. We are now looking to do this more broadly.
It might not be true forever. Unfortunately.
This is an important and useful improvement. Many domain parkers/squatters/etc who snap up dead domains have robots.txt files that block everything or almost everything, breaking the ability to see the previous site via archive.org.
(Side note: domain name expiration was a mistake.)
A fun thing to do before your website expires is set up HSTS with "includeSubDomains" and enroll your website in HSTS preload. Many of the bots that backorder domains in order to put ad pages on them don't use SSL at all (not even LetsEncrypt) and the domain ends up becoming useless for them.
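A rough sketch of what that header looks like, in Python. The header name `Strict-Transport-Security` and its directives come from RFC 6797; the one-year minimum `max-age` reflects my understanding of the preload list's submission requirements, so treat that threshold as an assumption and check the list's current rules. The function names are made up for illustration.

```python
ONE_YEAR_SECONDS = 31536000  # assumed minimum max-age for preload eligibility


def hsts_header_value(max_age: int = ONE_YEAR_SECONDS) -> str:
    """Build a Strict-Transport-Security value that covers all subdomains."""
    return f"max-age={max_age}; includeSubDomains; preload"


def looks_preload_ready(value: str) -> bool:
    """Check a header value for the three directives discussed above."""
    directives = [d.strip().lower() for d in value.split(";") if d.strip()]
    max_age = 0
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                max_age = int(directive.split("=", 1)[1].strip('"'))
            except ValueError:
                return False  # malformed max-age
    return (
        max_age >= ONE_YEAR_SECONDS
        and "includesubdomains" in directives
        and "preload" in directives
    )


if __name__ == "__main__":
    value = hsts_header_value()
    print("Strict-Transport-Security:", value)
    print("preload-ready:", looks_preload_ready(value))
```

The header only takes effect when served over HTTPS, and preloading is slow to undo, which is exactly why it is a nuisance for whoever picks up the domain later.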
If they ignore robots.txt, then what else gives them the right to copy and host content from other sites? As much as I value Wayback and archive.org, I think putting this into the realm of bilateral negotiation and a DMCA-like model outside courts is a slippery slope. It's a non-solution potentially breeding new monopolies, like Google's exclusive relations with news publishers are doing. Is there nothing in HTML metadata (schema.org etc.) informing crawlers and users about usage rights that could be lifted or extended for this purpose now, especially now that the EU copyright reform has set a legal framework and recognition of principles in the attention economy?
> If they ignore robots.txt, then what else gives them the right to copy and host content from other sites?
The same thing that gives them the right otherwise: fair use, and explicit archiving exceptions written into copyright law. robots.txt adds no additional legality.
Fair use does not give you the right to wholesale scrape content that is otherwise under copyright with a non-CC/open license, which is effectively what the Internet Archive does. (To be clear, I approve of IA's mission but it's in a legal grey area.)
robots.txt has never had much of a legal meaning. Respecting it was mostly a defense along the lines of "You only have to ask, even retrospectively, and we won't copy your content." As a practical matter, very few are going to sue a non-profit to take down content when they pretty much only have to send an email, (almost) no questions asked.
> Fair use does not give you the right to wholesale scrape content
Yes, it potentially does. There are court cases establishing precedent that copying something in its entirety can still be fair use, as well as law and court cases establishing specific allowances for archives/libraries/etc.
There's probably an argument where archiving a particular site as a whole has some compelling public interest--say a politician's campaign site. But it seems unlikely that would extend to randomly archiving (and making available to the public) web sites in general.
I've always been told that fair use--as a defense against a copyright infringement claim--is very fact dependent.
IANAL, but I fail to see how fair use can be leveraged to give archive sites a right to host other sites' content when that content is available publicly and on a non-discriminatory basis, and there are e.g. Creative Commons license metadata tags for giving other sites explicit and specific permissions to re-host content. There are also concerns to be addressed under EU copyright reform (e.g. preview of large portions of text from other sites without giving those other sites clicks).
If your point is that content creators can't technically or "jurisdictionally" stop archival sites from rehosting, then the logical consequence is that content creators need to look at DRM and similar draconian measures, which I hope they aren't forced to do.
How does it work for all the content that is not under fair use because the author comes from a country with different laws?
The author's jurisdiction is irrelevant. The only question is what jurisdiction's laws apply to the Internet Archive (or in general whatever party does the copying).
Then that country can try to enforce its laws on the Internet Archive. Won't be easy, though.
Besides the aforementioned gov/mil sites that archive.org announced a changed policy on, I noticed during the TurboTax controversy that archive.org seems to have disobeyed robots.txt (2008) [1a,1b] and NOARCHIVE (2006) for TurboTax as well. It wouldn't surprise me if Archive.org retroactively changed this because of the controversy, since they manually changed display settings during the Joy Reid controversy. But the latter case involved a webmaster who set robots.txt to wipe out old pages, whereas TurboTax has had robots.txt + NOARCHIVE as a policy for years. I had assumed Archive.org wouldn't even originally store such pages during the crawl, but that seems not to be the case?
Don’t do this. Please don’t do this. The Wayback Machine is one of the only records of history we have on the internet, often the only way to look back and see what has been. It’s invaluable for that—and if your site is to be any part of the internet’s history, it should be available there too.
This fundamentalism is unhelpful.
What if people don’t want to be part of internet history? It’s hard enough to be anonymous, or even to move on from mistakes, as it is.
Then why make a site? That's like saying you don't want any photos of you to exist, but you've been outside for years while people were taking photos and now you're telling them to delete those.
So, what you’re saying is that a decision you make as, say, a 17 year old is one that you must stand by for the rest of your life?
And, yes, I get it, there’s more than the IA but it’s a point of principle for me. I’m not talking about erasing newspaper articles but rather the blog a kid posts when they are naive.
> So, what you’re saying is that a decision you make as, say, a 17 year old is one that you must stand by for the rest of your life?
That's basically the way it is anyway. If you go out in public (physically or virtually), you've lost some control over how long-reaching your actions might be. If you do something stupid in public, you can't prevent people from posting their videos/photos of it, or just talking about you. The internet is no different. While you can get your stuff removed from some places, you have no control over it generally speaking. Somebody might have screenshots for example.
In most parts of the world you are entitled to privacy even if you are outside, and you can demand photos/videos taken of you without your consent to be deleted.
Plus, just because I have a website, that doesn't make its content public domain, for some business to copy all its contents and publish them without my knowledge.
>you can demand photos/videos taken of you without your consent to be deleted. //
Can you name maybe five large countries where that's true? It's not true in USA, nor UK AFAIK. I understand it's not true in Germany either.
So I only know of counterexamples; I'd be interested to hear of any. China and Russia don't seem likely to have such laws; maybe they're common in South America or Africa?
Tempting as it is to reduce everything to a binary, that's not how things work.
>So, what you’re saying is that a decision you make as, say, a 17 year old is one that you must stand by for the rest of your life?
Yes, that's why at ryanmercer.com I leave incredibly embarrassing LiveJournal posts up from high school in 2001.
That's who I was, in 2001, not necessarily now. Those are things I willingly and freely shared on the internet. Do they make me cringe... yes, do I wish I'd never have posted them... yes, but I did so they are there to document who I was, to document what a random teen thought during that period. If someone or a potential employer wants to hold 2001 Ryan against 2019 Ryan then I don't want to have anything to do with them.
I do believe that it will be the norm in the near future.
You can store photos and videos for some time (for legal purposes mostly), but after that you would need to either obtain the consent of everyone in them, remove the photo or edit it (replace real people with computer-generated ones).
The tech for the latter is basically already here.
How would that even work? My Flickr account has pictures of probably 10s of thousands of people with various degrees of identifiability. If someone really wants their photo deleted and they ask me nicely, I might very well do so. But I'm not going to delete my photos from public-facing sites just because there are people in them.
They asked politely, saying please, and not advocating for any kind of imposition. How is that fundamentalism?
I personally totally agree with the petition. And at the same time, I also think people should have the right to remove their mistakes. It's just that when they have no strong reason to remove a site, I'd rather people leave it there so history can be preserved.
Polite or not the previous commenter made a sweeping appeal based on their view of what is right and seemingly without acknowledging that there are legitimate reasons not to want to be a part of internet history.
If a person’s reason is more akin to a whim, then persuade them of the value of being a part of the record you want to see.
I read maguay's comment as an attempt to persuade, not a fundamentalist call that no one should do this.
Then don't be part of the internet.
That's like saying "what if people don't want to be part of human history?" Like it or not, unless an individual is completely inconsequential, they're part of history, and that should be preserved for future historians to dig through.
We have the capability to preserve untold amounts of data for the future - far more than any other time in history - and some of us are more worried about ensuring that they don't have a place at that table. It's so sad, honestly.
If you do not want to be part of internet history, you should not be part of the internet present.
Some people don’t want their site to be part of internet history, and that is a legitimate view.
>that is a legitimate view.
Think about personal correspondences of famous people before the internet.
But if it's online and not gated by login, it's explicitly public, unlike private correspondence.
In the UK taking a private copy of a website is a copyright infringement; doesn't matter if you publish it.
Your idea probably works in USA with Fair Use.
Could you back up this claim? Reading about fair dealing, as it is called in UK law, leads me to believe otherwise, especially if the copy is used for archival purposes.
The context was private individuals keeping copies.
There are some exceptions but they're not really pertinent here: You can make a transient copy, e.g. in order to view a website you might cache something. You can retain a copy of a TV show until you watch it - but can only watch once, and not with company. You can keep copies to facilitate workarounds for disabilities (but again you can't use that for retention) ...
Registered archives can keep works so long as they're not accessible by the public.
Your second link, I wasn't totally aware of those changes. However, they don't seem especially pertinent. You're not allowed access to the whole copy of an archive copyright work. Private archives can't keep copies. Public archives can only do so when buying access is not feasible.
The private study requires you to be on a related official course of study, and the works used -- but only accessed in part -- have to be cited in the study results.
So, WBM isn't a UK public library and couldn't copy a UK served website legally for archive. A UK public archive could serve the pages, but only parts of them, and only to people physically in the building.
I think my summary was correct in context; detailed corrections welcome!
Fair Use permits the unlicensed use of copyright-protected works in certain circumstances and takes into account a number of factors--including the amount of the original work that's used. So quoting a paragraph from a blog post to comment on it is almost certainly OK. An entire website? Probably not under most circumstances.
What about them?
Because it's _my_ site and I don't want it archived.
If you want full control over works you author, you should keep them to yourself. Publishing inherently gives up control, and the whole idea of copyright in the first place was to enrich the public domain after a period of exclusivity designed to incentivize production of new works.
This idea that copyright has anything to do with "control" is what has caused so much cultural loss. Please don't perpetuate that idea.
So don't publish it.
Though I somewhat sympathize with this view, there can be legitimate reasons for depublication. The Internet is deep and vast.
That said, removal may entail simply removing public access, rather than deleting archived content:
The Internet Archive may, in appropriate circumstances and at its discretion, remove certain content or disable access to content that appears to infringe the copyright or other intellectual property rights of others.
Let's say you spend a few years building a site and then things change and you lose interest. You stop paying for hosting and let the domain drop?
Then someone catches the domain and recreates the site with your content, without your knowledge or permission. You're happy with this are you?
> Let's say you spend a few years building a site and then things change and you lose interest. You stop paying for hosting and let the domain drop?
> Then someone catches the domain and recreates the site with your content, without your knowledge or permission. You're happy with this are you?
What's to stop someone from doing that without the internet archive?
If the domain has dropped how are they going to find the content?
> What's to stop someone from doing that without the internet archive?
This is exactly my point.
Does the Wayback Machine actually delete the content it removes or does it just not make it available? In 200 years, when everyone involved is dead, there is far more of a case to be made to publish it than now.
Very good question.
There are MANY reasons to remove sites from there.
Ok, list a few then please.
One of my local bus companies just became the only company to run buses to the local university, after the one other bus company stopped operating in the area.
They've removed themselves from archive.org so now we can't easily show that they've more than doubled the cost of the annual student bus pass over the last few years.
1. Don't want their lives 'ruined' because of socially unacceptable tweets.
2. Don't want their citizens to know that they are actively committing genocide.
3. Countless other equally legitimate reasons for removing things from the public record.
4. I'm embarrassed.
5. I could lose money.
6. My reputation will be tarnished.
7. I don't want anyone to know that I am into 'x y and z.'
8. I was young and foolish.
Some actually legitimate reasons
1. Outed themselves as a minority which is now being persecuted. (too late, the state already has the evidence)
2. Need to remove a post that is actively agitating/acting as a focus point for some group that rises to the level of physical threats.
People, the internet is public. If you put up something on port 80 or 443, you have just published a book. You can't unpublish a book. I'm sorry if the affordances are shitty and the social media platforms intentionally mislead you into thinking that publishing is 'sharing,' but if you published it, you have to own it. You cannot unspeak, and if you do, or if a system allows you to, then that is a fundamental violation of the social contract. If you fucked up, and want to apologise, or provide additional context, then by all means do so.
In cases where a tweet, post, etc. incites a brigade, there need to be ways to temporarily hide content, but if it is deleted forever, then there is a tempest in a teapot without any teapot for reference. Not that it will ever happen, but platforms like twitter should be held accountable for facilitating viral hatred and brigading, it would incentivize them to implement algorithms to damp the spread and to force additional context onto users before they are allowed to view a hot and bothered tweet (or similar). You must correctly answer these 10 questions about the context from which the author was speaking before you are allowed to retweet or even view this message. That might be a good compromise for 'surge' internet outrage.
> You cannot unspeak, and if you do, or if a system allows you to, then that is a fundamental violation of the social contract.
There's no such social contract.
> 2. Need to remove a post that is actively agitating/acting as a focus point for some group that rises to the level of physical threats.
Mobs have short attention span and are mobilized by the newest controversy of the day. We're talking about recording history. You remove a bug from git, but you don't alter the entire history for it. You remove passwords from git, but you also change the current passwords.
There's no reason we can't 'fix' the present and record the past at the same time.
One example: Made comments as a young man on subjects, now don't agree with them. Or they can cost me the job.
Insulted Putin, or China, or MBS and now I need to go to their country
Don't want your picture taken all the time? Don't want everything you say recorded and archived? You might also not want everything you write to be archived. If you want control over your content, archives are a problem.
> Don't want your picture taken all the time? Don't want everything you say recorded and archived? You might also not want everything you write to be archived. If you want control over your content, archives are a problem.
Good luck tracking down every company/user that has visited your page then. Any single one of them could be archivers. It's not hard to change a user agent to look like Google.
This is like arguing against archiving newspapers. If you explicitly publish it online for the world to see, you can't make people unsee it.
> Good luck tracking down every company/user that has visited your page then. Any single one of them could be archivers.
Could be, sure. And anyone could wear a hidden camera and secretly take your picture, or a wire and secretly record you.
If any of those undercover archivers re-publishes your content, send a DMCA notice and sue them. Where copyright infringement is a crime, report them.
There's a cultural component to this, I believe. Americans seem to feel that pictures, recordings etc taken in public are fair game, continental Europe has a different stance. Even in public, you can't take pictures of ordinary people and publish them (unless they're part of an extraordinary event).
>>Even in public, you can't take pictures of ordinary people and publish them (unless they're part of an extraordinary event).
I don't know which European country you have in mind specifically. Arguably some are more strict on this than others (Germany, Austria), but in most places you can take and publish pictures taken in public places without asking for permission. It's only an issue if someone is specifically a subject of your picture - so a wide shot of a street is absolutely fine, but a photo zoomed in on someone's face is not, even if they were in a public space.
True, it's usually about being identifiable. Italy, France and the Netherlands require model release as well if you want to publish those pictures if I remember correctly. I don't know how Eastern Europe handles these cases.
Some also allow news content in general, even if the picture itself isn't noteworthy (i.e. illustrating a shopping mall vs somebody standing next to a politician being attacked with a cake), but I don't know about the intricacies.
At least in the US, model releases are strictly for photos used commercially (e.g. in advertising or marketing materials). Editorial use, which includes just putting it up on a blog or whatever, doesn't have restrictions.
> Americans seem to feel that pictures, recordings etc taken in public are fair game
I'm European, but I fail to see how anybody could have any expectation of privacy when in a public place. You either outlaw cameras completely or you have to accept that you might end up in the background of somebody's photograph. I don't think outlawing cameras is realistic.
> You either outlaw cameras completely or you have to accept that you might end up in the background of somebody's photograph.
You don't need to outlaw cameras any more than you need to outlaw knives to keep people from stabbing others. But as mentioned, there's a fundamental difference in the idea of privacy, I suppose. It can be understood as "something that happens in a non-public place" or it can be understood as a larger idea that you have a certain right to not be surveilled, recorded and stalked.
My point is that if cameras are not outlawed, then you can be photographed in public by accident, just because you happened to walk into a shot or happened to be in the background when some tourists wanted to photograph something. There's a difference between "you can't photograph me while I'm in public" and "you can't harass me". It's not the act of walking in the same direction and in close proximity to the person that makes following them in public stalking, so you don't need to ban public photography of other people in order to prevent them from being surveilled, recorded and stalked.
I guess that I find the idea that you should expect privacy when in a public space kind of strange (it's right there in the word: public), but that doesn't mean that I think it's OK for someone to follow you around recording you (not because of the actual act of being recorded, but rather because of the targeted nature).
Similarly, I think passive recording (i.e. non-targeted surveillance) of public spaces should be allowed in and of itself, but that it's the use that dictates whether it's abusive or not (i.e. if it's done so that people can be identified, then that seems similar to me to following someone around, but if it's done for the backdrop of a movie or art project, or it's done to study foot traffic on a street... basically there are many reasons which aren't abusive).
Then don't publish and share your pictures and conversations with everyone. Privacy goes out the window the moment you share stuff with everyone on your own. If you want control over your content you need access control. It's the difference between actually publishing pictures of yourself online and being photographed in public. Privacy covers the second, not the first.
Does Archive.org make any judgment calls when it comes to honoring requests to remove content? For example, I can see people trying to scrub evidence of their own lies or promises or other damaging misdeeds asking for their content to be removed.
It feels like burning old newspapers if a subject of an old story doesn’t like the story. Or a book author forcing a library to remove her books from the shelves. There is something Orwellian about letting people purge history if they don’t like it. When something is published and public, the bell has already been rung. Should we force people who saw the original content to never speak of it? Can we sue them to prevent them from talking about the “bad” content? Erasing sites from Wayback, to me, feels like the sanctioning of censorship, or erasing history. The so-called “right” to be forgotten is a strange right in free societies. Does the right to be forgotten give people the right to destroy old newspapers that someone has saved? Can people go into someone’s home and seize books that depict the claimant in a negative light? Wayback is like a photo gallery of the past. We shouldn’t be allowing people to rewrite history.
I personally find a "right to be forgotten" as such laughable, but I understand it was specifically introduced so that something stupid you did or said, or an unflattering photo taken of you or similar, isn't exposed as your only public record in times of clickbait and staged polarizing crap. Then there's the problem of copyrighted material, and of publishing stuff on your site with the intent of making money off your users' attention via ads, one of the very few avenues of financing content creation. All these concerns have to be balanced against one another, which creates a difficult legal environment for archival sites.
Think of all the things you did as teenager or child. Now think of all these things as being documented on the internet. Do we really want to be haunted by our past in such a way?
Yes, we all agree children and teenagers do stupid shit. Since we all acknowledge it, can we be mature about it and not "haunt" people with it?
Easy to say as a bystander. What if it's your kid who's being bullied or who got bullied? What are you gonna do about it? My other post in this thread addresses that point.
I'm sorry, but these kinds of straw men are making discussions on the subject impossible. I argued that we shouldn't judge adults based on the stupid things they did as children.
> Do we really want to be haunted by our past in such a way?
To which my reply was: let's be mature about it and not care about trivialities from someone's past.
Now you change the subject to: "What if it's your kid who's being bullied or who got bullied?". My kid being bullied "right now" is not the same as "my kid did some stupid shit 10 years ago and people are making fun of him now because of it". This is another problem with another solution, and it's not something I argued about.
> Since we all acknowledge it, can we be mature about it and not "haunt" people with it?
No, we can't.
That's a lot of false equivalence in one comment.
Whatever we each may feel about the Archive's policy, it's a bit over the top to compare it with forcing someone to never speak of what they saw, or to go into anyone's home to seize their books or destroy their personal newspaper collection.
If the new owner of a domain can erase all the history of content published by a previous owner, that is kind of similar to forcing everyone to never speak of it again.
Personally I'd be very interested in subscribing to an archive of things Archive.org has removed on request.
Like what Lumen ( https://www.lumendatabase.org/ ) does for DMCA takedowns? Yeah that'd be cool to see what's happening and keep track of those requests.
> The so-called “right” to be forgotten is a strange right in free societies.
I believe the "right" (I agree that it should be quoted) to be forgotten is a temporary stopgap measure for those who haven't received a proper digital literacy education beforehand (that is, almost everyone). I expect the "right" will hang around for a long time, for the lack of better alternatives.
Children do awful things, including to each other. Bullying, for example. Nowadays, it can get recorded easily because everyone carries a smartphone. And it gets spread because of the Internet.
The right to be forgotten is a simple way to say: "I do not want this content to exist on the Internet". Does that stop the content from existing? No, of course not. Revenge pornography can still be found after that Pinkwhateveritwascalled website got shut down. But it got more difficult, and it's a matter of supply and demand. If the website is only accessible via Tor, then those who got the content on their computer put more effort into obtaining the content. You could make the same argument for child pornography.
That being said, the real problem is the lack of prosecution for the content creators. And that is true for child pornography and revenge pornography and bullying videos. However, the trend is that the latter two are on the rise on the public web. If the right to be forgotten can slow that trend down, I'd say that's a good thing.
You have a good point, and that's why the "right" will and has to stay right now. Ultimately I think a disclosure of information of any kind against the subject's will will become a direct criminal charge in the future (in contrast to a set of specialized laws that we have now), and then the "right" will be obsoleted.
Or maybe a temporary stopgap measure until privacy is so diminished that society realizes everyone does "unacceptable" things and stops being outraged by what is normal.
I mean, probably >90% of people have been deeply drunk at some point in their life. So objectively and rationally speaking, a drunk photo is not a reasonable reason to reject someone from, e.g., a job. Maybe if such photos become widespread we will come to our senses in this?
"90 percent of people do it" is not an argument against rejecting someone for it. If I had two otherwise-identical candidates but one of them had photos of themselves online passed out in a pool of their own vomit and the other didn't, I'd pick the one who didn't. Same for other evidence of poor judgment, like a picture of them in a racist Halloween costume or comments mocking a disabled person or screenshots from an oh-so-funny thread on 8chan. If you don't want something to be part of your public image, keep it private.
I'd say it's quite an argument. If 90% of people do it, it means that there is a 90% prior probability that your other candidate has done it as well. So there is a 90% probability that you are basing your decision entirely on a non-factor. Surely there are more meaningful differences* between candidates to take into account than one that has a 90% probability of not even being there at all.
To each their own, though.
*I know you said "otherwise identical", but that's not a very realistic situation and if it did happen, it would justify choosing by just any irrelevant difference, like one candidate having one day of work experience more than the other, so I don't think that says much about the importance of a criterion.
A 90 percent chance is still better than a 100 percent chance, especially when it comes to public evidence. 90 percent of people have probably said the N-word but I'm still not hiring somebody with public video of them doing it.
Since the question of why removal might be sensible has been raised, I thought I'd offer a historical perspective.
From a 1966 BBC documentary:
"Well, he who has access to information controls the game. This is very dangerous. I think both your country and mine have never trusted the government completely. We do so for good reason. Here we have a mechanism that could be abused. Here we have a mechanism that would allow the creation of a dictator. . .
I've yet to see an expression by anyone in Congress about this new type of danger. In fact, we see proposals for centralizing information, we see proposals for rushing ahead into new, more efficient computer information systems, and very little thought is being given to the dangers of the misuse of these systems. . . I ask a lot of people about privacy, why they valued it, and I was surprised by the number of people who said "Well, I don't do anything wrong. Why should I worry about privacy?" And then, on the other hand, I think there's a more wise group that says, 'Privacy is really the right to be wrong, then go on and live the rest of your life, without having it mark you forever.' I tend to think this latter view is the view we should hold."
The speaker is Paul Baran, of RAND Corporation, and the inventor of packet-based switching -- the technology which makes the Internet possible.
If you want to know who could possibly have foreseen the negative consequences of universal information networks: their creator did.
Baran's full archive of RAND publications is now freely downloadable from RAND, after I'd requested access in July of 2018, for which I'm immensely grateful.
The comments here bring to mind this:
Attorney to witness: What did you see?
Witness: It looked like he was saying "XYZ."
Opposing attorney: Objection.
Judge: Sustained. The jury will disregard the previous question and answer.
—Except: the jury has heard it — too late.
True, in reality, the interpretation of an order to disregard and forget statements or information will vary from juror to juror. Among twelve regular people who aren't necessarily very smart or trained according to good practices, and who are mostly decidedly unprofessional by way of the selection process, some might fixate and even obsess and perseverate over an order to put a seemingly juicy tidbit out of their mind.
You may be dealing with OCD cases, who just watched The Wizard of Oz and are generally paranoid about the machinations of faceless organized institutions of state power.
But the courts also operate with the full understanding that every juror is capable of willfully engaging in nullification of any aspect of the law. As a juror, one can elect to choose either polar outcome, and sociopathically adjudicate their decision based on schizophrenic delusions that they've concealed throughout the whole of any given trial they participate in.
It's somewhat understood that a trial by jury is a near total roll of the dice, with multiple layers of decisions interfacing to produce an outcome.
The judge is there to referee the whole ordeal. The example you highlight is something to be used only sparingly.
Judges run the gamut of quality, and may also be corrupt. An entire trial may be a cruel joke.
Even so, like a sport, the court room follows rules and a code of conduct on some level. The judge is there to ensure a level playing field.
A good judge would notice a particularly egregious statement as being disturbing to jurors beyond repair, by considering what it sounds like while in their shoes. If it seems like a line was crossed by an error of practice, as if a debilitating injury were dealt to an athlete, a judge would disqualify the entire competition and declare a mistrial if they're doing their job.
What does this translate to on the internet? Well, any given server cluster is sort of like a juror. They're sort of blameless for the request logs they capture, and the backups that get retained. It's the admins that have to erase things. Human judgement is required, to ensure that data gets disavowed from technical systems operated by businesses.
It's good that the only place where people still cannot remove information is a library.