Seems very off and unsustainable (e.g. what happens in 2023 is unclear), along with hiding the licensor's name - that seems unreasonable. Who is the end recipient of this donation? I don't buy the story.
Set up a shell company in Bermuda that does the scraping for you and then claim you buy the data from them. When they approach you, just tell them you have an NDA and can't disclose the company. That'll keep them at bay for a while... then when you're forced to disclose, they'll spend the rest of their time chasing a shell company in Bermuda.
OR.....
Just create a Firefox extension that scrapes the pages for your users.
There are many ways to keep this going while giving them the middle finger and serving your users.
Also, laws only matter when the outcome doesn't matter to the people in charge. Just because "the law is on your side" doesn't mean you'll win even if you could afford it, because laws don't magically rule on their own. It is judges, with all their pettiness and cravenness and nepotism, who rule.
If your data infringes copyright, they will go after you, not your fictitious shell company, because it doesn't matter where you got the infringing material; what matters is whether you illegally distributed it.
While pictures would be under copyright (and it's possible copyright is with a third party specialised in this), I can't see how the price listed for a product on, say, Tesco's website and collected by yourself on Tesco's website could be subject to licensing by a third party, even considering database rights.
I am also puzzled by those guys' claim that they got a very "generous offer" from that company.
If they are unsure, a better first step would be to crowdfund legal advice.
They could then crowdsource taking pictures of all the products from their users.
Unless there is a third party which is taking photos of all grocery products and then licensing that to the supermarkets, to avoid them all having to take their own photos?
I used to work for one of the major supermarkets and am aware that there are providers of product images, but don't know the specifics in enough detail.
Taking a photo only gives you copyright over that photo, it doesn’t stop other people taking a similar photo.
A lot of businesses try to claim copyright over a subject (like a tourist attraction) and try to prevent photography of it, but that’s legal BS in any jurisdiction I’ve heard of.
No, I mean they take the actual photographs and then license them to the retailers. Plus they can also capture other product data in a consistent format etc.
Right, but if you're making a price-comparison website with scraped data, you can't scrape the product image from a supermarket website legally.
You could of course get images from some other source - but you need meticulous organisation when a single brand of tea might have 40, 80, 160, 240 and 600 bag packages.
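To make the "meticulous organisation" point concrete, here is a minimal sketch of keying images by brand plus pack count, so that the 40-bag and 240-bag boxes never share a picture. The `image_key` helper, the regular expression and the "Example Tea" titles are all hypothetical; real product titles are messier than this.

```python
import re

# Assumption: pack size appears in the title as "<number> bags" or "<number> tea bags".
PACK_RE = re.compile(r"(\d+)\s*(?:tea\s*)?bags?\b", re.IGNORECASE)

def image_key(brand, title):
    """Build a (brand, pack count) key so each pack size gets its own image."""
    match = PACK_RE.search(title)
    count = int(match.group(1)) if match else None  # None when no pack size is found
    return (brand.strip().lower(), count)

titles = ["Example Tea 40 Tea Bags", "Example Tea 80 Bags", "Example Tea 240 tea bags"]
keys = [image_key("Example Tea", t) for t in titles]
print(keys)
```

Anything that comes back with `None` as the count would need checking by hand, which is where the real effort goes.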
Well it seems they're not scraping Tesco's website but some aggregator middleman. Like making a package tracking website using Parcels instead of FedEx, UPS, etc individually. Sure it's convenient to write code that only scrapes one website, but now that website is mad because it has a business selling that data.
It looks like a startup that's hit a brick wall vis-à-vis licensing pictures.
One way of getting around this, albeit perhaps at the expense of user experience, is to insert stock photos. E.g. I imagine 'carrots' is not particularly unique. Where stock photos can't be used, i.e. it's a specialist/one-off product, perhaps the classic white-box-black-text approach?
If this is a genuine site with good intentions, staffed by volunteers then it would be a shame to see it stamped down by the supermarkets.
They've got this data from somewhere, so why wouldn't they have looked through the license before using it?
Anyway, speaking of MySupermarket, what happened to them? All I know is that one day they decided to shut down, without giving clear reasons. Does anyone have more insight into why it happened?
Yeah, I'm scratching my head at this. On the one hand, the UK does have some weird database trolls (a copyright troll called FootballDataCo claimed licence fees from anybody publishing football fixture lists irrespective of where they sourced the information from, and probably can do again since the EU court judgement against them presumably no longer applies)
On the other hand, it makes no sense to accompany a supposed order to c&d screen scraping of multiple (independently maintained) websites with thanks for their generosity and "this company is absolutely entitled to request compensation for their work". Either they're a valuable data provider or a licensing obstacle to using perfectly adequate screen scraping techniques, not both. Not sure why a price comparison website would screen scrape the supermarkets with APIs either...
They're not "independently maintained" websites, they all license their data from the same 3rd party data company and re-publish it on their own websites.
You can't copyright a list of facts. I'm confused about what they have run into. Maybe they should take some of the money they have raised and talk to a lawyer.
The cease & desist mentions Asda [1] and [2] says "Asda chose NielsenIQ Brandbank, their existing digital content provider for 13 years"
So I'd wager the C&D came from Brandbank - who presumably supply product photos and product data (barcode, pack size, and all the other data like nutritional information you'd find on the packet if you were browsing in store)
Nobody said it was copyright specifically. However:
"A database right is a sui generis property right, comparable to but distinct from copyright, that exists to recognise the investment that is made in compiling a database, even when this does not involve the "creative" aspect that is reflected by copyright.[1] Such rights are often referred to in the plural: database rights."
Yes, but it doesn't prevent you from building the exact same copy as long as you don't use the other DB as a source. Scraping the info yourself is perfectly fine. That's also the reason why people add "mistakes" to their DB as a kind of watermark.
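For anyone wondering how the "mistakes" trick works in practice, here is a minimal sketch. Everything in it is made up for illustration: the product names, the field names and the `copied_traps` helper are hypothetical, not anything a real data company is known to use.

```python
def _key(record):
    # Normalise case and whitespace so trivial reformatting doesn't hide a match.
    return (" ".join(record["name"].lower().split()), record["pack_size"])

def copied_traps(planted, suspect_records):
    """Return the planted (deliberately fake) entries found in another dataset."""
    suspect_keys = {_key(r) for r in suspect_records}
    return [p for p in planted if _key(p) in suspect_keys]

# A deliberately misspelled, non-existent product planted in "our" database.
planted = [{"name": "Exampel Tea Bags", "pack_size": "81"}]
scraped = [
    {"name": "Example Tea Bags", "pack_size": "80"},
    {"name": "exampel  tea bags", "pack_size": "81"},
]
hits = copied_traps(planted, scraped)
print(len(hits))
```

A hit only shows the entry was copied from somewhere downstream of your DB; it doesn't tell you whether it came from you directly or via a licensee's public site, which is exactly the situation being argued about here.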
But they are (unwittingly) using the other DB as a source. The data they're scraping turns out not to have been created by the supermarkets in question but sourced from a third party database and published under license on the supermarket websites.
Which raises a very interesting legal question: If the data is recompiled, despite having come from another source via an intermediary (who are using that data to describe the products they are selling), is it still subject to the database rights?
How about we try out a reductio ad absurdum: I'm going to start a startup. Our business model is to license the information on customers' websites, and then license it back to them in turn. This then means that anyone who uses our services in the UK can sue for copyright on lists of facts obtained by screen scrapers, as they've been through our database copyright washer. Think flight prices, hotels, cars, anything.
You absolutely can copyright lists of facts in most jurisdictions outside the US
In the US it is pretty hard[1], but most other jurisdictions find that if the list has some "creative" or "work" element (i.e., it isn't just a list of everything) then it can be under copyright (e.g. [2]).
Different situation. You don't buy a terminal, you license it. The license can dictate what you're allowed to do with the data. Rebroadcast the data and you may be just fine on copyright or any other IP, but you violate the license so Bloomberg cancels it.
Delayed prices for publicly traded stocks are generally available freely. Real time prices and order book though are not, not to mention prices for anything traded OTC like many bonds, swaps, etc.
Yeah, things may differ b/w countries. But the Wikipedia article you linked mentions this:
> In regard to collections of facts, O'Connor wrote that copyright can apply only to the creative aspects of collection: the creative choice of what data to include or exclude, the order and style in which the information is presented, etc.—not to the information itself.
So, a list of facts can still be creative and valuable to society. IANAL, but such works should be protected by copyright, or what's the point of copyright in the first place.
Copyright is not like patents. If you arrive at the same result, by a different method, then even if the result is identical, there is no copyright violation.
So "the creative choice of what data to include or exclude, the order and style in which the information is presented, etc." is very weak. You can copy all the data, and then make your own choice on what to include (for example "everything I can find"), and you would not be violating the copyright.
The quote you gave is for things like making artwork based on certain patterns or letters in the data. But it won't help you if your creative choice is "everything" because there is no creativity there.
> So, a list of facts still can be creative and valuable to the society
A list of facts is not creative, it's simply work. And yes, it might be valuable, but that's not how we decide if something is copyrightable.
> or what's the point of copyright in the first place.
What that means is that if you write a list of e.g. “Top 10 tennis players in my opinion”, then you can hold copyright over that. But if you write “Top 10 tennis players by rating” then you can’t as that’s not a creative expression.
I don't think you can own rights to something other people collected and curated though. Though you can own rights to the individual components, but in this case that would be the specific price of something, which doesn't seem like it should be subject to copyright.
The images are obviously going to be copyrighted. I suspect the product information is also likely sourced from a third party, and thus you would be infringing database rights by scraping.
I find it very unlikely that the price of the products is provided by a third-party data provider, and I think the actual price is not covered by them, despite them suggesting it might be.
> The images are obviously going to be copyrighted
Copyrights are inadvertent and sometimes unintuitive. If the images are created by the product manufacturers, then they hold the copyrights. If the supermarkets edit the images, then they hold copyrights to these edits.
Likewise, if trolley edits the images (undoing the supermarket edits, like cropping around some text) then trolley together with the manufacturer holds the copyright for this new image, not the supermarket.
It doesn't matter where it was scraped from or who was hosting the image, just as it doesn't matter which camera was used.
Yes, if they scrape prices from supermarkets' websites then prices come from supermarkets, they are the ones setting those prices after all.
Really, they don't need product data beyond the name. They could drop pictures and detailed data and build their own database by crowdsourcing pictures and data from their users.
Only relevant if they are actually duplicating the "data company's" database; which is not what web-scraping is.
For example; creating a map from air photographs is not "duplicating" another geo coded database. The geo data exists independently of any map.
"Database rights" are not the same as "data rights", and if your "database" is inadvertently and routinely recreated as a by-product of other processes, then "database rights" cannot apply. Otherwise we are in an absurd world.
(In this absurd world, a phone book company could charge license fees to everyone with a phone because their phone books would necessarily contain a subset of the global phone book.)
The UK has FootballDataCo, which historically "owned" the fact that a football match between two clubs will be held on a particular date, irrespective of how the publisher found out that fact. So UK law has not always been sensible about this.
The EU Court of Justice threw that out, but we're not in the EU any more...
This use case seems very similar to a search index (aka google) type use case. Crawling and indexing vs scraping and storing, and then publishing in a searchable format for end users to utilise? Surely there's plenty of precedent for this?
It is, but arguably it's closer to the Google News model (in that more than just the titles are shared) which recently saw some pushback in the EU [1]
In general, if the data owner says you must take it down, you rarely have a leg to stand on unless you can claim some kind of protected use (e.g. reporting or criticism)
The title is misleading. Never do they mention the price data. After doing some research it appears that what is really the cause for concern are the product images.
The "data" they mention are product descriptions. The issuer of the cease and desist appears to be NielsenIQ. And they don't provide price data.
Their main product is to provide standardised product images with a white/transparent background, and small product descriptions. Their customers being big E-Commerce stores.
As others have pointed out, it seems sketchy as hell, the desperate last-minute plea for a considerable sum of cash, the weird reasoning about data gotten through a web scraper being somehow subject to fees, and especially in how they want to "protect" the name of the data company.
Regardless, if it was legit then I'd strongly agree with the suggestion of setting up a shell company to not only avoid litigation but to set a trend in teaching these bully companies how the internet works since it seems a few still have to be taught. Anyone using a c&d against a smaller company for data that can be gotten through a web scraper really should fuck off. The UK does seem to be the place where this kind of shitty IP litigation takes place more than is typical.
From their FAQ I think these are the key bits of information:
>What does the data company do?
>The data company digitises and maintains the product information for 98% of grocery products in the UK. They’ve generously worked with us to bring this cost as low as possible.
>Why are you not giving the name of the data company?
>Maintaining this data for over 200,000 products is a costly endeavour and asking them to offer this for free would be a huge ask. We appreciate that the data company is absolutely entitled to request compensation for their work. We know that providing the name of the data company could bring negative implications for them which we wouldn’t want - especially with how generous they’ve already been.
A lot of e-commerce software has data feeds that are ingested by parties like pricespy. These feeds are typically either public or made available after mutual agreements. Scraping a large number of sites for data is just too much labor in the long run.
My guess is that they were circumventing free-tier API limits for the data-aggregation company. It’s not like this data aggregation is one big dump that they downloaded once and forgot where it came from. Prices change daily. They’d need to be refreshing via an API (data feed) at least daily.
Trolley scraped the supermarket websites, and the supermarkets got their data (product images, nutrition info, etc.) from Brandbank (the data aggregation company).
> Maintaining this data for over 200,000 products is a costly endeavour and asking them to offer this for free would be a huge ask.
No. Whoever it is has already been paid in full for the data they've supplied. (As someone pointed out, it's probably Brandbank: they supply most of the supermarkets with data, so if it isn't them, then it's likely a scam.) If they're charging you again for the same data, that's just greed (sorry, I mean "perfectly valid capitalism") - even at 14p/entry or whatever they're charging.
Some people here have mentioned that facts can't be copyrighted - it reminds me of a problem in 2011, when the people distributing timezone data to a bunch of open source projects were sued because a 3rd party software house "owned" the rights to that data[1]. I don't know where that landed (timezones still work in Linux, so I think they found a workaround), but most projects like this capitulate rather than risk the cost of court.
But yes, the product descriptions and images may be copyrightable - certainly, Brandbank do all their imagery in-house, and any typos in the descriptions will be owned by them, because some person being paid peanuts - likely somewhere in the Philippines - is having to sit down with a photo of the can of Campbells soup, reading and re-typing the descriptions word for word.
It seems to be related to the product images, or the images were easiest to identify and fight over https://www.trolley.co.uk/imgs/cease-and-desist-letter.png
While a product name, price could be public (a person can see it in a store), the picture is very specific and trolley didn't take the pictures.
It seems they could also just cease use of the 'offending' information and use generic imagery or descriptions instead. For bandwidth/processing reasons, I'd prefer as much text and as few graphics as possible anyway.
[1] is a report of Tesco reducing the range of distinct products they stock from ~90,000 to ~63,000. Some of which will be seasonal items like easter eggs. Some others will only be available in certain stores.
I suspect taking 63,000 product photos, and keeping them all matched up to the right products is probably more than a week's work.
They could but at this point the (claimed) damage is already done.
The price is high, but I'd imagine that company guarantees high resolution professional photos of every product. Chasing the last remaining 10% of products might be hard. At 160,000 products, let's say spending 1 minute per product (finding it, post production etc) at 12h/day, it takes one person 9 months.
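A rough check of that estimate, using only the figures quoted in the comment. The one thing added is the number of working days per month, which is my assumption and is labelled as such in the code.

```python
# Figures taken from the comment above.
products = 160_000
minutes_per_product = 1
hours_per_day = 12

def person_days(products, minutes_per_product, hours_per_day):
    """Total working days for one person at the given pace."""
    return products * minutes_per_product / 60 / hours_per_day

days = person_days(products, minutes_per_product, hours_per_day)
months_every_day = days / 30  # working every calendar day
months_weekdays = days / 22   # assumption: about 22 working days per month
print(round(days), round(months_every_day, 1), round(months_weekdays, 1))
```

That comes out at a bit over 220 twelve-hour days, so somewhere between seven and ten months depending on how many days off you allow. The "9 months" figure is in the right range.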
I'm also slightly confused.
If they were scraping the information/pictures off the supermarket sites, then I'd have expected the cease and desist to come from the supermarkets.
Given that the letter came from "a company", I presume they were taking their information from an aggregator - and it seems entirely fair that you should "pay the aggregator" (as there's clearly a company out there doing something similar to what they're doing - but they were taking its data).
If this company was just a data-hose, then I'd have thought you could monetize your consumer-focussed product, simply by flogging anonymized data from your users back to the supermarkets. "Customer X, dropped product Y from their weekly basket (or swapped supermarket), when you raised the price of Z (or other supermarket dropped it)"
> ...then I presume they were taking their information from an aggregator - and seems entirely fair that you should "pay the aggregator"
This seems to be what they are doing. From their FAQ:
"Maintaining this data for over 200,000 products is a costly endeavour and asking them to offer this for free would be a huge ask. We appreciate that the data company is absolutely entitled to request compensation for their work."
In the UK, the aggregator holds the IP on their aggregation: https://en.wikipedia.org/wiki/Database_right
The Cease & Desist letter that Trolley received is here: https://www.trolley.co.uk/imgs/cease-and-desist-letter.png
You can see that the aggregator has said that Trolley scraped the images and data in question from the supermarket sites.
The implication seemingly being that the supermarket sites are licensing the images and prices of their own products from this third party (potentially only at a peppercorn price).
But this is only implied and not said, which in legal documents is an important distinction. I would suspect the supermarkets license the info on their sites to this aggregator, who seems put out that somebody scraped the same information and is not licensing it from them instead, and is implying Trolley needs to get a license, while in actuality perhaps they don’t need to.
I think that your interpretation is slightly off and Nielsen might have actual teeth here (again, to remind everyone, this is the UK; in the US Nielsen would be in the wrong). The supermarket owns the database rights to their online store, which it then exclusively licensed (as far as I know) to Nielsen for re-use by third parties. Regardless of your emotions here (and personally I'm disappointed that this is happening), this is UK law and Nielsen and the supermarkets are correct here. If Trolley had instead physically walked into stores, manually collected the data and bought products to shoot photos of*, then they would have teeth, as they would have exerted effort to independently create their own database.
* This is now plain copyright law which raises questions, like if the manufacturer-in-question can stop Trolley from publishing those photos (while Trolley owns the copyright on the photo itself, there are de minimis considerations when it comes to the packaging design which is copyrighted by the manufacturer). It might pass fair dealing though.
Then the onus is still on Nielsen to demonstrate an exclusive license. Otherwise they have no standing when Trolley scrapes the sites directly, because only the original copyright holder (the stores) may enforce that.
The pictures are the real red flag. Unless a photograph is bundled with a CC license, one always needs to assume somebody will need to grant permission for its reuse, and often that will be for compensation. Trolley should have known this.
> Trolley should have known this
It is entirely possible that Trolley were completely aware of this issue, but decided to ignore it and proceed anyway.
How often do we see the phrase "Ask forgiveness, not permission" get quoted on HN?
That idiom might be useful - until you stray near lawyers.
>>again to remind everyone this is UK, in the US Nielsen would be in the wrong
So then would not the simpler solution be to shut down UK operations and start up US operations via a US Legal entity?
I very much doubt the price of the product is coming from this company; everything else, sure. trolley.co.uk should explicitly ask them if the data aggregator provides the price to the website.
It doesn't matter if they also store that information (they might be scraping that themselves); if they don't provide it to the website, it's not infringing their database rights, it is infringing the supermarkets'.
Yes - the feed likely consists of things like the canonical product name, description, weight/size, recommended price, pack shots, and nutritional info.
Trolley.co.uk might be able to sidestep this demand by displaying ONLY the product names and prices, but it'd be a much less visually-appealing site as a result.
Is it aggregation if you're scraping from one site and reordering the items you've scraped?
IANAL but given the unreasonableness of the premise as I read it (? scraped data from public websites of multiple companies is "owned" by some mysterious third-party) this sounds more like a consequence of the UK's legal system, where minnows have to shut down when challenged by sharks, because the costs make it impossible to resist, regardless of the strength of the case.
Also, it's all a bit vague, especially the reason for the donations - to pay these jokers off? That just enables and encourages the bullying. They should provide more transparency as to who is claiming ownership, what evidence is being given, and what the legal arguments for paying are.
You generally don't own the copyright to images that you scrape from the web, even public websites. A number of companies handle consumer product information internationally (such as GS1 for barcodes) and shops license product information (images, EAN Codes, nutritional information) from them, so they don't have to deal with each manufacturer individually. The manufacturers also work with these organizations.
If they went into supermarkets and took pictures of all the products it would be another story, but as is, they are using pictures on the internet in a way that they aren't licensed to. This is not particular to the UK: just because some site displays a picture publicly does not mean you are allowed to display the same picture on your site.
That makes more sense than simply "pricing data" which is the focus of the article.
The cease and desist letter stated "proprietary images and data". I doubt a commercial/sponsored database of prices exists; that would have grave anti-cartel consequences. But you're right: for most reasonable people reading the article, the _only_ relevant data on trolley.co.uk _would_ be the price information. (And I don't have any insider knowledge, I was only guessing ...)
For the sake of 30k I wouldn't be surprised if Asda/Lidl/Tescos step in with the cash for a PR piece. Along the lines of "Being so confident they're the cheapest supermarket, we'll back a price comparison site" and helping 500,000 people to save some money.
I wouldn't be surprised, I'd be bl**y astonished. They are as tight as a gnat's arse, they screw suppliers out of pennies to improve their profits.
I'd be bloody surprised as well, Tesco's even charge some of their suppliers to access an online portal to upload supplier invoices! They had two portals for uploading invoices, one that was easier to use but cost, the other was a web browser interface that was clunky. I think one of them was called Marrakesh or something like that.
I was part of their new store opening team in the past and saw the way they undercut all other rivals in the area to get people in, which typically runs for a few weeks. Depots drop everything to make sure everything is in stock at the expense of other stores, even running dedicated lorries to the store, which is costly but came out of the Depot Transport budgets. Then Transport would argue with Warehouse if the Warehouse had failed to load cages properly, because Wave1/Wave2 picks never go smoothly, because goods-in is never to schedule, as no one can predict holdups on the road networks. It really is a cut-throat business in so many ways. I'm surprised at how many companies put up with Tesco's demands, but that's what you need to do to become a stock market leader! Likewise, the public are really just price sensitive before brand loyalty. Life in the UK is very much 'you get what you are given', and I've never worked for an entity that didn't have criminal elements either, which is shockingly two-faced of the public, but people keep their mouths shut to keep a wage coming in and every community needs an Emmanuel Goldstein figure. Massacres are a community's dirty little secret. So when is a simple reminder about general job losses, maybe a smattering of innuendo, not a veiled form of intimidation and harassment? Storytelling can be so triggering.
Even the legal system in the UK is criminal!
Which is exactly why they spend money on PR to increase those profits.
The news of this website shutting down isn't high profile enough to justify it.
If it was featured on MoneySavingExpert.com and Martin Lewis was ranting about it, then perhaps.
Seems very off and unsustainable (e.g. what happens in 2023 is unclear), along with hiding the licensor's name - that seems unreasonable. Who is the end recipient of this donation? I don't buy the story.
Set up a shell company in Bermuda that does the scrapping for you and then claim you buy the data from them. When they approach you just tell them you have an NDA and can't disclose the company. That'll keep them at bay for awhile... then when you're forced to disclose, they'll spend the rest of their time chasing a shell company in Bermuda.
OR.....
Just create a firefox extension that scrapes the pages for your users.
There are many ways to keep this going while giving them the middle finger and serving your users.
Being on the right side of the law is irrelevant if you can't afford to defend yourself in court.
It's not about winning in court, it's about not having the resources to go to court & fight it to begin with.
Also, laws only matter when the outcome doesn't matter to the people in charge. Just because "the law is on your side" doesn't mean you'll win even if you could afford it, because laws don't magically rule on their own. It is judges, with all their pettiness and cravenness and nepotism, who rule.
> That'll keep them at bay for awhile
This definitely would not work in the UK, and I don't know of many places it would. Asda does 22bn GBP revenue per year.
If your data infringes copyright they will go after you not your fictitious shell company because it doesn't matter where you got the copyright infringing material it matters whether you illegally distributed it.
No standing, others are allowed to create the same database by themselves.
Good luck with that!
Rule of thumb: as a baseline, you have absolutely no rights in life... that is, no rights unless you're able to enforce them.
If you're going to that much trouble why not host the entire thing anonymously from Bermuda or Russia or somewhere?
NB: It’s scraping, not scrapping.
The whole thing smells fishy.
While pictures would be under copyright (and it's possible copyright is with a third party specialised in this), I can't see how the price listed for a product on, say, Tesco's website and collected by yourself on Tesco's website could be subject to licensing by a third party, even considering database rights.
I am also puzzled by those guys' claim that they got a very "generous offer" from that company.
If they are unsure, a better first step would be to crowdfund legal advice.
They could then crowdsource taking pictures of all the products from their users.
Unless there is a third party which is taking photos of all grocery products and then licensing that to the supermarkets, to avoid them all having to take their own photos?
I used to work for one of the major supermarkets and am aware that there are providers of product images, but don't know the specifics in enough detail.
Taking a photo only gives you copyright over that photo, it doesn’t stop other people taking a similar photo.
A lot of businesses try to claim copyright of a subject (like a tourist attraction) and try to prevent photography of it, but that’s legal BS in any jurisdiction I’ve heard of.
No, I mean they take the actual photographs and then license them to the retailers. Plus they can also capture other product data in a consistent format etc.
See here as an example: https://www.brandbank.com/content-licensing/
So for instance you could scrape a website that licensed images from Brandbank above, and then fall foul of their copyright.
Brandbank as an example does work with UK supermarkets, e.g. see below:
https://www.brandbank.com/tag/sainsburys/ https://www.brandbank.com/tesco-case-study/
Right, but if you're making a price-comparison website with scraped data, you can't scrape the product image from a supermarket website legally.
You could of course get images from some other source - but you need meticulous organisation when a single brand of tea might have 40, 80, 160, 240 and 600 bag packages.
This assumes that the supermarket took the photos - what I mean is the supermarket might buy those product photos under license.
You absolutely can scrape a product image from a supermarket website legally. Why would you think otherwise?
Whether you can use those images commercially is another matter entirely.
Well it seems they're not scraping Tesco's website but some aggregator middleman. Like making a package tracking website using Parcels instead of FedEx, UPS, etc individually. Sure it's convenient to write code that only scrapes one website, but now that website is mad because it has a business selling that data.
It's almost certainly the photos that are the issue, not the actual price data, which can't be copyrighted.
It looks like a startup that's hit a brick wall vis-à-vis licensing pictures.
One way of getting around this, albeit perhaps at the expense of user experience, is to insert stock photos. E.g. I imagine 'carrots' is not particularly unique. Where stock photos can't be used, i.e. it's a specialist/one-off product, perhaps the classic white-box-black-text approach?
If this is a genuine site with good intentions, staffed by volunteers then it would be a shame to see it stamped down by the supermarkets.
Better still, use an AI to generate a stock image until one is available.
As others have said, it seems fishy.
They've got this data from somewhere, so why wouldn't they have looked through the license before using it?
Anyway, speaking of MySupermarket, what happened to them? All I know was one day they decided to shut down, without clear reasons. Does anyone have more insight into why it happened?
> MySupermarket, what happened to them?
iirc they were bought out by a supermarket - presumably to prevent customers from realising their prices weren't competitive
See https://www.supersmartlist.com/why-did-mysupermarket-fail/ for some analysis (although it doesn't mention a buyout).
Yeah, I'm scratching my head at this. On the one hand, the UK does have some weird database trolls (a copyright troll called FootballDataCo claimed licence fees from anybody publishing football fixture lists irrespective of where they sourced the information from, and probably can do again since the EU court judgement against them presumably no longer applies)
On the other hand it makes no sense to accompany a supposed order to c&d screen scraping of multiple (independently maintained) websites with thanks for their generosity and "this company is absolutely entitled to request compensation for their work". Either they're a valuable data provider or a licensing obstacle to using perfectly adequate screen scraping techniques, not both. Not sure why a price comparison website would screen scrape the supermarkets with APIs either...
They're not "independently maintained" websites, they all license their data from the same 3rd party data company and re-publish it on their own websites.
They got it from web scraping, doesn't exactly have a license I guess...?
You can't copyright a list of facts. I'm confused about what they have run in to? Maybe they should take some of the money they have raised and talk to a lawyer.
The cease & desist mentions Asda [1] and [2] says "Asda chose NielsenIQ Brandbank, their existing digital content provider for 13 years"
So I'd wager the C&D came from Brandbank - who presumably supply product photos and product data (barcode, pack size, and all the other data like nutritional information you'd find on the packet if you were browsing in store)
[1] https://www.trolley.co.uk/imgs/cease-and-desist-letter.png [2] https://www.brandbank.com/asda-accelerates-their-rich-conten...
Nobody said it was copyright specifically. However:
"A database right is a sui generis property right, comparable to but distinct from copyright, that exists to recognise the investment that is made in compiling a database, even when this does not involve the "creative" aspect that is reflected by copyright.[1] Such rights are often referred to in the plural: database rights."
https://en.wikipedia.org/wiki/Database_right
Yes, but it doesn't prevent you from building the exact same copy as long as you don't use the other DB as a source. Scraping the info yourself is perfectly fine. That's also the reason why people add "mistakes" to their DB as a kind of watermark.
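To illustrate the watermark idea: a minimal Python sketch of how a database owner might check a suspect dataset for deliberately planted "mistakes". The function names, the normalisation rule and the sample product strings are all hypothetical, made up for illustration; nothing here is known to be how any real data company does it.

```python
def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial reformatting does not hide a match."""
    return " ".join(text.lower().split())


def find_trap_entries(candidate: list[str], traps: list[str]) -> list[str]:
    """Return the candidate entries that match a deliberately planted trap entry."""
    planted = {normalise(t) for t in traps}
    return [entry for entry in candidate if normalise(entry) in planted]


# Hypothetical data: the owner planted one misspelt description as a watermark.
traps = ["Tomato Soupe 400g"]
scraped = ["Tomato Soup 400g", "tomato  soupe 400G", "Baked Beans 415g"]

print(find_trap_entries(scraped, traps))  # ['tomato  soupe 400G']
```

A hit only suggests the entry was copied somewhere along the chain. It doesn't say whether it came straight from the owner's DB or via a licensee's website, which is the open question in this thread.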
But they are (unwittingly) using the other DB as a source. The data they're scraping turns out not to have been created by the supermarkets in question but sourced from a third party database and published under license on the supermarket websites.
Which raises a very interesting legal question: If the data is recompiled, despite having come from another source via an intermediary (who are using that data to describe the products they are selling), is it still subject to the database rights?
How about we try out a reductio ad absurdum: I'm going to start a startup. Our business model is to license the information on customers' websites, and then license it back to them in turn. This then means that anyone who uses our services in the UK can sue for copyright on lists of facts obtained by screen scrapers, as they've been through our database copyright washer. Think flight prices, hotels, cars, anything.
Is that a viable business model? If not, why not?
> is it still subject to the database rights
Someone is creating a phone book as a service to companies.
Elsewhere, these companies are publishing their phone numbers to get more customers.
Then the phone book creators demand license fees anywhere these phone numbers appear in lists.
It's very dubious "database rights" come into play at all.
That's a better demonstration of the principle than I managed, actually. It does seem faintly absurd, unless there's something else involved.
That's like saying you can write a word for word copy of Harry Potter and the Goblet of Fire as long as you don't use the book as the source.
No, because Harry Potter is not a collection of facts. Prices of goods are.
Enough monkeys and enough time..
And I thought IP law in the states was bad.
You absolutely can copyright lists of facts in most jurisdictions outside the US
In the US it is pretty hard[1], but most other jurisdictions find that if the list has some "creative" or "work" element (i.e., it isn't just a list of everything) then it can be under copyright (e.g. [2]).
[1] https://www.justia.com/intellectual-property/copyright/lists...
[2] https://www.smh.com.au/national/list-makers-on-solid-copyrig...
That's true in the USA but this is in the UK.
Yes, I could understand it (while also somewhat disagreeing) regarding photos, but "and data"? What does that entail?
You can if you curate the list. Curating requires collecting and processing information, which often can be expensive.
Not in the US you can't. Even if you curate the list, a list of facts can never be copyrighted, only creative expression can be copyrighted.
But other countries have different rules.
See here for all the details: https://en.wikipedia.org/wiki/Feist_Publications,_Inc.,_v._R....
So I can just buy a Bloomberg terminal, start broadcasting real time stock prices and collect a fee?
Different situation. You don't buy a terminal, you license it. The license can dictate what you're allowed to do with the data. Rebroadcast the data and you may be just fine on copyright or any other IP, but you violate the license so Bloomberg cancels it.
Bloomberg may not allow you to redistribute those same prices and terminate your membership.
But stock prices are pretty much public knowledge (no one is coming after you for licensing fees for quoting a price to someone)
Delayed prices for publicly traded stocks are generally available freely. Real time prices and order book though are not, not to mention prices for anything traded OTC like many bonds, swaps, etc.
Certainly, unless you sign some kind of contract with Bloomberg. (Realistically they probably make you sign.)
Stock prices can not be copyrighted.
Yeah, things may differ b/w countries. But the Wikipedia article you linked mentions this:
> In regard to collections of facts, O'Connor wrote that copyright can apply only to the creative aspects of collection: the creative choice of what data to include or exclude, the order and style in which the information is presented, etc.—not to the information itself.
So, a list of facts can still be creative and valuable to society. IANAL, but such works should be protected by copyright, or what's the point of copyright in the first place.
Copyright is not like patents. If you arrive at the same result, by a different method, then even if the result is identical, there is no copyright violation.
So "the creative choice of what data to include or exclude, the order and style in which the information is presented, etc." is very weak. You can copy all the data, and then make your own choice on what to include (for example "everything I can find"), and you would not be violating the copyright.
The quote you gave is for things like making artwork based on certain patterns or letters in the data. But it won't help you if your creative choice is "everything" because there is no creativity there.
> So, a list of facts still can be creative and valuable to the society
A list of facts is not creative, it's simply work. And yes, it might be valuable, but that's not how we decide if something is copyrightable.
> or what's the point of copyright in the first place.
For creativity, rather than effort.
What that means is that if you write a list of e.g. “Top 10 tennis players in my opinion”, then you can hold copyright over that. But if you write “Top 10 tennis players by rating” then you can’t as that’s not a creative expression.
I think just collecting also counts.
I don't think you can own rights to something other people collected and curated though. Though you can own rights to the individual components, but in this case that would be the specific price of something, which doesn't seem like it should be subject to copyright.
> You can't copyright a list of facts
Says who?
The images are obviously going to be copyrighted, I suspect the product information is also likely sourced from a third party, and thus you would be infringing database rights by scraping.
I find it very unlikely that the price of the products is provided by a third-party data provider, and think the actual price is not covered by them despite them suggesting it might be.
> The images are obviously going to be copyrighted
Copyrights are inadvertent and sometimes unintuitive. If the images are created by the product manufacturers, then they hold the copyrights. If the supermarkets edit the images, then they hold copyrights to these edits.
Likewise, if trolley edits the images (undoing the supermarket edits, like cropping around some text) then trolley together with the manufacturer holds the copyright for this new image, not the supermarket.
It doesn't matter where it was scraped from or who was hosting the image, just as it doesn't matter which camera was used.
Yes, if they scrape prices from supermarkets' websites then prices come from supermarkets, they are the ones setting those prices after all.
Really, they don't need product data beyond the name. They could drop pictures and detailed data and build their own database by crowdsourcing pictures and data from their users.
The "data company" does not "own" the data. Especially not if it was collected independently by trolley.co.uk.
An old case that is sometimes used to illustrate how copyright works: https://en.wikipedia.org/wiki/Feist_Publications,_Inc.,_v._R....
That's not really relevant for the UK, where indeed there is such a thing as database copyright.
Only relevant if they are actually duplicating the "data company's" database; which is not what web-scraping is.
For example, creating a map from aerial photographs is not "duplicating" another geocoded database. The geo data exists independently of any map.
"Database rights" are not the same as "data rights", and if your "database" is inadvertently and routinely recreated as a by-product of other processes, then "database rights" cannot apply. Otherwise we are in an absurd world.
(In this absurd world, a phone book company could charge license fees to everyone with a phone because their phone books would necessarily contain a subset of the global phone book.)
The UK has FootballDataCo which historically "owned" the fact a football match between two clubs will be held on a particular date, irrespective of how the publisher found out that fact. So UK law has not always been sensible about this.
The EU Court of Justice threw that out, but we're not in the EU any more...
This use case seems very similar to a search index (aka google) type use case. Crawling and indexing vs scraping and storing, and then publishing in a searchable format for end users to utilise? Surely there's plenty of precedent for this?
It is, but arguably it's closer to the Google News model (in that more than just the titles are shared) which recently saw some pushback in the EU [1]
In general, if the data owner says you must take it down, you rarely have a leg to stand on unless you can claim some kind of protected use (e.g. reporting or criticism)
[1] https://www.mondaq.com/copyright/53690/google-news-service-i...
The title is misleading. Never do they mention the price data. After doing some research it appears that what is really the cause for concern are the product images.
> Never do they mention the price data. After doing some research...
Can you share your research? Because the cease and desist letter they link to clearly says "images and data"...
The "data" they mention are product descriptions. The issuer of the cease and desist appears to be NielsenIQ. And they don't provide price data. Their main product is to provide standardised product images with a white/transparent background, and small product descriptions. Their customers being big E-Commerce stores.
yeah it's a scam just don't donate to these people
They should ask people to take pictures of the products for them, instead of raising 30k USD/year to send the money to a Fortune 500 company.
As others have pointed out, it seems sketchy as hell, the desperate last-minute plea for a considerable sum of cash, the weird reasoning about data gotten through a web scraper being somehow subject to fees, and especially in how they want to "protect" the name of the data company.
Regardless, if it was legit then I'd strongly agree with the suggestion of setting up a shell company to not only avoid litigation but to set a trend in teaching these bully companies how the internet works since it seems a few still have to be taught. Anyone using a c&d against a smaller company for data that can be gotten through a web scraper really should fuck off. The UK does seem to be the place where this kind of shitty IP litigation takes place more than is typical.
From their FAQ I think these are the key bits of information:
>What does the data company do?
>The data company digitises and maintains the product information for 98% of grocery products in the UK. They’ve generously worked with us to bring this cost as low as possible.
>Why are you not giving the name of the data company?
>Maintaining this data for over 200,000 products is a costly endeavour and asking them to offer this for free would be a huge ask. We appreciate that the data company is absolutely entitled to request compensation for their work. We know that providing the name of the data company could bring negative implications for them which we wouldn’t want - especially with how generous they’ve already been.
Doesn't make a whole lot of sense to me... Aren't there tons of companies like Trolley that scrape data, etc. without issues?
Like, I assume these do: PriceSpy, PriceRunner, PrisJakt, Shopping.com, PriceGrabber, etc.
A lot of e-commerce software has data feeds that are ingested by parties like PriceSpy. These feeds are typically either public or made available after mutual agreements. Scraping a large number of sites for data is just too much labor in the long run.
My guess is that they were circumventing free-tier API limits for the data-aggregation company. It’s not like this data aggregation is one big dump that they downloaded once and forgot where it came from. Prices change daily. They’d need to be refreshing via an API (data feed) at least daily.
Trolley scraped the supermarket websites, and the supermarkets got their data (product images, nutrition info, etc.) from Brandbank (the data aggregation company).
There's a lot in here that's infuriating.
> Maintaining this data for over 200,000 products is a costly endeavour and asking them to offer this for free would be a huge ask.
No. Whoever (as someone pointed out, probably Brandbank, they supply most of the supermarkets with data so if it isn't them, then it's likely a scam) has already received full pay for the data they've supplied - if they're charging you again for the data, that's just greed (sorry, I mean "perfectly valid capitalism") - even at 14p/entry or whatever they're charging.
Some people here have mentioned that facts can't be copyrighted - it reminds me of a problem in 2011 that the people distributing timezone data to a bunch of open source projects were sued because a 3rd party software house "owned" the rights to that data[1]. I don't know where that landed (timezones still work in linux, so I think they found a workaround) but most projects like this capitulate rather than risk the cost of court.
But yes, the product descriptions and images may be copyrightable - certainly, Brandbank do all their imagery in-house, and any typos in the descriptions will be owned by them, because some person being paid peanuts - likely somewhere in the Philippines - is having to sit down with a photo of the can of Campbells soup, reading and re-typing the descriptions word for word.
It's just... ugh.
[1] https://www.wired.com/2011/10/time-zone-data-lawsuit/
Once these guys pay the £29K, you can be sure next year the fee will go up 5-10x
Sounds like extortion…
can publicly available data be the property of someone?
It seems to be related to the product images, or the images were easiest to identify and fight over https://www.trolley.co.uk/imgs/cease-and-desist-letter.png While a product name, price could be public (a person can see it in a store), the picture is very specific and trolley didn't take the pictures.
It seems they could also just cease use of the 'offending' information and use generic imagery or descriptions instead. For bandwidth/processing reasons, I'd prefer as much text and as few graphics as possible anyway.
Unfortunately, supermarket products would be difficult to distinguish without imagery.
even then, why does it cost 28k PER YEAR, for photos of product images? Can't you just spend a week in a market taking product photos yourself?
[1] is a report of Tesco reducing the range of distinct products they stock from ~90,000 to ~63,000. Some of which will be seasonal items like easter eggs. Some others will only be available in certain stores.
I suspect taking 63,000 product photos, and keeping them all matched up to the right products is probably more than a week's work.
[1] https://www.theguardian.com/business/2015/jan/30/tesco-cuts-...
They could but at this point the (claimed) damage is already done.
The price is high but I'd imagine that company guarantees high resolution professional photos of every product. Chasing the last remaining 10% of products might be hard. At 160,000 products, let's say spending 1 minute per product (finding it, post production etc) at 12h/day it takes one person 9 months.
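A rough check of that estimate, as a Python back-of-envelope sketch. The product count, minutes per product and hours per day come from the comment above; the 30.4 calendar days and 21.7 working days per month are my own assumptions.

```python
def person_days(products: int, minutes_per_product: float, hours_per_day: float) -> float:
    """Days one person needs at a fixed time per product and fixed hours per day."""
    return products * minutes_per_product / 60 / hours_per_day


days = person_days(160_000, 1, 12)
print(round(days, 1))  # 222.2

# Assumed conversions, not from the thread:
months_no_days_off = days / 30.4    # working every calendar day
months_weekdays_only = days / 21.7  # working weekdays only
print(months_no_days_off, months_weekdays_only)
```

So it comes out at roughly 7 months with no days off, or about 10 months on a weekday schedule. Either way it's the same ballpark as the 9 months quoted, and far more than a week.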
Crowdsourcing the task might work.
The Google logo is publicly available on Google’s website. Does anyone own the Google logo?
Depends. If I spend a ton of time and effort to aggregate data from public sources and license it out, my dataset can still be considered proprietary.
Depends on your country / jurisdiction
Practically, no. Especially when different jurisdictions etc. become a factor.
Sadly they forgot to put an easy-to-find email address on the page. Can't be bothered to help them out via gofundme.com
move to a jurisdiction with more reasonable IP laws