I don't see any Arabic literature. Curious whether that's due to a lack of actual digital/OCR text or a lack of availability of the books in PDF/EPUB formats.
Public request: anybody here who hates Anna’s and wants to make a principled complaint about it? I love it and the idea of it so much, but I imagine some feel differently and I’d like to hear your best takedown shot.
Judging by your profile location being in the Netherlands, I think you are being misled by the generic Ziggo ISP block page[1], which lists Russia Today and Sputnik News and then, in another post, ThePirateBay.
In this case, the ISP blocked it because the website is Anna's Archive [2], which was blocked around a year ago, but they have not made a post about that.
If you put "pcm." in front of the link it will work (for now)
You should probably edit your post, so as not to misinform. But I have to admit this confusion stems from bad decisions at the ISP.
It seems that editing is not possible due to the negative point I gathered, which is weird, as I just reported that I cannot view it. People seem to view everything through a political lens now.
But thank you for your information; I saw the post.
The winning submission [0] was discussed on HN recently [1]. It's highly impressive from both a technical and a graphic design viewpoint; it somehow elegantly visualizes the space of 2 billion possible ISBNs (in a way that resembles a bookcase, no less).
[0]: https://phiresky.github.io/blog/2025/visualizing-all-books-i...
[1]: https://news.ycombinator.com/item?id=42897120
I'm slightly surprised mine won 3rd place; I believe they liked its simplicity and visualisation. It's hosted at https://isbnviz.pages.dev
But honestly, I find both of these better:
- https://bwv-1011.github.io/isbn-viewer/
- https://anna.candyland.page/map-sample.html
In particular, the one from bwv is technically similar but just all-around better than mine; it is what I would want mine to be.
I'm also surprised that I got 3rd place.
But comparing yours to bwv's, I don't agree that bwv's is technically superior in every way. It lacks comparison, ISBN selection and link creation. bwv's main focus looks to be the one feature of highlighting rare books, without trying to meet the other requirements that AA wanted.
Congrats to you too! Indeed, I think they could have improved the visual and comparison part; it's a bit dark and not too interesting to look at. But I am envious of how smooth their tiling is. My tiles are 4096x4096, which allows me to satisfy both the 20,000 file limit and the max 20 MB file size limit imposed by Cloudflare. I had some issues with smaller tiles, and wanting to host it on Cloudflare restricted me from doing 512x512 tiles, iirc. Also, I really like that they extracted the publisher information and put that in as a PMTiles vector layer; that's something I attempted but ultimately ran out of time with.
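To put rough numbers on the tile-size trade-off mentioned above, here is a minimal Python sketch. Everything specific in it is an assumption rather than a detail of any actual entry: it assumes one pixel per ISBN on a 65536x32768 grid (2^31 cells, enough for roughly 2 billion ISBNs) and a zoom pyramid that halves the resolution per level. `pyramid_tile_count` is a hypothetical helper.

```python
import math

def pyramid_tile_count(width, height, tile):
    """Count tiles in a zoom pyramid that halves resolution per level
    until the whole image fits in a single tile (assumed layout)."""
    total = 0
    while True:
        cols = math.ceil(width / tile)
        rows = math.ceil(height / tile)
        total += cols * rows
        if cols == 1 and rows == 1:
            return total
        width = math.ceil(width / 2)
        height = math.ceil(height / 2)

# Assumed grid: 65536 x 32768 pixels, one pixel per ISBN.
for tile in (512, 4096):
    print(tile, pyramid_tile_count(65536, 32768, tile))
```

Under these assumptions a single layer needs about eleven thousand 512x512 tiles, so two or more data layers would already exceed a 20,000 file limit, while 4096x4096 tiles stay far below it.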
What makes yours and bwv's have a floating island with Spain/Italy/etc., in addition to them being represented in the main blob?
It's due to how those ISBN ranges were handed out. I think they probably gave a block like 978-53 (for example) to those countries, meaning the right to distribute ISBNs 978-530-000-000 to 978-539-999-999. Later they ran out, or had all sub-blocks distributed to publishers, and got a new block further away (so not 978-54 in my example). Those blocks are not numerically close to each other, and thus they also form separate "islands" in the Hilbert space.
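The block arithmetic in that explanation can be sketched as follows. This is only an illustration: `prefix_range` is a hypothetical helper, the numbers are treated as 12 digits (the ISBN-13 without its check digit, matching the example range above), and 978-61 stands in for an arbitrary "new block further away", not a real assignment.

```python
def prefix_range(prefix, digits=12):
    """Return (first, last) number covered by a prefix such as '978-53',
    counting ISBNs as 12-digit numbers without the check digit."""
    p = prefix.replace("-", "")
    if not p.isdigit() or len(p) > digits:
        raise ValueError(f"bad prefix: {prefix!r}")
    span = 10 ** (digits - len(p))
    first = int(p) * span
    return first, first + span - 1

first_block = prefix_range("978-53")   # the example block from the comment
later_block = prefix_range("978-61")   # hypothetical later block
gap = later_block[0] - first_block[1] - 1
print(first_block, later_block, gap)
```

The unrelated numbers in the gap between the two blocks are what separates them on the map.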
I see, thanks for explaining. Cool that your visualization then shows these idiosyncrasies!
Thanks! That is indeed all thanks to using the Hilbert curve, a fractal with the property that it maps numbers which are close together onto 2D (or higher-dimensional) coordinates which are close together. It's a very cool property, and it's used in lots of contexts for that reason.
I’m glad you said that, because I was also surprised that bwv-1011 only made it to an honorable mention even though its technical focus was on visualizing the rarity of books, which ostensibly was the primary objective of the whole effort.
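For readers who want to see the locality property concretely, here is a minimal Python sketch of the textbook index-to-coordinate conversion for a Hilbert curve. It is not taken from any of the submissions; the function name `d2xy` and the orientation of the curve are assumptions, and the real entries may lay the curve out differently.

```python
def d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n x n grid.
    n must be a power of two and 0 <= d < n * n."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join up.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Consecutive positions always land on neighbouring cells.
points = [d2xy(16, d) for d in range(16 * 16)]
steps = [abs(x1 - x0) + abs(y1 - y0)
         for (x0, y0), (x1, y1) in zip(points, points[1:])]
print(max(steps))
```

Note that the guarantee only runs one way: numbers that are close end up close on the grid, but two neighbouring cells can still be far apart along the curve.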
I really like that your page talks about _why_ a Hilbert curve is good. I don't remember ever learning about those before, and now hopefully if I'm ever trying to visualize 1D data, I might remember that :)
Fascinating. It allows for some interesting observations when you, as I did, zoom in on this one (sadly no direct links to coords/zoom level): https://archive.anarchy.cool/maps/isbn.html You can find publishers like Hueber Verlag[1] in the eastern part of the German language section. They spread their ISBNs in a pattern with something like 1360000 between them (I know, ISBNs having a checksum leads to gaps in the numbering), which generates a repetitive pattern with plenty of empty space. It is so wasteful of the huge chunk they have.
Are there no rules on how publishers have to assign their numbers? Just so they could hand back an unused block if they don't need it any longer.
[1] I can see how publishing learning material in 30 languages can give people "ideas" when assigning ISBN numbers https://de.wikipedia.org/wiki/Hueber_Verlag
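On the checksum remark above: the last digit of an ISBN-13 is a check digit computed from the first twelve, using alternating weights of 1 and 3. A minimal sketch follows; the helper names are hypothetical, and whether a given visualization indexes ISBNs with or without the check digit is an assumption that would need checking against each entry.

```python
def isbn13_check_digit(first12):
    """Compute the ISBN-13 check digit from the first twelve digits."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... from the leftmost digit.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                for i, ch in enumerate(first12))
    return (10 - total % 10) % 10

def full_isbn13(first12):
    """Append the check digit to a 12-digit ISBN body."""
    return first12 + str(isbn13_check_digit(first12))

# Consecutive 12-digit bodies give non-consecutive 13-digit ISBNs.
for body in range(978030640615, 978030640618):
    print(full_isbn13(str(body)))
```

Because only one in ten 13-digit numbers is a valid ISBN, consecutive assignments are never consecutive as full 13-digit numbers. That alone would not account for a spacing as large as the one described in the comment, though.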
This is amazing.
One thing I found odd.
I searched for 'Stubborn Attachments' which worked.
On the same bookshelf there are several other Stripe Press books.
One of them is called Zero to One Hundred, by Stephanie Friedman.
When you search that book on Amazon, it has a different title, which I guess is reasonable as the book hasn't been published yet and they may not have finalized the decision: https://a.co/d/bQX5CNf
Here's where it gets weird:
- if you search for the book 'Zero to one hundred' (the title shown on the 'shelf') it doesn't come up
- if you search for the book by its ISBN, it does come up, but the name displayed in the search results is yet another alternate title. And the bookshelf displays that title. So the same part of the bookshelf looks different depending on what you searched for.
I haven't yet read the blog post about how this impressive visualization works, so I don't have an idea of why this is the case.
I don't think it's the tool that's the issue, I think it's the book itself?
If you search the ISBN on the web, you'll get "Zero to One Hundred" with the cover of "Built to Grow" and vice versa.
There's also "Experiment, Build, Scale" which is the book that the visualisation shows, also with the same ISBN attributed to the previous two.
Experiment, Build, Scale seems to be the only book of Stephanie's that is in Google Books while Worldcat has "Zero to One Hundred" with the cover art for "Built to Grow".
Most of the online bookstore pages have this mess so I wouldn't blame the tool for what seems like an upstream data quality issue.
Sorry I didn't mean to make it seem like I think the tool is at fault.
I just think it's interesting that the book title shows differently on the shelf depending on whether you reach it via an ISBN search, vs. if you discover it by panning from a nearby book.
> Most of the online bookstore pages have this mess so I wouldn't blame the tool for what seems like an upstream data quality issue.
I think that's an uncharitable read of the GP's comment. I read it as curiosity about how the upstream data issues present in the tool, which also interests the part of my brain that likes to solve minor mysteries.
I feel like visualizations of large datasets which are viewer-directed (i.e. they want you to "explore" the data instead of trying to tell you something specific about it or communicate a narrative) are often "pretty" but never particularly enlightening. I feel like that holds true for these in particular.
The thing is, ISBNs map to:
- publisher
- assigned title
- (roughly) order of publication
That's all that they communicate --- there is no hierarchy here to aid in discovery or to organize the content (and further complicating things, the same text may appear multiple times in a different binding --- a differentiation which is immaterial to an e-book).
The elephant in the room of course is the matter that "Anna's Archive" is not a legitimate book repository, but a piracy site, so what they are showcasing is how compleat (and brazen) their theft (and attendant lack of compensation) is.
This would be far more interesting if it were based on an hierarchical system such as LoC, and instead afforded an interface for accessing legitimately available books as are available from https://www.gutenberg.org/ or listed at: http://onlinebooks.library.upenn.edu/ or worked on at: https://www.wikibooks.org/
> The thing is, ISBNs map to:
> - publisher
> - assigned title
> - (roughly) order of publication
I assume the task isn't just to visualize isbns literally. Presumably you are allowed to cross reference with other data.
> The elephant in the room of course is the matter that "Anna's Archive" is not a legitimate book repository, but a piracy site,
I think it's pretty clear that the target audience doesn't care. I don't think the target audience holding differing political views is really a valid criticism of the project. It should be evaluated in the context of the audience it was created for.
This is not a political stance, but one of basic questions of authorship and what compensation authors should receive and what control they should have over their work.
See arguments by Alexander Pope in Pope _V._ Curll.
> This is not a political stance, but one of basic questions of authorship and what compensation authors should receive and what control they should have over their work.
Questions of compensation and ownership are among the most political questions of all.
What exactly do you think communist revolutions were revolting over?
When China decided to ignore Western copyright wholesale in the digital age, the equation changed, IMHO.
Yes, but dealing with that politically would be made easier by having the moral high ground.
Not really, because it depends on the basis of morality. In fact, this 'morality' problem is shown in the existence of libraries in the US.
Is a book a collective good? Or property? In the US the answer is 'both' in an awkward way. But the US does know that having books behind a paywall is not in society's best interest.
And in reality 99% of the books will never be read, which makes their 'value' as property suspect.
If so few books are to be read, then why is it so difficult to pay for those which are?
> This would be far more interesting if it were based on an hierarchical system such as LoC, and instead afforded an interface for accessing legitimately available books
Isn't this exactly what Open Library does?
Given that "Textbooks" are separated out, that "Animals", "Childrens' Books" and "Health & Wellness" are top-level categories, and that it mixes in books which are not available for download: not really.
The UI is not all that great either.
I would like to see:
- an hierarchical list with a hierarchy which actually makes sense and truly organizes knowledge
- of legitimately available downloadable books
- which has a nice UI
but it's far more important that LLMs have training data without consideration of recompense than any other consideration.
That's my issue with attempts to 3D-ify viz. Unless you are actually modeling a 3D volume, like medical imaging or CAD, the added "forced exploration" of 3D simply hides insights.
I had a Pavlovian response to reach for the defrag program at first sight of the top image.
Win 98 had the best animation. Pity everything after that was dogshit.
This was great fun to enter nevertheless, congrats all involved.
My entry is still live for now for anyone curious:
https://d199hl4t3ts6d9.cloudfront.net/
I'm curious why there's no clear "Spanish" in these ISBN visualizations; there's 2 slots for English, one for France, Germany, Japan, Soviet Union, China, etc. but no big one for Spain. Do we really have so few books in Spanish? Or is this a predominantly English distribution?
I say this as someone who grew up in Spanish libraries and book shops, surrounded and immersed in Spanish books, so it feels a bit strange to see the tiny bit we occupy in the world map here.
The dataset consists of books from Anna's Archive, each identified by an ISBN. The ISBNs and titles are extracted from datasets [1], which include magazines and books primarily in Chinese, English, and French.
Example: Germany publishes five times more books than the Netherlands [2], and Spain publishes twice as many books as the Netherlands. However, in the visualizations, Germany appears similar in size to the Netherlands, while Spain and Mexico do not align with the high-level labels [3].
[1] https://annas-archive.li/datasets
[2] https://internationalpublishers.org/wp-content/uploads/2023/...
[3] https://software.annas-archive.li/AnnaArchivist/annas-archiv...
>I'm curious why there's no clear "Spanish" in these ISBN visualizations
I had the exact same question, and I do have a completely unsupported theory. There's one large block that appears to be Argentina, or possibly Peru, although their titles are on the fringes of the large block. The block is otherwise unlabeled, with no name sitting at the center of the block like you see with the other major ones. I would be slightly surprised if it were entirely Argentina, but it would make a lot of sense if that block were Spanish.
My most sincere love to all shadow libraries out there, you're doing god's work.
They do half of the work (which is a helluva lot)... the other half is done by the volunteers that digitize books.
I was looking at my country's "shelf" and it's so sad to see so many missing titles. I almost wanted to go to my local library and digitize some of them. The old ones that are out of print and impossible to acquire right now...
So much knowledge lost.
To be fair, the authors of the books also contribute quite a bit.
Does Anna’s Archive track and account for duplicate ISBNs?
https://scis.edublogs.org/2017/09/28/the-dreaded-case-of-dup...
The winning submission kind of reminds me of the Eagle Mode file manager, where you can zoom into a directory to see the files in it and keep zooming to access subdirectories.
https://eaglemode.sourceforge.net/emvideo.html
Where is the database from? How, and how often, is it updated?
I have two self-published books with ISBNs. Neither of them shows up with details in the 1st place submission (I assume they won't be in any of the others either?).
One was published on Feb 23 and the other on Dec 24. I had hoped at least the older one would be there. Does anyone know why they are not?
The ISBNs:
- 9786500718836
- 9786501276830
From https://annas-archive.org/blog/all-isbns.html :
>We started mapping ISBNs two years ago with our scrape of ISBNdb. Since then, we have scraped many more metadata sources, such as Worldcat, Google Books, Goodreads, Libby, and more. A full list can be found on the “Datasets” and “Torrents” pages on Anna’s Archive. We now have by far the largest fully open, easily downloadable collection of book metadata (and thus ISBNs) in the world.
So your books would need to be present in one of the databases that Anna's Archive scraped, at the time they scraped it.
Noob here, but can someone explain like I'm five why this is important? It looks beautiful nevertheless.
I'll start off by quoting the winning submission.
> Libraries have been trying to collect humanity’s knowledge almost since the invention of writing. In the digital age, it might actually be possible to create a comprehensive collection of all human writing that meets certain criteria. That’s what shadow libraries do - collect and share as many books as possible.
> One shadow library, Anna’s Archive (which I will not link here directly due to copyright concerns), recently posed a question: How could we effectively visualize 100,000,000 books or more at once? There’s lots of data to view: Titles, authors, which countries the books come from, which publishers, how old they are, how many libraries hold them, whether they are available digitally, etc.
- https://phiresky.github.io/blog/2025/visualizing-all-books-i...
Basically, legally gray online book repositories such as Anna's Archive, which created this bounty, are trying to collect a lot of books. The question quickly arises: how many books are there?
The best way to track books is by ISBN, the International Standard Book Number, basically the personal ID of any given book, assigned through an international agency. Once you know which books exist, you can check which books your repository already has and which ones are missing.
But the ISBN space covers over 2 billion possible books. That's a lot. So Anna's Archive created a contest to display this space in the cleanest way possible. The winning submission is very nicely done, and in my view very well deserving of the $6,000 bounty.
I like Anna's Archive, but it's definitely not legally gray.
There are multiple ways to look at this, but for example, my middle European country's laws explicitly state that breaking copyright is okay, if the material is used for teaching purposes. Downloading for personal use is also allowed.
Are they breaking the laws of the country where they host their own data? I can't really say.
In all honesty, I don't believe copyright laws will survive this decade, much less this century. With models being trained on copyrighted material and no cases setting the precedent that this is not okay, I feel like the new reality is that you can steal anything, as long as you 'launder' it through an AI model.
Maybe that will be the next big startup: re-creating copyrighted books through AI models, just different enough to skirt the laws. Who wouldn't like to read 'Owner of Numerous Pieces of Jewelry' instead of 'Lord of the Rings'?
There are places that have minimal or no formal recognition of IP rights. Not counting stateless or breakaway regions like Transnistria and Sealand, countries like Somalia and South Sudan either do not have a government-run IP system, or in the case of South Sudan are not part of the Berne Convention. I doubt that Anna's Archive operates in one of these places, but there are still safe harbors for their mission.
OK, so from what I understood, this visualisation displays all the ISBNs as they are assigned to countries, and then to publishers. Books that are not highlighted are the ones that are not present in Anna's Archive? Is that so?
Also what do you mean by unassigned?
Anna's Archive has books in its archive, but it also has other datasets that connect a book's ISBN to its metadata (title, author, publisher, ...).
In my visualisation https://isbnviz.pages.dev you can see which books they actually have the files for (blue) and which ones they know exist because they have the metadata from some other source, like Google Books (red). Finally, there are also ISBNs not contained in any of the sets that Anna's Archive has, and these are either assigned or not assigned. A lot of the 979-prefixed ISBNs are not assigned, which means no country or publisher has the right to assign them to a book. Other ISBNs are assigned to a publisher, but the publisher just hasn't published a book with that ISBN yet. Or they may have published a book, but Anna's Archive doesn't know about it because it's not in their own dataset or in the ones they scraped.
I don't see any Arabic literature. Curious whether that's due to a lack of actual digital/OCR text or a lack of availability of the books in PDF/EPUB formats.
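To make the four cases concrete, here is a rough sketch of the decision order described above. It is not the code behind the site: the `Status` names, the `classify` function and the idea of passing assigned blocks as `(start, end)` integer ranges are all assumptions for illustration.

```python
from enum import Enum
from typing import Iterable, Set, Tuple


class Status(Enum):
    HAS_FILE = "blue"                    # the archive holds the actual file
    METADATA_ONLY = "red"                # known only from a metadata source
    ASSIGNED_NO_KNOWN_BOOK = "assigned"  # in a publisher block, no known book
    UNASSIGNED = "unassigned"            # nobody may issue this ISBN yet


def classify(
    isbn: int,
    have_files: Set[int],
    have_metadata: Set[int],
    assigned_ranges: Iterable[Tuple[int, int]],
) -> Status:
    """Classify one ISBN (as an integer); hypothetical inputs throughout."""
    if isbn in have_files:
        return Status.HAS_FILE
    if isbn in have_metadata:
        return Status.METADATA_ONLY
    # Inclusive (start, end) blocks handed out to countries/publishers.
    for start, end in assigned_ranges:
        if start <= isbn <= end:
            return Status.ASSIGNED_NO_KNOWN_BOOK
    return Status.UNASSIGNED
```

The order matters: a book with a file is shown as blue even if it also appears in a metadata source, and the assigned/unassigned distinction only comes into play for ISBNs that appear in none of the datasets.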
Is there anywhere that lists/publicises/collates competitions like this?
I would like to have had a go at this but you often only find out about these things when winners are announced.
Public request: anybody here who hates Anna’s and wants to make a principled complaint about it? I love it and the idea of it so much, but I imagine some feel differently and I’d like to hear your best takedown shot.
Well, I made a comment at:
https://news.ycombinator.com/item?id=43193432
Does that count?
The thing is, if we're going to have GPL software, then we need copyright.
Yes, the terms/lengths need to be adjusted, but one can't do that by fiat/unilaterally.
Their download wait time is upsetting to me because I'm impatient and cheap (else you have to pay). At least they have the libgen links now.
I don't hate Anna's Archive though.
The "external links" section is the only thing making the website usable for non-subscribers.
it's not libgen
no wonder nobody can find my book :-)
These ISBN visualizations remind me of the maps of IPv4 address space.
https://xkcd.com/195/
https://ant.isi.edu/address/
https://www.caida.org/archive/id-consumption/census-map/
Love the Trantor reference!
I have no idea what's on the site, as my provider blocks it because of European sanctions against Russia, as this is one of the Russia Today sites.
Judging by your profile location being in the Netherlands, I think you have been misled by the generic Ziggo ISP block page [1], which lists Russia Today and Sputnik News, and then in another post The Pirate Bay.
In this case, the ISP blocked it because the website is Anna's Archive [2], which was blocked around a year ago, but they have not made a post about that.
If you put "pcm." in front of the link it will work (for now)
You should probably edit your post, so as not to misinform. But I have to admit this confusion stems from bad decisions at the ISP.
[1] https://www.ziggo.nl/website-geblokkeerd
[2] https://en.wikipedia.org/wiki/Anna%27s_Archive#Netherlands
Seems that editing is not possible due to the negative points I gathered, which is weird, as I just reported that I cannot view it. People seem to view everything through a political lens now. But thank you for your information; I saw the post.
No, you were downvoted because you claimed something false.
It is false, but I would not blame the parent; the ISP block page is unclear and suggests the block is linked to Russia.
Do you have any evidence?