Ask HN: Bookmarking with _working_ full text search?
I'm a pinboard.in refugee and I'm still looking for a bookmarking service that performs _working_ full text search on the bookmarks I've saved. I want to pay for a service that acts like a personal Google for everything I've bookmarked. I presume that would require the bookmarking service to save/cache each page I've bookmarked - I'll leave that for the bookmarking service to figure out.
I thought I had found that with raindrop.io, but searching for a single specific string of text (like 'wine' in a HN URL of people discussing the 'wine' emulator) does not seem to work with Raindrop. It seems that unless the string is in the URL, title or description, the bookmark service fails to find it, i.e. it isn't actually performing a full text search of a previously bookmarked page.
Shameless plug:
I’m working on an open-source social bookmarking site in Elixir that is API compatible with delicious/pinboard.
It’s named linkhut and it’s currently able to import your bookmarks from pinboard and browser exports. The flagship instance is: https://ln.ht
I’m still working on the archiving and full text search feature. I’ve been experimenting with different approaches and there are still a few things I want to explore before settling on a solution.
I get that this is not really useful to you as of now, but if you still haven’t found anything in a couple of months, think about checking it out. I might just have launched that feature by then.
Raindrop.io full text search should work but only in the paid pro plan. And even then, I think the problem with the example you cited is that “comments” are generally stripped off (like they are in “reader mode” apps) so that HN replies are not indexed.
I always recommend Zotero as a local/online bookmark manager. It does full text search, but for it to work you have to "save page as snapshot" and set search to "everything".
Not that I know of. Which is to say, I have in the past considered writing a tool to periodically (nightly?) grab my browser history, scrape as much of it as possible, and dump all the html into a directory somewhere where I could invoke ripgrep on it. >_>
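For what it's worth, the "grab history, scrape, dump, grep" idea above can be sketched in a few lines of Python. This is only a rough sketch under assumptions: it reads URLs from a copy of Firefox's places.sqlite (the moz_places table), and the function names, the hashed file naming and the output directory are all made up for illustration. Other browsers store history differently.

```python
import hashlib
import sqlite3
import urllib.request
from pathlib import Path


def history_urls(places_db):
    """Return the http(s) URLs found in a copy of Firefox's places.sqlite."""
    con = sqlite3.connect(places_db)
    try:
        rows = con.execute(
            "SELECT DISTINCT url FROM moz_places WHERE url LIKE 'http%' ORDER BY url"
        ).fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]


def dump_path(url, out_dir):
    """Map a URL to a stable file name so re-runs overwrite instead of duplicating."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return Path(out_dir) / f"{digest}.html"


def save_page(url, html, out_dir):
    """Write the page with its URL in a leading comment so grep hits can be traced back."""
    path = dump_path(url, out_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f"<!-- source: {url} -->\n{html}", encoding="utf-8")
    return path


def fetch(url, timeout=10):
    """Download one page; errors are left to the caller."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape_history(places_db, out_dir):
    """Fetch every history URL and dump the HTML, skipping pages that fail."""
    for url in history_urls(places_db):
        try:
            save_page(url, fetch(url), out_dir)
        except Exception as exc:  # dead links, timeouts, non-HTML responses
            print(f"skipped {url}: {exc}")
```

Run scrape_history from a nightly cron job against a copy of the database (Firefox keeps the live file locked while it is running), then something like `rg -il wine history-dump/` lists the dumped files that mention the term. Pages behind a login won't be retrievable this way.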
I had more or less exactly the same idea, and my starting point is a script that downloads a page, runs it through Mozilla's Readability library (which Firefox uses for reader mode), converts to Markdown and saves to file. Thinking about another tool to grab browsing history and run this over each entry.
https://github.com/fifteenthstandard/mdl
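To illustrate the "download, extract, save as text" step without any dependencies, here is a crude stand-in. To be clear, this is not what mdl does and it does not use Readability: it swaps in Python's standard html.parser to drop scripts and styles and keep the visible text, with headings marked Markdown-style. The class and function names are invented for the sketch.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skip script/style, and mark headings Markdown-style."""

    SKIP = {"script", "style", "noscript"}
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    BLOCKS = {"p", "div", "li", "br", "tr"} | set(HEADINGS)

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCKS:
            # Start block elements on a new line; prefix headings with #'s.
            self.parts.append("\n" + self.HEADINGS.get(tag, ""))

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCKS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the page's visible text, one non-empty line per block element."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line) + "\n"
```

Unlike Readability, this keeps navigation, footers and comment threads, which is arguably what you want if the goal is to find a word buried in an HN reply.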
A few friends and I are working on a solution to this problem right now! Should launch with a Show HN sometime in the next couple weeks!
You can try hamsterbase.com, which I developed.
1. Perfect support for 99% of websites; supports importing HTML, MHTML and webarchive files.
2. Full text search support; provides a public API.
3. All functions are completely offline: no account registration, no credit card required, no information collected.
4. Currently free in beta.
5. Supports SingleFile, which can automatically save all pages viewed.
Have you tried https://historio.us/ ??
If it has to be online somewhere, then just make a link farm page. Go to that page and use text search to find a link, then click on it.
Browsers have bookmark searching; e.g. in Firefox use "* " (star space) before search terms to search bookmarks.
How much would you be willing to pay for such a service?
$20/month USD. I put a lot of value on being able to retrieve everything I've ever read.
Why can't it be built into the browser?
Preserving privacy is nice when there's no compelling reason to sacrifice it. Does this really need to be a SaaS?
It is built into Safari (via the History interface). Opera had this feature before it became a Chrome reskin. Chrome used to support it, but it only worked on http (not https) sites and after some years the feature was dropped. Chrome addons like Falcon bring it back (but Falcon seems unmaintained these days). The Min Browser offers full-text history search out of the box, but the browser experience is ... eccentric.
There was a SaaS service for this called Recawl which required a browser plugin. memex.garden offered this as part of a SaaS but they dropped this feature. Browserparrot focuses on this, again as a SaaS, again with uncertain pricing. Diskernet does this fully-local, but the software is not free (and is only offered via subscription pricing). St. Clair Software's HistoryHound does full-text history search, but only on Mac; I suppose it's got a bigger featureset than the Safari tool, and it supports not-Safari browsers.
The field is littered with previous attempts to get this right.
Depends on required features like cross browser support, cross device support, handling pdfs, ocr images, etc. Some of the mentioned features already exist in the browser. Not sure if the browser vendors are incentivized to develop and maintain such features.
I recall a project showcased on HN that does browser history full text indexing via a proxy. Perhaps that would be a better approach.
Yeah, there are several threads on HN that have lists of tools and pros/cons.
It is?
In Firefox, type *, space and then search terms; this searches bookmarks.
Also, in the Bookmarks -> Manage Bookmarks menu there is a search box, for those who despise convenience.
They're talking about searching page content; Firefox only searches the title, URL, and manually added tags.
Export to SQLite or use a bookmark sync manager that uses SQLite, and use https://www.sqlite.org/fts5.html to search.
If you need a Web UI just get one of those free analytics/BI tools that work with SQLite.
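A minimal sketch of the FTS5 suggestion, using Python's built-in sqlite3 module. It assumes your SQLite build has FTS5 compiled in (most recent ones do, but it is not guaranteed), and the table name, column names and helper functions are invented for illustration. The page body has to come from somewhere else, e.g. a saved copy of each bookmarked page.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a full text index of saved pages."""
    con = sqlite3.connect(path)
    # url is stored but not indexed; title and body are searchable.
    con.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages "
        "USING fts5(url UNINDEXED, title, body)"
    )
    return con


def add_page(con, url, title, body):
    """Index one saved page."""
    con.execute(
        "INSERT INTO pages (url, title, body) VALUES (?, ?, ?)",
        (url, title, body),
    )
    con.commit()


def search(con, query, limit=10):
    """Return (url, title) pairs matching an FTS5 query, best matches first."""
    rows = con.execute(
        "SELECT url, title FROM pages WHERE pages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

With a page about Wine indexed, search(con, "wine") returns its URL and title even if the word only appears in the body. Note that the query string is parsed as FTS5 query syntax, so search terms containing punctuation may need to be wrapped in double quotes.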