Ask HN: Bookmarking with _working_ full text search?

15 points by gowings97 4 days ago

I'm a pinboard.in refugee and I'm still looking for a bookmarking service that performs _working_ full text search for bookmarks I've saved. I want to pay for a service that acts like a personal Google for everything I've bookmarked. I presume that would require the bookmarking service to save/cache each page I've bookmarked - I'll leave that for the bookmarking service to figure out.

I thought I might have found that with raindrop.io, but searching for a single specific string of text (like 'wine' in a HN URL of people discussing the 'wine' emulator) does not seem to work with Raindrop. It seems that unless the string is in the URL, title, or description, the search finds nothing, so the service isn't actually performing a full text search of the bookmarked page.

LunarAurora 3 days ago

Raindrop.io full text search should work but only in the paid pro plan. And even then, I think the problem with the example you cited is that “comments” are generally stripped off (like they are in “reader mode” apps) so that HN replies are not indexed.

I always recommend Zotero as a local/online bookmark manager. It does full text search, but for it to work you have to "save page as snapshot" and set search to "everything".

saulrh 4 days ago

Not that I know of. Which is to say, I have in the past considered writing a tool to periodically (nightly?) grab my browser history, scrape as much of it as possible, and dump all the html into a directory somewhere where I could invoke ripgrep on it. >_>
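A minimal sketch of that idea in Python, assuming the history has already been exported as a plain list of URLs (reading the browser's history store is left out). The names `fetch_url`, `save_pages`, and `search_pages` are hypothetical, and the search is a plain substring scan standing in for ripgrep:

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Iterable, List
from urllib.request import urlopen


def fetch_url(url: str) -> str:
    """Download a page and decode it leniently (assumes UTF-8)."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def save_pages(urls: Iterable[str], out_dir: Path,
               fetch: Callable[[str], str] = fetch_url) -> Dict[str, Path]:
    """Save each URL's HTML under out_dir and return a URL -> file mapping."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved: Dict[str, Path] = {}
    for url in urls:
        # Hash the URL so any URL maps to a safe file name.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".html"
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # skip pages that cannot be fetched
        path = out_dir / name
        # The first line records the source URL so a hit can be traced back.
        path.write_text(f"<!-- {url} -->\n{html}", encoding="utf-8")
        saved[url] = path
    return saved


def search_pages(out_dir: Path, term: str) -> List[str]:
    """Return source URLs of saved pages whose HTML contains term."""
    needle = term.lower()
    hits: List[str] = []
    for path in sorted(out_dir.glob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        header, _, body = text.partition("\n")
        if needle in body.lower():
            # Strip the "<!-- " prefix and " -->" suffix from the header.
            hits.append(header[len("<!-- "):-len(" -->")])
    return hits
```

Once the files are on disk, running ripgrep over the same directory would do the same job as `search_pages`, only faster.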

  • clusmore 4 days ago

    I had more or less exactly the same idea, and my starting point is a script that downloads a page, runs it through Mozilla's Readability library (which Firefox uses for reader mode), converts to Markdown and saves to file. Thinking about another tool to grab browsing history and run this over each entry.

    https://github.com/fifteenthstandard/mdl
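For anyone who wants the "reduce the page to searchable text" step without extra dependencies, here is a rough Python sketch. To be clear, this is not Readability and does not produce Markdown: it is a much cruder stand-in built on the standard library's `html.parser` that only drops script/style content and keeps the visible text. `TextExtractor` and `html_to_text` are made-up names.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script/style tags."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

Unlike reader-mode extraction, this keeps navigation, footers, and comment threads, which may actually be what you want when the goal is to search HN replies.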

hamsterbase 3 days ago

You can try hamsterbase.com which I have developed.

1. perfect support for 99% of websites; supports importing html, mhtml, webarchive.

2. full text search support, plus a public API.

3. all functions are completely offline, no account registration, no credit card required, no information collected.

4. currently free in beta

5. supports SingleFile, which can automatically save all pages viewed.

notRobot 4 days ago

A few friends and I are working on a solution to this problem right now! Should launch with a Show HN sometime in the next couple weeks!

kazinator 4 days ago

If it has to be online somewhere, then just make a link farm page. Go to that page and use text search to find a link, then click on it.

Browsers have bookmark searching; e.g. in Firefox use "* " (star space) before search terms to search bookmarks.

agencies 4 days ago

How much would you be willing to pay for such a service?

  • gowings97 3 days ago

    $20/month USD. I put a lot of value on being able to retrieve everything I've ever read.

  • metadat 4 days ago

    Why can't it be built into the browser?

    Preserving privacy is nice when there's no compelling reason to sacrifice it. Does this really need to be a SaaS?

    • stonogo 4 days ago

      It is built into Safari (via the History interface). Opera had this feature before it became a Chrome reskin. Chrome used to support it, but it only worked on http (not https) sites and after some years the feature was dropped. Chrome addons like Falcon bring it back (but Falcon seems unattended these days). The Min Browser offers full-text search history out of the box, but the browser experience is ... eccentric.

      There was a SaaS service for this called Recawl which required a browser plugin. memex.garden offered this as part of a SaaS but they dropped this feature. Browserparrot focuses on this, again as a SaaS, again with uncertain pricing. Diskernet does this fully-local, but the software is not free (and is only offered via subscription pricing). St. Clair Software's HistoryHound does full-text history search, but only on Mac; I suppose it's got a bigger featureset than the Safari tool, and it supports not-Safari browsers.

      The field is littered with previous attempts to get this right.

    • agencies 4 days ago

      Depends on required features like cross-browser support, cross-device support, handling PDFs, OCR for images, etc. Some of the mentioned features already exist in the browser. Not sure if the browser vendors are incentivized to develop and maintain such features.

      • metadat 4 days ago

        I recall a project showcased on HN that does browser history full text indexing via a proxy. Perhaps that would be a better approach.

        • agencies 4 days ago

          Yeah, there are several threads on HN that have lists of tools and pros/cons.

    • kazinator 4 days ago

      It is?

      In Firefox, type *, space and then search terms; this searches bookmarks.

      Also, in the Bookmarks -> Manage Bookmarks menu there is a search box, for those who despise convenience.

      • metadat 4 days ago

        They're talking about searching page content. Firefox only searches title, URL, and manually added tags.

Proven 4 days ago

Export to SQLite, or use a bookmark sync manager that uses SQLite, and use https://www.sqlite.org/fts5.html to search.

If you need a web UI, just get one of those free analytics/BI tools that work with SQLite.
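A small sketch of the FTS5 approach using Python's built-in `sqlite3` module. It assumes your SQLite build has FTS5 compiled in (most recent Python distributions do, but that is not guaranteed). The table name `pages`, its three columns, and the helper names are illustrative choices, not something any bookmark tool prescribes:

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_index(rows: Iterable[Tuple[str, str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table and load (url, title, body) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE pages USING fts5(url, title, body)")
    conn.executemany(
        "INSERT INTO pages (url, title, body) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, query: str,
           limit: int = 10) -> List[Tuple[str, str]]:
    """Return (url, title) pairs matching an FTS5 query, best match first."""
    cur = conn.execute(
        "SELECT url, title FROM pages WHERE pages MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()
```

For example, after `conn = build_index([("https://example.com/t", "Some thread", "people discussing the Wine emulator")])`, calling `search(conn, "wine")` finds the page through its body text even though neither the URL nor the title mentions it. Note that the query string is parsed as FTS5 query syntax, so raw user input containing quotes or operators may need escaping.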