I’m working on a self-hosted search service called Hister, with the goal of reducing my dependence on online search engines.

Hister is a full-text indexer for websites that saves every visited page as rendered by your browser. It provides a flexible web (and terminal) search interface and query language for exploring previously visited content with ease, or for quickly falling back to traditional search engines.

I’ve been using it for a few months, and as my local index grows I need to open Google/DuckDuckGo/Kagi less and less often.

The project is still under heavy development and has a growing community, but the current version is in a fairly usable state in my opinion, so I wanted to share it here. Perhaps some of you will find it useful as well. (Or at least have some constructive criticism =])

The code is AGPLv3 licensed and available at https://github.com/asciimoo/hister
Website: https://hister.org/
Read-only demo: https://demo.hister.org/

About me: I have been developing privacy-protecting and data-liberating free software since 2008. I’m the author of Searx, Colly (https://github.com/gocolly/colly) and many smaller free software/self-hosted projects (https://github.com/asciimoo).

  • Free_Appalachia@lemmy.ml
    20 hours ago

    This looks really rad. I have been trying to build a leftist search engine using searxng, and it has a lot of issues because things get blocked over Tor due to the number of background requests you have to make to those sites. I have thought about building something like this to deal with that issue, but just had my hands full with other projects. Quite obviously the use case is people looking to build their own privately indexed search results without them getting polluted by shit that comes up in other search engines. This project looks really cool and I am going to look into replacing my searxng instance with this soon.

  • steel_for_humans@piefed.social
    21 hours ago

    I don’t get it. It indexes pages which were already visited, right? So in order to find some website I need to first use another search engine. Afterwards, that website is in my browsing history and if I need it again, I don’t need to search for it. So what’s the use case for this project?

    • asciimoo@lemmy.ml (OP)
      20 hours ago

      > It indexes pages which were already visited, right?

      Yes, if you use only the browser extension. But Hister also has an API and a crawler if you’d like to add content you have not visited yet. Hister also supports indexing local text files, not just websites.

      > Afterwards, that website is in my browsing history and if I need it again, I don’t need to search for it

      • Unfortunately, browser history does not include the page’s content, only the URL + title combo at best.
      • Browsers can’t show an offline preview. (Having offline previews is a huge privacy and productivity win in my opinion; it completely eliminates the need to make external network requests.)

      These are the biggest weaknesses of browser history compared to Hister, but there are many more nuances where Hister can provide extra features and QoL improvements. I recommend checking the documentation and posts on the website if you are interested in the details.
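
To make the first point concrete, here is a minimal sketch of the difference between a URL + title lookup and a full-text index. It is a toy inverted index written for illustration only, not Hister's actual implementation; the names `FullTextIndex` and `history_search` and the example page are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class FullTextIndex:
    """Toy inverted index: maps each token to the set of URLs containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, url, title, body):
        # Index the title AND the page body, not just the URL and title.
        for token in tokenize(f"{title} {body}"):
            self.postings[token].add(url)

    def search(self, query):
        # Return the URLs that contain every query token (AND semantics).
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


def history_search(history, query):
    """Mimic a history search that can only see the URL and the title."""
    tokens = tokenize(query)
    return {
        url
        for url, title in history
        if tokens and all(t in tokenize(f"{url} {title}") for t in tokens)
    }


# Hypothetical page, used only for illustration.
page = (
    "https://example.org/notes/42",
    "Weekend notes",
    "How to reseal a leaking kayak hatch with marine sealant",
)

index = FullTextIndex()
index.add(*page)
history = [(page[0], page[1])]

print(index.search("kayak sealant"))             # found via the page body
print(history_search(history, "kayak sealant"))  # empty: words are not in URL or title
```

If you only remember words from the page body, the title-only lookup comes back empty while the full-text index still finds the page.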

      • ScoffingLizard@lemmy.dbzer0.com
        3 hours ago

        So does that mean that the index starts off as empty? If so, is there a way to create a centralized (I know that’s a bad word) starting repo such that the engine already knows some cool results? I have tabbed bookmarks for news that is not shitty, archives, video that isn’t YouTube, privacy resources, etc. It would be cool if people could post indices focused on certain topics that they could add. Like indices for random stuff, like dog grooming, kayaking, or woodworking. It could be a hub like Docker Hub, but for cool results.

        Sorry. Ha ha. You know you have a good idea when people start asking for features. I haven’t even started it yet. Maybe I can try self hosting on my desktop.

        This is exciting! I normally use Searxist on Android.

    • paris@lemmy.blahaj.zone
      20 hours ago

      Your browsing history does not have full text search, so if you only remember the content of the page and not the title of it, you’re SOL. Or if you browse across multiple devices, you have to check multiple places to hope to find it.

  • utopiah@lemmy.ml
    1 day ago

    Interesting. I didn’t see it in the documentation, so in case it isn’t documented yet: you can set your local instance as a search suggestion in Firefox on mobile and desktop. I use this for my own wiki, e.g. https://mastodon.pirateparty.be/@utopiah/116351732150481942

    Also, this is how I would imagine using it: as the default search there, with a fallback to a default search engine (e.g. DDG) if there is no hit.

    • asciimoo@lemmy.ml (OP)
      1 day ago

      > Also, this is how I would imagine using it: as the default search there, with a fallback to a default search engine (e.g. DDG) if there is no hit.

      This is exactly how I use it. Hister even has a hotkey to quickly jump to your preferred online search engine with the current search query if you cannot find what you are looking for.
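
The "local first, online engine as fallback" flow described here can be sketched roughly as follows. This is an illustration under stated assumptions, not Hister's code: `search_with_fallback`, `toy_local_search` and the in-memory `local_pages` dictionary are hypothetical stand-ins, and DuckDuckGo's `?q=` query URL is used as the example fallback.

```python
from urllib.parse import urlencode

# Example fallback engine; any engine that accepts a ?q= parameter would work.
FALLBACK_URL = "https://duckduckgo.com/"


def search_with_fallback(query, local_search, fallback_url=FALLBACK_URL):
    """Return ("local", hits) when the local index has results,
    otherwise ("fallback", url) pointing at the online engine
    with the same query filled in."""
    hits = local_search(query)
    if hits:
        return "local", hits
    return "fallback", f"{fallback_url}?{urlencode({'q': query})}"


# Hypothetical stand-in for a lookup against a local index.
local_pages = {"hister": ["https://hister.org/"]}


def toy_local_search(query):
    return local_pages.get(query.lower(), [])


print(search_with_fallback("Hister", toy_local_search))
print(search_with_fallback("kayak hatch", toy_local_search))
```

Passing the local lookup in as a function keeps the fallback decision separate from whatever backs the local index.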