End of searx.is
TL;DR
I launched searx.is in Spring 2022 to diversify search results across 50+ engines. For three years, I maintained the service through 611 commits, handling half a million daily requests on upgraded EPYC servers.
Bot farms later overwhelmed the service: despite firewall rules (30 requests per 5 seconds per IP), they spread their crawls across many unique IP addresses, bypassing the rate limits and disrupting other services on the server.
Recent upstream code changes from new developers, which required constant adaptation on my end, turned maintenance into a chore. I've redirected searx.is to searx.space to point users at the broader network.
So long, and thanks for all the search!
Slightly Longer with Nuance
In April 2022, I started a custom searxng instance at searx.is. I tested it for a month before registering the domain name and running the site publicly. The idea was to query around 50+ search engines and keep a diversity of search results; hopefully that diversity would lead to better or more interesting results. Over the past few years, I've made 611 git commits to keep the service running, and another 1,676 git commits to keep the documentation up to date.
At peak, the service received around half a million requests a day. In response to the load, I upgraded the server it runs on from AMD Opteron to more modern AMD EPYC CPUs with more cores and more RAM. I also experimented with a back-end proxy network to farm queries out to more machines and fetch search results in parallel; the back-end proxies additionally avoided per-IP rate limiting at the search engines themselves.
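The fan-out idea can be sketched roughly as follows. The engine names, proxy addresses, and `fetch` helper below are hypothetical stand-ins for illustration, not the actual searx.is configuration; a real `fetch` would issue an HTTP query through the assigned proxy.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

# Hypothetical engine and proxy lists -- illustrative only.
ENGINES = ["duckduckgo", "brave", "qwant", "mojeek"]
PROXIES = ["10.0.0.2:3128", "10.0.0.3:3128"]

def fetch(engine: str, query: str, proxy: str) -> dict:
    """Stand-in for an HTTP request routed through `proxy`.

    A real implementation would query the engine via the proxy;
    here we just record the routing decision.
    """
    return {"engine": engine, "proxy": proxy, "query": query}

def fan_out(query: str) -> list[dict]:
    """Query every engine in parallel, rotating across proxies.

    Spreading engines over several outbound proxies keeps any
    single source IP below the per-IP rate limits the engines
    enforce, while the threads fetch results concurrently.
    """
    proxy_ring = cycle(PROXIES)
    with ThreadPoolExecutor(max_workers=len(ENGINES)) as pool:
        futures = [pool.submit(fetch, engine, query, next(proxy_ring))
                   for engine in ENGINES]
        return [f.result() for f in futures]

results = fan_out("example query")
```

The round-robin assignment is the simplest policy; a production setup would also track per-proxy health and back off from proxies that an engine has started throttling.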
Recently, it seems my custom searx instance has turned into a bot crawler tarpit. Bots endlessly crawl and use my site as a proxy to get search results. The bots are likely blocked from directly crawling the search engines, so the bot authors look for proxies to accomplish their goal. Initially, I added firewall-level rate limiting (no more than 30 connections in 5 seconds per IP address), and that worked for a week or so. But bot crawling farms have lots of IP addresses, so they just farmed out their requests in parallel from unique IP addresses. Each individual IP address stayed under the rate limit, but the crawl as a whole buried my server. Running at 100% CPU consumption for hours on end makes for an unhappy server, and the other services on the machine suffered as a result. I further tuned the redis/valkey rate limiter, only to find that it just added more load to the already busy system.
The final papercut is the recent churn in the codebase. Some new developers have taken over the project and are going full steam with daily code changes, frequently breaking changes that required me to constantly adapt my setup. The pace is faster than I want to keep up with. I'm more excited to work on other projects, and this maintenance has turned into a boring chore. The labor of love turned into a love of labor.
My searx.is now redirects to https://searx.space/, the main index of public searxng instances.
So long, and thanks for all the search!