Search engine: do it yourself
Sun 27 October 2013
- big ones
- old ones
- strange ones
- do it yourself
The big ones
These are run by very big companies. They are powerful, well suited to general-purpose use, and have indexed a huge amount of content (not only web pages). They are also the most used (just look at Alexa). I will only list some of them, as everybody probably knows them.
They are run by private companies that do what they want. If they want to modify the results for any reason (such as promoting their commercial partners or making some web resource invisible), they can.
The old ones
We don't remember them anymore, but they were around in the late 90s.
The strange ones
They are specialized and difficult to use for general web search.
- web content directories such as DMOZ (Open Directory). Each link is human-checked (high quality, but not much content)
- site-specific search engines (e.g. IMDb)
- semantic search engines such as Wolfram Alpha. You query this kind of search engine in natural language.
- Mystery Seeker: the search engine that returns something you didn't ask for
- Creative Commons search engine
- Amfibi for company search
- archive to find old versions of a web site
The ones you run yourself
They consist of a piece of software you run on your own computer. They are slow and usually share their results over a purpose-built peer-to-peer network. The main advantage is that a resource indexed by one or more computers in the P2P network cannot be deindexed. The two main ones (maybe the only two that exist) are YaCy and Seeks.
Both projects offer publicly available running nodes for testing (here and here). The main difference between the two is that YaCy relies only on the crawlers of computers running YaCy to index the web and present search results, whereas Seeks is first of all a metasearch engine that shares its results over its P2P network.
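To make the metasearch idea more concrete, here is a minimal sketch of how ranked result lists from several engines could be merged into one. It uses reciprocal rank fusion, a generic merging technique; this is an assumption chosen for illustration, not the algorithm either project actually uses. The function name `merge_results`, the constant `k` and the example URLs are all hypothetical.

```python
from collections import defaultdict


def merge_results(result_lists, k=60):
    """Merge several ranked URL lists with reciprocal rank fusion.

    Each URL gets 1 / (k + rank) from every list it appears in,
    so a URL ranked well by several engines ends up on top.
    """
    scores = defaultdict(float)
    for results in result_lists:
        seen = set()
        for rank, url in enumerate(results, start=1):
            if url in seen:
                # Count a URL only once per engine.
                continue
            seen.add(url)
            scores[url] += 1.0 / (k + rank)
    # Highest score first; ties are broken by URL to stay deterministic.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists from two backend engines.
engine_a = ["http://a.example", "http://b.example", "http://c.example"]
engine_b = ["http://b.example", "http://d.example", "http://a.example"]

merged = merge_results([engine_a, engine_b])
for url in merged:
    print(url)
```

A URL returned by both engines collects a score from each list, which is why it tends to rank above one that a single engine returned.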
Others I use
When I don't have my computer, I use one of the following:
These are just others. I only configure them in my preferred metasearch engine (the Seeks project), and I don't use them otherwise.
- search me
- active search result
- entire web
- fact bites
Anyway, I advise you to compare results, and to run one yourself. OK, it's less reliable (a P2P node shutting down affects indexing capability, and the small number of running nodes doesn't allow a big worldwide web index), less easy to use (installation process, less user-friendly, ...), and slower (it must query other P2P nodes and wait for their answers). If it's too complicated, or you don't want to install software you don't know on your computer, use a metasearch engine.
As I was writing this post, I saw that many people have already written about search engine alternatives. So there is really no reason to complain about Google's monopolistic situation.
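If you want to compare results in a slightly more systematic way, one simple option is to measure how much two result lists overlap. The sketch below is only an illustration: the function names and the URLs are hypothetical, and you would have to paste in the result URLs you actually got from each engine.

```python
def overlap(results_a, results_b):
    """Jaccard similarity of two result lists (ranking order is ignored)."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 1.0  # two empty lists are considered identical
    return len(a & b) / len(a | b)


def only_in_first(results_a, results_b):
    """URLs returned by the first engine but not by the second."""
    return sorted(set(results_a) - set(results_b))


# Hypothetical top results for the same query on two engines.
big_engine = ["http://x.example", "http://y.example", "http://z.example"]
p2p_engine = ["http://y.example", "http://z.example", "http://w.example"]

print(overlap(big_engine, p2p_engine))
print(only_in_first(big_engine, p2p_engine))
```

A low overlap does not say which engine is better, only that they show you different parts of the web for that query.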