Search engine: do it yourself
Sun 27 October 2013
Search engine ancestor (Photo credit: Wikipedia)
- big ones
- old ones
- strange ones
- do it yourself
- others
Of course, my preference goes to the "do it yourself" solutions (namely yacy and seeks) and, as usual, a dedicated search tool should be preferred for each specific usage.
The big ones
These are run by very big companies. They are powerful, well suited to general usage, and have indexed a huge amount of content (not only web pages). They are also the most used (just look at alexa). I will only list some of them, as everybody probably knows them.
- bing
- yahoo search
- baidu
- exalead
They are run by private companies that do what they want. If they want to modify the results for any reason (such as promoting their commercial partners or making some web resource invisible), they can do it.
The old ones
Lycos! Go get it! (Photo credit: Wikipedia)
We no longer remember them, but they were around in the late 90s.
The strange ones
They are specific and are difficult to use for general web search.
- web content directories such as dmoz (open directory). Each link is human-checked (high quality, but not much content)
- specific web site search engines (e.g. imdb)
- semantic search engines such as wolfram alpha. You query this kind of search engine in natural language.
- mystery seeker: the search engine that responds with something you did not ask for
- Creative Commons search engine
- amfibi for company search
- archive to find old versions of a web site
The ones you run yourself
Do it yourself fencing. (Photo credit: M i x y)
They consist of a piece of software you run on your computer. They are slow and usually share their results over a purpose-built peer-to-peer network. The main advantage is that a resource indexed by one or more computers in the P2P network cannot be deindexed. The two main ones (maybe the only two that exist) are yacy and seeks.
Both projects offer publicly available running nodes for testing (here and here). The main difference between the two is that yacy relies only on the crawlers of computers running yacy to index the web and present search results, whereas seeks is first of all a meta search engine that shares its results through its P2P network.
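To make the meta search idea more concrete, here is a minimal Python sketch of how ranked results coming from several engines could be merged into one list. This is not how seeks actually works: the engine names, the URLs and the scoring rule (a sum of reciprocal ranks) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one ranking.

    Each URL earns 1/rank for every engine that returned it (rank starts
    at 1), so a URL found by several engines tends to rise to the top.
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / rank
    # Highest score first; ties are broken by URL to keep the order stable.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists, not real engine output.
sample = {
    "engine_a": ["http://example.org/a", "http://example.org/b"],
    "engine_b": ["http://example.org/b", "http://example.org/c"],
}
merged = merge_results(sample)
```

With this rule, a page returned by both engines outranks a page that only one engine put first. A real meta search engine would also have to deduplicate near-identical URLs and cope with engines that do not answer.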
Others I use
When I don't have my computer, I use one of the following:
- duck duck go, whose popularity is increasing. Its bangs can be compared to konqueror web shortcuts.
- qwant, which is quite new. I like the way they present results.
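The bang and web shortcut idea is simple enough to sketch in a few lines of Python: a short prefix selects the site that will receive the query. The shortcut table and the `expand` helper below are hypothetical and only illustrate the principle; they are not the actual duck duck go or konqueror implementation.

```python
from urllib.parse import quote_plus

# Illustrative shortcut table: keys and URL templates are assumptions.
SHORTCUTS = {
    "w": "https://en.wikipedia.org/wiki/Special:Search?search={}",
    "imdb": "https://www.imdb.com/find?q={}",
}
# Where queries without a known shortcut are sent.
DEFAULT = "https://duckduckgo.com/?q={}"


def expand(query):
    """Turn '!w yacy' into a Wikipedia search URL, else use DEFAULT."""
    if query.startswith("!"):
        key, _, rest = query[1:].partition(" ")
        template = SHORTCUTS.get(key)
        if template and rest:
            return template.format(quote_plus(rest))
    # Unknown shortcut or plain query: send the whole text to the default.
    return DEFAULT.format(quote_plus(query))
```

For example, `expand("!w search engine")` builds a Wikipedia search URL, while `expand("yacy")` falls back to the default engine.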
The others
These are simply the rest. I only configure them in my preferred meta search engine (the seeks project) and do not use them otherwise.
- seeks.fr
- mahalo
- search me
- active search result
- blekko
- chacha
- cluuz
- entire web
- fact bites
- mojeek
- seekfu
- ixquick
Anyway, I advise you to compare results and to run one yourself. Admittedly, it is less reliable (a P2P node shutting down affects indexing capabilities, and the small number of running nodes does not allow a big worldwide web index), less easy to use (installation process, less user-friendly, ...), and slower (it must query other P2P nodes and wait for their answers). If it's too complicated or you don't want to install software you don't know on your computer, use a metasearch engine.
As I was writing this post, I saw that many people have already written about search engine alternatives. So there is really no reason to complain about google's monopolistic situation.
Category: tools Tagged: Ask.com bing DIY Google Lycos Metasearch engine Search tools Web search engine Wikipedia yahoo search