Last Modified: 2000-01-26
Namazu is search engine software intended for easy use. It works not only as a CGI program for small or medium-scale WWW search engines, but also as a personal search system, for example for local HDDs. Currently, search clients for Mule, Tcl/Tk, Java, and Win32 are available.
(The Japanese word `Namazu' means `catfish' in English)
Namazu works on UNIX, Win32, and OS/2 environments. For Win32, `Namazu the full text retrieval search system for Win32' is maintained by Hirose-san. For OS/2, `OS/2 port of Namazu the full text retrieval search system' is maintained by Shimizu-san. Each page has a binary package.
Namazu is developed by many contributors. I operate a mailing list to discuss the development. If you are interested in Namazu, please join it (but discussion there is basically in Japanese only...).
This search engine consists of an indexer and a search client. The former is written in Perl and the latter in C. The indexer builds an index in advance so that searching is quick. It can search almost 100 megabytes of documents within one second, which I think is not too slow. Of course, the disk cache has a strong influence on this. Because of the algorithm, search time does not grow in proportion to the size of the index; it increases only on a logarithmic scale.
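To illustrate the point about logarithmic growth, here is a minimal Python sketch of an index that is built in advance and then looked up by binary search. It is not Namazu's actual code (the real indexer is written in Perl and the client in C); the names `build_index` and `lookup` and the sample documents are hypothetical.

```python
from bisect import bisect_left

def build_index(docs):
    """Build a sorted word list and matching posting lists ahead of time.

    Hypothetical helper for illustration; not Namazu's index format.
    """
    postings = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            postings.setdefault(word, set()).add(doc_id)
    words = sorted(postings)
    return words, [sorted(postings[w]) for w in words]

def lookup(index, word):
    """Find a word by binary search: O(log n) comparisons for n indexed words."""
    words, lists = index
    key = word.lower()
    i = bisect_left(words, key)
    if i < len(words) and words[i] == key:
        return lists[i]
    return []

# Sample data (made up for this sketch).
docs = {
    1: "catfish live in rivers",
    2: "namazu means catfish",
    3: "search engine",
}
index = build_index(docs)
print(lookup(index, "catfish"))  # [1, 2]
print(lookup(index, "whale"))    # []
```

Doubling the number of indexed words adds only one more comparison to each lookup, which is the sense in which the cost grows on a logarithmic scale.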
By using namazu.el, you can use Namazu on Mule. Similarly, you can search with Namazu on the X Window System by using tknamazu, and on Win32 by using Search-S.
The search client does not require much memory and works on its own as a CGI program, so I think it is comparatively light.
As Namazu is designed for Mail/News as well, it is especially useful as a search system for mailing lists or netnews.
Namazu supports AND, OR, and NOT searches. Search results are printed in order of score, each with a summary like AltaVista or ODIN. That summary is made from the structure of the HTML headings. In addition, if the results cannot fit on one page, they are automatically split and shown page by page (in lots of 20 by default).
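The AND, OR, and NOT searches and the paging in lots of 20 can be sketched with set operations and list slicing. This is only an illustration in Python under assumed data structures (a dictionary from word to a set of document ids); the function names and the sample scores are hypothetical and do not reflect Namazu's internals or its query syntax.

```python
def boolean_and(index, a, b):
    """Documents containing both words."""
    return index.get(a, set()) & index.get(b, set())

def boolean_or(index, a, b):
    """Documents containing either word."""
    return index.get(a, set()) | index.get(b, set())

def boolean_not(index, a, b):
    """Documents containing word a but not word b."""
    return index.get(a, set()) - index.get(b, set())

def rank(scores):
    """Order document ids by descending score (ties broken by id)."""
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

def page(ranked, number, per_page=20):
    """Return one page of results; pages are numbered from 0."""
    start = number * per_page
    return ranked[start:start + per_page]

# Made-up index: word -> set of document ids.
index = {"catfish": {1, 2, 4}, "river": {1, 3}, "search": {2, 3}}
print(sorted(boolean_and(index, "catfish", "river")))   # [1]
print(sorted(boolean_or(index, "catfish", "river")))    # [1, 2, 3, 4]
print(sorted(boolean_not(index, "catfish", "search")))  # [1, 4]

# 45 made-up results: two full pages of 20 and a last page of 5.
scores = {doc_id: doc_id % 7 for doc_id in range(45)}
ranked = rank(scores)
print(len(page(ranked, 0)), len(page(ranked, 2)))  # 20 5
```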
The score is calculated not only from term frequency but also from the weight of HTML elements. Furthermore, Namazu supports meta information such as <META NAME="keywords" CONTENT="foo bar">.
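As a rough sketch of scoring by term frequency combined with HTML element weights, the Python below multiplies the count of a term in each part of a document by a weight for that part. The weight values, the field names, and the `score` function are all assumptions made up for this example; the actual weights Namazu uses are not stated here.

```python
# Hypothetical weights per part of the document (not Namazu's real values).
WEIGHTS = {"keywords": 32, "title": 16, "h1": 8, "body": 1}

def score(fields, term):
    """Sum of (term frequency in a field) x (weight of that field)."""
    term = term.lower()
    total = 0
    for field, text in fields.items():
        count = text.lower().split().count(term)
        total += WEIGHTS.get(field, 1) * count
    return total

# Made-up document, already split into fields for simplicity.
doc = {
    "title": "Namazu search",
    "keywords": "foo bar",  # as in <META NAME="keywords" CONTENT="foo bar">
    "body": "foo appears in the body and foo again",
}
print(score(doc, "foo"))     # 34  (1 x 32 from keywords + 2 x 1 from body)
print(score(doc, "namazu"))  # 16  (1 x 16 from title)
```

With weights like these, a word that appears once in the keywords outranks one that appears many times only in the body text.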
Namazu takes care in handling HTML documents: it takes the `ALT' description from <IMG> elements, decodes entities, and prints results as correct HTML 4.0 Strict DTD HTML.
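The ALT and entity handling can be illustrated with Python's standard `html.parser` module. This is a sketch of the idea only, not how Namazu's Perl indexer does it; the class `TextExtractor` and the function `extract_text` are hypothetical names.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect indexable text: decoded character data plus IMG ALT text."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities such as
        # &amp; and &eacute; before handle_data() is called.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <IMG> is seen as "img".
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)

def extract_text(html):
    """Return the text a simple indexer might see for this HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

sample = '<P>Caf&eacute; &amp; bar <IMG SRC="fish.png" ALT="catfish photo"></P>'
print(extract_text(sample))
```

The ALT text ends up in the extracted text next to the decoded body text, so a search for `catfish` would find a page whose only mention is in an image description.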
Indexing takes fifty minutes for 25 MB of files on a Linux box with a Pentium 166 MHz and 64 MB of memory.
For more information, read the manual.
I distribute Namazu as free software (also called `open source software') under the terms of the GNU General Public License version 2.
Attention: those pages are mostly written in Japanese.