[GUFSC] a search engine for the web

Emerson Ribeiro de Mello emerson at das.ufsc.br
Tue Jul 29 11:13:07 GMT+3 2003


Hi,

I think many of you know htdig, a tool used for searching web servers.
The trouble is, the thing is so heavy you have to see it to believe it.

One alternative is mnoGoSearch (http://www.mnogosearch.org/)
(the UNIX version is under the GPL; the Windows version is commercial).

Its features are:

mnoGoSearch (formerly known as UdmSearch) has a number of unique features,
which make it suitable for a wide range of applications, from search within
your site to a specialized search system.

# Full text indexing. Different priority can be configured for body, 
title, keywords, description of a document.
# Supporting all widely used single- and multi-byte character sets, 
including UTF8, as well as most of the popular Eastern Asia languages.
# Automatic document character set and language guesser for about 70 
charset/language combinations.
# ASP-frontend
# Web-configurator
# HTTP/1.0 support
# FTP support
# NNTP support (both news:// and nntp:// URL schemes) in standard and 
extended modes.
# HTTP Proxy support
# Local file system indexing support (file: URL scheme)
# Supporting gzip, deflate, compress content encoding
# Built-in database support
# Support for different SQL databases. Currently MySQL, PostgreSQL, miniSQL,
Solid, Virtuoso, InterBase, Oracle, SyBase, MS SQL, iODBC, unixODBC,
EasySoft ODBC-ODBC bridge, and IBM DB2 databases may be used as the
mnoGoSearch backend.
# Search clusters: a possibility to distribute database between several 
machines.
# Basic authorization support (to index password protected areas)
# Both HTML documents and plain text files can be indexed
# External parsers support for other file types (pdf, ps, doc etc.)
# Mirroring features
# Stopwords support
# "keywords" and "description" META tags support
# User defined META tag support.
# Reentry capability. You can run several indexer and search processes at
the same time
# Continual indexing
# Indexing depth can be limited
# Robots exclusion standard support (both <META NAME="robots"> and 
robots.txt)
# HTML templates to easily customize search results
# Boolean query support
# Fuzzy search: different word forms, synonyms, substrings
# C CGI, PHP3, Perl search frontends
# Search on subsection of database
# It is very flexible. You can configure mnoGoSearch to run in different
modes, including 'ftpsearch mode' (searching through URLs rather than
their content), 'link validation' (to check site for bad references),
'netminder' (What's new since ...?). There is also extended news support
built into the package.


-- 
Regards,
Emerson R. de Mello

+---------------------------------+
| http://www.das.ufsc.br/~emerson |
|                                 |
| Phone:                          |
|       (48) 331-7576 - UFSC/LCMI |
|                                 |
+---------------------------------+

