The Ex-Crawler Project is divided into three subprojects. The main part is the Ex-Crawler daemon server, a highly configurable and flexible Web crawler written in Java. It comes with its own socket server, through which you can manage the server, users, distributed grid/volunteer computing, and much more. Crawled information is stored in a database (currently MySQL, PostgreSQL, and MSSQL are supported). The second part is a graphical (Java Swing) distributed grid/volunteer computing client, including user computer state detection, based on the JADIF Project. The third part is the Web search engine, written in PHP. It comes with a content management system, user language detection and multi-language support, and Smarty templates, as well as an application framework that is partly forked from Joomla 1.5, so that Joomla components can be adapted quickly.
Project release information and project resources. Note that this information comes from this project's Freecode.com page, and the downloads themselves may not be hosted on SourceForge.JP.
This release features a complete database rework, many speed improvements (up to 60% faster), PDF crawling, language detection, a URL filter, and hundreds of other improvements, bug fixes, and updates. Ex-Crawler can now be run as a daemon. Startup scripts and a process watcher are included. Setup has been simplified. A utility that creates the required database tables was added, and an automatic performance benchmark was implemented so that you don't need to set the number of threads manually.
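The release notes do not say how the automatic benchmark picks a thread count. As an illustration only, here is a minimal Java sketch of one plausible approach: run a fixed workload with several candidate thread counts and keep the one with the highest measured throughput. The class name, method names, candidate list, and the synthetic CPU workload are all hypothetical and are not taken from the Ex-Crawler source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch; not Ex-Crawler's actual benchmark code.
class ThreadCountBenchmark {

    // Returns the candidate thread count for which `measure` reports the
    // highest throughput. Ties keep the earlier (first listed) candidate.
    static int pickThreadCount(int[] candidates, IntToDoubleFunction measure) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("need at least one candidate");
        }
        int best = candidates[0];
        double bestThroughput = Double.NEGATIVE_INFINITY;
        for (int n : candidates) {
            double throughput = measure.applyAsDouble(n);
            if (throughput > bestThroughput) {
                bestThroughput = throughput;
                best = n;
            }
        }
        return best;
    }

    // Assumed stand-in workload: runs `tasks` small CPU-bound jobs on a
    // fixed pool of `threads` threads and returns tasks per second.
    // A real crawler would measure fetch and parse work instead.
    static double measureThroughput(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                jobs.add(() -> {
                    long sum = 0;
                    for (int k = 0; k < 200_000; k++) {
                        sum += k % 7;
                    }
                    return sum;
                });
            }
            long start = System.nanoTime();
            pool.invokeAll(jobs); // blocks until every job has finished
            long elapsed = Math.max(1, System.nanoTime() - start);
            return tasks / (elapsed / 1e9);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 0.0;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] candidates = {1, 2, 4, 8};
        int chosen = pickThreadCount(candidates, n -> measureThroughput(n, 64));
        System.out.println("Chosen thread count: " + chosen);
    }
}
```

Separating the selection logic from the measurement keeps the choice testable with a synthetic throughput function; the result of a real run depends on the machine and on the workload being measured.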