medusa-crawler 1.0.0.pre.1
Medusa: a Ruby crawler framework
Medusa is a Ruby framework for crawling websites and collecting useful information about the pages it visits. It is versatile, allowing you to write your own specialized tasks quickly and easily.
#### Features
- Choose the links to follow on each page with `focus_crawl()`
- Multi-threaded design for high performance
- Tracks 301 HTTP redirects
- Allows exclusion of URLs based on regular expressions
- HTTPS support
- Records response time for each page
- Obeys robots.txt
- In-memory or persistent storage of pages during crawl using Moneta adapters
- Inherits OpenURI behavior (redirects, automatic charset and encoding detection, proxy configuration options)
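
To make the link-selection features above more concrete, here is a minimal standalone sketch of the idea behind regex-based URL exclusion combined with a focus block. It does not use Medusa itself: `links_to_follow`, its `skip_patterns:` keyword, and the example URLs are hypothetical names invented for illustration, and Medusa's actual implementation may differ.

```ruby
require 'uri'

# Hypothetical helper (not part of Medusa): decide which links a crawler
# should follow from a single page.
#   1. Drop every link whose path matches one of the exclusion patterns.
#   2. If a block is given, let it narrow the remaining candidates,
#      similar in spirit to what a focus-crawl hook does.
def links_to_follow(links, skip_patterns: [], &focus)
  uris = links.map { |link| URI(link) }
  kept = uris.reject { |uri| skip_patterns.any? { |re| uri.path =~ re } }
  focus ? focus.call(kept) : kept
end

# Example links, assumed for illustration only.
links = %w[
  https://example.com/articles/1
  https://example.com/login
  https://example.com/articles/2
  https://example.com/files/report.pdf
]

# Exclude login pages and PDFs, then follow only the first remaining link.
selected = links_to_follow(links, skip_patterns: [/login/, /\.pdf\z/]) do |candidates|
  candidates.first(1)
end

puts selected.map(&:to_s)
```

Running exclusion before the focus block means the block only ever sees links that are already allowed, which keeps the per-page selection logic simple.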
Gemfile:
gem 'medusa-crawler', '~> 1.0.0.pre.1'
install:
gem install medusa-crawler -v 1.0.0.pre.1
Versions:
- 1.0.0 August 17, 2020 (23 KB)
- 1.0.0.pre.2 August 14, 2020 (23 KB)
- 1.0.0.pre.1 August 06, 2020 (24 KB)