arXiv:1212.5633
Design, implementation and experiment of a YeSQL Web Crawler
Abstract: We describe a novel, "focusable", scalable, distributed web crawler based on GNU/Linux and PostgreSQL, which we designed to be easily extendible and have released under a GNU public licence. We also report a first use case: an analysis of Twitter streams about the French 2012 presidential elections and the URLs they contain.
Submitted 21 December, 2012; originally announced December 2012.