Distributed Web Crawler
dc.contributor.author | Bjerkander, Hans | |
dc.contributor.author | Karlsson, Erik | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers) | sv |
dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers) | en |
dc.date.accessioned | 2019-07-03T13:21:04Z | |
dc.date.available | 2019-07-03T13:21:04Z | |
dc.date.issued | 2014 | |
dc.description.abstract | This thesis investigates possible improvements in distributed web-crawlers. Web-crawling is the cornerstone of search-engines and a well defined part of Internet technology. Due to the size of the Web, it is important that a web-crawler is fast and efficient, since a web-crawler should be able to find the interesting sites before they change or disappear. The thesis will focus on crawler distribution concerning modularity, fault-tolerance and group membership services. The download order of crawlers will also be covered, since this greatly influences the efficiency of a crawler. In addition to the theoretical basis of the thesis, a prototype has been constructed in Java. The prototype is efficient, modular, fault-tolerant and configurable. The result from the thesis indicates that using a membership service is a good way to distribute a crawler and conclusively, the thesis also demonstrate a way to improve the crawling order compared to a breadth-first ordering. | |
dc.identifier.uri | https://hdl.handle.net/20.500.12380/193680 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Data- och informationsvetenskap | |
dc.subject | Computer and Information Science | |
dc.title | Distributed Web Crawler | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master Thesis | en |
dc.type.uppsok | H | |
local.programme | Computer science – algorithms, languages and logic (MPALG), MSc |
Ladda ner
Original bundle
1 - 1 av 1
Hämtar...
- Namn:
- 193680.pdf
- Storlek:
- 677.95 KB
- Format:
- Adobe Portable Document Format
- Beskrivning:
- Fulltext