Locating faulty data in an harvested database - Extending a Metadata language with support for semantic rules to find erroneous data in a vast and incomplete database
dc.contributor.author | Gundberg, Per | |
dc.contributor.author | Steen Timle, Joel | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers) | sv |
dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers) | en |
dc.date.accessioned | 2019-07-03T13:01:15Z | |
dc.date.available | 2019-07-03T13:01:15Z | |
dc.date.issued | 2012 | |
dc.description.abstract | This thesis deals with the task of finding erroneous entries in a large database whose content has been automatically collected by scanning different sources on the World Wide Web. The information is divided into events, which are organized into event classes. As part of the thesis work, a language for describing semantic and structural rules on the information has been designed as an extension to the existing Metadata language of the database. A set of rules describing these extended demands has been written in this language. A tool that tests the information in the database against rules described in the extended language has also been implemented. The result of the evaluation reports not only whether an entry fails to fulfill a rule, but also which part of the entry breaks the rule. This information is stored in a database for further analysis and use. Subsets of the database have been checked, and during these tests about five percent of the events did not fulfill all of the rules defined for their event class. | |
dc.identifier.uri | https://hdl.handle.net/20.500.12380/163809 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Data- och informationsvetenskap | |
dc.subject | Informations- och kommunikationsteknik | |
dc.subject | Computer and Information Science | |
dc.subject | Information & Communication Technology | |
dc.title | Locating faulty data in an harvested database - Extending a Metadata language with support for semantic rules to find erroneous data in a vast and incomplete database | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master Thesis | en |
dc.type.uppsok | H | |
Download
- Name: 163809.pdf
- Size: 592.75 KB
- Format: Adobe Portable Document Format
- Description: Fulltext