Locating faulty data in an harvested database - Extending a Metadata language with support for semantic rules to find erroneous data in a vast and incomplete database

dc.contributor.author: Gundberg, Per
dc.contributor.author: Steen Timle, Joel
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers) [sv]
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers) [en]
dc.date.accessioned: 2019-07-03T13:01:15Z
dc.date.available: 2019-07-03T13:01:15Z
dc.date.issued: 2012
dc.description.abstract: This thesis deals with the task of finding erroneous entries in a large database whose content has been automatically collected by scanning different sources on the World Wide Web. The information is divided into events, which are organized into event classes. As part of the thesis work, a language for describing semantic and structural rules on the information has been designed as an extension to the database's existing Metadata language. A set of rules describing the extended demands has been written in this language. A tool that tests the information in the database against rules described in the extended language has also been implemented. The evaluation reports not only whether an entry fails to fulfil a rule, but also which part of the entry breaks the rule. This information is stored in a database for further analysis and use. Subsets of the database have been checked, and during these tests about five percent of the events did not fulfil all of the rules defined for their event class.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/163809
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Data- och informationsvetenskap
dc.subject: Informations- och kommunikationsteknik
dc.subject: Computer and Information Science
dc.subject: Information & Communication Technology
dc.title: Locating faulty data in an harvested database - Extending a Metadata language with support for semantic rules to find erroneous data in a vast and incomplete database
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master Thesis [en]
dc.type.uppsok: H
Original bundle
Name:
163809.pdf
Size:
592.75 KB
Format:
Adobe Portable Document Format
Description:
Full text