Finding relevant search results in social networks - Implementation and evaluation of relevance models in the context of social networks

Type
Master's thesis
Published
2011
Authors
Grennborg, Marcus
Fredrik, Pettersson
Abstract
Social media services and social networks are quickly becoming a large and natural part of the web. A way to quickly determine the quality of the status updates written there would help bring order to the massive amount of content generated. We have reviewed current research and techniques in information retrieval for search and implemented several of them to determine how useful they are in the context of social networks. Based on these implementations we have created a proof-of-concept search engine for Twitter. The solution contains a crawler whose selection policy is based mainly on the Adaptive OPIC algorithm, together with some other parameters. The search engine computes a custom ranking as a combination of several parameters, such as popularity, text analysis and freshness. Some of the parameters are personalized and thus depend on the user issuing the query: interest-based ranking and an estimated shortest distance in the user's own social graph. The distance is estimated using the Seeds-based ranking algorithm; an alternative version of the algorithm is also proposed. Both user testing (using NDCG) and performance testing have been carried out to determine how well the implemented techniques work in the context of social networks. The results of these tests are analyzed, and we elaborate on possible uses of the search engine and possible future work. The results show that established relevance models used in search solutions for the web and similar media are still very useful in the context of social networks, and that personalized attributes (such as users' relations to each other) are a good way to measure the quality of a status update. We also conclude that in practice one can index all the content of a social medium, but has to use some method of focused crawling based on the demands placed on the search solution to be created.
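The abstract names NDCG (normalized discounted cumulative gain) as the measure used in user testing. As a rough illustration only, the sketch below computes NDCG for one ranked result list using the common log2-discount formulation. The function names and the graded relevance scale are assumptions made for this example; the record does not state which NDCG variant or rating scale the thesis used.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance grades listed in rank order.

    Assumes the common formulation: gain at rank r is rel / log2(r + 1).
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal ordering.

    Returns 0.0 when no result has positive relevance, to avoid dividing
    by zero.
    """
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical user ratings (0 = irrelevant, 3 = highly relevant) for the
# top five results of one query, in the order the engine returned them.
ratings = [3, 1, 2, 0, 1]
score = ndcg(ratings)
```

A score of 1.0 means the engine returned the results in the same order as the users' ideal ranking. Lower values indicate that highly rated results appeared further down the list.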
Subject/keywords
Computer Science