A List of Social Tagging Datasets Made Available for Research

This list is not exhaustive - help expand it!
| Social Tagging Systems | Research Group | Source | Year Obtained | Availability | Contact | References |
| --- | --- | --- | --- | --- | --- | --- |
| CiteULike | Oversity Ltd. | Primary | Daily snapshots | Via download after email (link) | Richard Cameron | |
| BibSonomy | KDE | Primary | Periodic snapshots every half year | Available after signed license agreement | Andreas Hotho | [Hotho 2006] |
| MovieLens | GroupLens | Primary | 2009 | Via download (link) | GroupLens Info | [Sen 2006] |
| GiveALink | NaN Group | Primary | Current information via API | Via API | Filippo Menczer | [Markines 2009] |
| ESP Game | Luis von Ahn | Primary | 2006 | Via download (link) | Luis von Ahn | [VonAhn 2004] |
| Delicious | DAI Labor | Secondary | 2007/2008 | Via email request | Alan Said | [Wetzker 2008] |
| Delicious, StumbleUpon & Wikipedia | NLP and Information Retrieval Group | Secondary | 2008/2009 | Via download (link) | Arkaitz Zubiaga | [Zubiaga 2009a] [Zubiaga 2009b] [Zubiaga 2009c] |
| Delicious, Flickr, Last.fm, zexe.net | TAGora | Secondary | 2006, 2007, 2008 | Via download (link) | Vittorio Loreto | |
| Delicious, Flickr, Diigo, BibSonomy and others | Agents and Social Computation | Secondary | 2009 | Via email request | Markus Strohmaier | [Grahsl 2010] |

If you know of other available datasets, please let me know by leaving a comment on the corresponding blog post.

Page updated and maintained by Markus Strohmaier.


[Grahsl 2010] H. P. Grahsl, C. Körner, and M. Strohmaier. A Collection of Tagging Datasets Containing Complete Personomies From Heterogeneous Sources. Technical Report, Knowledge Management Institute, Graz University of Technology. To be published in 2010.

[Hotho 2006] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. BibSonomy: A Social Bookmark and Publication Sharing System. In Aldo de Moor, Simon Polovina, and Harry Delugach, editors, Proceedings of the Conceptual Structures Tool Interoperability Workshop at the 14th International Conference on Conceptual Structures, Aalborg, Denmark, 2006.

[Markines 2009] B. Markines and F. Menczer. A Scalable, Collaborative Similarity Measure for Social Annotation Systems. Proc. 20th ACM Conf. on Hypertext and Hypermedia (HT), 2009.

[Sen 2006] S. Sen, S. K. Lam, A. M. Rashid, D. Cosley, D. Frankowski, J. Osterhouse, F. M. Harper, and J. Riedl. tagging, communities, vocabulary, evolution. In CSCW '06: Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work, pages 181-190, New York, NY, USA, 2006. ACM.

[VonAhn 2004] L. von Ahn and L. Dabbish. Labeling Images with a Computer Game. ACM Conference on Human Factors in Computing Systems, CHI 2004, pp. 319-326.

[Wetzker 2008] R. Wetzker, C. Zimmermann, and C. Bauckhage. Analyzing Social Bookmarking Systems: A Delicious cookbook. In Mining Social Data (MSoDa) Workshop Proceedings, pp. 26-30, ECAI 2008, July 2008.

[Zubiaga 2009a] A. Zubiaga, R. Martínez, and V. Fresno. Getting the Most Out of Social Annotations for Web Page Classification. Proceedings of DocEng 2009, the 9th ACM Symposium on Document Engineering, pp. 74-83, Munich, Germany. 2009.

[Zubiaga 2009b] A. Zubiaga, A. P. García-Plaza, V. Fresno, and R. Martínez. Content-based Clustering for Tag Cloud Visualization. Proceedings of ASONAM 2009, International Conference on Advances in Social Networks Analysis and Mining. 2009.

[Zubiaga 2009c] A. Zubiaga. Enhancing Navigation on Wikipedia with Social Tags. Wikimania 2009. Buenos Aires, Argentina. 2009.

Last edited on December 7, 2009 (Christian Körner, Markus Strohmaier)