Web Search (00011)

This dataset contains the results of comparing web searches across Bing, Google, Yahoo, and Ask. The data was provided by Robert Bredereck at TU Berlin, who also provides tools to compute Kemeny rankings on this data on his website at TU Berlin.
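To make the notion of a Kemeny ranking concrete, here is a minimal brute-force sketch in Python. It is not the tooling mentioned above. It assumes every vote is a complete strict ranking over the same candidates (the web search files in this dataset are incomplete, so they would need preprocessing first), and the names `kendall_tau` and `kemeny_ranking` are illustrative only. Because it enumerates all permutations, it is only feasible for a handful of candidates.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Count the candidate pairs that two strict rankings order differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] < pos1[b]) != (pos2[a] < pos2[b])
    )


def kemeny_ranking(votes):
    """Return (ranking, score) minimizing the total Kendall tau distance.

    Assumption: every vote is a complete strict ranking of the same candidates.
    """
    candidates = sorted(votes[0])

    def score(ranking):
        return sum(kendall_tau(ranking, vote) for vote in votes)

    best = min(permutations(candidates), key=score)
    return list(best), score(best)


# Toy example with three hypothetical voters.
votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
ranking, total = kemeny_ranking(votes)
print(ranking, total)
```

The studies listed below describe data reduction and fixed-parameter techniques for the exact problem, which is what makes larger instances tractable.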

The data files marked big contain around 2000 candidates each, while the data files marked small contain between 100 and 200 results. The search queries are shown in the names of the individual data files below. For the WebImpact files, the number of search results for a particular term was used to create a complete ranking over the search terms. These files measure the web impact of various world cities and countries. The rankings are incomplete: not every candidate (website) is ranked by all the voters (search engines). We have extended this data into tournament graphs and weighted majority graphs, and created a toc dataset in which all unranked candidates are tied at the end of each ranking.
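The two extensions described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used to build the files: rankings are held in memory as Python lists rather than parsed from the data files, the helper names `complete_with_ties` and `weighted_majority_graph` are hypothetical, and the majority margin is taken to be the number of voters preferring a to b minus the number preferring b to a, with tied pairs contributing nothing.

```python
from itertools import combinations


def complete_with_ties(order, candidates):
    """Extend an incomplete strict order to an order with ties.

    Ranked candidates keep their positions; all unranked candidates are
    placed together in one tie group at the end.
    """
    ranked = set(order)
    groups = [[c] for c in order]
    unranked = [c for c in candidates if c not in ranked]
    if unranked:
        groups.append(unranked)
    return groups


def weighted_majority_graph(votes_with_ties):
    """Return {(a, b): margin} for every pair where a beats b by margin > 0."""
    # Map each candidate to the index of its tie group, once per vote.
    levels = [
        {c: i for i, group in enumerate(vote) for c in group}
        for vote in votes_with_ties
    ]
    candidates = sorted(levels[0])
    margins = {}
    for a, b in combinations(candidates, 2):
        net = 0
        for level in levels:
            if level[a] < level[b]:
                net += 1
            elif level[a] > level[b]:
                net -= 1
        if net > 0:
            margins[(a, b)] = net
        elif net < 0:
            margins[(b, a)] = -net
    return margins


# Toy example: three hypothetical search engines ranking four websites.
candidates = ["w1", "w2", "w3", "w4"]
incomplete = [["w1", "w2"], ["w2", "w1", "w3"], ["w1", "w3", "w4"]]
completed = [complete_with_ties(order, candidates) for order in incomplete]
margins = weighted_majority_graph(completed)
print(completed[0])
print(margins)
```

Dropping the weights and keeping only the direction of each edge gives the corresponding tournament-style graph; pairs with a zero margin have no edge in this sketch.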

Selected studies: N. Betzler, R. Bredereck, and R. Niedermeier. Theoretical and empirical evaluation of data reduction for exact Kemeny Rank Aggregation. Autonomous Agents and Multi-Agent Systems, 28(5):721-748, 2014. | R. Bredereck. Fixed-Parameter Algorithms for Computing Kemeny scores - Theory and Practice. Thesis, Department of Mathematics and Computer Science, University of Jena, 2009. | N. Betzler, R. Bredereck, and R. Niedermeier. Partial Kernelization for Rank Aggregation: Theory and Experiments. Proc. 5th International Symposium on Parameterized and Exact Computation (IPEC), 2010.

Download the dataset [zip, 6.2 MB]



  • Number of files: 151
  • Total size: 23.5 MB
  • Data types: soc, soi, toc.
  • Publication date: July 9, 2014
  • Last modification: April 10, 2024
webimpact_capitals5 — 00011-00000001.soc
webimpact_nations5 — 00011-00000002.soc
webimpact_richest5 — 00011-00000003.soc
websearch_big_Death+Valley — 00011-00000004.soi
websearch_big_Gulf+war — 00011-00000005.soi
websearch_big_HIV — 00011-00000006.soi
websearch_big_Lipari — 00011-00000007.soi
websearch_big_National+parks — 00011-00000008.soi
websearch_big_Penelope+Fitzgerald — 00011-00000009.soi
websearch_big_San+Francisco — 00011-00000010.soi
websearch_big_Shakespeare — 00011-00000011.soi
websearch_big_Thailand+tourism — 00011-00000012.soi
websearch_big_Zener — 00011-00000013.soi
websearch_big_affirmative+action — 00011-00000014.soi
websearch_big_alcoholism — 00011-00000015.soi
websearch_big_amusement+parks — 00011-00000016.soi
websearch_big_architecture — 00011-00000017.soi
websearch_big_bicycling — 00011-00000018.soi
websearch_big_blues — 00011-00000019.soi
websearch_big_cheese — 00011-00000020.soi
websearch_big_citrus+groves — 00011-00000021.soi
websearch_big_classical+guitar — 00011-00000022.soi
websearch_big_computer+vision — 00011-00000023.soi
websearch_big_cruises — 00011-00000024.soi
websearch_big_field+hockey — 00011-00000025.soi
websearch_big_gardening — 00011-00000026.soi
websearch_big_graphic+design — 00011-00000027.soi
websearch_big_java — 00011-00000028.soi
websearch_big_lyme+disease — 00011-00000029.soi
websearch_big_mutual+funds — 00011-00000030.soi
websearch_big_parallel+architecture — 00011-00000031.soi
websearch_big_recycling+cans — 00011-00000032.soi
websearch_big_rock+climbing — 00011-00000033.soi
websearch_big_stamp+collection — 00011-00000034.soi
websearch_big_sushi — 00011-00000035.soi
websearch_big_table+tennis — 00011-00000036.soi
websearch_big_telecommuting — 00011-00000037.soi
websearch_big_vintage+cars — 00011-00000038.soi
websearch_big_volcano — 00011-00000039.soi
websearch_big_zen+budism — 00011-00000040.soi
websearch_small_Death+Valley — 00011-00000041.soi
websearch_small_Gulf+war — 00011-00000042.soi
websearch_small_HIV — 00011-00000043.soi
websearch_small_Lipari — 00011-00000044.soi
websearch_small_National+parks — 00011-00000045.soi
websearch_small_Penelope+Fitzgerald — 00011-00000046.soi