This dataset has been assembled for the USEWOD workshop series (http://usewod.org).
You may use it for research purposes only, as defined in the "Usage Agreement for the USEWOD Log Dataset".

FORMAT:
-------

All logs in /data/CLF-server-logs/ are provided in the Apache Combined Log Format [1]. However, there are slight differences between the datasets, because they have been provided by different parties.

The IP addresses in all datasets have been anonymised, but in different ways:

bio2RDF: All bio2RDF log entries show requests that have been routed through the service's web interface, and therefore have identical IP addresses and user agents.

DBPedia 3.3, 3.4 and SWDF: All IP addresses have been set to "0.0.0.0". Two additional fields have been added to the end of each log entry: the country code of the request IP (determined using the GeoLite Database [2]), and a hash of the IP.

DBPedia 3.5.1: All IP addresses have been set to "0.0.0.0". All agent strings have been replaced with "preprocessed".

DBPedia 3.6 and later: All IP address fields have been replaced with a hash of the address. All timestamps have been set to 04:00.

LGD: All IP addresses have been replaced with "0.0.0.X", where X=1 for the first IP encountered, and X is incremented by 1 for each new IP after that.

The logs in /data/SPARQL-endpoint-logs/ do not follow a particular format, but should be self-explanatory. Each query is preceded by a timestamp.

[1] http://httpd.apache.org/docs/current/logs.html
[2] http://dev.maxmind.com/geoip/geolite
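
EXAMPLE (illustrative only):

The following Python sketch shows one possible way to read a line from the logs in /data/CLF-server-logs/. It is not part of the dataset tooling. It assumes the standard Combined Log Format field order from [1], and it collects anything after the user agent (such as the country code and IP hash added to the DBPedia 3.3, 3.4 and SWDF logs) into a list called "extra". Whether those trailing fields are quoted is not specified here, so the sketch accepts both forms. The names CLF_PATTERN, parse_line and SAMPLE, and the sample line itself, are invented for illustration and are not taken from the data.

```python
import re

# Standard Combined Log Format fields, followed by an optional tail.
# The tail ("extra") is an assumption made to cover the additional
# fields that some of the datasets append to each entry.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>(?:[^"\\]|\\.)*)" '
    r'"(?P<agent>(?:[^"\\]|\\.)*)"'
    r'(?P<extra>.*)$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Split the tail on whitespace and drop surrounding quotes, if any.
    entry["extra"] = [token.strip('"') for token in entry["extra"].split()]
    return entry


# Invented sample line, NOT copied from the dataset.
SAMPLE = (
    '0.0.0.0 - - [01/Jan/2010:12:00:00 +0100] '
    '"GET /sparql?query=SELECT HTTP/1.1" 200 1234 '
    '"-" "ExampleAgent/1.0" "DE" "abc123"'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Because the host field is matched as any run of non-whitespace characters, the same pattern should also accept entries whose IP field holds a hash (DBPedia 3.6 and later) or a "0.0.0.X" placeholder (LGD). Lines that do not match return None, so they can be counted or inspected separately rather than silently dropped.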