Index of /published/datasets/caching/2019

Name              Last modified      Size  Description
Parent Directory                     -
README.html       2019-11-29 01:19   2.1K
upload/           2019-11-26 09:15   -
text/             2019-11-26 04:40   -

Analytics Datasets: Caching

Description

This dataset is a restricted public snapshot of the wmf.webrequest table, intended for caching research. The data and the reasoning behind it are described in our on-wiki documentation.

Updates

The data is updated manually and irregularly, upon request. The previous version of this dataset was released in 2016, also upon request.

Contents

The current iteration of this dataset comprises 42 compressed files: 21 hold upload (image) web request data and the other 21 hold text (pageview) web request data.

Each upload data file, denoted cache-u, contains exactly 24 hours of consecutive data. Each file is roughly 1.5GB compressed and roughly 4GB decompressed.

Each text data file, denoted cache-t, contains exactly 24 hours of consecutive data. Each file is roughly 100MB compressed and roughly 300MB decompressed.
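
Before downloading, it can help to estimate how much disk space the full set needs. The following is a rough sketch in Python based only on the approximate per-file sizes quoted above; it assumes decimal units (100MB = 0.1GB), and the constant and function names are illustrative, not part of the dataset.

```python
# Approximate per-file sizes from the description above (all "roughly" stated).
# Assumption: decimal units, so 100MB is treated as 0.1GB.
FILES_PER_KIND = 21            # 21 upload files and 21 text files
UPLOAD_COMPRESSED_GB = 1.5
UPLOAD_DECOMPRESSED_GB = 4.0
TEXT_COMPRESSED_GB = 0.1
TEXT_DECOMPRESSED_GB = 0.3


def total_gb(upload_gb: float, text_gb: float,
             files_per_kind: int = FILES_PER_KIND) -> float:
    """Total size in GB for one upload file plus one text file, times the file count."""
    return files_per_kind * (upload_gb + text_gb)


compressed_total = total_gb(UPLOAD_COMPRESSED_GB, TEXT_COMPRESSED_GB)
decompressed_total = total_gb(UPLOAD_DECOMPRESSED_GB, TEXT_DECOMPRESSED_GB)

print(f"compressed:   ~{compressed_total:.1f} GB")
print(f"decompressed: ~{decompressed_total:.1f} GB")
```

Under these assumptions the full download comes to roughly 34GB compressed and roughly 90GB decompressed, so keeping the files compressed and streaming them may be preferable.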

The compressed file names look like: cache-@-##.gz, where @ is u (upload) or t (text).
The decompressed file names look like: cache-@-##.tsv.
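
Since the decompressed files are tab-separated (.tsv), a compressed file can be read without unpacking it to disk first. The sketch below is one possible approach using only the Python standard library; the helper names iter_rows and count_rows are hypothetical, and it assumes UTF-8 text with no quoting inside fields. It makes no assumption about the column layout, which is described in the on-wiki documentation.

```python
import csv
import gzip
from typing import Iterator, List


def iter_rows(path: str) -> Iterator[List[str]]:
    """Yield each line of a gzip-compressed TSV file as a list of string fields.

    Assumptions (not confirmed by the dataset description): the content is
    UTF-8 text and fields are not quoted. Undecodable bytes are replaced
    rather than raising an error.
    """
    with gzip.open(path, mode="rt", encoding="utf-8",
                   errors="replace", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            yield row


def count_rows(path: str) -> int:
    """Count the rows in a compressed file by streaming it once."""
    return sum(1 for _ in iter_rows(path))
```

For example, calling count_rows on a downloaded cache-t file would return the number of requests it holds while reading the file as a stream, so the roughly 300MB of decompressed data is never held in memory at once.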

Download Caching Data