Advanced Mobile Contributions Metrics - Final Report

Megan Neisler
November 22, 2019

Project summary

As part of efforts to improve the mobile contribution experience, the Web team deployed the Advanced Mobile Contributions (AMC) mode. This is an opt-in feature set that adds more contributor capabilities to the mobile web experience. Please see the (project page) for more details about the background, changes and goals of the project.

The feature was first deployed as an opt-in setting to identified target wikis, including Arabic, Indonesian, Spanish, Italian, Japanese, Persian, and Thai Wikipedias, due to their relatively large populations of existing mobile editors. After testing and feedback, the team deployed and promoted the feature set to all Wikimedia projects.

Timeline

  • March 20, 2019 - AMC released as an opt-in setting on Arabic, Indonesian, and Spanish Wikipedias.
  • June 17, 2019 - The team released a second set of features and included additional Wikipedias for testing and feedback (Italian, Japanese, Persian, and Thai).
  • August 7, 2019 - AMC deployed to all Wikipedias.
  • October 10, 2019 - AMC deployed to all Wikimedia projects.

This report shows the status of the key performance indicators (KPIs) identified in the Annual Plan following the deployment of AMC to all Wikimedia projects. Results from the first progress report, showing the status of the KPIs as of the end of FY18-19 (June 2019), are available in the FY 2018/2019 AMC Metrics Status Report. We reviewed metrics overall, on English Wikipedia, and on all of the target Wikipedia projects.

Metrics

In the annual plan, the Readers Web team defined the following KPIs:

Mobile web edit rate.

  • Rate of edits on mobile web coming from AMC mode and overall rate.
  • Target: 10% increase from last year (overall metric). Measured using the new AMC edits tag (done in T212959) to see if they reach 1/11th of total mobile web edits, defined using the existing edit tag. The AMC tag was created on January 16, 2019.
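
One way to read the 1/11 threshold: if total mobile web edits grow by 10% and the whole increase carries the AMC tag, AMC edits make up 0.10 / 1.10 = 1/11 of the new total. The sketch below checks that arithmetic; the baseline value is a hypothetical placeholder, not a figure from this report.

```r
# Hypothetical baseline; any positive number gives the same share.
baseline_edits <- 1000
target_increase <- 0.10

new_total <- baseline_edits * (1 + target_increase)
amc_edits_needed <- new_total - baseline_edits

# Share of the new total that would need to carry the AMC tag
amc_share <- amc_edits_needed / new_total
isTRUE(all.equal(amc_share, 1 / 11))
```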

Retention rate for opt-in advanced mobile mode amongst medium-volume (100+ edits) and high-volume (500+ edits) editors

  • The proportion of editors who opt-in and stay opted-in to AMC mode.
  • Target: At least 60% retention.
  • Measured using the opt-in/opt-out button (done in T211197 using the PrefUpdate mf_amc_optin). This is set to true each time someone opts in.

Moderation actions on mobile web

  • Rate of moderation actions on mobile web coming from AMC mode and the overall rate.
  • Target: 10% increase from last year.
  • Moderation actions are defined in T213461
  • Measured using the AMC edit tag as for edits (above).

Other Metrics

  • Rates of access to special pages overall, with the proportion of AMC requests measured using the X-Analytics tag for AMC (done in T212961).
  • Rates of access and edits to article talk pages, with the proportion of AMC edits measured using the AMC X-Analytics tag and the AMC edit tag.

For more links to implementation tasks and technical details, see the overview task T210660.

Key observations

  • Mobile Web Edit Rate: The mobile web edit rate across all Wikipedia projects has been increasing steadily over the past year. This rate did not change significantly following AMC deployment; however, the proportion of mobile web edits made in AMC mode has been increasing. In September 2019, there was a 22.8% year over year increase in mobile web edits across all Wikipedias. About 26% of mobile web edits by logged-in users in that month were made in AMC mode. The highest proportion of AMC tagged mobile web edits is made by high volume editors (500+ cumulative edits).
  • Retention Rate: There has been an 87.7% overall retention rate of the opt-in AMC mode across all Wikimedia projects, with significant increases in the number of AMC opt-ins following the central notice banner deployed on July 15, 2019, the deployment of AMC on all Wikipedias on August 7, 2019, and the deployment across all Wikimedia projects on October 10, 2019. The retention rate of AMC was higher for medium to high volume editors compared to low volume editors.
  • Moderation actions on mobile web: There was a significant increase in the number of moderation actions following AMC deployment on all Wikipedia projects. Moderation actions on mobile web increased by 31% across all Wikipedias and 37.6% on English Wikipedia from August 2019 (the first month of deployment to all Wikipedias) to September 2019. The thank action is the most commonly used moderation action on mobile web. However, the numbers of blocks and approves have both seen significant increases since AMC was deployed on all wikis.
  • Rate of access to special pages: Requests to special pages have remained relatively flat (there was only a 3% month over month change between September 2019 and October 2019), and views from mobile web in AMC mode represent only about 2% of all mobile web views to these pages. However, there has been an increase in the proportion of mobile web requests to special pages coming from AMC mode since it was deployed on all Wikipedias on August 7, 2019. The proportion of AMC requests increased from September to October 2019 by 15.9%. Mobile Diff, Watchlist, Contributions, and Search are the most viewed special pages from users in AMC mode on all Wikipedias.
  • Rate of access to talk pages: There was a 10.6% year over year increase in the number of views to article talk pages from mobile web in October 2019. About 42% of these views came from users in AMC mode. The rate of edits to talk pages has also continued to increase following the July and August deployment dates. While other variables, such as changes in the number of active editors, may also lead to these increases, the proportion of talk page edits made by users in AMC mode has been increasing each month following deployment. AMC edits accounted for 42.1% of all talk page edits made by logged-in users on mobile web in October 2019.
In [2046]:
library(IRdisplay)

display_html(
'<script>  
code_show=true; 
function code_toggle() {
  if (code_show){
    $(\'div.input\').hide();
  } else {
    $(\'div.input\').show();
  }
  code_show = !code_show
}  
$( document ).ready(code_toggle);
</script>
  <form action="javascript:code_toggle()">
    <input type="submit" value="Click here to toggle on/off the raw code.">
 </form>'
)
In [ ]:
vertical_lines <- as.numeric(as.Date(c("2019-03-20", "2019-06-17", "2019-08-07")))

Mobile web edit rate

Methodology

The mobile web edit rate is based on edits on all Wikipedias from June 1, 2018 through September 30, 2019 recorded in the mediawiki_history dataset.

We reviewed both the rate of overall mobile web edits and the rate of edits on mobile web coming from AMC mode (measured using the AMC edits tag, done in T212959 on January 16, 2019).

At the time of this analysis, we did not review any changes on non-Wikipedia projects since AMC was not deployed on those projects until October 10, 2019.

In [1631]:
shhh <- function(expr) suppressPackageStartupMessages(suppressWarnings(suppressMessages(expr)))
shhh({
    library(magrittr); library(zeallot); library(glue); library(tidyverse); library(zoo); library(lubridate)
    library(scales)
})
In [ ]:
#Collect all mobile web edits and mobile web edits tagged as AMC on all Wikipedias,
#grouped by wiki, date, and user edit count bucket

# In terminal
# spark2R --master yarn --executor-memory 2G --executor-cores 1 --driver-memory 4G

query <- "SELECT
    date_format(event_timestamp, 'yyyy-MM-dd') as date,
    wiki,
    user_edit_count,
    sum(cast(other_mobile_web_edit as int)) as other_mobile_web_edits,
    sum(cast(amc_edit as int)) as amc_edits,
    sum(cast(mobile_web_edit as int)) as mobile_web_edits
FROM (
    SELECT
        wiki_db as wiki,
        event_timestamp,
        (array_contains(revision_tags, 'mobile web edit') and not
        array_contains(revision_tags, 'advanced mobile edit')) as other_mobile_web_edit, 
        (array_contains(revision_tags, 'advanced mobile edit') and
        array_contains(revision_tags, 'mobile web edit')) as amc_edit,
        array_contains(revision_tags, 'mobile web edit') as mobile_web_edit,
        CASE
            WHEN event_user_revision_count is NULL THEN 'undefined'
            WHEN event_user_revision_count < 100 THEN 'under 100'
            WHEN event_user_revision_count >=100 AND event_user_revision_count < 500 THEN '100-499'
            ELSE '500+'
    END AS user_edit_count
    FROM wmf.mediawiki_history mwh
    INNER JOIN canonical_data.wikis cd
        ON wiki_db = database_code 
    WHERE
        mwh.event_entity = 'revision' and
        mwh.event_type = 'create' and
        cd.database_group = 'wikipedia' and
        mwh.event_timestamp IS NOT NULL and
        mwh.event_timestamp between '2018-06-01' and '2019-09-30' and 
        mwh.snapshot = '2019-10'
) edits
GROUP BY wiki, date_format(event_timestamp, 'yyyy-MM-dd'), user_edit_count"

results <- collect(sql(query))
save(results, file="Readers-Web-AMC-metrics/Data/mobile_web_edit_counts.RData")
In [579]:
load("Data/mobile_web_edit_counts.RData")
mobile_web_edit_counts <- results
In [581]:
mobile_web_edit_counts$date <- as.Date(mobile_web_edit_counts$date, format = "%Y-%m-%d")

mobile_web_edit_counts_clean <-  mobile_web_edit_counts %>%
gather(edit_type, edit_count, 4:6) %>%
    arrange(desc(date))
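
After gather(), the long table holds the total (mobile_web_edits) alongside its two components (amc_edits and other_mobile_web_edits), so any sum over edit_count needs an edit_type filter to avoid counting each edit twice. A small sketch with toy numbers, using base R's stack() as a stand-in for tidyr::gather(); the values are made up for illustration.

```r
# Toy wide table with the same three count columns as the query result
wide <- data.frame(
  other_mobile_web_edits = c(8, 5),
  amc_edits = c(2, 5),
  mobile_web_edits = c(10, 10)
)

# stack() returns a long data frame with columns `values` and `ind`
long <- stack(wide)
names(long) <- c("edit_count", "edit_type")

# Summing every edit type counts each edit twice
sum_all_types <- sum(long$edit_count)

# Filtering to the total column gives the true number of mobile web edits
sum_total_only <- sum(long$edit_count[long$edit_type == "mobile_web_edits"])

c(all_types = sum_all_types, total_only = sum_total_only)
```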

Overall mobile web edit rate

In [1741]:
##Overall monthly web edit counts and yoy change
mobile_web_edit_monthly_overall_yoy <- mobile_web_edit_counts_clean %>%
filter(edit_type == 'mobile_web_edits') %>%   # filter to all overall mobile web edits (both AMC and non-AMC)
mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)

tail(mobile_web_edit_monthly_overall_yoy)
A tibble: 6 × 3
date        total_mobile_edits  yearOverYear
<date>      <dbl>               <dbl>
2019-04-01  1180291             NA
2019-05-01  1247127             NA
2019-06-01  1221878             24.57948
2019-07-01  1287035             26.34613
2019-08-01  1346728             26.30888
2019-09-01  1220366             22.79424
In [2005]:
##Plot time series of mobile edits rate.


p <- mobile_web_edit_monthly_overall_yoy %>%
 ggplot(aes(x=date, y = total_mobile_edits)) + # no color mapping: edit_type is not a column of this summarised table
    geom_line(color = 'blue', size = 1.5 ) +
geom_vline(xintercept = vertical_lines,
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-03-20'), y=1.15E6, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.7, vjust = -1.2, angle = 90, color = "black") +
  geom_text(aes(x=as.Date('2019-06-17'), y=1.15E6, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.7, vjust = -1.2, angle = 90, color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=1.15E6, label="AMC deployed on all Wikipedias"), size=3.7, vjust = -1.2, angle = 90, color = "black") +
    scale_y_continuous("Number of mobile web edits per month", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Mobile web edits on all Wikipedia projects") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"))

p
     
ggsave("Figures/mobile_web_edits_overall_monthly.png", p, width = 18, height = 9, units = "in", dpi = 150)

There was a steady increase in total mobile web edits over the past 15 months, a trend that began before the deployment of AMC on target wikis. This is likely partly due to a sustained increase in overall active editors. In September 2019, there was a 22.8% year over year increase in overall mobile web edits. We reviewed the number of these edits made while the user was in AMC mode to determine the impact of AMC on this increase.
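
The year over year figure compares each month with the same month one year earlier: (this month / same month last year - 1) * 100. A minimal base R sketch of that calculation, assuming one row per consecutive month with no gaps; the monthly totals are made-up numbers, not data from this report.

```r
# Illustrative monthly totals: 13 consecutive months, so only the last
# month has a value 12 months earlier to compare against.
monthly_edits <- c(100, 102, 101, 105, 107, 110, 108, 111, 115, 117, 116, 120, 125)

# Base R equivalent of dplyr::lag(x, 12): shift the series by 12 positions
lag12 <- c(rep(NA, 12), head(monthly_edits, -12))

year_over_year <- (monthly_edits / lag12 - 1) * 100
year_over_year[13]  # (125 / 100 - 1) * 100 = 25
```

The first 12 months have no comparison value, which is why the earliest rows of a year over year column are NA.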

Proportion of Mobile Web Edits Made in AMC Mode

Since August deployment Date through September 2019

In [1753]:
# across all Wikipedias for logged in users since August deployment Date
amc_edits_prop_overall <- mobile_web_edit_counts %>%
    filter(date >= "2019-08-07", #deployment date across all wikis 
           user_edit_count != 'undefined') %>% #limit only to logged in users
         summarise(mobile_web_edits = sum(mobile_web_edits, na.rm = TRUE),
                 amc_edits = sum(amc_edits, na.rm = TRUE)) %>%
         mutate(amc_prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_overall
A data.frame: 1 × 3
mobile_web_edits  amc_edits  amc_prop
<dbl>             <dbl>      <dbl>
1108080           191609     17.29198

Per Month Since Deployment

In [1756]:
# across all Wikipedias for logged in users per month since deployment

amc_edits_prop_overall_bymonth <- mobile_web_edit_counts %>%
    filter(user_edit_count != 'undefined',
          date >= '2019-03-01')%>%  #since first March deployment
    mutate(date = floor_date(date, "month")) %>%
      group_by(date) %>%
        summarise(mobile_web_edits = sum(mobile_web_edits, na.rm = TRUE),
                 amc_edits = sum(amc_edits, na.rm = TRUE)) %>%
         mutate(prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_overall_bymonth
A tibble: 7 × 4
date        mobile_web_edits  amc_edits  prop
<date>      <dbl>             <dbl>      <dbl>
2019-03-01  538779               693      0.1286242
2019-04-01  521036              1701      0.3264650
2019-05-01  568578              2351      0.4134877
2019-06-01  559685              3089      0.5519176
2019-07-01  609794             12180      1.9973958
2019-08-01  647386             45795      7.0738323
2019-09-01  582282            149097     25.6056344
In [2058]:
#Plot weekly mobile web edits by edit mode (AMC vs. other) since the first deployment
p <- mobile_web_edit_counts_clean %>%
    filter(date >= '2019-03-17', #date of first deployment to target wikis
         date <= '2019-09-28', #remove last week due to incomplete data
        edit_type != 'mobile_web_edits',
         user_edit_count != 'undefined') %>%
    mutate(date = floor_date(date, "week")) %>%
    group_by(date, edit_type) %>%
    summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE)) %>%
    ggplot(aes(x= date, y= total_mobile_edits, fill = edit_type)) +
    geom_col() + 
    geom_vline(xintercept = vertical_lines,
             linetype = "dashed", color = "black") +
    geom_text(aes(x=as.Date('2019-03-20'), y=80E3, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.6, vjust = -1.2, angle = 90, color = "black") +
    geom_text(aes(x=as.Date('2019-06-17'), y=80E3, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.6, vjust = -1.2, angle = 90, color = "black") +
    geom_text(aes(x=as.Date('2019-08-07'), y=80E3, label="AMC deployed on all Wikipedias"), size=3.6, vjust = -1.2, angle = 90, color = "black") +
    scale_y_continuous("Number of edits per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "2 weeks") +
    labs(title = "Mobile web edits on all Wikipedia projects by edit mode") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))

p
ggsave("Figures/mobile_web_edits_overall_amc_prop.png", p, width = 18, height = 9, units = "in", dpi = 150)

The proportion of AMC tagged mobile web edits has increased following each wiki deployment. Since deployment of AMC on all Wikipedias on August 7, 2019, 17.3% of all mobile web edits by logged-in users were made while in AMC mode. In September 2019, about 25.6% of mobile web edits by logged-in users were made while in AMC mode.
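
As a quick check of the monthly proportions, the sketch below recomputes prop = amc_edits / mobile_web_edits * 100 for the last three months, with the counts copied from the per-month table above. The data frame name amc_monthly is introduced here for illustration only.

```r
# Monthly counts for logged-in users, copied from the per-month table
amc_monthly <- data.frame(
  month = as.Date(c("2019-07-01", "2019-08-01", "2019-09-01")),
  mobile_web_edits = c(609794, 647386, 582282),
  amc_edits = c(12180, 45795, 149097)
)

# Percentage of mobile web edits carrying the AMC tag
amc_monthly$prop <- amc_monthly$amc_edits / amc_monthly$mobile_web_edits * 100
round(amc_monthly$prop, 1)  # 2.0, 7.1, 25.6
```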

Mobile web edit rate by user experience

In [2007]:
## Plot of overall mobile web edit rate by user edit count

p <- mobile_web_edit_counts_clean %>%
    filter(edit_type == 'mobile_web_edits', #look at all mobile web edits
          user_edit_count != 'undefined', #remove undefined user edit counts
) %>% 
    mutate(date = floor_date(date, "month")) %>%
    group_by(date, user_edit_count) %>%
    summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE))%>%
    ggplot(aes(x=date, y = total_mobile_edits, color = user_edit_count, linetype = user_edit_count)) + 
    geom_line(size = 1.5)+
    geom_text(aes(x=as.Date('2019-03-20'), y=200E3, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.5, vjust = -1.2, angle = 90, color = "black") +
  geom_text(aes(x=as.Date('2019-06-17'), y=200E3, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.5, vjust = -1.2, angle = 90, color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=200E3, label="AMC deployed on all Wikipedias"), size=3.5, vjust = -1.2, angle = 90, color = "black") +
    scale_y_continuous("Number of edits per month", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "2 months", limits = c()) +
    labs(title = "Mobile web edits by user experience on all Wikipedia projects") +
geom_vline(xintercept = vertical_lines,
             linetype = "dashed", color = "black") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="bottom")

ggsave("Figures/mobile_web_edits_byeditcount.png",p, width = 18, height = 9, units = "in", dpi = 150)

p
In [1834]:
##Calculate overall YOY increase for each user edit count group across all wikis
#Restrict to edit_type == 'mobile_web_edits': the gathered table also holds the AMC and
#non-AMC components, so summing every edit type would count each edit twice.

mobile_web_edit_under100 <- mobile_web_edit_counts_clean %>%
    filter(user_edit_count == 'under 100',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)


mobile_web_edit_100 <- mobile_web_edit_counts_clean %>%
    filter(user_edit_count == '100-499',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)


mobile_web_edit_500 <- mobile_web_edit_counts_clean %>%
    filter(user_edit_count == '500+',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)

Overall year over year change in mobile web edits by editor experience

In [1835]:
# yoy table: row 16 is September 2019 (the series starts in June 2018)
edit_count <- c('under 100', '100-499', '500+')
mobile_web_editcount_yoy <- rbind(mobile_web_edit_under100[16,], mobile_web_edit_100[16,], mobile_web_edit_500[16,])

mobile_web_editcount_yoy$edit_count= edit_count

mobile_web_editcount_yoy
A tibble: 3 × 4
date        total_mobile_edits  yearOverYear  edit_count
<date>      <dbl>               <dbl>         <chr>
2019-09-01  198679              14.94700      under 100
2019-09-01   84715              24.25016      100-499
2019-09-01  298888              44.12715      500+

In September 2019, there was a year over year increase in mobile web edits for all user edit count groups but the highest increase (44%) was seen for logged-in editors with over 500 cumulative edits on Wikipedia projects. This is an increase from the 37% year over year increase in June 2019 for the same editor group.
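
The edit count groups used throughout (under 100, 100-499, 500+) come from the CASE expression in the query. A minimal base R sketch of the same bucketing with cut(), assuming made-up cumulative revision counts; NA stands in for edits with no user revision count, which the query labels 'undefined'.

```r
# Hypothetical cumulative revision counts; NA has no registered count
revision_counts <- c(NA, 3, 99, 100, 499, 500, 12000)

buckets <- cut(
  revision_counts,
  breaks = c(-Inf, 100, 500, Inf),
  labels = c("under 100", "100-499", "500+"),
  right = FALSE  # intervals are [lower, upper), so 100 falls in '100-499'
)

# Mirror the query's 'undefined' label for missing counts
buckets <- ifelse(is.na(buckets), "undefined", as.character(buckets))
buckets
```

Setting right = FALSE matters at the boundaries: an editor with exactly 100 or 500 edits lands in the higher group, matching the >= comparisons in the query.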

Proportion of mobile web edits tagged with AMC by experience

Since August deployment Date through September 2019

In [1760]:
amc_edits_prop_overall_byeditcount <- mobile_web_edit_counts %>%
    filter(date >= "2019-08-07", #deployment date across all wikis
          user_edit_count != 'undefined') %>% 
         group_by(user_edit_count) %>% 
        summarise(mobile_web_edits = sum(mobile_web_edits, na.rm = TRUE),
                 amc_edits = sum(amc_edits, na.rm = TRUE)) %>%
        mutate(prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_overall_byeditcount
A tibble: 3 × 4
user_edit_count  mobile_web_edits  amc_edits  prop
<chr>            <dbl>             <dbl>      <dbl>
100-499          165873             30950     18.658853
500+             562434            147708     26.262281
under 100        379773             12951      3.410195

In September 2019

In [594]:
amc_edits_prop_overall_byeditcount_Sept <- mobile_web_edit_counts %>%
    filter(date >= "2019-09-01",
           date <= '2019-09-30', #September 2019 only
          user_edit_count != 'undefined') %>% 
         group_by(user_edit_count) %>% 
        summarise(mobile_web_edits = sum(mobile_web_edits, na.rm = TRUE),
                 amc_edits = sum(amc_edits, na.rm = TRUE)) %>%
        mutate(prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_overall_byeditcount_Sept
A tibble: 3 × 4
user_edit_count  mobile_web_edits  amc_edits  prop
<chr>            <dbl>             <dbl>      <dbl>
100-499           84715             23642     27.907690
500+             298888            117695     39.377626
under 100        198679              7760      3.905798

The highest proportion of AMC mobile web edits is made by high volume editors and the lowest by low volume editors. From August 7, 2019 (deployment date on all wikis) to the end of September 2019, 26.3% of all logged-in mobile web edits by high volume editors (500+) were made while in AMC mode; in September 2019 alone, the proportion was 39.4%.

Mobile web edit rate for English Wikipedia

In [2008]:
##Plot of mobile web edits for enwiki.

p <- mobile_web_edit_counts_clean %>%
     filter(edit_type == 'mobile_web_edits',
           wiki == 'enwiki')%>%
     mutate(date = floor_date(date, "month")) %>%
    group_by(date, wiki)%>%
    summarise(monthly_edits = sum(edit_count)) %>%
    ggplot(aes(x=date, y = monthly_edits)) + 
    geom_line(size = 1.5, color = "blue") +
    geom_vline(xintercept = as.numeric(as.Date(c("2019-08-07"))),
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-08-08'), y=4.6E5, label="AMC deployed on all Wikipedias"), size=3.7, vjust = -1.2, angle = 90, color = "black") +
    scale_y_continuous("Number of edits per month", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "2 months", limits = c()) +
    labs(title = "Mobile web edits on English Wikipedia") +
    ggthemes::theme_tufte(base_size = 10, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position= "bottom",
        legend.text=element_text(size = 12))

p

ggsave("Figures/mobile_web_edits_enwiki.png", p, width = 18, height = 9, units = "in", dpi = 150)

Proportion of mobile web edits tagged with AMC on English Wikipedia

Since deployment through September 2019

In [596]:
#Overall Proportion of mobile web edits tagged with AMC on English Wikipedia since deployment

amc_edits_prop_enwiki <- mobile_web_edit_counts %>%
    filter(date >= "2019-08-07", #deployment date across all wikis
          wiki == 'enwiki',
          user_edit_count != 'undefined') %>% 
        summarise(mobile_web_edits = sum(mobile_web_edits),
                 amc_edits = sum(amc_edits)) %>%
        mutate(prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_enwiki
A data.frame: 1 × 3
mobile_web_edits  amc_edits  prop
<dbl>             <dbl>      <dbl>
423224            68594      16.20749

Per month since deployment

In [1763]:
amc_edits_prop_enwiki_permonth <- mobile_web_edit_counts %>%
    filter(date >= "2019-08-01" , #start of the month of deployment across all wikis
          wiki == 'enwiki',
          user_edit_count != 'undefined') %>% 
    mutate(date = floor_date(date, "month")) %>%
      group_by(date) %>%
        summarise(mobile_web_edits = sum(mobile_web_edits, na.rm = TRUE),
                 amc_edits = sum(amc_edits, na.rm = TRUE)) %>%
         mutate(prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_enwiki_permonth
A tibble: 2 × 4
date        mobile_web_edits  amc_edits  prop
<date>      <dbl>             <dbl>      <dbl>
2019-08-01  247216            11701       4.733108
2019-09-01  221840            56893      25.645961
In [2009]:
p <- mobile_web_edit_counts_clean %>%
  filter(date >= '2019-08-07',
         date <= '2019-09-28', #remove last week due to incomplete data
        edit_type != 'mobile_web_edits',
         wiki == 'enwiki',
         user_edit_count != 'undefined') %>%
 mutate(date = floor_date(date, "week")) %>%
  filter(date != '2019-08-04') %>% #remove incomplete week
  group_by(date, edit_type) %>%
  summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE)) %>%
  ggplot(aes(x= date, y= total_mobile_edits, fill = edit_type)) +
  geom_col() + 
 geom_vline(xintercept = as.Date('2019-08-07'),
             linetype = "dashed", color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=6E3, label="AMC deployed on all Wikipedias"), size=4, vjust = -1.2, angle = 90, color = "black") +
    scale_y_continuous("Number of edits per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "2 weeks") +
    labs(title = "Mobile web edits on English Wikipedia by edit mode") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))

p
ggsave("Figures/mobile_web_edits_enwiki_amc_prop.png", p, width = 18, height = 9, units = "in", dpi = 150)

AMC edits account for about 16.2% of all mobile web edits made on English Wikipedia by logged-in users since deployment on August 7, 2019. The proportion of mobile web edits by logged-in users made while in AMC mode increased from about 4.7% in the first month of deployment to 25.6% in September.

In [2011]:
# Plot of overall mobile web edit rate by user edit count

p <- mobile_web_edit_counts_clean %>%
    filter(edit_type == 'mobile_web_edits',
           wiki == 'enwiki',
          user_edit_count != 'undefined') %>% ### remove undefined user edit counts
    mutate(date = floor_date(date, "month")) %>%
    group_by(date, user_edit_count) %>%
    summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE))%>%
    ggplot(aes(x=date, y = total_mobile_edits, color = user_edit_count, linetype = user_edit_count)) + 
    geom_line(size = 1.5)+
    scale_y_continuous("Number of edits per month", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "2 months", limits = c()) +
    labs(title = "Mobile web edits by user edit count on English Wikipedia") +
geom_vline(xintercept = as.numeric(as.Date(c("2019-08-07"))), # deployment date on all Wikipedias
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-08-08'), y=7E4, label="AMC deployed on all Wikipedias"), size=3.7, vjust = -1.2, angle = 90, color = "black") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="bottom")

ggsave("Figures/mobile_web_edits_enwiki_byeditcount.png", p , width = 18, height = 9, units = "in", dpi = 150)

p

Proportion of mobile web edits tagged with AMC on English Wikipedia by editor experience

Since deployment through September 2019

In [1782]:
amc_edits_prop_enwiki_byeditcount <- mobile_web_edit_counts %>%
    filter(date >= "2019-08-07", #deployment date across all wikis
          wiki == 'enwiki',
          user_edit_count != 'undefined') %>% 
         group_by(user_edit_count) %>% 
        summarise(mobile_web_edits = sum(mobile_web_edits),
                 amc_edits = sum(amc_edits)) %>%
        mutate(prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_enwiki_byeditcount
A tibble: 3 × 4
user_edit_count  mobile_web_edits  amc_edits  prop
<chr>            <dbl>             <dbl>      <dbl>
100-499           63126            11876      18.813167
500+             209013            52075      24.914718
under 100        151085             4643       3.073105

Similar to the trends seen across all Wikipedia projects, the highest proportion of AMC edits on English Wikipedia is made by high volume editors. From August 7, 2019 (deployment date on all wikis) to the end of September 2019, 24.9% of all logged-in mobile web edits by high volume editors (500+) were made while in AMC mode.

Mobile web edit rate on target wikis

In [2013]:
##Plot of mobile web edits by target wiki.

p <- mobile_web_edit_counts_clean %>%
     filter(edit_type == 'mobile_web_edits',
           wiki %in% c('eswiki', 'arwiki', 'idwiki', 'itwiki', 'jawiki', 'fawiki', 'thwiki'))%>%
     mutate(date = floor_date(date, "month")) %>%
    group_by(date, wiki)%>%
    summarise(monthly_edits = sum(edit_count)) %>%
    ggplot(aes(x=date, y = monthly_edits, color = wiki)) + 
    geom_line(size = 1.5) +
    geom_vline(xintercept = vertical_lines,
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-03-20'), y=6E4, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.7, vjust = -1.2, angle = 90, color = "black") +
  geom_text(aes(x=as.Date('2019-06-17'), y=6E4, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.7, vjust = -1.2, angle = 90, color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=6E4, label="AMC deployed on all Wikipedias"), size=3.7, vjust = -1.2, angle = 90, color = "black") +
    scale_y_continuous("Number of edits per month", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "2 months", limits = c()) +
    labs(title = "Monthly mobile web edits on target Wikipedia projects ") +
    ggthemes::theme_tufte(base_size = 10, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position= "bottom",
        legend.text=element_text(size = 12))

p

ggsave("Figures/mobile_web_edits_bytargetwiki.png", p, width = 18, height = 9, units = "in", dpi = 150)
In [1831]:
##Calculate YOY change for target wikis 
#Arwiki
mobile_web_edit_arwiki <- mobile_web_edit_counts_clean %>%
    filter(wiki == 'arwiki',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)


#EsWiki
mobile_web_edit_eswiki <- mobile_web_edit_counts_clean %>%
    filter(wiki == 'eswiki',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)


#idwiki
mobile_web_edit_idwiki <- mobile_web_edit_counts_clean %>%
    filter(wiki == 'idwiki',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)


#itwiki
mobile_web_edit_itwiki <- mobile_web_edit_counts_clean %>%
    filter(wiki == 'itwiki',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)


#jawiki
mobile_web_edit_jawiki <- mobile_web_edit_counts_clean %>%
    filter(wiki == 'jawiki',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)


#fawiki
mobile_web_edit_fawiki <- mobile_web_edit_counts_clean %>%
    filter(wiki == 'fawiki',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)

#thwiki
mobile_web_edit_thwiki <- mobile_web_edit_counts_clean %>%
    filter(wiki == 'thwiki',
          edit_type == 'mobile_web_edits') %>%
    mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_mobile_edits = sum(edit_count)) %>%
  arrange(date) %>%
mutate(yearOverYear= (total_mobile_edits/lag(total_mobile_edits,12) -1)*100)
In [1833]:
# Create YoY Table
wiki_list <- c('arwiki', 'eswiki', 'idwiki', 'itwiki', 'jawiki', 'fawiki', 'thwiki')
mobile_web_edit_yoy <- rbind(mobile_web_edit_arwiki[16,], mobile_web_edit_eswiki[16,],
                            mobile_web_edit_idwiki[16,], mobile_web_edit_itwiki[16,],
                            mobile_web_edit_jawiki[16,], mobile_web_edit_fawiki[16,], mobile_web_edit_thwiki[13,])

mobile_web_edit_yoy
A tibble: 7 × 3
date        total_mobile_edits  yearOverYear
<date>      <dbl>               <dbl>
2019-09-01   22123               1.332906
2019-09-01  114102               1.370837
2019-09-01   19574              70.520080
2019-09-01   71124              33.571214
2019-09-01   77213              48.372406
2019-09-01   32927              55.389335
2019-06-01   13795              29.676631

Similar to overall trends, there was a steady increase in total mobile web edits over the past 15 months for all of the target Wikipedia projects.

The table below shows a comparison of year-over-year rates seen in June 2019 and September 2019 for all of the target wikis. In September 2019, there was a year-over-year increase for all target wikis, ranging from 1.3% on Arabic Wikipedia to 70.5% on Indonesian Wikipedia.

As noted, these increases were occurring prior to the deployment of AMC on target wikis and cannot be attributed to the deployment of AMC alone. These changes are partly due to a sustained increase in overall active editors seen on these wikis.

Year over year changes in mobile web edit rates on target wikis

Wiki              June 2019   September 2019
Arabic Wiki       -7.4%        1.3%
Spanish Wiki      16.7%        1.4%
Indonesian Wiki   22.1%       70.5%
Italian Wiki      27.1%       33.6%
Japanese Wiki     37.9%       48.4%
Persian Wiki      57.9%       55.4%
Thai Wiki         23.3%       29.7%
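
The year-over-year figures above are pulled by row position (`[16,]`, and `[13,]` for Thai Wikipedia, whose row displays 2019-06-01). A minimal sketch of selecting the comparison month by date instead is shown below. The helper name `yoy_for_month` and the toy data frame are hypothetical and are not part of the original analysis; the sketch uses base R and assumes one row per wiki and month.

```r
# Hypothetical helper: year-over-year change (in percent) for one wiki and month.
# `monthly` is assumed to have columns `wiki`, `date` (first day of the month)
# and `total_mobile_edits`.
yoy_for_month <- function(monthly, wiki_name, month) {
  month <- as.Date(month)
  # Same calendar month one year earlier
  prior <- seq(month, by = "-1 year", length.out = 2)[2]
  rows <- monthly[monthly$wiki == wiki_name, ]
  current <- rows$total_mobile_edits[rows$date == month]
  previous <- rows$total_mobile_edits[rows$date == prior]
  # Return NA rather than silently using a different row when a month is missing
  if (length(current) != 1 || length(previous) != 1) return(NA_real_)
  (current / previous - 1) * 100
}

# Made-up counts for illustration only (not taken from the report's data)
toy <- data.frame(
  wiki = "xxwiki",
  date = as.Date(c("2018-09-01", "2019-09-01")),
  total_mobile_edits = c(1000, 1250)
)

yoy_for_month(toy, "xxwiki", "2019-09-01")  # 25
```

Selecting by date returns `NA` when a month is absent, which makes gaps in a wiki's series visible instead of shifting the comparison to a different month.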

Proportion of mobile web edits tagged with AMC on target wikis

Arabic, Spanish and Indonesian Target Wikis (Since March deployment through end of September 2019)

In [277]:
amc_edits_prop_target <- mobile_web_edit_counts %>%
    filter(date >= "2019-03-20", #deployment date
          wiki %in% c('arwiki', 'idwiki', 'eswiki'),
          user_edit_count != 'undefined') %>% 
        group_by(wiki) %>%
        summarise(mobile_web_edits = sum(mobile_web_edits),
                 amc_edits = sum(amc_edits)) %>% 
  mutate(prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_target
A tibble: 3 × 4
wiki    mobile_web_edits  amc_edits  prop
<chr>   <dbl>             <dbl>      <dbl>
arwiki   97700            13375      13.689867
eswiki  255499            22791       8.920191
idwiki   76099             6731       8.845057
In [2014]:
p <- mobile_web_edit_counts_clean %>%
  filter(date >= '2019-03-30',
         date <= '2019-09-28', #remove last week due to incomplete data
        edit_type != 'mobile_web_edits',
         wiki %in% c('arwiki', 'idwiki', 'eswiki'),
         user_edit_count != 'undefined') %>%
mutate(date = floor_date(date, "week")) %>%
  group_by(date, wiki, edit_type) %>%
  summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE)) %>%
  ggplot(aes(x= date, y= total_mobile_edits, fill = edit_type)) +
  geom_col() + 
  facet_wrap(~ wiki, nrow = 3, scale = "free_y") +
    scale_y_continuous("Number of edits per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "2 weeks") +
    labs(title = "Mobile web edits on target wikipedias",
         subtitle = "Where AMC deployed on March 20, 2019") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))

p
ggsave("Figures/mobile_web_edits_March-target_amc_prop.png", p, width = 18, height = 9, units = "in", dpi = 150)

Italian, Japanese, Persian and Thai Target Wikis (Since June deployment through end of September 2019)

In [278]:
amc_edits_prop_target_July <- mobile_web_edit_counts %>%
    filter(date >= "2019-06-17", #deployment date
          wiki %in% c('itwiki', 'jawiki', 'fawiki', 'thwiki' ),
          user_edit_count != 'undefined') %>% 
        group_by(wiki) %>%
        summarise(mobile_web_edits = sum(mobile_web_edits),
                 amc_edits = sum(amc_edits)) %>% 
  mutate(prop = amc_edits/mobile_web_edits *100)

amc_edits_prop_target_July
A tibble: 4 × 4
wiki    mobile_web_edits  amc_edits  prop
<chr>   <dbl>             <dbl>      <dbl>
fawiki   69884            10950      15.668823
itwiki   98711            10296      10.430448
jawiki  150093            13568       9.039729
thwiki   24545             2188       8.914239
In [2015]:
p <- mobile_web_edit_counts_clean %>%
  filter(date >= '2019-06-17',
         date <= '2019-09-28', #remove last week due to incomplete data
        edit_type != 'mobile_web_edits',
         wiki %in% c('itwiki', 'jawiki', 'fawiki', 'thwiki'),
         user_edit_count != 'undefined') %>%
mutate(date = floor_date(date, "week")) %>%
  group_by(date, wiki, edit_type) %>%
  summarise(total_mobile_edits = sum(edit_count, na.rm = TRUE)) %>%
  ggplot(aes(x= date, y= total_mobile_edits, fill = edit_type)) +
  geom_col() + 
  facet_wrap(~ wiki, nrow = 4, scale = "free_y") +
    scale_y_continuous("Number of edits per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "2 weeks") +
    labs(title = "Mobile web edits on target wikipedias",
         subtitle = "Where AMC was deployed on June 17, 2019") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))

p
ggsave("Figures/mobile_web_edits_June_target_amc_prop.png", p, width = 18, height = 9, units = "in", dpi = 150)

On target wikis, the proportion of all mobile web edits made by logged-in users while in AMC mode has also increased since deployment but does not account for as high a proportion of edits as seen on English Wikipedia.

Arabic (13.7%) and Persian (15.7%) Wikipedias have had the highest proportions of AMC-tagged mobile web edits since deployment, while the proportion of AMC-tagged mobile web edits on the other target wikis ranges from 8.8% to 10.4%.
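
As a quick check of the percentages quoted above, the sketch below recomputes the AMC share from the edit counts printed in the two output tables. The function name `amc_share` is a hypothetical helper introduced for illustration, and the code uses base R rather than the notebook's dplyr pipeline.

```r
# Hypothetical helper: share of mobile web edits carrying the AMC tag, in percent
amc_share <- function(amc_edits, mobile_web_edits) {
  amc_edits / mobile_web_edits * 100
}

# Counts copied from the two output tables above
counts <- data.frame(
  wiki = c("arwiki", "eswiki", "idwiki", "fawiki", "itwiki", "jawiki", "thwiki"),
  mobile_web_edits = c(97700, 255499, 76099, 69884, 98711, 150093, 24545),
  amc_edits = c(13375, 22791, 6731, 10950, 10296, 13568, 2188)
)

counts$prop <- round(amc_share(counts$amc_edits, counts$mobile_web_edits), 1)

# Sort from highest to lowest AMC share
counts[order(-counts$prop), ]
```

Note that the two groups of wikis cover different periods (from March 20 and from June 17 respectively), so the shares are not strictly comparable across groups.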

Retention rate for AMC

Methodology

We calculated the retention rate by reviewing the number of users who opted in to AMC mode in their preferences and stayed opted in during the retention period (from the deployment dates through October 30, 2019).

This was measured using opt-in/opt-out button events (done in T211197) recorded in the PrefUpdate table.

In [ ]:
##Collect retention rates on target wikis with breakdown by user edit counts

# In terminal
# spark2R --master yarn --executor-memory 2G --executor-cores 1 --driver-memory 4G

query <- 
"with amc_optins as (
SELECT CONCAT(year,'-',LPAD(month,2,'0'),'-',LPAD(day,2,'0')) AS date,
wiki, 
event.isdefault as amc_selection,
event.userid as userid
FROM event_sanitized.prefupdate 
WHERE event.property = 'mf_amc_optin'
AND year = 2019 AND ((month = 3 AND day >= 20) OR (month >= 4 AND month <= 10))
),
edits as (
SELECT
event_user_id as userid,
wiki_db,
ARRAY_CONTAINS(event_user_groups_historical, 'bot') AS user_is_bot,
CASE
    WHEN max(event_user_revision_count) is NULL THEN 'undefined'
    WHEN max(event_user_revision_count) < 100 THEN 'under 100'
    WHEN max(event_user_revision_count) >=100 AND max(event_user_revision_count) < 500 THEN '100-499'
    ELSE '500+'
    END AS user_edit_count
FROM wmf.mediawiki_history
WHERE snapshot = '2019-10' 
Group by event_user_id, wiki_db, ARRAY_CONTAINS(event_user_groups_historical, 'bot')
)
SELECT date, wiki, amc_selection, user_edit_count, user_is_bot, COUNT(*) as n_opt
FROM amc_optins
LEFT JOIN edits
ON amc_optins.userid = edits.userid and 
amc_optins.wiki = edits.wiki_db
GROUP BY date, wiki, amc_selection, user_edit_count, user_is_bot"


results <- collect(sql(query))
save(results, file="Readers-Web-AMC-metrics/Data/amc_retention_rates.RData")
In [683]:
load("Data/amc_retention_rates.RData")
amc_retention_rates <- results
In [684]:
amc_retention_rates$date <- as.Date(amc_retention_rates$date, format = "%Y-%m-%d")
#Revise amc_selection to a factor and clarify TRUE and FALSE labels.
amc_retention_rates$amc_selection %<>% factor(c(TRUE, FALSE), c("amc_opt_out", "amc_opt_in"))
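
The relabeling step above maps the logical `isdefault` flag to readable labels. The sketch below shows the same `factor()` call on made-up values, on the assumption (taken from the labels used in the notebook) that `TRUE` means the preference was returned to its default, i.e. an opt-out. The vector `is_default` is hypothetical.

```r
# Made-up flag values for illustration only
is_default <- c(TRUE, FALSE, FALSE, TRUE)

# levels gives the raw values in order; labels gives their replacements
selection <- factor(is_default,
                    levels = c(TRUE, FALSE),
                    labels = c("amc_opt_out", "amc_opt_in"))

as.character(selection)
# "amc_opt_out" "amc_opt_in" "amc_opt_in" "amc_opt_out"

# Count of each selection
table(selection)
```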

Overall retention rate of opt-in AMC

In [729]:
# Overall retention rate across all Wikipedias.

amc_retention_overall_percent <- amc_retention_rates %>%
    filter(user_edit_count != 'undefined',
           user_is_bot != 'TRUE', #remove any bots and unregistered users
          date <= '2019-10-30')  %>%
    spread(amc_selection, n_opt) %>%
    summarise(amc_opt_out = sum(amc_opt_out, na.rm = TRUE),
              amc_opt_in = sum(amc_opt_in, na.rm = TRUE),                 
        prop_opt_in_percent =(amc_opt_in)/(amc_opt_out+amc_opt_in)*100)

head(amc_retention_overall_percent)
A data.frame: 1 × 3
amc_opt_out  amc_opt_in  prop_opt_in_percent
<dbl>        <dbl>       <dbl>
4636         33075       87.7065

There has been an 87.7% overall retention rate of the opt-in AMC mode across all Wikimedia projects, surpassing the target of 60%.
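
The retention rate used throughout this section is the share of opt-in events among all opt-in and opt-out events. A minimal worked example with the overall counts from the output above follows; `retention_pct` and `target_pct` are hypothetical names introduced here.

```r
# Hypothetical helper: opt-ins as a percentage of all opt-in and opt-out events
retention_pct <- function(opt_in, opt_out) {
  opt_in / (opt_in + opt_out) * 100
}

# Overall counts copied from the output above
overall <- retention_pct(opt_in = 33075, opt_out = 4636)
round(overall, 1)  # 87.7

# Compare against the 60% target stated in the report
target_pct <- 60
overall >= target_pct  # TRUE
```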

In [2016]:
p <- amc_retention_rates %>%
 filter(user_edit_count != 'undefined',
            user_is_bot != 'TRUE', #remove any bots and unregistered users
          date <= '2019-10-30') %>%
    mutate(date = floor_date(date, "week")) %>%
    group_by(date, amc_selection) %>% 
    summarise(n_opt = sum(n_opt)) %>% 
    ggplot(aes(x=date, y = n_opt, color = amc_selection)) + 
    geom_line(size = 1)+
    scale_y_continuous("AMC selection count per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Retention rate of opt-in AMC on all Wikimedia Projects") +
geom_vline(xintercept = c(vertical_lines, as.Date('2019-07-15'),as.Date('2019-10-10')),
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-03-20'), y=2E3, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.6, vjust = -1.2, angle = 90, color = "black") +
  geom_text(aes(x=as.Date('2019-06-17'), y=2E3, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.6, vjust = -1.2, angle = 90, color = "black") +
     geom_text(aes(x=as.Date('2019-07-15'), y=2E3, label="AMC central notice banner deployed"), size=3.6, vjust = -1.2, angle = 90, color = "black") +
           geom_text(aes(x=as.Date('2019-08-07'), y=2E3, label="AMC deployed on all Wikipedias"), size=3.6, vjust = -1.2, angle = 90, color = "black") +
           geom_text(aes(x=as.Date('2019-10-10'), y=2E3, label="AMC deployed on all Wikimedia Projects"), size=3.6, vjust = -1.2, angle = 90, color = "black") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="bottom")

ggsave("Figures/amc_retention_overall.png", p, width = 18, height = 9, units = "in", dpi = 150)
p

The time series chart above shows the weekly number of AMC opt-ins and opt-outs following each deployment and campaign. There were significant increases in the number of AMC opt-ins following the central notice banner deployed on July 15, 2019, the deployment of AMC on all Wikipedias on August 7, 2019, and the deployment across all Wikimedia projects on October 10, 2019.

There is also a sudden spike that occurs from August 28, 2019 through August 31, 2019. Further investigation is needed to determine if this is due to a change in user behavior or a data artifact.

Overall Retention Rate of opt-in AMC by editor experience

In [731]:
# Overall retention rate for 100+ and 500+ editors

amc_retention_prop_overall_byeditor <- amc_retention_rates %>%
  filter(user_edit_count != 'undefined',
            user_is_bot != 'TRUE', #remove any bots and unregistered users
          date <= '2019-10-30') %>%
group_by(user_edit_count, amc_selection) %>%
summarise(n_opt = sum(n_opt)) %>%
spread(amc_selection, n_opt) %>%
group_by(user_edit_count) %>%
mutate(prop_opt_in_perct = amc_opt_in/(amc_opt_out+amc_opt_in)*100)


amc_retention_prop_overall_byeditor
A grouped_df: 3 × 4
user_edit_count  amc_opt_out  amc_opt_in  prop_opt_in_perct
<chr>            <dbl>        <dbl>       <dbl>
100-499           589          8559       93.56143
500+             1937         12329       86.42226
under 100        2110         12187       85.24166
In [2017]:
##Overall retention rate proportion editor type

p <- amc_retention_rates %>%
   filter(
       user_edit_count != 'undefined',
            user_is_bot != 'TRUE', #remove any bots and unregistered users
          date <= '2019-10-30') %>%
    group_by(user_edit_count, amc_selection) %>% 
    summarise(n_opt = sum(n_opt))%>% 
    ungroup()%>%
    ggplot(aes(x=factor(1), y= n_opt, fill = amc_selection)) + 
    geom_col(position="fill") +
    scale_y_continuous(labels = scales::percent_format()) +
    facet_wrap(~ user_edit_count, scale = "free_y") +
    labs(title = "Retention rate of opt-in AMC by user edit count",
        fill = "AMC Selection",
        x = NULL,
        y = "Proportion of AMC option selection") +
  ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_blank(),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="bottom")

ggsave("Figures/amc_retention_overall_byeditcount.png", p, width = 18, height = 9, units = "in", dpi = 150)
p

The retention rate of AMC was higher for medium- and high-volume editors than for low-volume editors (85.2%). There was a 93.6% overall retention rate amongst users with 100 to 499 edits and an 86.4% rate amongst users with 500+ edits.

Retention Rate of opt-in AMC on English Wikipedia

In [733]:
amc_retention_overall_percent_enwiki <- amc_retention_rates %>%
    filter( wiki == 'enwiki',
            user_edit_count != 'undefined',
            user_is_bot != 'TRUE', #remove any bots and unregistered users
          date <= '2019-10-30')  %>%
    spread(amc_selection, n_opt) %>%
    summarise(amc_opt_out = sum(amc_opt_out, na.rm = TRUE),
              amc_opt_in = sum(amc_opt_in, na.rm = TRUE),                 
        prop_opt_in_percent =(amc_opt_in)/(amc_opt_out+amc_opt_in)*100)

amc_retention_overall_percent_enwiki
A data.frame: 1 × 3
amc_opt_out  amc_opt_in  prop_opt_in_percent
<dbl>        <dbl>       <dbl>
1763         14449       89.12534
In [734]:
# Enwiki retention rate by user experience

amc_retention_prop_enwiki_byeditor <- amc_retention_rates %>%
 filter(wiki == 'enwiki',
            user_edit_count != 'undefined',
            user_is_bot != 'TRUE', #remove any bots and unregistered users
          date <= '2019-10-30') %>%
group_by(user_edit_count, amc_selection) %>%
summarise(n_opt = sum(n_opt)) %>%
spread(amc_selection, n_opt) %>%
group_by(user_edit_count) %>%
mutate(prop_opt_in_perct = amc_opt_in/(amc_opt_out+amc_opt_in)*100)


amc_retention_prop_enwiki_byeditor
A grouped_df: 3 × 4
user_edit_count  amc_opt_out  amc_opt_in  prop_opt_in_perct
<chr>            <dbl>        <dbl>       <dbl>
100-499          243          4490        94.86584
500+             806          5362        86.93256
under 100        714          4597        86.55620

On English Wikipedia, 89% of all registered users who opted in to AMC mode between deployment on August 7, 2019 and October 30, 2019 stayed opted in.

Similar to trends seen across all Wikimedia projects, the highest retention rate (94.9%) is seen for users with cumulative edit counts between 100 and 499. High-volume (500+ cumulative edits) and low-volume (under 100 cumulative edits) user groups had similar retention rates (86.9% and 86.6%).

Retention Rate of opt-in AMC on Target Wikipedias

In [735]:
# Overall retention rate

amc_retention_targetwiki_percent <- amc_retention_rates %>%
filter(wiki %in% c('eswiki', 'arwiki', 'idwiki', 'itwiki', 'jawiki', 'fawiki', 'thwiki'), 
     user_edit_count != 'undefined',
            user_is_bot != 'TRUE', #remove any bots and unregistered users
          date <= '2019-10-30') %>%  
spread(amc_selection, n_opt)%>%
group_by(wiki) %>%
 summarise(amc_opt_out = sum(amc_opt_out, na.rm = TRUE),
              amc_opt_in = sum(amc_opt_in, na.rm = TRUE),                 
        prop_opt_in_percent =(amc_opt_in)/(amc_opt_out+amc_opt_in)*100)

amc_retention_targetwiki_percent
A tibble: 7 × 4
wiki    amc_opt_out  amc_opt_in  prop_opt_in_percent
<chr>   <dbl>        <dbl>       <dbl>
arwiki  427          1947        82.01348
eswiki  336          1818        84.40111
fawiki  267          1394        83.92535
idwiki  162           798        83.12500
itwiki  207          1014        83.04668
jawiki  238           944        79.86464
thwiki   55           204        78.76448
In [2018]:
##Plot overall proportion of AMC retention rates on each target wiki

p <- amc_retention_rates %>%
    filter(wiki %in% c('eswiki', 'arwiki', 'idwiki', 'itwiki', 'jawiki', 'fawiki', 'thwiki'), 
     user_edit_count != 'undefined',
      user_is_bot != 'TRUE', #remove any bots and unregistered users
     date <= '2019-10-30') %>% 
    group_by(wiki, amc_selection) %>% 
    summarise(n_opt = sum(n_opt)) %>% 
ggplot(aes(x=factor(1), y= n_opt, fill = amc_selection)) + 
    geom_col(position="fill") +
    scale_y_continuous(labels = scales::percent_format()) +
    facet_wrap(~wiki, scale = "free_y") +
    labs(title = "Retention rate of opt-in AMC on all target Wikipedias",
        fill = "AMC Selection",
        x= NULL,
        y = "Proportion of AMC option selection") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_blank(),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="bottom")

ggsave("Figures/amc_retention_target_wikis.png", p, width = 18, height = 9, units = "in", dpi = 150)
p

On target wikis, opt-in retention rates were slightly below the retention rate for English Wikipedia but were still above the target of 60%. Retention rates ranged from 78.8% on Thai Wikipedia to 84.4% on Spanish Wikipedia.

In [2019]:
# Retention rates for Target wikis where AMC was deployed in March
p <- amc_retention_rates %>%
    filter(wiki %in% c('eswiki', 'arwiki', 'idwiki'), 
     user_edit_count != 'undefined',
     user_is_bot != 'TRUE', #remove any bots and unregistered users
     date >= '2019-03-20',
    date <= '2019-10-30') %>%
    mutate(date = floor_date(date, "week")) %>%
    group_by(date, wiki, amc_selection) %>% 
    summarise(n_opt = sum(n_opt)) %>% 
    ggplot(aes(x=date, y = n_opt, color = amc_selection)) + 
    geom_line(size = 1)+
    facet_wrap(~ wiki, nrow = 4, scale = "free_y") +
    scale_y_continuous("AMC selection count per week", labels = polloi::compress) +
    scale_x_date(labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Retention rate of opt-in AMC on target Wikipedias",
        subtitle = "Where AMC was deployed on March 20, 2019") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
          axis.title.x=element_blank(),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="bottom")

ggsave("Figures/amc_retention_targetwiki_March_weekly.png", p, width = 18, height = 9, units = "in", dpi = 150)
p
In [2020]:
# Retention rates for Target wikis where AMC was deployed in June

p <- amc_retention_rates %>%
    filter(wiki %in% c('itwiki', 'jawiki', 'fawiki', 'thwiki'), 
     user_edit_count != 'undefined',
     user_is_bot != 'TRUE', #remove any bots and unregistered users
     date >= '2019-06-17',
    date <= '2019-10-30') %>%
    mutate(date = floor_date(date, "week")) %>%
    group_by(date, wiki, amc_selection) %>% 
    summarise(n_opt = sum(n_opt)) %>% 
    ggplot(aes(x=date, y = n_opt, color = amc_selection)) + 
    geom_line(size = 1)+
    facet_wrap(~ wiki, nrow = 4, scale = "free_y") +
    scale_y_continuous("AMC selection count per week", labels = polloi::compress) +
    scale_x_date(labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Retention rate of opt-in AMC on target Wikipedias",
        subtitle = "Where AMC was deployed on June 17, 2019") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
          axis.title.x=element_blank(),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="bottom")

ggsave("Figures/amc_retention_targetwiki_June_weekly.png", p, width = 18, height = 9, units = "in", dpi = 150)
p

On the target wikis, retention rates remained fairly consistent from deployment of AMC through the end of October 2019, with weekly retention rates mostly staying above 60%.

Thai Wikipedia's AMC retention rate dipped to around 58% during the weeks of 2019-07-21 and 2019-08-04 but increased to 78.8% by the end of October 2019.

Moderation actions on mobile web

Methodology

We calculated changes in the overall rate of moderation actions on mobile web. Moderation actions are defined in T213461.

Since moderation actions recorded in the logging table were not tagged with both the mobile web edit and AMC tags until March 18, 2019 (https://phabricator.wikimedia.org/T215477#5008003), we could not calculate a year-over-year increase for these actions and instead reviewed changes from March 2019 through September 2019. As a result, seasonal fluctuations may inflate some of the identified changes.

Data was retrieved from the revision_tags field available in mediawiki_history and from the logging table. All actions that were completed on mobile web were tagged as mobile web edit and, where applicable, advanced mobile edit.

Overall Rate of Moderation Actions

In [ ]:
# Overall rate for logging table actions on mobile web (not limited to AMC tags) 
# March 2019-October 2019

# In terminal
# spark2R --master yarn --executor-memory 2G --executor-cores 1 --driver-memory 4G

query <- "SELECT 
    SUBSTR(log_timestamp, 0, 8) AS date,
    logging.wiki_db as wiki,
    SUM(If(logging.log_type = 'block' and logging.log_action = 'block', 1, 0)) as block,
    SUM(If(logging.log_type = 'block' and logging.log_action = 'unblock', 1, 0)) as unblock,
    SUM(If(logging.log_type = 'delete' and logging.log_action = 'delete', 1, 0)) as delete,
    SUM(If(logging.log_type = 'protect' and logging.log_action = 'protect', 1, 0)) as protect,
    SUM(If(logging.log_type = 'move' and logging.log_action = 'move', 1, 0)) as move,
    SUM(If(logging.log_type = 'thanks' and logging.log_action = 'thank', 1, 0)) as thank,
    SUM(If(logging.log_type = 'review' and logging.log_action = 'approve', 1, 0)) as approve
FROM wmf_raw.mediawiki_logging as logging
INNER JOIN  (
    SELECT change_tag.ct_rev_id as rev_id,
    change_tag_def.ctd_name as tag_name,
    change_tag.ct_log_id as log_id
  FROM wmf_raw.mediawiki_change_tag as change_tag
  INNER JOIN wmf_raw.mediawiki_change_tag_def as change_tag_def ON 
    change_tag.ct_tag_id = change_tag_def.ctd_id 
  WHERE change_tag.snapshot = '2019-10' and
  change_tag_def.snapshot = '2019-10'
) as ct
ON logging.log_id = ct.log_id
WHERE 
logging.snapshot = '2019-10' and
logging.log_timestamp >= '20190301' and 
logging.log_timestamp < '20191001' and 
(ct.tag_name like '%mobile web edit%' or ct.tag_name like '%advanced mobile edit%')
GROUP BY SUBSTR(log_timestamp, 0, 8), logging.wiki_db"

results <- collect(sql(query))
save(results, file="Readers-Web-AMC-metrics/Data/moderation_counts_log.RData")
In [755]:
load("Data/moderation_counts_log.RData")
moderation_counts_log <- results
In [756]:
moderation_counts_log$date <- as.Date(moderation_counts_log$date, format = "%Y%m%d")
moderation_counts_log$date<- format(moderation_counts_log$date,"%Y-%m-%d")
In [757]:
moderation_counts_log$date <- as.Date(moderation_counts_log$date, format = "%Y-%m-%d")
In [885]:
# Query to collect rollback and undo counts from the revision tags.
# Done using change tags now available in mediawiki_history

# In terminal
# spark2R --master yarn --executor-memory 2G --executor-cores 1 --driver-memory 4G

query <- "select
    date_format(event_timestamp, 'yyyy-MM-dd') as date,
    wiki_db as wiki,
    sum(cast(mw_rollback as int)) as rollback,
    sum(cast(mw_undo as int)) as undo
from (
    select
        wiki_db,
        event_timestamp,
        array_contains(revision_tags, 'mw-rollback') as mw_rollback, 
        array_contains(revision_tags, 'mw-undo') as mw_undo
    from wmf.mediawiki_history
    where
        event_timestamp IS NOT NULL and
        (array_contains(revision_tags, 'mobile web edit') or 
        array_contains(revision_tags, 'advanced mobile edit')) and
        event_timestamp >= '2019-03-01' and   event_timestamp < '2019-10-01' and 
        snapshot = '2019-10'
) edits
group by wiki_db, date_format(event_timestamp, 'yyyy-MM-dd')"


results <- collect(sql(query))
save(results, file="Readers-Web-AMC-metrics/Data/moderation_counts_ct.RData")
In [758]:
load("Data/moderation_counts_ct.RData")
moderation_counts_ct <- results
moderation_counts_ct$date <- as.Date(moderation_counts_ct$date, format = "%Y-%m-%d")
In [759]:
##Join the two moderation count tables
moderation_counts_all <- inner_join(moderation_counts_log, moderation_counts_ct, 
                                     by = c("date", "wiki"))

Average number of moderation actions per month by type

In [2021]:
p <- moderation_counts_all %>%
    gather(action_type, action_count, block:undo) %>%
    filter(date >= '2019-03-18') %>% #date where log actions were tagged
    mutate(date = floor_date(date, "month")) %>%
    group_by(date, action_type) %>%
    summarise(action_count = sum(action_count)) %>%
    group_by(action_type) %>% 
    summarise(monthly_avg_count = round(mean(action_count),0)) %>%
    ggplot(aes(x=action_type, y= monthly_avg_count, fill = action_type)) +
    geom_bar(stat='identity') +
    geom_text(aes(label=monthly_avg_count), vjust=0) +
 labs(title = "Average monthly moderation actions by type \n March 2019-October 2019") +
     ylab("Average number of moderation actions per month")+
     xlab("type") +
ggthemes::theme_tufte(base_size = 11, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position = "none")


        
p
ggsave("Figures/moderation_counts_bytype_overall_avg.png",p, width = 18, height = 9, units = "in", dpi = 150)
In [2022]:
#Plot overall moderation actions by action type

p <- moderation_counts_all %>%
    gather(action_type, action_count, block:undo) %>%
    filter(date >= '2019-03-18') %>%
    mutate(date = floor_date(date, "month")) %>%
group_by(date, action_type) %>% 
    summarise(action_count = sum(action_count)) %>% 
    ggplot(aes(x=date, y = action_count, color = action_type)) + 
    geom_line(size = 1)+
    scale_y_continuous("Number of moderation actions per month", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Overall rate of mobile web moderation actions by type") +
geom_vline(xintercept = vertical_lines,
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-03-20'), y=2E6, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.5, vjust = -1.2, angle = 90, color = "black") +
  geom_text(aes(x=as.Date('2019-06-17'), y=2E6, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.5, vjust = -1.2, angle = 90, color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=2E6, label="AMC deployed on all Wikipedias"), size=3.5, vjust = -1.2, angle = 90, color = "black") +
    ggthemes::theme_tufte(base_size = 11, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="right")

p
ggsave("Figures/moderation_counts_bytype_monthly_overall.png", p , width = 18, height = 9, units = "in", dpi = 150)

There is an overall increase in all types of moderation actions on mobile web following the AMC deployment. The thank action is the most commonly used moderation action on mobile web. However, the numbers of blocks and approves have both seen significant increases since AMC was deployed on all wikis.

Overall moderation action counts across all Wikimedia Projects

In [795]:
moderation_counts_all_overall <- moderation_counts_all %>%
   filter(date >= '2019-03-18') %>% #date where log actions were tagged
  gather("action_type", "action_count", 3:11 ) %>%
  group_by(date) %>%
  summarise(total_count = sum(action_count)) %>%
 arrange(date)
In [2023]:
p <- moderation_counts_all_overall %>%
mutate(rolling_average = rollmean(as.numeric(total_count), 7, na.pad=TRUE, align="right")) %>%
ggplot(aes(x=date, y = rolling_average)) + 
    geom_line(size = 1, color = 'blue')+
    scale_y_continuous("Number of moderation actions (7-day rolling average)", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Rate of mobile web moderation actions on all Wikimedia Projects") +
geom_vline(xintercept = vertical_lines,
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-03-20'), y=4E5, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
  geom_text(aes(x=as.Date('2019-06-17'), y=4E5, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=4E5, label="AMC deployed on all Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
    ggthemes::theme_tufte(base_size = 11, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="right")

p
ggsave("Figures/moderation_counts_daily_overall.png", p, width = 18, height = 9, units = "in", dpi = 150)
Warning message:
“Removed 6 rows containing missing values (geom_path).”
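
The six removed rows in the warning are consistent with a right-aligned 7-day rolling mean, which has no value until seven days of data are available. The sketch below illustrates this on made-up daily counts. It swaps in base R's `stats::filter` for the `rollmean` call used in the notebook so that it runs without extra packages; the vector `daily` is hypothetical.

```r
# Made-up daily counts for illustration only
daily <- c(10, 12, 11, 13, 15, 14, 16, 18, 17, 19)

# Right-aligned 7-day moving average: each value averages the current day
# and the six days before it (sides = 1 uses past values only)
rolling <- as.numeric(stats::filter(daily, rep(1 / 7, 7), sides = 1))

# The first six positions have fewer than seven days of history, so they are NA
sum(is.na(rolling))  # 6

# The seventh value is the mean of the first seven days
rolling[7]
mean(daily[1:7])
```

ggplot2 drops these leading `NA` values when drawing the line, which is what the warning reports.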

Overall month over month changes in moderation action rate

In [1998]:
#Calculate MoM (month over month) overall increase in moderation actions.

moderation_action_monthly_overall_mom <- moderation_counts_all_overall %>%
mutate(date = floor_date(date, "month"))%>%
 filter(date != '2019-03-01') %>% #data incomplete for March
  group_by(date) %>%
  summarise(total_action_count = sum(total_count)) %>%
  arrange(date) %>%
  mutate(monthOvermonth= (total_action_count/lag(total_action_count,1) -1)*100) 

moderation_action_monthly_overall_mom
A tibble: 6 × 3
date        total_action_count  monthOvermonth
<date>      <dbl>               <dbl>
2019-04-01   7810383                   NA
2019-05-01   8300069              6.26968
2019-06-01   6859370            -17.35767
2019-07-01   8408001             22.57687
2019-08-01  11393652             35.50964
2019-09-01  14949262             31.20694

There was a significant increase in the number of moderation actions across all Wikimedia projects following the deployment of AMC to all Wikipedias in August 2019. From Q4 (April to June 2019) to Q1 (July to September 2019), the total number of moderation actions on mobile web increased by 47%. The month-over-month increase from August to September 2019 was 31% across all wikis.

We were unable to compare year-over-year changes, so these increases may be inflated by seasonal fluctuations; however, the rate of increase immediately following the deployment is greater than in previous months, and the percent change is higher than we see in overall moderation rates across all platforms. For example, in September 2019, there was an 8% month-over-month increase across all platforms. Once a desktop revision tag is created, it would be valuable to compare moderation action rates on mobile web to rates seen on desktop.
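
For reference, the month-over-month percentages can be reproduced from the monthly totals printed in the table above. This is a base R sketch of the same `lag`-style calculation; the variable names `totals` and `mom` are introduced here for illustration.

```r
# Monthly totals copied from the table above (April to September 2019)
totals <- c(
  "2019-04" = 7810383,
  "2019-05" = 8300069,
  "2019-06" = 6859370,
  "2019-07" = 8408001,
  "2019-08" = 11393652,
  "2019-09" = 14949262
)

# Previous month's total for each month; the first month has no predecessor
previous <- c(NA, head(totals, -1))

# Month-over-month change in percent
mom <- (totals / previous - 1) * 100
round(mom, 1)
```

The first month is `NA` because March was excluded as incomplete, so April has no prior month to compare against.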

Moderation action rates for English Wikipedia

In [2024]:
#Time series of action type on Wikis

p <- moderation_counts_all %>%
   filter(wiki == 'enwiki',
       date >= '2019-03-18') %>% #date where log actions were tagged) 
    gather(action_type, action_count, block:undo) %>%
    arrange(desc(date)) %>%
mutate(date = floor_date(date, "month")) %>%
group_by(date, action_type) %>% 
    summarise(action_count = sum(action_count)) %>% 
    ggplot(aes(x=date, y = action_count, color = action_type)) + 
    geom_line(size = 1)+
    scale_y_continuous("Number of moderation actions per month", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Rate of mobile web moderation actions \n on English Wikipedia by type") +
geom_vline(xintercept = as.Date('2019-08-07'),
             linetype = "dashed", color = "black") +
geom_text(aes(x=as.Date('2019-08-07'), y=300E3, label="AMC deployed on all Wikipedias"), size=3.6, vjust = -0.5, angle = 90, color = "black") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="right")

p
ggsave("Figures/moderation_counts_bytype_enwiki_monthly.png", p, width = 18, height = 9, units = "in", dpi = 150)
In [2025]:
#Overall trend on English Wikipedia

moderation_counts_enwiki_monthly <- moderation_counts_all %>%
   filter(wiki == 'enwiki',
         date > '2019-03-20') %>% #after log actions were tagged) %>%  
    gather(action_type, action_count, block:undo) %>%
    group_by(date) %>% 
    summarise(action_count = sum(action_count))

p <- moderation_counts_enwiki_monthly %>%
#mutate(action_count_avg = rollmean(action_count, 7, na.pad=TRUE, align="right")) %>% 
    ggplot(aes(x=date, y = action_count)) + 
    geom_line(size = 1.5, color = 'blue')+
    scale_y_continuous("Number of moderation actions per day", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Rate of mobile web moderation actions on English Wikipedia") +
 geom_vline(xintercept = as.Date('2019-08-07'),
             linetype = "dashed", color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=100E3, label="AMC deployed on all Wikipedias"), size=3.6, vjust = -0.5, angle = 90, color = "black") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="right")

p
ggsave("Figures/moderation_counts_enwiki_daily.png", p, width = 18, height = 9, units = "in", dpi = 150)

Overall month over month changes in moderation actions for English Wikipedia

In [822]:
#Calculate MoM (month over month) change in moderation actions for enwiki.

moderation_action_monthly_enwiki_mom <- moderation_counts_all %>%
filter(wiki == 'enwiki',
      date >= '2019-04-01') %>% #remove first month as we did not have complete data
  gather(action_type, action_count, block:undo)  %>%
mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_action_count = sum(action_count)) %>%
  arrange(date) %>%
mutate(monthOvermoth = (total_action_count - lag(total_action_count))/lag(total_action_count) *100)

moderation_action_monthly_enwiki_mom
A tibble: 6 × 3
date        total_action_count  monthOvermoth
<date>      <dbl>               <dbl>
2019-04-01   640008                   NA
2019-05-01   659899              3.10793
2019-06-01   791310             19.91380
2019-07-01  1536071             94.11748
2019-08-01  2255724             46.85024
2019-09-01  3103776             37.59556

Moderation actions on English Wikipedia have steadily increased, with a significant increase following AMC deployment across all wikis on August 7th, 2019. Comparing the month before deployment (July 2019) to the month of deployment (August 2019), the number of moderation actions on mobile web increased by 46.9%. There was a further 37.6% month over month increase from August to September 2019.
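As a quick check of the figures quoted above, the month over month calculation can be reproduced in base R from the monthly totals printed in the table. This is a minimal sketch: `mom_change` is a hypothetical helper name introduced here (it is not part of the original notebook), and the totals are copied by hand from the printed output.

```r
# Monthly mobile web moderation actions on English Wikipedia,
# copied from the table above (April to September 2019).
enwiki_totals <- c(
  "2019-04" = 640008, "2019-05" = 659899, "2019-06" = 791310,
  "2019-07" = 1536071, "2019-08" = 2255724, "2019-09" = 3103776
)

# Hypothetical helper: percent change relative to the previous month.
# Equivalent to (x - lag(x)) / lag(x) * 100 in the dplyr pipeline,
# and to (x / lag(x) - 1) * 100 used elsewhere in the notebook.
mom_change <- function(x) {
  c(NA, diff(x) / head(x, -1) * 100)
}

enwiki_mom <- setNames(mom_change(enwiki_totals), names(enwiki_totals))
round(enwiki_mom, 1)
```

The August and September entries round to 46.9 and 37.6, which are the two percentages cited in the paragraph above.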

Moderation action rate for target Wikipedia projects

In [2026]:
#Action type on Wikis

p <- moderation_counts_all %>%
   filter(wiki %in% c('eswiki', 'arwiki', 'idwiki', 'itwiki', 'jawiki', 'fawiki', 'thwiki'),
          date >= '2019-03-18') %>% #date where log actions were tagged)
    gather(action_type, action_count, block:undo) %>%
    arrange(desc(date)) %>%
mutate(date = floor_date(date, "month")) %>%
group_by(date, action_type) %>% 
    summarise(action_count = sum(action_count)) %>% 
    ggplot(aes(x=date, y = action_count, color = action_type)) + 
    geom_line(size = 1)+
    scale_y_continuous("Number of moderation actions per month", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Rate of mobile web moderation actions count \n on target wikis by type") +
geom_vline(xintercept = vertical_lines,
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-03-20'), y=200E3, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.5, vjust = -0.5, angle = 90, color = "black") +
geom_text(aes(x=as.Date('2019-08-07'), y=200E3, label="AMC deployed on all Wikipedias"), size=3.5, vjust = -0.5, angle = 90, color = "black") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="right")

p
ggsave("Figures/moderation_counts_bytype_targetwikis_monthly.png", p, width = 18, height = 9, units = "in", dpi = 150)

On target wikis, the thank action is also the most commonly used moderation action on mobile web. There has been a general increase in the use of moderation actions on almost all the target wikis, with a few declines on the smaller ones.

In [857]:
moderation_counts_targetwiki <- moderation_counts_all %>%
    filter(wiki %in% c('eswiki', 'arwiki', 'idwiki', 'itwiki', 'jawiki', 'fawiki', 'thwiki'),
          date >= '2019-03-18') %>% #date where log actions were tagged) 
    gather(action_type, action_count, block:undo) %>%
    group_by(date, wiki)%>%
    summarise(action_count = sum(action_count))

head(moderation_counts_targetwiki)
A grouped_df: 6 × 3
date        wiki    action_count
<date>      <chr>   <dbl>
2019-03-18  arwiki   3384
2019-03-18  eswiki   5400
2019-03-18  fawiki   2059
2019-03-18  idwiki  11165
2019-03-18  itwiki   6337
2019-03-18  jawiki    745
In [2027]:
#Plot time series of moderation actions on target wikis

p <- moderation_counts_targetwiki %>%
#mutate(action_count_avg = rollmean(action_count, 7, na.pad=TRUE, align="right")) %>% 
    ggplot(aes(x=date, y = action_count)) + 
    geom_line(size = 1.5, color = 'blue')+ 
      facet_wrap(~ wiki, ncol = 3,nrow = 6, scale = "free_y") +
    scale_y_continuous("Number of moderation actions per day", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Rate of mobile web moderation actions on all target wikis") +
    ggthemes::theme_tufte(base_size = 12, base_family = "Gill Sans") +
    theme(axis.text.x=element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_line("gray70"),
        legend.position="right")

p

ggsave("Figures/moderation_counts_daily_targetwikis.png", p, width = 18, height = 9, units = "in", dpi = 150)

Overall month over month changes in moderation actions for target Wikipedias

In [1823]:
#Calculate MoM (month over month) change in moderation actions for all target wikis.

moderation_action_monthly_target_mom <- moderation_counts_all %>%
filter(wiki %in% c('eswiki', 'arwiki', 'idwiki', 'itwiki', 'jawiki', 'fawiki', 'thwiki')) %>%
  gather(action_type, action_count, block:undo)  %>%
mutate(date = floor_date(date, "month")) %>%
  group_by(date) %>%
  summarise(total_action_count = sum(action_count)) %>%
  arrange(date) %>%
mutate(monthOvermoth = (total_action_count - lag(total_action_count))/lag(total_action_count) *100)

moderation_action_monthly_target_mom
A tibble: 7 × 3
date        total_action_count  monthOvermoth
<date>      <dbl>               <dbl>
2019-03-01  1323664                    NA
2019-04-01  1244052             -6.014517
2019-05-01  1287409              3.485144
2019-06-01  1441935             12.002868
2019-07-01  1121789            -22.202526
2019-08-01  1325758             18.182475
2019-09-01  1455543              9.789494

Wiki              Q4 over Q1 Change
Spanish Wiki       89.7%
Arabic Wiki        22.9%
Indonesian Wiki   -72.8%
Italian Wiki       10.8%
Japanese Wiki      -8.7%
Persian Wiki       36.0%
Thai Wiki         -19.6%

On the target wikis, there has been much more fluctuation in the number of moderation actions each month since AMC deployment. Comparing Q4 (April-June 2019) to Q1 (July-September 2019), the number of moderation actions increased for Spanish, Arabic, Persian, and Italian Wikipedias and decreased for Indonesian, Japanese, and Thai Wikipedias.

These fluctuations in the number of moderation actions may be due to seasonal changes and also due to the smaller size of some of these wikis.
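The code behind the per-wiki quarterly comparison is not shown here, so the following is only a sketch of how such a comparison could be computed with the same dplyr and lubridate tooling used elsewhere in the notebook. The `toy_counts` data and the `quarterly_change` name are hypothetical; real input would be a data frame shaped like `moderation_counts_targetwiki`. It relies on the fact that the fiscal quarters used in this report (Q4 = April to June, Q1 = July to September) share boundaries with calendar quarters.

```r
library(dplyr)
library(lubridate)

# Hypothetical counts for two wikis (one row per month for brevity);
# real input would have columns date, wiki, action_count at daily grain.
toy_counts <- tibble(
  date = rep(as.Date(c("2019-04-15", "2019-05-15", "2019-06-15",
                       "2019-07-15", "2019-08-15", "2019-09-15")), times = 2),
  wiki = rep(c("eswiki", "thwiki"), each = 6),
  action_count = c(100, 100, 100, 150, 200, 250,   # eswiki: 300 then 600
                   200, 200, 200, 150, 150, 180)   # thwiki: 600 then 480
)

quarterly_change <- toy_counts %>%
  mutate(quarter = floor_date(date, "quarter")) %>%   # bucket dates by quarter start
  group_by(wiki, quarter) %>%
  summarise(total = sum(action_count), .groups = "drop") %>%
  arrange(wiki, quarter) %>%
  group_by(wiki) %>%
  mutate(pct_change = (total / lag(total) - 1) * 100) %>%  # change vs. prior quarter
  ungroup()

quarterly_change
```

Grouping by `wiki` before calling `lag()` matters: without it, the first quarter of one wiki would be compared against the last quarter of the previous wiki in the sorted table.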

Rate of access to special pages

Methodology

We calculated the rate of access (i.e. pageviews) to special pages from AMC mode and from mobile web overall by using the X-Analytics tag (a general purpose header for measurement purposes) for AMC, added in T212961.

The X-Analytics tag for AMC is recorded in the mf-m key. It is recorded as 'b%2Camc' if the user is opted into both beta and AMC mode and as 'amc' for users opted into just AMC. Please see a full list of X-Analytics key definitions.
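To illustrate the two encodings, here is a small base R sketch that decodes an mf-m value and checks whether it includes AMC. The helper name `is_amc_request` is hypothetical, and treating the value as a percent-encoded, comma-separated list of modes is an assumption drawn from the 'b%2Camc' example; the queries in this notebook simply match the two literal strings.

```r
# Hypothetical helper: TRUE when an X-Analytics 'mf-m' value indicates AMC mode.
# Assumption: the value is a percent-encoded, comma-separated list of modes,
# e.g. 'amc' or 'b%2Camc' (beta and AMC).
is_amc_request <- function(mf_m) {
  vapply(mf_m, function(value) {
    if (is.na(value)) return(FALSE)          # key absent from the header
    modes <- strsplit(utils::URLdecode(value), ",", fixed = TRUE)[[1]]
    "amc" %in% modes
  }, logical(1), USE.NAMES = FALSE)
}

is_amc_request(c("amc", "b%2Camc", "b", NA))
# returns TRUE TRUE FALSE FALSE
```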

The analysis was based on a 1/64 sample of the webrequest data from August 2019 through October 30, 2019.
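Because the counts come from a 1/64 sample, the request totals reported in this section are sampled counts rather than full traffic volumes. The sketch below shows one way to scale them up, under the assumption that the bucket sample is approximately uniform across requests. Proportions such as the AMC share are unaffected by this scaling. The two sampled totals are the September and October figures reported later in this section.

```r
sampling_rate <- 1 / 64

# Sampled monthly request counts to special pages (all Wikipedias),
# copied from the monthly totals reported later in this section.
sampled_requests <- c("2019-09" = 575033, "2019-10" = 594129)

# Rough estimate of the full request volume, assuming the bucket sample
# is approximately uniform across requests.
estimated_requests <- sampled_requests / sampling_rate
estimated_requests
```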

Note: During analysis, I found that, starting around July 22nd to July 23rd, only the Search and Recent Changes special pages were being recorded as pageviews in the pageview_hourly table. This bug produces an artificial drop in views on special pages, and further investigation is needed. In the meantime, I reviewed all requests to special pages from the webrequest data (not limited to pageviews) to complete the analysis.

In [1954]:
#special page requests from logged in users on mobile web including X-analytics tag for AMC views
#from wmf.webrequest

query <- 
"SELECT 
    CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) AS date,
    CONCAT(normalized_host.project, '.',normalized_host.project_family) AS wiki,
    SUM(IF ((x_analytics_map['mf-m'] = 'b%2Camc' OR
    x_analytics_map['mf-m'] = 'amc'), 1, 0)) as amc_request,
    COUNT(*) AS all_mobile_web_requests 
FROM wmf.webrequest TABLESAMPLE(BUCKET 1 OUT OF 64 ON hostname, sequence)
    WHERE year = 2019 AND month >= 08
-- look at special pages
    AND namespace_id = -1
    AND normalized_host.project_family = 'wikipedia'
    AND x_analytics_map['loggedIn'] IS NOT NULL
    AND agent_type = 'user'
    AND access_method = 'mobile web'
    AND webrequest_source = 'text'
GROUP BY CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')),  CONCAT(normalized_host.project, '.',normalized_host.project_family)"
In [1955]:
special_pages_requests <- wmf::query_hive(query)
In [1956]:
special_pages_requests$date <- as.Date(special_pages_requests$date, format = "%Y-%m-%d")
In [2047]:
special_pages_requests_clean <- special_pages_requests %>%
    mutate(other_mobile_web_request = all_mobile_web_requests - amc_request) %>% 
    gather(request_type, request_count, 3:5) %>%
    arrange(desc(date))
In [1959]:
special_pages_requests_clean$request_type %<>% factor(levels= c("amc_request","all_mobile_web_requests", "other_mobile_web_request"))
In [1960]:
# Which special pages are being viewed from amc mode on all Wikipedia projects?

query <- 
"SELECT 
    CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) AS date,
     CONCAT(normalized_host.project, '.',normalized_host.project_family) AS wiki,
    x_analytics_map['special'] AS special_page_name, 
    COUNT(*) AS requests 
FROM wmf.webrequest TABLESAMPLE(BUCKET 1 OUT OF 64 ON hostname, sequence)
    WHERE year = 2019 AND month >= 08
    AND (x_analytics_map['mf-m'] = 'b%2Camc' OR
     x_analytics_map['mf-m'] = 'amc')
    AND namespace_id = -1
    AND normalized_host.project_family = 'wikipedia'
    AND x_analytics_map['loggedIn'] IS NOT NULL
    AND agent_type = 'user'
    AND access_method = 'mobile web'
    AND webrequest_source = 'text'
    GROUP BY CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')), x_analytics_map['special'],  
    CONCAT(normalized_host.project, '.',normalized_host.project_family)"
In [1962]:
special_page_requests_bypage <- wmf::query_hive(query)
In [1963]:
special_page_requests_bypage$date <- as.Date(special_page_requests_bypage$date, format = "%Y-%m-%d")

Rate of access to special pages overall

In [1970]:
##all mobile web requests to special pages

special_pages_requests_overall = special_pages_requests_clean %>%
    mutate(date = floor_date(date, "week")) %>%
    filter(request_type == "all_mobile_web_requests",
          date != '2019-08-25',
          date != '2019-11-24' )%>% #remove weeks with incomplete data
     group_by(date) %>%
    summarize(request_count = sum(request_count)) %>%
    arrange(desc(date))
In [1971]:
special_pages_requests_bytype <- special_pages_requests_clean  %>%
  mutate(date = floor_date(date, "week")) %>%
    filter(request_type != "all_mobile_web_requests",
          date != '2019-08-25',
          date != '2019-11-24') %>%
  group_by(date, request_type) %>%
  summarise(request_count = sum(request_count)) 

head(special_pages_requests_byrequest)
A grouped_df: 6 × 3
date        request_type  request_count
<date>      <fct>         <int>
2019-08-25  amc_request   1575
2019-09-01  amc_request   2463
2019-09-08  amc_request   2469
2019-09-15  amc_request   2377
2019-09-22  amc_request   2304
2019-09-29  amc_request   2216
In [2031]:
#Plot mobile web requests to special pages overall

p <- ggplot() +
 geom_bar(data =special_pages_requests_bytype, aes(x= date, y= request_count, fill = request_type),stat= "identity", position = "stack") + 
 geom_smooth(data = special_pages_requests_overall, aes(x=date, y=request_count), se = FALSE) +
    scale_y_continuous("Number of requests per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "1 week") +
    labs(title = "Mobile web requests to special pages on all Wikipedia projects",
        subtitle = "Based on a sample of data collected from webrequest data") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))
           
p
ggsave("Figures/mobile_web_special_page_requests_overall.png", p, width = 18, height = 9, units = "in", dpi = 150)
`geom_smooth()` using method = 'loess' and formula 'y ~ x'

Proportion of AMC requests to special pages

In [1973]:
#Calculate monthly proportion of AMC requests to special pages

special_pages_requests_monthly_prop <- special_pages_requests %>%
      mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>% 
    summarise(amc_requests_total = sum(amc_request),
             all_mobile_web_requests_total = sum(all_mobile_web_requests),
             amc_prop = amc_requests_total/all_mobile_web_requests_total *100)

special_pages_requests_monthly_prop
A tibble: 4 × 4
date        amc_requests_total  all_mobile_web_requests_total  amc_prop
<date>      <int>               <int>                          <dbl>
2019-08-01   1465                98351                         1.489563
2019-09-01  10272               575033                         1.786332
2019-10-01  12296               594129                         2.069584
2019-11-01   8445               465698                         1.813407

Month over month changes in all mobile web requests to special pages

In [1974]:
#Calculate month over month changes

special_pages_requests_mom <- special_pages_requests_clean %>%
    mutate(date = floor_date(date, "month")) %>%
    filter(request_type == "all_mobile_web_requests",
        date != '2019-11-01',
        date != '2019-08-01') %>% #August and November are incomplete, so compare September to October
    group_by(date) %>%
    summarise(request_count = sum(request_count)) %>%
    arrange(date) %>%
    mutate(monthOvermonth= (request_count/lag(request_count,1) -1)*100) 


tail(special_pages_requests_mom)
A tibble: 2 × 3
date        request_count  monthOvermonth
<date>      <int>          <dbl>
2019-09-01  575033               NA
2019-10-01  594129         3.320853

Requests to special pages have remained relatively flat (there was only a 3.3% month over month change between September 2019 and October 2019), and requests from mobile web in AMC mode represent only about 2% of all mobile web requests to these pages. However, the proportion of mobile web requests to special pages coming from AMC mode has increased since AMC was deployed on all Wikipedias on August 7, 2019. The proportion of AMC requests rose from 1.8% in September to 2.1% in October 2019, a relative increase of 15.9%.
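The 15.9% figure is a relative change in the AMC proportion, not a change in percentage points. The following base R sketch makes the distinction explicit using the September and October totals copied from the proportion table above; the variable names are introduced here for illustration only.

```r
# Sampled monthly totals copied from the proportion table above.
amc_requests <- c("2019-09" = 10272, "2019-10" = 12296)
all_requests <- c("2019-09" = 575033, "2019-10" = 594129)

# Percent of mobile web special page requests that came from AMC mode.
amc_prop <- amc_requests / all_requests * 100

# Absolute change in percentage points vs. relative (percent) change.
point_change    <- amc_prop[["2019-10"]] - amc_prop[["2019-09"]]
relative_change <- (amc_prop[["2019-10"]] / amc_prop[["2019-09"]] - 1) * 100

round(c(point_change = point_change, relative_change = relative_change), 2)
```

The absolute change is roughly 0.3 percentage points, while the relative change is about 15.9%.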

Top 10 Special pages viewed in AMC Mode

In [1975]:
special_page_requests_bypage_top10 <- special_page_requests_bypage %>%
 group_by(special_page_name) %>%
 summarise(requests = sum(requests)) %>%
 mutate(percent_specialpage_requests = requests/sum(requests)*100) %>%
 top_n(10)  %>%
 arrange(desc(percent_specialpage_requests))

special_page_views_fromamc_top10
Selecting by percent_specialpage_requests
A tibble: 10 × 3
specialpagename  requests  percent_specialpage_requests
<chr>            <int>     <dbl>
MobileDiff       630       35.156250
Watchlist        448       25.000000
Contributions    182       10.156250
Search           149        8.314732
MobileOptions     59        3.292411
Recentchanges     56        3.125000
EditWatchlist     35        1.953125
Log               23        1.283482
Whatlinkshere     21        1.171875
Specialpages      20        1.116071

The Mobile Diff, Watchlist, Contributions, and Search are the most viewed special pages from users in AMC mode on all Wikipedias.

Rate of access to Special Pages on English Wikipedia

In [2048]:
special_pages_requests_enwiki = special_pages_requests_clean %>%
 mutate(date = floor_date(date, "week")) %>%
    filter(wiki == 'en.wikipedia',
          request_type == "all_mobile_web_requests",
          date != '2019-08-25',
          date != '2019-11-24' )%>% #remove weeks with incomplete data 
  group_by(date) %>%
summarize(request_count = sum(request_count))
In [2030]:
special_pages_requests_bytype_enwiki <- special_pages_requests_clean %>%
    mutate(date = floor_date(date, "week")) %>%
         filter(wiki == 'en.wikipedia',
          request_type != "all_mobile_web_requests",
          date != '2019-08-25',
          date != '2019-11-24' ) %>% #remove weeks with incomplete data 
    group_by(date, request_type) %>%
    summarise(request_count = sum(request_count)) 

p <- ggplot() +
 geom_bar(data =special_pages_requests_bytype_enwiki, aes(x= date, y= request_count, fill = request_type),stat= "identity", position = "stack") + 
 geom_smooth(data = special_pages_requests_enwiki, aes(x=date, y=request_count), se = FALSE) +
    scale_y_continuous("Number of requests per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "1 week") +
    labs(title = "Mobile web requests to special pages on English Wikipedia",
        subtitle = "Based on a sample of data collected from webrequest data") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))
p
ggsave("Figures/mobile_web_special_page_requests_enwiki.png", p, width = 18, height = 9, units = "in", dpi = 150)
`geom_smooth()` using method = 'loess' and formula 'y ~ x'

Monthly proportion of AMC requests to special pages

In [1981]:
#Calculate monthly proportion of AMC requests to special pages on English Wikipedia

special_pages_requests_monthly_prop_enwiki <- special_pages_requests %>%
    filter(wiki == 'en.wikipedia')  %>% 
      mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>% 
    summarise(amc_requests_total = sum(amc_request),
             all_mobile_web_requests_total = sum(all_mobile_web_requests),
             amc_prop = amc_requests_total/all_mobile_web_requests_total *100)

special_pages_requests_monthly_prop_enwiki
A tibble: 4 × 4
date        amc_requests_total  all_mobile_web_requests_total  amc_prop
<date>      <int>               <int>                          <dbl>
2019-08-01   594                 48630                         1.221468
2019-09-01  4231                281815                         1.501340
2019-10-01  4909                283492                         1.731619
2019-11-01  3165                225586                         1.403013

Month over month change in all mobile web requests to special pages

In [1846]:
#Calculate month over month changes in special requests on English Wiki

special_pages_requests_enwiki_mom <- special_pages_requests_clean %>%
    filter(wiki == 'en.wikipedia',
          request_type == "all_mobile_web_requests",
          date >= '2019-09-01',
          date <= '2019-10-31') %>% 
    mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>%
    summarise(request_count = sum(request_count)) %>%
    arrange(date) %>%
    mutate(monthOvermonth= (request_count/lag(request_count,1) -1) *100)


tail(special_pages_requests_enwiki_mom)
A tibble: 2 × 3
date        request_count  monthOvermonth
<date>      <int>          <dbl>
2019-09-01  281815                NA
2019-10-01  283492         0.5950712

Similar to overall trends, the overall rate of access to special pages from mobile web has remained relatively flat; however, the proportion of these requests coming from AMC mode has increased.

On English Wikipedia, requests from AMC mode to special pages represented 1.7% of all logged-in mobile web requests to special pages in October 2019. This is slightly lower than the rate across all Wikipedias (2.1%) but is a 15.3% month over month increase from September 2019.

Top 10 Special pages viewed in AMC Mode on English Wikipedia

In [1982]:
special_page_requests_bypage_top10_enwiki <- special_page_requests_bypage %>%
 filter(wiki == 'en.wikipedia') %>%
 group_by(special_page_name) %>%
 summarise(requests = sum(requests)) %>%
 mutate(percent_specialpage_requests = requests/sum(requests)*100) %>%
 top_n(10)  %>%
 arrange(desc(percent_specialpage_requests))

special_page_requests_bypage_top10_enwiki
Selecting by percent_specialpage_requests
A tibble: 10 × 3
special_page_name  requests  percent_specialpage_requests
<chr>              <int>     <dbl>
MobileDiff         4726      36.6384991
Watchlist          3623      28.0874486
Search             1144       8.8689046
Contributions      1124       8.7138538
MobileOptions       504       3.9072796
EditWatchlist       222       1.7210636
Recentchanges       157       1.2171486
Whatlinkshere       131       1.0155826
Log                 116       0.8992945
Notifications       116       0.8992945

The Mobile Diff, Watchlist, Contributions, and Search are also the most viewed special pages from users in AMC mode on English Wikipedia. There's a slightly lower proportion of mobile web requests to the Contributions page from AMC on English Wikipedia compared to all Wikipedias (8.7% on English Wikipedia vs 10.2% overall).

Rate of access to special pages on target Wikipedias

March Deployment Target Wikis

In [1986]:
special_pages_requests_targetwiki_March = special_pages_requests_clean %>%
 mutate(date = floor_date(date, "week")) %>%
  filter(wiki %in% c('ar.wikipedia', 'es.wikipedia', 'id.wikipedia'),
          request_type == "all_mobile_web_requests",
          date != '2019-08-25',
          date != '2019-11-24' ) %>% #remove weeks with incomplete data 
  group_by(date, wiki) %>%
summarize(request_count = sum(request_count))
In [2032]:
special_pages_requests_bytype_targetwiki_March <- special_pages_requests_clean %>%
    mutate(date = floor_date(date, "week")) %>%
     filter(wiki %in% c('ar.wikipedia', 'es.wikipedia', 'id.wikipedia'),
          request_type != "all_mobile_web_requests",
          date != '2019-08-25',
          date != '2019-11-24' ) %>% #remove weeks with incomplete data 
    group_by(date, wiki, request_type) %>%
    summarise(request_count = sum(request_count)) 

p <- ggplot() +
 geom_bar(data =special_pages_requests_bytype_targetwiki_March, aes(x= date, y= request_count, fill = request_type),stat= "identity", position = "stack") + 
 geom_smooth(data = special_pages_requests_targetwiki_March, aes(x=date, y=request_count), se = FALSE) +
    facet_wrap(~ wiki, nrow = 7, scale = "free_y") +
    scale_y_continuous("Number of requests per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "1 week") +
    labs(title = "Mobile web requests to special pages on target wikis",
        subtitle = "Based on a sample of data collected from webrequest data") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))

p

ggsave("Figures/mobile_web_special_page_requests_March_target.png", p, width = 18, height = 9, units = "in", dpi = 150)
`geom_smooth()` using method = 'loess' and formula 'y ~ x'

June Deployment Target Wikis

In [1988]:
special_pages_requests_targetwiki_June = special_pages_requests_clean %>%
 mutate(date = floor_date(date, "week")) %>%
  filter(wiki %in% c('it.wikipedia', 'ja.wikipedia', 
                       'fa.wikipedia', 'th.wikipedia'),
          request_type == "all_mobile_web_requests",
          date != '2019-08-25',
          date != '2019-11-24' ) %>% #remove weeks with incomplete data 
  group_by(date, wiki) %>%
summarize(request_count = sum(request_count))
In [2033]:
special_pages_requests_bytype_targetwiki_June <- special_pages_requests_clean %>%
    mutate(date = floor_date(date, "week")) %>%
  filter(wiki %in% c('it.wikipedia', 'ja.wikipedia', 
                       'fa.wikipedia', 'th.wikipedia'),
          request_type != "all_mobile_web_requests",
          date != '2019-08-25',
          date != '2019-11-24' ) %>% #remove weeks with incomplete data 
    group_by(date, wiki, request_type) %>%
    summarise(request_count = sum(request_count)) 

p <- ggplot() +
 geom_bar(data =special_pages_requests_bytype_targetwiki_June, aes(x= date, y= request_count, fill = request_type),stat= "identity", position = "stack") + 
 geom_smooth(data = special_pages_requests_targetwiki_June, aes(x=date, y=request_count), se = FALSE) +
    facet_wrap(~ wiki, nrow = 7, scale = "free_y") +
    scale_y_continuous("Number of requests per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "1 week") +
    labs(title = "Mobile web requests to special pages on target wikis",
        subtitle = "Based on a 1/64 sample of data collected from webrequest data") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))

p

ggsave("Figures/mobile_web_special_page_requests_June_target.png", p, width = 18, height = 9, units = "in", dpi = 150)
`geom_smooth()` using method = 'loess' and formula 'y ~ x'
In [1993]:
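#Calculate month over month change in special page requests for one target wiki
#(Italian Wikipedia shown; change the wiki filter to review the other target wikis)

special_pages_requests_itwiki_mom <- special_pages_requests_clean %>%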
    filter(wiki == 'it.wikipedia',
          request_type == "all_mobile_web_requests",
          date >= '2019-09-01',
          date <= '2019-10-31') %>% 
    mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>%
    summarise(request_count = sum(request_count)) %>%
    arrange(date) %>%
    mutate(monthOvermonth= (request_count/lag(request_count,1) -1)*100)

From September 2019 to October 2019, there was an increase in the proportion of special page requests from users in AMC mode on Arabic, Spanish, Indonesian, and Italian Wikipedias. Conversely, there was a decline for Japanese, Persian, and Thai Wikipedias.

See the table below for a breakdown of the percent change in requests to special pages from AMC mode from September to October 2019 for each target wiki.

Summary of access requests to Special Pages on Target Wikis - October 2019

Wiki             Proportion of AMC requests  Sept 2019 to Oct 2019 change in AMC proportion  Sept 2019 to Oct 2019 change in requests
Arabic Wiki      3.7%                         21.3%                                            2.8%
Spanish Wiki     2.9%                         32.8%                                            7.0%
Indonesian Wiki  1.6%                         11.1%                                           13.5%
Italian Wiki     2.8%                         92.0%                                           -1.2%
Japanese Wiki    1.9%                        -15.5%                                            0.0%
Persian Wiki     2.3%                         -3.6%                                            2.2%
Thai Wiki        1.9%                         -7.8%                                           -0.2%
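The three columns in the summary table can be derived per wiki from data shaped like `special_pages_requests`. The sketch below shows one possible way to do that with dplyr; it is not the code used to build the table. The `toy_requests` data, the placeholder wiki name `xx.wikipedia`, and the output column names are all hypothetical.

```r
library(dplyr)
library(lubridate)

# Hypothetical sampled counts; real input would be shaped like
# special_pages_requests (date, wiki, amc_request, all_mobile_web_requests).
toy_requests <- tibble(
  date = as.Date(c("2019-09-10", "2019-09-20", "2019-10-10", "2019-10-20")),
  wiki = "xx.wikipedia",
  amc_request = c(10, 10, 15, 15),
  all_mobile_web_requests = c(500, 500, 500, 500)
)

wiki_summary <- toy_requests %>%
  mutate(month = floor_date(date, "month")) %>%
  group_by(wiki, month) %>%
  summarise(amc_total = sum(amc_request),
            all_total = sum(all_mobile_web_requests),
            .groups = "drop") %>%
  mutate(amc_prop = amc_total / all_total * 100) %>%   # AMC share of requests
  arrange(wiki, month) %>%
  group_by(wiki) %>%
  mutate(amc_prop_change = (amc_prop / lag(amc_prop) - 1) * 100,   # relative change in share
         requests_change = (all_total / lag(all_total) - 1) * 100) %>%  # change in all requests
  ungroup()

wiki_summary
```

In this toy example the AMC share moves from 2% to 3%, which is a 50% relative change even though total requests are unchanged. That is the same pattern seen for Italian Wikipedia, where the AMC proportion rose sharply while overall requests were roughly flat.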

Top 10 Special pages viewed in AMC Mode on Target Wikipedias

In [1994]:
special_page_requests_bypage_top10_targetwiki <- special_page_requests_bypage  %>%
 filter(wiki %in% c('ar.wikipedia', 'es.wikipedia', 'id.wikipedia', 'it.wikipedia', 'ja.wikipedia', 
                       'fa.wikipedia', 'th.wikipedia')) %>% 
 group_by(special_page_name) %>%
 summarise(requests = sum(requests)) %>%
 mutate(percent_specialpage_requests = requests/sum(requests)*100) %>%
 top_n(10)  %>%
 arrange(desc(percent_specialpage_requests))

special_page_requests_bypage_top10_targetwiki
Selecting by percent_specialpage_requests
A tibble: 10 × 3
special_page_namerequestspercent_specialpage_requests
<chr><int><dbl>
MobileDiff 305836.065574
Watchlist 177120.886897
Contributions 97711.522585
Search 696 8.208515
Recentchanges 245 2.889492
MobileOptions 230 2.712584
EditWatchlist 188 2.217243
Log 135 1.592169
Whatlinkshere 126 1.486024
Specialpages 117 1.379880

Compared to English Wikipedia, there is a higher proportion of requests to the Contributions page from target wikis (11.5% vs. 8.7%) and a lower proportion of requests to the Watchlist page (20.9% vs. 28.1%). Mobile Diff remains the top viewed special page across all reviewed Wikipedias.

Rate of access and edits to article talk pages

Methodology

The same methodology used to obtain the rate of access to special pages was used to obtain the rate of access to talk pages except we reviewed all requests to article talk pages where the page namespace id is equal to 1.

The proportion of AMC requests to talk pages was based on a 1/64 sample of the webrequest data from August 2019 through the most recent data available in November 2019. Data for the overall change in talk page views was based on pageview_hourly data from June 2018 through October 30, 2019, to review year over year changes in mobile web rates.

For rate of edits to article talk pages, we reviewed all mobile web and AMC tagged edits from the mediawiki_history table. Data reviewed was from August 2019 through October 2019.
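The edit classification rests on two revision tags, 'mobile web edit' and 'advanced mobile edit', as used in the mediawiki_history query later in this section. The base R sketch below mirrors that tag logic for a single revision. The function name `classify_mobile_edit` and the `not_mobile_web` label are hypothetical and introduced only for illustration.

```r
# Hypothetical helper mirroring the tag logic of the mediawiki_history query:
# an edit counts as an AMC edit only if it carries both tags.
classify_mobile_edit <- function(tags) {
  is_mobile <- "mobile web edit" %in% tags
  is_amc <- is_mobile && "advanced mobile edit" %in% tags
  if (is_amc) {
    "amc_edit"
  } else if (is_mobile) {
    "other_mobile_web_edit"
  } else {
    "not_mobile_web"
  }
}

classify_mobile_edit(c("mobile edit", "mobile web edit", "advanced mobile edit"))
classify_mobile_edit(c("mobile edit", "mobile web edit"))
classify_mobile_edit(character(0))
```

Under this logic, AMC edits and other mobile web edits are mutually exclusive and together make up all mobile web edits, which is what allows the AMC share of mobile web edits to be computed directly.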

In [1876]:
#talk page views by project and access method from wmf.pageview_hourly
#dating back to June 2018 through October 2019.
query <- "
SELECT 
    CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) AS date,
    project, 
    access_method, 
    SUM(view_count) AS views 
FROM wmf.pageview_hourly
WHERE ((year = 2018 AND month >= 06) or (year = 2019 and month <=10)) 
    AND project LIKE '%.wikipedia'
    AND namespace_id = 1
    AND agent_type = 'user'
GROUP BY CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')), project, access_method"
In [1877]:
talk_page_views <- wmf::query_hive(query)
In [1878]:
talk_page_views$date <- as.Date(talk_page_views$date, format = "%Y-%m-%d")
In [1879]:
#Total talk page requests from logged in users by request type - amc and non-amc edit
#Sample from the webrequest data.

query <- 
"SELECT 
    CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) AS date,
    CONCAT(normalized_host.project, '.',normalized_host.project_family) AS wiki,
    SUM(IF ((x_analytics_map['mf-m'] = 'b%2Camc' OR
    x_analytics_map['mf-m'] = 'amc'), 1, 0)) as amc_request,
    COUNT(*) AS all_mobile_web_requests 
FROM wmf.webrequest TABLESAMPLE(BUCKET 1 OUT OF 64 ON hostname, sequence)
    WHERE year = 2019 AND month >= 08
    AND namespace_id = 1
-- Only review logged in users 
    AND normalized_host.project_family = 'wikipedia'
    AND x_analytics_map['loggedIn'] IS NOT NULL
    AND agent_type = 'user'
    AND access_method = 'mobile web'
    AND webrequest_source = 'text'
    AND is_pageview
GROUP BY CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')),  CONCAT(normalized_host.project, '.',normalized_host.project_family)"
In [1880]:
talk_page_requests <- wmf::query_hive(query)
In [1881]:
talk_page_requests$date <- as.Date(talk_page_requests$date, format = "%Y-%m-%d")
In [1882]:
talk_page_requests_clean <- talk_page_requests %>%
    mutate(other_mobile_web_request = all_mobile_web_requests - amc_request) %>% 
    gather(request_type, request_count, 3:5) %>%
    arrange(desc(date))
In [1883]:
talk_page_requests_clean$request_type %<>% factor(levels= c("amc_request","all_mobile_web_requests", "other_mobile_web_request"))
In [1275]:
## All article talk page edits on mobile web coming from AMC mode

query <- "select
    date_format(event_timestamp, 'yyyy-MM-dd') as date,
    wiki,
    sum(cast(other_mobile_web_edit as int)) as other_mobile_web_edits,
    sum(cast(amc_edit as int)) as amc_edits,
    sum(cast(all_mobile_web_edit as int)) as all_mobile_web_edits
from (
    select
        wiki_db as wiki,
        event_timestamp,
        (array_contains(revision_tags, 'mobile web edit') and not
        array_contains(revision_tags, 'advanced mobile edit')) as other_mobile_web_edit, 
        (array_contains(revision_tags, 'advanced mobile edit') and
        array_contains(revision_tags, 'mobile web edit')) as amc_edit,
        array_contains(revision_tags, 'mobile web edit') as all_mobile_web_edit
    from wmf.mediawiki_history mwh
    INNER JOIN canonical_data.wikis cd
        ON wiki_db = database_code 
    where
        mwh.event_entity = 'revision' and
        mwh.event_type = 'create' and
        cd.database_group = 'wikipedia' and
        mwh.event_timestamp IS NOT NULL and
        mwh.event_user_id IS NOT NULL and
--looking only at historical page namespaces
        page_namespace_historical == 1 and
        mwh.event_timestamp between '2018-06-01' and '2019-10-31' and 
        mwh.snapshot = '2019-10'
) edits
group by wiki, date_format(event_timestamp, 'yyyy-MM-dd')"
In [1276]:
talk_page_edits <- wmf::query_hive(query)
In [1277]:
talk_page_edits$date <- as.Date(talk_page_edits$date, format = "%Y-%m-%d")
In [1278]:
talk_page_edits_clean <-  talk_page_edits %>%
  filter(date < '2019-10-30') %>% #remove incomplete Nov data
gather(edit_type, edit_count, 3:5) %>%
    arrange(desc(date))

Rate of access and edits to talk pages overall

Access rates

In [2049]:
talk_page_views_all <- talk_page_views %>%
    filter(access_method == 'mobile web') %>%
    mutate(date = floor_date(date, 'month')) %>% 
    group_by(date) %>%
    summarise(views = sum(views))
In [2034]:
# Plot YoY Changes in overall mobile web views

talk_page_views_all_yoy_plot <- talk_page_views_all %>%
 mutate(year = case_when(date >= '2018-06-01' & date < '2019-06-01' ~ '2018/2019',
                         date >= '2019-06-01' & date < '2020-06-01' ~ '2019/2020'),
         MonthN =as.factor(format(as.Date(date),"%m")),
         Month = months(as.Date(date), abbreviate=TRUE))


talk_page_views_all_yoy_plot$MonthN = factor(talk_page_views_all_yoy_plot$MonthN, levels=c("06", "07", "08", "09", "10","11", "12", "01", "02", "03", 
                                                                           "04", "05" ))

talk_page_views_all_yoy_plot$year = factor(talk_page_views_all_yoy_plot$year, levels = c('2018/2019', '2019/2020'))


p <- ggplot(talk_page_views_all_yoy_plot, aes(x=MonthN, y = views, group = year, color = year, linetype = year)) +    
  geom_line(size = 0.8) +
  scale_y_continuous("Number of views per month", labels = polloi::compress) +
  scale_x_discrete(breaks = talk_page_views_all_yoy_plot$MonthN, labels = talk_page_views_all_yoy_plot$Month )+
    geom_vline(xintercept = c(10, 1, 3), linetype = "dashed", color = "black") +
    geom_text(aes(x=10, y=4E6, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.7, vjust = 1.5, angle = 90, color = "black") +
    geom_text(aes(x=1, y=4E6, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.7, vjust = 1.5, angle = 90, color = "black") +
   geom_text(aes(x=3, y=4E6, label="AMC deployed on all Wikipedias"), size=3.7, vjust = 1.5, angle = 90, color = "black") +
  labs(title = "Mobile web talk page views on all Wikipedia Projects") +
        xlab("Month") +
        scale_color_brewer(palette = 'Set1', breaks=c('2018/2019', '2019/2020'), direction= -1) +
        scale_linetype_manual(breaks=c('2018/2019', '2019/2020'), values=c(2,1)) +
  ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
  theme(plot.title = element_text(hjust = 0.5),
        legend.title = element_blank(),
        legend.position = "bottom",
        panel.grid = element_line("gray70"),
        legend.key.width=unit(0.5,"cm"))
p 

ggsave("Figures/mobile_web_talk_page_views_overall.png", p, width = 18, height = 9, units = "in", dpi = 150)

Year over year percent changes in mobile web views to article talk pages

In [1887]:
# Calculate year over year changes
talk_page_views_all_yoy <- talk_page_views_all%>%
   mutate(date = floor_date(date, 'month')) %>% 
   group_by(date)%>%
   summarise(views = sum(views)) %>%
   arrange(date) %>%
   mutate(yoy_percent = (views/lag(views,12) -1) *100) %>%
   arrange(desc(date))

head(talk_page_views_all_yoy)
A tibble: 6 × 3
date        views    yoy_percent
<date>      <int>    <dbl>
2019-10-01  3844925   10.60188
2019-09-01  2962659  -28.86988
2019-08-01  2388837  -31.53688
2019-07-01  2111978  -51.06993
2019-06-01  1943459  -14.60349
2019-05-01  2119076         NA

There was a decrease in the number of views to article talk pages on all Wikipedia projects from January to May 2019; however, there was a sharp increase in the number of views starting in June 2019, with a 10.6% year-over-year increase in October 2019. Other variables may be contributing to these increases, but a review of the proportion of talk page requests made while in AMC mode will help indicate how much AMC contributes to this increase.
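The year-over-year figures above compare each month with the same month 12 rows earlier, which is why months without a counterpart one year back show NA. A minimal base R sketch of that calculation follows; the helper `pct_change` and the view counts are made up for illustration and are not the notebook's own code or data.

```r
# Sketch only: percent change against the value `lag_n` periods earlier.
pct_change <- function(x, lag_n) {
  n <- length(x)
  if (n <= lag_n) return(rep(NA_real_, n))
  # Shift the series forward by lag_n positions, padding the start with NA.
  previous <- c(rep(NA_real_, lag_n), x[seq_len(n - lag_n)])
  (x / previous - 1) * 100
}

# Made-up monthly view counts for 13 consecutive months (illustration only).
views <- c(100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 125)

# Only the 13th month has a value 12 months earlier to compare against.
yoy <- pct_change(views, lag_n = 12)
print(yoy)
```

With `lag_n = 1` the same helper gives a month-over-month change, which is the comparison used later for the sampled English Wikipedia requests.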

In [1893]:
talk_page_requests_all = talk_page_requests_clean %>%
    mutate(date = floor_date(date, "week")) %>%
    filter(request_type == "all_mobile_web_requests",
          date != '2019-08-18',
          date != '2019-11-24') %>% #remove incomplete weeks
     group_by(date) %>%
    summarize(request_count = sum(request_count))
In [2036]:
talk_page_requests_byrequest <- talk_page_requests_clean  %>%
      mutate(date = floor_date(date, "week")) %>%
       filter(request_type != "all_mobile_web_requests",
          date != '2019-08-18',
          date != '2019-11-24') %>% #remove incomplete weeks
      group_by(date, request_type) %>%
      summarise(request_count = sum(request_count)) 

p <- ggplot() +
 geom_bar(data =talk_page_requests_byrequest, aes(x= date, y= request_count, fill = request_type),stat= "identity", position = "stack") + 
 geom_smooth(data = talk_page_requests_all, aes(x=date, y=request_count), se = FALSE) +
    scale_y_continuous("Number of requests per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "1 week") +
    labs(title = "Mobile web requests to talk pages on all Wikipedia Projects",
        subtitle = "Based on a sample of data collected from webrequest data") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))
           
p

ggsave("Figures/mobile_web_talk_page_requests_amc_prop.png", p, width = 18, height = 9, units = "in", dpi = 150)
`geom_smooth()` using method = 'loess' and formula 'y ~ x'
In [1892]:
#Calculate monthly proportion of AMC requests

talk_page_requests_monthly_prop <- talk_page_requests %>%
      mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>% 
    summarise(amc_requests_total = sum(amc_request),
             all_mobile_web_requests_total = sum(all_mobile_web_requests),
             amc_prop = amc_requests_total/all_mobile_web_requests_total *100)

talk_page_requests_monthly_prop
A tibble: 4 × 4
date        amc_requests_total  all_mobile_web_requests_total  amc_prop
<date>      <int>               <int>                          <dbl>
2019-08-01   40                 157                            25.47771
2019-09-01  375                 844                            44.43128
2019-10-01  399                 941                            42.40170
2019-11-01  380                 785                            48.40764

Almost half of all views to talk pages are made by users in AMC mode, which is significantly higher than the percentage of views to special pages.

There was a 10.6% year over year increase in the number of views to article talk pages in October 2019. About 42% of these views came from users in AMC mode.

Talk Page Edit rates overall

In [1895]:
talk_page_edits_all <- talk_page_edits_clean %>%
  filter(edit_type == 'all_mobile_web_edits') %>%
  mutate(date = floor_date(date, "week")) %>%
  filter(date != '2018-05-27',
        date !=  '2019-10-27') %>% #remove incomplete data weeks
  group_by(date) %>%
  summarise(edit_count = sum(edit_count))
In [2037]:
#Overall Mobile Web Edits by edit type

talk_page_edits_bytype_all <- talk_page_edits_clean %>%
  filter(edit_type != 'all_mobile_web_edits') %>%
  mutate(date = floor_date(date, "week")) %>%
  filter(date != '2018-05-27',
        date !=  '2019-10-27') %>% #remove incomplete data weeks
  group_by(date, edit_type) %>%
  summarise(edit_count = sum(edit_count)) 


p <- ggplot() +
  geom_col(data =talk_page_edits_bytype_all, aes(x= date, y= edit_count, fill = edit_type), stat= 'identity', position = "stack") + 
    geom_smooth(data = talk_page_edits_all, aes(x=date, y=edit_count), se = FALSE) +
    scale_y_continuous("Number of edits per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
  geom_vline(xintercept = vertical_lines,
             linetype = "dashed", color = "black") +
  geom_text(aes(x=as.Date('2019-03-20'), y=1.4E3, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
  geom_text(aes(x=as.Date('2019-06-17'), y=1.4E3, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=1.4E3, label="AMC deployed on all Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
    labs(title = "Mobile web edits to article talk pages on all Wikipedias") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))
p

ggsave("Figures/mobile_web_talk_page_edits_amc_prop.png", p, width = 18, height = 9, units = "in", dpi = 150)
Warning message:
“Ignoring unknown parameters: stat”
`geom_smooth()` using method = 'loess' and formula 'y ~ x'

Monthly proportion of AMC talk page edits

In [2050]:
#Calculate monthly proportion of amc talk page edits

talk_page_edits_monthly_prop <- talk_page_edits %>%
      mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>% 
    summarise(amc_edits_total = sum(amc_edits),
             all_mobile_web_edits_total = sum(all_mobile_web_edits),
             amc_prop = amc_edits_total/all_mobile_web_edits_total *100)

tail(talk_page_edits_monthly_prop)
A tibble: 6 × 4
date        amc_edits_total  all_mobile_web_edits_total  amc_prop
<date>      <int>            <int>                       <dbl>
2019-05-01    50              9119                        0.5483057
2019-06-01    47              8339                        0.5636167
2019-07-01   241              9309                        2.5888925
2019-08-01  1001             11183                        8.9510865
2019-09-01  3501              9951                       35.1823937
2019-10-01  4799             11411                       42.0559110

Year over year changes in the overall rate of mobile web talk page edits

In [1907]:
#Calculate year over year changes in overall rate

talk_pages_edits_yoy <- talk_page_edits_clean %>%
    filter(edit_type == "all_mobile_web_edits") %>% #Look at all mobile web edits
    mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>%
    summarise(edit_count = sum(edit_count)) %>%
    arrange(date) %>%
    mutate(yearOveryear = (edit_count/lag(edit_count,12) -1)*100) 


tail(talk_pages_edits_yoy)
A tibble: 6 × 3
date        edit_count  yearOveryear
<date>      <int>       <dbl>
2019-05-01   9119             NA
2019-06-01   8339       20.66271
2019-07-01   9309       30.61597
2019-08-01  11183       61.79109
2019-09-01   9951       65.49144
2019-10-01  11077       69.03708

The number of talk page edits on mobile web has been increasing for the past year so these increases cannot be solely attributed to AMC mode; however, higher year over year increases are seen starting in August 2019 around the date AMC was deployed to all Wikipedias.

From September 2019 to October 2019, the proportion of AMC edits increased by 19.5% in relative terms (from 35.2% to 42.1% of mobile web talk page edits). AMC edits accounted for 42.1% of all talk page edits made by logged-in users on mobile web in October 2019.
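The 19.5% figure is a relative change in the AMC share, not a change in percentage points, and the two are easy to confuse. The short sketch below recomputes both from the September and October 2019 counts in the monthly proportion table; the variable names are illustrative only.

```r
# Sketch only: relative vs. percentage-point change in the AMC share of edits.
amc_prop_sep <- 3501 / 9951 * 100    # September 2019 AMC share of talk page edits
amc_prop_oct <- 4799 / 11411 * 100   # October 2019 AMC share of talk page edits

# Relative change: how much the share grew compared with its September level.
relative_change_pct <- (amc_prop_oct / amc_prop_sep - 1) * 100

# Percentage-point change: simple difference between the two shares.
point_change <- amc_prop_oct - amc_prop_sep

print(round(relative_change_pct, 1))
print(round(point_change, 1))
```

The same distinction applies to the English Wikipedia comparison later in this section.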

In [1289]:
head(talk_page_edits_clean)
A data.frame: 6 × 4
date        wiki     edit_type               edit_count
<date>      <chr>    <chr>                   <int>
2019-10-29  nlwiki   other_mobile_web_edits   0
2019-10-29  barwiki  other_mobile_web_edits   0
2019-10-29  trwiki   other_mobile_web_edits   0
2019-10-29  uzwiki   other_mobile_web_edits   1
2019-10-29  eswiki   other_mobile_web_edits  27
2019-10-29  lbwiki   other_mobile_web_edits   0

Rate of access to article talk pages on English Wikipedia

In [1909]:
talk_page_views_all_enwiki <- talk_page_views %>%
    filter(access_method == 'mobile web',
          project == 'en.wikipedia') %>%
    mutate(date = floor_date(date, 'month')) %>% 
    group_by(date) %>%
    summarise(views = sum(views))
In [2038]:
# Plot YoY Changes on English Wikipedia

talk_page_views_enwiki_yoy_plot <- talk_page_views_all_enwiki %>%
 mutate(year = case_when(date >= '2018-06-01' & date < '2019-06-01' ~ '2018/2019',
                         date >= '2019-06-01' & date < '2020-06-01' ~ '2019/2020'),
         MonthN =as.factor(format(as.Date(date),"%m")),
         Month = months(as.Date(date), abbreviate=TRUE))


talk_page_views_enwiki_yoy_plot$MonthN = factor(talk_page_views_enwiki_yoy_plot$MonthN, levels=c("06", "07", "08", "09", "10","11", "12", "01", "02", "03", 
                                                                           "04", "05" ))

talk_page_views_enwiki_yoy_plot$year = factor(talk_page_views_enwiki_yoy_plot$year, levels = c('2018/2019', '2019/2020'))


p <- ggplot(talk_page_views_enwiki_yoy_plot, aes(x=MonthN, y = views, group = year, color = year, linetype = year)) +    
  geom_line(size = 0.8) +
  scale_y_continuous("Number of views per month", labels = polloi::compress) +
  scale_x_discrete(breaks = talk_page_views_enwiki_yoy_plot$MonthN, labels = talk_page_views_enwiki_yoy_plot$Month )+
    geom_vline(xintercept = c(10, 1, 3), linetype = "dashed", color = "black") +
    geom_text(aes(x=10, y=3E6, label="AMC deployed on Arabic, Indonesian, and Spanish Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
    geom_text(aes(x=1, y=3E6, label="AMC deployed on Italian, Japanese, Persian, and Thai Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
   geom_text(aes(x=3, y=3E6, label="AMC deployed on all Wikipedias"), size=3.6, vjust = 1.5, angle = 90, color = "black") +
  labs(title = "Mobile web talk page views on English Wikipedia") +
        xlab("Month") +
        scale_color_brewer(palette = 'Set1', breaks=c('2018/2019', '2019/2020'), direction= -1) +
        scale_linetype_manual(breaks=c('2018/2019', '2019/2020'), values=c(2,1)) +
  ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
  theme(plot.title = element_text(hjust = 0.5),
        legend.title = element_blank(),
        legend.position = "bottom",
        panel.grid = element_line("gray70"),
        legend.key.width=unit(0.5,"cm"))
p 

ggsave("Figures/mobile_web_talk_page_views_enwiki.png", p, width = 18, height = 9, units = "in", dpi = 150)

Year over year changes in mobile web views to article talk pages

In [1915]:
# Calculate year over year changes on English Wikipedia

talk_page_views_enwiki_yoy <- talk_page_views_all_enwiki %>%
   mutate(date = floor_date(date, 'month')) %>% 
   group_by(date)%>%
   summarise(views = sum(views)) %>%
   arrange(date) %>%
   mutate(yoy_percent = (views/lag(views,12) -1) *100) %>%
   arrange(desc(date))

head(talk_page_views_enwiki_yoy)
A tibble: 6 × 3
date        views    yoy_percent
<date>      <int>    <dbl>
2019-10-01  2139701   -8.789061
2019-09-01  1582205  -50.762338
2019-08-01  1268353  -52.309524
2019-07-01  1112919  -68.682060
2019-06-01  1061270  -31.217809
2019-05-01  1126863          NA

Similar to trends seen across all Wikipedia projects, there was a decrease in the number of views to article talk pages on English Wikipedia from January to May 2019; however, there was a sharp increase in the number of views starting in June 2019. Other variables may be contributing to these increases and decreases, but a review of the proportion of talk page requests made while in AMC mode will help indicate how much AMC contributes to these views.

In [1917]:
talk_page_requests_enwiki = talk_page_requests_clean %>%
    mutate(date = floor_date(date, "week")) %>%
    filter(request_type == "all_mobile_web_requests",
          wiki == 'en.wikipedia',
          date != '2019-08-18',
          date != '2019-11-24') %>% #remove weeks with incomplete data
     group_by(date) %>%
    summarize(request_count = sum(request_count))
In [2039]:
talk_page_requests_enwiki_byrequest <- talk_page_requests_clean  %>%
      mutate(date = floor_date(date, "week")) %>%
          filter(request_type != "all_mobile_web_requests",
          wiki == 'en.wikipedia',
          date != '2019-08-18',
          date != '2019-11-24') %>% #remove weeks with incomplete data
      group_by(date, request_type) %>%
      summarise(request_count = sum(request_count)) 

p <- ggplot() +
 geom_bar(data =talk_page_requests_enwiki_byrequest, aes(x= date, y= request_count, fill = request_type),stat= "identity", position = "stack") + 
 geom_smooth(data = talk_page_requests_enwiki , aes(x=date, y=request_count), se = FALSE) +
    scale_y_continuous("Number of requests per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %d %Y"), date_breaks = "1 week") +
    labs(title = "Mobile web requests to talk pages on English Wikipedia",
                subtitle = "Based on a sample of data collected from webrequest data") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))
           
p

ggsave("Figures/mobile_web_talk_page_request_amc_prop_enwiki.png", p, width = 18, height = 9, units = "in", dpi = 150)
`geom_smooth()` using method = 'loess' and formula 'y ~ x'

Proportion of AMC views to article talk pages

In [1919]:
#Calculate monthly proportion of AMC requests on English Wiki 

talk_page_requests_monthly_prop_enwiki <- talk_page_requests %>%
    filter(wiki == 'en.wikipedia')  %>% 
      mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>% 
    summarise(amc_requests_total = sum(amc_request),
             all_mobile_web_requests_total = sum(all_mobile_web_requests),
             amc_prop = amc_requests_total/all_mobile_web_requests_total *100)

talk_page_requests_monthly_prop_enwiki
A tibble: 4 × 4
date        amc_requests_total  all_mobile_web_requests_total  amc_prop
<date>      <int>               <int>                          <dbl>
2019-08-01   19                  77                            24.67532
2019-09-01  162                 388                            41.75258
2019-10-01  189                 463                            40.82073
2019-11-01  211                 441                            47.84580

Month over month changes in all mobile web views to article talk pages

In [1920]:
#Calculate month over month changes on English Wiki 

talk_pages_requests_mom <- talk_page_requests_clean %>%
    filter(wiki == 'en.wikipedia',
          request_type == "all_mobile_web_requests",
          date >= '2019-09-01',
          date < '2019-11-01') %>% #Look at all mobile web requests
    mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>%
    summarise(request_count = sum(request_count)) %>%
    arrange(date) %>%
    mutate(monthOvermonth= request_count/lag(request_count,1) -1) 


tail(talk_pages_requests_mom)
A tibble: 2 × 3
date        request_count  monthOvermonth
<date>      <int>          <dbl>
2019-09-01  388                  NA
2019-10-01  463            0.193299

From September 2019 to October 2019, the number of sampled pageviews to talk pages on mobile web on English Wikipedia increased by 19.3%. About 41% of the October views came from users in AMC mode.

Article talk page edits on English Wikipedia

In [1921]:
talk_page_edits_enwiki <- talk_page_edits_clean %>%
  filter(edit_type == 'all_mobile_web_edits',
        wiki == 'enwiki') %>%
  mutate(date = floor_date(date, "week")) %>%
  filter(date != '2018-05-27',
        date !=  '2019-10-27') %>% #remove incomplete data weeks
  group_by(date) %>%
  summarise(edit_count = sum(edit_count))
In [2056]:
#Enwiki Mobile Web Edits by edit type

talk_page_edits_bytype_enwiki <- talk_page_edits_clean %>%
  filter(edit_type != 'all_mobile_web_edits',
        wiki == 'enwiki') %>%
  mutate(date = floor_date(date, "week")) %>%
  filter(date != '2018-05-27',
        date !=  '2019-10-27') %>% #remove incomplete data weeks
  group_by(date, edit_type) %>%
  summarise(edit_count = sum(edit_count)) 


p <- ggplot() +
  geom_col(data =talk_page_edits_bytype_enwiki, aes(x= date, y= edit_count, fill = edit_type), stat= 'identity', position = "stack") + 
    geom_smooth(data = talk_page_edits_enwiki, aes(x=date, y=edit_count), se = FALSE) +
    scale_y_continuous("Number of edits per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
  geom_vline(xintercept = as.Date('2019-08-07'),
             linetype = "dashed", color = "black") +
   geom_text(aes(x=as.Date('2019-08-07'), y=800, label="AMC deployed on all Wikipedias"), size=3.8, vjust = -1, angle = 90, color = "black") +
    labs(title = "Mobile web edits to article talk pages \n on English Wikipedia") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))
p

ggsave("Figures/mobile_web_talk_page_edits_enwiki.png", p, width = 18, height = 9, units = "in", dpi = 150)
Warning message:
“Ignoring unknown parameters: stat”
`geom_smooth()` using method = 'loess' and formula 'y ~ x'

Proportion of AMC talk page edits on English Wikipedia

In [1925]:
#Calculate monthly proportion of talk page edits on English Wikipedia

talk_page_edits_monthly_prop_enwiki <- talk_page_edits %>%
      filter(wiki == 'enwiki') %>%
      mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>% 
    summarise(amc_edits_total = sum(amc_edits),
             all_mobile_web_edits_total = sum(all_mobile_web_edits),
             amc_prop = amc_edits_total/all_mobile_web_edits_total *100)

tail(talk_page_edits_monthly_prop_enwiki)
A tibble: 6 × 4
date        amc_edits_total  all_mobile_web_edits_total  amc_prop
<date>      <int>            <int>                       <dbl>
2019-05-01     0             4416                         0.000000
2019-06-01     0             4196                         0.000000
2019-07-01     0             4455                         0.000000
2019-08-01   430             5410                         7.948244
2019-09-01  1565             4641                        33.721181
2019-10-01  2630             5903                        44.553617

Year over year changes in talk page edits on English Wikipedia

In [2057]:
#Calculate year over year changes on English Wikipedia

talk_pages_edits_yoy_enwiki <- talk_page_edits_clean %>%
    filter(wiki == 'enwiki',
            edit_type == "all_mobile_web_edits") %>% #Look at all mobile web edits
    mutate(date = floor_date(date, "month")) %>%
    group_by(date) %>%
    summarise(edit_count = sum(edit_count)) %>%
    arrange(date) %>%
    mutate(yearOveryear = (edit_count/lag(edit_count,12) -1) *100)


tail(talk_pages_edits_yoy_enwiki)
A tibble: 6 × 3
date        edit_count  yearOveryear
<date>      <int>       <dbl>
2019-05-01  4416               NA
2019-06-01  4196         16.49084
2019-07-01  4455         24.54571
2019-08-01  5410         67.80397
2019-09-01  4641         59.26561
2019-10-01  5753        103.14266

From September 2019 to October 2019, the proportion of AMC edits on English Wikipedia increased by 32.1% in relative terms (from 33.7% to 44.6%), compared to 19.5% across all Wikipedias. AMC edits accounted for 44.6% of all talk page edits made by logged-in users on mobile web on English Wikipedia in October 2019.

Rate of access and edits to article talk pages on target Wikipedias

Rate of access to article talk pages on Target Wikipedias

In [1929]:
talk_page_views_all_targetwiki_March <- talk_page_views %>%
    filter(access_method == 'mobile web',
          project %in% c('ar.wikipedia', 'es.wikipedia', 'id.wikipedia')) %>%
    mutate(date = floor_date(date, 'month')) %>% 
    group_by(date, project) %>%
    summarise(views = sum(views))
In [2042]:
# Plot YoY Changes on Target Wikipedia

talk_page_views_targetwiki_yoy_plot_March <- talk_page_views_all_targetwiki_March %>%
group_by(project)  %>%
 mutate(year = case_when(date >= '2018-06-01' & date < '2019-06-01' ~ '2018/2019',
                         date >= '2019-06-01' & date < '2020-06-01' ~ '2019/2020'),
         MonthN =as.factor(format(as.Date(date),"%m")),
         Month = months(as.Date(date), abbreviate=TRUE))


talk_page_views_targetwiki_yoy_plot_March$MonthN = factor(talk_page_views_targetwiki_yoy_plot_March$MonthN, levels=c("06", "07", "08", "09", "10","11", "12", "01", "02", "03", 
                                                                           "04", "05" ))

talk_page_views_targetwiki_yoy_plot_March$year = factor(talk_page_views_targetwiki_yoy_plot_March$year, levels = c('2018/2019', '2019/2020'))


p <- ggplot(talk_page_views_targetwiki_yoy_plot_March, aes(x=MonthN, y = views, group = year, color = year, linetype = year)) +    
  geom_line(size = 0.8) +
  facet_wrap(~ project, nrow = 7, scale = "free_y") +
  scale_y_continuous("Number of views per month", labels = polloi::compress) +
 scale_x_discrete(breaks = talk_page_views_targetwiki_yoy_plot_March$MonthN, labels = talk_page_views_targetwiki_yoy_plot_March$Month )+
  labs(title = "Mobile web talk page views on target Wikipedia projects \n Where AMC was deployed on March 20, 2019") +
        xlab("Month") +
        scale_color_brewer(palette = 'Set1', breaks=c('2018/2019', '2019/2020'), direction= -1) +
        scale_linetype_manual(breaks=c('2018/2019', '2019/2020'), values=c(2,1)) +
  ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
  theme(plot.title = element_text(hjust = 0.5),
        legend.title = element_blank(),
        legend.position = "bottom",
        panel.grid = element_line("gray70"),
        legend.key.width=unit(0.5,"cm"))
p 

ggsave("Figures/mobile_web_talk_page_views_target_March.png", p, width = 18, height = 9, units = "in", dpi = 150)
In [1931]:
talk_page_views_all_targetwiki_June <- talk_page_views %>%
    filter(access_method == 'mobile web',
          project %in% c('it.wikipedia', 'ja.wikipedia', 
                       'fa.wikipedia', 'th.wikipedia')) %>%
    mutate(date = floor_date(date, 'month')) %>% 
    group_by(date, project) %>%
    summarise(views = sum(views))
In [2043]:
# Plot YoY Changes on Target Wikipedia

talk_page_views_targetwiki_yoy_plot_June <- talk_page_views_all_targetwiki_June %>%
group_by(project)  %>%
 mutate(year = case_when(date >= '2018-06-01' & date < '2019-06-01' ~ '2018/2019',
                         date >= '2019-06-01' & date < '2020-06-01' ~ '2019/2020'),
         MonthN =as.factor(format(as.Date(date),"%m")),
         Month = months(as.Date(date), abbreviate=TRUE))


talk_page_views_targetwiki_yoy_plot_June$MonthN = factor(talk_page_views_targetwiki_yoy_plot_June$MonthN, levels=c("06", "07", "08", "09", "10","11", "12", "01", "02", "03", 
                                                                           "04", "05" ))

talk_page_views_targetwiki_yoy_plot_June$year = factor(talk_page_views_targetwiki_yoy_plot_June$year, levels = c('2018/2019', '2019/2020'))


p <- ggplot(talk_page_views_targetwiki_yoy_plot_June, aes(x=MonthN, y = views, group = year, color = year, linetype = year)) +    
  geom_line(size = 0.8) +
  facet_wrap(~ project, nrow = 7, scale = "free_y") +
  scale_y_continuous("Number of views per month", labels = polloi::compress) +
 scale_x_discrete(breaks = talk_page_views_targetwiki_yoy_plot_June$MonthN, labels = talk_page_views_targetwiki_yoy_plot_June$Month )+
  labs(title = "Mobile web talk page views \n on target Wikipedia projects \n Where AMC was deployed on June 17, 2019") +
        xlab("Month") +
        scale_color_brewer(palette = 'Set1', breaks=c('2018/2019', '2019/2020'), direction= -1) +
        scale_linetype_manual(breaks=c('2018/2019', '2019/2020'), values=c(2,1)) +
  ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
  theme(plot.title = element_text(hjust = 0.5),
        legend.title = element_blank(),
        legend.position = "bottom",
        panel.grid = element_line("gray70"),
        legend.key.width=unit(0.5,"cm"))
p 

ggsave("Figures/mobile_web_talk_page_views_target_June.png", p, width = 18, height = 9, units = "in", dpi = 150)

There have also been year-over-year increases from June to October 2019 for almost all of the target Wikipedia projects, with the exception of Thai Wikipedia. These increases started prior to AMC deployment, but the rate of increase is higher following deployment, and about 36.3% of all mobile web requests to article talk pages were made in AMC mode in October 2019.

Rate of edits to article talk pages on target Wikipedias

In [1943]:
talk_page_edits_targetwiki_March <- talk_page_edits_clean %>%
  filter(edit_type == 'all_mobile_web_edits',
        wiki %in% c('arwiki', 'eswiki', 'idwiki')) %>%
  mutate(date = floor_date(date, "week")) %>%
  filter(date != '2018-05-27',
        date !=  '2019-10-27') %>% #remove incomplete data weeks
  group_by(date, wiki) %>%
  summarise(edit_count = sum(edit_count))
In [2044]:
# March deployment target wiki Mobile Web Edits by edit type

talk_page_edits_bytype_targetwiki_March <- talk_page_edits_clean %>%
  filter(edit_type != 'all_mobile_web_edits',
         wiki %in% c('arwiki', 'eswiki', 'idwiki')) %>%
  mutate(date = floor_date(date, "week")) %>%
  filter(date != '2018-05-27',
        date !=  '2019-10-27') %>% #remove incomplete data weeks
  group_by(date, wiki, edit_type) %>%
  summarise(edit_count = sum(edit_count)) 


p <- ggplot() +
  geom_col(data =talk_page_edits_bytype_targetwiki_March, aes(x= date, y= edit_count, fill = edit_type), stat= 'identity', position = "stack") + 
    geom_smooth(data = talk_page_edits_targetwiki_March, aes(x=date, y=edit_count), se = FALSE) +
    facet_wrap(~ wiki, nrow = 7, scale = "free_y") +
    scale_y_continuous("Number of edits per week", labels = polloi::compress) +
    scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
    labs(title = "Mobile web edits to article talk pages on target Wikipedia \n Where AMC was deployed on March 20, 2019") +
     ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
     theme(axis.text.x=element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position="bottom",
        legend.title=element_blank(),
        legend.text=element_text(size=14))
p
ggsave("Figures/mobile_web_talk_page_edits_target_March.png", p, width = 18, height = 9, units = "in", dpi = 150)
Warning message:
“Ignoring unknown parameters: stat”
`geom_smooth()` using method = 'loess' and formula 'y ~ x'
In [1946]:
# June 17 deployment target wikis (variables carry a _July suffix):
# total mobile web talk page edits per week
talk_page_edits_targetwiki_July <- talk_page_edits_clean %>%
  filter(edit_type == 'all_mobile_web_edits',
         wiki %in% c('itwiki', 'jawiki',
                     'fawiki', 'thwiki')) %>%
  mutate(date = floor_date(date, "week")) %>%
  filter(date != '2018-05-27',
         date != '2019-10-27') %>% # remove incomplete data weeks
  group_by(date, wiki) %>%
  summarise(edit_count = sum(edit_count))
In [2045]:
# June 17 deployment target wikis: mobile web edits by edit type

talk_page_edits_bytype_targetwiki_July <- talk_page_edits_clean %>%
  filter(edit_type != 'all_mobile_web_edits',
         wiki %in% c( 'itwiki', 'jawiki', 
                       'fawiki', 'thwiki')) %>%
  mutate(date = floor_date(date, "week")) %>%
  filter(date != '2018-05-27',
        date !=  '2019-10-27') %>% #remove incomplete data weeks
  group_by(date, wiki, edit_type) %>%
  summarise(edit_count = sum(edit_count)) 


p <- ggplot() +
  geom_col(data = talk_page_edits_bytype_targetwiki_July,
           aes(x = date, y = edit_count, fill = edit_type),
           position = "stack") + # geom_col() already uses stat = "identity"
  geom_smooth(data = talk_page_edits_targetwiki_July,
              aes(x = date, y = edit_count), se = FALSE) +
  facet_wrap(~ wiki, nrow = 7, scales = "free_y") +
  scale_y_continuous("Number of edits per week", labels = polloi::compress) +
  scale_x_date("Date", labels = date_format("%b %Y"), date_breaks = "1 month") +
  labs(title = "Mobile web edits to article talk pages on target Wikipedias \n Where AMC was deployed on June 17, 2019") +
  ggthemes::theme_tufte(base_size = 14, base_family = "Gill Sans") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        panel.grid = element_line("gray70"),
        legend.position = "bottom",
        legend.title = element_blank(),
        legend.text = element_text(size = 14))
p

ggsave("Figures/mobile_web_talk_page_views_target_June.png", p, width = 18, height = 9, units = "in", dpi = 150)
`geom_smooth()` using method = 'loess' and formula 'y ~ x'
`geom_smooth()` using method = 'loess' and formula 'y ~ x'

Summary of Mobile Web Talk Page Edits on Target Wikis - October 2019

Wiki              Proportion of AMC Edits   Month over Month Change in AMC Proportion   Year over Year Change
Spanish Wiki      28.27%                    57.0%                                       41.8%
Arabic Wiki       39.9%                     75.26%                                      16.9%
Indonesian Wiki   13.2%                     -39.1%                                      58.5%
Italian Wiki      33.0%                     146.27%                                     31.5%
Japanese Wiki     32.8%                     -6.21%                                      71.2%
Persian Wiki      12.4%                     -47.23%                                     45.2%
Thai Wiki         18.8%                     408.11%                                     72.5%
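
As a rough illustration of how figures like those in the table above could be derived, the sketch below computes the proportion of AMC edits per month and its month-over-month relative change. It is not the code used for this report: the counts are made-up toy data, the edit_type labels "amc" and "non_amc" are hypothetical, and reading "month over month change" as a relative change in the proportion is an assumption. It uses base R only so that it runs without the notebook's packages.

```r
# Toy monthly counts for one wiki (made-up numbers, not report data).
# The edit_type labels "amc" and "non_amc" are hypothetical.
toy_edits <- data.frame(
  month      = as.Date(c("2019-09-01", "2019-09-01", "2019-10-01", "2019-10-01")),
  edit_type  = c("amc", "non_amc", "amc", "non_amc"),
  edit_count = c(20, 80, 30, 70)
)

# Total mobile web talk page edits and AMC-mode edits per month
totals <- aggregate(edit_count ~ month, data = toy_edits, FUN = sum)
amc    <- aggregate(edit_count ~ month,
                    data = toy_edits[toy_edits$edit_type == "amc", ],
                    FUN = sum)

monthly <- merge(totals, amc, by = "month", suffixes = c("_total", "_amc"))
monthly <- monthly[order(monthly$month), ]

# Proportion of edits made in AMC mode
monthly$amc_prop <- monthly$edit_count_amc / monthly$edit_count_total

# Month-over-month relative change in that proportion (NA for the first month)
monthly$mom_change <- c(NA, diff(monthly$amc_prop) / head(monthly$amc_prop, -1))

print(monthly)
```

With these toy numbers the AMC proportion moves from 20% to 30%, a relative month-over-month change of 50%.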

On target wikis, the rate of talk page edits has increased, but not as sharply as on English Wikipedia. In addition, these increases began before AMC deployment and are likely due to other factors.

There is also much greater fluctuation in the proportion of talk page edits made in AMC mode. On smaller wikis such as Thai Wikipedia, the proportion of AMC edits is driven by only a few users, so the number of edits fluctuates more. In October 2019, the highest proportion of AMC edits was made by contributors on Arabic Wikipedia (39.9%), followed by Italian (33.0%), Japanese (32.8%), and Spanish (28.27%) Wikipedias.
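
To make the small-wiki effect concrete, here is a minimal arithmetic sketch using hypothetical weekly counts (not taken from the report data). It adds the same 10 AMC-mode talk page edits, roughly what one active contributor might make, to a small wiki and to a large wiki that start at the same AMC share. The helper amc_share() is introduced only for this illustration.

```r
# Share of talk page edits made in AMC mode (hypothetical helper)
amc_share <- function(amc_edits, total_edits) amc_edits / total_edits

# Ten extra AMC-mode edits in one week, e.g. from a single contributor
extra <- 10

# Small wiki: 5 AMC edits out of 40 mobile web talk page edits
small_before <- amc_share(5, 40)
small_after  <- amc_share(5 + extra, 40 + extra)

# Large wiki: 500 AMC edits out of 4000 mobile web talk page edits
large_before <- amc_share(500, 4000)
large_after  <- amc_share(500 + extra, 4000 + extra)

round(100 * c(small_before, small_after), 1)  # 12.5 30.0
round(100 * c(large_before, large_after), 1)  # 12.5 12.7
```

The same ten edits more than double the AMC share on the small wiki while barely moving it on the large one, which is why month-over-month changes on wikis such as Thai Wikipedia should be read with caution.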