Megan Neisler, Staff Data Scientist, Wikimedia Foundation
Modified
2025-02-24
Purpose
In T378777, the Editing team changed the desktop Reference Check experience by presenting the check in a side rail located adjacent to the editable content, rather than presenting the Check within the editable content.
This change was deployed on 12 December 2024 to all wikis where Reference Check is currently offered as default. At the time of deployment, this included all Wikipedias except English and German Wikipedia. See Edit check/Deployment Status.
The purpose of this analysis is to review any changes in constructive activation, as defined by the WE 1.2 KR, and in the identified Edit Check guardrail metrics by comparing data from the two weeks before and after the change.
Research Questions
In the two weeks before and after this change was merged,
Do we notice a change in constructive activation rates?
Do we notice a change in the frequency and number of edit checks presented?
The number of editing sessions where edit check was shown
Do we notice a change in how likely edit check is to disrupt the person’s editing experience, as measured by:
Edit completion rate
False positive report rate
Revert rate
Do we notice a change in how likely edit check is to cause people to publish constructive edits as measured by:
Absolute number of people that made a change to address the policy violation edit check was alerting them of.
Proportion of all published new content edits where the edit check was shown that made a change to address the policy violation edit check was alerting them of.
Summary of Findings
Overall, we did not observe any sharp declines or increases in constructive activation or the identified guardrail metrics following the deployment of the change.
We did observe some slight positive changes indicating that moving reference check to the side rail may be increasing user engagement with the check. These positive changes include an increase in the proportion of edits that were shown reference check and added a reference, as well as a decrease in revert rate. However, as this is not a controlled experiment, external factors may contribute to some of these observed changes.
Constructive Activation Rate
During this reviewed timeframe, reference check was shown to 14% of all VisualEditor edits and 6.7% of all newcomers that created an account.
There were no significant changes in constructive activation rates. The constructive activation rate was 28.7% prior to the Multi-Check Phase 1 deployment and 28.9% after the deployment. This represents only a 0.7% relative increase (0.2 percentage points), which is not statistically significant.
Edit Completion Rate
Overall, there were no changes in edit completion rate with the inclusion of edits that were reverted. 80% of edits that were shown reference check were successfully saved pre and post the move of edit checks to the side rail.
If we exclude reverted edits, there was a slight increase (3.2% increase [2 percentage points]) in edits completed. 68% of edits where reference check was presented were successfully saved and not reverted following the move of the check to the side rail, compared to 66% prior to the change.
We also observed similar increases in edit completion rate (excluding reverted edits) across all experience level groups (newcomers, junior contributors, and unregistered users) and wikis.
False Positive Rate
Overall declines of reference check decreased by 4.7%. 51.2% of edit attempts included an explicit reason for declining a reference check following the change, while 53.7% of edit attempts included a decline reason prior to the change.
There was a 6.4% (0.5 percentage points, 7.8% pre to 8.3% post) increase in the proportion of edit attempts that indicated that the reference check presented was irrelevant; however, the rate of all other types of declines decreased.
There were no significant changes in decline rates by editor experience level.
Revert Rate
We observed the most significant change in revert rates pre and post the move of reference check to the side rail.
The revert rate of new content edits where reference check was presented decreased by 15.7% (20.4% pre to 17.2% post change).
We observed revert rate decreases across all experience level groups (unregistered, newcomer, and junior contributors) and across the majority of Wikipedias, except for Spanish Wikipedia which had a 2 percentage point increase in revert rate.
Total distinct users that included a reference after being shown reference check
Overall there was a slight decrease in the absolute number of users that added a reference after being shown reference check (166 fewer users across all wikis); however, there were no significant changes around the date edit check was moved to the side rail.
Additionally, the lower number of users is likely to be affected by seasonal trends and changes in editing activity around the December holidays.
Proportion of edits that included a reference after being shown reference check
There was an 8% increase in the proportion of new content edits that included a reference following the change (34.8% pre change to 37.9% post change).
Increases were observed across all editor experience levels and most wikis.
Constructive Activation Rate
For WE 1.2 KR, we defined constructive activation as: “The percentage of newcomers making at least one edit to an article in the main namespace of a Wikipedia project on a mobile device within 24 hours of registration (also on a mobile device) and that edit not being reverted within 48 hours of being published.”
For this analysis, we are reviewing constructive activation on desktop devices instead of mobile, as this is where the Multi-Check Phase 1 change was deployed. Reviewing the impact of edit check changes on this metric will still help us develop a better understanding of how we are tracking against the improvement targets and decide if any adjustments to the strategy are needed.
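The definition above can be illustrated with a minimal base R sketch. It is not the notebook's own method: the notebook works from pre-aggregated edit and revert counts, whereas this sketch looks only at a newcomer's first edit. The data frame, its column names (`registered`, `first_edit`, `reverted_at`) and its values are all made up for illustration.

```r
# Hypothetical per-newcomer records; names and values are illustrative only.
newcomers <- data.frame(
  user_id = c("1-frwiki", "2-frwiki", "3-eswiki", "4-eswiki"),
  registered = as.POSIXct(c("2024-12-01 10:00:00", "2024-12-01 12:00:00",
                            "2024-12-02 09:00:00", "2024-12-03 08:00:00"), tz = "UTC"),
  first_edit = as.POSIXct(c("2024-12-01 11:00:00", "2024-12-03 12:00:00",
                            "2024-12-02 10:00:00", NA), tz = "UTC"),
  reverted_at = as.POSIXct(c(NA, NA, "2024-12-02 20:00:00", NA), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Edit made within 24 hours of registration (users with no edit count as FALSE)
hours_to_edit <- as.numeric(difftime(newcomers$first_edit, newcomers$registered, units = "hours"))
edited_within_24h <- !is.na(hours_to_edit) & hours_to_edit <= 24

# Edit reverted within 48 hours of being published (unreverted edits count as FALSE)
hours_to_revert <- as.numeric(difftime(newcomers$reverted_at, newcomers$first_edit, units = "hours"))
reverted_within_48h <- !is.na(hours_to_revert) & hours_to_revert <= 48

# Constructively activated: edited in time and the edit survived
newcomers$constructive <- edited_within_24h & !reverted_within_48h
activation_rate <- mean(newcomers$constructive)
```

In this toy data only the first user is constructively activated: the second edits too late, the third is reverted within 48 hours, and the fourth never edits.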
Methodology
We gathered desktop registrations from two weeks pre and post deployment of the change deployed on December 12th. For those registrations, we gathered data on edits to a main namespace completed on a desktop device within 24 hours of registration and the reverts of those edits. English and German Wikipedia were excluded as reference check was not available as default on those wikis at the time of this analysis. See Edit check/Deployment Status.
We reviewed changes to constructive activation rates overall as well as for edits where edit check was shown.
# packages the code in this analysis relies on (assumed; the notebook's setup chunk is not shown)
library(dplyr)      # %>%, mutate(), case_when(), summarise()
library(gt)         # table formatting
library(ggplot2)    # plots
library(IRdisplay)  # display_html()

# load data for assessing activations
all_users_edit_data <- read.csv(
  file = 'data/activation-edit-data.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)

# reformat user_id and adjust to include wiki to account for duplicate user id instances.
# Users do not have the same user_id on different wikis
all_users_edit_data$user_id <- as.character(
  paste(all_users_edit_data$user_id, all_users_edit_data$wiki_db, sep = "-")
)

# Check for duplicate user ids
length(unique(all_users_edit_data$user_id)) == nrow(all_users_edit_data)

TRUE

# format registration timestamp to day
all_users_edit_data$user_registration_timestamp <- as.Date(
  all_users_edit_data$user_registration_timestamp,
  format = "%Y-%m-%d"
)

# add column to calculate pre and post dates based on registration timestamp
all_users_edit_data <- all_users_edit_data %>%
  mutate(
    pre_post = case_when(
      user_registration_timestamp >= '2024-11-27' & user_registration_timestamp <= '2024-12-11' ~ "pre",
      user_registration_timestamp >= '2024-12-12' & user_registration_timestamp <= '2024-12-26' ~ "post"
    ),
    pre_post = factor(pre_post, levels = c("pre", "post"))
  )
Number of edits completed by newcomers
We want to first take a quick look at the types of edits newcomers complete 24 hours after registering to understand how many of these users are encountering reference check.
# filter out users that registered on mobile web or mobile app
all_users_edit_desktop <- all_users_edit_data %>%
  filter(reg_on_mobile == 0 & reg_w_api == 0)

# Total Edits
all_users_edit_desktop %>%
  # group_by(pre_post) %>%
  summarise(
    num_article_edits_24hrs_all = sum(num_article_edits_24hrs_all),
    num_article_edits_24hrs_visualeditor = sum(num_article_edits_24hrs_visualeditor),
    num_article_edits_24hrs_editcheck = sum(num_article_edits_24hrs_editcheck)
  )
66% of all edits completed by newcomers 24 hours after registering are completed on VisualEditor. About 14% of these edits were shown reference check.
6.7% of all newcomers were shown edit check at least once.
Note: No changes were made in the Multi-Check Phase 1 deployment which would impact how frequently the check was shown. We did observe decreases in the absolute number of all types of edits following the change; this is likely related to seasonal changes around the end-of-December holidays.
## add column to define constructive activation by VE
all_users_edit_desktop <- all_users_edit_desktop %>%
  mutate(
    is_constr_activated_visualeditor = case_when(
      (num_article_edits_24hrs_visualeditor - num_article_reverts_24hrs_visualeditor) > 0 ~ "constr_activation_visualeditor",
      TRUE ~ 'not_constr_activated'
    )
  )

## add column to define constructive activation by EditCheck
## Defining as a user shown edit check at least once and defined as making constructive edits
all_users_edit_desktop <- all_users_edit_desktop %>%
  mutate(
    is_constr_activated_editcheck = case_when(
      num_article_edits_24hrs_editcheck > 0 & (num_article_edits_24hrs_all - num_article_reverts_24hrs_all) > 0 ~ "constr_activation_editcheck",
      TRUE ~ 'not_constr_activated'
    )
  )

# check that activation was defined appropriately
all_users_edit_desktop %>%
  filter(is_activated == 'is_activated', num_article_edits_24hrs_all > 0) %>%
  slice_head(n = 5)
## VE
constructive_activation_visualeditor <- all_users_edit_desktop %>%
  group_by(pre_post, is_constr_activated_visualeditor) %>%
  summarise(num_users = n()) %>%
  mutate(pct_users = paste0(round(num_users / sum(num_users) * 100, 1), "%")) %>%
  filter(is_constr_activated_visualeditor == 'constr_activation_visualeditor') %>%
  select(-2) %>%
  ungroup() %>%
  gt() %>%
  opt_stylize(5) %>%
  tab_header(title = "Constructive Activation Rates for Newcomers that used VisualEditor") %>%
  cols_label(
    pre_post = "Pre or post change",
    num_users = "Number of newcomers",
    pct_users = "Constructive Activation Rates"
  )

display_html(as_raw_html(constructive_activation_visualeditor))

Constructive Activation Rates for Newcomers that used VisualEditor

| Pre or post change | Number of newcomers | Constructive Activation Rates |
|---|---|---|
| pre | 4648 | 21.5% |
| post | 3706 | 21.6% |
Constructive Activation Rates for Newcomers presented at least one reference check
Definition: The percentage of all newcomers that were presented at least one Reference Check and made at least one edit to an article in the main namespace of a Wikipedia project on a desktop device within 24 hours of registration, with that edit not being reverted within 48 hours of being published.
## EditCheck
constructive_activation_ec <- all_users_edit_desktop %>%
  group_by(pre_post, is_constr_activated_editcheck) %>%
  summarise(num_users = n()) %>%
  mutate(pct_users = round(num_users / sum(num_users) * 100, 2)) %>%
  filter(is_constr_activated_editcheck == 'constr_activation_editcheck') %>%
  select(-2) %>%
  ungroup() %>%
  gt() %>%
  opt_stylize(5) %>%
  tab_header(title = "Constructive Activation Rates for Newcomers shown Reference Check") %>%
  cols_label(
    pre_post = "Pre or post change",
    num_users = "Number of newcomers",
    pct_users = "Constructive Activation Rates"
  )

display_html(as_raw_html(constructive_activation_ec))

Constructive Activation Rates for Newcomers shown Reference Check

| Pre or post change | Number of newcomers | Constructive Activation Rates |
|---|---|---|
| pre | 1205 | 5.57 |
| post | 1017 | 5.92 |
The low proportion here primarily reflects that only a small proportion of all users that created an account reach the stage where a reference check would be presented (when they attempt to save).
A newcomer would need to successfully transition through several stages after creating an account before reaching this stage. The work that will be completed in T385906 will help visualize the full constructive activation funnel and help better isolate the impact of edit check on this metric.
Constructive Activation Rates By Wiki
# all desktop by wiki
constructive_activation_wiki <- all_users_edit_desktop %>%
  group_by(wiki_db, pre_post, is_constr_activated) %>%
  summarise(num_users = n_distinct(user_id)) %>%
  mutate(pct_users = paste0(round(num_users / sum(num_users) * 100, 2), "%")) %>%
  filter(is_constr_activated == 'is_constr_activated', num_users > 250) %>% ## Limit to wikis with over 250 users
  group_by(wiki_db) %>%
  # select(-2) %>%
  gt() %>%
  opt_stylize(5) %>%
  tab_header(title = "Constructive Activation Rates by Wiki") %>%
  cols_label(
    wiki_db = "Wikipedia",
    pre_post = "Pre or post change",
    num_users = "Number of newcomers",
    pct_users = "Constructive Activation Rates"
  ) %>%
  tab_footnote(
    footnote = "Limited to wikis with at least 500 newcomers that created accounts during reviewed timeframe",
    locations = cells_column_labels(columns = "num_users")
  )

display_html(as_raw_html(constructive_activation_wiki))

## VE
constructive_activation_visualeditor_wiki <- all_users_edit_desktop %>%
  group_by(wiki_db, pre_post, is_constr_activated_visualeditor) %>%
  summarise(num_users = n()) %>%
  mutate(pct_users = paste0(round(num_users / sum(num_users) * 100, 1), "%")) %>%
  filter(is_constr_activated_visualeditor == 'constr_activation_visualeditor', num_users > 250) %>% # limit to wikis with over 250 users
  group_by(wiki_db) %>%
  # select(-2) %>%
  gt() %>%
  opt_stylize(5) %>%
  tab_header(title = "Constructive Activation Rates by Wiki for Newcomers that used VisualEditor") %>%
  cols_label(
    wiki_db = "Wikipedia",
    pre_post = "Pre or post change",
    num_users = "Number of newcomers",
    pct_users = "Constructive Activation Rates"
  ) %>%
  tab_footnote(
    footnote = "Limited to wikis with at least 500 newcomers that created accounts during reviewed timeframe",
    locations = cells_column_labels(columns = "num_users")
  )

display_html(as_raw_html(constructive_activation_visualeditor_wiki))
Constructive Activation Rates by Wiki for Newcomers that used VisualEditor

| Wikipedia | Pre or post change | is_constr_activated_visualeditor | Number of newcomers [1] | Constructive Activation Rates |
|---|---|---|---|---|
| eswiki | pre | constr_activation_visualeditor | 611 | 19.8% |
| eswiki | post | constr_activation_visualeditor | 365 | 19.5% |
| frwiki | pre | constr_activation_visualeditor | 696 | 24.9% |
| frwiki | post | constr_activation_visualeditor | 537 | 24% |
| jawiki | pre | constr_activation_visualeditor | 315 | 28% |
| jawiki | post | constr_activation_visualeditor | 294 | 27.2% |
| ptwiki | pre | constr_activation_visualeditor | 380 | 22.2% |
| ptwiki | post | constr_activation_visualeditor | 306 | 25.3% |
| ruwiki | pre | constr_activation_visualeditor | 317 | 18.1% |
| ruwiki | post | constr_activation_visualeditor | 303 | 20.5% |

[1] Limited to wikis with at least 500 newcomers that created accounts during reviewed timeframe
Constructive Edits
Reference check is not presented to newcomers until they attempt to save an edit, requiring them to successfully transition through several stages after creating an account prior to reaching this stage. To help isolate the impact of this intervention on newcomers, we also reviewed changes in overall constructive edit rates. This limits the analysis to newcomers that successfully published an edit.
For this analysis, we’re defining the constructive edit rate as the proportion of all edits completed by newcomers within 24 hours of registering that are not reverted within 48 hours. This is limited to users that were shown reference check at least once within 24 hours after registering.
# constructive edits
constructive_edits_editcheck <- all_users_edit_desktop %>%
  filter(num_article_edits_24hrs_editcheck > 0) %>% # limit to users shown ref check at least once
  group_by(pre_post) %>%
  summarise(
    num_article_edits_total = sum(num_article_edits_24hrs_all),
    num_article_reverts_total = sum(num_article_reverts_24hrs_all)
  ) %>%
  mutate(pct_const = paste0(round((num_article_edits_total - num_article_reverts_total) / num_article_edits_total * 100, 1), "%")) %>%
  gt() %>%
  opt_stylize(5) %>%
  tab_header(title = "Proportion of constructive edits completed by newcomers shown Reference Check at least once") %>%
  cols_label(
    pre_post = "Pre or post change",
    num_article_edits_total = "Total number of edits published",
    num_article_reverts_total = "Total number of edits reverted",
    pct_const = "Constructive Edit Rate"
  ) %>%
  tab_footnote(
    footnote = "Defined as the proportion of all published edits that are not reverted within 48 hours",
    locations = cells_column_labels(columns = "pct_const")
  )

display_html(as_raw_html(constructive_edits_editcheck))

Proportion of constructive edits completed by newcomers shown Reference Check at least once

| Pre or post change | Total number of edits published | Total number of edits reverted | Constructive Edit Rate [1] |
|---|---|---|---|
| pre | 5151 | 924 | 82.1% |
| post | 4431 | 662 | 85.1% |

[1] Defined as the proportion of all published edits that are not reverted within 48 hours
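As a worked check, the rates in the table above can be recomputed from the published and reverted counts. The sketch also shows the difference between a percentage-point change and a relative change, both of which this report uses. The variable names are illustrative and not taken from the notebook.

```r
# Counts copied from the table above
edits_published <- c(pre = 5151, post = 4431)
edits_reverted  <- c(pre = 924,  post = 662)

# Constructive edit rate: share of published edits that were not reverted
constructive_rate <- (edits_published - edits_reverted) / edits_published

# Absolute change in percentage points versus relative change in percent
pp_change  <- (constructive_rate[["post"]] - constructive_rate[["pre"]]) * 100
rel_change <- (constructive_rate[["post"]] / constructive_rate[["pre"]] - 1) * 100

round(constructive_rate * 100, 1)  # pre 82.1, post 85.1
round(pp_change, 1)                # about 3 percentage points
round(rel_change, 1)               # 3.7 percent relative increase
```

The relative change is computed from the unrounded rates, which is why it can differ slightly from a figure derived from the rounded percentages.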
Key Findings:
Reference check is not presented to newcomers until they attempt to save an edit, requiring them to successfully transition through several stages after creating an account before reaching this stage. During this reviewed timeframe, reference check was shown to 14% of all VisualEditor edits and 6.7% of all newcomers that created an account.
There were no significant changes in constructive activation rates when reviewing overall edits or when limiting to edits completed with VisualEditor. The constructive activation rate was 28.7% prior to the Multi-Check Phase 1 deployment and 28.9% after the deployment. This represents only a 0.7% relative increase (0.2 percentage points), which is not statistically significant.
If we limit to only users that published an edit, there was a 3.7% increase in the constructive edit rate following the Multi-Check Phase 1 deployment for newcomers that were presented with at least one reference check.
Note: The work that will be completed in T385906 will help visualize the full constructive activation funnel and help better isolate the impact of edit check on constructive activation rates.
Edit Check Guardrails
Methodology
We reviewed a sample of edits collected two weeks pre and post deployment of the change deployed on Dec 12th to present a single reference check in a side rail to people within visual editor on desktop.
Data was limited to main namespace edits completed on desktop by unregistered users or users with 100 or fewer edits, on all the wikis where Reference Check is deployed as default. See deployment status.
Data was collected from EditAttemptStep, VisualEditorFeatureUse and mediawiki_history. Note: For EditAttemptStep and VisualEditorFeatureUse, the logging of edit check events changed after the edit check was moved to the side rail. See instrumentation changes below:
Pre Change: event.feature = ‘editCheckReferences’ and event.action = ‘context-show’
Post Change: event.feature = ‘editCheckDialog’ OR event.action = ‘window-open-from-check’
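Because the logging changed with the deployment, a filter for "reference check was shown" has to accept either form of event. A minimal base R sketch of such a filter is below. The `events` data frame, its `feature` and `action` columns (flattened from event.feature and event.action) and the non-matching rows are made up for illustration; only the matching values come from the instrumentation notes above.

```r
# Hypothetical flattened event rows; "other" and "some-other-action" are placeholders.
events <- data.frame(
  feature = c("editCheckReferences", "editCheckReferences", "editCheckDialog", "other", "other"),
  action  = c("context-show", "some-other-action", "window-open-from-check",
              "window-open-from-check", "some-other-action"),
  stringsAsFactors = FALSE
)

# Pre-change logging: both conditions must hold
shown_pre <- events$feature == "editCheckReferences" & events$action == "context-show"

# Post-change logging: either condition is enough, as listed above
shown_post <- events$feature == "editCheckDialog" | events$action == "window-open-from-check"

# A row counts as "check shown" if it matches either logging scheme
events$check_shown <- shown_pre | shown_post
```

Combining both predicates lets a single query span the deployment date without dropping sessions logged under the older scheme.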
Number of editing sessions where reference check was shown
As this change did not increase the number of checks presented in a single session, we did not review the average number of checks presented in a single session. However, we did review the number of editing sessions where reference checks were presented to the users to confirm this change in edit check location did not change how frequently the edit check was activated.
Note: The query used to collect this data is resource intensive as it gathers all edit attempts, so we limited it to a sample of edit attempts completed from Dec 4 through Dec 19 (1 week pre and post the change) on a subset of wikis (gurwiki, fonwiki, gpewiki, hawiki, kgwiki, lnwiki, arwiki, afwiki, zhwiki, frwiki, itwiki, jawiki, ptwiki, eswiki, swwiki, viwiki, yowiki).
# data reformatting
edit_check_frequency_data$date <- as.Date(edit_check_frequency_data$date, format = "%Y-%m-%d")

# Set experience level group and factor levels
edit_check_frequency_data <- edit_check_frequency_data %>%
  mutate(
    experience_level_group = case_when(
      user_edit_count == 0 & user_status == 'registered' ~ 'Newcomer',
      user_edit_count == 0 & user_status == 'unregistered' ~ 'Unregistered',
      user_edit_count > 0 & user_edit_count <= 100 ~ "Junior Contributor",
      user_edit_count > 100 ~ "Non-Junior Contributor"
    ),
    experience_level_group = factor(
      experience_level_group,
      levels = c("Unregistered", "Newcomer", "Non-Junior Contributor", "Junior Contributor")
    )
  )

# add column to calculate pre and post dates
edit_check_frequency_data <- edit_check_frequency_data %>%
  mutate(
    pre_post = case_when(
      date >= '2024-12-04' & date <= '2024-12-11' ~ "pre",
      date >= '2024-12-12' & date <= '2024-12-19' ~ "post"
    ),
    pre_post = factor(pre_post, levels = c("pre", "post"))
  )

edit_session_num_overall <- edit_check_frequency_data %>%
  filter(was_edit_check_shown == 1) %>%
  group_by(pre_post) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_users = n_distinct(user_id)
  ) %>%
  gt() %>%
  tab_header(title = "Number of editing sessions where reference check was shown pre and post edit check change") %>%
  cols_label(
    pre_post = "Pre or Post Change",
    n_editing_session = "Number of Editing Sessions",
    n_users = "Number of Users"
  )

display_html(as_raw_html(edit_session_num_overall))
Number of editing sessions where reference check was shown pre and post edit check change
# plot daily editing sessions
textaes <- data.frame(
  y = 475,
  x = as.Date(c('2024-12-15')),
  lab = c("Reference check moved to side rail")
)

p <- edit_session_num_daily %>%
  ggplot(aes(x = date, y = n_editing_session)) +
  geom_line(linewidth = 1.5, color = 'steelblue2') +
  geom_vline(xintercept = as.Date('2024-12-12'), linetype = 'dashed', linewidth = 1) +
  # annotate() draws the arrow once rather than once per data row
  annotate(
    "segment",
    x = as.Date('2024-12-15'), y = 450, xend = as.Date('2024-12-12'), yend = 420,
    arrow = arrow(length = unit(0.8, "cm")), linewidth = 1, color = "black"
  ) +
  geom_text(mapping = aes(y = y, x = x, label = lab), data = textaes, inherit.aes = FALSE, size = 5) +
  scale_x_date(date_labels = "%b-%d", date_breaks = "1 week", minor_breaks = NULL) +
  scale_y_continuous(limits = c(0, 500)) +
  labs(
    title = "Daily number of editing sessions where reference check was shown",
    y = "Number of distinct editing sessions"
  ) +
  theme_bw() +
  scale_color_manual(values = c("#000099", "#666666"), name = "Final state") +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    plot.title = element_text(hjust = 0.5),
    text = element_text(size = 18),
    legend.position = "bottom",
    axis.text.x = element_text(hjust = 1),
    axis.line = element_line(colour = "black")
  )

p
There were no significant changes in the number of editing sessions where reference check was shown pre and post deployment of the change.
Edit Completion Rate
We reviewed the proportion of edits by newcomers, junior contributors, and unregistered users that were shown reference check during their edit session and successfully published their edit (event.action = saveSuccess). The analysis does not include all edits started; it is limited to edits that reached the save attempt step where reference check is shown, and measures how many of those were subsequently published.
# load data for assessing edit completion rate
edit_completion_rates_data <- read.csv(
  file = 'data/edit_completion_rate_data.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)

# data reformatting
edit_completion_rates_data$date <- as.Date(edit_completion_rates_data$date, format = "%Y-%m-%d")

# Set experience level group and factor levels
edit_completion_rates_data <- edit_completion_rates_data %>%
  mutate(
    experience_level_group = case_when(
      user_edit_count == 0 & user_status == 'registered' ~ 'Newcomer',
      user_edit_count == 0 & user_status == 'unregistered' ~ 'Unregistered',
      user_edit_count > 0 & user_edit_count <= 100 ~ "Junior Contributor",
      user_edit_count > 100 ~ "Non-Junior Contributor"
    ),
    experience_level_group = factor(
      experience_level_group,
      levels = c("Unregistered", "Newcomer", "Non-Junior Contributor", "Junior Contributor")
    )
  )

# add column to calculate pre and post dates
# Need to update these
edit_completion_rates_data <- edit_completion_rates_data %>%
  mutate(
    pre_post = case_when(
      date >= '2024-11-27' & date <= '2024-12-11' ~ "pre",
      date >= '2024-12-12' & date <= '2024-12-26' ~ "post"
    ),
    pre_post = factor(pre_post, levels = c("pre", "post"))
  )
Overall Edit Completion Rates
Includes Reverted Edits
edit_completion_rate_overall <- edit_completion_rates_data %>%
  group_by(pre_post) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Edit Completion Rate Pre and Post Change (Includes Reverted Edits)") %>%
  cols_label(
    pre_post = "Pre or Post Change",
    n_edits = "Number of editing sessions shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of reference check sessions saved"
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(edit_completion_rate_overall))

Edit Completion Rate Pre and Post Change (Includes Reverted Edits)

| Pre or Post Change | Number of editing sessions shown reference check | Number of published edits | Proportion of reference check sessions saved |
|---|---|---|---|
| pre | 15206 | 12158 | 80% |
| post | 13266 | 10674 | 80.5% |
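One way to check whether a pre/post difference in a rate like this is larger than chance variation is a two-sample test of proportions. The sketch below applies base R's `prop.test()` to the counts in the table above. This report does not state which test, if any, was used for its significance statements, so treat this as one possible approach under the assumption that editing sessions are independent.

```r
# Counts copied from the table above (includes reverted edits)
saved    <- c(pre = 12158, post = 10674)   # published edits
sessions <- c(pre = 15206, post = 13266)   # sessions shown reference check

# Two-sample test for equality of proportions (continuity correction by default)
completion_test <- prop.test(x = saved, n = sessions)

completion_test$estimate   # observed completion rates, pre then post
completion_test$p.value    # p-value for the null hypothesis of equal rates
completion_test$conf.int   # confidence interval for the difference (pre minus post)
```

Because this is a pre/post comparison rather than a controlled experiment, even a small p-value would not by itself attribute the change to the side rail.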
Excludes Reverted Edits

edit_completion_rate_overall <- edit_completion_rates_data %>%
  group_by(pre_post) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0 & was_reverted == 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Edit Completion Rate Pre and Post Change (Excludes Reverted Edits)") %>%
  cols_label(
    pre_post = "Pre or Post Change",
    n_edits = "Number of editing sessions shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of reference check sessions saved"
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(edit_completion_rate_overall))

Edit Completion Rate Pre and Post Change (Excludes Reverted Edits)
# plot daily edit completion rate
textaes <- data.frame(
  y = c(0.76),
  x = as.Date(c('2024-12-17')),
  lab = c("Reference check presented in side rail")
)

p <- edit_completion_rate_overall_daily %>%
  ggplot(aes(x = date, y = completion_rate)) +
  geom_line(linewidth = 1.5, color = '#0072B2') +
  geom_vline(xintercept = as.Date('2024-12-12'), linetype = 'dashed', linewidth = 1) +
  # annotate() draws the arrow once rather than once per data row
  annotate(
    "segment",
    x = as.Date('2024-12-15'), y = 0.75, xend = as.Date('2024-12-12'), yend = 0.66,
    arrow = arrow(length = unit(0.7, "cm")), linewidth = 1, color = "black"
  ) +
  geom_text(mapping = aes(y = y, x = x, label = lab), data = textaes, inherit.aes = FALSE, size = 5) +
  scale_y_continuous(labels = scales::percent, limits = c(0.4, 0.8)) +
  scale_x_date(date_labels = "%b-%d", date_breaks = "1 week", minor_breaks = NULL) +
  labs(
    title = "Daily edit completion rate for sessions shown Reference Check",
    y = "Proportion of editing sessions",
    caption = "excludes edits reverted within 48 hours"
  ) +
  theme_bw() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    plot.title = element_text(hjust = 0.5),
    text = element_text(size = 18),
    legend.position = "bottom",
    axis.text.x = element_text(hjust = 1),
    axis.line = element_line(colour = "black")
  )

p
By User Edit Count
edit_completion_rate_editcount <- edit_completion_rates_data %>%
  filter(experience_level_group != 'Non-Junior Contributor') %>%
  group_by(experience_level_group, pre_post) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0 & was_reverted == 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), '%')) %>%
  gt() %>%
  tab_header(title = "Edit Completion Rate By Editor Experience") %>%
  cols_label(
    experience_level_group = "Experience Level",
    pre_post = "Pre or Post Change",
    n_edits = "Number of editing sessions shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of reference check sessions saved"
  ) %>%
  opt_stylize(style = 2) %>%
  tab_footnote(
    footnote = "Excludes edits reverted within 48 hours",
    locations = cells_column_labels(columns = "n_saves")
  )

display_html(as_raw_html(edit_completion_rate_editcount))
Edit Completion Rate By Editor Experience

| Experience Level | Pre or Post Change | Number of editing sessions shown reference check | Number of published edits [1] | Proportion of reference check sessions saved |
|---|---|---|---|---|
| Unregistered | pre | 8568 | 4905 | 57.2% |
| Unregistered | post | 7243 | 4299 | 59.4% |
| Newcomer | pre | 1742 | 1152 | 66.1% |
| Newcomer | post | 1469 | 995 | 67.7% |
| Junior Contributor | pre | 4896 | 3974 | 81.2% |
| Junior Contributor | post | 4554 | 3742 | 82.2% |

[1] Excludes edits reverted within 48 hours
By Wiki
edit_completion_rate_wiki <- edit_completion_rates_data %>%
  group_by(wiki, pre_post) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), '%')) %>%
  filter(n_saves > 500) %>% ## limited to wikis with over 500 saved edits
  gt() %>%
  tab_header(title = "Edit Completion Rate By Wiki") %>%
  cols_label(
    wiki = "Wiki",
    pre_post = "Pre or Post Change",
    n_edits = "Number of editing sessions shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of reference check sessions saved"
  ) %>%
  tab_footnote(
    footnote = "Limited to wikis with over 500 saved edits where edit check was shown",
    locations = cells_column_labels(columns = 'n_edits')
  ) %>%
  opt_stylize(style = 2) %>%
  tab_footnote(
    footnote = "Excludes edits reverted within 48 hours",
    locations = cells_column_labels(columns = "n_saves")
  )

display_html(as_raw_html(edit_completion_rate_wiki))

Edit Completion Rate By Wiki

| Wiki | Pre or Post Change | Number of editing sessions shown reference check [1] | Number of published edits [2] | Proportion of reference check sessions saved |
|---|---|---|---|---|
| eswiki | pre | 1123 | 930 | 82.8% |
| eswiki | post | 779 | 660 | 84.7% |
| frwiki | pre | 1921 | 1700 | 88.5% |
| frwiki | post | 1748 | 1531 | 87.6% |
| itwiki | pre | 1270 | 1020 | 80.3% |
| itwiki | post | 1095 | 937 | 85.6% |
| nlwiki | pre | 966 | 621 | 64.3% |
| nlwiki | post | 690 | 506 | 73.3% |
| ruwiki | pre | 1546 | 1236 | 79.9% |
| ruwiki | post | 1447 | 1168 | 80.7% |

[1] Limited to wikis with over 500 saved edits where edit check was shown
[2] Excludes edits reverted within 48 hours
Key Findings:
Overall, there were no changes in edit completion rate with the inclusion of edits that were reverted. 80% of edits where reference check was presented were successfully saved pre and post the move of edit checks to the side rail.
If we exclude reverted edits, there was a slight increase (3.2% increase [2 percentage points]) in edits completed. 68% of edits where reference check was presented were successfully saved and not reverted following the move of the check to the side rail, compared to 66% prior to the change.
We also observed similar increases in edit completion rate (excluding reverted edits) across all experience level groups (newcomers, junior contributors, and unregistered users) and wikis.
False Positive Rate
The false positive rate is the proportion of edits where reference check was shown and the contributor declined to add a citation by explicitly indicating that the information they were adding does not violate the specified policy.
We identified these edits in the data by reviewing published edits with the decline-irrelevant tag. We also reviewed the proportion of edits that explicitly gave a reason for declining the reference check, to identify changes in the overall decline rate.
Note: this metric relies on users explicitly selecting an option. It does not account for instances where the reference check was shown in error and the user did not select one of the provided options for declining the check.
See available edit tags for documentation of decline options.
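As an illustration of the metric definition above, here is a minimal base R sketch that computes a false positive rate per period from per-edit flags. The column names mirror those used in the report's code (`is_edit_check_activated`, `decline_irrelevant`, `pre_post`), but the data frame itself is made-up toy data, not results from the analysis.

```r
# Toy data: one row per published edit (values are invented for illustration)
toy_edits <- data.frame(
  revision_id             = 1:8,
  pre_post                = c("pre", "pre", "pre", "pre", "post", "post", "post", "post"),
  is_edit_check_activated = c(1, 1, 1, 0, 1, 1, 1, 1),
  decline_irrelevant      = c(1, 0, 0, 0, 0, 1, 0, 0)
)

# Keep only edits where reference check was shown
shown <- toy_edits[toy_edits$is_edit_check_activated == 1, ]

# Share of shown edits tagged as "irrelevant" declines, per period
fp_rate <- tapply(shown$decline_irrelevant == 1, shown$pre_post, mean)
print(round(fp_rate * 100, 1))
```

The denominator is edits where the check was shown, not all published edits, which matches the filter applied in the report's tables.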
Data Gathering and Processing
Show the code
# load published edit tag data
published_edit_check_data <- read.csv(
  file = 'data/published_edit_check_data.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)
Show the code
# data reformatting
published_edit_check_data$date <- as.Date(published_edit_check_data$date, format = "%Y-%m-%d")

# Set experience level group and factor levels
published_edit_check_data <- published_edit_check_data %>%
  mutate(
    experience_level_group = case_when(
      is.na(user_edit_count) ~ 'Unregistered',
      user_edit_count == 1 ~ 'Newcomer',
      user_edit_count > 1 & user_edit_count <= 100 ~ "Junior Contributor",
      user_edit_count > 100 ~ "Non-Junior Contributor"
    ),
    experience_level_group = factor(
      experience_level_group,
      levels = c("Unregistered", "Newcomer", "Non-Junior Contributor", "Junior Contributor")
    )
  )

# rename is mobile edit
published_edit_check_data <- published_edit_check_data %>%
  mutate(platform = case_when(
    is_mobile_edit == 1 ~ 'phone',
    is_mobile_edit == 0 ~ 'desktop'
  ))

# add column to calculate pre and post dates
# (15 days on each side of the 2024-12-12 deployment)
published_edit_check_data <- published_edit_check_data %>%
  mutate(
    pre_post = case_when(
      date >= '2024-11-27' & date <= '2024-12-11' ~ "pre",
      date >= '2024-12-12' & date <= '2024-12-26' ~ "post"
    ),
    pre_post = factor(pre_post, levels = c("pre", "post"))
  )
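The pre/post labelling above hard-codes the window boundaries. As a sketch only, the same windows can be derived from the deployment date, which makes the window length explicit. `assign_period` and its `window_days` parameter are hypothetical names introduced here; the default of 15 days reproduces the date ranges used in the code above.

```r
# Label dates as "pre" or "post" relative to a deployment date.
# `assign_period` is an illustrative helper, not part of the original analysis.
assign_period <- function(date, deploy = as.Date("2024-12-12"), window_days = 15) {
  offset <- as.integer(as.Date(date) - deploy)  # days relative to deployment
  period <- ifelse(
    offset >= -window_days & offset < 0, "pre",
    ifelse(offset >= 0 & offset < window_days, "post", NA_character_)
  )
  factor(period, levels = c("pre", "post"))
}

# Boundary dates: the day before each window, the edges, and the day after
example_dates <- c("2024-11-26", "2024-11-27", "2024-12-11",
                   "2024-12-12", "2024-12-26", "2024-12-27")
example_periods <- assign_period(example_dates)
print(data.frame(date = example_dates, period = example_periods))
```

Dates outside both windows come back as `NA`, the same behaviour as the `case_when()` call when no condition matches.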
Overall Declines
Show the code
edit_check_decline_overall <- published_edit_check_data %>%
  filter(is_edit_check_activated == 1) %>% # only edits where shown
  group_by(pre_post) %>%
  summarise(
    n_edits = n_distinct(revision_id),
    n_edits_decline = n_distinct(revision_id[decline_other == 1 | decline_common_knowledge == 1 | decline_irrelevant == 1 | decline_uncertain == 1])
  ) %>%
  mutate(prop_users = paste0(round(n_edits_decline / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Overall Edit Check Declines") %>%
  cols_label(
    pre_post = "Pre or Post Change",
    n_edits = "Number of published edits where reference check was shown",
    n_edits_decline = "Number of published edits that selected decline option",
    prop_users = "Proportion of edits that declined reference check"
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(edit_check_decline_overall))
Overall Edit Check Declines
Pre or Post Change
Number of published edits where reference check was shown
Number of published edits that selected decline option
# plot daily overall decline rate
textaes <- data.frame(
  y = c(0.66),
  x = as.Date(c('2024-12-16')),
  lab = c("Reference check moved to side rail")
)

p <- edit_check_decline_overall_daily %>%
  ggplot(aes(x = date, y = prop_edits)) +
  geom_line(size = 1.5, color = '#0072B2') +
  geom_vline(xintercept = as.Date('2024-12-12'), linetype = 'dashed', size = 1) +
  geom_segment(
    aes(x = as.Date(c('2024-12-15')), y = 0.65, xend = as.Date('2024-12-12'), yend = 0.55),
    arrow = arrow(length = unit(0.8, "cm")), size = 1, color = "black"
  ) +
  geom_text(mapping = aes(y = y, x = x, label = lab), data = textaes, inherit.aes = FALSE, size = 5) +
  scale_y_continuous(labels = scales::percent, limits = c(0.3, 0.7)) +
  scale_x_date(date_labels = "%b-%d", date_breaks = "1 week", minor_breaks = NULL) +
  labs(
    title = "Daily overall decline rate for sessions shown Reference Check",
    y = "Proportion of reference checks declined",
    caption = "includes all edits where a user added an explicit reason for declining reference check"
  ) +
  theme_bw() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    plot.title = element_text(hjust = 0.5),
    text = element_text(size = 18),
    legend.position = "bottom",
    axis.text.x = element_text(hjust = 1),
    axis.line = element_line(colour = "black")
  )

p
Warning message in geom_segment(aes(x = as.Date(c("2024-12-15")), y = 0.65, xend = as.Date("2024-12-12"), :
“All aesthetics have length 1, but the data has 29 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.”
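The warning above arises because `geom_segment()` is given constant aesthetics while inheriting a multi-row data frame, so the arrow is drawn once per row. A minimal sketch of the `annotate()` approach that the warning recommends is shown below. It assumes ggplot2 3.4 or later (for the `linewidth` argument), and the `daily_toy` data frame is invented stand-in data, not the report's daily series.

```r
library(ggplot2)

# Invented stand-in for a daily rate series (not data from the report)
toy_dates <- seq(as.Date("2024-11-27"), as.Date("2024-12-26"), by = "day")
daily_toy <- data.frame(
  date = toy_dates,
  prop_edits = 0.5 + 0.02 * sin(seq_along(toy_dates))
)

p_annotated <- ggplot(daily_toy, aes(x = date, y = prop_edits)) +
  geom_line(linewidth = 1.5, color = "#0072B2") +
  geom_vline(xintercept = as.Date("2024-12-12"), linetype = "dashed") +
  # annotate() draws a single arrow and label, independent of the plot data
  annotate(
    "segment",
    x = as.Date("2024-12-15"), y = 0.65,
    xend = as.Date("2024-12-12"), yend = 0.55,
    arrow = arrow(length = unit(0.8, "cm")), linewidth = 1
  ) +
  annotate(
    "text",
    x = as.Date("2024-12-16"), y = 0.66,
    label = "Reference check moved to side rail", size = 5
  ) +
  scale_y_continuous(labels = scales::percent, limits = c(0.3, 0.7)) +
  theme_bw()
```

Because `annotate()` takes its values as fixed parameters, no separate one-row data frame is needed for the label text either.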
Show the code
# overall by type
edit_check_decline_overall_bytype <- published_edit_check_data %>%
  filter(is_edit_check_activated == 1) %>% # only edits where shown
  group_by(pre_post) %>%
  summarise(
    n_edits = n_distinct(revision_id),
    decline_uncertain = n_distinct(revision_id[decline_uncertain == 1]),
    decline_other = n_distinct(revision_id[decline_other == 1]),
    decline_common_knowledge = n_distinct(revision_id[decline_common_knowledge == 1]),
    decline_irrelevant = n_distinct(revision_id[decline_irrelevant == 1])
  ) %>%
  pivot_longer(cols = contains('decline'), names_to = "decline_reason", values_to = "n_decline_edits") %>%
  mutate(prop_users = paste0(round(n_decline_edits / n_edits * 100, 1), "%")) %>%
  select(-c(2, 4)) %>%
  group_by(decline_reason) %>%
  gt() %>%
  tab_header(title = "Overall Edit Check Declines By Decline Reason") %>%
  cols_label(
    pre_post = "Pre or Post Change",
    decline_reason = "Decline Reason",
    prop_users = "Proportion of edits that declined reference check"
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(edit_check_decline_overall_bytype))
colorfriendly <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

p <- edit_check_decline_userexp %>%
  ggplot(aes(x = decline_reason, y = prop_edits, fill = decline_reason)) +
  geom_col(position = 'dodge') +
  facet_grid(vars(pre_post), vars(experience_level_group)) +
  scale_y_continuous(labels = scales::percent) +
  geom_text(aes(label = paste0(prop_edits * 100, "%"), fontface = 2), vjust = 1.2, size = 8, color = "white") +
  scale_fill_manual(values = colorfriendly) +
  labs(
    y = "Percent of reference checks declined ",
    x = "Decline citation reason",
    title = "Proportion of edits where reference check was shown \n and declined by editor experience"
  ) +
  theme(
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    plot.title = element_text(hjust = 0.5),
    text = element_text(size = 18),
    axis.title.x = element_blank(),
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    legend.position = "bottom",
    axis.line = element_line(colour = "black")
  )

p
By Wiki
Show the code
false_positive_rate_wiki <- published_edit_check_data %>%
  filter(user_status == 'registered', is_edit_check_activated == 1) %>% # only edits where shown
  group_by(wiki, pre_post) %>%
  summarise(
    n_edits = n_distinct(revision_id),
    n_false_positive = n_distinct(revision_id[decline_irrelevant > 0])
  ) %>%
  mutate(false_positive_rate = paste0(round(n_false_positive / n_edits * 100, 1), "%")) %>%
  filter(n_edits > 500) %>%
  select(-c(3, 4)) %>%
  gt() %>%
  tab_header(title = "Reference Check False Positive Rate by Wiki") %>%
  cols_label(
    wiki = "Wiki",
    pre_post = "Pre or Post Change",
    false_positive_rate = "False positive rate"
  ) %>%
  tab_footnote(
    footnote = "Limited to wikis with over 500 edits; Determined in the data by reviewing published edits with `decline-irrelevant` tag",
    locations = cells_column_labels(columns = 'false_positive_rate')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(false_positive_rate_wiki))
Reference Check False Positive Rate by Wiki
| Wiki | Pre or Post Change | False positive rate¹ |
|---|---|---|
| eswiki | pre | 9.7% |
| eswiki | post | 13.1% |
| frwiki | pre | 3% |
| frwiki | post | 2.8% |
| ruwiki | pre | 3.9% |
| ruwiki | post | 6.3% |
¹ Limited to wikis with over 500 edits; Determined in the data by reviewing published edits with `decline-irrelevant` tag
Key Findings:
Overall declines of reference check decreased by 4.7% (2.5 percentage points): 51.2% of edit attempts included an explicit reason for declining a reference check after the change, compared to 53.7% before the change.
There was a 6.4% increase (0.5 percentage points, 7.8% pre to 8.3% post) in the proportion of edit attempts that indicated that the reference check presented was irrelevant; however, the rate of all other types of declines decreased.
There were no significant changes in decline rates by editor experience level.
Results vary by wiki. Spanish Wikipedia saw the largest increase in false positive rate (35% increase, from 9.7% pre change to 13.1% post change).
It is important to note that this only considers users who explicitly selected a decline reason, not users who dismissed reference check without providing a reason.
New Content Edit Revert Rate
Proportion of all new content edits (defined by editcheck-newcontent tag) where reference check was shown that were reverted within 48 hours of being published.
Show the code
revert_rate_overall <- published_edit_check_data %>%
  filter(is_edit_check_activated == 1, is_new_content == 1) %>%
  group_by(pre_post) %>%
  summarise(
    n_edits = n_distinct(revision_id),
    n_reverts = n_distinct(revision_id[was_reverted > 0])
  ) %>%
  mutate(revert_rate = paste0(round(n_reverts / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "New content edit revert rate") %>%
  cols_label(
    pre_post = "Pre or Post Change",
    n_edits = "Number of published edits",
    n_reverts = "Number of edits reverted",
    revert_rate = "Revert Rate"
  ) %>%
  tab_footnote(
    footnote = "Limited to edits where edit check was shown and that were reverted within 48 hours",
    locations = cells_column_labels(columns = 'revert_rate')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(revert_rate_overall))
New content edit revert rate
| Pre or Post Change | Number of published edits | Number of edits reverted | Revert Rate¹ |
|---|---|---|---|
| pre | 13749 | 2807 | 20.4% |
| post | 11412 | 1965 | 17.2% |
¹ Limited to edits where edit check was shown and that were reverted within 48 hours
# plot daily revert rate
textaes <- data.frame(
  y = c(0.31),
  x = as.Date(c('2024-12-16')),
  lab = c("Reference check moves to side rail")
)

p <- revert_rate_overall_daily %>%
  ggplot(aes(x = date, y = n_reverts / n_edits)) +
  geom_line(size = 1.5, color = '#0072B2') +
  geom_vline(xintercept = as.Date('2024-12-12'), linetype = 'dashed', size = 1) +
  geom_segment(
    aes(x = as.Date(c('2024-12-15')), y = 0.3, xend = as.Date('2024-12-12'), yend = 0.2),
    arrow = arrow(length = unit(0.8, "cm")), size = 1, color = "black"
  ) +
  geom_text(mapping = aes(y = y, x = x, label = lab), data = textaes, inherit.aes = FALSE, size = 5) +
  scale_y_continuous(labels = scales::percent, limits = c(0, 0.4)) +
  scale_x_date(date_labels = "%b-%d", date_breaks = "1 week", minor_breaks = NULL) +
  labs(
    title = "Daily revert rate for new content edits shown Reference Check",
    y = "Proportion of edits reverted"
  ) +
  theme_bw() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    plot.title = element_text(hjust = 0.5),
    text = element_text(size = 18),
    legend.position = "bottom",
    axis.text.x = element_text(hjust = 1),
    axis.line = element_line(colour = "black")
  )

p
Warning message in geom_segment(aes(x = as.Date(c("2024-12-15")), y = 0.3, xend = as.Date("2024-12-12"), :
“All aesthetics have length 1, but the data has 29 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.”
By Editor Experience Group
Show the code
# Note: unlike the overall table, this filter does not include is_new_content == 1,
# so the counts cover all published edits where the check was shown
revert_rate_editorexp <- published_edit_check_data %>%
  filter(is_edit_check_activated == 1) %>%
  group_by(experience_level_group, pre_post) %>%
  summarise(
    n_edits = n_distinct(revision_id),
    n_reverts = n_distinct(revision_id[was_reverted > 0])
  ) %>%
  mutate(revert_rate = paste0(round(n_reverts / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "New content edit revert rate by editor experience") %>%
  cols_label(
    experience_level_group = "Experience level group",
    pre_post = "Pre or Post Change",
    n_edits = "Number of published edits",
    n_reverts = "Number of edits reverted",
    revert_rate = "Revert Rate"
  ) %>%
  tab_footnote(
    footnote = "Limited to edits where reference check was shown and that were reverted within 48 hours",
    locations = cells_column_labels(columns = 'revert_rate')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(revert_rate_editorexp))
New content edit revert rate by editor experience
| Experience level group | Pre or Post Change | Number of published edits | Number of edits reverted | Revert Rate¹ |
|---|---|---|---|---|
| Unregistered | pre | 7803 | 2017 | 25.8% |
| Unregistered | post | 6296 | 1455 | 23.1% |
| Newcomer | pre | 1672 | 365 | 21.8% |
| Newcomer | post | 1298 | 231 | 17.8% |
| Junior Contributor | pre | 6755 | 751 | 11.1% |
| Junior Contributor | post | 6030 | 563 | 9.3% |
¹ Limited to edits where reference check was shown and that were reverted within 48 hours
By Wiki
Show the code
# Note: this filter does not include is_new_content == 1,
# so the rates cover all published edits where the check was shown
revert_rate_wiki <- published_edit_check_data %>%
  filter(is_edit_check_activated == 1) %>%
  group_by(wiki, pre_post) %>%
  summarise(
    n_edits = n_distinct(revision_id),
    n_reverts = n_distinct(revision_id[was_reverted > 0])
  ) %>%
  mutate(revert_rate = paste0(round(n_reverts / n_edits * 100, 1), "%")) %>%
  filter(n_edits > 500) %>%
  select(-c(3, 4)) %>%
  gt() %>%
  tab_header(title = "New content edit revert rate by wiki") %>%
  cols_label(
    wiki = "Wiki",
    pre_post = "Pre or Post Change",
    revert_rate = "Revert Rate"
  ) %>%
  tab_footnote(
    footnote = "Limited to wikis with more than 500 published edits \n and where reference check was available as default during reviewed timeframe",
    locations = cells_column_labels(columns = 'revert_rate')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(revert_rate_wiki))
New content edit revert rate by wiki
| Wiki | Pre or Post Change | Revert Rate¹ |
|---|---|---|
| eswiki | pre | 18.1% |
| eswiki | post | 20.1% |
| frwiki | pre | 20.2% |
| frwiki | post | 18.5% |
| itwiki | pre | 19.8% |
| itwiki | post | 19.2% |
| jawiki | pre | 7% |
| jawiki | post | 6.6% |
| nlwiki | pre | 31.4% |
| nlwiki | post | 17.6% |
| ruwiki | pre | 18.4% |
| ruwiki | post | 18.5% |
¹ Limited to wikis with more than 500 published edits and where reference check was available as default during reviewed timeframe
Key Findings:
Of the metrics reviewed, revert rate showed the largest change before and after the move of reference check to the side rail.
The revert rate of new content edits where reference check was presented decreased by 15.7% (3.2 percentage points, from 20.4% pre to 17.2% post change).
We observed revert rate decreases across all experience level groups (unregistered, newcomer, and junior contributors) and across most Wikipedias. The exceptions were Spanish Wikipedia, which had a 2 percentage point increase in revert rate, and Russian Wikipedia, which was essentially unchanged (18.4% pre to 18.5% post).
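The report does not include a formal significance test for the revert rate change. As a hedged sketch only, a two-sample proportion test on the counts from the overall revert rate table could be run with base R's `prop.test()`. This is not the authors' method, and it treats edits as independent observations, which is unlikely to hold exactly because one user can publish many edits. Since this is a pre/post comparison rather than a controlled experiment, any result describes the size of the difference, not its cause.

```r
# Counts taken from the "New content edit revert rate" table
reverted <- c(pre = 2807, post = 1965)
published <- c(pre = 13749, post = 11412)

# Two-sample test for equality of proportions (base R stats package)
revert_test <- prop.test(x = reverted, n = published)

# Observed revert rates, and a confidence interval for the pre - post difference
revert_rates <- reverted / published
print(round(revert_rates * 100, 1))
print(revert_test$conf.int)
```

A per-user aggregation or a model with a user-level random effect would be more defensible if the edit-level data are available, at the cost of more setup.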
Total distinct users that included a reference after being shown reference check
Total number of distinct users (limited to registered users, as we don’t track distinct unregistered users) that included a new reference with their new content edit after being shown reference check.
Overall
Show the code
num_users_change <- published_edit_check_data %>%
  filter(
    is_edit_check_activated == 1,
    user_status == 'registered', # only track unique registered users
    is_new_content == 1
  ) %>%
  group_by(pre_post) %>%
  summarise(n_users = n_distinct(user_id[includes_policy_change == 1 & was_reverted == 0])) %>%
  gt() %>%
  tab_header(title = "Number of registered users that included a new reference after being shown reference check") %>%
  cols_label(
    pre_post = "Pre or Post Change",
    n_users = "Number of registered users"
  ) %>%
  tab_footnote(
    footnote = "Limited to registered users with 100 or fewer edits",
    locations = cells_column_labels(columns = 'n_users')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(num_users_change))
Number of registered users that included a new reference after being shown reference check
| Pre or Post Change | Number of registered users¹ |
|---|---|
| pre | 2245 |
| post | 2079 |
¹ Limited to registered users with 100 or fewer edits
# plot daily number of users that included a reference
textaes <- data.frame(
  y = c(255),
  x = as.Date(c('2024-12-16')),
  lab = c("Reference check moves to side rail")
)

p <- num_users_change_daily %>%
  ggplot(aes(x = date, y = n_users)) +
  geom_line(size = 1.5, color = '#0072B2') +
  geom_vline(xintercept = as.Date('2024-12-12'), linetype = 'dashed', size = 1) +
  geom_segment(
    aes(x = as.Date(c('2024-12-15')), y = 250, xend = as.Date('2024-12-12'), yend = 210),
    arrow = arrow(length = unit(0.8, "cm")), size = 1, color = "black"
  ) +
  geom_text(mapping = aes(y = y, x = x, label = lab), data = textaes, inherit.aes = FALSE, size = 5) +
  scale_y_continuous(limits = c(50, 300)) +
  scale_x_date(date_labels = "%b-%d", date_breaks = "1 week", minor_breaks = NULL) +
  labs(
    title = "Daily number of users that included a reference after being shown reference check",
    y = "Number of users"
  ) +
  theme_bw() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    plot.title = element_text(hjust = 0.5),
    text = element_text(size = 18),
    legend.position = "bottom",
    axis.text.x = element_text(hjust = 1),
    axis.line = element_line(colour = "black")
  )

p
Warning message in geom_segment(aes(x = as.Date(c("2024-12-15")), y = 250, xend = as.Date("2024-12-12"), :
“All aesthetics have length 1, but the data has 29 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
a single row.”
By Editor Experience
Show the code
num_users_change_userexp <- published_edit_check_data %>%
  filter(
    is_edit_check_activated == 1,
    user_status == 'registered',
    is_new_content == 1
  ) %>%
  group_by(experience_level_group, pre_post) %>%
  summarise(n_users = n_distinct(user_id[includes_policy_change == 1 & was_reverted == 0])) %>%
  gt() %>%
  tab_header(title = "Number of users by editor experience \n that included a reference after being shown reference check") %>%
  cols_label(
    experience_level_group = "Editor experience",
    pre_post = "Pre or Post Change",
    n_users = "Number of registered users"
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(num_users_change_userexp))
Number of users by editor experience that included a reference after being shown reference check
| Editor experience | Pre or Post Change | Number of registered users |
|---|---|---|
| Newcomer | pre | 597 |
| Newcomer | post | 569 |
| Junior Contributor | pre | 1714 |
| Junior Contributor | post | 1580 |
By Wiki
Show the code
# Note: unlike the overall and by-experience tables, this filter does not
# include is_edit_check_activated == 1
num_users_change_wiki <- published_edit_check_data %>%
  filter(
    user_status == 'registered', # only track unique registered users
    is_new_content == 1
  ) %>%
  group_by(wiki, pre_post) %>%
  summarise(n_users = n_distinct(user_id[includes_policy_change == 1 & was_reverted == 0])) %>%
  filter(n_users > 150) %>%
  gt() %>%
  tab_header(title = "Number of users by wiki \n that included a reference after being shown reference check") %>%
  cols_label(
    wiki = "Wiki",
    pre_post = "Pre or Post Change",
    n_users = "Number of registered users"
  ) %>%
  tab_footnote(
    # footnote threshold aligned with the n_users > 150 filter above
    footnote = "Limited to wikis with more than 150 users \n and where edit check was available as default during reviewed timeframe",
    locations = cells_column_labels(columns = 'pre_post')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(num_users_change_wiki))
Number of users by wiki that included a reference after being shown reference check
| Wiki | Pre or Post Change¹ | Number of registered users |
|---|---|---|
| eswiki | pre | 407 |
| eswiki | post | 292 |
| frwiki | pre | 578 |
| frwiki | post | 537 |
| ptwiki | pre | 238 |
| ptwiki | post | 193 |
| ruwiki | pre | 230 |
| ruwiki | post | 206 |
¹ Limited to wikis with more than 150 users (the threshold applied in the code) and where edit check was available as default during reviewed timeframe
Key Findings:
Overall, there was a slight decrease in the absolute number of users that added a reference after being shown reference check (166 fewer users across all wikis); however, there were no significant changes around the date edit check was moved to the side rail.
Additionally, the lower number of users is likely to be affected by seasonal trends and changes in editing activity around the December holidays. We will also review the proportion of edits that included a reference to provide more insight into any changes in how often users respond to the edit checks presented.
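To put the drop in absolute users in context, the following sketch compares it with the drop in eligible editing volume over the same windows. The user counts come from the overall table above. The edit counts are the sums of the Newcomer and Junior Contributor rows of the by-experience new content edit table later in this report (1645 + 5198 pre, 1278 + 4583 post), used here as an approximation of registered-user volume. `pct_change` is a hypothetical helper name.

```r
# Percent change from pre to post; illustrative helper, not from the report
pct_change <- function(pre, post) (post - pre) / pre * 100

# Registered users who added a reference (overall table)
users_change <- pct_change(pre = 2245, post = 2079)

# New content edits by registered users where the check was shown
# (Newcomer + Junior Contributor rows of the by-experience table)
edits_change <- pct_change(pre = 1645 + 5198, post = 1278 + 4583)

print(round(c(users = users_change, edits = edits_change), 1))
```

If eligible editing volume falls by more than the number of users adding a reference, the absolute decline is consistent with lower seasonal activity rather than reduced engagement, which is why the proportion-based metric is the more informative comparison.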
Proportion of edits that included a reference after being shown reference check
Proportion of all published new content edits where reference check was shown and the editor added a new reference.
Overall
Show the code
prop_w_change_overall <- published_edit_check_data %>%
  filter(is_edit_check_activated == 1, is_new_content == 1) %>% # limit to only new content edits
  group_by(pre_post) %>%
  summarise(
    n_content_edits = n_distinct(revision_id),
    n_edits_w_change = n_distinct(revision_id[includes_policy_change == 1 & was_reverted == 0])
  ) %>%
  mutate(activation_rate = paste0(round(n_edits_w_change / n_content_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Proportion of new content edits that included a new reference") %>%
  cols_label(
    pre_post = "Pre or Post Change",
    n_content_edits = "Number of new content edits",
    n_edits_w_change = "Number of new content edits with a reference",
    activation_rate = "Proportion of new content edits with a reference"
  ) %>%
  tab_footnote(
    footnote = "Limited to new content edits where reference check was shown",
    locations = cells_column_labels(columns = 'n_content_edits')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(prop_w_change_overall))
Proportion of new content edits that included a new reference
| Pre or Post Change | Number of new content edits¹ | Number of new content edits with a reference | Proportion of new content edits with a reference |
|---|---|---|---|
| pre | 13749 | 4790 | 34.8% |
| post | 11412 | 4327 | 37.9% |
¹ Limited to new content edits where reference check was shown
By Editor Experience
Show the code
prop_w_change_userexp <- published_edit_check_data %>%
  filter(is_edit_check_activated == 1, is_new_content == 1) %>% # limit to only new content edits
  group_by(experience_level_group, pre_post) %>%
  summarise(
    n_content_edits = n_distinct(revision_id),
    n_edits_w_change = n_distinct(revision_id[includes_policy_change == 1 & was_reverted == 0])
  ) %>%
  mutate(activation_rate = paste0(round(n_edits_w_change / n_content_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Proportion of new content edits with a change to address policy violation by editor experience") %>%
  cols_label(
    experience_level_group = "Editor experience",
    pre_post = "Pre or Post Change",
    n_content_edits = "Number of new content edits",
    n_edits_w_change = "Number of new content edits that include a new reference",
    activation_rate = "Proportion of new content edits that include a new reference"
  ) %>%
  tab_footnote(
    footnote = "Limited to new content edits where edit check was shown",
    locations = cells_column_labels(columns = 'n_content_edits')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(prop_w_change_userexp))
Proportion of new content edits with a change to address policy violation by editor experience
| Editor experience | Pre or Post Change | Number of new content edits¹ | Number of new content edits that include a new reference | Proportion of new content edits that include a new reference |
|---|---|---|---|---|
| Unregistered | pre | 6906 | 1898 | 27.5% |
| Unregistered | post | 5551 | 1645 | 29.6% |
| Newcomer | pre | 1645 | 597 | 36.3% |
| Newcomer | post | 1278 | 569 | 44.5% |
| Junior Contributor | pre | 5198 | 2295 | 44.2% |
| Junior Contributor | post | 4583 | 2113 | 46.1% |
¹ Limited to new content edits where edit check was shown
By Wiki
Show the code
prop_w_change_wiki <- published_edit_check_data %>%
  filter(is_edit_check_activated == 1, is_new_content == 1) %>% # limit to only new content edits
  group_by(wiki, pre_post) %>%
  summarise(
    n_content_edits = n_distinct(revision_id),
    n_edits_w_change = n_distinct(revision_id[includes_policy_change == 1 & was_reverted == 0])
  ) %>%
  mutate(activation_rate = paste0(round(n_edits_w_change / n_content_edits * 100, 1), "%")) %>%
  filter(n_content_edits > 500) %>%
  gt() %>%
  tab_header(title = "Proportion of new content edits with a new reference") %>%
  cols_label(
    wiki = "Wiki",
    pre_post = "Pre or Post Change",
    n_content_edits = "Number of new content edits",
    n_edits_w_change = "Number of new content edits with a new reference",
    activation_rate = "Proportion of new content edits with a new reference"
  ) %>%
  tab_footnote(
    footnote = "Limited to wikis with over 500 published edits and to new content edits where edit check was shown",
    locations = cells_column_labels(columns = 'n_content_edits')
  ) %>%
  opt_stylize(style = 2)

display_html(as_raw_html(prop_w_change_wiki))
Proportion of new content edits with a new reference
| Wiki | Pre or Post Change | Number of new content edits¹ | Number of new content edits with a new reference | Proportion of new content edits with a new reference |
|---|---|---|---|---|
| eswiki | pre | 1049 | 418 | 39.8% |
| eswiki | post | 696 | 264 | 37.9% |
| frwiki | pre | 2068 | 749 | 36.2% |
| frwiki | post | 1783 | 710 | 39.8% |
| itwiki | pre | 1139 | 310 | 27.2% |
| itwiki | post | 956 | 291 | 30.4% |
| nlwiki | pre | 698 | 200 | 28.7% |
| nlwiki | post | 544 | 195 | 35.8% |
| ruwiki | pre | 1408 | 476 | 33.8% |
| ruwiki | post | 1211 | 406 | 33.5% |
¹ Limited to wikis with over 500 published edits and to new content edits where edit check was shown
Key Findings:
There was a roughly 9% increase (3.1 percentage points) in the proportion of new content edits that included a reference following the change (34.8% pre change to 37.9% post change).
Increases were observed across all editor experience levels and most wikis.
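The by-experience breakdown can be cross-checked against the overall figures. The sketch below re-enters the counts from the editor experience table by hand, recomputes each group's rate and percentage-point change, and confirms that the group counts sum to the overall totals (13749 pre and 11412 post new content edits). The data frame name `ref_counts` and its column names are illustrative, not from the original analysis.

```r
# Counts copied from the by-editor-experience table in this section
ref_counts <- data.frame(
  group    = c("Unregistered", "Newcomer", "Junior Contributor"),
  pre_n    = c(6906, 1645, 5198),  # new content edits shown the check, pre
  pre_ref  = c(1898, 597, 2295),   # of those, edits that added a reference
  post_n   = c(5551, 1278, 4583),
  post_ref = c(1645, 569, 2113)
)

# Recompute rates and the percentage-point change per group
ref_counts$pre_rate  <- ref_counts$pre_ref / ref_counts$pre_n
ref_counts$post_rate <- ref_counts$post_ref / ref_counts$post_n
ref_counts$pp_change <- (ref_counts$post_rate - ref_counts$pre_rate) * 100

# Group counts should reconcile with the overall table
overall_pre_rate  <- sum(ref_counts$pre_ref) / sum(ref_counts$pre_n)
overall_post_rate <- sum(ref_counts$post_ref) / sum(ref_counts$post_n)

print(ref_counts[, c("group", "pp_change")])
print(round(c(pre = overall_pre_rate, post = overall_post_rate) * 100, 1))
```

Checking that subgroup counts add up to the headline totals is a cheap way to confirm that every breakdown applies the same filters as the overall table.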