We reviewed the following set of leading indicators 2 weeks after starting the Multi-Check (References) A/B Test:
Proportion of new content edits presented multiple reference checks within a single editing session
Proportion of contributors that are presented Multi Check (References) and complete their edits
Proportion of edits wherein people elect to dismiss/not change the text they’ve added.
Proportion of people blocked after publishing an edit where Multi Check was shown
Proportion of published edits that add new content and are reverted within 48 hours
Decision to be made: What – if any – adjustments/investigations will we prioritize for us to be confident moving forward with evaluating the Multi Check’s impact in T379131?
Note: Results are based on initial AB test data and are intended to check whether any adjustments to the feature need to be prioritized. More event data will be needed to confirm statistical significance for many of these findings. We will review the complete AB test data (based on two-week duration) as part of the analysis in T379131.
Methodology
We collected two weeks of AB test events logged between 25 March 2025 and 08 April 2025. In this AB test, users in the test group can be shown multiple reference checks within a single editing session, while users in the control group will only see one reference check for all edits that meet the requirements for it to be shown.
For each leading indicator metric, we reviewed the following dimensions: by experiment group (test and control), by platform (mobile web or desktop), by user experience and status, and by partner wiki.
We also compared edits that were shown more than one reference check in a single session in the test group to edits that were only presented a single reference check. For edits presented more than one reference check, we reviewed a split by the number of checks shown to determine if there was a significant metric change at a certain number of checks presented.
We relied on events logged in VisualEditorFeatureUse and change tags recorded in the revision tags table. See instrumentation spec.
Data was limited to mobile and desktop edits completed on a main page namespace using VisualEditor on one of the partner Wikipedias. We also limited to edits completed by newcomers, junior contributors, and unregistered users as those are the users that would be shown reference check under the default config settings.
Summary of results
Proportion of new content edits presented multiple reference checks within a single editing session
In the test group, multiple reference checks were shown within a single editing session in 19% of all published new content VE edits (549 edits) by unregistered users and users with 100 or fewer edits. Among edits shown multiple checks, the majority (73%) were shown between 2 and 5 checks. Based on this rate, we should have sufficient multi-check events after the test has run for 4 weeks to confirm the overall statistical significance of any changes introduced by this feature.
Proportion of contributors that are presented Multi Check (References) and complete their edits
The edit completion rate for sessions that were shown multiple checks within a session was 76.1% compared to 75% for sessions shown only one check, indicating that multiple checks are not causing significant disruption or confusion to the editors.
Proportion of edits wherein people elect to dismiss/not change the text they’ve added.
While we observed a slight increase in the proportion of individual reference checks dismissed for edits shown multiple checks in the test group, sessions shown multiple checks are more likely to include at least one new reference in the final published edit than sessions shown just a single check. In the test group, 47.3% of all published edits shown multiple checks did not include at least one new reference, compared to 60.3% of edits that were shown a single check.
Proportion of people blocked after publishing an edit where Multi Check was shown
There were also no significant changes in the proportion of users blocked after being shown multiple checks compared to a single check.
Proportion of published edits that add new content and are reverted within 48 hours
We observed no significant differences in the revert rate of new content edits between the control and the test group for editing sessions where a reference check was shown. In the test group, the revert rate of new content edits shown multiple checks (17%) is currently lower than that of sessions shown a single check (26%).
Proportion of published new content edits presented multiple reference checks within a single editing session
Methodology: The number of reference checks shown within a single editing session is determined by the following event: event.feature = 'editCheck-addReference' AND event.action = 'check-shown-presave'.
We further limited the review to edits that were successfully published and identified as new content edits with the tag editcheck-newcontent.
Code
# Assumes dplyr, gt, and IRdisplay are loaded earlier in the notebook (not shown in this section)
# load frequency data
edit_check_frequency_data <- read.csv(
  file = 'Queries/data/edit_check_frequency_data_li.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)

# Set fields and factor levels to assess number of checks shown
# Note limited to 1 sidebar open as we're looking for cases where multiple checks presented in a single sidebar (vs user going back and forth)
edit_check_frequency_data <- edit_check_frequency_data %>%
  mutate(
    multiple_checks_shown = ifelse(n_checks_shown > 1 & n_sidebar_opens < 2, 1, 0),
    multiple_checks_shown = factor(multiple_checks_shown, levels = c(0, 1))
  )

# note these buckets can be adjusted as needed based on distribution of data
edit_check_frequency_data <- edit_check_frequency_data %>%
  mutate(
    checks_shown_bucket = case_when(
      is.na(n_checks_shown) ~ '0',
      n_checks_shown == 1 | (n_checks_shown > 1 & n_sidebar_opens >= 2) ~ '1',
      n_checks_shown == 2 & n_sidebar_opens < 2 ~ '2',
      n_checks_shown > 2 & n_checks_shown <= 5 & n_sidebar_opens < 2 ~ "3-5",
      n_checks_shown > 5 & n_checks_shown <= 10 & n_sidebar_opens < 2 ~ "6-10",
      n_checks_shown > 10 & n_checks_shown <= 15 & n_sidebar_opens < 2 ~ "11-15",
      n_checks_shown > 15 & n_checks_shown <= 20 & n_sidebar_opens < 2 ~ "16-20",
      n_checks_shown > 20 & n_sidebar_opens < 2 ~ "over 20"
    ),
    checks_shown_bucket = factor(
      checks_shown_bucket,
      levels = c("0", "1", "2", "3-5", "6-10", "11-15", "16-20", "over 20")
    )
  )
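To make the bucketing rule easier to follow, here is a minimal base R sketch that applies the same thresholds to a handful of made-up sessions. The helper name `bucket_checks` and the toy values are hypothetical and not part of the original analysis. One deliberate difference: this sketch also places a count of exactly 0 in the "0" bucket, whereas the original `case_when` only maps missing counts there.

```r
# Hypothetical helper, written for illustration only.
# Sessions with 2 or more sidebar opens are treated as single-check sessions,
# matching the rule used in the analysis code.
bucket_checks <- function(n_checks_shown, n_sidebar_opens) {
  effective <- ifelse(
    is.na(n_checks_shown), 0,
    ifelse(n_checks_shown > 1 & n_sidebar_opens >= 2, 1, n_checks_shown)
  )
  cut(
    effective,
    breaks = c(-Inf, 0, 1, 2, 5, 10, 15, 20, Inf),
    labels = c("0", "1", "2", "3-5", "6-10", "11-15", "16-20", "over 20")
  )
}

# Toy data (not real sessions)
n_checks <- c(NA, 1, 2, 4, 7, 25, 3)
n_opens  <- c(NA, 1, 1, 1, 1, 1, 2)

buckets <- bucket_checks(n_checks, n_opens)
print(data.frame(n_checks, n_opens, bucket = buckets))
```

The last toy session was shown 3 checks but reopened the sidebar, so it lands in bucket "1" rather than "3-5".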
Proportion of new content edits shown at least one reference check
Code
reference_checks_shown_bytest <- edit_check_frequency_data %>%
  filter(is_new_content == 1 & was_saved == 1) %>%  # limit to new content edits
  group_by(test_group) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_refcheck = n_distinct(editing_session[was_edit_check_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_refcheck / n_editing_session * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Published new content edits shown at least one reference check by experiment group") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    n_editing_session = "Number of edits",
    n_editing_session_refcheck = "Number of edits shown reference checks",
    prop_check_shown = "Proportion of edits shown reference check"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits by unregistered users and users with 100 or fewer edits')
  )

display_html(as_raw_html(reference_checks_shown_bytest))
Published new content edits shown at least one reference check by experiment group

| Experiment Group | Number of edits | Number of edits shown reference checks | Proportion of edits shown reference check |
|---|---|---|---|
| control (single check) | 2932 | 2314 | 78.9% |
| test (multiple checks) | 2926 | 2306 | 78.8% |

Limited to published new content edits by unregistered users and users with 100 or fewer edits
Proportion of new content edits shown multiple checks in the test group
Code
multi_refchecks_overall <- edit_check_frequency_data %>%
  filter(is_new_content == 1 & was_saved == 1) %>%
  group_by(test_group) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_multicheck = n_distinct(editing_session[was_edit_check_shown == 1 & multiple_checks_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_multicheck / n_editing_session * 100, 1), "%")) %>%
  filter(test_group == 'test (multiple checks)') %>%
  gt() %>%
  tab_header(title = "Published new content edits shown multiple checks in the test group") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    n_editing_session = "Number of edits",
    n_editing_session_multicheck = "Number of edits shown multiple reference checks",
    prop_check_shown = "Proportion of edits shown multiple reference checks"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits by unregistered users and users with 100 or fewer edits')
  )

display_html(as_raw_html(multi_refchecks_overall))
Published new content edits shown multiple checks in the test group

| Experiment Group | Number of edits | Number of edits shown multiple reference checks | Proportion of edits shown multiple reference checks |
|---|---|---|---|
| test (multiple checks) | 2926 | 549 | 18.8% |

Limited to published new content edits by unregistered users and users with 100 or fewer edits
Proportion of new content edits by number of checks shown
Code
multi_refchecks_overall <- edit_check_frequency_data %>%
  filter(is_new_content == 1, was_saved == 1, test_group == 'test (multiple checks)') %>%  # want to limit to test group where multiple can be shown
  mutate(total_sessions = n_distinct(editing_session)) %>%
  group_by(total_sessions, checks_shown_bucket) %>%
  summarise(n_editing_session_refcheck = n_distinct(editing_session)) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_refcheck / total_sessions * 100, 2), "%")) %>%
  ungroup() %>%
  select(-1) %>%
  mutate(n_editing_session_refcheck = ifelse(n_editing_session_refcheck < 50, "<50", n_editing_session_refcheck)) %>%  # sanitizing per data publication guidelines
  gt() %>%
  tab_header(title = "Published new content edits by total number of reference checks shown in the test group") %>%
  opt_stylize(5) %>%
  cols_label(
    checks_shown_bucket = "Number of reference checks",
    n_editing_session_refcheck = "Number of edits",
    prop_check_shown = "Proportion of edits"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits by unregistered users and users with 100 or fewer edits')
  )

display_html(as_raw_html(multi_refchecks_overall))
Published new content edits by total number of reference checks shown in the test group

| Number of reference checks | Number of edits | Proportion of edits |
|---|---|---|
| 0 | 620 | 21.19% |
| 1 | 1759 | 60.12% |
| 2 | 214 | 7.31% |
| 3-5 | 186 | 6.36% |
| 6-10 | 94 | 3.21% |
| 11-15 | <50 | 0.99% |
| 16-20 | <50 | 0.34% |
| over 20 | <50 | 0.58% |

Limited to published new content edits by unregistered users and users with 100 or fewer edits
Proportion of new content edits shown multiple checks in the test group by platform
Code
multi_reference_checks_shown_byplatform <- edit_check_frequency_data %>%
  filter(is_new_content == 1, was_saved == 1, test_group == 'test (multiple checks)') %>%  # limit to test group sessions with more than one ref check
  group_by(platform) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_multicheck = n_distinct(editing_session[was_edit_check_shown == 1 & multiple_checks_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_multicheck / n_editing_session * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Published new content edits shown multiple reference checks by platform") %>%
  opt_stylize(5) %>%
  cols_label(
    platform = "Platform",
    n_editing_session = "Number of edits",
    n_editing_session_multicheck = "Number of edits shown multiple reference checks",
    prop_check_shown = "Proportion of edits shown multiple reference checks"
  ) %>%
  tab_source_note(
    gt::md('Limited to new content edits by unregistered users and users with 100 or fewer edits assigned to test group')
  )

display_html(as_raw_html(multi_reference_checks_shown_byplatform))
Published new content edits shown multiple reference checks by platform

| Platform | Number of edits | Number of edits shown multiple reference checks | Proportion of edits shown multiple reference checks |
|---|---|---|---|
| desktop | 1922 | 441 | 22.9% |
| phone | 1004 | 108 | 10.8% |

Limited to new content edits by unregistered users and users with 100 or fewer edits assigned to test group
Proportion of new content edits shown multiple checks by user experience
Code
multi_reference_checks_shown_byuserstatus <- edit_check_frequency_data %>%
  filter(is_new_content == 1, was_saved == 1, test_group == 'test (multiple checks)') %>%  # limit to test group sessions with more than one ref check
  group_by(experience_level_group) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_multicheck = n_distinct(editing_session[was_edit_check_shown == 1 & multiple_checks_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_multicheck / n_editing_session * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Published new content edits shown multiple reference checks by user experience") %>%
  opt_stylize(5) %>%
  cols_label(
    experience_level_group = "User Status",
    n_editing_session = "Number of edits",
    n_editing_session_multicheck = "Number of edits shown multiple reference checks",
    prop_check_shown = "Proportion of edits shown multiple reference checks"
  ) %>%
  tab_source_note(
    gt::md('Limited to new content edits by unregistered users and users with 100 or fewer edits assigned to test group')
  )

display_html(as_raw_html(multi_reference_checks_shown_byuserstatus))
Published new content edits shown multiple reference checks by user experience

| User Status | Number of edits | Number of edits shown multiple reference checks | Proportion of edits shown multiple reference checks |
|---|---|---|---|
| Unregistered | 1496 | 231 | 15.4% |
| Newcomer | 328 | 89 | 27.1% |
| Junior Contributor | 1102 | 229 | 20.8% |

Limited to new content edits by unregistered users and users with 100 or fewer edits assigned to test group
By partner Wikipedia
Code
multi_reference_checks_shown_bywiki <- edit_check_frequency_data %>%
  filter(is_new_content == 1, was_saved == 1, test_group == 'test (multiple checks)') %>%  # limit to test group sessions with more than one ref check
  group_by(wiki) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_multicheck = n_distinct(editing_session[was_edit_check_shown == 1 & multiple_checks_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_multicheck / n_editing_session * 100, 1), "%")) %>%
  filter(n_editing_session_multicheck > 50) %>%
  gt() %>%
  tab_header(title = "Published new content edits shown multiple reference checks by partner wikipedia") %>%
  opt_stylize(5) %>%
  cols_label(
    wiki = "Wikipedia",
    n_editing_session = "Number of edits",
    n_editing_session_multicheck = "Number of edits shown multiple reference checks",
    prop_check_shown = "Proportion of edits shown multiple reference checks"
  ) %>%
  tab_source_note(
    gt::md('Limited to wikis with at least 50 published new content edits shown multiple checks')
  )

display_html(as_raw_html(multi_reference_checks_shown_bywiki))
Published new content edits shown multiple reference checks by partner wikipedia

| Wikipedia | Number of edits | Number of edits shown multiple reference checks | Proportion of edits shown multiple reference checks |
|---|---|---|---|
| eswiki | 636 | 117 | 18.4% |
| frwiki | 857 | 169 | 19.7% |
| itwiki | 637 | 97 | 15.2% |
| ptwiki | 235 | 51 | 21.7% |

Limited to wikis with at least 50 published new content edits shown multiple checks
Key Insights
Reference checks are presented in about 79% of published new content VisualEditor edits completed by unregistered users and users with 100 or fewer edits. The frequency is the same across experiment groups, as the change introduced by this experiment did not affect the number of sessions that could be shown a reference check at least once.
In the test group, multiple reference checks were shown within a single editing session in 19% of all new content VE edits (549 edits) by unregistered users and users with 100 or fewer edits.
Among edits shown multiple checks, the majority (73%) were shown between 2 and 5 checks. 3% of these edits were shown over 20 checks within a single session.
Multiple reference checks are shown more frequently on desktop than on mobile web. 23% of new content edits on desktop were shown multiple reference checks compared to 11% of new content edits on mobile web.
Newcomers are also more likely to be shown multiple reference checks (27% of new content edits published by newcomers in the test group were shown multiple reference checks compared to 21% of edits by junior contributors and 15% by unregistered users).
At all partner wikis, the proportion of new content edits shown multiple reference checks ranges from 15% at Italian Wikipedia to 22% at Portuguese Wikipedia. All partner wikis have had several published edits where multiple checks were shown; however, some of the smaller wikis have had very few events (< 25) logged to date.
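The headline percentages in these insights can be re-derived from the counts published in the tables above. The short sketch below only repeats that arithmetic; the variable names are illustrative and not taken from the analysis code.

```r
# Counts copied from the test-group tables in this section
n_test_edits    <- 2926  # published new content edits in the test group
n_multi         <- 549   # of those, edits shown multiple reference checks
n_two           <- 214   # edits shown exactly 2 checks
n_three_to_five <- 186   # edits shown 3 to 5 checks

# Share of test-group edits shown multiple checks (reported as 19%)
share_multi <- round(n_multi / n_test_edits * 100, 1)

# Share of multi-check edits that were shown 2 to 5 checks (reported as 73%)
share_2_to_5 <- round((n_two + n_three_to_five) / n_multi * 100, 1)

print(c(share_multi = share_multi, share_2_to_5 = share_2_to_5))
```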
Proportion of contributors that are presented multi check (References) and complete their edits
Methodology: We reviewed the proportion of edits by newcomers, junior contributors, and unregistered users that were shown a reference check during their edit session and successfully published their edit (event.action = saveSuccess). The analysis is limited to edits that reached the point where a reference check was presented at least once after the user indicated their intent to save (event.action = saveIntent).
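As a rough illustration of this definition, the base R sketch below computes a completion rate from a tiny, made-up event log. The flat `events` data frame and its column layout are assumptions for the example; the real VisualEditorFeatureUse events are structured differently and were pre-aggregated in the query that produced the TSV loaded below.

```r
# Toy event log (hypothetical sessions, one row per logged event)
events <- data.frame(
  editing_session = c("a", "a", "a", "b", "b", "c", "c", "d", "d", "d"),
  action = c(
    "saveIntent", "check-shown-presave", "saveSuccess",  # a: shown a check, published
    "saveIntent", "check-shown-presave",                 # b: shown a check, abandoned
    "saveIntent", "saveSuccess",                         # c: never shown a check (excluded)
    "saveIntent", "check-shown-presave", "saveSuccess"   # d: shown a check, published
  ),
  stringsAsFactors = FALSE
)

# Denominator: sessions shown at least one reference check
shown_sessions <- unique(events$editing_session[events$action == "check-shown-presave"])

# Numerator: those sessions that also logged a successful save
saved_sessions <- unique(events$editing_session[events$action == "saveSuccess"])
n_attempts <- length(shown_sessions)
n_saves    <- length(intersect(shown_sessions, saved_sessions))

completion_rate <- round(n_saves / n_attempts * 100, 1)
print(completion_rate)
```

Session "c" is left out of the denominator because no reference check was shown, which mirrors the restriction described in the methodology.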
Code
# load data for assessing edit completion rate
edit_completion_rates_data <- read.csv(
  file = 'Queries/data/edit_completion_rate_data.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)

# Set fields and factor levels to assess number of checks shown
# Note limited to 1 sidebar open as we're looking for cases where multiple checks presented in a single sidebar (vs user going back and forth)
edit_completion_rates_data <- edit_completion_rates_data %>%
  mutate(
    multiple_checks_shown = ifelse(n_checks_shown > 1 & n_sidebar_opens < 2, "multiple checks shown", "one check shown"),
    multiple_checks_shown = factor(multiple_checks_shown, levels = c("one check shown", "multiple checks shown"))
  )

# note these buckets can be adjusted as needed based on distribution of data
edit_completion_rates_data <- edit_completion_rates_data %>%
  mutate(
    checks_shown_bucket = case_when(
      is.na(n_checks_shown) ~ '0',
      n_checks_shown == 1 | (n_checks_shown > 1 & n_sidebar_opens >= 2) ~ '1',
      n_checks_shown == 2 & n_sidebar_opens < 2 ~ '2',
      n_checks_shown > 2 & n_checks_shown <= 5 & n_sidebar_opens < 2 ~ "3-5",
      n_checks_shown > 5 & n_checks_shown <= 10 & n_sidebar_opens < 2 ~ "6-10",
      n_checks_shown > 10 & n_checks_shown <= 15 & n_sidebar_opens < 2 ~ "11-15",
      n_checks_shown > 15 & n_checks_shown <= 20 & n_sidebar_opens < 2 ~ "16-20",
      n_checks_shown > 20 & n_sidebar_opens < 2 ~ "over 20"
    ),
    checks_shown_bucket = factor(
      checks_shown_bucket,
      levels = c("0", "1", "2", "3-5", "6-10", "11-15", "16-20", "over 20")
    )
  )
Code
# Remove one abnormal instance of multiple checks being shown within control group
edit_completion_rates_data <- edit_completion_rates_data %>%
  filter(!(test_group == 'control (single check)' & multiple_checks_shown == "multiple checks shown"))
Edit completion rate by experiment group
Code
edit_completion_rate_overall <- edit_completion_rates_data %>%
  filter(ref_check_shown == 1) %>%  # limit to sessions where reference check was shown
  group_by(test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Edit completion rate by experiment group") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    n_edits = "Number of edit attempts shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edit attempts shown at least one reference check')
  )

display_html(as_raw_html(edit_completion_rate_overall))
Edit completion rate by experiment group

| Experiment Group | Number of edit attempts shown reference check | Number of published edits | Proportion of edits saved |
|---|---|---|---|
| control (single check) | 3145 | 2342 | 74.5% |
| test (multiple checks) | 3107 | 2338 | 75.2% |

Limited to edit attempts shown at least one reference check
Edit completion rate by if multiple checks were shown
Code
edit_completion_rate_bymulti <- edit_completion_rates_data %>%
  filter(ref_check_shown == 1) %>%
  group_by(test_group, multiple_checks_shown) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Edit completion rate by if multiple checks were shown") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment group",
    multiple_checks_shown = "Multiple checks shown",
    n_edits = "Number of edit attempts shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edit attempts shown at least one reference check')
  )

display_html(as_raw_html(edit_completion_rate_bymulti))
Edit completion rate by if multiple checks were shown

| Experiment group | Multiple checks shown | Number of edit attempts shown reference check | Number of published edits | Proportion of edits saved |
|---|---|---|---|---|
| control (single check) | one check shown | 3145 | 2342 | 74.5% |
| test (multiple checks) | one check shown | 2371 | 1778 | 75% |
| test (multiple checks) | multiple checks shown | 736 | 560 | 76.1% |

Limited to edit attempts shown at least one reference check
Edit completion rate by number of checks shown
Code
edit_completion_rate_bynchecks <- edit_completion_rates_data %>%
  filter(ref_check_shown == 1) %>%
  group_by(test_group, checks_shown_bucket) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  ungroup() %>%
  mutate(
    n_edits = ifelse(n_edits < 50, "<50", n_edits),
    n_saves = ifelse(n_saves < 50, "<50", n_saves)
  ) %>%  # sanitizing per data publication guidelines
  group_by(test_group) %>%
  gt() %>%
  tab_header(title = "Edit completion rate by the number of reference checks shown") %>%
  opt_stylize(5) %>%
  cols_label(
    checks_shown_bucket = "Number of checks shown",
    n_edits = "Number of edit attempts shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edits shown at least one reference check')
  )

display_html(as_raw_html(edit_completion_rate_bynchecks))
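Edit completion rate by the number of reference checks shown

| Experiment group | Number of checks shown | Number of edit attempts shown reference check | Number of published edits | Proportion of edits saved |
|---|---|---|---|---|
| control (single check) | 1 | 3145 | 2342 | 74.5% |
| test (multiple checks) | 1 | 2371 | 1778 | 75% |
| test (multiple checks) | 2 | 269 | 220 | 81.8% |
| test (multiple checks) | 3-5 | 250 | 189 | 75.6% |
| test (multiple checks) | 6-10 | 129 | 96 | 74.4% |
| test (multiple checks) | 11-15 | <50 | <50 | 74.4% |
| test (multiple checks) | 16-20 | <50 | <50 | 58.8% |
| test (multiple checks) | over 20 | <50 | <50 | 50% |

Limited to edits shown at least one reference check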
Edit completion rate by platform
Code
edit_completion_rate_byplatform <- edit_completion_rates_data %>%
  filter(ref_check_shown == 1) %>%
  group_by(platform, test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Edit completion rate by experiment group and platform") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    platform = "Platform",
    n_edits = "Number of edit attempts shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edit attempts shown at least one reference check')
  )

display_html(as_raw_html(edit_completion_rate_byplatform))
Edit completion rate by experiment group and platform

| Platform | Experiment Group | Number of edit attempts shown reference check | Number of published edits | Proportion of edits saved |
|---|---|---|---|---|
| desktop | control (single check) | 1843 | 1427 | 77.4% |
| desktop | test (multiple checks) | 1832 | 1444 | 78.8% |
| phone | control (single check) | 1302 | 915 | 70.3% |
| phone | test (multiple checks) | 1275 | 894 | 70.1% |

Limited to edit attempts shown at least one reference check
Edit completion rate by user experience
Code
edit_completion_rate_byuserstatus <- edit_completion_rates_data %>%
  filter(ref_check_shown == 1) %>%
  group_by(experience_level_group, test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Edit completion rate by experiment group and editor experience") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    experience_level_group = "User Status",  # was labelled "Experiment Group", which describes test_group
    n_edits = "Number of edit attempts shown reference check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edit attempts shown at least one reference check')
  )

display_html(as_raw_html(edit_completion_rate_byuserstatus))
Edit completion rate by experiment group and editor experience

| User Status | Test Group | Number of edit attempts shown reference check | Number of published edits | Proportion of edits saved |
|---|---|---|---|---|
| Unregistered | control (single check) | 1783 | 1266 | 71% |
| Unregistered | test (multiple checks) | 1811 | 1292 | 71.3% |
| Newcomer | control (single check) | 433 | 317 | 73.2% |
| Newcomer | test (multiple checks) | 397 | 289 | 72.8% |
| Junior Contributor | control (single check) | 929 | 759 | 81.7% |
| Junior Contributor | test (multiple checks) | 899 | 757 | 84.2% |

Limited to edit attempts shown at least one reference check
Edit completion rate by partner Wikipedia
Code
edit_completion_rate_bywiki <- edit_completion_rates_data %>%
  filter(ref_check_shown == 1) %>%
  group_by(wiki, test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  filter(n_saves >= 100) %>%
  gt() %>%
  # title corrected: this table is split by wiki, not by user status
  tab_header(title = "Edit completion rate by experiment group and partner Wikipedia") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    wiki = "Wikipedia",
    n_edits = "Number of edit attempts shown edit check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to wikis with at least 100 published edits')
  )

display_html(as_raw_html(edit_completion_rate_bywiki))
Edit completion rate by experiment group and partner Wikipedia

| Wikipedia | Test Group | Number of edit attempts shown edit check | Number of published edits | Proportion of edits saved |
|---|---|---|---|---|
| arwiki | control (single check) | 258 | 146 | 56.6% |
| arwiki | test (multiple checks) | 207 | 112 | 54.1% |
| eswiki | control (single check) | 643 | 468 | 72.8% |
| eswiki | test (multiple checks) | 712 | 517 | 72.6% |
| frwiki | control (single check) | 826 | 671 | 81.2% |
| frwiki | test (multiple checks) | 865 | 687 | 79.4% |
| itwiki | control (single check) | 727 | 549 | 75.5% |
| itwiki | test (multiple checks) | 684 | 538 | 78.7% |
| jawiki | control (single check) | 277 | 210 | 75.8% |
| jawiki | test (multiple checks) | 240 | 186 | 77.5% |
| ptwiki | control (single check) | 257 | 175 | 68.1% |
| ptwiki | test (multiple checks) | 231 | 167 | 72.3% |

Limited to wikis with at least 100 published edits
Key Insights
The overall edit completion rate for the test group (75.2%) is currently 0.7 percentage points higher than the edit completion rate for the control group (74.5%), indicating that multiple checks are not causing significant disruption or confusion to the editors.
We also directly compared editing sessions shown multiple reference checks to editing sessions shown only one reference check. The edit completion rate for sessions that were shown multiple checks within a session was 76.1% compared to 75% for sessions shown only one check.
Edit completion rates stay around 75% for up to 15 checks shown within a single session. After that, the edit completion rate decreases to 58.8% for editing sessions shown between 16 and 20 checks and to 50% for sessions shown over 20. Note: There were fewer than 50 edit attempts overall that were shown 16 or more reference checks, so more data is needed to confirm the decrease at this threshold.
We also did not observe any significant differences in edit completion rate by platform, user experience level or wiki. More data will be needed to confirm any statistically significant changes in completion rates caused by multi-check.
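One way to check whether the gap between single-check and multi-check sessions is distinguishable from noise is a two-sample test of proportions on the published counts. The sketch below uses base R's `prop.test` with the test-group counts from the table above. It illustrates the kind of check that could be run and is not necessarily the method that will be used for the final analysis in T379131.

```r
# Counts copied from the "Edit completion rate by if multiple checks were shown"
# table (test group only)
saves    <- c(multiple = 560, single = 1778)  # published edits
attempts <- c(multiple = 736, single = 2371)  # edit attempts shown a reference check

# Two-sample test of equal proportions (continuity correction on by default)
result <- prop.test(x = saves, n = attempts)

print(round(saves / attempts * 100, 1))  # observed completion rates, in percent
print(result$p.value)                    # well above 0.05 for these counts
print(result$conf.int)                   # 95% interval for the difference in proportions
```

With these counts the confidence interval for the difference spans zero, which is consistent with the statement that more data is needed before drawing conclusions.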
Proportion of published new content edits wherein people elected to dismiss adding a new reference.
Methodology: We reviewed the proportion of published new content edits in which people elected to dismiss adding a new reference. These are edits where the user declined to add a reference at least once in a session (event.feature = 'editCheck-addReference' AND event.action = 'action-reject') and where no new reference was included in the final published new content edit (determined from the revision tag editcheck-newreference).
We also reviewed the proportion of all individual reference checks that were dismissed.
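The sketch below shows how this session-level definition translates into a flag and a rate. The column names `n_rejects` and `included_new_reference` follow the analysis code in this section, but the five sessions are made up for illustration.

```r
# Toy sessions (hypothetical values)
sessions <- data.frame(
  editing_session        = c("s1", "s2", "s3", "s4", "s5"),
  n_rejects              = c(1, 0, 2, 0, 1),  # times the user declined a reference check
  included_new_reference = c(0, 1, 1, 0, 0),  # 1 if the published edit added a reference
  stringsAsFactors = FALSE
)

# An edit counts as "dismissed" only if a check was declined at least once
# AND the published edit contains no new reference.
sessions$dismissed <- sessions$n_rejects > 0 & sessions$included_new_reference == 0

n_dismissed    <- sum(sessions$dismissed)
dismissal_rate <- round(n_dismissed / nrow(sessions) * 100, 1)

print(sessions)
print(dismissal_rate)
```

Session "s3" declined two checks but still added a reference, so it is not counted as dismissed. Session "s4" added no reference but never declined a check, so it is not counted either.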
Code
# load data for assessing edit reject frequency
edit_check_reject_data <- read.csv(
  file = 'Queries/data/edit_check_rejects_data.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)

# Set fields and factor levels to assess number of checks shown
# Note limited to 1 sidebar open as we're looking for cases where multiple checks presented in a single sidebar (vs user going back and forth)
edit_check_reject_data <- edit_check_reject_data %>%
  mutate(
    multiple_checks_shown = ifelse(n_checks_shown > 1 & n_sidebar_opens < 2, "multiple checks shown", "single check shown"),
    multiple_checks_shown = factor(multiple_checks_shown, levels = c("single check shown", "multiple checks shown"))
  )

# note these buckets can be adjusted as needed based on distribution of data
edit_check_reject_data <- edit_check_reject_data %>%
  mutate(
    checks_shown_bucket = case_when(
      is.na(n_checks_shown) ~ '0',
      n_checks_shown == 1 | (n_checks_shown > 1 & n_sidebar_opens >= 2) ~ '1',
      n_checks_shown == 2 & n_sidebar_opens < 2 ~ '2',
      n_checks_shown > 2 & n_checks_shown <= 5 & n_sidebar_opens < 2 ~ "3-5",
      n_checks_shown > 5 & n_checks_shown <= 10 & n_sidebar_opens < 2 ~ "6-10",
      n_checks_shown > 10 & n_checks_shown <= 15 & n_sidebar_opens < 2 ~ "11-15",
      n_checks_shown > 15 & n_checks_shown <= 20 & n_sidebar_opens < 2 ~ "16-20",
      n_checks_shown > 20 & n_sidebar_opens < 2 ~ "over 20"
    ),
    checks_shown_bucket = factor(
      checks_shown_bucket,
      levels = c("0", "1", "2", "3-5", "6-10", "11-15", "16-20", "over 20")
    )
  )
Code
# remove some small occurrences of abnormal data. Will investigate but <0.001% of data at moment so won't impact results.
# Remove one abnormal instance of multiple checks being shown within control group
edit_check_reject_data <- edit_check_reject_data %>%
  filter(!(test_group == 'control (single check)' & multiple_checks_shown == "multiple checks shown"))

# remove one abnormal instance of multiple reject actions being logged with no instances of checks being shown
# Relabel n_rejects: sessions shown a check with no reject logged count as 0 rejects
edit_check_reject_data <- edit_check_reject_data %>%
  filter(!(is.na(n_checks_shown) & n_rejects > 0)) %>%
  mutate(n_rejects = ifelse(n_checks_shown > 0 & is.na(n_rejects), 0, n_rejects))
Proportion of new content edits without a reference by experiment group
Code
edit_check_dismissal_overall <- edit_check_reject_data %>%
  filter(was_edit_check_shown == 1 & is_new_content == 1) %>%  # limit to where shown
  group_by(test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_rejects = n_distinct(editing_session[n_rejects > 0 & included_new_reference == 0])
  ) %>%  # limit to new content edits without a reference
  mutate(dismissal_rate = paste0(round(n_rejects / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Proportion of new content edits where reference checks was shown \n and no new reference was added") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    n_edits = "Number of edits shown reference check",
    n_rejects = "Number of edits that did not add at least one new reference",
    dismissal_rate = "Proportion of edits where people elected to not add a reference"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits where at least one reference check was shown')
  )

display_html(as_raw_html(edit_check_dismissal_overall))
Proportion of new content edits where reference checks was shown and no new reference was added

| Experiment Group | Number of edits shown reference check | Number of edits that did not add at least one new reference | Proportion of edits where people elected to not add a reference |
|---|---|---|---|
| control (single check) | 2313 | 1333 | 57.6% |
| test (multiple checks) | 2307 | 1320 | 57.2% |

Limited to published new content edits where at least one reference check was shown
Proportion of new content edits without a reference by whether multiple checks were shown
Code
edit_check_dismissal_bymultiple <- edit_check_reject_data %>%
  filter(was_edit_check_shown == 1 & is_new_content == 1) %>% # limit to where shown
  group_by(test_group, multiple_checks_shown) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0 & included_new_reference == 0])) %>% # limit to new content edits without a reference
  mutate(dismissal_rate = paste0(round(n_rejects/n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(
    title = "Proportion of new content edits without a reference by if multiple checks were shown"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    multiple_checks_shown = "Multiple Checks",
    n_edits = "Number of edits shown reference check",
    n_rejects = "Number of edits that did not add at least one new reference",
    dismissal_rate = "Proportion of edits where people elected to not add a reference"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits')
  )

display_html(as_raw_html(edit_check_dismissal_bymultiple))
Proportion of new content edits without a reference by if multiple checks were shown
| Experiment Group | Multiple Checks | Number of edits shown reference check | Number of edits that did not add at least one new reference | Proportion of edits where people elected to not add a reference |
| --- | --- | --- | --- | --- |
| control (single check) | single check shown | 2313 | 1333 | 57.6% |
| test (multiple checks) | single check shown | 1760 | 1061 | 60.3% |
| test (multiple checks) | multiple checks shown | 547 | 259 | 47.3% |
Limited to published new content edits
Proportion of new content edits without a reference by number of checks shown
Code
edit_check_dismissal_bynchecks <- edit_check_reject_data %>%
  filter(was_edit_check_shown == 1 & is_new_content == 1 & n_sidebar_opens < 2) %>% # limit to where shown
  group_by(test_group, checks_shown_bucket) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0 & included_new_reference == 0])) %>% # limit to new content edits without a reference
  mutate(dismissal_rate = paste0(round(n_rejects/n_edits * 100, 1), "%")) %>%
  ungroup() %>%
  mutate(n_edits = ifelse(n_edits < 50, "<50", n_edits),
         n_rejects = ifelse(n_rejects < 50, "<50", n_rejects)) %>% # sanitizing per data publication guidelines
  group_by(test_group) %>%
  gt() %>%
  tab_header(
    title = "Proportion of new content edits without a reference by the number of checks shown"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    checks_shown_bucket = "Number of reference checks shown",
    n_edits = "Number of edits shown reference check",
    n_rejects = "Number of edits that did not add at least one new reference",
    dismissal_rate = "Proportion of edits where people elected to not add a reference"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits')
  )

display_html(as_raw_html(edit_check_dismissal_bynchecks))
Proportion of new content edits without a reference by the number of checks shown
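| Experiment Group | Number of reference checks shown | Number of edits shown reference check | Number of edits that did not add at least one new reference | Proportion of edits where people elected to not add a reference |
| --- | --- | --- | --- | --- |
| control (single check) | 1 | 2032 | 1226 | 60.3% |
| test (multiple checks) | 1 | 1512 | 941 | 62.2% |
| test (multiple checks) | 2 | 213 | 107 | 50.2% |
| test (multiple checks) | 3-5 | 186 | 84 | 45.2% |
| test (multiple checks) | 6-10 | 94 | <50 | 44.7% |
| test (multiple checks) | 11-15 | <50 | <50 | 46.4% |
| test (multiple checks) | 16-20 | <50 | <50 | 60% |
| test (multiple checks) | over 20 | <50 | <50 | 43.8% |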
Limited to published new content edits
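To give a sense of how much uncertainty sits behind the smaller buckets, the sketch below computes exact binomial confidence intervals for the test-group buckets whose counts are published in full. It is not part of the original analysis: it assumes base R's binom.test (Clopper-Pearson interval, 95% level), the data frame and column names are illustrative, and buckets reported as "<50" are left out because their exact counts are not available.

```r
# Test-group counts copied from the table by number of checks shown.
# Buckets with sanitized "<50" counts are omitted (exact values unknown).
buckets <- data.frame(
  checks_shown = c("1", "2", "3-5"),
  n_edits      = c(1512, 213, 186),  # edits shown a reference check
  n_no_ref     = c(941, 107, 84)     # edits that did not add a new reference
)

# Exact (Clopper-Pearson) 95% interval per bucket; one row per bucket.
ci <- t(mapply(
  function(x, n) as.numeric(binom.test(x, n)$conf.int),
  buckets$n_no_ref, buckets$n_edits
))

buckets$rate_pct  <- round(buckets$n_no_ref / buckets$n_edits * 100, 1)
buckets$lower_pct <- round(ci[, 1] * 100, 1)
buckets$upper_pct <- round(ci[, 2] * 100, 1)

print(buckets)
```

Intervals widen as the number of edits in a bucket shrinks, which is one way to judge whether more multi-check sessions are needed before reading a trend into the higher buckets.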
Overall reference check dismissal rate by experiment group
We also reviewed the total number of individual reference checks dismissed to determine whether a large portion of the checks shown within a single session were being actively dismissed by users.
Code
edit_check_dismissal_totals <- edit_check_reject_data %>%
  filter(was_edit_check_shown == 1 & is_new_content == 1) %>% # limit to where shown
  group_by(test_group, multiple_checks_shown) %>%
  summarise(n_checks_shown = sum(n_checks_shown), # Note there are NAs for sessions that don't select. Need to replace with 0
            n_rejects = sum(n_rejects)) %>% # total number of individual checks dismissed
  mutate(dismissal_rate = paste0(round(n_rejects/n_checks_shown * 100, 1), "%")) %>%
  gt() %>%
  opt_stylize(5) %>%
  tab_header(
    title = "Proportion of distinct reference checks shown that were dismissed"
  ) %>%
  cols_label(
    # multiple_checks_shown = "Multiple checks shown",
    n_checks_shown = "Number of checks shown",
    n_rejects = "Number of reference checks dismissed",
    dismissal_rate = "Proportion of reference checks dismissed"
  )

display_html(as_raw_html(edit_check_dismissal_totals))
Proportion of distinct reference checks shown that were dismissed
| Experiment Group | Multiple checks shown | Number of checks shown | Number of reference checks dismissed | Proportion of reference checks dismissed |
| --- | --- | --- | --- | --- |
| control (single check) | single check shown | 2823 | 1713 | 60.7% |
| test (multiple checks) | single check shown | 4433 | 1804 | 40.7% |
| test (multiple checks) | multiple checks shown | 2903 | 2025 | 69.8% |
Proportion of new content edits without a reference by platform
Code
edit_check_dismissal_byplatform <- edit_check_reject_data %>%
  filter(was_edit_check_shown == 1 & is_new_content == 1) %>% # limit to where shown
  group_by(platform, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0 & included_new_reference == 0])) %>% # limit to new content edits without a reference
  mutate(dismissal_rate = paste0(round(n_rejects/n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(
    title = "Proportion of new content edits without a reference by platform"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    platform = "Platform",
    n_edits = "Number of edits shown reference check",
    n_rejects = "Number of edits that did not add at least one new reference",
    dismissal_rate = "Proportion of edits where people elected to not add a reference"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits')
  )

display_html(as_raw_html(edit_check_dismissal_byplatform))
Proportion of new content edits without a reference by platform
| Platform | Experiment Group | Number of edits shown reference check | Number of edits that did not add at least one new reference | Proportion of edits where people elected to not add a reference |
| --- | --- | --- | --- | --- |
| desktop | control (single check) | 1413 | 701 | 49.6% |
| desktop | test (multiple checks) | 1419 | 705 | 49.7% |
| phone | control (single check) | 900 | 632 | 70.2% |
| phone | test (multiple checks) | 888 | 615 | 69.3% |
Limited to published new content edits
Proportion of new content edits without a reference by user experience
Code
edit_check_dismissal_byuserstatus <- edit_check_reject_data %>%
  filter(was_edit_check_shown == 1 & is_new_content == 1) %>% # limit to where shown
  group_by(experience_level_group, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0 & included_new_reference == 0])) %>% # limit to new content edits without a reference
  mutate(dismissal_rate = paste0(round(n_rejects/n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(
    title = "Proportion of new content edits without a reference by user experience"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    experience_level_group = "User Status",
    n_edits = "Number of edits shown reference check",
    n_rejects = "Number of edits that did not add at least one new reference",
    dismissal_rate = "Proportion of edits where people elected to not add a reference"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits')
  )

display_html(as_raw_html(edit_check_dismissal_byuserstatus))
Proportion of new content edits without a reference by user experience
| User Status | Experiment Group | Number of edits shown reference check | Number of edits that did not add at least one new reference | Proportion of edits where people elected to not add a reference |
| --- | --- | --- | --- | --- |
| Unregistered | control (single check) | 1255 | 832 | 66.3% |
| Unregistered | test (multiple checks) | 1282 | 861 | 67.2% |
| Newcomer | control (single check) | 309 | 156 | 50.5% |
| Newcomer | test (multiple checks) | 283 | 143 | 50.5% |
| Junior Contributor | control (single check) | 749 | 345 | 46.1% |
| Junior Contributor | test (multiple checks) | 742 | 316 | 42.6% |
Limited to published new content edits
Proportion of new content edits without a reference by partner Wikipedia
Code
edit_check_dismissal_bywiki <- edit_check_reject_data %>%
  filter(was_edit_check_shown == 1 & is_new_content == 1) %>% # limit to where shown
  group_by(wiki, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0 & included_new_reference == 0])) %>% # limit to new content edits without a reference
  mutate(dismissal_rate = paste0(round(n_rejects/n_edits * 100, 1), "%")) %>%
  filter(n_rejects > 65) %>% # remove wikis with too few edits
  gt() %>%
  tab_header(
    title = "Proportion of new content edits without a reference by Wikipedia"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    wiki = "Wikipedia",
    n_edits = "Number of edits shown reference check",
    n_rejects = "Number of edits that did not add at least one new reference",
    dismissal_rate = "Proportion of edits where people elected to not add a reference"
  ) %>%
  tab_source_note(
    gt::md('Limited to wikis with at least 100 published edits')
  )

display_html(as_raw_html(edit_check_dismissal_bywiki))
Proportion of new content edits without a reference by Wikipedia
| Wikipedia | Experiment Group | Number of edits shown reference check | Number of edits that did not add at least one new reference | Proportion of edits where people elected to not add a reference |
| --- | --- | --- | --- | --- |
| eswiki | control (single check) | 461 | 307 | 66.6% |
| eswiki | test (multiple checks) | 514 | 310 | 60.3% |
| frwiki | control (single check) | 661 | 358 | 54.2% |
| frwiki | test (multiple checks) | 676 | 392 | 58% |
| itwiki | control (single check) | 546 | 338 | 61.9% |
| itwiki | test (multiple checks) | 535 | 340 | 63.6% |
| jawiki | control (single check) | 209 | 144 | 68.9% |
| jawiki | test (multiple checks) | 184 | 100 | 54.3% |
| ptwiki | control (single check) | 172 | 70 | 40.7% |
| ptwiki | test (multiple checks) | 164 | 72 | 43.9% |
Limited to wikis with at least 100 published edits
Key Insights
Comparing overall rates observed in the test and control groups, there are no significant differences in the proportion of published new content edits where people elected not to add a new reference. 57.2% of new content edits where a reference check was shown did not include a new reference in the test group, compared to 57.6% of new content edits in the control group.
While a higher proportion of individual checks were dismissed in test-group edits shown multiple checks (69.8%, compared to 40.7% for test-group edits shown a single check), sessions shown multiple checks are more likely to include at least one new reference in the final published edit than sessions shown just a single check. In the test group, 47.3% of published edits shown multiple checks did not include at least one new reference, compared to 60.3% of edits shown a single check.
Currently, the proportion of edits without a new reference appears to decrease slightly as the number of checks shown increases; however, more edits shown multiple checks are needed to confirm this trend.
These trends do not vary significantly by platform, user experience level, or wiki.
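As a rough check on the "no significant differences" reading of the overall rates, the counts from the experiment-group table can be run through a two-sample test of proportions. The sketch below is not part of the original analysis: it assumes base R's prop.test with its default continuity correction and a conventional 0.05 threshold, and the variable names are illustrative.

```r
# Counts copied from the table "Proportion of new content edits where
# reference checks was shown and no new reference was added".
no_reference <- c(control = 1333, test = 1320)  # edits without a new reference
shown        <- c(control = 2313, test = 2307)  # edits shown a reference check

# Two-sample test for equality of proportions (assumption: default settings).
dismissal_test <- prop.test(x = no_reference, n = shown)

# Observed proportion per group, in percent.
dismissal_pct <- round(no_reference / shown * 100, 1)

print(dismissal_pct)
print(dismissal_test$p.value)
```

A p-value above the chosen threshold is consistent with the statement that the two groups do not differ on this metric, although the full A/B test data would still be needed for a final call.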
Proportion of published new content edits that are reverted within 48 hours
Methodology: We reviewed the proportion of new content edits where a reference check was shown that were reverted within 48 hours.
Code
# load data for assessing revert rates
edit_check_revert_data <-
  read.csv(
    file = 'Queries/data/edit_check_reverts_data.tsv',
    header = TRUE,
    sep = "\t",
    stringsAsFactors = FALSE
  )

# set field to indicate if more than one check was shown in a single session. Note: This should only be applicable to the test group
edit_check_revert_data <- edit_check_revert_data %>%
  mutate(multiple_checks_shown = ifelse(n_checks_shown > 1 & n_sidebar_opens < 2, "multiple checks shown", "single check shown"),
         multiple_checks_shown = factor(multiple_checks_shown,
                                        levels = c("single check shown", "multiple checks shown")))

# note these buckets can be adjusted as needed based on distribution of data
edit_check_revert_data <- edit_check_revert_data %>%
  mutate(checks_shown_bucket = case_when(
           is.na(n_checks_shown) ~ '0',
           n_checks_shown == 1 | (n_checks_shown > 1 & n_sidebar_opens >= 2) ~ '1',
           n_checks_shown == 2 & n_sidebar_opens < 2 ~ '2',
           n_checks_shown > 2 & n_checks_shown <= 5 & n_sidebar_opens < 2 ~ "3-5",
           n_checks_shown > 5 & n_checks_shown <= 10 & n_sidebar_opens < 2 ~ "6-10",
           n_checks_shown > 10 & n_checks_shown <= 15 & n_sidebar_opens < 2 ~ "11-15",
           n_checks_shown > 15 & n_checks_shown <= 20 & n_sidebar_opens < 2 ~ "16-20",
           n_checks_shown > 20 & n_sidebar_opens < 2 ~ "over 20"
         ),
         checks_shown_bucket = factor(checks_shown_bucket,
                                      levels = c("0", "1", "2", "3-5", "6-10", "11-15", "16-20", "over 20")))
Code
# Remove one abnormal instance of multiple checks being shown within the control group
edit_check_revert_data <- edit_check_revert_data %>%
  filter(!(test_group == 'control (single check)' & multiple_checks_shown == "multiple checks shown"))
Revert rate by experiment group
Code
edit_check_revert_overall <- edit_check_revert_data %>%
  filter(is_new_content == 1, was_edit_check_shown == 1) %>% # limit to where shown
  group_by(test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # count sessions whose edit was reverted
  mutate(revert_rate = paste0(round(n_reverts/n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(
    title = "New content edit revert rate by experiment group"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    n_edits = "Number of published edits shown reference check",
    n_reverts = "Number of edits reverted",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits shown at least one reference check')
  )

display_html(as_raw_html(edit_check_revert_overall))
New content edit revert rate by experiment group
| Test Group | Number of published edits shown reference check | Number of edits reverted | Proportion of new content edits that were reverted |
| --- | --- | --- | --- |
| control (single check) | 2313 | 562 | 24.3% |
| test (multiple checks) | 2307 | 604 | 26.2% |
Limited to published new content edits shown at least one reference check
Revert rate by whether multiple checks were shown
Code
edit_check_revert_bymultiple <- edit_check_revert_data %>%
  filter(is_new_content == 1 & was_edit_check_shown == 1) %>% # limit to where shown
  group_by(multiple_checks_shown) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # count sessions whose edit was reverted
  mutate(revert_rate = paste0(round(n_reverts/n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(
    title = "New content edit revert rate by if multiple checks were shown"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    multiple_checks_shown = "Multiple Check",
    n_edits = "Number of published new content edits",
    n_reverts = "Number of edits reverted ",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits shown at least one reference check')
  )

display_html(as_raw_html(edit_check_revert_bymultiple))
New content edit revert rate by if multiple checks were shown
| Multiple Check | Number of published new content edits | Number of edits reverted | Proportion of new content edits that were reverted |
| --- | --- | --- | --- |
| single check shown | 4073 | 1073 | 26.3% |
| multiple checks shown | 547 | 93 | 17% |
Limited to published new content edits shown at least one reference check
Revert rate by the number of checks shown
Code
edit_check_revert_bynchecks <- edit_check_revert_data %>%
  filter(is_new_content == 1 & was_edit_check_shown == 1) %>% # limit to where shown
  group_by(test_group, checks_shown_bucket) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # count sessions whose edit was reverted
  mutate(revert_rate = paste0(round(n_reverts/n_edits * 100, 1), "%")) %>%
  select(-c(3, 4)) %>% # removing number columns since data is too granular
  group_by(test_group) %>%
  gt() %>%
  tab_header(
    title = "New content edit revert rate by the number of checks shown"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    checks_shown_bucket = "Number of Checks Shown",
    revert_rate = "Proportion of new content edits that were reverted"
  )

display_html(as_raw_html(edit_check_revert_bynchecks))
New content edit revert rate by the number of checks shown
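| Test Group | Number of Checks Shown | Proportion of new content edits that were reverted |
| --- | --- | --- |
| control (single check) | 1 | 24.3% |
| test (multiple checks) | 1 | 29% |
| test (multiple checks) | 2 | 20.2% |
| test (multiple checks) | 3-5 | 16.7% |
| test (multiple checks) | 6-10 | 9.6% |
| test (multiple checks) | 11-15 | 14.3% |
| test (multiple checks) | 16-20 | 30% |
| test (multiple checks) | over 20 | 18.8% |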
New content edit revert rate by platform
Code
edit_check_revert_byplatform <- edit_check_revert_data %>%
  filter(is_new_content == 1 & was_edit_check_shown == 1) %>% # limit to where shown
  group_by(platform, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # count sessions whose edit was reverted
  mutate(revert_rate = paste0(round(n_reverts/n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(
    title = "New content edit revert rate by platform"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    platform = "Platform",
    n_edits = "Number of published new content edits",
    n_reverts = "Number of edits reverted",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits shown at least one reference check')
  )

display_html(as_raw_html(edit_check_revert_byplatform))
New content edit revert rate by platform
| Platform | Test Group | Number of published new content edits | Number of edits reverted | Proportion of new content edits that were reverted |
| --- | --- | --- | --- | --- |
| desktop | control (single check) | 1413 | 232 | 16.4% |
| desktop | test (multiple checks) | 1419 | 294 | 20.7% |
| phone | control (single check) | 900 | 330 | 36.7% |
| phone | test (multiple checks) | 888 | 310 | 34.9% |
Limited to published new content edits shown at least one reference check
New content edit revert rate by user experience
Code
edit_check_revert_byuserexp <- edit_check_revert_data %>%
  filter(is_new_content == 1 & was_edit_check_shown == 1) %>% # limit to where shown
  group_by(experience_level_group, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # count sessions whose edit was reverted
  mutate(revert_rate = paste0(round(n_reverts/n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(
    title = "New content edit revert rate by user experience"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    experience_level_group = "User Status",
    n_edits = "Number of published new content edits",
    n_reverts = "Number of edits reverted",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits shown at least one reference check')
  )

display_html(as_raw_html(edit_check_revert_byuserexp))
New content edit revert rate by user experience
| User Status | Experiment Group | Number of published new content edits | Number of edits reverted | Proportion of new content edits that were reverted |
| --- | --- | --- | --- | --- |
| Unregistered | control (single check) | 1255 | 357 | 28.4% |
| Unregistered | test (multiple checks) | 1282 | 404 | 31.5% |
| Newcomer | control (single check) | 309 | 78 | 25.2% |
| Newcomer | test (multiple checks) | 283 | 67 | 23.7% |
| Junior Contributor | control (single check) | 749 | 127 | 17% |
| Junior Contributor | test (multiple checks) | 742 | 133 | 17.9% |
Limited to published new content edits shown at least one reference check
New content edit revert rate by partner Wikipedia
Code
edit_check_revert_bywiki <- edit_check_revert_data %>%
  filter(is_new_content == 1 & was_edit_check_shown == 1) %>% # limit to where shown
  group_by(wiki, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # count sessions whose edit was reverted
  mutate(revert_rate = paste0(round(n_reverts/n_edits * 100, 1), "%")) %>%
  filter(n_reverts > 50) %>% # remove wikis with too few reverted edits
  gt() %>%
  tab_header(
    title = "New content edit revert rate by partner Wikipedia"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    wiki = "Wikipedia",
    n_edits = "Number of published new content edits",
    n_reverts = "Number of edits reverted",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to wikis with at least 100 published edits')
  )

display_html(as_raw_html(edit_check_revert_bywiki))
New content edit revert rate by partner Wikipedia
| Wikipedia | Experiment Group | Number of published new content edits | Number of edits reverted | Proportion of new content edits that were reverted |
| --- | --- | --- | --- | --- |
| eswiki | control (single check) | 461 | 188 | 40.8% |
| eswiki | test (multiple checks) | 514 | 197 | 38.3% |
| frwiki | control (single check) | 661 | 155 | 23.4% |
| frwiki | test (multiple checks) | 676 | 166 | 24.6% |
| itwiki | control (single check) | 546 | 134 | 24.5% |
| itwiki | test (multiple checks) | 535 | 155 | 29% |
Limited to wikis with at least 100 published edits
Key Insights
There is no significant difference in the revert rate of new content edits between the control and test groups for editing sessions where a reference check was shown. The revert rate of new content edits was 24.3% in the control group and 26.2% in the test group.
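The revert rate of new content edits shown multiple checks is 17%, compared to 26.3% for edits shown a single check. Note that the single-check figure combines control-group and test-group edits, because this table is not split by experiment group.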
There were no significant increases in revert rate based on the number of checks shown. Current trends indicate that edits shown between 3 to 5 reference checks are less likely to be reverted compared to edits shown 2 reference checks. However, the number of edits shown over 2 checks is still limited, and we need more multi-check editing sessions to confirm this trend.
There were no significant differences in revert rate for new content edits published on mobile web (36.7% in the control group compared to 34.9% in the test group). There is currently a slight increase in the new content edit revert rate on desktop for the test group (16.4% to 20.7%); however, this increase was observed only for edits shown a single check, not multiple checks. We will confirm impacts on revert rate in the full AB test analysis.
In the test group, we observed a decrease in new content edit revert rate for newcomers (25.2% in control to 23.7% in the test group). There was a slight increase in revert rate for junior contributors (17% to 17.9%) and for unregistered contributors (28.4% in control to 31.5% in the test group).
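The revert-rate comparisons above can be sanity-checked from the published counts. The sketch below is not part of the original analysis: compare_revert_rates is a hypothetical helper, and it assumes base R's prop.test with default settings. It is applied to the test vs. control comparison and to the single-check vs. multiple-check comparison; the latter is not a randomized comparison (which edits are shown multiple checks depends on the edit itself), so any difference there describes an association only.

```r
# Hypothetical helper: compares two revert rates using prop.test and returns
# the rates (percent), their difference in percentage points, and the p-value.
compare_revert_rates <- function(reverted, edits) {
  test  <- prop.test(x = reverted, n = edits)
  rates <- reverted / edits
  list(
    rates_pct     = round(rates * 100, 1),
    difference_pp = round(unname(rates[2] - rates[1]) * 100, 1),
    p_value       = test$p.value
  )
}

# Counts from "New content edit revert rate by experiment group".
by_group <- compare_revert_rates(
  reverted = c(control = 562, test = 604),
  edits    = c(control = 2313, test = 2307)
)

# Counts from "New content edit revert rate by if multiple checks were shown".
by_checks <- compare_revert_rates(
  reverted = c(single = 1073, multiple = 93),
  edits    = c(single = 4073, multiple = 547)
)

print(by_group)
print(by_checks)
```

Keeping both comparisons in one helper makes it easy to rerun the same check once the complete A/B test data is available.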
Proportion of people blocked after publishing an edit where Multi Check was shown
Methodology: We gathered all edits where edit check was shown from the mediawiki_revision_change_tag table and joined with mediawiki_private_cu_changes to gather user name info. We then reviewed both global and local blocks made within 6 hours of the edit check event as identified in the logging table.
Note: We do not yet have block data for April dates, so the analysis is limited to blocks that occurred between 25 March 2025 and 31 March 2025.
Code
# load data for assessing blocks
edit_check_blocks <-
  read.csv(
    file = 'Queries/data/edit_check_eligible_users_blocked.csv',
    header = TRUE,
    sep = ",",
    stringsAsFactors = FALSE
  )
Code
# rename experiment field to clarify
edit_check_blocks <- edit_check_blocks %>%
  mutate(test_group = factor(bucket,
                             levels = c("2025-03-editcheck-multicheck-reference-control", "2025-03-editcheck-multicheck-reference-test"),
                             labels = c("control (single check)", "test (multiple checks)")))
Block rates by experiment group
Code
edit_check_local_blocks_overall <- edit_check_blocks %>%
  group_by(test_group) %>%
  summarise(blocked_users = n_distinct(cuc_ip[is_local_blocked == 'True' | is_global_blocked == 'True']),
            all_users = n_distinct(cuc_ip)) %>% # look at blocks
  mutate(prop_blocks = paste0(round(blocked_users/all_users * 100, 1), "%")) %>%
  select(-c(2, 3)) %>% # removing granular data columns
  gt() %>%
  tab_header(
    title = "Proportion of users blocked by experiment group"
  ) %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    prop_blocks = "Proportion of users blocked"
  ) %>%
  tab_source_note(
    gt::md('Limited to users blocked 6 hours after publishing an edit where reference check was shown')
  )

display_html(as_raw_html(edit_check_local_blocks_overall))
Proportion of users blocked by experiment group
| Test Group | Proportion of users blocked |
| --- | --- |
| control (single check) | 3% |
| test (multiple checks) | 4.1% |
Limited to users blocked 6 hours after publishing an edit where reference check was shown
Key Insights
3.3% of users were blocked after publishing an edit where at least one reference check was shown. By experiment group, 4.1% of users were blocked in the test group compared to 3% in the control group. This difference is not statistically significant, and the analysis is limited to edits by unregistered users in each group.
No global blocks were issued to any users that published an edit where at least one reference check was shown.