The Editing team is evaluating the impact of Paste Check through an A/B test.
Paste Check is an Edit Check that appears when people paste text they are likely not to have written into an article. The check is an effort to increase the likelihood that new content added to Wikipedia aligns with the Movement’s commitment to offering information under a free content license. Multiple Paste Checks can be shown within a single editing session. More details about this check are available on the Project Page.
The Paste Check A/B test was deployed on 8 October 2025 to 22 partner wikis identified in T405422. Prior to completing the full analysis, we reviewed the following set of leading indicators 2 weeks after starting the Paste A/B Test:
Proportion of edits Paste Check is shown within
Proportion of contributors that are presented Paste Check and complete their edits
Proportion of edits wherein people elect to dismiss/not change the text they’ve added
Proportion of people blocked after publishing an edit where Paste Check was shown
Proportion of published edits that add new content and are reverted within 48 hours
Proportion of edit sessions in which ≥1 Paste Check is shown and people do not interact with at least one of the Paste Checks they were shown
Decision to be made: What – if any – adjustments/investigations will we prioritize for us to be confident moving forward with evaluating the Paste Check’s impact in T399669?
In this A/B test, users in the test group are shown Paste Check when attempting an edit in VisualEditor that meets the requirements for the check to be shown. The control group is provided the default editing experience, where no Paste Check is shown.
We collected two weeks of A/B test events logged between 9 October 2025 and 22 October 2025 on all 22 partner Wikipedias.
We relied on events logged in EditAttemptStep, VisualEditorFeatureUse, and change tags recorded in the revision tags table. See instrumentation spec.
Data was limited to mobile and desktop edits completed in the main (article) namespace using VisualEditor on one of the partner Wikipedias. We also limited the data to edits completed by unregistered users and users with 100 or fewer edits, as those are the users that would be shown Paste Check under the default config settings.
For each leading indicator metric, we reviewed the following dimensions: by experiment group (test and control), by platform (mobile web or desktop), by user experience and status, and by partner Wikipedia. We also reviewed some indicators such as edit completion rate by the number of checks shown within a single editing session.
Note: For the by user experience analysis, we split newer editors into three experience level groups: (1) unregistered, (2) newcomer (registered user making their first edit on Wikipedia), and (3) Junior Contributor (user that has made between 2 and 100 edits).
Results are based on initial A/B test data to check whether any adjustments to the feature need to be prioritized. More event data will be needed to confirm the statistical significance of these findings. We will review the complete A/B test data as part of the analysis in T399669.
Summary of results
Paste Check Frequency
Paste Check was shown at least once at 36% of all published new content edits by newer editors in the test group. For reference, this is significantly higher than rates observed for Tone Check (only 9% of all published new content edits were shown Tone Check).
A higher proportion of published edits on desktop are shown Paste Check (39%) compared to mobile (24%).
Paste Check appears slightly more frequently for newcomers. We observed a 15% increase in the proportion of published new content edits shown Paste Check when limited to users making their first edit on a Wikipedia.
Paste Check Edit Completion Rate
Edits shown Paste Check are completed at a higher rate (52%) than edits in the control group that are eligible but not shown Paste Check (49%). This represents a 6% relative increase.
We currently don’t see any increase in edit abandonment rate even if a large number (>3) Paste Checks are shown in a single session.
We observed increases by both platform types as well. There was a 15% increase (6 percentage points) for mobile web edits and 4% increase (2 percentage points) on desktop.
Edit completion rate increased across all user experience types to differing degrees. There was an 11% increase in edit completion rate for unregistered users, while we only observed a 2.6% increase in edit completion rate for Newcomers (registered users making their first edit).
Paste Check Dismissal Rate (Users select to keep pasted text)
Users selected to keep the pasted text when prompted at 55% of edits shown Paste Check. This edit check dismissal rate is similar to rates observed for Tone Check and Reference Check.
Users are more likely to keep their pasted text on desktop. Users selected to keep the pasted text at 48% of all published mobile web edits where Paste Check was shown compared to 56% of desktop published edits.
Users select “I wrote this content and its not published elsewhere” in over half (54%) of all published edits where the user selected to keep their pasted text. This is the most frequently selected reason for keeping pasted text on both mobile web and desktop.
Registered newcomers (users making their first edit on the Wikipedia) are dismissing Paste Check at higher rates compared to unregistered users or Junior Contributors. These users selected the “I wrote this content…” option at 69.5% of all published edits where Paste Check was dismissed.
Paste Check Revert Rate
Overall, new content edits shown Paste Check are reverted less frequently. We’ve observed a 21.3% decrease in the revert rate of published edits where Paste Check was shown compared to edits that were eligible but not shown Paste Check. Decreases were observed across all reviewed user types (unregistered, newcomers, and Junior Contributors).
Revert rates for both edits shown Paste Check (9.6%) and edits eligible to be shown Paste Check (12.2%) are lower than the revert rates we’ve observed for other types of edits. For example, there’s a 25% revert rate for edits detected as having non-neutral tone (see T371158#11220470).
When split by platform, we see differing trends. On desktop, we’ve observed a 28% decrease in revert rate, while on mobile there’s been a slight 8.8% increase. However, at this point in the A/B test, there’s been a low absolute number of mobile edits shown or eligible to be shown Paste Check that have been reverted (<50 edits). We will need more data to confirm any trends.
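As a worked example of how the overall figure can be read, the 21.3% decrease is consistent with a simple relative difference between the two overall revert rates reported above (9.6% for edits shown Paste Check, 12.2% for eligible edits not shown it). The symbols r_test and r_control are introduced here for illustration only, and the formula is an assumption about how the figure was derived:

```latex
\[
\text{relative change}
  = \frac{r_{\text{test}} - r_{\text{control}}}{r_{\text{control}}}
  = \frac{9.6\% - 12.2\%}{12.2\%}
  \approx -21.3\%
\]
```

The absolute difference in this case is 2.6 percentage points; the relative figure is larger because the baseline revert rate is itself low.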
Paste Check Interaction Rate
At 45% of all editing sessions where Paste Check was shown, people did not interact with one or more of the Paste Checks presented.
This breaks down as follows: in 35% of editing sessions, people did not interact with any of the Paste Checks presented. In the other 10%, people were presented multiple Paste Checks and did not interact with one or more of them.
There’s no variation in interaction rate by platform type.
Paste Check Frequency
Question: Are newer editors encountering Paste Check?
Methodology: We reviewed the proportion of published new content edits where at least one Paste Check was shown during the editing session (event.feature = 'editCheck-paste' AND event.action IN ('check-shown-midedit')).
This analysis was specifically limited to edits that were successfully published and identified as new content edits with the tag editcheck-newcontent.
Code
# load frequency data
paste_check_frequency_data <- read.csv(
  file = 'data/paste_check_frequency_data.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)
Code
# Cleaning up dataset and renaming fields to clarify meanings
# Set experience level group and factor levels
paste_check_frequency_data <- paste_check_frequency_data %>%
  mutate(
    experience_level_group = case_when(
      user_edit_count == 0 & user_status == 'registered' ~ 'Newcomer',
      user_edit_count == 0 & user_status == 'unregistered' ~ 'Unregistered',
      user_edit_count > 0 & user_edit_count <= 100 ~ "Junior Contributor",
      user_edit_count > 100 ~ "Non-Junior Contributor"
    ),
    experience_level_group = factor(
      experience_level_group,
      levels = c("Unregistered", "Newcomer", "Non-Junior Contributor", "Junior Contributor")
    )
  )

# rename experiment field to clarify
paste_check_frequency_data <- paste_check_frequency_data %>%
  mutate(test_group = factor(
    test_group,
    levels = c('2025-09-editcheck-paste-control', '2025-09-editcheck-paste-test'),
    labels = c("control (no paste check)", "test (paste check available)")
  ))

# rename platform from phone to mobile web to clarify meaning
paste_check_frequency_data <- paste_check_frequency_data %>%
  mutate(platform = factor(
    platform,
    levels = c('phone', 'desktop'),
    labels = c("mobile web", "desktop")
  ))
Code
# Set fields and factor levels to assess number of checks shown
paste_check_frequency_data <- paste_check_frequency_data %>%
  mutate(
    multiple_checks_shown = ifelse(n_checks_shown > 1, 1, 0),
    multiple_checks_shown = factor(multiple_checks_shown, levels = c(0, 1))
  )

# note these buckets can be adjusted as needed based on distribution of data
paste_check_frequency_data <- paste_check_frequency_data %>%
  mutate(
    checks_shown_bucket = case_when(
      is.na(n_checks_shown) ~ '0',
      n_checks_shown == 1 ~ '1',
      n_checks_shown == 2 ~ '2',
      n_checks_shown > 2 & n_checks_shown <= 5 ~ "3-5",
      n_checks_shown > 5 & n_checks_shown <= 10 ~ "6-10",
      n_checks_shown > 10 ~ "over 10"
    ),
    checks_shown_bucket = factor(
      checks_shown_bucket,
      levels = c("0", "1", "2", "3-5", "6-10", "over 10")
    )
  )
Overall
Code
paste_checks_shown_saved_overall <- paste_check_frequency_data %>%
  filter(test_group == "test (paste check available)" # limit to test group edits
         & is_new_content == 1) %>% # limit to published new content edits
  group_by(test_group) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_refcheck = n_distinct(editing_session[was_paste_check_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_refcheck / n_editing_session * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Published new content edits shown at least one Paste Check") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    n_editing_session = "Number of edits",
    n_editing_session_refcheck = "Number of edits shown Paste Check",
    prop_check_shown = "Proportion of edits shown Paste Check"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits by unregistered users and users with 100 or fewer edits')
  )

display_html(as_raw_html(paste_checks_shown_saved_overall))
Published new content edits shown at least one Paste Check

| Experiment Group | Number of edits | Number of edits shown Paste Check | Proportion of edits shown Paste Check |
|---|---|---|---|
| test (paste check available) | 954 | 343 | 36% |

Limited to published new content edits by unregistered users and users with 100 or fewer edits
Paste Check was shown at least once at 36% of all published new content edits by newer editors in the test group. This is significantly higher than rates observed for Tone Check (only 9% of all published new content edits were shown Tone Check).
Note: This also aligns with frequency rates previously identified in T403861
By if multiple checks were shown
Code
paste_checks_shown_saved_bymultiple <- paste_check_frequency_data %>%
  filter(test_group == "test (paste check available)" & # limit to test group edits
         is_new_content == 1) %>% # limit to published new content edits
  group_by(test_group) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_multicheck = n_distinct(editing_session[was_paste_check_shown == 1 & multiple_checks_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_multicheck / n_editing_session * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Published new content edits shown multiple Paste Checks") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    n_editing_session = "Number of edits",
    n_editing_session_multicheck = "Number of edits shown multiple Paste Checks",
    prop_check_shown = "Proportion of edits shown multiple Paste Checks"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits by unregistered users and users with 100 or fewer edits')
  )

display_html(as_raw_html(paste_checks_shown_saved_bymultiple))
Published new content edits shown multiple Paste Checks

| Experiment Group | Number of edits | Number of edits shown multiple Paste Checks | Proportion of edits shown multiple Paste Checks |
|---|---|---|---|
| test (paste check available) | 954 | 109 | 11.4% |

Limited to published new content edits by unregistered users and users with 100 or fewer edits
Around 11.4% of all published new content edits were shown more than one Paste Check in a session.
By number of checks shown
Code
paste_checks_shown_saved_bynchecks <- paste_check_frequency_data %>%
  filter(test_group == "test (paste check available)" & # limit to test group edits
         is_new_content == 1 & was_paste_check_shown == 1) %>% # limit to published new content edits
  mutate(total_sessions = n_distinct(editing_session)) %>%
  group_by(total_sessions, checks_shown_bucket) %>%
  summarise(n_editing_session_tonecheck = n_distinct(editing_session)) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_tonecheck / total_sessions * 100, 2), "%")) %>%
  ungroup() %>%
  select(-c(1, 3)) %>%
  # mutate(n_editing_session_tonecheck = ifelse(n_editing_session_tonecheck < 50, "<50", n_editing_session_refcheck)) %>% # sanitizing per data publication guidelines
  gt() %>%
  tab_header(title = "Published new content edits by number of Paste Checks shown") %>%
  opt_stylize(5) %>%
  cols_label(
    checks_shown_bucket = "Number of Paste Checks shown",
    # n_editing_session_tonecheck = "Number of edits",
    prop_check_shown = "Proportion of edits"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits shown at least one Paste Check')
  )

display_html(as_raw_html(paste_checks_shown_saved_bynchecks))
Published new content edits by number of Paste Checks shown

| Number of Paste Checks shown | Proportion of edits |
|---|---|
| 1 | 68.22% |
| 2 | 13.99% |
| 3-5 | 13.99% |
| 6-10 | 2.92% |
| over 10 | 0.87% |

Limited to published new content edits shown at least one Paste Check
Paste Check was shown only once in the majority of all editing sessions shown Paste Check (68%).
For sessions where multiple checks are shown, the majority (88%) are shown between 2 and 5 Paste Checks. Only a small share of all sessions shown Paste Check (<4%) were presented 6 or more.
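The two shares quoted here can be reproduced from the published bucket proportions. The snippet below is a minimal sketch in base R; the vector `prop_by_bucket` and the other variable names are introduced here for illustration, and the values are copied from the table above rather than recomputed from the underlying data.

```r
# Proportions (in percent) of published new content edits shown Paste Check,
# by number of checks shown; values copied from the table above.
prop_by_bucket <- c("1" = 68.22, "2" = 13.99, "3-5" = 13.99,
                    "6-10" = 2.92, "over 10" = 0.87)

# Sessions shown more than one Paste Check
multi <- prop_by_bucket[names(prop_by_bucket) != "1"]

# Share of multi-check sessions that were shown between 2 and 5 checks
share_2_to_5 <- sum(multi[c("2", "3-5")]) / sum(multi)

# Share of all sessions shown Paste Check that were shown 6 or more checks
share_6_plus <- sum(prop_by_bucket[c("6-10", "over 10")])

round(share_2_to_5 * 100)  # 88
round(share_6_plus, 2)     # 3.79
```

Note that the 88% is a share of multi-check sessions only, while the <4% is a share of all sessions shown at least one Paste Check.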
By platform
Code
paste_checks_shown_byplatform <- paste_check_frequency_data %>%
  filter(test_group == "test (paste check available)" & # limit to test group edits
         is_new_content == 1) %>% # limit to published new content edits
  group_by(platform) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_tonecheck = n_distinct(editing_session[was_paste_check_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_tonecheck / n_editing_session * 100, 1), "%")) %>%
  mutate(n_editing_session_tonecheck = ifelse(n_editing_session_tonecheck < 50, "<50", n_editing_session_tonecheck)) %>% # sanitizing per data publication guideline
  # select(-2) %>% # removing total number of edits column to sanitize data for publication
  gt() %>%
  tab_header(title = "Published new content edits shown Paste Check by platform") %>%
  opt_stylize(5) %>%
  cols_label(
    platform = "Platform",
    n_editing_session = "Number of edits", # labelled because the column is kept in the output
    n_editing_session_tonecheck = "Number of edits shown Paste Check",
    prop_check_shown = "Proportion of edits shown Paste Check"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits by unregistered users and users with 100 or fewer edits')
  )

display_html(as_raw_html(paste_checks_shown_byplatform))
Published new content edits shown Paste Check by platform

| Platform | Number of edits | Number of edits shown Paste Check | Proportion of edits shown Paste Check |
|---|---|---|---|
| mobile web | 208 | 50 | 24% |
| desktop | 746 | 293 | 39.3% |

Limited to published new content edits by unregistered users and users with 100 or fewer edits
A higher proportion of edits on desktop are shown Paste Check (39%) compared to mobile (24%). Based on the table above, this is a difference of about 15 percentage points, or a relative increase of roughly 63%, in the proportion of desktop edits shown Paste Check compared to mobile web.
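The platform gap can be expressed either in percentage points or as a relative difference, and the two read quite differently. The following base R sketch computes both from the counts in the table above; the variable names are illustrative and not part of the original analysis code.

```r
# Counts copied from the platform table above (test group, published new content edits)
shown <- c(mobile_web = 50, desktop = 293)   # edits shown at least one Paste Check
total <- c(mobile_web = 208, desktop = 746)  # all published new content edits

rate <- shown / total

# Absolute gap in percentage points (desktop minus mobile web)
pp_diff <- unname(rate["desktop"] - rate["mobile_web"]) * 100

# Relative difference, using mobile web as the baseline
rel_diff <- unname(rate["desktop"] / rate["mobile_web"] - 1) * 100

round(rate * 100, 1)  # mobile_web 24.0, desktop 39.3
round(pp_diff, 1)     # 15.2
round(rel_diff, 1)    # 63.4
```

Reporting both figures side by side avoids ambiguity about which kind of "increase" is meant.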
By user experience
Code
paste_checks_shown_byuser_status <- paste_check_frequency_data %>%
  filter(test_group == "test (paste check available)" & # limit to test group edits
         is_new_content == 1) %>% # limit to published new content edits
  group_by(experience_level_group) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_pastecheck = n_distinct(editing_session[was_paste_check_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_pastecheck / n_editing_session * 100, 1), "%")) %>%
  select(-2) %>% # removing total number of edits column to sanitize data for publication
  gt() %>%
  tab_header(title = "Published new content edits shown Paste Check by user experience") %>%
  opt_stylize(5) %>%
  cols_label(
    experience_level_group = "User Experience",
    # n_editing_session = "Number of edits",
    n_editing_session_pastecheck = "Number of edits shown Paste Check",
    prop_check_shown = "Proportion of edits shown Paste Check"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits by unregistered users and users with 100 or fewer edits')
  )

display_html(as_raw_html(paste_checks_shown_byuser_status))
Published new content edits shown Paste Check by user experience

| User Experience | Number of edits shown Paste Check | Proportion of edits shown Paste Check |
|---|---|---|
| Unregistered | 63 | 35% |
| Newcomer | 59 | 40.4% |
| Junior Contributor | 221 | 35.2% |

Limited to published new content edits by unregistered users and users with 100 or fewer edits
Paste Check appears slightly more frequently for newcomers. There is a 15% relative increase in the proportion of published new content edits shown Paste Check when limited to users making their first edit.
By partner Wikipedia
Code
paste_checks_shown_bywiki <- paste_check_frequency_data %>%
  filter(test_group == "test (paste check available)" & # limit to test group edits
         is_new_content == 1) %>% # saved test group edits
  group_by(wiki) %>%
  summarise(
    n_editing_session = n_distinct(editing_session),
    n_editing_session_pastecheck = n_distinct(editing_session[was_paste_check_shown == 1])
  ) %>%
  mutate(prop_check_shown = paste0(round(n_editing_session_pastecheck / n_editing_session * 100, 1), "%")) %>%
  filter(n_editing_session > 50) %>%
  mutate(n_editing_session_pastecheck = ifelse(n_editing_session_pastecheck < 50, "<50", n_editing_session_pastecheck)) %>% # sanitizing per data publication guideline
  select(-2) %>% # removing total number of edits column to sanitize data for publication
  gt() %>%
  tab_header(title = "Published new content edits shown Paste Check by Wikipedia") %>%
  opt_stylize(5) %>%
  cols_label(
    wiki = "Wikipedia",
    # n_editing_session = "Number of edits",
    n_editing_session_pastecheck = "Number of edits shown Paste Check",
    prop_check_shown = "Proportion of edits shown Paste Check"
  ) %>%
  tab_source_note(
    gt::md('Limited to wikis with at least 50 published new content edits during reviewed timeframe')
  )

display_html(as_raw_html(paste_checks_shown_bywiki))
Published new content edits shown Paste Check by Wikipedia

| Wikipedia | Number of edits shown Paste Check | Proportion of edits shown Paste Check |
|---|---|---|
| dewiki | 66 | 40.2% |
| fawiki | <50 | 29% |
| idwiki | <50 | 40.4% |
| itwiki | <50 | 31.1% |
| plwiki | <50 | 26.1% |
| ruwiki | 66 | 39.5% |
| ukwiki | <50 | 33.3% |

Limited to wikis with at least 50 published new content edits during reviewed timeframe
No significant differences in frequency by Wikipedia. Note: There has been only one editing session shown Paste Check at fonwiki, glwiki, and hiwiki. No Paste Checks have been shown at euwiki at the time of this analysis.
Paste Check Edit Completion Rate
Question: Do newer editors understand the feature?
Methodology: We reviewed the proportion of edits where Paste Check was shown at least once during the edit session and that were successfully published (event.action = saveSuccess). These edits were compared to the completion rate of edits in the control group that were eligible but not shown Paste Check, as implemented in T402460.
Note: This analysis excludes edits that were abandoned prior to reaching the point where Paste Check was or would have been shown.
Code
# load data for assessing edit completion rate
edit_completion_rates <- read.csv(
  file = 'data/paste_edit_completion_rate.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)
Code
# Set experience level group and factor levels
edit_completion_rates <- edit_completion_rates %>%
  mutate(
    experience_level_group = case_when(
      user_edit_count == 0 & user_status == 'registered' ~ 'Newcomer',
      user_edit_count == 0 & user_status == 'unregistered' ~ 'Unregistered',
      user_edit_count > 0 & user_edit_count <= 100 ~ "Junior Contributor",
      user_edit_count > 100 ~ "Non-Junior Contributor"
    ),
    experience_level_group = factor(
      experience_level_group,
      levels = c("Unregistered", "Newcomer", "Non-Junior Contributor", "Junior Contributor")
    )
  )

# rename experiment field to clarify
edit_completion_rates <- edit_completion_rates %>%
  mutate(test_group = factor(
    test_group,
    levels = c('2025-09-editcheck-paste-control', '2025-09-editcheck-paste-test'),
    labels = c("control (not shown Paste Check)", "test (shown Paste Check)")
  ))

# rename platform from phone to mobile web to clarify meaning
edit_completion_rates <- edit_completion_rates %>%
  mutate(platform = factor(
    platform,
    levels = c('phone', 'desktop'),
    labels = c("mobile web", "desktop")
  ))
Code
# Set fields and factor levels to assess number of checks shown
edit_completion_rates <- edit_completion_rates %>%
  mutate(
    multiple_checks_shown = ifelse(n_checks_shown > 1, "multiple checks shown", "one check shown"),
    multiple_checks_shown = factor(
      multiple_checks_shown,
      levels = c("one check shown", "multiple checks shown")
    )
  )

# note these buckets can be adjusted as needed based on distribution of data
edit_completion_rates <- edit_completion_rates %>%
  mutate(
    checks_shown_bucket = case_when(
      is.na(n_checks_shown) ~ '0',
      n_checks_shown == 1 ~ '1',
      n_checks_shown == 2 ~ '2',
      n_checks_shown > 2 & n_checks_shown <= 5 ~ "3-5",
      n_checks_shown > 5 & n_checks_shown <= 10 ~ "6-10",
      n_checks_shown > 10 ~ "over 10"
    ),
    checks_shown_bucket = factor(
      checks_shown_bucket,
      levels = c("0", "1", "2", "3-5", "6-10", "over 10")
    )
  )
Overall
Code
edit_completion_rate_overall <- edit_completion_rates %>%
  filter(paste_check_shown == 1) %>% # limit to sessions where Paste Check was shown
  group_by(test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%"))
Code
# plot visualization of overall edit completion rates
dodge <- position_dodge(width = 0.9)

p <- edit_completion_rate_overall %>%
  ggplot(aes(x = test_group, y = n_saves / n_edits)) +
  geom_col(position = 'dodge', fill = 'dodgerblue4') +
  scale_y_continuous(labels = scales::percent) +
  geom_text(aes(label = paste(completion_rate), fontface = 2), vjust = 1.2, size = 10, color = "white") +
  scale_fill_manual(values = cbPalette, name = "Reason") +
  labs(
    y = "Percent of edit attempts completed ",
    x = "Experiment Group",
    title = "Paste Check edit completion rate",
    caption = "Limited to edit attempts shown or eligible to be shown at least one Paste Check"
  ) +
  theme(
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    plot.title = element_text(hjust = 0.5),
    text = element_text(size = 24),
    legend.position = "none",
    axis.line = element_line(colour = "black")
  )

p
Edits shown Paste Check are completed at a higher rate (52%) than edits in the control group that are eligible but not shown Paste Check (49%). This represents a 6% relative increase.
By if multiple checks were shown
Code
edit_completion_rate_bymulti <- edit_completion_rates %>%
  filter(paste_check_shown == 1 & test_group == 'test (shown Paste Check)') %>%
  group_by(test_group, multiple_checks_shown) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Paste Check edit completion rate by if multiple checks were shown") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment group",
    multiple_checks_shown = "Multiple Paste Checks shown",
    n_edits = "Number of edit attempts shown Paste Check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edit attempts shown or eligible to be shown at least one Paste Check')
  )

display_html(as_raw_html(edit_completion_rate_bymulti))
Paste Check edit completion rate by if multiple checks were shown

| Experiment group | Multiple Paste Checks shown | Number of edit attempts shown Paste Check | Number of published edits | Proportion of edits saved |
|---|---|---|---|---|
| test (shown Paste Check) | one check shown | 2447 | 1227 | 50.1% |
| test (shown Paste Check) | multiple checks shown | 1086 | 608 | 56% |

Limited to edit attempts shown or eligible to be shown at least one Paste Check
Interestingly, the edit completion rate of edits shown multiple Paste Checks is higher than edits shown a single Paste Check. We observed a similar trend for Tone Check as well.
This may be indicative of the type of edit where multiple checks are shown. For example, edits shown multiple checks are likely longer text edits where the user is more motivated to save their edit.
By number of checks shown
Code
edit_completion_rate_bynchecks <- edit_completion_rates %>%
  filter(paste_check_shown == 1 & test_group == 'test (shown Paste Check)') %>% # limit to Paste Checks shown and test group
  group_by(test_group, checks_shown_bucket) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  ungroup() %>%
  mutate(
    n_edits = ifelse(n_edits < 50, "<50", n_edits),
    n_saves = ifelse(n_saves < 50, "<50", n_saves)
  ) %>% # sanitizing per data publication guidelines
  group_by(test_group) %>%
  gt() %>%
  tab_header(title = "Paste Check edit completion rate by the number of checks shown") %>%
  opt_stylize(5) %>%
  cols_label(
    checks_shown_bucket = "Number of Paste Checks shown",
    n_edits = "Number of edit attempts shown Paste Check",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edit attempts shown or eligible to be shown at least one Paste Check')
  )

display_html(as_raw_html(edit_completion_rate_bynchecks))
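Paste Check edit completion rate by the number of checks shown

| Experiment group | Number of Paste Checks shown | Number of edit attempts shown Paste Check | Number of published edits | Proportion of edits saved |
|---|---|---|---|---|
| test (shown Paste Check) | 1 | 2447 | 1227 | 50.1% |
| test (shown Paste Check) | 2 | 537 | 287 | 53.4% |
| test (shown Paste Check) | 3-5 | 412 | 229 | 55.6% |
| test (shown Paste Check) | 6-10 | 106 | 66 | 62.3% |
| test (shown Paste Check) | over 10 | <50 | <50 | 83.9% |

Limited to edit attempts shown or eligible to be shown at least one Paste Check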
We currently don’t see any increase in edit abandonment rate even if a large number (>3) Paste Checks are shown in a single session.
By platform
Code
edit_completion_rate_byplatform <- edit_completion_rates %>%
  filter(paste_check_shown == 1) %>%
  group_by(platform, test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  # mutate(n_saves = ifelse(n_saves < 50, "<50", n_saves)) %>% # sanitizing per data publication guideline
  select(-c(3, 4)) %>%
  gt() %>%
  tab_header(title = "Paste Check edit completion rate by platform") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    platform = "Platform",
    # n_edits = "Number of edit attempts shown tone check",
    # n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edit attempts shown or eligible to be shown at least one Paste Check')
  )

display_html(as_raw_html(edit_completion_rate_byplatform))
Paste Check edit completion rate by platform

| Platform | Experiment Group | Proportion of edits saved |
|---|---|---|
| mobile web | control (not shown Paste Check) | 39.1% |
| mobile web | test (shown Paste Check) | 45.1% |
| desktop | control (not shown Paste Check) | 51.4% |
| desktop | test (shown Paste Check) | 53.5% |

Limited to edit attempts shown or eligible to be shown at least one Paste Check
We observed increases on both platform types as well. There was a 15% relative increase (6 percentage points) for mobile web edits and a 4% relative increase (2 percentage points) on desktop.
By user experience
Code
edit_completion_rate_byuserstatus <- edit_completion_rates %>%
  filter(paste_check_shown == 1) %>%
  group_by(experience_level_group, test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  # select(-c(3,4)) %>% # data sanitizing for publication
  gt() %>%
  tab_header(title = "Paste check edit completion rate by editor experience") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    experience_level_group = "User Experience",
    n_edits = "Number of edit attempts",
    n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to edit attempts shown or eligible to be shown at least one Paste Check')
  )

display_html(as_raw_html(edit_completion_rate_byuserstatus))
Paste check edit completion rate by editor experience

| User Experience | Test Group | Number of edit attempts | Number of published edits | Proportion of edits saved |
|---|---|---|---|---|
| Unregistered | control (not shown Paste Check) | 1043 | 433 | 41.5% |
| Unregistered | test (shown Paste Check) | 1147 | 530 | 46.2% |
| Newcomer | control (not shown Paste Check) | 604 | 232 | 38.4% |
| Newcomer | test (shown Paste Check) | 622 | 245 | 39.4% |
| Junior Contributor | control (not shown Paste Check) | 1627 | 950 | 58.4% |
| Junior Contributor | test (shown Paste Check) | 1764 | 1060 | 60.1% |

Limited to edit attempts shown or eligible to be shown at least one Paste Check
Edit completion rate increased across all user experience types to differing degrees. There was an 11% increase in edit completion rate for unregistered users, while we only observed a 2.6% increase in edit completion rate for Newcomers (registered users making their first edit).
Note: There was concern that newcomers could be discouraged by the interface copy that suggests their account could be blocked for introducing a copyright violation; however, we are not currently seeing any indication this is increasing abandonment rate.
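Because these differences have not yet been tested for statistical significance, one possible check is a two-sample test of proportions per experience group. The sketch below uses base R's `prop.test` with the counts from the table above. It is an illustration rather than the planned analysis for T399669: it assumes editing sessions are independent, and the data frame `completion` and its column names are introduced here for the example.

```r
# Counts copied from the "by editor experience" table above
completion <- data.frame(
  group         = c("Unregistered", "Newcomer", "Junior Contributor"),
  control_edits = c(1043, 604, 1627),
  control_saves = c(433, 232, 950),
  test_edits    = c(1147, 622, 1764),
  test_saves    = c(530, 245, 1060)
)

# Two-sided test of equal completion rates in test vs control, per group.
# prop.test applies a continuity correction by default.
completion$p_value <- mapply(
  function(cs, ce, ts, te) prop.test(x = c(ts, cs), n = c(te, ce))$p.value,
  completion$control_saves, completion$control_edits,
  completion$test_saves, completion$test_edits
)

print(completion[, c("group", "p_value")])
```

With several groups tested at once, a multiple-comparison adjustment (for example via `p.adjust`) would be worth considering before drawing conclusions.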
By partner Wikipedia
Code
edit_completion_rate_bywiki <- edit_completion_rates %>%
  filter(paste_check_shown == 1) %>%
  group_by(wiki, test_group) %>%
  summarise(
    n_edits = n_distinct(editing_session),
    n_saves = n_distinct(editing_session[saved_edit > 0])
  ) %>%
  mutate(completion_rate = paste0(round(n_saves / n_edits * 100, 1), "%")) %>%
  filter(n_edits > 100) %>% # limit to wikis with sufficient events
  select(-c(3, 4)) %>% # data sanitizing for publication
  gt() %>%
  tab_header(title = "Paste Check edit completion rate by Wikipedia") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    wiki = "Wikipedia",
    # n_edits = "Number of edit attempts shown tone check",
    # n_saves = "Number of published edits",
    completion_rate = "Proportion of edits saved"
  ) %>%
  tab_source_note(
    gt::md('Limited to Wikipedias with at least 100 edit attempts during reviewed timeframe')
  )

display_html(as_raw_html(edit_completion_rate_bywiki))
Paste Check edit completion rate by Wikipedia

| Wikipedia | Test Group | Proportion of edits saved |
|---|---|---|
| arwiki | control (not shown Paste Check) | 16.6% |
| arwiki | test (shown Paste Check) | 21.5% |
| dewiki | control (not shown Paste Check) | 56.9% |
| dewiki | test (shown Paste Check) | 62.6% |
| fawiki | control (not shown Paste Check) | 37.4% |
| fawiki | test (shown Paste Check) | 51.7% |
| idwiki | control (not shown Paste Check) | 61.6% |
| idwiki | test (shown Paste Check) | 57.9% |
| itwiki | control (not shown Paste Check) | 46.7% |
| itwiki | test (shown Paste Check) | 55.3% |
| nlwiki | control (not shown Paste Check) | 49.6% |
| nlwiki | test (shown Paste Check) | 51.2% |
| plwiki | control (not shown Paste Check) | 62.6% |
| plwiki | test (shown Paste Check) | 57.1% |
| ruwiki | control (not shown Paste Check) | 56.5% |
| ruwiki | test (shown Paste Check) | 52.9% |
| ukwiki | control (not shown Paste Check) | 53.8% |
| ukwiki | test (shown Paste Check) | 51.9% |
| zhwiki | control (not shown Paste Check) | 40.6% |
| zhwiki | test (shown Paste Check) | 53.8% |

Limited to Wikipedias with at least 100 edit attempts during reviewed timeframe
Results are slightly more variable on a per wiki basis. However, there are no current signs of significant edit abandonment rate for edits shown Paste Check. We observed increases in edit completion rate across all partner wikis except for 4 (Note: This excludes wikis with under 100 eligible edit attempt sessions at the time of this review).
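The count of wikis with a lower completion rate in the test group can be checked directly against the published rates. This base R sketch uses a hypothetical data frame `wiki_rates` built from the table above; it works with the rounded percentages, so changes are approximate.

```r
# Completion rates (in percent) copied from the per-wiki table above
wiki_rates <- data.frame(
  wiki    = c("arwiki", "dewiki", "fawiki", "idwiki", "itwiki",
              "nlwiki", "plwiki", "ruwiki", "ukwiki", "zhwiki"),
  control = c(16.6, 56.9, 37.4, 61.6, 46.7, 49.6, 62.6, 56.5, 53.8, 40.6),
  test    = c(21.5, 62.6, 51.7, 57.9, 55.3, 51.2, 57.1, 52.9, 51.9, 53.8)
)

# Change in completion rate, in percentage points (test minus control)
wiki_rates$pp_change <- round(wiki_rates$test - wiki_rates$control, 1)

# Wikis where the test group completed edits at a lower rate than control
decreased <- wiki_rates$wiki[wiki_rates$pp_change < 0]
decreased  # "idwiki" "plwiki" "ruwiki" "ukwiki"
```

The four wikis with decreases are the exceptions referred to above; the remaining six show increases of between roughly 2 and 14 percentage points.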
Paste Check Dismissal Rate (Users that select to keep pasted text)
Question: Do people find Paste Check relevant?
Methodology: We reviewed the proportion of published edits shown Paste Check wherein people elected to keep the text they added (i.e. the Paste Check was dismissed). An edit counts as dismissed if the user dismissed a Paste Check at least once in the session (event.feature = 'editCheck-paste' AND event.action = 'action-keep'). The analysis includes splits by the reason the user selected for keeping the text.
Note: We will use these results when considering the decision made in T406164#11247475 about how to present Paste Check on mobile.
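As a rough illustration of the dismissal definition above, the following base R sketch flags sessions with at least one `action-keep` event on the `editCheck-paste` feature. The toy `events` table, its column names, and the `other-feature` value are hypothetical; only the feature and action values `editCheck-paste`, `action-keep`, and `action-remove` come from this report.

```r
# Hypothetical event-level rows; the real schema may differ
events <- data.frame(
  editing_session = c("s1", "s1", "s2", "s3", "s3"),
  feature = c("editCheck-paste", "editCheck-paste", "editCheck-paste",
              "other-feature", "editCheck-paste"),
  action  = c("action-keep", "action-remove", "action-remove",
              "action-keep", "action-remove"),
  stringsAsFactors = FALSE
)

# Keep only Paste Check events
paste_events <- events[events$feature == "editCheck-paste", ]

# A session counts as dismissed if it has at least one action-keep event
kept <- tapply(paste_events$action == "action-keep",
               paste_events$editing_session, any)

dismissed_sessions <- names(kept)[as.vector(kept)]
dismissal_rate <- mean(kept)

print(dismissed_sessions)
print(dismissal_rate)
```

Counting sessions rather than events means a session with several dismissals is counted once, which matches the "at least once in a session" wording.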
Code
# load data for assessing edit reject frequency
edit_check_reject_data <- read.csv(
  file = 'data/paste_check_rejects_data.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)
Code
# Set experience level group and factor levels
edit_check_reject_data <- edit_check_reject_data %>%
  mutate(
    experience_level_group = case_when(
      user_edit_count == 0 & user_status == 'registered' ~ 'Newcomer',
      user_edit_count == 0 & user_status == 'unregistered' ~ 'Unregistered',
      user_edit_count > 0 & user_edit_count <= 100 ~ "Junior Contributor",
      user_edit_count > 100 ~ "Non-Junior Contributor"
    ),
    experience_level_group = factor(
      experience_level_group,
      levels = c("Unregistered", "Newcomer", "Non-Junior Contributor", "Junior Contributor")
    )
  )

# rename experiment field to clarify
edit_check_reject_data <- edit_check_reject_data %>%
  mutate(test_group = factor(
    test_group,
    levels = c('2025-09-editcheck-paste-control', '2025-09-editcheck-paste-test'),
    labels = c("control (no Paste Check)", "test (shown Paste Check)")
  ))

# rename platform from phone to mobile web to clarify meaning
edit_check_reject_data <- edit_check_reject_data %>%
  mutate(platform = factor(
    platform,
    levels = c('phone', 'desktop'),
    labels = c("mobile web", "desktop")
  ))
Code
# Set fields and factor levels to assess number of checks shown
edit_check_reject_data <- edit_check_reject_data %>%
  mutate(
    multiple_checks_shown = ifelse(n_checks_shown > 1, "multiple checks shown", "single check shown"),
    multiple_checks_shown = factor(multiple_checks_shown,
                                   levels = c("single check shown", "multiple checks shown"))
  )

# note these buckets can be adjusted as needed based on distribution of data
edit_check_reject_data <- edit_check_reject_data %>%
  mutate(
    checks_shown_bucket = case_when(
      is.na(n_checks_shown) ~ '0',
      n_checks_shown == 1 ~ '1',
      n_checks_shown == 2 ~ '2',
      n_checks_shown > 2 & n_checks_shown <= 5 ~ "3-5",
      n_checks_shown > 5 & n_checks_shown <= 10 ~ "6-10",
      n_checks_shown > 10 ~ "over 10"
    ),
    checks_shown_bucket = factor(checks_shown_bucket,
                                 levels = c("0", "1", "2", "3-5", "6-10", "over 10"))
  )
Code
# shorten and clarify reason field names
edit_check_reject_data <- edit_check_reject_data %>%
  mutate(
    reject_reason = case_when(
      reject_reason == 'no_reject_reason' ~ 'No reason provided',
      reject_reason == 'edit-check-feedback-reason-other' ~ 'None applies',
      reject_reason == 'edit-check-feedback-reason-wrote' ~ 'I wrote content',
      reject_reason == 'edit-check-feedback-reason-permission' ~ 'I have permission'
    ),
    reject_reason = factor(
      reject_reason,
      levels = c("No reason provided", "None applies", "I wrote content", "I have permission")
    )
  )
Overall
Code
# overall dismissal rate
edit_check_dismissal_overall <- edit_check_reject_data %>%
  filter(was_paste_check_shown == 1) %>% # limit to where shown
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0])) %>% # sessions with at least one dismissal
  mutate(dismissal_rate = paste0(round(n_rejects / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Paste Check dismissal rate") %>%
  opt_stylize(5) %>%
  cols_label(
    n_edits = "Number of edits shown Paste Check",
    n_rejects = "Number of edits that dismissed Paste Check",
    dismissal_rate = "Proportion of edits where Paste Check was dismissed"
  ) %>%
  tab_source_note(
    gt::md('Limited to published edits where at least one Paste Check was shown')
  )

display_html(as_raw_html(edit_check_dismissal_overall))
Paste Check dismissal rate

| Number of edits shown Paste Check | Number of edits that dismissed Paste Check | Proportion of edits where Paste Check was dismissed |
|---|---|---|
| 1655 | 903 | 54.6% |

Limited to published edits where at least one Paste Check was shown
Users elected to keep the pasted text in 55% (903 of 1,655) of published edits where Paste Check was shown. This dismissal rate is similar to the rates observed for Tone Check and Reference Check.
By dismissal reason
Code
edit_check_dismissal_byreason_overall <- edit_check_reject_data %>%
  filter(was_paste_check_shown == 1 & n_rejects > 0) %>% # limit to where shown and user elected to keep text
  group_by(reject_reason) %>%
  summarise(n_edits_rejected = n_distinct(editing_session)) %>%
  mutate(select_rate = paste0(round(n_edits_rejected / sum(n_edits_rejected) * 100, 1), "%"))
Code
# plot bar chart of reason selection
dodge <- position_dodge(width = 0.9)

p <- edit_check_dismissal_byreason_overall %>%
  ggplot(aes(x = reject_reason, y = n_edits_rejected / sum(n_edits_rejected))) +
  geom_col(position = 'dodge', fill = 'dodgerblue4') +
  scale_y_continuous(labels = scales::percent) +
  geom_text(aes(label = paste(select_rate, "\n", n_edits_rejected, "edits"), fontface = 2),
            vjust = 1.2, size = 10, color = "white") +
  scale_fill_manual(values = cbPalette, name = "Reason") +
  labs(y = "Percent of edits ",
       x = "Selected reason",
       title = "Reasons users selected for keeping pasted text",
       caption = "Limited to published edits where a user selected to keep pasted text") +
  theme(panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        plot.title = element_text(hjust = 0.5),
        text = element_text(size = 24),
        legend.position = "none",
        axis.line = element_line(colour = "black"))

p
Users selected “I wrote this content and it's not published elsewhere” in a little over half (54%) of all published edits where the user chose to keep their pasted text.
By whether multiple checks were shown
Code
edit_check_dismissal_bymultiple <- edit_check_reject_data %>%
  filter(was_paste_check_shown == 1) %>% # limit to where shown
  group_by(multiple_checks_shown) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0])) %>% # sessions with at least one dismissal
  mutate(dismissal_rate = paste0(round(n_rejects / n_edits * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Paste Check dismissal rate by whether multiple checks were shown") %>%
  opt_stylize(5) %>%
  cols_label(
    multiple_checks_shown = "Multiple Checks",
    n_edits = "Number of edits shown Paste Check",
    n_rejects = "Number of edits that dismissed Paste Check",
    dismissal_rate = "Proportion of edits where Paste Check was dismissed"
  ) %>%
  tab_source_note(
    gt::md('Limited to published edits where at least one Paste Check was shown')
  )

display_html(as_raw_html(edit_check_dismissal_bymultiple))
Paste Check dismissal rate by whether multiple checks were shown

| Multiple Checks | Number of edits shown Paste Check | Number of edits that dismissed Paste Check | Proportion of edits where Paste Check was dismissed |
|---|---|---|---|
| single check shown | 1111 | 539 | 48.5% |
| multiple checks shown | 544 | 365 | 67.1% |

Limited to published edits where at least one Paste Check was shown
The dismissal rate is higher when multiple checks are shown (67.1%) than when a single check is shown (48.5%). Interestingly, this was not observed with Tone Check.
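One way to gauge whether a gap of this size could be sampling noise is a two-sample test of proportions on the published counts. The sketch below uses `prop.test` from base R's stats package; it is not part of the original analysis and assumes editing sessions are independent observations, which may not hold if the same user contributes several sessions.

```r
# Counts copied from the table above
rejects <- c(single = 539, multiple = 365)
edits   <- c(single = 1111, multiple = 544)

# Dismissal rates in percent, rounded as in the report
rates <- round(rejects / edits * 100, 1)
print(rates)

# Two-sample test for equality of proportions (continuity-corrected by default)
dismissal_test <- prop.test(x = rejects, n = edits)
print(dismissal_test$p.value)
print(dismissal_test$conf.int)  # interval for the difference in proportions
```

With counts this large, the test is mainly a sanity check; the practical question of why multiple checks coincide with more dismissals still needs qualitative follow-up.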
By platform
Code
edit_check_dismissal_byplatform <- edit_check_reject_data %>%
  filter(was_paste_check_shown == 1) %>% # limit to where shown
  group_by(platform) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0])) %>% # sessions with at least one dismissal
  mutate(dismissal_rate = paste0(round(n_rejects / n_edits * 100, 1), "%")) %>%
  ungroup() %>%
  # mutate(n_edits = ifelse(n_edits < 50, "<50", n_edits),
  #        n_rejects = ifelse(n_rejects < 50, "<50", n_rejects)) %>% # sanitizing per data publication guidelines
  # select(-2) %>%
  gt() %>%
  tab_header(title = "Paste Check dismissal rate by platform") %>%
  opt_stylize(5) %>%
  cols_label(
    platform = "Platform",
    n_edits = "Number of edits shown Paste Check",
    n_rejects = "Number of edits that dismissed Paste Check",
    dismissal_rate = "Proportion of edits where Paste Check was dismissed"
  ) %>%
  tab_source_note(
    gt::md('Limited to published edits where at least one Paste Check was shown')
  )

display_html(as_raw_html(edit_check_dismissal_byplatform))
Paste Check dismissal rate by platform

| Platform | Number of edits shown Paste Check | Number of edits that dismissed Paste Check | Proportion of edits where Paste Check was dismissed |
|---|---|---|---|
| mobile web | 261 | 125 | 47.9% |
| desktop | 1394 | 779 | 55.9% |

Limited to published edits where at least one Paste Check was shown
Users are more likely to keep their pasted text on desktop. Users chose to keep the pasted text in 48% of published mobile web edits where Paste Check was shown, compared to 56% of published desktop edits.
In relative terms, the Paste Check dismissal rate on mobile web was about 14% lower than on desktop (47.9% vs. 55.9%).
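The relative difference between the two platforms follows directly from the published rates. The helper below, `relative_change`, is a hypothetical name introduced here for illustration; it distinguishes the relative change from the percentage-point gap, which are easy to confuse.

```r
# Relative change of `new` versus `baseline`, in percent
relative_change <- function(new, baseline) {
  (new - baseline) / baseline * 100
}

mobile_rate  <- 47.9  # dismissal rate on mobile web (%)
desktop_rate <- 55.9  # dismissal rate on desktop (%)

# Percentage-point gap: simple difference of the two rates
pp_gap <- round(mobile_rate - desktop_rate, 1)

# Relative change: the gap expressed as a share of the desktop rate
rel_gap <- round(relative_change(mobile_rate, desktop_rate), 1)

print(pp_gap)   # -8
print(rel_gap)  # -14.3
```

Reporting both figures side by side avoids ambiguity: the mobile rate is 8 percentage points lower, which corresponds to a relative decrease of roughly 14%.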
Dismissal reason by platform
Code
edit_check_dismissal_byreason_byplatform <- edit_check_reject_data %>%
  filter(was_paste_check_shown == 1 & n_rejects > 0) %>% # limit to where shown and user elected to keep text
  group_by(platform, reject_reason) %>%
  summarise(n_edits_rejected = n_distinct(editing_session)) %>%
  mutate(select_rate = round(n_edits_rejected / sum(n_edits_rejected), 2))
Code
# plot bar chart of reason selection
dodge <- position_dodge(width = 0.9)

p <- edit_check_dismissal_byreason_byplatform %>%
  ggplot(aes(x = reject_reason, y = select_rate, fill = reject_reason)) +
  geom_col(position = 'dodge') +
  scale_y_continuous(labels = scales::percent) +
  geom_text(aes(label = paste0(select_rate * 100, "%"), fontface = 2),
            vjust = 1.2, size = 10, color = "white") +
  facet_grid(~ platform) +
  labs(y = "Percent of edits ",
       x = "Selected reason",
       title = "Reasons users selected for keeping pasted text") +
  scale_fill_manual(values = cbPalette, name = "Reason") +
  theme(panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        plot.title = element_text(hjust = 0.5),
        text = element_text(size = 24),
        legend.position = "bottom",
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.line = element_line(colour = "black"))

p
No significant differences by platform; however, there are a few trends to highlight:
Users on desktop are less likely to select “None of the above applies” compared to mobile.
“I wrote this content…” is the most frequently selected reason for keeping text on both platforms.
Users on mobile are more likely to not provide a reason for keeping the pasted text compared to desktop.
By user experience
Code
edit_check_dismissal_byuserexp <- edit_check_reject_data %>%
  filter(was_paste_check_shown == 1) %>% # limit to where shown
  group_by(experience_level_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0])) %>% # sessions with at least one dismissal
  mutate(dismissal_rate = paste0(round(n_rejects / n_edits * 100, 1), "%")) %>%
  ungroup() %>%
  mutate(n_edits = ifelse(n_edits < 50, "<50", n_edits),
         n_rejects = ifelse(n_rejects < 50, "<50", n_rejects)) %>% # sanitizing per data publication guidelines
  # select(-2) %>%
  gt() %>%
  tab_header(title = "Paste Check dismissal rate by user experience") %>%
  opt_stylize(5) %>%
  cols_label(
    experience_level_group = "User Experience",
    n_edits = "Number of edits shown Paste Check",
    n_rejects = "Number of edits that dismissed Paste Check",
    dismissal_rate = "Proportion of edits where Paste Check was dismissed"
  ) %>%
  tab_source_note(
    gt::md('Limited to published edits where at least one Paste Check was shown')
  )

display_html(as_raw_html(edit_check_dismissal_byuserexp))
Paste Check dismissal rate by user experience

| User Experience | Number of edits shown Paste Check | Number of edits that dismissed Paste Check | Proportion of edits where Paste Check was dismissed |
|---|---|---|---|
| Unregistered | 353 | 154 | 43.6% |
| Newcomer | 244 | 157 | 64.3% |
| Junior Contributor | 1058 | 594 | 56.1% |

Limited to published edits where at least one Paste Check was shown
Newcomers (users making their first edit on the Wikipedia) are dismissing Paste Check at higher rates compared to unregistered users or Junior Contributors. This might be a sign that the check is slightly more confusing to Newcomers. We observed a similar trend for Tone Check.
Dismissal reason by user experience
Code
edit_check_dismissal_byreason_byuserexp <- edit_check_reject_data %>%
  filter(was_paste_check_shown == 1 & n_rejects > 0) %>% # limit to where shown and user elected to keep text
  group_by(experience_level_group, reject_reason) %>%
  summarise(n_edits_rejected = n_distinct(editing_session)) %>%
  mutate(select_rate = round(n_edits_rejected / sum(n_edits_rejected), 2))
Code
# plot bar chart of reason selection
dodge <- position_dodge(width = 0.9)

p <- edit_check_dismissal_byreason_byuserexp %>%
  ggplot(aes(x = reject_reason, y = select_rate, fill = reject_reason)) +
  geom_col(position = 'dodge') +
  scale_y_continuous(labels = scales::percent) +
  geom_text(aes(label = paste0(select_rate * 100, "%"), fontface = 2),
            vjust = 1.2, size = 10, color = "white") +
  facet_grid(~ experience_level_group) +
  labs(y = "Percent of edits ",
       x = "Selected reason",
       title = "Reasons users selected for keeping pasted text") +
  scale_fill_manual(values = cbPalette, name = "Reason") +
  theme(panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        plot.title = element_text(hjust = 0.5),
        text = element_text(size = 24),
        legend.position = "bottom",
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.line = element_line(colour = "black"))

p
When we split the results by user type, we see that registered newcomers (users making their first edit on the Wikipedia) select the “I wrote this content…” option more frequently than unregistered users or Junior Contributors when providing a reason for keeping their text.
Registered newcomers selected the “I wrote this content…” option in 69.5% of all published edits where Paste Check was dismissed. Unregistered users are the most likely to select the “None of the above applies” or “I have permission to reuse” options.
By partner Wikipedia
Code
edit_check_dismissal_bywiki <- edit_check_reject_data %>%
  filter(was_paste_check_shown == 1) %>% # limit to where shown
  group_by(wiki) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_rejects = n_distinct(editing_session[n_rejects > 0])) %>% # sessions with at least one dismissal
  mutate(dismissal_rate = paste0(round(n_rejects / n_edits * 100, 1), "%")) %>%
  filter(n_edits > 50) %>% # limit to wikis with over 50 edits
  ungroup() %>%
  mutate(n_edits = ifelse(n_edits < 50, "<50", n_edits),
         n_rejects = ifelse(n_rejects < 50, "<50", n_rejects)) %>% # sanitizing per data publication guidelines
  select(-2) %>% # drop n_edits for publication
  gt() %>%
  tab_header(title = "Paste Check dismissal rate by partner Wikipedia") %>%
  opt_stylize(5) %>%
  cols_label(
    wiki = "Wikipedia",
    # n_edits = "Number of edits shown Paste Check",
    n_rejects = "Number of edits that dismissed Paste Check",
    dismissal_rate = "Proportion of edits where Paste Check was dismissed"
  ) %>%
  tab_source_note(
    gt::md('Limited to Wikipedias where Paste Check was shown in at least 50 edits')
  )

display_html(as_raw_html(edit_check_dismissal_bywiki))
Paste Check dismissal rate by partner Wikipedia

| Wikipedia | Number of edits that dismissed Paste Check | Proportion of edits where Paste Check was dismissed |
|---|---|---|
| arwiki | <50 | 49.3% |
| dewiki | 190 | 57.6% |
| fawiki | 67 | 60.4% |
| idwiki | 52 | 42.3% |
| itwiki | 102 | 54.3% |
| nlwiki | 51 | 66.2% |
| plwiki | 63 | 60.6% |
| ruwiki | 141 | 51.8% |
| ukwiki | 53 | 59.6% |
| zhwiki | <50 | 55.7% |

Limited to Wikipedias where Paste Check was shown in at least 50 edits
No significant differences per Wikipedia. The dismissal rate is currently highest at nlwiki (66.2%).
Paste Check Revert Rate
Question: Is Paste Check causing any disruption?
Methodology: We reviewed the proportion of all published new content edits where Paste Check was shown at least once in an editing session and that were reverted within 48 hours. This was compared to the revert rate of edits in the control group identified as eligible but not shown Paste Check.
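To make the 48-hour window concrete, here is a minimal base R sketch that derives a `was_reverted` flag from timestamps. The toy `edits` table and the `saved_at` and `reverted_at` columns are hypothetical; the report's own `was_reverted` field is precomputed upstream and its exact derivation is not shown here.

```r
# Hypothetical published edits with save and (optional) revert timestamps
edits <- data.frame(
  editing_session = c("a", "b", "c"),
  saved_at = as.POSIXct(c("2025-10-10 08:00:00",
                          "2025-10-11 12:00:00",
                          "2025-10-12 09:30:00"), tz = "UTC"),
  reverted_at = as.POSIXct(c("2025-10-10 20:00:00",  # reverted after 12 hours
                             NA,                      # never reverted
                             "2025-10-15 09:30:00"),  # reverted after 72 hours
                           tz = "UTC"),
  stringsAsFactors = FALSE
)

# Hours between publishing and revert (NA when the edit was not reverted)
hours_to_revert <- as.numeric(
  difftime(edits$reverted_at, edits$saved_at, units = "hours")
)

# Flag only reverts that happened within the 48-hour window
edits$was_reverted <- as.integer(!is.na(hours_to_revert) & hours_to_revert <= 48)

revert_rate <- mean(edits$was_reverted)
print(edits[, c("editing_session", "was_reverted")])
```

Edits reverted after the window are treated the same as edits never reverted, so the metric captures early, patrol-style reverts rather than long-term content changes.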
Code
# load data for assessing Paste Check published edits
edit_check_save_data <- read.csv(
  file = 'data/paste_check_saves_data.tsv',
  header = TRUE,
  sep = "\t",
  stringsAsFactors = FALSE
)
Code
# Set experience level group and factor levels
edit_check_save_data <- edit_check_save_data %>%
  mutate(
    experience_level_group = case_when(
      user_edit_count == 0 & user_status == 'registered' ~ 'Newcomer',
      user_edit_count == 0 & user_status == 'unregistered' ~ 'Unregistered',
      user_edit_count > 0 & user_edit_count <= 100 ~ "Junior Contributor",
      user_edit_count > 100 ~ "Non-Junior Contributor"
    ),
    experience_level_group = factor(
      experience_level_group,
      levels = c("Unregistered", "Newcomer", "Non-Junior Contributor", "Junior Contributor")
    )
  )

# rename experiment field to clarify
edit_check_save_data <- edit_check_save_data %>%
  mutate(test_group = factor(
    test_group,
    levels = c('2025-09-editcheck-paste-control', '2025-09-editcheck-paste-test'),
    labels = c("control (Paste Check not shown)", "test (Paste Check shown)")
  ))

# rename platform from phone to mobile web to clarify meaning
edit_check_save_data <- edit_check_save_data %>%
  mutate(platform = factor(
    platform,
    levels = c('phone', 'desktop'),
    labels = c("mobile web", "desktop")
  ))
Code
# set field to indicate if more than one check was shown in a single session.
# Note: This should only be applicable to the test group
edit_check_save_data <- edit_check_save_data %>%
  mutate(
    multiple_checks_shown = ifelse(n_checks_shown > 1, "multiple checks shown", "single check shown"),
    multiple_checks_shown = factor(multiple_checks_shown,
                                   levels = c("single check shown", "multiple checks shown"))
  )

# note these buckets can be adjusted as needed based on distribution of data
edit_check_save_data <- edit_check_save_data %>%
  mutate(
    checks_shown_bucket = case_when(
      is.na(n_checks_shown) ~ '0',
      n_checks_shown == 1 ~ '1',
      n_checks_shown == 2 ~ '2',
      n_checks_shown > 2 & n_checks_shown <= 5 ~ "3-5",
      n_checks_shown > 5 & n_checks_shown <= 10 ~ "6-10",
      n_checks_shown > 10 ~ "over 10"
    ),
    checks_shown_bucket = factor(checks_shown_bucket,
                                 levels = c("0", "1", "2", "3-5", "6-10", "over 10"))
  )
Overall
Code
edit_check_reverts_overall <- edit_check_save_data %>%
  filter(is_new_content == 1 & was_paste_check_shown == 1) %>% # limit to eligible edits
  group_by(test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # sessions with a reverted edit
  mutate(revert_rate = paste0(round(n_reverts / n_edits * 100, 1), "%"))
Code
# plot visualization of overall revert rates
dodge <- position_dodge(width = 0.9)

p <- edit_check_reverts_overall %>%
  ggplot(aes(x = test_group, y = n_reverts / n_edits)) +
  geom_col(position = 'dodge', fill = 'dodgerblue4') +
  scale_y_continuous(labels = scales::percent) +
  geom_text(aes(label = paste(revert_rate), fontface = 2),
            vjust = 1.2, size = 10, color = "white") +
  scale_fill_manual(values = cbPalette, name = "Reason") +
  labs(y = "Percent of new content edits reverted",
       x = "Experiment Group",
       title = "New content edit revert rate",
       caption = "Limited to published new content edits shown or eligible to be shown Paste Check") +
  theme(panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        plot.title = element_text(hjust = 0.5),
        text = element_text(size = 24),
        legend.position = "none",
        axis.line = element_line(colour = "black"))

p
Overall, new content edits shown Paste Check are reverted less frequently. We've observed a 21.3% relative decrease in the revert rate of published edits where Paste Check was shown (9.6%) compared to edits eligible but not shown Paste Check (12.2%).
Revert rates for both edits shown Paste Check (9.6%) and edits eligible to be shown Paste Check (12.2%) are lower than the revert rates we've observed for other types of edits. For example, there's a 25% revert rate for edits detected as having non-neutral tone (see T371158#11220470).
Notes:
* This control group revert rate aligns with the revert rates observed in the baseline analysis conducted in T403861.
* This may also include edits where the final published text did not include pasted text. The event instrumented in T402460 identifies edits that would be eligible to be shown Paste Check during an edit session if enabled, but does not assess whether the final published text still includes pasted text.
By whether multiple checks were shown
Code
edit_check_revert_bymultiple <- edit_check_save_data %>%
  filter(is_new_content == 1 & was_paste_check_shown == 1 &
           test_group == 'test (Paste Check shown)') %>%
  group_by(multiple_checks_shown) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # sessions with a reverted edit
  mutate(revert_rate = paste0(round(n_reverts / n_edits * 100, 1), "%")) %>%
  select(-c(2, 3)) %>% # removing granular data columns for publication
  gt() %>%
  tab_header(title = "New content edit revert rate by whether multiple checks were shown") %>%
  opt_stylize(5) %>%
  cols_label(
    multiple_checks_shown = "Multiple Checks",
    # n_edits = "Number of published new content edits",
    # n_reverts = "Number of edits reverted",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits shown or eligible to be shown Paste Check')
  )

display_html(as_raw_html(edit_check_revert_bymultiple))
New content edit revert rate by whether multiple checks were shown

| Multiple Checks | Proportion of new content edits that were reverted |
|---|---|
| single check shown | 11.5% |
| multiple checks shown | 5.5% |

Limited to published new content edits shown or eligible to be shown Paste Check
Edits shown multiple Paste Checks were also reverted less often than edits shown a single Paste Check (5.5% vs. 11.5%). Interestingly, we observed the reverse trend in the Tone Check leading indicator analysis (edits shown multiple tone checks were reverted more frequently).
By platform
Code
edit_check_revert_byplatform <- edit_check_save_data %>%
  filter(is_new_content == 1 & was_paste_check_shown == 1) %>%
  group_by(platform, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # sessions with a reverted edit
  mutate(revert_rate = paste0(round(n_reverts / n_edits * 100, 1), "%")) %>%
  select(-c(3, 4)) %>% # removing granular data columns for publication
  gt() %>%
  tab_header(title = "New content edit revert rate by platform") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    platform = "Platform",
    # n_edits = "Number of published new content edits",
    # n_reverts = "Number of edits reverted",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits shown or eligible to be shown Paste Check')
  )

display_html(as_raw_html(edit_check_revert_byplatform))
New content edit revert rate by platform

| Platform | Test Group | Proportion of new content edits that were reverted |
|---|---|---|
| mobile web | control (Paste Check not shown) | 14.7% |
| mobile web | test (Paste Check shown) | 16% |
| desktop | control (Paste Check not shown) | 11.8% |
| desktop | test (Paste Check shown) | 8.5% |

Limited to published new content edits shown or eligible to be shown Paste Check
When split by platform, we see differing trends. On desktop, we've observed a 28% relative decrease in revert rate (8.5% vs. 11.8%).
On mobile web, we've observed a slight 8.8% relative increase (16% vs. 14.7%). However, so far too few mobile edits shown or eligible to be shown Paste Check have been reverted (<50 edits) to confirm this trend. Taken together, these results indicate that there has been no significant disruption caused by the introduction of this check.
By user experience
Code
edit_check_revert_byuserexp <- edit_check_save_data %>%
  filter(is_new_content == 1 & was_paste_check_shown == 1) %>%
  group_by(experience_level_group, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # sessions with a reverted edit
  mutate(revert_rate = paste0(round(n_reverts / n_edits * 100, 1), "%")) %>%
  select(-c(3, 4)) %>% # removing granular data columns for publication
  gt() %>%
  tab_header(title = "New content edit revert rate by user experience") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    experience_level_group = "User Status",
    # n_edits = "Number of published new content edits",
    # n_reverts = "Number of edits reverted",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to published new content edits shown or eligible to be shown Paste Check')
  )

display_html(as_raw_html(edit_check_revert_byuserexp))
New content edit revert rate by user experience

| User Status | Experiment Group | Proportion of new content edits that were reverted |
|---|---|---|
| Unregistered | control (Paste Check not shown) | 17.2% |
| Unregistered | test (Paste Check shown) | 11.1% |
| Newcomer | control (Paste Check not shown) | 19.4% |
| Newcomer | test (Paste Check shown) | 10.2% |
| Junior Contributor | control (Paste Check not shown) | 10.1% |
| Junior Contributor | test (Paste Check shown) | 9% |

Limited to published new content edits shown or eligible to be shown Paste Check
Revert rates decreased across all user types.
By partner Wikipedia
Code
edit_check_revert_bywiki <- edit_check_save_data %>%
  filter(is_new_content == 1 & was_paste_check_shown == 1) %>%
  group_by(wiki, test_group) %>%
  summarise(n_edits = n_distinct(editing_session),
            n_reverts = n_distinct(editing_session[was_reverted == 1])) %>% # sessions with a reverted edit
  mutate(revert_rate = paste0(round(n_reverts / n_edits * 100, 1), "%")) %>%
  filter(n_edits > 25) %>% # limit to wikis with over 25 published edits
  select(-c(3, 4)) %>% # removing granular data columns for publication
  gt() %>%
  tab_header(title = "New content edit revert rate by partner Wikipedia") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Experiment Group",
    wiki = "Wikipedia",
    # n_edits = "Number of published new content edits",
    # n_reverts = "Number of edits reverted",
    revert_rate = "Proportion of new content edits that were reverted"
  ) %>%
  tab_source_note(
    gt::md('Limited to wikis with at least 25 published new content edits')
  )

display_html(as_raw_html(edit_check_revert_bywiki))
New content edit revert rate by partner Wikipedia

| Wikipedia | Experiment Group | Proportion of new content edits that were reverted |
|---|---|---|
| dewiki | control (Paste Check not shown) | 18.2% |
| dewiki | test (Paste Check shown) | 13.6% |
| itwiki | control (Paste Check not shown) | 13.3% |
| itwiki | test (Paste Check shown) | 7.1% |
| ruwiki | control (Paste Check not shown) | 7.6% |
| ruwiki | test (Paste Check shown) | 9.1% |

Limited to wikis with at least 25 published new content edits
There is currently insufficient data to confirm any per-wiki trends. Among the Wikipedias where we have recorded more than 25 published new content edits shown Paste Check, the revert rate decreased at dewiki and itwiki and increased slightly at ruwiki.
Paste Check Block Rate
Question: Is Paste Check causing any disruption?
Methodology: We gathered all edits where Paste Check was shown from the mediawiki_revision_change_tag table and joined them with mediawiki_private_cu_changes to gather user name info. We then reviewed both global and local blocks made within 6 hours of the Paste Check event, as identified in the logging table.
Code
# load data for assessing blocks
edit_check_blocks <- read.csv(
  file = 'data/paste_check_eligible_users_blocked.csv',
  header = TRUE,
  sep = ",",
  stringsAsFactors = FALSE
)
Code
# rename experiment field to clarify
edit_check_blocks <- edit_check_blocks %>%
  mutate(test_group = factor(
    bucket,
    levels = c('2025-09-editcheck-paste-control', '2025-09-editcheck-paste-test'),
    labels = c("control (no Paste Check)", "test (Paste Check available)")
  ))
Code
edit_check_local_blocks_overall <- edit_check_blocks %>%
  # filter(user_id == 0) %>% # filter to identify logged out users
  group_by(test_group) %>%
  summarise(blocked_users = n_distinct(ip[is_local_blocked == 'True' | is_global_blocked == 'True']),
            all_users = n_distinct(ip)) %>% # look at blocks
  mutate(prop_blocks = paste0(round(blocked_users / all_users * 100, 1), "%")) %>%
  select(-c(2, 3)) %>% # removing granular data columns
  gt() %>%
  tab_header(title = "Proportion of users blocked by experiment group") %>%
  opt_stylize(5) %>%
  cols_label(
    test_group = "Test Group",
    prop_blocks = "Proportion of users blocked"
  ) %>%
  tab_source_note(
    gt::md('Limited to users blocked within 6 hours of publishing an edit where Paste Check was shown')
  )

display_html(as_raw_html(edit_check_local_blocks_overall))
Proportion of users blocked by experiment group

| Test Group | Proportion of users blocked |
|---|---|
| test (Paste Check available) | 0.5% |

Limited to users blocked within 6 hours of publishing an edit where Paste Check was shown
0.5% of all users were blocked after publishing an edit where at least one Paste Check was shown, compared to 0.3% in the control group. This difference is not statistically significant.
No global blocks were issued to any users that published an edit where at least one Paste Check was shown.
Paste Check Interaction Rate
Question: Are newcomers and Junior Contributors interacting with Paste Check?
Methodology: We reviewed the proportion of all edit sessions in which at least one Paste Check is shown and people do not interact with one or more of the Paste Checks shown. “Interact” in this context refers to people tapping either of the buttons that appear within the Paste Check “card”: Yes, keep it (action-keep) or No, remove it (action-remove).
Note: The need for this metric emerged through T407543 wherein we identified Paste Check is not being shown in the Pre-Save moment if people do not interact with it during Mid-Edit. As a result, I limited this analysis to editing sessions that clicked to enter the pre-save window (action = 'saveIntent') as those are the users that would not see the Paste Check presented again in this moment.
# Cleaning up dataset and renaming fields to clarify meanings
# limit to sessions where paste check was shown
paste_check_engagement_data <- paste_check_engagement_data %>%
  filter(n_checks_shown > 0)

# Set experience level group and factor levels
paste_check_engagement_data <- paste_check_engagement_data %>%
  mutate(
    experience_level_group = case_when(
      user_edit_count == 0 & user_status == 'registered' ~ 'Newcomer',
      user_edit_count == 0 & user_status == 'unregistered' ~ 'Unregistered',
      user_edit_count > 0 & user_edit_count <= 100 ~ "Junior Contributor",
      user_edit_count > 100 ~ "Non-Junior Contributor"
    ),
    experience_level_group = factor(
      experience_level_group,
      levels = c("Unregistered", "Newcomer", "Non-Junior Contributor", "Junior Contributor")
    )
  )

# rename experiment field to clarify
paste_check_engagement_data <- paste_check_engagement_data %>%
  mutate(test_group = factor(
    test_group,
    levels = c('2025-09-editcheck-paste-control', '2025-09-editcheck-paste-test'),
    labels = c("control (no paste check)", "test (paste check available)")
  ))

# rename platform from phone to mobile web to clarify meaning
paste_check_engagement_data <- paste_check_engagement_data %>%
  mutate(platform = factor(
    platform,
    levels = c('phone', 'desktop'),
    labels = c("mobile web", "desktop")
  ))
Code
# Set fields and factor levels to assess number of checks shown
paste_check_engagement_data <- paste_check_engagement_data %>%
  mutate(
    multiple_checks_shown = ifelse(n_checks_shown > 1, 1, 0),
    multiple_checks_shown = factor(multiple_checks_shown, levels = c(0, 1))
  )

# note these buckets can be adjusted as needed based on distribution of data
paste_check_engagement_data <- paste_check_engagement_data %>%
  mutate(
    checks_shown_bucket = case_when(
      is.na(n_checks_shown) ~ '0',
      n_checks_shown == 1 ~ '1',
      n_checks_shown == 2 ~ '2',
      n_checks_shown > 2 & n_checks_shown <= 5 ~ "3-5",
      n_checks_shown > 5 & n_checks_shown <= 10 ~ "6-10",
      n_checks_shown > 10 ~ "over 10"
    ),
    checks_shown_bucket = factor(checks_shown_bucket,
                                 levels = c("0", "1", "2", "3-5", "6-10", "over 10"))
  )
Overall
Code
paste_check_interaction_overall <- paste_check_engagement_data %>%
  # filter(saved_edit == 1) %>% # limit to published edits
  summarise(n_editing_session = n_distinct(editing_session),
            n_editing_session_noeng = n_distinct(
              editing_session[engaged_w_check == 0 | n_clicks < n_checks_shown]
            )) %>% # no engagement or some checks with no engagement
  mutate(prop_check_noeng = paste0(round(n_editing_session_noeng / n_editing_session * 100, 1), "%")) %>%
  gt() %>%
  tab_header(title = "Editing sessions where people do not interact with one or more checks presented") %>%
  opt_stylize(5) %>%
  cols_label(
    n_editing_session = "Number of edits shown Paste Check",
    n_editing_session_noeng = "Number of edits with no Paste Check interaction",
    prop_check_noeng = "Proportion of edits"
  ) %>%
  tab_source_note(
    gt::md('Limited to editing sessions shown at least one Paste Check')
  )

display_html(as_raw_html(paste_check_interaction_overall))
Editing sessions where people do not interact with one or more checks presented

| Number of edits shown Paste Check | Number of edits with no Paste Check interaction | Proportion of edits |
|---|---|---|
| 2833 | 1265 | 44.7% |

Limited to editing sessions shown at least one Paste Check
In 45% of editing sessions where Paste Check was shown (1,265 of 2,833), people did not interact with one or more of the Paste Checks presented.
This 45% is made up of two parts, both expressed as a share of all sessions shown Paste Check. In 35% of sessions, people did not interact with any of the Paste Checks presented. In the other 10%, people were presented multiple Paste Checks and interacted with some of them but not all.
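The split between sessions with no interaction at all and sessions with only partial interaction can be sketched as follows. This is an illustration on invented toy rows, not the A/B test data, and it assumes that `engaged_w_check == 0` means no check was interacted with and that `n_clicks` counts the checks a person interacted with.

```r
# Toy data only: these rows are invented for illustration
sessions <- data.frame(
  editing_session = c("a", "b", "c", "d", "e"),
  n_checks_shown  = c(1, 1, 3, 2, 1),
  n_clicks        = c(0, 1, 1, 2, 0),
  engaged_w_check = c(0, 1, 1, 1, 0)
)

# Assumed meaning of the fields:
#   engaged_w_check == 0                   -> no interaction with any check
#   engaged, but n_clicks < n_checks_shown -> at least one check not interacted with
sessions$interaction_type <- ifelse(
  sessions$engaged_w_check == 0, "none",
  ifelse(sessions$n_clicks < sessions$n_checks_shown, "partial", "all")
)

counts <- table(factor(sessions$interaction_type, levels = c("none", "partial", "all")))
shares <- round(100 * as.vector(counts) / sum(counts), 1)
names(shares) <- names(counts)

print(shares)
```

The "none" and "partial" shares together correspond to the headline metric, since the summary code counts a session when either condition holds.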
By whether multiple checks were shown
Code
    paste_check_interaction_bymultiple <- paste_check_engagement_data %>%
      # filter(saved_edit == 1) %>%  # limit to published edits
      group_by(multiple_checks_shown) %>%
      summarise(
        n_editing_session = n_distinct(editing_session),
        n_editing_session_noeng = n_distinct(
          editing_session[engaged_w_check == 0 | n_clicks < n_checks_shown]
        )
      ) %>%
      mutate(
        prop_check_noeng = paste0(round(n_editing_session_noeng / n_editing_session * 100, 1), "%")
      ) %>%
      gt() %>%
      tab_header(
        title = md("Editing sessions where people do not interact with one or more checks presented<br> by whether multiple checks were shown")
      ) %>%
      opt_stylize(5) %>%
      cols_label(
        multiple_checks_shown = "Multiple checks shown?",
        n_editing_session = "Number of edits shown Paste Check",
        n_editing_session_noeng = "Number of edits with no Paste Check interaction",
        prop_check_noeng = "Proportion of edits"
      ) %>%
      tab_source_note(
        gt::md('Limited to editing sessions shown at least one Paste Check')
      )

    display_html(as_raw_html(paste_check_interaction_bymultiple))

Editing sessions where people do not interact with one or more checks presented
by whether multiple checks were shown

| Multiple checks shown? | Number of edits shown Paste Check | Number of edits with no Paste Check interaction | Proportion of edits |
|---|---|---|---|
| 0 | 1960 | 789 | 40.3% |
| 1 | 873 | 476 | 54.5% |

Limited to editing sessions shown at least one Paste Check
By platform
Code
    paste_check_interaction_byplatform <- paste_check_engagement_data %>%
      # filter(saved_edit == 1) %>%  # limit to published edits
      group_by(platform) %>%
      summarise(
        n_editing_session = n_distinct(editing_session),
        n_editing_session_noeng = n_distinct(
          editing_session[engaged_w_check == 0 | n_clicks < n_checks_shown]
        )
      ) %>%
      mutate(
        prop_check_noeng = paste0(round(n_editing_session_noeng / n_editing_session * 100, 1), "%")
      ) %>%
      gt() %>%
      tab_header(
        title = md("Editing sessions where people do not interact with one or more checks presented<br> by platform")
      ) %>%
      opt_stylize(5) %>%
      cols_label(
        platform = "Platform",
        n_editing_session = "Number of edits shown Paste Check",
        n_editing_session_noeng = "Number of edits with no Paste Check interaction",
        prop_check_noeng = "Proportion of edits"
      ) %>%
      tab_source_note(
        gt::md('Limited to editing sessions shown at least one Paste Check')
      )

    display_html(as_raw_html(paste_check_interaction_byplatform))
Editing sessions where people do not interact with one or more checks presented
by platform

| Platform | Number of edits shown Paste Check | Number of edits with no Paste Check interaction | Proportion of edits |
|---|---|---|---|
| mobile web | 546 | 242 | 44.3% |
| desktop | 2287 | 1023 | 44.7% |

Limited to editing sessions shown at least one Paste Check
Non-interaction rates are similar on mobile web (44.3%) and desktop (44.7%).
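One way to check how close the two platforms are is a two-sample test of equal proportions on the counts from the table above. This is a hedged sketch using base R's `prop.test()`, not part of the original analysis; it assumes editing sessions can be treated as independent observations, which may not hold if the same person contributes several sessions.

```r
# Counts from the "by platform" table: sessions without interaction on one or
# more checks, out of sessions shown at least one Paste Check
no_interaction <- c(mobile_web = 242, desktop = 1023)
sessions_shown <- c(mobile_web = 546, desktop = 2287)

# Two-sample test of equal proportions (stats::prop.test, base R)
platform_test <- prop.test(x = no_interaction, n = sessions_shown)

print(round(platform_test$estimate, 3))  # observed proportion per platform
print(platform_test$conf.int)            # 95% CI for the difference in proportions
print(platform_test$p.value)
```

A confidence interval for the difference that spans zero is consistent with reading the two platforms as similar on this indicator.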
By user experience
Code
    paste_check_interaction_byuser_status <- paste_check_engagement_data %>%
      # filter(saved_edit == 1) %>%  # limit to published edits
      group_by(experience_level_group) %>%
      summarise(
        n_editing_session = n_distinct(editing_session),
        n_editing_session_noeng = n_distinct(
          editing_session[engaged_w_check == 0 | n_clicks < n_checks_shown]
        )
      ) %>%
      mutate(
        prop_check_noeng = paste0(round(n_editing_session_noeng / n_editing_session * 100, 1), "%")
      ) %>%
      gt() %>%
      tab_header(
        title = md("Editing sessions where people do not interact with one or more checks presented<br> by user experience")
      ) %>%
      opt_stylize(5) %>%
      cols_label(
        experience_level_group = "User Experience",
        n_editing_session = "Number of edits shown Paste Check",
        n_editing_session_noeng = "Number of edits with no Paste Check interaction",
        prop_check_noeng = "Proportion of edits"
      ) %>%
      tab_source_note(
        gt::md('Limited to editing sessions shown at least one Paste Check')
      )

    display_html(as_raw_html(paste_check_interaction_byuser_status))
Editing sessions where people do not interact with one or more checks presented
by user experience

| User Experience | Number of edits shown Paste Check | Number of edits with no Paste Check interaction | Proportion of edits |
|---|---|---|---|
| Unregistered | 880 | 425 | 48.3% |
| Newcomer | 446 | 186 | 41.7% |
| Junior Contributor | 1507 | 654 | 43.4% |

Limited to editing sessions shown at least one Paste Check
Non-interaction rates vary only slightly by user type. Unregistered users are slightly more likely not to interact with one or more Paste Checks (48.3%) than registered Newcomers (41.7%) and Junior Contributors (43.4%).
By partner Wikipedia
Code
    paste_check_interaction_bywiki <- paste_check_engagement_data %>%
      # filter(saved_edit == 1) %>%  # limit to published edits
      group_by(wiki) %>%
      summarise(
        n_editing_session = n_distinct(editing_session),
        n_editing_session_noeng = n_distinct(
          editing_session[engaged_w_check == 0 | n_clicks < n_checks_shown]
        )
      ) %>%
      mutate(
        prop_check_noeng = paste0(round(n_editing_session_noeng / n_editing_session * 100, 1), "%")
      ) %>%
      filter(n_editing_session > 50) %>%  # keep wikis with more than 50 editing sessions
      select(-c(2, 3)) %>%  # drop the granular count columns (sanitizing)
      gt() %>%
      tab_header(
        title = md("Editing sessions where people do not interact with one or more checks shown <br>by Wikipedia")
      ) %>%
      opt_stylize(5) %>%
      cols_label(
        wiki = "Wikipedia",
        # n_editing_session = "Number of edits",
        # n_editing_session_noeng = "Number of edits with no Paste Check interaction",
        prop_check_noeng = "Proportion of edits with no Paste Check interaction"
      ) %>%
      tab_source_note(
        gt::md('Limited to editing sessions shown at least one Paste Check and Wikis with over 50 edits')
      )

    display_html(as_raw_html(paste_check_interaction_bywiki))
Editing sessions where people do not interact with one or more checks shown
by Wikipedia

| Wikipedia | Proportion of edits with no Paste Check interaction |
|---|---|
| arwiki | 58.1% |
| bnwiki | 37.9% |
| cawiki | 53.1% |
| cswiki | 32.4% |
| dewiki | 39.3% |
| fawiki | 35.5% |
| idwiki | 53.3% |
| itwiki | 51.6% |
| nlwiki | 41.8% |
| plwiki | 42.1% |
| ruwiki | 43.5% |
| simplewiki | 44.6% |
| ukwiki | 48.3% |
| viwiki | 58.2% |
| zhwiki | 40.4% |

Limited to editing sessions shown at least one Paste Check and Wikis with over 50 edits
Non-interaction rates vary across Wikipedias. Among wikis with more than 50 editing sessions shown Paste Check, the share of sessions in which people did not interact with at least one of the Paste Checks shown ranges from 32.4% at cswiki to 58.2% at viwiki.
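Because the per-wiki table is cut off just above 50 sessions, some of the spread between wikis may simply reflect small samples. The sketch below shows how wide an exact binomial confidence interval can be at that size. The counts are hypothetical (per-wiki counts are not published in the table above), and `binom.test()` is base R, not a method used in the original analysis.

```r
# Hypothetical counts for a single small wiki (illustration only)
n_sessions       <- 51
n_no_interaction <- 30

# Exact (Clopper-Pearson) 95% confidence interval from stats::binom.test
ci <- binom.test(n_no_interaction, n_sessions)$conf.int

observed <- n_no_interaction / n_sessions
print(round(c(observed = observed, lower = ci[1], upper = ci[2]), 3))
```

With samples this small the interval spans well over 20 percentage points, so differences between individual wikis are best read as indicative and not conclusive.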