Machine Translation Service Analysis Report 2024

Content Translation tool

Author

Krishna Chaitanya Velaga, Product Analytics

Published

October 22, 2024

Overview

Content translation supports multiple machine translation services. When multiple options are available for a language, even if one is provided by default, users can use a different service. See default configuration and languages supported files for details on all available language pairs and defaults.

The purpose of the report is to understand the usage of MT services across various languages and, if needed, to inform changes to the default service for certain language pairs. It also aims to understand the translation quality of MT services through measures such as the modification percentage of the MT content and the deletion rate of the articles. Previous iterations of the report were run in May 2022, October 2022, and November 2023.

Methodology

For each machine translation service, we compared the following:

Percent of translations published by each machine translation service:
Overall across all languages
Daily usage trends
Usage at each Language Pair (Source - Target)
Most frequently used service for each target Language
Percent each machine translation service was modified by users
Percent of articles created with each machine translation service that were deleted
YoY comparison of service usage by language pair
Machine translation service usage by user edit count bucket

Data sources:

Period: All published translations from January to September 2024 were considered for analysis.

Summary

Overall usage

Google’s translation service, which has been the most used translation service across all language pairs, was used for 71% of all the published translations.
MinT (Machine in Translation), a Wikimedia Foundation hosted open-source machine translation service, is the second most used translation service, accounting for 16% of all the published translations.
Compared to last year, the year-over-year growth in usage of the MinT translation service is ~8%,¹ while at the same time Google’s service usage reduced by ~9%.²
No machine translation was used (scratch) for ~7% of the translations published using the Content Translation tool.

YoY comparison

On average, ~960 articles are translated and published daily using the Content Translation tool.
Compared to 2023, the median number of daily translations:
- using Google reduced from 669 to 626 (6% decrease).
- using MinT increased from 48 to 118 (145% increase).
- for other services, the change is not more than 1-2%.

Language pairs where an optional service was used more or close to the default

There are 44 language pairs where an optional service (or no service, i.e., scratch) was used more or close to the default.
Among the language pairs where Google is the default:
- For 20 language pairs, MinT was used more than Google.
- For 10 language pairs, no service (i.e., scratch) was used more than Google.
Among the language pairs where MinT is the default, for 10 language pairs, Google was used more.

Usage of service at each target language

MinT was used for 100% of the translations to nine languages.
MinT was used for 90% or more (but not all) of the translations to 18 languages.
MinT was used for the majority (more than 50% but less than 90%) of the translations to a further 18 languages.
Google was used for 100% of the translations to nine languages.
Google was used for 90% or more (but not all) of the translations to 53 languages.
Google was used for the majority (more than 50% but less than 90%) of the translations to a further 49 languages.
Aragonese is the only target language for which Apertium was used for 90% of the translations.
Chuvash is the only target language for which Yandex was used for 100% of the translations.
Basque (eu) is the only target language for which Elia was the most used service (85% of the translations).
Even for languages where LingoCloud is supported, its usage has been quite low: ~2% of 4000+ translations to Chinese (zh), and less than 1% of 150+ translations to Wu Chinese (wuu).

Percentage of MT content modified by the user

The majority of translations across all MT services were modified by at least 10% at the time of publication.
Of the machine translation suggestions from MinT, 32% were modified by less than 10%, the highest share of all services.
The percentage of translations with a human modification percentage between 10% and 50% is 54% for both MinT and Google.
The percentage of translations with a human modification percentage higher than 50% is lowest for MinT and Elia, at 13% and 11%, respectively.
Apertium has the highest percentage of translations where the human modification percentage was more than 50%.

Deletion rates by MT service

Articles translated using MinT were deleted the least: 2.23% of the 30,000+ articles.
Yandex and Google had the highest percentage of deleted articles, with more than 3%.
2.7% of the articles translated using Apertium were deleted.

Setup

Code

import pandas as pd
import duckdb
import numpy as np
import warnings

import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick

import plotly.express as px
import plotly.graph_objects as go
import plotly.subplots as sp
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
# import kaleido
import great_tables as gt

from IPython.display import display, display_html, HTML

Code

init_notebook_mode(connected=True)

pd.options.display.max_columns = None
pd.options.display.max_rows = 100

bold = '\033[1m'
end = '\033[0m'
underline = '\033[4m'

# max width for plotly charts
iplot_width = 950

# always show options bar
iplot_config = {'displayModeBar': True}

Code

# connect to database
# conn = duckdb.connect('~/git/machine-translation-service-usage-analysis/data_gathering/secrets/mt_data.db')
conn = duckdb.connect('~/Desktop/GIT/wmf_git/machine-translation-service-usage-analysis/data_gathering/secrets/mt_data.db')

Code

def query(query_string, df=False, conn=conn):
    # return the result as a pandas DataFrame if df is True, otherwise print it
    if df:
        return conn.sql(query_string).df()
    return conn.sql(query_string).show()

Code

start_date = '2024-01-01'
end_date = '2024-09-30'
period_label = 'January to September 2024'

Overall usage by service

Code

mt_compare_overall = query(f"""
WITH base AS (
    SELECT 
        mt_service,
        COUNT(DISTINCT translation_id) AS n_translations
    FROM 
        mt_logs
    WHERE
        translation_start_time >= '{start_date}'
        AND translation_start_time <= '{end_date}'
        AND mt_service != 'undefined'
    GROUP BY
        mt_service
)

SELECT
    mt_service AS 'Machine translation service',
    n_translations AS 'Number of Translations',
    n_translations / SUM(n_translations) OVER () AS 'Percent of all published translations'
FROM
    base
-- sort in the outer query; an ORDER BY inside a CTE does not guarantee the final row order
ORDER BY
    n_translations DESC
""", True)

Code

mt_compare_overall_tbl = (
    gt
    .GT(mt_compare_overall, rowname_col='Machine translation service')
    .tab_header('Published translations by machine translation service across all language pairs', period_label)
    .fmt_percent(columns='Percent of all published translations')
    .data_color(columns='Percent of all published translations', palette='Greens')
    .opt_stylize()
)

mt_compare_overall_tbl

Published translations by machine translation service across all language pairs
January to September 2024
Machine translation service	Number of Translations	Percent of all published translations
Google	187953	71.20%
MinT	42644	16.15%
scratch	18126	6.87%
Apertium	6291	2.38%
Yandex	5076	1.92%
Elia	3779	1.43%
LingoCloud	110	0.04%
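
The percentage column can be re-derived from the counts in the table. The following is a minimal Python sketch, not part of the original notebook, that mirrors the `n_translations / SUM(n_translations) OVER ()` expression used in the query. The dictionary restates the published counts, and `usage_share` is an illustrative helper name.

```python
# Counts copied from the table of published translations (January to September 2024).
n_translations = {
    "Google": 187953,
    "MinT": 42644,
    "scratch": 18126,
    "Apertium": 6291,
    "Yandex": 5076,
    "Elia": 3779,
    "LingoCloud": 110,
}


def usage_share(counts):
    """Return each service's share of the total, as a fraction between 0 and 1."""
    total = sum(counts.values())
    return {service: n / total for service, n in counts.items()}


shares = usage_share(n_translations)
for service, share in shares.items():
    print(f"{service:<12}{share:7.2%}")
```

Because every share uses the same denominator, the shares sum to 1, which is a quick sanity check on the table.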

Summary

Google’s translation service, which has been the most used translation service across all language pairs, was used for 71% of all the published translations.
MinT (Machine in Translation), a Wikimedia Foundation hosted open-source machine translation service, is the second most used translation service, accounting for 16% of all the published translations.
Compared to last year, the year-over-year growth in usage of the MinT translation service is ~8%,³ while at the same time Google’s service usage reduced by ~9%.⁴
No machine translation was used (scratch) for ~7% of the translations published using the Content Translation tool.

Code

warnings.filterwarnings('ignore')

# bar chart (work on a copy so that re-running the cell does not rescale the data again)
pct_col = 'Percent of all published translations'
chart_data = mt_compare_overall.copy()
chart_data[pct_col] *= 100

plt.figure(figsize=(10, 6))
ax = sns.barplot(data=chart_data, x='Machine translation service', y=pct_col, palette="BuGn_r")

# format y-axis as percentage
ax.yaxis.set_major_formatter(mtick.PercentFormatter(decimals=0))

# add data labels for bars
for p in ax.patches:
    ax.annotate(f'{p.get_height():.2f}%', (p.get_x() + p.get_width() / 2., p.get_height()), 
                ha='center', va='center', fontsize=8, fontweight='bold', color='black', 
                xytext=(0, 10), textcoords='offset points')

# plot and axes titles
plt.title(f"Published translations by machine translation service across all language pairs\n{period_label}")
plt.xlabel("Machine translation service", fontweight='bold')
plt.ylabel("Percent of all published translations", fontweight='bold')

plt.show()

Daily published translations

Translations published per day by each MT service were reviewed to identify any sudden increases or decreases in usage and to determine if those changes corresponded to deployments or setting changes that may have impacted MT availability.

Code

mt_daily = query(f"""
WITH base AS (
    SELECT 
        mt_service,
        translation_start_time AS date,
        translation_id
    FROM 
        mt_logs
    WHERE
        translation_start_time >= '{start_date}'
        AND translation_start_time <= '{end_date}'
        AND mt_service != 'undefined')

SELECT
    date,
    mt_service,
    COUNT(DISTINCT translation_id) AS n_translations
FROM
    base
GROUP BY
    date,
    mt_service
""", True)

Code

warnings.filterwarnings('ignore')
mt_daily = (
    mt_daily
    .replace([np.inf, -np.inf], np.nan)
    .dropna(subset=['date', 'n_translations'])
)

Code

mt_services = mt_daily.mt_service.unique()

# subplots
fig = sp.make_subplots(rows=4, cols=2, 
                       shared_xaxes=True, 
                       subplot_titles=mt_services,
                       x_title='Date',
                       y_title='Number of translations',
                       vertical_spacing=0.05, 
                       horizontal_spacing=0.05)

for i, mt_service in enumerate(mt_services):
    row, col = divmod(i, 2)
    service_data = mt_daily.query(f"mt_service == '{mt_service}'").sort_values('date')

    fig.add_trace(go.Scatter(x=service_data['date'], 
                             y=service_data['n_translations'], 
                             mode='lines',
                             name=mt_service,
                             showlegend=False, 
                             line=dict(color='DarkCyan')),
                  row=row+1, col=col+1)

fig.update_xaxes(range=[mt_daily['date'].min(), pd.to_datetime(end_date)])
fig.update_layout(title_text=f"Daily number of published translations created by MT Service<br>{period_label}",
                  title_x=0.5, height=800, width=iplot_width)

iplot(fig, config=iplot_config)

Note

From the above charts showing the daily number of published translations by service, we can observe that there is a spike around 24-26 June 2024 for most of the services. Also, there was an increase in MinT service usage between February and March, where on average 300-400 articles were published daily. We’ll further investigate these.
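
One way to make "spike" concrete is to flag days whose count is well above a service's median. The sketch below is only an illustration and is not the method used in this report, where the spikes were read off the charts. The helper `flag_spikes`, the factor of 3, and the sample counts are all hypothetical.

```python
from statistics import median


def flag_spikes(daily_counts, factor=3.0):
    """Return the days whose count exceeds `factor` times the median daily count.

    `daily_counts` maps a day label to a number of translations.
    The threshold factor is an arbitrary, illustrative choice.
    """
    threshold = factor * median(daily_counts.values())
    return sorted(day for day, n in daily_counts.items() if n > threshold)


# Hypothetical daily counts for one service (not taken from the report's data).
sample = {
    "day-01": 110,
    "day-02": 120,
    "day-03": 118,
    "day-04": 900,
    "day-05": 1100,
    "day-06": 125,
    "day-07": 115,
}

print(flag_spikes(sample))
```

Using the median rather than the mean as the baseline keeps the threshold from being pulled up by the spike days themselves.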

Rise in daily average of MinT usage during February and March 2024

Code

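mint_spike = query(f"""
WITH base AS (
    SELECT
        MONTH(translation_start_time) AS month,
        MONTHNAME(translation_start_time) AS month_name,
        target_wiki_db AS wiki_db,
        COUNT(DISTINCT target_revision_id) AS article_count
    FROM
        mt_logs
    WHERE
        mt_service = 'MinT'
        -- restrict to the analysis period so that months from other years are not mixed in
        AND translation_start_time >= '{start_date}'
        AND translation_start_time <= '{end_date}'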
    GROUP BY
        month,
        month_name,
        wiki_db
),

ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY month
            ORDER BY article_count DESC
        ) AS rank
    FROM
        base
    WHERE
        article_count > 50
        AND month <= 4
)

SELECT * 
FROM ranked 
WHERE rank <= 3 
ORDER BY month, rank
""", True)

Code

mint_spike_tbl = (
    gt.GT(mint_spike, groupname_col='month_name', rowname_col='wiki_db')
    .tab_header('Top Wikis using MinT', 'Jan to Apr 2024')
    .cols_hide(['month', 'rank'])
    .data_color('article_count', palette='Greens')
    .cols_label(article_count='Total articles translated')
    .opt_stylize()
)

mint_spike_tbl

Top Wikis using MinT
Jan to Apr 2024
Wiki	Total articles translated
January
ffwiki	250
kiwiki	80
minwiki	51
February
hawiki	2369
igwiki	1624
satwiki	362
March
hawiki	1541
urwiki	1114
igwiki	934
April
hawiki	413
uzwiki	332
urwiki	270

Note

The increase in the average number of daily translations using MinT between February and March 2024 came from increased activity on the Hausa, Urdu and Santali Wikipedias, likely due to an article creation or translation campaign. For all these languages, the default translation service from English is MinT.

Spike in daily translations during 24-26 June 2024

Code

spikes = {
    'Google': ['2024-03-06', '2024-03-07', '2024-06-25', '2024-06-26', '2024-06-27'],
    'Apertium': ['2024-05-11'],
    'MinT': ['2024-06-25', '2024-06-26', '2024-06-27'],
    'Yandex': ['2024-06-25', '2024-06-26', '2024-06-27'],
}

Code

june2024_spike = query("""
WITH base AS (
    SELECT
         target_wiki_db,
         mt_service,
         COUNT(DISTINCT target_revision_id) AS article_count
    FROM 
        mt_logs
    WHERE
        translation_start_time IN ('2024-06-25', '2024-06-26', '2024-06-27')
    GROUP BY
         translation_start_time,
         target_wiki_db,
         mt_service
    ORDER BY
        article_count DESC
),

ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY mt_service
            ORDER BY article_count DESC
        ) AS rank
    FROM
        base
)

SELECT
    target_wiki_db AS Wikipedia,
    mt_service AS Service,
    article_count AS '# Articles'
FROM
    ranked
WHERE
    rank = 1
    AND article_count > 25
ORDER BY
    article_count DESC
""", True)

june2024_spike_tbl = (
    gt.GT(june2024_spike)
    .tab_header('Source of spike in MT usage', 'during 24-26 June 2024')
)

june2024_spike_tbl

Source of spike in MT usage
during 24-26 June 2024
Wikipedia	Service	# Articles
uzwiki	MinT	2814
uzwiki	Google	1681
uzwiki	Yandex	88
uzwiki	scratch	82
euwiki	Elia	32

Note

The spike was caused by increased activity on Uzbek Wikipedia, possibly due to a campaign. This can also be observed on Wikistats. During the period, MinT was the most used service; it is also the default service when translating from English to Uzbek.

Code

mt_daily_agg = (
    mt_daily
    .groupby('mt_service')
    .agg(
        avg_daily=('n_translations', 'mean'),
        median_daily=('n_translations', 'median')
    )
    .sort_values('avg_daily', ascending=False)
)

mt_daily_agg_tbl = (
    gt
    .GT(mt_daily_agg.reset_index(), rowname_col='mt_service')
    .tab_header('Number of daily translations by service', period_label)
    .fmt_number(decimals=0)
    .cols_label(
        avg_daily='Average',
        median_daily='Median'
    )
    .opt_stylize()
)

mt_daily_agg_tbl

Number of daily translations by service
January to September 2024
Service	Average	Median
Google	686	626
MinT	156	118
scratch	66	58
Apertium	23	21
Yandex	19	17
Elia	14	12
LingoCloud	1	1

Summary

On average, ~960 articles are translated and published daily using the Content Translation tool.
Compared to 2023, the median number of daily translations:
- using Google reduced from 669 to 626 (6% decrease).
- using MinT increased from 48 to 118 (145% increase).
- for other services, the change is not more than 1-2%.
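
The year-over-year percentages are relative changes between the 2023 and 2024 medians. As a small worked check in Python (the helper name `pct_change` is illustrative, not from the notebook):

```python
def pct_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100


# Median daily translations, 2023 vs. 2024, as quoted in the summary.
google_change = pct_change(669, 626)
mint_change = pct_change(48, 118)

print(f"Google: {google_change:+.1f}%")
print(f"MinT:   {mint_change:+.1f}%")
```

The results, about -6.4% for Google and +145.8% for MinT, correspond to the 6% decrease and 145% increase quoted in the summary.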

Usage by language pair

The number and percentage of publications by each machine translation service at each language pair (i.e., source language and target language) were reviewed. Because of the large number of language pairs, the data was saved to a Google Spreadsheet so that the percentage of publications for each machine translation service can be easily filtered by language pair.
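
The per-pair percentage is each service's count divided by the total for its (source, target) pair. A minimal sketch of that grouping in plain Python follows. The rows are made-up placeholders rather than report data, and `share_by_pair` is an illustrative name.

```python
from collections import defaultdict


def share_by_pair(rows):
    """Add each service's share of translations within its language pair.

    `rows` is an iterable of (source, target, service, n_translations) tuples.
    Returns a list of (source, target, service, n_translations, share) tuples.
    """
    rows = list(rows)
    totals = defaultdict(int)
    for source, target, _service, n in rows:
        totals[(source, target)] += n
    return [
        (source, target, service, n, n / totals[(source, target)])
        for source, target, service, n in rows
    ]


# Hypothetical counts for two language pairs (illustration only).
rows = [
    ("en", "xx", "Google", 60),
    ("en", "xx", "MinT", 30),
    ("en", "xx", "scratch", 10),
    ("yy", "xx", "MinT", 5),
]

for row in share_by_pair(rows):
    print(row)
```

The denominator is computed per pair, so the shares within each pair sum to 1 regardless of how many translations other pairs have.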

Code

mt_by_langpair = query(f"""
WITH base AS (
    SELECT
        source_language,
        target_language,
        mt_service,
        COUNT(DISTINCT translation_id) AS n_translations
    FROM 
        mt_logs
    WHERE
        translation_start_time >= '{start_date}' 
        AND translation_start_time <= '{end_date}'
        AND mt_service != 'undefined'
    GROUP BY
        source_language,
        target_language,
        mt_service
    ORDER BY
        source_language,
        target_language,
        n_translations
)

SELECT
    *,
    n_translations / SUM(n_translations) OVER (PARTITION BY source_language, target_language) AS pct_translations
FROM
    base
""", True)

Code

mt_by_langpair.to_csv('~/git/machine-translation-service-usage-analysis/data_gathering/secrets/mt_usage_langpair.tsv', sep='\t', index=False)

Higher use of optional service

Listing of language pairs where an optional service was used more or close to the default service.

Code

mt_defaults = query("""
WITH base AS (
    SELECT
        *,
        source_language||'-'||target_language AS pair
    FROM
        mt_by_langpair mt
)
    
SELECT 
    b.* EXCLUDE(mt_service),
    b.mt_service AS service_used,
    dfs.service AS default_service
FROM 
    base b
    JOIN mt_defaults dfs
    ON b.pair = dfs.pair
WHERE
    service_type = 'default_mt'
""", True)

Code

mt_optional_services_used_more = mt_defaults[
    (mt_defaults['service_used'] != mt_defaults['default_service']) & 
    (mt_defaults['pct_translations'] > 0.50) &
    (mt_defaults['n_translations'] > 10)
].sort_values('pair')

mt_optional_services_used_more_tbl = (
    gt
    .GT(mt_optional_services_used_more, groupname_col='default_service', rowname_col='pair')
    .tab_header('Language pairs where an optional service was used more or close to the default', 'having at least 10 published translations; accounting for 50% or more translations')
    .fmt_percent(columns='pct_translations', decimals=0)
    .cols_label(
        source_language='Source',
        target_language='Target',
        n_translations='# Translations',
        pct_translations='% Translations',
        service_used='MT service used'
    )
    .opt_stylize(6)
)

mt_optional_services_used_more_tbl

Language pairs where an optional service was used more or close to the default
having at least 10 published translations; accounting for 50% or more translations
Pair	Source	Target	# Translations	% Translations	MT service used
Google
ar-en	ar	en	123	95%	scratch
ba-tt	ba	tt	17	89%	MinT
bg-nl	bg	nl	11	52%	scratch
de-en	de	en	27	100%	scratch
en-ban	en	ban	24	96%	MinT
en-din	en	din	13	81%	MinT
en-ja	en	ja	101	53%	scratch
en-min	en	min	19	90%	MinT
en-new	en	new	47	96%	MinT
en-pam	en	pam	13	81%	MinT
en-shn	en	shn	11	100%	MinT
en-ss	en	ss	80	94%	MinT
en-tn	en	tn	78	80%	MinT
en-tum	en	tum	12	92%	MinT
en-ve	en	ve	44	90%	MinT
en-war	en	war	22	88%	MinT
fa-en	fa	en	131	80%	scratch
fr-br	fr	br	13	62%	MinT
he-en	he	en	163	87%	scratch
id-ace	id	ace	18	95%	MinT
id-ban	id	ban	13	93%	MinT
id-min	id	min	60	97%	MinT
it-en	it	en	14	88%	scratch
it-fur	it	fur	122	98%	MinT
it-vec	it	vec	26	96%	MinT
ja-en	ja	en	14	100%	scratch
lmo-it	lmo	it	13	100%	MinT
pt-fo	pt	fo	11	100%	MinT
simple-id	simple	id	18	69%	scratch
tt-ba	tt	ba	40	62%	Yandex
uk-be	uk	be	146	71%	Yandex
zh-en	zh	en	21	51%	scratch
MinT
en-as	en	as	109	69%	Google
en-el	en	el	2119	62%	Google
en-gu	en	gu	44	57%	Google
en-hi	en	hi	722	56%	Google
en-ig	en	ig	11144	66%	Google
en-kn	en	kn	799	63%	Google
en-or	en	or	605	89%	Google
en-pa	en	pa	1393	52%	Google
en-te	en	te	4149	66%	Google
en-uz	en	uz	15350	61%	Google
Apertium
es-en	es	en	62	54%	scratch
mk-sr	mk	sr	14	61%	Google

Summary

There are 44 language pairs where an optional service (or no service, i.e., scratch) was used more or close to the default.
Among the language pairs where Google is the default:
- For 20 language pairs, MinT was used more than Google. The pairs are: ba-tt, en-ban, en-din, en-min, en-new, en-pam, en-shn, en-ss, en-tn, en-tum, en-ve, en-war, fr-br, id-ace, id-ban, id-min, it-fur, it-vec, lmo-it, and pt-fo.
- For 10 language pairs, no service (i.e., scratch) was used more than Google. The pairs are: ar-en, bg-nl, de-en, en-ja, fa-en, he-en, it-en, ja-en, simple-id, and zh-en.
- For 2 language pairs (tt-ba and uk-be), Yandex was used more than Google.
Among the language pairs where MinT is the default, for 10 language pairs, Google was used more. The pairs are: en-as, en-el, en-gu, en-hi, en-ig, en-kn, en-or, en-pa, en-te, and en-uz.
Among the language pairs where Apertium is the default, scratch was used more for es-en and Google was used more for mk-sr.
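
The selection behind this list keeps a row when the service used differs from the default, accounts for more than half of the pair's translations, and has more than 10 translations. A sketch of that filter in plain Python, assuming those thresholds; `optional_used_more` is an illustrative name, and the last two rows are invented to show what gets excluded.

```python
def optional_used_more(rows, min_translations=10, min_share=0.50):
    """Keep rows where a non-default service exceeds both thresholds.

    Each row is a dict with keys: pair, service_used, default_service,
    n_translations, pct_translations (a fraction between 0 and 1).
    """
    return [
        row for row in rows
        if row["service_used"] != row["default_service"]
        and row["pct_translations"] > min_share
        and row["n_translations"] > min_translations
    ]


rows = [
    # Two rows restated from the report's table (percentages as rounded there).
    {"pair": "it-fur", "service_used": "MinT", "default_service": "Google",
     "n_translations": 122, "pct_translations": 0.98},
    {"pair": "en-or", "service_used": "Google", "default_service": "MinT",
     "n_translations": 605, "pct_translations": 0.89},
    # Hypothetical rows, invented only to show what gets filtered out.
    {"pair": "aa-bb", "service_used": "Google", "default_service": "Google",
     "n_translations": 500, "pct_translations": 0.90},
    {"pair": "cc-dd", "service_used": "MinT", "default_service": "Google",
     "n_translations": 4, "pct_translations": 0.80},
]

print([row["pair"] for row in optional_used_more(rows)])
```

The first hypothetical row is dropped because the service used is the default, and the second because it has too few translations.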

Usage at each target language

Next, a closer look was taken at each machine translation service, identifying its usage at all target languages where available and determining the languages each service is helping to support the most.

Code

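def chart_order_services(services, selected_service):
    # put the selected service first, followed by the other services in alphabetical order
    # (builds a new list instead of modifying the caller's list in place)
    others = sorted(s for s in services if s != selected_service)
    return [selected_service] + others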

Code

def get_service_usage_by_target(service):
    
    # get usage information by selected service    
    service_usage_by_target = query(f"""
        WITH languages AS (
            SELECT 
                DISTINCT target_language
            FROM 
                mt_logs
            WHERE
                translation_start_time >= '{start_date}' 
                AND translation_start_time <= '{end_date}' 
                AND mt_service = '{service}'
                AND NOT mt_service = 'undefined'
        ),

        base AS (
            SELECT
                *
            FROM
                mt_logs
            WHERE
                target_language IN (SELECT target_language FROM languages)
                AND translation_start_time >= '{start_date}' 
                AND translation_start_time <= '{end_date}'
                AND NOT mt_service = 'undefined'
        ),

        agg AS (
            SELECT
                target_language,
                mt_service,
                COUNT(DISTINCT translation_id) AS n_translations
            FROM
                base
            GROUP BY
                target_language,
                mt_service
            ORDER BY
                target_language,
                n_translations DESC
        )
                    
        SELECT
            *,
            n_translations / SUM(n_translations) OVER (PARTITION BY target_language) AS pct_translations
        FROM
            agg
        """, True)
    
    return service_usage_by_target

Code

# plot to generate usage chart

def chart_usage(service, min_translations=10, min_percent=0.1, chart_height=750, chart_width=1400, xlabel_offset=0.025, return_fig=False):
    
    service_usage_by_target = get_service_usage_by_target(service)
    
    # top languages
    top_langs = (
        service_usage_by_target
        .query(f"""(mt_service == @service) & \
                    (n_translations >= {min_translations}) & \
                    (pct_translations > {min_percent})""")
        .sort_values(['pct_translations'], ascending=False)
        .target_language
        .values
        .tolist()
    )
    
    top_langs_usage = (
        service_usage_by_target
        .query("""target_language == @top_langs""")
        .assign(
            target_language=lambda df: pd.Categorical(
                df['target_language'], 
                categories=top_langs, 
                ordered=True),
            mt_service=lambda df: pd.Categorical(
                df['mt_service'], 
                categories=chart_order_services(df.mt_service.unique().tolist(), service), 
                ordered=True)
        )
        .sort_values(['target_language', 'mt_service'])
    )
    
    if service == 'scratch':
        chart_title = 'Languages where no MT service was used (scratch)'
    else:
        chart_title = f'Languages most supported by {service} (by percentage of published translations)'
    
    fig = px.bar(top_langs_usage, 
                 y='target_language', 
                 x='pct_translations', 
                 color='mt_service', 
                 orientation='h', 
                 height=chart_height, 
                 width=chart_width,
                 color_discrete_sequence=px.colors.qualitative.T10,
                 labels={
                     'target_language': 'Target language', 
                     'pct_translations': 'Percent of all published translations', 
                     'mt_service': 'MT service'
                 },
                 title=chart_title,
                 category_orders={'target_language': top_langs})
    
    fig.update_xaxes(tickformat=".0%")
    annotations = []
    
    # add data labels for the selected service only
    for _, row in top_langs_usage.iterrows():
        if row['mt_service'] == service:
            annotations.append(
                dict(
                    x=row['pct_translations'] - xlabel_offset,
                    y=row['target_language'],
                    text=f"{row['pct_translations']:.0%}",
                    showarrow=False,
                    font=dict(color="white")
                )
            )
            
    fig.update_layout(annotations=annotations)
    
    if return_fig:
        return fig
    else:
        fig.show()

Code

print(f'Available services: {mt_by_langpair.mt_service.unique().tolist()}')

Available services: ['Google', 'scratch', 'Yandex', 'MinT', 'Apertium', 'LingoCloud', 'Elia']

MinT

MinT (Machine in Translation) is a machine translation service based on open-source neural machine translation models. The service is hosted in the Wikimedia Foundation infrastructure, and it runs translation models that have been released by other organizations with an open-source license. MinT is designed to provide translations from multiple machine translation models. Initially, it uses the following models: NLLB-200, OpusMT, IndicTrans2 and Softcatalà. From January to September 2024, MinT accounted for 16% of all the published translations.

Code

iplot(chart_usage('MinT', chart_height=1500, chart_width=iplot_width, return_fig=True, min_translations=5), config=iplot_config)

Summary

MinT was used for 100% of the translations to the following nine languages: Buginese (bug), Banjar (bjn), Limburgish (li), Sicilian (scn), Kongo (kg), Shan (shn), Fiji Hindi (hif), Low Saxon (nds), and Cherokee (chr).
MinT was used for 90% or more (but not all) of the translations to the following 18 languages: Santali (sat), Fula (ff), Kashmiri (ks), Kikuyu (ki), Central Bikol (bcl), Friulian (fur), Crimean Tatar (crh), Newar (new), South Azerbaijani (azb), Minangkabau (min), Balinese (ban), Faroese (fo), Swati (ss), Tumbuka (tum), Gun (guw), Icelandic (is), Southern Sotho (st), and Zulu (zu).
MinT was used for the majority (more than 50% but less than 90%) of the translations to the following 18 languages: Venda (ve), Waray (war), Lombard (lmo), Acehnese (ace), Tswana (tn), Kabyle (kab), Venetian (vec), Dinka (din), Kapampangan (pam), Tibetan (bo), Breton (br), Meitei (mni), Ligurian (lij), Malayalam (ml), Maithili (mai), Moroccan Arabic (ary), Fon (fon), and Dzongkha (dz).
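
The three summary lines group target languages by the share of translations that used the service. A sketch of that grouping follows, under the assumption that the bands do not overlap (exactly 100%, 90% up to but not including 100%, and above 50% up to but not including 90%). The function name `usage_band` and the exact band edges are illustrative, inferred from the summary wording rather than taken from the notebook.

```python
def usage_band(share):
    """Map a service's share of translations (0 to 1) to a summary band.

    The band edges are an assumption inferred from the summary wording.
    """
    if share >= 1.0:
        return "100%"
    if share >= 0.90:
        return "90% or more"
    if share > 0.50:
        return "majority (>50%)"
    return "minority"


for share in (1.0, 0.96, 0.62, 0.30):
    print(f"{share:.0%} -> {usage_band(share)}")
```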

Google

Code

iplot(chart_usage('Google', chart_height=2500, chart_width=iplot_width, return_fig=True), config=iplot_config)

Summary

Google was used for 100% of the translations to nine languages.⁵
Google was used for 90% or more (but not all) of the translations to 53 languages.⁶
Google was used for the majority (more than 50% but less than 90%) of the translations to 49 languages.⁷

Apertium

Code

iplot(chart_usage('Apertium', chart_height=500, chart_width=iplot_width, return_fig=True, min_translations=5), config=iplot_config)

Summary

Aragonese is the only target language for which Apertium was used for 90% of the translations.
Apertium was used for the majority of the translations (>50%) to 6 languages.⁸

Yandex

Code

iplot(chart_usage('Yandex', chart_height=500, chart_width=iplot_width, return_fig=True, min_translations=5), config=iplot_config)

Summary

Chuvash is the only target language for which Yandex was used for 100% of the translations.
Yandex was used for the majority of the translations (>50%) to 2 languages.⁹

Elia

Code

iplot(chart_usage('Elia', min_translations=5, min_percent=0.05, chart_height=350, chart_width=iplot_width, return_fig=True), config=iplot_config)

Summary

Basque (eu) is the only target language for which Elia was the most used service (85% of the translations).

LingoCloud

Code

iplot(chart_usage('LingoCloud', min_translations=1, min_percent=0, chart_height=350, xlabel_offset=0.009, chart_width=iplot_width, return_fig=True), config=iplot_config)

Summary

Even for languages where LingoCloud is supported, its usage has been quite low: ~2% of 4000+ translations to Chinese (zh), and less than 1% of 150+ translations to Wu Chinese (wuu).

Percent MT content modified

The content provided by each machine translation service can be modified by the user before publishing. The analysis tracks the percentage by which each translation is modified by the user before publication. Following the MT abuse calculation documentation, a warning or error is displayed to the user based on the extent of unmodified content. This encourages users to make further edits. Depending on the situation, some users may still be able to publish their translations, but the resulting page may be added to a tracking category for potentially unreviewed translations, subject to community review. In other cases, users may not be allowed to publish.

For the purpose of this analysis, we have focused on published translations and have categorized the extent to which machine translation content was modified by users into three categories: less than 10%, between 10% and 50%, and over 50%. These categories can be adjusted as needed.
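
The three categories can be expressed as a small function. This is a sketch, assuming the modification share is stored as a fraction between 0 and 1 and that the 10% and 50% boundaries both fall in the middle category. The name `modification_bucket` is illustrative.

```python
def modification_bucket(human_share):
    """Bucket a translation by the share of content modified by the user.

    `human_share` is a fraction between 0 and 1; None (missing) stays None.
    """
    if human_share is None:
        return None
    if human_share < 0.1:
        return "less than 10%"
    if human_share <= 0.5:
        return "between 10% and 50%"
    return "over 50%"


for value in (0.05, 0.10, 0.50, 0.75, None):
    print(value, "->", modification_bucket(value))
```

Changing the category boundaries only requires editing the two thresholds, which is what makes the categories easy to adjust.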

Method: Data on the percentage by which each translation was modified comes from the translations_progress field¹⁰ in the cx_translation table (as indicated by the human percentage stat).

Code

conn.sql(f"""
CREATE OR REPLACE VIEW hpct_modified AS
SELECT
    *,
    CASE 
        WHEN human_translated_percent < 0.1 THEN 'less than 10%'
        WHEN human_translated_percent >= 0.1 AND human_translated_percent <= 0.5 THEN 'between 10% and 50%'
        WHEN human_translated_percent >0.5 THEN 'over 50%'
    END AS 'pct_modified'
FROM
    mt_logs
WHERE
    translation_start_time >= '{start_date}'
    AND translation_start_time <= '{end_date}'
    AND mt_service != 'scratch'
    AND target_wiki_db != 'uzwiki'
""")

Code

pct_modified_overall = query("""
WITH base AS (
    SELECT
        mt_service,
        pct_modified,
        COUNT(DISTINCT translation_id) AS n_translations
    FROM
        hpct_modified
    GROUP BY
        mt_service,
        pct_modified
)
        
SELECT
    *,
    n_translations / SUM(n_translations) OVER (PARTITION BY mt_service) AS pct_translations
FROM
    base
""", True)

Code

pct_order = ['less than 10%', 'between 10% and 50%', 'over 50%']

# chart function
def chart_pct_modified(df, title, pct_order=pct_order, iplot_width=800):
    
    mt_service_order = (
        df
        .query("""pct_modified == 'less than 10%'""")
        .sort_values('pct_translations', ascending=False)
        .mt_service
        .unique()
        .tolist()
    )

    # work on a copy so the caller's DataFrame is not modified
    df = df.copy()
    df['mt_service'] = pd.Categorical(df['mt_service'], categories=mt_service_order, ordered=True)
    df['pct_modified'] = pd.Categorical(df['pct_modified'], categories=pct_order, ordered=True)
    df = df.sort_values(['mt_service', 'pct_modified'], ascending=[True, True])

    fig = go.Figure()

    for pct_mod in pct_order:
        data = df[df['pct_modified'] == pct_mod]
        fig.add_trace(go.Bar(
            x=data['pct_translations'],
            y=data['mt_service'],
            orientation='h',
            name=pct_mod,
            text=[f"{val:.0%}" for val in data['pct_translations']],  # Format text labels as percentages
            textposition='auto',
            textfont_color='white',
            marker_color={pct: px.colors.qualitative.T10[i] for i, pct in enumerate(pct_order)}[pct_mod] 
        ))

    fig.update_layout(
        barmode='stack',
        height=600,
        width=iplot_width,
        legend=dict(
            orientation="h",
            yanchor="bottom",
            y=1.02,
            xanchor="right",
            x=1
        ),
        xaxis_tickformat=".0%",
        title=title
    )

    return fig

Code

iplot(chart_pct_modified(pct_modified_overall, 'Percentage published translations modified by users'), config=iplot_config)

Summary

Note: uzwiki has been excluded as it was introducing a skew due to increased activity from a campaign.

The majority of translations across all MT services were modified by at least 10% before publication.
Of the translations that used MinT, 32% were modified less than 10%, the highest share among all services.¹¹
For both MinT and Google, 54% of translations had a human modification percentage between 10% and 50%.
The share of translations with a human modification percentage above 50% is lowest for Elia and MinT, at 11% and 13% respectively.
Apertium has the highest share of translations where the human modification percentage was more than 50%.
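The shares quoted above are within-service proportions: each count is divided by the total for its MT service, which is what the `SUM(...) OVER (PARTITION BY mt_service)` window in the query computes. A rough pandas equivalent, using made-up counts for two placeholder services `A` and `B` (these are not the report's data), might look like this:

```python
import pandas as pd

# Hypothetical counts, for illustration only (not the report's data).
counts = pd.DataFrame({
    "mt_service": ["A", "A", "A", "B", "B", "B"],
    "pct_modified": ["less than 10%", "between 10% and 50%", "over 50%"] * 2,
    "n_translations": [30, 50, 20, 10, 60, 30],
})

# Equivalent of: n_translations / SUM(n_translations) OVER (PARTITION BY mt_service)
service_totals = counts.groupby("mt_service")["n_translations"].transform("sum")
counts["pct_translations"] = counts["n_translations"] / service_totals

print(counts)
```

Because the denominator is per service, the three categories sum to 100% within each service, so services with very different volumes can be compared on the same scale.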

at MinT supported languages

Code

pct_modified_mint_by_lang = query("""
WITH
    mint_langs AS (
        SELECT
            target_language,
            COUNT(DISTINCT translation_id) AS n_translations
        FROM
            hpct_modified
        WHERE
            mt_service = 'MinT'
        GROUP BY
            target_language
    ),
    
    base AS (
        SELECT
            mt_service,
            pct_modified,
            target_language,
            COUNT(DISTINCT translation_id) AS n_translations
        FROM
            hpct_modified
        WHERE
            target_language IN (SELECT DISTINCT target_language FROM mint_langs WHERE n_translations >= 10)
        GROUP BY
            mt_service,
            pct_modified,
            target_language
        )
    
SELECT
    *,
    n_translations / SUM(n_translations) OVER (PARTITION BY target_language, mt_service) AS pct_translations
FROM
    base
""", True)

Code

pct_modified_mint_by_lang['pct_modified'] = pd.Categorical(pct_modified_mint_by_lang['pct_modified'], categories=pct_order, ordered=True)
pct_modified_mint_by_lang = pct_modified_mint_by_lang.sort_values(['pct_modified'], ascending=[True])

Code

color_mapping = {
    'less than 10%': px.colors.qualitative.T10[0],
    'between 10% and 50%': px.colors.qualitative.T10[1],
    'over 50%': px.colors.qualitative.T10[2]
}

unique_target_languages = pct_modified_mint_by_lang['target_language'].unique()
num_columns = 4
num_rows = (len(unique_target_languages) + num_columns - 1) // num_columns

fig = sp.make_subplots(
    rows=num_rows, 
    cols=num_columns, 
    subplot_titles=unique_target_languages, 
    horizontal_spacing=0.1, 
    vertical_spacing=0.005, 
    shared_xaxes=True, 
    specs=[[{}] * num_columns] * num_rows
)

all_traces = []

for i, target_language in enumerate(unique_target_languages):
    row_num = i // num_columns + 1
    col_num = i % num_columns + 1
    
    filtered_data = pct_modified_mint_by_lang.query(f"target_language == '{target_language}'")
    
    traces = []
    categories = filtered_data['pct_modified'].unique()

    for category in categories:
        category_data = filtered_data[filtered_data['pct_modified'] == category]
        
        trace = go.Bar(
            x=category_data['pct_translations'],
            y=category_data['mt_service'],
            orientation='h',
            name=category,
            marker=dict(color=color_mapping[category]),
            text=[f"{val:.0%}" if val > 0.2 else '' for val in category_data['pct_translations']],
            textposition='auto',
            textfont=dict(size=10)
        )
        
        traces.append(trace)
        
        
    for trace in traces:
        fig.add_trace(trace, row=row_num, col=col_num)

    
fig.update_layout(
    title="Percentage published translations modified by users at MinT supported languages",
    height=300 * num_rows,
    width=1050,
    barmode='stack',
    showlegend=False
)

for row_num in range(1, num_rows + 1):
    for col_num in range(1, num_columns + 1):
        fig.update_xaxes(tickformat=".0%", row=row_num, col=col_num)

Code

iplot(fig, config=iplot_config)

Deletion rate by MT service

Code

mt_deletion_overall = query("""
SELECT
    mtd.mt_service AS mt_service,
    SUM(created_cx_total)::INT AS '# Articles created',
    SUM(deleted_cx_total)::INT AS '# Articles deleted',
    SUM(deleted_cx_total) / SUM(created_cx_total)  AS 'Percentage of deleted articles',
    pct_translations AS 'Percent of translations modified under 10%'
FROM
    mt_deletion_ratios mtd
LEFT JOIN (
    SELECT
        mt_service,
        pct_translations
    FROM
        pct_modified_overall
    WHERE
        pct_modified = 'less than 10%'
    ) pct_modf
    ON mtd.mt_service = pct_modf.mt_service
WHERE
    wiki != 'uzwiki'
GROUP BY
    mtd.mt_service,
    pct_translations
""", True).sort_values('# Articles created', ascending=False)

Code

mt_deletion_overall_tbl = (
    gt
    .GT(mt_deletion_overall, rowname_col='mt_service')
    .tab_header('Percentage of articles deleted, by Machine Translation Service', f'{period_label}; excluding uzwiki')
    .fmt_percent(columns=['Percentage of deleted articles', 'Percent of translations modified under 10%'])
    .opt_stylize()
    .tab_source_note(gt.html('Note: As observed before, uzwiki had a high creation and deletion rate due to a campaign,<br>skewing the overall percentage, so it has been excluded.<br>Please refer to the appendix for uzwiki aggregates.'))
)

mt_deletion_overall_tbl

Percentage of articles deleted, by Machine Translation Service
January to September 2024; excluding uzwiki
MT service	# Articles created	# Articles deleted	Percentage of deleted articles	Percent of translations modified under 10%
Google	153991	5082	3.30%	29.86%
MinT	30320	675	2.23%	32.41%
scratch	16028	490	3.06%
Apertium	5781	156	2.70%	15.12%
Yandex	3925	126	3.21%	20.53%
Elia	3591	106	2.95%	15.74%
LingoCloud	78	4	5.13%	38.18%
Note: As observed before, uzwiki had a high creation and deletion rate due to a campaign, which skewed the overall percentage, so it has been excluded. Please refer to the appendix for uzwiki aggregates.

Summary

Articles translated using MinT were deleted the least: 2.23% of the 30,000+ articles.
Among MT services with more than 1,000 articles, Google (3.30%) and Yandex (3.21%) had the highest deletion rates, both above 3%. LingoCloud's 5.13% is based on only 78 articles.
2.7% of the articles translated using Apertium were deleted.
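The percentages in this summary can be re-derived from the counts in the table above. The following is a small sketch of that arithmetic; the `deletion_rate` helper and the 1,000-article cutoff are illustrative choices, not part of the report's pipeline.

```python
# Counts copied from the table above (January to September 2024, excluding uzwiki):
# service -> (articles created, articles deleted)
articles = {
    "Google": (153991, 5082),
    "MinT": (30320, 675),
    "scratch": (16028, 490),
    "Apertium": (5781, 156),
    "Yandex": (3925, 126),
    "Elia": (3591, 106),
    "LingoCloud": (78, 4),
}

def deletion_rate(created, deleted):
    """Share of created articles that were later deleted."""
    return deleted / created

for service, (created, deleted) in articles.items():
    print(f"{service}: {deletion_rate(created, deleted):.2%}")

# Lowest rate among services with more than 1,000 articles (illustrative cutoff).
large = {s: deletion_rate(c, d) for s, (c, d) in articles.items() if c > 1000}
print(min(large, key=large.get))
```

Restricting the comparison to services with a sizeable number of articles keeps LingoCloud's 78 articles from dominating the ranking.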

at MinT supported languages, by wiki

Code

mt_deletion_mint_langs = query("""
WITH mint_langs AS (
    SELECT
        COUNT(DISTINCT translation_id) AS n_translations,
        target_language||'wiki' AS wiki
    FROM 
        mt_logs
    WHERE 
        mt_service = 'MinT' 
    GROUP BY
        target_language
),
    
deletion_ratios AS (
    SELECT
        wiki,
        mt_service,
        SUM(created_cx_total) AS created_cx_total,
        SUM(deleted_cx_total) AS deleted_cx_total
    FROM 
        mt_deletion_ratios
    WHERE
        wiki IN (SELECT DISTINCT wiki FROM mint_langs WHERE n_translations > 15)
    GROUP BY
        wiki,
        mt_service
)
            
SELECT 
    *,
    deleted_cx_total / created_cx_total AS deletion_ratio  -- same definition as the overall table (deleted / created)
FROM 
    deletion_ratios
""", True)

Code

mt_deletion_mint_langs_min1 = mt_deletion_mint_langs.query("""deletion_ratio > 0.01""")
unique_wikis = mt_deletion_mint_langs_min1['wiki'].unique()

num_wikis = len(unique_wikis)
num_rows = (num_wikis + 2) // 3

fig = sp.make_subplots(
    rows=num_rows, 
    cols=3, 
    subplot_titles=unique_wikis, 
    horizontal_spacing=0.1/2, 
    vertical_spacing=0.035/2, 
    shared_yaxes=True
)

traces = []

for i, wiki in enumerate(unique_wikis):
    row_num = i // 3 + 1
    col_num = i % 3 + 1
    
    wiki_data = mt_deletion_mint_langs_min1[mt_deletion_mint_langs_min1['wiki'] == wiki].sort_values('deletion_ratio')
        
    trace = go.Bar(
        x=wiki_data['mt_service'],
        y=wiki_data['deletion_ratio'],
        name=wiki,
        text=[f"{val:.0%}" for val in wiki_data['deletion_ratio']],
        textposition='auto',
        textfont=dict(size=10),
        marker=dict(color='RoyalBlue')
    )
    
    traces.append(trace)
    
    fig.add_trace(trace, row=row_num, col=col_num)

fig.update_layout(
    title="Deletion Ratio by MT Service at MinT supported languages",
    height=200 * num_rows,
    width=950,
    showlegend=False
)

for row_num in range(1, num_rows + 1):
    for col_num in range(1, 4):
        fig.update_xaxes(row=row_num, col=col_num)

Code

iplot(fig, config=iplot_config)

MT service usage by user edit bucket

Code

mt_deletion_by_ueb = query("""
SELECT
    mt_service,
    user_editcount_bucket,
    SUM(created_cx_total) AS created_cx_total,
    SUM(deleted_cx_total) AS deleted_cx_total,
    SUM(deleted_cx_total) / SUM(created_cx_total) AS deletion_ratio
FROM
    mt_deletion_ratios
WHERE
    wiki != 'uzwiki'
GROUP BY
    mt_service,
    user_editcount_bucket
""", True)

Code

ordered_ueb = ['1-5', '6-99', '100-999', '1000-4999', '5000+']
mt_deletion_by_ueb['user_editcount_bucket'] = pd.Categorical(mt_deletion_by_ueb['user_editcount_bucket'], categories=ordered_ueb, ordered=True)
mt_deletion_by_ueb = mt_deletion_by_ueb.sort_values(['user_editcount_bucket', 'created_cx_total'], ascending=[True, False])

Code

mt_deletion_by_ueb_tbl = (
    gt
    .GT(mt_deletion_by_ueb, groupname_col='mt_service', rowname_col='user_editcount_bucket')
    .tab_header('Articles created and deleted by MT service & User Edit Count Bucket', period_label)
    .fmt_percent(columns='deletion_ratio', decimals=0)
    .fmt_number(columns=['created_cx_total', 'deleted_cx_total'], decimals=0)
    .cols_label(
        created_cx_total = '# Articles created',
        deleted_cx_total = '# Articles deleted',
        deletion_ratio = '% Articles deleted'
    )
    .opt_stylize()
    .tab_source_note(gt.html('Note: As observed before, uzwiki had a high creation and deletion rate due to a campaign,<br>skewing the overall percentage, so it has been excluded.<br>Please refer to the appendix for uzwiki aggregates.'))
)

mt_deletion_by_ueb_tbl

Articles created and deleted by MT service & User Edit Count Bucket
January to September 2024
User edit count bucket	# Articles created	# Articles deleted	% Articles deleted
Google
1-5	7,628	1,507	20%
6-99	11,453	1,345	12%
100-999	17,892	616	3%
1000-4999	24,101	409	2%
5000+	92,917	1,205	1%
MinT
1-5	1,304	203	16%
6-99	2,752	177	6%
100-999	5,324	140	3%
1000-4999	5,381	119	2%
5000+	15,559	36	0%
scratch
1-5	722	140	19%
6-99	1,199	129	11%
100-999	2,206	77	3%
1000-4999	4,543	112	2%
5000+	7,358	32	0%
Apertium
1-5	339	66	19%
6-99	581	53	9%
100-999	373	13	3%
1000-4999	541	5	1%
5000+	3,947	19	0%
Yandex
1-5	193	41	21%
6-99	284	57	20%
100-999	426	5	1%
1000-4999	418	10	2%
5000+	2,604	13	0%
Elia
1-5	99	9	9%
6-99	202	4	2%
100-999	636	2	0%
1000-4999	587	0	0%
5000+	2,067	91	4%
LingoCloud
1-5	13	1	8%
6-99	19	3	16%
100-999	21	0	0%
1000-4999	9	0	0%
5000+	16	0	0%
Note: As observed before, uzwiki had a high creation and deletion rate due to a campaign, which skewed the overall percentage, so it has been excluded. Please refer to the appendix for uzwiki aggregates.

Summary

We wanted to understand how user experience affects the deletion rate of the articles created using CX, by machine translation service.
Except for Elia and LingoCloud:
- 16-21% of the articles created by relative newcomers (1-5 edit count bucket) were deleted.
- 6-20% of the articles created by users with 6-99 edits were deleted.
- 1-3% of the articles created by users with 100-999 edits were deleted.
- 1-2% of the articles created by users with 1000-4999 edits were deleted.
- 0-1% of the articles created by users with 5000+ edits were deleted.
This shows that, irrespective of the machine translation service used, user experience strongly influences whether a translated article is deleted.
Also, the deletion percentage on Uzbek Wikipedia (Table 2) during the campaign was higher than usual at most experience levels, most notably for users with 6 to 999 edits, which indicates that creating a lot of articles within a short period can lead to a higher deletion rate.
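To make the experience effect easier to see, the sketch below recomputes the deletion ratio per edit count bucket for the two most used services, using the counts from the table above. The DataFrame layout and the column names `created` and `deleted` are illustrative, not the report's own.

```python
import pandas as pd

buckets = ["1-5", "6-99", "100-999", "1000-4999", "5000+"]

# Counts copied from the table above (excluding uzwiki).
rows = pd.DataFrame({
    "mt_service": ["Google"] * 5 + ["MinT"] * 5,
    "user_editcount_bucket": buckets * 2,
    "created": [7628, 11453, 17892, 24101, 92917, 1304, 2752, 5324, 5381, 15559],
    "deleted": [1507, 1345, 616, 409, 1205, 203, 177, 140, 119, 36],
})
rows["deletion_ratio"] = rows["deleted"] / rows["created"]

# One column per service, with buckets kept in order of increasing experience.
by_bucket = (
    rows.pivot(index="user_editcount_bucket", columns="mt_service", values="deletion_ratio")
    .reindex(buckets)
)
print(by_bucket.round(3))
```

For both services the ratio falls at every step up in edit count, which is the pattern the summary describes.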

Appendix

Machine translation deletion overview for uzwiki

Code

mt_deletion_overall_uzwiki = query("""
SELECT
    mtd.mt_service AS mt_service,
    SUM(created_cx_total)::INT AS '# Articles created',
    SUM(deleted_cx_total)::INT AS '# Articles deleted',
    SUM(deleted_cx_total) / SUM(created_cx_total) * 100  AS 'Percentage of deleted articles',
    pct_translations * 100 AS 'Percent of translations modified under 10%'
FROM
    mt_deletion_ratios mtd
LEFT JOIN (
    SELECT
        mt_service,
        pct_translations
    FROM
        pct_modified_overall
    WHERE
        pct_modified = 'less than 10%'
    ) pct_modf
    ON mtd.mt_service = pct_modf.mt_service
WHERE
    wiki == 'uzwiki'
GROUP BY
    mtd.mt_service,
    pct_translations
""", True).sort_values('# Articles created', ascending=False)

mt_deletion_overall_uzwiki.set_index('mt_service')

Table 1

mt_service	# Articles created	# Articles deleted	Percentage of deleted articles	Percent of translations modified under 10%
Google	15915	919	5.774427	29.864008
MinT	8419	1567	18.612662	32.411278
Yandex	682	184	26.979472	20.534070
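scratch	437	57	13.043478	NaN
Note: The last column is joined from the overall results, which exclude uzwiki, so it repeats the values from the overall table rather than uzwiki-specific figures.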

MT service usage by user edit bucket on uzwiki

Code

mt_deletion_by_ueb_uzwiki = query("""
SELECT
    mt_service,
    user_editcount_bucket,
    SUM(created_cx_total) AS created_cx_total,
    SUM(deleted_cx_total) AS deleted_cx_total,
    SUM(deleted_cx_total) / SUM(created_cx_total) AS deletion_ratio
FROM
    mt_deletion_ratios
WHERE
    wiki == 'uzwiki'
GROUP BY
    mt_service,
    user_editcount_bucket
""", True)

mt_deletion_by_ueb_uzwiki['user_editcount_bucket'] = pd.Categorical(mt_deletion_by_ueb_uzwiki['user_editcount_bucket'], categories=ordered_ueb, ordered=True)
mt_deletion_by_ueb_uzwiki = mt_deletion_by_ueb_uzwiki.sort_values(['user_editcount_bucket', 'created_cx_total'], ascending=[True, False])

mt_deletion_by_ueb_uzwiki_tbl = (
    gt
    .GT(mt_deletion_by_ueb_uzwiki, groupname_col='mt_service', rowname_col='user_editcount_bucket')
    .tab_header('Articles created and deleted by MT service & User Edit Count Bucket', f'{period_label}; on uzwiki')
    .fmt_percent(columns='deletion_ratio', decimals=0)
    .fmt_number(columns=['created_cx_total', 'deleted_cx_total'], decimals=0)
    .cols_label(
        created_cx_total = '# Articles created',
        deleted_cx_total = '# Articles deleted',
        deletion_ratio = '% Articles deleted'
    )
    .opt_stylize()
)

mt_deletion_by_ueb_uzwiki_tbl

Table 2

Articles created and deleted by MT service & User Edit Count Bucket
January to September 2024; on uzwiki
User edit count bucket	# Articles created	# Articles deleted	% Articles deleted
Google
1-5	218	38	17%
6-99	1,482	288	19%
100-999	4,932	410	8%
1000-4999	6,471	180	3%
5000+	2,812	3	0%
MinT
1-5	181	44	24%
6-99	1,685	488	29%
100-999	4,390	878	20%
1000-4999	2,027	157	8%
5000+	136	0	0%
Yandex
1-5	26	7	27%
6-99	213	70	33%
100-999	308	83	27%
1000-4999	85	24	28%
5000+	50	0	0%
scratch
1-5	12	0	0%
6-99	79	18	23%
100-999	228	36	16%
1000-4999	114	3	3%
5000+	4	0	0%

Footnotes

8% of all translations in 2023, and 16% in 2024.↩︎
80% of all translations in 2023, and 71% in 2024.↩︎
8% of all translations in 2023, and 16% in 2024.↩︎
80% of all translations in 2023, and 71% in 2024.↩︎
Chewa (ny), Malagasy (mg), Sundanese (su), Gan Chinese (gan), Hawaiian (haw), Aymara (ay), Lao (lo), Quechua (qu), Guarani (gn), Western Frisian (fy), Shona (sn), and Ilocano (ilo).↩︎
Kinyarwanda (rw), Lingala (ln), Maltese (mt), Twi (tw), Sinhala (si), Serbian (sr), Tigrinya (ti), Khmer (km), Sindhi (sd), Irish (ga), Welsh (cy), Tagalog (tl), Marathi (mr), Macedonian (mk), Somali (so), Latin (la), Croatian (hr), Azerbaijani (az), Finnish (fi), Javanese (jv), Tsonga (ts), Bengali (bn), Czech (cs), Afrikaans (af), Lithuanian (lt), Bosnian (bs), Tajik (tg), Estonian (et), Albanian (sq), Xhosa (xh), Bulgarian (bg), Luganda (lg), Dutch (nl), Northern Sotho (nso), Belarusian (Taraškievica orthography) (be-tarask), Polish (pl), Mongolian (mn), Slovenian (sl), Hungarian (hu), Kazakh (kk), Thai (th), Central Kurdish (ckb), Tulu (tcy), Pashto (ps), Serbo-Croatian (sh), Armenian (hy), Burmese (my), Chinese (zh), Latvian (lv), Romanian (ro), Turkish (tr), Cebuano (ceb), Ukrainian (uk), and French (fr).↩︎
Korean (ko), Slovak (sk), Danish (da), Russian (ru), Haitian Creole (ht), Odia (or), Yoruba (yo), Georgian (ka), Vietnamese (vi), Swahili (sw), Swedish (sv), Spanish (es), Italian (it), Hebrew (he), Amharic (am), Egyptian Arabic (arz), Betawi (bew), Dhivehi (dv), Kyrgyz (ky), Turkmen (tk), Uyghur (ug), Malay (ms), Arabic (ar), Maori (mi), Ewe (ee), Portuguese (pt), Oromo (om), Scottish Gaelic (gd), Kurdish (ku), Assamese (as), Persian (fa), Catalan (ca), Bhojpuri (bho), Yiddish (yi), Wu Chinese (wuu), Telugu (te), Igbo (ig), Greek (el), Kannada (kn), Uzbek (uz), Gujarati (gu), Luxembourgish (lb), Indonesian (id), Hindi (hi), Galician (gl), Sanskrit (sa), Norwegian Bokmål (nb), German (de), and Punjabi (pa).↩︎
Norwegian Nynorsk (nn), Occitan (oc), Silesian (szl), Northern Sami (se), Asturian (ast), and Esperanto (eo).↩︎
Bashkir (ba) and Belarusian (be).↩︎
The translations_progress data shows the percentage of translation completion: human indicates the manually translated percentage, mt indicates the machine-translated percentage, and any indicates the total (any = human + mt). Any edits to machine translation output are counted as manual edits. The percentages are calculated at section level. Content Translation does not require full translation of the source article.↩︎
Note: Even though the percentage of translations with a human modification percentage of less than 10% is 38% for LingoCloud, the number of translations that used LingoCloud was only ~100.↩︎