Tracing English interference in AI-generated German: An analysis of word order and syntactic fronting

Paolo Valentinelli

doi:10.62408/ai-ling.v5i1.36

DOI

https://doi.org/10.62408/ai-ling.v5i1.36

Keywords

artificial intelligence, AI-generated language, ChatGPT, German syntax, newspaper language

Published

January 16, 2026

Journal

Published in Vol. 5 No. 1 (2026) of AI-Linguistica. Linguistic Studies on AI-Generated Texts and Discourses.

AI-Linguistica. Linguistic Studies on AI-Generated Texts and Discourses is a new scholarly journal that aims to provide a publishing platform for researchers from all areas of Linguistics (interfacing with neighboring fields: Communication Science, Media and Journalism Studies, Computational Linguistics) to reflect on generated texts from a variety of perspectives: theoretical, descriptive, and applied.

We understand ‘generated texts’ in a broad sense, including formats as diverse as texts generated by Large Language Models, AI-powered smart agents (i.e. chatbots, voice assistants, social bots etc.), writing assistance tools, template-based software, and neural machine translation services.

Abstract

Large language models (LLMs) constitute a transformative advancement in natural language processing, yet their development remains disproportionately skewed toward English. Despite the world's linguistic diversity, non-English languages – including major languages like Spanish, French or Chinese – are effectively treated as low-resource in current LLM training paradigms. This study analyses two linguistic features of AI-generated texts that mimic human-authored German newspaper articles and compares them with a purpose-built corpus of real journalistic texts. These features are (i) word order and (ii) pre-field occupation. Through quantitative and qualitative analyses of the outputs of four distinct LLMs, three key phenomena in the AI-generated German texts were identified: (i) a marked preference for SVO word order; (ii) reduced syntactic variability compared to human-authored texts; and (iii) the emergence of stylistically marked constructions that mirror English linear progression rather than native German sentence bracketing. While some models approximate human-like syntactic patterns for certain variables, this equivalence remains limited and context-dependent, which may suggest cross-linguistic interference from the overwhelming predominance of English in LLM training data. The study emphasises the linguistic implications of LLM architectures and calls attention to the urgent need for more equitable representation of world languages in natural language processing development.

Details

Issue

Vol. 5 No. 1 (2026): AI-Linguistica

Section

Full-Length Article

How to Cite

Valentinelli, P. (2026). Tracing English interference in AI-generated German: An analysis of word order and syntactic fronting. AI-Linguistica. Linguistic Studies on AI-Generated Texts and Discourses, 5(1). https://doi.org/10.62408/ai-ling.v5i1.36

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
