Assessing the Effectiveness of AI-Driven Language Support Systems on International Student Performance: A Data-Driven Management Perspective

Li Baoluo

Abstract

Universities have adopted AI-based language tools at considerable speed, often before deciding how such tools should be governed. This paper steps back from the question of whether the technology works and asks a management question instead: under what institutional conditions does AI language support actually improve outcomes for international students, and at what cost? Drawing on the literatures of student adaptation, AI in education, and learning analytics, the paper argues that the effectiveness of these systems is better understood as a property of management than of software. The supporting evidence is treated as illustrative rather than definitive. It suggests a recurring pattern–modest average gains, considerable variation between students, and a strong dependence on whether institutions monitor uptake and target support–but it is also read critically, with attention to selection effects, the contested boundary with academic misconduct, and the risk that AI provision quietly widens the very gaps it is meant to close. The contribution is a governance-centred reframing: AI language support should be evaluated, resourced and held accountable as a managed service, not procured as a solution and left to run.

Article text

1. Introduction

International enrolment has grown to the point where, in many institutions, a large share of students sit examinations in a language they are still learning. The consequences are well documented. Weak host-language proficiency raises acculturative stress, slows social and academic integration, and means that assessed work frequently understates what a student actually knows. For university managers this is not only a pedagogic concern. It feeds directly into retention figures, completion rates and the reputational metrics on which institutions increasingly compete.

AI-based language tools have arrived as an apparent answer. Translation services, AI writing assistants and conversational tutors promise help that is immediate, cheap and available outside the narrow hours of a staffed writing centre. Adoption has been rapid, and in many cases it has run ahead of any considered decision about oversight. Tools are licensed, or simply used by students without licensing, before anyone has settled what counts as legitimate use, who is expected to benefit, or how the institution will know whether the investment was worthwhile.

This paper deliberately does not ask the question that dominates much of the existing literature–namely, whether AI tools improve performance in the abstract. That framing treats the technology as the active ingredient and the institution as a passive host. The argument advanced here is the opposite. Whatever effect these systems have is produced jointly by the tool and by the management decisions that surround it: how access is structured, whether uptake is monitored, how the line against misconduct is drawn and policed, and against what standard the institution judges success. The objective, then, is to reframe AI language support as an object of management rather than of procurement, and to set out what a data-informed approach to governing it would involve–together with an honest account of where such an approach can mislead.

2. Literature Review

2.1. Language and the Adaptation of International Students

Research on international student adaptation has consistently identified host-language proficiency as a determinant of both well-being and academic functioning. Acculturation theory treats adjustment as multidimensional, and empirical work in that tradition links limited proficiency to higher acculturative stress and weaker integration. Survey studies report that students with stronger language skills tend to achieve higher grades and feel more socially embedded. What this literature largely takes for granted, however, is the support apparatus itself. Provision tends to appear as a fixed backdrop rather than as something that institutions design, resource and might design badly.

2.2. AI in Education: A Contested Evidence Base

The literature on AI in higher education is large, fast-moving and far from settled. Reviews and empirical studies report gains in performance, engagement and writing quality where students use generative tools, and some experimental work points to improvements in problem-solving and productivity. Yet the same body of work contains a persistent counter-current. Scholars warn that heavy reliance on translation and grammar correction can crowd out the effortful cognitive work through which language and disciplinary understanding are actually built, and that the convenience of a fluent draft can mask shallow learning. The evidence, in short, does not speak with one voice, and a management perspective should resist citing only the half of it that flatters a purchasing decision.

2.3. Language Support Systems and the Question of Substitution

Tools aimed specifically at language–adaptive practice, automated transcription and translation of lectures, discipline-specific vocabulary work–are often presented as straightforwardly widening access. Much of the supporting research, though, is descriptive or based on student perception, and comparatively little of it connects use to objective outcomes for international students in particular. A sharper concern is conceptual. These systems sit on an unstable line between scaffolding and substitution. A tool that helps a student articulate an idea they already hold is doing something different from a tool that supplies the idea, yet from the institution's vantage point the two can be hard to tell apart.

2.4. Learning Analytics as a Management Capability

A separate literature treats learning analytics and predictive modelling as instruments of institutional management. By combining engagement and assessment data, these methods can flag students at risk early enough for intervention and can inform how advising and tutoring are allocated. This is the strand most directly relevant here, because it frames data not as a research output but as a governance capability–a means by which managers decide where support goes. It is also the strand most exposed to critique: predictive models can encode bias, and an early-warning dashboard can create an impression of control that outruns the institution's actual ability to act.
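The early-warning logic described above can be made concrete with a minimal Python sketch. Everything in it is an assumption introduced for illustration: the field names (`weekly_logins`, `latest_score`), the thresholds, the `advisor_slots` figure and the records themselves are hypothetical and are not drawn from any institution or study. The rule flags a student only when low engagement and low attainment coincide, and then compares the number of flags with the capacity to respond, which is the point at which an impression of control can outrun the ability to act.

```python
# Hypothetical weekly records; every value is invented for illustration.
records = [
    {"student_id": "s1", "weekly_logins": 1, "latest_score": 42.0},
    {"student_id": "s2", "weekly_logins": 5, "latest_score": 45.0},
    {"student_id": "s3", "weekly_logins": 0, "latest_score": 38.0},
    {"student_id": "s4", "weekly_logins": 2, "latest_score": 71.0},
    {"student_id": "s5", "weekly_logins": 2, "latest_score": 49.0},
]


def flag_at_risk(records, min_logins=3, min_score=50.0):
    """Return IDs of students whose engagement AND attainment are both low.

    The thresholds are assumed defaults, not recommended values.
    """
    return [
        r["student_id"]
        for r in records
        if r["weekly_logins"] < min_logins and r["latest_score"] < min_score
    ]


flagged = flag_at_risk(records)

# Assumed number of interventions staff can actually deliver this week.
advisor_slots = 2
unserved = max(0, len(flagged) - advisor_slots)

print(f"Flagged students: {flagged}")
print(f"Flags the institution cannot act on this week: {unserved}")
```

A rule of this kind says nothing about the bias concern; comparing flag rates across student groups would be a separate check, and the choice to require both conditions rather than either is itself a management decision.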

2.5. The Gap

These literatures rarely meet. Adaptation research establishes that language matters but says little about how support is governed. AI-in-education research reports effects on general populations while seldom isolating international students or scrutinising the conditions of use. Learning analytics treats data-informed management as valuable in itself without testing whether it changes what AI tools achieve. The gap this paper addresses is therefore not empirical so much as conceptual: there is no settled account of AI language support as a managed service whose value is contingent on institutional governance. Supplying that account is the task of the sections that follow.

3. Research Methodology

3.1. A Conceptual, Management-Oriented Design

The study is best described as conceptual and exploratory. Its purpose is to develop and illustrate a way of thinking about AI language support, not to deliver a definitive measurement of its effect. This framing is a deliberate choice rather than a concession. The field already contains a good deal of point-estimate research, much of it cross-sectional and reliant on self-report, and adding another such estimate would do less than clarifying the management logic that any estimate must be interpreted within.

Figure 1 sets out that logic. AI language support is positioned not as an isolated input but as one element of a managed process. The institutional context–strategy, resourcing, analytics capability and policy–conditions how the tool reaches students. Its influence on performance runs through student-level mechanisms such as confidence, engagement and acculturative load. Crucially, the diagram includes a feedback loop: outcomes are monitored, and provision is adjusted in light of what the monitoring shows. It is the presence or absence of that loop, far more than the sophistication of the tool, that the framework treats as decisive.

Fig. 1. AI language support understood as a managed institutional process rather than a stand-alone intervention

3.2. Use of Evidence

To keep the discussion grounded, the paper draws on illustrative evidence assembled from published studies of AI use among international and second-language students, supplemented by the kind of usage and outcome patterns institutions typically observe in their own analytics. This material is used to motivate and probe the argument, not to prove it. No claim is made to a single representative dataset, and the absence of precise effect sizes in what follows is intentional: spurious precision would sit awkwardly with the paper's own argument that effects are heterogeneous and management-dependent.

3.3. Propositions

Rather than statistical hypotheses, the study advances three management propositions. First, the average effect of AI language support on performance is likely to be real but modest, and less interesting than its variance. Second, that variance is shaped substantially by institutional management–particularly by whether uptake is monitored and support is targeted–so that the same tool produces different returns in different governance settings. Third, because effects are uneven, AI provision carries a distributional risk: without deliberate management it may benefit confident, well-resourced students most and so widen attainment gaps. These propositions are examined, and partly contested, in the next section.

4. Results and Discussion

4.1. Two Accounts of the Same Technology

Any assessment of AI language support has to reckon with the fact that the literature supports two coherent but opposed stories. Table 2 lays them side by side. On the optimistic account, immediate feedback and personalisation lift performance, low cost extends help to students whom staffed services never reach, and scalable support relieves pressure on advising budgets. On the critical account, convenience substitutes for the effortful practice that genuinely builds skill, benefits concentrate among students who were already doing well, the boundary with unacknowledged authorship is porous, and the apparent budget saving merely relocates cost into teaching and assessment workload.

Table 1

Management decisions surrounding AI language support and the governance questions they often leave implicit

Management domain	Typical decision	Governance question often left implicit
Procurement & resourcing	Which AI tools to license, and at what scale of spend	Does the tool displace, or merely supplement, existing human support?
Access & equity	Whether tools are opt-in or embedded in the student experience	Who actually adopts them, and does uneven uptake widen attainment gaps?
Analytics & monitoring	What usage and outcome data to collect on support services	Are at-risk students identified early, or only after they fail?
Academic policy	What constitutes acceptable use in assessed work	Is the line between support and substitution clear to staff and students?
Evaluation	How success of the investment is judged	Is effectiveness measured against learning, or only against satisfaction?

Neither account is simply wrong. The more useful observation, from a management standpoint, is that which story comes true in a given institution is not fixed by the technology. It is settled by decisions about access, monitoring, policy and evaluation. Table 1 lists those decisions alongside the questions they tend to leave unasked.

Table 2

Competing accounts of AI language support for international students

Dimension	Optimistic account	Critical counter-account
Learning	Immediate feedback and personalisation raise performance	Convenience may substitute for the effortful practice that builds skill
Equity	Low-cost help reaches students underserved by staffed services	Benefits may concentrate among the already-confident and well-resourced
Integrity	Tools legitimately scaffold writing in a second language	The boundary with unacknowledged authorship is contested and porous
Institutional value	Scalable support eases pressure on advising budgets	Apparent savings may shift hidden costs onto teaching and assessment

4.2. The Average Effect Is the Least Interesting Number

Where studies of AI use among second-language students do report performance gains, those gains are typically positive but moderate, and they are accompanied by wide dispersion. For a researcher chasing a headline effect size, the dispersion is noise. For a manager it is the main finding. It indicates that the tool does very different things for different students, and that an institution reporting a satisfactory average may still be failing a substantial minority while flattering a confident majority. A management perspective therefore reads the average with suspicion and asks instead where, and for whom, the effect is concentrated.
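The point about averages can be illustrated with a short Python sketch. The figures are invented purely to show the arithmetic and carry no empirical weight: the group labels, the grade changes and the group sizes are all hypothetical, in keeping with the paper's decision not to report effect sizes. The sketch computes the overall mean gain, its spread, the mean for each group, and the share of students who did not improve.

```python
from statistics import mean, pstdev

# Hypothetical grade changes (percentage points) after one term of AI support.
# Every number is invented for illustration; none comes from a study.
gains_by_group = {
    "higher prior proficiency": [6.0, 7.0, 5.0, 8.0, 6.0, 7.0],
    "lower prior proficiency": [-2.0, 0.0, -1.0, 1.0],
}

# Pool all students to obtain the headline figures a report might quote.
all_gains = [g for gains in gains_by_group.values() for g in gains]
overall_mean = mean(all_gains)
overall_spread = pstdev(all_gains)  # population standard deviation

# Break the same data down by group to see where the gain is concentrated.
group_means = {name: mean(gains) for name, gains in gains_by_group.items()}

# Share of students whose grades stayed flat or fell.
share_not_improving = sum(1 for g in all_gains if g <= 0) / len(all_gains)

print(f"Overall mean gain: {overall_mean:.1f} (spread {overall_spread:.1f})")
for name, value in group_means.items():
    print(f"  {name}: {value:.1f}")
print(f"Share not improving: {share_not_improving:.0%}")
```

In this invented case the overall mean is positive while one group's mean is slightly negative, which is the pattern a manager reading only the headline figure would miss.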

Figure 2 expresses this argument schematically rather than numerically. It contrasts two institutions offering the same tools. In one, provision is largely unmonitored; use rises but outcomes improve only weakly, because nobody is checking whether the students who most need help are the ones using it. In the other, management is data-informed: uptake is tracked, low engagement among at-risk groups is noticed, and support is steered accordingly. The curves diverge, and the gap between them is the contribution of governance. The figure carries no precise values by design; its claim is about shape, not magnitude.

Fig. 2. A schematic contrast: identical tools yield different returns depending on how provision is managed

4.3. The Equity Problem Management Tends to Miss

The most uncomfortable implication concerns equity, and it cuts against the sector's habitual framing of AI as a democratising force. If gains are uneven and uptake is voluntary, the students who adopt AI support most readily are often those with stronger prior language skills, greater digital confidence and more time–precisely the students least dependent on help. Students facing the steepest barriers may use the tools least, or use them in shallow ways. Left unmanaged, a service introduced in the name of access can therefore widen the attainment gap rather than close it. This is not an argument against AI support. It is an argument that equitable outcomes are a management achievement, not a property the technology delivers on its own, and that an institution which does not monitor who benefits has no basis for claiming that it does.
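Monitoring who benefits begins with knowing who uses the service at all. The following Python sketch shows one way an institution might tabulate uptake by prior proficiency band. The `Student` record, the band labels and the cohort are hypothetical and introduced only for illustration; a real system would draw on whatever enrolment and usage data the institution actually holds.

```python
from dataclasses import dataclass


@dataclass
class Student:
    proficiency_band: str  # hypothetical banding, e.g. from an entry test
    used_tool: bool        # whether the student used the AI tool this term


def uptake_by_band(students):
    """Return the share of students in each band who used the tool."""
    totals = {}
    users = {}
    for s in students:
        totals[s.proficiency_band] = totals.get(s.proficiency_band, 0) + 1
        users[s.proficiency_band] = users.get(s.proficiency_band, 0) + int(s.used_tool)
    return {band: users[band] / totals[band] for band in totals}


# Invented cohort: uptake is voluntary and skews towards the stronger band.
cohort = [
    Student("higher", True), Student("higher", True),
    Student("higher", True), Student("higher", False),
    Student("lower", True), Student("lower", False),
    Student("lower", False), Student("lower", False),
]

rates = uptake_by_band(cohort)
uptake_gap = rates["higher"] - rates["lower"]

for band, rate in rates.items():
    print(f"{band} band uptake: {rate:.0%}")
print(f"Uptake gap (higher minus lower): {uptake_gap:.0%}")
```

Uptake is only a first indicator. A persistent gap of this kind would prompt targeted outreach, but it does not show whether the students who do use the tool are learning from it.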

4.4. The Limits of the Data-Driven Response

It would be too neat to conclude that analytics simply solves these problems, and the paper's own perspective requires it to say so plainly. Data-driven management has real limits. Engagement metrics capture clicks and logins, not understanding, and a student can be heavily active in a tool while learning little. Early-warning models can carry forward historical bias, so that the groups flagged as at-risk reflect past inequities as much as present need. Dashboards can manufacture a feeling of control that the institution cannot actually exercise if it lacks the staff to intervene once a warning fires. And the academic-integrity question–where legitimate scaffolding ends and substitution begins–is a matter of judgement that no analytics layer can settle. A data-informed approach is, on the evidence, better than an unmonitored one. But it is a discipline for asking sharper questions, not a substitute for answering them.

4.5. Implications for University Management

Several practical implications follow, and they are addressed to managers rather than to tool designers. Procurement of AI language tools should be coupled with investment in the capacity to monitor their use; buying the tool without funding the oversight purchases the optimistic story and the critical one in unknown proportion. Evaluation should be defined against learning and equitable outcomes from the outset, because a service judged only on satisfaction or usage volume will look successful long after it has stopped helping the students who matter most. Academic-integrity policy should be treated as part of the support system rather than as a separate disciplinary matter, since the same tool is simultaneously a help and a hazard. And the distributional question–who is actually benefiting–should be a standing item in institutional reporting, not an occasional research exercise.

5. Conclusion

This paper has argued that the effectiveness of AI-driven language support for international students is poorly understood when it is framed as a question about technology. Reframed as a question about management, it becomes both more tractable and more demanding. The tool, on the available evidence, produces modest average gains and considerable variation; what determines whether that variation works for or against an institution's students is governance–how access is structured, whether uptake is monitored, how integrity is handled, and against what standard success is judged.

The contribution is therefore conceptual rather than empirical. It offers university managers a way of seeing AI language support as a managed service with a feedback loop, and it does so without overstating what the supporting evidence can bear. That evidence is illustrative, the design is exploratory, and the propositions advanced here invite contest rather than closing it. The paper has also been deliberate in resisting the optimistic reading: a data-informed approach sharpens the questions a university should ask, but it does not answer them, and it carries its own risks of bias and false confidence.

Future work should pursue longitudinal and comparative designs capable of observing how effects evolve and how they differ across institutional governance regimes, and should treat the equity question–who gains, who is left behind, and why–as central rather than incidental. The practical message, however, is already clear enough to act on. An institution that procures AI language support and leaves it unmanaged has not adopted a solution. It has adopted an open question, and declined to govern the answer.

