As the timer in the corner slowly bleeds away, my eyes scan the problem on my screen.

“You are given two simple undirected graphs F and G with n vertices. F has m1 edges, while G has m2 edges. You may perform one of the following two types of operations any number of times… Determine the minimum number of operations required such that for all integers u and v (1 ≤ u, v ≤ n), there is a path from u to v in F if and only if there is a path from u to v in G.”

For a moment, the room is silent. Something clicks. I burst into motion, slamming out pseudocode on my notepad, half-baked ideas colliding into something concrete. The more I write, the more I become convinced that my solution works. I switch from notepad to code editor, attempting to translate my illegible scribbles into something the compiler can understand.
I finish implementing, and breathe a sigh of relief as I submit my code. Another day, another contest, another problem solv–
A blue flash of text on the screen catches my attention. My heart sinks as I read the 5 words I hate the most: “Wrong answer on test 2”.
I panic. What’s wrong? Is it integer overflow? An off-by-one error? Or is my solution, which I thought I had proved, inherently wrong?
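
(For the curious: problems like the one above almost always come down to connectivity, and the textbook structure for answering “is there a path from u to v” questions is a disjoint-set union, or DSU. The C++ sketch below is generic scaffolding only, assuming the usual Codeforces-style input format; it is not the solution I submitted that day, and since the problem’s operations are elided above, the actual counting logic is left out.)

    #include <bits/stdc++.h>
    using namespace std;

    // Illustrative only: a disjoint-set union (union-find), the standard tool
    // for connectivity queries. Not the contest solution itself.
    struct DSU {
        vector<int> parent, sz;
        DSU(int n) : parent(n), sz(n, 1) { iota(parent.begin(), parent.end(), 0); }
        int find(int x) {                      // path compression
            return parent[x] == x ? x : parent[x] = find(parent[x]);
        }
        bool unite(int a, int b) {             // merge the components of a and b
            a = find(a); b = find(b);
            if (a == b) return false;          // already connected, nothing to do
            if (sz[a] < sz[b]) swap(a, b);     // union by size keeps trees shallow
            parent[b] = a; sz[a] += sz[b];
            return true;
        }
        bool connected(int a, int b) { return find(a) == find(b); }
    };

    int main() {
        int n, m1, m2;
        cin >> n >> m1 >> m2;
        DSU F(n + 1), G(n + 1);                // 1-indexed vertices, index 0 unused
        for (int i = 0; i < m1; i++) { int u, v; cin >> u >> v; F.unite(u, v); }
        for (int i = 0; i < m2; i++) { int u, v; cin >> u >> v; G.unite(u, v); }
        // A real solution would now compare the two component structures and
        // count the minimum number of the (elided) operations; that is omitted.
        return 0;
    }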


Competitive programming is brutal: racing against the clock to solve complex algorithmic problems requires not only strong logical reasoning, but also good intuition, fast typing speed, and even a little luck. Though it seems like pure nonsense to many, it is almost a religion for some. There is a vibrant and diverse community centered around solving these (sometimes ridiculously convoluted) computational problems. Many bored, chronically online, and somewhat mentally unstable individuals such as myself are drawn to the allure of engaging in the ultimate battle of wits against some of the smartest idiots on the face of the Earth.

Sites such as Codeforces, AtCoder, and CodeChef host multiple rated contests every week. In these live contests, competitors are given a set of problems to solve within a fixed time window, and they earn a score based on both the accuracy of their solutions and how quickly they submit them. Each subsequent problem is significantly harder, and the difficulty scales harshly: the easiest problem might take 5 minutes, while the hardest might take 5 hours, even for the world’s top programmers.

At its core, competitive programming is a test of skill. Each problem is a puzzle that forces competitors to examine it from every angle and wrestle with logic and abstraction under high pressure. It is precisely this intoxicating mix of difficulty and satisfaction that makes competitive programming more than a hobby. It has also become a respected benchmark of problem-solving ability, opening doors to internships, scholarships, and top-tier colleges.
That prestige and recognition are what draw many to competitive programming. Whether for college admissions or employment opportunities, a high Codeforces rating or USACO level can make or break an application. For many it is not just a hobby, but a way to stand out from their peers. And when certificates, scholarships, and recruitment pipelines depend on contest outcomes, many contestants are incentivized to abuse the system.

Foul play has long been a part of competitive programming, especially since most competitions are held online. Code sharing, account sharing, and alt abuse are common occurrences during any contest. Pay-to-access Telegram channels run by higher-rated participants hoping to make a quick buck post solutions anonymously as soon as they are solved. Some participants hand their account credentials to a higher-rated friend. Others simply solve problems on an alternate account and resubmit only the correct solutions from their main, avoiding the score penalties that come with incorrect submissions.

For the longest time, cheating attempts were embarrassingly simple. After each contest, plagiarism checkers would find dozens of identical submissions from newly created accounts. Savvier cheaters could strategically stagger their submissions or lightly edit their code, but moderators could remove or nullify many of the illegitimate results with a combination of automated checks and human review.

This low-tech game of cops and robbers, however, was soon complicated by a new factor: generative AI models. These tools opened up a level of cheating that none of us were expecting. When ChatGPT first came out, it was useless for programming contests; it could not solve even the simplest problems, and few treated it as a real threat. Still, more models soon followed, each bringing marginal but visible improvements.

It was not until OpenAI released o1 that AI tools started posing a dire threat to the integrity of online programming contests. Community threads and contest post-mortems began flagging users whose submissions seemed suspicious at a rate far greater than before. Many of these participants’ submissions showed inconsistencies in coding style and formatting habits, suggesting the use of a generative model. Although o1 was much more capable than its precursors, it still solved problems inconsistently, which allowed LLM-based cheating to be traced and punished. Yet even amid the onslaught of cheaters, it was apparent that things were only getting started.

Then o3 came out. OpenAI tuned the model specifically for coding tasks, reporting an estimated Codeforces rating of ~2700, a threshold well within the top 0.1% of the community. Although that rating was perhaps an overestimate, the jump from o1’s reported (and itself highly overestimated) rating of ~1900 was clear. The model’s abilities had far exceeded anyone’s expectations. o3 could also adapt to a user’s template, writing in the programmer’s original style, which further exacerbated the threat to the integrity of the competition. And although o3 use could sometimes be very noticeable (in one case it actually over-optimized the intended solution), it left us scrambling for answers.

Since its inception, the appeal of these contests has been simple: a fair game of speed, logic, and creativity. Everyone had the same constraints, the same clock ticking down, the same editor window. Climbing the ranks meant something. But with AI creeping into contests, that meaning started to erode. Was I really competing against another human, or was I just competing against someone who knew how to prompt better? Even as of this writing, blatant cheaters are winning first place in contests with the help of GPT-5.

For many of us, competitive programming has been more than a resume line or even a hobby. It has taught us how to think under pressure, decompose messy problems into tractable components, and build an intellectual endurance that shows up in work, interviews, and everyday decisions. Pruning edge cases, choosing the right invariant, shaving constant factors: all of it comes from the habits of inquiry and discipline that competitive programming cultivates. Those are fundamentally human skills: curiosity, the willingness to struggle, and the ability to reflect on failures and iterate. To us, contest rankings are proof of growth, a clear sign of progress, and recognition of our skill and dedication. That is why the prospect of leaderboards populated by GPT-assisted entries scared us so much. Every AI-generated submission robbed us not only of a fair competition, but of the personal story of our incremental mastery and effort.

I wanted to hear from people deeper in the competitive programming scene about how they see the future. What did they make of what looked like the inevitable decay of competitive programming? I reached out to respected competitive programmer and problemsetter Chongtian Ma (perhaps better known by his online alias, cry) to talk about AI, cheating, and the future of competitive programming and coding as a whole.

When I asked cry about major changes he had noticed since the arrival of LLMs, his answer was immediate: “Obviously people started using LLMs to participate in contests. Then people who don’t use them get salty and complain, or get pressured to also use LLMs.” He views the presence of such models as harmful, leading to a “net loss of legitimate participants day by day.”

The tension between resisting an unstoppable technology and preserving the integrity of the contest has sparked an internal debate within the community. Some programmers, most notably Legendary Grandmaster Aleksei Daniliuk, argue that cheaters have always existed, and that love for problem solving should outweigh any meaning that comes with the ranking system. But cry sees the situation differently, echoing the worry that participation will dwindle if the rankings themselves lose credibility: “If contests let AI go rampant then [they] will definitely lose value,” he told me bluntly, “because competitive programming without competitiveness is just programming.”


Nor does he offer much comfort about the prospect of combating AI, admitting that “[cheating] is not really preventable.”

Attempts to counteract these models are, to him, futile. “We can’t predict what problems can be GPT-able, and it’s just not worth throwing out solid problems with educational value.” Many problemsetters face the same dilemma: if they ignore AI use, the rankings will be distorted no matter how well-written the questions are; but if they try to LLM-proof their contests, questions with high educational value may be thrown out in favor of problems that are AI-resistant yet compromise the quality of the competition. There are no foreseeable solutions to this problem online, cry suggests, but high-stakes contests may benefit from being conducted in person: “If [we] transition to in-person contests, the sun will shine bright on the earth once again.”

For him, however, AI’s disruption of programming contests is only a small piece of a larger trend: tech employers seeking to lay off workers and replace them with AI tools. “Obviously [AI] will boost productivity,” he said, “[but] once companies comfortably bridge the gap between AI and product development, it’s over for humans.” He echoes the fears of many in the tech industry who risk unemployment as more companies embrace “vibe coding” as a professional standard. When asked what role humans would play once the gap was bridged, his reply was filled with dry mirth: “Sit on the side… and do some occasional prompting.”

What’s more, he suggests that as online programming contests become a less reliable measure of human skill, companies will be discouraged from using them as hiring signals and hobbyists will lose some of the thrill of creative problem solving. “If it’s publicly known that ratings don’t matter for recruitment, a lot fewer people would even try CodeForces. I feel like it will lose a lot of the charm either way.” And although this would reduce the incentive to cheat, competitive programming would become “just another video game”.


After talking with cry, it’s hard not to feel like competitive programming is standing at a crossroads. On one hand, it still offers something that AI can’t quite replicate: the practice of fast thinking, structured reasoning, and problem-solving under pressure. On the other hand, the integrity of contests and the meaning of rankings are already being chipped away by models that are only getting stronger. I do not doubt that these competitions will look very different a few years from now. Maybe they survive by going in person. Maybe they shift entirely to being a casual training ground rather than a high-stakes battle of wits. Or maybe they slowly drift into what cry called a “nerd game,” something all but stripped of its old meaning.

And if things go from bad to worse, I’ve already got my backup plan. There’s a Codeforces blog floating around about making money from competitive programming side hustles, like running a “Nim game scam” for beginners. If all else fails, maybe that’s where I’ll end up. Hustling games of impartial combinatorics in the corner of some foreign country. At least then, win or lose, it’ll still be humans competing against humans.


Written by Christopher Tang