The Real Cost is Responsibility
This is the first of many articles about my thoughts on AI and its use in programming. For the last two years, every few months has managed to surprise me with a new wave of mass hysteria. This time it is multi-agent workflows, with people obsessing over the KLOCs they can output daily. At first it was "10x engineers became real", then "we now have 100x engineers", and now some even claim "there are 1000x engineers out there". It feels so shallow, so "marketing"; the words no longer mean anything. I feel like a marketing person is trying to sell me something: you need the Toaster Master 2000; oh, actually this is the 4000, the better model; no, no, actually this is our newest premium model, the Toaster Master 4458. The numbers exist only because they somehow encode the "improvement" and "luxuriousness" of the product. I feel the same about these bazillion-x productivity claimants. They all miss the #1 rule of programming.
Code is a Liability
Everyone with expertise knows this: you are not supposed to write code unnecessarily. There are acronyms about reducing code duplication (DRY: Don't Repeat Yourself). Probably one of the biggest sources of bugs is copy-pasting your own code from somewhere else and forgetting to change something. There are countless arguments about how much duplication is okay, and about when you should abstract something versus keeping it simple. Sometimes it is not possible to avoid it and you have to write a lot of code. But one thing is certain: code is a liability. You need a reason to abstract it, you need a reason to duplicate it, and you certainly need a reason to write lots of it. Code is a liability and you are responsible for it; that should be the real metric.
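As a toy illustration of the copy-paste bug class mentioned above (the functions and the length rule are entirely made up for this sketch):

```python
MAX_NAME_LEN = 32  # hypothetical limit, raised from 20 at some point

def validate_username(name: str) -> bool:
    # The "source of truth" rule: non-empty and within the limit.
    return 0 < len(name) <= MAX_NAME_LEN

def validate_display_name(name: str) -> bool:
    # Copy-pasted from validate_username long ago; when the limit
    # above changed from 20 to 32, this hard-coded copy was forgotten.
    return 0 < len(name) <= 20
```

A 25-character name now passes one check and fails the other. Nobody "made a mistake" in either function on its own; the liability lives in the duplication itself.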
Responsibility Allocated per Line
But then, one might ask, if we care so much about not writing unnecessary code, why do we use lines of code as a metric? There are multiple possible answers. One: it is an easily trackable metric that makes us feel good about ourselves, a metric of pseudo-productivity. Two: we humans (compared to AI), if we follow above-average standards, know that whatever line count we see has probably been optimized through the natural evolution of the project, and so we know how much bigger the project got relative to previous iterations. Three: we use it as a cautionary size metric; I can guarantee you no one is impressed by Unreal Engine 5.7 being 30+ million lines of code. Rather, everyone thinks "damn, I wish it was less, this is gargantuan and it'll be painful if I need to make changes". Because even the code you didn't write (a dependency) is a liability, and you are responsible for it.
But what is responsibility, and how can you gauge it? The lack of clear lines makes some people courageous with code size; just like in the Sorites paradox, if you keep adding single grains of sand, at what point does a handful become a heap? So how many lines of code is responsible behavior, and how many is not; when does it become irresponsible? To wrap our heads around it, we can start with the extremes. It is relative by definition, so let's fix a time frame: one day. For the concepts you are familiar with, can you be sure of what a single line of code does and be responsible for it? Most certainly, yes. Can you say the same for 30 million lines of code? Absolutely not. So there are in fact limits on how much you can be responsible for.
And that limit is probably variable; it certainly depends on familiarity, project scope, programming expertise, stress levels, whether you are sleep deprived or not, environment, yadda yadda. Then, for the task at hand, you allocate different amounts of responsibility depending on the importance of the task. So we can't gauge it exactly, but we can say this: you have an unknown total amount of responsibility available in a given time frame, which you can allocate on a per-line basis depending on the importance of each line. So the real question is: does RAPL (responsibility allocated per line) scale the way KLOC does when AI tools are used?
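The idea of a fixed budget spread over a growing line count can be sketched as a toy model (the budget unit and the numbers are entirely made up; this is an illustration of the concept, not a real formula):

```python
DAILY_BUDGET = 100.0  # arbitrary units of attention available per day

def rapl(lines_written: int, budget: float = DAILY_BUDGET) -> float:
    """Responsibility allocated per line: the same fixed budget
    gets spread thinner as the daily line count grows."""
    if lines_written <= 0:
        raise ValueError("need at least one line")
    return budget / lines_written

# Hand-writing 200 lines vs. accepting 5000 AI-generated lines:
# the per-line share of your attention drops 25-fold.
```

The point of the sketch is only that the numerator is fixed by being human, while the denominator is what the current tooling hype inflates.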
Responsibility Stretched Thin
With so many lines produced by AI tools, your responsibility allocation stretches thin as a result. The time left over from not actually writing the code, or using AI only on things you have expertise with, might improve your responsibility allocation efficiency. But the limit is still there; there is a line somewhere, one you don't know, that when crossed will degrade the output of everything sharing the same allocation pool in that time frame.
So then the question becomes: do people overestimate their responsibility allocation efficiency when pushing agentic workflows to maximum productivity? When the result you produce is suboptimal, or possibly wrong, or unmaintainable with the current tools, can we really call it productivity? I don't think so.
As Alperen Keleş put it in his article last year, verifiability is the limit. After all, how can you be responsible for something you can't verify? For some tasks AI produces better output than for others, but you can't know for sure without verifying it. Even with rigorous testing, the process is not comparable to a human producing software, because humans iterate toward the solution while AI approximates the solution. Even when AI iterates on complex problems, it still approximates in the smaller steps it takes during the iteration. As long as you can't verify it automatically, it is the practical equivalent of the infinite monkey theorem: it can produce good results, but as long as you have to verify them, you are limited by your own observational capabilities.
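For a few lucky problems, automatic verification really is possible, which is the only way out of the observation bottleneck. A classic example (my own sketch, not from the article cited above) is checking a claimed sort, which is far cheaper than producing one:

```python
import random

def is_valid_sort(original: list[int], result: list[int]) -> bool:
    """Check a claimed sort without trusting whoever produced it:
    the result must be ordered and a permutation of the input."""
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and sorted(original) == sorted(result)

# Whoever produced `result` -- a human, an AI agent, or a monkey at a
# keyboard -- the check is identical and cheap to run in bulk.
data = [random.randint(0, 99) for _ in range(50)]
assert is_valid_sort(data, sorted(data))
```

Most real software has no such cheap oracle; for everything else, the verifier is you, with your finite observational capabilities.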
To Err is Human
I know many people disagree with this; they'll point out that humans also make mistakes. To err is human, I agree; but the process behind those mistakes is different. When a human who is expert in their craft makes a mistake, there is still an iterative process behind it. It is a "mis"-take; it is an outlier; it is unexpected. So can we really attach the same traits to AI making mistakes, and say that humans making mistakes is equivalent to AI making mistakes? They don't share the same process. AI's "mis"-takes are not outliers, because they are not the result of an iterative process; it just approximated to a wrong solution. That's a different definition of the word mistake. Hence humans are needed in the loop, to verify the mistakes of AI.
Humans do make mistakes, so humans use many verification methodologies to reduce the outliers. Can you use the same methodologies to verify the output of AI? Yes, indeed. But note something: you are still responsible for it. Now you are responsible for making sure you test its output rigorously; you just pushed your responsibility around and allocated it differently. But how many people do you think actually do that properly? I'm sure many don't follow sophisticated automated testing methodologies. They just do TDD, and that's if you are lucky. (For the record, I don't believe in TDD, but that's irrelevant.)
So, can AI write its own tests? Because that's what I see from most people: they make AI write its own tests. Now you have to verify whether AI's tests are rigorous; you didn't solve the problem, you just pushed the responsibility allocation to a different place. Could it work? For some workflows and jobs, maybe; figuring that out is part of one's job. These are tools, after all; if you can ensure you allocate your responsibility responsibly, that's fine.
There is No Magic
But you are still limited by your observational capabilities, and you are still limited by your responsibility allocation. There is no magic: either humans are in the loop, or humans are out of the loop for good. There is no middle ground. If AI is capable of doing everything, to the degree that it can produce output comparable to human output without requiring human intervention, then why are you there in the first place as a capable 1000x software "engineer"? You are an outlier; you don't belong there. If AI is not that capable, then you are limited by your "puny" human capabilities.
You can optimize the way humans are in the loop, and you can increase their efficiency with tools; but that's it. So the next question in line is: can someone who works on multiple projects using multiple agents, each with their own swarms, producing thousands of lines of code per day, actually allocate enough responsibility to verifying their ever-changing tests and software? That's for everyone to answer on their own; I personally don't believe it is the case.
Most people just focus on business-centric viewpoints; they want to be early adopters, and they don't want to be left behind, because everyone else is doing it. This is the root of the mass hysteria we have today. How many of the people using these tools do you think would have the guts to keep doing it if leaking user data meant serving several years in prison? That nails it.
Just because your accountability doesn't match your responsibility doesn't mean you have an "I can mess around" card. You should do your job as if you were accountable even when you are not; you should hold yourself to your own responsibility. That's what being an adult is; that's what being responsible for something means, even when it is disadvantageous as a business practice. You do have the right not to care about any of this; but if you don't care, stop LARPing as if you do. Say it outright, as some do; be honest about it.
These are tools; use them where they make sense, and use them responsibly. That is just not the case for many people in the year 2026, which is what prompted me to start this article series: so I can simply link and quote it instead of repeating essentially half-baked versions of it over and over again.