The Age of Comprehension Debt

For as long as we have been writing software, we have been accumulating Technical Debt – a natural by-product of the trade-offs inherent to software engineering; a shortcut taken to meet a deadline, a deferred refactoring, an abstraction that never quite materialized… the list goes on and on. While Technical Debt is understandably spoken of in unfavorable terms, it is important to recognize that, in most cases, the developers who incurred it did so consciously, with at least some understanding of the compromises being made and the reasoning behind them. Even when that understanding is only passively held, it remains recoverable; the decisions, however imperfect, were made by someone who could, when needed, explain them.

With the rapid and increasing adoption of AI Agents across engineering teams, however, a new and far more insidious form of debt has begun to emerge; one developers are only beginning to talk about, yet have intuitively been dealing with for some time now. I have seen it loosely termed as “Comprehension Debt,” and strongly feel it warrants its own distinct classification in software engineering. Unlike Technical Debt, which concerns itself with the state of the code, Comprehension Debt concerns itself with the state of the engineer; specifically, the growing gap between the code a team ships and the code the team can actually explain, maintain, and safely evolve. While distinct from Technical Debt, Comprehension Debt almost certainly serves as a precursor to it. A team that cannot fully explain the systems it is modifying is far more likely to introduce brittle abstractions, duplicate logic, architectural inconsistencies, and short-term fixes that slowly erode the codebase’s original architectural vision over time.

Technical Debt, for all of its drawbacks, is generally a known quantity. When a developer takes a shortcut, applies a quick fix, or defers a proper implementation for a later iteration, there is almost always a working knowledge as to why that decision was made. Even when the original author has long since moved on, the rationale typically lives on within the team; be it captured in an architectural decision record, a design document, commit messages, code comments, or simply in the collective memory of those who were present when the decision was made. The debt, in other words, is paired with context, and that context is what differentiates Technical Debt from Comprehension Debt.

Comprehension Debt, by contrast, has no such pairing. When an Agent produces a working implementation, a developer may accept it, test it, and submit a PR without ever having developed an understanding of why the solution works, why a particular approach was chosen over another, or what trade-offs were made in the process. The code runs, the tests pass, the PR gets merged, and the ticket is closed – fast, efficient, yet a potential ticking time bomb has just been added to your product. The primary concern here is that the developer who now owns that code is, in many cases, no more familiar with it than someone reading it for the first time.

Technical Debt is a debt against the codebase. Comprehension Debt is a debt against the developer and the broader engineering organization itself.

This distinction is crucial. Technical Debt can be paid down through refactoring, incremental improvement, etc. It can be quantified, measured, and addressed in a controlled manner. Comprehension Debt, however, compounds silently; and the interest is often paid during a production outage, a security incident, or the unfortunate moment when a developer is asked to explain, extend, or debug something they never truly understood in the first place.

One could easily place the blame on executives buying into industry hype and pushing for increased productivity as the cause of all issues surrounding AI. However, this is a short-sighted viewpoint as it merely defers responsibility away from where it truly belongs. It’s the executives’ job to ensure productivity is maximized for the success of the business; it’s our job as engineering leaders to manage those expectations while also ensuring they can be achieved realistically.

It is important to emphasize that the problem is not with AI itself, nor with the use of Agents within engineering workflows. These are, arguably, the most transformative tools our industry has ever seen, and the productivity gains they offer are both very real and significant. The problem, rather, lies in how these tools are being used, under what conditions, and what level of experience is guiding their use.

In the hands of a senior engineer, an Agent can be a force multiplier of remarkable capability; it is quite literally the fulfillment of being able to scale oneself horizontally. Seasoned engineers bring with them the accumulated experience to recognize when a suggested solution is appropriate, when it is over-engineered, when it quietly violates an existing architectural boundary, or when it introduces a subtle performance regression that will only surface under load. This experience and insight is the crucial factor as it allows one to steer the Agent with purpose, discretion, and precision; prompting with the right context, rejecting outputs that do not align with the design of the broader system or requirements, and leveraging the Agent to accelerate work they otherwise could have completed on their own – all in a fraction of the time, and ultimately, cost. For the experienced engineer, the Agent is a collaborator, and while the intimate knowledge inherent to writing code by hand inevitably will not be there, the resulting code is still understood just as thoroughly as if it had been written by hand.

In the hands of a junior engineer, however, the same Agent becomes something altogether different. Without the foundational experience required to critically evaluate what is being produced, the output of the Agent is too often accepted at face value. I cannot overstate how many times I have seen this occur. Suggestions are accepted with minimal scrutiny at best, patterns are adopted without understanding their implications, and architectural decisions are, in effect, being made by a tool that almost certainly hasn’t been provided with enough context to have any awareness of the system’s history, constraints, or long-term direction. The junior developer, through no fault of their own, is simply not yet equipped to ask the questions that a senior developer would ask almost instinctively; and the Agent, for all of its capability, will in almost all cases not ask those questions on their behalf.

This is the heart of the matter: Comprehension Debt accrues most rapidly where experience is least present, and it is precisely in those contexts where its consequences are most likely to be misunderstood or overlooked. This discussion frames the experience divide along two poles – junior and senior – to emphasize the contrast most directly, but the same dynamics apply across the spectrum, relative to any engineer’s depth of experience in a given area of the system.

This is not an indictment of junior developers, but a responsibility of engineering leadership. If organizations choose to place powerful Agents in the hands of less experienced engineers, they must also provide the standards, guardrails, mentorship, and review practices necessary to ensure those tools create meaningful velocity by accelerating understanding, rather than quietly replacing it.

As with most things, there will always be exceptions. Some junior engineers will use Agents exceptionally well, with curiosity, discipline, and sound judgment; while some senior engineers will take the path of least resistance simply to get a feature shipped. Wisdom may come from experience, but it must be deliberately exercised to have meaning.

To illustrate the point, it’s helpful to consider a few scenarios which, in one form or another, are playing out across engineering teams with increasing regularity.

Architectural Incoherence: A developer asks an Agent to add a new feature to an existing component or service without providing context outside of the immediate ask. The Agent, lacking full context of the broader architecture, introduces a new dependency, a new pattern, duplicated utilities, or a new abstraction that subtly conflicts with conventions established elsewhere in the codebase. The feature works, the PR is approved, and now the drift is real. Over time, as similar drifts accumulate, the codebase begins to lose its coherence; yet no single commit or PR can be pointed to as the underlying cause.

Silent Security Regressions: An Agent generates an implementation that appears correct and passes all existing tests, yet introduces a subtle security vulnerability; perhaps an unescaped input, an omitted permissions check, a removed CSP entry, exposed credentials, etc. The developer, unfamiliar with the specific attack vector being introduced, submits a PR, and another with just as little understanding merges the change without recognizing the risk. The vulnerability may not surface for months, or years, but when it does, the team is left to investigate code that no one on the team actually authored or fully understood, yet ironically, at least two team members signed off on it.
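As one hedged illustration of the unescaped-input case, consider the sketch below. The helpers `escapeHtmlText`, `renderGreetingUnsafe`, and `renderGreetingSafe` are hypothetical names invented for this example and are not taken from any real codebase. Both rendering functions produce identical output for the kind of benign input a typical test uses, which is how such a gap can pass review unnoticed.

```typescript
// Hypothetical helpers for illustration only; not taken from any real project.

/** Escapes the characters that are significant in HTML text and attribute values. */
function escapeHtmlText(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

/** Interpolates user input directly into markup; a benign-input test still passes. */
function renderGreetingUnsafe(displayName: string): string {
  return `<p>Hello, ${displayName}!</p>`;
}

/** Same output for benign input, but any markup in the input is neutralized. */
function renderGreetingSafe(displayName: string): string {
  return `<p>Hello, ${escapeHtmlText(displayName)}!</p>`;
}
```

A test that only passes a name such as "Ada" cannot tell the two apart; the difference appears only once the input itself contains markup. In a real project, the escaping provided by the templating or UI framework would normally be preferred over a hand-written helper like this one.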

Performance Landmines: An Agent offers a solution that is functionally correct and elegantly designed, yet scales poorly. Perhaps it introduces repeated re-renders of a heavy component, a naive in-memory operation over a collection that will eventually grow unbounded, or a recursive invocation that performs acceptably under development conditions but catastrophically in production. The developer, unaware of the underlying performance characteristics, accepts the solution; and now the landmine is laid, inconspicuously waiting for the load that will trigger it.
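A minimal sketch of the "naive in-memory operation" variety, assuming a hypothetical list of string identifiers that needs de-duplicating. The function names are invented for illustration. Both versions return the same result, so a functional test passes for either one; they differ only in how the work grows with the size of the input.

```typescript
// Hypothetical example: identical results, very different growth characteristics.

/** Quadratic in the worst case: `includes` rescans `seen` for every element. */
function uniqueIdsNaive(ids: string[]): string[] {
  const seen: string[] = [];
  for (const id of ids) {
    if (!seen.includes(id)) {
      seen.push(id);
    }
  }
  return seen;
}

/** Roughly linear: Set lookups are constant time on average, and a Set keeps insertion order. */
function uniqueIdsWithSet(ids: string[]): string[] {
  return Array.from(new Set(ids));
}
```

With a few dozen identifiers in a development environment the two are indistinguishable; the difference only becomes visible once the collection grows large, which is the load the paragraph above describes.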

Dependency Proliferation: Rather than leveraging an existing internal API or a well-understood third-party dependency, the Agent introduces a new dependency to solve the immediate problem. The developer, not having conducted a thorough review of what is already available, accepts the addition. Over time, the dependency graph becomes bloated, inconsistent, and difficult to reason about, the app bundles grow, and both the end user and the team pay the price for it.

The Debugging Gap: Perhaps most significantly, when something goes wrong; and eventually, something usually does; the developer who owns the code is now tasked with debugging an implementation they did not truly author. They must reverse-engineer the intent, infer the reasoning, and reconstruct the context that was never captured. The time investment required to do so often exceeds what would have been required to write the code correctly in the first place. Moreover, perhaps out of habit, the developer will delegate a significant portion of this work again to an Agent, and the cycle repeats itself.

In each of these scenarios, the code works, the tests pass, and the immediate goal is achieved – and it’s all accomplished with amazing speed. Productivity is achieved! Yet the long-term health of the system, and the team that owns it, has been quietly compromised.

The question, then, is not whether to leverage Agents; the answer to that is unambiguous. The question is how engineering leadership can establish the guardrails necessary to ensure that the productivity gains offered by these tools are not paid for in Comprehension Debt. While there are certainly numerous tools that can be leveraged to help mitigate some of the side effects of Comprehension Debt, what follows are some considerations which, in my experience, warrant particular attention, and should be proactively incorporated, ideally prior to adoption and / or proliferation of any particular AI.

Mandate Comprehension, Not Just Completion: A PR that cannot be explained by its author should not be merged. This is perhaps the single most important guardrail a team can establish. Whether the code was written by the developer, generated by an Agent, or produced through some combination of the two, the developer submitting the PR must be able to articulate, in their own words, what the code does, why it was implemented this way, and what alternatives were considered. This is not a punitive measure; it is a protective one, both for the individual and for the team as a whole.

In practice, enforcing this standard requires more than policy; it requires a cultural shift in how code reviews are approached. One effective process is to make comprehension an explicit part of the PR template itself, prompting the author to briefly describe the nature of the change, the reasoning behind the approach, and any alternatives that were weighed. This shifts the burden of proof to the author before the review even begins, and in doing so, often surfaces gaps in understanding early, before they compound. For junior developers especially, this practice doubles as a learning mechanism; the act of articulating a decision, even imperfectly, begins the process of genuinely internalizing and learning from it. Inline review comments should also be included by the submitter as this encourages a deeper understanding at the implementation level. At best, we have a record of the understanding simply via the PR process, and at worst, even if the developer uses an Agent to provide the comments, they are still implicitly learning when doing so.
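One way such a template might be checked automatically is sketched below. It assumes, purely for illustration, that the PR description uses `## ` headings and that the required sections are titled "What changed", "Why this approach", and "Alternatives considered"; the function name, the section titles, and the five-word minimum are all hypothetical choices rather than an established convention.

```typescript
// Hypothetical section titles; a real template would define its own.
const REQUIRED_SECTIONS: readonly string[] = [
  "What changed",
  "Why this approach",
  "Alternatives considered",
];

/**
 * Returns the required sections that are absent from the PR description
 * or that contain fewer than `minWords` words.
 */
function findMissingSections(prBody: string, minWords: number = 5): string[] {
  const sectionLines = new Map<string, string[]>();
  let currentLines: string[] | undefined;

  for (const line of prBody.split(/\r?\n/)) {
    const title = /^##\s+(.+?)\s*$/.exec(line)?.[1];
    if (title !== undefined) {
      // A new heading starts a new section.
      currentLines = [];
      sectionLines.set(title.toLowerCase(), currentLines);
    } else if (currentLines !== undefined) {
      currentLines.push(line);
    }
  }

  return REQUIRED_SECTIONS.filter((title) => {
    const words = (sectionLines.get(title.toLowerCase()) ?? [])
      .join(" ")
      .split(/\s+/)
      .filter((word) => word.length > 0);
    return words.length < minWords;
  });
}
```

A check like this can only confirm that something was written, not that it reflects genuine understanding; it is a prompt for the author and the reviewer rather than a substitute for the conversation itself.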

Elevate the Code Review: Code Reviews have always been among the most valuable practices available to an engineering organization, and in the age of Agents, their importance is only amplified. While many organizations are looking to automate the review process entirely, I strongly feel this is the wrong direction to take. To be clear, this is not to suggest that none of the code review process should be automated; in fact, quite the opposite – mundane, pattern-based portions of the review process, such as overall structure, coding conventions, and virtually everything else that can be added to an Agent’s rules, should most certainly be automated. This frees up reviewers for the greater responsibility they now have, as in addition to their traditional responsibilities, they must now evaluate code with an awareness that portions of the source, likely most of it, have been generated. This means asking deeper questions; not merely “does this look right?”, but “does the author understand this, and does it align with the broader system?” It is, in effect, a natural extension of the practice, but one which requires a renewed level of diligence. While not the most enjoyable aspect of the job, it is one the responsible developer is already extremely familiar with, as they have been using Agents in this same capacity all along.

Pair Experience with Inexperience: Where junior developers are leveraging Agents, they should do so, whenever practical, in collaboration with more senior team members who can provide the contextual awareness and mentorship the Agent cannot. This is not about restricting access to the tools; it is about ensuring that the experiential gap is actively being closed, rather than widened, by their use. Fundamentally, this is no different from assigning less critical tickets to less experienced developers, and having more senior developers oversee and guide their designs and implementations.

Preserve the Rationale: Design decisions; whether made by a developer, an Agent, or a collaboration between the two; must be captured in a form that outlives the moment of the commit. Architectural Decision Records, design documents, and meaningful commit messages become even more important, not less, as Agents take on a larger share of the authorship. Without this, the reasoning behind a decision is lost at the exact moment the decision is made.

Invest in Foundational Knowledge: It has perhaps never been more important for engineering organizations to invest in the foundational growth of their developers. The fundamentals of system design, architectural patterns, algorithmic thinking, and security principles must be deliberately cultivated; they are the exact skills required to effectively steer an Agent, and precisely the skills that atrophy when an Agent is allowed to do all of the decision making. Training, mentorship, and a culture that values craftsmanship are not luxuries in this environment; they are essentials.

Establish Clear Conventions and Rules for Agent Usage: Teams should be explicit about where Agents are appropriate, where caution is warranted, and where human authorship remains preferred. Generating boilerplate, scaffolding tests, summarizing existing patterns, or exploring implementation approaches are often well-suited to Agent-assisted work. Critical security boundaries, core domain logic, and architecturally significant decisions, however, should remain more deliberately human-led, even if that approach is slower in the short term. Agents can and should still be leveraged in these areas, but they need to be directed with far greater focus, context, and intent.

Perhaps most importantly, organizations should provide carefully crafted shared rules that are used consistently across all Agents. These rules help ensure that Agents are aligned first with the broader engineering org’s standards, further extended to include team / domain specific standards, and result in outputs that are predictable, easy to review and maintain, and most importantly, trustworthy. I cannot overstate the value of this. In practice, well-defined Agent rules have enabled me to generate code that is nearly indistinguishable from what I would have written myself, not only in terms of implementation structure, but also documentation style, naming conventions, architectural patterns, and prose. Without shared rules, Agent usage can quickly become inconsistent and difficult to govern. With them, Agents become far more reliable extensions of the engineering team’s existing standards and expectations.

Measure What Matters: Productivity metrics that reward throughput without regard for comprehension will, inevitably, incentivize the accumulation of Comprehension Debt. Leadership must be thoughtful about what is being measured, and ensure that quality, understanding, and long-term maintainability are given their due weight alongside velocity. In concrete terms, this means supplementing velocity-based metrics with indicators that reveal comprehension and quality over time: tracking the rate of post-merge defects attributed to Agent-generated code, monitoring the frequency with which developers can accurately describe the intent and trade-offs of code they have submitted, and measuring the time required to onboard new team members to areas of the codebase that are heavily Agent-authored. Incident retrospectives should explicitly note whether the implicated code was Agent-generated and whether the responding engineers had sufficient familiarity with it to diagnose the issue without re-engaging an Agent. Over time, these metrics, taken together, provide a meaningful proxy for organizational comprehension health, and allow leadership to course-correct before the debt becomes systemic.
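As a rough sketch of the first of these indicators, the snippet below compares post-merge defects per merged change for Agent-generated and human-authored work. The `MergedChange` shape, its field names, and the idea that each change carries a reliable "agent-generated" flag are assumptions made for illustration; how that attribution is actually recorded will vary from team to team.

```typescript
// Hypothetical record of a merged change; field names are illustrative.
interface MergedChange {
  agentGenerated: boolean; // assumed to come from a PR label or similar marker
  postMergeDefects: number; // defects later traced back to this change
}

interface DefectRates {
  agentGenerated: number | null; // null when there are no changes in the group
  humanAuthored: number | null;
}

/** Average number of post-merge defects per change, or null for an empty group. */
function defectsPerChange(changes: MergedChange[]): number | null {
  if (changes.length === 0) {
    return null;
  }
  const total = changes.reduce((sum, change) => sum + change.postMergeDefects, 0);
  return total / changes.length;
}

/** Splits changes by authorship and reports the defect rate for each group. */
function postMergeDefectRates(changes: MergedChange[]): DefectRates {
  return {
    agentGenerated: defectsPerChange(changes.filter((change) => change.agentGenerated)),
    humanAuthored: defectsPerChange(changes.filter((change) => !change.agentGenerated)),
  };
}
```

A ratio like this is only a proxy, and it is best read as a trend over time alongside the other indicators described above rather than as a standalone score.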

The arrival of capable AI Agents is one of the most significant developments in the history of our profession. The opportunities they present are vast, and the teams that learn to leverage them effectively will, undoubtedly, hold a substantial advantage over those that do not. Yet it would be a profound mistake to assume that effective leverage is synonymous with broad adoption prior to proper governance; or that productivity, measured narrowly by adoption, is the same as progress – it’s not.

The teams that will succeed in this new landscape are not those that adopt these tools most aggressively, but those that adopt them most thoughtfully. They are the teams that pair the remarkable capability of the Agent with the experience, judgment, and craftsmanship of engineers who understand not only what the Agent has produced, but more importantly, why; and who are prepared to own, explain, and evolve that code long after the initial generation has taken place.

Experience and wisdom, combined with a deliberate and informed use of AI, is what will ultimately differentiate the teams that thrive from those that will find themselves face-to-face with the consequences of implementing solutions they never truly comprehended. Technical Debt, we have long known how to manage. Comprehension Debt, however, is a debt we are only beginning to discuss, and will undoubtedly be dealing with for some time; and the sooner we acknowledge it, formally name it, and actively work to prevent its accrual, the better positioned our teams, and our industry as a whole, will be for the extraordinary opportunities that lie ahead.

Cross Published at medium.com

Code Review Essentials

Code Reviews are an essential part of Software Engineering, providing numerous benefits for teams and the products they deliver. Having spent a significant amount of time conducting them over many years, I will touch upon some key aspects in this article which, generally speaking, are of particular importance.

Similar to functional testing, Code Reviews provide a unique set of quality controls which help ensure standards are upheld; affording teams the ability to verify a number of critical concerns early on and within the confines of engineering specific constructs. This almost certainly yields a higher return, as addressing issues at this stage requires minimal involvement across teams and functions.

Code Reviews also serve to aid in the verification and upholding of best practices, standards, and conventions across teams and within organizations. These standards can cover a broad range of concerns such as consistency, facilitation of reuse, scalability, security, optimization, readability, simplification, and any other auxiliary criteria specific to a given organization.

Additionally, the Code Review helps to confirm that requirements have been fulfilled in the context of the underlying feature being reviewed, as it is not uncommon for developers to misinterpret requirements.

Likewise, developers are generally focused on solving various small problems in a very particular and limited scope. Because of this, it is inevitable that opportunities will be missed, and oversights will be made. One of the primary responsibilities of the Reviewer is to provide a holistic and broad perspective which takes into account not only the soundness of the code being reviewed, but also how it measures, complies, and integrates in the context of the larger system as a whole.

By having another set of eyes, so to speak, we arm ourselves with a very important second line of defense, as well as an agent for opportunity.

One of the most beneficial aspects of Code Reviews is the investment in overall knowledge throughout the team; and ultimately, the ROI it provides. As such, core to the Code Review is the proliferation of knowledge. This applies to both the Reviewee, and the Reviewer alike.

For the Reviewee, when areas of improvement, best practices, optimizations, abstractions and the like are outlined, an opportunity is presented for one to learn new (often improved) techniques which they may not have been aware of otherwise. This holds particularly true for more junior developers who simply have yet to acquire the experiential knowledge obtained by their more senior counterparts. By learning from the experiences of others, the Reviewee can expedite their own growth as a Developer. Here, the expectation is that, over time, each Reviewee will have fewer and fewer of the same review comments to address as they now have a dedicated platform (even if unofficially so) from which to continually learn.

For the Reviewer, Code Reviews provide an opportunity to share knowledge and insight, while affording one the ability to obtain a broader understanding of the system in its entirety, as this knowledge is vital to providing a successful review.

Additionally, it may be necessary for a Reviewer to devise and provide solutions to problems which they may not have encountered previously and, in order to be effective, a Reviewer must be confident in the feedback and solutions they are providing. This alone affords the Reviewer themselves the ability to gain a deeper understanding of their own knowledge, while also challenging themselves in order to obtain the information necessary to do so. Thus, for the Reviewer, Code Reviews present a tremendous opportunity to not only provide value to others, but also to obtain and enhance their own value as well.

In general, developers more or less tend to work in a rather siloed manner, primarily focusing on one particular problem space (particularly in the scope of a given feature), and only collaborating when necessitated by DSMs, meetings, or when they or another team member runs into a problem and needs assistance. While much of this is a rather natural by-product of feature development, so too can it be said that Code Reviews naturally cultivate collaboration; thus, collaboration can be built into our processes by default.

With Code Reviews, no one Developer is ever working completely on their own. This has numerous benefits, many of which have already been outlined above, yet perhaps one of the most significant benefits is that developers are much more likely to double check their work and submit something that they can be proud of when they know someone else on their team will be reviewing their work. Likewise, Reviewers, no matter how experienced, are much more likely to validate and double check their feedback for the exact same reasons. This alone lends itself to higher quality output across the board.

Key Aspects to Consider

While numerous aspects must be considered with respect to conducting Code Reviews, generally speaking, there are common considerations which by and large tend to hold true. While certainly not an exhaustive treatise, what follows is a brief outline of those I have found to provide particular value.

Atomicity: PRs should be atomic (relatively small in nature). If a PR is excessively large, it should be rejected and the engineer should be asked to break the PR into smaller submissions (generally these smaller submissions can be merged to an intermediary branch before being merged to the intended target branch). This is crucial as the surface area for mistakes and missed opportunities is proportional to the amount of code being reviewed. In addition, requiring PRs which are smaller in scope encourages developers to think in terms of smaller units of function and subsystems, which in turn leads to clearer separation of concerns, and encapsulation. As such, it is often helpful to impose a change threshold for submitted PRs.
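A change threshold of this kind could be sketched as follows. The `ChangedFile` shape, the list of ignored file suffixes, and the default of 400 changed lines are hypothetical placeholders chosen for the example; an appropriate limit is something each team would need to settle for itself.

```typescript
// Hypothetical description of one file in a PR's diff.
interface ChangedFile {
  path: string;
  additions: number;
  deletions: number;
}

// Illustrative list of generated files that should not count toward the threshold.
const IGNORED_SUFFIXES: readonly string[] = ["package-lock.json", "yarn.lock", ".snap"];

/** Sums added and deleted lines across files a human is expected to review. */
function countReviewableLines(files: ChangedFile[]): number {
  return files
    .filter((file) => !IGNORED_SUFFIXES.some((suffix) => file.path.endsWith(suffix)))
    .reduce((total, file) => total + file.additions + file.deletions, 0);
}

/** True when the PR is larger than the agreed threshold (400 is an arbitrary placeholder). */
function exceedsChangeThreshold(files: ChangedFile[], maxChangedLines: number = 400): boolean {
  return countReviewableLines(files) > maxChangedLines;
}
```

Excluding lockfiles and snapshots keeps the check focused on the code a reviewer actually has to read, which is the surface area the paragraph above is concerned with.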

Compatibility: Changes should remain backwards compatible and not introduce breaking changes (unless expressly coordinated across teams). Reviewers need not check out each PR and explicitly test each feature being submitted; rather, they should always be cautious of breaking changes, particularly in terms of APIs (e.g. argument positions changing).
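To make the argument-position case concrete, here is a small sketch built around a hypothetical `makeDimensions` helper; all of the function names are invented for illustration. Because both parameters are numbers, swapping their order still compiles at every existing call site, which is why this kind of change tends to need a reviewer's attention rather than the type checker's.

```typescript
// Hypothetical API used only to illustrate the point.
interface Dimensions {
  width: number;
  height: number;
}

/** Original signature that existing callers depend on. */
function makeDimensionsV1(width: number, height: number): Dimensions {
  return { width, height };
}

/**
 * Backwards compatible: the new `scale` parameter is optional, comes last,
 * and defaults to a value that preserves the original behavior.
 */
function makeDimensionsV2(width: number, height: number, scale: number = 1): Dimensions {
  return { width: width * scale, height: height * scale };
}

/**
 * Breaking: the positions are swapped but the types are unchanged,
 * so existing calls compile and silently produce the wrong result.
 */
function makeDimensionsBreaking(height: number, width: number): Dimensions {
  return { width, height };
}
```

An existing call such as `makeDimensionsV1(800, 600)` keeps its meaning when moved to `makeDimensionsV2`, whereas the same arguments passed to `makeDimensionsBreaking` yield a width of 600 and a height of 800.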

Consistency: PRs must fully adhere to well documented and established standards and conventions; typically supported via commit convention tooling. This is crucial, as consistency and conformity of standards lead to a unified codebase where developers can easily work across packages and features with very limited effort. Because the overall structure and coding style are consistent, it is much easier to know where everything should be, and is, defined, and how modules are organized; readability is immediate, as formatting and structure remain the same across packages and modules.

Clarity: All modules, functions, classes, types, etc. should always be clearly named, well defined, properly encapsulated, and reside within a logical and appropriate location.

Readability: Readability should be favored over excessive succinctness or overly “clever” implementations which do not read well. Conversely, overly verbose implementations are to be avoided as well. It is important to remain cognizant of the fact that code is read many, many more times than it is written. Moreover, when implementations become hard to reason about, that is often a sign of a poor implementation (usually the result of a specific unit doing far too much). Succinct, yet meaningful names must always be used. Strive to ensure code is self-documenting in terms of its intention.

Reusability: Implementations must take reuse into account at all times; be it abstractions to common packages, abstractions within a particular project, or abstractions within a particular scope of a project. In addition, Reviewers should always be on the lookout for additions which are redundant and should be removed and replaced with existing APIs available. This includes both internal APIs, as well as third-party libraries. Always ensure native APIs are being leveraged (Array.forEach, etc. rather than explicit for loops) as well as standard third-party libraries (lodash.debounce, etc. rather than custom implementations). No redundancies should be introduced, and implementations should fully utilize existing APIs, Modules, Components, etc. throughout the available packages.
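As a small, hedged illustration of preferring native APIs over explicit loops, the sketch below computes the same total twice. The `LineItem` shape and both function names are hypothetical and exist only for this example.

```typescript
// Hypothetical data shape for illustration.
interface LineItem {
  sku: string;
  quantity: number;
  unitPrice: number;
}

/** Explicit loop with a mutable accumulator and an inline condition. */
function totalInStockLoop(items: LineItem[]): number {
  let total = 0;
  for (const item of items) {
    if (item.quantity > 0) {
      total += item.quantity * item.unitPrice;
    }
  }
  return total;
}

/** Native array methods: each step (select, then sum) is named by the method that performs it. */
function totalInStockNative(items: LineItem[]): number {
  return items
    .filter((item) => item.quantity > 0)
    .reduce((total, item) => total + item.quantity * item.unitPrice, 0);
}
```

Both functions return the same value; the second simply leans on operations that every reader of the codebase already knows, which is the kind of reuse a Reviewer can reasonably ask for.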

Simplicity: Solutions should always be implemented in the simplest way possible. Less is more; this extends down to each line of code. Keep things as simple as possible, but no simpler.

Scalability: Implementations must be performant and optimized to an acceptable and expected level – generalized optimizations must be made, while more specialized optimizations should only be suggested when necessary, so as to avoid optimizing prematurely.

Securability: Implementations must be secure, keeping standardized security measures in place and ensuring attack vectors and cumulative surfaces are fully understood, accounted for, and securely addressed.

Discoverability: Documentation and / or related tools should follow specific conventions and remain succinct and to the point. Ideal documentation should provide a meaningful, yet brief description, followed by a useful example which speaks for itself (often, unit test expectations can be used verbatim here). On a related note, sources should not contain overly verbose inline comments either. When, for example, a function has more lines of inline comments than actual implementation code, that’s usually a sign that the code does not read well, or the developer has merely been leaving “note to self” comments. In such cases, strive to provide ways to simplify the implementation such that it achieves better readability by being self-documenting.
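The shape of documentation described here might look something like the following sketch, which uses a hypothetical `clampToRange` utility and a JSDoc-style comment; the function is invented for illustration and the comment style is only one of several reasonable conventions.

```typescript
/**
 * Restricts `value` to the inclusive range from `min` to `max`.
 *
 * @example
 * clampToRange(15, 0, 10); // 10
 * clampToRange(-3, 0, 10); // 0
 * clampToRange(7, 0, 10);  // 7
 */
function clampToRange(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}
```

The description is a single sentence, and the example lines are the same expectations a unit test would assert, so the documentation and the tests can be kept in step with little effort.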

Accountability: It is crucial that all Team members are aware of the criteria against which their code will be reviewed, as doing so essentially holds developers accountable for ensuring they not only understand what is expected, but are diligent in reviewing their own work prior to submission. Developers should be encouraged to pre-submit PRs in order to perform a “self review” prior to officially submitting and / or assigning a reviewer. This approach is quite valuable as it provides the developer with a high-level overview of their changes outside of the environment they have been working in, and within the context of the branch to which their changes will be integrated.

While there are certainly other factors to consider when conducting Code Reviews, the above considerations touch upon some of the more fundamental aspects, with the key points hopefully being apparent, as perhaps the most important trait of a successful reviewer is one’s ability to clearly express intent while also passing this knowledge on to others.