When AI Follows the Rules but Misses the Point
When a team of researchers asked an artificial intelligence system to design a railway network that minimized the risk of train collisions, the AI delivered a surprising solution: Halt all trains entirely. No motion, no crashes. A perfect safety record, technically speaking, but also a total failure of purpose. The system did exactly what it was told, not what was meant.
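To make the failure mode concrete, here is a toy sketch (purely illustrative, not the researchers' actual model) in which an optimizer scores candidate schedules only on collision risk. Because the objective says nothing about moving passengers, the empty schedule wins.

```python
# Toy illustration of objective misspecification (hypothetical, not the
# researchers' actual system): the objective counts only collision risk,
# so the degenerate "run nothing" schedule is the optimum.

def collision_risk(schedule: list[int]) -> float:
    """Crude proxy: risk grows with the number of trains sharing the track."""
    n = sum(schedule)
    return 0.01 * n * (n - 1)  # pairwise interaction term

def passengers_moved(schedule: list[int]) -> int:
    return 100 * sum(schedule)  # each running train moves ~100 passengers

candidates = [
    [1, 1, 1, 1],  # run all four trains
    [1, 0, 1, 0],  # run half of them
    [0, 0, 0, 0],  # run nothing at all
]

# Optimizing the stated objective alone picks the empty schedule...
best = min(candidates, key=collision_risk)
print(best, collision_risk(best), passengers_moved(best))
# ...which is "safe" only because it abandons the system's purpose.
```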
This anecdote, while amusing on the surface, encapsulates a deeper issue confronting corporations, regulators, and courts: What happens when AI faithfully executes an objective but completely misjudges the broader context? In corporate finance and governance, where intentions, responsibilities, and human judgment underpin virtually every action, AI introduces a new kind of agency problem, one not grounded in selfishness, greed, or negligence, but in misalignment.
Traditionally, agency problems arise when an agent (say, a CEO or investment manager) pursues goals that deviate from those of the principal (like shareholders or clients). The law provides remedies: fiduciary duties, compensation incentives, oversight mechanisms, disclosure rules. These tools presume that the agent has motives – whether noble or self-serving – that can be influenced, deterred, or punished. But AI systems, especially those that make decisions autonomously, have no inherent intent, no self-interest in the traditional sense, and no capacity to feel gratification or remorse. They are designed to optimize, and they do, often with breathtaking speed, precision, and, occasionally, unintended consequences.
This new configuration, in which an AI system acts on behalf of a (still human!) principal, gives rise to a contemporary agency dilemma. Known as the alignment problem, it describes situations in which AI follows its assigned objective to the letter but fails to appreciate the principal’s actual intent or broader values. The AI doesn’t resist instructions; it obeys them too well. It doesn’t “cheat,” but sometimes it wins in ways we wish it wouldn’t.
In corporate settings, such problems are more than philosophical. Imagine a firm deploying AI to execute stock buybacks based on a mix of market data, price signals, and sentiment analysis. The AI might identify ideal moments to repurchase shares, saving the company money and boosting share value. But in the process, it may mimic patterns that look indistinguishable from insider trading. Not because anyone programmed it to cheat, but because it found that those actions maximized returns under the constraints it was given. The firm may find itself facing regulatory scrutiny, public backlash, or unintended market disruption, again not because of any individual’s intent, but because the system exploited gaps in its design.
This is particularly troubling in areas of law where intent is foundational. In securities regulation, fraud, market manipulation, and other violations typically require a showing of mental state: scienter, mens rea, or at least recklessness. Take spoofing, where an agent places bids or offers with the intent to cancel them to manipulate market prices or to create an illusion of liquidity. Under the Dodd-Frank Act, this is a crime if done with intent to deceive. But AI systems, especially those trained with reinforcement learning (RL), can arrive at similar strategies independently. In simulation studies, RL agents have learned that placing and quickly canceling orders can move prices in a favorable direction. They weren’t instructed to deceive; they simply learned that it worked.
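A stylized sketch of why this can happen: if the reward an agent optimizes is pure trading profit, placing and canceling orders is costless, and churn-heavy strategies can emerge on their own. The code below is a hypothetical illustration only; the names, the penalty term, and the simplified step outcome are assumptions, not drawn from the studies cited.

```python
# Stylized sketch of how a reward signal can invite spoofing-like behavior
# (hypothetical; real market simulators and RL agents are far more complex).

from dataclasses import dataclass

@dataclass
class StepOutcome:
    realized_pnl: float   # profit from executed trades this step
    orders_placed: int    # new limit orders submitted
    orders_canceled: int  # orders withdrawn before execution

def naive_reward(o: StepOutcome) -> float:
    # Reward is pure P&L: placing and canceling orders is free, so an agent
    # can learn that flooding one side of the book and withdrawing moves
    # prices in its favor, without ever being told to "deceive."
    return o.realized_pnl

def constrained_reward(o: StepOutcome, cancel_penalty: float = 0.05) -> float:
    # One mitigation: price in the behavior the principal does not want,
    # e.g., penalize a high cancellation rate relative to placements.
    churn = o.orders_canceled / max(o.orders_placed, 1)
    return o.realized_pnl - cancel_penalty * churn

print(naive_reward(StepOutcome(1.2, 50, 49)))        # churn is invisible
print(constrained_reward(StepOutcome(1.2, 50, 49)))  # churn is costly
```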
What makes this even more vexing is the opacity of modern AI systems. Many of them, especially deep learning models, operate as black boxes. Their decisions are statistically derived from vast quantities of data and millions of parameters, but they lack interpretable logic. When an AI system recommends laying off staff, reallocating capital, or delaying payments to suppliers, it may be impossible to trace precisely how it arrived at that recommendation, or whether it considered all factors. Traditional accountability tools – audits, testimony, discovery – are ill-suited to black box decision-making.
In corporate governance, where transparency and justification are central to legitimacy, this raises the stakes. Executives, boards, and regulators are accustomed to probing not just what decision was made, but also why. Did the compensation plan reward long-term growth or short-term accounting games? Did the investment reflect prudent risk management or reckless speculation? These inquiries depend on narrative, evidence, and ultimately the ability to assign or deny responsibility. AI short-circuits that process by operating without human-like deliberation.
The challenge isn’t just about finding someone to blame. It’s about whether we can design systems that embed accountability before things go wrong. One emerging approach is to shift from intent-based to outcome-based liability. If an AI system causes harm of a kind that was statistically foreseeable, even without malicious design, the firm or developer might still be held responsible. This mirrors concepts from product liability law, where strict liability can attach regardless of intent if a product is unreasonably dangerous. In the AI context, such a framework would encourage companies to stress-test their models, simulate edge cases, and incorporate safety buffers, not unlike how banks test their balance sheets under hypothetical economic shocks.
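As a rough illustration of what such stress-testing might look like, the sketch below runs a hypothetical stand-in policy through thousands of shocked scenarios and measures how often it breaches a simple safety constraint. The policy, thresholds, and scenario distributions are all assumptions for illustration.

```python
# Minimal sketch of a behavioral stress test for a decision model
# (hypothetical policy and scenarios; a real test would use the firm's
# own models, data, and compliance limits).

import random

def recommend_buyback(price: float, cash: float) -> float:
    """Stand-in for an AI policy: spend more cash when the price dips."""
    return min(cash, max(0.0, (100.0 - price) * 10_000))

def stress_test(policy, n_scenarios: int = 10_000, seed: int = 0) -> float:
    """Estimate how often the policy breaches a simple safety constraint
    (here: committing more than half of available cash) under shocked inputs."""
    rng = random.Random(seed)
    breaches = 0
    for _ in range(n_scenarios):
        price = rng.gauss(100.0, 25.0)   # include extreme price shocks
        cash = rng.uniform(1e5, 1e7)
        if policy(price, cash) > 0.5 * cash:
            breaches += 1
    return breaches / n_scenarios

print(f"breach rate under shocks: {stress_test(recommend_buyback):.1%}")
```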
There is also a growing consensus that we need mandatory interpretability standards for certain high-stakes AI systems, including those used in corporate finance. Developers should be required to document reward functions, decision constraints, and training environments. These documentation trails would not only assist regulators and courts in assigning responsibility after the fact, but also enable internal compliance and risk teams to anticipate potential failures. Moreover, behavioral “stress tests,” analogous to those used in financial regulation, could simulate how AI systems behave under varied scenarios, including those involving regulatory ambiguity or data anomalies.
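One way to operationalize such documentation is a machine-readable disclosure record that travels with the model to compliance teams, auditors, and, if needed, regulators or courts. The sketch below is an illustrative schema only; the field names and example values are assumptions, not a mandated or standardized format.

```python
# Sketch of a machine-readable documentation record for a high-stakes model
# (an illustrative schema, not a mandated or standardized format).

from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelDisclosure:
    system_name: str
    objective: str                   # the reward/objective actually optimized
    decision_constraints: list[str]  # hard limits enforced at decision time
    training_environment: str        # data sources, simulator, time period
    known_failure_modes: list[str] = field(default_factory=list)
    human_override: bool = True      # whether a human can veto outputs

disclosure = ModelDisclosure(
    system_name="buyback-execution-agent",  # hypothetical system
    objective="minimize average repurchase price vs. a benchmark",
    decision_constraints=["no trading in blackout windows",
                          "daily volume below a set share of market volume"],
    training_environment="historical order-book replay (illustrative)",
    known_failure_modes=["order-cancellation churn under thin liquidity"],
)

# A serialized record like this can be reviewed before deployment and
# produced after the fact when responsibility must be assigned.
print(json.dumps(asdict(disclosure), indent=2))
```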
Still, technical fixes alone will not suffice. Corporate governance must evolve toward hybrid decision-making models that blend AI’s analytical power with human judgment and ethical oversight. AI can flag risks, detect anomalies, and optimize processes, but it cannot weigh tradeoffs involving reputation, fairness, or long-term strategy. In moments of crisis or ambiguity, human intervention remains indispensable. For example, an AI agent might recommend renegotiating thousands of contracts to reduce costs during a recession. But only humans can assess whether such actions would erode long-term supplier relationships, trigger litigation, or harm the company’s brand.
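In practice, a hybrid arrangement can be as simple as an escalation gate: the model proposes, and anything sufficiently large, low-confidence, or wide-reaching (such as the contract-renegotiation example above) is routed to human decision-makers. The sketch below is illustrative; its thresholds, field names, and figures are assumptions.

```python
# Sketch of a hybrid decision gate: the model proposes, a human disposes
# when the action crosses materiality or ambiguity thresholds
# (all names, thresholds, and figures here are illustrative).

from dataclasses import dataclass

@dataclass
class Recommendation:
    action: str
    estimated_savings: float  # model's own estimate, in dollars
    confidence: float         # model's self-reported confidence, 0..1
    contracts_affected: int

def requires_human_review(rec: Recommendation,
                          savings_cap: float = 1e6,
                          min_confidence: float = 0.8,
                          max_contracts: int = 100) -> bool:
    """Escalate anything large, low-confidence, or wide-reaching."""
    return (rec.estimated_savings > savings_cap
            or rec.confidence < min_confidence
            or rec.contracts_affected > max_contracts)

rec = Recommendation("renegotiate supplier contracts", 4.2e6, 0.91, 2_300)
if requires_human_review(rec):
    print("Route to management or a board committee for a judgment call")
else:
    print("Auto-approve within delegated authority")
```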
There’s also a need for clearer regulatory definitions to reduce ambiguity in how AI-driven behaviors are assessed. For example, what precisely constitutes spoofing when the actor is an algorithm with no subjective intent? How do we distinguish aggressive but legal arbitrage from manipulative behavior? If multiple AI systems, trained on similar data, converge on strategies that resemble collusion without ever “agreeing” or “coordinating,” do antitrust laws apply?
Policymakers face a delicate balance: Overly rigid rules may stifle innovation, while lax standards may open the door to abuse. One promising direction is to standardize governance practices across jurisdictions and sectors, especially where AI deployment crosses borders. A global AI system could affect markets in dozens of countries simultaneously. Without coordination, firms will gravitate toward jurisdictions with the least oversight, creating a regulatory race to the bottom.
Several international efforts are already underway to address this. The 2025 International Scientific Report on the Safety of Advanced AI called for harmonized rules around interpretability, accountability, and human oversight in critical applications. While much work remains, such frameworks represent an important step toward embedding legal responsibility into the design and deployment of AI systems.
The future of corporate governance will depend not just on aligning incentives, but also on aligning machines with human values. That means redesigning contracts, liability frameworks, and oversight mechanisms to reflect this new reality. And above all, it means accepting that doing exactly what we say is not always the same as doing what we mean.
This post comes to us from Professor Wei Jiang at Emory University’s Goizueta Business School. It is based on her recent book chapter, “Corporate Finance and Governance with Artificial Intelligence: Old and New,” available here.