
BLACKSMITH AI · AI-Native Equity Research System
CONSTRAINED INTELLIGENCE IN FUNDAMENTAL EQUITY RESEARCH
A Hybrid Human-AI Workflow for Solving the Analyst Bottleneck
Adam Thomas Contreras
Fundamental Value Investing Frameworks | AI-Native Research Architecture · April 2026
EXECUTIVE ABSTRACT
Modern equity research operates under a fundamental contradiction. The volume of corporate disclosure, macroeconomic data, and alternative information has compounded to levels that exceed any individual analyst's cognitive bandwidth — yet purely quantitative models routinely fail to capture the qualitative texture of business reality that separates a compounding franchise from a value trap. This paper documents the design and first deployment of a hybrid human-AI investment research system. The architecture integrates a proprietary valuation engine — grounded in fundamental value investing frameworks — with a locally deployed AI that is constrained to act as a qualitative signal extractor, culminating in a structured human judgment layer for final capital allocation. The system's central thesis is not that AI should replace the investor. It is that AI, when properly constrained, can compress the qualitative research cycle without degrading its rigor — freeing the human analyst to function at the level of strategic synthesis rather than document parsing. ——————————————————————————
KEY EMPIRICAL FINDING
In April 2025, the first experimental deployment of this architecture produced results from a personal broker account: a realized individual position return of approximately 64% on Twin Disc Incorporated (TWIN), entered at $7.91 and exited at $13.00, identified entirely through the AI-assisted funnel. The full-year portfolio, anchored by a subsequent position in Citigroup, yielded an approximate 70% return through December 2025. These results provided early evidence of the architecture's operational validity. Past performance is not indicative of future results. Generated in a personal account over a limited period; not independently audited.
SECTION 1 — THE ANALYST BOTTLENECK PROBLEM
1. THE ANALYST BOTTLENECK PROBLEM
Traditional fundamental equity research is, at its core, a labor-intensive information compression problem. An analyst covering a universe of companies must process hundreds of annual reports, quarterly filings, earnings call transcripts, proxy statements, and industry data — all while maintaining sufficient cognitive bandwidth to form original, differentiated investment judgments. The uncomfortable truth of this model is that it relies entirely on a human being, and human beings are structurally imperfect instruments for this task.
Human analysts get tired, and fatigue degrades the quality of judgment in ways that are invisible to the analyst. Humans are inconsistent: the same analyst applying the same framework to two similar companies in the same week will weigh qualitative factors differently depending on mood, recent experience, and cognitive load. Humans make errors under time pressure — missed footnotes, misread figures, overlooked risk disclosures — that can transform a well-intentioned investment thesis into a capital-destroying mistake. And human cognitive bandwidth is hard-capped: no matter how talented, no single analyst can maintain genuine analytical depth across more than a handful of active positions simultaneously. This is not a criticism of analysts. It is a description of biology.
The constrained AI system at the core of this platform does not get tired. It does not lose consistency on the fifteenth company it evaluates in a single session. It does not make fatigue-driven errors. And its capacity to process information does not approach a ceiling the way human attention does. This structural asymmetry is the foundation of the bottleneck thesis: when the cost of thorough qualitative research is prohibitively high for human analysts, quantitatively attractive but qualitatively complex stocks become systematically under-researched — creating the conditions under which AI-assisted fundamental analysis can extract durable, repeatable alpha, not by predicting the future, but by covering more ground, more consistently, than any human-only research operation ever could.
SECTION 2 — THEORETICAL FOUNDATION
2. THEORETICAL FOUNDATION
2.1 · The Bottleneck as Structural Market Inefficiency
The analyst bottleneck described in Section 1 is not merely an operational inconvenience — it is a structural source of market inefficiency, leaving quantitatively cheap but qualitatively complex stocks systematically under-researched. The net-net strategy — purchasing securities at a discount to net current asset value — is the original exploitation of this inefficiency, rooted in fundamental value investing frameworks developed over decades of academic and practical research. The innovation here is applying the same principle to the research process itself: identifying under-researched, quantitatively attractive companies and deploying constrained AI to perform the qualitative due diligence that the market has not yet priced in.
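To make the net-net benchmark concrete, the sketch below computes net current asset value and applies the classic Graham-style two-thirds discount test. This is an illustrative reconstruction of the well-known public formula, not the platform's proprietary valuation logic; the balance sheet fields and example figures are hypothetical.

```python
# Net current asset value (NCAV): the classic Graham net-net test.
# Illustrative only; the platform's proprietary models are not disclosed.

from dataclasses import dataclass

@dataclass
class BalanceSheet:
    current_assets: float      # cash, receivables, inventory
    total_liabilities: float   # all liabilities, senior to equity
    shares_outstanding: float

def ncav_per_share(bs: BalanceSheet) -> float:
    """Liquidation-style floor value: current assets minus ALL liabilities."""
    return (bs.current_assets - bs.total_liabilities) / bs.shares_outstanding

def is_net_net(price: float, bs: BalanceSheet, discount: float = 2 / 3) -> bool:
    """Graham's rule of thumb: buy below roughly two-thirds of NCAV per share."""
    return price < discount * ncav_per_share(bs)

# is_net_net(4.50, BalanceSheet(300e6, 150e6, 20e6))  # -> True (NCAV/sh = 7.50)
```
——————————————————————————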
2.2 · Constrained AI in Financial Analysis
Standard large language models suffer from two critical failures in financial contexts: hallucination (generating plausible but false claims) and staleness (training data does not include current filings). The architecture described in this paper addresses both by grounding AI responses strictly in a retrieved document corpus. Rather than generating from broad parametric memory, the system is constrained to synthesize only from the specific company reports provided — producing traceable, evidence-grounded outputs that can be reviewed and audited. Crucially, this paper advocates for local deployment of the AI system. Running the model on-device eliminates data exfiltration risk, removes external API latency, and ensures that proprietary research insights remain fully private — a non-negotiable requirement for any serious investment operation.
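The grounding pattern is simple to sketch. In the toy example below, retrieval is naive keyword overlap so the snippet stays self-contained (a production system would use embeddings), and `local_llm` is a hypothetical stand-in for whatever locally deployed model runtime is used; the paper does not specify one.

```python
# Minimal sketch of retrieval-grounded querying: the model may only answer
# from retrieved excerpts, and every claim must cite its source chunk.

from dataclasses import dataclass

@dataclass
class Chunk:
    source: str   # e.g., "FY2024 annual report, p. 12"
    text: str

def retrieve(query: str, corpus: list[Chunk], k: int = 4) -> list[Chunk]:
    """Naive keyword-overlap retrieval, for self-containment only."""
    terms = set(query.lower().split())
    ranked = sorted(corpus, key=lambda c: -len(terms & set(c.text.lower().split())))
    return ranked[:k]

def grounded_prompt(query: str, chunks: list[Chunk]) -> str:
    evidence = "\n\n".join(f"[{c.source}]\n{c.text}" for c in chunks)
    return ("Answer ONLY from the excerpts below, citing the bracketed source "
            "for every claim. If the excerpts do not contain the answer, "
            "reply 'NOT IN CORPUS'.\n\n"
            f"EXCERPTS:\n{evidence}\n\nQUESTION: {query}")

# prompt = grounded_prompt("What does management say about demand?",
#                          retrieve("management demand outlook", corpus))
# answer = local_llm(prompt)   # hypothetical local inference call
```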
SECTION 3 — THE APRIL 2025 EXPERIMENT
3. GENESIS: THE APRIL 2025 EXPERIMENT
The system's empirical validation began in April 2025 — not in a laboratory, but in a personal broker account. The catalyst was a direct question: could a single operator, using locally deployed constrained AI in conjunction with automated financial screening, compress weeks of fundamental research into hours — and then act on that compression with enough conviction to generate meaningful returns?
The first deployment was deliberately primitive. An AI-assisted script was built to ingest financial statement data across the entire tradeable U.S. equity universe. For each company, the pipeline computed a baseline valuation and benchmarked the result against the live market capitalization to produce a margin of safety score. The initial output was thousands of tickers. A second filtering pass applied ratio-based screens, eliminating distressed companies, micro-caps with illiquid float, and momentum-driven overvaluations. The shortlist that emerged numbered in the dozens.
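A stylized reconstruction of that two-pass funnel is sketched below. The specific ratios and thresholds are illustrative assumptions; the actual screens and valuation models are proprietary.

```python
# Illustrative two-pass screen: margin of safety first, ratio filters second.
# All field names and thresholds are hypothetical stand-ins.

from dataclasses import dataclass

@dataclass
class Company:
    ticker: str
    market_cap: float               # USD
    intrinsic_value: float          # output of the baseline valuation pass
    current_ratio: float
    debt_to_equity: float
    avg_daily_dollar_volume: float

def margin_of_safety(c: Company) -> float:
    if c.intrinsic_value <= 0:
        return float("-inf")
    return (c.intrinsic_value - c.market_cap) / c.intrinsic_value

def passes_ratio_screen(c: Company) -> bool:
    return (c.current_ratio >= 1.5                   # exclude distressed names
            and c.debt_to_equity <= 1.0
            and c.market_cap >= 50e6                 # exclude micro-caps
            and c.avg_daily_dollar_volume >= 100e3)  # exclude illiquid float

def shortlist(universe: list[Company], min_mos: float = 0.30) -> list[Company]:
    """Thousands in, dozens out: rank survivors by margin of safety."""
    survivors = [c for c in universe
                 if margin_of_safety(c) >= min_mos and passes_ratio_screen(c)]
    return sorted(survivors, key=margin_of_safety, reverse=True)
```
——————————————————————————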
THE TWIN DISC TRADE · APRIL 2025
One of the companies surfacing from this screen was Twin Disc Incorporated (NASDAQ: TWIN) — a manufacturer of power transmission equipment trading at a discount to its conservative asset value. Entry was established at $7.91 per share. Rather than reading the full annual report manually, the operator queried the constrained AI system against it. The system extracted competitive positioning language, flagged no material risks, and surfaced management commentary indicating stable industrial demand. The human operator reviewed the synthesis, confirmed the thesis, and held the position. Exit was executed at $13.00 per share — a return of approximately 64% on the individual position.
Following the Twin Disc exit, a position was established in Citigroup, where the same qualitative signal extraction framework confirmed a similarly asymmetric risk/reward profile. By December 2025, the full-year portfolio had generated an approximate 70% return — the mandate for everything that followed.
SECTION 4 — SYSTEM ARCHITECTURE
4. SYSTEM ARCHITECTURE: THE THREE-STAGE FUNNEL
The operational architecture is a three-stage funnel that progressively concentrates an investment universe of thousands of securities down to a highly vetted, conviction-weighted portfolio. Each stage applies a distinct mode of intelligence — deterministic quantitative computation, AI-constrained qualitative extraction, and structured human judgment — in a strict sequential dependency. ——————————————————————————
STAGE 1 · QUANTITATIVE VALUATION ENGINE
Proprietary models · Margin of safety determination · Universe compression
The first stage functions as a high-throughput, rule-based filter. It ingests structured financial statement data across the full U.S. equity market and executes several proprietary models to determine a margin of safety score for each company in the universe. The output is a ranked shortlist sorted by the mathematical delta between estimated intrinsic value and current market capitalization. Critically, no AI inference touches this layer. The valuation logic is entirely deterministic, reproducible, and auditable.
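Determinism makes auditability cheap: if every score records the model version and a hash of its exact inputs, any number on the shortlist can be re-derived later. A minimal sketch of that discipline follows, with an intentionally toy valuation rule standing in for the proprietary models.

```python
# Deterministic, auditable scoring: every output records the model version
# and a hash of its exact inputs, so any score can be re-derived later.
# Field names and the toy valuation rule are illustrative assumptions.

import hashlib
import json

MODEL_VERSION = "stage1-v1.0"  # hypothetical version tag

def score_company(fundamentals: dict, market_cap: float) -> dict:
    # Toy stand-in for the proprietary models: blend of book value and earnings.
    intrinsic = 0.8 * fundamentals["book_value"] + 5.0 * fundamentals["avg_earnings"]
    mos = (intrinsic - market_cap) / intrinsic if intrinsic > 0 else float("-inf")
    payload = json.dumps(
        {"inputs": fundamentals, "market_cap": market_cap, "model": MODEL_VERSION},
        sort_keys=True,
    )
    return {
        "margin_of_safety": mos,
        "model_version": MODEL_VERSION,
        "input_hash": hashlib.sha256(payload.encode()).hexdigest(),  # audit key
    }
```
——————————————————————————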
STAGE 2 · AI-ASSISTED QUALITATIVE LAYER
Constrained AI · Document signal extraction · Economic adjustment engine
Stage 2 is where the system's core innovation lives. The quantitative shortlist contains companies that look attractive on the numbers — but numbers, by definition, only describe the past. Stage 2 asks: does the economic reality behind these numbers support the thesis? The constrained AI system operates against company reports — annual filings, quarterly updates, earnings transcripts, and investor presentations. The AI is prompted to extract structured qualitative signals across several proprietary evaluation frameworks. Each signal domain produces a normalized score with an attached confidence level, an evidence count, and a recency flag.
The AI does not assign a final valuation number. It populates a structured adjustment packet that maps to specific valuation inputs. The speculative generative capacity of the language model is bounded by explicit rules, ensuring that qualitative outputs never override the underlying deterministic valuation math.
Anti-Overfitting Guardrail: Adjustments flagged as Low Confidence are mathematically shrunk to prevent over-indexing on weak signals. Every factor change is version-controlled with a timestamp, enabling full attribution analysis over time.
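The sketch below shows what such an adjustment packet might look like, including the low-confidence shrinkage guardrail and the version-control timestamp. The schema, shrink factors, and clamp are assumptions for illustration; the production schema is proprietary.

```python
# Hypothetical adjustment packet emitted by the Stage 2 qualitative layer.
# The AI never sets a valuation; it proposes bounded adjustments to named
# valuation inputs, which the Stage 1 math then consumes.

from dataclasses import dataclass, field
from datetime import datetime, timezone

SHRINK = {"high": 1.0, "medium": 0.6, "low": 0.25}  # assumed shrink factors

@dataclass
class SignalAdjustment:
    valuation_input: str    # e.g., "growth_rate", a named Stage 1 input
    raw_delta: float        # model-proposed change, in the input's units
    confidence: str         # "high" | "medium" | "low"
    evidence_count: int
    recent: bool            # recency flag from the source documents
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def effective_delta(self, cap: float = 0.02) -> float:
        """Shrink low-confidence signals, then clamp so qualitative output
        can never override the deterministic valuation math."""
        shrunk = self.raw_delta * SHRINK[self.confidence]
        return max(-cap, min(cap, shrunk))

# adj = SignalAdjustment("growth_rate", raw_delta=0.03, confidence="low",
#                        evidence_count=2, recent=True)
# adj.effective_delta()  # -> 0.0075, shrunk and bounded
```
——————————————————————————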
STAGE 3 · HUMAN JUDGMENT NODE
Fundamental value investing frameworks · Narrative evaluation · Final conviction scoring
The final stage is where machine outputs meet trained human cognition. The operator reviews the AI-generated synthesis not as an oracle but as a research assistant's first draft — validating narrative coherence, identifying value traps, applying judgment about competitive dynamics and timing, and making the binary entry decision.
The Twin Disc example illustrates this stage. The quantitative screen flagged the company as attractively valued. The AI layer confirmed no material red flags and extracted positive positioning signals. The human operator reviewed the synthesis, applied independent judgment, and committed capital. The workflow compressed a research process that would have consumed three to four days into approximately two hours — without sacrificing analytical depth.
SECTION 5 — PERFORMANCE RESULTS
5. PERFORMANCE RESULTS AND DISCUSSION
The April-to-December 2025 experimental period produced a portfolio return of approximately 70% from a personal broker account. Two positions anchored the year's performance:
Twin Disc Incorporated (TWIN) · Identified via the automated screening pipeline and validated via constrained AI analysis of the annual report. Entry at $7.91, exit at approximately $13.00. Approximate return: 64%. Research-to-conviction time: under two hours.
Citigroup (C) · Identified as a large-cap value opportunity trading below intrinsic value. AI analysis of filings confirmed the balance sheet normalization thesis. Held through year-end with strong contribution to portfolio return.
The primary driver of these returns was not a novel predictive algorithm. It was disciplined filtering — the systematic elimination of quantitatively unattractive companies in Stage 1, followed by the rapid invalidation of value traps in Stage 2, leaving only a small number of high-conviction ideas for human judgment to evaluate. The system does not predict the future. It processes the present more thoroughly and efficiently than any single analyst can do manually, then constrains the AI's outputs within an evaluation framework that has been used to generate consistent long-term returns in fundamental value investing.
Past performance is not indicative of future results. Generated in a personal account over a limited period; not independently audited.
SECTION 6 — RISK ANALYSIS
6. RISK ANALYSIS AND FAILURE MODES
6.1 · Regime Dependence
Deep value strategies are historically cyclical. In high-liquidity, growth-driven market regimes, quantitative screens can systematically surface companies that continue to underperform in the near term. However, the system utilizes several proprietary models to adapt to different market regimes and dynamic valuation environments — reducing the single-regime dependence that has historically constrained pure deep-value strategies.
6.2 · Data Quality and Historical Bias
The screening parameters of Stage 1 were initially calibrated against historical financial data. Structural shifts in reporting standards, accounting rule changes, or changes in how companies classify assets could cause the screening logic to misclassify companies. Regular recalibration against out-of-sample data is a standing operational requirement.
6.3 · Hardware and Latency Constraints
Running a multi-billion parameter model locally places real constraints on throughput. Dense financial vector operations are memory-bandwidth intensive, and processing a full shortlist of companies through the qualitative layer sequentially can take several hours. This is acceptable for a single-operator system but becomes a bottleneck at institutional scale. More powerful AI hardware is a known infrastructure investment requirement on the development roadmap.
SECTION 7 — SOFT JUDGMENT CODIFICATION
7. CODIFYING SOFT INVESTMENT JUDGMENT: THE ECONOMIC ADJUSTMENT ENGINE
The next architectural challenge is ambitious: how do you systematically encode the qualitative reasoning that separates a competent analyst from a great one? This is the domain of Soft Judgment Codification — the structured extraction of tacit investment knowledge into reusable, auditable reasoning components. The insight is that the qualitative reasoning of skilled investors, while apparently intuitive, is in fact patterned. Management quality inference, narrative credibility evaluation, competitive moat assessment — these are not random acts of intuition. They are learned heuristics that can be partially formalized, encoded, and applied at scale. ——————————————————————————
7.1 · Why This Works at Scale: The Elimination of Human Bias
The most powerful property of this codification framework is not efficiency — it is consistency. Human analysts, no matter how skilled, bring unavoidable cognitive biases to qualitative evaluation. Recency bias causes over-weighting of recent management commentary. Confirmation bias shapes how analysts interpret ambiguous disclosures. Anchoring effects distort how new information updates a prior thesis. These are not character flaws; they are features of human cognition under uncertainty. The constrained AI system, operating within a structured, rule-based evaluation framework, eliminates these biases from the qualitative translation chain. When the system evaluates management quality, it applies the same criteria, with the same weighting, to the twenty-fifth company it reviews as it did to the first. This is a fundamental truth about AI-based qualitative analysis: it does not just accelerate the work — it removes the systematic human error that contaminates it. At scale, this elimination of bias compounds into a durable, structural advantage over any purely human research operation. ——————————————————————————
7.2 · Evolution of the Factor Framework
The system's qualitative layer has evolved significantly from its April 2025 origins. The platform now operates a proprietary factor framework spanning multiple analytical domains — constructed by systematically translating the tacit heuristics of experienced fundamental investors into measurable signals, normalizing those signals into computable features, and mapping those features to specific valuation inputs with explicit confidence weighting and uncertainty ranges. The factor library is a living registry — continuously tested out-of-sample, validated across market regimes, and expanded only when a new factor passes both statistical screening and a clear economic rationale that would survive even if the statistical edge disappeared.
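As an illustration of what one registry entry might look like, the sketch below encodes a single tacit heuristic (managers who consistently hit guidance tend to be higher quality) as a normalized feature mapped to a valuation input with a confidence weight and uncertainty range. The schema and numbers are hypothetical.

```python
# Hypothetical factor-registry entry: a tacit heuristic made measurable.

from dataclasses import dataclass

@dataclass
class Factor:
    name: str
    valuation_input: str               # which Stage 1 input it adjusts
    weight: float                      # confidence weighting, 0..1
    uncertainty: tuple[float, float]   # plausible range of the adjustment

    def normalize(self, raw: float, lo: float, hi: float) -> float:
        """Min-max normalize a raw signal into [0, 1]."""
        return max(0.0, min(1.0, (raw - lo) / (hi - lo)))

guidance_hit_rate = Factor(
    name="mgmt_guidance_hit_rate",
    valuation_input="discount_rate",
    weight=0.4,
    uncertainty=(-0.005, 0.0),   # at best, shaves 50 bps off the discount rate
)
# signal = guidance_hit_rate.normalize(raw=0.85, lo=0.5, hi=1.0)  # -> 0.7
```
——————————————————————————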
7.3 · Multi-Agent Orchestration Architecture The system has evolved from a single-model retrieval interface into a multi-agent orchestration architecture. Rather than relying on one AI system to perform all qualitative functions, the platform routes analytical tasks to specialized agents, each operating against targeted data sources with domain-specific prompting frameworks. The data sources ingested by the platform have expanded far beyond annual reports. The architecture now processes a broad range of corporate disclosures, regulatory filings, insider transaction data, earnings transcripts, and other structured and unstructured sources — each treated as a distinct signal channel with its own confidence weighting and recency decay function. A second-pass critique layer challenges weak or contradictory evidence before adjustment packets are presented to the human operator, further hardening the outputs against misinterpretation.
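The per-channel recency decay mentioned above can be as simple as exponential down-weighting by document age; the half-lives below are illustrative assumptions, not the platform's actual parameters.

```python
# Exponential recency decay for a signal channel: older evidence counts
# less. Per-channel half-lives are illustrative assumptions.

HALF_LIFE_DAYS = {
    "earnings_transcript": 90,
    "annual_filing": 365,
    "insider_transactions": 30,
}

def recency_weight(channel: str, age_days: float) -> float:
    """Weight in (0, 1]: halves every half-life's worth of days."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS[channel])

# recency_weight("insider_transactions", 60)  # -> 0.25
```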
SECTION 8 — EXPANSION PROTOCOLS
8. EXPANSION PROTOCOLS
8.1 · Extended Valuation Model Suite
The expansion roadmap extends the current valuation coverage to a broader suite of proprietary models, each parameterized and calibrated to accurately evaluate companies across different business types, asset intensities, and economic structures — ensuring the system is a multi-perspective analytical framework, not a single-lens instrument.
8.2 · Expanded Data Ingestion Protocols
The data ingestion layer is being systematically expanded to incorporate regulatory disclosures, proxy and governance filings, insider activity records, earnings communication transcripts, and a range of alternative and structured data sources. Each new source category is admitted to the platform only after validation of its marginal signal contribution relative to the existing data corpus.
8.3 · Automated Macro Regime Filtering
A macro overlay module will dynamically adjust Stage 1 valuation thresholds based on the current interest rate environment, credit spread regime, and equity risk premium estimate — ensuring that the quantitative screen does not mechanically surface deep value candidates in environments where the structural conditions for value realization are impaired.
8.4 · Anomaly Detection Layer
A statistical anomaly detection module will operate as a pre-filter on Stage 1 inputs, flagging potential accounting irregularities — aggressive revenue recognition, unusual accrual patterns, related-party transaction volumes — before they enter the valuation engine.
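One widely used accrual check flags companies whose reported earnings run far ahead of operating cash flow. The sketch below uses the standard net-income-minus-cash-flow proxy from the accrual-anomaly literature; the threshold is an assumption, not the planned module's actual logic.

```python
# Simple accrual red flag: total accruals proxied as net income minus
# operating cash flow, scaled by total assets (a standard construction in
# the accrual-anomaly literature). Threshold is an illustrative assumption.

def accrual_ratio(net_income: float, operating_cash_flow: float,
                  total_assets: float) -> float:
    return (net_income - operating_cash_flow) / total_assets

def accrual_red_flag(net_income: float, operating_cash_flow: float,
                     total_assets: float, threshold: float = 0.10) -> bool:
    """Flag when paper earnings outrun cash by more than 10% of assets."""
    return accrual_ratio(net_income, operating_cash_flow, total_assets) > threshold

# accrual_red_flag(50e6, 5e6, 300e6)  # -> True: earnings far ahead of cash
```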
8.5 · Adversarial Agent Architecture
For each investment thesis, a structured adversarial debate layer will deploy competing analytical perspectives — constructing the strongest independent cases for and against a position. Their outputs are reconciled by a synthesis layer presenting the human operator with an explicit contradiction map: points of agreement, points of disagreement, and the evidence underlying each. This mirrors the best practices of institutional investment committee review and systematically prevents the confirmation bias that plagues single-model systems.
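At its core, the contradiction map is a set reconciliation over the claims each agent grounds in evidence. A minimal sketch, assuming claims have already been normalized to comparable topic/conclusion pairs:

```python
# Hypothetical contradiction map: reconcile bull- and bear-agent findings
# into agreement / disagreement sets for the human operator.

def contradiction_map(bull: dict[str, str], bear: dict[str, str]) -> dict:
    """Each input maps a claim topic to that agent's conclusion."""
    topics = bull.keys() & bear.keys()
    return {
        "agree":     {t: bull[t] for t in topics if bull[t] == bear[t]},
        "disagree":  {t: (bull[t], bear[t]) for t in topics if bull[t] != bear[t]},
        "bull_only": {t: bull[t] for t in bull.keys() - bear.keys()},
        "bear_only": {t: bear[t] for t in bear.keys() - bull.keys()},
    }

# contradiction_map(
#     {"demand": "stable", "leverage": "manageable"},
#     {"demand": "stable", "leverage": "covenant risk"},
# )
# -> agree on demand; disagree on leverage, surfaced explicitly for review
```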
SECTION 9 — SCALABILITY & FUTURE ML
9. SCALABILITY AND FUTURE MACHINE LEARNING OPERATIONS
A skilled fundamental analyst, operating at full capacity with genuine depth and rigor, can realistically maintain active coverage of perhaps a dozen companies simultaneously. Adding more companies does not add proportional insight — it adds dilution. This is not a resolvable problem within the human-only research paradigm. It is a fixed biological constraint.
The architecture described in this paper operates under an entirely different constraint set. The quantitative valuation engine screens the entire investable universe — thousands of companies — with perfect consistency on every pass. The qualitative AI layer does not experience fatigue on its thousandth company evaluation. The human operator is engaged only at the final judgment node, reviewing a pre-synthesized, evidence-grounded research packet rather than raw documents.
The marginal cost of covering an additional company within this architecture approaches zero at the quantitative screening layer, and is reduced by an order of magnitude at the qualitative layer. Deep fundamental due diligence — which might require three to five days of senior analyst time in a traditional operation — is compressed to hours. This changes the economics of fundamental investing entirely, making institutional-grade research accessible at a scale and cost structure that was previously impossible. ——————————————————————————
9.1 · Future Machine Learning Operations
The current architecture represents the first generation of a research platform designed to compound its own capabilities over time. As the system accumulates a structured history of investment decisions, signal extractions, valuation adjustments, and outcome data, it becomes a training substrate for the next generation of machine learning operations.
The roadmap includes deployment of meta-cognitive capabilities — systems that reason about the system's own analytical patterns, monitor the performance of individual factor signals across market regimes, identify where the system's own confidence calibration has been systematically mis-specified, and propose adjustments to the factor library based on observed performance attribution.
Beyond self-improvement, the platform will develop deeper signal research capabilities — detecting statistical relationships across large populations of company data that human pattern recognition cannot perceive. Which combinations of qualitative signals have historically preceded multiple expansion? Which management compensation structures have reliably preceded value-destructive capital allocation? Which narrative patterns in earnings transcripts have preceded earnings disappointments?
At the frontier of this roadmap are experimental regime detection functions — the identification of structural transitions in market behavior, competitive dynamics, or macroeconomic conditions before they appear in consensus data. The long-term vision is a research platform that compounds structural knowledge, eliminates systematic human bias, operates without cognitive fatigue across any scale of coverage, and continuously improves through the structured feedback of its own investment record.
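Of these capabilities, confidence calibration auditing is the most mechanical: bucket past signals by stated confidence and compare realized hit rates. A minimal sketch under an assumed decision-history format:

```python
# Hypothetical calibration audit: compare stated confidence buckets against
# realized hit rates from the system's own decision history.

from collections import defaultdict

def calibration_report(history: list[tuple[str, bool]]) -> dict[str, float]:
    """history: (stated confidence bucket, whether the signal played out).
    Returns the realized hit rate per bucket; a well-calibrated 'high'
    bucket should realize a higher hit rate than 'medium', and so on."""
    outcomes: dict[str, list[bool]] = defaultdict(list)
    for bucket, hit in history:
        outcomes[bucket].append(hit)
    return {b: sum(o) / len(o) for b, o in outcomes.items()}

# calibration_report([("high", True), ("high", True), ("high", False),
#                     ("low", True), ("low", False)])
# -> {'high': ~0.67, 'low': 0.5}; if 'low' ever outperforms 'high',
#    the confidence assignments are mis-specified.
```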
SECTION 10 — CONCLUSION
10. CONCLUSION
The central claim of this paper is empirically grounded: AI, when properly constrained within a deterministic valuation framework and supervised by trained human judgment, can solve the analyst bottleneck without compromising analytical rigor. The April 2025 experiment with Twin Disc Incorporated was not a proof that AI can pick stocks. It was a proof that constrained AI can compress the research cycle to a speed at which a disciplined investor can cover more ground, maintain higher analytical standards, and act with more conviction than the market has priced in. The 70% full-year return that followed was a consequence of that compression applied consistently.
The future of investment research is not algorithmic replacement of the analyst. It is infrastructural elevation. By automating the repetitive, cognitively expensive work of document ingestion and signal extraction, this platform frees the human mind for the work that has always generated durable alpha: asking better questions about the nature of competitive advantage, the quality of capital allocation, and the durability of economic moats. ——————————————————————————
THESIS IN ONE SENTENCE
Constrained AI — bounded by deterministic valuation guardrails, grounded in retrieved documents rather than generative inference, and supervised by human judgment trained in fundamental value investing frameworks — permanently resolves the analyst bottleneck, massively lowers the cost of institutional-grade equity research, and creates a scalable, AI-augmented research infrastructure that no human-only operation can match.
FOOTER / LEGAL DISCLAIMER
LEGAL DISCLAIMER & IMPORTANT DISCLOSURES
Please Read Carefully Before Relying on Any Information in This Document
FOR EDUCATIONAL AND INFORMATIONAL PURPOSES ONLY
This document is produced solely for educational and informational purposes. Nothing contained herein constitutes investment advice, financial advice, legal advice, tax advice, or any other form of professional advice. The information presented is intended to describe an experimental research architecture and workflow concept, and is not intended to serve as the basis for any investment decision.
NOT AN OFFER OR SOLICITATION OF SECURITIES
This document does not constitute an offer to sell or a solicitation of an offer to buy any security, investment product, or financial instrument of any kind in any jurisdiction. Any such offer or solicitation, if made, would only be made pursuant to formal offering documents that comply with applicable federal and state securities laws.
PAST PERFORMANCE IS NOT INDICATIVE OF FUTURE RESULTS
Any references to past performance — including the approximate 70% experimental portfolio return and the individual position return on Twin Disc Incorporated (NASDAQ: TWIN) — are provided solely for illustrative purposes. These results were generated in a personal broker account during a specific and limited time period, reflect a concentrated experimental portfolio, and have not been audited, verified, or reviewed by any independent third party. There is no representation, warranty, or guarantee that any investment strategy described herein will achieve similar results, achieve profitability, or avoid losses. All investments involve risk, including the possible loss of principal.
NO INVESTMENT ADVICE — CONSULT A QUALIFIED PROFESSIONAL
Nothing in this document should be construed as a recommendation to buy, hold, or sell any specific security or to adopt any particular investment strategy. Readers are strongly advised to consult with a qualified financial advisor, broker-dealer, or registered investment adviser before making any investment decisions.
PROPRIETARY INFORMATION & INTELLECTUAL PROPERTY
The methodologies, frameworks, system architectures, factor libraries, and analytical processes described herein are the intellectual property of the author and are protected under applicable intellectual property laws. No portion of this document may be reproduced, distributed, transmitted, or published without the express prior written consent of the author.
FORWARD-LOOKING STATEMENTS
Certain statements in this document may constitute forward-looking statements. These statements involve known and unknown risks and uncertainties that may cause actual results to differ materially from those expressed or implied. The author undertakes no obligation to update or revise any forward-looking statements.
——————————————————————————
© 2026 Adam Thomas Contreras. All Rights Reserved. Blacksmith AI · AI-Native Equity Research