    Researchers automated LLM reasoning strategy design and cut token usage by 69.5%

    By Savannah Herald | May 28, 2026 | 7 Mins Read


    Key takeaways
    • AutoTTS automates test-time scaling strategy discovery, shifting engineers to designing the discovery environment rather than hand-crafting rules.
    • An explorer LLM proposes code-defined controllers and evaluates them cheaply using an offline replay environment of pre-collected trajectories.
    • The AI-designed Confidence Momentum Controller applies trend-based stopping, coupled width-depth control, and alignment-aware depth allocation to improve accuracy-cost tradeoffs.

    Test-time scaling (TTS) has emerged as a proven method to improve the performance of large language models in real-world applications by giving them extra compute cycles at inference time. However, TTS strategies have historically been handcrafted, relying heavily on human intuition to dictate the rules of the model’s reasoning. 

    To address this bottleneck, researchers from Meta, Google, and several universities have introduced AutoTTS, a framework that automatically discovers optimal TTS strategies. This automated approach allows enterprise organizations to dynamically optimize compute allocation without manually tuning heuristics. 

    By implementing the optimal strategies discovered by AutoTTS, organizations can directly reduce the token usage and operational costs of deploying advanced reasoning models in production environments. In experimental trials, AutoTTS managed inference budgets efficiently, successfully reducing token consumption by up to 69.5% without sacrificing accuracy.

    The manual bottleneck in test-time scaling

    Test-time scaling enhances LLMs by granting them extra compute when generating answers. This extra compute allows the model to generate multiple reasoning paths or evaluate its intermediate steps before arriving at a final response. 

    The primary challenge in designing TTS strategies is determining how to allocate this extra computation optimally. Historically, researchers have designed these strategies manually, relying on guesswork to build rigid heuristics. Engineers must hypothesize the rules and thresholds for when a model should branch out into new reasoning paths, probe deeper into an existing path, prune an unpromising branch, or stop reasoning altogether.

    Because this manual tuning process is constrained by human intuition, a vast number of possible approaches remain unexplored. This often results in suboptimal trade-offs between model accuracy and computing costs.

    Current TTS algorithms can be mapped to a width-depth control space — “width” being the number of reasoning branches explored, “depth” being how far each develops. Self-consistency (SC) samples a fixed number of trajectories and majority-votes the answer. Adaptive-consistency (ASC) saves compute by stopping early once a confidence threshold is hit. Parallel-probe takes a more granular approach, pruning unpromising branches while deepening the rest. All three are hand-crafted, and that’s the constraint AutoTTS is designed to break.
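
To make the difference between the first two baselines concrete, here is a minimal Python sketch. It is an illustration only: the function names, the `threshold` and `min_samples` parameters, and the use of the leading answer's vote share as the "confidence" signal are assumptions made for this example, not the published SC or ASC implementations.

```python
from collections import Counter
from typing import Iterable, Tuple


def self_consistency(answers: Iterable[str]) -> str:
    """Majority vote over the final answers of a fixed set of sampled trajectories.

    Assumes at least one answer is supplied.
    """
    return Counter(answers).most_common(1)[0][0]


def adaptive_consistency(
    answers: Iterable[str],
    threshold: float = 0.75,   # hypothetical stopping level
    min_samples: int = 4,      # hypothetical warm-up before stopping is allowed
) -> Tuple[str, int]:
    """Consume answers one at a time and stop early once the leader is dominant.

    Vote share is a simplified stand-in for a confidence estimate.
    Returns (answer, number_of_samples_used). Assumes at least one answer.
    """
    counts: Counter = Counter()
    used = 0
    for ans in answers:
        counts[ans] += 1
        used += 1
        leader, votes = counts.most_common(1)[0]
        if used >= min_samples and votes / used >= threshold:
            return leader, used
    return counts.most_common(1)[0][0], used


samples = ["42", "42", "17", "42", "42", "42", "9", "42"]
print(self_consistency(samples))      # uses all 8 samples
print(adaptive_consistency(samples))  # stops after 4 samples
```

Both calls agree on the answer, but the early-stopping variant reads only half of the samples. That gap is the kind of accuracy-cost trade-off the width-depth framing is meant to expose.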

    While some more advanced methods employ richer structures like tree search or external verifiers, they all share one key characteristic: they are meticulously hand-crafted. This manual approach restricts the scope of strategy discovery, leaving a massive portion of the potential resource-allocation space untouched.

    Automating strategy discovery with AutoTTS

    AutoTTS reframes the way test-time scaling is optimized. Instead of treating strategy design as a human task, AutoTTS approaches it as an algorithmic search problem within a controlled environment. 

    This framework redefines the roles of both the human engineer and the AI model. Rather than hand-crafting specific rules for when an LLM should branch, prune, or stop reasoning, the engineer’s role shifts to constructing the discovery environment. The human defines the boundaries, including the control space of states and actions, optimization objectives balancing accuracy versus cost, and the specific feedback mechanisms. 

    AutoTTS framework (source: arXiv)

    An explorer LLM, such as Claude Code, designs the strategy. This explorer acts as an autonomous agent that iteratively proposes TTS “controllers.” These controllers are code-defined policies or algorithms that dictate how an AI model allocates its computational budget during inference. The explorer tests and refines these controllers based on feedback until it discovers an optimal resource-allocation policy. 

    To make this automated search computationally affordable, AutoTTS relies on an “offline replay environment.” If the explorer LLM had to invoke a base reasoning model to generate new tokens every time it tested a new strategy, the compute costs would be astronomical. Instead, it relies on thousands of reasoning trajectories pre-collected from the base LLM. These trajectories include “probe signals,” which are intermediate answers that help the controller evaluate progress across different reasoning branches. 

    During the discovery loop, the explorer agent proposes a controller and evaluates it against this offline data. The agent observes the execution traces of the proposed controller, which show how it allocated compute over time. By analyzing these traces, the agent can diagnose specific failure modes, such as a controller pruning branches too aggressively in a specific scenario. This gives the agent more to work with than a final result alone. The agent then iteratively rewrites its code to improve the accuracy-cost tradeoff.
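
The offline replay idea can be sketched in a few lines of Python. Everything below is hypothetical scaffolding written for this article (the `Trajectory` and `Problem` classes, the `replay` and `evaluate` functions, and the controller signature); it is not the AutoTTS codebase. The point it illustrates is that a candidate controller can be scored against stored probe answers without generating any new tokens.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Trajectory:
    probes: List[str]      # pre-collected intermediate answers, one per checkpoint
    tokens_per_step: int   # tokens the base model spent between checkpoints


@dataclass
class Problem:
    trajectories: List[Trajectory]
    gold: str              # reference answer used for scoring


# A controller sees the probe history of every active branch and returns True to stop.
Controller = Callable[[List[List[str]]], bool]


def replay(problem: Problem, controller: Controller, width: int):
    """Step through stored trajectories, charging tokens only for what is revealed."""
    branches = problem.trajectories[:width]
    seen: List[List[str]] = [[] for _ in branches]
    tokens = 0
    trace: List[Tuple[int, int]] = []  # (step, cumulative tokens) for later diagnosis
    for step in range(max(len(t.probes) for t in branches)):
        for i, t in enumerate(branches):
            if step < len(t.probes):
                seen[i].append(t.probes[step])
                tokens += t.tokens_per_step
        trace.append((step, tokens))
        if controller(seen):
            break
    answer = Counter(h[-1] for h in seen if h).most_common(1)[0][0]
    return answer, tokens, trace


def evaluate(problems: List[Problem], controller: Controller, width: int):
    """Return (accuracy, mean tokens per problem) for one controller."""
    correct, total = 0, 0
    for p in problems:
        answer, tokens, _ = replay(p, controller, width)
        correct += answer == p.gold
        total += tokens
    return correct / len(problems), total / len(problems)


def never_stop(seen: List[List[str]]) -> bool:
    return False


def stop_when_unanimous(seen: List[List[str]]) -> bool:
    return len({h[-1] for h in seen if h}) == 1


toy = Problem(
    trajectories=[
        Trajectory(["1", "7", "7", "7"], 100),
        Trajectory(["7", "7", "7", "7"], 100),
        Trajectory(["3", "7", "7", "7"], 100),
    ],
    gold="7",
)
print(evaluate([toy], never_stop, width=3))
print(evaluate([toy], stop_when_unanimous, width=3))
```

On this toy problem both controllers reach the correct answer, but the early-stopping one is charged for two checkpoints per branch instead of four. The returned `trace` stands in for the execution traces the explorer agent inspects between iterations.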

    Inside the AI-designed controller

    Because the explorer agent is not constrained by human intuition, it can discover highly coordinated, complex rules that a human engineer would likely never hand-code. One optimal controller discovered by AutoTTS, named the Confidence Momentum Controller, leverages several non-obvious mechanisms to manage compute:

    • Trend-based stopping: Hand-crafted strategies often instruct the model to stop reasoning once it hits a certain instantaneous confidence threshold. The AutoTTS agent discovered that instantaneous confidence can be misleading due to temporary spikes. Instead, the controller tracks an exponential moving average (EMA) of confidence and only stops if the overall confidence level is high and the trend is not actively declining.

    • Coupled width-depth control: Manually designed algorithms usually treat the “widening” of new reasoning paths and the “deepening” of current paths as separate decisions. AutoTTS discovered a closed feedback loop where the two actions are linked. If the confidence of the current branches stalls or regresses, the controller automatically triggers the spawning of new branches.

    • Alignment-aware depth allocation: Instead of giving all active reasoning branches an equal computation budget, the controller dynamically identifies which branches agree with the current leading answer. It then gives those branches priority “bursts” of extra computation. This concentrates the computational budget on the emerging consensus to quickly verify if it is correct.
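
The three mechanisms above can be sketched together in Python. This is a loose illustration and not the released Confidence Momentum Controller: the function names, the smoothing factor `alpha`, the thresholds `stop_level` and `stall_eps`, the `base` and `burst` budgets, and the use of vote share as the confidence signal are all assumptions chosen for readability.

```python
from collections import Counter
from typing import List, Tuple


def leader_and_confidence(latest_answers: List[str]) -> Tuple[str, float]:
    """Leading answer among active branches and its vote share (assumed confidence)."""
    leader, votes = Counter(latest_answers).most_common(1)[0]
    return leader, votes / len(latest_answers)


def ema_update(prev: float, value: float, alpha: float = 0.5) -> float:
    """Exponential moving average: blend the new value into the running estimate."""
    return alpha * value + (1 - alpha) * prev


def decide(ema_history: List[float], stop_level: float = 0.8,
           stall_eps: float = 0.01) -> str:
    """Return 'stop', 'widen', or 'deepen' from the smoothed confidence history."""
    current = ema_history[-1]
    trend = current - ema_history[-2] if len(ema_history) > 1 else 0.0
    # Trend-based stopping: high smoothed confidence that is not declining.
    if current >= stop_level and trend >= 0:
        return "stop"
    # Coupled width-depth control: stalled or regressing confidence spawns branches.
    if len(ema_history) > 1 and trend <= stall_eps:
        return "widen"
    return "deepen"


def depth_budgets(latest_answers: List[str], leader: str,
                  base: int = 1, burst: int = 2) -> List[int]:
    """Alignment-aware allocation: branches agreeing with the leader get a burst."""
    return [base + burst if a == leader else base for a in latest_answers]


history = [0.5]
for confidence in (0.75, 1.0):
    history.append(ema_update(history[-1], confidence))
print(history, decide(history))

leader, conf = leader_and_confidence(["7", "7", "3", "7"])
print(leader, conf, depth_budgets(["7", "7", "3", "7"], leader))
```

Note that `decide([0.9, 0.85])` returns `"widen"` rather than `"stop"`: the smoothed confidence is above the stopping level, but it is falling, which is the case an instantaneous threshold would miss.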

    Cost savings and accuracy gains in real-world benchmarks

    To test whether an AI could autonomously discover a better test-time scaling strategy, researchers set up a rigorous evaluation framework. The core experiments were conducted on Qwen3 models ranging from 0.6B to 8B parameters. The researchers also tested the system’s ability to generalize on a distilled 8B version of the DeepSeek-R1 model. 

    The explorer AI agent was initially tasked with discovering an optimal strategy using the AIME24 mathematical reasoning benchmark. This discovered strategy was then tested on two held-out math benchmarks, AIME25 and HMMT25, as well as the graduate-level general reasoning benchmark GPQA-Diamond. 

    The AutoTTS-discovered controller was pitted against four manually designed test-time scaling algorithms. These baselines included Self-Consistency with 64 parallel reasoning paths (SC@64), Adaptive-Consistency (ASC), Parallel-Probe, and Early-Stopping Self-Consistency (ESC). ESC is a hybrid approach that generates trajectories in parallel and stops early when an answer seems stable.

    Scaling curves: AutoTTS (red line) outperforms other baselines on industry benchmarks (source: arXiv)

    When set to a balanced, cost-conscious mode, the AutoTTS-discovered controller reduced total token consumption by approximately 69.5% compared to SC@64. At the same time, the controller maintained the same average accuracy across the four Qwen models. When the inference budget was turned up, AutoTTS pushed peak accuracy beyond all handcrafted baselines in five out of eight test cases.

    This efficiency translated to other tasks. On the GPQA-Diamond benchmark, the balanced AutoTTS variant slashed the inference token cost from 510K tokens down to just 151K tokens, while slightly improving overall accuracy. On the DeepSeek model, AutoTTS achieved the highest overall accuracy on the HMMT25 benchmark while cutting the token spend nearly in half.
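
As a quick check on the GPQA-Diamond figures, the relative saving can be computed directly. The helper name `token_reduction` is made up for this example, and the inputs are the rounded 510K and 151K values quoted above, so the result is approximate.

```python
def token_reduction(baseline_tokens: int, new_tokens: int) -> float:
    """Fraction of tokens saved relative to the baseline."""
    return (baseline_tokens - new_tokens) / baseline_tokens


saving = token_reduction(510_000, 151_000)
print(f"{saving:.1%}")  # about 70.4%
```

That is roughly a 70% cut, in line with the 69.5% average reported against SC@64 on the Qwen models.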

    For practitioners building enterprise AI applications, these experiments highlight two major operational benefits:

    • Raising peak performance: AutoTTS doesn’t just save money on token consumption. It actively raises the peak attainable performance of the base model. The AI-designed controller is remarkably good at detecting noisy or unproductive reasoning branches on the fly and continuously redirecting its compute budget toward the branches generating the most useful reasoning signals.

    • Cost-effective custom development: Because the framework relies on an offline replay environment, the entire discovery process cost only $39.90 and took 160 minutes. For enterprise teams, that means optimized reasoning strategies tailored to proprietary models and internal tasks are now within reach — without a dedicated research budget.

    Both the AutoTTS framework and the Confidence Momentum Controller are available on GitHub; the CMC can be used as a drop-in replacement for other TTS controllers.
