GPU Forecasters Using Language Models
The paper investigates language models acting as selective surrogates to forecast and optimize GPU kernel runtimes. The approach seeks to improve performance without exhaustive profiling.
Topic cluster
20 sources grouped by AFBytes in AI
AFBytes briefing
Optimized GPU kernels can accelerate AI workloads and lower energy consumption in data centers.
Key entities
The paper investigates language models acting as selective surrogates to forecast and optimize GPU kernel runtimes. The approach seeks to improve performance without exhaustive profiling.
The study examines how large language models perform as used-car sales agents when information is incomplete. It measures tendencies toward honesty or credulity under varying conditions.
Researchers characterize the strengths and weaknesses of linguistic inductive biases in large language models when applied to spatial reasoning for navigation. The work identifies specific failure mo…
PithTrain offers a compact, agent-native system for training mixture-of-experts models. The design targets improved efficiency and integration with agent workflows.
The paper proposes fine-grained verification through diagnostic reasoning supervision for aspect sentiment triplet extraction tasks. The method aims to increase reliability of sentiment models on nua…
The authors present DOA, a decoder-only attention policy that enables training-free simultaneous translation for long-form speech. The method targets latency reduction while maintaining translation q…
The paper proposes using large language models to generate target-side paraphrases that enhance sign language translation systems. This approach aims to address data scarcity and improve model robust…
The paper describes a method to shape neural network behavior using the classical CYK algorithm within a neuro-symbolic framework. The hybrid approach targets improved parsing accuracy on standard be…
The work presents the BEA-Dialogue+ corpus to enable larger-scale training of conversational Hungarian automatic speech recognition systems. The corpus targets gaps in existing Hungarian speech resources.
The study evaluates how the availability and granularity of skills presented to large language model agents influence task completion. Findings come from systematic experiments on the SkillsBench ben…
The paper explores graph-based modeling to assign credit at the step level during agentic search, moving beyond simple trajectory rewards. The method aims to provide finer-grained learning signals fo…
The paper investigates preference-based maximum satisfiability techniques to enhance the reliability of reasoning outputs from large language models. The approach integrates logical constraints with …
The paper presents PTCG-Bench as a new evaluation framework for LLM agents playing the Pokémon Trading Card Game. It tests strategic decision-making in a complex, partially observable environment.
The paper conducts a benchmark study showing how notation choices affect token usage and performance in agentic AI systems. Optimized formats aim to improve efficiency without sacrificing capability.
The paper introduces a method that uses large language models to generate heuristics for symbolic AI planning problems. The approach aims to improve performance across different planning domains with…
The paper proposes partitioning deterministic rules and neural models to generate structured health text. The method seeks faster and more reliable output for clinical documentation tasks.
The paper describes GRASP, which uses gated regression to propose skills that enable LLM agents to improve autonomously. The framework targets more effective iterative learning.
The paper introduces TRACE, which applies Toulmin argumentation structure to evaluate constructive elements in LLM chain-of-thought reasoning. The goal is more reliable assessment of logical quality.
The paper presents NICE as a diagnostic benchmark grounded in social science theory for measuring LLM social intelligence. It targets gaps in current evaluation methods for interpersonal reasoning.
The paper introduces FHRFormer, a self-supervised masked transformer designed for inpainting missing data and forecasting fetal heart rate signals. The framework targets improved monitoring in obstet…