Topic cluster

research

20 sources grouped by AFBytes in AI

AFBytes briefing

Optimized GPU kernels can accelerate AI workloads and lower energy consumption in data centers.

Key entities

  • Abstract
  • Agents
  • Language
  • Models
  • Reasoning
  • Large
  • Large Language Models
  • Agentic
  • Benchmark
  • Diagnostic
  • LLMs
  • Neural

AI arxiv.org · Jun 1, 2026 04:00 UTC

GPU Forecasters Using Language Models

The paper investigates language models acting as selective surrogates to forecast and optimize GPU kernel runtimes. The approach seeks to improve performance without exhaustive profiling.

AI arxiv.org · Jun 1, 2026 04:00 UTC

Honesty of LLMs as Bargaining Agents

The study examines how large language models perform as used-car sales agents when information is incomplete. It measures tendencies toward honesty or credulity under varying conditions.

AI arxiv.org · Jun 1, 2026 04:00 UTC

Linguistic Inductive Bias of LLMs for Spatial Reasoning

Researchers characterize the strengths and weaknesses of linguistic inductive biases in large language models when applied to spatial reasoning for navigation. The work identifies specific failure mo…

AI arxiv.org · Jun 1, 2026 04:00 UTC

PithTrain Compact MoE Training System

PithTrain offers a compact, agent-native system for training mixture-of-experts models. The design targets improved efficiency and integration with agent workflows.

AI arxiv.org · Jun 1, 2026 04:00 UTC

Neuro-symbolic Syntactic Parsing with CYK Algorithm

The paper describes a method to shape neural network behavior using the classical CYK algorithm within a neuro-symbolic framework. The hybrid approach targets improved parsing accuracy on standard be…
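
For readers unfamiliar with the classical CYK algorithm named above, the following is a minimal sketch of plain CYK recognition over a grammar in Chomsky normal form. It is not the paper's neuro-symbolic method; the toy grammar and the names `cyk_recognize`, `TOY_UNARY`, and `TOY_BINARY` are assumptions made purely for illustration.

```python
from itertools import product


def cyk_recognize(words, unary_rules, binary_rules, start="S"):
    """Return True if `words` can be derived from `start` under a CNF grammar.

    unary_rules maps a word to the set of nonterminals that produce it.
    binary_rules maps a (left, right) nonterminal pair to the set of parents.
    """
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(unary_rules.get(word, ()))
    # Fill longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left, right in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= binary_rules.get((left, right), set())
    return start in table[0][n - 1]


# Hypothetical toy grammar: S -> NP VP, VP -> V NP, plus three lexical rules.
TOY_UNARY = {"she": {"NP"}, "eats": {"V"}, "fish": {"NP"}}
TOY_BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
```

With this toy grammar, `cyk_recognize(["she", "eats", "fish"], TOY_UNARY, TOY_BINARY)` accepts the sentence, while a reordered input such as `["eats", "she"]` is rejected.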

AI arxiv.org · Jun 1, 2026 04:00 UTC

Scaling Conversational Hungarian ASR

The work presents the BEA-Dialogue+ corpus to enable larger-scale training of conversational Hungarian automatic speech recognition systems. The corpus targets gaps in existing Hungarian speech resources.

AI arxiv.org · Jun 1, 2026 04:00 UTC

Skill Availability in Large-Language-Model Agents

The study evaluates how the availability and granularity of skills presented to large language model agents influence task completion. Findings come from systematic experiments on the SkillsBench ben…

AI arxiv.org · May 29, 2026 04:00 UTC

Step-level Credit Assignment in Agentic Search

The paper explores graph-based modeling to assign credit at the step level during agentic search, moving beyond simple trajectory rewards. The method aims to provide finer-grained learning signals fo…

AI arxiv.org · May 29, 2026 04:00 UTC

Preference-Based MaxSAT for LLM Reasoning

The paper investigates preference-based maximum satisfiability techniques to enhance the reliability of reasoning outputs from large language models. The approach integrates logical constraints with …
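
As background on maximum satisfiability in general, the sketch below solves a tiny weighted partial MaxSAT instance by brute force: hard clauses must hold, and the solver maximizes the total weight of satisfied soft clauses, with the weights standing in for preferences. This is an assumption-laden illustration, not the paper's technique; the function name `solve_weighted_maxsat` and the clause encoding are hypothetical, and exhaustive search is only practical for a handful of variables.

```python
from itertools import product


def solve_weighted_maxsat(num_vars, hard, soft):
    """Brute-force weighted partial MaxSAT (illustrative only).

    A clause is a list of nonzero ints: k means variable k is True,
    -k means variable k is False. `soft` is a list of (weight, clause).
    Returns (best_score, assignment), or (None, None) if the hard
    clauses cannot all be satisfied.
    """
    def satisfied(clause, assignment):
        return any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)

    best_score, best_assignment = None, None
    for assignment in product([False, True], repeat=num_vars):
        # Hard clauses act as constraints that every candidate must meet.
        if not all(satisfied(clause, assignment) for clause in hard):
            continue
        # Soft clauses contribute their weight when satisfied.
        score = sum(weight for weight, clause in soft if satisfied(clause, assignment))
        if best_score is None or score > best_score:
            best_score, best_assignment = score, assignment
    return best_score, best_assignment
```

For example, with the hard clause `[-1, -2]` (variables 1 and 2 cannot both be true) and soft clauses `(3, [1])` and `(2, [2])`, the search keeps the heavier preference and sets only variable 1 to true.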

AI arxiv.org · May 29, 2026 04:00 UTC

PTCG-Bench for LLM Agents in Card Games

The paper presents PTCG-Bench as a new evaluation framework for LLM agents playing the Pokémon Trading Card Game. It tests strategic decision-making in a complex, partially observable environment.

AI arxiv.org · May 29, 2026 04:00 UTC

Token-Optimized Formats for Agentic AI

The paper conducts a benchmark study showing how notation choices affect token usage and performance in agentic AI systems. Optimized formats aim to improve efficiency without sacrificing capability.
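
To make the idea of notation choice concrete, here is a rough sketch that serializes the same records as indented JSON and as a compact header-plus-rows layout, then compares their lengths. Character count is used only as a crude stand-in for token count, since real counts depend on the tokenizer. The formats, the helper names `to_verbose` and `to_compact`, and the sample `RECORDS` are assumptions for illustration and are not taken from the paper.

```python
import json


def to_verbose(records):
    """Serialize records as indented JSON, repeating every key per record."""
    return json.dumps(records, indent=2)


def to_compact(records):
    """Serialize records as one header line plus comma-separated rows.

    Assumes all records share the same keys and no value contains a comma.
    """
    fields = list(records[0])
    rows = [",".join(str(record[field]) for field in fields) for record in records]
    return "\n".join([",".join(fields)] + rows)


# Hypothetical tool-call log used only to compare the two layouts.
RECORDS = [
    {"id": 1, "tool": "search", "status": "ok"},
    {"id": 2, "tool": "fetch", "status": "error"},
]

verbose_len = len(to_verbose(RECORDS))
compact_len = len(to_compact(RECORDS))
```

Here `compact_len` comes out smaller than `verbose_len` because keys are written once and structural punctuation is dropped. Whether a model handles the compact layout as reliably as JSON is the kind of question a benchmark study would have to settle.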

AI arxiv.org · May 29, 2026 04:00 UTC

LLM-Evolved Heuristics for Symbolic AI Planning

The paper introduces a method that uses large language models to generate heuristics for symbolic AI planning problems. The approach aims to improve performance across different planning domains with…

AI arxiv.org · May 29, 2026 04:00 UTC

Partitioning Computation for Health Text Generation

The paper proposes partitioning deterministic rules and neural models to generate structured health text. The method seeks faster and more reliable output for clinical documentation tasks.

AI arxiv.org · May 29, 2026 04:00 UTC

GRASP Skill Proposer for LLM Agents

The paper describes GRASP, which uses gated regression to propose skills that enable LLM agents to improve autonomously. The framework targets more effective iterative learning.

AI arxiv.org · May 29, 2026 04:00 UTC

Toulmin-based Evaluation for LLM Chain-of-Thought

The paper introduces TRACE, which applies Toulmin argumentation structure to evaluate constructive elements in LLM chain-of-thought reasoning. The goal is more reliable assessment of logical quality.

AI arxiv.org · May 29, 2026 04:00 UTC

NICE Benchmark for LLM Social Intelligence

The paper presents NICE as a diagnostic benchmark grounded in social science theory for measuring LLM social intelligence. It targets gaps in current evaluation methods for interpersonal reasoning.

AI arxiv.org · May 29, 2026 04:00 UTC

FHRFormer for Fetal Heart Rate Forecasting

The paper introduces FHRFormer, a self-supervised masked transformer designed for inpainting missing data and forecasting fetal heart rate signals. The framework targets improved monitoring in obstet…