digitado

Penalizing Localized Dirichlet Energies in Low Rank Tensor Products

digitado ⋅ 21 January 2026

arXiv:2601.14173v1 Announce Type: cross Abstract: We study low-rank tensor-product B-spline (TPBS) models for regression tasks and investigate Dirichlet energy as a measure of smoothness. We show that TPBS models admit a closed-form expression for the Dirichlet energy, and reveal scenarios where perfect interpolation is possible with exponentially small Dirichlet energy. This renders global Dirichlet energy-based regularization ineffective. To address this limitation, we propose a novel regularization strategy based on local Dirichlet energies defined on small hypercubes centered at […]

technocracy

Code World Models for Parameter Control in Evolutionary Algorithms

digitado ⋅ 27 February 2026

arXiv:2602.22260v1 Announce Type: new Abstract: Can an LLM learn how an optimizer behaves, and use that knowledge to control it? We extend Code World Models (CWMs), LLM-synthesized Python programs that predict environment dynamics, from deterministic games to stochastic combinatorial optimization. Given suboptimal trajectories of $(1{+}1)$-$\text{RLS}_k$, the LLM synthesizes a simulator of the optimizer’s dynamics; greedy planning over this simulator then selects the mutation strength $k$ at each step. On LeadingOnes and OneMax, CWM-greedy performs within 6% of […]

technocracy

How generative AI can help scientists synthesize complex materials

digitado ⋅ 23 April 2026

Generative artificial intelligence models have been used to create enormous libraries of theoretical materials that could help solve all kinds of problems. Now, scientists just have to figure out how to make them. In many cases, materials synthesis is not as simple as following a recipe in the kitchen. Factors like the temperature and length of processing can yield huge changes in a material’s properties that make or break its performance. That has limited researchers’ ability to test […]

technocracy

Axe: A Simple Unified Layout Abstraction for Machine Learning Compilers

digitado ⋅ 27 January 2026

Scaling modern deep learning workloads demands coordinated placement of data and compute across device meshes, memory hierarchies, and heterogeneous accelerators. We present Axe Layout, a hardware-aware abstraction that maps logical tensor coordinates to a multi-axis physical space via named axes. Axe unifies tiling, sharding, replication, and offsets across inter-device distribution and on-device layouts, enabling collective primitives to be expressed consistently from device meshes to threads. Building on Axe, we design a multi-granularity, distribution-aware DSL and compiler that composes […]

technocracy

Sample Complexity of Average-Reward Q-Learning: From Single-agent to Federated Reinforcement Learning

digitado ⋅ 21 January 2026

arXiv:2601.13642v1 Announce Type: new Abstract: Average-reward reinforcement learning offers a principled framework for long-term decision-making by maximizing the mean reward per time step. Although Q-learning is a widely used model-free algorithm with established sample complexity in discounted and finite-horizon Markov decision processes (MDPs), its theoretical guarantees for average-reward settings remain limited. This work studies a simple but effective Q-learning algorithm for average-reward MDPs with finite state and action spaces under the weakly communicating assumption, covering both single-agent and […]

technocracy

When 250 texts are enough to hack an LLM's "truth"

digitado ⋅ 22 February 2026

There is a reassuring idea that many of us have accepted almost by inertia: if a model is trained on colossal amounts of data, a few drops of "poison" should be "diluted" until they become irrelevant. The problem is that this very human, commonsense intuition appears to be simply false. And the claim does not come from an alarmist tweet or an opportunistic demo: it is demonstrated by joint work from Anthropic, the UK AI Security Institute and the Alan Turing Institute […]

technocracy

TASTE-Streaming: Towards Streamable Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling

digitado ⋅ 16 March 2026

arXiv:2603.12350v1 Announce Type: new Abstract: Text-speech joint spoken language modeling (SLM) aims at natural and intelligent speech-based interactions, but developing such a system may suffer from modality mismatch: speech unit sequences are much longer than text tokens. Prior work reduces this gap with text-aligned tokenization and embedding (TASTE), producing speech tokens that align in length with their textual counterparts. However, the dependence on an external ASR system and the use of a non-causal decoder limit streaming use. To […]

technocracy

Protecting Language Models Against Unauthorized Distillation through Trace Rewriting

digitado ⋅ 18 February 2026

arXiv:2602.15143v1 Announce Type: new Abstract: Knowledge distillation is a widely adopted technique for transferring capabilities from LLMs to smaller, more efficient student models. However, unauthorized use of knowledge distillation takes unfair advantage of the considerable effort and cost put into developing frontier models. We investigate methods for modifying teacher-generated reasoning traces to achieve two objectives that deter unauthorized distillation: (1) anti-distillation, or degrading the training usefulness of query responses, and (2) API watermarking, which embeds verifiable signatures in […]

technocracy

Effective sample size approximations as entropy measures

digitado ⋅ 27 February 2026

arXiv:2602.22954v1 Announce Type: cross Abstract: In this work, we analyze alternative effective sample size (ESS) metrics for importance sampling algorithms, and discuss a possible extended range of applications. We show the relationship between the ESS expressions used in the literature and two entropy families, the Rényi and Tsallis entropies. The Rényi entropy is connected to the Huggins-Roy ESS family introduced in [Huggins15]. We prove that all the ESS functions included in the Huggins-Roy family fulfill all the […]

technocracy

Matrix Manifold Neural Networks++

digitado ⋅ 6 January 2026

arXiv:2405.19206v2 Announce Type: replace Abstract: Deep neural networks (DNNs) on Riemannian manifolds have garnered increasing interest in various applied areas. For instance, DNNs on spherical and hyperbolic manifolds have been designed to solve a wide range of computer vision and natural language processing tasks. One of the key factors that contribute to the success of these networks is that spherical and hyperbolic manifolds have the rich algebraic structures of gyrogroups and gyrovector spaces. This enables principled and effective […]
