digitado

Ensemble Self-Training for Unsupervised Machine Translation

digitado ⋅ 17 de March de 2026

We present an ensemble-driven self-training framework for unsupervised neural machine translation (UNMT). Starting from a primary language pair, we train multiple UNMT models that share the same translation task but differ in an auxiliary language, inducing structured diversity across models. We then generate pseudo-translations for the primary pair using token-level ensemble decoding, averaging model predictions in both directions. These ensemble outputs are used as synthetic parallel data to further train each model, allowing the models to improve via […]

Ver mais

Like 0

Liked Liked

technocracy

One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

digitado ⋅ 5 de March de 2026

arXiv:2603.03291v1 Announce Type: new Abstract: Reward Models (RMs) are crucial for online alignment of language models (LMs) with human preferences. However, RM-based preference-tuning is vulnerable to reward hacking, whereby LM policies learn undesirable behaviors from flawed RMs. By systematically measuring biases in five high-quality RMs, including the state-of-the-art, we find that issues persist despite prior work with respect to length, sycophancy, and overconfidence. We also discover new issues related to bias toward model-specific styles and answer-order. We categorize […]

Ver mais

Like 0

Liked Liked

technocracy

Safe Stochastic Explorer: Enabling Safe Goal Driven Exploration in Stochastic Environments and Safe Interaction with Unknown Objects

digitado ⋅ 3 de February de 2026

arXiv:2602.00868v1 Announce Type: new Abstract: Autonomous robots operating in unstructured, safety-critical environments, from planetary exploration to warehouses and homes, must learn to safely navigate and interact with their surroundings despite limited prior knowledge. Current methods for safe control, such as Hamilton-Jacobi Reachability and Control Barrier Functions, assume known system dynamics. Meanwhile existing safe exploration techniques often fail to account for the unavoidable stochasticity inherent when operating in unknown real world environments, such as an exploratory rover skidding over […]

Ver mais

Like 0

Liked Liked

technocracy

Minimax Rates for Hyperbolic Hierarchical Learning

digitado ⋅ 27 de January de 2026

We prove an exponential separation in sample complexity between Euclidean and hyperbolic representations for learning on hierarchical data under standard Lipschitz regularization. For depth-$R$ hierarchies with branching factor $m$, we first establish a geometric obstruction for Euclidean space: any bounded-radius embedding forces volumetric collapse, mapping exponentially many tree-distant points to nearby locations. This necessitates Lipschitz constants scaling as $exp(Ω(R))$ to realize even simple hierarchical targets, yielding exponential sample complexity under capacity control. We then show this obstruction vanishes […]

Ver mais

Like 0

Liked Liked

technocracy

PCGamer Article Performance Audit

digitado ⋅ 22 de March de 2026

Research: PCGamer Article Performance Audit Stuart Breckenridge pointed out that PC Gamer Recommends RSS Readers in a 37MB Article That Just Keeps Downloading, highlighting a truly horrifying example of web bloat that added up to 100s more MBs thanks to auto-playing video ads. I decided to have Claude Code for web use Rodney to investigate the page – prompt here. Tags: web-performance, rodney

Ver mais

Like 0

Liked Liked

technocracy

Random Wavelet Features for Graph Kernel Machines

digitado ⋅ 17 de February de 2026

Node embeddings map graph vertices into low-dimensional Euclidean spaces while preserving structural information. They are central to tasks such as node classification, link prediction, and signal reconstruction. A key goal is to design node embeddings whose dot products capture meaningful notions of node similarity induced by the graph. Graph kernels offer a principled way to define such similarities, but their direct computation is often prohibitive for large networks. Inspired by random feature methods for kernel approximation in Euclidean […]

Ver mais

Like 0

Liked Liked

technocracy

Neurons receive precisely tailored teaching signals as we learn

digitado ⋅ 10 de March de 2026

When we learn a new skill, the brain has to decide — cell by cell — what to change. New research from MIT suggests it can do that with surprising precision, sending targeted feedback to individual neurons so each one can adjust its activity in the right direction. The finding echoes a key idea from modern artificial intelligence. Many AI systems learn by comparing their output to a target, computing an “error” signal, and using it to fine-tune […]

Ver mais

Like 0

Liked Liked

technocracy

Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in modern Transformers

digitado ⋅ 28 de January de 2026

Transformer-based multimodal large language models often exhibit in-context learning (ICL) abilities. Motivated by this phenomenon, we ask: how do transformers learn to associate information across modalities from in-context examples? We investigate this question through controlled experiments on small transformers trained on synthetic classification tasks, enabling precise manipulation of data statistics and model architecture. We begin by revisiting core principles of unimodal ICL in modern transformers. While several prior findings replicate, we find that Rotary Position Embeddings (RoPE) increases […]

Ver mais

Like 0

Liked Liked

technocracy

Achieving Linear Speedup for Composite Federated Learning

digitado ⋅ 3 de February de 2026

This paper proposes FedNMap, a normal map-based method for composite federated learning, where the objective consists of a smooth loss and a possibly nonsmooth regularizer. FedNMap leverages a normal map-based update scheme to handle the nonsmooth term and incorporates a local correction strategy to mitigate the impact of data heterogeneity across clients. Under standard assumptions, including smooth local losses, weak convexity of the regularizer, and bounded stochastic gradient variance, FedNMap achieves linear speedup with respect to both the […]

Ver mais

Like 0

Liked Liked

technocracy

Not all tokens are needed(NAT): token efficient reinforcement learning

digitado ⋅ 10 de March de 2026

arXiv:2603.06619v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a key driver of progress in large language models, but scaling RL to long chain-of-thought (CoT) trajectories is increasingly constrained by backpropagation over every generated token. Even with optimized rollout engines, full-token updates can consume a large fraction of total training cost, turning token length into a hidden tax on RL. We introduce Not All Tokens Are Needed (NAT), a unified framework that makes the token budget a […]

Ver mais

Like 0

Liked Liked