digitado

Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space

digitado ⋅ February 26, 2026

arXiv:2602.21269v1 Announce Type: cross Abstract: We present Group Orthogonalized Policy Optimization (GOPO), a new alignment algorithm for large language models derived from the geometry of Hilbert function spaces. Instead of optimizing on the probability simplex and inheriting the exponential curvature of Kullback-Leibler divergence, GOPO lifts alignment into the Hilbert space $L^2(\pi_k)$ of square-integrable functions with respect to the reference policy. Within this space, the simplex constraint reduces to a linear orthogonality condition $\langle v, 1 \rangle = 0$, defining a codimension-one subspace […]


technocracy

Learning When to Act: Interval-Aware Reinforcement Learning with Predictive Temporal Structure

digitado ⋅ March 25, 2026

arXiv:2603.22384v1 Announce Type: new Abstract: Autonomous agents operating in continuous environments must decide not only what to do, but when to act. We introduce a lightweight adaptive temporal control system that learns the optimal interval between cognitive ticks from experience, replacing ad hoc biologically inspired timers with a principled learned policy. The policy state is augmented with a predictive hyperbolic spread signal (a “curvature signal” shorthand) derived from hyperbolic geometry: the mean pairwise Poincaré distance among n sampled […]
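The "mean pairwise Poincaré distance" mentioned in the abstract can be illustrated with a short sketch. It uses the standard distance formula for the unit Poincaré ball; the function names and the plain-tuple point representation are assumptions for illustration, and the truncated abstract does not say how the paper samples or embeds its points.

```python
import math
from itertools import combinations

def poincare_distance(u, v):
    """Distance between two points strictly inside the unit Poincare ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 |u - v|^2 / ((1 - |u|^2)(1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)

def mean_pairwise_spread(points):
    """Average Poincare distance over all unordered pairs (hypothetical helper)."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(poincare_distance(u, v) for u, v in pairs) / len(pairs)
```

Because distances grow without bound near the boundary of the ball, a spread value computed this way reacts strongly when sampled points drift toward the edge, which is one plausible reason to use it as a state feature.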


technocracy

This AI Can Hear, Translate, and Speak Back in 100 Languages

digitado ⋅ March 9, 2026

:::info
Authors: Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, Christopher Klaiber, Pengwei Li, Daniel Licht, Jean Maillard, Alice Rakotoarison, Kaushik Ram Sadagopan, Guillaume Wenzek, Ethan Ye
:::

Abstract

Creating the Babel Fish, a tool that helps individuals translate speech between any two languages, requires advanced technological innovation and linguistic expertise. Although conventional speech-to-speech translation systems composed of multiple subsystems performing translation in a cascaded […]


technocracy

Private Sum Computation: Trade-Offs between Communication, Randomness, and Privacy

digitado ⋅ February 9, 2026

arXiv:2602.06238v1 Announce Type: new Abstract: Consider multiple users and a fusion center. Each user possesses a sequence of bits and can communicate with the fusion center through a one-way public channel. The fusion center’s task is to compute the sum of all the sequences under the privacy requirement that a set of colluding users, along with the fusion center, cannot gain more than a predetermined amount $\delta$ of information, measured through mutual information, about the sequences of other […]


technocracy

Metabolic cost of information processing in Poisson variational autoencoders

digitado ⋅ February 17, 2026

arXiv:2602.13421v1 Announce Type: new Abstract: Computation in biological systems is fundamentally energy-constrained, yet standard theories of computation treat energy as freely available. Here, we argue that variational free energy minimization under a Poisson assumption offers a principled path toward an energy-aware theory of computation. Our key observation is that the Kullback-Leibler (KL) divergence term in the Poisson free energy objective becomes proportional to the prior firing rates of model neurons, yielding an emergent metabolic cost term that penalizes […]
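The claim that the KL term becomes proportional to the prior firing rate can be checked with the closed-form KL divergence between two Poisson distributions. The sketch below is only an illustration: writing the posterior rate as the prior rate times `exp(log_gain)` is an assumed parameterization, and the truncated abstract does not confirm that the paper uses it.

```python
import math

def poisson_kl(posterior_rate, prior_rate):
    """KL( Poisson(posterior_rate) || Poisson(prior_rate) ) in nats."""
    if prior_rate <= 0.0 or posterior_rate < 0.0:
        raise ValueError("need prior_rate > 0 and posterior_rate >= 0")
    if posterior_rate == 0.0:
        return prior_rate  # limit of the closed form as the posterior rate goes to 0
    return (posterior_rate * math.log(posterior_rate / prior_rate)
            - posterior_rate + prior_rate)

def kl_from_log_gain(prior_rate, log_gain):
    """Same KL with posterior_rate = prior_rate * exp(log_gain) (assumed form).

    Substituting gives prior_rate * (1 - g + g * log_gain) with g = exp(log_gain),
    so for a fixed log_gain the cost scales linearly with the prior rate.
    """
    gain = math.exp(log_gain)
    return prior_rate * (1.0 - gain + gain * log_gain)
```

Under this parameterization, doubling the prior rate doubles the KL penalty for the same relative modulation, which matches the reading of the term as a cost on firing.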


technocracy

A Multimodal Deep Learning Framework for Edema Classification Using HCT and Clinical Data

digitado ⋅ March 31, 2026

arXiv:2603.26726v1 Announce Type: new Abstract: We propose AttentionMixer, a unified deep learning framework for multimodal detection of brain edema that combines structural head CT (HCT) with routine clinical metadata. While HCT provides rich spatial information, clinical variables such as age, laboratory values, and scan timing capture complementary context that might be ignored or naively concatenated. AttentionMixer is designed to fuse these heterogeneous sources in a principled and efficient manner. HCT volumes are first encoded using a self-supervised Vision […]


technocracy

Foundation Models in Robotics: A Comprehensive Review of Methods, Models, Datasets, Challenges and Future Research Directions

digitado ⋅ April 20, 2026

arXiv:2604.15395v1 Announce Type: new Abstract: Over the recent years, the field of robotics has been undergoing a transformative paradigm shift from fixed, single-task, domain-specific solutions towards adaptive, multi-function, general-purpose agents, capable of operating in complex, open-world, and dynamic environments. This tremendous advancement is primarily driven by the emergence of Foundation Models (FMs), i.e., large-scale neural-network architectures trained on massive, heterogeneous datasets that provide unprecedented capabilities in multi-modal understanding and reasoning, long-horizon planning, and cross-embodiment generalization. In this context, […]


technocracy

Graph Information Theory: The Mathematical Proofs Behind LSEnet and DSI

digitado ⋅ February 19, 2026

Table of Links

Abstract and 1. Introduction
2. Related Work
3. Preliminaries and Notations
4. Differentiable Structural Information
    4.1. A New Formulation
    4.2. Properties
    4.3. Differentiability & Deep Graph Clustering
5. LSEnet
    5.1. Embedding Leaf Nodes
    5.2. Learning Parent Nodes
    5.3. Hyperbolic Partitioning Tree
6. Experiments
    6.1. Graph Clustering
    6.2. Discussion on Structural Entropy
7. Conclusion, Broader Impact, and References
Appendix
    A. Proofs
    B. Hyperbolic Space
    C. Technical Details
    D. Additional Results

Appendix

The appendix is structured in four sections. A. Proofs on differential […]


technocracy

Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization

digitado ⋅ April 16, 2026

arXiv:2604.13175v1 Announce Type: new Abstract: Large language models can be aligned with human preferences through offline reinforcement learning (RL) on small labeled datasets. While single-objective alignment is well-studied, many real-world applications demand the simultaneous optimization of multiple conflicting rewards, e.g. optimizing both catalytic activity and specificity in protein engineering, or helpfulness and harmlessness for chatbots. Prior work has largely relied on linear reward scalarization, but this approach provably fails to recover non-convex regions of the Pareto front. In […]
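A toy example can illustrate why linear scalarization misses non-convex parts of a Pareto front. The three candidate reward vectors are invented for illustration, and the log-sum-exp form of smooth Tchebysheff scalarization used here is the commonly cited one; the truncated abstract does not show the paper's exact objective, so treat this as a sketch rather than the authors' method.

```python
import math

def linear_scalarization(rewards, weights):
    """Weighted sum of rewards (higher is better)."""
    return sum(w * r for w, r in zip(weights, rewards))

def smooth_tchebysheff(rewards, weights, ideal, mu=0.05):
    """Log-sum-exp smoothing of max_i w_i * (ideal_i - reward_i) (lower is better)."""
    scaled = [w * (z - r) / mu for w, r, z in zip(weights, rewards, ideal)]
    peak = max(scaled)  # subtract the peak for numerical stability
    return mu * (peak + math.log(sum(math.exp(s - peak) for s in scaled)))

# Hypothetical candidates: C is Pareto-optimal but sits in a dent of the front.
candidates = {"A": (1.0, 0.0), "B": (0.0, 1.0), "C": (0.4, 0.4)}
ideal = (1.0, 1.0)  # assumed ideal point: best value of each reward

def best_linear(weights):
    return max(candidates, key=lambda k: linear_scalarization(candidates[k], weights))

def best_tchebysheff(weights):
    return min(candidates, key=lambda k: smooth_tchebysheff(candidates[k], weights, ideal))
```

For any weights `(w, 1 - w)`, the weighted sum of `C` is 0.4 while the better of `A` and `B` scores at least 0.5, so `best_linear` never returns `C`. With equal weights, `best_tchebysheff` does select it.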


technocracy

Summaries as Centroids for Interpretable and Scalable Text Clustering

digitado ⋅ February 10, 2026

arXiv:2502.09667v5 Announce Type: replace-cross Abstract: We introduce k-NLPmeans and k-LLMmeans, text-clustering variants of k-means that periodically replace numeric centroids with textual summaries. The key idea, summary-as-centroid, retains k-means assignments in embedding space while producing human-readable, auditable cluster prototypes. The method is LLM-optional: k-NLPmeans uses lightweight, deterministic summarizers, enabling offline, low-cost, and stable operation; k-LLMmeans is a drop-in upgrade that uses an LLM for summaries under a fixed per-iteration budget whose cost does not grow with dataset size. We […]
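The summary-as-centroid idea can be sketched with a toy pipeline. Everything here is a stand-in chosen for illustration: the bag-of-words `embed`, the most-frequent-words `summarize`, and refreshing summaries on every iteration rather than periodically are assumptions, not the paper's components.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy unit-normalized bag-of-words embedding (stand-in for a real encoder)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def summarize(texts, top_n=3):
    """Deterministic toy summarizer: the most frequent words, ties broken alphabetically."""
    counts = Counter(w for t in texts for w in t.lower().split())
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return " ".join(ranked[:top_n])

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def summary_kmeans(texts, init_summaries, iterations=5):
    """k-means-style loop whose centroids are embeddings of textual summaries."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = [embed(t, vocab) for t in texts]
    summaries = list(init_summaries)
    labels = [0] * len(texts)
    for _ in range(iterations):
        # Assignment step stays in embedding space, as in ordinary k-means.
        centroids = [embed(s, vocab) for s in summaries]
        labels = [min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: replace each centroid with a readable summary of its members.
        for c in range(len(summaries)):
            members = [t for t, lab in zip(texts, labels) if lab == c]
            if members:
                summaries[c] = summarize(members)
    return labels, summaries
```

The returned `summaries` double as human-readable cluster prototypes, which is the auditable property the abstract highlights. Swapping `summarize` for an LLM call would correspond to the k-LLMmeans variant in spirit.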
