digitado

Temporal Difference Learning with Constrained Initial Representations

digitado ⋅ 12 de February de 2026

Recently, there have been numerous attempts to enhance the sample efficiency of off-policy reinforcement learning (RL) agents when interacting with the environment, including architecture improvements and new algorithms. Despite these advances, they overlook the potential of directly constraining the initial representations of the input data, which can intuitively alleviate the distribution shift issue and stabilize training. In this paper, we introduce the Tanh function into the initial layer to fulfill such a constraint. We theoretically unpack the convergence […]

Ver mais

Like 0

Liked Liked

technocracy

TotalFM: An Organ-Separated Framework for 3D-CT Vision Foundation Models

digitado ⋅ 6 de January de 2026

arXiv:2601.00260v1 Announce Type: new Abstract: While foundation models in radiology are expected to be applied to various clinical tasks, computational cost constraints remain a major challenge when training on 3D-CT volumetric data. In this study, we propose TotalFM, a radiological foundation model that efficiently learns the correspondence between 3D-CT images and linguistic expressions based on the concept of organ separation, utilizing a large-scale dataset of 140,000 series. By automating the creation of organ volume and finding-sentence pairs through […]

Ver mais

Like 0

Liked Liked

technocracy

Cybersecurity Use Case: AI Agent for Anomaly Detection – Part 2

digitado ⋅ 26 de January de 2026

In the first part of this series, here, I feature a click fraud case that we are working on, litigated by one of the largest law firms in the US. The input data comes from an Excel repository, automatically processed by an AI agent part of our BondingAI enterprise solutions. It comes with insights generation via automated SQL queries and pattern detection. In other contexts, the data may come from a PDF repository, the Internet, databases, or a […]

Ver mais

Like 0

Liked Liked

technocracy

Digital Red Queen: Adversarial Program Evolution in Core War with LLMs

digitado ⋅ 8 de January de 2026

arXiv:2601.03335v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly being used to evolve solutions to problems in many domains, in a process inspired by biological evolution. However, unlike biological evolution, most LLM-evolution frameworks are formulated as static optimization problems, overlooking the open-ended adversarial dynamics that characterize real-world evolutionary processes. Here, we study Digital Red Queen (DRQ), a simple self-play algorithm that embraces these so-called “Red Queen” dynamics via continual adaptation to a changing objective. DRQ uses […]

Ver mais

Like 0

Liked Liked

technocracy

Amortized Variational Inference for Partial-Label Learning: A Probabilistic Approach to Label Disambiguation

digitado ⋅ 12 de January de 2026

arXiv:2510.21300v2 Announce Type: replace-cross Abstract: Real-world data is frequently noisy and ambiguous. In crowdsourcing, for example, human annotators may assign conflicting class labels to the same instances. Partial-label learning (PLL) addresses this challenge by training classifiers when each instance is associated with a set of candidate labels, only one of which is correct. While early PLL methods approximate the true label posterior, they are often computationally intensive. Recent deep learning approaches improve scalability but rely on surrogate losses […]

Ver mais

Like 0

Liked Liked

technocracy

Graph-Neural-Network-Based Brain Connectivity Analysis for Early Detection of Long-COVID Cognitive Impairments Using MRI/fMRI

digitado ⋅ 19 de January de 2026

Long-term COVID are frequently causing neuro- logical problems that persist for long time, like cognitive damage and changes in the structure of the brain that are very difficult to find with conventional clinical testing. When coupled with AI, neuroimaging might serve as a useful tool for finding small prob- lems.To build a deep learning-based neuroimaging framework that automatically recognizes and describe neurological diseases in long-term COVID patients. Deep learning models, such as CNNs, transformers, and graph neural networks, […]

Ver mais

Like 0

Liked Liked

technocracy

Recoverability Has a Law: The ERR Measure for Tool-Augmented Agents

digitado ⋅ 2 de February de 2026

arXiv:2601.22352v1 Announce Type: new Abstract: Language model agents often appear capable of self-recovery after failing tool call executions, yet this behavior lacks a formal explanation. We present a predictive theory that resolves this gap by showing that recoverability follows a measurable law. To elaborate, we formalize recoverability through Expected Recovery Regret (ERR), which quantifies the deviation of a recovery policy from the optimal one under stochastic execution noise, and derive a first-order relationship between ERR and an empirical […]

Ver mais

Like 0

Liked Liked

technocracy

Detoxification of large language models via regularized fine-tuning

digitado ⋅ 21 de November de 2024

Detoxification of large language models via regularized fine-tuning Attribute-controlled fine-tuning can produce LLMs that adhere to policy while achieving competitive performance on general benchmarks. Conversational AI Charith Peris November 21, 03:40 PM November 21, 03:40 PM Large language models (LLMs) have demonstrated impressive performance across a variety of tasks, but, as has been clear in multiple instances, they carry the risk of producing inappropriate, unsafe, or biased outputs. When generating responses, a successfully trained LLM should comply with […]

Ver mais

Like 0

Liked Liked

technocracy

Tight Regret Bounds for Bilateral Trade under Semi Feedback

digitado ⋅ 26 de January de 2026

arXiv:2601.16412v1 Announce Type: new Abstract: The study of textit{regret minimization in fixed-price bilateral trade} has received considerable attention in recent research. Previous works [CCC+24a, CCC+24b, AFF24, BCCF24, CJLZ25, LCM25a, GDFS25] have acquired a thorough understanding of the problem, except for determining the tight regret bound for GBB semi-feedback fixed-price mechanisms under adversarial values. In this paper, we resolve this open question by devising an $widetilde{O}(T^{2 / 3})$-regret mechanism, matching the $Omega(T^{2 / 3})$ lower bound from [CJLZ25] up […]

Ver mais

Like 0

Liked Liked

technocracy

Winner’s Curse Drives False Promises in Data-Driven Decisions: A Case Study in Refugee Matching

digitado ⋅ 10 de February de 2026

arXiv:2602.08892v1 Announce Type: new Abstract: A major challenge in data-driven decision-making is accurate policy evaluation-i.e., guaranteeing that a learned decision-making policy achieves the promised benefits. A popular strategy is model-based policy evaluation, which estimates a model from data to infer counterfactual outcomes. This strategy is known to produce unwarrantedly optimistic estimates of the true benefit due to the winner’s curse. We searched the recent literature on data-driven decision-making, identifying a sample of 55 papers published in the Management […]

Ver mais

Like 0

Liked Liked