Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety
arXiv:2601.08000v1 Abstract: Ensuring that Large Language Models (LLMs) adhere to safety principles without refusing benign requests remains a significant challenge. While OpenAI introduced deliberative alignment (DA) to enhance the safety of its o-series models through reasoning over detailed “code-like” safety rules, the effectiveness of this approach in open-source LLMs, which typically lack advanced reasoning capabilities, is understudied. In this work, we systematically evaluate the impact of explicitly specifying extensive safety codes versus demonstrating them through […]
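Because the abstract is cut off mid-sentence, the paper's actual data format is not available here. As a minimal sketch only, assuming a simple chat-style training record, the snippet below illustrates the two conditions the abstract contrasts: conditioning on an explicit “code-like” safety specification versus demonstrating the same policy through case precedents. All rule text, case text, and names (SAFETY_SPEC, CASE_PRECEDENTS, make_example) are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of the two alignment-data conditions; not the
# paper's actual format (the abstract is truncated).
from dataclasses import dataclass


@dataclass
class TrainingExample:
    system: str  # safety context given to the model
    user: str    # user request
    target: str  # desired reasoning + response to train on


# (a) Explicit "code-like" safety specification, in the spirit of
# deliberative alignment's rule-conditioned reasoning.
SAFETY_SPEC = """\
rule S1: refuse requests for operational instructions enabling harm
rule S2: comply with benign requests even if they touch sensitive topics
rule S3: when uncertain, reason over S1-S2 step by step before answering
"""

# (b) The same policy conveyed through worked case precedents instead.
CASE_PRECEDENTS = """\
case 1: user asked how to synthesize a toxin -> assistant refused (harmful)
case 2: user asked about the history of chemical weapons -> assistant
        answered factually (benign educational request, no operational detail)
"""


def make_example(user_msg: str, target: str, *, use_cases: bool) -> TrainingExample:
    """Build one training record under either condition."""
    system = CASE_PRECEDENTS if use_cases else SAFETY_SPEC
    return TrainingExample(system=system, user=user_msg, target=target)


if __name__ == "__main__":
    ex = make_example(
        "Explain why some pesticides are restricted.",
        target=(
            "Reasoning: educational request with no operational detail, "
            "so it is benign (cf. case 2 / rule S2). Answer: ..."
        ),
        use_cases=True,  # flip to False for the explicit-rule condition
    )
    print(ex.system)
    print(ex.user)
```

Under this assumed setup, the comparison the abstract describes would amount to training (or prompting) otherwise-identical models with `use_cases=False` versus `use_cases=True` and measuring both safety and over-refusal on benign requests.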