digitado

A second order regret bound for NormalHedge

digitado ⋅ 10 February 2026

arXiv:2602.08151v1 Announce Type: cross Abstract: We consider the problem of prediction with expert advice for “easy” sequences. We show that a variant of NormalHedge enjoys a second-order $\epsilon$-quantile regret bound of $O\big(\sqrt{V_T \log(V_T/\epsilon)}\big)$ when $V_T > \log N$, where $V_T$ is the cumulative second moment of instantaneous per-expert regret averaged with respect to a natural distribution determined by the algorithm. The algorithm is motivated by a continuous time limit using Stochastic Differential Equations. The discrete time analysis […]

technocracy

Tracking the Limits of Knowledge Propagation: How LLMs Fail at Multi-Step Reasoning with Conflicting Knowledge

digitado ⋅ 23 January 2026

arXiv:2601.15495v1 Announce Type: new Abstract: A common solution for mitigating outdated or incorrect information in Large Language Models (LLMs) is to provide updated facts in-context or through knowledge editing. However, these methods introduce knowledge conflicts when the knowledge update fails to overwrite the model’s parametric knowledge, which propagate to faulty reasoning. Current benchmarks for this problem, however, largely focus only on single knowledge updates and fact recall without evaluating how these updates affect downstream reasoning. In this work, […]

Structured vs. Unstructured Pruning: An Exponential Gap

digitado ⋅ 4 March 2026

arXiv:2603.02234v1 Announce Type: new Abstract: The Strong Lottery Ticket Hypothesis (SLTH) posits that large, randomly initialized neural networks contain sparse subnetworks capable of approximating a target function at initialization without training, suggesting that pruning alone is sufficient. Pruning methods are typically classified as unstructured, where individual weights can be removed from the network, and structured, where parameters are removed according to specific patterns, as in neuron pruning. Existing theoretical results supporting the SLTH rely almost exclusively on unstructured […]

MixMin: Finding Data Mixtures via Convex Minimization

digitado ⋅ 16 January 2026

arXiv:2502.10510v3 Announce Type: replace-cross Abstract: Modern machine learning pipelines are increasingly combining and mixing data from diverse and disparate sources, e.g., pre-training large language models. Yet, finding the optimal data mixture is a challenging and open problem. We formalize this data mixing problem as a bi-level objective: the best mixture is the one that would lead to the best model for a downstream objective. Unfortunately, this objective is generally intractable. In this paper, we make the observation that […]

Supply Chain Attack on Axios Pulls Malicious Dependency from npm

digitado ⋅ 1 April 2026

Useful writeup of today’s supply chain attack against Axios, the HTTP client npm package with 101 million weekly downloads. Versions 1.14.1 and 0.30.4 both included a new dependency called plain-crypto-js, which was freshly published malware that steals credentials and installs a remote access trojan (RAT). It looks like the attack came from a leaked long-lived npm token. Axios have an open issue to adopt trusted publishing, which would ensure […]
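For readers who want to check a project against the indicators named in this excerpt, here is a minimal sketch. It only looks for the two Axios versions and the plain-crypto-js dependency mentioned above. It assumes an npm lockfile in the v2/v3 layout, with a top-level `packages` map keyed by install path. The helper names `find_suspect_packages` and `scan_lockfile` are hypothetical, and a clean result does not show that a machine was unaffected.

```python
import json

# Indicators taken from the excerpt above; nothing else is checked.
COMPROMISED_AXIOS = {"1.14.1", "0.30.4"}
MALICIOUS_DEP = "plain-crypto-js"


def find_suspect_packages(lock: dict) -> list[str]:
    """Return lockfile entries that match the reported indicators."""
    hits = []
    # Assumption: lockfile v2/v3, where "packages" maps install paths
    # such as "node_modules/axios" to metadata with a "version" field.
    for path, meta in lock.get("packages", {}).items():
        # Take the package name after the last "node_modules/" segment,
        # so nested installs are matched as well.
        name = path.rsplit("node_modules/", 1)[-1]
        version = meta.get("version", "")
        if name == "axios" and version in COMPROMISED_AXIOS:
            hits.append(f"{path}@{version}")
        elif name == MALICIOUS_DEP:
            hits.append(f"{path}@{version}")
    return sorted(hits)


def scan_lockfile(filename: str = "package-lock.json") -> list[str]:
    """Load a lockfile from disk and report matching entries."""
    with open(filename, encoding="utf-8") as fh:
        return find_suspect_packages(json.load(fh))
```

Calling `scan_lockfile()` from a project root returns a list of matching install paths, or an empty list if none of the indicators appear in the lockfile.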

MEMRES: A Memory-Augmented Resolver with Confidence Cascade for Agentic Python Dependency Resolution

digitado ⋅ 21 April 2026

arXiv:2604.16941v1 Announce Type: new Abstract: We present MEMRES, an agentic system for Python dependency resolution that introduces a multi-level confidence cascade where the LLM serves as the last resort. Our system combines: (1) a Self-Evolving Memory that accumulates reusable resolution patterns via tips and shortcuts; (2) an Error Pattern Knowledge Base with 200+ curated import-to-package mappings; (3) a Semantic Import Analyzer; and (4) a Python 2 heuristic detector resolving the largest failure category. On HG2.9K using Gemma-2 9B […]

Internal Reasoning vs. External Control: A Thermodynamic Analysis of Sycophancy in Large Language Models

digitado ⋅ 8 January 2026

arXiv:2601.03263v1 Announce Type: new Abstract: Large Language Models frequently exhibit sycophancy, prioritizing user agreeableness over correctness. We investigate whether this requires external regulation or can be mitigated by internal reasoning alone. Using CAP-GSM8K (N=500), an adversarial dataset, we evaluate internal (CoT) versus external (RCA) mechanisms across GPT-3.5, GPT-4o, and GPT-5.1. Our results reveal the structural limits of internal reasoning: it causes performance collapse in weak models (the Prioritization Paradox) and leaves an 11.4% final output gap in frontier […]

The Markup Wins Six News Design Awards From the Society for News Design

digitado ⋅ 28 March 2026

The Markup, now a part of CalMatters, uses investigative reporting, data analysis, and software engineering to challenge technology to serve the public good. The Markup won multiple awards of excellence in the Society for News Design’s Best of News Design Creative Competition, which honors excellence in visual storytelling, design, and journalism produced in 2023. “See the Neighborhoods Internet Providers Excluded from Fast […]

I trained an AI to play Resident Evil 4 Remake using Behavioral Cloning + LSTM

digitado ⋅ 29 March 2026

I recorded gameplay trajectories in RE4’s village — running, shooting, reloading, dodging — and used Behavioral Cloning to train a model to imitate my decisions. Added LSTM so the AI could carry memory across time steps, not just react to the current frame. The most interesting result: the AI handled single enemies reasonably well, but struggled with the fight-or-flee decision when multiple enemies were on screen simultaneously. That nuance was hard to imitate without more data. Full video […]

Pixel-level Counterfactual Contrastive Learning for Medical Image Segmentation

digitado ⋅ 17 March 2026

Image segmentation relies on large annotated datasets, which are expensive and slow to produce. Silver-standard (AI-generated) labels are easier to obtain, but they risk introducing bias. Self-supervised learning, needing only images, has become key for pre-training. Recent work combining contrastive learning with counterfactual generation improves representation learning for classification but does not readily extend to pixel-level tasks. We propose a pipeline combining counterfactual generation with dense contrastive learning via Dual-View (DVD-CL) and Multi-View (MVD-CL) methods, along with supervised […]
