digitado

Bit-Identical Medical Deep Learning via Structured Orthogonal Initialization

digitado ⋅ 30 March 2026

Deep learning training is non-deterministic: identical code with different random seeds produces models that agree on aggregate metrics but disagree on individual predictions, with per-class AUC swings exceeding 20 percentage points on rare clinical classes. We present a framework for verified bit-identical training that eliminates three sources of randomness: weight initialization (via structured orthogonal basis functions), batch ordering (via golden ratio scheduling), and non-deterministic GPU operations (via architecture selection and custom autograd). The pipeline produces MD5-verified identical trained […]


technocracy

Interface Framework for Human-AI Collaboration within Intelligent User Interface Ecosystems

digitado ⋅ 27 February 2026

arXiv:2602.22343v1 Announce Type: new Abstract: As interfaces evolve from static user pathways to dynamic human-AI collaboration, no standard methods exist for selecting appropriate interface patterns based on user needs and task complexity. Existing frameworks only provide guiding principles for designing AI agent capabilities. We propose a dimensional framework based on workflow complexity, AI autonomy, and AI reasoning to guide the design of context-aware, scalable AI interfaces aka modalities (e.g., prompt bars, split screens, full screens, etc.). The framework […]


technocracy

Radial Basis Function in Machine Learning: Formula, Example, Applications

digitado ⋅ 13 March 2026

Radial basis function in machine learning serves as a powerful tool for handling non-linear data patterns through neural networks that use distance-based activations. This approach helps models approximate complex functions and make accurate predictions in tasks like classification and regression. This blog explains what radial basis function means, its formula, network structure, training steps, a solved example, and applications of radial basis function network. You will gain a clear understanding of how these networks work, their strengths, and […]
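
For readers unfamiliar with the term, the distance-based activation mentioned in the teaser above is commonly the Gaussian radial basis function, $\phi(x) = \exp(-\gamma \lVert x - c \rVert^2)$. The following is a minimal sketch and is not taken from the linked post; the names `gaussian_rbf`, `rbf_network_output`, `centers`, `weights`, and `gamma` are illustrative assumptions.

```python
import math


def gaussian_rbf(x, center, gamma=1.0):
    """Gaussian RBF: exp(-gamma * squared Euclidean distance to the center)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-gamma * sq_dist)


def rbf_network_output(x, centers, weights, bias=0.0, gamma=1.0):
    """One hidden RBF layer followed by a linear output (hypothetical helper)."""
    activations = [gaussian_rbf(x, c, gamma) for c in centers]
    return bias + sum(w * a for w, a in zip(weights, activations))


if __name__ == "__main__":
    # Two hand-picked centers and weights, chosen only for illustration.
    centers = [(0.0, 0.0), (1.0, 1.0)]
    weights = [1.0, -1.0]
    print(rbf_network_output((0.0, 0.0), centers, weights))
```

The activation equals 1 when the input coincides with a center and decays toward 0 as the distance grows, which is what makes the hidden units respond locally.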


technocracy

Zipage: Maintain High Request Concurrency for LLM Reasoning through Compressed PagedAttention

digitado ⋅ 11 March 2026

arXiv:2603.08743v1 Announce Type: new Abstract: With reasoning becoming the generative paradigm for large language models (LLMs), the memory bottleneck caused by KV cache during the decoding phase has become a critical factor limiting high-concurrency service. Although existing KV cache eviction methods address the memory issue, most of them are impractical for industrial-grade applications. This paper introduces Compressed PagedAttention, a method that combines token-wise KV cache eviction with PagedAttention. We propose a comprehensive scheduling strategy and support prefix caching […]


technocracy

COLD-Steer: Steering Large Language Models via In-Context One-step Learning Dynamics

digitado ⋅ 6 March 2026

Activation steering methods enable inference-time control of large language model (LLM) behavior without retraining, but current approaches face a fundamental trade-off: sample-efficient methods suboptimally capture steering signals from labeled examples, while methods that better extract these signals require hundreds to thousands of examples. We introduce COLD-Steer, a training-free framework that steers LLM activations by approximating the representational changes that would result from gradient descent on in-context examples. Our key insight is that the effect of fine-tuning on a […]


technocracy

Quantitative convergence of trained single layer neural networks to Gaussian processes

digitado ⋅ 6 March 2026

arXiv:2509.24544v3 Announce Type: replace Abstract: In this paper, we study the quantitative convergence of shallow neural networks trained via gradient descent to their associated Gaussian processes in the infinite-width limit. While previous work has established qualitative convergence under broad settings, precise, finite-width estimates remain limited, particularly during training. We provide explicit upper bounds on the quadratic Wasserstein distance between the network output and its Gaussian approximation at any training time $t \ge 0$, demonstrating polynomial decay with network […]


technocracy

Dual-system learning model “figures out” how to use a tool

digitado ⋅ 14 April 2026

This is an 8-year passion project attempting to create a control system for a purely autonomous virtual agent. I wanted to put together a model that could fully control an agent with typical human drives (hunger, play/exploration, control). The full model is composed of interconnected simple neural network modules. The application is written in C# and implemented in Unity. The model uses reward-modulated Hebbian learning in modules associated with value processing (e.g. amygdala, ventral striatum), and […]


technocracy

Claude Opus 4.7 Is Here and It Changes the Coding Model Race

digitado ⋅ 17 April 2026

I woke up this morning to Anthropic dropping Claude Opus 4.7. No slow rollout, no waitlist. Just a new model sitting in Claude Code, the API, and every major cloud provider simultaneously. I have been running Opus 4.6 as my default coding model for months. It has been the backbone of my agentic coding workflow, my content pipeline, and most of my production debugging sessions. So the first thing I did was throw my hardest open tasks at Opus 4.7 to […]


technocracy

Helping K-12 schools navigate the complex world of AI

digitado ⋅ 3 November 2025

With the rapid advancement of generative artificial intelligence, teachers and school leaders are looking for answers to complicated questions about successfully integrating technology into lessons, while also ensuring students actually learn what they’re trying to teach. Justin Reich, an associate professor in MIT’s Comparative Media Studies/Writing program, hopes a new guidebook published by the MIT Teaching Systems Lab can support K-12 educators as they determine what AI policies or guidelines to craft. “Throughout my career, I’ve tried to be a […]


technocracy

Sharp Concentration Inequalities: Phase Transition and Mixing of Orlicz Tails with Variance

digitado ⋅ 30 March 2026

arXiv:2603.25934v1 Announce Type: cross Abstract: In this work, we investigate how to develop sharp concentration inequalities for sub-Weibull random variables, including sub-Gaussian and sub-exponential distributions. Although the random variables may not be sub-Gaussian, the tail probability around the origin behaves as if they were sub-Gaussian, and the tail probability decay aligns with the Orlicz $\Psi_\alpha$-tail elsewhere. Specifically, for independent and identically distributed (i.i.d.) $\{X_i\}_{i=1}^n$ with finite Orlicz norm $\|X\|_{\Psi_\alpha}$, our theory unveils that there is an interesting phase […]
