digitado

How iteration order influences convergence and stability in deep learning

digitado ⋅ 30 March 2026

arXiv:2502.01557v3 Announce Type: replace-cross Abstract: Despite exceptional achievements, training neural networks remains computationally expensive and is often plagued by instabilities that can degrade convergence. While learning rate schedules can help mitigate these issues, finding optimal schedules is time-consuming and resource-intensive. This work explores theoretical issues concerning training stability in the constant-learning-rate (i.e., without schedule) and small-batch-size regime. Surprisingly, we show that the composition order of gradient updates affects stability and convergence in gradient-based optimizers. We illustrate this new […]

technocracy

A Hidden Problem in Jetpack Compose TextField Max Length

digitado ⋅ 30 March 2026

When building a design system for the Android app at inDrive, we implemented a reusable TextArea component using androidx.compose.foundation.text.BasicTextField. One of the basic requirements was straightforward: limit the maximum number of characters a user can enter. Jetpack Compose seems to already support this through InputTransformation. However, after exploring the implementation more closely, we discovered a subtle behavior that can lead to a completely unusable TextField in certain scenarios. This article explores: how Jetpack Compose TextField max length works […]

Causal explanations of outliers in systems with lagged time-dependencies

digitado ⋅ 5 February 2026

arXiv:2602.04667v1 Announce Type: new Abstract: Root-cause analysis in controlled time-dependent systems poses a major challenge in applications. Energy systems are especially difficult to handle because they exhibit instantaneous as well as delayed effects and, if equipped with storage, have a memory. In this paper we adapt the causal root-cause analysis method of Budhathoki et al. [2022] to general time-dependent systems, as it can be regarded as a strictly causal definition of the term “root-cause”. Particularly, we […]

Empowering Epidemic Response: The Role of Reinforcement Learning in Infectious Disease Control

digitado ⋅ 26 March 2026

Reinforcement learning (RL) adapts to a wide range of dynamic systems in real-world scenarios and can maximize long-term outcomes under different constraints. In recent years it has been used in infectious disease control to optimize intervention strategies for limiting disease spread and responding to outbreaks. The potential of RL for assisting public health sectors in preventing and controlling infectious diseases is gradually emerging and is being explored in a rapidly growing number of publications relevant to COVID-19 and […]

DD-MDN: Human Trajectory Forecasting with Diffusion-Based Dual Mixture Density Networks and Uncertainty Self-Calibration

digitado ⋅ 13 February 2026

arXiv:2602.11214v1 Announce Type: new Abstract: Human Trajectory Forecasting (HTF) predicts future human movements from past trajectories and environmental context, with applications in Autonomous Driving, Smart Surveillance, and Human-Robot Interaction. While prior work has focused on accuracy, social interaction modeling, and diversity, little attention has been paid to uncertainty modeling, calibration, and forecasts from short observation periods, which are crucial for downstream tasks such as path planning and collision avoidance. We propose DD-MDN, an end-to-end probabilistic HTF model that […]

Layer-Parallel Training for Transformers

digitado ⋅ 15 January 2026

arXiv:2601.09026v1 Announce Type: new Abstract: We present a new training methodology for transformers using a multilevel, layer-parallel approach. Through a neural ODE formulation of transformers, our application of a multilevel parallel-in-time algorithm for the forward and backpropagation phases of training achieves parallel acceleration over the layer dimension. This dramatically enhances parallel scalability as the network depth increases, which is particularly useful for increasingly large foundational models. However, achieving this introduces errors that cause systematic bias in the gradients, […]

SDiT: Semantic Region-Adaptive for Diffusion Transformers

digitado ⋅ 21 January 2026

arXiv:2601.12283v1 Announce Type: new Abstract: Diffusion Transformers (DiTs) achieve state-of-the-art performance in text-to-image synthesis but remain computationally expensive due to the iterative nature of denoising and the quadratic cost of global attention. In this work, we observe that denoising dynamics are spatially non-uniform: background regions converge rapidly, while edges and textured areas evolve much more actively. Building on this insight, we propose SDiT, a Semantic Region-Adaptive Diffusion Transformer that allocates computation according to regional complexity. SDiT introduces a training-free […]

Beyond Subtokens: A Rich Character Embedding for Low-resource and Morphologically Complex Languages

digitado ⋅ 26 February 2026

arXiv:2602.21377v1 Announce Type: new Abstract: Tokenization and sub-tokenization based models like word2vec, BERT and the GPTs are the state of the art in natural language processing. Typically, these approaches have limitations with respect to their input representation. They fail to fully capture orthographic similarities and morphological variations, especially in highly inflected and under-resourced languages. To mitigate this problem, we propose to compute word vectors directly from character strings, integrating both semantic and syntactic information. We denote this transformer-based approach Rich Character […]

C-STEP: Continuous Space-Time Empowerment for Physics-informed Safe Reinforcement Learning of Mobile Agents

digitado ⋅ 25 March 2026

Safe navigation in complex environments remains a central challenge for reinforcement learning (RL) in robotics. This paper introduces Continuous Space-Time Empowerment for Physics-informed (C-STEP) safe RL, a novel measure of agent-centric safety tailored to deterministic, continuous domains. This measure can be used to design physics-informed intrinsic rewards by augmenting positive navigation reward functions. The reward incorporates the agent's internal states (e.g., initial velocity) and forward dynamics to differentiate safe from risky behavior. By integrating C-STEP with navigation rewards, […]

Contrastive Representation Learning

digitado ⋅ 31 May 2021

The goal of contrastive representation learning is to learn an embedding space in which similar sample pairs stay close to each other while dissimilar ones are far apart. Contrastive learning can be applied in both supervised and unsupervised settings. When working with unlabeled data, contrastive learning is one of the most powerful approaches in self-supervised learning.
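The "close if similar, far if dissimilar" objective can be made concrete with a small sketch. The code below uses the classic pairwise margin loss as one possible formulation; the teaser does not say which loss the post covers, and the function names and the `margin` default are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_contrastive_loss(a, b, similar, margin=1.0):
    """Pairwise margin loss for one pair of embeddings (hypothetical helper).

    Similar pairs are penalized by their squared distance, which pulls them
    together. Dissimilar pairs are penalized only while they are closer than
    `margin`, which pushes them apart until the margin is reached.
    """
    d = euclidean(a, b)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


# A similar pair that coincides costs nothing; a dissimilar pair that
# coincides pays the full squared margin.
print(pairwise_contrastive_loss([0.0, 0.0], [0.0, 0.0], similar=True))
print(pairwise_contrastive_loss([0.0, 0.0], [0.0, 0.0], similar=False))
```

In practice the embeddings would come from a trained encoder and the loss would be averaged over many pairs; this sketch only shows the per-pair term.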
