digitado

UNICBench: UNIfied Counting Benchmark for MLLM

digitado ⋅ 3 March 2026

arXiv:2603.00595v1. Abstract: Counting is a core capability for multimodal large language models (MLLMs), yet there is no unified counting dataset to rigorously evaluate this ability across image, text, and audio. We present UNICBench, a unified multimodal, multi-level counting benchmark and evaluation toolkit with accurate ground truth, deterministic numeric parsing, and stratified reporting. The corpus comprises 5,300 images (5,508 QA), 872 documents (5,888 QA), and 2,069 audio clips (2,905 QA), annotated with a three-level […]


technocracy

Novel “Kaputt” dataset sets new benchmark for large-scale visual defect detection

digitado ⋅ 2 October 2025

A new dataset with over 238,000 images challenges and advances the state of the art in visual defect detection for complex retail applications. Computer vision ⋅ Sebastian Hoefer ⋅ October 02, 12:34 PM. At Amazon, we’re constantly working to improve our logistics operations through cutting-edge AI and computer vision. Today, we’re excited to announce the public release of Kaputt, a large-scale dataset for visual defect […]


technocracy

GFPL: Generative Federated Prototype Learning for Resource-Constrained and Data-Imbalanced Vision Task

digitado ⋅ 25 February 2026

Federated learning (FL) facilitates the secure utilization of decentralized images, advancing applications in medical image recognition and autonomous driving. However, conventional FL faces two critical challenges in real-world deployment: ineffective knowledge fusion caused by model updates biased toward majority-class features, and prohibitive communication overhead due to frequent transmissions of high-dimensional model parameters. Inspired by the human brain’s efficiency in knowledge integration, we propose a novel Generative Federated Prototype Learning (GFPL) framework to address these issues. Within this framework, […]


technocracy

Google makes its industrial robotics AI play official–and this time, it means business

digitado ⋅ 4 March 2026

When Google folds a moonshot into its core operations, it’s not cleaning house. It’s placing a bet. On February 25, Alphabet-owned Intrinsic–which builds AI models and software designed to make industrial robotics more accessible–officially joined Google. The company will remain a distinct group within Google, working closely with Google DeepMind and tapping into Gemini AI models and Google Cloud. No purchase price was disclosed. On the surface, this looks like a routine internal reshuffle. It isn’t. From Moonshot to Mandate […]


technocracy

A benchmarking framework for PON-based fronthaul network design

digitado ⋅ 22 January 2026

arXiv:2601.14480v1. Abstract: As mobile networks transition toward 5G and 6G RAN architectures, Passive Optical Networks (PONs) offer a critical solution for cost-effective fronthaul transport. However, the lack of standardized evaluation models in current literature makes an objective comparison of diverse optimization strategies difficult. This paper addresses this gap by proposing a unified benchmarking framework that standardizes cost catalogs and deployment scenarios. We formulate the network design problem using Integer Linear Programming (ILP) to establish optimality […]


technocracy

Multimodal Enhancement of Sequential Recommendation

digitado ⋅ 10 February 2026

arXiv:2602.07207v1. Abstract: We propose a novel recommender framework, MuSTRec (Multimodal and Sequential Transformer-based Recommendation), that unifies multimodal and sequential recommendation paradigms. MuSTRec captures cross-item similarities and collaborative filtering signals by building item-item graphs from extracted text and visual features. A frequency-based self-attention module additionally captures short- and long-term user preferences. Across multiple Amazon datasets, MuSTRec demonstrates superior performance (up to 33.5% improvement) over multimodal and sequential state-of-the-art baselines. Finally, we detail some interesting facets […]


technocracy

Coarse-to-Fine Learning of Dynamic Causal Structures

digitado ⋅ 26 February 2026

Learning the dynamic causal structure of time series is a challenging problem. Most existing approaches rely on distributional or structural invariance to uncover underlying causal dynamics, assuming stationary or partially stationary causality. However, these assumptions often conflict with the complex, time-varying causal relationships observed in real-world systems. This motivates the need for methods that address fully dynamic causality, where both instantaneous and lagged dependencies evolve over time. Such a setting poses significant challenges for the efficiency and stability […]


technocracy

Learning Ordered Representations in Latent Space for Intrinsic Dimension Estimation via Principal Component Autoencoder

digitado ⋅ 27 January 2026

Autoencoders have long been considered a nonlinear extension of Principal Component Analysis (PCA). Prior studies have demonstrated that linear autoencoders (LAEs) can recover the ordered, axis-aligned principal components of PCA by incorporating non-uniform $\ell_2$ regularization or by adjusting the loss function. However, these approaches become insufficient in the nonlinear setting, as the remaining variance cannot be properly captured independently of the nonlinear mapping. In this work, we propose a novel autoencoder framework that integrates non-uniform variance regularization with […]


technocracy

Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD

digitado ⋅ 5 March 2026

arXiv:2505.18535v2. Abstract: We study the convergence properties and escape dynamics of Stochastic Gradient Descent (SGD) in one-dimensional landscapes, separately considering infinite- and finite-variance noise. Our main focus is to identify the time scales on which SGD reliably moves from an initial point to the local minimum in the same “basin”. Under suitable conditions on the noise distribution, we prove that SGD converges to the basin’s minimum unless the initial point lies too close to a […]


technocracy

RL for modeling rodent behavior?

digitado ⋅ 2 February 2026

I’ve seen some pretty cool work using Q-learning and HMMs to model rat behavior in some pretty complex behavioral paradigms (e.g., learning a contrast gradient with a psychometric function, etc.), but for very classical associative learning, are there any interesting approaches that one might use? What properties/parameters of conditioned learning (e.g., beyond learning rate) might be interesting to try to pull out by fitting RL models? submitted by /u/traydblockzplz
