digitado

On the Flakiness of LLM-Generated Tests for Industrial and Open-Source Database Management Systems

digitado ⋅ 15 January 2026

arXiv:2601.08998v1 (Announce Type: new)

Abstract: Flaky tests are a common problem in software testing. They produce inconsistent results when executed multiple times on the same code, invalidating the assumption that a test failure indicates a software defect. Recent work on LLM-based test generation has identified flakiness as a potential problem with generated tests. However, its prevalence and underlying causes are unclear. We examined the flakiness of LLM-generated tests in the context of four relational database management systems: SAP […]

technocracy

Evaluating the Robustness of Reinforcement Learning based Adaptive Traffic Signal Control

digitado ⋅ 16 March 2026

Reinforcement learning (RL) has attracted increasing interest for adaptive traffic signal control due to its model-free ability to learn control policies directly from interaction with the traffic environment. However, several challenges remain before RL-based signal control can be considered ready for field deployment. Many existing studies rely on simplified signal timing structures, robustness of trained models under varying traffic demand conditions remains insufficiently evaluated, and runtime efficiency continues to pose challenges when training RL algorithms in traffic microscopic […]

Bayesian Generative Adversarial Networks via Gaussian Approximation for Tabular Data Synthesis

digitado ⋅ 26 February 2026

arXiv:2602.21948v1 (Announce Type: cross)

Abstract: Generative Adversarial Networks (GAN) have been used in many studies to synthesise mixed tabular data. Conditional tabular GAN (CTGAN) have been the most popular variant but struggle to effectively navigate the risk-utility trade-off. Bayesian GAN have received less attention for tabular data, but have been explored with unstructured data such as images and text. The most used technique employed in Bayesian GAN is Markov Chain Monte Carlo (MCMC), but it is computationally intensive, […]

Vibe Coding & AI in UI/UX Design

digitado ⋅ 7 March 2026

Where Creativity Meets Code: How AI is transforming the way we think about design, development, and everything in between.

Picture this: you’re sitting in a coffee shop, scribbling wireframes on a napkin. You take a photo with your phone, and within seconds, you’re looking at a fully functional, beautifully styled web interface, complete with responsive layouts, smooth animations, and production-ready code. No wrestling with CSS grids. No debugging flex containers. Just your idea, brought to life. This isn’t science fiction anymore. […]

Failing to Explore: Language Models on Interactive Tasks

digitado ⋅ 2 February 2026

arXiv:2601.22345v1 (Announce Type: new)

Abstract: We evaluate language models on their ability to explore interactive environments under a limited interaction budget. We introduce three parametric tasks with controllable exploration difficulty, spanning continuous and discrete environments. Across state-of-the-art models, we find systematic under-exploration and suboptimal solutions, with performance often significantly worse than simple explore–exploit heuristic baselines and scaling weakly as the budget increases. Finally, we study two lightweight interventions: splitting a fixed budget into parallel executions, which surprisingly improves […]

Real-Time VFX Isn’t a Feature Anymore. It’s the New Baseline for Game Development.

digitado ⋅ 13 February 2026

The gaming world has evolved at breakneck speed in the past couple of decades. Once upon a time, you had to wait for a year or more to get the new iteration of a game. (Think the old Total War release cycles, where the British and Australian branches of the company alternated on bi-yearly schedules.) Now, game development is fast and focused on near-instant gratification. A live-service shooter drops new content on a continuous cycle. This is fed […]

Learning Fast Monomial Orders for Gröbner Basis Computations

digitado ⋅ 3 February 2026

The efficiency of Gröbner basis computation, the standard engine for solving systems of polynomial equations, depends on the choice of monomial ordering. Despite a near-continuum of possible monomial orders, most implementations rely on static heuristics such as GrevLex, guided primarily by expert intuition. We address this gap by casting the selection of monomial orderings as a reinforcement learning problem over the space of admissible orderings. Our approach leverages domain-informed reward signals that accurately reflect the computational cost of […]

Roadmap to Master Reinforcement Learning (RL)

digitado ⋅ 8 January 2026

Hi everyone, I’m a CS student aiming to master Reinforcement Learning (RL) for industry roles and startup building. I’ve designed the following roadmap and would really appreciate feedback from experienced practitioners.

My background: comfortable with Python, NumPy, Pandas; basic ML & Deep Learning knowledge.

Long-term goal: RL Engineer / Agentic AI systems.

🛣️ My RL Roadmap

1️⃣ Foundations: Python (OOP, decorators, multiprocessing); Math: Linear Algebra, Probability, Calculus; Markov Processes (MDP, Bellman equations)

2️⃣ Classical RL: Multi-armed bandits; Dynamic […]

DeepRead: Document Structure-Aware Reasoning to Enhance Agentic Search

digitado ⋅ 6 February 2026

arXiv:2602.05014v1 (Announce Type: new)

Abstract: With the rapid progress of tool-using and agentic large language models (LLMs), Retrieval-Augmented Generation (RAG) is evolving from one-shot, passive retrieval into multi-turn, decision-driven evidence acquisition. Despite strong results in open-domain settings, existing agentic search frameworks commonly treat long documents as flat collections of chunks, underutilizing document-native priors such as hierarchical organization and sequential discourse structure. We introduce DeepRead, a structure-aware, multi-turn document reasoning agent that explicitly operationalizes these priors for long-document question […]

ShapBPT: Image Feature Attributions Using Data-Aware Binary Partition Trees

digitado ⋅ 10 February 2026

arXiv:2602.07047v1 (Announce Type: new)

Abstract: Pixel-level feature attributions are an important tool in eXplainable AI for Computer Vision (XCV), providing visual insights into how image features influence model predictions. The Owen formula for hierarchical Shapley values has been widely used to interpret machine learning (ML) models and their learned representations. However, existing hierarchical Shapley approaches do not exploit the multiscale structure of image data, leading to slow convergence and weak alignment with the actual morphological features. Moreover, no […]
