digitado

Learning from Synthetic Data Improves Multi-hop Reasoning

digitado ⋅ 2 March 2026

Reinforcement Learning (RL) has been shown to significantly boost reasoning capabilities of large language models (LLMs) in math, coding, and multi-hop reasoning tasks. However, RL fine-tuning requires abundant high-quality verifiable data, often sourced from human annotations, generated from frontier LLMs, or scored by LLM-based verifiers. All three have considerable limitations: human-annotated datasets are small and expensive to curate, LLM-generated data is hallucination-prone and costly, and LLM-based verifiers are inaccurate and slow. In this work, we investigate a cheaper […]


technocracy

Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents

digitado ⋅ 6 February 2026

arXiv:2602.05073v1. Abstract: Uncertainty quantification (UQ) for large language models (LLMs) is a key building block for safety guardrails of daily LLM applications. Yet, even as LLM agents are increasingly deployed in highly complex tasks, most UQ research still centers on single-turn question-answering. We argue that UQ research must shift to realistic settings with interactive agents, and that a new principled framework for agent UQ is needed. This paper presents the first general formulation of agent […]


technocracy

Proact-VL: A Proactive VideoLLM for Real-Time AI Companions

digitado ⋅ 5 March 2026

arXiv:2603.03447v1. Abstract: Proactive and real-time interactive experiences are essential for human-like AI companions, yet face three key challenges: (1) achieving low-latency inference under continuous streaming inputs, (2) autonomously deciding when to respond, and (3) controlling both quality and quantity of generated content to meet real-time constraints. In this work, we instantiate AI companions through two gaming scenarios, commentator and guide, selected for their suitability for automatic evaluation. We introduce the Live Gaming Benchmark, a large-scale […]


technocracy

Generalized Visual Language Models

digitado ⋅ 10 June 2022

Processing images to generate text, as in image captioning and visual question-answering, has been studied for years. Traditionally, such systems rely on an object detection network as a vision encoder to capture visual features and then produce text via a text decoder. Given the large amount of existing literature, in this post I would like to focus on only one approach to solving vision-language tasks, which is to extend pre-trained generalized language models to be capable of […]


technocracy

Robotic Assembly Using Deep Reinforcement Learning

digitado ⋅ 21 October 2020

Introduction. Disclaimer: This article is a cross-post from the PyTorch Medium blog. One of the most exciting advancements that has pushed the frontier of Artificial Intelligence (AI) in recent years is Deep Reinforcement Learning (DRL). DRL belongs to the family of machine learning algorithms. It assumes that intelligent machines can learn from their actions, similar to the way humans learn from experience. In recent years we have witnessed some impressive real-world applications of DRL. The […]


technocracy

AstraZeneca bets on in-house AI to speed up oncology research

digitado ⋅ 14 January 2026

Drug development is producing more data than ever, and large pharmaceutical companies like AstraZeneca are turning to AI to make sense of it. The challenge is no longer whether AI can help, but how tightly it needs to be built into research and clinical work to improve decisions around trials and treatment. That question helps explain why AstraZeneca is bringing Modella AI in-house. The company has agreed to acquire the Boston-based AI firm as it looks to deepen […]


technocracy

SW-ASR: A Context-Aware Hybrid ASR Pipeline for Robust Single Word Speech Recognition

digitado ⋅ 30 January 2026

arXiv:2601.20890v1. Abstract: Single-word Automatic Speech Recognition (ASR) is a challenging task due to the lack of linguistic context and sensitivity to noise, pronunciation variation, and channel artifacts, especially in low-resource, communication-critical domains such as healthcare and emergency response. This paper reviews recent deep learning approaches and proposes a modular framework for robust single-word detection. The system combines denoising and normalization with a hybrid ASR front end (Whisper + Vosk) and a verification layer designed to […]


technocracy

ContraLog: Log File Anomaly Detection with Contrastive Learning and Masked Language Modeling

digitado ⋅ 3 February 2026

Log files record computational events that reflect system state and behavior, making them a primary source of operational insights in modern computer systems. Automated anomaly detection on logs is therefore critical, yet most established methods rely on log parsers that collapse messages into discrete templates, discarding variable values and semantic content. We propose ContraLog, a parser-free and self-supervised method that reframes log anomaly detection as predicting continuous message embeddings rather than discrete template IDs. ContraLog combines a message […]


technocracy

TT-Sparse: Learning Sparse Rule Models with Differentiable Truth Tables

digitado ⋅ 8 March 2026

Interpretable machine learning is essential in high-stakes domains where decision-making requires accountability, transparency, and trust. While rule-based models offer global and exact interpretability, learning rule sets that simultaneously achieve high predictive performance and low, human-understandable complexity remains challenging. To address this, we introduce TT-Sparse, a flexible neural building block that leverages differentiable truth tables as nodes to learn sparse, effective connections. A key contribution of our approach is a new soft TopK operator with straight-through estimation for learning […]


technocracy

Hybrid Tabletop Exercise (TTX) based on a Mathematical Simulation-based Model for the Maritime Sector

digitado ⋅ 19 February 2026

arXiv:2602.15975v1. Abstract: As cyber threats grow in complexity and scale, many security incidents remain poorly managed due to the lack of proper training among C-level executives. Thus, there is a need for targeted cybersecurity education to enhance executive decision-making and crisis response. Traditional training methods, such as cyber wargames and Tabletop Exercises (TTX), aim to develop abilities to face critical incidents; however, they often lack the interactive and dynamic elements required to prepare individuals for […]
