digitado

Reliability by design: quantifying and eliminating fabrication risk in LLMs. From generative to consultative AI: a comparative analysis in the legal domain and lessons for high-stakes knowledge bases

digitado ⋅ 23 January 2026

arXiv:2601.15476v1. Abstract: This paper examines how to make large language models reliable for high-stakes legal work by reducing hallucinations. It distinguishes three AI paradigms: (1) standalone generative models (“creative oracle”), (2) basic retrieval-augmented systems (“expert archivist”), and (3) an advanced, end-to-end optimized RAG system (“rigorous archivist”). The authors introduce two reliability metrics, False Citation Rate (FCR) and Fabricated Fact Rate (FFR), and evaluate 2,700 judicial-style answers from 12 LLMs across 75 legal tasks using expert, […]

technocracy

Learning Without Training

digitado ⋅ 20 February 2026

Machine learning is at the heart of managing the real-world problems associated with massive data. With the success of neural networks on such large-scale problems, more research in machine learning is being conducted now than ever before. This dissertation focuses on three different projects rooted in mathematical theory for machine learning applications. The first project deals with supervised learning and manifold learning. In theory, one of the main problems in supervised learning is that of function approximation: that […]

technocracy

GutenOCR: A Grounded Vision-Language Front-End for Documents

digitado ⋅ 22 January 2026

arXiv:2601.14490v1. Abstract: GutenOCR is a family of grounded OCR front-ends obtained by fine-tuning Qwen2.5-VL-3B and Qwen2.5-VL-7B. The resulting single-checkpoint vision-language models expose reading, detection, and grounding through a unified, prompt-based interface. Trained on business documents, scientific articles, and synthetic grounding data, the models support full-page and localized reading with line- and paragraph-level bounding boxes and conditional “where is x?” queries. We introduce a grounded OCR evaluation protocol and show that GutenOCR-7B more than doubles the […]

technocracy

Living With the Lethal Trifecta: A Guide to Personal AI Agent Security

digitado ⋅ 20 February 2026

I gave my OpenClaw AI agent the name Aris, access to my health data, family Telegram chat, calendar, and GitHub. OpenClaw is an open-source agent framework for building and running personal AI assistants that can interact with various apps and data sources. Simon Willison would call this insane, and he is probably right. Here’s what a Tuesday morning looks like. At 7:30, Aris sends my morning briefing: sleep score from Apple Watch, resting heart rate trending up, recovery […]

technocracy

Roles of MLLMs in Visually Rich Document Retrieval for RAG: A Survey

digitado ⋅ 8 January 2026

arXiv:2601.03262v1. Abstract: Visually rich documents (VRDs) challenge retrieval-augmented generation (RAG) with layout-dependent semantics, brittle OCR, and evidence spread across complex figures and structured tables. This survey examines how Multimodal Large Language Models (MLLMs) are being used to make VRD retrieval practical for RAG. We organize the literature into three roles: Modality-Unifying Captioners, Multimodal Embedders, and End-to-End Representers. We compare these roles along retrieval granularity, information fidelity, latency and index size, and compatibility with reranking and […]

technocracy

Audio Foundation Models Outperform Symbolic Representations for Piano Performance Evaluation

digitado ⋅ 28 January 2026

arXiv:2601.19029v1. Abstract: Automated piano performance evaluation traditionally relies on symbolic (MIDI) representations, which capture note-level information but miss the acoustic nuances that characterize expressive playing. I propose using pre-trained audio foundation models, specifically MuQ and MERT, to predict 19 perceptual dimensions of piano performance quality. Using synthesized audio from PercePiano MIDI files (rendered via Pianoteq), I compare audio and symbolic approaches under controlled conditions where both derive from identical source data. The best model, MuQ […]

technocracy

ByteDance backpedals after Seedance 2.0 turned Hollywood icons into AI “clip art”

digitado ⋅ 16 February 2026

ByteDance says that it’s rushing to add safeguards to block Seedance 2.0 from generating iconic characters and deepfaking celebrities, following substantial Hollywood backlash to the launch of the latest version of its AI video tool. The changes come after Disney and Paramount Skydance sent cease-and-desist letters to ByteDance urging the Chinese company to promptly end the allegedly vast and blatant infringement. Studios claimed the infringement was widescale and immediate, with Seedance 2.0 users across social media sharing AI videos featuring […]

technocracy

2026 Lucid Air Touring review: This feels like a complete car now

digitado ⋅ 23 January 2026

Life as a startup carmaker is hard—just ask Lucid Motors. When we met the brand and its prototype Lucid Air sedan in 2017, the company planned to put the first cars in customers’ hands within a couple of years. But you know what they say about plans. A lack of funding paused everything until late 2018, when Saudi Arabia’s sovereign wealth fund bought itself a stake. A billion dollars meant Lucid could build a factory—at the cost of […]

technocracy

European digital identity: A missed opportunity?

digitado ⋅ 22 January 2026

arXiv:2601.14503v1. Abstract: Recent European efforts around digital identity (the EUDI regulation and its OpenID architecture) aim high, but start from a narrow and ill-defined conceptualization of authentication. Based on a broader, more grounded understanding of the term, we identify several issues in the design of OpenID4VCI and OpenID4VP: insecure practices, static and subject-bound credential types, and a limited query language restrict their application to classic scenarios of credential exchange, already supported […]

technocracy

The TechBeat: The Communication Habits That Help Startups Build Real Authority (1/10/2026)

digitado ⋅ 10 January 2026

How are you, hacker? 🪐 Want to know what’s trending right now? The Techbeat by HackerNoon has got you covered with fresh content from our trending stories of the day! Designing API Contracts for Legacy System Modernization, by @jamescaron [6 min read]: A practical look at designing API contracts during legacy system modernization, focusing on real production failures and strategies to prevent silent regression. How Supercell Powers its Massive Social […]
