[P] vLLM-MLX: Native Apple Silicon LLM inference – 464 tok/s on M4 Max
Hey everyone! I built vLLM-MLX – a framework that uses Apple's MLX for native GPU acceleration.

What it does:
– OpenAI-compatible API (drop-in replacement for your existing code; see the sketch below)
– Multimodal support: Text, Images, Video, Audio – all in one server
– Continuous batching for concurrent users (3.4x speedup)
– TTS in 10+ languages (Kokoro, Chatterbox models)
– MCP tool calling support

Performance on M4 Max:
– Llama-3.2-1B-4bit → 464 tok/s
– Qwen3-0.6B → 402 tok/s
– Whisper STT […]
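
Because the server is OpenAI-compatible, existing client code should only need a base-URL swap. Here's a minimal sketch using the official `openai` Python client; the port (8000) and the model ID are my assumptions, not confirmed details from the post:

```python
# Minimal sketch: point the stock OpenAI client at a local vLLM-MLX server.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

# Standard chat completion call; the model ID below is an assumption.
response = client.chat.completions.create(
    model="mlx-community/Llama-3.2-1B-Instruct-4bit",
    messages=[{"role": "user", "content": "Hello from Apple Silicon!"}],
)
print(response.choices[0].message.content)
```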