HACKOBAR_ // One feed for AI signal. No noise.

#1[TLDRNEWSLETTER]

7h ago

OpenAI Releases GPT-5.6 Preview: Sol, Terra, and Luna Models

OpenAI's GPT-5.6 Preview introduces three models — Sol (flagship), Terra, and Luna — with enhanced cyber and bio safety testing and new safeguards, currently in limited preview before broader rollout. No benchmark numbers or API pricing details are yet public.

#2[TECHCRUNCH]

1h ago

Cursor Launches Mobile App for Remote Coding Agent Oversight

Cursor's new mobile app lets developers monitor and guide autonomous coding agents from their phones, addressing the supervision gap when agents run long background tasks. No details on supported platforms or agent control depth were provided.

#3[HUGGINGFACE]

Two-Stage LLM Cascade Retains 97–99% Accuracy While Cutting Costs

A cascaded serving framework first clusters queries and routes each cluster to the cheapest sufficient model, then applies a quality estimation layer to escalate low-confidence outputs to stronger models. On test datasets, the system retains 97–99% of the strongest model's accuracy with significantly reduced cost, controlled by a single offline-tunable hyperparameter.
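
The two stages described above can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the `Model` class, the `cascade` function, the toy clustering rule, the toy quality estimator, and the reading of the single hyperparameter as an escalation threshold `tau` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Model:
    name: str
    cost: float                      # hypothetical cost per call
    answer: Callable[[str], str]


def cascade(query: str,
            cluster_of: Callable[[str], str],
            route: Dict[str, int],
            models: List[Model],     # ordered cheapest -> strongest
            quality: Callable[[str, str], float],
            tau: float) -> Tuple[str, str, float]:
    """Return (output, model name, total cost) for one query."""
    start = route[cluster_of(query)]  # stage 1: cluster-level routing
    spent = 0.0
    for index in range(start, len(models)):
        model = models[index]
        output = model.answer(query)
        spent += model.cost
        is_last = index == len(models) - 1
        # Stage 2: accept if estimated quality clears tau, else escalate.
        if is_last or quality(query, output) >= tau:
            return output, model.name, spent
    raise ValueError("route points past the end of the model list")


# Toy stand-ins (all hypothetical): the small model "fails" on why-questions.
small = Model("small", 1.0, lambda q: "" if q.startswith("why") else "small:" + q)
large = Model("large", 10.0, lambda q: "large:" + q)
models = [small, large]
route = {"chat": 0, "math": 1}
cluster_of = lambda q: "math" if any(c.isdigit() for c in q) else "chat"
quality = lambda q, out: 1.0 if out else 0.0

easy = cascade("hello", cluster_of, route, models, quality, tau=0.5)
escalated = cascade("why is the sky blue", cluster_of, route, models, quality, tau=0.5)
routed_high = cascade("2+2", cluster_of, route, models, quality, tau=0.5)
```

In this sketch, raising `tau` escalates more queries to the stronger model (higher cost, higher retained accuracy), while lowering it does the opposite.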

#4[@rohanpaul_ai]

1d ago

OpenRouter's Top Agent Model 'Owl Alpha' Is Likely Meituan's 1.6T MoE

Owl Alpha, trialed anonymously on OpenRouter for two months, is reportedly Meituan's LongCat-2.0-Preview: a 1.6T-parameter MoE with 48B active parameters (a dynamic active range of 33B–56B) and a native 1M-token context. It reportedly generates 559B tokens per day, up 242% month over month, and ranks in the top three on major agent benchmarks.

#5[GH]

1d ago

Open-Source AI Agent Finds and Fixes App Vulnerabilities Automatically

★ 88 new · 26,467 total

Strix is an open-source tool that deploys AI agents to identify and remediate application security vulnerabilities, positioning itself as an automated alternative to manual penetration testing.

#6[HN]

7h ago

Tidal AI Policy

50 pts · 26 comments

Tidal's AI policy requires labeling of AI-generated music and bars creators from monetizing it on the platform. The HN discussion centers on enforcement feasibility, with commenters noting spectral artifacts and phase anomalies in generative audio as potential detection signals, while debating whether human-verification workflows at higher upload costs could be practical.

#7[HUGGINGFACE]

1h ago

DiScoFormer Unifies Density Estimation and Score Matching in One Transformer

AllenAI's DiScoFormer trains a single transformer to jointly estimate probability densities and score functions across multiple distributions. No benchmark numbers or architecture details are available from the source beyond the title.

#8[r/ClaudeAI]

20h ago

Graphify Reaches 73k Stars With 71x Token Reduction for Repo Queries

300 upvotes · 35 comments

Graphify converts repos, PDFs, SQL schemas, and Obsidian vaults into knowledge graphs queryable by Claude, reducing tokens per query by ~71x. At 2.2M downloads in 2.5 months and now YC S26-backed, it has added session-persistent learning via a LESSONS.md feedback loop to reduce repeated errors.

#9[arXiv]

EpiKV Scores KV Cache Tokens via Representation Change, No Attention Matrix Needed

cs.LG, cs.CL

EpiKV replaces attention-weight-based token scoring with an "epiphany score": the change in internal model representations, read directly from the forward pass. This enables KV cache eviction that stays compatible with FlashAttention, since the attention matrix is never materialized. The method requires no training or custom kernels and, at a 4096-token cache budget, reportedly makes contexts up to 16x longer feasible.
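
The summary does not give the exact definition of the score, so the following is only one possible reading, sketched under stated assumptions: each token's score is taken to be the L2 distance between its hidden state before and after some layer, and eviction keeps the highest-scoring positions up to the cache budget. The function names and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def representation_change(before: Sequence[Sequence[float]],
                          after: Sequence[Sequence[float]]) -> List[float]:
    """Per-token score: how far each hidden state moved (assumed L2 distance)."""
    return [math.dist(b, a) for b, a in zip(before, after)]


def positions_to_keep(scores: Sequence[float], budget: int) -> List[int]:
    """Keep the `budget` highest-scoring token positions, in original order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


# Toy hidden states for four tokens (made-up numbers).
before = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
after = [[0.0, 0.0], [1.0, 4.0], [2.0, 2.5], [7.0, 6.0]]

scores = representation_change(before, after)
kept = positions_to_keep(scores, budget=2)
```

Because the score here depends only on hidden states, nothing in the sketch needs per-token attention weights, which is the property the summary attributes to the method.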

#10[TLDRNEWSLETTER]

7h ago

Google Retrofits Multi-Token Prediction onto Frozen Gemini Nano v3 for Mobile

Google's approach adds Multi-Token Prediction components to already-frozen Gemini Nano v3 weights, avoiding full retraining while improving inference efficiency for on-device deployment. The architecture targets extreme edge constraints on Pixel hardware.

#11[THEVERGE]

1d ago

ChatGPT Conversation Logs Used as Criminal Evidence in Arson Trial

Prosecutors in the Palisades fire arson case used a defendant's ChatGPT chat history alongside location data and camera footage as trial evidence, marking an early precedent for LLM logs in criminal proceedings.

#12[HUGGINGFACE]

Conversational Infill Hides Reasoner Latency in Voice Agents Using a Small Talker Model

A small real-time talker model generates contextually grounded filler responses immediately while a slower reasoner model runs in parallel, then fluently integrates the reasoner's streamed output mid-response. Trained on a 290,571-example synthetic dataset across six domains, the approach is validated across seven small models, decoupling latency from capability in voice agents.
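
The talker/reasoner split can be sketched with `asyncio` as below. This is an illustration under assumptions, not the paper's system: `quick_talker`, `slow_reasoner`, and `respond` are hypothetical names, the reasoner is simulated with a short sleep, and the "integration" step is plain concatenation rather than a learned fluent merge.

```python
import asyncio
from typing import List, Optional


def quick_talker(query: str) -> str:
    """Stand-in for the small model: returns a filler phrase immediately."""
    return "Let me check on that. "


async def slow_reasoner(query: str, out: "asyncio.Queue[Optional[str]]") -> None:
    """Stand-in for the slow model: streams chunks after a delay."""
    await asyncio.sleep(0.05)          # simulated reasoning latency
    for chunk in ("Your order", " ships Friday."):
        await out.put(chunk)
    await out.put(None)                # end-of-stream marker


async def respond(query: str) -> List[str]:
    chunks: "asyncio.Queue[Optional[str]]" = asyncio.Queue()
    # Start the reasoner in the background before speaking.
    task = asyncio.create_task(slow_reasoner(query, chunks))
    spoken = [quick_talker(query)]     # filler is emitted without waiting
    while (chunk := await chunks.get()) is not None:
        spoken.append(chunk)           # append reasoner output as it streams in
    await task
    return spoken


spoken = asyncio.run(respond("Where is my order?"))
```

The point of the structure is that the first audible output depends only on the talker's latency, while the reasoner's latency is hidden behind the filler.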

#13[@claudeai]

2h ago

Claude Opus 4.8 and Haiku 4.5 Now GA on Azure via Microsoft Foundry

Anthropic's Claude Opus 4.8 and Claude Haiku 4.5 are generally available on Azure through Microsoft Foundry, with native Azure authentication, billing, and commitment drawdown support. Enterprise Azure customers can now use Claude models without separate Anthropic contracts.

#14[GH]

6h ago

VulnClaw Automates Full Pentest Pipeline via AI Agent and MCP Toolchain

★ 105 new · 1,032 total

VulnClaw is an open-source AI agent that chains recon, vulnerability discovery, exploitation, and report generation from natural language input using an LLM plus MCP tool orchestration. It automates the full penetration testing workflow end-to-end without manual tool invocation between stages.

#15[HN]

19h ago

PFG-1 Sophon: 330 GB On-Die DRAM, 4,200 TFLOPS FP8 on 750mm² Die

16 pts · 7 comments

PhantaField's PFG-1 Sophon is a monolithic 3D AI ASIC that uses 32-tier 2D-TMD gain-cell DRAM to put 330 GB of weight storage on-die, eliminating HBM entirely. With 131,072 compute-in-memory tiles running bit-serial activations at 500 MHz, it claims 4,200 TFLOPS FP8 and 2,100 TFLOPS BF16 on a single 750mm² die built on 28nm CMOS. The company also claims that the same die handles both training and low-batch inference decode at compute-bound rates.

#16[META_AI]

Brain2Qwerty Decodes Typed Text from Brain Waves Without Surgery

Brain2Qwerty is a non-invasive BCI system that translates brain signals into typed characters, offering a communication pathway that avoids surgical implantation. No benchmark accuracy figures or model architecture details are available from the summary.

#17[r/LocalLLaMA]

1m ago

DeepSeek V4 Support Merged Into llama.cpp

135 upvotes · 21 comments

A pull request adding DeepSeek V4 support has been merged into llama.cpp, enabling local inference with GGUF quantized weights. Users can run it now by pulling the latest source, rebuilding with cmake, and downloading the relevant GGUFs.

#18[arXiv]

Compliant Persona Steering Drops Llama Refusal Rate from 97% to 2%

cs.AI

In Qwen2.5-7B-Instruct and Llama-3.1-8B-Instruct, a compliant persona linear direction in activation space gates the refusal direction — steering toward compliance drops Llama's refusal rate from 97% to 2%. Refusal is computed earlier but expressed at late layers, meaning single-direction refusal interventions miss this dependency.

#19[TLDRNEWSLETTER]

7h ago

Anthropic Index: High-Wage Tasks Use 2.5x More Tokens

Anthropic's June 2026 Economic Index finds AI compute consumption scales with task economic value, with higher-wage occupations using up to 2.5x more tokens than lower-wage ones. The correlation suggests token usage may serve as a proxy for economic complexity when modeling AI adoption and cost.

#20[TECHCRUNCH]

1h ago

Samsung and SK Hynix Commit $550B+ to Expand HBM and Memory Capacity

Samsung and SK Hynix are pledging over $550B combined to build new memory fabs, directly targeting the HBM supply crunch constraining AI accelerator throughput. South Korea is positioning this as a national AI infrastructure strategy.

#21[HUGGINGFACE]

Object-Centric Residual RL Transfers Sim-Trained Policies to Real VLAs Zero-Shot

A residual RL framework trained purely in simulation refines frozen VLA actions using object poses rather than raw images or privileged simulator state, sidestepping the visual domain gap and avoiding costly real-world RL. The compact object-centric observation space enables zero-shot sim-to-real transfer on top of existing VLAs without retraining them.
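
The residual idea can be sketched as a correction added to the frozen policy's action. This is a minimal sketch under assumptions: the function name, the `scale` factor, and the clipping `limit` are hypothetical, and in the described framework the residual would come from a policy that observes object poses rather than being hard-coded.

```python
from typing import List, Sequence


def refined_action(base: Sequence[float],
                   residual: Sequence[float],
                   scale: float = 0.5,
                   limit: float = 1.0) -> List[float]:
    """Add a scaled residual to the frozen VLA's action and clip to bounds."""
    return [max(-limit, min(limit, b + scale * r))
            for b, r in zip(base, residual)]


# Made-up numbers: the base action comes from the frozen VLA, the residual
# from a (hypothetical) sim-trained policy that only saw object poses.
base_action = [0.10, 0.00, -0.05]
pose_residual = [0.20, -0.40, 0.10]

action = refined_action(base_action, pose_residual)
clipped = refined_action([0.9], [1.0])
```

Because the VLA's weights are never updated, only the small residual policy has to cross the sim-to-real gap, and it does so on pose inputs rather than pixels.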

#22[@bcherny]

3h ago

Claude Code Gets Background Subagents in Next Release

Claude Code's next version will run subagents in the background by default, allowing continued interaction with the primary agent while parallel tasks execute. Users can override to foreground execution on demand.

#23[GH]

1d ago

Feed-Forward 3D Foundation Model Reconstructs Scenes from Streaming Data

★ 372 new · 8,042 total

LingBot-Map is a feed-forward 3D foundation model designed for scene reconstruction from streaming sensor data, targeting real-time or near-real-time 3D understanding pipelines.

#24[HN]

1d ago

Google Caps Meta's Gemini API Access Due to Compute Constraints

23 pts · 5 comments

Google told Meta in March it could not fulfill the full Gemini compute capacity Meta sought to purchase, disrupting and delaying some of Meta's internal AI projects. Several other Google clients were also affected, though less severely. This signals real supply constraints on frontier model API access even for hyperscale buyers.

#25[OPENAI]

6h ago

OpenAI Report Maps EU Job Automation Risk by Occupation

OpenAI published an analysis of AI's potential impact on EU labor markets, categorizing occupations by automation risk, growth potential, and workflow disruption. The report targets policymakers and workforce planners but lacks specific model or benchmark citations in the summary provided.
