AI - Artificial intelligence

176

1

A guide to understanding AI as normal technology (www.normaltech.ai)

submitted 4 months ago by cm0002@literature.cafe to c/Aii@programming.dev

0 comments fedilink

177

1

Tenstorrent downgrading Blackhole p150 cards from 140 to 120 tensor cores via firmware update - will ship cards with 120 tensor cores going forward, company claims users should expect 1-2% perf drop (www.tomshardware.com)

submitted 4 months ago by cm0002@literature.cafe to c/Aii@programming.dev

0 comments fedilink

Unlike many startups producing nothing but vaporware, Jim Keller's Tenstorrent has actually delivered impressive-looking RISC-V AI accelerators, but there could be some trouble brewing. Starting with firmware version 19.5.0, the firm has cut the tensor core count on Blackhole p150 cards from 140 to 120, affecting both new cards and existing units already in customers' hands.

The news was apparently communicated to customers via email, with the same wording present on the firmware update's GitHub page. Tenstorrent isn't elaborating on why the change was made, leaving existing and potential buyers scratching their heads.

178

1

Mistral drops Voxtral Transcribe 2, an open-source speech model that runs on-device for pennies (venturebeat.com)

submitted 4 months ago by cm0002@literature.cafe to c/Aii@programming.dev

0 comments fedilink

179

1

Arkansas attorney resigns after using AI to assist in case work (www.thv11.com)

submitted 4 months ago by cm0002@literature.cafe to c/Aii@programming.dev

0 comments fedilink

180

1

On craft and AI (slightknack.dev)

submitted 4 months ago by codeinabox@programming.dev to c/Aii@programming.dev

0 comments fedilink

cross-posted from: https://lemmy.bestiver.se/post/905430

181

1

Union leaders have a message for Newsom: Regulate AI if you want to be president (calmatters.org)

submitted 4 months ago by cm0002@literature.cafe to c/Aii@programming.dev

0 comments fedilink

182

1

The Third Summit on Responsible AI in the Military Domain (REAIM) (www.justsecurity.org)

submitted 4 months ago by cm0002@literature.cafe to c/Aii@programming.dev

0 comments fedilink

183

1

qwen3-TTS-studio: ElevenLabs-style voice cloning + NotebookLM-style podcast generation, but local (github.com)

submitted 4 months ago by cm0002@literature.cafe to c/Aii@programming.dev

0 comments fedilink

Clone any voice with just a 3-second audio sample

Fine-tune parameters (temperature, top-k, top-p) with quality presets

Generate complete podcasts from just a topic – AI writes the script, assigns voices, and synthesizes everything

10 languages supported (Korean, English, Chinese, Japanese, etc.)

Currently uses gpt5.2 for script generation, but the architecture is modular – you can swap in any local LLM (Qwen, Llama, etc.) if you want a fully local setup.
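For readers unfamiliar with the three sampling parameters named in the feature list, here is a minimal sketch of what temperature, top-k, and top-p conventionally do when a model picks its next token. It is not taken from qwen3-TTS-studio; the function names, the toy token dictionary, and the default values are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, Optional


def filter_distribution(logits: Dict[str, float], temperature: float = 1.0,
                        top_k: int = 0, top_p: float = 1.0) -> Dict[str, float]:
    """Return the renormalised distribution left after the three filters."""
    # Temperature: divide logits before softmax; lower values sharpen the peak.
    scaled = {tok: z / temperature for tok, z in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(z - peak) for tok, z in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((w / total, tok) for tok, w in weights.items()), reverse=True)
    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for prob, tok in ranked:
        kept.append((prob, tok))
        mass += prob
        if mass >= top_p:
            break
    return {tok: prob / mass for prob, tok in kept}


def sample(logits: Dict[str, float], rng: Optional[random.Random] = None,
           **kwargs) -> str:
    """Draw one token from the filtered distribution."""
    rng = rng or random.Random()
    dist = filter_distribution(logits, **kwargs)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


if __name__ == "__main__":
    toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0}  # hypothetical scores
    print(filter_distribution(toy_logits, temperature=0.7, top_k=2))
    print(sample(toy_logits, rng=random.Random(0), top_p=0.9))
```

A "quality preset" in this picture would simply be a named bundle of these three values; how the project actually defines its presets is not stated in the post.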

184

1

Understanding LLM Inference Engines: Inside Nano-vLLM (Part 1) (neutree.ai)

submitted 4 months ago by cm0002@digipres.cafe to c/Aii@programming.dev

0 comments fedilink

185

1

QuitGPT — OpenAI Execs are Trump's Biggest Donors (quitgpt.org)

submitted 4 months ago by codeinabox@programming.dev to c/Aii@programming.dev

0 comments fedilink

186

1

LingBot-World is Google Genie 3, but open source (technology.robbyant.com)

submitted 4 months ago by cm0002@digipres.cafe to c/Aii@programming.dev

0 comments fedilink

187

1

Gaming market melts down after Google reveals new AI game design tool — Project Genie crashes stocks for Roblox, Nintendo, CD Projekt Red, and more (www.tomshardware.com)

submitted 4 months ago by cm0002@digipres.cafe to c/Aii@programming.dev

0 comments fedilink

188

1

AI agents now have their own Reddit-style social network, and it's getting weird fast (arstechnica.com)

submitted 4 months ago by cm0002@digipres.cafe to c/Aii@programming.dev

0 comments fedilink

189

1

Can AI companies become profitable? (epoch.ai)

submitted 4 months ago by codeinabox@programming.dev to c/Aii@programming.dev

0 comments fedilink

cross-posted from: https://lemmy.bestiver.se/post/891009

190

1

Vibe coding: Pros, cons, and 2026 forecasts from PVS-Studio (pvs-studio.com)

submitted 4 months ago by cm0002@suppo.fi to c/Aii@programming.dev

0 comments fedilink

191

1

Anthropic CEO Amodei warns of AI’s fast-coming changes (www.semafor.com)

submitted 4 months ago by cm0002@suppo.fi to c/Aii@programming.dev

0 comments fedilink

192

1

Why A.I. Can’t Make Thoughtful Decisions (www.nytimes.com)

submitted 5 months ago by cm0002@no.lastname.nz to c/Aii@programming.dev

0 comments fedilink

193

1

IQuest Coder - State-of-the-Art Open-Source Code LLM (iquestcoder.ai)

submitted 5 months ago by cm0002@no.lastname.nz to c/Aii@programming.dev

0 comments fedilink

194

1

Latest ChatGPT model uses Elon Musk’s Grokipedia as source, tests reveal (www.theguardian.com)

submitted 5 months ago by codeinabox@programming.dev to c/Aii@programming.dev

0 comments fedilink

195

1

AI tribalism (nolanlawson.com)

submitted 5 months ago by codeinabox@programming.dev to c/Aii@programming.dev

0 comments fedilink

196

1

Microsoft CEO Nadella’s ‘telltale sign’ of AI bubble (www.seattletimes.com)

submitted 5 months ago by monica_b1998@lemmy.world to c/Aii@programming.dev

0 comments fedilink

197

1

Cutting LLM Memory by 84%: A Deep Dive into Fused Kernels (towardsdatascience.com)

submitted 5 months ago* (last edited 5 months ago) by cm0002@lemmings.world to c/Aii@programming.dev

0 comments fedilink

If you’ve ever trained or fine-tuned an LLM, you’ve likely hit a wall at the very last step: the Cross-Entropy Loss.

The culprit is the logit bottleneck. To predict the next token, we project a hidden state into a massive vocabulary space. For Llama 3 (a vocabulary of 128,256 tokens), the weight matrix alone is over 525 million parameters. While that’s only ~1GB in bfloat16, the intermediate logit tensor is the real issue. For large batches, it can easily exceed 80GB of VRAM just to compute a single scalar loss.
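The figures in that paragraph can be checked with a few lines of arithmetic. This is a rough sketch: the hidden width of 4,096 (the Llama 3 8B configuration), the float32 logits, and the example batch shape are assumptions, not numbers from the article.

```python
# Back-of-the-envelope sizes for the output projection and its logits.
VOCAB_SIZE = 128_256      # Llama 3 vocabulary size, from the post
HIDDEN_SIZE = 4_096       # assumption: hidden width of the 8B model
BYTES_BF16 = 2
BYTES_FP32 = 4            # assumption: logits upcast to float32 for the loss


def weight_params(vocab: int = VOCAB_SIZE, hidden: int = HIDDEN_SIZE) -> int:
    """Number of parameters in the hidden -> vocabulary projection."""
    return vocab * hidden


def logit_bytes(num_tokens: int, vocab: int = VOCAB_SIZE,
                bytes_per_value: int = BYTES_FP32) -> int:
    """Bytes needed to materialise one [num_tokens, vocab] logit tensor."""
    return num_tokens * vocab * bytes_per_value


if __name__ == "__main__":
    params = weight_params()
    print(f"projection parameters: {params:,}")
    print(f"projection in bfloat16: {params * BYTES_BF16 / 1e9:.2f} GB")
    # Hypothetical batch: 32 sequences of 8,192 tokens each.
    tokens = 32 * 8_192
    print(f"logits for {tokens:,} tokens: {logit_bytes(tokens) / 1e9:.1f} GB")
```

The weight matrix stays fixed at roughly 1 GB, while the logit tensor grows linearly with the number of tokens in the batch, which is why it dominates at scale.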

Optimising this layer is how libraries like Unsloth and Liger-Kernel achieve such massive memory reductions. In this article, we’ll build a fused Linear + Cross Entropy kernel from scratch in Triton. We will derive the math and implement a tiled forward and backward pass that slashes peak memory usage by 84%.
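The core idea behind such a fused kernel can be illustrated without a GPU. The sketch below is plain Python, not Triton, and covers only the forward pass for a single token: it walks the vocabulary in tiles and keeps a running log-sum-exp, so the full logit vector is never stored. The function names, tile size, and toy weights are assumptions for illustration; the article's actual kernel is not reproduced here.

```python
import math
from typing import List, Sequence


def naive_loss(hidden: Sequence[float], weight: List[List[float]],
               target: int) -> float:
    """Reference: build every logit first, then apply cross-entropy."""
    logits = [sum(h * w for h, w in zip(hidden, row)) for row in weight]
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def tiled_loss(hidden: Sequence[float], weight: List[List[float]],
               target: int, tile: int = 4) -> float:
    """Same loss, but only `tile` logits exist at any moment."""
    running_max = -math.inf
    running_sum = 0.0          # sum of exp(logit - running_max) seen so far
    target_logit = 0.0
    for start in range(0, len(weight), tile):
        rows = weight[start:start + tile]
        logits = [sum(h * w for h, w in zip(hidden, row)) for row in rows]
        new_max = max(running_max, max(logits))
        # Rescale the old sum to the new maximum before adding this tile.
        running_sum = running_sum * math.exp(running_max - new_max)
        running_sum += sum(math.exp(z - new_max) for z in logits)
        running_max = new_max
        if start <= target < start + len(rows):
            target_logit = logits[target - start]
    return running_max + math.log(running_sum) - target_logit


if __name__ == "__main__":
    hidden = [0.5, -1.0, 2.0]
    # Toy weight matrix: 10 vocabulary rows, 3 hidden columns.
    weight = [[(i * 7 + j * 3) % 5 - 2.0 for j in range(3)] for i in range(10)]
    print(naive_loss(hidden, weight, target=6))
    print(tiled_loss(hidden, weight, target=6, tile=4))
```

Both functions return the same value up to floating-point rounding; the difference is that peak memory in the tiled version scales with the tile size rather than the vocabulary size.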

198

1

Shrinking AI memory boosts accuracy (www.ed.ac.uk)

submitted 5 months ago by cm0002@lemmings.world to c/Aii@programming.dev

0 comments fedilink

199

1

DeepSeek just published a paper on conditional memory via scalable lookup (github.com)

submitted 5 months ago by cm0002@lemmy.cafe to c/Aii@programming.dev

0 comments fedilink

The paper argues that we have been wasting a lot of expensive GPU cycles by forcing transformers to relearn static things, like names or common phrases, through deep computation. Standard models have no way to simply look something up, so they end up simulating memory by passing tokens through layer after layer of feed-forward networks. DeepSeek introduces a module called Engram, which adds a dedicated lookup step for local N-gram patterns. It offers a new way to scale a model, separate from the usual compute-heavy Mixture of Experts approach.

The architecture uses multi-head hashing to fetch static embeddings for specific token sequences, which are then filtered through a context-aware gate to make sure they actually fit the current context. The authors found a U-shaped scaling law, where the best performance comes from splitting the parameter budget between neural computation and this static memory. By letting the memory handle the simple local associations, the model can effectively act as if it were deeper, because the early layers are not bogged down with basic reconstruction.
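To make the lookup-then-gate idea concrete, here is a toy sketch in plain Python. It is only loosely modelled on the description above: the hash function, table sizes, averaging across heads, and the sigmoid-of-dot-product gate are all assumptions chosen for brevity, not details taken from the paper.

```python
import math
import random
from typing import List, Sequence

NUM_HEADS = 2        # hypothetical number of hash heads
TABLE_SIZE = 1_024   # hypothetical slots per head
DIM = 4              # hypothetical embedding width
NGRAM = 2            # hypothetical N-gram length

_rng = random.Random(0)
# One static embedding table per hash head.
TABLES = [[[_rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(TABLE_SIZE)]
          for _ in range(NUM_HEADS)]


def slot(ngram: Sequence[int], head: int) -> int:
    """Deterministic hash of an N-gram, salted per head (illustrative only)."""
    value = 1_000_003 * (head + 1)
    for token in ngram:
        value = (value * 31 + token) % 2_147_483_647
    return value % TABLE_SIZE


def lookup(tokens: Sequence[int], position: int) -> List[float]:
    """Average the per-head embeddings for the N-gram ending at `position`."""
    ngram = tokens[max(0, position - NGRAM + 1): position + 1]
    rows = [TABLES[h][slot(ngram, h)] for h in range(NUM_HEADS)]
    return [sum(col) / NUM_HEADS for col in zip(*rows)]


def gated_add(hidden: Sequence[float], memory: Sequence[float]) -> List[float]:
    """Scale the retrieved vector by a sigmoid gate computed from the context."""
    score = sum(h * m for h, m in zip(hidden, memory)) / math.sqrt(len(hidden))
    gate = 1.0 / (1.0 + math.exp(-score))
    return [h + gate * m for h, m in zip(hidden, memory)]


if __name__ == "__main__":
    tokens = [5, 9, 2]
    memory = lookup(tokens, position=2)       # depends only on the bigram (9, 2)
    hidden = [0.1, -0.3, 0.7, 0.2]            # stand-in for a layer's hidden state
    print(gated_add(hidden, memory))
```

Using several hash heads means two different N-grams that collide in one table are unlikely to collide in all of them, and the gate lets the hidden state suppress a retrieved vector that does not match the context.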

One of the best bits is how they handle hardware constraints by offloading the massive lookup tables to host RAM. Since these lookups are deterministic given the input tokens, the system can prefetch the data from CPU memory before the GPU even needs it. This means you can scale to tens of billions of extra parameters with almost zero impact on speed, since the retrieval happens while the previous layers are still calculating.
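The overlap described here can be mimicked on a CPU with a background thread. In this sketch, a Python list stands in for the host-RAM table, `fetch` stands in for the host-to-device copy, and `early_layers` stands in for GPU compute; every name and the toy hash are hypothetical, and real systems would use asynchronous device copies rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence

# Stands in for a large embedding table kept in host RAM.
HOST_TABLE = [[float(i), float(i) * 0.5] for i in range(1_000)]


def slots_for(tokens: Sequence[int]) -> List[int]:
    """Indices depend only on the input tokens, so they are known up front."""
    return [(31 * prev + cur) % len(HOST_TABLE)
            for prev, cur in zip(tokens, tokens[1:])]


def fetch(slots: Sequence[int]) -> List[List[float]]:
    """Stand-in for copying the selected rows from host to device memory."""
    return [HOST_TABLE[s] for s in slots]


def early_layers(tokens: Sequence[int]) -> List[float]:
    """Stand-in for the compute that runs before the lookup is needed."""
    return [float(t) for t in tokens[1:]]


def forward(tokens: Sequence[int]) -> List[float]:
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch, slots_for(tokens))  # start the copy first
        hidden = early_layers(tokens)                    # runs while the copy is in flight
        rows = pending.result()                          # blocks only if the copy is unfinished
    return [h + row[0] for h, row in zip(hidden, rows)]


if __name__ == "__main__":
    print(forward([1, 2, 3]))
```

The point of the sketch is the ordering: because `slots_for` needs nothing but the token ids, the fetch can be issued before any layer has run.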

The benchmarks show that this pays off across the board, especially in long-context tasks where the model needs its attention focused on global details rather than local phrases. It turns out that even in math and coding the model gets a boost, because it is no longer wasting its internal reasoning depth on things that should just be in a lookup table. Moving forward, this kind of conditional memory could become a standard part of sparse models because it bypasses the physical memory limits of current hardware.

200

1

Mistral Small Creative beats Claude Opus 4.5 at explaining transformers — 50x cheaper, higher scores (substack.com)

submitted 5 months ago by cm0002@lemmy.cafe to c/Aii@programming.dev

0 comments fedilink