Welcome to the EasyTechDigest Latest Tech News Hub — your go-to source for the latest updates in technology, delivered in real time. Explore breaking stories in AI, gadgets, cybersecurity, startups, and the biggest moves from companies like Apple, Google, and Microsoft.
Get fresh headlines every day from top sites like TechCrunch, Wired, The Verge, Ars Technica, and more — all in one place.
Featured Headlines
- ‘Uncanny Valley’: ICE’s Secret Expansion Plans, Palantir Workers’ Ethical Concerns, and AI Assistants, by Brian Barrett, Zoë Schiffer, and Leah Feiger on February 12, 2026 at 10:12 pm
In this episode of Uncanny Valley, our hosts dive into WIRED’s scoop about a secret Trump administration campaign extending right into your backyard.
- Musk needed a new vision for SpaceX and xAI. He landed on Moonbase Alpha. By Tim Fernholz on February 12, 2026 at 10:10 pm
"I really want to see a mass driver on the moon that is shooting AI satellites into deep space."
- Didero lands $30M to put manufacturing procurement on ‘agentic’ autopilot, by Marina Temkin on February 12, 2026 at 8:31 pm
Didero functions as an agentic AI layer that sits on top of a company’s existing ERP, acting as a coordinator that reads incoming communications and automatically executes the necessary updates and tasks.
- Anthropic raises another $30B in Series G, with a new value of $380B, by Lucas Ropek on February 12, 2026 at 8:18 pm
The infusion of funding comes as the AI startup vies with its competitor, OpenAI, for customers and cultural attention.
- YouTube finally launches a dedicated app for Apple Vision Pro, by Lauren Forristal on February 12, 2026 at 8:14 pm
When the Apple Vision Pro first came out two years ago, YouTube hesitated to release a dedicated app. Today, it officially launched one.
AI & Machine Learning
- ‘Uncanny Valley’: ICE’s Secret Expansion Plans, Palantir Workers’ Ethical Concerns, and AI Assistants, by Brian Barrett, Zoë Schiffer, and Leah Feiger on February 12, 2026 at 10:12 pm
In this episode of Uncanny Valley, our hosts dive into WIRED’s scoop about a secret Trump administration campaign extending right into your backyard.
- A Wave of Unexplained Bot Traffic Is Sweeping the Web, by Zeyi Yang on February 12, 2026 at 7:50 pm
From small publishers to US federal agencies, websites are reporting unusual spikes in automated traffic linked to IP addresses in Lanzhou, China.
- OpenAI’s President Gave Millions to Trump. He Says It’s for Humanity. By Maxwell Zeff on February 12, 2026 at 7:00 pm
In an interview with WIRED, Greg Brockman says his political donations support OpenAI's mission—even if some employees at the company disagree.
- Crypto-Funded Human Trafficking Is Exploding, by Andy Greenberg on February 12, 2026 at 1:00 pm
The use of cryptocurrency in sales of human beings for prostitution and scam compounds nearly doubled in 2025, according to a conservative estimate. Many of the deals are happening in plain sight.
- I Tried RentAHuman, Where AI Agents Hired Me to Hype Their AI Startups, by Reece Rogers on February 12, 2026 at 11:00 am
Rather than offering a revolutionary new approach to gig work, RentAHuman is filled with bots that just want me to be another cog in the AI hype machine.
Gadgets & Hardware
- Feed has no items.
Big Tech (Apple, Google, Microsoft)
- Trump FTC wants Apple News to promote more Fox News and Breitbart stories, by Jon Brodkin on February 12, 2026 at 8:30 pm
FTC claims Apple News suppresses conservatives, cites study by pro-Trump group.
- YouTube finally launches a dedicated app for Apple Vision Pro, by Lauren Forristal on February 12, 2026 at 8:14 pm
When the Apple Vision Pro first came out two years ago, YouTube hesitated to release a dedicated app. Today, it officially launched one.
- It took two years, but Google released a YouTube app on Vision Pro, by Samuel Axon on February 12, 2026 at 7:53 pm
App arrives months after Google requested takedowns of third-party options.
- Attackers prompted Gemini over 100,000 times while trying to clone it, Google says, by Benj Edwards on February 12, 2026 at 7:42 pm
Distillation technique lets copycats mimic Gemini at a fraction of the development cost.
- Apple acquires all rights to ‘Severance,’ will produce future seasons in-house, by Aisha Malik on February 12, 2026 at 3:34 pm
The show is expected to run for four seasons, with the possibility of spin-offs, a prequel, and foreign versions.
Cybersecurity
- Feed has no items.
Startups & Innovation
- Feed has no items.
Tech from Around the Web
- ‘Uncanny Valley’: ICE’s Secret Expansion Plans, Palantir Workers’ Ethical Concerns, and AI Assistants, by Brian Barrett, Zoë Schiffer, and Leah Feiger on February 12, 2026 at 10:12 pm
In this episode of Uncanny Valley, our hosts dive into WIRED’s scoop about a secret Trump administration campaign extending right into your backyard.
- Musk needed a new vision for SpaceX and xAI. He landed on Moonbase Alpha. By Tim Fernholz on February 12, 2026 at 10:10 pm
"I really want to see a mass driver on the moon that is shooting AI satellites into deep space."
- Nvidia’s new technique cuts LLM reasoning costs by 8x without losing accuracy, by Ben Dickson on February 12, 2026 at 10:00 pm
Researchers at Nvidia have developed a technique that can reduce the memory costs of large language model reasoning by up to eight times. Their technique, called dynamic memory sparsification (DMS), compresses the key-value (KV) cache, the temporary memory LLMs generate and store as they process prompts and reason through problems and documents.

While researchers have proposed various methods to compress this cache before, most struggle to do so without degrading the model's intelligence. Nvidia's approach manages to discard much of the cache while maintaining (and in some cases improving) the model's reasoning capabilities. Experiments show that DMS enables LLMs to "think" longer and explore more solutions without the usual penalty in speed or memory costs.

The bottleneck of reasoning

LLMs improve their performance on complex tasks by generating "chain-of-thought" tokens, essentially writing out their reasoning steps before arriving at a final answer. Inference-time scaling techniques leverage this by giving the model a larger budget to generate these thinking tokens or to explore multiple potential reasoning paths in parallel. However, this improved reasoning comes with a significant computational cost: as the model generates more tokens, it builds up a KV cache.

For real-world applications, the KV cache is a major bottleneck. As the reasoning chain grows, the cache grows linearly, consuming vast amounts of memory on GPUs. This forces the hardware to spend more time reading data from memory than actually computing, which slows down generation and increases latency. It also caps the number of users a system can serve simultaneously, as running out of VRAM causes the system to crash or slow to a crawl.

Nvidia researchers frame this not just as a technical hurdle, but as a fundamental economic one for the enterprise. "The question isn't just about hardware quantity; it's about whether your infrastructure is processing 100 reasoning threads or 800 threads for the same cost," Piotr Nawrot, Senior Deep Learning Engineer at Nvidia, told VentureBeat.

Previous attempts to solve this focused on heuristics-based approaches. These methods use rigid rules, such as a "sliding window" that only caches the most recent tokens and deletes the rest. While this reduces memory usage, it often forces the model to discard critical information required for solving the problem, degrading the accuracy of the output. "Standard eviction methods attempt to select old and unused tokens for eviction using heuristics," the researchers said. "They simplify the problem, hoping that if they approximate the model's internal mechanics, the answer will remain correct." Other solutions use paging to offload the unused parts of the KV cache to slower memory, but the constant swapping of data introduces latency overhead that makes real-time applications sluggish.

Dynamic memory sparsification

DMS takes a different approach by "retrofitting" existing LLMs to intelligently manage their own memory. Rather than applying a fixed rule for what to delete, DMS trains the model to identify which tokens are essential for future reasoning and which are disposable. "It doesn't just guess importance; it learns a policy that explicitly preserves the model's final output distribution," Nawrot said.

The process transforms a standard, pre-trained LLM such as Llama 3 or Qwen 3 into a self-compressing model. Crucially, this does not require training the model from scratch, which would be prohibitively expensive. Instead, DMS repurposes existing neurons within the model's attention layers to output a "keep" or "evict" signal for each token.

For teams worried about the complexity of retrofitting, the researchers noted that the process is designed to be lightweight. "To improve the efficiency of this process, the model's weights can be frozen, which makes the process similar to Low-Rank Adaptation (LoRA)," Nawrot said. This means a standard enterprise model like Qwen3-8B "can be retrofitted with DMS within hours on a single DGX H100."

One of the important parts of DMS is a mechanism called "delayed eviction." In standard sparsification, if a token is deemed unimportant, it is deleted immediately. This is risky because the model might need a split second to integrate that token's context into its current state. DMS mitigates this by flagging a token for eviction but keeping it accessible for a short window of time (e.g., a few hundred steps). This delay allows the model to "extract" any remaining necessary information from the token and merge it into the current context before the token is wiped from the KV cache.

"The 'delayed eviction' mechanism is crucial because not all tokens are simply 'important' (keep forever) or 'useless' (delete immediately). Many fall in between — they carry some information, but not enough to justify occupying an entire slot in memory," Nawrot said. "This is where the redundancy lies. By keeping these tokens in a local window for a short time before eviction, we allow the model to attend to them and redistribute their information into future tokens."

The researchers found that this retrofitting process is highly efficient. They could equip a pre-trained LLM with DMS in just 1,000 training steps, a tiny fraction of the compute required for the original training. The resulting models use standard kernels and can drop directly into existing high-performance inference stacks without custom hardware or complex software rewriting.

DMS in action

To validate the technique, the researchers applied DMS to several reasoning models, including the Qwen-R1 series (distilled from DeepSeek R1) and Llama 3.2, and tested them on difficult benchmarks like AIME 24 (math), GPQA Diamond (science), and LiveCodeBench (coding). The results show that DMS effectively moves the Pareto frontier, the optimal trade-off between cost and performance. On the AIME 24 math benchmark, a Qwen-R1 32B model equipped with DMS achieved a score 12.0 points higher than a standard model when constrained to the same memory bandwidth budget. By compressing the cache, the model could afford to "think" much deeper and wider than the standard model could for the same memory and compute budget.

Perhaps most surprisingly, DMS defied the common wisdom that compression hurts long-context understanding. In "needle-in-a-haystack" tests, which measure a model's ability to find a specific piece of information buried in a large document, DMS variants actually outperformed the standard models. By actively managing its memory rather than passively accumulating noise, the model maintained a cleaner, more useful context.

For enterprise infrastructure, the efficiency gains translate directly to throughput and hardware savings. Because the memory cache is significantly smaller, the GPU spends less time fetching data, reducing the wait time for users. In tests with the Qwen3-8B model, DMS matched the accuracy of the vanilla model while delivering up to 5x higher throughput. This means a single server can handle five times as many customer queries per second without a drop in quality.

The future of memory

Nvidia has released DMS as part of its KVPress library. Regarding how enterprises can get started with DMS, Nawrot emphasized that the barrier to entry is low. "The 'minimum viable infrastructure' is standard Hugging Face pipelines — no custom CUDA kernels are required," Nawrot said, noting that the code is fully compatible with standard FlashAttention.

Looking ahead, the team views DMS as part of a larger shift where memory management becomes a distinct, intelligent layer of the AI stack. Nawrot also confirmed that DMS is "fully compatible" with newer architectures like the Multi-Head Latent Attention (MLA) used in DeepSeek's models, suggesting that combining these approaches could yield even greater efficiency gains.

As enterprises move from simple chatbots to complex agentic systems that require extended reasoning, the cost of inference is becoming a primary concern. Techniques like DMS provide a path to scale these capabilities sustainably. "We've barely scratched the surface of what is possible," Nawrot said, "and we expect inference-time scaling to further evolve."
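To make the keep-or-evict idea above concrete, here is a minimal Python sketch of a cache with a delayed-eviction grace window. It illustrates the concept only; it is not Nvidia's DMS code or the KVPress API, and the class name, score threshold, and window length are assumptions made for the example.

```python
# Minimal, illustrative sketch of a "delayed eviction" KV-cache policy, assuming a
# per-token eviction score already produced by the model. NOT Nvidia's DMS code or
# the KVPress API; names, threshold, and window size are made up for clarity.
from collections import deque

class DelayedEvictionCache:
    def __init__(self, delay_steps=256, evict_threshold=0.5):
        self.delay_steps = delay_steps          # grace window before a flagged token is dropped
        self.evict_threshold = evict_threshold  # score above which a token counts as disposable
        self.entries = {}                       # token index -> (key, value)
        self.pending = deque()                  # (step at which to evict, token index)
        self.step = 0

    def append(self, idx, key, value, evict_score):
        """Store a token's KV pair and, if its score marks it disposable,
        schedule it for eviction after the grace window."""
        self.entries[idx] = (key, value)
        if evict_score > self.evict_threshold:
            self.pending.append((self.step + self.delay_steps, idx))
        self.step += 1
        # Actually drop tokens whose grace window has expired.
        while self.pending and self.pending[0][0] <= self.step:
            _, expired = self.pending.popleft()
            self.entries.pop(expired, None)

    def readable(self):
        """KV pairs attention may still read, including flagged-but-not-yet-evicted tokens."""
        return list(self.entries.values())

# Toy usage: with a 2-step window, token 1 (score 0.9) stays readable for two more
# steps after being flagged, then disappears from the cache.
cache = DelayedEvictionCache(delay_steps=2)
for i, score in enumerate([0.1, 0.9, 0.1, 0.8]):
    cache.append(i, key=f"k{i}", value=f"v{i}", evict_score=score)
print(len(cache.readable()))  # 3: token 1 evicted, token 3 still inside its window
```

The point of the sketch is the grace window: a flagged token is not lost instantly, so later tokens can still attend to it and fold its information into the running context before it is removed.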
- Didero lands $30M to put manufacturing procurement on ‘agentic’ autopilot, by Marina Temkin on February 12, 2026 at 8:31 pm
Didero functions as an agentic AI layer that sits on top of a company’s existing ERP, acting as a coordinator that reads incoming communications and automatically executes the necessary updates and tasks.
- MiniMax's new open M2.5 and M2.5 Lightning near state-of-the-art while costing 1/20th of Claude Opus 4.6, by Carl Franzen on February 12, 2026 at 8:28 pm
Chinese AI startup MiniMax, headquartered in Shanghai, has sent shockwaves through the AI industry today with the release of its new M2.5 language model in two variants, which promise to make high-end artificial intelligence so cheap you might stop worrying about the bill entirely. It is also said to be "open source," though the weights (settings) and code haven't been posted yet, nor has the exact license type or terms. But that is almost beside the point given how cheaply MiniMax is serving it through its API and those of partners.

For the last few years, using the world's most powerful AI was like hiring an expensive consultant: brilliant, but you watched the clock (and the token count) constantly. M2.5 changes that math, dropping the cost of the frontier by as much as 95%. By delivering performance that rivals the top-tier models from Google and Anthropic at a fraction of the cost, particularly in agentic tool use for enterprise tasks such as creating Microsoft Word, Excel, and PowerPoint files, MiniMax is betting that the future isn't just about how smart a model is, but how often you can afford to use it. To this end, MiniMax says it worked "with senior professionals in fields such as finance, law, and social sciences" to ensure the model could perform real work up to their specifications and standards.

This release matters because it signals a shift from AI as a "chatbot" to AI as a "worker." When intelligence becomes "too cheap to meter," developers stop building simple Q&A tools and start building "agents": software that can spend hours autonomously coding, researching, and organizing complex projects without breaking the bank. MiniMax has already deployed the model into its own operations: the company says 30% of all tasks at MiniMax HQ are completed by M2.5, and 80% of its newly committed code is generated by M2.5. As the MiniMax team writes in its release blog post, "we believe that M2.5 provides virtually limitless possibilities for the development and operation of agents in the economy."

Technology: sparse power and the CISPO breakthrough

The secret to M2.5's efficiency lies in its Mixture of Experts (MoE) architecture. Rather than running all of its 230 billion parameters for every single word it generates, the model only "activates" 10 billion. This allows it to maintain the reasoning depth of a massive model while moving with the agility of a much smaller one.

To train this complex system, MiniMax developed a proprietary reinforcement learning (RL) framework called Forge. MiniMax engineer Olive Song said on the ThursdAI podcast on YouTube that this technique was instrumental in scaling performance even with the relatively small number of active parameters, and that the model was trained over a period of two months. Forge is designed to help the model learn from "real-world environments," essentially letting the AI practice coding and using tools in thousands of simulated workspaces. "What we realized is that there's a lot of potential with a small model like this if we train reinforcement learning on it with a large amount of environments and agents," Song said. "But it's not a very easy thing to do," Song added, noting that this was what the team spent "a lot of time" on.

To keep the model stable during this intense training, the team used a mathematical approach called CISPO (Clipping Importance Sampling Policy Optimization) and shared the formula on its blog. This formula ensures the model doesn't over-correct during training, allowing it to develop what MiniMax calls an "Architect Mindset": instead of jumping straight into writing code, M2.5 has learned to proactively plan the structure, features, and interface of a project first.

State-of-the-art (and near) benchmarks

The results of this architecture are reflected in the latest industry leaderboards. M2.5 hasn't just improved; it has vaulted into the top tier of coding models, approaching Anthropic's latest model, Claude Opus 4.6, released just a week ago, and showing that Chinese companies are now just days away from catching up to far better-resourced (in terms of GPUs) U.S. labs. Some of the new MiniMax M2.5 benchmark highlights:

- SWE-Bench Verified: 80.2%, matching Claude Opus 4.6
- BrowseComp: 76.3%, industry-leading search and tool use
- Multi-SWE-Bench: 51.3%, state of the art in multi-language coding
- BFCL (tool calling): 76.8%, high-precision agentic workflows

On the ThursdAI podcast, host Alex Volkov pointed out that MiniMax M2.5 operates extremely quickly and therefore uses fewer tokens to complete tasks, on the order of $0.15 per task compared to $3.00 for Claude Opus 4.6.

Breaking the cost barrier

MiniMax is offering two versions of the model through its API, both focused on high-volume production use:

- M2.5-Lightning: optimized for speed, delivering 100 tokens per second. It costs $0.30 per 1M input tokens and $2.40 per 1M output tokens.
- Standard M2.5: optimized for cost, running at 50 tokens per second. It costs half as much as the Lightning version ($0.15 per 1M input tokens / $1.20 per 1M output tokens).

In plain language: MiniMax claims you can run four "agents" (AI workers) continuously for an entire year for roughly $10,000. For enterprise users, this pricing is roughly 1/10th to 1/20th the cost of competing proprietary models like GPT-5 or Claude Opus 4.6. Here is how M2.5's pricing compares with other models (all prices in USD per 1M tokens; total is input plus output):

| Model | Input | Output | Total Cost | Source |
| --- | --- | --- | --- | --- |
| Qwen 3 Turbo | $0.05 | $0.20 | $0.25 | Alibaba Cloud |
| deepseek-chat (V3.2-Exp) | $0.28 | $0.42 | $0.70 | DeepSeek |
| deepseek-reasoner (V3.2-Exp) | $0.28 | $0.42 | $0.70 | DeepSeek |
| Grok 4.1 Fast (reasoning) | $0.20 | $0.50 | $0.70 | xAI |
| Grok 4.1 Fast (non-reasoning) | $0.20 | $0.50 | $0.70 | xAI |
| MiniMax M2.5 | $0.15 | $1.20 | $1.35 | MiniMax |
| MiniMax M2.5-Lightning | $0.30 | $2.40 | $2.70 | MiniMax |
| Gemini 3 Flash Preview | $0.50 | $3.00 | $3.50 | Google |
| Kimi-k2.5 | $0.60 | $3.00 | $3.60 | Moonshot |
| GLM-5 | $1.00 | $3.20 | $4.20 | Z.ai |
| ERNIE 5.0 | $0.85 | $3.40 | $4.25 | Baidu |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 | Anthropic |
| Qwen3-Max (2026-01-23) | $1.20 | $6.00 | $7.20 | Alibaba Cloud |
| Gemini 3 Pro (≤200K) | $2.00 | $12.00 | $14.00 | Google |
| GPT-5.2 | $1.75 | $14.00 | $15.75 | OpenAI |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 | Anthropic |
| Gemini 3 Pro (>200K) | $4.00 | $18.00 | $22.00 | Google |
| Claude Opus 4.6 | $5.00 | $25.00 | $30.00 | Anthropic |
| GPT-5.2 Pro | $21.00 | $168.00 | $189.00 | OpenAI |

Strategic implications for enterprises and leaders

For technical leaders, M2.5 represents more than just a cheaper API; it changes the operational playbook for enterprises right now. The pressure to "optimize" prompts to save money is gone. You can now deploy high-context, high-reasoning models for routine tasks that were previously cost-prohibitive. The 37% speed improvement in end-to-end task completion means the "agentic" pipelines valued by AI orchestrators, where models talk to other models, finally move fast enough for real-time user applications. In addition, M2.5's high scores in financial modeling (74.4% on MEWC) suggest it can handle the "tacit knowledge" of specialized industries like law and finance with minimal oversight.

Because M2.5 is positioned as an open-source model, organizations could potentially run intensive, automated code audits at a scale that was previously impossible without massive human intervention, all while maintaining better control over data privacy; until the licensing terms and weights are posted, however, "open source" remains a label rather than a verifiable claim. MiniMax M2.5 is a signal that the frontier of AI is no longer just about who can build the biggest brain, but who can make that brain the most useful, and most affordable, worker in the room.
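The per-million-token prices quoted above lend themselves to a quick back-of-the-envelope check. The short Python sketch below computes a per-task cost from the rates in the table; the 200K-input / 50K-output task size is an illustrative assumption, not a figure from MiniMax, Anthropic, or the podcast.

```python
# Back-of-the-envelope task-cost comparison using the per-1M-token API prices quoted
# in the table above. The 200K-input / 50K-output task size is an illustrative
# assumption, not a vendor figure.
PRICES = {  # USD per 1M tokens: (input, output)
    "MiniMax M2.5":           (0.15, 1.20),
    "MiniMax M2.5-Lightning": (0.30, 2.40),
    "GPT-5.2":                (1.75, 14.00),
    "Claude Opus 4.6":        (5.00, 25.00),
}

def task_cost(model, input_tokens, output_tokens):
    """Dollar cost of a single task for the given model and token counts."""
    inp, out = PRICES[model]
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out

# Example: an agentic task that reads ~200K tokens of context and writes ~50K tokens.
for model in PRICES:
    print(f"{model}: ${task_cost(model, 200_000, 50_000):.2f}")
# MiniMax M2.5 comes to about $0.09 and Claude Opus 4.6 to about $2.25 for this workload.
```

With these assumed token counts the gap works out to roughly 25x, in the same ballpark as the 1/10th-to-1/20th figure cited above; actual savings depend on how many tokens each model needs to finish a given task.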
- Anthropic raises another $30B in Series G, with a new value of $380B, by Lucas Ropek on February 12, 2026 at 8:18 pm
The infusion of funding comes as the AI startup vies with its competitor, OpenAI, for customers and cultural attention.
- YouTube finally launches a dedicated app for Apple Vision Pro, by Lauren Forristal on February 12, 2026 at 8:14 pm
When the Apple Vision Pro first came out two years ago, YouTube hesitated to release a dedicated app. Today, it officially launched one.
- A Wave of Unexplained Bot Traffic Is Sweeping the Web, by Zeyi Yang on February 12, 2026 at 7:50 pm
From small publishers to US federal agencies, websites are reporting unusual spikes in automated traffic linked to IP addresses in Lanzhou, China.
- Hacker linked to Epstein removed from Black Hat cyber conference website, by Lorenzo Franceschi-Bicchierai on February 12, 2026 at 7:15 pm
Emails published by the Justice Department revealed cybersecurity veteran Vincenzo Iozzo emailed, and arranged to meet, Jeffrey Epstein multiple times between 2014 and 2018.
- Trump administration undermines EPA enforcement of Clean Air Act, by Tim De Chant on February 12, 2026 at 7:13 pm
The EPA's new rule seeks to undo a 2009 finding that allowed the federal government to regulate six greenhouse gases.