$ timeahead_
all sourcesAhead of AI (Sebastian Raschka)Anthropic NewsApple Machine Learning ResearchArs Technica AIAWS Machine Learning BlogCerebras BlogCohere BlogCrewAI BlogDeepSeek BlogDistill.pubfast.ai BlogFireworks AI BlogGoogle AI BlogGoogle Cloud AI BlogGoogle DeepMind BlogGroq BlogHaystack (deepset) BlogHugging Face BlogImport AI (Jack Clark)LangChain BlogLangFuse BlogLil'Log (Lilian Weng)LlamaIndex BlogMeta AI BlogMicrosoft AutoGen BlogMicrosoft Research BlogMistral AI NewsMIT Technology ReviewModal Blogn8n BlogNathan Lambert (RLHF)NVIDIA Developer BlogOllama BlogOpenAI BlogPerplexity AI BlogPyTorch BlogReplicate BlogSimon Willison BlogTensorFlow BlogThe Batch (DeepLearning.AI)The GradientThe Verge AITogether AI BlogVentureBeat AIvLLM BlogWeights & Biases BlogWired AIxAI (Grok) Blog
allapiagentsframeworkshardwareinframodelopen sourcereleaseresearchtutorial
★ TOP STORY[ SWB ]Tutorial·2d ago

Quoting Maggie Appleton

23rd April 2026 [...] if you ever needed another reason to learn in public by digital gardening or podcasting or streaming or whathaveyou, add on that people will assume you’re more competent than you are. This will get you invites to very cool exclusive events filled with high-achieving, interesting people, even though you have no right to be there. A+ side benefit. — Maggie Appleton, Gathering Structures (via) Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026

Simon Willison Blogread →
▲ trending · last 48hview all →
[SWB]Simon Willison Blog· 37 articlesvisit →
2d ago
A pelican for GPT-5.5 via the semi-official Codex backdoor API
A pelican for GPT-5.5 via the semi-official Codex backdoor API 23rd April 2026 GPT-5.5 is out. It’s available in OpenAI Codex and is rolling out to paid ChatGPT subscribers. I’ve had some preview access and found it to be a fast, effective and highly capable model. As is usually the case these days, it’s hard to put into words what’s good about it—I ask it to build things and it builds exactly what I ask for! There’s one notable omission from today’s release—the API: API deployments require different safeguards and we are working closely with partners and customers on the safety and security requirements for serving it at scale. We’ll bring GPT‑5.5 and GPT‑5.5 Pro to the API very soon. When I run my pelican benchmark I always prefer to use an API, to avoid hidden system prompts in ChatGPT…
2dInfra#gpt
2d ago
Extract PDF text in your browser with LiteParse for the web
Extract PDF text in your browser with LiteParse for the web 23rd April 2026 LlamaIndex have a most excellent open source project called LiteParse, which provides a Node.js CLI tool for extracting text from PDFs. I got a version of LiteParse working entirely in the browser, using most of the same libraries that LiteParse uses to run in Node.js. Spatial text parsing Refreshingly, LiteParse doesn’t use AI models to do what it does: it’s good old-fashioned PDF parsing, falling back to Tesseract OCR (or other pluggable OCR engines) for PDFs that contain images of text rather than the text itself. The hard problem that LiteParse solves is extracting text in a sensible order despite the infuriating vagaries of PDF layouts. They describe this as “spatial text parsing”—they use some very clever heuristics to detect things like multi-column layouts and group…
3d ago
Is Claude Code going to cost $100/month? Probably not - it's all very confusing
Is Claude Code going to cost $100/month? Probably not—it’s all very confusing 22nd April 2026 Anthropic today quietly (as in silently, no announcement anywhere at all) updated their claude.com/pricing page (but not their Choosing a Claude plan page, which shows up first for me on Google) to add this tiny but significant detail (arrow is mine, and it’s already reverted): The Internet Archive copy from yesterday shows a checkbox there. Claude Code used to be a feature of the $20/month Pro plan, but according to the new pricing page it is now exclusive to the $100/month or $200/month Max plans. Update: don’t miss the update to this post, they’ve already changed course a few hours after this change went live. So what the heck is going on? Unsurprisingly, Reddit and Hacker News and Twitter all caught fire. I didn’t believe…
3d ago
Changes to GitHub Copilot Individual plans
22nd April 2026 - Link Blog Changes to GitHub Copilot Individual plans (via) On the same day as Claude Code's temporary will-they-won't-they $100/month kerfuffle (for the moment, they won't), here's the latest on GitHub Copilot pricing. Unlike Anthropic, GitHub put up an official announcement about their changes, which include tightening usage limits, pausing signups for individual plans (!), restricting Claude Opus 4.7 to the more expensive $39/month "Pro+" plan, and dropping the previous Opus models entirely. The key paragraph: Agentic workflows have fundamentally changed Copilot’s compute demands. Long-running, parallelized sessions now regularly consume far more resources than the original plan structure was built to support. As Copilot’s agentic capabilities have expanded rapidly, agents are doing more work, and more customers are hitting usage limits designed to maintain service reliability. It's easy to forget that just six months ago heavy LLM…
3dOpen Source#claude#coding
3d ago
Quoting Bobby Holley
22nd April 2026 As part of our continued collaboration with Anthropic, we had the opportunity to apply an early version of Claude Mythos Preview to Firefox. This week’s release of Firefox 150 includes fixes for 271 vulnerabilities identified during this initial evaluation. [...] Our experience is a hopeful one for teams who shake off the vertigo and get to work. You may need to reprioritize everything else to bring relentless and single-minded focus to the task, but there is light at the end of the tunnel. We are extremely proud of how our team rose to meet this challenge, and others will too. Our work isn’t finished, but we’ve turned the corner and can glimpse a future much better than just keeping up. Defenders finally have a chance to win, decisively. — Bobby Holley, CTO, Firefox Recent articles - DeepSeek…
3dResearch#claude
3d ago
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model
22nd April 2026 - Link Blog Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model (via) Big claims from Qwen about their latest open weight model: Qwen3.6-27B delivers flagship-level agentic coding performance, surpassing the previous-generation open-source flagship Qwen3.5-397B-A17B (397B total / 17B active MoE) across all major coding benchmarks. On Hugging Face Qwen3.5-397B-A17B is 807GB, this new Qwen3.6-27B is 55.6GB. I tried it out with the 16.8GB Unsloth Qwen3.6-27B-GGUF:Q4_K_M quantized version and llama-server using this recipe by benob on Hacker News, after first installing llama-server using brew install llama.cpp : llama-server \ -hf unsloth/Qwen3.6-27B-GGUF:Q4_K_M \ --no-mmproj \ --fit on \ -np 1 \ -c 65536 \ --cache-ram 4096 -ctxcp 2 \ --jinja \ --temp 0.6 \ --top-p 0.95 \ --top-k 20 \ --min-p 0.0 \ --presence-penalty 0.0 \ --repeat-penalty 1.0 \ --reasoning on \ --chat-template-kwargs '{"preserve_thinking": true}' On first run that…
4d ago
scosman/pelicans_riding_bicycles
21st April 2026 - Link Blog scosman/pelicans_riding_bicycles (via) I firmly approve of Steve Cosman's efforts to pollute the training set of pelicans riding bicycles. (To be fair, most of the examples I've published count as poisoning too.) Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026
4dModel#training
4d ago
Where's the raccoon with the ham radio? (ChatGPT Images 2.0)
Where’s the raccoon with the ham radio? (ChatGPT Images 2.0) 21st April 2026 OpenAI released ChatGPT Images 2.0 today, their latest image generation model. On the livestream Sam Altman said that the leap from gpt-image-1 to gpt-image-2 was equivalent to jumping from GPT-3 to GPT-5. Here’s how I put it to the test. My prompt: Do a where's Waldo style image but it's where is the raccoon holding a ham radio gpt-image-1 First as a baseline here’s what I got from the older gpt-image-1 using ChatGPT directly: I wasn’t able to spot the raccoon—I quickly realized that testing image generation models on Where’s Waldo style images (Where’s Wally in the UK) can be pretty frustrating! I tried getting Claude Opus 4.7 with its new higher resolution inputs to solve it but it was convinced there was a raccoon it couldn’t…
4d ago
Quoting Andreas Påhlsson-Notini
21st April 2026 AI agents are already too human. Not in the romantic sense, not because they love or fear or dream, but in the more banal and frustrating one. The current implementations keep showing their human origin again and again: lack of stringency, lack of patience, lack of focus. Faced with an awkward task, they drift towards the familiar. Faced with hard constraints, they start negotiating with reality. — Andreas Påhlsson-Notini, Less human AI agents, please. Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026
4dModel
5d ago
llm-openrouter 0.6
20th April 2026 llm openrouter refresh command for refreshing the list of available models without waiting for the cache to expire. I added this feature so I could try Kimi 2.6 on OpenRouter as soon as it became available there. Here's its pelican - this time as an HTML page because Kimi chose to include an HTML and JavaScript UI to control the animation. Transcript here. Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026
5dRelease
5d ago
SQL functions in Google Sheets to fetch data from Datasette
20th April 2026 TIL SQL functions in Google Sheets to fetch data from Datasette — I've been experimenting with ways to fetch data from Datasette and display it in Google Sheets. I put together some notes on patterns for fetching data from a Datasette instance directly into Google Sheets - using the importdata() function, a "named function" that wraps it or a Google Apps Script if you need to send an API token in an HTTP header (not supported by importdata() .) Here's an example sheet demonstrating all three methods. Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026
5dResearch
5d ago
Claude Token Counter, now with model comparisons
20th April 2026 - Link Blog Claude Token Counter, now with model comparisons. I upgraded my Claude Token Counter tool to add the ability to run the same count against different models in order to compare them. As far as I can tell Claude Opus 4.7 is the first model to change the tokenizer, so it's only worth running comparisons between 4.7 and 4.6. The Claude token counting API accepts any Claude model ID though so I've included options for all four of the notable current models (Opus 4.7 and 4.6, Sonnet 4.6, and Haiku 4.5). In the Opus 4.7 announcement Anthropic said: Opus 4.7 uses an updated tokenizer that improves how the model processes text. The tradeoff is that the same input can map to more tokens—roughly 1.0–1.35× depending on the content type. I pasted the Opus 4.7 system…
5dModel#claude
6d ago
Headless everything for personal AI
19th April 2026 - Link Blog Headless everything for personal AI. Matt Webb thinks headless services are about to become much more common: Why? Because using personal AIs is a better experience for users than using services directly (honestly); and headless services are quicker and more dependable for the personal AIs than having them click round a GUI with a bot-controlled mouse. Evidently Marc Benioff thinks so too: Welcome Salesforce Headless 360: No Browser Required! Our API is the UI. Entire Salesforce & Agentforce & Slack platforms are now exposed as APIs, MCP, & CLI. All AI agents can access data, workflows, and tasks directly in Slack, Voice, or anywhere else with Salesforce Headless. If this model does take off it's going to play havoc with existing per-head SaaS pricing schemes. I'm reminded of the early 2010s era when every…
7d ago
Changes in the system prompt between Claude Opus 4.6 and 4.7
Changes in the system prompt between Claude Opus 4.6 and 4.7 18th April 2026 Anthropic are the only major AI lab to publish the system prompts for their user-facing chat systems. Their system prompt archive now dates all the way back to Claude 3 in July 2024 and it’s always interesting to see how the system prompt evolves as they publish new models. Opus 4.7 shipped the other day (April 16, 2026) with a Claude.ai system prompt update since Opus 4.6 (February 5, 2026). I had Claude Code take the Markdown version of their system prompts, break that up into separate documents for each of the models and then construct a Git history of those files over time with fake commit dates representing the publication dates of each updated prompt—here’s the prompt I used with Claude Code for the web.…
7d ago
Claude system prompts as a git timeline
18th April 2026 Research Claude system prompts as a git timeline — Anthropic's published system prompt history for Claude is transformed into a git-based exploration tool, breaking up the monolithic markdown source into granular files and timestamped commits. By structuring extracted prompts per model, family, and revision, researchers can leverage `git log`, `diff`, and `blame` to trace prompt evolution, compare differences, and attribute changes to specific dates—all without manual parsing. Anthropic publish the system prompts for Claude chat and make that page available as Markdown. I had Claude Code turn that page into separate files for each model and model family with fake git commit dates to enable browsing the changes via the GitHub commit view. I used this to write my own detailed notes on the changes between Opus 4.6 and 4.7. Recent articles - DeepSeek V4 - almost…
7dResearch#claude#coding
7d ago
Adding a new content type to my blog-to-newsletter tool
Guides > Agentic Engineering Patterns Adding a new content type to my blog-to-newsletter tool Here's an example of a deceptively short prompt that got a quite a lot of work done in a single shot. First, some background. I send out a free Substack newsletter around once a week containing content copied-and-pasted from my blog. I'm effectively using Substack as a lightweight way to allow people to subscribe to my blog via email. I generate the newsletter with my blog-to-newsletter tool - an HTML and JavaScript app that fetches my latest content from this Datasette instance and formats it as rich text HTML, which I can then copy to my clipboard and paste into the Substack editor. Here's a detailed explanation of how that works. I recently added a new type of content to my blog to capture content that…
7dAgents#agents
8d ago
datasette 1.0a28
17th April 2026 I was upgrading Datasette Cloud to 1.0a27 and discovered a nasty collection of accidental breakages caused by changes in that alpha. This new alpha addresses those directly: - Fixed a compatibility bug introduced in 1.0a27 where execute_write_fn() callbacks with a parameter name other thanconn were seeing errors. (#2691)- The database.close() method now also shuts down the write connection for that database. - New datasette.close() method for closing down all databases and resources associated with a Datasette instance. This is called automatically when the server shuts down. (#2693) - Datasette now includes a pytest plugin which automatically calls datasette.close() on temporary instances created in function-scoped fixtures and during tests. See Automatic cleanup of Datasette instances for details. This helps avoid running out of file descriptors in plugin test suites that were written before theDatabase(is_temp_disk=True) feature introduced in Datasette…
8dRelease
8d ago
Join us at PyCon US 2026 in Long Beach - we have new AI and security tracks this year
Join us at PyCon US 2026 in Long Beach—we have new AI and security tracks this year 17th April 2026 This year’s PyCon US is coming up next month from May 13th to May 19th, with the core conference talks from Friday 15th to Sunday 17th and tutorial and sprint days either side. It’s in Long Beach, California this year, the first time PyCon US has come to the West Coast since Portland, Oregon in 2017 and the first time in California since Santa Clara in 2013. If you’re based in California this is a great opportunity to catch up with the Python community, meet a whole lot of interesting people and learn a ton of interesting things. In addition to regular PyCon programming we have two new dedicated tracks at the conference this year: an AI track on Friday…
8dTutorial
9d ago
llm-anthropic 0.25
16th April 2026 - New model: claude-opus-4.7 , which supportsthinking_effort :xhigh . #66- New thinking_display andthinking_adaptive boolean options.thinking_display summarized output is currently only available in JSON output or JSON logs.- Increased default max_tokens to the maximum allowed for each model.- No longer uses obsolete structured-outputs-2025-11-13 beta header for older models. Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026
9d ago
Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7
Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7 16th April 2026 For anyone who has been (inadvisably) taking my pelican riding a bicycle benchmark seriously as a robust way to test models, here are pelicans from this morning’s two big model releases—Qwen3.6-35B-A3B from Alibaba and Claude Opus 4.7 from Anthropic. Here’s the Qwen 3.6 pelican, generated using this 20.9GB Qwen3.6-35B-A3B-UD-Q4_K_S.gguf quantized model by Unsloth, running on my MacBook Pro M5 via LM Studio (and the llm-lmstudio plugin)—transcript here: And here’s one I got from Anthropic’s brand new Claude Opus 4.7 (transcript): I’m giving this one to Qwen 3.6. Opus managed to mess up the bicycle frame! I tried Opus a second time passing thinking_level: max . It didn’t do much better (transcript): I don’t think Qwen are cheating A lot of people are convinced that…
9d ago
datasette.io news preview
16th April 2026 The datasette.io website has a news section built from this news.yaml file in the underlying GitHub repository. The YAML format looks like this: - date: 2026-04-15 body: |- [Datasette 1.0a27](https://docs.datasette.io/en/latest/changelog.html#a27-2026-04-15) changes how CSRF protection works in a way that simplifies form and API integration, and introduces a new `RenameTableEvent` for when a table is renamed by a SQL query. - date: 2026-03-18 body: |- ... This format is a little hard to edit, so I finally had Claude build a custom preview UI to make checking for errors have slightly less friction. I built it using standard claude.ai and Claude Artifacts, taking advantage of Claude's ability to clone GitHub repos and look at their content as part of a regular chat: Clone https://github.com/simonw/datasette.io and look at the news.yaml file and how it is rendered on the homepage.…
9dOpen Source
10d ago
Quoting Kyle Kingsbury
15th April 2026 I think we will see some people employed (though perhaps not explicitly) as meat shields: people who are accountable for ML systems under their supervision. The accountability may be purely internal, as when Meta hires human beings to review the decisions of automated moderation systems. It may be external, as when lawyers are penalized for submitting LLM lies to the court. It may involve formalized responsibility, like a Data Protection Officer. It may be convenient for a company to have third-party subcontractors, like Buscaglia, who can be thrown under the bus when the system as a whole misbehaves. — Kyle Kingsbury, The Future of Everything is Lies, I Guess: New Jobs Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser…
10dModel#multimodal
10d ago
datasette-export-database 0.3a1
15th April 2026 This plugin was using the ds_csrftoken cookie as part of a custom signed URL, which needed upgrading now that Datasette 1.0a27 no longer sets that cookie. Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026
10dRelease
10d ago
datasette 1.0a27
15th April 2026 Two major changes in this new Datasette alpha. I covered the first of those in detail yesterday - Datasette no longer uses Django-style CSRF form tokens, instead using modern browser headers as described by Filippo Valsorda. The second big change is that Datasette now fires a new RenameTableEvent any time a table is renamed during a SQLite transaction. This is useful because some plugins (like datasette-comments) attach additional data to table records by name, so a renamed table requires them to react in appropriate ways. Here are the rest of the changes in the alpha: - New actor= parameter for datasette.client methods, allowing internal requests to be made as a specific actor. This is particularly useful for writing automated tests. (#2688)- New Database(is_temp_disk=True) option, used internally for the internal database. This helps resolve intermittent database locked errors…
10dRelease
10d ago
Quoting John Gruber
15th April 2026 The real goldmine isn’t that Apple gets a cut of every App Store transaction. It’s that Apple’s platforms have the best apps, and users who are drawn to the best apps are thus drawn to the iPhone, Mac, and iPad. That edge is waning. Not because software on other platforms is getting better, but because third-party software on iPhone, Mac, and iPad is regressing to the mean, to some extent, because fewer developers feel motivated — artistically, financially, or both — to create well-crafted idiomatic native apps exclusively for Apple’s platforms. Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API…
10dModel#coding
10d ago
Gemini 3.1 Flash TTS
15th April 2026 - Link Blog Gemini 3.1 Flash TTS. Google released Gemini 3.1 Flash TTS today, a new text-to-speech model that can be directed using prompts. It's presented via the standard Gemini API using gemini-3.1-flash-tts-preview as the model ID, but can only output audio files. The prompting guide is surprising, to say the least. Here's their example prompt to generate just a few short sentences of audio: # AUDIO PROFILE: Jaz R. ## "The Morning Hype" ## THE SCENE: The London Studio It is 10:00 PM in a glass-walled studio overlooking the moonlit London skyline, but inside, it is blindingly bright. The red "ON AIR" tally light is blazing. Jaz is standing up, not sitting, bouncing on the balls of their heels to the rhythm of a thumping backing track. Their hands fly across the faders on a massive…
10d ago
Zig 0.16.0 release notes: "Juicy Main"
15th April 2026 - Link Blog Zig 0.16.0 release notes: "Juicy Main" (via) Zig has really good release notes - comprehensive, detailed, and with relevant usage examples for each of the new features. Of particular note in the newly released Zig 0.16.0 is what they are calling "Juicy Main" - a dependency injection feature for your program's main() function where accepting a process.Init parameter grants access to a struct of useful properties: const std = @import("std"); pub fn main(init: std.process.Init) !void { /// general purpose allocator for temporary heap allocations: const gpa = init.gpa; /// default Io implementation: const io = init.io; /// access to environment variables: std.log.info("{d} env vars", .{init.environ_map.count()}); /// access to CLI arguments const args = try init.minimal.args.toSlice( init.arena.allocator() ); } Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price -…
10dRelease
10d ago
datasette-ports 0.3
15th April 2026 A small update for my tool for helping me figure out what all of the Datasette instances on my laptop are up to. - Show working directory derived from each PID - Show the full path to each database file Output now looks like this: http://127.0.0.1:8007/ - v1.0a26 Directory: /Users/simon/dev/blog Databases: simonwillisonblog: /Users/simon/dev/blog/simonwillisonblog.db Plugins: datasette-llm datasette-secrets http://127.0.0.1:8001/ - v1.0a26 Directory: /Users/simon/dev/creatures Databases: creatures: /tmp/creatures.db Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026
10dHardware
10d ago
Gemini 3.1 Flash TTS
15th April 2026 Tool Gemini 3.1 Flash TTS — Convert text to natural-sounding speech using Google's Gemini 3.1 Flash TTS model with support for both single-speaker and multi-speaker conversation modes. The tool allows you to customize voice selection, apply directorial tags like `[whisper]` and `[short pause]` for dynamic delivery, and download the generated audio as a WAV file. Requires a valid Gemini API key to function. See my notes on Google's new Gemini 3.1 Flash TTS text-to-speech model. Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April 2026 - A pelican for GPT-5.5 via the semi-official Codex backdoor API - 23rd April 2026
10dModel#gemini
11d ago
Cybersecurity Looks Like Proof of Work Now
14th April 2026 - Link Blog Cybersecurity Looks Like Proof of Work Now. The UK's AI Safety Institute recently published Our evaluation of Claude Mythos Preview’s cyber capabilities, their own independent analysis of Claude Mythos which backs up Anthropic's claims that it is exceptionally effective at identifying security vulnerabilities. Drew Breunig notes that AISI's report shows that the more tokens (and hence money) they spent the better the result they got, which leads to a strong economic incentive to spend as much as possible on security reviews: If Mythos continues to find exploits so long as you keep throwing money at it, security is reduced to a brutally simple equation: to harden a system you need to spend more tokens discovering exploits than attackers will spend exploiting them. An interesting result of this is that open source libraries become more…
11dResearch#claude#safety
11d ago
Trusted access for the next era of cyber defense
14th April 2026 - Link Blog Trusted access for the next era of cyber defense (via) OpenAI's answer to Claude Mythos appears to be a new model called GPT-5.4-Cyber: In preparation for increasingly more capable models from OpenAI over the next few months, we are fine-tuning our models specifically to enable defensive cybersecurity use cases, starting today with a variant of GPT‑5.4 trained to be cyber-permissive: GPT‑5.4‑Cyber. They're also extending a program they launched in February (which I had missed) called Trusted Access for Cyber, where users can verify their identity (via a photo of a government-issued ID processed by Persona) to gain "reduced friction" access to OpenAI's models for cybersecurity work. Honestly, this OpenAI announcement is difficult to follow. Unsurprisingly they don't mention Anthropic at all, but much of the piece emphasizes their many years of existing cybersecurity work…
11d ago
datasette PR #2689: Replace token-based CSRF with Sec-Fetch-Site header protection
14th April 2026 - Link Blog datasette PR #2689: Replace token-based CSRF with Sec-Fetch-Site header protection. Datasette has long protected against CSRF attacks using CSRF tokens, implemented using my asgi-csrf Python library. These are something of a pain to work with - you need to scatter forms in templates with <input type="hidden" name="csrftoken" value="{{ csrftoken() }}"> lines and then selectively disable CSRF protection for APIs that are intended to be called from outside the browser. I've been following Filippo Valsorda's research here with interest, described in this detailed essay from August 2025 and shipped as part of Go 1.25 that same month. I've now landed the same change in Datasette. Here's the PR description - Claude Code did much of the work (across 10 commits, closely guided by me and cross-reviewed by GPT-5.4) but I've decided to start writing these…
11dTutorial#claude#coding
12d ago
Quoting Bryan Cantrill
13th April 2026 The problem is that LLMs inherently lack the virtue of laziness. Work costs nothing to an LLM. LLMs do not feel a need to optimize for their own (or anyone's) future time, and will happily dump more and more onto a layercake of garbage. Left unchecked, LLMs will make systems larger, not better — appealing to perverse vanity metrics, perhaps, but at the cost of everything that matters. As such, LLMs highlight how essential our human laziness is: our finite time forces us to develop crisp abstractions in part because we don't want to waste our (human!) time on the consequences of clunky ones. — Bryan Cantrill, The peril of laziness lost Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your…
12dModel
12d ago
Exploring the new `servo` crate
13th April 2026 In Servo is now available on crates.io the Servo team announced the initial release of the servo crate, which packages their browser engine as an embeddable library. I set Claude Code for web the task of figuring out what it can do, building a CLI tool for taking screenshots using it and working out if it could be compiled to WebAssembly. The servo-shot Rust tool it built works pretty well: git clone https://github.com/simonw/research cd research/servo-crate-exploration/servo-shot cargo build ./target/debug/servo-shot https://news.ycombinator.com/ Here's the result: Compiling Servo itself to WebAssembly is not feasible due to its heavy use of threads and dependencies like SpiderMonkey, but Claude did build me this playground page for trying out a WebAssembly build of the html5ever and markup5ever_rcdom crates, providing a tool for turning fragments of HTML into a parse tree. Recent articles - DeepSeek…
12dResearch#claude#coding
12d ago
Steve Yegge
13th April 2026 I was chatting with my buddy at Google, who's been a tech director there for about 20 years, about their AI adoption. Craziest convo I've had all year. The TL;DR is that Google engineering appears to have the same AI adoption footprint as John Deere, the tractor company. Most of the industry has the same internal adoption curve: 20% agentic power users, 20% outright refusers, 60% still using Cursor or equivalent chat tool. It turns out Google has this curve too. [...] There has been an industry-wide hiring freeze for 18+ months, during which time nobody has been moving jobs. So there are no clued-in people coming in from the outside to tell Google how far behind they are, how utterly mediocre they have become as an eng org. On behalf of @Google, this post doesn't match…
12dAgents#agents
13d ago
Gemma 4 audio with MLX
12th April 2026 Thanks to a tip from Rahim Nathwani, here's a uv run recipe for transcribing an audio file on macOS using the 10.28 GB Gemma 4 E2B model with MLX and mlx-vlm: uv run --python 3.13 --with mlx_vlm --with torchvision --with gradio \ mlx_vlm.generate \ --model google/gemma-4-e2b-it \ --audio file.wav \ --prompt "Transcribe this audio" \ --max-tokens 500 \ --temperature 1.0 I tried it on this 14 second .wav file and it output the following: This front here is a quick voice memo. I want to try it out with MLX VLM. Just going to see if it can be transcribed by Gemma and how that works. (That was supposed to be "This right here..." and "... how well that works" but I can hear why it misinterpreted that as "front" and "how that works".) Recent articles -…
13dHardware#multimodal
14d ago
SQLite 3.53.0
11th April 2026 - Link Blog SQLite 3.53.0 (via) SQLite 3.52.0 was withdrawn so this is a pretty big release with a whole lot of accumulated user-facing and internal improvements. Some that stood out to me: ALTER TABLE can now add and removeNOT NULL andCHECK constraints - I've previously used my own sqlite-utils transform() method for this.- New json_array_insert() function and its jsonb equivalent. - Significant improvements to CLI mode, including result formatting. The result formatting improvements come from a new library, the Query Results Formatter. I had Claude Code (on my phone) compile that to WebAssembly and build this playground interface for trying that out. Recent articles - DeepSeek V4 - almost on the frontier, a fraction of the price - 24th April 2026 - Extract PDF text in your browser with LiteParse for the web - 23rd April…
14dRelease