$ timeahead_
★ TOP STORY · [SWB] · Open Source · 3d ago

Changes to GitHub Copilot Individual plans

22nd April 2026 - Link Blog Changes to GitHub Copilot Individual plans (via) On the same day as Claude Code's temporary will-they-won't-they $100/month kerfuffle (for the moment, they won't), here's the latest on GitHub Copilot pricing. Unlike Anthropic, GitHub put up an official announcement about their changes, which include tightening usage limits, pausing signups for individual plans (!), restricting Claude Opus 4.7 to the more expensive $39/month "Pro+" plan, and dropping the previous Opus models entirely. The key paragraph: Agentic workflows have fundamentally changed Copilot’s compute demands. Long-running, parallelized sessions now regularly consume far more resources than the original plan structure was built to support. As Copilot’s agentic capabilities have expanded rapidly, agents are doing more work, and more customers are hitting usage limits designed to maintain service reliability. It's easy to forget that just six months ago heavy LLM…

Simon Willison Blog
▲ trending · last 48h
[AOA] Ahead of AI (Sebastian Raschka) · 5 articles
7d ago
My Workflow for Understanding LLM Architectures
My Workflow for Understanding LLM Architectures A learning-oriented workflow for understanding new open-weight model releases Many people asked me over the past months to share my workflow for how I come up with the LLM architecture sketches and drawings in my articles, talks, and the LLM-Gallery. So I thought it would be useful to document the process I usually follow. The short version is that I usually start with the official technical reports, but these days, papers are often less detailed than they used to be, especially for most open-weight models from industry labs. The good part is that if the weights are shared on the Hugging Face Model Hub and the model is supported in the Python transformers library, we can usually inspect the config file and the reference implementation directly to get more information about the architecture details.…
7d · Open Source · #agents · by Sebastian Raschka, PhD
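The config-inspection step Raschka describes maps to a few lines of transformers code. A minimal sketch; the checkpoint id is an example (any Hub model supported by transformers works):

```python
from transformers import AutoConfig

# Downloads only the config (a few KB), not the weights.
config = AutoConfig.from_pretrained("Qwen/Qwen3-0.6B")

print(config.num_hidden_layers)    # number of transformer blocks
print(config.hidden_size)          # embedding width
print(config.num_attention_heads)  # query heads
print(config.num_key_value_heads)  # fewer KV heads implies grouped-query attention
```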
59d ago
A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026
A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026 A Round Up And Comparison of 10 Open-Weight LLM Releases in Spring 2026 If you have struggled a bit to keep up with open-weight model releases this month, this article should catch you up on the main themes. In this article, I will walk you through the ten main releases in chronological order, with a focus on the architecture similarities and differences: Arcee AI’s Trinity Large (Jan 27, 2026) Moonshot AI’s Kimi K2.5 (Jan 27, 2026) StepFun Step 3.5 Flash (Feb 1, 2026) Qwen3-Coder-Next (Feb 3, 2026) z.AI’s GLM-5 (Feb 12, 2026) MiniMax M2.5 (Feb 12, 2026) Nanbeige 4.1 3B (Feb 13, 2026) Qwen 3.5 (Feb 15, 2026) Ant Group’s Ling 2.5 1T & Ring 2.5 1T (Feb 16, 2026) Cohere’s Tiny Aya (Feb 17, 2026) Update 1:…
59d · Open Source · by Sebastian Raschka, PhD
143d ago
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates Understanding How DeepSeek's Flagship Open-Weight Models Evolved Last updated: January 1st, 2026 Similar to DeepSeek V3, the team released their new flagship model over a major US holiday weekend. Given DeepSeek V3.2’s really good performance (on GPT-5 and Gemini 3.0 Pro level), and the fact that it’s also available as an open-weight model, it’s definitely worth a closer look. I covered the predecessor, DeepSeek V3, at the very beginning of my The Big LLM Architecture Comparison article, which I kept extending over the months as new architectures got released. Originally, as I just got back from Thanksgiving holidays with my family, I planned to “just” extend the article with this new DeepSeek V3.2 release by adding another section, but I then realized that there’s just too much interesting information…
143d · Open Source · by Sebastian Raschka, PhD
172d ago
Beyond Standard LLMs
Beyond Standard LLMs Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers From DeepSeek R1 to MiniMax-M2, the largest and most capable open-weight LLMs today remain autoregressive decoder-style transformers, which are built on flavors of the original multi-head attention mechanism. However, we have also seen alternatives to standard LLMs popping up in recent years, from text diffusion models to the most recent linear attention hybrid architectures. Some of them are geared towards better efficiency, and others, like code world models, aim to improve modeling performance. After I shared my Big LLM Architecture Comparison a few months ago, which focused on the main transformer-based LLMs, I received a lot of questions with respect to what I think about alternative approaches. (I also recently gave a short talk about that at the PyTorch Conference 2025, where I also promised…
172d · Open Source · #coding · by Sebastian Raschka, PhD
231d ago
Understanding and Implementing Qwen3 From Scratch
Understanding and Implementing Qwen3 From Scratch A Detailed Look at One of the Leading Open-Source LLMs Previously, I compared the most notable open-weight architectures of 2025 in The Big LLM Architecture Comparison. Then, I zoomed in and discussed the various architecture components in From GPT-2 to gpt-oss: Analyzing the Architectural Advances on a conceptual level. Since all good things come in threes, before covering some of the noteworthy research highlights of this summer, I wanted to now dive into these architectures hands-on, in code. By following along, you will understand how it actually works under the hood and gain building blocks you can adapt for your own experiments or projects. For this, I picked Qwen3 (initially released in May and updated in July) because it is one of the most widely liked and used open-weight model families as of this…
231d · Open Source · #qwen #open-source · by Sebastian Raschka, PhD
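One of the building blocks such a from-scratch implementation starts with is the normalization layer. A minimal PyTorch sketch of the RMSNorm variant used by Qwen3-style architectures (dimensions are illustrative):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """RMSNorm as used in Qwen3-style (and Llama-style) blocks: scale by the
    root-mean-square of the activations, with no mean subtraction and no bias."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

x = torch.randn(2, 8, 1024)    # (batch, sequence, hidden)
print(RMSNorm(1024)(x).shape)  # torch.Size([2, 8, 1024])
```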
[CB] CrewAI Blog · 1 article
187d ago
Community
CrewAI OSS 1.0 - We are going GA 1.4 Billion Agentic Automations, 60% of the Fortune 500, 40k GitHub stars. Creating a center of gravity for the Agentic AI ecosystem. First things first, thank you to everyone who attended our very first launch week webinar! It was absolutely incredible to see 2,600+ people register for a webinar in a matter of days, and even more so to see everyone immediately introducing themselves, connecting with other community members and asking
187d · Open Source
[DIST] Distill.pub · 3 articles
2364d ago
Computing Receptive Fields of Convolutional Neural Networks
Mathematical derivations and open-source library to compute receptive fields of convnets, enabling the mapping of extracted features to input signals. While deep neural networks have overwhelmingly established state-of-the-art results in many artificial intelligence problems, they can still be difficult to develop and debug. Recent research on deep learning understanding has focused on feature visualization. In this work, we analyze deep neural networks from a complementary perspective, focusing on convolutional models. We are interested in understanding the extent to which input signals may affect output features, and mapping features at any part of the network to the region in the input that produces them. The key parameter to associate an output feature to an input region is the receptive field of the convolutional network, which is defined as the size of the region in the input that produces the feature. As…
2364d · Open Source · #coding #open-source
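The closed-form result the article derives (each layer enlarges the receptive field by k - 1 input pixels, scaled by the cumulative stride of the layers below it) is easy to compute. A minimal sketch for a plain sequential convnet described as (kernel_size, stride) pairs; the example stack is hypothetical:

```python
def receptive_field(layers):
    """Receptive field of a sequential convnet given (kernel_size, stride) pairs."""
    r, jump = 1, 1  # jump = distance (in input pixels) between adjacent features
    for k, s in layers:
        r += (k - 1) * jump  # each layer widens the field by (k - 1) * cumulative stride
        jump *= s
    return r

# Hypothetical stack: three 3x3 convs with strides 2, 1, 1.
print(receptive_field([(3, 2), (3, 1), (3, 1)]))  # 11
```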
2454d ago
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features'
On May 6th, Andrew Ilyas and colleagues published a paper outlining two sets of experiments. Firstly, they showed that models trained on adversarial examples can transfer to real data, and secondly that models trained on a dataset derived from the representations of robust neural networks seem to inherit non-trivial robustness. They proposed an intriguing interpretation for their results: adversarial examples are due to “non-robust features” which are highly predictive but imperceptible to humans. The paper was received with intense interest and discussion on social media, mailing lists, and reading groups around the world. How should we interpret these experiments? Would they replicate? Adversarial example research is particularly vulnerable to a certain kind of non-replication among disciplines of machine learning, because it requires researchers to play both attack and defense. It’s easy for even very rigorous researchers to accidentally use a…
2454d · Open Source
2972d ago
The Building Blocks of Interpretability
Interpretability techniques are normally studied in isolation. We explore the powerful interfaces that arise when you combine them — and the rich structure of this combinatorial space. With the growing success of neural networks, there is a corresponding need to be able to explain their decisions — including building confidence about how they will behave in the real-world, detecting model bias, and for scientific curiosity. In order to do so, we need to both construct deep abstractions and reify (or instantiate) them in rich interfaces. The machine learning community has primarily focused on developing powerful methods, such as feature visualization. In this article, we treat existing interpretability methods as fundamental and composable building blocks for rich user interfaces. We find that these disparate techniques now come together in a unified grammar, fulfilling complementary roles in the resulting interfaces. Moreover, this…
2972d · Open Source · #safety
[HF] Hugging Face Blog · 54 articles
17d ago
Safetensors is Joining the PyTorch Foundation
Safetensors is Joining the PyTorch Foundation How we got here Safetensors started as a Hugging Face project born out of a concrete need: a way to store and share model weights that couldn't execute arbitrary code. The pickle-based formats that dominated the ecosystem at the time meant that there was a very real risk you’d be running malicious code. While this was an acceptable risk when ML was still budding, it would become unacceptable as open model sharing became central to how the ML community works. The format we built is intentionally simple: a JSON header with a hard limit of 100MB, describing tensor metadata, followed by raw tensor data. Zero-copy loading that maps tensors directly from disk. Lazy loading so you can read individual weights without deserializing an entire checkpoint. What we didn't fully anticipate was how broadly it…
17d · Open Source
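The format described in the post is simple enough to parse by hand. A minimal sketch of reading just the header; the 8-byte little-endian length prefix followed by a JSON header matches the published spec, and the file path is an example:

```python
import json
import struct

def read_safetensors_header(path):
    """Parse only the JSON header of a .safetensors file.

    Layout per the spec: an 8-byte little-endian u64 header length,
    then the JSON header, then the raw tensor bytes. Nothing here can
    execute code, which is the whole point of the format.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))

# Each entry maps a tensor name to dtype, shape, and byte offsets into the
# data section, which is what enables lazy and zero-copy loading.
header = read_safetensors_header("model.safetensors")
for name, meta in header.items():
    if name != "__metadata__":
        print(name, meta["dtype"], meta["shape"])
```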
39d ago
State of Open Source on Hugging Face: Spring 2026
State of Open Source on Hugging Face: Spring 2026 This post builds on an earlier analysis conducted mid-2025, available here, which examined what the Hugging Face Community is building. We recommend reading additional perspectives on the open source ecosystem in and outside of Hugging Face from the Data Provenance Initiative, Interconnects, OpenRouter and a16z, and MIT and the Linux Foundation. As the Hugging Face ecosystem is distributed, analyses are a combination of Hugging Face and community members' work, each of which is appropriately credited. Activity in the open source AI ecosystem has rapidly grown, with the number of users, model, and dataset repositories all close to doubling. In 2025, Hugging Face grew to 13 million users, more than 2 million public models, and over 500,000 public datasets. This growth signals more than increased interest in open source; it reflects a…
39d · Open Source · #open-source
46d ago
Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries
Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries TL;DR -- For those of you who don't have time to read 5,000 words about async RL plumbing (we get it, you have models to train): - The problem: In synchronous RL (reinforcement learning) training, data generation (model inference to create data samples) dominates wall-clock time -- a single batch of 32K-token rollouts on a 32B (32-billion parameter) model can take hours, while the GPUs used for training remain idle. - The solution everyone converged on: Disaggregate (separate) inference and training onto different GPU pools, connect them with a rollout buffer (temporary storage for model outputs), and transfer weights asynchronously (without waiting), so neither side waits for the other. - We surveyed 16 open-source libraries that implement this pattern and compared them across 7 axes: orchestration primitives, buffer design, weight…
46d · Open Source · #open-source
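The disaggregated pattern the survey describes can be sketched with a bounded queue standing in for the rollout buffer. Everything below is a toy stand-in (no real inference or training), showing only the producer/consumer shape plus a staleness guard:

```python
import queue
import random
import threading
import time

rollout_buffer = queue.Queue(maxsize=64)  # bounded: applies backpressure to generation
policy_version = [0]                      # shared; the trainer bumps it after each update

def inference_worker():
    while True:
        version = policy_version[0]
        rollout = [random.random() for _ in range(4)]  # stand-in for a sampled trajectory
        rollout_buffer.put((version, rollout))         # tokens keep flowing, trainer permitting

def trainer(steps=10):
    for _ in range(steps):
        batch = [rollout_buffer.get() for _ in range(8)]
        # Staleness guard: drop rollouts generated by a policy more than 2 versions old.
        fresh = [r for v, r in batch if policy_version[0] - v <= 2]
        time.sleep(0.01)          # stand-in for the gradient step
        policy_version[0] += 1    # an async weight push to the inference pool would go here

threading.Thread(target=inference_worker, daemon=True).start()
trainer()
print("trained to policy version", policy_version[0])
```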
81d ago
The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+
The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+ This is the third and final blog in a three-part series on China's open source community's historical advancements since January 2025's "DeepSeek Moment." The first blog on strategic changes and open artifact growth is available here, and the second blog on architectural and hardware shifts is available here. In this third article, we examine paths and trajectories of prominent Chinese AI organizations, and posit future directions for open source. For AI researchers and developers contributing to and relying on the open source ecosystem and for policymakers understanding the rapidly changing environment, due to intraorganizational and global community gains, open source is the dominant and popular approach for Chinese AI organizations for the near future. Openly sharing artifacts from models to papers to deployment infrastructure maps to a strategy…
81d · Open Source · #open-source
88d ago
Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek
Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek This is the second blog in a three-part series on China's open source community's historical advancements since January 2025's "DeepSeek Moment." The first blog is available here, and the third blog is available here. In this second piece we turn our focus from models to the architectural and hardware choices Chinese companies have made as openness becomes the norm. For AI researchers and developers contributing to and relying on the open source ecosystem and for policymakers understanding the rapidly changing environment, architectural preferences, modality diversification, license permissiveness, small model popularity, and growing adoption of Chinese hardware point to leadership strategies across a multitude of paths. DeepSeek R1's own characteristics inspired overlap and competition, and contributed to heavier focus on domestic hardware in China. Mixture of Experts (MoE) as the Default…
88d · Open Source · #open-source
142d ago
We Got Claude to Fine-Tune an Open Source LLM
We Got Claude to Fine-Tune an Open Source LLM We gave Claude the ability to fine-tune language models using a new tool called Hugging Face Skills. Not just write training scripts, but to actually submit jobs to cloud GPUs, monitor progress, and push finished models to the Hugging Face Hub. This tutorial shows you how it works and how to use it yourself. Claude Code can use "skills"—packaged instructions, scripts, and domain knowledge—to accomplish specialized tasks. The hf-llm-trainer skill teaches Claude everything it needs to know about training: which GPU to pick for your model size, how to configure Hub authentication, when to use LoRA versus full fine-tuning, and how to handle the dozens of other decisions that go into a successful training run. With this skill, you can tell Claude things like: Fine-tune Qwen3-0.6B on the dataset open-r1/codeforces-cots And…
145d ago
Transformers v5: Simple model definitions powering the AI ecosystem
Transformers v5: Simple model definitions powering the AI ecosystem Today, as we launch v5, Transformers is installed more than 3 million times each day via pip - up from 20,000/day in v4 🤯. Altogether, it has now surpassed 1.2 billion installs! The ecosystem has expanded from 40 model architectures in v4 to over 400 today, and the community has contributed more than 750,000 model checkpoints on the Hub compatible with Transformers, up from roughly 1,000 at the time of v4. This growth is powered by the evolution of the field and the now mainstream access to AI. As a leading model-definition library in the ecosystem, we need to continuously evolve and adapt the library to continue being relevant. Reinvention is key for longevity in AI. We’re fortunate to collaborate with many libraries and apps built on transformers, in no specific…
145d · Open Source
186d ago
Unlock the power of images with AI Sheets
Unlock the power of images with AI Sheets 🧭TL;DR: Hugging Face AI Sheets is an open-source tool for supercharging datasets with AI models, no code required. Now with vision support: extract data from images (receipts, documents), generate visuals from text, and edit images—all in a spreadsheet. Powered by thousands of open models via Inference Providers. We are excited to release a massive update to Hugging Face AI Sheets, the open-source tool for building, transforming, and enriching data with open AI models. AI Sheets leverages Inference Providers, which means you can use thousands of open models powered by the best inference providers on the planet. The first version of AI Sheets made structuring and enriching textual content a breeze. Now, we're adding vision to AI Sheets. Images are everywhere—product photos, receipts, screenshots, diagrams, charts, logos. These documents contain structured information waiting…
211d ago
Nemotron-Personas-Japan: A Synthetic Dataset for Sovereign AI
Nemotron-Personas-Japan: A Synthetic Dataset for Sovereign AI Open data for the future of Japanese AI Building AI that truly understands Japanese culture has been nearly impossible without high-quality, diverse training data. To change this, NVIDIA has released Nemotron-Personas-Japan, the first open synthetic dataset of personas aligned with Japan's demographics, geographic distribution, and cultural characteristics. Released under the CC BY 4.0 license, the dataset provides a privacy-preserving, regulation-ready foundation for building AI systems that reflect Japanese society without relying on sensitive personal data. Created with NeMo Data Designer, NVIDIA's enterprise synthetic data generation system, Nemotron-Personas-Japan follows the success of the already widely used US Personas dataset, and is the first release in a global collection of synthetic persona datasets and data-construction methods supporting sovereign AI development across countries and regions. The dataset is designed to work seamlessly with open-source large language models (LLMs) such as the Nemotron models, making it easy to fine-tune for Japanese AI applications ranging from enterprise chatbots to domain-specific AI agents. What's in the dataset: - 6 million personas in total (6 personas per record, 1 million records), written in natural Japanese - 22 fields per record: 6 persona fields plus 16 context fields grounded in official demographic and labor statistics - Roughly 1.4 billion tokens in total, of which about 850 million are persona-related tokens - About 950,000 unique names, unprecedented diversity for synthetic data generation - Over 1,500 occupation categories reflecting the Japanese workforce - Comprehensive coverage across demographic, regional, and personality-trait axes - Diverse persona types: professional, sports, arts, travel, culinary - Persona attributes in natural language: cultural background, skills and expertise, career goals and aspirations, hobbies and interests - Usable for commercial and non-commercial purposes under the CC BY 4.0 license How Nemotron-Personas-Japan was built: The data generation pipeline is built on NeMo Data Designer, NVIDIA's microservice for synthetic data generation. This compound AI system supports complex Jinja templates, Pydantic validation, structured outputs, automatic retries, and multiple generation backends: the tooling needed to generate a synthetic dataset at this scale. It additionally uses the following models: - a probabilistic graphical model for statistically grounded generation (Apache-2.0) - GPT-OSS-120B for Japanese text generation (Apache-2.0) Reflecting Japan's cultural context: Nemotron-Personas-Japan was designed to align with Japan's official demographic and labor statistics while accounting for aspects important to AI training: - Education: where national statistics group degree levels together, finer-grained categories were introduced so models can reflect different educational paths. - Occupations: additional categories (such as business owners and specialized professions) broaden the range of occupations available for training. - Life stages: scenarios that are underrepresented in statistics, such as students, retirees, and unemployment, are modeled for more realistic personas. - Cultural characteristics: Japanese social and cultural traits are incorporated so AI systems can better reflect region-specific norms. - Digital divide: differences in digital literacy across age groups are accounted for, reflecting actual technology use in Japan. Privacy-preserving by design: The dataset contains no personally identifiable information (PII). Ages, names, and occupations follow the distributions in official statistics but are not linked to any real person, living or deceased. Because every persona is fully synthetic, the data preserves real cultural patterns while remaining usable for training without compromising anyone's privacy. Intended users: Nemotron-Personas-Japan is designed for Japanese model developers building sovereign AI systems in Japan. Most of the training data used by LLM developers today is in English, and developers in regions such as Japan and India struggle to obtain high-quality data in their native languages. This dataset, and NVIDIA's broader Nemotron-Personas effort, addresses these challenges directly, helping developers generate diverse, complex data in their local language while capturing region-specific nuance. The dataset is generated entirely in the native language, grounded in local context such as census data, Japanese naming conventions, and cultural characteristics. We hope it proves useful to any AI model developer who wants to expand adoption of their models in Japan and understand Japanese cultural context. Practical AI applications: The synthetic personas in this dataset can be used for: - Multi-turn conversation synthesis: use personas as seeds to create human-like dialogue datasets - Domain-specific AI assistants: build datasets for culturally aware AI assistants - Bias testing and fairness: evaluate how models and agentic AI systems perform across rural and urban settings, age groups, or education levels, toward AI that works fairly for every segment of Japanese society Why synthetic persona data matters: AI development has long been constrained by access to diverse, high-quality training data that reflects real-world populations. Enterprise AI development is dominated by private data, a barrier for researchers, startups, and especially AI developers in regions with little available data. - Data diversity: reflecting Japan's full demographic range prevents biased training and model collapse. - Cultural authenticity: reduces dependence on Western-centric datasets and supports development of sovereign AI systems. - Privacy and compliance: meets the requirements of Japan's personal information protection law (PIPA) and future AI governance. By releasing Nemotron-Personas-Japan under CC BY 4.0, NVIDIA makes enterprise-grade, high-quality synthetic data accessible without the traditional barriers of cost, privacy concerns, and geographic constraints, accurately reflecting cultural context…
211d · Open Source · #gpu
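For readers who want to inspect the data, a minimal sketch with the datasets library; the Hub dataset id is an assumption based on the post's naming, and field names should be checked before use:

```python
from datasets import load_dataset

# Dataset id is an assumption; verify it on the Hub first.
ds = load_dataset("nvidia/Nemotron-Personas-Japan", split="train")
print(ds.column_names)  # inspect the 22 fields (6 persona + 16 context) before relying on names
print(ds[0])            # one record, carrying its six personas
```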
228d ago
mmBERT: ModernBERT goes Multilingual
mmBERT: ModernBERT goes Multilingual TL;DR This blog post introduces mmBERT, a state-of-the-art massively multilingual encoder model trained on 3T+ tokens of text in over 1800 languages. It shows significant performance and speed improvements over previous multilingual models, being the first to improve upon XLM-R, while also developing new strategies for effectively learning low-resource languages. mmBERT builds upon ModernBERT for a blazingly fast architecture, and adds novel components to enable efficient multilingual learning. If you are interested in trying out the models yourself, some example boilerplate is available at the end of this blogpost! Training Data mmBERT was trained on a carefully curated multilingual dataset totaling over 3T tokens across three distinct training phases. The foundation of our training data consists of three primary open-source and high-quality web crawls that enable both multilingual coverage and data quality: DCLM and Filtered DCLM…
263d ago
Welcome GPT OSS, the new open-source model family from OpenAI!
Welcome GPT OSS, the new open-source model family from OpenAI! To make it even better and more impactful for the community, the models are licensed under the Apache 2.0 license, along with a minimal usage policy: We aim for our tools to be used safely, responsibly, and democratically, while maximizing your control over how you use them. By using gpt-oss, you agree to comply with all applicable law. According to OpenAI, this release is a meaningful step in their commitment to the open-source ecosystem, in line with their stated mission to make the benefits of AI broadly accessible. Many use cases rely on private and/or local deployments, and we at Hugging Face are super excited to welcome OpenAI to the community. We believe these will be long-lived, inspiring and impactful models. Contents - Introduction - Overview - API access through…
263d · Open Source · #open-source
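A minimal local-inference sketch via transformers; gpt-oss-20b is the smaller of the two released sizes, and device placement assumes accelerate is installed:

```python
from transformers import pipeline

# Loads the open-weight checkpoint from the Hub and runs a chat-style prompt.
generator = pipeline(
    "text-generation", model="openai/gpt-oss-20b", torch_dtype="auto", device_map="auto"
)
messages = [{"role": "user", "content": "Explain speculative decoding in one sentence."}]
print(generator(messages, max_new_tokens=128)[0]["generated_text"])
```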
274d ago
Parquet Content-Defined Chunking
Parquet Content-Defined Chunking TL;DR: Parquet Content-Defined Chunking (CDC) is now available in PyArrow and Pandas, enabling efficient deduplication of Parquet files on content-addressable storage systems like Hugging Face's Xet storage layer. CDC dramatically reduces data transfer and storage costs by uploading or downloading only the changed data chunks. Enable CDC by passing the use_content_defined_chunking argument:
import pandas as pd
import pyarrow.parquet as pq
df.to_parquet("hf://datasets/{user}/{repo}/path.parquet", use_content_defined_chunking=True)
pq.write_table(table, "hf://datasets/{user}/{repo}/path.parquet", use_content_defined_chunking=True)
Table of Contents - Introduction - Data Preparation - Different Use Cases for Parquet Deduplication - Using Parquet CDC feature with Pandas - References - Conclusion Introduction Apache Parquet is a columnar storage format that is widely used in the data engineering community. As of today, Hugging Face hosts nearly 21 PB of datasets, with Parquet files alone accounting for over 4 PB of that storage. Optimizing Parquet storage is therefore a…
274d · Open Source · #rag
290d ago
Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders
Reachy Mini – The Open-Source Robot for Today's and Tomorrow's AI Builders Reachy Mini is an expressive, open-source robot designed for human-robot interaction, creative coding, and AI experimentation. Fully programmable in Python (and soon JavaScript, Scratch) and priced from $299, it's your gateway into robotics AI: fun, customizable, and ready to be part of your next coding project. Whether you're an AI developer, hacker, researcher, teacher, robot enthusiast, or just coding with your kids on the weekend, Reachy Mini lets you develop, test, deploy, and share real-world AI applications from your desk, using the latest AI models! 🔩 Robot technical info Reachy Mini measures 11"/28cm in height and 6.3"/16cm in width (approximately 9"/23cm tall when in sleep mode) and weighs 3.3 lbs/1.5 kg. It comes as a kit and is available either in a lite version or as a fully…
290d · Open Source · #open-source
303d ago
Gemma 3n fully available in the open-source ecosystem!
Gemma 3n fully available in the open-source ecosystem! Today, Gemma 3n is finally available on the most used open source libraries. This includes transformers & timm, MLX, llama.cpp (text inputs), transformers.js, ollama, Google AI Edge, and others. This post quickly goes through practical snippets to demonstrate how to use the model with these libraries, and how easy it is to fine-tune it for other domains. Models released today Here is the Gemma 3n Release Collection Two model sizes have been released today, with two variants (base and instruct) each. The model names follow a non-standard nomenclature: they are called gemma-3n-E2B and gemma-3n-E4B. The E preceding the parameter count stands for Effective. Their actual parameter counts are 5B and 8B, respectively, but thanks to improvements in memory efficiency, they manage to only need 2B and 4B in VRAM…
303d · Open Source · #open-source
326d ago
SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data
SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data 🧭TL;DR Today, we introduce SmolVLA, a compact (450M), open-source Vision-Language-Action model for robotics that runs on consumer hardware. - Pretrained only on compatibly licensed, open-source community-shared datasets under the lerobot tag. - SmolVLA-450M outperforms much larger VLAs and strong baselines such as ACT on simulation (LIBERO, Meta-World) and real-world tasks (SO100, SO101). - Supports asynchronous inference for 30% faster response and 2× task throughput. Useful links: - Hardware used to train and evaluate SO-100/101: https://github.com/TheRobotStudio/SO-ARM100 - Base model https://huggingface.co/lerobot/smolvla_base - Paper: https://huggingface.co/papers/2506.01844 📚 Table of Contents - 🧭 TL;DR - 📖 Introduction - 🤖 Meet SmolVLA - 🚀 How to Use SmolVLA? - 🧠 Method - 📦 Community Datasets - 📊 Results - ✅ Conclusion - 📣 Call to Action Introduction Over the past few years, Transformers have driven remarkable progress…
326d · Open Source · #multimodal
326d ago
Real-Time AI Sound Generation on Arm: A Personal Tool for Creative Freedom
Real-Time AI Sound Generation on Arm: A Personal Tool for Creative Freedom As a software engineer and music producer, I’m always exploring how technology can expand creative expression. That curiosity recently led me to build a personal sound generation app that runs directly on-device—powered by an Arm-based CPU and open-source generative AI models. It’s fast, private, and enables me to generate studio-ready sounds from a simple prompt, all within seconds. This project brings together the best of several worlds: - The Stable Audio Open model from Stability AI, sourced from Hugging Face - Execution powered by PyTorch and TorchAudio - A fast, efficient pipeline that runs natively on Arm-based CPUs - A seamless creative handoff to Ableton Live A New Kind of Creative Companion When I’m deep in a music project using Ableton Live, I don’t want to interrupt my…
349d ago
LeRobot Community Datasets: The “ImageNet” of Robotics — When and How?
LeRobot Community Datasets: The “ImageNet” of Robotics — When and How? In this post, we: - Recognize the growing impact of community-contributed LeRobot datasets - Highlight the current challenges in robotic data collection and curation - Share practical steps and best practices to maximize the impact of this collective effort Our goal is to frame generalization as a data problem, and to show how building an open, diverse “ImageNet of robotics” is not just possible—but already happening. Introduction Recent advances in Vision-Language-Action (VLA) models have enabled robots to perform a wide range of tasks—from simple commands like “grasp the cube” to more complex activities like folding laundry or cleaning a table. These models aim to achieve generalization: the ability to perform tasks in novel settings, with unseen objects, and in varying conditions. “The biggest challenge in robotics isn’t dexterity, but…
349d · Open Source
376d ago
Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖
Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖 Since Hugging Face started the LeRobot library in 2024, led by ex-Tesla lead Remi Cadene, the Hugging Face Hub has quickly become the most widely used hub and software platform for open robotics with models, datasets, spaces and libraries. Today, we’re excited to take it a step further by welcoming Pollen Robotics to Hugging Face, a team that's spent the last 9 years building open-source robots and hardware. We believe robotics could be the next frontier unlocked by AI — and it should be open, affordable, and private. Our vision: a future where everyone in the community, from hobbyists to enterprises, can build or use robot assistants or games, starting from open solutions instead of closed, remote controlled, hardware. — Thomas Wolf, co-founder and chief scientist at Hugging…
376d · Open Source · #open-source
402d ago
AI Policy @🤗: Response to the White House AI Action Plan RFI
AI Policy @🤗: Response to the White House AI Action Plan RFI Context: Don't Sleep on (Strongly) Open Models' Capabilities Open approaches to AI development are not only (typically) more transparent, adaptable, and scientifically sound, they have also consistently reproduced or surpassed the performance of widely-used API-only commercial offerings on many tasks; and are increasingly doing so on shorter timelines, with increased resource efficiency. Our team's recent OlympicCoder outperforming Claude 3.7 on complex coding tasks with 7B parameters and an open-source post-training recipe, or AI2's fully open OLMo 2 models (with open training data) matching o1-mini performances, are two of the most recent compelling examples. These successes show that a robust AI strategy must leverage open and collaborative development to best drive performance, adoption, and security of the technology. We make three major recommendations in this direction. Recommendation 1: Recognize…
410d ago
LeRobot goes to driving school: World’s largest open-source self-driving dataset
LeRobot goes to driving school TL;DR of L2D, the world's largest self-driving dataset! - 90+ TeraBytes of multimodal data (5000+ hours of driving) from 30 cities in Germany - 6x surrounding HD cameras and complete vehicle state: Speed/Heading/GPS/IMU - Continuous: Gas/Brake/Steering and discrete actions: Gear/Turn Signals - Environment state: Lane count, Road type (highway|residential), Road surface (asphalt, cobbled, sett), Max speed limit. - Environment conditions: Precipitation, Conditions (Snow, Clear, Rain), Lighting (Dawn, Day, Dusk) - Designed for training end-to-end models conditioned on natural language instructions or future waypoints - Natural language instructions, e.g. "When the light turns green, drive over the tram tracks and then through the roundabout" for each episode - Future waypoints snapped to OpenStreetMap graph, additionally rendered in bird's-eye view - Expert (driving instructors) and student (learner drivers) policies State-of-the-art Vision Language Models and Large Language Models…
410d · Open Source · #open-source
449d ago
The AI tools for Art Newsletter - Issue 1
The AI tools for Art Newsletter First issue 🎉 The AI space is moving so fast it’s hard to believe that a year ago we still struggled to generate people with the correct number of fingers 😂. The last couple of years have been pivotal for open source models and tools for artistic usage. AI tools for creative expression have never been more accessible, and we’re only scratching the surface. Join us as we look back at the key milestones, tools, and breakthroughs in AI & Arts from 2024, and forward for what’s to come in 2025 (spoiler 👀: we’re starting a new monthly roundup 👇). Table of Contents - Major Releases of 2024 - Image Generation - Video Generation - Creative Tools that Shined in 2024 - What should we expect for AI & Art in 2025? - Starting off…
453d ago
State of open video generation models in Diffusers
State of open video generation models in Diffusers Open-source has also had its own surge of video generation models with CogVideoX, Mochi-1, Hunyuan, Allegro, and LTX Video. Is the video community having its “Stable Diffusion moment”? This post will provide a brief overview of the state of video generation models, where we are with respect to open video generation models, and how the Diffusers team is planning to support their adoption at scale. Specifically, we will discuss: - Capabilities and limitations of video generation models - Why video generation is hard - Open video generation models - Video generation with Diffusers - Inference and optimizations - Fine-tuning - Looking ahead Today’s Video Generation Models and their Limitations These are today's most popular video models for AI-generated content creation. Limitations: - High Resource Requirements: Producing high-quality videos requires large pretrained models,…
488d ago
Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo
Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo But what if we wanted to go a step further and control the text generation process itself by directly modifying the probability distribution? That’s where logit processing comes into play. Hugging Face's LogitsProcessor API lets you customize the prediction scores of the language model head, providing granular control over model behavior. The 🤗 Transformers library not only offers a rich set of built-in logits processors but also empowers the community to create and share custom processors tailored to unique use cases. Enter NVIDIA's LogitsProcessorZoo — a collection of powerful, modular logits processors designed for specific tasks such as controlling sequence lengths, enforcing key phrases, or guiding multiple-choice answers. Fully compatible with Hugging Face's generate method, NVIDIA’s library serves as an excellent example of community-driven innovation in logits processing. In this post, we’ll explore…
488d · Open Source · #gpu
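The LogitsProcessor API the post builds on looks like this in practice. A minimal sketch with a toy processor in the LogitsProcessorZoo spirit; gpt2 is used here only as a small stand-in model, not one from the post:

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class ForbidTokenProcessor(LogitsProcessor):
    """Mask a single token id at every decoding step so it can never be sampled."""

    def __init__(self, token_id: int):
        self.token_id = token_id

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        scores[:, self.token_id] = float("-inf")
        return scores

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("The quick brown", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=10,
    logits_processor=LogitsProcessorList([ForbidTokenProcessor(tok.encode(" fox")[0])]),
)
print(tok.decode(out[0]))
```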
502d ago
Open Preference Dataset for Text-to-Image Generation by the 🤗 Community
Open Preference Dataset for Text-to-Image Generation by the 🤗 Community TL;DR? All results can be found in this collection on the Hugging Face Hub and code for pre- and post-processing can be found in this GitHub repository. Most importantly, there is a ready-to-go preference dataset and a flux-dev-lora-finetune. If you want to show your support already, don’t forget to like, subscribe and follow us before you continue reading further. Unfamiliar with the Data is Better Together community? Data is Better Together (https://huggingface.co/data-is-better-together) is a collaboration between 🤗 Hugging Face and the Open-Source AI community. We aim to empower the open-source community to build impactful datasets collectively. You can follow the organization to stay up to date with the latest datasets, models, and community sprints. Similar efforts There have been several efforts to create an open image preference dataset but our effort…
502d · Open Source · #multimodal
506d ago
Welcome PaliGemma 2 – New vision language models by Google
Welcome PaliGemma 2 – New vision language models by Google PaliGemma 2 comes with new pre-trained (pt) models, in sizes of 3B, 10B, and 28B parameters. All of them support various input resolutions: 224x224, 448x448, and 896x896. These combinations provide a lot of flexibility for different use cases, so practitioners can choose the balance they need in the quality / efficiency space. In contrast, the previous PaliGemma was only available in the 3B variant. The pre-trained models have been designed for easy fine-tuning to downstream tasks. The first PaliGemma was widely adopted by the community for multiple purposes. With the increased flexibility from the additional variants, combined with better pre-trained quality, we can’t wait to see what the community can do this time. As an example, Google is also releasing some fine-tuned variants on the…
543d ago
Universal Assisted Generation: Faster Decoding with Any Assistant Model
Universal Assisted Generation: Faster Decoding with Any Assistant Model gemma-2-9b and Mixtral-8x22B-Instruct-v0.1 lack a much smaller version to use for assisted generation. In this blog post, we present Universal Assisted Generation: a method developed by Intel Labs and Hugging Face which extends assisted generation to work with a small language model from any model family 🤯. As a result, it is now possible to accelerate inference from any decoder or Mixture of Experts model by 1.5x-2.0x with almost zero overhead 🔥🔥🔥. Let's dive in! Introduction Nowadays, the strongest open weight LLMs typically have billions to hundreds of billions of parameters (hello Llama-3.1-405B 👋), and deploying these beasts in production environments poses a range of engineering challenges. One such challenge is that generating text from these large models is slow, which has prompted the community to develop a wide range of techniques…
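In the transformers API this boils down to passing the assistant model to generate(), plus both tokenizers since the two models come from different families. A minimal sketch; the assistant checkpoint is an example id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Target model (from the post) and a tiny assistant from an unrelated family.
tok = AutoTokenizer.from_pretrained("google/gemma-2-9b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b")
assistant_tok = AutoTokenizer.from_pretrained("double7/vicuna-68m")
assistant = AutoModelForCausalLM.from_pretrained("double7/vicuna-68m")

inputs = tok("Alice and Bob", return_tensors="pt")
# Passing both tokenizers lets generate() re-encode the assistant's draft tokens.
out = model.generate(
    **inputs,
    assistant_model=assistant,
    tokenizer=tok,
    assistant_tokenizer=assistant_tok,
    max_new_tokens=32,
)
print(tok.decode(out[0], skip_special_tokens=True))
```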
550d ago
Hugging Face Teams Up with Protect AI: Enhancing Model Security for the ML Community
Hugging Face Teams Up with Protect AI: Enhancing Model Security for the ML Community Protect AI is a company founded with a mission to create a safer AI-powered world. They are developing powerful tools, namely Guardian, to ensure that the rapid pace of AI innovation can continue without compromising security. Our decision to partner with Protect AI stems from their community driven approach to security, active support of open source, and expertise in all things security x AI. Interested in joining our security partnership / providing scanning information on the Hub? Please get in touch with us over at security@huggingface.co. Model security refresher To share models, we serialize weights, configs and other data structures we use to interact with the models, in order to facilitate storage and transport. Some serialization formats are vulnerable to nasty exploits, such as arbitrary code…
550d · Open Source
586d ago
Introducing Community Tools on HuggingChat
Introducing Community Tools on HuggingChat With this feature, we’re also expanding the modalities available in HuggingChat. You can now use community tools to understand images, generate videos, or answer with a text-to-speech model. The possibilities are endless and anyone can create tools using Spaces on Hugging Face! Explore existing tools here. In this post we’re going to look at a few use cases for creating community tools: - Turning a community Space into a tool - Creating a custom tool yourself - Enhance your assistants with community tools - Create a RAG tool on your own documents Turning a community Space into a tool You can turn anyone’s public Space into a tool. This is handy for using the latest models directly in HuggingChat. Let’s use DamarJati/FLUX.1-RealismLora as an example here. Start by creating a new tool and filling in…
586d · Open Source
598d ago
Hugging Face partners with TruffleHog to Scan for Secrets
Hugging Face partners with TruffleHog to Scan for Secrets TruffleHog is an open-source tool that detects and verifies secret leaks in code. With a wide range of detectors for popular SaaS and cloud providers, it scans files and repositories for sensitive information like credentials, tokens, and encryption keys. Accidentally committing secrets to code repositories can have serious consequences. By scanning repositories for secrets, TruffleHog helps developers catch and remove this sensitive information before it becomes a problem, protecting data and preventing costly security incidents. To combat secret leakage in public and private repositories, we worked with the TruffleHog team on two different initiatives: Enhancing our automated scanning pipeline with TruffleHog Creating a native Hugging Face scanner in TruffleHog Enhancing our automated scanning pipeline with TruffleHog At Hugging Face, we are committed to protecting our users' sensitive information. This is why…
598d · Open Source · #coding #open-source
646d ago
Docmatix - a huge dataset for Document Visual Question Answering
Docmatix - A huge dataset for Document Visual Question Answering [Figure: an example from the dataset] We first had the idea to create Docmatix when we developed The Cauldron, an extensive collection of 50 datasets for the fine-tuning of Vision-Language Models (VLMs), and Idefics2 in particular. Through this process, we identified a significant gap in the availability of large-scale Document Visual Question Answering (DocVQA) datasets. The primary dataset we relied on for Idefics2 was DocVQA, which contains 10,000 images and 39,000 question-answer (Q/A) pairs. Even after fine-tuning on this and other datasets, open-source models still show a large performance gap relative to closed-source ones. To address this limitation, we are excited to introduce Docmatix, a DocVQA dataset featuring 2.4 million images and 9.5 million Q/A pairs derived from 1.3 million PDF documents, a 240x increase in scale compared to previous datasets. Comparing Docmatix…
740d ago
Introducing Idefics2: A Powerful 8B Vision-Language Model for the community
Introducing Idefics2: A Powerful 8B Vision-Language Model for the community We are excited to release Idefics2, a general multimodal model that takes as input arbitrary sequences of texts and images, and generates text responses. It can answer questions about images, describe visual content, create stories grounded in multiple images, extract information from documents, and perform basic arithmetic operations. Idefics2 improves upon Idefics1: with 8B parameters, an open license (Apache 2.0), and enhanced OCR (Optical Character Recognition) capabilities, Idefics2 is a strong foundation for the community working on multimodality. Its performance on Visual Question Answering benchmarks is at the top of its size class, and it competes with much larger models such as LLaVA-Next-34B and MM1-30B-chat. Idefics2 is also integrated in 🤗 Transformers from the get-go and therefore is straightforward to finetune for many multimodal applications. You can try out the models on the…
740d · Open Source · #multimodal
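A minimal inference sketch for the released checkpoint via transformers; the image URL is a placeholder, so substitute any test image:

```python
import requests
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b")
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b", torch_dtype=torch.float16, device_map="auto"
)

# Placeholder URL: any RGB image works here.
image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
messages = [{"role": "user", "content": [{"type": "image"},
                                         {"type": "text", "text": "Describe this image."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```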
761d ago
Pollen-Vision: Unified interface for Zero-Shot vision models in robotics
Pollen-Vision: Unified interface for Zero-Shot vision models in robotics This is a guest blog post by the Pollen Robotics team. We are the creators of Reachy, an open-source humanoid robot designed for manipulation in the real world. In the context of autonomous behaviors, the essence of a robot's usability lies in its ability to understand and interact with its environment. This understanding primarily comes from visual perception, which enables robots to identify objects, recognize people, navigate spaces, and much more. We're excited to share the initial launch of our open-source pollen-vision library, a first step towards empowering our robots with the autonomy to grasp unknown objects. This library is a carefully curated collection of vision models chosen for their direct applicability to robotics. Pollen-vision is designed for ease of installation and use, composed of independent modules that can be combined…
799d ago
Synthetic data: save money, time and carbon with open source
Synthetic data: save money, time and carbon with open source tl;dr Should you fine-tune your own model or use an LLM API? Creating your own model puts you in full control but requires expertise in data collection, training, and deployment. LLM APIs are much easier to use but force you to send your data to a third party and create costly dependencies on LLM providers. This blog post shows how you can combine the convenience of LLMs with the control and efficiency of customized models. In a case study on identifying investor sentiment in the news, we show how to use an open-source LLM to create synthetic data to train your customized model in a few steps. Our resulting custom RoBERTa model can analyze a large news corpus for around $2.7 compared to $3061 with GPT4; emits around 0.12 kg…
799d · Open Source · #open-source
812d ago
SegMoE: Segmind Mixture of Diffusion Experts
SegMoE: Segmind Mixture of Diffusion Experts, released today with integrations in diffusers 🔥! Among the features and integrations being released today: - Models on the Hub, with their model cards and licenses (Apache 2.0) - Github Repository to create your own MoE-style models. Table of Contents - What is SegMoE - Inference - Comparison - Creating your Own SegMoE - Disclaimers and ongoing work - Additional Resources - Conclusion What is SegMoE? SegMoE models follow the same architecture as Stable Diffusion. Like Mixtral 8x7b, a SegMoE model comes with multiple models in one. The way this works is by replacing some Feed-Forward layers with a sparse MoE layer. A MoE layer contains a router network to select which experts process which tokens most efficiently. You can use the segmoe package to create your own MoE models! The process takes just a few minutes. For further…
812d · Open Source · #inference
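The router-plus-experts idea the post describes generalizes beyond diffusion models. A toy PyTorch sketch of a sparse MoE feed-forward layer with top-k routing; dimensions are illustrative and this is not SegMoE's actual implementation:

```python
import torch
import torch.nn as nn

class SparseMoEFFN(nn.Module):
    """Toy sparse MoE feed-forward layer: a router picks top-k experts per token."""

    def __init__(self, dim=64, hidden=256, n_experts=4, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        logits = self.router(x)
        weights, idx = logits.topk(self.top_k, dim=-1)  # top-k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e  # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

print(SparseMoEFFN()(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```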
816d ago
Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding
Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding Introduction Recently, code generation models have become very popular, especially with the release of state-of-the-art open-source models such as BigCode’s StarCoder and Meta AI’s Code Llama. A growing number of works focus on making Large Language Models (LLMs) more optimized and accessible. In this blog, we are happy to share the latest results of LLM optimization on Intel Xeon focusing on the popular code generation LLM, StarCoder. The StarCoder Model is a cutting-edge LLM specifically designed for assisting the user with various coding tasks such as code completion, bug fixing, code summarization, and even generating code snippets from natural language descriptions. The StarCoder model is a member of the StarCoder family which includes the StarCoderBase variant as well. These Large Language Models for Code (Code LLMs) are trained…
957d ago
SafeCoder vs. Closed-source Code Assistants
SafeCoder vs. Closed-source Code Assistants In "How Google Tests Software" (Addison-Wesley, 2012), Google reports that fixing a bug during system tests - the final testing stage - is 1000x more expensive than fixing it at the unit testing stage. This puts much pressure on developers - the first link in the chain - to write quality code from the get-go. For all the hype surrounding generative AI, code generation seems a promising way to help developers deliver better code fast. Indeed, early studies show that managed services like GitHub Copilot or Amazon CodeWhisperer help developers be more productive. However, these services rely on closed-source models that can't be customized to your technical culture and processes. Hugging Face released SafeCoder a few weeks ago to fix this. SafeCoder is a code assistant solution built for the enterprise that gives you state-of-the-art…
957d · Open Source · #coding
1009d ago
Results of the Open Source AI Game Jam
Results of the Open Source AI Game Jam The primary objective was to create games that incorporate at least one Open Source AI Tool. Although proprietary AI tools were allowed, we encouraged participants to integrate open-source tools into their game or workflow. The response to our initiative was beyond our expectations, with over 1300 signups and the submission of 88 amazing games. You can try them here 👉 https://itch.io/jam/open-source-ai-game-jam/entries The Theme: Expanding To inspire creativity, we decided on the theme of "EXPANDING." We left it open to interpretation, allowing developers to explore and experiment with their ideas, leading to a diverse range of games. The games were evaluated by their peers and contributors based on three key criteria: fun, creativity, and adherence to the theme. The top 10 games were then presented to three judges (Dylan Ebert, Thomas Simonini and…
1009d · Open Source · #open-source
1013d ago
Building an AI WebTV
Building an AI WebTV 👉 Watch the stream now by going to the AI WebTV Space. If you are using a mobile device, you can view the stream from the Twitch mirror. Concept The motivation for the AI WebTV is to demo videos generated with open-source text-to-video models such as Zeroscope and MusicGen, in an entertaining and accessible way. You can find those open-source models on the Hugging Face hub: - For video: zeroscope_v2_576 and zeroscope_v2_XL - For music: musicgen-melody The individual video sequences are purposely made to be short, meaning the WebTV should be seen as a tech demo/showreel rather than an actual show (with an art direction or programming). Architecture The AI WebTV works by taking a sequence of video shot prompts and passing them to a text-to-video model to generate a sequence of takes. Additionally, a base…
1013d ago
Open-Source Text Generation & LLM Ecosystem at Hugging Face
Open-Source Text Generation & LLM Ecosystem at Hugging Face Text generation and conversational technologies have been around for ages. Earlier challenges in working with these technologies were controlling both the coherence and diversity of the text through inference parameters and discriminative biases. More coherent outputs were less creative and closer to the original training data and sounded less human. Recent developments overcame these challenges, and user-friendly UIs enabled everyone to try these models out. Services like ChatGPT have recently put the spotlight on powerful models like GPT-4 and caused an explosion of open-source alternatives like Llama to go mainstream. We think these technologies will be around for a long time and become more and more integrated into everyday products. This post is divided into the following sections: - Brief background on text generation - Licensing - Tools in the Hugging…
1013d · Open Source · #open-source
1025d ago
Making ML-powered web games with Transformers.js
Making ML-powered web games with Transformers.js Quick links - Demo: Doodle Dash - Source code: doodle-dash - Join the game jam: Open Source AI Game Jam Overview Before we start, let's talk about what we'll be creating. The game is inspired by Google's Quick, Draw! game, where you're given a word and a neural network has 20 seconds to guess what you're drawing (repeated 6 times). In fact, we'll be using their training data to train our own sketch detection model! Don't you just love open source? 😍 In our version, you'll have one minute to draw as many items as you can, one prompt at a time. If the model predicts the correct label, the canvas will be cleared and you'll be given a new word. Keep doing this until the timer runs out! Since the game runs locally…
1025d · Open Source · #training #open-source
1038d ago
Panel on Hugging Face
Panel on Hugging Face What does Panel offer? Panel is an open-source Python library that lets you easily build powerful tools, dashboards and complex applications entirely in Python. It has a batteries-included philosophy, putting the PyData ecosystem, powerful data tables and much more at your fingertips. High-level reactive APIs and lower-level callback based APIs ensure you can quickly build exploratory applications, but you aren’t limited if you build complex, multi-page apps with rich interactivity. Panel is a member of the HoloViz ecosystem, your gateway into a connected ecosystem of data exploration tools. Panel, like the other HoloViz tools, is a NumFocus-sponsored project, with support from Anaconda and Blackstone. Here are some notable features of Panel that our users find valuable. - Panel provides extensive support for various plotting libraries, such as Matplotlib, Seaborn, Altair, Plotly, Bokeh, PyDeck, Vizzu, and more. -…
1038d · Open Source · #fine-tuning #open-source
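A minimal sketch of Panel's reactive style: one widget bound to a plain Python function, servable as an app (names are illustrative):

```python
import panel as pn

pn.extension()

# The slider drives the function output; Panel re-renders on every change.
slider = pn.widgets.IntSlider(name="n", start=1, end=10, value=3)

def squares(n):
    return ", ".join(str(i * i) for i in range(1, n + 1))

app = pn.Column("## Squares", slider, pn.bind(squares, slider))
app.servable()  # run with: panel serve app.py
```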
1048d ago
The Hugging Face Hub for Galleries, Libraries, Archives and Museums
The Hugging Face Hub for Galleries, Libraries, Archives and Museums What is the Hugging Face Hub? Hugging Face aims to make high-quality machine learning accessible to everyone. This goal is pursued in various ways, including developing open-source code libraries such as the widely-used Transformers library, offering free courses, and providing the Hugging Face Hub. The Hugging Face Hub is a central repository where people can share and access machine learning models, datasets and demos. The Hub hosts over 190,000 machine learning models, 33,000 datasets and over 100,000 machine learning applications and demos. These models cover a wide range of tasks from pre-trained language models, text, image and audio classification models, object detection models, and a wide range of generative models. The models, datasets and demos hosted on the Hub span…
1059d ago
Announcing the Open Source AI Game Jam 🎮
Announcing the Open Source AI Game Jam 🎮 Unleash Your Creativity with AI Tools and make a game in a weekend! We're thrilled to announce the first ever Open Source AI Game Jam, where you will create a game using AI tools. With AI's potential to enhance game experiences and workflows, we're excited to see what you can accomplish: incorporate generative AI tools like Stable Diffusion into your game or workflow to unlock new features and accelerate your development process. From texture generation to lifelike NPCs and realistic text-to-speech, the options are endless. 📆 Mark your calendars: the game jam will take place from Friday to Sunday, July 7-9. Claim Your Free Spot in the Game Jam 👉 https://itch.io/jam/open-source-ai-game-jam Why Are We Organizing This? In a time when some popular game jams restrict the use of AI tools, we believe…
1059d · Open Source · #open-source
1082d ago
Creating a Coding Assistant with StarCoder
Creating a Coding Assistant with StarCoder Fortunately, there are now several high-quality open-source alternatives! These include SalesForce’s CodeGen Mono 16B for Python, or Replit’s 3B parameter model trained on 20 programming languages. The new kid on the block is BigCode’s StarCoder, a 16B parameter model trained on one trillion tokens sourced from 80+ programming languages, GitHub issues, Git commits, and Jupyter notebooks (all permissively licensed). With an enterprise-friendly license, 8,192 token context length, and fast large-batch inference via multi-query attention, StarCoder is currently the best open-source choice for code-based applications. In this blog post, we’ll show how StarCoder can be fine-tuned for chat to create a personalised coding assistant! Dubbed StarChat, we’ll explore several technical details that arise when using large language models (LLMs) as coding assistants, including: - How LLMs can be prompted to act like conversational agents. -…
1095d ago
Databricks ❤️ Hugging Face: up to 40% faster training and tuning of Large Language Models
Databricks ❤️ Hugging Face: up to 40% faster training and tuning of Large Language Models “It's been great to see Databricks release models and datasets to the community, and now we see them extending that work with direct open source commitment to Hugging Face. Spark is one of the most efficient engines for working with data at scale, and it's great to see that users can now benefit from that technology to more effectively fine tune models from Hugging Face.” — Clem Delangue, Hugging Face CEO Hugging Face gets first-class Spark support Over the past few weeks, we’ve gotten many requests from users asking for an easier way to load their Spark dataframe into a Hugging Face dataset that can be utilized for model training or tuning. Prior to today’s release, to get data from a Spark dataframe into a…
1095dOpen Source#training#open-source
1097d ago
Introducing HuggingFace blog for Chinese speakers: Fostering Collaboration with the Chinese AI community
Introducing HuggingFace blog for Chinese speakers: Fostering Collaboration with the Chinese AI community Welcome to our blog for Chinese speakers! We are delighted to introduce Hugging Face’s new blog for Chinese speakers: hf.co/blog/zh! A committed group of volunteers has made this possible by translating our invaluable resources, including blog posts and comprehensive courses on transformers, diffusion, and reinforcement learning. This step aims to make our content accessible to the ever-growing Chinese AI community, fostering mutual learning and collaboration. Recognizing the Chinese AI Community’s Accomplishments We want to highlight the remarkable achievements and contributions of the Chinese AI community, which has demonstrated exceptional talent and innovation. Groundbreaking advancements like HuggingGPT, ChatGLM, RWKV, ChatYuan, and ModelScope text-to-video models, as well as contributions from IDEA CCNL and BAAI, underscore the incredible potential within the community. In addition, the Chinese AI community has been actively engaged…
1097dOpen Source
1115d ago
Snorkel AI x Hugging Face: unlock foundation models for enterprises
Snorkel AI x Hugging Face: unlock foundation models for enterprises As OpenAI releases GPT-4 and Google debuts Bard in beta, enterprises around the world are excited to leverage the power of foundation models. As that excitement builds, so does the realization that most companies and organizations are not equipped to properly take advantage of foundation models. Foundation models pose a unique set of challenges for enterprises. Their larger-than-ever size makes them difficult and expensive for companies to host themselves, and using off-the-shelf FMs for production use cases could mean poor performance or substantial governance and compliance risks. Snorkel AI bridges the gap between foundation models and practical enterprise use cases and has yielded impressive results for AI innovators like Pixability. We’re teaming with Hugging Face, best known for its enormous repository of ready-to-use open-source models, to provide enterprises with even…
1115dOpen Source#gpt#rag#open-source
1233d ago
From GPT2 to Stable Diffusion: Hugging Face arrives to the Elixir community
From GPT2 to Stable Diffusion: Hugging Face arrives to the Elixir community To help anyone get started with those models, the team behind Livebook - a computational notebook platform for Elixir - created a collection of "Smart cells" that allows developers to scaffold different Neural Network tasks in only 3 clicks. You can watch my video announcement to learn more: Thanks to the concurrency and distribution support in the Erlang Virtual Machine, which Elixir runs on, developers can embed and serve these models as part of their existing Phoenix web applications, integrate into their data processing pipelines with Broadway, and deploy them alongside their Nerves embedded systems - without a need for 3rd-party dependencies. In all scenarios, Bumblebee models compile to both CPU and GPU. Background The efforts to bring Machine Learning to Elixir started almost 2 years ago with…
1233dOpen Source
1251d ago
An overview of inference solutions on Hugging Face
An Overview of Inference Solutions on Hugging Face At Hugging Face, we are obsessed with simplifying ML development and operations without compromising on state-of-the-art quality. In this respect, the ability to test and deploy the latest models with minimal friction is critical, all along the lifecycle of an ML project. Optimizing the cost-performance ratio is equally important, and we'd like to thank our friends at Intel for sponsoring our free CPU-based inference solutions. This is another major step in our partnership. It's also great news for our user community, who can now enjoy the speedup delivered by the Intel Xeon Ice Lake architecture at zero cost. Now, let's review your inference options with Hugging Face. Free Inference Widget One of my favorite features on the Hugging Face hub is the Inference Widget. Located on the model page, the Inference Widget…
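For the hosted options described here, a minimal sketch of calling the free Inference API over HTTP, assuming a valid HF_TOKEN environment variable and an example sentiment model:

import os
import requests

API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

# The widget on each model page drives this same endpoint
response = requests.post(API_URL, headers=headers, json={"inputs": "I love the free CPU-backed inference widget."})
print(response.json())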
1251dOpen Source#inference
1481d ago
~Don't~ Repeat Yourself
Don't Repeat Yourself* Designing open-source libraries for modern machine learning 🤗 Transformers Design Philosophy "Don't repeat yourself", or DRY, is a well-known principle of software development. The principle originates from "The Pragmatic Programmer", one of the most widely read books on code design. The principle's simple message makes obvious sense: Don't rewrite logic that already exists somewhere else. This ensures the code remains in sync, making it easier to maintain and more robust. Any change to this logical pattern will uniformly affect all of its dependencies. At first glance, the design of Hugging Face's Transformers library couldn't be more contrary to the DRY principle. Code for the attention mechanism is more or less copied over 50 times into different model files. Sometimes the code of the whole BERT model is copied into other model files. We often force new model contributions…
1481dOpen Source#rag#coding#open-source
1555d ago
Welcome Stable-baselines3 to the Hugging Face Hub 🤗
Welcome Stable-baselines3 to the Hugging Face Hub 🤗 Stable-Baselines3 is one of the most popular PyTorch Deep Reinforcement Learning libraries, making it easy to train and test your agents in a variety of environments (Gym, Atari, MuJoCo, Procgen...). With this integration, you can now host your saved models 💾 and load powerful models from the community. In this article, we’re going to show how you can do it. Installation To use stable-baselines3 with Hugging Face Hub, you just need to install these 2 libraries: pip install huggingface_hub pip install huggingface_sb3 Finding Models We’re currently uploading saved models of agents playing Space Invaders, Breakout, LunarLander and more. On top of this, you can find all stable-baselines-3 models from the community here Once you’ve found the model you need, just copy the repository id: Download a model from…
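A minimal sketch of the download side, assuming the huggingface_sb3 helper and an illustrative PPO LunarLander repository on the Hub:

from huggingface_sb3 import load_from_hub
from stable_baselines3 import PPO

# Fetch a saved agent checkpoint from a community repository
checkpoint = load_from_hub(
    repo_id="sb3/ppo-LunarLander-v2",   # example repository id
    filename="ppo-LunarLander-v2.zip",  # saved model file inside the repo
)
model = PPO.load(checkpoint)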
1555dOpen Source
1586d ago
Gradio is joining Hugging Face!
Gradio is joining Hugging Face! Gradio is joining Hugging Face! By acquiring Gradio, a machine learning startup, Hugging Face will be able to offer users, developers, and data scientists the tools needed to get to high level results and create better models and tools... Hmm, paragraphs about acquisitions like the one above are so common that an algorithm could write them. In fact, one did!! This first paragraph was written with the Acquisition Post Generator, a machine learning demo on Hugging Face Spaces. You can run it yourself in your browser: provide the names of any two companies and you'll get a reasonable-sounding start to an article announcing their acquisition! The Acquisition Post Generator was built using our open-source Gradio library -- it is just one of our recent collaborations with Hugging Face. And I'm excited to announce that these…
1586dOpen Source#rag#coding#open-source
1642d ago
Course Launch Community Event
Course Launch Community Event To go with this release, we are organizing a large community event to which you are invited! The program includes two days of talks, then team projects focused on fine-tuning a model on any NLP task, ending with live demos like this one. Those demos will go nicely in your portfolio if you are looking for a new job in Machine Learning. We will also deliver a certificate of completion to all participants who manage to build one of them. AWS is sponsoring this event by offering free compute to participants via Amazon SageMaker. To register, please fill out this form. You will find below more details on the two days of talks. Day 1 (November 15th): A high-level view of Transformers and how to train them The first day of talks will focus on a…
1642dOpen Source
1747d ago
Welcome spaCy to the Hugging Face Hub
Welcome spaCy to the Hugging Face Hub Hugging Face makes it really easy to share your spaCy pipelines with the community! With a single command, you can upload any pipeline package, with a pretty model card and all required metadata auto-generated for you. The inference API currently supports NER out-of-the-box, and you can try out your pipeline interactively in your browser. You'll also get a live URL for your package that you can pip install from anywhere for a smooth path from prototype all the way to production! Finding models Over 60 canonical models can be found in the spaCy org. These models are from the latest 3.1 release, so you can try the latest released models right now! On top of this, you can find all spaCy models from the community here https://huggingface.co/models?filter=spacy. Widgets This integration includes support for…
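A minimal sketch of the upload side, assuming the spacy-huggingface-hub helper package and a pipeline wheel already built with `spacy package` (the filename below is illustrative):

from spacy_huggingface_hub import push

# Push a packaged spaCy pipeline wheel to the Hub; returns links to the new repo
result = push("./en_ner_fashion-0.0.0-py3-none-any.whl")
print(result["url"])  # live model page with widget and auto-generated card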
1747dOpen Source#inference
[MTR]MIT Technology Review · 1 article · visit →
4d ago
Recent books from the MIT community
Recent books from the MIT community May/June 2026 Priority Technologies: Ensuring US Security and Shared Prosperity Edited by Elisabeth B. Reynolds, professor of the practice of urban studies and planning and former executive director of the MIT Task Force on the Work of the Future MIT PRESS, 2026, $24.95 The Shape of Wonder: How Scientists Think, Work, and Live By Alan Lightman, professor of the practice of the humanities, and Martin Rees PENGUIN RANDOM HOUSE, 2025, $28 Spheres of Injustice: The Ethical Promise of Minority Presence By Bruno Perreau, professor of French studies and language MIT PRESS, 2025, $34 The Analytics Edge in Healthcare By Dimitris Bertsimas, SM ’87, PhD ’88, professor of management and operations research and associate dean of online education and AI; Agni Orfanoudaki, PhD ’21; and Holly Wiberg DYNAMIC IDEAS, 2025, $110 The Art of Monetary…
4dOpen Sourceby MIT Alumni News Staff
[NV]NVIDIA Developer Blog · 1 article · visit →
5d ago
Maximizing Memory Efficiency to Run Bigger Models on NVIDIA Jetson
The boom in open source generative AI models is pushing beyond data centers into machines operating in the physical world. Developers are eager to deploy these models at the edge, enabling physical AI agents and autonomous robots to automate heavy-duty tasks. A key challenge is efficiently running multi-billion-parameter models on edge devices with limited memory. With ongoing constraints on memory supply and rising costs, developers are focused on achieving more with less. The NVIDIA Jetson platform supports popular open models while delivering strong runtime performance and memory optimization at the edge. For edge developers, the memory footprint determines whether a system functions. Unlike cloud environments, edge devices operate under strict memory limits, with CPU and GPU sharing constrained resources. Inefficient memory use can lead to bottlenecks, latency spikes, or system failure. Meanwhile, modern edge applications often run multiple pipelines—such as…
5dOpen Source#coding#open-source#gpuby Anshuman Bhat
[OLL]Ollama Blog · 5 articles · visit →
100d ago
OpenAI Codex with Ollama January 15, 2026 Open models can be used with OpenAI's Codex CLI through Ollama. Codex can read, modify, and execute code in your working directory using models such as gpt-oss:20b, gpt-oss:120b, or other open-weight alternatives.
OpenAI Codex with Ollama January 15, 2026 Open models can be used with OpenAI’s Codex CLI through Ollama. Codex can read, modify, and execute code in your working directory using models such as gpt-oss:20b, gpt-oss:120b, or other open-weight alternatives. Get started Install Codex CLI: npm install -g @openai/codex Start Codex with the --oss flag: codex --oss By default, Codex will use the local gpt-oss:20b model. Note: Codex requires a large context window. We recommend at least 32K tokens. See the documentation for how to adjust context length in Ollama. Changing models You can switch to a different model using the -m flag: codex --oss -m gpt-oss:120b Cloud models All models on Ollama Cloud work with Codex. codex --oss -m gpt-oss:120b-cloud Learn more For more detailed setup instructions and configuration options, see the Codex integration guide.
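On the context-window note, a minimal sketch of raising it per request, assuming the ollama Python client (num_ctx is the context-length option):

import ollama

response = ollama.chat(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "Summarize this repository layout."}],
    options={"num_ctx": 32768},  # at least 32K tokens, per the recommendation above
)
print(response["message"]["content"])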
100dOpen Source#llama#coding
178d ago
OpenAI gpt-oss-safeguard October 29, 2025 Ollama is partnering with OpenAI and ROOST (Robust Open Online Safety Tools) to bring the latest gpt-oss-safeguard reasoning models to users for safety classification tasks. gpt-oss-safeguard models are available in two sizes: 20B and 120B, and are permissively licensed under the Apache 2.0 license.
OpenAI gpt-oss-safeguard October 29, 2025 Ollama is partnering with OpenAI and ROOST (Robust Open Online Safety Tools) to bring the latest gpt-oss-safeguard reasoning models to users for safety classification tasks. gpt-oss-safeguard models are available in two sizes: 20B and 120B, and are permissively licensed under the Apache 2.0 license. Get started - Download Ollama - Open a terminal and run the model: 20B: ollama run gpt-oss-safeguard:20b 120B: ollama run gpt-oss-safeguard:120b Highlights - Trained to reason about safety: Trained and tuned for safety reasoning to accommodate use cases like LLM input-output filtering, online content labeling and offline labeling for Trust and Safety use cases. - Bring your own policy: Interprets your written policy, so it generalizes across products and use cases with minimal engineering. - Reasoned decisions, not just scores: Gain complete access to the model’s reasoning process, facilitating easier debugging…
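A minimal sketch of the bring-your-own-policy idea, assuming the ollama Python client; the policy text and example input are stand-ins:

import ollama

policy = "Flag any content that requests instructions for illegal activity; otherwise allow."

response = ollama.chat(
    model="gpt-oss-safeguard:20b",
    messages=[
        {"role": "system", "content": policy},  # your written policy, interpreted at inference time
        {"role": "user", "content": "How do I hotwire a car?"},  # content to classify
    ],
)
print(response["message"]["content"])  # a reasoned decision, not just a score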
178dOpen Source#llama#safety
263d ago
OpenAI gpt-oss August 5, 2025 Ollama partners with OpenAI to bring gpt-oss to Ollama and its community.
OpenAI gpt-oss August 5, 2025 Welcome OpenAI’s gpt-oss! Ollama partners with OpenAI to bring its latest state-of-the-art open weight models to Ollama. The two models, 20B and 120B, bring a whole new local chat experience, and are designed for powerful reasoning, agentic tasks, and versatile developer use cases. Feature highlights - Agentic capabilities: Use the models’ native capabilities for function calling, web browsing (Ollama is providing a built-in web search that can be optionally enabled to augment the model with the latest information), python tool calls, and structured outputs. - Full chain-of-thought: Gain complete access to the model’s reasoning process, facilitating easier debugging and increased trust in outputs. - Configurable reasoning effort: Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs. - Fine-tunable: Fully customize models to your specific use case through…
263dOpen Source#llama
694d ago
An entirely open-source AI code assistant inside your editor May 31, 2024 Continue enables you to easily create your own coding assistant directly inside Visual Studio Code and JetBrains with open-source LLMs.
An entirely open-source AI code assistant inside your editor May 31, 2024 This is a guest post from Ty Dunn, Co-founder of Continue, that covers how to set up, explore, and figure out the best way to use Continue and Ollama together. Continue enables you to easily create your own coding assistant directly inside Visual Studio Code and JetBrains with open-source LLMs. All this can run entirely on your own laptop or have Ollama deployed on a server to remotely power code completion and chat experiences based on your needs. To get set up, you’ll want to install Continue and Ollama. Once you have them downloaded, here’s what we recommend exploring: Try out Mistral AI’s Codestral 22B model for autocomplete and chat As of now, Codestral is our current favorite model capable of both autocomplete and chat. This model demonstrates how LLMs…
694dOpen Source#coding#open-source
705d ago
Google announces Firebase Genkit with Ollama support May 20, 2024 At Google IO 2024, Google announced Ollama support in Firebase Genkit, a new open-source framework for developers to build, deploy and monitor production-ready AI-powered apps.
Google announces Firebase Genkit with Ollama support May 20, 2024 At Google IO 2024, Google unveiled Firebase Genkit, featuring Ollama support for running Google’s open-source Gemma model on your local machine. Firebase Genkit is a new open-source framework for developers to build, deploy and monitor production-ready AI-powered apps. Getting started Firebase Genkit works with Ollama on MacOS, Windows, Linux, and via Docker containers.

Install Genkit:
npm i -g genkit

Download Google’s Gemma model:
ollama pull gemma

If you don’t have Ollama installed, it can be downloaded here.

Create and initialize a new Node.js project:
mkdir genkit-ollama
cd genkit-ollama
npm init
genkit init

Genkit will now be running on localhost:4000
[OAI]OpenAI Blog · 20 articles · visit →
3d ago
Introducing OpenAI Privacy Filter
Today we’re releasing OpenAI Privacy Filter, an open-weight model for detecting and redacting personally identifiable information (PII) in text. This release is part of our broader effort to support a more resilient software ecosystem by providing developers practical infrastructure for building with AI safely, including tools and models that make strong privacy and security protections easier to implement from the start. Privacy Filter is a small model with frontier personal data detection capability. It is designed for high-throughput privacy workflows, and is able to perform context-aware detection of PII in unstructured text. It can run locally, which means that PII can be masked or redacted without leaving your machine. It processes long inputs efficiently, making redaction decisions in a quick, single pass. At OpenAI, we use a fine-tuned version of Privacy Filter in our own privacy-preserving workflows. We developed Privacy…
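A hypothetical sketch only: the repository id and pipeline task below are assumptions, imagining the open weights land on Hugging Face with a standard token-classification head for PII spans:

from transformers import pipeline

redactor = pipeline(
    "token-classification",
    model="openai/privacy-filter",  # hypothetical repo id, not confirmed by the post
    aggregation_strategy="simple",  # merge sub-tokens into whole PII spans
)

text = "Contact Jane Doe at jane@example.com or 555-0123."
# Replace detected spans from the end of the string so character offsets stay valid
for span in sorted(redactor(text), key=lambda s: s["start"], reverse=True):
    text = text[: span["start"]] + f"[{span['entity_group']}]" + text[span["end"] :]
print(text)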
3dOpen Source#local
23d ago
OpenAI acquires TBPN
Fidji Simo shared this message with the company earlier today: I’m excited to share that we’ve acquired TBPN. This acquisition brings a team with strong editorial instincts, deep audience understanding, and a proven ability to convene influential voices across tech, business, and culture. TBPN has built something pretty special. It’s one of the places where the conversation about AI and builders is actually happening day to day. A lot of you already watch it, and rely on it to stay close to what’s going on. As I've been thinking about the future of how we communicate at OpenAI, one thing that's become clear is that the standard communications playbook just doesn't apply to us. We're not a typical company. We're driving a really big technological shift. And with our mission to ensure artificial general intelligence benefits…
23dOpen Source
32d ago
Update on the OpenAI Foundation
Update on the OpenAI Foundation A note from Bret Taylor, Chair of the Board of Directors of the OpenAI Foundation Last fall, OpenAI announced its recapitalization, paving the way for the OpenAI Foundation to access significant resources. Today, we’re sharing how the Foundation is starting to put that support to work. Our mission is to ensure artificial general intelligence benefits all of humanity. This is a multi-faceted endeavor. AI is already changing how people work, learn, and access care. It has the potential to unlock extraordinary benefits—faster medical breakthroughs, accelerated scientific discovery, more personalized services in healthcare and education, new tools for creativity and invention, higher productivity and economic growth, improved public services like transportation systems, and so much more. Our belief in this potential has guided OpenAI since its founding. But building powerful systems to benefit humanity is only…
32dOpen Source
37d ago
OpenAI to acquire Astral
OpenAI to acquire Astral Accelerates Codex growth to power the next generation of Python developer tools Today we’re announcing that OpenAI will acquire Astral, bringing powerful open source developer tools into our Codex ecosystem. Astral has built some of the most widely used open source Python tools, helping developers move faster with modern tooling like uv, Ruff, and ty. These tools power millions of developer workflows and have become part of the foundation of modern Python development. As part of our developer-first philosophy, after closing, OpenAI plans to support Astral’s open source products. By bringing Astral’s tooling and engineering expertise to OpenAI, we will accelerate our work on Codex and expand what AI can do across the software development lifecycle. Codex has already seen 3x user growth and 5x usage increase since the start of the…
58d ago
OpenAI Codex and Figma launch seamless code-to-design experience
OpenAI Codex and Figma launch seamless code-to-design experience Key takeaways: - New Codex to Figma integration helps users move seamlessly between code and the design canvas to iterate and ship products faster. - The Figma MCP Server connects Codex directly to its design platform and tools like Figma Make and FigJam. - The new integration expands the partnership between the two companies, which includes the Figma app in ChatGPT, and bringing the latest OpenAI models to Figma’s platform. OpenAI and Figma are deepening their partnership with a new code-to-design integration that connects Figma directly to Codex. Product builders can easily generate Figma designs from Codex and implement designs from Figma files back into code. Figma is a design and product development platform that lets teams create, prototype, and iterate on digital products together in real time. Using MCP, an open-source…
136d ago
Strengthening cyber resilience as AI capabilities advance
Strengthening cyber resilience as AI capabilities advance As our models grow more capable in cybersecurity, we’re investing in strengthening them, layering in safeguards, and partnering with global security experts. Cyber capabilities in AI models are advancing rapidly, bringing meaningful benefits for cyberdefense as well as new dual-use risks that must be managed carefully. For example, capabilities assessed through capture-the-flag (CTF) challenges have improved from 27% on GPT‑5 in August 2025 to 76% on GPT‑5.1‑Codex‑Max in November 2025. We expect that upcoming AI models will continue on this trajectory; in preparation, we are planning and evaluating as though each new model could reach ‘High’ levels of cybersecurity capability, as measured by our Preparedness Framework. By this, we mean models that can either develop working zero-day remote exploits against well-defended…
136dOpen Source
143d ago
Announcing the initial People-First AI Fund grantees
Announcing the initial People-First AI Fund grantees Unrestricted funding for 208 nonprofits advancing people-first work nationwide. The OpenAI Foundation is announcing the first recipients from the People-First AI Fund, a multi-million dollar investment in community-based nonprofits working to strengthen local communities and expand the opportunity of AI. Through an open call, the Foundation will provide $40.5 million in unrestricted grants to 208 nonprofits across the United States. Funds will be disbursed by the end of the year. A second wave of $9.5 million in Board-directed grants will be announced in the coming months, supporting organizations already advancing transformative AI work in areas like health that reflect the Fund’s people-first values and potential for broad public benefit. “The People-First AI Fund reflects our commitment to supporting a wide range of organizations advancing work that strengthens communities and expands opportunity. We’re proud…
143dOpen Source
178d ago
Introducing gpt-oss-safeguard
Today, we’re releasing a research preview of gpt-oss-safeguard, our open-weight reasoning models for safety classification tasks, available in two sizes: gpt-oss-safeguard-120b and gpt-oss-safeguard-20b. These models are fine-tuned versions of our gpt-oss open models and available under the same permissive Apache 2.0 license, allowing anyone to use, modify, and deploy them freely. Both models can be downloaded today from Hugging Face. The gpt-oss-safeguard models use reasoning to directly interpret a developer-provided policy at inference time—classifying user messages, completions, and full chats according to the developer’s needs. The developer always decides what policy to use, so responses are more relevant and tailored to the developer’s use case. The model uses chain-of-thought, which the developer can review to understand how the model is reaching its decisions. Additionally, the policy is provided during inference, rather than being trained into the…
178dOpen Source#coding#safety
184d ago
Work smarter with your company knowledge in ChatGPT
Work smarter with your company knowledge in ChatGPT A new way to bring together context from all your connected tools for answers that know your business. ChatGPT can help with almost any question, but the context you need to get work done often lives in your internal tools: docs, files, messages, emails, tickets, and project trackers. Those tools don’t always connect to each other, and the most accurate answer is often spread across them. Today we're introducing company knowledge for ChatGPT Business, Enterprise, and Edu. Company knowledge brings all the context from your connected apps together in ChatGPT, giving you answers specific to your business—so you can make decisions, take action, and get things done. With company knowledge, the information in your connected apps—like Slack, SharePoint, Google Drive, and GitHub—becomes more useful and accessible. It’s powered by a version of…
184dOpen Source#gpt
229d ago
A People-First AI Fund: $50M to support nonprofits
At OpenAI, ensuring broad deployment of benefits and applying an iterative approach to innovation is core to who we are. We believe AI should help solve humanity’s hardest problems, and that we should listen to and learn from organizations already leading that work on the frontlines. In July 2025, we announced a $50 million commitment to support nonprofits and mission-focused organizations working at the intersection of innovation and public good. This initiative reflects feedback from everyday people, community leaders, and experts dedicated to strengthening our communities—gathered through the Nonprofit Commission’s listening sessions with 100+ organizations and 500+ individuals representing over 7 million Americans, the nationwide OpenAI Nonprofit Jam, and ongoing partnerships and conversations with groups on the ground such as the American Federation of Teachers and Older Adults Technology Services at AARP. Today, we are excited to share that applications…
229dOpen Source
263d ago
Open Weights and AI for All
Open weights and AI for all AI’s next frontier isn’t just about capability—it’s about who gets to use it. Our mission to put AI in the hands of as many people as possible is what drives us. Today’s release of our most capable open-weights models is a major step forward that makes advanced AI more open, flexible, and accessible worldwide. It’s part of our broader effort to ensure AI is available to the many and not concentrated in the hands of the few. That’s why we’re integrating these models into both our OpenAI for Countries initiative and our nonprofit's support for groups on the frontlines of their communities. OpenAI for Countries helps our allies and partners build AI infrastructure rooted in democratic values. By offering these models to governments, we’re supporting the global buildout of AI on US-led rails—enabling nations…
263dOpen Source#open-source
263d ago
gpt-oss-120b & gpt-oss-20b Model Card
We introduce gpt-oss-120b and gpt-oss-20b, two open-weight reasoning models available under the Apache 2.0 license and our gpt-oss usage policy. Developed with feedback from the open-source community, these text-only models are compatible with our Responses API and are designed to be used within agentic workflows with strong instruction following, tool use like web search and Python code execution, and reasoning capabilities—including the ability to adjust the reasoning effort for tasks that don’t require complex reasoning. The models are customizable, provide full chain-of-thought (CoT), and support Structured Outputs. Safety is foundational to our approach to open models. They present a different risk profile than proprietary models: Once they are released, determined attackers could fine-tune them to bypass safety refusals or directly optimize for harm without the possibility for OpenAI to implement additional mitigations or to revoke access. In some contexts, developers…
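A minimal sketch of the adjustable reasoning effort, assuming an OpenAI-compatible local server (here Ollama's default port) is serving the 20B model; servers may ignore parameters they do not support:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # local endpoint, no real key needed

response = client.chat.completions.create(
    model="gpt-oss:20b",
    reasoning_effort="low",  # "low", "medium", or "high" depending on the task
    messages=[{"role": "user", "content": "In one line, what is mixture-of-experts routing?"}],
)
print(response.choices[0].message.content)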
263dOpen Source
331d ago
Building agricultural database for farmers
What if, through generative AI, a farmer could ask a chatbot any question, and instantly get an answer tailored to their specific community and context? This is exactly the experience that Digital Green has built with OpenAI. Called Farmer.Chat, this product supports the essential work of agricultural extension programs in countries including India and Kenya. For farmers navigating a changing climate, agricultural extension (also called agricultural advisory services) is critical. Extension agents teach farmers best practices for growing their crops, help them connect with local suppliers, and provide market and pricing information. But especially in rural, remote communities, giving every farmer the support they need is a huge challenge. Even India’s network of over 400,000 agents has an agent-to-farmer ratio of only 1:650. Over the past 15 years, Digital Green has been dedicated to solving this problem through digital agricultural…
331dOpen Source#local
353d ago
The San Antonio Spurs use ChatGPT to scale impact on and off the court
The San Antonio Spurs use ChatGPT to scale impact on and off the court ChatGPT Enterprise helps the Spurs save 1,800 staff hours a month and deepen global fan engagement. San Antonio basketball isn’t just a game—it’s a way of life. For over 50 years, the San Antonio Spurs have built their legacy around community, integrity, and team-first values. Now, as competition expands beyond the court, the Spurs use ChatGPT Enterprise to accelerate operational growth and scale their distinctive culture to reach international markets. “We were looking to create real competitive advantages,” says RC Buford, CEO of Spurs Sports & Entertainment. “Being early adopters of artificial intelligence is one powerful way to do that.” Today, teams across the Spurs organization—spanning data analytics, community engagement, and partnerships—have rapidly developed their own GPT solutions, boosting employees’ AI fluency from 14% to over…
353dOpen Source#gpt
578d ago
Minnesota’s Enterprise Translation Office uses ChatGPT to bridge language gaps
Minnesota’s Enterprise Translation Office uses ChatGPT to bridge language gaps For government services to be truly effective, they must be accessible to all residents. Removing language barriers is essential to high quality public service. This is the founding principle of the State of Minnesota’s Enterprise Translations Office (ETO), established in 2023 to provide dedicated translation services for the state’s Executive Branch. Over 20 percent of Minnesota’s residents primarily speak a language other than English, with Spanish, Somali, and Hmong as the top three major non-English languages spoken in the state. Language gaps in government communication are a challenge for community health, safety, and trust. The ETO works to ensure equitable access to government information and resources for all, bridging linguistic gaps to ensure a stronger, more inclusive Minnesota. Government agencies have long made consistent efforts to…
578dOpen Source#gpt#safety
898d ago
OpenAI Data Partnerships
OpenAI Data Partnerships Working together to create open-source and private datasets for AI training. We are introducing OpenAI Data Partnerships, where we’ll work together with organizations to produce public and private datasets for training AI models. Modern AI technology learns skills and aspects of our world—of people, our motivations, interactions, and the way we communicate—by making sense of the data on which it’s trained. To ultimately make AGI that is safe and beneficial to all of humanity, we’d like AI models to deeply understand all subject matters, industries, cultures, and languages, which requires as broad a training dataset as possible. Including your content can make AI models more helpful to you by increasing their understanding of your domain. We’re already working with many partners who are eager to represent data from their country or industry. For example, we recently partnered…
898dOpen Source#training#open-source
2056d ago
Generative language modeling for automated theorem proving
Generative language modeling for automated theorem proving Abstract We explore the application of transformer-based language models to automated theorem proving. This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans -- the generation of original mathematical terms -- might be addressable via generation from language models. We present an automated prover and proof assistant, GPT‑f, for the Metamath formalization language, and analyze its performance. GPT‑f found new short proofs that were accepted into the main Metamath library, which is, to our knowledge, the first time a deep-learning based system has contributed proofs that were adopted by a formal mathematics community.
2056dOpen Source
2963d ago
Report from the OpenAI hackathon
Report from the OpenAI hackathon On March 3rd, we hosted our first hackathon with 100 members of the artificial intelligence community. On March 3rd, we hosted our first hackathon with 100 members of the artificial intelligence community. We had over 500 RSVPs arrive within two days of announcing the event—if you didn’t make it this time, please RSVP again in the future! Thank you to Cirrascale for providing GPU machines during the hackathon. Our applicants included high schoolers, industry practitioners, engineers for nonprofits (not just at OpenAI!), researchers at universities, and more, with interests spanning healthcare to AGI. We could only accommodate one hundred people this time so we tried to pick a balanced crowd with a wide range of backgrounds and levels of experience. In particular, we strove to achieve gender…
2963dOpen Source
3187d ago
Gathering human feedback
RL-Teacher is an open-source implementation of our interface to train AIs via occasional human feedback rather than hand-crafted reward functions. The underlying technique was developed as a step towards safe AI systems, but also applies to reinforcement learning problems with rewards that are hard to specify. The release contains three main components: - A reward predictor that can be plugged into any agent and learns to predict the actions the agent could take that a human would approve of. - An example agent that learns via a function specified by a reward predictor. RL-Teacher ships with three pre-integrated algorithms, including OpenAI Baselines PPO. - A web-app that humans can use to give feedback, providing the data used to train the reward predictor. The entire…
3187dOpen Source#open-source
3267d ago
Roboschool
Roboschool We are releasing Roboschool: open-source software for robot simulation, integrated with OpenAI Gym. Roboschool provides new OpenAI Gym environments for controlling robots in simulation. Eight of these environments serve as free alternatives to pre-existing MuJoCo implementations, re-tuned to produce more realistic motion. We also include several new, challenging environments. Roboschool also makes it easy to train multiple agents together in the same environment. After we launched Gym, one issue we heard from many users was that the MuJoCo component required a paid license (though MuJoCo recently added free student licenses for personal and class work). Roboschool removes this constraint, letting everyone conduct research regardless of their budget. Roboschool is based on the Bullet Physics Engine, an open-source, permissively licensed…
3267dOpen Source#open-source
[RB]Replicate Blog · 7 articles · visit →
268d ago
Open source video is back
Open source video is back Try WAN 2.2 Wan 2.2 has set the open source video community ablaze. It’s a huge leap forward from 2.1 with sharp physics, faster generation, and more control. And because it’s fully open source, it opens the door to high-quality video tools for anyone at an astounding fraction of the cost. We’ve teamed up with Pruna AI to release our own optimized version of Wan 2.2. You can test it in these ways: - FAST image-to-video, 480p — $0.05 per video - FAST text-to-video, 480p — $0.05 per video - FAST text-to-video, 720p — $0.10 per video You heard right. 5 cents a video. That’s even less than leading image generators on the AI market. With inference times at ~30s, Wan 2.2 can become your leading way to rapidly test video prompts. And don’t let…
327d ago
FLUX.1 Kontext from the community
FLUX.1 Kontext from the community Try FLUX.1 Kontext In case you missed it: FLUX.1 Kontext launched last week. Judging by the reaction, most of you didn’t - but here’s the TL;DR: FLUX.1 Kontext is a new image editing model from Black Forest Labs. It’s top of its class for text-based image editing: better and cheaper than OpenAI’s 4o, with none of the yellow tint. There are three models: - FLUX.1 Kontext [pro]: High-quality image edits with strong prompt following. - FLUX.1 Kontext [max]: Top-tier performance with sharper typography. - FLUX.1 Kontext [dev] (coming soon): Open-weight version. Kontext can handle small tweaks and big changes—like color swaps, background edits, text replacements, and style transfers—while keeping characters consistent. Some prompting tips for you: - Be specific. - Start simple. - Preserve key elements. - Break big edits into steps. - Quote text. Since…
327dOpen Source
632d ago
Run FLUX with an API
Run FLUX with an API FLUX.1 is a new open-source image generation model developed by Black Forest Labs, the creators of Stable Diffusion. It’s available on Replicate today, and you can run it in the cloud with one line of code. Here’s an example of how to run FLUX.1 on Replicate using JavaScript:

import Replicate from "replicate";

const replicate = new Replicate();
const model = "black-forest-labs/flux-dev";
const prompt = "Purple striped narwhal devouring a fluffy high-resolution everything bagel";
const output = await replicate.run(model, { input: { prompt } });
console.log(output);

You can try out FLUX.1 right in your browser, or run it programmatically in your language of choice. What makes FLUX.1 special? FLUX.1 models have state-of-the-art performance in prompt following, visual quality, image detail, and output diversity. Here are some particular areas where we’ve been impressed: Text! Unlike older models that often…
632dOpen Source#open-source
872d ago
Businesses are building on open-source AI
Businesses are building on open-source AI To cut to the chase: we’ve raised a $40 million Series B led by a16z. Let me explain why. Last year, Stable Diffusion was released. It was an open-source image generation model that caught the imagination of tinkerers. An explosion of forks followed: inpainting, animation, texture generation, fine-tunes. At the start, it felt like a toy. It was just people tinkering around and seeing what was possible. But soon, side projects started to turn into real products. Indie hackers like Pieter Levels and Danny Postma made apps that generate profile pictures, redecorate your house, and create professional headshots. They’re now real businesses making over $1 million annual revenue as solo developers. The growth in tinkering and building since then has been astonishing. In the last year and a half, 2 million people have…
872dOpen Source#open-source
1011d ago
What happened with Llama 2 in the last 24 hours? 🦙
What happened with Llama 2 in the last 24 hours? 🦙 Meta AI just released version 2 of their open-source Llama language model. This new version was trained on more data (2 trillion tokens), supports longer context length (4096 tokens), and has a more permissive license than v1 which allows for commercial use. Here’s a list of developments from the last day since Llama 2 was released: - Llama2 chatbot – An open-source demo application built by the infra team at A16Z and powered by Streamlit, Replicate, and Fly.io. - Llama 2 7B - a 7-billion parameter version, fine-tuned for chat completions, running on Replicate. Smaller and faster than the 13B and 70B versions. - Llama 2 13B - a 13-billion parameter version, fine-tuned for chat completions, running on Replicate. - Llama 2 70B - a 70-billion parameter…
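A minimal sketch of calling one of those Replicate-hosted versions, assuming the replicate Python client and the meta/llama-2-70b-chat slug (model slugs shifted around launch, so treat it as illustrative):

import replicate

# Language models on Replicate stream their output back as text chunks
output = replicate.run(
    "meta/llama-2-70b-chat",
    input={"prompt": "Write a haiku about open-weight llamas."},
)
print("".join(output))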
1011dOpen Source#llama#open-source
1065d ago
Make any large language model a better poet
Make any large language model a better poet In this post, we discuss a version of Vicuna-13B that we just released called Poet Vicuna-13B. This model is part of an early-stage project focused on enhancing open source large language models. Poet Vicuna-13B is an implementation of Vicuna-13B that is modified to generate poems and lyrics with specific syllabic patterns. You can use it to rewrite Twinkle, Twinkle Little Star or generate modernist poems with lots of beautiful white space sprinkled with lines of lengths that you choose. For example, we asked Poet Vicuna-13B to write an eight line poem with these syllable counts [3, 3, 0, 2, 0, 5, 0, 3, 2, 1, 0, 4] in response to this prompt: Write a poem that explores the concept of time and its impact on our lives—nostalgia, the passage of time, the…
1065dOpen Source#open-source
1100d ago
Language model roundup, April 2023
Language model roundup, April 2023 One month ago, we blogged about innovation around LLaMA, an open-source language model from Meta AI. We heard from users that they really wanted to see more of these kinds of posts. So here we are, a month later, with another roundup of recent developments in the world of open-source language models. Models Large language models are hot. Here’s what came out this week: - StableLM – A new set of language models from Stability AI, the folks behind the Stable Diffusion image generation model. These models are trained on a new dataset that’s 3x the size of The Pile. - Vicuna – An open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. - GPT4All – Demo, data, and code to train an open-source assistant-style large language model based on GPT-J and LLaMA.…
1100dOpen Source#open-source
[SWB]Simon Willison Blog · 1 article · visit →
9d ago
datasette.io news preview
16th April 2026 The datasette.io website has a news section built from this news.yaml file in the underlying GitHub repository. The YAML format looks like this:

- date: 2026-04-15
  body: |-
    [Datasette 1.0a27](https://docs.datasette.io/en/latest/changelog.html#a27-2026-04-15) changes how CSRF protection works in a way that simplifies form and API integration, and introduces a new `RenameTableEvent` for when a table is renamed by a SQL query.
- date: 2026-03-18
  body: |-
    ...

This format is a little hard to edit, so I finally had Claude build a custom preview UI to make checking for errors have slightly less friction. I built it using standard claude.ai and Claude Artifacts, taking advantage of Claude's ability to clone GitHub repos and look at their content as part of a regular chat: Clone https://github.com/simonw/datasette.io and look at the news.yaml file and how it is rendered on the homepage.…
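A minimal sketch of the same preview idea, assuming PyYAML and the markdown package; it renders each entry to HTML so formatting mistakes are easy to spot:

import yaml
import markdown

with open("news.yaml") as f:
    entries = yaml.safe_load(f)

for entry in entries:
    print(f"<h3>{entry['date']}</h3>")        # YAML parses the date field natively
    print(markdown.markdown(entry["body"]))   # body is Markdown, as in the post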
9dOpen Source