Your AI Advantage Weekly

90% of Developers Use AI. Only 24% Have High Trust In It.

Trust lags adoption. With hype fading, AI faces a new era of accountability.

Riley

26 Sep 2025 — 4 min read

Just the Signals · Edition #16

No big launches this week, but plenty moving below the surface.

Tool updates, fresh research, and accountability moves.

And developers? They can't stop using AI, even without trusting it.

A feeling most of us can relate to.

This week:

Connecting Dots: AI's Growing Out of the Hype Phase
Tool Updates: Cheaper Grok, ChatGPT Gets Personalized Updates
Quick Signals: Giving AI Agents Too Many Tools Hurts Performance

Connecting Dots

AI's Growing Out of the Hype Phase

This week's wave of moves shows AI is shifting from hype to accountability.

The Latest AI Petition

To kick off the UN General Assembly, over 200 world leaders signed a petition calling for global "red lines" on AI development to be in place by 2026.

Meanwhile, Anthropic's CEO Dario Amodei told policymakers there's a 25% chance AI development goes catastrophically wrong.

Their concern: It's valid when even the people building the technology are worried about where it's headed without guardrails.

Benchmarking

Scale AI launched SEAL Showdown rankings that show how different AI models perform for specific demographics, jobs, and languages across 100+ countries.

Their hope is to challenge AI ranking leader LMArena, which measures preferences from a niche audience of heavy AI users. SEAL aims to reduce bias and remove the ability to skew rankings.

The benefit: This should give a more accurate picture of which models work best for different people and different kinds of work.

Safety Protocols

Google DeepMind released an updated safety framework that includes specific tests for risky AI behaviors.

One key test checks whether advanced AI models try to resist being turned off or modified (a warning sign that the system might act against human control).

Why this matters: They're improving early warning systems to catch problematic AI before it reaches users.

Real-World Measurement

OpenAI launched GDPval, which compares AI performance to human experts.

The goal is to benchmark AI against real-world skills instead of exam scores.

The initial version scores 44 job types across 9 industries (from lawyers to doctors to film and video editors).

They've found that today's AI is already approaching the abilities of industry leaders.

The reality: Research like this will help companies and workers better prepare for what's actually changing vs what's just hype.

What this signals:

Less wild west. After years of racing to build powerful models with limited regulation, companies are being forced to prove value and address risks.
Trust becomes a competitive advantage. Models that perform well in real-world conditions will win over those that only score high on tech benchmarks.

AI Now Standard In Software Development

Google's latest research shows how deeply AI has already seeped into coding:

The Trust Paradox: 90% of developers use AI, yet only 24% have high confidence in its outputs. Usage keeps climbing anyway.

Team Culture Matters: AI amplifies performance for connected teams, while fragmented teams struggle.

What this signals:

Every industry will hit this phase. Rapid adoption + measurable productivity gains + lingering trust issues.
Results > Skepticism. People will use AI tools that deliver value even when they don’t fully trust them. Oversight becomes a must-have.

Tool Updates

New Features

ChatGPT Pulse: Get personalized daily updates from your chats, feedback, and connected apps like calendar. Available to Pro users on mobile.
Perplexity Email Assistant: Manage, summarize, and respond to Gmail and Outlook messages. Available to Max subscribers.
Gemini Gems Sharing: You can now share your custom Gems with coworkers, friends, and family just like you'd share a Google Doc.

New Models

Grok 4 Fast: xAI's new model runs 98% cheaper than Grok 4 with faster responses and big context windows, making complex tasks more affordable.
New Qwen Models: Alibaba released six new open-source models that handle text, images, audio, and video and support multiple languages.
Luma's Ray3 Reasoning Video: The first video model that can reason, making video gen smarter and more consistent. Integrated into Adobe tools.

Distribution & Integration

Claude in Microsoft Products: Microsoft 365 users can now choose Claude instead of ChatGPT for AI assistance in Office apps.
Gemini for TVs: AI features are being added to Google TVs, letting people ask for recommendations and more.
Figma AI Agents MCP: Allows teams to connect AI directly to Figma to automate workflows and produce designs.

Quick Signals

Limit your agent's tools: Microsoft found that performance drops when you let your AI agents access too many tools.
Meta's licensing talks: Meta is negotiating with News Corp, Fox, and Axel Springer to (legally) get training data for future AI models.
Meta's Llama gets approved for government use: It joins ChatGPT, Claude, and Gemini.
OpenAI is poaching Apple's hardware team: OpenAI has hired more than 25 Apple staff members from the iPhone, Apple Watch, and Siri teams as it sets its sights on creating physical products.