90% of Developers Use AI. Only 24% Have High Trust In It.

Trust lags adoption. With hype fading, AI faces a new era of accountability.

Just the Signals · Edition #16

No big launches this week, but plenty moving below the surface.

Tool updates, fresh research, and accountability moves.

And developers? They can't stop using AI, even without trusting it.

A feeling most of us can relate to.

This week:

  1. Connecting Dots: AI's Growing Out of the Hype Phase
  2. Tool Updates: Cheaper Grok, ChatGPT Gets Personalized Updates
  3. Quick Signals: Giving AI Agents Too Many Tools Hurts Performance

Connecting Dots

AI's Growing Out of the Hype Phase

This week's wave of moves shows AI is shifting from hype to accountability.

The Latest AI Petition

To kick off the UN General Assembly, over 200 prominent figures, including former heads of state and Nobel laureates, signed a petition calling on governments to agree on global "red lines" for AI development by the end of 2026.

Source: Semafor (Charly Triballeau)

Meanwhile, Anthropic's CEO Dario Amodei told policymakers there's a 25% chance AI development goes catastrophically wrong.

Their concern: It's hard to dismiss when even the people building the technology worry about where it's headed without guardrails.

Benchmarking

Scale AI launched SEAL Showdown rankings that show how different AI models perform for specific demographics, jobs, and languages across 100+ countries.

Source: Scale AI

Their hope is to challenge AI ranking leader LMArena, which measures preferences from a niche audience of heavy AI users. SEAL aims to reduce bias and make the rankings harder to game.

The benefit: This should give a more accurate picture of which models work best for different people and different kinds of work.

Safety Protocols

Google DeepMind released an updated safety framework that includes specific tests for risky AI behaviors.

One key test checks whether advanced AI models try to resist being turned off or modified (a warning sign that the system might act against human control).

Why this matters: AI labs are improving their early warning systems to catch problematic AI behavior before it reaches users.

Real-World Measurement

OpenAI launched GDPval, which compares AI performance to human experts.

The goal is to benchmark AI against real-world skills instead of exam scores.

The initial version scores 44 occupations across 9 industries (from lawyers to doctors to film and video editors).

Source: OpenAI

They've found that today's best models are already approaching the quality of work produced by industry experts.

The reality: Research like this will help companies and workers better prepare for what's actually changing vs what's just hype.

What this signals:

  • Less wild west. After years of racing to build powerful models with limited regulation, companies are being forced to prove value and address risks.
  • Trust becomes competitive advantage. Models that perform well in real-world conditions will win over those that only score high on tech benchmarks.

AI Now Standard In Software Development

Google's latest research shows how deeply AI has already seeped into coding: 90% of developers now use it.

Source: Google (DORA Report 2025)

The Trust Paradox: Only 24% of developers have high confidence in AI’s outputs, but usage keeps climbing.

Team Culture Matters: AI amplifies performance for connected teams, while fragmented teams struggle.

What this signals:

  • Every industry will hit this phase. Rapid adoption + measurable productivity gains + lingering trust issues.
  • Results > Skepticism. People will use AI tools that deliver value even when they don’t fully trust them. Oversight becomes a must-have.

Tool Updates

New Features

  • ChatGPT Pulse: Get personalized daily updates from your chats, feedback, and connected apps like calendar. Available to Pro users on mobile.
  • Perplexity Email Assistant: Manage, summarize, and respond to Gmail and Outlook messages. Available to Max subscribers.
  • Gemini Gems Sharing: You can now share your custom Gems with coworkers, friends, and family just like you'd share a Google Doc.

New Models

  • Grok 4 Fast: xAI's new model runs 98% cheaper than Grok 4, with faster responses and a large context window, making complex tasks more affordable.
  • New Qwen Models: Alibaba released six new open-source models that handle text, images, audio, and video and support multiple languages.
  • Luma's Ray3 Reasoning Video: The first video model that can reason, making video generation smarter and more consistent. Integrated into Adobe tools.

Distribution & Integration

  • Claude in Microsoft Products: Microsoft 365 Copilot users can now choose Claude instead of OpenAI's models for AI assistance.
  • Gemini for TVs: AI features are being added to Google TVs, letting people ask for recommendations and more.
  • Figma AI Agents MCP: Lets teams connect AI agents directly to Figma to automate workflows and produce designs.

Quick Signals

Source: The Information

Rumor Mill

  • No updates on the looming Claude 4.5 or Gemini 3 releases.
