[wp-trac] [WordPress Trac] #65168: AI Client: Unable to reliably select model for providers (model preference ignored / inconsistent API)

WordPress Trac noreply at wordpress.org
Wed May 6 15:34:46 UTC 2026


#65168: AI Client: Unable to reliably select model for providers (model preference
ignored / inconsistent API)
--------------------------+------------------------------
 Reporter:  jabir20       |       Owner:  (none)
     Type:  defect (bug)  |      Status:  new
 Priority:  normal        |   Milestone:  Awaiting Review
Component:  AI            |     Version:  trunk
 Severity:  normal        |  Resolution:
 Keywords:                |     Focuses:
--------------------------+------------------------------

Comment (by jabir20):

 Thanks for checking.

 Yes — I tested again on WordPress 7.0-RC2, and I can confirm the issue
 still reproduces in my setup.

 Environment

 * WordPress: 7.0-RC2
 * AI Provider: AI Provider for Google (Gemini)
 * Local Docker setup
 * Same Gemini API key/project used successfully outside WordPress

 Observed behavior in RC2

 * `->model(...)` is still not available on `PromptBuilder`.
 * `->using_model_preference(...)` does not reliably force the selected
 model in my runtime path.
 * Requests still execute against `gemini-2.5-pro` in failure cases,
 resulting in:

   * `429 Too Many Requests`
   * quota errors specifically referencing `gemini-2.5-pro`

 Example:

 ```php
 $result = wp_ai_client_prompt( 'Say hello' )
     ->using_model_preference(
         'gemini-2.5-flash',
         'gemini-2.0-flash',
         'gemini-1.5-flash'
     )
     ->generate_text();
 ```

 Even with Flash models explicitly preferred, the runtime may still route
 to `gemini-2.5-pro`.

 Investigation
 I reviewed the `AI Provider for Google` plugin source and found that model
 sorting currently prioritizes `-pro` models before `-flash` models in
 `modelSortCallback()`.

 This Pro-first ordering appears to drive the default/fallback selection
 whenever the explicit per-request preference is not honored, which would
 explain why the failing requests consistently reference `gemini-2.5-pro`.
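 For illustration, here is a minimal, self-contained sketch of how such a
 sort callback can produce Pro-first ordering. The function body below is
 my reconstruction for discussion, not the plugin's actual code:

 ```php
 /**
  * Illustrative sort callback: ranks "-pro" model IDs ahead of
  * "-flash" ones, mirroring the Pro-first behavior described above.
  * (Reconstruction for discussion; not the plugin's actual code.)
  */
 function modelSortCallback( string $a, string $b ): int {
     $rank = static function ( string $id ): int {
         if ( str_contains( $id, '-pro' ) ) {
             return 0; // Pro models sort first...
         }
         if ( str_contains( $id, '-flash' ) ) {
             return 1; // ...Flash models after.
         }
         return 2; // Everything else last.
     };
     return $rank( $a ) <=> $rank( $b );
 }

 $models = array( 'gemini-2.5-flash', 'gemini-2.5-pro', 'gemini-2.0-flash' );
 usort( $models, 'modelSortCallback' );
 // The Pro model now sits at the front of the candidate list, which
 // matches the default selection I observe when preferences are ignored.
 ```

 With this ordering, any code path that falls back to "first model in the
 sorted list" will land on `gemini-2.5-pro`.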

 Local workaround tested successfully
 I implemented a provider-side fallback patch locally, and this resolved
 the issue in RC2 while preserving the current “Pro-first” behavior.

 Changed file:
 `src/Models/GoogleTextGenerationModel.php`

 Summary of changes:

 1. `generateTextResult()` updated to:

 * try the primary model first (existing behavior)
 * on quota/rate-limit failure for `-pro` models, automatically retry Flash
 models

 2. Added helper:
    `sendGenerateContentRequest()`

 Used to execute requests against alternative fallback model IDs.

 3. Added helper:
    `shouldAttemptFlashFallback()`

 Triggers fallback only for:

 * `429`
 * `too many requests`
 * `quota`
 * `resource_exhausted`
 * `rate limit`

 4. Added helper:
    `getFlashFallbackModelIds()`

 Fallback order:

 * same-version flash model if available

   * example: `gemini-2.5-flash` for `gemini-2.5-pro`
 * `gemini-2.5-flash`
 * `gemini-2.0-flash`
 * `gemini-1.5-flash`
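 To make items 3 and 4 concrete, here is a simplified, self-contained
 sketch of the two pure helpers from my local patch (bodies reduced for
 discussion; the actual patch wires them into `generateTextResult()`):

 ```php
 /**
  * Sketch of shouldAttemptFlashFallback(): fallback fires only for
  * quota/rate-limit style failures, matched case-insensitively.
  * (Simplified from my local patch.)
  */
 function shouldAttemptFlashFallback( int $status, string $message ): bool {
     if ( 429 === $status ) {
         return true;
     }
     $needles  = array( 'too many requests', 'quota', 'resource_exhausted', 'rate limit' );
     $haystack = strtolower( $message );
     foreach ( $needles as $needle ) {
         if ( str_contains( $haystack, $needle ) ) {
             return true;
         }
     }
     return false;
 }

 /**
  * Sketch of getFlashFallbackModelIds(): same-version Flash variant
  * first, then newest-to-oldest Flash models, skipping duplicates.
  */
 function getFlashFallbackModelIds( string $failedModelId ): array {
     $fallbacks = array();
     // Same-version Flash variant, e.g. gemini-2.5-pro -> gemini-2.5-flash.
     if ( str_ends_with( $failedModelId, '-pro' ) ) {
         $fallbacks[] = substr( $failedModelId, 0, -4 ) . '-flash';
     }
     foreach ( array( 'gemini-2.5-flash', 'gemini-2.0-flash', 'gemini-1.5-flash' ) as $id ) {
         if ( ! in_array( $id, $fallbacks, true ) ) {
             $fallbacks[] = $id;
         }
     }
     return $fallbacks;
 }
 ```

 Note that `str_contains()`/`str_ends_with()` require PHP 8.0+; the patch
 can use `strpos()`/`substr_compare()` equivalents if older PHP must be
 supported.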

 Result after patch
 Requests that previously failed with `429` quota errors against
 `gemini-2.5-pro` now complete successfully through the Flash fallback
 models, without requiring any shortcode or plugin changes.

 Why this matters
 This impacts developers building plugins against the AI Client because:

 * model selection is not predictable
 * free-tier Gemini keys can fail unexpectedly
 * providers may override intended runtime model choice
 * multiple plugins sharing one provider configuration can conflict

 Suggested improvements

 * Provide a guaranteed and documented per-request model selection API
 * Ensure providers honor explicit model preferences consistently
 * Consider provider/core fallback handling for quota-restricted models
 (e.g. Pro → Flash)

 Happy to provide full patch diff or additional reproduction steps if
 useful.

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/65168#comment:2>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

