Enhanced Usage Monitoring
January 31, 2026 New comprehensive usage tracking and reporting features for better resource management:- Datasource-level breakdowns for granular usage visibility
- Account-based tracking with improved join keys for accurate reporting
- 10-minute update intervals for near real-time usage insights
- Automated cleanup of expired data for accurate retention calculations
Improved Onboarding Experience
January 30, 2026 Streamlined onboarding with enhanced user flows:- Redesigned onboarding cards with clearer visual hierarchy
- “My First Playground” experience for hands-on experimentation
- Role collection during signup for personalized setup
- Custom hover states matching each card’s accent color
Real-Time Evaluations
January 30, 2026 Run evaluations immediately on incoming data with real-time ingestion:- Instant evaluation of production traces without delays
- Latent evaluation support for updating earlier spans
- Seamless cutover between batch and real-time processing
- Available across all Arize AX tiers by default
AWS Bedrock Custom Endpoints
January 30, 2026 Enhanced AWS Bedrock integration for enterprise deployments:- Custom base URL support for private endpoints
- Inference profile ARNs for multi-region routing
- Custom model configurations for specialized deployments
- Simplified regional management with unified tracking
Wildcard Array Path Variables
January 30, 2026 Access array data more flexibly in templates and experiments:- Wildcard (
*) patterns to reference all array elements - Last-index (
-1) access for the most recent item - Automatic generation of wildcard variants for convenience
- Support in task variables and experiment columns
Improved Queue Management
January 29, 2026 Better user experience when managing annotation queues:- Duplicate detection with clear error messages
- Added and skipped record counts after bulk operations
- Actionable feedback when attempting to add existing records
Session Evaluations with Conversation Context
January 23, 2026 Evaluate entire conversation flows with new virtual attributes:{conversation}template variable for session-level evaluations- Chronologically ordered input/output pairs
- Automatic aggregation of multi-turn dialogues
- Root span filtering for accurate session context
Circuit Breaker for Evaluation Tasks
January 29, 2026 Protect resources during evaluation failures with intelligent circuit breaking:- Immediate abort on authentication errors (401/403)
- Automatic detection of systemic issues after 10 consecutive failures
- Failure rate monitoring to stop doomed batches early
- Resource optimization by preventing guaranteed-to-fail requests
Tracing Configuration for Evaluation Tasks
January 23, 2026 Enable detailed debugging for evaluation tasks:- Toggle tracing on/off in Advanced Options
- Automatic trace generation for monitoring and debugging
- Persistent settings saved with your tasks
- Production-ready visibility into evaluation execution
Enhanced RBAC System
January 27, 2026 Fine-grained access control with the new RBAC system:- Custom roles with specific permissions
- Space-level role bindings for granular access management
- Coexistence with legacy roles during migration
- UI support for role assignment across all user management pages
- Automatic fallback to legacy roles when custom roles are deleted
Enhanced Dashboard Time Persistence
January 22, 2026 Your dashboard preferences now persist automatically:- Auto-save time range, time zone, and granularity selections
- Instant restoration when returning to dashboards
- Per-dashboard settings for customized views
- Seamless experience across sessions
Trace Table Performance Improvements
January 21, 2026 Faster loading times for the tracing table:- 30-50% faster initial load times
- String truncation for large content
- Lazy loading of full values in tooltips
- Minimal impact on user experience
Expandable Trace Hierarchy
January 20, 2026 View trace structure directly in the table:- Expand traces to see child spans inline
- Hierarchical visualization without opening slideouts
- Faster navigation through complex traces
- Contextual understanding of request flow
Custom Prompt Release Labels
January 20, 2026 Organize and track prompt versions with custom labels:- Tag prompt versions with meaningful identifiers
- Environment markers like “staging” or “production”
- Dynamic label suggestions from existing prompts
- Easy retrieval of specific prompt releases
Enhanced Annotation Configs
January 12, 2026 More powerful annotation workflows with improved configs:- Color-coded categories based on optimization direction
- Read-only view for reviewing existing configs
- Optimization direction control (maximize, minimize, or none)
- Clear label guidance for consistent evaluations
Eval Hub Enhancements
January 16, 2026 Improved evaluation management and visibility:- Model information in evaluator listings with provider icons
- Evaluator counts in running tasks with hover details
- Automatic save when creating or editing evaluators
- Streamlined task flow for faster evaluation setup
Todo List Management Improvements
January 16, 2026 More reliable task tracking in Alyx conversations:- Visual status indicators for all todo states
- Dynamic reminders with exact update calls needed
- Plan preservation across human-in-the-loop pauses
- Clearer instructions positioned near the plan
Stacked Bar Chart Widgets
January 9, 2026 Visualize multi-dimensional data with new chart types:- Stacked bar charts for comparing categories over time
- Druid-powered queries for fast rendering
- Customizable groupings and dimensions
- Dashboard integration for comprehensive monitoring
Scatter Plot Widgets
January 21, 2026 Explore relationships between variables with scatter plots:- Correlation analysis for two numeric dimensions
- Interactive data points for detailed investigation
- Dashboard integration for visual analytics
- Customizable axes and filtering
Java SDK Space ID Support
January 14, 2026 Modern authentication for Java applications:- Space ID authentication (space keys deprecated)
- Backward compatibility maintained with existing constructors
- Updated documentation and examples
- Test coverage for new authentication method
Improved Error Handling for Exceptions
January 23, 2026 Better filtering and debugging capabilities:- Filter by
exception.typeandexception.messagein the UI - OpenInference semantic convention support for exceptions
- Consistent data structure across datasources
- Faster troubleshooting of error patterns
SAML Role Mapping Search
January 23, 2026 Navigate large role mapping configurations easily:- Client-side search across attributes, spaces, roles, and organizations
- Visual highlighting of search matches
- Keyboard navigation through results
- Improved usability for enterprise customers
Span-to-Queue Workflow
January 15, 2026 Add spans and dataset examples to annotation queues seamlessly:- Multiple entry points from spans table, trace slideover, and queue records
- New or existing queue selection
- Batch operations for efficient queue population
- Dataclusters integration for reliable processing
Enhanced Session Slideover
January 21, 2026 Better conversation visualization and navigation:- Trace labels with links to detailed views
- Visual separators between traces
- Hover highlighting synchronized between list and conversation
- Improved readability for multi-turn interactions
Batch Annotation Updates
January 12, 2026 Efficiently annotate large volumes of data:- Optimization direction support in annotation configs
- Category-based labeling for issue detection
- Best practice guidance for naming and structure
- Streamlined categorization workflows
Prompt Optimization on Experiments
January 6, 2026 Run prompt optimization directly on experiment results:- Experiment selector in optimization task creation
- Dynamic column resolution for experiment data
- Enhanced iteration on proven prompts
- Seamless workflow from experiments to optimization
Custom Metrics with LIKE Operator
January 27, 2026 More powerful filtering in custom metrics:- LIKE and ILIKE operators for pattern matching
- Wildcard support with
%syntax - Case-insensitive matching with ILIKE
- Direct Druid mapping for performance
Dashboard Template Filtering
January 27, 2026 Cleaner dashboard creation experience:- LLM-only space filtering shows only relevant templates
- Context-aware templates based on project types
- Reduced clutter in template selection
- Consistent experience across spaces and projects
Pivot Table Widget Schema
January 27, 2026 Foundation for advanced tabular data visualization:- Grouped categorical dimensions for organized views
- Configurable numeric columns with aggregations
- Flexible filtering and time range support
- Dashboard integration ready
Enhanced Space Model Schema
January 14, 2026 More control over data retention and lookback:- Space-level schema lookback overrides for custom retention
- Model-specific configurations for unique requirements
- Flexible data management across different use cases
Exact Match Code Evaluator
January 14, 2026 New built-in evaluator for validation:- String equality checks for exact matches
- Expected vs actual comparisons for testing
- Multi-field access with dataset row support
- Alphabetically sorted evaluator list in UI
Experiment Task Timeout Configuration
January 21, 2026 Accommodate long-running evaluations:- Configurable timeout parameter beyond 120 seconds
- Function-level control in run_experiment and evaluate_experiment
- Backward compatibility with default values
- Support for complex evaluators requiring extended processing
Arrow Schema Reconciliation
January 14, 2026 Improved data handling across distributed segments:- Parallel schema fetching from historicals
- Unified schema reconciliation across partitions
- Automatic conversion for schema consistency
- Support for both Druid and Arrow segments
Atlantis Terraform Automation
January 15, 2026 Streamlined infrastructure-as-code workflows:- Pull request integration for Terraform plans
- Automated plan posting as PR comments
- DevOps team permissions for webhook debugging
- Structured review process before applying changes
Google Analytics 4 BigQuery Sync
January 8, 2026 Automated analytics data export:- Daily GA4 to BigQuery transfers via Terraform
- Raw event data access for advanced analysis
- Overcome GA4 limitations like sampling and retention
- Custom reporting capabilities with full data access
Vertex AI Migration
January 8, 2026 Updated integration with Google Cloud AI:- Seamless Vertex AI connectivity for LLM applications
- Enhanced observability for Google Cloud deployments
- Modernized instrumentation for better tracing
Custom Model Migrations
January 7, 2026 Expanded support for custom integrations:- Custom model endpoint support in evaluations
- Higher traffic model optimization for performance
- Flexible integration options for enterprise deployments
Generative Service Monitoring
January 8, 2026 Comprehensive monitoring for evaluation infrastructure:- Uptime and health alerts with paging
- CPU and memory monitoring with warnings
- Dedicated Grafana dashboard for visibility
- Runbook documentation for incident response
Labeling Queue Annotations
January 5, 2026 More flexible annotation management:- Clear annotations (reset to null) anywhere
- Support across spans, queues, and experiments for consistent workflows
- Improved annotation lifecycle management
Enhanced Eval Hub Empty States
January 9, 2026 Better guidance for getting started:- Improved empty state design with clear next steps
- Documentation links for learning resources
- Actionable cards for common workflows
Resizable Trace Slideover
January 22, 2026 Customize your viewing experience:- Draggable slideover width for optimal layout
- Persistent sizing preferences across sessions
- Better content visibility for long traces
Configurable Experiment Timeout
January 21, 2026 Handle complex evaluation scenarios:- Custom timeout values for long-running tasks
- Per-experiment configuration for flexibility
- Backward compatible defaults for existing code
Enhanced Platform Stability
January 2026 Numerous improvements to platform reliability and performance:- Configuration drift resolution in GCP Terraform
- Enhanced error handling across services
- Improved logging and monitoring for faster troubleshooting
- Database migration optimizations for schema updates
- Better resource management for high-volume workloads