AWS Neuron announces Neuron Agentic Development, an open-source collection of agents and skills that equips AI coding assistants to accelerate development on AWS Trainium and AWS Inferentia. The initial release provides agentic coding capabilities for Neuron Kernel Interface (NKI) kernel development, covering the workflow from authoring through profiling and performance analysis.
NKI gives developers direct, low-level programming access to Trainium for writing custom compute kernels that maximize hardware performance. Neuron Agentic Development brings NKI expertise directly into the developer's agentic IDE (such as Claude Code and Kiro) through natural language. For example, a developer can describe a PyTorch operation and receive a working NKI kernel, ask the agent to fix a compilation error and have it automatically identify the issue and apply a correction, or request a performance analysis and receive a report identifying which lines of kernel code are causing bottlenecks. The capabilities span kernel authoring, debugging, documentation lookup, profile capture, and profile analysis.
Neuron Agentic Development is designed as a broad framework for agentic capabilities across the Neuron stack, with NKI kernel development as the initial release. The repository is available on GitHub.
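To ground the workflow these agents target, here is a minimal NKI kernel sketch, modeled on the element-wise add pattern from the NKI getting-started material; exact module paths and APIs can shift between Neuron releases, so treat it as illustrative:

```python
import neuronxcc.nki as nki
import neuronxcc.nki.language as nl

@nki.jit
def add_kernel(a_input, b_input):
    # Output tensor allocated in device HBM; on-chip tiles are bounded
    # by the 128-partition SBUF layout, so large inputs must be tiled.
    c_output = nl.ndarray(a_input.shape, dtype=a_input.dtype,
                          buffer=nl.shared_hbm)
    a_tile = nl.load(a_input)   # HBM -> on-chip SBUF
    b_tile = nl.load(b_input)
    nl.store(c_output, value=a_tile + b_tile)  # SBUF -> HBM
    return c_output
```

An agent-assisted session would typically start from a prompt like "write an NKI kernel for this PyTorch op" and iterate from there on compiler errors and profiler output.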
Amazon Bedrock AgentCore launches recommendations and two ways to validate performance: batch evaluations and A/B tests. This completes the observe-evaluate-improve loop for AI agents in production. Until now, translating evaluation findings into concrete, validated improvements required manual developer intervention and intuition rather than a systematic approach. With recommendations, batch evaluations, and A/B tests, developers now have the tools to act on what evaluations surface.
As models evolve and user behavior shifts, agent quality degrades quietly over time. The recommendations capability analyzes production traces and evaluation outputs generated by AgentCore to create optimized system prompts and tool descriptions tailored to your specific workload. Batch evaluations then validate those recommendations against pre-defined test cases, and A/B tests validate them further under controlled conditions against pre-defined test sets or live production traffic, with statistical significance reported before any change is promoted. Every recommendation requires your approval before it ships. Together, these capabilities complete the performance improvement cycle for agents: agents don't just run, they get better, on your terms.
You can use optimization capabilities in all AWS Regions where AgentCore Evaluations is available. To learn more, visit the AgentCore documentation.
AWS Payment Cryptography (APC) now supports Multi-party approval (MPA) for importing root certificates, giving customers an additional layer of governance over critical key management operations.
Customers using X.509 and public key infrastructure (PKI) certificates with asymmetric keys (RSA and ECC) can now require two or more authorized individuals to approve a root certificate import request before it takes effect — even when the requester already holds the necessary IAM permissions. This distributed approval model prevents any single individual from making unilateral changes to certificate trust anchors.
Built on AWS Multi-party approval, this feature integrates natively with AWS IAM Identity Center, allowing team members to review and act on pending requests through a managed approval portal. Once approved, the new root certificate becomes active and available for use within the service. There is no additional charge for this feature beyond standard per-API rates.
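As a rough sketch of where MPA attaches, this is the existing ImportKey call for a root certificate public key (identifiers and certificate material are placeholders); with MPA configured, the request would remain pending until the approval team reaches quorum:

```python
import boto3

apc = boto3.client("payment-cryptography")

# Base64-encoded PEM of the root CA certificate (placeholder value).
root_cert_b64 = "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCi4uLg=="

# With MPA enabled, this import does not take effect until the required
# approvers act on the pending request in the approval portal.
response = apc.import_key(
    Enabled=True,
    KeyMaterial={
        "RootCertificatePublicKey": {
            "KeyAttributes": {
                "KeyAlgorithm": "RSA_4096",
                "KeyClass": "PUBLIC_KEY",
                "KeyUsage": "TR31_S0_ASYMMETRIC_KEY_FOR_DIGITAL_SIGNATURE",
                "KeyModesOfUse": {"Verify": True},
            },
            "PublicKeyCertificate": root_cert_b64,
        }
    },
)
print(response["Key"]["KeyArn"])
```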
This feature is available across all AWS Regions where AWS Payment Cryptography is available. To get started with this feature, review the AWS Payment Cryptography MPA guide and the Multi-Party Approval documentation.
Amazon Bedrock AgentCore Identity now supports On-Behalf-Of (OBO) token exchange, enabling developers to build agents that securely access protected resources on behalf of authenticated users — without requiring users to complete multiple consent flows.
Previously, developers building agents that needed to act on behalf of a user had to manage separate consent flows for each protected resource, adding friction for end users and complexity for builders. With OBO token exchange, developers can exchange an access token for a new scoped-down access token that carries both the original user identity and the agent identity. This token is targeted specifically to the outbound protected resource, granting just-in-time, least-privilege access without prompting the user for additional consent.
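Under the hood, this follows the OAuth 2.0 token exchange pattern (RFC 8693). Here is a minimal sketch of that underlying flow, with a hypothetical token endpoint and resource audience; consult the AgentCore Identity documentation for the service's actual exchange API:

```python
import requests

# Hypothetical endpoint and token values for illustration only.
TOKEN_ENDPOINT = "https://idp.example.com/oauth2/token"
user_access_token = "<token from the user's original consent flow>"

resp = requests.post(
    TOKEN_ENDPOINT,
    data={
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": user_access_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": "https://orders-api.example.com",  # outbound resource
        "scope": "orders:read",  # scoped-down, least-privilege access
    },
    timeout=10,
)
resp.raise_for_status()
# The new token carries both user and agent identity, targeted to the
# outbound resource, without prompting the user again.
scoped_token = resp.json()["access_token"]
```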
Amazon Bedrock AgentCore Identity OBO token exchange is now generally available in 14 AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and Europe (Stockholm). To learn more, visit the Amazon Bedrock AgentCore Identity documentation .
AWS IoT Core now supports customer managed domains in the AWS GovCloud (US) Regions. Customer managed domains (also known as custom domains) allow you to configure custom domain names, use your own server certificates stored in AWS Certificate Manager, attach custom authorizers, and create multiple data endpoints for your account.
Custom domains provide long-term stability of TLS behavior, domain names, and their trust chain for device deployments. They also help you enable separate domain configurations for heterogeneous device fleets, and simplify migration of existing devices to AWS IoT Core. For example, by configuring custom domain names and custom authorizers for your data endpoints, you can keep using the same domain names and authentication methods your devices already know. This means you don't need to update device credentials or CA certificates during migration to AWS IoT Core, minimizing software updates on devices already in the field.
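For example, a domain configuration that pairs a custom domain name with an existing custom authorizer might look like the following boto3 sketch (names and ARNs are placeholders):

```python
import boto3

iot = boto3.client("iot")

# Placeholder names/ARNs; the ACM server certificate must cover the
# domain, and validationCertificateArn may also be required to prove
# domain ownership depending on your certificate's issuer.
iot.create_domain_configuration(
    domainConfigurationName="legacy-fleet-endpoint",
    domainName="mqtt.devices.example.com",
    serverCertificateArns=[
        "arn:aws-us-gov:acm:us-gov-west-1:123456789012:certificate/abc-123"
    ],
    authorizerConfig={
        "defaultAuthorizerName": "legacy-token-authorizer",
        "allowAuthorizerOverride": True,
    },
    serviceType="DATA",
)
```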
With the expansion to the AWS GovCloud (US) Regions, this feature is now available in all AWS Regions where AWS IoT Core is present. To learn more, visit the AWS IoT Core documentation and API reference guide.
Amazon MQ for RabbitMQ now supports the Prometheus plugin on RabbitMQ 4.2 brokers, providing a native Prometheus-compatible metrics endpoint on your RabbitMQ brokers. You can scrape broker, queue, and connection metrics directly from your brokers using any Prometheus-compatible monitoring tool, giving you more flexibility in how you observe and alert on your messaging infrastructure.
The plugin exposes metrics through the /metrics, /metrics/detailed, and /metrics/memory-breakdown endpoints in Prometheus text format. Amazon MQ also publishes a curated subset of these Prometheus metrics to CloudWatch. With the Prometheus plugin, you can now integrate your brokers into existing Prometheus-based monitoring stacks including Grafana dashboards, Amazon Managed Service for Prometheus, and self-hosted Prometheus servers.
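A quick way to check what a broker exposes is to scrape the endpoint directly. A sketch follows; the broker URL and port are placeholders, so see the Amazon MQ release notes for how the endpoints are actually exposed on your broker:

```python
import requests

# Placeholder URL; substitute your broker's metrics endpoint.
BROKER_METRICS_URL = "https://b-<broker-id>.mq.us-east-1.amazonaws.com:15692/metrics"

resp = requests.get(BROKER_METRICS_URL, timeout=10)
resp.raise_for_status()

# Prometheus text format: one "name{labels} value" sample per line.
queue_samples = [line for line in resp.text.splitlines()
                 if line.startswith("rabbitmq_queue_messages")]
print("\n".join(queue_samples))
```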
The Prometheus plugin is enabled by default on all Amazon MQ for RabbitMQ 4.2 brokers in all AWS Regions where Amazon MQ is available. To learn more about monitoring with Prometheus, see the Amazon MQ release notes.
Amazon Elastic Container Service (Amazon ECS) now offers NVIDIA GPU metrics for containerized workloads running on Amazon ECS Managed Instances. These metrics are available through Amazon CloudWatch Container Insights with enhanced observability, giving customers visibility into GPU health and performance to help troubleshoot and optimize GPU-accelerated workloads on Amazon ECS.
With the new GPU metrics, Amazon ECS Managed Instances customers can now monitor GPU capacity, utilization, memory, hardware health, and thermal conditions directly in CloudWatch. Using Container Insights with enhanced observability, customers get granular visibility into these metrics, including at the GPU device level. These metrics give customers visibility into GPU operational and hardware health across their Amazon ECS Managed Instances fleet, enabling them to right-size GPU capacity, troubleshoot performance issues, and detect problems before they impact GPU-accelerated workloads, such as AI/ML training and inference.
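To see which GPU metrics your cluster is emitting, you can list them from the Container Insights namespace. A sketch with a placeholder cluster name; the authoritative metric names are in the enhanced observability metrics reference:

```python
import boto3

cw = boto3.client("cloudwatch")

# Placeholder cluster name; filter the Container Insights namespace
# for GPU-related metric names published by the cluster.
paginator = cw.get_paginator("list_metrics")
pages = paginator.paginate(
    Namespace="ECS/ContainerInsights",
    Dimensions=[{"Name": "ClusterName", "Value": "my-gpu-cluster"}],
)
for page in pages:
    for metric in page["Metrics"]:
        if "Gpu" in metric["MetricName"]:
            print(metric["MetricName"], metric["Dimensions"])
```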
NVIDIA GPU metrics for Amazon ECS Managed Instances are available through Container Insights in all commercial AWS Regions. To get started, enable Container Insights with enhanced observability on your Amazon ECS cluster, and launch GPU-accelerated Amazon EC2 instance types through an Amazon ECS Managed Instances capacity provider. For Container Insights pricing, see Amazon CloudWatch Pricing. To learn more, see the Amazon ECS Container Insights with enhanced observability metrics user guide.
AWS Outposts racks now support the LagStatus Amazon CloudWatch metric.
This metric provides you with the ability to monitor Outposts LAG connectivity status directly within the CloudWatch console, without having to rely on external networking tools or coordination with other teams. You can use this metric to set alarms, troubleshoot connectivity issues, and ensure your Outposts racks are properly integrated with your on-premises infrastructure. The LagStatus metric indicates whether an Outposts LAG is operationally up and ready to forward traffic. A value of "1" means that the LAG is up, while "0" means that it is down. When combined with the existing VifConnectionStatus and VifBgpSessionState metrics, you can quickly identify whether issues stem from LAG configuration, BGP peering, or connection problems.
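For example, a minimal alarm on LagStatus might look like the sketch below; the dimension name is an assumption, so confirm it against the Outposts CloudWatch metrics documentation:

```python
import boto3

cw = boto3.client("cloudwatch")

# Alarm when the LAG reports down (LagStatus = 0). The "OutpostId"
# dimension is an assumption; check the Outposts metrics documentation.
cw.put_metric_alarm(
    AlarmName="outposts-lag-down",
    Namespace="AWS/Outposts",
    MetricName="LagStatus",
    Dimensions=[{"Name": "OutpostId", "Value": "op-0123456789abcdef0"}],
    Statistic="Minimum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="LessThanThreshold",
    TreatMissingData="breaching",  # missing data likely means no connectivity
)
```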
The LagStatus metric is now available for all Outposts LAGs in all commercial AWS Regions and the AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions where Outposts racks are available.
To get started, read this blog post and access the metrics in the CloudWatch console. To learn more, check out the CloudWatch metrics for AWS Outposts documentation for second-generation Outposts racks and first-generation Outposts racks.
Amazon Elastic Kubernetes Service (EKS) now provides one-click cluster access directly from the AWS Management Console through AWS CloudShell, eliminating the need to install and configure kubectl, AWS CLI, or kubeconfig files locally. This feature helps developers and operators who want immediate cluster access without tooling setup or complex environment configuration.
With one-click cluster access, you can navigate to any EKS cluster in the console and choose Connect to instantly launch an AWS CloudShell session with kubectl pre-configured for that cluster. You can then run kubectl commands immediately to inspect workloads, troubleshoot issues, or manage resources without switching to a local terminal. This feature supports clusters with both public and private API server endpoints. Each CloudShell session also includes the AWS CLI and standard CloudShell utilities, giving you immediate access to essential cluster operations.
One-click cluster access is available at no additional charge in all the AWS Regions where Amazon EKS is available. To get started, see Connect kubectl to an EKS cluster in the Amazon EKS User Guide.
AWS Payment Cryptography now supports Physical Key Exchange, a new PCI PIN and P2PE compliant feature for performing paper-based cryptographic key exchange with the service without needing to maintain your own secure key loading infrastructure. If your partners or vendors do not support electronic key exchange, Physical Key Exchange provides an option to exchange cryptographic keys to accelerate your migration. AWS Payment Cryptography is a managed service that provides elastic key management and cryptographic operations for your cloud-hosted payment applications.
Although electronic key exchange is preferred, some counterparties are not yet ready to support it, requiring organizations to maintain Hardware Security Modules (HSMs) and Key Loading Devices (KLDs) to perform paper-based key ceremonies in a compliant manner. Maintaining this infrastructure is costly and operationally burdensome, especially for key exchanges that occur only a few times per year. With Physical Key Exchange, paper key components are shipped to trained AWS key custodians, who handle them securely and perform key ceremonies in AWS-operated secure facilities that meet the PCI PIN and P2PE physical and logical security requirements. Once loaded into AWS Payment Cryptography, keys are available to perform cryptographic operations.
For details on key exchange options in AWS Payment Cryptography, see Physical Key Exchange for paper-based key exchange and Importing and exporting keys for electronic key exchange in the User Guide. For pricing details, visit the pricing page. To get started, open an AWS support case or contact your AWS account team.
Amazon SageMaker AI inference endpoints now support flexible provisioning across a prioritized list of instance types. When your preferred instance type has insufficient capacity, SageMaker AI automatically provisions from the next available option in your list — keeping endpoint creation and autoscaling running smoothly without manual intervention. This gives teams deploying AI/ML models in production the resilience to handle capacity constraints gracefully, ensuring endpoints come up reliably and scale on demand.
With instance pool support, you define a prioritized list of instance types and SageMaker AI automatically provisions capacity by working through your list in order. This applies across endpoint creation, updates, and scaling. When scaling down, SageMaker AI removes lowest-priority instances first, preserving your preferred infrastructure as the fleet contracts. This works for Single Model Endpoints, InferenceComponent-based endpoints, and Asynchronous Inference endpoints — including endpoints that scale to zero, where SageMaker AI provisions from your highest-priority available pool when scaling back up.
Because fallback instance types differ in GPU memory and compute capability, you can specify a different optimized model for each instance type in your priority list. You can prepare these artifacts yourself or use SageMaker AI inference recommendations, which automatically generate hardware-specific optimized configurations per instance type. Additionally, per-instance-type CloudWatch metrics give you visibility into latency, throughput, GPU utilization, and instance count by hardware type within a single endpoint.
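The exact request shape for prioritized instance pools is defined in the SageMaker AI API reference; as a purely illustrative sketch (the priority-list field name below is hypothetical), the endpoint config would carry an ordered list of instance types per variant:

```python
import boto3

sm = boto3.client("sagemaker")

# "InstanceTypePriorityList" is a hypothetical field name standing in
# for the real prioritized-pool parameter; see the API reference for
# the actual shape.
sm.create_endpoint_config(
    EndpointConfigName="llm-endpoint-config",
    ProductionVariants=[
        {
            "VariantName": "primary",
            "ModelName": "my-llm-model",
            "InitialInstanceCount": 2,
            # Ordered by preference: SageMaker AI provisions from the
            # first type with capacity and falls back down the list.
            "InstanceTypePriorityList": [
                "ml.p5.48xlarge",
                "ml.p4d.24xlarge",
                "ml.g6e.48xlarge",
            ],
        }
    ],
)
```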
This capability is available today in US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), South America (São Paulo), Europe (Ireland), Europe (London), Europe (Frankfurt), Europe (Stockholm), Europe (Zurich), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Mumbai), and Asia Pacific (Jakarta). To learn more, visit the Amazon SageMaker AI documentation.
Amazon Relational Database Service (Amazon RDS) for SQL Server now supports cross-account snapshot sharing for database instances with additional storage volumes. Additional storage volumes allow customers to scale database storage up to 256 TiB by adding up to three storage volumes, each with up to 64 TiB, in addition to the primary storage volume. With this launch, customers can create, share, and copy a database snapshot across AWS accounts for database instances set up with additional storage volumes. Cross-account snapshots enable customers to set up isolated backup environments in separate accounts for compliance requirements and to perform diagnostics, such as investigating production issues by restoring database snapshots in a separate account for development and testing.
Cross-account snapshots for database instances with additional storage volumes preserve the storage layout of the original database instance, including the configuration of additional storage volumes. When a snapshot is shared to a target AWS account, authorized users in the target account can restore it to another database instance, copy the snapshot within the same or different AWS Region, or create independent backups under different AWS Identity and Access Management (IAM) access permissions for backup and disaster recovery.
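Sharing works through the standard snapshot-attribute API; a sketch with placeholder identifiers:

```python
import boto3

rds = boto3.client("rds")

# Grant a target account (placeholder ID) permission to restore or copy
# the snapshot; the additional-storage-volume layout travels with it.
rds.modify_db_snapshot_attribute(
    DBSnapshotIdentifier="sqlserver-prod-2026-02-01",
    AttributeName="restore",
    ValuesToAdd=["210987654321"],
)
```

The target account can then copy the shared snapshot or restore it directly with restore_db_instance_from_db_snapshot.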
Cross-account snapshot sharing with additional storage volumes is available in all AWS commercial Regions. Customers can start using this feature today through the AWS Management Console, AWS CLI, or AWS SDKs. To learn more, see Sharing a DB snapshot for Amazon RDS, Copying a DB snapshot for Amazon RDS, and Working with storage in RDS for SQL Server in the Amazon RDS User Guide.
Amazon Relational Database Service (Amazon RDS) for SQL Server now supports read replicas for database instances with additional storage volumes. Additional storage volumes allow customers to scale database storage up to 256 TiB by adding up to three storage volumes, each with up to 64 TiB, in addition to the primary storage volume. With this launch, for database instances configured with additional storage volumes, customers can create same-region and cross-region read replica database instances.
When a read replica is created for a database instance with additional storage volumes, the replica preserves the storage layout of the source instance, including the configuration of any additional storage volumes. After the initial creation, you can independently manage additional storage volume configurations on the source and read replica instances.
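Creating a replica uses the existing read-replica API; a sketch with placeholder identifiers (for a cross-Region replica, call the destination Region and pass the source instance's full ARN):

```python
import boto3

rds = boto3.client("rds")

# Same-Region replica; the source's primary and additional storage
# volumes are mirrored in the replica's initial storage layout.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="sqlserver-prod-replica-1",
    SourceDBInstanceIdentifier="sqlserver-prod",
    DBInstanceClass="db.r6i.4xlarge",
)
```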
Read replicas with additional storage volumes are available in all AWS commercial Regions and the AWS GovCloud (US) Regions. Customers can start using this feature today through the AWS Management Console, AWS CLI, or AWS SDKs. To learn more, see Working with read replicas for Amazon RDS for SQL Server and Working with storage in RDS for SQL Server in the Amazon RDS User Guide.
Amazon Elastic Kubernetes Service (Amazon EKS) now supports Dynamic Resource Allocation (DRA) for Elastic Fabric Adapter (EFA), simplifying high-performance inter-node communication and RDMA (Remote Direct Memory Access) for artificial intelligence, machine learning, and High Performance Computing (HPC) workloads. The EFA DRA driver, built on the upstream DRANET project, brings EFA interface sharing and topology-aware allocation for workloads running on Kubernetes.
With the EFA DRA driver, you can allocate EFA interfaces and accelerator devices that share the same PCIe root or device group, ensuring inter-node traffic flows through the closest network interface to each NVIDIA GPU, AWS Trainium, or AWS Inferentia device on the node. The EFA DRA driver also supports EFA interface sharing across workloads on the same node to maximize EFA interface utilization.
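With DRA, workloads request devices through resource claims rather than extended resources. Below is a sketch of what an EFA claim template might look like, expressed as a Python dict; the device class name and API version are assumptions based on upstream DRA conventions, so check the EFA DRA driver documentation for the values it registers:

```python
# ResourceClaimTemplate for EFA devices (apply with kubectl or a client
# of your choice). "efa.amazonaws.com" and the v1beta1 field layout are
# assumptions; DRA field names changed between beta and GA API versions.
claim_template = {
    "apiVersion": "resource.k8s.io/v1beta1",
    "kind": "ResourceClaimTemplate",
    "metadata": {"name": "efa-claim-template"},
    "spec": {
        "spec": {
            "devices": {
                "requests": [
                    {"name": "efa", "deviceClassName": "efa.amazonaws.com"}
                ]
            }
        }
    },
}
# Pods then reference the template via spec.resourceClaims, and the
# scheduler allocates a topology-aligned EFA interface per claim.
```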
The EFA DRA driver is recommended for new deployments on Amazon EKS clusters running Kubernetes version 1.34 or later with EKS managed node groups or self-managed nodes. The EFA DRA driver is available in all AWS Regions where Amazon EKS is available. The EFA device plugin remains supported and is recommended for use with Karpenter and Amazon EKS Auto Mode.
To learn more, see Manage EFA devices on Amazon EKS in the Amazon EKS User Guide.
Spatial Data Management on AWS (SDMA) now supports custom transformation connectors and a unified desktop client installer. Custom transformation connectors let you run compute-intensive processing — such as format conversion, 3D rendering, image tiling, or metadata extraction — by submitting jobs to AWS Deadline Cloud using Open Job Description templates. You can extend SDMA's built-in content analysis with custom logic to verify formats, extract attributes, or run transformations that require dedicated compute resources.
Connectors run in isolated compute environments and automatically ingest declared outputs back into SDMA's governed asset repository, enabling you to automate and chain processing workloads across your spatial data pipeline. The SDMA desktop application now includes a standalone installer that bundles all required dependencies, removing the need to separately install the CLI or other components.
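Connector jobs land on AWS Deadline Cloud as ordinary Open Job Description submissions. A minimal standalone sketch with placeholder farm/queue IDs and a trivial step; a real SDMA connector would run the actual transformation here:

```python
import boto3

deadline = boto3.client("deadline")

# Minimal Open Job Description template; replace the echo step with the
# real transformation work (format conversion, tiling, rendering, etc.).
TEMPLATE = """
specificationVersion: 'jobtemplate-2023-09'
name: sdma-transform-example
steps:
- name: Transform
  script:
    actions:
      onRun:
        command: echo
        args: ['transforming asset']
"""

deadline.create_job(
    farmId="farm-0123456789abcdef0123456789abcdef",    # placeholder
    queueId="queue-0123456789abcdef0123456789abcdef",  # placeholder
    template=TEMPLATE,
    templateType="YAML",
    priority=50,
)
```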
These features are available in the following AWS Regions: Asia Pacific (Tokyo, Singapore, Sydney), Europe (Frankfurt, Ireland, London), US East (N. Virginia, Ohio), and US West (Oregon). To learn more, visit the SDMA solutions library product page. For technical details, see the SDMA documentation.
When we launched Amazon Q Developer, our goal was to bring AI assistance directly into developers' workflows. Customers have adopted Q Developer across VS Code, JetBrains, Eclipse, and Visual Studio, putting it to work for code generation, debugging, and chat-based guidance. Q Developer proved that AI has become an essential part of the daily development cycle. What we have learned over the past year is that the most impactful AI developer experiences go beyond code generation and completion. Developers need AI that understands the whole project: the architecture, the requirements, the tests, and the intent behind the code. That calls for a purpose-built environment, and that is why we built Kiro.
At "What's Next with AWS" on April 28, 2026, Matt Garman […]
Many organizations keep their internal knowledge spread across a variety of storage tools, and putting that accumulated information to work efficiently is no small task. With Amazon Quick, you can connect internal knowledge scattered across your organization to AI agents and retrieve the information you need simply by asking in natural language. In this blog, we use Microsoft SharePoint Online (hereafter, SharePoint) as an example of connecting Amazon Quick's AI agents to internal knowledge, and walk step by step through the setup for two approaches: knowledge base integration and action integration.
In this post, you will configure Amazon Bedrock AgentCore Gateway to access private endpoints using Resource Gateway, a managed construct that provisions Elastic Network Interfaces (ENIs) directly inside your Amazon VPC, one per subnet. You will explore two implementation modes (managed and self-managed) and walk through three practical scenarios: connecting to a private Amazon API Gateway endpoint, integrating with an MCP server on Amazon Elastic Kubernetes Service (Amazon EKS), and accessing a private REST API.
This post demonstrates how the agentic AI assistant from Amazon Quick turns data analytics into a self-service capability, using Amazon Simple Storage Service (Amazon S3) for storage, Amazon SageMaker and AWS Glue for the lakehouse, and Amazon Athena for serverless SQL querying across multiple storage formats (S3 Tables, Iceberg, and Parquet).
In this post, we show how Sun Finance used Amazon Bedrock, Amazon Textract, and Amazon Rekognition to build an AI-powered identity verification (IDV) pipeline. The solution improved extraction accuracy from 79.7% to 90.8%, cut per-document costs by 91%, and reduced processing time from up to 20 hours to under 5 seconds. You'll learn how combining specialized OCR with large language model (LLM) structuring outperformed using either tool alone. You'll also learn how to architect a serverless fraud detection system using vector similarity search.
In this post, we introduce a systematic framework for migrating or upgrading LLMs in generative AI production, encompassing essential tools, methodologies, and best practices. The framework facilitates transitions between different LLMs by providing robust protocols for prompt conversion and optimization.
In this post, we take a deeper look at how RLAIF (RL with an LLM as a judge) works effectively with Amazon Nova models.
Stay current with the latest serverless innovations that can improve your applications. In this 32nd quarterly recap, discover the most impactful AWS serverless launches, features, and resources from Q1 2026 that you might have missed. In case you missed our last ICYMI, check out what happened in Q4 2025. 2026 Q1 calendar Serverless with Mama […]
When you deploy AWS Outposts racks, you can run AWS infrastructure and services in on-premises locations. Maintaining seamless connectivity, both to the AWS Region and your on-premises network, is fundamental to delivering consistent, uninterrupted service to your applications. Implementing an observability strategy that uses available network metrics is key to understanding the health of this […]