AMD and Cohere expand their partnership to bring enterprise and “sovereign” AI to Instinct GPU infrastructure

AMD and Cohere have taken another step in their collaboration to accelerate AI adoption in businesses and public administrations. The multinational semiconductor company and the Canadian AI-focused “security-first” company announced that Cohere customers will be able to run North — their enterprise automation and agent platform — and the Command A family of models (including Vision and Translate variants) directly on AMD Instinct GPU infrastructure. The agreement targets a goal increasingly discussed in both technological and political circles: delivering AI deployments with better total cost of ownership (TCO), energy efficiency, and most importantly, data sovereignty.

This development also sends a signal to the market: AMD will incorporate North into its own internal portfolio for enterprise workloads and engineering. That a hardware provider adopts its partner’s platform in its internal workflows speaks to both Cohere’s software maturity and AMD’s intention to practice internally what it preaches externally.

Why it matters: the race for “your terms” AI

Over the past year, discussions with major clients have shifted from “which model to choose” to “where and how to deploy it with assurances”. Two realities are driving this change:

  1. Regulation and trust. Regulated sectors and administrations demand privacy by design, residency and data localization, auditability, and reversibility. This is the natural territory of what is called Sovereign AI, a broad scope encompassing everything from compute location to model IP ownership and compliance standards.
  2. Costs and efficiency. For long-running workloads (extensive contexts, agents orchestrating multiple tools, multimodal vision, or large-scale translation), memory bandwidth and GPU capacity are what differentiate a proof of concept from a profitable operation. Here, HBM3E and the Instinct architecture play a key role.

The AMD-Cohere announcement seeks to bridge these two ends: North is envisioned as a “turnkey AI platform” for the workplace (automations, chat with proprietary data, agents operating with internal systems), designed with a focus on security and control. Meanwhile, Instinct provides the compute + memory muscle necessary to handle long contexts and general-purpose models like Command A, including its Vision variant for image input and Translate for large-scale translation.

What exactly has been announced

  • Availability on Instinct infrastructure: the Cohere models (Command A, Vision, Translate) and the North platform are now certified/optimized to run on AMD Instinct GPUs in client or provider environments, supporting projects in Canada and globally with sovereign AI needs.
  • Internal adoption by AMD: AMD will incorporate North into its own enterprise AI portfolio, employing it for internal workloads and engineering.
  • Message on TCO and efficiency: the joint positioning emphasizes performance per watt, memory capacity, and deployment flexibility as key to making business AI plans predictable.

The hardware role: more memory, less friction

Instinct’s value proposition in this area boils down to two points: ample HBM and high bandwidth. In the MI350 series, AMD advertises up to 288 GB of HBM3E per GPU and 8 TB/s of bandwidth, aligning with the long-context reasoning and complex workloads that Cohere brings with North and Command A. In the previous generation (MI325X), the company already highlighted 256 GB of HBM3E and 6 TB/s. The trend is clear: pack more memory closer to compute to expand model and prompt-window sizes while maintaining high performance in each accelerator.
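To see why bandwidth matters as much as capacity, consider a rough, memory-bound sketch of decode throughput. All figures below are illustrative assumptions, not vendor benchmarks or Command A specifications:

```python
# Rough, memory-bandwidth-bound estimate of decode throughput.
# During autoregressive decoding, each generated token must stream the model
# weights from HBM at least once, so single-stream throughput is bounded
# (roughly) by bandwidth divided by weight size.

def decode_tokens_per_sec(bandwidth_tb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream tokens/sec for a memory-bound decode."""
    bandwidth_gb_s = bandwidth_tb_s * 1000  # TB/s -> GB/s (decimal units)
    return bandwidth_gb_s / weights_gb

# Hypothetical 200 GB (fp16) model on an 8 TB/s-class accelerator:
single_stream = decode_tokens_per_sec(8.0, 200)
print(f"Upper bound at batch size 1: ~{single_stream:.0f} tokens/s")
```

Batching amortizes the weight reads across concurrent requests, which is why effective cost per token drops sharply under load; real throughput also depends on kernels, parallelism strategy, and interconnect.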

This “memory-oriented design” isn’t cosmetic. For enterprise agents that combine retrieval-augmented generation (RAG) with search, document repositories, knowledge bases, and transactional systems, keeping more context resident in GPU memory reduces latency, disk/network round trips, and ultimately the cost per interaction. That’s why the hardware-software pairing makes sense: North supplies the product layer (agents, automation, traceability, security), while Instinct supplies the compute and memory capacity to serve long contexts and general models like Command A, its Vision variant for image input, and Translate for scalable translation.
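A back-of-envelope sizing exercise shows how HBM capacity translates into resident context. The model dimensions below (layer count, KV heads, head dimension, weight size) are hypothetical placeholders, not Cohere’s actual architecture:

```python
# How much context fits in HBM? A KV-cache sizing sketch.
# Assumes fp16/bf16 (2 bytes per value) and grouped-query attention.

def kv_cache_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    # Two tensors (K and V) per layer, one vector per KV head per token.
    return 2 * layers * kv_heads * head_dim * bytes_per_value

def max_resident_tokens(hbm_gb: int, weights_gb: int,
                        layers: int, kv_heads: int, head_dim: int) -> int:
    # Memory left after the weights, divided by the per-token KV footprint.
    free_bytes = (hbm_gb - weights_gb) * 1024**3
    return int(free_bytes // kv_cache_bytes_per_token(layers, kv_heads, head_dim))

# Hypothetical 100B-class model: 80 layers, 8 KV heads, head dim 128,
# ~200 GB of fp16 weights, on a 288 GB MI350-class accelerator.
per_token = kv_cache_bytes_per_token(80, 8, 128)
print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"Resident context: ~{max_resident_tokens(288, 200, 80, 8, 128):,} tokens")
```

Under these assumptions, hundreds of thousands of tokens can stay resident on a single GPU; doubling HBM per device roughly doubles that headroom without adding nodes, which is the TCO argument in miniature.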

North explained for executives

With North, Cohere packages several components often scattered in custom projects:

  • AI agents that operate with employee tools (from office suites to CRMs or ticket systems), following policies and limits defined by the client.
  • Knowledge automation: summaries, field extraction, semantic search, translations, or draft generation respecting permissions and auditability.
  • Chat with proprietary data (intranet, document repositories, ERPs), keeping information within the perimeter set by the client, whether on-premise, private cloud, or specific regional clouds.
  • Governance and security: controls for redaction (to prevent exfiltration), traceability of responses, retention policies, and compliance tools.

For CIOs and CISOs, the promise is secure and auditable adoption; for CFOs, a defensible TCO thanks to optimization of compute/memory and deployment options tailored to their infrastructure mix.

What does this mean for “sovereign AI”?

The term Sovereign AI has become a buzzword, but here it has concrete implications: the ability to choose where AI runs, who can see the data, and under what jurisdiction it resides, without sacrificing performance. The Cohere-AMD pairing introduces three vectors:

  1. Residency and jurisdiction: deployment on regional infrastructure (public or private) and data policies aligned with national frameworks. In countries like Canada or blocks like the EU, this is no longer optional: it is a core requirement.
  2. Operational control: North operates behind firewalls, with integrations to proprietary systems and traceability of every agent action.
  3. Sustained profitability: Instinct drives the less glamorous but essential aspects: energy consumption and capacity per node to stretch the budget without compromising user experience.

A message to the market… and competitors

The agreement sends messages in two directions:

  • To buyers: “If you were waiting for a clear path to production with security, control, and independence from any single tech stack, here’s a solution already running in real environments.”
  • To competitors: AMD reinforces its commitment to enterprise AI with credible third-party software, and Cohere gains another distribution channel not reliant on GPU dominance.

In parallel, it’s impossible to ignore the context: mass deployments on NVIDIA coexist with new clusters based on Instinct and a rapidly maturing ROCm. Recent large-scale accelerator deployments demonstrate that viable alternatives exist at scale, something procurement departments appreciate for risk diversification and bargaining power.

And the software? The ROCm layer

In any multi-vendor migration or deployment, the crucial question is whether the software stack is ready. AMD has aligned its ecosystem with ROCm 7.0, optimized for CDNA 4 (the foundation of MI350), strengthening libraries and compilers so popular frameworks (PyTorch, etc.) and inference toolchains leverage available memory and bandwidth. For Cohere, which already optimized its LLMs on Instinct before this announcement, the adaptation curve should be smoother: considerable work has been done in kernels and critical paths.

What to expect in 6–12 months

  • Pilot projects on sovereignty: first public references from governments and large enterprises in Canada and other countries deploying North on Instinct for internal assistants and automations.
  • More native integrations in North: connectors and guardrails tailored for verticals (finance, healthcare, public sector) and improved traceability.
  • Cost adjustments per user/query: with 256–288 GB of HBM3E depending on the platform, expect more generous context limits without scaling nodes, reducing costs for certain usage patterns.
  • Channel ripple effects: managed service providers and integrators will start offering turnkey AI solutions in your data center with North + Instinct, including SLA and observability.

Risks and challenges

Not everything is advantages. Three main fronts remain:

  1. Portability and stack dependence: although North emphasizes “run on your terms,” deep hardware-dependent optimizations incur opportunity costs when porting to other platforms.
  2. Change management: transitioning from pilots to company-wide adoption requires training, process redesign, and governance. Technology is necessary but not sufficient.
  3. Fierce competition: the pace of iteration in models, agents, and tools is extremely high; differentiation windows narrow, requiring synchronized roadmaps between hardware vendors and AI platforms.

Final thoughts

The AMD-Cohere movement reinforces the thesis that enterprise AI will mainly be multi-platform and data-driven: where it is, who accesses it, and how each step is audited. If North offers control and productivity, and Instinct delivers performance and TCO, IT leaders gain room to scale confidently. For the market, the message is clear: competition in the acceleration layer is extending into the product layer, often resulting in more options and better conditions for organizations to adopt tailored AI solutions.


Frequently Asked Questions

1) What are the advantages of deploying Cohere North on AMD Instinct GPUs for sovereign AI?
North can operate behind firewalls, with full data residency and traceability. Running on Instinct, it benefits from large HBM3E capacities (up to 288 GB per GPU in MI350) and high bandwidth (up to 8 TB/s), enabling long contexts and more capable agents without increasing the number of nodes. This reduces latency and cost per interaction, crucial for projects with compliance requirements.

2) How do Cohere’s Command A, Vision, and Translate models differ in enterprise environments?
Command A is the core family for reasoning and generation; Command A Vision adds image understanding for multimodal cases (scanned documents, visual inspection, etc.); Command A Translate scales translation with policies and auditability. All can integrate with North with security guardrails and logging for accountability.

3) How does HBM3E memory impact the TCO of corporate generative AI?
Proximity of HBM3E to compute allows more context to be maintained and efficient batching on each GPU, reducing storage/network calls and bottlenecks. Practically, this means fewer servers for the same SLA or better performance per dollar and watt, shortening project payback periods for agents and internal assistants.

4) What should an organization consider before choosing between Instinct and other GPUs for AI?
Three axes: (a) real memory capacity and bandwidth for your context size and model; (b) stack maturity (frameworks, libraries, MLOps pipeline compatibility); (c) total cost (hardware, power, licenses, operations). Testing your own workloads—RAG with proprietary data, prompts, and agents—is the most reliable way to decide.


Sources consulted:

  • AMD (IR and Newsroom). Official announcement of expanded collaboration with Cohere: availability of North and Command A (including Vision and Translate) on Instinct, and AMD’s internal adoption of North.
  • Cohere (North). Description of North as a “turnkey” productivity platform with agents and enterprise-grade security.
  • AMD (Instinct MI350 / MI325X). Technical datasheets and blogs detailing HBM3E capacity and bandwidth per GPU; cooling options and ROCm 7.0 for CDNA 4.
  • Third-party coverage. News outlining the agreement and market context (Cohere, funding, enterprise focus; large-scale Instinct deployments).