
The Composable Platform Part 2 - Enabling Innovation Through Flexible Infrastructure

Learn how our innovations build an agile infrastructure of compute, memory and storage, and how technology advancements from both the industry and vendors like us are meeting the needs of a composable platform.


In part 1 of this two-part series, I described the formidable task facing data centers as they seek to improve operating efficiency and maximize their IT investments in hardware infrastructure. This must be accomplished in the face of evolving and varied application requirements, and it requires new architectures that enable resource agility, where physical compute, memory and storage resources are treated as composable building blocks. That agility is the key to unlocking efficiencies and eliminating stranded and underutilized assets.

In this second and final installment of my series, I will describe a better way to deal with these challenges and show how Microchip innovations build an agile infrastructure of compute, memory and storage. I will also highlight technology advancements that both the industry and vendors like us are delivering to meet the needs of a composable platform.

The Best Path to an Agile Infrastructure

At Microchip, we strongly believe the best way to achieve this resource agility is with flexible solution building blocks that can adapt to new use cases and new requirements and enable system-level composability. With composable, flexible infrastructure, or what we like to call agile infrastructure, tremendous strides in efficiency are possible.

Enabling resource agility, where physical compute, storage and memory resources are treated as composable building blocks, is the key to unlocking efficiencies and eliminating stranded or underutilized assets. Composable storage, compute and memory let you optimize resources by workload and reduce or eliminate resource stranding, removing bandwidth, memory, storage and compute I/O bottlenecks. The agile data center needs adaptable building-block silicon platforms that let you cost-effectively manage emerging memory and storage technologies, so your infrastructure use cases can continue to evolve after the hardware is built.

Improving GPU utilization 

Microchip’s Switchtec PAX Advanced Fabric solution enables composable heterogeneous compute architectures. It provides a scalable, non-hierarchical fabric that creates dynamically reconfigurable virtual domains. Resources are allocated on demand with low-latency data movement, as all data transfers through the fabric are managed by hardware. The solution requires no special drivers on the host, enabling rapid time to market and reduced R&D effort for system integrators.

How does it work? It's important to realize that a Switchtec fabric is not just a collection of PCIe switches. It is a collection of fabric elements that use virtual domains to connect root complexes, or CPUs, to endpoints like GPUs or storage. This matters as heterogeneous compute becomes more prevalent in the data center. GPUs and accelerators are widely used in a variety of applications, and each application and workload may require a unique ratio of compute-to-accelerator resources. With native support for PCIe Gen 4 on both CPUs and GPUs, a PCIe Gen 4 fabric is a natural choice for composable heterogeneous compute in artificial intelligence and machine learning applications.

How do we get there? We start with a programmable, enterprise-quality, low-latency PCIe Gen 4 switch. We add turnkey advanced fabric firmware to create a scalable, configurable, low-latency PCIe Gen 4 fabric. The fabric can scale across multiple switches and endpoints, while hosts are kept in separate virtual domains.

In the example below, we see how Host 1 is assigned four GPUs, marked in orange, even though the fourth GPU is physically attached to a different switch within the fabric. These virtual domains are created by the flexible and configurable embedded control plane in each fabric element. The virtual domain is, in effect, a PCIe-compliant virtual switch, and here you see the host in orange with visibility to that fourth GPU. Although this flexibility is enabled through firmware provided by Microchip as a turnkey solution, the data is routed in hardware to ensure the lowest latency.
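The idea of composing a virtual domain across physical switches can be sketched in code. This is a conceptual model only; the class and method names here are hypothetical and are not the Switchtec firmware API.

```python
# Illustrative model (hypothetical names, not Microchip firmware APIs): a PCIe
# fabric is a set of fabric elements (switches), and a virtual domain maps a
# host to endpoints anywhere in the fabric, regardless of physical attachment.
from dataclasses import dataclass, field

@dataclass
class FabricElement:
    name: str
    endpoints: list  # endpoint IDs physically attached to this switch

@dataclass
class Fabric:
    elements: list = field(default_factory=list)
    domains: dict = field(default_factory=dict)  # host -> assigned endpoints

    def all_endpoints(self):
        return {ep for el in self.elements for ep in el.endpoints}

    def assign(self, host, endpoints):
        """Compose a virtual domain: the host sees these endpoints as a
        PCIe-compliant virtual switch, wherever they physically live."""
        missing = set(endpoints) - self.all_endpoints()
        if missing:
            raise ValueError(f"not in fabric: {missing}")
        taken = {ep for eps in self.domains.values() for ep in eps}
        conflict = set(endpoints) & taken
        if conflict:
            raise ValueError(f"already assigned: {conflict}")
        self.domains[host] = list(endpoints)

# Two switches; gpu3 is physically on switch-B, but host1's virtual domain
# spans both switches, mirroring the example in the text.
fabric = Fabric(elements=[
    FabricElement("switch-A", ["gpu0", "gpu1", "gpu2"]),
    FabricElement("switch-B", ["gpu3", "ssd0"]),
])
fabric.assign("host1", ["gpu0", "gpu1", "gpu2", "gpu3"])
print(fabric.domains["host1"])  # all four GPUs, across both switches
```

In the real product this mapping lives in the embedded control plane of each fabric element, and only the control path is firmware; the data path stays in hardware.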

Furthermore, this architecture allows for direct peer-to-peer data movement inside the fabric. Why is peer-to-peer data movement through the PCIe fabric important? It delivers increased performance and reduced latency. In the example below, we can deliver 2.5X the bandwidth by bypassing the CPU-to-CPU interconnect in a two-socket system. The GPUs in this case can deliver 26 GB/s when doing a peer-to-peer transfer, rather than funneling the traffic through the CPU subsystem, a significant performance improvement due to the direct peer-to-peer transfers.
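A quick sanity check of the 2.5X figure. The arithmetic below is illustrative only; it simply assumes the CPU-routed path is the bottleneck and derives its implied bandwidth from the two numbers quoted above.

```python
# Back-of-envelope check of the 2.5X claim (illustrative, GB/s assumed):
# if direct peer-to-peer through the fabric reaches 26 GB/s and that is
# 2.5X the CPU-routed path, the CPU-to-CPU interconnect path is the limiter.
p2p_bw = 26.0               # GB/s, direct GPU-to-GPU through the PCIe fabric
speedup = 2.5               # gain quoted for bypassing the CPU subsystem
cpu_path_bw = p2p_bw / speedup  # implied bandwidth of the CPU-routed path
print(f"CPU-routed path: ~{cpu_path_bw:.1f} GB/s; "
      f"direct P2P: {p2p_bw:.1f} GB/s ({p2p_bw / cpu_path_bw:.1f}X)")
```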

This model of composable GPUs is readily extensible to NVMe storage through the addition of NVMe SSDs into the same fabric architecture.  The NVMe endpoints can simply be added to the fabric, just like a spec-compliant GPU can be. This allows for dynamic assignment or reassignment of SSDs to different hosts as required, resulting in storage becoming a flexible and adaptable resource. 

We have talked about allocating entire SSDs and entire GPUs to hosts, as required.  What if an individual resource itself is very large, and we wish to partition and share such resources? Such an example would be a high-capacity SSD that we want to share across multiple CPUs to avoid stranding the storage.  

SR-IOV and multi-host sharing allow for just this type of flexibility. Microchip’s Switchtec PCIe switches, as well as our Flashtec NVMe SSD controllers, enable end-to-end multi-host I/O virtualization with standard off-the-shelf drivers. SR-IOV is a reality today: more than eight vendors have announced SR-IOV-capable NVMe SSDs, and we have the flexible infrastructure to support this type of architecture. Notably, the applications for PCIe fabrics extend beyond the data center. An autonomous car can have many sensors and control units that continually need to make inference decisions while driving and to store data for future training. This can be done most effectively with a low-latency fabric that has access to shared resources like an SR-IOV-capable SSD.
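The sharing model SR-IOV enables can be sketched as follows. This is a conceptual illustration with hypothetical names, not a driver or controller API: an SR-IOV-capable NVMe SSD exposes one physical function plus multiple virtual functions, and the fabric can hand each virtual function to a different host.

```python
# Illustrative sketch (hypothetical names, not an NVMe driver API): one
# high-capacity SR-IOV-capable SSD exposes several virtual functions (VFs),
# each attachable to a different host, so the drive is shared, not stranded.
class SriovSsd:
    def __init__(self, capacity_tb, num_vfs):
        self.capacity_tb = capacity_tb
        self.vfs = {i: None for i in range(num_vfs)}  # VF index -> host

    def attach(self, vf, host):
        """Assign one virtual function to one host; each host then talks to
        its VF with a standard off-the-shelf NVMe driver."""
        if self.vfs.get(vf) is not None:
            raise ValueError(f"VF {vf} already attached to {self.vfs[vf]}")
        self.vfs[vf] = host

    def hosts(self):
        return {h for h in self.vfs.values() if h is not None}

ssd = SriovSsd(capacity_tb=16, num_vfs=4)
for vf, host in enumerate(["host1", "host2", "host3"]):
    ssd.attach(vf, host)
print(sorted(ssd.hosts()))  # three hosts sharing one drive; VF 3 still free
```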

We've discussed improving GPU and storage utilization and removing storage bandwidth bottlenecks with PCIe fabric solutions like the Switchtec PCIe fabric. But true agility requires flexibility as well as composability.

 
Improving Storage Utilization 

Flexibility, in the case of storage, can be achieved in many different ways. Microchip believes in bringing enabling technology to market that allows for maximum reuse of software and hardware qualification efforts as you move from one class of storage media to another. From a protocol standpoint, our tri-mode IP and our Smart Storage series of storage controllers enable a platform for enterprise-class, high-performance and secure NVMe, SAS or SATA storage, or some combination of all three.

From a Flash media standpoint, the Flash channel engines in our Flashtec NVMe SSD controllers provide a future-proof, programmable architecture with advanced LDPC ECC, including hard and soft decoding. This enables NVMe SSD investments to leverage multiple generations of NAND without sacrificing quality of service.

Improving Memory Utilization 

Memory innovation is happening along two vectors, near and far. Near-memory innovation is about getting more bandwidth to the CPU to feed its increasing core counts. Far-memory innovation is about effectively pooling and then sharing memory, making it accessible to more machines within the rack. Microchip has been collaborating with industry partners on a number of new serial load/store standards that solve this problem, such as CXL, Gen-Z and OpenCAPI.

At FMS, we announced our first product in this area: an Open Memory Interface (OMI)-to-DDR4 smart memory controller.

The SMC 1000 8x25G memory controller provides low-latency connectivity to DDR4 via eight lanes of 25 Gbps serial OMI, enabling the memory bandwidth required for AI and machine learning applications.

This type of solution delivers: 

  1. Increased memory bandwidth. We've reduced a 288-pin DDR4 interface to an 84-pin OMI interface, effectively enabling a quadrupling of the memory bandwidth to the CPU.

  2. Media independence. By moving the controller outside the CPU, we allow the memory technology to evolve independently of the CPU.

  3. Lower total solution cost. Silicon, IP and packaging costs are reduced for CPUs and SoCs.
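The pin-count claim in the first point can be checked with rough, public figures. The numbers below are approximations I am assuming (a DDR4-3200 channel at ~25.6 GB/s over 288 pins versus one OMI channel at 8 × 25 Gbps over 84 pins), not figures from the announcement.

```python
# Rough pin-efficiency arithmetic behind the OMI claim (approximate figures
# assumed, not quoted from the product brief).
ddr4_pins, ddr4_gbs = 288, 25.6        # one DDR4-3200 channel, ~25.6 GB/s
omi_pins, omi_gbs = 84, 8 * 25 / 8     # 8 lanes x 25 Gbps = 200 Gbps = 25 GB/s

ddr4_per_pin = ddr4_gbs / ddr4_pins    # ~0.09 GB/s per CPU pin
omi_per_pin = omi_gbs / omi_pins       # ~0.30 GB/s per CPU pin
print(f"DDR4: {ddr4_per_pin:.2f} GB/s/pin, OMI: {omi_per_pin:.2f} GB/s/pin, "
      f"gain: {omi_per_pin / ddr4_per_pin:.1f}X")
```

Per pin, OMI delivers roughly three to four times the bandwidth, which is how a fixed CPU pin budget can host roughly 4X the aggregate memory bandwidth.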

DDIMMs leveraging the SMC 1000 are available from some of Microchip’s partners, namely Micron, Samsung and SMART Modular.

We were honored our SMC 1000 8x25G Smart Memory Controller received an FMS Best of Show Award.

In summary, at Microchip we believe that flexible and composable infrastructure is the future of the data center. Microchip is innovating in storage, memory and compute interconnect, enabling system builders and data center operators to drive efficiencies and adapt to evolving use cases.

Andrew Dieckmann, Mar 19, 2020
Tags/Keywords: Computing and Data Center