
      Expanding Capabilities: Redfish API Support for Modern Infrastructure

      Written by :  
      semifly
      Team Semifly
      9 minute read
      August 25, 2025
      Category : Research and Development

      Modern data centers demand efficient, secure, and standardized ways to manage servers and infrastructure. Redfish has emerged as the industry standard for this purpose. It is a specification developed by the Distributed Management Task Force (DMTF) to replace older management technologies such as IPMI, which were not designed for today's security and scalability requirements.

       

      NVIDIA has integrated Redfish API support into its data center systems, including DGX platforms built around the NVIDIA H200 GPU. This lets administrators remotely manage GPU-rich servers with greater consistency and automation. On H200-based DGX systems, Redfish is enabled by default through the baseboard management controller (BMC), giving enterprises capabilities for monitoring, configuration, and lifecycle management.

       

      With a combination of open standards and hardware-level integration, Redfish plays a key role in building scalable and secure infrastructures that are ready for the next generation of AI and high-performance computing.

       

      1. What Is Redfish API?

       

      Redfish API refers to an industry-standard way of managing and monitoring hardware systems through a modern web-based interface. It was created by the Distributed Management Task Force to provide a secure and consistent alternative to older protocols. Redfish uses familiar web technologies, which makes it easier for administrators and automation tools to interact with infrastructure in a standardized way.

       

      Redfish is based on a RESTful interface, which means it communicates using standard HTTP methods such as GET, POST, PATCH, and DELETE. Data is exchanged in JSON, a lightweight and human-readable format. It also follows OData conventions, which provide a consistent way to identify, query, and link resources. This combination allows IT teams to access and manage servers using tools they are already familiar with, without needing specialized software.
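As a rough illustration of the JSON and OData conventions described above, the sketch below parses a trimmed service-root document of the kind a GET on /redfish/v1/ typically returns. The sample payload and the `collection_links` helper are assumptions made for illustration, not output captured from a specific BMC.

```python
import json

# Hypothetical, trimmed service-root payload; a real BMC returns more fields.
SAMPLE_SERVICE_ROOT = """
{
  "@odata.id": "/redfish/v1/",
  "RedfishVersion": "1.17.0",
  "Systems": {"@odata.id": "/redfish/v1/Systems"},
  "Chassis": {"@odata.id": "/redfish/v1/Chassis"},
  "Managers": {"@odata.id": "/redfish/v1/Managers"}
}
"""


def collection_links(service_root: dict) -> dict:
    """Map each top-level resource name to the URI held in its @odata.id link."""
    return {
        name: value["@odata.id"]
        for name, value in service_root.items()
        if isinstance(value, dict) and "@odata.id" in value
    }


root = json.loads(SAMPLE_SERVICE_ROOT)
links = collection_links(root)
for name, uri in links.items():
    print(f"{name}: {uri}")
```

Because every resource carries an `@odata.id` link, a client can start at the service root and discover the rest of the tree instead of hard-coding paths.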

       

      2. How Does NVIDIA Support Redfish API?

       

      NVIDIA provides Redfish API support across its data center platforms, including the DGX H100 and DGX H200 systems. This support is integrated directly into the Baseboard Management Controller (BMC) and the system BIOS (SBIOS). Because it is enabled by default, administrators can begin using Redfish APIs without additional setup, which shortens the path to secure remote management.

       

      Comparison diagram: Redfish API (modern, secure, JSON) versus IPMI (legacy, plaintext, binary) for infrastructure management.

       

      Redfish support in DGX systems includes a wide range of operations that are critical for infrastructure management. Administrators can manage user accounts, control system power, and access detailed sensor telemetry for temperature, fan speed, and voltages. Logs can be viewed and exported for system health analysis. Redfish also allows configuration of boot order and system restart settings, making remote management much easier. Power capping features are available as well, enabling teams to balance performance with energy efficiency.

       

      The DGX H200 adds further features through firmware enhancements. With updated firmware, administrators gain more fine-grained power policy controls that improve energy optimization across workloads. Enhanced diagnostic tools are also available via Redfish, offering better visibility into system health and failure prediction. These improvements allow IT teams to proactively address performance or hardware issues before they affect critical AI workloads.

       

      Overall, NVIDIA H200 Redfish API support ensures that organizations have a modern, secure, and automation-ready framework for system management. By combining GPU performance with advanced remote management, NVIDIA helps data centers scale AI workloads while maintaining reliability and efficiency.
       
      Table: NVIDIA Redfish Support in DGX H100/H200
       

      Feature | Description
      Default Support | Redfish enabled by default in BMC and SBIOS
      Management Capabilities | Accounts, system health, sensors, power capping, boot options, logs
      Firmware and Diagnostics | Power policy control, metrics, diagnostics over Redfish API

       

      3. Why Is Redfish API Better Than IPMI?

       

      Redfish API support represents a major step forward in server and infrastructure management compared to older interfaces like IPMI. It is designed to meet the needs of modern, large-scale environments where security, interoperability, and automation are critical. By using web-native technologies, Redfish offers a secure and flexible way to manage servers across vendors and platforms.

       

      Here are the key differences between Redfish and IPMI:

       

      Security
      Redfish is built on HTTPS, so traffic between management tools and systems is encrypted with TLS. IPMI, by contrast, was designed around older transport and authentication mechanisms, and deployments often end up with weakly protected or plaintext communication that can expose credentials and other sensitive data.

       

      Data Representation
      Redfish uses JSON (JavaScript Object Notation), a lightweight and human-readable format. JSON makes it easier for both administrators and automation tools to parse and interact with system data. IPMI uses binary formats that are harder to interpret and integrate.

       

      Standardization and Interoperability
      Redfish is developed and maintained by the Distributed Management Task Force (DMTF), ensuring a consistent and vendor-neutral standard. This allows seamless management across multi-vendor environments. IPMI lacks this level of interoperability and often results in vendor-specific implementations.

       

      Extensibility
      Redfish uses a schema-based, model-driven design. This allows vendors to extend functionality for new technologies without breaking compatibility. IPMI is rigid and has limited adaptability to new use cases.

       

      Modern Use Cases
      Redfish is designed for hybrid and cloud-native environments where automation, scalability, and real-time telemetry are critical. IPMI was created decades ago for simpler server management and struggles to meet the requirements of today’s data centers.

       

      Table: Redfish vs. IPMI
       

      Feature | Redfish API Support | IPMI
      Security | HTTPS-based, encrypted communication | Plaintext, less secure
      Data Format | JSON (lightweight, human-readable) | Binary (complex, less flexible)
      Standardization | Vendor-neutral, DMTF-backed | Vendor-specific variations
      Extensibility | Schema-based, easy to extend | Rigid, hard to adapt
      Use Cases | Ideal for cloud-native, scalable systems | Suited for legacy, smaller environments
      Automation Support | Strong integration with modern tools & APIs | Limited automation capabilities
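To make the security and automation rows more concrete, here is a minimal sketch of Redfish session authentication: a client POSTs credentials to the session collection over HTTPS and then sends the returned token in an `X-Auth-Token` header. The BMC address and credentials are hypothetical, and the session path shown is the commonly used one; a given service advertises its own in the service root. The request is built but not sent.

```python
import json
import urllib.request


def login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST that creates a Redfish session."""
    body = json.dumps({"UserName": username, "Password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/redfish/v1/SessionService/Sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_headers(token: str) -> dict:
    """Headers for later requests; the token comes from the login
    response's X-Auth-Token header."""
    return {"X-Auth-Token": token, "Accept": "application/json"}


# Hypothetical BMC address and credentials, for illustration only.
req = login_request("https://bmc.example.com", "admin", "example-password")
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen` verifies the server certificate by default, so a BMC with a self-signed certificate needs its CA added to the trust store rather than verification switched off.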

       

      4. How Can the H200 Leverage Redfish API Support?

       

      Redfish API support enables advanced management of GPU-powered systems such as the DGX H100 and DGX H200 through a secure and standardized interface. By adopting Redfish, H200-based systems make it easier for administrators to handle large clusters where manual configuration would be slow and error-prone. This is especially valuable in AI and HPC environments where performance and reliability are critical.

       

      System architecture diagram: Redfish API integration with NVIDIA H200 GPU systems enabling remote, automated management via HTTPS/JSON.

       

      One key advantage is the ability to streamline routine tasks such as firmware updates. With Redfish API support, updates can be applied remotely without requiring direct physical access to the servers. This reduces downtime and allows administrators to patch and upgrade H200-powered systems consistently across an entire cluster.
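A remote update across a cluster could be scripted roughly as follows. This sketch uses the `SimpleUpdate` action defined by the standard Redfish UpdateService schema; whether a particular platform accepts `SimpleUpdate` or expects a different upload method is an assumption to check against its UpdateService resource and vendor documentation. Host names, the image URI, and the token are hypothetical, and the requests are built but not sent.

```python
import json
import urllib.request

# Action path from the standard UpdateService schema.
SIMPLE_UPDATE_PATH = "/redfish/v1/UpdateService/Actions/UpdateService.SimpleUpdate"


def simple_update_request(base_url: str, image_uri: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a SimpleUpdate POST for one BMC."""
    body = json.dumps({"ImageURI": image_uri}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + SIMPLE_UPDATE_PATH,
        data=body,
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


# Hypothetical cluster nodes and firmware location.
nodes = ["https://bmc-node01.example.com", "https://bmc-node02.example.com"]
image = "https://firmware.example.com/bundle.fwpkg"
pending = [simple_update_request(node, image, "EXAMPLE-TOKEN") for node in nodes]

for request in pending:
    print(request.get_method(), request.full_url)
```

Building one request per node from the same image URI is what keeps the cluster consistent: every BMC is asked to apply exactly the same bundle.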

       

      The NVIDIA H200 also benefits from Redfish power policy management. Administrators can create or delete power policies to optimize energy consumption while maintaining performance. For example, workloads that require peak GPU performance can be given higher power allowances, while less demanding tasks can be restricted to save energy. These policies can be automated through Redfish, improving overall efficiency.
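The idea of giving different workloads different power allowances can be sketched with the `PowerLimit.LimitInWatts` property from the standard Redfish Power schema. This is an assumption made for illustration: the power policy resources exposed by H200 firmware may use different paths or properties, and the wattages below are invented.

```python
import json

# Hypothetical per-workload allowances in watts; real limits depend on the platform.
POWER_LIMITS_WATTS = {"training": 10000, "inference": 7000, "idle": 4000}


def power_limit_patch(workload: str) -> str:
    """Return a PATCH body that sets PowerLimit.LimitInWatts for a workload class."""
    if workload not in POWER_LIMITS_WATTS:
        raise ValueError(f"unknown workload class: {workload!r}")
    limit = POWER_LIMITS_WATTS[workload]
    return json.dumps({"PowerControl": [{"PowerLimit": {"LimitInWatts": limit}}]})


# The body would be sent with PATCH to a chassis Power resource,
# for example /redfish/v1/Chassis/<chassis-id>/Power.
print(power_limit_patch("inference"))
```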

       

      System monitoring is another important use case. Redfish exposes telemetry data such as temperatures, fan speeds, and GPU status through a standardized model. This allows for proactive diagnostics and better predictive maintenance. Enhanced Redfish diagnostics in the H200 firmware provide deeper insights into system health, helping prevent failures before they affect workloads.
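A proactive check on this telemetry might look like the sketch below, which flags temperature sensors that are close to their critical threshold. The property names follow the standard Redfish Thermal schema; the sensor names, readings, and the 5 °C margin are made up for illustration.

```python
import json

# Hypothetical Thermal payload; sensor names and values are invented.
SAMPLE_THERMAL = """
{
  "Temperatures": [
    {"Name": "GPU0 Temp", "ReadingCelsius": 71, "UpperThresholdCritical": 90},
    {"Name": "GPU1 Temp", "ReadingCelsius": 88, "UpperThresholdCritical": 90},
    {"Name": "Inlet Temp", "ReadingCelsius": null, "UpperThresholdCritical": 45}
  ],
  "Fans": [
    {"Name": "Fan 1", "Reading": 9800, "ReadingUnits": "RPM"}
  ]
}
"""


def sensors_near_limit(thermal: dict, margin_c: float = 5.0) -> list:
    """Names of temperature sensors within margin_c of their critical threshold."""
    flagged = []
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        if reading is None or limit is None:
            continue  # sensor not populated yet; skip rather than guess
        if limit - reading <= margin_c:
            flagged.append(sensor["Name"])
    return flagged


hot = sensors_near_limit(json.loads(SAMPLE_THERMAL))
print(hot)
```

Skipping sensors with a null reading avoids raising false alarms when a value has not been populated yet.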

       

      Finally, Redfish API support makes deployment of GPU-rich clusters faster and more reliable. Automated configuration and monitoring ensure that H200-based systems are set up consistently, reducing manual intervention. This is particularly useful in large-scale AI factories where thousands of GPUs need to be managed in parallel.

       

      Table: Benefits of Redfish API for NVIDIA H200 Deployments
       

      Benefit | Impact on H200 Systems
      Remote Configuration | Manage power, boot, and firmware remotely
      Automated Maintenance | Improve uptime and reduce manual intervention
      Scalable Operations | Smooth management of multi-node GPU clusters
      Enhanced Monitoring | Access real-time sensor and health data via standard APIs

       

      5. What Are Implementation Use Cases and Best Practices?

       

      The Redfish API is not just a standard; it is also practical for day-to-day operations. System administrators can script updates or configuration changes directly through Redfish commands. For example, using simple tools like curl or NVIDIA’s nvfwupd CLI, firmware updates can be triggered remotely without requiring physical access to the server. This helps maintain consistent system states across clusters.

       

      Benefits of Redfish API for NVIDIA H200 deployments, highlighting remote management, automation, scalability, monitoring.

       

      A common workflow is monitoring power usage. Redfish exposes real-time telemetry on system power draw, which allows admins to optimize energy policies for GPU-intensive workloads. Similarly, performing firmware updates through Redfish reduces manual intervention and ensures that H200-based systems run with the latest security patches and performance improvements.
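The power-monitoring workflow can be approximated by polling a Power resource and summarizing the samples. The sketch below works on already-captured payloads so it runs offline; `PowerConsumedWatts` comes from the standard Redfish Power schema, while the wattages and the choice of summary statistics are assumptions.

```python
import json
from statistics import mean

# Hypothetical Power payloads captured by polling; wattages are invented.
POLLED = [
    '{"PowerControl": [{"PowerConsumedWatts": 6200}]}',
    '{"PowerControl": [{"PowerConsumedWatts": 8100}]}',
    '{"PowerControl": [{"PowerConsumedWatts": 7300}]}',
]


def consumed_watts(payload_text: str) -> float:
    """Total power draw reported by one Power payload (missing values count as 0)."""
    power = json.loads(payload_text)
    return sum(pc.get("PowerConsumedWatts") or 0 for pc in power.get("PowerControl", []))


samples = [consumed_watts(text) for text in POLLED]
summary = {"peak_w": max(samples), "mean_w": mean(samples)}
print(summary)
```

Peak and mean draw over a job's lifetime are the two figures an energy policy usually needs: the peak bounds the cap, and the mean indicates how much headroom is typically unused.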

       

      Another frequent use case is resetting the Baseboard Management Controller (BMC). The BMC handles low-level system management, and being able to reset it remotely saves time when issues occur. Collecting system logs is also straightforward with Redfish API support, providing admins with historical data for troubleshooting hardware or performance issues.
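A remote BMC reset is normally issued through the `Manager.Reset` action. Rather than hard-coding the URI, the sketch below reads the action target and the allowed reset types from the Manager resource itself, which is how the Redfish data model advertises actions. The manager Id and the sample payload are hypothetical.

```python
import json

# Hypothetical Manager resource, trimmed to the part that describes the reset action.
SAMPLE_MANAGER = """
{
  "Id": "BMC",
  "Actions": {
    "#Manager.Reset": {
      "target": "/redfish/v1/Managers/BMC/Actions/Manager.Reset",
      "ResetType@Redfish.AllowableValues": ["GracefulRestart", "ForceRestart"]
    }
  }
}
"""


def reset_call(manager: dict, reset_type: str = "GracefulRestart") -> tuple:
    """Return (target URI, POST body) for a BMC reset, validating the reset type."""
    action = manager["Actions"]["#Manager.Reset"]
    allowed = action.get("ResetType@Redfish.AllowableValues", [])
    if allowed and reset_type not in allowed:
        raise ValueError(f"{reset_type!r} not supported; allowed values: {allowed}")
    return action["target"], {"ResetType": reset_type}


target, body = reset_call(json.loads(SAMPLE_MANAGER))
print("POST", target, json.dumps(body))
```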

       

      Redfish further simplifies tasks like adjusting the boot order of a system. This is especially helpful when deploying clusters at scale. Administrators can set the boot sequence programmatically, ensuring that nodes initialize correctly and consistently during large deployments.
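Setting the boot device programmatically usually means sending a PATCH to the ComputerSystem resource with a `Boot` object. The sketch below builds that body using the standard `BootSourceOverrideEnabled` and `BootSourceOverrideTarget` properties. The target values a given system accepts are listed by the system itself, so "Pxe" here is an assumption.

```python
import json

# Values defined by the standard ComputerSystem schema for BootSourceOverrideEnabled.
ENABLED_MODES = {"Once", "Continuous", "Disabled"}


def boot_override_patch(target: str, enabled: str = "Once") -> str:
    """Return a PATCH body that overrides the boot source of a system."""
    if enabled not in ENABLED_MODES:
        raise ValueError(f"enabled must be one of {sorted(ENABLED_MODES)}")
    return json.dumps(
        {
            "Boot": {
                "BootSourceOverrideEnabled": enabled,
                "BootSourceOverrideTarget": target,
            }
        }
    )


# The body would be sent with PATCH to /redfish/v1/Systems/<system-id>.
pxe_once = boot_override_patch("Pxe")
print(pxe_once)
```

Using "Once" makes a node network-boot for provisioning and then fall back to its normal boot order on the next restart, which suits large rollouts.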

       

      There are also a few known quirks. Sensor reporting errors sometimes occur when telemetry values briefly go out of sync, leading to inaccurate readings. Another issue involves boot inventory timing, where system data may not be immediately available after startup. The recommended remedy is to add slight delays or retries in automation scripts. These adjustments improve reliability and ensure accurate system information.
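The retry-and-delay remedy can be captured in a small helper. This is a minimal sketch: the attempt count, delay, and readiness check are assumptions to tune per environment, and the simulated responses stand in for a BMC whose inventory is empty for the first two polls after boot.

```python
import time


def fetch_with_retry(fetch, is_ready, attempts=5, delay_s=2.0, sleep=time.sleep):
    """Call fetch() until is_ready(result) is true or the attempts run out."""
    last = None
    for attempt in range(1, attempts + 1):
        last = fetch()
        if is_ready(last):
            return last
        if attempt < attempts:
            sleep(delay_s)  # give the BMC time to finish populating data
    raise TimeoutError(f"resource not ready after {attempts} attempts; last response: {last!r}")


# Simulated BMC responses: empty inventory twice, then populated.
responses = iter(
    [
        {"Members": []},
        {"Members": []},
        {"Members": [{"@odata.id": "/redfish/v1/Systems/1"}]},
    ]
)
waits = []  # records the delays instead of actually sleeping
inventory = fetch_with_retry(
    fetch=lambda: next(responses),
    is_ready=lambda result: len(result["Members"]) > 0,
    sleep=waits.append,
)
print(len(inventory["Members"]), "member(s) after", len(waits), "retries")
```

Passing the sleep function as a parameter keeps the helper testable without real delays; production scripts can rely on the default `time.sleep`.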

       

      Conclusion: Why Redfish API Support Is Essential for Next-Gen Systems

       

      Redfish API support is now a key requirement for managing advanced computing systems. It replaces outdated management methods with a secure, web-based interface that is both scalable and easy to use.

       

      For systems like the NVIDIA H200, Redfish API support enables administrators to manage hardware remotely with full visibility into critical resources. This helps maintain system health in GPU-rich data centers.

       

      Organizations that adopt NVIDIA H200 Redfish API support gain a competitive edge. They are better prepared to handle growing AI workloads and manage large-scale GPU deployments with greater efficiency. As AI systems expand in size and complexity, Redfish-enabled platforms will become essential for long-term success.

       


      Writing About AI

      Semifly

      is an engineer and a technologist with a diverse background spanning software, hardware, aerospace, defense, and cybersecurity. As CTO at Semifly, he leverages his extensive experience to lead the company’s technological innovation and development.


      FAQs

      • The Redfish API is an industry-standard specification developed by the Distributed Management Task Force (DMTF) for managing and monitoring hardware systems through a modern, web-based interface. It is crucial for modern infrastructure because it offers a secure, scalable, and standardised alternative to older management technologies like IPMI. Redfish uses a RESTful API model, communicating via HTTPS and representing data in JSON format, which makes it significantly easier for administrators and automation tools to interact with systems consistently across various vendors and platforms. Its web-native design ensures it scales well across cloud, on-premises, and edge deployments, making it an essential tool for building flexible, future-ready, and AI-ready infrastructures.

      • NVIDIA integrates Redfish API support directly into its data centre platforms, including the DGX H100 and DGX H200 systems. This support is built into the Baseboard Management Controller (BMC) and the system BIOS (SBIOS) and is enabled by default, meaning administrators can use Redfish APIs without additional setup. For H200-based systems, this integration allows for advanced capabilities such as remote management of user accounts, system power control, detailed sensor telemetry (temperature, fan speed, voltages), and log access. Furthermore, firmware enhancements on the DGX H200 provide more fine-grained power policy controls for energy optimisation and enhanced diagnostic tools for better visibility into system health and failure prediction. This combination of open standards and hardware-level integration allows for consistent and automated management of GPU-rich servers.

      • Redfish API offers significant improvements over IPMI, addressing the demands of modern, large-scale computing environments. Key advantages include:

         

        Security: Redfish uses HTTPS for encrypted and secure communication, whereas IPMI often relies on plaintext, posing security risks.

         

        Data Representation: Redfish uses JSON, a lightweight and human-readable format, making data interaction easier for administrators and automation tools. IPMI uses complex binary formats.

         

        Standardisation and Interoperability: Redfish is a vendor-neutral standard backed by the DMTF, ensuring seamless management across multi-vendor environments, unlike the often vendor-specific implementations of IPMI.

         

        Extensibility: Redfish’s schema-based, model-driven design allows vendors to extend functionality for new technologies without breaking compatibility, which is difficult with IPMI’s rigid structure.

         

        Modern Use Cases: Redfish is designed for hybrid and cloud-native environments, supporting automation, scalability, and real-time telemetry, while IPMI was developed for simpler, older server management.

      • H200-based systems leverage Redfish API support to enable advanced and streamlined management of GPU-powered servers, which is critical for high-performance computing (HPC) and AI workloads. Redfish allows for:

         

        Streamlined Routine Tasks: Remote firmware updates can be applied without physical access, reducing downtime and ensuring consistent patching across clusters.

         

        Optimised Power Management: Administrators can create and manage power policies to optimise energy consumption while maintaining performance, adjusting power allowances based on workload demands.

         

        Enhanced System Monitoring: Redfish exposes detailed telemetry data (temperatures, fan speeds, GPU status) for proactive diagnostics and predictive maintenance. Enhanced diagnostics in H200 firmware provide deeper insights to prevent failures.

         

        Faster and More Reliable Deployment: Automated configuration and monitoring ensure H200-based systems are set up consistently, reducing manual intervention, which is invaluable in large-scale AI factories managing thousands of GPUs.

      • The Redfish API is highly practical for day-to-day data centre operations. Common implementation use cases include:

         

        Scripted Updates and Configuration: System administrators can script firmware updates or configuration changes using Redfish commands, for example, via tools like curl or NVIDIA’s nvfwupd CLI, allowing for remote management and consistent system states.

         

        Monitoring Power Usage: Real-time telemetry on system power draw enables optimisation of energy policies for GPU-intensive workloads, balancing performance with energy efficiency.

         

        Remote BMC Reset: The ability to remotely reset the Baseboard Management Controller (BMC) saves time when troubleshooting low-level system management issues.

         

        Collecting System Logs: Easily collecting system logs provides historical data for troubleshooting hardware or performance problems.

        Adjusting Boot Order: Programmatically setting the boot sequence simplifies deploying clusters at scale, ensuring consistent node initialization.

         

      • Are there any known issues or best practices to consider when implementing Redfish API? While Redfish API is robust, there are a few known quirks and best practices to ensure reliable implementation:

         

        Sensor Reporting Errors: Occasionally, telemetry values may briefly go out of sync, leading to inaccurate sensor readings. A best practice is to incorporate retries or slight delays in automation scripts to ensure accurate data.

         

        Boot Inventory Timing: System data might not be immediately available after startup. Adding slight delays or retries in automation scripts can improve reliability and ensure accurate system information is captured.

         

        Automation: Leverage the RESTful nature and JSON format to integrate Redfish with modern automation tools and scripting languages for efficient, scalable management.

         

        Security: Always ensure HTTPS is used for communication and adhere to strong authentication practices given Redfish’s access to low-level hardware controls.

         

        Vendor-Neutrality: Utilise Redfish’s standardisation to manage multi-vendor environments consistently, avoiding vendor-specific tools where possible.

      • Redfish API is considered essential for next-generation computing systems because it provides a secure, scalable, and web-based interface that addresses the complexities of modern data centres, especially those supporting AI workloads. As AI systems grow in size and complexity, manual management becomes unsustainable. Redfish enables:

         

        Scalability: It allows efficient management of large-scale GPU deployments and multi-node clusters that are common in AI factories.

         

        Automation: Its API-driven nature facilitates automated configuration, monitoring, and maintenance, reducing human error and operational costs.

         

        Efficiency and Reliability: Features like advanced power control, remote firmware updates, and comprehensive diagnostics contribute to maintaining system health and uptime in GPU-rich environments.

         

        Future-Proofing: Its extensible design ensures it can adapt to new technologies and evolving demands of AI and high-performance computing, offering a long-term solution for infrastructure management.

      • The “Insight Category” for Redfish API support includes Datacentre, Information Technology, Infrastructure, Research and Development, and Technology Services. These categories signify that Redfish is a fundamental technology underpinning the operational efficiency, management, and development within these areas. It is critical for the core infrastructure of data centres, enhances IT management practices, supports R&D efforts in technology, and is a key component of technology service offerings.

         

        The “Insight Industry” includes Energy and Utilities, High Tech and Electronics, and Technology. This indicates that Redfish API support is particularly relevant and impactful in these sectors. For Energy and Utilities, it can aid in managing the underlying IT infrastructure required for smart grids or operational technology. In High Tech and Electronics, it’s essential for the design, manufacturing, and deployment of modern hardware. Across the broader Technology industry, it’s a foundational standard for managing the hardware that powers everything from cloud services to enterprise solutions.
