Hardware Automation - BIOS settings

As you know, there are several hardware vendors out there, and by nature they all use different ways to automate their kit. Luckily, something I would call a standard has made it into most vendor APIs: the Redfish API.

For both parts I would like to concentrate on HPE first, as it is one of the major players in the game.

 

As we are clearly looking at the whole topic through VMware glasses, I will reuse the tool I have used to automate customers' VMware VVD/Cloud Foundation deployments: vRealize Orchestrator.

 

In the case of HPE there is a tool on offer that could solve the baselining problems just discussed, but it also might not meet your requirements: I'm speaking about OneView. This tool from HPE clearly focuses on HPE hardware and does not provide a solution I can use across a heterogeneous hardware landscape, so it is not an option for me.

What I definitely do like to use are the APIs OneView itself leverages to automate the individual hardware components. One of these is the iLO RESTful API.

It has been available since iLO 4 firmware 2.00 (found on Gen8 and Gen9 kit) and has been enhanced with every new minor release. The latest evolution is iLO 5, which offers the broadest feature set so far.

I will reduce the scope to iLO 5 for now, as this covers most current deployments: it ships with all the Gen10 kit HPE delivers to customers at the moment. In a later article I might show what I also developed to support iLO 4 on Gen8 and Gen9 hardware.

 

But let's kick it off. The iLO license edition you will need to follow my explanations in the content below, as well as in the following parts, is “Advanced”.

 

Looking at BIOS settings, it is always a good idea to create a golden host that contains the BIOS configuration you would like to distribute to all other nodes of the same model in your datacenter. I will not provide any guidance on how to find your optimal configuration in this article, but maybe in a future one. In the meantime I can recommend the “VMware vSphere 6.5 Host Resources Deep Dive” by Frank Denneman and Niels Hagoort.

Assuming you have configured a host according to your needs, we first need to read the current configuration from this node via its iLO RESTful API, clean it up, and make it our golden configuration to use for every deployment of a new server or to distribute across your existing estate.

 

As we need to run all iLO tasks in the same way, I have created a wrapper workflow, which takes the individual request and handles authentication for us.

// Workflow inputs : name, url, user, password, connectionTimeout, operationTimeout,
//                   hostVerification, callmethod, call, callcontent
// Workflow outputs: responseStatusCode, responseContent

// basic authentication against the iLO with a shared session
var auth = RESTAuthenticationManager.createAuthentication("Basic",["Shared Session",user,password]);

// define the REST host pointing to the iLO
var host = RESTHostManager.createHost(name);
host.url = url;
host.connectionTimeout = connectionTimeout;
host.operationTimeout = operationTimeout;
host.hostVerification = hostVerification;
host.authentication = auth;

// create a temporary (transient) host so nothing is persisted in the vRO inventory
var restHost = RESTHostManager.createTransientHostFrom(host);

// prepare and execute the main REST request handed over by the calling workflow
var request = restHost.createRequest(callmethod,call,callcontent);

// only non-GET operations carry a JSON payload
if(callmethod != "GET")
{
  request.contentType = "application/json";
}
var response = request.execute();

if(response.statusCode != 200 && response.statusCode != 201)
{
  System.debug("Status code was "+response.statusCode);
  throw "return code of main request was not 200 or 201";
}

responseStatusCode = response.statusCode;
responseContent = response.contentAsString;

 

Read configuration

The following call is simply passed through the wrapper workflow as a GET request:

/redfish/v1/Systems/1/Bios/Settings/

Read iLO BIOS config

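To give you an idea of how the wrapper gets fed, here is a minimal sketch of a scriptable task preparing the read request and parsing the answer afterwards. The names callmethod, call, callcontent and responseContent match the wrapper above; the rest is purely illustrative.

// Minimal read sketch: prepare the wrapper inputs for the GET call above
var callmethod = "GET";
var call = "/redfish/v1/Systems/1/Bios/Settings/";
var callcontent = "";  // a GET request carries no payload

// ... the wrapper workflow runs at this point and fills responseContent ...

// parse the JSON answer so we can work with the individual attributes
var bios = JSON.parse(responseContent);
var count = 0;
for (var key in bios.Attributes) { count++; }
System.log("Read " + count + " BIOS attributes from the golden host");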

Clean up

As you might have seen while looking at the JSON output of the read call, the response contains a lot of personalized information about the server, like the server name and serial number. We don't want to carry this over to a new server, so we simply delete those parts of the JSON structure; a small sketch of how this could look follows the sample below. (The server name might still be useful, but it needs to be individualized for every write action.)

{
  "Attributes": {
    "AcpiHpet": "Enabled",
    "AcpiRootBridgePxm": "Enabled",
    "AcpiSlit": "Enabled",
    "AdjSecPrefetch": "Enabled",
    "AdminEmail": "",
    "AdminName": "",
    "AdminOtherInfo": "",
    "AdminPhone": "",
    "AdvancedMemProtection": "AdvancedEcc",
    "AsrStatus": "Enabled",
    "AsrTimeoutMinutes": "Timeout10",
                ...
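
A minimal clean-up sketch could look like the one below. The attribute names in the list are just examples from my environment; which entries you strip depends on your hardware and your needs. The Redfish @odata.* metadata is host-specific as well and not needed for the write, so I drop it too.

// Clean-up sketch: strip host-specific values before storing the golden configuration
var bios = JSON.parse(responseContent);

// example attributes only - adjust the list to your environment
var personalized = ["AdminName", "AdminEmail", "AdminOtherInfo", "AdminPhone",
                    "ServerName", "ServerAssetTag", "SerialNumber"];
for (var i = 0; i < personalized.length; i++)
{
  delete bios.Attributes[personalized[i]];
}

// drop the host-specific Redfish metadata links
for (var key in bios)
{
  if (key.indexOf("@odata.") === 0)
  {
    delete bios[key];
  }
}

var goldenConfig = JSON.stringify(bios);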

 

Write back to a new host

The write process is pretty easy and as quick as the read process. We just need to run a POST request to the same URL with the amended JSON response from the read call as payload. You can paste this configuration as workflow input every time or, like me, read it from a configuration element. In my case I've created a dedicated one which contains an entry for each server model, so I can read it based on the model returned by the API.

You might wonder if this configuration is already active: no. HPE uses an approach many network hardware vendors use for their configuration. There are two types of configurations: the running configuration, which is currently active, and the pending configuration, which is the one you have just changed. In our case the pending configuration automatically becomes active once the server is reset.

When the server boots through POST (Power-On Self-Test), you will notice that a remote configuration takes place and the server might automatically restart multiple times; this is the point where your configuration becomes active.

/redfish/v1/Systems/1/Bios/Settings/
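
For illustration, a write sketch could look like this. goldenConfig is assumed to hold the cleaned JSON string from the previous step (or the entry read from the configuration element), and newServerName is a hypothetical workflow input used to re-personalize the host.

// Write sketch: push the golden configuration to a new host through the same wrapper
var payload = JSON.parse(goldenConfig);

// re-personalize fields that must stay unique per host, e.g. the server name
payload.Attributes.ServerName = newServerName;  // hypothetical workflow input

var callmethod = "POST";
var call = "/redfish/v1/Systems/1/Bios/Settings/";
var callcontent = JSON.stringify(payload);

// ... the wrapper workflow executes the request and returns responseStatusCode ...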

 

Write iLO BIOS config

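If you like to verify after the reboot that the pending configuration really became the running one, you can compare the two Redfish resources: as far as I know the running settings live under /redfish/v1/Systems/1/Bios/, while the pending ones stay in the Settings sub-resource used above. A rough sketch, assuming both GET responses were fetched through the wrapper beforehand:

// Verification sketch: compare one attribute of the running configuration against the pending one
var running = JSON.parse(runningResponseContent);  // GET /redfish/v1/Systems/1/Bios/
var pending = JSON.parse(pendingResponseContent);  // GET /redfish/v1/Systems/1/Bios/Settings/

var attribute = "AdvancedMemProtection";           // example attribute from the sample above
if (running.Attributes[attribute] === pending.Attributes[attribute])
{
  System.log(attribute + " has been applied: " + running.Attributes[attribute]);
}
else
{
  System.warn(attribute + " is still pending: " + pending.Attributes[attribute]);
}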

 

Congrats, you have successfully distributed your custom configuration. This process can be adapted for other vendors; stay tuned for updates, with at least Dell in my pipeline.

For sure this only covers a single host. Feel free to wrap this into a loop or use it as part of a higher-level workflow.

 

All code can be downloaded as a vRO package from here:

com.schoen-computing.vro.ilo

(You need to set the iLO password in the Configuration Element section before running the workflows; otherwise they will fail.)

 

Hardware Automation - Motivation

Automation in and around the SDDC mostly focuses on orchestrating virtual infrastructure for certain operational processes or leveraging it to deploy workloads/services as part of bigger blueprints.

From my day-to-day experience the most overlooked part is the key component that enables the SDDC in the first place: the underlying hardware.

You are totally right in saying that hardware should be treated as cattle and not as pets, but in most companies it lacks the level of automation that exists for the virtual infrastructure.

API-first claims, as made by many software vendors, have not reached the hardware-producing companies so far. Many of them have only just started providing public APIs for their components.

 

So, what key advantages can you expect if you invest (mostly time) into automating hardware?

 

  • Reduce hardware deployment times
  • Profiling customizable settings of your hardware, e.g. BIOS/UEFI settings, to ensure the same reliability and performance across your estate, which gives you
  • Fewer situations of unpredictable behavior (who would have thought about different BIOS settings when trying to track down an issue with your hypervisor?)
  • Maintain vendor supplied hardware-firmware-driver combinations
  • Ensure that rolled out firmware and configuration settings comply with the standard you have engineered and tested

 

Looking back over the last few years, many customers were facing issues in their environments because of exactly the kind of configuration drift I mentioned above. Standardization is key here. Some of you might argue that automation would spread issues everywhere, not just to single servers. That's basically correct, but in the end that is a quality problem of your engineering :) Apply the same engineering and testing efforts to your hardware configurations and you will win.

A lot of companies roll server hardware into their datacenter just as it arrives from the vendor and assume the vendor has chosen the right configuration for them. How should the vendor have known what your intended workload is? But even this methodology does not ensure that the servers are using the same firmware or BIOS configuration. Trust me.

 

The following content should give you an idea of what options you have to baseline your hardware. The content will be split into two parts:

 

VMware Validated Designs does not exclude Stretched Cluster in general

Working as an architect in the VMware space, you will sooner or later come across the VMware Validated Designs (VVD). Just a few weeks ago the latest version, 4.0, was released, bringing adjustments for vSphere 6.5. It can be found here:

VMware Validated Designs Documentation

The designs are a great source for building your own architectures or building architectures for customers. The incorporated component architectures are natively built for availability, reliability and scalability; these are exactly the main goals I try to put into the designs I create for customers. The VVDs present a good practice for a detailed setup that can be used for several use cases like Private Cloud or VDI deployments. VMware Cloud Foundation also makes use of the VVDs for its implementations.

But apart from this I also like to treat them as a framework, which gives me the chance to keep the setup supported by VMware while adjusting it to the customer's needs so it fits like a second skin based on their requirements.

Throughout their history they have mainly relied on a two-region concept with one primary and one fail-over region. This is a quite common architecture for U.S. setups. In the European space, and especially in Germany, customers often stick to their existing architectures based on a two-datacenter setup working as an active/active pair. Whether you also see this as a two-region setup, or aggregate it into one region like me, is up to you. I prefer one region because the datacenters are only a short distance apart due to their synchronous replication/mirroring, and so they form one logical domain thanks to their active/active style. This is why I split the region into two physical availability zones (an AWS term) and one virtual availability zone spanning the two datacenters. This does not need to be understood now; it will become clearer in a later chapter.

In my understanding the VVD framework needs some extension with regard to Stretched Clusters, which is why I would like to set up a series that guides you through a forked version of the VVDs I personally use for customer designs:

  1. General thoughts
  2. Additions/Changes to physical architecture
  3. Additions/Changes to virtual architecture
  4. Additions/Changes to cloud management architecture
  5. Additions/Changes to operations management architecture
  6. Additions/Changes to business continuity architecture

Stay tuned!