How to deploy an EXAScaler Cloud Application

The EXAScaler Cloud application can be launched from the Microsoft Azure Portal by navigating to the link below and clicking the Create button.

Step-by-step deployment video

Step-by-step deployment guide

The steps below show how to create an EXAScaler Cloud cluster on the Microsoft Azure platform.

You will need an Azure account. Sign in.

Log in to your account and go to the dashboard.

You can return to the dashboard from anywhere by clicking the blue "Microsoft Azure" text in the upper left corner of the GUI.

To get started, click "Create a resource" to begin deploying a new instance of EXAScaler Cloud.

In the search box that appears, type "EXAScaler Cloud" and press Return. Note that the autocomplete suggestions that appear do not work as expected, so run the full search instead.

Click the "Create" button.

The EXAScaler Cloud environment can be deployed as a standard Azure application using the Azure Marketplace. The installation wizard is used to deploy it and contains a few configuration steps.

➊ Basics step

Project details section

  • Subscription - all resources in an Azure subscription are billed together, so we need to select one from the list of available subscriptions.
  • Resource group - a collection of resources that share the same lifecycle, permissions, and policies. We can select an existing resource group or create a new one to deploy an EXAScaler Cloud environment.

Instance details section

  • Region - the Azure region for our deployment. For example, we can select the location closest to us. We must have enough resources in this location for the deployment (number of available CPU cores).
  • Deployment type: Install to create a new deployment or Upgrade to update an existing deployment.
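The regional core quota can be checked from the Azure CLI before picking a region. A sketch, assuming an authenticated az session; the region name westus is only an example:

```shell
# Show current vCPU usage and limits in the target region.
# The deployment fails if the regional core quota is too low.
# "westus" is an example region -- substitute your own.
az vm list-usage --location westus --output table
```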

➋ Deployment step

Deployment details section

  • Configuration Profile: we can choose a predefined configuration profile or create a custom EXAScaler Cloud configuration. Two predefined profiles are available:
    - Small configuration profile, which configures Microsoft Azure resources to deliver 10 TB capacity and 1.44 GB/s throughput.
    - Medium configuration profile, which configures Microsoft Azure resources to deliver 50 TB capacity and 6 GB/s throughput.
    The Custom configuration profile allows us to deploy any environment according to our requirements.


EXAScaler Cloud pricing is based on the number of CPU cores configured to meet the defined performance requirements. DDN Premium Support is recommended for the Custom configuration profile.

  • Filesystem name - the name of the filesystem that will be mounted by compute clients as a distributed network resource.

Availability options section

Azure offers a range of options for managing availability and resiliency. We can choose:

  • None - no infrastructure redundancy required.
  • Availability set - a logical grouping of virtual machines that allows Azure to provide redundancy and availability. Each virtual machine in an availability set is assigned an update domain and a fault domain by the underlying Azure platform. Fault domains define the group of virtual machines that share a common power source and network switch. The virtual machines configured within an availability set are separated across three fault domains. Update domains indicate groups of virtual machines and underlying physical hardware that can be rebooted at the same time.
  • Availability zone - unique physical locations within an Azure region. Each zone is made up of one or more datacenters with independent power, cooling, and networking. To ensure resiliency, there is a minimum of three separate zones in all enabled regions. The physical separation of availability zones within a region protects applications and data from datacenter failures. An availability zone in an Azure region is a combination of a fault domain and an update domain.
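Whether a given VM size supports availability zones in a region can be checked from the Azure CLI. A sketch, assuming an authenticated az session; westus is only an example region:

```shell
# List VM sizes in a region together with their availability-zone support.
# "westus" is an example region -- substitute your own.
az vm list-skus --location westus --resource-type virtualMachines --zone --output table
```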

➌ Security step

This step allows you to choose the user name and authentication parameters for remote access to the management server. Microsoft Azure provides two options for
authentication:

  • Password-based authentication
  • SSH public key-based authentication

➍ Network step

This step configures the Azure Virtual Network, which enables resources to communicate securely with each other, and the subnet inside the virtual network where the EXAScaler Cloud environment will be deployed.

We can use an existing virtual network or create a new one and specify its address range.

We can also assign external IP addresses:

  • Management server - a public IP address will be assigned to the management server only
  • All servers - public IP addresses will be assigned to all servers
  • None - no public IP addresses will be assigned

We can also configure remote access for the SSH and HTTP protocols.

 

➎ Management step

To configure EXAScaler Cloud management server.

For the Custom configuration profile we can configure:

  • Type of management server

  • Type and size of management target
  • Type and size of monitoring target

All selected configurations are checked for compliance.

For predefined Small and Medium configuration profiles all configuration options are read-only.

➏ Metadata step

To configure the EXAScaler Cloud metadata server. For the Custom configuration profile we can change:

  • Type of metadata server
  • Type and size of metadata target

For predefined Small and Medium configuration profiles all configuration options are read-only.

➐ Storage step

To configure EXAScaler Cloud storage servers. For the Custom configuration profile we can change:

  • Type of storage server
  • Number of storage servers
  • Type, size and count of storage targets

For predefined Small and Medium configuration profiles all configuration options are read-only.

➑ Client step

Optionally, we can deploy compute client instances. By default, the size and number of compute client instances match the size and number of object storage servers. As an option, we can create and attach local disks to each EXAScaler Cloud compute client. We can change:

  • Size of storage client
  • Number of storage clients
  • Type, size and count of storage targets

All compute client options can be changed for any type of configuration profile.

➒ Review + create step

At this stage, Microsoft Azure checks all selected options against the subscription quota, and we can review all the parameters we have chosen before starting the deployment:

  • Subscription
  • Resource group
  • Region
  • Deployment type
  • Configuration profile
  • OS image
  • Filesystem name
  • Availability type
  • Availability zone
  • User name and authentication parameters
  • Virtual network and subnet parameters
  • Public IP address options
  • Remote access parameters
  • Management server options
  • Metadata server options
  • Storage server(s) options
  • Compute client(s) options

Now we can press the "Create" button and send the selected configuration for deployment.

Next, we will see the dashboard notification "Deployment is in progress":

Once the status changes to "Your deployment is complete", click on "Outputs" to view the deployment output:

In the "Outputs" screen:

  • Dashboard URL to display the deployment details
  • SSH command for remote access to the management server

  • Mount command to mount the EXAScaler Cloud distributed filesystem. All compute clients created in this deployment should mount this filesystem

  • Private IP addresses

Copy and paste the dashboard URL into a new browser tab to display detailed deployment statistics:

Some additional information is available on the Azure shared dashboard: Azure portal → Home → Resource groups → EXAScaler-Cloud:

Click the dashboard name and open the "Go to dashboard" link:


You can customize the dashboard according to your requirements.

How to automate deployment with Terraform

Automated deployment is possible with EXAScaler Cloud Terraform Scripts.

The steps below show how to create an EXAScaler Cloud environment on Microsoft Azure using Terraform.

Supported products

Product Version Base OS Stock Keeping Unit (SKU)
EXAScaler Cloud 5.2.6 Red Hat Enterprise Linux 7.9 exascaler_cloud_5_2_redhat
EXAScaler Cloud 5.2.6 CentOS Linux 7.9 exascaler_cloud_5_2_centos
EXAScaler Cloud 6.0.1 Red Hat Enterprise Linux 7.9 exascaler_cloud_6_0_redhat
EXAScaler Cloud 6.0.1 CentOS Linux 7.9 exascaler_cloud_6_0_centos
EXAScaler Cloud 6.1.0 Red Hat Enterprise Linux 7.9 exascaler_cloud_6_1_redhat
EXAScaler Cloud 6.1.0 CentOS Linux 7.9 exascaler_cloud_6_1_centos
EXAScaler Cloud 6.2.0 Red Hat Enterprise Linux 8.7 exascaler_cloud_6_2_redhat
EXAScaler Cloud 6.2.0 Rocky Linux 8.7 exascaler_cloud_6_2_rocky
EXAScaler Cloud 6.3.2 Red Hat Enterprise Linux 8.10 exascaler_cloud_6_3_redhat
EXAScaler Cloud 6.3.2 Rocky Linux 8.10 exascaler_cloud_6_3_rocky

Client packages

EXAScaler Cloud deployment provides support for installing and configuring third-party clients. EXAScaler Cloud client software comprises a set of kernel modules which must be compatible with the running kernel, as well as userspace tools for interacting with the filesystem.

Vendor Product Version Arch Kernel Version for binary package Kernel Version for DKMS package
Red Hat RHEL 7.6 x86_64 3.10.0-957.99.1.el7.x86_64 3.10.0
Red Hat RHEL 7.7 x86_64 3.10.0-1062.77.1.el7.x86_64 3.10.0
Red Hat RHEL 7.8 x86_64 3.10.0-1127.19.1.el7.x86_64 3.10.0
Red Hat RHEL 7.9 x86_64 3.10.0-1160.119.1.el7.x86_64 3.10.0
Red Hat RHEL 8.0 x86_64 4.18.0-80.31.1.el8_0.x86_64 4.18.0
Red Hat RHEL 8.1 x86_64 4.18.0-147.94.1.el8_1.x86_64 4.18.0
Red Hat RHEL 8.2 x86_64 4.18.0-193.141.1.el8_2.x86_64 4.18.0
Red Hat RHEL 8.3 x86_64 4.18.0-240.22.1.el8_3.x86_64 4.18.0
Red Hat RHEL 8.4 x86_64 4.18.0-305.148.1.el8_4.x86_64 4.18.0
Red Hat RHEL 8.5 x86_64 4.18.0-348.23.1.el8_5.x86_64 4.18.0
Red Hat RHEL 8.6 aarch64 4.18.0-372.105.1.el8_6.aarch64 4.18.0
Red Hat RHEL 8.6 x86_64 4.18.0-372.134.1.el8_6.x86_64 4.18.0
Red Hat RHEL 8.7 aarch64 4.18.0-425.19.2.el8_7.aarch64 4.18.0
Red Hat RHEL 8.7 x86_64 4.18.0-425.19.2.el8_7.x86_64 4.18.0
Red Hat RHEL 8.8 aarch64 4.18.0-477.86.1.el8_8.aarch64 4.18.0
Red Hat RHEL 8.8 x86_64 4.18.0-477.86.1.el8_8.x86_64 4.18.0
Red Hat RHEL 8.9 aarch64 4.18.0-513.24.1.el8_9.aarch64 4.18.0
Red Hat RHEL 8.9 x86_64 4.18.0-513.24.1.el8_9.x86_64 4.18.0
Red Hat RHEL 8.10 aarch64 4.18.0-553.40.1.el8_10.aarch64 4.18.0
Red Hat RHEL 8.10 x86_64 4.18.0-553.40.1.el8_10.x86_64 4.18.0
Red Hat RHEL 9.0 aarch64 5.14.0-70.101.1.el9_0.aarch64 5.14.0
Red Hat RHEL 9.0 x86_64 5.14.0-70.122.1.el9_0.x86_64 5.14.0
Red Hat RHEL 9.1 aarch64 5.14.0-162.23.1.el9_1.aarch64 5.14.0
Red Hat RHEL 9.1 x86_64 5.14.0-162.23.1.el9_1.x86_64 5.14.0
Red Hat RHEL 9.2 aarch64 5.14.0-284.99.1.el9_2.aarch64 5.14.0
Red Hat RHEL 9.2 x86_64 5.14.0-284.99.1.el9_2.x86_64 5.14.0
Red Hat RHEL 9.3 aarch64 5.14.0-362.24.1.el9_3.aarch64 5.14.0
Red Hat RHEL 9.3 x86_64 5.14.0-362.24.1.el9_3.x86_64 5.14.0
Red Hat RHEL 9.4 aarch64 5.14.0-427.50.1.el9_4.aarch64 5.14.0
Red Hat RHEL 9.4 x86_64 5.14.0-427.50.1.el9_4.x86_64 5.14.0
Red Hat RHEL 9.5 aarch64 5.14.0-503.26.1.el9_5.aarch64 5.14.0
Red Hat RHEL 9.5 x86_64 5.14.0-503.26.1.el9_5.x86_64 5.14.0
Canonical Ubuntu 16.04 LTS amd64 4.4 - 4.15
Canonical Ubuntu 18.04 LTS amd64 4.15 - 5.4
Canonical Ubuntu 20.04 LTS amd64 5.4 - 5.15
Canonical Ubuntu 20.04 LTS arm64 5.4 - 5.15
Canonical Ubuntu 22.04 LTS amd64 5.15 - 6.2
Canonical Ubuntu 22.04 LTS arm64 5.15 - 6.2
Canonical Ubuntu 24.04 LTS amd64 6.8 - TBD
Canonical Ubuntu 24.04 LTS arm64 6.8 - TBD

Notes:

  • Client packages for aarch64 and arm64 architectures are available only for EXAScaler Cloud 6.3
  • Client packages for Canonical Ubuntu 16.04 LTS are not available for EXAScaler Cloud 6.3
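Before selecting a client package, the running kernel can be checked and compared against the table above. A small sketch:

```shell
# Print the running kernel release.
# Binary client packages must match the exact kernel version in the table,
# while DKMS packages build against the kernel series (e.g. 4.18.0).
uname -r
```

Compare the output against the "Kernel Version" columns in the table above.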

Prerequisites

Before deploying the Terraform code for Microsoft Azure, you will need to authenticate with the Microsoft account you used to log in to the Microsoft Azure Portal. Terraform will use this account and its credentials to deploy resources.

DDN EXAScaler Cloud images in the Azure Marketplace have additional license and purchase terms that you must accept before you can deploy them programmatically. To deploy an environment from such an image, you will need to accept the image's terms the first time you use it, once per subscription.

Steps to authenticate via Microsoft account

The az login command obtains access credentials for your user account via a web-based authorization flow. When it completes successfully, it sets the active account in the current configuration to the account specified. Learn more about Azure authentication.

$ az login
[
  {
    "cloudName": "AzureCloud",
    "homeTenantId": "00000000-0000-0000-0000-000000000000",
    "id": "00000000-0000-0000-0000-000000000000",
    "isDefault": true,
    "managedByTenants": [],
    "name": "Subscription-Name",
    "state": "Enabled",
    "tenantId": "00000000-0000-0000-0000-000000000000",
    "user": {
      "name": "user@domain.com",
      "type": "user"
    }
  }
]

To view the current Azure subscription ID, use az account show:

$ az account show
{
  "environmentName": "AzureCloud",
  "homeTenantId": "00000000-0000-0000-0000-000000000000",
  "id": "00000000-0000-0000-0000-000000000000",
  "isDefault": true,
  "managedByTenants": [],
  "name": "Subscription-Name",
  "state": "Enabled",
  "tenantId": "00000000-0000-0000-0000-000000000000",
  "user": {
    "name": "user@domain.com",
    "type": "user"
  }
}

Use the value of the id property as the Azure subscription ID for Terraform-based deployments.
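As a convenience, the subscription ID can be extracted directly into a shell variable with the standard --query and --output options of the Azure CLI:

```shell
# Capture the active subscription ID for later use with Terraform.
# Requires an authenticated az session (see az login above).
subscription_id=$(az account show --query id --output tsv)
echo "Azure subscription ID: ${subscription_id}"
```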

Steps to accept the terms of use for DDN EXAScaler Cloud images

To deploy DDN EXAScaler Cloud, you need to accept the Azure Marketplace image terms so that the image can be used to create a deployment.

  • For EXAScaler Cloud 5.2 and CentOS Linux based image:

    $ az vm image terms accept --urn ddn-whamcloud-5345716:exascaler_cloud:exascaler_cloud_5_2_centos:latest
  • For EXAScaler Cloud 5.2 and Red Hat Enterprise Linux based image:

    $ az vm image terms accept --urn ddn-whamcloud-5345716:exascaler_cloud:exascaler_cloud_5_2_redhat:latest
  • For EXAScaler Cloud 6.0 and CentOS Linux based image:

    $ az vm image terms accept --urn ddn-whamcloud-5345716:exascaler_cloud:exascaler_cloud_6_0_centos:latest
  • For EXAScaler Cloud 6.0 and Red Hat Enterprise Linux based image:

    $ az vm image terms accept --urn ddn-whamcloud-5345716:exascaler_cloud:exascaler_cloud_6_0_redhat:latest
  • For EXAScaler Cloud 6.1 and CentOS Linux based image:

    $ az vm image terms accept --urn ddn-whamcloud-5345716:exascaler_cloud:exascaler_cloud_6_1_centos:latest
  • For EXAScaler Cloud 6.1 and Red Hat Enterprise Linux based image:

    $ az vm image terms accept --urn ddn-whamcloud-5345716:exascaler_cloud:exascaler_cloud_6_1_redhat:latest

Learn more about the image terms.
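Acceptance can optionally be verified afterwards; the az vm image terms show subcommand prints the current terms status for an image, and the output should contain "accepted": true. A sketch, using one of the SKUs listed above:

```shell
# Verify that the Marketplace terms were accepted for a given image.
# The "accepted" field in the JSON output should be true.
az vm image terms show --urn ddn-whamcloud-5345716:exascaler_cloud:exascaler_cloud_6_1_redhat:latest
```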

Steps to configure Terraform

Download Terraform scripts and extract the tarball:

$ curl -sL https://github.com/DDNStorage/exascaler-cloud-terraform/archive/refs/tags/scripts/2.2.2.tar.gz | tar xz

Change the Terraform variables according to your requirements:

$ cd exascaler-cloud-terraform-scripts-2.2.2/az
$ vi terraform.tfvars

List of available variables

Common options

Variable Type Default Description
prefix string null EXAScaler Cloud custom deployment prefix. Set this option to add a custom prefix to all created objects.
tags map {} EXAScaler Cloud custom deployment tags. Set this option to add custom tags to all created objects. Learn more about Azure tags.
fsname string exacloud EXAScaler filesystem name. Learn more about the Lustre filesystem.
subscription string 00000000-0000-0000-0000-000000000000 Subscription ID - use the ID of your active Azure subscription. Learn more about Azure subscriptions.
location string West US Azure region to manage resources. Learn more about Azure geography.
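As a sketch, the common options above could be written to terraform.tfvars like this. The values are illustrative placeholders only, and the heredoc overwrites any existing terraform.tfvars:

```shell
# Write a minimal terraform.tfvars covering the common options.
# Replace the subscription ID and region with your own values.
cat > terraform.tfvars <<'EOF'
prefix       = null
tags         = {}
fsname       = "exacloud"
subscription = "00000000-0000-0000-0000-000000000000"
location     = "West US"
EOF
```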

Availability options

Variable Type Default Description
availability.type string none Availability type: none - no infrastructure redundancy required, set - to create an availability set and automatically distribute resources across multiple fault domains, zone - to physically separate resources within an Azure region. Learn more about Azure availability options.
availability.zone integer 1 Availability zone - unique physical locations within an Azure region. Use 1, 2 or 3 to explicitly specify the availability zone. Learn more about Azure availability zones.

Resource group options

Variable Type Default Description
resource_group.new bool true Create a new resource group, or use an existing one: true or false.
resource_group.name string existing-resource-group Existing resource group name, used if new is false.

Learn more about Azure resource groups.

Proximity placement group options

Variable Type Default Description
proximity_placement_group.new bool true Create a new proximity placement group, or use an existing one: true or false.
proximity_placement_group.name string existing-proximity-placement-group Existing proximity placement group name, used if new is false.

Learn more about Azure proximity placement groups.

Network options

Variable Type Default Description
network.new bool true Create a new network, or use an existing one: true or false.
network.name string existing-network Existing network name, used only if the new option is false.
network.address string 10.0.0.0/8 IP address in CIDR notation for the new virtual network.

Learn more about Azure virtual networks.

Subnet options

Variable Type Default Description
subnet.new bool true Create a new subnet, or use an existing one: true or false.
subnet.name string existing-subnet Existing subnet name, used only if the new option is false.
subnet.address string 10.0.0.0/24 IP address in CIDR notation for the new subnet.

Learn more about Azure virtual networks and subnets.

Security options

Variable Type Default Description
security.username string stack User name for remote SSH access. Learn more about Azure SSH options.
security.ssh_public_key string ~/.ssh/id_rsa.pub Path to the local SSH public key. This file will be added to admin home directory as .ssh/authorized_keys. Learn more about Azure SSH options.
security.enable_ssh bool true true or false: enable or disable remote SSH access.
security.enable_http bool true true or false, enable or disable remote HTTP access.
security.ssh_source_ranges list(string) [0.0.0.0/0] Source IP ranges for remote SSH access in CIDR notation. Learn more about Azure network security groups.
security.http_source_ranges list(string) [0.0.0.0/0] Source IP ranges for remote HTTP access in CIDR notation. Learn more about Azure network security groups.
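If no SSH key pair exists yet at the default security.ssh_public_key path, one can be generated with ssh-keygen. A sketch; adjust the path and key type to your own policy:

```shell
# Generate an RSA key pair at the default path referenced by
# security.ssh_public_key, unless one already exists.
test -f ~/.ssh/id_rsa.pub || ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa
```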

Boot disk options

Variable Type Default Description
boot.disk_type string StandardSSD_LRS Specifies the type of managed disk to create:
  • Standard_LRS
  • Premium_LRS
  • StandardSSD_LRS
boot.disk_cache string ReadWrite Specifies the caching requirements for the target disk:
  • None
  • ReadOnly
  • ReadWrite
boot.auto_delete bool true Delete the boot disk automatically when deleting the virtual machine: true or false.
boot.disk_size integer 64 Boot disk size in GB.

Learn more about Azure disk types.

Source image options

Variable Type Default Description
image.publisher string ddn-whamcloud-5345716 Specifies the publisher of the image used to create the virtual machine.
image.offer string exascaler_cloud Specifies the offer of the image used to create the virtual machine.
image.sku string exascaler_cloud_6_3_rocky

Specifies the SKU of the image used to create the virtual machine.

EXAScaler Cloud 5.2 images:

  • exascaler_cloud_5_2_centos
  • exascaler_cloud_5_2_redhat

EXAScaler Cloud 6.1 images:

  • exascaler_cloud_6_1_centos
  • exascaler_cloud_6_1_redhat

EXAScaler Cloud 6.2 images:

  • exascaler_cloud_6_2_rocky
  • exascaler_cloud_6_2_redhat

EXAScaler Cloud 6.3 images:

  • exascaler_cloud_6_3_rocky
  • exascaler_cloud_6_3_redhat
image.version string latest Specifies the version of the image used to create the virtual machine.
image.accept bool false Allows automatically accepting the legal terms for a Marketplace image.

Learn more about Azure disk images.

Storage account options

Variable Type Default Description
storage_account.kind string StorageV2 Defines the kind of account. Valid options are:
  • BlobStorage
  • BlockBlobStorage
  • FileStorage
  • Storage
  • StorageV2
storage_account.tier string Standard Defines the tier to use for this storage account. Valid options are:
  • Standard
  • Premium
storage_account.replication string LRS Defines the type of replication to use for this storage account. Valid options are:
  • LRS
  • GRS
  • RAGRS
  • ZRS
  • GZRS
  • RAGZRS

Learn more about Azure storage accounts.

Management server options

Variable Type Default Description
mgs.node_type string Standard_F4s Type of management server. Learn more about Azure performance considerations.
mgs.node_count integer 1 Number of management servers: 1.
mgs.public_ip bool true Assign a public IP address: true or false. Learn more about Azure public IP addresses.
mgs.accelerated_network bool false Enable accelerated networking. Learn more about Azure accelerated networking.

Management target options

Variable Type Default Description
mgt.disk_type string StandardSSD_LRS Specifies the type of managed disk to create the management target:
  • Standard_LRS
  • StandardSSD_LRS
  • Premium_LRS
mgt.disk_cache string None Specifies the caching requirements for the management target:
  • None
  • ReadOnly
  • ReadWrite
mgt.disk_size integer 256 Specifies the size of the management target in GB.
mgt.disk_count integer 1 Specifies the number of management targets: 1-128.
mgt.disk_raid bool false Create striped management target: true or false.

Learn more about Azure disk types.

Monitoring target options

Variable Type Default Description
mnt.disk_type string StandardSSD_LRS Specifies the type of managed disk to create the monitoring target:
  • Standard_LRS
  • StandardSSD_LRS
  • Premium_LRS
mnt.disk_cache string None Specifies the caching requirements for the monitoring target:
  • None
  • ReadOnly
  • ReadWrite
mnt.disk_size integer 128 Specifies the size of the monitoring target in GB.
mnt.disk_count integer 1 Specifies the number of monitoring targets: 1-128.
mnt.disk_raid bool false Create striped monitoring target: true or false.

Learn more about Azure disk types.

Metadata server options

Variable Type Default Description
mds.node_type string Standard_E8s_v3 Type of metadata server. Learn more about Azure performance considerations.
mds.node_count integer 1 Number of metadata servers: 1-32.
mds.public_ip bool false Assign a public IP address: true or false. Learn more about Azure public IP addresses.
mds.accelerated_network bool false Enable accelerated networking. Learn more about Azure accelerated networking.

Metadata target options

Variable Type Default Description
mdt.disk_type string Premium_LRS Specifies the type of managed disk to create the metadata target:
  • Standard_LRS
  • StandardSSD_LRS
  • Premium_LRS
mdt.disk_cache string None Specifies the caching requirements for the metadata target:
  • None
  • ReadOnly
  • ReadWrite
mdt.disk_size integer 512 Specifies the size of the metadata target in GB.
mdt.disk_count integer 1 Specifies the number of metadata targets: 1-128.
mdt.disk_raid bool false Create striped metadata target: true or false.

Learn more about Azure disk types.

Object Storage server options

Variable Type Default Description
oss.node_type string Standard_D16s_v3 Type of object storage server. Learn more about Azure performance considerations.
oss.node_count integer 4 Number of object storage servers: 1-2000.
oss.public_ip bool false Assign a public IP address: true or false. Learn more about Azure public IP addresses.
oss.accelerated_network bool false Enable accelerated networking. Learn more about Azure accelerated networking.

Object Storage target options

Variable Type Default Description
ost.disk_type string Standard_LRS Specifies the type of managed disk to create the object storage target:
  • Standard_LRS
  • StandardSSD_LRS
  • Premium_LRS
ost.disk_cache string None Specifies the caching requirements for the object storage target:
  • None
  • ReadOnly
  • ReadWrite
ost.disk_size integer 512 Specifies the size of the object storage target in GB.
ost.disk_count integer 6 Specifies the number of object storage targets: 1-128.
ost.disk_raid bool false Create striped object storage target: true or false.

Learn more about Azure disk types.

Compute client options

Variable Type Default Description
cls.node_type string Standard_D16s_v3 Type of compute client. Learn more about Azure performance considerations.
cls.node_count integer 4 Number of compute clients.
cls.public_ip bool false Assign a public IP address: true or false. Learn more about Azure public IP addresses.
cls.accelerated_network bool false Enable accelerated networking. Learn more about Azure accelerated networking.

Compute client target options

Variable Type Default Description
clt.disk_type string Standard_LRS Specifies the type of managed disk to create the compute target:
  • Standard_LRS
  • StandardSSD_LRS
  • Premium_LRS
clt.disk_cache string None Specifies the caching requirements for the compute target:
  • None
  • ReadOnly
  • ReadWrite
clt.disk_size integer 32 Specifies the size of the compute target in GB.
clt.disk_count integer 0 Specifies the number of compute targets.

Learn more about Azure disk types.

Deploy an EXAScaler Cloud environment

Initialize a working directory containing Terraform configuration files. This is the first command that should be run after writing a new Terraform configuration or cloning an existing one from version control. It is safe to run this command multiple times:

$ terraform init

Initializing the backend...

Initializing provider plugins...
- Finding hashicorp/azurerm versions matching ">= 3.10.0"...
- Finding latest version of hashicorp/random...
- Finding latest version of hashicorp/template...
- Installing hashicorp/azurerm v3.13.0...
- Installed hashicorp/azurerm v3.13.0 (signed by HashiCorp)
- Installing hashicorp/random v3.3.2...
- Installed hashicorp/random v3.3.2 (signed by HashiCorp)
- Installing hashicorp/template v2.2.0...
- Installed hashicorp/template v2.2.0 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see any changes that are required for your infrastructure. All Terraform commands should now work. If you ever set or change modules or backend configuration for Terraform, rerun this command to reinitialize your working directory. If you forget, other commands will detect it and remind you to do so if necessary.

Validate configuration options:

$ terraform validate

Success! The configuration is valid.

Create an execution plan with a preview of the changes that Terraform will make to the environment:

$ terraform plan


Apply the changes required to reach the desired state of the configuration:

$ terraform apply
...
  Enter a value: yes
...
Apply complete! Resources: 103 added, 0 changed, 0 destroyed.

Outputs:

azure_dashboard = "https://portal.azure.com/#@00000000-0000-0000-0000-000000000000/dashboard/arm/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/exascaler-cloud-a108-resource-group/providers/Microsoft.Portal/dashboards/exascaler-cloud-a108-dashboard"

client_config = <<EOT
#!/bin/sh
# install new EXAScaler Cloud clients:
# all instances must be in the same location westus
# and connected to the network exascaler-cloud-a108-virtual-network
# and subnet exascaler-cloud-a108-subnet
# to set up EXAScaler Cloud filesystem on a new client instance,
# run the following commands on the client with root privileges:

cat >/etc/esc-client.conf<<EOF
{
  "Version": "2.0.0",
  "MountConfig": {
    "ClientDevice": "10.0.0.10@tcp:/exacloud",
    "Mountpoint": "/mnt/exacloud",
    "PackageSource": "http://10.0.0.10/client-packages"
  }
}
EOF

curl -fsSL http://10.0.0.10/client-setup-tool -o /usr/sbin/esc-client
chmod +x /usr/sbin/esc-client
esc-client auto setup --config /etc/esc-client.conf

EOT

http_console = "http://exascaler-cloud-a108-mgs0.westus.cloudapp.azure.com"

mount_command = "mount -t lustre 10.0.0.10@tcp:/exacloud /mnt/exacloud"

private_addresses = {
  "exascaler-cloud-a108-cls0" = "10.0.0.8"
  "exascaler-cloud-a108-cls1" = "10.0.0.7"
  "exascaler-cloud-a108-cls2" = "10.0.0.11"
  "exascaler-cloud-a108-cls3" = "10.0.0.12"
  "exascaler-cloud-a108-mds0" = "10.0.0.13"
  "exascaler-cloud-a108-mgs0" = "10.0.0.10"
  "exascaler-cloud-a108-oss0" = "10.0.0.9"
  "exascaler-cloud-a108-oss1" = "10.0.0.4"
  "exascaler-cloud-a108-oss2" = "10.0.0.5"
  "exascaler-cloud-a108-oss3" = "10.0.0.6"
}

ssh_console = {
  "exascaler-cloud-a108-mgs0" = "ssh -A stack@exascaler-cloud-a108-mgs0.westus.cloudapp.azure.com"
}

Access the EXAScaler Cloud environment

Now you can access the EXAScaler Cloud environment:

$ eval $(ssh-agent)
Agent pid 12313

$ ssh-add
Identity added: /Users/name/.ssh/id_rsa (/Users/name/.ssh/id_rsa)

$ ssh -A stack@exascaler-cloud-a108-mgs0.westus.cloudapp.azure.com

[stack@exascaler-cloud-a108-mgs0 ~]$ df -h -t lustre
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdc        249G  2.4M  247G   1% /mnt/targets/MGS

[stack@exascaler-cloud-a108-mgs0 ~]$ loci hosts
10.0.0.8	exascaler-cloud-a108-cls0
10.0.0.7	exascaler-cloud-a108-cls1
10.0.0.11	exascaler-cloud-a108-cls2
10.0.0.12	exascaler-cloud-a108-cls3
10.0.0.13	exascaler-cloud-a108-mds0
10.0.0.10	exascaler-cloud-a108-mgs0
10.0.0.9	exascaler-cloud-a108-oss0
10.0.0.4	exascaler-cloud-a108-oss1
10.0.0.5	exascaler-cloud-a108-oss2
10.0.0.6	exascaler-cloud-a108-oss3

[stack@exascaler-cloud-a108-mgs0 ~]$ ssh exascaler-cloud-a108-cls0

[stack@exascaler-cloud-a108-cls0 ~]$ lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
exacloud-MDT0000_UUID   315302464        6212   309927544   1% /mnt/exacloud[MDT:0]
exacloud-OST0000_UUID   529449792        1256   524063444   1% /mnt/exacloud[OST:0]
exacloud-OST0001_UUID   529449792        1272   524063428   1% /mnt/exacloud[OST:1]
exacloud-OST0002_UUID   529449792        1268   524063432   1% /mnt/exacloud[OST:2]
exacloud-OST0003_UUID   529449792        1264   524063436   1% /mnt/exacloud[OST:3]
exacloud-OST0004_UUID   529449792        1256   524063444   1% /mnt/exacloud[OST:4]
exacloud-OST0005_UUID   529449792        1264   524063436   1% /mnt/exacloud[OST:5]
exacloud-OST0006_UUID   529449792        1276   524063424   1% /mnt/exacloud[OST:6]
exacloud-OST0007_UUID   529449792        1268   524063432   1% /mnt/exacloud[OST:7]
exacloud-OST0008_UUID   529449792        1260   524063440   1% /mnt/exacloud[OST:8]
exacloud-OST0009_UUID   529449792        1272   524063428   1% /mnt/exacloud[OST:9]
exacloud-OST000a_UUID   529449792        1268   524063432   1% /mnt/exacloud[OST:10]
exacloud-OST000b_UUID   529449792        1272   524063428   1% /mnt/exacloud[OST:11]
exacloud-OST000c_UUID   529449792        1256   524063444   1% /mnt/exacloud[OST:12]
exacloud-OST000d_UUID   529449792        1264   524063436   1% /mnt/exacloud[OST:13]
exacloud-OST000e_UUID   529449792        1268   524063432   1% /mnt/exacloud[OST:14]
exacloud-OST000f_UUID   529449792        1260   524063440   1% /mnt/exacloud[OST:15]
exacloud-OST0010_UUID   529449792        1272   524063428   1% /mnt/exacloud[OST:16]
exacloud-OST0011_UUID   529449792        1260   524063440   1% /mnt/exacloud[OST:17]
exacloud-OST0012_UUID   529449792        1256   524063444   1% /mnt/exacloud[OST:18]
exacloud-OST0013_UUID   529449792        1264   524063436   1% /mnt/exacloud[OST:19]
exacloud-OST0014_UUID   529449792        1256   524063444   1% /mnt/exacloud[OST:20]
exacloud-OST0015_UUID   529449792        1260   524063440   1% /mnt/exacloud[OST:21]
exacloud-OST0016_UUID   529449792        1272   524063428   1% /mnt/exacloud[OST:22]
exacloud-OST0017_UUID   529449792        1264   524063436   1% /mnt/exacloud[OST:23]

filesystem_summary:  12706795008       30348 12577522452   1% /mnt/exacloud

Add storage capacity in an existing EXAScaler Cloud environment

The storage capacity can be added by increasing the number of storage servers. To add storage capacity in an existing EXAScaler Cloud environment, modify the terraform.tfvars file and increase the number of storage servers (the value of the oss.node_count variable) as required:

$ diff -u terraform.tfvars.orig terraform.tfvars
@@ -217,7 +217,7 @@
 oss = {
   node_type           = "Standard_D16s_v3"
-  node_count          = 6
+  node_count          = 12
   public_ip           = false
   accelerated_network = true
 }

Then run the terraform apply command to increase the storage capacity. The available storage capacity (in GB) can be calculated by multiplying three configuration parameters:

capacity = oss.node_count * ost.disk_count * ost.disk_size
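
For example, with the increased node count shown above and hypothetical per-server values of 2 OST disks of 512 GB each, the capacity works out as:

```shell
# Hypothetical example: 12 storage servers, 2 OST disks per server, 512 GB per disk.
oss_node_count=12
ost_disk_count=2
ost_disk_size=512
echo "capacity = $((oss_node_count * ost_disk_count * ost_disk_size)) GB"
```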

Upgrade an existing EXAScaler Cloud environment

A software upgrade for an existing EXAScaler Cloud environment is performed by recreating the running VM instances using a new version of the OS image. This requires a few manual steps.

Create a backup copy of the existing Terraform directory (*.tf, terraform.tfvars and terraform.tfstate files):

$ cd /path/to/exascaler-cloud-terraform-scripts-x.y.z/az
$ tar pcfz backup.tgz *.tf terraform.tfvars terraform.tfstate

Update Terraform scripts using the latest available EXAScaler Cloud Terraform scripts:

$ cd /path/to
$ curl -sL https://github.com/DDNStorage/exascaler-cloud-terraform/archive/refs/tags/scripts/2.2.2.tar.gz | tar xz
$ cd exascaler-cloud-terraform-scripts-2.2.2/az

Copy the terraform.tfstate file from the existing Terraform directory:

$ cp -iv /path/to/exascaler-cloud-terraform-scripts-x.y.z/az/terraform.tfstate .

Review and update the terraform.tfvars file using configuration options for the existing environment:

$ diff -u  /path/to/exascaler-cloud-terraform-scripts-x.y.z/az/terraform.tfvars terraform.tfvars
$ vi terraform.tfvars

Review the execution plan to make sure all changes are expected:

$ terraform plan

Unmount the existing EXAScaler Cloud filesystem using the provided exascaler-cloud-ctl script. This step is required to ensure data consistency during the upgrade:

$ scripts/exascaler-cloud-ctl

Usage:

List resource groups : ./scripts/exascaler-cloud-ctl list
List deployments     : ./scripts/exascaler-cloud-ctl <resource_group> list
List instances       : ./scripts/exascaler-cloud-ctl <resource_group> <deployment> list
Stop instances       : ./scripts/exascaler-cloud-ctl <resource_group> <deployment> stop
Start instances      : ./scripts/exascaler-cloud-ctl <resource_group> <deployment> start
Unmount filesystem   : ./scripts/exascaler-cloud-ctl <resource_group> <deployment> umount

$ scripts/exascaler-cloud-ctl list
Name                                 Location    Status
-----------------------------------  ----------  ---------
exascaler-cloud-f7cd-resource-group  eastus      Succeeded
NetworkWatcherRG                     westus      Succeeded

$ scripts/exascaler-cloud-ctl exascaler-cloud-f7cd-resource-group list
Name                            Created                    Status
------------------------------  -------------------------  ---------
exascaler-cloud-f7cd            2021-08-21T01:19:36+00:00  Succeeded

$ scripts/exascaler-cloud-ctl exascaler-cloud-f7cd-resource-group exascaler-cloud-f7cd umount
Umount compute client exascaler-cloud-f7cd-cls0
Umount compute client exascaler-cloud-f7cd-cls1
Umount storage server exascaler-cloud-f7cd-oss0
Umount storage server exascaler-cloud-f7cd-oss1
Umount storage server exascaler-cloud-f7cd-oss2
Umount storage server exascaler-cloud-f7cd-oss3
Umount metadata server exascaler-cloud-f7cd-mds0
Umount management server exascaler-cloud-f7cd-mgs0

Apply the changes required to upgrade the existing EXAScaler Cloud environment by recreating all instances using the latest version of EXAScaler Cloud:

$ terraform apply
...
  Enter a value: yes
...

Apply complete! Resources: 18 added, 8 changed, 16 destroyed.

Outputs:

azure_dashboard = "https://portal.azure.com/#@753b6e26-6fd3-43e6-8248-3f1735d59bb4/dashboard/arm/subscriptions/9978cd1b-936a-4296-8061-67c9d963dd40/resourceGroups/exascaler-cloud-f7cd-resource-group/providers/Microsoft.Portal/dashboards/exascaler-cloud-f7cd-dashboard"

http_console = "http://exascaler-cloud-f7cd-mgs0.eastus.cloudapp.azure.com"

mount_command = "mount -t lustre 10.0.0.11@tcp:/exacloud /mnt/exacloud"

private_addresses = {
  "exascaler-cloud-f7cd-cls0" = "10.0.0.6"
  "exascaler-cloud-f7cd-cls1" = "10.0.0.7"
  "exascaler-cloud-f7cd-mds0" = "10.0.0.8"
  "exascaler-cloud-f7cd-mgs0" = "10.0.0.11"
  "exascaler-cloud-f7cd-oss0" = "10.0.0.9"
  "exascaler-cloud-f7cd-oss1" = "10.0.0.4"
  "exascaler-cloud-f7cd-oss2" = "10.0.0.10"
  "exascaler-cloud-f7cd-oss3" = "10.0.0.5"
}

ssh_console = {
  "exascaler-cloud-f7cd-mgs0" = "ssh -A stack@exascaler-cloud-f7cd-mgs0.eastus.cloudapp.azure.com"
}

Steps to destroy the EXAScaler Cloud environment

Destroy the EXAScaler Cloud environment:

$ terraform destroy
...
  Enter a value: yes
...
Destroy complete! Resources: 103 destroyed.

How to access a deployment

To remotely access the EXAScaler Cloud deployment (management, metadata, storage servers and compute clients), we must go through the management server public IP address, using the ssh command.

Then, using the management server as a jump host, we can open SSH sessions to all other servers.

localhost$ eval $(ssh-agent)
Agent pid 5095

localhost$ ssh-add 
Identity added: /Users/deiter/.ssh/id_rsa (/Users/deiter/.ssh/id_rsa)

localhost$ ssh -A stack@20.62.171.73
Last login: Tue Aug 24 19:11:18 2021 from 181.241.16.227

[stack@exascaler-cloud-65f1-mgs0 ~]$ loci hosts
10.0.0.7	exascaler-cloud-65f1-cls0
10.0.0.5	exascaler-cloud-65f1-cls1
10.0.0.8	exascaler-cloud-65f1-cls2
10.0.0.13	exascaler-cloud-65f1-cls3
10.0.0.12	exascaler-cloud-65f1-mds0
10.0.0.11	exascaler-cloud-65f1-mgs0
10.0.0.10	exascaler-cloud-65f1-oss0
10.0.0.4	exascaler-cloud-65f1-oss1
10.0.0.6	exascaler-cloud-65f1-oss2
10.0.0.9	exascaler-cloud-65f1-oss3

[stack@exascaler-cloud-65f1-mgs0 ~]$ ssh exascaler-cloud-65f1-cls0

[stack@exascaler-cloud-65f1-cls0 ~]$ lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
exacloud-MDT0000_UUID   315302464        5744   309928012   1% /mnt/exacloud[MDT:0]
exacloud-OST0000_UUID  3181466888   570662088  2578541648  19% /mnt/exacloud[OST:0]
exacloud-OST0001_UUID  3181466888   590910696  2558305972  19% /mnt/exacloud[OST:1]
exacloud-OST0002_UUID  3181466888   580374740  2568825852  19% /mnt/exacloud[OST:2]
exacloud-OST0003_UUID  3181466888   570645704  2578552816  19% /mnt/exacloud[OST:3]

filesystem_summary:  12725867552  2312593228 10284226288  19% /mnt/exacloud

How to install new EXAScaler Cloud clients

New EXAScaler Cloud client instances must be in the same location and connected to the same virtual network and subnet. The process of installing and configuring new clients can be performed automatically, and all required information is contained in the Terraform output. To configure the EXAScaler Cloud filesystem on a new client instance, create a configuration file /etc/exascaler-cloud-client.conf using the actual IP address of the management server:

{
    "MountConfig": {
        "ClientDevice": "10.0.0.10@tcp:/exacloud",
        "Mountpoint": "/mnt/exacloud",
        "PackageSource": "http://10.0.0.10/client-packages"
    }
}
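
Since the configuration file is plain JSON, a quick syntax check before running the client setup can catch typos early. The following is a sketch, assuming python3 is available on the client and writing to /tmp for illustration:

```shell
# Write the client configuration and verify that it parses as valid JSON.
cat > /tmp/exascaler-cloud-client.conf <<'EOF'
{
    "MountConfig": {
        "ClientDevice": "10.0.0.10@tcp:/exacloud",
        "Mountpoint": "/mnt/exacloud",
        "PackageSource": "http://10.0.0.10/client-packages"
    }
}
EOF
python3 -m json.tool /tmp/exascaler-cloud-client.conf > /dev/null && echo "config OK"
```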

To install and setup EXAScaler Cloud filesystem on a new client run the following commands on the client with root privileges:

# curl -fsSL http://10.0.0.10/exascaler-cloud-client-$(arch) -o /usr/sbin/exascaler-cloud-client
# chmod +x /usr/sbin/exascaler-cloud-client
# exascaler-cloud-client auto setup --config /etc/exascaler-cloud-client.conf

Ubuntu Linux

# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04 LTS
Release:	22.04
Codename:	jammy

# exascaler-cloud-client auto setup --config /etc/exascaler-cloud-client.conf
Discovering platform ... Done.
Configuring firewall rules for Lustre ... Done.
Configuring Lustre client package source ... Done.
Installing Lustre client packages and building DKMS modules ... Done.
Mounting 10.0.0.10@tcp0:/exacloud at /mnt/exacloud ... Done.

# mount -t lustre
10.0.0.10@tcp:/exacloud on /mnt/exacloud type lustre (rw,flock,user_xattr,lazystatfs,encrypt)

RHEL, CentOS, Alma and Rocky Linux

# cat /etc/redhat-release
AlmaLinux release 8.6 (Sky Tiger)

# exascaler-cloud-client auto setup --config /etc/exascaler-cloud-client.conf
Discovering platform ... Done.
Configuring firewall rules for Lustre ... Done.
Configuring Lustre client package source ... Done.
Installing Lustre client packages ... Done.
Mounting 10.0.0.10@tcp0:/exacloud at /mnt/exacloud ... Done.

# mount -t lustre
10.0.0.10@tcp:/exacloud on /mnt/exacloud type lustre (rw,seclabel,flock,user_xattr,lazystatfs,encrypt)

How to use client-side encryption

Client-side encryption provides a special directory for each user to safely store sensitive files. The goals are to protect data in transit between clients and servers, and to protect data at rest.

This feature is implemented directly at the Lustre client level. Lustre client-side encryption relies on the kernel's fscrypt, a library that filesystems can hook into to support transparent encryption of files and directories. As a consequence, the key points described below are extracted from the fscrypt documentation.

The client-side encryption feature is available natively on Lustre clients running Linux distributions with fscrypt support, including RHEL/CentOS 8.1 and later and Ubuntu 18.04 and later.

Client-side encryption supports both data encryption and encryption of file and directory names. Name encryption is governed by the enable_filename_encryption parameter, which is set to 0 by default. When this parameter is 0, new empty directories configured as encrypted use content encryption only, not name encryption; this mode is inherited by all subdirectories and files. When enable_filename_encryption is set to 1, new empty directories configured as encrypted use full encryption capabilities, encrypting file content as well as file and directory names; this mode is likewise inherited by all subdirectories and files. To set the enable_filename_encryption parameter globally for all clients, run the following on the management server:

# lctl set_param -P llite.*.enable_filename_encryption=1

The fscrypt package is included in the EXAScaler Cloud client toolkit and can be installed using the exascaler-cloud-client tool.

Steps to install Lustre client and fscrypt packages:

# cat > /etc/exascaler-cloud-client.conf <<EOF
{
    "MountConfig": {
        "ClientDevice": "10.0.0.10@tcp:/exacloud",
        "Mountpoint": "/mnt/exacloud",
        "PackageSource": "http://10.0.0.10/client-packages"
    }
}
EOF

# curl -fsSL http://10.0.0.10/exascaler-cloud-client-$(arch) -o /usr/sbin/exascaler-cloud-client
# chmod +x /usr/sbin/exascaler-cloud-client
# exascaler-cloud-client auto setup --config /etc/exascaler-cloud-client.conf

Output:

# exascaler-cloud-client auto setup --config /etc/exascaler-cloud-client.conf
Discovering platform ... Done.
Configuring firewall rules for Lustre ... Done.
Configuring Lustre client package source ... Done.
Installing Lustre client packages ... Done.
Mounting 10.0.0.10@tcp0:/exacloud at /mnt/exacloud ... Done.

# rpm -q fscrypt lustre-client kmod-lustre-client
fscrypt-0.3.3-1.wc2.x86_64
lustre-client-2.14.0_ddn52-1.el8.x86_64
kmod-lustre-client-2.14.0_ddn52-1.el8.x86_64

Steps to configure client-side encryption:

$ sudo fscrypt setup
Defaulting to policy_version 2 because kernel supports it.
Customizing passphrase hashing difficulty for this system...
Created global config file at "/etc/fscrypt.conf".
Allow users other than root to create fscrypt metadata on the root filesystem? (See
https://github.com/google/fscrypt#setting-up-fscrypt-on-a-filesystem) [y/N]
Metadata directories created at "/.fscrypt", writable by root only.

$ sudo fscrypt setup /mnt/exacloud
Allow users other than root to create fscrypt metadata on this filesystem? (See
https://github.com/google/fscrypt#setting-up-fscrypt-on-a-filesystem) [y/N] y
Metadata directories created at "/mnt/exacloud/.fscrypt", writable by everyone.

Steps to encrypt directory:

$ sudo install -v -d -m 0755 -o stack -g stack /mnt/exacloud/stack
install: creating directory '/mnt/exacloud/stack'

$ fscrypt encrypt /mnt/exacloud/stack
The following protector sources are available:
1 - Your login passphrase (pam_passphrase)
2 - A custom passphrase (custom_passphrase)
3 - A raw 256-bit key (raw_key)
Enter the source number for the new protector [2 - custom_passphrase]:
Enter a name for the new protector: test
Enter custom passphrase for protector "test":
Confirm passphrase:
"/mnt/exacloud/stack" is now encrypted, unlocked, and ready for use.

$ cp -v /etc/passwd /mnt/exacloud/stack/
'/etc/passwd' -> '/mnt/exacloud/stack/passwd'

$ ls -l /mnt/exacloud/stack/
total 1
-rw-r--r--. 1 stack stack 1610 Jul 13 20:34 passwd

$ md5sum /mnt/exacloud/stack/passwd
867541523c51f8cfd4af91988e9f8794  /mnt/exacloud/stack/passwd

Lock the directory:

$ fscrypt lock /mnt/exacloud/stack
"/mnt/exacloud/stack" is now locked.

$ ls -l /mnt/exacloud/stack
total 4
-rw-r--r--. 1 stack stack 4096 Jul 13 20:34 ydpdwRP7MiXzsTkYhg0mW3DNacDlsUJdJa2e9l6AQKL

$ md5sum /mnt/exacloud/stack/ydpdwRP7MiXzsTkYhg0mW3DNacDlsUJdJa2e9l6AQKL
md5sum: /mnt/exacloud/stack/ydpdwRP7MiXzsTkYhg0mW3DNacDlsUJdJa2e9l6AQKL: Required key not available

Unlock the directory:

$ fscrypt unlock /mnt/exacloud/stack
Enter custom passphrase for protector "test":
"/mnt/exacloud/stack" is now unlocked and ready for use.

$ ls -l /mnt/exacloud/stack
total 4
-rw-r--r--. 1 stack stack 1610 Jul 13 20:34 passwd

$ md5sum /mnt/exacloud/stack/passwd
867541523c51f8cfd4af91988e9f8794  /mnt/exacloud/stack/passwd

Learn more about client-side encryption.

How to run benchmarks

Steps to run IOR benchmark on the EXAScaler Cloud deployment:

  • Run ssh-agent
  • Add ssh private key
  • Open an SSH session to the EXAScaler Cloud management server
  • Run the IOR benchmark using the exascaler-cloud-ior tool

localhost$ eval $(ssh-agent)
Agent pid 5095

localhost$ ssh-add 
Identity added: /Users/deiter/.ssh/id_rsa (/Users/deiter/.ssh/id_rsa)

localhost$ ssh -A stack@20.62.171.73

[stack@exascaler-cloud-65f1-mgs0 ~]$ exascaler-cloud-ior

IOR-3.3.0: MPI Coordinated Test of Parallel I/O

Began               : Wed Aug 25 14:43:01 2021
Command line        : /usr/bin/ior -C -F -e -r -w -a POSIX -b 16777216 -t 1048576 -s 251 -o /mnt/exacloud/0b21199cf9682b9d/0b21199cf9682b9d -s 512
Machine             : Linux exascaler-cloud-65f1-cls0
TestID              : 0
StartTime           : Wed Aug 25 14:43:01 2021
Path                : /mnt/exacloud/0b21199cf9682b9d
FS                  : 11.9 TiB   Used FS: 0.0%   Inodes: 96.0 Mi   Used Inodes: 0.0%

Options: 
api                 : POSIX
apiVersion          : 
test filename       : /mnt/exacloud/0b21199cf9682b9d/0b21199cf9682b9d
access              : file-per-process
type                : independent
segments            : 512
ordering in a file  : sequential
ordering inter file : constant task offset
task offset         : 1
nodes               : 4
tasks               : 64
clients per node    : 16
repetitions         : 1
xfersize            : 1 MiB
blocksize           : 16 MiB
aggregate filesize  : 512 GiB

Results: 

access    bw(MiB/s)  IOPS       Latency(s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
------    ---------  ----       ----------  ---------- ---------  --------   --------   --------   --------   ----
write     1482.70    1482.70    21.52       16384      1024.00    0.014559   353.60     19.92      353.60     0   
read      1480.89    1480.91    21.99       16384      1024.00    0.299514   354.03     43.40      354.04     0   
remove    -          -          -           -          -          -          -          -          3.36       0   

Max Write: 1482.70 MiB/sec (1554.72 MB/sec)
Max Read:  1480.89 MiB/sec (1552.83 MB/sec)

Finished            : Wed Aug 25 14:54:52 2021
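
IOR reports both MiB/sec and MB/sec; the second figure is just the first scaled by 1.048576 (2^20 / 10^6), which is easy to verify against the summary above:

```shell
# Convert IOR's MiB/s write figure to MB/s (1 MiB = 1.048576 MB).
awk 'BEGIN { printf "%.2f MB/sec\n", 1482.70 * 1.048576 }'
```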


Steps to run IO500 benchmark on the EXAScaler Cloud deployment:

  • Run ssh-agent
  • Add ssh private key
  • Open an SSH session to the EXAScaler Cloud management server
  • Get a list of EXAScaler Cloud compute hosts
  • Open an SSH session to any EXAScaler Cloud compute host
  • Run the IO500 benchmark using the exascaler-cloud-io500 tool

localhost$ eval $(ssh-agent)
Agent pid 5095

localhost$ ssh-add 
Identity added: /Users/deiter/.ssh/id_rsa (/Users/deiter/.ssh/id_rsa)

localhost$ ssh -A stack@20.62.171.71

[stack@exascaler-cloud-2ed3-mgs0 ~]$ loci hosts -c
10.0.0.17   exascaler-cloud-2ed3-cls0
10.0.0.8    exascaler-cloud-2ed3-cls1
10.0.0.18   exascaler-cloud-2ed3-cls2
10.0.0.12   exascaler-cloud-2ed3-cls3
10.0.0.14   exascaler-cloud-2ed3-cls4
10.0.0.15   exascaler-cloud-2ed3-cls5
10.0.0.11   exascaler-cloud-2ed3-cls6
10.0.0.9    exascaler-cloud-2ed3-cls7
10.0.0.7    exascaler-cloud-2ed3-cls8
10.0.0.16   exascaler-cloud-2ed3-cls9
 
[stack@exascaler-cloud-2ed3-mgs0 ~]$ ssh -A exascaler-cloud-2ed3-cls0
 
[stack@exascaler-cloud-2ed3-cls0 ~]$ exascaler-cloud-io500
 
Start IO500 benchmark with options:
 
data directory:     /mnt/exacloud/071dfa36e6b20ca7/workload
hosts list:         10.0.0.17,10.0.0.9,10.0.0.14,10.0.0.8,10.0.0.12,10.0.0.5,10.0.0.11,10.0.0.19,10.0.0.18,10.0.0.7
processes per host: 16
files per process:  39637
number of tasks:    160
number of segments: 31500
block size:         4227858432
transfer size:      1048576
 
IO500 version io500-sc20_v3
[RESULT]       ior-easy-write        1.445976 GiB/s : time 364.894 seconds
[RESULT]    mdtest-easy-write       15.411987 kIOPS : time 304.382 seconds
[RESULT]       ior-hard-write        0.461174 GiB/s : time 410.219 seconds
[RESULT]    mdtest-hard-write        2.538281 kIOPS : time 449.131 seconds
[RESULT]                 find      583.795841 kIOPS : time 9.842 seconds
[RESULT]        ior-easy-read        1.450889 GiB/s : time 363.624 seconds
[RESULT]     mdtest-easy-stat       61.106840 kIOPS : time 75.517 seconds
[RESULT]        ior-hard-read        0.543306 GiB/s : time 348.233 seconds
[RESULT]     mdtest-hard-stat       20.753560 kIOPS : time 54.080 seconds
[RESULT]   mdtest-easy-delete        5.836530 kIOPS : time 789.832 seconds
[RESULT]     mdtest-hard-read       10.320768 kIOPS : time 108.658 seconds
[RESULT]   mdtest-hard-delete        4.647816 kIOPS : time 241.181 seconds
[SCORE] Bandwidth 0.851483 GiB/s : IOPS 17.322863 kiops : TOTAL 3.840589
 
/mnt/exacloud/b44731b2e4ac4fc2/sources/results
2021.06.17-20.26.19  io500-exascaler-cloud-2ed3-cls0-2021.06.17-20.26.19.tgz
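
The final [SCORE] line reported by IO500 is the geometric mean of the bandwidth and IOPS scores, which can be checked directly:

```shell
# IO500 total score = sqrt(bandwidth_score * iops_score).
awk 'BEGIN { printf "%.6f\n", sqrt(0.851483 * 17.322863) }'
```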

How to collect inventory and support bundle

Steps to collect a support bundle on the EXAScaler Cloud deployment:

  • Run ssh-agent
  • Add ssh private key
  • Open SSH session to the EXAScaler Cloud management server
  • Collect a support bundle using the exascaler-cloud-collector tool

localhost$ eval $(ssh-agent)
Agent pid 5095

localhost$ ssh-add 
Identity added: /Users/deiter/.ssh/id_rsa (/Users/deiter/.ssh/id_rsa)

localhost$ ssh -A stack@20.62.171.33

[stack@exascaler-cloud-55d2-mgs0 ~]$ exascaler-cloud-collector
 
The following is a list of nodes to collect from:
                              
    exascaler-cloud-55d2-cls0
    exascaler-cloud-55d2-cls1
    exascaler-cloud-55d2-cls2
    exascaler-cloud-55d2-cls3
    exascaler-cloud-55d2-cls4
    exascaler-cloud-55d2-cls5
    exascaler-cloud-55d2-cls6
    exascaler-cloud-55d2-cls7
    exascaler-cloud-55d2-cls8
    exascaler-cloud-55d2-cls9
    exascaler-cloud-55d2-mds0
    exascaler-cloud-55d2-mgs0
    exascaler-cloud-55d2-oss0
    exascaler-cloud-55d2-oss1
    exascaler-cloud-55d2-oss2
    exascaler-cloud-55d2-oss3
 
Connecting to nodes...
 
Beginning collection of sosreports from 16 nodes, collecting a maximum of 4 concurrently
 
Successfully captured 16 of 16 sosreports
Creating archive of sosreports...
 
The following archive has been created. Please provide it to your support team.
    /var/tmp/sos-collector-2021-06-18-nzsnm.tar.gz


Generate the inventory report by running the about_this_deployment command:

localhost$ ssh -A stack@13.87.189.176

[stack@exascaler-cloud-7819-mgs0 ~]$ about_this_deployment 
cloudName: AzurePublicCloud
subscriptionId: 9978cd1b-936a-4296-8061-67c9d963dd40
location: westus
resourceGroupName: CustomDeployment
deployment: exascaler-cloud-7819
filesystem: exacloud
capacityGB: 1024
configurationProfile: custom
instances:
- hostName: exascaler-cloud-7819-cls0
  userName: stack
  proximityPlacementGroup: exascaler-cloud-7819-proximity-placement-group
  passwordAuthentication: false
  instanceName: exascaler-cloud-7819-cls0
  instanceType: Standard_D16s_v3
  role: clt
  memoryGB: 64
  VCPUs: 16
  IOPS: 25600
  bandwidthMBps: 384
  network:
    interfaces:
    - name: exascaler-cloud-7819-cls0-network-interface
      acceleratedNetworking: true
      macAddress: 00-22-48-04-1F-A9
      ipAddresses:
      - privateIpAddress: 10.0.0.6
        subnet: exascaler-cloud-7819-subnetwork
  storage:
    image:
      offer: exascaler_cloud
      publisher: ddn-whamcloud-5345716
      sku: exascaler_cloud_523_redhat
      version: 5.2.3
    bootDisk:
      caching: ReadWrite
      sizeGB: 64
      name: exascaler-cloud-7819-cls0-boot-disk
      tier: StandardSSD_LRS
      type: E6
      IOPS: 500
      bandwidthMBps: 60
  publicKeys:
  - data: |
      ssh-rsa ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC4haGlji/jE0B35HrBKXeIxos2YN8jMP2m1+cx1G7UYnqr7c1I4wv073ubEfOQM2Cww4+CKEAQT3ZC+EW79NQuwJPAzPr2y961INN1zTYEX84ejbmydXzdWJT5u8LKgwKFkkKPzOU7tWJ2Co6rhz796siOpH1bvRg8gFt1oEk9B2fd3O3OA62zq6La0IUYkMQzTlU2b0hNvrVuX+Ea38fi18R7op7tSulystSrvjW/8fzRU8r7c8oTD8kELRh9H3zm0S8mk+Nq1fO4liXtNldnnpl3qAV9NipGXb+L9De15ogtqJE1/2wyaEdbXjqpA2BkHeUY8B2K7K6afpndR+AT stack@host
    path: /home/stack/.ssh/authorized_keys
- hostName: exascaler-cloud-7819-mds0
  userName: stack
  proximityPlacementGroup: exascaler-cloud-7819-proximity-placement-group
  passwordAuthentication: false
  instanceName: exascaler-cloud-7819-mds0
  instanceType: Standard_E8s_v3
  role: mdt
  memoryGB: 64
  VCPUs: 8
  IOPS: 12800
  bandwidthMBps: 192
  network:
    interfaces:
    - name: exascaler-cloud-7819-mds0-network-interface
      acceleratedNetworking: true
      macAddress: 00-22-48-0A-05-EE
      ipAddresses:
      - privateIpAddress: 10.0.0.7
        subnet: exascaler-cloud-7819-subnetwork
  storage:
    image:
      offer: exascaler_cloud
      publisher: ddn-whamcloud-5345716
      sku: exascaler_cloud_523_redhat
      version: 5.2.3
    bootDisk:
      caching: ReadWrite
      sizeGB: 64
      name: exascaler-cloud-7819-mds0-boot-disk
      tier: StandardSSD_LRS
      type: E6
      IOPS: 500
      bandwidthMBps: 60
    dataDisks:
    - lun: 0
      caching: None
      sizeGB: 512
      name: exascaler-cloud-7819-mds0-mdt0-disk
      tier: Premium_LRS
      type: P20
      IOPS: 2300
      bandwidthMBps: 150
  publicKeys:
  - data: |
      ssh-rsa ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC4haGlji/jE0B35HrBKXeIxos2YN8jMP2m1+cx1G7UYnqr7c1I4wv073ubEfOQM2Cww4+CKEAQT3ZC+EW79NQuwJPAzPr2y961INN1zTYEX84ejbmydXzdWJT5u8LKgwKFkkKPzOU7tWJ2Co6rhz796siOpH1bvRg8gFt1oEk9B2fd3O3OA62zq6La0IUYkMQzTlU2b0hNvrVuX+Ea38fi18R7op7tSulystSrvjW/8fzRU8r7c8oTD8kELRh9H3zm0S8mk+Nq1fO4liXtNldnnpl3qAV9NipGXb+L9De15ogtqJE1/2wyaEdbXjqpA2BkHeUY8B2K7K6afpndR+AT stack@host
    path: /home/stack/.ssh/authorized_keys
- hostName: exascaler-cloud-7819-mgs0
  userName: stack
  proximityPlacementGroup: exascaler-cloud-7819-proximity-placement-group
  passwordAuthentication: false
  instanceName: exascaler-cloud-7819-mgs0
  instanceType: Standard_F4s
  role: mgt
  memoryGB: 8
  VCPUs: 4
  IOPS: 12800
  bandwidthMBps: 192
  network:
    interfaces:
    - name: exascaler-cloud-7819-mgs0-network-interface
      acceleratedNetworking: true
      macAddress: 00-22-48-04-1B-91
      ipAddresses:
      - privateIpAddress: 10.0.0.5
        publicIpAddress: 13.87.189.176
        subnet: exascaler-cloud-7819-subnetwork
  storage:
    image:
      offer: exascaler_cloud
      publisher: ddn-whamcloud-5345716
      sku: exascaler_cloud_523_redhat
      version: 5.2.3
    bootDisk:
      caching: ReadWrite
      sizeGB: 64
      name: exascaler-cloud-7819-mgs0-boot-disk
      tier: StandardSSD_LRS
      type: E6
      IOPS: 500
      bandwidthMBps: 60
    dataDisks:
    - lun: 0
      caching: None
      sizeGB: 256
      name: exascaler-cloud-7819-mgs0-mgt0-disk
      tier: StandardSSD_LRS
      type: E15
      IOPS: 500
      bandwidthMBps: 60
    - lun: 1
      caching: None
      sizeGB: 128
      name: exascaler-cloud-7819-mgs0-mnt0-disk
      tier: StandardSSD_LRS
      type: E10
      IOPS: 500
      bandwidthMBps: 60
  publicKeys:
  - data: |
      ssh-rsa ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC4haGlji/jE0B35HrBKXeIxos2YN8jMP2m1+cx1G7UYnqr7c1I4wv073ubEfOQM2Cww4+CKEAQT3ZC+EW79NQuwJPAzPr2y961INN1zTYEX84ejbmydXzdWJT5u8LKgwKFkkKPzOU7tWJ2Co6rhz796siOpH1bvRg8gFt1oEk9B2fd3O3OA62zq6La0IUYkMQzTlU2b0hNvrVuX+Ea38fi18R7op7tSulystSrvjW/8fzRU8r7c8oTD8kELRh9H3zm0S8mk+Nq1fO4liXtNldnnpl3qAV9NipGXb+L9De15ogtqJE1/2wyaEdbXjqpA2BkHeUY8B2K7K6afpndR+AT stack@host
    path: /home/stack/.ssh/authorized_keys
- hostName: exascaler-cloud-7819-oss0
  userName: stack
  proximityPlacementGroup: exascaler-cloud-7819-proximity-placement-group
  passwordAuthentication: false
  instanceName: exascaler-cloud-7819-oss0
  instanceType: Standard_D16s_v3
  role: ost
  memoryGB: 64
  VCPUs: 16
  IOPS: 25600
  bandwidthMBps: 384
  network:
    interfaces:
    - name: exascaler-cloud-7819-oss0-network-interface
      acceleratedNetworking: true
      macAddress: 00-22-48-0A-0D-54
      ipAddresses:
      - privateIpAddress: 10.0.0.4
        subnet: exascaler-cloud-7819-subnetwork
  storage:
    image:
      offer: exascaler_cloud
      publisher: ddn-whamcloud-5345716
      sku: exascaler_cloud_523_redhat
      version: 5.2.3
    bootDisk:
      caching: ReadWrite
      sizeGB: 64
      name: exascaler-cloud-7819-oss0-boot-disk
      tier: StandardSSD_LRS
      type: E6
      IOPS: 500
      bandwidthMBps: 60
    dataDisks:
    - lun: 0
      caching: None
      sizeGB: 512
      name: exascaler-cloud-7819-oss0-ost0-disk
      tier: Standard_LRS
      type: S20
      IOPS: 500
      bandwidthMBps: 60
    - lun: 1
      caching: None
      sizeGB: 512
      name: exascaler-cloud-7819-oss0-ost1-disk
      tier: Standard_LRS
      type: S20
      IOPS: 500
      bandwidthMBps: 60
  publicKeys:
  - data: |
      ssh-rsa ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC4haGlji/jE0B35HrBKXeIxos2YN8jMP2m1+cx1G7UYnqr7c1I4wv073ubEfOQM2Cww4+CKEAQT3ZC+EW79NQuwJPAzPr2y961INN1zTYEX84ejbmydXzdWJT5u8LKgwKFkkKPzOU7tWJ2Co6rhz796siOpH1bvRg8gFt1oEk9B2fd3O3OA62zq6La0IUYkMQzTlU2b0hNvrVuX+Ea38fi18R7op7tSulystSrvjW/8fzRU8r7c8oTD8kELRh9H3zm0S8mk+Nq1fO4liXtNldnnpl3qAV9NipGXb+L9De15ogtqJE1/2wyaEdbXjqpA2BkHeUY8B2K7K6afpndR+AT stack@host
    path: /home/stack/.ssh/authorized_keys
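
The inventory report is plain YAML, so quick summaries can be extracted with standard tools; for example, counting instances per role (shown here on a trimmed sample of the output above):

```shell
# Tally "role:" lines from an about_this_deployment report (trimmed sample).
cat <<'EOF' | awk '/^  role:/ { count[$2]++ } END { for (r in count) print r, count[r] }' | sort
- hostName: exascaler-cloud-7819-cls0
  role: clt
- hostName: exascaler-cloud-7819-mds0
  role: mdt
- hostName: exascaler-cloud-7819-mgs0
  role: mgt
- hostName: exascaler-cloud-7819-oss0
  role: ost
EOF
```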

How to stop and start a deployment

If you want to stop and start the servers in your EXAScaler Cloud deployment (for example, to save costs on weekends), use the exascaler-cloud-ctl script.

Prerequisites

Before using the Azure CLI, you will need to authenticate with the Microsoft account you used to log into the Microsoft Azure Portal. This account and its credentials allow the script to start and stop the EXAScaler Cloud servers.

Steps to authenticate via Microsoft account

The az login command obtains access credentials for your user account via a web-based authorization flow. When the command completes successfully, it sets the active account in the current configuration to the account specified. Learn more.

$ az login

To view the current Azure subscription ID, use the az account show command.

$ az account show
{
  "environmentName": "AzureCloud",
  "homeTenantId": "00000000-0000-0000-0000-000000000000",
  "id": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "isDefault": true,
  "managedByTenants": [],
  "name": "Pay-As-You-Go",
  "state": "Enabled",
  "tenantId": "00000000-0000-0000-0000-000000000000",
  "user": {
    "name": "user@domain.com",
    "type": "user"
  }
}
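
In scripts, the subscription ID alone can be extracted from this JSON. The az CLI supports this directly via az account show --query id -o tsv; as an illustration, a plain-sed equivalent over the JSON shape shown above (with a placeholder id) looks like:

```shell
# Extract the "id" field from az account show output with sed alone.
# In practice, az account show --query id -o tsv does the same thing.
json='{"id": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX", "isDefault": true}'
printf '%s\n' "$json" | sed -n 's/.*"id": *"\([^"]*\)".*/\1/p'
```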

If you have more than one subscription, set the default subscription. Learn more.

$ az account set --subscription XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

List of existing resource groups

$ scripts/exascaler-cloud-ctl list
Name              Location    Status
----------------  ----------  ---------
EXAScaler-Cloud   eastus      Succeeded  

List of existing deployments for the given resource group

$ scripts/exascaler-cloud-ctl EXAScaler-Cloud list
Name                            Created                    Status
------------------------------  -------------------------  ---------
exascaler-cloud-65f1            2021-08-24T18:56:27+00:00  Succeeded  

List of EXAScaler Cloud servers for the given deployment

$ scripts/exascaler-cloud-ctl EXAScaler-Cloud exascaler-cloud-65f1 list
Name                       Size              Role    Version    PrivateIP    PublicIP      Status
-------------------------  ----------------  ------  ---------  -----------  ------------  ----------
exascaler-cloud-65f1-cls0  Standard_D16s_v3  clt     5.2.3      10.0.0.7                   VM running
exascaler-cloud-65f1-cls1  Standard_D16s_v3  clt     5.2.3      10.0.0.5                   VM running
exascaler-cloud-65f1-cls2  Standard_D16s_v3  clt     5.2.3      10.0.0.8                   VM running
exascaler-cloud-65f1-cls3  Standard_D16s_v3  clt     5.2.3      10.0.0.13                  VM running
exascaler-cloud-65f1-mds0  Standard_E8s_v3   mdt     5.2.3      10.0.0.12                  VM running
exascaler-cloud-65f1-mgs0  Standard_F4s      mgt     5.2.3      10.0.0.11    20.62.171.73  VM running
exascaler-cloud-65f1-oss0  Standard_D16s_v3  ost     5.2.3      10.0.0.10                  VM running
exascaler-cloud-65f1-oss1  Standard_D16s_v3  ost     5.2.3      10.0.0.4                   VM running
exascaler-cloud-65f1-oss2  Standard_D16s_v3  ost     5.2.3      10.0.0.6                   VM running
exascaler-cloud-65f1-oss3  Standard_D16s_v3  ost     5.2.3      10.0.0.9                   VM running  

Stop the EXAScaler Cloud servers

$ scripts/exascaler-cloud-ctl EXAScaler-Cloud exascaler-cloud-65f1 stop
Stop compute client exascaler-cloud-65f1-cls0
Stop compute client exascaler-cloud-65f1-cls1
Stop compute client exascaler-cloud-65f1-cls2
Stop compute client exascaler-cloud-65f1-cls3
Stop storage server exascaler-cloud-65f1-oss0
Stop storage server exascaler-cloud-65f1-oss1
Stop storage server exascaler-cloud-65f1-oss2
Stop storage server exascaler-cloud-65f1-oss3
Stop metadata server exascaler-cloud-65f1-mds0
Stop management server exascaler-cloud-65f1-mgs0

Start the EXAScaler Cloud servers

$ scripts/exascaler-cloud-ctl EXAScaler-Cloud exascaler-cloud-65f1 start
Start management server exascaler-cloud-65f1-mgs0
Start metadata server exascaler-cloud-65f1-mds0
Start storage server exascaler-cloud-65f1-oss0
Start storage server exascaler-cloud-65f1-oss1
Start storage server exascaler-cloud-65f1-oss2
Start storage server exascaler-cloud-65f1-oss3
Start compute client exascaler-cloud-65f1-cls0
Start compute client exascaler-cloud-65f1-cls1
Start compute client exascaler-cloud-65f1-cls2
Start compute client exascaler-cloud-65f1-cls3

All required services will automatically start and the filesystem will be mounted and available on all compute clients.
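
The stop/start sequence above can be wrapped in a small helper for scheduled use (for example from cron). The following is a sketch only; the resource group and deployment name are hard-coded placeholders for your environment:

```shell
# Hypothetical helper: stop or start every server in one deployment.
# EXAScaler-Cloud / exascaler-cloud-65f1 are placeholders for your environment.
esc_power() {
    case "$1" in
        stop|start)
            scripts/exascaler-cloud-ctl EXAScaler-Cloud exascaler-cloud-65f1 "$1"
            ;;
        *)
            echo "usage: esc_power stop|start" >&2
            return 1
            ;;
    esac
}
```

A cron entry could then call esc_power stop on Friday evening and esc_power start on Monday morning.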

How to upgrade an existing deployment

An existing EXAScaler Cloud deployment can be upgraded by creating a new deployment from a copy of the existing file system.

To upgrade an existing EXAScaler Cloud deployment, use the standard EXAScaler Cloud Azure application and select the "Upgrade" value for the "Deployment type" option.

All running instances in the existing EXAScaler Cloud deployment must be shut down before performing the upgrade. This is required to ensure data consistency while the copy of the existing data is created:

$ scripts/exascaler-cloud-ctl list                                                 
Name              Location    Status                                                                                              
----------------  ----------  ---------                                                                                           
EXAScaler-Cloud   eastus      Succeeded

$ scripts/exascaler-cloud-ctl EXAScaler-Cloud list
Name                            Created                    Status
------------------------------  -------------------------  ---------
exascaler-cloud-65f1            2021-08-24T18:56:27+00:00  Succeeded

$ scripts/exascaler-cloud-ctl EXAScaler-Cloud exascaler-cloud-65f1 stop
Stop compute client exascaler-cloud-65f1-cls0
Stop compute client exascaler-cloud-65f1-cls1
Stop compute client exascaler-cloud-65f1-cls2
Stop compute client exascaler-cloud-65f1-cls3
Stop storage server exascaler-cloud-65f1-oss0
Stop storage server exascaler-cloud-65f1-oss1
Stop storage server exascaler-cloud-65f1-oss2
Stop storage server exascaler-cloud-65f1-oss3
Stop metadata server exascaler-cloud-65f1-mds0
Stop management server exascaler-cloud-65f1-mgs0
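Before launching the upgrade, it can help to confirm programmatically that nothing in the deployment is still running. A minimal Python sketch that scans an instance table for a "VM running" status (the table layout and status strings are taken from the listings above; treat them as assumptions for other versions of the tool):

```python
# Sample table in the style of `exascaler-cloud-ctl <group> <deployment> list`
# output; the "VM deallocated" status string is an assumption.
SAMPLE = """\
exascaler-cloud-65f1-cls3  Standard_D16s_v3  clt  5.2.3  10.0.0.13                VM running
exascaler-cloud-65f1-mgs0  Standard_F4s      mgt  5.2.3  10.0.0.11  20.62.171.73  VM running
exascaler-cloud-65f1-oss0  Standard_D16s_v3  ost  5.2.3  10.0.0.10                VM deallocated
"""

def running_instances(table):
    """Return the names of instances whose status column reads 'VM running'."""
    names = []
    for line in table.splitlines():
        fields = line.split()
        # The public IP column is optional, so read the status from the end.
        if len(fields) >= 2 and fields[-2:] == ["VM", "running"]:
            names.append(fields[0])
    return names

still_up = running_instances(SAMPLE)
print(still_up)  # → ['exascaler-cloud-65f1-cls3', 'exascaler-cloud-65f1-mgs0']
```

An empty result means it is safe to proceed with the upgrade.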

Then launch the standard EXAScaler Cloud Azure application from the Azure Marketplace.

➊ Basics step

  • specify your Azure subscription
  • create a new resource group
  • select a region
  • select the "Upgrade" value for the "Deployment type" option:

➋ Deployment step

Select an existing EXAScaler Cloud deployment:

Select the base OS image (you can switch from Red Hat Enterprise Linux to CentOS Linux, or vice versa) and the availability options:

➌ Security step

Specify the user name and authentication parameters (you can use a user name and authentication parameters other than those of the existing environment):

➍ Network step

Specify the network and subnet (you can create new ones) and the public IP address options:

➎ Management step

You can change the type of management server and the type of management and monitoring targets, but the size of the targets must be the same as in the existing environment:

➏ Metadata step

You can change the type of metadata server and the type of metadata target, but the size of the target must be the same as in the existing environment:

➐ Storage step

You can change the type of storage servers and the type of storage targets, but the number of the storage servers and the number and size of the storage targets must be the same as in the existing environment:

➑ Client step

You can change the type of compute clients and the type of compute targets, but the number and size of the compute targets must be the same as in the existing environment:
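The rule across the Management, Metadata, Storage and Client steps is the same: instance types may change, but target sizes and counts must match the existing environment. A small Python sketch of that consistency check (the field names are illustrative, not an actual API):

```python
# Hypothetical upgrade sanity check: instance types may change freely,
# but target sizes and counts must match the existing deployment.
# Field names and values below are illustrative assumptions.
existing = {"mgt_size": 128, "mnt_size": 128, "mdt_size": 512,
            "oss_count": 4, "ost_size": 512, "clt_size": 32}

# A proposed upgrade: the storage server type changes, sizes stay identical.
proposed = dict(existing, oss_type="Standard_D32s_v3")

mismatches = [key for key in existing if proposed.get(key) != existing[key]]
assert not mismatches, f"sizes/counts must match the existing deployment: {mismatches}"
print("upgrade configuration is consistent")
```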

➒ Review + create step

At this stage, Microsoft Azure validates all selected options against the subscription quota, and we can review all the chosen parameters before starting the deployment:

Then press the "Create" button to upgrade the existing deployment. A new EXAScaler Cloud deployment will be created according to the selected parameters, and all new targets will be created as copies of the targets in the existing EXAScaler Cloud deployment.
