Getting Started with BigData as a Service with Bluedata EPIC Private Cloud – Part 1

Hello everyone, this is part 1 of my 2-part series on building your BigData as a Service (BDaaS) private cloud infrastructure with Bluedata EPIC. In this part, we will cover the following:

More details on BDaaS with Bluedata EPIC are available at http://www.bluedata.com

  • Prerequisites
  • BDaaS EPIC Prechecks
    • Where can I get the prechecks script?
    • What does the prechecks script do?
    • How do I run prechecks?
    • How do I review the Prechecks output?
  • EPIC Controller Installation
    • How do I start the installer?
    • What does the installer do?
    • How do I review the installer logs?
    • What if you end up with a failed installation?
    • How do I run the EPIC Uninstaller?
  • EPIC Platform Configuration
    • How do I configure the Platform Settings?
    • What steps are performed in the Platform configuration?
    • Default system Containers
    • Tenant Storage & Node Storage
  • How do I log in to the EPIC Console?
  • Licensing the Platform
    • How do I license the platform?

Prerequisites

Below are the prerequisites to be met when configuring the hosts for a BDaaS EPIC deployment.

  • Host Requirements
    • For Compute hosts, identical hosts are recommended, but you can customize hosts (e.g., hybrid storage, GPUs) and then use host affinity to place virtual nodes (containerized clusters) accordingly.
    • Each Gateway host in the EPIC platform must meet the following minimum requirements:
      • 8-core CPU
      • 32GB RAM
      • No additional storage required beyond the OS disk.
  • Operating System Requirements
    • Red Hat Enterprise Linux or CentOS operating systems: 7.3 or 7.4 (for the EPIC 3.2 version)
    • RHEL should have a valid subscription to access the below repositories:
      • bdsepel7 (added by the EPIC installer)
      • rhel-x86_64-server-7 (RHN Classic / Satellite)
      • rhel-x86_64-server-extras-7 (RHN Classic / Satellite)
      • rhel-x86_64-server-ha-7 (RHN Classic / Satellite)
      • rhel-x86_64-server-optional-7 (RHN Classic / Satellite)
    • The list of required rpms can be found HERE
  • Network Requirements
    • 10G/25G networking
    • Identical interface naming standards (as of EPIC 3.2)
    • All hosts should be synchronized with an on-premises or external NTP server.
  • Hostname should be in FQDN format, e.g., epic01.labs.local
  • SSHD Service: should allow root login via ssh.
    • In the file /etc/ssh/sshd_config, set the following:
      • PubkeyAuthentication yes
      • AuthorizedKeysFile .ssh/authorized_keys
      • PermitRootLogin yes
  • Password-less SSH (a consolidated sketch of these commands follows this list):
    • Generate an RSA public/private key pair on the controller.
      • Execute: ssh-keygen
      • Copy both files to the other hosts:
        • scp /root/.ssh/id_rsa* root@192.168.x.x:/root/.ssh/
      • Create an authorized_keys file by copying the contents of id_rsa.pub into the .ssh folder on all hosts, including the controller:
        • cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
      • Set the required permissions on all hosts:
        • chmod 700 ~/.ssh
        • chmod 400 ~/.ssh/authorized_keys
    • Connect to each host over ssh at least once from the controller to add them to the list of known_hosts.
  • Firewall and SELinux Settings: Can be turned OFF or ON. EPIC will add the necessary firewall exceptions if Firewall/SELinux is ON. These settings should not be changed after EPIC is installed.
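
Putting the password-less SSH steps together, here is a minimal sketch. It assumes root access and a single additional host at 192.168.x.x; repeat the copy steps for every host in the platform:

    # On the controller (run as root)
    ssh-keygen -t rsa        # accept the defaults; creates /root/.ssh/id_rsa and id_rsa.pub
    cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
    chmod 700 ~/.ssh && chmod 400 ~/.ssh/authorized_keys

    # Copy the key pair to each of the other hosts and set permissions there
    scp /root/.ssh/id_rsa* root@192.168.x.x:/root/.ssh/
    ssh root@192.168.x.x 'cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys && chmod 700 ~/.ssh && chmod 400 ~/.ssh/authorized_keys'

    # Connect once from the controller so each host lands in known_hosts
    ssh root@192.168.x.x hostname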

EPIC Prechecks

You can run the prechecks script, which performs a series of tests on the Controller host to determine whether it is ready to accept an EPIC installation.

What checks are being performed?

The tool performs the below checks:

  • Hardware tests
    • CPU – EPIC requires a minimum of 4 CPU cores.
    • Memory – EPIC requires a minimum of 32GB RAM.
    • Disks – EPIC requires a minimum of 2 raw disks with no filesystem, logical volumes, or partitions. These disks are used for Node Storage and Tenant Storage. The minimum size is 1TB each.
  • Network port tests
    • The list of TCP/UDP ports tested by EPIC can be seen from the attached xtrace log.
  • Network connectivity tests
    • EPIC requires that the public interface name be the same across all the hosts in the platform.
    • Identifies the primary public interface used for EPIC. You may see errors if there is more than one network interface (NIC) with an assigned IP address. To resolve this, ensure that there is only one NIC with an assigned IP address. If this is not possible, add the --controller-public-if option to specify the NIC to use.
    • Tests for the FQDN of the host. The hostname must be a Fully Qualified Domain Name (FQDN) that includes at least one “dot” (.)
    • Checks for default gateway connectivity. Ensure there is only one default gateway set.
    • Checks internet access and ensures the AWS S3 EPIC buckets are reachable. If access is through a proxy, add the --proxy option when running the prechecks.
  • OS tests
    • Checks for the supported OS type (RHEL/CentOS). All hosts in the EPIC platform must be running either CentOS 7.3/7.4 or RHEL 7.3/7.4.
    • Checks for a valid RHEL subscription
    • Checks for the required repositories to download the dependent rpms. The below repositories should be available.
      • bdsepel7 (added by the EPIC installer)
      • rhel-x86_64-server-7
      • rhel-x86_64-server-extras-7
      • rhel-x86_64-server-ha-7
      • rhel-x86_64-server-optional-7

Make sure that the repositories contain the correct version of the rpms mentioned HERE.

  • Kernel – The kernel version must be greater than 2.6.32-573.el6.x86_64
  • Tests automount configuration – Ensure that /etc/auto.master has only one -hosts line. If this line exists, it should be the same on the Controller and all Worker hosts in the EPIC platform
  • SSHD – SSH should be enabled and root should have login access to SSH.
  • Software RAID test – EPIC does not support software RAID.
  • Checks whether the paths specified for both the certificate and the private key (as options) are available.
  • Storage tests
    • Filesystem check – EPIC requires a minimum of 300GB on the / partition. If you have separate mount points for /var, /opt and /srv, then the minimum sizes are as follows:
      • / 50GB
      • /opt 100GB
      • /var 100GB
      • /srv 100GB
    • Swap space – At least 20% of RAM is recommended.
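
A quick way to sanity-check the current layout against these minimums before running the script (a sketch; adjust for your mount points):

    df -h / /opt /var /srv    # partition sizes vs. the minimums above
    free -h                   # RAM and swap (swap should be at least ~20% of RAM)
    lsblk                     # raw disks should show no partitions or filesystems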

How do I run prechecks?

  • Download the prechecks script (bluedata-prechecks-epic-entdoc-3.2.bin) to a local directory on the controller host (e.g., /bluedata)
  • Make the .bin file executable:
    • chmod a+x bluedata-prechecks-epic-entdoc-3.2.bin
  • Execute the script with the necessary options, as in the sketch below:

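A minimal invocation sketch (the proxy URL and interface name below are placeholders; --proxy and --controller-public-if are only needed in the situations described earlier):

    cd /bluedata
    ./bluedata-prechecks-epic-entdoc-3.2.bin

    # Behind a proxy, or with multiple NICs holding IP addresses:
    ./bluedata-prechecks-epic-entdoc-3.2.bin --proxy http://proxy.labs.local:3128 --controller-public-if eth0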

  • You can get the list of all available options at http://docs.bluedata.com/32_using-the-pre-check-script
  • Review the results and ensure that all the necessary tests pass without failures. If needed, address any errors or warnings that you see and run the Prechecks script again.

How do I review the Prechecks output?

The prechecks script generates 3 files under /tmp/:

  • Prechecks log (e.g., /tmp/bd_prechecks.XXXX.log, where XXXX is the PID)
  • Xtrace file – used by Bluedata support (/tmp/bd_prechecks.XXXX.log.xtrace, where XXXX is the PID)
  • Prechecks config file – used by the installer (/tmp/bd_prechecks.conf)

Ensure that all the tests succeed, and address any errors or warnings; failure to do so might cause unpredictable results during or after the installation. You can also use the --force option with the prechecks script to skip warnings and still generate the prechecks config file, but this is not recommended.

The prechecks config file (bd_prechecks.conf) records the configuration details detected by the checks above and passes them on to the EPIC installer.

EPIC Controller Installation

Assuming that the prechecks succeeded, let's move on to the controller installation.

Before you install the controller, make sure the following conditions are met.

  • The prechecks output/logs don't contain any errors.
  • You have accounted for any warnings contained in the prechecks output.
  • No configuration changes (host, user, network, infrastructure, OS, etc.) have been made since the prechecks ran. If there are any changes, run the prechecks again.

How do I start the installer?

  • Copy the installer “bluedata-epic-entdoc-m7-minimal-release-3.2-2177.bin” to a local directory (e.g., /bluedata) on the controller host.
  • Make the .bin executable by running “chmod a+x bluedata-epic-entdoc-m7-minimal-release-3.2-2177.bin”
  • Run the installer:
    • ./bluedata-epic-entdoc-m7-minimal-release-3.2-2177.bin --prechecks-config-file /tmp/bd_prechecks.conf

What does the installer do?

  • The installer checks the integrity of the EPIC bundle and then extracts the bundle contents.
  • Prompts you to accept the EULA.
  • Initiates the installation logging to /tmp/bds-YYMMDDHHMMSS.log
  • Cleans up the yum metadata to remove any stale RPM version info.
  • Installs the bluedata repository, which contains the controller and worker rpms.
  • Checks whether the dependencies are already installed; otherwise, it downloads and installs them.
    • The list of all the required RPMs can be found HERE
  • The installer then installs the controller, worker, docker, Open vSwitch and haproxy RPMs:
    • bluedata-common.x86_64 0:3.2-2177
    • bluedata-controller.x86_64 0:3.2-2177
    • bluedata-worker.x86_64 0:3.2-2177
    • docker.x86_64 2:1.12.6-71.git3e8e77d.el7
    • haproxy.x86_64 0:1.5.18-6.el7
    • libxml2-devel.x86_64 0:2.9.1-6.el7_2.3
    • libxslt-devel.x86_64 0:1.1.28-5.el7
    • openssl-devel.x86_64 1:1.0.2k-8.el7
    • openvswitch.x86_64 0:2.5.2-1.el7.centos
    • python-cffi.x86_64 0:1.6.0-5.el7
    • python-devel.x86_64 0:2.7.5-58.el7
    • socat.x86_64 0:1.7.3.2-2.el7
  • Performs the controller configuration.
  • The installer also sets up the software-defined networking pieces using Open vSwitch. It deploys 3 bridges: bds-local, bds-ex and bds-gateway (see the verification sketch below).
  • Once the installation is successful, the installer presents the WebUI (http://ControllerIP) for configuring the EPIC platform.
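
To confirm that the bridges were created, you can query Open vSwitch directly; a minimal sketch (run as root on the controller):

    ovs-vsctl list-br    # expect bds-local, bds-ex and bds-gateway
    ovs-vsctl show       # full topology, including the ports attached to each bridge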

How do I review the installer logs?

The EPIC installer generates two logs under the /tmp/ directory. Review the logs and confirm there are no errors or warnings.

  • Installer log -> bds-YYMMDDHHMMSS.log
  • Xtrace -> bds-YYMMDDHHMMSS.log.xtrace

What if you end up with a failed installation?

If the controller installation fails for some reason (e.g., a connectivity issue to a required repository), or if you notice errors or warnings in the setup log, you might need to run the EPIC Uninstaller before attempting to reinstall. Make sure to account for the errors you noticed before reinstalling EPIC.

How do I run the EPIC Uninstaller?

  • Navigate to the directory where you copied the EPIC installer (/bluedata)
  • Run the uninstaller: ./bluedata-epic-entdoc-m7-minimal-release-3.2-2177.bin --erase --force
  • If the EPIC platform has one or more Worker host(s), this process should uninstall EPIC from the Worker(s) as well; however, the uninstaller will not work for any Worker host that is down or otherwise unreachable from the Controller host. In this case, you must log in to each affected host and uninstall EPIC manually.
  • Reboot all of the hosts in the platform.
  • Reinstall EPIC as per this post.

EPIC Platform Configuration

Once you have reviewed the output logs and found no issues, you are good to proceed to the platform configuration.

How do I configure the Platform Settings?

  • Navigate to the controller’s WebUI at http://ControllerIP (e.g., http://192.168.x.x/bdswebui/admininstall/)
  • The EPIC-Enterprise – Setup screen appears.

  • The Floating IP Range, CIDR, and Internal Gateway IP fields are prefilled. You may accept these defaults or use your own subnet. This is the network used for container (virtual node) networking.
  • You can decide whether you want the container network to be on a routable or non-routable subnet. Using a non-routable network for the virtual clusters is recommended, as it allows for better isolation.
  • The Floating IP External Interface pull-down menu displays the NIC that you selected for Internet access during the command-line installation. Each host in the EPIC platform must use the same NIC to access the Internet.
  • The Floating IP Next hop field is the IP address of the external gateway.
  • The Domain Name field defines the DNS domain name that will be used for the virtual nodes. This should be different from the domain name of the EPIC hosts.
  • Node storage is the storage that is directly accessed by the virtual nodes. This is where the virtual cluster storage resides, and it is non-persistent. You can either use RAID for this, or EPIC can create Logical Volumes if the disks are pass-through disks. This is what happens in the background:
  • EPIC creates Linux physical volumes on those disks, and then uses those physical volumes to create a Linux volume group called VolBDSCStore. A logical volume named “thinpool” is then created from this volume group. This logical volume is assigned to the Docker subsystem, which uses the Linux device mapper functionality to allocate portions of the thinpool logical volume to the containers running on that host, for use as local storage within those containers.
  • Node storage can support a maximum capacity of 24TB
  • Note that this containerized local HDFS storage does not provide data persistence beyond the life of the virtual cluster.
  • Select the required Tenant Storage. Tenant Storage provides a persistent data storage option for the virtual nodes. The available options are:
    • If the hosts each have a second or third hard drive and you want to create local tenant storage using HDFS with Kerberos protections, select Create HDFS from local disks for Tenant Storage.
    • To use an existing external HDFS file system as tenant storage, select Use existing HDFS for Tenant Storage and then enter the HDFS parameters.
    • To use an existing external NFS file system as tenant storage (e.g., EMC Isilon), select Use existing NFS for Tenant Storage and then enter the NFS parameters.
    • If you do not want to create any tenant storage, then select None. When this option is selected, EPIC will not create a default TenantStorage DataTap when creating tenants.
  • More details on configuring the Tenant Storage is available HERE
  • If you are creating local HDFS system storage, checking the Kerberos Protected checkbox enables Kerberos protection for that storage.
  • Click Submit to finish installing EPIC on the Controller host.
  • EPIC displays a popup indicating that the installation process has started successfully.

  • The “Bluedata software setup completed successfully” popup appears when the installation process is completed. Click the Close button to exit to the EPIC Login screen.

What steps are performed in the Platform configuration?

EPIC calls a series of scripts from the location /opt/bluedata/bundles/bluedata-epic-entdoc-m7-minimal-release-3.2-2177/scripts to perform the steps below.

  • Prepares the Node Storage.
    • Creates physical volumes on the disks.
    • Creates a volume group called “VolBDSCStore” using those physical volumes.
    • Creates a logical volume “thinpool” on this volume group.
    • This logical volume is assigned to the Docker subsystem, which uses the Linux device mapper functionality to allocate portions of the thinpool logical volume to the containers running on that host, for use as local storage within those containers.
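
Once this step completes, you can verify the resulting layout with standard LVM tooling; a sketch (run as root on the controller):

    pvs                         # physical volumes created on the selected raw disks
    vgs VolBDSCStore            # the volume group EPIC created
    lvs VolBDSCStore            # expect the “thinpool” logical volume
    docker info | grep -i pool  # devicemapper should report the thinpool as its storage pool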

  • Prepares the Tenant HDFS Storage
    • Creates a single partition on each disk (e.g., /dev/sdg1), writes an XFS filesystem and mounts it under /opt/bds-hdfs-storage/jbod-X, where X = 0 to n

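A quick sketch to confirm the tenant storage mounts (device names will vary):

    df -h | grep bds-hdfs-storage    # one jbod-X mount per selected disk
    mount | grep jbod                # each should be mounted as xfs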

  • EPIC creates 3 docker containers on the controller host, which are explained later in this post.
  • The container “epic-apache-hdfs-centos” runs the local Tenant Storage HDFS daemons.
  • This container creates the HDFS pool using all the disks selected on the EPIC Platform Settings configuration page, with a default replication factor of 3.
  • The EPIC DataTap interface presents the HDFS Tenant Storage to the containers.
  • More details on the Tenant HDFS storage are described in the sections below.

EPIC Containers

  • EPIC creates 3 docker containers on the controller host:
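
To see them, you can list the running containers on the controller; a sketch (the exact container names on your system may differ):

    docker ps --format 'table {{.ID}}\t{{.Names}}\t{{.Status}}'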

It also sets up the software-defined networking pieces. More details on this, including VxLAN tenant isolation, will be covered in a separate post.

Container “epic-apache-hdfs-centos”

The container “epic-apache-hdfs-centos” runs the HDFS daemons (NameNode, DataNode and HttpFS). To get a bash shell in this container, use “docker exec -it <container-id> /bin/bash”.

You can run HDFS commands (e.g., file system checks) inside this container, as in the sketch below.
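
A minimal sketch, assuming the hdfs client is on the container's PATH (you may need to run these as the HDFS superuser inside the container):

    hdfs fsck /              # file system check of the tenant HDFS
    hdfs dfsadmin -report    # capacity and DataNode status
    hdfs dfs -ls /           # browse the root of the tenant storage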

As noted earlier, this container creates the HDFS pool using all the disks selected on the EPIC Platform Settings configuration page, with a default replication factor of 3. The EPIC DataTap interface presents the HDFS Tenant Storage to the containers.

Name Node directory – If you select more than 2 disks on the controller, HDFS will use two disks to store the Name Node metadata (fsimage and edit logs) for disk-level resiliency. By default, the directory is at

/opt/bds-hdfs-storage/jbod-X/name

Data Node directory – The data node directory is at /opt/bds-hdfs-storage/jbod-X/data

You can access the HDFS NameNode web UI via http://192.168.11.21:50070/dfshealth.html

To restart the HDFS container (only if necessary):

docker stop <container-id>

docker start <container-id>

  • Configuring Nagios
    • EPIC starts the container “epic-nagios-192.168.11.21” and configures the service.
    • EPIC will add the definitions for monitoring the BlueData cluster of physical hosts to the Nagios config file
      • cfg_file=/etc/bluedata/nagios/bluedata.cfg
    • EPIC will add the definitions for monitoring the local HDFS to the Nagios config file
      • cfg_file=/etc/bluedata/nagios/bluedata-hdfs.cfg
    • EPIC will add the definitions for monitoring the HDFS HA engine to the Nagios config file
      • cfg_file=/etc/bluedata/nagios/bluedata-hdfs-ha.cfg
  • EPIC will then enable monitoring of the EPIC platform.
    • Creates the below directories on the controller host.
      • Monitoring directory – /var/lib/monitoring
      • Monitoring logs directory – /var/lib/monitoring/logs
      • Monitoring Elastic Search directory – /var/lib/monitoring/elasticsearch

  • Sets up Elasticsearch; all virtual clusters that we spin up later will have a Metricbeat agent installed that reports metrics to the Elasticsearch cluster (a quick health-check sketch follows this list).
  • EPIC deploys the necessary monitoring bundles on the controller host (and on any workers added later).
  • Configures the management service and creates a demo tenant (called “Demo Tenant”).
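
If you want to confirm the monitoring backend is up, you can query Elasticsearch's cluster health API. This is a sketch and assumes Elasticsearch listens on the default port 9200 on the controller, which may differ in your deployment:

    curl -s 'http://localhost:9200/_cluster/health?pretty'    # expect status green or yellow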

How do I log in to the EPIC Console?

You can access the console via the WebUI at http://<ControllerIP>

Licensing the Platform

How do I license the platform?

Licensing is based on the number of CPU cores used in the EPIC platform. Obtain the controller ID of the platform and place a request with EPIC support for the license.

Navigate to Settings -> License

The controller ID is unique to a particular installation. If you need to reinstall the platform, request a new license using the newly generated controller ID.

Once the licensing is completed, you are good to proceed with adding the worker hosts and gateways.

 

In Part 2, we will walk through the remaining activities like adding the Worker hosts and Gateway hosts.

Hope this was useful. Thanks for reading!
