Gluster Install Using Ansible

Links

Overview

This project contains a top level control script and a number of Ansible playbooks to provision and configure a cluster of GlusterFS storage nodes. Configuration has been provided to be able to deploy to either AWS or Google Compute Engine environments.

Environment Setup

Each cloud environment requires a some account creation and project configuration to work with your credentials.

Ansible

Being a bit of a Python novice I ran into a number of PATH related issues with Anisible installed from either RPM on my Linux machines or Homebrew on my Mac. I got the best results installing the Python Pip installer using the appropriate method for either system and then using pip install --upgrade ansible to get and keep Ansible up to date.

Ansible version 2.1 required

Ansible 2.1 which is currently in development has an EC2 feature to set the delete_on_termination flag when creating storage disks. This prevents you from having unattached volumes that you have to delete with a separate action after terminating cluster instances. For the moment this requires that you run from source but at some point in the near future >v2.1 will be generally available.

Amazon Web Services

Create an API key or run from a server with the appropriate instance role

# Configure key or run from instance with role
export AWS_ACCESS_KEY_ID=FOO
export AWS_SECRET_ACCESS_KEY=BAR
export AWS_REGION=us-east-1

Download a copy of the ec2.ini configuration file and save it to hosts/aws/ec2.ini
Create a new configuration file in vars/ which you’ll reference when you run the gluster.sh script
- You can support multiple environments by making more than one config file
Most ec2 instance and ebs disk settings have defaults and can be overriden in this file, several settings are environment specific and you’ll have to set for your account

# Example contents of AWS vars/mydomain.yml file values
ec2_subnet_ids: [ 'subnet-c70dd123' ]
ec2_security_groups: [ 'ssh-whitelist', 'all-internal' ]
domain: "mydomain.com"
ec2_keypair: "mydomain.com"

Google Compute Engine

Create a project in the the Google Developers Console
Create a service account and key through the Google Developers Console API section
Add an ssh key to the project
Rename hosts/gce/gce.ini.example to hosts/gce/gce.ini and fill in the values with your account information
Create an environment specific config file like the one described above for aws and populate with your account information

# Example contents of GCE vars/mydomain.yml file values
gce_service_account_email: 13_digit_acct_number-compute@developer.gserviceaccount.com
gce_pem_file: ~/.ssh/my-account-key.pem
gce_project_id: my-project-id

gluster.sh script

This script collects the arguments needed to perform any action against a provider and coordinates the execution of the Ansible playbooks. Both multi-step actions that will configure everything in a single command and the ability to run or re-run individual playbooks are allowed.

The --vars argument is used to specify a local configuration override file you create in the vars/ directory with the settings you wish to override for your cluster configuration.

Build and Configuration Examples

# Run the inventory listing for the specified provider
./gluster.sh --provider gce --action list-inventory

# Run a multi-step action
./gluster.sh --prefix dev --provider gce --vars mydomain.yml --action build-all

# Run a specific playbook (omit the provider prefix in the name)
./gluster.sh --prefix dev --provider gce --vars mydomain.yml --action provision

# You can add the --verbose to get extra output from Ansible to assist debugging issues
./gluster.sh --prefix dev --provider gce --vars mydomain.yml --action configure --verbose

# Ping all the nodes to check to see if they are reachable
./gluster.sh --prefix dev --provider gce --vars mydomain.yml --action ping

# Use the default (aws) provider and list the nodes and public IPs of the storage nodes
./gluster.sh --prefix dev --vars mydomain.yml --action info

Gluster Volume Configuration

The documentation on the various configuration is very detailed. Some common options are shown below. Once you’ve connected to a storage node there is an inventory file located at /root/bricks.txt that will assist you with the machine and volume information needed to run the commands shown below.

# Add peers
gluster peer probe <host name>

# Peer status
gluster peer status

# A three brick dispersed volume
gluster volume create gv0 \
    disperse 3 redundancy 1 \
    t7-storage-95f4c9:/bricks/xvdf/gv0 \
    t7-storage-a89e9f:/bricks/xvdf/gv0 \
    t7-storage-2e99cb:/bricks/xvdf/gv0

# Start and status the volume
gluster volume start gv0
gluster volume info gv0

Volume Name: gv0
Type: Disperse
Volume ID: 71970221-fc75-48d9-8205-bb3f389500d2
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: t7-storage-95f4c9:/bricks/xvdf/gv0
Brick2: t7-storage-a89e9f:/bricks/xvdf/gv0
Brick3: t7-storage-2e99cb:/bricks/xvdf/gv0
Options Reconfigured:
performance.readdir-ahead: on

Version

This documentation was generated for gluster-ansible version 0.0.1-SNAPSHOT from commit b7a6008370ba12ee907302b6271847b3091b8559.