Overview
This project contains a top-level control script and a set of Ansible playbooks to provision and configure a cluster of GlusterFS storage nodes. Configuration is provided for deploying to either AWS or Google Compute Engine environments.
Environment Setup
Each cloud environment requires some account creation and project configuration to work with your credentials.
Ansible
Being a bit of a Python novice, I ran into a number of PATH-related issues with Ansible installed from either RPM on my Linux machines or Homebrew on my Mac. I got the best results by installing the Python pip installer using the appropriate method for each system and then using pip install --upgrade ansible to get and keep Ansible up to date.
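For example, something along these lines (package names vary by distro; on a Mac, pip comes with the Homebrew Python):
# Install pip, then install/upgrade Ansible with it
sudo yum install python-pip        # e.g. RHEL/CentOS (needs EPEL); on a Mac: brew install python
sudo pip install --upgrade ansible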
Ansible version 2.1 required
Ansible 2.1, which is currently in development, has an EC2 feature to set the delete_on_termination flag when creating storage disks. This prevents you from being left with unattached volumes that you have to delete in a separate step after terminating cluster instances. For the moment this requires running Ansible from source, but v2.1 should be generally available in the near future.
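Until 2.1 ships, running from source looks roughly like this (standard upstream checkout; see the Ansible documentation for details):
# Run Ansible from a source checkout
git clone --recursive https://github.com/ansible/ansible.git
cd ansible
source ./hacking/env-setup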
Amazon Web Services
- Create an API key or run from a server with the appropriate instance role
# Configure key or run from instance with role
export AWS_ACCESS_KEY_ID=FOO
export AWS_SECRET_ACCESS_KEY=BAR
export AWS_REGION=us-east-1
- Download a copy of the ec2.ini configuration file and save it to hosts/aws/ec2.ini
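One way to fetch it (the file ships alongside Ansible's EC2 dynamic inventory script, and the exact path within the Ansible repository may differ by version):
# Fetch ec2.ini and place it where the inventory expects it
curl -o hosts/aws/ec2.ini \
  https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/ec2.ini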
- Create a new configuration file in vars/ which you’ll reference when you run the gluster.sh script
- You can support multiple environments by making more than one config file
- Most EC2 instance and EBS disk settings have defaults and can be overridden in this file; several settings are environment specific and you’ll have to set them for your account
# Example contents of an AWS vars/mydomain.yml file
ec2_subnet_ids: [ 'subnet-c70dd123' ]
ec2_security_groups: [ 'ssh-whitelist', 'all-internal' ]
domain: "mydomain.com"
ec2_keypair: "mydomain.com"
Google Compute Engine
- Create a project in the Google Developers Console
- Create a service account and key through the Google Developers Console API section
- Add an SSH key to the project
- Rename hosts/gce/gce.ini.example to hosts/gce/gce.ini and fill in the values with your account information
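For example:
# Copy the example into place, then edit it with your account details
cp hosts/gce/gce.ini.example hosts/gce/gce.ini   # or mv, if you don't need the example afterwards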
- Create an environment-specific config file like the one described above for AWS and populate it with your account information
# Example contents of a GCE vars/mydomain.yml file
gce_service_account_email: 13_digit_acct_number-compute@developer.gserviceaccount.com
gce_pem_file: ~/.ssh/my-account-key.pem
gce_project_id: my-project-id
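If the key you download from the Developers Console is a PKCS#12 (.p12) file rather than a PEM, one way to convert it into the key referenced by gce_pem_file is with openssl (Google's default passphrase for these keys is notasecret; file names here are placeholders):
# Convert the downloaded .p12 service account key to PEM
openssl pkcs12 -in my-account-key.p12 -passin pass:notasecret -nodes -nocerts | \
  openssl rsa -out ~/.ssh/my-account-key.pem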
gluster.sh script
This script collects the arguments needed to perform any action against a provider and coordinates the execution of the Ansible playbooks. It supports both multi-step actions that configure everything in a single command and running or re-running individual playbooks.
The --vars argument specifies the configuration file you created in the vars/ directory with the settings you wish to override for your cluster.
Build and Configuration Examples
# Run the inventory listing for the specified provider
./gluster.sh --provider gce --action list-inventory
# Run a multi-step action
./gluster.sh --prefix dev --provider gce --vars mydomain.yml --action build-all
# Run a specific playbook (omit the provider prefix in the name)
./gluster.sh --prefix dev --provider gce --vars mydomain.yml --action provision
# Add --verbose to get extra output from Ansible to assist in debugging issues
./gluster.sh --prefix dev --provider gce --vars mydomain.yml --action configure --verbose
# Ping all the nodes to check to see if they are reachable
./gluster.sh --prefix dev --provider gce --vars mydomain.yml --action ping
# Use the default (aws) provider and list the storage nodes and their public IPs
./gluster.sh --prefix dev --vars mydomain.yml --action info
Gluster Volume Configuration
The GlusterFS documentation on the various volume configurations is very detailed. Some common options are shown below. Once you’ve connected to a storage node, an inventory file located at /root/bricks.txt provides the host and volume information needed to run these commands.
# Add peers
gluster peer probe <host name>
# Peer status
gluster peer status
# A three-brick dispersed volume (2 data + 1 redundancy; tolerates the loss of 1 brick)
gluster volume create gv0 \
disperse 3 redundancy 1 \
t7-storage-95f4c9:/bricks/xvdf/gv0 \
t7-storage-a89e9f:/bricks/xvdf/gv0 \
t7-storage-2e99cb:/bricks/xvdf/gv0
# Start the volume and check its status
gluster volume start gv0
gluster volume info gv0
Volume Name: gv0
Type: Disperse
Volume ID: 71970221-fc75-48d9-8205-bb3f389500d2
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: t7-storage-95f4c9:/bricks/xvdf/gv0
Brick2: t7-storage-a89e9f:/bricks/xvdf/gv0
Brick3: t7-storage-2e99cb:/bricks/xvdf/gv0
Options Reconfigured:
performance.readdir-ahead: on