Version: 3.0.1

BigTop Docker provisioner

Overview

The Docker Compose definition and wrapper script create a Bigtop virtual Hadoop cluster on top of Docker containers for you, by pulling packages from the existing published Bigtop repositories. This cluster can be used:

  • to run the Bigtop smoke tests
  • to test the Bigtop Puppet recipes
  • to run integration tests with your application

This setup has been verified on Docker Engine 1.9.1 (API version 1.15) and Docker Compose 1.5.2 on the Amazon Linux 2015.09 release.

Prerequisites

OS X and Windows

Install a Docker distribution for your platform so that the docker and docker-compose commands are available.

Linux

Make sure the Docker daemon is running:

service docker start
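The --env-check command described below verifies the environment for you; as a quick manual sanity check, a sketch like the following can confirm the required tools are on your PATH (the exact tool list here is an assumption):

```shell
#!/bin/sh
# Hedged sketch: check that the tools the provisioner relies on are
# installed. The list of tools checked here is an assumption.
require() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

require docker || true          # Docker Engine CLI
require docker-compose || true  # Docker Compose wrapper
```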

Usage

  1. Create a Bigtop Hadoop cluster with the given number of nodes.
./docker-hadoop.sh --create 3
  2. Destroy the cluster.
./docker-hadoop.sh --destroy
  3. Get into the first container (the master).
./docker-hadoop.sh --exec 1 bash
  4. Execute a command on the second container.
./docker-hadoop.sh --exec 2 hadoop fs -ls /
  5. Update your cluster after making configuration changes under ./config (re-runs puppet apply).
./docker-hadoop.sh --provision
  6. Run the Bigtop smoke tests.
./docker-hadoop.sh --smoke-tests
  7. Chain your operations within one command.
./docker-hadoop.sh --create 5 --smoke-tests --destroy

The commands will be executed in the following order:

create a 5-node cluster => run smoke tests => destroy the cluster
  8. See the help message:
./docker-hadoop.sh -h
usage: docker-hadoop.sh [-C file ] args
-C file Use alternate file for config.yaml
commands:
-c NUM_INSTANCES, --create NUM_INSTANCES Create a Docker based Bigtop Hadoop cluster
-d, --destroy Destroy the cluster
-e, --exec INSTANCE_NO|INSTANCE_NAME Execute command on a specific instance. Instance can be specified by name or number.
For example: docker-hadoop.sh --exec 1 bash
docker-hadoop.sh --exec docker_bigtop_1 bash
-E, --env-check Check whether the required tools have been installed
-l, --list List out container status for the cluster
-p, --provision Deploy configuration changes
-s, --smoke-tests Run Bigtop smoke tests
-h, --help
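The chaining shown in step 7 lends itself to automated runs. The following is a hedged sketch (not part of the provisioner) that creates a cluster, runs the smoke tests, and guarantees teardown even when the tests fail; the DOCKER_HADOOP variable and the existence guard are illustrative assumptions:

```shell
#!/bin/sh
# Hedged sketch of a CI-style run: create a cluster, run the smoke
# tests, and always destroy the cluster afterwards.
# DOCKER_HADOOP is an assumption; point it at your checkout.
DOCKER_HADOOP=${DOCKER_HADOOP:-./docker-hadoop.sh}

run() { echo "+ $*"; "$@"; }    # print each step before running it

if [ -x "$DOCKER_HADOOP" ]; then          # guard so the sketch is safe to copy
  run "$DOCKER_HADOOP" --create 3
  trap 'run "$DOCKER_HADOOP" --destroy' EXIT   # teardown on any exit
  run "$DOCKER_HADOOP" --smoke-tests
fi
```

Using trap ... EXIT instead of chaining --destroy on the command line means the cluster is cleaned up even if the smoke tests abort the script.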

Configurations

Several parameters can be configured in config.yaml:
  1. Modify memory limit for Docker containers
docker:
  memory_limit: "2g"

  2. Enable the local repository

If you've built packages using a locally cloned Bigtop and produced the apt/yum repo, set the following to true to deploy those packages:

enable_local_repo: true

Configure Apache Hadoop ecosystem components

  • Choose the ecosystem components you want to deploy by modifying components in config.yaml:
components: "hadoop, hbase, yarn,..."

By default, Apache Hadoop and YARN will be installed.
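Putting the settings above together, a minimal config.yaml might look like the following sketch (the key names are taken from the snippets on this page; the exact structure may vary between Bigtop releases):

```yaml
# Memory cap for the Docker containers
docker:
  memory_limit: "2g"

# Deploy packages from a locally built repo instead of the published ones
enable_local_repo: false

# Ecosystem components to deploy; Hadoop and YARN are installed by default
components: "hadoop, hbase, yarn"
```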

Experimental

With recent OS versions, such as Debian 11 and Fedora 35, cgroup v2 is enabled by default. Running Docker Compose on such hosts seems to require different settings: for example, mounting /sys/fs/cgroup:ro into the containers breaks systemd and dbus when they are installed and started in the container. The docker-hadoop.sh script offers an option, -F, to load a different configuration file for Docker Compose (by default, docker-compose.yml is picked up). The configuration file to load in this case is docker-compose-cgroupsv2.yml. More info in BIGTOP-3614 and BIGTOP-3665.
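For example, a sketch of using -F on a cgroup v2 host (assuming you run it from the provisioner directory):

```shell
#!/bin/sh
# On a cgroup v2 host (e.g. Debian 11, Fedora 35), pass the alternative
# Compose file via -F when creating the cluster.
COMPOSE_FILE=docker-compose-cgroupsv2.yml
if [ -x ./docker-hadoop.sh ]; then   # guard so the sketch is safe to copy
  ./docker-hadoop.sh -F "$COMPOSE_FILE" --create 3
fi
```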