With the launch of the newest version of the Big Data Sandbox, we've included a number of changes that make it simpler and more seamless for you to get started building your own big data integration projects.
One of the most exciting changes, but possibly the least visible, is our use of Docker for containerization of many of the underlying components. With the explosion of enterprise movement into the DevOps space, Docker has become a powerful tool for rapid and reliable provisioning and deployment of services and applications. We at Talend are embracing this movement internally and this Sandbox represents our first comprehensive use of Docker to distribute our own evaluation software platform.
We currently leverage Docker to provide the underlying systems needed for the new Sandbox to operate in a stand-alone fashion:
- Apache Spark
- Cloudera Hadoop
- Apache Kafka
By distributing these services with Docker, we remove the burden of keeping the evaluation environment on the "latest and greatest" versions, while letting you choose which components you want to work with. For example, if you are working with Cloudera and don't need Spark or Kafka, you can skip those downloads entirely, which means a smaller package that is up and running faster.
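To make the idea concrete, here is a minimal sketch of how a component-per-container layout might look in a Docker Compose file. The service names, image names, and ports below are placeholders for illustration only; they are not the actual Sandbox distribution.

```yaml
# Illustrative docker-compose.yml — one service per big data component.
# Image names and ports are hypothetical, not Talend's actual images.
services:
  cloudera:
    image: example/cloudera-quickstart:latest   # placeholder image
    ports:
      - "8888:8888"    # e.g. Hue web UI
  spark:
    image: example/spark:latest                 # placeholder image
    ports:
      - "8080:8080"    # e.g. Spark master UI
  kafka:
    image: example/kafka:latest                 # placeholder image
    ports:
      - "9092:9092"    # Kafka broker
```

With a layout like this, running `docker compose up cloudera` pulls and starts only the Hadoop container, so an evaluator who doesn't need Spark or Kafka never downloads those images.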
Containerization allows us to rapidly prototype and build new functionality into the evaluation environment that we deploy with the Big Data Sandbox, and we are only scratching the surface of what Docker can do for us. We are already looking into ways to deploy Talend's own services in pre-configured containers for even faster, more seamless deployment of evaluation and prototyping environments for our customers. Inside and out, the container revolution is opening up exciting new frontiers for our customers, and we are right there alongside them working with this technology.
Watch the video for a complete overview of getting started with Hadoop on Docker: