The Future of Apache Beam, Now a Top-Level Apache Software Foundation Project


 

Our journey to this day started 10 months ago, and what an exciting road it has been.

In February 2016, Google, Talend, Cloudera, data Artisans, PayPal and Slack joined forces to propose Apache Beam (see “Introduction to Apache Beam”) to the Apache Incubator, the entry path into The Apache Software Foundation (ASF).

The numbers are pretty impressive. During the incubation period for Apache Beam we saw:

  • More than 1600 pull requests created, resulting in more than 4500 commits.
  • More than 1000 tickets created and fixed.
  • More than 100 contributors to the code.
  • Three releases, made by three different release managers.

Beyond the numbers, it was great to see how the Apache Beam community grew and rallied behind this project. After the “legacy” players got involved, new contributors also joined the Apache Beam community.

Thanks to the design and approach of Apache Beam, we interacted with a wide range of other projects, including Apache projects (Kafka, Cassandra, Avro, and Parquet) and non-Apache projects such as Elasticsearch, Kinesis, and Google. We are eager to see additional contributions and feedback that will keep improving the capabilities of Apache Beam.

As a mentor to the Apache incubation podling, I can honestly say the team is awesome to work with. They are very open-minded, eager to help, and committed to this project. I am equally committed to contributing to the Apache Beam project daily. I truly believe that Apache Beam is the next level of streaming analytics and data processing. It is a great choice for both batch and stream processing and can handle bounded and unbounded data sets.

Talend began evaluating Google Dataflow in 2015, and we immediately knew we wanted to get involved: we see Beam as a natural extension of our code-generating platform and a way to provide even greater agility to our customers. By updating the Beam “runner” for any new API changes (including adopting a brand-new framework like Flink or Apex), we get full-fidelity support across the product suite. It is not surprising, then, that Talend has been a very active contributor to the Apache Beam community over the last two years.

Stay tuned for more information on Apache Beam project developments and forthcoming Talend product enhancements in the near future.

About Jean-Baptiste (@jbonofre)

ASF Member, PMC for Apache Karaf, PMC for Apache ServiceMix, PMC for Apache ACE, PMC for Apache Syncope, Committer for Apache ActiveMQ, Committer for Apache Archiva, Committer for Apache Camel, Contributor for Apache Falcon
