Month: February 2018

The Paradise Papers: How the Cloud Helped Expose the Hidden Wealth of the Global Elite

In early 2016, the International Consortium of Investigative Journalists (ICIJ) published the Panama Papers, one of the biggest tax-related data leaks in recent history, involving 2.6 terabytes (TB) of information. It exposed the widespread use of offshore tax havens and shell companies by thousands of wealthy individuals and political officials, including the British and Icelandic Prime Ministers. Now if […]


Talend vs. Spark Submit Configuration: What’s the Difference?

In my previous blog, “Talend and Apache Spark: A Technical Primer”, I walked you through how Talend Spark jobs equate to Spark Submit. In this blog post, I want to continue comparing Talend Spark configurations with Apache Spark Submit. First, we are going to look at how you can map the options in the Apache Spark […]


How to Structure Your Business to Make Better Use of Data

A few years ago, Starbucks’ director of analytics and business intelligence, Joe LaCugna, said the Seattle coffee giant once struggled to make sense of the data pouring in from its loyalty card holders, who at the time numbered over 13 million and accounted for 36 percent of all Starbucks’ transactions. The same was true of […]


Net Neutrality: Why it’s Vital for Digital Transformation

Until a few months ago, it was thought that the issue of net neutrality had been definitively settled by the ruling of the Federal Communications Commission (FCC) in 2015; however, that all changed with the new Trump administration and statements by the new FCC chairman – just reappointed for 4 years by the US […]


CIOs: Three Considerations for Digital Transformation

Many businesses today are scrutinizing their operations to figure out how to join the digital transformation revolution. They understand that to become more competitive and customer-centric, they need processes that are flexible, integrated, insightful and scalable. They understand that harnessing data and infusing business processes with it is the key to success. Unfortunately, poor data […]


Legacy Versus Next-Generation – How Open Source is Driving the Big Data Market

When it comes to solutions for the big data sector, there is a clear split between the legacy and next-generation approaches to software development. Legacy vendors in this space generally have their own large internal development organizations, dedicated to building proprietary, bespoke software. It’s an approach that has worked well over the years. However, the […]


Talend Step-by-Step: Continuous Data Matching & Machine Learning with Microsoft Azure

Today, almost everyone has big data, machine learning and cloud at the top of their IT “to-do” list. The importance of these technologies can’t be overemphasized, as all three are opening up innovation, uncovering opportunities and optimizing businesses. Machine learning isn’t a brand new concept; simple machine learning algorithms actually date back to the 1950s, though […]


Batch vs. Stream Processing: Which Should You Choose and When?

We all know that enterprise data needs change constantly, and recently that change has come at an increasing pace. Companies that were once processing all their big data on-prem have suddenly moved into the cloud. Frameworks we used to know and love have suddenly become obsolete. However, an interesting debate that still rages on is how to […]