Big Data Blog
I joined Talend earlier this month to take ownership of Product Marketing for the Data Governance solutions. And there are a couple of reasons for that.
First, after seven years working for a consulting and systems integration firm, I wanted to go back to the software industry (I worked at SAP for seven years), a very competitive industry fueled by innovation. In this context, my new role gives me the opportunity to stay close to the demand side, so that I can sense and respond to our customers’ needs and related business stakes. At the same time, with my counterpart in product management and our R&D team, I can influence the future of our MDM and Data Quality platforms and the data governance market. I also aspire to co-create new solutions together with our customers and partners. By the way, through this blog post, let me call for feedback from our existing customers and partners; I’m happy to get your insights on our current solutions and how you feel we should improve them in the future.
After the European editions last month, Gartner last week ran its North American Business Intelligence & Analytics Summit and Enterprise Information & Master Data Management Summit, both at The Venetian Hotel in Las Vegas (a welcome change from Caesars Palace!). Both were extremely well attended: over 2,000 attendees for the former and 800 for the latter. From what I hear, these were all-time records. European numbers were also impressive, maxing out the capacity of the London venue; I understand Gartner is now planning to add a Fall event (in Germany).
Compared to last year, there was a shift in the theme of the MDM Summit. The change of name actually says it all: it’s no longer just MDM, it’s now Enterprise Information and MDM. This is a very logical move, since MDM is not an end in itself, but a means to achieve proper governance of enterprise information. Similarly, Gartner is considering refocusing the BI summit on analytics, as a question in the evaluation form suggested.
Following the post The Fascinating Ecosystem of Connected Objects in which we discussed the objects themselves, I would like to take a look at the other aspect of the AFDEL meeting dedicated to the Internet of Things. The operative word in “connected objects” is “connected”. In order for objects to be connected, there needs to exist an infrastructure, hence an ecosystem. Long gone are the days of vertically integrated technologies where a single provider would manage and operate the solution end-to-end.
As the presentations showed, several connectivity options are available.
As chairman of the Big Data Commission of AFDEL – the French software vendors association – I was privileged to moderate a breakfast meeting focused on the Internet of Things, and more specifically Connected Objects. We had four high-quality presenters, representing distinct parts of the connected objects ecosystem, and what they revealed was simply fascinating.
As we were preparing the meeting, one thing struck me. Three of our four presenters reacted initially along the lines of: “Big data? We don’t have big data! Our datasets are really small, and that’s on purpose.” And indeed, each dataset is very small. Four measures every 15 minutes. 12-byte messages. This is what connected objects work with. Far from the verbose web logs, detailed point-of-sale transactions or millions of tweets we are used to dealing with in our IT world. But if you take these 4 measures every 15 minutes, multiply them by 96 (the number of 15-minute intervals in a day), then by 365, that makes 140,160 data points per year – per object. 10,000 objects? That’s about 1.4 billion records per year.
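For readers who want to check the arithmetic, here is a minimal sketch in Python. The function and constant names are mine, chosen for illustration; the only inputs taken from the presenters are the four measures, the 15-minute interval and the 10,000 objects.

```python
# Back-of-the-envelope volume estimate for connected objects.
# Names are illustrative; the figures come from the paragraph above.

MEASURES_PER_READING = 4       # four measures per reading
READING_INTERVAL_MINUTES = 15  # one reading every 15 minutes
DAYS_PER_YEAR = 365


def points_per_object_per_year() -> int:
    """Data points a single object produces in one year."""
    readings_per_day = (24 * 60) // READING_INTERVAL_MINUTES  # 96 readings
    return MEASURES_PER_READING * readings_per_day * DAYS_PER_YEAR


def fleet_points_per_year(object_count: int) -> int:
    """Data points produced in one year by a fleet of identical objects."""
    return object_count * points_per_object_per_year()


if __name__ == "__main__":
    print(f"{points_per_object_per_year():,} points per object per year")
    print(f"{fleet_points_per_year(10_000):,} points per year for 10,000 objects")
```

Changing the interval or the fleet size shows how quickly "small" data adds up: the per-object figure stays modest, but the total scales linearly with the number of objects deployed.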
Last week we hosted ten of our top customers in Miami for a meeting of our Customer Advisory Board, and I wanted to share here some of the highlights of this meeting for me. In a lot of ways this CAB meeting reflected how far we’ve come as a company in the 8 years we’ve been in business. Interspersed with our engineering team sharing our roadmap and getting feedback on some potential future priorities, each customer told their story about why they chose Talend, described their deployment, and gave us their feedback on their experience and their suggestions. As they told their stories I started to realize how remarkable each of them was, and how as a group they added up to something truly extraordinary.
A lot has been written on the origins of Valentine’s Day, and I would venture to say that the Wikipedia page on the topic could count as big data on its own. But the fact remains that in modern times, February 14th has become the day when people in most countries celebrate love and/or friendship. Because of these celebrations, Valentine’s Day is a big data day, too.
Perfectly situated between Christmas and Easter, Valentine’s Day is a great opportunity for marketers to sell stuff. In the weeks leading up to the holiday, most consumers have been bombarded with offers for romantic weekend escapades, romantic dinners, romantic cruises, romantic chocolate, romantic lingerie, romantic flowers, romantic grocery bags, romantic snow shovels (ok, maybe I am going too far but you get the gist). A few years back, you would be getting the snow shovel offers even if you lived in Hawaii, and the chocolate offers even if you (or in this case, your significant other) had diabetes. Thankfully, big data has changed everything, and the granular consumer analytics that retailers are using now mean you only get the best targeted offers.
Last week, Forrester Research released the 2014 edition of their Master Data Management Wave. Unlike Gartner, which continues to treat the MDM market as two separate sub-markets – MDM of Customer Data and MDM of Product Data – Forrester has elected to evaluate the MDM market from a multi-domain angle.
Of course, Gartner’s position is perfectly defensible, and its analysts are very clear about the rationale (I had long discussions on this topic with Bill O’Kane): the majority of their enterprise customers still look at entering MDM through one of the two primary domains (Customer, aka “Party”, or Product, aka “Thing”). They are looking at obtaining a 360-degree view of their customers, for example. Or at mastering their bill of materials. Or any other domain-specific application. These enterprises take a project-driven approach to MDM (which is not to say that these projects are not super strategic and expensive).
Many who live in large cities carry a contactless public transportation pass in their wallet (Oyster in London, Navigo in Paris, Octopus in Hong Kong…), whose primary use is to let its bearer through the turnstiles of the metro or onboard a bus, and to charge this bearer for the journey. I say primary use, but that’s really the only use that the travelling public sees of the card.
Still, records subpoenaed from the transportation authority have in some cases been produced in court to prove (or disprove) the presence of an individual at a certain location at a certain time. Conspiracy theorists will probably argue that the NSA is tracking all movements of the bearers of such cards anyway – after all, it’s only another database to gain access to, and more big data volumes to track…
Transportation contactless cards are not only used to commute to work. For some years, they have also been adopted by ski resorts to replace lift tickets. Again, only used to gain access to the lifts, right?
Gartner predicts that by 2017, the CMO (Chief Marketing Officer) will spend more on IT than the CIO. Not all analysts agree with this bold prediction, but still Forrester tells Forbes that “CMOs must accept that it’s no longer possible to run the business of marketing without technology.” So the question is not whether IT fuels marketing – it’s a given for all experts – but whether Marketing can/should run their own “shadow IT”, or work with IT.
I recently stumbled upon this chart of a whopping 947 different companies that provide software for marketers, organized into 43 categories across 6 major classes! Being a marketer myself, I know (and use) some of these technologies, but I have to admit I had no idea of how rich the ecosystem was. Mind you, not all of these are pure “business” applications: you’ll find in there the middleware stack, iPaaS technologies, API management, databases – and of course, the now-pervasive big data stack, of which Talend is a key part alongside technologies such as Hadoop, Splunk, or NoSQL databases.
Gartner defines Operational Technology (OT) as "hardware and software that detects or causes a change through the direct monitoring and/or control of physical devices, processes and events [in the enterprise]." In other words, OT is the computers (and their software) that exist outside of the IT world.
Take a thermostat, for example. It is used to regulate the temperature in a home or office building. Now, think beyond a dumb thermocouple or mercury thermostat, and consider a cool, modern thermostat that looks and works like a phone app. This is Operational Technology, not Information Technology, right? How can one think of a thermostat in the context of IT?