Batch vs. Stream Processing: Which Should You Choose and When?
- Nick Piette joined Talend in 2017 as the Director of Evangelism. In this role, Nick is responsible for providing thought leadership, brand awareness and community outreach. Prior to Talend, Nick held roles in customer support, technical presales and product management at a leading integration company.
- February 01, 2018
We all know that enterprise data needs change constantly, and recently that change has come at an increasing pace. Companies that once processed all their big data on-prem have suddenly moved to the cloud. Frameworks we used to know and love have suddenly become obsolete. However, one debate that still rages on is how to get data processed faster. There are generally two heralded ways of processing data today:
- Batch Processing
- Stream Processing
Batch processing deals with non-continuous data. It’s fantastic at handling data sets quickly, but it doesn’t really get near the real-time requirements of most of today’s businesses. Stream processing, by contrast, deals with continuous data and is really the golden key to turning big data into fast data.
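To make the distinction concrete, here is a minimal sketch in plain Python. It is only an illustration, not how any particular framework or Talend component works. The function names (`batch_average`, `stream_average`) and the sample readings are hypothetical. The batch version waits for the whole data set before producing one answer, while the streaming version updates its answer as each record arrives.

```python
from typing import Iterable, Iterator, Sequence


def batch_average(readings: Sequence[float]) -> float:
    """Batch: the full data set is collected first, then processed in one pass."""
    return sum(readings) / len(readings)


def stream_average(readings: Iterable[float]) -> Iterator[float]:
    """Stream: emit an updated result for every record, keeping only running state."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        # A result is available immediately after each record arrives.
        yield total / count


# Hypothetical sample data standing in for sensor readings or events.
readings = [10.0, 20.0, 30.0, 40.0]

# Batch: one answer, available only after all the data is in.
batch_result = batch_average(readings)

# Stream: an answer after every record; iter() stands in for a live feed.
stream_results = list(stream_average(iter(readings)))

print(batch_result)
print(stream_results)
```

Both approaches end at the same final value. The difference is when results become available and how much data has to be held before any processing can start.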
Each approach has its pros and cons. At the end of the day, your choice of batch or streaming all comes down to your business use case. However, there are questions and use cases to consider here when selecting your data processing approach. In our latest episode of Craft Beer and Data, Mark Balkenende and I dove deep into the debate of batch vs. streaming.
We answered some interesting questions like “Is data ever really real-time?” We also debated whether the lambda architecture is really dead and sifted through some considerations you should take into account when deciding between batch and stream processing.
Before we jump into the video (small plug), we are taking Craft Beer and Data on the road! Check out our events page and come attend an event in your area. We’d also love to hear your thoughts on the batch vs. streaming debate. Tweet me your thoughts @Nick_Piette.