April 9, 2018

Near Real-Time vs. Real-Time Analytics

Jacob Cohen
Director IT & Operations at HarperDB

People like to talk a lot about how everything is real-time. It’s a popular buzzword, and it absolutely should be: what could be more important than having data in real time? No one wants to wait hours, days, or even weeks to make an informed decision. Data is no good if you can’t transform it into actionable information and make up-to-the-minute business decisions.

Real-Time Analytics

What people consider to be truly real-time is up for debate, but in general real-time is considered to be within seconds. I’m from the Washington DC area, so naturally I’ve been surrounded by government contractors for as long as I can remember. When I think real-time, I think defense and intelligence. The first example I think of is missile defense, where true real-time means the system has time to identify a missile and launch countermeasures before impact. This is absolutely mission critical. The data must be real-time or there are serious consequences. Another, slightly less intense example is autonomous vehicles. For the vehicle’s autonomous control system to make decisions, the sensor data must be received and processed in true real-time; otherwise the consequences could be dire. Both of these examples involve small-scale data that is computed locally in an isolated system. This is the typical case for existing real-time systems. It has proven very difficult to provide real-time analytics on big data. So what we end up with is…

Near Real-Time Analytics

When we say near real-time, we mean almost now: within a few minutes. This is fine for most cases, but fine isn’t great. We’ve grown accustomed to settling for near real-time responses because it’s too expensive, too difficult, or both to return real-time intelligence on most datasets. I’ve worked on a few projects where everyone seemed to accept that we just had to run batch reports every hour or so and return the data to the consumer. Personally, I’ve never understood why people accepted that. The business accepted it because IT told them that was the best they could do. It’s an all-too-common problem, and one that we can fix.

Why We’re Here

The HarperDB founders come from a world where they needed real-time responses on big datasets, and they simply couldn’t achieve that on a reasonable budget without spending their days maintaining the system. They talked to the best in the business. Everyone told them that their existing setup, a big data database plus analytics software running on gigantic servers, was the right thing to do and that they were cutting edge. But when they were dealing with Twitter-sized datasets, the best they could do was near real-time, with responses taking a few minutes. They needed response times closer to 30 seconds. Eventually they got fed up and decided there had to be a better way. After months of discussions they finally cracked the code and invented our patent-pending data model. HarperDB is built with analytics in mind. The moment a transaction is executed, it is available for full-scale analytics. We immediately transact to disk, with no middleman. Load data however you’d like and you’ll be running full-on SQL aggregations the instant the transactions write.
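To make the difference concrete, here is a minimal sketch of the two approaches. It uses Python’s built-in sqlite3 module purely as a stand-in; it is not HarperDB’s API and says nothing about how HarperDB works internally. The table names and helper functions are hypothetical, made up for illustration. The batch path reads from a report table that is only refreshed when the job runs, while the live path runs the SQL aggregation directly against the transactions as they are written.

```python
import sqlite3

# Stand-in database: SQLite in memory (an assumption for illustration,
# not HarperDB). Table and function names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (topic TEXT, sentiment REAL)")
conn.execute("CREATE TABLE hourly_report (topic TEXT, avg_sentiment REAL, n INTEGER)")


def record_event(topic, sentiment):
    """Write one transaction."""
    conn.execute("INSERT INTO events VALUES (?, ?)", (topic, sentiment))
    conn.commit()


def run_batch_report():
    """The 'every hour or so' job: rebuild the report table from scratch."""
    conn.execute("DELETE FROM hourly_report")
    conn.execute(
        "INSERT INTO hourly_report "
        "SELECT topic, AVG(sentiment), COUNT(*) FROM events GROUP BY topic"
    )
    conn.commit()


def batch_count(topic):
    """What a dashboard sees if it reads the batch report."""
    row = conn.execute(
        "SELECT n FROM hourly_report WHERE topic = ?", (topic,)
    ).fetchone()
    return row[0] if row else 0


def live_count(topic):
    """What a dashboard sees if it aggregates the transactions directly."""
    row = conn.execute(
        "SELECT COUNT(*) FROM events WHERE topic = ?", (topic,)
    ).fetchone()
    return row[0]


record_event("launch", 0.8)
run_batch_report()           # the report now reflects one event
record_event("launch", 0.2)  # these two arrive after the batch job ran
record_event("launch", 0.5)

print("batch report count:", batch_count("launch"))
print("live query count:  ", live_count("launch"))
```

In this toy setup, the batch report stays stale until the next job runs, while the live query reflects every committed write. On a small table either approach is cheap; the hard part described above is keeping the live path fast on very large datasets.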

Real-Time Analytics on Big Data

Imagine the possibilities if we could easily achieve real-time analytics on big datasets. Marketing firms could predict consumer sentiment within seconds of a topic trending on Twitter. Traffic congestion could be detected by aggregating cell phone data from across a metro area. The possibilities are endless. Personally, I’d be thrilled just to be able to tell customers that they can click a dashboard and see what’s really going on. No more excuses about how and why people can’t have up-to-the-minute information. Sometimes real-time is actually mission critical; sometimes it’s just nice to have. To me, everything should always be real-time. As far as we’ve come with technology, we still have to tell people to create batch jobs just to get an aggregated dataset back. I realize I keep harping on this, but it’s unbelievable to me that we have not been able to advance here easily. That problem can now be a thing of the past with HarperDB. Contact us to learn more about how we can help make real-time analytics a possibility today.
