The eight must-have elements for resilient big data apps


As big data applications move from proof-of-concept to production, resiliency becomes an urgent concern. Applications that lack resiliency fail when data sets grow too large, offer no transparency into testing and operations, and leave security gaps. As a result, defects must be fixed after applications are in production, which wastes time and money.

The solution is to start by building resilient applications: robust, tested, changeable, auditable, secure, performant, and monitorable. This is a matter of philosophy and architecture as much as technology. Here are the key dimensions of resiliency that I recommend for anyone building big data apps.

1. Define a blueprint for resilient applications

The first step is to create a systemic enterprise architecture and methodology for how your company approaches big data applications. What data are you after? What kinds of analytics are most important? How will metrics, auditing, security and operational features be built in? Can you prove that all data was processed? These capabilities must be built into the architecture.

Other questions to consider: What technology will be crucial? What technology is being used as a matter of convenience? Your blueprint must include honest, accurate assessments of where your current architecture is failing. Keep in mind that a resilient framework for building big data applications may take time to assemble, but is definitely worth it.

2. Size shouldn’t matter

If an application fails when it attempts to tackle larger datasets, it is not resilient. Often, applications are tested with small-scale datasets and then fail, or take far too long, with larger ones. To be resilient, applications must handle datasets of any size (and in your application, size may mean depth, width, rate of ingestion, or all of the above). Applications must also adapt to new technologies. Otherwise, companies are constantly reconfiguring, rebuilding and recoding, which wastes time, resources and money.

3. Transparency and high fidelity execution analysis

With complicated applications, chasing down scaling and other resiliency problems is far from automated. Ideally, it should be easy to see how long each step in a complicated pipeline takes so that problems can be caught and fixed right away. It is critical to pinpoint where the problem is: in the code, the data, the infrastructure, or the network. But this type of transparency shouldn’t have to be constructed for each application; it should be a part of a larger platform so that developers and operations staff can diagnose and respond to problems as they arise.
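To make the idea concrete, here is a minimal sketch of per-step execution timing. The pipeline, step names and toy data are all hypothetical; a real platform would publish these timings to a monitoring system rather than return them in memory.

```python
import time

def run_pipeline(steps, data):
    """Run each named step in order, recording wall-clock time per step
    so a slow stage can be pinpointed instead of guessed at."""
    timings = {}
    for name, step in steps:
        start = time.perf_counter()
        data = step(data)
        timings[name] = time.perf_counter() - start
    return data, timings

# Hypothetical three-step pipeline: parse, filter, aggregate.
steps = [
    ("parse", lambda rows: [r.split(",") for r in rows]),
    ("filter", lambda rows: [r for r in rows if r[1] == "ok"]),
    ("aggregate", lambda rows: len(rows)),
]
result, timings = run_pipeline(steps, ["a,ok", "b,bad", "c,ok"])
```

Because every step is timed through the same harness, the transparency comes from the platform, not from instrumentation hand-built into each application.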

Once you find a problem, it is vital to relate the application behavior to the code — ideally through the same monitoring application that reported the error. Too often, getting access to code involves consulting multiple developers and following a winding chain of custody.

4. Abstraction, productivity and simplicity

Resilient applications tend to be future-proof because they employ abstractions that simplify development, improve productivity and allow substitution of implementation technology. As part of the architecture, technology should allow developers to build applications without miring them in the implementation details. This type of simplicity allows any data scientist to use the application and access any type of data source. Without such abstractions, productivity suffers, changes are more expensive and users drown in complexity.
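One way such an abstraction can look in miniature (the interface and backend names here are illustrative, not any particular product's API): application logic is written against an interface, so the execution engine underneath can be substituted without touching that logic.

```python
from abc import ABC, abstractmethod

class ExecutionBackend(ABC):
    """Hypothetical abstraction layer: applications target this interface,
    so the engine underneath can be swapped without rewriting them."""
    @abstractmethod
    def word_count(self, lines):
        ...

class LocalBackend(ExecutionBackend):
    """A trivial in-process implementation; a cluster engine could
    implement the same interface and be dropped in unchanged."""
    def word_count(self, lines):
        counts = {}
        for line in lines:
            for word in line.split():
                counts[word] = counts.get(word, 0) + 1
        return counts

def run_report(backend, lines):
    # Application logic speaks only to the abstraction, never the engine.
    return backend.word_count(lines)
```

Swapping implementations then means passing a different backend object, not recoding the application.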

5. Security, auditing and compliance

A resilient application comes with its own audit trail, one that shows who used the application, who is allowed to use it, what data was accessed and how policies were enforced. Building these capabilities into applications is the key to meeting the ever-increasing array of privacy, security, governance and control challenges and regulations that businesses face with big data.
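A minimal sketch of what one audit record might capture, assuming a simple append-only JSON log (field names are illustrative): who acted, what they did, which data was touched, and whether policy allowed it.

```python
import json
from datetime import datetime, timezone

def audit_event(user, action, dataset, allowed):
    """Build one append-only audit record: who, what, which data,
    and the policy decision, timestamped in UTC."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "dataset": dataset,
        "policy_allowed": allowed,
    })

# Example: record a denied read of a hypothetical customer dataset.
entry = audit_event("jdoe", "read", "customer_transactions", False)
```

Because every application emits the same record shape, compliance questions become log queries rather than forensic projects.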

6. Completeness and test-driven development

To be resilient, applications have to prove they have not lost data. Failing to do so can lead to dramatic consequences. For instance, as I witnessed during my time in financial services, fines and charges of money laundering or fraud can result if a company fails to account for every transaction because the application code missed one or two lines of data. Execution analysis, audit trails and test-driven development are foundational to proving that all the data was processed properly. Test-driven development means having the technology and the architecture to test application code in a sandbox and getting the same behavior when it is deployed in production. Test-driven development should provide the ability to step through the code, establish invariants, and utilize other defensive programming techniques.
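The completeness requirement can be stated as a code-level invariant: every input record must be either transformed or explicitly rejected, so nothing vanishes silently. A hedged sketch (the transform and error type are placeholders for whatever your application does):

```python
def process_with_completeness_check(records, transform):
    """Process records while proving none were silently dropped:
    each input is either accepted (transformed) or explicitly rejected."""
    accepted, rejected = [], []
    for rec in records:
        try:
            accepted.append(transform(rec))
        except ValueError:
            rejected.append(rec)
    # Invariant: input count == accepted + rejected. A failure here is
    # exactly the "missed one or two lines of data" bug described above.
    assert len(records) == len(accepted) + len(rejected)
    return accepted, rejected

# Example: parse amounts, with one malformed record explicitly rejected.
good, bad = process_with_completeness_check(["1", "x", "2"], int)
```

Run under test-driven development, the same invariant holds in the sandbox and in production, which is what lets you prove every transaction was accounted for.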

7. Application and data portability

Evolving business requirements frequently drive changes in technology. As a result, applications must run on and work with a variety of platforms and products. The goal is to make data, wherever it lives, accessible to the end user via SQL and standard APIs. For example, a state-of-the-art platform should allow data that is in Hadoop and processed through MapReduce to be moved to Spark or Tez and processed there with minimal or no impact to the code.

8. No black arts

Applications should be written in code that is not dependent on an individual virtuoso. Code should be shared, reviewed and commonly owned by multiple developers. Such a strategy allows for building teams that can take collective responsibility for applications.

If companies follow these eight rules, they will create resilient, scalable applications that allow them to tap into the full power of big data.

Supreet Oberoi is vice president of field engineering at Concurrent, Inc. Previously, he served as big data technical evangelist and director of technical delivery at American Express, where he was responsible for conceptualization, design and development of enterprise big data platforms across customer marketing, credit risk, fraud risk, enterprise growth, and other information-based functions within the company on a global basis.


Peter Fretty

Ease of use needs to be near the top. If people cannot understand it, they will never use it. The real value with big data is not in providing something for the data scientists. It is providing tools that allow the business user access to insights that fuel improved actions.

Aren’t these the basics of architecture and application design that have been commonplace best practices for years – with or without the much detested term “big data” appended?
