Is the SDLC Relevant to Data-Centric Projects?

Originally published 4 August 2010

The Systems Development Life Cycle, more recently known as the Software Development Life Cycle, is a central organizing paradigm in information technology (IT). Better known simply as the SDLC, it has its origins in the Cambrian explosion of "data processing" back in the 1960s. At that time, the focus was on automating business processes that had hitherto been performed manually. In those early days, the automation was always to be achieved by building custom-developed software.

Today, things are very different. Very few manual processes remain to be automated; and if new business processes are conceived, they are expected to be automated from the start. But, more importantly, data has replaced process as the focus of business interest. We have shifted from a process-centric view in application development to a data-centric view. Yet the SDLC, or its derivatives, still remains a central organizing paradigm. The question naturally arises: if so much has changed, does the SDLC really apply to what we do today as much as it did five decades ago?

This question is not merely of academic interest. First, we all want to employ the most appropriate methodology to get the best results from our activities; and if the SDLC no longer really applies, we had better understand why not. Secondly, the SDLC has become so entrenched that it is taken as a given by auditors and risk managers. These groups can bring enormous pressure to bear on IT to comply with the SDLC. But what if there are development tasks that the SDLC may not apply to, and for which it might actually be misleading?

What is the SDLC?

Before we can examine this question we need to briefly review what the SDLC consists of. The SDLC divides a software development project up into a number of phases. These are roughly:
  • Requirements gathering
  • Analysis
  • Design
  • Programming
  • Testing
  • Production implementation
  • Production support and maintenance
  • Obsolescence
In the original "waterfall" approach, these phases were carried out in the above sequence, and one phase had to be more or less completed before the next phase began. A vast amount has been written about these phases; and rather than go into such detail here, we are going to assume that the reader is generally familiar with it.

Of course, there have been many variations on the SDLC over the decades. Sometimes the phases have been split into more detailed ones, and sometimes they have been combined into more general ones. The Agile methodologies have replaced the idea of a serial progression through the phases by the iteration of the phases for smaller elements of the project, thus delivering pieces that the users can review earlier in the overall project timeline.

None of these variations, however, has really replaced the SDLC in any radical way. Rather, they have adapted it. Fervent adherents of some of these variations might object to this characterization, and perhaps there is a debate worth having on this point. However, I will maintain that the SDLC has not fundamentally changed.

The Principles of the SDLC

While the phases of the SDLC are well understood, the principles on which it is founded are much less frequently discussed. As a result, they are more obscure, but they are very important because they define the overall framework in which the SDLC exists. I would suggest that these are:
  • The assumption that a business process exists that is to be automated.
  • The assumption that individuals or documents will be able to supply all details necessary for requirements and analysis.
  • Details will be supplied for the inputs, processing, and outputs of the application.
  • An orientation to building something that is final, in the sense that the building comes to an end before the application is used.
But do these really match the principles of data-centric application development? I would suggest that the latter are:
  • The data that is required for the application already exists. It will not be data that is newly produced by the automation of a hitherto un-automated business process.
  • No individuals or documents will be able to provide the necessary understanding of the source data.
  • The requirements that will be presented will be information requirements that dictate what the application should produce. They will essentially be for the outputs. There may be some details provided on inputs and processing, but these will be far from complete.
  • The application is susceptible to unexpected changes in the data from the sources that supply it, and this needs to be managed.
Now it is possible to argue about both sets of principles. Are they really essential, as principles should be, or are they of secondary importance and not very significant? If the principles are not really principles, then there is a case for suggesting that process-centric projects are not really different from data-centric projects, and so the SDLC simply applies to all projects equally. Readers will have to decide for themselves whether process-centric and data-centric projects are different. Based on my personal experience over many years in both application development and data management, I think that they really are different and that the principles listed above are essential differences between the two kinds of projects.

Implications for the SDLC

So if we accept that the differences are real, what does it mean for the SDLC? Let us look at some of the differences.
  • Requirements. A process-centric project will need to capture how users want processes to work. This will often include some degree of business process reengineering. A data-centric project will need to understand what data the users want to analyze, and how they want to analyze it. Process, to the extent it is required in data-centric projects, is much more about the orchestration of data acquisition. Orchestration of data acquisition is not something that any user has much interest in. Thus, requirements gathering can be expected to proceed in quite different ways in the two kinds of project. What is meant by "requirements" in the SDLC is not really the same as "requirements" in data-centric projects.
  • Analysis. The source data needed for a data-centric project will need to be understood. This is a very different task from finding out from a user how a manual process works. Indeed, perhaps the only thing that process-centric and data-centric applications have in common is the use of the term "analysis" to describe a phase in each of them. Source data analysis, with discovery, profiling and semantic investigation, is trying to solve a set of problems that simply do not arise in a process-centric project. And it is using methods that are not really needed in process-centric projects.
  • Production Support and Maintenance. In application-centric projects the assumption is that if everything has been built to meet the users' requirements, and has been tested successfully, then it should function without problem in production. If issues arise, users can be expected to report them. But in data-centric projects, there is no real control over the source data. Problems can arise in it at any moment that have nothing to do with the data-centric application. These problems cannot be anticipated in advance. Thus, the source data should be continuously monitored. The work of figuring out what to monitor is something that does not end with a given phase of the project.
There are other differences too, but these are the ones that seem to me to be important, based on my experience.
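
To make the profiling and monitoring points above more concrete, here is a minimal sketch in Python. It is an illustration only, not a method taken from this article: the function names, the 5% threshold, and the sample "country" data are all hypothetical assumptions. The idea is simply that a profile of the source data captured during analysis can serve as a baseline, and each later delivery from the source can be compared against it to flag unexpected changes.

```python
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Summary of one column in a source extract (hypothetical structure)."""
    row_count: int
    null_rate: float
    distinct_values: frozenset


def profile_column(rows, column):
    """Profile one column: row count, share of missing values, distinct values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None and v != ""]
    missing_count = len(values) - len(present)
    null_rate = missing_count / len(values) if values else 0.0
    return ColumnProfile(len(values), null_rate, frozenset(present))


def detect_drift(baseline, current, max_null_increase=0.05):
    """Return warnings where `current` departs from `baseline`.

    The 0.05 default is an arbitrary illustrative threshold.
    """
    warnings = []
    if current.null_rate - baseline.null_rate > max_null_increase:
        warnings.append(
            f"null rate rose from {baseline.null_rate:.0%} to {current.null_rate:.0%}"
        )
    new_values = current.distinct_values - baseline.distinct_values
    if new_values:
        warnings.append(f"unseen values: {sorted(new_values)}")
    return warnings


# Made-up extracts standing in for two deliveries from a source system.
baseline_rows = [
    {"country": "US"}, {"country": "GB"}, {"country": "US"}, {"country": "DE"},
]
todays_rows = [
    {"country": "US"}, {"country": ""}, {"country": "UK"}, {"country": "DE"},
]

baseline = profile_column(baseline_rows, "country")
today = profile_column(todays_rows, "country")
for warning in detect_drift(baseline, today):
    print(warning)
```

In this sketch the second delivery contains a blank value and a code ("UK") that never appeared in the baseline, so both checks raise a warning. Neither problem originates in the data-centric application itself, which is why a check of this kind would have to keep running after the project's build phase has ended.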

In conclusion, I think that the detailed application of the SDLC to data-centric projects is misleading and that a different kind of application development methodology needs to be worked out for these kinds of project. Only at the very highest levels of abstraction is there commonality (e.g., saying that "analysis" is required in both process-centric and data-centric projects without understanding that "analysis" is very different in the two kinds of project). Hopefully, in the coming years these issues will become better understood.

  • Malcolm Chisholm

    Malcolm Chisholm, Ph.D., has more than 25 years of experience in enterprise information management and data management and has worked in a wide range of sectors. He specializes in setting up and developing enterprise information management units, master data management, and business rules. His experience includes the financial, manufacturing, government, and pharmaceutical industries. He is the author of the books How to Build a Business Rules Engine; Managing Reference Data in Enterprise Databases; and Definition in Information Management. Malcolm writes numerous articles and is a frequent presenter at industry events. He runs the websites http://www.refdataportal.com; http://www.bizrulesengine.com; and http://www.data-definition.com. Malcolm is the winner of the 2011 DAMA International Professional Achievement Award.

    He can be contacted at mchisholm@refdataportal.com.
    Twitter: MDChisholm
    LinkedIn: Malcolm Chisholm