Dev vs. Ops and DevOps

The buzz about DevOps still seems to be dominated by conversations describing what it is.  So here is my description in a simple visual format.

DevOpsCulture

Build a Foundation for DevOps and Automation

Organizations struggling with project delivery, application availability and security maintenance typically also have an IT culture that struggles to understand its own environment architecture. Many of the big dollar investments in CMDB projects, monitoring solutions, Agile processes and DevOps strategies start by building the proverbial walls and roof without ever pouring the foundation.

I believe organizations need to build reference architectures for the organization, applications and IT automation framework. However, since these reference architectures are rare and seem to spur passionate debates when discussed, I’m simply going to articulate my view of IT for others to comment on. (APM diagram posted below).

There are three basic categories for Application Portfolio Management (APM):

  1. Business Environment Applications - provide the strategic business value.
  2. IT Supply Chain Tools – the manufacturing, delivery and maintenance tools for IT.
  3. IT Technology Platforms – technologies that all applications are built on. 

Business Environment Applications:

The Business Environment includes two categories: Business Applications and Shared Service Platforms.

Business Applications represent the end-user interfaces that deliver the business functions or support business processes and have four primary categories.

  • Products are IT applications or services that are sold for revenue generation such as SaaS offerings. This may or may not be relevant to your organization.
  • Front Office Applications are the primary business applications that deliver value for the business and are targets for SLA measurements. These can typically be categorized by business functions such as Marketing Services and Manufacturing Services. However, they may also have categories representing cross-functional departments such as eCommerce Services and CRM Services.
  • Business Intelligence Applications are categorized by the applications that deliver business analysis and reporting functions.
  • Back Office Applications are the corporate functions required to run the business such as Financial Services, HR services and IT Services. In this category, IT Services include the IT end-user services such as Email, VPN, Printers, Desktops and Phones.

Shared Service Platforms represent applications that primarily provide services to other applications. These are typically services that support multiple Business Applications across multiple business departments. Therefore, the availability of these applications has broad business impacts.

  • Frontend Platforms are web proxy farms and repositories for serving media and UI (JavaScript).
  • SOA & Web Services Platforms are typical Service Oriented Architecture solutions, API governance, memory grid and cache solutions, and Business Process Management (BPM) solutions.
  • Backend Application Services represent a wide range of application services and data management solutions. This includes everything from ETL processes, FTP file transfers, batch processing and data replication processes to directory services (AD), identity management, authorization and entitlement services, as well as infrastructure services such as DHCP and DNS.

IT Supply Chain Tools:

IT Supply Chain Tools represent applications used in the development, delivery and maintenance of all other applications. In DevOps terms, these are the applications that represent the IT manufacturing process or the IT manufacturing floor. This is the function of software development, testing, deployments, security, infrastructure management, monitoring, diagnostics and all the lifecycle tools for process management.

  • IT Lifecycle Management Tools provide processes for IT project and program management, requirements gathering, bug and defect tracking, change and release management, incident and problem management.
  • IT Operations Tools provide dashboards, analytics, support and monitoring solutions.
  • IT Security Tools provide security scans, diagnostics, forensics and reporting.
  • IT Systems Tools are utilized for environment management such as “Jump Servers.”
  • IT Deployment Tools provide deployment orchestration, code deployments, configuration management, patch management and infrastructure provisioning.
  • IT QA Tools provide test plans, scripts, functional and load testing solutions.
  • IT Development Tools provide source control, builds and continuous integration.

This group of applications is rarely well defined in IT organizations, and the individual application ownership is distributed between Dev and Ops leadership. In many cases, DevOps initiatives can be simplified down to the creation of small teams that have end-to-end ownership of this delivery chain. And once there is end-to-end ownership, the obvious reaction is to simplify and automate processes. Thus, the adoption of DevOps Engineers and the exploding popularity of tools like Jenkins, Bamboo, Chef, Nolio and UrbanCode.

IT Technology Platforms:

IT Technology Platforms include two categories: Application Platforms and Infrastructure Platforms.

Application Platforms represent technologies for software languages and runtime containers.  Many of these platforms are offered from cloud hosting providers.

  • Development Frameworks are the .Net, Java and other software development frameworks, mobile platforms and Integrated Development Environments (IDE).
  • Portal and WCMS Platforms are the portal framework applications such as Oracle WebCenter, SharePoint, and Adobe CQ.
  • Web Servers are the typical HTTP and web routing servers such as Apache, IIS, HAProxy and Nginx.
  • Application Servers, such as Tomcat, JBoss, GlassFish and WebLogic, provide the runtime environment for application development frameworks.
  • Database Servers provide the data and access protocols for structured and non-structured data such as Oracle DB, SQL Server, MongoDB and CouchDB.

Infrastructure Platforms represent the core processing technologies for application platforms. I typically describe this as the primary offering from cloud hosting providers.

  • Compute Services are operating systems, virtualization technologies and storage solutions.
  • Network Services are firewalls, routers, switches, proxies, load balancers and wireless infrastructure.
  • Communication Services represent the phone and PBX infrastructure.
  • Facilities are Data Centers, Server Rooms and Network Closets that host the hardware systems. 

APM Framework Diagram:

APMStructure

So what value does this have? It provides a standard vocabulary for dev, ops and non-technical resources to communicate. That common vocabulary can then be institutionalized in the tool sets to align end-to-end supply chain processes for the following:

  • Source code and artifact repositories
  • Deployment automation tagging
  • Runbooks and release note updates
  • System monitoring dashboards
  • Diagnostic tools and dashboards
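
As a rough illustration of what "institutionalizing" the vocabulary could look like, here is a minimal Python sketch. It encodes a few of the categories above as paths and produces one tag string that a repository, a deployment job and a dashboard could all share. The `APM_PATHS` list, the `apm_tag` function and the slash-separated tag format are my own assumptions for illustration, not part of any particular tool.

```python
# Hypothetical encoding of part of the APM taxonomy described above.
# Each tuple is a path from the top-level category down to a leaf.
APM_PATHS = [
    ("Business Environment Applications", "Business Applications", "Products"),
    ("Business Environment Applications", "Business Applications", "Front Office Applications"),
    ("Business Environment Applications", "Shared Service Platforms", "Frontend Platforms"),
    ("IT Supply Chain Tools", "IT Deployment Tools"),
    ("IT Supply Chain Tools", "IT Development Tools"),
    ("IT Technology Platforms", "Application Platforms", "Web Servers"),
    ("IT Technology Platforms", "Infrastructure Platforms", "Network Services"),
]


def apm_tag(leaf):
    """Return the full slash-separated path for a leaf category name."""
    for path in APM_PATHS:
        if path[-1] == leaf:
            return "/".join(path)
    raise KeyError(f"unknown APM category: {leaf}")


# The same tag can label a source repository, a deployment job and a dashboard.
print(apm_tag("Web Servers"))
print(apm_tag("IT Deployment Tools"))
```

The point of the sketch is only that every tool looks up the same name and gets the same answer, so nobody has to guess which "Web Servers" the other team means.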

 

The Driver for Cloud is Not Cost Savings!

The SaaS and public cloud provider model is an inevitable direction for IT infrastructure, but I also believe the market drivers moving organizations to the cloud are incorrectly portrayed, or at least slightly skewed. Cost savings and ease of implementation are very real for start-up and small IT organizations, but I’m suggesting that very few large organizations are realizing a level of cost savings to justify the move.

I say this because a real driver to the cloud is an opportunity for development teams to circumvent dysfunctional infrastructure organizations. There is a belief [and a reality] that infrastructure teams are a bottleneck to enabling business strategy. They slow down the speed to market and they disrupt the agile development delivery cycle. If circumventing infrastructure teams is an actual market driver for using cloud services, then the IT organization as a whole is not focused on maximizing the cloud strategy and will not capitalize on potential cost savings. How could it when the strategic driver is to circumvent a huge piece of the organization?

Historically the IT departments (Dev and Inf/Ops) had to work together out of necessity. However, the public cloud has created a new option for dev departments. It’s an option that allows them to swipe a credit card and “get” their own server without help from any other team. The larger an IT organization is, the easier it is for one obscure team to do their own thing in the cloud. However, this approach is shortsighted and ignores the amazingly difficult life cycle management of infrastructure services, regardless of whether they are in a private data center or a public cloud.

Why DevOps is Doomed – Ops teams are lost!

The problem between dev and ops is primarily a terminology, communication and respect problem resulting in poor operational support.  The two organizations say common things backed by different definitions that are not in agreement. For example, would ops define an “application” in Puppet the same way dev would define an “application” in Hudson? If not, how would you automate or even communicate between the two for automated application deployments? Dev and Ops really have no concept of each other’s world, yet they assume the other side understands their view, or they expect that the other side should understand their view.

I love the concept of DevOps and I am very optimistic about the movement’s value. However, I’m also very concerned about traditional IT leadership’s capacity to focus on the right goals to make DevOps successful. Bridging development and operations is NOT about dev teams utilizing a continuous integration tool like Hudson or Bamboo. And it’s NOT about ops teams standing up a configuration management tool like Puppet or Chef. Both may be needed for your automation efforts, but DevOps is about bringing dev and ops teams together so people and tools from both realms are communicating with common terminology, data sources and objectives. As always, communicating and working together for a common goal is the challenge!

  • Developers tend to think infrastructure is pretty straightforward. “I can stand up a server at Amazon in seconds. These clowns at work take forever with the simplest requests.”
  • Systems Administrators tend to expect developers to understand the infrastructure their applications run in. “The developer said it worked on his dev server, so obviously we screwed it up in production. The dumbass doesn’t understand firewalls or our company’s network.”

On average, developers know application code architecture and think they know systems architecture, but they DO NOT. On average, systems and network administrators have broad experience across many infrastructure disciplines and think they know application code architecture, but they DO NOT.

So why would DevOps be doomed for failure?

Web applications, services architecture and cloud providers have destroyed any hope of success for the traditional IT leadership sold on yesterday’s operational support model. There has to be a fundamental change to recognize that systems and applications are no longer static, documented operational models; they are dynamic release-time dependency models. And there has to be a systematic way for dev teams to communicate application architectures so ops teams understand them.

Have you ever been asked to document application dependencies? If so, could you? If so, how long was it valid? Documenting a traditional three-tiered application is pretty easy. Documenting an application in a service-oriented architecture is only valid until the next code release, because each release may utilize a new service endpoint, dependent on a new network segment, dependent on a new database, dependent on a new data center in a different region. Good luck on managing the relationships for your ops teams!
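
To make the "valid until the next release" point concrete, here is a minimal Python sketch that compares the dependency lists of two releases. It assumes each release can publish a flat list of the things it depends on; the `dependency_changes` function and the service and database names are made up for illustration.

```python
def dependency_changes(previous, current):
    """Return (added, removed) dependency names between two releases."""
    prev, curr = set(previous), set(current)
    return sorted(curr - prev), sorted(prev - curr)


# Hypothetical dependency lists published with two consecutive releases.
release_1 = ["orders-service", "customer-db"]
release_2 = ["orders-service", "customer-db", "pricing-service", "pricing-db"]

added, removed = dependency_changes(release_1, release_2)
print("added:", added)
print("removed:", removed)
```

Any document written against `release_1` is already missing two dependencies the moment `release_2` ships, which is why the dependency data has to travel with the release rather than live in a wiki.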

Application designs no longer have a universal hierarchy; the diversity and rate of change cannot be easily modeled in a traditional database schema. Enterprise IT tools used to manage the environment provide little help as they expect a static hierarchical application model. ITIL and service catalog implementations also tend to expect a static hierarchical application model. The three-tiered app is gone with the introduction of web applications, service architectures and cloud providers. It’s game over if you can’t define your applications, model them, and use that same data to automate the build, deployment and operations life cycle.

The bottom line

App maps look like a circuit board.

Operations teams are lost and have no idea what an application looks like, how to model it, or how to support it. Nor have traditional enterprise IT solutions provided the tools to help model the web app and cloud era. Today’s dependency maps look like circuit boards. If you zoom in, you only see some components of your application’s dependencies. If you zoom out, you see the circuit board but can’t read or understand any details.

Let’s say your web application renders a page. For that simple transaction, your application calls multiple service applications, each with multiple endpoints, each with multiple database dependencies.  Some databases may be dependent on nightly ETL jobs to provide valid data for your functionality.  Maybe the UI is rendered by a separate UI platform with its own application, service dependencies and databases.  Now, let’s say the relevant applications, services, and databases are developed by five different dev teams across three different states.
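
A small Python sketch of that page render as a dependency graph shows how quickly the "circuit board" grows. The `DEPENDS_ON` map, the node names and the `all_dependencies` function are invented for illustration; a real map would have to be generated from release data rather than typed in by hand.

```python
from collections import deque

# Hypothetical dependency map: each key depends directly on the listed names.
DEPENDS_ON = {
    "web-app": ["ui-platform", "orders-service", "profile-service"],
    "ui-platform": ["ui-db"],
    "orders-service": ["orders-db"],
    "profile-service": ["profile-db"],
    "orders-db": ["nightly-etl"],
}


def all_dependencies(graph, root):
    """Return every direct and indirect dependency of root (breadth-first)."""
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    seen.discard(root)  # a dependency cycle should not list root as its own dependency
    return seen


direct = DEPENDS_ON["web-app"]
everything = all_dependencies(DEPENDS_ON, "web-app")
print(len(direct), "direct dependencies,", len(everything), "in total")
```

Even in this toy example the dev team sees three dependencies while ops has to support seven, including an ETL job that nobody on the web team wrote.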

An event: some functionality in your application fails intermittently.  How does your ops team troubleshoot the problem and resolve it?  Is the “application” just the part your dev team developed, or is the application the whole “circuit board” of dependencies?  Can your app be described effectively in a knowledgebase, KB article, or wiki site?  Can the “circuit board” be effectively described in a CMDB or support tools?  If so, who out of the five dev teams is accountable for maintaining changes to it?  Is your ops team relegated to calling in subject matter experts from each team for troubleshooting?  Is your ops team able to be effective without a clear understanding of the application?

To be successful, we have to enable our ops teams to manage the dynamic changes and complexity of today’s applications. Manual communication processes will fail, so we need to redefine the minimum bar for “automation.” Systems Administrators creating a bunch of scripts and standing up Puppet or Chef is not automation. Developers using Hudson or Bamboo for continuous integration builds is not automation. Automation has to link the application, build, and configuration management together.

  • “Automation” needs to be an architecture platform, not an individual tool or effort.
  • Automation “platforms” must bridge the technical communication gap between development and operational lifecycle tools, thus enabling organizational DevOps efforts.

The key is establishing common data models and service architectures that enable the automation and a common communication language at a very technical level. If you have been following Willie’s posts on skydingo.com, then it should be clear why we think a CMDB architecture using a schema-flexible NoSQL graph database like Neo4j is so valuable.

In part 2 of this series I will illustrate an application example providing details on how it lacks hierarchical structure, and why the term “application” creates so many problems for DevOps in enterprise organizations.  Then I’ll describe how we are working to solve the problem with our automation platform.

DevOps: Flexible Config Management? Not so fast!

You’re in charge of establishing a department-wide deployment automation capability. Your fellow developers are excited about it, and their managers are too. There is no shortage of ideas on how it might work:

  • “Let us create our own workflows!”
  • “We should be able to configure our own servers.”
  • “It should be able to deploy from Nexus, Artifactory, S3, or whatever we choose.”
  • “We can finally use the app versioning scheme my team likes.”
  • “My team should get to do parallel installs if we want.”
  • “We should have open APIs so anybody can execute their own deployment solution.”
  • “Each team should be able to configure the middleware for their application’s needs.”

Developers hate being told how to do things, so there is a general consensus that if you can make this deployment tool as flexible as possible, you’ll be able to build the best deployment automation system the world has ever seen.

Sounds great, except that it’s totally wrong.

Flexibility kills quality

Drift is evil.  Drift causes downtime and rollbacks.  Flexibility creates drift.

I’ve been involved in data center migration projects where almost every server in a production farm was configured differently. It’s amazing the application even worked! On many other occasions we have rolled back code because the QA and Prod configurations were so different that our testing failed to uncover critical bugs. Although these environments sound ridiculous, I’m confident that they describe a common scenario across enterprise environments. I will also state that we had talented systems administrators managing the environments. Unfortunately, each one had the flexibility to manage the systems to their liking.

Our initial investment in deployment automation (and what initiated our devops strategy) was largely driven by a need to eliminate drift and increase availability.  We knew automated deployments should be driven by data, and server instance data would be sourced from a CMDB.  However, we quickly realized that our CMDB schema allowed for configuration drift. This led to one of our first devops principles:  Don’t manage problems that you can eliminate.

Eliminate drift with inflexible schema data.  Tools from operations teams tend to be server or device centric and we wanted our deployment automation to be app and farm centric.  In other words, we wanted to deploy apps to a farm entity, where the server instances are attributes of the farm.  However, we found traditional schema for configuration data was very flexible.  The diagram below shows a typical farm with multiple instances, and each instance has an OS version.  Since the OS version can be independently selected for each instance, the schema allows the ability to represent drift across the farm.  While architecting our app deployment CMDB (interestingly named deathBURRITO), we specifically did not want to manage farm configuration consistency.  We simply wanted a guarantee that our farm deployments did not have drift.

A typical CMDB schema that allows farm drift.

To achieve this we made a simple change to the schema that did just that – prevented the data from representing farm drift (picture below).  Although you can incorrectly represent farm attributes, the data driven deployment is either 100% right or 100% wrong.

A better CMDB schema that prevents farm drift.
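
For readers without the diagrams, the difference between the two schemas can be sketched in a few lines of Python. This is not the actual deathBURRITO schema; the class and field names below are hypothetical and only illustrate where the OS version attribute lives.

```python
from dataclasses import dataclass, field


@dataclass
class DriftProneInstance:
    hostname: str
    os_version: str  # chosen per instance, so the data can describe drift


@dataclass
class DriftProneFarm:
    name: str
    instances: list = field(default_factory=list)

    def os_versions(self):
        """Distinct OS versions recorded across the farm."""
        return {i.os_version for i in self.instances}


@dataclass
class Farm:
    name: str
    os_version: str  # one value for the whole farm
    hostnames: list = field(default_factory=list)

    def deployment_targets(self):
        """Every instance inherits the farm's single OS version."""
        return [(h, self.os_version) for h in self.hostnames]


old = DriftProneFarm("orders", [
    DriftProneInstance("orders01", "RHEL 6.4"),
    DriftProneInstance("orders02", "RHEL 5.9"),
])
new = Farm("orders", "RHEL 6.4", ["orders01", "orders02"])

print(old.os_versions())         # two versions: the schema happily stores drift
print(new.deployment_targets())  # one version for every host, right or wrong
```

In the second shape there is simply nowhere to write down a second OS version, so the farm value may be wrong, but it cannot be inconsistent.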

Gratuitous flexibility and useful flexibility

Eliminating schema flexibility to control drift is not that controversial since most people get it — and support it.  When you start limiting personal preference, man look out, people get really passionate over stupid things.  So we started communicating another one of our devops principles:  flexibility is not always a good thing.

Your deployment automation should start with inflexibility and provide flexibility as needed.  Don’t get me wrong, we absolutely support innovation and the ability to empower our department with tools that enable creativity.  I often confuse the hell out of people by saying weird stuff like, “by limiting your flexibility, I can offer you more flexibility.”  And I actually mean it — because we focus on the flexibility that is actually valuable.  The objective is to distinguish between value-added flexibility and gratuitous flexibility, and eliminate the gratuitous junk.

  • Value-added flexibility can be represented by a middleware option between Tomcat, JBoss and Glassfish.  Each solution provides different features to the development team and they should have the ability to choose the best match (within reason) for developing to application requirements.  Easy enough, there is value to the options.
  • Gratuitous flexibility can be represented by allowing multiple install directory variations for each Tomcat app.  SysAdmins usually have a preference and sometimes make it a very passionate preference.  Although the configuration matters, it should support automation and security, not personal preference.  There is no inherent value gained by allowing your environment to have different install directories such as /opt, /app and /u01.  In fact, allowing options creates complexity for install scripts, logging, permissions, service accounts, monitoring, etc. Pick one and restrict the rest.

One of the great things about automation is the ability to make the deployment platform deliver what you want, and fail what you don’t want.  It’s a platform that gives the devops team enforcement power in the IT department that is rarely available.  Like most organizations, you probably have many awesome design standards that are drafted, but in effect are just glorious shelfware documents.   Automation empowers your ability to eliminate drift, control flexibility and operationalize the shelfware designs.
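
Here is a minimal Python sketch of "deliver what you want, and fail what you don’t want." It assumes the platform validates every deployment request before doing any work. The `validate_request` function, the `RejectedDeployment` exception and the choice of /opt as the single install root are assumptions for illustration, not a description of our platform.

```python
# Value-added flexibility: a real choice between supported middleware.
ALLOWED_MIDDLEWARE = {"tomcat", "jboss", "glassfish"}

# Gratuitous flexibility removed: one install root (assumed to be /opt here).
INSTALL_ROOT = "/opt"


class RejectedDeployment(ValueError):
    """Raised when a request falls outside the supported standards."""


def validate_request(middleware, install_dir):
    """Return True for a supported request, otherwise fail loudly."""
    if middleware not in ALLOWED_MIDDLEWARE:
        raise RejectedDeployment(f"unsupported middleware: {middleware}")
    if not install_dir.startswith(INSTALL_ROOT + "/"):
        raise RejectedDeployment(f"install directory must be under {INSTALL_ROOT}")
    return True


print(validate_request("tomcat", "/opt/orders"))

try:
    validate_request("tomcat", "/u01/orders")
except RejectedDeployment as err:
    print("rejected:", err)
```

The standard stops being shelfware the moment a request that violates it fails instead of quietly succeeding.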

So back to my statement about limiting flexibility to offer more flexibility?  I will argue that by eliminating all the gratuitous variations, you can simplify environment complexity and eliminate the associated busy work and wasted time.  I also believe that eliminating the gratuitous variations will allow your devops teams to focus on delivering the value of predictable self-service deployments… Real flexibility is the ability to provide your developers and QA teams with self-service deployments, on demand, at any time of day and over weekends.