More Skybase screenshots

Hey Internet people, Willie here. It’s been a little while since I’ve posted Skybase screenshots, so here’s the work in progress. I explain how to implement this stuff (Spring Data Neo4j, Spring/GitHub integration, JavaScript InfoVis Toolkit, etc.) in chapter 11 of my book Spring in Practice (Manning).

App overview page

Applications are a central concept in Skybase. This is how the app overview page looks so far. (It’s just a start.) The concept behind it is that if you want a 360-degree view of an app (dev view, test view, release view and ops view), you come to the app details page and then you can start looking at different views.

App repository commits page

Here’s a repo commit history for a given app. Right now it’s just integrated with GitHub, but the idea is to provide integrations with other providers too (e.g. Bitbucket).

App repository watchers page

Again, an app details view. This time it’s GitHub watchers for a given app repo.

Region details page

Visualizations are one of the more exciting features that a CMDB can offer. One of the huge advantages of putting your configuration management data in a database is that you can move away from Visio documents and Gliffy diagrams, and move instead toward data modeling and automatically generating views. Here I’m using the JavaScript InfoVis Toolkit library to generate an interactive graph visualization. It is a great complement to the underlying Neo4j graph database.

Automation view

This might strike you as a strange screenshot, but in reality it’s an example of the most important view–the automation view. The main reason we want to manage configuration data in a database is that we can build web services (Skybase supports both JSON and XML views) that simultaneously drive automation and human-consumable views of the sort that a support team would use. And the data is accurate because it’s what brings the environment into being–no more chasing your environment around trying to document it.

If you’re interested in checking it out or even getting involved in development, see the Skybase GitHub site.

Why DevOps is Doomed – Ops teams are lost! (1 of 3)

The problem between dev and ops is primarily one of terminology, communication and respect, and it results in poor operational support. The two organizations use the same words backed by different definitions. For example, would ops define an “application” in Puppet the same way dev would define an “application” in Hudson? If not, how would you automate or even communicate between the two for automated application deployments? Dev and ops really have no concept of each other’s world, yet each assumes, or expects, that the other side understands its view.

I love the concept of DevOps and I am very optimistic about the movement’s value. However, I’m also very concerned about traditional IT leadership’s capacity to focus on the right goals to make DevOps successful. Bridging development and operations is NOT about dev teams utilizing a continuous integration tool like Hudson or Bamboo. And it’s NOT about ops teams standing up a configuration management tool like Puppet or Chef. Both may be needed for your automation efforts, but DevOps is about bringing dev and ops teams together so people and tools from both realms are communicating with common terminology, data sources and objectives. As always, communicating and working together for a common goal is the challenge!

  • Developers tend to think infrastructure is pretty straightforward. “I can stand up a server at Amazon in seconds. These clowns at work take forever with the simplest requests.”
  • Systems Administrators tend to expect developers to understand the infrastructure their applications run in. “The developer said it worked on his dev server, so obviously we screwed it up in production. The dumbass doesn’t understand firewalls or our company’s network.”

On average, developers know application code architecture and think they know systems architecture, but they DO NOT. On average, systems and network administrators have good diversity and know a lot of different infrastructure disciplines, and think they know application code architecture, but they DO NOT.

So why would DevOps be doomed to failure?

Web applications, services architecture and cloud providers have destroyed any hope of success for the traditional IT leadership sold on yesterday’s operational support model. There has to be a fundamental change to recognize that systems and applications are no longer static, documented operational models; they are dynamic release-time architectures. And there has to be a systematic way for dev teams to communicate application architectures so ops teams understand them.

Have you ever been asked to document application dependencies? If so, could you? If so, how long was it valid? Documenting a traditional three-tiered application is pretty easy. Documentation for an application in a service-oriented architecture is only valid until the next code release, because each release may utilize a new service endpoint, dependent on a new network segment, dependent on a new database, dependent on a new data center in a different region. Good luck managing the relationships for your ops teams!

Application designs no longer have a universal hierarchy; the diversity and rate of change cannot be easily modeled in a traditional database schema. Enterprise IT tools used to manage the environment provide little help, as they expect a static hierarchical application model. ITIL and service catalog implementations also tend to expect a static hierarchical application model. The three-tiered app is gone with the introduction of web applications, service architectures and cloud providers. It’s game over if you can’t define your applications, model them, and use that same data to automate the build, deployment and operations life cycle.

The bottom line

App maps look like a circuit board.

Operations teams are lost and have no idea what an application looks like, how to model it, or how to support it. Nor have traditional enterprise IT solutions provided the tools to help model the web app and cloud era. Today’s dependency maps look like circuit boards. If you zoom in, you only see some components of your application’s dependencies. If you zoom out, you see the circuit board but can’t read or understand any details.

Let’s say your web application renders a page. For that simple transaction, your application calls multiple service applications, each with multiple endpoints, each with multiple database dependencies.  Some databases may be dependent on nightly ETL jobs to provide valid data for your functionality.  Maybe the UI is rendered by a separate UI platform with its own application, service dependencies and databases.  Now, let’s say the relevant applications, services, and databases are developed by five different dev teams across three different states.
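To make the “circuit board” a little more concrete, here is a minimal Java sketch that models dependencies as a directed graph and walks it to list everything an application transitively depends on. The class name `DependencyGraph` and the component names in the usage note are hypothetical, invented for illustration; this is not Skybase code, just one way the idea could be expressed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: a directed graph of "X depends on Y" edges.
final class DependencyGraph {
    // Maps each component to the components it depends on directly.
    private final Map<String, Set<String>> edges = new LinkedHashMap<>();

    // Records that `from` depends directly on `to`.
    void addDependency(String from, String to) {
        edges.computeIfAbsent(from, key -> new LinkedHashSet<>()).add(to);
    }

    // Breadth-first walk that collects every direct and indirect
    // dependency of `root`. Cycles are tolerated: each component is
    // visited at most once, and `root` itself is never reported.
    Set<String> transitiveDependencies(String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String next : edges.getOrDefault(current, Set.of())) {
                if (!next.equals(root) && seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen;
    }
}
```

For example, after `addDependency("web-app", "order-service")`, `addDependency("order-service", "orders-db")` and `addDependency("orders-db", "nightly-etl")`, calling `transitiveDependencies("web-app")` returns all three downstream components, including the ETL job that the web app team may never have heard of.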

An event: some functionality in your application fails intermittently.  How does your ops team troubleshoot the problem and resolve it?  Is the “application” just the part your dev team developed, or is the application the whole “circuit board” of dependencies?  Can your app be described effectively in a knowledgebase, KB article, or wiki site?  Can the “circuit board” be effectively described in a CMDB or support tools?  If so, who out of the five dev teams is accountable for maintaining changes to it?  Is your ops team relegated to calling in subject matter experts from each team for troubleshooting?  Is your ops team able to be effective without a clear understanding of the application?

To be successful, we have to enable our ops teams to manage the dynamic changes and complexity of today’s applications. Manual communication processes will fail, so we need to redefine the minimum bar for “automation.” Systems Administrators creating a bunch of scripts and standing up Puppet or Chef is not automation. Developers using Hudson or Bamboo for continuous integration builds is not automation. Automation has to link the application, build, and configuration management together.

  • “Automation” needs to be an architecture platform, not an individual tool or effort.
  • Automation “platforms” must bridge the technical communication gap between development and operational lifecycle tools, thus enabling organizational DevOps efforts.

The key is establishing common data models and service architectures that enable the automation and a common communication language at a very technical level. If you have been following Willie’s posts on skydingo.com, then it should be clear why we think a CMDB architecture using a schema-free NoSQL technology like Neo4j is so valuable: Why I’m pretty excited about using NEO4J for a CMDB backend.

In part 2 of this series I will illustrate an application example providing details on how it lacks hierarchical structure, and why the term “application” creates so many problems for DevOps in enterprise organizations.  Then I’ll describe how we are working to solve the problem with our automation platform.

Skybase/GitHub integration

A major goal for Skybase is to tie together the activities of developers, testers, release engineers and operations. A few major benefits are:

  • Tool integration: We can single source app lists and more across development, build, test, deployment, monitoring, knowledge management and other tools.
  • Process streamlining: Outputs from the dev process feed into the test and deployment processes, which in turn feed into operational processes.
  • Communications across teams: All the teams work from the same concepts and data, facilitating communication.

One of the first things I want to tackle in Skybase, then, is supporting some developer-centric concerns. An important one is source control. Obviously Skybase isn’t itself a source control management system, but it makes sense to pull in useful information from the source code repos just to provide a “single pane of glass” where developers and other users can get a holistic view of what an app is all about.

Since I’m using GitHub, GitHub integration is a logical starting point for Skybase/SCM integration. GitHub has a REST API that makes this integration easy. I wrote a blog post on Spring in Practice that explains the technical approach. Here’s the end result:

Skybase SCM page

I suspect that such integrations will be a large part of the value that Skybase delivers. They will help realize the process, tool and communications benefits I highlighted above.
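For a rough feel of what calling the GitHub REST API involves, here is a minimal sketch that builds and sends a request for a repository’s commit list. It uses the JDK’s built-in `java.net.http.HttpClient` (Java 11+), which is an assumption made for this example; it is not the Spring-based approach from the blog post mentioned above. The class name `GitHubCommitsClient` is hypothetical, and the owner and repository are whatever the caller passes in.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Skybase: lists commits for a GitHub repo.
final class GitHubCommitsClient {
    private static final String API_ROOT = "https://api.github.com";

    // Builds the "list commits" endpoint URI for the given owner and repo.
    static URI commitsUri(String owner, String repo) {
        return URI.create(API_ROOT + "/repos/" + owner + "/" + repo + "/commits");
    }

    // Builds a GET request that asks GitHub for its JSON media type.
    static HttpRequest commitsRequest(String owner, String repo) {
        return HttpRequest.newBuilder(commitsUri(owner, repo))
                .header("Accept", "application/vnd.github+json")
                .GET()
                .build();
    }

    // Performs the network call and returns the raw JSON response body.
    // Parsing the JSON into commit objects is left out of this sketch.
    static String fetchCommitsJson(String owner, String repo)
            throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(commitsRequest(owner, repo), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("GitHub returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

Unauthenticated requests are rate-limited by GitHub, so a real integration would likely add credentials and some caching; both are outside the scope of this sketch.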

Closed loops: the secret to collecting configuration management data

Hi all, Willie here. Happy New Year!

In my last post, How NOT to collect configuration management data, I gave a quick rundown of some losing CM data approaches that I and others have attempted in the past. Most of these approaches were variants of asking people for information, putting their answers in documents somewhere and never looking at the documents again.

This time around I’m going to describe a key breakthrough that our team finally made–one that made it a lot easier to collect and update the data in question.

That breakthrough was the concept of a closed loop and how it relates to configuration management.

[As it happens, the concept is well-known in configuration management circles, but at the time it wasn't well-known to us. So we discovered something that other people already knew. It's in that limited sense I say we made a breakthrough.]

We’re going to have to build up to the closed loop concept. Let’s start by looking at who has the best CM data in the org.

Who has the best CM data?

Different orgs are different, so it’s tough to make blanket statements about who has the best CM data. But what I can do is give the answer for my workplace, and hopefully the principles will make sense even if the reality is different where you work. But I’ll bet for a lot of orgs it’s quite similar.

To avoid keeping you in suspense, the answer is…

Winner: the release team. Where I work, the release team has the best CM data, where “best” means something like comprehensive, accurate and actively managed. The release team knows which apps go on which servers, which service accounts to launch Tomcat or JBoss under, which alerts to suppress during a deployment, and so on. They know these things across the entire range of apps they support. It’s all documented (in YAML files, databases, scripts, etc.) and it’s all maintained in real time.

Let’s look at some other teams though.

  • App teams have nonsystematic data. The app teams typically know the URLs for their apps, which servers their apps are on (or at least the VIPs), the interdependencies between their apps and other adjacent systems (web services, databases). But the knowledge is less systematic. It’s more like browser bookmarks, tribal knowledge and not-quite-up-to-date wiki pages. And any given developer knows his own app and maybe the last app or two he worked on, but not all apps.
  • Ops teams have to depend on busy developers for info. The ops teams have better or worse information depending on how close to the app teams they are. The team in the NOC is almost entirely at the mercy of the app teams to write and update knowledge base articles. As you might imagine, it can be a challenge to ensure that developers up against a deadline are writing solid KB articles and maintaining older ones. For the NOC it’s very important to know who the app SMEs are, given such challenges. Even this is not always readily clear as org changes occur, new apps appear, old apps get new names and so on.
  • App support teams are more expertise-driven than data-driven. The app support teams (they handle escalations out of the NOC) are generally more familiar with the apps themselves and so build up stronger knowledge about the apps, but this knowledge tends to be stronger with “problem child” apps. Also, different people tend to develop expertise with specific apps.

Why does the release team have the best CM data?

The release team has the best CM data because properly maintained CM data is fundamental to their job in a way that it’s not for other teams.

First, a quick aside. If your company isn’t regulated by SOX, you may be wondering about what a release team is and why we have one. Among many other things, SOX requires a separation of duties between people who write software and people who deploy/support it in the production environment. The release team’s primary responsibility is to release software into production. We actually have a couple of release teams, and each of them services many apps. It would not be feasible from a cost and utilization perspective for each app team to have its own dedicated release engineer. The release teams release software at low-volume times, generally during the wee hours.

Back to this idea that the release team needs proper CM data more than the other teams do. Why am I saying that?

Here’s why. The software development team is highly motivated to release software at a regular cadence. A fairly limited number of release engineers must service hundreds of applications (generally not all at once, though), so “tribal knowledge” isn’t a viable strategy when it comes to knowing what to deploy where. That knowledge must be thoroughly and accurately documented. Releases happen every week, and late at night, so it’s not reasonable for a release engineer to call up his buddy on the app team and ask for the list of app servers. The release team needs this information at their fingertips. If they don’t have it, the software organization fails to realize the value of its development investment.

Indeed, “documented” is the wrong word here, because deployment automation drives the deployments. The CM data must be properly “operationalized”, meaning that it must be consumable by automation. No Word docs, no Excel spreadsheets, no wiki pages. More like YAML files, XML files, web service calls against a CMDB, etc.

Importantly, when the data is wrong, the deployment fails. People really care about deployments not failing, so if there are data problems, people will definitely discover and fix them.

Let’s look at the app and ops teams again.

  • App teams can make do without great CM data. The dependency of app developers on their CM data is softer. Yes, a developer needs to know which web services his app calls, but someone just explains that when he joins the project, and that’s really all there is to it. If he has a question about a transitive dependency, he might ask a teammate. If he needs to get to the app in the test environment, he probably has the URL bookmarked and the credentials recorded somewhere, but if not, he can easily ask somebody. 99% of the time, the developer can do what he needs to do without reference to specific CM data points. The developer may or may not automate against the CM data.
  • Ops/support teams need good CM data, but expertise is cheaper in the short- to medium-term. Except in cases involving very aggressive SLAs, even ops often has a softer dependency on CM data than the release team does. Since (hopefully) app outages occur much less frequently than app deployments, the return on knowledge base investments is more sporadic than that on deployment automation. If the app in question isn’t particularly important, investments in KB articles may be very limited indeed. In most cases, investing in serious support firepower (when something breaks, bring significant subject matter expertise to bear on the problem) yields a better short- to medium-term return. (Of course, in the longer term this strategy fails, because eventually there will be the very costly outage that takes the business out for several days. That’s a subject for a different day.)

Now we’re in a good place to understand closed loops and why they’re so important for configuration management data.

Closed loops and why they matter

I think of closed loops like this. There’s a “steady state” that we want to establish and maintain with respect to our CM data. We want it to be comprehensive, accurate and relevant. When the state of our CM data diverges from that desired steady state, we want feedback loops to alert us to the situation so we can address it. That’s a closed loop.

Example 1: deployment automation. The best example is the one that we already described: deployment data. Deployment data drives the deployment process, and when the data is wrong, the deployment process fails. Because the deployment process is extremely important to the organization, some level of urgency attaches to fixing wrong data. But it’s not just wrong data. If we need to deploy an app and there’s missing data in the CMDB, then sorry, there’s no deployment! Rest assured that if the deployment matters, the missing data is only a temporary issue.
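A minimal sketch of this loop, assuming a hypothetical `DeploymentPlanner` that is handed an app-to-servers map loaded from the CMDB (the class, method and app names are all invented for illustration): the deployment refuses to proceed when the data is missing, which is exactly the feedback that gets the data fixed.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a closed loop: deployment depends on CM data.
final class DeploymentPlanner {
    // App name -> servers the app is deployed to, as recorded in the CMDB.
    private final Map<String, List<String>> serversByApp;

    DeploymentPlanner(Map<String, List<String>> serversByApp) {
        this.serversByApp = serversByApp;
    }

    // Returns the deployment targets for an app, or fails loudly when the
    // CM data is missing so that somebody has to go and fix it.
    List<String> targetsFor(String app) {
        List<String> servers = serversByApp.get(app);
        if (servers == null || servers.isEmpty()) {
            throw new IllegalStateException(
                    "No servers recorded for app '" + app + "'; fix the CM data before deploying");
        }
        return servers;
    }
}
```

The design choice that matters here is the exception: silently deploying to a default or to nothing at all would open the loop again.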

Example 2: fine-grained access controls. Here’s another example: team membership data. We’ve already noted that for operational reasons it’s very important to know who is on which development team. This isn’t something that’s going to be in the HR system, and people have better things to do than to update team membership data. But what happens when that team membership data drives ACLs for something you care about being able to do, like deploying your app to your dev environment? Now you’re going to see better team membership data.
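Here is one way that dependency could look in code, as a sketch only. The `DeployAccessControl` class and its two maps are hypothetical stand-ins for whatever the CMDB would actually return; the point is that the permission check reads nothing but team membership data, so stale membership data immediately becomes somebody’s problem.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: dev-environment deploy rights derived from CM data.
final class DeployAccessControl {
    // Team name -> usernames of its members.
    private final Map<String, Set<String>> membersByTeam;
    // App name -> name of the team that owns it.
    private final Map<String, String> owningTeamByApp;

    DeployAccessControl(Map<String, Set<String>> membersByTeam,
                        Map<String, String> owningTeamByApp) {
        this.membersByTeam = membersByTeam;
        this.owningTeamByApp = owningTeamByApp;
    }

    // A user may deploy an app to dev only if the CM data says the user is
    // on the team that owns the app. Unknown apps and teams are denied.
    boolean canDeployToDev(String username, String app) {
        String team = owningTeamByApp.get(app);
        return team != null
                && membersByTeam.getOrDefault(team, Set.of()).contains(username);
    }
}
```

A developer who joins a team but is not yet recorded as a member gets `false` here, and has every reason to get the record corrected.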

The basic concept is to find something that people really, really care about, and then make it strongly dependent on having good CM data.

Ideally it’s best if the CM data drives an automated process that people care about, but that’s not strictly necessary. In my org, for instance, there’s a fairly robust but manual goal planning and goal tracking process. Every quarter the whole department goes through a goal planning process (my goals roll up into my boss’ goals and so on), and then we track progress against those goals every couple of weeks. The goal planning and tracking app requires correct information about who’s on which team, and so this serves to establish yet another closed loop on the team membership data. It also illustrates the point that you can hit the same type of data with multiple loops.

Design your CM strategy holistically

There are several areas in technology where it pays to take a holistic view of design: security, user experience and system testing come immediately to mind. In each case you consider a given technical system in its wider organizational context. (Super-duper-strong password requirements don’t help if people have to write them down on Post-Its.)

Configuration management is another place where it makes lots of sense to take a holistic approach to design. For any given type of data (there’s no one-size-fits-all answer here), try to figure out something important that depends on it, and then figure out how to tie that something to your data so that wheels just start falling off if the data is wrong, incomplete and so on. Again, data-driven automated processes are superior here, but any important process (automated or not) will help.

Fewer meetings?

Almost forgot. In the last post, I mentioned that I’d equip you to get out of some pointless meetings. The meetings in question are the ones where somebody wants to get together with you to collect CM data from you so they can post it to their Sharepoint site. Decline those–they’re just a waste of time. Insist that people be able to articulate the closed loops that they will be creating to make sure that someone discovers gaps and errors in the data. I’ve been in plenty of such meetings, and in some cases they’re set up as half- or full-day meetings. I don’t do those anymore.

I’m working on an open source CMDB, called Skybase, that can help you establish closed loop configuration management. See the Skybase GitHub site.

Devops: How NOT to collect configuration management data

Hi all, Willie here. This time we’re going to step away from the keyboard and get architectural. But no ivory towers here. In my next two blog posts, I’m going to give you something that will get you out of lots of pointless meetings.

Got your attention yet? Good!

If you’re in devops, one of the things that you have to figure out is how to collect up all the information that will allow you to manage your configuration and also keep your apps up and running. This isn’t too hard when you have two apps sitting on a single server. It’s much harder when you have 400 apps deployed to thousands of servers across multiple environments and data centers.

So how do you do that? Let’s start off by looking at some things that just don’t work. In my next post I’ll share with you an approach that does work.

Some quick background

Probably owing to some psychological glitch, I’ve always been an information curator. Especially when I was in management, it was always important to me to understand exactly where my apps were deployed, which servers they were on, which services and databases they depended on, which servers those services and databases lived on, who are the SMEs for a given app, and so on.

If you work in a small environment, you’re probably thinking, “what’s the big deal?”

Well, I don’t work in a small environment, but the first time I undertook this task, I probably would have said something like, “no sweat.” (That’s the developer in me.) Anyway, for whatever reason, I decided that it was high time that somebody around the department could answer seemingly basic questions about our systems.

Attempt #1: Wiki Wheeler

Wiki Wheeler

The first time I made the attempt (several years ago), I was a director over a dozen or so app teams. So I created a wiki template with all the information that I wanted, and I chased after my teams to fill it out. My zeal was such that, unbeknownst to me, I acquired the nickname “Wiki Wheeler”. (One of my managers shared this heartwarming bit of information with me a couple years later.) I guess other managers liked the approach, though, because they chased after their teams too.

This approach started out decently well, since the teams involved could manage their own information, and since I was, well, Wiki Wheeler. But it didn’t last. Through different system redesigns and department reorgs, wiki spaces came and went, some spaces atrophied from neglect, and there was redundant but contradictory information everywhere. The QA team had its own list, the release guys had their list, the app teams had their list. The UX guys might have even gotten in on the act. Anyway, after a year or so it was a big mess and to this day our wiki remains a big mess.

Attempt #2: My huge Visio

The second time, my approach was to collect the information from all the app development teams. We had hundreds of developers, so to make things efficient, I sent out a department-wide e-mail telling everybody that I was putting together a huge Visio document with all of our systems, and I’d appreciate it if they could reply back with the requested information. And while I had to send out more nag e-mails than I would have liked, the end result was in fact a huge Visio diagram that had hundreds of boxes and lines going everywhere. I was very proud of this thing, and I printed out five or six copies on the department plotter and hung them on the walls.

How long do you think it was before it was out of date?

I have no idea. I seriously doubt that it was ever correct in the first place. Nobody (myself included) ever used the diagram to do actual work, and its chief utility was in making our workplace look somewhat more hardcore since there were huge color engineering plots on the walls.

There was one additional effect of this diagram, though I hesitate to call it utility. I acquired the wholly undeserved reputation for knowing what all these apps were and how they related to one another. This went on for years through no fault of my own. I mention it because it’s relevant for the next attempt.

Attempt #3: Disaster recovery

When asteroids strike...

This one wasn’t my attempt, but it’s still worth describing since I have to imagine that we aren’t the only ones who ever tried it.

Because we were a rapidly growing company, disaster recovery became an increasingly big deal for us several years back, and there was a mandate from up high to put a plan in place. This involved collecting up all the information that would allow us to keep the business running if our primary data center ever got cratered. The person in charge of the effort spent about two years meeting with app teams or else “experts” (me, along with others who, like me, had only the highest-level understanding of how things were connected up) and documenting it on a Sharepoint site where nobody could see it.

This didn’t work at all. Most people were too busy to meet with her, so the quality of the information was poor. Apps were misnamed, app suites were mistaken for apps, and to make a long story short, the result was not correct or maintainable.

Attempt #4: Our external consultants’ even bigger Visio

Following a major reorg, we brought in some external consultants and they came up with their own Visio. By this time I had already figured out that interviewing teams and putting together huge Visios is a losing approach, so I was surprised that a bunch of highly paid consultants would use it. Well, they did, and their diagram was, well, “impressive”. It was also (to put it gently) neither as helpful nor as maintainable as we would have liked.

Attempt #5: HP uCMDB autodiscovery

We have the HP uCMDB tool, and so our IT Services group ran some automated discovery agents to populate it with data. It loaded the uCMDB up with tons of fine-grained detail that nobody cared about, and we couldn’t see the stuff that we actually did care about (like apps, web services, databases, etc.).

Attempt #6: Service catalog

Our IT Services group was pretty big on ITIL, so they went about putting together a service catalog. The person doing it collected service and app information into an Excel spreadsheet and posted it to a Sharepoint site. But this didn’t really get into the details of configuration management. It was mostly around figuring out which business services we offered and which systems support which services.

They ended up printing out a glossy, full-color service catalog for the business, but nobody was really asking for it, so it was more of a curiosity than anything else.

There were other attempts too (Paul led a couple that I haven’t even mentioned), but by now you get the picture.

So why did these attempts fail?

To understand why these attempts failed, it helps to look at what they had in common:

  • First, they generally involved trying to collect up information from SMEs and then putting it in some kind of document. This could be wiki pages, Visio diagrams, Word docs on Sharepoint or an Excel spreadsheet.
  • Once the information went there, nobody used it. So the information, if it was ever correct and/or complete in the first place, was before long outdated, and there wasn’t any mechanism in place to bring it up to date.

In general, people saw the problem as a data collection problem rather than a data management problem. There needs to be a systematic, ongoing mechanism to discover and fix errors. None of the approaches took the management problem seriously at all–they all simply assumed that the organization would care about the data enough to keep it up to date.

In the next installment, I’ll tell you about the approach we eventually figured out. And as promised, that’s also where I’ll explain how to get rid of some of your meetings.

Interested in configuration management? I’m working on an open source CMDB called Skybase. See the Skybase GitHub site to get involved.

Domain modeling with Spring Data Neo4j [code]

Hi all, Willie here. Last time I told you that I’m building the Skybase CMDB using Neo4j and Spring Data Neo4j, and I was excited to get a lot of positive feedback about that. I showed a little code but not that much. In this post I’ll show you how I’m building out the person configuration item (CI) in Skybase using Spring Data Neo4j.

Person CI requirements

We’re going to start really simply here by building a person CI. It’s useful to have people in the CMDB for various reasons: they allow you to define fine-grained access controls (e.g., Jim can deploy such-and-such apps to the development environment; Eric can deploy whatever he wants wherever he wants; etc.); they allow you to define groups who will receive notifications for critical events and incidents; etc.

Our person CI will have a username, first and last names, some phone numbers, an e-mail address, a manager, direct reports and finally projects he or she works on. We need to be able to display people in a list view, display a given person in a details view, allow users to create, edit and delete people and so on. Here for example is what the list view will look like, at least for now:

Person list view

And here’s how our details view will look:

Person details

The relationship between a person and a project has an associated role. This relationship is also the basis for the list of collaborators: two people are collaborators if there’s at least one project of which they’re both members.
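The collaborator rule is simple enough to sketch in plain Java before any graph database is involved. The version below works on an in-memory map of project names to member usernames; the `Collaborators` class and that map are assumptions made for this example, and a Neo4j-backed implementation would presumably express the same rule as a graph traversal or query instead.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of the collaborator rule, independent of Neo4j.
final class Collaborators {
    private Collaborators() { }

    // Returns everyone who shares at least one project with `username`,
    // sorted alphabetically. `membersByProject` maps a project name to the
    // usernames of its members.
    static Set<String> of(String username, Map<String, Set<String>> membersByProject) {
        Set<String> result = new TreeSet<>();
        for (Set<String> members : membersByProject.values()) {
            if (members.contains(username)) {
                result.addAll(members);
            }
        }
        // A person is not his or her own collaborator.
        result.remove(username);
        return result;
    }
}
```

Roles are ignored on purpose: per the rule above, shared project membership is all that matters, whatever role each person holds.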

Our simple requirements should be enough to show what it feels like to write Spring Data Neo4j code.

Create the Person and ProjectMembership entities

First we’ll create the Person. I’ve suppressed the validation and JAXB annotations since they’re irrelevant for our current purposes:

package org.skydingo.skybase.model;

import java.util.Set;
import org.neo4j.graphdb.Direction;
import org.skydingo.skybase.model.relationship.ProjectMembership;
import org.springframework.data.neo4j.annotation.*;
import org.springframework.data.neo4j.support.index.IndexType;

@NodeEntity
public class Person implements Comparable<Person> {
    @GraphId private Long id;

    @Indexed(indexType = IndexType.FULLTEXT, indexName = "searchByUsername")
    private String username;

    private String firstName, lastName, title, workPhone, mobilePhone, email;

    @RelatedTo(type = "REPORTS_TO")
    private Person manager;

    @RelatedTo(type = "REPORTS_TO", direction = Direction.INCOMING)
    private Set<Person> directReports;

    @RelatedToVia(type = "MEMBER_OF")
    private Set<ProjectMembership> memberships;

    public Long getId() { return id; }

    public void setId(Long id) { this.id = id; }

    public String getUsername() { return username; }

    public void setUsername(String username) { this.username = username; }

    // ... other accessor methods ...

    public Person getManager() { return manager; }

    public void setManager(Person manager) { this.manager = manager; }

    public Set<Person> getDirectReports() { return directReports; }

    public void setDirectReports(Set<Person> directReports) {
        this.directReports = directReports;
    }

    public Iterable<ProjectMembership> getMemberships() { return memberships; }

    public ProjectMembership memberOf(Project project, String role) {
        ProjectMembership membership = new ProjectMembership(this, project, role);
        memberships.add(membership);
        return membership;
    }

    // ... equals(), hashCode(), compareTo() ...
}

There are lots of annotations we’re using to put a structure in place. Let’s start with nodes and their properties. Then we’ll look at simple relationships between nodes. Then we’ll look at so-called relationship entities, which are basically fancy relationships. First, here’s an abstract representation of our domain model:

Abstract domain model

Now let’s look at some details.

Nodes and their properties. When we have a node-backed entity, first we annotate it with the @NodeEntity annotation. Most of the simple node properties (i.e., properties that aren’t relationships to other nodes) come along for the ride. Notice that I didn’t have to annotate firstName, lastName, email, and so forth. Spring Data Neo4j will handle the mapping there automatically.

There are a couple of exceptions though. The first one is that I put @GraphId on my id property. This tells Spring Data Neo4j that this is an identifier that we can use for lookups. The other one is the @Indexed annotation, which (surprise) creates an index for the property in question. This is useful when you want an alternative to ID-based lookup.

Now we’ll look at relationships. Speaking broadly, there are simple relationships and more advanced relationships. We’ll start with the simple ones.

Simple relationships. At a low level, Neo4j is a graph database, so we can talk about the graph in graph theoretical terms like nodes, edges, directed edges, DAGs and all that. But here we’re using graphs for domain modeling, so we interpret low-level graph concepts in terms of higher-level domain modeling concepts. The language that Spring Data Neo4j uses is “node entity” for nodes, and “relationships” for edges.

Our Person CI has a simple relationship, called REPORTS_TO, that relates people so we can model reporting hierarchies. Person has two fields for this relationship: manager and directReports. These are opposite sides of the same relationship. We use @RelatedTo(type = "REPORTS_TO") to annotate these fields. The annotation has a direction element as well, whose default value is Direction.OUTGOING, which means that “this” node is the edge tail. That’s why we specify direction = Direction.INCOMING explicitly for the directReports field.
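
If the tail/head business feels abstract, here's a rough plain-Java sketch of the same idea. It stores directed "employee REPORTS_TO manager" edges in a list and reads them in both directions. ReportsToGraph and its methods are hypothetical names for illustration only; this is not how Spring Data Neo4j or Neo4j store anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of directed REPORTS_TO edges: tail -> head.
final class ReportsToGraph {
    private final List<String[]> edges = new ArrayList<String[]>();

    // Records "employee REPORTS_TO manager"; the employee is the edge tail.
    void addReportsTo(String employee, String manager) {
        edges.add(new String[] { employee, manager });
    }

    // OUTGOING view: follow the edge that starts at this person.
    String managerOf(String person) {
        for (String[] e : edges) {
            if (e[0].equals(person)) {
                return e[1];
            }
        }
        return null; // no outgoing REPORTS_TO edge
    }

    // INCOMING view: collect the tails of edges that end at this person.
    Set<String> directReportsOf(String person) {
        Set<String> reports = new TreeSet<String>();
        for (String[] e : edges) {
            if (e[1].equals(person)) {
                reports.add(e[0]);
            }
        }
        return reports;
    }
}
```

One set of edges answers both questions, which is the point of mapping manager and directReports onto the same relationship type with opposite directions.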

What’s this look like in the database? Neoclipse reveals all. Here are some example reporting relationships:

(Small aside: there’s a @Fetch annotation–we’ll see it in a moment–that tells Spring Data Neo4j to eager load a related entity. For some reason I’m not having to use it for the manager and direct reports relationships, and I’m not sure why. If somebody knows, I’d appreciate the explanation.)

Relationship entities. Besides the REPORTS_TO relationship between people, we care about the MEMBER_OF relationship between people and projects. This one’s more interesting than the REPORTS_TO relationship because MEMBER_OF has an associated property–role–that’s analogous to adding a column to a link table in an RDBMS, as I mentioned in my reply to Brig in the last post. The Person.memberOf() method provides a convenient way to assign a person to a project using a special ProjectMembership “relationship entity”. Here’s the code:

package org.skydingo.skybase.model.relationship;

import org.skydingo.skybase.model.Person;
import org.skydingo.skybase.model.Project;
import org.springframework.data.neo4j.annotation.*;

@RelationshipEntity(type = "MEMBER_OF")
public class ProjectMembership {
    @GraphId private Long id;
    @Fetch @StartNode private Person person;
    @Fetch @EndNode private Project project;
    private String role;

    public ProjectMembership() { }

    public ProjectMembership(Person person, Project project, String role) {
        this.person = person;
        this.project = project;
        this.role = role;
    }

    public Person getPerson() { return person; }

    public void setPerson(Person person) { this.person = person; }

    public Project getProject() { return project; }

    public void setProject(Project project) { this.project = project; }

    public String getRole() { return role; }

    public void setRole(String role) { this.role = role; }

    // ... equals(), hashCode(), toString() ...

}

ProjectMembership, like Person, is an entity, but it’s a relationship entity. We use @RelationshipEntity(type = "MEMBER_OF") to mark this as a relationship entity, and as with the Person, we use @GraphId for the id property. The @StartNode and @EndNode annotations indicate the edge tail and head, respectively. @Fetch tells Spring Data Neo4j to load the nodes eagerly. By default, Spring Data Neo4j doesn’t eagerly load relationships, since doing so risks loading the entire graph into memory.

Create the PersonRepository

Here’s our PersonRepository interface:

package org.skydingo.skybase.repository;

import java.util.Set;
import org.skydingo.skybase.model.Person;
import org.skydingo.skybase.model.Project;
import org.springframework.data.neo4j.annotation.Query;
import org.springframework.data.neo4j.repository.GraphRepository;

public interface PersonRepository extends GraphRepository<Person> {

    Person findByUsername(String username);

    @Query("start project=node({0}) match project<--person return person")
    Set<Person> findByProject(Project project);

    @Query(
        "start person=node({0}) " +
        "match person-[:MEMBER_OF]->project<-[:MEMBER_OF]-collaborator " +
        "return collaborator")
    Set<Person> findCollaborators(Person person);
}

I noted in the last post that all we need to do is extend the GraphRepository interface; Spring Data generates the implementation automatically.

Spring Data repositories

For findByUsername(), Spring Data can figure out what the intended query is there. For the other two queries, we use @Query and the Cypher query language to specify the desired result set. The {0} in the queries refers to the finder method parameter. In the findCollaborators() query, we use [:MEMBER_OF] to indicate which relationship we want to follow. These return Sets instead of Iterables to eliminate duplicates.

Create the web controller

We won’t cover the entire controller here, but we’ll cover some representative methods. Assume that we’ve injected a PersonRepository into the controller.

Creating a person. To create a person, we can use the following:

@RequestMapping(value = "", method = RequestMethod.POST)
public String createPerson(Model model, @ModelAttribute Person person) {
    personRepo.save(person);
    return "redirect:/people?a=created";
}

Once again, we’re ignoring validation. All we have to do is call the save() method on the repository. That’s how updates work too.

Finding all people. Next, here’s how we can get all people:

@RequestMapping(value = "", method = RequestMethod.GET)
public String getPersonList(Model model) {
    Iterable<Person> personIt = personRepo.findAll();
    List<Person> people =
        new ArrayList<Person>(IteratorUtil.asCollection(personIt));
    Collections.sort(people);
    model.addAttribute(people);
    return "personList";
}

We have to do some work to get the Iterable that PersonRepository.findAll() returns into the format we want. IteratorUtil, which comes with Neo4j (org.neo4j.helpers.collection.IteratorUtil), helps here.
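
If you'd rather not depend on IteratorUtil for this, the same massaging is only a few lines of plain Java. This is a sketch, not what Skybase actually uses; IterableSupport and toSortedList() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Neo4j or Skybase.
final class IterableSupport {
    // Copies any Iterable into a new list and sorts it by natural order.
    static <T extends Comparable<? super T>> List<T> toSortedList(Iterable<T> items) {
        List<T> list = new ArrayList<T>();
        for (T item : items) {
            list.add(item);
        }
        Collections.sort(list);
        return list;
    }
}
```

Since Person implements Comparable<Person>, a call along the lines of IterableSupport.toSortedList(personRepo.findAll()) would hand back the sorted list the view needs.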

Finding a single person. Here we want to display the person details view we described above. As with findAll(), we have to do some of the massaging ourselves:

@RequestMapping(value = "/{username}", method = RequestMethod.GET)
public String getPersonDetails(@PathVariable String username, Model model) {
    Person person = personRepo.findByUsername(username);
    List<ProjectMembership> memberships =
        CollectionsUtil.asList(person.getMemberships());
    List<Person> directReports =
        CollectionsUtil.asList(person.getDirectReports());
    List<Person> collaborators =
        CollectionsUtil.asList(personRepo.findCollaborators(person));

    Collections.sort(directReports);
    Collections.sort(collaborators);

    model.addAttribute(person);
    model.addAttribute("memberships", memberships);
    model.addAttribute("directReports", directReports);
    model.addAttribute("collaborators", collaborators);

    return "personDetails";
}

If you want to see the JSPs, check out the Skybase GitHub site.

Configure the app

Finally, here’s my beans-service.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:context="http://www.springframework.org/schema/context"
    xmlns:neo4j="http://www.springframework.org/schema/data/neo4j"
    xmlns:p="http://www.springframework.org/schema/p"
    xmlns:tx="http://www.springframework.org/schema/tx"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
        http://www.springframework.org/schema/context
        http://www.springframework.org/schema/context/spring-context-3.0.xsd
        http://www.springframework.org/schema/data/neo4j
        http://www.springframework.org/schema/data/neo4j/spring-neo4j-2.0.xsd
        http://www.springframework.org/schema/tx
        http://www.springframework.org/schema/tx/spring-tx-3.0.xsd">

    <context:property-placeholder
        location="classpath:/spring/environment.properties" />
    <context:annotation-config />
    <context:component-scan base-package="org.skydingo.skybase.service" />

    <tx:annotation-driven mode="proxy" />

    <neo4j:config storeDirectory="${graphDb.dir}" />
    <neo4j:repositories base-package="org.skydingo.skybase.repository" />
</beans>

Spring Data Neo4j has a basic POJO-based mapping model and an advanced AspectJ-based mapping model. In this blog post we’ve been using the basic POJO-based approach, so we don’t need to include AspectJ-related configuration like <context:spring-configured />.

There you have it–a Person CI backed by Neo4j. Happy coding!

To see the code in more detail, or to get involved in Skybase development, please see the Skybase GitHub site.

 

DevOps: Flexible configuration management? Not so fast!

You’re in charge of establishing a department-wide deployment automation capability. Your fellow developers are excited about it, and their managers are too. There is no shortage of ideas on how it might work:

  • “Let us create our own workflows!”
  • “We should be able to configure our own servers.”
  • “It should be able to deploy from Nexus, Artifactory, S3, or whatever we choose.”
  • “We can finally use the app versioning scheme my team likes.”
  • “My team should get to do parallel installs if we want.”
  • “We should have open APIs so anybody can execute their own deployment solution.”
  • “Each team should be able to configure the middleware for their application’s needs.”

Developers hate being told how to do things, so there is a general consensus that if you can make this deployment tool as flexible as possible, you’ll be able to build the best deployment automation system the world has ever seen.

Sounds great, except that it’s totally wrong.

Flexibility kills quality

Drift is evil.  Drift causes downtime and rollbacks.  Flexibility creates drift.

I’ve been involved in data center migration projects where almost every server in a production farm was configured differently. It’s amazing the application even worked! On many other occasions we have rolled back code because the QA and Prod configurations were so different that our testing failed to uncover critical bugs. Although these environments sound ridiculous, I’m confident that they represent a common scenario across enterprise environments. I will also state that we had talented systems administrators managing the environments. Unfortunately, each one had the flexibility to manage the systems to their liking.

Our initial investment in deployment automation (and what became devops) was largely driven by a need to eliminate drift and increase availability.  We knew automated deployments should be driven by data, and server instance data would be sourced from a CMDB.  However, we quickly realized that our CMDB schema allowed for configuration drift. This led to one of our first devops principles:  Don’t manage problems that you can eliminate.

Eliminate drift with inflexible schema data.  Tools from operations teams tend to be server or device centric and we wanted our deployment automation to be app and farm centric.  In other words, we wanted to deploy apps to a farm entity, where the server instances are attributes of the farm.  However, we found traditional schema for configuration data was very flexible.  The diagram below shows a typical farm with multiple instances, and each instance has an OS version.  Since the OS version can be independently selected for each instance, the schema allows the ability to represent drift across the farm.  While architecting our app deployment CMDB (interestingly named deathBURRITO), we specifically did not want to manage farm configuration consistency.  We simply wanted a guarantee that our farm deployments did not have drift.

A typical CMDB schema that allows farm drift.

To achieve this we made a simple change to the schema that did just that: it prevented the data from representing farm drift (picture below). You can still record the wrong value for a farm attribute, but then the data-driven deployment is wrong for every instance in the farm. The farm is either 100% right or 100% wrong, never a mix.

A better CMDB schema that prevents farm drift.
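
In code terms, the schema change amounts to something like the following sketch. The Farm and Instance classes here are hypothetical illustrations, not the actual deathBURRITO model, and the OS version used in any example is just a sample value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical schema sketch: the OS version lives on the farm only.
final class Farm {
    private final String name;
    private final String osVersion;
    private final List<Instance> instances = new ArrayList<Instance>();

    Farm(String name, String osVersion) {
        this.name = name;
        this.osVersion = osVersion;
    }

    String getName() { return name; }

    String getOsVersion() { return osVersion; }

    // Instances are created through the farm, so each belongs to exactly one farm.
    Instance addInstance(String hostname) {
        Instance instance = new Instance(hostname, this);
        instances.add(instance);
        return instance;
    }

    List<Instance> getInstances() { return instances; }
}

final class Instance {
    private final String hostname;
    private final Farm farm;

    Instance(String hostname, Farm farm) {
        this.hostname = hostname;
        this.farm = farm;
    }

    String getHostname() { return hostname; }

    // No osVersion field here: the value is always read from the farm.
    String getOsVersion() { return farm.getOsVersion(); }
}
```

Because Instance has nowhere to store its own OS version, two instances in the same farm can't disagree about it. The data model simply has no way to express that drift.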

Gratuitous flexibility and useful flexibility

Eliminating schema flexibility to control drift is not that controversial since most people get it — and support it.  When you start limiting personal preference, man look out, people get really passionate over stupid things.  So we started communicating another one of our devops principles:  flexibility is not always a good thing.

Your deployment automation should start with inflexibility and provide flexibility as needed.  Don’t get me wrong, we absolutely support innovation and the ability to empower our department with tools that enable creativity.  I often confuse the hell out of people by saying weird stuff like, “by limiting your flexibility, I can offer you more flexibility.”  And I actually mean it — because we focus on the flexibility that is actually valuable.  The objective is to distinguish between value-added flexibility and gratuitous flexibility, and eliminate the gratuitous junk.

  • Value-added flexibility can be represented by a middleware option between Tomcat, JBoss and Glassfish.  Each solution provides different features to the development team and they should have the ability to choose the best match (within reason) for developing to application requirements.  Easy enough, there is value to the options.
  • Gratuitous flexibility can be represented by allowing multiple install directory variations for each Tomcat app.  SysAdmins usually have a preference and sometimes make it a very passionate preference.  Although the configuration matters, it should support automation and security, not personal preference.  There is no inherent value gained by allowing your environment to have different install directories such as /opt,  /app,  /u01.  In fact, allowing options creates complexity for install scripts, logging, permissions, service accounts, monitoring etc. Pick one and restrict the rest.

One of the great things about automation is the ability to make the deployment platform deliver what you want, and fail what you don’t want. It’s a platform that gives the devops team enforcement power in the IT department that is rarely available. Like most organizations, you probably have many awesome design standards that are drafted, but in effect are just glorious shelfware documents. Automation empowers your ability to eliminate drift, control flexibility and operationalize the shelfware.
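
As a rough illustration of "fail what you don't want," a deployment platform could run a check like the one below before installing anything. This is only a sketch: InstallDirPolicy is a hypothetical name, and /opt is simply one of the directories mentioned above, picked here as the assumed standard.

```java
// Hypothetical pre-deployment check; the standard directory is an assumed example.
final class InstallDirPolicy {
    static final String STANDARD_ROOT = "/opt";

    // Returns normally for the one allowed location and fails for anything else.
    static void check(String installDir) {
        boolean allowed = installDir != null
            && (installDir.equals(STANDARD_ROOT)
                || installDir.startsWith(STANDARD_ROOT + "/"));
        if (!allowed) {
            throw new IllegalArgumentException(
                "Install directory must be under " + STANDARD_ROOT + ": " + installDir);
        }
    }

    // Convenience wrapper for callers that want a yes/no answer.
    static boolean isAllowed(String installDir) {
        try {
            check(installDir);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

A deployment targeting /opt/tomcat/myapp passes, while one targeting /u01/tomcat/myapp fails before anything is installed, so the written standard stops being shelfware.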

So back to my statement about limiting flexibility to offer more flexibility?  I will argue that by eliminating all the gratuitous variations, you can simplify environment complexity and eliminate the associated busy work and waste.  I also believe that eliminating the gratuitous variations will allow your devops teams to focus on delivering the value of predictable self-service deployments… Real flexibility is the ability to provide your developer and test teams self-service deployments at any time, over weekends and around the clock.

Why I’m pretty excited about using Neo4j for a CMDB backend

Skybase is my first open source configuration management database (CMDB) effort, but it’s not the first time I’ve built a CMDB. At work a bunch of us built–and continue to build–a proprietary, internal CMDB called deathBURRITO as part of our deployment automation effort. We built deathBURRITO using Java, Spring, Hibernate and MySQL. deathBURRITO even has a robotic donkey (really) whose purpose we haven’t quite yet identified.

So far deathBURRITO has worked out well for us. Some of its features–namely, those that directly support deployment automation–have proven more useful than others. But the general consensus seems to be that deathBURRITO addresses an important configuration management (CM) gap, where previously we were “managing” CM data on a department wiki, in spreadsheets, in XML files and in Visio diagrams. While there’s more work to do, what we’ve done so far has been reasonably right-headed, and we’ve been able to evolve it as our needs have evolved.

That’s not to say that there’s nothing I would change. I think there’s an opportunity to do something better on the backend. That was indeed the impetus for Skybase.

Revisiting the backend with Spring Data

Because CM involves a lot of different kinds of entities (e.g., regions, data centers, environments, farms, instance types, machine images, instances, applications, middleware, packages, deployments, EC2 security groups, key pairs–the list goes on and on), it was helpful to build out a bit of framework to handle various cross-cutting concerns more or less automatically, such as standard views (list view, details view, form views), web controllers (CRUD ops, standard queries), DAOs (again, CRUD ops and queries), generic domain object capabilities, security and more. And I think it’s fair to say that this helped, though perhaps not quite as much as I would have liked (at least not yet). The fact remains that anytime we want to add a new entity to the system, we still have to do a fair amount of work making each layer of the framework do what it’s supposed to do.

My experience has been that the data persistence layer is the one that’s most challenging to change. Besides the actual schema changes, we have to write data migration scripts, we have to make corresponding changes to our integration test data scripts, we have to make sure Hibernate’s eager- and lazy-loading are doing the right things, sometimes we have to change the domain object APIs and associated Hibernate queries, etc. Certainly doable, but there’s generally a good deal of planning, discussion and testing involved.

So I started looking for some ways to simplify the backend.

Recently I started goofing around with Spring Data, and in particular, Spring Data JPA and Spring Data MongoDB. For those who haven’t used Spring Data, it offers some nice features that I wanted to try out:

  • One of the features is that it’s able to generate DAO implementations automagically using Java’s dynamic proxy mechanism. Spring Data calls these DAOs “repositories”, which is fine with me. Basically what you do as a developer is you write an interface that extends a type-specific Repository interface, such as a JpaRepository or a MongoRepository. You don’t have to declare CRUD operations because the Repository interface already declares the ones you’d typically want (e.g., save(), findAll(), findById(), delete(), deleteAll(), exists(), count(), etc.).
  • And if you have custom queries, just declare them on the interface using some method naming conventions and Spring Data can figure out how to generate an implementation for you.
  • In some cases, Spring Data handles mapping for you. In the case of Spring Data JPA you still have to make sure you have the right JPA annotations in place, and that’s not too terribly bad. But for MongoDB, since the MongoDB backend is just BSON (binary JSON), the mapping from objects to MongoDB is straightforward and so Spring Data handles that very well, without requiring a bunch of annotations.
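
To give a feel for how an interface can get an implementation without anybody writing one, here's a toy sketch built on java.lang.reflect.Proxy. It is emphatically not Spring Data's implementation: the User class, UserRepository interface, InMemoryRepositoryHandler and Repositories factory are all hypothetical, the "database" is an in-memory list, and the only naming convention it understands is findByXxx(value) mapped onto getXxx().

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity and repository interface, for illustration only.
final class User {
    private final String username;
    private final String email;

    User(String username, String email) {
        this.username = username;
        this.email = email;
    }

    public String getUsername() { return username; }

    public String getEmail() { return email; }
}

interface UserRepository {
    User save(User user);
    List<User> findAll();
    User findByUsername(String username);
    User findByEmail(String email);
}

// In-memory handler: implements save/findAll directly and derives
// findByXxx(value) from the method name by calling getXxx() on each entity.
final class InMemoryRepositoryHandler implements InvocationHandler {
    private final List<Object> store = new ArrayList<Object>();

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        String name = method.getName();
        if (name.equals("save")) {
            store.add(args[0]);
            return args[0];
        }
        if (name.equals("findAll")) {
            return new ArrayList<Object>(store);
        }
        if (name.startsWith("findBy") && args != null && args.length == 1) {
            String getter = "get" + name.substring("findBy".length());
            for (Object entity : store) {
                Method accessor = entity.getClass().getMethod(getter);
                accessor.setAccessible(true);
                if (args[0].equals(accessor.invoke(entity))) {
                    return entity;
                }
            }
            return null; // no match
        }
        throw new UnsupportedOperationException(name);
    }
}

final class Repositories {
    // Builds a proxy object that implements the given repository interface.
    static <T> T create(Class<T> repositoryInterface) {
        Object proxy = Proxy.newProxyInstance(
            repositoryInterface.getClassLoader(),
            new Class<?>[] { repositoryInterface },
            new InMemoryRepositoryHandler());
        return repositoryInterface.cast(proxy);
    }
}
```

Calling Repositories.create(UserRepository.class) hands back a working UserRepository even though no class in the sketch implements that interface by hand. The real thing does far more (query derivation, paging, transactions), but the developer experience is the same: declare the interface and go.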

“Game changer” might be too strong for the features I’ve just described, but man, are they useful. Besides making it easier and faster to implement repos, Spring Data allows me to get rid of a lot of boilerplate code (lots of DAO implementations) and even some Spring Data-like DAO framework code that I usually write. (You know, define a generic DAO interface, a generic abstract DAO class, etc.) My days of building DAOs directly against Hibernate are probably over.

But that’s not the only backend change I’m looking at.

Revisiting the backend, part 2: Neo4j + Spring Data Neo4j

Example of a graph in Neo4j

Since Spring Data tends to gravitate toward the NoSQL stores, I finally got around to reading up on Neo4j and some other stuff that I probably ought to have read up on a long time ago. Better late than never I guess. It struck me that Neo4j could be a very interesting way to implement a CM backend, and that Spring Data Neo4j could help me keep the DAO layer thin. Here’s the thinking:

  • There are many entities and relationships. There are lots and lots of entities and relationships in a CMDB. There are different ways of grouping things, like grouping which assets form a stack, which stacks to deploy to which instances, which instances are in which farms, how to define shards and how to define failover farms, grouping features into apps, endpoints into services, etc. We have to associate SLAs and OLAs with app features and service endpoints, we have to define dependencies between components, and so on. There’s a ton of stuff. It would eliminate a major chunk of work if we could get away with defining the schema one time (say, in the app itself) instead of in both the app and the database.
  • We need schema agility to experiment with different CMDB approaches. Along similar lines, there are some strong forces that push for schema agility. Most fundamentally, the right approach to designing and implementing a CMDB isn’t necessarily a solved problem. Some tools involve running network discovery agents that collect up millions of configuration items (CIs) from the environment. Other tools focus more on collecting the right data rather than collecting everything. Some tools have more of a traditional enterprise data center customer in mind, where you’re capturing everything from bare metal to the apps to the ITIL services based on the apps. Other tools are more aligned with cloud infrastructures, starting from virtualized infrastructure and working up, leaving everything under that virtualization layer for the cloud provider to worry about. Some tools treat CIs as supporting configuration management but not application performance management (APM); other tools try to single-source their CM and APM data. Some organizations want to centralize CIs in a single database and other organizations pursue a more federated model. With such a panoply of approaches, it’s no surprise that schema changes occur.
  • We need schema agility to accommodate continuing innovations in infrastructure. Another schema agility driver is the fact that the way we do infrastructure is rapidly evolving. Infrastructure is steadily moving to the cloud, virtualization is commonplace and new automation and monitoring tools are appearing all the time. The constant stream of innovation demands the ability to adjust quickly, and schema flexibility is a big part of that.
  • We need schema flexibility to accommodate the needs of different organizations. Besides schema agility, a CMDB needs to offer a certain level of flexibility to support the needs of different customers. One org may have everything in the cloud, where another one is just getting its feet wet. One org may have two environments whereas another has seven. Different orgs have different roles, test processes, deployment pipelines and so forth. Any CMDB that hopes to support more than just a single customer needs to support some level of flexibility.
  • But we still need structure. One of the more powerful benefits of driving deployment automation from a CMDB is that you can control drift in the target environments by controlling the data in your CMDB. And defining a schema around that is a great way to do so. If, for example, you expect every instance in a given farm to have the same OS, then your schema should attach the OS to the farm itself, not to the instance. (The instances inherit the OS from the farm.) Or maybe it’s not just the OS–you want a certain instance type (in terms of virtual hardware profile), certain machine image, certain middleware stack, certain security configuration, etc. Great: bundle those together in a single server build object and then associate that with the farm. This constrains the instances in the farm to have the same server build. (See this post for a more detailed discussion.) While Neo4j is schema-free, Spring Data Neo4j gives us a way to impose schema from the app layer.
  • A schemaless backend makes zero-downtime deployments easier. If you have any apps with hardcore availability requirements, then it goes without saying that you have to have significant automation in place to support that, both on the deployment side and on the incident response side. Since the CMDB is the single source of truth driving that automation, it follows that the CMDB itself must have hardcore availability requirements. Having a schemaless database ought to make it easier to perform zero-downtime deployments of the CMDB itself.
  • We want to support intuitive querying. Once you bring a bunch of CIs together in a CMDB, there are all kinds of queries that pop up. This could be anything from using the dependency structure to diagnose an incident, assess the impact of a change, or sequence project deliverables (e.g. make system components highly available depending on where they sit in the system dependency graph); using org structures to determine support escalation paths and horizontal communications channels; SLA reporting by group, manager, application category; and so forth. Intuitive query languages for graph databases–and in particular, the Cypher query language for Neo4j–appear to be especially well-suited for the broad diversity of queries we expect to see.
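
As an example of the kind of dependency query I have in mind, here's a plain-Java sketch of "what is impacted if this component changes?" It walks the dependency graph breadth-first. DependencyGraph and its methods are hypothetical names, and in a real Neo4j-backed CMDB you'd presumably express this as a graph query rather than hand-rolling the traversal.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical dependency graph: an edge "a DEPENDS_ON b" is stored under b,
// so we can walk from a changed component to everything that relies on it.
final class DependencyGraph {
    private final Map<String, List<String>> dependents =
        new HashMap<String, List<String>>();

    void addDependency(String component, String dependsOn) {
        List<String> list = dependents.get(dependsOn);
        if (list == null) {
            list = new ArrayList<String>();
            dependents.put(dependsOn, list);
        }
        list.add(component);
    }

    // Breadth-first walk over incoming DEPENDS_ON edges, direct and transitive.
    Set<String> impactedBy(String changed) {
        Set<String> impacted = new TreeSet<String>();
        Deque<String> queue = new ArrayDeque<String>();
        queue.add(changed);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            List<String> next = dependents.get(current);
            if (next == null) {
                continue;
            }
            for (String component : next) {
                // Set.add() returns false for already-visited components,
                // which also keeps the walk from looping on cycles.
                if (impacted.add(component)) {
                    queue.add(component);
                }
            }
        }
        return impacted;
    }
}
```

With a graph where a web app depends on an order service and both the order service and a reporting tool depend on a database, asking what a database change impacts returns all three, including the web app that only depends on it indirectly.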

Remember how I mentioned that Spring Data generates repository implementations automatically based on interfaces? Here’s what that looks like with Spring Data Neo4j:

package org.skydingo.skybase.repository;

import org.skydingo.skybase.model.Person;
import org.skydingo.skybase.model.Project;
import org.springframework.data.neo4j.annotation.Query;
import org.springframework.data.neo4j.repository.GraphRepository;

public interface ProjectRepository extends GraphRepository<Project> {
    Project findProjectByKey(String key);
    Project findProjectByName(String name);

    @Query("start person=node({0}) match person-->project return project")
    Iterable<Project> findProjectsByPerson(Person person);
}

Let me repeat that I don’t have to write the repository implementation myself. GraphRepository comes with various CRUD operations. Methods like findProjectByKey() and findProjectByName() obey a naming convention that allows Spring Data to produce the backing query automatically. And in the findProjectsByPerson() case, I provided a query using Neo4j’s Cypher query language (it uses ASCII art to define queries–how ridiculously cool is that?).

Exploring the ideas above with Skybase

Skybase dashboard

The point of Skybase is to see whether we can build a better mousetrap based on the ideas above. I’m using Neo4j and Spring Data Neo4j to build it out. I haven’t decided yet whether Skybase will focus on the CMDB piece or whether it’s a frontend to configuration management more generally (delegating on the backend to something like Chef or Puppet, say), but the CMDB will certainly be in there. That will include a representation of the as-is (current) configuration as well as representations for desired configurations as might be defined during a deployment planning activity.

So far I’m finding it a lot easier to work with the graph database than with a relational database, and I’m finding Spring Data Neo4j to be a big help in terms of repository building and defining app-level schemas. The code is a lot smaller than it was when I did this with a relational database. But it’s still early days, so the jury is out.

Watch this space for further developments.

Skybase screenshots

One of the things we’re working on is an open source configuration management system called Skybase. It’s a new project, and we’re still working out a lot of the high-level vision for this thing. I’ll post more information about it later, but for now I just wanted to post some screenshots as a way for people to see how it looks. We’re using Twitter Bootstrap for the CSS (very nice toolkit–love it), and Spring/Neo4j (along with Spring Data Neo4j) on the backend. So far the Neo4j graph data is working out great for configuration management data.

Anyway here are the screenshots.

First, the dashboard:

Here’s the project details page:

And here’s the page that allows you to create a new project:

 

Welcome to the Skydingo blog

Hello! This blog is where we’ll post thoughts related to devops, configuration management, application performance management, automation and much more. Some of the information will be higher level, “best practice” type of information. And some of it will be pure development geekery, since we do some open source development.

Enjoy!