I was recording a PacketPushers podcast this morning with Ethan Banks, Stephen Skinner, and Chris Wahl, which was great. We were talking through the tips and war stories associated with deploying VMware NSX. A quote was fired off early in the podcast, and it was as follows:

“..the first thing you do when you get NSX is .. do not deploy NSX”.

This is something that I wholeheartedly agree with. VMware espouse that NSX is quick to install, easy to integrate, and can happily work in a brownfield environment. All of that is true. Deploy the NSX Manager OVA to your management cluster, point it towards vCenter, and happy days: you now have a new inventory item in vCenter called Networking and Security.

The problem lies when people don’t have a design. Charging in like a bull in a china shop, clicking buttons, deploying VTEPs, going next next next, and applying policy at random quickly becomes unruly.

What are you building that you will throw over the fence to ops? What was the design? Did you have a design?

I’ve walked into a couple of environments where customers have deployed:
* mind-bogglingly complex security policy (taking the hardware mentality of security and applying it to NSX)
* complex routing architectures that use NSX Edges but not distributed routing
* crazy L2 bridging and routing scenarios
* Did I mention security rule craziness?

The end result is a customer deploying something very different to what they expected it to be. The end result is the customer not getting full benefits from something they own. The end result is no net benefit moving the function to the hypervisor because they’ve just pushed

Security Policy

Policy is quite the buzzword. It is almost the new ‘SDN’ in terms of hype. Policy this and policy that. Policy is very personal and very much a word that holds most meaning to the individual.

In my world I term policy in the context of security as ‘desired state’. I want to attest a set of rules upon my infrastructure and ensure that they are applied. I want to know that what I express against ‘Web Policy’ is enforced. This has been a dilemma in the past, where rules have for a long time been based on 5-tuple matching: the old SRC/DST IP, SRC/DST port, and protocol type. The firewall has no idea about the platform or what is running on it, other than that one address can talk to another on TCP port 3389.
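To make that limitation concrete, here is a minimal Python sketch of 5-tuple matching. It is not how NSX or any particular firewall implements it; the `FiveTupleRule` class, the subnets, and the single RDP rule are all made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class FiveTupleRule:
    """Hypothetical rule: source/destination network, ports, and protocol."""
    src_net: str
    dst_net: str
    protocol: str
    dst_port: int
    action: str
    src_port: Optional[int] = None  # None means "any source port"


def rule_matches(rule, src_ip, src_port, dst_ip, dst_port, protocol):
    # The rule only ever sees addresses, ports, and protocol.
    # It knows nothing about which VM or application sits behind them.
    return (
        ip_address(src_ip) in ip_network(rule.src_net)
        and ip_address(dst_ip) in ip_network(rule.dst_net)
        and rule.protocol == protocol
        and rule.dst_port == dst_port
        and (rule.src_port is None or rule.src_port == src_port)
    )


def evaluate(rules, packet, default="deny"):
    # First matching rule wins; anything unmatched falls to the default.
    for rule in rules:
        if rule_matches(rule, *packet):
            return rule.action
    return default


# Illustrative rule set: one subnet may RDP (TCP 3389) to another.
RULES = [FiveTupleRule("10.0.1.0/24", "10.0.2.0/24", "tcp", 3389, "allow")]
print(evaluate(RULES, ("10.0.1.15", 50000, "10.0.2.20", 3389, "tcp")))  # allow
```

Nothing in the rule says what the workload is or why it is allowed, which is the gap the object-based approaches are meant to close.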

So where does one start thinking about security in the scope of design? With software like VMware NSX you can enforce security rules at the vNIC. This is great. Every VM has a firewall filter protecting it. Think that last statement through. If every virtual machine has a firewall then I need to move beyond 5-tuple matching.

There are three mindsets I discuss with people with regards to security and NSX. They are grouped into the following:

  • Networking – Rules based on IPSet & MACSet
  • Infrastructure – Rules based on vCenter objects (Datastore, VM Name, VM type, Network port (VXLAN/VLAN))
  • Application – Service orientated focusing on Security group membership via Security tags.

Looking at this as a pyramid with networking rules on the bottom, this is where we find most of the thinking happens initially. There is the need to define subnets and enforce them against VMs, and the thought of that alone is overwhelming. Network rules work for ingress and egress for elements that are not within the domain of NSX or vCenter. This could include physical objects such as SCADA or end user desktops. If you take the mindset that the industry as a whole has used for over 20 years then you will end up with thousands of firewall rules, overlap, and potential holes in your firewall environment.

Infrastructure rules look to bring administrators and security architects up a level and have them start expressing their security attestation using relevant objects. Building rules based on the networks a workload is connected to is a start. Web-VLAN to App-VLAN. App-VLAN to Shared services-Logical-Switch. Assign a Service Group of protocols. There is a rule that is based on the virtual infrastructure.

You may use the vCenter object ‘Cluster’ and state Deny Cluster-PCI to All other Clusters. This would stop communication from any component on the PCI cluster to all other clusters. You may choose to exempt AD-VM and DNS-VM with a Service Group matching AD and DNS ports to PCI cluster. This is one way to allow explicit communication from Cluster-PCI to the AD and DNS VMs.
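As a rough sketch of that logic, the Python model below evaluates the exemption before the cluster-wide deny. It is a toy model, not the NSX API: the `VM` class, the `decide` function, the `Cluster-Prod` name, and the port list (a small illustrative subset of AD and DNS ports) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    """Hypothetical stand-in for a VM and the vCenter cluster it runs on."""
    name: str
    cluster: str


# Assumed, illustrative subset of AD/DNS ports: DNS 53, Kerberos 88, LDAP 389.
AD_DNS_SERVICE_GROUP = {("udp", 53), ("tcp", 53), ("tcp", 88), ("tcp", 389)}
EXEMPT_VMS = {"AD-VM", "DNS-VM"}


def decide(src: VM, dst: VM, protocol: str, port: int) -> str:
    # Rule 1: the explicit exemption sits above the broad deny.
    if (
        src.cluster == "Cluster-PCI"
        and dst.name in EXEMPT_VMS
        and (protocol, port) in AD_DNS_SERVICE_GROUP
    ):
        return "allow"
    # Rule 2: Deny Cluster-PCI to all other clusters.
    if src.cluster == "Cluster-PCI" and dst.cluster != "Cluster-PCI":
        return "deny"
    # Neither rule applies; other rules (not modelled here) would decide.
    return "no-match"


pci_vm = VM("pci-app-01", "Cluster-PCI")
ad_vm = VM("AD-VM", "Cluster-Mgmt")
print(decide(pci_vm, ad_vm, "tcp", 389))   # allow
print(decide(pci_vm, ad_vm, "tcp", 3389))  # deny
```

Note that no IP address appears anywhere in the rules. The objects carry the meaning, so the rules keep working when a VM is re-addressed.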

Evolving from this, customers and architects eventually iterate their environment to the final, application-centric security model. This is going ‘all in’ on the tools available within NSX: Security Tags, Service Policy, and Security Groups. The anatomy of a service architecture is as follows.

Take a Security Group SG.Internet.Proxy that matches membership based on the Security Tag ST.Internet.Proxy. Applied to SG.Internet.Proxy is a Security Policy named SP.Internet.Proxy. This Security Policy states that ‘Source security group’ to destination Proxy VIP (Squid is load balanced!) is allowed on 8080.

This means if a workload is allowed internet access it is tagged ST.Internet.Proxy and gains the ability to connect to the internet via the Squid Proxy. A workload can carry numerous tags at any given time. This allows a service-based architecture that ensures key security policies are based on broader groups such as ‘Users’, ‘Admins’, and ‘HR’, and then smaller service policies can be used between these larger groups.
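The tag, group, and policy relationship can be sketched in a few lines of Python. This is a conceptual model only and makes no calls to NSX; the `Workload` and `SecurityPolicy` classes, the `Proxy-VIP` destination name, and the assumption that 8080 means TCP are mine.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    """Hypothetical VM with a mutable set of security tags."""
    name: str
    tags: set = field(default_factory=set)


@dataclass(frozen=True)
class SecurityPolicy:
    name: str
    destination: str
    protocol: str
    port: int


# Security Group -> (tag that grants membership, policy applied to the group).
GROUPS = {
    "SG.Internet.Proxy": (
        "ST.Internet.Proxy",
        # TCP is an assumption; the post only says "allowed on 8080".
        SecurityPolicy("SP.Internet.Proxy", "Proxy-VIP", "tcp", 8080),
    ),
}


def effective_policies(workload):
    # A workload inherits the policy of every group whose tag it carries.
    return [policy for tag, policy in GROUPS.values() if tag in workload.tags]


def is_allowed(workload, destination, protocol, port):
    return any(
        (p.destination, p.protocol, p.port) == (destination, protocol, port)
        for p in effective_policies(workload)
    )


web = Workload("web-01", {"ST.Internet.Proxy"})
db = Workload("db-01")
print(is_allowed(web, "Proxy-VIP", "tcp", 8080))  # True
print(is_allowed(db, "Proxy-VIP", "tcp", 8080))   # False

# Granting access is a tagging operation, not a firewall change.
db.tags.add("ST.Internet.Proxy")
print(is_allowed(db, "Proxy-VIP", "tcp", 8080))   # True
```

The rule base never changes when a workload gains or loses access; only the tag does.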

This way of thinking does take time. This is a fundamental shift in the way the industry as a whole, irrespective of product, is looking at security.

I have built policy out and explored grouping mechanisms here.

So coming back onto design. How do you foresee your ‘micro segmentation’ plan? How do you feel you will carve your applications up? A mix of infrastructure and network rules? What about a mix of infrastructure and application rules? Or swim in the deep (and very awesome) end of Application rules?

Design. Whiteboard. Draw it out. Talk it out. Revise.

Earlier I mentioned there were some horror stories. In each instance a more refined, scalable, and automated policy was applied with a high weight: same source and destination, but using more efficient matching and object criteria. This was applied inline, on a per-application basis, which allowed the removal of the network-centric rules.
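Here is a small Python sketch of why that migration is safe, assuming that the policy with the higher weight is evaluated first and that the first match wins. The `Policy` class, the weights, and both rules are hypothetical and only model the idea.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a weight, a match function, and an action."""
    name: str
    weight: int
    matches: Callable[[dict], bool]
    action: str


def evaluate_by_weight(policies, flow, default="deny"):
    # Assumption: higher weight is evaluated first, first match wins.
    for policy in sorted(policies, key=lambda p: p.weight, reverse=True):
        if policy.matches(flow):
            return policy.name, policy.action
    return "default", default


# Old network-centric rule keyed on an address prefix.
legacy = Policy(
    "legacy-ip-rule", 1000,
    lambda f: f["src_ip"].startswith("10.0.1.") and f["dst_port"] == 8080,
    "allow",
)
# Refined rule: same source and destination intent, matched on a tag.
refined = Policy(
    "SP.Internet.Proxy", 5000,
    lambda f: "ST.Internet.Proxy" in f["src_tags"] and f["dst_port"] == 8080,
    "allow",
)

flow = {"src_ip": "10.0.1.15", "src_tags": {"ST.Internet.Proxy"}, "dst_port": 8080}
print(evaluate_by_weight([legacy, refined], flow))  # ('SP.Internet.Proxy', 'allow')
print(evaluate_by_weight([refined], flow))          # ('SP.Internet.Proxy', 'allow')
```

Because the refined policy already wins for the tagged workload, removing the legacy rule does not change the outcome for that application.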

War Stories

There have been horror stories but we are changing the way we approach many things. After all, you do not know what you don’t know. Luckily there are many ways to skin a cat. It is simply that one way results in a fur coat and the other a bloody mess and Dim Sims.

As with any other networking or IT project, there is strong value in understanding what your desired goal is, the outcomes that are tied to it, and what each milestone along the way means. Just because it is done in software and you can install it by simply clicking next next next doesn’t mean you will get a desirable outcome.

Remember – Prior preparation prevents piss poor performance!

4 thoughts on ““..the first thing you do when you get NSX is .. do not deploy NSX”

  1. I whole-heartedly agree with all of this. Even with significant design thought, until you go through the paces and see how it behaves you might find that you either over-designed or under-designed the system. For instance a small environment could easily get away with two ESGs and two DLRs for ALL traffic. We chose to dedicate ESG/DLRs to each business unit, which gave us a seriously reduced surface area of impact when making changes. However, without automation to drive creation of all of this, the likelihood of inconsistency is extremely high.

    People also need to remember, you can do traditional networking (VLANs) and just use the distributed firewall, this is not nearly as hard! You really need to evaluate if overlay networking is even required in your design. If it is required, it must be a SUPPORTABLE configuration.

    1. People also need to remember, you can do traditional networking (VLANs) and just use the distributed firewall, this is not nearly as hard!

      This! You touch on something here that I believe people forget. In the hype of the whole “virtualise all the things” they forget that newer methods of enforcement work independently of the network. Just because NSX offers a whole kitchen sink of features doesn’t mean there is a dependency. I know of three very large global customers who at this stage don’t use Logical Switching or DLRs for their routed topology. They use just DFW, Service Composer, and a partner integration.

      I think there are two approaches that people take:
      1. Take existing brownfield applications and begin work segmenting these with DFW and Service Composer tools.
      2. New application stacks use all the things!

      Thanks for reading.

  2. pandom, what are the top 5 use cases you’re seeing in your world? I saw a good one the other day that segmented Server 2003 workloads into their own policy/SG/microsegment that satisfied a lot of security and audit concerns (being able to forward all traffic to IPS, for example).

    What else have you seen?

      1. Out of date or EOL workloads are one.
        Others include segmenting applications that people may not traditionally have such as Exchange or Sharepoint.
        Data Security around PCI workloads.
        Applying Network services without changing the routing topology.
        Applying Firewall services on VLAN backed networks without changing the routing topology

      Truly depends on the customer. I have worked with some who use the load balancer exclusively!
