This is a post in the series – What’s new in 6.2? It covers the new features of a pseudo-major NSX release.

Introducing Traceflow

Traceflow adds functionality to the toolbox that NSX provides to help operationalise the NSX network virtualisation platform. It allows varying types of packets to be injected into application topologies and, as the name suggests, traces their flow along the path. Along the way it collects observations of actions, hosts, relevant components, and their names, which helps administrators visualise a topology path.

Tracing within a Layer 2 domain

As an administrator using Traceflow, it is possible to craft a packet with a variety of settings. As seen below, I have picked a source and destination VM on a Logical Switch. Source and destination can be selected on Logical Switches in Unicast or Hybrid mode.

Screen Shot 2015-08-25 at 9.31.41 PM

Here you can select a protocol and then modify additional fields. I have chosen a TCP packet with a source and destination port of 80 for this example. My firewall rules ‘protecting’ my workloads are permit any any.
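To make those choices concrete, here is a minimal sketch of the packet settings as a plain Python object. The class and field names are my own illustration and are not the NSX API schema; the protocol set (TCP, UDP, ICMP) is also an assumption beyond the TCP and ICMP traces used in this post.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical model of the fields chosen in the Traceflow dialog.
# The names are illustrative only and are not the NSX API schema.
@dataclass(frozen=True)
class TracePacket:
    source_vnic: str
    destination: str                 # a vNIC name or an IP address
    protocol: str = "TCP"            # assumed set: TCP, UDP or ICMP
    src_port: Optional[int] = None
    dst_port: Optional[int] = None

    def __post_init__(self) -> None:
        proto = self.protocol.upper()
        if proto not in {"TCP", "UDP", "ICMP"}:
            raise ValueError(f"unsupported protocol: {self.protocol}")
        ports = (self.src_port, self.dst_port)
        if proto == "ICMP":
            # ICMP has no port fields, so reject any that were supplied.
            if any(p is not None for p in ports):
                raise ValueError("ICMP packets do not carry ports")
        else:
            for p in ports:
                if p is None or not 0 <= p <= 65535:
                    raise ValueError(f"{proto} needs ports in 0..65535")


# The packet used in this example: TCP, source and destination port 80.
web_to_app = TracePacket("Web01 NIC1", "App01 NIC1", "TCP", 80, 80)
```

Validating up front like this mirrors what the dialog does for you: a TCP trace without ports, or an ICMP trace with them, never gets as far as being injected.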

Screen Shot 2015-08-25 at 9.57.27 PM

This rule matches traffic from the App-01 Web Tier VMs, which are in a Security Group (matching on a Security Tag), to the individually listed VMs App01, App01, App02, App02. The rule allows all traffic. When the Traceflow is executed the following output is seen:

Screen Shot 2015-08-25 at 9.31.13 PM

At first this looks rather busy, but the following information can be identified from the figure above:

  • SRC: Web01 NIC1 172.16.172.10
  • DST: App01 NIC1 172.16.172.12
  • Packet flow and order of operations
  • Objects between two points

These Virtual Machines are on a VXLAN Logical Segment. This allows administrators to provide Layer 2 connectivity between workloads independent of the underlying infrastructure.

The order of operations as displayed by the figure is as follows:

  1. The Traceflow packet is injected into Web01 vNIC.
  2. Received by the Distributed Firewall protecting the Web01 vNIC
  3. Forwarded (due to permit rule) by Distributed Firewall protecting the Web01 vNIC
  4. Forwarded via VXLAN Tunnel Endpoint of host 192.168.112.11
  5. Received via VXLAN Tunnel Endpoint of host 192.168.112.14 (where App01 currently is located)
  6. Received by the Distributed Firewall protecting the App01 vNIC
  7. Forwarded (due to permit rule) by Distributed Firewall protecting the App01 vNIC
  8. Delivered to destination workload App01.
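The steps above can be sketched as a small data structure. This is a hand transcription of the list, not output from NSX; the `Observation` class, its field names, and the component labels are my own shorthand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    sequence: int
    kind: str        # "Injected", "Received", "Forwarded" or "Delivered"
    component: str


# The eight steps listed above, transcribed by hand.
TRACE = [
    Observation(1, "Injected", "Web01 vNIC"),
    Observation(2, "Received", "Firewall (Web01 vNIC)"),
    Observation(3, "Forwarded", "Firewall (Web01 vNIC)"),
    Observation(4, "Forwarded", "VTEP 192.168.112.11"),
    Observation(5, "Received", "VTEP 192.168.112.14"),
    Observation(6, "Received", "Firewall (App01 vNIC)"),
    Observation(7, "Forwarded", "Firewall (App01 vNIC)"),
    Observation(8, "Delivered", "App01"),
]


def components_in_path(trace):
    """Return each component once, in the order the packet first met it."""
    seen = []
    for obs in sorted(trace, key=lambda o: o.sequence):
        if obs.component not in seen:
            seen.append(obs.component)
    return seen


def was_delivered(trace):
    """True when the final observation in the trace is a delivery."""
    last = max(trace, key=lambda o: o.sequence)
    return last.kind == "Delivered"
```

Collapsing the eight observations into six distinct components is essentially what the Traceflow topology view does: it shows the objects between the two points rather than every individual event.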

That gives administrators visibility of all the objects in the topology between two endpoints.

Identifying the Deniers

So what would happen if the administrator decided to ratchet down security? What would occur if the rule were changed to the one below?

Screen Shot 2015-08-25 at 9.58.15 PM

Time to see how Traceflow reacts. When the administrator runs Traceflow a second time the following output is seen.

Screen Shot 2015-08-25 at 9.59.30 PM

The result shows 1 Dropped observation in red. Something has been blocked. The sequence is as follows:

  1. The Traceflow packet is injected into Web01 vNIC.
  2. Received by the Distributed Firewall protecting the Web01 vNIC
  3. Dropped immediately (due to deny rule) by Distributed Firewall protecting the Web01 vNIC on egress.

The component name for Sequence 2 states that Firewall (Rule 1005) is the culprit. All the objects in the Component Name column are hyperlinked, and following a link reveals more information about the object.

Screen Shot 2015-08-25 at 9.59.45 PM

The hyperlinked drop details show that Rule ID 1005 is the culprit, as suspected. The drop reason is given as FW_RULE.

If this is not the desired behaviour, or the rule should not be enforced on this workload, the administrator can quickly and easily identify the rule and remediate accordingly.
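The same lookup can be sketched in a few lines of Python. The rows are transcribed by hand from the denied trace; only the string "Firewall (Rule 1005)" is quoted from the output, while the other component labels and the function name are my own illustration.

```python
import re
from typing import Optional

# (observation, component) rows in the order they were reported.
# Labels other than "Firewall (Rule 1005)" are illustrative.
DENIED_TRACE = [
    ("Injected", "Web01 vNIC"),
    ("Received", "Firewall"),
    ("Dropped", "Firewall (Rule 1005)"),
]


def blocking_rule(trace) -> Optional[int]:
    """Return the rule ID named by the first Dropped observation, if any."""
    for kind, component in trace:
        if kind == "Dropped":
            match = re.search(r"Rule (\d+)", component)
            return int(match.group(1)) if match else None
    return None


print(blocking_rule(DENIED_TRACE))
```

A trace with no Dropped observation returns `None`, which is the programmatic equivalent of a result with no red entries.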

Layer 3 Traces just got visible

Taking the same approach used for security policies within a single Layer 2 domain, it is also possible to perform Traceflow across routed segments. In this example the administrator decides to

Screen Shot 2015-08-25 at 9.33.14 PM

This Traceflow differs from the last one in two ways: it is an ICMP trace, and the destination is an IP address rather than a VM. The address is attached to the DLR; in this case it is the gateway IP for that subnet. It is local to all hosts in the transport zone to which the Logical Switch and DLR are assigned. When the flow is executed the output below is seen:

Screen Shot 2015-08-25 at 9.35.09 PM

Time to look at the steps occurring here to gain an insight into how the traffic is being processed:

  1. Traceflow packet is injected into the vNIC of Web01 VM
  2. Received by the Distributed Firewall protecting the Web01 vNIC
  3. Forwarded (due to permit rule) by Distributed Firewall protecting the Web01 vNIC
  4. Logical Switch App-01-Flat forwards this packet
  5. Packet is received by App-01-DLR
  6. Packet is returned by App-01-DLR
  7. Logical Switch App-01-Flat forwards this packet
  8. Received by the Distributed Firewall protecting the Web01 vNIC
  9. Forwarded (due to permit rule) by Distributed Firewall protecting the Web01 vNIC

Screen Shot 2015-08-25 at 9.47.57 PM

Like before, it is possible to understand the related objects from the Component Name hyperlink. The observation details below outline the Segment ID and Component Name. It is very handy to know which VXLAN Network Identifier (VNI) is assigned to a Logical Switch.

Screen Shot 2015-08-25 at 9.48.03 PM
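For anyone wondering where that number ends up on the wire, here is a minimal sketch of the 8-byte VXLAN header from RFC 7348, which carries the VNI in a 24-bit field. The value 5001 is a made-up segment ID for illustration, not the one from my lab, and the function names are my own.

```python
import struct

VNI_VALID_FLAG = 0x08  # the "I" bit: the VNI field is valid (RFC 7348)


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    return struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Read the VNI back out of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VNI_VALID_FLAG:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


header = pack_vxlan_header(5001)  # 5001 is a hypothetical segment ID
```

The 24-bit field is why a VXLAN deployment can address roughly 16 million segments, compared with 4094 usable VLAN IDs.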

 

Conclusion

Traceflow is a great addition to the tools within VMware NSX for vSphere. It is born out of a maturing platform and puts actionable information at an administrator’s fingertips. I personally like how I can correlate firewall policies with where a packet stops. I also like that I can inject varying traffic types into my topologies very easily.

VMware NSX for vSphere 6.2 is available now.
