Monday, June 16, 2014

Multicast drop with LACP

Today I had a strange problem on a network with a stack of Cisco 3750X running IOS 15.0(2)SE4.

All unicast traffic was working, but I noticed that multicast traffic (more specifically OSPF Hello packets) between the routers did not pass. One of the routers was connected through trunk ports with many VLANs and LACP.

What I saw was that the firewall could see one router, one router could see both the firewall and the other router, and the second firewall could only see the first router.

This happened after a reload of the entire switch stack, something that I have done many times before, so I cannot tell for sure what I did differently to cause the problem this time. The switch is configured for src-dst-ip load balancing.

This seems to be a bug within the Cisco switch as the quick fix that worked for me was to take down one of the ports in the port-channel and later bring it back up.
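Because the switch uses src-dst-ip load balancing, each OSPF speaker's Hello packets (all sent to 224.0.0.5) can end up on a different member link of the port-channel depending on the source address. The sketch below only illustrates that idea. It assumes a simplified hash (XOR of the low three bits of the source and destination address); the real hash is platform specific and is not taken from Cisco documentation. The function name and the 10.0.0.x addresses are made up for the example.

```python
import ipaddress


def member_link(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Pick a member link index using a simplified src-dst-ip hash.

    Assumption: XOR of the low three bits of both addresses, folded
    onto the number of links. This is NOT the switch's actual hash.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    bucket = (src ^ dst) & 0b111  # eight buckets (assumed)
    return bucket % n_links


OSPF_ALL_ROUTERS = "224.0.0.5"

# Hypothetical source addresses of three OSPF speakers.
for src in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    print(src, "-> member link", member_link(src, OSPF_ALL_ROUTERS, 2))
```

With a hash like this, a fault on a single member link would only affect Hellos from some sources, which would fit the partial neighbor visibility described above.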

Here are some troubleshooting commands that you can use. I did not run them myself, so I am not sure what their output would be.

show platform forward gigabitEthernet 1/0/2 vlan 333 1111.2222.3333 3333.2222.1111
test etherchannel load-balance interface po 2 mac c000.1111.1111 2222.2222.2222
test etherchannel load-balance interface po 2 ip


Cisco IOS 15.0(2)SE6 is buggy

I just upgraded a few Cisco 3750X stacks to this version and all stacks started to misbehave. What I saw before I had to roll back to 15.0(2)SE4 was that:

  • 99 % CPU load, shared between an 802.1x process and a hrpc hlfm request process.
  • Slow console, probably because of the CPU load.
  • The switch did not learn MAC addresses on many ports, both port-channels and standalone ports. (All ports I checked were trunk ports.)
  • MACsec (using the service module) rekeying timed out and dropped its association to the other party. (Using a pre-shared key.)
  • Lots of packet loss.


Friday, June 06, 2014

Cisco hpm main process

This process does not show up often in Cisco documentation, but it can still cause CPU hogs.

%SYS-3-CPUHOG: Task is running for (2098)msecs, more than (2000)msecs (3/1),process = hpm main process.

The “pm” part of the name stands for “Port Manager”. I have still not figured out what the first letter stands for. You will find these processes at least on all Cisco 2960, Cisco 3560 and Cisco 3750 series switches.

The hpm processes (there are three of them, at least on the Cisco 3750 series) handle events related to port changes. This includes link up/link down events, configuration changes etc. If you experience CPU issues related to this process, you should check for flapping ports (cabling) first and then for spanning-tree issues.
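One way to look for flapping ports is to count link-down events per interface in the collected syslog. The following is a minimal sketch, assuming the log contains standard %LINK-3-UPDOWN messages. The function names, the threshold and the sample lines are made up for illustration and do not come from a real switch.

```python
import re
from collections import Counter

# Matches e.g. "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to down"
LINK_RE = re.compile(
    r"%LINK-3-UPDOWN: Interface ([^,\s]+), changed state to (up|down)"
)


def count_flaps(log_lines):
    """Count link-down events per interface."""
    flaps = Counter()
    for line in log_lines:
        match = LINK_RE.search(line)
        if match and match.group(2) == "down":
            flaps[match.group(1)] += 1
    return flaps


def flapping_ports(log_lines, threshold=2):
    """Return interfaces with at least `threshold` link-down events."""
    counts = count_flaps(log_lines)
    return sorted(port for port, n in counts.items() if n >= threshold)


# Illustrative sample lines (hypothetical, not captured output).
sample = [
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to up",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to down",
    "%LINK-3-UPDOWN: Interface GigabitEthernet1/0/5, changed state to down",
]

print(flapping_ports(sample))
```

A sensible threshold depends on how long a period the log covers, so treat the default of 2 as a placeholder.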

Related commands:

  • show platform pm counters
  • debug platform pm hpm-events