Any vmware experts still post here?

Discussion in 'Tech Heads' started by Utumno, May 8, 2020.

  1. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    I kinda doubt it, but I'll post this anyway just in case. Maddening case at work and I haven't touched vmware shit in-depth for a long time so I'm pretty stumped.

    ==================================

    We have a pre-existing flexpod/ucs chassis with 20-ish blades and everything runs beautifully. The general setup on the blades is a standard vswitch0 which handles management port over two physical NICs, and then every other vlan (production, iscsi storage, vmotion) running on a distributed vswitch (with apparently 10-ish Cisco VIC Ethernet NICs serving as the "physical" adapters)

    Now we have a need to expand this cluster with standalone pizza-box style servers, outside the flexpod/ucs chassis. Our network team has trunked production, iscsi, vmotion, and management ports out to a Cisco Nexus switch where we've plugged in the new servers.

    New servers, if set up strictly with standard vswitches (management on vswitch0, and another standard vswitch for everything else), work perfectly... in terms of being able to run production traffic over prod vlan, and participate in iSCSI over storage network. VMs that are shut down can be moved over with no issues, proving all the cabling (and networking setup I believe) is correct. HOWEVER - vmotion won't work over standard vswitches by design right? So we can't vmotion things. We NEED to use distributed switches.

    And that's where the trouble is... if I add one of the new pizza-box hosts to the pre-existing distributed vswitch, everything seems to look okay, but no traffic seems to be going between the pizza-boxes and the Cisco Nexus switch. Logging into the Cisco Nexus switch, I can't even see the MACs of vmkernel ports created on the new server. I can't get iSCSI to work at all, despite proving that identical setup works fine on identical hosts running on standard vswitch.

    Any idea where to start looking here? Does this even make sense? Thanks in advance to any distributed vswitch gurus out there that could point me in the right direction.
     
  2. Agrul

    Agrul TZT Neckbeard Lord

    Post Count:
    47,403
    have you tried using the blades to cut up the pizza boxes? it will make them much easier to dispose of.
     
  3. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    fuck i knew i forgot something
     
  4. Czer

    Czer I'm a poor person. The lambo is my cousin's.

    Post Count:
    30,329
    was something implemented back in the legacy setup that makes the network only pass traffic if it's expecting the prior blade servers?

    if you're not seeing MAC addresses, do you have a test environment to check out what's going on?

    https://pubs.vmware.com/vsphere-50/index.jsp?topic=/com.vmware.vcli.examples.doc_50/cli_manage_networks.11.4.html

    You can create a maximum of 127 virtual switches on a single ESXi host. By default, each ESXi host has a single virtual switch called vSwitch0. By default, a virtual switch has 56 logical ports.
     
  5. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    yeah, a single vSwitch0 on each of the blades - i don't *think* i'm running into any limits in terms of ports or vswitches but i'll double check anyway

    we don't have a test environment really, we have multiple servers to try different things, but pretty much impossible to replicate this monstrous/expensive flexpod/ucs chassis. and to make things extra fun we are moving away from all this stuff within the next year so we already discontinued vmware support. which is why i'm here begging like a hobo lol
     
  6. Solayce

    Solayce Would you like some making **** BERSERKER!!! Staff Member

    Post Count:
    21,660
    Long day and a little late so not ready to tackle thinking through this - plus we moved to Nutanix 2 years ago, with a Proxmox cluster for site services, so I've been trying to learn more about KVM/ZFS/Ceph these days. Not to mention trying to drag my way through DevOps tooling and procedures.

    May need to draw this out so it makes sense. I'll come back tomorrow when i get some time. Hopefully I haven't burned out what little I knew.

    Before I bail, what are the basics? Version of vSphere? On the networking, sounds like there could be issues here. What are you using to bind the ports together? IIRC lacp was the best choice and not Cisco native. And then, even if you are using LACP, don't do auto detect stuff. Manually set both sides of the channel.
     
  7. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    Oh duh, yeah I'm leaving out some stuff. I also am brain fried lol.

    vCenter and vSphere v6.7 I believe.

    Ethernet ports look like this in cisco ios

    interface Ethernet111/1/24
      description -HOST: <pizzaboxhostname>:0
      switchport mode trunk
      switchport trunk native vlan 220
      switchport trunk allowed vlan 222-223,226-227
      spanning-tree port type edge trunk
      no shutdown
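
A trunk like the one above can be checked against the VLAN IDs configured on the distributed port groups. Below is a minimal Python sketch of that check: it expands the `allowed vlan` list and reports any port group whose VLAN ID the trunk would not carry. The port group names and VLAN IDs in `portgroup_vlans` are hypothetical placeholders, not values from this thread, so substitute whatever the dvSwitch actually shows.

```python
from typing import Dict, Optional, Set, Tuple

# Interface config as posted above (hostname placeholder left as-is).
TRUNK_CONFIG = """\
interface Ethernet111/1/24
  description -HOST: <pizzaboxhostname>:0
  switchport mode trunk
  switchport trunk native vlan 220
  switchport trunk allowed vlan 222-223,226-227
  spanning-tree port type edge trunk
  no shutdown
"""


def parse_vlan_list(spec: str) -> Set[int]:
    """Expand a list such as '222-223,226-227' into a set of VLAN IDs."""
    vlans: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def parse_trunk(config: str) -> Tuple[Optional[int], Set[int]]:
    """Return (native VLAN, allowed VLANs) found in one interface stanza."""
    native: Optional[int] = None
    allowed: Set[int] = set()
    for line in config.splitlines():
        words = line.split()
        if len(words) != 5 or words[:2] != ["switchport", "trunk"]:
            continue
        if words[2:4] == ["native", "vlan"]:
            native = int(words[4])
        elif words[2:4] == ["allowed", "vlan"]:
            allowed = parse_vlan_list(words[4])
    return native, allowed


# HYPOTHETICAL port group names and VLAN IDs, for illustration only.
portgroup_vlans: Dict[str, int] = {
    "dvpg-prod": 222,
    "dvpg-iscsi": 226,
    "dvpg-vmotion": 227,
    "dvpg-example": 225,
}

native, allowed = parse_trunk(TRUNK_CONFIG)
not_carried = {
    name: vid for name, vid in portgroup_vlans.items() if vid not in allowed
}
print("native vlan:", native)
print("allowed vlans:", sorted(allowed))
print("port groups whose VLAN is not on the trunk:", not_carried)
```

The sketch only tests membership in the allowed list. It does not model how the native VLAN is handled, and it ignores incremental forms such as `allowed vlan add`.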
     
  8. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    grossly oversimplified diagram

    [IMG]
     
  9. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    view of vlans networking on distributed vswitch

    [IMG]
     
  10. Czer

    Czer I'm a poor person. The lambo is my cousin's.

    Post Count:
    30,329
    Did you look at how the vmkernel port adapters are set up on the blade servers for comparison?

    does the FEX detect the new server at all?
     
  11. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    This may be helpful. I did notice in the vcenter console that the LACP settings are completely blank. I guess we're not using that (and that seems to be fine for the UCS blades).

    [IMG]
     
  12. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    Yes, I did. Like quadruple checked the vmkernel settings to match since I kept thinking the prob was there heh.

    The FEXs detect the new servers just fine as long as I use port groups on standard vswitches.

    They don't detect shit when I use the distributed vswitch... even though there has to be SOME connectivity working because CDP info populates on Server 2, which is using the distributed vswitch. So clearly the NICs on Server 2 can see SOME stuff from the FEX/Nexus side.
     
  13. Mern is Best

    Mern is Best TZT Addict

    Post Count:
    4,022
    Turn it off then turn it back on
     
  14. Agrul

    Agrul TZT Neckbeard Lord

    Post Count:
    47,403
    hire me + mern to debug ur shit for u
     
  15. Solayce

    Solayce Would you like some making **** BERSERKER!!! Staff Member

    Post Count:
    21,660
    This still sounds network related.

    Have you added the hosts in vCenter yet - doesn't that populate dvSwitches? You are using NIC Teaming, so verify those settings everywhere, both in ios and hosts/vcenter. For vMotion to work, all networking configuration needs to match; this is why dvSwitches came about. An extra space in the VLAN name between hosts is enough to F this up. Use dvSwitches everywhere.
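
The "extra space in the name" problem is easy to miss by eye, so it may be worth checking mechanically. Here is a rough Python sketch, assuming the port group names per host have already been exported by some means into a dictionary. The host names and labels below are made up for illustration, and the helper names are not part of any VMware tooling.

```python
from collections import defaultdict
from typing import Dict, List


def normalize(name: str) -> str:
    """Collapse runs of whitespace and ignore case, so near-identical labels compare equal."""
    return " ".join(name.split()).casefold()


def find_near_duplicates(
    names_by_host: Dict[str, List[str]]
) -> Dict[str, Dict[str, List[str]]]:
    """Group labels that differ only by whitespace or case.

    Returns {normalized label: {exact spelling: [hosts using it]}} for every
    label that appears with more than one exact spelling.
    """
    variants: Dict[str, Dict[str, List[str]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for host, names in names_by_host.items():
        for name in names:
            variants[normalize(name)][name].append(host)
    return {
        key: dict(spellings)
        for key, spellings in variants.items()
        if len(spellings) > 1
    }


# HYPOTHETICAL inventory: one label on pizza01 has a doubled space.
names_by_host = {
    "blade01": ["VLAN 222 Prod", "vMotion"],
    "pizza01": ["VLAN  222 Prod", "vMotion"],
}

mismatches = find_near_duplicates(names_by_host)
for label, spellings in mismatches.items():
    print(f"label {label!r} is spelled differently across hosts:")
    for spelling, hosts in spellings.items():
        print(f"  {spelling!r} on {', '.join(hosts)}")
```

Labels that match exactly on every host are not reported; only those with more than one spelling show up, which is the case that would trip up a migration between hosts.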
     
  16. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    Yes added already. No, does not automatically populate the dvSwitches until you select that dvSwitch and specifically add the new hosts (which I have already done). That is one of the most puzzling parts, the whole point of dvswitches is to standardize all the things so that setup is somewhat automatic.

    I do think it's network related, maybe a setting on the network side needed specifically for dvswitches that is not needed for standard switches. I think I'm going to just ask my boss to see if we can bring in a hired gun to look into this remotely.
     
  17. Solayce

    Solayce Would you like some making **** BERSERKER!!! Staff Member

    Post Count:
    21,660
    no - on the networking side, it's exactly the same for standard and dv switches. Been years, and I'm not a network engineer, but I have had a CCNA and studied for it several times; plus, I still retain access to our routers and switches, though I only do basic port configs. As I laid out earlier, check/confirm all trunking and NIC Teaming configs. Trunking is a layer 2 config, so not seeing MAC addresses points there. Good luck.
     
  18. Solayce

    Solayce Would you like some making **** BERSERKER!!! Staff Member

    Post Count:
    21,660
    Honestly, maybe try posting on reddit and/or spiceworks.
     
  19. Utumno

    Utumno Administrator Staff Member

    Post Count:
    42,757
    Yeah good call on spiceworks, I never post there but this does seem to be the kind of thing people often help with.
     
  20. Solayce

    Solayce Would you like some making **** BERSERKER!!! Staff Member

    Post Count:
    21,660
    any luck?