vCenter Platform Services Controller high availability

vSphere 6 External Platform Services Controller with or without a load balancer

vSphere 6 introduces a distributed architecture for SSO and some other services. Recommended topologies, such as those in KB 2108548, all point to an external Platform Services Controller (PSC) and at first look seem to indicate that high availability requires a third-party load balancer.

But is a load balancer really required? If manual repointing can be done in 5 or 10 minutes, is that acceptable downtime for your vCenter?

[Figure: vSphere 6 PSC topologies, PSC multisite failure scenarios]

Besides the additional complexity of using load balancers and the limited choice of vendors, if you make a support call the load balancer itself is not covered by VMware (KB 2113115, KB 2112736).

Additionally, VMware seems committed to providing HA functionality without a third-party load balancer. For example, in the VMworld 2015 session INF4945 – vCenter Server 6 High Availability, one of the tech previews demonstrated vCenter HA without a load balancer.

 

 

What happens when the PSC is down and no load balancer is used? 

  • If the PSC 6.0 server is down, you cannot log in to vCenter Server.
  • Existing connections and user sessions to vCenter Server remain active, but once a user session ends, logging on again is not possible.
  • vCenter Server services remain up and running, but cannot be restarted.

If the PSC's services cannot be restored, vCenter Server must be repointed to an operational PSC. This is fully supported and not difficult.
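Before repointing, it can help to confirm that the PSC is really unreachable from the vCenter appliance. The following is a minimal sketch, not a VMware tool: the helper names `ls_host_from_url` and `port_open` are hypothetical, the URL format is assumed to look like `https://psc01.dca.local:443/lookupservice/sdk`, and it assumes `bash` and the coreutils `timeout` command are available in the appliance's bash shell.

```shell
#!/bin/bash
# Sketch: confirm the PSC is unreachable before repointing.
# Assumption: the lookup service URL looks like
#   https://psc01.dca.local:443/lookupservice/sdk

# Extract the host name from a lookup service URL
# (strip the scheme, then everything from the first ':' or '/').
ls_host_from_url() {
    printf '%s\n' "$1" | sed -e 's|^[a-zA-Z]*://||' -e 's|[:/].*$||'
}

# Return 0 if a TCP connection to host ($1) on port ($2) succeeds
# within 3 seconds. Uses bash's /dev/tcp and coreutils "timeout".
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example (run on the VCSA):
#   url=$(/usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost)
#   psc=$(ls_host_from_url "$url")
#   port_open "$psc" 443 && echo "$psc reachable" || echo "$psc NOT reachable"
```

An open port only shows that something is listening on the PSC, not that its services are healthy, so treat a successful probe as a hint rather than proof.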

Please note: if you are using a solution such as SRM, NSX or OpenStack, you could run into issues related to the certificate chain after repointing. See the note at the end of this post.

 

Repointing operations, within a site and between sites

Repointing within the same site is a single command; repointing between sites is more complex and requires you to upload the cmsso-util script first.
(Note: you need to set the default shell to bash to use WinSCP.)

  • KB 2113917 Repointing the VMware vCenter Server 6.0 between External Platform Services Controllers within a Site in a vSphere Domain.

[Figure: vSphere 6 PSC topologies, PSC multisite with HA]

 

  • KB 2131191 Repointing the VMware vCenter Server 6.0 between sites in a vSphere Domain.

[Figure: vSphere 6 PSC topologies, PSC multisite with HA]
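For the within-site case, the single command is `cmsso-util repoint --repoint-psc <psc_fqdn_or_ip>`, run on the vCenter Server Appliance. The sketch below wraps it with a dry-run switch so the command can be reviewed before it is executed. The function name `repoint_same_site` and the `DRY_RUN` variable are hypothetical, and the host name is a placeholder. Check KB 2113917 for the exact procedure for your build. The between-sites procedure from KB 2131191 has more steps and is not covered here.

```shell
# Sketch: same-site repoint wrapper with a dry-run mode.
# Set DRY_RUN=1 to print the command instead of running it.
repoint_same_site() {
    target_psc="$1"
    if [ -z "$target_psc" ]; then
        echo "usage: repoint_same_site <psc_fqdn>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show what would be executed.
        echo "cmsso-util repoint --repoint-psc $target_psc"
    else
        # Repoint this vCenter to the target PSC in the same site.
        cmsso-util repoint --repoint-psc "$target_psc"
    fi
}

# Example:
#   DRY_RUN=1 repoint_same_site psc02.dca.local
```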

Although the repointing operation is quick and straightforward, you may run into issues if you forgot or missed some configuration of the environment after deployment.

 

Checklist for your distributed VCSA/PSC environment

Now this is the important part: be sure to run all of these checks after deployment and before testing repointing.

Check NTP on all appliances (VCSA and PSCs)

ntp.get
ntp.test --servers <xxx>
ntpq -p
service ntp stop
service ntp start
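To compare appliances quickly, the offset column of `ntpq -p` can be reduced to a single number. This is a rough sketch that assumes the standard `ntpq -p` layout (two header lines, offset in milliseconds in the ninth column). The function names and the 1000 ms threshold in the example are arbitrary choices for illustration, not VMware guidance.

```shell
# Print the largest absolute peer offset (in ms) found in
# `ntpq -p` output read from stdin. The first two lines are headers.
max_abs_offset_ms() {
    awk 'NR > 2 { o = $9 + 0; if (o < 0) o = -o; if (o > max) max = o }
         END { printf "%.3f\n", max }'
}

# Return 0 if offset ($1, ms) is within threshold ($2, ms).
offset_within() {
    awk -v o="$1" -v t="$2" 'BEGIN { exit !(o + 0 <= t + 0) }'
}

# Example (run in the bash shell of each appliance):
#   off=$(ntpq -p | max_abs_offset_ms)
#   offset_within "$off" 1000 || echo "clock offset ${off} ms is too large"
```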

Set NTP if required

ntp.get
ntp.server.add --servers 192.168.xx.xx
timesync.set --mode NTP
ntp.get

 

Check name resolution for all appliances

If you are unable to resolve the name of the PSC from your workstation, you will not be able to reach vCenter.
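A small loop makes it easy to test every appliance name in one go. The sketch below assumes a Linux workstation, or the bash shell of an appliance, where `getent hosts` is available. The `RESOLVER` variable and the function name are hypothetical, and the host names in the example are placeholders.

```shell
# Lookup command used for each name; "getent hosts" is an assumption
# that fits most Linux systems. Override RESOLVER to use another tool.
RESOLVER="${RESOLVER:-getent hosts}"

# Try to resolve every name given as an argument.
# Prints one OK/FAIL line per name and returns 1 if any lookup failed.
check_resolution() {
    failed=0
    for name in "$@"; do
        if $RESOLVER "$name" >/dev/null 2>&1; then
            echo "OK   $name"
        else
            echo "FAIL $name"
            failed=1
        fi
    done
    return "$failed"
}

# Example:
#   check_resolution vcsa01.dca.local psc01.dca.local psc02.dca.local
```

Running the same check from the workstation and from each appliance covers both directions of the dependency.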

 

Check all appliances are within the domain

(Note: the option to join a domain is not available for vCenter in the Web Client.)

/opt/likewise/bin/domainjoin-cli query

Join domain if required

/opt/likewise/bin/domainjoin-cli join <domain> <domain admin> <password>

For example, to join the dca.local domain with the dcadmin account:

/opt/likewise/bin/domainjoin-cli join dca.local dcadmin <password>

 

From the VCSA, check which PSC is being used by vCenter

# /usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost

 

From a PSC, check which PSCs are online and which sites are available
# cd /usr/lib/vmware-vmdir/bin
# ./vdcrepadmin -f showservers -h localhost -u Administrator -w <password>

 

From a PSC, check the replication partners
# ./vdcrepadmin -f showpartners -h localhost -u Administrator -w <password>

 

From a PSC, check the replication status
# ./vdcrepadmin -f showpartnerstatus -h localhost -u Administrator -w <password>

This is the interesting one, and you should experiment with it to get a feel for how it responds. The change number may differ on each PSC. Create a user in SSO and then watch the change numbers synchronize; this can take around 30 seconds, and the PSC on the remote site will typically take the longest.
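When watching replication catch up, it can be convenient to reduce the status output to one line per partner. The sketch below assumes that `showpartnerstatus` prints a `Partner: <name>` line followed later by a `Partner is N changes behind.` line for each partner. That format is an assumption, so verify it against the output in your own environment. The function names are hypothetical.

```shell
# Read `vdcrepadmin -f showpartnerstatus` output on stdin and print
# "<partner> <changes_behind>" for each partner.
# Assumed input format per partner:
#   Partner: psc02.dca.local
#   ...
#   Partner is 0 changes behind.
partners_behind() {
    awk '/^Partner:/      { partner = $2 }
         /changes behind/ { print partner, $3 }'
}

# Read partners_behind output on stdin; return 0 only if every
# partner is 0 changes behind.
all_in_sync() {
    awk '$2 != 0 { bad = 1 } END { exit bad }'
}

# Example (run on a PSC, from /usr/lib/vmware-vmdir/bin):
#   ./vdcrepadmin -f showpartnerstatus -h localhost -u Administrator -w <password> \
#       | partners_behind
```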

 

On the VCSA, check the site and domain names
# /usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost

# /usr/lib/vmware-vmafd/bin/vmafd-cli get-domain-name --server-name localhost

 

Check the site name on the target PSC
# /usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost
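Comparing the two site names tells you which repointing procedure applies. The sketch below is only a convenience wrapper around that comparison. The function name is hypothetical, and it assumes that an exact string match is the right test for "same site".

```shell
# Compare the vCenter's site name ($1) with the target PSC's site
# name ($2) and print which repointing KB applies.
repoint_procedure() {
    if [ "$1" = "$2" ]; then
        echo "same site: KB 2113917"
    else
        echo "different sites: KB 2131191"
    fi
}

# Example: collect each value with get-site-name on the VCSA and on
# the target PSC, then compare them:
#   repoint_procedure "<vcsa_site_name>" "<psc_site_name>"
```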

 

Automatic repointing without a load balancer

Check out the following post, which shows how repointing could be automated:

How to automatically repoint & failover VCSA to another replicated Platform Services Controller (PSC)?

 

Important note: using solutions that point to the lookup service

If you are using a solution such as SRM, NSX or OpenStack, you could run into issues related to the certificate chain after repointing.

See KB 2109074 and the martinsvspace post srm-6-nightmare.

I have spent days trying to fix this in my lab. OK, I repointed the PSC and deleted the original, which was a bit too aggressive as a test.
Anyway, be warned: if you are using such a solution, avoiding "Server certificate chain not verified" errors may be more important than getting vCenter up and running.

 

 
