WebHCat Failed to Start Due to Abnormal Health Status

Issue

The WebHCat instance fails to start.

Symptom

On Manager, the health status of the WebHCat instance is Faulty, and alarm ALM-12007 Process Fault is generated for the WebHCat instance of the Hive service. An error is reported when the Hive service is restarted.

The /var/log/Bigdata/hive/webhcat/webhcat.log file of the WebHCat instance contains the error messages "Service not found in Kerberos database" and "Address already in use".
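To confirm the symptom on a node, the log can be searched for the two error strings. The following is a minimal sketch; the function name find_webhcat_errors is hypothetical and used only for illustration, while the log path and error strings are the ones given above.

```shell
#!/bin/sh
# Hypothetical helper: print every line (with its line number) of the given
# log file that contains either of the two error strings from the symptom.
find_webhcat_errors() {
    grep -n \
        -e 'Service not found in Kerberos database' \
        -e 'Address already in use' \
        "$1"
}

# Example call, commented out because the log exists only on WebHCat nodes:
# find_webhcat_errors /var/log/Bigdata/hive/webhcat/webhcat.log
```

If the function prints nothing, the log does not contain either message and this procedure may not apply.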

Procedure

Log in to each node where a WebHCat instance resides and check whether the mapping between IP addresses and hostnames in the /etc/hosts file is correct. The hostname configured in the /etc/hostname and /etc/HOSTNAME files must be the same as the hostname in the /etc/hosts file. If they differ, manually correct them.

To view the mapping between the IP addresses and hostnames of the WebHCat instance, log in to FusionInsight Manager and choose Cluster > Services > Hive > Instance.
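The hostname check above can be partly scripted. The following is a minimal sketch, assuming a standard hosts-file layout (IP address first, then one or more hostnames); the function name hosts_ips_for is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every IP address that a hosts-format file maps
# to the given hostname. Text after a '#' is treated as a comment and ignored.
hosts_ips_for() {
    hosts_file=$1
    name=$2
    awk -v h="$name" '
        {
            sub(/#.*/, "")                 # strip comments, fields are re-split
            for (i = 2; i <= NF; i++)      # field 1 is the IP, the rest are names
                if ($i == h) { print $1; break }
        }' "$hosts_file"
}

# Example calls, commented out because results depend on the node:
# hosts_ips_for /etc/hosts "$(cat /etc/hostname)"
# hosts_ips_for /etc/hosts "$(cat /etc/HOSTNAME)"
```

Each call is expected to print exactly one IP address, and it should match the address shown for that WebHCat instance on FusionInsight Manager. No output means the hostname is missing from /etc/hosts; more than one line means the hostname is mapped to several addresses.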
Log in to any node where the WebHCat instance resides and run the following command to switch to user omm:

su - omm
Run the following command to check whether the WebHCat process exists:

ps -ef|grep webhcat|grep -v grep

If it does, run the following command to kill it:

kill -9 ${webhcat_pid}
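When several residual WebHCat processes exist, their PIDs can be collected in one pass instead of being copied by hand. The following is a minimal sketch, assuming the standard `ps -ef` layout in which the PID is the second column; the function name webhcat_pids is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (second column) of every line that mentions webhcat. Lines containing
# "grep" are skipped, mirroring `grep -v grep` in the command above.
webhcat_pids() {
    awk '/webhcat/ && !/grep/ { print $2 }'
}

# Example (run as user omm), commented out because it terminates processes:
# for pid in $(ps -ef | webhcat_pids); do kill -9 "$pid"; done
```

Review the printed PIDs before killing anything, because any process whose command line contains "webhcat" is matched.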
Log in to FusionInsight Manager and choose Cluster > Services > Hive. On the page that is displayed, click the Instance tab. Then select all WebHCat instances, click More, and select Restart Instance. Wait until WebHCat is restarted successfully.