Symptom
The process_check module of Intel Cluster Checker sometimes reports an issue related to Cluster SSH:

Stale Process Check, (process_check)……………………………………………………….FAILED
subtest ‘Percent cpu usage is greater than 5%’ failed – failing host compute-00-14 returned: ‘pid=8736 (cssh)’

Resolution
Configure Intel® Cluster Checker to ignore the CPU usage of Cluster SSH: cssh
Here is the original post:
Using Cluster SSH with Intel® Cluster Checker
Symptom
Intel® Cluster Checker hangs during the execution of the mflops_intel_mkl test module on clusters running Penguin Computing* Scyld Clusterware* 5.4. In addition, inactive or zombie processes named dgemm_mflops may be present on the nodes. The dgemm_mflops binary is a DGEMM benchmark optimized with the Intel® Math Kernel Library and is packaged with Intel® Cluster Checker. Debug output provides no other information.
Here is the original:
The mflops_intel_mkl test module hangs during execution on Scyld Clusterware 5.4
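To confirm the second part of the Scyld Clusterware symptom (leftover dgemm_mflops processes on a node), one option is to scan the output of the standard `ps` command. The following is a minimal sketch, not part of Intel® Cluster Checker: the helper names `find_processes`, `zombie_pids`, and `scan_local_node` are hypothetical, and it assumes a Linux procps-style `ps` that accepts `-eo pid,stat,comm` and marks zombie processes with a state beginning with `Z`.

```python
import subprocess


def find_processes(ps_output, name="dgemm_mflops"):
    """Return (pid, state) pairs for processes whose command name equals `name`.

    `ps_output` is expected to be the text printed by `ps -eo pid,stat,comm`.
    """
    matches = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header row and any malformed lines.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        pid, stat, comm = fields
        # Zombies are typically shown as "name <defunct>", so compare
        # only the first word of the command column.
        if comm.split()[0] == name:
            matches.append((int(pid), stat))
    return matches


def zombie_pids(matches):
    """Keep only the PIDs whose process state starts with 'Z' (zombie)."""
    return [pid for pid, stat in matches if stat.startswith("Z")]


def scan_local_node(name="dgemm_mflops"):
    """Run ps on the local node and return matching (pid, state) pairs."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_processes(result.stdout, name)
```

Calling `zombie_pids(scan_local_node())` on a compute node returns the PIDs of any defunct dgemm_mflops processes there. Keeping the parser separate from the `ps` call means the same function can be applied to output collected from other nodes by whatever remote-execution method the cluster already uses.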