Issue with VASP TaskServers - Jobs not added in queue automatically
Posted: Mon Feb 03, 2014 10:06 am
I am Raghunath J from Prof Karthikeyan's group at the Department of Materials Engineering, Indian Institute of Science, Bangalore. We have Medea-VASP installed in our lab and we use it to perform DFT calculations.
The problem we are experiencing is that once a job finishes on a task server, the next one does not start automatically. The finished job is still shown as 'running' in the web browser (Firefox). However, when I check the processes running on the task server, the VASP_PARALLEL process is no longer running, which means the job has completed. I then restart the job from the browser, after which it is shown as finished and the next job in the queue starts.
Ideally, each finished job should be shown as 'finished' in the browser and the queue should carry on to the next job without manual intervention. Because of this, we are losing a lot of time on our processors, which sit idle until a manual restart is done from the browser.
I checked the log of the affected task server (in the browser) and it shows the message "Trying to notify the JobServer that the task(s) have been completed".
The task servers have Medea-VASP installed on a CentOS operating system. Other servers do not have this issue, although they run the same operating system.
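For reference, the manual process check described above can be scripted roughly as follows. This is only a minimal sketch, assuming the VASP executable appears in the `ps` output under the name VASP_PARALLEL as it does on our machines; the function name `is_process_running` is my own and is not part of Medea.

```python
import subprocess


def is_process_running(name, ps_output=None):
    """Return True if a process whose command name contains `name` is listed.

    `ps_output` can be passed in for testing; otherwise the command names of
    all processes are read from `ps -eo comm` (Linux).
    """
    if ps_output is None:
        ps_output = subprocess.run(
            ["ps", "-eo", "comm"],
            capture_output=True,
            text=True,
            check=True,
        ).stdout
    # Each line of the output holds one command name.
    return any(name in line for line in ps_output.splitlines())


if __name__ == "__main__":
    # VASP_PARALLEL is the process name seen on our task servers (assumption).
    if is_process_running("VASP_PARALLEL"):
        print("VASP is still running on this task server")
    else:
        print("VASP is not running; the job appears to have finished")
```

Running this on the affected task server while the browser still shows the job as 'running' would confirm whether the job has actually ended.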