So, this month’s T-SQL Tuesday invitation is from Matthew McGiffen. If on the T-SQL Tuesday logo, you can see he has invited us to do a post relating to puzzles.
Indeed, I could have just discussed one of my sessions here since it involves puzzles. However, instead I decided to invite all the SQL Server professionals who use Powershell to look at a Powershell puzzle I experienced recently.
Previously I had implemented a SQL Server Agent job that restarted maintenance jobs from a failed step if a server had been restarted due to patching. Which worked for most jobs.
However, it did highlighted a job which is behaving unusually.
One job in particular would not start from a failed step after a restart. After further investigation it looked like SQL Server Agent still showed this job as running, even though the server had been restarted.
I looked at the step and saw that it used the operating system command to call a Powershell script. Because Powershell version 3 was required it called the Powershell executable with the version 3 parameter and referenced the script to run.
Reproducing the issue
Because I wanted to test with restarts I decided to try and reproduce the issue on a test server. Which I was able to do fairly easily using the below steps.
- First I created a Powershell script that looked for all the files on two of the drives on the server.
- I then created a SQl Server Agent job.
- After this I added a step which used the Operating System (cmdexec) execution type.
- In the command section I entered the command to call the Powershell executable, specifying the version 3 paramter at the end of it. At the end I entered the location of the Powershell script to run. For example the below:
- powershell.exe -version 3 “D:\Test\GetAllFiles.ps1”
- Once this was done I started the job which I had created.
- Whilst the job was running I restarted the server.
- After the restart I checked the activity in SQL Server Agent Job Activity Monitor, which showed that the job appeared to be running.
- I then checked what was running on the server, which showed that this script definitely was not running. In addition, I checked the logs for the job.
Because of this unusual outcome, I looked in the backup tables in the msdb database. Where I discovered the jobs are not entered as cancelled properly, which explained this result and also meant the job would not automatically restart.
Attempts to fix issue
Now, I tested various ways to resolve this. For instance, I have tried changing the error number returned. In addition, I have tried converting the step to be a Powershell step and adding a throw catch.
I have called this symptom “Phantom jobs” because it’s jobs that appear to be running after a restart, when they are clearly not. However, if you use another name for them feel free to let me know.
Up until now, the only way I have found to resolve it is to add another step in the job that automatically restarts cancelled jobs. Which queries the backup taboles in the msdb database for them.
My invitation is for anybody who has yet to experience this Powershell puzzle to reproduce the issue using the steps above. Because it appears to be really easy to reproduce.
Afterwards, see if you can think of a more graceful fix for this and gain some kudos from your peers in the community.
Of course, if you have already investigated this puzzle and resolved it another way then feel free to let me know with a comment.