One of the most important tasks for the DBA is managing backup files. Often the SQL Server backups files themselves are written to a drive local to the database server, but we then need to copy them off to a remote share. RoboCopy is a fantastic utility included in Windows that is great at doing this. If you’re not familiar with this tool, you can find detailed information on it here.
However, when using SQL Server Agent to schedule a RoboCopy job, you run into an issue with Process Exit Codes. This is because RoboCopy does not always return a 0 for success, non-zero for error. So when you tell SQL Server Agent that 0 = Success, you get false job failures. In fact, what RoboCopy returns is a bitmask as follows:
Bit | Description |
16 | Serious error. Robocopy did not copy any files. This is either a usage error or an error due to insufficient access privileges on the source or destination directories. |
8 | Some files or directories could not be copied (copy errors occurred and the retry limit was exceeded). Check these errors further. |
4 | Some Mismatched files or directories were detected. Examine the output log. |
2 | Some Extra files or directories were detected. Examine the output log. |
1 | Some Extra files or directories were detected. Examine the output log. |
0 | No errors occurred, and no copying was done. The source and destination directory trees are completely synchronized. |
As you can see from the above, there are several exit codes that are successful. 1,2,3,7,8,9, etc. In most cases, any combination of the 0,1,2,4 bits would be fine. So what we need is a way to interpret the bitmap that is returned and send a 0 or 1 to SQL Server indicating success or failure.
Enter PowerShell
I wrote a short PowerShell script to deal with this as follows
1: param
2: (
3: [Parameter(Position=0, Mandatory=$true)] [string]$SrcFolder,
4: [Parameter(Position=1, Mandatory=$true)] [string]$TgtFolder
5: )
6:
7: ##copy
8: robocopy $SrcFolder $TgtFolder /S /XO /XC /XN /NP /MIN:1024 /R:2 /W:2
9:
10: ##set exit code
11: $exit = $lastexitcode -band 24
12:
13: exit $exit
The script used the powershell bitwise operation –band to check to see if bits 8 or 16 are set, as these are the two actual errors we are concerned about. It then exits with a 0 or 1, which is what SQL Server Agent expects from an OperatingSystemCmdExec job step. Note that this applies mainly to SQL Server 2005. In 2008 and 2008 R2 you can create a PowerShell job step so you might handle this a bit differently.