Query backup job space on Veeam Repositories and find orphaned files with PowerShell
Some time ago I had to investigate discrepancies in the space consumption of large Veeam scale-out repositories. Theoretically, about the same space should have been allocated, but in reality the difference was huge. In this post I show how to query backup job space on Veeam repositories and how to find orphaned files with PowerShell.
In the first part I show how to query the space used by backup jobs. For more detail, the space used by each machine within a backup job can be shown as well. In the second part you can find code snippets to check the files on a repository (even a scale-out repository) against the backup files in the Veeam database. Physical files that are not listed in the database are orphaned.
Analyze used Repository space
Restore points are files in repositories. Veeam is aware of these files and keeps track of the storage space they occupy. To check this space for complete backup jobs, you can use the following code.
(Get-VBRBackup | Where-Object {$_.JobType -eq "Backup"}) |Select-Object jobname, @{N="Backupsize"; E={(($_.GetAllStorages().stats.backupsize | Measure-Object -Sum).Sum) }}
The output contains the job name and the space used on storage. In this code I selected just backup jobs.
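To make the numbers easier to read, the result can be converted and sorted. This is a minimal sketch, assuming that Backupsize is a byte count; the sample objects and their values are hypothetical stand-ins for the output of the one-liner above.

```powershell
# Hypothetical sample objects standing in for the output of the one-liner above.
$jobSizes = @(
    [pscustomobject]@{ JobName = 'Job-B'; Backupsize = 10737418240 }
    [pscustomobject]@{ JobName = 'Job-A'; Backupsize = 16106127360 }
)

# Add a rounded GB column (assumes Backupsize is in bytes) and sort the largest job first.
$report = $jobSizes |
    Select-Object JobName, @{N = 'BackupsizeGB'; E = { [math]::Round($_.Backupsize / 1GB, 2) } } |
    Sort-Object BackupsizeGB -Descending

$report
```

In a real run you would pipe the output of the Get-VBRBackup one-liner into the same Select-Object and Sort-Object steps instead of using $jobSizes.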
If you are interested in the space occupied by each object within a backup job, you can use the following code.
(Get-VBRBackup | Get-VBRRestorePoint) |Select-Object vmname, @{N="Job"; E={$_.getsourcejob().name }}, @{N="Repo"; E={$_.getrepository().name }} , @{N="size"; E={$_.getstorage().stats.BackupSize }}
The output of this one-liner should match what the properties of a disk backup show.
If the total over all of these corresponds to the used space on the repository volumes, everything is OK. Take the space savings of Fast Clone with ReFS and XFS into account as well. If the numbers still do not match, further investigation is needed.
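To get a total that can be compared with the used space of a volume, the per-object sizes can be summed per repository. This is only a sketch: the sample objects below are hypothetical and mimic the columns (vmname, Job, Repo, size) of the one-liner above, and it again assumes that size is in bytes.

```powershell
# Hypothetical sample rows with the same columns as the restore point one-liner.
$points = @(
    [pscustomobject]@{ vmname = 'vm1'; Job = 'Job-A'; Repo = 'Repo1'; size = 15GB }
    [pscustomobject]@{ vmname = 'vm2'; Job = 'Job-A'; Repo = 'Repo1'; size = 10GB }
    [pscustomobject]@{ vmname = 'vm3'; Job = 'Job-B'; Repo = 'Repo2'; size = 5GB }
)

# Sum the sizes per repository and convert the total to GB.
$perRepo = $points | Group-Object -Property Repo | ForEach-Object {
    [pscustomobject]@{
        Repo    = $_.Name
        Points  = $_.Count
        TotalGB = [math]::Round(($_.Group | Measure-Object -Property size -Sum).Sum / 1GB, 2)
    }
}

$perRepo
```

Keep in mind that with Fast Clone the sum of the file sizes can be larger than the space really allocated on the volume, so a difference alone does not prove that files are orphaned.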
Check files against restore points (find orphaned files)
You can check each physical file against the files Veeam knows in its database. To start with, get a list of all files in a repository.
Export file list
To export a list of all repository files to XML, you can run the following code.
$result = @()
$dirs = (Get-ChildItem path_to_repo).fullname
foreach ($dir in $dirs) {
    $temp = @()
    $temp = Get-ChildItem -Path $dir -Depth 2 -file | Select-Object FullName, pschildname, length, creationtime, lastaccesstime
    $result += $temp
}
$result | Export-Clixml -Path C:\path_to_xml.xml
Notes
- Replace path_to_repo with your repository path.
- This works only for Windows repositories.
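The export can be tried out safely on a throwaway folder first. The following sketch builds a small temporary folder tree (a hypothetical layout, not a real repository), lists its files and round-trips the list through XML. As a simplification it uses a single recursive Get-ChildItem call instead of the per-folder loop, so it does not apply the depth limit of the code above.

```powershell
# Build a temporary folder tree that stands in for a repository (hypothetical layout).
$repoRoot = Join-Path ([System.IO.Path]::GetTempPath()) ('repo_' + [guid]::NewGuid().ToString('N'))
$jobDir   = Join-Path $repoRoot 'Job-A'
New-Item -ItemType Directory -Path $jobDir -Force | Out-Null
Set-Content -Path (Join-Path $jobDir 'vm1.vbk') -Value 'full'
Set-Content -Path (Join-Path $jobDir 'vm1.vib') -Value 'incremental'

# Collect all files below the root with the same properties as in the export above.
$files = @(Get-ChildItem -Path $repoRoot -Recurse -File |
    Select-Object FullName, PSChildName, Length, CreationTime, LastAccessTime)

# Export the list and read it back, as the later snippets do with Import-Clixml.
$xmlPath = Join-Path ([System.IO.Path]::GetTempPath()) ('filelist_' + [guid]::NewGuid().ToString('N') + '.xml')
$files | Export-Clixml -Path $xmlPath
$roundTrip = @(Import-Clixml -Path $xmlPath)

# Clean up the temporary folder and XML file.
Remove-Item -Path $repoRoot -Recurse -Force
Remove-Item -Path $xmlPath -Force

$roundTrip | Select-Object PSChildName, Length
```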
Next, build an array of the files Veeam has in its database, each with its absolute path.
Query database pre-v10
In Version 9.5 I have not found any other way than querying the SQL server directly.
I used this script (http://sebastiaan.tempels.eu/2017/05/05/veeam-orphaned-files/) to start with. This script is limited to local data. To use it for distributed environments, you can start with the following snippets.
$vbrdata=@(); $Source=@();
$Source = Import-Clixml -Path C:\path_to_xml.xml
$SOR = Get-VBRBackupRepository -name repo_name -ScaleOut | Get-VBRRepositoryExtent
$data = Get-VBRBackup | Get-VBRRestorePoint
foreach ($d in $data){
    $extent = (Invoke-Sqlcmd -Query "SELECT [dependant_repo_id] FROM [VeeamBackup].[dbo].[Backup.ExtRepo.Storages] WHERE [storage_id] = '$($d.StorageId)';" -ServerInstance "sql_server_Instance").dependant_repo_id
    $base = ($SOR | Where-Object {$_.id -eq $extent}).Repository.FriendlyPath
    $path = $base+"\"+$d.FindBackup().DirPath+"\"+$d.GetStorage().FilePath
    $vbrdata += $path
}
Notes
- Replace path_to_xml.xml with your preferred location of the XML file.
- Set repo_name to your repository name.
- Replace sql_server_Instance with your SQL Server and instance name.
- If pass-through authentication to SQL Server does not work, you can use the parameters -Username and -Password.
Query database post-v9
In the current versions v10 and v11 there is a better way to find the needed information. Use this code to do so.
$vbrdata=@(); $Source=@();
$Source = Import-Clixml -Path C:\path_to_xml.xml
$SOBR = Get-VBRBackupRepository -name SOBR_blob -ScaleOut
$SOR = $SOBR | Get-VBRRepositoryExtent
$data = Get-VBRBackup | ? {$_.getrepository().name -in $SOBR.Name}
$repo = $data[0].GetRepository()
foreach ($d in $data.GetAllStorages()){
    $extent = $repo.FindRepositoryForExistingStorage($d.Id)
    $base = ($SOR | Where-Object {$_.id -eq $extent.id}).Repository.FriendlyPath
    if ($extent.type -eq "LinuxLocal") {$delimiter = '/'} else {$delimiter = '\'}
    $path = ($base+$delimiter+$d.FindBackup().DirPath+$delimiter+$d.FilePath).replace('|',$delimiter)
    $vbrdata += $path
}
[update] Mirror copy jobs create an additional sub-directory under the copy target directory for each backup job to copy. These folders are separated by a pipe ("|") in DirPath. Therefore I added the .replace method to the creation of the $path variable.
Notes
- Replace path_to_xml.xml with your preferred location of the XML file.
- Replace SOBR_blob with the name of your scale-out backup repository.
- This code works for VBR v11. For v10 you need to replace the method FindRepositoryForExistingStorage with FindExtentRepo.
- At the time of writing I noticed my 9.5 code does not work with versions post-v9. Honestly, this code is neither very well tested nor very efficient. I will update the script when I analyze a more current version.
- The variable $delimiter should ensure that the code works for Linux repositories too.
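The path building with $delimiter and the pipe replacement can be checked without a Veeam server. The function below is a hypothetical helper (Join-RepoFilePath is not a Veeam cmdlet) that repeats the same string logic as the loop above; the paths and the extent type 'WinLocal' are made-up sample values, only 'LinuxLocal' is taken from the code above.

```powershell
# Hypothetical helper that mirrors the path logic of the loop above.
function Join-RepoFilePath {
    param(
        [string]$Base,        # extent path, e.g. FriendlyPath of the extent
        [string]$DirPath,     # backup folder; may contain '|' for mirror copy jobs
        [string]$FilePath,    # name of the backup file
        [string]$ExtentType   # 'LinuxLocal' selects '/', anything else selects '\'
    )
    $delimiter = if ($ExtentType -eq 'LinuxLocal') { '/' } else { '\' }
    # Build the absolute path and turn the '|' separators into real path separators.
    ($Base + $delimiter + $DirPath + $delimiter + $FilePath).Replace('|', $delimiter)
}

$winPath   = Join-RepoFilePath -Base 'E:\Backups' -DirPath 'CopyJob|Job-A' -FilePath 'vm1.vbk' -ExtentType 'WinLocal'
$linuxPath = Join-RepoFilePath -Base '/mnt/backups' -DirPath 'Job-B' -FilePath 'vm2.vbk' -ExtentType 'LinuxLocal'

$winPath     # E:\Backups\CopyJob\Job-A\vm1.vbk
$linuxPath   # /mnt/backups/Job-B/vm2.vbk
```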
List orphaned files
Compare the physical files with the files in the Veeam database.
$Orphant = @()
foreach ($s in $Source){
    if ($vbrdata -notcontains $s.fullname){
        $Orphant += $s
    }
}
The array $Orphant now contains all files on the filesystem that Veeam does not know as restore points. Note: with these snippets, backups of database transaction logs are also in the list. They show up as *.vlb files.
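For large repositories the comparison can be slow, because -notcontains scans the whole array for every file. A possible variant, shown here as a sketch with hypothetical sample data, puts the known paths into a case-insensitive HashSet and optionally filters out the *.vlb files. The case-insensitive comparison is an assumption that fits Windows repositories; for Linux paths an ordinal comparer would be the safer choice.

```powershell
# Hypothetical sample data: $Source mimics the imported file list, $vbrdata the paths known to Veeam.
$Source = @(
    [pscustomobject]@{ FullName = 'E:\Backups\Job-A\vm1.vbk' }
    [pscustomobject]@{ FullName = 'E:\Backups\Job-A\old.vib' }
    [pscustomobject]@{ FullName = 'E:\Backups\Job-A\sql.vlb' }
)
$vbrdata = @('e:\backups\job-a\VM1.vbk')

# Put the known paths into a hash set for fast, case-insensitive lookups.
$known = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$vbrdata,
    [System.StringComparer]::OrdinalIgnoreCase
)

# Keep files that Veeam does not know; transaction log backups (*.vlb) are skipped here (optional).
$Orphant = @($Source | Where-Object {
    (-not $known.Contains($_.FullName)) -and ($_.FullName -notlike '*.vlb')
})

$Orphant
```

With the sample data only old.vib remains in $Orphant. Remove the -notlike condition if the *.vlb files should stay in the list.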
General notes
- Get your trial of Veeam Backup&Replication!
- Also check my posts about analyzing space savings with Veeam Fast Clone for ReFS and XFS.
Thanks mate, you helped me big time here!
Thanks for your feedback! Always happy to help!
Hi VNOTE42
Thank you so much for the post! I do have a question.
In your backup properties screenshot you are showing 2VMs
******w2016vm- and ******7vm-
It looks like your fulls are 15GB and 10GB and your incrementals are 17MBs, if you had another set of full backups in your SOBR repository would it throw the size of the backup off ?
In my case I have a 50TB Amazon S3 SOBR. I have over 300 restore points for 2 VMS and I take full backups once a week.
The result of the PS that returns space occupied by each object indicates that backups for the specific VM is taking 60TB, which I know isn’t possible since I only have 50TB of space.
Assuming fulls aren’t really taking the space that they are reporting to veeam, but not really sure, could you please confirm what I suspect is happening?
Hi Tim!
First, I did not test the script for S3 storages.
When you do a synthetic full, you will see a file with the size of a real full, but the space this synthetic full allocates on the volume is just the size of an incremental. This is because a synthetic full links together the existing blocks it needs to build a full:
https://helpcenter.veeam.com/docs/backup/vsphere/synthetic_full_hiw.html?ver=110
I hope this helps?
Version 11 actually does this Natively now!
https://community.veeam.com/blogs-and-podcasts-57/changed-in-v11-handling-of-exported-and-orphaned-backups-475
Hi!
Right, there were some changes made in v11. But the stuff I described in my post is still not covered in v11! My post can help in the case that VBR does not know the files any more. You can find these files for manual deletion. This is not done by v11 … and certainly not by v12 either.
Here is a link to a forum post (of an angry user):
https://forums.veeam.com/vmware-vsphere-f24/veeam-keeps-forgetting-aged-out-restore-points-t80676.html
You are correct , just re-read
“When you remove a backup job, it will now tell you if the disks that were associated with that job have not been deleted (orphaned)”
So your script is still the best for orphaned vibs without a deleted job! I count myself lucky I only had 100GB of these files after reading that Veeam Forum post!
Get-ChildItem path_to_repo works great for Windows repositories. Do you have a script for Linux repos?
Hi Michael!
For Linux repos I have not needed this yet. But since the PowerShell core is also available for Linux, chances are good you can get this to work on Linux as well.
https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell-on-linux?view=powershell-7.3
Wolfgang
Hello. Your script sounds like i could definitely use it, however im running into an issue.. Running V11, and used the code in your “Query Database Post v9” section.. i get an error “Exception calling “GetRepository” with “0” argument(s): “Repository is not set”.. do you know why it would say that? not sure if i was supposed to set a variable that i didnt do properly.. i’ve copied the output below.. thanks! – Mike
PS C:\Windows\system32> $result = @()
PS C:\Windows\system32> $dirs = (Get-ChildItem \\vemtapp98003\i$\veeam).fullname
PS C:\Windows\system32> foreach ($dir in $dirs) {
>> $temp = @()
>> $temp = Get-ChildItem -Path $dir -Depth 2 -file | Select-Object FullName, pschildname, length, creat
accesstime
>> $result += $temp
>> }
PS C:\Windows\system32> $result | Export-Clixml -Path C:\path_to_xml.xml
PS C:\Windows\system32> $vbrdata=@(); $Source=@();
PS C:\Windows\system32> $Source = Import-Clixml -Path C:\path_to_xml.xml
PS C:\Windows\system32> $SOBR = Get-VBRBackupRepository -name SOBR_blob -ScaleOut
PS C:\Windows\system32> $SOR = $SOBR | Get-VBRRepositoryExtent
PS C:\Windows\system32> $data = Get-VBRBackup | ? {$_.getrepository().name -in $SOBR.Name}
Exception calling “GetRepository” with “0” argument(s): “Repository is not set”
At line:1 char:29
+ $data = Get-VBRBackup | ? {$_.getrepository().name -in $SOBR.Name}
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : Exception
PS C:\Windows\system32> $repo = $data[0].GetRepository()
Cannot index into a null array.
At line:1 char:1
+ $repo = $data[0].GetRepository()
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (:) [], RuntimeException
+ FullyQualifiedErrorId : NullArray
PS C:\Windows\system32> foreach ($d in $data.GetAllStorages()){
>> $extent = $repo.FindRepositoryForExistingStorage($d.Id)
>> $base = ($SOR | Where-Object {$_.id -eq $extent.id}).Repository.FriendlyPath
>> if ($extent.type -eq “LinuxLocal”) {$delimiter = ‘/’} elseif {$delimiter = ‘\’}
>> $path = ($base+$delimiter+$d.FindBackup().DirPath+$delimiter+$d.FilePath).replace(‘|’,$delimiter)
>> $vbrdata += $path
>> }
Hi Mike!
You query for repo “SOBR_blob”. What kind of repo is this? Is this an object-store?
Hi Vnote42, and thanks for the quick reply! 🙂 We have 6x scale out repositories (SOBR).. and each SOBR is made up of about 2-4 backup repositories.. the backup repositories are all large partitions on Windows servers, whose storage is on an HP MSA storage array.. i wasn’t sure how to run the script against all of the repositories.. (maybe 1x at a time, or just enter the multiple repository paths in the script?).. so i ran part 1x of the script against just 1x of the backup repositories (to see what it would do).. it did output the .xml file ok, so i was moving onto part 2 of your script.. i wasnt sure if it was supposed to get the SOBR name automatically, or if i needed to input it somewhere.. does that make sense..? thanks!
Hi Mike!
Sorry for the late answer!
I guess I found the error. In the code snippet Query database post-v9 I wrote “-name SOBR_blob” in line 3. Since the name of your Scale-Out repository will not be “SOBR_blob”, you get the error. Please let us know if it works when you replace it by the correct name in your environment.
Thanks for your finding, just added the note to replace the name in my code.