Query backup job space on Veeam Repositories and find orphaned files with PowerShell

Some time ago I had to investigate discrepancies in the space consumption of large Veeam scale-out repositories. In theory, they should have used about the same amount of space, but in reality the difference was huge. In this post I show how to query the space used by backup jobs on Veeam repositories and how to find orphaned files with PowerShell.

The first part shows how to query the space used by backup jobs and, for more detail, the space used by each machine within a backup job. The second part contains code snippets that check the files on a repository (including scale-out repositories) against the backup files in the Veeam database. Physical files that are not listed in the database are orphaned.

Analyze used Repository space

Restore points are files in repositories. Veeam is aware of these files and keeps track of the storage space they occupy. To check this space for complete backup jobs, you can use the following code.

Get-VBRBackup | Where-Object {$_.JobType -eq "Backup"} | Select-Object JobName, @{N="Backupsize"; E={($_.GetAllStorages().Stats.BackupSize | Measure-Object -Sum).Sum}}

The output contains the job name and the space used on storage. This code selects only backup jobs.

If you are interested in the space occupied by each object within a backup job, you can use the following code.
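Raw size values are hard to compare at a glance. The following is a minimal sketch of how the result could be sorted and converted to GB, assuming the Backupsize values are in bytes. The two sample rows are hypothetical and only stand in for the real output of the one-liner above.

```powershell
# Hypothetical sample rows standing in for the output of the one-liner above.
$jobSizes = @(
    [pscustomobject]@{ JobName = 'Job-B'; Backupsize = 10737418240 }
    [pscustomobject]@{ JobName = 'Job-A'; Backupsize = 16106127360 }
)

# Largest jobs first, with the size converted from bytes to GB.
$report = $jobSizes |
    Sort-Object -Property Backupsize -Descending |
    Select-Object JobName, @{N = 'SizeGB'; E = { [math]::Round($_.Backupsize / 1GB, 2) } }

$report | Format-Table -AutoSize
```

To use it with real data, pipe the output of the one-liner into the same Sort-Object and Select-Object calls instead of $jobSizes.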

Get-VBRBackup | Get-VBRRestorePoint | Select-Object VmName, @{N="Job"; E={$_.GetSourceJob().Name}}, @{N="Repo"; E={$_.GetRepository().Name}}, @{N="size"; E={$_.GetStorage().Stats.BackupSize}}

The output of this one-liner should match the values shown in the properties of the disk backups in the Veeam console.

If the total of all these values corresponds to the used space on the repository volumes, everything is OK. Take the space savings of Fast Clone on ReFS and XFS into account as well. If the totals do not match, further investigation is needed.
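To compare against volume usage, the per-restore-point sizes first have to be added up. Here is a small sketch of that step, using hypothetical sample rows shaped like the output of the restore point one-liner (properties vmname, Job, Repo, size). It assumes size is in bytes and that grouping by the Repo column is granular enough for your comparison.

```powershell
# Hypothetical sample rows shaped like the output of the restore point one-liner.
$points = @(
    [pscustomobject]@{ vmname = 'vm1'; Job = 'Job-A'; Repo = 'Repo1'; size = 16106127360 }
    [pscustomobject]@{ vmname = 'vm1'; Job = 'Job-A'; Repo = 'Repo1'; size = 17825792 }
    [pscustomobject]@{ vmname = 'vm2'; Job = 'Job-B'; Repo = 'Repo2'; size = 10737418240 }
)

# Sum the sizes per repository and convert bytes to GB.
$perRepo = $points | Group-Object -Property Repo | ForEach-Object {
    [pscustomobject]@{
        Repo   = $_.Name
        Files  = $_.Count
        SizeGB = [math]::Round(($_.Group | Measure-Object -Property size -Sum).Sum / 1GB, 2)
    }
}

$perRepo | Format-Table -AutoSize
```

The resulting totals can then be held against the used space that the operating system reports for the repository volumes.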

Check files against restore points (find orphaned files)

You can check each physical file against the files Veeam knows in its database. To start with, get a list of all files in a repository.

Export file list

To export the list of all repository files to XML, you can run the following code.

$result = @()
$dirs = (Get-ChildItem path_to_repo).fullname
foreach ($dir in $dirs) {
    $temp = @()
    $temp = Get-ChildItem -Path $dir -Depth 2 -file | Select-Object FullName, pschildname, length, creationtime, lastaccesstime
    $result += $temp
}
$result | Export-Clixml -Path C:\path_to_xml.xml

Notes

  • Replace path_to_repo with your repository path.
  • In this form the code works only for Windows repositories.

Next, build an array of the files in the Veeam database, with their absolute paths.
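Before moving on, it can help to confirm that the exported list reads back correctly. This is a self-contained sketch of the export and import round trip on a throwaway folder in the temp directory. All folder and file names in it are hypothetical and only stand in for a real repository.

```powershell
# Build a throwaway folder tree that stands in for a repository (hypothetical names).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("repo_demo_" + [guid]::NewGuid().ToString('N'))
$jobDir = Join-Path $root 'Job-A'
New-Item -ItemType Directory -Path $jobDir -Force | Out-Null
Set-Content -Path (Join-Path $jobDir 'vm1.vbk') -Value 'full'
Set-Content -Path (Join-Path $jobDir 'vm1.vib') -Value 'incremental'

# Export the file list, read it back, and count the files per extension.
$xml = Join-Path $root 'files.xml'
Get-ChildItem -Path $jobDir -File | Select-Object FullName, Length | Export-Clixml -Path $xml
$list = @(Import-Clixml -Path $xml)
$byExt = $list | Group-Object { [System.IO.Path]::GetExtension($_.FullName) } | Select-Object Name, Count
$byExt

# Clean up the demo folder.
Remove-Item -Path $root -Recurse -Force
```

The same Group-Object call can be run on the real XML file to see how many files of each type were found.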

Query database pre-v10

In version 9.5 I did not find any way other than querying the SQL Server directly.

I used this script (http://sebastiaan.tempels.eu/2017/05/05/veeam-orphaned-files/) as a starting point. It is limited to local data. To use it for distributed environments, you can start with the following snippets.

$vbrdata=@(); $Source=@(); 
$Source  = Import-Clixml -Path C:\path_to_xml.xml
$SOR = Get-VBRBackupRepository -name repo_name -ScaleOut | Get-VBRRepositoryExtent
$data = Get-VBRBackup  | Get-VBRRestorePoint
foreach ($d in $data){
    $extent = (Invoke-Sqlcmd -Query "SELECT [dependant_repo_id] FROM [VeeamBackup].[dbo].[Backup.ExtRepo.Storages] WHERE [storage_id] = '$($d.StorageId)';" -ServerInstance "sql_server_Instance").dependant_repo_id
    $base = ($SOR | Where-Object {$_.id -eq $extent}).Repository.FriendlyPath
    $path = $base+"\"+$d.FindBackup().DirPath+"\"+$d.GetStorage().FilePath
    $vbrdata += $path
}

Notes

  • Replace path_to_xml.xml with the location of your XML file.
  • Set repo_name to your repository name.
  • Replace sql_server_Instance with the name of your SQL Server and instance.
  • If pass-through authentication to SQL Server does not work, you can use the parameters -Username and -Password.

Query database post-v9

In the current versions v10 and v11 there is a better way to find the needed information. Use this code to do so.

$vbrdata=@(); $Source=@(); 
$Source  = Import-Clixml -Path C:\path_to_xml.xml
$SOBR = Get-VBRBackupRepository -name SOBR_blob -ScaleOut
$SOR = $SOBR | Get-VBRRepositoryExtent
$data = Get-VBRBackup  | ? {$_.getrepository().name -in $SOBR.Name} 
$repo = $data[0].GetRepository()
foreach ($d in $data.GetAllStorages()){
    $extent = $repo.FindRepositoryForExistingStorage($d.Id)
    $base = ($SOR | Where-Object {$_.id -eq $extent.id}).Repository.FriendlyPath
    if ($extent.type -eq "LinuxLocal") {$delimiter = '/'} else {$delimiter = '\'}
    $path = ($base+$delimiter+$d.FindBackup().DirPath+$delimiter+$d.FilePath).replace('|',$delimiter)
    $vbrdata += $path
}

[update] Mirror copy jobs create an additional sub-directory under the copy target directory for each backup job they copy. In DirPath these folders are separated by a pipe (“|”). Therefore I added the .replace method when building the $path variable.
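The path handling can be tried out without a Veeam server. Join-RepoPath below is a hypothetical helper, not part of Veeam's PowerShell module, and the sample paths are made up. It only repeats the string logic of the snippet: join the three parts with the delimiter, then turn the pipe into a directory separator.

```powershell
# Hypothetical helper that mirrors the $path expression from the snippet above.
function Join-RepoPath {
    param(
        [string]$Base,
        [string]$DirPath,
        [string]$FilePath,
        [string]$Delimiter
    )
    # Join the three parts, then replace the pipe in DirPath with the delimiter.
    ($Base + $Delimiter + $DirPath + $Delimiter + $FilePath).Replace('|', $Delimiter)
}

# Windows extent with a mirror copy job sub-directory.
$winPath = Join-RepoPath -Base 'E:\Backups' -DirPath 'CopyJob|SourceJob' -FilePath 'vm1.vbk' -Delimiter '\'
# Linux extent with the same DirPath.
$linuxPath = Join-RepoPath -Base '/mnt/repo' -DirPath 'CopyJob|SourceJob' -FilePath 'vm1.vbk' -Delimiter '/'

$winPath
$linuxPath
```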

Notes

  • Replace path_to_xml.xml with the location of your XML file.
  • Replace SOBR_blob with the name of your scale-out backup repository.
  • This code works for VBR v11. For v10 you need to replace the method FindRepositoryForExistingStorage with FindExtentRepo.
  • At the time of writing I noticed that my 9.5 code does not work with versions after v9. Honestly, this code is neither very well tested nor very efficient. I will update the script when I analyze a more current version.
  • The variable $delimiter is meant to make the code work for Linux repositories too.

List orphaned files

Compare the physical files with the files in the Veeam database.

$Orphant = @()
foreach ($s in $Source){
    if ($vbrdata -notcontains $s.fullname){
        $Orphant += $s
    }
}

The array $Orphant now contains all files on the filesystem that Veeam does not know as restore points. Note: with these snippets, backups of database transaction logs also end up in the list. They are shown as *.vlb files.
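On large repositories the -notcontains check inside a loop can get slow, because it scans the whole array for every file. As a possible alternative, the sketch below uses a case-insensitive HashSet for the lookup, which matches the default case-insensitive behavior of -notcontains. It also filters out *.vlb files, on the assumption that you want to review transaction log backups separately. The sample data is hypothetical and stands in for $Source and $vbrdata.

```powershell
# Hypothetical sample data standing in for $Source (file list) and $vbrdata (paths known to Veeam).
$Source = @(
    [pscustomobject]@{ FullName = 'E:\Backups\Job-A\vm1.vbk'; Length = 16106127360 }
    [pscustomobject]@{ FullName = 'E:\Backups\Job-A\old.vbk'; Length = 10737418240 }
    [pscustomobject]@{ FullName = 'E:\Backups\Job-A\sql.vlb'; Length = 1048576 }
)
$vbrdata = @('e:\backups\job-a\vm1.vbk')

# Case-insensitive set of all paths known to Veeam.
$known = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$vbrdata, [System.StringComparer]::OrdinalIgnoreCase)

# Orphan candidates: not known to Veeam and not a transaction log backup.
$candidates = @($Source | Where-Object {
    -not $known.Contains($_.FullName) -and $_.FullName -notlike '*.vlb'
})

# Space that the candidates occupy, in GB.
$sizeGB = [math]::Round(($candidates | Measure-Object -Property Length -Sum).Sum / 1GB, 2)

$candidates | Select-Object FullName, Length
"Orphan candidates occupy $sizeGB GB"
```

Check the candidates by hand before deleting anything, since a path mismatch between the file list and the database would also put a valid file on this list.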

General notes

13 responses to “Query backup job space on Veeam Repositories and find orphaned files with PowerShell”

  1. Thanks mate, you helped me big time here!

  2. Tim says:

    Hi VNOTE42

    Thank you so much for the post! I do have a question.

    In your backup properties screenshot you are showing 2VMs
    ******w2016vm- and ******7vm-

    It looks like your fulls are 15GB and 10GB and your incrementals are 17MBs, if you had another set of full backups in your SOBR repository would it throw the size of the backup off ?

    In my case I have a 50TB Amazon S3 SOBR. I have over 300 restore points for 2 VMS and I take full backups once a week.
    The result of the PS that returns space occupied by each object indicates that backups for the specific VM is taking 60TB, which I know isn’t possible since I only have 50TB of space.

    Assuming fulls aren’t really taking the space that they are reporting to veeam, but not really sure, could you please confirm what I suspect is happening?

  3. Michael Alexander says:

    Get-ChildItem path_to_repo works great for Windows repositories. Do you have a script for Linux repos?

  4. mikeb says:

    Hello. Your script sounds like i could definitely use it, however im running into an issue.. Running V11, and used the code in your “Query Database Post v9” section.. i get an error “Exception calling “GetRepository” with “0” argument(s): “Repository is not set”.. do you know why it would say that? not sure if i was supposed to set a variable that i didnt do properly.. i’ve copied the output below.. thanks! – Mike

    PS C:\Windows\system32> $result = @()
    PS C:\Windows\system32> $dirs = (Get-ChildItem \\vemtapp98003\i$\veeam).fullname
    PS C:\Windows\system32> foreach ($dir in $dirs) {
    >> $temp = @()
    >> $temp = Get-ChildItem -Path $dir -Depth 2 -file | Select-Object FullName, pschildname, length, creat
    accesstime
    >> $result += $temp
    >> }
    PS C:\Windows\system32> $result | Export-Clixml -Path C:\path_to_xml.xml
    PS C:\Windows\system32> $vbrdata=@(); $Source=@();
    PS C:\Windows\system32> $Source = Import-Clixml -Path C:\path_to_xml.xml
    PS C:\Windows\system32> $SOBR = Get-VBRBackupRepository -name SOBR_blob -ScaleOut
    PS C:\Windows\system32> $SOR = $SOBR | Get-VBRRepositoryExtent
    PS C:\Windows\system32> $data = Get-VBRBackup | ? {$_.getrepository().name -in $SOBR.Name}
    Exception calling “GetRepository” with “0” argument(s): “Repository is not set”
    At line:1 char:29
    + $data = Get-VBRBackup | ? {$_.getrepository().name -in $SOBR.Name}
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : Exception

    PS C:\Windows\system32> $repo = $data[0].GetRepository()
    Cannot index into a null array.
    At line:1 char:1
    + $repo = $data[0].GetRepository()
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : NullArray

    PS C:\Windows\system32> foreach ($d in $data.GetAllStorages()){
    >> $extent = $repo.FindRepositoryForExistingStorage($d.Id)
    >> $base = ($SOR | Where-Object {$_.id -eq $extent.id}).Repository.FriendlyPath
    >> if ($extent.type -eq “LinuxLocal”) {$delimiter = ‘/’} elseif {$delimiter = ‘\’}
    >> $path = ($base+$delimiter+$d.FindBackup().DirPath+$delimiter+$d.FilePath).replace(‘|’,$delimiter)
    >> $vbrdata += $path
    >> }

    • vNote42 says:

      Hi Mike!
      You query for repo “SOBR_blob”. What kind of repo is this? Is this an object-store?

      • mikeb says:

        Hi Vnote42, and thanks for the quick reply! 🙂 We have 6x scale out repositories (SOBR).. and each SOBR is made up of about 2-4 backup repositories.. the backup repositories are all large partitions on Windows servers, whose storage is on an HP MSA storage array.. i wasn’t sure how to run the script against all of the repositories.. (maybe 1x at a time, or just enter the multiple repository paths in the script?).. so i ran part 1x of the script against just 1x of the backup repositories (to see what it would do).. it did output the .xml file ok, so i was moving onto part 2 of your script.. i wasnt sure if it was supposed to get the SOBR name automatically, or if i needed to input it somewhere.. does that make sense..? thanks!

        • vNote42 says:

          Hi Mike!
          Sorry for the late answer!

          I guess I found the error. In the code snippet Query database post-v9 I wrote “-name SOBR_blob” in line 3. Since the name of your Scale-Out repository will not be “SOBR_blob”, you get the error. Please let us know if it works when you replace it by the correct name in your environment.

          Thanks for your finding, just added the note to replace the name in my code.
