How to restore Solr backup from S3 bucket

I have a SolrCloud v8.11 cluster up and running on Kubernetes, deployed with solr-operator.

Backups to an S3 bucket are enabled.

How do I correctly write the request to RESTORE a backup stored in an S3 bucket?

I can't figure out what the location and the snapshotName should be in the Restore API request sent to Solr.

To discover those values, I tried the LISTBACKUP action, but the location value is wrong there as well:

$ curl https://my-solrcloud.example.org/solr/admin/collections\?action=LISTBACKUP\&name=collection-name\&repository=collection-backup\&location=my-s3-bucket/collection-backup

{
  "responseHeader":{
    "status":400,
    "QTime":70},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.solr.common.SolrException"],
    "msg":"specified location s3:///my-s3-bucket/collection-backup/ does not exist.",
    "code":400}}

## The cluster log shows:
org.apache.solr.common.SolrException: specified location s3:///my-s3-bucket/collection-backup/ does not exist. => org.apache.solr.common.SolrException: specified location s3:///my-s3-bucket/collection-backup/ does not exist.

The recurring backup itself works as expected, but sooner or later a RESTORE will be needed, and it's not clear how to perform it correctly.

Thank you in advance.



Solution 1:[1]

A bit late, but I came across this question while searching for the same answer. There was a thread on the mailing list that helped me to figure out how this is supposed to work.

I found the documentation on this pretty confusing, but the location seems to be relative to the backup repository. The repository argument already accounts for the bucket name, and the name argument is the name of the backup you are attempting to list. Solr then builds the S3 path as {repository bucket} + {location} + {backup name}. So, when the backup sits at the root of the bucket, location should simply be: /
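To make that composition concrete, here is a small sketch that mimics the {repository bucket} + {location} + {backup name} rule as I understand it. The function s3_backup_path is hypothetical (it is not part of Solr or solr-operator), and the s3://bucket/... output format is just for illustration; Solr's own messages print the path without the bucket, as in the error above.

```shell
#!/usr/bin/env bash
# Illustration only: mimics the path composition described above.
# This is an assumption about the behaviour, not Solr's actual code.
# Usage: s3_backup_path BUCKET LOCATION BACKUP_NAME
s3_backup_path() {
  local bucket="$1" location="$2" name="$3"
  # Drop one leading and one trailing slash so that "/" contributes nothing.
  location="${location#/}"
  location="${location%/}"
  if [ -n "$location" ]; then
    printf 's3://%s/%s/%s/\n' "$bucket" "$location" "$name"
  else
    printf 's3://%s/%s/\n' "$bucket" "$name"
  fi
}

# location=/ resolves to the backup directory at the bucket root.
s3_backup_path "my-s3-bucket" "/" "my-collection-backup"
# Repeating the bucket name inside location nests it a second time.
s3_backup_path "my-s3-bucket" "my-s3-bucket/collection-backup" "collection-name"
```

The second call shows why a location that already contains the bucket name points at a path that does not exist.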

Assume you've set up a backupRepository for the SolrCloud deployment like the following:

backupRepositories:
- name: "my-backup-repo"
  s3:
    region: "us-east-1"
    bucket: "my-s3-bucket"

and you have created a SolrBackup like the following:

---
apiVersion: solr.apache.org/v1beta1
kind: SolrBackup
metadata:
  name: "my-collection-backup"
spec:
  repositoryName: "my-backup-repo"
  solrCloud: "my-solr-cloud"
  collections:
    - "my-collection"

The full cURL command for LISTBACKUP would be:

$ curl https://my-solrcloud.example.org/solr/admin/collections \
  -d action=LISTBACKUP \
  -d name=my-collection-backup \
  -d repository=my-backup-repo \
  -d location=/

Similarly for the RESTORE command:

$ curl https://my-solrcloud.example.org/solr/admin/collections \
  -d action=RESTORE \
  -d name=my-collection-backup \
  -d repository=my-backup-repo \
  -d location=/ \
  -d collection=my-collection-restore
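A restore of a large collection can take a while, so it may be worth submitting it asynchronously. The Collections API accepts an async parameter carrying a request ID of your choosing, and REQUESTSTATUS reports on that ID. The sketch below only builds the two URLs; the helper names restore_url and status_url and the request ID restore-1 are made up for illustration, and I haven't verified this against an operator-managed cluster.

```shell
#!/usr/bin/env bash
# Sketch only: builds Collections API URLs and prints them.
# Nothing is sent to Solr until you pass a URL to curl yourself.
SOLR_URL="https://my-solrcloud.example.org/solr"

# Usage: restore_url BACKUP_NAME REPOSITORY TARGET_COLLECTION REQUEST_ID
restore_url() {
  printf '%s/admin/collections?action=RESTORE&name=%s&repository=%s&location=/&collection=%s&async=%s\n' \
    "$SOLR_URL" "$1" "$2" "$3" "$4"
}

# Usage: status_url REQUEST_ID
status_url() {
  printf '%s/admin/collections?action=REQUESTSTATUS&requestid=%s\n' "$SOLR_URL" "$1"
}

restore_url my-collection-backup my-backup-repo my-collection-restore restore-1
status_url restore-1
```

You would then run curl "$(restore_url my-collection-backup my-backup-repo my-collection-restore restore-1)" once, and repeat curl "$(status_url restore-1)" until the reported state is completed or failed.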

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Matthew Hanlon