---
sidebar_position: 2
slug: /migration
sidebar_custom_props: {
  categoryIcon: LucideLocateFixed
}
---

# Backup & migration

- [Data migration](#data-migration)
- [Migrate from multi-bucket to single-bucket mode](#migrate-from-multi-bucket-to-single-bucket-mode)

## Data migration

:::info KUDOS
This document is contributed by our community contributor [TreeDy](https://github.com/Treedy2020). We may not actively maintain this document.
:::

A common scenario is processing large datasets on a powerful instance (e.g., with a GPU) and then migrating the entire RAGFlow service to a different production environment (e.g., a CPU-only server). This guide explains how to safely back up and restore your data using our provided migration script.

### Identify your data

By default, RAGFlow uses Docker volumes to store all persistent data, including your database, uploaded files, and search indexes. You can see these volumes by running:

```bash
docker volume ls
```

The output will look similar to this:

```text
DRIVER    VOLUME NAME
local     docker_esdata01
local     docker_minio_data
local     docker_mysql_data
local     docker_redis_data
```

These volumes contain all the data you need to migrate.

:::note
The volume name prefix (e.g., `docker_`) comes from the Docker Compose project name. By default it is `docker` (derived from the directory name). If you started RAGFlow with `docker compose -p <project_name>`, your volumes will be prefixed with `<project_name>_` instead, for example `ragflow_mysql_data`.
:::

### Step 1: Stop RAGFlow services

Before starting the migration, you must stop all running RAGFlow services on the **source machine**. Navigate to the project's root directory and run:

```bash
docker compose -f docker/docker-compose.yml down
```

If you started RAGFlow with a custom project name (e.g., `docker compose -p ragflow`), include it in the command:

```bash
docker compose -p ragflow -f docker/docker-compose.yml down
```

**Important:** Do **not** use the `-v` flag (e.g., `docker compose down -v`), as this will delete all your data volumes. The migration script includes a check and will prevent you from running it if services are active.

### Step 2: Back up your data

We provide a convenient script to package all your data volumes into a single backup folder.

For a quick reference of the script's commands and options, you can run:
```bash
bash docker/migration.sh help
```

To create a backup, run the following command from the project's root directory:

```bash
bash docker/migration.sh backup
```

This will create a `backup/` folder in your project root containing compressed archives of your data volumes.

You can also specify a custom name for your backup folder:

```bash
bash docker/migration.sh backup my_ragflow_backup
```

This will create a folder named `my_ragflow_backup/` instead.

If you started RAGFlow with a custom project name (e.g., `docker compose -p ragflow`), use the `-p` flag so the script can find the correct volumes:

```bash
bash docker/migration.sh -p ragflow backup
bash docker/migration.sh -p ragflow backup my_ragflow_backup
```

### Step 3: Transfer the backup folder

Copy the entire backup folder (e.g., `backup/` or `my_ragflow_backup/`) from your source machine to the RAGFlow project directory on your **target machine**. You can use tools like `scp`, `rsync`, or a physical drive for the transfer.

### Step 4: Restore your data

On the **target machine**, ensure that RAGFlow services are not running. Then, use the migration script to restore your data from the backup folder.

If your backup folder is named `backup/`, run:

```bash
bash docker/migration.sh restore
```

If you used a custom name, specify it in the command:

```bash
bash docker/migration.sh restore my_ragflow_backup
```

If the target machine uses a custom project name, use the `-p` flag to ensure the volumes are created with the correct prefix:

```bash
bash docker/migration.sh -p ragflow restore
bash docker/migration.sh -p ragflow restore my_ragflow_backup
```

The script will automatically create the necessary Docker volumes and unpack the data.

**Note:** If the script detects that Docker volumes with the same names already exist on the target machine, it will warn you that restoring will overwrite the existing data and ask for confirmation before proceeding.

### Step 5: Start RAGFlow services

Once the restore process is complete, you can start the RAGFlow services on your new machine:

```bash
docker compose -f docker/docker-compose.yml up -d
```

If you use a custom project name:

```bash
docker compose -p ragflow -f docker/docker-compose.yml up -d
```

**Note:** If you already have built a service by docker compose before, you may need to backup your data for target machine like this guide above and run like:

```bash
# Please backup by `bash docker/migration.sh backup backup_dir_name` before you do the following line.
# !!! this line -v flag will delete the original docker volume
docker compose -f docker/docker-compose.yml down -v
docker compose -f docker/docker-compose.yml up -d
```

Your RAGFlow instance is now running with all the data from your original machine.

## Migrate from multi-bucket to single-bucket mode

:::info KUDOS
This document is contributed by our community contributor [arogan178](https://github.com/arogan178). We may not actively maintain this document.
:::

By default, RAGFlow creates one bucket per Knowledge Base (dataset) and one bucket per user folder. This can be problematic when:

- Your cloud provider charges per bucket
- Your IAM policy restricts bucket creation
- You want all data organized in a single bucket with directory structure

The **Single Bucket Mode** allows you to configure RAGFlow to use a single bucket with a directory structure instead of multiple buckets.

### How it works

#### Default mode (Multiple buckets)

```
bucket: kb_12345/
  └── document_1.pdf
bucket: kb_67890/
  └── document_2.pdf
bucket: folder_abc/
  └── file_3.txt
```

#### Single bucket mode (with prefix_path)

```
bucket: ragflow-bucket/
  └── ragflow/
      ├── kb_12345/
      │   └── document_1.pdf
      ├── kb_67890/
      │   └── document_2.pdf
      └── folder_abc/
          └── file_3.txt
```

### Configuration

#### MinIO configuration

Edit your `service_conf.yaml` or set environment variables:

```yaml
minio:
  user: "your-access-key"
  password: "your-secret-key"
  host: "minio.example.com:443"
  bucket: "ragflow-bucket" # Default bucket name
  prefix_path: "ragflow" # Optional prefix path
```

Or using environment variables:

```bash
export MINIO_USER=your-access-key
export MINIO_PASSWORD=your-secret-key
export MINIO_HOST=minio.example.com:443
export MINIO_BUCKET=ragflow-bucket
export MINIO_PREFIX_PATH=ragflow
```

#### S3 configuration (already supported)

```yaml
s3:
  access_key: "your-access-key"
  secret_key: "your-secret-key"
  endpoint_url: "https://s3.amazonaws.com"
  bucket: "my-ragflow-bucket"
  prefix_path: "production"
  region: "us-east-1"
```

### IAM policy example

When using single bucket mode, you only need permissions for one bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": [
        "arn:aws:s3:::ragflow-bucket",
        "arn:aws:s3:::ragflow-bucket/*"
      ]
    }
  ]
}
```

### Migration from multi-bucket to single bucket

If you're migrating from multi-bucket mode to single-bucket mode:

1. **Set environment variables** for the new configuration
2. **Restart RAGFlow** services
3. **Migrate existing data** (optional):

```bash
# Example using mc (MinIO Client)
mc alias set old-minio http://old-minio:9000 ACCESS_KEY SECRET_KEY
mc alias set new-minio https://new-minio:443 ACCESS_KEY SECRET_KEY

# List all knowledge base buckets
mc ls old-minio/ | grep kb_ | while read -r line; do
    bucket=$(echo $line | awk '{print $5}')
    # Copy each bucket to the new structure
    mc cp --recursive old-minio/$bucket/ new-minio/ragflow-bucket/ragflow/$bucket/
done
```

### Toggle between modes

#### Enable single bucket mode

```yaml
minio:
  bucket: "my-single-bucket"
  prefix_path: "ragflow"
```

#### Disable (Use multi-bucket mode)

```yaml
minio:
  # Leave bucket and prefix_path empty or commented out
  # bucket: ''
  # prefix_path: ''
```

### Troubleshooting

#### Issue: Access Denied errors

**Solution**: Ensure your IAM policy grants access to the bucket specified in the configuration.

#### Issue: Files not found after switching modes

**Solution**: The path structure changes between modes. You'll need to migrate existing data.

#### Issue: Connection fails with HTTPS

**Solution**: Ensure `secure: True` is set in the MinIO connection (automatically handled for port 443).

### Storage backends supported

- ✅ **MinIO** - Full support with single bucket mode
- ✅ **AWS S3** - Full support with single bucket mode
- ✅ **Alibaba OSS** - Full support with single bucket mode
- ✅ **Azure Blob** - Uses container-based structure (different paradigm)
- ⚠️ **OpenDAL** - Depends on underlying storage backend

### Performance considerations

- **Single bucket mode** may have slightly better performance for bucket listing operations
- **Multi-bucket mode** provides better isolation and organization for large deployments
- Choose based on your specific requirements and infrastructure constraints