@arogan178
Contributor

Overview

This PR adds support for Single Bucket Mode in RAGFlow. Users can configure MinIO/S3 to store everything in a single bucket, organized by directory structure, instead of creating a separate bucket for each Knowledge Base and each user folder.

Problem Statement

The current implementation creates one bucket per Knowledge Base and one bucket per user folder, which can be problematic when:

  • Cloud providers charge per bucket
  • IAM policies restrict bucket creation
  • Organizations want centralized data management in a single bucket

Solution

Added bucket and prefix_path configuration options to the MinIO connector, which enable:

  • Using a single bucket with directory-based organization
  • Backward compatibility with existing multi-bucket deployments
  • Support for MinIO, AWS S3, and other S3-compatible storage backends

Changes

  • rag/utils/minio_conn.py: Enhanced MinIO connector to support single bucket mode with prefix paths
  • conf/service_conf.yaml: Added new configuration options (bucket and prefix_path)
  • docker/service_conf.yaml.template: Updated template with single bucket configuration examples
  • docker/.env.single-bucket-example: Added example environment variables for single bucket setup
  • docs/single-bucket-mode.md: Comprehensive documentation covering usage, migration, and troubleshooting

Configuration Example

minio:
  user: "access-key"
  password: "secret-key"
  host: "minio.example.com:443"
  bucket: "ragflow-bucket"      # Single bucket name
  prefix_path: "ragflow"         # Optional prefix path

Backward Compatibility

✅ Fully backward compatible - existing deployments continue to work without any changes

  • If bucket is not configured, uses default multi-bucket behavior
  • If bucket is configured without prefix_path, uses bucket root
  • If both are configured, uses bucket/prefix_path/ structure
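The three cases above can be sketched as a small path-resolution helper. This is a minimal illustration, not the code in rag/utils/minio_conn.py: the function name resolve_location and its parameters are hypothetical, and it assumes that in single bucket mode the former per-Knowledge-Base bucket name becomes a directory inside the configured bucket.

```python
from typing import Optional, Tuple


def resolve_location(
    logical_bucket: str,
    object_name: str,
    single_bucket: Optional[str] = None,
    prefix_path: Optional[str] = None,
) -> Tuple[str, str]:
    """Map a logical (bucket, object) pair to a physical (bucket, key) pair.

    Hypothetical helper: `single_bucket` and `prefix_path` stand in for the
    `bucket` and `prefix_path` options in the minio section of the config.
    """
    if not single_bucket:
        # Multi-bucket mode (default): nothing changes.
        return logical_bucket, object_name

    # Single bucket mode: the logical bucket name becomes a directory
    # (assumption), optionally nested under prefix_path.
    parts = [p.strip("/") for p in (prefix_path or "", logical_bucket, object_name)]
    key = "/".join(p for p in parts if p)
    return single_bucket, key


if __name__ == "__main__":
    print(resolve_location("kb-123", "doc.pdf"))
    print(resolve_location("kb-123", "doc.pdf", "ragflow-bucket"))
    print(resolve_location("kb-123", "doc.pdf", "ragflow-bucket", "ragflow"))
```

Keeping the mapping in one place like this would leave callers unaware of which mode is active, which is one way the multi-bucket default could stay untouched.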

Testing

  • Tested with MinIO (local and cloud)
  • Verified backward compatibility with existing multi-bucket mode
  • Validated IAM policy restrictions work correctly
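For the restricted-IAM scenario, a policy scoped to one bucket and prefix might look roughly like the one generated below. This is a sketch under assumptions, not the policy shipped in docs/single-bucket-mode.md: the helper name build_single_bucket_policy is hypothetical, the action list is a guess at what the connector needs, and some backends may require list permissions without the prefix condition.

```python
import json
from typing import Any, Dict


def build_single_bucket_policy(bucket: str, prefix_path: str = "") -> Dict[str, Any]:
    """Build an S3-style IAM policy limited to one bucket and optional prefix.

    Hypothetical helper; note that it grants no bucket-creation permission.
    """
    prefix = prefix_path.strip("/")
    object_prefix = f"{prefix}/" if prefix else ""

    # Listing is granted on the bucket itself, narrowed to the prefix if set.
    list_statement: Dict[str, Any] = {
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": [f"arn:aws:s3:::{bucket}"],
    }
    if prefix:
        list_statement["Condition"] = {"StringLike": {"s3:prefix": [f"{prefix}/*"]}}

    # Object reads, writes, and deletes are granted only under the prefix.
    object_statement: Dict[str, Any] = {
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
        "Resource": [f"arn:aws:s3:::{bucket}/{object_prefix}*"],
    }

    return {"Version": "2012-10-17", "Statement": [list_statement, object_statement]}


if __name__ == "__main__":
    print(json.dumps(build_single_bucket_policy("ragflow-bucket", "ragflow"), indent=2))
```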

Documentation

Included comprehensive documentation in docs/single-bucket-mode.md covering:

  • Configuration examples
  • Migration guide from multi-bucket to single-bucket mode
  • IAM policy examples
  • Troubleshooting guide

Related Issue: Addresses use cases where bucket creation is restricted or costly

@dosubot (bot) added the size:XL, 🌈 python, and 💞 feature labels Nov 20, 2025
@KevinHuSh added the ci label Nov 21, 2025
@KevinHuSh marked this pull request as draft November 21, 2025 01:58
@KevinHuSh marked this pull request as ready for review November 21, 2025 01:58
@KevinHuSh
Collaborator

Appreciations!
Please correct the CI failure.

@arogan178
Contributor Author

Appreciations! Please correct the CI failure.

Fixed

@arogan178
Contributor Author

@KevinHuSh any updates on the CI failure? It seems to be timing out on the ES SDK tests.
