Using s3fs to Mount an OBS Bucket
Application Scenario
If you are used to working with locally stored data but your data now resides in OBS, s3fs lets you keep accessing it the same way.
s3fs is a file system tool based on Filesystem in Userspace (FUSE). You can use s3fs to mount an OBS bucket to a local file system in Linux, and then operate on objects the same way as you operate on local files. For more information about s3fs, see GitHub. If you encounter any problems when using s3fs, troubleshoot them by referring to the FAQ.
Solution Advantages
- A subset of POSIX features is supported, including file upload/download, directories, soft links, and user permission settings.
- Multipart uploads are supported.
- Local disks can be used as caches to improve I/O performance.
Constraints
- Parallel file systems cannot be mounted using s3fs.
- A random write or append requires rewriting the entire file, which wastes bandwidth.
- Metadata operations (like listing directories) perform poorly due to network latency.
- Atomic rename of files or directories is not supported.
- A bucket can be mounted to multiple cloud servers, but you need to prevent multiple servers from concurrently writing the same file.
- Hard links are not supported.
- s3fs interacts with storage servers over HTTP or HTTPS, which makes the client CPU overhead high.
- The client caches system metadata. Before the cached metadata expires, metadata on the client and on the storage server may be inconsistent.
- Because FUSE involves switching between user mode and kernel mode, s3fs is not recommended for high-concurrency scenarios.
Procedure
To use s3fs to mount an OBS bucket, do as follows:
- Install the required dependencies.
yum install fuse
sudo yum install automake fuse fuse-devel gcc-c++ git libcurl-devel libxml2-devel make openssl-devel
The FUSE version deployed on a cloud server must be 2.8.4 or later. Such versions may not be compatible with some older operating systems, in which case you need to adapt your environment accordingly.
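The version requirement above can be checked before you install s3fs. The following is a minimal sketch, assuming GNU sort (for its -V version sort) and assuming that fusermount -V prints the version number as the last field of its output. version_ge is a hypothetical helper, not part of FUSE or s3fs.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED_FUSE_VERSION=2.8.4

# Only run the check when fusermount is available on this machine.
if command -v fusermount >/dev/null 2>&1; then
    # Assumption: the output looks like "fusermount version: 2.9.2".
    installed=$(fusermount -V 2>/dev/null | awk '{print $NF}')
    if version_ge "$installed" "$REQUIRED_FUSE_VERSION"; then
        echo "FUSE $installed meets the minimum version $REQUIRED_FUSE_VERSION"
    else
        echo "FUSE $installed is older than $REQUIRED_FUSE_VERSION" >&2
    fi
else
    echo "fusermount not found; install FUSE first" >&2
fi
```

A plain string comparison would rank 2.10.0 below 2.8.4, which is why the sketch uses a version-aware sort.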
- Install s3fs using either of the following methods:
- Method 1: Use a mirror source to install s3fs.
# Ubuntu
sudo apt install s3fs
# CentOS
sudo yum install epel-release
sudo yum install s3fs-fuse
- Method 2: Download a required s3fs version (s3fs 1.91 is recommended) from GitHub.
Compile and install it.
- Compile and install s3fs. If you used Method 1 to install s3fs, skip this step.
Go to the s3fs-fuse directory and run the following commands.
./autogen.sh
./configure
make
sudo make install
- Check the installation:
s3fs --version
If the s3fs version information is displayed in the command output, s3fs has been installed properly.
- Configure the AK and SK.
The command format is as follows:
echo "AK:SK" >>/root/.passwd-s3fs
chmod 600 /root/.passwd-s3fs
To obtain the AK and SK, see Obtaining Access Keys (AK and SK).
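The key file step can also be wrapped in a small function so that the file is never readable by other users, not even briefly. This is a sketch only: write_passwd_file is a hypothetical helper, and unlike the command above, which appends with >>, it overwrites the target file.

```shell
#!/bin/sh
# Hypothetical helper: write an "AK:SK" key file that only its owner can read.
# Arguments: key file path, access key (AK), secret key (SK).
write_passwd_file() {
    passwd_file=$1
    ak=$2
    sk=$3
    # Create the file inside a subshell with a restrictive umask so that it is
    # never group- or world-readable. Note: this overwrites any existing content.
    ( umask 077; printf '%s:%s\n' "$ak" "$sk" > "$passwd_file" ) || return 1
    # Set the mode explicitly in case the file already existed with looser permissions.
    chmod 600 "$passwd_file"
}

# Example usage (replace the placeholders with your own AK and SK):
# write_passwd_file /root/.passwd-s3fs "AK" "SK"
```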
- Mount a bucket.
The mount command format is as follows:
s3fs <bucket name> <local mount directory> -o passwd_file=<key file path> -o url=<regional endpoint address> -o nonempty -o big_writes -o max_write=131072 [-o ensure_diskfree=xxxx] [-o tmpdir=xxxx] [other mount parameters]
ensure_diskfree and tmpdir are optional. For when to configure them, see Common Parameters.
Suppose you want to use s3fs to mount an OBS bucket test-bucket in the CN-Hong Kong region. Run the following command.
s3fs test-bucket /mnt/s3fs-test -o passwd_file=/root/.passwd-s3fs -o url=https://obs.ap-southeast-1.myhuaweicloud.com -o nonempty -o big_writes -o max_write=131072 -o ensure_diskfree=2048 -o tmpdir=/data
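To show how the placeholders in the command format map to actual values, the mount can be wrapped in a function. This is a minimal sketch: mount_obs_bucket and the S3FS_BIN override variable are hypothetical names introduced here for illustration. The s3fs options are the ones used in the example above.

```shell
#!/bin/sh
# Hypothetical helper: mount an OBS bucket with the options used in this guide.
# Arguments: bucket name, local mount directory, key file path, regional endpoint.
# S3FS_BIN is a hypothetical override (for example, S3FS_BIN=echo for a dry run).
mount_obs_bucket() {
    bucket=$1
    mount_dir=$2
    passwd_file=$3
    endpoint=$4
    # Make sure the mount directory exists before mounting.
    mkdir -p "$mount_dir" || return 1
    "${S3FS_BIN:-s3fs}" "$bucket" "$mount_dir" \
        -o passwd_file="$passwd_file" \
        -o url="$endpoint" \
        -o nonempty -o big_writes -o max_write=131072
}

# Example usage, matching the CN-Hong Kong example:
# mount_obs_bucket test-bucket /mnt/s3fs-test /root/.passwd-s3fs \
#     https://obs.ap-southeast-1.myhuaweicloud.com
```

Setting S3FS_BIN=echo prints the arguments instead of mounting, which is a convenient way to review the command before running it.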
- Verify the mount.
Run the following command.
df -h
If information similar to the following is displayed, the bucket is mounted successfully.
Filesystem      Size  Used Avail Use% Mounted on
s3fs             16E     0   16E   0% /mnt/s3fs-test
If no such information is displayed, the mount failed. In this case, add the following parameters to the mount command to print the mount process and debugging logs in the command output:
-d -d -f -o f2 -o curldbg
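The df -h check can also be scripted, for example in a startup or monitoring script. The sketch below assumes that an s3fs mount appears in /proc/mounts with the file system type fuse.s3fs. is_s3fs_mounted is a hypothetical helper, and its optional second argument (an alternative mounts file) exists only to make the function easy to test.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given directory is an s3fs mount point.
# Arguments: mount directory, optional mounts file (defaults to /proc/mounts).
# Assumption: s3fs mounts are listed with the file system type "fuse.s3fs".
is_s3fs_mounted() {
    awk -v dir="$1" '
        $2 == dir && $3 == "fuse.s3fs" { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Example usage:
# if is_s3fs_mounted /mnt/s3fs-test; then
#     echo "bucket is mounted"
# else
#     echo "bucket is not mounted" >&2
# fi
```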
Common Parameters
Parameter | Description |
---|---|
tmpdir | Explanation: Directory for caching temporary data. During read and write operations, s3fs by default uses part of the local directory space to cache temporary data and improve performance. You are advised to select a disk directory instead of a shared memory directory. NOTE: You can run the df -h command to query the directory type and capacity usage. For example, df -h /tmp reports file system /dev/vda1 (size 40G, used 20G, available 19G, 52% used, mounted on /), which is a disk, whereas df -h /run reports file system tmpfs (size 32G, used 3.3G, available 29G, 11% used, mounted on /run), which is shared memory. Default value: /tmp |
ensure_diskfree | Explanation: Space to keep free in the temporary cache directory, in MB. CAUTION: If this parameter is not specified, the directory specified by tmpdir may fill up, which may affect other processes. You are advised to set this parameter to 10% of the available capacity of the directory specified by tmpdir. Default value: 0 |
compat_dir | Explanation: Directory compatibility mode. s3fs recognizes directory objects in as many forms as possible and treats them as directories. Both dir/ and dir_$folder$ can be treated as directory objects. CAUTION: This parameter is mandatory for s3fs 1.92. Otherwise, multi-level directory objects in a bucket cannot be displayed. For example, if you use an SDK to create an object a/b/c or a/b/c/ and this parameter is not set, directory a cannot be displayed. s3fs treats objects ending with a slash (/) as directories. Default value: N/A |
allow_other | Explanation: Allows other users to access the mount directory. Default value: N/A |
umask | Explanation: Permission bits to be removed from all files in the file system. Default value: 0000 |
nonempty | Explanation: Allows a bucket to be mounted to a non-empty directory. Default value: N/A |
multipart_size | Explanation: Part size in a multipart upload, in MB. The part size limits the maximum size of a file that can be uploaded. For details, see Uploading Objects Using a Multipart Upload. Value range: 5 to 5120. Default value: 10 |
no_check_certificate | Explanation: Disables server certificate verification. This parameter is valid only when HTTPS is used. Certificate verification is enabled by default. Default value: N/A |
use_cache | Explanation: Local directory for caching files. Using this parameter improves I/O performance but increases disk usage. It can be used together with del_cache. Default value: "" (caching is not used) |
del_cache | Explanation: Deletes local cache files when s3fs starts or exits. Default value: N/A |
For more parameters, see s3fs-fuse.
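The advice to set ensure_diskfree to 10% of the available capacity of the tmpdir directory can be turned into a quick calculation. The following is a sketch under stated assumptions: both function names are hypothetical, and the available space is read from the fourth column of POSIX-format df output (df -Pk), which reports 1024-byte blocks.

```shell
#!/bin/sh
# Hypothetical helper: convert available space in KB to a 10% reserve in MB.
reserve_mb_from_kb() {
    echo $(( $1 / 1024 / 10 ))
}

# Hypothetical helper: suggest an ensure_diskfree value (in MB) for a cache directory.
suggest_ensure_diskfree() {
    # df -P prints one data line per file system; -k reports sizes in KB.
    # Column 4 of that line is the available space.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    reserve_mb_from_kb "$avail_kb"
}

# Example usage when mounting with tmpdir=/data:
# s3fs test-bucket /mnt/s3fs-test ... -o tmpdir=/data \
#     -o ensure_diskfree="$(suggest_ensure_diskfree /data)"
```

For a directory with 10 GB (10485760 KB) available, the sketch yields 1024, that is, a reserve of about 1 GB.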