Updated on 2024-10-17 GMT+08:00

Using s3fs to Mount an OBS Bucket

Application Scenario

If you are used to working with locally stored data and want to access data that is now stored in OBS in the same way, s3fs can help.

s3fs is a file system tool based on Filesystem in Userspace (FUSE). You can use s3fs to mount an OBS bucket to a local file system in Linux and then work with objects the same way you work with local files. For more information about s3fs, see GitHub. If you encounter problems when using s3fs, see the FAQ for troubleshooting.

Solution Advantages

  • A set of POSIX features is supported, including file upload and download, directories, symbolic links, and user permission settings.
  • Multipart uploads are supported.
  • Local disks can be used as caches to improve I/O performance.

Constraints

  • Parallel file systems cannot be mounted using s3fs.
  • Random writes and appends require the entire file to be rewritten, which wastes bandwidth.
  • Metadata operations (like listing directories) perform poorly due to network latency.
  • Atomic rename of files or directories is not supported.
  • A bucket can be mounted to multiple cloud servers, but you need to prevent multiple servers from concurrently writing the same file.
  • Hard links are not supported.
  • s3fs interacts with storage servers over HTTP or HTTPS, which makes the client CPU overhead high.
  • The client caches system metadata. Before the cached metadata expires, metadata on the client and on the storage server may be inconsistent.
  • Because FUSE switches between user mode and kernel mode, s3fs is not recommended for high-concurrency scenarios.

Procedure

To use s3fs to mount an OBS bucket, do as follows:

  1. Install the required dependencies.

    sudo yum install automake fuse fuse-devel gcc-c++ git libcurl-devel libxml2-devel make openssl-devel

    The FUSE version on the cloud server must be 2.8.4 or later. FUSE 2.8.4 or later may not be compatible with some older operating systems, in which case you need to make the necessary adaptations.
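
The version requirement above can be checked with a short script. The following is a minimal sketch, not an official tool: the helper name version_ge is hypothetical, it assumes GNU sort (for the -V option), and it assumes that fusermount -V prints a line ending in the version number, such as "fusermount version: 2.9.2".

```shell
#!/usr/bin/env bash
# Return success (0) if version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="2.8.4"   # minimum FUSE version stated in this guide

# Assumption: the last field of the fusermount -V output is the version number.
if command -v fusermount >/dev/null 2>&1; then
    installed="$(fusermount -V 2>/dev/null | awk '{print $NF}')"
    if version_ge "$installed" "$required"; then
        echo "FUSE $installed meets the minimum ($required)"
    else
        echo "FUSE $installed is older than $required" >&2
    fi
else
    echo "fusermount not found; install the fuse package first" >&2
fi
```

On systems that ship FUSE 3, the binary may be named differently, so treat a "not found" message as a prompt to check manually rather than as proof that FUSE is missing.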

  2. Obtain s3fs using either of the following methods:

    • Method 1: Install s3fs from a package repository.
      # Ubuntu
      sudo apt install s3fs
      
      # CentOS
      sudo yum install epel-release 
      sudo yum install s3fs-fuse
    • Method 2: Download the required s3fs version (s3fs 1.91 is recommended) from GitHub.

      Then compile and install it as described in step 3.

  3. Compile and install s3fs. If you installed s3fs using method 1, skip this step.

    Go to the s3fs-fuse directory and run the following commands.
    ./autogen.sh
    ./configure
    make
    sudo make install

  4. Check the installation:

    s3fs --version

    If the s3fs version information is displayed in the command output, s3fs is installed properly.

  5. Configure the AK and SK.

    The command format is as follows:

    echo "AK:SK" >>/root/.passwd-s3fs
    chmod 600 /root/.passwd-s3fs

    To obtain the AK and SK, see Obtaining Access Keys (AK and SK).
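
The echo command above appends to the key file, so running it twice leaves two entries. The following sketch overwrites the file instead and creates it with owner-only permissions from the start. The function names are hypothetical, and the permission check assumes GNU stat (the -c option).

```shell
#!/usr/bin/env bash
# Write "AK:SK" to an s3fs key file that only its owner can access.
# Usage: write_passwd_file <access_key> <secret_key> [file]
write_passwd_file() {
    local ak="$1" sk="$2" file="${3:-$HOME/.passwd-s3fs}"
    (
        umask 077                               # a new file is created as mode 600
        printf '%s:%s\n' "$ak" "$sk" > "$file"  # overwrite rather than append
    )
    chmod 600 "$file"                           # also tighten a pre-existing file
}

# Return success if the permission bits of the file are exactly 600.
# Assumption: GNU stat, as shipped with common Linux distributions.
passwd_file_is_private() {
    [ "$(stat -c '%a' "$1")" = "600" ]
}
```

For example, write_passwd_file "$AK" "$SK" /root/.passwd-s3fs produces the same file as the commands above, where AK and SK are shell variables that you set to your own keys.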

  6. Mount a bucket.

    The mount command format is as follows:

    s3fs Bucket name Local mount directory -o passwd_file=Key file path -o url=Regional endpoint address -o nonempty -o big_writes -o max_write=131072 [-o ensure_diskfree=xxxx] [-o tmpdir=xxxx] Other mount parameters

    ensure_diskfree and tmpdir are optional. For when to configure them, see Common Parameters.

    For example, to mount the OBS bucket test-bucket in the CN-Hong Kong region, run the following command:

    s3fs test-bucket /mnt/s3fs-test -o passwd_file=/root/.passwd-s3fs -o url=https://obs.ap-southeast-1.myhuaweicloud.com -o nonempty -o big_writes -o max_write=131072 -o ensure_diskfree=2048 -o tmpdir=/data
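
To make the mapping between the command format and the example explicit, the sketch below assembles the mount command from its parts and prints it without running it. The function build_mount_cmd is hypothetical and only illustrates which placeholder goes where. The two optional parameters are added only when a value is supplied.

```shell
#!/usr/bin/env bash
# Print the s3fs mount command for the given settings without running it.
# Usage: build_mount_cmd <bucket> <mount_dir> <passwd_file> <endpoint_url> [diskfree_mb] [tmpdir]
build_mount_cmd() {
    local bucket="$1" dir="$2" passwd="$3" url="$4" diskfree="${5:-}" tmpdir="${6:-}"
    local cmd="s3fs $bucket $dir -o passwd_file=$passwd -o url=$url"
    cmd="$cmd -o nonempty -o big_writes -o max_write=131072"
    # Optional parameters: appended only when a value was passed in.
    if [ -n "$diskfree" ]; then
        cmd="$cmd -o ensure_diskfree=$diskfree"
    fi
    if [ -n "$tmpdir" ]; then
        cmd="$cmd -o tmpdir=$tmpdir"
    fi
    printf '%s\n' "$cmd"
}

# Reproduces the example command for test-bucket shown above.
build_mount_cmd test-bucket /mnt/s3fs-test /root/.passwd-s3fs \
    https://obs.ap-southeast-1.myhuaweicloud.com 2048 /data
```

Printing the command first lets you review it before mounting. Values that contain spaces would need extra quoting, which this sketch does not handle.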

  7. Verify the mount.

    Run the following command.
    df -h

    If information similar to the following is displayed, the bucket is mounted successfully.

    Filesystem      Size  Used Avail Use% Mounted on
    s3fs             16E     0   16E   0% /mnt/s3fs-test

    If no similar information is displayed, the mount failed. In this case, add the following parameters to the mount command to print the mount process and debugging logs in the command output.

    -d -d -f -o f2 -o curldbg
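
In scripts, reading the df -h output by eye is not practical. The following sketch checks the kernel mount table instead. The function name is hypothetical, and it assumes that s3fs mounts are listed in /proc/mounts with the file system type fuse.s3fs. The optional second argument exists only so that the check can be tried against a sample file.

```shell
#!/usr/bin/env bash
# Return success if <mount_dir> appears as an s3fs mount in the mount table.
# Usage: is_s3fs_mounted <mount_dir> [mounts_file]
# Assumption: s3fs mounts are listed with the file system type "fuse.s3fs".
is_s3fs_mounted() {
    local dir="$1" mounts="${2:-/proc/mounts}"
    awk -v d="$dir" '$2 == d && $3 == "fuse.s3fs" { found = 1 } END { exit !found }' "$mounts"
}

if is_s3fs_mounted /mnt/s3fs-test; then
    echo "mounted"
else
    echo "not mounted" >&2
fi
```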

Common Parameters

Table 1 Description of common parameters

Parameter

Description

tmpdir

Explanation:

Directory for caching temporary data

During read and write operations, s3fs uses part of the local directory space to cache temporary data by default to improve performance.

You are advised to select a disk directory instead of a shared memory directory.

NOTE:

You can run the df -h command to query the directory type and capacity usage.

Examples:
[root@huawei-esc ~]# df -h /tmp
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        40G   20G   19G  52% /
[root@huawei-esc ~]# df -h /run
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            32G  3.3G   29G  11% /run

The /tmp directory is on /dev/vda1, which is a disk.

The /run directory is on tmpfs, which is shared memory.

Default value:

/tmp

ensure_diskfree

Explanation:

Reserved space of the temporary cache directory, in MB.

CAUTION:

If this parameter is not specified, the directory specified by tmpdir may fill up, which can affect other running processes.

You are advised to set this parameter to 10% of the available capacity of the directory specified by tmpdir.

Default value:

0

compat_dir

Explanation:

Directory compatibility mode. s3fs recognizes as many forms of directory objects as possible and treats them as directories.

dir/ and dir_$folder$ can be treated as directory objects.

CAUTION:

This parameter is mandatory for s3fs 1.92. Otherwise, multi-level directory objects in a bucket cannot be displayed.

Assume that you use an SDK to create an object a/b/c or a/b/c/. If this parameter is not added, directory a cannot be displayed.

s3fs treats objects ending with a slash (/) as directories.

Default value:

N/A

allow_other

Explanation:

Allows users other than the one who mounted the bucket to access the mount directory.

Default value:

N/A

umask

Explanation:

Permission mask that specifies which permissions are withheld from all files in the mounted file system.

Default value:

0000

nonempty

Explanation:

After this parameter is added, a bucket can be mounted to a non-empty directory.

Default value:

N/A

multipart_size

Explanation:

Part size in a multipart upload, in MB. The part size determines the maximum size of a file that can be uploaded. For details, see Uploading Objects Using a Multipart Upload.

Value range:

5 to 5120, in MB

Default value:

10

no_check_certificate

Explanation:

Disables verification of the server certificate. This parameter takes effect only when HTTPS is used. By default, certificate verification is enabled.

Default value:

N/A

use_cache

Explanation:

Local directory used to cache files. Using this parameter improves I/O performance but increases disk usage. This parameter can be used together with del_cache.

Default value:

"" (indicating that caching is not used)

del_cache

Explanation:

Using this parameter will delete local cache files when s3fs starts or exits.

Default value:

N/A
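
The tmpdir and ensure_diskfree guidance in Table 1 can be turned into a small calculation. The sketch below reads the available space of a candidate cache directory with df, warns if the directory is on tmpfs, and suggests 10% of the available space in MB. All function names are hypothetical. The tmpfs check looks at the Filesystem column of the df output, as in the example in the table.

```shell
#!/usr/bin/env bash
# Suggest an ensure_diskfree value (MB): 10% of the available space, given in KB.
suggest_diskfree_mb() {
    local avail_kb="$1"
    echo $(( avail_kb / 1024 / 10 ))
}

# Print the available space (KB) of the file system that holds a directory.
# df -P selects the portable one-line-per-file-system format; -k reports KB.
avail_kb_of() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return success if a directory is backed by tmpfs (memory) rather than a disk.
is_tmpfs() {
    [ "$(df -P "$1" | awk 'NR == 2 { print $1 }')" = "tmpfs" ]
}

cache_dir="/tmp"   # the default tmpdir from Table 1
if is_tmpfs "$cache_dir"; then
    echo "$cache_dir is tmpfs; consider a disk-backed directory instead" >&2
fi
echo "-o tmpdir=$cache_dir -o ensure_diskfree=$(suggest_diskfree_mb "$(avail_kb_of "$cache_dir")")"
```

With exactly 19 GiB available (19922944 KB), suggest_diskfree_mb returns 1945. Integer division rounds down, so round the result up if you prefer a more conservative reserve.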

For more parameters, see s3fs-fuse.
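
As a worked example of how multipart_size in Table 1 limits the file size, the sketch below multiplies the part size by the maximum number of parts. The limit of 10,000 parts per multipart upload is an assumption here and is not stated in this guide, so confirm it in Uploading Objects Using a Multipart Upload. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Largest file size (in MB) reachable with a given part size.
# Assumption: a multipart upload accepts at most 10,000 parts; confirm this
# limit in the OBS multipart upload documentation before relying on it.
max_object_mb() {
    local part_mb="$1" max_parts="${2:-10000}"
    echo $(( part_mb * max_parts ))
}

max_object_mb 10     # default multipart_size; prints 100000
max_object_mb 5120   # largest multipart_size; prints 51200000
```

Under that assumption, the default part size of 10 MB caps a single file at 100000 MB (about 97.7 GB), so larger files need a larger multipart_size.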