Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Could you expand on this comment? s3fs uses multiple threads for uploading and populating readdir metadata via S3fsMultiCurl::MultiPerform. Earlier versions used curl_multi_perform which may have hidden some of the parallelism from you.


Huh, right you are. I did a cursory search, but whatever I saw clearly was wrong.

That said, something is going on that I can't explain:

    # Default settings for aws
    $ time aws s3 cp /dev/shm/1GB.bin s3://test-kihaqtowex/a
    upload: ../../dev/shm/1GB.bin to s3://test-kihaqtowex/a

    real    0m10.312s
    user    0m5.909s
    sys     0m4.204s

    $ aws configure set default.s3.max_concurrent_requests 4
    $ time aws s3 cp /dev/shm/1GB.bin s3://test-kihaqtowex/b
    upload: ../../dev/shm/1GB.bin to s3://test-kihaqtowex/b

    real    0m26.732s
    user    0m4.989s
    sys     0m2.741s

    $ time cp /dev/shm/1GB.bin s3fs_mount_-kihaqtowex/c

    real    0m26.368s
    user    0m0.006s
    sys     0m0.699s
(I did multiple runs, though not hundreds so I can't say with certainty this would hold)

I swear the difference was worse in the past, so maybe things have improved. Still not worth the added slowness to me for my use cases.


Thanks for testing! You may improve performance via increasing -o parallel_count (default="5"). Newer versions of s3fs have improved performance but please report any cliffs at https://github.com/s3fs-fuse/s3fs-fuse/issues .




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: