how to speed up tar, just build a package without compression


I have a large folder, 2 TB with about 1,000,000 files, on a Linux machine. I want to build a package with tar. I do not care about the size of the tar file, so I do not need to compress the data. How can I speed tar up? It takes me an hour to build the archive with tar -cf xxx.tar xxx/. I have a powerful machine with a 28-core CPU and 500 GB of memory; is there a way to make tar run multithreaded?



Or, alternatively, is there any good way to transfer a large number of small files between different folders and between different servers? My filesystem is ext4.










  • 3 — Tar does not compress. – ctrl-alt-delor, 9 hours ago

  • 1 — What will you do with the tar file once you have it? This may affect the answer, as a speed-up in one area may slow down another, or vice versa. – ctrl-alt-delor, 9 hours ago

  • 1 — The CPU and the number of cores don't make much difference, as creating a tar archive is disk-bound, not CPU-bound. You could run several tar processes in parallel, each handling its own subset of the files and creating a separate archive, but they would still need to fetch all the data from the single disk. – Kusalananda, 9 hours ago

  • @ctrl-alt-delor After creating the tar file, I will transfer it over the network or just mv it to another folder. – Guo Yong, 9 hours ago

  • 1 — @GuoYong The ideal split would be a combination of the number of files and aggregate disk usage. If you're looking to copy the files to another server, why not just use scp and skip the tar phase entirely? – roaima, 9 hours ago
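Kusalananda's parallel-subset idea can be sketched as below. The directory layout and paths are invented for the demo, and on a single spinning disk the parallelism may buy little, since all the tar processes still compete for the same spindle:

```shell
# Build a throwaway tree with two top-level subdirectories (hypothetical layout)
rm -rf /tmp/ptar && mkdir -p /tmp/ptar/xxx/a /tmp/ptar/xxx/b
echo one > /tmp/ptar/xxx/a/f1
echo two > /tmp/ptar/xxx/b/f2

cd /tmp/ptar
# One uncompressed tar per subdirectory, all running concurrently
for d in xxx/*/; do
  name=$(basename "$d")
  tar -cf "$name.tar" "$d" &
done
wait    # block until every background tar has finished
ls /tmp/ptar/*.tar
```

This produces one archive per subdirectory (a.tar, b.tar, ...), which you would then transfer or recombine as needed.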

















tags: tar, file-management






edited 9 hours ago by terdon






asked 10 hours ago by Guo Yong




2 Answers
As @Kusalananda says in the comments, tar is disk-bound. One of the best things you can do is put the output on a separate disk so the writing doesn't slow down the reading.



If your next step is to move the file across the network, I'd suggest that you create the tar file over the network in the first place:



$ tar -cf - xxx/ | ssh otherhost 'cat > xxx.tar'


This way the local host only has to read the files, and doesn't have to also accommodate the write bandwidth consumed by tar. The disk output from tar is absorbed by the network connection and the disk system on otherhost.
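The same pipe can also unpack the archive on the fly instead of storing it. Here is a minimal local stand-in for the ssh variant (paths are invented for the demo; over the network you would put `ssh otherhost` in front of the second tar):

```shell
# Throwaway source tree and destination (hypothetical paths)
rm -rf /tmp/tardemo && mkdir -p /tmp/tardemo/src /tmp/tardemo/dst
echo hello > /tmp/tardemo/src/file.txt

# Stream the archive straight into an extracting tar; no .tar file ever
# touches the disk, so the source side only pays for reads.
tar -C /tmp/tardemo -cf - src | tar -xf - -C /tmp/tardemo/dst

cat /tmp/tardemo/dst/src/file.txt    # → hello
```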






answered 9 hours ago by Jim L.
  • 1 — Better yet, you might just untar at the other end, and save half the writes there. OP seems to just want to transfer some files. – muru, 8 hours ago


















Or, alternatively, is there any good way to transfer a large number of small files between different folders and between different servers? My filesystem is ext4.




Rsync over ssh is something I use on a regular basis. It preserves file permissions, symlinks, etc., when used with the --archive option:



rsync -av /mnt/data <server>:/mnt


This example copies the local directory /mnt/data and its contents to a remote server, inside /mnt. It invokes ssh to set up the connection; no rsync daemon is needed on either side of the wire.



This operation can also be performed between two local directories, or from remote to local.






answered 7 hours ago by Tim (edited 1 hour ago by K7AAY)
























