How to speed up tar: just build a package without compression
I have a large folder (2 TB, 1,000,000 files) on a Linux machine, and I want to build a package from it with tar. I do not care about the size of the tar file, so I do not need to compress the data. How can I speed tar up? It takes me an hour to build the package with tar -cf xxx.tar xxx/. I have a powerful CPU with 28 cores and 500 GB of memory; is there a way to make tar run multithreaded?

Or, alternatively, is there a good way to transfer a large number of small files between different folders and between different servers? My filesystem is ext4.

asked 10 hours ago by Guo Yong
Tags: tar, file-management
Comments:

Tar does not compress. – ctrl-alt-delor, 9 hours ago

What will you do with the tar file when you have it? This may affect the answer, as a speed-up in one area may slow down another, or vice versa. – ctrl-alt-delor, 9 hours ago

The CPU and the number of cores don't make much difference, as creating a tar archive is disk-bound, not CPU-bound. You could use several tar processes running in parallel, each handling its own subset of the files and creating separate archives, but they would still need to fetch all the data from the single disk. [See the sketch after these comments.] – Kusalananda, 9 hours ago

@ctrl-alt-delor After creating the tar file, I will transfer it over the network or just mv it to another folder. – Guo Yong, 9 hours ago

@GuoYong The ideal split would be a combination of the number of files and the aggregate disk usage. If you're looking to copy the files to another server, why not just use scp and skip the tar phase entirely? – roaima, 9 hours ago
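As a rough illustration of Kusalananda's parallel-archiving idea, here is a minimal sketch, assuming GNU parallel is installed, that the top-level subdirectories of xxx/ split the data reasonably evenly, and that /output is a hypothetical destination on a separate disk:

# One uncompressed archive per top-level subdirectory of xxx/,
# running up to four tar processes at once ({/} is the subdirectory's basename).
# On a single spinning disk this can be slower than one tar,
# because the parallel readers compete for seeks.
find xxx/ -mindepth 1 -maxdepth 1 -type d -print0 |
    parallel -0 -j4 tar -cf /output/{/}.tar {}

And roaima's alternative, skipping tar entirely (the destination path is again hypothetical):

scp -r xxx/ otherhost:/destination/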
2 Answers
As @Kusalananda says in the comments, tar is disk-bound. One of the best things you can do is put the output on a separate disk, so that writing doesn't slow down reading.

If your next step is to move the file across the network, I'd suggest creating the tar file over the network in the first place:

$ tar -cf - xxx/ | ssh otherhost 'cat > xxx.tar'

This way the local host only has to read the files; it doesn't also have to accommodate the write bandwidth consumed by tar. The disk output from tar is absorbed by the network connection and the disk system on otherhost.

answered 9 hours ago by Jim L.

Comment: Better yet, you might just untar at the other end and save half the writes there; the OP seems to just want to transfer some files. – muru, 8 hours ago
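A minimal sketch of muru's suggestion: extract the stream on the receiving side instead of storing an intermediate archive, assuming the hypothetical destination directory /destination already exists on otherhost:

# Stream the archive over ssh and unpack it directly into /destination
$ tar -cf - xxx/ | ssh otherhost 'tar -xf - -C /destination'

No xxx.tar is ever written; the remote tar unpacks files as they arrive, which halves the writes on the remote disk compared with storing the archive and extracting it afterwards.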
Quoting the question: "Or, alternatively, is there any good way to transfer a large number of small files between different folders and between different servers? My filesystem is ext4."

Rsync over ssh is something I use on a regular basis. It preserves file permissions, symlinks, and so on when used with the --archive option (the -a in -av):

rsync -av /mnt/data <server>:/mnt

This example copies the local directory /mnt/data and its contents to a remote server, inside /mnt. It invokes ssh to set up the connection; no rsync daemon is needed on either side of the wire. The same operation can also be performed between two local directories, or from remote to local.

answered 7 hours ago by Tim (edited by K7AAY)
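A short follow-on sketch of those two other directions, with hypothetical paths; note rsync's trailing-slash rule (a trailing slash on the source copies the directory's contents, no trailing slash copies the directory itself):

# Between two local directories: copy the contents of /mnt/data into /backup/data
rsync -av /mnt/data/ /backup/data/

# From remote to local: pull /mnt/data from the server into the local /mnt
rsync -av <server>:/mnt/data /mnt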