Deleting old files is slow and 'kills' IO performance
I'm using find to prune old files, lots of them. This takes minutes to hours to run, and other server processes encounter IO performance issues.
    find -mtime +100 -delete -print
I tried ionice, but it didn't appear to help.
    ionice -c 3
What can one do to (1) speed up the find operation and (2) avoid impacting other processes?
The FS is ext4. Is ext4 just bad at this kind of workload?
Kernel is 3.16.
Storage is 2x 1TB 7200rpm HDDs in RAID 1.
There's 93GB in 610228 files now, so 152KB/file on average.
Maybe I just shouldn't store so many files in a single directory?
linux debian ext4
edited Nov 23 '16 at 17:41
XTF
asked Nov 23 '16 at 15:38
XTF
Please add to the post how many files there are, and the disk technology.
– Rui F Ribeiro
Nov 23 '16 at 16:48
1 Answer
When you run the find command as you posted it, it removes the files one at a time as it finds them (with -delete, find unlinks each file itself rather than running rm). In terms of performance, this may not be the best way to do it.
To improve this, you can use the -exec option of find to pass the matched files to an rm command:
    find -mtime +100 -exec rm {} +
It is very important to use the + terminator instead of the alternative, ;. With +, find runs a single rm command for as many files as fit on one command line. With the ; terminator, find runs one rm command per file, so you would pay a per-file cost again.
For better performance, you can combine it with the ionice command, as you mentioned. If you don't notice an improvement in system performance, most likely the process is consuming some other resource more than I/O, such as CPU. In that case, you can use the renice command to lower the CPU priority of the process.
I would use the following:
    ionice -c 3 find -mtime +100 -exec rm {} +
Now, in another shell, find the PID of the find command:
    ps -ef | grep find
And finally run the renice command:
    renice +19 -p <PID_find_command>
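As a sketch of the steps above, the priority changes can also be applied when the command is started, so no second shell is needed to look up the PID. The scratch directory, the file names and the 2016 timestamp are invented for the demonstration, and falling back to plain nice when ionice is missing or not permitted is an assumption made here, not part of the original answer.

```shell
#!/bin/sh
# Sketch: start find at low CPU and I/O priority in one step.
# The scratch directory and file names are invented for this demo.
demo_dir=$(mktemp -d)
touch -t 201601010000 "$demo_dir/old report.log" "$demo_dir/old.log"  # mtime in 2016
touch "$demo_dir/new.log"                                             # mtime is now

# Use the idle I/O class only if ionice exists and is permitted here.
if ionice -c 3 true 2>/dev/null; then
    low_prio="ionice -c 3 nice -n 19"
else
    low_prio="nice -n 19"
fi

# $low_prio is left unquoted on purpose so it splits into separate words.
# rm is started by find and inherits the lowered priorities.
$low_prio find "$demo_dir" -type f -mtime +100 -exec rm -- {} +

remaining=$(ls "$demo_dir")
echo "remaining: $remaining"
rm -rf "$demo_dir"
```

Only the two files dated 2016 match -mtime +100, so new.log should be the only file left before the scratch directory is cleaned up.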
edited 11 hours ago
Rui F Ribeiro
answered Nov 23 '16 at 21:25
Rubén Alemán
Why not use xargs so that each spawn of an rm process deletes, say, 50 files at a time?
– steve
Nov 23 '16 at 23:00
By default, xargs will not work with filenames that contain blank spaces. You need to build the command with the -print0 option so that such names are handled, something like find -mtime +100 -print0 | xargs -0 rm. This complicates the command, and it does not work out of the box with the I/O priority change in one single command, as in the example of ionice with find and -exec. Also, using xargs doesn't offer better performance than using find with -exec and the + terminator, so I prefer the latter option.
– Rubén Alemán
Nov 24 '16 at 23:01
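One possible way to put the whole pipeline under a single priority setting is to wrap it in sh -c, so that find, xargs and rm all inherit it. This is only a sketch under assumptions: the scratch directory and file names are invented, -print0 and -0 are extensions found in GNU and BSD tools rather than POSIX, and the fallback to plain nice is a choice made here for portability.

```shell
#!/bin/sh
# Sketch: run the find | xargs pipeline under one low-priority wrapper.
# Directory and file names are hypothetical demo data.
work_dir=$(mktemp -d)
touch -t 201601010000 "$work_dir/old file with spaces.txt" "$work_dir/old.txt"
touch "$work_dir/fresh.txt"

# Prefer the idle I/O class when ionice works; otherwise lower CPU priority only.
if ionice -c 3 true 2>/dev/null; then
    wrapper="ionice -c 3 nice -n 19"
else
    wrapper="nice -n 19"
fi

# The pipeline lives inside sh -c, so every process in it inherits the priority.
# "$1" inside the quoted script is the directory passed after the script name.
$wrapper sh -c 'find "$1" -type f -mtime +100 -print0 | xargs -0 rm -f --' sh "$work_dir"

left_over=$(ls "$work_dir")
echo "left over: $left_over"
rm -rf "$work_dir"
```

The -f flag keeps rm quiet if find matches nothing and xargs still starts it once with no file arguments.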
Does -exec take care of spaces and other weird stuff? As we're IO bound, how does using rm improve performance?
– XTF
Nov 29 '16 at 17:56
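A small experiment can address the first part of this question. With -exec, find hands each path to the command as a separate argument without a shell re-parsing it, so spaces and even newlines in names should need no special handling. The sketch below uses invented file names in a scratch directory and also counts how often the command is started with + versus ;.

```shell
#!/bin/sh
# Sketch: -exec with awkward file names, and "+" versus ";" start counts.
# All names below are invented for the demo.
scratch=$(mktemp -d)
touch "$scratch/plain.txt" "$scratch/with space.txt" "$scratch/with
newline.txt"

# Each 'echo run' line marks one start of the command.
# tr strips the padding that some wc implementations add.
starts_plus=$(find "$scratch" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')
starts_semi=$(find "$scratch" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
echo "starts with +: $starts_plus, with ;: $starts_semi"

# Deleting works for every name, including the one containing a newline.
find "$scratch" -type f -exec rm -- {} +
files_left=$(find "$scratch" -type f | wc -l | tr -d ' ')
echo "files left: $files_left"
rm -rf "$scratch"
```

Either way each file is still removed by its own unlink call, so the + form saves process startups rather than disk operations.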