


Why doesn't shell automatically fix “useless use of cat”?











25















Many people use one-liners and scripts containing code along the lines of



cat "$MYFILE" | command1 | command2 > "$OUTPUT"


The first cat is often called a "useless use of cat" because it requires starting a new process (often /usr/bin/cat), which could be avoided if the command had been written as



< "$MYFILE" command1 | command2 > "$OUTPUT"


because then the shell only needs to start command1 and simply point its stdin to the given file.



Why doesn't the shell do this conversion automatically? I feel that the "useless use of cat" syntax is easier to read, and the shell should have enough information to get rid of the useless cat automatically. cat is defined in the POSIX standard, so the shell should be allowed to implement it internally instead of using a binary in the path. The shell could even contain an implementation for only the single-argument version and fall back to the binary in the path otherwise.
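
For a filter that only reads its standard input from start to end, the two spellings give the same output. Here is a minimal sketch of that, where tr and sort are hypothetical stand-ins for command1 and command2 and the sample file is a throwaway created with mktemp (assumed to be available, although it is not part of POSIX):

```shell
#!/bin/sh
# Form 1: an extra cat process feeds command1 through a pipe.
pipeline_with_cat() {
    cat "$1" | tr '[:lower:]' '[:upper:]' | sort
}

# Form 2: the shell opens the file itself and hands it to command1 as stdin.
pipeline_with_redirect() {
    < "$1" tr '[:lower:]' '[:upper:]' | sort
}

# Throwaway sample input.
MYFILE=$(mktemp)
printf 'pear\napple\nfig\n' > "$MYFILE"

pipeline_with_cat "$MYFILE"
pipeline_with_redirect "$MYFILE"

rm -f "$MYFILE"
```

Both calls print the same three upper-cased, sorted lines here. That only shows the forms agree for this particular filter, not that they are interchangeable in general.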



























  • 22





    Those commands are not actually equivalent, since in one case stdin is a file, and in the other it's a pipe, so it wouldn't be a strictly safe conversion. You could make a system that did it, though.

    – Michael Homer
    yesterday







  • 13





    That you can't imagine a use case doesn't mean that an application isn't allowed to rely on the specified behaviour uselessly. Getting an error from lseek is still defined behaviour and could cause a different outcome, the different blocking behaviour can be semantically meaningful, etc. It would be allowable to make the change if you knew what the other commands were and knew they didn't care, or if you just didn't care about compatibility at that level, but the benefit is pretty small. I do imagine the lack of benefit drives the situation more than the conformance cost.

    – Michael Homer
    yesterday






  • 3





    The shell absolutely is allowed to implement cat itself, though, or any other utility. It's also allowed to know how the other utilities that belong to the system work (e.g. it can know how the external grep implementation that came with the system behaves). This is completely viable to do, so it's entirely fair to wonder why they don't.

    – Michael Homer
    yesterday






  • 6





    @MichaelHomer "e.g. it can know how the external grep implementation that came with the system behaves" So the shell now has a dependency on the behavior of grep. And sed. And awk. And du. And how many hundreds if not thousands of other utilities?

    – Andrew Henle
    yesterday






  • 17





    It would be pretty uncool of my shell to edit my commands for me.

    – Azor Ahai
    yesterday

















shell-script performance posix
















asked yesterday









Mikko Rantalainen

















9 Answers


















43














"Useless use of cat" is more about how you write your code than about what actually runs when you execute the script. It's a sort of design anti-pattern, a way of going about something that could probably be done in a more efficient manner. It's a failure in understanding of how to best combine the given tools to create a new tool. I'd argue that stringing several sed and/or awk commands together in a pipeline also could be said to be a symptom of this same anti-pattern.



Fixing instances of "useless use of cat" in a script is primarily a matter of fixing the source code of the script manually. A tool such as ShellCheck can help with this by pointing them out:



$ cat script.sh
#!/bin/sh
cat file | cat




$ shellcheck script.sh

In script.sh line 2:
cat file | cat
^-- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.



Getting the shell to do this automatically would be difficult due to the nature of shell scripts. The way a script executes depends on the environment inherited from its parent process, and on the specific implementation of the available external commands.



The shell does not necessarily know what cat is. It could potentially be any command from anywhere in your $PATH, or a function.
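
As a small illustration of that point, the sketch below defines a shell function named cat, which takes precedence over any cat found in $PATH. The file name and its contents are made up for the example; `command cat` is the standard way to bypass the function lookup.

```shell
#!/bin/sh
# Throwaway sample file (mktemp is assumed to be available).
sample=$(mktemp)
printf 'real contents\n' > "$sample"

# A function named cat shadows the utility of the same name.
cat() { printf 'meow\n'; }

from_function=$(cat "$sample")          # runs the function, ignores the file
from_utility=$(command cat "$sample")   # "command" skips function lookup

printf 'function: %s\nutility:  %s\n' "$from_function" "$from_utility"

# Clean up: drop the function again and remove the sample file.
unset -f cat
rm -f "$sample"
```

A shell that wanted to rewrite the pipeline would have to rule out this kind of redefinition at the moment the command runs, not when the script is read.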



If cat were a built-in command (which it may be in some shells), the shell would be able to reorganise the pipeline, since it would know the semantics of its own built-in cat. Before doing that, it would additionally have to make assumptions about the next command in the pipeline, the one after the original cat.



Note that reading from standard input behaves slightly differently when it's connected to a pipe than when it's connected to a file. A pipe is not seekable, so the next command in the pipeline may behave differently if the pipeline is rearranged: it may detect whether its input is seekable and choose a different strategy depending on the answer.
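
A program can tell the two cases apart quite easily. The function below is a hypothetical helper, not part of any standard tool, that reports what its standard input is connected to. It assumes the system exposes /dev/stdin, as Linux, the BSDs and macOS do.

```shell
#!/bin/sh
# Report the kind of object standard input is connected to.
# Assumption: /dev/stdin exists and refers to file descriptor 0.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo pipe
    elif [ -f /dev/stdin ]; then
        echo file
    else
        echo other
    fi
}

# Throwaway sample file (mktemp is assumed to be available).
sample=$(mktemp)
printf 'hello\n' > "$sample"

cat "$sample" | stdin_kind   # stdin is the pipe written to by cat
stdin_kind < "$sample"       # stdin is the regular file itself

rm -f "$sample"
```

Any command that branches on a check like this could change its behaviour if the shell silently swapped one form for the other.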



This question is similar (in a very general sense) to "Are there any compilers that attempt to fix syntax errors on their own?" (at the Software Engineering StackExchange site), although that question is obviously about syntax errors, not useless design patterns. The idea about automatically changing the code based on intent is largely the same though.































  • It's perfectly conformant for a shell to know what cat is, and the other commands in the pipeline, (the as-if rule) and behave accordingly, they just don't here because it's pointless and too hard.

    – Michael Homer
    yesterday






  • 2





    @MichaelHomer Yes. But it's also allowed to overload a standard command with a function of the same name.

    – Kusalananda
    yesterday






  • 1





    @PhilipCouling It’s absolutely conformant as long as it’s known that none of the pipeline commands care. The shell is specifically allowed to replace utilities with builtins or shell functions and those have no execution environment restrictions, so as long as the external result is indistinguishable it’s permitted. For your case, cat /dev/tty is the interesting one that would be different with <.

    – Michael Homer
    yesterday












  • @MichaelHomer "so as long as the external result is indistinguishable it’s permitted" That means the behavior of the entire set of utilities optimized in such a manner can never change. That has to be the ultimate dependency hell.

    – Andrew Henle
    yesterday






  • 2





    @MichaelHomer As the other comments said, of course it's perfectly conformant for the shell to know that, given the OP's input, it is impossible to tell what the cat command actually does without executing it. For all you (and the shell) know, the OP has a command cat in her path which is an interactive cat simulation, "myfile" is just the stored game state, and command1 and command2 are postprocessing some statistics about the current playing session...

    – alephzero
    yesterday


















30














Because it's not useless.



In the case of cat file | cmd, the fd 0 (stdin) of cmd will be a pipe, and in the case of cmd <file it may be a regular file, device, etc.



A pipe has different semantics from a regular file, and its semantics are not a subset of those of a regular file:



  • a regular file cannot be select(2)ed or poll(2)ed on in a meaningful way; a select(2) on it will always return "ready". Advanced interfaces like epoll(2) on Linux will simply not work with regular files.


  • on Linux there are system calls (splice(2), vmsplice(2), tee(2)) which only work on pipes [1]


Since cat is used so much, it could be implemented as a shell built-in, which would avoid an extra process. But once you start down that path, the same thing could be done with most commands, transforming the shell into a slower and clunkier perl or python. It's probably better to write another scripting language with an easy-to-use pipe-like syntax for continuations instead ;-)



[1] If you want a simple example not made up for the occasion, you can look at my "exec binary from stdin" git gist with some explanations in the comment here. Implementing cat inside it in order to make it work without UUoC would have made it 2 or 3 times bigger.
























  • 1





    In fact, ksh93 does implement some external commands like cat internally.

    – jrw32982
    yesterday






  • 2





    cat /dev/urandom | cpu_bound_program runs the read() system calls in a separate process. On Linux for example, the actual CPU work of generating more random numbers (when the pool is empty) is done in that system call, so using a separate process lets you take advantage of a separate CPU core to generate random data as input. e.g. in What's the fastest way to generate a 1 GB text file containing random digits?

    – Peter Cordes
    21 hours ago






  • 3





    More importantly for most cases, it means lseek won't work. cat foo.mp4 | mpv - will work, but you can't seek backward further than mpv's or mplayer's cache buffer. But with input redirected from a file, you can. cat | mpv - is one way to check if an MP4 has its moov atom at the start of the file, so it can be played without seeking to the end and back (i.e. if it's suitable for streaming). It's easy to imagine other cases where you want to test a program for non-seekable files by running it on /dev/stdin with cat vs. a redirect.

    – Peter Cordes
    21 hours ago


















16














The two commands are not equivalent. Consider error handling:



cat <file that doesn't exist> | less will produce an empty stream that is passed to the piped program, so you end up with a display showing nothing.



< <file that doesn't exist> less will fail to open the file, and then not start less at all.



Attempting to change the former to the latter could break any number of scripts that expect to run the program with a potentially blank input.
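
The difference can be observed without a pager. In the sketch below, wc -l is a hypothetical stand-in for the downstream program and the missing path is made up. Error messages are silenced so that only the output and exit status remain.

```shell
#!/bin/sh
# cat form: cat fails, but wc still starts and sees an empty stream.
count_via_cat() {
    cat "$1" 2>/dev/null | wc -l
}

# Redirect form: the shell cannot open the file, so wc is never started.
# The braces let 2>/dev/null also catch the shell's own error message.
count_via_redirect() {
    { wc -l < "$1"; } 2>/dev/null
}

missing=/nonexistent/input.$$   # assumed not to exist

status_a=0
a=$(count_via_cat "$missing") || status_a=$?
status_b=0
b=$(count_via_redirect "$missing") || status_b=$?

printf 'cat form:      output="%s" status=%s\n' "$a" "$status_a"
printf 'redirect form: output="%s" status=%s\n' "$b" "$status_b"
```

The cat form reports a count of zero lines with a successful exit status (without an option such as pipefail, a pipeline's status is that of its last command). The redirect form prints nothing and returns a non-zero status. A script that relies on either behaviour would break if the shell converted one form into the other.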

































    14














    Because detecting useless cat is really really hard.



    I had a shell script where I wrote



    cat | (somecommand <<!
    ...
    /proc/self/fd/3
    ...
    !) 0<&3


    The shell script failed in production if the cat was removed because it was invoked via su -c 'script.sh' someuser. The apparently superfluous cat caused the owner of standard input to change to the user the script was running as so that reopening it via /proc worked.






































      8














      tl;dr: Shells don't do it automatically because the costs exceed the likely benefits.



      Other answers have pointed out the technical difference between stdin being a pipe and it being a file. Keeping that in mind, the shell could do one of:



      1. Implement cat as a builtin, still preserving the file v. pipe distinction. This would save the cost of an exec and maybe, possibly, a fork.

      2. Perform a full analysis of the pipeline with knowledge of the various commands used to see if file/pipe matters, then act based on that.

      Next you have to consider the costs and benefits of each approach. The benefits are simple enough:



      1. In either case, avoid an exec (of cat)

      2. In the second case, when redirect substitution is possible, avoidance of a fork.

      3. In cases where you have to use a pipe, it might be possible sometimes to avoid a fork/vfork, but often not. That's because the cat-equivalent needs to run at the same time as the rest of the pipeline.

      So you save a little CPU time & memory, especially if you can avoid the fork. Of course, you only save this time & memory when the feature is actually used. And you're only really saving the fork/exec time; with larger files, the time is mostly the I/O time (i.e., cat reading a file from disk). So you have to ask: how often is cat used (uselessly) in shell scripts where the performance actually matters? Compare it to other common shell builtins like test: it's hard to imagine cat is used (uselessly) even a tenth as often as test is used in places that matter. That's a guess; I haven't measured, which is something you'd want to do before any attempt at implementation (or, similarly, before asking someone else to implement it, e.g. in a feature request).
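
One rough way to take such a measurement is to run the same trivial pipeline many times in both forms. The loop functions below are a hypothetical sketch; wc -c stands in for a real command, and the sample file is a throwaway.

```shell
#!/bin/sh
# Run "cat FILE | wc -c" N times (one extra fork and exec per iteration).
loop_with_cat() {
    n=$1 file=$2 i=0
    while [ "$i" -lt "$n" ]; do
        cat "$file" | wc -c > /dev/null
        i=$((i + 1))
    done
}

# Run "wc -c < FILE" N times (the shell opens the file itself).
loop_with_redirect() {
    n=$1 file=$2 i=0
    while [ "$i" -lt "$n" ]; do
        wc -c < "$file" > /dev/null
        i=$((i + 1))
    done
}

# Throwaway sample file (mktemp is assumed to be available).
sample=$(mktemp)
printf 'some sample text\n' > "$sample"

loop_with_cat 50 "$sample"
loop_with_redirect 50 "$sample"

rm -f "$sample"
```

In a shell with a time keyword, such as bash, prefixing each call with time and raising the count to a few thousand gives a rough comparison. The numbers depend heavily on the machine and say nothing about scripts dominated by I/O.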



      Next you ask: what are the costs? The two costs that come to mind are (a) additional code in the shell, which increases its size (and thus possibly memory use), requires more maintenance work, is another spot for bugs, etc.; and (b) backwards-compatibility surprises: POSIX cat omits a lot of features of, e.g., GNU coreutils cat, so you'd have to be careful about exactly what the cat builtin would implement.



      1. The additional builtin option probably isn't that bad — adding one more builtin where a bunch already exist. If you had profiling data showing it'd help, you could probably convince your favorite shell's authors to add it.


      2. As for analyzing the pipeline, I don't think shells do anything like this currently (a few recognize the end of a pipeline and can avoid a fork). Essentially you'd be adding a (primitive) optimizer to the shell; optimizers often turn out to be complicated code and the source of a lot of bugs. And those bugs can be surprising — slight changes in the shell script could wind up avoiding or triggering the bug.


      Postscript: You can apply a similar analysis to your useless uses of cat. Benefits: easier to read (though if command1 will take a file as an argument, probably not). Costs: extra fork and exec (and if command1 can take a file as an argument, probably more confusing error messages). If your analysis tells you to uselessly use cat, then go ahead.






































        7














        The cat command can accept - as a marker for stdin. (POSIX, "If a file is '-', the cat utility shall read from the standard input at that point in the sequence.") This allows simple handling of a file or stdin where otherwise this would be disallowed.



        Consider these two trivial alternatives, where the shell argument $1 is -:



        cat "$1" | nl # Works completely transparently
        nl < "$1" # Fails with 'bash: -: No such file or directory'
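
The same idea can be wrapped in a small function that reads a named file, or standard input when no name (or -) is given. This is only a sketch: shout is a made-up name and tr is a hypothetical stand-in for the real filter.

```shell
#!/bin/sh
# Upper-case a file, or standard input when the argument is missing or "-".
# ${1:--} expands to "-" if $1 is unset or empty, and cat treats "-" as stdin.
shout() {
    cat "${1:--}" | tr '[:lower:]' '[:upper:]'
}

# Throwaway sample file (mktemp is assumed to be available).
sample=$(mktemp)
printf 'from a file\n' > "$sample"

shout "$sample"                  # reads the named file
printf 'from stdin\n' | shout    # no argument: reads standard input
printf 'dash\n' | shout -        # explicit "-": also standard input

rm -f "$sample"
```

Writing the same function with a plain input redirection would need a separate branch for the stdin case, which is exactly the special handling that cat already provides.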


        Another time cat is useful is where it's intentionally used as a no-op simply to maintain shell syntax:



        file="$1"
        reader=cat
        [[ $file =~ \.gz$ ]] && reader=zcat
        [[ $file =~ \.bz2$ ]] && reader=bzcat
        "$reader" "$file"


        Finally, I believe the only time that UUOC can really be correctly called out is when cat is used with a filename that is known to be a regular file (i.e. not a device or named pipe), and no flags are given to the command:



        cat file.txt


        In any other situation the properties of cat itself may be required.






































          2














          The cat command can do things that the shell can't necessarily do (or at least, can't do easily). For example, suppose you want to print characters that might otherwise be invisible, such as tabs, carriage returns, or newlines. There *might* be a way to do so with only shell builtin commands, but I can't think of any off the top of my head. The GNU version of cat can do so with the -A argument or the -v -E -T arguments (IDK about other versions of cat, though). You could also prefix each line with a line number using -n (again, IDK if non-GNU versions can do this).
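
As a quick illustration, the sketch below uses the short options -v, -e and -t rather than -A. These go beyond POSIX, which only specifies -u. GNU cat accepts them, and BSD cat is assumed to as well.

```shell
#!/bin/sh
# Make a tab, a carriage return and the line end visible.
# -v shows control characters in ^X notation, -e marks line ends with $,
# -t shows tabs as ^I. None of these options is required by POSIX.
visible=$(printf 'a\tb\r\n' | cat -vet)
printf '%s\n' "$visible"   # with GNU cat this prints: a^Ib^M$
```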



          Another advantage of cat is that it can easily read multiple files. To do so, one can simply type cat file1 file2 file3. To do the same with a shell, things would get tricky, although a carefully-crafted loop could most likely achieve the same result. That said, do you really want to take the time to write such a loop, when such a simple alternative exists? I don't!
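
For comparison, here is roughly what such a loop could look like using only shell builtins. It is a sketch with known limitations, not a byte-exact replacement: NUL bytes are lost and a missing final newline is added. shell_cat is a made-up name.

```shell
#!/bin/sh
# Concatenate text files using only builtins (read and printf).
# "|| [ -n "$line" ]" keeps a last line that lacks a trailing newline.
shell_cat() {
    for f in "$@"; do
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s\n' "$line"
        done < "$f"
    done
}

# Throwaway sample files (mktemp is assumed to be available).
file1=$(mktemp)
file2=$(mktemp)
printf 'one\ntwo\n' > "$file1"
printf 'three\n' > "$file2"

shell_cat "$file1" "$file2"

rm -f "$file1" "$file2"
```

Even this short version needs care with IFS, read -r and the final line, which supports the point that cat file1 file2 file3 is the simpler choice.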



          Reading files with cat would probably use less CPU than the shell would, since cat is a pre-compiled program (the obvious exception is any shell that has a builtin cat). When reading a large group of files, this might become apparent, but I have never done so on my machines, so I can't be sure.



          The cat command can also be useful for forcing a command to accept standard input in instances it might not. Consider the following:



          echo 8 | sleep



          The number "8" will not be accepted by the "sleep" command, since it was never really meant to accept standard input. Thus, sleep will disregard that input, complain about a lack of arguments, and exit. However, if one types:



          echo 8 | sleep $(cat)



          Many shells will expand this to sleep 8, and sleep will wait for 8 seconds before exiting. You can also do something similar with ssh:



          command | ssh 1.2.3.4 'cat >> example-file'



          This command will append whatever "command" outputs to example-file on the machine with the address 1.2.3.4.
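The stdin-to-argument trick can be tried without waiting for sleep. In this sketch, needs_arg is a hypothetical stand-in for any command that only looks at its arguments; the last variant shows that the shell's own read builtin can do the same job in this simple case.

```sh
#!/bin/sh
# needs_arg is a hypothetical stand-in for a command such as sleep:
# it only looks at its arguments and never reads standard input.
needs_arg() {
    if [ "$#" -eq 0 ]; then
        echo "missing operand" >&2
        return 1
    fi
    echo "got: $1"
}

# Piping alone does not help: the value never becomes an argument.
echo 8 | needs_arg 2>/dev/null || echo "pipe alone: failed"

# $(cat) reads the pipe, and the shell puts the result on the command line.
via_cat=$(echo 8 | needs_arg $(cat))

# A builtin-only alternative: read the value into a variable first.
via_read=$(echo 8 | { read -r n; needs_arg "$n"; })

printf '%s\n%s\n' "$via_cat" "$via_read"
```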



          And that's (probably) just scratching the surface. I'm sure I could find more examples of cat being useful if I wanted to, but this post is long enough as it is. So, I'll conclude by saying this: asking the shell to anticipate all of these scenarios (and several others) is not really feasible.




































            1














            Adding to @Kusalananda's answer (and @alephzero's comment), cat could be anything:



            alias cat='gcc -c'
            cat "$MYFILE" | command1 | command2 > "$OUTPUT"


            or



            echo 'echo 1' > /usr/bin/cat
            cat "$MYFILE" | command1 | command2 > "$OUTPUT"


            There is no guarantee that cat (on its own) or /usr/bin/cat on the system is actually cat, the concatenation tool.
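A harmless way to see this for yourself, without overwriting anything under /usr/bin, is to shadow cat with a shell function. This is only a sketch: the function body is made up, and a function is used rather than an alias because non-interactive shells often don't expand aliases.

```sh
#!/bin/sh
# Define a shell function named cat (a deliberately silly, hypothetical override).
cat() {
    echo "this is not the real cat"
}

shadowed=$(cat /dev/null)          # runs the function, not the utility
real=$(echo hello | command cat)   # "command" skips shell functions

printf '%s\n%s\n' "$shadowed" "$real"
unset -f cat
```

The same word, cat, gives two different results depending on how it is looked up, which is exactly what a shell would have to reason about before rewriting a pipeline.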














            New contributor




            Rob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.















            • 2





              Other than that the behaviour of cat is defined by POSIX, and so shouldn't be wildly different.

              – roaima
              yesterday






            • 2





              @roaima: PATH=/home/Joshua/bin:$PATH cat ... Are you sure you know what cat does now?

              – Joshua
              yesterday











            • @Joshua it doesn't really matter. We both know cat can be overridden, but we also both know it shouldn't be wantonly replaced with something else. My comment points out that POSIX mandates a particular (subset of) behaviour that can reasonably be expected to exist. I have, at times, written a shell script that extends behaviour of a standard utility. In this case the shell script acted and behaved just like the tool it replaced, except that it had additional capabilities.

              – roaima
              yesterday












            • @Joshua: On most platforms, shells know (or could know) which directories hold executables that implement POSIX commands. So you could just defer the substitution until after alias expansion and path resolution, and only do it for /bin/cat. (And you'd make it an option you could turn off.) Or you'd make cat a shell built-in (which maybe falls back to /bin/cat for multiple args?) so users could control whether or not they wanted the external version the normal way, with enable cat. Like for kill. (I was thinking that bash command cat would work, but that doesn't skip builtins)

              – Peter Cordes
              21 hours ago



















            1














            Remember that a user could have a cat in his $PATH which is not exactly the POSIX cat (but perhaps some variant which could log something somewhere). In that case, you don't want the shell to remove it.



            The PATH could change dynamically, and then cat is not what you believe it is. It would be quite difficult to write a shell doing the optimization you dream of.
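To make the PATH point concrete, here is a sketch in which the same command text runs two different programs. The impostor script and the temporary directory are hypothetical, and mktemp -d is assumed to be available.

```sh
#!/bin/sh
# Put a hypothetical impostor named "cat" in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'echo "impostor cat"' > "$dir/cat"
chmod +x "$dir/cat"
printf 'data\n' > "$dir/input"

# The system cat, found through the unmodified PATH.
before=$(cat "$dir/input")

# Same command text, but PATH changed inside the subshell first.
after=$(PATH="$dir:$PATH"; cat "$dir/input")

printf '%s\n%s\n' "$before" "$after"
rm -r "$dir"
```

A shell that rewrote cat file | cmd into cmd < file when it parsed the script would get the second case wrong.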



            Also, in practice, cat is quite a quick program. There are few practical reasons (except aesthetics) to avoid it.



            See also the excellent Parsing POSIX [s]hell talk by Yann Regis-Gianas at FOSDEM2018. It gives other good reasons to avoid attempting what you dream of in a shell.





























            • "Quite few practical reasons to avoid it" -- anyone who's waited for cat some-huge-log | tail -n 5 to run (where tail -n 5 some-huge-log could jump straight to the end, whereas cat reads only front-to-back) would disagree.

              – Charles Duffy
              35 mins ago










            protected by Michael Homer 14 hours ago

















            9 Answers









            43














            "Useless use of cat" is more about how you write your code than about what actually runs when you execute the script. It's a sort of design anti-pattern, a way of going about something that could probably be done in a more efficient manner. It's a failure in understanding of how to best combine the given tools to create a new tool. I'd argue that stringing several sed and/or awk commands together in a pipeline also could be said to be a symptom of this same anti-pattern.



            Fixing instances of "useless use of cat" in a script is primarily a matter of fixing the source code of the script manually. A tool such as ShellCheck can help with this by pointing them out:



            $ cat script.sh
            #!/bin/sh
            cat file | cat




            $ shellcheck script.sh

            In script.sh line 2:
            cat file | cat
            ^-- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.
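The two rewrites that the message suggests could look like this for a simple line count. This is a sketch with a made-up sample file; wc stands in for whatever command follows the pipe, and mktemp is assumed to be available.

```sh
#!/bin/sh
# Hypothetical sample file with three lines.
file=$(mktemp)
printf 'a\nb\nc\n' > "$file"

piped=$(cat "$file" | wc -l)      # the pattern SC2002 flags
redirected=$(wc -l < "$file")     # the 'cmd < file' form
as_arg=$(wc -l "$file")           # the 'cmd file' form; wc now also prints the file name

printf '%s\n%s\n%s\n' "$piped" "$redirected" "$as_arg"
rm "$file"
```

The third form shows why the rewrite has to be done by someone who knows the command: passing the file as an argument changes what wc prints.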



            Getting the shell to do this automatically would be difficult due to the nature of shell scripts. The way a script executes depends on the environment inherited from its parent process, and on the specific implementation of the available external commands.



            The shell does not necessarily know what cat is. It could potentially be any command from anywhere in your $PATH, or a function.



            If cat were a built-in command (which it may be in some shells), the shell would have the ability to reorganise the pipeline, as it would know the semantics of its built-in cat command. Before doing that, it would additionally have to make assumptions about the next command in the pipeline, after the original cat.



            Note that reading from standard input behaves slightly differently when it's connected to a pipe and when it's connected to a file. A pipe is not seekable, so depending on what the next command in the pipeline does, it may behave differently if the pipeline were rearranged: a command can detect whether its input is seekable and choose a different strategy accordingly.
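A command can tell the two cases apart quite easily, which is why the rearrangement is observable. The following is a sketch: stdin_kind is a hypothetical helper, and it assumes that /dev/stdin exists, as it does on Linux and most modern Unix systems.

```sh
#!/bin/sh
# Report what kind of object standard input is connected to.
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo "pipe"
    elif [ -f /dev/stdin ]; then
        echo "regular file"
    else
        echo "something else"
    fi
}

# Hypothetical sample file.
file=$(mktemp)
printf 'hello\n' > "$file"

from_pipe=$(cat "$file" | stdin_kind)
from_file=$(stdin_kind < "$file")

printf '%s\n%s\n' "$from_pipe" "$from_file"
rm "$file"
```

Any program that branches on this distinction, for example by seeking only when its input is a regular file, could behave differently after the shell removed the cat.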



            This question is similar (in a very general sense) to "Are there any compilers that attempt to fix syntax errors on their own?" (at the Software Engineering StackExchange site), although that question is obviously about syntax errors, not useless design patterns. The idea about automatically changing the code based on intent is largely the same though.































            • It's perfectly conformant for a shell to know what cat is, and what the other commands in the pipeline are, and to behave accordingly (the as-if rule); they just don't here because it's pointless and too hard.

              – Michael Homer
              yesterday






            • 2





              @MichaelHomer Yes. But it's also allowed to overload a standard command with a function of the same name.

              – Kusalananda
              yesterday






            • 1





              @PhilipCouling It’s absolutely conformant as long as it’s known that none of the pipeline commands care. The shell is specifically allowed to replace utilities with builtins or shell functions and those have no execution environment restrictions, so as long as the external result is indistinguishable it’s permitted. For your case, cat /dev/tty is the interesting one that would be different with <.

              – Michael Homer
              yesterday












            • @MichaelHomer "so as long as the external result is indistinguishable it’s permitted". That means the behavior of the entire set of utilities optimized in such a manner can never change. That has to be the ultimate dependency hell.

              – Andrew Henle
              yesterday






            • 2





              @MichaelHomer As the other comments said, of course it's perfectly conformant for the shell to know that given the OP's input it is impossible to tell what the cat command actually does without executing it. For all you (and the shell) know, the OP has a command cat in her path which is an interactive cat simulation, "myfile" is just the stored game state, and command1 and command2 are postprocessing some statistics about the current playing session...

              – alephzero
              yesterday























            edited yesterday

            answered yesterday

            Kusalananda

























            30














            Because it's not useless.



            In the case of cat file | cmd, the fd 0 (stdin) of cmd will be a pipe, and in the case of cmd <file it may be a regular file, device, etc.



            A pipe has different semantics from a regular file, and its semantics are not a subset of those of a regular file:



            • a regular file cannot be select(2)ed or poll(2)ed on in a meaningful way; a select(2) on it will always return "ready". Advanced interfaces like epoll(2) on Linux will simply not work with regular files.


            • on Linux there are system calls (splice(2), vmsplice(2), tee(2)) which only work on pipes [1]


            Since cat is used so much, it could be implemented as a shell built-in, which would avoid an extra process, but once you start down that path, the same thing could be done with most commands, transforming the shell into a slower and clunkier perl or python. It's probably better to write another scripting language with an easy-to-use pipe-like syntax for continuations instead ;-)



            [1] If you want a simple example not made up for the occasion, you can look at my "exec binary from stdin" git gist with some explanations in the comment here. Implementing cat inside it in order to make it work without UUoC would have made it 2 or 3 times bigger.
























            • 1





              In fact, ksh93 does implement some external commands like cat internally.

              – jrw32982
              yesterday






            • 2





              cat /dev/urandom | cpu_bound_program runs the read() system calls in a separate process. On Linux for example, the actual CPU work of generating more random numbers (when the pool is empty) is done in that system call, so using a separate process lets you take advantage of a separate CPU core to generate random data as input. e.g. in What's the fastest way to generate a 1 GB text file containing random digits?

              – Peter Cordes
              21 hours ago






            • 3





              More importantly for most cases, it means lseek won't work. cat foo.mp4 | mpv - will work, but you can't seek backward further than mpv's or mplayer's cache buffer. But with input redirected from a file, you can. cat | mpv - is one way to check if an MP4 has its moov atom at the start of the file, so it can be played without seeking to the end and back (i.e. if it's suitable for streaming). It's easy to imagine other cases where you want to test a program for non-seekable files by running it on /dev/stdin with cat vs. a redirect.

              – Peter Cordes
              21 hours ago















            Because it's not useless.



            In the case of cat file | cmd, the fd 0 (stdin) of cmd will be a pipe, and in the case of cmd <file it may be a regular file, device, etc.



            A pipe has different semantics from a regular file, and its semantics are not a subset of those of a regular file:



            • a regular file cannot be select(2)ed or poll(2)ed on in a meaningful way; a select(2) on it will always return "ready". Advanced interfaces like epoll(2) on Linux will simply not work with regular files.


            • on Linux there are system calls (splice(2), vmsplice(2), tee(2)) which only work on pipes [1]


            Since cat is so much used, it could be implemented as a shell built-in which will avoid an extra process, but once you started on that path, the same thing could be done with most commands -- transforming the shell into a slower & clunkier perl or python. it's probably better to write another scripting language with an easy to use pipe-like syntax for continuations instead ;-)



            [1] If you want a simple example not made up for the occasion, you can look at my "exec binary from stdin" git gist with some explanations in the comment here. Implementing cat inside it in order to make it work without UUoC would have made it 2 or 3 times bigger.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered yesterday by mosvy


















            16














            The two commands are not equivalent. Consider error handling:

            cat <file that doesn't exist> | less will produce an empty stream that is passed to the piped program, so you end up with a display showing nothing.

            < <file that doesn't exist> less will fail to open the file, and less will not be started at all.



            Attempting to change the former to the latter could break any number of scripts that expect to run the program with a potentially blank input.
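The difference in error handling can be reproduced with any command in place of less. This is a rough sketch only: the path is a hypothetical one assumed not to exist, wc -c stands in for the downstream program, and run_piped and run_redirected are names made up for the illustration.

```shell
#!/bin/sh
# Hypothetical path, assumed not to exist.
missing=/nonexistent/input.txt

# Pipeline: cat fails, but wc is still started and reads an empty stream.
run_piped() { cat "$1" 2>/dev/null | wc -c; }

# Redirect: the shell cannot open the file, so wc is never started.
run_redirected() { wc -c 2>/dev/null < "$1"; }

run_piped "$missing"
echo "pipeline exit status: $?"

run_redirected "$missing" || echo "redirect failed with status $?"
```

In the pipeline case the downstream program runs and reports a byte count of zero; in the redirect case it produces no output at all because it never runs.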















            answered yesterday by UKMonkey





















                    14














                    Because detecting a useless cat is really, really hard.



                    I had a shell script where I wrote



                    cat | (somecommand <<!
                    ...
                    /proc/self/fd/3
                    ...
                    !) 3<&0


                    The shell script failed in production when the cat was removed, because it was invoked via su -c 'script.sh' someuser. The apparently superfluous cat changed the owner of standard input to the user the script was running as, so that reopening it via /proc worked.














                    edited yesterday by jlliagre

                    answered yesterday by Joshua





















                            8














                            tl;dr: Shells don't do it automatically because the costs exceed the likely benefits.



                            Other answers have pointed out the technical difference between stdin being a pipe and it being a file. Keeping that in mind, the shell could do one of:



                            1. Implement cat as a builtin, still preserving the file vs. pipe distinction. This would save the cost of an exec and maybe, possibly, a fork.

                            2. Perform a full analysis of the pipeline with knowledge of the various commands used to see if file/pipe matters, then act based on that.

                            Next you have to consider the costs and benefits of each approach. The benefits are simple enough:



                            1. In either case, you avoid an exec (of cat).

                            2. In the second case, when substituting a redirect is possible, you avoid a fork.

                            3. In cases where you have to use a pipe, it might sometimes be possible to avoid a fork/vfork, but often not. That's because the cat equivalent needs to run at the same time as the rest of the pipeline.

                            So you save a little CPU time and memory, especially if you can avoid the fork. Of course, you only save this time and memory when the feature is actually used. And you're only really saving the fork/exec time; with larger files, the time is mostly I/O time (i.e., cat reading a file from disk). So you have to ask: how often is cat used (uselessly) in shell scripts where performance actually matters? Compare it to other common shell builtins like test: it's hard to imagine cat is used (uselessly) even a tenth as often as test is used in places that matter. That's a guess; I haven't measured, which is something you'd want to do before any attempt at implementation (or before asking someone else to implement it, e.g., in a feature request).



                            Next you ask: what are the costs? The two costs that come to mind are (a) additional code in the shell, which increases its size (and thus possibly memory use), requires more maintenance work, is another spot for bugs, etc.; and (b) backwards-compatibility surprises: POSIX cat omits a lot of features of, e.g., GNU coreutils cat, so you'd have to be careful about exactly what the cat builtin would implement.



                            1. The additional builtin option probably isn't that bad — adding one more builtin where a bunch already exist. If you had profiling data showing it'd help, you could probably convince your favorite shell's authors to add it.


                            2. As for analyzing the pipeline, I don't think shells do anything like this currently (a few recognize the end of a pipeline and can avoid a fork). Essentially you'd be adding a (primitive) optimizer to the shell; optimizers often turn out to be complicated code and the source of a lot of bugs. And those bugs can be surprising — slight changes in the shell script could wind up avoiding or triggering the bug.


                            Postscript: You can apply a similar analysis to your useless uses of cat. Benefits: easier to read (though if command1 will take a file as an argument, probably not). Costs: extra fork and exec (and if command1 can take a file as an argument, probably more confusing error messages). If your analysis tells you to uselessly use cat, then go ahead.
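For anyone who wants to measure rather than guess, a rough timing harness might look like the sketch below. It assumes bash (for the time keyword) and a seq command; the file size, the iteration count n, and the function names with_cat and with_redirect are arbitrary choices made for this illustration, not a rigorous benchmark.

```shell
#!/usr/bin/env bash
# Rough measurement sketch: compare many short runs with and without cat.
f=$(mktemp)
seq 1 1000 > "$f"
n=200   # arbitrary iteration count

# Each iteration pays for an extra fork and exec of cat.
with_cat() {
    i=0
    while [ "$i" -lt "$n" ]; do
        cat "$f" | wc -l > /dev/null
        i=$((i + 1))
    done
}

# Each iteration only starts wc, reading the file directly.
with_redirect() {
    i=0
    while [ "$i" -lt "$n" ]; do
        wc -l < "$f" > /dev/null
        i=$((i + 1))
    done
}

time with_cat
time with_redirect

rm -f "$f"
```

With a small file the difference is dominated by process creation; with a large file, I/O time is expected to dominate, as the answer argues.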














                            edited yesterday

                            answered yesterday by derobert





















                                    7














                                    The cat command can accept - as a marker for stdin. (POSIX, "If a file is '-', the cat utility shall read from the standard input at that point in the sequence.") This allows simple handling of a file or stdin where otherwise this would be disallowed.



                                    Consider these two trivial alternatives, where the shell argument $1 is -:



                                    cat "$1" | nl # Works completely transparently
                                    nl < "$1" # Fails with 'bash: -: No such file or directory'


                                    Another time cat is useful is where it's intentionally used as a no-op simply to maintain shell syntax:



                                    file="$1"
                                    reader=cat
                                    [[ $file =~ \.gz$ ]] && reader=zcat
                                    [[ $file =~ \.bz2$ ]] && reader=bzcat
                                    "$reader" "$file"


                                    Finally, I believe the only time that UUOC can really be correctly called out is when cat is used with a filename that is known to be a regular file (i.e. not a device or named pipe), and no flags are given to the command:



                                    cat file.txt


                                    In any other situation the properties of cat itself may be required.














                                        edited 15 hours ago

























                                        answered yesterday









                                        roaima





















                                            2














                                            The cat command can do things that the shell can't necessarily do (or at least, can't do easily). For example, suppose you want to print characters that might otherwise be invisible, such as tabs, carriage returns, or newlines. There *might* be a way to do so with only shell builtin commands, but I can't think of any off the top of my head. The GNU version of cat can do so with the -A argument or the -v -E -T arguments (I don't know about other versions of cat, though; POSIX itself only specifies -u). You could also prefix each line with a line number using -n (again, I don't know whether non-GNU versions can do this).
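A small sketch of both routes, with made-up function names: the first wraps the GNU-specific cat -A mentioned above, and the second uses the l command of sed, which POSIX does specify, as a more portable way to get a similar view.

```shell
# show_invisible_gnu: GNU cat only. Tabs appear as ^I, carriage returns
# as ^M, and each line end is marked with $.
show_invisible_gnu() {
    cat -A
}

# show_invisible: POSIX sed. The l command writes each line in an
# unambiguous form: tabs as \t, carriage returns as \r, line ends as $.
show_invisible() {
    sed -n l
}
```

For example, printf 'a\tb\r\n' | show_invisible writes a\tb\r$. Note that sed may fold long lines in this mode, so the output is meant for inspection rather than further processing.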



                                            Another advantage of cat is that it can easily read multiple files. To do so, one can simply type cat file1 file2 file3. To do the same with a shell, things would get tricky, although a carefully-crafted loop could most likely achieve the same result. That said, do you really want to take the time to write such a loop, when such a simple alternative exists? I don't!
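For comparison, here is roughly what such a loop could look like. This is only a sketch under a hypothetical name, concat_with_loop, and it is not a faithful replacement: it adds a newline to a final line that lacks one, and it cannot cope with NUL bytes.

```shell
# concat_with_loop FILE...: write each FILE to stdout in order, using only
# shell builtins (read and printf) instead of cat.
concat_with_loop() {
    for f in "$@"; do
        # IFS= and -r keep leading whitespace and backslashes intact;
        # the extra test catches a last line without a trailing newline.
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s\n' "$line"
        done < "$f"
    done
}
```

Running concat_with_loop file1 file2 file3 gives the same text as cat file1 file2 file3 for ordinary text files, at the cost of far more code and a line-by-line read in the shell.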



                                            Reading files with cat would probably use less CPU than the shell would, since cat is a pre-compiled program (the obvious exception is any shell that has a builtin cat). When reading a large group of files, this might become apparent, but I have never done so on my machines, so I can't be sure.



                                            The cat command can also be useful for forcing a command to accept standard input in instances it might not. Consider the following:



                                            echo 8 | sleep



                                            The number "8" will not be accepted by the "sleep" command, since it was never meant to read its arguments from standard input. Thus, sleep will disregard that input, complain about a lack of arguments, and exit. However, if one types:



                                            echo 8 | sleep $(cat)



                                            Many shells will expand this to sleep 8, and sleep will wait for 8 seconds before exiting. You can also do something similar with ssh:



                                            command | ssh 1.2.3.4 'cat >> example-file'



                                            This command will append whatever "command" outputs to example-file on the machine with the address 1.2.3.4.
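The sleep $(cat) trick above can be wrapped in a small helper. The name with_stdin_as_arg is invented for this sketch; unlike the unquoted $(cat) in the example, it quotes the substitution, so everything read from stdin is passed as a single argument.

```shell
# with_stdin_as_arg CMD [ARGS...]: run CMD with the whole of standard input
# appended as one final argument. Trailing newlines are removed by $(...).
with_stdin_as_arg() {
    "$@" "$(cat)"
}
```

With this, echo 8 | with_stdin_as_arg sleep behaves like sleep 8. Here cat is doing real work: it is the piece that turns a stream into text the shell can place on a command line.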



                                            And that's (probably) just scratching the surface. I'm sure I could find more examples of cat being useful if I wanted to, but this post is long enough as it is. So, I'll conclude by saying this: asking the shell to anticipate all of these scenarios (and several others) is not really feasible.
















                                                answered yesterday









                                                TSJNachos117





















                                                    1














                                                    Adding to @Kusalananda's answer (and @alephzero's comment), cat could be anything:



                                                    alias cat='gcc -c'
                                                    cat "$MYFILE" | command1 | command2 > "$OUTPUT"


                                                    or



                                                    echo 'echo 1' > /usr/bin/cat
                                                    cat "$MYFILE" | command1 | command2 > "$OUTPUT"


                                                    There is no guarantee that cat (on its own), or even /usr/bin/cat on the system, is actually cat, the concatenation tool.
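The same effect can be seen without touching /usr/bin/cat. The sketch below uses a shell function instead of an alias (aliases are normally not expanded in non-interactive bash), and confines it to a subshell so nothing leaks out; demo_shadowed_cat is a made-up name.

```shell
# demo_shadowed_cat: show that the word "cat" need not mean the cat utility.
# The function body is a subshell, so the override vanishes afterwards.
demo_shadowed_cat() (
    cat() { echo 'shadowed'; }     # a function now answers to the name cat
    echo 'real' | cat              # runs the function, ignoring its input
    echo 'real' | command cat      # command skips functions and runs the utility
)
```

The shell cannot tell from the text of a pipeline alone which of these it will end up running, which is one reason it cannot safely rewrite cat file | command1 on the user's behalf.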





















                                                    • 2





                                                      Other than the behaviour of cat is defined by POSIX and so shouldn't be wildly different.

                                                      – roaima
                                                      yesterday






                                                    • 2





                                                      @roaima: PATH=/home/Joshua/bin:$PATH cat ... Are you sure you know what cat does now?

                                                      – Joshua
                                                      yesterday











                                                    • @Joshua it doesn't really matter. We both know cat can be overridden, but we also both know it shouldn't be wantonly replaced with something else. My comment points out that POSIX mandates a particular (subset of) behaviour that can reasonably be expected to exist. I have, at times, written a shell script that extends behaviour of a standard utility. In this case the shell script acted and behaved just like the tool it replaced, except that it had additional capabilities.

                                                      – roaima
                                                      yesterday












                                                    • @Joshua: On most platforms, shells know (or could know) which directories hold executables that implement POSIX commands. So you could just defer the substitution until after alias expansion and path resolution, and only do it for /bin/cat. (And you'd make it an option you could turn off.) Or you'd make cat a shell built-in (which maybe falls back to /bin/cat for multiple args?) so users could control whether or not they wanted the external version the normal way, with enable cat. Like for kill. (I was thinking that bash command cat would work, but that doesn't skip builtins)

                                                      – Peter Cordes
                                                      21 hours ago

























                                                    answered yesterday









                                                    Rob




                                                    New contributor




                                                    Rob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                    Check out our Code of Conduct.





                                                    New contributor





                                                    Rob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                    Check out our Code of Conduct.






                                                    Rob is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                    Check out our Code of Conduct.







                                                    • 2





                                                      Other than the behaviour of cat is defined by POSIX and so shouldn't be wildly different.

                                                      – roaima
                                                      yesterday






                                                    • 2





                                                      @roaima: PATH=/home/Joshua/bin:$PATH cat ... Are you sure you know what cat does now?

                                                      – Joshua
                                                      yesterday











                                                    • @Joshua it doesn't really matter. We both know cat can be overridden, but we also both know it shouldn't be wantonly replaced with something else. My comment points out that POSIX mandates a particular (subset of) behaviour that can reasonably be expected to exist. I have, at times, written a shell script that extends behaviour of a standard utility. In this case the shell script acted and behaved just like the tool it replaced, except that it had additional capabilities.

                                                      – roaima
                                                      yesterday












                                                    • @Joshua: On most platforms, shells know (or could know) which directories hold executables that implement POSIX commands. So you could just defer the substitution until after alias expansion and path resolution, and only do it for /bin/cat. (And you'd make it an option you could turn off.) Or you'd make cat a shell built-in (which maybe falls back to /bin/cat for multiple args?) so users could control whether or not they wanted the external version the normal way, with enable cat. Like for kill. (I was thinking that bash command cat would work, but that doesn't skip builtins)

                                                      – Peter Cordes
                                                      21 hours ago













                                                    • 2





                                                      Other than the behaviour of cat is defined by POSIX and so shouldn't be wildly different.

                                                      – roaima
                                                      yesterday






                                                    • 2





                                                      @roaima: PATH=/home/Joshua/bin:$PATH cat ... Are you sure you know what cat does now?

                                                      – Joshua
                                                      yesterday











                                                    • @Joshua it doesn't really matter. We both know cat can be overridden, but we also both know it shouldn't be wantonly replaced with something else. My comment points out that POSIX mandates a particular (subset of) behaviour that can reasonably be expected to exist. I have, at times, written a shell script that extends behaviour of a standard utility. In this case the shell script acted and behaved just like the tool it replaced, except that it had additional capabilities.

                                                      – roaima
                                                      yesterday












                                                    • @Joshua: On most platforms, shells know (or could know) which directories hold executables that implement POSIX commands. So you could just defer the substitution until after alias expansion and path resolution, and only do it for /bin/cat. (And you'd make it an option you could turn off.) Or you'd make cat a shell built-in (which maybe falls back to /bin/cat for multiple args?) so users could control whether or not they wanted the external version the normal way, with enable cat. Like for kill. (I was thinking that bash command cat would work, but that doesn't skip builtins)

                                                      – Peter Cordes
                                                      21 hours ago
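A minimal sketch of the builtin behaviour this comment refers to, using kill (the example the comment cites). It assumes bash is installed and runs each check in a fresh bash process; the variable names are only illustrative. What `type -t` reports once the builtin is disabled depends on whether an external kill exists on the system, so that result is only printed, not assumed.

```shell
#!/bin/sh
# Assumption: bash is available on PATH. Each check uses its own bash process
# so that disabling the builtin does not affect anything else.

# By default, bash resolves kill to its builtin.
default_kind=$(bash -c 'type -t kill')

# "command" skips shell functions, but it still finds builtins.
via_command=$(bash -c 'command -V kill')

# "enable -n" disables the builtin, so lookup falls through to PATH (if any).
disabled_kind=$(bash -c 'enable -n kill; type -t kill || true')

echo "default:            $default_kind"
echo "through command:    $via_command"
echo "after enable -n:    ${disabled_kind:-not found}"
```

If cat were made a builtin along these lines, `enable -n cat` would be the usual way to get the external version back.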




















                                                    1














                                                    Remember that a user could have a cat in his $PATH which is not exactly the POSIX cat (but perhaps some variant which could log something somewhere). In that case, you don't want the shell to remove it.



                                                    PATH can also change at run time, so the shell cannot know in advance which cat will actually run. That makes the optimization you have in mind quite difficult to implement in a shell.
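As a minimal sketch of these two points: the temporary directory, the log file, and the wrapper script below are all hypothetical and created only for illustration. A logging cat placed ahead of the system one in PATH gives the "useless" cat a visible side effect, which a shell that rewrote the pipeline into a redirection would silently drop.

```shell
#!/bin/sh
# Hypothetical setup: a private bin directory holding a logging wrapper named "cat".
workdir=$(mktemp -d)
mkdir "$workdir/bin"
real_cat=$(command -v cat)   # location of the system cat, resolved before PATH changes

# Write the wrapper; \$* and \$@ are escaped so they expand when the wrapper runs.
cat > "$workdir/bin/cat" <<EOF
#!/bin/sh
# Record the invocation, then behave like the real cat.
echo "cat called with: \$*" >> "$workdir/cat.log"
exec "$real_cat" "\$@"
EOF
chmod +x "$workdir/bin/cat"

printf 'hello\n' > "$workdir/data.txt"

# With the wrapper first in PATH, the pipeline runs the wrapper and logs a line.
piped=$(PATH="$workdir/bin:$PATH" cat "$workdir/data.txt" | tr 'a-z' 'A-Z')

# The "optimized" form yields the same output but never runs the wrapper.
redirected=$(tr 'a-z' 'A-Z' < "$workdir/data.txt")

# Count the log entries: only the piped form contributed one.
log_lines=$(wc -l < "$workdir/cat.log")
echo "piped=$piped redirected=$redirected log entries=$log_lines"

rm -rf "$workdir"
```

The two forms agree on output and differ only in the side effect, which is exactly what a shell cannot see from the command line alone.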



                                                    Also, in practice, cat is quite a fast program. There are few practical reasons (except aesthetics) to avoid it.



                                                    See also the excellent Parsing POSIX [s]hell talk by Yann Regis-Gianas at FOSDEM 2018. It gives other good reasons not to attempt this kind of optimization in a shell.
















                                                    answered 11 hours ago









                                                    Basile Starynkevitch













                                                    • "Quite few practical reasons to avoid it" -- anyone who's waited for cat some-huge-log | tail -n 5 to run (where tail -n 5 some-huge-log could jump straight to the end, whereas cat reads only front-to-back) would disagree.

                                                      – Charles Duffy
                                                      35 mins ago
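A small sketch of the two forms compared in this comment, using a hypothetical temporary file generated on the spot in place of some-huge-log. All three commands print the same lines. The difference is that with a file argument or a redirection, tail receives a seekable file and may read only its end, whereas behind a pipe it has to consume everything cat writes. Whether a particular tail actually seeks is implementation-dependent.

```shell
#!/bin/sh
# Hypothetical stand-in for a large log: 100000 numbered lines in a temp file.
log=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "line " i }' > "$log"

# tail sees a pipe here, so the whole file flows through cat first.
via_pipe=$(cat "$log" | tail -n 5)

# tail opens the file itself and is free to seek towards the end.
direct=$(tail -n 5 "$log")

# A redirection also hands tail a seekable file descriptor, with no cat involved.
redirected=$(tail -n 5 < "$log")

printf '%s\n' "$direct"
rm -f "$log"
```

Timing the first two forms on a multi-gigabyte file is the easiest way to see the effect on a given system.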



























                                                    -performance, posix, shell-script
