sed anchor characters in [^]

Why does sed treat anchors like \b or \B as literal characters when they appear inside a negated bracket expression [^ ]? For example, one would expect the following two commands to yield the same result, but they don't:

    $ echo 'apple pear melon banana cherry papaya' | sed 's/[^\b]a[^\b]/u/g'
    apple pu melon baua cherry uaya
    $ echo 'apple pear melon banana cherry papaya' | sed 's/\Ba\B/u/g'
    apple peur melon bununa cherry pupuya

If there were no \B, how could we negate \b?










asked Mar 28 at 12:08 by Amaterasu




















          1 Answer
Neither \b nor \B is a character. Both are zero-width patterns that match between characters.

The \b pattern matches at a word boundary, i.e. between a character that is a "word character" and a character that is not a "word character".

The \B pattern matches at a non-word boundary, i.e. between two characters that are either both "word characters" or both not.
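
To make the zero-width behaviour concrete, here is a minimal sketch, assuming GNU sed (where \b and \B are available as extensions). It inserts a | at every position where each anchor matches. The sample text and the variable names are only illustrative.

```shell
# Assumes GNU sed; \b and \B are GNU extensions and may behave
# differently (or not exist) in other sed implementations.
sample='pear melon'

# Mark every word boundary: no characters are consumed, only positions match.
boundaries=$(printf '%s\n' "$sample" | sed 's/\b/|/g')

# Mark every non-boundary position (between two letters in this sample).
non_boundaries=$(printf '%s\n' "$sample" | sed 's/\B/|/g')

printf '%s\n' "$boundaries"      # |pear| |melon|
printf '%s\n' "$non_boundaries"  # p|e|a|r m|e|l|o|n
```

No character of the input is removed in either case, which is what distinguishes these anchors from a bracket expression.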



The pattern [^\b] matches one character. This is why pear is transformed into pu: you replace ear (the a and the characters on either side of it).

With GNU sed, [^\b] matches a character that is neither a \ nor a b.
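
A small sketch of that last point, assuming GNU sed in the C locale: inside the brackets the backslash and the b are two ordinary characters, so both survive while every other character is replaced. The sample string is made up for illustration.

```shell
# Assumes GNU sed; LC_ALL=C avoids locale-specific collating elements.
# printf '%s\n' prints the backslash literally (echo might interpret it).
result=$(printf '%s\n' 'a\bc' | LC_ALL=C sed 's/[^\b]/X/g')

printf '%s\n' "$result"   # X\bX
```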



There is no way to use a bracket expression to replace the use of \B that I'm aware of.

The \b and \B patterns are supported by GNU sed. Both GNU sed and BSD sed also have \< and \> for explicitly matching at the start and end of a word, and BSD sed additionally supports the patterns [[:<:]] and [[:>:]] (but GNU sed does not). Although these are shaped like POSIX character classes, they are not character classes and can't be negated ([^[:>:]] does not work).
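
As a rough illustration of \< and \>, the sketch below brackets each word of a made-up sample line. It was written with GNU sed in mind; whether a given BSD sed accepts these anchors is an assumption to verify on the system at hand.

```shell
# \< matches the empty string at the start of a word,
# \> matches the empty string at the end of a word.
bracketed=$(printf '%s\n' 'pear melon' | sed -e 's/\</[/g' -e 's/\>/]/g')

printf '%s\n' "$bracketed"   # [pear] [melon]
```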



To get a similar effect without using \B, you could use something like

    $ echo 'apple pear melon banana cherry papaya' | sed 's/\([[:alnum:]]\)a\([[:alnum:]]\)/\1u\2/g'
    apple peur melon bunana cherry pupaya

That is, match an alphanumeric character on either side of the a, and then include these two flanking characters in the replacement. Note that since the replacement only happens for non-overlapping matches, this would not properly substitute the a's in a string containing multiple consecutive a's (or a's in every second position). See how banana does not come out as bununa due to this.

To sort that out, you could introduce a loop in the sed program:

    sed -e :top -e 's/\([[:alnum:]]\)a\([[:alnum:]]\)/\1u\2/g' -e ttop

This repeats the substitution over the input line until no more matches remain, so that all overlapping pattern matches are handled.
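
As a sanity check, the following sketch (again assuming GNU sed, since it uses \B for comparison) runs the single-pass substitution, the looped version, and the \B version over the sample line from the question. Note that [[:alnum:]] is not exactly the same set as the "word characters" used by \B, which also include the underscore, so the two approaches can differ on other inputs.

```shell
line='apple pear melon banana cherry papaya'

# Single pass: overlapping matches are missed.
single=$(printf '%s\n' "$line" |
  sed 's/\([[:alnum:]]\)a\([[:alnum:]]\)/\1u\2/g')

# Loop: "t top" jumps back to ":top" as long as a substitution was made.
looped=$(printf '%s\n' "$line" |
  sed -e :top -e 's/\([[:alnum:]]\)a\([[:alnum:]]\)/\1u\2/g' -e 't top')

# GNU-only reference result using \B.
reference=$(printf '%s\n' "$line" | sed 's/\Ba\B/u/g')

printf '%s\n' "$single"     # apple peur melon bunana cherry pupaya
printf '%s\n' "$looped"     # apple peur melon bununa cherry pupuya
printf '%s\n' "$reference"  # apple peur melon bununa cherry pupuya
```

The loop terminates because every successful pass replaces at least one a, and the replacement never introduces a new one.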







          edited Mar 28 at 13:14

























          answered Mar 28 at 12:21









          Kusalananda







          • Note that POSIX doesn't specify [[:<:]]. While it's shaped like a POSIX character class, it's not a character class at all.

            – Stéphane Chazelas
            Mar 28 at 13:01












          • For ast-open's sed, \b means a backspace character (like in many other utilities including echo, printf and awk and $'...').

            – Stéphane Chazelas
            Mar 28 at 13:02












          • The section about "The additional word delimiters \< and \> are provided to ease compatibility with traditional SVR4 systems but are not portable and should be avoided" in the re_format man page of OpenBSD makes little sense. \< is more portable than [[:<:]] and comes from vi (so from BSD) long before SVR4. AFAIK, \b comes from perl.

            – Stéphane Chazelas
            Mar 28 at 13:18












          • @StéphaneChazelas Thanks. I might submit a bug report/patch to the OpenBSD lists when I find the time. It's been on my mind.

            – Kusalananda
            Mar 28 at 13:20











          • [^\b] matches a collating element, so it could match more than one character. For instance, printf '%s\n' 'abdz' | LC_ALL=hu_HU.UTF-8 sed 's/[^\b]/<&>/g' outputs <a>b<dz> with GNU sed.

            – Stéphane Chazelas
            Mar 28 at 13:37











