sed anchor characters in [^]

Why does sed, when we use the negated bracket expression [^ ], treat anchor characters like \b or \B as real characters? E.g. one would expect the following commands to yield the same result, but they don't:

$ echo 'apple pear melon banana cherry papaya' | sed 's/[^\b]a[^\b]/u/g'
apple pu melon baua cherry uaya
$ echo 'apple pear melon banana cherry papaya' | sed 's/\Ba\B/u/g'
apple peur melon bununa cherry pupuya

If there were no \B, how could we negate \b?

asked Mar 28 at 12:08

Amaterasu

233

1 Answer
Neither \b nor \B is a character. Both are zero-width patterns that match between characters.

The \b pattern matches at a word boundary, i.e. between a character that is a "word character" and a character that is not a "word character".

The \B pattern matches at a non-word boundary, i.e. between two characters that are either both "word characters" or both not.
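To make the zero-width matches visible, here is a small sketch that inserts a marker at every position where each pattern matches. It assumes GNU sed; the sample string and the X marker are only illustrative and are not part of the question.

```shell
# Assumes GNU sed, which supports \b and \B.
sample='abc %-= def.'

# Put an X at every word boundary.
boundaries=$(printf '%s\n' "$sample" | sed 's/\b/X/g')

# Put an X at every position that is NOT a word boundary.
non_boundaries=$(printf '%s\n' "$sample" | sed 's/\B/X/g')

printf '%s\n' "$boundaries" "$non_boundaries"
```

Nothing is consumed by either pattern: the markers are inserted between existing characters, and every position gets a marker in exactly one of the two outputs.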

The pattern [^\b] matches one character. This is why pear is transformed into pu: you replace ear (the a and the surrounding characters).

With GNU sed, [^\b] matches a character that is not a \ or a b.
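A quick way to check this, assuming GNU sed, is to feed it input that contains a literal backslash and a literal b. The input string below is made up for the demonstration.

```shell
# Assumes GNU sed. The input holds five characters: a, \, b, -, c.
# printf's %s does not interpret the backslash, so it reaches sed as-is.
bracket_demo=$(printf '%s\n' 'a\b-c' | sed 's/[^\b]/X/g')

# Only the characters that are neither \ nor b get replaced.
printf '%s\n' "$bracket_demo"
```

The backslash and the b survive while everything else becomes X, which shows that inside the bracket expression the two characters are taken literally rather than as a word-boundary anchor.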

There is no way to use a character class to replace the use of \B that I'm aware of.

The \b and \B patterns are supported by GNU sed. Both GNU sed and BSD sed also have \< and \> for explicitly matching at the start and end of a word, and BSD sed additionally supports the POSIX-style patterns [[:<:]] and [[:>:]] (but GNU sed does not). These patterns can't be negated ([^[:>:]] does not work).
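As a rough illustration of \< and \>, the following sketch replaces a word only where it stands alone. It assumes a sed that supports these anchors (GNU sed, for instance); the sample words are invented for the example.

```shell
# Assumes a sed with \< and \> support, e.g. GNU sed.
# \<cat\> requires a word start before "cat" and a word end after it.
whole_word=$(printf '%s\n' 'cat concat cats' | sed 's/\<cat\>/dog/g')

printf '%s\n' "$whole_word"
```

Only the standalone cat is replaced; concat fails the start-of-word test and cats fails the end-of-word test.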

To get a similar effect without using \B, you could use something like

$ echo 'apple pear melon banana cherry papaya' | sed 's/\([[:alnum:]]\)a\([[:alnum:]]\)/\1u\2/g'
apple peur melon bunana cherry pupaya

That is, match an alphanumeric character on either side of the a, and then include these two flanking characters in the replacement. Note that since the replacement only happens for non-overlapping matches, this would not properly substitute the a's in a string containing multiple consecutive a's (or a's in every second position). See how banana does not come out as bununa because of this.

To sort that out, you could introduce a loop in the sed program:

sed -e :top -e 's/\([[:alnum:]]\)a\([[:alnum:]]\)/\1u\2/g' -e ttop

This performs the replacement over the input line as many times as needed until all overlapping pattern matches have been handled.
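To see the difference the loop makes, this sketch runs the question's sample line through the substitution once and then through the looping variant. The variable names are only for illustration.

```shell
sample='apple pear melon banana cherry papaya'
subst='s/\([[:alnum:]]\)a\([[:alnum:]]\)/\1u\2/g'

# One pass: overlapping matches such as the second a in banana are missed.
single_pass=$(printf '%s\n' "$sample" | sed -e "$subst")

# Looped: t branches back to :top for as long as a substitution was made.
looped=$(printf '%s\n' "$sample" | sed -e :top -e "$subst" -e ttop)

printf '%s\n' "$single_pass" "$looped"
```

The first line printed is apple peur melon bunana cherry pupaya and the second is apple peur melon bununa cherry pupuya, the same result as the \Ba\B command in the question.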

edited Mar 28 at 13:14

answered Mar 28 at 12:21

Kusalananda♦

140k17261435

Note that POSIX doesn't specify [[:<:]]. While it's shaped like a POSIX character class, it's not a character class at all.

– Stéphane Chazelas
Mar 28 at 13:01

For ast-open's sed, \b means a backspace character (like in many other utilities including echo, printf and awk and $'...').

– Stéphane Chazelas
Mar 28 at 13:02

The section in the re_format man page of OpenBSD that says "The additional word delimiters \< and \> are provided to ease compatibility with traditional SVR4 systems but are not portable and should be avoided" makes little sense. \< is more portable than [[:<:]] and comes from vi (so from BSD) long before SVR4. AFAIK, \b comes from perl.

– Stéphane Chazelas
Mar 28 at 13:18

@StéphaneChazelas Thanks. I might submit a bug report/patch to the OpenBSD lists when I find the time. It's been on my mind.

– Kusalananda♦
Mar 28 at 13:20

[^\b] matches a collating element, so could match more than one character. printf '%s\n' 'abdz' | LC_ALL=hu_HU.UTF-8 sed 's/[^\b]/<&>/g' outputs <a>b<dz> with GNU sed for instance.

– Stéphane Chazelas
Mar 28 at 13:37
