What character encoding is used for Linux configuration files?
A colleague was using Qt's built-in QTextStream class to rewrite the /etc/network/interfaces file on an Ubuntu system. Part of that code included a call to QTextStream's setCodec() method, where the codec was set to UTF-8. (see https://doc.qt.io/qt-5/qtextstream.html#setCodec if you're curious)
This got me wondering what encoding Linux configuration files are SUPPOSED to be written in. It seems like ISO 8859-1 would be the closest to what I'd consider a "plain ASCII" style of text, and I would (perhaps naively) assume this to be correct, since most configuration files are plain English with no need for much more than the basic alphabet, numbers and a few punctuation signs.
But then I also wonder what someone from a non-English speaking country would do if they wanted to put comments into such files using other characters that aren't in ISO-8859-. Are they just plain "out of luck"?
There are obviously a lot of "standard" configuration files that you'd find on an Ubuntu/Linux system, e.g.
- /etc/network/interfaces
- /etc/ntp.conf
- /etc/hostname
- ...
Would anyone care to weigh in on what encoding is actually supported/expected in these sorts of files? And where is this actually documented? Is it enshrined in some sort of "Linux developers manifesto" as something writers of new Linux system services should be following, and if so, where would I find a definitive source of that information?
linux configuration locale
UTF-8 is as close to plain ASCII as ISO-8859-1 is, in that both contain ASCII as a subset. Both encodings produce identical bytes if you restrict the text to plain ASCII. The problem with ISO-8859-1, as you point out yourself, is that it is a much more restricted encoding. IMHO, the 8-bit ISO-8859 encodings are a thing of the past and should be phased out.
– Johan Myréen
yesterday
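To make the "ASCII as a subset" point concrete, here is a minimal Python sketch. The helper name `encode_both` and the sample strings are made up for illustration; only the standard `str.encode` codecs are used.

```python
def encode_both(text):
    """Return (utf8_bytes, latin1_bytes), with latin1_bytes set to None
    when the text cannot be represented in ISO-8859-1."""
    utf8 = text.encode("utf-8")
    try:
        latin1 = text.encode("iso-8859-1")
    except UnicodeEncodeError:
        latin1 = None  # at least one character is outside ISO-8859-1
    return utf8, latin1


if __name__ == "__main__":
    samples = [
        "iface eth0 inet dhcp",  # pure ASCII: both encodings give the same bytes
        "# réseau",              # Latin-1 character: the bytes differ
        "# сеть",                # Cyrillic: not representable in ISO-8859-1
    ]
    for sample in samples:
        utf8, latin1 = encode_both(sample)
        print(repr(sample), utf8, latin1)
```

As long as a file stays within ASCII, a reader cannot tell which of the two encodings the writer had in mind, because the bytes on disk are the same.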
If a particular service, for example the NTP daemon, is only written with ASCII in mind when it reads /etc/ntp.conf, what is going to happen if someone embeds non-ASCII UTF-8 characters (e.g. in a comment)? Is it explicitly doing UTF-8-aware processing of the configuration file (by design), or is it just "dumb luck" that it works? That's what I'm trying to understand here. Obviously there are a lot of "moving pieces", so I can't just read all their source code to figure this out. That's why I was looking for some sort of "recipe" document that they are all following (hopefully!)
– JasonA
yesterday
If the program that reads the configuration file expects plain ASCII, then I would say the chance it chokes on ISO-8859 is just as big as it is with UTF-8. If the non-ASCII characters are in comments, the chance is probably quite small.
– Johan Myréen
yesterday
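A rough sketch of why non-ASCII bytes in comments usually go unnoticed. This is a hypothetical byte-oriented parser, not how ntpd or any real daemon reads its configuration; the function name `parse_ascii_config` and the sample lines are assumptions for illustration. It relies on one fact: in both UTF-8 and ISO-8859-1, every non-ASCII character is encoded using only bytes >= 0x80, so the `#` byte (0x23) can never appear inside one.

```python
def parse_ascii_config(raw: bytes) -> dict:
    """Parse 'key value' lines from raw bytes.

    Everything after '#' is discarded before decoding, so the parser
    never looks at comment bytes. The remainder must be plain ASCII.
    """
    settings = {}
    for line in raw.splitlines():
        code = line.split(b"#", 1)[0].strip()  # drop the comment, if any
        if not code:
            continue  # blank line or comment-only line
        # Raises UnicodeDecodeError if a directive contains non-ASCII bytes.
        key, _, value = code.decode("ascii").partition(" ")
        settings[key] = value.strip()
    return settings


if __name__ == "__main__":
    utf8_file = "server pool.example.org  # сервер времени\n".encode("utf-8")
    latin1_file = "server pool.example.org  # serveur préféré\n".encode("iso-8859-1")
    print(parse_ascii_config(utf8_file))
    print(parse_ascii_config(latin1_file))
```

With a parser shaped like this, the comment encoding is irrelevant, while a non-ASCII byte in an actual directive fails in the same way whether it came from UTF-8 or ISO-8859-1.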
asked yesterday by JasonA
1 Answer
The general encoding can be set via the LANG environment variable, but by now nearly all Linux distros and tools have migrated to UTF-8. The main advantage for configuration files is that any string using only ASCII characters is also valid UTF-8. So for most configuration files it doesn't really matter, since they only use those characters anyway.
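If you want to check what a given file actually contains, a small Python sketch along these lines may help. The `classify` helper is hypothetical, and the "other" bucket only means "not valid UTF-8" (it could be ISO-8859-1 or anything else). `locale.getpreferredencoding` is the standard library call that reports the encoding implied by the current locale settings such as LANG.

```python
import locale


def classify(raw: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for the given file contents."""
    try:
        raw.decode("ascii")
        return "ascii"  # also valid UTF-8, since ASCII is a subset of it
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"  # e.g. an 8-bit encoding such as ISO-8859-1


if __name__ == "__main__":
    # Encoding implied by the locale environment (LANG and friends).
    print("locale encoding:", locale.getpreferredencoding(False))
    print(classify(b"iface lo inet loopback\n"))
    print(classify("# café\n".encode("utf-8")))
    print(classify("# café\n".encode("iso-8859-1")))
```

You could feed it the bytes of a real file with something like `classify(open("/etc/ntp.conf", "rb").read())`, assuming the file exists and is readable on your system.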
answered yesterday by Metalfreak