Trying to understand the behaviour of the environment in Linux (Ubuntu 13.04 concretely), I’ve found different situations where environment variables are used or defined for/in different contexts. For example, if I run locale, I get:
$ locale
LANG=en_US.UTF-8
LANGUAGE=es_ES:es_HN:es_EC:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=es_ES.UTF-8
// more output
But if I look for, say, LC_CTYPE using env | grep "LC_CTYPE", I get no output. In general, locale shows me 13 LC_* variables and env only nine:
$ locale | grep "LC_*" | wc -l
13
$ env | grep "LC_*" | wc -l
9
Another variable of a different “nature” is PS1. For example:
$ env | grep "PS1"
# No output, but...
$ set | grep "PS1" | head -n 1
PS1=$'\[\033[1;33m\][\t][\W]\342\230\233\[\033[0m\] '
and of course, PS1 is a well-defined variable in my current environment, since I see my prompt changed accordingly.
Another way of viewing environment variables in another context is by means of strace, a program which lets you see the system calls made when you execute a program. Sample:
$ strace -v ./a.out # a.out is a common Hello World, made in C.
execve("./a.out", ["./a.out"], ["LC_PAPER=es_ES.UTF-8", ...]) = 0
brk(0)
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
# etc, etc
write(1, "Hello World\n", 12Hello World
) = 12
exit_group(0) = ?
The first thing that happens when the shell executes a program is a call to execve, which is what actually runs the program. Its first argument is the program being called, the second is the argv array of the called program, and the third is the list of environment variables.
In that third parameter, for example, neither PS1 nor LC_CTYPE appears.
In general, variables appearing in env or set appear in the list of environment variables sent to execve. Some locale variables appear in env or set but others do not (LC_CTYPE, LC_COLLATE and LC_MESSAGES, as well as LC_ALL, which is shown with an empty value). Lastly, other variables are not shown by env although they have a visible effect (PS1), as reflected by set.
What’s going on here? What are the differences between env, set (without arguments) and locale (obviously with respect to locale variables only)?
Answers:
Method 1
The primary issue here, which accounts for why, e.g., $PS1 is not reported by env, is that env is a separate, non-interactive process that reports only the environment it inherited. Processes are executed from a fork of your interactive shell, but there’s a subtlety involved in how their environment is set: it’s actually inherited via a native C-level external variable set for all exec()'d processes (see man environ). Here’s an illustration:
#include <stdio.h>
extern char **environ;
int main (void) {
    int i;
    /* environ is a NULL-terminated array of "NAME=value" strings. */
    for (i = 0; environ[i] != NULL; i++) {
        printf("%s\n", environ[i]);
    }
    return 0;
}
What’s interesting about this is, if you compile and run it, you’ll find the contents of **environ exactly match what env reports:
$ gcc test.c
$ ./a.out > aout.txt
$ env > env.txt
$ diff env.txt aout.txt
68c68
< _=/bin/env
---
> _=./a.out
The only difference is the name of the executable. So where does **environ come from and why doesn’t it contain, e.g., $PS1?
The fundamental explanation is that processes are always created as children of other processes and they inherit **environ, but PS1 was never part of it. At start up, a shell may source variables from standard places, and those places differ depending on whether the shell is interactive or not; see INVOCATION in man bash. An aspect of this is that:
PS1 is set […] if bash is interactive, allowing a shell
script or a startup file to test this state.
Now, notice in /etc/bashrc something like this:
# are we an interactive shell?
if [ "$PS1" ]; then
That is where your actual (fancy) prompt is set, and neither it nor the initial value of $PS1 was ever exported. The initial value was created by the shell at invocation because it was interactive, and the shell then sourced that file, but PS1 did not get put into **environ. You can see this if you execute:
#!/bin/sh
echo $PS1
Nothing, even though echo $PS1 in your interactive shell shows that it’s defined. This is because the **environ of the executed #!/bin/sh is the same as that of the parent interactive shell, but that does NOT contain PS1. This implies each shell keeps an internal table of variables that is separate from **environ, although originally populated from it (this is confusing, since it means **environ does not include many things referred to as environment variables).
The contents of **environ, as they were when the process was started, are in /proc/[PID]/environ, and if you check that for your current interactive shell, cat /proc/$BASHPID/environ, you’ll see PS1 is not there.
But how does stuff get into “environ”?
The simple answer is, via library calls such as putenv(). For example, if we throw some stuff into the example C program from earlier:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern char **environ;
int main (void) {
    int i;
    /* Add MYFOO to this process's environment. */
    if (putenv("MYFOO=whatbar?")) {
        fprintf(stderr, "putenv() failed: %s\n", strerror(errno));
        exit(1);
    }
    for (i = 0; environ[i] != NULL; i++) {
        printf("%s\n", environ[i]);
    }
    return 0;
}
MYFOO=whatbar? will be in the output (see man putenv). Since the shell creates processes by fork()ing (which duplicates the parent’s address space, **environ included) and then calling one of the exec functions such as execv() (which passes on the duplicated **environ), we can see a mechanism by which environment variables may be exported to child processes.
If you throw a fork() into that example, you’ll see this is the case, and (to reiterate), this process of fork’ing and potentially exec’ing is how child processes are created and inherit **environ from their ancestors. exec calls replace the process image, but as per man execv and man environ (nb. some versions of the former do not refer to this), **environ is passed on by the system.
Here’s a literal fork and exec of /usr/bin/env with MYFOO=whatbar? exported via putenv():
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
extern char **environ;
int main (void) {
    pid_t pid;
    /* Add MYFOO to this process's environment. */
    if (putenv("MYFOO=whatbar?")) {
        fprintf(stderr, "putenv() failed: %s\n", strerror(errno));
        exit(1);
    }
    pid = fork();
    /* In the child (pid == 0), replace the process image with env. */
    if (!pid) execl("/usr/bin/env", "env", (char *) NULL);
    return 0;
}
So where’s the stuff that’s not in “environ”?
It’s private data of a particular shell instance. Bash will show you this + the inherited environ stuff via set with no arguments. Note this output also includes sourced functions.
But, if I find, for example, LC_CTYPE using env | grep "LC_CTYPE", it sends no output. In general, locale shows me 13 LC_* variables and env only nine:
I get no LC_ variables at all from env (just LANG) but 13 from locale. I would presume these are variables set by a locale call and not exported; the fact that you get any from env perhaps reflects a naive error in some configuration somewhere.
Method 2
The shell knows two types of variables:
- “internal” variables which are known to the shell only (and to subshells)
- exported variables, the “official” ones which are seen by execve and thus by env. The shell builtin export shows you the exported variables.
If you execute
export PS1
and repeat
env | grep "PS1"
then you see it. Variables can be exported during creation (export foo=bar instead of foo=bar), they can be exported automatically on creation or modification (set -a), they can be exported later (var=foo; ...; export var) and they can be “unexported” (export -n var).
If the shell creates “real” subshells (by a|b, (a;b), $(a) and so on), these are copies of the shell itself, so they keep the non-exported variables as well in order to avoid chaos.
Method 3
The output from the locale command is not a list of environment variables from the current environment. It is a display of that process’ effective locale settings (which is influenced in part by certain environment variables) and is presented in the same key=value format that the env command uses.
You can see the source for the eglibc locale command implementation here:
http://www.eglibc.org/cgi-bin/viewvc.cgi/branches/eglibc-2_19/libc/locale/programs/locale.c?view=markup
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.