Strategies for Dealing with Environment Variables

Published on Friday, September 14, 2018

TL;DR: Frameworks offer tools to parameterize environments in a variety of ways. Because of this, the configuration files of projects tend to get messy once projects are taken into production. Specifying the purpose of a parameter within its name can help identify unneeded configuration. Making configuration explicit within the application layer can be even more helpful: doing so eases refactoring and can improve the overall developer experience.

How often have you found yourself sifting through a configuration file to look up some settings? That can be quite a daunting task on some projects, right? I just picked three large projects I worked on. They contained 38, 121 and 68 parameters respectively.

Configuration files are messy

When teams properly refactor, obsolete and redundant settings will be removed from files like these. If your team does so, you can skip right towards the bottom part of this post for some inspiration. But what I encounter as a consultant is often the opposite. Figuring out if a particular setting can be removed or not is often not as easy as it should be. Especially if your framework manipulates the physical representation of a parameter from “camel case” (e.g. mySetting) to “snake case” (e.g. my_setting) or constant case (e.g. MY_SETTING). As a result we get files bloated with settings that may no longer be relevant.
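
To illustrate why a plain text search falls short, here is a minimal sketch of the kind of name rewriting a framework might apply. The helper names toSnakeCase and toConstantCase are made up for this example and do not correspond to any particular framework's API:

```php
<?php

// Hypothetical helper: converts "mySetting" to "my_setting".
function toSnakeCase(string $camelCase): string {
    // Insert an underscore before every uppercase letter except a leading one.
    return strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $camelCase));
}

// Hypothetical helper: converts "mySetting" to "MY_SETTING".
function toConstantCase(string $camelCase): string {
    return strtoupper(toSnakeCase($camelCase));
}

echo toSnakeCase('mySetting'), PHP_EOL;    // my_setting
echo toConstantCase('mySetting'), PHP_EOL; // MY_SETTING
```

A search for MY_SETTING will not find code that reads mySetting, so every spelling has to be searched for separately before a setting can be declared dead.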

An organized mess is still better than a mess

One way to deal with this is to use comments. Unfortunately, these parameter files are often manipulated by toolchains that reorder settings and sometimes even remove comments. So on a solo project I did a few years ago I started to experiment with parameter names that were self-explanatory. Rather than SMTP_USERNAME, the name would be WHAT_IS_THE_SMTP_USERNAME_FOR_SENDING_TRANSACTIONAL_EMAILS. You may think this is over the top, but hear me out on some of the advantages:

Regardless of how our configuration file is modified, people on our team will understand what each setting is used for, even those who have just joined the team;
If the setting is used for the wrong purpose – for example for sending marketing emails – it will easily be detected. The code will just yell at the reader that things are used wrongly: var marketingMailer = new EmailSender(env('WHAT_IS_THE_SMTP_USERNAME_FOR_SENDING_TRANSACTIONAL_EMAILS'), (...));
Because the names are more specific, it will be easier to find the settings that are no longer needed. If we charge a new member of the team with replacing our Redis PUB/SUB infrastructure with RabbitMQ, she will now know that WHAT_IS_THE_REDIS_USERNAME_FOR_DISPATCHING_EVENTS_OVER_PUB_SUB can be removed. I doubt that would happen if the name were REDIS_USERNAME, which could be used for just about anything.

All right! This is great. We have more clarity about configuration within the team. Unfortunately, finding out where a particular setting is used is still manual work. Even worse, other than the “Find” functionality, our IDE won’t be particularly helpful, which is a shame because it makes refactoring more difficult.

Do something for your IDE and it will give more in return

On a few other projects I decided to refactor the configuration setup and take this a little bit further. The first step was to introduce a class EnvironmentVariables with one method for each of the environment variables. The method names in the class were identical to the environment variable names. An example of the implementation as I would write it in PHP would look like this:

final class EnvironmentVariables {

    function WHAT_IS_THE_SMTP_ENDPOINT_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return getenv(__FUNCTION__);
    }
    
    function WHAT_IS_THE_SMTP_USERNAME_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return getenv(__FUNCTION__);
    }
    
    function WHAT_IS_THE_SMTP_PASSWORD_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return getenv(__FUNCTION__);
    }
    
    // additional settings ...
}

At the call site the object would be used like this:

$envVars = new EnvironmentVariables;
$transactionalMailer = new EmailSender(
    $envVars->WHAT_IS_THE_SMTP_USERNAME_FOR_SENDING_TRANSACTIONAL_EMAILS(),
    $envVars->WHAT_IS_THE_SMTP_PASSWORD_FOR_SENDING_TRANSACTIONAL_EMAILS(),
    $envVars->WHAT_IS_THE_SMTP_ENDPOINT_FOR_SENDING_TRANSACTIONAL_EMAILS()
);

At this point we have gotten some immediate benefits:

We can easily find any usages of our environment variables because each one is now a method, so our IDE can find usages and help us refactor them;
Thanks to the return type declarations we can detect missing configuration as a type mismatch, since getenv returns false when the variable doesn’t exist. Note that PHP only raises a TypeError here when strict_types is enabled; otherwise false is silently coerced to an empty string;
We now have a point of abstraction to improve our overall configuration system. We could use this to generate configuration files, document particular settings, or obscure values in our toolstack.
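
If relying on the return type to surface a missing variable feels too implicit (it requires strict_types to be enabled), an explicit check is one alternative. A minimal sketch, where requireEnvironmentVariable and MissingEnvironmentVariable are hypothetical names introduced for illustration and not part of the original code:

```php
<?php

// Hypothetical exception type, introduced only for this sketch.
final class MissingEnvironmentVariable extends RuntimeException {}

// Hypothetical helper: reads a variable and fails loudly when it is absent.
function requireEnvironmentVariable(string $name): string {
    $value = getenv($name);

    // getenv() returns false when the variable does not exist.
    if ($value === false) {
        throw new MissingEnvironmentVariable("Environment variable {$name} is not set");
    }

    return $value;
}
```

Each method in the class could then return requireEnvironmentVariable(__FUNCTION__) instead of calling getenv directly, which behaves the same with or without strict typing.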

Abstraction points, attraction points

At this point the EnvironmentVariables class immediately starts to attract behaviour. Let’s start with refactoring things a little bit. Let’s create a private method for loading the variable:

final class EnvironmentVariables {

    function WHAT_IS_THE_SMTP_ENDPOINT_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return $this->getVariable(__FUNCTION__);
    }

    function WHAT_IS_THE_SMTP_USERNAME_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return $this->getVariable(__FUNCTION__);
    }
    
    function WHAT_IS_THE_SMTP_PASSWORD_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return $this->getVariable(__FUNCTION__);
    }
    
    private function getVariable(string $variable): string {
        return getenv($variable);
    }
    
    // etc...
}

How about default values? Let’s add a parameter to the getVariable method called $default.

final class EnvironmentVariables {

    function WHAT_IS_THE_SMTP_ENDPOINT_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return $this->getVariable(__FUNCTION__, 'smtp.postmarkapp.com');
    }

    function WHAT_IS_THE_SMTP_USERNAME_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return $this->getVariable(__FUNCTION__, 'ask your team lead');
    }
    
    function WHAT_IS_THE_SMTP_PASSWORD_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return $this->getVariable(__FUNCTION__, 'ask your team lead');
    }
    
    private function getVariable(string $variable, string $default): string {
        $value = getenv($variable);
        
        return $value !== '' && $value !== false ? $value : $default;
    }
    
    // etc...
}

How about some documentation on possible values? Easy. Just use what you would regularly do: docblock documentation.

final class EnvironmentVariables {

    /**
     * Postmark is used exclusively for sending transactional email because they do a great job at email delivery.
     * 
     * @see https://postmarkapp.com/developer/user-guide/sending-email/sending-with-smtp
     */
    function WHAT_IS_THE_SMTP_ENDPOINT_FOR_SENDING_TRANSACTIONAL_EMAILS (): string {
        return $this->getVariable(__FUNCTION__, 'smtp.postmarkapp.com');
    }
        
    // etc...
}

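How about generating local configuration? Let’s make the class implement IteratorAggregate, so that it can be used in a foreach loop, and add two methods and a little helper script:

final class EnvironmentVariables implements IteratorAggregate {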

    function getIterator (): Iterator {
        $reflection = new ReflectionObject($this);
        $publicMethods = $reflection->getMethods(ReflectionMethod::IS_PUBLIC);
        $environmentVariables = [];

        foreach ($publicMethods as $method)
        {
            if (1 !== preg_match('#^(getIterator|to)#', $method->name))
            {
                $environmentVariables[$method->name] = $method->invoke($this);
            }
        }

        return new ArrayIterator($environmentVariables);
    }

    function toDotEnv(): string {
        $variables = [];
        $warningMessage = "IMPORTANT: do not manually add keys to this file! Please refer to the EnvironmentVariables class";

        foreach ($this as $environmentVariable => $value) {
            $variables[] = <<<ENV
{$environmentVariable}="{$value}"
ENV;
        }

        return "# {$warningMessage}\n\n" . implode(PHP_EOL, $variables) . PHP_EOL;
    }
    
    // etc.
}

You could have a helper script like bin/dump-env-vars and you could use it to safely modify existing configuration files.

echo (new EnvironmentVariables())->toDotEnv();
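
A script like this could also help spot obsolete settings by comparing the keys in an existing file against the names the class knows about. A rough sketch, assuming a simple KEY=value file format without export prefixes or multi-line values; findUnknownDotEnvKeys is a hypothetical helper, not part of the original class:

```php
<?php

/**
 * Hypothetical helper: returns the keys found in dotenv-style content
 * that do not appear in the list of known variable names.
 *
 * @param string[] $knownNames
 * @return string[]
 */
function findUnknownDotEnvKeys(string $dotEnvContents, array $knownNames): array {
    $unknown = [];

    foreach (preg_split('/\R/', $dotEnvContents) as $line) {
        $line = trim($line);

        // Skip blank lines and comments.
        if ($line === '' || $line[0] === '#') {
            continue;
        }

        // Capture the key in front of the first "=" sign.
        if (1 === preg_match('/^([A-Za-z_][A-Za-z0-9_]*)\s*=/', $line, $matches)
            && !in_array($matches[1], $knownNames, true)) {
            $unknown[] = $matches[1];
        }
    }

    return $unknown;
}
```

The list of known names could come from array_keys(iterator_to_array((new EnvironmentVariables())->getIterator())), so any key the helper reports is a candidate for removal.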

To do or to postpone?

Should you do this? Is your code base large? Do you work with multiple people on it? Are there often large intervals where no work is done at all on the code base? How often do new people join your organization?

There are plenty of questions that can help in figuring out whether this will be beneficial. But keep in mind that refactoring often exposes problems; this might just uncover coupling you were unaware of before.

A little time investment in config files will pay off

Configuration files are difficult to manage. Naming can be a first step to tame the mental overload of large files that evolve over time. Making configuration an explicit part of the application layer is helpful because it introduces a point of abstraction and allows our IDE to do the work for us.