-
-
Notifications
You must be signed in to change notification settings - Fork 996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement terragrunt_output #828
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, fantastic. So excited to see this 👍
README.md
Outdated
will fail when you run an `apply-all` since the output will be interpolated before the target config has applied. | ||
Additionally, during an `apply-all`, the terragrunt configuration has to be partially interpolated in order to build up | ||
the dependency tree. This means that you will have issues using `apply-all` if `get_output` is used in the following | ||
blocks, as it will be interpolated prior to any modules being applied: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow this last one? Can you give an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imagine you had:
live
├── sql
│ └── terragrunt.hcl
└── vpc
└── terragrunt.hcl
And you had the following in live/sql/terragrunt.hcl
:
locals = {
vpc_path = "../vpc"
vpc_id = get_output(local.vpc_path, "vpc_id")
}
dependencies {
paths = [local.vpc_path]
}
inputs = {
vpc_id = local.vpc_id
}
Now suppose we have a completely clean project and nothing is applied yet and we run apply-all
. In order to build the dependency tree, terragrunt
needs to parse out the dependencies
block and decode them before any modules have been applied. In the above use case, the dependencies
block also references locals
, so we need to decode locals
as well. As part of that, we need to interpolate get_output
.
Well, get_output
will try to read the output of the vpc
module and fail, because that hasn't been applied yet, because we are still in the phase of building up the dependency tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I gotcha. So then the get_output
function would have to:
- Is this a single-module command (e.g.,
apply
) or multi-module (e.g.,apply-all
). - If single-module, and the dependency has not yet been applied (i.e., output is missing), return an error immediately.
- If multi-module, add the VPC to the dependency graph, pause config parsing until the command (e.g.,
apply
) has been executed in that dependency, and then continue config parsing after.
Would that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still exists, but is less of an issue because of the way you access terragrunt_output
.
README.md
Outdated
the dependency tree. This means that you will have issues using `apply-all` if `get_output` is used in the following | ||
blocks, as it will be interpolated prior to any modules being applied: | ||
|
||
- `locals` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, if you need to fetch 10 different outputs from a single dependency, I suspect 10 sequential terragrunt output
calls would be slow. Do we cache the outputs in memory in between? If not, it would be nice to allow users to cache all the outputs in a locals
variable once and then look up values within it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea that is a good point. Under most circumstances, you can use get_output
in locals
. There are some caveats you have to be aware of: #828 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first impression was that we should maintain an internal, in-memory cache of (module absolute path, all outputs for that module) pairs. Any time you call get_output
, it first checks the cache for that value: if it's present, it returns it immediately; if it's absent, it calls terraform output
and caches the result. This ensures maximal performance without any extra effort for the user...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on the implementation, I worry about what happens when we are going through multiple passes of the config, as it does in the xxx-all
command. Here is where a global memoized cache could be problematic:
- In the first pass through, it would run
get_output
on the previous state. This result gets cached. - The dependency gets applied as part of
apply-all
, in the dependency order. - We parse the config again to apply the dependent module. As part of that, we call
get_output
. We look up in the cache, and return the previous value, which is NOT the newly applied state.
Note that this only happens when you depend on the awkward chicken-and-egg scenario of using get_output
in locals
, and in a naive implementation using a global cache that is stored for the duration of a single terragrunt
call.
We could have a different implementation where the cache is scoped to a single config parsing, which trades off more reuse for better predictability. I am leaning towards implementing it this way, as cache bugs are always super hard to debug. This of course requires more refactoring, since now I need to store the cache somewhere in a struct that is available throughout the parsing stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the new implementation, you will call terragrunt output
at most once per dependency per config. There is still room for memoization to cache across apply-all
calls, but I think this is world's better than get_output
which was once per function call.
And we don't need to do caching (which is always a pain to maintain due to annoying cache busting bugs) :)
config/config_helpers.go
Outdated
// target config check: make sure the target config exists | ||
targetConfig := params[0] | ||
if util.IsDir(targetConfig) { | ||
targetConfig = util.JoinPath(targetConfig, DefaultTerragruntConfigPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO for the future: we could support reading outputs from normal Terraform modules (i.e., those not meant to be used with Terragrunt) by running terraform output
instead of terragrunt output
. Not particularly high priority for now though!
- Always read the entire output map so we have type data - Refactor json to cty.Value conversion into its own function with unit tests
Ok the main sticking point is finding a good way to cache the I expect that the real solution is in #828 (comment), where we won't need to parse the dependencies block anymore. Although... we still need the |
README.md
Outdated
will fail when you run an `apply-all` since the output will be interpolated before the target config has applied. | ||
Additionally, during an `apply-all`, the terragrunt configuration has to be partially interpolated in order to build up | ||
the dependency tree. This means that you will have issues using `apply-all` if `get_output` is used in the following | ||
blocks, as it will be interpolated prior to any modules being applied: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I gotcha. So then the get_output
function would have to:
- Is this a single-module command (e.g.,
apply
) or multi-module (e.g.,apply-all
). - If single-module, and the dependency has not yet been applied (i.e., output is missing), return an error immediately.
- If multi-module, add the VPC to the dependency graph, pause config parsing until the command (e.g.,
apply
) has been executed in that dependency, and then continue config parsing after.
Would that work?
README.md
Outdated
the dependency tree. This means that you will have issues using `apply-all` if `get_output` is used in the following | ||
blocks, as it will be interpolated prior to any modules being applied: | ||
|
||
- `locals` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first impression was that we should maintain an internal, in-memory cache of (module absolute path, all outputs for that module) pairs. Any time you call get_output
, it first checks the cache for that value: if it's present, it returns it immediately; if it's absent, it calls terraform output
and caches the result. This ensures maximal performance without any extra effort for the user...
Ok new idea. This got me thinking of a potential way to solve this: what if we introduce a new block? After navigating HCL2, I think I am beginning to understand that it is significantly easier to implement partial parsing decoding of individual blocks as opposed to when it appears in the middle of an AST. Given that, it seems like the best way to achieve all of the goals is to introduce a new block construct like
In this model, the caching problem goes away because we can resolve all the Parsing order problems go away as well, since it can be parsed independently of the rest of the config (except maybe Implementing partial parsing just for dependency building is easy as well, since all we need to do is parse the blocks into a list of structs to get the config: no need to walk an AST! Another benefit is that we have full control over when the output pulling happens, since now it is a first class task in the decoding pipeline, happening before we pass the HCL struct to the decoder. So the parsing logic will now be:
|
That seems like an elegant solution! |
…that uses it yet) - Replace free strings in partial decode decodeList with enums
… now auto read these in as dependencies
Ok @brikis98 this is ready for a rereview. I have implemented
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, superb work. This ended up even more complicated to accomplish than I originally assume, but I think this is a great solution 👍
|
||
terragrunt_output "redis" { | ||
config_path = "../redis" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very, very cool 👍
Now that I've seen this, I can't wonder if this should be:
dependency "mysql" {
config_path = "../mysql"
}
dependency "redis" {
config_path = "../redis"
}
inputs = {
mysql_url = dependency.mysql.outputs.domain
redis_url = dependency.redis.outputs.domain
}
For this PR, the only difference would be a slight naming change: terragrunt_output
becomes dependency
and you read outputs from it via the .outputs
attribute. However, in a future PR, we could allow you to read any part of that module's config! E.g., Perhaps dependency.mysql.inputs.foo
would return the value of the "foo" output from the mysql
dependency and dependency.redis.remote_state.config.bucket
would return the bucket configured for remote state storage in the redis
dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this. I went through and made this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -2126,6 +2206,46 @@ The `skip` flag must be set explicitly in terragrunt modules that should be skip | |||
`terragrunt.hcl` file that is included by another `terragrunt.hcl` file, only the `terragrunt.hcl` file that explicitly | |||
set `skip = true` will be skipped. | |||
|
|||
|
|||
### Configuration parsing order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for documenting this 👍
// Allowed References: | ||
// - locals | ||
// - terragrunt_output | ||
// 5. Merge the included config with the parsed config. Note that all the config data is mergable except for `locals` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great docs 👍
@@ -28,7 +29,7 @@ func CreateLoggerWithWriter(writer io.Writer, prefix string) *log.Logger { | |||
// logging solution. | |||
// Debugf will only print out terragrunt logs if the TG_LOG environment variable is set to DEBUG. | |||
func Debugf(logger *log.Logger, fmtString string, fmtArgs ...interface{}) { | |||
if os.Getenv("TG_LOG") == "DEBUG" { | |||
if strings.ToLower(os.Getenv("TG_LOG")) == "debug" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we document this anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to README
Co-Authored-By: Yevgeniy Brikman <brikis98@users.noreply.github.com>
Ok merging and releasing this! Thanks for the review! |
The implementation has now been completely changed.
get_output
is no longer a function and instead is a new block calledterragrunt_output
. The benefits of this approach are:terragrunt_output
blocks. See https://github.com/gruntwork-io/terragrunt/blob/38aca82ff0890af3a4584870b4bd5eed8f788311/README.md#passing-outputs-between-modules for more info.terragrunt output
call: withget_output
,terragrunt output
had to be called each time the function was executed to keep the implementation sane. Sinceget_output
couldn't be parsed beforelocals
, this made it difficult to reuse the output results of a single call if you needed multiple variables from the output. Now only one call needs to be made per block and it can be referenced multiple times.This provides an implementation ofget_output
, an interpolation function that can be used to extract the output of another terraform module wrapped with terragrunt config. See the README changes for more details.`Note that as part of the implementation, it was required to refactor the configuration parsing during a
xyz-all
command to only do a partial decoding. Otherwise,apply-all
will fail even if you specify dependencies because it will interpolate theget_output
call in the initial pass to read thedependencies
from the config. This is handled in a new,PartialParseConfigFile
function which will only parse the sections specified.Note that becauselocals
,include
,dependencies
,terraform
blocks are evaluated in this section,apply-all
in the initial setup will still fail ifget_output
. This is covered in theREADME
.