Terraform Skeleton Part 5: Remote State with CloudFormation
Part 3 showed us Terraform requires some infrastructure itself to store remote state and discussed some limitations of using Terragrunt to manage the creation of that infrastructure. In part 4, we introduced more operational infrastructure for Terraform and began managing that infrastructure with CloudFormation.
Today, we’ll continue down the path of using CloudFormation to manage Terraform’s operational infrastructure. Achieving a clean separation between the infrastructure Terraform needs to run and the infrastructure Terraform manages opens up additional possibilities for locking down the Terraform state.
We’ll make use of CloudFormation’s import feature to bring the state bucket and lock table Terragrunt created for us under CloudFormation’s control for a seamless transition of ownership.
Goals
- Control all Terraform operational infrastructure with CloudFormation
- Import existing operational infrastructure (buckets, tables) into CloudFormation
If you prefer to jump to the end, the code implementing this post’s final result is available on branch release/1.4 on GitHub. Additionally, you can view the diffs from part 4, if that’s more your speed.
Define Operational Infrastructure with CloudFormation
There are three items we need to import into CloudFormation:
- The state S3 bucket
- The log S3 bucket
- The DynamoDB lock table
The first step is to define these resources in a CloudFormation template. We’ll add resource definitions to the init-admin-account.cf.yml template we created in part 4.
If you prefer, you can view the diffs on the CloudFormation template from part 4.
First up, the state bucket. Add an AWS::S3::Bucket resource to the CloudFormation template for our state bucket under the Resources block:
  TerraformStateBucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      BucketName: !Ref StateBucketName
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: aws:kms
      LoggingConfiguration:
        DestinationBucketName: !Ref StateLogBucketName
        LogFilePrefix: TFStateLogs/
      PublicAccessBlockConfiguration:
        BlockPublicAcls: True
        BlockPublicPolicy: True
        IgnorePublicAcls: True
        RestrictPublicBuckets: True
      VersioningConfiguration:
        Status: Enabled
The properties above match what Terragrunt used when it created the state bucket in part 3. Note that we’re also using the StateBucketName and StateLogBucketName parameters we added in part 4.
CloudFormation requires the DeletionPolicy attribute for any resources it will import, and it is a good safety measure in general.
UpdateReplacePolicy is also set. Although not required, it is needed to make our CloudFormation linter pre-commit hook happy.[1]
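Because CloudFormation insists on DeletionPolicy for imported resources, it can be handy to check for it before attempting the import. Here is a minimal Python sketch of such a check. It is not part of the skeleton repository: the missing_policies helper is hypothetical, and the template is a hand-written dictionary standing in for the parsed YAML (loading the real file would need a YAML loader that understands tags such as !Ref). The log bucket entry deliberately omits UpdateReplacePolicy so there is something to report.

```python
from __future__ import annotations


def missing_policies(template: dict, logical_ids: list[str]) -> dict[str, list[str]]:
    """Return {logical_id: [missing attribute names]} for the given resources."""
    required = ("DeletionPolicy", "UpdateReplacePolicy")
    problems = {}
    for logical_id in logical_ids:
        # A resource that is absent from the template is treated as missing both.
        resource = template.get("Resources", {}).get(logical_id, {})
        missing = [attr for attr in required if attr not in resource]
        if missing:
            problems[logical_id] = missing
    return problems


# Hand-written stand-in for the parsed template (an assumption for illustration).
template = {
    "Resources": {
        "TerraformStateBucket": {
            "Type": "AWS::S3::Bucket",
            "DeletionPolicy": "Retain",
            "UpdateReplacePolicy": "Retain",
        },
        "TerraformStateLogBucket": {
            "Type": "AWS::S3::Bucket",
            "DeletionPolicy": "Retain",
            # UpdateReplacePolicy left out on purpose.
        },
    }
}

print(missing_policies(template, ["TerraformStateBucket", "TerraformStateLogBucket"]))
```

Running it prints {'TerraformStateLogBucket': ['UpdateReplacePolicy']}, which is the same gap the linter warning in the footnote points at.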
Next, we’ll tackle the log bucket. Add the following to the Resources block:
  TerraformStateLogBucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      BucketName: !Ref StateLogBucketName
      AccessControl: LogDeliveryWrite
Finally, we have the lock table:
  TerraformStateLockTable:
    Type: 'AWS::DynamoDB::Table'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      TableName: !Ref LockTableName
      AttributeDefinitions:
        - AttributeName: LockID
          AttributeType: S
      KeySchema:
        - AttributeName: LockID
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST
If you’re starting fresh and Terragrunt hasn’t already created the state bucket, log bucket, and lock table for you, you can skip the next section on importing and run the make init-admin target we created in part 4 to deploy your stack. Otherwise, continue on to import the resources Terragrunt created for you.
Import Operational Infrastructure into CloudFormation
You can import the resources through the AWS Management Console, but I prefer to work from the command line. The commands to do so aren’t straightforward, so we’ll add some targets to our Makefile to simplify them.
For a clean import, we need the following capabilities in our Makefile:
- Use CloudFormation Drift Detection to ensure our stack is up-to-date and our resource definitions match what Terragrunt created
- Create a CloudFormation change set to import the resources and show us what CloudFormation intends to do
- Execute the change set to import the resources
If you prefer to jump to the end, here’s the resulting Makefile, and here are the diffs from part 4.
Check for Drift
The first step in the import process is to verify that the stack we will be importing into has no drift, i.e., that the resources it already manages still match its template. The CloudFormation API uses separate calls to start a drift detection job and to check that job’s status. We’ll add some helper functions to our Makefile to wrap these calls:
define wait_cfn_drift_detect_job
	@while [[ \
		"$$($(CFN_STATUS_DRIFT_DETECTION) $(1) | jq -r .DetectionStatus)" == \
		"DETECTION_IN_PROGRESS" \
	]]; do \
		echo "Detection in progress. Waiting 3 seconds..."; \
		sleep 3; \
	done
endef

define show_cfn_drift
	$(eval DRIFT_ID=$(shell $(CFN_START_DRIFT_DETECTION) $(1) \
		| jq -r .StackDriftDetectionId))
	$(call wait_cfn_drift_detect_job,${DRIFT_ID})
	@$(CFN_STATUS_DRIFT_DETECTION) $(DRIFT_ID) | jq '{ \
		DetectionStatus, \
		StackDriftStatus, \
		DriftedStackResourceCount \
	}'
endef
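If the Make syntax obscures what is going on, the same start-then-poll pattern can be sketched in Python. This is an illustration rather than code from the skeleton repository. The wait_for_drift_result function is a hypothetical helper; it expects an object with the detect_stack_drift and describe_stack_drift_detection_status methods that boto3’s CloudFormation client provides. To keep the sketch runnable without AWS credentials, a FakeCloudFormation stand-in returns canned responses.

```python
import time


def wait_for_drift_result(cfn, stack_name, poll_seconds=3, sleep=time.sleep):
    """Start drift detection on a stack and block until the job finishes."""
    detection_id = cfn.detect_stack_drift(StackName=stack_name)["StackDriftDetectionId"]
    while True:
        status = cfn.describe_stack_drift_detection_status(
            StackDriftDetectionId=detection_id
        )
        if status["DetectionStatus"] != "DETECTION_IN_PROGRESS":
            break
        sleep(poll_seconds)
    # Keep only the fields the Makefile helper prints.
    return {
        "DetectionStatus": status["DetectionStatus"],
        "StackDriftStatus": status.get("StackDriftStatus"),
        "DriftedStackResourceCount": status.get("DriftedStackResourceCount"),
    }


class FakeCloudFormation:
    """Stand-in for a real client; answers "in progress" once, then "complete"."""

    def __init__(self):
        self._responses = iter([
            {"DetectionStatus": "DETECTION_IN_PROGRESS"},
            {
                "DetectionStatus": "DETECTION_COMPLETE",
                "StackDriftStatus": "IN_SYNC",
                "DriftedStackResourceCount": 0,
            },
        ])

    def detect_stack_drift(self, StackName):
        return {"StackDriftDetectionId": "example-detection-id"}

    def describe_stack_drift_detection_status(self, StackDriftDetectionId):
        return next(self._responses)


# Skip the real sleep so the demonstration finishes immediately.
print(wait_for_drift_result(FakeCloudFormation(), "tf-admin-init", sleep=lambda _: None))
```

Against a real account, you would pass boto3.client("cloudformation") in place of the fake and leave the sleep argument at its default.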
Next, add a make target to perform the drift detection:
.PHONY: check-init-admin-drift
check-init-admin-drift:
	$(call show_cfn_drift,${ADMIN_INIT_STACK_NAME})
Execute drift detection with:
➜ make check-init-admin-drift
Detection in progress. Waiting 3 seconds...
{
  "DetectionStatus": "DETECTION_COMPLETE",
  "StackDriftStatus": "IN_SYNC",
  "DriftedStackResourceCount": 0
}
If you get a StackDriftStatus other than IN_SYNC, adjust your CloudFormation template to resolve the differences, then re-run the drift check to verify. Once you’re IN_SYNC, move on to creating the import change set.
Create Import Changeset
Creating the import change set requires a CloudFormation template and information about the resources to import, including:
- The resource type
- The logical name of the resource in your template
- The unique identifier for that resource in AWS
For instance, our CloudFormation template defines the log bucket with:
Resources:
  TerraformStateLogBucket:
    Type: 'AWS::S3::Bucket'
    ...
which means the resource type is AWS::S3::Bucket, and the logical name is TerraformStateLogBucket.
The unique identifier depends on the resource type. For an S3 bucket, it’s the bucket name. For a DynamoDB table, it’s the table name.
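To make the shape of that information concrete, here is a small Python sketch that assembles the three pieces for each of our resources into the JSON list CloudFormation expects. The import_entry helper and the IDENTIFIER_KEYS table are hypothetical names introduced for illustration, and the table only covers the two resource types used in this post. The physical names are the ones from my account, so substitute your own.

```python
import json

# Which property uniquely identifies each resource type (only the two we need).
IDENTIFIER_KEYS = {
    "AWS::S3::Bucket": "BucketName",
    "AWS::DynamoDB::Table": "TableName",
}


def import_entry(resource_type, logical_id, physical_name):
    """Describe one existing resource for a CloudFormation import."""
    return {
        "ResourceType": resource_type,
        "LogicalResourceId": logical_id,
        "ResourceIdentifier": {IDENTIFIER_KEYS[resource_type]: physical_name},
    }


resources_to_import = [
    import_entry("AWS::S3::Bucket", "TerraformStateBucket", "terraform-skeleton-state"),
    import_entry("AWS::S3::Bucket", "TerraformStateLogBucket", "terraform-skeleton-state-logs"),
    import_entry("AWS::DynamoDB::Table", "TerraformStateLockTable", "terraform-skeleton-state-locks"),
]

print(json.dumps(resources_to_import, indent=2))
```

The printed JSON has the same structure as the value passed to the --resources-to-import option of aws cloudformation create-change-set.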
Below is a make target for creating the import change set:
import-terragrunt-changeset.json:
	@aws cloudformation create-change-set \
		--stack-name ${ADMIN_INIT_STACK_NAME} \
		--change-set-name ${ADMIN_INIT_STACK_NAME}-import-terragrunt \
		--change-set-type IMPORT \
		--template-body file://init/admin/init-admin-account.cf.yml \
		--capabilities CAPABILITY_NAMED_IAM \
		--parameters \
			ParameterKey=AdminAccountId,UsePreviousValue=True \
			ParameterKey=StateBucketName,UsePreviousValue=True \
			ParameterKey=StateLogBucketName,UsePreviousValue=True \
			ParameterKey=LockTableName,UsePreviousValue=True \
		--resources-to-import "[ \
			{ \
				\"ResourceType\":\"AWS::S3::Bucket\", \
				\"LogicalResourceId\":\"TerraformStateBucket\", \
				\"ResourceIdentifier\": { \
					\"BucketName\": \"${STATE_BUCKET_NAME}\" \
				} \
			}, \
			{ \
				\"ResourceType\":\"AWS::S3::Bucket\", \
				\"LogicalResourceId\":\"TerraformStateLogBucket\", \
				\"ResourceIdentifier\": { \
					\"BucketName\": \"${STATE_LOG_BUCKET_NAME}\" \
				} \
			}, \
			{ \
				\"ResourceType\":\"AWS::DynamoDB::Table\", \
				\"LogicalResourceId\":\"TerraformStateLockTable\", \
				\"ResourceIdentifier\": { \
					\"TableName\": \"${LOCK_TABLE_NAME}\" \
				} \
			} \
		]" | tee import-terragrunt-changeset.json
CloudFormation assigns each change set a unique ID needed for describing, executing, or discarding the change set with subsequent API calls. The import-terragrunt-changeset.json target stores the change set identifier in a JSON file for our other make targets to consume.
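The saved file is just the JSON that create-change-set prints: an object with an Id and a StackId. As a rough Python sketch of how a consumer can pull the identifier back out, consider the following. The read_change_set_id helper is hypothetical, and the sample ARNs use a made-up account number and UUIDs.

```python
import json
import tempfile
from pathlib import Path


def read_change_set_id(path):
    """Return the change set ARN stored under "Id" in the saved CLI output."""
    document = json.loads(Path(path).read_text())
    return document["Id"]


# Made-up sample of the saved output; the account ID and UUIDs are placeholders.
sample = {
    "Id": "arn:aws:cloudformation:us-east-1:123456789012:changeSet/example/00000000-0000-0000-0000-000000000000",
    "StackId": "arn:aws:cloudformation:us-east-1:123456789012:stack/example/11111111-1111-1111-1111-111111111111",
}

with tempfile.TemporaryDirectory() as workdir:
    saved = Path(workdir) / "import-terragrunt-changeset.json"
    saved.write_text(json.dumps(sample))
    print(read_change_set_id(saved))
```

Because the file doubles as a make target, its presence also tells make that a change set already exists and does not need to be created again.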
Next, we want a target for describing the created change set so we can see what modifications it will make:
.PHONY: prepare-cfn-import-terragrunt
prepare-cfn-import-terragrunt: import-terragrunt-changeset.json
	$(eval CHANGE_SET_ID=$(shell jq -r .Id import-terragrunt-changeset.json))
	aws cloudformation wait change-set-create-complete \
		--change-set-name ${CHANGE_SET_ID} \
		--stack-name ${ADMIN_INIT_STACK_NAME}
	@aws cloudformation describe-change-set \
		--change-set-name ${CHANGE_SET_ID} \
		--stack-name ${ADMIN_INIT_STACK_NAME} \
		| jq '{ Changes, Status, StatusReason }'
The prepare-cfn-import-terragrunt target depends on import-terragrunt-changeset.json to create the change set and communicate its unique ID, then describes the change set once it finishes creating.
At this point, we’ll add another target for discarding the change set, in case we don’t like what we see with prepare-cfn-import-terragrunt:
.PHONY: discard-cfn-import-terragrunt
discard-cfn-import-terragrunt: import-terragrunt-changeset.json
	$(eval CHANGE_SET_ID=$(shell jq -r .Id import-terragrunt-changeset.json))
	aws cloudformation delete-change-set \
		--change-set-name ${CHANGE_SET_ID} \
		--stack-name ${ADMIN_INIT_STACK_NAME}
	@rm import-terragrunt-changeset.json
Let’s add a conventional clean target too. The -f flag keeps it from failing when there is no change set file to remove:
.PHONY: clean
clean:
	rm -f import-terragrunt-changeset.json
Now, let’s use our targets to create and describe the change set:
➜ make prepare-cfn-import-terragrunt
{
    "Id": "arn:aws:cloudformation:us-east-1:<omitted>:changeSet/tf-admin-init-import-terragrunt/179b1efd-2961-48af-95c2-6f45b96a9925",
    "StackId": "arn:aws:cloudformation:us-east-1:<omitted>:stack/tf-admin-init/8704b070-5f61-11eb-9ff1-0eea077046db"
}
aws cloudformation wait change-set-create-complete \
	--change-set-name arn:aws:cloudformation:us-east-1:<omitted>:changeSet/tf-admin-init-import-terragrunt/179b1efd-2961-48af-95c2-6f45b96a9925 \
	--stack-name tf-admin-init
{
  "Changes": [
    {
      "Type": "Resource",
      "ResourceChange": {
        "Action": "Import",
        "LogicalResourceId": "TerraformStateBucket",
        "PhysicalResourceId": "terraform-skeleton-state",
        "ResourceType": "AWS::S3::Bucket",
        "Scope": [],
        "Details": []
      }
    },
    {
      "Type": "Resource",
      "ResourceChange": {
        "Action": "Import",
        "LogicalResourceId": "TerraformStateLockTable",
        "PhysicalResourceId": "terraform-skeleton-state-locks",
        "ResourceType": "AWS::DynamoDB::Table",
        "Scope": [],
        "Details": []
      }
    },
    {
      "Type": "Resource",
      "ResourceChange": {
        "Action": "Import",
        "LogicalResourceId": "TerraformStateLogBucket",
        "PhysicalResourceId": "terraform-skeleton-state-logs",
        "ResourceType": "AWS::S3::Bucket",
        "Scope": [],
        "Details": []
      }
    }
  ],
  "Status": "CREATE_COMPLETE",
  "StatusReason": null
}
Our describe command should show three Import actions in the change list and no other changes. If you see any other actions, it means your template contains unapplied changes, and you’ll want to address those first.
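Eyeballing three entries is easy enough, but the same check can be automated. The Python sketch below walks a describe-change-set style document and reports anything that is not one of the expected imports. The unexpected_changes and change helpers are hypothetical, and the sample documents are trimmed to the two fields the check reads.

```python
def unexpected_changes(change_set, expected_imports):
    """Return descriptions of anything other than the expected Import actions."""
    problems = []
    seen = set()
    for entry in change_set.get("Changes", []):
        resource = entry["ResourceChange"]
        logical_id = resource["LogicalResourceId"]
        if resource["Action"] != "Import":
            problems.append(f"{logical_id}: unexpected action {resource['Action']}")
        elif logical_id not in expected_imports:
            problems.append(f"{logical_id}: import was not expected")
        else:
            seen.add(logical_id)
    # Anything we meant to import that never showed up is also a problem.
    for logical_id in sorted(set(expected_imports) - seen):
        problems.append(f"{logical_id}: missing from change set")
    return problems


EXPECTED = ["TerraformStateBucket", "TerraformStateLockTable", "TerraformStateLogBucket"]


def change(action, logical_id):
    """Build a trimmed entry shaped like the describe-change-set output."""
    return {
        "Type": "Resource",
        "ResourceChange": {"Action": action, "LogicalResourceId": logical_id},
    }


clean = {"Changes": [change("Import", name) for name in EXPECTED]}
print(unexpected_changes(clean, EXPECTED))

# SomeOtherResource is a made-up name standing in for an unapplied template change.
dirty = {"Changes": clean["Changes"] + [change("Modify", "SomeOtherResource")]}
print(unexpected_changes(dirty, EXPECTED))
```

The first call prints an empty list; the second prints ['SomeOtherResource: unexpected action Modify'], which is the situation where you would stop and apply the pending template change before importing.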
Execute Import
Having created the import change set and verified the actions it staged, we’ll now add a target for executing it:
.PHONY: cfn-import-terragrunt
cfn-import-terragrunt: import-terragrunt-changeset.json
	$(eval CHANGE_SET_ID=$(shell jq -r .Id import-terragrunt-changeset.json))
	aws cloudformation wait change-set-create-complete \
		--change-set-name ${CHANGE_SET_ID} \
		--stack-name ${ADMIN_INIT_STACK_NAME}
	aws cloudformation execute-change-set \
		--change-set-name ${CHANGE_SET_ID} \
		--stack-name ${ADMIN_INIT_STACK_NAME}
	@rm import-terragrunt-changeset.json
	aws cloudformation wait stack-import-complete \
		--stack-name ${ADMIN_INIT_STACK_NAME}
	$(call show_cfn_drift,${ADMIN_INIT_STACK_NAME})
Execute the import with:
➜ make cfn-import-terragrunt
aws cloudformation wait change-set-create-complete \
	--change-set-name arn:aws:cloudformation:us-east-1:<omitted>:changeSet/tf-admin-init-import-terragrunt/179b1efd-2961-48af-95c2-6f45b96a9925 \
	--stack-name tf-admin-init
aws cloudformation execute-change-set \
	--change-set-name arn:aws:cloudformation:us-east-1:<omitted>:changeSet/tf-admin-init-import-terragrunt/179b1efd-2961-48af-95c2-6f45b96a9925 \
	--stack-name tf-admin-init
aws cloudformation wait stack-import-complete \
	--stack-name tf-admin-init
{
  "DetectionStatus": "DETECTION_COMPLETE",
  "StackDriftStatus": "IN_SYNC",
  "DriftedStackResourceCount": 0
}
At the end of the cfn-import-terragrunt target, we call our show_cfn_drift helper function to verify that the properties of the resources we imported match what we specified in our template. You should see that StackDriftStatus is IN_SYNC, which means we’ve successfully imported the resources and the definitions for those resources in our template match reality.[2]
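If the status comes back as something other than IN_SYNC, the next question is which properties differ. CloudFormation exposes that through its describe-stack-resource-drifts call, which reports the expected and actual value for each drifted property. The Python sketch below condenses such a response into one line per difference. The summarize_drift helper is hypothetical, and the sample response is hand-written to mirror the KMS versus AES-256 example from the footnotes rather than captured from a real stack.

```python
def summarize_drift(response):
    """List each drifted property as 'resource path: template=... actual=...'."""
    lines = []
    for drift in response.get("StackResourceDrifts", []):
        if drift["StackResourceDriftStatus"] == "IN_SYNC":
            continue
        for diff in drift.get("PropertyDifferences", []):
            lines.append(
                f"{drift['LogicalResourceId']} {diff['PropertyPath']}: "
                f"template={diff['ExpectedValue']} actual={diff['ActualValue']}"
            )
    return lines


# Hand-written sample shaped like the API response (an assumption for illustration).
sample_response = {
    "StackResourceDrifts": [
        {
            "LogicalResourceId": "TerraformStateBucket",
            "StackResourceDriftStatus": "MODIFIED",
            "PropertyDifferences": [
                {
                    "PropertyPath": "/BucketEncryption/ServerSideEncryptionConfiguration/0/ServerSideEncryptionByDefault/SSEAlgorithm",
                    "ExpectedValue": "aws:kms",
                    "ActualValue": "AES256",
                    "DifferenceType": "NOT_EQUAL",
                }
            ],
        },
        {
            "LogicalResourceId": "TerraformStateLockTable",
            "StackResourceDriftStatus": "IN_SYNC",
            "PropertyDifferences": [],
        },
    ]
}

for line in summarize_drift(sample_response):
    print(line)
```

Each reported line corresponds to one decision: either update the template to match reality, or keep the template and let a follow-up change set bring the resource back in line.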
What’s Next
We’ve now fully separated the creation of operational infrastructure required to run Terraform from the infrastructure Terraform manages. Creating the operational infrastructure with CloudFormation has numerous benefits. We can now harden our state bucket, log bucket, and lock table in ways that were not available to us with Terragrunt managing their creation. We’ll tackle such hardening in upcoming posts.
Footnotes
1. If you omit UpdateReplacePolicy, the linter will report something like: W3011 Both UpdateReplacePolicy and DeletionPolicy are needed to protect Resources/TerraformStateBucket from deletion
2. If you see that there is drift after executing the import change set, the resources you imported have a different configuration than what you specified in your template. For instance, maybe your template specified a bucket is KMS encrypted when in reality, the bucket is AES-256 encrypted. Drift detection will report such differences. To resolve the drift:
   - Decide whether the drift is appropriate and should be retained
   - Modify the template to match any appropriate drift
   - Create a change set on the stack with the updated template
   - Verify the change set reports it will discard the inappropriate drift (or have no changes if there was no inappropriate drift)
   - Execute the change set