At the beginning of May 2014 I started contributing to a great new open source project sponsored by Deutsche Telekom: the Hardening Framework. One of the challenges in this project is an old friend: keeping things DRY. But let us start from the beginning.
Large organizations very often allow, or must tolerate, that teams choose different tools. So there is a need to support more than one toolchain to accomplish a given goal. In this case the project goal is to provide reusable infrastructure code to harden several aspects of your deployment. At the moment teams are free to choose either Chef or Puppet. Maybe more in the future, so start contributing :-)
This is best explained by example. Let's say that, for connectivity reasons, we have to use specific NTP servers in our datacenter deployment. So we have to deploy ntp and configure it to use the list of NTP servers we specified.
The focus of this example is NOT to show off the pros, cons, or best practices of either the Puppet or the Chef world. It is the acceptance/integration part: how can we be sure that we are done? Well, let's get started by going through the checklist we got.
So, let's assume we have implemented the requirements. We logged on to the server, checked that the service is up, and double-checked that it is using our list of time servers.
Done! Right?
Well, not so much. This is a very manual approach. And remember that we want to support different technology stacks on different platforms to converge our nodes, e.g. Puppet and Chef deployments on several operating systems.
We can do better! Let's see. We want to verify that the ntp package is installed, that the ntp service is running, and that /etc/ntp.conf contains our time servers.
And we want a report that tells us whether the verification failed or succeeded.
There are a million things we can do to check if a process is running or if a file contains some values. Basically we want to automate the manual approach we used to verify if a specification is met and get a nice report indicating success or failure.
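Before reaching for a library, the idea can be sketched in a few lines of plain Ruby. This is only an illustration: `Check` and `run_checks` are hypothetical names invented here, and the checks run against the local machine rather than over SSH.

```ruby
# A minimal, hypothetical check runner: each check is a description plus a
# callable that returns a truthy value when the specification is met.
Check = Struct.new(:description, :test)

# Runs every check, prints one report line per check, and returns true
# only if all of them passed.
def run_checks(checks, out = $stdout)
  results = checks.map do |check|
    passed = begin
      check.test.call ? true : false
    rescue StandardError
      false # a check that raises counts as a failure
    end
    out.puts format('%-6s %s', passed ? 'OK' : 'FAILED', check.description)
    passed
  end
  results.all?
end

# Example checks for the ntp requirement (path and server name as used in this post).
NTP_CHECKS = [
  Check.new('ntp.conf exists', -> { File.exist?('/etc/ntp.conf') }),
  Check.new('ntp.conf lists 0.pool.ntp.org',
            -> { File.read('/etc/ntp.conf').include?('0.pool.ntp.org') })
].freeze
```

Calling `run_checks(NTP_CHECKS)` prints the report and returns a boolean that a script could turn into its exit status. It works, but it covers none of the platform differences that the libraries below already handle.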
There are a lot of libraries out there that are more or less tailored to this domain. One very popular choice is RSpec:
RSpec is testing tool for the Ruby programming language. Born under the banner of Behaviour-Driven Development, it is designed to make Test-Driven Development a productive and enjoyable experience.
RSpec is a domain-specific language for testing. And there is an even better-matching candidate: Serverspec.
With serverspec, you can write RSpec tests for checking your servers are configured correctly.
Serverspec supports a lot of resource types out of the box; have a look at Resource Types. To verify our specification we can write:
describe package('ntp') do
  it { should be_installed }
end

describe service('ntp') do
  it { should be_running }
end

describe file('/etc/ntp.conf') do
  it { should contain '0.pool.ntp.org' }
  it { should contain '1.pool.ntp.org' }
end
And the best part: this is agnostic to how we provisioned our server! Manual, Chef, Puppet, SaltStack, Ansible, ... you name it.
To make it really work we obviously need some boilerplate code. This goes into the spec_helper.rb file:
require 'serverspec'
require 'pathname'
require 'net/ssh'
require 'highline/import'
require 'etc' # needed for Etc.getlogin below

include Serverspec::Helper::Ssh
include Serverspec::Helper::DetectOS

RSpec.configure do |c|

  # sudo password: prompt for it or read it from the environment
  if ENV['ASK_SUDO_PASSWORD']
    c.sudo_password = ask('Enter sudo password: ') { |q| q.echo = false }
  else
    c.sudo_password = ENV['SUDO_PASSWORD']
  end

  options = {}

  # login password: prompt for it or read it from the environment
  if ENV['ASK_LOGIN_PASSWORD']
    options[:password] = ask("\nEnter login password: ") { |q| q.echo = false }
  else
    options[:password] = ENV['LOGIN_PASSWORD']
  end

  # login user: prompt, environment, or the user running the tests
  if ENV['ASK_LOGIN_USERNAME']
    user = ask("\nEnter login username: ") { |q| q.echo = false }
  else
    user = ENV['LOGIN_USERNAME'] || ENV['user'] || Etc.getlogin
  end

  if user.nil?
    puts 'specify login user env LOGIN_USERNAME= or user='
    exit 1
  end

  c.host = ENV['TARGET_HOST']
  # Hash#merge returns a new hash, so the result has to be assigned;
  # the options set explicitly above take precedence over ~/.ssh/config
  options = Net::SSH::Config.for(c.host).merge(options)
  c.ssh = Net::SSH.start(c.host, user, options)
  c.os = backend.check_os

end
This allows us to run the tests against any server where we have SSH access.
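Two details of the helper above are easy to get wrong: the precedence used to pick the login user, and the way SSH options are combined. The following is a small sketch of both in plain Ruby. `resolve_login_user` and `combine_ssh_options` are hypothetical helper names, and dropping `nil` values before merging is my own assumption, not something the helper does.

```ruby
require 'etc'

# Resolve the login user with the same precedence as the helper above:
# LOGIN_USERNAME, then user, then the account running the tests.
# `env` is injectable so the logic can be exercised without touching ENV.
def resolve_login_user(env = ENV)
  env['LOGIN_USERNAME'] || env['user'] || Etc.getlogin
end

# Combine per-host SSH settings with explicitly given options.
# Hash#merge returns a NEW hash and leaves the receiver untouched, so the
# return value must be used (or merge! called) for the merge to take effect.
# Assumption: unset (nil) explicit options should not override host settings.
def combine_ssh_options(host_config, explicit)
  host_config.merge(explicit.reject { |_key, value| value.nil? })
end
```

With this, `combine_ssh_options(Net::SSH::Config.for(host), options)` would hand `Net::SSH.start` the settings from `~/.ssh/config` plus whatever was set explicitly.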
To support multiple test suites, let's organize them in directories and use a Rakefile to choose which suite to run.
require 'rake'
require 'rspec/core/rake_task'

suites = Dir.glob('*').select{|entry| File.directory?(entry) }

class ServerspecTask < RSpec::Core::RakeTask

  attr_accessor :target

  def spec_command

    if target.nil?
      puts "specify either env TARGET_HOST or target_host="
      exit 1
    end

    cmd = super
    "env TARGET_HOST=#{target} STANDALONE_SPEC=true #{cmd} --format documentation --no-profile"
  end

end

namespace :serverspec do
  suites.each do |suite|
    desc "Run serverspec suite #{suite}"
    ServerspecTask.new(suite.to_sym) do |t|
      t.target = ENV['TARGET_HOST'] || ENV['target_host']
      t.ruby_opts = "-I #{suite}/serverspec"
      t.pattern = "#{suite}/serverspec/*_spec.rb"
    end
  end
end
What have we got now? Spoiler alert: this is exactly how test-kitchen expects integration test suites!
± /usr/local/bin/tree .
.
├── Gemfile
├── Gemfile.lock
├── Rakefile
├── default
│   └── serverspec
│       ├── ntp_spec.rb
│       └── spec_helper.rb
└── failing
    └── serverspec
        ├── ntp_spec.rb
        └── spec_helper.rb

± rake -T
rake serverspec:default  # Run serverspec suite default
rake serverspec:failing  # Run serverspec suite failing
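The Rakefile treats every directory as a suite. A slightly stricter discovery, which only accepts directories that actually contain serverspec files, could look like the sketch below. `discover_suites` is a hypothetical helper and not part of the Rakefile shown above.

```ruby
# Return the names of all directories directly under `root` that contain a
# serverspec/ subdirectory with at least one *_spec.rb file.
def discover_suites(root)
  Dir.glob(File.join(root, '*', 'serverspec', '*_spec.rb'))
     .map { |spec| File.basename(File.dirname(File.dirname(spec))) }
     .uniq
     .sort
end
```

In the Rakefile, `suites = discover_suites('.')` would then skip unrelated directories such as a vendored gem folder.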
To test it we run it against a server. This is the output from a server which does not implement our requirement:
± ASK_LOGIN_PASSWORD=true rake serverspec:default target_host=192.168.1.222 user=stack
env TARGET_HOST=192.168.1.222 STANDALONE_SPEC=true /Users/ehaselwanter/.rvm/rubies/ruby-2.1.1/bin/ruby -I default/serverspec -S rspec default/serverspec/ntp_spec.rb --format documentation --no-profile
Enter sudo password:

Enter login password:

File "/etc/ntp.conf"
  should contain "0.pool.ntp.org" (FAILED - 1)
  should contain "1.pool.ntp.org" (FAILED - 2)

Service "ntp"
  should be running (FAILED - 3)

Package "ntp"
  should be installed (FAILED - 4)

Failures:

  1) File "/etc/ntp.conf" should contain "0.pool.ntp.org"
     Failure/Error: it { should contain '0.pool.ntp.org' }
       sudo grep -q -- 0.pool.ntp.org /etc/ntp.conf || sudo grep -qF -- 0.pool.ntp.org /etc/ntp.conf
       grep: /etc/ntp.conf: No such file or directory
grep: /etc/ntp.conf: No such file or directory

       expected File "/etc/ntp.conf" to contain "0.pool.ntp.org"
     # ./default/serverspec/ntp_spec.rb:12:in `block (2 levels) in <top (required)>'

  2) File "/etc/ntp.conf" should contain "1.pool.ntp.org"
     Failure/Error: it { should contain '1.pool.ntp.org' }
       sudo grep -q -- 1.pool.ntp.org /etc/ntp.conf || sudo grep -qF -- 1.pool.ntp.org /etc/ntp.conf
       grep: /etc/ntp.conf: No such file or directory
grep: /etc/ntp.conf: No such file or directory

       expected File "/etc/ntp.conf" to contain "1.pool.ntp.org"
     # ./default/serverspec/ntp_spec.rb:13:in `block (2 levels) in <top (required)>'

  3) Service "ntp" should be running
     Failure/Error: it { should be_running }
       sudo ps aux | grep -w -- ntp | grep -qv grep
       expected Service "ntp" to be running
     # ./default/serverspec/ntp_spec.rb:8:in `block (2 levels) in <top (required)>'

  4) Package "ntp" should be installed
     Failure/Error: it { should be_installed }
       sudo dpkg-query -f '${Status}' -W ntp | grep -E '^(install|hold) ok installed$'
       No packages found matching ntp.

       expected Package "ntp" to be installed
     # ./default/serverspec/ntp_spec.rb:4:in `block (2 levels) in <top (required)>'

Finished in 0.16283 seconds
4 examples, 4 failures

Failed examples:

rspec ./default/serverspec/ntp_spec.rb:12 # File "/etc/ntp.conf" should contain "0.pool.ntp.org"
rspec ./default/serverspec/ntp_spec.rb:13 # File "/etc/ntp.conf" should contain "1.pool.ntp.org"
rspec ./default/serverspec/ntp_spec.rb:8 # Service "ntp" should be running
rspec ./default/serverspec/ntp_spec.rb:4 # Package "ntp" should be installed

Randomized with seed 57616

env TARGET_HOST=192.168.1.222 STANDALONE_SPEC=true /Users/ehaselwanter/.rvm/rubies/ruby-2.1.1/bin/ruby -I default/serverspec -S rspec default/serverspec/ntp_spec.rb --format documentation --no-profile failed
Great, right? A report about what failed and what worked as well as how it was tested.
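The "how it was tested" part is worth a closer look. According to the report, the contain check ran `grep -q` with the expected value as a pattern and fell back to `grep -qF` (fixed string). Used as a pattern, every dot in `0.pool.ntp.org` matches any character. The sketch below mimics that behaviour in plain Ruby. It is an approximation under the assumption that Ruby's regex dialect and grep's agree for this simple pattern, and both helper names are hypothetical.

```ruby
# Rough Ruby equivalent of the two grep calls in the report above:
# try the expected value as a regular expression first, then as a fixed string.
def loosely_contains?(text, expected)
  regex_hit = begin
    text.match?(Regexp.new(expected))
  rescue RegexpError
    false # not a valid pattern, fall through to the literal comparison
  end
  regex_hit || text.include?(expected)
end

# Stricter variant: the expected value must appear literally.
def literally_contains?(text, expected)
  text.include?(expected)
end
```

For a config line like `server 0xpool-ntp.org`, the loose variant reports a match for `0.pool.ntp.org` while the literal one does not. For a hardening test suite that difference can matter.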
And now the output from a server that has the expected config applied:
± rake serverspec:default target_host=192.168.1.50
env TARGET_HOST=192.168.1.50 STANDALONE_SPEC=true /Users/ehaselwanter/.rvm/rubies/ruby-2.1.1/bin/ruby -I default/serverspec -S rspec default/serverspec/ntp_spec.rb --format documentation --no-profile
Enter sudo password:

Package "ntp"
  should be installed

Service "ntp"
  should be running

File "/etc/ntp.conf"
  should contain "0.pool.ntp.org"
  should contain "1.pool.ntp.org"

Finished in 0.22648 seconds
4 examples, 0 failures

Randomized with seed 55249
Again, great feedback and a nice report.
You can find this in the repo tests-kitchen-example.
Now it's time to provide some infrastructure-as-code so that we can converge any node to our specification. Again, this is not a contest of Puppet/Chef best practices; the focus is on the integration testing part.
Luckily there is already a Puppet module for this: puppetlabs/ntp. So we can implement it by using this module with the NTP servers we want:
# example_ntp.pp

class { '::ntp':
  servers => [ '0.pool.ntp.org', '1.pool.ntp.org' ],
}
With Chef, again the community has us covered. We can use the ntp cookbook. Implementation is as simple as using a role in the run_list:
{
  "name": "ntp",
  "default_attributes": {
    "ntp": {
      "servers": [
        "0.pool.ntp.org",
        "1.pool.ntp.org"
      ]
    }
  },
  "override_attributes": { },
  "json_class": "Chef::Role",
  "description": "NTP Role",
  "chef_type": "role",
  "run_list": [
    "recipe[ntp]"
  ]
}
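Hand-written role files are easy to get subtly wrong, for example by repeating a key, which most JSON parsers accept silently while keeping only the last value. One way to avoid that is to generate the role from Ruby. This is a sketch, not part of the example repos: `ntp_role` is a hypothetical helper, and it assumes the ntp cookbook reads its server list from `node['ntp']['servers']`.

```ruby
require 'json'

# Build the ntp role as a Ruby hash; a hash cannot silently hold the same
# key twice the way a hand-edited JSON file can.
# Assumption: the cookbook reads the servers from node['ntp']['servers'].
def ntp_role(servers)
  {
    'name' => 'ntp',
    'description' => 'NTP Role',
    'json_class' => 'Chef::Role',
    'chef_type' => 'role',
    'default_attributes' => { 'ntp' => { 'servers' => servers } },
    'override_attributes' => {},
    'run_list' => ['recipe[ntp]']
  }
end

# Serialize the role in the same pretty-printed style as the file above.
def ntp_role_json(servers)
  JSON.pretty_generate(ntp_role(servers))
end
```

Writing `ntp_role_json(%w[0.pool.ntp.org 1.pool.ntp.org])` to `roles/ntp.json` keeps the server list in one place, and the specs could use the same list.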
Now we are able to converge our node with Chef or Puppet, but we still have to run every step manually. It's time to bring everything together. Have Puppet as well as Chef converge our node and verify it automatically.
You know the answer to that: KitchenCI/Test-Kitchen. But we need a trick. Test-Kitchen must be made aware that we already have our tests somewhere, and that we want to use them in our Puppet as well as Chef integration test scenario. This is not possible at the moment, see Test Artifact Fetch Feature.
To work around this issue I implemented the kitchen-sharedtests gem with some thor tasks that hook right into test-kitchen.
± thor -T
kitchen
-------
thor kitchen:all-sharedtests # Run all test instances
thor kitchen:diagnose-sharedtests-default-nocm-ubuntu-1204 # Diagnose default-nocm-ubuntu-1204 test instance
thor kitchen:diagnose-sharedtests-default-nocm-ubuntu-1310 # Diagnose default-nocm-ubuntu-1310 test instance
thor kitchen:fetch-remote-tests # Fetch remote tests from provider.test_repo_uri
thor kitchen:run-sharedtests-default-nocm-ubuntu-1204 # Run run-sharedtests-default-nocm-ubuntu-1204 test instance
thor kitchen:run-sharedtests-default-nocm-ubuntu-1310 # Run run-sharedtests-default-nocm-ubuntu-1310 test instance
thor kitchen:verify-sharedtests-default-nocm-ubuntu-1204 # Run default-nocm-ubuntu-1204 to verify instance
You can run kitchen commands with thor tasks and it will fetch the integration test repo specified in the .kitchen.yml file.
If you want to keep your usual kitchen workflow, run ...
thor kitchen:fetch-remote-tests # Fetch remote tests from provider.test_repo_uri
... once to fetch the repo and place it in test/integration. This is obviously just a hack, but it lowers the entry barrier for moving the integration tests out of the infrastructure-as-code repo.
We will refine the example kitchen from Using Test Kitchen With Puppet. The code for the shared integration test repo approach lives in the distinct-test-repo branch.
The notable difference is that we remove the test/integration directory from the repo and hint in the .kitchen.yml where to get it from.
± grep repo .kitchen.yml
test_repo_uri: "https://github.com/ehaselwanter/tests-kitchen-example.git"
This is just for convenience and allows fetching the repo into the test/integration path:
± thor -T |grep fetch
thor kitchen:fetch-remote-tests # Fetch remote tests from provider.test_repo_uri
± thor kitchen:fetch-remote-tests
-----> create or update https://github.com/ehaselwanter/tests-kitchen-example.git
cloning https://github.com/ehaselwanter/tests-kitchen-example.git /Users/ehaselwanter/repositories/t-labs-hardening/puppet-kitchen-example/test/integration
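The "create or update" step can be pictured roughly as follows. This is not the kitchen-sharedtests implementation, only a sketch of the idea: `find_key` and `fetch_command` are hypothetical helpers, the key is searched anywhere in the YAML because its exact nesting in .kitchen.yml is not shown here, and `git pull --ff-only` is my choice for the update case.

```ruby
require 'yaml'

# Find the first value stored under `key` anywhere in a nested YAML structure.
def find_key(node, key)
  case node
  when Hash
    return node[key] if node.key?(key)
    node.each_value do |child|
      found = find_key(child, key)
      return found unless found.nil?
    end
    nil
  when Array
    node.each do |child|
      found = find_key(child, key)
      return found unless found.nil?
    end
    nil
  end
end

# Decide which git command brings `target_dir` in line with `repo_uri`:
# update an existing checkout, otherwise clone a fresh one.
def fetch_command(repo_uri, target_dir)
  if File.directory?(File.join(target_dir, '.git'))
    ['git', '-C', target_dir, 'pull', '--ff-only']
  else
    ['git', 'clone', repo_uri, target_dir]
  end
end
```

A caller would read the URI with `find_key(YAML.load_file('.kitchen.yml'), 'test_repo_uri')` and then run `system(*fetch_command(uri, 'test/integration'))`.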
Another change is an update to how kitchen-puppet finds its files. This is now more aligned with how kitchen does it when converging a node with Chef. We place the manifest to run in the test suite. The manifests/site.pp has to move because the configuration is tied to the specification, so we move it into a puppet folder. Once again, this is not supported by test-kitchen at the time of writing, but it is made possible through a little hack described in a pull request (add puppet to the list of ignored names) together with a fake gem that makes it work without the pull request being merged (fake busser-puppet).
Relevant part of the directory tree:
± /usr/local/bin/tree test
test
└── integration
    ├── default
    │   ├── puppet
    │   │   └── manifests
    │   │       └── site.pp
    │   └── serverspec
    │       ├── ntp_spec.rb
    │       └── spec_helper.rb
Now we are ready to run.
± kitchen list
Instance                  Driver   Provisioner  Last Action
default-nocm-ubuntu-1204  Vagrant  PuppetApply  <Not Created>
default-nocm-ubuntu-1310  Vagrant  PuppetApply  <Not Created>
We can add more platforms by editing the kitchen file. To run all the suites on all the platforms use:
kitchen test
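"All the suites on all the platforms" means one instance per suite and platform pair, which is where names like default-nocm-ubuntu-1204 come from. The sketch below approximates that naming in plain Ruby. `instance_names` is a hypothetical helper, the normalisation rules are my approximation of what Test Kitchen does, and the platform names in the usage note are assumptions since the kitchen file is not shown here.

```ruby
# Approximate how instance names are derived: every suite is combined with
# every platform, separators are normalised to hyphens, and dots are dropped.
def instance_names(suites, platforms)
  suites.product(platforms).map do |suite, platform|
    "#{suite}-#{platform}".gsub(%r{[_,/]}, '-').delete('.')
  end
end
```

With suites `['default']` and assumed platform names `['nocm_ubuntu-12.04', 'nocm_ubuntu-13.10']`, this yields the two instance names listed by `kitchen list`. Adding a platform or a suite multiplies the number of instances accordingly.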
For the Chef example, we once again fetch the tests:
± thor kitchen:fetch-remote-tests
-----> create or update https://github.com/ehaselwanter/tests-kitchen-example.git
cloning https://github.com/ehaselwanter/tests-kitchen-example.git /Users/ehaselwanter/repositories/t-labs-hardening/chef-kitchen-example/test/integration
Next we need a Berksfile specifying our dependencies
site :opscode
metadata
cookbook "ntp"
and a role in the test suite to run our ntp recipe with the servers we want to have (relevant part of the directory tree):
test
└── integration
    ├── default
    │   ├── roles
    │   │   └── ntp.json
    │   └── serverspec
    │       ├── ntp_spec.rb
    │       └── spec_helper.rb
This, again, gives us:
± kitchen list
Instance                  Driver   Provisioner  Last Action
default-nocm-ubuntu-1204  Vagrant  ChefSolo     <Not Created>
default-nocm-ubuntu-1310  Vagrant  ChefSolo     <Not Created>
We can add more platforms by editing the kitchen file. To run all the suites on all the platforms use:
kitchen test
We dramatically increased the value of integration tests in test-kitchen. As shown, we now are able to verify our specification against any node we have ssh access to. And we can use the same integration test suites in our Chef or Puppet development cycle.
It still does not feel very natural to fetch the external repo first, but there are some pending feature requests for test-kitchen to get that sorted. The main problem here is the motivation for creating first-class support for external integration tests, as very few projects need to support more than one configuration management tool.
As shown in this article, there is real value in doing so, because you immediately benefit from the standalone usage scenario.