GPG + SSH ? (Part 0 bis)

This will be a small walkthrough on how to generate an authentication subkey with GPG and integrate it with SSH. First of all, bear in mind that you will need a GPG key in order to generate this subkey; if you have not generated one yet, refer to part 0 of this series.

Authentication Key

In order to generate an authentication subkey, as I mentioned before, you will need to have a primary key already generated. Let's check that with

$ gpg -K                                                       
/home/<USER>/.gnupg/pubring.kbx
-------------------------------
sec   rsa4096 2019-12-19 [SC]
      F0ASSDF32440E42342FDF25GHB8F9CF4EC8EFE1B
uid           [ultimate] Your Name (Comment) <[email protected]>
ssb   rsa4096 2019-12-19 [E]

Then we will edit our key using the key ID shown above with the command gpg --expert --edit-key F0ASSDF32440E42342FDF25GHB8F9CF4EC8EFE1B. We'll be prompted with a GPG tty where we will be able to add a new key with "Authentication" capabilities. By default, as we have seen in part 0, GnuPG creates new keys with Sign and Encrypt capabilities, so we need to make sure to remove those since we already have keys for those purposes.
Now answer what you are prompted and you'll be good to go. I'll leave a set of steps taken from Yubico's website that I've found very useful, followed by a sketch of the interactive session.

  1. Insert the YubiKey into the USB port if it is not already plugged in.
  2. Enter the GPG command: gpg --expert --edit-key 1234ABC (where 1234ABC is the key ID of your key)
  3. Enter the command: addkey
  4. Enter the passphrase for the key. Note that this is the passphrase, and not the PIN or admin PIN.
  5. You are prompted to specify the type of key. Enter 8 for RSA.
  6. The initial default will be Sign and Encrypt. To end up with an authentication-only key, toggle S to disable sign, E to disable encrypt, and A to enable authentication.
  7. Once you can confirm that Authenticate is the only currently allowed action, select Q to finish the selection.
  8. Specify the key size.
  9. Specify the expiration of the authentication key (this should be the same expiration as the key).
  10. When prompted to save your changes, enter y (yes).
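
For reference, the interactive session looks roughly like this. It is only a sketch of the dialog (wording varies a bit between GnuPG versions), using the example key ID from above:

$ gpg --expert --edit-key F0ASSDF32440E42342FDF25GHB8F9CF4EC8EFE1B
gpg> addkey
Please select what kind of key you want:
   ...
   (8) RSA (set your own capabilities)
Your selection? 8

Possible actions for this RSA key: Sign Encrypt Authenticate
Current allowed actions: Sign Encrypt
Your selection? S
Your selection? E
Your selection? A
Current allowed actions: Authenticate
Your selection? Q
What keysize do you want? (3072) 4096
Key is valid for? (0) 1y
gpg> save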

Once you’re done, run gpg -K. The output should look something like this.

$> gpg -K                                                       
/home/<USER>/.gnupg/pubring.kbx
-------------------------------
sec   rsa4096 2019-12-19 [SC]
      F0ASSDF32440E42342FDF25GHB8F9CF4EC8EFE1B
uid           [ultimate] Your Name (Comment) <[email protected]>
ssb   rsa4096 2019-12-19 [E]
ssb   rsa4096 2019-12-27 [A]

Note that we now have a new subkey with "[A]" capabilities.

Integrating SSH

First of all, let me clarify that using GPG does not make your SSH connections more secure; it just changes a bit the way you manage your SSH keys, and is sometimes a bit more convenient.
To make use of the new authentication subkey we've just created, we need to tell gpg-agent that we want it to handle the requests coming from SSH. For that we need to enable SSH support by running the following command.

$> echo "enable-ssh-support" >> ~/.gnupn/gpg-agent.conf

Additionally, you might want to specify which subkey you want to use. We do this by adding the keygrip of our key to ~/.gnupg/sshcontrol.

$> gpg -K --with-keygrip
/home/<USER>/.gnupg/pubring.kbx
-------------------------------
sec   rsa4096 2019-12-19 [SC]
      F032E440E7B8F960B9A4D68BA32DDCF4EC8EFE1B
      Keygrip = 872C0BA0BB0EAB6600EE9A14146819B41DFF8B73
uid           [ultimate] Lucas Contre (Contre) <[email protected]>
ssb   rsa4096 2019-12-19 [E]
      Keygrip = 96FD0AD76EB01C5475122670A0F85961C7952A1E
ssb   rsa4096 2019-12-19 [A]
      Keygrip = 037A6603BBDA3461BF672F8071037AK66ACACE34

$ echo 037A6603BBDA3461BF672F8071037AK66ACACE34 >> ~/.gnupg/sshcontrol

Last but not least, you need to tell SSH where to find these keys. This is done by pointing the SSH_AUTH_SOCK environment variable at the gpg-agent socket instead of the regular ssh-agent one.
To be sure this variable is always set, add these two lines to your .bashrc, .zshrc or whatever shell run-commands file you use.

export SSH_AUTH_SOCK=$(gpgconf --list-dirs agent-ssh-socket)
gpgconf --launch gpg-agent

Now you're done! In order to share your public SSH keys, just run ssh-add -L to list all of them and place them on any server you like.
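
For example, assuming gpg-agent serves only this one key and that [email protected] is a placeholder for your own server, you could install it on a remote host with something like:

$> ssh-add -L
$> ssh-add -L | ssh [email protected] 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'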



Author: Lucas Contreras
Email: [email protected]
GPG: gpg.lucascontre.site

Dealing with your private key (Part 1)

Now that you’ve managed to create your own GPG key it’s time to understand a bit more about how GnuPG works.

The first question we need to ask is where our keys are stored. The answer depends on which version of GnuPG you are running: from version 2.1 onwards the method of storing your keys has changed. Versions prior to 2.1 kept the keys in two separate keyrings under the ~/.gnupg directory, pubring.gpg and secring.gpg; you can imagine which stores which.

Now, the file secring.gpg is not used to store the secret keys anymore. Instead they are stored in the key store of the gpg-agent (a folder named private-keys-v1.d below the GnuPG home directory ~/.gnupg), with a unique identifier for the key called the "Keygrip" as the name of the file (i.e. ~/.gnupg/private-keys-v1.d/{KEYGRIP}.key). Additionally, the pubring.gpg file has been replaced with pubring.kbx, which remains under the GnuPG directory. This "database" stores all of our friends' keys, metadata and certificates. We might want to back up this file.
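
A quick way to see which file on disk belongs to which key (paths assume the default GnuPG home directory):

$ gpg -K --with-keygrip                # shows the keygrip of each secret (sub)key
$ ls ~/.gnupg/private-keys-v1.d/       # one {KEYGRIP}.key file per secret (sub)key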

Now that we know where our (I bet passphrase-encrypted) private key is stored, we naturally ask the next question: is it safe down there? Well… it's passphrase protected, right? It depends on the degree of security/privacy obsession you can handle. Let's take a look at this comment on https://security.stackexchange.com/.

On the days when my paranoia is like a ripe tomato, begging me to pick it, I split the private key (naturally it is already passphrase-protected) in half, then make a 3rd string by XOR-ing them together. Then I use simple password encryption (gpg --symmetric) on each string, and put each on a remote server on a different continent. Ideally, each remote server is with a different ISP or cloud provider.
But as the medicine was working – at least until I realized how ambitious the NSA has been – what I've actually done in the past is merely encrypted the (whole) private key (again using gpg --symmetric) and put it on my smartphone.
Now, having read the other answers, I’m finding the idea of three QR codes, embedded into three family photos, blindingly attractive. Time for stronger medicine?

Source: https://security.stackexchange.com/questions/51771/where-do-you-store-your-personal-private-gpg-key

Whether you want to keep your private key on your local (probably proprietary) machine or do what our dear friend Darren did is up to you. I will show how to import, back up and retrieve/restore that private key in what I think is the most secure and practical way.

First, we need some way to back up our keys in case we lose them. I suggest exporting the private key and printing it on a piece of paper; you will only need this backup if you somehow lose the private key. The following command will help you do so.

$ gpg --armor --export-secret-key [email protected]
-----BEGIN PGP PRIVATE KEY BLOCK-----

lQdGBF6kqnEBEAD3TDT/BfbKQE4aqOSmSAsLCsQ8HCt3HATXPBgDWhSUzy3xyQsl
x4oHGCnOpG5bBaUF3LRZh5GFsQovjfMVy11JeFcNkJO3eJRvwGgS98CiKW72HI+/
{ . . . }
3K/UBeKAtimIZagOWBpwX9OJehVcFwws4ToCshnyio2rhU79HWutJFtls/oE8HJc
Rc4Y9PMRfu7nuiCcNdc9Lp1BiaTymnQDLECi3bNtstZEnUGkzgvGwTX7DAZ5BDlY
KMBBP60EgSDpp1eG+4z8M/0O9NV2TQ==
=AXh6
-----END PGP PRIVATE KEY BLOCK-----

Print that on paper and you'll be good to go. Now that you've backed up your keys, let's see how we can export them and use them on different devices. There are several ways we can achieve this.

Having multiple copies of our private key

A simple and fast solution, if we don't mind having our keys on multiple devices, is to simply export them into a file:

gpg --armor --export-secret-key [email protected] > private.key

and then just import the key on the other machine with:

gpg --import private.key

This is far from the safest option, but it lets us decrypt no matter which device we are working on.

Store it on a USB stick

Another option is to simply store it on a USB stick. This way you can always have your private key with you, and there's no need to store it on any machine you don't trust. I haven't tried this solution myself beyond a PoC to be sure it works.

Basically, what you want to do first is move the key file from ~/.gnupg/private-keys-v1.d onto the USB stick.

mv ~/.gnupg/private-keys-v1.d/{KEYGRIP}.key /PATH/TO/USB

Be aware that moving these files off your machine means you will need to have the USB stick plugged in whenever you need to decrypt something.

You will probably find more than one {KEYGRIP}.key file, each of them corresponding to a subkey. You can tell which one belongs to which by simply listing your keys with gpg -K --with-keygrip.

Now that you've moved the files onto the USB stick, you will need to create a symbolic link back into the GnuPG key store.

ln -s /path/to/usb/{KEYGRIP}.key ~/.gnupg/private-keys-v1.d/{KEYGRIP}.key

We will need to do this for every single subkey we want to export. Once you are done, check that you can access these keys (with the USB stick plugged in) by running gpg -K. Bear in mind that you should always mount your device on the same path, otherwise the symlinks won't work.

Note: You will need to import your public key as well if you want gpg to recognize your keys. They need to be part of the pubring.kbx database. Source.

Yubikey or SmartCard

Finally, in my opinion the most secure/practical trade-off for storing your private key is a YubiKey. For those who don't know, a YubiKey is a hardware authentication device manufactured by Yubico that supports one-time passwords, public-key cryptography and authentication, and the Universal 2nd Factor (U2F) and FIDO2 protocols.

A detailed step-by-step on how to export your secret key to a YubiKey will be the starting point of part 2 of this GnuPG series.



Author: Lucas Contreras


Getting started with GnuPG (Part 0)

GnuPG is a complete and free implementation of the OpenPGP standard, and this is just part 0 of who knows how many parts in which I will explain my basic understanding of GPG and how useful it is on a daily basis. If you wish to understand in more detail how GPG works, please refer to the RFC, or here is a great post that captures the essence of it quite deeply.

My use case

No, I don't have my public key on any keyserver, and no, I don't use it to send encrypted messages, not mostly at least. I use GnuPG to encrypt any kind of data/files I want to keep safe, to encrypt/decrypt day-to-day info I need to grab quickly on systems I use but don't fully trust, to log in to some of my servers with the ssh and gpg agents, and last but not least to retrieve all of my passwords securely from any device.

Getting me and my stack of devices used to GPG was a long journey, and I'm not sure it's over yet. But it started here, by creating my first GPG key pair, and that's what this post is about.

Generating a new GPG key pair

By default GnuPG generates one primary (also called master) key with the abilities (also called flags) of signing and certifying [SC].

Certification vs. signing: signing is an action against arbitrary data (as I understand it, attesting that the data was sent by whoever it is supposed to be from). Certification is the signing of another key. Ironically, the act of certifying a key is universally called "key signing" (as I understand it, saying "Hey, this is my public key" or "Hey, I trust this key belongs to who it says it does"). Just embrace the contradiction.

gpg --full-generate-key
Please select what kind of key you want:
(1) RSA and RSA (default)
(2) DSA and Elgamal
(3) DSA (sign only)
(4) RSA (sign only)

Let's go with the default option, which indicates we are going to use the RSA algorithm and generate one primary key for Signing and Certifying [SC] and another subkey for Encrypting [E]. We will also specify a key size of 4096 bits and set the expiration date to 1 year from now. We will then be prompted with an interface and asked for a passphrase.

Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (3072) 4096
Requested keysize is 4096 bits
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 1y
Key expires at Mon Apr 12 17:43:02 2021 UTC
Is this correct? (y/N) y

Once we've specified this, we will be asked for some info to build our User ID (UID), which is basically the name and email of the user and is stored in one or more UID entries under the primary key.

GnuPG needs to construct a user ID to identify your key.

Real name: Fulanito Detal
Email address: [email protected]
Comment: Fulanito's keypair
You selected this USER-ID:
"Fulanito Detal (Fulanito's keypair) <[email protected]>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O

After we've entered the needed info, GnuPG will create a new key pair for us.

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
gpg: key 363690E797C0E0A7 marked as ultimately trusted
gpg: revocation certificate stored as '/root/.gnupg/openpgp-revocs.d/1D537C302A599F7BFC55C260363690E797C0E0A7.rev'
public and secret key created and signed.

After it's created we should see an output of this kind. We can also access this info later with gpg -k, which lists our public keys, or gpg -K, which lists our private keys.

pub   rsa4096 2020-04-12 [SC] [expires: 2021-04-12]
      1D537C302A599F7BFC55C260363690E797C0E0A7
uid           Fulanito Detal (Fulanito's keypair) <[email protected]>
sub   rsa4096 2020-04-12 [E] [expires: 2021-04-12]

Notice that a revocation certificate has been created under /root/.gnupg/openpgp-revocs.d/02A59{BlaBlaBla}60BH36.rev. If your GPG private key becomes compromised, you need to revoke it to warn others not to trust future signatures or encrypt data to your public key. However, by the time a key compromise happens, you might not have your GPG key available, for instance if it resided on hardware stolen from you, or if the attacker removed it after accessing it. A revocation certificate consists of a signed message, stating in machine-readable form that a key no longer has validity for future cryptographic operations. Anyone with this certificate can revoke your private key. I recommend printing it out and storing it in a secure location.
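
If you ever do need it, revoking boils down to importing that certificate into your keyring and then redistributing the public key. A rough sketch using the example fingerprint and UID from above (note that recent GnuPG versions insert a safety colon before the BEGIN line of the .rev file, which you must remove with a text editor before importing):

$ gpg --import /root/.gnupg/openpgp-revocs.d/1D537C302A599F7BFC55C260363690E797C0E0A7.rev
$ gpg --armor --export [email protected] > revoked-public.key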

Finally, now that you have managed to create your own key pair, you can start sharing your public key and decrypting and signing messages. To get used to the gpg CLI tool, I suggest checking the following cheat sheet until you are comfortable with it.

$ curl cheat.sh/gpg
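
To give you a taste, here are a few of the day-to-day commands you will find there, using the example UID from this post and a hypothetical notes.txt file:

$ gpg --armor --export [email protected] > public.key    # share your public key
$ gpg --encrypt --recipient [email protected] notes.txt  # produces notes.txt.gpg
$ gpg --decrypt notes.txt.gpg
$ gpg --detach-sign notes.txt                              # produces notes.txt.sig
$ gpg --verify notes.txt.sig notes.txt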



Author: Lucas Contreras


How I organize my Terraform files

Having one folder per provider and handling outputs

I'm not sure this is the best way to structure a Terraform project, but it really works for me. I often use Terraform with more than one provider (e.g. AWS, Cloudflare, GitLab, etc.), so I like to have one folder per provider. To do so, I use Terraform modules.

This is what my terraform folder structure looks like.

$ tree ./terraform
├── aws
│   ├── aws.tf
│   ├── cloudwatch.tf
│   ├── outputs.tf
│   ├── policies
│   │   └── assume_role_policy.json
│   ├── role.tf
│   ├── security-groups.tf
│   ├── server.tf
│   └── sns.tf
├── cloudflare
│   ├── cloudflare.tf
│   ├── outputs.tf
│   └── site.tf
├── config.tf
└── terraform.tf

3 directories, 13 files

And this is what's inside terraform.tf:

terraform {
  backend "s3" {
    bucket = "mastertv-terraform-states"
    key    = "mtv-store/terraform.tfstate"
    region = "us-east-1"
  }
}

module "cloudflare" {
  source = "./cloudflare"
  config = var.cloudflare
  # This line makes the aws outputs available inside the cloudflare module.
  outputs = {
    server_public_ip = module.aws.server_public_ip
  }
}

module "aws" {
  source  = "./aws"
  config  = var.aws
  outputs = {}
}

# This exposes the output at the root, outside the module.
output "server_public_ip" {
  value = module.aws.server_public_ip
}

In order to make terraform outputs available we need to reference them as shown in terraform.tf.

server_public_ip will be the name of the output, while module.aws.server_public_ip refers to the output defined inside any of the files within the ./aws module. In this case all of the module's outputs are contained in a single ./aws/outputs.tf file, and they all look something like this.

output "server_public_ip" {
value = aws_instance.mtv-store-ec2.public_ip
}

Notice that the name of the output server_public_ip is the one we are referencing with module.aws.server_public_ip.

Then you can access the output by running terraform output server_public_ip from the root directory.
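
If you prefer something scriptable, terraform output can also dump every root-level output as JSON (the output name below is the example one from this post):

$> terraform output server_public_ip
$> terraform output -json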

In order to share outputs among modules I make use of input variables: I call a module and pass the output as an input in the module block, as shown above in the cloudflare module block of terraform.tf. Nevertheless, this is not enough to access these input variables; we need to declare them in the child module first. To avoid having too many files, I do it in the same file where I declare the provider. This is the example of cloudflare.tf; don't worry, it sounds harder than it is.

provider "cloudflare" {
version = "~> 2.0"
email = var.config.email
api_key = var.config.api_key
api_token = var.config.api_token

}

#Input variable of the module, it's declared here but not defined
variable "config" {}
variable "outputs" {}

Conclusion

Terraform is a great tool, but it can lead to misunderstandings if it's not well organized. This way of organizing files lets me have full control of the infrastructure from a single config.tf file and share outputs easily among modules. A downside to this solution is that some IDEs will not autocomplete either outputs or variables.
If you got this far, thank you very much for reading. If you have any questions or opinions, or you've found any mistake in the text, please feel free to write to me; my contact details can be found in the /about section.



Author: Lucas Contreras


Postmortem Sitemetrics

During the last month, I've been working side by side with my friend and coworker Lucas Contreras on a web application. In this post, I will try to highlight both the things that went great and the things that went not-so-great.

The problem.

Everything we do in our daily lives arises from a problem we want to solve. This is no less true in the field of software development.

One day, I was finishing another project of my own when Lucas asked me if I wanted to be part of a brand-new monitoring project, the very pinnacle of the SRE discipline. We were asked to monitor both latency and error budgets of a large-scale e-commerce site, for which we had to develop from scratch a fully automated dashboard that displays all the previously fed info. It aimed to replace an already existing tool that, due to its lack of maintenance, had suddenly stopped working. We all know that things don't just stop working from time to time for no reason, but in this case tracing the cause and fixing the tool would have been a waste of time. The thing is, this old tool was written by a person who was no longer part of the team, and the tool was rather difficult to understand in terms of both code and design.

As if the situation wasn't complicated enough already, we were told that this tool was used by very important people with high ranks, fine suits, and expensive shoes. People whose decisions can determine the future of our company.

As you might guess, it seemed kind of unlikely that the two (not only not junior but "shadowing") members of the team would be put in charge of such a task. But as you might have experienced, seniority is (sometimes) something that companies like to play with.

So… what do we do?

After a short meeting with the client, we got a bit (and just a bit) more knowledge on the whole thing. We were given three weeks to recreate the whole system that was working before but with added functionality.

We started discussing a plan that would allow us to get it done in time. The first idea was to use React to develop the frontend and make a RESTful API for the backend, but we ditched it rather quickly: the time constraint didn't allow us to use a technology we were inexperienced in.

In the end, we settled on AdonisJS, a fully-fledged full-stack JavaScript web framework made for when you need to deliver software as fast as bread comes out of the oven. This full-stack monster comes packed with a handful of solutions that made our development much easier and let us focus on the design of the application. Things like authentication, connecting to the DB and integrating an ORM were not an issue, so the next step was to think about how we were going to get the data from the site (latency metrics and error budget). Luckily, the old tool already used Splunk for that.

App Infrastructure + 3 microservices
Having these scheduled Splunk reports that could easily be integrated with our dashboard over webhooks, all that was left was to build a friendly REST API where Splunk could POST all this data.

After some faulty merge requests and pushes directly to master, we managed to have the dashboard set up except for the login. The large-scale company we were developing this app for uses LDAP. Yes, LDAP for everything. Dealing with it is kind of a pain, but, not by accident, we had already abstracted away that complexity by building an API for this LDAP instance (you can read this post on building a microservice stack for more info). This way, something that could have taken us weeks was solved in a matter of days. Nevertheless, it added a dependency on nothing but the authentication, which means that if the API is down for some reason we wouldn't be able to access our app. Finally, we solved this by persisting the user login in the dashboard database and updating it (if needed) every time the user changes their LDAP password.

Conclusion

The most practical solution is usually the best one. Especially when you have limited time to do something (and that’s always). Adonis allowed us to quickly develop a working solution that met all the requirements we were given.

Remember to check out Lucas Contreras' blog, where he talks about more techy things.



Author: Lucas Contreras


A solution I'm partially proud of.

Sometimes coming up with the proper solution to a problem doesn't depend just on your ability to code or how witty you are; sometimes a little bit of patience and soft skills are needed. As you might have read in earlier posts, I'm part of an SRE team in a consulting company with HQ in Argentina. I'm currently working for an e-commerce company based in the USA, and all of the <<Level 1>> site monitoring relies on us, or at least a very big part of it does.

It was January 2018 when that hopeless ticket was opened. Its goal was clearly displayed in the title:

“Monitor disk space usage of new EC2 instances under the auto scaling group [Asg Name]”

For those who are not familiar with the AWS world and don't know what an Auto Scaling group or an EC2 is, let me briefly explain. EC2s under an ASG are something like container clusters but with "bare metal" machines. Basically, an EC2 instance is what AWS calls its servers; there are several types of them and you can purchase them in ways you wouldn't imagine: per hour, per use, per instance, per color, shape and all. There's even something like stock prices for these things. If you have several EC2 instances serving a single purpose, you might want to launch them all under an ASG (Auto Scaling Group). This way, you ensure the right number of EC2s is running so the app doesn't overload. Everything scales automatically if an EC2 fails or if more instances are needed, and if they are, you'd better prepare your wallet.

After something like 13 months without being solved, there it was: the impossible ticket, waiting to be solved. On the other side, a hopeless young IT guy, managing systems he didn't fully understand, working with people who had even less scope on the matter. As you would expect, an unsolved ticket that old began to make some noise among the suit-and-tie guys. So almost a year and two months later, at our weekly catch-up meeting, our manager took the floor and said:

– "OK guys. You might have heard about MON-5379." (Everybody obviously knew what he was talking about.)
– "It's about to celebrate its first birthday," he continued sarcastically. "So guess what? We are stopping all ongoing projects until we get this done. Now, who wants to take care of the ticket?" he asked.

Believe it or not, by that time, I hadn’t even read the ticket. But since everyone was looking sideways in order to avoid the situation, I thought it would be a nice opportunity to stand out. I raised my hand. Big mistake. Everybody looked at me with their best wtf faces while I said:

– “I think I can take care of it“ - Oh boy, what was I getting into.
– "GREAT!" shouted my manager. End of the meeting.

Truth is, that ticket hadn't been solved because the team didn't have the required skills at the time. The ticket was fully based on cloud infrastructure, and since our client had recently migrated to the cloud, none of them were actually familiar with AWS back then. I was no expert either.
If you've paid attention you might have realized what the problem was: how do we monitor anything on a machine that may die at any moment while another one waits to take its place? It's 2019, this is an already addressed issue, we've already solved monitoring on clusters, haven't we?

Yes, in a way. After some really intense research (literally the first post on Stack Overflow), I found that AWS already has a solution for that. Auto Scaling groups have this thing called ASG webhooks (spoiler: they are regular webhooks) where you can define actions to be taken every time the thing scales. But since the Auto Scaling group was not in an account we had access to, adding webhooks was not an option. Trying to explain this to the client is worthless; they don't hire you to discuss technical stuff, they just want the thing monitored in case it fails, or at least that's what I thought. By the way, if the ASG takes care of scaling its instances up and down, is it really necessary to add monitoring to it? Sadly, not a question to be asked in these circumstances. I already had a one-year-old ticket and I needed to get it done so as not to delay my other projects.

After a day or two of searching the web, I went to my team leader with the following draft of an approach on how to monitor the instances cross-account. Since we are responsible for the monitoring infrastructure, our aim was to have all resources almost fully independent from their maintenance, or at least to deploy the least amount of them on their account. Deploying all of the solution on our account was impossible since the instances and cluster were all on theirs. So I managed to devise the following solution, which required the client to set up just an event bus. (AWS event buses are great for these cases, but for the sake of making this post less boring I'll have to ask you to Google every AWS service you don't know.)

Rough draft I made of the cross account workaround

Since I was rather new to the team I didn't get to talk to the client that much, but what I had inferred from my co-workers and managers was that the client was an evil monster that didn't want to collaborate or even work with us on solving anything. So I was asked to develop this solution with two of our own accounts first, in order to convince "the evil monster"; otherwise they wouldn't trust us. After a long week dealing with automating this as much as possible, so I didn't have to replicate it by hand on their account, I got to talk to them on a call to show them what we had. All of this just to find out they were super nice and knowledgeable guys who literally said:

– "There's no way we are letting you guys do all of that work when you can just add the resources to our account. We have no problem with that, you could just have asked. I'm giving you permission so you can submit a pull request to our IaC template."

I was two lines of code away from murdering someone. Luckily, I contained myself and wrote this blog post instead.

Finally, I submitted the PR to their CloudFormation template (CloudFormation is AWS's IaC service) and managed to solve the problem with half the resources and a third of the code.

Conclusion

There's more to the IT world than code and good practices. Sharing ideas, being empathetic and keeping communication continuous and fluid are as important as writing clean code.



Author: Lucas Contreras


The Shhlack-bot

If you haven't read my other post on microservices… don't worry, it's not necessary for understanding this one, but it will make more sense if you have.

Back in the days when the #Microservices&APIs revolution was a trending topic in my team, we thought that some critical alerts could be sent via Slack. We all agreed that an API-Chat would be needed if we wanted to do things right. Luckily we had two trainees whose skills needed to be tested, so we put them in charge of developing a RESTful API-Chat able to abstract away the various ways of sending messages to different chat platforms. Since Slack was the main chat platform, we decided to go with that integration first.

Once the API was finished we realized we no longer needed to deal with Slack apps, attaching Slack bots to channels or setting up tedious webhooks. We became completely independent of modules like python-slackclient or node-slack-sdk. Everyone in the office could send a message through Slack with a single curl.

By then the office was getting more and more crowded as new people were hired. We were running out of quiet places, and every time someone had an important meeting we hopelessly asked for silence.

It was time to demonstrate how powerful this kind of abstraction could be, so my friend and I decided to build the Shhlack-bot. From a NodeMCU (ESP8266) and a sound detection sensor (KY-037) we managed to build our toy: a simple device that sends a message via Slack whenever the office gets too noisy.

At first sight the spinning wheel on the sensor, with which you regulate the sound level, looks like it has infinite turns. We thought our little buddy wasn't working until we read that it has only ten turns; after that you hear a gentle *click* letting you know that you are at the maximum/minimum sensitivity possible, even though you can still turn the wheel left or right. Only then did we figure out how to properly set the sound levels.

Spamming slack be like..

In case you want to replicate our Shhlack-bot, I'll leave the code and technologies used in this project at the end of the post. Please consider leaving a comment or upvoting this post in the following subreddit.



Author: Lucas Contreras

Leave a comment: Join the Reddit discussion

My experience with the Microservices approach

Introduction

It was a few months ago when a friend and co-worker first told me about the microservices approach. As you might or might not know, I'm part of a Site Operations team where we provide service to a client that is sometimes complicated in terms of bureaucracy, which means that we have to adapt to their policy and permission changes all the time, making much of our work's complexity quite arbitrary. It was then that we thought a service-oriented approach might save us some time on day-to-day tasks. Before telling you about my experience, I invite you to read about what service-oriented modeling (SOM) is.

My experience

The idea of making our daily work more service-oriented was already rumbling in our heads when we first set up a meeting with our manager and told him what we had in mind. A few words with him and one or two rough drafts opened the way to a more formal meeting with the whole team in order to present what was going to be "The future of Sitetops", a "New way of working", "A game changer", "The microservices approach": the SaaS (SiteOps as a Service) project.

We had one week to prepare some slides and think about how we were going to convince the rest of the team that this was the way to go. After a long meeting with quite a lot of arguments and motivational quotes on how this would be a great solution to our problems, it came to an end. It was a fact: from that moment on, our job was to think in a more service-oriented way.

Along with the now-urgent need to make everything-as-a-service, my friend and I got assigned a new development project. We had two months to develop a full MVP that would interact with our client's ticket manager in order to easily and quickly manage all the incidents on the site. The aim of the tool was to centralize, with just a few clicks, all the ways our client had to report an incident over the different communication software. After a quick analysis we realized our app would have to communicate with at least three of our client's services/tools. It was the perfect excuse not to let our idea of microservices die along with all the words said in that meeting. Now let's make a list of the problems we had at the time.

Problems to solve

Our main difficulty was that we had quite a lot of scripts that interact in very different ways with the different applications the client has. Whenever anything changed on their side, most of our monitoring failed. Debugging where the error was coming from was a pain in the ass. And worst of all were those critical alerts that are supposed to be triggered on very specific occasions (but when they do, you know the thing's fucked up); those always failed silently and we rarely realized in time.
Another big issue was that a significant amount of code we had running was kind of legacy, and legacy is a polite way of saying it was written in Perl [*takes a deep breath… continues writing*]. And as you might know, Perl doesn't have that magic from problem import solution Python does. Making these complex and large pieces of code easier to manage was something that needed to be solved.
Even if we got the opportunity to do the thing in a language of our preference, some of their APIs were quite difficult to work with, and the lack of consistency between them was something we needed to get rid of. Oh! They also expected us to hand in a fully functional MVP in two months.

How we faced the problem

After telling everyone how we were planning to solve this stack of problems with a single .pop(), the most heard-of and trendy tech to use was serverless. Since we didn't have a clue what serverless was, we decided to go for it. We established that if we couldn't manage to have a simple serverless API up and running by the end of the week, we would evaluate doing it with containers or just common web servers. The PoC was a success, and within a day or two we got our first AWS Lambda function up and running in the cloud, completely serverless.
In something like two months we had a single app that consumes three independent services (also developed and deployed alongside the app) which were able to scale quickly and could be reused by the whole team for their daily operations. Of course, as each service is context agnostic, they have their own repos and are independently deployed.

Our app infrastructure looked something like this:

App Infrastructure + 3 microservices
By the end of the project we had the MVP ready, and with it a fully composable application whose pieces can be selected and assembled in various combinations to solve a wide variety of problems.

I’m proud to say that at least three problems were addressed:

1. Abstracting away the complexity of the apps and tools that don't depend on us, making them consumable from literally anywhere.
2. Centralizing the problem of the unexpected mutability of our client's apps, making it easier to solve and debug.
3. Handing in a fully scalable and highly reliable system which is the running proof that a service-oriented modeling approach can prevent quite a lot of headaches in the future.

Conclusion

Sometimes working with new technologies means getting out of our comfort zone, leaving us in an insecure place in a field where we might have little or no knowledge at all. But we have to take into account that we are not alone: we are rarely pioneers in what we are doing, and if you are lucky you can find people who have written some good documentation addressing the same problem you are going through right now. Then it's just a matter of learning the new thing and convincing ourselves that it's better to solve the problem the way it's meant to be solved rather than the way we would like to solve it.

If you got this far, thank you very much for reading. If you have any questions or opinions, or you've found any mistake in the text, please feel free to write to me; my contact details can be found in the /about section.



Author: Lucas Contreras
