[Mixed](https://pixabay.com/photos/legs-feet-different-mixed-standing-362182/) by [RyanMcGuire](https://pixabay.com/users/ryanmcguire-123690/) licensed under [CC0](https://creativecommons.org/publicdomain/zero/1.0/legalcode)

Network in OSPF database but not in routing table

I needed to troubleshoot a pesky OSPF issue on a new network. It turned out to be a simple fix, but it had tripped up a couple of other network engineers, so I thought I’d lab it up and document the scenario. The problem: The reported issue was that a network that was part of the OSPF process was not showing up in the routing table. Adjacencies between all routers were up and the network in question was shown in the OSPF database. ...

November 16, 2021 · 5 min · Jason Lavoie
Fire Hydrant Flushing

Filtering a packet capture by DNS Query Name

Overview: An application problem was brought to me to troubleshoot. From the symptoms I observed, I was confident that the problem was an intermittent issue with the SaaS provider’s DNS. To prove this assertion, I needed to collect a packet capture of a failed query. This post details the process I went through to collect that data. Investigation: When the problem was reported, we saw our recursive nameservers returning NXDOMAIN in response to queries for the domain, while manual queries (with dig) directly to the provider’s nameservers returned valid data. As soon as the entry expired from the recursive nameserver’s cache, it was queried anew, and the reported issue was temporarily resolved. Based on this, my theory was that one of the SaaS provider’s – or their DNS provider’s – nameservers was occasionally responding with a negative answer to the query. I wanted to capture this response packet to help isolate and fix the problem. ...

October 28, 2021 · 6 min · Jason Lavoie
Device table showing support expiry information

Tracking vendor support status in NetBox

Timo Reimann wrote a handy NetBox plugin to collect and display support expiry information (End-of-Sale, End-of-Support, etc.) as well as the current Contract and Warranty coverage dates for all Cisco devices defined in a NetBox installation. His README does a good job showing the process for setting up the plugin, so I won’t repeat all the details here. The general process is:

1. register an app with Cisco and obtain the API ID and secret
2. install the plugin (pip install netbox-cisco-support)
3. enable the plugin (add to PLUGINS in configuration.py)
4. configure the plugin (add to PLUGINS_CONFIG in configuration.py)
5. apply the Django migrations (manage.py migrate)
6. collect the EoX data (manage.py sync_eox_data)

If all goes well, there will now be two additional tables on the device page in the UI for any device whose manufacturer matches the manufacturer value in PLUGINS_CONFIG (default Cisco). ...

October 20, 2021 · 3 min · Jason Lavoie
NetBox device view with additional NAPALM tabs

NetBox NAPALM automation with bastion host

NetBox has an available integration with the NAPALM automation library. For supported devices, the NetBox device view will show additional tabs for status, LLDP neighbors, and device configuration. It will also proxy any (read-only) NAPALM getters (get_environment, get_lldp_neighbors, etc.) via the REST API. The basic configuration outlined in the documentation assumes that the NetBox server has direct SSH access to these devices. That is not the case if you use a bastion host or jump host. Here is how to configure this feature to work in such an environment. ...

October 7, 2021 · 3 min · Jason Lavoie

VLANs not showing in configuration

I was asked to hunt down an issue where newly-created VLANs were not showing up in the running configuration (or the startup configuration) of the switch.

    lab3850-sw-1#conf t
    Enter configuration commands, one per line. End with CNTL/Z.
    lab3850-sw-1(config)#vlan 2
    lab3850-sw-1(config-vlan)#name test
    lab3850-sw-1#sh run vlan 2
    Building configuration...
    Current configuration:
    end

At first, I thought it was a corrupt VLAN database. To test, I removed the vlan.dat file and then recreated it (by adding a VLAN). The problem persisted. ...

September 27, 2021 · 2 min · Jason Lavoie
visualization of the netbox database

Netbox database schema diagram using schemaspy

While trying to wrap my head around some of the NetBox database relationships, I found myself wishing for a database schema diagram. I looked through the documentation and code repo, but didn’t find anything. A colleague recommended trying schemaspy, so I tried it.

Setup: I set up a fresh install of NetBox on a Debian 10 VM, and downloaded schemaspy and its dependencies. Alternatively, they publish a Docker image.

Install Java:

    sudo apt install default-jdk

JDBC Driver: PostgreSQL has a download page for the JDBC driver. ...

September 14, 2021 · 3 min · Jason Lavoie

Leading zeros in bash

A team member reported a problem with a pre-commit hook I wrote, check-dns-serial, which ensures the SOA serial number is updated on any modified zone files. The script was giving them an error when they made a commit after the 8th revision in a day. It was an interesting bug in a bash script that I thought might be helpful to share. The serial number is, by convention, stored as a date string plus a 2-digit revision number. For example, 2021090104 would be today’s 4th change. This allows for 99 changes a day. The script splits this string (using cut) into two variables, the date and the revision. At one point, it checks to see if the old revision is already 99, to avoid an overflow. This is the line that threw the error: ...
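The excerpt stops before the failing line, so the snippet below is not the original hook. It is only a minimal sketch of the pitfall the title points to, assuming the revision was compared in an arithmetic context: bash reads a number with a leading zero as octal, so 08 and 09 are rejected as invalid. The function name next_serial and the sample serial are hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the next serial for a YYYYMMDDNN string.
next_serial() {
    local serial=$1
    local date_part rev

    # Split into date and 2-digit revision with cut, as the post describes.
    date_part=$(echo "$serial" | cut -c1-8)
    rev=$(echo "$serial" | cut -c9-10)

    # A bare (( rev >= 99 )) fails for "08" and "09" because bash parses
    # them as octal. The 10# prefix forces base-10 interpretation.
    if (( 10#$rev >= 99 )); then
        echo "revision overflow for $serial" >&2
        return 1
    fi

    # Re-pad the incremented revision to two digits.
    printf '%s%02d\n' "$date_part" "$(( 10#$rev + 1 ))"
}

next_serial 2021090108
```

With the 10# prefix, the call above prints 2021090109. Without it, bash aborts the arithmetic with a "value too great for base" error as soon as the revision reaches 08.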

September 1, 2021 · 3 min · Jason Lavoie
Lambda and Perl Camel

Migrating a Perl CGI to AWS Lambda

Motivation: In migrating our NOC website from a traditional Apache server to a serverless architecture, I’ve needed to update or replace any dynamic components. For example, replacing a WordPress installation with Hugo to publish static content to an S3 bucket served by CloudFront. In this particular case, it was a CGI script that reads our firewall configurations and presents a web page for visualizing and searching the many object-groups and access-lists. I chose to migrate this to run as a Lambda. ...

August 30, 2021 · 10 min · Jason Lavoie

Geolocation issues

As I was getting ready to leave for a summer vacation, an emergency call came from our service desk: “The Internet is in Chinese!” After a few back and forth questions, and a little bit of investigation, I determined that Google had suddenly marked an entire /44 prefix as being geolocated in Hong Kong. When connecting to https://www.google.com/, everyone was automatically redirected to https://www.google.com.hk/. This only affected the IPv6 block. The corresponding IPv4 block was not affected. ...

August 27, 2021 · 2 min · Jason Lavoie
Process flow of a GitHub AWS Connector App connecting to CodePipeline and publishing to an SNS topic

Connecting GitHub to SNS using CodePipeline

Background: In the last post, I documented an approach to fan-out GitHub repository updates to AWS services using API Gateway, Lambda, and SNS. In my conclusion, I wrote: The whole time I was building and testing this, I kept thinking to myself, “I must be overlooking a more obvious solution.” I’ve asked around, and it seems that others have also run into this issue, but ended up using a different approach that didn’t involve authorization. If you know of a better/different solution, please reach out! ...

August 20, 2021 · 7 min · Jason Lavoie
Process flow of a webhook through API Gateway using a lambda integration to publish to SNS

Publish to SNS with GitHub webhooks

Motivation and Design: I have a bunch of “audit scripts” that run against the network configurations (and other data sources, such as DNS and DHCP) to check for common problems, mistakes, and inconsistencies. They run on a centralized server that periodically fetches the latest data from all these sources, runs the scripts, and emails about any discrepancies. These data sources are kept in git repositories, updated either by operations staff or automatically. In the case of networking gear, the updates come from a tool called RANCID, which collects the text configuration and the output of many useful “show” commands and pushes any changes to a git repository for the role/group of the device. ...

August 16, 2021 · 10 min · Jason Lavoie

Use VLAN groups for UCS vNIC templates

One of my co-workers had provisioned a new appliance VM. It was having connectivity problems, so he asked me to look at it. Upon investigation, I found:

- absolutely no connectivity: RX Packets 0 on the interface.
- this was the first/only VM in this VLAN on this vCenter cluster
- they had just added this VLAN to the dvSwitch for this project

So, I first checked what had changed most recently, the dvSwitch config. Everything looked correct. I compared it to other (working) VLANs, and saw no discrepancies. ...

August 10, 2021 · 3 min · Jason Lavoie

Cisco fan direction mismatch

Many of Cisco’s switches can be purchased in two different airflow configurations, port-side intake and port-side exhaust. Since most racks are designed with a front-to-back airflow, this allows for mounting a switch in the front or back of the rack, respectively. We use the latter scenario, for example, for a top of rack (ToR) deployment for server racks. Most times, despite selling these as different SKUs, the switch is actually the same part number, and all that differs are the part numbers of the fans and power supplies. Swap all of these out and now the switch has reverse airflow. They usually are also color-coded, with the port-side exhaust colored blue and the port-side intake colored red/burgundy. The mnemonic here is “red is hot, blue is cold” – the exposed end of the module is either hot air out (red) or cold air in (blue). ...

August 3, 2021 · 3 min · Jason Lavoie

Override AppArmor policy for bind

After upgrading a nameserver to Debian 10, I noticed some AppArmor errors in /var/log/auth.log:

    Jul 29 09:58:18 koala audit[1676]: AVC apparmor="DENIED" operation="mknod" profile="/usr/sbin/named" name="/etc/bind/namedb/dyn/example.com.jnl" pid=1676 comm="isc-worker0029" requested_mask="c" denied_mask=" c" fsuid=112 ouid=112

It appears that a default ISC bind install now restricts named to read-only access on /etc/bind. According to /etc/apparmor.d/usr.sbin.named:

    [...]
    # /etc/bind should be read-only for bind
    # /var/lib/bind is for dynamically updated zone (and journal) files.
    # /var/cache/bind is for slave/stub data, since we're not the origin of it.
    # See /usr/share/doc/bind9/README.Debian.gz
    /etc/bind/** r,
    /var/lib/bind/** rw,
    /var/lib/bind/ rw,
    /var/cache/bind/** lrw,
    /var/cache/bind/ rw,
    [...]

The relevant portion of /usr/share/doc/bind9/README.Debian.gz: ...

July 30, 2021 · 2 min · Jason Lavoie

No matching key exchange method

After upgrading some bastion hosts to Debian 10, connections to some older network gear failed. Connecting to some ASA firewalls generated the error:

    Unable to negotiate with 203.0.113.203 port 22: no matching key exchange method found. Their offer: diffie-hellman-group1-sha1

This was a simple fix:

    lab-5585-1# conf t
    lab-5585-1(config)# ssh key-exchange group dh-group14-sha1
    lab-5585-1(config)# end

Some older devices, Catalyst 3750 switches and ASA 5540 firewalls, complained of no matching cipher:

    %SSH-3-NO_MATCH: No matching cipher found: client chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com server aes128-cbc,3des-cbc,aes192-cbc,aes256-cbc

This could be fixed on the device with the ip ssh server algorithm encryption ... (3750) and ssh cipher encryption ... (ASA) commands, but I decided to fix this on the bastion host instead by adding Ciphers +aes256-cbc to /etc/ssh/ssh_config. ...

July 29, 2021 · 1 min · Jason Lavoie

Clearing HSTS on localhost

I use a few tools that create a local web server:

- vim instant markdown
- hugo server

These normally work well. I also regularly will use a tunnel to a host on another network, such as accessing an embedded management interface of a device on an isolated network:

    desktop:~$ ssh -L 8443:device:443 bastion
    bastion:~$

The service on the remote network device is now available locally via https://localhost:8443/. Unfortunately, when I do this, my local browser will store these HSTS settings for the domain (localhost, in this case) and complain/fail when one of the above-listed tools goes to a non-HTTPS URL on localhost, such as http://localhost:8090 for instant-markdown. ...

June 30, 2021 · 1 min · Jason Lavoie

F5 management firewall rules

After upgrading our F5s a while back – probably to a BIG-IP 14.1 release, from looking at the release notes – our monitoring of their NTP status started failing. One of our staff poked at it and even opened a support case with F5, but couldn’t get it working, so it ended up on my list of things to look at. Today, I finally spent a few minutes troubleshooting and found the problem and an easy fix. It appears that when they changed their licensing model for AFM, F5 changed the way firewall rules are used on the management interface. ...

June 23, 2021 · 2 min · Jason Lavoie
[Split](https://pixabay.com/photos/log-bark-ball-glass-ball-split-4164303/) by [manfredrichter](https://pixabay.com/users/manfredrichter-4055600/) licensed under [CC0](https://creativecommons.org/publicdomain/zero/1.0/legalcode)

Multi-homed EC2

I had an interesting design requirement for a network monitoring host. These monitoring hosts, or collectors, are used to monitor our network from an external perspective – via the Internet. They also needed to be reachable from our internal network for central management, and needed access to shared internal services, such as directory services, time servers, and central logging. Design: My initial approach was to deploy the hosts in a public subnet, set the default route over the Internet, and add individual host routes via the transit gateway to the subnet routing table. This was not great from an operational perspective and violated the requirements when one of the statically-routed hosts also needed to be monitored externally. ...

June 22, 2021 · 10 min · Jason Lavoie

Using docker to compile a binary

Sometimes I have to compile a binary or build a custom package on an old platform or an operating system where I don’t have a compile host available. Docker is a perfect tool for this type of ad-hoc workflow.

    docker run --rm -it -v $(pwd):/mnt ubuntu:bionic
    sed -i 's/^# deb-src/deb-src/' /etc/apt/sources.list
    apt-get update
    apt-get -y install dpkg-dev libssl-dev # any other dependencies
    cd
    apt-get source source-package-here
    # cd into package and compile/make/build/etc
    strip resulting_binary
    cp resulting_binary /mnt
    exit

This mounts the current directory at the /mnt mount point in the container. The resulting artifacts (binaries, packages, etc.) can be preserved by copying them out of the container before exiting. ...

June 21, 2021 · 1 min · Jason Lavoie
Diagram of SQL MI creation flow

Updating AzureRM templates from Terraform

Summary: I have deployed some Azure SQL Managed Instances using Terraform. Since there are no native resources for this service in the Azure provider, I used an Azure Resource Manager deployment template. Recently, I had to add an output to that template (so that another workspace could set up remote logging), and wanted to note my experience with updating deployment templates from Terraform. Here, I’ll detail the original design and then walk through the update process. ...

May 19, 2021 · 10 min · Jason Lavoie