Parsing the Azure Front Door logs
In the series 'Advanced Azure Front Door configuration strategies for Sitecore Managed Cloud on Containers', this is part 2: Parsing the Azure Front Door logs.
Sitecore Managed Cloud comes in two flavours: web apps and containers. In my role as Senior Solution Architect at uxbee, I set up and configured the container variant for a customer. In this blog series I share my experiences configuring Azure Front Door in combination with Sitecore Managed Cloud. I named this series ‘Advanced Azure Front Door configuration strategies for Sitecore Managed Cloud on containers’. Part 1 of this series was about simplifying the management of Azure Front Door IP whitelisting. Time to dive into part 2: Parsing the Azure Front Door logs.
Recently, we integrated CookieBot for one of our customers who runs Sitecore on Managed Cloud on containers. Once the integration was released to the production environment, the customer started to see ‘the request is blocked’ pages from Azure Front Door.
This is the same message that visitors see when they call the website from an IP address that’s not whitelisted in Front Door. Because the website was not live yet and the customer was testing it behind the IP whitelist, we immediately thought that the IP address was missing from the whitelist. However, the IP address was present on the whitelist, so it had to be something else.
To investigate why Front Door blocked the request, we needed the log files from Front Door. The next issue was that logging is disabled on Front Door ‘out of the (Sitecore Managed Cloud) box’. To enable logging, you have to create a request in the Sitecore Support Portal.
Logging can be enabled in two flavors
You can choose between saving the log data in blob storage or in Application Insights. The Application Insights variant comes with additional costs; the price depends on the amount of data that is ingested per day. Because the size of the Front Door logs can increase quickly, we chose to store the logs in Azure blob storage. Once Sitecore has enabled logging in Front Door, you will receive credentials to access that blob storage.
To be able to quickly download log files, I used ‘Microsoft Azure Storage Explorer’.
The .json log file has a field called ‘Data’, and that field can contain data that triggers one of your firewall rules! When I downloaded the .json file(s) from the blob storage onto my laptop, my Virus & threat protection kicked in, blocking files and giving warnings about threats. To get around this issue, I created a VM to process the .json files.
The first time I looked at the folder/file structure it was a bit of a hassle. No single folder showed all log files neatly arranged. Instead, there was a deep directory structure, as shown in the picture.
Every lowest-level “m=” folder contains a .json file named PT1H.json. The content of this file is shown in the picture.
Note: Sensitive materials are blurred.
The log file contains only blocked requests and, as you might have noticed, it's not perfectly valid .json either. My goal was to create one .csv file with all the blocked log entries from the .json files. This csv file can then be loaded into Excel, so you can easily filter the data by date, time, firewall rule, and so on. Locating the blocked request is much easier this way.
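How exactly the file deviates from valid JSON depends on the export. A common shape for this kind of log is one JSON object per line with no enclosing array, so a normal JSON parser rejects the file as a whole. Assuming that shape (it is an assumption, so check your own files), a minimal Python sketch that parses the text line by line and records lines that fail instead of aborting could look like this. The function name and the sample records are made up for illustration.

```python
import json

def parse_log_text(text):
    """Parse text holding one JSON object per line.

    Returns (records, bad_line_numbers). Assumes line-delimited JSON,
    which may not match every PT1H.json file.
    """
    records, bad_lines = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            bad_lines.append(number)  # remember the line instead of stopping
    return records, bad_lines

# Made-up sample: two parsable lines and one broken line.
sample = (
    '{"time": "2023-01-01T10:00:00Z", "properties": {"action": "Block"}}\n'
    '{"time": "2023-01-01T10:05:00Z", "properties": {"action": "Block"}}\n'
    'not json\n'
)
records, bad_lines = parse_log_text(sample)
print(len(records), bad_lines)  # prints: 2 [3]
```

Collecting the failing line numbers makes it easy to inspect exactly those lines afterwards, rather than losing a whole file to one bad entry.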
Doing this manually with that deep folder structure would be very time-consuming, which is why I created a PowerShell script.
The parameter Sourcefolder holds the root folder of the downloaded log files. From there, the script finds all PT1H.json files and merges them into one temporary file (line 8). Lines 10-15 convert that temporary .json file into a PowerShell object. Lines 22-33 loop through the PowerShell object to create the csv file. I only add the data to the csv file that is important for the investigation.
While running this script you could encounter errors. I have included an example:
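For readers who prefer Python, the same steps (find every PT1H.json, parse it, keep a few fields, write one csv file) can be sketched as follows. This is not the PowerShell script described above, and the line numbers mentioned there do not apply to it. The column names and the nesting under "properties" are assumptions about the log schema, and the file is assumed to hold one JSON object per line; adjust both to what your files actually contain.

```python
import csv
import json
from pathlib import Path

# Assumed field names; change them to match your own PT1H.json files.
COLUMNS = ["time", "clientIP", "ruleName", "action", "requestUri"]

def to_row(record):
    """Flatten one log record into the columns kept for the investigation."""
    props = record.get("properties", {})  # assumed: details live under "properties"
    row = {"time": record.get("time", "")}
    for name in COLUMNS[1:]:
        row[name] = props.get(name, "")
    return row

def logs_to_csv(source_folder, csv_path):
    """Collect every PT1H.json below source_folder into a single csv file.

    Returns the number of rows written.
    """
    count = 0
    log_files = sorted(Path(source_folder).rglob("PT1H.json"))
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=COLUMNS)
        writer.writeheader()
        for log_file in log_files:
            for line in log_file.read_text(encoding="utf-8").splitlines():
                if not line.strip():
                    continue  # skip blank lines
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip lines that do not parse
                writer.writerow(to_row(record))
                count += 1
    return count
```

Calling `logs_to_csv` with the download root and an output path (both of your choosing) writes the csv file and returns the number of rows. Unlike the script described above, this sketch streams each file straight into the csv file instead of merging everything into a temporary file first.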
On the VM I disabled all Virus & threat protection so that ALL the PT1H.json files could be processed correctly. Keep in mind that processing all .json files in the folder structure takes a long time. If you know the specific date and time at which the request was blocked, you can search the Front Door folder structure for the .json files that cover that time period. You then only convert those file(s) to a csv file with the script, saving yourself a ton of time.
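Picking out the files for a time period can also be automated. The sketch below assumes the folder names encode the timestamp as y=YYYY/m=MM/d=DD/h=HH/m=00, with one PT1H.json per hour; that layout is an assumption based on the “m=” folders mentioned earlier, so compare it with your own download first. The function names are made up for illustration.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed layout: .../y=2023/m=05/d=17/h=09/m=00/PT1H.json (one file per hour).
HOUR_PATTERN = re.compile(
    r"y=(\d{4})/m=(\d{2})/d=(\d{2})/h=(\d{2})/m=\d{2}/PT1H\.json$"
)

def file_hour(path):
    """Return the hour a log file covers, or None if the path does not match."""
    match = HOUR_PATTERN.search(Path(path).as_posix())
    if match is None:
        return None
    year, month, day, hour = (int(part) for part in match.groups())
    return datetime(year, month, day, hour)

def files_in_window(source_folder, start, end):
    """List PT1H.json files whose hour overlaps the window from start to end."""
    # A request blocked at 09:45 sits in the h=09 file, so round start down.
    start = start.replace(minute=0, second=0, microsecond=0)
    selected = []
    for log_file in sorted(Path(source_folder).rglob("PT1H.json")):
        hour = file_hour(log_file)
        if hour is not None and start <= hour <= end:
            selected.append(log_file)
    return selected
```

For a block seen around 09:45 on 17 May 2023, passing `datetime(2023, 5, 17, 9, 45)` as both start and end would select only the h=09 file of that day. The folder timestamps are presumably in UTC, so convert local times before comparing.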
When you load the csv-file in Excel, the final result looks like this:
At the top you see requests that were blocked by the Front Door whitelist custom rule. The clientip column shows where the call originated from. The csv file can contain many more reasons why a request was blocked; an explanation of all the rules can be found on GitHub.
With this information I could start investigating why the requests were blocked. In part 3 - HELP, my 'Request is blocked' by Front Door, I will explain which rules blocked the request and what I did to solve the issue.
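If you would rather get a quick overview before opening Excel, counting the rows per rule or per client IP gives a first impression of what is being blocked. The sketch below is an illustration only: the column name ruleName and the sample rows are made up, so use the headers from your own csv file.

```python
import csv
import io
from collections import Counter

def count_by_column(csv_text, column):
    """Count how often each value occurs in one column of csv text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)

# Made-up sample rows; real rule names and IP addresses will differ.
sample = (
    "ruleName,clientip\n"
    "WhitelistRule,203.0.113.10\n"
    "WhitelistRule,203.0.113.11\n"
    "OtherRule,203.0.113.10\n"
)
print(count_by_column(sample, "ruleName").most_common())
# prints: [('WhitelistRule', 2), ('OtherRule', 1)]
```

Running the same function on the clientip column shows which addresses are blocked most often.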