January 15

SQL Lessons to Remember

I had to write a few SQL queries today, so I thought I’d toss up a few notes as a sort of SQL FAQ. I’ll try to keep these notes updated.

  • Inner joins drop rows with NULL keys
    • When you are joining tables, an inner join discards any row whose join key is NULL, because NULL never equals anything (not even another NULL). If you want to keep rows whose foreign key (i.e. the column you’re matching the second table on) is NULL, you need to use a left outer join (see the sketch after this list).
  • Inner joins on one-to-many
    • When joining one-to-many, an inner join gives you a row for each match in the second table; e.g. if one row in table A matches two rows in table B, you get two rows in the result, one for each match.
  • What table should I start with when writing a query?
    • When writing a query, the best place to start is to figure out what the output should look like. This matters because it determines the granularity of your report. Once you’ve done that, when you’re faced with a myriad of tables, start with the table whose rows match the granularity of the report you want to write.
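A quick sketch of those first two points, using made-up tables (orders with a nullable customer_id, and customers):

    -- Hypothetical tables: orders(id, customer_id) and customers(id, name).
    -- An order whose customer_id is NULL disappears from an inner join...
    SELECT o.id, c.name
    FROM orders o
    INNER JOIN customers c ON c.id = o.customer_id;

    -- ...but survives (with a NULL name) in a left outer join.
    SELECT o.id, c.name
    FROM orders o
    LEFT OUTER JOIN customers c ON c.id = o.customer_id;

    -- One-to-many: if a customer has two matching orders, joining customers to
    -- orders returns two rows for that customer, one per matching order.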

Order Matters

The clauses of a query are written in the following order:

  • SELECT
  • FROM
  • WHERE
  • GROUP BY
  • HAVING
  • ORDER BY

However, they are processed in this order:

  • FROM
  • WHERE
  • GROUP BY
  • HAVING
  • SELECT
  • ORDER BY

This isn’t just an academic distinction; knowing the processing order can help you troubleshoot your reports when you get either an error or results you don’t expect.
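Here’s a small example of why (the table and column names are made up): in most databases you can’t reference a SELECT alias in the WHERE clause, because WHERE is processed before SELECT, while ORDER BY is processed last and can use it.

    -- Hypothetical table: sales(region, amount).
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 0            -- row filter; runs before grouping and can't see "total"
    GROUP BY region
    HAVING SUM(amount) > 1000   -- group filter; runs after grouping
    ORDER BY total DESC;        -- processed last, so the alias "total" works here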

January 11

Troubleshooting Slow Response Times

After creating this WordPress site and finally getting it hooked up to the NGINX reverse proxy, I was left with a horrible website load time (something on the order of 10-12 s). In this post, I will show you a beginner’s guide to troubleshooting slow website response times.

1. Benchmark Your Site

The first thing you should do is quantify your site’s slowness. If you don’t know how fast (or slow) your site was to begin with, how can you know when you’ve fixed it? One of the best tools for this is your browser’s developer tools, but you can also use a service such as the Pingdom Website Speed Test or WebPageTest.

Screenshot of website being monitored with Chrome Developer Tools.
Chrome Developer Tools Showing Excessive Time to First Byte (TTFB aka “waiting”)

The above screenshot is from using developer tools in Chrome (Ctrl+Shift+I) and recording the initial page request. I wouldn’t wait that long for a web page to load, and it’s my own web page!
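If you’d rather benchmark from the command line, curl can report the same timings; here’s a minimal sketch (the URL is a placeholder):

    # Print the TTFB and the total time for a single request.
    curl -o /dev/null -s -w 'TTFB: %{time_starttransfer}s  total: %{time_total}s\n' https://example.com/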

2. Identify Potential Bottlenecks (Divide and Conquer)

Now on to identifying the source of the slow response time. I find diagrams always help me think more clearly:

The diagram above gives a high-level overview of how this website is served to you. You send a request to my NGINX reverse proxy, which then requests the page from Apache (my WordPress server). Apache responds with the requested page to the NGINX reverse proxy, and the reverse proxy returns it to your browser. Normally this all happens very quickly, and in fact it does on my other servers, but something was obviously wrong with my new build.
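For context, the NGINX side of that hop is essentially just a proxy_pass block; here’s a minimal sketch (the backend address is a placeholder, not my actual config):

    # NGINX reverse proxy: hand requests off to the backend Apache/WordPress server.
    location / {
        proxy_pass http://10.0.0.20:80;   # placeholder backend address
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }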

The real value of drawing a diagram is that it helps you identify all of the moving parts of your system, which in turn helps you theorize about how that system could break. In the diagram above, you can see that the problem could be in any of the following:

  1. The client itself
  2. The network path between the client and the reverse proxy:
    • the client’s download speed (or latency)
    • the reverse proxy’s upload speed
  3. The NGINX reverse proxy server itself
  4. The network path between the reverse proxy and the backend (upstream) Apache WordPress server
  5. The Apache/WordPress server itself

3. Check Your Logs

Logs can be an excellent resource for identifying your bottleneck. To figure out which piece in the diagram above was the culprit, I decided to check the log files on both my NGINX reverse proxy and my backend Apache server running WordPress. I had a TTFB of over 10 seconds; that time had to be going somewhere! As Sherlock Holmes says, “Data, data, data! I cannot make bricks without clay.”

Now, here’s a subtlety you should be familiar with: an NGINX access log entry is written (and timestamped) at the end of a request [source], while an Apache access log entry is timestamped with the time the request was received [source].

In my case, the NGINX access log showed a request-completed time a full 10+ seconds after the request-received time shown in the Apache log. That meant the slow response time was coming from the Apache side!

Further confirmation came from adding %D to the Apache access log format, a directive that records the time between when the Apache server receives the HTTP request and when it finishes sending the response (reported in microseconds). Lo and behold, my Apache access logs showed a 10 s response time.
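For reference, enabling that is just a tweak to the log format (the format name here is arbitrary, and remember that %D reports microseconds):

    # Apache: log the time taken to serve each request (%D, in microseconds).
    LogFormat "%h %l %u %t \"%r\" %>s %b %D" combined_timed
    CustomLog ${APACHE_LOG_DIR}/access.log combined_timed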

4. Iterate Until Done

So now I had confirmed that the source of my slow response time was my Apache server, but now what? What was causing it?

The answer is to go back to step 2 above and repeat. The difference this time is that instead of looking at the whole system, we can narrow the list of potential problems to ones specific to the WordPress server. In my case, since this was running on a Raspberry Pi, I was suspicious of the hardware itself. I ran sar while visiting my site, and sure enough, iowait spiked to 80-90% for several seconds without full RAM utilization. This is a notorious problem on the Raspberry Pi and is indicative of the slow read/write speeds that come from running everything off of a microSD card. This was even with other optimizations in place, such as proxy caching on the reverse proxy and FastCGI caching.
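For anyone who wants to reproduce that check, this is roughly the idea (sar is part of the sysstat package):

    # Sample CPU utilization (including %iowait) once per second, 30 times,
    # while loading the site in a browser.
    sar -u 1 30

    # Per-device I/O statistics, to watch the microSD card getting hammered.
    sar -d 1 30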

So ultimately the problem was the hardware of the server itself. I instead installed a LEMP stack on a LIVA mini PC and then configured WordPress. I suppose I could’ve also tried a higher-end microSD like a Samsung; maybe next time. Regardless, this brought the load time down to 0.9-1.4 s and the TTFB to a much more acceptable 150-250 ms.

January 8

Getting Off the Ground

Recently, I decided I wanted to take better notes on my own learning and hopefully help others in the process. Enter this blog.

I’ve been doing development for a while; I even have my own homelab running everything from Raspberry Pis to 24-thread Xeons. It’s handy because whenever I want to test out a new project, I can rapidly set up a server for it. Since most of my projects are low-demand, I often just deploy them on a Raspberry Pi. There’s a catch, though: while this usually works well, sometimes you end up with one of two problems:

  1. You need short bursts of power. On a Raspberry Pi, you redline quickly (either from the lack of processing power or because you only have 1 GB of RAM).
  2. On the other end of the spectrum, it’s usually overkill to have a dedicated device for each individual service you intend to run. If your blog only gets a hundred visitors a month, an entire dedicated host serving up websites is going to sit idle 99.9% of the time.

An excellent solution to this is Docker, which you can sort of think of as a “VM Lite”. With virtual machines, each VM gets its own dedicated resources along with its own overhead (namely, its own installed OS). In my setup, though, I almost exclusively use Debian-based OSes (Ubuntu for most of the “heavy” servers and Raspbian on the Pis), so it’s silly to duplicate an entire OS for each individual image just to run one service.

This is where Docker comes in. With Docker, instead of an individual virtual machine, you create a container, which is its own “box” that shares the host’s kernel. In a lot of ways it’s actually better, at least for my purposes.
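As a tiny illustration (the image and port here are just examples), spinning up a container is a one-liner because there’s no guest OS to boot:

    # Start a container from an existing image; it shares the host kernel.
    docker run -d --name web-test -p 8080:80 nginx:alpine

    # The whole "machine" is just an isolated process tree on the host.
    docker ps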

So for this blog, I attempted to deploy WordPress as a Docker container. What an absolute time sink. I sank two days into it before ultimately making a strategic retreat.

Now admittedly, I think my biggest problem is that I broke my own rule and failed to stage my learning. I wasn’t familiar with WordPress, and I wasn’t familiar with Docker. I wanted to learn both, but instead of learning WordPress by deploying it with a traditional install and learning Docker by deploying a simple container, I went straight for the moon shot and tried to deploy WordPress as a Docker container. For those of you who are uninitiated with WordPress, as I was, it turns out it’s an…eccentric application, to say the least. It does things like store absolute URLs instead of relative ones, so if you don’t change the site address from the internal IP your server sits on to the domain name you’re using, it will pass your internal private IP back to the client, which probably isn’t sitting on your network.
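For the record, those two address values live in the wp_options table (assuming the default wp_ table prefix), so one common fix is a direct update; the domain below is a placeholder. The same settings are also exposed under Settings → General in the admin panel.

    -- Point WordPress at the public domain instead of the internal IP.
    -- 'wp_' is the default table prefix and example.com is a placeholder.
    UPDATE wp_options
    SET option_value = 'https://example.com'
    WHERE option_name IN ('siteurl', 'home');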

Despite all this, I probably could’ve still pulled it off, but my setup is further complicated by the fact that my WordPress deployment sits behind a reverse proxy. Moral of the story: when learning something new, do it in stages.