I had to write a few SQL queries today so I thought I would toss up a few notes as a sort of SQL FAQ. I’ll try to keep these notes updated.
Inner joins fail on null matches
When you are joining tables, if you want to keep rows where your foreign key (i.e. the column you’re trying to match your second table on) is null, you need to use a left outer join; an inner join will silently drop them.
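As a minimal sketch, assuming hypothetical orders and customers tables where orders.customer_id may be null:

```sql
-- Inner join: orders with a NULL customer_id vanish from the results
SELECT o.order_id, c.customer_name
FROM orders o
INNER JOIN customers c ON c.customer_id = o.customer_id;

-- Left outer join: every order is kept; customer_name comes back NULL where there is no match
SELECT o.order_id, c.customer_name
FROM orders o
LEFT OUTER JOIN customers c ON c.customer_id = o.customer_id;
```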
Inner joins on one-to-many
When joining one-to-many, an inner join will give you a row for each match in the second table; e.g. if I have one row in table A and it matches two rows in table B, I will get two rows in the result, one for each match.
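Sticking with the same hypothetical tables, a quick sketch of what that looks like:

```sql
-- If customer 42 has two rows in orders, this returns two result rows for that
-- one customer, one per matching order
SELECT c.customer_name, o.order_id
FROM customers c
INNER JOIN orders o ON o.customer_id = c.customer_id
WHERE c.customer_id = 42;
```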
What table should I start with when writing a query?
When writing a query, the best place to start is to figure out what the output should look like. This is important because it helps you determine the granularity of your report. Once you’ve done this, when you’re faced with a myriad of tables, you can start with the table that matches the granularity of the report you wish to write.
Queries are written in the following order: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY.
However, they are processed in this order: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY.
This isn’t just an academic distinction; it can be helpful in troubleshooting your reports when you get either an error or results you don’t expect.
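A classic example of where this bites people (again using a hypothetical orders table): a column alias defined in SELECT can’t be referenced in WHERE, because WHERE is processed before SELECT.

```sql
-- Fails in most databases: the alias `total` doesn't exist yet when WHERE is evaluated
SELECT quantity * unit_price AS total
FROM orders
WHERE total > 100;

-- Works: repeat the expression (or push it into a subquery/CTE)
SELECT quantity * unit_price AS total
FROM orders
WHERE quantity * unit_price > 100;
```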
After creating this WordPress site and finally getting it hooked up to the NGINX reverse proxy, I was left with a horrible website load time (something on the order of 10-12 s). In this post, I will show you a beginning guide to troubleshooting slow website response times.
1. Benchmark Your Site
The first thing you should do is quantify your site’s slowness. If you don’t know how fast (or slow) your site was to begin with, how can you know when you’ve fixed it? One of the best ways to do this is with your browser’s developer tools, but you can also use a service such as the Pingdom Website Speed Test or WebPageTest.
The above screenshot is from using developer tools in Chrome (Ctrl+Shift+I) and recording the initial page request. I wouldn’t wait that long for a web page to load, and it’s my own web page!
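If you’d rather benchmark from the command line, curl can give you a rough breakdown too; a quick sketch (substitute your own URL):

```bash
# One request, with timings for DNS lookup, TCP connect, time to first byte, and total
curl -o /dev/null -s -w "dns: %{time_namelookup}s  connect: %{time_connect}s  ttfb: %{time_starttransfer}s  total: %{time_total}s\n" https://example.com/
```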
2. Identify Potential Bottlenecks (Divide and Conquer)
Now on to identifying the source of the slow response time. I find diagrams always help me think more clearly:
The diagram above gives a high-level overview of how this website is served to you. You send a request to my NGINX reverse proxy, which then requests a web page from Apache (my WordPress server). The Apache server responds with the requested web page to the NGINX reverse proxy, and the reverse proxy returns the response to your browser. Normally, this all happens very quickly, and in fact it does on my other servers, but something was obviously wrong here with my new build.
The real value of drawing a diagram is that it helps you identify all of the moving parts of your system, which in turn helps you theorize about how that system could break. In the diagram above, you can see that the problem could be in:
- The interaction between the client and the reverse proxy:
  - Download speed (or even ping) of the client
  - Upload speed of the reverse proxy
- The NGINX reverse proxy server itself
- The interaction between the reverse proxy and the backend (upstream) Apache WordPress server
3. Check the Logs
Now, here’s a subtlety you should be familiar with: NGINX access logs are written at the end of a request [source], while Apache logs are written at the beginning of the request [source].
In my case, the NGINX access log showed a request-fulfilled time a full 10+ seconds after the Apache log showed its request-received time! This means the slow response time was coming from Apache!
Further confirmation was obtained by including %D in the Apache access log, a format directive that records the time between when the Apache server receives the HTTP request and when it finishes sending the response, in microseconds. Lo and behold, my Apache access logs showed a 10 s response time.
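For reference, that looks something like the following in the Apache config; this is a sketch based on the standard combined log format, not my exact configuration:

```apache
# Combined format with %D (time taken to serve the request, in microseconds) appended
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D" combined_with_time
# Debian-style path shown here; adjust for your layout
CustomLog /var/log/apache2/access.log combined_with_time
```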
4. Iterate Until Done
So, now I had confirmed that the source of my slow response time was on my Apache server, but now what? What was causing that?
The answer is to go back to step 2 above and repeat. The difference this time is that instead of having the whole system to look at, we can narrow our list of potential problems to ones specific to the WordPress server. In my case, since this was running on a Raspberry Pi, I was suspicious of the hardware itself. I ran sar while visiting my site, and sure enough, iowait spiked to 80-90% for several seconds without full RAM utilization. This is a notorious problem on the Raspberry Pi and is indicative of the slow read/write speeds that come from running everything off of a microSD card. This was even with other optimizations in place, such as proxy caching on the reverse proxy and FastCGI caching.
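For anyone wanting to reproduce this, the check itself was simple; sar comes with the sysstat package, and the invocation was something like:

```bash
# Sample CPU stats (including %iowait) once per second for 30 seconds
# while loading the site in a browser
sar -u 1 30

# Per-device disk activity, if you want to watch the microSD struggle
sar -d 1 30
```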
So ultimately the problem was the hardware of my server itself. I instead installed a LEMP stack on a LIVA mini PC and then configured WordPress. I suppose I could’ve also tried a higher-end microSD card like a Samsung; maybe next time. Regardless, this brought the load time down to 0.9-1.4 s and the TTFB down to a much more acceptable 150-250 ms.
Recently, I decided I wanted to take better notes on my own learning
and hopefully help others in the process. Enter this blog.
I’ve been doing development for a while; I even have my own homelab
running everything from Raspberry Pi to 24-thread Xeons. It’s handy
because whenever I want to test out a new project, I can just rapidly
set up a server for it. Since most of my projects are low-demand, I
often just deploy them on a Raspberry Pi. While this usually works well, sometimes you end up with one of two problems:
1. You need short, “burst” power. On a Raspberry Pi, you redline quickly (either due to the lack of processing power or the fact that you only have 1 GB of RAM).
2. On the other end of the spectrum, most of the time it’s overkill to have a dedicated device for each individual service you intend to run. If your blog only gets a hundred visitors a month, it’s overkill to have an entire dedicated host serving up websites when it’s going to sit idle 99.9% of the time.
An excellent solution to this is Docker, which you can sort of think of as a “VM Lite”. With virtual machines, each VM gets its own dedicated resources assigned to it, with its own associated overhead (namely, its own installed OS). In my setup, though, I almost exclusively use Debian-based OSes (Ubuntu for most of the “heavy” servers and Raspbian on the Pi), so it’s silly to duplicate an entire OS in each individual image just to run one service.
This is where Docker comes in. With Docker, instead of an individual virtual machine, you create a container which is its own “box” that shares kernel resources. In a lot of ways, it’s actually better, at least for my purposes.
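To give a feel for how lightweight that is in practice, here’s a throwaway example using the official nginx image (not part of my actual setup):

```bash
# Start a container from the nginx image; it shares the host kernel rather than
# booting its own OS, so it's up almost instantly
docker run -d --name throwaway-web -p 8080:80 nginx

# Check it, then throw it away when you're done
curl -I http://localhost:8080/
docker rm -f throwaway-web
```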
So for this blog, I attempted to deploy WordPress as a Docker image. What an absolute time sink. I sank two days attempting to deploy WordPress as a Docker image before ultimately making a strategic retreat.
Now admittedly, I think my biggest problem is that I broke my own
rule and failed to stage my learning. I am not familiar with
WordPress and I am not familiar with Docker. I wanted to learn both,
but instead of learning WordPress by deploying it with a traditional
install, and learning Docker by deploying a simple Docker container,
I went straight for the moon shot and tried to deploy WordPress as a
Docker container. For those of you who are uninitiated with WordPress, as I was, it turns out it’s an…eccentric application, to say the least. It does things like store absolute URLs instead of relative ones, so if you don’t change the site address from the internal IP address your server sits on to the domain name you’re using, it will actually pass your internal private IP back to the client, which probably isn’t sitting on your network.
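If you hit the same thing, one common fix is to pin the site URLs in wp-config.php so WordPress stops handing out the internal address; a sketch, with example.com standing in for your real domain:

```php
<?php
// In wp-config.php: force WordPress to build links with the public domain
// instead of the address stored in the database
define( 'WP_HOME',    'https://example.com' );
define( 'WP_SITEURL', 'https://example.com' );
```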
Despite this, I probably could’ve still pulled it off, but my setup was further complicated by the fact that my WordPress deployment sits behind a reverse proxy. Moral of the story: when learning something new, do it in stages.