Category Archives: Server

Docker MySQL with multiple databases

As discussed before, I am containerizing this server for ease of development and portability. One problem is initializing the database: a single database server with several databases inside for different applications. Getting the container to initialize with the correct permissions on all databases was a headache that I at first solved by customizing the docker image, but I finally have a stable way of doing it with the upstream image.

Docker is awesome, I think we can all agree, and the relational database images like mysql and mariadb are essential for a lot of applications. Those images come ready to go with helpful initialization support, like environment variables to create the desired database and user and to set the root and user passwords. The containers also support initializing db state from a .sql or .sql.gz file placed in a specific directory, which is very useful when you want to work on real data rather than fixtures or fresh, empty databases. Using docker-compose, you could initialize a db container like this:

services:
  database:
    image: mysql:5.7
    volumes:
      - "./blog.sql.gz:/docker-entrypoint-initdb.d/initdb.sql.gz"
      - "./database/data:/var/lib/mysql"
    container_name: "database"

    environment:
      MYSQL_USER: "dbuser"
      MYSQL_PASSWORD: "somepassword"
      MYSQL_DATABASE: "blog"
      MYSQL_RANDOM_ROOT_PASSWORD: 1

The variables are only hardcoded for the purposes of example; you should be using secrets instead. An extra neat thing is that we’re using environment variables to tell the docker image to create the blog database, but the sqldump also contains that database, and it all works as you would expect: the database is created, the dump is applied, and the user is granted access to it. There is one huge limitation though: the environment variables can only create a single database, and that is the only database the user will be granted privileges on. My server has several apps with separate databases, and I would like to be able to keep adding more! How can I do that?

It turns out the initialization files are loaded in alphabetical order! If only I could create an SQL file that grants access on the databases I need…

-- create the extra database the entrypoint doesn't know about
CREATE DATABASE IF NOT EXISTS archive;

-- pick out the user created from MYSQL_USER: at init time it's the only user
-- that isn't root and isn't one of mysql's internal localhost accounts
SET @grantto = (SELECT User FROM mysql.user WHERE User != "root" AND Host != "localhost");

-- build the GRANT as a string and run it as a prepared statement
SET @grantStmtText = CONCAT("GRANT ALL ON archive.* TO ", @grantto);
PREPARE grantStmt FROM @grantStmtText;
EXECUTE grantStmt;

Now this is dark magic, and is likely to break in the future in strange ways. The first statement speaks for itself. The second assigns the username created from the MYSQL_USER environment variable to the MySQL user-defined variable @grantto. I’m taking advantage of a known initial state for the database, as there doesn’t seem to be a way to read actual environment variables from within MySQL: the only users that exist at this point are a few internal MySQL accounts and the user created from the environment variable. Next I construct the GRANT statement as a string. The last two statements turn that string into a prepared statement and execute it, et voilà, our user has access to the archive database!

Now we just take advantage of the alphabetical ordering of the init files and add this short SQL file to our docker-compose.yml like so:

services:
  database:
    image: mysql:5.7
    volumes:
      - "./blog_and_archive.sql.gz:/docker-entrypoint-initdb.d/initdb.sql.gz"
      - "./grant-all.sql:/docker-entrypoint-initdb.d/zz-grant-all.sql"
      - "./database/data:/var/lib/mysql"
    container_name: "database"

    environment:
      MYSQL_USER: "dbuser"
      MYSQL_PASSWORD: "somepassword"
      MYSQL_DATABASE: "blog"
      MYSQL_RANDOM_ROOT_PASSWORD: 1

If we clear out the database data and restart the container, it will start with our user having access to both “blog” and “archive”! We could tweak it even further to figure out which databases were created and grant on all of them, but I have a manageable number of databases and a job, so I’m not doing it.
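For the record, this is roughly how I wipe and re-initialize the stack to check that the grants came out right; a minimal sketch assuming the bind mounts from the compose file above (the mysql client will prompt for the password set in MYSQL_PASSWORD):

# wipe the data directory so the entrypoint re-runs the init scripts on the next start
docker-compose down
rm -rf ./database/data
docker-compose up -d database

# once it's up, verify that the user was granted access to both databases
docker-compose exec database mysql -u dbuser -p -e "SHOW GRANTS FOR CURRENT_USER();"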

containerized blog

Another chapter in the ever-growing book that is the story of my blog, as is good and right for any developer.

This is now coming at you from docker-compose. The blog, I mean. It used to be on a normal digital ocean droplet running on bare metal (well, a low-tier droplet, so probably a vmware instance, but you know what I mean). Even worse, to my great shame, it was just a normal wordpress install. Now it’s still running on that same vmware instance and it’s still wordpress, but it’s using roots/bedrock.

bedrock by roots

WordPress boilerplate with modern development tools, easier configuration, and an improved folder structure

roots/bedrock lets you manage wordpress as a composer dependency, including themes and plugins. Essentially that means the whole blog is now a git repo with a single composer.json and composer.lock file. Of course there’s a bit more to it, with .env files and persistent stuff, but essentially that’s it. This is very cool on its own, but just moving one wordpress site to composer isn’t cool enough, so I did the same for the archive. The archive was using some plugins that don’t even exist anymore, but I managed to find and patch their successors well enough to keep it afloat, so now that’s also managed with composer. That means I can easily upgrade and patch both blogs on my machine, test them here, and if everything works, quickly run the same upgrade in a predictable manner in production. Cool.
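To give a feel for the workflow this enables, here’s a minimal sketch; the plugin name is only an example, and it assumes bedrock’s default composer.json, which pulls community plugins from the wpackagist repository:

# add a plugin as a composer dependency (wpackagist mirrors the wordpress.org plugin directory)
composer require wpackagist-plugin/akismet

# upgrade wordpress core, plugins, and themes locally, then commit the updated lock file
composer update
git add composer.json composer.lock
git commit -m "Upgrade wordpress and plugins"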

But this server doesn’t just host wordpress; it’s also running my nrk_subs app, my cv app, and, new as of today, my lolz aggregator. What I really want is to run everything in nice little docker containers so I can duplicate everything locally and develop it further there, the same way I would at work. So that’s what I did: I first built the containers I needed for the blogs and then started incorporating the other projects, which were already mostly containerized. Currently, this is the docker-compose.yml that manages everything here.

version: "3.8"

services:
  database:
    build:
      context: "./database/docker"
    volumes:
      - "./storage/blog_and_archive.sql.gz:/docker-entrypoint-initdb.d/initdb.sql.gz"
      - "./database/data:/var/lib/mysql"
    container_name: "database"
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "--silent"]
    command: "--default-authentication-plugin=mysql_native_password"
    env_file: .env
    environment:
      MYSQL_DATABASE: $MYSQL_BLOG_DATABASE
      MYSQL_RANDOM_ROOT_PASSWORD: 1

  blog:
    image: brbcoffee/blog-base
    env_file: .env
    depends_on:
      - database
    environment:
      DB_HOST: database:3306
      DB_USER: $MYSQL_USER
      DB_PASSWORD: $MYSQL_PASSWORD
      DB_NAME: $MYSQL_BLOG_DATABASE
      WP_HOME: $WP_HOME_BLOG
      WP_SITEURL: $WP_SITEURL_BLOG
      XDEBUG_CONFIG: remote_host=172.17.0.1
    volumes:
      - "./blog/:/var/www/blog"
      - "./storage/media/blog:/var/www/blog/web/app/uploads"

  archive:
    image: brbcoffee/blog-base
    env_file: .env
    depends_on:
      - database
    environment:
      DB_HOST: database:3306
      DB_USER: $MYSQL_USER
      DB_PASSWORD: $MYSQL_PASSWORD
      DB_NAME: $MYSQL_ARCHIVE_DATABASE
      WP_HOME: $WP_HOME_ARCHIVE
      WP_SITEURL: $WP_SITEURL_ARCHIVE
      XDEBUG_CONFIG: remote_host=172.17.0.1
    volumes:
      - "./archive/:/var/www/archive"
      - "./storage/media/archive:/var/www/archive/web/app/uploads"

  proxy:
    image: brbcoffee/proxy
    env_file: .env
    ports:
      - "80:80"
      - "443:443"
    depends_on:
      - blog
      - archive
      - cv
      - subs
    # share the site code with the proxy (volumes_from is not supported in v3 compose files)
    volumes:
      - "./blog/:/var/www/blog"
      - "./storage/media/blog:/var/www/blog/web/app/uploads"
      - "./archive/:/var/www/archive"
      - "./storage/media/archive:/var/www/archive/web/app/uploads"
      - "./lolz:/var/www/lolz"

  mailhog:
    image: mailhog/mailhog
#    ports:
#      - "1025:1025"
#      - "8025:8025"

  cv:
    image: brbcoffee/cv
    volumes:
      - "./storage/resume/CV.xml:/app/data/CV.xml"

  subs:
    image: "brbcoffee/subs"

  lolz:
    image: php:7.3-fpm
    environment:
      - APP_ENV=prod
    volumes:
      - "./lolz:/var/www/lolz"

  lolz-cron:
    image: brbcoffee/lolz-cron
    environment:
      - APP_ENV=prod
    volumes:
      - "./lolz:/app

As you can see, a lot is managed in the .env file, and a lot of code is mounted in. The code mounting isn’t necessary for everything, and I’ll be tweaking it going forward, but for now I mostly wanted to get it live so I had an MVP to work from. There are also a lot of brbcoffee/* images here; those are built by a Makefile specific to the project. I factored the image builds out of docker-compose.yml to separate concerns a bit once the file started getting too hairy. The goal is to get rid of the droplet entirely and run the whole setup in kubernetes or something like that.
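The Makefile targets themselves are nothing fancy; they boil down to plain docker build calls along these lines (the paths and contexts here are illustrative, the real ones live in the repo and its submodules):

# build the shared base image used by the blog and archive containers
docker build -t brbcoffee/blog-base ./docker/blog-base

# build the proxy image with the site configs baked in
docker build -t brbcoffee/proxy ./docker/proxy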

One hiccup was ssl. The rest has actually been working for weeks, but I couldn’t figure out a clean way to do ssl. In the end I decided I’m OK with the certificates not renewing automatically in version one, so I just fetched a wildcard with certbot and built it into the proxy container for now.
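For reference, a wildcard from Let’s Encrypt has to be issued via the DNS challenge, so the manual fetch looks roughly like this; a sketch, and certbot will pause and ask you to create a TXT record before issuing:

# wildcard certs require the DNS-01 challenge
certbot certonly --manual --preferred-challenges dns \
    -d "brbcoffee.com" -d "*.brbcoffee.com"

# the resulting files under /etc/letsencrypt/live/brbcoffee.com/ (fullchain.pem, privkey.pem)
# are what gets copied into the proxy image at build time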

So there it is: all the stuff on brbcoffee now runs in docker containers under docker-compose. The blogs and the proxy live in the main repo, while the other services have their own repositories, pulled in as git submodules. I can toggle a single .env variable, add a build arg, and have node serve in dev mode, the blog containers run xdebug, and the python containers run a debugpy listener for full-stack local dev. Pretty cool stuff.

Certbot and apache

I promised a blog post detailing the changes I needed to make to my apache config in order to move BRBcoffee to Https, but in hindsight there isn’t much to write about; it’s basically just a refactor.

Certbot, the tool from EFF (written in Python, yay!) that gets ssl certs from Let’s Encrypt, doesn’t cope with monolithic conf files containing multiple virtual hosts. I run all my projects on the same server, routing traffic based on the site address using apache’s VirtualHost directive. It used to look like this:

<VirtualHost *:80>
    DocumentRoot "/var/www/blog"
    ServerName blog.brbcoffee.com
    # Some more directives
</VirtualHost>
<VirtualHost *:80>
    DocumentRoot "/var/www/archive"
    ServerName archive.brbcoffee.com
    # Some more directives
</VirtualHost>
<VirtualHost *:80>
    ProxyPreserveHost On
    ProxyRequests Off
    ServerName cv.brbcoffee.com
    ProxyPass / http://localhost:5000/
    ProxyPassReverse / http://localhost:5000/
</VirtualHost>

So what you need to do is rip all of that out of httpd.conf; you don’t want it in there. In its place you want this:

IncludeOptional conf.d/*.conf

Depending on how your distribution packages apache, this directive may already be somewhere in your httpd.conf. If it is, great, just leave it be. After that, you want to take each VirtualHost block you ripped out of the main httpd.conf and place it in its own file, like so:

<VirtualHost *:80>
    DocumentRoot "/var/www/blog"
    ServerName blog.brbcoffee.com
    # Some more directives
</VirtualHost>

blog.conf

The configuration doesn’t change; it just needs to live in a separate file for certbot to be able to do its thing. You see, after certbot goes out and gets your certificates, it needs to add some rules to each vhost to redirect traffic to ssl, and I guess they didn’t want to write a lot of ugly parsing code in a program that really isn’t about that (although it should be trivial with BeautifulSoup).

Anyway, before, running certbot --apache probably didn’t fully work: it got the certs for you, but couldn’t edit your conf to do what we want. Now, when you run it, it’ll complete just fine. If you choose to redirect all traffic to https, it will add three redirect lines to each of your conf files, and it will also create a new file, in my case blog-le-ssl.conf. That file is identical to the old conf file, except that it listens on port 443 and checks that mod_ssl is loaded. All of this is stuff we could have done ourselves, of course, but it’s a convenience thing.
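Concretely, the rerun is just the apache plugin again, pointed at the domains you split out; a sketch with my hostnames as examples (--redirect answers the https question up front):

# let certbot find the vhosts in conf.d/, fetch certs, and rewrite the configs
certbot --apache -d blog.brbcoffee.com -d archive.brbcoffee.com -d cv.brbcoffee.com --redirect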

So that’s all there is to it. Refactor your httpd.conf by separating each VirtualHost into its own file, and run certbot --apache again.

Https

Welcome to Https BRBcoffee, and thank you to Let’s Encrypt and certbot for making it a breeze, mostly.
Tomorrow there will be a blog post up about the changes I needed to make to my Apache configuration in order to get certbot to play nice.

For now, glory in that green padlock! And don’t go to the archive if you want it to stay green: WordPress doesn’t automatically update image links, so I’ll need to fix that at some point. The CV, though, I had no trouble with, even though it’s a Flask app that Apache just proxies traffic to. Good job, Apache; good job, python!

Configuring a linux firewall

So you’ve got your Linux server going, it’s configured the way you want it, and everything is coming up roses. You try to ping it, but the server doesn’t seem to exist. You’ve been blocked by one of the best/most insane firewalls in the galaxy: iptables. A firewall’s job is to block unwanted traffic, and iptables takes its job seriously. By default, it. drops. everything. Got an http request incoming? Psh, drop that packet. I don’t care if you’ve got apache running. FTP request? Same story. Mysql? Nope.

[Image: a cat shoving things off a desk. This is iptables.]

Ssh is usually fine though, so we can log in and edit the rules. Iptables rules are added to rule chains. The only chain we’re interested in for now is the INPUT chain: we want to be able to receive http requests to our server, ssh connections, and nothing else. We’ll also want to allow existing connections to persist. These are the switches we’ll be using (you can find all of these in the manpages, of course, though some are in the iptables-extensions manpage).

  • -F flushes the rule chains. This means exactly what you’d think.
  • -A [chain] appends a rule to the end of the specified chain.
  • -I [chain] [number] same as -A, but inserts the rule at a given point in the chain.
  • -i [interface] specifies the interface the rule will act on. If you don’t specify this, the rule acts on all interfaces, including loopback (more on this later).
  • -p [protocol] specifies whether the rule is for tcp, udp, or whathaveyou.
  • --dport [port] further narrows down the packets to look at by checking which port they’re headed for.
  • -m [match] loads an extension that looks at the kind of traffic the packet belongs to. We use it with:
  • --state [state], which asks a different module called conntrack whether the connection is INVALID, NEW, ESTABLISHED, RELATED, or UNTRACKED. This is magic; I have only a vague understanding of how it works.
  • -j [policy] says whether to accept or drop the packet.

Alright, let’s get to it. You can think of iptables as a sieve, where every rule along the way checks out a packet and decides whether to keep it or discard it. If a rule doesn’t match the packet, the packet moves further down the sieve to be checked by the other rules. At the end, if the packet hasn’t matched any of our rules, we just discard it. A good policy for internet traffic is that if you don’t know what it is, don’t touch it. Every rule we add gets added last in the chain/sieve.

A script demonstrating the use of iptables
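Something like this minimal sketch, assuming we only want ssh (22) and http (80) in, plus loopback and already-established connections; the last rule is the catch-all drop described above:

#!/bin/bash
# start from a clean slate
iptables -F

# the loopback interface is trusted; plenty of local services talk over it
iptables -A INPUT -i lo -j ACCEPT

# let replies to connections we initiated (and related traffic) back in
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# ssh and http from the outside world
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j ACCEPT

# anything that didn't match a rule above gets dropped
iptables -A INPUT -j DROP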

And that’s it. We’ve configured our firewall. It will reset every time you reboot your server, but that doesn’t happen often; I just keep a script like the one above around to reconfigure it. You can get NetworkManager to configure it for you on boot, but I don’t really see the point unless you reboot your server all the time, which, I mean, why would you do such a thing?

Plugin rot and how to do regex replaces in mysql

I’ve spent a few hours over the past week restoring my old blog, which can now be reached at brbcoffee.com/archive/.  This will be the final chapter in that saga.

My old blog is sort of old. It started out on MySpace, moved to Google’s BlogSpot platform, then moved to Tumblr for a while because I got some new authors on board. Finally we moved to a WordPress solution.  From there on BRBcoffee has been on WordPress, but it’s moved around hosts like an alcoholic goes from bar to bar trying to get his fix. I’ve employed a bunch of plugins to do various things for me, from importing and exporting posts to providing language switching capabilities to advanced caching to image scaling to… you get the picture. This is all fine and dandy, but it does pose a few problems when you move around like I do.

  1. The image scaling edits your wp_posts database table, naturally. When you move, the scaled images don’t necessarily come along, resulting in a lot of broken links.
  2. The translation plugin has its own tags in that same database table, and keeps both languages of a post in one field. When the plugin isn’t there anymore, or isn’t compatible with your WordPress version, the tags become visible on the site, and you get two languages at once.
  3. Plugins are often developed by lone wolf devs, and they are never up to date when you check up on them years later.

So, we need to clean up the mess they’ve created, and we’re not installing Yet Another Plugin™ to try to fix the problem; we need to dig into our database and fix things ourselves. There’s only one problem: Mysql is made for storing, finding, updating, removing, etc. It isn’t the tool for the job when you need to do complex things like searching for a pattern and replacing only the part of the string that matches it. It can be done, I’m sure, but why would you, when there are better tools for the job? Here’s how I do it:

  1. Export the table you need to edit with
    mysqldump -u USER -p DATABASE TABLE > dumpfile.sql
  2. Do work on the dumpfile with specialized tools like sed, awk, or even vim when in a pinch (see the sketch after this list).
  3. Empty the table before reimporting the modified dump with
    mysql -u USER -p DATABASE -e "delete from TABLE;"
  4. Finally reimport the table with
    mysql -u USER -p DATABASE < dumpfile.sql
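To make step 2 concrete, here’s the kind of thing I do; a sketch only, since the exact patterns depend on which plugins left their droppings behind (the hosts, paths, and tag format below are made up for illustration):

# point old scaled-image links at the new location (host and path are placeholders)
sed -i 's|http://old.example.com/wp-content/uploads|http://new.example.com/wp-content/uploads|g' dumpfile.sql

# strip leftover language tags like [:en] and [:] that a translation plugin left in post bodies
sed -i 's/\[:[a-z]*\]//g' dumpfile.sql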

Mysql also comes with a tool called mysqlimport, but I had a problem with it, so I just used the standard mysql command in the end.  So what do those switches mean?

  • -p means this user has a password, ask me for it before doing anything
  • -u USER means use this username. Pretty obvious.
  • -e means execute. Execute the following quoted code.

The only weird thing is that -u takes an argument, but -p doesn’t. It looks like DATABASE should be the argument to -p, but it isn’t. DATABASE is simply the database you want to do work on. In my case, it was called “archive,” while yours might be “blog,” “sales_records,” or even “naughty_movies.” That’s commandline switches for you. In my opinion, a single dash should be reserved for switches, double for arguments, like so:

  • -p [ask me for a password]
  • --user <name> [I'm logging in as name, how quaint]

Oh well, you get used to it. That’s how you should do bulk, complex updates on a migrated database. This isn’t for when you want to do one small change on a live system: the third step removes all records from the specified table, so your system will effectively be offline until you’ve reimported the changed data.

If you wish this were simpler, you’re not alone. Behavior like this has been a feature request in mysql since 2007. Go there and click “affects me”, and maybe in another 7 years we’ll have it built in. Till then, use the method described above, and for heaven’s sake, don’t let the plugin rot get your blog too. If you can do it without a plugin, do it without a plugin.