No gamma death rays here

Thinking about Solr

Docker, Drupal, and Solr: thoughts on creating an Alpine Docker image for Apache Solr

Callback Insanity

Solar activity on a pulsar neutron star. Image credit: NASA.gov.

Today, I began my day by looking at this monstrosity of a Docker image for Apache Solr:

https://hub.docker.com/layers/solr/library/solr/8.7-slim/images/sha256-4537181cdb9dce9e431fd7c9aa98618505370d2320a9ee5a961e3ff44f8c8b49?context=explore

The image is fresh (from this month), and for what it’s worth, it is as well constructed as it can be; it is, after all, the official image from the World Wide Web Supremos over at the Apache Foundation.

But Dear Lord, isn’t it unabashedly large, sitting at 267.24 MB! That’s just the first thought of the day. The image uses a Debian/Ubuntu base, as betrayed by the apt-get usage in the Dockerfile commands, and that could be part of the size. The other reason it’s so large is that Solr is (and always has been) a Java-dependent technology. As that Dockerfile shows, the official Apache Solr image uses an open source build of Java, since that might help keep papa Ellison off your back. Here’s the Java runtime it downloads to power Solr (a JRE build, going by the file name):

  • https://github.com/AdoptOpenJDK/openjdk11-upstream-binaries/releases/download/jdk-11.0.10%2B9/OpenJDK11U-jre_x64_linux_11.0.10_9.tar.gz

Here are some nice references to visit later and read up on: AdoptOpenJDK GitHub, AdoptOpenJDK official site.

The Java layer adds about 40 MB on top of the base distribution, but the biggest ka-chonk of fat comes from this 191.65 MB layer. This last layer downloads and installs the Solr distribution proper:

then SOLR_DOWNLOAD_URL="$SOLR_DOWNLOAD_SERVER/$SOLR_VERSION/solr-$SOLR_VERSION.tgz";
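To see what that line expands to, here is a minimal sketch that fills in the variables by hand. The values are assumptions for illustration: the version is the 8.7.0 release discussed in this post, and the server is Apache's archive site, which is not necessarily the download server the official Dockerfile uses.

```shell
#!/bin/sh
# Assumed values for illustration; the official Dockerfile defines its own.
SOLR_VERSION="8.7.0"
# Assumption: Apache's archive site, not necessarily the image's real download server.
SOLR_DOWNLOAD_SERVER="https://archive.apache.org/dist/lucene/solr"

# Same composition as the Dockerfile fragment above.
SOLR_DOWNLOAD_URL="$SOLR_DOWNLOAD_SERVER/$SOLR_VERSION/solr-$SOLR_VERSION.tgz"

echo "$SOLR_DOWNLOAD_URL"
```

With these assumed values the script prints https://archive.apache.org/dist/lucene/solr/8.7.0/solr-8.7.0.tgz, which is the single tarball that ends up as the largest layer.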

You can read more about Apache Solr over at its official homepage, where you’ll find official documentation and information about the latest releases.

APACHE SOLR™ 8.7.0

Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene™.

Over at their download page, I downloaded the latest stable version 8.7.0, for science (and then cancelled the download midway).

The compressed binary package sits at 192 MB, and the source alone is 75 MB.

75 MB of code is a lot!

With a pre-built binary so large (and that’s compressed), the options for building a small Alpine image for Solr seem limited, and something about the Law of Diminishing Returns comes to mind with regard to undertaking such an effort.

Could the official Apache Solr GitHub page reveal some hidden secrets about building the binaries from source that end up providing a leaner and meaner Apache Solr binary? Maybe. Worth checking out.

For future reference, Maarten Blokker was thinking along the same lines (Alpine, Java JDK, minus the Solr component) over at the de Bijenkorf tech blog in his article Creating the smallest JVM microservice deployment. After three rounds of build optimization for the JDK, Mr. Blokker was able to get the JDK build artifacts down to 94.5 MB.

Apache Solr: Drupal module compatibility

The OG Apache Solr search Drupal module of yore, “Apache Solr Search”, is sunset for Drupal 8 and available only for Drupal 7 and below.

The Acquia documentation for Solr search indicates which supported Drupal modules you can use to connect with their search services, and confirms that the original apachesolr module is dead in the water for Drupal 8 and above, at least if you want to connect with Acquia’s Solr servers.

Solr search support matrix for Drupal modules and Acquia. Credit: acquia.com

Over at Drupal.org, the Search API Solr module says on its compatibility matrix that it supports the latest versions of Solr (8.x), some good news to keep in mind.

The Search API Solr module, in turn, depends on the Search API module and provides a Solr backend on top of it. Ixnay on the Drupal 9 support for both modules, at least “officially” from the Acquia Solr search side of things.

Closing Thoughts

If anyone were to build their own Docker image for Apache Solr, with the goal of integrating that Solr service with a Drupal container, they would be advised to check which versions of Solr the existing Drupal modules are compatible with, as well as interoperability with Acquia Search in case they want to use Acquia Search for production environments.

As for thoughts of having a petit Alpine image for Solr weighing less than 100 MB, that’s more of a distant fever dream. From the preliminary research I gathered, it seems that between the OpenJDK (even when optimized for size) and the Solr binaries, it would probably be a diminishing-returns effort to embark on such an attempt. But maybe something still worth doing someday, because science!
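The size argument can be checked with quick back-of-the-envelope arithmetic. The sketch below uses only figures quoted in this post (the 267.24 MB image, the 191.65 MB Solr layer, and the 100 MB wish); the variable names are made up for illustration.

```shell
#!/bin/sh
# Figures quoted earlier in this post, in MB.
total_mb=267.24        # official solr:8.7-slim image
solr_layer_mb=191.65   # layer that installs the Solr distribution
target_mb=100          # the "petit image" wish

# Everything in the image that is not Solr itself (base distribution plus Java).
non_solr_mb=$(LC_ALL=C awk -v t="$total_mb" -v s="$solr_layer_mb" 'BEGIN { printf "%.2f", t - s }')

# How far the Solr layer alone overshoots the target, before adding any OS or Java.
over_target_mb=$(LC_ALL=C awk -v s="$solr_layer_mb" -v g="$target_mb" 'BEGIN { printf "%.2f", s - g }')

echo "Everything except Solr: $non_solr_mb MB"
echo "Solr layer alone exceeds the target by: $over_target_mb MB"
```

In other words, swapping the base for Alpine and slimming Java can only touch the roughly 75 MB that is not Solr; the Solr layer by itself is already well past the 100 MB mark.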

  • The official Apache Solr images are probably the most practical way to have your own Solr container.
  • You could still use the official image as your base image, then add some bespoke customizations on top of it that fit your particular Drupal development profile.
  • As an alternative, you could build the OpenJDK from scratch (via Maarten Blokker).
  • Or you could build Solr from scratch, too (thank you Kai Chan).

Note that as of version 9, Solr switches from Ant to Gradle for building. But insofar as Drupal 8/9 is involved, there is no support for Solr 9 yet, so that should not be a concern.
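For the second option in the list above (official image as the base, plus your own customizations), a rough sketch could look like the script below, which writes a small Dockerfile. Treat every name in it as an assumption: the "drupal" core name, the drupal-configset directory (expected to hold a conf/ folder with your Solr config), and the solr-drupal build folder are hypothetical, and the use of the solr user, the configsets path, and the solr-precreate helper reflects my understanding of the official image rather than anything verified in this post.

```shell
#!/bin/sh
# Hypothetical build folder for the customized image.
build_dir="solr-drupal"
mkdir -p "$build_dir"

# Write a Dockerfile that layers Drupal-specific config on the official image.
cat > "$build_dir/Dockerfile" <<'EOF'
# Start from the official image discussed in this post.
FROM solr:8.7-slim

# Assumption: the image runs as the "solr" user and keeps config sets
# under /opt/solr/server/solr/configsets.
# "drupal-configset" is a hypothetical local directory containing conf/.
COPY --chown=solr:solr drupal-configset /opt/solr/server/solr/configsets/drupal

# Assumption: solr-precreate is the helper in the official image that
# creates a core from a config set before starting Solr.
CMD ["solr-precreate", "drupal", "/opt/solr/server/solr/configsets/drupal"]
EOF

echo "Wrote $build_dir/Dockerfile"
# To build (requires Docker, with drupal-configset placed inside the build folder):
#   docker build -t my-solr-drupal "$build_dir"
```

This keeps the heavy lifting (Java, Solr, startup scripts) with the upstream maintainers and limits your own layer to a few kilobytes of configuration, which fits the easy-mode end of the spectrum.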

Whichever route you go for containerizing Solr, hopefully this article highlighted some of the alternatives available to you — from easy mode to try hard mode.
