An easy tutorial to help you get up and running with TYPO3 and Solr

Introduction

This tutorial shows you how to set up a TYPO3 web site that uses Solr for site search. I'm going to lay out every detail of the setup I used so that you can duplicate it. I recommend using the same major versions of the software that I use so that this tutorial will work for you.

I created this tutorial because I had tried in the past to get TYPO3 working with Solr but I just couldn't figure it out. The documentation for EXT:solr (what I will call the TYPO3 extension for Solr from this point on) is extensive, but I needed a more newbie-friendly guide, so I created one. You don't have to be a Solr expert to follow this tutorial, but I recommend that you at least follow the official Solr tutorial available from the Solr web site so that you have a basic understanding of how Solr works.

This tutorial shows you how to install Solr and configure it to work with a fresh installation of the TYPO3 Official Introduction Package in easy steps from start to finish. If you follow every step with the setup I describe below, using the same operating system, the same version of Solr, and the same version of TYPO3, everything is almost guaranteed to work.

Please ask me any questions that you have about this tutorial. Constructive criticism is also welcome. I want to use your questions and criticism to make this tutorial better.

My personal setup that I'm using for this tutorial

Before we start, I want to give you some more specifics of the setup that I'm using to make this tutorial. It's best that you use a similar setup so that you are more likely to achieve the desired result - a TYPO3 web site that is configured to work properly with Solr.

My base system for this tutorial consists of an OpenVZ based VPS with 4GB of RAM with Ubuntu 16.04 (64 bit) as the operating system. I've also walked through the steps of this tutorial with a VirtualBox machine with 2GB of RAM and Ubuntu 16.04 (64 bit). With tutorials like this, it's best to have a system that is as similar as possible to the one used in the tutorial's creation.

My TYPO3 setup is a fresh install of the TYPO3 Official Introduction Package, version 8.7.16. The Official Introduction Package was installed at the time TYPO3 was installed during the installation step that asks "Want a pre-configured site?". Nothing has been added to or modified from this default TYPO3 Introduction Package installation before the following tutorial steps. So if you install the Official Introduction Package on Ubuntu 16.04 you should be fine. 

After you follow this tutorial and you've had a chance to look around and play with TYPO3 and Solr, you will have an understanding that will help you configure your regular TYPO3 sites to use Solr.

As a side note, I also use Virtualmin GPL to manage virtual servers and other stuff related to the LAMP stack that runs TYPO3 for me. Virtualmin is by no means a requirement to follow this tutorial. I mention it only because I want to be transparent about my complete setup. You should be able to install TYPO3 any way you like, with or without server management software like Virtualmin or cPanel, and this tutorial should work for you.

Basic Requirements Before You Start

So here's what you'll need to have before you follow this tutorial:

  • A server running Ubuntu 16.04. You could try a different Ubuntu version or a different OS if you want, but your results may vary. Use Ubuntu 16.04 64-bit if you want the same OS I used, and this should work out for you perfectly. I used the Ubuntu 16.04 image provided by my VPS hosting provider. I also had success with a VirtualBox install of Ubuntu 16.04 using the 16.04 server image.
  • A default installation of the Official Introduction Package for TYPO3, running on a fresh install of TYPO3. Make sure that the Official Introduction Package is installed at the time you install TYPO3. In other words, when you are asked near the end of the TYPO3 installation steps if you "Want a pre-configured site?", select the option that says, "Yes, download the list of distributions." Then when you first log in to the backend of TYPO3 you will see a list of TYPO3 distributions. Install the first distribution listed, which is labeled, "The official Introduction Package."
  • Root privileges and shell access for your server are required. Using sudo is fine.
  • You should have a decent understanding of Linux and Linux commands like "cp, mv, ls" etc. You should know your way around a Linux server. I'd like to think my instructions are clear enough that you could follow them without much Linux experience though.
  • An understanding of Solr would be helpful. I'd recommend following the official Solr tutorial, which will quickly walk you through the basics of Solr. That tutorial is for a more recent version of Solr than the one you will install here, but I think it is better written than the tutorial for the version of Solr that we will install below. It won't matter if you use a different version of Solr for learning, because the concepts are the same.

 

The Instructions

Make sure that you run all of the commands with root privileges.

Install Java

Running the commands below will install Oracle Java 8 on your Ubuntu 16.04 server. Java is required to run Solr. You could install the OpenJDK instead, but I chose to install Oracle Java for no particular reason.

First update your system:

 

apt update && apt upgrade

 

Install the software-properties-common package that will install tools to let you add a repository:

 

apt install software-properties-common

 

Then add the Java repository. This will let your system download the Java installer that will install Oracle's Java. You'll need to press [ENTER] when you're prompted to.

 

add-apt-repository ppa:webupd8team/java

 

Now update your package list:

 

apt update

 

Now install Oracle's JDK with the following command. You will be prompted to accept the "Oracle Binary Code license terms" and you must do so.

 

apt install oracle-java8-installer
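
Once the installer finishes, it's worth confirming that Java is actually on your PATH before moving on to Solr. Here is a minimal sketch of such a check. The function name check_java is made up for illustration and is not part of any package. With Java 8 the reported version string should start with 1.8.

```shell
# Print the first line of `java -version`, or fail if java is missing.
# (check_java is a made-up helper name, not part of any package.)
check_java() {
    if command -v java >/dev/null 2>&1; then
        # java prints its version to stderr, so redirect it to stdout
        java -version 2>&1 | head -n 1
    else
        echo "java not found on PATH" >&2
        return 1
    fi
}

# Usage:
#   check_java
```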

Install Solr

As of right now, the most current version of Solr is 7.4.0, but according to the EXT:solr documentation, the most recent version of Solr that is supported for the EXT:solr extension is Solr 6.6.5. For that reason, the following instructions show how to install Solr version 6.6.5 instead of the most recent Solr version. 

Use the following steps to install Solr 6.6.5 on your system.

Change to a directory where you'd like to download the Solr package. I chose to download it to the /tmp directory.

Then download the Solr package with wget using the following command:

 

wget https://www.apache.org/dist/lucene/solr/6.6.5/solr-6.6.5.tgz

 

When I first wrote this tutorial the above link referenced the 6.6.3 version of Solr. I noticed later that the link was no longer active and I received a 404 error when trying to download it. If the above link fails for you with a 404 because the version of Solr available has changed, then try to download a version of Solr that is as close to the version above as possible. Remember though that if the version is changed, then any instructions that follow which reference the Solr version must also be updated to reflect a different Solr version.
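
Since the version number shows up in several later commands, one way to keep them in sync is to hold it in a single variable. The sketch below is only an illustration: solr_tarball_url is a made-up helper name, and the fallback to archive.apache.org assumes that older releases are moved to the Apache archive under the same directory layout.

```shell
# Keep the Solr version in one variable so every later command agrees.
SOLR_VERSION="${SOLR_VERSION:-6.6.5}"

# Build the tarball URL for a version. The second argument is an
# optional base URL; solr_tarball_url is a made-up helper name.
solr_tarball_url() {
    printf '%s/%s/solr-%s.tgz\n' \
        "${2:-https://www.apache.org/dist/lucene/solr}" "$1" "$1"
}

# Usage (try the main site first, then the Apache archive):
#   wget "$(solr_tarball_url "$SOLR_VERSION")" ||
#   wget "$(solr_tarball_url "$SOLR_VERSION" https://archive.apache.org/dist/lucene/solr)"
```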

Now extract the install script from the archive.

 

tar xzf solr-6.6.5.tgz solr-6.6.5/bin/install_solr_service.sh --strip-components=2

 

Then run the install script

 

sudo bash ./install_solr_service.sh solr-6.6.5.tgz

 

The install script will run and put most of the Solr files in /opt/solr/ and the writable files in /var/solr/. The install script will also install Solr as a service that will run on boot.

When the Solr install script ends, it will start Solr as a service running on port 8983. The install script will also create a user named solr. After the install script runs, your terminal may be filled with log output and you may not have a shell prompt that is available for you to type into. If that happens it is okay to close your terminal window and open a new one because Solr will still be running. Or you can press CTRL+C to get a command prompt. 

Also, if you want, you can visit the Solr admin URL at http://YourServerAddress:8983/solr/#/ and have a look around.
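
If you'd rather check from the shell than from a browser, something like the sketch below should do. It assumes curl is installed (the tutorial installs it a bit later with apt install curl) and that Solr is running on the same machine on port 8983. The name solr_responds is made up for illustration.

```shell
# Return success if the Solr system-info endpoint answers.
# solr_responds is a made-up helper; the default URL assumes Solr
# runs on the same machine on port 8983.
solr_responds() {
    curl -fsS "${1:-http://localhost:8983/solr}/admin/info/system?wt=json" >/dev/null
}

# Usage:
#   service solr status
#   solr_responds && echo "Solr is answering on port 8983"
```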

Install the Solr extension for TYPO3

Remember, you should already have TYPO3 installed with the Official Introduction Package as described in the beginning of this tutorial.

Follow these steps to install the Solr extension for TYPO3.

From the backend of your TYPO3 Official Introduction Package installation, go to the extension manager and search for and install the solr extension. The scheduler extension will also be installed automatically as a dependency. If by any chance the scheduler extension isn't installed automatically you should install the scheduler extension.

 

Configure Solr to work with TYPO3 and EXT:solr

In these steps you will copy the special Solr configuration that comes with the EXT:solr extension to the filesystem locations where Solr is installed. Solr needs these configuration files so that it will know how to index TYPO3 content.

Run the following two commands to copy the TYPO3 specific files to your Solr installation, replacing the path to EXT:solr used below with the correct path for your own TYPO3 installation:

 

cp -r /home/typo3andsolr/public_html/typo3conf/ext/solr/Resources/Private/Solr/configsets /var/solr/data

 

 

cp /home/typo3andsolr/public_html/typo3conf/ext/solr/Resources/Private/Solr/solr.xml /opt/solr/server/solr/solr.xml
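
Because the path to EXT:solr will differ on your system, it can help to build it once from your TYPO3 document root and reuse it. This is just a sketch: ext_solr_resources is a made-up helper name, and the document root in the usage lines is the one from this tutorial.

```shell
# Path of the Solr resources that ship with EXT:solr, given the
# TYPO3 document root (ext_solr_resources is a made-up helper name).
ext_solr_resources() {
    # ${1%/} drops a trailing slash so the result has no double slash
    printf '%s/typo3conf/ext/solr/Resources/Private/Solr\n' "${1%/}"
}

# Usage, with the document root from this tutorial:
#   SRC="$(ext_solr_resources /home/typo3andsolr/public_html)"
#   ls "$SRC/configsets"     # shows the configset name, e.g. ext_solr_8_1_0
#   cp -r "$SRC/configsets" /var/solr/data
#   cp "$SRC/solr.xml" /opt/solr/server/solr/solr.xml
```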

 

Next, you will use the curl command to call the Solr API to create three Solr cores. Three cores are needed because the TYPO3 Official Introduction Package has three languages configured by default (English, German, Danish), so there needs to be a separate Solr core for each language. Otherwise, content from all languages would be in a single Solr core, and that's probably not a good idea.

If you don't have curl installed, you can install it on your system with the following command:

 

apt install curl

 

Now you will use Solr's REST API to create the three Solr cores that will be used to index the web site content for the three default languages (English, German, Danish) of the TYPO3 Official Introduction Package.

Note: At the time this tutorial was written, the version of EXT:solr used was 8.1.0. Make sure to check what version of EXT:solr you have and replace the version number in the commands below. In the curl commands, replace the dots in the version number with underscores (for instance, 8.1.0 becomes 8_1_0).

 

curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=core_en&configSet=ext_solr_8_1_0&schema=english/schema.xml&dataDir=../../data/english"

 

curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=core_de&configSet=ext_solr_8_1_0&schema=german/schema.xml&dataDir=../../data/german"

 

curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=core_da&configSet=ext_solr_8_1_0&schema=danish/schema.xml&dataDir=../../data/danish"
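
The three calls only differ in the core name and the language directory, so they can also be generated from the EXT:solr version. The sketch below does that and converts the dots to underscores for you. core_create_url is a made-up helper name, and the sketch passes the dataDir value as a plain relative path.

```shell
# Build the CREATE URL for one core. Arguments:
#   $1 = core name, $2 = schema directory, $3 = EXT:solr version
# (core_create_url is a made-up helper name.)
core_create_url() {
    # 8.1.0 -> ext_solr_8_1_0
    configset="ext_solr_$(printf '%s' "$3" | tr '.' '_')"
    printf 'http://localhost:8983/solr/admin/cores?action=CREATE&name=%s&configSet=%s&schema=%s/schema.xml&dataDir=../../data/%s\n' \
        "$1" "$configset" "$2" "$2"
}

# Usage: create the three cores, then list the cores Solr knows about.
#   for pair in core_en:english core_de:german core_da:danish; do
#       curl "$(core_create_url "${pair%%:*}" "${pair##*:}" 8.1.0)"
#   done
#   curl "http://localhost:8983/solr/admin/cores?action=STATUS&wt=json"
```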

Configure the EXT:solr extension in TYPO3

Include the static templates for EXT:solr in your site template

From the WEB->Template module of your TYPO3 site, edit the main template of your TYPO3 site to include the following two static templates that come with the EXT:solr extension.

  • Search - Base configuration (solr)
  • Search - Default stylesheets (solr)

Edit the "Constants" section of your site template

You also need to add the following TypoScript to the Constants section of your "Introduction Package" template. This TypoScript tells the EXT:solr extension how to contact your Solr server. It also tells the EXT:solr extension which of the Solr cores you created should be used to store data for each of the three languages of your TYPO3 Official Introduction Package site.

 

plugin.tx_solr.solr {
   scheme = http
   port   = 8983
   path   = /solr/core_en/
   host   = 127.0.0.1
}

[globalVar = GP:L = 1]
plugin.tx_solr.solr.path = /solr/core_da/
[end]

[globalVar = GP:L = 2]
plugin.tx_solr.solr.path = /solr/core_de/
[end]
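
Before going further, you can check from the shell that each core path used in these constants actually answers. The sketch below assumes Solr's standard ping handler is available at admin/ping under each core and that Solr runs on 127.0.0.1:8983 as in the constants. ping_core and check_cores are made-up helper names.

```shell
# Ping one core; this assumes the core serves the standard ping
# handler at <core>/admin/ping. (ping_core is a made-up helper name.)
ping_core() {
    curl -fsS "http://127.0.0.1:8983/solr/$1/admin/ping?wt=json" >/dev/null
}

# Report on every core named in the TypoScript constants.
check_cores() {
    for core in core_en core_da core_de; do
        if ping_core "$core"; then
            echo "$core: OK"
        else
            echo "$core: not reachable"
        fi
    done
}

# Usage:
#   check_cores
```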


Optional step if you are not using the Official Introduction Package for TYPO3

Here is something you'll need to do only if you happen to not be using the Official TYPO3 Introduction Package.

The Introduction Package already includes the following TypoScript, but if you are not using the Introduction Package, you will need to add it to the Setup section of your site's template.

 

config {
    index_enable = 1
}

 

Remember, you won't have to add this TypoScript if you're using the Official Introduction Package. I am including it here for those who may not be using the Official Introduction Package, or for those who wish to configure other sites to be indexed with Solr.

Add a domain record

You must add a domain record to the root page of your site for the EXT:solr extension to work.

From the list module, click on the name of the root page of the Introduction Package, which is titled "Congratulations". Then click on the "+" sign icon at the top of the page to create a new record. Select "Domain" from the list of kinds of records you can create. Then fill in the form with your domain name. Click the save button.

Initialize the Solr Connections

Now you must initialize the Solr connections. You can do that by opening the Cache menu by clicking the "lightning bolt" icon at the top of the TYPO3 backend and then clicking the menu item that says, "Initialize Solr Connections". If at first you do not see the "Initialize Solr Connections" option in the menu, log out of TYPO3 and log in again, and you should then see the menu item.

Once you have initialized the Solr connections, go to the Reports module from the TYPO3 backend and then to the Status Report view. Under the Solr heading you shouldn't see any errors. You should see that for each of the languages of the TYPO3 Official Introduction Package there is a separate Solr core specified for it and that there are no errors mentioned related to Solr. If everything looks good in the Reports module related to Solr then you are ready to move on to the next step.

 

Index Your Site

Add Page Records to the Index Queue

From the APACHE SOLR module of the backend, open the Index Queue module. This is where you will tell the EXT:solr plugin what to index.

Under the heading Index Queue Initialization select the checkbox next to where it says "pages".

Now click the button that says Queue Selected Content for Indexing. The page records will then be added to the index queue. The index queue tells the EXT:solr extension what needs to be indexed.

Create a scheduled task to process the index queue

Now go to the Scheduler module.

Create a new scheduled task. For the "Class" field of the form, select Index Queue Worker. You will receive an error message when you save if you don't enter a Frequency, so you can enter "1" for the frequency. Everything else can be left at the default values.

Running the scheduled task to index the site in Solr

Now go back to the Scheduler module, select the task you created, and run it. It might take a long time before you see any progress on the progress bar under the task. Your site's pages will be indexed.

This has been the step that I've run into the most trouble with. Oftentimes the scheduler task would never complete, or it would result in an "Internal Server Error". What I learned was that this was mostly due to low PHP resource limits or possibly that the server specs were too low on RAM or processing power. On a decent system with PHP settings set to what is required by TYPO3, I usually don't run into any problems.

You may find that you receive errors when indexing, or that the script times out. If so, you may need to adjust your PHP max_execution_time or other variables accordingly.

If everything is working as it should, you will be able to visit your Solr administration URL at http://YourServerAddress:8983/solr/#/core_en (substitute core_en with a different core if your site isn't in English) and see that there are some documents indexed.
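
You can also ask Solr for the document count directly from the shell. The sketch below queries the English core for all documents with zero rows returned and pulls the numFound value out of the JSON response. extract_num_found is a made-up helper name, and it uses sed rather than a real JSON parser, so treat it as a quick check only.

```shell
# Pull the numFound value out of a Solr JSON response read from stdin.
# (extract_num_found is a made-up helper; it uses sed rather than a
# real JSON parser, so treat it as a quick check only.)
extract_num_found() {
    sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage: count the documents in the English core.
#   curl -s "http://localhost:8983/solr/core_en/select?q=*:*&rows=0&wt=json" | extract_num_found
```

A number greater than zero means the index queue worker has written documents to that core.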

When the progress bar for the scheduler task reaches 100% that means your site is indexed and you can move on to the next step.

Alternatively, you can index the site from the APACHE SOLR -> Index Queue module by clicking the "Index now" button underneath the Index Queue Status bar. But it is important that you get indexing working through the Scheduler extension, because that will allow you to automate indexing.

Add a search form to your site and search your site

Now that you have your web site indexed, you can use the plugin included with EXT:solr to include a search form on your site. The same plugin will also display search results.

Do this by creating a new page under the root page of your site (this is the page named "Congratulations") and name your new page "Search". This will be your search page where you will insert the plugin that displays the search form and the search results. Make sure the Search page is unhidden so that it will be visible in the front end of the web site.

On your new Search page, create a new content element. When you are viewing the "New content element" wizard, select the tab that says "Plugins" and then insert the element titled "Search", which is described as "A search form and results list."

Then just click the "Save" button and when you visit your Search page in the frontend you'll see a search form and a button to submit your query. Try searching for "content elements".

Set up TYPO3 to index pages automatically

You already created a scheduler task, but you have so far run it manually to index pages.

Instead of visiting the Scheduler module every time you want to index new pages on your TYPO3 web site, you can set up a cron job that will index new pages automatically as frequently as you desire. That means that any time you create a new page on your site, it will be added to the Index Queue and the scheduler task will automatically index the new pages at regular times.

There is a special command you need to have the cron job run. Based on my TYPO3 install location it looks like what you see below. You must modify it according to your TYPO3 install location.

 

/home/typo3andsolr/public_html/typo3/sysext/core/bin/typo3 scheduler:run --task=1

 

Notice the end part that says --task=1. This simply means that the scheduler script will run the task with scheduler task UID = 1. Since you started with a bare install of the Official Introduction Package, the number for your task should be "1", but if you set up the task on a TYPO3 site with existing scheduler tasks, you will need to replace "--task=1" in your cron command with the UID of the task that processes the index queue.
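
To give an idea of what the finished cron entry might look like, here is a small sketch that prints a crontab line for a given interval, script path, and task UID. The five-minute interval in the usage line is only an assumption, and typo3_cron_line is a made-up helper name. You will probably want to add the line to the crontab of a user that is allowed to write to your TYPO3 installation.

```shell
# Print a crontab line that runs one scheduler task every N minutes.
#   $1 = minutes between runs, $2 = path to the typo3 CLI script,
#   $3 = scheduler task UID (typo3_cron_line is a made-up helper name)
typo3_cron_line() {
    printf '*/%s * * * * %s scheduler:run --task=%s\n' "$1" "$2" "$3"
}

# Usage: print the line, then paste it in with `crontab -e`.
#   typo3_cron_line 5 /home/typo3andsolr/public_html/typo3/sysext/core/bin/typo3 1
```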

Once you have set up a cron job to run your Index Queue Worker task automatically, you can create some new pages and they will be automatically indexed when your cron runs. 
 

What's next?

Well, that's all for my tutorial. This should get you started with using TYPO3 and Solr together with the EXT:solr extension. There's a lot more to learn though, and there are a lot of cool features to try out from the manual of EXT:solr.

If you are going to set this up on a live server, it's also important to secure Solr. By default, a Solr installation has no authentication, so anyone on the internet who can reach port 8983 on your server can use it.

Also, once again, if you have any questions, just ask!

Links to other sources of information on setting up Solr with TYPO3 and the solr extension for TYPO3

I read a lot of other sites when I was coming up with this tutorial. Here's a list of sites that you may find useful:

 

Apache Solr for TYPO3 - Enterprise Search
This is the page for the extension on the TYPO3 extension repository. 

The TYPO3 Solr extension official documentation
This is the official documentation and gives a pretty good overview of everything you'd need to know.

TYPO3 Apache Solr for TYPO3
This is the homepage of the extension. Read about features and the many sponsors who make this extension possible. 

TYPO3 AND APACHE SOLR – INTRODUCTION TO AN ADVANCED TYPO3 SEARCH
Much information on using TYPO3 with Solr vs other search solutions. Gives a good understanding of what is needed to get TYPO3 working with Solr.

Solr Search for TYPO3
Gives instructions on configuring the extension. In German.

Using Solr With TYPO3 On Debian Wheezy
This is an older tutorial that uses an older version of Solr, but I think that it might still offer some insight for those setting up Solr and TYPO3 today.

Installation of Solr for TYPO3
Another older installation guide that may be useful. Not in English, but I used Google Translate.

Solr device in TYPO3
A good, older, tutorial that you might find useful for setting up the solr extension. (in German)

 
