Global Search for Moodle on CentOS

My students are using a Moodle VLE to access resources and teaching materials and it became evident that some kind of global search function would help them find things quickly, especially later in the programme when they come to write their assignments.

I’m running Moodle on a CentOS 7.3 virtual private server with Plesk Onyx. The server hosts several other sites running WordPress, bespoke PHP and some other bits and pieces including the usual mail services. Some of the containers require the OS-standard PHP 5.4, but a recent upgrade to Moodle 3.3 required me to switch the Moodle container to PHP 7.0.

Installing Global Search was a little tricky because of the multiple PHP versions running on the server, but I eventually boiled it down to these key steps:

Install the Solr Server

$ cd /opt
$ wget http://apache.mirrors.nublue.co.uk/lucene/solr/6.6.0/solr-6.6.0.tgz
$ tar zxvf solr-6.6.0.tgz
$ cp solr-6.6.0/bin/install_solr_service.sh .
$ rm -rf solr-6.6.0
$ ./install_solr_service.sh solr-6.6.0.tgz
$ chkconfig solr on
$ su - solr -c "/opt/solr/bin/solr create_core -c moodle"

You should be able to visit http://your-domain.tld:8983 to verify the Solr server is running OK.
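If you would rather check from the command line, something along these lines should do it. This is only a sketch: the function names are my own invention, and it assumes curl is installed and that Solr is listening on the default port 8983. The STATUS action belongs to Solr's CoreAdmin API.

```shell
# Hypothetical helper: build the CoreAdmin STATUS URL for a host and core.
solr_status_url() {
    host=$1
    core=$2
    printf 'http://%s:8983/solr/admin/cores?action=STATUS&core=%s\n' "$host" "$core"
}

# Hypothetical helper: succeeds only if Solr answers the status request.
# -f makes curl return a non-zero exit code on an HTTP error.
solr_is_up() {
    curl -fsS -o /dev/null "$(solr_status_url "$1" "$2")"
}
```

Usage would be something like `solr_is_up localhost moodle && echo "Solr is answering"`.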

Secure the Solr Server

By default, Solr is open to the world. You might want to secure it by adding this at the end of /opt/solr/server/etc/webdefault.xml, just before the closing </web-app> tag:

  <security-constraint>
   <web-resource-collection>
       <web-resource-name>Solr Administration</web-resource-name>
       <url-pattern>/*</url-pattern>
   </web-resource-collection>
   <auth-constraint>
       <role-name>solr-admin</role-name>
   </auth-constraint>
  </security-constraint>

  <login-config>
   <auth-method>BASIC</auth-method>
   <realm-name>Solr Administration</realm-name>
  </login-config>

Create a file in the same directory called realm.properties containing your chosen authentication details (matching the role above) in a single line:

admin: password, solr-admin

Finally, add this just before the last line in jetty.xml in the same directory:

<Call name="addBean">
 <Arg>
  <New class="org.eclipse.jetty.security.HashLoginService">
    <Set name="name">Solr Administration</Set>
    <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
    <Set name="refreshInterval">0</Set>
  </New>
 </Arg>
</Call>
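Solr needs a restart (service solr restart) before the changes take effect. After that, a rough way to confirm the lock is on is to compare the HTTP status codes with and without credentials. This is a sketch with made-up function names, assuming curl is available; I would expect 401 for an anonymous request and 200 for one with valid credentials.

```shell
# Hypothetical helper: print the HTTP status code returned for a URL.
# With one argument the request is anonymous; with three arguments
# (url, user, password) it sends Basic credentials.
solr_http_code() {
    if [ "$#" -ge 3 ]; then
        curl -s -o /dev/null -w '%{http_code}' -u "$2:$3" "$1"
    else
        curl -s -o /dev/null -w '%{http_code}' "$1"
    fi
}

# Hypothetical helper: succeed if the anonymous code is 401 and the
# authenticated code is 200.
auth_looks_right() {
    [ "$1" = "401" ] && [ "$2" = "200" ]
}
```

For example: `auth_looks_right "$(solr_http_code http://localhost:8983/solr/)" "$(solr_http_code http://localhost:8983/solr/ admin password)" && echo "Solr is protected"`.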

Install the PHP Solr Extension

$ rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
$ rpm -Uvh https://mirror.webtatic.com/yum/el7/webtatic-release.rpm
$ yum install libxml2-devel pcre-devel libcurl-devel php70w-devel php70w-pear

You’ll need to build the extension using the right versions of phpize and php-config for your version of PHP, in my case, 7.0:

$ cd /opt
$ curl -O https://pecl.php.net/get/solr-2.4.0.tgz
$ tar zxvf solr-2.4.0.tgz
$ cd solr-2.4.0/
$ /opt/plesk/php/7.0/bin/phpize
$ ./configure --with-php-config=/opt/plesk/php/7.0/bin/php-config
$ make
$ make install
$ cp /opt/solr-2.4.0/modules/solr.so /opt/plesk/php/7.0/lib64/php/modules/
$ sudo service httpd restart
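To check that the right PHP has actually picked up the extension, you can look for it in the module list. A minimal sketch follows; the function name is my own, and it relies on `php -m` printing one module name per line.

```shell
# Hypothetical helper: read a module list on stdin and succeed if a
# line consisting solely of "solr" (in any letter case) is present.
has_solr_module() {
    grep -qix 'solr'
}
```

Usage: `/opt/plesk/php/7.0/bin/php -m | has_solr_module && echo "solr extension loaded"`. If nothing is printed, the extension may still need enabling with an `extension=solr.so` line in the ini file for that PHP version; where that file lives depends on your setup.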

Visit the Site administration ▶︎ Plugins ▶︎ Search ▶︎ Manage global search page in your Moodle installation to configure, index and enable the Solr Search Engine.

I am impressed with how quickly the students have started using it, and how much they appreciate it.

PHP Mail and stripping of lines in Microsoft Outlook

A client recently contacted me about problems with the formatting of messages he was getting from a PHP contact form on his site. He asked if I could insert a couple of CRLFs to make the messages easier to read and to stop the email links in them from breaking.

The client’s site is one of those creaking anachronistic beasts, from the days of hand-hacked HTML, which is full of things that work just well enough to enable him to concentrate on his business. I’ve been trying to get him to move to a CMS like WordPress for several years now, but he’s not quite able to let go.

The contact form had not been a problem, as far as I knew, but all this while he has been putting up with messages from the site that look a bit like this:

Name: FredEmail: fred@bloggs.comTel: 09999899988Hi I was
wondering blah blah blah blah?RegardsFred

On my machines, they look like this:

Name: Fred
Email: fred@bloggs.com
Tel: 09999899988
Hi I was wondering blah blah blah blah?
Regards
Fred

It seems that there is a “feature” that has existed in Microsoft Outlook since 2002, at least. What it does, often without letting the user know, is strip out any formatting of lines in the original message and replace it with what it thinks you’d prefer. In text-only messages, this results in what you see in the first example above.

There’s a lot written about this, much of it along the lines of altering the user’s practice to include workarounds that are only necessary because Microsoft can’t write good code. Some posts offer empirical solutions that suggest changing code to accommodate Outlook’s perverse behaviour. Many others remain baffled.

However, thanks to a bit of forensic inquiry by Matthew Truesdell, there are some rules that can be interpreted in a way that allows the PHP script to work for all users. Matthew posted the rules he found in Outlook 2007 over on Stack Overflow. I’ve adapted them slightly here, using the term “mode” to mean the behaviour of Outlook that strips out line breaks from plain text messages. Lines are assessed one at a time:

  • Every message starts with the mode OFF.
  • Lines 40 characters or longer switch the mode ON.
  • Lines that end with a full stop (.), question mark (?), exclamation (!) or colon (:) switch the mode OFF.
  • Lines that turn the mode off will start with a line break, but will turn it back on if they are longer than 40 characters.
  • Lines that start or end with a tab turn the mode off.
  • Lines that start with 2 or more spaces turn the mode off.
  • Lines that end with 3 or more spaces turn the mode off.
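Read literally, the rules above can be sketched as a small state machine. This is only one interpretation and certainly not Outlook's actual code: the function name outlook_mode is made up, a line of 40 characters or more is assumed to leave the mode ON even if it also matches one of the OFF rules, and the rule about a leading line break is not modelled at all.

```shell
# Sketch of the rules above. For each input line, prints the mode after
# that line has been assessed (ON or OFF), a tab, then the line itself.
outlook_mode() {
    awk '
    BEGIN { mode = "OFF" }                      # every message starts OFF
    {
        sub(/\r$/, "")                          # ignore the CR of a CRLF
        off = 0
        if ($0 ~ /[.?!:]$/)   off = 1           # ends with . ? ! or :
        if ($0 ~ /^\t|\t$/)   off = 1           # starts or ends with a tab
        if ($0 ~ /^  /)       off = 1           # starts with 2 or more spaces
        if ($0 ~ /   $/)      off = 1           # ends with 3 or more spaces
        if (off)              mode = "OFF"
        if (length($0) >= 40) mode = "ON"       # long lines switch it ON
        printf "%s\t%s\n", mode, $0
    }'
}
```

Feeding a saved plain-text message through it (`outlook_mode < message.txt`, where message.txt is a placeholder) shows which lines would leave the stripping behaviour switched on under this reading.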

So it seems that one way to trick Outlook is to add 3 spaces at the end of each line, which in the code is just before the CRLF. I tried this, but be careful if you rely on it: different versions of Outlook do different things. Outlook 2013 is still stripping out the line breaks on the client machine, so we have this:

Name: Fred   Email: fred@bloggs.com   Tel: 09999899988
Hi I was wondering blah blah blah blah?   Regards   Fred

This is still not satisfactory, but at least it allows him to click on the email address for a quicker response.
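The contact form itself is PHP, but the transformation is easy to show as a shell filter. This is an illustrative sketch, not the code from the client's site, and the function name is my own: it appends three spaces to every line and ends each one with a CRLF.

```shell
# Append three spaces to each input line and terminate it with CRLF.
# Any CR already at the end of a line is removed first so that the
# spaces land just before the line ending.
pad_for_outlook() {
    awk '{ sub(/\r$/, ""); printf "%s   \r\n", $0 }'
}
```

For example, `printf 'Name: Fred\nEmail: fred@bloggs.com\n' | pad_for_outlook` produces the two lines with the trailing spaces in place.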

On my own machine (OS X Yosemite), Outlook 10 seems to be working as you’d expect, without interfering with the line breaks. Gmail works fine also. I think that’s as far as I’m going to take it.

Mavericks OS X 10.9 Update PHP fix

I updated to OS X 10.9 Mavericks this week and, as with all updates, it broke PHP. I run a local MAMP server for development purposes and it all works OK, except that you have to re-enable PHP in the Apache configuration file. I found a useful guide over at coolestguidesontheplanet.com which included these steps:

Open a terminal window and edit the httpd.conf file:

sudo nano /etc/apache2/httpd.conf

Uncomment this line:

LoadModule php5_module libexec/apache2/libphp5.so

Write out and save the file, then restart Apache:

sudo apachectl restart

… and Robert’s your Mother’s Brother.
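If you would rather not open an editor, the same change can be made with sed. This is a sketch rather than part of the guide I followed: the function name is made up, and it assumes the line is commented out with a single leading #. It keeps a backup of the original file with a .bak suffix.

```shell
# Hypothetical helper: uncomment the php5_module line in the given
# Apache config file, saving the original alongside it as <file>.bak.
enable_php5_module() {
    sed -i.bak 's|^#[[:space:]]*\(LoadModule php5_module \)|\1|' "$1"
}
```

It needs root privileges to touch /etc/apache2/httpd.conf, so run it from a root shell (sudo -s), and it is worth running sudo apachectl configtest before the restart.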

A summer of code

The summer has had me getting to grips with the nitty-gritty of internet web hosting, caused by a consolidation and move of all of the websites and services that I host to a new server. I had been using HostPapa in a shared environment for several years but the traffic and resource usage of these sites had been on the increase for about 18 months, to the point that HostPapa invited me to pack up and leave.

After a detailed survey of requirements and possible alternatives, I elected to move to the affordable but much more powerful next-step-up of a virtual private server (VPS) solution from HostingUK. I’ve known these guys since they set up business in the late 90s and felt comfortable that I would get good support from the people behind the business. I haven’t been disappointed.

The new server runs CentOS 6.4, a community rebuild of Red Hat Enterprise Linux, and has the usual LAMP features of the Apache web server, MySQL and PHP, with the Parallels Plesk 11 management panel.

My development has been firstly in the area of learning how to set it all up using the Plesk panel: it’s a very powerful tool but it’s not quite plug-and-play. The DNS for each of the domains on the server is best managed at the registrar using their nameservers: they have redundancy built in, and although the VPS can be its own nameserver, if it goes down for any reason this can lead to problems with mail transport and SEO indexing. Within the DNS records for each domain, the minimum configuration requires appropriate A, MX and CNAME entries, as well as SPF (TXT) records to stop your mail from being forever consigned to the spam folder.

Further learning has included getting down and dirty with the *nix command line, from basic file operations to examining logs, setting up CRON and managing and installing further packages. I’ve installed Munin to help identify what normal operation looks like. One of the things that my new insight has given me is an appreciation of just how much sustained attack is endured by even the smallest of websites by the likes of Turkish, Chinese, North Korean and other interests. The importance of having decent passwords is underlined when you see 20,000 (yes, twenty thousand) attempts to guess the root password in a single day.
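For anyone wanting to see the same thing on their own server, something like the following gives a rough count. It is a sketch with made-up function names, and it assumes sshd is logging to a file in the usual CentOS format (normally /var/log/secure), where each failed attempt appears as "Failed password for root from ADDRESS port N ssh2".

```shell
# Hypothetical helper: count failed root password attempts in a log file.
count_root_failures() {
    grep -c 'Failed password for root' "$1"
}

# Hypothetical helper: list the ten source addresses with the most
# failed root attempts, busiest first. The address is the fourth field
# from the end of each matching line.
top_attackers() {
    grep 'Failed password for root' "$1" |
        awk '{ print $(NF-3) }' |
        sort | uniq -c | sort -rn | head -n 10
}
```

Run as root: `count_root_failures /var/log/secure` and `top_attackers /var/log/secure`.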

The summer of code has reminded me of what I’m best at, and what I enjoy doing.