Automatically updating a WordPress blogroll

Years ago, I signed up to a social bookmarking system called Spurl. I’m really not interested in social bookmarking, but it had a handy feature: by inserting a bit of Javascript in a web page, you could pull all or a subset of the links out and display them in a list. That meant that I could add links to my blogrolls by simply bookmarking a link using Spurl, and categorising it. No need to fiddle around going to the blog admin and adding the link. It was really useful for easily sharing links and maintaining links pages.

Sadly, Spurl was overwhelmed by spam and disappeared. After that I bumbled around trying to find another service that provided the same functionality. It seemed obvious that lots of people would want this. But no. I managed to get something similar working with Furl.net. But then that was taken over by Diigo (a problem with these services is that they keep disappearing or being taken over). Diigo claimed I could do the same thing, but no matter what I did, it refused to show more than the 3 newest links in a particular category. Not very useful for a blogroll (although I do use it for recent links here — see the sidebar on the home page). Del.icio.us doesn’t seem to have any functionality like this.

Yesterday, while half-heartedly trying to resolve this problem, I happened across Sitebar. Yes, another bookmark manager. But this one is open source, and you can install it on your own server. Complete control! Hastily I installed it (a 5-minute job). Added a few bookmarks and experimented with the RSS feed. Once I’d figured that out, some googling and digging in the WordPress codex came up with a way to get WordPress to read the RSS feed and display it in the sidebar.

<div id="sitebar_linkroll" >
<?php
// Get RSS Feed(s)
include_once(ABSPATH . WPINC . '/feed.php');
// Get a SimplePie feed object from the specified feed source.
$rss = fetch_feed('http://www.archetype-it.com/sitebar/index.php?hits=0&w=rss&mix=nodes&root=16');
if (!is_wp_error( $rss ) ) : // Checks that the object is created correctly
// Figure out how many total items there are, but limit it to 20.
$maxitems = $rss->get_item_quantity(20);
// Build an array of all the items, starting with element 0 (first element).
$rss_items = $rss->get_items(0, $maxitems);
endif;
?>

<ul>
<?php if ($maxitems == 0) echo '<li>No items.</li>';
else
// Loop through each feed item and display each item as a hyperlink.
foreach ( $rss_items as $item ) : ?>
<li>
<a href='<?php echo esc_url( $item->get_permalink() ); ?>'
title='<?php echo 'Posted '.$item->get_date('j F Y | g:i a'); ?>'>
<?php echo esc_html( $item->get_title() ); ?></a>
</li>
<?php endforeach; ?>
</ul>
</div>

Ta-dah! This may seem a bit overkill for a blogroll, but it solves a problem for me. And Sitebar is a nice links manager in itself, even if it looks a little old-fashioned (no trendy tags, just folders!) and hasn’t been updated since 2007. There is a hosted version with free and paid options, but hosting it myself means I don’t need to worry about the service disappearing or changing its terms, and if it does what I want it doesn’t really matter if it’s not being actively maintained. It’s designed to be used in the panel/sidebar area of a browser, hence the name. So I’ve added it to Opera as a panel, where it’s easily accessible for adding links (there’s a Javascript bookmarklet too, for one-click adding, though I haven’t got that working in Opera yet).

Note, of course, that you can use this same PHP code for any RSS feed that you want to incorporate into a WordPress site — just edit the feed URL.

Posted in Blogging, PHP, Web development, Wordpress | Tagged , , | Leave a comment

Book review: PHP and MySQL for Dynamic Web Sites: Visual QuickPro Guide, 4th Edition by Larry Ullman

I bought PHP and MySQL for Dynamic Web Sites: Visual QuickPro Guide (4th Edition) to use as a set book for an online intermediate PHP class that I teach. I previously used Web Database Applications with PHP & MySQL by Williams and Lane, but it was very old (published 2004), covering only PHP 4, and no new edition seemed to be forthcoming. I chose Larry Ullman’s book pre-publication on the strength of the coverage described in the pre-release table of contents, reviews of earlier editions, and other books by Ullman which do a good job of explaining concepts — I also corresponded with the author about some of the details, and he was helpful and forthcoming.

I think this book will be much more accessible than the Williams and Lane one, which had great detailed coverage of PHP and MySQL, but was over-complicated in places. But I’m a little disappointed with several aspects.

By far the biggest disappointment: no coverage of exceptions, a key feature of PHP5. Second biggest: the whole chapter on OOP is pretty inadequate. I don’t teach OOP on my course, but I do encourage students to use pre-built classes for some tasks, such as data validation or sending mail. This chapter takes that approach, but doesn’t explain why using the MySQLi class is any better than just using the non-OOP mysqli_ functions. He could at least have covered PDO here, as it’s more versatile than MySQLi. But again, PDO isn’t mentioned at all — a very surprising omission.

I was also disappointed that while he briefly discusses user-defined functions in the early chapters, he makes no effort to encourage the reader to actually use them in the exercises and case-studies later in the book. All the code is inline, including in places that are crying out for use of functions, such as the shopping cart, user registration, and forum examples. This book is not intended for complete beginners, so it really should do more to encourage good practice and maintainable code. It is nice to include a chapter on using jQuery and AJAX with PHP, but frankly I’d have rather seen this chapter omitted in favour of more coverage of the above topics. I know there’s only so much you can cover in 600 pages (especially when the layout is so inefficient), and different readers’ needs will vary, but some of the priorities here seem wrong to me.

Good points: the coverage of MySQL queries is pretty good, but for my purposes I’d have liked to see better discussion and explanation of database design and ERDs. The Williams and Lane book had a very good appendix covering this. There’s a whole chapter on security, with useful and realistic examples, and Ullman addresses security aspects such as input filtering and escaping throughout the book (often a weak point in introductory PHP books, happily littered with examples using unfiltered POST and GET input “for clarity” — the ever-popular PHP and MySQL Web Development by Welling and Thompson is an egregious example). Finally, he does a good job of building up examples and leading the reader through them. The chapter on form handling is particularly good, and there’s also a whole chapter on regular expressions which does a better job than most of demystifying them. If you are an intermediate PHP coder, you’ll probably find this book useful to improve your skills, and I’m glad to find an up-to-date PHP book I can use for my course. But I’m still looking for the perfect intermediate PHP book!

One last black mark for Peachpit: the 2-column layout really doesn’t work for this material. Code listings drag on for pages because they are squashed into a narrow column, and the whole thibg looks untidy. Why they didn’t use a single-column design is beyond me.

Posted in Databases, PHP, Web development | Tagged | Leave a comment

Accessing virtual hosts across a LAN

Like many web developers, I have a local Apache/MySQL/PHP setup for developing sites. To ensure maximum compatibility with live servers, I set up a virtual host for each site. That way I know all paths will work. For example, on my MacBook, all my sites live in subdirectories of /Users/myusername/Sites. It’s easy enough to set up a virtual host for each site; in Apache’s config file (normally httpd-vhosts.conf, but your set up may vary) I have the line:
NameVirtualHost *:80

and then a virtual host section for each site, for example:

<virtualhost *:80>
DocumentRoot /Users/myusername/Sites/site1
ServerName site1
</virtualhost *:80>

That sets up the Apache side of things. In order for my computer to know what address to map the site names to, I now need to edit my hosts file. On the Mac it’s in /etc/hosts; on Windows it’s more deeply buried in C:\Windows\System32\drivers\etc. In both cases you need administrator privileges to edit it. Easy enough on the Mac; in Windows you’ll need to run Notepad or another text editor of your choice as administrator (right-click on it in the Start menu as choose “run as administrator”).

For each virtual host that I set up in the Apache config file, I add a line to my hosts file:
127.0.0.1 site1
and that provides the mapping I need so that I can use my browser to go to http://site1 and Apache will serve up the correct site. The address 127.0.0.1 always maps to the local computer (i.e. localhost). But what if I want to access site1 from another computer on the LAN? How will it know where site1 is? That turns out to be easy-peasy.

First, find out the IP address of the computer hosting site1. For example 192.168.1.10. On the second computer, edit the hosts file. Add the line:
192.168.1.10 site1
Et voilà! Now, when I enter http://site1 on the second computer, it will find it on the first computer, even if it’s not the default served up by Apache.

Note, you may hit snags with this system if you use DHCP on your LAN, because the IP addresses of the host computers may change from time to time (e.g. if you reboot them). I recommend using static IPs on your local network for this reason, at least for those computers you are using to host websites.

Posted in Web development | Tagged , , | Leave a comment

More nifty SQL: insert if not exists

Quite often you want to insert a record into a table, but only if it doesn’t already exist. OK, you could first do a SELECT query to check for existence, and then do the insert, but it’s clumsy and raises the slim possibility of a race condition (what if someone inserted or deleted that record between the check and the insert?). Every RDBMS seems to handle this differently (and some not at all). In SQL Server, you can do this:
IF NOT EXISTS (SELECT acolumn from atable where column1 = 1 and column2 = 'text') INSERT INTO atable(column1,scolumn2) VALUES(1,'text')

And in MySQL:
INSERT INTO atable(column1,column2) SELECT 1, 'text' FROM DUAL WHERE NOT EXISTS (SELECT column1 FROM atable WHERE column1=1 AND column2='text' LIMIT 1)

Notice that in MySQL you have to use the dummy DUAL table (which suggests to me that this syntax might work in Oracle too; I haven’t tried it).

Posted in Databases | Tagged , | Leave a comment

ADODB database abstraction library for PHP

I’ve been using this library for years to make my database interface code database-independent, but I am still discovering new useful functions. Case in point: today I was using the handy rs2csv function, which outputs a recordset to a CSV file with a few lines of code …

$fp = fopen($file, "w");
if ($fp) {
rs2csvfile($rs, $fp);
fclose($fp);
}

But I needed to massage some of the rows in the recordset before writing them to the file. I struggled for a while before finding RSFilter:
$rs2 = RSFilter($rs, 'callbackfunction');

The callback function takes two arguments: the recordset and an array (passed by reference) representing a single row of the recordset (see the example in the link to the documentation above). RSFilter deals with running the function on each row, and returns the duly massaged recordset ready for output to the file.

Posted in Databases, PHP, Web development | Tagged , | Leave a comment

Preventing hotlinking of images

Using an .htaccess file to prevent people stealing bandwidth by linking directly to your images from another website is a popular and well-known technique. I needed to do this, but decided to take it a step further with a little added PHP.

Most solutions simply display another image (as polite or as rude as you like) in place of the hotlinked image. But I was curious to know who was linking to which images. Some of them might be legitimate users, in which case I could unblock them. Others might need to get a stiff warning about copyright.

For the .htaccess file, I used the technique here. But the author was actually redirecting to another page with the requested image embedded in it. Hey, I’m trying to save bandwidth here! So I wrote a PHP script that would log the access and then display a small generic image advertising the website from which the image had been, er, “borrowed”. Here’s the .htaccess file:

RewriteEngine On
# hotlinked images
RewriteCond %{REQUEST_FILENAME} \.(gif|jpe?g|png)$ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !mydomain\.com [NC]
RewriteCond %{HTTP_REFERER} !myotherdomain\.com [NC]
RewriteCond %{HTTP_REFERER} !alegitimatedomain\.com [NC]
RewriteRule (.*) /blockimage.php?imagename=$1 [NC,L]
# end hotlinked images

This file, placed in the images directory, blocks all but the authorised domains and any blank referrers from accessing images. Notice that the image requested is passed to my blockimage.php script in the query string (that’s what the $1 does), because I’m going to use it later.

I could have logged accesses to a text file, but I have a MySQL database so I might as well use it; it will let me do further statistical analysis of the accesses too. I created the following table:

CREATE TABLE Hotlinks (
hotlinkid int(10) unsigned NOT NULL auto_increment,
referrer varchar(255) NOT NULL default '',
imagename varchar(255) NOT NULL default '',
linktime timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (hotlinkid)
) ENGINE=InnoDB;

I used INNODB for the table type because I reckoned it would be quicker for frequent writes and infrequent reads, but you can use MyISAM if INNODB isn’t available on your server.

Now the PHP code:

<?php
// called when an image is hotlinked
// ... snipped code to connect to database ...
// get the referrer and image name
// first initialise an empty array to hold them
$values = array();
$values['referrer'] = mysql_real_escape_string($_SERVER['HTTP_REFERER']);
// get the image name from the query string thanks to .htaccess
$values['imagename'] = mysql_real_escape_string($_GET['imagename']);
// write it to Hotlinks table using my own insertrow function
$result = insertrow($DB, 'Hotlinks', $values);
// show a nice image advertising my site
$mimetype = 'image/jpeg';
$filename= $_SERVER['DOCUMENT_ROOT'].'/images/sitelogo.jpg';
header('content-type: '.$mimetype);
header('content-length: '.filesize($filename));
readfile($filename);
?>

So after silently logging the access, the PHP code sets the mime type to what the calling page is expecting (an image) and then returns the specified image to the browser. Remember you can’t have any other output (including white space) before the calls to the header function.

Now I can check the database table to see who is accessing what, and if I see any legitimate domains there, I can add them to the “allowed” list in .htaccess.

Ironically I ended up not using this code on the site it was intended for, for various reasons. But I think it’s an interesting and potentially useful technique, so I’m posting it here anyway.

Posted in PHP, Web development | Tagged , , | Leave a comment

Who knew Javascript could be fun?

It gave me migraines till I discovered jQuery. Now I can validate a form with some simple CSS and a couple of lines of jQuery code. And today I had a rather odd requirement to toggle the colour of cells in a table when you click on them. In the old days, I wouldn’t even have attempted this. In my new jQuery heaven, I did this:


(document).ready(function() {
$("td.standard").click(
function(){
var currentclass = $(this).attr("class");
if (currentclass.indexOf("highlightcell") == -1) {
$(this).addClass("highlightcell");
} else {
$(this).removeClass("highlightcell");
}
}
);
});

Voilà!

Posted in Javascript, Web development | Tagged | Leave a comment

PHP: is it a leap year?

I’ve written about date manipulation in PHP before. There are some truly horrible hacks out there. Today I needed to detect whether a given year was a leap year, in order to write future-proof code for a calendar. PHP has got to have a way of doing this. I Googled and found this. And this, which is almost as bad. Urgh. Sorry, but that is one of those occasions when the coder should have asked him/herself if there wasn’t a better way, and at least considered looking at the manual for the date function. There is of course a quick and simple way. Here it is.

So I wrote one line of code:
$maxfebday = date('L', strtotime("$currentyear-01-01")) ? 29 : 28;

Posted in PHP | Leave a comment

Amarok and network shares on the eee PC

The eee now spends most of its time being used as a convenient way to play music through the stereo. This means it needs to mount the shared directory on the server downstairs that contains all the music. This caused a few problems initially because Amarok doesn’t seem to like Samba shares and it just wouldn’t build the collection even though I could access the tracks by simply browsing the file system. It took me some expert help and a few goes to get the syntax of the mount command right, and what I ended up with was rather a mouthful:

sudo mount -t cifs //silver/music /home/user/silvermusic -o username=samba_username,password=samba_password,iocharset=utf8,
file_mode=0777,dir_mode=0777

Thank goodness you can use the up-arrow to recall terminal commands! But it does work, and Amarok can build the collection correctly. Helpful Mike S explained to me how to set up fstab to load it automatically:

The following line would do it:

//silver/music /home/user/silvermusic cifs username=samba_username,password=samba_password,iocharset=utf8,
file_mode=0777,dir_mode=0777 0 0

But it would be better to not put the login details in fstab, so you should create a credentials file to hold the username and password:
sudo nano /root/.sambacred

Then add the following to it:

username=samba_username
password=samba_password

Then make it read/writeable only by root in order to keep other people out of it:
sudo chmod 600 /root/.sambacred

Then use the following line in fstab:

//silver/music /home/user/silvermusic cifs credentials=root/.sambacred,iocharset=utf8,
file_mode=0777,dir_mode=0777 0 0

Note: I have introduced line breaks for formatting here, but it’s very important to avoid extraneous spaces in these lines :-) After cifs there should be no spaces. Unfortunately in my case it doesn’t work, I think because the wireless interface doesn’t come up until after fstab has been read, in fact it isn’t activated until after the GUI has loaded. So I ended up just putting the original unwieldy line in a shell script called loadmusic.sh so I can easily run it from the shell prompt. I imagine Mike’s way would work if I was using a wired connection.

Posted in Linux | Tagged , , | Leave a comment

SQL query to find duplicate rows

This is one of those things I can never remember how to do when I need to do it. I don’t know why, it’s really quite simple:

SELECT emailaddress, firstname, lastname,
COUNT(emailaddress)
FROM customers
GROUP BY emailaddress
HAVING COUNT(emailaddress) > 1

Posted in Databases | Tagged , | Leave a comment