ircount development

I’ve finally got around to spending a bit of time on the ircount code.

This post goes through some of the techy stuff behind it. If you’re just interested in features, I’m afraid there are hardly any new ones yet: you can now compare more than 4 repositories, and that’s as far as you’ll want to read.

Subversion and Trac

The first thing I did a few months ago was create a Subversion repository. I seem to have timed this quite well, learning how to use Subversion just as every man and his dog gets excited by Git.

I also installed Trac (using Dreamhost’s useful one-click install), which I haven’t really used other than to browse the code and view changes (time-line).

The repository is publicly available from here: http://svn.nostuff.org/ircount/

The trac site (which can be used to browse the code) is found at: http://trac.nostuff.org/ircount/

It took me a while to get svn working well. Originally I would edit the files in a local working copy on my MacBook, and then use Subversion to load these to the server to test. Of course, this meant every little change had to be manually checked in before I could test it. I then got Apache and PHP working on the MacBook, and set up the MySQL db (on the Dreamhost server) to allow remote connections, which let me test/use files located in my local copy. This seems to work well.

I use svnx as a GUI Mac client. It’s free and is easy to use.

The code. The rewrite

I don’t do much code writing or developing. Anyone glancing at my work will take that as plain obvious.

I’ve realised when writing code I have a tendency to be very linear and unconsciously put efficiency before good design. Anything more than one database call per page was unthinkable, loading in other pages a sin, which often led to large unwieldy while loops processing the results from a massive database call. Database calls are mixed with html output mixed with logic.

This is just about ok when showing information about one Institutional Repository. But when comparing a number, it becomes unreadable, and not in any state to be reused. The key aspect of comparing a number of IRs is that you need to ascertain a number of facts for the page as a whole (the earliest data collection date for the page, the highest record count for the page – for the chart for example – which could be from any of the IRs).
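To make the "facts for the page as a whole" idea concrete, here is a rough sketch. It is in Python rather than the PHP ircount is written in, and the Repository class, its method names and the sample numbers are all made up for illustration; they are not ircount's real code or data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Repository:
    """Hypothetical stand-in for a repository object (not ircount's real class)."""
    name: str
    counts: dict  # maps collection date -> record count

    def earliest_date(self):
        return min(self.counts)

    def highest_count(self):
        return max(self.counts.values())


def page_facts(repositories):
    """Return (earliest collection date, highest record count) for the whole page."""
    earliest = min(repo.earliest_date() for repo in repositories)
    highest = max(repo.highest_count() for repo in repositories)
    return earliest, highest


# Invented sample data: the two facts come from different repositories.
repos = [
    Repository("IR A", {date(2008, 1, 7): 1200, date(2008, 1, 14): 1250}),
    Repository("IR B", {date(2007, 11, 5): 300, date(2008, 1, 14): 4100}),
]
earliest, highest = page_facts(repos)
```

Each repository answers questions about its own data, and the page only combines the answers, which is roughly the split I was aiming for.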

My aim was that the page files should be little more than calls to a few discrete functions.

I’m not quite there yet, but it’s a start. archive.php is mainly a set of function calls, but there is still too much in there, and random bits of html code dotted around. There are also too many arrays holding information the repository (php) objects can provide. include.php holds the functions, but is now a little bit unwieldy itself, with functions ordered randomly. The third file of note is class.archive.php. This is a repository object, which can grab data from the db for the repository, and return it to the calling page in various ways.

My original plan was to merge the code for showing one repository, with the code to show more than one, to make it easier to implement changes (not having to update two files). By the end of it, I’m now wondering if it would be easier to have two files again, for the little changes, which both call the same core functions.

Google Chart

Google Chart is a great API and I recommend anyone to play with it.

However, one of the problems of the last version of ircount was that the URLs for the chart images often became too long (more than 2,000 characters), resulting in no chart being shown. The Chart URL includes each data point separated by a comma, so with four repositories multiplied by 100 weeks (for example), multiplied by 4 or 5 characters per data point (up to a 4 digit number plus a comma), it soon adds up.
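The arithmetic is easy to check. This is just a back-of-envelope sketch in Python (not ircount's code); the function name is made up, and the 2,000 figure is the rough limit mentioned above.

```python
def estimated_data_chars(repositories, weeks, digits_per_point):
    """Rough size of the chart data: each point costs its digits plus one comma."""
    return repositories * weeks * (digits_per_point + 1)


URL_LIMIT = 2000  # the approximate limit mentioned above

# Four repositories, 100 weeks, 4 digit record counts.
data_chars = estimated_data_chars(repositories=4, weeks=100, digits_per_point=4)

# The data alone already uses up the whole limit,
# before the rest of the chart URL is added.
over_limit = data_chars >= URL_LIMIT
```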

One solution was to only pass data per month rather than per week (roughly reducing the number of data points to a quarter of the original). Another would have been (and probably will be in the future) to make use of Google Chart’s encoding function, made easy using these helpful functions.
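As a sketch of the per-month idea (again in Python, not the PHP ircount uses): keep only the last weekly reading in each calendar month. Choosing the last reading rather than, say, an average is my assumption here, and the function name and sample counts are invented.

```python
from datetime import date


def monthly_points(weekly):
    """Keep the last weekly reading of each calendar month.

    `weekly` is a list of (date, count) pairs, assumed to be sorted by date.
    """
    latest = {}
    for day, count in weekly:
        # Later weeks in the same month overwrite earlier ones.
        latest[(day.year, day.month)] = count
    return [latest[key] for key in sorted(latest)]


# Invented sample: four weekly readings in January, two in February.
weekly = [
    (date(2009, 1, 5), 100),
    (date(2009, 1, 12), 110),
    (date(2009, 1, 19), 120),
    (date(2009, 1, 26), 130),
    (date(2009, 2, 2), 140),
    (date(2009, 2, 9), 150),
]
monthly = monthly_points(weekly)
```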

Overcoming this in an efficient way was a challenge. Originally I had a Google Chart PHP object. IR data would be passed to it in one method, and another method would return the URL.

This seemed sensible at the time, but deciding which php object did what became confusing. For example, the first thing the chart object needed to do was decide if the URL would be too long for the repositories in question, taking into account the data we had for each repository. Should the chart object loop around each data point for each repository to first decide how many there are? Or should the repository object be handling this by telling the chart object how much data it has? How do you avoid the need to loop through the same data several times? Does it matter which object does the work? It’s for the chart, so the chart object should do it, yet other parts of the page may want this info about the repositories, so the repository objects should provide it for all.

In the end, I did away with the chart object and used a function instead, which is passed an array of repository objects, which in turn handle a lot of work.

Future

The foundations are about there. For any page I (or anyone else) wish to create, a couple of lines are all that is needed to take one or more repository IDs passed in the URL and load in all the data for them, ready to be used as needed. We can then easily call a table or chart to display for these repositories (or a subset of them).
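The "take one or more repository IDs passed in the URL" step might look something like this. It is a Python sketch rather than ircount's PHP, and the parameter name id, the comma-separated form and the example URL are all assumptions made for illustration.

```python
from urllib.parse import urlparse, parse_qs


def repository_ids(url, param="id"):
    """Collect numeric repository IDs from a URL's query string.

    Accepts both repeated parameters (id=1&id=2) and comma-separated
    values (id=1,2); anything that is not a plain number is ignored.
    """
    query = parse_qs(urlparse(url).query)
    ids = []
    for raw in query.get(param, []):
        for part in raw.split(","):
            part = part.strip()
            if part.isdigit():
                ids.append(int(part))
    return ids


# Hypothetical URL, not a real ircount address.
ids = repository_ids("http://example.org/archive.php?id=12,45&id=7&id=abc")
```

Each ID would then be handed to a repository object, which loads its own data.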

As I mentioned above, the only real improvements are the ability to show more than 4 repositories at once (the chart stops showing once you get to about nine repositories), and the chart is more robust and will now show when it would have failed to do so in the past.

Google Chart does have an encoding which allows far more to be passed in a condensed way, and this php function looks very useful for using it.
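For the curious, Google Chart's "simple encoding" squeezes each data point into a single character: values are scaled to the range 0 to 61 and mapped on to A-Z, a-z and 0-9, with an underscore for a missing value and a comma between series. The sketch below is my own Python illustration of that scheme, not the php function linked above, and the function name is made up.

```python
# The 62 characters used by simple encoding, in value order (A = 0, 9 = 61).
SIMPLE_CHARS = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
)


def simple_encode(series, max_value):
    """Encode lists of counts as a simple-encoding data string.

    `series` is a list of lists of numbers (None marks a missing value);
    `max_value` is the largest count on the page and must be above zero.
    """
    encoded = []
    for values in series:
        chars = []
        for value in values:
            if value is None:
                chars.append("_")  # missing data point
            else:
                chars.append(SIMPLE_CHARS[round(value * 61 / max_value)])
        encoded.append("".join(chars))
    return "s:" + ",".join(encoded)


# Two invented series scaled against a page-wide maximum of 61.
chart_data = simple_encode([[0, 1, 26, 61], [None, 26]], max_value=61)
```

One character per point instead of four or five is the attraction, at the cost of only 62 distinct levels; the extended encoding uses two characters per point for finer detail.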

If I was starting again today I would look to use a framework such as Zend Framework or CakePHP, or maybe even have a go at Ruby on Rails. But perhaps a third rewrite is a little over the top for now.

I need to tidy up the table view a bit (some nasty code there) and then look to a few new features, maybe collecting more data, and exposing it in some computer friendly formats such as Atom.

So ircount is really no more than a plaything for a bad coder to make mistakes and learn a little bit along the way. Slowly. But if you have any ideas or thoughts I would love to hear them.

Twitter clients

From about an hour after signing up to Twitter until very recently I used Twirl on both PC and Mac as my Twitter client. I was happy with it, and still am, but had noticed people using other clients and wanted to see if I was missing anything.

I round up my findings here:

[Image: Three twitter clients]

Twirl

Adobe Air based, with a simple, effective and attractive interface. It just works and has some nice features. You can activate searches so that their results appear in your main feed. Updates can appear on screen in small popup notifications. In many ways I see it as the baseline of what a client should be.

Very occasionally its non-native Air roots would show through with quirks in widgets and UI controls (e.g. a scroll bar not behaving like it should). It could be easy to miss replies and mentions that come in when you are not looking: once they are off the screen there is no indication that they are there. Some of the newer clients go further than Twirl in integrating with third party apps and displaying conversations. Clicking on a hashtag will open a twitter search in a web browser; it would be neater if it opened its own search tab instead.

Seesmic Desktop

Seesmic bought twirl a while back, and their website now implies Seesmic Desktop is the main client. It’s currently my twitter client at work.

The good: (often) auto completes on usernames. Handles multiple accounts well (they all seem to handle more than one account). Two (or more) column layout works well, and seems to require less screen space than TweetDeck. A good all around client, similar to Twirl.

The not so good: interface could be better, and is perhaps too cluttered. On my work PC the interface stutters a little when new tweets come in, a slight delay while new avatar images load which for some reason is noticeable and a little distracting.

A mention must also go to Seesmic web client. An excellent choice if away from your normal computer. Acts like a desktop client in a web browser.

Echofon

This is the client I have stuck with on the Mac.

The Good: Nice simple interface. One of its best features is its pop-out drawer which shows conversations. This is useful: click on the conversation button next to any tweet to see the whole thread, including tweets from those you do not follow.

Another nice feature: clicking on a link to an image (on TwitPic, yfrog, flickr etc) opens a window showing just the image, so there is no need to load a webpage to have a quick look at an image.

When replying/mentioning someone you can undo the twitter ‘reply’ functionality (i.e. so it’s just a tweet that mentions their username, rather than a tweet which replies to a specific tweet). This is useful because twitter will only show your reply to those who also follow the person you are replying to (i.e. probably most of your followers will not see what you are saying).

Not so good: no auto complete on usernames. When I’m in the middle of composing a tweet and click on ‘reply’ next to a person’s tweet, I want their name to appear where the cursor is, not at the start of the tweet; this is quite annoying. It has a search tab, but no way to auto-update search results (e.g. when following conference tweets), or for them to appear in your main stream.

TweetDeck

Seems to be the most popular client in use at the moment. Like Twirl and Seesmic Desktop it is Adobe Air based.

The Good: Allows users to be put into groups, with different columns for different groups, e.g. a group of core friends you don’t want to miss anything from. Seems to be the choice of power users. Auto completes usernames.

The not so good: interface isn’t great, and seems to take up a lot of space, with each tweet taking up more room, and the need of multiple columns. Personally, I know it’s very popular, but the interface has never worked for me. Partly because I prefer my twitter client to tick away on one side of the screen while I work in another app which uses the rest of the screen. Tweetdeck doesn’t seem to be designed with this in mind.

Tweetie

I haven’t tried it, but some people I follow on twitter rate it highly. Not free, but it sounds like it is worth checking out.

Twitt

Seems quite a new twitter client for the Mac, see their homepage here.

I haven’t really looked at it other than starting it up quickly, so can’t really comment. UI seems simple, uncluttered and nice, but there seems to be a lot of space around each tweet, so fewer tweets on screen at any one time. One slightly annoying thing: on first start up, rather than asking for a username, it just stated there was an error getting tweets. Not a huge problem, but not very welcoming. One to watch.

To Conclude…

Having tried a few clients, here are some of the features I like (or would like to see) and will keep an eye open for when looking at any new client.

  • Seeing conversations with one click without needing to go to the browser (echofon does a great job here)
  • Autocomplete usernames. (Tweetdeck/Seesmic Desktop), so I don’t have to remember the exact spelling of a username when I want to mention someone.
  • A simple way to browse all the people I follow (sorted alphabetically and any other useful way), for when I want to mention someone but have no idea of their name but will recognise it when I see it, or recognise their avatar.
  • A way to work around Twitter’s broken replies (i.e. if you reply to someone only those who follow the person you are replying to will see your tweet, even if it is of general interest). When you reply to someone, some clients allow you to click a small button so that your tweet is a normal tweet (which so happens to mention another user) rather than a reply. Useful, but I wonder if there is more that clients could do to work around this.

There’s not much between them. I’m currently using Seesmic on my work PC and echofon on my home macbook, but would happily be using Twirl or Tweetdeck as well. All have useful features that others do not have, but none seem to have a killer feature which puts them above the rest. We should of course be thankful so many people and organisations have developed clients for free!

I’d be interested to hear what features others find useful, and which clients they prefer. Why is TweetDeck so popular?

Update (2nd Dec) : A few days after I posted this I tried using TweetDeck again at work. I still don’t think the UI is as good as others, but it does have some nice features and gets most things right. Of note is the username auto completion. But perhaps its biggest selling point is that other clients tend to each miss one or two bits of useful functionality, whereas Tweetdeck has most of the features you might find useful. For example, you can mark a user as spam and their tweets and replies are removed from view. Another example: from any tweet, you can do just about any action for the user of the specific tweet (see user profile, add user, email tweet etc). Seeing conversations, profiles and pictures within tweetdeck is also useful.

Its weaker points (apart from UI): I have the window wide enough to see ‘All friends’ and ‘mentions’, so it’s easy to miss direct messages. Also, I’m not keen on how it handles multiple accounts. I had set up Seesmic Desktop so that the main tweets column showed tweets for my main account, but the mentions column showed mentions for either my main account or a work related account (where it was important to respond to any reply that occasionally came in). It doesn’t seem possible to do this with Tweetdeck, and so far it seems that basically all settings apply to all twitter accounts, i.e. you can’t use different settings for different accounts.

Apple Automator

Automator on OS X is one of those things I use about once a year, but it always impresses.

Many attempts to use drag and drop to replace programming lead to confusing design. Examples are Yahoo Pipes, Business Objects, and most of all MS Access.

What’s impressive with Automator is that it always seems it was designed with the very problem you want to solve in mind.

My Problem (apart from the drink)

I use command+shift+4 to get screen grabs a lot. For twitter, for blogs (in fact for this very post), for documents, etc. However I end up with lots of images on my desktop called Picture 1, Picture 2, etc.

I want to keep these, for future use, but want a clutter free desktop.

The problem is, if I try and drag these into a folder, there will already be a Picture 1, Picture 2 from previous occasions. Annoyingly, this means having to rename every file, or create a sub-directory each time.

Now I have a simple Automator script.

[Screenshot: the Automator script]

Each time I run it, it moves everything into a folder, with the creation date in front of the name. It was easy to search for ‘actions’ (‘move’, ‘rename’) and browse (‘Files’ -> ‘Find Files’).
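For anyone without Automator, the same idea can be sketched in a few lines of Python. This is not what Automator does internally, just an approximation of the workflow described above; the function name, the "Picture *.png" pattern and the date format are my assumptions, and the demo runs in a temporary folder rather than on a real desktop.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def archive_screenshots(source, destination, pattern="Picture *.png"):
    """Move matching files into `destination`, prefixing each name with a date."""
    destination.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(source.glob(pattern)):
        stat = path.stat()
        # st_birthtime is the creation time on macOS; fall back to the
        # modification time on systems that do not provide it.
        created = getattr(stat, "st_birthtime", stat.st_mtime)
        stamp = datetime.fromtimestamp(created).strftime("%Y-%m-%d")
        target = destination / f"{stamp} {path.name}"
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


# Demo in a throwaway folder standing in for the desktop.
with tempfile.TemporaryDirectory() as tmp:
    desktop = Path(tmp) / "Desktop"
    desktop.mkdir()
    (desktop / "Picture 1.png").write_bytes(b"")
    moved = archive_screenshots(desktop, Path(tmp) / "Screenshots")
    moved_names = [p.name for p in moved]
    leftover = list(desktop.iterdir())
```

Note that two files with the same name created on the same day would still collide; adding the time to the prefix would be one way round that.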

People power : twitter is highlighting & affecting important issues.

A few weeks ago (why, the 12th Oct in fact) I was sitting at my laptop during the evening, doing this and that with Twitter ticking away on the right. I glanced at the newest tweets to pop in and noticed one from secretlondon.

[Image: tweet from secretlondon linking to the Guardian article]

Curious, I read the Guardian article it linked to. A gagging order to stop a paper reporting the proceedings of parliament. This is not very good. I muttered and got back on with the this and that. A few minutes later another tweet from secretlondon came in:

[Image: second tweet from secretlondon]

Now we have seeds of information! To Hansard, To Wikipedia, To Google. Who were Trafigura, who were Carter-Ruck?

Soon other tweets were coming along about this, and I was adding my two pence too, re-tweeting the news and adding my own little links to what I was finding.

Hansard provided the details the Guardian couldn’t report, and it quickly became clear what they were trying to hide.

By now twitter was alight. Hashtags came into use. Following these produced more information: once someone found something, they didn’t just share it with their followers, but with everyone now following those tags. Previous Guardian articles (amongst others) were brought to our collective attention.

Before this I had not heard of Trafigura or Carter-Ruck. I suspect many were the same, yet now we were angry about what we read about their questionable activities (one apparently dumps nasty stuff in Africa, the other boasts about suppressing the press, regardless of truth). A storm was brewing and I felt it had yet to peak. But it was late and sleep beckoned.

The next morning I was curious if there had been any developments overnight.

The first thing I came across was a Spectator online article (a publication on the other side of the political spectrum to the Guardian). It quoted the Guardian article heavily, but then went on to quote the part of Hansard that contained the question (and company name) that the Guardian could not, and provided links. I tip my hat to them. #Trafigura was now trending, celebrity twitterers (including our Lord Stephen Fry) were highlighting it and more.

It felt like it was everywhere: on the news, and in the coffee room colleagues were talking about it. The Streisand effect had truly kicked in. Before noon on the 13th the case had been dropped. The Guardian was no longer prevented from reporting on the story.

Five Days later

Five days later a Daily Mail columnist, Jan Moir, wrote a homophobic piece (since edited) about the very recent death of singer Stephen Gately. A similar thing happened. A storm brewed up. Not organised, but distributed little efforts of raising attention, mainly through twitter, which led to coverage in mainstream media and changes to the article and headline.

Black Out and Breaking News

And back in February there was a ‘black out’ campaign because of a proposed repressive internet law in New Zealand. Again, partly due to the coverage on sites like twitter, the section in question was scrapped.

It’s not just campaigns and activism. Breaking news is spread rapidly via twitter, such as the plane crash in New York (Twitter broke the story, and first images of the plane in the water came from Twitter), and Michael Jackson’s death.

But twitter doesn’t always have its own way. There was a Green avatar campaign for democracy in Iran, which sadly never saw success.

We’re seeing two things here…

1 – That information is now able to spread much faster than it ever has before. This has always been the case with the Internet, and has increased each year with new technologies (blogs, social networking), but especially with twitter.

2 – That people spreading this information leads to the mainstream press reporting on it, and those under pressure backtracking. (I wonder how essential that middle step is?)

Twitter is such a good tool for the first point. It is instantaneous, not just on the web, but on computer twitter clients and phones, and messages are public by default (unlike many other social networking sites where they are restricted to specific groups or a trusted circle of friends). Having an Open technical platform (which allows any other website or application to access tweets) also helps.

So…

My instant reaction is that this must be a good thing. When something bad is happening in the world (sorry, that sounds very simplistic) twitter, and other websites, can spread the information quickly and widely, even to those who don’t follow the news each day. This can lead to positive change.

The Trafigura/Carter-Ruck case is a good example of this. Imagine if it had happened 10 years ago. People (well, only Guardian readers) would have read the Guardian front page but not had a clue what it was about. In fact The Guardian may well not have run it as a front page story (or at all) as it would have simply confused/frustrated their readership. The Guardian took a gamble by putting this on to the front page, knowing (hoping) it would then become a story in its own right. It did, probably more than they ever hoped.

Noteworthy information is a virus, once it is in the wild it is unstoppable.

But all is not rosy. It could be a slippery slope. I’m reminded of the 1995 film ‘the Last Supper‘. In the film they start off killing off the worst people in society, but as time goes on, things become more complex and grey and less clear cut. The Trafigura case was clear cut. Those trying to stop the BBC putting Nick Griffin on to Question Time, less so. There’s a thin line between people power righting wrongs and mob rule.

One final example – baby and bump

Last Christmas I came across a news story about a ‘Lapland in the New Forest’. Long story short, it was a con. It promised a lot, but was little more than muddy fields and a few fun fair (pay to use) rides, two santas (queue for hours, not allowed to take photos) and the odd tree with fairy lights, with staff who were untrained and the worst possible people to be interacting with kids.

For some reason I looked it up to find out more. And I came across a thread on a web based forum called baby and bump (you can guess what it’s for). The thread was the top result on Google so became one of the main exchanges on the web for those affected by this.

The thread starts off with a few excited people discussing going to the Lapland attraction and how excited the kids are, and how much they have splashed out (money they couldn’t afford to throw away). Then those who visited the first few days after opening reported back, while others are in denial that it can be that bad. Then it really starts, more report back, and others start to join the forum simply to add their experience.

Then the fact finding starts: the owner’s name and address, other business addresses, legal rights, who in the council to complain to, who in the press to contact, how to file a small claim, the owner is related to the leader of Brighton (my) Council!

I like this example. It isn’t the twitterati or tech-savvy web2.0 types, but just families on a simple web forum. No one organised anything, but many added bits of info, supported others, or shared their experiences. I would say it very much played its part in the early closure of this cruel con. After returning from a horrible day, cold, with upset kids, after paying quite a sum upfront, it must feel frustrating and helpless. I think even finding others who have been through the same must be of some help. The Internet can really help in such situations. But it’s not the power of the internet, it’s the power of people. The Internet just acts as an enabling tool.

So Twitter is allowing us to share information and become aware of facts/situations in a way not thinkable until now, and at a very quick speed.

Google Reader – shared stuff

I’ve been using Google Reader for a while, having jumped ship from Bloglines. One of its features is to share stuff. This is potentially a good thing as it avoids me bombarding my twitter followers with endless links to stuff I find interesting.

At the moment it is useless as I don’t really follow anyone on Google Reader, and they (probably through good sense, and a firm value of their own time) don’t follow me.

So, people, here is a link to my shared stuff.

http:

So feel free to add me as a contact in Google Reader, and I’ll do the same. And read interesting stuff. Because twitter, failblog, blogs and the web don’t already waste enough of my time.
