If you follow me on Twitter you’ll have seen me vent my frustrations with ScreenSteps earlier this week. I held off on writing this review for a few days to get a bit more experience with the app, and to give myself some time to decide how best to phrase my issues so they don’t come across as being overly negative. The short version is that I’m very conflicted about this app. Without a shadow of a doubt their idea is sound, as is their basic architecture. However, I found a few aspects of the interface exceptionally frustrating. Some parts of the app suffer from poor usability in my opinion, and another lacks what I consider to be very basic features. Ultimately, what’s missing is some fit and polish. However, it can’t be denied that the app works, which is obviously very important. I hope you can see why I’m so conflicted about this app. I want to love it, but I can’t – at least not this version.

Before we go any further I should probably explain what ScreenSteps is. It’s software designed for writing computer manuals, lessons, and tutorials. If you need to write a document to describe how to do something on your computer, ScreenSteps aims to be the answer. The interface is designed around a very sensible workflow. Do whatever it is you want to document, taking screen-shots as you work, then, after you’re done, annotate and describe those screen-shots. The idea is that you don’t interrupt the task you’re documenting by constantly switching context between doing the task and documenting it. This is a great workflow, and to facilitate it ScreenSteps provides a built-in screen-shot utility in the form of a small floating window that’s always available yet doesn’t get in your way.

Once you’re done taking the screen-shots you go back to the main ScreenSteps interface where you’ll find that a “step” has been created in your document for each of the screen-shots you took. All you have to do now is explain these steps by adding annotations to your images, and textual descriptions to the steps.

Steps are a very important part of ScreenSteps’ architecture. A lesson is nothing more than a sequence of steps. Each step has a title, an optional screen-shot, and an optional textual description. You can add text, arrows, boxes, ovals, and sequence numbers to the screen-shots to help illustrate your point.
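To make that structure concrete, here’s a minimal sketch of how I picture it, written in Python. I have no idea how ScreenSteps actually stores its data, so every class and field name below is my own invention, purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    # One of the annotation types listed above (names are hypothetical).
    kind: str  # "text", "arrow", "box", "oval" or "sequence"
    x: float
    y: float


@dataclass
class Step:
    title: str                           # every step has a title
    screenshot: Optional[str] = None     # optional path to an image
    description: Optional[str] = None    # optional textual description
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class Lesson:
    # A lesson is nothing more than an ordered sequence of steps.
    title: str
    steps: List[Step] = field(default_factory=list)


lesson = Lesson("A hypothetical lesson")
lesson.steps.append(Step("Open the preferences", screenshot="prefs.png"))
lesson.steps.append(Step("A text-only step", description="No image needed here."))
```

Note that the second step has no screen-shot at all, which is exactly the kind of optional field described above.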

I should also say that there are two versions of ScreenSteps, a pro version and a standard version, and that both versions are available on both Mac OS X and Windows. I’m using the OS X edition of ScreenSteps Standard which retails for $39.95.

Clearly the design of this app has been well thought out, and it’s built around a fundamentally sound workflow. There can be no doubt about the fact that the developers put a lot of hard work and deep thought into the design of ScreenSteps. My issues are just with certain aspects of the implementation.

Before I go on to explain my frustrations, I just want to give some background. I’m not an average user; I’m a power user. I have very high expectations for the software I use. This is doubly true for Mac software, which I’ve found is generally more polished and more usable than Windows software. I also regularly use some very powerful apps on the Mac. I’ve seen just how well some things can be done, so when I come across apps that don’t do things as well, it frustrates me. I know it can be done better, because I’ve seen it done better. Think of it this way: if the iPhone had never existed, no one would complain about the Palm Pre. It would be seen as the best phone on the planet, the very personification of perfection. But the iPhone DOES exist, and that colours people’s views of the Palm Pre. It gets judged much more harshly than it otherwise would simply because the iPhone exists. Similarly, I can’t forget about the other apps I’ve used on the Mac when I judge the usability of ScreenSteps.

Spacial Confusion

So, let’s go through the workflow, starting with the screen-shot acquisition phase. This mostly works perfectly, but I did run into one issue, which I was able to work around. I’ve already described the cool floating window ScreenSteps uses to gather screen-shots. It’s a great idea, but it doesn’t play nice with Spaces on OS X. Whenever I want to document something I like to use an empty Space; that way my screen-shots only show relevant information. Following my usual workflow for documenting something, I opened ScreenSteps in Space 2, and all the windows I needed for the process I was documenting in Space 3. When I moved to Space 3 the floating window vanished. No matter what I did, and I tried every way I know of for moving a window from one Space to another, I could not get that floating window to move into Space 3. I had no choice but to move the main ScreenSteps window into Space 3. Once I did this, the floating window also moved itself into Space 3. This wasn’t a major problem though, because I was able to minimise the main ScreenSteps window, getting it mostly out of the way. In the grand scheme of things this is a very minor annoyance, but worth mentioning nonetheless.

Inconsistent Context

Let’s move on to the next phase, and start annotating images. My first confusion and frustration came straight away. There were all these images in my lesson, but no sign of any tools for annotating them. I knew the program could do it, but I couldn’t find the tools. After some head-scratching I discovered that you have to select an image to see the tools.

I don’t want to get into the age-old argument about whether or not context-sensitive toolbars are a good or a bad idea. They’re not common on the Mac, though they are used, and Microsoft absolutely love them. However, usability experts point out that they increase the cognitive load on users, and hence make interfaces more difficult to use. Personally, I hate them, but that’s just an opinion, and I’m not going to hold that against ScreenSteps.

However, what I am going to hold against ScreenSteps is the inconsistent way in which they expose context-sensitive controls. When you select an image, two sets of buttons appear. One set appears docked to the image, the other in the toolbar at the top of the window. If you have to expose context-sensitive functionality, docking the buttons to the object they manipulate is a very good and very clear way of doing it. Your mouse is obviously at the object since you’ve just clicked on it, and you can’t possibly miss the appearance of the new buttons because they are right in your field of view. My problem is with the fact that ScreenSteps only exposes SOME of the context-sensitive options in this sensible way. The majority of them appear and disappear way out of your immediate field of view up in the toolbar at the top of the window. This is just not standard behaviour on the Mac. It’s not what users expect, and hence it’s confusing and dissonant. Guys, if you read this, please put ALL the context-sensitive controls in the same place, ideally docked to the image or text-area they belong to!

Imperfect Imitation

The strange toolbar behaviour is also a symptom of an underlying implementation decision that I simply don’t approve of, and with good reason. ScreenSteps is not a true native Mac app – or, to be more precise, it does not use OS X’s standard interface libraries, known as Cocoa. It is instead built using a custom library that tries to mimic Cocoa, but doesn’t fully succeed. In native Cocoa apps you can customise toolbars, and that’s a feature I make heavy use of. I tweak the toolbars in most of the apps I use regularly so that the functions I use often are easy to get at, and those I almost never use don’t clutter my interface. This is clearly a power-user feature, but it’s one I rely on a lot, and one ScreenSteps doesn’t support.

Similarly, on the Mac there are conventional keystrokes that most apps have in common. I’m talking about things like cmd+c for copy, cmd+v for paste, cmd+a for select all and so forth. Thankfully ScreenSteps does implement these very obvious standard keystrokes, but it’s missing others. One I rely on very heavily is cmd+i which should bring up an inspector window, but doesn’t in ScreenSteps. ScreenSteps does use an inspector, and it does have a keystroke for bringing it up, but it’s a non-standard one, which is exceptionally frustrating. Muscle memory can be a great thing, and on the Mac it almost never lets you down because almost all apps obey the conventions. Windows users are used to there being almost no commonality between apps made by different companies; Mac users are not. This kind of non-standard behaviour is like someone constantly poking me in the side while I use ScreenSteps.

The non-standard behaviour of the toolbar is not so easy to fix, but the non-standard keystroke certainly is.

Modal Madness

So, after that detour into the bowels of the interface, let’s get back to annotating images. Leaving aside the poor use of context-sensitive toolbars, I find this whole aspect of the interface needlessly clunky, cumbersome, unpolished, annoyingly simplistic, and frustratingly inconsistent. If I were writing a school report card it would say “can do better”.

This aspect of the interface has a state, and that state determines its behaviour. This is not unusual, or necessarily a bad thing, but in ScreenSteps I find the stateful behaviour both overly rigid and inconsistent.

When you’re annotating an image there are seven modes the interface can be in: “select”, “crop”, “line”, “rectangle”, “oval”, “sequence”, and “text”. You change modes by choosing the relevant icon from the toolbar, and the mode you are in is illustrated by a white drop-shadow around the relevant icon. For a start, this highlighting isn’t as clear as it could be. It would probably work better if the current mode was shown as a depressed button in the way bold and italics buttons are typically displayed.

These modes sound clear enough, and, for the most part, they work pretty well. The only mode that causes issues and confusion is the “select” mode. This mode allows you to move any item around. However, it’s not the ONLY mode that allows you to move things around. When you’re in “sequence” mode you can move ONLY sequences, when you’re in “rectangle” mode ONLY rectangles, and so on.

This confusing behaviour caused me a lot of frustration. For example, one image I was working with had a lot of very small items in it which I needed to draw attention to. Just putting numbers next to the items wasn’t clear enough, so I decided to move the numbers away from the items a little, and draw arrows from the numbers to the items. As I was doing this I was constantly moving between “line” mode and “sequence” mode. Being a perfectionist I was also constantly moving the arrows and sequences around to spread them out more evenly over the image and help make it as clear as possible. Firstly, having to change mode to move anything is exceptionally annoying anyway, so it was not behaviour that came naturally to me, but getting away with moving half of the items on the screen without changing mode, but not the other half, was very confusing. Say I’d just added a sequence number and then wanted to move another one further away. Having forgotten to change mode as usual, I’d drag it and it would move. Then I’d go to move its matching arrow, and a new sequence number would get added on top of the arrow I’d tried to move. Having to change mode to move anything at all is already very annoying, but having to remember that you can only move SOME things without changing modes is downright bad design. This poor design is placing FAR too much cognitive load on users. We should be free to think about what we want to achieve with the interface without having to constantly think about modes.

A very simple way in which this interface could be massively improved would be to allow all items to be moved in any mode. That simple change would make a world of difference.
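To show how little logic this would take, here’s a rough sketch in Python of the click handling I have in mind. This is purely my own illustration: the `Item` class, the circular hit area and the `handle_click` function are all hypothetical, and I’ve left “crop” mode out entirely. The idea is simply to check whether the click landed on an existing item BEFORE looking at the current mode.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Item:
    kind: str             # "line", "rectangle", "oval", "sequence" or "text"
    x: float
    y: float
    radius: float = 10.0  # hypothetical circular hit area around the item

    def contains(self, px: float, py: float) -> bool:
        return (px - self.x) ** 2 + (py - self.y) ** 2 <= self.radius ** 2


def handle_click(items: List[Item], mode: str,
                 px: float, py: float) -> Tuple[str, Optional[Item]]:
    """Decide what a click means: move an existing item or create a new one."""
    # Hit-test first, regardless of mode, checking the topmost item first.
    for item in reversed(items):
        if item.contains(px, py):
            return ("move", item)
    # Nothing under the cursor: only now does the mode matter.
    if mode == "select":
        return ("none", None)
    new_item = Item(mode, px, py)
    items.append(new_item)
    return ("create", new_item)
```

With this ordering, clicking on an arrow while in “sequence” mode grabs the arrow rather than dropping a new sequence number on top of it, which is exactly the mistake I kept making.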

No Groups Allowed

I also want to mention another short-coming in the interface for annotating images – there is no way to group items together. It should be possible to in some way stick items together so they can be moved as one. There are many ways in which this would be very useful. For example, it would be nice to link arrows to sequence numbers, sequence numbers to text, text to arrows, sequence numbers to arrows to rectangles, and so on and so forth. Simple grouping is even available in MS Word, which is a word processor, rather than an app specifically designed for annotating screen-shots!

Simple grouping would be good enough, but in an ideal world items would snap together and link themselves like they do in OmniGraffle. Omni’s approach to linking objects is to place “magnets” on key parts of objects, say on each end of an arrow or line, in the middle of a label, at the edges of rectangles, etc., and have these magnets snap together when they’re dropped on each other, and then stay together as the objects are moved around. This design is simple, elegant, intuitive, powerful, and easy to use – ScreenSteps could really benefit from implementing something similar.
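For the curious, the snapping half of the magnet idea can be sketched in a few lines of Python. To be clear, this is only my guess at the general technique, not how Omni actually implements it, and the `snap` function and its `threshold` parameter are names I’ve made up. It covers only the initial snap, not keeping objects attached afterwards.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def snap(point: Point, magnets: List[Point], threshold: float = 8.0) -> Point:
    """Return the nearest magnet within `threshold`, else the point unchanged."""
    best = None
    best_dist = threshold
    for magnet in magnets:
        dist = math.dist(point, magnet)  # Euclidean distance (Python 3.8+)
        if dist <= best_dist:
            best, best_dist = magnet, dist
    return best if best is not None else point
```

Drop the end of an arrow within a few pixels of a magnet and it jumps onto it; drop it anywhere else and it stays where you put it.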

If I’d never used OmniGraffle I’d probably be happy with just regular grouping, but having seen Graffle I know just how well object grouping and manipulation can be done, so I can’t help but compare ScreenSteps’ primitive implementation to Omni’s masterful one. But, as I’ve already said, at the very least ScreenSteps definitely needs to implement basic grouping à la MS Word.

Primitive Text

My final frustrations revolve around text entry. To say ScreenSteps only supports basic text formatting is to put it mildly. I’ve literally had typewriters that were more feature-rich than ScreenSteps! For example, my old typewriter could write “the 14th” with a raised “th”; ScreenSteps can’t, since it has no superscript or subscript support! I don’t think it’s unreasonable to expect to be able to write “the 3rd thing you need to watch out for is …..” with a proper superscript in a tutorial – but you can’t in ScreenSteps, and that’s not even the worst omission here!

The only text formatting options you have are bold, italic, underline, and colour. You have no control over the actual font, so it isn’t possible to write something in a clear fixed-width font which clearly shows the difference between a lowercase ‘L’ and an uppercase ‘i’. Writing a terminal command, or a file or program name, in a regular font is just not good enough in my opinion; you need clear fixed-width fonts for things like that. ScreenSteps is all about writing tutorials for how to do things on computers, yet it can’t render text in a clear, unambiguous font, as is the expected norm in computer documentation!

The second massive clanger for me is the lack of support for either bulleted lists or numbered lists. You can add sequence numbers to images, but you can’t add a matching list of things to do to your text. OK, you CAN, but only if you fake it by starting new lines with *s or numbers manually – but that’s just frustratingly clumsy. Lists are a great way of keeping things concise, simple & clear, and are FAR from rocket science. ScreenSteps really needs to add basic list support to their next version.

To be honest, not only did I expect list support, I expected the addition of sequence numbers to an image to automatically generate a corresponding numbered list in the text below. In fact, I’d even expected the list in the text to be linked to the sequence numbers on the image – like footnotes are linked to their insertion points in word processors. Should I delete the third sequence counter out of four from an image, I’d expect the third item in the list below to also be deleted, and the original fourth item in both places to re-number itself to 3.
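Here’s a small Python sketch of the linking I’m imagining. Again, this is just my own illustration of the idea, and the `SequenceEntry` and `LinkedSequence` names are invented. The trick is to store each marker and its list text together in one ordered collection, and to derive the numbers from position rather than storing them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SequenceEntry:
    x: float   # where the marker sits on the image
    y: float
    text: str  # the matching list item in the step's description


class LinkedSequence:
    def __init__(self) -> None:
        self.entries: List[SequenceEntry] = []

    def add(self, x: float, y: float, text: str) -> None:
        self.entries.append(SequenceEntry(x, y, text))

    def delete(self, number: int) -> None:
        """Remove marker `number` (1-based) together with its list item."""
        del self.entries[number - 1]

    def numbered_list(self) -> List[str]:
        # Numbers come from position, so later items renumber automatically.
        return [f"{i}. {e.text}" for i, e in enumerate(self.entries, start=1)]


seq = LinkedSequence()
for i, label in enumerate(["Open the menu", "Pick a file", "Tick the box", "Click OK"]):
    seq.add(10.0 * i, 20.0, label)
seq.delete(3)  # delete the third counter out of four
```

After the deletion, `seq.numbered_list()` gives three items, and the old fourth item, “Click OK”, is now number 3 in both places.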

Linking as described above is more of a wish-list item though, it’s not a requirement. The total lack of support for any lists at all however, that’s a different matter. How a tool for writing anything can make it to version 2.6 without basic support for lists is beyond me.

Wishful Thinking

Finally, let’s move on to a short wish-list for future versions. These are not shortcomings, they’re just enhancements that I’d like. I’m not criticising ScreenSteps for not doing these things.

In an ideal world I’d love nested steps. Even just one level of nesting would be great, but obviously in a truly ideal world there’d be no limit to the depth of the nesting. Yes, this kind of nesting could be abused to make really complicated documentation, but, it would also open up great possibilities for power users.

Assuming nested steps aren’t going to happen, a nice fall-back would be the ability to have more than one image in a step. There are many valid reasons for wanting this, one example being a before and after shot. That’s not two steps, it’s one step – do this and you should get this.


So, in conclusion – ScreenSteps DOES work – and that’s a really important point. It does make it easier to make nice tutorials, and that’s a credit to its developers. However, it’s missing a lot of polish. I find image manipulation annoyingly clunky, and text manipulation shockingly primitive. These are two very important aspects of the app, so it’s a real pity they’re both distinctly underwhelming. The good news is that none of these things are un-fixable. Were ScreenSteps to focus some development time on really improving the smoothness and power of their image annotation and text-entry interfaces this could be one of my very favourite apps. Please guys, if you’re reading this, spend some time on polishing the usability before the next release. I think I can really come to love this app a few releases further into its life.