The Reboot: Going (Mostly) Paperless


I’m rebooting my life. After 10 years at the BBC, I’m switching careers and running my own business full-time. And that’s not the only big switch going on around here: I’m also moving from Windows to Mac for just about everything. Moving everything to the Mac doesn’t just affect my business, it shapes the very heart of my digital world. It also introduces its own interesting challenges.

Entrepreneur or ordinary consumer, regular TAB readers may find these are problems we have all faced. Finding solutions has been challenging and fun in equal measure — so it seems prudent to share them here. If I do a good job, maybe my solutions will work for you, too.

Going (Mostly) Paperless

I’m no eco-warrior. I’m too old and just don’t have the energy for all that worrying. Plus, I like my double-quilted, ultra-soft, aloe-infused toilet paper. I know, it kills the Earth — and I am a scoundrel. A scoundrel with a pampered bum.

That said, 10 years ago it seemed no one (besides hemp-wearing hippies) gave a toss about The Environment. Today, governments are awarding grants and tax relief to eco-friendly companies. The take-home message is, look, The Environment is big, it’s probably going to be around for a while, so we’d better take it seriously.

Going mostly paperless is one easy way to show a bit of eco-love. Forget all that nonsense about being completely paperless — only the most dedicated geeks/tree-huggers ever achieve that, and, to be perfectly honest, it’s often impractical. We don’t live in a world that tolerates the paperless model. Not yet. But, that doesn’t mean you shouldn’t try. The benefits are tremendous. Over three years of correspondence, bills, statements and other documents now live on my iDisk — the original hard copies shredded and recycled. I can’t believe how much office space I’ve reclaimed!

You can only imagine the pleasure I felt the first time I put my system to the test; I needed to find some obscure cable account information. It took me all of six seconds with a simple Spotlight search. Bliss.

Finding my Feet

To get to this point I spent a good few weeks digitizing every bit of paperwork I found in my home office, and after a bit of stumbling, found my feet by settling on a solution that worked for me without needing any software beyond the scanner interface and the tools already baked into Mac OS X.

I use an HP Photosmart C5280 — one of those typical printer/scanner/copier combo devices that seemed so impressive five years ago. This particular model has been out for about two years now, but the more recent offerings from HP have nearly identical software.

Just a quick word on the software, which I have run on both Mac and Windows machines. Most of the Windows-based OEM software bundled with it is horrible — big, garish windows with custom buttons and controls that were obviously chosen because the designers figured “big, bold and vomit-like” equaled “user-friendly and intuitive.” Thankfully, the Mac version of the software doesn’t suffer quite the same fate; most of it is in line with the sleek, elegant lines of Leopard.

First up is the main scan control interface (see photo below). This presents all the options and controls you need for scanning a document: defining part or all areas of a document to scan; rotation; resolution; color; and the file format of the final scan. I chose PDF. I did not use HP’s own optical character recognition (OCR) software because it scans only in black and white, often screws up important text (turning, say, an account number into bizarre hieroglyphs) and, worst of all, the final document format is Microsoft Word.

HP Scan Pro - Main Window

Rather than opting for HP’s OCR output, choosing “Scan to: PDF File” yields exactly the right results; images are faithfully scanned and reproduced, while text is properly rendered in the final PDF.

HP Scan Pro - scanning progress bar

Once you have made your choices, the final scan usually takes about 10 seconds to complete.

Scanned PDF in Adobe Acrobat

The final scanned document — in this screenshot, as it appears in Adobe Acrobat Pro — is a faithful reproduction of the original.

Scanned documents contain real selectable text

The real benefit from turning your scans into PDF documents is that all the text they contain is real. Not images of text. Actual text. Text that can be selected. Text that can be copied to the clipboard.
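If you want to check whether a given scan really contains text rather than a picture of text, you can peek inside the file. What follows is only a rough sketch and not part of the HP or Apple software: it assumes that a PDF with a real text layer references at least one font, so it looks for the `/Font` marker in the raw bytes. The function names and the folder argument are hypothetical, chosen for illustration.

```python
from pathlib import Path


def looks_text_searchable(pdf_bytes: bytes) -> bool:
    """Rough heuristic: a PDF with a real text layer references a font.

    Image-only scans usually contain no /Font resource at all. A PDF that
    stores its objects in compressed object streams can hide the marker,
    so read False as "probably needs OCR", not as proof.
    """
    return b"/Font" in pdf_bytes


def needs_ocr(folder: str) -> list[str]:
    """Return the names of PDFs in `folder` that look like image-only scans."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.pdf")
        if not looks_text_searchable(path.read_bytes())
    )
```

Calling `needs_ocr` on your scans folder gives you a list of files that probably need a pass through OCR software before their text can be selected or searched.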

Spotlight results

Even better, all that text is automatically indexed and almost immediately searchable in Spotlight.
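The same Spotlight search can be run from a script through `mdfind`, the command-line front end to Spotlight that ships with Mac OS X. The sketch below is a minimal example under a few assumptions: the folder path is hypothetical, the function names are my own, and the search only finds PDFs whose text Spotlight has actually indexed.

```python
import subprocess


def build_spotlight_command(phrase: str, folder: str) -> list[str]:
    """Build an mdfind call that looks for `phrase` inside PDFs under `folder`."""
    # Escape characters that would otherwise end the quoted string early.
    safe = phrase.replace("\\", "\\\\").replace('"', '\\"')
    # "cd" after the string makes the match case- and diacritic-insensitive.
    query = (
        'kMDItemContentType == "com.adobe.pdf" && '
        f'kMDItemTextContent == "*{safe}*"cd'
    )
    return ["mdfind", "-onlyin", folder, query]


def search_scans(phrase: str, folder: str) -> list[str]:
    """Run the query (Mac only) and return the matching file paths."""
    result = subprocess.run(
        build_spotlight_command(phrase, folder),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

For example, `search_scans("cable account", "/Users/me/Scans")` would list every indexed PDF under that (made-up) folder that mentions the phrase. Building the command is kept separate from running it, so you can inspect the query on any machine even though `mdfind` itself exists only on a Mac.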

Happy Hippies

If you’re looking to embrace a truly all-digital lifestyle, going paperless has to be one of your goals. Maybe this solution can help you get there without breaking the bank buying into some of the “end-to-end” scanning and archiving solutions available. (I’ve tried NeatReceipts; it was not ideal for me, but your mileage may vary.)

I use Adobe Acrobat to read and manipulate my PDF documents once they’re saved, but you don’t need such expensive software to do that. Every copy of Mac OS X has Preview, and that’s more than enough for reading and performing basic edits.

And there’s the point: My desire to go all-digital is driven by my needs as a freelance media professional, small business owner and tech-savvy individual who wants to get his life in order. Mac OS X supports me with most of the tools I need, and they’re right there from day one, out of the box.

In 10 years no one will use paper any more, and they’ll look back at articles like this one and laugh at how much we old-timers struggled with the transition out of the Stupid Ages. What’s really funny is that it’s not so hard at all. Give it a try — you won’t be disappointed.

Next time on The Reboot…

‘Collaboration’ needn’t be a dirty word — but it is. It’s really very dirty. But that’s not because of a lack of tools. Goodness, no. iChat, Adium, aMSN, Chatty-Watty (OK, that one I made up) all let us jabber at one another endlessly. But I’m not 13 years old and I don’t have the mental agility to maintain long messenger conversations any more.

No, the problem isn’t a lack of real-time collaboration software. The problem is an abundance of it, offering countless solutions and services across all conceivable OS platforms. The choice is, frankly, bewildering.

So what are these other collaboration tools, and which work best? Is there such a thing as the perfect solution? What do I use every day, and why do I think you should, too? Join us next time to find out!

21 Responses to “The Reboot: Going (Mostly) Paperless”

  1. Demi God

    Adobe Acrobat Professional is doing the OCR work. That is why Patrick (no. 7) cannot select the text in Preview or Adobe Reader (which is not the same as Adobe Acrobat).

    In response to The Eck, the HP printer software is a collaboration between HP and IRIS. The very basic version of IRIS that is included with the HP all-in-one models performs OK, but IRIS obviously wants you to upgrade to the full version. It has no layout abilities, and text is accurate only on material that is already good-quality printed text.

    The annoying thing is that under 10.5 with the free HP software you seem unable to save colour OCR’ed files to PDF (an error occurred and the file could not be created!). Under 10.3.9 with the free HP software you could: you could OCR scan to a text file and save the output to PDF, which also preserved the location and page layout. Under 10.5, however, you can save colour OCR’ed text files to HTML, which preserves the layout as long as the HTML browser is the correct version. HTML is not a good archive format, though.

    The point here really is that you can do a basic job of OCR scanning with the HP/IRIS combo, but it is convoluted and not a robust solution. The Acrobat/HP combo would be better, I think.

  2. Please forgive the shameless plug, but if you’re looking for Mac OCR software to read those image-only PDFs, please drop by We’re working on a modern Mac app to convert images to PDFs with embedded text.

  3. Wilson Ng - Guam

    If an auditor sees that you are meticulous and well prepared with the scanning and filing of documents, then it would be permissible. They would tend to trust your filing system if you were careful about it. They obviously wouldn’t trust your scans if you just had a bunch of image files crammed in folders all over the hard drive.

    It also depends on the auditor if he/she allows photocopies. One of my friends had an idiot auditor that demanded the original carbon copies of credit card receipts as proof. My friend gave photocopies of the credit card receipts but the auditor didn’t want them. She wanted the “originals.”

    My friend proceeded to blast her a new bum-hole and showed her that most credit card receipts are on carbon paper which fades over time. All he had were blank scraps of paper. Finally the auditing company yanked her for incompetence and replaced her with another auditor who had some common sense.

    Auditors will not trust the validity of the scanned copies if they see that your record keeping is shoddy.

    I file my scanned PDFs in DevonThink Pro Office. In the comments section, I enter in any extra comments I may want to add about a certain receipt.

    I’m thinking of checking out Mariner Paperless (formerly known as ReceiptWallet) as a software package to help me with my quest for a paperless office.

  4. I applaud the actions and sentiments in this article. However, one thing that you should remember is that if the Inland Revenue were to ever investigate you then they would require the original receipts. Scanned copies will not suffice.

  5. David Chadderton

    This doesn’t work at all. I have the C6280 with the exact same software, but the text doesn’t ever become text unless you OCR it (how could it?). Scan it to a PDF and you just get a PDF with an embedded image of text.

  6. Liam, how do you feel about the security of having all your personal information stored on your iDisk?

    I scan and shred everything I can, and I get the same OCR functionality using my Canon Lide 70, but I store all the PDFs locally. I was considering putting everything onto a synced iDisk, but I don’t know anything about how secure Apple’s servers are, so I haven’t dared. I suppose I could encrypt every PDF, but that’s too much of an obstruction to efficiency, and I’ve read that having encrypted disk images on a synced iDisk results in corruption.

    Any thoughts?

  7. Couldn’t convince the last firm I worked for – before retiring – to go paperless. Truth is, at the consumer and service level, a certain amount of paperwork was needed.

    Once it got back to the office – absolutely, when it got to my home office – there was no need whatsoever for paper.

    I got them to email or FTP as much as they were capable of. Every piece of paper went into my Canon scanner > the paper then went through the shredder for recycling > the digital info has its own group of file folders on my hard drive.

    Daily backups of the whole hard drive were followed by a separate backup to a separate standalone just for company info.

    Not only was it a piece of cake, I was able to retrieve, discuss, utilize information on hundreds of accounts faster than the company offices.

  8. Patrick

    I’m with Jared … how are you getting the text to be recognized by Spotlight? I have already scanned a majority of my documents to .pdf, but nothing appears in Spotlight. I have the same HP multifunction copy/scan/fax machine, using the same software, also on a Mac. However, the .pdf documents it creates don’t allow me to select the text. I do NOT have the Adobe software you have; I’m just using the built-in preview window to see the .pdf documents. Is this the difference?

  9. @ 5 Jared: Sorry, I forgot to mention it by name. For DTPO it’s a collaboration between Devon and an outfit called ABBYY. Regarding the HP, I have no idea. Perhaps something built into Acrobat?

  10. First, thanks for an interesting article. It was a good read.

    Second, I have to suggest that you look at DevonThink Pro Office, which provides an integrated means of organizing, sorting, and storing your documents. DTPO has a built-in OCR engine that is really quite the bee’s knees in overall character recognition. In a recent project of scanning 20-odd pages of text, only half a dozen words were not recognized (due to poor inking on the original).

    If anyone is thinking about “going paperless,” do yourselves the favor of checking out the Fujitsu Scansnap S series scanners with auto document feeders that read both sides of a document in one pass. DTPO is tightly linked to the scanner drivers so that when you scan several different pages that are unrelated, you can direct each one into its appropriate folder within DTPO.

    @ #2 Champ: I don’t know about how the HP software works but in DTPO you definitely can specify any (or even a new) folder as you’re scanning rather than having to review all your scans that the driver sends to its “scans” folder.

    Devonthink is at and I have no association with Devon other than being a very satisfied user.