Insights from practitioners in Information Management

Issue 80 – Living Life in the Cloud

As our reliance on the digital world continues to grow, so does our need for more storage space at a reasonable price. But do we keep our growing digital presence “in-house”, or do we use the companies that offer “cheap/free” storage on the Internet to solve our problems? This paper takes a brief look at some of the issues.

In this issue we will look at:

  • Living life in the cloud
  • The cost of cloud storage
  • Sweeping the problem under the carpet
  • Changing File Formats
  • Upgrades to the “cloud” servers
  • Retention and Disposal of Records
  • Metadata
  • Security of Information
  • Dead but not forgotten
  • Is this the “future” or just another stepping stone?

Living life in the cloud

What do we mean by “the cloud”?

In simple terms:
Cloud computing is a way of computing, via the Internet, that broadly shares computer resources instead of using software or storage on a local PC. Cloud computing is an outgrowth of the ease-of-access to remote computing sites provided by the Internet.

This paper looks at some of the issues surrounding our increasing reliance on outside service providers to look after our digital lives.

The cost of cloud storage
While we may think it is going to cost less to store our data offsite, there are some things to consider before doing so:

If you are trying to determine a monthly service fee so you can make an informed decision about which cloud service provider to choose, or whether to keep the information in-house, beware the hidden fees.

The basic cost per gigabyte of cloud storage will most likely be prominently displayed on a company website. For example, the basic cost for Amazon Web Services is $US0.15; pricing for Zetta starts at $0.25 and decreases as more data is stored to the cloud.

What they may not “advertise”, however, are the costs associated with transferring material to and from the cloud (the more users accessing the cloud, the larger these costs will be). And then there are the costs associated with deleting your information.

“All providers will charge for data transfers in and out of the cloud based on the volume of data transferred (typical cost is $US0.10 per GB). Some will also charge for metadata functions such as directory or file attribute listings, and copying or deleting files. While these metadata operation costs are generally miniscule on a per-operation basis (maximum of $US0.01 per 1,000 for Amazon), they can add up.”

Given that most organisations are storing terabytes of data, which isn’t hard to do if you are dealing with embedded video, large graphics files and entire archives of scanned material, the costs can be staggering.
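To see how these charges add up, here is a rough sketch of the arithmetic in Python. The rates are simply the figures quoted above; treating the storage rate as a monthly charge is my assumption, and the organisation in the example (5 terabytes stored, 500 GB transferred, two million metadata operations a month) is hypothetical. Check your own provider's price list before relying on any of it.

```python
# Illustrative rates taken from the figures quoted above; real price lists differ.
STORAGE_PER_GB = 0.15    # $US per GB stored (assumed here to be a monthly rate)
TRANSFER_PER_GB = 0.10   # $US per GB moved in or out of the cloud
OPS_PER_1000 = 0.01      # $US per 1,000 metadata operations


def monthly_cost(stored_gb, transferred_gb, metadata_ops):
    """Return a breakdown of the estimated monthly bill in $US."""
    storage = stored_gb * STORAGE_PER_GB
    transfer = transferred_gb * TRANSFER_PER_GB
    operations = metadata_ops / 1000 * OPS_PER_1000
    return {
        "storage": round(storage, 2),
        "transfer": round(transfer, 2),
        "operations": round(operations, 2),
        "total": round(storage + transfer + operations, 2),
    }


# Hypothetical organisation: 5 TB stored, 500 GB moved, 2 million operations.
estimate = monthly_cost(stored_gb=5 * 1024, transferred_gb=500, metadata_ops=2_000_000)
for item, dollars in estimate.items():
    print(f"{item:<10} ${dollars:,.2f}")
```

On these assumed numbers the "hidden" transfer and operation charges add roughly $US70 to a storage bill of about $US768, and they grow with every extra user and every retrieval.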

But there are other “costs” to consider. What happens if and when the company you choose as your cloud service provider gets taken over, or worse still goes under? Will you be able to recover your information? Will they guarantee you can have it back should either of those scenarios happen?

Think about what would happen if your chosen vendor were unscrupulous in its business dealings and refused to release your data unless you paid “ransom money” – how much would you have to pay to get your information freed?

As we have discussed, cloud storage providers already charge for the removal of information from their storage systems, so this is not such a “silly” scenario to raise. So, if you need access to your information and your vendor is being antsy about giving it up – how much would you be willing to pay for it? How much is your information worth to you?

Sweeping the problem under the carpet
You’ve done the math and you’ve determined the cost to outsource your electronic archive to cloud storage is well worth it. Why worry about long-term archival strategies when you can give the entire problem to someone else? But are you giving away your problem entirely?

Changing File Formats
When it comes to the electronic world in which our organisations tend to exist, there is a major issue we need to discuss, and that revolves around file formats. If you had to guess – how many different file formats does your organisation use on a daily basis?

To state the obvious ones, there are:

PDF, .doc, .docx, .txt, .ppt, .pst, zip files, Excel spreadsheets, databases, backups, etc.

As you and I know, most of these everyday file formats are not “open” formats. They are proprietary, and therefore reliant on the vendor’s desire and ability to maintain them over time:

  • How many times have you upgraded your electronic world?
  • Have you ever received a file format you couldn’t open?
  • How did you convert the file to one you could? The first time someone sent me a .docx file I had to download a converter so my perfectly adequate but old version of Word could open it.
  • Do you still have floppy discs stored in your hard copy archive? How about the safe? One question – does your new, whizz bang, modern computer have the capacity to open them or do you just have a CD drive? What would happen if you DO want to open these so you could see what was stored on them? How would you copy these discs so you could migrate the data back into a digital format you can send to the cloud?

I’m fortunate: I have a very old laptop. It may have been rebuilt three times after major system crashes, but it can read discs and CDs and can handle an external hard drive. But why would this matter?

Well, it depends what you have on your discs, I suppose – and if you can’t open them, how will you ever know? For me, mine contained part of my writing history and some of the first digital photographs of the kids. These discs came in handy when I was rebuilding my electronic world after system crashes, and it didn’t take me long to realise that discs and CDs were not going to be the best medium for long-term archival storage of what I deemed to be my vital records.

Assuming all your information is already in a cloud-friendly format (i.e. electronic and on your server somewhere), you make the decision and send it on its way. However, as with all records, these will need to be reviewed on a periodic basis (or retrieved for further use) and ultimately marked for destruction (but more of that later).

As you can imagine, the problems begin when we try to review these “archived” documents and find the file formats cannot be opened on our recently upgraded systems.

Sweeping the electronic archiving problem under the carpet by not addressing file formats BEFORE you send your archive to the cloud can cause major headaches for your organisation when you do come to retrieve these documents in a few years’ time.

So the question has to be – do we convert our long term archive into a non-proprietary format before we archive, just so we can be assured this problem does not occur, and do we keep backups of backups just in case the cloud service provider disappears?
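Before answering that, it helps to know what you actually hold. The sketch below is one simple way to take stock: it walks a folder and counts files by extension, then flags anything on a shortlist of formats you want to review. The function names and the shortlist are mine and purely illustrative; which formats count as "at risk" is a judgment only your organisation can make.

```python
from collections import Counter
from pathlib import Path

# Hypothetical shortlist of extensions treated as proprietary for this sketch;
# your own list will depend on the software your organisation uses.
PROPRIETARY = {".doc", ".ppt", ".pst", ".xls"}


def inventory_formats(root):
    """Count the files under `root` by lower-cased extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts


def flag_for_conversion(counts):
    """Return the extensions in `counts` that appear on the PROPRIETARY list."""
    return sorted(ext for ext in counts if ext in PROPRIETARY)


if __name__ == "__main__":
    counts = inventory_formats(".")  # "." is a placeholder for your archive folder
    for ext, n in counts.most_common():
        print(f"{ext:<16} {n}")
    print("Review before archiving:", flag_for_conversion(counts))
```

An extension is only a hint about what is inside a file, so treat the result as a starting point for the conversation rather than a definitive audit.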

Upgrades to the “cloud” servers
In addition to our own file format issue, we also have a hardware issue to consider, and not just our own.

If there is anything we can absolutely guarantee in the electronic world, it is upgrades to systems, software and hardware, and our cloud service providers are going to be no different. As more people send their material to the cloud, additional space will be needed to cope with the traffic. More traffic brings its own set of problems, of course, but consider the upgrades themselves:

Every time there is an upgrade to hardware and software we run the risk of losing information. We mentioned the file formats issue earlier. What happens if the newer hardware and software used by the cloud storage companies is no longer compatible with our own hardware and software OR the older formats we have stored?

With migration across platforms, we run the risk of losing information. It’s nothing new; it’s been happening since the dawn of the electronic world:

  • Would we know what information we had lost?
  • Do we know how much information we have stored?
  • Do we know WHAT we have stored?
  • What guarantees do the cloud storage providers give us?

In the early days of Google Mail (Gmail) this happened: people lost information and were told,

“sorry – don’t know quite what happened there, but we can’t find it.”

Is having a cheaper storage option worth the risk of potential data loss?
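One way to at least know what you have stored, and to notice when something goes missing or changes, is to keep your own manifest of checksums before anything leaves the building. The following is a minimal sketch, assuming the archive sits in an ordinary folder; the function names are hypothetical, and a real archive would also want the manifest itself stored somewhere safe.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file, read in chunks to spare memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare(before, after):
    """Return (missing, changed) file names between two manifests."""
    missing = sorted(name for name in before if name not in after)
    changed = sorted(
        name for name in before if name in after and before[name] != after[name]
    )
    return missing, changed
```

Build one manifest before you upload and another from whatever you later retrieve; `compare` then tells you which files have vanished and which no longer match. It cannot bring anything back, but it does answer the question "would we know what we had lost?"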

Retention and Disposal of Records
I know I have been talking “worst case scenario” so far, but I do have to ask these questions so that you ask them too. But let’s move on.

What about your Record Keeping Requirements, and in particular the Retention and Disposal of Business Records?

As we all know “We keep business records for a number of reasons. Every organisation regardless of its size, creates, receives and uses records (both paper based and electronic / digital) in relation to business activities on a daily basis. These “records” form the framework around which an organisation conducts its business, complies with regulatory requirements and can provide necessary accountability of business activities. The record, subject to a test of reliability, is proof of how things were at any given point in time.”
Section 1: What is a record? P16 Australian Record Retention Manual 2009

When it comes to electronic records there are a number of concerns, not least of which is this:

When we don’t know “where” a record is stored, how can we be sure it has been deleted from every server, every backup tape and every mirror site?

Given that one of the marketing benefits is touted as being – “you can access your material from anywhere, so long as you have an Internet connection” it is a question that needs to be asked before moving your entire archive to the cloud.

Again, what guarantees do you have? Are they worth the paper (electronic or otherwise) they are printed on?

While you may no longer need to keep the “record”, you do need to know it once existed, so it is important to retain the metadata about the record even if you don’t keep the record itself. As Kate Cummins stated in an article entitled “Pass the Digital Stress Test”:

“In a legal and business sense, records are of little value if you cannot demonstrate that they are what they purport to be. If you can’t account for their management, if you can’t demonstrate the role they played in business process, if you can’t show who had access to them and when, if you can’t provide the business rules that governed their use, many records lose all value”
Image and Data Manager, July/August 2009, p. 15
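To make the idea of keeping the metadata after the record has gone a little more concrete, here is a small sketch of what such a "stub" might hold. Every field name here is hypothetical and chosen for illustration only; your own recordkeeping system and disposal authority will dictate the real ones.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class RecordMetadata:
    """Metadata that outlives the record itself (all field names are illustrative)."""
    record_id: str
    title: str
    created: date
    retention_years: int
    access_log: list = field(default_factory=list)   # (who, when) pairs
    destroyed_on: Optional[date] = None
    destruction_authority: Optional[str] = None

    def log_access(self, who, when):
        """Note who looked at the record and when."""
        self.access_log.append((who, when))

    def disposal_due(self):
        """Earliest date the record may be destroyed."""
        target_year = self.created.year + self.retention_years
        try:
            return self.created.replace(year=target_year)
        except ValueError:  # created on 29 February and the target year is not a leap year
            return self.created.replace(year=target_year, day=28)

    def mark_destroyed(self, when, authority):
        """Record the destruction, refusing if the retention period is still running."""
        if when < self.disposal_due():
            raise ValueError("retention period has not yet expired")
        self.destroyed_on = when
        self.destruction_authority = authority
```

The point of the design is that destroying the record only fills in two more fields. The identifier, the access history and the rule that authorised disposal all remain, so you can still show that the record existed and was managed properly.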

Security of Information
If the above has not raised your hackles, the security of information should set the alarm bells ringing.

Yesterday I downloaded another anti-virus program. As it began the installation procedure, it recommended I also download all the upgrades, and I sat and watched in utter incredulity as each upgrade reported how many cookies, traces, Trojans and worms were being “fixed”. Thousands upon thousands every time, and the problem is not getting any better.

In fact, this was reported in today’s news:

Spanish authorities say they have nabbed the hackers behind the Mariposa botnet. The botnet, which was developed for large-scale theft of information, took control of more than 13 million computers in 190 nations.

According to the news report, the three accused did not appear to use the network to the extent it could have been used; however, the information was apparently on-sold to other “organisations” who did.

If your information resides in the electronic realm then it is always going to be vulnerable to attack from people who can hack, crack and slither through the gaps left by unsafe, dictionary-based and easily guessed passwords and sloppy computing practices.

Are you still happy to send your information to a cloud storage provider?

Let me put it this way: if you have a Hotmail account you may have been a recent victim of a phishing attack, and Twitter and Facebook are constantly being targeted. These are all cloud services, so maybe it’s time to tighten your passwords?!

Dead but not forgotten
What happens to your personal “digital” information when you die? Who do you trust with your passwords? The truth is, if it isn’t managed from beyond the grave:

  • would we care?
  • would our families care?
  • would the organisations we store our information with give two hoots, or would they eventually mothball our accounts and move our digital life to a backroom somewhere?

It is an interesting aspect of the digital realm in which we reside. In the not so recent past, our families (or an executor) would have the dubious pleasure of rifling through our paper based history. Sniggering over journals and photographs they would make the decision whether to keep, toss or donate to a collecting institution. In the electronic world, it’s hard to remember where we stored anything let alone what the passwords are. But in reality these bits and bytes still need managing in the same way as the contents of our houses, homes, safes and safety deposit boxes.

But one question – will there be anything to “give” to collecting institutions if it’s all electronic? For someone like me, with a writing history that spans decades and most of it residing in the electronic realm – how can I pass on the baton that is / was my life and my business unless I give someone else the passwords and the details of where and how to find it?

In his article “Managing your Digital Remains”, Scott Brown cites three companies that have already jumped onto that particular bandwagon. But as with all things electronic – how can we be sure these companies will still be around, if and when we do die, to do what we paid them to do?

Is this the “future” or just another stepping stone?

So what of the future?

As you can appreciate, these questions, issues and ponderings will be with us for as long as the problem exists. And given we still haven’t solved the long-term electronic archiving issue (PDF/A, HTML, etc. notwithstanding), I’m not hopeful the cloud will be the answer we’ve been hoping for.

Or as someone once said:

“Plus ça change, plus c’est la même chose”

With many thoughts