Archive for the ‘Architecture’ Category

Recipe: Migrating from Novell GroupWise to Microsoft Office 365 Part 2 – Understanding the Mailscape

After getting a clear picture of why users wanted to migrate from GroupWise to Outlook, we knew we needed to look closely at exactly what we were migrating. We had very raw numbers for GroupWise – we knew only that there were approximately 5000 mailboxes totaling over 30 terabytes of storage. This worked out to roughly 6 GB per mailbox, which seemed awfully high.
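The arithmetic behind that average, and behind the expectation that the move would take days rather than hours, can be sketched as follows. The 30 TB and 5000 mailbox figures come from the analysis; the 500 Mbit/s sustained throughput is purely an assumed number for illustration, not something measured on this project.

```python
# Back-of-the-envelope sizing for the figures quoted in the post.
TOTAL_STORAGE_TB = 30            # from the post: over 30 terabytes of mail
MAILBOX_COUNT = 5000             # from the post: approximately 5000 mailboxes
ASSUMED_THROUGHPUT_MBPS = 500    # hypothetical sustained rate in megabits/s

GB_PER_TB = 1000                 # decimal units keep the arithmetic simple
SECONDS_PER_DAY = 86_400


def average_mailbox_gb(total_tb: float, mailboxes: int) -> float:
    """Average mailbox size in GB."""
    return total_tb * GB_PER_TB / mailboxes


def transfer_days(total_tb: float, megabits_per_second: float) -> float:
    """Days needed to copy total_tb at a constant rate, ignoring all overhead."""
    total_megabits = total_tb * GB_PER_TB * 1000 * 8  # TB -> GB -> MB -> megabits
    return total_megabits / megabits_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    print(average_mailbox_gb(TOTAL_STORAGE_TB, MAILBOX_COUNT))
    print(round(transfer_days(TOTAL_STORAGE_TB, ASSUMED_THROUGHPUT_MBPS), 1))
```

Real migrations are usually slower than a raw bandwidth estimate suggests, because of throttling, per-item processing and retries, so a figure like this is best read as a lower bound.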

It also meant that we had enough mailbox data that migration to Office 365 would not happen overnight – it would take days. So a third-party specialist was brought in to analyze exactly what was happening in GroupWise. We needed to know exactly what the mailbox landscape was. What we found was shocking:

  • Over 30 terabytes of mail total
  • 404 mailboxes larger than 15 GB
  • 101 mailboxes larger than 100 GB
  • 13 mailboxes larger than 200 GB
  • 3 mailboxes larger than 300 GB (and still growing!)
  • Many mailboxes had huge numbers of subfolders (the record was 6838 subfolders)
  • Almost all mailboxes were configured with shared folders between multiple users
  • 563 mailboxes were orphaned – no one had access to them
  • 1609 mailboxes had been converted to resources, meaning they didn’t receive mail and didn’t consume a license; they only provided archival access to mail
  • 60 mail servers across 72 physical offices, geographically distributed
  • Each server responsible for its own backup, no synchronization
  • No archiving mechanism beyond backups

The analysis also showed that because the mailboxes were so large, if the biggest, busiest mail server in the group had failed, it would have taken almost a week to restore the data.

Clearly people were using GroupWise for far more than just email, contacts, scheduling appointments and keeping track of tasks. There was only one way to find out what people were using their system for beyond the obvious, and that was by talking to them. With a user base of close to 5000, talking to everyone was impractical, so we started with the users who owned the sixteen biggest mailboxes (200 GB+ and 300 GB+). After that we spoke to a random sample of people who had a mailbox of more than 100 GB. Last but not least, we spoke to a random sample of people who had a mailbox bigger than 15 GB. It turned out that mailbox size correlated with the way people used the system.
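The selection approach described here (everyone in the top two size bands, a random sample from the lower ones) could be sketched like this. The band boundaries come from the analysis list; the function names, the owner-to-size dictionary and the sample size are illustrative assumptions, not the tooling actually used.

```python
import random
from collections import defaultdict

# Lower bounds in GB for each band, largest first (taken from the analysis list).
TIER_FLOORS_GB = [300, 200, 100, 15]
INTERVIEW_EVERYONE = {">300 GB", ">200 GB"}


def tier_for(size_gb: float) -> str:
    """Return the label of the largest band the mailbox falls into."""
    for floor in TIER_FLOORS_GB:
        if size_gb > floor:
            return f">{floor} GB"
    return "<=15 GB"


def pick_interviewees(mailboxes: dict, sample_size: int, seed: int = 0) -> dict:
    """mailboxes maps owner -> size in GB; returns band label -> chosen owners."""
    by_tier = defaultdict(list)
    for owner, size_gb in mailboxes.items():
        by_tier[tier_for(size_gb)].append(owner)

    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    chosen = {}
    for tier, owners in by_tier.items():
        if tier in INTERVIEW_EVERYONE:
            chosen[tier] = sorted(owners)  # talk to every owner in the top bands
        elif tier != "<=15 GB":
            count = min(sample_size, len(owners))
            chosen[tier] = sorted(rng.sample(owners, count))
    return chosen
```

For example, `pick_interviewees({"ann": 350, "bob": 120, "cy": 110, "di": 20}, sample_size=1)` keeps `ann`, picks one of `bob` and `cy`, and keeps `di` as the only owner in the lowest sampled band.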

The users with the biggest mailboxes usually had no shared folders or other users accessing the mailbox. They were using the email system as a document management system (DMS). They went as far as emailing documents to themselves to store them in their DMS. Every mailbox bigger than 200 GB was actually being used as a DMS – the size related purely to how long the mailbox had been open and who made the largest documents.

Mailboxes bigger than 100 GB had a common but different characteristic. These were typically shared mailboxes for an entire department. The rationale for this was that everyone should have access to all mail received by anyone in the department, excluding actual private mail. The department head selected a mailbox and set up proxy rights for everyone in the department to share the “departmental emails”. These mails were mostly stored in “topic folders”, and sometimes these folders had subfolders organized to store mail from particular periods, such as “Jan 2011”.

The 15 GB+ mailboxes were mostly in use by individuals and small groups who used shared folders a lot. It turned out that most users in this group were salespeople, and they used the system in a slightly different way from the groups with the largest mailboxes. The overall discovery for mailboxes smaller than 15 GB but still of substantial size was that people just didn’t clean up their mailboxes, and mailbox size was influenced by the number of years they had worked at the company – the longer you worked there, the bigger your mailbox.

Further investigation showed that a number of people were violating the email compliance rules set by the company. Forwarding rules were used to enable users to work on their emails from home, which was strictly forbidden by the company.

The scope of the issues discovered within GroupWise made it clear that a quick migration was out of the question – we needed to change the way email was used by the employees of the company before we could migrate. Ultimately this broke the migration project into a number of other smaller projects.

For the sales staff, implementing a true Customer Relationship Management (CRM) system would move a lot of CRM-related data out of the email system. There are numerous products in this space, Salesforce arguably being the most famous. Ultimately the company went with Microsoft Dynamics CRM because it integrated well with Outlook and Exchange.

But the far larger issue was a proper document management system. The users had started storing documents in their mailboxes not so much by choice, but because there was no other way to manage all of their documents. The fact that it had swollen their mailboxes to massive sizes had no impact on the users as long as the servers didn’t crash. And if-and-when they did crash, the users certainly didn’t see it as the fault of their giant mailboxes – it was just the IT department failing to do its job.

A separate project was initiated to set up SharePoint 2010 as the document management solution for the organization. Initially it was deployed in-house to “lighten the load” on the email system, removing emails stuffed full of attachments and replacing them with emails containing links to SharePoint documents. Ultimately the SharePoint implementation will be migrated to Office 365 as well, since it is part of the offering.

Finally, there was the issue of email governance – up until now the company really had not enforced many rules on how email was used, such as limiting the size of mailboxes and requiring archiving for size management and legal requirements. These sorts of rules would have to be applied during migration, utilizing the features in Exchange 2010/Office 365 to reduce the size of mailboxes to less than 5 GB. While Office 365 supports 25 GB mailboxes, the company had recognized through the analysis of GroupWise that it was not in its best interest for users to store mail in individual mailboxes; rather, that content should be migrated to SharePoint in various forms better suited to access by a variety of users.
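To get a feel for how much content such a rule displaces, a small report of mailboxes over the target size is a useful starting point. The sketch below assumes a simple owner-to-size mapping exported from the mail system; the function name and data shape are hypothetical, and only the 5 GB target comes from the text above.

```python
QUOTA_GB = 5.0  # target mailbox size from the governance rules described above


def overage_report(mailboxes: dict[str, float], quota_gb: float = QUOTA_GB) -> dict[str, float]:
    """Return owner -> GB that must be archived or moved to get under the quota."""
    return {
        owner: round(size_gb - quota_gb, 2)
        for owner, size_gb in mailboxes.items()
        if size_gb > quota_gb
    }


def total_overage_gb(mailboxes: dict[str, float], quota_gb: float = QUOTA_GB) -> float:
    """Total GB across all mailboxes that sits above the quota."""
    return round(sum(overage_report(mailboxes, quota_gb).values()), 2)
```

A mailbox sitting exactly at the quota is not reported, since the rule only requires reducing mailboxes that exceed it.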

Many months of work have been summarized in this blog post, but there was more to come – once these changes were made we were ready to begin the actual migration from GroupWise to Office 365, the subject of our next blog post.

Recipe: Migrating from Novell GroupWise to Microsoft Office 365 Part 1 – Why Change Anything?

This is the first blog post of a category we call a "recipe" – the analysis, planning and effort that went into an implementation of a cloud-based application. Recipes are big; this one breaks down into three parts, with a separate post each for the analysis phase, the planning phase and the execution of the project.

This project did not start out as a cloud-based application project – it started out as an email migration project. The customer in question had an email system in place operating on Novell GroupWise. But the users all wanted Outlook as their mail client. But was Outlook what they actually wanted, or was there more to the demands?

In general users ask for functionality instead of specific products – they typically only name a product because they believe it possesses the functionality they need. In order to make a solid business case we needed more than just the fact that users liked the Outlook UI better than the GroupWise client UI. Migrating 5,000 users to a new mail system is neither an easy nor a cheap endeavour. Trying to justify this in the middle of a financial crisis with only "it looks better" as an argument would certainly have resulted in a NO from the board. We needed to get a clear answer on the why from a business perspective, an answer to what was wrong with GroupWise.

When we started a like-for-like comparison between GroupWise and Outlook we came to the conclusion that there’s not a lot of difference between the two. They both send mail, allow for appointments to be made, keep a task list and so forth. The differences between the two are small on a functional level, although the UIs are very different. The functionality one expects from any mail program is adequately covered by both. But what really mattered to the users was Outlook’s ability to integrate with the other tools they use on a daily basis including Microsoft Office and the ERP systems. When it comes to increasing productivity and efficiency through integration with other software, the Outlook and Exchange combination wins easily over GroupWise.

There was some discussion about using Outlook as a client against GroupWise servers – but ultimately it was discarded. There were too many reported problems with integration, broken features and lack of support moving to Outlook 2010 to make this option viable. And then an even larger conversation started – if the project migrates over to Microsoft Exchange, should the Exchange servers be on-premise, hosted or in the Cloud?

Clarifying the meaning and differences between on-premise, hosted and Cloud:

  • On-Premise Exchange means that the Exchange servers live in the company data center. The company is responsible for everything – hardware, software, maintenance, backup, reliability, disaster recovery and so on. Likely consultants will help with the initial configuration, but in the long term the responsibility rests solely on the IT resources of the company itself.
  • Hosted Exchange involves a third-party company in the ownership and/or operation of the Exchange servers. While there is the option of co-locating the company’s servers at a hosted location, Hosted Exchange in our minds means a scenario where the company does not own the servers or even the software. The hosting provider owns everything, maintains everything and is ultimately responsible for the quality of service to the company. The company typically pays a fee roughly in proportion to the number of mailboxes involved.
  • Cloud Exchange is a relatively new product from Microsoft. The original version was called Business Productivity Online Suite (BPOS). The subsequent version is called Office 365. Office 365 is SAS 70 and ISO 27001 compliant. For both BPOS and Office 365, the offering is more than just Exchange. But regardless, the Exchange part of the equation is handled entirely by Microsoft, which operates all the servers, is responsible for all services and charges the company a per-seat fee.

It’s also important to realize that these three options are not an either-or choice. Depending on the products involved, it is entirely possible to end up with a mixture of on-premise, hosted and cloud elements to any application – effectively a "Hybrid Cloud" solution. In fact, the more we work with real world Cloud implementations, the more it is apparent that virtually all sophisticated software application efforts in the Cloud end up as Hybrid Cloud implementations.

In the analysis of this particular project, the company was a multinational organization that had a number of significant jurisdictional compliance issues regarding email. For certain countries, email originating in the country had to be stored in the country. This would appear to make email in the Cloud impossible, but the Hybrid Cloud approach solved the problem.

With Office 365, it is possible to mix storing email in the Cloud as well as on your own Exchange 2010 servers. Those mailboxes that had to be stored in-country were positioned on Exchange 2010 servers in that country, the rest were stored in the Cloud – hence the Hybrid Cloud solution.
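The placement rule described here boils down to a lookup on the mailbox's country. The following is a minimal sketch of that decision only; the country codes, the `Mailbox` type and the placement labels are hypothetical and do not correspond to any Exchange or Office 365 API.

```python
from dataclasses import dataclass

# Hypothetical list of countries whose mail must stay in-country (illustrative codes).
IN_COUNTRY_REQUIRED = frozenset({"DE", "CH"})


@dataclass(frozen=True)
class Mailbox:
    address: str
    country: str  # country the mailbox's mail originates in


def placement(mailbox: Mailbox, restricted: frozenset = IN_COUNTRY_REQUIRED) -> str:
    """Decide where a mailbox lives in the hybrid setup."""
    if mailbox.country in restricted:
        # Data residency applies: keep it on an Exchange 2010 server in that country.
        return f"exchange-2010:{mailbox.country}"
    return "office-365"


def plan(mailboxes: list) -> dict:
    """Group mailbox addresses by their placement."""
    result = {}
    for mailbox in mailboxes:
        result.setdefault(placement(mailbox), []).append(mailbox.address)
    return result
```

Keeping the restricted-country list as data rather than hard-coding it in the function makes it easy to adjust as legal requirements change.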

One point of discussion was exactly where those Exchange 2010 servers should live. The options were to put them in the company offices or with a third-party hosting provider. Using a third-party provider was preferred because it reduced cost of ownership and increased reliability and security. Ideally the servers themselves should be owned by the third party to minimize the expense to the organization, but third-party providers were not yet ready to support Office 365 in that configuration, so the servers in this case would be co-located servers, owned by the organization and positioned in a third-party data center.

Another key advantage we discovered using Office 365 was creating common journalling rules and data retention policies. Office 365 provided one point of access into the entire mail system so that there is a single archiving and legal hold system in place – a substantial improvement over the other mail solutions that made dealing with legal issues very challenging.

The analysis completed, we knew where we wanted to get to – now the question was, how to get there? The next blog post deals with the planning involved in this mail migration to the Cloud.

The Cloud SPI Model Part 2: Platform As A Service – When You Need to Build Your Own Application

The goal of any information technology department in an organization is to provide services to the personnel of the organization to make them more productive and profitable. Cloud technologies have emerged as a way to provide a wide variety of very capable services at a reasonable, granular cost. These services are delivered in the form of software, and Software As a Service (SaaS) is the way to deliver it.

So why bother with Platform As a Service (PaaS) or Infrastructure As a Service (IaaS)?

The main reason is that the software your users need isn’t available in the Cloud, so you’re going to build it yourself.

PaaS is the next Cloud product down the stack from SaaS, but is arguably the most difficult to implement. Where a SaaS offering is aimed at end users, a PaaS offering is meant for developers. These developers build software on top of PaaS to provide services to the users.

One of the first products in the PaaS space came from Salesforce.com, called Force.com. It’s a set of tools to allow developers to build applications running on the same platform that Salesforce.com runs on. Salesforce.com is the SaaS offering, and Force.com is the PaaS offering.

Amazon’s product in this space is Amazon Web Services. Google has a product called App Engine that serves a similar role to Force.com. Microsoft Azure is a product with huge potential in this space because it takes the massive group of existing developers and cloud-enables them – if you know how to build ASP.NET applications, you’re most of the way to building applications on Microsoft Azure. And making developers productive is the key challenge to any PaaS offering.

What makes PaaS so challenging is the fact that it is a platform – a platform that developers have to learn to use. Developers skilled in one platform, for example, Microsoft’s .NET Framework, are not going to be as skilled in another platform like Force.com. Often organizations are surprised at the impact changing platforms has on developers – it takes longer to build software and often that software has more bugs until the developers get more experienced with the platform.

As with any other new development platform, organizations need to plan in training time, consulting and mentoring to increase the rate at which their developers get proficient in the PaaS product. Picking minor projects to start with is also wise.

PaaS offerings, like SaaS offerings, are built to be highly scalable – which requires particular programming styles. Scalable architectures set very specific limits on certain things that developers can do, for example, limiting the ability of the developer to store data about a user in the computational part of the application, rather than in a database. Developers want to do this because it’s fast and efficient – but at the expense of scalability. Because PaaS is built to scale, the platform will not offer those sorts of storage options and can frustrate developers used to having those services available.
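The constraint is easier to see in code. In the sketch below the request handler keeps nothing about the user in process memory; every call reads and writes through a store that all instances of the application would share. The `SessionStore` protocol, the `InMemoryStore` stand-in and the shopping-cart example are all illustrative assumptions and do not mirror any particular PaaS product's API.

```python
from typing import Protocol


class SessionStore(Protocol):
    """Anything that can persist per-user state outside the web process."""

    def get(self, user_id: str) -> dict: ...

    def put(self, user_id: str, data: dict) -> None: ...


class InMemoryStore:
    """Stand-in for a shared database or cache, used only so the sketch runs."""

    def __init__(self) -> None:
        self._rows = {}

    def get(self, user_id: str) -> dict:
        return dict(self._rows.get(user_id, {}))

    def put(self, user_id: str, data: dict) -> None:
        self._rows[user_id] = dict(data)


def add_to_cart(store: SessionStore, user_id: str, item: str) -> list:
    """Stateless handler: all user state is loaded from and saved to the store."""
    state = store.get(user_id)
    cart = state.get("cart", []) + [item]
    store.put(user_id, {**state, "cart": cart})
    return cart


if __name__ == "__main__":
    shared = InMemoryStore()
    add_to_cart(shared, "u1", "pen")         # could be served by one instance
    print(add_to_cart(shared, "u1", "ink"))  # and this by another
```

Because the handler holds no state of its own, any instance can serve any request, which is what lets the platform add or remove instances freely. The price is a round trip to the store on every call, which is the efficiency developers give up in exchange for scalability.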

Ultimately these limitations are beneficial – you want the software your developers build on PaaS to be able to scale to whatever the users require. It takes time to get used to these constraints, as it does with any other specific requirements of a new platform.

For better or worse, migrating an existing application to a PaaS environment has proven very difficult. Most successful PaaS products are “greenfield” implementations – built from day one to run on a given PaaS offering. Even if the programming language is the same between the old platform the application was originally built on and the new platform of the PaaS offering, the platform is different, so the services are different, which ultimately means the architecture is different. And rearchitecting an existing application is like trying to change the foundation of an existing building – dangerous, difficult, and in many cases, doomed.

PaaS still provides most of the benefits of SaaS: You don’t own the hardware, the operating system or the platform software that your application runs on. All of that is maintained by the Cloud provider. But your developers do have to learn how to work with the platform the Cloud provider offers, and to operate within its constraints. In the end the goal is to deliver software that benefits the users.

However, not every set of application requirements fits into a PaaS offering, and sometimes you need to go even further down the stack to Infrastructure As a Service, the subject of the next blog post.

The Cloud SPI Model Part 1: Software As A Service, What You Really Want From the Cloud

For the most part cloud products have sorted themselves into three classes: Software-As-A-Service (SaaS), Platform-As-A-Service (PaaS) and Infrastructure-As-A-Service (IaaS). These three offerings are now being referred to as the SPI Model of Cloud services. So what are the differences, what should you get and who offers what?

As we started deliberating the thinking around SPI, it became apparent that this would be a very long post, so it’s broken into three parts, one for each offering.

It’s important to realize that SaaS, PaaS and IaaS are not mutually exclusive – they build on each other. SaaS is only possible because under the hood there is platform and infrastructure that SaaS depends on. The platform and infrastructure may or may not be available for sale as PaaS and IaaS offerings, but they are there.

All Cloud offerings are ultimately SaaS; it’s just a question of how much you build yourself. With SaaS, you’re building nothing – the application is finished, you just configure it the way you want to use it. Some of these applications have been around longer than the term “Cloud”. Products like Salesforce.com have been sold under the Application Service Provider (ASP) moniker for a long time. As the larger concept of the Cloud came to the forefront, they jumped onto the bandwagon, but it didn’t really change the product.

Changing the moniker didn’t change the application, and the users couldn’t care less. Users don’t care about Cloud or any other technology – they care about getting their work done. How you deliver the tools to them that let them get that work done isn’t important to them. If that tool happens to be software, and that software is delivered via Cloud technology, and it is effective for the user, then the user will like Cloud. But that raises the question, what makes a given technology (like software) a Cloud offering?

The best definition we’ve found for determining whether a given technology is “Cloud worthy” comes from Dave Nielsen of Platform D. The acronym is OSSM (“awesome”): On-Demand, Self-Service, Scalable and Measured. On-Demand means the product is available whenever you want in whatever quantities you want. Self-Service means that you can order the product without needing any support from anyone else (especially the vendor selling the service). Scalable means you can order as much or as little as you want, and it works the same regardless. And finally, Measured means that the product is measured in reasonable increments so that you know exactly what you’ve bought and how much it cost. You’ll pay only for what you use.
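The OSSM test is effectively a four-item checklist, which can be written down as a small data structure. The class and method names below are our own illustration; only the four criteria come from the definition above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OssmCheck:
    """One flag per OSSM criterion for a candidate offering."""

    on_demand: bool      # available whenever you want, in the quantity you want
    self_service: bool   # can be ordered without help from the vendor
    scalable: bool       # works the same however much or little you order
    measured: bool       # metered in increments, so you pay for what you use

    def missing(self) -> list[str]:
        """Names of the criteria the offering fails, in OSSM order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_cloud_worthy(self) -> bool:
        """An offering qualifies only if it meets all four criteria."""
        return not self.missing()
```

For example, a hosted product that must be ordered through a sales representative and is billed as a flat annual contract would be recorded as `OssmCheck(True, False, True, False)`, and `missing()` would name the two criteria it fails.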

Using the OSSM standard, it’s easy to see that Salesforce.com qualifies as a Cloud SaaS offering – you are able to order it yourself without any assistance and get to work immediately. You can buy as many seats as you want, enter as many leads as you want, run as many reports as you want. And you only pay for what you use.

Other SaaS offerings include Google Docs and Microsoft Office 365. In fact, mail in the Cloud is arguably the definitive SaaS product. Why go to the expense and effort of operating your own mail servers when you get everything you want from a SaaS offering, typically at a fixed rate per mailbox?

There are a couple of challenges with SaaS. The first is security. For the most part, security for SaaS applications is sufficient, even if it’s nothing more than SSL. But if the security features of a SaaS application don’t comply with your organization’s requirements, there’s often no recourse. The security the application has is the security the application has, take it or leave it.

A more complex issue is regulatory compliance. This is a huge subject unto itself, depending on the industry you’re in, the countries you’re working in and the work you are doing. There are whole web sites dedicated to the topic of regulatory compliance, like the Compliance Authority, that also talk specifically about compliance and Cloud. Smart SaaS vendors are helping their customers get through regulatory compliance concerns with great documentation and even features – for example, Microsoft’s Office 365 allows mailboxes to be stored outside the cloud so that a customer can fulfill compliance requirements for mail from a particular country to be stored in that country. There are also sophisticated mail retention policies and auditing.

Beyond these issues, there’s almost no downside to SaaS. And there is a huge upside – all the things you don’t have to do. You don’t own the hardware, the operating system or even the software you’re running. Someone else is responsible for all of that; you just utilize the product as you see fit. From an ROI perspective, SaaS is a no-brainer.

Ultimately, SaaS is what we want from Cloud – the maximum potential with the minimum of ownership and responsibility. But what if the application you need doesn’t exist in the Cloud already? That’s where Platform-As-A-Service (PaaS) comes into play. It’s the next layer down the SPI Model stack, and is the subject of the next blog post.