We help Microsoft-centric enterprises fully adopt the cloud & adapt to new ways of working.

CATEGORY

Journal Migration

Do you need to migrate thousands of PST files, really big mailboxes or large email archives to Microsoft 365 (and is drive shipping the answer?)

What is a PST file?

A PST file is an Outlook Data File (.pst) that can be used to store messages and other Outlook items.  From the ‘outside’ they look like a single file, but inside they have the structure of a pseudo mailbox.

Larger organisations in particular can have a legacy of many thousands of PST files sitting on individuals’ local drives, home drives and file servers.  They were historically created as a way of staying within mailbox quotas (when on-premises Exchange mailbox quotas were perhaps not as generous as they are online).  PST files were also used as a convenient way of preserving mailbox contents – perhaps after an individual had left the company.  In fact PSTs used in these scenarios are widely considered a pest – especially when organisations switch to the cloud.

PST files can also be used as an interim store when migrating between platforms, as many mail systems can export emails and attachments into PST format, and conversely, import content from a PST file.  It is this use case that is the subject of this article.

What is drive shipping?

Microsoft’s drive shipping service is a multi-step process by which large amounts of data can be uploaded into the Microsoft cloud via an interim device that is physically shipped to a Microsoft location.

This is how it works:

Instead of transferring data across the network, the technique involves writing PST files to a hard drive along with a mapping file (for example, migrate PST <filename> to <username> primary or archive mailbox).

The (encrypted) hard drive is then physically shipped to a designated Microsoft location from where data centre personnel pre-stage the contents into Azure.  The files are then ingested into Exchange Online according to the supplied mappings.
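To make the mapping file idea concrete, here is a minimal sketch that writes one as CSV.  The column names are our recollection of Microsoft’s PST import mapping template and should be treated as an assumption to verify against the current documentation; the file names and mailbox addresses are hypothetical.

```python
import csv
import io

# Column names as we recall them from Microsoft's PST import mapping
# template: an assumption, so check the current documentation before use.
COLUMNS = [
    "Workload", "FilePath", "Name", "Mailbox", "IsArchive",
    "TargetRootFolder", "ContentCodePage", "SPFileContainer",
    "SPManifestContainer", "SPSiteUrl",
]


def build_mapping_csv(assignments):
    """Return mapping-file text for (pst_name, mailbox, to_archive) tuples."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for pst_name, mailbox, to_archive in assignments:
        # Columns not set here are written as empty values.
        writer.writerow({
            "Workload": "Exchange",
            "Name": pst_name,
            "Mailbox": mailbox,
            "IsArchive": "TRUE" if to_archive else "FALSE",
            "TargetRootFolder": "/",
        })
    return buffer.getvalue()


# Hypothetical PST names and addresses, for illustration only.
mapping_text = build_mapping_csv([
    ("jsmith_2015.pst", "jsmith@example.com", False),  # primary mailbox
    ("jsmith_2010.pst", "jsmith@example.com", True),   # archive mailbox
])
print(mapping_text)
```

Each row answers the question “which PST goes to which user, and into the primary or the archive mailbox?”, which is exactly the mapping described above.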

If you have TBs of emails you want to move into Microsoft 365, drive shipping via PSTs is an option open to you.

This Microsoft article goes into all the different iterations of what you need to do and the cost of using this service.

The same technique can be used to migrate mailboxes that are over 100GB in size, or that contain one or more messages exceeding the 150-megabyte (MB) message limit (in which case resorting to PST files is recommended).

Using interim PSTs is also an option for migrating the contents from large email archives such as Enterprise Vault and EMC SourceOne, or email journals from platforms such as Mimecast.

The question is:  Is drive shipping PSTs a good option for your large email archive migration? 

Here are the pros and cons:

Pros

  • It’s low cost…on paper*. The cost to import PST files to Microsoft 365 mailboxes using drive shipping is $2 USD per GB of data.
  • It minimises impact on your network: if you have a sub-optimal network that cannot handle large amounts of data being transferred, using a data drive to ship PSTs to Microsoft negates any network concern. Even on networks that can support around 500Mbps you can experience slow performance when you start to run large migrations alongside regular user activity.
  • It avoids the impact of Microsoft throttling: Microsoft applies throttling to avoid overloading its servers. Although you won’t experience the effect of throttling when using native Microsoft mailbox moves to migrate your mailboxes, many email archive migration solutions use the EWS protocol to move your data, and this protocol is subject to throttling (although Microsoft has made it easy to relax throttling for the duration of a bulk migration).

Cons

  • *It can work out expensive: At face value, $2 USD per GB is cost-effective. For example, a 20TB project would be $40,960 to ‘drive ship’. But this does not include the added overheads of getting your data onto the drives (see next point).
  • PST preparation is labour-intensive. Suffice to say that manually extracting data from archives into PST files and then preparing them for upload can be super time-consuming.  Native tools for extraction out of third-party archives (such as the Enterprise Vault extraction wizard) are slow and not geared up for performing automated mass exits. Once you’ve extracted files, you’ll need to make sure they are prepared properly for Microsoft.  This includes the creation of a mapping file, so Microsoft knows which file(s) belong to whom, and where you want them put.  Check out the steps you’ll need to carry out.  Whilst it’s possible to automate the PST extraction and preparation process using third-party migration software, you’ll need to factor this additional cost in.
  • It can take a long time: You’ll need to allow 7-10 days for your data to be uploaded from the drives into Azure (as we said earlier, this is where your data is pre-staged) and then Microsoft offers an ingestion rate of 24GB per day.  Using our 20TB example (20,480GB at 24GB per day is roughly 853 days, plus the upload time), this means your PSTs would take around 860 days in total to ingest.
  • It introduces an element of risk: When using multiple hops and manual interventions to move your data, there’s the potential for things to go wrong.  Even though drive shipping uses BitLocker encryption to protect your data in transit, there are many other steps that introduce the potential for human error. This includes the process of babysitting the extraction into PST files from your archive and the mapping of PST files to their owners.  This, combined with the fact that extraction tools typically have no inbuilt error-checking, cannot recover in the event of a failure and offer no auditing, will make it difficult for you to prove chain-of-custody.  Oh, and did I mention that PSTs as an interim file construct are prone to corruption?
  • Your source data needs to be static. If you’re migrating the contents of an email archive using drive shipping via PSTs you’ll ideally need to make your archive static during the course of the migration.  This means stopping any archiving activity for the duration of your archive project, otherwise you’ll have the overhead of subsequently migrating any additions to your archive.  We’ve encountered several projects where stopping archiving is not possible.
  • Shortcuts aren’t being addressed (and create confusion). You will need to have a game-plan for dealing with the shortcuts (also known as stubs) that typically link to archived items.  Many enterprises end up migrating shortcuts along with regular emails into Exchange Online mailboxes.  Whilst in most cases it’s possible to retrieve the full item across the network from an on-premises archive whilst your migration is taking place, you’ll have various issues that emerge once your PSTs have been uploaded into Microsoft 365.  This includes broken shortcuts (assuming at some point you will decommission your on-premises archive) and legacy shortcuts that can appear along with the full migrated item in the event of any eDiscovery exercise.
  • Other limitations:
    • Message Size Limits of 150MB
    • No more than 300 nested folders
    • Doesn’t support Public Folders
    • You get no flexibility over where your data lands at the destination or how it is split
    • Volume restrictions of up to 10 TB
    • A maximum of 10 Hard drives for a single import job
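The limits in the list above lend themselves to a quick pre-flight check before any drives are prepared.  This is a minimal sketch only: the `ImportJobPlan` structure is hypothetical, the thresholds are simply the figures quoted above (confirm current values with Microsoft), and we have assumed that “300 nested folders” refers to nesting depth.

```python
from dataclasses import dataclass

# Limits as quoted in the list above; confirm current values with Microsoft.
MAX_MESSAGE_MB = 150
MAX_NESTED_FOLDERS = 300   # assumed here to mean folder nesting depth
MAX_JOB_TB = 10
MAX_DRIVES_PER_JOB = 10


@dataclass
class ImportJobPlan:
    """Hypothetical summary of a planned drive shipping import job."""
    total_tb: float
    drive_count: int
    largest_message_mb: float
    deepest_folder_nesting: int
    includes_public_folders: bool


def find_blockers(plan):
    """Return the reasons (if any) why the plan breaks the quoted limits."""
    blockers = []
    if plan.total_tb > MAX_JOB_TB:
        blockers.append(f"volume of {plan.total_tb} TB exceeds {MAX_JOB_TB} TB")
    if plan.drive_count > MAX_DRIVES_PER_JOB:
        blockers.append(f"{plan.drive_count} drives exceeds {MAX_DRIVES_PER_JOB}")
    if plan.largest_message_mb > MAX_MESSAGE_MB:
        blockers.append(f"message of {plan.largest_message_mb} MB exceeds {MAX_MESSAGE_MB} MB")
    if plan.deepest_folder_nesting > MAX_NESTED_FOLDERS:
        blockers.append("folder nesting is too deep")
    if plan.includes_public_folders:
        blockers.append("public folders are not supported")
    return blockers


# Hypothetical plan: the 20TB example with one oversized message.
plan = ImportJobPlan(total_tb=20, drive_count=4, largest_message_mb=180,
                     deepest_folder_nesting=12, includes_public_folders=False)
blockers = find_blockers(plan)
for reason in blockers:
    print("Blocker:", reason)
```

A plan that returns an empty list clears these particular limits; it says nothing about the preparation effort, timescales or risks covered in the other points.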

So should we use drive shipping for our migration?

In summary, the only time we can see drive shipping using PSTs as being beneficial is if you have:

  1. Very slow network connectivity
  2. Lots of inactive data to migrate. For example, archives belonging to ‘leavers’

Our email archive migration service uses a series of techniques to mitigate the impact of Microsoft throttling, enabling us to move archives directly from your archive into Exchange Online (either primary mailboxes or archives) at a rate in excess of 3TB a day.  There are also no overheads or time delays from extracting into PSTs first.
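To put the figures quoted in this article side by side, here is a rough back-of-the-envelope sketch.  It uses only the numbers given above ($2 per GB, 7-10 days to upload, 24GB per day ingestion, 3TB a day for a direct migration) and assumes 1TB = 1,024GB; the PST extraction and preparation effort is deliberately left out because it varies so much between projects.

```python
GB_PER_TB = 1024

# Figures quoted in this article; check Microsoft's current pricing and rates.
COST_PER_GB_USD = 2
UPLOAD_DAYS = 7            # lower end of the 7-10 day pre-staging window
INGEST_GB_PER_DAY = 24
DIRECT_TB_PER_DAY = 3      # "in excess of 3TB a day" for a direct migration


def drive_shipping_estimate(archive_tb):
    """Return (cost in USD, elapsed days) for drive shipping, excluding PST prep."""
    size_gb = archive_tb * GB_PER_TB
    cost_usd = size_gb * COST_PER_GB_USD
    days = UPLOAD_DAYS + size_gb / INGEST_GB_PER_DAY
    return cost_usd, days


def direct_migration_days(archive_tb):
    """Return elapsed days for a direct migration at the quoted daily rate."""
    return archive_tb / DIRECT_TB_PER_DAY


cost_usd, shipping_days = drive_shipping_estimate(20)
direct_days = direct_migration_days(20)
print(f"Drive shipping: ${cost_usd:,.0f} and about {shipping_days:.0f} days")
print(f"Direct migration: about {direct_days:.0f} days")
```

For the 20TB example this reproduces the $40,960 and roughly 860 days mentioned earlier, against around a week for a direct migration at 3TB a day.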

We can also schedule migration activity to coincide with less busy times on your network.

Also, the fact that we can move your data in one step, direct from source to target avoids the non-compliance risk of interim storage and human error.

We can also help you avoid moving everything.  For example, by applying date ranges.

You can also avoid creating a storage overhead in the cloud by managing where data gets migrated to.  I.e., by moving messages over a certain age into archive mailboxes or moving PSTs belonging to leavers into a separate (but indexed) Azure-based store.

On a final note, using interim PSTs is also an option when migrating journals from services such as Mimecast and Proofpoint, but there are a few things to watch out for when migrating into Microsoft 365.  You can find out more about migrating journals to Microsoft 365 in this article.

Find out more for your PST migration project

Get in touch with our migration experts for an unbiased chat on the options open to you.

In the early days of corporate email communications, messaging was not viewed as a formal business record, despite emails then being more verbose than the average email in 2020.

Policies about the use and retention of messages generally did not exist because of the relaxed view of email in the workplace. If there was a corporate policy about email, it was usually to impose small quotas on mailboxes, erroneously believing that this would control storage growth and would mean that messages were deleted after a certain period.

All of this changed when email messages played significant roles in high-profile litigations, with the smoking gun being an email that was thought to have been deleted.

The corporate world soon realised that what they did not know could hurt them, and governments moved to pass legislation imposing regulatory compliance requirements for specific industries to keep records.

Journaling provides a “golden copy”

There are three reasons that you need journaling:

  1. Your organisation falls under legislation or one of the regulatory regimes that mandate it, and/or
  2. Your legal department says so
  3. You’re not sure Microsoft 365 will fully meet your email retention needs

It is common for legal teams to require email journaling because it offers them the option of conducting early data assessments in the event of claims. Legal teams can make an informed decision about whether to fight or settle the matter when they have a reliable, golden copy to explore early in the process.

Many legal teams find the cost of journaling and early data assessment to be far less than deciding to fight and later losing based on surprise email evidence.

Does Microsoft 365 solve my journaling needs?

The short answer: Partially.

Although you can configure journaling to take place in your Microsoft 365 messaging backbone, you cannot use Exchange Online mailboxes to provide the storage for your email journals.

You have to store your journals elsewhere.

As found in Microsoft’s documentation:

“You can’t designate an Exchange Online mailbox as a journaling mailbox. You can deliver journal reports to an on-premises archiving system or a third-party archiving service. If you’re running an Exchange hybrid deployment with your mailboxes split between on-premises servers and Exchange Online, you can designate an on-premises mailbox as the journaling mailbox for your Exchange Online and on-premises mailboxes.”

Microsoft 365 journaling hacks

Arguably, by setting the right retention policies in Microsoft 365 you can recreate the ‘effect’ of having a journal – including capturing those emails that were ‘BCC’d’.  You can read more about the importance of capturing BCC’d emails (and how to do this in Microsoft 365) here.

It’s also possible to migrate your historic journals into Exchange Online.  This might involve migrating a journal from Exchange on-premises, a third-party archive such as Enterprise Vault, or a hosted journaling service such as Mimecast.

Whilst this is technically possible – for example, by taking an extremely large journal and chopping it up into smaller chunks that will fit into a series of Microsoft 365 shared mailboxes with appropriate use of retention policies –  this approach is a hack.

For example, it can create search and discovery complications downstream as, in order to be complete, all relevant shared mailboxes would need to be included in any future eDiscovery exercise, alongside regular mailboxes.
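As an illustration of the ‘chopping up’ described above, the sketch below groups consecutive months of a journal into a series of mailboxes without exceeding a capacity.  The monthly volumes and the 50GB capacity are assumptions made purely for the example (the usable size of a shared mailbox depends on your licensing), and the function name is our own.

```python
def plan_shared_mailboxes(monthly_sizes_gb, capacity_gb):
    """Group consecutive months into mailboxes without exceeding capacity_gb."""
    mailboxes = []
    current, used = [], 0.0
    for month, size_gb in monthly_sizes_gb:
        if size_gb > capacity_gb:
            raise ValueError(f"{month} alone exceeds the mailbox capacity")
        # Start a new mailbox when this month would overflow the current one.
        if current and used + size_gb > capacity_gb:
            mailboxes.append(current)
            current, used = [], 0.0
        current.append(month)
        used += size_gb
    if current:
        mailboxes.append(current)
    return mailboxes


# Hypothetical monthly journal volumes (GB) and an assumed 50GB capacity.
journal = [("2019-01", 30), ("2019-02", 25), ("2019-03", 20), ("2019-04", 40)]
plan = plan_shared_mailboxes(journal, capacity_gb=50)
for number, months in enumerate(plan, start=1):
    print(f"journal-{number:02d}: {months[0]} to {months[-1]}")
```

Even this tiny example produces three mailboxes, and every one of them would have to be named in any future eDiscovery search, which is the downstream complication in a nutshell.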

You should test any retention and eDiscovery strategy to ensure it aligns with your legal and compliance requirements and that the hold, collection and eDiscovery workflows deliver the results you expect.

Journaling Microsoft 365 in the Cloud

Cloud-based journaling can work alongside Microsoft 365 to solve both the retention of legacy journal archives and the go-forward journaling for an ‘air-gapped’ golden copy.

Much like insurance – you never know when your organisation will need to pull data from old emails.  If you don’t have a journaling system in place you run the risk of lacking the information needed which can ultimately cost much more than implementing a proper journaling solution in the first place. That’s why preparing in advance is key to preventing unnecessary problems in the future.

If you haven’t started looking into email journaling, now is as good a time as any to start.

Migrating Email Journals

Find out about the range of journal options available to you.

Are you being locked in by your cloud vendor?

Whenever I get on a flight I always count the number of rows to the nearest exit, so I can grope my way out of a smoke-filled cabin if the worst should happen.  A totally pointless exercise, as in reality I’d be toast, but at least it makes me feel better.

What is worth doing is checking your exit route if you’re planning to store your content in a hosted cloud service.

A common tale of woe relates to hosted email journaling vendors, whose built-in export tools are simply not up to the job of wholesale extraction when the customer wants to ‘move on’.

“It took us between 16 hours to a day to extract just one mailbox into a PST, which then needed to be re-imported.”

“We had to run a series of searches using the “from address” to collect all the emails belonging to each user.”

By all accounts, data extraction is not a fun exercise when you’ve got TBs of data to move.

Check your exit route

What’s involved in getting your data back out of the cloud has to be a primary consideration if you are planning to migrate into it.

Ask your prospective cloud vendor these questions:

  1. How easy will it be to get my data out?
  2. How quickly can I get at it? Will it be over the network or on a disk?
  3. What about chain-of-custody during the extraction process?
  4. How will I know I’ve got everything back?
  5. What format will it be in when I get it back?
  6. How much will it cost?

Cloud storage vendor escape route

If you’re stuck in a hosted journal service, or are contemplating your best options for zero lock-in cloud storage, get in touch!

Free retention of ex-employees’ data

At present, if you want to retain the contents of former employees’ mailboxes there’s a facility called ‘Inactive Mailboxes’ that you can use.  The great thing is that if you follow Microsoft’s steps, you can re-use the licence associated with the ex-employee’s mailbox for someone else, so effectively there’s currently no charge for this facility.

Watch this space, however, as back in 2017 Microsoft was on the verge of introducing a charge for inactive mailboxes, and it’s our guess they could consider doing it again.

Inactive mailboxes could be chargeable…

Back in late 2017, Microsoft was on the verge of charging for inactive Office 365 mailbox licences.  It’s our prediction that this could happen again.

At the time, Microsoft faced a backlash from their customers and MVPs during Ignite 2017, and did a U-turn on charges for inactive mailboxes.

https://www.petri.com/no-licenses-office-365-inactive-mailboxes

Having seen the proposed licence plans, we’re not surprised it caused a stir. Inactive mailboxes represent a significant volume of data.

“When we do an analysis scan before moving email archives to Office 365, it’s not unusual for about 70% of the contents to belong to ex-employees” Annie Holder, Migration Consultant

The U-turn highlighted the demands that businesses are making on Microsoft to support proper governance of their email and other data.  Right now, the way Microsoft 365 helps you manage the full lifecycle and eDiscovery of email is impressive.

We will, however, watch with interest how Microsoft adapts to accommodating the vast churn of mailboxes from a licensing perspective.

This matters not just because of the potential future cost of retaining such volumes of data, but also because of a greater responsibility to keep it secure, minimise the risk it represents and fulfil obligations around data protection.

Managing data that doesn’t have ‘an obvious home’

Handling the retention of leavers’ mailboxes, SharePoint and OneDrives is sometimes still only part of the story.

Many cloud project teams are now turning their attention to other more complicated stores of data – like legacy Journals, public folders, PST files, file shares… data that sometimes doesn’t seem to have an obvious home in the Microsoft cloud.

Retention of content that doesn’t fit neatly into Office 365 (such as legacy data on file servers), is a topic we regularly address with our customers.

Inactive mailbox charges

If you have legacy on-premises content you want to preserve and do eDiscovery on, but you’re not sure where to start, get in touch.

Why are BCC’d recipients so important?

In relation to email, BCC stands for “blind carbon copy.” Just like CC, BCC is a way of sending copies of an email to other people. The difference is that recipients in the To and CC fields have no visibility of the fact that BCC’d people have also received the same email.

I think we’ve all been on the receiving end of a marketing email that’s been inadvertently CC’d to an entire circulation list.  This is where BCC comes into its own, but there are other scenarios where BCC is used.

A key thing to consider is “Why do people use BCC in work-related emails?”

  • To raise an issue concerning a co-worker?
  • To lodge a confidential record of an email exchange with a third-party?

Arguably the use of BCC is secretive and deceptive and it follows that the nature of the email will be more ‘shady’ or confidential than an openly CC’d email.  It also follows that the person being BCC’d is just as important, if not more so, than those that are CC’d.

The good news:

The default Exchange journal setting (and that of most hosted journaling services such as Mimecast) is called an ‘envelope’ journal.  The envelope includes a record of the TO: and CC: fields as well as any BCC’d recipients and all the individuals included in your local distribution lists (DL) at the point in time the email was received by your message transfer agent (MTA).
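The distinction between what the envelope records and what the message headers show can be sketched in a few lines.  The `JournalReport` class below is a simplified stand-in of our own, not Exchange’s actual journal report format, and the addresses are hypothetical.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from email.utils import getaddresses


@dataclass
class JournalReport:
    """Simplified stand-in for an envelope journal record (not Exchange's format)."""
    sender: str
    envelope_recipients: list   # everyone the message was actually delivered to
    message: EmailMessage       # the original message with its visible headers


def hidden_recipients(report):
    """Recipients recorded in the envelope but absent from the To/Cc headers."""
    headers = report.message.get_all("To", []) + report.message.get_all("Cc", [])
    visible = {address.lower() for _, address in getaddresses(headers)}
    return [r for r in report.envelope_recipients if r.lower() not in visible]


# Hypothetical message: dave@example.com was BCC'd, so he appears in no header.
msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Cc"] = "carol@example.com"
msg["Subject"] = "Contract update"
msg.set_content("See attached.")

report = JournalReport(
    sender="alice@example.com",
    envelope_recipients=["bob@example.com", "carol@example.com", "dave@example.com"],
    message=msg,
)
print(hidden_recipients(report))
```

If only the message were kept and the envelope discarded, the BCC’d recipient would vanish from the record, which is why the envelope matters.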

The bad news:

In the process of migrating to Office 365, you could be stripping out BCC and DL information from your email records.

Having helped with extremely large corporate email investigations, we know the importance of maintaining complete email records and maintaining due diligence when handling email archives in particular.  https://www.theguardian.com/media/2011/jul/08/phone-hacking-emails-news-international

What’s the problem with Office 365 & Journaling?

The key ‘gotcha’ is that Office 365 does not have a journal service – at all. 

Until recently if you wanted to move to Office 365 and maintain a conventional envelope journal you’d have had to subscribe to a third-party service from an organisation like Mimecast, or keep an Exchange journal running back on-premises. 

But in the last few years Microsoft has been filling a few holes.  Office 365 can now effectively replace the role of the envelope journal and provide a one-stop-shop for compliant and complete email records retention.  This is how it works:

  • Instead of using a large, centralised, single-instanced mailbox that is inherently difficult to scale and failover, Microsoft uses its optimised multi-instance storage model.  This allows each user to retain his/her copy (journal) of all emails sent/received with zero performance penalty and no single point of failure.
  • By putting all relevant mailboxes on In-Place Hold, all emails sent and received are retained indefinitely.
  • Deleted emails are removed from the user’s view, but held in a special hidden folder inside the Recoverable Items Folder (RIF), where they are available to the eDiscovery process.
  • Any BCC’d recipients will be retained indefinitely in the senders’ mailboxes.
  • The members of any distribution lists (DLs) are expanded at the point of sending and stored in hidden headers in senders’ emails so they are fully discoverable.
  • Ex-employees’ mailboxes (i.e. those belonging to leavers) can be put on Indefinite Hold and made available for eDiscovery, without a licence penalty (using Microsoft’s inactive mailbox service).

So assuming you’re not going to dump over 10 years’ worth of email records when you move, all you’ve got to do is map what’s in your existing journals and any journal archives (which are commonplace given the size to which journals can grow) into the new model.

You’ve actually got a few options for doing this, ranging from quick and potentially dirty to slower and comprehensive.

Email Journal Migration

Want to get the full scoop on how it all works?  Get in touch today.

Discover How (and why) Microsoft 365 Replaces The ‘Traditional’ Email Journaling Service.

Have you ever wondered why Microsoft 365 doesn’t provide a ‘native’ email journaling service (like your old on-premises Exchange server used to)?

  • Do you still need to use a third-party journaling service (such as Mimecast or Proofpoint) or an on-premises Exchange server?
  • If not, how is Microsoft now ‘filling the journal gap’?
  • What do you need to do to migrate an existing on-premises journal or cloud journal into the new ‘Microsoft way of doing things’?

This white paper addresses all these questions and more. 

Download your copy of the Making Office 365 One-Stop-Shop for Email Records Compliance white paper.

Discover How (and why) Microsoft 365 Replaces The ‘Traditional’ Email Journaling Service

Get in touch to find out more about your options for handling your legacy email Journal when you use Microsoft 365.

Although an archive might be something the IT department would prefer to put on a tape and forget about, most email archives need to be ‘kept alive and kicking’ over a period that could extend well beyond our retirement – or our next job move!

At the extreme end of the scale, Child Services related records – including those in email form – must be retained by UK Government bodies until the person’s 75th birthday. Imagine that!

Even if you don’t have a legislative reason to retain and discover emails, there’s usually a whole bunch of business and productivity reasons you need to ensure archives are reliably maintained and readily accessible for staff – and for a longer time than you bargained for.

So it’s no surprise that an enterprise will need to tackle at least one archive migration – possibly several – before the emails in question reach the end of the road (that’s if someone wants to take responsibility for pressing the delete button).

In fact Essential has already migrated more than a handful of customers twice – we even have several ‘three timers’ – within a span of 7 years. More recently this includes law firm Ashfords.

The drivers behind multiple jumps can be down to a whole range of scenarios, including:

  1. CHANGING CIRCUMSTANCES – For example, customers have found themselves needing to move on from technically sound solutions that have unfortunately ‘fallen by the wayside’ following vendor acquisitions or lack of vendor focus.
  2. OVER-OPTIMISM – Some customers have been tempted to take advantage of increased storage capacity in newer versions of Exchange. While this can be successful, the lack of single instancing and the ‘re-hydration’ effect as emails get moved ‘back to where they came from’ can lead to a bout of archive indigestion (triggering a return to a dedicated archive).
  3. STORAGE RE-FRESH – Extricating archives from high-end specialist storage devices and end-of-life storage devices (EMC Centera fits both categories) is a common request and very justifiable in the face of spiralling storage costs. Being able to physically retrieve from the storage you may have purchased a decade ago is also important – we even had one customer whose disks had started to rust.
  4. FINANCIAL ATTRACTION – Even a recent re-vamp of an on-premises archive can be ousted in favour of a pay-as-you-go cloud model if that is what the FD desires. The good news is that from an accounting perspective, most assets – including software – are depreciated over a period of 3 or 5 years, so relatively frequent switching to a new long term email storage platform is not the end of the world. Similarly, newer archive systems and storage platforms tend to have lower overheads.

What needs to be accounted for, however, is the cost and complexity of switching to the replacement (i.e. migration).

I recall the time we moved house just a few doors down the street. To save costs we decided to move ourselves – it was such a short distance for Pete’s sake.

The reality was that my partner lost 2 stones in the process of our DIY move. I guess you could say that, in that respect, our strategy to cut costs paid off. But never again.

When you’re planning a move you need to plan in removal costs from a reputable firm that will ensure everything makes it successfully to the new destination, quickly, intact and fully accounted for.

This is all part of good information governance that should be adhered to throughout the lifecycle of your corporate records.

PS – If you’ve moved your archives into Office 365, it’s highly likely that this won’t be the ‘final destination’ for your email records.  Anything can (and often, will) happen that could mean a re-location of your data.  The good news is that extracting your data out of Office 365 should be a lot easier….

There’s a lot to think about when migrating email archives.  We caught up with Migration Consultant Jim Fussell over a cup of tea and a biscuit to pick his brains on getting your data into (and out of) Mimecast.

So James, what’s the first step?

Well, first you’ll need to define what you’re migrating. Often this will simply be a case of selecting messages within a time-frame that matches your retention policy. Lots of customers decide to migrate literally everything up until the point that their Mimecast Journal Capture service kicked in (or stopped).

Of course you might want to filter what you’re migrating, or exclude email from leavers’ mailboxes.  It’s up to the customer, their email retention policies and any legislation that applies to their industry.

Can you migrate directly into Mimecast?

No, currently you will provide your data in PSTs or EML files. The PSTs need to be structured and named in line with Mimecast’s requirements, which we sort out.  We also keep them below a certain size to avoid corruption. Mimecast sends an encrypted storage device which they pick up when you’re ready and take it from there. Transferring data using this method is actually faster for the larger sites we deal with as network bandwidth can be a bottleneck.

Any other top tips for handling the PSTs in transit?

Yes. We always recommend customers store a copy of extracted PSTs until they receive confirmation that the ingestion is complete, and although it’s temporary, make sure it’s backed up.  It’s also worth bearing in mind that archives like Enterprise Vault compress and de-duplicate your email, so when you extract to PSTs you’ll need storage space that is 2 or 3 times bigger than your archive.

How long will it take?

Hmmm, this is the million dollar question.  We get asked this a lot and the answer is, “It depends”.  We automate the extraction process making it a lot quicker than doing it manually.  In fact, any extraction over 1TB is a pain to do manually.  Running a couple of test extractions will give you an idea of timescales, but you should also get an estimate from Mimecast on their current ingestion times for an end-to-end estimate.

When should we switch off archiving on-premises?

It’s always preferable to extract from a static archive so if your Exchange servers can cope, it will be best to stop archiving just before extraction. Mimecast will have probably started Journal Capture by then so you won’t be at risk from a compliance perspective.  It might just be a case of making sure your Exchange mailbox sizes don’t grow too large if you were archiving fairly aggressively beforehand.

What if we’ve stopped archiving on-premises already?

That’s great, because your archive is static, but it might mean that you will have content in Exchange that you need to migrate too because you’ll have this gap of time between your archive stopping and Mimecast starting.  If possible, I’d recommend archiving everything into your on-premises archive so it can all be extracted from one place.

If that’s not an option, you’ll have to do an extraction from Exchange. We’ve helped a couple of customers with this recently because they needed to define a date range and exclude stubs from the extraction because stubs will obviously be useless once in Mimecast and users might get confused.

Talking of stubs, don’t forget to delete them from users’ mailboxes after you’ve completed the migration.

Any extra tips?

Migration to Mimecast might be a good opportunity to centralise any other email you’ve got in PST files. Mopping up rogue PST files isn’t that easy, but if you have concerns around PSTs now might be a good time to tackle them.

Can we migrate out of Mimecast?

Yes, but not without technical and/or financial pain.  I guess it’s no surprise that a SaaS vendor wants to keep your business.  As a result, open APIs and no-cost options that let you readily take your data (and your business) elsewhere are not common.

With Mimecast it’s possible to export all emails belonging to an individual user (in batches of 10GB and a maximum of 2GB per file).  We’ve also encountered approaches that involve automating eDiscovery searches and exporting the results (exports are currently limited to searches returning fewer than 50,000 messages).  Both of these approaches are a world of pain if you’re trying to navigate a timely and reliable exit strategy for your valuable email records.
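To get a feel for why these approaches become painful at scale, here is a rough sketch that turns the limits quoted above (10GB batches, 2GB per file, searches returning fewer than 50,000 messages) into a count of export operations.  The mailbox size and message count in the example are hypothetical, and the file count is a lower bound because files may not fill completely.

```python
import math

# Limits as quoted above; confirm the current values with Mimecast.
BATCH_GB = 10
MAX_FILE_GB = 2
MAX_MESSAGES_PER_SEARCH = 50_000   # each search must return fewer than this


def per_user_export_effort(mailbox_gb):
    """Return (batches, minimum number of files) to export one user's email."""
    batches = math.ceil(mailbox_gb / BATCH_GB)
    files = math.ceil(mailbox_gb / MAX_FILE_GB)
    return batches, files


def minimum_searches(message_count):
    """Return the fewest searches needed when each returns under the limit."""
    return math.ceil(message_count / (MAX_MESSAGES_PER_SEARCH - 1))


# Hypothetical figures: one 45GB user, and a 2 million message journal.
batches, files = per_user_export_effort(45)
searches = minimum_searches(2_000_000)
print(f"45GB user: {batches} batches, at least {files} files")
print(f"2,000,000 messages: at least {searches} searches")
```

Multiply the per-user figures by the number of users in your organisation and the appeal of a bulk extraction, even a paid one, becomes clearer.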

The best route for larger enterprises is to pay Mimecast’s per GB extraction fee.  As I say – it’s painful either way.  The default format you’ll get your precious data in is a big, single-instanced bucket of emails.  You are then left with the challenge of how you’re going to move this into your new email/archive/journal platform of choice.

Migrate Your Email Archives to the Cloud

Find out more about how Essential can help your migration to (or out of) Mimecast.
