By Martin Donegan, Public Cloud Solution Architect at Ensono
By Prasad Rao, Sr. Partner Solutions Architect at AWS
The global authority on world records, Guinness World Records (GWR), was originally a traditional book publisher. To better serve their audience over the past decade, the GWR website and online presence have become focal points of their brand in the record-breaking community.
Since 2012, Guinness World Records have been diversifying from just producing their iconic annual book into being a creative consultancy and digital media company.
GWR receives a lot of user-generated content, and has been publishing across a wider range of media channels. Each year, GWR handles around 50,000 official submissions to attempt records, nearly 30 percent of which are for brand new record titles. All of these submissions are supported by video footage and other evidence.
To keep up with their community’s needs, Guinness World Records has had to reimagine their IT architecture. In this post, we’ll describe how Ensono helped GWR move all-in on Amazon Web Services (AWS) over a period of 10 months. We’ll also explore how Ensono continues to help them innovate using advanced AWS services.
For delivering this data center-to-AWS migration project within 10 months, Ensono won the Best Cloud Migration Partner Award at the UK Cloud Awards in 2019.
Ensono is an AWS Advanced Consulting Partner and Managed Service Provider (MSP) with AWS Competencies in Migration and Microsoft Workloads. Ensono is also an AWS Public Sector Partner, AWS Solution Provider, and AWS Well-Architected Partner. Their staff holds more than 700 AWS accreditations and 145 AWS Certifications.
Why Ensono and GWR Chose AWS
Ensono and Guinness World Records have been working together since 2002, with Ensono hosting GWR’s infrastructure in a private cloud on a managed service basis since 2008.
GWR wanted a platform for the long term that could host their existing workloads and adapt to evolving business requirements. Ensono collaborated with AWS to propose a migration of all GWR infrastructure to the AWS Cloud.
The following were key drivers of GWR’s migration to AWS:
- Flexible, scalable infrastructure; reducing costs when not needed and allowing experimentation by their innovative agile team.
- Provide customers with a low-latency, locally-hosted experience at a global scale.
- Unlimited storage; submission of 4K videos is now common.
- Migration from end-of-life platforms (MS SQL 2008 and Windows Server 2008).
- Productivity; allow staff to focus on adding business value by reducing undifferentiated heavy lifting.
The migration to AWS presented the opportunity to substantially re-architect GWR’s IT in order to shed technical debt, modernize workloads, and right-size infrastructure.
The extensive capabilities of AWS services are well suited to refactored workloads, which can bring benefits of reduced OpEx, faster operation, and improved supportability. However, refactoring costs money, absorbs time, and, despite producing technically elegant solutions, may not meet the underlying business requirements any better than the existing solution.
For that reason, no complex refactoring exercises were undertaken during this migration. This reduced the cost and time required, which helped to enable additional future improvements that would add more value for the customer.
For the initial migration to AWS, Guinness World Records and Ensono chose the following focus areas:
- Migration of the commercial off-the-shelf (COTS) Windows workloads.
- Replacing existing cold disaster recovery (DR) strategy with resilient and highly available architecture on AWS.
- Re-architecting GWR’s core web content management system.
- Continuous improvements and innovations.
Migration of Workloads Running on Windows Servers
The GWR IT estate was 80 percent Windows Server 2008 R2 running Microsoft or third-party applications, as well as some Windows Server 2012 servers. The workloads running on those followed one of three different migration journeys.
1. Migrate Windows workloads to Linux
To save on licensing costs, some workloads were migrated from Windows to Linux. For example, the Windows Server 2012-based Apache Solr application was re-platformed onto Amazon Linux on EC2.
2. Migrate Windows Server 2008 R2 to the latest license-included Windows AMIs
Workloads that needed to continue running on Windows were rehosted on Amazon Elastic Compute Cloud (Amazon EC2) license-included Windows Server 2016 instances, the most recent version available in 2018.
The SQL 2008 cluster with shared SAN storage was upgraded to SQL 2016 Always On availability groups, avoiding end of licensing issues with SQL 2008 and providing resilience to a single data center failure. Reporting software required SQL Enterprise features, precluding the use of Amazon Relational Database Service (Amazon RDS).
3. Retain workloads with application dependencies on Windows Server 2012
Application dependencies dictated that some servers could be upgraded to Windows Server 2012 only, and not Windows Server 2016. Overcoming these application dependencies would have added to project effort and incurred the time and cost of training operations staff on the new applications.
It was decided that it would be more cost effective to pay for extended support of Windows Server 2012. Then, at a later date, they could revisit the upgrade of the applications and Windows Server 2012 instances. The first of those workloads is now scheduled for replacement, so this has proven to be a good decision.
All of the infrastructure was created using infrastructure as code (IaC), which delivers auditable and reliable infrastructure changes. Configuration as code (Ansible) was used on the EC2 instances to define the changes to the systems, speeding up deployment through re-use and increasing consistency.
Replacing Existing Cold DR Strategy with Resilient Architecture on AWS
Active-passive solutions are an anti-pattern still present in the private cloud era. Previously, GWR had a cold DR strategy, which left around 40 percent of Guinness World Records’ infrastructure sitting idle, waiting to be used if the primary site went down.
Those servers had to be licensed, powered, cooled, and patched, but served no customers. This needed to change to a system that met GWR’s recovery time and recovery point objectives, whilst maximizing use of provisioned infrastructure.
The following diagram shows the high-level architecture designed to provide resiliency and high availability.
Figure 1 – High-level solution architecture.
The design leverages multiple AWS services to achieve a highly available and scalable solution:
- Amazon Route 53 is used to manage GWR’s DNS records with a 100 percent uptime SLA.
- Amazon CloudFront with seven distributions. Each CloudFront distribution provides access to one of seven unique GWR web sites, each customized in language and content for a particular geographic region.
- Multi-Availability Zone architecture, with resources balanced over distinct data centers within each AWS Region.
- AWS Auto Scaling is used by all server roles of the website workload, spread across Availability Zones and behind Application Load Balancers.
- Amazon ElastiCache for Redis is used by the web servers to store session data.
- Microsoft SQL Server on AWS deployed on EC2 instances with Always On availability groups, provided as a fully managed service by Ensono.
- All EC2 instances are backed up at least daily using Ensono’s backup management system to meet Recovery Point Objectives (RPO).
- This architecture collectively provides 99.95 percent availability for the web site.
- SQL and data backups are stored in Amazon Simple Storage Service (Amazon S3) and Amazon S3 Glacier with 99.999999999 percent (11x9s) of durability, replacing the previous tape backup solution.
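To give a feel for what 11 nines of durability means, the following is a minimal arithmetic sketch. It assumes the figure is read as an annual per-object durability, and the object count of 10 million is purely hypothetical; it is not a number from GWR.

```python
from fractions import Fraction

# 99.999999999 percent, held as an exact fraction to avoid float rounding.
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Average number of objects lost per year at the stated durability."""
    return object_count * (1 - ANNUAL_DURABILITY)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_objects_lost_per_year(object_count)


stored_objects = 10_000_000  # hypothetical archive size
print(float(expected_objects_lost_per_year(stored_objects)))
print(int(years_per_lost_object(stored_objects)))
```

Under these assumptions, an archive of 10 million objects would on average lose one object every 10,000 years.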
Re-Architecting GWR’s Core Web Content Management System
Guinness World Records uses SDL Tridion for their web content management system (CMS), which allows varied configurations ranging from all tiers on one server, to splitting out each tier and server role of the CMS into multiple servers.
The architecture in Figure 2 shows GWR’s previous CMS deployment setup. It only splits out the roles according to tier, concentrating multiple roles onto a handful of servers. This was convenient for reducing Windows license costs. However, it made scaling and maintenance tasks more complex.
Figure 2 – GWR’s previous CMS deployment setup.
The migration of this workload presented the opportunity to re-architect the deployment to meet best practices by splitting out the server roles within each tier onto their own servers.
The architecture in Figure 3 represents the high-level CMS deployment architecture on AWS. It was designed to provide scalability, high reliability, and performance efficiency.
Figure 3 – CMS deployment architecture on AWS.
Reference Amazon Machine Images (AMIs) were created for each unique server role in the workload. Each role was placed in an Auto Scaling group with subnets that span three Availability Zones, to minimize capacity loss if an Availability Zone were to fail.
Non-critical server roles (Content Manager, Publisher, Deployer) were placed in Auto Scaling groups with a target size of one, meaning that if a server stops responding, AWS will replace the instance. If its Availability Zone were unavailable, the instance would be re-provisioned into another.
For high availability, the servers hosting critical roles have auto scaling policies that maintain a minimum of two active instances for microservices and three for the website. The web site servers scale out in response to load, based on CPU utilization over a number of polling periods. They also scale out on a schedule every weekday afternoon to accommodate the regular increase in load experienced at this time.
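The scaling rules described above can be sketched as plain Python dictionaries shaped like the keyword arguments of boto3's Auto Scaling client (`create_auto_scaling_group`, `put_scaling_policy`, `put_scheduled_update_group_action`). This is an illustration only, not GWR's actual configuration: the group names, subnet IDs, maximum sizes, CPU target, and cron expression are all assumptions, and a target tracking policy is used for brevity because the post does not say which policy type is in place. No AWS calls are made.

```python
from dataclasses import dataclass

# Hypothetical subnet IDs, one per Availability Zone.
SUBNET_IDS = ["subnet-az-a", "subnet-az-b", "subnet-az-c"]


@dataclass(frozen=True)
class RoleScaling:
    """Capacity bounds for one CMS server role."""
    name: str
    min_size: int
    max_size: int  # assumed; the post states minimums only


ROLES = {
    # Non-critical roles: fixed at one instance, replaced if unhealthy.
    "content-manager": RoleScaling("content-manager", 1, 1),
    "publisher": RoleScaling("publisher", 1, 1),
    "deployer": RoleScaling("deployer", 1, 1),
    # Critical roles: minimums of two and three, as described in the text.
    "microservices": RoleScaling("microservices", 2, 6),
    "website": RoleScaling("website", 3, 12),
}


def group_params(role: RoleScaling) -> dict:
    """Arguments shaped like create_auto_scaling_group."""
    return {
        "AutoScalingGroupName": f"cms-{role.name}",
        "LaunchTemplate": {
            "LaunchTemplateName": f"cms-{role.name}",  # hypothetical name
            "Version": "$Latest",
        },
        "MinSize": role.min_size,
        "MaxSize": role.max_size,
        "DesiredCapacity": role.min_size,
        # Comma-separated subnets spread instances across three zones.
        "VPCZoneIdentifier": ",".join(SUBNET_IDS),
    }


def cpu_policy_params(role: RoleScaling, target_cpu: float = 60.0) -> dict:
    """Arguments shaped like put_scaling_policy (target tracking on CPU)."""
    return {
        "AutoScalingGroupName": f"cms-{role.name}",
        "PolicyName": "scale-out-on-cpu",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,  # assumed threshold
        },
    }


def weekday_schedule_params(role: RoleScaling, extra: int = 2,
                            cron: str = "0 13 * * 1-5") -> dict:
    """Arguments shaped like put_scheduled_update_group_action."""
    return {
        "AutoScalingGroupName": f"cms-{role.name}",
        "ScheduledActionName": "weekday-afternoon-scale-out",
        "Recurrence": cron,  # assumed time; Monday to Friday
        "MinSize": role.min_size + extra,
    }


for role in ROLES.values():
    print(group_params(role)["AutoScalingGroupName"], role.min_size, role.max_size)
```

Keeping the parameters as data makes the difference between the two kinds of role easy to see: non-critical roles have identical minimum and maximum sizes, so the group only ever replaces a failed instance, while the critical roles have headroom to scale out.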
Load Balancers were used extensively to allow systems to be added and removed for automation and to ease administration, de-coupling the architecture.
ElastiCache Redis (highly available) is used to store the web servers’ states. This allows horizontal scaling and consistent user experiences as web servers are automatically added and removed.
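The reason externalized session state enables horizontal scaling can be shown with a small sketch. It is a simplified model, not GWR's implementation: a plain dictionary stands in for the Redis cluster, and the `SessionStore` and `WebServer` classes, server names, and user name are hypothetical.

```python
import json
import uuid


class SessionStore:
    """Session storage shared by every web server."""

    def __init__(self, backend: dict):
        # A dict here; in practice this would be a Redis client.
        self._backend = backend

    def save(self, data: dict) -> str:
        session_id = uuid.uuid4().hex
        self._backend[f"session:{session_id}"] = json.dumps(data)
        return session_id

    def load(self, session_id: str):
        raw = self._backend.get(f"session:{session_id}")
        return json.loads(raw) if raw is not None else None


class WebServer:
    """A stateless web server: all session data lives in the shared store."""

    def __init__(self, name: str, sessions: SessionStore):
        self.name = name
        self.sessions = sessions

    def login(self, user: str) -> str:
        return self.sessions.save({"user": user})

    def handle(self, session_id: str) -> str:
        session = self.sessions.load(session_id)
        if session is None:
            return f"{self.name}: please sign in"
        return f"{self.name}: hello {session['user']}"


shared_backend = {}
server_a = WebServer("web-a", SessionStore(shared_backend))
session_id = server_a.login("alice")

# Simulate a scale-in event removing the server that created the session.
del server_a

# A newly launched server can still serve the same user.
server_b = WebServer("web-b", SessionStore(shared_backend))
print(server_b.handle(session_id))
```

Because no server holds session data locally, the load balancer can route any request to any instance, and instances can be terminated without signing users out.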
Continuous Improvements and Innovations
Migrating to AWS has presented Guinness World Records with opportunities to optimize existing workloads, develop innovative solutions, and introduce new business capabilities that would have been considerably more complex to deploy to their previous data center.
GWR developed in-house an Alexa Skill to provide users with one interesting world record per day, plus the ability to provide more if requested. The team is currently developing the next release, which will give access to 50,000 world records on demand. The medium-term roadmap includes delivering video and audio artefacts for those records to suitably capable Alexa devices.
Live Production Broadcast from Virtual Workstations
To scale, GWR has identified the need to have a live broadcast capability anywhere in the world, without their staff being physically present. Virtual workstations on AWS were identified as a suitable candidate to meet this requirement.
One of the most important requirements for a virtual workstation on AWS (used for video and audio processing) is a rich remote viewing protocol. It should be low latency to provide lag-free interaction, high quality for accurate color and frame rate reproduction, and feature-rich to accommodate multiple screens and end user input devices.
Ensono evaluated various technologies and settled on Teradici’s PCoIP, which supports multiple high-resolution screens and input tablets, and adapts to changing network conditions.
The proof of concept (PoC) was completed within two weeks, with a live demo in which there were incoming live video streams from different locations, stock footage, and screen overlays being mixed in real time and broadcast live on YouTube.
Following the successful PoC, several vMix workstations are being deployed to a production environment, enabling a cloud-based live video production and broadcasting capability.
Ensono is an AWS Well-Architected Partner and has periodically conducted Well-Architected Reviews with GWR since 2018. The recommendations have led to measurable improvements such as cost-saving through savings plans and more widespread use of encryption at rest.
The improvements implemented following Ensono’s Well-Architected Review with GWR helped save the company 17 percent on their monthly AWS bill.
Many organizations are cautious about moving to the cloud, or are fighting against gravity by staying on-premises. When done right, cloud adoption can increase your speed to innovate and offer opportunities to transform customer experiences.
In this post, we shared how Ensono helped Guinness World Records to migrate all-in on AWS. Measurable benefits of the migration include:
- Calculated availability of the GWR web site went up from 99.9 percent to 99.95 percent.
- 95th percentile cold web site page loads went from 16.3 seconds down to 12 seconds.
- With Amazon CloudFront, 95th percentile second poll web page load times were 6.6s in the UK and 7.1s in other territories.
- 50th percentile cold web site page loads in China went down from 60s to 5s.
- Transitioned from 80 percent to zero percent of Microsoft Servers running Windows Server 2008 R2.
- Removed hard limits to file storage (particularly large video files) by migrating to infinitely-scalable object storage in Amazon S3 and using the third-party ‘ExpanDrive’ products to access files from desktop clients.
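As a rough illustration of the availability figures above, the percentages can be converted into permitted downtime. This sketch assumes availability is measured over a 365-day year; the post does not state the measurement window.

```python
from decimal import Decimal

HOURS_PER_YEAR = Decimal(365 * 24)


def annual_downtime_hours(availability_percent: str) -> Decimal:
    """Hours of downtime per year implied by an availability percentage."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * HOURS_PER_YEAR


before = annual_downtime_hours("99.9")
after = annual_downtime_hours("99.95")
print(f"before: {before:.2f} h/year, after: {after:.2f} h/year")
```

On that basis, moving from 99.9 percent to 99.95 percent halves the implied downtime, from about 8.76 hours to about 4.38 hours per year.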
Ensono – AWS Partner Spotlight
Ensono is an AWS Advanced Consulting Partner that helps customers looking to transform their traditional environments with AWS.