By Gus Robertson, NGINX

Now that the new year has arrived, and the holiday hangover has subsided, it’s time to look back on what worked and what didn’t for the 2015 holiday season. For most retailers, Black Friday, Cyber Monday, and all of the madness in between are the busiest, and most profitable, days of the year. The past year was no exception, with another year of growth. For 2015, IBM estimates that Black Friday sales alone were up 21.5% from the previous year, while Cyber Monday sales topped $3 billion for the first time ever.
For many retail CEOs, the beginning of the year is a time to breathe a sigh of relief and reflect on the successes of the holiday sales season. However, as we see every year, some very high-profile sites had major outages, including Target, Neiman Marcus and PayPal. I guarantee you that the CEOs of these companies have had to explain to their boards or bosses why their sites went down on the busiest day of the year, and what they are going to change in order to make sure it never happens again.
It is unacceptable for the management team of any online retailer to not understand the performance of their online store, and how it handles their busiest days. This is equivalent to the CEO of a traditional brick-and-mortar retailer not knowing how many stores they will have open on Black Friday, or how many staff they will have available to handle their customers.
How many millions of dollars will be lost before CEOs understand that, in the digital world, your site (or app) is your lifeblood? If it goes down, you’ve essentially closed your doors and are telling customers to go elsewhere. This means lost revenue, and every minute of downtime can cost hundreds of thousands, or potentially millions, of dollars. As luck would have it, most sites go down when they experience the highest number of visitors, as several retailers saw on Black Friday and Cyber Monday.
This is not just an “IT problem” anymore. Slowness or outages on your site determine whether the best day of the year becomes your worst. Yet, many of those who fail are successful, technically savvy companies. How is this possible?
The answer is that web and application delivery is a very complex issue, especially when traffic is high and tens of thousands of users all request, at the very same time, different information about pricing, sizes, inventory, shipping, discounts and financing, and do so from all kinds of devices: PCs, tablets and smartphones.
This is meaningful stuff. If your page load time increases by just one second due to an influx of traffic, you can expect to see conversions reduced by 7%. At peak traffic times, more than 75% of online consumers will leave for a competitor’s site rather than suffer even tiny delays. That said, technology exists to solve these problems and it’s easy and inexpensive to deploy.
No one expects a CEO to become an expert in software architecture, but in today’s digital-centric world it’s critical that you have enough insight to know whether or not you will survive and thrive during your busiest times of year. To help you out, here are the big questions that every CEO should ask their technology teams in order to make sure they are ready to handle the 2016 holiday season, and all of the busy days in between:
How Many Users Can We Handle At One Time?
Concurrency is the number of simultaneous users connected to a site. Many high-profile sites still use architectures that can only handle low concurrency. To handle higher levels of concurrency, a web site should be based on a number of very efficient building blocks and be able to scale with the growing number of simultaneous connections and requests per second.
How Do We Plan To Scale When Traffic Soars?
Many sites have a simple architecture: a web server that runs application code to put together web pages, often with a database server supporting the application. Those platforms are limited in their ability to handle significant amounts of traffic because every new browser session opens many server connections. If several thousand users are online at once, tens of thousands of connections are needed, which slows performance.
Tools like load balancers and reverse proxy servers can prevent these problems. They receive requests from user devices and dispatch those requests to duplicate servers. Once a response is generated, they return the information to the user device. As such, the other servers can work at optimal speeds because they don’t get bogged down holding thousands of open connections waiting for responses. Ask your team if your site uses a multi-server architecture along with commercial-grade load balancers, and how they are deployed.
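For the technology team, the dispatching idea described above can be illustrated with a minimal sketch. It assumes the simplest possible policy, round robin, in which each new request goes to the next server in a fixed rotation. The class name, the function name and the backend names "app-1" to "app-3" are hypothetical, and a production load balancer does far more than this (health checks, connection handling, weighting).

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand each incoming request to the next backend in a fixed rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._rotation = cycle(self._backends)

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


def dispatch(balancer, request_count):
    """Count how many of `request_count` requests each backend receives."""
    counts = {}
    for _ in range(request_count):
        backend = balancer.pick()
        counts[backend] = counts.get(backend, 0) + 1
    return counts


# Hypothetical backend names; 9,000 requests spread over three servers.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
load = dispatch(balancer, 9000)
print(load)
```

In this sketch no single server sees more than a third of the traffic, which is the effect the multi-server architecture is meant to achieve.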
When Was The Last Time We Tested Our Application For Scale?
With e-Commerce sales on the rise, past success with your application is no indicator of future results. The only way to know for sure is to test the reliability and scalability of your site with simulated traffic. When was the last time your team tested for scale? What was the result? What failed first? What plans were put into place to rectify those issues?
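As a rough illustration of what "testing for scale" means, the sketch below ramps up simulated traffic against a stand-in site until requests start to fail. Everything in it is an assumption made for illustration: the SimulatedSite class, its fixed capacity of 500 simultaneous requests, and the traffic levels. A real test would point a load-testing tool at a staging copy of the actual site.

```python
import asyncio


class SimulatedSite:
    """Stand-in for a real site: rejects requests beyond a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_flight = 0

    async def handle(self):
        if self.in_flight >= self.capacity:
            return False  # overloaded: this request fails
        self.in_flight += 1
        try:
            await asyncio.sleep(0.01)  # pretend to do some work
            return True
        finally:
            self.in_flight -= 1


async def run_load_test(site, users):
    """Fire `users` simultaneous requests and return how many failed."""
    results = await asyncio.gather(*(site.handle() for _ in range(users)))
    return results.count(False)


async def find_breaking_point(capacity, levels):
    """Return the first traffic level at which any request fails, or None."""
    for users in levels:
        site = SimulatedSite(capacity)
        if await run_load_test(site, users) > 0:
            return users
    return None


# Assumed capacity and traffic levels, chosen only for the example.
breaking_point = asyncio.run(
    find_breaking_point(capacity=500, levels=[100, 250, 500, 1000, 2000])
)
print(breaking_point)
```

The number that comes back is the answer to "what failed first, and at what load?", which is the conversation the questions above are meant to start.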
Often the issues that arise are easily resolved with small changes to how your servers are deployed, and with straightforward application delivery techniques. Things like content caching, putting limits on bandwidth and number of requests by each unique user, and tuning your load balancer for performance can all have dramatic impacts on reliability and scale.
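One of the techniques mentioned above, limiting the number of requests from each unique user, can be sketched in a few lines. This is a minimal fixed-window limiter under assumed settings (3 requests per second per user); the class name and user IDs are hypothetical, and in practice this kind of limit is usually configured in the load balancer or proxy rather than written by hand.

```python
import time


class PerUserRateLimiter:
    """Allow each user at most `limit` requests per `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._windows = {}  # user id -> (window start time, request count)

    def allow(self, user_id):
        """Return True if this user's request is within the limit."""
        now = self.clock()
        start, count = self._windows.get(user_id, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the old window expired; start a new one
        if count >= self.limit:
            return False
        self._windows[user_id] = (start, count + 1)
        return True


# A controllable clock keeps the demonstration repeatable.
fake_now = [0.0]
limiter = PerUserRateLimiter(limit=3, window=1.0, clock=lambda: fake_now[0])

first_burst = [limiter.allow("shopper-a") for _ in range(5)]
other_user = limiter.allow("shopper-b")  # a different user is unaffected
fake_now[0] = 1.5                        # move past the one-second window
after_window = limiter.allow("shopper-a")
print(first_burst, other_user, after_window)
```

The point of the sketch is that one overly aggressive client is slowed down while every other shopper continues to be served normally.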
Do We Have Enough Traffic Cops?
When you get spikes in traffic you don’t want to overload a single server, so you spread the load across multiple servers. Many companies already have hardware load balancers. They also need software load balancers that application teams control, so those teams can redirect traffic or add new servers when traffic suddenly increases, or when you hit the bandwidth and connection limits you set to contain your cloud provider costs.
This is also true when your site suffers a denial-of-service (DoS) attack, where malicious actors overrun your servers with requests in an attempt to crash the site. The sheer number and sophistication of these attacks are constantly growing, and they can be hugely disruptive to your business. Every CEO today should assume their company is a target and protect it accordingly.
Failure Is Inevitable. How Quickly Can We Recover, And Who Has Our Back?
The ability to bounce back quickly today has as much to do with how your team is set up as it does with your software stack. Agile development teams bounce back faster. They’re flexible in their approach, they work well with others, and they move fast.
These teams overwhelmingly use open source software tools for their flexibility and freedom from vendor lock-in. That said, when things go wrong it’s important to have experts to back you up. Pick a partner that understands your stack, understands performance tuning, and is available 24×7 with support.
Asking the right questions and ensuring your team has thought through how to achieve and ensure scale will help you survive any holiday hordes. Having the right architecture, software and partners can save you millions of dollars, thousands of customers, and that one ugly board meeting.
Gus Robertson serves as Chief Executive Officer of NGINX and is a seasoned leader in the technology industry, including companies such as Microsoft, Visio, Lexmark, and Red Hat. He has a proven track record of building successful commercial offerings, strengthening open source communities, and driving customer loyalty.