Inside: bet365's £1 million business continuity project

Posted on 2014-3-13 23:43
Inside bet365’s £1m business continuity project

Every week, a small contingent of IT experts unlock the doors at a Manchester data centre and prepare to put their network through its paces. Switching on the lights, they sit down to manage the thousands of servers and communications nodes backing up one of Britain's biggest real-time networks from their Network Operations Centre.

These visitors are the IT operations team of bet365, the world's largest online gambling website, and the system they keep operational round the clock has the kind of performance specs that keep IT managers awake at night.

The bet365 website has more than 11 million customers worldwide and some £20bn wagered on its Sportsbook alone. Because customers bet on popular, time-critical events, such as football matches and horse races, its systems can have as many as two million concurrent users at peak times, resulting in up to 500,000 transactions per second across its database estate.

And because of the unique requirements of the business, bet365's owner and joint chief executive, Denise Coates, took the decision that the company should develop its own software and network architecture internally. As a result it has invested heavily in developing in-house systems and software capability and is now a leader in online gaming technology.

Neil Selby, head of networks and security at bet365, says: "Assuring business continuity and disaster recovery has been a major aim of our investment programme over the past three years. We needed to prove that we're able to survive a major site outage, due to either a technical failure or the potentially devastating effects of a natural disaster such as fire or flood."

The company has two primary data centres, in Manchester and London. Its core network has a considerable number of Cisco Nexus 7000 switches. The newer Manchester centre has matured greatly over the past year and all the critical database systems are now duplicated between the two sites.

To make sure the system lives up to its name, 365 days a year, the company is currently completing the third phase of its £1m investment programme to provide full business continuity for its national network. This has involved installing a second Dark Fibre Network in the UK and the first European order for the new NetScaler 22040, Citrix's web application delivery product.

The NetScaler 22040 enables bet365 to continue to grow its 10Gbit/s network to keep pace with customer demand.

Bet365 specialises in in-play sports betting, where customers can bet while the action unfolds during a sporting event, such as on the next goal in football or the next wicket to fall in cricket. Customers can place bets directly from their PC or smartphone on the bet365 website at www.bet365.com.

"Historically we've run on an active-standby model of resilience with two principal locations for databases and web servers as well as internet connectivity in Manchester and London," says Selby. All of our critical databases are now geo-clustered and are able to work out of any of our data centres."

The company's Stoke-on-Trent HQ was its first proving ground for business resilience and it has since built on that experience.

"We've implemented a second fibre network to link our sites in a very robust way. We now have the same data available to us if we want to transact on our sportsbook in Manchester as we do in our office in Stoke. We can almost ignore physical location because we have generous backbone network with very low inter-site latency," says Selby.

One issue for Selby and his team is that, as well as duplicating servers and communications equipment to protect against failure, they've also had to contend with the company's rapid growth in user numbers over the past three years.

Bet365 is deploying large-scale resilience and growing the network's capacity at the same time. This means putting more servers into their webserver farms and also duplicating the farms in each location.

"We also have to exploit all possible technical efficiency improvements," says Selby. "We are always looking to move data away from the server farm to the caching layer on the edge to reduce bandwidth requirements within the network and serve content from as close to our customers as we can. This improves response time, website performance and reduces our internal load."

One of the greatest pressures on network performance is that sporting events tend to generate intense bursts of activity and the network must respond to these peaks in demand in real-time.

To sustain up to 500,000 transactions per second and up to two million concurrent users at peak times, the company has installed market-leading technology at every layer, with the aim of keeping latency very low.

Sports betting presents considerable challenges from an IT point of view, especially when the sport or event in question is incredibly popular. The Grand National is a good example.

"Traditionally, the Grand National is our busiest day of the year and it's not unusual that we will have anything from 200-300 per cent more activity compared to anything we've seen on any one day throughout the previous year," says Selby. "Not only is it extremely busy but there are two particularly large peaks during the day; one at the beginning and another at the end of the race. Typically, people wait until the last minute to place their bet so there is a huge burst of activity minutes before the race starts and the same at the end of the race when we have to reconcile everyone's winnings."

To further safeguard its national communications network, bet365 has implemented a second Dark Fibre Network, provided by Virgin Media.

"Our existing WAN is configured as a figure of eight, with every inter-site link having direct and indirect paths giving us a significant degree of resilience. Our investment in a second WAN provides further dedicated inter-site connectivity down the spine of our network from a separate provider, with guaranteed physical separation giving us protection from all the currently perceived risks to our inter-site connectivity," says Selby.

This gives bet365 three network routes for every inter-site link and in the event of disaster, the company can readily reconfigure the links as required.
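
One way to picture three routes per inter-site link is as a priority list with failover. The sketch below is a hypothetical model only: the route names, the priority order and the `primary-wan` label are assumptions (the article names only Virgin Media as the second provider), and a production network would normally handle this in its routing layer rather than in application code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Route:
    name: str
    provider: str  # which WAN carries the route
    priority: int  # lower number = preferred


def pick_route(
    routes: Iterable[Route],
    failed_routes: Set[str],
    failed_providers: Set[str],
) -> Optional[Route]:
    """Return the most preferred route that is still usable, or None."""
    usable = [
        r
        for r in routes
        if r.name not in failed_routes and r.provider not in failed_providers
    ]
    return min(usable, key=lambda r: r.priority, default=None)


# Hypothetical routes for a single inter-site link (e.g. Manchester to London).
MANCHESTER_LONDON = [
    Route("primary-direct", provider="primary-wan", priority=1),
    Route("primary-indirect", provider="primary-wan", priority=2),
    Route("second-wan", provider="virgin-media", priority=3),
]

if __name__ == "__main__":
    # Normal operation: the direct path on the primary WAN is chosen.
    print(pick_route(MANCHESTER_LONDON, set(), set()))
    # A single fibre cut: traffic falls back to the indirect path.
    print(pick_route(MANCHESTER_LONDON, {"primary-direct"}, set()))
    # Systemic failure of the primary WAN: only the second provider remains.
    print(pick_route(MANCHESTER_LONDON, set(), {"primary-wan"}))
```

Separating "a route failed" from "a provider failed" mirrors the distinction drawn in the article between a single link problem and a systemic failure of the primary network.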

"The addition of the second network means that in the event of a major event impacting our primary network, like regional flooding or systemic failure, bet365 can move traffic onto Virgin without interruption to service which is vital for our operation which, serving sports globally, is truly 24x7 covering sporting events in most time zones: Europe, Asia & Australia and at differing times of day in those regions," says Selby.

To ensure that the systems are always in a state of preparedness, bet365 regularly rehearses its business continuity procedures to prove that the planned measures are effective.

"Our business continuity team looks at all of our response procedures, trying to anticipate every possible scenario. Once a week, our operations team work from our alternative NOC to prove they have the connectivity and communications required. We routinely swap between server farms, switch IP Transit links and providers and run on generators. These exercises prove that in the event of a real issue we will be working in familiar territory," says Selby.

http://www.computing.co.uk/ctg/feature/2330194/inside-bet365-s-gbp1m-business-continuity-project/page/1
http://www.computing.co.uk/ctg/feature/2330194/inside-bet365-s-gbp1m-business-continuity-project/page/2
Posted on 2014-4-17 17:38
I can't understand a single sentence.
Posted on 2014-4-17 21:33
I can't understand it either.
Posted on 2014-4-17 21:33
I can't understand it either.