Responsible for urban transportation, SETAO operates the tram and bus network in the metropolitan area of Orleans. The company carries more than 100,000 passengers per day and must therefore run a highly available system. In terms of computing resources, it relies on a 24-km Cisco-based optical MAN serving 60 access points, with 4 computer rooms spread across the city. This network carries vehicle radio traffic, the billing system, CCTV surveillance with 14 days of video recording, power-transformer management, traffic lights…
SETAO's mission is to transport passengers and keep them informed of real-time traffic conditions. To achieve that, SETAO uses an operations support system (SAE), a specific business application. This fundamental tool is extremely powerful: it manages tram and bus traffic in real time, and it also informs customers in real time in the 24 tram stations, on board the 22 trams and 220 buses, and on mobile phones.
Why we turned to the virtualization world
In 2004, eager to renew the hardware platform of its application, SETAO had to keep the SAE running on its existing operating system, Windows NT4. One option considered was migrating to Windows 2000, but that would have required recertifying and recompiling the application, at an estimated cost of €240,000. In the end, the solution was to virtualize the environment with VMware Workstation 4.5, the only product on the market that supported Windows NT4. This approach was then extended to all of the company's servers as they came up for renewal.
Securing the information system
In 2006, the company acquired a Quantum PX720 LTO-III backup library and NetBackup 6.5 software to replace its two ADIC Scalar 100 libraries and Arkeia Network Backup.
VMware Workstation was quickly replaced by VMware Server on 30 physical hosts running Linux, hosting 67 virtual machines. A new problem arose: comprehensive, centralized administration of all these VMs.
After a functional and financial study, the company chose Virtual Infrastructure 3.5 with VirtualCenter, HA, DRS, VMotion and VCB, hosted on two server farms based on dual quad-core Xeons with 32 GB of RAM. In 2008, to enable synchronous storage replication, the NFS- and iSCSI-based Adaptec Snap Server 18000 NAS units were replaced with two Pillar Data Systems Axiom 500 storage arrays, synchronously replicated with FalconStor NSS IPStor. To fully address the issue of service continuity, the Site Recovery Manager tool was deployed in November.
To give the client environment the same redundancy across the datacenters, View 3 will be deployed in the first quarter of 2009.
To prepare for the future, the company plans to merge LAN and SAN traffic onto the same 10 Gigabit Ethernet media, using Cisco Catalyst VSS switches together with physical and virtual Nexus switches and the FCoE protocol. Construction of a second tram line started at the end of 2008, with service planned for 2012. The MAN will then extend to 40 km and serve 120 access points; the CCTV system will grow to 500 cameras, and storage to 100 TB.
The Practice – Practical tips
During my work I encountered some issues I want to share with you:
Initially, SETAO had no Fibre Channel SAN; all connections were Gigabit Ethernet and IP. We chose iSCSI as the storage protocol, with 4K jumbo frames, the maximum value common to all of our 3Com, Broadcom and Intel NICs and our Cisco 6509 switches.
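As a rough sketch of what this involves (the interface name and the exact 4000-byte MTU are assumptions; check what your own NICs and switches support), enabling jumbo frames on a Linux iSCSI initiator of that era looked like:

```shell
# Raise the MTU on the NIC dedicated to iSCSI traffic.
# "eth1" is an assumed interface name; adjust to your setup.
ifconfig eth1 mtu 4000

# Verify the new MTU is active.
ifconfig eth1 | grep -i mtu

# The MTU must be raised end to end: on the Catalyst 6509 side,
# the switch and the relevant interfaces must also be configured
# for jumbo frames, otherwise packets are fragmented or dropped.
```

The key point is that every hop between initiator and target must accept the larger frame; a single device left at 1500 bytes negates the benefit.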
See the feedback from this experience (sorry, it's in French): http://www.indexel.net/1_6_4638__3_/2/9/1/Parier_sur_l_iSCSI_pour_optimiser_la_restauration_de_donnees.htm.
However, handling and managing the virtual machines this way was tedious. So we quickly switched to our NAS's NFS protocol, and we saw better performance in terms of access. To improve this mode further, I recommend the command /sbin/hdparm -a 512, which sets the read-ahead: the number of sectors the disk reads in advance. It can improve a VM's performance by up to 50%.
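For reference, here is how that read-ahead tuning can be checked and applied (the device name /dev/sda is a placeholder for your actual disk):

```shell
# Show the current read-ahead setting (expressed in 512-byte sectors).
/sbin/hdparm -a /dev/sda

# Set read-ahead to 512 sectors, i.e. 256 KB read in advance.
/sbin/hdparm -a 512 /dev/sda

# Note: the setting does not survive a reboot; once the gain is
# confirmed, re-apply it from a boot script such as /etc/rc.local.
```

It is worth benchmarking before and after, since the optimal value depends on the workload's access pattern.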
During the transition to Virtual Infrastructure 3, we used our NAS's NFS again. All 67 VMs were converted to VI 3.5 with Converter in a single weekend!
The following Monday, users simply noticed that their applications were faster, because they were now running on new machines with two quad-core CPUs.
Pillar Storage Architecture
The choice of Pillar Data was driven first by the machine's architecture, where each disk brick has two controllers, unlike other storage products on the market. The more bricks you add, the more powerful the machine becomes.
Another interesting point: using 1 MB wide-stripe writes in conjunction with the 1 MB logical blocks of VMFS3 overcomes any contention on the 24 GB of RAM buffers.
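To illustrate the VMFS3 side of that alignment (the device path and datastore label below are placeholders), a VMFS3 datastore with 1 MB blocks is created from the ESX service console with vmkfstools:

```shell
# Create a VMFS3 filesystem with a 1 MB block size on a LUN.
# The disk path and the label "axiom-high01" are placeholders;
# use the actual device presented by your array.
vmkfstools -C vmfs3 -b 1m -S axiom-high01 \
    /vmfs/devices/disks/vmhba1:0:0:1
```

The block size is fixed at creation time, so it has to be chosen with the array's stripe width in mind before the datastore is populated.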
During our tests against the NetApp FAS3120 and the EMC CX4, the Axiom 500 was the only machine that let us drive the fibre to saturation, and it was the most energy-efficient.
Better still, for the record, I obtained 160 MB/s of throughput on a single Slammer (the SAN fabric) Gigabit Ethernet port with jumbo frames!
Finally, the interface is very intuitive and does not require a storage specialist: a journalist friend of mine, Mrs. Virtuanews.fr herself, was able to create a LUN in 3 minutes on her first encounter with the storage system…
The move to the Pillar SAN also went fairly smoothly. I created 4 LUNs of 2 TB each, on different Quality of Service levels: High, Medium, Low and Archive. Then, based on expected performance, I chose which VMs to place in each disk class of service to guarantee disk-access performance, and I associated the corresponding ad-hoc pools in VMware VI.
Storage VMotion then allowed us to migrate the virtual machine files without any interruption. Replication between the arrays was also carried out without stopping production, using the FalconStor IPStor appliances.
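In VI 3.5, Storage VMotion was driven from the Remote CLI rather than the GUI; a single-VM migration between datastores looks roughly like this (the VirtualCenter URL, datacenter, VM and datastore names are all placeholders for illustration):

```shell
# Relocate a running VM's files from the old NFS datastore to a
# Pillar LUN with no downtime. Every name below is a placeholder.
svmotion --url=https://virtualcenter/sdk \
         --datacenter=SETAO \
         --vm='[nas_nfs] sae01/sae01.vmx:pillar_high'
```

Run interactively (without --vm), svmotion also prompts for the source and destination, which is handy for one-off moves.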
Apart from last summer's 3.5U2 bug, the only issue we had was during the implementation of Site Recovery Manager with the FalconStor SRA agent: the SRA did not see the LUNs to add to the protection group and generated an error.
Over WebEx sessions and conference calls at 1:00 AM with FalconStor staff and the people at VMware who developed SRM, it turned out that the perl.exe shipped with Oracle, the database we use to run SRM, conflicted with the perl.exe shipped by VMware. The workaround was to rename Oracle's Perl to Perl.old…
Finally, we ran many failover simulations with SRM. The RPO was about 3 minutes and the total RTO about 7 minutes. The improvements I expect in future versions of SRM are one-to-many protection, in a star topology, rather than the current one-to-one, and an automatic failback, because the backup site becomes the production site after a disaster. Coming back to the normal situation is not easy!