Well Prepared for the Future

The move to the new data centre is complete

Christoph Strauch - 28 January 2021

About two years ago, we started planning the move to a new data centre within Switzerland in order to simplify the MACD system landscape that has grown over the years, to modernise the hardware and to be generally equipped for future requirements. We have recently gone live - all customer connections have been migrated successfully.

From the beginning it was clear that this project would mean an immense coordination effort between the more than 100 MACD customers and partners, the previous data centre, and the providers. It would certainly include a lot of unexpected challenges. Our team therefore collaborated closely with the project manager, Raphael Grochtmann.

We had to move 22 IPSS connections, 42 IPSec connections, 1 COLT connection and 3 exchange connections as smoothly as possible for our customers. At the same time, we had to set up parallel operation to ensure high availability and maximum security, and avoid production downtime - a very demanding undertaking, especially in the financial IT sector.

For me personally, it was also a challenge because I joined MACD only three months before the start of the project. For the first time, I had taken over the management of an admin team and now had to ensure that the complex project was implemented flawlessly in close cooperation with the project management.

I benefited from the fact that I had already taken part in a data centre move in my previous job - from Frankfurt to Aachen. However, that was only a change of location and not really comparable to this project. At MACD, it was not only about setting up new servers, but also about switching to current operating systems and using completely new technologies, e.g. CEPH for central storage.

There is no such thing as the perfect route

Yes, there were obstacles - in customer connections, in high availability, in updating to current operating systems and software, in existing dependencies, and in finding the right contact person both internally and externally. The solutions differed from problem to problem - there was no such thing as the perfect route.

We had to be agile and flexible and arrange many meetings to get the necessary knowledge from our internal and external experts. Thanks to the great cooperation with our Technical Account Management and Software Engineering colleagues and the regular exchange with the admin teams of our customers and partners, we always succeeded in doing so.

Fortunately, we were well positioned right from the start for the permanent remote work that Corona forced on us. Since my team includes employees in Aachen (Germany) as well as in Urdorf (Switzerland), we were used to working together remotely via video conferencing. Working remotely did make it harder to familiarise new team colleagues with the work, but we have managed to do this well so far.

To completely avoid a failure is utopian

With the move we have become better at recovering from an outage. Avoiding a breakdown completely is simply utopian, but you can keep the duration of a failure as short as possible. We always assume that hardware can die at any time, and we are able to restart our systems automatically within a few minutes without data loss.

Furthermore, we follow the strategy of patching our systems as often as possible - in the former data centre they were touched too rarely for fear that something would break. We have now made our connections completely transparent for customers: they have exactly one access point to our firewall cluster.

Well prepared for the future

Since it was a very complex, sometimes demanding project, we were able to gain valuable experience on a technical, but also on a human level. It was valuable for us as a team as well as for MACD and our customers and partners.

In the new data centre we now have control over the entire infrastructure and host all connections ourselves. We have also expanded and improved our monitoring and strengthened our IT security.

We have also greatly expanded our network know-how and are getting to know new technologies such as Proxmox clusters and CEPH. Some of the innovations:

  • We separate networks with VLANs
  • We rely on 10 Gbit network technology to operate our CEPH storage
  • We cluster all switches
  • We use separate Proxmox clusters, thus keeping production apart from DEV environments and allowing developers to reproduce bugs productively
  • The Proxmox clusters allow systems to be moved from hardware to hardware at runtime without having to stop them

And last but not least, we have grown together very well as a fresh, new team.

So we are well positioned for the future and especially for the further development of our new trading system MAX. The new system consists of different modules, each representing a service. The aim is to make it even easier, faster and more secure for us to provide our customers with the services they require.

Of course, as with any new system, experience must first be gathered and teething troubles gradually eliminated. Every update also involves a risk - but security simply has to be given a higher priority.

And with every update you learn from mistakes.