Empowering Ethiopian Universities: Inside EthERNet’s High-Performance Computing Journey

In 2019, EthERNet unveiled its High-Performance Computing (HPC) facility, a groundbreaking initiative designed to revolutionize research capabilities across Ethiopia. This state-of-the-art facility was established to cater exclusively to higher education institutions and research centers, providing them with access to advanced computing technologies. It is also meant to serve as a reference case for advanced ICT services in Ethiopian higher education. Since its inception, EthERNet’s HPC facility has been at the forefront of driving innovation and collaboration, empowering researchers and students alike to push the boundaries of knowledge and discovery. Five years later, we look at the impact that this facility has had on EthERNet’s community.

The challenges faced by universities and graduate students, particularly in areas where high-performance computing resources are needed, were a strong motivating factor in the establishment of this HPC facility. Since 2019, twenty universities and institutions have made use of this resource. These include Addis Ababa University, Hawassa University, and Jimma University, among others, which have embraced HPC to conduct cutting-edge research across multiple disciplines, including computational biology, climate modelling, materials science, and physics. Some of the universities are now able to offer courses and workshops on parallel computing, which give students valuable opportunities to gain hands-on experience.

Furthermore, HPC provides universities with essential capabilities for conducting simulations, modelling complex systems, and analysing large datasets. With the increasing volume and complexity of data, the HPC facility is essential for handling data-intensive applications such as machine learning, AI, big data analytics, and fluid dynamics.

Assistant Professor Gamachis Sakata Gurmesa, from Mattu University, shares his journey of leveraging EthERNet’s HPC resources to advance his research in Condensed Matter Physics. Through sophisticated Quantum Mechanical Methods of Materials Modelling and Simulation, Gamachis has made significant contributions to the field, with his work even being published in prestigious journals. He states:

“In 2016, I enrolled in Addis Ababa University as a doctoral student in the Department of Physics with a focus on Condensed Matter Physics. I have been using High-Performance Computing (HPC) to do sophisticated Quantum Mechanical Methods of Materials Modeling and Simulation since my PhD research, ‘Computational Analysis and Design of Novel Silicate Cathode Materials for Rechargeable Lithium- and Sodium-ion Batteries.’ My PhD work was partially funded by EthERNet’s HPC access. The EthERNet Computational facilities are one common national computational resource that we, as staff or students, use. These days, we are mostly employing quantum mechanical methods for materials modeling and simulation to help with two areas of research: (1) the role of defects in novel materials; and (2) crystal structure prediction. Our research group named Facilities and Sustainable Training in Education and Research Network (FaSTERNet) uses readily available VASP and Quantum ESPRESSO codes, provided by EthERNet.”

EthERNet’s HPC Technical Features

  • Management Nodes: These nodes handle cluster management tasks such as NTP service, InfiniBand subnet management, job scheduling, and cluster monitoring. They are designed for high reliability and consist of two servers with Intel Xeon processors, 16 GB memory, SAS hard disks, GE ports, and Mellanox InfiniBand cards.
  • Compute Nodes: There are 18 compute nodes, each equipped with Intel Xeon processors, 192 GB memory, high-speed SAS hard disks, and GE and InfiniBand cards. Each node delivers a computing capability of 3072 GFLOPS, for a total of roughly 55 TFLOPS across all nodes.
  • Storage Devices: Configured with redundant controllers, these devices offer 200 TB effective capacity and connect externally via 16G FC ports.
  • Storage Nodes: Consisting of Tecal RH1288H V3 and RH2288 V3 units, these nodes serve as Metadata Server (MDS) and Object Storage Server (OSS) respectively. Each node is equipped with Intel Xeon processors, 64 GB memory, SAS hard disks in RAID 1 configuration, and high-speed network cards.
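The aggregate figures for the compute nodes follow directly from the per-node numbers in the list above. As a minimal sketch, the arithmetic can be checked in a few lines of Python. The constant and function names are illustrative only, and the per-node GFLOPS value is assumed here to be a peak rating, so sustained performance on real workloads would normally be lower.

```python
# Figures taken from the compute-node description above.
COMPUTE_NODES = 18
GFLOPS_PER_NODE = 3072      # assumed to be a per-node peak rating
MEMORY_GB_PER_NODE = 192


def aggregate_tflops(nodes: int, gflops_per_node: float) -> float:
    """Return the combined computing capability in TFLOPS (1 TFLOPS = 1000 GFLOPS)."""
    return nodes * gflops_per_node / 1000.0


def aggregate_memory_gb(nodes: int, memory_gb_per_node: int) -> int:
    """Return the total memory across all compute nodes in GB."""
    return nodes * memory_gb_per_node


total_tflops = aggregate_tflops(COMPUTE_NODES, GFLOPS_PER_NODE)
total_memory_gb = aggregate_memory_gb(COMPUTE_NODES, MEMORY_GB_PER_NODE)

print(f"Aggregate compute: {total_tflops:.3f} TFLOPS")   # 18 x 3072 GFLOPS
print(f"Aggregate memory:  {total_memory_gb} GB")        # 18 x 192 GB
```

The result, 55.296 TFLOPS, matches the rounded figure of 55 TFLOPS quoted above, and the same multiplication gives 3456 GB of memory across the 18 compute nodes.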

The journey to establishing the HPC facility has not been without its challenges. Because this was unfamiliar territory, EthERNet staff had a lot of learning and discovering to do. Dedication from the team and assistance from the vendor were essential in making this initiative a success. Power interruptions presented another obstacle. Since the system runs multiple computational jobs concurrently, interruptions in power supply can disrupt ongoing computations, leading to data corruption and system downtime. Backup generators and UPS systems have been installed to cushion against this.

As Ethiopia’s research landscape continues to evolve, EthERNet remains committed to expanding its HPC capabilities to meet growing computational demands. Through strategic investments in more powerful hardware such as processors, accelerators (GPUs), memory modules, and larger-capacity storage systems, as well as in cloud storage solutions and alternative sources of energy, EthERNet aims to enhance its infrastructure and ensure uninterrupted access to cutting-edge computing resources for researchers nationwide.

EthERNet’s HPC facility stands as a testament to the power of collaboration, innovation, and perseverance. By empowering researchers with the tools they need to thrive, EthERNet is not just driving academic excellence but also shaping the future of scientific discovery in Ethiopia and beyond.
