Cassava, a woody shrub that is part of the spurge family, is extensively cultivated as an annual crop in tropical and subtropical regions for its edible starchy tuberous root. It is a major source of carbohydrates and the third-largest source of food carbohydrates in the tropics. As a major staple food in the developing world, it provides a basic diet for over half a billion people. Despite being one of the most drought-tolerant crops, capable of growing on marginal soils, it is highly susceptible to distinct species of circular single-stranded DNA viruses that are whitefly-transmitted and primarily infect cassava plants. Nine species of cassava-infecting geminiviruses have been identified across Africa and India based on genomic sequencing and phylogenetic analysis. This number is expected to grow due to a high rate of natural transformation.
Ongoing efforts to preserve this vital food source and maintain sustainable growing methods in Africa include collaborative research that spans countries and continents. Prof. Elijah Ateka, Associate Professor of Plant Virology at Jomo Kenyatta University of Agriculture and Technology (JKUAT), and Dr. Laura Boykin from the University of Western Australia collaborate on a research project to diagnose viruses that infect cassava and to sequence whole cassava virus genomes in order to help eradicate infection.
“The project is aimed at developing diagnostics for and characterization of the viruses that infect cassava as well as the vectors of these viruses. Specifically, the project focuses on understanding the threat from evolving viruses and vectors affecting cassava, and training farmers to understand the causes, manifestation and management of virus diseases and therefore build sustainable national capacity. The main diseases are cassava mosaic disease (CMD) and cassava brown streak disease (CBSD), which are both transmitted by whitefly. To characterize viruses and whiteflies, lots of sequence data are generated, for which we collaborate with the University of Western Australia (UWA) for bioinformatics,” says Prof. Ateka in a recent interview with KENET.
Transmission of Enormous Data Sets
However, the research was data-intensive, which made data sharing between Prof. Ateka and his Australian counterpart tedious and, in most cases, unsuccessful. The main data generated were sequence data, with data sets in the range of 1-5 GB. In some instances, Dr. Boykin had to travel from Australia to Kenya to physically transfer and access data.
The KENET Virtual Lab (VLab), a research cloud computing platform developed by Kenya Education Network (KENET), provided the solution. The platform enabled high-speed data transfer between researchers on two continents, nearly 10,000 kilometres apart. KENET provisioned a free virtual server with 5TB of storage on the KENET research cloud to facilitate data sharing between the researcher based in Kenya and the one in Australia. Once they were provisioned on the KENET VLab, which can accommodate up to 100TB of disk space, they were able to collaborate virtually and exchange data at high speed and in real time.
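To give a rough sense of why data sets of 1-5 GB are tedious to share over a slow connection, the short Python sketch below estimates idealised transfer times. The link speeds used (2, 100 and 1000 Mbit/s) are hypothetical values chosen for illustration only; the article does not state the actual bandwidth available to the researchers or provided by the KENET VLab. The function names are likewise illustrative, and the calculation ignores protocol overhead and latency.

```python
def transfer_seconds(size_gb: float, link_mbps: float) -> float:
    """Idealised time in seconds to move size_gb gigabytes (decimal GB)
    over a link of link_mbps megabits per second, ignoring overhead."""
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB = 8000 megabits
    return size_megabits / link_mbps


def format_duration(seconds: float) -> str:
    """Render a number of seconds as 'Hh MMm SSs'."""
    hours, remainder = divmod(int(round(seconds)), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    # Data set sizes come from the article; link speeds are assumptions.
    for size_gb in (1, 5):
        for link_mbps in (2, 100, 1000):
            duration = format_duration(transfer_seconds(size_gb, link_mbps))
            print(f"{size_gb} GB at {link_mbps} Mbit/s: {duration}")
```

Under these assumptions, a 5 GB data set takes more than five and a half hours on a 2 Mbit/s link but under a minute at 1000 Mbit/s. Real-world times would be longer because of protocol overhead and interrupted transfers.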
Connecting Data. Connecting People.
The project worked with researchers in Tanzania, Uganda and Kenya, which are among the areas where cassava viruses most affect people. An estimated 800 million people rely on the carbohydrate-rich cassava for consumption or as a source of income. As a result of the project, farmers have since recorded higher yields due to the genomic technologies used to improve the management of cassava viruses. Dr. Boykin and Prof. Ateka intend to publish their findings in an open access journal.
“International and local collaboration has aided our research in many ways. We have been able to gain expertise within the team that we did not have. Collaborating with KENET has also helped us to interact and share data easily. In addition, such collaboration has enabled us to gain access to state-of-the-art laboratory facilities we do not have in the region,” notes Prof. Ateka.
The KENET VLab allows staff, faculty and students within the KENET community to register on the cloud environment and use computing resources by self-provisioning a virtual machine within a few seconds, as well as to store and share data. The KENET virtual computing lab adopts a clustered model that runs on the Ganeti cluster management tool together with Synnefo, an open source cloud management tool, to deploy a massively scalable cloud-based virtual computing lab integrated with several automated services that enhance its usability.
Just like Prof. Ateka and Dr. Boykin, many researchers work with large qualitative and/or quantitative data sets that require a platform on which they can be stored, processed or analyzed. Big Data (extremely large data sets) therefore calls for optimal computing resources for data sharing, analysis and storage. NRENs like KENET play an important role in providing the relevant services for data sharing, processing and analysis.
Ultimately, the success of an NREN is determined by how well it accomplishes both its education and research mandates. By providing high-speed internet coupled with key research infrastructure, NRENs are able to support the needs of their research communities and foster cross-border collaborations.