Knowledge Discovery: Making Sense of Big Data
Implementing Algorithms for Data Analytics
In an age where we are buried in information, how do we make sense of it?
Graduate student in computer science Wellington Cabrera (Ph.D., ’17) worked on implementing
algorithms for data mining and analytics.
One of technology’s many advantages is the ease with which we can collect information
on everything from internet browsing patterns to activity levels recorded by wearable
trackers to weather patterns, all of which allow for more accurate insights. This
ease of data collection, however, has its own drawback, as the sheer amount can be
overwhelming.
Data Mining: Teasing Out Hidden Patterns
While working on his graduate degree in computer science, Wellington Cabrera (Ph.D., ’17) sought to address this problem by creating and implementing algorithms for data analytics and mining in database systems. This research was performed under the guidance of Carlos Ordonez, associate professor of computer science in the College of Natural Sciences and Mathematics.
In this deluge of information, with all of its tangled implications, data mining works to tease out the hidden patterns.
“Another name for data mining is ‘knowledge discovery,’” Cabrera noted.
Parallel Computing Increases Speed and Storage Capabilities
Cabrera’s research aimed to develop algorithms that work on parallel database systems. Often, database systems are distributed across multiple computers, a strategy termed parallel computing. Although this increases a database’s speed and storage capabilities, it also requires an adjustment in how tasks are performed.
“Developing an algorithm for a single computer requires a lot of sequential steps,” Cabrera said. “For parallel systems, the challenge is getting these multiple computers to work together to solve problems.”
Scalable Algorithms
Cabrera focused on scalable algorithms, which aim to deliver comparable performance regardless of a database’s size.
“You want an algorithm that can work just as well with two computers as it does with 1,000,” Cabrera said. “When you have many computers working together, you tend to see a degradation. If the algorithm does not coordinate the parallel processing correctly, then the computers cannot work together in the right manner, becoming a mess.”
Overcoming Challenges
During his time as a graduate student, Cabrera faced many of the typical challenges
of juggling coursework, research and his responsibilities as a teaching assistant,
all while trying to plan for the next step in his career.
“To get a Ph.D., you have to overcome many obstacles,” Cabrera said.
This hard work ultimately paid off, as Cabrera landed several internships, published numerous papers in well-respected journals, and, after graduation, was offered a job in the tech industry.
“I am very stubborn,” Cabrera said. “I don’t like to give up.”
- Rachel Fairbank, College of Natural Sciences and Mathematics
November 30, 2017