data science

‘This revolution will continue’

Prasun Chaudhuri
Posted on 22 Oct 2024
06:39 AM
istock.com/nicoelnino

Data is the new oil, goes the popular saying. And extracting it is a slew of data scientists. What is it that a data scientist does? Prasun Chaudhuri asks Subrata Das. Das is a data scientist who has authored several books, including Computational Business Analytics and High-Level Data Fusion. He is currently an adjunct faculty member at Northeastern University in Boston, US, teaching generative AI.

q How have artificial intelligence (AI) and machine learning (ML) revolutionised the field of data science?

Chiefly by extracting complex patterns and insights from large volumes of data. We see the impact of AI and ML in every sphere of life. Examples include product and movie recommendations, ticket booking, medical decision making, drug discovery, remote surgery, autonomous robot guidance on Earth and in outer space, weather prediction, and crime surveillance, to name just a few. An AI system exhibits some form of human-like intelligence, including the ability to continuously learn from interacting with the environment, much like we do. This revolution will continue in the years to come.

q Where do we go from here?

What we have built so far is far from artificial general intelligence (AGI). Large language models (LLMs), such as ChatGPT, are a step toward AGI and excel at tasks like searching, answering questions, generating essays and summaries, translating and providing explanations, but currently they are trained on limited data. Moreover, LLMs are poor at tasks fundamental to achieving true intelligence, such as logical reasoning, mathematical derivation and induction. True AI would be capable of learning from its environment, making plans and decisions, reacting to the environment with appropriate actions, expressing emotions or revising beliefs. The paranoia around the speculation that AI will soon become more intelligent than humans is meaningless. As for AGI, Google DeepMind’s approach of child-like learning through observation of its environment to play games is revolutionary.

q What does a new-age data scientist do?

New-age data scientists perform a variety of tasks: scoping out problems in collaboration with business partners, preparing data to feed algorithms, building models and running analytics with various machine learning and deep learning libraries, visualising data to present results to the business, and assisting in delivering the models to end-users. The primary programming language used is Python. Often, a data scientist spends more than 80 per cent of their time searching for and preparing relevant data, whether it resides in the cloud or on company premises.

q What skills does a data scientist need?

An aspiring data scientist can make one of three broad career choices, and the skill requirements will depend on the choice made. Do you want to work under a chief data scientist? Do you want to be part of the mainstream professional community? Or do you want to delve deeply into the mathematical or statistical foundations of the field?

The first category involves the time-consuming and laborious process of preparing data and generating reports, mostly by querying existing data sources. The second category includes most of the so-called professional data scientists. These individuals started their careers in a variety of disciplines but later obtained a master’s degree from one of the many universities that have recently begun offering degrees in data science. They view analytics as a “bag of tricks”, adopting whichever techniques best solve the current problem.

Fewer than 2 per cent of professionals fall into the third category, though this number will grow as the data science field matures. Professionals in this group hold degrees in foundational disciplines such as mathematics, probability and statistics, linear algebra, and the broader theory of computer science and artificial intelligence. You cannot expect someone without a proper mathematical background to perform tasks in this category.

q What are the degrees needed to become a data scientist?

Typically, industries look for candidates with a bachelor’s or master’s degree in computer science or a similar field, plus some experience. Data scientists come from various backgrounds, primarily because educational institutions offer one- or two-year courses in data science, attracting students from a range of disciplines. Not all of them have a strong background in mathematics or statistics.

q What is your advice to a high school student who wants to be a data scientist?

Take a strong interest in advanced statistics and probability and in various disciplines of mathematics, such as advanced calculus, algebra and geometry. Work on some projects in Python that solve interesting problems and keep up with current developments in the field of AI in general. Try reading technical articles, but focus on mastering the art of filtering out shallow and inaccurate material.

Das has worked as a data scientist in several multinational companies and authored several books on data science and AI, including Computational Business Analytics and High-Level Data Fusion. He is presently an adjunct faculty member at Northeastern University in Boston, teaching generative AI.

Last updated on 22 Oct 2024
06:41 AM