Data science in short
What is data science?
Data science is the analysis and interpretation of complex digital data. The data may come from a website, an application, a system, or other software, and the goal of analyzing it is to support informed decisions. Data scientists design and model systems and algorithms that automate data analysis and deploy it at scale. Concretely, they build a mathematical representation of a phenomenon: they look for a mathematical relationship between measured values and the phenomenon in question. The results mainly let them predict and anticipate future situations, actions, or behaviors, such as the weather or the average shopping basket, so that they can react accordingly. These are simple examples; data science also supports complex analyses that can guide strategic decision-making.
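To make the idea of "a mathematical relationship between measured values and the phenomenon" concrete, here is a minimal sketch that fits a straight line to a handful of measurements using ordinary least squares and then uses it to predict a new value. The data points and the function names fit_line and predict are invented for illustration; real projects usually rely on a dedicated library and far richer models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted relationship to a new measurement."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical measurements, chosen so that y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)
forecast = predict(model, 5.0)
print(forecast)  # 11.0
```

The fitted slope and intercept are the "mathematical representation"; calling predict on an unseen input is the anticipation step described above.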
Data science project lifecycle
A data science project is divided into five stages:
- Data collection: extracting and collecting the data relevant to the project.
- Data preparation: cleaning and formatting the data so that it can be processed by a machine, for example by removing missing values or unifying the representation of dates.
- Data exploration: understanding the available data using statistical methods.
- Model building: constructing prediction models.
- Deployment: putting the project into production at scale.
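The preparation and exploration stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names "date" and "amount", and the list of accepted date formats are all hypothetical, and dropping rows with missing values is just one of several possible strategies.

```python
from datetime import datetime
from statistics import mean

# Hypothetical raw records; values are invented for illustration.
RAW_ROWS = [
    {"date": "2023-01-15", "amount": "42.5"},
    {"date": "15/02/2023", "amount": None},   # missing value
    {"date": "03.03.2023", "amount": "17"},
]

# Date formats we assume may appear in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def parse_date(text):
    """Try each assumed format and return the date as an ISO 8601 string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def prepare(rows):
    """Drop rows with a missing amount and unify the date representation."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue  # preparation: remove missing values
        cleaned.append(
            {"date": parse_date(row["date"]), "amount": float(row["amount"])}
        )
    return cleaned


CLEAN_ROWS = prepare(RAW_ROWS)

# Exploration: a first descriptive statistic on the cleaned data.
AVERAGE_AMOUNT = mean(row["amount"] for row in CLEAN_ROWS)
print(CLEAN_ROWS)
print(AVERAGE_AMOUNT)
```

After preparation, every date uses the same ISO format and every amount is a number, so statistical methods can be applied without special cases.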
Required languages
Python, R, Scala (if you work in a Big Data environment), Java and SQL.
Some applications
- Bank fraud detection
- Customer sentiment analysis
- Personalized product recommendation
- Prediction of a value (for example, a company's turnover) from a set of parameters
- Facial recognition, speech synthesis, etc.
Careers
Many careers are open to people working in data science:
- Lead Data Scientist
- Data Engineer
- Data Analyst
- Data Architect
- Data Administrator
- Business Analyst
- Data Manager