DECA-DS: Data Science and Big Data Analytics


• Duration: 5 Days
• Mode of Delivery: Online, instructor-led training
• Job role: AI Engineer, Developer, ML Specialist
• Preparation for exam: DEA-7TT2
• Cost: USD $2,900.00

This course focuses on the practice of data analytics, the role of the Data Scientist, the main phases of the Data Analytics Lifecycle, analyzing and exploring data with R, statistics for model building and evaluation, the theory and methods of advanced analytics and statistical modeling, the technology and tools that can be used for advanced analytics, operationalizing an analytics project, and data visualization techniques. Successful candidates will be able to attempt the exam to achieve the Dell EMC Professional – Data Science Associate credential.



This course is intended for individuals seeking to develop an understanding of Data Science from the perspective of a practicing Data Scientist, including:
• Managers of teams of business intelligence, analytics, and big data professionals
• Current business and data analysts looking to add big data analytics to their skills
• Data and database professionals looking to exploit their analytic skills in a big data environment
• Recent college graduates and graduate students with academic experience in a related discipline looking to move into the world of data science and big data


To complete this course successfully and gain the maximum benefits from it, a student should have the following knowledge and skill sets:
• A strong quantitative background with a solid understanding of basic statistics, as would be found in a statistics 101 level course
• Experience with a scripting language, such as Java, Perl, or Python (or R). Many of the lab examples taught in the course use R (with an RStudio GUI), which is an open source statistical tool and programming language.
• Experience with SQL

Skills Gained

Upon successful completion of this course, participants should be able to:
• Identify the prerequisites for a Big Data project
• Gain familiarity with data analytic methods using R
• Use statistical methods for evaluation
• Understand network analysis and data visualization concepts
• Work with Clustering Algorithms
• Work with Association Rules
• Work with Regression
• Work with Classification
• Work with Time Series Analysis
• Work with Text Analysis

Course outline

Module 1: Introduction to Big Data Analytics
• Big Data Overview
• State of the Practice in Analytics
• The Data Scientist

Module 2: Big Data Analytics in Industry Verticals
• Data Analytics Lifecycle
• Discovery
• Data Preparation
• Model Planning
• Model Building
• Communicating Results
• Operationalizing

Module 3: Data Analytic Methods Using R
• Basic features of R
• Introduction to R
• Using R to Look at Data

Module 4: Exploring and Analyzing Data
• Statistical methods for evaluation

Module 5: Clustering Algorithms
• Centroid
• Clustering
• K-means
• Unsupervised Learning
• Within Sum of Squares (WSS)
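
As a rough illustration of how the Module 5 terms fit together, the sketch below runs a minimal K-means on a toy data set and reports the Within Sum of Squares (WSS). It is written in Python for illustration only (the course labs use R), and the function names, starting centroids, and data points are all hypothetical.

```python
# Minimal K-means sketch with fixed starting centroids (hypothetical data).

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members (keep it if it has none)."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == i]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
        labels = assign(points, centroids)
    return centroids, labels


def wss(points, centroids, labels):
    """Within Sum of Squares: squared distance of each point to its own centroid."""
    return sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = kmeans(points, [(1, 1), (8, 8)])
total_wss = wss(points, centroids, labels)
print(labels)               # [0, 0, 0, 1, 1, 1]
print(round(total_wss, 4))  # 2.6667
```

Running the same data with different values of k and comparing the resulting WSS is one common way to judge how many clusters are reasonable.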

Module 6: Association Rules
• Association Rules
• Apriori Algorithm
• Support
• Confidence
• Lift
• Leverage
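
The four rule measures listed under Module 6 can be illustrated with a small sketch. The Python below is illustrative only (the course labs use R); the transactions and function names are hypothetical, and the formulas are the commonly used definitions: support as the fraction of transactions containing an itemset, confidence as Support(X and Y) / Support(X), lift as Support(X and Y) / (Support(X) * Support(Y)), and leverage as Support(X and Y) - Support(X) * Support(Y).

```python
# Support, confidence, lift, and leverage for a rule X -> Y (hypothetical data).

def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(x, y, transactions):
    """How often Y appears in transactions that contain X."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def lift(x, y, transactions):
    """Ratio of observed joint support to that expected if X and Y were independent."""
    joint = support(set(x) | set(y), transactions)
    return joint / (support(x, transactions) * support(y, transactions))


def leverage(x, y, transactions):
    """Difference between observed joint support and that expected under independence."""
    joint = support(set(x) | set(y), transactions)
    return joint - support(x, transactions) * support(y, transactions)


transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

x, y = {"bread"}, {"milk"}
print(round(support(x | y, transactions), 4))   # 0.6
print(round(confidence(x, y, transactions), 4)) # 0.75
print(round(lift(x, y, transactions), 4))       # 0.9375
print(round(leverage(x, y, transactions), 4))   # -0.04
```

A lift below 1 and a negative leverage, as in this toy example, suggest that the two items occur together slightly less often than independence would predict.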

Module 7: Regression
• Categorical Variable
• Linear Regression
• Residuals
• Logistic Regression
• Ordinary Least Squares (OLS)
• Receiver Operating Characteristic (ROC) Curve
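
To make the Ordinary Least Squares and residuals topics in Module 7 concrete, here is a minimal sketch of a one-variable linear fit. It is written in Python for illustration only (the course labs use R), covers linear regression but not logistic regression, and uses hypothetical data and function names.

```python
# Simple linear regression by Ordinary Least Squares (hypothetical data).

def ols_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Observed value minus fitted value for each observation."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = ols_fit(xs, ys)
errors = residuals(xs, ys, intercept, slope)
print(round(intercept, 4), round(slope, 4))  # 0.05 1.99
```

With an intercept in the model, the OLS residuals sum to zero (up to floating-point error), which is a quick sanity check on any fit.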

Module 8: Classification
• Classification Learning
• Decision Tree
• Naïve Bayes
• ROC curve
• Confusion matrix
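
The confusion matrix and ROC curve topics in Module 8 are closely linked: each point on a ROC curve is the true positive rate and false positive rate of one confusion matrix. The sketch below shows this for a single set of predictions. It is written in Python for illustration only (the course labs use R), and the labels and function names are hypothetical.

```python
# Confusion matrix counts and the rates derived from them (hypothetical labels).

def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


def rates(counts):
    """True positive rate, false positive rate, and accuracy."""
    tpr = counts["TP"] / (counts["TP"] + counts["FN"])
    fpr = counts["FP"] / (counts["FP"] + counts["TN"])
    accuracy = (counts["TP"] + counts["TN"]) / sum(counts.values())
    return tpr, fpr, accuracy


actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

counts = confusion_matrix(actual, predicted)
tpr, fpr, accuracy = rates(counts)
print(counts)              # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
print(tpr, fpr, accuracy)  # 0.75 0.25 0.75
```

Repeating the calculation at different classification thresholds and plotting the (FPR, TPR) pairs traces out the ROC curve.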

Module 9: Time Series Analysis
• Stationarity
• Time series
• Autocorrelation Function (ACF)
• Autoregressive (AR)
• Moving Average (MA)
• Partial Autocorrelation Function (PACF)
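
As a small illustration of the Autocorrelation Function (ACF) from Module 9, the sketch below computes the sample autocorrelation of a series at a given lag, using the common estimator that divides the lagged sum of products of deviations by the total sum of squared deviations. It is written in Python for illustration only (the course labs use R), and the series and function name are hypothetical.

```python
# Sample autocorrelation at a given lag (hypothetical series).

def acf(series, lag):
    """Sample autocorrelation of a series at a non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator


series = [1, 2, 3, 4, 5]
print(acf(series, 0))            # 1.0
print(round(acf(series, 1), 4))  # 0.4
print(round(acf(series, 2), 4))  # -0.1
```

Plotting these values against the lag gives the ACF plot that, together with the PACF, is typically used to suggest AR and MA orders.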

Module 10: Text Analysis
• Term
• Corpus
• Text normalization
• Term Frequency – Inverse Document Frequency (TF-IDF)
• Topic modelling
• Sentiment Analysis
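
To show how the term, corpus, and TF-IDF topics in Module 10 relate, here is a minimal sketch. TF-IDF has several variants; this one assumes term frequency as the term count divided by document length and inverse document frequency as log(N / df), and its only text normalization is lowercasing and whitespace splitting. It is written in Python for illustration only (the course labs use R), and the corpus and function names are hypothetical.

```python
import math

# TF-IDF for a tiny corpus, using one common variant of the formula.

def tokenize(text):
    """Very simple normalization: lowercase and split on whitespace."""
    return text.lower().split()


def term_frequency(term, tokens):
    """Share of the document's tokens that are the given term."""
    return tokens.count(term) / len(tokens) if tokens else 0.0


def inverse_document_frequency(term, corpus_tokens):
    """log(N / df), where df is the number of documents containing the term."""
    df = sum(1 for tokens in corpus_tokens if term in tokens)
    return math.log(len(corpus_tokens) / df) if df else 0.0


def tfidf(term, tokens, corpus_tokens):
    """Weight of a term in one document relative to the whole corpus."""
    return term_frequency(term, tokens) * inverse_document_frequency(term, corpus_tokens)


corpus = ["The cat sat", "The dog sat", "The cat ran"]
corpus_tokens = [tokenize(doc) for doc in corpus]

print(tfidf("the", corpus_tokens[1], corpus_tokens))            # 0.0
print(round(tfidf("dog", corpus_tokens[1], corpus_tokens), 4))  # 0.3662
```

A term that appears in every document, such as "the" here, gets a weight of zero, while a term unique to one document gets the highest weight.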

Module 11: Advanced analytics—technology and tools
• Introduction to advanced analytics—technology and tools
• Hadoop ecosystem
• In-database analytics: SQL essentials
  o SQL Queries
  o Regular expressions
  o User-defined functions
  o Window functions
• Advanced SQL
• MADlib

Module 12: Putting it all together
• Preparing to operationalize
• Preparing project presentations
• Data visualization techniques


Click on the following link to see the current Course Schedule
Our minimum class size for this course is 3.
If there are no scheduled dates for this course, it can be customized to suit clients' time and skill needs, and it can be held online, at a rented location, or at your premises.
To arrange a custom course, use the following link: Enquire about a course date

Product Information

Data is created constantly, and at an ever-increasing rate. Mobile phones, social media, imaging technologies used to determine a medical diagnosis: all these and more create new data, which must be stored somewhere for some purpose. Devices and sensors automatically generate diagnostic information that needs to be stored and processed in real time. Merely keeping up with this huge influx of data is difficult, but substantially more challenging is analyzing vast amounts of it, especially when it does not conform to traditional notions of data structure, to identify meaningful patterns and extract useful information. These challenges of the data deluge present the opportunity to transform business, government, science, and everyday life.
“Big Data” is data whose scale, diversity, and complexity require new architectures, techniques, algorithms, and analytics to manage it and to extract value and hidden knowledge from it.

Although the volume of Big Data tends to attract the most attention, generally the variety and velocity of the data provide a more apt definition of Big Data. (Big Data is sometimes described as having 4 Vs: volume, variety, veracity and velocity.)

Due to its size or structure, Big Data cannot be efficiently analyzed using only traditional databases or methods. Big Data problems require new tools and technologies to store and manage the data and to realize business benefits from it.

The players in this field are data analysts, data engineers and data scientists: 
An effective data analyst will take the guesswork out of business decisions and help the entire organization thrive. The data analyst must be an effective bridge between different teams by analyzing new data, combining different reports, and translating the outcomes. In turn, this is what allows the organization to maintain an accurate pulse check on its growth.
The data scientist will uncover hidden insights by applying both supervised (e.g. classification, regression) and unsupervised (e.g. clustering, anomaly detection) learning methods in their machine learning models. They are essentially training mathematical models that will allow them to better identify patterns and derive accurate predictions.
Data engineers establish the foundation that the data analysts and scientists build upon. Data engineers are responsible for constructing data pipelines and often have to use complex tools and techniques to handle data at scale. Unlike the previous two career paths, data engineering leans much more toward a software development skill set. At larger organizations, data engineers can have different focuses, such as leveraging data tools, maintaining databases, and creating and managing data pipelines. Whatever the focus may be, a good data engineer allows a data scientist or analyst to focus on solving analytical problems, rather than having to move data from source to source. The data engineer’s mindset is often more focused on building and optimization.

Additional Information

CANCELLATION POLICY – There is never a fee for cancelling, for any reason, at least seven business days before a class. Data Vision Systems reserves the right to cancel any course due to insufficient registration or other extenuating circumstances. Participants will be notified before any such cancellation.

