Big Data Tutorial

"Big Data" refers to datasets that traditional databases cannot support because they contain a very large volume of data, come in a variety of formats and structures, and move at high velocity. The term "Big Data" originated in 1997, became a hot term, and then cooled in use. Volume, variety and velocity are known as the three Vs of Big Data.

Since the three Vs were identified, more Vs have been brought forward. The 10 Vs of Big Data and Data Science are:

  • Volume: very large amounts of data, measured in terabytes and above.
  • Variety: widely differing formats, such as free-form text, images and audio recordings, in addition to traditional structured data.
  • Velocity: the speed at which data comes into the system. Major websites like Twitter and Netflix have tens of millions of customers who are generating data.
  • Volatility: the varying useful lifetime of data. Data may quickly become stale.
  • Variability: inconsistencies in data - variations in arrival time or form.
  • Veracity: data which is regarded as trustworthy because it has a known and reliable source.
  • Validity: accurate and timely data that is fit for its intended use.
  • Vulnerability: potential for data breaches which expose the private data of persons to wrongdoers. With Big Data, more personal information can be exposed in a single incident, meaning greater vulnerability.
  • Visualization: showing data through data visualization tools is challenging due to data volume, variety and velocity.
  • Value: ability to produce helpful results such as better insights and decisions resulting in: improved customer service, better products, greater revenue, reduced cost and/or managed risk.
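
Several of the Vs above can be estimated with simple measurements. The following is a minimal Python sketch, not a standard method: the record fields (`arrived`, `format`, `size_bytes`), the function name `profile_batch` and the one-hour useful lifetime are all assumptions made for illustration. It reports rough figures for Volume, Variety, Velocity and Volatility on a small batch of records.

```python
from collections import Counter
from datetime import datetime, timedelta


def profile_batch(records, now, max_age):
    """Return rough volume, variety, velocity and volatility figures.

    Each record is a dict with hypothetical keys: 'arrived' (datetime),
    'format' (str) and 'size_bytes' (int). The batch must not be empty.
    """
    if not records:
        raise ValueError("records must not be empty")

    # Volume: total payload size in bytes
    total_bytes = sum(r["size_bytes"] for r in records)

    # Variety: how many records arrived in each format
    formats = Counter(r["format"] for r in records)

    # Velocity: records per second over the observed arrival window
    times = [r["arrived"] for r in records]
    span_seconds = (max(times) - min(times)).total_seconds()
    velocity = len(records) / span_seconds if span_seconds > 0 else None

    # Volatility: share of records older than the assumed useful lifetime
    stale = sum(1 for t in times if now - t > max_age)

    return {
        "volume_bytes": total_bytes,
        "variety": dict(formats),
        "velocity_per_sec": velocity,
        "stale_fraction": stale / len(records),
    }


# Made-up sample data for illustration only
now = datetime(2020, 1, 1, 12, 0, 0)
batch = [
    {"arrived": now - timedelta(seconds=10), "format": "json", "size_bytes": 512},
    {"arrived": now - timedelta(seconds=6), "format": "text", "size_bytes": 2048},
    {"arrived": now - timedelta(seconds=2), "format": "json", "size_bytes": 256},
    {"arrived": now - timedelta(hours=2), "format": "image", "size_bytes": 1_000_000},
]
profile = profile_batch(batch, now=now, max_age=timedelta(hours=1))
print(profile)
```

In a real system these figures would come from storage and pipeline metrics rather than an in-memory list, and the useful lifetime would depend on how the data is used.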

Big Data is important because it provides input to data science approaches like Machine Learning (ML) and Artificial Intelligence (AI). Use of these approaches has led to innovative breakthroughs. At the same time, the term Big Data is fading as large volumes of data are now the norm rather than cutting-edge exceptions.

Big Data 10 Vs

Copyright 2006-2020 by Infogoal, LLC. All Rights Reserved.
