Flat Files and Flattened Data

Databases organized into tables, rows and columns, with relationships between tables, are not the answer to every data challenge. DBAs may want to load or back up database tables. Data Engineers may want to move data from place to place using Data Pipelines. Data Analysts may want to chart data in a spreadsheet or BI tool. Data Scientists may want to train analytical models to make predictions or load data into a database for analysis. Data Publishers may want to make data available for download. The answer to these challenges is the humble flat file and flattened / denormalized data.

What are Flat Files and Flattened Data?

Flat Files are plain text files stored in a file system, not a database. Flat files are collections of records, which in turn consist of fields, each a single piece of information. Files stored in spreadsheet format such as Excel are also often referred to as flat files. Terms related to flat files include:

  • Header Record: a record that contains the names of the fields in the following records. This provides documentation and facilitates data manipulation.
  • Delimited Flat File: a flat file where an identified character known as a delimiter, such as a comma or tab, separates fields.
  • Comma Separated Values (CSV): a delimited flat file where a comma character delimits fields.
  • Fixed-format Record: a record layout where each field is positioned at the same location in every record, resulting in records of equal and consistent length. The fixed-format approach was commonly used in COBOL applications.
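
As a minimal sketch of the terms above, the following Python reads a comma-delimited flat file that starts with a header record, and then a fixed-format file whose fields sit at known positions. The sample data, the field names and the column layout are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical sample data: a CSV flat file whose first line is a header record.
CSV_TEXT = "customer_id,name,city\n1,Ada Lovelace,London\n2,Grace Hopper,New York\n"


def read_delimited(text, delimiter=","):
    """Parse a delimited flat file; the header record supplies the field names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


# Assumed fixed-format layout: (field name, start offset, end offset).
FIXED_LAYOUT = [("customer_id", 0, 5), ("name", 5, 25), ("city", 25, 40)]

# Build the fixed-format sample by padding each field to its assumed width,
# so every record has the same length (5 + 20 + 15 = 40 characters).
FIXED_TEXT = "\n".join(
    cid.ljust(5) + name.ljust(20) + city.ljust(15)
    for cid, name, city in [
        ("00001", "Ada Lovelace", "London"),
        ("00002", "Grace Hopper", "New York"),
    ]
)


def read_fixed(text, layout):
    """Parse fixed-format records by slicing each line at the layout positions."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        records.append(
            {name: line[start:end].strip() for name, start, end in layout}
        )
    return records


csv_rows = read_delimited(CSV_TEXT)
fixed_rows = read_fixed(FIXED_TEXT, FIXED_LAYOUT)
```

Note that a fixed-format file carries no header record in this sketch, so the layout has to be documented separately, whereas the CSV file describes its own fields.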

Flattened / Denormalized Data is data gathered from related database tables or flat file records into a single table or record, or a reduced number of them. This reverses the process of normalization, where data is organized so that each fact is stored once, avoiding duplication. So why flatten or denormalize?

  • Retrieval Performance: selecting data from flat structures avoids joins across multiple tables, which slow performance.
  • Context: database surrogate keys, which have no business meaning, can be translated to business terms, which helps data analysts explore data and data engineers load it.
  • Predictive Analytics Algorithms: these require flattened data input, often supplied as flat files. Categories of analytical models that train on flat data include regression, clustering, decision trees and neural networks. See diagram below.

Flat Files and Predictive Analytics
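
The flattening described earlier can be sketched in a few lines of Python. This is an illustrative example only: the `customers`, `products` and `orders` structures, their key names and their values are hypothetical stand-ins for normalized database tables. Each order's surrogate keys are replaced with business attributes, and the result is written as a CSV flat file with a header record.

```python
import csv
import io

# Hypothetical normalized tables, keyed by surrogate key.
customers = {
    101: {"customer_name": "Acme Corp", "region": "West"},
    102: {"customer_name": "Globex", "region": "East"},
}
products = {
    7: {"product_name": "Widget"},
    8: {"product_name": "Gadget"},
}
orders = [
    {"order_id": 1, "customer_key": 101, "product_key": 7, "quantity": 3},
    {"order_id": 2, "customer_key": 102, "product_key": 8, "quantity": 1},
    {"order_id": 3, "customer_key": 101, "product_key": 8, "quantity": 5},
]


def flatten_orders(orders, customers, products):
    """Return one flat record per order, with surrogate keys translated
    into the business attributes they point to."""
    flat = []
    for order in orders:
        row = {"order_id": order["order_id"], "quantity": order["quantity"]}
        row.update(customers[order["customer_key"]])
        row.update(products[order["product_key"]])
        flat.append(row)
    return flat


def to_csv(rows):
    """Serialize flat records as CSV text, starting with a header record."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


flat_rows = flatten_orders(orders, customers, products)
flat_csv = to_csv(flat_rows)
print(flat_csv)
```

The customer name and region now repeat on every order for that customer. That duplication is the cost of denormalizing, accepted in exchange for records that can be read, charted or fed to a model without joins.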

Copyright 2006-2020 by Infogoal, LLC. All Rights Reserved.
