Oracle Big Data Fundamentals Ed 2

Course code
OBDF
Duration
5 days, 40 academic hours
Course description

Course overview

In the Oracle Big Data Fundamentals course, you learn about big data, the technologies used to process it, and Oracle’s solutions for handling it. You also learn to use Oracle Big Data Appliance to process big data, and gain hands-on experience with the Oracle Big Data Lite VM. You identify how to acquire raw data from a variety of sources, and learn to use HDFS and Oracle NoSQL Database to store the data. You learn about the data integration options available in Oracle Big Data. These include Oracle Big Data Connectors, which move data to and from Oracle Database; Oracle Data Integrator and Oracle GoldenGate for Big Data, which provide integration and synchronization capabilities for unifying relational and Hadoop data; and Oracle Big Data SQL, which enables dynamic, integrated access to all of your data, whether it is stored in HDFS, NoSQL, or Oracle Database. Finally, you learn how to analyze your big data using Oracle Big Data SQL, Oracle Advanced Analytics, and Oracle Big Data Spatial and Graph.

Objectives

Course Objectives

  • Define Big Data
  • Describe Oracle’s Integrated Big Data Solution and its components
  • Define Cloudera’s distribution of Hadoop, its core components, and the Hadoop ecosystem
  • Use the Hadoop Distributed File System (HDFS)
  • Acquire big data using the Command Line Interface, Flume, and Oracle NoSQL Database
  • Process big data using MapReduce, YARN, Hive, Oracle XQuery for Hadoop, Solr, and Spark
  • Integrate big data and warehouse data using Sqoop, Oracle Big Data Connectors, Copy to Hadoop, Oracle Data Integrator, Oracle GoldenGate for Big Data, and Oracle Big Data SQL
  • Analyze big data using Oracle Big Data SQL, Oracle Big Data Spatial and Graph, and Oracle Advanced Analytics technologies
  • Use and manage Oracle Big Data Appliance
  • Identify the key features and benefits of Oracle Big Data Cloud Service
  • Identify the key features and benefits of Oracle Big Data Cloud Service - Compute Edition

Requirements

Suggested Prerequisites

  • Database Basics and Administration
  • Exposure to Big Data

Audience

  • Application Developers
  • Database Administrators
  • Database Developers

Course program

Course Topics

Introduction

  • Reviewing the Available Big Data Documentation, Tutorials, and Other Resources
  • Course Road Map
  • Course Objectives
  • Starting the Oracle BDLite VM and Accessing the Practice Files
  • Questions About You
  • Oracle Big Data Lite (BDLite) Virtual Machine (VM) Home Page

Introducing Oracle Big Data Strategy

  • Big Data implementation examples
  • Importance of Big Data
  • Oracle strategy for Big Data: combining Big Data Processing Engines: Hadoop / NoSQL / RDBMS
  • Characteristics of Big Data
  • Big Data Opportunities: Some Examples
  • Big Data Challenges

Using Oracle Big Data Lite Virtual Machine and Movieplex Application

  • Reviewing the Deployment Guide
  • Oracle Big Data Lite VM Home Page Sections
  • Introducing the Oracle Movieplex Case Study
  • Oracle Big Data Lite VM Used in this Course
  • Importing the Appliance File
  • Downloading and Running 7-Zip Files to Create the VirtualBox Appliance File
  • Downloading and Installing Oracle VM VirtualBox and Its Extension Pack
  • Starting the Big Data Lite VM and Starting and Stopping Services

Introduction to the Big Data Ecosystem

  • Cloudera’s Distribution Including Apache Hadoop (CDH)
  • Apache Hadoop
  • Types of Analysis That Use Hadoop
  • CDH Architecture and Components
  • Apache Hadoop Ecosystem
  • Computer Clusters and Distributed Computing
  • Types of Data Generated
  • Apache Hadoop Core Components: HDFS, MapReduce (MR1), and YARN (MR2)

Introduction to the Hadoop Distributed File System

  • Sample Hadoop High Availability (HA) Cluster
  • HDFS Files and Blocks
  • Hadoop Distributed Filesystem (HDFS) Design Principles, Characteristics, and Key Definitions
  • Interacting With Data Stored in HDFS: Hue, Hadoop Client, WebHDFS, and HttpFS
  • DataNode (DN) Daemon Functions
  • Writing a File to HDFS: Example
  • Active and Standby Daemon (Service) Functions

Acquire Data Using the CLI, Fuse, Flume, and Kafka

  • Kafka Topics
  • Additional Resources
  • Viewing File System Contents Using the CLI
  • What is Flume?
  • Overview of FuseDFS
  • Loading Data Using the CLI
  • Reviewing the Command Line Interface (CLI)
  • FS Shell Commands

Acquire and Access Data Using Oracle NoSQL Database

  • Oracle NoSQL models: Key-Value and Table
  • Accessing the KVStore
  • What is a NoSQL Database?
  • Accessing the CLIs (Data, Admin, SQL)
  • Acquiring and Accessing Data in a NoSQL DB
  • HDFS Compared to NoSQL
  • Define Oracle NoSQL Database
  • RDBMS Compared to NoSQL

Introduction to MapReduce and YARN Processing Frameworks

  • Data Locality Optimization in Hadoop
  • Parallel Processing with MapReduce
  • YARN Architecture, Features, and Daemons
  • Hadoop Basic Cluster: MapReduce 1 Versus YARN (MR 2)
  • MapReduce Framework Features, Benefits, and Jobs
  • YARN Application Workflow
  • Word Count Examples
  • Submitting and Monitoring a MapReduce Job

Resource Management Using YARN

  • Static Service Pools
  • Cloudera Manager Dynamic Resource Management: Example
  • Working with the Fair Scheduler
  • Cloudera Manager Resource Management Features
  • First In, First Out (FIFO) Scheduler, Capacity Scheduler, and Fair Scheduler
  • Submitting and Monitoring a MapReduce Job Using YARN
  • Job Scheduling in YARN
  • Using the YARN application Command

Overview of Apache Spark

  • Benefits of Using Spark
  • Running a Spark Application on YARN (yarn-cluster Mode)
  • Spark Interactive Shells: spark-shell and pyspark
  • Spark Application Components: Driver, Master, Cluster Manager, and Executors
  • Monitoring Spark Jobs Using YARN’s ResourceManager Web UI
  • Word Count Example by Using Interactive Scala
  • Spark Architecture
  • Resilient Distributed Dataset (RDD)

Overview of Apache Hive

  • What is Hive?
  • How is Data Stored in HDFS?
  • Big Data SQL on Top of Hive Data
  • Organizing and Describing Data With Hive
  • Defining Tables Over HDFS
  • Use Case: Storing Clickstream Data
  • Hive Queries
  • Hadoop Architecture

Overview of Cloudera Impala

  • Hadoop: Some Data Access/Processing Options
  • Cloudera Impala: Programming Interfaces
  • How Impala Works with Hive
  • Cloudera Impala
  • How Impala Fits Into the Hadoop Ecosystem
  • Overview of Cloudera Impala
  • Cloudera Impala: Supported Data Formats
  • Cloudera Impala: Key Features

Using Oracle XQuery for Hadoop

  • XQuery Transformation and Basic Filtering
  • XML Review
  • Viewing the Completed Query in YARN’s ResourceManager
  • Running an OXH Query
  • OXH Features
  • Oracle XQuery for Hadoop (OXH)
  • Using OXH: Installation, Functions, Adapters, and Configuration Properties
  • OXH Data Flow
  • Overview of Solr
  • Cloudera Search: Features

Overview of Solr

  • Apache Solr (Cloudera Search)
  • Cloudera Search Tasks
  • Indexing in Cloudera Search
  • Types of Indexing
  • The solrctl Command
  • Cloudera Search: Key Capabilities

Integrating Your Big Data

  • Comparing Big Data Processing Engines
  • Unifying Data: A Typical Requirement
  • Introducing Data Unification Options
  • When To Use These Options?

Batch Loading Options

  • Oracle Copy to Hadoop
  • Oracle Loader for Hadoop
  • Apache Sqoop

Using Oracle SQL Connector for HDFS

  • Using OSCH
  • Performance Tuning
  • Loading: Choosing a Connector
  • Parallelism and Performance
  • Batch and Dynamic Loading: Oracle SQL Connector for HDFS
  • OSCH Architecture
  • Features
  • Key Benefits

Using Oracle Data Integrator and Oracle GoldenGate for Big Data

  • Oracle GoldenGate for Big Data
  • ODI’s Declarative Design
  • Using ODI with Big Data: Heterogeneous Integration with Hadoop Environments
  • Using ODI Studio
  • ODI Studio: Big Data Knowledge Modules
  • ETL and Synchronization: Oracle Data Integrator
  • ODI Knowledge Modules (KMs): Simpler Physical Design / Shorter Implementation Time
  • ODI Studio Components: Overview

Using Oracle Big Data SQL

  • Query Performance Overview
  • Benefits: Virtualizes data access across Oracle Database, Hadoop and NoSQL stores
  • Overcoming Big Data Barriers
  • Barriers to Effective Big Data Adoption
  • Oracle Big Data SQL: The Hybrid Solution
  • Deployment Options
  • Using Oracle Big Data SQL

Using Oracle Big Data Spatial and Graph

  • BDSG: Graph Analysis
  • Multimedia Analytics Framework
  • Deployment Options for Oracle BDSG
  • Oracle BDSG: Spatial Analysis
  • Graph and Spatial Analysis: All About Relationships
  • Additional Resources
  • Strategy (supported platforms, etc.)
  • What is Oracle Big Data Spatial and Graph (BDSG)?

Using Oracle Advanced Analytics

  • OAA: Oracle Data Mining
  • OAA: Oracle R Enterprise
  • Oracle Advanced Analytics (OAA)

Oracle Big Data Deployment Options

  • BDA Hardware and Integrated and Optional Software
  • Introduction to the Oracle Big Data Cloud Service – Compute Edition
  • Running the Oracle BDA Configuration Generation Utility
  • Administering and Securing the Oracle BDA
  • Introduction to the Oracle Big Data Appliance
  • Oracle BDA Mammoth Software Deployment Bundle
  • Introduction to the Oracle Big Data Cloud Service
  • Using the Oracle BDA mammoth Utility