A comprehensive journey through the world of database and data engineering concepts - from SQL and NoSQL to Hadoop
What you'll learn
- Build an intuition that spans RDBMS systems, NoSQL, Big Data on the cloud and the Hadoop platform
- Understand various distributed database classifications
- Understand when and how to use Redis or Key-Value Stores
- Understand when and how to use MongoDB or Document-oriented databases
- Understand and use HBase as a Wide-Columnar Store
- Understand and use a time-series database (InfluxDB)
- Understand and use Elasticsearch as a search engine
- Understand and use Neo4J as a Graph Database Management System
- Understand large scale distributed data storage and processing in Hadoop
- Understand when and how to build a streaming architecture with Apache Kafka
- Use Apache Hive and understand where it fits within big data platforms
- Understand a number of SQL-on-Hadoop Engines and how they work
- Understand how to use data engineering capabilities to enable a data-driven organization
Requirements
- No strict requirements, but knowledge of relational databases will be helpful.
- A Windows, Linux or Mac machine to set up a lab
- Any Hadoop vendor sandbox, such as Cloudera QuickStart or the HDP VM
A comprehensive look at the wide landscape of database systems and how to make a good choice in your next project
We usually first ask questions about databases when building an application. The next time is either when our choice of database becomes a bottleneck or when we need to do large-scale data analytics.
This course covers almost every class of database and data storage platform, and when to consider using each. It is a journey through databases that will benefit software developers, big data engineers, data analysts and decision makers. It is not an in-depth look at each database, but it aims to get you up and running with your first project for each class.
In this course, we are going to cover:
- Relational database systems, their features, use cases and limitations
- Key-value stores and their use cases
- Document-oriented databases and their use cases
- Wide-columnar stores and their use cases
- Time-series databases and their use cases
- Search engines and their use cases
- Graph databases and their use cases
- Distributed logs and real-time streaming systems
- Hadoop and its use cases
- SQL-on-Hadoop tools and their use cases
- How to make informed decisions when building a good data storage platform
What is the target audience?
- Chief data officers
- Anyone who wants to understand Hadoop from a database perspective
What does this course not cover?
- This course does not approach any of the databases from an administrative perspective, so we don't cover administrative tasks such as security, backup, recovery, migration and the like.
- Very in-depth features of the specific databases under discussion. For example, we will not go into the different storage engines for MySQL or how to write stored procedures.
What are the requirements?
The lab for this course can be carried out on any machine (Microsoft Windows, Linux or macOS).
However, the training on HBase or Hadoop requires a Hadoop environment. We suggest using a pre-installed sandbox or a cloud offering, or installing your own custom sandbox.
What do I need to know to get the best out of this course?
This course does not assume any knowledge of NoSQL or data engineering.
However, a little knowledge of RDBMS (even Microsoft Access) is enough to put you in the best position for this course.
Who this course is for
- Chief Data Officers
- IT Decision Makers
- Database Architects
- Software Developers
- Big Data Engineers
- Anyone who wants to understand where each class of NoSQL database best fits
- Anyone who is curious about NoSQL or Big Data Systems