DATA BASE AND BIG DATA ANALYTICS

Academic Year 2021/2022 - 1st Year
Teaching Staff
Credit Value: 12
Scientific field: ING-INF/05 - Sistemi di elaborazione delle informazioni
Taught classes: 80 hours
Term / Semester: 1st and 2nd

Learning Objectives

  • Databases

    The learning outcomes of this teaching activity, expressed in terms of the Dublin Descriptors, are the following:

    Knowledge and understanding

    This course will provide students with knowledge and understanding of the relational data model. In particular, it describes the techniques and methodologies for designing, building, querying and managing a relational database using the SQL language. In addition to the relational model, the course also presents the basics of the data models belonging to the so-called NoSQL family.

    Applying knowledge and understanding

    For each topic of the course, a number of examples and practical exercises will be presented in class, so that students gain the basic skills both for designing a database and for understanding an existing model written in the SQL language. Part of the exercises will be carried out using software packages employed in professional contexts.

    Making judgements

    Students will be able to evaluate the different alternatives when designing and querying a database. The basic knowledge of the models falling in the NoSQL paradigm will allow the student to assess in which cases one of these models can be preferred to the relational model.

    Communication skills

    The design of the conceptual model of a database for a given application requires the ability to translate the requirements expressed by the user in natural language into the formalism of the relational model. Students will learn the basic communication skills needed to interact with non-technical users in order to clearly define the application requirements.

    Learning skills

    Students will learn the basic principles behind the design and use of relational and non-relational models. These principles give them the essential tools for extending their knowledge to more advanced technical aspects and to different database management system implementations.

  • Big Data Analytics

    This module covers the fundamental concepts of management and design of a business intelligence (BI) system. Topics include data models for building a data warehouse; ETL (extract, transform and load) functionalities; OLAP analysis; basic data mining; reporting and interactive dashboards; and the evolution of BI architectures on large datasets. The module also covers techniques and algorithms for data visualization and exploratory analysis, based on principles and techniques from graphic design, perceptual psychology and cognitive science. It is aimed at students who intend to use visualization in their data analytics work. The learning objectives are as follows:

    Knowledge and understanding

    • To understand the most important methodologies and techniques used in industry to analyse data in support of the decision-making process.
    • To understand the main methodologies for designing a data warehouse.
    • To understand the main methodologies for transforming data into sources of knowledge through visual representation.

    Applying knowledge and understanding

    • To be able to apply methodologies and techniques to analyse data.
    • To be able to design a data warehouse.
    • To be able to build reports and data analyses and organize them into interactive dashboards.

Course Structure

  • Databases

    Lectures and hands-on exercises.

    Should teaching be carried out in mixed mode or remotely, changes to the arrangements described above may be necessary, in line with the programme planned and outlined in the syllabus.

  • Big Data Analytics

    The main teaching methods are as follows:

    • Lectures, to provide theoretical and methodological knowledge of the subject;
    • Hands-on exercises, to provide “problem solving” skills and to apply design methodology;
    • Laboratories, to learn and test the use of related tools.

    Should teaching be carried out in mixed mode or remotely, changes to the arrangements described above may be necessary, in line with the programme planned and outlined in the syllabus.


Required Prerequisites

  • Databases

    None, although basic programming skills are helpful.

  • Big Data Analytics
    • Basic knowledge of database systems
    • Basic knowledge of SQL

Attendance of Lessons

  • Databases

    Strongly recommended. Attending and actively participating in the classroom activities will contribute positively towards the overall assessment of the oral exam.

  • Big Data Analytics

    Strongly recommended. Attending and actively participating in the classroom activities will contribute positively towards the overall assessment of the oral exam.


Detailed Course Content

  • Databases
    • Fundamentals of Database Management Systems (DBMS)

    • Relational Model: basic concepts, integrity constraints and keys.

    • SQL language: data definition, data modification, queries, views, transactions.

    • NoSQL databases: MongoDB.
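
    The SQL topics listed above (data definition, data modification, queries, views, transactions) can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than any DBMS named in this syllabus, and the department/employee schema and its data are hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical schema and data, for illustration only (not course material).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Data definition: keys, integrity constraints, and a view.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  REAL NOT NULL CHECK (salary >= 0),
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
CREATE VIEW dept_payroll AS
    SELECT d.name AS department,
           COUNT(*) AS headcount,
           SUM(e.salary) AS total_salary
    FROM employee e JOIN department d ON d.dept_id = e.dept_id
    GROUP BY d.name;
""")

# Data modification in a transaction: "with conn" commits on success.
with conn:
    conn.executemany("INSERT INTO department VALUES (?, ?)",
                     [(1, "Research"), (2, "Sales")])
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                     [(1, "Ada", 3000, 1), (2, "Bob", 2000, 1), (3, "Eve", 2500, 2)])

# A failing transaction: the second insert violates the foreign key,
# so "with conn" rolls back the first (valid) insert as well.
rolled_back = False
try:
    with conn:
        conn.execute("INSERT INTO employee VALUES (4, 'Dan', 1000, 2)")
        conn.execute("INSERT INTO employee VALUES (5, 'Zoe', 1000, 99)")
except sqlite3.IntegrityError:
    rolled_back = True

# Query through the view.
rows = conn.execute(
    "SELECT department, headcount, total_salary FROM dept_payroll ORDER BY department"
).fetchall()
print(rows)
```

    The view hides the join and the aggregation behind a name that can be queried like a table, and the failed transaction leaves the employee table with its three original rows.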

  • Big Data Analytics

    1. Introduction to Business Intelligence and Big Data Analytics (6 hours)

    • Goal and rationale of BI systems
    • The value of knowledge: data-driven decision making
    • The structure and evolution of BI and Big Data analytics systems
    • OLAP vs OLTP
    • Data warehouse and Business intelligence
    • Advanced tools and platforms for BI and analytics

    2. Data models for data warehouse (10 hours)

    • Conceptual modeling
    • Dimensions and facts
    • Multi-dimensional data model
    • Conceptual, logical and physical design
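
    As a rough illustration of facts, dimensions and the multi-dimensional model, the sketch below builds a tiny star schema and aggregates a measure at different dimension levels. It assumes Python's built-in sqlite3 module; the table names (fact_sales, dim_date, dim_product), the data and the helper revenue_by are all hypothetical and are not taken from the course textbooks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema: one fact table referencing two dimension tables (hypothetical names).
conn.executescript("""
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity   INTEGER,
    revenue    REAL
);
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, 2021, 11), (2, 2021, 12), (3, 2022, 1)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Laptop", "Hardware"), (2, "Mouse", "Hardware"),
                  (3, "Antivirus", "Software")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 2, 2000.0), (1, 2, 5, 100.0), (2, 3, 3, 150.0),
                  (3, 1, 1, 1000.0), (3, 3, 4, 200.0)])
conn.commit()

# Dimension levels that may be used for grouping (whitelist, so no SQL injection).
LEVELS = {"year": "d.year", "month": "d.month", "category": "p.category"}

def revenue_by(*levels):
    """Sum the revenue measure grouped by the given dimension levels."""
    cols = ", ".join(LEVELS[level] for level in levels)
    sql = f"""
        SELECT {cols}, SUM(f.revenue)
        FROM fact_sales f
        JOIN dim_date d    ON d.date_id = f.date_id
        JOIN dim_product p ON p.product_id = f.product_id
        GROUP BY {cols}
        ORDER BY {cols}
    """
    return conn.execute(sql).fetchall()

print(revenue_by("year", "category"))  # finer level of detail
print(revenue_by("year"))              # roll-up: the category dimension is dropped
```

    Dropping a grouping level, as in the second call, corresponds to a roll-up; adding one back corresponds to a drill-down.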

    3. BI Architecture (8 hours)

    • ETL (extract, transform and load) functionalities
    • OLAP analysis
    • OLAP query
    • Reporting and Interactive Dashboard
    • Overview on commercial and open-source BI platforms
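
    The three ETL phases named above can be sketched in a few lines. This is only a toy pipeline under stated assumptions: the CSV content, the orders table and the cleaning rules are hypothetical, and Python's standard csv and sqlite3 modules stand in for the BI platforms surveyed in this unit.

```python
import csv
import io
import sqlite3

# Hypothetical source data with typical quality problems:
# stray whitespace, inconsistent case, and a non-numeric amount.
RAW_CSV = """order_id,country,amount
1, it ,10.50
2,FR,abc
3,it,4.50
4,de,20
"""

def extract(text):
    """Extract: read the raw rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize country codes, convert types, discard invalid rows."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not numeric
        clean.append((int(row["order_id"]), row["country"].strip().upper(), amount))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the target table in one transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, country TEXT, amount REAL)"
    )
    with conn:
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

# A simple analytical query over the loaded data.
totals = conn.execute(
    "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country"
).fetchall()
print(totals)
```

    Keeping the three phases as separate functions makes each one testable on its own, which is one common way to structure such pipelines.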

    4. Data Visualization (16 hours)

    • Introduction to Visualization
    • Data Visualization fundamentals: Visual Perception and Preattentive Attributes
    • Charts and standard views: relevance, appropriateness and best practices
    • Use of colors in data visualization
    • Dashboard Design
    • Advanced and innovative tools for data visualization: the Tableau platform

Textbook Information

  • Databases
    1. R. Elmasri and S. Navathe, "Fundamentals of Database Systems", 7th Edition, Pearson, 2016.

    2. Instructor’s notes

  • Big Data Analytics
    1. [GoRi] Golfarelli, Rizzi. Data Warehouse Design: Modern Principles and Methodologies, McGraw Hill
    2. [Dash] Steve Wexler, Jeffrey Shaffer, Andy Cotgreave. The Big Book of Dashboards: Visualizing Your Data Using Real-World Business Scenarios. Wiley (2017)
    3. [Few1] Stephen Few. Show Me the Numbers: Designing Tables and Graphs to Enlighten, 2nd edition, Analytics Press (2012)
    4. [Few2] Stephen Few. Information Dashboard Design: Displaying Data for At-a-Glance Monitoring, 2nd edition, Analytics Press (2013)
    5. [Notes] Instructor’s notes (published on Studium and/or the Microsoft Teams platform)

Course Planning

  • Databases

     #  Subjects                      Text References
     1  Relational model (part 1)
     2  Relational model (part 2)
     3  Relational algebra (part 1)
     4  Relational algebra (part 2)
     5  Relational algebra exercises
     6  SQL basic concepts (part 1)
     7  SQL basic concepts (part 2)
     8  SQL exercises
     9  SQL aggregate operators
    10  SQL transactions and views
    11  SQL exercises
    12  NoSQL (part 1)
    13  NoSQL (part 2)
    14  NoSQL exercises

  • Big Data Analytics

     #  Subjects                                                                           Text References
     1  Introduction to Big Data Analytics                                                 [Notes]
     2  Business intelligence: introduction, fundamental concepts and architectures        [Notes]; [GoRi] Chap. 1
     3  The structure and evolution of BI and Big Data analytics systems                   [Notes]
     4  Data models for data warehouse: conceptual modeling and design                     [GoRi] Chap. 2-6
     5  Multi-dimensional data model                                                       [GoRi] Chap. 5
     6  Data models for data warehouse: logical modeling and design                        [GoRi] Chap. 8-9
     7  ETL (extract, transform and load) process                                          [GoRi] Chap. 10; [Notes]
     8  OLAP analysis and query                                                            [GoRi] Chap. 7; [Notes]
     9  Introduction to Data Visualization. Visual Perception and Preattentive Attributes  [Dash] Chap. 1; [Few2] Chap. 4
    10  Charts and standard views: relevance, appropriateness and best practices           [Few1]
    11  Use of colors in data visualization                                                [Dash] Chap. 1
    12  Advanced and innovative tools for data visualization: the Tableau platform         [Notes]
    13  Dashboard design principles. Exploratory vs. Explanatory dashboards                [Few2]
    14  Data visualization: infographics and storytelling                                  [Few2]

Learning Assessment

Learning Assessment Procedures

  • Databases

    The final exam will consist of two parts:

    • A written test with SQL exercises
    • An oral discussion of the written test

    Learning assessment may also be carried out online, should the conditions require it.

  • Big Data Analytics

    The final exam consists of:

    1. a project that assesses the student's ability to develop a BI system, including the analysis and visualization of relevant information;
    2. an oral exam consisting of a discussion of the project.

    Assessment criteria include: depth of analysis, adequacy, quality and correctness of the proposed solutions to the project work, ability to justify and critically evaluate the adopted solutions, clarity.

    The grade for the Big Data Analytics module will account for 50% of the total grade for the entire course.

    Learning assessment may also be carried out online, should the conditions require it.


Examples of frequently asked questions and / or exercises

  • Databases

    The written test consists of:

    • Creation of a database
    • Creation of tables, given the specifications
    • Implementation of queries in SQL

    Written test simulations will be carried out during the course.

    In the oral discussion, students will be asked how they implemented the database specifications and queries in the written test. If the written test contains mistakes, students will be asked to find, explain and correct them.

  • Big Data Analytics

    Examples of questions and exercises are available on the Studium platform and/or the Microsoft Teams platform.