Mastering SQL Server 2017: Build smart and efficient database applications for your organization with SQL Server 2017
By Miloš Radivojević, Dejan Sarka, William Durkin, Christian Coté, and Matija Lah
About this ebook
Leverage the power of SQL Server 2017 Integration Services to build data integration solutions with ease
Key Features
- Work with temporal tables to access information stored in a table at any time
- Get familiar with the latest features in SQL Server 2017 Integration Services
- Program and extend your packages to enhance their functionality
Microsoft SQL Server 2017 uses the power of R and Python for machine learning and containerization-based deployment on Windows and Linux. By learning how to use the features of SQL Server 2017 effectively, you can build scalable apps and easily perform data integration and transformation.
You’ll start by brushing up on the features of SQL Server 2017. This Learning Path will then demonstrate how you can use Query Store, columnstore indexes, and In-Memory OLTP in your apps. You'll also learn to integrate Python code in SQL Server and graph database implementations for development and testing. Next, you'll get up to speed with designing and building SQL Server Integration Services (SSIS) data warehouse packages using SQL Server Data Tools. Toward the concluding chapters, you’ll discover how to develop SSIS packages designed to maintain a data warehouse using the data flow and other control flow tasks.
By the end of this Learning Path, you'll be equipped with the skills you need to design efficient, high-performance database applications with confidence.
This Learning Path includes content from the following Packt books:
- SQL Server 2017 Developer's Guide by Miloš Radivojević, Dejan Sarka, et al.
- SQL Server 2017 Integration Services Cookbook by Christian Coté, Dejan Sarka, et al.
What you will learn
- Use columnstore indexes to make storage and performance improvements
- Extend database design solutions using temporal tables
- Exchange JSON data between applications and SQL Server
- Migrate historical data to Microsoft Azure by using Stretch Database
- Design the architecture of a modern Extract, Transform, and Load (ETL) solution
- Implement ETL solutions using Integration Services for both on-premises and Azure data
Who this book is for
This Learning Path is for database developers and solution architects looking to develop ETL solutions with SSIS, and explore the new features in SSIS 2017. Advanced analysis practitioners, business intelligence developers, and database consultants dealing with performance tuning will also find this book useful. Basic understanding of database concepts and T-SQL is required to get the best out of this Learning Path.
Miloš Radivojević is a data platform MVP and specializes in SQL Server for application developers and performance/query tuning. Miloš is a co-founder of PASS Austria.
Dejan Sarka, MCT and Microsoft Data Platform MVP, is an independent trainer and consultant who focuses on the development of database and business intelligence applications. He is the founder of the Slovenian SQL Server and .NET Users Group.
William Durkin is a data platform architect for Data Masterminds. He is a regular speaker at conferences around the globe, a Data Platform MVP, and the founder of the popular SQLGrillen event.
Christian Coté is an MS-certified technical specialist in business intelligence (MCTS-BI). His ETL projects have used various ETL tools and plain code with various RDBMSes (such as Oracle and SQL Server).
Matija Lah has more than 15 years of experience working with Microsoft SQL Server, mostly from architecting data-centric solutions in the legal domain.
Mastering SQL Server 2017
Build smart and efficient database applications for your organization with SQL Server 2017
Miloš Radivojević
Dejan Sarka
William Durkin
Christian Coté
Matija Lah
BIRMINGHAM - MUMBAI
Mastering SQL Server 2017
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First Published: August 2019
Production Reference: 1140819
Published by Packt Publishing Ltd.
Livery Place, 35 Livery Street
Birmingham, B3 2PB, U.K.
ISBN 978-1-83898-320-8
www.packtpub.com
Contributors
About the Authors
Miloš Radivojević is a database consultant in Vienna, Austria. He is a data platform MVP and specializes in SQL Server for application developers and performance/query tuning. Currently, he works as a principal database consultant at Bwin (GVC Holdings), the largest regulated online gaming company in the world. Miloš is a co-founder of PASS Austria. He is also a speaker at international conferences and speaks regularly at SQL Saturday events and PASS Austria meetings.
Dejan Sarka, MCT and Microsoft Data Platform MVP, is an independent trainer and consultant who focuses on the development of database and business intelligence applications. Besides projects, he spends about half his time on training and mentoring. He is the founder of the Slovenian SQL Server and .NET Users Group. He is the main author or co-author of many books about databases and SQL Server. The last three books before this one were published by Packt, and their titles were SQL Server 2016 Developer's Guide, SQL Server 2017 Integration Services Cookbook, and SQL Server 2016 Developer's Guide. Dejan Sarka has also developed many courses and seminars for Microsoft, SolidQ, and Pluralsight.
William Durkin is a data platform architect for Data Masterminds. He uses his decade of experience with SQL Server to help multinational corporations achieve their data management goals. Born in the UK and now based in Germany, William is a regular speaker at conferences around the globe, a Data Platform MVP, and the founder of the popular SQLGrillen event.
Christian Coté has been in IT for more than 12 years. He is an MS-certified technical specialist in business intelligence (MCTS-BI). For about 10 years, he has been a consultant in ETL/BI projects. His ETL projects have used various ETL tools and plain code with various RDBMSes (such as Oracle and SQL Server). He is currently working on his sixth SSIS implementation in 4 years.
Matija Lah has more than 15 years of experience working with Microsoft SQL Server, mostly from architecting data-centric solutions in the legal domain. His contributions to the SQL Server community have led to the Microsoft Most Valuable Professional award in 2007 (data platform). He spends most of his time on projects involving advanced information management, and natural language processing, but often finds time to speak at events related to Microsoft SQL Server where he loves to share his experience with the SQL Server platform.
Packt Is Searching for Authors Like You
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
mapt.io
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Why Subscribe?
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Packt.com
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at customercare@packtpub.com for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Table of Contents
Title Page
Copyright
Mastering SQL Server 2017
Contributors
About the Authors
Packt Is Searching for Authors Like You
About Packt
Why Subscribe?
Packt.com
Preface
Who This Book Is For
What This Book Covers
To Get the Most out of This Book
Download the Example Code Files
Download the color images
Conventions Used
Get in Touch
Reviews
Introduction to SQL Server 2017
Security
Row-Level Security
Dynamic data masking
Always Encrypted
Engine features
Query Store
Live query statistics
Stretch Database
Database scoped configuration
Temporal Tables
Columnstore indexes
Containers and SQL Server on Linux
Programming
Transact-SQL enhancements
JSON
In-Memory OLTP
SQL Server Tools
Business intelligence
R in SQL server
Release cycles
Summary
SQL Server Tools
Installing and updating SQL Server Tools
New SSMS features and enhancements
Autosave open tabs
Searchable options
Enhanced scroll bar
Execution plan comparison
Live query statistics
Importing flat file Wizard
Vulnerability assessment
SQL Server Data Tools
Tools for developing R and Python code
RStudio IDE
R Tools for Visual Studio 2015
Setting up Visual Studio 2017 for data science applications
Summary
JSON Support in SQL Server
Why JSON?
What is JSON?
Why is it popular?
JSON versus XML
JSON objects
JSON object
JSON array
Primitive JSON data types
JSON in SQL Server prior to SQL Server 2016
JSON4SQL
JSON.SQL
Transact-SQL-based solution
Retrieving SQL Server data in JSON format
FOR JSON AUTO
FOR JSON PATH
FOR JSON additional options
Add a root node to JSON output
Include NULL values in the JSON output
Formatting a JSON output as a single object
Converting data types
Escaping characters
Converting JSON data in a tabular format
OPENJSON with the default schema
Processing data from a comma-separated list of values
Returning the difference between two table rows
OPENJSON with an explicit schema
Import the JSON data from a file
JSON storage in SQL Server 2017
Validating JSON data
Extracting values from a JSON text
JSON_VALUE
JSON_QUERY
Modifying JSON data
Adding a new JSON property
Updating the value for a JSON property
Removing a JSON property
Multiple changes
Performance considerations
Indexes on computed columns
Full-text indexes
Summary
Stretch Database
Stretch DB architecture
Is this for you?
Using Data Migration Assistant
Limitations of using Stretch Database
Limitations that prevent you from enabling the Stretch DB features for a table
Table limitations
Column limitations
Limitations for Stretch-enabled tables
Use cases for Stretch Database
Archiving of historical data
Archiving of logging tables
Testing Azure SQL database
Enabling Stretch Database
Enabling Stretch Database at the database level
Enabling Stretch Database by using wizard
Enabling Stretch Database by using Transact-SQL
Enabling Stretch Database for a table
Enabling Stretch DB for a table by using wizard
Enabling Stretch Database for a table by using Transact-SQL
Filter predicate with sliding window
Querying stretch databases
Querying and updating remote data
SQL Server Stretch Database pricing
Stretch DB management and troubleshooting
Monitoring Stretch Databases
Pause and resume data migration
Disabling Stretch Database
Disable Stretch Database for tables by using SSMS
Disabling Stretch Database for tables using Transact-SQL
Disabling Stretch Database for a database
Backing up and restoring Stretch-enabled databases
Summary
Temporal Tables
What is temporal data?
Types of temporal tables
Allen's interval algebra
Temporal constraints
Temporal data in SQL Server before 2016
Optimizing temporal queries
Temporal features in SQL:2011
System-versioned temporal tables in SQL Server 2017
How temporal tables work in SQL Server 2017
Creating temporal tables
Period columns as hidden attributes
Converting non-temporal tables to temporal tables
Migrating an existing temporal solution to system-versioned tables
Altering temporal tables
Dropping temporal tables
Data manipulation in temporal tables
Inserting data in temporal tables
Updating data in temporal tables
Deleting data in temporal tables
Querying temporal data in SQL Server 2017
Retrieving temporal data at a specific point in time
Retrieving temporal data from a specific period
Retrieving all temporal data
Performance and storage considerations with temporal tables
History retention policy in SQL Server 2017
Configuring the retention policy at the database level
Configuring the retention policy at the table level
Custom history data retention
History table implementation
History table overhead
Temporal tables with memory-optimized tables
What is missing in SQL Server 2017?
SQL Server 2016 and 2017 temporal tables and data warehouses
Summary
Columnstore Indexes
Analytical queries in SQL Server
Joins and indexes
Benefits of clustered indexes
Leveraging table partitioning
Nonclustered indexes in analytical scenarios
Using indexed views
Data compression and query techniques
Writing efficient queries
Columnar storage and batch processing
Columnar storage and compression
Recreating rows from columnar storage
Columnar storage creation process
Development of columnar storage in SQL Server
Batch processing
Nonclustered columnstore indexes
Compression and query performance
Testing the nonclustered columnstore index
Operational analytics
Clustered columnstore indexes
Compression and query performance
Testing the clustered columnstore index
Using archive compression
Adding B-tree indexes and constraints
Updating a clustered columnstore index
Deleting from a clustered columnstore index
Summary
SSIS Setup
Introduction
SQL Server 2016 download
Getting ready
How to do it...
Installing JRE for PolyBase
Getting ready
How to do it...
How it works...
Installing SQL Server 2016
Getting ready
How to do it...
SQL Server Management Studio installation
Getting ready
How to do it...
SQL Server Data Tools installation
Getting ready
How to do it...
Testing SQL Server connectivity
Getting ready
How to do it...
What Is New in SSIS 2016
Introduction
Creating SSIS Catalog
Getting ready
How to do it...
Custom logging
Getting ready
How to do it...
How it works...
There's more...
Create a database
Create a simple project
Testing the custom logging level
See also
Azure tasks and transforms
Getting ready
How to do it...
See also
Incremental package deployment
Getting ready
How to do it...
There's more...
Multiple version support
Getting ready
How to do it...
There's more...
Error column name
Getting ready
How to do it...
Control Flow templates
Getting ready
How to do it...
Key Components of a Modern ETL Solution
Introduction
Installing the sample solution
Getting ready
How to do it...
There's more...
Deploying the source database with its data
Getting ready
How to do it...
There's more...
Deploying the target database
Getting ready
How to do it...
SSIS projects
Getting ready
How to do it...
Framework calls in EP_Staging.dtsx
Getting ready
How to do it...
There's more...
Dealing with Data Quality
Introduction
Profiling data with SSIS
Getting ready
How to do it...
Creating a DQS knowledge base
Getting ready
How to do it...
Data cleansing with DQS
Getting ready
How to do it...
Creating a MDS model
Getting ready
How to do it...
Matching with DQS
Getting ready
How to do it...
Using SSIS fuzzy components
Getting ready
How to do it...
Unleash the Power of SSIS Script Task and Component
Introduction
Using variables in SSIS Script task
Getting ready
How to do it...
Execute complex filesystem operations with the Script task
Getting ready
How to do it...
Reading data profiling XML results with the Script task
Getting ready
How to do it...
Correcting data with the Script component
Getting ready
How to do it...
Validating data using regular expressions in a Script component
Getting ready
How to do it...
Using the Script component as a source
How to do it...
How it works...
Using the Script component as a destination
Getting ready
How to do it...
How it works...
On-Premises and Azure Big Data Integration
Introduction
Azure Blob storage data management
Getting ready
How to do it...
Installing a Hortonworks cluster
Getting ready
How to do it...
Copying data to an on-premises cluster
Getting ready
How to do it...
Using Hive – creating a database
Getting ready
How to do it...
There's more...
Transforming the data with Hive
Getting ready
How to do it...
There's more...
Transferring data between Hadoop and Azure
Getting ready
How to do it...
Leveraging a HDInsight big data cluster
Getting ready
How to do it...
There's more...
Managing data with Pig Latin
Getting ready
How to do it...
There's more...
Importing Azure Blob storage data
Getting ready
How to do it...
There's more...
Azure Data Factory and SSIS
Extending SSIS Custom Tasks and Transformations
Introduction
Designing a custom task
Getting ready
How to do it...
How it works...
Designing a custom transformation
How to do it...
How it works...
Managing custom component versions
Getting ready
How to do it...
How it works...
Scale Out with SSIS 2017
Introduction
SQL Server 2017 download and setup
Getting ready
How to do it...
There's more...
SQL Server client tools setup
Getting ready
How to do it...
Configuring SSIS for scale out executions
Getting ready
How to do it...
There's more...
Executing a package using scale out functionality
Getting ready
How to do it...
Other Books You May Enjoy
Leave a review - let other readers know what you think
Preface
Mastering SQL Server 2017 brings in the power of R and Python for machine learning and containerization-based deployment on Windows and Linux. By knowing how to use the features of SQL Server 2017 to your advantage, you can build scalable applications and easily perform data integration and transformation.
After a quick recap of the features of SQL Server 2017, this Learning Path shows you how to use Query Store, columnstore indexes, and In-Memory OLTP in your applications. You'll then learn to integrate Python code in SQL Server and graph database implementations for development and testing. Next, you'll learn how to design and build SQL Server Integration Services (SSIS) data warehouse packages using SQL Server Data Tools. You'll also learn to develop SSIS packages designed to maintain a data warehouse using data flow and other control flow tasks.
By the end of this Learning Path, you'll have the required information to easily design efficient, high-performance database applications. You'll also have explored on-premises big data integration processes to create a classic data warehouse.
This Learning Path includes content from the following Packt products:
SQL Server 2017 Developer's Guide by Miloš Radivojević, Dejan Sarka, William Durkin
SQL Server 2017 Integration Services Cookbook by Christian Coté, Dejan Sarka, Matija Lah
Who This Book Is For
Database developers and solution architects looking to develop ETL solutions with SSIS, and who want to learn the new features and capabilities in SSIS 2017, will find this Learning Path very useful. It will also be valuable to advanced analysis practitioners, business intelligence developers, and database consultants dealing with performance tuning. Some basic understanding of database concepts and T-SQL is required to get the best out of this Learning Path.
What This Book Covers
Chapter 1, Introduction to SQL Server 2017, very briefly covers the most important features and enhancements, not just those for developers. The chapter shows the whole picture and points readers in the direction of where things are moving.
Chapter 2, SQL Server Tools, helps you understand the changes in the release management of SQL Server tools and explores small and handy enhancements in SQL Server Management Studio (SSMS). It also introduces RStudio IDE, a very popular tool for developing R code, and briefly covers SQL Server Data Tools (SSDT), including the new R Tools for Visual Studio (RTVS), a plugin for Visual Studio, which enables you to develop R code in an IDE that is popular among developers using Microsoft products and languages. The chapter introduces Visual Studio 2017 and shows how it can be used for data science applications with Python.
Chapter 3, JSON Support in SQL Server, explores the JSON support built into SQL Server. This support should make it easier for applications to exchange JSON data with SQL Server.
Chapter 4, Stretch Database, helps you understand how to migrate historical or infrequently accessed data transparently and securely to Microsoft Azure using the Stretch Database (Stretch DB) feature.
Chapter 5, Temporal Tables, introduces support for system-versioned temporal tables based on the SQL:2011 standard. We explain how this is implemented in SQL Server and demonstrate some use cases for it (for example, a time-travel application).
Chapter 6, Columnstore Indexes, revisits columnar storage and then explores the huge improvements relating to columnstore indexes in SQL Server 2016: updatable nonclustered columnstore indexes, columnstore indexes on in-memory tables, and many other new features for operational analytics.
Chapter 7, SSIS Setup, contains recipes describing the step-by-step setup of SQL Server 2016 to get the features that are used in the book.
Chapter 8, What Is New in SSIS 2016, contains recipes that talk about the evolution of SSIS over time and what's new in SSIS 2016. This chapter is a detailed overview of the new features in Integration Services 2016.
Chapter 9, Key Components of a Modern ETL Solution, explains how ETL has evolved over the past few years and which components are necessary for a modern, scalable ETL solution that fits the modern data warehouse. This chapter also describes what each catalog view provides and shows how you can use some of them to archive SSIS execution statistics.
Chapter 10, Dealing with Data Quality, focuses on how SSIS can be leveraged to validate and load data. You will learn how to identify invalid data, cleanse data, and load valid data to the data warehouse.
Chapter 11, Unleash the Power of SSIS Script Task and Component, covers how to use scripting with SSIS. You will learn how script tasks and script components are very valuable in many situations to overcome the limitations of stock toolbox tasks and transforms.
Chapter 12, On-Premises and Azure Big Data Integration, describes the Azure feature pack that allows SSIS to integrate Azure data from blob storage and HDInsight clusters. You will learn how to use Azure feature pack components to add flexibility to your SSIS solution architecture and how on-premises big data can be manipulated via SSIS.
Chapter 13, Extending SSIS Custom Tasks and Transformations, talks about extending and customizing the toolbox using custom-developed tasks and transforms and security features. You will learn the pros and cons of creating custom tasks to extend the SSIS toolbox and secure your deployment.
Chapter 14, Scale Out with SSIS 2017, talks about scaling out SSIS package executions on multiple servers. You will learn how SSIS 2017 can scale out to multiple workers to enhance execution scalability.
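As a small taste of the time-travel scenario mentioned for Chapter 5, the following is a minimal sketch rather than an excerpt from the book. The table name dbo.ProductDemo and the timestamp are hypothetical, chosen only for illustration. The sketch assumes SQL Server 2016 or later and a database in which you are allowed to create tables.

```sql
-- Hypothetical demo table: SQL Server maintains ValidFrom/ValidTo itself
-- and keeps older row versions in an automatically created history table.
CREATE TABLE dbo.ProductDemo
(
    ProductId   INT NOT NULL CONSTRAINT PK_ProductDemo PRIMARY KEY,
    ProductName NVARCHAR(50) NOT NULL,
    Price       MONEY NOT NULL,
    ValidFrom   DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo     DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON);

-- Return the rows as they were at the given instant.
-- The period columns are recorded in UTC, so the literal is read as UTC.
SELECT ProductId, ProductName, Price, ValidFrom, ValidTo
FROM dbo.ProductDemo
FOR SYSTEM_TIME AS OF '2019-08-01T00:00:00'
ORDER BY ProductId;
```

The FOR SYSTEM_TIME AS OF clause lets the engine combine current and historical rows, so the query does not need to reference the history table directly.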
To Get the Most out of This Book
In order to run all of the demo code in this book, you will need SQL Server 2017 Developer or Enterprise Edition. In addition, you will extensively use SQL Server Management Studio.
You will also need the RStudio IDE and/or SQL Server Data Tools with the R Tools for Visual Studio plug-in.
Other tools you may need are Visual Studio 2015, SQL Server Data Tools 16 or higher, and SQL Server Management Studio 17 or later.
In addition to that, you will need Hortonworks Sandbox Docker for Windows, an Azure account, and Microsoft Azure.
Download the Example Code Files
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:
Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book.
Click on Code Download.
You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-SQL-Server-2017-. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Download the color images
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781838983208_ColorImages.pdf.
Conventions Used
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning. Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: The last characters CI and AS are for case insensitive and accent sensitive, respectively.
A block of code is set as follows:
USE DQS_STAGING_DATA;
SELECT CustomerKey, FullName, StreetAddress, City, StateProvince, CountryRegion, EmailAddress, BirthDate, Occupation;
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: The simplest query to retrieve the data that you can write includes the SELECT and the FROM clauses. In the SELECT clause, you can use the star character (*), literally SELECT *, to denote that you need all columns from a table in the result set.
A block of code is set as follows:
USE WideWorldImportersDW;
SELECT *
FROM Dimension.Customer;
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
USE WideWorldImporters;
CREATE TABLE dbo.Product
(
ProductId INT NOT NULL CONSTRAINT PK_Product PRIMARY KEY,
ProductName NVARCHAR(50) NOT NULL,
Price MONEY NOT NULL,
ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
ValidTo DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON);
Any command-line input or output is written as follows:
Customer                       SaleKey  Quantity
------------------------------ -------- -----------
Tailspin Toys (Aceitunas, PR)  36964    288
Tailspin Toys (Aceitunas, PR)  126253   250
Tailspin Toys (Aceitunas, PR)  79272    250
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: Go to Tools | Options and you are then able to type your search string in the textbox in the top-left of the Options window.
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Get in Touch
Feedback from our readers is always welcome.
General feedback: Email feedback@packtpub.com and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at questions@packtpub.com.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report it to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packtpub.com with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Introduction to SQL Server 2017
SQL Server is the main relational database management system product from Microsoft. It has been around in one form or another since the late 1980s (when it was developed in partnership with Sybase), and as a standalone Microsoft product since the early 1990s. In the last 20 years, SQL Server has changed and evolved, gaining new features and functionality along the way.
The SQL Server we know today is based on what was arguably the most significant (r)evolutionary step in its history: the release of SQL Server 2005. The changes introduced in that release allowed the versions that followed to take advantage of newer hardware and software improvements, such as 64-bit memory architecture, better multi-CPU and multi-core support, better alignment with the .NET Framework, and many other modernizations in general system architecture.
The incremental changes introduced in each subsequent version of SQL Server have continued to improve upon this solid new foundation. Fortunately, Microsoft has changed the release cycle for multiple products, including SQL Server, resulting in shorter time frames between releases. This is due, in part, to Microsoft's much-reported "Mobile first, Cloud first" strategy. This strategy, together with the development of Azure SQL Database (the cloud version of SQL Server), has forced Microsoft into a drastically shorter release cycle. The advantage is that we are no longer required to wait 3 to 5 years for a new release (and new features). There have been releases every 2 years since SQL Server 2012 was introduced, with multiple releases of Azure SQL Database in between the major versions.
While we can be pleased that we no longer need to wait for new releases, we are also at a distinct disadvantage. The rapid release of new versions and features leaves us developers with ever-decreasing periods of time to get to grips with the shiny new features. Prior versions had multiple years between releases, allowing us to build up a deeper knowledge and understanding of the available features, before having to consume new information.
SQL Server 2017 followed barely a year after the release of SQL Server 2016. Many of its features are simply more polished or updated versions of those in the 2016 release, although there are also some notable additions.
In this chapter (and book), we will introduce what is new inside SQL Server 2017. Due to the short release cycle, we will outline features that are brand new in this release of the product and look at features that have been extended or improved upon since SQL Server 2016.
We will be outlining the new features in the following areas:
Security
Engine features
Programming
Business intelligence
Security
The last few years have made the importance of security in IT extremely apparent, particularly when we consider the repercussions of the Edward Snowden data leaks or multiple cases of data theft via hacking. While no system is completely impenetrable, we should always be considering how we can improve the security of the systems we build. These considerations are wide ranging and sometimes even dictated via rules, regulations, and laws. Microsoft has responded to the increased focus on security by delivering new features to assist developers and DBAs in their search for more secure systems.
Row-Level Security
The first technology that was introduced in SQL Server 2016 to address the need for increased/improved security is Row-Level Security (RLS). RLS provides the ability to control access to rows in a table based on the user executing a query. With RLS it is possible to implement a filtering mechanism on any table in a database, completely transparently to any external application or direct T-SQL access.
The ability to implement such filtering without having to redesign a data access layer allows system administrators to control access to data at an even more granular level than before. The fact that this control can be achieved without any application logic redesign makes this feature potentially even more attractive for certain use cases. RLS also makes it possible, in conjunction with the necessary auditing features, to lock down a SQL Server database so that even the traditional god-mode sysadmin cannot access the underlying data.
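The filtering mechanism described above can be sketched in T-SQL as follows. This is a minimal illustration, not the book's own example: the table dbo.SalesOrder, the function dbo.fn_SalesRepPredicate, and the policy dbo.SalesOrderFilter are hypothetical names, and the script assumes it is run in a scratch database from a tool that understands the GO batch separator (such as SSMS or sqlcmd).

```sql
-- Hypothetical demo table; all object names here are illustrative.
CREATE TABLE dbo.SalesOrder
(
    OrderId  INT            NOT NULL PRIMARY KEY,
    SalesRep SYSNAME        NOT NULL,  -- database user who owns the row
    Amount   DECIMAL(10, 2) NOT NULL
);
GO

-- Inline table-valued function used as the security predicate.
-- It returns a row (so the table row passes the filter) only when the
-- row's SalesRep value matches the database user running the query.
CREATE FUNCTION dbo.fn_SalesRepPredicate (@SalesRep AS SYSNAME)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS AccessGranted
    WHERE @SalesRep = USER_NAME();
GO

-- Bind the predicate to the table. Reads against dbo.SalesOrder are now
-- filtered transparently, with no change to the calling application.
CREATE SECURITY POLICY dbo.SalesOrderFilter
    ADD FILTER PREDICATE dbo.fn_SalesRepPredicate(SalesRep)
    ON dbo.SalesOrder
WITH (STATE = ON);
GO
```

With the policy switched on, a plain SELECT * FROM dbo.SalesOrder returns only the rows whose SalesRep equals the current database user name. A user connected as dbo would therefore see no rows unless a row is explicitly assigned to dbo. Whether to add exceptions for managers or administrators to the predicate is a design decision that depends on the auditing requirements mentioned above.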
Dynamic data masking
The second security feature that we will be covering is Dynamic Data Masking (DDM). DDM allows the system administrator to define column-level data masking rules that prevent users from reading the actual contents of columns, while still allowing them to query the rows themselves. This feature was initially aimed at allowing developers to work with a copy of production data without being able to see the underlying values. This can be particularly useful in environments that are subject to data protection laws (for example, credit card processing systems and medical record storage).
Data masking occurs only at query runtime and does not affect the stored data of a table. This means that it is possible to mask a multi-terabyte database through a simple DDL statement, rather than resorting to the previous solution of physically masking the underlying data in the table.
The current implementation of DDM lets you apply one of a fixed set of masking functions to each column of a table; the function masks the data when the table is queried. If a user has permission to view the unmasked data, the masking functions are not run, whereas a user without that permission sees the data only as returned by the defined masking functions.
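As a rough sketch of what such a DDL-based mask looks like, the following T-SQL defines two masked columns and then queries them as a low-privileged user. The table dbo.CustomerContact, the sample row, and the user MaskedReader are hypothetical and exist only for illustration; email() and partial() are two of the built-in masking functions.

```sql
-- Hypothetical demo table with two masked columns.
CREATE TABLE dbo.CustomerContact
(
    CustomerId INT           NOT NULL PRIMARY KEY,
    FullName   NVARCHAR(100) NOT NULL,
    -- email(): built-in mask for e-mail addresses
    Email      NVARCHAR(100) MASKED WITH (FUNCTION = 'email()') NOT NULL,
    -- partial(prefix, padding, suffix): expose only the last four characters
    CardNumber VARCHAR(19)   MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)') NOT NULL
);

-- The stored data is not altered by the mask.
INSERT INTO dbo.CustomerContact (CustomerId, FullName, Email, CardNumber)
VALUES (1, N'Jane Example', N'jane@example.com', '1234-5678-9012-3456');

-- Hypothetical low-privileged user that may read the table but not unmask it.
CREATE USER MaskedReader WITHOUT LOGIN;
GRANT SELECT ON dbo.CustomerContact TO MaskedReader;

-- Query as the restricted user: Email and CardNumber come back masked.
EXECUTE AS USER = 'MaskedReader';
SELECT CustomerId, FullName, Email, CardNumber
FROM dbo.CustomerContact;
REVERT;

-- Granting UNMASK lets the same user see the real values.
GRANT UNMASK TO MaskedReader;
```

Because the mask is applied only when results are returned, adding or removing it on an existing column is a metadata change (ALTER TABLE ... ALTER COLUMN ... ADD MASKED WITH or DROP MASKED) and does not rewrite the table's data.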
Always Encrypted
The third major security feature to be introduced in SQL Server 2016 is Always Encrypted. Encryption with SQL Server was previously a (mainly) server-based solution. Databases were either protected with encryption at the database level (the entire database was encrypted) or at the column level (single columns had an encryption algorithm defined). While this encryption was, and still is, fully functional and safe, crucial portions of the encryption process (for example, encryption certificates) are stored inside SQL Server. This effectively gave the owner of a SQL Server instance the ability to potentially gain access to the encrypted data. Even where direct access was not possible, there was at least an increased surface area for a malicious access attempt. As ever more companies moved to hosted services and cloud solutions (for example, Microsoft Azure), the previous encryption solutions no longer provided the required level of control and security.
Always Encrypted was designed to bridge this security gap by removing the ability of an instance owner to gain access to the encryption components. The entirety of the encryption process was moved outside of SQL Server and resides on the client side. While a similar effect was possible using homebrew solutions, Always Encrypted provides an encryption suite that is fully integrated into both the .NET Framework and SQL Server. Whenever data is defined as requiring encryption, the data is encrypted within the .NET Framework and only sent to SQL Server after encryption has occurred. This means that a malicious user (or even a system administrator) will only ever be able to access encrypted information should they attempt to query data stored via Always Encrypted.
Microsoft has made some positive progress in this area of the product. While no system is completely safe and no single feature can provide an all-encompassing solution, all three features provide a further option in building up, or improving upon, any system's current security level.
Engine features
The Engine features section is traditionally the most important, or interesting, for most DBAs or system administrators when a new version of SQL Server is released. However, numerous engine improvements are relevant to developers too. So, if you are a developer, don't skip this section, or you may miss some improvements that could save you some trouble later on!
Query Store
The Query Store is possibly the biggest new engine feature to come with the release of SQL Server 2016. DBAs and developers should be more than familiar with the situation in which a query behaves reliably for a long period and then suddenly turns into a slow-running, resource-killing monster. Some readers may identify the cause as parameter sniffing or, similarly, stale statistics. Either way, when troubleshooting why an unchanged query has suddenly become slow, knowing the query execution plan(s) that SQL Server has created and used can be very helpful. A major issue when investigating these types of problems is the transient nature of query plans and their execution statistics. This is where Query Store comes into play: SQL Server collects information on query compilation and execution on a per-database basis and persists it inside each database that is monitored by the Query Store, allowing a DBA or developer to investigate performance issues after the fact.
It is even possible to perform longer-term query analysis, providing an insight into how query execution plans change over a longer time frame. This sort of insight was previously only possible via handwritten solutions or third-party monitoring solutions, which may still not allow the same insights as the Query Store does.
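To make this more concrete, the sketch below enables Query Store on a database and then reads back the persisted plans and runtime statistics from the Query Store catalog views. The choice of the WideWorldImporters database and the TOP (10) cut-off are illustrative assumptions, not recommendations from the book.

```sql
-- Turn on Query Store for one database (it is configured per database).
ALTER DATABASE WideWorldImporters SET QUERY_STORE = ON;
GO

USE WideWorldImporters;
GO

-- List the ten plans with the highest average duration that Query Store
-- has recorded, together with the text of the query they belong to.
-- Each row is one plan in one collection interval; avg_duration is
-- reported in microseconds.
SELECT TOP (10)
    qt.query_sql_text,
    q.query_id,
    p.plan_id,
    rs.count_executions,
    rs.avg_duration,
    rs.last_execution_time
FROM sys.query_store_query_text AS qt
INNER JOIN sys.query_store_query AS q
    ON q.query_text_id = qt.query_text_id
INNER JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
INNER JOIN sys.query_store_runtime_stats AS rs
    ON rs.plan_id = p.plan_id
ORDER BY rs.avg_duration DESC;
```

A query that appears with more than one plan_id and noticeably different avg_duration values is a typical candidate for the "suddenly slow" scenario described above, because the collected history shows when the plan changed.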
Live query statistics
When we are developing inside SQL Server, each developer creates a mental model of how data flows inside SQL Server. Microsoft has provided a multitude of ways to display this concept when working with query execution. The most obvious visual aid is the graphical execution plan. There are endless explanations in books, articles, and training seminars that attempt to make reading these graphical representations easier. Depending upon how your mind works, these descriptions can help or hinder your ability to understand the data flow concepts: fully blocking iterators, pipeline iterators, semi-blocking iterators, nested loop joins... the list goes on.
When we look at an actual graphical execution plan, we are seeing a representation of how SQL Server processed a query: which data retrieval methods were used, which join types were chosen to join multiple data sets, what sorting was required, and so on. However, this is a representation produced after the query has completed execution. Live Query Statistics lets us observe a query while it is executing and identify how, when, and where data moves through the query plan. This live representation makes the concepts behind query execution much clearer and is a great tool to help developers design their query and index strategies to improve query performance.
Further details of Live Query Statistics can be found in Chapter 2, SQL Server Tools.
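Live Query Statistics itself is a graphical feature of SQL Server Management Studio, but similar in-flight, per-operator row counts can also be read with plain T-SQL from the sys.dm_exec_query_profiles dynamic management view. The following is a minimal sketch under stated assumptions: the session ID 54 is hypothetical, the monitored session must have profiling enabled (here via SET STATISTICS PROFILE ON), and the monitoring login needs the VIEW SERVER STATE permission.

```sql
-- Session 1: enable profiling, then start the long-running query.
SET STATISTICS PROFILE ON;
-- ... execute the long-running query here ...

-- Session 2: while the query in session 1 is still running, compare the
-- rows each plan operator has produced so far with the optimizer's estimate.
SELECT
    node_id,
    physical_operator_name,
    row_count,           -- rows returned by this operator so far
    estimate_row_count   -- rows the optimizer expected
FROM sys.dm_exec_query_profiles
WHERE session_id = 54    -- hypothetical: use the session ID of session 1
ORDER BY node_id;
```

Running the second query repeatedly shows row_count growing operator by operator, which is the same information the graphical view animates on the plan diagram.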
Stretch Database
Microsoft has worked a lot in the past few years on their