Mastering SQL Server 2017: Build smart and efficient database applications for your organization with SQL Server 2017

Ebook, 994 pages, 6 hours

By Miloš Radivojević, Dejan Sarka, William Durkin and

Rating: 0 out of 5 stars

About this ebook

Leverage the power of SQL Server 2017 Integration Services to build data integration solutions with ease

Key Features

Work with temporal tables to access information stored in a table at any time
Get familiar with the latest features in SQL Server 2017 Integration Services
Program and extend your packages to enhance their functionality

Book Description

Microsoft SQL Server 2017 brings the power of R and Python to machine learning and supports container-based deployment on Windows and Linux. By learning how to use the features of SQL Server 2017 effectively, you can build scalable apps and easily perform data integration and transformation.

You’ll start by brushing up on the features of SQL Server 2017. This Learning Path will then demonstrate how you can use Query Store, columnstore indexes, and In-Memory OLTP in your apps. You'll also learn to integrate Python code into SQL Server and to use graph database implementations for development and testing. Next, you'll get up to speed with designing and building SQL Server Integration Services (SSIS) data warehouse packages using SQL Server Data Tools. Toward the concluding chapters, you’ll discover how to develop SSIS packages designed to maintain a data warehouse using the data flow and other control flow tasks.

By the end of this Learning Path, you'll be equipped with the skills you need to design efficient, high-performance database applications with confidence.

This Learning Path includes content from the following Packt books:

SQL Server 2017 Developer's Guide by Miloš Radivojević, Dejan Sarka, et al.
SQL Server 2017 Integration Services Cookbook by Christian Cote, Dejan Sarka, et al.

What you will learn

Use columnstore indexes to make storage and performance improvements
Extend database design solutions using temporal tables
Exchange JSON data between applications and SQL Server
Migrate historical data to Microsoft Azure by using Stretch Database
Design the architecture of a modern Extract, Transform, and Load (ETL) solution
Implement ETL solutions using Integration Services for both on-premises and Azure data

Who this book is for

This Learning Path is for database developers and solution architects looking to develop ETL solutions with SSIS and explore the new features in SSIS 2017. Advanced analysis practitioners, business intelligence developers, and database consultants dealing with performance tuning will also find this book useful. A basic understanding of database concepts and T-SQL is required to get the best out of this Learning Path.

Miloš Radivojević is a Data Platform MVP who specializes in SQL Server for application developers and in performance and query tuning. Miloš is a co-founder of PASS Austria.

Dejan Sarka, MCT and Microsoft Data Platform MVP, is an independent trainer and consultant who focuses on the development of database and business intelligence applications. He is the founder of the Slovenian SQL Server and .NET Users Group.

William Durkin is a data platform architect for Data Masterminds. He is a regular speaker at conferences around the globe, a Data Platform MVP, and the founder of the popular SQLGrillen event.

Christian Coté is an MS-certified technical specialist in business intelligence (MCTS-BI). His ETL projects have used various ETL tools and plain code with various RDBMSes (such as Oracle and SQL Server).

Matija Lah has more than 15 years of experience working with Microsoft SQL Server, mostly architecting data-centric solutions in the legal domain.

Language: English

Publisher: Packt Publishing

Release date: Aug 22, 2019

ISBN: 9781838987527

Author

Miloš Radivojević

Related to Mastering SQL Server 2017

Related ebooks

SQL Server 2017 Developer’s Guide: A professional guide to designing and developing enterprise database applications
Ebook by William Durkin. Rating: 0 out of 5 stars (0 ratings)
SQL Server 2017 Machine Learning Services with R: Data exploration, modeling, and advanced analytics
Ebook by Tomaž Kaštrun. Rating: 0 out of 5 stars (0 ratings)
Hands-On Data Science with SQL Server 2017: Perform end-to-end data analysis to gain efficient data insight
Ebook by Marek Chmel. Rating: 0 out of 5 stars (0 ratings)
Data Science with SQL Server Quick Start Guide: Integrate SQL Server with data science
Ebook by Dejan Sarka. Rating: 0 out of 5 stars (0 ratings)
Learn SQL Database Programming: Query and manipulate databases from popular relational database servers using SQL
Ebook by Josephine Bush. Rating: 0 out of 5 stars (0 ratings)
SQL Server 2017 Integration Services Cookbook
Ebook by Christian Cote. Rating: 0 out of 5 stars (0 ratings)
Getting Started with SQL Server 2014 Administration
Ebook by Gethyn Ellis. Rating: 0 out of 5 stars (0 ratings)
SQL Server on Linux
Ebook by Jasmin Azemovic. Rating: 0 out of 5 stars (0 ratings)
Identity with Windows Server 2016: Microsoft 70-742 MCSA Exam Guide: Deploy, configure, and troubleshoot identity services and Group Policy in Windows Server 2016
Ebook by Vladimir Stefanovic. Rating: 0 out of 5 stars (0 ratings)
SQL Server Query Tuning and Optimization: Optimize Microsoft SQL Server 2022 queries and applications
Ebook by Benjamin Nevarez. Rating: 0 out of 5 stars (0 ratings)
Microsoft SQL Server 2012 with Hadoop: Getting SQL Server talking to Hadoop is a smooth process when you follow this tutorial. Learn all the tools and techniques you need integrate the data and then extract powerful business insights from the merged result.
Ebook by Debarchan Sarkar. Rating: 0 out of 5 stars (0 ratings)
What's New in SQL Server 2012
Ebook by Rachel Clements. Rating: 0 out of 5 stars (0 ratings)
Instant SQL Server Analysis Services 2012 Cube Security
Ebook by Satya SK Jayanty. Rating: 0 out of 5 stars (0 ratings)
Learn T-SQL Querying: A guide to developing efficient and elegant T-SQL code
Ebook by Pedro Lopes. Rating: 0 out of 5 stars (0 ratings)
Microsoft Azure Architect Technologies: Exam Guide AZ-300: A guide to preparing for the AZ-300 Microsoft Azure Architect Technologies certification exam
Ebook by Sjoukje Zaal. Rating: 0 out of 5 stars (0 ratings)
Querying SQL Server: Run T-SQL operations, data extraction, data manipulation, and custom queries to deliver simplified analytics (English Edition)
Ebook by Adam Aspin. Rating: 0 out of 5 stars (0 ratings)
Hands-On Data Warehousing with Azure Data Factory: ETL techniques to load and transform data from various sources, both on-premises and on cloud
Ebook by Christian Cote. Rating: 0 out of 5 stars (0 ratings)
Advanced Elasticsearch 7.0: A practical guide to designing, indexing, and querying advanced distributed search engines
Ebook by Wai Tak Wong. Rating: 0 out of 5 stars (0 ratings)
MCSA Windows Server 2016 Certification Guide: Exam 70-741: The ultimate guide to becoming MCSA certified
Ebook by Sasha Kranjac. Rating: 0 out of 5 stars (0 ratings)
Serverless ETL and Analytics with AWS Glue: Your comprehensive reference guide to learning about AWS Glue and its features
Ebook by Vishal Pathak. Rating: 0 out of 5 stars (0 ratings)
Learning NServiceBus Sagas
Ebook by Rich Helton. Rating: 0 out of 5 stars (0 ratings)
Microsoft System Center Configuration Manager: Deploy a scalable solution by ensuring high availability and disaster recovery using Configuration Manager with this book and ebook.
Ebook by Marius Sandbu. Rating: 0 out of 5 stars (0 ratings)
Installation, Storage, and Compute with Windows Server 2016: Microsoft 70-740 MCSA Exam Guide: Implement and configure storage and compute functionalities in Windows Server 2016
Ebook by Sasha Kranjac. Rating: 0 out of 5 stars (0 ratings)
Professional Azure SQL Database Administration: Equip yourself with the skills you need to manage and maintain your SQL databases on the Microsoft cloud
Ebook by Ahmad Osama. Rating: 0 out of 5 stars (0 ratings)
Guide to NoSQL with Azure Cosmos DB: Work with the massively scalable Azure database service with JSON, C#, LINQ, and .NET Core 2
Ebook by Gastón C. Hillar. Rating: 0 out of 5 stars (0 ratings)
Implementing Power BI in the Enterprise
Ebook by Greg Low. Rating: 5 out of 5 stars
Data Modeling for Azure Data Services: Implement professional data design and structures in Azure
Ebook by Peter ter Braake. Rating: 0 out of 5 stars (0 ratings)
Microsoft Azure Administrator – Exam Guide AZ-103: Your in-depth certification guide in becoming Microsoft Certified Azure Administrator Associate
Ebook by Sjoukje Zaal. Rating: 0 out of 5 stars (0 ratings)
Reporting with Microsoft SQL Server 2012
Ebook by James Serra. Rating: 1 out of 5 stars
Mastering Azure Machine Learning: Perform large-scale end-to-end advanced machine learning in the cloud with Microsoft Azure Machine Learning
Ebook by Körner Christoph. Rating: 0 out of 5 stars (0 ratings)

Databases For You

Oracle DBA Mentor: Succeeding as an Oracle Database Administrator
Ebook by Brian Peasland. Rating: 0 out of 5 stars (0 ratings)
Blockchain Basics: A Non-Technical Introduction in 25 Steps
Ebook by Daniel Drescher. Rating: 5 out of 5 stars
SQL QuickStart Guide: The Simplified Beginner's Guide to Managing, Analyzing, and Manipulating Data With SQL
Ebook by Walter Shields. Rating: 4 out of 5 stars
Beginning Microsoft Power BI: A Practical Guide to Self-Service Data Analytics
Ebook by Dan Clark. Rating: 0 out of 5 stars (0 ratings)
Access 2019 For Dummies
Ebook by Laurie A. Ulrich. Rating: 0 out of 5 stars (0 ratings)
Grokking Algorithms: An illustrated guide for programmers and other curious people
Ebook by Aditya Bhargava. Rating: 4 out of 5 stars
Summary of Building a Second Brain: by Tiago Forte - A Proven Method to Organize Your Digital Life and Unlock Your Creative Potential - A Comprehensive Summary
Ebook by Alexander Cooper. Rating: 2 out of 5 stars
COMPUTER SCIENCE FOR ROOKIES
Ebook by Angel Bahabwa. Rating: 0 out of 5 stars (0 ratings)
Practical Data Analysis
Ebook by Hector Cuesta. Rating: 4 out of 5 stars
CompTIA DataSys+ Study Guide: Exam DS0-001
Ebook by Mike Chapple. Rating: 0 out of 5 stars (0 ratings)
Behind Every Good Decision: How Anyone Can Use Business Analytics to Turn Data into Profitable Insight
Ebook by Piyanka Jain. Rating: 5 out of 5 stars
Python Projects for Everyone
Ebook by Mohamad Charara. Rating: 0 out of 5 stars (0 ratings)
Learn SQL in 24 Hours
Ebook by Alex Nordeen. Rating: 5 out of 5 stars
Access 2016 For Dummies
Ebook by Laurie A. Ulrich. Rating: 0 out of 5 stars (0 ratings)
Go in Action
Ebook by Erik St. Martin. Rating: 5 out of 5 stars
The Analytic Detective: Decipher Your Company’s Data Clues and Become Irreplaceable
Ebook by Steve Leeds. Rating: 0 out of 5 stars (0 ratings)
Access for Beginners: Access Essentials, #1
Ebook by M.L. Humphrey. Rating: 0 out of 5 stars (0 ratings)
Learn SQL Server Administration in a Month of Lunches
Ebook by Don Jones. Rating: 3 out of 5 stars
SQL Programming & Database Management For Absolute Beginners SQL Server, Structured Query Language Fundamentals: "Learn - By Doing" Approach And Master SQL
Ebook by William Sullivan. Rating: 5 out of 5 stars
Learning Oracle 12c: A PL/SQL Approach
Ebook by Sham Tickoo. Rating: 0 out of 5 stars (0 ratings)
Access 2010 All-in-One For Dummies
Ebook by Alison Barrows. Rating: 4 out of 5 stars
Learn Git in a Month of Lunches
Ebook by Rick Umali. Rating: 0 out of 5 stars (0 ratings)
Azure SQL Revealed: A Guide to the Cloud for SQL Server Professionals
Ebook by Bob Ward. Rating: 0 out of 5 stars (0 ratings)
A Concise Guide to Object Orientated Programming
Ebook by Alasdair Gilchrist. Rating: 0 out of 5 stars (0 ratings)
Getting Started with SQL Server 2014 Administration
Ebook by Gethyn Ellis. Rating: 0 out of 5 stars (0 ratings)
Python and SQLite Development
Ebook by Agus Kurniawan. Rating: 0 out of 5 stars (0 ratings)
Practical SQL
Ebook by David Perry. Rating: 4 out of 5 stars
LINUX: Beginner's Crash Course. Your Step-By-Step Guide To Learning The Linux Operating System And Command Line Easy & Fast!
Ebook by Jeremy Li. Rating: 3 out of 5 stars
SQL in 30 Pages
Ebook by U.Q. Magnusson. Rating: 4 out of 5 stars
Learning PostgreSQL
Ebook by Juba Salahaldin. Rating: 1 out of 5 stars

Related podcast episodes

Building An Internal Database As A Service Platform At Cloudflare: Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low latency and high uptime services at global scale. This is an interesting and insightful look at pragmatic engineering for reliability and scale.
Podcast episode by Data Engineering Podcast. 0 ratings
Building Linked Data Products With JSON-LD: A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. In this episode Brian Platz explains how JSON-LD can be used as a shared representation of linked data for building semantic data products.
Podcast episode by Data Engineering Podcast. 0 ratings
Addressing The Challenges Of Component Integration In Data Platform Architectures: Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. One of the core complexities that needs to be addressed is the fractal set of integrations that need to be managed across the individual components. In this episode Tobias Macey shares his thoughts on the challenges that he is facing as he prepares to build the next set of architectural layers for his data platform to enable a larger audience to start accessing the data being managed by his team.
Podcast episode by Data Engineering Podcast. 0 ratings
Harnessing Generative AI For Creating Educational Content With Illumidesk: Generative AI has unlocked a massive opportunity for content creation. There is also an unfulfilled need for experts to be able to share their knowledge and build communities. Illumidesk was built to take advantage of this intersection. In this episode Greg Werner explains how they are using generative AI as an assistive tool for creating educational material, as well as building a data driven experience for learners.
Podcast episode by Data Engineering Podcast. 0 ratings
Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine: Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of developer tools to provide useful suggestions that speed up the work of engineers. Tabnine is one of the main platforms offering an AI powered assistant for software engineers. In this episode Eran Yahav shares the journey that he has taken in building this product and the ways that it enhances the ability of humans to get their work done, and when the humans have to adapt to the tool.
Podcast episode by Data Engineering Podcast. 0 ratings
Shining Some Light In The Black Box Of PostgreSQL Performance: Databases are the core of most applications, but they are often treated as inscrutable black boxes. When an application is slow, there is a good probability that the database needs some attention. In this episode Lukas Fittl shares some hard-won wisdom about the causes and solutions of many performance bottlenecks and the work that he is doing to shine some light on PostgreSQL to make it easier to understand how to keep it running smoothly.
Podcast episode by Data Engineering Podcast. 0 ratings
Powering Vector Search With Real Time And Incremental Vector Indexes: The rapid growth of machine learning, especially large language models, has led to a commensurate growth in the need to store and compare vectors. In this episode Louis Brandy discusses the applications for vector search capabilities both in and outside of AI, as well as the challenges of maintaining real-time indexes of vector data.
Podcast episode by Data Engineering Podcast. 0 ratings
Designing Data Transfer Systems That Scale: The first step of data pipelines is to move the data to a place where you can process and prepare it for its eventual purpose. Data transfer systems are a critical component of data enablement, and building them to support large volumes of information is a complex endeavor. Andrei Tserakhau has dedicated his career to this problem, and in this episode he shares the lessons that he has learned and the work he is doing on his most recent data transfer system at DoubleCloud.
Podcast episode by Data Engineering Podcast. 0 ratings
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library: Cloud data warehouses and the introduction of the ELT paradigm have led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate overhead and bring data integration into your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today.
Podcast episode by Data Engineering Podcast. 0 ratings
Designing A Non-Relational Database Engine: Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. In this episode Oren Eini, CEO and creator of RavenDB, explores the nuances of relational vs. non-relational engines, and the strategies for designing a non-relational database.
Podcast episode by Data Engineering Podcast. 0 ratings
Tackling Real Time Streaming Data With SQL Using RisingWave: Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.
Podcast episode by Data Engineering Podcast. 0 ratings
Defining A Strategy For Your Data Products: The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable format. This has led to the broad adoption of data products being the delivery mechanism for information. In this episode Ranjith Raghunath shares his thoughts on how to build a strategy for the development, delivery, and evolution of data products.
Podcast episode by Data Engineering Podcast. 0 ratings
Surveying The Market Of Database Products: Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she has learned about how teams should approach the process of tool selection.
Podcast episode by Data Engineering Podcast. 0 ratings
An Overview Of The State Of Data Orchestration In An Increasingly Complex Data Ecosystem: Data systems are inherently complex and often require integration of multiple technologies. Orchestrators are centralized utilities that control the execution and sequencing of interdependent operations. This offers a single location for managing visibility and error handling so that data platform engineers can manage complexity. In this episode Nick Schrock, creator of Dagster, shares his perspective on the state of data orchestration technology and its application to help inform its implementation in your environment.
Podcast episode by Data Engineering Podcast. 0 ratings
Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable: Building streaming applications has gotten substantially easier over the past several years. Despite this, it is still operationally challenging to deploy and maintain your own stream processing infrastructure. Decodable was built with a mission of eliminating all of the painful aspects of developing and deploying stream processing systems for engineering teams. In this episode Eric Sammer discusses why more companies are including real-time capabilities in their products and the ways that Decodable makes it faster and easier.
Podcast episode
Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable: Building streaming applications has gotten substantially easier over the past several years. Despite this, it is still operationally challenging to deploy and maintain your own stream processing infrastructure. Decodable was built with a mission of eliminating all of the painful aspects of developing and deploying stream processing systems for engineering teams. In this episode Eric Sammer discusses why more companies are including real-time capabilities in their products and the ways that Decodable makes it faster and easier.
byData Engineering Podcast
0 ratings
0% found this document useful
Using Product Driven Development To Improve The Productivity And Effectiveness Of Your Data Teams: With all of the messaging about treating data as a product it is becoming difficult to know what that even means. Vishal Singh is the head of products at Starburst which means that he has to spend all of his time thinking and talking about the details of product thinking and its application to data. In this episode he shares his thoughts on the strategic and tactical elements of moving your work as a data professional from being task-oriented to being product-oriented and the long term improvements in your productivity that it provides.
Podcast episode
Using Product Driven Development To Improve The Productivity And Effectiveness Of Your Data Teams: With all of the messaging about treating data as a product it is becoming difficult to know what that even means. Vishal Singh is the head of products at Starburst which means that he has to spend all of his time thinking and talking about the details of product thinking and its application to data. In this episode he shares his thoughts on the strategic and tactical elements of moving your work as a data professional from being task-oriented to being product-oriented and the long term improvements in your productivity that it provides.
byData Engineering Podcast
0 ratings
0% found this document useful
Adding An Easy Mode For The Modern Data Stack With 5X: The "modern data stack" promised a scalable, composable data platform that gave everyone the flexibility to use the best tools for every job. The reality was that it left data teams in the position of spending all of their engineering effort on integrating systems that weren't designed with compatible user experiences. The team at 5X understand the pain involved and the barriers to productivity and set out to solve it by pre-integrating the best tools from each layer of the stack. In this episode founder Tarush Aggarwal explains how the realities of the modern data stack are impacting data teams and the work that they are doing to accelerate time to value.
Podcast episode
Adding An Easy Mode For The Modern Data Stack With 5X: The "modern data stack" promised a scalable, composable data platform that gave everyone the flexibility to use the best tools for every job. The reality was that it left data teams in the position of spending all of their engineering effort on integrating systems that weren't designed with compatible user experiences. The team at 5X understand the pain involved and the barriers to productivity and set out to solve it by pre-integrating the best tools from each layer of the stack. In this episode founder Tarush Aggarwal explains how the realities of the modern data stack are impacting data teams and the work that they are doing to accelerate time to value.
byData Engineering Podcast
0 ratings
0% found this document useful
Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer: Maintaining a single source of truth for your data is the biggest challenge in data engineering. Different roles and tasks in the business need their own ways to access and analyze the data in the organization. In order to enable this use case, while maintaining a single point of access, the semantic layer has evolved as a technological solution to the problem. In this episode Artyom Keydunov, creator of Cube, discusses the evolution and applications of the semantic layer as a component of your data platform, and how Cube provides speed and cost optimization for your data consumers.
Podcast episode
Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer: Maintaining a single source of truth for your data is the biggest challenge in data engineering. Different roles and tasks in the business need their own ways to access and analyze the data in the organization. In order to enable this use case, while maintaining a single point of access, the semantic layer has evolved as a technological solution to the problem. In this episode Artyom Keydunov, creator of Cube, discusses the evolution and applications of the semantic layer as a component of your data platform, and how Cube provides speed and cost optimization for your data consumers.
byData Engineering Podcast
0 ratings
0% found this document useful
Find Out About The Technology Behind The Latest PFAD In Analytical Database Development: Building a database engine requires a substantial amount of engineering effort and time investment. Over the decades of research and development into building these software systems there are a number of common components that are shared across implementations. When Paul Dix decided to re-write the InfluxDB engine he found the Apache Arrow ecosystem ready and waiting with useful building blocks to accelerate the process. In this episode he explains how he used the combination of Apache Arrow, Flight, Datafusion, and Parquet to lay the foundation of the newest version of his time-series database.
Podcast episode
Find Out About The Technology Behind The Latest PFAD In Analytical Database Development: Building a database engine requires a substantial amount of engineering effort and time investment. Over the decades of research and development into building these software systems there are a number of common components that are shared across implementations. When Paul Dix decided to re-write the InfluxDB engine he found the Apache Arrow ecosystem ready and waiting with useful building blocks to accelerate the process. In this episode he explains how he used the combination of Apache Arrow, Flight, Datafusion, and Parquet to lay the foundation of the newest version of his time-series database.
byData Engineering Podcast
0 ratings
0% found this document useful
Making Email Better With AI At Shortwave: Generative AI has rapidly transformed everything in the technology sector. When Andrew Lee started work on Shortwave he was focused on making email more productive. When AI started gaining adoption he realized that he had even more potential for a transformative experience. In this episode he shares the technical challenges that he and his team have overcome in integrating AI into their product, as well as the benefits and features that it provides to their customers.
Podcast episode
Making Email Better With AI At Shortwave: Generative AI has rapidly transformed everything in the technology sector. When Andrew Lee started work on Shortwave he was focused on making email more productive. When AI started gaining adoption he realized that he had even more potential for a transformative experience. In this episode he shares the technical challenges that he and his team have overcome in integrating AI into their product, as well as the benefits and features that it provides to their customers.
byData Engineering Podcast
0 ratings
0% found this document useful
Version Your Data Lakehouse Like Your Software With Nessie: Data lakehouse architectures are gaining popularity due to the flexibility and cost effectiveness that they offer. The link that bridges the gap between data lake and warehouse capabilities is the catalog. The primary purpose of the catalog is to inform the query engine of what data exists and where, but the Nessie project aims to go beyond that simple utility. In this episode Alex Merced explains how the branching and merging functionality in Nessie allows you to use the same versioning semantics for your data lakehouse that you are used to from Git.
Podcast episode
Version Your Data Lakehouse Like Your Software With Nessie: Data lakehouse architectures are gaining popularity due to the flexibility and cost effectiveness that they offer. The link that bridges the gap between data lake and warehouse capabilities is the catalog. The primary purpose of the catalog is to inform the query engine of what data exists and where, but the Nessie project aims to go beyond that simple utility. In this episode Alex Merced explains how the branching and merging functionality in Nessie allows you to use the same versioning semantics for your data lakehouse that you are used to from Git.
byData Engineering Podcast
0 ratings
0% found this document useful
Building ETL Pipelines With Generative AI: Artificial intelligence applications require substantial high quality data, which is provided through ETL pipelines. Now that AI has reached the level of sophistication seen in the various generative models it is being used to build new ETL workflows. In this episode Jay Mishra shares his experiences and insights building ETL pipelines with the help of generative AI.
Podcast episode
Building ETL Pipelines With Generative AI: Artificial intelligence applications require substantial high quality data, which is provided through ETL pipelines. Now that AI has reached the level of sophistication seen in the various generative models it is being used to build new ETL workflows. In this episode Jay Mishra shares his experiences and insights building ETL pipelines with the help of generative AI.
byData Engineering Podcast
0 ratings
0% found this document useful
Reconciling The Data In Your Databases With Datafold: A significant portion of data workflows involve storing and processing information in database engines. Validating that the information is stored and processed correctly can be complex and time-consuming, especially when the source and destination speak different dialects of SQL. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data.
Podcast episode
Reconciling The Data In Your Databases With Datafold: A significant portion of data workflows involve storing and processing information in database engines. Validating that the information is stored and processed correctly can be complex and time-consuming, especially when the source and destination speak different dialects of SQL. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data.
byData Engineering Podcast
0 ratings
0% found this document useful
The Cloudcast #342 - Understanding Databases in AWS
Podcast episode
The Cloudcast #342 - Understanding Databases in AWS
byThe Cloudcast
0 ratings
0% found this document useful
Troubleshooting Kafka In Production: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it at scale, however, is notoriously challenging. Elad Eldor has experienced these challenges first-hand, leading to his work writing the book "Kafka: Troubleshooting in Production". In this episode he highlights the sources of complexity that contribute to Kafka's operational difficulties, and some of the main ways to identify and mitigate potential sources of trouble.
Podcast episode
Troubleshooting Kafka In Production: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it at scale, however, is notoriously challenging. Elad Eldor has experienced these challenges first-hand, leading to his work writing the book "Kafka: Troubleshooting in Production". In this episode he highlights the sources of complexity that contribute to Kafka's operational difficulties, and some of the main ways to identify and mitigate potential sources of trouble.
byData Engineering Podcast
0 ratings
0% found this document useful
Using Data To Illuminate The Intentionally Opaque Insurance Industry: The insurance industry is notoriously opaque and hard to navigate. Max Cho found that fact frustrating enough that he decided to build a business of making policy selection more navigable. In this episode he shares his journey of data collection and analysis and the challenges of automating an intentionally manual industry.
Podcast episode
Using Data To Illuminate The Intentionally Opaque Insurance Industry: The insurance industry is notoriously opaque and hard to navigate. Max Cho found that fact frustrating enough that he decided to build a business of making policy selection more navigable. In this episode he shares his journey of data collection and analysis and the challenges of automating an intentionally manual industry.
byData Engineering Podcast
0 ratings
0% found this document useful
Simple And Scalable Encryption Of Data In Use For Analytics And Machine Learning With Opaque Systems: Encryption and security are critical elements in data analytics and machine learning applications. We have well developed protocols and practices around data that is at rest and in motion, but security around data in use is still severely lacking. Recognizing this shortcoming and the capabilities that could be unlocked by a robust solution Rishabh Poddar helped to create Opaque Systems as an outgrowth of his PhD studies. In this episode he shares the work that he and his team have done to simplify integration of secure enclaves and trusted computing environments into analytical workflows and how you can start using it without re-engineering your existing systems.
Podcast episode
Simple And Scalable Encryption Of Data In Use For Analytics And Machine Learning With Opaque Systems: Encryption and security are critical elements in data analytics and machine learning applications. We have well developed protocols and practices around data that is at rest and in motion, but security around data in use is still severely lacking. Recognizing this shortcoming and the capabilities that could be unlocked by a robust solution Rishabh Poddar helped to create Opaque Systems as an outgrowth of his PhD studies. In this episode he shares the work that he and his team have done to simplify integration of secure enclaves and trusted computing environments into analytical workflows and how you can start using it without re-engineering your existing systems.
byData Engineering Podcast
0 ratings
0% found this document useful
Automate Your Pipeline Creation For Streaming Data Transformations With SQLake: Managing end-to-end data flows becomes complex and unwieldy as the scale of data and its variety of applications in an organization grows. Part of this complexity is due to the transformation and orchestration of data living in disparate systems. The team at Upsolver is taking aim at this problem with the latest iteration of their platform in the form of SQLake. In this episode Ori Rafael explains how they are automating the creation and scheduling of orchestration flows and their related transforations in a unified SQL interface.
Podcast episode
Automate Your Pipeline Creation For Streaming Data Transformations With SQLake: Managing end-to-end data flows becomes complex and unwieldy as the scale of data and its variety of applications in an organization grows. Part of this complexity is due to the transformation and orchestration of data living in disparate systems. The team at Upsolver is taking aim at this problem with the latest iteration of their platform in the form of SQLake. In this episode Ori Rafael explains how they are automating the creation and scheduling of orchestration flows and their related transforations in a unified SQL interface.
byData Engineering Podcast
0 ratings
0% found this document useful
How Data Platforms Affect ML & AI // Jake Watson // #207
Podcast episode
How Data Platforms Affect ML & AI // Jake Watson // #207
byMLOps.community
0 ratings
0% found this document useful
Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary: Working with data is a complicated process, with numerous chances for something to go wrong. Identifying and accounting for those errors is a critical piece of building trust in the organization that your data is accurate and up to date. While there are numerous products available to provide that visibility, they all have different technologies and workflows that they focus on. To bring observability to dbt projects the team at Elementary embedded themselves into the workflow. In this episode Maayan Salom explores the approach that she has taken to bring observability, enhanced testing capabilities, and anomaly detection into every step of the dbt developer experience.
Podcast episode
Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary: Working with data is a complicated process, with numerous chances for something to go wrong. Identifying and accounting for those errors is a critical piece of building trust in the organization that your data is accurate and up to date. While there are numerous products available to provide that visibility, they all have different technologies and workflows that they focus on. To bring observability to dbt projects the team at Elementary embedded themselves into the workflow. In this episode Maayan Salom explores the approach that she has taken to bring observability, enhanced testing capabilities, and anomaly detection into every step of the dbt developer experience.
byData Engineering Podcast
0 ratings
0% found this document useful

Mastering SQL Server 2017

Build smart and efficient database applications for your organization with SQL Server 2017

Miloš Radivojević

Dejan Sarka

William Durkin

Christian Coté

Matija Lah

BIRMINGHAM - MUMBAI

Mastering SQL Server 2017

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First Published: August 2019

Production Reference: 1140819

Published by Packt Publishing Ltd.

Livery Place, 35 Livery Street

Birmingham, B3 2PB, U.K.

ISBN 978-1-83898-320-8

www.packtpub.com

Contributors

About the Authors

Miloš Radivojević is a database consultant in Vienna, Austria. He is a Data Platform MVP and specializes in SQL Server for application developers and performance/query tuning. Currently, he works as a principal database consultant at Bwin (GVC Holdings), the largest regulated online gaming company in the world. Miloš is a co-founder of PASS Austria. He is also a speaker at international conferences and speaks regularly at SQL Saturday events and PASS Austria meetings.

Dejan Sarka, MCT and Microsoft Data Platform MVP, is an independent trainer and consultant who focuses on the development of database and business intelligence applications. Besides projects, he spends about half his time on training and mentoring. He is the founder of the Slovenian SQL Server and .NET Users Group. He is the main author or co-author of many books about databases and SQL Server. The last three books before this one were published by Packt, and their titles were SQL Server 2016 Developer's Guide, SQL Server 2017 Integration Services Cookbook, and SQL Server 2017 Developer's Guide. Dejan Sarka has also developed many courses and seminars for Microsoft, SolidQ, and Pluralsight.

William Durkin is a data platform architect for Data Masterminds. He uses his decade of experience with SQL Server to help multinational corporations achieve their data management goals. Born in the UK and now based in Germany, William is a regular speaker at conferences around the globe, a Data Platform MVP, and the founder of the popular SQLGrillen event.

Christian Coté has been in IT for more than 12 years. He is an MS-certified technical specialist in business intelligence (MCTS-BI). For about 10 years, he has been a consultant in ETL/BI projects. His ETL projects have used various ETL tools and plain code with various RDBMSes (such as Oracle and SQL Server). He is currently working on his sixth SSIS implementation in 4 years.

Matija Lah has more than 15 years of experience working with Microsoft SQL Server, mostly architecting data-centric solutions in the legal domain. His contributions to the SQL Server community led to the Microsoft Most Valuable Professional award in 2007 (Data Platform). He spends most of his time on projects involving advanced information management and natural language processing, but often finds time to speak at events related to Microsoft SQL Server, where he loves to share his experience with the SQL Server platform.

Packt Is Searching for Authors Like You

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why Subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Mapt is fully searchable

Copy and paste, print, and bookmark content

Packt.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com, and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at customercare@packtpub.com for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Title Page

Copyright

Mastering SQL Server 2017

Contributors

About the Authors

Packt Is Searching for Authors Like You

About Packt

Why Subscribe?

Packt.com

Preface

Who This Book Is For

What This Book Covers

To Get the Most out of This Book

Download the Example Code Files

Download the color images

Conventions Used

Get in Touch

Reviews

Introduction to SQL Server 2017

Security

Row-Level Security

Dynamic data masking

Always Encrypted

Engine features

Query Store

Live query statistics

Stretch Database

Database scoped configuration

Temporal Tables

Columnstore indexes

Containers and SQL Server on Linux

Programming

Transact-SQL enhancements

JSON

In-Memory OLTP

SQL Server Tools

Business intelligence

R in SQL server

Release cycles

Summary

SQL Server Tools

Installing and updating SQL Server Tools

New SSMS features and enhancements

Autosave open tabs

Searchable options

Enhanced scroll bar

Execution plan comparison

Live query statistics

Importing flat file Wizard

Vulnerability assessment

SQL Server Data Tools

Tools for developing R and Python code

RStudio IDE

R Tools for Visual Studio 2015

Setting up Visual Studio 2017 for data science applications

Summary

JSON Support in SQL Server

Why JSON?

What is JSON?

Why is it popular?

JSON versus XML

JSON objects

JSON object

JSON array

Primitive JSON data types

JSON in SQL Server prior to SQL Server 2016

JSON4SQL

JSON.SQL

Transact-SQL-based solution

Retrieving SQL Server data in JSON format

FOR JSON AUTO

FOR JSON PATH

FOR JSON additional options

Add a root node to JSON output

Include NULL values in the JSON output

Formatting a JSON output as a single object

Converting data types

Escaping characters

Converting JSON data in a tabular format

OPENJSON with the default schema

Processing data from a comma-separated list of values

Returning the difference between two table rows

OPENJSON with an explicit schema

Import the JSON data from a file

JSON storage in SQL Server 2017

Validating JSON data

Extracting values from a JSON text

JSON_VALUE

JSON_QUERY

Modifying JSON data

Adding a new JSON property

Updating the value for a JSON property

Removing a JSON property

Multiple changes

Performance considerations

Indexes on computed columns

Full-text indexes

Summary

Stretch Database

Stretch DB architecture

Is this for you?

Using Data Migration Assistant

Limitations of using Stretch Database

Limitations that prevent you from enabling the Stretch DB features for a table

Table limitations

Column limitations

Limitations for Stretch-enabled tables

Use cases for Stretch Database

Archiving of historical data

Archiving of logging tables

Testing Azure SQL database

Enabling Stretch Database

Enabling Stretch Database at the database level

Enabling Stretch Database by using wizard

Enabling Stretch Database by using Transact-SQL

Enabling Stretch Database for a table

Enabling Stretch DB for a table by using wizard

Enabling Stretch Database for a table by using Transact-SQL

Filter predicate with sliding window

Querying stretch databases

Querying and updating remote data

SQL Server Stretch Database pricing

Stretch DB management and troubleshooting

Monitoring Stretch Databases

Pause and resume data migration

Disabling Stretch Database

Disable Stretch Database for tables by using SSMS

Disabling Stretch Database for tables using Transact-SQL

Disabling Stretch Database for a database

Backing up and restoring Stretch-enabled databases

Summary

Temporal Tables

What is temporal data?

Types of temporal tables

Allen's interval algebra

Temporal constraints

Temporal data in SQL Server before 2016

Optimizing temporal queries

Temporal features in SQL:2011

System-versioned temporal tables in SQL Server 2017

How temporal tables work in SQL Server 2017

Creating temporal tables

Period columns as hidden attributes

Converting non-temporal tables to temporal tables

Migrating an existing temporal solution to system-versioned tables

Altering temporal tables

Dropping temporal tables

Data manipulation in temporal tables

Inserting data in temporal tables

Updating data in temporal tables

Deleting data in temporal tables

Querying temporal data in SQL Server 2017

Retrieving temporal data at a specific point in time

Retrieving temporal data from a specific period

Retrieving all temporal data

Performance and storage considerations with temporal tables

History retention policy in SQL Server 2017

Configuring the retention policy at the database level

Configuring the retention policy at the table level

Custom history data retention

History table implementation

History table overhead

Temporal tables with memory-optimized tables

What is missing in SQL Server 2017?

SQL Server 2016 and 2017 temporal tables and data warehouses

Summary

Columnstore Indexes

Analytical queries in SQL Server

Joins and indexes

Benefits of clustered indexes

Leveraging table partitioning

Nonclustered indexes in analytical scenarios

Using indexed views

Data compression and query techniques

Writing efficient queries

Columnar storage and batch processing

Columnar storage and compression

Recreating rows from columnar storage

Columnar storage creation process

Development of columnar storage in SQL Server

Batch processing

Nonclustered columnstore indexes

Compression and query performance

Testing the nonclustered columnstore index

Operational analytics

Clustered columnstore indexes

Compression and query performance

Testing the clustered columnstore index

Using archive compression

Adding B-tree indexes and constraints

Updating a clustered columnstore index

Deleting from a clustered columnstore index

Summary

SSIS Setup

Introduction

SQL Server 2016 download

Getting ready

How to do it...

Installing JRE for PolyBase

Getting ready

How to do it...

How it works...

Installing SQL Server 2016

Getting ready

How to do it...

SQL Server Management Studio installation

Getting ready

How to do it...

SQL Server Data Tools installation

Getting ready

How to do it...

Testing SQL Server connectivity

Getting ready

How to do it...

What Is New in SSIS 2016

Introduction

Creating SSIS Catalog

Getting ready

How to do it...

Custom logging

Getting ready

How to do it...

How it works...

There's more...

Create a database

Create a simple project

Testing the custom logging level

See also

Azure tasks and transforms

Getting ready

How to do it...

See also

Incremental package deployment

Getting ready

How to do it...

There's more...

Multiple version support

Getting ready

How to do it...

There's more...

Error column name

Getting ready

How to do it...

Control Flow templates

Getting ready

How to do it...

Key Components of a Modern ETL Solution

Introduction

Installing the sample solution

Getting ready

How to do it...

There's more...

Deploying the source database with its data

Getting ready

How to do it...

There's more...

Deploying the target database

Getting ready

How to do it...

SSIS projects

Getting ready

How to do it...

Framework calls in EP_Staging.dtsx

Getting ready

How to do it...

There's more...

Dealing with Data Quality

Introduction

Profiling data with SSIS

Getting ready

How to do it...

Creating a DQS knowledge base

Getting ready

How to do it...

Data cleansing with DQS

Getting ready

How to do it...

Creating a MDS model

Getting ready

How to do it...

Matching with DQS

Getting ready

How to do it...

Using SSIS fuzzy components

Getting ready

How to do it...

Unleash the Power of SSIS Script Task and Component

Introduction

Using variables in SSIS Script task

Getting ready

How to do it...

Execute complex filesystem operations with the Script task

Getting ready

How to do it...

Reading data profiling XML results with the Script task

Getting ready

How to do it...

Correcting data with the Script component

Getting ready

How to do it...

Validating data using regular expressions in a Script component

Getting ready

How to do it...

Using the Script component as a source

How to do it...

How it works...

Using the Script component as a destination

Getting ready

How to do it...

How it works...

On-Premises and Azure Big Data Integration

Introduction

Azure Blob storage data management

Getting ready

How to do it...

Installing a Hortonworks cluster

Getting ready

How to do it...

Copying data to an on-premises cluster

Getting ready

How to do it...

Using Hive – creating a database

Getting ready

How to do it...

There's more...

Transforming the data with Hive

Getting ready

How to do it...

There's more...

Transferring data between Hadoop and Azure

Getting ready

How to do it...

Leveraging a HDInsight big data cluster

Getting ready

How to do it...

There's more...

Managing data with Pig Latin

Getting ready

How to do it...

There's more...

Importing Azure Blob storage data

Getting ready

How to do it...

There's more...

Azure Data Factory and SSIS

Extending SSIS Custom Tasks and Transformations

Introduction

Designing a custom task

Getting ready

How to do it...

How it works...

Designing a custom transformation

How to do it...

How it works...

Managing custom component versions

Getting ready

How to do it...

How it works...

Scale Out with SSIS 2017

Introduction

SQL Server 2017 download and setup

Getting ready

How to do it...

There's more...

SQL Server client tools setup

Getting ready

How to do it...

Configuring SSIS for scale out executions

Getting ready

How to do it...

There's more...

Executing a package using scale out functionality

Getting ready

How to do it...

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

Mastering SQL Server 2017 brings in the power of R and Python for machine learning and containerization-based deployment on Windows and Linux. By knowing how to use the features of SQL Server 2017 to your advantage, you can build scalable applications and easily perform data integration and transformation.

After a quick recap of the features of SQL Server 2017, this Learning Path shows you how to use Query Store, columnstore indexes, and In-Memory OLTP in your applications. You'll then learn to integrate Python code in SQL Server and graph database implementations for development and testing. Next, you'll learn how to design and build SQL Server Integration Services (SSIS) data warehouse packages using SQL Server Data Tools. You'll also learn to develop SSIS packages designed to maintain a data warehouse using data flow and other control flow tasks.

By the end of this Learning Path, you'll have the required information to easily design efficient, high-performance database applications. You'll also have explored on-premises big data integration processes to create a classic data warehouse.

This Learning Path includes content from the following Packt products:

SQL Server 2017 Developer's Guide by Miloš Radivojević, Dejan Sarka, William Durkin

SQL Server 2017 Integration Services Cookbook by Christian Coté, Dejan Sarka, Matija Lah

Who This Book Is For

Database developers and solution architects looking to develop ETL solutions with SSIS, and who want to learn the new features and capabilities in SSIS 2017, will find this Learning Path very useful. It will also be valuable to advanced analysis practitioners, business intelligence developers, and database consultants dealing with performance tuning. Some basic understanding of database concepts and T-SQL is required to get the best out of this Learning Path.

What This Book Covers

Chapter 1, Introduction to SQL Server 2017, very briefly covers the most important features and enhancements, not just those for developers. The chapter shows the whole picture and points readers in the direction of where things are moving.

Chapter 2, SQL Server Tools, helps you understand the changes in the release management of SQL Server tools and explores small and handy enhancements in SQL Server Management Studio (SSMS). It also introduces RStudio IDE, a very popular tool for developing R code, and briefly covers SQL Server Data Tools (SSDT), including the new R Tools for Visual Studio (RTVS), a plugin for Visual Studio, which enables you to develop R code in an IDE that is popular among developers using Microsoft products and languages. The chapter introduces Visual Studio 2017 and shows how it can be used for data science applications with Python.

Chapter 3, JSON Support in SQL Server, explores the JSON support built into SQL Server. This support should make it easier for applications to exchange JSON data with SQL Server.

Chapter 4, Stretch Database, helps you understand how to migrate historical or infrequently accessed data transparently and securely to Microsoft Azure using the Stretch Database (Stretch DB) feature.

Chapter 5, Temporal Tables, introduces support for system-versioned temporal tables based on the SQL:2011 standard. We explain how this is implemented in SQL Server and demonstrate some use cases for it (for example, a time-travel application).

Chapter 6, Columnstore Indexes, revises columnar storage and then explores the huge improvements relating to columnstore indexes in SQL Server 2016: updatable non-clustered columnstore indexes, columnstore indexes on in-memory tables, and many other new features for operational analytics.

Chapter 7, SSIS Setup, contains recipes describing the step by step setup of SQL Server 2016 to get the features that are used in the book.

Chapter 8, What Is New in SSIS 2016, contains recipes that talk about the evolution of SSIS over time and what's new in SSIS 2016. This chapter is a detailed overview of the new features in Integration Services 2016.

Chapter 9, Key Components of a Modern ETL Solution, explains how ETL has evolved over the past few years and will explain what components are necessary to get a modern scalable ETL solution that fits the modern data warehouse. This chapter will also describe what each catalog view provides and will help you learn how you can use some of them to archive SSIS execution statistics.

Chapter 10, Dealing with Data Quality, focuses on how SSIS can be leveraged to validate and load data. You will learn how to identify invalid data, cleanse data and load valid data to the data warehouse.

Chapter 11, Unleash the Power of SSIS Script Task and Component, covers how to use scripting with SSIS. You will learn how script tasks and script components are very valuable in many situations to overcome the limitations of stock toolbox tasks and transforms.

Chapter 12, On-Premises and Azure Big Data Integration, describes the Azure feature pack that allows SSIS to integrate Azure data from blob storage and HDInsight clusters. You will learn how to use Azure feature pack components to add flexibility to your SSIS solution architecture and how on-premises Big Data can be manipulated via SSIS.

Chapter 13, Extending SSIS Tasks and Transformations, talks about extending and customizing the toolbox using custom-developed tasks and transforms and security features. You will learn the pros and cons of creating custom tasks to extend the SSIS toolbox and secure your deployment.

Chapter 14, Scale Out with SSIS 2017, talks about scaling out SSIS package executions on multiple servers. You will learn how SSIS 2017 can scale out to multiple workers to enhance execution scalability.

To Get the Most out of This Book

In order to run all of the demo code in this book, you will need SQL Server 2017 Developer or Enterprise Edition. In addition, you will extensively use SQL Server Management Studio.

You will also need the RStudio IDE and/or SQL Server Data Tools with the R Tools for Visual Studio plug-in.

Other tools you may need are Visual Studio 2015, SQL Server Data Tools 16 or later, and SQL Server Management Studio 17 or later.

In addition, you will need the Hortonworks Sandbox (Docker for Windows) and a Microsoft Azure account.

Download the Example Code Files

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:

Hover the mouse pointer on the SUPPORT tab at the top.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box.

Select the book for which you're looking to download the code files.

Choose from the drop-down menu where you purchased this book from.

Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-SQL-Server-2017-. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781838983208_ColorImages.pdf.

Conventions Used

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning. Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: The last characters CI and AS are for case insensitive and accent sensitive, respectively. A block of code is set as follows:

USE DQS_STAGING_DATA;

SELECT CustomerKey, FullName, StreetAddress, City, StateProvince, CountryRegion, EmailAddress, BirthDate, Occupation;

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: The simplest query to retrieve the data that you can write includes the SELECT and the FROM clauses. In the SELECT clause, you can use the star character (*), literally SELECT *, to denote that you need all columns from a table in the result set.

A block of code is set as follows:

USE WideWorldImportersDW;

SELECT *

FROM Dimension.Customer;

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

USE WideWorldImporters;

CREATE TABLE dbo.Product

(

ProductId INT NOT NULL CONSTRAINT PK_Product PRIMARY KEY,

ProductName NVARCHAR(50) NOT NULL,

Price MONEY NOT NULL,

ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,

ValidTo DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,

PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)

)

WITH (SYSTEM_VERSIONING = ON);

Any command-line input or output is written as follows:

Customer SaleKey Quantity

------------------------------ -------- -----------

Tailspin Toys (Aceitunas, PR) 36964 288

Tailspin Toys (Aceitunas, PR) 126253 250

Tailspin Toys (Aceitunas, PR) 79272 250

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: Go to Tools | Options and you are then able to type your search string in the textbox in the top-left of the Options window.

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.

Get in Touch

Feedback from our readers is always welcome.

General feedback: Email feedback@packtpub.com and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at questions@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packtpub.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com

Introduction to SQL Server 2017

SQL Server is the main relational database management system product from Microsoft. It has been around in one form or another since the late 80s (developed in partnership with Sybase), but as a standalone Microsoft product, it's been here since the early 90s. In the last 20 years, SQL Server has changed and evolved, gaining newer features and functionality along the way.

The SQL Server we know today is based on what was arguably the most significant (r)evolutionary step in its history: the release of SQL Server 2005. The changes that were introduced allowed the versions that followed the 2005 release to take advantage of newer hardware and software improvements, such as 64-bit memory architecture, better multi-CPU and multi-core support, better alignment with the .NET Framework, and many more modernizations in general system architecture.

The incremental changes introduced in each subsequent version of SQL Server have continued to improve upon this solid new foundation. Fortunately, Microsoft has changed the release cycle for multiple products, including SQL Server, resulting in shorter time frames between releases. This has, in part, been due to Microsoft's focus on its much-reported Mobile first, Cloud first strategy. This strategy, together with the development of Azure SQL Database, the cloud version of SQL Server, has forced Microsoft into a drastically shorter release cycle. The advantage of this strategy is that we are no longer required to wait 3 to 5 years for a new release (and new features). There have been releases every 2 years since SQL Server 2012 was introduced, with multiple releases of Azure SQL Database in between the real versions.

While we can be pleased that we no longer need to wait for new releases, we are also at a distinct disadvantage. The rapid release of new versions and features leaves us developers with ever-decreasing periods of time to get to grips with the shiny new features. Prior versions had multiple years between releases, allowing us to build up a deeper knowledge and understanding of the available features, before having to consume new information.

SQL Server 2017 followed barely a year after the release of SQL Server 2016. Many of its features are more polished or updated versions of those in the 2016 release, but there are also some notable additions.

In this chapter (and book), we will introduce what is new inside SQL Server 2017. Due to the short release cycle, we will outline features that are brand new in this release of the product and look at features that have been extended or improved upon since SQL Server 2016.

We will be outlining the new features in the following areas:

Security

Engine features

Programming

Business intelligence

Security

The last few years have made the importance of security in IT extremely apparent, particularly when we consider the repercussions of the Edward Snowden data leaks or multiple cases of data theft via hacking. While no system is completely impenetrable, we should always be considering how we can improve the security of the systems we build. These considerations are wide ranging and sometimes even dictated via rules, regulations, and laws. Microsoft has responded to the increased focus on security by delivering new features to assist developers and DBAs in their search for more secure systems.

Row-Level Security

The first technology that was introduced in SQL Server 2016 to address the need for increased/improved security is Row-Level Security (RLS). RLS provides the ability to control access to rows in a table based on the user executing a query. With RLS it is possible to implement a filtering mechanism on any table in a database, completely transparently to any external application or direct T-SQL access.

The ability to implement such filtering without having to redesign a data access layer allows system administrators to control access to data at an even more granular level than before. The fact that this control can be achieved without any application logic redesign makes this feature potentially even more attractive to certain use-cases. RLS also makes it possible, in conjunction with the necessary auditing features, to lock down a SQL Server database so that even the traditional god-mode sysadmin cannot access the underlying data.
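The filtering mechanism described above can be sketched in T-SQL as follows. This is a minimal illustration, not the book's own example: the Security schema, the dbo.Sales table, its SalesRep column, and the object names are all hypothetical, and the predicate simply compares a column to the current database user.

```sql
-- Hypothetical demo table; SalesRep holds the database user who owns each row.
CREATE TABLE dbo.Sales
(
    OrderId  INT     NOT NULL PRIMARY KEY,
    SalesRep sysname NOT NULL,
    Amount   MONEY   NOT NULL
);
GO

-- Hypothetical schema that keeps the security objects apart from the data.
CREATE SCHEMA Security;
GO

-- Inline table-valued function used as the filter predicate:
-- it returns a row (access granted) only when SalesRep matches the current user.
CREATE FUNCTION Security.fn_SalesRepPredicate (@SalesRep AS sysname)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS fn_result
    WHERE @SalesRep = USER_NAME();
GO

-- Bind the predicate to the table. From now on, every SELECT, UPDATE, and DELETE
-- against dbo.Sales is silently filtered, with no change to the calling application.
CREATE SECURITY POLICY Security.SalesFilter
ADD FILTER PREDICATE Security.fn_SalesRepPredicate(SalesRep)
ON dbo.Sales
WITH (STATE = ON);
GO
```

With the policy switched on, a user who runs SELECT * FROM dbo.Sales sees only the rows whose SalesRep value equals that user's name. GO is the batch separator used by SSMS and sqlcmd rather than a T-SQL statement.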

Dynamic data masking

The second security feature that we will be covering is Dynamic Data Masking (DDM). DDM allows the system administrator to define column-level data masking algorithms that prevent users from reading the contents of columns, while still being able to query the rows themselves. This feature was initially aimed at allowing developers to work with a copy of production data without being able to see the underlying data. This can be particularly useful in environments where data protection laws are enforced (for example, credit card processing systems and medical record storage). Data masking occurs only at query runtime and does not affect the stored data of a table. This means that it is possible to mask a multi-terabyte database through a simple DDL statement, rather than resorting to the previous solution of physically masking the underlying data in the table we want to mask. The current implementation of DDM provides a fixed set of masking functions that can be applied to the columns of a table and that mask the data when the table is queried. If a user has permission to view the unmasked data, the masking functions are not run, whereas a user without that permission sees the data as rendered by the defined masking functions.
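As a rough sketch of what such a DDL-based mask looks like, the following T-SQL defines masks at table creation time and adds one to an existing column. The table dbo.CustomerContact, its columns, and the ReportingAdmin user are hypothetical names chosen for illustration; the masking functions used (default, email, partial) are among the built-in ones.

```sql
-- Hypothetical table with masks declared directly on two columns.
CREATE TABLE dbo.CustomerContact
(
    CustomerId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    FullName   NVARCHAR(100) NOT NULL,
    Email      NVARCHAR(100) MASKED WITH (FUNCTION = 'email()') NULL,
    CardNumber VARCHAR(19)
        MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)') NULL
);
GO

-- Adding a mask to an existing column is a metadata change only;
-- the stored values are not rewritten.
ALTER TABLE dbo.CustomerContact
ALTER COLUMN FullName ADD MASKED WITH (FUNCTION = 'default()');
GO

-- Hypothetical user who is allowed to see the original values.
CREATE USER ReportingAdmin WITHOUT LOGIN;
GRANT UNMASK TO ReportingAdmin;
GO
```

A user holding only SELECT permission on the table would receive masked values for FullName, Email, and CardNumber, while ReportingAdmin would see the stored data unchanged.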

Always Encrypted

The third major security feature to be introduced in SQL Server 2016 is Always Encrypted. Encryption with SQL Server was previously a (mainly) server-based solution. Databases were either protected with encryption at the database level (the entire database was encrypted) or at the column level (single columns had an encryption algorithm defined). While this encryption was/is fully functional and safe, crucial portions of the encryption process (for example, encryption certificates) are stored inside SQL Server. This effectively gave the owner of a SQL Server instance the ability to potentially gain access to this encrypted data—if not directly, there was at least an increased surface area for a potential malicious access attempt. As ever more companies moved into hosted service and cloud solutions (for example, Microsoft Azure), the previous encryption solutions no longer provided the required level of control/security.

Always Encrypted was designed to bridge this security gap by removing the ability of an instance owner to gain access to the encryption components. The entirety of the encryption process was moved outside of SQL Server and resides on the client side. While a similar effect was possible using homebrew solutions, Always Encrypted provides an encryption suite that is fully integrated into both the .NET Framework and SQL Server. Whenever data is defined as requiring encryption, the data is encrypted within the .NET Framework and only sent to SQL Server after encryption has occurred. This means that a malicious user (or even system administrator) will only ever be able to access encrypted information should they attempt to query data stored via Always Encrypted.
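On the server side, a column protected by Always Encrypted is declared with an ENCRYPTED WITH clause. The sketch below assumes that a column encryption key named CEK_Demo already exists in the database (for example, created through the SSMS wizard, which also provisions the column master key on the client side); the key name, the table dbo.Patient, and its columns are hypothetical, and the statement will fail if the key is missing.

```sql
-- Assumes a column encryption key called CEK_Demo has already been created.
CREATE TABLE dbo.Patient
(
    PatientId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,

    -- Deterministic encryption allows equality lookups on the encrypted value;
    -- character columns encrypted this way need a BIN2 collation.
    SSN CHAR(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (
            COLUMN_ENCRYPTION_KEY = CEK_Demo,
            ENCRYPTION_TYPE = DETERMINISTIC,
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        ) NOT NULL,

    -- Randomized encryption is harder to analyze but cannot be searched or joined on.
    BirthDate DATE
        ENCRYPTED WITH (
            COLUMN_ENCRYPTION_KEY = CEK_Demo,
            ENCRYPTION_TYPE = RANDOMIZED,
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        ) NOT NULL
);
GO
```

A client that connects without the column encryption setting enabled, or without access to the column master key, would only ever read ciphertext from the SSN and BirthDate columns.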

Microsoft has made some positive progress in this area of the product. While no system is completely safe and no single feature can provide an all-encompassing solution, all three features provide a further option in building up, or improving upon, any system's current security level.

Engine features

The Engine features section is traditionally the most important, or interesting, for most DBAs or system administrators when a new version of SQL Server is released. However, there are also numerous engine feature improvements that have tangential meanings for developers too. So, if you are a developer, don't skip this section—or you may miss some improvements that could save you some trouble later on!

Query Store

The Query Store is possibly the biggest new engine feature to come with the release of SQL Server 2016. DBAs and developers should be more than familiar with the situation of a query that behaves reliably for a long period and then suddenly turns into a slow-running, resource-killing monster. Some readers may identify the cause of the issue as parameter sniffing or stale statistics. Either way, when troubleshooting to find out why one unchanging query suddenly becomes slow, knowing the query execution plan(s) that SQL Server has created and used can be very helpful. A major issue when investigating these types of problems is the transient nature of query plans and their execution statistics. This is where Query Store comes into play; SQL Server collects and permanently stores information on query compilation and execution on a per-database basis. This information is then persisted inside each database that is being monitored by the Query Store functionality, allowing a DBA or developer to investigate performance issues after the fact.

It is even possible to perform longer-term query analysis, providing an insight into how query execution plans change over a longer time frame. This sort of insight was previously only possible via handwritten solutions or third-party monitoring solutions, which may still not allow the same insights as the Query Store does.
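A minimal sketch of how Query Store might be switched on and queried is shown here. It uses the WideWorldImporters sample database only as an example target; the choice of columns and the TOP (10) cut-off are illustrative assumptions rather than a recommended report.

```sql
-- Enable Query Store for one database and let it both collect and serve data.
ALTER DATABASE WideWorldImporters SET QUERY_STORE = ON;
ALTER DATABASE WideWorldImporters SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

USE WideWorldImporters;
GO

-- List the ten query/plan combinations with the highest average duration.
-- avg_duration is reported in microseconds.
SELECT TOP (10)
    q.query_id,
    qt.query_sql_text,
    p.plan_id,
    rs.count_executions,
    rs.avg_duration,
    rs.last_execution_time
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON rs.plan_id = p.plan_id
ORDER BY rs.avg_duration DESC;
```

Because the catalog views keep one row per plan and runtime interval, a query_id that appears with more than one plan_id is a natural starting point when looking for a plan change behind a sudden slowdown.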

Live query statistics

When we are developing inside SQL Server, each developer creates a mental model of how data flows inside SQL Server. Microsoft has provided a multitude of ways to display this concept when working with query execution. The most obvious visual aid is the graphical execution plan. There are endless explanations in books, articles, and training seminars that attempt to make reading these graphical representations easier. Depending upon how your mind works, these descriptions can help or hinder your ability to understand the data flow concepts—fully blocking iterators, pipeline iterators, semi-blocking iterators, nested loop joins... the list goes on. When we look at an actual graphical execution plan, we are seeing a representation of how SQL Server processed a query: which data retrieval methods were used, which join types were chosen to join multiple data sets, what sorting was required, and so on. However, this is a representation after the query has completed execution. Live Query Statistics offers us the ability to observe during query execution and identify how, when, and where data moves through the query plan. This live representation is a huge improvement in making the concepts behind query execution clearer and is a great tool to allow developers to better design their query and index strategies to improve query performance.

Further details of Live Query Statistics can be found in Chapter 2, SQL Server Tools.
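Outside of the graphical view in SSMS, similar in-flight information can be inspected from T-SQL through the sys.dm_exec_query_profiles dynamic management view. The sketch below assumes that the monitored query runs in another session with profiling enabled (for example, after SET STATISTICS PROFILE ON in that session); the session ID 53 is a hypothetical placeholder.

```sql
-- Hypothetical session ID of the long-running query being observed.
DECLARE @SessionId INT = 53;

-- One row per plan operator (and thread): rows processed so far versus the estimate.
SELECT
    node_id,
    physical_operator_name,
    row_count,
    estimate_row_count,
    CAST(100.0 * row_count / NULLIF(estimate_row_count, 0) AS DECIMAL(18, 2))
        AS percent_of_estimate
FROM sys.dm_exec_query_profiles
WHERE session_id = @SessionId
ORDER BY node_id;
```

Running the SELECT repeatedly while the monitored query executes shows row_count growing per operator, which is a text-only approximation of what the live plan display conveys.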

Stretch Database

Microsoft has worked a lot in the past few years on their
