📄 Page
1
(This page has no text content)
📄 Page
2
1. Preface a. Why I Wrote This Book b. Who This Book Is For c. How This Book Is Organized d. Acknowledgments e. Conventions Used in This Book f. Using Code Examples g. O’Reilly Online Learning h. How to Contact Us 2. 1. Why Self-Service Data Prep? a. A Short History of Self-Service Data Visualization b. Accessing the “Right Data” c. The Self-Service Data Preparation Opportunity d. Tableau Prep Up and Running e. Summary 3. I. Getting Started 4. 2. Getting Started with Tableau Prep Builder a. Where to Get Tableau Prep Builder b. How to Get a License for Prep Builder c. The Tableau Prep Builder Screen d. Basic Steps of Data Preparation
📄 Page
3
i. Input Step ii. Clean Step iii. Output Step iv. Saving a Flow e. Summary 5. 3. Planning Your Prep a. Stage 1: Know Your Data b. Stage 2: Identify the Desired State c. Stage 3: Determine the Required Transitions from KYD to the Desired State d. Stage 4: Build the Workflow e. Summary 6. 4. Shaping Data a. What to Look for in Incoming Data Sets b. What Shape Is Best for Analysis in Tableau? c. Changing Data Set Structures in Prep Builder i. Pivot ii. Aggregate iii. Join iv. Union d. Applying Restructuring Techniques to the Ice Cream Example i. Step 1: Pivot Columns to Rows
📄 Page
4
ii. Step 2: Pivot Rows to Columns e. Summary 7. 5. Connecting to Data in Files a. Files Upon Files Upon Files i. Spreadsheets ii. Other File Types b. Where to Find Your Data Files c. How to Connect to Files in Prep d. Considerations for Saving Flows with File Inputs e. Summary 8. 6. Connecting to a Database a. What Is a Database? b. How to Connect to a Database Within Prep Builder c. When to Avoid Connecting to a Database d. Summary 9. II. Data Types 10. 7. Dealing with Numbers a. What Do We Mean by Numbers? b. Types of Numbers c. Category or Measure? d. Aggregation e. Formatting Numbers
📄 Page
5
f. Functions for Mastering Numerical Data g. Summary 11. 8. Dealing with Dates a. Why Are Dates Important? b. Parts of a Date c. Date Lookup Tables d. Epoch Dates e. Excel Serial Number f. Entering Dates i. The makedate() Function ii. The dateparse() Function g. Summary 12. 9. Dealing with String Data a. What Do We Mean by Strings? b. How String Data Is Different i. Character Order ii. Formatting Considerations c. Common Functions for Preparing String Data d. Grouping and Replace Options for Working with String Data e. Summary 13. 10. Dealing with Boolean Data
📄 Page
6
a. What Is Boolean Data? i. Why Is It So Useful in Data Analysis? ii. Functions Featuring Boolean Logic b. Summary 14. III. The Shape of Data 15. 11. Profiling Data a. What Is a Profile? b. Why Visualizing the Data Set Is Important i. Anscombe’s Quartet ii. Visualizations Versus Data Tables c. How Prep Builder Profiles Data i. Generating Histograms and Mini- Histograms ii. Selecting Summary Versus Detail Views iii. Highlighting Values iv. Viewing Dimension Counts d. Sorting e. Summary 16. 12. Sampling Data Sets a. One Simple Rule: Use It All If Possible b. Sampling to Work Around Technical Limitations i. Volume of Data
📄 Page
7
ii. Velocity of Data c. Other Reasons for Sampling i. Reduce Build Times ii. Determine What You Need d. Sampling Techniques i. Fixed Number of Rows ii. Random Sample e. When Not to Sample f. Summary 17. 13. Pivoting Columns to Rows a. When to Pivot in Tableau Prep Builder b. How to Pivot Columns to Rows c. Summary 18. 14. Pivoting Rows to Columns a. When to Use a Rows-to-Columns Pivot b. How to Pivot Rows to Columns c. Summary 19. 15. Aggregating in Prep Builder a. Comparing Calculations in Prep Builder and Desktop b. Which Calculations in Prep Builder Differ? c. Adding the Aggregate Step
📄 Page
8
d. Where’s the Rest of My Data? e. Level of Detail Calculation Option f. Summary 20. 16. Joining Data Sets Together a. How to Join Data Sets in Prep Builder b. Join Logic and Terminology c. Types of Join in Prep Builder d. When to Use Each Join Type e. Summary 21. 17. Unioning a. What Is a Union? b. What If the Data Structure Isn’t Identical? c. When to Union Data i. Monthly Data Sets ii. Data Sets from Web Sources iii. Company Mergers d. Multiple Tables and Wildcard Unions e. Summary 22. 18. Calculations a. What Do Calculations Do in Data Preparation? b. Creating a Calculated Field c. Fundamentals of Calculations
📄 Page
9
i. The Reference List ii. Syntax iii. Description iv. Example d. Building the Calculation i. When Calculations Go Well ii. When Calculations Go Poorly iii. Editing Calculated Fields iv. Recommendations e. Types of Calculations i. Numerical Calculations ii. String Calculations iii. Date Calculations iv. Conditional Calculations with a Boolean Output v. Logical Calculations vi. Type Conversions f. Level of Detail and Ranking Calculations g. Summary 23. IV. Output 24. 19. Choosing an Output a. Types of Output i. Publish to Files
📄 Page
10
ii. Publish to Tableau Server b. When to Output Data in Prep Builder i. Outputting Data in the Output Step ii. Previewing Output Data in Desktop c. Other Considerations for Output Data d. Summary 25. 20. Outputting to a Database a. When to Write to a Database i. Clean Data ii. Simplified Joins iii. Staging and Reference Tables b. Setup for Writing to a Database c. What to Watch Out For d. Summary 26. 21. Getting Started with Tableau Prep Conductor a. When to Use Prep Conductor b. How to Get Prep Conductor c. Loading a Flow to Prep Conductor d. Other Benefits of Using Prep Conductor e. Summary 27. V. Cleaning Data 28. 22. Creating Additional Data
📄 Page
11
a. When Not to Create Data i. Dynamic Calculations in Desktop ii. Duplicate Records from Joins b. Creating Additional Columns i. Using Calculations ii. Pivoting Rows to Columns iii. Joining Data Sets c. Creating Additional Rows i. Pivoting Columns to Rows ii. Unioning Data Sets iii. Scaffolding Data Sets iv. Joining Data Sets d. Summary 29. 23. Filtering a. What Is a Filter? b. Different Types of Filters i. Selection ii. Calculation iii. Wildcard iv. Null Values c. When to Filter Out Columns d. When to Filter Out Rows
📄 Page
12
e. Summary 30. 24. Removing Data During Input a. Changing Your Data Set Before Loading It b. Slow Performance, Slow Build, Slow Output c. Removing Columns d. Removing Records e. Summary 31. 25. Splitting Data Fields a. Basic Splits b. Advanced Splits: When Automatic Splits Don’t Work as Intended c. When Not to Split Data i. Address Data ii. No Clear Delimiter d. Summary 32. 26. Cleaning by Grouping Data a. What Does Grouping Mean? b. Why Use Grouping i. Improving Accuracy ii. Navigating the Data Hierarchy iii. Smoothing Reorganizations c. Grouping Techniques
📄 Page
13
i. Manual ii. Calculations iii. Built-in Functionality d. Summary 33. 27. Dealing with Nulls a. What Is a Null? b. When Is a Null OK? c. How to Remove or Replace a Null i. ISNULL() ii. ZN() iii. Merge d. Summary 34. 28. Using Data Roles a. How to Use Data Roles b. Custom Data Roles c. Summary 35. 29. Dealing with Unwanted Characters a. What Is an Unwanted Character? b. Issues Caused by Unwanted Characters c. Removing Unwanted Characters i. Strings with Mistyped Characters ii. Numbers with Unwanted Characters
📄 Page
14
iii. Dates with Mistyped Characters d. Summary 36. 30. Deduplicating a. How to Identify Duplicates b. Causes of Duplicates i. System Loads ii. Row per Measure iii. Joins c. How to Handle Duplicates i. Aggregating: Technique 1 ii. Aggregating: Technique 2 iii. Pivoting Rows to Columns d. Summary 37. 31. Using Regular Expressions a. What Are Regular Expressions? b. How to Use Regexes in Prep c. REGEXP_EXTRACT() and REGEXP_EXTRACT_NTH() i. REGEXP_MATCH() ii. REGEXP_REPLACE() d. Regex Use Cases i. Replacing Common Mistakes
📄 Page
15
ii. Anonymizing Comments or Feedback e. Common Regex Commands f. Summary 38. 32. Completing Advanced Joins a. Multiple Join Conditions b. Join Conditions Other Than Equals i. Filtering with a Join ii. Joining by a Range c. OR Statements d. Summary 39. 33. Creating Level of Detail Calculations a. What Is Appending? b. Exploring Appending Through LOD Calculations i. When to Use an LOD Calculation ii. How to Write an LOD Calculation in Prep Builder iii. What a Level of Detail Calculation Is Doing c. Summary 40. 34. Doing Analytical Calculations a. What Is a Table Calculation? b. Applying Table Calculation Logic in Prep Builder i. Keywords
📄 Page
16
ii. Analytical Calculations c. Use Cases i. Filtering for the Top N ii. Filtering Out a Percentage of Data d. Summary 41. VI. Beyond the Basics 42. 35. Breaking Down Complex Data Preparation Challenges a. The Challenge b. Where to Begin c. Logical Steps d. Making Changes e. Be Ready to Iterate f. Summary 43. 36. Handling Free Text a. What Is Free Text? b. Why Is Free Text Useful? c. How to Analyze Free Text in Tableau i. Split the Strings ii. Pivot Columns to Rows iii. Clean Cases and Punctuation iv. Use a Join to Remove Common Words v. Group the Remaining Values
📄 Page
17
d. Summary 44. 37. Using Smarter Filtering a. Calculations i. Boolean Calculations ii. Logical Calculations iii. Regex Calculations b. Join Ranges c. Percentage Variance i. Manual Entry: Level of Detail Calculations ii. Reloaded Data: Join to Previous Output iii. Aggregating the Average Production Cost per Type iv. Joining the Data Sets Together d. Combining Techniques e. Summary 45. 38. Managing Conversion Rates a. Challenges of Conversion Rates b. Applying Conversion Rates in Prep i. Step 1: Create a Consistent Granularity of Data for the Conversion ii. Step 2: Join the Data Sets Together iii. Step 3: Apply the Conversion Rate c. Long-Term Strategies for Conversion Rates
📄 Page
18
i. Managing Frequency ii. Maintaining History Tables d. Summary 46. 39. Scaffolding Your Data a. What Is Scaffolding? b. Challenges Addressed by Scaffolding c. Challenges Created by Scaffolding d. The Traditional Scaffolding Technique i. Step 1: Input the Data Sets ii. Step 2: Build the Join Calculations iii. Step 3: Join the Two Data Sets Together iv. Step 4: Filter Out Unnecessary Rows e. The Newer Scaffolding Technique i. Step 1: Input the Data Sets ii. Step 2: Join the Data Sets iii. Step 3: Add the Reporting Date iv. Step 4: Remove the Scaffold Value f. The Result g. Summary 47. 40. Connecting to Programming Scripts a. When to Use the Script Step in Prep b. Setting Up Your Computer to Use Scripts in Prep
📄 Page
19
c. Using a Script Step d. Summary 48. 41. Handling Prep Builder Errors a. Parameter Errors b. Blank Profile Panes or Data Panes i. Changing a Calculation or Removing a Data Field Downstream ii. The Data Source Has Changed c. Errors Within a Calculated Field i. Incomplete Calculations ii. Unsupported Functions d. Summary 49. VII. Managing Your Data 50. 42. Documenting Your Data Preparation a. Basic Documentation i. Folder Structure ii. Filenames iii. Data Sources iv. Output b. Step Names c. Clean Step d. Step Descriptions
📄 Page
20
e. Color f. Joins g. Unions h. Summary 51. 43. Deciding Where to Prepare Your Data a. Processes to Consider b. Data Preparation Versus Visual Analytics i. Data Literacy ii. Organization Size iii. Quality of Technological Hardware iv. History of Data Investment c. Software Performance i. Sampling ii. Functionality iii. Documentation d. Summary 52. 44. Managing Data a. What Is Sensitive Data? i. Public ii. Confidential iii. Strictly Confidential iv. Restricted