    TDS Table Extraction

    Extract data from Dataverse using the TDS (Tabular Data Stream) endpoint

    Overview

    The TDS Table Extraction operation provides a powerful way to extract large amounts of data from Microsoft Dataverse environments using the TDS endpoint - a read-only SQL interface that enables fast, efficient data extraction to CSV files.

    The operation:

    • Connects to Dataverse TDS endpoint (port 5558)
    • Executes SELECT * queries on specified tables
    • Automatically handles large tables with chunked extraction
    • Validates row counts to ensure data integrity
    • Supports MFA authentication via browser
    • Exports to RFC 4180-compliant CSV format
    Note
    • The TDS endpoint provides read-only access to Dataverse data
    • All extractions include automatic row count validation to ensure data integrity
    • For tables exceeding the 2-minute query timeout, automatic chunking is enabled
    • Currently supported only on Windows 10 and Windows 11
    • Requires Microsoft .NET 8 to be installed

    When to Use This Operation

    Use This Operation When:

    ✅ Migrating data between environments or systems
    ✅ Creating point-in-time snapshots for compliance or audits
    ✅ Extracting data for Power BI, Excel, or data warehouses
    ✅ Backing up critical data offline
    ✅ You need to extract entire tables with all columns
    ✅ Processing hundreds of thousands or millions of rows

    Don't Use This Operation When:

    ❌ You need filtered data (use FetchXML or views instead)
    ❌ You need real-time data access (use Web API instead)
    ❌ Working with Dataverse for Teams (TDS endpoint not available)
    ❌ You need to modify data (TDS is read-only)

    Quick Start

    Example 1: Extract a Standard Table

    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable contact `
      --TdsEnvironmentHost myorg.crm6.dynamics.com `
      --TdsDatabaseName org12345678 `
      --TdsOrderByColumn contactid `
      --TdsOutputPath .\contacts.csv `
      --TdsUseInteractive
    

    Expected Output:

    Starting extraction: contact -> .\contacts.csv
    ✓ Connected to TDS endpoint
    ✓ Expected row count: 25,340
    Executing query: SELECT * FROM [contact] ORDER BY [contactid]
    ✓ Query executed successfully
      Columns: 156
      Progress: 5,000 rows | 8,200 rows/sec | Elapsed: 00:00:01
      Progress: 10,000 rows | 8,500 rows/sec | Elapsed: 00:00:02
      Progress: 15,000 rows | 8,300 rows/sec | Elapsed: 00:00:03
      Progress: 20,000 rows | 8,400 rows/sec | Elapsed: 00:00:04
      Progress: 25,000 rows | 8,350 rows/sec | Elapsed: 00:00:05
    ✓ Extraction Complete
      Total Rows: 25,340
      Total Time: 00:00:06
      Throughput: 4,223 rows/sec
      File Size: 48.5 MB
    ✓ Row Count Validation: PASSED (25,340 = 25,340)
    

    Example 2: Extract Large Table with Automatic Chunking

    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable mag_outcomestepsummary `
      --TdsEnvironmentHost midcentral.crm6.dynamics.com `
      --TdsDatabaseName midcentral_prod `
      --TdsOrderByColumn mag_outcomestepsummaryid `
      --TdsOutputPath .\outcome_steps.csv `
      --TdsProgressInterval 50000 `
      --TdsUseInteractive
    

    Expected Output (chunking enabled via --TdsOrderByColumn, but not triggered because the query completed within the timeout):

    Starting extraction: mag_outcomestepsummary -> .\outcome_steps.csv
    ✓ Connected to TDS endpoint
    ✓ Expected row count: 378,402
    Executing query: SELECT * FROM [mag_outcomestepsummary] ORDER BY [mag_outcomestepsummaryid]
    ✓ Query executed successfully
      Columns: 150
      Progress: 50,000 rows | 8,311 rows/sec | Elapsed: 00:00:21
      Progress: 100,000 rows | 8,726 rows/sec | Elapsed: 00:00:27
      Progress: 150,000 rows | 8,729 rows/sec | Elapsed: 00:00:33
      Progress: 200,000 rows | 8,407 rows/sec | Elapsed: 00:00:39
      Progress: 250,000 rows | 9,043 rows/sec | Elapsed: 00:00:44
      Progress: 300,000 rows | 8,387 rows/sec | Elapsed: 00:00:50
      Progress: 350,000 rows | 8,775 rows/sec | Elapsed: 00:00:56
    ✓ Extraction Complete
      Total Rows: 378,402
      Total Time: 00:00:59
      Throughput: 6,333 rows/sec
      File Size: 543.67 MB
    ✓ Row Count Validation: PASSED (378,402 = 378,402)
    
    Tip

    You can add the CLI .exe to your PATH to run commands without the .\ prefix.

    How It Works

    Processing Steps

    1. Authentication: Connects using Azure CLI or Interactive (browser-based MFA)
    2. TDS Connection: Establishes connection to {host}:5558 with specified database
    3. Row Count Query: Executes SELECT COUNT(*) FROM [table] to get expected row count
    4. Data Extraction:
      • Executes SELECT * FROM [table] ORDER BY [ordercolumn]
      • Streams data to CSV file with RFC 4180 formatting
      • Shows progress updates at specified intervals
    5. Automatic Chunking (if timeout occurs):
      • Switches to ID-based pagination
      • Extracts 1,000 rows per chunk
      • Continues until all rows extracted
    6. Validation: Compares extracted row count with expected count
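    The extract-and-validate flow above can be sketched in Python. This is an illustrative outline, not the tool's actual code: run_count_query and stream_rows are hypothetical stand-ins for the real TDS queries and simply read an in-memory sample here.

```python
# Illustrative outline of steps 3-6, with hypothetical stand-ins
# (run_count_query, stream_rows) in place of real TDS queries.
SAMPLE_ROWS = [{"contactid": i, "firstname": f"name{i}"} for i in range(5)]

def run_count_query(table):
    # Step 3: SELECT COUNT(*) FROM [table]
    return len(SAMPLE_ROWS)

def stream_rows(table, order_by):
    # Step 4: SELECT * FROM [table] ORDER BY [order_by]
    return sorted(SAMPLE_ROWS, key=lambda r: r[order_by])

def extract_and_validate(table, order_by):
    expected = run_count_query(table)
    written = 0
    for _row in stream_rows(table, order_by):
        written += 1  # the real tool writes the row to CSV here
    # Step 6: compare the extracted row count with the expected count
    return written == expected, expected, written

ok, expected, written = extract_and_validate("contact", "contactid")
```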

    CSV Output Format

    The extracted data follows RFC 4180 standards:

    • Always quoted fields: All values enclosed in double quotes
    • Escaped quotes: Quote characters within values escaped as ""
    • UTF-8 encoding: Supports international characters
    • Header row: First row contains column names
    • Null values: Represented as empty quoted strings ""

    Example CSV output:

    "contactid","firstname","lastname","emailaddress1","createdon"
    "a1b2c3d4-e5f6-7890-1234-567890abcdef","John","Smith","john.smith@example.com","2024-01-15 14:23:45"
    "b2c3d4e5-f6a7-8901-2345-67890abcdef1","Jane","Doe","jane.doe@example.com","2024-01-16 09:12:33"
    
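    The quoting rules above can be reproduced with Python's standard csv module. This is a sketch of the format itself, not the tool's implementation:

```python
import csv
import io

# RFC 4180-style output: every field quoted, embedded quotes doubled,
# nulls written as empty quoted strings, CRLF line endings.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
writer.writerow(["contactid", "firstname", "notes"])
writer.writerow(["a1b2", 'She said "hi"', None])  # None becomes ""
output = buf.getvalue()
```

    Any RFC 4180-compliant reader (Excel, pandas, Power BI) will parse output in this shape without extra configuration.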

    Automatic Chunking

    For very wide tables (200+ columns) or tables that exceed the 2-minute query timeout:

    Trigger: Initial SELECT * query times out
    Action: Automatically switches to chunked extraction
    Method: ID-based pagination using the ORDER BY column
    Chunk Size: 1,000 rows per chunk
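    The pagination strategy can be sketched as follows. run_query is a hypothetical stand-in for a real TDS query and pages an in-memory table here; the SQL equivalent is shown in the comment.

```python
# Sketch of ID-based (keyset) pagination: each chunk fetches the next
# 1,000 rows whose ID is greater than the last ID already extracted.
CHUNK_SIZE = 1000
TABLE = [{"mag_referralid": i} for i in range(33_130)]  # simulated rows

def run_query(last_id, limit):
    # Hypothetical SQL equivalent:
    #   SELECT TOP (limit) * FROM [mag_referral]
    #   WHERE [mag_referralid] > last_id
    #   ORDER BY [mag_referralid]
    rows = [r for r in TABLE if last_id is None or r["mag_referralid"] > last_id]
    return rows[:limit]

def chunked_extract():
    last_id, total_rows, chunks = None, 0, 0
    while True:
        chunk = run_query(last_id, CHUNK_SIZE)
        if not chunk:
            break
        chunks += 1
        total_rows += len(chunk)
        last_id = chunk[-1]["mag_referralid"]  # resume point for the next chunk
    return total_rows, chunks

total_rows, chunks = chunked_extract()  # 33,130 rows in 34 chunks
```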

    Example automatic chunking output:

    ❌ Query Timeout (exceeded 2-minute limit for SELECT *)
      Rows retrieved before timeout: 0
    
    Will retry with automatic chunked extraction...
    ⚙ Automatic fallback: Switching to chunked extraction...
    
    Starting chunked extraction: mag_referral
    Chunk strategy: ID-based pagination using [mag_referralid]
    
    ✓ Connected to TDS endpoint
    ✓ Expected row count: 33,130
      Chunk size: 1,000 rows
      Estimated chunks: ~34
    
      Columns: 342
      Writing to: .\referrals.csv
    
      Chunk 1: 1,000 rows in 1.1s
      Chunk 2: 1,000 rows in 1.0s
      Chunk 3: 1,000 rows in 0.8s
      ...
      Chunk 34: 130 rows in 0.2s
    
    ✓ Chunked Extraction Complete
      Total Rows: 33,130
      Total Chunks: 34
      Total Time: 00:00:25
      Throughput: 1,302 rows/sec
    

    Parameters

    Required Parameters

    --TdsExtractTable

    The logical name of the Dataverse table (entity) to extract. This is the same name you see in Power Apps or the Dataverse API.

    Examples:

    • mag_referral - Custom table
    • contact - Standard Dataverse table
    • mag_outcomestepsummary - Custom table
    Tip

    You can find table logical names in the Power Apps maker portal under Tables, or by using tools like XrmToolBox Metadata Browser.

    --TdsEnvironmentHost

    The hostname of your Dataverse environment (without https:// and without the database name).

    Format: environmentname.crm[X].dynamics.com

    Examples:

    • myorg.crm6.dynamics.com - Oceania region
    • contoso.crm.dynamics.com - North America region
    • fabrikam.crm4.dynamics.com - Europe region
    Tip

    You can find your environment URL in the Power Platform Admin Center (admin.powerplatform.microsoft.com) under Environments > [Your Environment] > Details.

    --TdsDatabaseName

    The database name for your Dataverse environment. This is typically your organisation's unique identifier.

    Format: Usually an organisation identifier or organisation name (e.g., org12345678 or myorganisation)

    How to find your database name:

    1. Open SQL Server Management Studio (SSMS)
    2. Connect to your TDS endpoint: environmentname.crm6.dynamics.com,5558
    3. Use Azure Active Directory authentication
    4. View the available databases - your database name will be listed

    Alternatively, you can verify a candidate database name by running a small test extraction from PowerShell:

    # Test whether a candidate database name is correct
    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable contact `
      --TdsEnvironmentHost myorg.crm6.dynamics.com `
      --TdsDatabaseName org12345678 `
      --TdsOutputPath .\test.csv `
      --TdsUseInteractive
    

    If the database name is incorrect, the tool will report a connection error.

    --TdsOutputPath

    The file path where the extracted CSV data will be saved.

    Format: Can be absolute or relative path

    Examples:

    • .\output\contacts.csv - Relative path (current directory)
    • C:\DataExtracts\2024\contacts.csv - Absolute path
    • ..\exports\referrals.csv - Relative path (parent directory)
    Note

    The tool will automatically create any missing directories in the path.

    Optional Parameters

    --TdsOrderByColumn

    The column name to use for sorting the data. Highly recommended for large tables.

    Why this matters:

    • Required for automatic chunking if a table exceeds the 2-minute query timeout
    • Ensures consistent ordering across multiple extractions
    • Usually the primary key of the table (e.g., contactid, mag_referralid)

    Examples:

    • contactid - For the contact table
    • mag_referralid - For mag_referral table
    • accountid - For account table
    Important

    For tables with more than 100,000 rows or wide tables (200+ columns), always specify --TdsOrderByColumn to enable automatic chunking if needed.

    --TdsProgressInterval

    How often (in rows) to display progress updates during extraction.

    Default: 50,000 rows

    Examples:

    • --TdsProgressInterval 10000 - Show progress every 10,000 rows
    • --TdsProgressInterval 1000 - Show progress every 1,000 rows (more frequent updates)
    • --TdsProgressInterval 100000 - Show progress every 100,000 rows (less frequent updates)

    --TdsPort

    The port number for the TDS endpoint.

    Default: 5558 (recommended)

    Options:

    • 1433 - Standard SQL Server port (may require additional firewall configuration)
    • 5558 - Default Dataverse TDS endpoint port (recommended; works without additional firewall rules)
    Note

    Most users should use the default port 5558. Only change this if you have specific networking requirements.

    --TdsUseInteractive

    Use Active Directory Interactive authentication (browser-based MFA).

    When to use this:

    • Your organisation requires Multi-Factor Authentication (MFA)
    • Conditional Access policies block standard Azure CLI authentication
    • You prefer browser-based login over command-line credentials

    Example:

    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable contact `
      --TdsEnvironmentHost myorg.crm6.dynamics.com `
      --TdsDatabaseName org12345678 `
      --TdsOutputPath .\contacts.csv `
      --TdsUseInteractive
    

    When this flag is used, a browser window will open for you to authenticate with your Microsoft account.

    Tip

    If you're not sure whether you need this, try running without it first. If you get an authentication error mentioning "Anonymous" or "conditional access", add --TdsUseInteractive.

    Authentication Setup

    The TDS extractor supports two authentication methods:

    Method 1: Azure CLI (Default)

    1. Install Azure CLI: https://aka.ms/azure-cli
    2. Open PowerShell and run:
      az login
      
    3. Follow the prompts to authenticate
    4. Run your extraction without --TdsUseInteractive

    Best for: Automated scripts, CI/CD pipelines, non-MFA environments

    Method 2: Interactive Authentication (MFA-Compatible)

    1. Add --TdsUseInteractive to your command
    2. A browser window will open when you run the extraction
    3. Sign in with your Microsoft account (MFA supported)
    4. The extraction will begin automatically after successful authentication

    Best for: MFA-protected environments, conditional access policies, interactive use

    Important

    For production environments with strict security policies, Interactive Authentication is recommended.

    Real-World Scenarios

    Scenario 1: Extract Contacts for Reporting

    Extract all contacts with their details:

    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable contact `
      --TdsEnvironmentHost production.crm6.dynamics.com `
      --TdsDatabaseName org_production `
      --TdsOrderByColumn contactid `
      --TdsOutputPath C:\Reports\MonthlyExport\contacts_2024_11.csv `
      --TdsProgressInterval 5000 `
      --TdsUseInteractive
    

    Expected output:

    Starting extraction: contact -> C:\Reports\MonthlyExport\contacts_2024_11.csv
    ✓ Connected to TDS endpoint
    ✓ Expected row count: 25,340
    Executing query: SELECT * FROM [contact] ORDER BY [contactid]
    ✓ Query executed successfully
      Columns: 156
      Progress: 5,000 rows | 8,200 rows/sec | Elapsed: 00:00:01
      Progress: 10,000 rows | 8,500 rows/sec | Elapsed: 00:00:02
      Progress: 15,000 rows | 8,300 rows/sec | Elapsed: 00:00:03
      Progress: 20,000 rows | 8,400 rows/sec | Elapsed: 00:00:04
      Progress: 25,000 rows | 8,350 rows/sec | Elapsed: 00:00:05
    ✓ Extraction Complete
      Total Rows: 25,340
      Total Time: 00:00:06
      Throughput: 4,223 rows/sec
      File Size: 48.5 MB
    ✓ Row Count Validation: PASSED (25,340 = 25,340)
    

    Scenario 2: Extract Large Table with Automatic Chunking

    Situation: You have a very large table (378K+ rows) that needs to be extracted for data warehouse loading.

    Solution:

    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable mag_outcomestepsummary `
      --TdsEnvironmentHost midcentral.crm6.dynamics.com `
      --TdsDatabaseName midcentral_prod `
      --TdsOrderByColumn mag_outcomestepsummaryid `
      --TdsOutputPath .\extracts\outcome_steps.csv `
      --TdsProgressInterval 50000 `
      --TdsUseInteractive
    

    If the table is too large for a single query (exceeds the 2-minute timeout), the tool automatically switches to chunked extraction. In this run the query completed within the timeout, so chunking was not needed:

    Expected output:

    Starting extraction: mag_outcomestepsummary -> .\extracts\outcome_steps.csv
    ✓ Connected to TDS endpoint
    ✓ Expected row count: 378,402
    Executing query: SELECT * FROM [mag_outcomestepsummary] ORDER BY [mag_outcomestepsummaryid]
    ✓ Query executed successfully
      Columns: 150
      Progress: 50,000 rows | 8,311 rows/sec | Elapsed: 00:00:21
      Progress: 100,000 rows | 8,726 rows/sec | Elapsed: 00:00:27
      Progress: 150,000 rows | 8,729 rows/sec | Elapsed: 00:00:33
      Progress: 200,000 rows | 8,407 rows/sec | Elapsed: 00:00:39
      Progress: 250,000 rows | 9,043 rows/sec | Elapsed: 00:00:44
      Progress: 300,000 rows | 8,387 rows/sec | Elapsed: 00:00:50
      Progress: 350,000 rows | 8,775 rows/sec | Elapsed: 00:00:56
    ✓ Extraction Complete
      Total Rows: 378,402
      Total Time: 00:00:59
      Throughput: 6,333 rows/sec
      File Size: 543.67 MB
    ✓ Row Count Validation: PASSED (378,402 = 378,402)
    

    Scenario 3: Extract Wide Table (Many Columns)

    Situation: You need to extract a table with 342 columns and 33K rows. The wide column set may cause query timeouts.

    Solution:

    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable mag_referral `
      --TdsEnvironmentHost production.crm6.dynamics.com `
      --TdsDatabaseName org_prod `
      --TdsOrderByColumn mag_referralid `
      --TdsOutputPath .\referrals.csv `
      --TdsProgressInterval 5000 `
      --TdsUseInteractive
    

    If the table is very wide (e.g., 342 columns), the initial query may timeout and automatic chunking will activate:

    Expected output:

    Starting extraction: mag_referral -> .\referrals.csv
    ✓ Connected to TDS endpoint
    ✓ Expected row count: 33,130
    Executing query: SELECT * FROM [mag_referral] ORDER BY [mag_referralid]
    ⏳ Waiting for data (this may take up to 2 minutes for SELECT *)...
    
    ❌ Query Timeout (exceeded 2-minute limit for SELECT *)
      Rows retrieved before timeout: 0
    
    Will retry with automatic chunked extraction...
    ⚙ Automatic fallback: Switching to chunked extraction...
    
    Starting chunked extraction: mag_referral
    Chunk strategy: ID-based pagination using [mag_referralid]
    
    ✓ Connected to TDS endpoint
    ✓ Expected row count: 33,130
      Chunk size: 1,000 rows
      Estimated chunks: ~34
    
      Columns: 342
      Writing to: .\referrals.csv
    
      Chunk 1: 1,000 rows in 1.1s
      Chunk 2: 1,000 rows in 1.0s
      Chunk 3: 1,000 rows in 0.8s
      Chunk 4: 1,000 rows in 0.9s
      Progress: 5,000 rows | 251 rows/sec | Chunk 5 | Elapsed: 00:00:20
      ...
      Chunk 33: 1,000 rows in 0.9s
      Chunk 34: 130 rows in 0.2s
    
    ========================================
    ✓ Chunked Extraction Complete
    ========================================
      Total Rows: 33,130
      Total Chunks: 34
      Total Time: 00:00:25
      Throughput: 1,302 rows/sec
      File Size: 64.62 MB
      Output: .\referrals.csv
    
    ✓ Row Count Validation: PASSED (33,130 = 33,130)
    

    Scenario 4: Batch Extract Multiple Tables

    Situation: You need to extract 4 different tables every month for reporting. Manual extraction is time-consuming and error-prone.

    Solution:

    Create a PowerShell script to automate the extraction:

    # Extract-MultipleTables.ps1
    # Configuration
    $environmentHost = "production.crm6.dynamics.com"
    $databaseName = "org_production"
    $outputFolder = ".\MonthlyExport"
    
    # Ensure output folder exists
    New-Item -ItemType Directory -Force -Path $outputFolder | Out-Null
    
    # Tables to extract (table name, order by column, description)
    $tables = @(
        @{Name="contact"; OrderBy="contactid"; Description="All contacts"},
        @{Name="account"; OrderBy="accountid"; Description="All accounts"},
        @{Name="mag_referral"; OrderBy="mag_referralid"; Description="All referrals"},
        @{Name="mag_assessment"; OrderBy="mag_assessmentid"; Description="All assessments"}
    )
    
    # Extract each table
    foreach ($table in $tables) {
        Write-Host "========================================" -ForegroundColor Cyan
        Write-Host "Extracting: $($table.Description)" -ForegroundColor Cyan
        Write-Host "Table: $($table.Name)" -ForegroundColor Cyan
        Write-Host "========================================" -ForegroundColor Cyan
        
        $outputPath = Join-Path $outputFolder "$($table.Name).csv"
        
        .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
            --TdsExtractTable $table.Name `
            --TdsEnvironmentHost $environmentHost `
            --TdsDatabaseName $databaseName `
            --TdsOrderByColumn $table.OrderBy `
            --TdsOutputPath $outputPath `
            --TdsProgressInterval 10000 `
            --TdsUseInteractive
        
        if ($LASTEXITCODE -eq 0) {
            Write-Host "✓ Success: $($table.Name)" -ForegroundColor Green
        } else {
            Write-Host "✗ Failed: $($table.Name)" -ForegroundColor Red
        }
        
        Write-Host ""
        Start-Sleep -Seconds 2
    }
    
    Write-Host "All extractions complete!" -ForegroundColor Green
    

    Run the script:

    .\Extract-MultipleTables.ps1
    

    Understanding the Output

    CSV File Format

    The extracted data is saved in CSV format following RFC 4180 standards:

    • Always quoted fields - All values are enclosed in double quotes
    • Escaped quotes - Quote characters within values are escaped as ""
    • UTF-8 encoding - Supports international characters
    • Header row - First row contains column names
    • Null values - Represented as empty quoted strings ""

    Example CSV output:

    "contactid","firstname","lastname","emailaddress1","createdon"
    "a1b2c3d4-e5f6-7890-1234-567890abcdef","John","Smith","john.smith@example.com","2024-01-15 14:23:45"
    "b2c3d4e5-f6a7-8901-2345-67890abcdef1","Jane","Doe","jane.doe@example.com","2024-01-16 09:12:33"
    "c3d4e5f6-a7b8-9012-3456-7890abcdef12","Bob","Johnson","bob.j@example.com","2024-01-17 11:45:22"
    
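    Because every field is quoted and embedded quotes are doubled, standard CSV readers handle the extracts without special options. For example, with Python's csv module (sample data shortened from the output above):

```python
import csv
import io

sample = (
    '"contactid","firstname","lastname"\r\n'
    '"a1b2c3d4","John","Smith"\r\n'
    '"b2c3d4e5","Jane","Doe"\r\n'
)
rows = list(csv.DictReader(io.StringIO(sample)))
first_name = rows[0]["firstname"]  # "John"
```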

    Log Files

    Detailed logs are written to the Logs folder in the CLI directory:

    Log file naming: DataManipulation_YYYY-MM-DD_HH-MM-SS-mmm.log

    Example log content:

    [2024-11-16 23:37:17] === Log started at 2024-11-16 23:37:17 ===
    [2024-11-16 23:37:17] Navigator Data Manipulation tool
    [2024-11-16 23:37:17] Starting extraction: contact -> .\contacts.csv
    [2024-11-16 23:37:17] TDS Endpoint: production.crm6.dynamics.com:5558
    [2024-11-16 23:37:17] Database: org_production
    [2024-11-16 23:37:17] Authentication: Active Directory Interactive (MFA-compatible)
    [2024-11-16 23:37:20] ✓ Connected to TDS endpoint
    [2024-11-16 23:37:21] ✓ Expected row count: 25,340
    [2024-11-16 23:37:27] ✓ Extraction Complete
    [2024-11-16 23:37:27] ✓ Row Count Validation: PASSED (25,340 = 25,340)
    [2024-11-16 23:37:27] === Log ended at 2024-11-16 23:37:27 ===
    

    Troubleshooting

    Common Issues and Solutions

    Issue: "Anonymous authentication error"

    Error message:

    Login failed: The HTTP request was forbidden with client authentication scheme 'Anonymous'
    

    Solution: Add --TdsUseInteractive to your command to use browser-based MFA authentication:

    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable contact `
      --TdsEnvironmentHost myorg.crm6.dynamics.com `
      --TdsDatabaseName org12345678 `
      --TdsOutputPath .\contacts.csv `
      --TdsUseInteractive
    

    Issue: "Execution Timeout Expired"

    Error message:

    ❌ Query Timeout (exceeded 2-minute limit for SELECT *)
    

    Solution: Make sure you've specified --TdsOrderByColumn to enable automatic chunking:

    .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
      --TdsExtractTable mag_referral `
      --TdsEnvironmentHost myorg.crm6.dynamics.com `
      --TdsDatabaseName org12345678 `
      --TdsOrderByColumn mag_referralid `
      --TdsOutputPath .\referrals.csv `
      --TdsUseInteractive
    

    If you have already specified --TdsOrderByColumn and still see timeout errors, chunked extraction should activate on the retry.

    Issue: "Row count validation failed"

    Error message:

    ❌ Row Count Validation: FAILED
      Expected: 10,000
      Actual: 9,850
    

    What this means: The number of rows written to the CSV file doesn't match the SQL COUNT(*). This could indicate:

    • Network interruption during extraction
    • Data corruption in the source table
    • Special characters causing CSV parsing issues

    Solution:

    1. Check the log file for errors during extraction
    2. Retry the extraction
    3. If the issue persists, contact Whānau Tahi support with the log file

    Issue: "Cannot find database"

    Error message:

    Login failed for user. Cannot open database requested by the login.
    

    Solution: Verify your --TdsDatabaseName is correct. Try these steps:

    1. Check the Power Platform Admin Center for your organisation name
    2. Use SQL Server Management Studio to connect and view available databases
    3. Common formats: org12345678 or organisationname

    Issue: "Access denied" or "Insufficient permissions"

    Error message:

    The user does not have permission to perform this action.
    

    Solution: Ensure your user account or app user has:

    • Read permission on the table you're trying to extract
    • TDS endpoint access enabled in the environment
    • Appropriate security roles assigned

    Contact your Dataverse administrator to verify permissions.

    Performance Tips

    Optimising Extraction Speed

    1. Use the default port (5558) - Usually faster than port 1433

    2. Extract during off-peak hours - Better performance when the system has lower load

    3. Use appropriate progress intervals - Too frequent updates can slow extraction

      • Small tables (< 10,000 rows): --TdsProgressInterval 1000
      • Medium tables (10,000 - 100,000 rows): --TdsProgressInterval 5000
      • Large tables (> 100,000 rows): --TdsProgressInterval 50000
    4. Extract to local disk - Faster than network drives or cloud storage

    5. Close other applications - Free up memory and CPU resources

    Expected Performance

    Typical extraction speeds:

    Table Size      Columns   Expected Time   Throughput
    1,000 rows      50        1-2 seconds     500-1,000 rows/sec
    10,000 rows     100       5-10 seconds    1,000-2,000 rows/sec
    100,000 rows    150       30-60 seconds   2,000-5,000 rows/sec
    500,000 rows    150       2-4 minutes     3,000-6,000 rows/sec
    Note

    Very wide tables (200+ columns) or tables with large text fields may be slower.

    Advanced Scenarios

    Scenario 1: Delta Extraction (Extract Only New Records)

    Extract only records created since your last extraction by defining a filtered view in SQL Server Management Studio, then extracting that view with this tool:

    1. Connect to TDS endpoint with SSMS
    2. Create a view with your filter:
      CREATE VIEW vw_contacts_recent AS
      SELECT * FROM contact
      WHERE createdon >= '2024-11-01'
      
    3. Extract the view:
      .\WhanauTahi.Xpm.Tooling.CLI.exe datamanipulation `
        --TdsExtractTable vw_contacts_recent `
        --TdsEnvironmentHost myorg.crm6.dynamics.com `
        --TdsDatabaseName org12345678 `
        --TdsOutputPath .\contacts_recent.csv `
        --TdsUseInteractive
      
    Note

    Creating views requires advanced SQL permissions, and because the TDS endpoint is read-only, CREATE VIEW statements may be rejected in your environment. Contact your administrator if you need this capability.

    Scenario 2: Scheduled Daily Extractions

    Create a Windows Task Scheduler job to run extractions automatically:

    1. Create your extraction script (e.g., Daily-Extract.ps1)
    2. Open Task Scheduler
    3. Create new task:
      • Trigger: Daily at 2:00 AM
      • Action: Run PowerShell script
      • Program: powershell.exe
      • Arguments: -File "C:\Scripts\Daily-Extract.ps1"
    4. Ensure the task runs with an account that has appropriate permissions
    Important

    Scheduled tasks using --TdsUseInteractive require the session to remain logged in. For unattended automation, use Azure CLI authentication with a service principal.

    Scenario 3: Extract to SQL Server Database

    Use the extracted CSV files with SQL Server's BULK INSERT:

    -- Create target table matching CSV structure
    CREATE TABLE staging_contacts (
        contactid UNIQUEIDENTIFIER,
        firstname NVARCHAR(50),
        lastname NVARCHAR(50),
        emailaddress1 NVARCHAR(100),
        createdon DATETIME2
    );
    
    -- Load CSV data (FORMAT = 'CSV' requires SQL Server 2017 or later)
    BULK INSERT staging_contacts
    FROM 'C:\Extracts\contacts.csv'
    WITH (
        FIRSTROW = 2,          -- Skip header row
        FORMAT = 'CSV',        -- Handles quoted fields and doubled quotes
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n',
        CODEPAGE = '65001'     -- UTF-8, matching the extract encoding
    );
    

    Security Considerations

    Data Protection

    1. Secure storage - Store extracted CSV files in encrypted folders
    2. Access control - Limit who can read extracted data
    3. Retention policy - Delete old extracts when no longer needed
    4. Audit trail - Log files track all extractions performed

    Compliance

    When extracting data containing personal information (PII):

    1. GDPR/Privacy Act compliance - Ensure you have legal basis for extracting and storing data
    2. Data minimisation - Only extract tables and columns you actually need
    3. Secure transmission - If sharing files, use secure, access-controlled channels (e.g., OneDrive or SharePoint links restricted to specific recipients)
    4. Right to erasure - Have processes to delete extracted data when requested
    Caution

    Extracted CSV files may contain sensitive personal information. Follow your organisation's data protection policies.

    Frequently Asked Questions

    Q: Can I extract data from production environments?

    A: Yes, the TDS endpoint is read-only, so there's no risk of modifying production data. However, large extractions may impact performance, so schedule them during off-peak hours.

    Q: What's the maximum table size I can extract?

    A: There's no hard limit. The tool automatically handles arbitrarily large tables using chunked extraction. Tables with millions of rows can be extracted successfully.

    Q: Can I filter which rows to extract?

    A: Not directly with this tool. The tool always extracts the entire table. For filtered extraction, create a view in SQL Server Management Studio and extract the view.

    Q: Does this work with Dataverse for Teams?

    A: No, the TDS endpoint is only available for Dataverse (not Dataverse for Teams).

    Q: Can I extract system tables like systemuser or team?

    A: Yes, as long as you have read permissions on those tables.

    Q: How much disk space do I need?

    A: As a rough estimate, plan for 1-2 MB per 1,000 rows for typical tables. Wide tables (200+ columns) may require 3-5 MB per 1,000 rows.
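    The heuristic above translates to a quick back-of-envelope calculation (illustrative only; actual sizes depend on column count and text field contents):

```python
# Rough disk estimate: 1-2 MB per 1,000 rows for typical tables,
# 3-5 MB per 1,000 rows for wide (200+ column) tables.
def estimate_mb(rows, wide=False):
    low, high = (3, 5) if wide else (1, 2)
    return rows / 1000 * low, rows / 1000 * high

typical = estimate_mb(500_000)          # (500.0, 1000.0) MB
wide = estimate_mb(500_000, wide=True)  # (1500.0, 2500.0) MB
```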

    Q: Can I run multiple extractions in parallel?

    A: Yes, you can run multiple CLI instances simultaneously to extract different tables in parallel. This can significantly speed up batch extractions.
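    A minimal sketch of driving parallel extractions from Python: the flags are the ones documented on this page, while build_args is a hypothetical helper that only assembles the command lines. The launch itself is shown commented out, since it requires the CLI and an interactive Windows session.

```python
import subprocess                                   # used by the commented-out launch below
from concurrent.futures import ThreadPoolExecutor   # likewise

CLI = r".\WhanauTahi.Xpm.Tooling.CLI.exe"

def build_args(table, order_by, out_path, host, database):
    # Assemble one CLI invocation per table, using the documented flags.
    return [
        CLI, "datamanipulation",
        "--TdsExtractTable", table,
        "--TdsEnvironmentHost", host,
        "--TdsDatabaseName", database,
        "--TdsOrderByColumn", order_by,
        "--TdsOutputPath", out_path,
        "--TdsUseInteractive",
    ]

jobs = [
    build_args("contact", "contactid", r".\contacts.csv",
               "myorg.crm6.dynamics.com", "org12345678"),
    build_args("account", "accountid", r".\accounts.csv",
               "myorg.crm6.dynamics.com", "org12345678"),
]

# To actually run the jobs in parallel on Windows:
# with ThreadPoolExecutor(max_workers=2) as pool:
#     exit_codes = list(pool.map(lambda a: subprocess.run(a).returncode, jobs))
```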

    Q: What happens if my internet connection drops during extraction?

    A: The extraction will fail with a network error. You'll need to restart the extraction. The tool does not support resume functionality.

    See Also

    • datamanipulation Overview - All datamanipulation operations
    • Rebuild Activity Relationships - Rebuild N:N relationships from JSON
    • Recreate Family Groups - Create family groups for individuals
    • Microsoft Dataverse TDS Endpoint - Official TDS documentation
    Copyright © Whānau Tahi Ltd 2025