CsvPath Framework

Templates


Last updated 2 months ago

We construct a named-file's folder organization using templates. Templates are completely optional; if you don't need them, don't use them. Here's how they work.

Templates consist of:

  • Static text, for example: /orders/

  • Placeholders in the form of a colon followed by an integer, representing a path segment from the original source file's landing path. For example: :3 or :5

  • The token colon + filename. :filename is replaced by the name of the source file.

:filename is not mandatory, but if it is present, as it typically would be, it must come last. Without :filename, CsvPath stores every inbound file registered under that named-file with the same physical filename. That works, but it is not the most common approach.
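
The substitution rules above can be sketched in a few lines of Python. This is only an illustration of the idea, not CsvPath Framework's implementation: the function name resolve_template, the sample landing path, and in particular the choice to number path segments from 0 are all assumptions made for this example.

```python
import re

FILENAME_TOKEN = ":filename"


def resolve_template(template: str, landing_path: str) -> str:
    """Expand a template against a landing path (hypothetical helper)."""
    # Split the landing path into segments. ASSUMPTION: segments are
    # numbered from 0; the real framework's numbering may differ.
    segments = landing_path.strip("/").split("/")
    filename = segments[-1]

    # Rule from the text: if :filename is present, it must come last.
    position = template.find(FILENAME_TOKEN)
    if position != -1 and position != len(template) - len(FILENAME_TOKEN):
        raise ValueError(":filename must come last in a template")

    # Replace each :N placeholder with the Nth segment of the landing path.
    expanded = re.sub(r":(\d+)", lambda m: segments[int(m.group(1))], template)

    # Replace :filename with the name of the source file.
    return expanded.replace(FILENAME_TOKEN, filename)


# A made-up landing path, used only to exercise the sketch.
print(resolve_template("orders/:3/:filename", "/acme/plastics/orders/2025/jan.csv"))
# prints: orders/2025/jan.csv
```

Under these assumptions, static text such as orders/ passes through unchanged, :3 picks up the segment 2025, and :filename becomes jan.csv.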

In the example, we receive a file on an MFT server. Let's say it is an SFTP server and the user is named acme. The acme account belongs to Acme Inc., an external data partner that sends us order information from its Plastics division on a monthly, quarterly, and annual basis.

In this example, our CsvPath Framework staging area is called staging. (This is a configurable value; the default is inputs/named_files.) Our named-file name is Acme.

For reasons that include keeping the file layout easy to read and easy for a Grafana system to monitor, we want a named-file organization with a bit more structure than the default layout, which is staging/Acme/<filenames>.

To do this we create a template, which is stored in the named-file's manifest.json. Templates can change. Each changed template is stored with the named-file update when a new file is registered. Changing a template makes the layout of the named-file more complex, so we suggest using zero or one template, and definitely not many. Keep in mind that CsvPath Framework is built for automation, so registering files with a single template should feel like a natural fit.

Named-file templates turn arrival paths into named-file paths