qualink
quality + link: linking your data to quality.
Blazing fast data quality framework for Python, built on Apache DataFusion.
High Performance
Leverages Apache DataFusion for blazing-fast SQL-based data quality checks with zero-copy Arrow processing.
25+ Built-in Constraints
Completeness, uniqueness, statistics, patterns, formats, cross-table comparisons, and more, all ready to use.
YAML Configuration
Define your entire validation suite declaratively in YAML, with no code required for standard checks.
Async First
Built with asyncio for non-blocking execution. Run checks sequentially or in parallel.
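The difference between sequential and parallel execution can be sketched with plain asyncio. This is a minimal illustration, not qualink's implementation: `run_check`, `run_sequential`, `run_parallel`, and the `CHECKS` list are hypothetical stand-ins, and each check is simulated with a short sleep.

```python
import asyncio


async def run_check(name: str, delay: float) -> tuple[str, bool]:
    """Hypothetical stand-in for one data quality check (not a qualink API)."""
    await asyncio.sleep(delay)  # simulates non-blocking work such as a query
    return name, True


async def run_sequential(checks: list[tuple[str, float]]) -> list[tuple[str, bool]]:
    # Await each check in turn: total time is roughly the sum of the delays.
    return [await run_check(name, delay) for name, delay in checks]


async def run_parallel(checks: list[tuple[str, float]]) -> list[tuple[str, bool]]:
    # Schedule every check at once: total time is roughly the longest delay.
    # asyncio.gather returns results in the order the awaitables were passed.
    results = await asyncio.gather(*(run_check(n, d) for n, d in checks))
    return list(results)


# Illustrative check names and simulated durations (assumptions).
CHECKS = [("completeness", 0.05), ("uniqueness", 0.05), ("size", 0.05)]

if __name__ == "__main__":
    print(asyncio.run(run_sequential(CHECKS)))
    print(asyncio.run(run_parallel(CHECKS)))
```

Both modes produce the same results in the same order; only the wall-clock time differs, which is why parallel execution pays off when checks spend most of their time waiting.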
Multiple Formatters
Output results as human-readable text, JSON for pipelines, or Markdown for reports.
CLI: qualinkctl
Run validations from the terminal with a single command. Perfect for CI/CD pipelines and automation.
Fluent Builder API
Chain methods to define checks with a clean, readable, Pythonic builder pattern.
Quick Example
import asyncio

from datafusion import SessionContext

from qualink.checks import Check, Level
from qualink.constraints import Assertion
from qualink.core import ValidationSuite
from qualink.formatters import MarkdownFormatter


async def main():
    ctx = SessionContext()
    ctx.register_csv("users", "users.csv")

    result = await (
        ValidationSuite()
        .on_data(ctx, "users")
        .with_name("User Data Quality")
        .add_check(
            Check.builder("Critical")
            .with_level(Level.ERROR)
            .is_complete("user_id")
            .is_unique("email")
            .has_size(Assertion.greater_than(0))
            .build()
        )
        .run()
    )
    print(MarkdownFormatter().format(result))


asyncio.run(main())
Benchmark Highlights
Real-world validation on NYC Yellow Taxi trip data.
12 check groups · 98.9% pass rate · powered by Apache DataFusion
See full benchmark details
Available Now
Profile, persist, monitor, and bootstrap data quality workflows with features already available in qualink.
Analyzers
Compute reusable dataset and column metrics before turning them into checks.
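As a rough illustration of what column metrics look like, the sketch below computes three of them in plain Python. The function `analyze_column`, the metric names, and the sample `ROWS` are assumptions made for this example; they are not qualink's analyzer API.

```python
from typing import Any


def analyze_column(rows: list[dict[str, Any]], column: str) -> dict[str, float]:
    """Compute simple metrics for one column (hypothetical helper, not a qualink API)."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    total = len(values)
    return {
        "row_count": total,
        # Fraction of rows where the column is present and not None.
        "completeness": len(non_null) / total if total else 0.0,
        # Number of different non-null values.
        "distinct_count": len(set(non_null)),
    }


# Illustrative sample data (assumption).
ROWS = [
    {"user_id": 1, "email": "a@example.com"},
    {"user_id": 2, "email": None},
    {"user_id": 3, "email": "a@example.com"},
    {"user_id": 4, "email": "b@example.com"},
]

if __name__ == "__main__":
    print(analyze_column(ROWS, "email"))
```

Metrics like these are reusable: the same completeness value could back a check threshold, a trend chart, or a baseline for comparison.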
Metrics Repository
Persist analyzer outputs over time to track quality trends, regressions, and baselines.
Anomaly Detection
Detect unexpected metric shifts using rate-of-change and z-score strategies.
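The two strategies named above can be sketched in a few lines of standard-library Python. This is a generic illustration under stated assumptions, not qualink's implementation: the function names, the default thresholds (`max_change=0.2`, `threshold=3.0`), and the sample history are all hypothetical.

```python
from statistics import mean, pstdev


def rate_of_change_anomaly(history: list[float], current: float,
                           max_change: float = 0.2) -> bool:
    """Flag `current` if it moved more than `max_change` relative to the last value."""
    previous = history[-1]
    if previous == 0:
        return current != 0  # any move away from zero counts as a shift
    return abs(current - previous) / abs(previous) > max_change


def z_score_anomaly(history: list[float], current: float,
                    threshold: float = 3.0) -> bool:
    """Flag `current` if it lies more than `threshold` standard deviations from the mean."""
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation of the history
    if sigma == 0:
        return current != mu  # a constant history makes any change anomalous
    return abs(current - mu) / sigma > threshold


# Illustrative daily row counts for a table (assumption).
ROW_COUNTS = [1000, 1020, 990, 1010, 1005]

if __name__ == "__main__":
    for candidate in (1012, 1500):
        print(candidate,
              rate_of_change_anomaly(ROW_COUNTS, candidate),
              z_score_anomaly(ROW_COUNTS, candidate))
```

Rate-of-change only compares against the most recent value, so it reacts quickly but ignores longer history; the z-score uses the whole window, which makes it more robust to a single noisy point but slower to adapt after a legitimate shift.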
Intelligent Rule Suggestions
Generate candidate qualink rules from profiling results to bootstrap validation suites faster.
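One way to picture rule suggestion is as a mapping from profiled metrics to candidate constraints. The sketch below is a simplified assumption about how such heuristics could work, not qualink's actual suggestion logic: `suggest_rules`, the profile layout, and the string output format are hypothetical. The only names borrowed from the Quick Example are the builder methods `is_complete` and `is_unique`.

```python
def suggest_rules(profile: dict[str, dict[str, float]]) -> list[str]:
    """Turn per-column metrics into candidate rule strings (hypothetical heuristics)."""
    suggestions = []
    for column, metrics in profile.items():
        # A column with no missing values is a candidate for a completeness rule.
        if metrics["completeness"] == 1.0:
            suggestions.append(f'is_complete("{column}")')
        # A column where every row holds a different value is a candidate key.
        if metrics["row_count"] > 0 and metrics["distinct_count"] == metrics["row_count"]:
            suggestions.append(f'is_unique("{column}")')
    return suggestions


# Illustrative profiling output (assumption).
PROFILE = {
    "user_id": {"row_count": 4, "completeness": 1.0, "distinct_count": 4},
    "email": {"row_count": 4, "completeness": 0.75, "distinct_count": 2},
}

if __name__ == "__main__":
    for rule in suggest_rules(PROFILE):
        print(rule)
```

Suggestions produced this way are only candidates: a column that happens to be unique in one sample is not guaranteed to stay unique, so each rule should be reviewed before it goes into a validation suite.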