Validation Suite
The ValidationSuite is the top-level entry point for running data-quality checks. It collects one or more Check objects and executes them against a DataFusion table.
Creating a Suite
There are two ways to create a suite:
Fluent API (recommended)
from datafusion import SessionContext
from qualink.core import ValidationSuite
ctx = SessionContext()
ctx.register_csv("users", "users.csv")
result = await (
    ValidationSuite()
    .on_data(ctx, "users")
    .with_name("My Suite")
    .add_check(check1)
    .add_check(check2)
    .run()
)
Builder Pattern
builder = ValidationSuite.builder("My Suite")
builder.on_data(ctx, "users")
builder.add_check(check1)
builder.add_checks([check2, check3])
result = await builder.run()
Configuration Options
.with_name(name: str)
Set the suite name, which appears in reports and logs.
.on_data(ctx: SessionContext, table_name: str)
Bind the suite to a DataFusion session context and table.
.add_check(check: Check)
Add a single check to the suite.
.add_checks(checks: list[Check])
Add multiple checks at once.
.run_parallel(enabled: bool = False)
Enable or disable concurrent check execution. When True, all checks run concurrently via asyncio.gather.
result = await (
    ValidationSuite()
    .on_data(ctx, "users")
    .run_parallel(True)
    .add_check(check1)
    .add_check(check2)
    .run()
)
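To illustrate what concurrent execution via asyncio.gather means in practice, here is a minimal sketch that does not use qualink at all. The fake_check coroutine, the CHECKS list, and the timed helper are hypothetical stand-ins that sleep instead of querying a table; only asyncio.gather itself is the real mechanism named above.

```python
import asyncio
import time


async def fake_check(name: str, delay: float) -> str:
    """Hypothetical stand-in for a check; sleeps to simulate query time."""
    await asyncio.sleep(delay)
    return name


async def run_sequential(checks):
    # One check at a time: total time is roughly the sum of the delays.
    return [await fake_check(name, delay) for name, delay in checks]


async def run_concurrent(checks):
    # All checks are scheduled together: total time is roughly the longest delay.
    # asyncio.gather returns results in the order the awaitables were passed.
    return await asyncio.gather(*(fake_check(n, d) for n, d in checks))


def timed(coro_factory, checks):
    """Run a coroutine factory to completion and return (results, seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_factory(checks))
    return results, time.perf_counter() - start


CHECKS = [("check1", 0.1), ("check2", 0.1), ("check3", 0.1)]

if __name__ == "__main__":
    for runner in (run_sequential, run_concurrent):
        results, seconds = timed(runner, CHECKS)
        print(f"{runner.__name__}: {list(results)} in {seconds:.2f}s")
```

In this sketch the concurrent run finishes in about the time of the slowest check, while the sequential run takes about the sum of all delays. How much real checks benefit depends on how much of their work actually yields to the event loop.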
ValidationResult
The .run() method returns a ValidationResult with:
| Property | Type | Description |
|---|---|---|
| success | bool | True if no ERROR-level constraints failed |
| status | str | "Success", "Warning", or "Error" |
| report | ValidationReport | Detailed metrics, results, and issues |
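Because success stays True when only warning-level constraints fail, it is a natural value to gate a pipeline on. The sketch below shows one way to do that. exit_code_for is a hypothetical helper, and the SimpleNamespace objects are stand-ins that carry only the documented success and status properties; pairing success=True with status "Warning" is an assumption drawn from the table, not confirmed library behavior.

```python
from types import SimpleNamespace


def exit_code_for(result) -> int:
    """Map a validation result to a process exit code (hypothetical helper)."""
    # success is True when no ERROR-level constraint failed,
    # so warnings alone do not fail the job.
    return 0 if result.success else 1


# Stand-in objects; a real ValidationResult would come from suite.run().
warned = SimpleNamespace(success=True, status="Warning")
errored = SimpleNamespace(success=False, status="Error")

if __name__ == "__main__":
    for result in (warned, errored):
        print(result.status, "->", exit_code_for(result))
```

A stricter pipeline could compare status against "Success" instead, which would also fail the job on warnings.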
ValidationReport
| Property | Type | Description |
|---|---|---|
| suite_name | str | Name of the suite |
| metrics | ValidationMetrics | Aggregate pass/fail counts |
| check_results | dict[str, list] | Per-check constraint results |
| issues | list[ValidationIssue] | Failed constraint details |
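As a small illustration of working with the dict[str, list] shape of check_results, the following sketch counts constraint results per check. It assumes the dictionary keys are check names, which the table does not state explicitly. The list items are placeholder strings because the element type is not documented here, and constraints_per_check is a hypothetical helper.

```python
def constraints_per_check(check_results: dict[str, list]) -> dict[str, int]:
    """Count how many constraint results each check produced."""
    return {name: len(results) for name, results in check_results.items()}


# Stand-in data: keys are assumed to be check names, values are placeholders.
sample = {
    "Data Quality": ["r1", "r2", "r3"],
    "Freshness": ["r4", "r5"],
}
counts = constraints_per_check(sample)

if __name__ == "__main__":
    for name, count in counts.items():
        print(f"{name}: {count} constraint result(s)")
```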
ValidationMetrics
| Property | Type | Description |
|---|---|---|
| total_checks | int | Number of checks |
| total_constraints | int | Number of constraints evaluated |
| passed | int | Constraints that passed |
| failed | int | Constraints that failed |
| skipped | int | Constraints skipped |
| error_count | int | Failures at ERROR level |
| warning_count | int | Failures at WARNING level |
| pass_rate | float | passed / (passed + failed) |
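The pass_rate formula leaves skipped constraints out of the denominator. A minimal worked sketch of that arithmetic follows. The function is a hypothetical re-implementation for illustration, and returning 0.0 when nothing was evaluated is an assumption, since the table does not say how the library handles that case.

```python
def pass_rate(passed: int, failed: int) -> float:
    """Compute passed / (passed + failed); skipped constraints are not counted."""
    evaluated = passed + failed
    # Assumption: report 0.0 when nothing was evaluated, to avoid dividing by zero.
    if evaluated == 0:
        return 0.0
    return passed / evaluated


if __name__ == "__main__":
    # 4 passed and 1 failed out of 5 evaluated constraints.
    print(pass_rate(4, 1))
```

With 4 passed and 1 failed, the rate is 4 / 5 = 0.8, regardless of how many constraints were skipped.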
Printing the Result
print(result)
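Produces output similar to the following. The suite reports PASSED despite one failed constraint because that failure is at warning level, not error level: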
Validation PASSED: My Suite
Checks: 2 | Constraints: 5
Passed: 4 | Failed: 1 | Skipped: 0
Issues:
[warning] Data Quality / Completeness(name): ...