4 Data checks YAML file
The YAML file holds the code and metadata for all data checks. The checks are derived from a core suite of tests and assertions being developed by TDWG’s Biodiversity Data Quality Task Group 2 (Data Quality Tests and Assertions). More information and links can be found in the Learn more section.
4.1 Data check example
DC_b23110e7-1be7-444a-a677-cdee0cf4330c:
  name: countryMismatch
  meta:
    Description:
      Main: Check if given country match given country code.
      InputQuestion: Does country and country code match?
      Example:
        Fail: Country name (dwc:country) and ISO country code (dwc:countryCode) do not match
        Pass: Country name (dwc:country) and ISO country code (dwc:countryCode) match
        InputFail: country=Australia, countryCode=4
        InputPass: country=Australia, countryCode=AU
        OutputFail: Failed
        OutputPass: Passed
      Resolution:
        Record: SingleRecord
        Term: MultiTerm
      DarwinCoreClass: Location
      Keywords: location,iso,country
      guid: b23110e7-1be7-444a-a677-cdee0cf4330c
    Flags:
      Severity: Warning
      Warning: Inconsistent
      Output: Validation
      Dimension: Consistency
    Pseudocode: |
      get.Country($countryCode) == $country
    Source:
      Reference:
      CreatedBy: Povilas Gibas
      MaintainedBy: Povilas Gibas
      CreationDate: 2018-06-27
      ModificationDate: 2018-06-27
      ModificationHist:
  Input:
    Target: country,countryCode
    Dependency:
      DependencyType: Internal
      DataChecks:
      Rpackages: rgbif
      Data: isocodes$name,isocodes$code
  Functionality: |
    FUNC <- function() {
      result <- sapply(seq_along(TARGET1), function(i) {
        if (is.na(TARGET1[i]) || is.na(TARGET2[i])) {
          # Missing country or code: the check cannot be evaluated
          NA
        } else {
          # TRUE when some row of the lookup pairs this name with this code
          any(DEPEND1 == TARGET1[i] & DEPEND2 == TARGET2[i])
        }
      })
      result <- unlist(result)
      return(result)
    }
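The placeholders TARGET1, TARGET2, DEPEND1 and DEPEND2 appear to stand for the columns listed under Input (Target and Dependency Data, in order). How the package substitutes them is not described here, so the sketch below simply binds them by hand to small made-up vectors. It is a self-contained illustration of the same check, written with a row-wise any() comparison so that an unmatched code yields FALSE; the three lookup rows are illustrative and are not the real isocodes data.

```r
# Stand-ins for the lookup columns named under Input > Dependency > Data.
# Illustrative rows only, not the real isocodes data set.
DEPEND1 <- c("Australia", "Austria", "Lithuania")  # isocodes$name
DEPEND2 <- c("AU", "AT", "LT")                     # isocodes$code

# Stand-ins for the record columns named under Input > Target.
TARGET1 <- c("Australia", "Australia", NA)  # country
TARGET2 <- c("AU", "4", "LT")               # countryCode

FUNC <- function() {
  vapply(seq_along(TARGET1), function(i) {
    if (is.na(TARGET1[i]) || is.na(TARGET2[i])) {
      # A missing value means the check cannot be evaluated
      NA
    } else {
      # TRUE when some lookup row pairs this name with this code
      any(DEPEND1 == TARGET1[i] & DEPEND2 == TARGET2[i])
    }
  }, logical(1))
}

result <- FUNC()
print(result)
```

The first record corresponds to InputPass in the metadata, the second to InputFail, and the third shows that a record with a missing country is neither passed nor failed.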
4.2 Manage your own data checks
After adding, removing, or editing data checks in the YAML file, you can load them into R with the getDC() function.
DC <- getDC("path to your YAML file")
You can also export the data checks in your YAML file to an .rda file and to roxygen2 comments.
exportDC("path to your YAML file")
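Before loading or exporting a hand-edited file, it can be useful to confirm that it still parses and that each entry has the fields you expect. The following is a minimal sketch that uses the yaml package directly; which parser getDC() uses is not stated here, and both the cut-down entry and the choice of required fields (name, Functionality) are assumptions made for illustration.

```r
library(yaml)

# A cut-down, hypothetical data check entry. The nesting shown is an
# assumption, not a specification of the file format.
yaml_text <- '
DC_b23110e7-1be7-444a-a677-cdee0cf4330c:
  name: countryMismatch
  Input:
    Target: country,countryCode
  Functionality: |
    FUNC <- function() {
      TRUE
    }
'

# Write the entry to a temporary file so it can be read like a real one.
path <- tempfile(fileext = ".yaml")
writeLines(yaml_text, path)

checks <- read_yaml(path)

# Fields assumed to be required for this illustration.
required <- c("name", "Functionality")
missing_fields <- lapply(checks, function(dc) setdiff(required, names(dc)))
is_complete <- vapply(missing_fields, function(m) length(m) == 0, logical(1))

# One TRUE/FALSE per data check, named by its key in the file.
print(is_complete)
```

A parse error from read_yaml() usually points to broken indentation, which is the most common mistake when editing the block under Functionality by hand.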