Connecting to and using ARXaaS¶
Calls to ARXaaS are made through the ARXaaS class. The class implements methods for the following functionality:
- Anonymize a Dataset object
- Analyze re-identification risk for a Dataset object
- Create generalization hierarchies (See: Creating Hierarchies)
Creating¶
When creating an instance of the ARXaaS class, you need to pass the full URL of the running service.
Example
from pyarxaas import ARXaaS
arxaas = ARXaaS("https://localhost:8080")  # the URL must be passed as a string
Risk Profile¶
Re-identification risk for the prosecutor, journalist and marketer attack models can be obtained using the ARXaaS risk_profile method. The method takes a Dataset object and returns a RiskProfile. See Using the Dataset class for more on the Dataset class. For more in-depth information on re-identification risk, see ARX | risk analysis.
Example
risk_profile = arxaas.risk_profile(dataset)
The risk profile exposes several properties containing analytics on the dataset's re-identification risk. The most important is the re_identification_risk property.
# create risk profile ...
risks = risk_profile.re_identification_risk
The property contains a mapping of risk => value. What counts as an acceptable risk depends entirely on the context of the dataset.
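Since the acceptable level depends on context, one way to work with the mapping is to compare each value against a threshold you choose yourself. The sketch below is only an illustration: the measure names and values in example_risks are placeholders rather than output from a real ARXaaS call, measures_above is a hypothetical helper that is not part of pyarxaas, and it assumes the mapping values are numeric.

```python
# Hypothetical risk mapping; the measure names and values are placeholders,
# not output from a real ARXaaS call.
example_risks = {
    "measure_a": 0.05,
    "measure_b": 0.50,
    "measure_c": 1.00,
}


def measures_above(risks, threshold):
    """Return the measures whose value exceeds threshold, ordered by name."""
    return {
        name: value
        for name, value in sorted(risks.items())
        if value > threshold
    }


# The threshold is context dependent; 0.2 is used here only as an illustration.
flagged = measures_above(example_risks, threshold=0.2)
print(flagged)
```

With a real profile, the same helper could presumably be called on risk_profile.re_identification_risk in place of example_risks.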
Anonymization¶
Anonymizing a dataset is as simple as passing a Dataset containing the necessary hierarchies, a sequence of privacy models to use, and optionally a suppression limit to the anonymize() method. If successful, the method returns an AnonymizeResult object containing the new dataset.
Example
from pyarxaas.privacy_models import KAnonymity, LDiversityDistinct

kanon = KAnonymity(2)
ldiv = LDiversityDistinct(2, "disease")  # in this example the dataset has a disease field
anonymize_result = arxaas.anonymize(dataset, [kanon, ldiv], 0.2)  # 0.2 is the suppression limit
anonymized_dataset = anonymize_result.dataset
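To make the meaning of KAnonymity(2) more concrete, the following minimal sketch checks the k-anonymity property on plain Python data: every combination of quasi-identifying values must occur in at least k rows. It is not how ARXaaS works internally; is_k_anonymous and the sample rows are hypothetical and exist only for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(
        tuple(row[column] for column in quasi_identifiers) for row in rows
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already generalized rows; not produced by ARXaaS.
rows = [
    {"zipcode": "476**", "age": "20-29", "disease": "flu"},
    {"zipcode": "476**", "age": "20-29", "disease": "cancer"},
    {"zipcode": "479**", "age": "30-39", "disease": "flu"},
    {"zipcode": "479**", "age": "30-39", "disease": "asthma"},
]

two_anonymous = is_k_anonymous(rows, ["zipcode", "age"], k=2)
three_anonymous = is_k_anonymous(rows, ["zipcode", "age"], k=3)
print(two_anonymous, three_anonymous)
```

Each group of identical zipcode and age values contains two rows here, so the sample satisfies k = 2 but not k = 3.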
Hierarchy Generation¶
Generalization hierarchies are an important part of anonymization. ARXaaS provides a hierarchy() method. It takes a configured hierarchy builder object and a dataset column represented as a plain Python list, and returns a 2D list structure containing the new hierarchy.
Example making a redaction hierarchy
from pyarxaas.hierarchy import RedactionHierarchyBuilder

redaction_builder = RedactionHierarchyBuilder()
zipcodes = [47677, 47602, 47678, 47905, 47909, 47906, 47605, 47673, 47607]
zipcode_hierarchy = arxaas.hierarchy(redaction_builder, zipcodes)
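To give a feel for the 2D list structure, here is a rough local sketch of a redaction-style hierarchy in plain Python, where each level masks one more trailing character. It is an assumption about the general shape, not the service's implementation: redaction_hierarchy is a hypothetical helper, it assumes all values have the same length, and the exact ordering, padding and masking character used by ARXaaS may differ.

```python
def redaction_hierarchy(values, mask="*"):
    """Build one row per value; each successive level masks one more trailing character."""
    rows = []
    for value in values:
        text = str(value)
        rows.append(
            [text[: len(text) - level] + mask * level for level in range(len(text) + 1)]
        )
    return rows


# A few of the zip codes from the example above.
sample_hierarchy = redaction_hierarchy([47677, 47602, 47905])
for row in sample_hierarchy:
    print(row)
```

Each row starts with the original value at level 0 and ends with the fully masked value, so a five-character zip code yields six levels.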