anonymizer
Class QIDAttribute

java.lang.Object
  extended by anonymizer.QIDAttribute

public class QIDAttribute
extends java.lang.Object

Data structure that holds all information about a quasi-identifier (QID) attribute: the attribute's index in the input, its value generalization hierarchy (VGH, read from the configuration file), and the string-to-integer domain mapping for categorical attributes.


Field Summary
 java.util.Hashtable<java.lang.String,java.lang.Integer> catDomMapping
          Mapping from the categorical domain to discrete-valued numerical (integer) domain for categorical attributes
private  java.util.Hashtable<java.lang.String,java.lang.String[]> childLookup
          Lookup to find children (specializations) of an interval
private  java.util.Hashtable<java.lang.String,java.lang.String[]> generalizationSeq
          Maps each VGH entry to a list of ground-domain values (i.e., not generalized values)
 int index
          Index of the attribute
private  java.util.LinkedList<Interval> leafIntervals
          List of leaf intervals in the VGH
private  java.util.Hashtable<java.lang.String,java.lang.Integer> nonLeaves
          Maps every non-leaf VGH node to a unique index, assigned by a breadth-first traversal of the VGH
private  java.util.Hashtable<java.lang.String,java.lang.String> parentLookup
          Lookup to find the generalization of an interval
private  java.lang.String suppValue
          Suppression value for the VGH
 
Constructor Summary
QIDAttribute(int index, org.w3c.dom.Node att)
          Class constructor
 
Method Summary
 java.lang.String generalize(java.lang.String value)
          Retrieves the generalization of any QID value (leaf or non-leaf)
 boolean generalizesTo(java.lang.String value, java.lang.String generalization)
          Checks whether a generalization matches a value
private  int getDepth(java.lang.String leafInterval)
          Calculates the depth of a leaf interval (i.e., length of the path to the suppression value)
 java.lang.String[] getGeneralizationSequence(java.lang.String val)
          Get the list of ground-domain values that the provided value represents (applies only to categorical attributes mapped to a discrete domain)
 int[] getLeafCategories(java.lang.String val)
           
 java.util.Hashtable<java.lang.String,java.lang.Integer> getNonLeaves()
          Get the non-leaf nodes of the VGH
 java.lang.String getSup()
          Getter for the suppression value
 int getVGHDepth(boolean validateDGH)
          Calculates the depth of the value generalization hierarchy
 boolean isDomainGeneralizationHierarchy()
          Checks whether the VGH for this qid-attribute corresponds to a domain generalization hierarchy.
private  void parseMapping(org.w3c.dom.Node map)
          Assign the mapping from the categorical domain to the discrete-valued numerical domain
private  void parseVGH(org.w3c.dom.Node vgh)
          Obtain the value-generalization hierarchy for the attribute
 java.lang.String[] specialize(java.lang.String value)
          Retrieves higher granularity values of some generalized value
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

catDomMapping

public java.util.Hashtable<java.lang.String,java.lang.Integer> catDomMapping
Mapping from the categorical domain to discrete-valued numerical (integer) domain for categorical attributes


index

public int index
Index of the attribute


suppValue

private java.lang.String suppValue
Suppression value for the VGH


parentLookup

private java.util.Hashtable<java.lang.String,java.lang.String> parentLookup
Lookup to find the generalization of an interval


childLookup

private java.util.Hashtable<java.lang.String,java.lang.String[]> childLookup
Lookup to find children (specializations) of an interval


leafIntervals

private java.util.LinkedList<Interval> leafIntervals
List of leaf intervals in the VGH


nonLeaves

private java.util.Hashtable<java.lang.String,java.lang.Integer> nonLeaves
Maps every non-leaf VGH node to a unique index, assigned by a breadth-first traversal of the VGH


generalizationSeq

private java.util.Hashtable<java.lang.String,java.lang.String[]> generalizationSeq
Maps each VGH entry to a list of ground-domain values (i.e., not generalized values)

Constructor Detail

QIDAttribute

public QIDAttribute(int index,
                    org.w3c.dom.Node att)
             throws java.lang.Exception
Class constructor

Parameters:
att - The node that describes a QID attribute in the configuration file
index - Index of the attribute
Throws:
java.lang.Exception
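
The constructor takes an org.w3c.dom.Node taken from the configuration file. The following is a minimal sketch of how such a node can be obtained with the standard JAXP parser. The class name ConfigNodeSketch, the helper firstElement, and the element name "att" are hypothetical and chosen only for illustration; the actual configuration schema is not described on this page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class ConfigNodeSketch {
    // Parses an XML string and returns the first element with the given tag name,
    // or null if no such element exists. The tag name is an assumption of this sketch.
    static Node firstElement(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName(tagName).item(0);
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers do not have to declare them.
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }
}
```

A node obtained this way would then be passed as the att argument, together with the attribute's index.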
Method Detail

parseMapping

private void parseMapping(org.w3c.dom.Node map)
                   throws java.lang.Exception
Assign the mapping from the categorical domain to the discrete-valued numerical domain

Parameters:
map - Root node that contains (category, integer-value) pairs
Throws:
java.lang.Exception

parseVGH

private void parseVGH(org.w3c.dom.Node vgh)
               throws java.lang.Exception
Obtain the value-generalization hierarchy for the attribute

Parameters:
vgh - Root node that contains the suppression value and all its children (specializations)
Throws:
java.lang.Exception

isDomainGeneralizationHierarchy

public boolean isDomainGeneralizationHierarchy()
Checks whether the VGH for this QID attribute corresponds to a domain generalization hierarchy (DGH). This requires that all leaf values in the VGH be at the same depth from the root (i.e., the suppression value). The check should be carried out on all quasi-identifier attributes for all full-domain anonymization methods (e.g., Datafly, Incognito, etc.).

Returns:
true if VGH corresponds to a DGH, false otherwise
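
The equal-depth condition can be illustrated with a small standalone sketch. It is not the class's actual implementation: the class name DghCheckSketch, its method names, and the example hierarchy are assumptions, and a plain child-to-parent map stands in for the private parentLookup field.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DghCheckSketch {
    // Depth of a node: number of parent links followed to reach the root.
    static int depth(Map<String, String> parentLookup, String root, String node) {
        int d = 0;
        String current = node;
        while (!current.equals(root)) {
            current = parentLookup.get(current);
            if (current == null) {
                throw new IllegalArgumentException("Not connected to the root: " + node);
            }
            d++;
        }
        return d;
    }

    // A VGH is treated as a DGH when every leaf has the same depth.
    static boolean isDgh(Map<String, String> parentLookup, String root, List<String> leaves) {
        int expected = -1;
        for (String leaf : leaves) {
            int d = depth(parentLookup, root, leaf);
            if (expected == -1) {
                expected = d;
            } else if (d != expected) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical hierarchy: Any -> {Europe, Asia, Unknown}, Europe -> {France, Italy}, Asia -> {Japan}.
    static Map<String, String> exampleParents() {
        Map<String, String> p = new HashMap<>();
        p.put("Europe", "Any");
        p.put("Asia", "Any");
        p.put("Unknown", "Any");
        p.put("France", "Europe");
        p.put("Italy", "Europe");
        p.put("Japan", "Asia");
        return p;
    }
}
```

In the example hierarchy, the leaves France, Italy, and Japan all sit two levels below the root, so they alone form a DGH; adding the leaf Unknown, which sits one level below the root, breaks the condition.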

getVGHDepth

public int getVGHDepth(boolean validateDGH)
                throws java.lang.Exception
Calculates the depth of the value generalization hierarchy

Parameters:
validateDGH - if true, this method first checks whether the value generalization hierarchy represents a domain generalization hierarchy (i.e., whether all leaves are at the same depth)
Returns:
Depth of the VGH
Throws:
java.lang.Exception - If input validation fails

getDepth

private int getDepth(java.lang.String leafInterval)
Calculates the depth of a leaf interval (i.e., length of the path to the suppression value)

Parameters:
leafInterval - String representation of a leaf interval
Returns:
depth of the leaf interval

getSup

public java.lang.String getSup()
Getter for the suppression value

Returns:
root value of the DGH

generalize

public java.lang.String generalize(java.lang.String value)
Retrieves the generalization of any QID value (leaf or non-leaf)

Parameters:
value - some value from the QID's domain
Returns:
immediate parent of the value (or the value itself if it is the DGH root)
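
A minimal sketch of the lookup described above, assuming a child-to-parent table like the private parentLookup field. The class GeneralizeSketch and its addEdge helper are hypothetical; how the real class treats values missing from the hierarchy is not stated here, and this sketch simply returns null for them.

```java
import java.util.Hashtable;

class GeneralizeSketch {
    private final Hashtable<String, String> parentLookup = new Hashtable<>();
    private final String suppValue;

    GeneralizeSketch(String suppValue) {
        this.suppValue = suppValue;
    }

    // Registers one child -> parent edge of the hierarchy.
    void addEdge(String child, String parent) {
        parentLookup.put(child, parent);
    }

    // Returns the immediate parent, or the value itself when it is already the root.
    String generalize(String value) {
        if (value.equals(suppValue)) {
            return value;
        }
        return parentLookup.get(value); // null for values outside the hierarchy (assumption)
    }
}
```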

specialize

public java.lang.String[] specialize(java.lang.String value)
                              throws java.lang.Exception
Retrieves higher granularity values of some generalized value

Parameters:
value - some non-leaf value from DGH
Returns:
array of possible specializations of the value, or null if the attribute is continuous and the value is not a node of the VGH
Throws:
java.lang.Exception
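
Specialization is the inverse of generalization, so a child lookup can be derived from a child-to-parent table. The sketch below shows one way to do that; SpecializeSketch and its methods are hypothetical names. It returns null for any value without registered children, which is only an approximation of the documented behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SpecializeSketch {
    // Inverts a child -> parent map into a parent -> children map,
    // keeping children in the iteration order of the input map.
    static Map<String, String[]> buildChildLookup(Map<String, String> parentLookup) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : parentLookup.entrySet()) {
            grouped.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        Map<String, String[]> childLookup = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            childLookup.put(e.getKey(), e.getValue().toArray(new String[0]));
        }
        return childLookup;
    }

    // Returns the immediate specializations of a value, or null if it has none registered.
    static String[] specialize(Map<String, String[]> childLookup, String value) {
        return childLookup.get(value);
    }
}
```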

generalizesTo

public boolean generalizesTo(java.lang.String value,
                             java.lang.String generalization)
                      throws java.lang.Exception
Checks whether a generalization matches a value

Parameters:
value - some highest granularity value (not generalized)
generalization - Any value from the VGH domain
Returns:
true if the value generalizes to the specified generalized value
Throws:
java.lang.Exception
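
One way to picture this check is a walk up the hierarchy from the ground value until the candidate generalization is found or the root is passed. The sketch below is an illustration under assumptions, not the class's code: GeneralizesToSketch is a hypothetical name, and treating a value as generalizing to itself is a choice made here.

```java
import java.util.Map;

class GeneralizesToSketch {
    // Follows child -> parent links from value; true if generalization is met on the way.
    // The root has no entry in parentLookup, so the walk ends when get() returns null.
    static boolean generalizesTo(Map<String, String> parentLookup,
                                 String value, String generalization) {
        String current = value;
        while (current != null) {
            if (current.equals(generalization)) {
                return true;
            }
            current = parentLookup.get(current);
        }
        return false;
    }
}
```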

getNonLeaves

public java.util.Hashtable<java.lang.String,java.lang.Integer> getNonLeaves()
Get the non-leaf nodes of the VGH

Returns:
A hashtable that maps every non-leaf VGH node to a unique index, assigned by a breadth-first traversal of the VGH
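
The breadth-first numbering can be sketched as follows. NonLeafIndexSketch is a hypothetical name, the parent-to-children map stands in for the private childLookup field, and starting the numbering at 0 is an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

class NonLeafIndexSketch {
    // Visits nodes level by level from the root and numbers only nodes that have children.
    static Map<String, Integer> indexNonLeaves(Map<String, String[]> childLookup, String root) {
        Map<String, Integer> indices = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        int next = 0; // first index; 0 is an assumption
        while (!queue.isEmpty()) {
            String node = queue.remove();
            String[] children = childLookup.get(node);
            if (children == null || children.length == 0) {
                continue; // leaf: not indexed
            }
            indices.put(node, next++);
            for (String child : children) {
                queue.add(child);
            }
        }
        return indices;
    }
}
```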

getGeneralizationSequence

public java.lang.String[] getGeneralizationSequence(java.lang.String val)
Get the list of ground-domain values that the provided value represents (applies only to categorical attributes mapped to a discrete domain)

Parameters:
val - some value in the VGH
Returns:
starting with the value itself, returns a vector of generalizations all the way up to the suppression value

getLeafCategories

public int[] getLeafCategories(java.lang.String val)