Module mimir.attacks.quantile

Implementation of the attack proposed in 'Scalable Membership Inference Attacks via Quantile Regression' https://arxiv.org/pdf/2307.03694.pdf

Classes

class CustomTrainer (alpha_fpr, **kwargs)

CustomTrainer subclasses the 🤗 Transformers Trainer, replacing the default loss with a pinball (quantile) loss at level alpha_fpr; the remaining documentation is inherited from Trainer, a simple but feature-complete training and eval loop for PyTorch, optimized for 🤗 Transformers.

Args

model ([PreTrainedModel] or torch.nn.Module, optional): The model to train, evaluate or use for predictions. If not provided, a model_init must be passed.

Tip: `Trainer` is optimized to work with the `PreTrainedModel` provided by the library. You can still use your own models defined as `torch.nn.Module` as long as they work the same way as the 🤗 Transformers models.

args ([TrainingArguments], optional): The arguments to tweak for training. Will default to a basic instance of [TrainingArguments] with the output_dir set to a directory named tmp_trainer in the current directory if not provided.

data_collator (DataCollator, optional): The function to use to form a batch from a list of elements of train_dataset or eval_dataset. Will default to [default_data_collator] if no tokenizer is provided, an instance of [DataCollatorWithPadding] otherwise.

train_dataset (Union[torch.utils.data.Dataset, torch.utils.data.IterableDataset, datasets.Dataset], optional): The dataset to use for training. If it is a [~datasets.Dataset], columns not accepted by the model.forward() method are automatically removed.

Note that if it's a `torch.utils.data.IterableDataset` with some randomization and you are training in a distributed fashion, your iterable dataset should either use an internal attribute `generator` that is a `torch.Generator` for the randomization that must be identical on all processes (and the Trainer will manually set the seed of this `generator` at each epoch) or have a `set_epoch()` method that internally sets the seed of the RNGs used.

eval_dataset (Union[torch.utils.data.Dataset, Dict[str, torch.utils.data.Dataset], datasets.Dataset], optional): The dataset to use for evaluation. If it is a [~datasets.Dataset], columns not accepted by the model.forward() method are automatically removed. If it is a dictionary, it will evaluate on each dataset, prepending the dictionary key to the metric name.

tokenizer ([PreTrainedTokenizerBase], optional): The tokenizer used to preprocess the data. If provided, it will be used to automatically pad the inputs to the maximum length when batching inputs, and it will be saved along with the model to make it easier to rerun an interrupted training or reuse the fine-tuned model.

model_init (Callable[[], PreTrainedModel], optional): A function that instantiates the model to be used. If provided, each call to [~Trainer.train] will start from a new instance of the model as given by this function.

The function may have zero arguments, or a single one containing the optuna/Ray Tune/SigOpt trial object, to be able to choose different architectures according to hyperparameters (such as layer count, sizes of inner layers, dropout probabilities, etc.).

compute_metrics (Callable[[EvalPrediction], Dict], optional): The function that will be used to compute metrics at evaluation. Must take an [EvalPrediction] and return a dictionary mapping metric names to metric values. Note that when passing [TrainingArguments] with batch_eval_metrics set to True, your compute_metrics function must take a boolean compute_result argument. This will be triggered after the last eval batch to signal that the function needs to calculate and return the global summary statistics rather than accumulating the batch-level statistics.

callbacks (List of [TrainerCallback], optional): A list of callbacks to customize the training loop. These are added to the list of default callbacks.

If you want to remove one of the default callbacks used, use the [`Trainer.remove_callback`] method.

optimizers (Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR], optional, defaults to (None, None)): A tuple containing the optimizer and the scheduler to use. Will default to an instance of [AdamW] on your model and a scheduler given by [get_linear_schedule_with_warmup] controlled by args.

preprocess_logits_for_metrics (Callable[[torch.Tensor, torch.Tensor], torch.Tensor], optional): A function that preprocesses the logits right before caching them at each evaluation step. Must take two tensors, the logits and the labels, and return the logits once processed as desired. The modifications made by this function will be reflected in the predictions received by compute_metrics.

Note that the labels (second parameter) will be `None` if the dataset does not have them.

Important attributes:

- **model** -- Always points to the core model. If using a transformers model, it will be a [`PreTrainedModel`]
  subclass.
- **model_wrapped** -- Always points to the most external model in case one or more other modules wrap the
  original model. This is the model that should be used for the forward pass. For example, under `DeepSpeed`,
  the inner model is wrapped in `DeepSpeed` and then again in `torch.nn.DistributedDataParallel`. If the inner
  model hasn't been wrapped, then `self.model_wrapped` is the same as `self.model`.
- **is_model_parallel** -- Whether or not a model has been switched to a model parallel mode (different from
  data parallelism, this means some of the model layers are split on different GPUs).
- **place_model_on_device** -- Whether or not to automatically place the model on the device - it will be set
  to `False` if model parallel or deepspeed is used, or if the default
  `TrainingArguments.place_model_on_device` is overridden to return `False`.
- **is_in_train** -- Whether or not a model is currently running `train` (e.g. when `evaluate` is called while
  in `train`)
Source code
import torch as ch
from transformers import Trainer


class CustomTrainer(Trainer):
    def __init__(
        self,
        alpha_fpr,
        **kwargs,
    ):
        super().__init__(**kwargs)
        # Quantile level of the pinball loss; also the attack's target FPR
        self.alpha_fpr = alpha_fpr

    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        # Forward pass (labels are popped so the model does not compute its own loss)
        outputs = model(**inputs)
        logits = outputs.get("logits")
        # Pinball (quantile) loss between predicted scores (logits) and
        # observed scores (labels), averaged over the batch
        loss = ch.mean(
            ch.max(
                self.alpha_fpr * (logits - labels),
                (1 - self.alpha_fpr) * (labels - logits),
            )
        )
        return (loss, outputs) if return_outputs else loss
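
A minimal usage sketch follows; the checkpoint name, toy dataset, and hyperparameters are illustrative placeholders, not taken from the mimir source.

from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

# Hypothetical single-logit regression head over a generic checkpoint
name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=1)

# Toy dataset: "labels" holds the scores whose quantile we want to regress
data = Dataset.from_dict({"text": ["a sample", "another sample"], "labels": [0.1, 0.7]})
data = data.map(
    lambda ex: tokenizer(ex["text"], padding="max_length", truncation=True),
    batched=True,
)

trainer = CustomTrainer(
    alpha_fpr=0.05,  # quantile level / target FPR
    model=model,
    args=TrainingArguments(output_dir="tmp_trainer", num_train_epochs=1),
    train_dataset=data,
)
trainer.train()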

Ancestors

  • transformers.trainer.Trainer

Methods

def compute_loss(self, model, inputs, return_outputs=False)

How the loss is computed by Trainer. By default, all models return the loss in the first element.

Subclass and override for custom behavior; CustomTrainer overrides it to compute the pinball (quantile) loss at level alpha_fpr, as shown in the source above.
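
For intuition, here is a minimal numeric check of that pinball loss in plain PyTorch (the values are illustrative). Under-prediction is penalized with weight 1 - alpha_fpr and over-prediction with weight alpha_fpr, so the loss is minimized by the (1 - alpha_fpr)-quantile of the targets.

import torch as ch

alpha_fpr = 0.1
logits = ch.tensor([0.5])  # predicted quantile of the score
labels = ch.tensor([1.0])  # observed score

loss = ch.mean(
    ch.max(
        alpha_fpr * (logits - labels),
        (1 - alpha_fpr) * (labels - logits),
    )
)
print(loss.item())  # 0.45 = (1 - 0.1) * (1.0 - 0.5): the under-prediction case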

class QuantileAttack (config, model: Model, alpha: float)

Implementation of the attack proposed in 'Scalable Membership Inference Attacks via Quantile Regression' https://arxiv.org/pdf/2307.03694.pdf

alpha (float): Desired FPR

Source code
from datasets import Dataset
from sklearn.metrics import mean_squared_error
from transformers import TrainingArguments

# mimir-internal imports; module paths assumed from the package layout
from mimir.attacks.all_attacks import Attack
from mimir.models import Model, QuantileReferenceModel


class QuantileAttack(Attack):
    """
    Implementation of the attack proposed in 'Scalable Membership Inference Attacks via Quantile Regression'
    https://arxiv.org/pdf/2307.03694.pdf
    """

    def __init__(self, config, model: Model, alpha: float):
        """
        alpha (float): Desired FPR
        """
        ref_model = QuantileReferenceModel(
            config, name="Sreevishnu/funnel-transformer-small-imdb"
        )
        super().__init__(config, model, ref_model)
        self.alpha = alpha

    def _train_quantile_model(self, dataset):
        def tokenize_function(examples):
            return self.ref_model.tokenizer(
                examples["text"], padding="max_length", truncation=True
            )

        tokenized_dataset = dataset.map(tokenize_function, batched=True)
        training_args = TrainingArguments(
            output_dir="quantile_ref_model",
            evaluation_strategy="epoch",
            num_train_epochs=1,
        )

        def compute_metrics(eval_pred):
            # RMSE between predicted quantiles and observed scores (for monitoring)
            predictions, labels = eval_pred
            rmse = mean_squared_error(labels, predictions, squared=False)
            return {"rmse": rmse}

        trainer = CustomTrainer(
            alpha_fpr=self.alpha,
            model=self.ref_model.model,
            args=training_args,
            train_dataset=tokenized_dataset,
            eval_dataset=tokenized_dataset,
            compute_metrics=compute_metrics,
        )
        # Train quantile model with the pinball loss defined in CustomTrainer
        trainer.train()

    def prepare(self, known_non_members):
        """
        Step 1: Use non-member dataset, collect confidence scores for correct label.
        Step 2: Train a quantile regression model that takes X as input and predicts quantile. Use pinball loss.
        Step 3: Test by checking if member: score is higher than output of quantile regression model.
        """
        # Step 1: Use non-member dataset, collect confidence scores for correct label.
        # Get likelihood scores from target model for known_non_members.
        # Note that these non-members should be different from the ones used in testing
        scores = [self.target_model.get_ll(x) for x in known_non_members]
        # Construct a Huggingface dataset out of this, with "text" containing
        # the actual data and "labels" containing the scores
        dataset = Dataset.from_dict({"text": known_non_members, "labels": scores})

        # Step 2: Train a quantile regression model that takes X as input and predicts quantile. Use pinball loss
        self._train_quantile_model(dataset)

    def attack(self, document, **kwargs):
        # Step 3: Test by checking if member: score is higher than output of quantile regression model.

        # Get likelihood score from target model for the document
        ll = self.target_model.get_ll(document)

        # Predicted quantile score from the reference model
        tokenized = self.ref_model.tokenizer(document, return_tensors="pt")
        # Move tensors in the dictionary to the correct device
        tokenized = {
            k: v.to(self.ref_model.model.device, non_blocking=True)
            for k, v in tokenized.items()
        }
        quantile_score = self.ref_model.model(**tokenized).logits.item()

        # We want a higher score to indicate non-membership
        return quantile_score - ll

Ancestors

  • mimir.attacks.all_attacks.Attack
Methods

def prepare(self, known_non_members)

Step 1: Use the non-member dataset and collect the target model's confidence scores for the correct label.
Step 2: Train a quantile regression model that takes X as input and predicts the score quantile, using the pinball loss.
Step 3: At test time, predict member if the target model's score is higher than the output of the quantile regression model.
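
A minimal end-to-end sketch of the flow described above; config, target_model, and the text variables are hypothetical placeholders for objects built by the surrounding mimir setup.

# config, target_model, calibration_texts, and candidate_text are
# hypothetical placeholders created by the mimir pipeline
attack = QuantileAttack(config, target_model, alpha=0.05)  # target 5% FPR

# Fit the quantile regressor on non-members disjoint from the test set
attack.prepare(known_non_members=calibration_texts)

# Higher output means more non-member-like; with a well-calibrated
# quantile model, thresholding at 0 yields an FPR of roughly alpha
score = attack.attack(candidate_text)
is_member = score < 0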

Inherited members