Module mimir.attacks.quantile
Implementation of the attack proposed in 'Scalable Membership Inference Attacks via Quantile Regression' https://arxiv.org/pdf/2307.03694.pdf
Classes
class CustomTrainer (alpha_fpr, **kwargs)
Trainer is a simple but feature-complete training and eval loop for PyTorch, optimized for 🤗 Transformers.
Args
- model ([PreTrainedModel] or torch.nn.Module, optional): The model to train, evaluate or use for predictions. If not provided, a model_init must be passed. Tip: [Trainer] is optimized to work with the [PreTrainedModel] provided by the library. You can still use your own models defined as torch.nn.Module as long as they work the same way as the 🤗 Transformers models.
- args ([TrainingArguments], optional): The arguments to tweak for training. Will default to a basic instance of [TrainingArguments] with the output_dir set to a directory named tmp_trainer in the current directory if not provided.
- data_collator (DataCollator, optional): The function to use to form a batch from a list of elements of train_dataset or eval_dataset. Will default to [default_data_collator] if no tokenizer is provided, an instance of [DataCollatorWithPadding] otherwise.
- train_dataset (Union[torch.utils.data.Dataset, torch.utils.data.IterableDataset, datasets.Dataset], optional): The dataset to use for training. If it is a [~datasets.Dataset], columns not accepted by the model.forward() method are automatically removed. Note that if it's a torch.utils.data.IterableDataset with some randomization and you are training in a distributed fashion, your iterable dataset should either use an internal attribute generator that is a torch.Generator for the randomization that must be identical on all processes (and the Trainer will manually set the seed of this generator at each epoch) or have a set_epoch() method that internally sets the seed of the RNGs used.
- eval_dataset (Union[torch.utils.data.Dataset, Dict[str, torch.utils.data.Dataset], datasets.Dataset], optional): The dataset to use for evaluation. If it is a [~datasets.Dataset], columns not accepted by the model.forward() method are automatically removed. If it is a dictionary, it will evaluate on each dataset, prepending the dictionary key to the metric name.
- tokenizer ([PreTrainedTokenizerBase], optional): The tokenizer used to preprocess the data. If provided, it will be used to automatically pad the inputs to the maximum length when batching inputs, and it will be saved along the model to make it easier to rerun an interrupted training or reuse the fine-tuned model.
- model_init (Callable[[], PreTrainedModel], optional): A function that instantiates the model to be used. If provided, each call to [~Trainer.train] will start from a new instance of the model as given by this function. The function may have zero arguments, or a single one containing the optuna/Ray Tune/SigOpt trial object, to be able to choose different architectures according to hyperparameters (such as layer count, sizes of inner layers, dropout probabilities etc.).
- compute_metrics (Callable[[EvalPrediction], Dict], optional): The function that will be used to compute metrics at evaluation. Must take an [EvalPrediction] and return a dictionary mapping strings to metric values. Note: when passing TrainingArgs with batch_eval_metrics set to True, your compute_metrics function must take a boolean compute_result argument. This will be triggered after the last eval batch to signal that the function needs to calculate and return the global summary statistics rather than accumulating the batch-level statistics.
- callbacks (List of [TrainerCallback], optional): A list of callbacks to customize the training loop. Will add those to the list of default callbacks detailed in the callback documentation. If you want to remove one of the default callbacks used, use the [Trainer.remove_callback] method.
- optimizers (Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR], optional, defaults to (None, None)): A tuple containing the optimizer and the scheduler to use. Will default to an instance of [AdamW] on your model and a scheduler given by [get_linear_schedule_with_warmup] controlled by args.
- preprocess_logits_for_metrics (Callable[[torch.Tensor, torch.Tensor], torch.Tensor], optional): A function that preprocesses the logits right before caching them at each evaluation step. Must take two tensors, the logits and the labels, and return the logits once processed as desired. The modifications made by this function will be reflected in the predictions received by compute_metrics. Note that the labels (second parameter) will be None if the dataset does not have them.
Important attributes:
- model -- Always points to the core model. If using a transformers model, it will be a [PreTrainedModel] subclass.
- model_wrapped -- Always points to the most external model in case one or more other modules wrap the original model. This is the model that should be used for the forward pass. For example, under DeepSpeed, the inner model is wrapped in DeepSpeed and then again in torch.nn.DistributedDataParallel. If the inner model hasn't been wrapped, then self.model_wrapped is the same as self.model.
- is_model_parallel -- Whether or not a model has been switched to a model parallel mode (different from data parallelism, this means some of the model layers are split on different GPUs).
- place_model_on_device -- Whether or not to automatically place the model on the device. It will be set to False if model parallel or DeepSpeed is used, or if the default TrainingArguments.place_model_on_device is overridden to return False.
- is_in_train -- Whether or not a model is currently running train (e.g. when evaluate is called while in train).
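As a quick, hypothetical sketch of how these arguments come together for quantile regression (the checkpoint, texts, and scores below are illustrative placeholders, not values used by this module):

from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

# Placeholder checkpoint: any sequence-classification model with a single
# logit can serve as a scalar quantile regressor.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=1
)

# Placeholder data: "labels" holds the scalar scores the regressor should bound.
texts = ["an example document", "another example document"]
scores = [-42.0, -37.5]
dataset = Dataset.from_dict({"text": texts, "labels": scores}).map(
    lambda ex: tokenizer(ex["text"], padding="max_length", truncation=True),
    batched=True,
)

trainer = CustomTrainer(
    alpha_fpr=0.05,  # target false-positive rate
    model=model,
    args=TrainingArguments(output_dir="tmp_trainer", num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()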
Source code
class CustomTrainer(Trainer):
    def __init__(
        self,
        alpha_fpr,
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.alpha_fpr = alpha_fpr

    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        # Forward pass
        outputs = model(**inputs)
        logits = outputs.get("logits")
        # Pinball (quantile) loss at quantile level 1 - alpha_fpr
        loss = ch.mean(
            ch.max(
                self.alpha_fpr * (logits - labels),
                (1 - self.alpha_fpr) * (labels - logits),
            )
        )
        return (loss, outputs) if return_outputs else loss
Ancestors
- transformers.trainer.Trainer
Methods
def compute_loss(self, model, inputs, return_outputs=False)
How the loss is computed by Trainer. By default, all models return the loss in the first element.
Subclass and override for custom behavior.
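In CustomTrainer this is overridden with the pinball (quantile) loss: with alpha_fpr = α, the loss max(α(q − y), (1 − α)(y − q)) is minimized when the prediction q sits at the (1 − α)-quantile of the targets y, which is what makes the regressor's output a per-example decision threshold with false-positive rate roughly α. A standalone sketch of the same loss (the helper name and the numeric check are illustrative):

import torch as ch

def pinball_loss(pred: ch.Tensor, target: ch.Tensor, alpha_fpr: float) -> ch.Tensor:
    # Same computation as CustomTrainer.compute_loss: over-predictions are
    # weighted by alpha_fpr, under-predictions by 1 - alpha_fpr, i.e. the
    # pinball loss at quantile level 1 - alpha_fpr.
    return ch.mean(
        ch.max(alpha_fpr * (pred - target), (1 - alpha_fpr) * (target - pred))
    )

# Sanity check: a constant predictor minimizes this loss near the empirical
# (1 - alpha)-quantile of the targets.
alpha = 0.05
targets = ch.randn(10_000)
candidates = ch.linspace(-3.0, 3.0, 601)
losses = ch.stack(
    [pinball_loss(ch.full_like(targets, float(c)), targets, alpha) for c in candidates]
)
print(candidates[ch.argmin(losses)].item(), ch.quantile(targets, 1 - alpha).item())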
class QuantileAttack (config, model: Model, alpha: float)
Implementation of the attack proposed in 'Scalable Membership Inference Attacks via Quantile Regression' https://arxiv.org/pdf/2307.03694.pdf
alpha (float): Desired FPR
Source code
class QuantileAttack(Attack):
    """
    Implementation of the attack proposed in 'Scalable Membership Inference
    Attacks via Quantile Regression' https://arxiv.org/pdf/2307.03694.pdf
    """

    def __init__(self, config, model: Model, alpha: float):
        """
        alpha (float): Desired FPR
        """
        ref_model = QuantileReferenceModel(
            config, name="Sreevishnu/funnel-transformer-small-imdb"
        )
        super().__init__(config, model, ref_model)
        self.alpha = alpha

    def _train_quantile_model(self, dataset):
        def tokenize_function(examples):
            return self.ref_model.tokenizer(
                examples["text"], padding="max_length", truncation=True
            )

        tokenized_dataset = dataset.map(tokenize_function, batched=True)
        training_args = TrainingArguments(
            output_dir="quantile_ref_model",
            evaluation_strategy="epoch",
            num_train_epochs=1,
        )

        def compute_metrics(eval_pred):
            predictions, labels = eval_pred
            rmse = mean_squared_error(labels, predictions, squared=False)
            return {"rmse": rmse}

        trainer = CustomTrainer(
            alpha_fpr=self.alpha,
            model=self.ref_model.model,
            args=training_args,
            train_dataset=tokenized_dataset,
            eval_dataset=tokenized_dataset,
            compute_metrics=compute_metrics,
        )
        # Train quantile model
        trainer.train()

    def prepare(self, known_non_members):
        """
        Step 1: Use non-member dataset, collect confidence scores for correct label.
        Step 2: Train a quantile regression model that takes X as input and predicts quantile. Use pinball loss.
        Step 3: Test by checking if member: score is higher than output of quantile regression model.
        """
        # Step 1: Get likelihood scores from target model for known_non_members.
        # Note that these non-members should be different from the ones in testing
        scores = [self.target_model.get_ll(x) for x in known_non_members]
        # Construct a dataset out of this to be used in Huggingface, with
        # "text" containing the actual data, and "labels" containing the scores
        dataset = Dataset.from_dict({"text": known_non_members, "labels": scores})
        # Step 2: Train a quantile regression model (pinball loss)
        self._train_quantile_model(dataset)

    def attack(self, document, **kwargs):
        # Step 3: Check whether the target model's score exceeds the
        # quantile model's output. Get likelihood score from target model for doc
        ll = self.target_model.get_ll(document)
        # Return ll - quantile_model(doc)
        tokenized = self.ref_model.tokenizer(document, return_tensors="pt")
        # Shift items in the dictionary to the correct device
        tokenized = {
            k: v.to(self.ref_model.model.device, non_blocking=True)
            for k, v in tokenized.items()
        }
        quantile_score = self.ref_model.model(**tokenized)
        quantile_score = quantile_score.logits.item()
        # We want higher score to be non-member
        return quantile_score - ll
Ancestors
- mimir.attacks.all_attacks.Attack
Methods
def prepare(self, known_non_members)
Step 1: Use the non-member dataset to collect confidence scores for the correct label.
Step 2: Train a quantile regression model that takes X as input and predicts a quantile of the non-member score distribution, using the pinball loss.
Step 3: At test time, predict "member" when the target model's score is higher than the output of the quantile regression model.
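Putting prepare and attack together, a hypothetical end-to-end sketch (config, target_model, and the text variables are assumed to come from the surrounding mimir experiment setup and are not defined here):

# Assumed to exist: `config` (mimir experiment config), `target_model`
# (the model under attack), and two disjoint pools of text.
attack = QuantileAttack(config, target_model, alpha=0.05)

# Steps 1-2: score known non-members with the target model and fit the
# quantile regressor on those scores (must be disjoint from the test pool).
attack.prepare(known_non_members=non_member_training_texts)

# Step 3: higher attack scores indicate non-members, so thresholding at 0
# predicts "member" when the target model's log-likelihood exceeds the
# predicted (1 - alpha)-quantile, giving a false-positive rate near alpha.
for doc in candidate_documents:
    is_member = attack.attack(doc) < 0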
Inherited members