dru_utilities¶
Role: DRU (Discretize / Regularize Unit) utilities for transforming communication message logits into differentiable probabilities during training and discrete bits during execution.
Location: Q_Sea_Battle.dru_utilities
Overview¶
This module implements helper functions for the Discretize / Regularize Unit (DRU) used in DIAL-style (Differentiable Inter-Agent Learning) training of communicating agents. It provides two mappings: dru_train, a differentiable mapping that adds Gaussian noise to the logits and applies a logistic nonlinearity during centralized training, and dru_execute, a discretizing mapping that hard-thresholds logits to bits for decentralized execution. The module is intentionally free of trainable parameters and supports both NumPy arrays and TensorFlow tensors.
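As a compact summary of the two mappings described above, the following is a sketch in symbols rather than a statement of the module's exact implementation. Here m is a single message logit, σ is the noise standard deviation (sigma), [a, b] is the optional clip range (clip_range), and τ is the logit-space threshold (threshold). The strict inequality in the execution case is an assumption inferred from the dru_execute example, where a logit equal to the threshold maps to 0.

```latex
\begin{aligned}
\text{training:}\quad & \hat{m} = \operatorname{logistic}\bigl(\operatorname{clip}(m + \varepsilon,\ a,\ b)\bigr),
  \qquad \varepsilon \sim \mathcal{N}(0, \sigma^{2}),
  \qquad \operatorname{logistic}(x) = \frac{1}{1 + e^{-x}} \\
\text{execution:}\quad & \hat{m} = \mathbb{1}[\, m > \tau \,]
\end{aligned}
```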
Public API¶
Functions¶
_is_tf_tensor(x: Any) -> bool¶
Signature: _is_tf_tensor(x: Any) -> bool
Purpose: Return True if x is a TensorFlow tensor.
Arguments:
- x (Any): Value to test.
Returns:
- bool: True if tf.is_tensor(x) is True, otherwise False.
Errors:
- Not specified.
Example:
import tensorflow as tf
from Q_Sea_Battle.dru_utilities import _is_tf_tensor
x = tf.constant([1.0, 2.0])
assert _is_tf_tensor(x) is True
dru_train(message_logits: ArrayLike, sigma: float = 2.0, clip_range: Tuple[float, float] | None = (-10.0, 10.0)) -> ArrayLike¶
Signature: dru_train(message_logits: ArrayLike, sigma: float = 2.0, clip_range: Tuple[float, float] | None = (-10.0, 10.0)) -> ArrayLike
Purpose: Apply the differentiable DRU mapping used during centralized training: additive Gaussian noise on logits followed by a logistic/sigmoid transformation, optionally clipping the noisy logits for numerical stability.
Arguments:
- message_logits (ArrayLike): Logits for communication dimensions; may be a scalar, NumPy array, or TensorFlow tensor of shape (..., m).
- sigma (float, default 2.0): Standard deviation of Gaussian noise added to logits; must be non-negative.
- clip_range (Tuple[float, float] | None, default (-10.0, 10.0)): Optional (min, max) to clip noisy logits before applying the logistic; if None, no clipping is applied.
Returns:
- ArrayLike: Same type and shape as message_logits, with values in (0, 1); TensorFlow outputs are differentiable with respect to message_logits.
Errors:
- ValueError: If sigma < 0.0.
Example:
import numpy as np
import tensorflow as tf
from Q_Sea_Battle.dru_utilities import dru_train
# NumPy usage
logits_np = np.array([0.0, 2.0, -2.0], dtype=np.float32)
probs_np = dru_train(logits_np, sigma=0.0) # deterministic sigmoid
print(probs_np)
# TensorFlow usage (differentiable)
logits_tf = tf.constant([[0.0, 1.0, -1.0]], dtype=tf.float32)
with tf.GradientTape() as tape:
    tape.watch(logits_tf)
    probs_tf = dru_train(logits_tf, sigma=0.0)
grads = tape.gradient(probs_tf, logits_tf)
print(probs_tf, grads)
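To make the three steps in the Purpose line concrete (add noise, optionally clip, apply the logistic), here is a minimal NumPy-only sketch. The function name dru_train_sketch and its rng parameter are hypothetical and exist only for illustration; the real dru_train may differ in how it samples noise, handles dtypes, and delegates to logit_to_prob.

```python
import numpy as np


def dru_train_sketch(message_logits, sigma=2.0, clip_range=(-10.0, 10.0), rng=None):
    """Hypothetical NumPy-only stand-in for dru_train (not the module's code)."""
    if sigma < 0.0:
        raise ValueError("sigma must be non-negative")
    # Assumption: an explicit Generator is used here for clarity; the documented
    # module is described as relying on global seeds instead.
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(message_logits, dtype=np.float64)
    # Step 1: additive Gaussian noise with standard deviation sigma.
    noisy = logits + sigma * rng.standard_normal(logits.shape)
    # Step 2: optional clipping so extreme logits do not saturate the logistic.
    if clip_range is not None:
        noisy = np.clip(noisy, clip_range[0], clip_range[1])
    # Step 3: logistic nonlinearity maps the noisy logits into (0, 1).
    return 1.0 / (1.0 + np.exp(-noisy))


# With sigma=0.0 the mapping reduces to a plain sigmoid of the logits.
probs = dru_train_sketch(np.array([0.0, 2.0, -2.0]), sigma=0.0)
# With sigma > 0 the outputs are random but still lie strictly between 0 and 1.
noisy_probs = dru_train_sketch(np.zeros(5), sigma=2.0, rng=np.random.default_rng(0))
```

With the default clip range, the largest possible output is the logistic of 10 (about 0.99995), so results stay strictly inside (0, 1) even in float32, which is consistent with the documented return range.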
dru_execute(message_logits: ArrayLike, threshold: float = 0.0) -> ArrayLike¶
Signature: dru_execute(message_logits: ArrayLike, threshold: float = 0.0) -> ArrayLike
Purpose: Apply the discretizing DRU mapping used during decentralized execution: element-wise hard thresholding of logits to bits.
Arguments:
- message_logits (ArrayLike): Logits for communication dimensions; may be a scalar, NumPy array, or TensorFlow tensor of shape (..., m).
- threshold (float, default 0.0): Logit-space threshold used to produce discrete bits; 0.0 corresponds to probability threshold 0.5.
Returns:
- ArrayLike: For NumPy inputs, a NumPy array of int with values in {0, 1} and the same shape as message_logits. For TensorFlow inputs, a tf.Tensor of tf.float32 with values in {0.0, 1.0}.
Errors:
- Not specified.
Example:
import numpy as np
import tensorflow as tf
from Q_Sea_Battle.dru_utilities import dru_execute
logits_np = np.array([-0.1, 0.0, 0.2], dtype=np.float32)
bits_np = dru_execute(logits_np, threshold=0.0)
print(bits_np) # [0 0 1]
logits_tf = tf.constant([-0.1, 0.0, 0.2], dtype=tf.float32)
bits_tf = dru_execute(logits_tf, threshold=0.0)
print(bits_tf) # tf.Tensor([0. 0. 1.], shape=(3,), dtype=float32)
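The note that a logit threshold of 0.0 corresponds to a probability threshold of 0.5 can be checked with a short NumPy sketch. The names dru_execute_sketch and sigmoid are hypothetical helpers, not part of the module, and the strict comparison is an assumption inferred from the example above, where a logit of 0.0 maps to bit 0.

```python
import numpy as np


def sigmoid(x):
    """Plain logistic function, used only for this illustration."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=np.float64)))


def dru_execute_sketch(message_logits, threshold=0.0):
    """Hypothetical NumPy-only stand-in for dru_execute (not the module's code)."""
    logits = np.asarray(message_logits, dtype=np.float64)
    # Assumption: strict comparison, so a logit equal to the threshold yields 0.
    return (logits > threshold).astype(int)


logits = np.array([-0.1, 0.0, 0.2])

# Thresholding in logit space.
bits = dru_execute_sketch(logits, threshold=0.0)

# The same decision in probability space: the logistic is monotonic, so
# comparing logits with t matches comparing probabilities with sigmoid(t).
bits_from_probs = (sigmoid(logits) > sigmoid(0.0)).astype(int)
```

Because the logistic is strictly increasing, any logit-space threshold t has an equivalent probability-space threshold sigmoid(t); thresholding the logits directly simply avoids computing the sigmoid at execution time.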
Constants¶
- None.
Types¶
ArrayLike = Union[float, np.ndarray, tf.Tensor]
Dependencies¶
- numpy (np): Used for NumPy-based computation and sampling Gaussian noise in the NumPy path.
- tensorflow (tf): Used for TensorFlow-based computation, noise sampling in the TensorFlow path, and differentiable sigmoid mapping.
- typing: Uses Any, Tuple, and Union for type annotations.
- Q_Sea_Battle.logit_utilities.logit_to_prob: Used to convert logits to probabilities in the NumPy path (and referenced in documentation for consistency).
Planned (design-spec)¶
- Not specified.
Deviations¶
- The module docstring states that the logistic nonlinearity is implemented via Q_Sea_Battle.logit_utilities.logit_to_prob for consistency, but the TensorFlow path uses tf.nn.sigmoid directly (documented as equivalent to logit_to_prob when using logits).
Notes for Contributors¶
- Randomness/reproducibility: dru_train relies on global seeds for np.random and tf.random set elsewhere; tests should set seeds explicitly when deterministic behavior is required.
- Keep TensorFlow operations on-tensor in the TensorFlow path to preserve gradient flow through message_logits.
- Avoid adding trainable parameters to this module; it is intended to be a fixed transformation given inputs and noise settings.
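The reproducibility note can be illustrated with a small test-style sketch. It assumes the NumPy path draws its noise from the global np.random state, as the note suggests; noisy_logits_sketch is a hypothetical stand-in for that noise step, not the module's code. For the TensorFlow path, tf.random.set_seed plays the analogous role.

```python
import numpy as np


def noisy_logits_sketch(logits, sigma):
    """Hypothetical stand-in for the NumPy noise step of dru_train."""
    # Assumption: noise is drawn from the global np.random state.
    return logits + np.random.normal(0.0, sigma, size=np.shape(logits))


logits = np.zeros(4)

# Seeding the global state immediately before each call reproduces the noise.
np.random.seed(123)
first = noisy_logits_sketch(logits, sigma=2.0)
np.random.seed(123)
second = noisy_logits_sketch(logits, sigma=2.0)

# A different seed gives a different noise draw.
np.random.seed(124)
third = noisy_logits_sketch(logits, sigma=2.0)
```

In a test suite, the seed would typically be set in a fixture or at the top of each test so that assertions on noisy outputs do not depend on test execution order.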
Related¶
Q_Sea_Battle.logit_utilities.logit_to_prob (used for stable/log-consistent probability computations in the NumPy path).
Changelog¶
- 0.1: Initial implementation of DRU utilities (dru_train, dru_execute) and helper _is_tf_tensor.