QSeaBattle – Quick Start Guide
This notebook demonstrates the core workflow of QSeaBattle: defining a game layout, selecting players, running tournaments, and evaluating performance.
1. Game layout
The GameLayout defines the problem instance:
- field_size determines the number of battlefield cells (n2 = field_size**2).
- comms_size determines how many bits Player A may communicate to Player B.
Most players assume uniform random inputs and operate on flattened binary vectors.
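To make the sizes concrete, the sketch below builds one random instance with NumPy rather than with the QSeaBattle classes. The variable names and the all-zero placeholder message are illustrative assumptions, not the library's internal representation.

```python
import numpy as np

field_size = 4            # side length of the square battlefield
comms_size = 1            # bits Player A may send to Player B
n2 = field_size ** 2      # number of battlefield cells

rng = np.random.default_rng(seed=0)
# Uniform random binary field, flattened to a vector of length n2.
field = rng.integers(0, 2, size=(field_size, field_size)).reshape(-1)
# A message is a binary vector of length comms_size (placeholder content).
message = np.zeros(comms_size, dtype=int)

print(n2, field.shape, message.shape)
```

With field_size=4 this gives a 16-element field vector and a 1-element message.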
2. Players and assistance models
Players are always created via a factory (e.g. SimplePlayers, MajorityPlayers,
PRAssistedPlayers, NeuralNetPlayers).
The factory:
- Validates compatibility with the layout.
- Constructs Player A and Player B.
- Owns any shared resources (classical, PR-assisted, or learned).
You can switch strategies by changing only the factory class.
3. Running a tournament
A Tournament repeatedly samples random game instances and lets the players interact
with the environment.
The output is a TournamentLog containing:
- Win rate
- Per-game outcomes
- Optional player log-probabilities (if supported)
This separation makes it easy to benchmark different player types under identical conditions.
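The tournament loop can be illustrated with a self-contained toy version that does not use the QSeaBattle API. It assumes the following rules, which are an assumption on our part but agree with the noise-free reference win rates printed later in this notebook: Player A sees the whole field, Player B sees only a uniformly random gun position, A sends one bit, and the game is won when B's shoot decision equals the field value at the gun. The function names toy_tournament, simple_strategy, and majority_strategy are hypothetical.

```python
import numpy as np

def simple_strategy(field, gun, rng):
    """A sends cell 0; B uses it only when the gun points at cell 0."""
    message = int(field[0])
    return message if gun == 0 else int(rng.integers(0, 2))

def majority_strategy(field, gun, rng):
    """A sends the majority value of the field; B always follows it."""
    return int(2 * field.sum() > field.size)

def toy_tournament(strategy, field_size, n_games, seed=0):
    """Play n_games random instances; return win rate and per-game outcomes."""
    rng = np.random.default_rng(seed)
    n2 = field_size ** 2
    outcomes = np.empty(n_games, dtype=int)
    for g in range(n_games):
        field = rng.integers(0, 2, size=n2)   # uniform random field
        gun = int(rng.integers(0, n2))        # uniform random gun position
        shoot = strategy(field, gun, rng)
        outcomes[g] = int(shoot == field[gun])
    return outcomes.mean(), outcomes

results = {}
for name, strategy in [("simple", simple_strategy), ("majority", majority_strategy)]:
    win_rate, outcomes = toy_tournament(strategy, field_size=4, n_games=20000)
    results[name] = win_rate
    print(f"{name}: win rate {win_rate:.3f}")
```

Only the strategy function changes between the two runs; the sampling and scoring code is shared, which mirrors the idea of benchmarking different player types under identical conditions.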
4. Interpreting results
When comparing strategies, focus on:
- Relative performance across layouts.
- Sensitivity to noise or adversarial parameters.
- Gaps between classical, PR-assisted, and learned strategies.
The goal is not a single "best" player, but understanding why certain resources provide an advantage.
import sys
import pandas as pd
sys.path.append("../src")
import Q_Sea_Battle as QSB
field_sizes = [4,8,16,32]
number_of_games_in_tournament = 10000
channel_noise_levels = [0.0, 0.1, 0.3, 0.5]
results_list = []
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )
        env = QSB.GameEnv(layout)
        players = QSB.Players(layout)
        tournament = QSB.Tournament(env, players, layout)
        log = tournament.tournament()
        mean_reward, std_err = log.outcome()
        # Store results in list
        results_list.append({
            'player_type': 'base',
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': 0.5,
            'in_interval': (mean_reward - 1.96 * std_err <= 0.5 <= mean_reward + 1.96 * std_err)
        })
# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)
print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))
print("="*100)
Tournament simulations completed.
====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT
====================================================================================================
player_type field_size noise_level performance 95p error +/- reference in_interval
base 4 0.0000 0.4994 0.0098 0.5000 True
base 4 0.1000 0.4948 0.0098 0.5000 True
base 4 0.3000 0.5004 0.0098 0.5000 True
base 4 0.5000 0.4960 0.0098 0.5000 True
base 8 0.0000 0.5022 0.0098 0.5000 True
base 8 0.1000 0.4984 0.0098 0.5000 True
base 8 0.3000 0.4984 0.0098 0.5000 True
base 8 0.5000 0.5058 0.0098 0.5000 True
base 16 0.0000 0.4990 0.0098 0.5000 True
base 16 0.1000 0.4931 0.0098 0.5000 True
base 16 0.3000 0.4953 0.0098 0.5000 True
base 16 0.5000 0.5097 0.0098 0.5000 True
base 32 0.0000 0.4972 0.0098 0.5000 True
base 32 0.1000 0.4991 0.0098 0.5000 True
base 32 0.3000 0.4961 0.0098 0.5000 True
base 32 0.5000 0.5009 0.0098 0.5000 True
====================================================================================================
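The "95p error +/-" column can be reproduced by hand. The sketch below assumes that each game yields a 0/1 reward and that std_err is the standard error of the mean win rate; the helper name half_width_95 is hypothetical and is not part of QSeaBattle.

```python
import math

def half_width_95(win_rate, n_games):
    """Half-width of a normal-approximation 95% interval for a win rate."""
    std_err = math.sqrt(win_rate * (1.0 - win_rate) / n_games)
    return 1.96 * std_err

print(f"{half_width_95(0.5, 10000):.4f}")   # 0.0098
print(f"{half_width_95(0.9, 10000):.4f}")   # 0.0059
```

A win rate near 0.5 over 10000 games gives the 0.0098 shown in the table; the interval narrows as the win rate moves away from 0.5.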
results_list = []
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )
        env = QSB.GameEnv(layout)
        players = QSB.SimplePlayers(layout)
        tournament = QSB.Tournament(env, players, layout)
        log = tournament.tournament()
        mean_reward, std_err = log.outcome()
        ref = QSB.expected_win_rate_simple(field_size=field_size,
                                           comms_size=1,
                                           channel_noise=noise_level)
        # Store results in list
        results_list.append({
            'player_type': 'simple',
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': ref,
            'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
        })
# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)
print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))
print("="*100)
Tournament simulations completed.
====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT
====================================================================================================
player_type field_size noise_level performance 95p error +/- reference in_interval
simple 4 0.0000 0.5291 0.0098 0.5312 True
simple 4 0.1000 0.5204 0.0098 0.5250 True
simple 4 0.3000 0.5125 0.0098 0.5125 True
simple 4 0.5000 0.5048 0.0098 0.5000 True
simple 8 0.0000 0.5167 0.0098 0.5078 True
simple 8 0.1000 0.5075 0.0098 0.5062 True
simple 8 0.3000 0.5054 0.0098 0.5031 True
simple 8 0.5000 0.4975 0.0098 0.5000 True
simple 16 0.0000 0.5073 0.0098 0.5020 True
simple 16 0.1000 0.5035 0.0098 0.5016 True
simple 16 0.3000 0.5064 0.0098 0.5008 True
simple 16 0.5000 0.5054 0.0098 0.5000 True
simple 32 0.0000 0.5005 0.0098 0.5005 True
simple 32 0.1000 0.4994 0.0098 0.5004 True
simple 32 0.3000 0.4993 0.0098 0.5002 True
simple 32 0.5000 0.4957 0.0098 0.5000 True
====================================================================================================
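The reference column above is consistent with a simple closed form for one communicated bit: the transmitted cell is the gun position with probability 1/n2, and a channel that flips the bit with probability channel_noise scales the advantage by (1 - 2 * channel_noise). This formula is inferred from the printed numbers, not taken from the implementation of QSB.expected_win_rate_simple, and the function name below is hypothetical.

```python
def simple_win_rate(field_size, channel_noise=0.0):
    """Inferred win rate for one transmitted cell over a bit-flip channel."""
    n2 = field_size ** 2
    p_hit = 1.0 / n2   # probability the gun points at the transmitted cell
    return 0.5 + 0.5 * p_hit * (1.0 - 2.0 * channel_noise)

for field_size, noise in [(4, 0.0), (8, 0.1), (32, 0.3)]:
    print(field_size, noise, f"{simple_win_rate(field_size, noise):.5f}")
```

The three printed values match the corresponding reference entries (0.5312, 0.5062, 0.5002) to the four decimals shown in the table.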
results_list = []
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )
        env = QSB.GameEnv(layout)
        players = QSB.MajorityPlayers(layout)
        tournament = QSB.Tournament(env, players, layout)
        log = tournament.tournament()
        mean_reward, std_err = log.outcome()
        ref = QSB.expected_win_rate_majority(field_size=field_size,
                                             comms_size=1,
                                             channel_noise=noise_level)
        # Store results in list
        results_list.append({
            'player_type': 'majority',
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': ref,
            'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
        })
# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)
print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))
print("="*100)
Tournament simulations completed.
====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT
====================================================================================================
player_type field_size noise_level performance 95p error +/- reference in_interval
majority 4 0.0000 0.5983 0.0096 0.5982 True
majority 4 0.1000 0.5817 0.0097 0.5786 True
majority 4 0.3000 0.5389 0.0098 0.5393 True
majority 4 0.5000 0.5044 0.0098 0.5000 True
majority 8 0.0000 0.5511 0.0097 0.5497 True
majority 8 0.1000 0.5389 0.0098 0.5397 True
majority 8 0.3000 0.5322 0.0098 0.5199 False
majority 8 0.5000 0.5028 0.0098 0.5000 True
majority 16 0.0000 0.5294 0.0098 0.5249 True
majority 16 0.1000 0.5273 0.0098 0.5199 True
majority 16 0.3000 0.5164 0.0098 0.5100 True
majority 16 0.5000 0.4926 0.0098 0.5000 True
majority 32 0.0000 0.5177 0.0098 0.5125 True
majority 32 0.1000 0.5133 0.0098 0.5100 True
majority 32 0.3000 0.5013 0.0098 0.5050 True
majority 32 0.5000 0.4947 0.0098 0.5000 True
====================================================================================================
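The majority reference values are consistent with the following reasoning: if k of the n2 cells are set, following the majority bit wins with probability max(k, n2 - k) / n2 = 0.5 + |k - n2/2| / n2, and for k ~ Binomial(n2, 1/2) with even n2 the mean of |k - n2/2| / n2 equals C(n2, n2/2) / 2**(n2 + 1). Bit-flip noise again scales the advantage by (1 - 2 * channel_noise). This is a sketch inferred from the printed numbers, not the implementation of QSB.expected_win_rate_majority, and the function name is hypothetical.

```python
import math

def majority_win_rate(field_size, channel_noise=0.0):
    """Inferred win rate when A sends the field's majority bit (n2 even)."""
    n2 = field_size ** 2
    # Mean of |k - n2/2| / n2 for k ~ Binomial(n2, 1/2); exact integer arithmetic
    # keeps the large binomial coefficient and power of two from overflowing.
    advantage = math.comb(n2, n2 // 2) / 2 ** (n2 + 1)
    return 0.5 + advantage * (1.0 - 2.0 * channel_noise)

for field_size, noise in [(4, 0.0), (4, 0.1), (8, 0.3), (32, 0.0)]:
    print(field_size, noise, f"{majority_win_rate(field_size, noise):.4f}")
```

These four values agree with the reference column (0.5982, 0.5786, 0.5199, 0.5125) to the precision shown.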
results_list = []
p_high = 1.0
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )
        env = QSB.GameEnv(layout)
        players = QSB.PRAssistedPlayers(game_layout=layout, p_high=p_high)
        tournament = QSB.Tournament(env, players, layout)
        log = tournament.tournament()
        mean_reward, std_err = log.outcome()
        ref = QSB.expected_win_rate_assisted(field_size=field_size,
                                             comms_size=1,
                                             channel_noise=noise_level,
                                             p_high=p_high)
        # Store results in list
        results_list.append({
            'player_type': 'assisted/P_high= ' + str(p_high),
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': ref,
            'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
        })
# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)
print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT.")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))
print("="*100)
Tournament simulations completed.
====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT.
====================================================================================================
player_type field_size noise_level performance 95p error +/- reference in_interval
assisted/P_high= 1.0 4 0.0000 1.0000 0.0000 1.0000 True
assisted/P_high= 1.0 4 0.1000 0.9004 0.0059 0.9000 True
assisted/P_high= 1.0 4 0.3000 0.6977 0.0090 0.7000 True
assisted/P_high= 1.0 4 0.5000 0.4896 0.0098 0.5000 False
assisted/P_high= 1.0 8 0.0000 1.0000 0.0000 1.0000 True
assisted/P_high= 1.0 8 0.1000 0.9039 0.0058 0.9000 True
assisted/P_high= 1.0 8 0.3000 0.7084 0.0089 0.7000 True
assisted/P_high= 1.0 8 0.5000 0.4970 0.0098 0.5000 True
assisted/P_high= 1.0 16 0.0000 1.0000 0.0000 1.0000 True
assisted/P_high= 1.0 16 0.1000 0.8993 0.0059 0.9000 True
assisted/P_high= 1.0 16 0.3000 0.7015 0.0090 0.7000 True
assisted/P_high= 1.0 16 0.5000 0.5047 0.0098 0.5000 True
assisted/P_high= 1.0 32 0.0000 1.0000 0.0000 1.0000 True
assisted/P_high= 1.0 32 0.1000 0.8955 0.0060 0.9000 True
assisted/P_high= 1.0 32 0.3000 0.7003 0.0090 0.7000 True
assisted/P_high= 1.0 32 0.5000 0.4958 0.0098 0.5000 True
====================================================================================================
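An occasional False in the in_interval column is expected rather than alarming. Assuming the 16 rows of a table are independent and each 95% interval misses its reference with probability 0.05, the short calculation below gives the expected number of misses per table.

```python
n_rows = 16        # rows per results table
miss_rate = 0.05   # nominal miss probability of a 95% interval

expected_misses = n_rows * miss_rate
p_at_least_one = 1.0 - (1.0 - miss_rate) ** n_rows

print(f"expected misses per table: {expected_misses:.2f}")   # 0.80
print(f"P(at least one miss): {p_at_least_one:.2f}")         # 0.56
```

Under these assumptions, more than half of all 16-row tables would show at least one False even when the implementation matches the reference exactly.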
results_list = []
p_high = 0.85
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )
        env = QSB.GameEnv(layout)
        players = QSB.PRAssistedPlayers(game_layout=layout, p_high=p_high)
        tournament = QSB.Tournament(env, players, layout)
        log = tournament.tournament()
        mean_reward, std_err = log.outcome()
        ref = QSB.expected_win_rate_assisted(field_size=field_size,
                                             comms_size=1,
                                             channel_noise=noise_level,
                                             p_high=p_high)
        # Store results in list
        results_list.append({
            'player_type': 'assisted/P_high= ' + str(p_high),
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': ref,
            'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
        })
# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)
print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT.")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))
print("="*100)
Tournament simulations completed.
====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT.
====================================================================================================
player_type field_size noise_level performance 95p error +/- reference in_interval
assisted/P_high= 0.85 4 0.0000 0.6225 0.0095 0.6200 True
assisted/P_high= 0.85 4 0.1000 0.5973 0.0096 0.5960 True
assisted/P_high= 0.85 4 0.3000 0.5472 0.0098 0.5480 True
assisted/P_high= 0.85 4 0.5000 0.5027 0.0098 0.5000 True
assisted/P_high= 0.85 8 0.0000 0.5573 0.0097 0.5588 True
assisted/P_high= 0.85 8 0.1000 0.5532 0.0097 0.5471 True
assisted/P_high= 0.85 8 0.3000 0.5244 0.0098 0.5235 True
assisted/P_high= 0.85 8 0.5000 0.5030 0.0098 0.5000 True
assisted/P_high= 0.85 16 0.0000 0.5256 0.0098 0.5288 True
assisted/P_high= 0.85 16 0.1000 0.5264 0.0098 0.5231 True
assisted/P_high= 0.85 16 0.3000 0.4982 0.0098 0.5115 False
assisted/P_high= 0.85 16 0.5000 0.4983 0.0098 0.5000 True
assisted/P_high= 0.85 32 0.0000 0.5107 0.0098 0.5141 True
assisted/P_high= 0.85 32 0.1000 0.5188 0.0098 0.5113 True
assisted/P_high= 0.85 32 0.3000 0.5036 0.0098 0.5056 True
assisted/P_high= 0.85 32 0.5000 0.5053 0.0098 0.5000 True
====================================================================================================
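The assisted reference values for both p_high = 1.0 and p_high = 0.85 are consistent with an advantage that shrinks by a factor (2 * p_high - 1) for each of log2(n2) levels, then by (1 - 2 * channel_noise) for the noisy channel. This closed form is inferred from the printed tables; reading the exponent as a number of pairwise reduction levels is our assumption, it presumes field_size is a power of two, and it is not taken from QSB.expected_win_rate_assisted. The function name is hypothetical.

```python
def assisted_win_rate(field_size, p_high, channel_noise=0.0):
    """Inferred win rate for assisted players (field_size a power of two)."""
    n2 = field_size ** 2
    depth = n2.bit_length() - 1   # log2(n2) for a power of two
    bias = (2.0 * p_high - 1.0) ** depth
    return 0.5 + 0.5 * bias * (1.0 - 2.0 * channel_noise)

for field_size, p_high, noise in [(4, 1.0, 0.1), (4, 0.85, 0.0),
                                  (8, 0.85, 0.1), (32, 0.85, 0.0)]:
    print(field_size, p_high, noise,
          f"{assisted_win_rate(field_size, p_high, noise):.4f}")
```

The printed values reproduce the reference entries 0.9000, 0.6200, 0.5471, and 0.5141. With p_high = 1.0 the bias term is 1 for every field size, which is why those references depend only on the noise level.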
results_list = []
field_size = 64
noise_level = 0.45
p_high_values = [n / 100 for n in range(75, 95)]
for p_high in p_high_values:
    # Layout: fixed field size and noise level, 1-bit communication; only p_high varies
    layout = QSB.GameLayout(
        field_size=field_size,
        comms_size=1,
        number_of_games_in_tournament=number_of_games_in_tournament,
        channel_noise=noise_level
    )
    env = QSB.GameEnv(layout)
    players = QSB.PRAssistedPlayers(game_layout=layout, p_high=p_high)
    tournament = QSB.Tournament(env, players, layout)
    log = tournament.tournament()
    mean_reward, std_err = log.outcome()
    ref = QSB.expected_win_rate_assisted(field_size=field_size,
                                         comms_size=1,
                                         channel_noise=noise_level,
                                         p_high=p_high)
    ic_bound = QSB.limit_from_mutual_information(field_size=field_size,
                                                 comms_size=1,
                                                 channel_noise=noise_level)
    # Store results in list
    results_list.append({
        'player_type': 'assisted/P_high= ' + str(p_high),
        'field_size': field_size,
        'noise_level': noise_level,
        'performance': mean_reward,
        'reference': ref,
        'information_constraint': ic_bound,
        'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
    })
# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)
print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT.")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))
print("="*100)
Tournament simulations completed.
====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT.
====================================================================================================
player_type field_size noise_level performance reference information_constraint in_interval
assisted/P_high= 0.75 64 0.4500 0.4881 0.5000 0.5008 False
assisted/P_high= 0.76 64 0.4500 0.4971 0.5000 0.5008 True
assisted/P_high= 0.77 64 0.4500 0.4997 0.5000 0.5008 True
assisted/P_high= 0.78 64 0.4500 0.5021 0.5000 0.5008 True
assisted/P_high= 0.79 64 0.4500 0.5054 0.5001 0.5008 True
assisted/P_high= 0.8 64 0.4500 0.4937 0.5001 0.5008 True
assisted/P_high= 0.81 64 0.4500 0.5008 0.5002 0.5008 True
assisted/P_high= 0.82 64 0.4500 0.4996 0.5002 0.5008 True
assisted/P_high= 0.83 64 0.4500 0.5032 0.5003 0.5008 True
assisted/P_high= 0.84 64 0.4500 0.4975 0.5005 0.5008 True
assisted/P_high= 0.85 64 0.4500 0.5014 0.5007 0.5008 True
assisted/P_high= 0.86 64 0.4500 0.5119 0.5010 0.5008 False
assisted/P_high= 0.87 64 0.4500 0.5016 0.5013 0.5008 True
assisted/P_high= 0.88 64 0.4500 0.5021 0.5019 0.5008 True
assisted/P_high= 0.89 64 0.4500 0.5048 0.5025 0.5008 True
assisted/P_high= 0.9 64 0.4500 0.5053 0.5034 0.5008 True
assisted/P_high= 0.91 64 0.4500 0.5098 0.5046 0.5008 True
assisted/P_high= 0.92 64 0.4500 0.5055 0.5062 0.5008 True
assisted/P_high= 0.93 64 0.4500 0.5106 0.5082 0.5008 True
assisted/P_high= 0.94 64 0.4500 0.5045 0.5108 0.5008 True
====================================================================================================
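The table shows the reference win rate crossing the information_constraint value between p_high = 0.85 and p_high = 0.86. The sketch below locates that crossing using a closed form inferred from the reference column (an assumption, not the library's code) and the constraint value 0.5008 copied from the table, which is rounded to four decimals. The function name is hypothetical.

```python
def assisted_reference(p_high, field_size=64, channel_noise=0.45):
    """Inferred reference win rate for the swept p_high values."""
    depth = (field_size ** 2).bit_length() - 1   # log2(n2) = 12 for field_size 64
    bias = (2.0 * p_high - 1.0) ** depth
    return 0.5 + 0.5 * bias * (1.0 - 2.0 * channel_noise)

ic_bound_printed = 0.5008   # copied from the table, rounded to 4 decimals
p_high_values = [n / 100 for n in range(75, 95)]
above = [p for p in p_high_values if assisted_reference(p) > ic_bound_printed]

print(f"first p_high above the printed constraint: {above[0]}")
```

Because the measured performance carries a 95% error of roughly 0.0098 at 10000 games, differences of this size (below 0.001) cannot be resolved from the simulated win rates in this table; only the reference column shows the crossing.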