QSeaBattle – Quick Start Guide¶

This notebook demonstrates the core workflow of QSeaBattle: defining a game layout, selecting players, running tournaments, and evaluating performance.

1. Game layout¶

The GameLayout defines the problem instance:

  • field_size determines the number of battlefield cells (n² = field_size**2).
  • comms_size determines how many bits Player A may communicate to Player B.

Most players assume uniform random inputs and operate on flattened binary vectors.
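A minimal illustration of this input format (this snippet uses no QSeaBattle API; the grid-to-vector mapping is an assumption based on the n² = field_size**2 relation above):

```python
import random

# Illustrative only (not part of the QSeaBattle API): a square field of
# side `field_size` with uniform random binary cells, flattened into a
# vector of n^2 = field_size**2 entries.
field_size = 4
rng = random.Random(0)
grid = [[rng.randint(0, 1) for _ in range(field_size)] for _ in range(field_size)]
flat = [cell for row in grid for cell in row]  # flattened binary vector

print(len(flat))  # field_size**2 = 16
```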

2. Players and assistance models¶

Players are always created via a factory (e.g. SimplePlayers, MajorityPlayers, PRAssistedPlayers, NeuralNetPlayers).

The factory:

  • Validates compatibility with the layout.
  • Constructs Player A and Player B.
  • Owns any shared resources (classical, PR-assisted, or learned).

You can switch strategies by changing only the factory class.
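The factory responsibilities listed above can be sketched with a toy class (all names here are invented for illustration; this is not a QSeaBattle class):

```python
# Toy sketch of the factory pattern described above: validate the layout,
# own the shared resource, construct both players around it.
class ToyPlayerFactory:
    def __init__(self, layout):
        # Validate compatibility with the layout.
        if layout["comms_size"] < 1:
            raise ValueError("need at least one bit of communication")
        # Shared resource owned by the factory (classical, PR-assisted, or learned).
        self.shared = {"strategy_table": None}
        # Construct Player A and Player B around the shared resource.
        self.player_a = ("A", self.shared)
        self.player_b = ("B", self.shared)

factory = ToyPlayerFactory({"field_size": 4, "comms_size": 1})
```

Swapping in a different factory class would change the shared resource and player construction while the tournament code stays the same.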

3. Running a tournament¶

A Tournament repeatedly samples random game instances and lets the players interact with the environment.

The output is a TournamentLog containing:

  • Win rate
  • Per-game outcomes
  • Optional player log-probabilities (if supported)

This separation makes it easy to benchmark different player types under identical conditions.
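The loop a Tournament runs can be sketched with a toy stand-in (the game rules below are invented for illustration and are NOT the QSeaBattle rules; only the sample-play-aggregate structure is the point):

```python
import random

# Toy stand-in for the Tournament loop: repeatedly sample a random game
# instance, let the player interact with it, and aggregate a win rate.
def run_tournament(player, field_size, n_games, seed=0):
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_games):
        target = rng.randrange(field_size)  # sample a random game instance
        shot = player(field_size, rng)      # player acts on the instance
        wins += (shot == target)
    return wins / n_games                   # summary a TournamentLog would hold

# A uniformly random player on a 4-cell toy field wins about 1/4 of games:
rate = run_tournament(lambda n, rng: rng.randrange(n), field_size=4, n_games=10_000)
```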

4. Interpreting results¶

When comparing strategies, focus on:

  • Relative performance across layouts.
  • Sensitivity to noise or adversarial parameters.
  • Gaps between classical, PR-assisted, and learned strategies.

The goal is not a single "best" player, but understanding why certain resources provide an advantage.
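A quick sanity check on the error bars reported below, assuming the standard error of a Bernoulli win rate, std_err = sqrt(p(1-p)/N); at p ≈ 0.5 this reproduces the ~0.0098 values in the '95p error +/-' column:

```python
import math

# Worst-case (p = 0.5) 95% margin for a win rate estimated from N games,
# assuming std_err = sqrt(p*(1-p)/N).
N = 10_000
p = 0.5
std_err = math.sqrt(p * (1 - p) / N)  # 0.005
margin = 1.96 * std_err               # 0.0098
print(round(margin, 4))
```

Since each 'in_interval' entry is a 95% test, roughly 1 row in 20 can fall outside the interval even when the reference value is exact, so isolated False entries are expected.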

In [9]:
import sys
import pandas as pd
sys.path.append("../src")
import Q_Sea_Battle as QSB
In [10]:
field_sizes = [4, 8, 16, 32]
number_of_games_in_tournament = 10000
channel_noise_levels = [0.0, 0.1, 0.3, 0.5]
In [11]:
results_list = []
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )

        env = QSB.GameEnv(layout)
        players = QSB.Players(layout)
        tournament = QSB.Tournament(env, players, layout)

        log = tournament.tournament()
        mean_reward, std_err = log.outcome()

        # Store results in list
        results_list.append({
            'player_type': 'base',
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': 0.5,
            'in_interval': (mean_reward - 1.96 * std_err <= 0.5 <= mean_reward + 1.96 * std_err)
        })

# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)

print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))

print("="*100)
Tournament simulations completed.

====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT
====================================================================================================
player_type  field_size  noise_level  performance  95p error +/-  reference  in_interval
       base           4       0.0000       0.4994         0.0098     0.5000         True
       base           4       0.1000       0.4948         0.0098     0.5000         True
       base           4       0.3000       0.5004         0.0098     0.5000         True
       base           4       0.5000       0.4960         0.0098     0.5000         True
       base           8       0.0000       0.5022         0.0098     0.5000         True
       base           8       0.1000       0.4984         0.0098     0.5000         True
       base           8       0.3000       0.4984         0.0098     0.5000         True
       base           8       0.5000       0.5058         0.0098     0.5000         True
       base          16       0.0000       0.4990         0.0098     0.5000         True
       base          16       0.1000       0.4931         0.0098     0.5000         True
       base          16       0.3000       0.4953         0.0098     0.5000         True
       base          16       0.5000       0.5097         0.0098     0.5000         True
       base          32       0.0000       0.4972         0.0098     0.5000         True
       base          32       0.1000       0.4991         0.0098     0.5000         True
       base          32       0.3000       0.4961         0.0098     0.5000         True
       base          32       0.5000       0.5009         0.0098     0.5000         True
====================================================================================================
In [12]:
results_list = []
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )

        env = QSB.GameEnv(layout)
        players = QSB.SimplePlayers(layout)
        tournament = QSB.Tournament(env, players, layout)

        log = tournament.tournament()
        mean_reward, std_err = log.outcome()

        ref = QSB.expected_win_rate_simple(
            field_size=field_size,
            comms_size=1,
            channel_noise=noise_level
        )

        # Store results in list
        results_list.append({
            'player_type': 'simple',
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': ref,
            'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
        })

# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)

print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))

print("="*100)
Tournament simulations completed.

====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT
====================================================================================================
player_type  field_size  noise_level  performance  95p error +/-  reference  in_interval
     simple           4       0.0000       0.5291         0.0098     0.5312         True
     simple           4       0.1000       0.5204         0.0098     0.5250         True
     simple           4       0.3000       0.5125         0.0098     0.5125         True
     simple           4       0.5000       0.5048         0.0098     0.5000         True
     simple           8       0.0000       0.5167         0.0098     0.5078         True
     simple           8       0.1000       0.5075         0.0098     0.5062         True
     simple           8       0.3000       0.5054         0.0098     0.5031         True
     simple           8       0.5000       0.4975         0.0098     0.5000         True
     simple          16       0.0000       0.5073         0.0098     0.5020         True
     simple          16       0.1000       0.5035         0.0098     0.5016         True
     simple          16       0.3000       0.5064         0.0098     0.5008         True
     simple          16       0.5000       0.5054         0.0098     0.5000         True
     simple          32       0.0000       0.5005         0.0098     0.5005         True
     simple          32       0.1000       0.4994         0.0098     0.5004         True
     simple          32       0.3000       0.4993         0.0098     0.5002         True
     simple          32       0.5000       0.4957         0.0098     0.5000         True
====================================================================================================
In [13]:
results_list = []
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )

        env = QSB.GameEnv(layout)
        players = QSB.MajorityPlayers(layout)
        tournament = QSB.Tournament(env, players, layout)

        log = tournament.tournament()
        mean_reward, std_err = log.outcome()

        ref = QSB.expected_win_rate_majority(
            field_size=field_size,
            comms_size=1,
            channel_noise=noise_level
        )

        # Store results in list
        results_list.append({
            'player_type': 'majority',
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': ref,
            'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
        })

# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)

print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))

print("="*100)
Tournament simulations completed.

====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT
====================================================================================================
player_type  field_size  noise_level  performance  95p error +/-  reference  in_interval
   majority           4       0.0000       0.5983         0.0096     0.5982         True
   majority           4       0.1000       0.5817         0.0097     0.5786         True
   majority           4       0.3000       0.5389         0.0098     0.5393         True
   majority           4       0.5000       0.5044         0.0098     0.5000         True
   majority           8       0.0000       0.5511         0.0097     0.5497         True
   majority           8       0.1000       0.5389         0.0098     0.5397         True
   majority           8       0.3000       0.5322         0.0098     0.5199        False
   majority           8       0.5000       0.5028         0.0098     0.5000         True
   majority          16       0.0000       0.5294         0.0098     0.5249         True
   majority          16       0.1000       0.5273         0.0098     0.5199         True
   majority          16       0.3000       0.5164         0.0098     0.5100         True
   majority          16       0.5000       0.4926         0.0098     0.5000         True
   majority          32       0.0000       0.5177         0.0098     0.5125         True
   majority          32       0.1000       0.5133         0.0098     0.5100         True
   majority          32       0.3000       0.5013         0.0098     0.5050         True
   majority          32       0.5000       0.4947         0.0098     0.5000         True
====================================================================================================
In [14]:
results_list = []
p_high = 1.0
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )

        env = QSB.GameEnv(layout)
        players = QSB.PRAssistedPlayers(game_layout=layout, p_high=p_high)
        tournament = QSB.Tournament(env, players, layout)

        log = tournament.tournament()
        mean_reward, std_err = log.outcome()

        ref = QSB.expected_win_rate_assisted(
            field_size=field_size,
            comms_size=1,
            channel_noise=noise_level,
            p_high=p_high
        )

        # Store results in list
        results_list.append({
            'player_type': 'assisted/P_high= '+str(p_high),
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': ref,
            'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
        })

# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)

print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT.")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))

print("="*100)
Tournament simulations completed.

====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT.
====================================================================================================
         player_type  field_size  noise_level  performance  95p error +/-  reference  in_interval
assisted/P_high= 1.0           4       0.0000       1.0000         0.0000     1.0000         True
assisted/P_high= 1.0           4       0.1000       0.9004         0.0059     0.9000         True
assisted/P_high= 1.0           4       0.3000       0.6977         0.0090     0.7000         True
assisted/P_high= 1.0           4       0.5000       0.4896         0.0098     0.5000        False
assisted/P_high= 1.0           8       0.0000       1.0000         0.0000     1.0000         True
assisted/P_high= 1.0           8       0.1000       0.9039         0.0058     0.9000         True
assisted/P_high= 1.0           8       0.3000       0.7084         0.0089     0.7000         True
assisted/P_high= 1.0           8       0.5000       0.4970         0.0098     0.5000         True
assisted/P_high= 1.0          16       0.0000       1.0000         0.0000     1.0000         True
assisted/P_high= 1.0          16       0.1000       0.8993         0.0059     0.9000         True
assisted/P_high= 1.0          16       0.3000       0.7015         0.0090     0.7000         True
assisted/P_high= 1.0          16       0.5000       0.5047         0.0098     0.5000         True
assisted/P_high= 1.0          32       0.0000       1.0000         0.0000     1.0000         True
assisted/P_high= 1.0          32       0.1000       0.8955         0.0060     0.9000         True
assisted/P_high= 1.0          32       0.3000       0.7003         0.0090     0.7000         True
assisted/P_high= 1.0          32       0.5000       0.4958         0.0098     0.5000         True
====================================================================================================
In [15]:
results_list = []
p_high = 0.85
for field_size in field_sizes:
    for noise_level in channel_noise_levels:
        # Layout: variable field size, 1-bit communication, fixed number of games
        layout = QSB.GameLayout(
            field_size=field_size,
            comms_size=1,
            number_of_games_in_tournament=number_of_games_in_tournament,
            channel_noise=noise_level
        )

        env = QSB.GameEnv(layout)
        players = QSB.PRAssistedPlayers(game_layout=layout, p_high=p_high)
        tournament = QSB.Tournament(env, players, layout)

        log = tournament.tournament()
        mean_reward, std_err = log.outcome()

        ref = QSB.expected_win_rate_assisted(
            field_size=field_size,
            comms_size=1,
            channel_noise=noise_level,
            p_high=p_high
        )

        # Store results in list
        results_list.append({
            'player_type': 'assisted/P_high= '+str(p_high),
            'field_size': field_size,
            'noise_level': noise_level,
            'performance': mean_reward,
            '95p error +/-': 1.96 * std_err,
            'reference': ref,
            'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
        })

# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)

print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT.")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))

print("="*100)
Tournament simulations completed.

====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT.
====================================================================================================
          player_type  field_size  noise_level  performance  95p error +/-  reference  in_interval
assisted/P_high= 0.85           4       0.0000       0.6225         0.0095     0.6200         True
assisted/P_high= 0.85           4       0.1000       0.5973         0.0096     0.5960         True
assisted/P_high= 0.85           4       0.3000       0.5472         0.0098     0.5480         True
assisted/P_high= 0.85           4       0.5000       0.5027         0.0098     0.5000         True
assisted/P_high= 0.85           8       0.0000       0.5573         0.0097     0.5588         True
assisted/P_high= 0.85           8       0.1000       0.5532         0.0097     0.5471         True
assisted/P_high= 0.85           8       0.3000       0.5244         0.0098     0.5235         True
assisted/P_high= 0.85           8       0.5000       0.5030         0.0098     0.5000         True
assisted/P_high= 0.85          16       0.0000       0.5256         0.0098     0.5288         True
assisted/P_high= 0.85          16       0.1000       0.5264         0.0098     0.5231         True
assisted/P_high= 0.85          16       0.3000       0.4982         0.0098     0.5115        False
assisted/P_high= 0.85          16       0.5000       0.4983         0.0098     0.5000         True
assisted/P_high= 0.85          32       0.0000       0.5107         0.0098     0.5141         True
assisted/P_high= 0.85          32       0.1000       0.5188         0.0098     0.5113         True
assisted/P_high= 0.85          32       0.3000       0.5036         0.0098     0.5056         True
assisted/P_high= 0.85          32       0.5000       0.5053         0.0098     0.5000         True
====================================================================================================
In [16]:
results_list = []
field_size = 64
noise_level = 0.45
p_high_values = [n / 100 for n in range(75, 95)]
for p_high in p_high_values:
    # Layout: variable field size, 1-bit communication, fixed number of games
    layout = QSB.GameLayout(
        field_size=field_size,
        comms_size=1,
        number_of_games_in_tournament=number_of_games_in_tournament,
        channel_noise=noise_level
    )

    env = QSB.GameEnv(layout)
    players = QSB.PRAssistedPlayers(game_layout=layout, p_high=p_high)
    tournament = QSB.Tournament(env, players, layout)

    log = tournament.tournament()
    mean_reward, std_err = log.outcome()

    ref = QSB.expected_win_rate_assisted(
        field_size=field_size,
        comms_size=1,
        channel_noise=noise_level,
        p_high=p_high
    )

    ic_bound = QSB.limit_from_mutual_information(
        field_size=field_size,
        comms_size=1,
        channel_noise=noise_level
    )

    # Store results in list
    results_list.append({
        'player_type': 'assisted/P_high= '+str(p_high),
        'field_size': field_size,
        'noise_level': noise_level,
        'performance': mean_reward,
        'reference': ref,
        'information_constraint': ic_bound,
        'in_interval': (mean_reward - 1.96 * std_err <= ref <= mean_reward + 1.96 * std_err)
    })

# Create DataFrame from collected results
results_df = pd.DataFrame(results_list)

print("Tournament simulations completed.")
print("\n" + "="*100)
print(f"RESULTS SUMMARY FOR {number_of_games_in_tournament} GAMES PER TOURNAMENT.")
print("="*100)
print(results_df.to_string(index=False, float_format='%.4f'))

print("="*100)
Tournament simulations completed.

====================================================================================================
RESULTS SUMMARY FOR 10000 GAMES PER TOURNAMENT.
====================================================================================================
          player_type  field_size  noise_level  performance  reference  information_constraint  in_interval
assisted/P_high= 0.75          64       0.4500       0.4881     0.5000                  0.5008        False
assisted/P_high= 0.76          64       0.4500       0.4971     0.5000                  0.5008         True
assisted/P_high= 0.77          64       0.4500       0.4997     0.5000                  0.5008         True
assisted/P_high= 0.78          64       0.4500       0.5021     0.5000                  0.5008         True
assisted/P_high= 0.79          64       0.4500       0.5054     0.5001                  0.5008         True
 assisted/P_high= 0.8          64       0.4500       0.4937     0.5001                  0.5008         True
assisted/P_high= 0.81          64       0.4500       0.5008     0.5002                  0.5008         True
assisted/P_high= 0.82          64       0.4500       0.4996     0.5002                  0.5008         True
assisted/P_high= 0.83          64       0.4500       0.5032     0.5003                  0.5008         True
assisted/P_high= 0.84          64       0.4500       0.4975     0.5005                  0.5008         True
assisted/P_high= 0.85          64       0.4500       0.5014     0.5007                  0.5008         True
assisted/P_high= 0.86          64       0.4500       0.5119     0.5010                  0.5008        False
assisted/P_high= 0.87          64       0.4500       0.5016     0.5013                  0.5008         True
assisted/P_high= 0.88          64       0.4500       0.5021     0.5019                  0.5008         True
assisted/P_high= 0.89          64       0.4500       0.5048     0.5025                  0.5008         True
 assisted/P_high= 0.9          64       0.4500       0.5053     0.5034                  0.5008         True
assisted/P_high= 0.91          64       0.4500       0.5098     0.5046                  0.5008         True
assisted/P_high= 0.92          64       0.4500       0.5055     0.5062                  0.5008         True
assisted/P_high= 0.93          64       0.4500       0.5106     0.5082                  0.5008         True
assisted/P_high= 0.94          64       0.4500       0.5045     0.5108                  0.5008         True
====================================================================================================