Using OODA to write battler AI

Started by DoubleX, July 05, 2016, 09:45:39 am


DoubleX

First, let's cite what OODA is:
- OODA loop
- The Tao of Boyd: How to master the OODA Loop
- Unlocking the power of Colonel John Boyd's OODA Loop

After reading all those, you, as a game developer, might wonder: How can we use OODA to write battler AI? Let's start by exploring the following oversimplified version:
(The below assumes that the battler AI is built to minimize the players' chance of winning. OODA will be used slightly differently if that's not the battler AI's ultimate goal.)

Observe
To observe is to receive as much relevant info as accurately as possible(including detecting fake info by verifying it). However, what can be observed by a battler?
(For the sake of writing battler AI, issues with fake info can be safely ignored as all relevant info can always be accurately received by any RMVXA battler AI.)
I think that any normal battler should be able to observe the opponents' current party/troop formations(like which actors/enemies they consist of), all their executed actions, and the allies' current statuses(like which states/buffs/debuffs they have and how much hp/mp/tp they have left).
Sometimes it could, however, be unfair for a battler to be able to observe the opponents' current capabilities(like classes, equips, parameters, extra parameters, special parameters, level and skill lists) or carried items.
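
To make the above more concrete, here's a minimal sketch of such an observation filter, written in Python rather than in RGSS3 just for illustration. The Battler fields, the 2 field lists and the observe function are all hypothetical names made up for this example, and the visibility rules are only 1 possible reading of the above.

```python
from dataclasses import dataclass, field

@dataclass
class Battler:
    name: str
    side: str                                     # "party" or "troop"
    hp: int
    states: list = field(default_factory=list)
    equips: list = field(default_factory=list)    # capability info
    last_action: str = ""

# Hypothetical visibility rules: which fields an observer may read.
ALLY_FIELDS = ("name", "hp", "states", "equips", "last_action")
OPPONENT_FIELDS = ("name", "last_action")         # formation + executed actions only

def observe(observer, battlers):
    """Return what `observer` is allowed to know about every other battler."""
    snapshot = {}
    for b in battlers:
        if b is observer:
            continue
        fields = ALLY_FIELDS if b.side == observer.side else OPPONENT_FIELDS
        snapshot[b.name] = {f: getattr(b, f) for f in fields}
    return snapshot

boss = Battler("Boss", "troop", 9000)
minion = Battler("Minion", "troop", 300, states=["poison"])
mage = Battler("Mage", "party", 120, equips=["staff"], last_action="Fire")
seen = observe(boss, [boss, minion, mage])
```

Here the boss can read the minion's hp and states, but only the mage's name and last executed action. Making the opponents' equips visible would just be a matter of adding that field to OPPONENT_FIELDS.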

For example, at the start of a battle, a boss facing a 6 actor party might be able to observe that all those actors are:
- heavily geared towards dealing damage with significant sacrifice from every other battle aspect, and/or
- mages only excelling at using magics, and/or
- fighters only excelling at using physical moves, and/or
- totally unresistant to some states/debuffs, and/or
- drastically faster than that boss(in action battle system, active time battle, charge turn battle, tactical battle system, etc), and/or something else.

Orient
To orient is to use mental models to make sense of the observed info(including judging its underlying implications and meanings) and fathom the current situation, in order to generate working action plans.
(For the sake of battler AI, mental models mean the complete frameworks consisting of all the known strategies and tactics, with all their known counters, that can be used by the allies, the opponents and the battlers themselves in the addressed RMVXA games.)
This process is affected by the following factors:
- New Information
- Previous Experience
- Cultural Traditions
- Genetic Heritage
- Analysis and Synthesis
For the sake of writing battler AI, cultural traditions and genetic heritage can be regarded as a battler's character if they have to be taken into account, or simply ignored if they don't.
About the rest:
- New information must always be taken into account. Otherwise the battler AI will fail to realize that the situations have changed, let alone adapt to them(or better yet, control them).
- Previous experience, which will be fed by the new information, must always be taken into account too. In terms of battler AI, it means the AI will have to store all the relevant info in the current battle so far. Sometimes the full pictures, like the opponents' strategies, can only be accurately formed by combining the previous experience with the new information.
- Analysis means breaking down the existing mental models into smaller pieces, while synthesis means using those smaller pieces to form new mental models that can accurately address the new situations. This implies that the battler AI should be written to include as many known strategies and tactics with as many of their counters as possible, and they should all be able to be broken down into independent modular pieces that can be integrated into new strategies and tactics with their own counters. It also means that the battler AI should be able to judge which pieces are called for and run the integrated strategies and tactics by combining all those pieces.
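
As a rough illustration of the analysis and synthesis idea, the below Python sketch keeps the tactics as independent pieces, each tagged with the opponent traits it counters, and then greedily combines the pieces until every observed trait is covered. The piece names, the trait names and the synthesize function are all made up for this example, and the greedy selection is just 1 simple way of combining pieces, not the only one.

```python
# Hypothetical building blocks: each piece names the opponent traits it counters.
PIECES = {
    "raise_defense": {"high_damage"},
    "silence_casters": {"magic_only"},
    "inflict_slow": {"high_speed", "high_damage"},
    "disarm": {"physical_only"},
}

def synthesize(observed_traits):
    """Greedily combine pieces until every observed trait has a counter."""
    remaining = set(observed_traits)
    plan = []
    while remaining:
        # Pick the piece covering the most still-uncountered traits.
        best = max(PIECES, key=lambda p: len(PIECES[p] & remaining))
        if not PIECES[best] & remaining:
            break  # nothing in the knowledge base counters what is left
        plan.append(best)
        remaining -= PIECES[best]
    return plan, remaining

plan, uncovered = synthesize({"high_damage", "high_speed"})
plan2, uncovered2 = synthesize({"magic_only", "unknown_trait"})
```

The second return value lists the traits that no known piece can counter, which is a hint that the knowledge base itself needs to be extended.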

For example:
- At the start of a battle, a boss fighting a 6 actor party observes that they're all heavily geared towards dealing damage with significant sacrifice from every other battle aspect.
By recalling all the known strategies and tactics with their counters, the boss AI will realize that it's extremely likely that the party will try to kill the boss as quickly as possible(strategy) by having as high damage throughput as possible(tactic).
The boss AI will then use all the known strategies and tactics to generate the ones that can counter the players' ones.
- In action battle system, active time battle, charge turn battle, tactical battle system, or any other battle system having the time dimension, a boss notices that in some situations, some actors just halt momentarily even when they should have acted instead(telling whether they should have acted itself needs orientation).
The boss AI will know that it's highly probable that the players aren't good at addressing those situations yet.
The boss AI will then use all the known strategies and tactics to generate the ones that can make those situations happen more frequently.

Orientation's the most important part as it determines how accurately the observations will be interpreted and how useful the generated solutions will be.
For the sake of the battler AI, orientation's also the hardest part to implement well, as it needs to store all the known strategies and tactics with all their known counters, as well as the algorithms decoding the opponents' ones and coming up with all the working counters.

Decide
To decide is to judge which of all the generated action plans will be the best calls for the addressed situations, which are already accurately fathomed. All these judgments should be treated as the best hypotheses, which are to be tested, rather than as absolutely correct choices.

For example:
- At the start of a battle, a boss fighting a 6 actor party figures out that the party is trying to kill the boss as quickly as possible(strategy) by having as high damage throughput as possible(tactic), and the boss AI has already generated all the counters that can work.
The boss AI then forms the hypothesis that "it's better to mix several counters together to confuse the players by obfuscating the intentions" and decides to build action plans around that.

Act
To act is to build the action plans around the decisions and test those action plans by executing them(either simultaneously or treating the rest as backups that can always be quickly called). Then observations will be needed to check if those action plans worked and which ones work the best(and they'll be the main ones until the new best comes) in order to "finish"(and then restart) the "infinite" loop.
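
The feedback from act back to observe can be sketched as below, in Python just for illustration. Every plan is treated as a hypothesis with a score, the best scored one is the main plan, and each execution result moves its score. The run_loop function, the 0.5 starting score and the 0.5 learning rate are all assumptions made up for this example.

```python
def run_loop(plans, execute, rounds):
    """plans: candidate action plan names; execute(plan) -> True if it worked."""
    # Every plan starts as an untested hypothesis with a neutral score.
    score = {p: 0.5 for p in plans}
    history = []
    for _ in range(rounds):
        main = max(plans, key=lambda p: score[p])   # decide: current best hypothesis
        worked = execute(main)                      # act: test it by executing it
        # observe + reorient: move the score toward the observed outcome
        score[main] += 0.5 * ((1.0 if worked else 0.0) - score[main])
        history.append(main)
    return history, score

# Toy battle where only "burst" ever works.
history, score = run_loop(["stall", "burst"], lambda plan: plan == "burst", 3)
```

In this toy run the AI first tries "stall", sees it fail, and then switches to "burst" as the new main plan, while "stall" remains as a backup with a lowered score.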

For example:
- A boss executes several counters together in a mixed manner to stop the players' strategies and tactics.
As the players realize they've been countered, they begin to adapt and use new strategies and tactics instead.
The boss AI notices that the players are adapting to the new situations, so the boss reorients in order to form new hypotheses and to build and execute new action plans that will work.

Tempo
When it comes to oppositions, the faster one can run the OODA loop effectively and efficiently(higher velocity) and the faster one can change its tempo(higher acceleration), the more advantageous one will usually be. By controlling that tempo well via changing it rapidly and surprisingly, it's even possible to get inside the others' OODA loops and consistently reset them to the observe phase, so they'll be confused and have to passively react to the new situations(they'll eventually end up being paralyzed in observation or taking rash actions that won't work at all), while the one gaining the upper hand can constantly have an accurate full picture of, and active control over, the new situations.

For example:
- At the start of a battle having 2 bosses(in action battle system, active time battle, charge turn battle, tactical battle system, or any other battle system having the time dimension), they act whenever they become able to, by making moves that don't need charging(time delay between attempting to make a move and actually making that move).
This possibly leads the players to think that "act whenever becoming able to" is those bosses' pattern, so the players build their action plans around that hypothesis.
Then suddenly those bosses delay for a long time even when they become able to act, which potentially lures the players into thinking that those bosses are trying to make moves with long charging times, so the players make moves(which need to charge) that can cancel opponents' charging moves.
But that's exactly what those bosses want the players to do, so those bosses instead make some other moves that don't need charging while the players are charging their moves.
The players now become confused and have to observe the new situations to form new mental models that can accurately address them, while those bosses can take advantage of the resets of the players' OODA loops.
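
The feint in this example could be sketched roughly as below, in Python just for illustration. The boss_tempo function, its parameters and the move names are all made up, and it's assumed that the boss can observe whether any player is currently charging a move.

```python
def boss_tempo(turn, players_charging, establish_turns=3):
    """Return the boss's choice each time its action gauge is full."""
    if turn < establish_turns:
        return "instant_move"   # build the "acts whenever ready" pattern
    if players_charging:
        return "instant_move"   # punish players who committed to a charge
    return "wait"               # feint: look like a long charge is coming

# The boss is ready 5 times; the players only start charging on the last one.
log = [boss_tempo(turn, charging)
       for turn, charging in enumerate([False, False, False, False, True])]
```

A real implementation would also need to vary establish_turns, otherwise the feint itself becomes a pattern that the players can read.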

Building such battler AI will be neither easy nor simple, even for AI professionals. The builders need to have a thorough understanding of the battle systems, the battles and all the possible battlers involved, as well as a fluent command of AI programming in general. Nevertheless, I still think that such AI would be incredibly challenging and extraordinarily difficult to beat if it could ever be built.

P.S.: Bear in mind that the aforementioned is still just an oversimplified version of the original OODA, which would certainly be overkill in writing battler AI.
My RMVXA/RMMV/RMMZ scripts/plugins: http://rpgmaker.net/users/DoubleX/scripts/

Blizzard

Even though simplified, this is a good post and can serve as introduction to AI for less experienced programmers and scripters.

The model can actually be enhanced by intermediate states, and the decision making can be implemented in many different ways. Approaches range from simple state machines, through fuzzy logic calculations, all the way to neural networks and genetic algorithms.
Check out Daygames and our games:

King of Booze 2      King of Booze: Never Ever
Drinking Game for Android      Never have I ever for Android
Drinking Game for iOS      Never have I ever for iOS


Quote from: winkioI do not speak to bricks, either as individuals or in wall form.

Quote from: Barney StinsonWhen I get sad, I stop being sad and be awesome instead. True story.

DoubleX

Actually I already have a basic draft on implementing OODA AI in RPGs using the default battle system(those in RMVXA, RMMV, etc), charge turn battle and active time battle.

Basically, implementing the observe section will be defining what can be observed under what conditions. The AI should be able to observe almost everything from the allies, but only limited information from the opponents.

Implementing the orient section will need a knowledge database(a list of strategies and a list of tactics), a memory storing past events in battles, and the result of the observation.
Then fuzzy logic can be used to give a rating on how much the current situation, combined with past experiences, manifests those strategies and tactics.
After that, fuzzy logic can also be applied to give a rating on how well a strategy/tactic(building block strategies/tactics or composite ones based on those building blocks) can counter the estimated strategies/tactics executed by opponents. To me, that's the hardest part(and tempo will also be taken into account).
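
As a very rough illustration of the first rating, the below Python sketch rates how much the remembered opponent actions manifest a "rush the boss down" strategy. The ramp membership function, the thresholds, the action names and the rate_rush_strategy function are all assumptions made up for this example, with min used as the fuzzy AND and 1 - x as the fuzzy NOT.

```python
def ramp(x, low, high):
    """Membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def rate_rush_strategy(actions):
    """actions: 'attack' / 'heal' / 'guard' entries seen so far (the memory)."""
    if not actions:
        return 0.0
    attack_share = actions.count("attack") / len(actions)
    defend_share = (actions.count("heal") + actions.count("guard")) / len(actions)
    aggressive = ramp(attack_share, 0.4, 0.9)
    careless = 1.0 - ramp(defend_share, 0.0, 0.5)   # fuzzy NOT
    return min(aggressive, careless)                # fuzzy AND

rush_rating = rate_rush_strategy(["attack"] * 9 + ["heal"])
turtle_rating = rate_rush_strategy(["guard", "heal", "guard", "heal"])
```

The same shape can be reused for the second rating, with the inputs being how well each candidate counter addresses the strategies/tactics rated highly here.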

Implementing the decide section will be based on the rating for each strategy/tactic countering the opponent's ones.
For instance, the AI can simply randomly pick 1 among the 5 strategies/tactics with the highest rating.
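
In Python, just for illustration, that could be as short as the below sketch. The decide function and the strategy names are made up, and the rng parameter is only there so that the randomness can be seeded for testing.

```python
import random

def decide(ratings, top_n=5, rng=random):
    """Pick one strategy/tactic at random among the `top_n` highest rated."""
    ranked = sorted(ratings, key=ratings.get, reverse=True)
    return rng.choice(ranked[:top_n])

ratings = {"stall": 0.9, "burst": 0.8, "debuff": 0.7, "heal_loop": 0.6,
           "bait": 0.5, "flee": 0.2, "idle": 0.1}
rng = random.Random(0)   # seeded so that runs are repeatable
picks = {decide(ratings, rng=rng) for _ in range(20)}
```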

Implementing the act section will be based on the decided strategy/tactic. The AI should be able to turn a general strategy/tactic into concrete action plans, according to what the AI has observed.
From all currently available moves(including doing nothing), the AI should be able to give a rating on how much each move can contribute to the decided strategy/tactic.
To me, that's the second hardest part, especially in real time battle systems(mainly because of time performance issue).
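
A minimal sketch of such a move rating, in Python just for illustration, is a weighted sum of each move's effects against what the decided strategy/tactic cares about. The rate_moves function, the move names, the effect names and all the numbers are made up for this example.

```python
def rate_moves(moves, strategy_weights):
    """Score each move by how much its effects serve the decided strategy."""
    scores = {}
    for name, effects in moves.items():
        scores[name] = sum(strategy_weights.get(effect, 0.0) * amount
                           for effect, amount in effects.items())
    return scores

moves = {
    "do_nothing": {},
    "iron_wall": {"defense": 1.0},
    "mega_flare": {"damage": 1.0},
    "slow_all": {"speed_down": 0.8, "damage": 0.1},
}
# Hypothetical weights for a "survive the burst" strategy.
survive_burst = {"defense": 0.7, "speed_down": 0.5, "damage": 0.1}
scores = rate_moves(moves, survive_burst)
best = max(scores, key=scores.get)
```

As this is linear in the number of moves and effects, it should be cheap enough to run often, but whether it's cheap enough for real time battle systems depends on how the effects themselves are estimated.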

I can already foresee that the OODA AI implementation will be heavily based on the strategy pattern.
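
For those not familiar with the strategy pattern, the below is a bare bones Python sketch of it. The class names, the observation keys and the 100 damage threshold are all made up for this example. The point is only that the decided strategy is an interchangeable object that the battler AI delegates to.

```python
from abc import ABC, abstractmethod

class Strategy(ABC):
    @abstractmethod
    def choose_move(self, observation):
        """Turn the current observation into a concrete move."""

class Turtle(Strategy):
    def choose_move(self, observation):
        return "guard" if observation["incoming_damage"] > 100 else "attack"

class Rush(Strategy):
    def choose_move(self, observation):
        return "strongest_attack"

class BattlerAI:
    def __init__(self, strategy):
        self.strategy = strategy          # swapped whenever the AI reorients

    def act(self, observation):
        return self.strategy.choose_move(observation)

ai = BattlerAI(Turtle())
first = ai.act({"incoming_damage": 250})
ai.strategy = Rush()                      # the decide phase picked a new strategy
second = ai.act({"incoming_damage": 250})
```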

P.S.: I think that state machines will suffice for small scale OODA AIs facing low complexities, but they'll probably fail in large scale complex use cases.

Blizzard

I agree, balancing fuzzy logic for AI is not easy and needs some trial and error.

State machines can actually scale pretty well if you use a more tree-like transition graph rather than one that can go from anywhere to anywhere. The same goes if you use a state machine only internally to determine a basic reaction branch after observation, and then make a final decision on what to do within the limited action set mandated by that branch.
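
A minimal sketch of that second idea, in Python just for illustration: a small state machine with a tree-like transition table picks the reaction branch, and the final choice is then made only within that branch's action set. The state names, the event names, the action sets and the rating numbers are all made up for this example.

```python
# Tree-like transitions: every branch can only go back to the "idle" root.
TRANSITIONS = {
    "idle": {"threatened": "defensive", "opening": "offensive"},
    "defensive": {"safe": "idle"},
    "offensive": {"done": "idle"},
}
ACTION_SETS = {
    "idle": ["wait", "attack"],
    "defensive": ["guard", "heal"],
    "offensive": ["attack", "mega_flare"],
}

def step(state, event):
    """Follow an allowed transition, or stay put if the event isn't allowed."""
    return TRANSITIONS[state].get(event, state)

def choose(state, rate):
    """Final decision, limited to the action set mandated by the branch."""
    return max(ACTION_SETS[state], key=rate)

ratings = {"wait": 0.1, "attack": 0.4, "guard": 0.2, "heal": 0.9, "mega_flare": 1.0}
state = step("idle", "threatened")
action = choose(state, lambda a: ratings.get(a, 0.0))
```

Note that "mega_flare" has the highest rating overall but isn't picked, as it's outside the defensive branch's action set.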