Duke4.net Forums: scripting optimizations - Duke4.net Forums

scripting optimizations

User is offline   Danukem 

  • Duke Plus Developer

#1

I want to optimize my pathfinding code, and I'm wondering if someone more knowledgeable about the inner workings of the game can tell me which operations are more computationally expensive. My code at various points uses:

ldist

canseespr

Right now it checks visibility first, and then only checks distance when visibility is confirmed (it's a bit more complicated than that, since visibility is not always required, but that's close enough to the truth). But if canseespr is the more expensive command, then I'll want to switch the order.

There's also a good chance I can replace each of my canseespr calls with a hitscan instead. A single hitscan with the right angle, zangle, and clipmask should be able to confirm visibility. But I don't know whether that would be significantly cheaper than using canseespr. I'll probably get some false negatives, so I'll lose a bit of accuracy.
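The single-hitscan idea can be pictured as follows: aim a ray from one node straight at the other and treat the target as visible only if nothing stops the ray before it gets there. This is an engine-independent sketch, not EDuke32 code. `Vec3`, `directionTo`, and `rayReachesTarget` are hypothetical names, and the hit point is assumed to come from whatever ray cast is used (in CON, the hit coordinates returned by hitscan). How the direction needs to be scaled for a real hitscan call is not covered here.

```cpp
#include <cstdint>

// Hypothetical 3D point type; not an engine structure.
struct Vec3 { int32_t x, y, z; };

// Direction from 'from' to 'to'. A ray cast needs a direction, not a unit vector.
Vec3 directionTo(const Vec3& from, const Vec3& to) {
    return { to.x - from.x, to.y - from.y, to.z - from.z };
}

// True if the ray's hit point lies at or beyond the target along the ray,
// i.e. nothing blocked the ray before it reached the target.
// Uses dot products in 64-bit integers, so no square roots are needed.
bool rayReachesTarget(const Vec3& from, const Vec3& to, const Vec3& hit) {
    const Vec3 d = directionTo(from, to);   // ray direction
    const Vec3 h = directionTo(from, hit);  // offset of the hit point
    const int64_t toTarget = int64_t(d.x) * d.x + int64_t(d.y) * d.y + int64_t(d.z) * d.z;
    const int64_t toHit    = int64_t(h.x) * d.x + int64_t(h.y) * d.y + int64_t(h.z) * d.z;
    return toHit >= toTarget;
}
```

A hit that lands slightly short of the target (for example on the target sprite's own surface) would count as blocked in this sketch, which is one way the false negatives mentioned above can arise.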

User is offline   Kyanos 

#2

Well, I'm not the one to answer about the technicalities, but from a logic point of view canseespr would probably return negative more often, which would help performance if it comes first... I think.

If you are calling things every frame there are probably some big gains to be made by only running certain subroutines every second or so.

All said with absolutely no certainty :o

User is offline   Danukem 

  • Duke Plus Developer

#3

Photonic, on 28 October 2019 - 02:15 PM, said:

If you are calling things every frame there are probably some big gains to be made by only running certain subroutines every second or so.


Already doing that. Processing is staggered with only certain nodes doing the calculations on certain tics. Even so, in a bigger map with lots of nodes and routes, and the need to recalculate routes dynamically (e.g. in capture the flag when you have a runner with the flag who needs to be hunted down), the calculations themselves need optimization.
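The post does not say how the staggering is implemented. A minimal sketch of one common approach, assuming each node has an integer index and the game exposes a tic counter: offset the tic by the node index so that roughly 1/period of the nodes do their expensive work on any given tic. `shouldProcess` and `nodesProcessedOnTic` are hypothetical names.

```cpp
#include <cstdint>

// True on the tics where node 'nodeIndex' should run its expensive update.
// Adding the index to the tic spreads the nodes evenly across 'period' tics.
bool shouldProcess(uint32_t tic, uint32_t nodeIndex, uint32_t period) {
    return period != 0 && (tic + nodeIndex) % period == 0;
}

// Counts how many of 'nodeCount' nodes would run on a given tic.
uint32_t nodesProcessedOnTic(uint32_t tic, uint32_t nodeCount, uint32_t period) {
    uint32_t count = 0;
    for (uint32_t i = 0; i < nodeCount; ++i)
        if (shouldProcess(tic, i, period))
            ++count;
    return count;
}
```

With 100 nodes and a period of 5, this runs 20 nodes per tic instead of 100, at the cost of each node reacting up to 4 tics late.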

User is offline   Kyanos 

#4

            vInstruction(CON_CANSEESPR):
                insptr++;
                {
                    int const nSprite1 = Gv_GetVar(*insptr++);
                    int const nSprite2 = Gv_GetVar(*insptr++);

                    if (EDUKE32_PREDICT_FALSE((unsigned)nSprite1 >= MAXSPRITES || (unsigned)nSprite2 >= MAXSPRITES))
                    {
                        CON_ERRPRINTF("invalid sprite %d\n", (unsigned)nSprite1 >= MAXSPRITES ? nSprite1 : nSprite2);
                        abort_after_error();
                    }

                    int const nResult = cansee(sprite[nSprite1].x, sprite[nSprite1].y, sprite[nSprite1].z, sprite[nSprite1].sectnum,
                                               sprite[nSprite2].x, sprite[nSprite2].y, sprite[nSprite2].z, sprite[nSprite2].sectnum);

                    Gv_SetVar(*insptr++, nResult);
                    dispatch();
                }


            vInstruction(CON_LDIST):
            vInstruction(CON_DIST):
                insptr++;
                {
                    int const out = *insptr++;
                    vec2_t    in;
                    Gv_FillWithVars(in);

                    if (EDUKE32_PREDICT_FALSE((unsigned)in.x >= MAXSPRITES || (unsigned)in.y >= MAXSPRITES))
                    {
                        CON_ERRPRINTF("invalid sprite %d, %d\n", in.x, in.y);
                        abort_after_error();
                    }

                    Gv_SetVar(out, (VM_DECODE_INST(tw) == CON_LDIST ? ldist : dist)(&sprite[in.x], &sprite[in.y]));
                    dispatch();
                }



Those are both blocks of code from gameexec.cpp. Seeing as ldist only uses the x and y coordinates while canseespr uses x, y, z and sectnum, I would assume that canseespr must be the heavier math if you dig deeper into the functions in the source. ... I just did.

int32_t ldist(const void *s1, const void *s2)
{
    auto sp1 = (vec2_t const *)s1;
    auto sp2 = (vec2_t const *)s2;
    return sepldist(sp1->x - sp2->x, sp1->y - sp2->y);
}


And yes, cansee is way heavier; it deserves a spoiler.
Spoiler


This post has been edited by Photonic: 28 October 2019 - 04:31 PM


User is offline   Danukem 

  • Duke Plus Developer

#5

Thanks for the research! I'll definitely change it around to check distance first. That will help out a lot in bigger maps where most of the nodes are outside the range of other nodes. Replacing cansee with a hitscan check potentially has bigger gains, but also risks false negatives, so I'll only do that if I'm not getting good enough results.
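The reordering described here amounts to using the cheap test as a gate for the expensive one. The sketch below is generic C++ rather than CON, and it swaps in an exact squared-distance comparison where the mod would use ldist (which returns an approximate distance). `Node`, `distSq`, and `linked` are hypothetical names, and the visibility check is passed in as a callback standing in for canseespr.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical node position; only x and y matter for the range test.
struct Node { int32_t x, y; };

// Squared 2D distance in 64-bit integers: no square root, no overflow.
int64_t distSq(const Node& a, const Node& b) {
    const int64_t dx = int64_t(a.x) - b.x;
    const int64_t dy = int64_t(a.y) - b.y;
    return dx * dx + dy * dy;
}

// True if 'b' is within 'maxRange' of 'a' and visible from it.
// The expensive 'canSee' callback runs only when the cheap range test passes.
bool linked(const Node& a, const Node& b, int32_t maxRange,
            const std::function<bool(const Node&, const Node&)>& canSee) {
    if (distSq(a, b) > int64_t(maxRange) * maxRange)
        return false;          // out of range: visibility is never evaluated
    return canSee(a, b);
}
```

In a map where most node pairs are out of range, nearly every call returns at the first branch, which is where the savings come from.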

EDIT: Is there a way to do multi-dimensional arrays in CON? It would provide a practical way for each node to store data about which other nodes are visible from it. There is some dynamic change due to doors closing and such, but it would still be useful to store the baseline values.
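Whether CON offers true multi-dimensional arrays is not answered in this thread. A common workaround wherever only one-dimensional arrays are available is to flatten the table in row-major order, so that the entry for the pair (from, to) lives at index from * nodeCount + to. The sketch below shows the idea in C++; `VisibilityTable` is a hypothetical name, and the same index arithmetic would carry over to a one-dimensional array in a script.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Node-to-node visibility stored in one flat array of nodeCount * nodeCount cells.
class VisibilityTable {
public:
    explicit VisibilityTable(std::size_t nodeCount)
        : n_(nodeCount), cells_(nodeCount * nodeCount, 0) {}

    // Row-major index of the (from, to) pair.
    std::size_t index(std::size_t from, std::size_t to) const { return from * n_ + to; }

    void set(std::size_t from, std::size_t to, bool visible) {
        cells_[index(from, to)] = visible ? 1 : 0;
    }

    bool get(std::size_t from, std::size_t to) const {
        return cells_[index(from, to)] != 0;
    }

private:
    std::size_t n_;               // number of nodes
    std::vector<uint8_t> cells_;  // 1 = visible, 0 = not visible
};
```

The table stores baseline values only; anything that changes at runtime, such as a door closing, still needs a live check or an update to the affected cells.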

This post has been edited by Trooper Dan: 28 October 2019 - 05:38 PM


User is offline   Danukem 

  • Duke Plus Developer

#6

I'm going to try some optimizations and record results here. First, the baseline: a particular large map with lots of nodes, but without any actors actually using the nodes. Framerate is measured when staring at a specific wall from a specific angle, to eliminate other variables.

Baseline with nodes doing full calculations every tic: 170 fps

Baseline with staggered calculations (once every 5 tics): 238 fps


Next, going to test after the first optimization. This will have distance checks coming before visibility checks.

Distance before vis optimization with full calculations: 218 fps

Distance before vis optimization with staggered calculations: 232 fps

So I got a very substantial bump in fps when doing the optimization by itself, but when it was combined with staggered calculations I actually got a small drop in fps. I'm not sure what to make of this. Next I checked in a different map, this time with live enemies searching for the player.

Arena1 actor search no optimization: 235 fps
Arena1 staggered calculations: 240 fps

Arena1 actor search dist over vis optimized: 240 fps
Arena1 actor search dist over vis opt + staggered: 238 fps

Here it was obvious that the calculations were not having a big impact in the first place (probably because this map has fewer nodes). But again, we see a gain when no other optimization is used, and a small drop when the dist before vis optimization is combined with staggered calculations. This is puzzling.

Doom64hunter

#7

Trooper Dan, on 28 October 2019 - 10:53 PM, said:

I'm going to try some optimizations and record results here. [...] But again, we see a gain when no other optimization is used, and a small drop when the dist before vis optimization is combined with staggered calculations. This is puzzling.

Do you see these results consistently over multiple repetitions, i.e. multiple launches of the game? It's possible that they are well within a standard deviation, depending on the noise on your system.

Judging from the code I'd definitely say go with ldist first. cansee executes a number of for loops and fills a bunch of memory buffers which is definitely not fast.
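To make the "within a standard deviation" question concrete, here is a minimal sketch that summarizes repeated fps readings per configuration and applies a rough noise check. `Stats`, `summarize`, and `withinNoise` are hypothetical names, the samples would come from several launches of each configuration, and the comparison is a rule of thumb rather than a proper statistical test.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Stats { double mean; double stddev; };

// Mean and sample standard deviation (n - 1 in the denominator).
// With fewer than two samples the deviation is reported as 0.
Stats summarize(const std::vector<double>& samples) {
    if (samples.empty()) return {0.0, 0.0};
    double sum = 0.0;
    for (double s : samples) sum += s;
    const double mean = sum / static_cast<double>(samples.size());
    if (samples.size() < 2) return {mean, 0.0};
    double squares = 0.0;
    for (double s : samples) squares += (s - mean) * (s - mean);
    return {mean, std::sqrt(squares / static_cast<double>(samples.size() - 1))};
}

// Rough check: the two means differ by no more than the larger deviation.
bool withinNoise(const Stats& a, const Stats& b) {
    return std::fabs(a.mean - b.mean) <= std::max(a.stddev, b.stddev);
}
```

If `withinNoise` returns true for two configurations, the measured difference between them is too small to attribute to the optimization with any confidence.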

This post has been edited by Doom64hunter: 29 October 2019 - 05:55 AM


User is offline   Danukem 

  • Duke Plus Developer

#8

Doom64hunter, on 29 October 2019 - 05:54 AM, said:

Do you see these results consistently over multiple repetitions, i.e. multiple launches of the game? It's possible that they are well within a standard deviation, depending on the noise on your system.

Judging from the code I'd definitely say go with ldist first. cansee executes a number of for loops and fills a bunch of memory buffers which is definitely not fast.


I did one launch per test on the first map, then observed fps over a couple of minutes. I did two launches per test in the second map, and the difference was still there. However, I'm going with the noise explanation as I have nothing else. Ideally I want to remove the latency in node reaction caused by the staggered processing as much as possible, because the latency has a cumulative effect on how long it takes paths to update across the whole map, and that can lead to actors doing dumb things for as much as a couple of seconds. So, having significant gains when all nodes are processed every tic should be prioritized.

User is offline   Kyanos 

#9

For my own curiosity I went and looked at hitscan (clip.cpp); it's 300+ lines.

Spoiler


Also, IMO your testing is inconclusive: the map pool is too small. There are a lot of unique per-map factors that will need to be averaged out. I'd say test a few small to medium maps and as many of the larger ones as you can.

This post has been edited by Photonic: 29 October 2019 - 01:13 PM


User is offline   Danukem 

  • Duke Plus Developer

#10

A couple of years ago, I did end up switching to hitscan instead of canseespr for the purpose of actors confirming shootability to their targets. My actors have various factions and can fight each other, so the player-centered commands such as ifcanshoottarget are no good. Anyway, the main reason I did it was to prevent actors from trying to shoot each other through unbreakable windows. With hitscan, I can control the clipmask and make unbreakable transparent barriers prevent lock-on. I guess I don't really know whether a hitscan is cheaper than a canseespr. But for node checking, there would be a similar benefit to the actor shooting case. Right now I have nodes that can see each other but are not accessible to each other, due to unbreakable glass and similar things. Currently I have to tag the nodes manually to get the results I want in those cases, but with hitscan it would happen naturally.

