Neural Network based alt detection

justin kilpatrick

Mar 28, 2015, 5:36:54 PM
to CivcraftD...@googlegroups.com
There was some discussion of this in an old thread. What does everyone think of actually trying it?

We have years of data to feed into this system, and it has the huge advantage of being easy to open source without us having to worry about making it too easy to evade: the strength is in what the system learns.

At first it would just handle two pearls alt detection like the current alt script. We could even have it piggyback on the current script for a while (not doing anything except accepting the feedback we give the current script, to improve its guesses and do initial learning beyond being fed legacy data).

Once it has alt detection down we can shift it into the actual banning alts role once we get the other factors for that sorted. Either way I think it would probably make for a superior alt script. 

Rourke750

Mar 28, 2015, 5:39:23 PM
to justin kilpatrick, CivcraftD...@googlegroups.com
Ah so a bergify script basically.

justin kilpatrick

Mar 28, 2015, 5:41:55 PM
to Rourke750, CivcraftD...@googlegroups.com
wat

Rourke750

Mar 28, 2015, 5:44:43 PM
to justin kilpatrick, CivcraftD...@googlegroups.com
In 1.0, around the time of spock bot and the gimmick brigade, Berge analyzed the login/logout times in all the player data that spock bot had. He was able to guess who certain people's alts were, because a player was most likely to switch into their alt account a few minutes after they logged out of their main. Once this had occurred enough times, he could say with some percent confidence that the two accounts were alts of each other.
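
A minimal sketch of the kind of login/logout correlation described above, in Python. This is not Berge's actual script: the session tuple layout, the five-minute window, and the name `count_handoffs` are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_handoffs(sessions, window=timedelta(minutes=5)):
    """Count how often one account logs in shortly after another logs out.

    sessions: list of (account, login_time, logout_time) tuples (hypothetical layout).
    Returns a Counter keyed by (logged_out_account, logged_in_account).
    """
    handoffs = Counter()
    for a_name, _, a_logout in sessions:
        for b_name, b_login, _ in sessions:
            if a_name == b_name:
                continue
            gap = b_login - a_logout
            # Only count logins that follow the logout within the window.
            if timedelta(0) <= gap <= window:
                handoffs[(a_name, b_name)] += 1
    return handoffs


# Made-up example sessions, not real server data.
t = datetime(2015, 3, 28, 17, 0)
sessions = [
    ("main", t, t + timedelta(minutes=60)),
    ("alt", t + timedelta(minutes=62), t + timedelta(minutes=90)),
    ("main", t + timedelta(minutes=92), t + timedelta(minutes=120)),
    ("other", t + timedelta(minutes=30), t + timedelta(minutes=200)),
]
handoffs = count_handoffs(sessions)
```

A pair that keeps showing up in `handoffs` over weeks of logs is a candidate for manual review. The count alone does not prove anything, since friends who take turns on a shared connection would produce the same pattern.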

justin kilpatrick

Mar 28, 2015, 5:45:52 PM
to Rourke750, CivcraftD...@googlegroups.com
well yes we could just scan berge's brain and upload it to the server, but a neural network might work a little better. 

Rourke750

Mar 28, 2015, 5:47:27 PM
to justin kilpatrick, CivcraftD...@googlegroups.com
...

Andrew Jawa

Mar 28, 2015, 7:03:03 PM
to Rourke750, justin kilpatrick, CivcraftD...@googlegroups.com
Done, it will apply on server restart.

justin kilpatrick

Mar 28, 2015, 7:06:09 PM
to Andrew Jawa, Rourke750, CivcraftD...@googlegroups.com
you uploaded berge?

Andrew Jawa

Mar 28, 2015, 7:38:17 PM
to justin kilpatrick, Rourke750, CivcraftD...@googlegroups.com
berge is love, berge is life

Greg Boiczyk

Mar 28, 2015, 8:01:13 PM
to Dr. Andrew Jawa, justin kilpatrick, Rourke Rabinowitz, CivcraftD...@googlegroups.com

Civcraft is now space based

justin kilpatrick

Mar 28, 2015, 10:18:21 PM
to Greg Boiczyk, Dr. Andrew Jawa, Rourke Rabinowitz, CivcraftD...@googlegroups.com
this thread turned into a parody of itself very quickly. 

Jake Jungbluth

Mar 28, 2015, 10:29:25 PM
to justin kilpatrick, Greg Boiczyk, Dr. Andrew Jawa, Rourke Rabinowitz, CivcraftD...@googlegroups.com
It's aware 

Gavin Jenkins

Mar 29, 2015, 5:12:17 PM
to civcraftd...@googlegroups.com, CivcraftD...@googlegroups.com
Definitely down for trying this.

Initial steps:

1) Make a plugin, if there isn't one already, that can log anything and everything possible for a given player for some amount of time

2) Fly around in game invisibly and manually identify a sample of definite non-bots, and definite bots.

3) Collect data on those identified bots/non-bots for awhile intensively

4) Also collect some data on random 5 minute samples of other people blindly

So SOME pre-categorized data (at least 10-20 different bots and different humans), and SOME blind data (as much as you can store). From there, we can proceed to learn the factors that matter. The more data, the better it will work.
The next step after identifying the factors would be to then apply them to unknown players, and verify how accurate it is by again flying around invisibly and seeing if its predictions are good or not, until satisfied.
This process MAY have to be repeated later if/when you actually change the rules and start either banning or kicking/captcha-ing identified bots. Because botters will start trying to mimic humans, and new factors may have to be re-learned. It may not be though. The more data you collect initially, the more likely it is to continue predicting successfully even if/when people try to mimic humans, without you having to update it.
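
The verification step above could be scored with something like the following Python sketch. The dictionaries, player names, and the `verification_report` helper are all hypothetical; it only compares a model's bot/non-bot predictions against what an admin observed in person.

```python
def verification_report(predicted, observed):
    """Compare predictions with manual observations.

    predicted, observed: dicts mapping player name -> True if judged a bot.
    Only players present in both dicts are scored.
    """
    checked = [p for p in predicted if p in observed]
    tp = sum(1 for p in checked if predicted[p] and observed[p])
    fp = sum(1 for p in checked if predicted[p] and not observed[p])
    fn = sum(1 for p in checked if not predicted[p] and observed[p])
    tn = len(checked) - tp - fp - fn
    return {
        "checked": len(checked),
        "accuracy": (tp + tn) / len(checked) if checked else None,
        # Of the players flagged as bots, how many really were bots?
        "precision": tp / (tp + fp) if tp + fp else None,
        # Of the real bots that were checked, how many were flagged?
        "recall": tp / (tp + fn) if tp + fn else None,
    }


# Made-up example: five predictions, four of them checked by hand.
predicted = {"a": True, "b": True, "c": False, "d": False, "e": True}
observed = {"a": True, "b": False, "c": False, "d": True}
report = verification_report(predicted, observed)
```

Precision matters most if flagged players get kicked or banned, since a low value means humans are being punished. Recall matters more if the flags only feed a manual review queue.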

justin kilpatrick

Mar 29, 2015, 6:29:02 PM
to Gavin Jenkins, civcraftd...@googlegroups.com
To try and understand the development process: do we need the data first, or can we design the plugin, put it live, and then feed it data? I am imagining something that links into the current alts list, with a few other commands for things like manual bot identification. Then we just run it on the main server, where it learns without really taking any action while we manually inform it of various things.

Later on we could add the parts required for actual action and enforcement. First we teach it, then we have it make suggestions, then we have it act for us. 

Gavin Jenkins

Mar 29, 2015, 7:28:48 PM
to civcraftd...@googlegroups.com, gavj...@gmail.com
It cannot realistically be programmed and tuned initially without at least a fair amount of example data, even if you want it to adapt on the fly in production mode (which is possible, yes).

After that though, it's up to you. The training feedback must be given at some point, it's your preference when. Basically, if you put in very little data or time up front, what will happen is it will spit out "75% confident"s at you instead of "90% confident"s. So you'll have to spend more time following up manually to confirm botting. Then as you report your findings later, it will start to get more accurate.

But at some point or another, you have to put in the time to manually inspect and tell it if it's right or not. That can be early, that can be later. But without that, it's pretty much logically impossible to make it work on its own. It could identify different groups, but it won't know which ones are bots and which are players without you telling it at some point...

justin kilpatrick

Mar 29, 2015, 7:41:43 PM
to Gavin Jenkins, civcraftd...@googlegroups.com

So we need to get a large set of sanitized example data for you?

Gavin

Mar 29, 2015, 7:46:34 PM
to justin kilpatrick, civcraftdevelopers
Yes, I don't know what you would be or should be comfortable with. ANYTHING could be important. Even names (botters might tend to use certain kinds of names), and even absolute coordinates rather than just relative coordinates (certain areas of the map might be more botting prone), etc. Obviously things you wouldn't want me to know.

What might work is if you can find some volunteers who don't mind me knowing their actual data and record from them, then it's not an issue to give everything to get it in the general ballpark, and further learning can happen on-server. Finding botters who have representative bots and who want to help with a bot detection program is I imagine a tricky proposition, though...

justin kilpatrick

Mar 29, 2015, 7:50:16 PM
to Gavin, civcraftdevelopers
Since we also want to use this for alts, our primary use case is people who do not tell us. This is why I wanted to train it entirely on the server: it's easier to do that than to sanitize data, but if we have to we can. We just need to decide what to collect and work out how to do it.

Gavin

Mar 29, 2015, 7:51:51 PM
to justin kilpatrick, civcraftdevelopers
Sigh, okay I MIIIIIGHT be able to feasibly do it with just a list of the variables considered, and be able to make up some plausible data myself. It will really suck initially at categorizing, but it should be able to actually at least function that way and then start to learn on its own. You just couldn't trust it for awhile, because it would be tuned to my imagination, not reality to begin with.

That's bare minimum though. Different types of data all have to be transformed into network-interpretable format differently.

justin kilpatrick

Mar 29, 2015, 7:55:32 PM
to Gavin, civcraftdevelopers
That could work. In fact we could get you a smaller set of sample data; the issue is with large sets. Regardless, we need to figure out what to log, since we intend to teach this thing for a long time.

xavter2

Mar 30, 2015, 4:56:42 PM
to civcraftd...@googlegroups.com, gavj...@gmail.com
Are we sure we have enough data to do this, or have identified the features to be training the neural net on? I only do this for small lab experiments, but the neural nets I've trained have had training datasets that are several gigabytes in size. My experience may be a bit different than what we're working on but we still need a large amount of training data, and will need to identify the features we will be training it to classify.

Rourke750

Mar 30, 2015, 5:36:32 PM
to xavter2, Gavin Jenkins, civcraftd...@googlegroups.com

Well there is around three years of logs.  You think that's enough...

justin kilpatrick

Mar 30, 2015, 5:36:40 PM
to xavter2, civcraftd...@googlegroups.com, gavj...@gmail.com
We have every log of every day Civcraft has ever been run: every login, every logout, every minute online, and lots of admin and logging data about what players do when they are online (which we use as admins). We easily have 1/4 of a TB worth of logs containing every notable thing that anyone has ever done. Of course our logging has improved during this time, so it's not totally consistent, but if you need absolute consistency the last couple of months alone are 10 GB or so.

That sound like enough?

xavter2

Mar 30, 2015, 5:39:40 PM
to civcraftd...@googlegroups.com, xav...@gmail.com, gavj...@gmail.com
Lol I suppose that completely changes things then. More data more power. Do you guys happen to have logs about which chests people have used, what items were taken, and when? I feel like that might be something insightful to analyze.

justin kilpatrick

Mar 30, 2015, 5:46:03 PM
to xavter2, civcraftd...@googlegroups.com, gavj...@gmail.com
We log just about everything you can conceive of and we regularly add more things to that list. 

Gavin

Mar 30, 2015, 8:48:37 PM
to justin kilpatrick, xavter2, civcraftdevelopers
Yeah of course that's enough =P Whatever you're willing to share is helpful. From just the variables you want to use and the format, up to sanitized samples of data on some subset of those variables.

Neural networks are by no means necessarily the best route by the way, Xavter. I would probably also try some more boring types of stuff like just plain old regressions and automated factor analyses. There are some other less generic neural-type models out there too, specifically designed for categorization, like ALCOVE and COVIS.

In response to another followup message: Alt detection I do not imagine would work as well with categorization approaches. That's probably better with straight math for correlating log ins, activity locations, shared chests, shared items, targeted logical stuff like that. Maybe also supplementing with similarity ratings of the two characters' gameplay styles (psychological type similarity rating of all their bulk play data)
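
As a rough illustration of the "straight math" approach to alt detection, the Python sketch below scores two accounts by the overlap of the chests and map regions they have used, using Jaccard similarity. The feature names, the identifiers, and the unweighted average are assumptions for illustration, not anything the current alt script is known to do.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def alt_similarity(features_x, features_y):
    """Average the Jaccard similarity over every feature both accounts have.

    features_x, features_y: dicts mapping a feature name (hypothetical,
    e.g. "chests") to the set of identifiers seen for that account.
    """
    shared = [name for name in features_x if name in features_y]
    if not shared:
        return 0.0
    return sum(jaccard(features_x[n], features_y[n]) for n in shared) / len(shared)


# Made-up example data for two accounts.
account_x = {"chests": {"c1", "c2", "c3"}, "regions": {"r1", "r2"}}
account_y = {"chests": {"c2", "c3", "c4"}, "regions": {"r1", "r2"}}
score = alt_similarity(account_x, account_y)
```

In practice the features would probably need different weights, because two unrelated players living in the same town share regions far more easily than they share private chests.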

berge403

Mar 31, 2015, 6:04:37 PM
to civcraftd...@googlegroups.com, gregb...@gmail.com, dr.andr...@gmail.com, rourk...@gmail.com, CivcraftD...@googlegroups.com
Now you have to make an addition to tomorrow's changelog that you uploaded Berge's brain to the server to help with alt detection.

Then, make occasional slipups through server messages referencing kings/gondolin/muh quadrant/collaboration/etc.

But realistically... reading through the rest of this thread, wildweazel pointed me at it.

Matthías Valdimarsson

Apr 1, 2015, 10:07:15 AM
to berge403, Andrew Jawa, civcraftd...@googlegroups.com, rourk...@gmail.com, gregb...@gmail.com

Muh 40 men in prot

Muh 250.000 homesteaders

justin kilpatrick

Apr 6, 2015, 9:18:23 PM
to Matthías Valdimarsson, berge403, Andrew Jawa, civcraftd...@googlegroups.com, rourk...@gmail.com, gregb...@gmail.com
bumping, what needs to happen for this? 

Rourke750

Apr 6, 2015, 9:41:45 PM
to justin kilpatrick, Dr. Andrew Jawa, Greg Boiczyk, berge403, civcraftd...@googlegroups.com, Matthías Valdimarsson

Someone to start :/ 

Gavin Jenkins

Apr 7, 2015, 2:16:32 PM
to Rourke750, Greg Boiczyk, justin kilpatrick, berge403, civcraftd...@googlegroups.com, Matthías Valdimarsson, Dr. Andrew Jawa

Variables list and/or data.

programmerdan

Apr 7, 2015, 3:10:24 PM
to civcraftd...@googlegroups.com, rourk...@gmail.com, gregb...@gmail.com, kilpatri...@gmail.com, berg...@gmail.com, goldma...@gmail.com, dr.andr...@gmail.com, gavin-...@uiowa.edu
I'd love to get involved, so a semi-public (?) repository for code and "variables list and/or data" would be great as well; not sure if such access-limited development resources already exist in the Civcraft medium or if it's something that needs investment/doing.

justin kilpatrick

Apr 7, 2015, 8:34:49 PM
to programmerdan, civcraftd...@googlegroups.com, rourk...@gmail.com, gregb...@gmail.com, berg...@gmail.com, goldma...@gmail.com, dr.andr...@gmail.com, gavin-...@uiowa.edu
we can just give you a repo on the Github civcraft organization, that works for access control, then the variables list could just be a google doc. That sound good?

programmerdan

Apr 8, 2015, 3:14:46 PM
to civcraftd...@googlegroups.com, progra...@gmail.com, rourk...@gmail.com, gregb...@gmail.com, berg...@gmail.com, goldma...@gmail.com, dr.andr...@gmail.com, gavin-...@uiowa.edu
Yeah, that'd work great. Gavin -- works for you?

justin kilpatrick

Apr 8, 2015, 5:21:50 PM
to programmerdan, civcraftd...@googlegroups.com, rourk...@gmail.com, gregb...@gmail.com, berg...@gmail.com, goldma...@gmail.com, dr.andr...@gmail.com, gavin-...@uiowa.edu
what should the name of the repo be?

Daniel Boston

Apr 8, 2015, 5:56:06 PM
to Jenkins, Gavin W, justin kilpatrick, civcraftd...@googlegroups.com, rourk...@gmail.com, gregb...@gmail.com, berg...@gmail.com, goldma...@gmail.com, dr.andr...@gmail.com
Been a while since I worked with formal structures in this domain, but I'd love to talk shop with you on this. Might be a bit heavy/domain specific for the full dev list, so once we get started in a repo we should move most of the discussions on technical details there. 

Also, jealous that you have an active Matlab license. Mine expired about 3-4 years ago, haven't bitten the bullet to renew as personal yet. Are you familiar with R? Other good libraries for NN/GA/SVMs exist but iirc R is still king with good Java support.

On Wed, Apr 8, 2015 at 5:34 PM, Jenkins, Gavin W <gavin-...@uiowa.edu> wrote:
Sure thang. I'm probably just gonna fuck around in Matlab for awhile though and see if I can get it to work at all before doing a whole actual java plugin.


programmerdan

Apr 10, 2015, 12:56:52 PM
to civcraftd...@googlegroups.com, progra...@gmail.com, rourk...@gmail.com, gregb...@gmail.com, berg...@gmail.com, goldma...@gmail.com, dr.andr...@gmail.com, gavin-...@uiowa.edu
I'm terrible with names. NNProfiler? CivProfile? BetterPlayerProfiles? ServerIntelligenceModule? CivcraftNSA?

Andrew Jawa

Apr 10, 2015, 1:10:23 PM
to programmerdan, civcraftd...@googlegroups.com, Rourke Rabinowitz, Greg Boiczyk, Jesse Berger, Matthías Valdimarsson, gavin-...@uiowa.edu
Could name it after Exultant

Jesse Berger

Apr 10, 2015, 1:21:58 PM
to programmerdan, a, goldma...@gmail.com, gregb...@gmail.com, rourk...@gmail.com, civcraftd...@googlegroups.com, gavin-...@uiowa.edu

LiterallyNSA
GimmickDetector
NeuralServerAnalyzer
NeuralAssociativeServerAnalyzer

Sean Chambers

Apr 10, 2015, 1:36:56 PM
to Jesse Berger, programmerdan, a, goldma...@gmail.com, gregb...@gmail.com, rourk...@gmail.com, civcraftd...@googlegroups.com, gavin-...@uiowa.edu
Glados


justin kilpatrick

Apr 10, 2015, 4:19:55 PM
to Sean Chambers, Jesse Berger, programmerdan, a, goldma...@gmail.com, gregb...@gmail.com, rourk...@gmail.com, civcraftd...@googlegroups.com, gavin-...@uiowa.edu

Valve probably won't sue us. So I like Glados or we could call it Hal

Daniel Boston

Apr 10, 2015, 4:39:58 PM
to justin kilpatrick, Sean Chambers, Jesse Berger, a, goldma...@gmail.com, gregb...@gmail.com, rourk...@gmail.com, civcraftd...@googlegroups.com, gavin-...@uiowa.edu
"probably" -- last words before eaten by grue

I like HAL.

Jesse Berger

Apr 10, 2015, 4:48:02 PM
to Justin Kilpatrick, civcraftd...@googlegroups.com, Matthías Valdimarsson, gavin-...@uiowa.edu, programmerdan, rourk...@gmail.com, gregb...@gmail.com, a, Sean Chambers

Or we avoid IP entirely and name it according to what it does (neural/NSA)

justin kilpatrick

Apr 10, 2015, 7:13:49 PM
to Jesse Berger, civcraftd...@googlegroups.com, Matthías Valdimarsson, gavin-...@uiowa.edu, programmerdan, rourk...@gmail.com, gregb...@gmail.com, a, Sean Chambers
The NSA might sue us. 

Rourke750

Apr 10, 2015, 7:15:44 PM
to justin kilpatrick, civcraftd...@googlegroups.com, gavin-...@uiowa.edu, programmerdan, a, gregb...@gmail.com, Jesse Berger, Sean Chambers, Matthías Valdimarsson

"Ttk randomly disappears in the night"

justin kilpatrick

Apr 10, 2015, 7:27:57 PM
to Rourke750, civcraftd...@googlegroups.com, gavin-...@uiowa.edu, programmerdan, a, gregb...@gmail.com, Jesse Berger, Sean Chambers, Matthías Valdimarsson
I already have contingency plans for that. What do you think the changelog is for? It's not for Civcraft, it's my dead man's switch. If the NSA were to nab me they could not just hush it up; they would have to hire some poor sap to administrate Civcraft.

staygroovy57

Apr 13, 2015, 8:23:34 AM
to civcraftd...@googlegroups.com, CivcraftD...@googlegroups.com

Not sure that it's necessary to use something that's already popular in other games.

Maybe something like 'Sentinel' ? There's a Greek mythological equivalent, it's "Argus" (the ever-vigilant 100-eye guardian) 

programmerdan

Apr 13, 2015, 3:02:31 PM
to civcraftd...@googlegroups.com, CivcraftD...@googlegroups.com
Argus is cool, but Sentinel is amazing. Forget my prior thoughts, I lobby for Sentinel as the project name.

justin kilpatrick

Apr 13, 2015, 3:56:40 PM
to programmerdan, civcraftd...@googlegroups.com
Sentinel it is, unless anyone has opposition?

Rourke750

Apr 13, 2015, 4:08:07 PM
to justin kilpatrick, civcraftd...@googlegroups.com, Daniel Boston

BetterExpensiveSentinelPlus

justin kilpatrick

Apr 14, 2015, 7:31:24 PM
to Rourke750, civcraftd...@googlegroups.com, Daniel Boston
Ok so we have our name Sentinel, let me make the github, here you go 
https://github.com/Civcraft/Sentinel

Gavin Jenkins

Apr 20, 2015, 12:42:35 AM
to justin kilpatrick, Rourke750, civcraftdevelopers, Daniel Boston
Cool thanks. Still need a variable list too, at least.

Nick

Apr 20, 2015, 4:45:20 PM
to gavin-...@uiowa.edu, justin kilpatrick, Rourke750, civcraftdevelopers, Daniel Boston
You'll need to work out what variables you want to look into. Then plugin support to gather that data will need to be added.

N

Gavin Jenkins

Apr 20, 2015, 8:43:36 PM
to civcraftd...@googlegroups.com, progra...@gmail.com, gavin-...@uiowa.edu, kilpatri...@gmail.com, rourk...@gmail.com
I don't even know which ones you have. We started out with a minimum requirement of a sanitized actual data set to make a decent version of this plugin. And then I conceded that MAYBE a list of the variables MIGHT be enough to do a sort-of-competent hobbled version of it that would need to learn a ton and be fixed significantly after going online.

Now all of a sudden it's "no not even a list."  That degree of handicap I can't work with. A list is the bare minimum to not waste a ton of time and still end up with a nonfunctional plugin.

Nick

Apr 20, 2015, 8:45:09 PM
to Gavin Jenkins, civcraftd...@googlegroups.com, Daniel Boston, gavin-...@uiowa.edu, Justin Kilpatrick, Rourke Rabinowitz
I don't know what you need.

N

Gavin Jenkins

Apr 20, 2015, 8:54:43 PM
to civcraftd...@googlegroups.com, kilpatri...@gmail.com, progra...@gmail.com, gavj...@gmail.com, gavin-...@uiowa.edu, rourk...@gmail.com
A list of variables you keep data on:

Do you record every block place and break? If so does that include timestamps? Do you record movement events? If so at what granularity? Do you record interaction events (swinging arms and tools)? if so, does that include angle? Do you record all the chat information a person gets like from jukes? Or only chats they send out?

Etc. etc. Plus some idea of what format each one is in in the database (double precision? integer? Again, timestamps on what?) That way I can go through and convert each one to a neural network compatible input format and make up fake data for each that seems plausible to me to test with.

Otherwise I'm guessing blindly and am probably going to be wrong the vast majority of the time == wasted effort.
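
To make "network compatible input format" concrete, here is a hedged Python sketch of how a few plausible variables might be encoded. The event fields, the event type list, and the ±15000 world bounds are invented for illustration; the real logging schema is exactly what is still unknown at this point in the thread.

```python
import math

# Hypothetical event categories; the real list depends on what is logged.
EVENT_TYPES = ["block_place", "block_break", "chat", "chest_open"]
# Hypothetical world bounds used to scale coordinates into [0, 1].
WORLD_MIN, WORLD_MAX = -15000, 15000


def encode_time_of_day(seconds_since_midnight):
    """Encode time of day on a circle so 23:59 and 00:00 end up close together."""
    angle = 2 * math.pi * seconds_since_midnight / 86400
    return [math.sin(angle), math.cos(angle)]


def scale(value, lo, hi):
    """Min-max scale a number into the range [0, 1]."""
    return (value - lo) / (hi - lo)


def one_hot(value, categories):
    """Encode a category as a vector with a single 1.0."""
    return [1.0 if value == c else 0.0 for c in categories]


def encode_event(event):
    """Turn one logged event (a dict) into a flat list of floats."""
    return (
        encode_time_of_day(event["seconds"])
        + [scale(event["x"], WORLD_MIN, WORLD_MAX),
           scale(event["z"], WORLD_MIN, WORLD_MAX)]
        + one_hot(event["type"], EVENT_TYPES)
    )


vector = encode_event({"seconds": 0, "x": 0, "z": 15000, "type": "chat"})
```

Each data type needs its own treatment: timestamps are cyclic, coordinates are bounded numbers, and event types are categories. That is why a list of the variables and their storage formats matters even before any real data is shared.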

Nick

Apr 20, 2015, 9:00:20 PM
to Gavin Jenkins, civcraftd...@googlegroups.com, Justin Kilpatrick, Daniel Boston, gavin-...@uiowa.edu, Rourke Rabinowitz
I'll let ttk2 deal with this since it's his baby, lol.

N

justin kilpatrick

Apr 20, 2015, 9:14:06 PM
to Nick, Gavin Jenkins, civcraftd...@googlegroups.com, Daniel Boston, gavin-...@uiowa.edu, Rourke Rabinowitz
I was going to say that this should move to a more private channel, but I don't think we intend to close source this plugin, so whatever we record would become public knowledge anyway. That being said, we have not added much of significance since an early version of Misclog leaked out with 1.0. If none of the admins bring up a major mistake I am making, we can talk about most of what we record.

Nick

Apr 20, 2015, 9:21:08 PM
to justin kilpatrick, Gavin Jenkins, civcraftd...@googlegroups.com, Daniel Boston, gavin-...@uiowa.edu, Rourke Rabinowitz

Gavin Jenkins

Apr 20, 2015, 9:25:05 PM
to civcraftd...@googlegroups.com, gavj...@gmail.com, kilpatri...@gmail.com, gavin-...@uiowa.edu, progra...@gmail.com, rourk...@gmail.com
Getting a 404 on that link, and tried googling the repository by name and author also no luck.

Rourke750

Apr 20, 2015, 9:28:01 PM
to Gavin Jenkins, civcraftd...@googlegroups.com, Justin Kilpatrick, gavin-...@uiowa.edu, Daniel Boston
Its a private repo, was meant for ttk2

Daniel Boston

Apr 20, 2015, 9:28:51 PM
to Gavin Jenkins, civcraftd...@googlegroups.com, Justin Kilpatrick, Gavin Jenkins, Rourke Rabinowitz
Can't really have it both ways. We could go private and encourage more full disclosure for the developers who've dealt with NN/GA/AI systems before, or have a hobbled system with incomplete understanding of what's recorded, but public. If exposure is a concern, let's close this off and restrict access.

Nick

Apr 20, 2015, 9:32:44 PM
to Daniel Boston, Gavin Jenkins, civcraftd...@googlegroups.com, Justin Kilpatrick, Gavin Jenkins, Rourke Rabinowitz
You will have to record data no matter what. There will have to be work done to get whatever data you want. That's why it's more interesting to have you figure out what you need and then we can work to get data to you.

N

Daniel Boston

Apr 20, 2015, 9:33:45 PM
to Nick, Gavin Jenkins, civcraftd...@googlegroups.com, Justin Kilpatrick, Gavin Jenkins, Rourke Rabinowitz
I think Gavin and I were assuming we'd hook into existing data sources so we could leverage historic data in forming future-perfect models.

Gavin Jenkins

Apr 20, 2015, 9:39:07 PM
to civcraftd...@googlegroups.com, rourk...@gmail.com, progra...@gmail.com, gavj...@gmail.com, gavin-...@uiowa.edu, kilpatri...@gmail.com
More interesting for you. Not interesting to me to be attempting to nightmarishly code something through several layers of obscurity with no real ability to test or debug it. I completely understand if you don't want to give out that info, but I really need that to do a remotely competent job of this.

Unless you're implying that whatever I think is important could merely be easily added to the logging because I requested it? Then maybe yes. I'd still need a good idea in that case of how much of a budget you use vs. have in available database space and bandwidth then.

Otherwise I'm also happy to merely be a modeling consultant to somebody working on the actual code.

justin kilpatrick

Apr 20, 2015, 10:27:14 PM
to Gavin Jenkins, civcraftd...@googlegroups.com, rourk...@gmail.com, progra...@gmail.com, gavin-...@uiowa.edu
We can add logging at will for you (within reason) or adapt existing events to fit your setup, but frankly I don't think our logging system is too hard to figure out. What would you track if you were trying to make something that could reconstruct all important events on the server? It's a pretty simple set of things.

Gavin Jenkins

Apr 21, 2015, 3:30:03 PM
to justin kilpatrick, programmerdan, civcraftdevelopers, rourk...@gmail.com

Not really obvious IMO to guess that. For example, one of the most useful pieces of info would be movement vectors on a moment by moment basis, in small intensive samples if necessary for space. I really very much doubt you store player movement vectors already, though.

The stuff that is important for following up tickets etc. is a different set of info than the stuff useful for bot detection. The mundane, high volume, nitty gritty, unimportant-seeming details are probably most important for bot detection when dealing with people who suspect detection and take measures to avoid the easy, obvious stuff.

justin kilpatrick

Apr 21, 2015, 4:00:01 PM
to gavin-...@uiowa.edu, programmerdan, civcraftdevelopers, rourk...@gmail.com
You are right about that. We store just enough to reconstruct a basic picture of what's going on for a human. Player vectors are useless for that; most interaction events are all we need (even then you can leave a good number of unimportant interactions unlogged and still get a great picture of what's going on). Overall we do log quite a bit, but compared to everything we could log we manage to store very little and still reconstruct much from it. That being said, it's kind of hard to do a proper reconstruction: it's possible, but it could take hours if we really care about details.

Now if we did store vectors even if just in the neural net that might overwhelm our storage space. 

Gavin Jenkins

Apr 21, 2015, 5:39:23 PM
to civcraftd...@googlegroups.com, gavin-...@uiowa.edu, rourk...@gmail.com, progra...@gmail.com
Do you have room to store every tiny little thing for 10 second bursts at random times averaging roughly every 5 minutes or something?
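
One way to picture the burst idea is the Python sketch below, which picks 10-second capture windows separated by random gaps averaging five minutes. The exponential gap distribution and the function name are assumptions; the thread only specifies "random times averaging roughly every 5 minutes".

```python
import random


def burst_schedule(total_seconds, mean_gap=300.0, burst_len=10.0, seed=None):
    """Return (start, end) times of capture bursts within a session.

    Gaps between bursts are drawn from an exponential distribution with
    the given mean, so a botter cannot predict when capture is active.
    """
    rng = random.Random(seed)
    bursts = []
    t = rng.expovariate(1.0 / mean_gap)
    while t + burst_len <= total_seconds:
        bursts.append((t, t + burst_len))
        # The next burst starts a random gap after this one ends.
        t += burst_len + rng.expovariate(1.0 / mean_gap)
    return bursts


# One hour of play for a single player; the seed only makes the run repeatable.
bursts = burst_schedule(3600, seed=1)
```

With these numbers the capture is active for roughly 10 / (300 + 10), or about 3%, of a player's online time, which gives a first estimate of how much of the full-detail stream would need storing.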

Daniel Boston

Apr 21, 2015, 5:43:39 PM
to Gavin Jenkins, civcraftd...@googlegroups.com, Gavin Jenkins, Rourke Rabinowitz
If we just duplicated the packet stream into an offsite aggregator, tagged with player UUID, leveraging that windowed filter, I'd imagine it would be possible to avoid storage overages, or at least contain costs. As well, we wouldn't need to keep this sampling data forever. Just until we were confident in the ability of the code to discern profile(s). Then we could start throwing away old data of this granularity.

Gavin Jenkins

Apr 21, 2015, 6:11:22 PM
to civcraftd...@googlegroups.com, gavj...@gmail.com, gavin-...@uiowa.edu, rourk...@gmail.com
Well you need to store it for however long a ticket system takes to cycle through, basically, for people to follow up on flags, etc. Maybe roughly a week? I dunno. Definitely not forever.

justin kilpatrick

Apr 21, 2015, 6:59:10 PM
to Gavin Jenkins, civcraftd...@googlegroups.com, gavin-...@uiowa.edu, rourk...@gmail.com

We want this system to hold enough persistent data that it can make educated decisions on whether a new player is an alt of someone who played years ago. Bot detection could maybe use the bursts, but that would not ever go to the disk.

Gavin Jenkins

Apr 21, 2015, 8:15:44 PM
to civcraftd...@googlegroups.com, gavj...@gmail.com, gavin-...@uiowa.edu, rourk...@gmail.com
Yeah those are basically two different plugins. Bot detection is more neural network-y and has to work under assumptions of constant hostile efforts to subvert it. Needs tiny subtle details, doesn't need to hold much in memory by comparison.

Alt association is more a similarity problem where I'd suggest something more like a Bayesian model, and it would typically be used on much "softer targets" that much less plausibly could trick a basic model. Can probably work very well on coarser information, needs storage, etc.
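
A very small Python sketch of what a Bayesian treatment of alt association could look like: start from a prior probability that two accounts are alts and multiply the odds by a likelihood ratio for each piece of evidence. The prior, the ratios, and the assumption that the pieces of evidence are independent (a naive Bayes simplification) are all illustrative, not measured from Civcraft data.

```python
def posterior_alt_probability(prior, likelihood_ratios):
    """Update the probability that two accounts are alts.

    prior: probability before looking at any evidence (0 < prior < 1).
    likelihood_ratios: for each piece of evidence, how many times more
    likely it is to be seen for true alts than for unrelated accounts.
    Evidence is treated as independent, which is a simplifying assumption.
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Made-up numbers: 1% prior, then two observations, e.g. repeated
# login handoffs (ratio 20) and a shared private chest (ratio 5).
p = posterior_alt_probability(0.01, [20.0, 5.0])
```

Here the two observations lift the probability from 1% to 100/199, about 50%, which would justify a manual look but not an automatic ban. Coarse stored data such as logins and shared chests is enough to feed this kind of model, in line with the point above.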

justin kilpatrick

Apr 21, 2015, 8:19:35 PM
to Gavin Jenkins, civcraftd...@googlegroups.com, gavin-...@uiowa.edu, rourk...@gmail.com
we probably want to work on alts first, so I think we can move on from talks of logging minutiae until we tackle bot detection. 

programmerdan

Apr 23, 2015, 4:05:08 PM
to civcraftd...@googlegroups.com, gavj...@gmail.com, rourk...@gmail.com, gavin-...@uiowa.edu
I kind of thought the original idea was to detect bots. You've already got an alt association script -- sure, plenty of false positives and lots of micromanagement involved, but it's there. The itch was purportedly a need to detect and identify bot-like behavior so they could be "kicked first" in high-load environments. Perhaps I misread the need?

Rourke750

Apr 23, 2015, 6:42:32 PM
to programmerdan, civcraftd...@googlegroups.com, Gavin Jenkins, gavin-...@uiowa.edu
I believe the idea was to mark certain behavior and detect a variety of things including bots.

justin kilpatrick

Apr 23, 2015, 9:09:54 PM
to Rourke750, programmerdan, civcraftd...@googlegroups.com, Gavin Jenkins, gavin-...@uiowa.edu
I was hoping we could do it all under one roof, but I see why that's not practical now. As far as things that take time to enforce right now, the alts list is a major one that we already have training data for; bots would be a large undertaking that would be best left for later, I feel.