The dupe problem


Neilski
03-06-2008, 21:52
Howdy...

I know dupes are a major problem here. They're actually the main reason I gave up playing a couple of years ago. I recently restarted playing with the (vain) hope that the problem would be cured... No such luck - forums (e.g. this one) are full of warnings that loads of stuff (just about all HRs) is duped. Kind of spoils the idea of trading if your stuff might vanish. (No safe currency?)

What I'm wondering is: why isn't the duping problem already "fixed"? I've heard a few things about RustStorm, which apparently worked but hasn't been run in years. Bizarre?

Anyway, I guess this has been discussed to death, so perhaps someone can point me to a web page (or forum post?) explaining the underlying problem with fixing it?

thanks
Neil

ProfessionalBerg
03-06-2008, 21:55
Blizzard doesn't care - D2 makes them almost no money.

Passive ruststorms are run every time you enter or exit a game, but they're not nearly as effective as they could be.

Active ruststorms are an extremely time-consuming and costly process (Imagine cross-checking the ENTIRE item database!). The last Active Ruststorm was run before last Ladder reset, IIRC.

AnimeCraze
03-06-2008, 22:03
Agreed. To achieve pseudo-O(n) complexity (it's not even really O(n)), you need to use hash tables. The hash table must be at least as big as the current item database, if not bigger (to minimize collisions). Now just imagine how much space that takes. If they don't want to spend that kind of space, they would need an O(n log n) algorithm at least. That is slow on such a big database.

Neilski
03-06-2008, 22:13
Blizzard doesn't care - D2 makes them almost no money.

Passive ruststorms are run every time you enter or exit a game, but they're not nearly as effective as they could be.

Active ruststorms are an extremely time-consuming and costly process (Imagine cross-checking the ENTIRE item database!). The last Active Ruststorm was run before last Ladder reset, IIRC.

Hi. Thx for reply.

Money: yes, I guess it's pretty good of Blizzard to still run the servers after all this time. Minimal opportunities for advertising...

Passive ruststorm: ah, wasn't aware - how did you find this out?

Active: I've not really attempted to work out the size of the problem. But CPU cycles are "cheap", no?
[Ballpark guess alert.] Let's say 50,000 accounts on a realm (way off?), 8 characters in each, 50 items on each character; that's 20 million items on the realm. Let's say 32-bit IDs on each item (could be 64?), and allow another 64 bits to encode the account/character info: 12 bytes per line. A dump file with all of that info wouldn't stretch the smallest pen drive I've ever owned. A quick sort-by-id-field and hey presto, spot (and nuke!) the duplicates :-)

The CPU time to sort the file on a modern PC would (finger in the air) be a few tens of seconds?
The storage of the actual user files would be disk, so generating the dump file might take a while, but surely an hour or two would suffice...
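The sort-and-scan idea above can be sketched in a few lines. This is only an illustration under the post's own assumptions: each dump record is a hypothetical `(item_id, owner)` pair, and two records with the same `item_id` count as a dupe. How the real servers identify items is not known here.

```python
from itertools import groupby
from operator import itemgetter


def find_dupes_by_sort(records):
    """Return {item_id: [owners]} for every item_id that appears more than once.

    `records` is an iterable of (item_id, owner) tuples, a made-up dump
    format used only for this sketch.
    """
    # Sorting by id is the O(n log n) step; equal ids end up adjacent.
    ordered = sorted(records, key=itemgetter(0))
    dupes = {}
    # One linear pass over the sorted list groups the adjacent equal ids.
    for item_id, group in groupby(ordered, key=itemgetter(0)):
        owners = [owner for _, owner in group]
        if len(owners) > 1:
            dupes[item_id] = owners
    return dupes


dump = [
    (101, "acct1/sorc"),
    (205, "acct2/pala"),
    (101, "acct3/mule"),
    (307, "acct1/sorc"),
]
print(find_dupes_by_sort(dump))
```

At 12 bytes per record, the 20 million records guessed above come to roughly 240 MB, so a dump of that size would fit in memory on an ordinary machine.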

Now, I guess if it was that easy it wouldn't be an expensive, rare operation, so I must assume I've missed something major... Ah well :-)

ProfessionalBerg
03-06-2008, 22:17
I really don't know how the D2 item database works, but Blizzard's lack of activity on that front makes us assume the worst.

About the Passive Ruststorm - I guess some time ago (long time ago) someone found out that his dupes disappear upon leaving/entering the game. Obviously, some kind of program must be doing this. It was branded as Passive Ruststorm.

Alternatively, there are many talented hackers out there, that can read the code of the game. Maybe they somehow found out about the passive dupe detection program.

TheNamelessOne
03-06-2008, 22:34
I believe it is a moral rather than technical approach that will eventually triumph if anything does. There are increasing numbers of people like yourself, and I think many people here, who dislike the dupe culture. With more of the young, headstrong, morally irresponsible types going to other, newer games, there is a feeling in these forums (but sadly less on the realms) of moving back towards honesty (via SP in many cases, I think). But this is perhaps happening too slowly to realistically infect all the realms while the game still maintains interest and support.

carnivore
03-06-2008, 22:35
[Ballpark guess alert.] Let's say 50,000 accounts on a realm (way off?), 8 characters in each, 50 items on each character; that's 20 million items on the realm. Let's say 32-bit IDs on each item (could be 64?), and allow another 64 bits to encode the account/character info: 12 bytes per line.

i think the actual number of accounts and items are at least 10 to 100 times higher, but this doesnt really matter. what matters is, that you need to compare each and every one of those items with all others.

1. run: 20,000,000 operations
2. run: 19,999,999 operations
3. run: 19,999,998 operations
and so on...

basically you do: (number of items) * (half the number of items) operations
with your 20 million, we got 200,000,000,000,000 operations. assume im right with my 100 times more items, and you add four more zeros (100 times the items means 10,000 times the comparisons).
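As a quick check of the arithmetic, the exact count of all-pairs comparisons for n items is n(n-1)/2. The sketch below just evaluates that formula for the 20 million guess and for 100 times as many items; the function name is made up for illustration.

```python
def pairwise_comparisons(n):
    """Comparisons needed to check every item against every other item once."""
    return n * (n - 1) // 2


n = 20_000_000
print(pairwise_comparisons(n))        # 199999990000000, i.e. about 2 * 10**14
# 100 times the items costs about 10,000 times the comparisons.
print(pairwise_comparisons(100 * n) // pairwise_comparisons(n))  # 10000
```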

of course you could cut out a lot of those comparisons by sorting some data, comparing only interesting stuff or whatever you could come up with. the principle wont change.
take also into account, that these checks would be done while normal game play still runs. this means two things: 1. you only have a fraction of the actual cpu time and 2. your database is constantly changing (you also have to look at all items in all games that arent on any characters.)

hashtables use by definition highly different numbers for very similar base values (items in this case). so no partial comparison, always full numbers.

edit: i think there are around 100 servers per realm. a passive ruststorm would probably just look for your "50" items on all those games on this particular server. hard to guess what the number of calculations there will be, but obviously not so many.

Neilski
03-06-2008, 23:35
I believe it is a moral rather than technical approach that will eventually triumph if anything does ...

Morality will hopefully triumph one day, but even decent people can be tempted. (Hey, even I'm not a saint! I hacked the Elite save games on my ZX Spectrum quite a few years back, and found to my surprise that I ruined the game for myself.) A techy solution - "simply" make duping impossible - seems a little more attractive if possible :-)

AnimeCraze
03-06-2008, 23:40
1. The number of accounts is higher than you think, simply because of mules.
2. Disk I/O = SLOOOOWWWWWW.
3. As for making dupes impossible, that requires a server that actually follows the ACID principle, which I am sure the D2 servers don't. In short, it's a complete rewrite of the servers.
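To illustrate what point 3 is getting at, here is a minimal sketch of an atomic item transfer using SQLite from Python's standard library. The `items` table, its columns, and the `transfer` function are hypothetical and say nothing about how the D2 servers actually store items. The point is only that when each item has exactly one row and the ownership change happens in a single transaction, a trade either fully happens or does not happen at all.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one row per item (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # PRIMARY KEY guarantees an item id can exist only once.
    conn.execute(
        "CREATE TABLE items (item_id INTEGER PRIMARY KEY, owner TEXT NOT NULL)"
    )
    return conn


def transfer(conn, item_id, seller, buyer):
    """Move an item from seller to buyer, or change nothing at all."""
    with conn:  # commits if the block succeeds, rolls back if it raises
        cur = conn.execute(
            "UPDATE items SET owner = ? WHERE item_id = ? AND owner = ?",
            (buyer, item_id, seller),
        )
        if cur.rowcount != 1:
            raise ValueError("seller does not hold this item")


conn = make_db()
with conn:
    conn.execute("INSERT INTO items VALUES (1, 'alice')")
transfer(conn, 1, "alice", "bob")
print(conn.execute("SELECT owner FROM items WHERE item_id = 1").fetchone()[0])
```

A second attempt to trade the same item away from the original owner raises an error instead of creating a copy.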

@carnivore: It is actually possible to achieve pretty close to a linear number of operations with proper use of hash tables. The idea is that you compare two items if and only if they have the same hash, in which case chances are they are dupes of each other. Of course, allocating a big enough hash table means you need to double the # of servers.
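The hash-table approach described above might look like the following sketch. It uses Python's built-in `dict` (itself a hash table) as the bucket structure and assumes hypothetical `(item_id, owner)` records. Each record is hashed once, and only records landing on the same key are ever looked at together, so the expected work is close to linear in the number of items.

```python
from collections import defaultdict


def find_dupes_by_hash(records):
    """Return {item_id: [owners]} for ids seen more than once, in one pass.

    `records` is an iterable of (item_id, owner) tuples (made-up format).
    """
    owners_by_id = defaultdict(list)
    for item_id, owner in records:
        # dict hashes item_id and resolves collisions internally, so items
        # with different ids are never compared against each other.
        owners_by_id[item_id].append(owner)
    return {i: owners for i, owners in owners_by_id.items() if len(owners) > 1}


dump = [(101, "acct1/sorc"), (205, "acct2/pala"), (101, "acct3/mule")]
print(find_dupes_by_hash(dump))
```

The trade-off is the one raised in the post: the table holds an entry for every item, so memory use grows with the size of the whole database.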

Neilski
03-06-2008, 23:49
...
basically you do: (number of items) * (half the number of items) operations
with your 20 million, we got 200,000,000,000,000 operations. assume im right with my 100 times more items, and you add four more zeros (100 times the items means 10,000 times the comparisons).


Ah, you're assuming a worst-case sort algorithm. Decent ones are O(n log n) which means much much much fewer operations than you've worked out for the O(n^2) case. (E.g. see Wikipedia page.)

I'm so sad I just tried some tests: a 2 million line file (all I had room for in my very small shell account, hehe) consisting of md5sum output (thus about 35 characters per line, way more than enough for the info I expect in my imaginary dump file) sorted in less than one minute on an elderly Xeon 2.8GHz CPU (not state of the art by a long stretch).
I also verified O(n log n) behaviour with smaller files.

So my 20 million line file would sort in about 10 minutes on similar hardware,
and if you're right about the number of accounts being 100 times bigger then OK, we're looking at a good fraction of a day.
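The extrapolation above can be reproduced by scaling the measured time by the ratio of n log n values. This sketch takes the test result as "2 million lines in 60 seconds", which is an upper bound (the post says less than one minute), and it ignores disk I/O and memory limits, so the numbers are rough at best.

```python
import math


def estimated_sort_seconds(n, n_measured=2_000_000, seconds_measured=60.0):
    """Scale a measured sort time to n lines, assuming O(n log n) behaviour."""
    ratio = (n * math.log2(n)) / (n_measured * math.log2(n_measured))
    return seconds_measured * ratio


for lines in (20_000_000, 2_000_000_000):
    secs = estimated_sort_seconds(lines)
    print(f"{lines:>13,} lines: about {secs / 60:.0f} minutes")
```

Under these assumptions, 20 million lines lands a little over ten minutes and 2 billion lines at roughly a day, in line with the estimates in the post.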


take also into account, that these checks would be done while normal game play still runs. this means two things: 1. you only have a fraction of the actual cpu time and 2. your database is constantly changing (you also have to look at all items in all games that arent on any characters.)


Actually I'd aim pretty low and assume it would be done on a snapshot of the data, not on the live servers in real-time (i.e. not in normal game play).
You could even run it constantly on last night's files, and nobody would keep a dupe for more than a day.

This would be absolutely nowhere near as good as a real-time dupe remover of course, it's just a brute-force thought-experiment solution I'm using to demonstrate (to myself?) that pretty simple measures *could* be taken to break the back of the illegal economy and get the dupers to move on.

Anyone spotting the giant flaw which must surely still be present in my logic, please point it out :-)

(I assume there's a flaw, because I assume that if Blizzard really wanted everyone to move on to paid servers, they'd just switch the D2 servers off...)

Neilski
04-06-2008, 01:00
1. The number of accounts is higher than you think, simply because of mules.
2. Disk I/O = SLOOOOWWWWWW.
3. As for making dupes impossible, that requires a server that actually follows the ACID principle, which I am sure the D2 servers don't. In short, it's a complete rewrite of the servers.


1. yup, very possible, I was guessing purely based on the output of /users (which I assumed referred to all realms?)

2. yes, I haven't tried to estimate just how slow yet ;-)

3. Hmm, yes, could be a big problem. Makes me wonder if Ruststorm was actually discontinued because of false positives?

carnivore
04-06-2008, 01:15
@carnivore: It is actually possible to achieve pretty close to a linear number of operations with proper use of hash tables. The idea is that you compare two items if and only if they have the same hash, in which case chances are they are dupes of each other. Of course, allocating a big enough hash table means you need to double the # of servers.

Ah, you're assuming a worst-case sort algorithm. Decent ones are O(n log n) which means much much much fewer operations than you've worked out for the O(n^2) case. (E.g. see Wikipedia page.)


you are right of course. but you will need a sorted list of items to do so, preferably by hashcode (any order based on item stats could in some way speed this up). for some reason i was assuming that the only kind of order we have is by account/char (which would result in random sequence of hashcodes).

if we further assume off-line comparison, we could create such a list and eliminate double items in the live database afterwards.
now we could take a wild guess at how blizzard build the d2-database structure and estimate the effort for all this...

Gorny
04-06-2008, 05:19
Neilski, please do not reply in a thread multiple times without someone first posting after you. You can edit your posts for up to one hour to add or remove content.

We view multiple successive postings in rapid succession without someone else first posting after you as spamming.

Any questions, feel free to send a PM.

rswt
04-06-2008, 20:00
The other issue is that while I don't like the dupes and most at this forum don't like the dupes.....Many others love making high runewords. Without dupes, many many people (younger kids?) would leave. My suspicion is that this would kill any remaining sales at Target, Walmart etc... where yes the game is available.

You can argue that this is incorrect, but there is no reason at this point to run a rust storm and really "turn-off" a large portion of those playing.

"Soapbox warning"

In addition, I suspect that even many that prefer no dupes would find other games etc... Myself, I finished seven characters, no HRs except self-founds (1 Vex), then quit for a week. Now I'm back at it until my characters are decked out in elite gear including HRs. I suspect after a few more High Runewords though that will be it and I'll move on to another game/pastime.


*swt41

Neilski
04-06-2008, 22:04
The other issue is that while I don't like the dupes and most at this forum don't like the dupes.....Many others love making high runewords. Without dupes, many many people (younger kids?) would leave. My suspicion is that this would kill any remaining sales at Target, Walmart etc... where yes the game is available.


Aha, a financial motive for Blizzard to keep the servers up... :-) Indeed maybe more than enough to cover the cost of the electricity, esp. if they don't spend programmer hours fixing bugs...

You can argue that this is incorrect, but there is no reason at this point to run a rust storm and really "turn-off" a large portion of those playing.

"Soapbox warning"

In addition, I suspect that even many that prefer no dupes would find other games etc... Myself, I finished seven characters, no HRs except self-founds (1 Vex), then quit for a week. Now I'm back at it until my characters are decked out in elite gear including HRs. I suspect after a few more High Runewords though that will be it and I'll move on to another game/pastime.


Hmm, good point. I once played a Tesladin to about lvl 83 and don't remember ever getting a decent rune. (Though it was a lousy Tesladin with no decent backup attack!)
If the HRs are really so rare then I guess it would be a real bummer to lose the cool runewords. Maybe the "right" overall fix is to skew the odds a little more in favour of high-end stuff *and* kill the dupes :-)