03:10:57 Windows builds of master branch on crawl.develz.org updated to: 0.25-a0-631-gc915619fb7
03:11:51 Unstable branch on crawl.beRotato.org updated to: 0.25-a0-631-gc915619 (34)
03:13:49 -!- Psymania_ is now known as Psymania
03:48:49 Monster database of master branch on crawl.develz.org updated to: 0.24-a0-443-g80245de385
04:03:10 -!- amalloy is now known as amalloy_
06:01:15 New branch created: pull/1317 (1 commit) https://github.com/crawl/crawl/pull/1317
06:01:15 RojjaCebolla {GitHub} https://github.com/crawl/crawl/pull/1317 * 0.25-a0-632-g5278e56: Add more kinds of divine favor (5 minutes ago, 1 file, 12+ 0-) https://github.com/crawl/crawl/commit/5278e56c6b9c
12:03:18 Gauntlet timer/countdown not making announcements https://crawl.develz.org/mantis/view.php?id=12226 by CyclopB
12:23:58 Stable (0.23) branch on underhound.eu updated to: 0.23.1-93-ge536e68a2c
13:07:22 Unstable branch on crawl.akrasiac.org updated to: 0.25-a0-631-gc915619 (34)
15:46:10 advil * 0.25-a0-632-g7713895: Refactor init_user to be asynchronous (3 minutes ago, 1 file, 43+ 22-) https://github.com/crawl/crawl/commit/771389537338
15:46:46 it would be sort of fun to just rewrite webtiles with modern asyncio code in python 3.7
15:51:05 yeah
15:52:22 I need to rewrite beem et al. to use an asynchronous irc library, but the problem is I had to implement SASL myself using the synchronous irc module, since that was easier to understand and figuring out the nuances of SASL was enough work
16:02:10 maybe by now someone has written an async version?
16:02:42 looks like I need to work out a way to make ttyrec writing asynchronous too
16:03:34 Unstable branch on crawl.kelbi.org updated to: 0.25-a0-632-g7713895373 (34)
16:04:40 there is at least one async irc module, but I got errors when testing it
16:05:00 something I should look into again, though
16:05:31 I'm relatively new to coroutines but it does seem like when this stuff goes wrong it's very tricky to debug
16:18:27 actually maybe this is already somewhat asynchronous, just not the header
16:26:29 gammafunk do you know what sort of guarantees there are for execution order, if any?
16:27:43 or where to look for this?
16:28:55 i.e. if there is an asynchronous _handle_read function triggered by a file read, and multiple calls are triggered before any of them run, I would hope that they run in the same order they were triggered -- but I can't seem to find any details on this?
16:30:51 I don't, no
16:31:33 if this were a pure python 3 async that used await this would all be a lot clearer
16:35:22 can you make a coroutine that uses a sequence of awaits to handle the sequence of reads and then only have one instance of scheduling the coroutine without awaiting?
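A minimal sketch of the "one coroutine, sequence of awaits" idea raised just above, written with Python 3 asyncio syntax purely for clarity (whether that syntax is even available comes up next); the stream and handler names are illustrative, not webtiles code:

    # Sketch only: a single long-lived coroutine that awaits each read in turn.
    # Because only one instance of it is ever scheduled, reads are handled
    # strictly in program order, with no reliance on callback scheduling order.
    import asyncio

    async def reader_task(stream, handle_chunk):
        while True:
            chunk = await stream.read(4096)   # reads happen one after another
            if not chunk:
                break
            handle_chunk(chunk)

    # scheduled once, without awaiting it, e.g.:
    # asyncio.get_event_loop().create_task(reader_task(stream, handle_chunk))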
16:35:59 or are we not allowed to use await at all
16:36:35 async/await is a syntax error in python 2.7
16:36:47 I *think* it's possible to use tornado's coroutines with yield syntax though
16:36:57 yeah, should be
16:37:06 yield from is what I used
16:37:14 in beem, actually still do
16:37:22 in the public source, at least
16:38:21 using asyncio.coroutine decorator to define the coroutine, and then yield from to call it asynchronously
16:39:02 but I guess going all the way back to 2.7 things might be different
16:39:47 tornado has its own coroutine decorators, I just have to understand how to use them
16:40:32 also maybe to rein in my instinct to rewrite this code from scratch and instead figure out a minimal change
19:25:07 Unstable branch on underhound.eu updated to: 0.25-a0-632-g7713895373 (34)
21:02:12 advil, what changes are you trying to make?
21:03:09 it might be easier to upgrade all servers and then drop py2 support
21:03:14 the ttyrec writing code is sometimes significantly blocking on cao, and would ideally be done without any blocking calls
21:04:02 ah, right
21:04:02 I'm not actually sure how one is supposed to fix this in any version of tornado, though
21:04:19 it's blocking on file.write, which doesn't seem like a very common issue for most people
21:06:03 I think possibly using ThreadPoolExecutor on the io calls might work, in py3
21:06:57 I assume you've seen https://stackoverflow.com/a/38811251 ?
21:07:16 yeah
21:07:25 that's basically the idea
21:07:33 what file is it writing to?
21:08:16 the real problem is that arbitrary random access to some file that's not in cache on cao's chroot makes the server churn
21:08:29 in this case, some arbitrary ttyrec associated with a particular game
21:08:48 how much ram does the server have?
21:10:22 10GB apparently?
21:10:24 weird number
21:10:54 that is weird... maybe it's a vm after all?
21:11:23 slow hdd would not be inconsistent with that
21:11:56 the chroot has a truly absurd number of files
21:12:10 do you know what files it's writing to?
21:12:29 misc ttyrecs
21:12:44 whatever games are currently being played
21:13:07 that's tough to fix
21:13:10 I haven't diagnosed it carefully but I'm guessing that the timeout happens on the first access typically
21:13:33 there are 83681 user directories alone
21:14:07 yeah, I'm not sure if fixing the issue at the fs level is practical, though surely it could be tuned better *somehow*
21:14:46 I would guess that another factor is that apache has a near constant stream of access requests from bots that are endlessly spidering the user directories, so the cache probably gets cycled very quickly
21:15:10 though I'm not confident those are the same disk
21:15:17 lol, yeah, that sounds like a problem
21:15:29 do ttyrecs have any latency requirements?
21:15:37 well, I guess if it's serving from the chroot they are
21:15:48 I don't think so, and in fact the ttyrecs all seem to get written
21:16:28 but it's probably a non-trivial factor in server lag that this happens (though the blocking issue I fixed with init_user was much more common)
21:16:42 maybe fork a separate process, send ttyrec blobs to it over a pipe, and it can write them with low disk priority?
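A rough sketch of the decorator-style coroutine mentioned around 16:38, in the tornado.gen form that still works on Python 2.7 (where async/await is a syntax error); the stream and the process() handler are hypothetical, not webtiles functions:

    # Sketch only: tornado's decorator-based coroutines, usable on Python 2.7.
    from tornado import gen
    from tornado.iostream import StreamClosedError

    @gen.coroutine
    def handle_reads(stream):
        try:
            while True:
                # yield plays the role of await; execution resumes when the
                # read's Future resolves
                chunk = yield stream.read_bytes(4096, partial=True)
                process(chunk)   # hypothetical handler
        except StreamClosedError:
            pass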
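And a sketch of the ThreadPoolExecutor idea from the linked answer (21:06), Python 3 only: the blocking file.write is pushed onto a pool thread so a slow disk can't stall the event loop; the function and path names here are made up for illustration:

    import asyncio
    from concurrent.futures import ThreadPoolExecutor

    _io_pool = ThreadPoolExecutor(max_workers=2)   # dedicated pool for disk writes

    def _blocking_append(path, data):
        # ordinary blocking I/O, but it now runs on a pool thread
        with open(path, "ab") as f:
            f.write(data)

    async def write_ttyrec_chunk(path, data):
        loop = asyncio.get_event_loop()
        # the coroutine suspends here instead of blocking the event loop
        await loop.run_in_executor(_io_pool, _blocking_append, path, data)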
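Finally, a sketch of the separate-writer-process suggestion at 21:16: ttyrec blobs go over a pipe-backed queue to a child process that does the blocking writes, optionally at idle disk priority; psutil and all names here are assumptions for illustration, not existing webtiles code:

    import multiprocessing

    def _writer_loop(queue):
        try:
            import psutil                    # optional: drop to idle I/O priority
            psutil.Process().ionice(psutil.IOPRIO_CLASS_IDLE)
        except Exception:
            pass
        while True:
            item = queue.get()
            if item is None:                 # sentinel: shut the writer down
                break
            path, data = item
            with open(path, "ab") as f:
                f.write(data)

    ttyrec_queue = multiprocessing.Queue()
    writer = multiprocessing.Process(target=_writer_loop, args=(ttyrec_queue,))
    writer.daemon = True
    # writer.start(); then ttyrec_queue.put((ttyrec_path, blob));
    # ttyrec_queue.put(None) to stop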
21:17:09 hm could work
21:17:43 that won't work if the queue could get backed up, or if disk io is the bottleneck
21:18:24 disk io isn't generally a bottleneck, based on iotop
21:18:37 usually under 10%, often lower
21:19:12 I can trivially make it go to 100% by running ls without `-f` in the directory with the user accounts
21:21:43 so I'd guess the ttyrec induced lag happens intermittently?
21:22:57 uhh, wait, 83000 user directories?
21:23:04 haha yes
21:23:23 it's probably lagging so much because every time it opens a ttyrec it has to walk through that
21:23:36 as far as I can tell that part actually should be ok
21:24:17 well, random access to that directory isn't exactly
21:24:36 but the number of files is large but not out of spec for ext3
21:25:05 *the number of directories
21:25:06 it's ext3? Hmm
21:25:19 maybe?
21:25:23 how do you tell again?
21:25:54 df
21:26:01 it's ext4
21:26:27 so sub ext4 into what I said above (it's just been a month or so since I did all this research)
21:28:13 ime 80,000 is going to cause significant lag
21:28:42 today in the last 10 hours, there have been 226 instances where the lag from the now-fixed init_user issue reached 500ms, and 176 instances where the ttyrec issue reached 500ms
21:28:51 yeah, definitely for certain things
21:29:08 but for example `ls -f` is fine, cd is fine, ls in arbitrary subdirectories is fine
21:29:24 but not ls without -f.
21:29:27 ?
21:29:34 yes, without -f it is really bad
21:29:47 when I do that by accident I usually kill -9 it rather than wait
21:29:55 in that particular directory only
21:29:59 doesn't -f disable sorting or something?
21:30:02 yeah
21:30:22 that doesn't make sense...
21:30:39 what about ls -f | sort
21:31:32 that's fine
21:31:43 this is a known issue with ls in big directories afaict
21:32:30 oh maybe it's the colorization
21:32:30 probably that, yeah
21:32:30 yeah that's it
21:32:30 ls --color=never is also fine
21:34:35 so it seems ttyrecs lag when first opening the ttyrec file, you said?
21:34:56 that's my guess but I haven't tested that carefully
21:36:07 you could try strace head some-random-ttyrec
21:36:35 if strace has an option for timestamped output
21:39:33 btw for the linting and testing, I will probably set things up to make two venvs in source/, one each for py2 and py3
21:40:15 is that going to be wildly incompatible with your conda setup?
21:41:01 predictably, python tooling is still totally inadequate the moment you want to do anything slightly funky
21:41:05 I don't think that's going to be a problem if it's not imposed on whoever's compiling
21:41:31 no, this is for devs only
21:43:40 well, for running tests, to be precise
21:44:21 I think something I'm doing with this strace idea is spoiling my attempts to pick an arbitrary file
21:57:59 a quick hacky fix might be to fork a subprocess that opens the ttyrec file (and others) whenever a user logs in
21:58:19 that'd be a lot easier than a ttyrec writing process
22:10:41 +aidanh | if strace has an option for timestamped output <-- it does, but if the issue is a giant dirent you won't see this in strace. You'd need dtrace or a kernel debugger
22:11:59 about your linting & testing -- have you seen tox? That might simplify your "two virtualenvs" idea. It's explicitly designed to run your tests on multiple python versions
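A sketch of the "quick hacky fix" floated at 21:57:59: at login, fork a throwaway process that touches the user's ttyrec (and other) files so the cold-cache hit is paid off the main webtiles process; the paths and the login hook are hypothetical:

    import multiprocessing

    def _prewarm(paths):
        for p in paths:
            try:
                with open(p, "rb") as f:
                    f.read(1)          # the first access is what pays the seek cost
            except (OSError, IOError):
                pass

    def prewarm_user_files(paths):
        proc = multiprocessing.Process(target=_prewarm, args=(paths,))
        proc.daemon = True
        proc.start()

    # e.g. from a (hypothetical) login handler:
    # prewarm_user_files([ttyrec_path, morgue_path])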
22:14:30 I was thinking about intermediate start calls, but yeah dtrace would be better
22:15:01 I tried tox and it demanded a pyproject.toml
22:16:19 apparently I need to set the build system or something
22:16:47 I haven't used tox a lot tbh. So can't really help with it
22:17:08 advil: what version of python 3 can cao use? I'm wondering if you could put in place a python-3 only workaround for the slow IO performance
22:18:04 nox worked flawlessly but it doesn't support py2
22:21:06 https://github.com/crawl/crawl/wiki/DCSS-Servers-overview ; CAO is debian 7
22:25:00 I think the next oldest server is CDO, which has python 3.4.2
22:29:59 is there anyone with access to cbro?
22:30:30 I know john.stein is super busy, but we will need to update that at some point
22:30:39 obviously johnstein; he might be willing to grant access to others, although I'm not sure in what capacity
22:30:42 preferably sooner rather than later
22:30:50 in the past I had a login, but that seemed to be no longer working when I last tested
22:31:10 and it was a non-privileged login so I could...well I'm not sure what all I could even do with it
22:31:59 aidanh: I can ping him about it, can you give me a list of what would need upgrading?
22:32:39 it'd be the whole chroot upgrade
22:33:11 which is basically just moving some files around, with a small patch to the build script
22:33:41 would you have instructions I could refer him to?
22:33:44 it can wait until cao is upgraded, I reckon, but worth keeping in mind
22:33:59 sure
22:34:10 yeah I've been meaning to write them up
22:38:52 all the actual player data is in dgldir/, right?
22:39:05 I may just keep that outside the chroot and bind mount it in
22:39:30 is dgldir the chroot itself?
22:39:49 It's in the root directory of the chroot
22:40:05 yeah, all such data lives in the chroot
22:40:28 on a lot of servers, this is not actually named "dgldir" of course
22:43:08 oops, think I ran into that ls slowdown that advil talked about
22:43:23 heh
22:43:45 it's probably worth actually killing because it will lag playing
22:43:55 yeah I did eventually manage to ^C it
22:48:09 not actually named dgldir? you gotta be kidding me
22:48:13 aidanh: I see I misread your response, and I do see dgldir
22:48:22 however that is not where *all* user data in chroot resides
22:48:49 crawl-master contains all the crawl builds, and e.g. user saves are in the various build subdirectories
22:49:02 oh you mean the chroot itself is not always in the same place?
22:49:59 definitely not always in e.g. /home/crawl/DGL in that it's actually /chroot on CAO
22:49:59 hrm, OK, well I guess two bind mounts it is
22:50:18 ... don't tell me the webserver directory is right next to all these crawl binaries
22:50:53 it's next to the crawl versioned directories
22:51:18 and how do you mean "webserver directory" exactly; just the crawl-specific stuff?
22:51:56 what I mean is `webserver` :D
22:52:58 oh, webtiles specifically, yes
22:53:49 it's like this was purposefully designed to be impossible to compartmentalise
22:54:14 that directory looks like this: https://www.dropbox.com/s/u1h671qexnw9ssp/Screenshot%202020-03-18%2022.53.49.png?dl=0
22:54:46 well, it was designed for the very left edge of these graphs: http://crawl.akrasiac.org/scoring/per-day.html
22:57:33 I remember 2007
23:05:59 separation of whatnow?
23:06:49 back in my day we didn't have this SOLID malarkey, we just wrote actually solid programs! Yes, everything is in one directory, but it worked for us on the PDP-8, so I don't see what the problem is now